Apache HTTP Server Documentation Version 2.5

Apache Software Foundation
June 1, 2016

About The PDF Documentation
Licensed to the Apache Software Foundation (ASF) under one or more contributor license agreements. See the NOTICE file distributed with this work for additional information regarding copyright ownership. The ASF licenses this file to You under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0

This version of the Apache HTTP Server Documentation is converted from XML source files to LaTeX using XSLT with the help of Apache Ant, Apache XML Xalan, and Apache XML Xerces.

Since the HTML version of the documentation is more commonly checked during development, the PDF version may contain some errors and inconsistencies, especially in formatting. If you have difficulty reading a part of this file, please consult the HTML version of the documentation on the Apache HTTP Server website at http://httpd.apache.org/docs/trunk/

The Apache HTTP Server Documentation is maintained by the Apache HTTP Server Documentation Project. More information is available at http://httpd.apache.org/docs-project/

Contents

1 Release Notes . . . 1
    1.1 Upgrading to 2.4 from 2.2 . . . 2
    1.2 Overview of new features in Apache HTTP Server 2.4 . . . 8
    1.3 Overview of new features in Apache HTTP Server 2.2 . . . 12
    1.4 Overview of new features in Apache HTTP Server 2.0 . . . 15
    1.5 The Apache License, Version 2.0 . . . 17

2 Using the Apache HTTP Server . . . 21
    2.1 Compiling and Installing . . . 22
    2.2 Starting Apache . . . 27
    2.3 Stopping and Restarting Apache HTTP Server . . . 29
    2.4 Configuration Files . . . 32
    2.5 Configuration Sections . . . 35
    2.6 Caching Guide . . . 43
    2.7 Server-Wide Configuration . . . 54
    2.8 Log Files . . . 56
    2.9 Mapping URLs to Filesystem Locations . . . 64
    2.10 Dynamic Shared Object (DSO) Support . . . 68
    2.11 HTTP Protocol Compliance . . . 71
    2.12 Content Negotiation . . . 78
    2.13 Custom Error Responses . . . 85
    2.14 Binding to Addresses and Ports . . . 88
    2.15 Multi-Processing Modules (MPMs) . . . 90
    2.16 Environment Variables in Apache . . . 92
    2.17 Expressions in Apache HTTP Server . . . 99
    2.18 Apache’s Handler Use . . . 108
    2.19 Filters . . . 110
    2.20 Shared Object Cache in Apache HTTP Server . . . 114
    2.21 suEXEC Support . . . 115
    2.22 Issues Regarding DNS and Apache HTTP Server . . . 121

3 Apache Virtual Host documentation . . . 123
    3.1 Apache Virtual Host documentation . . . 124
    3.2 Name-based Virtual Host Support . . . 125
    3.3 Apache IP-based Virtual Host Support . . . 128
    3.4 Dynamically Configured Mass Virtual Hosting . . . 130
    3.5 VirtualHost Examples . . . 134
    3.6 An In-Depth Discussion of Virtual Host Matching . . . 141
    3.7 File Descriptor Limits . . . 144

4 URL Rewriting Guide . . . 145
    4.1 Apache mod_rewrite . . . 146
    4.2 Apache mod_rewrite Introduction . . . 147
    4.3 Redirecting and Remapping with mod_rewrite . . . 152
    4.4 Using mod_rewrite to control access . . . 159
    4.5 Dynamic mass virtual hosts with mod_rewrite . . . 162
    4.6 Using mod_rewrite for Proxying . . . 165
    4.7 Using RewriteMap . . . 166
    4.8 Advanced Techniques with mod_rewrite . . . 172
    4.9 When not to use mod_rewrite . . . 175
    4.10 RewriteRule Flags . . . 178
    4.11 Apache mod_rewrite Technical Details . . . 187

5 Apache SSL/TLS Encryption . . . 191
    5.1 Apache SSL/TLS Encryption . . . 192
    5.2 SSL/TLS Strong Encryption: An Introduction . . . 193
    5.3 SSL/TLS Strong Encryption: Compatibility . . . 202
    5.4 SSL/TLS Strong Encryption: How-To . . . 206
    5.5 SSL/TLS Strong Encryption: FAQ . . . 212

6 Guides, Tutorials, and HowTos . . . 225
    6.1 How-To / Tutorials . . . 226
    6.2 Authentication and Authorization . . . 227
    6.3 Access Control . . . 234
    6.4 Apache Tutorial: Dynamic Content with CGI . . . 236
    6.5 Apache httpd Tutorial: Introduction to Server Side Includes . . . 243
    6.6 Apache HTTP Server Tutorial: .htaccess files . . . 249
    6.7 Per-user web directories . . . 255
    6.8 Reverse Proxy Guide . . . 258

7 Platform-specific Notes . . . 265
    7.1 Platform Specific Notes . . . 266
    7.2 Using Apache HTTP Server on Microsoft Windows . . . 267
    7.3 Compiling Apache for Microsoft Windows . . . 275
    7.4 Using Apache With RPM Based Systems (Redhat / CentOS / Fedora) . . . 281
    7.5 Using Apache With Novell NetWare . . . 284
    7.6 Running a High-Performance Web Server on HPUX . . . 292

8 Apache HTTP Server and Supporting Programs . . . 293
    8.1 Server and Supporting Programs . . . 294
    8.2 httpd - Apache Hypertext Transfer Protocol Server . . . 295
    8.3 ab - Apache HTTP server benchmarking tool . . . 297
    8.4 apachectl - Apache HTTP Server Control Interface . . . 301
    8.5 apxs - APache eXtenSion tool . . . 303
    8.6 configure - Configure the source tree . . . 307
    8.7 dbmmanage - Manage user authentication files in DBM format . . . 315
    8.8 fcgistarter - Start a FastCGI program . . . 317
    8.9 firehose - Demultiplex a firehose stream . . . 318
    8.10 htcacheclean - Clean up the disk cache . . . 319
    8.11 htdbm - Manipulate DBM password databases . . . 321
    8.12 htdigest - manage user files for digest authentication . . . 324
    8.13 htpasswd - Manage user files for basic authentication . . . 325
    8.14 httxt2dbm - Generate dbm files for use with RewriteMap . . . 328
    8.15 logresolve - Resolve IP-addresses to hostnames in Apache log files . . . 329
    8.16 log_server_status - Log periodic status summaries . . . 330
    8.17 rotatelogs - Piped logging program to rotate Apache logs . . . 331
    8.18 split-logfile - Split up multi-vhost logfiles . . . 334
    8.19 suexec - Switch user before executing external programs . . . 335
    8.20 Other Programs . . . 336

9 Apache Miscellaneous Documentation . . . 337
    9.1 Apache Miscellaneous Documentation . . . 338
    9.2 Apache Performance Tuning . . . 339
    9.3 Performance Scaling . . . 350
    9.4 Security Tips . . . 364
    9.5 Relevant Standards . . . 369
    9.6 Password Formats . . . 371

10 Apache modules . . . 375
    10.1 Terms Used to Describe Modules . . . 376
    10.2 Terms Used to Describe Directives . . . 377
    10.3 Apache Module core . . . 380
    10.4 Apache Module mod_access_compat . . . 440
    10.5 Apache Module mod_actions . . . 445
    10.6 Apache Module mod_alias . . . 447
    10.7 Apache Module mod_allowhandlers . . . 454
    10.8 Apache Module mod_allowmethods . . . 455
    10.9 Apache Module mod_asis . . . 456
    10.10 Apache Module mod_auth_basic . . . 458
    10.11 Apache Module mod_auth_digest . . . 462
    10.12 Apache Module mod_auth_form . . . 466
    10.13 Apache Module mod_authn_anon . . . 477
    10.14 Apache Module mod_authn_core . . . 480
    10.15 Apache Module mod_authn_dbd . . . 484
    10.16 Apache Module mod_authn_dbm . . . 487
    10.17 Apache Module mod_authn_file . . . 489
    10.18 Apache Module mod_authn_socache . . . 491
    10.19 Apache Module mod_authnz_fcgi . . . 494
    10.20 Apache Module mod_authnz_ldap . . . 501
    10.21 Apache Module mod_authz_core . . . 519
    10.22 Apache Module mod_authz_dbd . . . 527
    10.23 Apache Module mod_authz_dbm . . . 532
    10.24 Apache Module mod_authz_groupfile . . . 534
    10.25 Apache Module mod_authz_host . . . 536
    10.26 Apache Module mod_authz_owner . . . 539
    10.27 Apache Module mod_authz_user . . . 541
    10.28 Apache Module mod_autoindex . . . 542
    10.29 Apache Module mod_buffer . . . 554
    10.30 Apache Module mod_cache . . . 555
    10.31 Apache Module mod_cache_disk . . . 570
    10.32 Apache Module mod_cache_socache . . . 574
    10.33 Apache Module mod_cern_meta . . . 578
    10.34 Apache Module mod_cgi . . . 580
    10.35 Apache Module mod_cgid . . . 583
    10.36 Apache Module mod_charset_lite . . . 585
    10.37 Apache Module mod_data . . . 588
    10.38 Apache Module mod_dav . . . 589
    10.39 Apache Module mod_dav_fs . . . 592
    10.40 Apache Module mod_dav_lock . . . 593
    10.41 Apache Module mod_dbd . . . 594
    10.42 Apache Module mod_deflate . . . 599
    10.43 Apache Module mod_dialup . . . 606
    10.44 Apache Module mod_dir . . . 607
    10.45 Apache Module mod_dumpio . . . 612
    10.46 Apache Module mod_echo . . . 614
    10.47 Apache Module mod_env . . . 615
    10.48 Apache Module mod_example_hooks . . . 617
    10.49 Apache Module mod_expires . . . 619
    10.50 Apache Module mod_ext_filter . . . 622
    10.51 Apache Module mod_file_cache . . . 626
    10.52 Apache Module mod_filter . . . 629
    10.53 Apache Module mod_firehose . . . 637
    10.54 Apache Module mod_headers . . . 641
    10.55 Apache Module mod_heartbeat . . . 647
    10.56 Apache Module mod_heartmonitor . . . 648
    10.57 Apache Module mod_http2 . . . 650
    10.58 Apache Module mod_ident . . . 661
    10.59 Apache Module mod_imagemap . . . 663
    10.60 Apache Module mod_include . . . 667
    10.61 Apache Module mod_info . . . 680
    10.62 Apache Module mod_isapi . . . 683
    10.63 Apache Module mod_journald . . . 687
    10.64 Apache Module mod_lbmethod_bybusyness . . . 688
    10.65 Apache Module mod_lbmethod_byrequests . . . 689
    10.66 Apache Module mod_lbmethod_bytraffic . . . 691
    10.67 Apache Module mod_lbmethod_heartbeat . . . 692
    10.68 Apache Module mod_ldap . . . 693
    10.69 Apache Module mod_log_config . . . 705
    10.70 Apache Module mod_log_debug . . . 712
    10.71 Apache Module mod_log_forensic . . . 714
    10.72 Apache Module mod_logio . . . 716
    10.73 Apache Module mod_lua . . . 718
    10.74 Apache Module mod_macro . . . 745
    10.75 Apache Module mod_mime . . . 749
    10.76 Apache Module mod_mime_magic . . . 762
    10.77 Apache Module mod_negotiation . . . 766
    10.78 Apache Module mod_nw_ssl . . . 770
    10.79 Apache Module mod_policy . . . 771
    10.80 Apache Module mod_privileges . . . 781
    10.81 Apache Module mod_proxy . . . 787
    10.82 Apache Module mod_proxy_ajp . . . 815
    10.83 Apache Module mod_proxy_balancer . . . 824
    10.84 Apache Module mod_proxy_connect . . . 828
    10.85 Apache Module mod_proxy_express . . . 830
    10.86 Apache Module mod_proxy_fcgi . . . 833
    10.87 Apache Module mod_proxy_fdpass . . . 836
    10.88 Apache Module mod_proxy_ftp . . . 837
    10.89 Apache Module mod_proxy_hcheck . . . 840
    10.90 Apache Module mod_proxy_html . . . 844
    10.91 Apache Module mod_proxy_http . . . 850
    10.92 Apache Module mod_proxy_http2 . . . 852
    10.93 Apache Module mod_proxy_scgi . . . 853
    10.94 Apache Module mod_proxy_wstunnel . . . 856
    10.95 Apache Module mod_ratelimit . . . 858
    10.96 Apache Module mod_reflector . . . 859
    10.97 Apache Module mod_remoteip . . . 860
    10.98 Apache Module mod_reqtimeout . . . 864
    10.99 Apache Module mod_request . . . 866
    10.100 Apache Module mod_rewrite . . . 867
    10.101 Apache Module mod_sed . . . 881
    10.102 Apache Module mod_session . . . 883
    10.103 Apache Module mod_session_cookie . . . 890
    10.104 Apache Module mod_session_crypto . . . 893
    10.105 Apache Module mod_session_dbd . . . 897
    10.106 Apache Module mod_setenvif . . . 902
    10.107 Apache Module mod_slotmem_plain . . . 906
    10.108 Apache Module mod_slotmem_shm . . . 907
    10.109 Apache Module mod_so . . . 908
    10.110 Apache Module mod_socache_dbm . . . 910
    10.111 Apache Module mod_socache_dc . . . 911
    10.112 Apache Module mod_socache_memcache . . . 912
    10.113 Apache Module mod_socache_shmcb . . . 913
    10.114 Apache Module mod_speling . . . 914
    10.115 Apache Module mod_ssl . . . 916
    10.116 Apache Module mod_ssl_ct . . . 955
    10.117 Apache Module mod_status . . . 962
    10.118 Apache Module mod_substitute . . . 964
    10.119 Apache Module mod_suexec . . . 967
    10.120 Apache Module mod_syslog . . . 968
    10.121 Apache Module mod_systemd . . . 969
    10.122 Apache Module mod_unique_id . . . 970
    10.123 Apache Module mod_unixd . . . 972
    10.124 Apache Module mod_userdir . . . 975
    10.125 Apache Module mod_usertrack . . . 977
    10.126 Apache Module mod_version . . . 980
    10.127 Apache Module mod_vhost_alias . . . 982
    10.128 Apache Module mod_watchdog . . . 986
    10.129 Apache Module mod_xml2enc . . . 987
    10.130 Apache Module mpm_common . . . 990
    10.131 Apache Module event . . . 1001
    10.132 Apache Module mpm_netware . . . 1006
    10.133 Apache Module mpmt_os2 . . . 1008
    10.134 Apache Module prefork . . . 1009
    10.135 Apache Module mpm_winnt . . . 1012
    10.136 Apache Module worker . . . 1014

11 Developer Documentation . . . 1017
    11.1 Developer Documentation for the Apache HTTP Server 2.4 . . . 1018
    11.2 Apache 1.3 API notes . . . 1019
    11.3 API Changes in Apache HTTP Server 2.4 since 2.2 . . . 1035
    11.4 Developing modules for the Apache HTTP Server 2.4 . . . 1042
    11.5 Documenting code in Apache 2.4 . . . 1070
    11.6 Hook Functions in the Apache HTTP Server 2.x . . . 1071
    11.7 Converting Modules from Apache 1.3 to Apache 2.0 . . . 1074
    11.8 Request Processing in the Apache HTTP Server 2.x . . . 1078
    11.9 How filters work in Apache 2.0 . . . 1081
    11.10 Guide to writing output filters . . . 1084
    11.11 Apache HTTP Server 2.x Thread Safety Issues . . . 1091

12 Glossary and Index . . . 1095
    12.1 Glossary . . . 1096
    12.2 Module Index . . . 1101
    12.3 Directive Quick Reference . . . 1106

Chapter 1

Release Notes

1.1 Upgrading to 2.4 from 2.2

To assist those upgrading, we maintain a document describing information critical to existing Apache HTTP Server users. The notes are intended to be brief, and you should be able to find more information in either the New Features (p. 8) document, or in the src/CHANGES file. Application and module developers can find a summary of API changes in the API updates (p. 1035) overview.
This document describes changes in server behavior that might require you to change your configuration or how you use the server in order to continue using 2.4 as you are currently using 2.2. To take advantage of new features in 2.4, see the New Features document.
This document describes only the changes from 2.2 to 2.4. If you are upgrading from version 2.0, you should also consult the 2.0 to 2.2 upgrading document [1].
See also
• Overview of new features in Apache HTTP Server 2.4 (p. 8)

Compile-Time Configuration Changes
The compilation process is very similar to the one used in version 2.2. Your old configure command line (as found in build/config.nice in the installed server directory) can be used in most cases. There are some changes in the default settings. Some details of changes:
• These modules have been removed: mod_authn_default, mod_authz_default, mod_mem_cache. If you were using mod_mem_cache in 2.2, look at mod_cache_disk in 2.4.
• All load balancing implementations have been moved to individual, self-contained mod_proxy submodules, e.g. mod_lbmethod_bybusyness. You might need to build and load any of these that your configuration uses.
• Platform support has been removed for BeOS, TPF, and even older platforms such as A/UX, Next, and Tandem. These were believed to be broken anyway.
• configure: dynamic modules (DSO) are built by default
• configure: By default, only a basic set of modules is loaded. The other LoadModule directives are commented out in the configuration file.
• configure: the "most" module set gets built by default
• configure: the "reallyall" module set adds developer modules to the "all" set
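
As a rough illustration of the load balancing change listed above, a 2.4 configuration that balances requests over HTTP might need to load the balancing method explicitly. This is only a sketch: the module file paths are assumptions (relative to ServerRoot) and vary between installations, and the choice of mod_lbmethod_byrequests is an example, not a recommendation.

```
# Illustrative httpd.conf excerpt; module file paths are assumed and
# may differ in your installation.
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
# mod_proxy_balancer relies on a shared-memory slot provider in 2.4.
LoadModule slotmem_shm_module modules/mod_slotmem_shm.so
# The balancing algorithm now lives in its own submodule.
LoadModule lbmethod_byrequests_module modules/mod_lbmethod_byrequests.so
```

If your balancer uses a different method, such as bybusyness or bytraffic, load the corresponding mod_lbmethod_* module instead.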

Run-Time Configuration Changes
There have been significant changes in authorization configuration, and other minor configuration changes, that could
require changes to your 2.2 configuration files before using them for 2.4.
Authorization
Any configuration file that uses authorization will likely need changes.
You should review the Authentication, Authorization and Access Control Howto (p. 227), especially the section Beyond just authorization (p. 227), which explains the new mechanisms for controlling the order in which the authorization directives are applied.
[1] http://httpd.apache.org/docs/2.2/upgrading.html

Directives that control how authorization modules respond when they don’t match the authenticated user have been
removed: This includes AuthzLDAPAuthoritative, AuthzDBDAuthoritative, AuthzDBMAuthoritative, AuthzGroupFileAuthoritative, AuthzUserAuthoritative, and AuthzOwnerAuthoritative. These directives have been replaced by the
more expressive <RequireAny>, <RequireNone>, and <RequireAll>.
If you use mod_authz_dbm, you must port your configuration to use Require dbm-group ... in place of
Require group ....
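The following sketch shows what a ported mod_authz_dbm configuration could look like, with a <RequireAny> container standing in for the removed Authoritative directives. The directory, file paths, group name, and user name are all hypothetical.

```apache
# Hypothetical directory, file paths, and group/user names.
<Directory "/usr/local/apache2/htdocs/private">
    AuthType Basic
    AuthName "Private area"
    AuthBasicProvider dbm
    AuthDBMUserFile "/usr/local/apache2/conf/users.dbm"
    AuthDBMGroupFile "/usr/local/apache2/conf/groups.dbm"
    # 2.2 used "Require group admins" here.
    # Access is granted if either requirement matches.
    <RequireAny>
        Require dbm-group admins
        Require user alice
    </RequireAny>
</Directory>
```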
Access control
In 2.2, access control based on client hostname, IP address, and other characteristics of client requests was done using
the directives Order, Allow, Deny, and Satisfy.
In 2.4, such access control is done in the same way as other authorization checks, using the new module
mod_authz_host. The old access control idioms should be replaced by the new authorization mechanisms, although for compatibility with old configurations, the new module mod_access_compat is provided.

Note: Mixing old and new directives
Mixing old directives like Order, Allow or Deny with new ones like Require is technically possible but discouraged. mod_access_compat was created to support configurations
containing only old directives to facilitate the 2.4 upgrade. Please check the examples below
to get a better idea about issues that might arise.

Here are some examples of old and new ways to do the same access control.
In this example, all requests are denied.
2.2 configuration:
Order deny,allow
Deny from all

2.4 configuration:
Require all denied

In this example, all requests are allowed.
2.2 configuration:
Order allow,deny
Allow from all

2.4 configuration:
Require all granted

In the following example, all hosts in the example.org domain are allowed access; all other hosts are denied access.
2.2 configuration:
Order Deny,Allow
Deny from all
Allow from example.org

2.4 configuration:
Require host example.org

In the following example, mixing old and new directives leads to unexpected results.
Mixing old and new directives: NOT WORKING AS EXPECTED
DocumentRoot "/var/www/html"

<Directory "/">
    AllowOverride None
    Order deny,allow
    Deny from all
</Directory>

<Location "/server-status">
    SetHandler server-status
    Require ip 127.0.0.1
</Location>

access.log - GET /server-status 403 127.0.0.1
error.log - AH01797: client denied by server configuration: /var/www/html/server-status

Why does httpd deny access to server-status even though the configuration seems to allow it? Because mod_access_compat
directives take precedence over the mod_authz_host one in this configuration merge (p. 35) scenario.
This example conversely works as expected:
Mixing old and new directives: WORKING AS EXPECTED
DocumentRoot "/var/www/html"

<Directory "/">
    AllowOverride None
    Require all denied
</Directory>

<Location "/server-status">
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 127.0.0.1
</Location>

access.log - GET /server-status 200 127.0.0.1

So even though mixing old and new directives is still possible, please try to avoid it when upgrading: either keep the old directives and
migrate to the new ones at a later stage, or migrate everything in bulk.
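The Satisfy directive mentioned earlier has no example in this section, so here is a sketch of one possible translation of a 2.2 "Satisfy Any" setup that uses only the new directives. The directory, network range, realm name, and password file path are hypothetical.

```apache
# 2.2 style (hypothetical network and realm):
#   Order deny,allow
#   Deny from all
#   Allow from 192.168.1.0/24
#   Require valid-user
#   Satisfy Any
#
# A possible 2.4 equivalent using only the new directives:
<Directory "/var/www/html/intranet">
    AuthType Basic
    AuthName "Intranet"
    AuthBasicProvider file
    AuthUserFile "/usr/local/apache2/conf/passwords"
    # Either a local network address or a valid login is sufficient.
    <RequireAny>
        Require ip 192.168.1.0/24
        Require valid-user
    </RequireAny>
</Directory>
```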
Other configuration changes
Some other small adjustments may be necessary for particular configurations as discussed below.
• MaxRequestsPerChild has been renamed to MaxConnectionsPerChild, which describes more accurately
what it does. The old name is still supported.

• MaxClients has been renamed to MaxRequestWorkers, which describes more accurately what it does.
For async MPMs, like event, the maximum number of clients is not equivalent to the number of worker
threads. The old name is still supported.
• The DefaultType directive no longer has any effect, other than to emit a warning if it’s used with any value
other than none. You need to use other configuration settings to replace it in 2.4.
• AllowOverride now defaults to None.
• EnableSendfile now defaults to Off.
• FileETag now defaults to "MTime Size" (without INode).
• mod_dav_fs: The format of the DavLockDB file has changed for systems with inodes. The old
DavLockDB file must be deleted on upgrade.
• KeepAlive only accepts values of On or Off. Previously, any value other than "Off" or "0" was treated as
"On".
• Directives AcceptMutex, LockFile, RewriteLock, SSLMutex, SSLStaplingMutex, and WatchdogMutexPath
have been replaced with a single Mutex directive. You will need to evaluate any use of these removed directives in your 2.2 configuration to determine if they can just be deleted or will need to be replaced using
Mutex.
• mod_cache: CacheIgnoreURLSessionIdentifiers now does an exact match against the query string
instead of a partial match. If your configuration was using partial strings, e.g. using sessionid to match
/someapplication/image.gif;jsessionid=123456789, then you will need to change to the full
string jsessionid.
• mod_cache: The second parameter to CacheEnable only matches forward proxy content if it begins with
the correct protocol. In 2.2 and earlier, a parameter of ’/’ matched all content.
• mod_ldap: LDAPTrustedClientCert is now consistently a per-directory setting only. If you use this
directive, review your configuration to make sure it is present in all the necessary directory contexts.
• mod_filter: FilterProvider syntax has changed and now uses a boolean expression to determine if a filter
is applied.
• mod_include:
– The #if expr element now uses the new expression parser (p. 99) . The old syntax can be restored with
the new directive SSILegacyExprParser.
– An SSI* config directive in directory scope no longer causes all other per-directory SSI* directives to be
reset to their default values.
• mod_charset_lite: The DebugLevel option has been removed in favour of per-module LogLevel configuration.
• mod_ext_filter: The DebugLevel option has been removed in favour of per-module LogLevel configuration.
• mod_proxy_scgi: The default setting for PATH_INFO has changed from httpd 2.2, and some web applications
will no longer operate properly with the new PATH_INFO setting. The previous setting can be restored by
configuring the proxy-scgi-pathinfo variable.
• mod_ssl: CRL based revocation checking now needs to be explicitly configured through SSLCARevocationCheck.
• mod_substitute: The maximum line length is now limited to 1MB.
• mod_reqtimeout: If the module is loaded, it will now set some default timeouts.
• mod_dumpio: DumpIOLogLevel is no longer supported. Data is always logged at LogLevel trace7.
• On Unix platforms, piped logging commands configured using either ErrorLog or CustomLog were invoked using /bin/sh -c in 2.2 and earlier. In 2.4 and later, piped logging commands are executed directly.
To restore the old behaviour, see the piped logging documentation (p. 56) .
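A few of the adjustments listed above can be sketched as follows. All values and paths are illustrative, the "common" log format nickname is assumed to be defined elsewhere in the configuration, and the right Mutex mechanism depends on your platform.

```apache
# Sketch of a few 2.2 settings rewritten for 2.4; all values are illustrative.

# Was: MaxClients 150 and MaxRequestsPerChild 10000
MaxRequestWorkers 150
MaxConnectionsPerChild 10000

# Was: KeepAlive 1 (only On or Off is accepted now)
KeepAlive On

# Was: AcceptMutex fcntl plus LockFile /var/lock/apache2/accept.lock
Mutex fcntl:/var/lock/apache2 default

# Was: DefaultType text/plain; map specific extensions instead.
AddType text/plain .txt

# Piped log that still relies on shell features: "|$" requests /bin/sh -c.
CustomLog "|$/usr/local/apache2/bin/rotatelogs /var/log/access_log 86400" common
```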

Misc Changes
• mod_autoindex: will now extract titles and display descriptions for .xhtml files, which were previously
ignored.
• mod_ssl: The default format of the *_DN variables has changed. The old format can still be used with the
new LegacyDNStringFormat argument to SSLOptions. The SSLv2 protocol is no longer supported.
SSLProxyCheckPeerCN and SSLProxyCheckPeerExpire now default to On, causing proxy requests
to HTTPS hosts with bad or outdated certificates to fail with a 502 status code (Bad gateway).
• htpasswd now uses MD5 hash by default on all platforms.
• The NameVirtualHost directive no longer has any effect, other than to emit a warning. Any address/port
combination appearing in multiple virtual hosts is implicitly treated as a name-based virtual host.
• mod_deflate will now skip compression if it knows that the size overhead added by the compression is larger
than the data to be compressed.
• Multi-language error documents from 2.2.x may not work unless they are adjusted to the new syntax of
mod_include’s #if expr= element or the directive SSILegacyExprParser is enabled for the directory
containing the error documents.
• The functionality provided by mod_authn_alias in previous versions (i.e., the AuthnProviderAlias
directive) has been moved into mod_authn_core.
• mod_cgid uses the server’s Timeout to limit the length of time to wait for CGI output. This timeout can be
overridden with CGIDScriptTimeout.
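As a sketch of how some of these changes surface in a configuration file, the fragment below uses three of the directives named above. The values and the error document directory are hypothetical.

```apache
# Illustrative values only.

# Keep the pre-2.4 format of the *_DN variables.
SSLOptions +LegacyDNStringFormat

# Give CGI scripts run through mod_cgid a limit separate from Timeout.
CGIDScriptTimeout 20

# Let existing multi-language error documents keep the old #if expr syntax.
<Directory "/usr/local/apache2/error">
    SSILegacyExprParser on
</Directory>
```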

Third Party Modules
All modules must be recompiled for 2.4 before being loaded.
Many third-party modules designed for version 2.2 will otherwise work unchanged with the Apache HTTP Server
version 2.4. Some will require changes; see the API update (p. 1035) overview.

Common problems when upgrading
• Startup errors:
– Invalid command ’User’, perhaps misspelled or defined by a module not
included in the server configuration - load module mod_unixd.
– Invalid command ’Require’, perhaps misspelled or defined by a module
not included in the server configuration, or Invalid command ’Order’,
perhaps misspelled or defined by a module not included in the server
configuration - load module mod_access_compat, or update configuration to 2.4 authorization
directives.
– Ignoring deprecated use of DefaultType in line NN of
/path/to/httpd.conf - remove DefaultType and replace with other configuration
settings.
– Invalid command ’AddOutputFilterByType’, perhaps misspelled or defined
by a module not included in the server configuration - AddOutputFilterByType has moved from the core to mod_filter, which must be loaded.
• Errors serving requests:
– configuration error: couldn’t check user: /path - load module
mod_authn_core.
– .htaccess files aren’t being processed - Check for an appropriate AllowOverride directive; the
default changed to None in 2.4.
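The remedies above mostly come down to loading the right modules. A sketch of the corresponding lines follows; the module paths assume a default DSO build, the document root is hypothetical, and mod_authz_core is named here on the assumption that the Require directive itself is what is missing.

```apache
# Paths assume a default DSO build; add only what your configuration needs.

# For "Invalid command 'User'":
LoadModule unixd_module modules/mod_unixd.so
# For "Invalid command 'Require'":
LoadModule authz_core_module modules/mod_authz_core.so
# For "Invalid command 'Order'" when old access directives are kept:
LoadModule access_compat_module modules/mod_access_compat.so
# For "Invalid command 'AddOutputFilterByType'":
LoadModule filter_module modules/mod_filter.so
# For "couldn't check user":
LoadModule authn_core_module modules/mod_authn_core.so

# If .htaccess files are ignored, re-enable overrides where they are needed.
<Directory "/var/www/html">
    AllowOverride All
</Directory>
```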

1.2 Overview of new features in Apache HTTP Server 2.4

This document describes some of the major changes between the 2.2 and 2.4 versions of the Apache HTTP Server.
For new features since version 2.0, see the 2.2 new features (p. 12) document.

Core Enhancements
Run-time Loadable MPMs Multiple MPMs can now be built as loadable modules (p. 90) at compile time. The MPM
of choice can be configured at run time via the LoadModule directive.
Event MPM The Event MPM (p. 1001) is no longer experimental but is now fully supported.
Asynchronous support Better support for asynchronous read/write for supporting MPMs and platforms.
Per-module and per-directory LogLevel configuration The LogLevel can now be configured per module and per
directory. New levels trace1 to trace8 have been added above the debug log level.
Per-request configuration sections <If>, <ElseIf>, and <Else> sections can be used to set the configuration
based on per-request criteria.
General-purpose expression parser A new expression parser allows you to specify complex conditions (p. 99) using a
common syntax in directives like SetEnvIfExpr, RewriteCond, Header, <If>, and others.
KeepAliveTimeout in milliseconds It is now possible to specify KeepAliveTimeout in milliseconds.
NameVirtualHost directive No longer needed and is now deprecated.
Override Configuration The new AllowOverrideList directive allows more fine-grained control over which directives are allowed in .htaccess files.
Config file variables It is now possible to Define variables in the configuration, allowing a clearer representation if
the same value is used at many places in the configuration.
Reduced memory usage Despite many new features, 2.4.x tends to use less memory than 2.2.x.
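Several of the core enhancements above are easiest to see side by side in a configuration fragment. The sketch below assumes mod_headers and mod_ssl are loaded; the variable name, paths, host name, and header are hypothetical.

```apache
# All names and values below are illustrative.

# Config file variable, reused further down.
Define docroot "/var/www/example"
DocumentRoot "${docroot}"

# Global level info, but more detail from mod_ssl only.
LogLevel info ssl:debug

# KeepAliveTimeout given in milliseconds.
KeepAliveTimeout 500ms

<Directory "${docroot}">
    # No directive groups are allowed in .htaccess, only these two directives.
    AllowOverride None
    AllowOverrideList Redirect RedirectMatch
</Directory>

# Per-request section driven by the expression parser.
<If "%{HTTP_HOST} == 'www.example.com'">
    Header set X-Example "canonical-host"
</If>
```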

New Modules
mod_proxy_fcgi
FastCGI Protocol backend for mod_proxy

mod_proxy_scgi
SCGI Protocol backend for mod_proxy

mod_proxy_express
Provides dynamically configured mass reverse proxies for mod_proxy

mod_remoteip
Replaces the apparent client remote IP address and hostname for the request with the IP address
list presented by a proxy or a load balancer via the request headers.

mod_heartmonitor, mod_lbmethod_heartbeat
Allow mod_proxy_balancer to base loadbalancing
decisions on the number of active connections on the backend servers.

mod_proxy_html
Formerly a third-party module, this supports fixing of HTML links in a reverse proxy situation,
where the backend generates URLs that are not valid for the proxy’s clients.

mod_sed
An advanced replacement of mod_substitute that allows editing the response body with the full power of sed.

mod_auth_form
Enables form-based authentication.

mod_session
Enables the use of session state for clients, using cookie or database storage.

mod_allowmethods
New module to restrict certain HTTP methods without interfering with authentication or authorization.

mod_lua
Embeds the Lua² language into httpd, for configuration and small business logic functions. (Experimental)

mod_log_debug
Allows the addition of customizable debug logging at different phases of the request processing.

mod_buffer
Provides for buffering the input and output filter stacks

mod_data
Convert response body into an RFC2397 data URL

mod_ratelimit
Provides Bandwidth Rate Limiting for Clients

mod_request
Provides Filters to handle and make available HTTP request bodies

mod_reflector
Provides Reflection of a request body as a response via the output filter stack.

mod_slotmem_shm
Provides a Slot-based shared memory provider (ala the scoreboard).

mod_xml2enc
Formerly a third-party module, this supports internationalisation in libxml2-based (markup-aware)
filter modules.

mod_macro
(available since 2.4.5) Provide macros within configuration files.

mod_proxy_wstunnel
(available since 2.4.5) Support web-socket tunnels.

mod_authnz_fcgi
(available since 2.4.10) Enable FastCGI authorizer applications to authenticate and/or authorize clients.

mod_http2
(available since 2.4.17) Support for the HTTP/2 transport layer.
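To give a feel for how three of the new modules are switched on, here is a short sketch. It assumes mod_remoteip, mod_ratelimit, mod_env, and mod_allowmethods are loaded; the proxy address, URL paths, and rate are hypothetical.

```apache
# Illustrative only; addresses, paths and limits are hypothetical.

# mod_remoteip: trust X-Forwarded-For from an internal load balancer.
RemoteIPHeader X-Forwarded-For
RemoteIPInternalProxy 10.0.0.1

# mod_ratelimit: limit responses under /downloads to about 400 KiB/s.
<Location "/downloads">
    SetOutputFilter RATE_LIMIT
    SetEnv rate-limit 400
</Location>

# mod_allowmethods: permit only these methods under /api.
<Location "/api">
    AllowMethods GET POST OPTIONS
</Location>
```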

Module Enhancements
mod_ssl
mod_ssl can now be configured to use an OCSP server to check the validation status of a client certificate.
The default responder is configurable, along with the decision on whether to prefer the responder designated in
the client certificate itself.
mod_ssl now also supports OCSP stapling, where the server pro-actively obtains an OCSP verification of its
certificate and transmits that to the client during the handshake.
mod_ssl can now be configured to share SSL Session data between servers through memcached.
EC keys are now supported in addition to RSA and DSA.
Support for TLS-SRP (available in 2.4.4 and later).

mod_proxy
The ProxyPass directive is now most optimally configured within a Location or LocationMatch
block, and offers a significant performance advantage over the traditional two-parameter syntax when present
in large numbers.
The source address used for proxy requests is now configurable.
Support for Unix domain sockets to the backend (available in 2.4.7 and later).

mod_proxy_balancer
More runtime configuration changes for BalancerMembers via balancer-manager
Additional BalancerMembers can be added at runtime via balancer-manager
Runtime configuration of a subset of Balancer parameters
BalancerMembers can be set to ’Drain’ so that they only respond to existing sticky sessions, allowing them to
be taken gracefully offline.
Balancer settings can be persistent after restarts.
2 http://www.lua.org/

mod_cache
The mod_cache CACHE filter can be optionally inserted at a given point in the filter chain to provide
fine control over caching.
mod_cache can now cache HEAD requests.
Where possible, mod_cache directives can now be set per directory, instead of per server.
The base URL of cached URLs can be customised, so that a cluster of caches can share the same endpoint URL
prefix.
mod_cache is now capable of serving stale cached data when a backend is unavailable (error 5xx).
mod_cache can now insert HIT/MISS/REVALIDATE into an X-Cache header.

mod_include
Support for the ’onerror’ attribute within an ’include’ element, allowing an error document to be
served on error instead of the default error string.

mod_cgi, mod_include, mod_isapi, ...
Translation of headers to environment variables is more strict than before
to mitigate some possible cross-site-scripting attacks via header injection. Headers containing invalid characters
(including underscores) are now silently dropped. Environment Variables in Apache (p. 92) has some pointers
on how to work around broken legacy clients which require such headers. (This affects all modules which use
these environment variables.)

mod_authz_core Authorization Logic Containers
Advanced authorization logic may now be specified using the
Require directive and the related container directives, such as <RequireAll>.

mod_rewrite
mod_rewrite adds the [QSD] (Query String Discard) and [END] flags for RewriteRule to
simplify common rewriting scenarios.
Adds the possibility to use complex boolean expressions in RewriteCond.
Allows the use of SQL queries as RewriteMap functions.

mod_ldap, mod_authnz_ldap
mod_authnz_ldap adds support for nested groups.
mod_ldap adds LDAPConnectionPoolTTL, LDAPTimeout, and other improvements in the handling of timeouts. This is especially useful for setups where a stateful firewall drops idle connections to the LDAP server.
mod_ldap adds LDAPLibraryDebug to log debug information provided by the used LDAP toolkit.

mod_info
mod_info can now dump the pre-parsed configuration to stdout during server startup.

mod_auth_basic
New generic mechanism to fake basic authentication (available in 2.4.5 and later).
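A few of the module enhancements above can be sketched in configuration form. The fragment assumes mod_proxy, mod_proxy_http, mod_ssl, mod_rewrite, and mod_cache are loaded; host names, socket path, URL paths, and the cache size are hypothetical.

```apache
# Sketch; host names, paths and cache sizes are hypothetical.

# mod_proxy: ProxyPass inside <Location> takes only the target URL.
<Location "/app/">
    ProxyPass "http://backend.example.com:8080/app/"
</Location>

# mod_proxy (2.4.7 and later): backend reached over a Unix domain socket.
ProxyPass "/sock/" "unix:/var/run/app.sock|http://localhost/"

# mod_ssl: OCSP stapling needs a stapling cache defined at server level.
SSLUseStapling on
SSLStaplingCache "shmcb:logs/ssl_stapling(32768)"

# mod_rewrite: [QSD] drops the query string, [END] stops all rewriting.
RewriteEngine on
RewriteRule "^/old/(.*)$" "/new/$1" [QSD,END]

# mod_cache: report HIT/MISS/REVALIDATE in an X-Cache response header.
CacheHeader on
```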

Program Enhancements
fcgistarter New FastCGI daemon starter utility
htcacheclean Current cached URLs can now be listed, with optional metadata included. Allow explicit deletion
of individual cached URLs from the cache. File sizes can now be rounded up to the given block size, making the
size limits map more closely to the real size on disk. Cache size can now be limited by the number of inodes,
instead of or in addition to being limited by the size of the files on disk.
rotatelogs May now create a link to the current log file. May now invoke a custom post-rotate script.
htpasswd, htdbm Support for the bcrypt algorithm (available in 2.4.4 and later).

Documentation
mod_rewrite The mod_rewrite documentation has been rearranged and almost completely rewritten, with a focus
on examples and common usage, as well as on showing you when other solutions are more appropriate. The
Rewrite Guide (p. 146) is now a top-level section with much more detail and better organization.
mod_ssl The mod_ssl documentation has been greatly enhanced, with more examples at the getting started level, in
addition to the previous focus on technical details.
Caching Guide The Caching Guide (p. 43) has been rewritten to properly distinguish between the RFC2616
HTTP/1.1 caching features provided by mod_cache, and the generic key/value caching provided by the
socache (p. 114) interface, as well as to cover specialised caching provided by mechanisms such as
mod_file_cache.

Module Developer Changes
Check Configuration Hook Added A new hook, check_config, has been added which runs between the
pre_config and open_logs hooks. It also runs before the test_config hook when the -t option is
passed to httpd. The check_config hook allows modules to review interdependent configuration directive values and adjust them while messages can still be logged to the console. The user can thus be alerted to
misconfiguration problems before the core open_logs hook function redirects console output to the error log.
Expression Parser Added We now have a general-purpose expression parser, whose API is exposed in ap_expr.h.
This is adapted from the expression parser previously implemented in mod_ssl.
Authorization Logic Containers Authorization modules now register as a provider, via ap_register_auth_provider(),
to support advanced authorization logic, such as <RequireAll>.
Small-Object Caching Interface The ap_socache.h header exposes a provider-based interface for caching small data
objects, based on the previous implementation of the mod_ssl session cache. Providers using a shared-memory
cyclic buffer, disk-based dbm files, and a memcache distributed cache are currently supported.
Cache Status Hook Added The mod_cache module now includes a new cache_status hook, which is called
when the caching decision becomes known. A default implementation is provided which adds an optional
X-Cache and X-Cache-Detail header to the response.
The developer documentation contains a detailed list of API changes (p. 1035) .

1.3 Overview of new features in Apache HTTP Server 2.2

This document describes some of the major changes between the 2.0 and 2.2 versions of the Apache HTTP Server.
For new features since version 1.3, see the 2.0 new features (p. 15) document.

Core Enhancements
Authn/Authz The bundled authentication and authorization modules have been refactored. The new
mod_authn_alias (already removed from 2.3/2.4) module can greatly simplify certain authentication configurations. See module name changes, and the developer changes for more information about how these changes
affect users and module writers.
Caching mod_cache, mod_cache_disk, and mod_mem_cache (already removed from 2.3/2.4) have undergone a
lot of changes, and are now considered production-quality. htcacheclean has been introduced to clean up
mod_cache_disk setups.
Configuration The default configuration layout has been simplified and modularised. Configuration snippets which
can be used to enable commonly-used features are now bundled with Apache, and can be easily added to the
main server config.
Graceful stop The prefork, worker and event MPMs now allow httpd to be shut down gracefully via the
graceful-stop (p. 29) signal. The GracefulShutdownTimeout directive has been added to specify
an optional timeout, after which httpd will terminate regardless of the status of any requests being served.
Proxying The new mod_proxy_balancer module provides load balancing services for mod_proxy. The
new mod_proxy_ajp module adds support for the Apache JServ Protocol version 1.3 used by
Apache Tomcat³.
Regular Expression Library Updated Version 5.0 of the Perl Compatible Regular Expression Library⁴ (PCRE) is
now included. httpd can be configured to use a system installation of PCRE by passing the --with-pcre
flag to configure.
Smart Filtering mod_filter introduces dynamic configuration to the output filter chain. It enables filters to be
conditionally inserted, based on any Request or Response header or environment variable, and dispenses with
the more problematic dependencies and ordering problems in the 2.0 architecture.
Large File Support httpd is now built with support for files larger than 2GB on modern 32-bit Unix systems.
Support for handling >2GB request bodies has also been added.
Event MPM The event MPM uses a separate thread to handle Keep Alive requests and accepting connections. Keep
Alive requests have traditionally required httpd to dedicate a worker to handle them. This dedicated worker could
not be used again until the Keep Alive timeout was reached.
SQL Database Support mod_dbd, together with the apr_dbd framework, brings direct SQL support to modules
that need it. Supports connection pooling in threaded MPMs.
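The proxying and graceful stop items above translate into configuration roughly as follows. This is a sketch that assumes mod_proxy, mod_proxy_http, mod_proxy_balancer, and mod_proxy_ajp are loaded; the backend addresses, URL paths, and timeout are hypothetical.

```apache
# Hypothetical backend addresses and URL paths.

# mod_proxy_balancer: spread requests for /app over two backends.
<Proxy "balancer://mycluster">
    BalancerMember "http://192.168.1.50:80"
    BalancerMember "http://192.168.1.51:80"
</Proxy>
ProxyPass "/app" "balancer://mycluster"

# mod_proxy_ajp: forward to an AJP 1.3 connector such as Apache Tomcat's.
ProxyPass "/tomcat/" "ajp://localhost:8009/"

# Give in-flight requests at most 30 seconds after a graceful-stop.
GracefulShutdownTimeout 30
```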

Module Enhancements
Authn/Authz Modules in the aaa directory have been renamed and offer better support for digest authentication.
For example, mod_auth is now split into mod_auth_basic and mod_authn_file; mod_auth_dbm is
now called mod_authn_dbm; mod_access has been renamed mod_authz_host. There is also a new
mod_authn_alias (already removed from 2.3/2.4) module for simplifying certain authentication configurations.
3 http://tomcat.apache.org/
4 http://www.pcre.org/

mod_authnz_ldap
This module is a port of the 2.0 mod_auth_ldap module to the 2.2 Authn/Authz framework. New features include using LDAP attribute values and complicated search filters in the Require directive.

mod_authz_owner
A new module that authorizes access to files based on the owner of the file on the file system

mod_version
A new module that allows configuration blocks to be enabled based on the version number of the
running server.

mod_info
Added a new ?config argument which will show the configuration directives as parsed by Apache,
including their file name and line number. The module also shows the order of all request hooks and additional
build information, similar to httpd -V.

mod_ssl
Added support for RFC 2817⁵, which allows connections to upgrade from clear text to TLS encryption.

mod_imagemap
mod_imap has been renamed to mod_imagemap to avoid user confusion.
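As an illustration of the mod_version item above, a configuration shared between server versions could branch on the running version like this. The sketch assumes mod_version is loaded, and the directory path is hypothetical.

```apache
# Requires mod_version; the directory path is hypothetical.
<Directory "/var/www/html">
    <IfVersion >= 2.4>
        # Seen only by 2.4 and later.
        Require all granted
    </IfVersion>
    <IfVersion < 2.4>
        # Seen only by older servers.
        Order allow,deny
        Allow from all
    </IfVersion>
</Directory>
```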

Program Enhancements
httpd A new command line option -M has been added that lists all modules that are loaded based on the current
configuration. Unlike the -l option, this list includes DSOs loaded via mod_so.
httxt2dbm A new program used to generate dbm files from text input, for use in RewriteMap with the dbm map
type.

Module Developer Changes
APR 1.0 API Apache 2.2 uses the APR 1.0 API. All deprecated functions and symbols have been removed from APR
and APR-Util. For details, see the APR Website⁶.
Authn/Authz The bundled authentication and authorization modules have been renamed along the following lines:
• mod_auth_* -> Modules that implement an HTTP authentication mechanism
• mod_authn_* -> Modules that provide a backend authentication provider
• mod_authz_* -> Modules that implement authorization (or access)
• mod_authnz_* -> Module that implements both authentication & authorization

There is a new authentication backend provider scheme which greatly eases the construction of new authentication backends.
Connection Error Logging A new function, ap_log_cerror, has been added to log errors that occur with the
client’s connection. When logged, the message includes the client IP address.
Test Configuration Hook Added A new hook, test_config, has been added to aid modules that want to execute
special code only when the user passes -t to httpd.
Set Threaded MPM’s Stacksize A new directive, ThreadStackSize, has been added to set the stack size on all
threaded MPMs. This is required for some third-party modules on platforms with small default thread stack
size.
Protocol handling for output filters In the past, every filter has been responsible for ensuring that it generates the
correct response headers where it affects them. Filters can now delegate common protocol management to
mod_filter, using the ap_register_output_filter_protocol or ap_filter_protocol calls.
5 http://www.ietf.org/rfc/rfc2817.txt
6 http://apr.apache.org/

Monitor hook added Monitor hook enables modules to run regular/scheduled jobs in the parent (root) process.
Regular expression API changes The pcreposix.h header is no longer available; it is replaced by the new
ap_regex.h header. The POSIX.2 regex.h implementation exposed by the old header is now available
under the ap_ namespace from ap_regex.h. Calls to regcomp, regexec and so on can be replaced by
calls to ap_regcomp, ap_regexec.
DBD Framework (SQL Database API) With Apache 1.x and 2.0, modules requiring an SQL backend had to take
responsibility for managing it themselves. Apart from reinventing the wheel, this can be very inefficient, for
example when several modules each maintain their own connections.
Apache 2.1 and later provides the ap_dbd API for managing database connections (including optimised strategies for threaded and unthreaded MPMs), while APR 1.2 and later provides the apr_dbd API for interacting
with the database.
New modules SHOULD now use these APIs for all SQL database operations. Existing applications SHOULD
be upgraded to use it where feasible, either transparently or as a recommended option to their users.

1.4 Overview of new features in Apache HTTP Server 2.0

This document describes some of the major changes between the 1.3 and 2.0 versions of the Apache HTTP Server.
See also
• Upgrading to 2.0 from 1.3 (p. 2)

Core Enhancements
Unix Threading On Unix systems with POSIX threads support, Apache httpd can now run in a hybrid multiprocess,
multithreaded mode. This improves scalability for many, but not all configurations.
New Build System The build system has been rewritten from scratch to be based on autoconf and libtool. This
makes Apache httpd’s configuration system more similar to that of other packages.
Multiprotocol Support Apache HTTP Server now has some of the infrastructure in place to support serving multiple
protocols. mod_echo has been written as an example.
Better support for non-Unix platforms Apache HTTP Server 2.0 is faster and more stable on non-Unix platforms
such as BeOS, OS/2, and Windows. With the introduction of platform-specific multi-processing modules (p. 90)
(MPMs) and the Apache Portable Runtime (APR), these platforms are now implemented in their native API,
avoiding the often buggy and poorly performing POSIX-emulation layers.
New Apache httpd API The API for modules has changed significantly for 2.0. Many of the module-ordering/priority problems from 1.3 should be gone. 2.0 does much of this automatically, and module ordering is now
done per-hook to allow more flexibility. Also, new calls have been added that provide additional module capabilities without patching the core Apache HTTP Server.
IPv6 Support On systems where IPv6 is supported by the underlying Apache Portable Runtime library, Apache httpd
gets IPv6 listening sockets by default. Additionally, the Listen, NameVirtualHost, and VirtualHost
directives support IPv6 numeric address strings (e.g., "Listen [2001:db8::1]:8080").
Filtering Apache httpd modules may now be written as filters which act on the stream of content as it is delivered to
or from the server. This allows, for example, the output of CGI scripts to be parsed for Server Side Include directives using the INCLUDES filter in mod_include. The module mod_ext_filter allows external programs
to act as filters in much the same way that CGI programs can act as handlers.
Multilanguage Error Responses Error response messages to the browser are now provided in several languages,
using SSI documents. They may be customized by the administrator to achieve a consistent look and feel.
Simplified configuration Many confusing directives have been simplified. The often confusing Port and
BindAddress directives are gone; only the Listen directive is used for IP address binding; the ServerName directive specifies the server name and port number only for redirection and vhost recognition.
Native Windows NT Unicode Support Apache httpd 2.0 on Windows NT now uses utf-8 for all filename encodings.
These directly translate to the underlying Unicode file system, providing multilanguage support for all Windows
NT-based installations, including Windows 2000 and Windows XP. This support does not extend to Windows
95, 98 or ME, which continue to use the machine’s local codepage for filesystem access.
Regular Expression Library Updated Apache httpd 2.0 includes the Perl Compatible Regular Expression Library⁷
(PCRE). All regular expression evaluation now uses the more powerful Perl 5 syntax.
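To illustrate the IPv6 Support item in the list above, a listener and virtual host bound to a numeric IPv6 address might look like this. The address is the documentation-prefix example from the text; the server name and document root are hypothetical.

```apache
# IPv6 addresses are written in square brackets, followed by the port.
Listen [2001:db8::1]:8080

<VirtualHost [2001:db8::1]:8080>
    # Hypothetical server name and document root.
    ServerName www.example.com
    DocumentRoot "/var/www/example"
</VirtualHost>
```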
7 http://www.pcre.org/


Module Enhancements
mod_ssl New module in Apache httpd 2.0. This module is an interface to the SSL/TLS encryption protocols provided by OpenSSL.
mod_dav New module in Apache httpd 2.0. This module implements the HTTP Distributed Authoring and Versioning (DAV) specification for posting and maintaining web content.
mod_deflate New module in Apache httpd 2.0. This module allows supporting browsers to request that content be compressed before delivery, saving network bandwidth.
mod_auth_ldap New module in Apache httpd 2.0.41. This module allows an LDAP database to be used to store credentials for HTTP Basic Authentication. A companion module, mod_ldap, provides connection pooling and results caching.
mod_auth_digest Includes additional support for session caching across processes using shared memory.
mod_charset_lite New module in Apache httpd 2.0. This experimental module allows for character set translation or recoding.
mod_file_cache New module in Apache httpd 2.0. This module includes the functionality of mod_mmap_static in Apache HTTP Server version 1.3, plus adds further caching abilities.
mod_headers This module is much more flexible in Apache httpd 2.0. It can now modify request headers used by mod_proxy, and it can conditionally set response headers.
mod_proxy The proxy module has been completely rewritten to take advantage of the new filter infrastructure and to implement a more reliable, HTTP/1.1 compliant proxy. In addition, new <Proxy> configuration sections provide more readable (and internally faster) control of proxied sites; overloaded <Directory "proxy:..."> configurations are not supported. The module is now divided into specific protocol support modules including proxy_connect, proxy_ftp and proxy_http.
mod_negotiation A new ForceLanguagePriority directive can be used to assure that the client receives a single document in all cases, rather than NOT ACCEPTABLE or MULTIPLE CHOICES responses. In addition, the negotiation and MultiViews algorithms have been cleaned up to provide more consistent results and a new form of type map that can include document content is provided.
mod_autoindex Autoindex’ed directory listings can now be configured to use HTML tables for cleaner formatting, and allow finer-grained control of sorting, including version-sorting, and wildcard filtering of the directory listing.
mod_include New directives allow the default start and end tags for SSI elements to be changed and allow for error and time format configuration to take place in the main configuration file rather than in the SSI document. Results from regular expression parsing and grouping (now based on Perl’s regular expression syntax) can be retrieved using mod_include’s variables $0 .. $9.
mod_auth_dbm Now supports multiple types of DBM-like databases using the AuthDBMType directive.

1.5 The Apache License, Version 2.0

Apache License Version 2.0, January 2004 http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity.
For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. 
For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted.
If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. 
You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. 
In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

APPENDIX: How to apply the Apache License to your work.

To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don’t include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives.
Copyright [yyyy] [name of copyright owner]

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

Chapter 2 Using the Apache HTTP Server

2.1 Compiling and Installing

This document covers compilation and installation of the Apache HTTP Server on Unix and Unix-like systems only. For compiling and installation on Windows, see Using Apache HTTP Server with Microsoft Windows (p. 267) and Compiling Apache for Microsoft Windows (p. 275). For other platforms, see the platform (p. 266) documentation.

Apache httpd uses libtool and autoconf to create a build environment that looks like many other Open Source projects.

If you are upgrading from one minor version to the next (for example, 2.4.8 to 2.4.9), please skip down to the upgrading section.

See also
• Configure the source tree (p. 307)
• Starting Apache httpd (p. 27)
• Stopping and Restarting (p. 29)

Overview for the impatient

Download    $ lynx http://httpd.apache.org/download.cgi
Extract     $ gzip -d httpd-NN.tar.gz
            $ tar xvf httpd-NN.tar
            $ cd httpd-NN
Configure   $ ./configure --prefix=PREFIX
Compile     $ make
Install     $ make install
Customize   $ vi PREFIX/conf/httpd.conf
Test        $ PREFIX/bin/apachectl -k start

NN must be replaced with the current version number, and PREFIX must be replaced with the filesystem path under which the server should be installed. If PREFIX is not specified, it defaults to /usr/local/apache2.
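To make the NN and PREFIX substitution concrete, here is a minimal dry-run sketch. The function name print_build_steps is hypothetical, and the interactive download and editing steps are left out. It only prints the commands from the overview for a given version and prefix; it does not download, build, or install anything.

```shell
# Print the non-interactive build steps from the overview.
# $1 = version string that replaces NN
# $2 = installation prefix that replaces PREFIX
#      (defaults to /usr/local/apache2, the documented default)
# print_build_steps is an illustrative helper, not part of httpd.
print_build_steps() {
    nn=$1
    prefix=${2:-/usr/local/apache2}
    cat <<EOF
gzip -d httpd-${nn}.tar.gz
tar xvf httpd-${nn}.tar
cd httpd-${nn}
./configure --prefix=${prefix}
make
make install
${prefix}/bin/apachectl -k start
EOF
}
```

For example, print_build_steps 2.4.20 /opt/httpd (both values are placeholders) lists the commands with httpd-2.4.20 and /opt/httpd filled in, which can be reviewed before running them by hand.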
Each section of the compilation and installation process is described in more detail below, beginning with the requirements for compiling and installing Apache httpd.

Requirements

The following requirements exist for building Apache httpd:

APR and APR-Util Make sure you have APR and APR-Util already installed on your system. If you don’t, or prefer to not use the system-provided versions, download the latest versions of both APR and APR-Util from Apache APR (http://apr.apache.org/), unpack them into /httpd_source_tree_root/srclib/apr and /httpd_source_tree_root/srclib/apr-util (be sure the directory names do not have version numbers; for example, the APR distribution must be under /httpd_source_tree_root/srclib/apr/) and use ./configure’s --with-included-apr option. On some platforms, you may have to install the corresponding -dev packages to allow httpd to build against your installed copy of APR and APR-Util.
Perl-Compatible Regular Expressions Library (PCRE) This library is required but no longer bundled with httpd. Download the source code from http://www.pcre.org/, or install a Port or Package. If your build system can’t find the pcre-config script installed by the PCRE build, point to it using the --with-pcre parameter. On some platforms, you may have to install the corresponding -dev package to allow httpd to build against your installed copy of PCRE.
Disk Space Make sure you have at least 50 MB of temporary free disk space available. After installation the server occupies approximately 10 MB of disk space. The actual disk space requirements will vary considerably based on your chosen configuration options, any third-party modules, and, of course, the size of the web site or sites that you have on the server.
ANSI-C Compiler and Build System Make sure you have an ANSI-C compiler installed. The GNU C compiler (GCC) [3] from the Free Software Foundation (FSF) [4] is recommended.
If you don’t have GCC then at least make sure your vendor’s compiler is ANSI compliant. In addition, your PATH must contain basic build tools such as make.
Accurate time keeping Elements of the HTTP protocol are expressed as the time of day. So, it’s time to investigate setting some time synchronization facility on your system. Usually the ntpdate or xntpd programs, which are based on the Network Time Protocol (NTP), are used for this purpose. See the NTP homepage [5] for more details about NTP software and public time servers.
Perl 5 [OPTIONAL] For some of the support scripts like apxs or dbmmanage (which are written in Perl) the Perl 5 interpreter is required (versions 5.003 or newer are sufficient). If no Perl 5 interpreter is found by the configure script, you will not be able to use the affected support scripts. Of course, you will still be able to build and use Apache httpd.

Download

The Apache HTTP Server can be downloaded from the Apache HTTP Server download site [7], which lists several mirrors. Most users of Apache on Unix-like systems will be better off downloading and compiling a source version. The build process (described below) is easy, and it allows you to customize your server to suit your needs. In addition, binary releases are often not up to date with the latest source releases. If you do download a binary, follow the instructions in the INSTALL.bindist file inside the distribution.

After downloading, it is important to verify that you have a complete and unmodified version of the Apache HTTP Server. This can be accomplished by testing the downloaded tarball against the PGP signature. Details on how to do this are available on the download page [8] and an extended example is available describing the use of PGP [9].
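As a rough illustration of the signature check described above, the following sketch wraps GnuPG's detached-signature verification. It rests on several assumptions that are not stated in this section: that GnuPG (gpg) is installed, that the signature file sits next to the tarball under the name of the tarball plus .asc, and that the signing keys have already been imported. The function name verify_tarball is hypothetical. The download page remains the authoritative procedure.

```shell
# Verify a tarball against a detached PGP signature.
# Assumptions (not from the manual text): gpg is available, the
# signature is stored as <tarball>.asc, and the release signing
# keys were imported into the local keyring beforehand.
verify_tarball() {
    tarball=$1
    sig=${tarball}.asc
    # Fail early with a distinct exit code if either file is absent.
    if [ ! -f "$tarball" ] || [ ! -f "$sig" ]; then
        echo "missing $tarball or $sig" >&2
        return 2
    fi
    # gpg exits non-zero if the signature does not match the file.
    gpg --verify "$sig" "$tarball"
}
```

A call such as verify_tarball httpd-NN.tar.gz returns 0 only when gpg accepts the signature, so it can gate the extract step in a script.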
[3] http://gcc.gnu.org/
[4] http://www.gnu.org/
[5] http://www.ntp.org
[7] http://httpd.apache.org/download.cgi
[8] http://httpd.apache.org/download.cgi#verify
[9] http://httpd.apache.org/dev/verification.html

Extract

Extracting the source from the Apache HTTP Server tarball is a simple matter of uncompressing, and then untarring:

$ gzip -d httpd-NN.tar.gz
$ tar xvf httpd-NN.tar

This will create a new directory under the current directory containing the source code for the distribution. You should cd into that directory before proceeding with compiling the server.

Configuring the source tree

The next step is to configure the Apache source tree for your particular platform and personal requirements. This is done using the script configure included in the root directory of the distribution. (Developers downloading an unreleased version of the Apache source tree will need to have autoconf and libtool installed and will need to run buildconf before proceeding with the next steps. This is not necessary for official releases.)

To configure the source tree using all the default options, simply type ./configure. To change the default options, configure accepts a variety of variables and command line options.

The most important option is the location --prefix where Apache is to be installed later, because Apache has to be configured for this location to work correctly. More fine-tuned control of the location of files is possible with additional configure options (p. 307).

Also at this point, you can specify which features (p. 307) you want included in Apache by enabling and disabling modules (p. 1101). Apache comes with a wide range of modules included by default. They will be compiled as shared objects (DSOs) (p. 68) which can be loaded or unloaded at runtime. You can also choose to compile modules statically by using the option --enable-module=static.
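The --enable-module and --disable-module options in this section take the module name with the mod_ prefix removed and every underscore turned into a dash. A minimal sketch of that mapping follows; the helper name module_flag is hypothetical and not part of the httpd build system.

```shell
# Turn a module name into the matching configure option.
# $1 = "enable" or "disable"
# $2 = module name such as mod_auth_basic
# module_flag is an illustrative helper only.
module_flag() {
    action=$1
    module=$2
    name=${module#mod_}                        # drop the leading "mod_"
    name=$(printf '%s' "$name" | tr '_' '-')   # underscores become dashes
    printf '%s\n' "--${action}-${name}"
}
```

Because configure silently ignores options for modules that do not exist, generating the flag this way only guards against spelling the conversion wrong, not against naming a module that is absent.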
Additional modules are enabled using the --enable-module option, where module is the name of the module with the mod_ string removed and with any underscore converted to a dash. Similarly, you can disable modules with the --disable-module option. Be careful when using these options, since configure cannot warn you if the module you specify does not exist; it will simply ignore the option.

In addition, it is sometimes necessary to provide the configure script with extra information about the location of your compiler, libraries, or header files. This is done by passing either environment variables or command line options to configure. For more information, see the configure manual page, or invoke configure using the --help option.

For a short impression of what possibilities you have, here is a typical example which compiles Apache for the installation tree /sw/pkg/apache with a particular compiler and flags plus the two additional modules mod_ldap and mod_lua:

$ CC="pgcc" CFLAGS="-O2" \
./configure --prefix=/sw/pkg/apache \
--enable-ldap=shared \
--enable-lua=shared

When configure is run it will take several minutes to test for the availability of features on your system and build Makefiles which will later be used to compile the server.

Details on all the different configure options are available on the configure manual page.

Build

Now you can build the various parts which form the Apache package by simply running the command:

$ make

Please be patient here, since a base configuration takes several minutes to compile and the time will vary widely depending on your hardware and the number of modules that you have enabled.

Install

Now it’s time to install the package under the configured installation PREFIX (see --prefix option above) by running:

$ make install

This step will typically require root privileges, since PREFIX is usually a directory with restricted write permissions.
If you are upgrading, the installation will not overwrite your configuration files or documents.

Customize

Next, you can customize your Apache HTTP server by editing the configuration files (p. 32) under PREFIX/conf/.

$ vi PREFIX/conf/httpd.conf

Have a look at the Apache manual under PREFIX/docs/manual/ or consult http://httpd.apache.org/docs/trunk/ for the most recent version of this manual and a complete reference of available configuration directives (p. 1106).

Test

Now you can start (p. 27) your Apache HTTP server by immediately running:

$ PREFIX/bin/apachectl -k start

You should then be able to request your first document via the URL http://localhost/. The web page you see is located under the DocumentRoot, which will usually be PREFIX/htdocs/. Then stop (p. 29) the server again by running:

$ PREFIX/bin/apachectl -k stop

Upgrading

The first step in upgrading is to read the release announcement and the file CHANGES in the source distribution to find any changes that may affect your site. When changing between major releases (for example, from 2.0 to 2.2 or from 2.2 to 2.4), there will likely be major differences in the compile-time and run-time configuration that will require manual adjustments. All modules will also need to be upgraded to accommodate changes in the module API.

Upgrading from one minor version to the next (for example, from 2.2.55 to 2.2.57) is easier. The make install process will not overwrite any of your existing documents, log files, or configuration files. In addition, the developers make every effort to avoid incompatible changes in the configure options, run-time configuration, or the module API between minor versions. In most cases you should be able to use an identical configure command line, an identical configuration file, and all of your modules should continue to work.
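The distinction drawn above between a major release change (2.2 to 2.4) and a minor version change (2.2.55 to 2.2.57) can be checked mechanically. The sketch below assumes three-part version strings of the form X.Y.Z; the function name upgrade_kind is hypothetical.

```shell
# Classify an upgrade using this section's terminology.
# Assumes three-part versions (X.Y.Z); prints "minor" when the
# X.Y series is unchanged and "major" otherwise.
upgrade_kind() {
    old_series=${1%.*}    # 2.2.55 -> 2.2
    new_series=${2%.*}    # 2.2.57 -> 2.2
    if [ "$old_series" = "$new_series" ]; then
        echo minor
    else
        echo major
    fi
}
```

A result of "minor" suggests the config.nice procedure described in this section applies; "major" means the configuration and any third-party modules need a manual review first.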
To upgrade across minor versions, start by finding the file config.nice in the build directory of your installed server or at the root of the source tree for your old install. This will contain the exact configure command line that you used to configure the source tree. Then to upgrade from one version to the next, you need only copy the config.nice file to the source tree of the new version, edit it to make any desired changes, and then run:

$ ./config.nice
$ make
$ make install
$ PREFIX/bin/apachectl -k graceful-stop
$ PREFIX/bin/apachectl -k start

Warning: You should always test any new version in your environment before putting it into production. For example, you can install and run the new version alongside the old one by using a different --prefix and a different port (by adjusting the Listen directive) to test for any incompatibilities before doing the final upgrade.

You can pass additional arguments to config.nice, which will be appended to your original configure options:

$ ./config.nice --prefix=/home/test/apache --with-port=90

Third-party packages

A large number of third parties provide their own packaged distributions of the Apache HTTP Server for installation on particular platforms. This includes the various Linux distributions, various third-party Windows packages, Mac OS X, Solaris, and many more.

Our software license not only permits, but encourages, this kind of redistribution. However, it does result in a situation where the configuration layout and defaults on your installation of the server may differ from what is stated in the documentation. While unfortunate, this situation is not likely to change any time soon.

A description of these third-party distributions [10] is maintained in the HTTP Server wiki, and should reflect the current state of these third-party distributions. However, you will need to familiarize yourself with your particular platform’s package management and installation procedures.
[10] http://wiki.apache.org/httpd/DistrosDefaultLayout

2.2 Starting Apache

On Windows, Apache is normally run as a service. For details, see Running Apache as a Service (p. 267).

On Unix, the httpd program is run as a daemon that executes continuously in the background to handle requests. This document describes how to invoke httpd.

See also
• Stopping and Restarting (p. 29)
• httpd
• apachectl

How Apache Starts

If the Listen specified in the configuration file is the default of 80 (or any other port below 1024), then it is necessary to have root privileges in order to start Apache, so that it can bind to this privileged port. Once the server has started and performed a few preliminary activities such as opening its log files, it will launch several child processes which do the work of listening for and answering requests from clients. The main httpd process continues to run as the root user, but the child processes run as a less privileged user. This is controlled by the selected Multi-Processing Module (p. 90).

The recommended method of invoking the httpd executable is to use the apachectl control script. This script sets certain environment variables that are necessary for httpd to function correctly under some operating systems, and then invokes the httpd binary. apachectl will pass through any command line arguments, so any httpd options may also be used with apachectl. You may also directly edit the apachectl script by changing the HTTPD variable near the top to specify the correct location of the httpd binary and any command-line arguments that you wish to be always present.

The first thing that httpd does when it is invoked is to locate and read the configuration file (p. 32) httpd.conf.
The location of this file is set at compile-time, but it is possible to specify its location at run time using the -f command-line option as in

/usr/local/apache2/bin/apachectl -f /usr/local/apache2/conf/httpd.conf

If all goes well during startup, the server will detach from the terminal and the command prompt will return almost immediately. This indicates that the server is up and running. You can then use your browser to connect to the server and view the test page in the DocumentRoot directory.

Errors During Start-up

If Apache suffers a fatal problem during startup, it will write a message describing the problem either to the console or to the ErrorLog before exiting. One of the most common error messages is "Unable to bind to Port ...". This message is usually caused by either:
• Trying to start the server on a privileged port when not logged in as the root user; or
• Trying to start the server when there is another instance of Apache or some other web server already bound to the same Port.

For further trouble-shooting instructions, consult the Apache FAQ (http://wiki.apache.org/httpd/FAQ).

Starting at Boot-Time

If you want your server to continue running after a system reboot, you should add a call to apachectl to your system startup files (typically rc.local or a file in an rc.N directory). This will start Apache as root. Before doing this ensure that your server is properly configured for security and access restrictions.

The apachectl script is designed to act like a standard SysV init script; it can take the arguments start, restart, and stop and translate them into the appropriate signals to httpd. So you can often simply link apachectl into the appropriate init directory. But be sure to check the exact requirements of your system.
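When starting the server by hand rather than at boot, the first cause of the "Unable to bind to Port" error listed under Errors During Start-up can be checked in advance. The sketch below applies only the rule given in this document (ports below 1024 require root); it does not detect the second cause, a port that is already in use. The function name can_bind_port is hypothetical.

```shell
# Succeed (exit 0) if the given user ID may bind the given port
# under the rule from the text: ports below 1024 need root (UID 0).
# $1 = port number, $2 = user ID (defaults to the current user)
# Does not check whether another server already holds the port.
can_bind_port() {
    port=$1
    uid=${2:-$(id -u)}
    if [ "$port" -lt 1024 ] && [ "$uid" -ne 0 ]; then
        echo "port $port is privileged: start httpd as root or use a port of 1024 or above" >&2
        return 1
    fi
    return 0
}
```

For instance, can_bind_port 80 fails for an ordinary user, while can_bind_port 8080 succeeds, which matches the advice to test on a different port.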
Additional Information

Additional information about the command-line options of httpd and apachectl as well as other support programs included with the server is available on the Server and Supporting Programs (p. 294) page. There is also documentation on all the modules (p. 1101) included with the Apache distribution and the directives (p. 1106) that they provide.

2.3 Stopping and Restarting Apache HTTP Server

This document covers stopping and restarting Apache HTTP Server on Unix-like systems. Windows NT, 2000 and XP users should see Running httpd as a Service (p. 267) and Windows 9x and ME users should see Running httpd as a Console Application (p. 267) for information on how to control httpd on those platforms.

See also
• httpd
• apachectl
• Starting (p. 27)

Introduction

In order to stop or restart the Apache HTTP Server, you must send a signal to the running httpd processes. There are two ways to send the signals. First, you can use the unix kill command to directly send signals to the processes. You will notice many httpd executables running on your system, but you should not send signals to any of them except the parent, whose pid is in the PidFile. That is to say you shouldn’t ever need to send signals to any process except the parent. There are four signals that you can send the parent: TERM, USR1, HUP, and WINCH, which will be described in a moment.

To send a signal to the parent you should issue a command such as:

kill -TERM `cat /usr/local/apache2/logs/httpd.pid`

The second method of signaling the httpd processes is to use the -k command line options: stop, restart, graceful and graceful-stop, as described below. These are arguments to the httpd binary, but we recommend that you send them using the apachectl control script, which will pass them through to httpd.
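The two methods correspond one to one: each -k option stands for one of the four signals (stop is TERM, graceful is USR1, restart is HUP, graceful-stop is WINCH). A minimal sketch of that correspondence follows. The function names signal_for and signal_httpd are hypothetical, and the default PidFile path is the one used in the example above, which may differ on your system.

```shell
# Map an apachectl -k action to the signal it stands for.
signal_for() {
    case "$1" in
        stop)          echo TERM  ;;
        graceful)      echo USR1  ;;
        restart)       echo HUP   ;;
        graceful-stop) echo WINCH ;;
        *) echo "unknown action: $1" >&2; return 1 ;;
    esac
}

# Send the matching signal to the parent process named in the PidFile.
# $1 = action, $2 = PidFile path (assumed default shown below)
signal_httpd() {
    action=$1
    pidfile=${2:-/usr/local/apache2/logs/httpd.pid}
    sig=$(signal_for "$action") || return 1
    kill -s "$sig" "$(cat "$pidfile")"
}
```

Using apachectl -k remains the recommended route; the sketch only shows what the kill-based method would send for each action.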
After you have signaled httpd, you can read about its progress by issuing:

tail -f /usr/local/apache2/logs/error_log

Modify those examples to match your ServerRoot and PidFile settings.

Stop Now

Signal: TERM
apachectl -k stop

Sending the TERM or stop signal to the parent causes it to immediately attempt to kill off all of its children. It may take it several seconds to complete killing off its children. Then the parent itself exits. Any requests in progress are terminated, and no further requests are served.

Graceful Restart

Signal: USR1
apachectl -k graceful

The USR1 or graceful signal causes the parent process to advise the children to exit after their current request (or to exit immediately if they’re not serving anything). The parent re-reads its configuration files and re-opens its log files. As each child dies off the parent replaces it with a child from the new generation of the configuration, which begins serving new requests immediately.

This code is designed to always respect the process control directive of the MPMs, so the number of processes and threads available to serve clients will be maintained at the appropriate values throughout the restart process. Furthermore, it respects StartServers in the following manner: if after one second at least StartServers new children have not been created, then create enough to pick up the slack. Hence the code tries to maintain both the number of children appropriate for the current load on the server, and respect your wishes with the StartServers parameter.

Users of mod_status will notice that the server statistics are not set to zero when a USR1 is sent. The code was written to both minimize the time in which the server is unable to serve new requests (they will be queued up by the operating system, so they’re not lost in any event) and to respect your tuning parameters.
In order to do this it has to keep the scoreboard used to keep track of all children across generations.

The status module will also use a G to indicate those children which are still serving requests started before the graceful restart was given.

At present there is no way for a log rotation script using USR1 to know for certain that all children writing the pre-restart log have finished. We suggest that you use a suitable delay after sending the USR1 signal before you do anything with the old log. For example, if most of your hits take less than 10 minutes to complete for users on low bandwidth links, then you could wait 15 minutes before doing anything with the old log.

=⇒ When you issue a restart, a syntax check is first run, to ensure that there are no errors in the configuration files. If your configuration file has errors in it, you will get an error message about that syntax error, and the server will refuse to restart. This avoids the situation where the server halts and then cannot restart, leaving you with a non-functioning server.

This still will not guarantee that the server will restart correctly. To check the semantics of the configuration files as well as the syntax, you can try starting httpd as a non-root user. If there are no errors it will attempt to open its sockets and logs and fail because it’s not root (or because the currently running httpd already has those ports bound). If it fails for any other reason then it’s probably a config file error and the error should be fixed before issuing the graceful restart.

Restart Now

Signal: HUP
apachectl -k restart

Sending the HUP or restart signal to the parent causes it to kill off its children as with TERM, but the parent doesn’t exit. It re-reads its configuration files, and re-opens any log files. Then it spawns a new set of children and continues serving hits.

Users of mod_status will notice that the server statistics are set to zero when a HUP is sent.
=⇒ As with a graceful restart, a syntax check is run before the restart is attempted. If your configuration file has errors in it, the restart will not be attempted, and you will receive notification of the syntax error(s).

Graceful Stop

Signal: WINCH
apachectl -k graceful-stop

The WINCH or graceful-stop signal causes the parent process to advise the children to exit after their current request (or to exit immediately if they’re not serving anything). The parent will then remove its PidFile and cease listening on all ports. The parent will continue to run, and monitor children which are handling requests. Once all children have finalised and exited or the timeout specified by the GracefulShutdownTimeout has been reached, the parent will also exit. If the timeout is reached, any remaining children will be sent the TERM signal to force them to exit.

A TERM signal will immediately terminate the parent process and all children when in the "graceful" state. However, as the PidFile will have been removed, you will not be able to use apachectl or httpd to send this signal.

=⇒ The graceful-stop signal allows you to run multiple identically configured instances of httpd at the same time. This is a powerful feature when performing graceful upgrades of httpd, however it can also cause deadlocks and race conditions with some configurations.

Care has been taken to ensure that on-disk files such as lock files (Mutex) and Unix socket files (ScriptSock) contain the server PID, and should coexist without problem. However, if a configuration directive, third-party module or persistent CGI utilises any other on-disk lock or state files, care should be taken to ensure that multiple running instances of httpd do not clobber each other’s files.

You should also be wary of other potential race conditions, such as using rotatelogs style piped logging.
Multiple running instances of rotatelogs attempting to rotate the same logfiles at the same time may destroy each other’s logfiles.

2.4 Configuration Files

This document describes the files used to configure Apache HTTP Server.

Main Configuration Files

Related Modules: mod_mime
Related Directives: Include, TypesConfig

Apache HTTP Server is configured by placing directives (p. 1106) in plain text configuration files. The main configuration file is usually called httpd.conf. The location of this file is set at compile-time, but may be overridden with the -f command line flag. In addition, other configuration files may be added using the Include directive, and wildcards can be used to include many configuration files. Any directive may be placed in any of these configuration files. Changes to the main configuration files are only recognized by httpd when it is started or restarted.

The server also reads a file containing mime document types; the filename is set by the TypesConfig directive, and is mime.types by default.

Syntax of the Configuration Files

httpd configuration files contain one directive per line. The backslash "\" may be used as the last character on a line to indicate that the directive continues onto the next line. There must be no other characters or white space between the backslash and the end of the line.

Arguments to directives are separated by whitespace. If an argument contains spaces, you must enclose that argument in quotes.

Directives in the configuration files are case-insensitive, but arguments to directives are often case sensitive. Lines that begin with the hash character "#" are considered comments, and are ignored. Comments may not be included on the same line as a configuration directive. White space occurring before a directive is ignored, so you may indent directives for clarity. Blank lines are also ignored.
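The syntax rules above can be illustrated with a short fragment. This is only a sketch: the directives are standard ones from the core server and mod_log_config, but the e-mail address, log path and message text are placeholder assumptions rather than values taken from this manual.

```apache
# A comment must sit on its own line; it cannot follow a directive.
ServerAdmin webmaster@example.com

# A backslash as the very last character of the line continues the
# directive on the next line.
CustomLog "logs/access_log" \
    "%h %l %u %t \"%r\" %>s %b"

# An argument that contains spaces is enclosed in quotes.
ErrorDocument 404 "Sorry, that page could not be found."

    # Leading white space is ignored, so directives may be indented.
    # Directive names are case-insensitive: this is ServerSignature.
    serversignature Off
```

After editing a file like this, a syntax check with apachectl configtest (or httpd -t) should report problems such as a stray character after a continuation backslash before the server is restarted.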
The values of variables defined with the Define directive, or of shell environment variables, can be used in configuration file lines using the syntax ${VAR}. If "VAR" is the name of a valid variable, the value of that variable is substituted into that spot in the configuration file line, and processing continues as if that text were found directly in the configuration file. Variables defined with Define take precedence over shell environment variables. If the "VAR" variable is not found, the characters ${VAR} are left unchanged, and a warning is logged. Variable names may not contain colon ":" characters, to avoid clashes with RewriteMap’s syntax.

Only shell environment variables defined before the server is started can be used in expansions. Environment variables defined in the configuration file itself, for example with SetEnv, take effect too late to be used for expansions in the configuration file.

The maximum length of a line in normal configuration files, after variable substitution and joining any continued lines, is approximately 16 MiB. In .htaccess files (p. 32), the maximum length is 8190 characters.

You can check your configuration files for syntax errors without starting the server by using apachectl configtest or the -t command line option.

You can use mod_info’s -DDUMP_CONFIG to dump the configuration with all included files and environment variables resolved and all comments and non-matching <IfDefine> and <IfModule> sections removed. However, the output does not reflect the merging or overriding that may happen for repeated directives.

Modules

Related Modules: mod_so
Related Directives: <IfModule>, LoadModule

httpd is a modular server. This implies that only the most basic functionality is included in the core server. Extended features are available through modules (p. 1101) which can be loaded into httpd. By default, a base (p. 376) set of modules is included in the server at compile-time. If the server is compiled to use dynamically loaded (p.
68) modules, then modules can be compiled separately and added at any time using the LoadModule directive. Otherwise, httpd must be recompiled to add or remove modules. Configuration directives may be included conditionally on the presence of a particular module by enclosing them in an <IfModule> block. However, <IfModule> blocks are not required, and in some cases may mask the fact that you’re missing an important module.

To see which modules are currently compiled into the server, you can use the -l command line option. You can also see what modules are loaded dynamically using the -M command line option.

Scope of Directives

Related Directives: <Directory>, <DirectoryMatch>, <Files>, <FilesMatch>, <Location>, <LocationMatch>, <VirtualHost>

Directives placed in the main configuration files apply to the entire server. If you wish to change the configuration for only a part of the server, you can scope your directives by placing them in <Directory>, <DirectoryMatch>, <Files>, <FilesMatch>, <Location>, and <LocationMatch> sections. These sections limit the application of the directives which they enclose to particular filesystem locations or URLs. They can also be nested, allowing for very fine grained configuration.

httpd has the capability to serve many different websites simultaneously. This is called Virtual Hosting (p. 124). Directives can also be scoped by placing them inside <VirtualHost> sections, so that they will only apply to requests for a particular website.

Although most directives can be placed in any of these sections, some directives do not make sense in some contexts. For example, directives controlling process creation can only be placed in the main server context. To find which directives can be placed in which sections, check the Context (p. 377) of the directive. For further information, we provide details on How Directory, Location and Files sections work (p. 35).

.htaccess Files

Related Directives: AccessFileName, AllowOverride

httpd allows for decentralized management of configuration via special files placed inside the web tree.
The special files are usually called .htaccess, but any name can be specified in the AccessFileName directive. Directives placed in .htaccess files apply to the directory where you place the file, and all sub-directories. The .htaccess files follow the same syntax as the main configuration files. Since .htaccess files are read on every request, changes made in these files take immediate effect.

To find which directives can be placed in .htaccess files, check the Context (p. 377) of the directive. The server administrator further controls what directives may be placed in .htaccess files by configuring the AllowOverride directive in the main configuration files.

For more information on .htaccess files, see the .htaccess tutorial (p. 249).

2.5 Configuration Sections

Directives in the configuration files (p. 32) may apply to the entire server, or they may be restricted to apply only to particular directories, files, hosts, or URLs. This document describes how to use configuration section containers or .htaccess files to change the scope of other configuration directives.

Types of Configuration Section Containers

Related Modules: core, mod_version, mod_proxy
Related Directives: <Directory>, <DirectoryMatch>, <Files>, <FilesMatch>, <If>, <IfDefine>, <IfModule>, <IfVersion>, <Location>, <LocationMatch>, <Proxy>, <ProxyMatch>, <VirtualHost>

There are two basic types of containers. Most containers are evaluated for each request. The enclosed directives are applied only for those requests that match the containers. The <IfDefine>, <IfModule>, and <IfVersion> containers, on the other hand, are evaluated only at server startup and restart. If their conditions are true at startup, then the enclosed directives will apply to all requests. If the conditions are not true, the enclosed directives will be ignored.

The <IfDefine> directive encloses directives that will only be applied if an appropriate parameter is defined on the httpd command line. For example, with the following configuration, all requests will be redirected to another site only if the server is started using httpd -DClosedForNow:

    <IfDefine ClosedForNow>
        Redirect "/" "http://otherserver.example.com/"
    </IfDefine>

The <IfModule> directive is very similar, except it encloses directives that will only be applied if a particular module is available in the server. The module must either be statically compiled in the server, or it must be dynamically compiled and its LoadModule line must be earlier in the configuration file. This directive should only be used if you need your configuration file to work whether or not certain modules are installed. It should not be used to enclose directives that you want to work all the time, because it can suppress useful error messages about missing modules.

In the following example, the MimeMagicFile directive will be applied only if mod_mime_magic is available.

    <IfModule mod_mime_magic.c>
        MimeMagicFile conf/magic
    </IfModule>

The <IfVersion> directive is very similar to <IfDefine> and <IfModule>, except it encloses directives that will only be applied if a particular version of the server is executing. This module is designed for use in test suites and large networks which have to deal with different httpd versions and different configurations.

    <IfVersion >= 2.4>
        # this happens only in versions greater or
        # equal 2.4.0.
    </IfVersion>

<IfDefine>, <IfModule>, and <IfVersion> can apply negative conditions by preceding their test with "!".
Also, these sections can be nested to achieve more complex restrictions.

Filesystem, Webspace, and Boolean Expressions

The most commonly used configuration section containers are the ones that change the configuration of particular places in the filesystem or webspace. First, it is important to understand the difference between the two. The filesystem is the view of your disks as seen by your operating system. For example, in a default install, Apache httpd resides at /usr/local/apache2 in the Unix filesystem or "c:/Program Files/Apache Group/Apache2" in the Windows filesystem. (Note that forward slashes should always be used as the path separator in Apache httpd configuration files, even for Windows.) In contrast, the webspace is the view of your site as delivered by the web server and seen by the client. So the path /dir/ in the webspace corresponds to the path /usr/local/apache2/htdocs/dir/ in the filesystem of a default Apache httpd install on Unix. The webspace need not map directly to the filesystem, since webpages may be generated dynamically from databases or other locations.

Filesystem Containers

The <Directory> and <Files> directives, along with their regex counterparts, apply directives to parts of the filesystem. Directives enclosed in a <Directory> section apply to the named filesystem directory and all subdirectories of that directory (as well as the files in those directories). The same effect can be obtained using .htaccess files (p. 249). For example, in the following configuration, directory indexes will be enabled for the /var/web/dir1 directory and all subdirectories.

    <Directory "/var/web/dir1">
        Options +Indexes
    </Directory>

Directives enclosed in a <Files> section apply to any file with the specified name, regardless of what directory it lies in. So for example, the following configuration directives will, when placed in the main section of the configuration file, deny access to any file named private.html regardless of where it is found.

    <Files "private.html">
        Require all denied
    </Files>

To address files found in a particular part of the filesystem, the <Files> and <Directory> sections can be combined. For example, the following configuration will deny access to /var/web/dir1/private.html, /var/web/dir1/subdir2/private.html, /var/web/dir1/subdir3/private.html, and any other instance of private.html found under the /var/web/dir1/ directory.

    <Directory "/var/web/dir1">
        <Files "private.html">
            Require all denied
        </Files>
    </Directory>

Webspace Containers

The <Location> directive and its regex counterpart, on the other hand, change the configuration for content in the webspace. For example, the following configuration prevents access to any URL-path that begins with /private. In particular, it will apply to requests for http://yoursite.example.com/private, http://yoursite.example.com/private123, and http://yoursite.example.com/private/dir/file.html as well as any other requests starting with the /private string.

    <LocationMatch "^/private">
        Require all denied
    </LocationMatch>

The <Location> directive need not have anything to do with the filesystem. For example, the following example shows how to map a particular URL to an internal Apache HTTP Server handler provided by mod_status. No file called server-status needs to exist in the filesystem.

    <Location "/server-status">
        SetHandler server-status
    </Location>

Overlapping Webspace

In order to have two overlapping URLs one has to consider the order in which certain sections or directives are evaluated. For <Location> this would be:

    <Location "/foo">
    </Location>
    <Location "/foo/bar">
    </Location>

Aliases, on the other hand, are mapped vice-versa:

    Alias "/foo/bar" "/srv/www/uncommon/bar"
    Alias "/foo" "/srv/www/common/foo"

The same is true for the ProxyPass directives:

    ProxyPass "/special-area" "http://special.example.com" smax=5 max=10
    ProxyPass "/" "balancer://mycluster/" stickysession=JSESSIONID|jsessionid nofailover=On

Wildcards and Regular Expressions

The <Directory>, <Files>, and <Location> directives can each use shell-style wildcard characters as in fnmatch from the C standard library. The character "*" matches any sequence of characters, "?" matches any single character, and "[seq]" matches any character in seq.
The "/" character will not be matched by any wildcard; it must be specified explicitly.

If even more flexible matching is required, each container has a regular expression (regex) counterpart, <DirectoryMatch>, <FilesMatch>, and <LocationMatch>, that allows perl-compatible regular expressions to be used in choosing the matches. But see the section below on configuration merging to find out how using regex sections will change how directives are applied.

A non-regex wildcard section that changes the configuration of all user directories could look as follows:

    <Directory "/home/*/public_html">
        Options Indexes
    </Directory>

Using regex sections, we can deny access to many types of image files at once:

    <FilesMatch "\.(?i:gif|jpe?g|png)$">
        Require all denied
    </FilesMatch>

Named groups and backreferences captured by a regular expression are added to the environment with the corresponding name in uppercase, prefixed with "MATCH_". This allows elements of filename paths and URLs to be referenced from within expressions (p. 99) and modules like mod_rewrite.

    <DirectoryMatch "^/var/www/combined/(?<SITENAME>[^/]+)">
        require ldap-group cn=%{env:MATCH_SITENAME},ou=combined,o=Example
    </DirectoryMatch>

Boolean expressions

The <If> directive changes the configuration depending on a condition which can be expressed by a boolean expression. For example, the following configuration denies access if the HTTP Referer header does not start with "http://www.example.com/".

    <If "!(%{HTTP_REFERER} -strmatch 'http://www.example.com/*')">
        Require all denied
    </If>

What to use When

Choosing between filesystem containers and webspace containers is actually quite easy. When applying directives to objects that reside in the filesystem always use <Directory> or <Files>. When applying directives to objects that do not reside in the filesystem (such as a webpage generated from a database), use <Location>.

It is important to never use <Location> when trying to restrict access to objects in the filesystem. This is because many different webspace locations (URLs) could map to the same filesystem location, allowing your restrictions to be circumvented. For example, consider the following configuration:

    <Location "/dir/">
        Require all denied
    </Location>

This works fine if the request is for http://yoursite.example.com/dir/.
But what if you are on a case-insensitive filesystem? Then your restriction could be easily circumvented by requesting http://yoursite.example.com/DIR/. The <Directory> directive, in contrast, will apply to any content served from that location, regardless of how it is called. (An exception is filesystem links. The same directory can be placed in more than one part of the filesystem using symbolic links. The <Directory> directive will follow the symbolic link without resetting the pathname. Therefore, for the highest level of security, symbolic links should be disabled with the appropriate Options directive.)

If you are, perhaps, thinking that none of this applies to you because you use a case-sensitive filesystem, remember that there are many other ways to map multiple webspace locations to the same filesystem location. Therefore you should always use the filesystem containers when you can. There is, however, one exception to this rule. Putting configuration restrictions in a <Location "/"> section is perfectly safe because this section will apply to all requests regardless of the specific URL.

Nesting of sections

Some section types can be nested inside other section types. On the one hand, <Files> can be used inside <Directory>. On the other hand, <If> can be used inside <Directory>, <Location>, and <Files> sections. The regex counterparts of the named sections behave identically.

Nested sections are merged after non-nested sections of the same type.

Virtual Hosts

The <VirtualHost> container encloses directives that apply to specific hosts. This is useful when serving multiple hosts from the same machine with a different configuration for each. For more information, see the Virtual Host Documentation (p. 124).

Proxy

The <Proxy> and <ProxyMatch> containers apply enclosed configuration directives only to sites accessed through mod_proxy’s proxy server that match the specified URL. For example, the following configuration will allow only a subset of clients to access the www.example.com website using the proxy server:

    <Proxy "http://www.example.com/*">
        Require host yournetwork.example.com
    </Proxy>

What Directives are Allowed?

To find out what directives are allowed in what types of configuration sections, check the Context (p. 377) of the directive. Everything that is allowed in <Directory> sections is also syntactically allowed in <DirectoryMatch>, <Files>, <FilesMatch>, <Location>, <LocationMatch>, <Proxy>, and <ProxyMatch> sections. There are some exceptions, however:

• The AllowOverride directive works only in <Directory> sections.
• The FollowSymLinks and SymLinksIfOwnerMatch Options work only in <Directory> sections or .htaccess files.
• The Options directive cannot be used in <Files> and <FilesMatch> sections.

How the sections are merged

The configuration sections are applied in a very particular order. Since this can have important effects on how configuration directives are interpreted, it is important to understand how this works.

The order of merging is:

1. <Directory> (except regular expressions) and .htaccess done simultaneously (with .htaccess, if allowed, overriding <Directory>)
2. <DirectoryMatch> (and <Directory "~">)
3. <Files> and <FilesMatch> done simultaneously
4. <Location> and <LocationMatch> done simultaneously
5. <If>

Apart from <Directory>, each group is processed in the order that they appear in the configuration files. <Directory> (group 1 above) is processed in the order shortest directory component to longest. So for example, <Directory "/var/web/dir"> will be processed before <Directory "/var/web/dir/subdir">. If multiple <Directory> sections apply to the same directory they are processed in the configuration file order. Configurations included via the Include directive will be treated as if they were inside the including file at the location of the Include directive.

Sections inside <VirtualHost> sections are applied after the corresponding sections outside the virtual host definition. This allows virtual hosts to override the main server configuration.

When the request is served by mod_proxy, the <Proxy> container takes the place of the <Directory> container in the processing order.

=⇒ Technical Note: There is actually a <Location>/<LocationMatch> sequence performed just before the name translation phase (where Aliases and DocumentRoots are used to map URLs to filenames). The results of this sequence are completely thrown away after the translation has completed.

Relationship between modules and configuration sections

One question that often arises after reading how configuration sections are merged is related to how and when directives of specific modules like mod_rewrite are processed. The answer is not trivial and needs a bit of background. Each httpd module manages its own configuration, and each of its directives in httpd.conf specifies one piece of configuration in a particular context. httpd does not execute a command as it is read.

At runtime, the core of httpd iterates over the defined configuration sections in the order described above to determine which ones apply to the current request. When the first section matches, it is considered the current configuration for this request. If a subsequent section matches too, then each module with a directive in either of the sections is given a chance to merge its configuration between the two sections. The result is a third configuration, and the process goes on until all the configuration sections are evaluated.

After the above step, the "real" processing of the HTTP request begins: each module has a chance to run and perform whatever tasks it likes. Modules can retrieve their own final merged configuration from the core of httpd to determine how they should act.

An example can help to visualize the whole process. The following configuration uses the Header directive of mod_headers to set a specific HTTP header. What value will httpd set in the CustomHeaderName header for a request to /example/index.html?

    <Directory "/">
        Header set CustomHeaderName one
        <FilesMatch ".*">
            Header set CustomHeaderName three
        </FilesMatch>
    </Directory>

    <Directory "/example">
        Header set CustomHeaderName two
    </Directory>

• <Directory "/"> matches and an initial configuration to set the CustomHeaderName header with the value one is created.
• <Directory "/example"> matches, and since mod_headers specifies in its code to override in case of a merge, a new configuration is created to set the CustomHeaderName header with the value two.
• <FilesMatch ".*"> matches and another merge opportunity arises, causing the CustomHeaderName header to be set with the value three.
• Eventually, during the next steps of the HTTP request processing, mod_headers will be called and it will receive the configuration to set the CustomHeaderName header with the value three. mod_headers normally uses this configuration to perform its job, namely setting the CustomHeaderName header. This does not mean that a module can’t perform a more complex action, such as discarding directives because they are not needed or are deprecated.

This is true for .htaccess too, since .htaccess files have the same priority as <Directory> in the merge order. The important concept to understand is that configuration sections like <Directory> and <FilesMatch> are not comparable to module-specific directives like Header or RewriteRule because they operate on different levels.

Some useful examples

Below is an artificial example to show the order of merging. Assuming they all apply to the request, the directives in this example will be applied in the order A > B > C > D > E.

    <Location "/">
        E
    </Location>

    <Files "f.html">
        D
    </Files>

    <VirtualHost *>
    <Directory "/a/b">
        B
    </Directory>
    </VirtualHost>

    <DirectoryMatch "^.*b$">
        C
    </DirectoryMatch>

    <Directory "/a/b">
        A
    </Directory>

For a more concrete example, consider the following. Regardless of any access restrictions placed in <Directory> sections, the <Location> section will be evaluated last and will allow unrestricted access to the server. In other words, order of merging is important, so be careful!

    <Location "/">
        Require all granted
    </Location>

    # Whoops! This <Directory> section will have no effect
    <Directory "/">
        <RequireAll>
            Require all granted
            Require not host badguy.example.com
        </RequireAll>
    </Directory>

2.6 Caching Guide

This document supplements the mod_cache, mod_cache_disk, mod_file_cache and htcacheclean (p.
319) reference documentation. It describes how to use the Apache HTTP Server’s caching features to accelerate web and proxy serving, while avoiding common problems and misconfigurations.

Introduction

The Apache HTTP server offers a range of caching features that are designed to improve the performance of the server in various ways.

Three-state RFC2616 HTTP caching
mod_cache and its provider module mod_cache_disk provide intelligent, HTTP-aware caching. The content itself is stored in the cache, and mod_cache aims to honor all of the various HTTP headers and options that control the cacheability of content as described in Section 13 of RFC2616 [12]. mod_cache is aimed at both simple and complex caching configurations, where you are dealing with proxied content, dynamic local content or have a need to speed up access to local files on a potentially slow disk.

Two-state key/value shared object caching
The shared object cache API (p. 114) (socache) and its provider modules provide a server wide key/value based shared object cache. These modules are designed to cache low level data such as SSL sessions and authentication credentials. Backends allow the data to be stored server wide in shared memory, or datacenter wide in a cache such as memcache or distcache.

Specialized file caching
mod_file_cache offers the ability to pre-load files into memory on server startup, and can improve access times and save file handles on files that are accessed often, as there is no need to go to disk on each request.

To get the most from this document, you should be familiar with the basics of HTTP, and have read the Users’ Guides to Mapping URLs to the Filesystem (p. 64) and Content negotiation (p. 78).
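As a rough orientation, the three mechanisms introduced above might be switched on as follows. This is a minimal sketch, not a recommended setup: it assumes that mod_cache, mod_cache_disk, mod_ssl, mod_socache_shmcb and mod_file_cache are all loaded, and every path shown is a hypothetical placeholder.

```apache
# 1. Three-state HTTP caching: mod_cache with the mod_cache_disk provider.
CacheRoot "/var/cache/httpd/mod_cache_disk"
CacheEnable disk "/"

# 2. Two-state key/value caching: mod_ssl stores SSL sessions through
#    the shmcb shared object cache provider.
SSLSessionCache "shmcb:/usr/local/apache2/logs/ssl_scache(512000)"

# 3. Specialized file caching: mod_file_cache maps one frequently
#    requested file into memory at server startup.
MMapFile "/usr/local/apache2/htdocs/index.html"
```

Each mechanism is independent of the others, so a real configuration would normally enable only those it needs.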
Three-state RFC2616 HTTP caching

Related Modules: mod_cache, mod_cache_disk
Related Directives: CacheEnable, CacheDisable, UseCanonicalName, CacheNegotiatedDocs

The HTTP protocol contains built in support for an in-line caching mechanism described by section 13 of RFC2616 [13], and the mod_cache module can be used to take advantage of this.

[12] http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html
[13] http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html

Unlike a simple two state key/value cache where the content disappears completely when no longer fresh, an HTTP cache includes a mechanism to retain stale content, and to ask the origin server whether this stale content has changed and if not, make it fresh again.

An entry in an HTTP cache exists in one of three states:

Fresh
If the content is new enough (younger than its freshness lifetime), it is considered fresh. An HTTP cache is free to serve fresh content without making any calls to the origin server at all.

Stale
If the content is too old (older than its freshness lifetime), it is considered stale. An HTTP cache should contact the origin server and check whether the content is still fresh before serving stale content to a client. The origin server will either respond with replacement content if not still valid, or ideally, the origin server will respond with a code to tell the cache the content is still fresh, without the need to generate or send the content again. The content becomes fresh again and the cycle continues.

The HTTP protocol does allow the cache to serve stale data under certain circumstances, such as when an attempt to freshen the data with an origin server has failed with a 5xx error, or when another request is already in the process of freshening the given entry. In these cases a Warning header is added to the response.

Non Existent
If the cache gets full, it reserves the option to delete content from the cache to make space.
Content can be deleted at any time, and can be stale or fresh. The htcacheclean (p. 319) tool can be run on a one-off basis, or deployed as a daemon to keep the size of the cache within the given size, or the given number of inodes. The tool attempts to delete stale content before attempting to delete fresh content.

Full details of how HTTP caching works can be found in Section 13 of RFC2616 [14].

Interaction with the Server

The mod_cache module hooks into the server in two possible places depending on the value of the CacheQuickHandler directive:

Quick handler phase
This phase happens very early on during the request processing, just after the request has been parsed. If the content is found within the cache, it is served immediately and almost all request processing is bypassed.

In this scenario, the cache behaves as if it has been "bolted on" to the front of the server.

This mode offers the best performance, as the majority of server processing is bypassed. This mode however also bypasses the authentication and authorization phases of server processing, so this mode should be chosen with care when this is important.

Requests with an "Authorization" header (for example, HTTP Basic Authentication) are neither cacheable nor served from the cache when mod_cache is running in this phase.

Normal handler phase
This phase happens late in the request processing, after all the request phases have completed.

In this scenario, the cache behaves as if it has been "bolted on" to the back of the server.

This mode offers the most flexibility, as the potential exists for caching to occur at a precisely controlled point in the filter chain, and cached content can be filtered or personalized before being sent to the client.

If the URL is not found within the cache, mod_cache will add a filter (p. 110) to the filter stack in order to record the response to the cache, and then stand down, allowing normal request processing to continue.
If the content is determined to be cacheable, the content will be saved to the cache for future serving, otherwise the content will be ignored.

If the content found within the cache is stale, the mod_cache module converts the request into a conditional request. If the origin server responds with a normal response, the normal response is cached, replacing the content already cached. If the origin server responds with a 304 Not Modified response, the content is marked as fresh again, and the cached content is served by the filter instead of saving it.

[14] http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html

Improving Cache Hits

When a virtual host is known by one of many different server aliases, ensuring that UseCanonicalName is set to On can dramatically improve the ratio of cache hits. This is because the hostname of the virtual host serving the content is used within the cache key. With the setting set to On, virtual hosts with multiple server names or aliases will not produce differently cached entities, and instead content will be cached as per the canonical hostname.

Freshness Lifetime

Well-formed content that is intended to be cached should declare an explicit freshness lifetime with the Cache-Control header’s max-age or s-maxage fields, or by including an Expires header.

At the same time, the origin server defined freshness lifetime can be overridden by a client when the client presents their own Cache-Control header within the request. In this case, the lowest freshness lifetime between request and response wins.

When this freshness lifetime is missing from the request or the response, a default freshness lifetime is applied. The default freshness lifetime for cached entities is one hour, however this can be easily overridden by using the CacheDefaultExpire directive.
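As a rough illustration of the "lowest lifetime wins" rule described above, the following Python sketch models the choice in isolation. The function name and its parameters are hypothetical, s-maxage, Expires and any Last-Modified heuristic are deliberately ignored, and the sketch is not a description of how mod_cache is actually implemented.

```python
# Simplified model of the freshness-lifetime choice described in the text.
# All names here are illustrative assumptions, not part of httpd.

DEFAULT_EXPIRE = 3600  # one hour, the default lifetime named in the text


def effective_freshness(response_max_age=None, request_max_age=None,
                        default_expire=DEFAULT_EXPIRE):
    """Return the freshness lifetime, in seconds, under this simple model."""
    # Use the origin's declared lifetime, or the default when it is missing.
    if response_max_age is None:
        lifetime = default_expire
    else:
        lifetime = response_max_age
    # A client-supplied max-age can only shorten the lifetime: lowest wins.
    if request_max_age is not None:
        lifetime = min(lifetime, request_max_age)
    return lifetime


print(effective_freshness(response_max_age=600, request_max_age=60))  # 60
print(effective_freshness())  # 3600
```

In this model a client can shorten the lifetime but never extend it, which matches the "lowest wins" wording.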
If a response does not include an Expires header but does include a Last-Modified header, mod_cache can infer a freshness lifetime based on a heuristic, which can be controlled through the use of the CacheLastModifiedFactor directive.

For local content, or for remote content that does not define its own Expires header, mod_expires may be used to fine-tune the freshness lifetime by adding max-age and Expires.

The maximum freshness lifetime may also be controlled by using the CacheMaxExpire directive.

A Brief Guide to Conditional Requests

When content expires from the cache and becomes stale, rather than pass on the original request, httpd will modify the request to make it conditional instead.

When an ETag header exists in the original cached response, mod_cache will add an If-None-Match header to the request to the origin server. When a Last-Modified header exists in the original cached response, mod_cache will add an If-Modified-Since header to the request to the origin server. Performing either of these actions makes the request conditional.

When a conditional request is received by an origin server, the origin server should check whether the ETag or the Last-Modified parameter has changed, as appropriate for the request. If not, the origin should respond with a terse "304 Not Modified" response. This signals to the cache that the stale content is still fresh and should be used for subsequent requests until the content’s new freshness lifetime is reached again.

If the content has changed, then the content is served as if the request were not conditional to begin with.

Conditional requests offer two benefits. Firstly, when making such a request to the origin server, if the content from the origin matches the content in the cache, this can be determined easily and without the overhead of transferring the entire resource.
Secondly, a well-designed origin server will be designed in such a way that conditional requests will be significantly cheaper to produce than a full response. For static files, typically all that is involved is a call to stat() or similar system call, to see if the file has changed in size or modification time. As such, even local content may still be served faster from the cache if it has not changed.

Origin servers should make every effort to support conditional requests as is practical, however if conditional requests are not supported, the origin will respond as if the request was not conditional, and the cache will respond as if the content had changed and save the new content to the cache. In this case, the cache will behave like a simple two-state cache, where content is effectively either fresh or deleted.

What Can be Cached?

The full definition of which responses can be cached by an HTTP cache is defined in RFC2616 Section 13.4 Response Cacheability [15], and can be summed up as follows:

1. Caching must be enabled for this URL. See the CacheEnable and CacheDisable directives.
2. The response must have an HTTP status code of 200, 203, 300, 301 or 410.
3. The request must be an HTTP GET request.
4. If the request contains an "Authorization:" header, the response must also contain an "s-maxage", "must-revalidate" or "public" option in the "Cache-Control:" header, or it won’t be cached.
5. If the URL included a query string (e.g. from an HTML form GET method) it will not be cached unless the response specifies an explicit expiration by including an "Expires:" header or the max-age or s-maxage directive of the "Cache-Control:" header, as per RFC2616 sections 13.9 and 13.2.1.
6. If the response has a status of 200 (OK), the response must also include at least one of the "ETag", "Last-Modified" or the "Expires" headers, or the max-age or s-maxage directive of the "Cache-Control:" header, unless the CacheIgnoreNoLastMod directive has been used to require otherwise.
7. If the response includes the "private" option in a "Cache-Control:" header, it will not be stored unless the CacheStorePrivate directive has been used to require otherwise.
8. Likewise, if the response includes the "no-store" option in a "Cache-Control:" header, it will not be stored unless the CacheStoreNoStore directive has been used.
9. A response will not be stored if it includes a "Vary:" header containing the match-all "*".

What Should Not be Cached?

It should be up to the client creating the request, or the origin server constructing the response, to decide whether or not the content should be cacheable by correctly setting the Cache-Control header, and mod_cache should be left alone to honor the wishes of the client or server as appropriate.

Content that is time sensitive, or which varies depending on the particulars of the request that are not covered by HTTP negotiation, should not be cached. This content should declare itself uncacheable using the Cache-Control header.

If content changes often, expressed by a freshness lifetime of minutes or seconds, the content can still be cached, however it is highly desirable that the origin server supports conditional requests correctly to ensure that full responses do not have to be generated on a regular basis.

Content that varies based on client provided request headers can be cached through intelligent use of the Vary response header.

[15] http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.4
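Several of the numbered rules under "What Can be Cached?" can be read as a simple checklist. The Python sketch below encodes rules 2, 3, 6, 7, 8 and 9 only, under default settings (none of the override directives in use); the function names are hypothetical, and the code is a reading aid under those assumptions rather than the logic mod_cache itself runs.

```python
# Checklist-style sketch of some cacheability rules from the list above.
# Names are illustrative assumptions; rules 1, 4 and 5 are not modelled.

CACHEABLE_STATUS = {200, 203, 300, 301, 410}  # rule 2


def cache_control_tokens(value):
    """Return the lower-cased directive names found in a Cache-Control value."""
    return {t.strip().split("=")[0].lower() for t in value.split(",") if t.strip()}


def is_storable(method, status, response_headers):
    """Apply rules 2, 3, 6, 7, 8 and 9 to one response."""
    headers = {k.lower(): v for k, v in response_headers.items()}
    cc = cache_control_tokens(headers.get("cache-control", ""))
    if method != "GET":                        # rule 3
        return False
    if status not in CACHEABLE_STATUS:         # rule 2
        return False
    if "private" in cc or "no-store" in cc:    # rules 7 and 8, no overrides
        return False
    if "*" in [v.strip() for v in headers.get("vary", "").split(",")]:  # rule 9
        return False
    if status == 200:                          # rule 6
        has_validator = any(h in headers for h in ("etag", "last-modified", "expires"))
        has_lifetime = "max-age" in cc or "s-maxage" in cc
        if not (has_validator or has_lifetime):
            return False
    return True


print(is_storable("GET", 200, {"ETag": '"abc"'}))                      # True
print(is_storable("GET", 200, {"Cache-Control": "max-age=60, private"}))  # False
```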
Variable/Negotiated Content

When the origin server is designed to respond with different content based on the value of headers in the request, for example to serve multiple languages at the same URL, HTTP’s caching mechanism makes it possible to cache multiple variants of the same page at the same URL.

This is done by the origin server adding a Vary header to indicate which headers must be taken into account by a cache when determining whether two variants are different from one another.

If, for example, a response is received with a Vary header such as:

    Vary: negotiate,accept-language,accept-charset

mod_cache will only serve the cached content to requesters with accept-language and accept-charset headers matching those of the original request.

Multiple variants of the content can be cached side by side; mod_cache uses the Vary header and the corresponding values of the request headers listed by Vary to decide which of many variants to return to the client.

Caching to Disk

The mod_cache module relies on specific backend store implementations in order to manage the cache, and for caching to disk mod_cache_disk is provided to support this.

Typically the module will be configured as follows:

    CacheRoot "/var/cache/apache/"
    CacheEnable disk /
    CacheDirLevels 2
    CacheDirLength 1

Importantly, as the cached files are locally stored, operating system in-memory caching will typically be applied to their access also. So although the files are stored on disk, if they are frequently accessed it is likely the operating system will ensure that they are actually served from memory.

Understanding the Cache-Store

To store items in the cache, mod_cache_disk creates a 22 character hash of the URL being requested. This hash incorporates the hostname, protocol, port, path and any CGI arguments to the URL, as well as elements defined by the Vary header to ensure that multiple URLs do not collide with one another.
Each character may be any one of 64 different characters, which means that overall there are 64^22 possible hashes. For example, a URL might be hashed to xyTGxSMO2b68mBCykqkp1w. This hash is used as a prefix for the naming of the files specific to that URL within the cache, however first it is split up into directories as per the CacheDirLevels and CacheDirLength directives.

CacheDirLevels specifies how many levels of subdirectory there should be, and CacheDirLength specifies how many characters should be in each directory. With the example settings given above, the hash would be turned into a filename prefix as /var/cache/apache/x/y/TGxSMO2b68mBCykqkp1w.

The overall aim of this technique is to reduce the number of subdirectories or files that may be in a particular directory, as most file-systems slow down as this number increases. With a setting of "1" for CacheDirLength there can at most be 64 subdirectories at any particular level. With a setting of 2 there can be 64 * 64 subdirectories, and so on. Unless you have a good reason not to, using a setting of "1" for CacheDirLength is recommended.

Setting CacheDirLevels depends on how many files you anticipate storing in the cache. With the setting of "2" used in the above example, a grand total of 4096 subdirectories can ultimately be created. With 1 million files cached, this works out at roughly 244 cached URLs per directory.

Each URL uses at least two files in the cache-store. Typically there is a ".header" file, which includes meta-information about the URL, such as when it is due to expire, and a ".data" file which is a verbatim copy of the content to be served.

In the case of content negotiated via the "Vary" header, a ".vary" directory will be created for the URL in question. This directory will have multiple ".data" files corresponding to the differently negotiated content.
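The directory split and the arithmetic described above can be reproduced with a few lines of Python. This is only a sketch of the layout as the text describes it: the function names are hypothetical and the code does not claim to mirror the internals of mod_cache_disk.

```python
# Sketch of the hash-to-path split described in the text.
# Function names are illustrative assumptions, not httpd APIs.


def cache_file_prefix(cache_root, url_hash, dir_levels, dir_length):
    """Split the leading characters of url_hash into nested directories."""
    parts = [url_hash[i * dir_length:(i + 1) * dir_length]
             for i in range(dir_levels)]
    remainder = url_hash[dir_levels * dir_length:]
    return "/".join([cache_root.rstrip("/")] + parts + [remainder])


def leaf_directories(dir_levels, dir_length, alphabet=64):
    """Number of directories at the deepest level for a 64-symbol alphabet."""
    return alphabet ** (dir_levels * dir_length)


print(cache_file_prefix("/var/cache/apache/", "xyTGxSMO2b68mBCykqkp1w", 2, 1))
# /var/cache/apache/x/y/TGxSMO2b68mBCykqkp1w
print(leaf_directories(2, 1))              # 4096
print(1_000_000 / leaf_directories(2, 1))  # 244.140625
```

Raising either setting multiplies the number of leaf directories by a power of 64, which is why small values are usually enough.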
Maintaining the Disk Cache

The mod_cache_disk module makes no attempt to regulate the amount of disk space used by the cache, although it will gracefully stand down on any disk error and behave as if the cache was never present.

Instead, provided with httpd is the htcacheclean (p. 319) tool which allows you to clean the cache periodically. Determining how frequently to run htcacheclean (p. 319) and what target size to use for the cache is somewhat complex and trial and error may be needed to select optimal values.

htcacheclean (p. 319) has two modes of operation. It can be run as a persistent daemon, or periodically from cron. htcacheclean (p. 319) can take up to an hour or more to process very large (tens of gigabytes) caches and if you are running it from cron it is recommended that you determine how long a typical run takes, to avoid running more than one instance at a time.

It is also recommended that an appropriate "nice" level is chosen for htcacheclean so that the tool does not cause excessive disk I/O while the server is running.

Figure 1: Typical cache growth / clean sequence.

Because mod_cache_disk does not itself pay attention to how much space is used you should ensure that htcacheclean (p. 319) is configured to leave enough "grow room" following a clean.

Two-state Key/Value Shared Object Caching

Related Modules: mod_authn_socache, mod_socache_dbm, mod_socache_dc, mod_socache_memcache, mod_socache_shmcb, mod_ssl
Related Directives: AuthnCacheSOCache, SSLSessionCache, SSLStaplingCache

The Apache HTTP server offers a low level shared object cache for caching information such as SSL sessions, or authentication credentials, within the socache (p. 114) interface.

Additional modules are provided for each implementation, offering the following backends:

mod_socache_dbm: DBM based shared object cache.
mod_socache_dc: Distcache based shared object cache.
mod_socache_memcache: Memcache based shared object cache.
mod_socache_shmcb: Shared memory based shared object cache.

Caching Authentication Credentials

Related Modules: mod_authn_socache
Related Directives: AuthnCacheSOCache

The mod_authn_socache module allows the result of authentication to be cached, relieving load on authentication backends.

Caching SSL Sessions

Related Modules: mod_ssl
Related Directives: SSLSessionCache, SSLStaplingCache

The mod_ssl module uses the socache interface to provide a session cache and a stapling cache.

Specialized File Caching

Related Modules: mod_file_cache
Related Directives: CacheFile, MMapFile

On platforms where a filesystem might be slow, or where file handles are expensive, the option exists to pre-load files into memory on startup.

On systems where opening files is slow, the option exists to open the file on startup and cache the file handle. These options can help on systems where access to static files is slow.

File-Handle Caching

The act of opening a file can itself be a source of delay, particularly on network filesystems. By maintaining a cache of open file descriptors for commonly served files, httpd can avoid this delay. Currently httpd provides one implementation of File-Handle Caching.

CacheFile

The most basic form of caching present in httpd is the file-handle caching provided by mod_file_cache. Rather than caching file-contents, this cache maintains a table of open file descriptors. Files to be cached in this manner are specified in the configuration file using the CacheFile directive.

The CacheFile directive instructs httpd to open the file when it is started and to re-use this file-handle for all subsequent access to this file.

    CacheFile /usr/local/apache2/htdocs/index.html

If you intend to cache a large number of files in this manner, you must ensure that your operating system’s limit for the number of open files is set appropriately.
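One way to get a feel for the open-file limit mentioned above is to query it from a script on the same machine. The sketch below uses Python's Unix-only resource module; the helper names and the reserve of 256 descriptors are hypothetical choices made for illustration, and the limit a script sees may differ from the one that applies to the httpd processes.

```python
# Sketch: compare a planned number of cached file handles with the
# per-process open-file limit. Helper names and the reserve are assumptions.
import resource  # Unix-only standard library module


def open_file_limits():
    """Return the (soft, hard) limits on open file descriptors."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def fits_within_limit(cached_files, soft_limit, reserved=256):
    """True if cached_files plus a reserve for sockets and logs fits."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return cached_files + reserved <= soft_limit


soft, hard = open_file_limits()
print("soft limit:", soft, "hard limit:", hard)
print("room for 5000 cached files:", fits_within_limit(5000, soft))
```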
Although using CacheFile does not cause the file-contents to be cached per se, it does mean that if the file changes while httpd is running these changes will not be picked up. The file will be consistently served as it was when httpd was started.

If the file is removed while httpd is running, it will continue to maintain an open file descriptor and serve the file as it was when httpd was started. This usually also means that although the file will have been deleted, and not show up on the filesystem, extra free space will not be recovered until httpd is stopped and the file descriptor closed.

In-Memory Caching

Serving directly from system memory is universally the fastest method of serving content. Reading files from a disk controller or, even worse, from a remote network is orders of magnitude slower. Disk controllers usually involve physical processes, and network access is limited by your available bandwidth. Memory access on the other hand can take mere nanoseconds.

System memory isn’t cheap though; byte for byte it’s by far the most expensive type of storage and it’s important to ensure that it is used efficiently. By caching files in memory you decrease the amount of memory available on the system. As we’ll see, in the case of operating system caching, this is not so much of an issue, but when using httpd’s own in-memory caching it is important to make sure that you do not allocate too much memory to a cache. Otherwise the system will be forced to swap out memory, which will likely degrade performance.

Operating System Caching

Almost all modern operating systems cache file-data in memory managed directly by the kernel. This is a powerful feature, and for the most part operating systems get it right.
For example, on Linux, let’s look at the difference in the time it takes to read a file for the first time and the second time:

    colm@coroebus:~$ time cat testfile > /dev/null
    real    0m0.065s
    user    0m0.000s
    sys     0m0.001s
    colm@coroebus:~$ time cat testfile > /dev/null
    real    0m0.003s
    user    0m0.003s
    sys     0m0.000s

Even for this small file, there is a huge difference in the amount of time it takes to read the file. This is because the kernel has cached the file contents in memory.

By ensuring there is "spare" memory on your system, you can ensure that more and more file-contents will be stored in this cache. This can be a very efficient means of in-memory caching, and involves no extra configuration of httpd at all.

Additionally, because the operating system knows when files are deleted or modified, it can automatically remove file contents from the cache when necessary. This is a big advantage over httpd’s in-memory caching which has no way of knowing when a file has changed.

Despite the performance and advantages of automatic operating system caching there are some circumstances in which in-memory caching may be better performed by httpd.

MMapFile Caching

mod_file_cache provides the MMapFile directive, which allows you to have httpd map a static file’s contents into memory at start time (using the mmap system call). httpd will use the in-memory contents for all subsequent accesses to this file.

    MMapFile /usr/local/apache2/htdocs/index.html

As with the CacheFile directive, any changes in these files will not be picked up by httpd after it has started.

The MMapFile directive does not keep track of how much memory it allocates, so you must ensure not to over-use the directive. Each httpd child process will replicate this memory, so it is critically important to ensure that the files mapped are not so large as to cause the system to swap memory.
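For readers unfamiliar with the mmap system call that MMapFile relies on, the following Python sketch shows the general idea of mapping a file once and then reading its contents from the mapping. The helper name and the temporary file are hypothetical, and the sketch says nothing about how mod_file_cache manages its mappings internally.

```python
# Sketch: map a file into memory once, then read from the mapping.
# The helper name and sample content are illustrative assumptions.
import mmap
import os
import tempfile


def map_file(path):
    """Map a non-empty file read-only and return the mmap object."""
    with open(path, "rb") as handle:
        # Length 0 maps the whole file; the mapping outlives the file object.
        return mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ)


content = b"Hello from a mapped file\n"
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(content)
    path = tmp.name

mapped = map_file(path)   # done once, comparable to "at start time"
served = mapped[:]        # later reads come from the mapping
mapped.close()
os.unlink(path)
print(served == content)  # True
```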
Security Considerations

Authorization and Access Control

Using mod_cache in its default state where CacheQuickHandler is set to On is very much like having a caching reverse-proxy bolted to the front of the server. Requests will be served by the caching module unless it determines that the origin server should be queried just as an external cache would, and this drastically changes the security model of httpd.

As traversing a filesystem hierarchy to examine potential .htaccess files would be a very expensive operation, partially defeating the point of caching (to speed up requests), mod_cache makes no decision about whether a cached entity is authorised for serving. In other words, if mod_cache has cached some content, it will be served from the cache as long as that content has not expired.

If, for example, your configuration permits access to a resource by IP address you should ensure that this content is not cached. You can do this by using the CacheDisable directive, or mod_expires. Left unchecked, mod_cache - very much like a reverse proxy - would cache the content when served and then serve it to any client, on any IP address.

When the CacheQuickHandler directive is set to Off, the full set of request processing phases are executed and the security model remains unchanged.

Local exploits

As requests to end-users can be served from the cache, the cache itself can become a target for those wishing to deface or interfere with content. It is important to bear in mind that the cache must at all times be writable by the user which httpd is running as. This is in stark contrast to the usually recommended situation of maintaining all content unwritable by the Apache user.

If the Apache user is compromised, for example through a flaw in a CGI process, it is possible that the cache may be targeted. When using mod_cache_disk, it is relatively easy to insert or modify a cached entity.
This presents a somewhat elevated risk in comparison to the other types of attack it is possible to make as the Apache user. If you are using mod_cache_disk you should bear this in mind - ensure you upgrade httpd when security upgrades are announced and run CGI processes as a non-Apache user using suEXEC (p. 115) if possible.

Cache Poisoning

When running httpd as a caching proxy server, there is also the potential for so-called cache poisoning. Cache Poisoning is a broad term for attacks in which an attacker causes the proxy server to retrieve incorrect (and usually undesirable) content from the origin server.

For example if the DNS servers used by your system running httpd are vulnerable to DNS cache poisoning, an attacker may be able to control where httpd connects to when requesting content from the origin server. Another example is so-called HTTP request-smuggling attacks.

This document is not the correct place for an in-depth discussion of HTTP request smuggling (instead, try your favourite search engine), however it is important to be aware that it is possible to make a series of requests, and to exploit a vulnerability on an origin webserver such that the attacker can entirely control the content retrieved by the proxy.

Denial of Service / Cachebusting

The Vary mechanism allows multiple variants of the same URL to be cached side by side. Depending on header values provided by the client, the cache will select the correct variant to return to the client. This mechanism can become a problem when an attempt is made to vary on a header that is known to contain a wide range of possible values under normal use, for example the User-Agent header. Depending on the popularity of the particular web site thousands or millions of duplicate cache entries could be created for the same URL, crowding out other entries in the cache.
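The effect of varying on a high-cardinality header can be seen with a toy model of variant selection. In the Python sketch below the key layout, function name and sample header values are all hypothetical; it only illustrates that each distinct value of a header named in Vary yields a separate cache entry, and it does not reproduce the key format mod_cache uses.

```python
# Toy model: one cache entry per distinct combination of Vary-listed headers.
# Key layout, names and sample values are illustrative assumptions.


def variant_key(url, vary, request_headers):
    """Build a key from the URL and the request headers named in Vary."""
    names = sorted(n.strip().lower() for n in vary.split(",") if n.strip())
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (url, tuple((n, lowered.get(n, "")) for n in names))


requests = [
    {"User-Agent": "BrowserA/1.0", "Accept-Language": "en"},
    {"User-Agent": "BrowserA/1.1", "Accept-Language": "en"},
    {"User-Agent": "BrowserB/7.2", "Accept-Language": "en"},
]

by_language = {variant_key("/index.html", "Accept-Language", r) for r in requests}
by_agent = {variant_key("/index.html", "User-Agent", r) for r in requests}
print(len(by_language), len(by_agent))  # 1 3
```

With only three sample clients the difference is small, but the number of entries under "Vary: User-Agent" grows with every distinct browser string seen.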
In other cases, there may be a need to change the URL of a particular resource on every request, usually by adding a "cachebuster" string to the URL. If this content is declared cacheable by a server for a significant freshness lifetime, these entries can crowd out legitimate entries in a cache. While mod_cache provides a CacheIgnoreURLSessionIdentifiers directive, this directive should be used with care to ensure that downstream proxy or browser caches aren’t subjected to the same denial of service issue.

2.7 Server-Wide Configuration

This document explains some of the directives provided by the core server which are used to configure the basic operations of the server.

Server Identification

Related Directives: ServerName, ServerAdmin, ServerSignature, ServerTokens, UseCanonicalName, UseCanonicalPhysicalPort

The ServerAdmin and ServerTokens directives control what information about the server will be presented in server-generated documents such as error messages. The ServerTokens directive sets the value of the Server HTTP response header field.

The ServerName, UseCanonicalName and UseCanonicalPhysicalPort directives are used by the server to determine how to construct self-referential URLs. For example, when a client requests a directory, but does not include the trailing slash in the directory name, httpd must redirect the client to the full name including the trailing slash so that the client will correctly resolve relative references in the document.

File Locations

Related Directives: CoreDumpDirectory, DocumentRoot, ErrorLog, Mutex, PidFile, ScoreBoardFile, ServerRoot

These directives control the locations of the various files that httpd needs for proper operation. When the pathname used does not begin with a slash (/), the files are located relative to the ServerRoot.
Be careful about locating files in paths which are writable by non-root users. See the security tips (p. 364) documentation for more details.

Limiting Resource Usage

Related Directives: LimitRequestBody, LimitRequestFields, LimitRequestFieldSize, LimitRequestLine, RLimitCPU, RLimitMEM, RLimitNPROC, ThreadStackSize

The LimitRequest* directives are used to place limits on the amount of resources httpd will use in reading requests from clients. By limiting these values, some kinds of denial of service attacks can be mitigated.

The RLimit* directives are used to limit the amount of resources which can be used by processes forked off from the httpd children. In particular, this will control resources used by CGI scripts and SSI exec commands.

The ThreadStackSize directive is used with some platforms to control the stack size.

Implementation Choices

Related Directives: Mutex

The Mutex directive can be used to change the underlying implementation used for mutexes, in order to relieve functional or performance problems with APR’s default choice.

2.8 Log Files

In order to effectively manage a web server, it is necessary to get feedback about the activity and performance of the server as well as any problems that may be occurring. The Apache HTTP Server provides very comprehensive and flexible logging capabilities. This document describes how to configure its logging capabilities, and how to understand what the logs contain.

Overview

Related Modules: mod_log_config, mod_log_forensic, mod_logio, mod_cgi

The Apache HTTP Server provides a variety of different mechanisms for logging everything that happens on your server, from the initial request, through the URL mapping process, to the final resolution of the connection, including any errors that may have occurred in the process.
In addition to this, third-party modules may provide logging capabilities, or inject entries into the existing log files, and applications such as CGI programs, or PHP scripts, or other handlers, may send messages to the server error log.

In this document we discuss the logging modules that are a standard part of the HTTP server.

Security Warning

Anyone who can write to the directory where Apache httpd is writing a log file can almost certainly gain access to the uid that the server is started as, which is normally root. Do NOT give people write access to the directory the logs are stored in without being aware of the consequences; see the security tips (p. 364) document for details.

In addition, log files may contain information supplied directly by the client, without escaping. Therefore, it is possible for malicious clients to insert control-characters in the log files, so care must be taken in dealing with raw logs.

Error Log

Related Modules: core
Related Directives: ErrorLog, ErrorLogFormat, LogLevel

The server error log, whose name and location is set by the ErrorLog directive, is the most important log file. This is the place where Apache httpd will send diagnostic information and record any errors that it encounters in processing requests. It is the first place to look when a problem occurs with starting the server or with the operation of the server, since it will often contain details of what went wrong and how to fix it.

The error log is usually written to a file (typically error_log on Unix systems and error.log on Windows and OS/2). On Unix systems it is also possible to have the server send errors to syslog or pipe them to a program.

The format of the error log is defined by the ErrorLogFormat directive, with which you can customize what values are logged. A default format is defined if you don’t specify one. A typical log message follows:
    [Fri Sep 09 10:42:29.902022 2011] [core:error] [pid 35708:tid 4328636416] [client 72.15.99.187] File does not exist: /usr/local/apache2/htdocs/favicon.ico

The first item in the log entry is the date and time of the message. The next is the module producing the message (core, in this case) and the severity level of that message. This is followed by the process ID and, if appropriate, the thread ID, of the process that experienced the condition. Next, we have the client address that made the request. And finally is the detailed error message, which in this case indicates a request for a file that did not exist.

A very wide variety of different messages can appear in the error log. Most look similar to the example above. The error log will also contain debugging output from CGI scripts. Any information written to stderr by a CGI script will be copied directly to the error log.

Putting a %L token in both the error log and the access log will produce a log entry ID with which you can correlate the entry in the error log with the entry in the access log. If mod_unique_id is loaded, its unique request ID will be used as the log entry ID, too.

During testing, it is often useful to continuously monitor the error log for any problems. On Unix systems, you can accomplish this using:

    tail -f error_log

Per-module logging

The LogLevel directive allows you to specify a log severity level on a per-module basis. In this way, if you are troubleshooting a problem with just one particular module, you can turn up its logging volume without also getting the details of other modules that you’re not interested in. This is particularly useful for modules such as mod_proxy or mod_rewrite where you want to know details about what it’s trying to do.

Do this by specifying the name of the module in your LogLevel directive:

    LogLevel info rewrite:trace5

This sets the main LogLevel to info, but turns it up to trace5 for mod_rewrite.
This replaces the per-module logging directives, such as RewriteLog, that were present in earlier versions of the server.

Access Log

Related Modules: mod_log_config, mod_setenvif
Related Directives: CustomLog, LogFormat, SetEnvIf

The server access log records all requests processed by the server. The location and content of the access log are controlled by the CustomLog directive. The LogFormat directive can be used to simplify the selection of the contents of the logs. This section describes how to configure the server to record information in the access log.

Of course, storing the information in the access log is only the start of log management. The next step is to analyze this information to produce useful statistics. Log analysis in general is beyond the scope of this document, and not really part of the job of the web server itself. For more information about this topic, and for applications which perform log analysis, check the Open Directory [16].

Various versions of Apache httpd have used other modules and directives to control access logging, including mod_log_referer, mod_log_agent, and the TransferLog directive. The CustomLog directive now subsumes the functionality of all the older directives.

The format of the access log is highly configurable. The format is specified using a format string that looks much like a C-style printf(1) format string. Some examples are presented in the next sections. For a complete list of the possible contents of the format string, see the mod_log_config format strings (p. 705).

Common Log Format

A typical configuration for the access log might look as follows.

    LogFormat "%h %l %u %t \"%r\" %>s %b" common
    CustomLog "logs/access_log" common

This defines the nickname common and associates it with a particular log format string. The format string consists of percent directives, each of which tells the server to log a particular piece of information.
Literal characters may also be placed in the format string and will be copied directly into the log output. The quote character (") must be escaped by placing a backslash before it to prevent it from being interpreted as the end of the format string. The format string may also contain the special control characters "\n" for new-line and "\t" for tab.

The CustomLog directive sets up a new log file using the defined nickname. The filename for the access log is relative to the ServerRoot unless it begins with a slash.

The above configuration will write log entries in a format known as the Common Log Format (CLF). This standard format can be produced by many different web servers and read by many log analysis programs. The log file entries produced in CLF will look something like this:

    127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326

Each part of this log entry is described below.

127.0.0.1 (%h)
This is the IP address of the client (remote host) which made the request to the server. If HostnameLookups is set to On, then the server will try to determine the hostname and log it in place of the IP address. However, this configuration is not recommended since it can significantly slow the server. Instead, it is best to use a log post-processor such as logresolve to determine the hostnames. The IP address reported here is not necessarily the address of the machine at which the user is sitting. If a proxy server exists between the user and the server, this address will be the address of the proxy, rather than the originating machine.

- (%l)
The "hyphen" in the output indicates that the requested piece of information is not available. In this case, the information that is not available is the RFC 1413 identity of the client determined by identd on the client's machine. This information is highly unreliable and should almost never be used except on tightly controlled internal networks.
Apache httpd will not even attempt to determine this information unless IdentityCheck is set to On.

frank (%u)
This is the userid of the person requesting the document as determined by HTTP authentication. The same value is typically provided to CGI scripts in the REMOTE_USER environment variable. If the status code for the request (see below) is 401, then this value should not be trusted because the user is not yet authenticated. If the document is not password protected, this part will be "-" just like the previous one.

[16] http://dmoz.org/Computers/Software/Internet/Site_Management/Log_Analysis/

[10/Oct/2000:13:55:36 -0700] (%t)
The time that the request was received. The format is:

    [day/month/year:hour:minute:second zone]
    day = 2*digit
    month = 3*letter
    year = 4*digit
    hour = 2*digit
    minute = 2*digit
    second = 2*digit
    zone = (‘+’ | ‘-’) 4*digit

It is possible to have the time displayed in another format by specifying %{format}t in the log format string, where format is either as in strftime(3) from the C standard library, or one of the supported special tokens. For details see the mod_log_config format strings (p. 705).

"GET /apache_pb.gif HTTP/1.0" (\"%r\")
The request line from the client is given in double quotes. The request line contains a great deal of useful information. First, the method used by the client is GET. Second, the client requested the resource /apache_pb.gif, and third, the client used the protocol HTTP/1.0. It is also possible to log one or more parts of the request line independently. For example, the format string "%m %U%q %H" will log the method, path, query-string, and protocol, resulting in exactly the same output as "%r".

200 (%>s)
This is the status code that the server sends back to the client.
This information is very valuable, because it reveals whether the request resulted in a successful response (codes beginning in 2), a redirection (codes beginning in 3), an error caused by the client (codes beginning in 4), or an error in the server (codes beginning in 5). The full list of possible status codes can be found in the HTTP specification (RFC2616 section 10, http://www.w3.org/Protocols/rfc2616/rfc2616.txt).

2326 (%b)
The last part indicates the size of the object returned to the client, not including the response headers. If no content was returned to the client, this value will be "-". To log "0" for no content, use %B instead.

Combined Log Format

Another commonly used format string is called the Combined Log Format. It can be used as follows.

    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined
    CustomLog "log/access_log" combined

This format is exactly the same as the Common Log Format, with the addition of two more fields. Each of the additional fields uses the percent-directive %{header}i, where header can be any HTTP request header. The access log under this format will look like:

    127.0.0.1 - frank [10/Oct/2000:13:55:36 -0700] "GET /apache_pb.gif HTTP/1.0" 200 2326 "http://www.example.com/start.html" "Mozilla/4.08 [en] (Win98; I ;Nav)"

The additional fields are:

"http://www.example.com/start.html" (\"%{Referer}i\")
The "Referer" (sic) HTTP request header. This gives the site that the client reports having been referred from. (This should be the page that links to or includes /apache_pb.gif).

"Mozilla/4.08 [en] (Win98; I ;Nav)" (\"%{User-agent}i\")
The User-Agent HTTP request header. This is the identifying information that the client browser reports about itself.

Multiple Access Logs

Multiple access logs can be created simply by specifying multiple CustomLog directives in the configuration file. For example, the following directives will create three access logs.
The first contains the basic CLF information, while the second and third contain referer and browser information. The last two CustomLog lines show how to mimic the effects of the RefererLog and AgentLog directives.

    LogFormat "%h %l %u %t \"%r\" %>s %b" common
    CustomLog "logs/access_log" common
    CustomLog "logs/referer_log" "%{Referer}i -> %U"
    CustomLog "logs/agent_log" "%{User-agent}i"

This example also shows that it is not necessary to define a nickname with the LogFormat directive. Instead, the log format can be specified directly in the CustomLog directive.

Conditional Logs

There are times when it is convenient to exclude certain entries from the access logs based on characteristics of the client request. This is easily accomplished with the help of environment variables (p. 92). First, an environment variable must be set to indicate that the request meets certain conditions. This is usually accomplished with SetEnvIf. Then the env= clause of the CustomLog directive is used to include or exclude requests where the environment variable is set. Some examples:

    # Mark requests from the loop-back interface
    SetEnvIf Remote_Addr "127\.0\.0\.1" dontlog
    # Mark requests for the robots.txt file
    SetEnvIf Request_URI "^/robots\.txt$" dontlog
    # Log what remains
    CustomLog "logs/access_log" common env=!dontlog

As another example, consider logging requests from English speakers to one log file, and requests from non-English speakers to a different log file.

    SetEnvIf Accept-Language "en" english
    CustomLog "logs/english_log" common env=english
    CustomLog "logs/non_english_log" common env=!english

In a caching scenario one would want to know about the efficiency of the cache. A very simple method to find this out would be:

    SetEnv CACHE_MISS 1
    LogFormat "%h %l %u %t \"%r\" %>s %b %{CACHE_MISS}e" common-cache
    CustomLog "logs/access_log" common-cache

mod_cache will run before mod_env and, when successful, will deliver the content without it.
In that case a cache hit will log -, while a cache miss will log 1.

In addition to the env= syntax, LogFormat supports logging values conditional upon the HTTP response code:

    LogFormat "%400,501{User-agent}i" browserlog
    LogFormat "%!200,304,302{Referer}i" refererlog

In the first example, the User-agent will be logged if the HTTP status code is 400 or 501. In other cases, a literal "-" will be logged instead. Likewise, in the second example, the Referer will be logged if the HTTP status code is not 200, 304, or 302. (Note the "!" before the status codes.)

Although we have just shown that conditional logging is very powerful and flexible, it is not the only way to control the contents of the logs. Log files are more useful when they contain a complete record of server activity. It is often easier to simply post-process the log files to remove requests that you do not want to consider.

Log Rotation

On even a moderately busy server, the quantity of information stored in the log files is very large. The access log file typically grows 1 MB or more per 10,000 requests. It will consequently be necessary to periodically rotate the log files by moving or deleting the existing logs. This cannot be done while the server is running, because Apache httpd will continue writing to the old log file as long as it holds the file open. Instead, the server must be restarted (p. 29) after the log files are moved or deleted so that it will open new log files.

By using a graceful restart, the server can be instructed to open new log files without losing any existing or pending connections from clients. However, in order to accomplish this, the server must continue to write to the old log files while it finishes serving old requests. It is therefore necessary to wait for some time after the restart before doing any processing on the log files.
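
As an illustration of the kind of off-line post-processing mentioned above, the following Python sketch parses Common Log Format entries and drops the same requests that the Conditional Logs example excluded (loop-back clients and requests for /robots.txt). It is a minimal sketch, not a tool shipped with httpd: the names CLF_PATTERN, parse_clf, keep_entry and filter_log are hypothetical, and the pattern assumes the request line contains no escaped double quotes.

```python
import re

# Fields of the Common Log Format: %h %l %u %t "%r" %>s %b
# Assumption: the request line itself contains no double quotes.
CLF_PATTERN = re.compile(
    r'^(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)$'
)


def parse_clf(line):
    """Return the CLF fields of one log line as a dict, or None if it does not match."""
    match = CLF_PATTERN.match(line.strip())
    return match.groupdict() if match else None


def keep_entry(entry):
    """Hypothetical filter: drop loop-back clients and requests for /robots.txt."""
    if entry["host"] == "127.0.0.1":
        return False
    parts = entry["request"].split()
    # The request line is "METHOD path PROTOCOL"; compare the path without its query string.
    if len(parts) >= 2 and parts[1].split("?", 1)[0] == "/robots.txt":
        return False
    return True


def filter_log(lines):
    """Yield only the lines that parse as CLF and pass keep_entry()."""
    for line in lines:
        entry = parse_clf(line)
        if entry is not None and keep_entry(entry):
            yield line
```

Such a filter would normally be run on a rotated file (for example access_log.old) once the server no longer writes to it, which leaves the original log as a complete record.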

A typical scenario that simply rotates the logs and compresses the old logs to save space is:

    mv access_log access_log.old
    mv error_log error_log.old
    apachectl graceful
    sleep 600
    gzip access_log.old error_log.old

Another way to perform log rotation is using piped logs as discussed in the next section.

Piped Logs

Apache httpd is capable of writing error and access log files through a pipe to another process, rather than directly to a file. This capability dramatically increases the flexibility of logging, without adding code to the main server. In order to write logs to a pipe, simply replace the filename with the pipe character "|", followed by the name of the executable which should accept log entries on its standard input. The server will start the piped-log process when the server starts, and will restart it if it crashes while the server is running. (This last feature is why we can refer to this technique as "reliable piped logging".)

Piped log processes are spawned by the parent Apache httpd process, and inherit the userid of that process. This means that piped log programs usually run as root. It is therefore very important to keep the programs simple and secure.

One important use of piped logs is to allow log rotation without having to restart the server. The Apache HTTP Server includes a simple program called rotatelogs for this purpose. For example, to rotate the logs every 24 hours, you can use:

    CustomLog "|/usr/local/apache/bin/rotatelogs /var/log/access_log 86400" common

Notice that quotes are used to enclose the entire command that will be called for the pipe. Although these examples are for the access log, the same technique can be used for the error log.

As with conditional logging, piped logs are a very powerful tool, but they should not be used where a simpler solution like off-line post-processing is available.

By default the piped log process is spawned without invoking a shell.
Use "|$" instead of "|" to spawn using a shell (usually with /bin/sh -c):

    # Invoke "rotatelogs" using a shell
    CustomLog "|$/usr/local/apache/bin/rotatelogs /var/log/access_log 86400" common

This was the default behaviour for Apache 2.2. Depending on the shell specifics this might lead to an additional shell process for the lifetime of the logging pipe program and signal handling problems during restart. For compatibility reasons with Apache 2.2 the notation "||" is also supported and equivalent to using "|".

Windows note: On Windows, you may run into problems when running many piped logger processes, especially when httpd is running as a service. This is caused by running out of desktop heap space. The desktop heap space given to each service is specified by the third argument to the SharedSection parameter in the HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\SessionManager\SubSystems\Windows registry value. Change this value with care; the normal caveats for changing the Windows registry apply, but you might also exhaust the desktop heap pool if the number is adjusted too high.

Virtual Hosts

When running a server with many virtual hosts (p. 124), there are several options for dealing with log files. First, it is possible to use logs exactly as in a single-host server. Simply by placing the logging directives outside the <VirtualHost> sections in the main server context, it is possible to log all requests in the same access log and error log. This technique does not allow for easy collection of statistics on individual virtual hosts.

If CustomLog or ErrorLog directives are placed inside a <VirtualHost> section, all requests or errors for that virtual host will be logged only to the specified file. Any virtual host which does not have logging directives will still have its requests sent to the main server logs. This technique is very useful for a small number of virtual hosts, but if the number of hosts is very large, it can be complicated to manage.
In addition, it can often create problems with insufficient file descriptors (p. 144).

For the access log, there is a very good compromise. By adding information on the virtual host to the log format string, it is possible to log all hosts to the same log, and later split the log into individual files. For example, consider the following directives.

    LogFormat "%v %l %u %t \"%r\" %>s %b" comonvhost
    CustomLog "logs/access_log" comonvhost

The %v is used to log the name of the virtual host that is serving the request. Then a program like split-logfile (p. 334) can be used to post-process the access log in order to split it into one file per virtual host.

Other Log Files

Related Modules: mod_logio, mod_log_config, mod_log_forensic, mod_cgi
Related Directives: LogFormat, BufferedLogs, ForensicLog, PidFile, ScriptLog, ScriptLogBuffer, ScriptLogLength

Logging actual bytes sent and received

mod_logio adds two additional LogFormat fields (%I and %O) that log the actual number of bytes received and sent on the network.

Forensic Logging

mod_log_forensic provides for forensic logging of client requests. Logging is done before and after processing a request, so the forensic log contains two log lines for each request. The forensic logger is very strict with no customizations. It can be an invaluable debugging and security tool.

PID File

On startup, Apache httpd saves the process id of the parent httpd process to the file logs/httpd.pid. This filename can be changed with the PidFile directive. The process-id is for use by the administrator in restarting and terminating the daemon by sending signals to the parent process; on Windows, use the -k command line option instead. For more information see the Stopping and Restarting (p. 29) page.

Script Log

In order to aid in debugging, the ScriptLog directive allows you to record the input to and output from CGI scripts. This should only be used in testing - not for live servers.
More information is available in the mod_cgi (p. 580) documentation.

2.9 Mapping URLs to Filesystem Locations

This document explains how the Apache HTTP Server uses the URL of a request to determine the filesystem location from which to serve a file.

Related Modules and Directives

Related Modules: mod_actions, mod_alias, mod_autoindex, mod_dir, mod_imagemap, mod_negotiation, mod_proxy, mod_rewrite, mod_speling, mod_userdir, mod_vhost_alias
Related Directives: Alias, AliasMatch, CheckSpelling, DirectoryIndex, DocumentRoot, ErrorDocument, Options, ProxyPass, ProxyPassReverse, ProxyPassReverseCookieDomain, ProxyPassReverseCookiePath, Redirect, RedirectMatch, RewriteCond, RewriteRule, ScriptAlias, ScriptAliasMatch, UserDir

DocumentRoot

In deciding what file to serve for a given request, httpd's default behavior is to take the URL-Path for the request (the part of the URL following the hostname and port) and add it to the end of the DocumentRoot specified in your configuration files. Therefore, the files and directories underneath the DocumentRoot make up the basic document tree which will be visible from the web.

For example, if DocumentRoot were set to /var/www/html then a request for http://www.example.com/fish/guppies.html would result in the file /var/www/html/fish/guppies.html being served to the requesting client.

If a directory is requested (i.e. a path ending with /), the file served from that directory is defined by the DirectoryIndex directive. For example, if DocumentRoot were set as above, and you were to set:

    DirectoryIndex index.html index.php

Then a request for http://www.example.com/fish/ will cause httpd to attempt to serve the file /var/www/html/fish/index.html. In the event that that file does not exist, it will next attempt to serve the file /var/www/html/fish/index.php.
If neither of these files existed, the next step is to attempt to provide a directory index, if mod_autoindex is loaded and configured to permit that.

httpd is also capable of Virtual Hosting (p. 124), where the server receives requests for more than one host. In this case, a different DocumentRoot can be specified for each virtual host, or alternatively, the directives provided by the module mod_vhost_alias can be used to dynamically determine the appropriate place from which to serve content based on the requested IP address or hostname.

The DocumentRoot directive is set in your main server configuration file (httpd.conf) and, possibly, once per additional Virtual Host (p. 124) you create.

Files Outside the DocumentRoot

There are frequently circumstances where it is necessary to allow web access to parts of the filesystem that are not strictly underneath the DocumentRoot. httpd offers several different ways to accomplish this. On Unix systems, symbolic links can bring other parts of the filesystem under the DocumentRoot. For security reasons, httpd will follow symbolic links only if the Options setting for the relevant directory includes FollowSymLinks or SymLinksIfOwnerMatch.

Alternatively, the Alias directive will map any part of the filesystem into the web space. For example, with

    Alias "/docs" "/var/web"

the URL http://www.example.com/docs/dir/file.html will be served from /var/web/dir/file.html. The ScriptAlias directive works the same way, with the additional effect that all content located at the target path is treated as CGI scripts.

For situations where you require additional flexibility, you can use the AliasMatch and ScriptAliasMatch directives to do powerful regular expression based matching and substitution.
For example,

    ScriptAliasMatch "^/~([a-zA-Z0-9]+)/cgi-bin/(.+)" "/home/$1/cgi-bin/$2"

will map a request to http://example.com/~user/cgi-bin/script.cgi to the path /home/user/cgi-bin/script.cgi and will treat the resulting file as a CGI script.

User Directories

Traditionally on Unix systems, the home directory of a particular user can be referred to as ~user/. The module mod_userdir extends this idea to the web by allowing files under each user's home directory to be accessed using URLs such as the following.

    http://www.example.com/~user/file.html

For security reasons, it is inappropriate to give direct access to a user's home directory from the web. Therefore, the UserDir directive specifies a directory underneath the user's home directory where web files are located. Using the default setting of Userdir public_html, the above URL maps to a file at a directory like /home/user/public_html/file.html where /home/user/ is the user's home directory as specified in /etc/passwd.

There are also several other forms of the Userdir directive which you can use on systems where /etc/passwd does not contain the location of the home directory.

Some people find the "~" symbol (which is often encoded on the web as %7e) to be awkward and prefer to use an alternate string to represent user directories. This functionality is not supported by mod_userdir. However, if users' home directories are structured in a regular way, then it is possible to use the AliasMatch directive to achieve the desired effect. For example, to make http://www.example.com/upages/user/file.html map to /home/user/public_html/file.html, use the following AliasMatch directive:

    AliasMatch "^/upages/([a-zA-Z0-9]+)(/(.*))?$" "/home/$1/public_html/$3"

URL Redirection

The configuration directives discussed in the above sections tell httpd to get content from a specific place in the filesystem and return it to the client.
Sometimes, it is desirable instead to inform the client that the requested content is located at a different URL, and instruct the client to make a new request with the new URL. This is called redirection and is implemented by the Redirect directive. For example, if the contents of the directory /foo/ under the DocumentRoot are moved to the new directory /bar/, you can instruct clients to request the content at the new location as follows:

    Redirect permanent "/foo/" "http://www.example.com/bar/"

This will redirect any URL-Path starting in /foo/ to the same URL path on the www.example.com server with /bar/ substituted for /foo/. You can redirect clients to any server, not only the origin server.

httpd also provides a RedirectMatch directive for more complicated rewriting problems. For example, to redirect requests for the site home page to a different site, but leave all other requests alone, use the following configuration:

    RedirectMatch permanent "^/$" "http://www.example.com/startpage.html"

Alternatively, to temporarily redirect all pages on one site to a particular page on another site, use the following:

    RedirectMatch temp ".*" "http://othersite.example.com/startpage.html"

Reverse Proxy

httpd also allows you to bring remote documents into the URL space of the local server. This technique is called reverse proxying because the web server acts like a proxy server by fetching the documents from a remote server and returning them to the client. It is different from normal (forward) proxying because, to the client, it appears the documents originate at the reverse proxy server.

In the following example, when clients request documents under the /foo/ directory, the server fetches those documents from the /bar/ directory on internal.example.com and returns them to the client as if they were from the local server.

    ProxyPass "/foo/" "http://internal.example.com/bar/"
    ProxyPassReverse "/foo/" "http://internal.example.com/bar/"
    ProxyPassReverseCookieDomain internal.example.com public.example.com
    ProxyPassReverseCookiePath "/foo/" "/bar/"

The ProxyPass directive configures the server to fetch the appropriate documents, while the ProxyPassReverse directive rewrites redirects originating at internal.example.com so that they target the appropriate directory on the local server. Similarly, the ProxyPassReverseCookieDomain and ProxyPassReverseCookiePath directives rewrite cookies set by the backend server.

It is important to note, however, that links inside the documents will not be rewritten. So any absolute links on internal.example.com will result in the client breaking out of the proxy server and requesting directly from internal.example.com. You can modify these links (and other content) in a page as it is being served to the client using mod_substitute.

    Substitute s/internal\.example\.com/www.example.com/i

For more sophisticated rewriting of links in HTML and XHTML, the mod_proxy_html module is also available. It allows you to create maps of URLs that need to be rewritten, so that complex proxying scenarios can be handled.

Rewriting Engine

When even more powerful substitution is required, the rewriting engine provided by mod_rewrite can be useful. The directives provided by this module can use characteristics of the request such as browser type or source IP address in deciding from where to serve content. In addition, mod_rewrite can use external database files or programs to determine how to handle a request. The rewriting engine is capable of performing all three types of mappings discussed above: internal redirects (aliases), external redirects, and proxying. Many practical examples employing mod_rewrite are discussed in the detailed mod_rewrite documentation (p. 146).
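
The three kinds of mapping discussed above (alias, external redirect, and proxying) all substitute a target for a leading part of the URL-Path; they differ in what the result means. The Python sketch below illustrates only that substitution, using the prefixes and targets from the Alias, Redirect and ProxyPass examples in this document. It is a simplified model, not httpd's implementation: the function names are hypothetical, and URL decoding, query strings and the order in which httpd evaluates directives are ignored.

```python
def map_prefix(url_path, prefix, target):
    """Replace a leading prefix of url_path with target, or return None if it does not match.

    A prefix without a trailing slash only matches on a path-segment boundary,
    so "/docs" matches "/docs" and "/docs/x" but not "/docsfoo".
    """
    if prefix.endswith("/"):
        matched = url_path.startswith(prefix)
    else:
        matched = url_path == prefix or url_path.startswith(prefix + "/")
    return target + url_path[len(prefix):] if matched else None


def alias(url_path):
    # Alias "/docs" "/var/web": the result is a filesystem path served locally.
    return map_prefix(url_path, "/docs", "/var/web")


def redirect(url_path):
    # Redirect permanent "/foo/" "http://www.example.com/bar/":
    # the result is a URL the client is told to request instead.
    return map_prefix(url_path, "/foo/", "http://www.example.com/bar/")


def proxy_pass(url_path):
    # ProxyPass "/foo/" "http://internal.example.com/bar/":
    # the result is a backend URL that the server itself fetches.
    return map_prefix(url_path, "/foo/", "http://internal.example.com/bar/")
```

With these definitions, alias("/docs/dir/file.html") yields /var/web/dir/file.html, matching the Alias example given earlier.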

File Not Found

Inevitably, URLs will be requested for which no matching file can be found in the filesystem. This can happen for several reasons. In some cases, it can be a result of moving documents from one location to another. In this case, it is best to use URL redirection to inform clients of the new location of the resource. In this way, you can assure that old bookmarks and links will continue to work, even though the resource is at a new location.

Another common cause of "File Not Found" errors is accidental mistyping of URLs, either directly in the browser, or in HTML links. httpd provides the module mod_speling (sic) to help with this problem. When this module is activated, it will intercept "File Not Found" errors and look for a resource with a similar filename. If one such file is found, mod_speling will send an HTTP redirect to the client informing it of the correct location. If several "close" files are found, a list of available alternatives will be presented to the client.

An especially useful feature of mod_speling is that it will compare filenames without respect to case. This can help systems where users are unaware of the case-sensitive nature of URLs and the Unix filesystem. But using mod_speling for anything more than the occasional URL correction can place additional load on the server, since each "incorrect" request is followed by a URL redirection and a new request from the client.

mod_dir provides FallbackResource, which can be used to map virtual URIs to a real resource, which then serves them. This is a very useful replacement for mod_rewrite when implementing a 'front controller'.

If all attempts to locate the content fail, httpd returns an error page with HTTP status code 404 (file not found). The appearance of this page is controlled with the ErrorDocument directive and can be customized in a flexible manner as discussed in the Custom error responses (p. 85) document.
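
The case-insensitive comparison described for mod_speling above can be pictured with a small Python sketch. This is an illustration of the idea only and not mod_speling's actual algorithm; the function names and the three outcome labels are hypothetical.

```python
def find_close_names(requested, available):
    """Return the names in `available` that equal `requested` when case is ignored."""
    wanted = requested.lower()
    return [name for name in available if name.lower() == wanted]


def resolve(requested, available):
    """Classify a failed lookup as 'redirect', 'choices' or 'not found'."""
    matches = find_close_names(requested, available)
    if len(matches) == 1:
        # Exactly one candidate: the client can be redirected to it.
        return ("redirect", matches[0])
    if matches:
        # Several candidates: present the alternatives to the client.
        return ("choices", sorted(matches))
    return ("not found", None)
```

The single-match case corresponds to the redirect described above, which is also why each corrected request costs the server one extra round trip.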

Other URL Mapping Modules

Other modules available for URL mapping include:

• mod_actions - Maps a request to a CGI script based on the request method, or resource MIME type.
• mod_dir - Provides basic mapping of a trailing slash into an index file such as index.html.
• mod_imagemap - Maps a request to a URL based on where a user clicks on an image embedded in an HTML document.
• mod_negotiation - Selects an appropriate document based on client preferences such as language or content compression.

2.10 Dynamic Shared Object (DSO) Support

The Apache HTTP Server is a modular program where the administrator can choose the functionality to include in the server by selecting a set of modules. Modules will be compiled as Dynamic Shared Objects (DSOs) that exist separately from the main httpd binary file. DSO modules may be compiled at the time the server is built, or they may be compiled and added at a later time using the Apache Extension Tool (apxs).

Alternatively, the modules can be statically compiled into the httpd binary when the server is built.

This document describes how to use DSO modules as well as the theory behind their use.

Implementation

Related Modules: mod_so
Related Directives: LoadModule

The DSO support for loading individual Apache httpd modules is based on a module named mod_so which must be statically compiled into the Apache httpd core. It is the only module besides core which cannot be put into a DSO itself. Practically all other distributed Apache httpd modules will then be placed into a DSO. After a module is compiled into a DSO named mod_foo.so you can use mod_so's LoadModule directive in your httpd.conf file to load this module at server startup or restart.

The DSO builds for individual modules can be disabled via configure's --enable-mods-static option as discussed in the install documentation (p. 22).

To simplify the creation of DSO files for Apache httpd modules (especially for third-party modules) a support program named apxs (APache eXtenSion) is available. It can be used to build DSO based modules outside of the Apache httpd source tree. The idea is simple: when installing Apache HTTP Server, configure's make install procedure installs the Apache httpd C header files and puts the platform-dependent compiler and linker flags for building DSO files into the apxs program. This way the user can use apxs to compile Apache httpd module sources without the Apache httpd distribution source tree and without having to fiddle with the platform-dependent compiler and linker flags for DSO support.

Usage Summary

To give you an overview of the DSO features of Apache HTTP Server 2.x, here is a short and concise summary:

1. Build and install a distributed Apache httpd module, say mod_foo.c, into its own DSO mod_foo.so:

    $ ./configure --prefix=/path/to/install --enable-foo
    $ make install

2. Configure Apache HTTP Server with all modules enabled. Only a basic set will be loaded during server startup. You can change the set of loaded modules by activating or deactivating the LoadModule directives in httpd.conf.

    $ ./configure --enable-mods-shared=all
    $ make install

3. Some modules are only useful for developers and will not be built when using the module set all. To build all available modules including developer modules use reallyall. In addition the LoadModule directives for all built modules can be activated via the configure option --enable-load-all-modules.

    $ ./configure --enable-mods-shared=reallyall --enable-load-all-modules
    $ make install

4. Build and install a third-party Apache httpd module, say mod_foo.c, into its own DSO mod_foo.so outside of the Apache httpd source tree using apxs:

    $ cd /path/to/3rdparty
    $ apxs -cia mod_foo.c

In all cases, once the shared module is compiled, you must use a LoadModule directive in httpd.conf to tell Apache httpd to activate the module.

See the apxs documentation (p. 303) for more details.

Background

On modern Unix derivatives there exists a mechanism called dynamic linking/loading of Dynamic Shared Objects (DSO) which provides a way to build a piece of program code in a special format for loading it at run-time into the address space of an executable program.

This loading can usually be done in two ways: automatically by a system program called ld.so when an executable program is started or manually from within the executing program via a programmatic system interface to the Unix loader through the system calls dlopen()/dlsym().

In the first way the DSOs are usually called shared libraries or DSO libraries and named libfoo.so or libfoo.so.1.2. They reside in a system directory (usually /usr/lib) and the link to the executable program is established at build-time by specifying -lfoo to the linker command. This hard-codes library references into the executable program file so that at start-time the Unix loader is able to locate libfoo.so in /usr/lib, in paths hard-coded via linker-options like -R or in paths configured via the environment variable LD_LIBRARY_PATH. It then resolves any (yet unresolved) symbols in the executable program which are available in the DSO.

Symbols in the executable program are usually not referenced by the DSO (because it's a reusable library of general code) and hence no further resolving has to be done. The executable program has no need to do anything on its own to use the symbols from the DSO because the complete resolving is done by the Unix loader.
(In fact, the code to invoke ld.so is part of the run-time startup code which is linked into every executable program which has been bound non-static.) The advantage of dynamic loading of common library code is obvious: the library code needs to be stored only once, in a system library like libc.so, saving disk space for every program.

In the second way the DSOs are usually called shared objects or DSO files and can be named with an arbitrary extension (although the canonical name is foo.so). These files usually stay inside a program-specific directory and there is no automatically established link to the executable program where they are used. Instead the executable program manually loads the DSO at run-time into its address space via dlopen(). At this time no resolving of symbols from the DSO for the executable program is done. Instead, the Unix loader automatically resolves any (yet unresolved) symbols in the DSO from the set of symbols exported by the executable program and its already loaded DSO libraries (especially all symbols from the ubiquitous libc.so). This way the DSO gets knowledge of the executable program's symbol set as if it had been statically linked with it in the first place.

Finally, to take advantage of the DSO's API the executable program has to resolve particular symbols from the DSO via dlsym() for later use inside dispatch tables etc. In other words: the executable program has to manually resolve every symbol it needs to be able to use it. The advantage of such a mechanism is that optional program parts need not be loaded (and thus do not spend memory) until they are needed by the program in question. When required, these program parts can be loaded dynamically to extend the base program's functionality.
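The second way (manual loading followed by explicit symbol lookup) can be tried out from a scripting language. The following is a minimal sketch, not part of httpd: it assumes a Unix-like system and uses Python's ctypes module, whose CDLL object loads a shared object much as dlopen() does and whose attribute lookup resolves a symbol much as dlsym() does. The helper name load_symbol is hypothetical.

```python
import ctypes
import ctypes.util


def load_symbol(library_name, symbol_name):
    """Load a shared object at run time and look up one symbol in it."""
    # find_library() may return None; CDLL(None) then opens the running
    # program itself, whose global symbols normally include libc.
    path = ctypes.util.find_library(library_name)
    handle = ctypes.CDLL(path)            # plays the role of dlopen()
    return getattr(handle, symbol_name)   # plays the role of dlsym()


# Nothing is resolved automatically: the caller names each symbol it needs
# and declares its C signature before calling it.
strlen = load_symbol("c", "strlen")
strlen.argtypes = [ctypes.c_char_p]
strlen.restype = ctypes.c_size_t

print(strlen(b"httpd"))
```

As in the description above, the loading program has to ask for every symbol by name; a symbol that does not exist in the loaded object raises an AttributeError at lookup time.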
Although this DSO mechanism sounds straightforward there is at least one difficult step here: the resolving of symbols from the executable program for the DSO when using a DSO to extend a program (the second way). Why? Because "reverse resolving" DSO symbols from the executable program's symbol set is against the library design (where the library has no knowledge about the programs it is used by) and is neither available under all platforms nor standardized. In practice the executable program's global symbols are often not re-exported and thus not available for use in a DSO. Finding a way to force the linker to export all global symbols is the main problem one has to solve when using DSO for extending a program at run-time.

The shared library approach is the typical one, because it is what the DSO mechanism was designed for, hence it is used for nearly all types of libraries the operating system provides.

Advantages and Disadvantages

The above DSO based features have the following advantages:

• The server package is more flexible at run-time because the server process can be assembled at run-time via LoadModule httpd.conf configuration directives instead of configure options at build-time. For instance, this way one is able to run different server instances (standard & SSL version, minimalistic & dynamic version [mod_perl, mod_php], etc.) with only one Apache httpd installation.
• The server package can be easily extended with third-party modules even after installation. This is a great benefit for vendor package maintainers, who can create an Apache httpd core package and additional packages containing extensions like PHP, mod_perl, mod_security, etc.
• Easier Apache httpd module prototyping, because with the DSO/apxs pair you can work outside the Apache httpd source tree and only need an apxs -i command followed by an apachectl restart to bring a new version of your currently developed module into the running Apache HTTP Server.
DSO has the following disadvantages:

• The server is approximately 20% slower at startup time because of the symbol resolving overhead the Unix loader now has to do.
• The server is approximately 5% slower at execution time under some platforms, because position independent code (PIC) sometimes needs complicated assembler tricks for relative addressing, which are not necessarily as fast as absolute addressing.
• Because DSO modules cannot be linked against other DSO-based libraries (ld -lfoo) on all platforms (for instance, a.out-based platforms usually don't provide this functionality while ELF-based platforms do), you cannot use the DSO mechanism for all types of modules. In other words, modules compiled as DSO files are restricted to using symbols from the Apache httpd core, from the C library (libc) and all other dynamic or static libraries used by the Apache httpd core, or from static library archives (libfoo.a) containing position independent code. The only ways to use other code are to either make sure the httpd core itself already contains a reference to it or to load the code yourself via dlopen().

2.11 HTTP Protocol Compliance

This document describes the mechanism to set a policy for HTTP protocol compliance for a given URL space by the origin servers or applications behind that URL space.

For those who may have received an error message from a rejected policy, and need to know what the policy rejection means and what they might do to fix the error, each policy is described below.

See also

• Filters (p. 110)

Enforcing HTTP Protocol Compliance in Apache 2

Related Modules: mod_policy
Related Directives: PolicyConditional, PolicyLength, PolicyKeepalive, PolicyType, PolicyVary, PolicyValidation, PolicyNocache, PolicyMaxage, PolicyVersion

The HTTP protocol follows the robustness principle as described in RFC1122 [18], which states "Be liberal in what you accept, and conservative in what you send".
As a result of this principle, HTTP clients will compensate for and recover from incorrect or misconfigured responses, or responses that are uncacheable.

As a website is scaled up to face greater and greater traffic loads, suboptimal or misconfigured applications or server configurations can threaten both the stability and scalability of the website, as well as the hosting costs associated with it. A website can also scale up to face greater configuration complexity, and it can be increasingly difficult to detect and keep track of suboptimally configured URL spaces on a given server.

Eventually a point is reached where the principle "conservative in what you send" needs to be enforced by the server administrator.

The mod_policy module provides a set of filters which can be applied to a server, allowing key features of the HTTP protocol to be explicitly tested, and non-compliant responses logged as warnings, or rejected outright as an error. Each filter can be applied separately, allowing the administrator to pick and choose which policies should be enforced depending on the circumstances of their environment.

The filters might be placed in testing and staging environments for the benefit of application and website developers, or may be applied to production servers to protect infrastructure from systems outside the administrator's direct control.

[18] http://tools.ietf.org/html/rfc1122

In the above example, an Apache httpd server has been placed between the application server and the internet at large, and configured to cache responses from the application server. The mod_policy filters have been added to enforce support for cacheable content and conditional requests, ensuring that both mod_cache and public caches on the internet are fully able to cache content created by the restful application server efficiently.
In the above simpler example, a static server serving highly cacheable content has a set of policies applied to ensure that the server configuration conforms to a minimum level of compliance.

Conditional Request Policy

Related Modules: mod_policy
Related Directives: PolicyConditional

This policy will be rejected if the server does not correctly respond to a conditional request with the appropriate status code.

Conditional requests form the mechanism by which an HTTP cache makes stale content fresh again, and particularly for content with short freshness lifetimes, lack of support for conditional requests can add avoidable load to the server.

Most specifically, the existence of any of the following headers in the request makes the request conditional:

If-Match
    If the provided ETag in the If-Match header does not match the ETag of the response, the server should return 412 Precondition Failed. Full details of how to handle an If-Match header can be found in RFC2616 section 14.24 [19].

If-None-Match
    If the provided ETag in the If-None-Match header matches the ETag of the response, the server should return either 304 Not Modified for GET/HEAD requests, or 412 Precondition Failed for other methods. Full details of how to handle an If-None-Match header can be found in RFC2616 section 14.26 [20].

If-Modified-Since
    If the Last-Modified header of the response is not newer than the date provided in the If-Modified-Since header, the server should return 304 Not Modified. Full details of how to handle an If-Modified-Since header can be found in RFC2616 section 14.25 [21].

If-Unmodified-Since
    If the Last-Modified header of the response is newer than the date provided in the If-Unmodified-Since header, the server should return 412 Precondition Failed. Full details of how to handle an If-Unmodified-Since header can be found in RFC2616 section 14.28 [22].
If-Range
    If the provided ETag or date in the If-Range header matches the ETag or Last-Modified of the response, and a valid Range is present, the server should return 206 Partial Content. Full details of how to handle an If-Range header can be found in RFC2616 section 14.27 [23].

If the response is detected to have been successful (a 2xx response), but was conditional and one of the responses above was expected instead, this policy will be rejected. Responses that indicate a redirect or a failure of some kind (3xx, 4xx, 5xx) will be ignored by this policy.

This policy is implemented by the POLICY_CONDITIONAL filter.

[19] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.24
[20] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.26
[21] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.25
[22] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.28
[23] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.27

Content-Length Policy

Related Modules: mod_policy
Related Directives: PolicyLength

This policy will be rejected if the server response does not contain an explicit Content-Length header.

There are a number of ways of determining the length of a response body, described in full in RFC2616 section 4.4 Message Length [24].

When the Content-Length header is present, the size of the body is declared at the start of the response. If this information is missing, an HTTP cache might choose to ignore the response, as it does not know in advance whether the response will fit within the cache's defined limits.

HTTP/1.1 defines the Transfer-Encoding header as an alternative to Content-Length, allowing the end of the response to be indicated to the client without the client having to know the length beforehand. However, when HTTP/1.0 requests are processed, and no Content-Length is specified, the only mechanism available to the server to indicate the end of the response is to drop the connection.
In an environment containing load balancers, this can cause the keepalive mechanism to be bypassed.

If the response is detected to have been successful (a 2xx response), and has a response body (this excludes 204 No Content), and the Content-Length header is missing, this policy will be rejected. Responses that indicate a redirect or a failure of some kind (3xx, 4xx, 5xx) will be ignored by this policy.

Note: Some modules, such as mod_proxy, add their own Content-Length header should the response be small enough for it to have been possible to read the response lacking such a header in one go. This may cause small responses to pass this policy, while larger responses may fail for the same URL.

This policy is implemented by the POLICY_LENGTH filter.

Content-Type Policy

Related Modules: mod_policy
Related Directives: PolicyType

This policy will be rejected if the server response does not contain an explicit and syntactically correct Content-Type header that matches the server defined pattern.

The media type of the body is placed in the Content-Type header, and the format of the header is described in full in RFC2616 section 3.7 Media Types [25].

A syntactically valid content type might look as follows:

Content-Type: text/html; charset=iso-8859-1

Invalid content types might include:

# invalid
Content-Type: foo
# blank
Content-Type:

The server administrator has the option to restrict the policy to one or more specific types, or could specify a general wildcard type such as */*.

This policy is implemented by the POLICY_TYPE filter.

[24] http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4
[25] http://www.w3.org/Protocols/rfc2616/rfc2616-sec3.html#sec3.7

Keepalive Policy

Related Modules: mod_policy
Related Directives: PolicyKeepalive

This policy will be rejected if the server response does not contain an explicit Content-Length header, or a Transfer-Encoding of chunked.
There are a number of ways of determining the length of a response body, described in full in RFC2616 section 4.4 Message Length [26].

When the Content-Length header is present, the size of the body is declared at the start of the response. HTTP/1.1 defines the Transfer-Encoding header as an alternative to Content-Length, allowing the end of the response to be indicated to the client without the client having to know the length beforehand.

In the absence of these two mechanisms, the only way for a server to indicate the end of the response is to drop the connection. In an environment containing load balancers, this can cause the keepalive mechanism to be bypassed.

Most specifically, we follow these rules:

IF we have not marked this connection as errored;
and the client isn't expecting 100-continue;
and the response status does not require a close;
and the response body has a defined length due to the status code being 304 or 204, the request method being HEAD, already having defined Content-Length or Transfer-Encoding: chunked, or the request version being HTTP/1.1 and thus capable of being set as chunked
THEN we support keepalive.

Note: The server may choose to turn off keepalive for various reasons, such as an imminent shutdown, or a Connection: close from the client, or an HTTP/1.0 client request with a response with no Content-Length, but for our purposes we only care that keepalive was possible from the application, not that keepalive actually took place.

It should also be noted that the Apache httpd server includes a filter that adds chunked encoding to responses without an explicit content length. This policy catches those cases where this filter is bypassed or not in effect.

This policy is implemented by the POLICY_KEEPALIVE filter.
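The keepalive rules above translate fairly directly into a predicate. The following is a minimal sketch of those rules only, not the mod_policy implementation; the Exchange record and its field names are hypothetical and exist purely for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Exchange:
    """Hypothetical summary of one request/response pair."""
    method: str
    http_version: Tuple[int, int]          # e.g. (1, 1) for HTTP/1.1
    status: int
    content_length: Optional[int] = None   # None means no Content-Length
    chunked: bool = False                  # Transfer-Encoding: chunked
    connection_errored: bool = False
    expects_100_continue: bool = False
    status_requires_close: bool = False


def body_length_is_defined(x: Exchange) -> bool:
    """True if the end of the body can be signalled without closing."""
    return (
        x.status in (204, 304)
        or x.method == "HEAD"
        or x.content_length is not None
        or x.chunked
        or x.http_version >= (1, 1)        # response could be sent chunked
    )


def keepalive_possible(x: Exchange) -> bool:
    """Apply the IF ... THEN rule from the text, clause by clause."""
    return (
        not x.connection_errored
        and not x.expects_100_continue
        and not x.status_requires_close
        and body_length_is_defined(x)
    )


# An HTTP/1.0 response without Content-Length is the case the policy targets.
print(keepalive_possible(Exchange("GET", (1, 0), 200)))
print(keepalive_possible(Exchange("GET", (1, 0), 200, content_length=10)))
```

Keeping the body-length test in its own function mirrors the structure of the rule: the first three clauses concern the connection, the last one concerns the response body.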
Freshness Lifetime / Maxage Policy

Related Modules: mod_policy
Related Directives: PolicyMaxage

This policy will be rejected if the server response does not have an explicit freshness lifetime at least as long as the server defined limit, or if the freshness lifetime is calculated based on a heuristic.

Full details of how a freshness lifetime is calculated are given in RFC2616 section 13.2 Expiration Model [27].

[26] http://www.w3.org/Protocols/rfc2616/rfc2616-sec4.html#sec4.4
[27] http://www.w3.org/Protocols/rfc2616/rfc2616-sec13.html#sec13.2

During the freshness lifetime, a cache does not need to contact the origin server at all; it can simply pass the cached content as is back to the client.

When the freshness lifetime is reached, the cache should contact the origin server in an effort to check whether the content is still fresh, and if not, replace the content.

When the freshness lifetime is too short, it can result in excessive load on the server. In addition, should an outage occur that is as long as or longer than the freshness lifetime, all cached content will become stale, which could cause a thundering herd of traffic when the server or network returns.

This policy is implemented by the POLICY_MAXAGE filter.

No Cache Policy

Related Modules: mod_policy
Related Directives: PolicyNocache

This policy will be rejected if the server response declares itself uncacheable using either the Cache-Control or Pragma headers.

Full details of how content may be declared uncacheable are given in RFC2616 section 14.9.1 What is Cacheable [28], and within the definition for the Pragma header in RFC2616 section 14.32 Pragma [29].
Most specifically, should any of the following header combinations exist in the response headers, the response will be rejected:

• Cache-Control: no-cache
• Cache-Control: no-store
• Cache-Control: private
• Pragma: no-cache

When unexpected, uncacheable content may produce unacceptable levels of server load, or may incur significant cost. When this policy is enabled, all server defined uncacheable content will be rejected.

This policy is implemented by the POLICY_NOCACHE filter.

Validation Policy

Related Modules: mod_policy
Related Directives: PolicyValidation

This policy will be rejected if the server response does not contain either a syntactically correct ETag or Last-Modified header.

The ETag header is described in full in RFC2616 section 14.19 Etag [30], and the Last-Modified header is described in full in RFC2616 section 14.29 Last-Modified [31].

[28] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9.1
[29] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.32
[30] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.19

In addition to being checked for presence, the headers are checked for syntax. An ETag that is not surrounded with quotes, or is not declared "weak" by prefixing it with a "W/" will cause the policy to be rejected. A Last-Modified that is not parsed as a valid date will cause the policy to be rejected.

This policy is implemented by the POLICY_VALIDATION filter.

Vary Header Policy

Related Modules: mod_policy
Related Directives: PolicyVary

This policy will be rejected if the server response contains a Vary header, and that header in turn contains a header blacklisted by the administrator.

The Vary header is described in full in RFC2616 section 14.44 Vary [32].
Some client provided headers, such as User-Agent, can contain thousands or millions of combinations of values over a period of time, and if the response is declared cacheable, a cache might attempt to cache each of these responses separately, filling up the cache and crowding out other entries in the cache. In this scenario, if so configured, the policy will reject the response.

This policy is implemented by the POLICY_VARY filter.

Protocol Version Policy

Related Modules: mod_policy
Related Directives: PolicyVersion

This policy will be rejected if the client request was made with a version number lower than the version of HTTP specified.

This policy is typically used with restful applications where control over the type of client is desired. This policy can be used alongside the POLICY_KEEPALIVE filter to ensure that HTTP/1.0 clients don't cause keepalive connections to be dropped.

Possible minimum versions that could be specified are:

• HTTP/1.1
• HTTP/1.0
• HTTP/0.9

This policy is implemented by the POLICY_VERSION filter.

[31] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.29
[32] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.44

2.12 Content Negotiation

Apache HTTPD supports content negotiation as described in the HTTP/1.1 specification. It can choose the best representation of a resource based on the browser-supplied preferences for media type, languages, character set and encoding. It also implements a couple of features to give more intelligent handling of requests from browsers that send incomplete negotiation information.

Content negotiation is provided by the mod_negotiation module, which is compiled in by default.

About Content Negotiation

A resource may be available in several different representations. For example, it might be available in different languages or different media types, or a combination.
One way of selecting the most appropriate choice is to give the user an index page, and let them select. However, it is often possible for the server to choose automatically. This works because browsers can send, as part of each request, information about what representations they prefer. For example, a browser could indicate that it would like to see information in French, if possible, else English will do. Browsers indicate their preferences by headers in the request. To request only French representations, the browser would send

Accept-Language: fr

Note that this preference will only be applied when there is a choice of representations and they vary by language.

As an example of a more complex request, this browser has been configured to accept French and English, but prefer French, and to accept various media types, preferring HTML over plain text or other text types, and preferring GIF or JPEG over other media types, but also allowing any other media type as a last resort:

Accept-Language: fr; q=1.0, en; q=0.5
Accept: text/html; q=1.0, text/*; q=0.8, image/gif; q=0.6, image/jpeg; q=0.6, image/*; q=0.5, */*; q=0.1

httpd supports 'server driven' content negotiation, as defined in the HTTP/1.1 specification. It fully supports the Accept, Accept-Language, Accept-Charset and Accept-Encoding request headers. httpd also supports 'transparent' content negotiation, which is an experimental negotiation protocol defined in RFC 2295 and RFC 2296. It does not offer support for 'feature negotiation' as defined in these RFCs.

A resource is a conceptual entity identified by a URI (RFC 2396). An HTTP server like Apache HTTP Server provides access to representations of the resource(s) within its namespace, with each representation in the form of a sequence of bytes with a defined media type, character set, encoding, etc. Each resource may be associated with zero, one, or more than one representation at any given time.
If multiple representations are available, the resource is referred to as negotiable and each of its representations is termed a variant. The ways in which the variants for a negotiable resource vary are called the dimensions of negotiation.

Negotiation in httpd

In order to negotiate a resource, the server needs to be given information about each of the variants. This is done in one of two ways:

• Using a type map (i.e., a *.var file) which names the files containing the variants explicitly, or
• Using a 'MultiViews' search, where the server does an implicit filename pattern match and chooses from among the results.

Using a type-map file

A type map is a document which is associated with the handler named type-map (or, for backwards-compatibility with older httpd configurations, the MIME-type application/x-type-map). Note that to use this feature, you must have a handler set in the configuration that defines a file suffix as type-map; this is best done with

AddHandler type-map .var

in the server configuration file.

Type map files should have the same name as the resource which they are describing, followed by the extension .var. In the examples shown below, the resource is named foo, so the type map file is named foo.var.

This file should have an entry for each available variant; these entries consist of contiguous HTTP-format header lines. Entries for different variants are separated by blank lines. Blank lines are illegal within an entry. It is conventional to begin a map file with an entry for the combined entity as a whole (although this is not required, and if present will be ignored). An example map file is shown below.

URIs in this file are relative to the location of the type map file. Usually, these files will be located in the same directory as the type map file, but this is not required. You may provide absolute or relative URIs for any file located on the same server as the map file.
URI: foo

URI: foo.en.html
Content-type: text/html
Content-language: en

URI: foo.fr.de.html
Content-type: text/html;charset=iso-8859-2
Content-language: fr, de

Note also that a typemap file will take precedence over the filename's extension, even when Multiviews is on. If the variants have different source qualities, that may be indicated by the "qs" parameter to the media type, as in this picture (available as JPEG, GIF, or ASCII-art):

URI: foo

URI: foo.jpeg
Content-type: image/jpeg; qs=0.8

URI: foo.gif
Content-type: image/gif; qs=0.5

URI: foo.txt
Content-type: text/plain; qs=0.01

qs values can vary in the range 0.000 to 1.000. Note that any variant with a qs value of 0.000 will never be chosen. Variants with no 'qs' parameter value are given a qs factor of 1.0. The qs parameter indicates the relative 'quality' of this variant compared to the other available variants, independent of the client's capabilities. For example, a JPEG file is usually of higher source quality than an ASCII file if it is attempting to represent a photograph. However, if the resource being represented is an original ASCII art, then an ASCII representation would have a higher source quality than a JPEG representation. A qs value is therefore specific to a given variant depending on the nature of the resource it represents.

The full list of headers recognized is available in the mod_negotiation typemap (p. 766) documentation.

Multiviews

MultiViews is a per-directory option, meaning it can be set with an Options directive within a <Directory>, <Location> or <Files> section in httpd.conf, or (if AllowOverride is properly set) in .htaccess files. Note that Options All does not set MultiViews; you have to ask for it by name.
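Returning briefly to the type-map format described before the Multiviews heading: its structure (entries separated by blank lines, one header per line, an optional qs parameter defaulting to 1.0) can be sketched in a few lines. This is an illustrative reading of the format only, not the parser used by mod_negotiation; the function names parse_type_map and source_quality are hypothetical.

```python
def parse_type_map(text):
    """Return a list of dicts, one per blank-line-separated entry."""
    entries, current = [], {}
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current entry, if there is one.
            if current:
                entries.append(current)
                current = {}
            continue
        name, _, value = line.partition(":")
        current[name.strip().lower()] = value.strip()
    if current:
        entries.append(current)
    return entries


def source_quality(entry):
    """Return the qs parameter of the entry's Content-type, else 1.0."""
    params = entry.get("content-type", "").split(";")[1:]
    for param in params:
        key, _, value = param.partition("=")
        if key.strip().lower() == "qs":
            return float(value)
    return 1.0


SAMPLE = """\
URI: foo

URI: foo.jpeg
Content-type: image/jpeg; qs=0.8

URI: foo.gif
Content-type: image/gif; qs=0.5

URI: foo.txt
Content-type: text/plain; qs=0.01
"""

# The leading "URI: foo" entry describes the combined entity, not a variant.
variants = [e for e in parse_type_map(SAMPLE) if "content-type" in e]
best = max(variants, key=source_quality)
print(best["uri"])
```

Picking the variant with the highest qs is only the source-quality part of the decision; the server combines it with the client's own preferences during negotiation.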
The effect of MultiViews is as follows: if the server receives a request for /some/dir/foo, if /some/dir has MultiViews enabled, and /some/dir/foo does not exist, then the server reads the directory looking for files named foo.*, and effectively fakes up a type map which names all those files, assigning them the same media types and content-encodings it would have if the client had asked for one of them by name. It then chooses the best match to the client's requirements.

MultiViews may also apply to searches for the file named by the DirectoryIndex directive, if the server is trying to index a directory. If the configuration files specify

DirectoryIndex index

then the server will arbitrate between index.html and index.html3 if both are present. If neither is present, and index.cgi is there, the server will run it.

If one of the files found when reading the directory does not have an extension recognized by mod_mime to designate its Charset, Content-Type, Language, or Encoding, then the result depends on the setting of the MultiViewsMatch directive. This directive determines whether handlers, filters, and other extension types can participate in MultiViews negotiation.

The Negotiation Methods

After httpd has obtained a list of the variants for a given resource, either from a type-map file or from the filenames in the directory, it invokes one of two methods to decide on the 'best' variant to return, if any. It is not necessary to know any of the details of how negotiation actually takes place in order to use httpd's content negotiation features. However, the rest of this document explains the methods used for those interested.

There are two negotiation methods:

1. Server driven negotiation with the httpd algorithm is used in the normal case. The httpd algorithm is explained in more detail below. When this algorithm is used, httpd can sometimes 'fiddle' the quality factor of a particular dimension to achieve a better result.
   The ways httpd can fiddle quality factors are explained in more detail below.

2. Transparent content negotiation is used when the browser specifically requests this through the mechanism defined in RFC 2295. This negotiation method gives the browser full control over deciding on the 'best' variant; the result is therefore dependent on the specific algorithms used by the browser. As part of the transparent negotiation process, the browser can ask httpd to run the 'remote variant selection algorithm' defined in RFC 2296.

Dimensions of Negotiation

Dimension     Notes
Media Type    Browser indicates preferences with the Accept header field. Each item can have an associated quality factor. Variant description can also have a quality factor (the "qs" parameter).
Language      Browser indicates preferences with the Accept-Language header field. Each item can have a quality factor. Variants can be associated with none, one or more than one language.
Encoding      Browser indicates preference with the Accept-Encoding header field. Each item can have a quality factor.
Charset       Browser indicates preference with the Accept-Charset header field. Each item can have a quality factor. Variants can indicate a charset as a parameter of the media type.

httpd Negotiation Algorithm

httpd can use the following algorithm to select the 'best' variant (if any) to return to the browser. This algorithm is not further configurable. It operates as follows:

1. First, for each dimension of the negotiation, check the appropriate Accept* header field and assign a quality to each variant. If the Accept* header for any dimension implies that this variant is not acceptable, eliminate it. If no variants remain, go to step 4.

2. Select the 'best' variant by a process of elimination. Each of the following tests is applied in order. Any variants not selected at each test are eliminated. After each test, if only one variant remains, select it as the best match and proceed to step 3.
If more than one variant remains, move on to the next test.

   (a) Multiply the quality factor from the Accept header with the quality-of-source factor for this variant's media type, and select the variants with the highest value.
   (b) Select the variants with the highest language quality factor.
   (c) Select the variants with the best language match, using either the order of languages in the Accept-Language header (if present), or else the order of languages in the LanguagePriority directive (if present).
   (d) Select the variants with the highest 'level' media parameter (used to give the version of text/html media types).
   (e) Select variants with the best charset media parameters, as given on the Accept-Charset header line. Charset ISO-8859-1 is acceptable unless explicitly excluded. Variants with a text/* media type but not explicitly associated with a particular charset are assumed to be in ISO-8859-1.
   (f) Select those variants which have associated charset media parameters that are not ISO-8859-1. If there are no such variants, select all variants instead.
   (g) Select the variants with the best encoding. If there are variants with an encoding that is acceptable to the user-agent, select only these variants. Otherwise, if there is a mix of encoded and non-encoded variants, select only the unencoded variants. If either all variants are encoded or all variants are not encoded, select all variants.
   (h) Select the variants with the smallest content length.
   (i) Select the first variant of those remaining. This will be either the first listed in the type-map file, or, when variants are read from the directory, the one whose file name comes first when sorted using ASCII code order.

3. The algorithm has now selected one 'best' variant, so return it as the response. The HTTP response header Vary is set to indicate the dimensions of negotiation (browsers and caches can use this information when caching the resource). End.

4. To get here means no variant was selected (because none are acceptable to the browser). Return a 406 status (meaning "No acceptable representation") with a response body consisting of an HTML document listing the available variants. Also set the HTTP Vary header to indicate the dimensions of variance.

Fiddling with Quality Values

httpd sometimes changes the quality values from what would be expected by a strict interpretation of the httpd negotiation algorithm above. This is to get a better result from the algorithm for browsers which do not send full or accurate information. Some of the most popular browsers send Accept header information which would otherwise result in the selection of the wrong variant in many cases. If a browser sends full and correct information, these fiddles will not be applied.

Media Types and Wildcards

The Accept: request header indicates preferences for media types. It can also include 'wildcard' media types, such as "image/*" or "*/*", where the * matches any string. So a request including:

    Accept: image/*, */*

would indicate that any type starting "image/" is acceptable, as is any other type. Some browsers routinely send wildcards in addition to explicit types they can handle. For example:

    Accept: text/html, text/plain, image/gif, image/jpeg, */*

The intention of this is to indicate that the explicitly listed types are preferred, but if a different representation is available, that is ok too. Using explicit quality values, what the browser really wants is something like:

    Accept: text/html, text/plain, image/gif, image/jpeg, */*; q=0.01

The explicit types have no quality factor, so they default to a preference of 1.0 (the highest). The wildcard */* is given a low preference of 0.01, so other types will only be returned if no variant matches an explicitly listed type.
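To make the effect of the explicit q value concrete, here is a minimal Python sketch. It is not httpd's implementation: the helper names parse_accept and quality_for are illustrative assumptions, only the q parameter is read, and matching is simplified to exact, "type/*" and "*/*" ranges.

```python
def parse_accept(header):
    """Parse an Accept header into a list of (media_range, q) pairs.

    Simplified: only the q parameter is read; a missing q defaults to 1.0.
    """
    ranges = []
    for item in header.split(","):
        parts = [p.strip() for p in item.split(";")]
        media_range = parts[0]
        q = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip() == "q":
                q = float(value)
        ranges.append((media_range, q))
    return ranges


def quality_for(media_type, ranges):
    """Return the q value of the most specific range matching media_type."""
    main_type = media_type.split("/")[0]
    best_specificity, best_q = -1, 0.0
    for media_range, q in ranges:
        if media_range == media_type:
            specificity = 2          # exact match, e.g. text/html
        elif media_range == main_type + "/*":
            specificity = 1          # type wildcard, e.g. image/*
        elif media_range == "*/*":
            specificity = 0          # full wildcard
        else:
            continue
        if specificity > best_specificity:
            best_specificity, best_q = specificity, q
    return best_q


ranges = parse_accept("text/html, text/plain, image/gif, image/jpeg, */*; q=0.01")
print(quality_for("text/html", ranges))        # explicitly listed type: 1.0
print(quality_for("application/pdf", ranges))  # matched only by */*: 0.01
```

With the explicit q=0.01 on the wildcard, a PDF variant would only win if no variant of an explicitly listed type were available.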
If the Accept: header contains no q factors at all, httpd sets the q value of "*/*", if present, to 0.01 to emulate the desired behavior. It also sets the q value of wildcards of the format "type/*" to 0.02 (so these are preferred over matches against "*/*"). If any media type on the Accept: header contains a q factor, these special values are not applied, so requests from browsers which send the explicit information to start with work as expected.

Language Negotiation Exceptions

New in httpd 2.0, some exceptions have been added to the negotiation algorithm to allow graceful fallback when language negotiation fails to find a match.

When a client requests a page on your server, but the server cannot find a single page that matches the Accept-Language sent by the browser, the server will return either a "No Acceptable Variant" or "Multiple Choices" response to the client. To avoid these error messages, it is possible to configure httpd to ignore the Accept-Language in these cases and provide a document that does not explicitly match the client's request. The ForceLanguagePriority directive can be used to override one or both of these error messages and substitute the server's judgement in the form of the LanguagePriority directive.

The server will also attempt to match language-subsets when no other match can be found. For example, if a client requests documents with the language en-GB for British English, the server is not normally allowed by the HTTP/1.1 standard to match that against a document that is marked as simply en. (Note that it is almost surely a configuration error to include en-GB and not en in the Accept-Language header, since it is very unlikely that a reader understands British English, but doesn't understand English in general. Unfortunately, many current clients have default configurations that resemble this.)
However, if no other language match is possible and the server is about to return a "No Acceptable Variants" error or fall back to the LanguagePriority, the server will ignore the subset specification and match en-GB against en documents. Implicitly, httpd will add the parent language to the client's acceptable language list with a very low quality value. But note that if the client requests "en-GB; q=0.9, fr; q=0.8", and the server has documents designated "en" and "fr", then the "fr" document will be returned. This is necessary to maintain compliance with the HTTP/1.1 specification and to work effectively with properly configured clients.

In order to support advanced techniques (such as cookies or special URL-paths) to determine the user's preferred language, since httpd 2.0.47 mod_negotiation recognizes the environment variable (p. 92) prefer-language. If it exists and contains an appropriate language tag, mod_negotiation will try to select a matching variant. If there's no such variant, the normal negotiation process applies.

Example

    SetEnvIf Cookie "language=(.+)" prefer-language=$1
    Header append Vary cookie

Extensions to Transparent Content Negotiation

httpd extends the transparent content negotiation protocol (RFC 2295) as follows. A new {encoding ..} element is used in variant lists to label variants which are available with a specific content-encoding only. The implementation of the RVSA/1.0 algorithm (RFC 2296) is extended to recognize encoded variants in the list, and to use them as candidate variants whenever their encodings are acceptable according to the Accept-Encoding request header. The RVSA/1.0 implementation does not round computed quality factors to 5 decimal places before choosing the best variant.
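As a rough illustration of the parent-language fallback described under Language Negotiation Exceptions, the following Python sketch adds each parent language with a very low quality value and then picks the best available document. This is only a sketch under stated assumptions: the function names are hypothetical, the value 0.001 is an assumed stand-in for the "very low quality value", and the rest of the negotiation algorithm (LanguagePriority, other dimensions) is ignored.

```python
PARENT_Q = 0.001  # assumed stand-in for the "very low quality value"


def language_qualities(accept_language):
    """Map each requested language tag (lower-cased) to its q value."""
    qualities = {}
    for item in accept_language.split(","):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        q = 1.0  # a tag without an explicit q defaults to 1.0
        name, _, value = params.partition("=")
        if name.strip() == "q":
            q = float(value)
        qualities[tag] = q
    return qualities


def pick_language(accept_language, available):
    """Pick the available (lower-case) language with the highest q, or None."""
    qualities = language_qualities(accept_language)
    # Implicitly add each parent language (en for en-gb) with a very low q,
    # unless the client already listed it explicitly.
    for tag in list(qualities):
        parent = tag.split("-")[0]
        if parent != tag:
            qualities.setdefault(parent, PARENT_Q)
    candidates = [lang for lang in available if qualities.get(lang, 0.0) > 0.0]
    if not candidates:
        return None
    return max(candidates, key=lambda lang: qualities[lang])


print(pick_language("en-GB; q=0.9, fr; q=0.8", ["en", "fr"]))  # fr
print(pick_language("en-GB", ["en", "de"]))                    # en
```

The first call reproduces the case from the text: because "en" only enters the list with a very low quality, the explicitly requested "fr" document still wins.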
Note on hyperlinks and naming conventions

If you are using language negotiation you can choose between different naming conventions, because files can have more than one extension, and the order of the extensions is normally irrelevant (see the mod_mime (p. 749) documentation for details).

A typical file has a MIME-type extension (e.g., html), maybe an encoding extension (e.g., gz), and of course a language extension (e.g., en) when we have different language variants of this file.

Examples:

• foo.en.html
• foo.html.en
• foo.en.html.gz

Here are some more examples of filenames together with valid and invalid hyperlinks:

Filename          Valid hyperlink              Invalid hyperlink
foo.html.en       foo, foo.html                -
foo.en.html       foo                          foo.html
foo.html.en.gz    foo, foo.html                foo.gz, foo.html.gz
foo.en.html.gz    foo                          foo.html, foo.html.gz, foo.gz
foo.gz.html.en    foo, foo.gz, foo.gz.html     foo.html
foo.html.gz.en    foo, foo.html, foo.html.gz   foo.gz

Looking at the table above, you will notice that it is always possible to use the name without any extensions in a hyperlink (e.g., foo). The advantage is that you can hide the actual type of a document or file and can change it later, e.g., from html to shtml or cgi, without changing any hyperlink references.

If you want to continue to use a MIME-type in your hyperlinks (e.g. foo.html), the language extension (including an encoding extension if there is one) must be on the right hand side of the MIME-type extension (e.g., foo.html.en).

Note on Caching

When a cache stores a representation, it associates it with the request URL. The next time that URL is requested, the cache can use the stored representation. But, if the resource is negotiable at the server, this might result in only the first requested variant being cached, and subsequent cache hits might return the wrong response. To prevent this, httpd normally marks all responses that are returned after content negotiation as non-cacheable by HTTP/1.0 clients.
httpd also supports the HTTP/1.1 protocol features to allow caching of negotiated responses.

For requests which come from an HTTP/1.0 compliant client (either a browser or a cache), the directive CacheNegotiatedDocs can be used to allow caching of responses which were subject to negotiation. This directive can be given in the server config or virtual host, and takes no arguments. It has no effect on requests from HTTP/1.1 clients.

For HTTP/1.1 clients, httpd sends a Vary HTTP response header to indicate the negotiation dimensions for the response. Caches can use this information to determine whether a subsequent request can be served from the local copy. To encourage a cache to use the local copy regardless of the negotiation dimensions, set the force-no-vary environment variable (p. 92).

2.13 Custom Error Responses

Although the Apache HTTP Server provides generic error responses in the event of 4xx or 5xx HTTP status codes, these responses are rather stark, uninformative, and can be intimidating to site users. You may wish to provide custom error responses which are either friendlier, or in some language other than English, or perhaps which are styled more in line with your site layout.

Customized error responses can be defined for any HTTP status code designated as an error condition, that is, any 4xx or 5xx status.

Additionally, a set of values is provided, so that the error document can be customized further based on the values of these variables, using Server Side Includes (p. 243). Or, you can have error conditions handled by a CGI program, or other dynamic handler (PHP, mod_perl, etc.) which makes use of these variables.

Configuration

Custom error documents are configured using the ErrorDocument directive, which may be used in global, virtualhost, or directory context. It may be used in .htaccess files if AllowOverride is set to FileInfo.
    ErrorDocument 500 "Sorry, our script crashed. Oh dear"
    ErrorDocument 500 /cgi-bin/crash-recover
    ErrorDocument 500 http://error.example.com/server_error.html
    ErrorDocument 404 /errors/not_found.html
    ErrorDocument 401 /subscription/how_to_subscribe.html

The syntax of the ErrorDocument directive is:

    ErrorDocument <3-digit-code> <action>

where the action will be treated as:

1. A local URL to redirect to (if the action begins with a "/").
2. An external URL to redirect to (if the action is a valid URL).
3. Text to be displayed (if none of the above). The text must be wrapped in quotes (") if it consists of more than one word.

When redirecting to a local URL, additional environment variables are set so that the response can be further customized. They are not sent to external URLs.

Available Variables

Redirecting to another URL can be useful, but only if some information can be passed which can then be used to explain or log the error condition more clearly.

To achieve this, when the error redirect is sent, additional environment variables will be set, which will be generated from the headers provided to the original request by prepending 'REDIRECT_' onto the original header name. This provides the error document the context of the original request.

For example, you might receive, in addition to more usual environment variables, the following.

    REDIRECT_HTTP_ACCEPT=*/*, image/gif, image/jpeg, image/png
    REDIRECT_HTTP_USER_AGENT=Mozilla/5.0 Fedora/3.5.8-1.fc12 Firefox/3.5.8
    REDIRECT_PATH=.:/bin:/usr/local/bin:/sbin
    REDIRECT_QUERY_STRING=
    REDIRECT_REMOTE_ADDR=121.345.78.123
    REDIRECT_REMOTE_HOST=client.example.com
    REDIRECT_SERVER_NAME=www.example.edu
    REDIRECT_SERVER_PORT=80
    REDIRECT_SERVER_SOFTWARE=Apache/2.2.15
    REDIRECT_URL=/cgi-bin/buggy.pl

REDIRECT_ environment variables are created from the environment variables which existed prior to the redirect.
They are renamed with a REDIRECT_ prefix, i.e., HTTP_USER_AGENT becomes REDIRECT_HTTP_USER_AGENT.

REDIRECT_URL, REDIRECT_STATUS, and REDIRECT_QUERY_STRING are guaranteed to be set, and the other headers will be set only if they existed prior to the error condition.

None of these will be set if the ErrorDocument target is an external redirect (anything starting with a scheme name like http:, even if it refers to the same host as the server).

Customizing Error Responses

If you point your ErrorDocument to some variety of dynamic handler such as a server-side include document, CGI script, or some variety of other handler, you may wish to use the available custom environment variables to customize this response.

If the ErrorDocument specifies a local redirect to a CGI script, the script should include a "Status:" header field in its output in order to ensure the propagation all the way back to the client of the error condition that caused it to be invoked. For instance, a Perl ErrorDocument script might include the following:

    ...
    print "Content-type: text/html\n";
    printf "Status: %s Condition Intercepted\n", $ENV{"REDIRECT_STATUS"};
    ...

If the script is dedicated to handling a particular error condition, such as 404 Not Found, it can use the specific code and error text instead.

Note that if the response contains a Location: header (in order to issue a client-side redirect), the script must emit an appropriate Status: header (such as 302 Found). Otherwise the Location: header may have no effect.

Multi Language Custom Error Documents

Provided with your installation of the Apache HTTP Server is a directory of custom error documents translated into 16 different languages. There's also a configuration file in the conf/extra configuration directory that can be included to enable this feature.
In your server configuration file, you'll see a line such as:

    # Multi-language error messages
    #Include conf/extra/httpd-multilang-errordoc.conf

Uncommenting this Include line will enable this feature, and provide language-negotiated error messages, based on the language preference set in the client browser.

Additionally, these documents contain various of the REDIRECT_ variables, so that additional information can be provided to the end-user about what happened, and what they can do now.

These documents can be customized to whatever degree you wish to provide more useful information to users about your site, and what they can expect to find there.

mod_include and mod_negotiation must be enabled to use this feature.

2.14 Binding to Addresses and Ports

Configuring Apache HTTP Server to listen on specific addresses and ports.

See also

• Virtual Hosts (p. 124)
• DNS Issues (p. 121)

Overview

Related Modules: core, mpm_common
Related Directives: Listen

When httpd starts, it binds to some port and address on the local machine and waits for incoming requests. By default, it listens to all addresses on the machine. However, it may need to be told to listen on specific ports, or only on selected addresses, or a combination of both. This is often combined with the Virtual Host (p. 124) feature, which determines how httpd responds to different IP addresses, hostnames and ports.

The Listen directive tells the server to accept incoming requests only on the specified port(s) or address-and-port combinations. If only a port number is specified in the Listen directive, the server listens to the given port on all interfaces. If an IP address is given as well as a port, the server will listen on the given port and interface. Multiple Listen directives may be used to specify a number of addresses and ports to listen on. The server will respond to requests from any of the listed addresses and ports.
For example, to make the server accept connections on both port 80 and port 8000, on all interfaces, use:

    Listen 80
    Listen 8000

To make the server accept connections on port 80 for one interface, and port 8000 on another, use

    Listen 192.0.2.1:80
    Listen 192.0.2.5:8000

IPv6 addresses must be enclosed in square brackets, as in the following example:

    Listen [2001:db8::a00:20ff:fea7:ccea]:80

Warning: Overlapping Listen directives will result in a fatal error which will prevent the server from starting up.

    (48)Address already in use: make_sock: could not bind to address [::]:80

See the discussion in the wiki (http://wiki.apache.org/httpd/CouldNotBindToAddress) for further troubleshooting tips.

Special IPv6 Considerations

A growing number of platforms implement IPv6, and APR supports IPv6 on most of these platforms, allowing httpd to allocate IPv6 sockets, and to handle requests sent over IPv6.

One complicating factor for httpd administrators is whether or not an IPv6 socket can handle both IPv4 connections and IPv6 connections. Handling IPv4 connections with an IPv6 socket uses IPv4-mapped IPv6 addresses, which are allowed by default on most platforms, but are disallowed by default on FreeBSD, NetBSD, and OpenBSD, in order to match the system-wide policy on those platforms. On systems where it is disallowed by default, a special configure parameter can change this behavior for httpd.

On the other hand, on some platforms, such as Linux and Tru64, the only way to handle both IPv6 and IPv4 is to use mapped addresses. If you want httpd to handle IPv4 and IPv6 connections with a minimum of sockets, which requires using IPv4-mapped IPv6 addresses, specify the --enable-v4-mapped configure option.

--enable-v4-mapped is the default on all platforms except FreeBSD, NetBSD, and OpenBSD, so this is probably how your httpd was built.
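For readers unfamiliar with IPv4-mapped IPv6 addresses, the short Python sketch below shows what such an address looks like and how the embedded IPv4 address can be recovered. It uses only the standard ipaddress module; the function describe_peer and its output wording are illustrative assumptions and say nothing about how httpd or APR handle these addresses internally.

```python
import ipaddress


def describe_peer(address):
    """Classify a peer address as it might appear on a listening socket."""
    ip = ipaddress.ip_address(address)
    # An IPv4 client accepted on an IPv6 socket shows up as ::ffff:a.b.c.d
    if ip.version == 6 and ip.ipv4_mapped is not None:
        return "IPv4 client via mapped address: " + str(ip.ipv4_mapped)
    return "native IPv{} client: {}".format(ip.version, ip)


print(describe_peer("::ffff:192.0.2.1"))
print(describe_peer("2001:db8::a00:20ff:fea7:ccea"))
print(describe_peer("192.0.2.5"))
```

The first address is the mapped form of 192.0.2.1, which is how an IPv4 connection is presented when a single IPv6 socket serves both protocols.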
If you want httpd to handle IPv4 connections only, regardless of what your platform and APR will support, specify an IPv4 address on all Listen directives, as in the following examples:

    Listen 0.0.0.0:80
    Listen 192.0.2.1:80

If your platform supports it and you want httpd to handle IPv4 and IPv6 connections on separate sockets (i.e., to disable IPv4-mapped addresses), specify the --disable-v4-mapped configure option. --disable-v4-mapped is the default on FreeBSD, NetBSD, and OpenBSD.

Specifying the protocol with Listen

The optional second protocol argument of Listen is not required for most configurations. If not specified, https is the default for port 443 and http the default for all other ports. The protocol is used to determine which module should handle a request, and to apply protocol-specific optimizations with the AcceptFilter directive.

You only need to set the protocol if you are running on non-standard ports. For example, running an https site on port 8443:

    Listen 192.170.2.1:8443 https

How This Works With Virtual Hosts

The Listen directive does not implement Virtual Hosts; it only tells the main server what addresses and ports to listen on. If no <VirtualHost> directives are used, the server will behave in the same way for all accepted requests. However, <VirtualHost> can be used to specify a different behavior for one or more of the addresses or ports. To implement a VirtualHost, the server must first be told to listen to the address and port to be used. Then a <VirtualHost> section should be created for the specified address and port to set the behavior of this virtual host. Note that if the <VirtualHost> is set for an address and port that the server is not listening to, it cannot be accessed.

2.15 Multi-Processing Modules (MPMs)

This document describes what a Multi-Processing Module is and how they are used by the Apache HTTP Server.
Introduction

The Apache HTTP Server is designed to be a powerful and flexible web server that can work on a very wide variety of platforms in a range of different environments. Different platforms and different environments often require different features, or may have different ways of implementing the same feature most efficiently. Apache httpd has always accommodated a wide variety of environments through its modular design. This design allows the webmaster to choose which features will be included in the server by selecting which modules to load either at compile-time or at run-time.

Apache HTTP Server 2.0 extends this modular design to the most basic functions of a web server. The server ships with a selection of Multi-Processing Modules (MPMs) which are responsible for binding to network ports on the machine, accepting requests, and dispatching children to handle the requests.

Extending the modular design to this level of the server allows two important benefits:

• Apache httpd can more cleanly and efficiently support a wide variety of operating systems. In particular, the Windows version of the server is now much more efficient, since mpm_winnt can use native networking features in place of the POSIX layer used in Apache httpd 1.3. This benefit also extends to other operating systems that implement specialized MPMs.
• The server can be better customized for the needs of the particular site. For example, sites that need a great deal of scalability can choose to use a threaded MPM like worker or event, while sites requiring stability or compatibility with older software can use prefork.

At the user level, MPMs appear much like other Apache httpd modules. The main difference is that one and only one MPM must be loaded into the server at any time. The list of available MPMs appears on the module index page (p. 1101).

MPM Defaults

The following table lists the default MPMs for various operating systems.
This will be the MPM selected if you do not make another choice at compile-time.

Operating system    Default MPM
Netware             mpm_netware
OS/2                mpmt_os2
Unix                prefork, worker, or event, depending on platform capabilities
Windows             mpm_winnt

Note: Here, 'Unix' is used to mean Unix-like operating systems, such as Linux, BSD, Solaris, Mac OS X, etc.

In the case of Unix, the decision as to which MPM is installed is based on two questions:

1. Does the system support threads?
2. Does the system support thread-safe polling (specifically, the kqueue and epoll functions)?

If the answer to both questions is 'yes', the default MPM is event.

If the answer to #1 is 'yes', but the answer to #2 is 'no', the default will be worker.

If the answer to both questions is 'no', then the default MPM will be prefork.

In practical terms, this means that the default will almost always be event, as all modern operating systems support these two features.

Building an MPM as a static module

MPMs can be built as static modules on all platforms. A single MPM is chosen at build time and linked into the server. The server must be rebuilt in order to change the MPM.

To override the default MPM choice, use the --with-mpm=NAME option of the configure script. NAME is the name of the desired MPM.

Once the server has been compiled, it is possible to determine which MPM was chosen by using ./httpd -l. This command will list every module that is compiled into the server, including the MPM.

Building an MPM as a DSO module

On Unix and similar platforms, MPMs can be built as DSO modules and dynamically loaded into the server in the same manner as other DSO modules. Building MPMs as DSO modules allows the MPM to be changed by updating the LoadModule directive for the MPM instead of by rebuilding the server.

    LoadModule mpm_prefork_module modules/mod_mpm_prefork.so

Attempting to LoadModule more than one MPM will result in a startup failure with the following error.
    AH00534: httpd: Configuration error: More than one MPM loaded.

This feature is enabled using the --enable-mpms-shared option of the configure script. With argument all, all possible MPMs for the platform will be installed. Alternately, a list of MPMs can be specified as the argument.

The default MPM, either selected automatically or specified with the --with-mpm option of the configure script, will be loaded in the generated server configuration file. Edit the LoadModule directive to select a different MPM.

2.16 Environment Variables in Apache

There are two kinds of environment variables that affect the Apache HTTP Server.

First, there are the environment variables controlled by the underlying operating system. These are set before the server starts. They can be used in expansions in configuration files, and can optionally be passed to CGI scripts and SSI using the PassEnv directive.

Second, the Apache HTTP Server provides a mechanism for storing information in named variables that are also called environment variables. This information can be used to control various operations such as logging or access control. The variables are also used as a mechanism to communicate with external programs such as CGI scripts. This document discusses different ways to manipulate and use these variables.

Although these variables are referred to as environment variables, they are not the same as the environment variables controlled by the underlying operating system. Instead, these variables are stored and manipulated in an internal Apache structure. They only become actual operating system environment variables when they are provided to CGI scripts and Server Side Include scripts. If you wish to manipulate the operating system environment under which the server itself runs, you must use the standard environment manipulation mechanisms provided by your operating system shell.
Setting Environment Variables

Related Modules: mod_cache, mod_env, mod_rewrite, mod_setenvif, mod_unique_id
Related Directives: BrowserMatch, BrowserMatchNoCase, PassEnv, RewriteRule, SetEnv, SetEnvIf, SetEnvIfNoCase, UnsetEnv

Basic Environment Manipulation

The most basic way to set an environment variable in Apache is using the unconditional SetEnv directive. Variables may also be passed from the environment of the shell which started the server using the PassEnv directive.

Conditional Per-Request Settings

For additional flexibility, the directives provided by mod_setenvif allow environment variables to be set on a per-request basis, conditional on characteristics of particular requests. For example, a variable could be set only when a specific browser (User-Agent) is making a request, or only when a specific Referer [sic] header is found. Even more flexibility is available through mod_rewrite's RewriteRule, which uses the [E=...] option to set environment variables.

Unique Identifiers

Finally, mod_unique_id sets the environment variable UNIQUE_ID for each request to a value which is guaranteed to be unique across "all" requests under very specific conditions.

Standard CGI Variables

In addition to all environment variables set within the Apache configuration and passed from the shell, CGI scripts and SSI pages are provided with a set of environment variables containing meta-information about the request as required by the CGI specification [33].

Some Caveats

• It is not possible to override or change the standard CGI variables using the environment manipulation directives.
• When suexec is used to launch CGI scripts, the environment will be cleaned down to a set of safe variables before CGI scripts are launched. The list of safe variables is defined at compile-time in suexec.c.
• For portability reasons, the names of environment variables may contain only letters, numbers, and the underscore character. In addition, the first character may not be a number. Characters which do not match this restriction will be replaced by an underscore when passed to CGI scripts and SSI pages.
• A special case are HTTP headers which are passed to CGI scripts and the like via environment variables (see below). They are converted to uppercase and only dashes are replaced with underscores; if the header contains any other (invalid) character, the whole header is silently dropped. See below for a workaround.
• The SetEnv directive runs late during request processing, meaning that directives such as SetEnvIf and RewriteCond will not see the variables set with it.
• When the server looks up a path via an internal subrequest, such as looking for a DirectoryIndex or generating a directory listing with mod_autoindex, per-request environment variables are not inherited in the subrequest. Additionally, SetEnvIf directives are not separately evaluated in the subrequest due to the API phases mod_setenvif takes action in.

Using Environment Variables

Related Modules: mod_authz_host, mod_cgi, mod_ext_filter, mod_headers, mod_include, mod_log_config, mod_rewrite
Related Directives: Require, CustomLog, Deny, ExtFilterDefine, Header, LogFormat, RewriteCond, RewriteRule

CGI Scripts

One of the primary uses of environment variables is to communicate information to CGI scripts. As discussed above, the environment passed to CGI scripts includes standard meta-information about the request in addition to any variables set within the Apache configuration. For more details, see the CGI tutorial (p. 236).
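The naming rules from the caveats above can be sketched in Python as follows. This is an illustrative approximation, not the server's code: the function names are hypothetical, and special cases such as headers that map to variables without the HTTP_ prefix are left out.

```python
import re


def sanitize_env_name(name):
    """Replace characters that are not allowed in an environment variable name."""
    # Only letters, numbers and the underscore may appear in a name.
    cleaned = re.sub(r"[^A-Za-z0-9_]", "_", name)
    # The first character may not be a number.
    if cleaned and cleaned[0].isdigit():
        cleaned = "_" + cleaned[1:]
    return cleaned


def header_to_env_name(header_name):
    """Return the HTTP_* variable name for a request header, or None if dropped."""
    # Only letters, digits and dashes are accepted in the header name; any
    # other character (including an underscore) causes the header to be dropped.
    if not re.fullmatch(r"[A-Za-z0-9-]+", header_name):
        return None
    return "HTTP_" + header_name.upper().replace("-", "_")


print(header_to_env_name("Accept-Encoding"))  # HTTP_ACCEPT_ENCODING
print(header_to_env_name("Accept_Encoding"))  # None (broken header is dropped)
print(sanitize_env_name("my-var.name"))       # my_var_name
print(sanitize_env_name("1st"))               # _st
```

The second call corresponds to the broken Accept_Encoding header discussed in the Examples section of this document: because of the underscore it is dropped rather than translated.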
SSI Pages

Server-parsed (SSI) documents processed by mod_include's INCLUDES filter can print environment variables using the echo element, and can use environment variables in flow control elements to make parts of a page conditional on characteristics of a request. Apache also provides SSI pages with the standard CGI environment variables as discussed above. For more details, see the SSI tutorial (p. 243).

[33] http://www.ietf.org/rfc/rfc3875

Access Control

Access to the server can be controlled based on the value of environment variables using the allow from env= and deny from env= directives. In combination with SetEnvIf, this allows for flexible control of access to the server based on characteristics of the client. For example, you can use these directives to deny access to a particular browser (User-Agent).

Conditional Logging

Environment variables can be logged in the access log using the LogFormat option %e. In addition, the decision on whether or not to log requests can be made based on the status of environment variables using the conditional form of the CustomLog directive. In combination with SetEnvIf, this allows for flexible control of which requests are logged. For example, you can choose not to log requests for filenames ending in gif, or you can choose to only log requests from clients which are outside your subnet.

Conditional Response Headers

The Header directive can use the presence or absence of an environment variable to determine whether or not a certain HTTP header will be placed in the response to the client. This allows, for example, a certain response header to be sent only if a corresponding header is received in the request from the client.

External Filter Activation

External filters configured by mod_ext_filter using the ExtFilterDefine directive can be activated conditional on an environment variable using the disableenv= and enableenv= options.
URL Rewriting

The %{ENV:variable} form of TestString in the RewriteCond allows mod_rewrite's rewrite engine to make decisions conditional on environment variables. Note that the variables accessible in mod_rewrite without the ENV: prefix are not actually environment variables. Rather, they are variables special to mod_rewrite which cannot be accessed from other modules.

Special Purpose Environment Variables

Interoperability problems have led to the introduction of mechanisms to modify the way Apache behaves when talking to particular clients. To make these mechanisms as flexible as possible, they are invoked by defining environment variables, typically with BrowserMatch, though SetEnv and PassEnv could also be used, for example.

downgrade-1.0

This forces the request to be treated as an HTTP/1.0 request even if it was in a later dialect.

force-gzip

If you have the DEFLATE filter activated, this environment variable will ignore the accept-encoding setting of your browser and will send compressed output unconditionally.

force-no-vary

This causes any Vary fields to be removed from the response header before it is sent back to the client. Some clients don't interpret this field correctly; setting this variable can work around this problem. Setting this variable also implies force-response-1.0.

force-response-1.0

This forces an HTTP/1.0 response to clients making an HTTP/1.0 request. It was originally implemented as a result of a problem with AOL's proxies. Some HTTP/1.0 clients may not behave correctly when given an HTTP/1.1 response, and this can be used to interoperate with them.

gzip-only-text/html

When set to a value of "1", this variable disables the DEFLATE output filter provided by mod_deflate for content-types other than text/html. If you'd rather use statically compressed files, mod_negotiation evaluates the variable as well (not only for gzip, but for all encodings that differ from "identity").
no-gzip

When set, the DEFLATE filter of mod_deflate will be turned off and mod_negotiation will refuse to deliver encoded resources.

no-cache

Available in versions 2.2.12 and later.

When set, mod_cache will not save an otherwise cacheable response. This environment variable does not influence whether a response already in the cache will be served for the current request.

nokeepalive

This disables KeepAlive when set.

prefer-language

This influences mod_negotiation’s behaviour. If it contains a language tag (such as en, ja or x-klingon), mod_negotiation tries to deliver a variant with that language. If there’s no such variant, the normal negotiation (p. 78) process applies.

redirect-carefully

This forces the server to be more careful when sending a redirect to the client. This is typically used when a client has a known problem handling redirects. This was originally implemented as a result of a problem with Microsoft’s WebFolders software, which has a problem handling redirects on directory resources via DAV methods.

suppress-error-charset

Available in versions after 2.0.54.

When Apache issues a redirect in response to a client request, the response includes some actual text to be displayed in case the client can’t (or doesn’t) automatically follow the redirection. Apache ordinarily labels this text according to the character set which it uses, which is ISO-8859-1.

However, if the redirection is to a page that uses a different character set, some broken browser versions will try to use the character set from the redirection text rather than the actual page. This can result in Greek, for instance, being incorrectly rendered.

Setting this environment variable causes Apache to omit the character set for the redirection text, and these broken browsers will then correctly use that of the destination page.
Security note

Sending error pages without a specified character set may allow a cross-site-scripting attack for existing browsers (MSIE) which do not follow the HTTP/1.1 specification and attempt to "guess" the character set from the content. Such browsers can be easily fooled into using the UTF-7 character set, and UTF-7 content from input data (such as the request-URI) will not be escaped by the usual escaping mechanisms designed to prevent cross-site-scripting attacks.

force-proxy-request-1.0, proxy-nokeepalive, proxy-sendchunked, proxy-sendcl, proxy-chain-auth, proxy-interim-response, proxy-initial-not-pooled

These variables alter the protocol behavior of mod_proxy. See the mod_proxy and mod_proxy_http documentation for more details.

Examples

Passing broken headers to CGI scripts

Starting with version 2.4, Apache is more strict about how HTTP headers are converted to environment variables in mod_cgi and other modules: previously, any invalid characters in header names were simply translated to underscores. This allowed for some potential cross-site-scripting attacks via header injection (see Unusual Web Bugs [34], slide 19/20).

If you have to support a client which sends broken headers and which can’t be fixed, a simple workaround involving mod_setenvif and mod_headers allows you to still accept these headers:

    #
    # The following works around a client sending a broken Accept_Encoding
    # header.
    #
    SetEnvIfNoCase ^Accept.Encoding$ ^(.*)$ fix_accept_encoding=$1
    RequestHeader set Accept-Encoding %{fix_accept_encoding}e env=fix_accept_encoding

[34] http://events.ccc.de/congress/2007/Fahrplan/events/2212.en.html

Changing protocol behavior with misbehaving clients

Earlier versions recommended that the following lines be included in httpd.conf to deal with known client problems. Since the affected clients are no longer seen in the wild, this configuration is likely no longer necessary.
    #
    # The following directives modify normal HTTP response behavior.
    # The first directive disables keepalive for Netscape 2.x and browsers that
    # spoof it. There are known problems with these browser implementations.
    # The second directive is for Microsoft Internet Explorer 4.0b2
    # which has a broken HTTP/1.1 implementation and does not properly
    # support keepalive when it is used on 301 or 302 (redirect) responses.
    #
    BrowserMatch "Mozilla/2" nokeepalive
    BrowserMatch "MSIE 4\.0b2;" nokeepalive downgrade-1.0 force-response-1.0

    #
    # The following directive disables HTTP/1.1 responses to browsers which
    # are in violation of the HTTP/1.0 spec by not being able to understand a
    # basic 1.1 response.
    #
    BrowserMatch "RealPlayer 4\.0" force-response-1.0
    BrowserMatch "Java/1\.0" force-response-1.0
    BrowserMatch "JDK/1\.0" force-response-1.0

Do not log requests for images in the access log

This example keeps requests for images from appearing in the access log. It can be easily modified to prevent logging of particular directories, or to prevent logging of requests coming from particular hosts.

    SetEnvIf Request_URI \.gif image-request
    SetEnvIf Request_URI \.jpg image-request
    SetEnvIf Request_URI \.png image-request
    CustomLog "logs/access_log" common env=!image-request

Prevent "Image Theft"

This example shows how to keep people not on your server from using images on your server as inline-images on their pages. This is not a recommended configuration, but it can work in limited circumstances. We assume that all your images are in a directory called /web/images.

    SetEnvIf Referer "^http://www\.example\.com/" local_referal
    # Allow browsers that do not send Referer info
    SetEnvIf Referer "^$" local_referal
    <Directory "/web/images">
        Require env local_referal
    </Directory>

For more information about this technique, see the "Keeping Your Images from Adorning Other Sites [35]" tutorial on ServerWatch.

[35] http://www.serverwatch.com/tutorials/article.php/1132731
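Two of the uses described earlier in this section, access control and conditional response headers, have no example of their own here. The fragment below is a minimal sketch of both, assuming the 2.4-style Require syntax already used in the "Image Theft" example. The variable names bad_bot and have_request_id, the User-Agent pattern BadBot, the header names and the /web/htdocs path are all hypothetical.

```apache
# Access control: mark requests from a (hypothetical) unwanted User-Agent
# and refuse them, while still admitting everyone else.
BrowserMatchNoCase "BadBot" bad_bot

<Directory "/web/htdocs">
    <RequireAll>
        Require all granted
        Require not env bad_bot
    </RequireAll>
</Directory>

# Conditional response header: send a response header only if the client
# supplied a (hypothetical) X-Request-Id header in its request.
SetEnvIf X-Request-Id "(.+)" have_request_id
Header set X-Request-Id-Seen "yes" env=have_request_id
```

The env= condition on Header can be negated with env=!have_request_id, in the same way as the CustomLog example above negates image-request.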
2.17 Expressions in Apache HTTP Server

Historically, there are several syntax variants for expressions used to express a condition in the different modules of the Apache HTTP Server. There is some ongoing effort to only use a single variant, called ap_expr, for all configuration directives. This document describes the ap_expr expression parser.

The ap_expr expression is intended to replace most other expression variants in HTTPD. For example, the deprecated SSLRequire expressions can be replaced by Require expr (p. 519).

See also
• <If>
• ErrorDocument
• Alias
• ScriptAlias
• Redirect
• AuthBasicFake
• AuthFormLoginRequiredLocation
• AuthFormLoginSuccessLocation
• AuthFormLogoutLocation
• AuthName
• AuthType
• RewriteCond
• SetEnvIfExpr
• Header
• RequestHeader
• FilterProvider
• Require expr (p. 519)
• Require ldap-user (p. 501)
• Require ldap-group (p. 501)
• Require ldap-dn (p. 501)
• Require ldap-attribute (p. 501)
• Require ldap-filter (p. 501)
• Require ldap-search (p. 501)
• Require dbd-group (p. 527)
• Require dbm-group (p. 532)
• Require group (p. 534)
• Require host (p. 536)
• SSLRequire
• LogMessage
• mod_include

Grammar in Backus-Naur Form notation

Backus-Naur Form [36] (BNF) is a notation technique for context-free grammars, often used to describe the syntax of languages used in computing. In most cases, expressions are used to express boolean values. For these, the starting point in the BNF is expr. However, a few directives like LogMessage accept expressions that evaluate to a string value. For those, the starting point in the BNF is string.

    expr         ::= "true" | "false"
                   | "!" expr
                   | expr "&&" expr
                   | expr "||" expr
                   | "(" expr ")"
                   | comp

    comp         ::= stringcomp
                   | integercomp
                   | unaryop word
                   | word binaryop word
                   | word "in" "{" wordlist "}"
                   | word "in" listfunction
                   | word "=~" regex
                   | word "!~" regex

    stringcomp   ::= word "==" word
                   | word "!=" word
                   | word "<"  word
                   | word "<=" word
                   | word ">"  word
                   | word ">=" word

    integercomp  ::= word "-eq" word | word "eq" word
                   | word "-ne" word | word "ne" word
                   | word "-lt" word | word "lt" word
                   | word "-le" word | word "le" word
                   | word "-gt" word | word "gt" word
                   | word "-ge" word | word "ge" word

    wordlist     ::= word
                   | wordlist "," word

    word         ::= word "." word
                   | digit
                   | "'" string "'"
                   | """ string """
                   | variable
                   | rebackref
                   | function

    string       ::= stringpart
                   | string stringpart

    stringpart   ::= cstring
                   | variable
                   | rebackref

    cstring      ::= ...
    digit        ::= [0-9]+

    variable     ::= "%{" varname "}"
                   | "%{" funcname ":" funcargs "}"

    rebackref    ::= "$" [0-9]

    function     ::= funcname "(" wordlist ")"

    listfunction ::= listfuncname "(" word ")"

[36] http://en.wikipedia.org/wiki/Backus%E2%80%93Naur_Form

Variables

The expression parser provides a number of variables of the form %{HTTP_HOST}. Note that the value of a variable may depend on the phase of the request processing in which it is evaluated. For example, an expression used in an <If> directive is evaluated before authentication is done. Therefore, %{REMOTE_USER} will not be set in this case.

The following variables provide the values of the named HTTP request headers. The values of other headers can be obtained with the req function. Using these variables may cause the header name to be added to the Vary header of the HTTP response, except where otherwise noted for the directive accepting the expression. The req_novary function may be used to circumvent this behavior.
    Name
    HTTP_ACCEPT
    HTTP_COOKIE
    HTTP_FORWARDED
    HTTP_HOST
    HTTP_PROXY_CONNECTION
    HTTP_REFERER
    HTTP_USER_AGENT

Other request related variables

    Name                            Description
    REQUEST_METHOD                  The HTTP method of the incoming request (e.g. GET)
    REQUEST_SCHEME                  The scheme part of the request’s URI
    REQUEST_URI                     The path part of the request’s URI
    DOCUMENT_URI                    Same as REQUEST_URI
    REQUEST_FILENAME                The full local filesystem path to the file or script matching the request, if this has already been determined by the server at the time REQUEST_FILENAME is referenced. Otherwise, such as when used in virtual host context, the same value as REQUEST_URI
    SCRIPT_FILENAME                 Same as REQUEST_FILENAME
    LAST_MODIFIED                   The date and time of last modification of the file in the format 20101231235959, if this has already been determined by the server at the time LAST_MODIFIED is referenced.
    SCRIPT_USER                     The user name of the owner of the script.
    SCRIPT_GROUP                    The group name of the group of the script.
    PATH_INFO                       The trailing path name information, see AcceptPathInfo
    QUERY_STRING                    The query string of the current request
    IS_SUBREQ                       "true" if the current request is a subrequest, "false" otherwise
    THE_REQUEST                     The complete request line (e.g., "GET /index.html HTTP/1.1")
    REMOTE_ADDR                     The IP address of the remote host
    REMOTE_HOST                     The host name of the remote host
    REMOTE_USER                     The name of the authenticated user, if any (not available during <If>)
    REMOTE_IDENT                    The user name set by mod_ident
    SERVER_NAME                     The ServerName of the current vhost
    SERVER_PORT                     The server port of the current vhost, see ServerName
    SERVER_ADMIN                    The ServerAdmin of the current vhost
    SERVER_PROTOCOL                 The protocol used by the request (e.g. HTTP/1.1). In some types of internal subrequests, this variable has the value INCLUDED.
    SERVER_PROTOCOL_VERSION         A number that encodes the HTTP version of the request: 1000 * major + minor. For example, 1001 corresponds to HTTP/1.1 and 9 corresponds to HTTP/0.9
    SERVER_PROTOCOL_VERSION_MAJOR   The major version part of the HTTP version of the request, e.g. 1 for HTTP/1.0
    SERVER_PROTOCOL_VERSION_MINOR   The minor version part of the HTTP version of the request, e.g. 0 for HTTP/1.0
    DOCUMENT_ROOT                   The DocumentRoot of the current vhost
    AUTH_TYPE                       The configured AuthType (e.g. "basic")
    CONTENT_TYPE                    The content type of the response (not available during <If>)
    HANDLER                         The name of the handler (p. 108) creating the response
    HTTP2                           "on" if the request uses http/2, "off" otherwise
    HTTPS                           "on" if the request uses https, "off" otherwise
    IPV6                            "on" if the connection uses IPv6, "off" otherwise
    REQUEST_STATUS                  The HTTP error status of the request (not available during <If>)
    REQUEST_LOG_ID                  The error log id of the request (see ErrorLogFormat)
    CONN_LOG_ID                     The error log id of the connection (see ErrorLogFormat)
    CONN_REMOTE_ADDR                The peer IP address of the connection (see the mod_remoteip module)
    CONTEXT_PREFIX
    CONTEXT_DOCUMENT_ROOT

Misc variables

    Name                            Description
    TIME_YEAR                       The current year (e.g. 2010)
    TIME_MON                        The current month (1, ..., 12)
    TIME_DAY                        The current day of the month
    TIME_HOUR                       The hour part of the current time (0, ..., 23)
    TIME_MIN                        The minute part of the current time
    TIME_SEC                        The second part of the current time
    TIME_WDAY                       The day of the week (starting with 0 for Sunday)
    TIME                            The date and time in the format 20101231235959
    SERVER_SOFTWARE                 The server version string
    API_VERSION                     The date of the API version (module magic number)

Some modules register additional variables, see e.g. mod_ssl.

Binary operators

With the exception of some built-in comparison operators, binary operators have the form "-[a-zA-Z][a-zA-Z0-9_]+", i.e. a minus and at least two characters. The name is not case sensitive. Modules may register additional binary operators.
Comparison operators

    Name        Alternative Description
    ==          =           String equality
    !=                      String inequality
    <                       String less than
    <=                      String less than or equal
    >                       String greater than
    >=                      String greater than or equal
    =~                      String matches the regular expression
    !~                      String does not match the regular expression
    -eq         eq          Integer equality
    -ne         ne          Integer inequality
    -lt         lt          Integer less than
    -le         le          Integer less than or equal
    -gt         gt          Integer greater than
    -ge         ge          Integer greater than or equal

Other binary operators

    Name        Description
    -ipmatch    IP address matches address/netmask
    -strmatch   left string matches pattern given by right string (containing wildcards *, ?, [])
    -strcmatch  same as -strmatch, but case insensitive
    -fnmatch    same as -strmatch, but slashes are not matched by wildcards

Unary operators

Unary operators take one argument and have the form "-[a-zA-Z]", i.e. a minus and one character. The name is case sensitive. Modules may register additional unary operators.

    Name  Restricted  Description
    -d    yes         The argument is treated as a filename. True if the file exists and is a directory
    -e    yes         The argument is treated as a filename. True if the file (or dir or special) exists
    -f    yes         The argument is treated as a filename. True if the file exists and is a regular file
    -s    yes         The argument is treated as a filename. True if the file exists and is not empty
    -L    yes         The argument is treated as a filename. True if the file exists and is a symlink
    -h    yes         The argument is treated as a filename. True if the file exists and is a symlink (same as -L)
    -F                True if string is a valid file, accessible via all the server’s currently-configured access controls for that path. This uses an internal subrequest to do the check, so use it with care: it can impact your server’s performance!
    -U                True if string is a valid URL, accessible via all the server’s currently-configured access controls for that path. This uses an internal subrequest to do the check, so use it with care: it can impact your server’s performance!
    -A                Alias for -U
    -n                True if string is not empty
    -z                True if string is empty
    -T                False if string is empty, "0", "off", "false", or "no" (case insensitive). True otherwise.
    -R                Same as "%{REMOTE_ADDR} -ipmatch ...", but more efficient

The operators marked as "restricted" are not available in some modules like mod_include.

Functions

Normal string-valued functions take one string as argument and return a string. Function names are not case sensitive. Modules may register additional functions.

    Name        Restricted  Description
    req, http               Get HTTP request header; header names may be added to the Vary header, see below
    req_novary              Same as req, but header names will not be added to the Vary header
    resp                    Get HTTP response header (most response headers will not yet be set during <If>)
    reqenv                  Lookup request environment variable (as a shortcut, v can be used too to access variables)
    osenv                   Lookup operating system environment variable
    note                    Lookup request note
    env                     Return first match of note, reqenv, osenv
    tolower                 Convert string to lower case
    toupper                 Convert string to upper case
    escape                  Escape special characters in %hex encoding
    unescape                Unescape %hex encoded string, leaving encoded slashes alone; return empty string if %00 is found
    base64                  Encode the string using base64 encoding
    unbase64                Decode base64 encoded string, return truncated string if 0x00 is found
    md5                     Hash the string using MD5, then encode the hash with hexadecimal encoding
    sha1                    Hash the string using SHA1, then encode the hash with hexadecimal encoding
    file        yes         Read contents from a file (including line endings, when present)
    filemod     yes         Return last modification time of a file (or 0 if file does not exist or is not regular file)
    filesize    yes         Return size of a file (or 0 if file does not exist or is not regular file)
    ldap                    Escape characters as required by LDAP distinguished name escaping (RFC4514) and LDAP filter escaping (RFC4515).
    replace                 replace(string, "from", "to") replaces all occurrences of "from" in the string with "to".

The functions marked as "restricted" are not available in some modules like mod_include.

When the functions req or http are used, the header name will automatically be added to the Vary header of the HTTP response, except where otherwise noted for the directive accepting the expression. The req_novary function can be used to prevent names from being added to the Vary header.

In addition to string-valued functions, there are also list-valued functions which take one string as argument and return a wordlist, i.e. a list of strings. The wordlist can be used with the special -in operator. Function names are not case sensitive. Modules may register additional functions.

There are no built-in list-valued functions. mod_ssl provides PeerExtList. See the description of SSLRequire for details (but PeerExtList is also usable outside of SSLRequire).

Example expressions

The following examples show how expressions might be used to evaluate requests:

    # Compare the host name to example.com and redirect to www.example.com if it matches
    <If "%{HTTP_HOST} == 'example.com'">
        Redirect permanent "/" "http://www.example.com/"
    </If>

    # Force text/plain if requesting a file with the query string contains 'forcetext'
    <If "%{QUERY_STRING} =~ /forcetext/">
        ForceType text/plain
    </If>

    # Only allow access to this content during business hours
    <Directory "/foo/bar/business">
        Require expr %{TIME_HOUR} -gt 9 && %{TIME_HOUR} -lt 17
    </Directory>

    # Check a HTTP header for a list of values
    <If "%{HTTP:X-example-header} in { 'foo', 'bar', 'baz' }">
        Header set matched true
    </If>

    # Check an environment variable for a regular expression, negated.
    <If "! reqenv('REDIRECT_FOO') =~ /bar/">
        Header set matched true
    </If>

    # Check result of URI mapping by running in Directory context with -f
    <Directory "/var/www">
        AddEncoding x-gzip gz
        <If "-f '%{REQUEST_FILENAME}.unzipme' && ! %{HTTP:Accept-Encoding} =~ /gzip/">
            SetOutputFilter INFLATE
        </If>
    </Directory>

    # Check against the client IP
    <If "-R '192.168.1.0/24'">
        Header set matched true
    </If>

    # Function examples in boolean context
    <If "md5('foo') == 'acbd18db4cc2f85cedef654fccc4a4d8'">
        Header set checksum-matched true
    </If>
    <If "md5('foo') == replace('md5:XXXd18db4cc2f85cedef654fccc4a4d8', 'md5:XXX', 'acb')">
        Header set checksum-matched-2 true
    </If>

    # Function example in string context
    Header set foo-checksum "expr=%{md5:foo}"

    # This delays the evaluation of the condition clause compared to <If>
    Header always set CustomHeader my-value "expr=%{REQUEST_URI} =~ m#^/special_path.php$#"

Other

    Name        Alternative Description
    -in         in          string contained in wordlist
    /regexp/    m#regexp#   Regular expression (the second form allows different delimiters than /)
    /regexp/i   m#regexp#i  Case insensitive regular expression
    $0 ... $9               Regular expression backreferences

Regular expression backreferences

The strings $0 ... $9 allow referencing the capture groups from a previously executed, successfully matching regular expression. They can normally only be used in the same expression as the matching regex, but some modules allow special uses.

Comparison with SSLRequire

The ap_expr syntax is mostly a superset of the syntax of the deprecated SSLRequire directive. The differences are described in SSLRequire’s documentation.

Version History

The req_novary function is available for versions 2.4.4 and later.

The SERVER_PROTOCOL_VERSION, SERVER_PROTOCOL_VERSION_MAJOR and SERVER_PROTOCOL_VERSION_MINOR variables are available for versions 2.5.0 and later.

2.18 Apache’s Handler Use

This document describes the use of Apache’s Handlers.

What is a Handler

Related Modules: mod_actions, mod_asis, mod_cgi, mod_imagemap, mod_info, mod_mime, mod_negotiation, mod_status

Related Directives: Action, AddHandler, RemoveHandler, SetHandler

A "handler" is an internal Apache representation of the action to be performed when a file is called.
Generally, files have implicit handlers, based on the file type. Normally, all files are simply served by the server, but certain file types are "handled" separately.

Handlers may also be configured explicitly, based on either filename extensions or on location, without relation to file type. This is advantageous both because it is a more elegant solution, and because it also allows for both a type and a handler to be associated with a file. (See also Files with Multiple Extensions (p. 749).)

Handlers can either be built into the server or included in a module, or they can be added with the Action directive. The built-in handlers in the standard distribution are as follows:

• default-handler: Send the file using the default_handler(), which is the handler used by default to handle static content. (core)
• send-as-is: Send file with HTTP headers as is. (mod_asis)
• cgi-script: Treat the file as a CGI script. (mod_cgi)
• imap-file: Parse as an imagemap rule file. (mod_imagemap)
• server-info: Get the server’s configuration information. (mod_info)
• server-status: Get the server’s status report. (mod_status)
• type-map: Parse as a type map file for content negotiation. (mod_negotiation)

Examples

Modifying static content using a CGI script

The following directives will cause requests for files with the html extension to trigger the launch of the footer.pl CGI script.

    Action add-footer /cgi-bin/footer.pl
    AddHandler add-footer .html

Then the CGI script is responsible for sending the originally requested document (pointed to by the PATH_TRANSLATED environment variable) and making whatever modifications or additions are desired.

Files with HTTP headers

The following directives will enable the send-as-is handler, which is used for files which contain their own HTTP headers. All files in the /web/htdocs/asis/ directory will be processed by the send-as-is handler, regardless of their filename extensions.
    <Directory "/web/htdocs/asis">
        SetHandler send-as-is
    </Directory>

Programmer’s Note

In order to implement the handler features, an addition has been made to the Apache API (p. 1019) that you may wish to make use of. Specifically, a new record has been added to the request_rec structure:

    char *handler

If you wish to have your module engage a handler, you need only to set r->handler to the name of the handler at any time prior to the invoke_handler stage of the request. Handlers are implemented as they were before, albeit using the handler name instead of a content type. While it is not necessary, the naming convention for handlers is to use a dash-separated word, with no slashes, so as to not invade the media type name-space.

2.19 Filters

This document describes the use of filters in Apache.

Filtering in Apache 2

Related Modules: mod_filter, mod_deflate, mod_ext_filter, mod_include, mod_charset_lite, mod_reflector, mod_buffer, mod_data, mod_ratelimit, mod_reqtimeout, mod_request, mod_sed, mod_substitute, mod_xml2enc, mod_proxy_html, mod_policy

Related Directives: FilterChain, FilterDeclare, FilterProtocol, FilterProvider, AddInputFilter, AddOutputFilter, RemoveInputFilter, RemoveOutputFilter, ReflectorHeader, ExtFilterDefine, ExtFilterOptions, SetInputFilter, SetOutputFilter

The Filter Chain is available in Apache 2.0 and higher, and enables applications to process incoming and outgoing data in a highly flexible and configurable manner, regardless of where the data comes from. We can pre-process incoming data, and post-process outgoing data, at will. This is basically independent of the traditional request processing phases.

Some examples of filtering in the standard Apache distribution are:

• mod_include, implements server-side includes.
• mod_ssl, implements SSL encryption (https).
• mod_deflate, implements compression/decompression on the fly.
• mod_charset_lite, transcodes between different character sets.
• mod_ext_filter, runs an external program as a filter.

Apache also uses a number of filters internally to perform functions like chunking and byte-range handling.

A wider range of applications is implemented by third-party filter modules available from modules.apache.org [37] and elsewhere. A few of these are:

• HTML and XML processing and rewriting
• XSLT transforms and XIncludes
• XML Namespace support
• File Upload handling and decoding of HTML Forms
• Image processing
• Protection of vulnerable applications such as PHP scripts
• Text search-and-replace editing

[37] http://modules.apache.org/

Smart Filtering

mod_filter, included in Apache 2.1 and later, enables the filter chain to be configured dynamically at run time. So, for example, you can set up a proxy to rewrite HTML with an HTML filter and JPEG images with a completely separate filter, despite the proxy having no prior information about what the origin server will send. This works by using a filter harness that dispatches to different providers according to the actual contents at runtime. Any filter may be either inserted directly in the chain and run unconditionally, or used as a provider and inserted dynamically. For example,

• an HTML processing filter will only run if the content is text/html or application/xhtml+xml
• a compression filter will only run if the input is a compressible type and not already compressed
• a charset conversion filter will be inserted if a text document is not already in the desired charset

Exposing Filters as an HTTP Service

Filters can be used to process content originating from the client in addition to processing content originating on the server using the mod_reflector module.

mod_reflector accepts POST requests from clients, and reflects the content request body received within the POST request back in the response, passing through the output filter stack on the way back to the client.
This technique can be used as an alternative to a web service running within an application server stack, where an output filter provides the transformation required on the request body. For example, the mod_deflate module might be used to provide a general compression service, or an image transformation filter might be turned into an image transformation service.

Using Filters

There are two ways to use filtering: Simple and Dynamic. In general, you should use one or the other; mixing them can have unexpected consequences (although simple Input filtering can be mixed freely with either simple or dynamic Output filtering).

The Simple Way is the only way to configure input filters, and is sufficient for output filters where you need a static filter chain. Relevant directives are SetInputFilter, SetOutputFilter, AddInputFilter, AddOutputFilter, RemoveInputFilter, and RemoveOutputFilter.

The Dynamic Way enables both static and flexible, dynamic configuration of output filters, as discussed in the mod_filter page. Relevant directives are FilterChain, FilterDeclare, and FilterProvider.

One further directive, AddOutputFilterByType, is still supported but deprecated. Use dynamic configuration instead.

2.20 Shared Object Cache in Apache HTTP Server

The Shared Object Cache provides a means to share simple data across all of a server’s workers, regardless of thread and process models (p. 90). It is used where the advantages of sharing data across processes outweigh the performance overhead of inter-process communication.

Shared Object Cache Providers

The shared object cache as such is an abstraction. Four different modules implement it. To use the cache, one or more of these modules must be present, and configured.

The only configuration required is to select which cache provider to use.
This is the responsibility of modules using the cache, and they enable selection using directives such as CacheSocache, AuthnCacheSOCache, SSLSessionCache, and SSLStaplingCache.

Currently available providers are:

"dbm" (mod_socache_dbm)
    This makes use of a DBM hash file. The choice of underlying DBM used may be configurable if the installed APR version supports multiple DBM implementations.

"dc" (mod_socache_dc)
    This makes use of the distcache [38] distributed session caching libraries.

"memcache" (mod_socache_memcache)
    This makes use of the memcached [39] high-performance, distributed memory object caching system.

"shmcb" (mod_socache_shmcb)
    This makes use of a high-performance cyclic buffer inside a shared memory segment.

The API provides the following functions:

const char *create(ap_socache_instance_t **instance, const char *arg, apr_pool_t *tmp, apr_pool_t *p);
    Create a session cache based on the given configuration string. The instance pointer returned in the instance parameter will be passed as the first argument to subsequent invocations.

apr_status_t init(ap_socache_instance_t *instance, const char *cname, const struct ap_socache_hints *hints, server_rec *s, apr_poo
    Initialize the cache. The cname must be of maximum length 16 characters, and uniquely identifies the consumer of the cache within the server; using the module name is recommended, e.g. "mod_ssl-sess". This string may be used within a filesystem path so use of only alphanumeric [a-z0-9_-] characters is recommended. If hints is non-NULL, it gives a set of hints for the provider. Return APR error code.

void destroy(ap_socache_instance_t *instance, server_rec *s)
    Destroy a given cache instance object.

apr_status_t store(ap_socache_instance_t *instance, server_rec *s, const unsigned char *id, unsigned int idlen, apr_time_t expiry,
    Store an object in a cache instance.
apr_status_t retrieve(ap_socache_instance_t *instance, server_rec *s, const unsigned char *id, unsigned int idlen, unsigned char *
    Retrieve a cached object.

apr_status_t remove(ap_socache_instance_t *instance, server_rec *s, const unsigned char *id, unsigned int idlen, apr_pool_t *pool
    Remove an object from the cache.

void status(ap_socache_instance_t *instance, request_rec *r, int flags)
    Dump the status of a cache instance for mod_status.

apr_status_t iterate(ap_socache_instance_t *instance, server_rec *s, void *userctx, ap_socache_iterator_t *iterator, apr_pool_t *poo
    Dump all cached objects through an iterator callback.

[38] http://distcache.sourceforge.net/
[39] http://memcached.org/

2.21 suEXEC Support

The suEXEC feature provides users of the Apache HTTP Server the ability to run CGI and SSI programs under user IDs different from the user ID of the calling web server. Normally, when a CGI or SSI program executes, it runs as the same user who is running the web server.

Used properly, this feature can reduce considerably the security risks involved with allowing users to develop and run private CGI or SSI programs. However, if suEXEC is improperly configured, it can cause any number of problems and possibly create new holes in your computer’s security. If you aren’t familiar with managing setuid root programs and the security issues they present, we highly recommend that you not consider using suEXEC.

Before we begin

Before jumping head-first into this document, you should be aware that certain assumptions are made about you and the environment in which you will be using suEXEC.

First, it is assumed that you are using a UNIX derivative operating system that is capable of setuid and setgid operations. All command examples are given in this regard. Other platforms, if they are capable of supporting suEXEC, may differ in their configuration.
Second, it is assumed you are familiar with some basic concepts of your computer’s security and its administration. This involves an understanding of setuid/setgid operations and the various effects they may have on your system and its level of security.

Third, it is assumed that you are using an unmodified version of suEXEC code. All code for suEXEC has been carefully scrutinized and tested by the developers as well as numerous beta testers. Every precaution has been taken to ensure a simple yet solidly safe base of code. Altering this code can cause unexpected problems and new security risks. It is highly recommended you not alter the suEXEC code unless you are well versed in the particulars of security programming and are willing to share your work with the Apache HTTP Server development team for consideration.

Fourth, and last, it has been the decision of the Apache HTTP Server development team to NOT make suEXEC part of the default installation of Apache httpd. To this end, suEXEC configuration requires of the administrator careful attention to details. After due consideration has been given to the various settings for suEXEC, the administrator may install suEXEC through normal installation methods. The values for these settings need to be carefully determined and specified by the administrator to properly maintain system security during the use of suEXEC functionality. It is through this detailed process that we hope to limit suEXEC installation only to those who are careful and determined enough to use it.

Still with us? Yes? Good. Let’s move on!

suEXEC Security Model

Before we begin configuring and installing suEXEC, we will first discuss the security model you are about to implement. By doing so, you may better understand what exactly is going on inside suEXEC and what precautions are taken to ensure your system’s security.

suEXEC is based on a setuid "wrapper" program that is called by the main Apache HTTP Server.
This wrapper is called when an HTTP request is made for a CGI or SSI program that the administrator has designated to run as a userid other than that of the main server. When such a request is made, Apache httpd provides the suEXEC wrapper with the program’s name and the user and group IDs under which the program is to execute.

The wrapper then employs the following process to determine success or failure. If any one of these conditions fails, the program logs the failure and exits with an error; otherwise it continues:

1. Is the user executing this wrapper a valid user of this system? This is to ensure that the user executing the wrapper is truly a user of the system.

2. Was the wrapper called with the proper number of arguments? The wrapper will only execute if it is given the proper number of arguments. The proper argument format is known to the Apache HTTP Server. If the wrapper is not receiving the proper number of arguments, it is either being hacked, or there is something wrong with the suEXEC portion of your Apache httpd binary.

3. Is this valid user allowed to run the wrapper? Is this user the user allowed to run this wrapper? Only one user (the Apache user) is allowed to execute this program.

4. Does the target CGI or SSI program have an unsafe hierarchical reference? Does the target CGI or SSI program’s path contain a leading ’/’ or have a ’..’ backreference? These are not allowed; the target CGI/SSI program must reside within suEXEC’s document root (see --with-suexec-docroot=DIR below).

5. Is the target user name valid? Does the target user exist?

6. Is the target group name valid? Does the target group exist?

7. Is the target user NOT superuser? suEXEC does not allow root to execute CGI/SSI programs.

8. Is the target userid ABOVE the minimum ID number? The minimum user ID number is specified during configuration.
This allows you to set the lowest possible userid that will be allowed to execute CGI/SSI programs. This is useful to block out "system" accounts.

9. Is the target group NOT the superuser group? Presently, suEXEC does not allow the root group to execute CGI/SSI programs.

10. Is the target groupid ABOVE the minimum ID number? The minimum group ID number is specified during configuration. This allows you to set the lowest possible groupid that will be allowed to execute CGI/SSI programs. This is useful to block out "system" groups.

11. Can the wrapper successfully become the target user and group? Here is where the program becomes the target user and group via setuid and setgid calls. The group access list is also initialized with all of the groups of which the user is a member.

12. Can we change directory to the one in which the target CGI/SSI program resides? If it doesn’t exist, it can’t very well contain files. If we can’t change directory to it, it might as well not exist.

13. Is the directory within the httpd webspace? If the request is for a regular portion of the server, is the requested directory within suEXEC’s document root? If the request is for a UserDir, is the requested directory within the directory configured as suEXEC’s userdir (see suEXEC’s configuration options)?

14. Is the directory NOT writable by anyone else? We don’t want to open up the directory to others; only the owner user may be able to alter this directory’s contents.

15. Does the target CGI/SSI program exist? If it doesn’t exist, it can’t very well be executed.

16. Is the target CGI/SSI program NOT writable by anyone else? We don’t want to give anyone other than the owner the ability to change the CGI/SSI program.

17. Is the target CGI/SSI program NOT setuid or setgid? We do not want to execute programs that will then change our UID/GID again.

18. Is the target user/group the same as the program’s user/group? Is the user the owner of the file?

19.
Can we successfully clean the process environment to ensure safe operations? suEXEC cleans the process’ environment by establishing a safe execution PATH (defined during configuration), as well as only passing through those variables whose names are listed in the safe environment list (also created during configuration).

20. Can we successfully become the target CGI/SSI program and execute? Here is where suEXEC ends and the target CGI/SSI program begins.

This is the standard operation of the suEXEC wrapper’s security model. It is somewhat stringent and can impose new limitations and guidelines for CGI/SSI design, but it was developed carefully step-by-step with security in mind.

For more information as to how this security model can limit your possibilities in regards to server configuration, as well as what security risks can be avoided with a proper suEXEC setup, see the "Beware the Jabberwock" section of this document.

Configuring & Installing suEXEC

Here’s where we begin the fun.

suEXEC configuration options

--enable-suexec
This option enables the suEXEC feature which is never installed or activated by default. At least one --with-suexec-xxxxx option has to be provided together with the --enable-suexec option to let APACI accept your request for using the suEXEC feature.

--enable-suexec-capabilities
Linux specific: Normally, the suexec binary is installed "setuid/setgid root", which allows it to run with the full privileges of the root user. If this option is used, the suexec binary will instead be installed with only the setuid/setgid "capability" bits set, which is the subset of full root privileges required for suexec operation. Note that the suexec binary may not be able to write to a log file in this mode; it is recommended that the --with-suexec-syslog --without-suexec-logfile options are used in conjunction with this mode, so that syslog logging is used instead.
--with-suexec-bin=PATH
The path to the suexec binary must be hard-coded in the server for security reasons. Use this option to override the default path, e.g. --with-suexec-bin=/usr/sbin/suexec

--with-suexec-caller=UID
The username (p. 990) under which httpd normally runs. This is the only user allowed to execute the suEXEC wrapper.

--with-suexec-userdir=DIR
Define to be the subdirectory under users’ home directories where suEXEC access should be allowed. All executables under this directory will be executable by suEXEC as the user so they should be "safe" programs. If you are using a "simple" UserDir directive (i.e. one without a "*" in it) this should be set to the same value. suEXEC will not work properly in cases where the UserDir directive points to a location that is not the same as the user’s home directory as referenced in the passwd file. Default value is "public_html". If you have virtual hosts with a different UserDir for each, you will need to define them to all reside in one parent directory; then name that parent directory here. If this is not defined properly, "~userdir" cgi requests will not work!

--with-suexec-docroot=DIR
Define as the DocumentRoot set for httpd. This will be the only hierarchy (aside from UserDirs) that can be used for suEXEC behavior. The default directory is the --datadir value with the suffix "/htdocs", e.g. if you configure with "--datadir=/home/apache" the directory "/home/apache/htdocs" is used as document root for the suEXEC wrapper.

--with-suexec-uidmin=UID
Define this as the lowest UID allowed to be a target user for suEXEC. For most systems, 500 or 100 is common. Default value is 100.

--with-suexec-gidmin=GID
Define this as the lowest GID allowed to be a target group for suEXEC. For most systems, 100 is common and therefore used as default value.
--with-suexec-logfile=FILE
This defines the filename to which all suEXEC transactions and errors are logged (useful for auditing and debugging purposes). By default the logfile is named "suexec_log" and located in your standard logfile directory (--logfiledir).

--with-suexec-syslog
If defined, suexec will log notices and errors to syslog instead of a logfile. This option must be combined with --without-suexec-logfile.

--with-suexec-safepath=PATH
Define a safe PATH environment to pass to CGI executables. Default value is "/usr/local/bin:/usr/bin:/bin".

Compiling and installing the suEXEC wrapper

If you have enabled the suEXEC feature with the --enable-suexec option the suexec binary (together with httpd itself) is automatically built if you execute the make command.

After all components have been built you can execute the command make install to install them. The binary image suexec is installed in the directory defined by the --sbindir option. The default location is "/usr/local/apache2/bin/suexec".

Please note that you need root privileges for the installation step. In order for the wrapper to set the user ID, it must be installed as owner root and must have the setuserid execution bit set for file modes.

Setting paranoid permissions

Although the suEXEC wrapper will check to ensure that its caller is the correct user as specified with the --with-suexec-caller configure option, there is always the possibility that a system or library call suEXEC uses before this check may be exploitable on your system. To counter this, and because it is best practice in general, you should use filesystem permissions to ensure that only the group httpd runs as may execute suEXEC.
If, for example, your web server is configured to run as:

    User www
    Group webgroup

and suexec is installed at "/usr/local/apache2/bin/suexec", you should run:

    chgrp webgroup /usr/local/apache2/bin/suexec
    chmod 4750 /usr/local/apache2/bin/suexec

This will ensure that only the group httpd runs as can even execute the suEXEC wrapper.

Enabling & Disabling suEXEC

Upon startup of httpd, it looks for the file suexec in the directory defined by the --sbindir option (default is "/usr/local/apache/sbin/suexec"). If httpd finds a properly configured suEXEC wrapper, it will print the following message to the error log:

    [notice] suEXEC mechanism enabled (wrapper: /path/to/suexec)

If you don’t see this message at server startup, the server is most likely not finding the wrapper program where it expects it, or the executable is not installed setuid root.

If you want to enable the suEXEC mechanism for the first time and an Apache HTTP Server is already running you must kill and restart httpd. Restarting it with a simple HUP or USR1 signal will not be enough.

If you want to disable suEXEC you should kill and restart httpd after you have removed the suexec file.

Using suEXEC

Requests for CGI programs will call the suEXEC wrapper only if they are for a virtual host containing a SuexecUserGroup directive or if they are processed by mod_userdir.

Virtual Hosts: One way to use the suEXEC wrapper is through the SuexecUserGroup directive in VirtualHost definitions. By setting this directive to values different from the main server user ID, all requests for CGI resources will be executed as the User and Group defined for that <VirtualHost>. If this directive is not specified for a <VirtualHost> then the main server userid is assumed.

User directories: Requests that are processed by mod_userdir will call the suEXEC wrapper to execute CGI programs under the userid of the requested user directory.
The only requirement for this feature to work is that CGI execution be enabled for the user and that the script meet the scrutiny of the security checks above. See also the --with-suexec-userdir compile time option.

Debugging suEXEC

The suEXEC wrapper will write log information to the file defined with the --with-suexec-logfile option as indicated above, or to syslog if --with-suexec-syslog is used. If you feel you have configured and installed the wrapper properly, have a look at the log and the error log for the server to see where you may have gone astray. The output of "suexec -V" will show the options used to compile suexec, if using a binary distribution.

Beware the Jabberwock: Warnings & Examples

NOTE! This section may not be complete. For the latest revision of this section of the documentation, see the Online Documentation[40] version.

There are a few points of interest regarding the wrapper that can cause limitations on server setup. Please review these before submitting any "bugs" regarding suEXEC.

[40] http://httpd.apache.org/docs/trunk/suexec.html

suEXEC Points Of Interest

• Hierarchy limitations
For security and efficiency reasons, all suEXEC requests must remain within either a top-level document root for virtual host requests, or one top-level personal document root for userdir requests. For example, if you have four VirtualHosts configured, you would need to structure all of your VHosts’ document roots off of one main httpd document hierarchy to take advantage of suEXEC for VirtualHosts. (Example forthcoming.)

• suEXEC’s PATH environment variable
This can be a dangerous thing to change. Make certain every path you include in this define is a trusted directory. You don’t want to open people up to having someone from across the world running a trojan horse on them.

• Altering the suEXEC code
Again, this can cause Big Trouble if you try this without knowing what you are doing.
Stay away from it if at all possible.

2.22 Issues Regarding DNS and Apache HTTP Server

This page could be summarized with the statement: don’t configure Apache HTTP Server in such a way that it relies on DNS resolution for parsing of the configuration files. If httpd requires DNS resolution to parse the configuration files then your server may be subject to reliability problems (i.e. it might not start up), or denial and theft of service attacks (including virtual hosts able to steal hits from other virtual hosts).

A Simple Example

    # This is a misconfiguration example, do not use on your server
    <VirtualHost www.example.dom>
        ServerAdmin webgirl@example.dom
        DocumentRoot "/www/example"
    </VirtualHost>

In order for the server to function properly, it absolutely needs to have two pieces of information about each virtual host: the ServerName and at least one IP address that the server will bind and respond to. The above example does not include the IP address, so httpd must use DNS to find the address of www.example.dom. If for some reason DNS is not available at the time your server is parsing its config file, then this virtual host will not be configured. It won’t be able to respond to any hits to this virtual host.

Suppose that www.example.dom has address 192.0.2.1. Then consider this configuration snippet:

    # This is a misconfiguration example, do not use on your server
    <VirtualHost 192.0.2.1>
        ServerAdmin webgirl@example.dom
        DocumentRoot "/www/example"
    </VirtualHost>

This time httpd needs to use reverse DNS to find the ServerName for this virtual host. If that reverse lookup fails then it will partially disable the virtual host. If the virtual host is name-based then it will effectively be totally disabled, but if it is IP-based then it will mostly work. However, if httpd should ever have to generate a full URL for the server which includes the server name (such as when a Redirect is issued), then it will fail to generate a valid URL.
Here is a snippet that avoids both of these problems:

    <VirtualHost 192.0.2.1>
        ServerName www.example.dom
        ServerAdmin webgirl@example.dom
        DocumentRoot "/www/example"
    </VirtualHost>

Denial of Service

Consider this configuration snippet:

    <VirtualHost www.example1.dom>
        ServerAdmin webgirl@example1.dom
        DocumentRoot "/www/example1"
    </VirtualHost>
    <VirtualHost www.example2.dom>
        ServerAdmin webguy@example2.dom
        DocumentRoot "/www/example2"
    </VirtualHost>

Suppose that you’ve assigned 192.0.2.1 to www.example1.dom and 192.0.2.2 to www.example2.dom. Furthermore, suppose that example1.dom has control of their own DNS. With this config you have put example1.dom into a position where they can steal all traffic destined to example2.dom. To do so, all they have to do is set www.example1.dom to 192.0.2.2. Since they control their own DNS you can’t stop them from pointing the www.example1.dom record wherever they wish.

Requests coming in to 192.0.2.2 (including all those where users typed in URLs of the form http://www.example2.dom/whatever) will all be served by the example1.dom virtual host. Understanding why this happens requires a more in-depth discussion of how httpd matches up incoming requests with the virtual host that will serve them. A rough document describing this is available (p. 141).

The "main server" Address

Name-based virtual host support (p. 125) requires httpd to know the IP address(es) of the host that httpd is running on. To get this address it uses either the global ServerName (if present) or calls the C function gethostname (which should return the same as typing "hostname" at the command prompt). Then it performs a DNS lookup on this address. At present there is no way to avoid this lookup.

If you fear that this lookup might fail because your DNS server is down then you can insert the hostname in /etc/hosts (where you probably already have it so that the machine can boot properly). Then ensure that your machine is configured to use /etc/hosts in the event that DNS fails.
Depending on what OS you are using this might be accomplished by editing /etc/resolv.conf, or maybe /etc/nsswitch.conf.

If your server doesn’t have to perform DNS for any other reason then you might be able to get away with running httpd with the HOSTRESORDER environment variable set to "local". This all depends on what OS and resolver libraries you are using. It also affects CGIs unless you use mod_env to control the environment. It’s best to consult the man pages or FAQs for your OS.

Tips to Avoid These Problems

• use IP addresses in VirtualHost
• use IP addresses in Listen
• ensure all virtual hosts have an explicit ServerName
• create a server that has no pages to serve

Chapter 3 Apache Virtual Host documentation

3.1 Apache Virtual Host documentation

The term Virtual Host refers to the practice of running more than one web site (such as company1.example.com and company2.example.com) on a single machine. Virtual hosts can be "IP-based (p. 128)", meaning that you have a different IP address for every web site, or "name-based (p. 125)", meaning that you have multiple names running on each IP address. The fact that they are running on the same physical server is not apparent to the end user.

Apache was one of the first servers to support IP-based virtual hosts right out of the box. Versions 1.1 and later of Apache support both IP-based and name-based virtual hosts (vhosts). The latter variant of virtual hosts is sometimes also called host-based or non-IP virtual hosts.

Below is a list of documentation pages which explain all details of virtual host support in Apache HTTP Server:

See also
• mod_vhost_alias
• Name-based virtual hosts (p. 125)
• IP-based virtual hosts (p. 128)
• Virtual host examples (p. 134)
• File descriptor limits (p. 144)
• Mass virtual hosting (p. 130)
• Details of host matching (p. 141)

Virtual Host Support
• Name-based Virtual Hosts (p.
125) (More than one web site per IP address)
• IP-based Virtual Hosts (p. 128) (An IP address for each web site)
• Virtual Host examples for common setups (p. 134)
• File Descriptor Limits (p. 144) (or, Too many log files)
• Dynamically Configured Mass Virtual Hosting (p. 130)
• In-Depth Discussion of Virtual Host Matching (p. 141)

Configuration directives
• <VirtualHost>
• ServerName
• ServerAlias
• ServerPath

If you are trying to debug your virtual host configuration, you may find the Apache -S command line switch useful. That is, type the following command:

    /usr/local/apache2/bin/httpd -S

This command will dump out a description of how Apache parsed the configuration file. Careful examination of the IP addresses and server names may help uncover configuration mistakes. (See the docs for the httpd program for other command line options.)

3.2 Name-based Virtual Host Support

This document describes when and how to use name-based virtual hosts.

See also
• IP-based Virtual Host Support (p. 128)
• An In-Depth Discussion of Virtual Host Matching (p. 141)
• Dynamically configured mass virtual hosting (p. 130)
• Virtual Host examples for common setups (p. 134)

Name-based vs. IP-based Virtual Hosts

IP-based virtual hosts (p. 128) use the IP address of the connection to determine the correct virtual host to serve. Therefore you need to have a separate IP address for each host.

With name-based virtual hosting, the server relies on the client to report the hostname as part of the HTTP headers. Using this technique, many different hosts can share the same IP address.

Name-based virtual hosting is usually simpler, since you need only configure your DNS server to map each hostname to the correct IP address and then configure the Apache HTTP Server to recognize the different hostnames. Name-based virtual hosting also eases the demand for scarce IP addresses.
Therefore you should use name-based virtual hosting unless you are using equipment that explicitly demands IP-based hosting. Historical reasons for IP-based virtual hosting based on client support are no longer applicable to a general-purpose web server.

Name-based virtual hosting builds off of the IP-based virtual host selection algorithm, meaning that searches for the proper server name occur only between virtual hosts that have the best IP-based address.

How the server selects the proper name-based virtual host

It is important to recognize that the first step in name-based virtual host resolution is IP-based resolution. Name-based virtual host resolution only chooses the most appropriate name-based virtual host after narrowing down the candidates to the best IP-based match. Using a wildcard (*) for the IP address in all of the VirtualHost directives makes this IP-based mapping irrelevant.

When a request arrives, the server will find the best (most specific) matching <VirtualHost> argument based on the IP address and port used by the request. If there is more than one virtual host containing this best-match address and port combination, Apache will further compare the ServerName and ServerAlias directives to the server name present in the request.

If you omit the ServerName directive from any name-based virtual host, the server will default to a fully qualified domain name (FQDN) derived from the system hostname. This implicitly set server name can lead to counter-intuitive virtual host matching and is discouraged.

The default name-based vhost for an IP and port combination

If no matching ServerName or ServerAlias is found in the set of virtual hosts containing the most specific matching IP address and port combination, then the first listed virtual host that matches that address and port will be used.
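The name-matching step described above can be modelled in a few lines. The following Python sketch is an illustration only, not httpd's implementation: the function name, the dictionary layout, and the use of Python's fnmatch module for the * and ? wildcards are all assumptions made for this example. It takes the virtual hosts that already share the best address and port match, in configuration order, and returns the first one whose ServerName or ServerAlias matches the requested host, falling back to the first listed virtual host.

```python
from fnmatch import fnmatchcase


def select_name_based_vhost(candidates, host_header):
    """Pick a vhost from candidates that share the best IP:port match.

    candidates: list of dicts in configuration order (hypothetical layout),
    each with a 'server_name' string and an optional 'server_aliases' list.
    host_header: value of the request's Host header, e.g. "example.com:80".
    """
    if not candidates:
        return None  # no vhost for this address; the main server would answer

    # Simplification: strip an optional ":port" suffix and compare in lower
    # case. IPv6 literals such as "[::1]:80" are not handled by this sketch.
    host = host_header.split(":", 1)[0].lower().rstrip(".")

    for vhost in candidates:
        if host == vhost["server_name"].lower():
            return vhost
        # ServerAlias entries may contain the wildcards * and ?.
        # fnmatchcase also understands [seq], which goes beyond what the
        # text describes; that is a limitation of this model.
        for alias in vhost.get("server_aliases", []):
            if fnmatchcase(host, alias.lower()):
                return vhost

    # Nothing matched: the first listed vhost acts as the default.
    return candidates[0]


if __name__ == "__main__":
    vhosts = [
        {"server_name": "www.example.com", "server_aliases": ["example.com"]},
        {"server_name": "other.example.com"},
    ]
    for requested in ("other.example.com", "example.com", "unknown.example.net"):
        chosen = select_name_based_vhost(vhosts, requested)
        print(requested, "->", chosen["server_name"])
```

Because the loop walks the candidates in order and stops at the first match, a broad wildcard alias placed early in the configuration can shadow a more specific name listed later, which is the ordering behaviour the text describes.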
Using Name-based Virtual Hosts

Related Modules: core
Related Directives: DocumentRoot, ServerAlias, ServerName

The first step is to create a <VirtualHost> block for each different host that you would like to serve. Inside each <VirtualHost> block, you will need at minimum a ServerName directive to designate which host is served and a DocumentRoot directive to show where in the filesystem the content for that host lives.

=⇒ Main host goes away
Any request that doesn’t match an existing <VirtualHost> is handled by the global server configuration, regardless of the hostname or ServerName.
When you add a name-based virtual host to an existing server, and the virtual host arguments match preexisting IP and port combinations, requests will now be handled by an explicit virtual host. In this case, it’s usually wise to create a default virtual host with a ServerName matching that of the base server. New domains on the same interface and port, but requiring separate configurations, can then be added as subsequent (non-default) virtual hosts.

=⇒ ServerName inheritance
It is best to always explicitly list a ServerName in every name-based virtual host.
If a VirtualHost doesn’t specify a ServerName, a server name will be inherited from the base server configuration. If no server name was specified globally, one is detected at startup through reverse DNS resolution of the first listening address. In either case, this inherited server name will influence name-based virtual host resolution.

For example, suppose that you are serving the domain www.example.com and you wish to add the virtual host other.example.com, which points at the same IP address.
Then you simply add the following to httpd.conf:

    <VirtualHost *:80>
        # This first-listed virtual host is also the default for *:80
        ServerName www.example.com
        ServerAlias example.com
        DocumentRoot "/www/domain"
    </VirtualHost>

    <VirtualHost *:80>
        ServerName other.example.com
        DocumentRoot "/www/otherdomain"
    </VirtualHost>

You can alternatively specify an explicit IP address in place of the * in <VirtualHost> directives. For example, you might want to do this in order to run some name-based virtual hosts on one IP address, and either IP-based, or another set of name-based virtual hosts on another address.

Many servers want to be accessible by more than one name. This is possible with the ServerAlias directive, placed inside the <VirtualHost> section. For example, in the first <VirtualHost> block above, the ServerAlias directive indicates that the listed names are other names which people can use to see that same web site. If it is written as:

    ServerAlias example.com *.example.com

then requests for all hosts in the example.com domain will be served by the www.example.com virtual host. The wildcard characters * and ? can be used to match names. Of course, you can’t just make up names and place them in ServerName or ServerAlias. You must first have your DNS server properly configured to map those names to an IP address associated with your server.

Name-based virtual hosts for the best-matching set of <VirtualHost>s are processed in the order they appear in the configuration. The first matching ServerName or ServerAlias is used, with no different precedence for wildcards (nor for ServerName vs. ServerAlias).

The complete list of names in the VirtualHost directive is treated just like a (non wildcard) ServerAlias.

Finally, you can fine-tune the configuration of the virtual hosts by placing other directives inside the <VirtualHost> containers. Most directives can be placed in these containers and will then change the configuration only of the relevant virtual host. To find out if a particular directive is allowed, check the Context (p. 377) of the directive.
Configuration directives set in the main server context (outside any <VirtualHost> container) will be used only if they are not overridden by the virtual host settings.

3.3 Apache IP-based Virtual Host Support

See also
• Name-based Virtual Hosts Support (p. 125)

What is IP-based virtual hosting

IP-based virtual hosting is a method to apply different directives based on the IP address and port a request is received on. Most commonly, this is used to serve different websites on different ports or interfaces.

In many cases, name-based virtual hosts (p. 125) are more convenient, because they allow many virtual hosts to share a single address/port. See Name-based vs. IP-based Virtual Hosts (p. 125) to help you decide.

System requirements

As the term IP-based indicates, the server must have a different IP address/port combination for each IP-based virtual host. This can be achieved by the machine having several physical network connections, or by use of virtual interfaces which are supported by most modern operating systems (see system documentation for details; these are frequently called "ip aliases", and the "ifconfig" command is most commonly used to set them up), and/or using multiple port numbers.

In the terminology of Apache HTTP Server, using a single IP address but multiple TCP ports is also IP-based virtual hosting.

How to set up Apache

There are two ways of configuring Apache to support multiple hosts: either by running a separate httpd daemon for each hostname, or by running a single daemon which supports all the virtual hosts.

Use multiple daemons when:

• There are security partitioning issues, such as company1 does not want anyone at company2 to be able to read their data except via the web. In this case you would need two daemons, each running with different User, Group, Listen, and ServerRoot settings.
• You can afford the memory and file descriptor requirements of listening to every IP alias on the machine.
It’s only possible to Listen to the "wildcard" address, or to specific addresses. So if you have a need to listen to a specific address for whatever reason, then you will need to listen to all specific addresses. (Although one httpd could listen to N-1 of the addresses, and another could listen to the remaining address.)

Use a single daemon when:

• Sharing of the httpd configuration between virtual hosts is acceptable.
• The machine services a large number of requests, and so the performance loss in running separate daemons may be significant.

Setting up multiple daemons

Create a separate httpd installation for each virtual host. For each installation, use the Listen directive in the configuration file to select which IP address (or virtual host) that daemon services, e.g.

    Listen 192.0.2.100:80

It is recommended that you use an IP address instead of a hostname (see DNS caveats (p. 121)).

Setting up a single daemon with virtual hosts

For this case, a single httpd will service requests for the main server and all the virtual hosts. The VirtualHost directive in the configuration file is used to set the values of ServerAdmin, ServerName, DocumentRoot, ErrorLog and TransferLog or CustomLog configuration directives to different values for each virtual host, e.g.

    ServerAdmin webmaster@www1.example.com
    DocumentRoot "/www/vhosts/www1"
    ServerName www1.example.com
    ErrorLog "/www/logs/www1/error_log"
    CustomLog "/www/logs/www1/access_log" combined

    ServerAdmin "webmaster@www2.example.org"
    DocumentRoot "/www/vhosts/www2"
    ServerName www2.example.org
    ErrorLog "/www/logs/www2/error_log"
    CustomLog "/www/logs/www2/access_log" combined

It is recommended that you use an IP address instead of a hostname in the <VirtualHost> directive (see DNS caveats (p. 121)).

Specific IP addresses or ports have precedence over their wildcard equivalents, and any virtual host that matches has precedence over the server’s base configuration.
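The precedence rule in the last paragraph can be pictured with a small model. The Python sketch below is a simplification under stated assumptions, not httpd's actual lookup code: the function names and the dictionary layout are hypothetical, "*" stands for a wildcard address or port, and the choice to compare address specificity before port specificity is an assumption of this example. It returns the virtual hosts with the most specific address and port match for a request, or an empty list when only the server's base configuration applies.

```python
def match_rank(vhost_addr, vhost_port, req_addr, req_port):
    """Rank how specifically a vhost's address/port matches a request.

    Returns None when the vhost does not match at all, otherwise a tuple
    (address_is_exact, port_is_exact). Tuples compare element by element,
    so an exact address outranks a wildcard address (assumed ordering).
    """
    if vhost_addr not in ("*", req_addr):
        return None
    if vhost_port not in ("*", req_port):
        return None
    return (vhost_addr == req_addr, vhost_port == req_port)


def best_address_matches(vhosts, req_addr, req_port):
    """Return the vhosts, in configuration order, with the best match."""
    ranked = []
    for vhost in vhosts:
        rank = match_rank(vhost["addr"], vhost["port"], req_addr, req_port)
        if rank is not None:
            ranked.append((rank, vhost))

    if not ranked:
        # No virtual host matches: the base configuration handles the request.
        return []

    best = max(rank for rank, _ in ranked)
    return [vhost for rank, vhost in ranked if rank == best]


if __name__ == "__main__":
    # Hypothetical configuration: one wildcard vhost and one specific vhost.
    vhosts = [
        {"name": "catch-all", "addr": "*", "port": 80},
        {"name": "www1", "addr": "192.0.2.100", "port": 80},
    ]
    for addr in ("192.0.2.100", "192.0.2.200"):
        names = [v["name"] for v in best_address_matches(vhosts, addr, 80)]
        print(addr, "->", names)
```

When the returned list holds more than one virtual host, the choice among them is made by server name, which is outside the scope of this sketch.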
Almost any configuration directive can be put in the VirtualHost directive, with the exception of directives that control process creation and a few other directives. To find out if a directive can be used in the VirtualHost directive, check the Context (p. 377) using the directive index (p. 1106).

SuexecUserGroup may be used inside a VirtualHost directive if the suEXEC wrapper (p. 115) is used.

SECURITY: When specifying where to write log files, be aware of some security risks which are present if anyone other than the user that starts Apache has write access to the directory where they are written. See the security tips (p. 364) document for details.

3.4 Dynamically Configured Mass Virtual Hosting

This document describes how to efficiently serve an arbitrary number of virtual hosts with the Apache HTTP Server. A separate document (p. 162) discusses using mod_rewrite to create dynamic mass virtual hosts.

Motivation

The techniques described here are of interest if your httpd.conf contains many <VirtualHost> sections that are substantially the same, for example:

    <VirtualHost 111.22.33.44>
        ServerName customer-1.example.com
        DocumentRoot "/www/hosts/customer-1.example.com/docs"
        ScriptAlias "/cgi-bin/" "/www/hosts/customer-1.example.com/cgi-bin"
    </VirtualHost>

    <VirtualHost 111.22.33.44>
        ServerName customer-2.example.com
        DocumentRoot "/www/hosts/customer-2.example.com/docs"
        ScriptAlias "/cgi-bin/" "/www/hosts/customer-2.example.com/cgi-bin"
    </VirtualHost>

    <VirtualHost 111.22.33.44>
        ServerName customer-N.example.com
        DocumentRoot "/www/hosts/customer-N.example.com/docs"
        ScriptAlias "/cgi-bin/" "/www/hosts/customer-N.example.com/cgi-bin"
    </VirtualHost>

We wish to replace these multiple <VirtualHost> blocks with a mechanism that works them out dynamically. This has a number of advantages:

1. Your configuration file is smaller, so Apache starts more quickly and uses less memory. Perhaps more importantly, the smaller configuration is easier to maintain, and leaves less room for errors.
2. Adding virtual hosts is simply a matter of creating the appropriate directories in the filesystem and entries in the DNS - you don’t need to reconfigure or restart Apache.

The main disadvantage is that you cannot have a different log file for each virtual host; however, if you have many virtual hosts, doing this can be a bad idea anyway, because of the number of file descriptors needed (p. 144). It is better to log to a pipe or a fifo (p. 56), and arrange for the process at the other end to split up the log files into one per virtual host. One example of such a process can be found in the split-logfile (p. 336) utility.

Overview

A virtual host is defined by two pieces of information: its IP address, and the contents of the Host: header in the HTTP request. The dynamic mass virtual hosting technique used here is based on automatically inserting this information into the pathname of the file that is used to satisfy the request. This can be most easily done by using mod_vhost_alias with Apache httpd. Alternatively, mod_rewrite can be used (p. 162).

Both of these modules are disabled by default; you must enable one of them when configuring and building Apache httpd if you want to use this technique.

A couple of things need to be determined from the request in order to make the dynamic virtual host look like a normal one. The most important is the server name, which is used by the server to generate self-referential URLs etc. It is configured with the ServerName directive, and it is available to CGIs via the SERVER_NAME environment variable.

The actual value used at run time is controlled by the UseCanonicalName setting. With UseCanonicalName Off, the server name is taken from the contents of the Host: header in the request. With UseCanonicalName DNS, it is taken from a reverse DNS lookup of the virtual host’s IP address.
The former setting is used for name-based dynamic virtual hosting, and the latter is used for IP-based hosting. If httpd cannot work out the server name because there is no Host: header, or the DNS lookup fails, then the value configured with ServerName is used instead.

The other thing to determine is the document root (configured with DocumentRoot and available to CGI scripts via the DOCUMENT_ROOT environment variable). In a normal configuration, this is used by the core module when mapping URIs to filenames, but when the server is configured to do dynamic virtual hosting, that job must be taken over by another module (either mod_vhost_alias or mod_rewrite), which has a different way of doing the mapping. Neither of these modules is responsible for setting the DOCUMENT_ROOT environment variable, so if any CGIs or SSI documents make use of it, they will get a misleading value.

Dynamic Virtual Hosts with mod_vhost_alias

This extract from httpd.conf implements the virtual host arrangement outlined in the Motivation section above using mod_vhost_alias.

    # get the server name from the Host: header
    UseCanonicalName Off

    # this log format can be split per-virtual-host based on the first field
    # using the split-logfile utility.
    LogFormat "%V %h %l %u %t \"%r\" %s %b" vcommon
    CustomLog "logs/access_log" vcommon

    # include the server name in the filenames used to satisfy requests
    VirtualDocumentRoot "/www/hosts/%0/docs"
    VirtualScriptAlias "/www/hosts/%0/cgi-bin"

This configuration can be changed into an IP-based virtual hosting solution by just turning UseCanonicalName Off into UseCanonicalName DNS. The server name that is inserted into the filename is then derived from the IP address of the virtual host. The variable %0 references the requested servername, as indicated in the Host: header.

See the mod_vhost_alias documentation for more usage examples.

Simplified Dynamic Virtual Hosts

This is an adjustment of the above system, tailored for an ISP’s web hosting server.
Using %2, we can select substrings of the server name to use in the filename so that, for example, the documents for www.user.example.com are found in /home/user/www. It uses a single cgi-bin directory instead of one per virtual host.

    UseCanonicalName Off

    LogFormat "%V %h %l %u %t \"%r\" %s %b" vcommon
    CustomLog "logs/access_log" vcommon

    # include part of the server name in the filenames
    VirtualDocumentRoot "/home/%2/www"

    # single cgi-bin directory
    ScriptAlias "/cgi-bin/" "/www/std-cgi/"

There are examples of more complicated VirtualDocumentRoot settings in the mod_vhost_alias documentation.

Using Multiple Virtual Hosting Systems on the Same Server

With more complicated setups, you can use httpd’s normal <VirtualHost> directives to control the scope of the various virtual hosting configurations. For example, you could have one IP address for general customers’ homepages, and another for commercial customers, with the following setup. This can be combined with conventional <VirtualHost> configuration sections, as shown below.

    UseCanonicalName Off

    LogFormat "%V %h %l %u %t \"%r\" %s %b" vcommon

    <Directory "/www/commercial">
        Options FollowSymLinks
        AllowOverride All
    </Directory>

    <Directory "/www/homepages">
        Options FollowSymLinks
        AllowOverride None
    </Directory>

    <VirtualHost 111.22.33.44>
        ServerName www.commercial.example.com

        CustomLog "logs/access_log.commercial" vcommon

        VirtualDocumentRoot "/www/commercial/%0/docs"
        VirtualScriptAlias "/www/commercial/%0/cgi-bin"
    </VirtualHost>

    <VirtualHost 111.22.33.45>
        ServerName www.homepages.example.com

        CustomLog "logs/access_log.homepages" vcommon

        VirtualDocumentRoot "/www/homepages/%0/docs"
        ScriptAlias "/cgi-bin/" "/www/std-cgi/"
    </VirtualHost>

Note: If the first VirtualHost block does not include a ServerName directive, the reverse DNS of the relevant IP will be used instead. If this is not the server name you wish to use, a bogus entry (e.g. ServerName none.example.com) can be added to get around this behaviour.

More Efficient IP-Based Virtual Hosting

The configuration changes suggested to turn the first example into an IP-based virtual hosting setup result in a rather inefficient setup. A new DNS lookup is required for every request. To avoid this overhead, the filesystem can be arranged to correspond to the IP addresses, instead of to the host names, thereby negating the need for a DNS lookup. Logging will also have to be adjusted to fit this system.

    # get the server name from the reverse DNS of the IP address
    UseCanonicalName DNS

    # include the IP address in the logs so they may be split
    LogFormat "%A %h %l %u %t \"%r\" %s %b" vcommon
    CustomLog "logs/access_log" vcommon

    # include the IP address in the filenames
    VirtualDocumentRootIP "/www/hosts/%0/docs"
    VirtualScriptAliasIP "/www/hosts/%0/cgi-bin"

Mass virtual hosts with mod_rewrite

Mass virtual hosting may also be accomplished using mod_rewrite, either using simple RewriteRule directives, or using more complicated techniques such as storing the vhost definitions externally and accessing them via RewriteMap. These techniques are discussed in the rewrite documentation (p. 162).

Mass virtual hosts with mod_macro

Another option for dynamically generated virtual hosts is mod_macro, with which you can create a virtual host template and invoke it for multiple hostnames. An example of this is provided in the Usage section of the module documentation.

3.5 VirtualHost Examples

This document attempts to answer the commonly-asked questions about setting up virtual hosts (p. 124). These scenarios are those involving multiple web sites running on a single server, via name-based (p. 125) or IP-based (p. 128) virtual hosts.

Running several name-based web sites on a single IP address.

Your server has multiple hostnames that resolve to a single address, and you want to respond differently for www.example.com and www.example.org.
Note: Creating virtual host configurations on your Apache server does not magically cause DNS entries to be created for those host names. You must have the names in DNS, resolving to your IP address, or nobody else will be able to see your web site. You can put entries in your hosts file for local testing, but that will work only from the machine with those hosts entries.

    # Ensure that Apache listens on port 80
    Listen 80
    <VirtualHost *:80>
        DocumentRoot "/www/example1"
        ServerName www.example.com

        # Other directives here
    </VirtualHost>

    <VirtualHost *:80>
        DocumentRoot "/www/example2"
        ServerName www.example.org

        # Other directives here
    </VirtualHost>

The asterisks match all addresses, so the main server serves no requests. Because the virtual host with ServerName www.example.com is first in the configuration file, it has the highest priority and can be seen as the default or primary server. That means that if a request is received that does not match one of the specified ServerName directives, it will be served by this first VirtualHost.

The above configuration is what you will want to use in almost all name-based virtual hosting situations. The only thing that this configuration will not work for, in fact, is when you are serving different content based on differing IP addresses or ports.

Note: You may replace * with a specific IP address on the system. Such virtual hosts will only be used for HTTP requests received on connection to the specified IP address. However, it is additionally useful to use * on systems where the IP address is not predictable - for example if you have a dynamic IP address with your ISP, and you are using some variety of dynamic DNS solution. Since * matches any IP address, this configuration would work without changes whenever your IP address changes.

Name-based hosts on more than one IP address.

Note: Any of the techniques discussed here can be extended to any number of IP addresses.

The server has two IP addresses.
On one (172.20.30.40), we will serve the "main" server, server.example.com, and on the other (172.20.30.50), we will serve two or more virtual hosts.

    Listen 80

    # This is the "main" server running on 172.20.30.40
    ServerName server.example.com
    DocumentRoot "/www/mainserver"

    <VirtualHost 172.20.30.50>
        DocumentRoot "/www/example1"
        ServerName www.example.com

        # Other directives here ...
    </VirtualHost>

    <VirtualHost 172.20.30.50>
        DocumentRoot "/www/example2"
        ServerName www.example.org

        # Other directives here ...
    </VirtualHost>

Any request to an address other than 172.20.30.50 will be served from the main server. A request to 172.20.30.50 with an unknown hostname, or no Host: header, will be served from www.example.com.

Serving the same content on different IP addresses (such as an internal and external address).

The server machine has two IP addresses (192.168.1.1 and 172.20.30.40). The machine is sitting between an internal (intranet) network and an external (internet) network. Outside of the network, the name server.example.com resolves to the external address (172.20.30.40), but inside the network, that same name resolves to the internal address (192.168.1.1).

The server can be made to respond to internal and external requests with the same content, with just one VirtualHost section.

    <VirtualHost 192.168.1.1 172.20.30.40>
        DocumentRoot "/www/server1"
        ServerName server.example.com
        ServerAlias server
    </VirtualHost>

Now requests from both networks will be served from the same VirtualHost.

Note: On the internal network, one can just use the name server rather than the fully qualified host name server.example.com. Note also that, in the above example, you can replace the list of IP addresses with *, which will cause the server to respond the same on all addresses.

Running different sites on different ports.

You have multiple domains going to the same IP and also want to serve multiple ports. The example below illustrates that the name-matching takes place after the best matching IP address and port combination is determined.
    Listen 80
    Listen 8080

    <VirtualHost 172.20.30.40:80>
        ServerName www.example.com
        DocumentRoot "/www/domain-80"
    </VirtualHost>

    <VirtualHost 172.20.30.40:8080>
        ServerName www.example.com
        DocumentRoot "/www/domain-8080"
    </VirtualHost>

    <VirtualHost 172.20.30.40:80>
        ServerName www.example.org
        DocumentRoot "/www/otherdomain-80"
    </VirtualHost>

    <VirtualHost 172.20.30.40:8080>
        ServerName www.example.org
        DocumentRoot "/www/otherdomain-8080"
    </VirtualHost>

IP-based virtual hosting

The server has two IP addresses (172.20.30.40 and 172.20.30.50) which resolve to the names www.example.com and www.example.org respectively.

    Listen 80

    <VirtualHost 172.20.30.40>
        DocumentRoot "/www/example1"
        ServerName www.example.com
    </VirtualHost>

    <VirtualHost 172.20.30.50>
        DocumentRoot "/www/example2"
        ServerName www.example.org
    </VirtualHost>

Requests for any address not specified in one of the <VirtualHost> directives (such as localhost, for example) will go to the main server, if there is one.

Mixed port-based and ip-based virtual hosts

The server machine has two IP addresses (172.20.30.40 and 172.20.30.50) which resolve to the names www.example.com and www.example.org respectively. In each case, we want to run hosts on ports 80 and 8080.

    Listen 172.20.30.40:80
    Listen 172.20.30.40:8080
    Listen 172.20.30.50:80
    Listen 172.20.30.50:8080

    <VirtualHost 172.20.30.40:80>
        DocumentRoot "/www/example1-80"
        ServerName www.example.com
    </VirtualHost>

    <VirtualHost 172.20.30.40:8080>
        DocumentRoot "/www/example1-8080"
        ServerName www.example.com
    </VirtualHost>

    <VirtualHost 172.20.30.50:80>
        DocumentRoot "/www/example2-80"
        ServerName www.example.org
    </VirtualHost>

    <VirtualHost 172.20.30.50:8080>
        DocumentRoot "/www/example2-8080"
        ServerName www.example.org
    </VirtualHost>

Mixed name-based and IP-based vhosts

Any address mentioned in the argument to a <VirtualHost> that never appears in another virtual host is a strictly IP-based virtual host.

    Listen 80
    <VirtualHost 172.20.30.40>
        DocumentRoot "/www/example1"
        ServerName www.example.com
    </VirtualHost>

    <VirtualHost 172.20.30.40>
        DocumentRoot "/www/example2"
        ServerName www.example.org
    </VirtualHost>

    <VirtualHost 172.20.30.40>
        DocumentRoot "/www/example3"
        ServerName www.example.net
    </VirtualHost>

    # IP-based
    <VirtualHost 172.20.30.50>
        DocumentRoot "/www/example4"
        ServerName www.example.edu
    </VirtualHost>

    <VirtualHost 172.20.30.60>
        DocumentRoot "/www/example5"
        ServerName www.example.gov
    </VirtualHost>

Using Virtual host and mod_proxy together

The following example allows a front-end machine to proxy a virtual host through to a server running on another machine.
In the example, a virtual host of the same name is configured on a machine at 192.168.111.2. The ProxyPreserveHost On directive is used so that the desired hostname is passed through, in case we are proxying multiple hostnames to a single machine.

    <VirtualHost *:*>
        ProxyPreserveHost On
        ProxyPass "/" "http://192.168.111.2/"
        ProxyPassReverse "/" "http://192.168.111.2/"
        ServerName hostname.example.com
    </VirtualHost>

Using _default_ vhosts

_default_ vhosts for all ports

Catching every request to any unspecified IP address and port, i.e., an address/port combination that is not used for any other virtual host.

    <VirtualHost _default_:*>
        DocumentRoot "/www/default"
    </VirtualHost>

Using such a default vhost with a wildcard port effectively prevents any request going to the main server.

A default vhost never serves a request that was sent to an address/port that is used for name-based vhosts. If the request contained an unknown or no Host: header it is always served from the primary name-based vhost (the vhost for that address/port appearing first in the configuration file).

You can use AliasMatch or RewriteRule to rewrite any request to a single information page (or script).

_default_ vhosts for different ports

Same as setup 1, but the server listens on several ports and we want to use a second _default_ vhost for port 80.

    <VirtualHost _default_:80>
        DocumentRoot "/www/default80"
        # ...
    </VirtualHost>

    <VirtualHost _default_:*>
        DocumentRoot "/www/default"
        # ...
    </VirtualHost>

The default vhost for port 80 (which must appear before any default vhost with a wildcard port) catches all requests that were sent to an unspecified IP address. The main server is never used to serve a request.

_default_ vhosts for one port

We want to have a default vhost for port 80, but no other default vhosts.

    <VirtualHost _default_:80>
        DocumentRoot "/www/default"
        ...
    </VirtualHost>

A request to an unspecified address on port 80 is served from the default vhost. Any other request to an unspecified address and port is served from the main server.

Any use of * in a virtual host declaration will have higher precedence than _default_.
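The suggestion above, that a catch-all default vhost can send every request to a single information page, might be sketched with AliasMatch as follows. The file name info.html and the access rule are assumptions added for illustration; only the /www/default path comes from the examples in this section.

```apache
# Catch-all vhost: matches any address/port not used by another vhost.
<VirtualHost _default_:*>
    DocumentRoot "/www/default"

    # Map every requested path to one information page.
    # "info.html" is a hypothetical file name.
    AliasMatch "^/.*$" "/www/default/info.html"

    # Allow access to the directory holding the page (assumed policy).
    <Directory "/www/default">
        Require all granted
    </Directory>
</VirtualHost>
```

A RewriteRule inside the same vhost would be an alternative way to reach a similar result; which one is preferable depends on whether mod_rewrite is already loaded for other purposes.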
Migrating a name-based vhost to an IP-based vhost

The name-based vhost with the hostname www.example.org (from our name-based example, setup 2) should get its own IP address. To avoid problems with name servers or proxies that cached the old IP address for the name-based vhost, we want to provide both variants during a migration phase.

The solution is easy, because we can simply add the new IP address (172.20.30.50) to the VirtualHost directive.

    Listen 80
    ServerName www.example.com
    DocumentRoot "/www/example1"

    <VirtualHost 172.20.30.40 172.20.30.50>
        DocumentRoot "/www/example2"
        ServerName www.example.org
        # ...
    </VirtualHost>

    <VirtualHost 172.20.30.40>
        DocumentRoot "/www/example3"
        ServerName www.example.net
        ServerAlias *.example.net
        # ...
    </VirtualHost>

The vhost can now be accessed through the new address (as an IP-based vhost) and through the old address (as a name-based vhost).

Using the ServerPath directive

We have a server with two name-based vhosts. In order to match the correct virtual host, a client must send the correct Host: header. Old HTTP/1.0 clients do not send such a header, so Apache has no clue what vhost the client tried to reach (and serves the request from the primary vhost). To provide as much backward compatibility as possible, we create a primary vhost which returns a single page containing links with a URL prefix to the name-based virtual hosts.

    <VirtualHost 172.20.30.40>
        # primary vhost
        DocumentRoot "/www/subdomain"
        RewriteEngine On
        RewriteRule "." "/www/subdomain/index.html"
        # ...
    </VirtualHost>

    <VirtualHost 172.20.30.40>
        DocumentRoot "/www/subdomain/sub1"
        ServerName www.sub1.domain.tld
        ServerPath /sub1/
        RewriteEngine On
        RewriteRule "^(/sub1/.*)" "/www/subdomain$1"
        # ...
    </VirtualHost>

    <VirtualHost 172.20.30.40>
        DocumentRoot "/www/subdomain/sub2"
        ServerName www.sub2.domain.tld
        ServerPath /sub2/
        RewriteEngine On
        RewriteRule "^(/sub2/.*)" "/www/subdomain$1"
        # ...
    </VirtualHost>

Due to the ServerPath directive, a request to the URL http://www.sub1.domain.tld/sub1/ is always served from the sub1-vhost.
A request to the URL http://www.sub1.domain.tld/ is only served from the sub1-vhost if the client sent a correct Host: header. If no Host: header is sent, the client gets the information page from the primary host.

Please note that there is one oddity: A request to http://www.sub2.domain.tld/sub1/ is also served from the sub1-vhost if the client sent no Host: header.

The RewriteRule directives are used to make sure that a client which sent a correct Host: header can use both URL variants, i.e., with or without URL prefix.

3.6 An In-Depth Discussion of Virtual Host Matching

This document attempts to explain exactly what Apache HTTP Server does when deciding what virtual host to serve a request from.

Most users should read about Name-based vs. IP-based Virtual Hosts (p. 125) to decide which type they want to use, then read more about name-based (p. 125) or IP-based (p. 128) virtual hosts, and then see some examples (p. 134).

If you want to understand all the details, then you can come back to this page.

See also

• IP-based Virtual Host Support (p. 128)
• Name-based Virtual Hosts Support (p. 125)
• Virtual Host examples for common setups (p. 134)
• Dynamically configured mass virtual hosting (p. 130)

Configuration File

There is a main server which consists of all the definitions appearing outside of <VirtualHost> sections.

There are virtual servers, called vhosts, which are defined by <VirtualHost> sections.

Each VirtualHost directive includes one or more addresses and optional ports.

Hostnames can be used in place of IP addresses in a virtual host definition, but they are resolved at startup and if any name resolutions fail, those virtual host definitions are ignored. This is, therefore, not recommended.

The address can be specified as *, which will match a request if no other vhost has the explicit address on which the request was received.

The address appearing in the VirtualHost directive can have an optional port.
If the port is unspecified, it is treated as a wildcard port, which can also be indicated explicitly using *. The wildcard port matches any port.

(Port numbers specified in the VirtualHost directive do not influence what port numbers Apache will listen on; they only control which VirtualHost will be selected to handle a request. Use the Listen directive to control the addresses and ports on which the server listens.)

Collectively, the entire set of addresses (including multiple results from DNS lookups) is called the vhost’s address set.

Apache automatically discriminates on the basis of the HTTP Host header supplied by the client whenever the most specific match for an IP address and port combination is listed in multiple virtual hosts.

The ServerName directive may appear anywhere within the definition of a server. However, each appearance overrides the previous appearance (within that server).

If no ServerName is specified, the server attempts to deduce it from the server’s IP address.

The first name-based vhost in the configuration file for a given IP:port pair is significant because it is used for all requests received on that address and port for which no other vhost for that IP:port pair has a matching ServerName or ServerAlias. It is also used for all SSL connections if the server does not support Server Name Indication.

The complete list of names in the VirtualHost directive is treated just like a (non wildcard) ServerAlias (but is not overridden by any ServerAlias statement).

For every vhost various default values are set. In particular:

1. If a vhost has no ServerAdmin, Timeout, KeepAliveTimeout, KeepAlive, MaxKeepAliveRequests, ReceiveBufferSize, or SendBufferSize directive then the respective value is inherited from the main server. (That is, inherited from whatever the final setting of that value is in the main server.)
2. The "lookup defaults" that define the default directory permissions for a vhost are merged with those of the main server. This includes any per-directory configuration information for any module.
3. The per-server configs for each module from the main server are merged into the vhost server.

Essentially, the main server is treated as "defaults" or a "base" on which to build each vhost. But the positioning of these main server definitions in the config file is largely irrelevant: the entire config of the main server has been parsed when this final merging occurs. So even if a main server definition appears after a vhost definition it might affect the vhost definition.

If the main server has no ServerName at this point, then the hostname of the machine that httpd is running on is used instead. We will call the main server address set those IP addresses returned by a DNS lookup on the ServerName of the main server.

For any undefined ServerName fields, a name-based vhost defaults to the address given first in the VirtualHost statement defining the vhost.

Any vhost that includes the magic _default_ wildcard is given the same ServerName as the main server.

Virtual Host Matching

The server determines which vhost to use for a request as follows:

IP address lookup

When the connection is first received on some address and port, the server looks for all the VirtualHost definitions that have the same IP address and port.

If there are no exact matches for the address and port, then wildcard (*) matches are considered.

If no matches are found, the request is served by the main server.

If there are VirtualHost definitions for the IP address, the next step is to decide if we have to deal with an IP-based or a name-based vhost.

IP-based vhost

If there is exactly one VirtualHost directive listing the IP address and port combination that was determined to be the best match, no further actions are performed and the request is served from the matching vhost.
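As a rough sketch of the IP address lookup and IP-based steps above, consider the following fragment. The addresses, ports, names and paths are hypothetical and chosen only for illustration; none of them come from the original text.

```apache
Listen 80
Listen 8080

# Main server: everything outside <VirtualHost> sections.
ServerName main.example.com
DocumentRoot "/www/main"

# The only vhost listing 192.0.2.10:80, so it is IP-based:
# a connection to that address and port is served from here
# without any ServerName or ServerAlias check.
<VirtualHost 192.0.2.10:80>
    ServerName www.example.com
    DocumentRoot "/www/example"
</VirtualHost>
```

Under these assumptions, a connection to 192.0.2.10 on port 8080, or to any other local address on either port, has neither an exact nor a wildcard match, so it would fall through to the main server.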
Name-based vhost

If there are multiple VirtualHost directives listing the IP address and port combination that was determined to be the best match, the "list" in the remaining steps refers to the list of vhosts that matched, in the order they were in the configuration file.

If the connection is using SSL, the server supports Server Name Indication, and the SSL client handshake includes the TLS extension with the requested hostname, then that hostname is used below just like the Host: header would be used on a non-SSL connection. Otherwise, the first name-based vhost whose address matched is used for SSL connections. This is significant because the vhost determines which certificate the server will use for the connection.

If the request contains a Host: header field, the list is searched for the first vhost with a matching ServerName or ServerAlias, and the request is served from that vhost. A Host: header field can contain a port number, but Apache always ignores it and matches against the real port to which the client sent the request.

The first vhost in the config file with the specified IP address has the highest priority and catches any request to an unknown server name, or a request without a Host: header field (such as an HTTP/1.0 request).

Persistent connections

The IP lookup described above is only done once for a particular TCP/IP session, while the name lookup is done on every request during a KeepAlive/persistent connection. In other words, a client may request pages from different name-based vhosts during a single persistent connection.

Absolute URI

If the URI from the request is an absolute URI, and its hostname and port match the main server or one of the configured virtual hosts and match the address and port to which the client sent the request, then the scheme/hostname/port prefix is stripped off and the remaining relative URI is served by the corresponding main server or virtual host.
If it does not match, then the URI remains untouched and the request is taken to be a proxy request.

Observations

• Name-based virtual hosting is a process applied after the server has selected the best matching IP-based virtual host.
• If you don’t care what IP address the client has connected to, use a "*" as the address of every virtual host, and name-based virtual hosting is applied across all configured virtual hosts.
• ServerName and ServerAlias checks are never performed for an IP-based vhost.
• Only the ordering of name-based vhosts for a specific address set is significant. The name-based vhost that comes first in the configuration file has the highest priority for its corresponding address set.
• Any port in the Host: header field is never used during the matching process. Apache always uses the real port to which the client sent the request.
• If two vhosts have an address in common, those common addresses act as name-based virtual hosts implicitly. This is new behavior as of 2.3.11.
• The main server is only used to serve a request if the IP address and port number to which the client connected does not match any vhost (including a * vhost). In other words, the main server only catches a request for an unspecified address/port combination (unless there is a _default_ vhost which matches that port).
• You should never specify DNS names in VirtualHost directives because it will force your server to rely on DNS to boot. Furthermore it poses a security threat if you do not control the DNS for all the domains listed. There’s more information (p. 121) available on this and the next two topics.
• ServerName should always be set for each vhost. Otherwise a DNS lookup is required for each vhost.

Tips

In addition to the tips on the DNS Issues (p. 121) page, here are some further tips:

• Place all main server definitions before any VirtualHost definitions.
  (This is to aid the readability of the configuration: the post-config merging process makes it non-obvious that definitions mixed in around virtual hosts might affect all virtual hosts.)

3.7 File Descriptor Limits

When using a large number of Virtual Hosts, Apache may run out of available file descriptors (sometimes called file handles) if each Virtual Host specifies different log files. The total number of file descriptors used by Apache is one for each distinct error log file, one for every other log file directive, plus 10-20 for internal use. Unix operating systems limit the number of file descriptors that may be used by a process; the limit is typically 64, and may usually be increased up to a large hard-limit.

Although Apache attempts to increase the limit as required, this may not work if:

1. Your system does not provide the setrlimit() system call.
2. The setrlimit(RLIMIT_NOFILE) call does not function on your system (such as Solaris 2.3).
3. The number of file descriptors required exceeds the hard limit.
4. Your system imposes other limits on file descriptors, such as a limit on stdio streams only using file descriptors below 256. (Solaris 2)

In the event of problems you can:

• Reduce the number of log files; don’t specify log files in the <VirtualHost> sections, but only log to the main log files. (See Splitting up your log files, below, for more information on doing this.)
• If your system falls into 1 or 2 (above), then increase the file descriptor limit before starting Apache, using a script like

      #!/bin/sh
      ulimit -S -n 100
      exec httpd

Splitting up your log files

If you want to log multiple virtual hosts to the same log file, you may want to split up the log files afterwards in order to run statistical analysis of the various virtual hosts. This can be accomplished in the following manner.

First, you will need to add the virtual host information to the log entries.
This can be done using the LogFormat directive, and the %v variable. Add this to the beginning of your log format string:

    LogFormat "%v %h %l %u %t \"%r\" %>s %b" vhost
    CustomLog "logs/multiple_vhost_log" vhost

This will create a log file in the common log format, but with the canonical virtual host (whatever appears in the ServerName directive) prepended to each line. (See mod_log_config for more about customizing your log files.)

When you wish to split your log file into its component parts (one file per virtual host), you can use the program split-logfile (p. 336) to accomplish this. You’ll find this program in the support directory of the Apache distribution.

Run this program with the command:

    split-logfile < /logs/multiple_vhost_log

This program, when run with the name of your vhost log file, will generate one file for each virtual host that appears in your log file. Each file will be called hostname.log.

Chapter 4

URL Rewriting Guide

4.1 Apache mod_rewrite

mod_rewrite provides a way to modify incoming URL requests, dynamically, based on regular expression (p. 147) rules. This allows you to map arbitrary URLs onto your internal URL structure in any way you like.

It supports an unlimited number of rules and an unlimited number of attached rule conditions for each rule to provide a really flexible and powerful URL manipulation mechanism. The URL manipulations can depend on various tests: server variables, environment variables, HTTP headers, time stamps, external database lookups, and various other external programs or handlers can be used to achieve granular URL matching.

Rewrite rules can operate on the full URLs, including the path-info and query string portions, and may be used in per-server context (httpd.conf), per-virtualhost context (<VirtualHost> blocks), or per-directory context (.htaccess files and <Directory> blocks).
The rewritten result can lead to further rules, internal sub-processing, external request redirection, or proxy passthrough, depending on what flags (p. 178) you attach to the rules.

Since mod_rewrite is so powerful, it can indeed be rather complex. This document supplements the reference documentation (p. 867), and attempts to allay some of that complexity, and provide highly annotated examples of common scenarios that you may handle with mod_rewrite. But we also attempt to show you when you should not use mod_rewrite, and use other standard Apache features instead, thus avoiding this unnecessary complexity.

• mod_rewrite reference documentation (p. 867)
• Introduction to regular expressions and mod_rewrite (p. 147)
• RewriteRule Flags (p. 178)
• Using RewriteMap (p. 166)
• When NOT to use mod_rewrite (p. 175)
• Using mod_rewrite for redirection and remapping of URLs (p. 152)
• Using mod_rewrite to control access (p. 159)
• Dynamic virtual hosts with mod_rewrite (p. 162)
• Dynamic proxying with mod_rewrite (p. 165)
• Advanced techniques (p. 172)
• Technical details (p. 187)

See also

• mod_rewrite reference documentation (p. 867)
• Mapping URLs to the Filesystem (p. 64)
• mod_rewrite wiki (http://wiki.apache.org/httpd/Rewrite)
• Glossary (p. 1096)

4.2 Apache mod_rewrite Introduction

This document supplements the mod_rewrite reference documentation (p. 867). It describes the basic concepts necessary for use of mod_rewrite. Other documents go into greater detail, but this doc should help the beginner get their feet wet.

See also

• Module documentation (p. 867)
• Redirection and remapping (p. 152)
• Controlling access (p. 159)
• Virtual hosts (p. 162)
• Proxying (p. 165)
• Using RewriteMap (p. 166)
• Advanced techniques (p. 172)
• When not to use mod_rewrite (p. 175)

Introduction

The Apache module mod_rewrite is a very powerful and sophisticated module which provides a way to do URL manipulations.
With it, you can do nearly all types of URL rewriting that you may need. It is, however, somewhat complex, and may be intimidating to the beginner. There is also a tendency to treat rewrite rules as magic incantation, using them without actually understanding what they do.

This document attempts to give sufficient background so that what follows is understood, rather than just copied blindly.

Remember that many common URL-manipulation tasks don’t require the full power and complexity of mod_rewrite. For simple tasks, see mod_alias and the documentation on mapping URLs to the filesystem (p. 64).

Finally, before proceeding, be sure to configure mod_rewrite’s log level to one of the trace levels using the LogLevel directive. Although this can give an overwhelming amount of information, it is indispensable in debugging problems with mod_rewrite configuration, since it will tell you exactly how each rule is processed.

Regular Expressions

mod_rewrite uses the Perl Compatible Regular Expression (http://pcre.org/) vocabulary. In this document, we do not attempt to provide a detailed reference to regular expressions. For that, we recommend the PCRE man pages (http://pcre.org/pcre.txt), the Perl regular expression man page (http://perldoc.perl.org/perlre.html), and Mastering Regular Expressions, by Jeffrey Friedl (http://shop.oreilly.com/product/9780596528126.do).

In this document, we attempt to provide enough of a regex vocabulary to get you started, without being overwhelming, in the hope that RewriteRules will be scientific formulae, rather than magical incantations.

Regex vocabulary

The following are the minimal building blocks you will need, in order to write regular expressions and RewriteRules. They certainly do not represent a complete regular expression vocabulary, but they are a good place to start, and should help you read basic regular expressions, as well as write your own.

Character | Meaning | Example
. | Matches any single character | c.t will match cat, cot, cut, etc.
+ | Repeats the previous match one or more times | a+ matches a, aa, aaa, etc.
* | Repeats the previous match zero or more times | a* matches all the same things a+ matches, but will also match an empty string.
? | Makes the match optional | colou?r will match color and colour.
^ | Called an anchor, matches the beginning of the string | ^a matches a string that begins with a.
$ | The other anchor, this matches the end of the string | a$ matches a string that ends with a.
( ) | Groups several characters into a single unit, and captures a match for use in a backreference | (ab)+ matches ababab - that is, the + applies to the group. For more on backreferences see below.
[ ] | A character class - matches one of the characters | c[uoa]t matches cut, cot or cat.
[^ ] | Negative character class - matches any character not specified | c[^/]t matches cat or c=t but not c/t.

In mod_rewrite the ! character can be used before a regular expression to negate it. That is, a string will be considered to have matched only if it does not match the rest of the expression.

Regex Back-Reference Availability

One important thing here has to be remembered: Whenever you use parentheses in Pattern or in one of the CondPattern, back-references are internally created which can be used with the strings $N and %N (see below). These are available for creating the Substitution parameter of a RewriteRule or the TestString parameter of a RewriteCond.

Captures in the RewriteRule patterns are (counterintuitively) available to all preceding RewriteCond directives, because the RewriteRule expression is evaluated before the individual conditions.

Figure 1 shows to which locations the back-references are transferred for expansion as well as illustrating the flow of the RewriteRule, RewriteCond matching. In the next chapters, we will be exploring how to use these back-references, so do not fret if it seems a bit alien to you at first.
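The flow described above can be sketched with one condition and one rule. This is only an illustration: the hostname admin.example.com and the target /admin.foo are assumptions chosen for the example, and the sketch further assumes per-server context and a Host header that carries no port number.

```apache
RewriteEngine on

# The CondPattern group becomes %1 (here: the requested hostname).
RewriteCond "%{HTTP_HOST}" "^(admin\.example\.com)$"

# The Pattern groups become $1 and $2. The Pattern is matched first,
# then the RewriteCond above is evaluated, then the Substitution is
# expanded using both kinds of back-reference.
RewriteRule "^/([a-z]+)/([0-9]+)$" "/admin.foo?page=$1&id=$2&host=%1" [PT]
```

For a request to /test/1234 on that host, $1 would hold test, $2 would hold 1234, and %1 would hold admin.example.com.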
Figure 1: The back-reference flow through a rule. In this example, a request for /test/1234 would be transformed into /admin.foo?page=test&id=1234&host=admin.example.com.

RewriteRule Basics

A RewriteRule consists of three arguments separated by spaces. The arguments are

1. Pattern: which incoming URLs should be affected by the rule;
2. Substitution: where should the matching requests be sent;
3. [flags]: options affecting the rewritten request.

The Pattern is a regular expression. It is initially (for the first rewrite rule or until a substitution occurs) matched against the URL-path of the incoming request (the part after the hostname but before any question mark indicating the beginning of a query string) or, in per-directory context, against the request’s path relative to the directory for which the rule is defined. Once a substitution has occurred, the rules that follow are matched against the substituted value.

Figure 2: Syntax of the RewriteRule directive.

The Substitution can itself be one of three things:

A full filesystem path to a resource

    RewriteRule "^/games" "/usr/local/games/web"

This maps a request to an arbitrary location on your filesystem, much like the Alias directive.

A web-path to a resource

    RewriteRule "^/foo$" "/bar"

If DocumentRoot is set to /usr/local/apache2/htdocs, then this directive would map requests for http://example.com/foo to the path /usr/local/apache2/htdocs/bar.

An absolute URL

    RewriteRule "^/product/view$" "http://site2.example.com/seeproduct.html" [R]

This tells the client to make a new request for the specified URL.

The Substitution can also contain back-references to parts of the incoming URL-path matched by the Pattern. Consider the following:

    RewriteRule "^/product/(.*)/view$" "/var/web/productdb/$1"

The variable $1 will be replaced with whatever text was matched by the expression inside the parentheses in the Pattern.
For example, a request for http://example.com/product/r14df/view will be mapped to the path /var/web/productdb/r14df.

If there is more than one expression in parentheses, they are available in order in the variables $1, $2, $3, and so on.

Rewrite Flags

The behavior of a RewriteRule can be modified by the application of one or more flags to the end of the rule. For example, the matching behavior of a rule can be made case-insensitive by the application of the [NC] flag:

    RewriteRule "^puppy.html" "smalldog.html" [NC]

For more details on the available flags, their meanings, and examples, see the Rewrite Flags (p. 178) document.

Rewrite Conditions

One or more RewriteCond directives can be used to restrict the types of requests that will be subject to the following RewriteRule. The first argument is a variable describing a characteristic of the request, the second argument is a regular expression that must match the variable, and a third optional argument is a list of flags that modify how the match is evaluated.

Figure 3: Syntax of the RewriteCond directive

For example, to send all requests from a particular IP range to a different server, you could use:

    RewriteCond "%{REMOTE_ADDR}" "^10\.2\."
    RewriteRule "(.*)" "http://intranet.example.com$1"

When more than one RewriteCond is specified, they must all match for the RewriteRule to be applied. For example, to deny requests that contain the word "hack" in their query string, unless they also contain a cookie containing the word "go", you could use:

    RewriteCond "%{QUERY_STRING}" "hack"
    RewriteCond "%{HTTP_COOKIE}" !go
    RewriteRule "." "-" [F]

Notice that the exclamation mark specifies a negative match, so the rule is only applied if the cookie does not contain "go".

Matches in the regular expressions contained in the RewriteConds can be used as part of the Substitution in the RewriteRule using the variables %1, %2, etc.
For example, this will direct the request to a different directory depending on the hostname used to access the site:

    RewriteCond "%{HTTP_HOST}" "(.*)"
    RewriteRule "^/(.*)" "/sites/%1/$1"

If the request was for http://example.com/foo/bar, then %1 would contain example.com and $1 would contain foo/bar.

Rewrite maps

The RewriteMap directive provides a way to call an external function, so to speak, to do your rewriting for you. This is discussed in greater detail in the RewriteMap supplementary documentation (p. 166).

.htaccess files

Rewriting is typically configured in the main server configuration setting (outside any <Directory> section) or inside <VirtualHost> containers. This is the easiest way to do rewriting and is recommended. It is possible, however, to do rewriting inside <Directory> sections or .htaccess files (p. 249) at the expense of some additional complexity. This technique is called per-directory rewrites.

The main difference with per-server rewrites is that the path prefix of the directory containing the .htaccess file is stripped before matching in the RewriteRule. In addition, the RewriteBase should be used to assure the request is properly mapped.

4.3 Redirecting and Remapping with mod_rewrite

This document supplements the mod_rewrite reference documentation (p. 867). It describes how you can use mod_rewrite to redirect and remap requests. This includes many examples of common uses of mod_rewrite, including detailed descriptions of how each works.

Note that many of these examples won’t work unchanged in your particular server configuration, so it’s important that you understand them, rather than merely cutting and pasting the examples into your configuration.

See also

• Module documentation (p. 867)
• mod_rewrite introduction (p. 147)
• Controlling access (p. 159)
• Virtual hosts (p. 162)
• Proxying (p. 165)
• Using RewriteMap (p. 166)
• Advanced techniques (p. 172)
• When not to use mod_rewrite (p. 175)

From Old to New (internal)

Description: Assume we have recently renamed the page foo.html to bar.html and now want to provide the old URL for backward compatibility. However, we want users of the old URL not even to notice that the page was renamed - that is, we don’t want the address to change in their browser.

Solution: We rewrite the old URL to the new one internally via the following rule:

    RewriteEngine on
    RewriteRule "^/foo\.html$" "/bar.html" [PT]

Rewriting From Old to New (external)

Description: Assume again that we have recently renamed the page foo.html to bar.html and now want to provide the old URL for backward compatibility. But this time we want the users of the old URL to be pointed to the new one, i.e. their browser’s Location field should change, too.

Solution: We force an HTTP redirect to the new URL, which leads to a change in the browser’s, and thus the user’s, view:

    RewriteEngine on
    RewriteRule "^/foo\.html$" "/bar.html" [R]

Discussion

In this example, as contrasted to the internal example above, we can simply use the Redirect directive. mod_rewrite was used in that earlier example in order to hide the redirect from the client:

    Redirect "/foo.html" "/bar.html"

Resource Moved to Another Server

Description: If a resource has moved to another server, you may wish to have URLs continue to work for a time on the old server while people update their bookmarks.

Solution: You can use mod_rewrite to redirect these URLs to the new server, but you might also consider using the Redirect or RedirectMatch directive.

    #With mod_rewrite
    RewriteEngine on
    RewriteRule "^/docs/(.+)" "http://new.example.com/docs/$1" [R,L]

    #With RedirectMatch
    RedirectMatch "^/docs/(.*)" "http://new.example.com/docs/$1"

    #With Redirect
    Redirect "/docs/" "http://new.example.com/docs/"

From Static to Dynamic

Description: How can we transform a static page foo.html into a dynamic variant foo.cgi in a seamless way, i.e. without notice by the browser/user?

Solution: We just rewrite the URL to the CGI-script and force the handler to be cgi-script so that it is executed as a CGI program. This way a request to /~quux/foo.html internally leads to the invocation of /~quux/foo.cgi.

    RewriteEngine on
    RewriteBase "/~quux/"
    RewriteRule "^foo\.html$" "foo.cgi" [H=cgi-script]

Backward Compatibility for file extension change

Description: How can we make URLs backward compatible (still existing virtually) after migrating document.YYYY to document.XXXX, e.g. after translating a bunch of .html files to .php?

Solution: We rewrite the name to its basename and test for existence of the new extension. If it exists, we take that name, else we rewrite the URL to its original state.

    # backward compatibility ruleset for
    # rewriting document.html to document.php
    # when and only when document.php exists
    <Directory "/var/www/htdocs">
        RewriteEngine on
        RewriteBase "/var/www/htdocs"

        RewriteCond "$1.php" -f
        RewriteCond "$1.html" !-f
        RewriteRule "^(.*).html$" "$1.php"
    </Directory>

Discussion

This example uses an often-overlooked feature of mod_rewrite, by taking advantage of the order of execution of the ruleset. In particular, mod_rewrite evaluates the left-hand-side of the RewriteRule before it evaluates the RewriteCond directives. Consequently, $1 is already defined by the time the RewriteCond directives are evaluated. This allows us to test for the existence of the original (document.html) and target (document.php) files using the same base filename.

This ruleset is designed for use in a per-directory context (in a <Directory> block or in a .htaccess file), so that the -f checks are looking at the correct directory path. You may need to set a RewriteBase directive to specify the directory base that you’re working in.

Canonical Hostnames

Description: The goal of this rule is to force the use of a particular hostname, in preference to other hostnames which may be used to reach the same site.
For example, if you wish to force the use of www.example.com instead of example.com, you might use a variant of the following recipe.

Solution: The very best way to solve this doesn’t involve mod_rewrite at all, but rather uses the Redirect directive placed in a virtual host for the non-canonical hostname(s).

    <VirtualHost *:80>
        ServerName undesired.example.com
        ServerAlias example.com notthis.example.com

        Redirect "/" "http://www.example.com/"
    </VirtualHost>

    <VirtualHost *:80>
        ServerName www.example.com
    </VirtualHost>

You can alternatively accomplish this using the <If> directive: (2.4 and later)

    <If "%{HTTP_HOST} != 'www.example.com'">
        Redirect "/" "http://www.example.com/"
    </If>

Or, for example, to redirect a portion of your site to HTTPS, you might do the following:

    Redirect "/admin/" "https://www.example.com/admin/"

If, for whatever reason, you still want to use mod_rewrite - if, for example, you need this to work with a larger set of RewriteRules - you might use one of the recipes below.

For sites running on a port other than 80:

    RewriteCond "%{HTTP_HOST}" "!^www\.example\.com" [NC]
    RewriteCond "%{HTTP_HOST}" "!^$"
    RewriteCond "%{SERVER_PORT}" "!^80$"
    RewriteRule "^/?(.*)" "http://www.example.com:%{SERVER_PORT}/$1" [L,R,NE]

And for a site running on port 80

    RewriteCond "%{HTTP_HOST}" "!^www\.example\.com" [NC]
    RewriteCond "%{HTTP_HOST}" "!^$"
    RewriteRule "^/?(.*)" "http://www.example.com/$1" [L,R,NE]

If you wanted to do this generically for all domain names - that is, if you want to redirect example.com to www.example.com for all possible values of example.com, you could use the following recipe:

    RewriteCond "%{HTTP_HOST}" "!^www\." [NC]
    RewriteCond "%{HTTP_HOST}" "!^$"
    RewriteRule "^/?(.*)" "http://www.%{HTTP_HOST}/$1" [L,R,NE]

These rulesets will work either in your main server configuration file, or in a .htaccess file placed in the DocumentRoot of the server.
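The recipes above use a bare [R] flag, which by default answers with a temporary (302) redirect. As a hedged variation, and assuming that a permanent redirect to the canonical name is actually wanted (an assumption made for this sketch, not something the recipes require), the R flag can carry an explicit status code:

```apache
RewriteEngine on

# Skip requests that already use the canonical name, and requests
# that carry no Host header at all.
RewriteCond "%{HTTP_HOST}" "!^www\.example\.com" [NC]
RewriteCond "%{HTTP_HOST}" "!^$"

# R=301 asks for a permanent redirect instead of the default 302.
RewriteRule "^/?(.*)" "http://www.example.com/$1" [L,R=301,NE]
```

Clients may cache a permanent redirect for a long time, so it is usually worth testing with the temporary form first.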
Search for pages in more than one directory

Description: A particular resource might exist in one of several places, and we want to look in those places for the resource when it is requested. Perhaps we’ve recently rearranged our directory structure, dividing content into several locations.

Solution: The following ruleset searches in two directories to find the resource, and, if not finding it in either place, will attempt to just serve it out of the location requested.

    RewriteEngine on

    # first try to find it in dir1/...
    # ...and if found stop and be happy:
    RewriteCond "%{DOCUMENT_ROOT}/dir1/%{REQUEST_URI}" -f
    RewriteRule "^(.+)" "%{DOCUMENT_ROOT}/dir1/$1" [L]

    # second try to find it in dir2/...
    # ...and if found stop and be happy:
    RewriteCond "%{DOCUMENT_ROOT}/dir2/%{REQUEST_URI}" -f
    RewriteRule "^(.+)" "%{DOCUMENT_ROOT}/dir2/$1" [L]

    # else go on for other Alias or ScriptAlias directives,
    # etc.
    RewriteRule "^" "-" [PT]

Redirecting to Geographically Distributed Servers

Description: We have numerous mirrors of our website, and want to redirect people to the one that is located in the country where they are located.

Solution: Looking at the hostname of the requesting client, we determine which country they are coming from. If we can’t do a lookup on their IP address, we fall back to a default server.

We’ll use a RewriteMap directive to build a list of servers that we wish to use.

    HostnameLookups on
    RewriteEngine on
    RewriteMap multiplex "txt:/path/to/map.mirrors"
    RewriteCond "%{REMOTE_HOST}" "([a-z]+)$" [NC]
    RewriteRule "^/(.*)$" "${multiplex:%1|http://www.example.com/}$1" [R,L]

    ## map.mirrors -- Multiplexing Map
    de http://www.example.de/
    uk http://www.example.uk/
    com http://www.example.com/
    ##EOF##

Discussion

This ruleset relies on HostnameLookups being set on, which can be a significant performance hit.
The RewriteCond directive captures the last portion of the hostname of the requesting client - the country code - and the following RewriteRule uses that value to look up the appropriate mirror host in the map file.

Canonical URLs

Description: On some webservers there is more than one URL for a resource. Usually there are canonical URLs (which are actually used and distributed) and those which are just shortcuts, internal ones, and so on. Independent of which URL the user supplied with the request, they should finally see the canonical one in their browser address bar.

Solution: We do an external HTTP redirect for all non-canonical URLs to fix them in the location view of the browser and for all subsequent requests. In the example ruleset below we replace /puppies and /canines by the canonical /dogs.

    RewriteRule "^/(puppies|canines)/(.*)" "/dogs/$2" [R]

Discussion: This should really be accomplished with Redirect or RedirectMatch directives:

    RedirectMatch "^/(puppies|canines)/(.*)" "/dogs/$2"

Moved DocumentRoot

Description: Usually the DocumentRoot of the webserver directly relates to the URL "/". But often this data is not really of top-level priority. For example, you may wish for visitors, on first entering a site, to go to a particular subdirectory /about/. This may be accomplished using the following ruleset:

Solution: We redirect the URL / to /about/:

    RewriteEngine on
    RewriteRule "^/$" "/about/" [R]

Note that this can also be handled using the RedirectMatch directive:

    RedirectMatch "^/$" "http://example.com/about/"

Note also that the example rewrites only the root URL. That is, it rewrites a request for http://example.com/, but not a request for http://example.com/page.html. If you have in fact changed your document root - that is, if all of your content is in fact in that subdirectory, it is greatly preferable to simply change your DocumentRoot directive, or move all of the content up one directory, rather than rewriting URLs.
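As a rough sketch of that preferred alternative, the fragment below points DocumentRoot at the subdirectory itself. The filesystem path is a hypothetical one chosen for illustration; substitute the directory where the content actually lives.

```apache
# Hypothetical path: the former /about/ subdirectory becomes the
# document root, so no rewrite or redirect rule is needed.
DocumentRoot "/usr/local/apache2/htdocs/about"

<Directory "/usr/local/apache2/htdocs/about">
    # Permit the server to serve files from the new document root
    Require all granted
</Directory>
```

With this arrangement the content is served directly at / rather than being reached through a redirect to /about/.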
Fallback Resource

Description: You want a single resource (say, a certain file, like index.php) to handle all requests that come to a particular directory, except those that should go to an existing resource such as an image, or a css file.

Solution: As of version 2.2.16, you should use the FallbackResource directive for this:

    <Directory "/var/www/my_blog">
        FallbackResource index.php
    </Directory>

However, in earlier versions of Apache, or if your needs are more complicated than this, you can use a variation of the following rewrite set to accomplish the same thing:

    <Directory "/var/www/my_blog">
        RewriteBase "/my_blog"

        RewriteCond "/var/www/my_blog/%{REQUEST_FILENAME}" !-f
        RewriteCond "/var/www/my_blog/%{REQUEST_FILENAME}" !-d
        RewriteRule "^" "index.php" [PT]
    </Directory>

If, on the other hand, you wish to pass the requested URI as a query string argument to index.php, you can replace that RewriteRule with:

    RewriteRule "(.*)" "index.php?$1" [PT,QSA]

Note that these rulesets can be used in a .htaccess file, as well as in a <Directory> block.

Rewrite query string

Description: You want to capture a particular value from a query string and either replace it or incorporate it into another component of the URL.

Solutions: Many of the solutions in this section use the same condition, which leaves the matched value in the %2 backreference. %1 is the beginning of the query string (up to the key of interest), and %3 is the remainder. This condition is a bit complex for flexibility and to avoid double ’&&’ in the substitutions.

• This solution removes the matching key and value:

    # Remove mykey=???
    RewriteCond "%{QUERY_STRING}" "(.*(?:^|&))mykey=([^&]*)&?(.*)&?$"
    RewriteRule "(.*)" "$1?%1%3"

• This solution uses the captured value in the URL substitution, discarding the rest of the original query by appending a ’?’:

    # Copy from query string to PATH_INFO
    RewriteCond "%{QUERY_STRING}" "(.*(?:^|&))mykey=([^&]*)&?(.*)&?$"
    RewriteRule "(.*)" "$1/products/%2/?" [PT]

• This solution checks the captured value in a subsequent condition:

    # Capture the value of mykey in the query string
    RewriteCond "%{QUERY_STRING}" "(.*(?:^|&))mykey=([^&]*)&?(.*)&?$"
    RewriteCond "%2" !=not-so-secret-value
    RewriteRule "(.*)" "-" [F]

• This solution shows the reverse of the previous ones, copying path components (perhaps PATH_INFO) from the URL into the query string.

    # The desired URL might be /products/kitchen-sink, and the script expects
    # /path?products=kitchen-sink.
    RewriteRule "^/?path/([^/]+)/([^/]+)" "/path?$1=$2" [PT]

4.4 Using mod_rewrite to control access

This document supplements the mod_rewrite reference documentation (p. 867). It describes how you can use mod_rewrite to control access to various resources, and other related techniques. This includes many examples of common uses of mod_rewrite, including detailed descriptions of how each works.

Note that many of these examples won’t work unchanged in your particular server configuration, so it’s important that you understand them, rather than merely cutting and pasting the examples into your configuration.

See also

• Module documentation (p. 867)
• mod_rewrite introduction (p. 147)
• Redirection and remapping (p. 152)
• Virtual hosts (p. 162)
• Proxying (p. 165)
• Using RewriteMap (p. 166)
• Advanced techniques (p. 172)
• When not to use mod_rewrite (p. 175)

Forbidding Image "Hotlinking"

Description: The following technique forbids the practice of other sites including your images inline in their pages. This practice is often referred to as "hotlinking", and results in your bandwidth being used to serve content for someone else’s site.

Solution: This technique relies on the value of the HTTP_REFERER variable, which is optional. As such, it’s possible for some people to circumvent this limitation.
However, most users will experience the failed request, which should, over time, result in the image being removed from that other site.

There are several ways that you can handle this situation.

In this first example, we simply deny the request, if it didn’t initiate from a page on our site. For the purpose of this example, we assume that our site is www.example.com.

    RewriteCond "%{HTTP_REFERER}" "!^$"
    RewriteCond "%{HTTP_REFERER}" "!www.example.com" [NC]
    RewriteRule "\.(gif|jpg|png)$" "-" [F,NC]

In this second example, instead of failing the request, we display an alternate image.

    RewriteCond "%{HTTP_REFERER}" "!^$"
    RewriteCond "%{HTTP_REFERER}" "!www.example.com" [NC]
    RewriteRule "\.(gif|jpg|png)$" "/images/go-away.png" [R,NC]

In the third example, we redirect the request to an image on some other site.

    RewriteCond "%{HTTP_REFERER}" "!^$"
    RewriteCond "%{HTTP_REFERER}" "!www.example.com" [NC]
    RewriteRule "\.(gif|jpg|png)$" "http://other.example.com/image.gif" [R,NC]

Of these techniques, the last two tend to be the most effective in getting people to stop hotlinking your images, because they will simply not see the image that they expected to see.

Discussion: If all you wish to do is deny access to the resource, rather than redirecting that request elsewhere, this can be accomplished without the use of mod_rewrite:

    SetEnvIf Referer example\.com localreferer
    Require env localreferer

Blocking of Robots

Description: In this recipe, we discuss how to block persistent requests from a particular robot, or user agent.

The standard for robot exclusion defines a file, /robots.txt, that specifies those portions of your website where you wish to exclude robots. However, some robots do not honor these files.

Note that there are methods of accomplishing this which do not use mod_rewrite. Note also that any technique that relies on the client’s USER_AGENT string can be circumvented very easily, since that string can be changed.
Solution: We use a ruleset that specifies the directory to be protected, and the client USER_AGENT that identifies the malicious or persistent robot.

In this example, we are blocking a robot called NameOfBadRobot from a location /secret/files. You may also specify an IP address range, if you are trying to block that user agent only from the particular source.

    RewriteCond "%{HTTP_USER_AGENT}" "^NameOfBadRobot"
    # regular expression match on the address range; a leading "=" would
    # request a literal string comparison and the range would never match
    RewriteCond "%{REMOTE_ADDR}" "^123\.45\.67\.[8-9]$"
    RewriteRule "^/secret/files/" "-" [F]

Discussion: Rather than using mod_rewrite for this, you can accomplish the same end using alternate means, as illustrated here:

    SetEnvIfNoCase User-Agent ^NameOfBadRobot goaway
    <Location "/secret/files">
        <RequireAll>
            Require all granted
            Require not env goaway
        </RequireAll>
    </Location>

As noted above, this technique is trivial to circumvent, by simply modifying the USER_AGENT request header. If you are experiencing a sustained attack, you should consider blocking it at a higher level, such as at your firewall.

Denying Hosts in a Blacklist

Description: We wish to maintain a blacklist of hosts, rather like hosts.deny, and have those hosts blocked from accessing our server.

Solution:

    RewriteEngine on
    RewriteMap hosts-deny "txt:/path/to/hosts.deny"
    RewriteCond "${hosts-deny:%{REMOTE_ADDR}|NOT-FOUND}" "!=NOT-FOUND" [OR]
    RewriteCond "${hosts-deny:%{REMOTE_HOST}|NOT-FOUND}" "!=NOT-FOUND"
    RewriteRule "^" "-" [F]

    ##
    ## hosts.deny
    ##
    ## ATTENTION! This is a map, not a list, even when we treat it as such.
    ## mod_rewrite parses it for key/value pairs, so at least a
    ## dummy value "-" must be present for each entry.
    ##

    193.102.180.41 -
    bsdti1.sdm.de -
    192.76.162.40 -

Discussion: The second RewriteCond assumes that you have HostnameLookups turned on, so that client IP addresses will be resolved. If that’s not the case, you should drop the second RewriteCond, and drop the [OR] flag from the first RewriteCond.
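The reduced ruleset described in the discussion might look like the following sketch. It reuses the map name and the placeholder path /path/to/hosts.deny from the solution above and assumes HostnameLookups is off.

```apache
RewriteEngine on
RewriteMap hosts-deny "txt:/path/to/hosts.deny"

# Only the client IP address is looked up in the map, so no reverse
# DNS lookup is required. The [OR] flag is gone because this is now
# the only condition.
RewriteCond "${hosts-deny:%{REMOTE_ADDR}|NOT-FOUND}" "!=NOT-FOUND"
RewriteRule "^" "-" [F]
```

In this variant, entries in the map that are keyed by hostname rather than by IP address would no longer be matched.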
Referer-based Deflector

Description: Redirect requests based on the Referer from which the request came, with different targets per Referer.

Solution: The following ruleset uses a map file to associate each Referer with a redirection target.

    RewriteMap deflector "txt:/path/to/deflector.map"

    RewriteCond "%{HTTP_REFERER}" !=""
    RewriteCond "${deflector:%{HTTP_REFERER}}" =-
    RewriteRule "^" "%{HTTP_REFERER}" [R,L]

    RewriteCond "%{HTTP_REFERER}" !=""
    RewriteCond "${deflector:%{HTTP_REFERER}|NOT-FOUND}" "!=NOT-FOUND"
    RewriteRule "^" "${deflector:%{HTTP_REFERER}}" [R,L]

The map file lists redirection targets for each referer, or, if we just wish to redirect back to where they came from, a "-" is placed in the map:

    ##
    ## deflector.map
    ##

    http://badguys.example.com/bad/index.html -
    http://badguys.example.com/bad/index2.html -
    http://badguys.example.com/bad/index3.html http://somewhere.example.com/

4.5 Dynamic mass virtual hosts with mod_rewrite

This document supplements the mod_rewrite reference documentation (p. 867). It describes how you can use mod_rewrite to create dynamically configured virtual hosts.

mod_rewrite is usually not the best way to configure virtual hosts. You should first consider the alternatives (p. 130) before resorting to mod_rewrite. See also the "how to avoid mod_rewrite" (p. 175) document.

See also

• Module documentation (p. 867)
• mod_rewrite introduction (p. 147)
• Redirection and remapping (p. 152)
• Controlling access (p. 159)
• Proxying (p. 165)
• RewriteMap (p. 166)
• Advanced techniques (p. 172)
• When not to use mod_rewrite (p. 175)

Virtual Hosts For Arbitrary Hostnames

Description: We want to automatically create a virtual host for every hostname which resolves in our domain, without having to create new VirtualHost sections.

In this recipe, we assume that we’ll be using the hostname SITE.example.com for each user, and serve their content out of /home/SITE/www.
However, we want www.example.com to be omitted from this mapping.

Solution:

    RewriteEngine on

    RewriteMap lowercase int:tolower

    RewriteCond %{HTTP_HOST} !^www\.
    RewriteCond ${lowercase:%{HTTP_HOST}} ^([^.]+)\.example\.com$
    RewriteRule ^(.*) /home/%1/www$1

Discussion

You will need to take care of the DNS resolution - Apache does not handle name resolution. You’ll need either to create CNAME records for each hostname, or a DNS wildcard record. Creating DNS records is beyond the scope of this document.

The internal tolower RewriteMap directive is used to ensure that the hostnames being used are all lowercase, so that there is no ambiguity in the directory structure which must be created.

Parentheses used in a RewriteCond are captured into the backreferences %1, %2, etc, while parentheses used in RewriteRule are captured into the backreferences $1, $2, etc.

The first RewriteCond checks to see if the hostname starts with www., and if it does, the rewriting is skipped.

As with many techniques discussed in this document, mod_rewrite really isn’t the best way to accomplish this task. You should, instead, consider using mod_vhost_alias, as it will much more gracefully handle anything beyond serving static files, such as any dynamic content, and Alias resolution.

Dynamic Virtual Hosts Using mod_rewrite

This extract from httpd.conf does the same thing as the first example. The first half is very similar to the corresponding part above, except for some changes, required for backward compatibility and to make the mod_rewrite part work properly; the second half configures mod_rewrite to do the actual work.

Because mod_rewrite runs before other URI translation modules (e.g., mod_alias), mod_rewrite must be told to explicitly ignore any URLs that would have been handled by those modules.
And, because these rules would otherwise bypass any ScriptAlias directives, we must have mod_rewrite explicitly enact those mappings.

    # get the server name from the Host: header
    UseCanonicalName Off

    # splittable logs
    LogFormat "%{Host}i %h %l %u %t \"%r\" %s %b" vcommon
    CustomLog "logs/access_log" vcommon

    # ExecCGI is needed here because we can't force
    # CGI execution in the way that ScriptAlias does
    Options FollowSymLinks ExecCGI

    RewriteEngine On

    # a ServerName derived from a Host: header may be any case at all
    RewriteMap lowercase "int:tolower"

    ## deal with normal documents first:
    # allow Alias /icons/ to work - repeat for other aliases
    RewriteCond "%{REQUEST_URI}" "!^/icons/"
    # allow CGIs to work
    RewriteCond "%{REQUEST_URI}" "!^/cgi-bin/"
    # do the magic
    RewriteRule "^/(.*)$" "/www/hosts/${lowercase:%{SERVER_NAME}}/docs/$1"

    ## and now deal with CGIs - we have to force a handler
    RewriteCond "%{REQUEST_URI}" "^/cgi-bin/"
    RewriteRule "^/(.*)$" "/www/hosts/${lowercase:%{SERVER_NAME}}/cgi-bin/$1" [H=cgi-script]

Using a Separate Virtual Host Configuration File

This arrangement uses more advanced mod_rewrite features to work out the translation from virtual host to document root, from a separate configuration file. This provides more flexibility, but requires more complicated configuration.

The vhost.map file should look something like this:

    customer-1.example.com /www/customers/1
    customer-2.example.com /www/customers/2
    # ...
    customer-N.example.com /www/customers/N

The httpd.conf should contain the following:

    RewriteEngine on

    RewriteMap   lowercase  "int:tolower"

    # define the map file
    RewriteMap   vhost      "txt:/www/conf/vhost.map"

    # deal with aliases as above
    RewriteCond  "%{REQUEST_URI}"               "!^/icons/"
    RewriteCond  "%{REQUEST_URI}"               "!^/cgi-bin/"
    RewriteCond  "${lowercase:%{SERVER_NAME}}"  "^(.+)$"
    # this does the file-based remap
    RewriteCond  "${vhost:%1}"                  "^(/.*)$"
    RewriteRule  "^/(.*)$"                      "%1/docs/$1"

    RewriteCond  "%{REQUEST_URI}"               "^/cgi-bin/"
    RewriteCond  "${lowercase:%{SERVER_NAME}}"  "^(.+)$"
    RewriteCond  "${vhost:%1}"                  "^(/.*)$"
    RewriteRule  "^/(.*)$"                      "%1/cgi-bin/$1" [H=cgi-script]

4.6 Using mod_rewrite for Proxying

This document supplements the mod_rewrite reference documentation (p. 867). It describes how to use the RewriteRule [P] flag to proxy content to another server. A number of recipes are provided that describe common scenarios.

See also
• Module documentation (p. 867)
• mod_rewrite introduction (p. 147)
• Redirection and remapping (p. 152)
• Controlling access (p. 159)
• Virtual hosts (p. 162)
• Using RewriteMap (p. 166)
• Advanced techniques (p. 172)
• When not to use mod_rewrite (p. 175)

Proxying Content with mod_rewrite

Description: mod_rewrite provides the [P] flag, which allows URLs to be passed, via mod_proxy, to another server. Two examples are given here. In one example, a URL is passed directly to another server, and served as though it were a local URL. In the other example, we proxy missing content to a back-end server.

Solution: To simply map a URL to another server, we use the [P] flag, as follows:

    RewriteEngine on
    RewriteBase "/products/"
    RewriteRule "^widget/(.*)$" "http://product.example.com/widget/$1" [P]
    ProxyPassReverse "/products/widget/" "http://product.example.com/widget/"

In the second example, we proxy the request only if we can’t find the resource locally. This can be very useful when you’re migrating from one server to another, and you’re not sure if all the content has been migrated yet.
    RewriteCond "%{REQUEST_FILENAME}" !-f
    RewriteCond "%{REQUEST_FILENAME}" !-d
    RewriteRule "^/(.*)" "http://old.example.com/$1" [P]
    ProxyPassReverse "/" "http://old.example.com/"

Discussion: In each case, we add a ProxyPassReverse directive to ensure that any redirects issued by the backend are correctly passed on to the client.

Consider using either ProxyPass or ProxyPassMatch whenever possible in preference to mod_rewrite.

4.7 Using RewriteMap

This document supplements the mod_rewrite reference documentation (p. 867). It describes the use of the RewriteMap directive, and provides examples of each of the various RewriteMap types.

Warning: Note that many of these examples won’t work unchanged in your particular server configuration, so it’s important that you understand them, rather than merely cutting and pasting the examples into your configuration.

See also
• Module documentation (p. 867)
• mod_rewrite introduction (p. 147)
• Redirection and remapping (p. 152)
• Controlling access (p. 159)
• Virtual hosts (p. 162)
• Proxying (p. 165)
• Advanced techniques (p. 172)
• When not to use mod_rewrite (p. 175)

Introduction

The RewriteMap directive defines an external function which can be called in the context of RewriteRule or RewriteCond directives to perform rewriting that is too complicated, or too specialized, to be performed just by regular expressions. The source of this lookup can be any of the types listed in the sections below, and enumerated in the RewriteMap reference documentation.

The syntax of the RewriteMap directive is as follows:

    RewriteMap MapName MapType:MapSource

The MapName is an arbitrary name that you assign to the map, and which you will use in directives later on. Arguments are passed to the map via the following syntax:

    ${ MapName : LookupKey }
    ${ MapName : LookupKey | DefaultValue }

When such a construct occurs, the map MapName is consulted and the key LookupKey is looked up.
If the key is found, the map-function construct is substituted by SubstValue. If the key is not found then it is substituted by DefaultValue or by the empty string if no DefaultValue was specified.

For example, you can define a RewriteMap as:

    RewriteMap examplemap "txt:/path/to/file/map.txt"

You would then be able to use this map in a RewriteRule as follows:

    RewriteRule "^/ex/(.*)" "${examplemap:$1}"

A default value can be specified in the event that nothing is found in the map:

    RewriteRule "^/ex/(.*)" "${examplemap:$1|/not_found.html}"

Note: Per-directory and .htaccess context
The RewriteMap directive may not be used in <Directory> sections or .htaccess files. You must declare the map in server or virtualhost context. You may use the map, once created, in your RewriteRule and RewriteCond directives in those scopes. You just can’t declare it in those scopes.

The sections that follow describe the various MapTypes that may be used, and give examples of each.

int: Internal Function

When a MapType of int is used, the MapSource is one of the available internal RewriteMap functions. Module authors can provide additional internal functions by registering them with the ap_register_rewrite_mapfunc API. The functions that are provided by default are:

• toupper: Converts the key to all upper case.
• tolower: Converts the key to all lower case.
• escape: Translates special characters in the key to hex-encodings.
• unescape: Translates hex-encodings in the key back to special characters.

To use one of these functions, create a RewriteMap referencing the int function, and then use that in your RewriteRule:

Redirect a URI to an all-lowercase version of itself

    RewriteMap lc int:tolower
    RewriteRule "(.*)" "${lc:$1}" [R]

Note: Please note that the example offered here is for illustration purposes only, and is not a recommendation. If you want to make URLs case-insensitive, consider using mod_speling instead.

txt: Plain text maps

When a MapType of txt is used, the MapSource is a filesystem path to a plain-text mapping file, containing one space-separated key/value pair per line. Optionally, a line may contain a comment, starting with a ’#’ character.

A valid text rewrite map file will have the following syntax:

    # Comment line
    MatchingKey SubstValue
    MatchingKey SubstValue # comment

When the RewriteMap is invoked, the argument is looked for in the first field of each line, and, if found, the substitution value is returned.

For example, we can use a mapfile to translate product names to product IDs for easier-to-remember URLs, using the following recipe:

Product to ID configuration

    RewriteMap product2id "txt:/etc/apache2/productmap.txt"
    RewriteRule "^/product/(.*)" "/prods.php?id=${product2id:$1|NOTFOUND}" [PT]

We assume here that the prods.php script knows what to do when it receives an argument of id=NOTFOUND, which happens when a product is not found in the lookup map.

The file /etc/apache2/productmap.txt then contains the following:

Product to ID map

    ##
    ##  productmap.txt - Product to ID map file
    ##

    television 993
    stereo     198
    fishingrod 043
    basketball 418
    telephone  328

Thus, when http://example.com/product/television is requested, the RewriteRule is applied, and the request is internally mapped to /prods.php?id=993.

Note: .htaccess files
The example given is crafted to be used in server or virtualhost scope. If you’re planning to use this in a .htaccess file, you’ll need to remove the leading slash from the rewrite pattern in order for it to match anything:

    RewriteRule "^product/(.*)" "/prods.php?id=${product2id:$1|NOTFOUND}" [PT]

Note: Cached lookups
The looked-up keys are cached by httpd until the mtime (modified time) of the mapfile changes, or the httpd server is restarted. This ensures better performance on maps that are called by many requests.
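The lookup-with-default behaviour of a txt map can be mimicked outside httpd when you want to check a map file before deploying it. The following is only a rough sketch in Python, not httpd's implementation: the function names parse_txt_map and lookup are made up for this illustration, and the choice that the first matching line wins is an assumption of the sketch.

```python
PRODUCT_MAP = """\
##
##  productmap.txt - Product to ID map file
##

television 993
stereo     198
fishingrod 043
basketball 418
telephone  328
"""


def parse_txt_map(text):
    """Parse the contents of a txt: map file into a dict of key -> value."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0]  # drop a full-line or trailing comment
        fields = line.split()
        if len(fields) >= 2:
            # Assumption of this sketch: the first line for a key wins.
            table.setdefault(fields[0], fields[1])
    return table


def lookup(table, key, default=""):
    """Mimic ${map:key|default}: the value if found, else the default."""
    return table.get(key, default)


table = parse_txt_map(PRODUCT_MAP)
# Equivalent of "/prods.php?id=${product2id:$1|NOTFOUND}" for two requests.
found = "/prods.php?id=" + lookup(table, "television", "NOTFOUND")
missing = "/prods.php?id=" + lookup(table, "toaster", "NOTFOUND")
```

Values are kept as strings, so an ID such as 043 keeps its leading zero, just as it would when substituted into the rewritten URL.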
rnd: Randomized Plain Text

When a MapType of rnd is used, the MapSource is a filesystem path to a plain-text mapping file, each line of which contains a key, and one or more values separated by |. One of these values will be chosen at random if the key is matched.

For example, you can use the following map file and directives to provide a random load balancing between several back-end servers, via a reverse-proxy. Images are sent to one of the servers in the ’static’ pool, while everything else is sent to one of the ’dynamic’ pool.

Rewrite map file

    ##
    ##  map.txt -- rewriting map
    ##

    static   www1|www2|www3|www4
    dynamic  www5|www6

Configuration directives

    RewriteMap servers "rnd:/path/to/file/map.txt"

    RewriteRule "^/(.*\.(png|gif|jpg))" "http://${servers:static}/$1" [NC,P,L]
    RewriteRule "^/(.*)" "http://${servers:dynamic}/$1" [P,L]

So, when an image is requested and the first of these rules is matched, RewriteMap looks up the string static in the map file, which returns one of the specified hostnames at random, which is then used in the RewriteRule target.

If you wanted to have one of the servers more likely to be chosen (for example, if one of the servers has more memory than the others, and so can handle more requests) simply list it more times in the map file.

    static   www1|www1|www2|www3|www4

dbm: DBM Hash File

When a MapType of dbm is used, the MapSource is a filesystem path to a DBM database file containing key/value pairs to be used in the mapping. This works exactly the same way as the txt map, but is much faster, because a DBM is indexed, whereas a text file is not. This allows more rapid access to the desired key.

You may optionally specify a particular dbm type:

    RewriteMap examplemap "dbm=sdbm:/etc/apache/mapfile.dbm"

The type can be sdbm, gdbm, ndbm or db. However, it is recommended that you just use the httxt2dbm (p.
328) utility that is provided with Apache HTTP Server, as it will use the correct DBM library, matching the one that was used when httpd itself was built.

To create a dbm file, first create a text map file as described in the txt section. Then run httxt2dbm:

    $ httxt2dbm -i mapfile.txt -o mapfile.map

You can then reference the resulting file in your RewriteMap directive:

    RewriteMap mapname "dbm:/etc/apache/mapfile.map"

Note: With some dbm types, more than one file is generated, with a common base name. For example, you may have two files named mapfile.map.dir and mapfile.map.pag. This is normal, and you need only use the base name mapfile.map in your RewriteMap directive.

Note: Cached lookups
The looked-up keys are cached by httpd until the mtime (modified time) of the mapfile changes, or the httpd server is restarted. This ensures better performance on maps that are called by many requests.

prg: External Rewriting Program

When a MapType of prg is used, the MapSource is a filesystem path to an executable program which will provide the mapping behavior. This can be a compiled binary file, or a program in an interpreted language such as Perl or Python.

This program is started once, when the Apache HTTP Server is started, and then communicates with the rewriting engine via STDIN and STDOUT. That is, for each map function lookup, it expects one argument via STDIN, and should return one new-line terminated response string on STDOUT. If there is no corresponding lookup value, the map program should return the four-character string "NULL" to indicate this.

External rewriting programs are not started if they’re defined in a context that does not have RewriteEngine set to on.

By default, external rewriting programs are run as the user:group who started httpd. This can be changed on UNIX systems by passing user name and group name as third argument to RewriteMap in the username:groupname format.
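As an illustration of the STDIN/STDOUT protocol described above, here is a minimal sketch of a map program written in Python. The function names translate and serve are hypothetical, and treating an empty key as "no value" is an assumption made only to show the "NULL" reply.

```python
import sys


def translate(key):
    """Return the key with dashes replaced by underscores, or None if empty."""
    return key.replace("-", "_") if key else None


def serve(stdin, stdout):
    """Answer one lookup per input line until the input stream closes."""
    for line in stdin:
        result = translate(line.rstrip("\n"))
        # "NULL" is the reply that signals "no corresponding lookup value".
        stdout.write((result if result is not None else "NULL") + "\n")
        stdout.flush()  # reply immediately; buffered output would stall httpd


# When this file is installed as the map program, end it with:
#     serve(sys.stdin, sys.stdout)
```

Passing the streams as arguments keeps the loop easy to exercise by hand with in-memory streams before the program is wired into a RewriteMap directive.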
This feature utilizes the rewrite-map mutex, which is required for reliable communication with the program. The mutex mechanism and lock file can be configured with the Mutex directive.

A simple example is shown here which will replace all dashes with underscores in a request URI.

Rewrite configuration

    RewriteMap d2u "prg:/www/bin/dash2under.pl" apache:apache
    RewriteRule "-" "${d2u:%{REQUEST_URI}}"

dash2under.pl

    #!/usr/bin/perl
    $| = 1; # Turn off I/O buffering
    while (<STDIN>) {
        s/-/_/g; # Replace dashes with underscores
        print $_;
    }

Caution!
• Keep your rewrite map program as simple as possible. If the program hangs, it will cause httpd to wait indefinitely for a response from the map, which will, in turn, cause httpd to stop responding to requests.
• Be sure to turn off buffering in your program. In Perl this is done by the second line in the example script: $| = 1; This will of course vary in other languages. Buffered I/O will cause httpd to wait for the output, and so it will hang.
• Remember that there is only one copy of the program, started at server startup. All requests will need to go through this one bottleneck. This can cause significant slowdowns if many requests must go through this process, or if the script itself is very slow.

dbd or fastdbd: SQL Query

When a MapType of dbd or fastdbd is used, the MapSource is a SQL SELECT statement that takes a single argument and returns a single value.

mod_dbd will need to be configured to point at the right database for this statement to be executed.

There are two forms of this MapType. Using a MapType of dbd causes the query to be executed with each map request, while using fastdbd caches the database lookups internally. So, while fastdbd is more efficient, and therefore faster, it won’t pick up on changes to the database until the server is restarted.

If a query returns more than one row, a random row from the result set is used.
Example

    RewriteMap myquery "fastdbd:SELECT destination FROM rewrite WHERE source = %s"

Summary

The RewriteMap directive can occur more than once. For each mapping function, use one RewriteMap directive to declare its rewriting mapfile.

While you cannot declare a map in per-directory context (.htaccess files or <Directory> blocks), it is possible to use this map in per-directory context.

4.8 Advanced Techniques with mod_rewrite

This document supplements the mod_rewrite reference documentation (p. 867). It provides a few advanced techniques using mod_rewrite.

Warning: Note that many of these examples won’t work unchanged in your particular server configuration, so it’s important that you understand them, rather than merely cutting and pasting the examples into your configuration.

See also
• Module documentation (p. 867)
• mod_rewrite introduction (p. 147)
• Redirection and remapping (p. 152)
• Controlling access (p. 159)
• Virtual hosts (p. 162)
• Proxying (p. 165)
• Using RewriteMap (p. 166)
• When not to use mod_rewrite (p. 175)

URL-based sharding across multiple backends

Description: A common technique for distributing the burden of server load or storage space is called "sharding". When using this method, a front-end server will use the URL to consistently "shard" users or objects to separate backend servers.

Solution: A mapping is maintained, from users to target servers, in external map files. They look like:

    user1  physical_host_of_user1
    user2  physical_host_of_user2
    :      :

We put this into a map.users-to-hosts file. The aim is to map

    /u/user1/anypath

to

    http://physical_host_of_user1/u/user1/anypath

thus every URL path need not be valid on every backend physical host.
The following ruleset does this for us with the help of the map files, assuming that server0 is a default server which will be used if a user has no entry in the map:

    RewriteEngine on
    RewriteMap users-to-hosts "txt:/path/to/map.users-to-hosts"
    RewriteRule "^/u/([^/]+)/?(.*)" "http://${users-to-hosts:$1|server0}/u/$1/$2"

See the RewriteMap documentation for more discussion of the syntax of this directive.

On-the-fly Content-Regeneration

Description: We wish to dynamically generate content, but store it statically once it is generated. This rule will check for the existence of the static file, and if it’s not there, generate it. The static files can be removed periodically, if desired (say, via cron) and will be regenerated on demand.

Solution: This is done via the following ruleset:

    # This example is valid in per-directory context only
    RewriteCond "%{REQUEST_URI}" !-U
    RewriteRule "^(.+)\.html$" "/regenerate_page.cgi" [PT,L]

The -U operator determines whether the test string (in this case, REQUEST_URI) is a valid URL. It does this via a subrequest. In the event that this subrequest fails - that is, the requested resource doesn’t exist - this rule invokes the CGI program /regenerate_page.cgi, which generates the requested resource and saves it into the document directory, so that the next time it is requested, a static copy can be served.

In this way, documents that are infrequently updated can be served in static form. If documents need to be refreshed, they can be deleted from the document directory, and they will then be regenerated the next time they are requested.

Load Balancing

Description: We wish to randomly distribute load across several servers using mod_rewrite.

Solution: We’ll use RewriteMap and a list of servers to accomplish this.
    RewriteEngine on
    RewriteMap lb "rnd:/path/to/serverlist.txt"
    RewriteRule "^/(.*)" "http://${lb:servers}/$1" [P,L]

serverlist.txt will contain a list of the servers:

    ## serverlist.txt

    servers one.example.com|two.example.com|three.example.com

If you want one particular server to get more of the load than the others, add it more times to the list.

Discussion: Apache comes with a load-balancing module - mod_proxy_balancer - which is far more flexible and featureful than anything you can cobble together using mod_rewrite.

Structured Userdirs

Description: Some sites with thousands of users use a structured homedir layout, i.e. each homedir is in a subdirectory which begins (for instance) with the first character of the username. So, /~larry/anypath is /home/l/larry/public_html/anypath while /~waldo/anypath is /home/w/waldo/public_html/anypath.

Solution: We use the following ruleset to expand the tilde URLs into the above layout.

    RewriteEngine on
    RewriteRule "^/~(([a-z])[a-z0-9]+)(.*)" "/home/$2/$1/public_html$3"

Redirecting Anchors

Description: By default, redirecting to an HTML anchor doesn’t work, because mod_rewrite escapes the # character, turning it into %23. This, in turn, breaks the redirection.

Solution: Use the [NE] flag on the RewriteRule. NE stands for No Escape.

Discussion: This technique will of course also work with other special characters that mod_rewrite, by default, URL-encodes.

Time-Dependent Rewriting

Description: We wish to use mod_rewrite to serve different content based on the time of day.

Solution: There are a lot of variables named TIME_xxx for rewrite conditions.
In conjunction with the special lexicographic comparison patterns <STRING, >STRING and =STRING we can do time-dependent redirects:

    RewriteEngine on
    RewriteCond   "%{TIME_HOUR}%{TIME_MIN}" >0700
    RewriteCond   "%{TIME_HOUR}%{TIME_MIN}" <1900
    RewriteRule   "^foo\.html$"             "foo.day.html" [L]
    RewriteRule   "^foo\.html$"             "foo.night.html"

This serves the content of foo.day.html under the URL foo.html from 07:01 to 18:59, and the content of foo.night.html at all other times.

Warning: mod_cache, intermediate proxies and browsers may each cache responses and cause either page to be shown outside of the configured time window. mod_expires may be used to control this effect. You are, of course, much better off simply serving the content dynamically, and customizing it based on the time of day.

Set Environment Variables Based On URL Parts

Description: At times, we want to maintain some kind of status when we perform a rewrite. For example, you want to make a note that you’ve done that rewrite, so that you can check later to see if a request came via that rewrite. One way to do this is by setting an environment variable.

Solution: Use the [E] flag to set an environment variable.

    RewriteEngine on
    RewriteRule "^/horse/(.*)" "/pony/$1" [E=rewritten:1]

Later in your ruleset you might check for this environment variable using a RewriteCond:

    RewriteCond "%{ENV:rewritten}" =1

Note that environment variables do not survive an external redirect. You might consider using the [CO] flag to set a cookie.

4.9 When not to use mod_rewrite

This document supplements the mod_rewrite reference documentation (p. 867). It describes perhaps one of the most important concepts about mod_rewrite - namely, when to avoid using it.

mod_rewrite should be considered a last resort, when other alternatives are found wanting. Using it when there are simpler alternatives leads to configurations which are confusing, fragile, and hard to maintain.
Understanding what other alternatives are available is a very important step towards mod_rewrite mastery.

Note that many of these examples won’t work unchanged in your particular server configuration, so it’s important that you understand them, rather than merely cutting and pasting the examples into your configuration.

The most common situation in which mod_rewrite is the right tool is when the very best solution requires access to the server configuration files, and you don’t have that access. Some configuration directives are only available in the server configuration file. So if you are in a hosting situation where you only have .htaccess files to work with, you may need to resort to mod_rewrite.

See also
• Module documentation (p. 867)
• mod_rewrite introduction (p. 147)
• Redirection and remapping (p. 152)
• Controlling access (p. 159)
• Virtual hosts (p. 162)
• Proxying (p. 165)
• Using RewriteMap (p. 166)
• Advanced techniques (p. 172)

Simple Redirection

mod_alias provides the Redirect and RedirectMatch directives, which provide a means to redirect one URL to another. This kind of simple redirection of one URL, or a class of URLs, to somewhere else, should be accomplished using these directives rather than RewriteRule. RedirectMatch allows you to include a regular expression in your redirection criteria, providing many of the benefits of using RewriteRule.

A common use for RewriteRule is to redirect an entire class of URLs. For example, all URLs in the /one directory must be redirected to http://one.example.com/, or perhaps all http requests must be redirected to https.

These situations are better handled by the Redirect directive. Remember that Redirect preserves path information. That is to say, a redirect for a URL /one will also redirect all URLs under that, such as /one/two.html and /one/three/four.html.
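To make "preserves path information" concrete, the prefix behaviour can be modelled in a few lines of Python. This is a simplified sketch for reasoning about which requests a Redirect will catch, not mod_alias itself; the function name redirect_target is made up for the illustration, and details such as query strings and duplicate slashes are deliberately ignored.

```python
def redirect_target(url_path, prefix, target):
    """Return the redirect location for url_path, or None if prefix does not apply."""
    if not url_path.startswith(prefix):
        return None
    rest = url_path[len(prefix):]
    # A prefix without a trailing slash must end on a path-segment boundary,
    # so "/one" covers "/one/two.html" but not "/onetwo".
    if not prefix.endswith("/") and rest and not rest.startswith("/"):
        return None
    # Whatever follows the matched prefix is appended to the target.
    return target + rest
```

Under this model, a request for /one/three/four.html against the prefix /one/ and the target http://one.example.com/ yields http://one.example.com/three/four.html.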
To redirect URLs under /one to http://one.example.com, do the following:

    Redirect "/one/" "http://one.example.com/"

To redirect one hostname to another, for example example.com to www.example.com, see the Canonical Hostnames (p. 152) recipe.

To redirect http URLs to https, do the following:

    <VirtualHost *:80>
        ServerName www.example.com
        Redirect "/" "https://www.example.com/"
    </VirtualHost>

    <VirtualHost *:443>
        ServerName www.example.com
        # ... SSL configuration goes here
    </VirtualHost>

The use of RewriteRule to perform this task may be appropriate if there are other RewriteRule directives in the same scope. This is because, when there are Redirect and RewriteRule directives in the same scope, the RewriteRule directives will run first, regardless of the order of appearance in the configuration file.

In the case of the http-to-https redirection, the use of RewriteRule would be appropriate if you don’t have access to the main server configuration file, and are obliged to perform this task in a .htaccess file instead.

URL Aliasing

The Alias directive provides mapping from a URI to a directory - usually a directory outside of your DocumentRoot. Although it is possible to perform this mapping with mod_rewrite, Alias is the preferred method, for reasons of simplicity and performance.

Using Alias

    Alias "/cats" "/var/www/virtualhosts/felines/htdocs"

The use of mod_rewrite to perform this mapping may be appropriate when you do not have access to the server configuration files. Alias may only be used in server or virtualhost context, and not in a .htaccess file.

Symbolic links would be another way to accomplish the same thing, if you have Options FollowSymLinks enabled on your server.

Virtual Hosting

Although it is possible to handle virtual hosts with mod_rewrite (p. 162), it is seldom the right way. Creating individual <VirtualHost> blocks is almost always the right way to go. In the event that you have an enormous number of virtual hosts, consider using mod_vhost_alias to create these hosts automatically.
Modules such as mod_macro are also useful for creating a large number of virtual hosts dynamically.

Using mod_rewrite for virtual host creation may be appropriate if you are using a hosting service that does not provide you access to the server configuration files, and you are therefore restricted to configuration using .htaccess files.

See the virtual hosts with mod_rewrite (p. 162) document for more details on how you might accomplish this if it still seems like the right approach.

Simple Proxying

RewriteRule provides the [P] (p. 178) flag to pass rewritten URIs through mod_proxy.

    RewriteRule "^/?images(.*)" "http://imageserver.local/images$1" [P]

However, in many cases, when there is no actual pattern matching needed, as in the example shown above, the ProxyPass directive is a better choice. The example here could be rendered as:

    ProxyPass "/images/" "http://imageserver.local/images/"

Note that whether you use RewriteRule or ProxyPass, you’ll still need to use the ProxyPassReverse directive to catch redirects issued from the back-end server:

    ProxyPassReverse "/images/" "http://imageserver.local/images/"

You may need to use RewriteRule instead when there are other RewriteRules in effect in the same scope, as a RewriteRule will usually take effect before a ProxyPass, and so may preempt what you’re trying to accomplish.

Environment Variable Testing

mod_rewrite is frequently used to take a particular action based on the presence or absence of a particular environment variable or request header. This can be done more efficiently using the <If> directive.

Consider, for example, the common scenario where RewriteRule is used to enforce a canonical hostname, such as www.example.com instead of example.com.
This can be done using the <If> directive, as shown here:

    <If "req('Host') != 'www.example.com'">
        Redirect "/" "http://www.example.com/"
    </If>

This technique can be used to take actions based on any request header, response header, or environment variable, replacing mod_rewrite in many common scenarios.

See especially the expression evaluation documentation (p. 99) for an overview of what types of expressions you can use in <If> sections, and in certain other directives.

4.10 RewriteRule Flags

This document discusses the flags which are available to the RewriteRule directive, providing detailed explanations and examples.

See also
• Module documentation (p. 867)
• mod_rewrite introduction (p. 147)
• Redirection and remapping (p. 152)
• Controlling access (p. 159)
• Virtual hosts (p. 162)
• Proxying (p. 165)
• Using RewriteMap (p. 166)
• Advanced techniques (p. 172)
• When not to use mod_rewrite (p. 175)

Introduction

A RewriteRule can have its behavior modified by one or more flags. Flags are included in square brackets at the end of the rule, and multiple flags are separated by commas.

    RewriteRule pattern target [Flag1,Flag2,Flag3]

Each flag (with a few exceptions) has a short form, such as CO, as well as a longer form, such as cookie. While it is most common to use the short form, it is recommended that you familiarize yourself with the long form, so that you remember what each flag is supposed to do. Some flags take one or more arguments. Flags are not case sensitive.

Flags that alter metadata associated with the request (T=, H=, E=) have no effect in per-directory and htaccess context, when a substitution (other than ’-’) is performed during the same round of rewrite processing.

Presented here are each of the available flags, along with an example of how you might use them.

B (escape backreferences)

The [B] flag instructs RewriteRule to escape non-alphanumeric characters before applying the transformation.
In 2.4.10 and later, you can limit the escaping to specific characters in backreferences by listing them: [B=#?;]. Note: The space character can be used in the list of characters to escape, but it cannot be the last character in the list.

mod_rewrite has to unescape URLs before mapping them, so backreferences are unescaped at the time they are applied. Using the B flag, non-alphanumeric characters in backreferences will be escaped. For example, consider the rule:

    RewriteRule "^search/(.*)$" "/search.php?term=$1"

Given a search term of ’x & y/z’, a browser will encode it as ’x%20%26%20y%2Fz’, making the request ’search/x%20%26%20y%2Fz’. Without the B flag, this rewrite rule will map to ’search.php?term=x & y/z’, which isn’t a valid URL, and so would be encoded as search.php?term=x%20&y%2Fz=, which is not what was intended.

With the B flag set on this same rule, the parameters are re-encoded before being passed on to the output URL, resulting in a correct mapping to /search.php?term=x%20%26%20y%2Fz.

Note that you may also need to set AllowEncodedSlashes to On to get this particular example to work, as httpd does not allow encoded slashes in URLs, and returns a 404 if it sees one.

This escaping is particularly necessary in a proxy situation, when the backend may break if presented with an unescaped URL.

An alternative to this flag is using a RewriteCond to capture against %{THE_REQUEST}, which will capture strings in the encoded form.

BNP|backrefnoplus (don’t escape space to +)

The [BNP] flag instructs RewriteRule to escape the space character in a backreference to %20 rather than ’+’. This is useful when the backreference will be used in the path component rather than the query string.

C|chain

The [C] or [chain] flag indicates that the RewriteRule is chained to the next rule. That is, if the rule matches, then it is processed as usual and control moves on to the next rule.
However, if it does not match, then the next rule, and any other rules that are chained together, are skipped.

CO|cookie

The [CO], or [cookie] flag, allows you to set a cookie when a particular RewriteRule matches. The argument consists of three required fields and four optional fields.

The full syntax for the flag, including all attributes, is as follows:

    [CO=NAME:VALUE:DOMAIN:lifetime:path:secure:httponly]

If a literal ’:’ character is needed in any of the cookie fields, an alternate syntax is available. To opt in to the alternate syntax, the cookie "Name" should be preceded with a ’;’ character, and field separators should be specified as ’;’.

    [CO=;NAME;VALUE:MOREVALUE;DOMAIN;lifetime;path;secure;httponly]

You must declare a name, a value, and a domain for the cookie to be set.

Domain: The domain for which you want the cookie to be valid. This may be a hostname, such as www.example.com, or it may be a domain, such as .example.com. It must be at least two parts separated by a dot. That is, it may not be merely .com or .net. Cookies of that kind are forbidden by the cookie security model.

You may optionally also set the following values:

Lifetime: The time for which the cookie will persist, in minutes. A value of 0 indicates that the cookie will persist only for the current browser session. This is the default value if none is specified.

Path: The path, on the current website, for which the cookie is valid, such as /customers/ or /files/download/. By default, this is set to / - that is, the entire website.

Secure: If set to secure, true, or 1, the cookie will only be permitted to be transmitted via secure (https) connections.

httponly: If set to HttpOnly, true, or 1, the cookie will have the HttpOnly flag set, which means that the cookie is inaccessible to JavaScript code on browsers that support this feature.
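The two separator conventions described above are easy to confuse, so here is a small Python sketch that splits a CO argument into its named fields. It is only an aid for reading flag values and makes no claim about how mod_rewrite parses them; the names FIELDS and parse_cookie_flag are hypothetical.

```python
FIELDS = ("name", "value", "domain", "lifetime", "path", "secure", "httponly")


def parse_cookie_flag(argument):
    """Split the argument of a [CO=...] flag into a dict of named fields."""
    # A leading ';' opts in to the alternate syntax, in which ';' separates
    # the fields so that ':' may appear inside a field.
    if argument.startswith(";"):
        parts = argument[1:].split(";")
    else:
        parts = argument.split(":")
    # Name, value and domain are required; the other four are optional.
    if len(parts) < 3 or len(parts) > len(FIELDS):
        raise ValueError("expected between 3 and 7 fields")
    return dict(zip(FIELDS, parts))
```

For instance, parse_cookie_flag(";NAME;VALUE:MOREVALUE;DOMAIN") keeps VALUE:MOREVALUE together as the cookie value, which the default ':' syntax cannot express.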
Consider this example:

    RewriteEngine On
    RewriteRule "^/index\.html" "-" [CO=frontdoor:yes:.example.com:1440:/]

In the example given, the rule doesn't rewrite the request. The "-" rewrite target tells mod_rewrite to pass the request through unchanged. Instead, it sets a cookie called 'frontdoor' to a value of 'yes'. The cookie is valid for any host in the .example.com domain. It is set to expire in 1440 minutes (24 hours) and is returned for all URIs.

DPI—discardpath

The DPI flag causes the PATH_INFO portion of the rewritten URI to be discarded.

This flag is available in version 2.2.12 and later.

In per-directory context, the URI each RewriteRule compares against is the concatenation of the current values of the URI and PATH_INFO.

The current URI can be the initial URI as requested by the client, the result of a previous round of mod_rewrite processing, or the result of a prior rule in the current round of mod_rewrite processing.

In contrast, the PATH_INFO that is appended to the URI before each rule reflects only the value of PATH_INFO before this round of mod_rewrite processing. As a consequence, if large portions of the URI are matched and copied into a substitution in multiple RewriteRule directives, without regard for which parts of the URI came from the current PATH_INFO, the final URI may have multiple copies of PATH_INFO appended to it.

Use this flag on any substitution where the PATH_INFO that resulted from the previous mapping of this request to the filesystem is not of interest. This flag permanently forgets the PATH_INFO established before this round of mod_rewrite processing began. PATH_INFO will not be recalculated until the current round of mod_rewrite processing completes. Subsequent rules during this round of processing will see only the direct result of substitutions, without any PATH_INFO appended.

E—env

With the [E], or [env] flag, you can set the value of an environment variable.
Note that some environment variables may be set after the rule is run, thus unsetting what you have set. See the Environment Variables document (p. 92) for more details on how environment variables work.

The full syntax for this flag is:

    [E=VAR:VAL]
    [E=!VAR]

VAL may contain backreferences ($N or %N) which are expanded.

Using the short form

    [E=VAR]

you can set the environment variable named VAR to an empty value.

The form

    [E=!VAR]

allows you to unset a previously set environment variable named VAR.

Environment variables can then be used in a variety of contexts, including CGI programs, other RewriteRule directives, or CustomLog directives.

The following example sets an environment variable called 'image' to a value of '1' if the requested URI is an image file. Then, that environment variable is used to exclude those requests from the access log.

    RewriteRule "\.(png|gif|jpg)$" "-" [E=image:1]
    CustomLog "logs/access_log" combined env=!image

Note that this same effect can be obtained using SetEnvIf. This technique is offered as an example, not as a recommendation.

END

Using the [END] flag terminates not only the current round of rewrite processing (like [L]) but also prevents any subsequent rewrite processing from occurring in per-directory (htaccess) context.

This does not apply to new requests resulting from external redirects.

F—forbidden

Using the [F] flag causes the server to return a 403 Forbidden status code to the client. While the same behavior can be accomplished using the Deny directive, this allows more flexibility in assigning a Forbidden status.

The following rule will forbid .exe files from being downloaded from your server.

    RewriteRule "\.exe" "-" [F]

This example uses the "-" syntax for the rewrite target, which means that the requested URI is not modified. There's no reason to rewrite to another URI, if you're going to forbid the request.
When using [F], an [L] is implied - that is, the response is returned immediately, and no further rules are evaluated.

G—gone

The [G] flag forces the server to return a 410 Gone status with the response. This indicates that a resource used to be available, but is no longer available.

As with the [F] flag, you will typically use the "-" syntax for the rewrite target when using the [G] flag:

    RewriteRule "oldproduct" "-" [G,NC]

When using [G], an [L] is implied - that is, the response is returned immediately, and no further rules are evaluated.

H—handler

Forces the resulting request to be handled with the specified handler. For example, one might use this to force all files without a file extension to be parsed by the php handler:

    RewriteRule "!\." "-" [H=application/x-httpd-php]

The regular expression above - !\. - will match any request that does not contain the literal . character.

This can also be used to force the handler based on some conditions. For example, the following snippet used in per-server context allows .php files to be displayed by mod_php if they are requested with the .phps extension:

    RewriteRule "^(/source/.+\.php)s$" "$1" [H=application/x-httpd-php-source]

The regular expression above - ^(/source/.+\.php)s$ - will match any request that starts with /source/ followed by one or more characters followed by .phps literally. The backreference $1 refers to the match captured within the parentheses of the regular expression.

L—last

The [L] flag causes mod_rewrite to stop processing the rule set. In most contexts, this means that if the rule matches, no further rules will be processed. This corresponds to the last command in Perl, or the break command in C. Use this flag to indicate that the current rule should be applied immediately without considering further rules.

If you are using RewriteRule in either .htaccess files or in <Directory> sections, it is important to have some understanding of how the rules are processed.
The simplified form of this is that once the rules have been processed, the rewritten request is handed back to the URL parsing engine to do what it may with it. It is possible that as the rewritten request is handled, the .htaccess file or <Directory> section may be encountered again, and thus the ruleset may be run again from the start. Most commonly this will happen if one of the rules causes a redirect - either internal or external - causing the request process to start over.

It is therefore important, if you are using RewriteRule directives in one of these contexts, that you take explicit steps to avoid rules looping, and not count solely on the [L] flag to terminate execution of a series of rules, as shown below.

An alternative flag, [END], can be used to terminate not only the current round of rewrite processing but also prevent any subsequent rewrite processing from occurring in per-directory (htaccess) context. This does not apply to new requests resulting from external redirects.

The example given here will rewrite any request to index.php, giving the original request as a query string argument to index.php. However, the RewriteCond ensures that if the request is already for index.php, the RewriteRule will be skipped.

    RewriteBase "/"
    RewriteCond "%{REQUEST_URI}" !=/index.php
    RewriteRule "^(.*)" "/index.php?req=$1" [L,PT]

N—next

The [N] flag causes the ruleset to start over again from the top, using the result of the ruleset so far as a starting point. Use with extreme caution, as it may result in a loop.

The [Next] flag could be used, for example, if you wished to replace a certain string or letter repeatedly in a request. The example shown here will replace A with B everywhere in a request, and will continue doing so until there are no more As to be replaced.

    RewriteRule "(.*)A(.*)" "$1B$2" [N]

You can think of this as a while loop: While this pattern still matches (i.e., while the URI still contains an A), perform this substitution (i.e., replace the A with a B).

In 2.5.0 and later, this module returns an error after 10,000 iterations to protect against unintended looping. An alternative maximum number of iterations can be specified by adding it to the N flag.

    # Be willing to replace 1 character in each pass of the loop
    RewriteRule "(.+)[><;]$" "$1" [N=32000]
    # ... or, give up after 10 loops
    RewriteRule "(.+)[><;]$" "$1" [N=10]

NC—nocase

Use of the [NC] flag causes the RewriteRule to be matched in a case-insensitive manner. That is, it doesn't care whether letters appear as upper-case or lower-case in the matched URI.

In the example below, any request for an image file will be proxied to your dedicated image server. The match is case-insensitive, so that .jpg and .JPG files are both acceptable, for example.

    RewriteRule "(.*\.(jpg|gif|png))$" "http://images.example.com$1" [P,NC]

NE—noescape

By default, special characters, such as & and ?, for example, will be converted to their hexcode equivalent. Using the [NE] flag prevents that from happening.

    RewriteRule "^/anchor/(.+)" "/bigpage.html#$1" [NE,R]

The above example will redirect /anchor/xyz to /bigpage.html#xyz. Omitting the [NE] will result in the # being converted to its hexcode equivalent, %23, which will then result in a 404 Not Found error condition.

NS—nosubreq

Use of the [NS] flag prevents the rule from being used on subrequests. For example, a page which is included using an SSI (Server Side Include) is a subrequest, and you may want to avoid rewrites happening on those subrequests. Also, when mod_dir tries to find out information about possible directory default files (such as index.html files), this is an internal subrequest, and you often want to avoid rewrites on such subrequests.
On subrequests, it is not always useful, and can even cause errors, if the complete set of rules is applied. Use this flag to exclude problematic rules.

To decide whether or not to use this rule: if you prefix URLs with CGI-scripts, to force them to be processed by the CGI-script, it's likely that you will run into problems (or significant overhead) on sub-requests. In these cases, use this flag.

Images, javascript files, or css files, loaded as part of an HTML page, are not subrequests - the browser requests them as separate HTTP requests.

P—proxy

Use of the [P] flag causes the request to be handled by mod_proxy, and handled via a proxy request. For example, if you wanted all image requests to be handled by a back-end image server, you might do something like the following:

    RewriteRule "/(.*)\.(jpg|gif|png)$" "http://images.example.com/$1.$2" [P]

Use of the [P] flag implies [L] - that is, the request is immediately pushed through the proxy, and any following rules will not be considered.

You must make sure that the substitution string is a valid URI (typically starting with http://hostname) which can be handled by mod_proxy. If not, you will get an error from the proxy module. Use this flag to achieve a more powerful implementation of the ProxyPass directive, to map remote content into the namespace of the local server.

Security Warning
    Take care when constructing the target URL of the rule, considering the security impact from allowing the client influence over the set of URLs to which your server will act as a proxy. Ensure that the scheme and hostname part of the URL is either fixed, or does not allow the client undue influence.

Performance warning
    Using this flag triggers the use of mod_proxy, without handling of persistent connections.
This means the performance of your proxy will be better if you set it up with ProxyPass or ProxyPassMatch. This is because this flag triggers the use of the default worker, which does not handle connection pooling. Avoid using this flag and prefer those directives, whenever you can.

Note: mod_proxy must be enabled in order to use this flag.

PT—passthrough

The target (or substitution string) in a RewriteRule is assumed to be a file path, by default. The use of the [PT] flag causes it to be treated as a URI instead. That is to say, the use of the [PT] flag causes the result of the RewriteRule to be passed back through URL mapping, so that location-based mappings, such as Alias, Redirect, or ScriptAlias, for example, might have a chance to take effect.

If, for example, you have an Alias for /icons, and have a RewriteRule pointing there, you should use the [PT] flag to ensure that the Alias is evaluated.

    Alias "/icons" "/usr/local/apache/icons"
    RewriteRule "/pics/(.+)\.jpg$" "/icons/$1.gif" [PT]

Omission of the [PT] flag in this case will cause the Alias to be ignored, resulting in a 'File not found' error being returned.

The PT flag implies the L flag: rewriting will be stopped in order to pass the request to the next phase of processing.

Note that the PT flag is implied in per-directory contexts such as <Directory> sections or in .htaccess files. The only way to circumvent that is to rewrite to -.

QSA—qsappend

When the replacement URI contains a query string, the default behavior of RewriteRule is to discard the existing query string, and replace it with the newly generated one. Using the [QSA] flag causes the query strings to be combined.

Consider the following rule:

    RewriteRule "/pages/(.+)" "/page.php?page=$1" [QSA]

With the [QSA] flag, a request for /pages/123?one=two will be mapped to /page.php?page=123&one=two.
Without the [QSA] flag, that same request will be mapped to /page.php?page=123 - that is, the existing query string will be discarded.

QSD—qsdiscard

When the requested URI contains a query string, and the target URI does not, the default behavior of RewriteRule is to copy that query string to the target URI. Using the [QSD] flag causes the query string to be discarded.

This flag is available in version 2.4.0 and later.

Using [QSD] and [QSA] together will result in [QSD] taking precedence.

If the target URI has a query string, the default behavior will be observed - that is, the original query string will be discarded and replaced with the query string in the RewriteRule target URI.

QSL—qslast

By default, the first (left-most) question mark in the substitution delimits the path from the query string. Using the [QSL] flag instructs RewriteRule to instead split the two components using the last (right-most) question mark.

This is useful when mapping to files that have literal question marks in their filename. If no query string is used in the substitution, a question mark can be appended to it in combination with this flag.

This flag is available in version 2.4.19 and later.

R—redirect

Use of the [R] flag causes an HTTP redirect to be issued to the browser. If a fully-qualified URL is specified (that is, including http://servername/) then a redirect will be issued to that location. Otherwise, the current protocol, servername, and port number will be used to generate the URL sent with the redirect.

Any valid HTTP response status code may be specified, using the syntax [R=305], with a 302 status code being used by default if none is specified. The status code specified need not necessarily be a redirect (3xx) status code. However, if a status code is outside the redirect range (300-399) then the substitution string is dropped entirely, and rewriting is stopped as if the L were used.
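The three cases just described (relative target, fully-qualified target, and a non-redirect status) might look like the following sketch. The paths and the files.example.com hostname are hypothetical, and per-server context is assumed.

```
# Relative target: the current protocol, servername, and port are used
# to build the redirect URL. An explicit 301 status replaces the default 302.
RewriteRule "^/old/(.*)$" "/new/$1" [R=301,L]

# Fully-qualified target: the redirect is issued to that location,
# with the default 302 status since none is specified.
RewriteRule "^/downloads/(.*)$" "http://files.example.com/$1" [R,L]

# Status outside the 300-399 range: the substitution is dropped,
# so "-" is used as the target.
RewriteRule "^/retired/" "-" [R=404]
```

The [L] flag on the first two rules stops rule processing once the redirect has been chosen; the last rule does not need it, because rewriting stops on its own for a non-redirect status.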
In addition to response status codes, you may also specify redirect status using their symbolic names: temp (default), permanent, or seeother.

You will almost always want to use [R] in conjunction with [L] (that is, use [R,L]) because on its own, the [R] flag prepends http://thishost[:thisport] to the URI, but then passes this on to the next rule in the ruleset, which can often result in 'Invalid URI in request' warnings.

S—skip

The [S] flag is used to skip rules that you don't want to run. The syntax of the skip flag is [S=N], where N signifies the number of rules to skip (provided the RewriteRule and any preceding RewriteCond directives match). This can be thought of as a goto statement in your rewrite ruleset. In the following example, we only want to run the RewriteRule if the requested URI doesn't correspond with an actual file.

    # Is the request for a non-existent file?
    RewriteCond "%{REQUEST_FILENAME}" !-f
    RewriteCond "%{REQUEST_FILENAME}" !-d
    # If so, skip these two RewriteRules
    RewriteRule ".?" "-" [S=2]

    RewriteRule "(.*\.gif)" "images.php?$1"
    RewriteRule "(.*\.html)" "docs.php?$1"

This technique is useful because a RewriteCond only applies to the RewriteRule immediately following it. Thus, if you want to make a RewriteCond apply to several RewriteRules, one possible technique is to negate those conditions and add a RewriteRule with a [Skip] flag. You can use this to make pseudo if-then-else constructs: The last rule of the then-clause becomes skip=N, where N is the number of rules in the else-clause:

    # Does the file exist?
    RewriteCond "%{REQUEST_FILENAME}" !-f
    RewriteCond "%{REQUEST_FILENAME}" !-d
    # Create an if-then-else construct by skipping 3 lines if we meant to go to the "else" stanza.
    RewriteRule ".?" "-" [S=3]

    # IF the file exists, then:
    RewriteRule "(.*\.gif)" "images.php?$1"
    RewriteRule "(.*\.html)" "docs.php?$1"
    # Skip past the "else" stanza.
    RewriteRule ".?" "-" [S=1]
    # ELSE...
    RewriteRule "(.*)" "404.php?file=$1"
    # END

It is probably easier to accomplish this kind of configuration using the <If>, <ElseIf>, and <Else> directives instead.

T—type

Sets the MIME type with which the resulting response will be sent. This has the same effect as the AddType directive.

For example, you might use the following technique to serve Perl source code as plain text, if requested in a particular way:

    # Serve .pl files as plain text
    RewriteRule "\.pl$" "-" [T=text/plain]

Or, perhaps, if you have a camera that produces jpeg images without file extensions, you could force those images to be served with the correct MIME type by virtue of their file names:

    # Files with 'IMG' in the name are jpg images.
    RewriteRule "IMG" "-" [T=image/jpeg]

Please note that this is a trivial example, and could be better done using <FilesMatch> instead. Always consider the alternate solutions to a problem before resorting to rewrite, which will invariably be a less efficient solution than the alternatives.

If used in per-directory context, use only - (dash) as the substitution for the entire round of mod_rewrite processing, otherwise the MIME-type set with this flag is lost due to an internal re-processing (including subsequent rounds of mod_rewrite processing). The L flag can be useful in this context to end the current round of mod_rewrite processing.

4.11 Apache mod_rewrite Technical Details

This document discusses some of the technical details of mod_rewrite and URL matching.

See also
• Module documentation (p. 867)
• mod_rewrite introduction (p. 147)
• Redirection and remapping (p. 152)
• Controlling access (p. 159)
• Virtual hosts (p. 162)
• Proxying (p. 165)
• Using RewriteMap (p. 166)
• Advanced techniques (p. 172)
• When not to use mod_rewrite (p. 175)

API Phases

The Apache HTTP Server handles requests in several phases. At each of these phases, one or more modules may be called upon to handle that portion of the request lifecycle.
Phases include things like URL-to-filename translation, authentication, authorization, content, and logging. (This is not an exhaustive list.)

mod_rewrite acts in two of these phases (or "hooks", as they are often called) to influence how URLs may be rewritten.

First, it uses the URL-to-filename translation hook, which occurs after the HTTP request has been read, but before any authorization starts. Secondly, it uses the Fixup hook, which is after the authorization phases, and after per-directory configuration files (.htaccess files) have been read, but before the content handler is called.

After a request comes in and a corresponding server or virtual host has been determined, the rewriting engine starts processing any mod_rewrite directives appearing in the per-server configuration (i.e., in the main server configuration file and <VirtualHost> sections). This happens in the URL-to-filename phase.

A few steps later, once the final data directories have been found, the per-directory configuration directives (.htaccess files and <Directory> blocks) are applied. This happens in the Fixup phase.

In each of these cases, mod_rewrite rewrites the REQUEST_URI either to a new URL, or to a filename.

In per-directory context (i.e., within .htaccess files and Directory blocks), these rules are being applied after a URL has already been translated to a filename. Because of this, the URL-path that mod_rewrite initially compares RewriteRule directives against is the full filesystem path to the translated filename with the current directory's path (including a trailing slash) removed from the front.

To illustrate: If rules are in /var/www/foo/.htaccess and a request for /foo/bar/baz is being processed, an expression like ^bar/baz$ would match.

If a substitution is made in per-directory context, a new internal subrequest is issued with the new URL, which restarts processing of the request phases.
If the substitution is a relative path, the RewriteBase directive determines the URL-path prefix prepended to the substitution. In per-directory context, care must be taken to create rules which will eventually (in some future "round" of per-directory rewrite processing) not perform a substitution to avoid looping. (See RewriteLooping, http://wiki.apache.org/httpd/RewriteLooping, for further discussion of this problem.)

Because of this further manipulation of the URL in per-directory context, you'll need to take care to craft your rewrite rules differently in that context. In particular, remember that the leading directory path will be stripped off of the URL that your rewrite rules will see. Consider the examples below for further clarification.

    Location of rule                      Rule
    VirtualHost section                   RewriteRule "^/images/(.+)\.jpg" "/images/$1.gif"
    .htaccess file in document root       RewriteRule "^images/(.+)\.jpg" "images/$1.gif"
    .htaccess file in images directory    RewriteRule "^(.+)\.jpg" "$1.gif"

For even more insight into how mod_rewrite manipulates URLs in different contexts, you should consult the log entries (p. 867) made during rewriting.

Ruleset Processing

Now when mod_rewrite is triggered in these two API phases, it reads the configured rulesets from its configuration structure (which itself was either created on startup for per-server context or during the directory walk of the Apache kernel for per-directory context). Then the URL rewriting engine is started with the contained ruleset (one or more rules together with their conditions). The operation of the URL rewriting engine itself is exactly the same for both configuration contexts. Only the final result processing is different.

The order of rules in the ruleset is important because the rewriting engine processes them in a special (and not very obvious) order.
The rule is this: The rewriting engine loops through the ruleset rule by rule (RewriteRule directives) and when a particular rule matches it optionally loops through existing corresponding conditions (RewriteCond directives). For historical reasons the conditions are given first, and so the control flow is a little bit long-winded. See Figure 1 for more details.

Figure 1: The control flow through the rewriting ruleset

First the URL is matched against the Pattern of each rule. If it fails, mod_rewrite immediately stops processing this rule, and continues with the next rule. If the Pattern matches, mod_rewrite looks for corresponding rule conditions (RewriteCond directives, appearing immediately above the RewriteRule in the configuration). If none are present, it substitutes the URL with a new value, which is constructed from the string Substitution, and goes on with its rule-looping. But if conditions exist, it starts an inner loop for processing them in the order that they are listed. For conditions, the logic is different: we don't match a pattern against the current URL. Instead we first create a string TestString by expanding variables, back-references, map lookups, etc. and then we try to match CondPattern against it. If the pattern doesn't match, the complete set of conditions and the corresponding rule fails. If the pattern matches, then the next condition is processed until no more conditions are available. If all conditions match, processing is continued with the substitution of the URL with Substitution.

Chapter 5 Apache SSL/TLS Encryption

5.1 Apache SSL/TLS Encryption

The Apache HTTP Server module mod_ssl provides an interface to the OpenSSL (http://www.openssl.org/) library, which provides Strong Encryption using the Secure Sockets Layer and Transport Layer Security protocols.

Documentation
• mod_ssl Configuration How-To (p. 206)
• Introduction To SSL (p. 193)
• Compatibility (p. 202)
• Frequently Asked Questions (p. 212)
• Glossary (p. 1096)

mod_ssl

Extensive documentation on the directives and environment variables provided by this module is provided in the mod_ssl reference documentation (p. 916).

5.2 SSL/TLS Strong Encryption: An Introduction

As an introduction this chapter is aimed at readers who are familiar with the Web, HTTP, and Apache, but are not security experts. It is not intended to be a definitive guide to the SSL protocol, nor does it discuss specific techniques for managing certificates in an organization, or the important legal issues of patents and import and export restrictions. Rather, it is intended to provide a common background to mod_ssl users by pulling together various concepts, definitions, and examples as a starting point for further exploration.

Cryptographic Techniques

Understanding SSL requires an understanding of cryptographic algorithms, message digest functions (aka. one-way or hash functions), and digital signatures. These techniques are the subject of entire books (see for instance [AC96]) and provide the basis for privacy, integrity, and authentication.

Cryptographic Algorithms

Suppose Alice wants to send a message to her bank to transfer some money. Alice would like the message to be private, since it will include information such as her account number and transfer amount. One solution is to use a cryptographic algorithm, a technique that would transform her message into an encrypted form, unreadable until it is decrypted. Once in this form, the message can only be decrypted by using a secret key. Without the key the message is useless: good cryptographic algorithms make it so difficult for intruders to decode the original text that it isn't worth their effort.

There are two categories of cryptographic algorithms: conventional and public key.
Conventional cryptography
    Also known as symmetric cryptography, requires the sender and receiver to share a key: a secret piece of information that may be used to encrypt or decrypt a message. As long as this key is kept secret, nobody other than the sender or recipient can read the message. If Alice and the bank know a secret key, then they can send each other private messages. The task of sharing a key between sender and recipient before communicating, while also keeping it secret from others, can be problematic.

Public key cryptography
    Also known as asymmetric cryptography, solves the key exchange problem by defining an algorithm which uses two keys, each of which may be used to encrypt a message. If one key is used to encrypt a message then the other must be used to decrypt it. This makes it possible to receive secure messages by simply publishing one key (the public key) and keeping the other secret (the private key).

Anyone can encrypt a message using the public key, but only the owner of the private key will be able to read it. In this way, Alice can send private messages to the owner of a key-pair (the bank), by encrypting them using their public key. Only the bank will be able to decrypt them.

Message Digests

Although Alice may encrypt her message to make it private, there is still a concern that someone might modify her original message or substitute it with a different one, in order to transfer the money to themselves, for instance. One way of guaranteeing the integrity of Alice's message is for her to create a concise summary of her message and send this to the bank as well. Upon receipt of the message, the bank creates its own summary and compares it with the one Alice sent. If the summaries are the same then the message has been received intact.

A summary such as this is called a message digest, one-way function or hash function. Message digests are used to create a short, fixed-length representation of a longer, variable-length message.
Digest algorithms are designed to produce a unique digest for each message. Message digests are designed to make it impractically difficult to determine the message from the digest and (in theory) impossible to find two different messages which create the same digest – thus eliminating the possibility of substituting one message for another while maintaining the same digest.

Another challenge that Alice faces is finding a way to send the digest to the bank securely; if the digest is not sent securely, its integrity may be compromised and with it the possibility for the bank to determine the integrity of the original message. Only if the digest is sent securely can the integrity of the associated message be determined.

One way to send the digest securely is to include it in a digital signature.

Digital Signatures

When Alice sends a message to the bank, the bank needs to ensure that the message is really from her, so an intruder cannot request a transaction involving her account. A digital signature, created by Alice and included with the message, serves this purpose.

Digital signatures are created by encrypting a digest of the message and other information (such as a sequence number) with the sender's private key. Though anyone can decrypt the signature using the public key, only the sender knows the private key. This means that only the sender can have signed the message. Including the digest in the signature means the signature is only good for that message; it also ensures the integrity of the message since no one can change the digest and still sign it.

To guard against interception and reuse of the signature by an intruder at a later date, the signature contains a unique sequence number. This protects the bank from a fraudulent claim from Alice that she did not send the message – only she could have signed it (non-repudiation).
Certificates

Although Alice could have sent a private message to the bank, signed it and ensured the integrity of the message, she still needs to be sure that she is really communicating with the bank. This means that she needs to be sure that the public key she is using is part of the bank's key-pair, and not an intruder's. Similarly, the bank needs to verify that the message signature really was signed by the private key that belongs to Alice.

If each party has a certificate which validates the other's identity, confirms the public key and is signed by a trusted agency, then both can be assured that they are communicating with whom they think they are. Such a trusted agency is called a Certificate Authority and certificates are used for authentication.

Certificate Contents

A certificate associates a public key with the real identity of an individual, server, or other entity, known as the subject. As shown in Table 1, information about the subject includes identifying information (the distinguished name) and the public key. It also includes the identification and signature of the Certificate Authority that issued the certificate and the period of time during which the certificate is valid. It may have additional information (or extensions) as well as administrative information for the Certificate Authority's use, such as a serial number.

Table 1: Certificate Information

    Subject                       Distinguished Name, Public Key
    Issuer                        Distinguished Name, Signature
    Period of Validity            Not Before Date, Not After Date
    Administrative Information    Version, Serial Number
    Extended Information          Basic Constraints, Netscape Flags, etc.

A distinguished name is used to provide an identity in a specific context – for instance, an individual might have a personal certificate as well as one for their identity as an employee.
Distinguished names are defined by the X.509 standard [X509], which defines the fields, field names and abbreviations used to refer to the fields (see Table 2).

Table 2: Distinguished Name Information

    DN Field                  Abbrev.  Description                                                            Example
    Common Name               CN       Name being certified                                                   CN=Joe Average
    Organization or Company   O        Name is associated with this organization                              O=Snake Oil, Ltd.
    Organizational Unit       OU       Name is associated with this organization unit, such as a department   OU=Research Institute
    City/Locality             L        Name is located in this City                                           L=Snake City
    State/Province            ST       Name is located in this State/Province                                 ST=Desert
    Country                   C        Name is located in this Country (ISO code)                             C=XZ

A Certificate Authority may define a policy specifying which distinguished field names are optional and which are required. It may also place requirements upon the field contents, as may users of certificates. For example, a Netscape browser requires that the Common Name for a certificate representing a server matches a wildcard pattern for the domain name of that server, such as *.snakeoil.com.

The binary format of a certificate is defined using the ASN.1 notation [ASN1] [PKCS]. This notation defines how to specify the contents, and encoding rules define how this information is translated into binary form. The binary encoding of the certificate is defined using Distinguished Encoding Rules (DER), which are based on the more general Basic Encoding Rules (BER). For those transmissions which cannot handle binary, the binary form may be translated into an ASCII form by using Base64 encoding [MIME]. When placed between begin and end delimiter lines (as below), this encoded version is called a PEM ("Privacy Enhanced Mail") encoded certificate.
Example of a PEM-encoded certificate (snakeoil.crt)

    -----BEGIN CERTIFICATE-----
    MIIC7jCCAlegAwIBAgIBATANBgkqhkiG9w0BAQQFADCBqTELMAkGA1UEBhMCWFkx
    FTATBgNVBAgTDFNuYWtlIERlc2VydDETMBEGA1UEBxMKU25ha2UgVG93bjEXMBUG
    A1UEChMOU25ha2UgT2lsLCBMdGQxHjAcBgNVBAsTFUNlcnRpZmljYXRlIEF1dGhv
    cml0eTEVMBMGA1UEAxMMU25ha2UgT2lsIENBMR4wHAYJKoZIhvcNAQkBFg9jYUBz
    bmFrZW9pbC5kb20wHhcNOTgxMDIxMDg1ODM2WhcNOTkxMDIxMDg1ODM2WjCBpzEL
    MAkGA1UEBhMCWFkxFTATBgNVBAgTDFNuYWtlIERlc2VydDETMBEGA1UEBxMKU25h
    a2UgVG93bjEXMBUGA1UEChMOU25ha2UgT2lsLCBMdGQxFzAVBgNVBAsTDldlYnNl
    cnZlciBUZWFtMRkwFwYDVQQDExB3d3cuc25ha2VvaWwuZG9tMR8wHQYJKoZIhvcN
    AQkBFhB3d3dAc25ha2VvaWwuZG9tMIGfMA0GCSqGSIb3DQEBAQUAA4GNADCBiQKB
    gQDH9Ge/s2zcH+da+rPTx/DPRp3xGjHZ4GG6pCmvADIEtBtKBFAcZ64n+Dy7Np8b
    vKR+yy5DGQiijsH1D/j8HlGE+q4TZ8OFk7BNBFazHxFbYI4OKMiCxdKzdif1yfaa
    lWoANFlAzlSdbxeGVHoT0K+gT5w3UxwZKv2DLbCTzLZyPwIDAQABoyYwJDAPBgNV
    HRMECDAGAQH/AgEAMBEGCWCGSAGG+EIBAQQEAwIAQDANBgkqhkiG9w0BAQQFAAOB
    gQAZUIHAL4D09oE6Lv2k56Gp38OBDuILvwLg1v1KL8mQR+KFjghCrtpqaztZqcDt
    2q2QoyulCgSzHbEGmi0EsdkPfg6mp0penssIFePYNI+/8u9HT4LuKMJX15hxBam7
    dUHzICxBVC1lnHyYGjDuAMhe396lYAn8bCld1/L4NMGBCQ==
    -----END CERTIFICATE-----

Certificate Authorities

By verifying the information in a certificate request before granting the certificate, the Certificate Authority assures itself of the identity of the private key owner of a key-pair. For instance, if Alice requests a personal certificate, the Certificate Authority must first make sure that Alice really is the person the certificate request claims she is.

Certificate Chains

A Certificate Authority may also issue a certificate for another Certificate Authority. When examining a certificate, Alice may need to examine the certificate of the issuer, for each parent Certificate Authority, until reaching one which she has confidence in. She may decide to trust only certificates with a limited chain of issuers, to reduce her risk of a "bad" certificate in the chain.
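The chain walk described above can be sketched in Python as follows. This is a simplified illustration, not how any particular SSL library validates chains: it follows issuer names only and performs no signature, validity-period or revocation checks. The `ISSUED_BY` table, the names in it and the `max_depth` parameter are all hypothetical.

```python
from typing import Dict, Set

# Hypothetical issuer table: subject name -> issuer name.
ISSUED_BY: Dict[str, str] = {
    "www.example.com": "Example Intermediate CA",
    "Example Intermediate CA": "Example Root CA",
    "Example Root CA": "Example Root CA",  # self-signed root
}


def chain_is_trusted(subject: str, trusted_cas: Set[str], max_depth: int) -> bool:
    """Follow issuers upward until a trusted CA is found or the limit is hit."""
    current = subject
    for _ in range(max_depth):
        issuer = ISSUED_BY.get(current)
        if issuer is None:
            return False  # issuer certificate is not available
        if issuer in trusted_cas:
            return True   # reached an authority Alice has confidence in
        if issuer == current:
            return False  # self-signed, but not in the trusted set
        current = issuer
    return False          # chain is longer than Alice is willing to accept


print(chain_is_trusted("www.example.com", {"Example Root CA"}, max_depth=2))
print(chain_is_trusted("www.example.com", {"Example Root CA"}, max_depth=1))
```

Limiting `max_depth` corresponds to Alice's decision to accept only a limited chain of issuers.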
Creating a Root-Level CA

As noted earlier, each certificate requires an issuer to assert the validity of the identity of the certificate subject, up to the top-level Certificate Authority (CA). This presents a problem: who can vouch for the certificate of the top-level authority, which has no issuer? In this unique case, the certificate is "self-signed", so the issuer of the certificate is the same as the subject. Browsers are preconfigured to trust well-known certificate authorities, but it is important to exercise extra care in trusting a self-signed certificate. The wide publication of a public key by the root authority reduces the risk in trusting this key – it would be obvious if someone else publicized a key claiming to be the authority.

A number of companies, such as Thawte[2] and VeriSign[3], have established themselves as Certificate Authorities. These companies provide the following services:

• Verifying certificate requests
• Processing certificate requests
• Issuing and managing certificates

It is also possible to create your own Certificate Authority. Although risky in the Internet environment, it may be useful within an Intranet where the organization can easily verify the identities of individuals and servers.

Certificate Management

Establishing a Certificate Authority is a responsibility which requires a solid administrative, technical and management framework. Certificate Authorities not only issue certificates, they also manage them – that is, they determine for how long certificates remain valid, they renew them and keep lists of certificates that were issued in the past but are no longer valid (Certificate Revocation Lists, or CRLs).

For example, if Alice is entitled to a certificate as an employee of a company but has now left that company, her certificate may need to be revoked.
Because certificates are only issued after the subject’s identity has been verified and can then be passed around to all those with whom the subject may communicate, it is impossible to tell from the certificate alone that it has been revoked. Therefore when examining certificates for validity it is necessary to contact the issuing Certificate Authority to check CRLs – this is usually not an automated part of the process.

=⇒Note
If you use a Certificate Authority that browsers are not configured to trust by default, it is necessary to load the Certificate Authority certificate into the browser, enabling the browser to validate server certificates signed by that Certificate Authority. Doing so may be dangerous, since once loaded, the browser will accept all certificates signed by that Certificate Authority.

Secure Sockets Layer (SSL)

The Secure Sockets Layer protocol is a protocol layer which may be placed between a reliable connection-oriented network layer protocol (e.g. TCP/IP) and the application protocol layer (e.g. HTTP). SSL provides for secure communication between client and server by allowing mutual authentication, the use of digital signatures for integrity and encryption for privacy.

[2] http://www.thawte.com/
[3] http://www.verisign.com/

The protocol is designed to support a range of choices for specific algorithms used for cryptography, digests and signatures. This allows algorithm selection for specific servers to be made based on legal, export or other concerns and also enables the protocol to take advantage of new algorithms. Choices are negotiated between client and server when establishing a protocol session.

Table 4: Versions of the SSL protocol

    Version    Source                                                Description
    SSL v2.0   Vendor Standard (from Netscape Corp.)                 First SSL protocol for which implementations exist
    SSL v3.0   Expired Internet Draft (from Netscape Corp.) [SSL3]   Revisions to prevent specific security attacks, add non-RSA ciphers and support for certificate chains
    TLS v1.0   Proposed Internet Standard (from IETF) [TLS1]         Revision of SSL 3.0 to update the MAC layer to HMAC, add block padding for block ciphers, message order standardization and more alert messages.
    TLS v1.1   Proposed Internet Standard (from IETF) [TLS11]        Update of TLS 1.0 to add protection against Cipher block chaining (CBC) attacks.
    TLS v1.2   Proposed Internet Standard (from IETF) [TLS12]        Update of TLS 1.1 deprecating MD5 as hash, and adding incompatibility to SSL so it will never negotiate the use of SSLv2.

There are a number of versions of the SSL protocol, as shown in Table 4. As noted there, one of the benefits in SSL 3.0 is that it adds support of certificate chain loading. This feature allows a server to pass a server certificate along with issuer certificates to the browser. Chain loading also permits the browser to validate the server certificate, even if Certificate Authority certificates are not installed for the intermediate issuers, since they are included in the certificate chain. SSL 3.0 is the basis for the Transport Layer Security [TLS] protocol standard, currently in development by the Internet Engineering Task Force (IETF).

Establishing a Session

The SSL session is established by following a handshake sequence between client and server, as shown in Figure 1. This sequence may vary, depending on whether the server is configured to provide a server certificate or request a client certificate. Although cases exist where additional handshake steps are required for management of cipher information, this article summarizes one common scenario. See the SSL specification for the full range of possibilities.

=⇒Note
Once an SSL session has been established, it may be reused. This avoids the performance penalty of repeating the many steps needed to start a session.
To do this, the server assigns each SSL session a unique session identifier which is cached in the server and which the client can use in future connections to reduce the handshake time (until the session identifier expires from the cache of the server).

Figure 1: Simplified SSL Handshake Sequence

The elements of the handshake sequence, as used by the client and server, are listed below:

1. Negotiate the Cipher Suite to be used during data transfer
2. Establish and share a session key between client and server
3. Optionally authenticate the server to the client
4. Optionally authenticate the client to the server

The first step, Cipher Suite Negotiation, allows the client and server to choose a Cipher Suite supported by both of them. The SSL3.0 protocol specification defines 31 Cipher Suites. A Cipher Suite is defined by the following components:

• Key Exchange Method
• Cipher for Data Transfer
• Message Digest for creating the Message Authentication Code (MAC)

These three elements are described in the sections that follow.

Key Exchange Method

The key exchange method defines how the shared secret symmetric cryptography key used for application data transfer will be agreed upon by client and server. SSL 2.0 uses RSA key exchange only, while SSL 3.0 supports a choice of key exchange algorithms including RSA key exchange (when certificates are used), and Diffie-Hellman key exchange (for exchanging keys without certificates, or without prior communication between client and server).

One variable in the choice of key exchange methods is digital signatures – whether or not to use them, and if so, what kind of signatures to use. Signing with a private key provides protection against a man-in-the-middle attack during the information exchange used to generate the shared key [AC96, p516].

Cipher for Data Transfer

SSL uses conventional symmetric cryptography, as described earlier, for encrypting messages in a session. There are nine choices of how to encrypt, including the option not to encrypt:

• No encryption
• Stream Ciphers
  – RC4 with 40-bit keys
  – RC4 with 128-bit keys
• CBC Block Ciphers
  – RC2 with 40 bit key
  – DES with 40 bit key
  – DES with 56 bit key
  – Triple-DES with 168 bit key
  – Idea (128 bit key)
  – Fortezza (96 bit key)

"CBC" refers to Cipher Block Chaining, which means that a portion of the previously encrypted cipher text is used in the encryption of the current block. "DES" refers to the Data Encryption Standard [AC96, ch12], which has a number of variants (including DES40 and 3DES EDE). "Idea" is currently one of the best and cryptographically strongest algorithms available, and "RC2" is a proprietary algorithm from RSA DSI [AC96, ch13].

Digest Function

The choice of digest function determines how a digest is created from a record unit. SSL supports the following:

• No digest (Null choice)
• MD5, a 128-bit hash
• Secure Hash Algorithm (SHA-1), a 160-bit hash

The message digest is used to create a Message Authentication Code (MAC) which is encrypted with the message to verify integrity and to protect against replay attacks.

Handshake Sequence Protocol

The handshake sequence uses three protocols:

• The SSL Handshake Protocol for performing the client and server SSL session establishment.
• The SSL Change Cipher Spec Protocol for actually establishing agreement on the Cipher Suite for the session.
• The SSL Alert Protocol for conveying SSL error messages between client and server.

These protocols, as well as application protocol data, are encapsulated in the SSL Record Protocol, as shown in Figure 2. An encapsulated protocol is transferred as data by the lower layer protocol, which does not examine the data. The encapsulated protocol has no knowledge of the underlying protocol.
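The first handshake step, Cipher Suite negotiation, can be pictured with the following Python sketch. It is a deliberately simplified model rather than the actual protocol logic: real implementations exchange numeric suite identifiers inside handshake messages. The function name, the `honor_server_order` flag and the suite names used in the example are assumptions made for illustration.

```python
from typing import List, Optional


def negotiate_cipher_suite(
    client_suites: List[str],
    server_suites: List[str],
    honor_server_order: bool = False,
) -> Optional[str]:
    """Pick the first suite, in the preferred side's order, that both support."""
    if honor_server_order:
        preferred, other = server_suites, client_suites
    else:
        preferred, other = client_suites, server_suites
    supported_by_other = set(other)
    for suite in preferred:
        if suite in supported_by_other:
            return suite
    return None  # no common suite: the handshake cannot proceed


# Illustrative preference lists (most preferred first).
client = ["RC4-SHA", "AES128-SHA"]
server = ["AES128-SHA", "DES-CBC3-SHA", "RC4-SHA"]

print(negotiate_cipher_suite(client, server))
print(negotiate_cipher_suite(client, server, honor_server_order=True))
```

Whether the client's or the server's ordering wins is a policy choice; the sketch exposes it as a flag so both outcomes can be compared.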
Figure 2: SSL Protocol Stack

The encapsulation of SSL control protocols by the record protocol means that if an active session is renegotiated the control protocols will be transmitted securely. If there was no previous session, the Null cipher suite is used, which means there will be no encryption and messages will have no integrity digests, until the session has been established.

Data Transfer

The SSL Record Protocol, shown in Figure 3, is used to transfer application and SSL Control data between the client and server, where necessary fragmenting this data into smaller units, or combining multiple higher level protocol data messages into single units. It may compress, attach digest signatures, and encrypt these units before transmitting them using the underlying reliable transport protocol (Note: currently, no major SSL implementations include support for compression).

Figure 3: SSL Record Protocol

Securing HTTP Communication

One common use of SSL is to secure Web HTTP communication between a browser and a webserver. This does not preclude the use of non-secured HTTP - the secure version (called HTTPS) is the same as plain HTTP over SSL, but uses the URL scheme https rather than http, and a different server port (by default, port 443). This functionality is a large part of what mod_ssl provides for the Apache webserver.

References

[AC96] Bruce Schneier, Applied Cryptography, 2nd Edition, Wiley, 1996. See http://www.counterpane.com/ for various other materials by Bruce Schneier.

[ASN1] ITU-T Recommendation X.208, Specification of Abstract Syntax Notation One (ASN.1), last updated 2008. See http://www.itu.int/ITU-T/asn1/.

[X509] ITU-T Recommendation X.509, The Directory - Authentication Framework. For references, see http://en.wikipedia.org/wiki/X.509.

[PKCS] Public Key Cryptography Standards (PKCS), RSA Laboratories Technical Notes. See http://www.rsasecurity.com/rsalabs/pkcs/.

[MIME] N. Freed, N. Borenstein, Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies, RFC2045. See for instance http://tools.ietf.org/html/rfc2045.

[SSL3] Alan O. Freier, Philip Karlton, Paul C. Kocher, The SSL Protocol Version 3.0, 1996. See http://www.netscape.com/eng/ssl3/draft302.txt.

[TLS1] Tim Dierks, Christopher Allen, The TLS Protocol Version 1.0, 1999. See http://ietf.org/rfc/rfc2246.txt.

[TLS11] The TLS Protocol Version 1.1, 2006. See http://tools.ietf.org/html/rfc4346.

[TLS12] The TLS Protocol Version 1.2, 2008. See http://tools.ietf.org/html/rfc5246.

5.3 SSL/TLS Strong Encryption: Compatibility

This page covers backwards compatibility between mod_ssl and other SSL solutions. mod_ssl is not the only SSL solution for Apache; four additional products are (or were) also available: Ben Laurie’s freely available Apache-SSL[4] (from which mod_ssl was originally derived in 1998), Red Hat’s commercial Secure Web Server (which was based on mod_ssl), Covalent’s commercial Raven SSL Module (also based on mod_ssl) and finally C2Net’s (now Red Hat’s) commercial product Stronghold[5] (based on a different evolution branch, named Sioux, up to Stronghold 2.x, and based on mod_ssl since Stronghold 3.x).

mod_ssl mostly provides a superset of the functionality of all the other solutions, so it’s simple to migrate from one of the older modules to mod_ssl. The configuration directives and environment variable names used by the older SSL solutions vary from those used in mod_ssl; mapping tables are included here to give the equivalents used by mod_ssl.

Configuration Directives

The mapping between configuration directives used by Apache-SSL 1.x and mod_ssl 2.0.x is given in Table 1. The mapping from Sioux 1.x and Stronghold 2.x is only partial because of special functionality in these interfaces which mod_ssl doesn’t provide.
Table 1: Configuration Directive Mapping

    Old Directive                  mod_ssl Directive                                Comment

    Apache-SSL 1.x & mod_ssl 2.0.x compatibility:
    SSLEnable                      SSLEngine on                                     compactified
    SSLDisable                     SSLEngine off                                    compactified
    SSLLogFile file                -                                                Use per-module LogLevel setting instead.
    SSLRequiredCiphers spec        SSLCipherSuite spec                              renamed
    SSLRequireCipher c1 ...        SSLRequire %{SSL_CIPHER} in {"c1", ...}          generalized
    SSLBanCipher c1 ...            SSLRequire not (%{SSL_CIPHER} in {"c1", ...})    generalized
    SSLFakeBasicAuth               SSLOptions +FakeBasicAuth                        merged
    SSLCacheServerPath dir         -                                                functionality removed
    SSLCacheServerPort integer     -                                                functionality removed

    Apache-SSL 1.x compatibility:
    SSLExportClientCertificates    SSLOptions +ExportCertData                       merged
    SSLCacheServerRunDir dir       -                                                functionality not supported

    Sioux 1.x compatibility:
    SSL_CertFile file              SSLCertificateFile file                          renamed
    SSL_KeyFile file               SSLCertificateKeyFile file                       renamed
    SSL_CipherSuite arg            SSLCipherSuite arg                               renamed
    SSL_X509VerifyDir arg          SSLCACertificatePath arg                         renamed
    SSL_Log file                   -                                                Use per-module LogLevel setting instead.
    SSL_Connect flag               SSLEngine flag                                   renamed
    SSL_ClientAuth arg             SSLVerifyClient arg                              renamed
    SSL_X509VerifyDepth arg        SSLVerifyDepth arg                               renamed
    SSL_FetchKeyPhraseFrom arg     -                                                not directly mappable; use SSLPassPhraseDialog
    SSL_SessionDir dir             -                                                not directly mappable; use SSLSessionCache
    SSL_Require expr               -                                                not directly mappable; use SSLRequire
    SSL_CertFileType arg           -                                                functionality not supported
    SSL_KeyFileType arg            -                                                functionality not supported
    SSL_X509VerifyPolicy arg       -                                                functionality not supported
    SSL_LogX509Attributes arg      -                                                functionality not supported

    Stronghold 2.x compatibility:
    StrongholdAccelerator engine   SSLCryptoDevice engine                           renamed
    StrongholdKey dir              -                                                functionality not needed
    StrongholdLicenseFile dir      -                                                functionality not needed
    SSLFlag flag                   SSLEngine flag                                   renamed
    SSLSessionLockFile file        SSLMutex file                                    renamed
    SSLCipherList spec             SSLCipherSuite spec                              renamed
    RequireSSL                     SSLRequireSSL                                    renamed
    SSLErrorFile file              -                                                functionality not supported
    SSLRoot dir                    -                                                functionality not supported
    SSL_CertificateLogDir dir      -                                                functionality not supported
    AuthCertDir dir                -                                                functionality not supported
    SSL_Group name                 -                                                functionality not supported
    SSLProxyMachineCertPath dir    SSLProxyMachineCertificatePath dir               renamed
    SSLProxyMachineCertFile file   SSLProxyMachineCertificateFile file              renamed
    SSLProxyCipherList spec        SSLProxyCipherSpec spec                          renamed

[4] http://www.apache-ssl.org/
[5] http://www.redhat.com/explore/stronghold/

Environment Variables

The mapping between environment variable names used by the older SSL solutions and the names used by mod_ssl is given in Table 2.
Table 2: Environment Variable Derivation

    Old Variable                     mod_ssl Variable         Comment
    SSL_PROTOCOL_VERSION             SSL_PROTOCOL             renamed
    SSLEAY_VERSION                   SSL_VERSION_LIBRARY      renamed
    HTTPS_SECRETKEYSIZE              SSL_CIPHER_USEKEYSIZE    renamed
    HTTPS_KEYSIZE                    SSL_CIPHER_ALGKEYSIZE    renamed
    HTTPS_CIPHER                     SSL_CIPHER               renamed
    HTTPS_EXPORT                     SSL_CIPHER_EXPORT        renamed
    SSL_SERVER_KEY_SIZE              SSL_CIPHER_ALGKEYSIZE    renamed
    SSL_SERVER_CERTIFICATE           SSL_SERVER_CERT          renamed
    SSL_SERVER_CERT_START            SSL_SERVER_V_START       renamed
    SSL_SERVER_CERT_END              SSL_SERVER_V_END         renamed
    SSL_SERVER_CERT_SERIAL           SSL_SERVER_M_SERIAL      renamed
    SSL_SERVER_SIGNATURE_ALGORITHM   SSL_SERVER_A_SIG         renamed
    SSL_SERVER_DN                    SSL_SERVER_S_DN          renamed
    SSL_SERVER_CN                    SSL_SERVER_S_DN_CN       renamed
    SSL_SERVER_EMAIL                 SSL_SERVER_S_DN_Email    renamed
    SSL_SERVER_O                     SSL_SERVER_S_DN_O        renamed
    SSL_SERVER_OU                    SSL_SERVER_S_DN_OU       renamed
    SSL_SERVER_C                     SSL_SERVER_S_DN_C        renamed
    SSL_SERVER_SP                    SSL_SERVER_S_DN_SP       renamed
    SSL_SERVER_L                     SSL_SERVER_S_DN_L        renamed
    SSL_SERVER_IDN                   SSL_SERVER_I_DN          renamed
    SSL_SERVER_ICN                   SSL_SERVER_I_DN_CN       renamed
    SSL_SERVER_IEMAIL                SSL_SERVER_I_DN_Email    renamed
    SSL_SERVER_IO                    SSL_SERVER_I_DN_O        renamed
    SSL_SERVER_IOU                   SSL_SERVER_I_DN_OU       renamed
    SSL_SERVER_IC                    SSL_SERVER_I_DN_C        renamed
    SSL_SERVER_ISP                   SSL_SERVER_I_DN_SP       renamed
    SSL_SERVER_IL                    SSL_SERVER_I_DN_L        renamed
    SSL_CLIENT_CERTIFICATE           SSL_CLIENT_CERT          renamed
    SSL_CLIENT_CERT_START            SSL_CLIENT_V_START       renamed
    SSL_CLIENT_CERT_END              SSL_CLIENT_V_END         renamed
    SSL_CLIENT_CERT_SERIAL           SSL_CLIENT_M_SERIAL      renamed
    SSL_CLIENT_SIGNATURE_ALGORITHM   SSL_CLIENT_A_SIG         renamed
    SSL_CLIENT_DN                    SSL_CLIENT_S_DN          renamed
    SSL_CLIENT_CN                    SSL_CLIENT_S_DN_CN       renamed
    SSL_CLIENT_EMAIL                 SSL_CLIENT_S_DN_Email    renamed
    SSL_CLIENT_O                     SSL_CLIENT_S_DN_O        renamed
    SSL_CLIENT_OU                    SSL_CLIENT_S_DN_OU       renamed
    SSL_CLIENT_C                     SSL_CLIENT_S_DN_C        renamed
    SSL_CLIENT_SP                    SSL_CLIENT_S_DN_SP       renamed
    SSL_CLIENT_L                     SSL_CLIENT_S_DN_L        renamed
    SSL_CLIENT_IDN                   SSL_CLIENT_I_DN          renamed
    SSL_CLIENT_ICN                   SSL_CLIENT_I_DN_CN       renamed
    SSL_CLIENT_IEMAIL                SSL_CLIENT_I_DN_Email    renamed
    SSL_CLIENT_IO                    SSL_CLIENT_I_DN_O        renamed
    SSL_CLIENT_IOU                   SSL_CLIENT_I_DN_OU       renamed
    SSL_CLIENT_IC                    SSL_CLIENT_I_DN_C        renamed
    SSL_CLIENT_ISP                   SSL_CLIENT_I_DN_SP       renamed
    SSL_CLIENT_IL                    SSL_CLIENT_I_DN_L        renamed
    SSL_EXPORT                       SSL_CIPHER_EXPORT        renamed
    SSL_KEYSIZE                      SSL_CIPHER_ALGKEYSIZE    renamed
    SSL_SECKEYSIZE                   SSL_CIPHER_USEKEYSIZE    renamed
    SSL_SSLEAY_VERSION               SSL_VERSION_LIBRARY      renamed
    SSL_STRONG_CRYPTO                -                        Not supported by mod_ssl
    SSL_SERVER_KEY_EXP               -                        Not supported by mod_ssl
    SSL_SERVER_KEY_ALGORITHM         -                        Not supported by mod_ssl
    SSL_SERVER_KEY_SIZE              -                        Not supported by mod_ssl
    SSL_SERVER_SESSIONDIR            -                        Not supported by mod_ssl
    SSL_SERVER_CERTIFICATELOGDIR     -                        Not supported by mod_ssl
    SSL_SERVER_CERTFILE              -                        Not supported by mod_ssl
    SSL_SERVER_KEYFILE               -                        Not supported by mod_ssl
    SSL_SERVER_KEYFILETYPE           -                        Not supported by mod_ssl
    SSL_CLIENT_KEY_EXP               -                        Not supported by mod_ssl
    SSL_CLIENT_KEY_ALGORITHM         -                        Not supported by mod_ssl
    SSL_CLIENT_KEY_SIZE              -                        Not supported by mod_ssl

Custom Log Functions

When mod_ssl is enabled, additional functions exist for the Custom Log Format (p. 705) of mod_log_config as documented in the Reference Chapter. Beside the “%{varname}x” eXtension format function which can be used to expand any variables provided by any module, an additional “%{name}c” cryptography format function exists for backward compatibility. The currently implemented function calls are listed in Table 3.

Table 3: Custom Log Cryptography Function

    Function Call       Description
    %...{version}c      SSL protocol version
    %...{cipher}c       SSL cipher
    %...{subjectdn}c    Client Certificate Subject Distinguished Name
    %...{issuerdn}c     Client Certificate Issuer Distinguished Name
    %...{errcode}c      Certificate Verification Error (numerical)
    %...{errstr}c       Certificate Verification Error (string)

5.4 SSL/TLS Strong Encryption: How-To

This document is intended to get you started, and get a few things working. You are strongly encouraged to read the rest of the SSL documentation, and arrive at a deeper understanding of the material, before progressing to the advanced techniques.

Basic Configuration Example

Your SSL configuration will need to contain, at minimum, the following directives.

    Listen 443
    ServerName www.example.com
    SSLEngine on
    SSLCertificateFile "/path/to/www.example.com.cert"
    SSLCertificateKeyFile "/path/to/www.example.com.key"

Cipher Suites and Enforcing Strong Security

• How can I create an SSL server which accepts strong encryption only?
• How can I create an SSL server which accepts all types of ciphers in general, but requires a strong cipher for access to a particular URL?

How can I create an SSL server which accepts strong encryption only?
The following enables only the strongest ciphers:

    SSLCipherSuite HIGH:!aNULL:!MD5

The following configuration instead specifies a preference for specific speed-optimized ciphers (which will be selected by mod_ssl, provided that they are supported by the client):

    SSLCipherSuite RC4-SHA:AES128-SHA:HIGH:!aNULL:!MD5
    SSLHonorCipherOrder on

How can I create an SSL server which accepts all types of ciphers in general, but requires a strong cipher for access to a particular URL?

Obviously, a server-wide SSLCipherSuite which restricts ciphers to the strong variants isn’t the answer here. However, mod_ssl can be reconfigured within Location blocks, to give a per-directory solution, and can automatically force a renegotiation of the SSL parameters to meet the new configuration. This can be done as follows:

    # be liberal in general
    SSLCipherSuite ALL:!aNULL:RC4+RSA:+HIGH:+MEDIUM:+LOW:+EXP:+eNULL

    <Location "/strong/area">
    # but https://hostname/strong/area/ and below
    # requires strong ciphers
    SSLCipherSuite HIGH:!aNULL:!MD5
    </Location>

OCSP Stapling

The Online Certificate Status Protocol (OCSP) is a mechanism for determining whether or not a server certificate has been revoked, and OCSP Stapling is a special form of this in which the server, such as httpd and mod_ssl, maintains current OCSP responses for its certificates and sends them to clients which communicate with the server. Most certificates contain the address of an OCSP responder maintained by the issuing Certificate Authority, and mod_ssl can communicate with that responder to obtain a signed response that can be sent to clients communicating with the server.

Because the client can obtain the certificate revocation status from the server, without requiring an extra connection from the client to the Certificate Authority, OCSP Stapling is the preferred way for the revocation status to be obtained.
Other benefits of eliminating the communication between clients and the Certificate Authority are that the client browsing history is not exposed to the Certificate Authority and obtaining status is more reliable by not depending on potentially heavily loaded Certificate Authority servers.

Because the response obtained by the server can be reused for all clients using the same certificate during the time that the response is valid, the overhead for the server is minimal.

Once general SSL support has been configured properly, enabling OCSP Stapling generally requires only very minor modifications to the httpd configuration - the addition of these two directives:

    SSLUseStapling On
    SSLStaplingCache "shmcb:ssl_stapling(32768)"

These directives are placed at global scope (i.e., not within a virtual host definition) wherever other global SSL configuration directives are placed, such as in conf/extra/httpd-ssl.conf for normal open source builds of httpd, /etc/apache2/mods-enabled/ssl.conf for the Ubuntu or Debian-bundled httpd, etc.

This particular SSLStaplingCache directive requires mod_socache_shmcb (from the shmcb prefix on the directive’s argument). This module is usually enabled already for SSLSessionCache or on behalf of some module other than mod_ssl. If you enabled an SSL session cache using a mechanism other than mod_socache_shmcb, use that alternative mechanism for SSLStaplingCache as well. For example:

    SSLSessionCache "dbm:ssl_scache"
    SSLStaplingCache "dbm:ssl_stapling"

You can use the openssl command-line program to verify that an OCSP response is sent by your server:

    $ openssl s_client -connect www.example.com:443 -status -servername www.example.com
    ...
    OCSP response:
    ======================================
    OCSP Response Data:
        OCSP Response Status: successful (0x0)
        Response Type: Basic OCSP Response
    ...
        Cert Status: Good
    ...

The following sections highlight the most common situations which require further modification to the configuration.
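The manual check with openssl shown above could also be scripted. The following Python sketch only parses text in the format of the sample output; it does not run openssl itself, and the function names are hypothetical. It assumes the two labels "OCSP Response Status:" and "Cert Status:" appear as in the sample, and compares the certificate status case-insensitively because the exact capitalisation may differ between openssl versions.

```python
import re
from typing import Dict, Optional


def parse_stapling_status(s_client_output: str) -> Dict[str, Optional[str]]:
    """Extract the OCSP response status and certificate status, if present."""

    def first_match(pattern: str) -> Optional[str]:
        match = re.search(pattern, s_client_output)
        return match.group(1) if match else None

    return {
        "response_status": first_match(r"OCSP Response Status:\s*(\w+)"),
        "cert_status": first_match(r"Cert Status:\s*(\w+)"),
    }


def stapled_and_good(status: Dict[str, Optional[str]]) -> bool:
    """True when a stapled response was received and reports a good certificate."""
    return (
        status["response_status"] == "successful"
        and (status["cert_status"] or "").lower() == "good"
    )


# Sample text modelled on the output shown in this section.
sample = """\
OCSP response:
======================================
OCSP Response Data:
    OCSP Response Status: successful (0x0)
    Response Type: Basic OCSP Response
    Cert Status: Good
"""

status = parse_stapling_status(sample)
print(status)
print(stapled_and_good(status))
```

If no stapled response is present in the text, both fields are `None` and `stapled_and_good` returns `False`.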
Refer also to the mod_ssl reference manual.

If more than a few SSL certificates are used for the server

OCSP responses are stored in the SSL stapling cache. While the responses are typically a few hundred to a few thousand bytes in size, mod_ssl supports OCSP responses up to around 10K bytes in size. With more than a few certificates, the stapling cache size (32768 bytes in the example above) may need to be increased. Error message AH01929 will be logged in case of an error storing a response.

If the certificate does not point to an OCSP responder, or if a different address must be used

Refer to the SSLStaplingForceURL directive.

You can confirm that a server certificate points to an OCSP responder using the openssl command-line program, as follows:

    $ openssl x509 -in ./www.example.com.crt -text | grep 'OCSP.*http'
    OCSP - URI:http://ocsp.example.com

If the OCSP URI is provided and the web server can communicate to it directly without using a proxy, no configuration is required. Note that firewall rules that control outbound connections from the web server may need to be adjusted.

If no OCSP URI is provided, contact your Certificate Authority to determine if one is available; if so, configure it with SSLStaplingForceURL in the virtual host that uses the certificate.

If multiple SSL-enabled virtual hosts are configured and OCSP Stapling should be disabled for some

Add SSLUseStapling Off to the virtual hosts for which OCSP Stapling should be disabled.

If the OCSP responder is slow or unreliable

Several directives are available to handle timeouts and errors. Refer to the documentation for the SSLStaplingFakeTryLater, SSLStaplingResponderTimeout, and SSLStaplingReturnResponderErrors directives.

If mod_ssl logs error AH02217

    AH02217: ssl_stapling_init_cert: Can't retrieve issuer certificate!
In order to support OCSP Stapling when a particular server certificate is used, the certificate chain for that certificate must be configured. If it was not configured as part of enabling SSL, the AH02217 error will be issued when stapling is enabled, and an OCSP response will not be provided for clients using the certificate.

Refer to the SSLCertificateChainFile and SSLCertificateFile directives for instructions on configuring the certificate chain.

Client Authentication and Access Control

• How can I force clients to authenticate using certificates?
• How can I force clients to authenticate using certificates for a particular URL, but still allow arbitrary clients to access the rest of the server?
• How can I allow only clients who have certificates to access a particular URL, but allow all clients to access the rest of the server?
• How can I require HTTPS with strong ciphers, and either basic authentication or client certificates, for access to part of the Intranet website, for clients coming from the Internet?

How can I force clients to authenticate using certificates?

When you know all of your users (e.g., as is often the case on a corporate Intranet), you can require plain certificate authentication. All you need to do is to create client certificates signed by your own CA certificate (ca.crt) and then verify the clients against this certificate.

    # require a client certificate which has to be directly
    # signed by our CA certificate in ca.crt
    SSLVerifyClient require
    SSLVerifyDepth 1
    SSLCACertificateFile "conf/ssl.crt/ca.crt"

How can I force clients to authenticate using certificates for a particular URL, but still allow arbitrary clients to access the rest of the server?
To force clients to authenticate using certificates for a particular URL, you can use the per-directory reconfiguration features of mod_ssl:

    SSLVerifyClient none
    SSLCACertificateFile "conf/ssl.crt/ca.crt"

    <Location "/secure/area">
        SSLVerifyClient require
        SSLVerifyDepth 1
    </Location>

How can I allow only clients who have certificates to access a particular URL, but allow all clients to access the rest of the server?

The key to doing this is checking that part of the client certificate matches what you expect. Usually this means checking all or part of the Distinguished Name (DN), to see if it contains some known string. There are two ways to do this, using either mod_auth_basic or SSLRequire.

The mod_auth_basic method is generally required when the certificates are completely arbitrary, or when their DNs have no common fields (usually the organisation, etc.). In this case, you should establish a password database containing all clients allowed, as follows:

    SSLVerifyClient none
    SSLCACertificateFile "conf/ssl.crt/ca.crt"
    SSLCACertificatePath "conf/ssl.crt"

    <Directory "/usr/local/apache2/htdocs/secure/area">
        SSLVerifyClient require
        SSLVerifyDepth 5
        SSLOptions +FakeBasicAuth
        SSLRequireSSL
        AuthName "Snake Oil Authentication"
        AuthType Basic
        AuthBasicProvider file
        AuthUserFile "/usr/local/apache2/conf/httpd.passwd"
        Require valid-user
    </Directory>

The password used in this example is the DES encrypted string "password". See the SSLOptions docs for more information.

httpd.passwd

    /C=DE/L=Munich/O=Snake Oil, Ltd./OU=Staff/CN=Foo:xxj31ZMTZzkVA
    /C=US/L=S.F./O=Snake Oil, Ltd./OU=CA/CN=Bar:xxj31ZMTZzkVA
    /C=US/L=L.A./O=Snake Oil, Ltd./OU=Dev/CN=Quux:xxj31ZMTZzkVA

When your clients are all part of a common hierarchy, which is encoded into the DN, you can match them more easily using SSLRequire, as follows:

    SSLVerifyClient none
    SSLCACertificateFile "conf/ssl.crt/ca.crt"
    SSLCACertificatePath "conf/ssl.crt"

    <Directory "/usr/local/apache2/htdocs/secure/area">
        SSLVerifyClient require
        SSLVerifyDepth 5
        SSLOptions +FakeBasicAuth
        SSLRequireSSL
        SSLRequire %{SSL_CLIENT_S_DN_O} eq "Snake Oil, Ltd." \
               and %{SSL_CLIENT_S_DN_OU} in {"Staff", "CA", "Dev"}
    </Directory>

How can I require HTTPS with strong ciphers, and either basic authentication or client certificates, for access to part of the Intranet website, for clients coming from the Internet? I still want to allow plain HTTP access for clients on the Intranet.

These examples presume that clients on the Intranet have IPs in the range 192.168.1.0/24, and that the part of the Intranet website you want to allow internet access to is /usr/local/apache2/htdocs/subarea. This configuration should remain outside of your HTTPS virtual host, so that it applies to both HTTPS and HTTP.

    SSLCACertificateFile "conf/ssl.crt/company-ca.crt"

    <Directory "/usr/local/apache2/htdocs">
        # Outside the subarea only Intranet access is granted
        Require ip 192.168.1.0/24
    </Directory>

    <Directory "/usr/local/apache2/htdocs/subarea">
        # Inside the subarea any Intranet access is allowed
        # but from the Internet only HTTPS + Strong-Cipher + Password
        # or the alternative HTTPS + Strong-Cipher + Client-Certificate

        # If HTTPS is used, make sure a strong cipher is used.
        # Additionally allow client certs as alternative to basic auth.
        SSLVerifyClient optional
        SSLVerifyDepth 1
        SSLOptions +FakeBasicAuth +StrictRequire
        SSLRequire %{SSL_CIPHER_USEKEYSIZE} >= 128

        # Force clients from the Internet to use HTTPS
        RewriteEngine on
        RewriteCond "%{REMOTE_ADDR}" "!^192\.168\.1\.[0-9]+$"
        RewriteCond "%{HTTPS}" "!=on"
        RewriteRule "." "-" [F]

        # Allow Network Access and/or Basic Auth
        Satisfy any

        # Network Access Control
        Require ip 192.168.1.0/24

        # HTTP Basic Authentication
        AuthType basic
        AuthName "Protected Intranet Area"
        AuthBasicProvider file
        AuthUserFile "conf/protected.passwd"
        Require valid-user
    </Directory>

Logging

mod_ssl can log extremely verbose debugging information to the error log when its LogLevel is set to the higher trace levels. On the other hand, on a very busy server, level info may already be too much. Remember that you can configure the LogLevel per module to suit your needs.
5.5 SSL/TLS Strong Encryption: FAQ

The wise man doesn't give the right answers, he poses the right questions.
– Claude Levi-Strauss

Installation

• Why do I get permission errors related to SSLMutex when I start Apache?
• Why does mod_ssl stop with the error "Failed to generate temporary 512 bit RSA private key" when I start Apache?

Why do I get permission errors related to SSLMutex when I start Apache?

Errors such as "mod_ssl: Child could not open SSLMutex lockfile /opt/apache/logs/ssl_mutex.18332 (System error follows) [...] System: Permission denied (errno: 13)" are usually caused by overly restrictive permissions on the parent directories. Make sure that all parent directories (here /opt, /opt/apache and /opt/apache/logs) have the x-bit set for, at minimum, the UID under which Apache's children are running (see the User directive).

Why does mod_ssl stop with the error "Failed to generate temporary 512 bit RSA private key" when I start Apache?

Cryptographic software needs a source of unpredictable data to work correctly. Many open source operating systems provide a "randomness device" that serves this purpose (usually named /dev/random). On other systems, applications have to seed the OpenSSL Pseudo Random Number Generator (PRNG) manually with appropriate data before generating keys or performing public key encryption. As of version 0.9.5, the OpenSSL functions that need randomness report an error if the PRNG has not been seeded with at least 128 bits of randomness.

To prevent this error, mod_ssl has to provide enough entropy to the PRNG to allow it to work correctly. This can be done via the SSLRandomSeed directive.

Configuration

• Is it possible to provide HTTP and HTTPS from the same server?
• Which port does HTTPS use?
• How do I speak HTTPS manually for testing purposes?
• Why does the connection hang when I connect to my SSL-aware Apache server?
• Why do I get "Connection Refused" errors, when trying to access my newly installed Apache+mod_ssl server via HTTPS?
• Why are the SSL_XXX variables not available to my CGI & SSI scripts?
• How can I switch between HTTP and HTTPS in relative hyperlinks?

Is it possible to provide HTTP and HTTPS from the same server?

Yes. HTTP and HTTPS use different server ports (HTTP binds to port 80, HTTPS to port 443), so there is no direct conflict between them. You can either run two separate server instances bound to these ports, or use Apache's elegant virtual hosting facility to create two virtual servers, both served by the same instance of Apache - one responding over HTTP to requests on port 80, and the other responding over HTTPS to requests on port 443.

Which port does HTTPS use?

You can run HTTPS on any port, but the standards specify port 443, which is where any HTTPS compliant browser will look by default. You can force your browser to look on a different port by specifying it in the URL. For example, if your server is set up to serve pages over HTTPS on port 8080, you can access them at https://example.com:8080/

How do I speak HTTPS manually for testing purposes?

While you usually just use

    $ telnet localhost 80
    GET / HTTP/1.0

for simple testing of Apache via HTTP, it's not so easy for HTTPS because of the SSL protocol between TCP and HTTP. With the help of OpenSSL's s_client command, however, you can do a similar check via HTTPS:

    $ openssl s_client -connect localhost:443 -state -debug
    GET / HTTP/1.0

Before the actual HTTP response you will receive detailed information about the SSL handshake. For a more general command line client which directly understands both HTTP and HTTPS, can perform GET and POST operations, can use a proxy, supports byte ranges, etc. you should have a look at the nifty cURL tool (http://curl.haxx.se/).
Using this, you can check that Apache is responding correctly to requests via HTTP and HTTPS as follows:

    $ curl http://localhost/
    $ curl https://localhost/

Why does the connection hang when I connect to my SSL-aware Apache server?

This can happen when you try to connect to an HTTPS server (or virtual server) via HTTP (e.g., using http://example.com/ instead of https://example.com). It can also happen when trying to connect via HTTPS to an HTTP server (e.g., using https://example.com/ on a server which doesn't support HTTPS, or which supports it on a non-standard port). Make sure that you're connecting to a (virtual) server that supports SSL.

Why do I get "Connection Refused" messages, when trying to access my newly installed Apache+mod_ssl server via HTTPS?

This error can be caused by an incorrect configuration. Please make sure that your Listen directives match your <VirtualHost> directives. If all else fails, please start afresh, using the default configuration provided by mod_ssl.

Why are the SSL_XXX variables not available to my CGI & SSI scripts?

Please make sure you have "SSLOptions +StdEnvVars" enabled for the context of your CGI/SSI requests.

How can I switch between HTTP and HTTPS in relative hyperlinks?

Usually, to switch between HTTP and HTTPS, you have to use fully-qualified hyperlinks (because you have to change the URL scheme). Using mod_rewrite however, you can manipulate relative hyperlinks, to achieve the same effect.

    RewriteEngine on
    RewriteRule "^/(.*)_SSL$" "https://%{SERVER_NAME}/$1" [R,L]
    RewriteRule "^/(.*)_NOSSL$" "http://%{SERVER_NAME}/$1" [R,L]

This rewrite ruleset lets you use hyperlinks of the form <a href="document.html_SSL">, to switch to HTTPS in a relative link. (Replace SSL with NOSSL to switch to HTTP.)

Certificates

• What are RSA Private Keys, CSRs and Certificates?
• Is there a difference on startup between a non-SSL-aware Apache and an SSL-aware Apache?
• How do I create a self-signed SSL Certificate for testing purposes?
• How do I create a real SSL Certificate?
• How do I create and use my own Certificate Authority (CA)?
• How can I change the pass-phrase on my private key file?
• How can I get rid of the pass-phrase dialog at Apache startup time?
• How do I verify that a private key matches its Certificate?
• How can I convert a certificate from PEM to DER format?
• Why do browsers complain that they cannot verify my server certificate?

What are RSA Private Keys, CSRs and Certificates?

An RSA private key file is a digital file that you can use to decrypt messages sent to you. It has a public component which you distribute (via your Certificate file) which allows people to encrypt those messages to you.

A Certificate Signing Request (CSR) is a digital file which contains your public key and your name. You send the CSR to a Certifying Authority (CA), who will convert it into a real Certificate, by signing it.

A Certificate contains your RSA public key, your name, the name of the CA, and is digitally signed by the CA. Browsers that know the CA can verify the signature on that Certificate, thereby obtaining your RSA public key. That enables them to send messages which only you can decrypt.

See the Introduction chapter for a general description of the SSL protocol.

Is there a difference on startup between a non-SSL-aware Apache and an SSL-aware Apache?

Yes. In general, starting Apache with mod_ssl built-in is just like starting Apache without it. However, if you have a passphrase on your SSL private key file, a startup dialog will pop up which asks you to enter the pass phrase.

Having to manually enter the passphrase when starting the server can be problematic - for example, when starting the server from the system boot scripts. In this case, you can follow the steps below to remove the passphrase from your private key. Bear in mind that doing so brings additional security risks - proceed with caution!
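Before wiring a key into unattended boot scripts, it can help to know whether the key file is passphrase-protected at all. The following is a minimal sketch rather than anything shipped with httpd or OpenSSL: the function name key_is_encrypted is hypothetical, and the check assumes a PEM-encoded key, where the traditional format marks encryption with a "Proc-Type: 4,ENCRYPTED" header and the PKCS#8 format uses a "BEGIN ENCRYPTED PRIVATE KEY" line.

```shell
#!/bin/sh
# key_is_encrypted FILE  (hypothetical helper, not an httpd or OpenSSL tool)
# Succeeds if the PEM private key in FILE carries an encryption header,
# i.e. if a pass phrase would be needed to load it.
key_is_encrypted() {
    grep -Eq '^(Proc-Type: 4,ENCRYPTED|-----BEGIN ENCRYPTED PRIVATE KEY-----)' "$1"
}

# Runs only when a key path is passed on the command line,
# e.g.: sh check_key.sh /path/to/server.key
if [ "$#" -gt 0 ]; then
    if key_is_encrypted "$1"; then
        echo "$1: pass phrase required"
    else
        echo "$1: no pass phrase required"
    fi
fi
```

The check only looks at the PEM headers, so it does not need the pass phrase itself and never decrypts the key.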
How do I create a self-signed SSL Certificate for testing purposes?

1. Make sure OpenSSL is installed and in your PATH.
2. Run the following command, to create server.key and server.crt files:

       $ openssl req -new -x509 -nodes -out server.crt -keyout server.key

   These can be used as follows in your httpd.conf file:

       SSLCertificateFile /path/to/this/server.crt
       SSLCertificateKeyFile /path/to/this/server.key

3. It is important that you are aware that this server.key does not have any passphrase. To add a passphrase to the key, you should run the following command, and enter & verify the passphrase as requested.

       $ openssl rsa -des3 -in server.key -out server.key.new
       $ mv server.key.new server.key

   Please back up the server.key file, and the passphrase you entered, in a secure location.

How do I create a real SSL Certificate?

Here is a step-by-step description:

1. Make sure OpenSSL is installed and in your PATH.
2. Create an RSA private key for your Apache server (will be Triple-DES encrypted and PEM formatted):

       $ openssl genrsa -des3 -out server.key 2048

   Please back up this server.key file and the pass-phrase you entered in a secure location. You can see the details of this RSA private key by using the command:

       $ openssl rsa -noout -text -in server.key

   If necessary, you can also create a decrypted PEM version (not recommended) of this RSA private key with:

       $ openssl rsa -in server.key -out server.key.unsecure

3. Create a Certificate Signing Request (CSR) with the server RSA private key (output will be PEM formatted):

       $ openssl req -new -key server.key -out server.csr

   Make sure you enter the FQDN ("Fully Qualified Domain Name") of the server when OpenSSL prompts you for the "CommonName", i.e. when you generate a CSR for a website which will be later accessed via https://www.foo.dom/, enter "www.foo.dom" here. You can see the details of this CSR by using

       $ openssl req -noout -text -in server.csr

4.
You now have to send this Certificate Signing Request (CSR) to a Certifying Authority (CA) to be signed. Once the CSR has been signed, you will have a real Certificate, which can be used by Apache. You can have a CSR signed by a commercial CA, or you can create your own CA to sign it.

   Commercial CAs usually ask you to post the CSR into a web form, pay for the signing, and then send a signed Certificate, which you can store in a server.crt file.

   For details on how to create your own CA, and use this to sign a CSR, see below.

   Once your CSR has been signed, you can see the details of the Certificate as follows:

       $ openssl x509 -noout -text -in server.crt

5. You should now have two files: server.key and server.crt. These can be used as follows in your httpd.conf file:

       SSLCertificateFile /path/to/this/server.crt
       SSLCertificateKeyFile /path/to/this/server.key

   The server.csr file is no longer needed.

How do I create and use my own Certificate Authority (CA)?

The short answer is to use the CA.sh or CA.pl script provided by OpenSSL. Unless you have a good reason not to, you should use these for preference. If you cannot, you can create a self-signed certificate as follows:

1. Create an RSA private key for your server (will be Triple-DES encrypted and PEM formatted):

       $ openssl genrsa -des3 -out server.key 2048

   Please back up this server.key file and the pass-phrase you entered in a secure location. You can see the details of this RSA private key by using the command:

       $ openssl rsa -noout -text -in server.key

   If necessary, you can also create a decrypted PEM version (not recommended) of this RSA private key with:

       $ openssl rsa -in server.key -out server.key.unsecure

2.
Create a self-signed Certificate (X509 structure) with the RSA key you just created (output will be PEM formatted):

       $ openssl req -new -x509 -nodes -sha1 -days 365 -key server.key -out server.crt -extensions usr_cert

   This signs the server CSR and results in a server.crt file. You can see the details of this Certificate using:

       $ openssl x509 -noout -text -in server.crt

How can I change the pass-phrase on my private key file?

You simply have to read it with the old pass-phrase and write it again, specifying the new pass-phrase. You can accomplish this with the following commands:

    $ openssl rsa -des3 -in server.key -out server.key.new
    $ mv server.key.new server.key

The first time you're asked for a PEM pass-phrase, you should enter the old pass-phrase. After that, you'll be asked again to enter a pass-phrase - this time, use the new pass-phrase. If you are asked to verify the pass-phrase, you'll need to enter the new pass-phrase a second time.

How can I get rid of the pass-phrase dialog at Apache startup time?

The reason this dialog pops up at startup and every re-start is that the RSA private key inside your server.key file is stored in encrypted format for security reasons. The pass-phrase is needed to decrypt this file, so it can be read and parsed. Removing the pass-phrase removes a layer of security from your server - proceed with caution!

1. Remove the encryption from the RSA private key (while keeping a backup copy of the original file):

       $ cp server.key server.key.org
       $ openssl rsa -in server.key.org -out server.key

2. Make sure the server.key file is only readable by root:

       $ chmod 400 server.key

Now server.key contains an unencrypted copy of the key. If you point your server at this file, it will not prompt you for a pass-phrase. HOWEVER, if anyone gets this key they will be able to impersonate you on the net.
PLEASE make sure that the permissions on this file are such that only root or the web server user can read it (preferably get your web server to start as root but run as another user, and have the key readable only by root).

As an alternative approach you can use the "SSLPassPhraseDialog exec:/path/to/program" facility. Bear in mind that this is neither more nor less secure, of course.

How do I verify that a private key matches its Certificate?

A private key contains a series of numbers. Two of these numbers form the "public key", the others are part of the "private key". The "public key" bits are included when you generate a CSR, and subsequently form part of the associated Certificate.

To check that the public key in your Certificate matches the public portion of your private key, you simply need to compare these numbers. To view the Certificate and the key run the commands:

    $ openssl x509 -noout -text -in server.crt
    $ openssl rsa -noout -text -in server.key

The 'modulus' and the 'public exponent' portions in the key and the Certificate must match. As the public exponent is usually 65537 and it's difficult to visually check that the long modulus numbers are the same, you can use the following approach:

    $ openssl x509 -noout -modulus -in server.crt | openssl md5
    $ openssl rsa -noout -modulus -in server.key | openssl md5

This leaves you with two rather shorter numbers to compare. It is, in theory, possible that these numbers may be the same, without the modulus numbers being the same, but the chances of this are overwhelmingly remote.

Should you wish to check to which key or certificate a particular CSR belongs you can perform the same calculation on the CSR as follows:

    $ openssl req -noout -modulus -in server.csr | openssl md5

How can I convert a certificate from PEM to DER format?

The default certificate format for OpenSSL is PEM, which is simply Base64 encoded DER, with header and footer lines. For some applications (e.g.
Microsoft Internet Explorer) you need the certificate in plain DER format. You can convert a PEM file cert.pem into the corresponding DER file cert.der using the following command:

    $ openssl x509 -in cert.pem -out cert.der -outform DER

Why do browsers complain that they cannot verify my server certificate?

One reason this might happen is because your server certificate is signed by an intermediate CA. Various CAs, such as Verisign or Thawte, have started signing certificates not with their root certificate but with intermediate certificates.

Intermediate CA certificates lie between the root CA certificate (which is installed in the browsers) and the server certificate (which you installed on the server). In order for the browser to be able to traverse and verify the trust chain from the server certificate to the root certificate it needs to be given the intermediate certificates. The CAs should be able to provide you such intermediate certificate packages that can be installed on the server.

You need to include those intermediate certificates with the SSLCertificateChainFile directive.

The SSL Protocol

• Why do I get lots of random SSL protocol errors under heavy server load?
• Why does my webserver have a higher load, now that it serves SSL encrypted traffic?
• Why do HTTPS connections to my server sometimes take up to 30 seconds to establish a connection?
• What SSL Ciphers are supported by mod_ssl?
• Why do I get "no shared cipher" errors, when trying to use Anonymous Diffie-Hellman (ADH) ciphers?
• Why do I get a 'no shared ciphers' error when connecting to my newly installed server?
• Why can't I use SSL with name-based/non-IP-based virtual hosts?
• Is it possible to use Name-Based Virtual Hosting to identify different SSL virtual hosts?
• How do I get SSL compression working?
• When I use Basic Authentication over HTTPS the lock icon in Netscape browsers stays unlocked when the dialog pops up.
  Does this mean the username/password is being sent unencrypted?
• Why do I get I/O errors when connecting via HTTPS to an Apache+mod_ssl server with Microsoft Internet Explorer (MSIE)?
• How do I enable TLS-SRP?
• Why do I get handshake failures with Java-based clients when using a certificate with more than 1024 bits?

Why do I get lots of random SSL protocol errors under heavy server load?

There can be a number of reasons for this, but the main one is problems with the SSL session cache specified by the SSLSessionCache directive. The DBM session cache is the most likely source of the problem, so using the SHM session cache (or no cache at all) may help.

Why does my webserver have a higher load, now that it serves SSL encrypted traffic?

SSL uses strong cryptographic encryption, which necessitates a lot of number crunching. When you request a webpage via HTTPS, everything (even the images) is encrypted before it is transferred. So increased HTTPS traffic leads to load increases.

Why do HTTPS connections to my server sometimes take up to 30 seconds to establish a connection?

This is usually caused by a /dev/random device for SSLRandomSeed which blocks the read(2) call until enough entropy is available to service the request. More information is available in the reference manual for the SSLRandomSeed directive.

What SSL Ciphers are supported by mod_ssl?

Usually, any SSL ciphers supported by the version of OpenSSL in use are also supported by mod_ssl. Which ciphers are available can depend on the way you built OpenSSL. Typically, at least the following ciphers are supported:

1. RC4 with SHA1
2. AES with SHA1
3. Triple-DES with SHA1

To determine the actual list of ciphers available, you should run the following:

    $ openssl ciphers -v

Why do I get "no shared cipher" errors, when trying to use Anonymous Diffie-Hellman (ADH) ciphers?

By default, OpenSSL does not allow ADH ciphers, for security reasons.
Please be sure you are aware of the potential side-effects if you choose to enable these ciphers.

In order to use Anonymous Diffie-Hellman (ADH) ciphers, you must build OpenSSL with "-DSSL_ALLOW_ADH", and then add "ADH" into your SSLCipherSuite.

Why do I get a 'no shared ciphers' error when connecting to my newly installed server?

Either you have made a mistake with your SSLCipherSuite directive (compare it with the pre-configured example in extra/httpd-ssl.conf) or you chose to use DSA/DH algorithms instead of RSA when you generated your private key and ignored or overlooked the warnings. If you have chosen DSA/DH, then your server cannot communicate using RSA-based SSL ciphers (at least until you configure an additional RSA-based certificate/key pair). Modern browsers like NS or IE can only communicate over SSL using RSA ciphers. The result is the "no shared ciphers" error. To fix this, regenerate your server certificate/key pair, using the RSA algorithm.

Why can't I use SSL with name-based/non-IP-based virtual hosts?

The reason is very technical, and a somewhat "chicken and egg" problem. The SSL protocol layer stays below the HTTP protocol layer and encapsulates HTTP. When an SSL connection (HTTPS) is established Apache/mod_ssl has to negotiate the SSL protocol parameters with the client. For this, mod_ssl has to consult the configuration of the virtual server (for instance it has to look for the cipher suite, the server certificate, etc.). But in order to go to the correct virtual server Apache has to know the Host HTTP header field. To do this, the HTTP request header has to be read. This cannot be done before the SSL handshake is finished, but the information is needed in order to complete the SSL handshake phase. See the next question for how to circumvent this issue.
Note that if you have a wildcard SSL certificate, or a certificate that has multiple hostnames on it using subjectAltName fields, you can use SSL on name-based virtual hosts without further workarounds.

Is it possible to use Name-Based Virtual Hosting to identify different SSL virtual hosts?

Name-Based Virtual Hosting is a very popular method of identifying different virtual hosts. It allows you to use the same IP address and the same port number for many different sites. When people move on to SSL, it seems natural to assume that the same method can be used to have lots of different SSL virtual hosts on the same server.

It is possible, but only if using a 2.2.12 or later web server, built with 0.9.8j or later OpenSSL. This is because it requires a feature that only the most recent revisions of the SSL specification added, called Server Name Indication (SNI).

Note that if you have a wildcard SSL certificate, or a certificate that has multiple hostnames on it using subjectAltName fields, you can use SSL on name-based virtual hosts without further workarounds.

The reason is that the SSL protocol is a separate layer which encapsulates the HTTP protocol. So the SSL session is a separate transaction, that takes place before the HTTP session has begun. The server receives an SSL request on IP address X and port Y (usually 443). Since the SSL request did not contain any Host: field, the server had no way to decide which SSL virtual host to use. Usually, it just used the first one it found which matched the port and IP address specified.

If you are using a version of the web server and OpenSSL that support SNI, though, and the client's browser also supports SNI, then the hostname is included in the original SSL request, and the web server can select the correct SSL virtual host.
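The prerequisites above can be checked from a shell. What follows is a rough sketch under stated assumptions, not a procedure from this manual: the function names are hypothetical, the version parser assumes the "Server version: Apache/x.y.z" line that httpd -v prints, and the probe assumes an OpenSSL build whose s_client accepts the -servername option. The address 192.0.2.10 and the name www.example.com in the comments are placeholders.

```shell
#!/bin/sh
# Sketch only: all function names here are hypothetical helpers.

# version_ge A B: succeeds if dotted numeric version A is at least B.
version_ge() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, x, "."); m = split(b, y, ".")
        max = (n > m) ? n : m
        for (i = 1; i <= max; i++) {
            xi = (i <= n) ? x[i] + 0 : 0
            yi = (i <= m) ? y[i] + 0 : 0
            if (xi > yi) exit 0
            if (xi < yi) exit 1
        }
        exit 0
    }'
}

# httpd_version_from: reads "httpd -v" output on stdin and prints the
# version number from a line like "Server version: Apache/2.4.58 (Unix)".
httpd_version_from() {
    sed -n 's|^Server version: Apache/\([0-9.]*\).*|\1|p'
}

# sni_cert_subject HOST PORT NAME: prints the subject of the certificate
# the server presents when the client sends NAME as the SNI hostname.
sni_cert_subject() {
    openssl s_client -connect "$1:$2" -servername "$3" </dev/null 2>/dev/null |
        openssl x509 -noout -subject
}

# Example (not run here):
#   v=$(httpd -v | httpd_version_from)
#   version_ge "$v" 2.2.12 && sni_cert_subject 192.0.2.10 443 www.example.com
```

Running the probe once per virtual host name against the same address and port shows whether each name is answered with its own certificate or with that of the first matching virtual host.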
You can, of course, use Name-Based Virtual Hosting to identify many non-SSL virtual hosts (all on port 80, for example) and then have a single SSL virtual host (on port 443). But if you do this, you must make sure to put the non-SSL port number on the NameVirtualHost directive, e.g.

    NameVirtualHost 192.168.1.1:80

Other workaround solutions include:

• Using separate IP addresses for different SSL hosts.
• Using different port numbers for different SSL hosts.

How do I get SSL compression working?

Although SSL compression negotiation was defined in the specification of SSLv2 and TLS, it took until May 2004 for RFC 3749 to define DEFLATE as a negotiable standard compression method.

OpenSSL 0.9.8 started to support this by default when compiled with the zlib option. If both the client and the server support compression, it will be used. However, most clients still try to initially connect with an SSLv2 Hello. As SSLv2 did not include an array of preferred compression algorithms in its handshake, compression cannot be negotiated with these clients. If the client disables support for SSLv2, either an SSLv3 or TLS Hello may be sent, depending on which SSL library is used, and compression may be set up. You can verify whether clients make use of SSL compression by logging the %{SSL_COMPRESS_METHOD}x variable.

When I use Basic Authentication over HTTPS the lock icon in Netscape browsers stays unlocked when the dialog pops up. Does this mean the username/password is being sent unencrypted?

No, the username/password is transmitted encrypted. The icon in Netscape browsers is not actually synchronized with the SSL/TLS layer. It only toggles to the locked state when the first part of the actual webpage data is transferred, which may confuse people. The Basic Authentication facility is part of the HTTP layer, which is above the SSL/TLS layer in HTTPS.
Before any HTTP data communication takes place in HTTPS, the SSL/TLS layer has already completed its handshake phase, and switched to encrypted communication. So don't be confused by this icon.

Why do I get I/O errors when connecting via HTTPS to an Apache+mod_ssl server with older versions of Microsoft Internet Explorer (MSIE)?

The first reason is that the SSL implementation in some MSIE versions has some subtle bugs related to the HTTP keep-alive facility and the SSL close_notify alerts on socket connection close. Additionally the interaction between SSL and HTTP/1.1 features is problematic in some MSIE versions. You can work around these problems by forcing Apache not to use HTTP/1.1, keep-alive connections or send the SSL close_notify messages to MSIE clients. This can be done by using the following directive in your SSL-aware virtual host section:

    SetEnvIf User-Agent "MSIE [2-5]" \
             nokeepalive ssl-unclean-shutdown \
             downgrade-1.0 force-response-1.0

Further, some MSIE versions have problems with particular ciphers. Unfortunately, it is not possible to implement an MSIE-specific workaround for this, because the ciphers are needed as early as the SSL handshake phase. So an MSIE-specific SetEnvIf won't solve these problems. Instead, you will have to make more drastic adjustments to the global parameters. Before you decide to do this, make sure your clients really have problems. If not, do not make these changes - they will affect all your clients, MSIE or otherwise.

How do I enable TLS-SRP?

TLS-SRP (Secure Remote Password key exchange for TLS, specified in RFC 5054) can supplement or replace certificates in authenticating an SSL connection. To use TLS-SRP, set the SSLSRPVerifierFile directive to point to an OpenSSL SRP verifier file.
To create the verifier file, use the openssl tool:

    openssl srp -srpvfile passwd.srpv -add username

After creating this file, specify it in the SSL server configuration:

    SSLSRPVerifierFile /path/to/passwd.srpv

To force clients to use non-certificate TLS-SRP cipher suites, use the following directive:

    SSLCipherSuite "!DSS:!aRSA:SRP"

Why do I get handshake failures with Java-based clients when using a certificate with more than 1024 bits?

Beginning with version 2.5.0-dev as of 2013-09-29, mod_ssl will use DH parameters which include primes with lengths of more than 1024 bits. Java 7 and earlier limit their support for DH prime sizes to a maximum of 1024 bits, however.

If your Java-based client aborts with exceptions such as java.lang.RuntimeException: Could not generate DH keypair and java.security.InvalidAlgorithmParameterException: Prime size must be multiple of 64, and can only range from 512 to 1024 (inclusive), and httpd logs tlsv1 alert internal error (SSL alert number 80) (at LogLevel info or higher), you can either rearrange mod_ssl's cipher list with SSLCipherSuite (possibly in conjunction with SSLHonorCipherOrder), or you can use custom DH parameters with a 1024-bit prime, which will always have precedence over any of the built-in DH parameters.

To generate custom DH parameters, use the openssl dhparam 1024 command. Alternatively, you can use the following standard 1024-bit DH parameters from RFC 2409 (http://www.ietf.org/rfc/rfc2409.txt), section 6.2:

    -----BEGIN DH PARAMETERS-----
    MIGHAoGBAP//////////yQ/aoiFowjTExmKLgNwc0SkCTgiKZ8x0Agu+pjsTmyJR
    Sgh5jjQE3e+VGbPNOkMbMCsKbfJfFDdP4TVtbVHCReSFtXZiXn7G9ExC6aY37WsL
    /1y29Aa37e44a/taiZ+lrp8kEXxLH+ZJKGZR7OZTgf//////////AgEC
    -----END DH PARAMETERS-----

Add the custom parameters including the "BEGIN DH PARAMETERS" and "END DH PARAMETERS" lines to the end of the first certificate file you have configured using the SSLCertificateFile directive.
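The last step can be scripted so that the parameters are appended only once. This is a minimal sketch, assuming PEM files on disk; the function name append_dh_params and the file names in the comments are placeholders, and generating the parameters is left to the openssl dhparam command mentioned above.

```shell
#!/bin/sh
# append_dh_params CERT DHFILE  (hypothetical helper)
# Appends the PEM DH parameters in DHFILE to the end of CERT, unless CERT
# already contains a DH PARAMETERS block. Fails if DHFILE has no such block.
append_dh_params() {
    cert=$1
    dh=$2
    if grep -q -e '-----BEGIN DH PARAMETERS-----' "$cert"; then
        echo "DH parameters already present in $cert" >&2
        return 0
    fi
    grep -q -e '-----BEGIN DH PARAMETERS-----' "$dh" || return 1
    cat "$dh" >> "$cert"
}

# Example (not run here; file names are placeholders):
#   openssl dhparam -out dhparams.pem 1024
#   append_dh_params server.crt dhparams.pem
```

Keeping a backup copy of the certificate file before modifying it makes it easy to revert if the server later needs the built-in parameters again.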
mod_ssl Support

• What information resources are available in case of mod_ssl problems?
• What support contacts are available in case of mod_ssl problems?
• What information should I provide when writing a bug report?
• I had a core dump, can you help me?
• How do I get a backtrace, to help find the reason for my core dump?

What information resources are available in case of mod_ssl problems?

The following information resources are available. In case of problems you should search here first.

Answers in the User Manual’s F.A.Q. List (this)
http://httpd.apache.org/docs/trunk/ssl/ssl_faq.html
First check the F.A.Q. (this text). If your problem is a common one, it may have been answered several times before, and been included in this doc.

What support contacts are available in case of mod_ssl problems?

The following lists all support possibilities for mod_ssl, in order of preference. Please go through these possibilities in this order - don’t just pick the one you like the look of.

1. Send a Problem Report to the Apache httpd Users Support Mailing List
   users@httpd.apache.org
   This is the first way of submitting your problem report. You must subscribe to the list first, but you can then easily discuss your problem with the whole Apache httpd user community.

2. Write a Problem Report in the Bug Database
   http://httpd.apache.org/bug_report.html
   This is the last way of submitting your problem report. You should only do this if you’ve already posted to the mailing lists, and had no success. Please follow the instructions on the above page carefully.

What information should I provide when writing a bug report?

You should always provide at least the following information:

Apache httpd and OpenSSL version information
The Apache version can be determined by running httpd -v.
The OpenSSL version can be determined by running openssl version. Alternatively, if you have Lynx installed, you can run the command lynx -mime_header http://localhost/ | grep Server to gather this information in a single step.

The details on how you built and installed Apache httpd and OpenSSL
For this you can provide a logfile of your terminal session which shows the configuration and install steps. If this is not possible, you should at least provide the configure command line you used.

In case of core dumps please include a Backtrace
If your Apache httpd dumps its core, please attach a stack-frame “backtrace” (see below for information on how to get this). This information is required in order to find a reason for your core dump.

A detailed description of your problem
Don’t laugh, we really mean it! Many problem reports don’t include a description of what the actual problem is. Without this, it’s very difficult for anyone to help you. So, it’s in your own interest (you want the problem to be solved, don’t you?) to include as much detail as possible, please. Of course, you should still include all the essentials above too.

I had a core dump, can you help me?

In general no, at least not unless you provide more details about the code location where Apache dumped core. What is almost always required in order to help you is a backtrace (see next question). Without this information it is mostly impossible to find the problem and help you in fixing it.

How do I get a backtrace, to help find the reason for my core dump?

Following are the steps you will need to complete, to get a backtrace:

1. Make sure you have debugging symbols available, at least in Apache. On platforms where you use GCC/GDB, you will have to build Apache+mod_ssl with “OPTIM="-g -ggdb3"” to get this. On other platforms at least “OPTIM="-g"” is needed.

2. Start the server and try to reproduce the core-dump.
   For this you may want to use a directive like “CoreDumpDirectory /tmp” to make sure that the core-dump file can be written. This should result in a /tmp/core or /tmp/httpd.core file. If you don’t get one of these, try running your server under a non-root UID. Many modern kernels do not allow a process to dump core after it has done a setuid() (unless it does an exec()) for security reasons (there can be privileged information left over in memory). If necessary, you can run /path/to/httpd -X manually to force Apache to not fork.

3. Analyze the core-dump. For this, run gdb /path/to/httpd /tmp/httpd.core or a similar command. In GDB, all you have to do then is to enter bt, and voila, you get the backtrace. For other debuggers consult your local debugger manual.

Chapter 6

Guides, Tutorials, and HowTos

6.1 How-To / Tutorials

Authentication and Authorization
Authentication is any process by which you verify that someone is who they claim they are. Authorization is any process by which someone is allowed to be where they want to go, or to have information that they want to have.
See: Authentication, Authorization (p. 227)

Access Control
Access control refers to the process of restricting, or granting access to a resource based on arbitrary criteria. There are a variety of different ways that this can be accomplished.
See: Access Control (p. 234)

Dynamic Content with CGI
The CGI (Common Gateway Interface) defines a way for a web server to interact with external content-generating programs, which are often referred to as CGI programs or CGI scripts. It is a simple way to put dynamic content on your web site. This document will be an introduction to setting up CGI on your Apache web server, and getting started writing CGI programs.
See: CGI: Dynamic Content (p. 236)

.htaccess files
.htaccess files provide a way to make configuration changes on a per-directory basis.
A file, containing one or more configuration directives, is placed in a particular document directory, and the directives apply to that directory, and all subdirectories thereof.
See: .htaccess files (p. 249)

HTTP/2 with httpd
HTTP/2 is the evolution of the world’s most successful application layer protocol, HTTP. It focuses on making more efficient use of network resources without changing the semantics of HTTP. This guide explains how HTTP/2 is implemented in httpd, showing basic configuration tips and best practices.
See: HTTP/2 guide

Introduction to Server Side Includes
SSI (Server Side Includes) are directives that are placed in HTML pages, and evaluated on the server while the pages are being served. They let you add dynamically generated content to an existing HTML page, without having to serve the entire page via a CGI program, or other dynamic technology.
See: Server Side Includes (SSI) (p. 243)

Per-user web directories
On systems with multiple users, each user can be permitted to have a web site in their home directory using the UserDir directive. Visitors to a URL http://example.com/~username/ will get content out of the home directory of the user "username", out of the subdirectory specified by the UserDir directive.
See: User web directories (public_html) (p. 258)

Reverse Proxy guide
Apache httpd has extensive capabilities as a reverse proxy server using the ProxyPass directive as well as BalancerMember to create sophisticated reverse proxying implementations which provide for high-availability, load balancing and failover, cloud-based clustering and dynamic on-the-fly reconfiguration.
See: Reverse proxy guide

6.2 Authentication and Authorization

Authentication is any process by which you verify that someone is who they claim they are. Authorization is any process by which someone is allowed to be where they want to go, or to have information that they want to have.
For general access control, see the Access Control How-To (p. 234).

Related Modules and Directives

There are three types of modules involved in the authentication and authorization process. You will usually need to choose at least one module from each group.

• Authentication type (see the AuthType directive)
    – mod_auth_basic
    – mod_auth_digest
• Authentication provider (see the AuthBasicProvider and AuthDigestProvider directives)
    – mod_authn_anon
    – mod_authn_dbd
    – mod_authn_dbm
    – mod_authn_file
    – mod_authnz_ldap
    – mod_authn_socache
• Authorization (see the Require directive)
    – mod_authnz_ldap
    – mod_authz_dbd
    – mod_authz_dbm
    – mod_authz_groupfile
    – mod_authz_host
    – mod_authz_owner
    – mod_authz_user

In addition to these modules, there are also mod_authn_core and mod_authz_core. These modules implement directives that are core to all auth modules.

The module mod_authnz_ldap is both an authentication and authorization provider. The module mod_authz_host provides authorization and access control based on hostname, IP address or characteristics of the request, but is not part of the authentication provider system. For backwards compatibility with mod_access, there is a new module, mod_access_compat.

You probably also want to take a look at the Access Control (p. 234) howto, which discusses the various ways to control access to your server.

Introduction

If you have information on your web site that is sensitive or intended for only a small group of people, the techniques in this article will help you make sure that the people who see those pages are the people you wanted to see them.

This article covers the "standard" way of protecting parts of your web site that most of you are going to use.

Note: If your data really needs to be secure, consider using mod_ssl in addition to any authentication.
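To make the module groups listed above concrete, here is a minimal sketch of loading one module from each group together with the two core modules. It assumes the modules were built as shared objects and installed under a modules/ directory relative to the server root. If they are compiled into the httpd binary, none of these lines are needed.

```apache
# Sketch only: the modules/ path assumes a default shared-module layout.

# Core directives shared by all auth modules
LoadModule authn_core_module modules/mod_authn_core.so
LoadModule authz_core_module modules/mod_authz_core.so

# Authentication type: Basic
LoadModule auth_basic_module modules/mod_auth_basic.so

# Authentication provider: plain text password file
LoadModule authn_file_module modules/mod_authn_file.so

# Authorization: Require user / Require valid-user
LoadModule authz_user_module modules/mod_authz_user.so
```

This set is enough for a password-file setup that uses Require user or Require valid-user. Group-based authorization would additionally need mod_authz_groupfile.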
The Prerequisites

The directives discussed in this article will need to go either in your main server configuration file (typically in a <Directory> section), or in per-directory configuration files (.htaccess files).

If you plan to use .htaccess files, you will need to have a server configuration that permits putting authentication directives in these files. This is done with the AllowOverride directive, which specifies which directives, if any, may be put in per-directory configuration files.

Since we’re talking here about authentication, you will need an AllowOverride directive like the following:

    AllowOverride AuthConfig

Or, if you are just going to put the directives directly in your main server configuration file, you will of course need to have write permission to that file.

And you’ll need to know a little bit about the directory structure of your server, in order to know where some files are kept. This should not be terribly difficult, and I’ll try to make this clear when we come to that point.

You will also need to make sure that the modules mod_authn_core and mod_authz_core have either been built into the httpd binary or loaded by the httpd.conf configuration file. Both of these modules provide core directives and functionality that are critical to the configuration and use of authentication and authorization in the web server.

Getting it working

Here are the basics of password protecting a directory on your server.

First, you need to create a password file. Exactly how you do this will vary depending on what authentication provider you have chosen. More on that later. To start with, we’ll use a text password file.

This file should be placed somewhere not accessible from the web. This is so that folks cannot download the password file. For example, if your documents are served out of /usr/local/apache/htdocs, you might want to put the password file(s) in /usr/local/apache/passwd.

To create the file, use the htpasswd utility that came with Apache.
This will be located in the bin directory of wherever you installed Apache. If you have installed Apache from a third-party package, it may be in your execution path.

To create the file, type:

    htpasswd -c /usr/local/apache/passwd/passwords rbowen

htpasswd will ask you for the password, and then ask you to type it again to confirm it:

    # htpasswd -c /usr/local/apache/passwd/passwords rbowen
    New password: mypassword
    Re-type new password: mypassword
    Adding password for user rbowen

If htpasswd is not in your path, of course you’ll have to type the full path to the file to get it to run. With a default installation, it’s located at /usr/local/apache2/bin/htpasswd.

Next, you’ll need to configure the server to request a password and tell the server which users are allowed access. You can do this either by editing the httpd.conf file or using an .htaccess file. For example, if you wish to protect the directory /usr/local/apache/htdocs/secret, you can use the following directives, either placed in the file /usr/local/apache/htdocs/secret/.htaccess, or placed in httpd.conf inside a <Directory "/usr/local/apache/htdocs/secret"> section.

    AuthType Basic
    AuthName "Restricted Files"
    # (Following line optional)
    AuthBasicProvider file
    AuthUserFile "/usr/local/apache/passwd/passwords"
    Require user rbowen

Let’s examine each of those directives individually. The AuthType directive selects the method that is used to authenticate the user. The most common method is Basic, and this is the method implemented by mod_auth_basic. It is important to be aware, however, that Basic authentication sends the password from the client to the server unencrypted. This method should therefore not be used for highly sensitive data, unless accompanied by mod_ssl. Apache supports one other authentication method: AuthType Digest. This method is implemented by mod_auth_digest and was intended to be more secure. This is no longer the case and the connection should be encrypted with mod_ssl instead.
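Since Basic authentication sends the password unencrypted, one way to combine it with mod_ssl is to refuse non-HTTPS requests for the protected directory. The following is a minimal sketch that reuses the directory and password file from the example above. It assumes mod_ssl is loaded and that an SSL-enabled virtual host already serves this directory.

```apache
# Sketch only: assumes mod_ssl is loaded and the directory is
# reachable through an SSL-enabled virtual host.
<Directory "/usr/local/apache/htdocs/secret">
    # Deny any request that did not arrive over HTTPS.
    SSLRequireSSL

    AuthType Basic
    AuthName "Restricted Files"
    AuthUserFile "/usr/local/apache/passwd/passwords"
    Require user rbowen
</Directory>
```

With this in place, the server denies plain HTTP requests for the directory. Password prompts are then issued only over encrypted connections.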
The AuthName directive sets the Realm to be used in the authentication. The realm serves two major functions. First, the client often presents this information to the user as part of the password dialog box. Second, it is used by the client to determine what password to send for a given authenticated area.

So, for example, once a client has authenticated in the "Restricted Files" area, it will automatically retry the same password for any area on the same server that is marked with the "Restricted Files" Realm. Therefore, you can prevent a user from being prompted more than once for a password by letting multiple restricted areas share the same realm. Of course, for security reasons, the client will always need to ask again for the password whenever the hostname of the server changes.

The AuthBasicProvider directive is, in this case, optional, since file is the default value for this directive. You’ll need to use this directive if you are choosing a different source for authentication, such as mod_authn_dbm or mod_authn_dbd.

The AuthUserFile directive sets the path to the password file that we just created with htpasswd. If you have a large number of users, it can be quite slow to search through a plain text file to authenticate the user on each request. Apache also has the ability to store user information in fast database files. The mod_authn_dbm module provides the AuthDBMUserFile directive. These files can be created and manipulated with the dbmmanage and htdbm programs. Many other types of authentication options are available from third party modules in the Apache Modules Database (http://modules.apache.org/).

Finally, the Require directive provides the authorization part of the process by setting the user that is allowed to access this region of the server. In the next section, we discuss various ways to use the Require directive.

Letting more than one person in

The directives above only let one person (specifically someone with a username of rbowen) into the directory. In most cases, you’ll want to let more than one person in. This is where the AuthGroupFile comes in.

If you want to let more than one person in, you’ll need to create a group file that associates group names with a list of users in that group. The format of this file is pretty simple, and you can create it with your favorite editor. The contents of the file will look like this:

    GroupName: rbowen dpitts sungo rshersey

That’s just a list of the members of the group in a long line separated by spaces.

To add a user to your already existing password file, type:

    htpasswd /usr/local/apache/passwd/passwords dpitts

You’ll get the same response as before, but it will be appended to the existing file, rather than creating a new file. (It’s the -c that makes it create a new password file).

Now, you need to modify your .htaccess file to look like the following:

    AuthType Basic
    AuthName "By Invitation Only"
    # Optional line:
    AuthBasicProvider file
    AuthUserFile "/usr/local/apache/passwd/passwords"
    AuthGroupFile "/usr/local/apache/passwd/groups"
    Require group GroupName

Now, anyone that is listed in the group GroupName, and has an entry in the password file, will be let in, if they type the correct password.

There’s another way to let multiple users in that is less specific. Rather than creating a group file, you can just use the following directive:

    Require valid-user

Using that rather than the Require user rbowen line will allow anyone in that is listed in the password file, and who correctly enters their password. You can even emulate the group behavior here, by just keeping a separate password file for each group. The advantage of this approach is that Apache only has to check one file, rather than two. The disadvantage is that you have to maintain a bunch of password files, and remember to reference the right one in the AuthUserFile directive.
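The "one password file per group" approach described above could be sketched like this. The directory names, realm names, and password file names are hypothetical and chosen only for illustration. Each file would be created with htpasswd in the same way as before.

```apache
# Sketch only: directory and file names below are hypothetical.

# Everyone listed in staff-passwords may enter the staff area.
<Directory "/usr/local/apache/htdocs/staff">
    AuthType Basic
    AuthName "Staff Only"
    AuthUserFile "/usr/local/apache/passwd/staff-passwords"
    Require valid-user
</Directory>

# A different password file acts as a different "group".
<Directory "/usr/local/apache/htdocs/editors">
    AuthType Basic
    AuthName "Editors Only"
    AuthUserFile "/usr/local/apache/passwd/editors-passwords"
    Require valid-user
</Directory>
```

Note that a user who belongs to both "groups" needs an entry in both files, which is part of the maintenance cost mentioned above.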
Possible problems

Because of the way that Basic authentication is specified, your username and password must be verified every time you request a document from the server. This is even if you’re reloading the same page, and for every image on the page (if they come from a protected directory). As you can imagine, this slows things down a little. The amount that it slows things down is proportional to the size of the password file, because it has to open up that file, and go down the list of users until it gets to your name. And it has to do this every time a page is loaded.

A consequence of this is that there’s a practical limit to how many users you can put in one password file. This limit will vary depending on the performance of your particular server machine, but you can expect to see slowdowns once you get above a few hundred entries, and may wish to consider a different authentication method at that time.

Alternate password storage

Because storing passwords in plain text files has the above problems, you may wish to store your passwords somewhere else, such as in a database.

mod_authn_dbm and mod_authn_dbd are two modules which make this possible. Rather than selecting AuthBasicProvider file, instead you can choose dbm or dbd as your storage format.

To select a dbm file rather than a text file, for example:

    AuthName "Private"
    AuthType Basic
    AuthBasicProvider dbm
    AuthDBMUserFile "/www/passwords/passwd.dbm"
    Require valid-user

Other options are available. Consult the mod_authn_dbm documentation for more details.

Using multiple providers

With the introduction of the new provider based authentication and authorization architecture, you are no longer locked into a single authentication or authorization method. In fact any number of the providers can be mixed and matched to provide you with exactly the scheme that meets your needs.
In the following example, both the file and LDAP based authentication providers are being used.

    AuthName "Private"
    AuthType Basic
    AuthBasicProvider file ldap
    AuthUserFile "/usr/local/apache/passwd/passwords"
    AuthLDAPURL ldap://ldaphost/o=yourorg
    Require valid-user

In this example the file provider will attempt to authenticate the user first. If it is unable to authenticate the user, the LDAP provider will be called. This allows the scope of authentication to be broadened if your organization implements more than one type of authentication store. Other authentication and authorization scenarios may include mixing one type of authentication with a different type of authorization. For example, authenticating against a password file yet authorizing against an LDAP directory.

Just as multiple authentication providers can be implemented, multiple authorization methods can also be used. In this example both file group authorization and LDAP group authorization are being used.

    AuthName "Private"
    AuthType Basic
    AuthBasicProvider file
    AuthUserFile "/usr/local/apache/passwd/passwords"
    AuthLDAPURL ldap://ldaphost/o=yourorg
    AuthGroupFile "/usr/local/apache/passwd/groups"
    Require group GroupName
    Require ldap-group cn=mygroup,o=yourorg

To take authorization a little further, authorization container directives such as <RequireAll> and <RequireAny> allow logic to be applied so that the order in which authorization is handled can be completely controlled through the configuration. See Authorization Containers (p. 519) for an example of how they may be applied.

Beyond just authorization

The way that authorization can be applied is now much more flexible than just a single check against a single data store. Ordering, logic and choosing how authorization will be done is now possible.

Applying logic and ordering

Controlling how and in what order authorization will be applied has been a bit of a mystery in the past.
In Apache 2.2 a provider-based authentication mechanism was introduced to decouple the actual authentication process from authorization and supporting functionality. One of the side benefits was that authentication providers could be configured and called in a specific order which didn’t depend on the load order of the auth module itself. This same provider based mechanism has been brought forward into authorization as well. What this means is that the Require directive not only specifies which authorization methods should be used, it also specifies the order in which they are called. Multiple authorization methods are called in the same order in which the Require directives appear in the configuration.

With the introduction of authorization container directives such as <RequireAll> and <RequireAny>, the configuration also has control over when the authorization methods are called and which criteria determine when access is granted. See Authorization Containers (p. 519) for an example of how they may be used to express complex authorization logic.

By default all Require directives are handled as though contained within a <RequireAny> container directive. In other words, if any of the specified authorization methods succeed, then authorization is granted.

Using authorization providers for access control

Authentication by username and password is only part of the story. Frequently you want to let people in based on something other than who they are. Something such as where they are coming from.

The authorization providers all, env, host and ip let you allow or deny access based on other host based criteria such as host name or IP address of the machine requesting a document.

The usage of these providers is specified through the Require directive. This directive registers the authorization providers that will be called during the authorization stage of the request processing.
For example:

    Require ip address

where address is an IP address (or a partial IP address) or:

    Require host domain_name

where domain_name is a fully qualified domain name (or a partial domain name); you may provide multiple addresses or domain names, if desired.

For example, if you have someone spamming your message board, and you want to keep them out, you could do the following:

    <RequireAll>
        Require all granted
        Require not ip 10.252.46.165
    </RequireAll>

Visitors coming from that address will not be able to see the content covered by this directive. If, instead, you have a machine name, rather than an IP address, you can use that.

    <RequireAll>
        Require all granted
        Require not host host.example.com
    </RequireAll>

And, if you’d like to block access from an entire domain, you can specify just part of an address or domain name:

    <RequireAll>
        Require all granted
        Require not ip 192.168.205
        Require not host phishers.example.com moreidiots.example
        Require not host ke
    </RequireAll>

Using <RequireAll> with multiple <Require> directives, each negated with not, will only allow access if all of the negated conditions are true. In other words, access will be blocked if any of the negated conditions fails.

Access Control backwards compatibility

One of the side effects of adopting a provider based mechanism for authentication is that the previous access control directives Order, Allow, Deny and Satisfy are no longer needed. However, to provide backwards compatibility for older configurations, these directives have been moved to the mod_access_compat module.

Note: The directives provided by mod_access_compat have been deprecated by mod_authz_host. Mixing old directives like Order, Allow or Deny with new ones like Require is technically possible but discouraged. The mod_access_compat module was created to support configurations containing only old directives to facilitate the 2.4 upgrade. Please check the upgrading (p. 2) guide for more information.

Authentication Caching

There may be times when authentication puts an unacceptable load on a provider or on your network. This is most likely to affect users of mod_authn_dbd (or third-party/custom providers). To deal with this, HTTPD 2.3/2.4 introduces a new caching provider mod_authn_socache to cache credentials and reduce the load on the origin provider(s).

This may offer a substantial performance boost to some users.

More information

You should also read the documentation for mod_auth_basic and mod_authz_host which contain some more information about how this all works. The <AuthnProviderAlias> directive can also help in simplifying certain authentication configurations.

The various ciphers supported by Apache for authentication data are explained in Password Encryptions (p. 371).

And you may want to look at the Access Control (p. 234) howto, which discusses a number of related topics.

6.3 Access Control

Access control refers to any means of controlling access to any resource. This is separate from authentication and authorization (p. 227).

Related Modules and Directives

Access control can be done by several different modules. The most important of these are mod_authz_core and mod_authz_host. Also discussed in this document is access control using mod_rewrite.

Access control by host

If you wish to restrict access to portions of your site based on the host address of your visitors, this is most easily done using mod_authz_host.

The Require directive provides a variety of different ways to allow or deny access to resources. In conjunction with the <RequireAll>, <RequireAny>, and <RequireNone> directives, these requirements may be combined in arbitrarily complex ways, to enforce whatever your access policy happens to be.

The Allow, Deny, and Order directives, provided by mod_access_compat, are deprecated and will go away in a future version.
You should avoid using them, and avoid outdated tutorials recommending their use.

The usage of these directives is:

    Require host address
    Require ip ip.address

In the first form, address is a fully qualified domain name (or a partial domain name); you may provide multiple addresses or domain names, if desired.

In the second form, ip.address is an IP address, a partial IP address, a network/netmask pair, or a network/nnn CIDR specification. Either IPv4 or IPv6 addresses may be used.

See the mod_authz_host documentation (p. 536) for further examples of this syntax.

You can insert not to negate a particular requirement. Note that since a not is a negation of a value, it cannot be used by itself to allow or deny a request, as not true does not constitute false. Thus, to deny a visit using a negation, the block must have one element that evaluates as true or false. For example, if you have someone spamming your message board, and you want to keep them out, you could do the following:

    <RequireAll>
        Require all granted
        Require not ip 10.252.46.165
    </RequireAll>

Visitors coming from that address (10.252.46.165) will not be able to see the content covered by this directive. If, instead, you have a machine name, rather than an IP address, you can use that.

    Require not host host.example.com

And, if you’d like to block access from an entire domain, you can specify just part of an address or domain name:

    Require not ip 192.168.205
    Require not host phishers.example.com moreidiots.example
    Require not host gov

The <RequireAll>, <RequireAny>, and <RequireNone> directives may be used to enforce more complex sets of requirements.

Access control by arbitrary variables

Using the <If> directive, you can allow or deny access based on arbitrary environment variables or request header values.
For example, to deny access based on user-agent (the browser type) you might do the following:

    <If "%{HTTP_USER_AGENT} == 'BadBot'">
        Require all denied
    </If>

Using the Require expr syntax, this could also be written as:

    Require expr %{HTTP_USER_AGENT} != 'BadBot'

Warning: Access control by User-Agent is an unreliable technique, since the User-Agent header can be set to anything at all, at the whim of the end user.

See the expressions document (p. 99) for a further discussion of what expression syntaxes and variables are available to you.

Access control with mod_rewrite

The [F] RewriteRule flag causes a 403 Forbidden response to be sent. Using this, you can deny access to a resource based on arbitrary criteria.

For example, if you wish to block access to a resource between 8pm and 7am, you can do this using mod_rewrite.

    RewriteEngine On
    RewriteCond "%{TIME_HOUR}" ">=20" [OR]
    RewriteCond "%{TIME_HOUR}" "<07"
    RewriteRule "^/fridge" "-" [F]

This will return a 403 Forbidden response for any request for a URL beginning with /fridge that arrives after 8pm or before 7am. This technique can be used for any criteria that you wish to check. You can also redirect, or otherwise rewrite these requests, if that approach is preferred.

The <If> directive, added in 2.4, replaces many things that mod_rewrite has traditionally been used to do, and you should probably look there first before resorting to mod_rewrite.

More information

The expression engine (p. 99) gives you a great deal of power to do a variety of things based on arbitrary server variables, and you should consult that document for more detail.

Also, you should read the mod_authz_core documentation for examples of combining multiple access requirements and specifying how they interact.

See also the Authentication and Authorization (p. 227) howto.
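As a small illustration of combining multiple access requirements with the mod_authz_core containers, the following sketch admits clients from either a network or a host name, except for one excluded address. The directory path, network, host name, and excluded address are all hypothetical values chosen for this example.

```apache
# Sketch only: path, network, host name and address are hypothetical.
<Directory "/usr/local/apache2/htdocs/internal">
    <RequireAll>
        # At least one of these must match ...
        <RequireAny>
            Require ip 192.0.2
            Require host internal.example.com
        </RequireAny>
        # ... and the client must not be this one address.
        Require not ip 192.0.2.99
    </RequireAll>
</Directory>
```

The negated requirement only works here because it sits inside a <RequireAll> block alongside the <RequireAny> container, which can itself grant access. A negated requirement cannot do so on its own.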
6.4 Apache Tutorial: Dynamic Content with CGI

Introduction

Related Modules: mod_alias, mod_cgi
Related Directives: AddHandler, Options, ScriptAlias

The CGI (Common Gateway Interface) defines a way for a web server to interact with external content-generating programs, which are often referred to as CGI programs or CGI scripts. It is the simplest, and most common, way to put dynamic content on your web site. This document will be an introduction to setting up CGI on your Apache web server, and getting started writing CGI programs.

Configuring Apache to permit CGI

In order to get your CGI programs to work properly, you’ll need to have Apache configured to permit CGI execution. There are several ways to do this.

Note: If Apache has been built with shared module support you need to ensure that the module is loaded; in your httpd.conf you need to make sure the LoadModule directive has not been commented out. A correctly configured directive may look like this:

    LoadModule cgi_module modules/mod_cgi.so

ScriptAlias

The ScriptAlias directive tells Apache that a particular directory is set aside for CGI programs. Apache will assume that every file in this directory is a CGI program, and will attempt to execute it, when that particular resource is requested by a client.

The ScriptAlias directive looks like:

    ScriptAlias "/cgi-bin/" "/usr/local/apache2/cgi-bin/"

The example shown is from your default httpd.conf configuration file, if you installed Apache in the default location. The ScriptAlias directive is much like the Alias directive, which defines a URL prefix that is to be mapped to a particular directory. Alias and ScriptAlias are usually used for directories that are outside of the DocumentRoot directory. The difference between Alias and ScriptAlias is that ScriptAlias has the added meaning that everything under that URL prefix will be considered a CGI program.
So, the example above tells Apache that any request for a resource beginning with /cgi-bin/ should be served from the directory /usr/local/apache2/cgi-bin/, and should be treated as a CGI program.

For example, if the URL http://www.example.com/cgi-bin/test.pl is requested, Apache will attempt to execute the file /usr/local/apache2/cgi-bin/test.pl and return the output. Of course, the file will have to exist, and be executable, and return output in a particular way, or Apache will return an error message.

CGI outside of ScriptAlias directories

CGI programs are often restricted to ScriptAlias'ed directories for security reasons. In this way, administrators can tightly control who is allowed to use CGI programs. However, if the proper security precautions are taken, there is no reason why CGI programs cannot be run from arbitrary directories. For example, you may wish to let users have web content in their home directories with the UserDir directive. If they want to have their own CGI programs, but don't have access to the main cgi-bin directory, they will need to be able to run CGI programs elsewhere.

There are two steps to allowing CGI execution in an arbitrary directory. First, the cgi-script handler must be activated using the AddHandler or SetHandler directive. Second, ExecCGI must be specified in the Options directive.

Explicitly using Options to permit CGI execution

You could explicitly use the Options directive, inside your main server configuration file, to specify that CGI execution was permitted in a particular directory:

    <Directory "/usr/local/apache2/htdocs/somedir">
        Options +ExecCGI
    </Directory>

The above directive tells Apache to permit the execution of CGI files. You will also need to tell the server what files are CGI files. The following AddHandler directive tells the server to treat all files with the cgi or pl extension as CGI programs:

    AddHandler cgi-script .cgi .pl

.htaccess files

The .htaccess tutorial (p. 249) shows how to activate CGI programs if you do not have access to httpd.conf.

User Directories

To allow CGI program execution for any file ending in .cgi in users' directories, you can use the following configuration.

    <Directory "/home/*/public_html">
        Options +ExecCGI
        AddHandler cgi-script .cgi
    </Directory>

If you wish to designate a cgi-bin subdirectory of a user's directory where everything will be treated as a CGI program, you can use the following.

    <Directory "/home/*/public_html/cgi-bin">
        Options ExecCGI
        SetHandler cgi-script
    </Directory>

Writing a CGI program

There are two main differences between "regular" programming, and CGI programming.

First, all output from your CGI program must be preceded by a MIME-type header. This is an HTTP header that tells the client what sort of content it is receiving. Most of the time, this will look like:

    Content-type: text/html

Secondly, your output needs to be in HTML, or some other format that a browser will be able to display. Most of the time, this will be HTML, but occasionally you might write a CGI program that outputs a gif image, or other non-HTML content.

Apart from those two things, writing a CGI program will look a lot like any other program that you might write.

Your first CGI program

The following is an example CGI program that prints one line to your browser. Type in the following, save it to a file called first.pl, and put it in your cgi-bin directory.

    #!/usr/bin/perl
    print "Content-type: text/html\n\n";
    print "Hello, World.";

Even if you are not familiar with Perl, you should be able to see what is happening here. The first line tells Apache (or whatever shell you happen to be running under) that this program can be executed by feeding the file to the interpreter found at the location /usr/bin/perl. The second line prints the content-type declaration we talked about, followed by two newline characters. This puts a blank line after the header, to indicate the end of the HTTP headers, and the beginning of the body.
The third line prints the string "Hello, World.". And that's the end of it.

If you open your favorite browser and tell it to get the address

    http://www.example.com/cgi-bin/first.pl

or wherever you put your file, you will see the one line Hello, World. appear in your browser window. It's not very exciting, but once you get that working, you'll have a good chance of getting just about anything working.

But it's still not working!

There are four basic things that you may see in your browser when you try to access your CGI program from the web:

The output of your CGI program
    Great! That means everything worked fine. If the output is correct, but the browser is not processing it correctly, make sure you have the correct Content-Type set in your CGI program.

The source code of your CGI program or a "POST Method Not Allowed" message
    That means that you have not properly configured Apache to process your CGI program. Reread the section on configuring Apache and try to find what you missed.

A message starting with "Forbidden"
    That means that there is a permissions problem. Check the Apache error log and the section below on file permissions.

A message saying "Internal Server Error"
    If you check the Apache error log, you will probably find that it says "Premature end of script headers", possibly along with an error message generated by your CGI program. In this case, you will want to check each of the below sections to see what might be preventing your CGI program from emitting the proper HTTP headers.

File permissions

Remember that the server does not run as you. That is, when the server starts up, it is running with the permissions of an unprivileged user - usually nobody, or www - and so it will need extra permissions to execute files that are owned by you.
Usually, the way to give a file sufficient permissions to be executed by nobody is to give everyone execute permission on the file:

    chmod a+x first.pl

Also, if your program reads from, or writes to, any other files, those files will need to have the correct permissions to permit this.

Path information and environment

When you run a program from your command line, you have certain information that is passed to the shell without you thinking about it. For example, you have a PATH, which tells the shell where it can look for files that you reference.

When a program runs through the web server as a CGI program, it may not have the same PATH. Any programs that you invoke in your CGI program (like sendmail, for example) will need to be specified by a full path, so that the shell can find them when it attempts to execute your CGI program.

A common manifestation of this is the path to the script interpreter (often perl) indicated in the first line of your CGI program, which will look something like:

    #!/usr/bin/perl

Make sure that this is in fact the path to the interpreter.

Note: When editing CGI scripts on Windows, end-of-line characters may be appended to the interpreter path. Ensure that files are then transferred to the server in ASCII mode. Failure to do so may result in "Command not found" warnings from the OS, due to the unrecognized end-of-line character being interpreted as a part of the interpreter filename.

Missing environment variables

If your CGI program depends on non-standard environment variables, you will need to ensure that those variables are passed by Apache.

If HTTP headers are missing from the environment, make sure they are formatted according to RFC 2616 [2], section 4.2: header names must start with a letter, followed only by letters, numbers or hyphens. Any header violating this rule will be dropped silently.

Program errors

Most of the time when a CGI program fails, it's because of a problem with the program itself.
This is particularly true once you get the hang of this CGI stuff, and no longer make the above two mistakes. The first thing to do is to make sure that your program runs from the command line before testing it via the web server. For example, try:

    cd /usr/local/apache2/cgi-bin
    ./first.pl

(Do not call the perl interpreter. The shell and Apache should find the interpreter using the path information on the first line of the script.)

The first thing you see written by your program should be a set of HTTP headers, including the Content-Type, followed by a blank line. If you see anything else, Apache will return the Premature end of script headers error if you try to run it through the server. See Writing a CGI program above for more details.

[2] http://tools.ietf.org/html/rfc2616

Error logs

The error logs are your friend. Anything that goes wrong generates a message in the error log. You should always look there first. If the place where you are hosting your web site does not permit you access to the error log, you should probably host your site somewhere else. Learn to read the error logs, and you'll find that almost all of your problems are quickly identified, and quickly solved.

Suexec

The suexec (p. 115) support program allows CGI programs to be run under different user permissions, depending on which virtual host or user home directory they are located in. Suexec has very strict permission checking, and any failure in that checking will result in your CGI programs failing with Premature end of script headers.

To check if you are using suexec, run apachectl -V and check for the location of SUEXEC_BIN. If Apache finds an suexec binary there on startup, suexec will be activated.

Unless you fully understand suexec, you should not be using it. To disable suexec, simply remove (or rename) the suexec binary pointed to by SUEXEC_BIN and then restart the server. If, after reading about suexec (p. 115), you still wish to use it, then run suexec -V to find the location of the suexec log file, and use that log file to find what policy you are violating.

What's going on behind the scenes?

As you become more advanced in CGI programming, it will become useful to understand more about what's happening behind the scenes, specifically how the browser and server communicate with one another. Although it's all very well to write a program that prints "Hello, World.", it's not particularly useful.

Environment variables

Environment variables are values that float around you as you use your computer. They are useful things like your path (where the computer searches for the actual file implementing a command when you type it), your username, your terminal type, and so on. For a full list of your normal, every day environment variables, type env at a command prompt.

During the CGI transaction, the server and the browser also set environment variables, so that they can communicate with one another. These are things like the browser type (Netscape, IE, Lynx), the server type (Apache, IIS, WebSite), the name of the CGI program that is being run, and so on.

These variables are available to the CGI programmer, and are half of the story of the client-server communication. The complete list of required variables is at Common Gateway Interface RFC [3].

[3] http://www.ietf.org/rfc/rfc3875

This simple Perl CGI program will display all of the environment variables that are being passed around. Two similar programs are included in the cgi-bin directory of the Apache distribution. Note that some variables are required, while others are optional, so you may see some variables listed that were not in the official list. In addition, Apache provides many different ways for you to add your own environment variables (p. 92) to the basic ones provided by default.
    #!/usr/bin/perl
    use strict;
    use warnings;

    print "Content-type: text/html\n\n";
    foreach my $key (keys %ENV) {
        print "$key --> $ENV{$key}<br>";
    }

STDIN and STDOUT

Other communication between the server and the client happens over standard input (STDIN) and standard output (STDOUT). In normal everyday context, STDIN means the keyboard, or a file that a program is given to act on, and STDOUT usually means the console or screen.

When you POST a web form to a CGI program, the data in that form is bundled up into a special format and gets delivered to your CGI program over STDIN. The program then can process that data as though it was coming in from the keyboard, or from a file.

The "special format" is very simple. A field name and its value are joined together with an equals (=) sign, and pairs of values are joined together with an ampersand (&). Inconvenient characters like spaces, ampersands, and equals signs, are converted into their hex equivalent so that they don't gum up the works. The whole data string might look something like:

    name=Rich%20Bowen&city=Lexington&state=KY&sidekick=Squirrel%20Monkey

You'll sometimes also see this type of string appended to a URL. When that is done, the server puts that string into the environment variable called QUERY_STRING. That's called a GET request. Your HTML form specifies whether a GET or a POST is used to deliver the data, by setting the METHOD attribute in the FORM tag.

Your program is then responsible for splitting that string up into useful information. Fortunately, there are libraries and modules available to help you process this data, as well as handle other aspects of your CGI program.

CGI modules/libraries

When you write CGI programs, you should consider using a code library, or module, to do most of the grunt work for you. This leads to fewer errors, and faster development.

If you're writing CGI programs in Perl, modules are available on CPAN [4]. The most popular module for this purpose is CGI.pm. You might also consider CGI::Lite, which implements a minimal set of functionality, which is all you need in most programs.
If you're writing CGI programs in C, there are a variety of options. One of these is the CGIC library, from http://www.boutell.com/cgic/.

[4] http://www.cpan.org/

For more information

The current CGI specification is available in the Common Gateway Interface RFC [5].

When you post a question about a CGI problem that you're having, whether to a mailing list, or to a newsgroup, make sure you provide enough information about what happened, what you expected to happen, and how what actually happened was different, what server you're running, what language your CGI program was in, and, if possible, the offending code. This will make finding your problem much simpler.

Note that questions about CGI problems should never be posted to the Apache bug database unless you are sure you have found a problem in the Apache source code.

[5] http://www.ietf.org/rfc/rfc3875

6.5 Apache httpd Tutorial: Introduction to Server Side Includes

Server-side includes provide a means to add dynamic content to existing HTML documents.

Introduction

Related Modules: mod_include, mod_cgi, mod_expires
Related Directives: Options, XBitHack, AddType, SetOutputFilter, BrowserMatchNoCase

This article deals with Server Side Includes, usually called simply SSI. In this article, I'll talk about configuring your server to permit SSI, and introduce some basic SSI techniques for adding dynamic content to your existing HTML pages.

In the latter part of the article, we'll talk about some of the somewhat more advanced things that can be done with SSI, such as conditional statements in your SSI directives.

What are SSI?

SSI (Server Side Includes) are directives that are placed in HTML pages, and evaluated on the server while the pages are being served.
They let you add dynamically generated content to an existing HTML page, without having to serve the entire page via a CGI program, or other dynamic technology.

For example, you might place a directive into an existing HTML page, such as:

    <!--#echo var="DATE_LOCAL" -->

And, when the page is served, this fragment will be evaluated and replaced with its value:

    Tuesday, 15-Jan-2013 19:28:54 EST

The decision of when to use SSI, and when to have your page entirely generated by some program, is usually a matter of how much of the page is static, and how much needs to be recalculated every time the page is served. SSI is a great way to add small pieces of information, such as the current time - shown above. But if a majority of your page is being generated at the time that it is served, you need to look for some other solution.

Configuring your server to permit SSI

To permit SSI on your server, you must have the following directive either in your httpd.conf file, or in a .htaccess file:

    Options +Includes

This tells Apache that you want to permit files to be parsed for SSI directives. Note that most configurations contain multiple Options directives that can override each other. You will probably need to apply the Options to the specific directory where you want SSI enabled in order to ensure that it gets evaluated last.

Not just any file is parsed for SSI directives. You have to tell Apache which files should be parsed. There are two ways to do this. You can tell Apache to parse any file with a particular file extension, such as .shtml, with the following directives:

    AddType text/html .shtml
    AddOutputFilter INCLUDES .shtml

One disadvantage to this approach is that if you wanted to add SSI directives to an existing page, you would have to change the name of that page, and all links to that page, in order to give it a .shtml extension, so that those directives would be executed.
The other method is to use the XBitHack directive:

    XBitHack on

XBitHack tells Apache to parse files for SSI directives if they have the execute bit set. So, to add SSI directives to an existing page, rather than having to change the file name, you would just need to make the file executable using chmod.

    chmod +x pagename.html

A brief comment about what not to do. You'll occasionally see people recommending that you just tell Apache to parse all .html files for SSI, so that you don't have to mess with .shtml file names. These folks have perhaps not heard about XBitHack. The thing to keep in mind is that, by doing this, you're requiring that Apache read through every single file that it sends out to clients, even if they don't contain any SSI directives. This can slow things down quite a bit, and is not a good idea.

Of course, on Windows, there is no such thing as an execute bit to set, so that limits your options a little.

In its default configuration, Apache does not send the last modified date or content length HTTP headers on SSI pages, because these values are difficult to calculate for dynamic content. This can prevent your document from being cached, and result in slower perceived client performance. There are two ways to solve this:

1. Use the XBitHack Full configuration. This tells Apache to determine the last modified date by looking only at the date of the originally requested file, ignoring the modification date of any included files.

2. Use the directives provided by mod_expires to set an explicit expiration time on your files, thereby letting browsers and proxies know that it is acceptable to cache them.

Basic SSI directives

SSI directives have the following syntax:

    <!--#function attribute=value attribute=value ... -->

It is formatted like an HTML comment, so if you don't have SSI correctly enabled, the browser will ignore it, but it will still be visible in the HTML source. If you have SSI correctly configured, the directive will be replaced with its results.
The function can be one of a number of things, and we'll talk some more about most of these in the next installment of this series. For now, here are some examples of what you can do with SSI.

Today's date

    <!--#echo var="DATE_LOCAL" -->

The echo function just spits out the value of a variable. There are a number of standard variables, which include the whole set of environment variables that are available to CGI programs. Also, you can define your own variables with the set function.

If you don't like the format in which the date gets printed, you can use the config function, with a timefmt attribute, to modify that formatting.

    <!--#config timefmt="%A %B %d, %Y" -->
    Today is <!--#echo var="DATE_LOCAL" -->

Modification date of the file

    This document last modified <!--#flastmod file="index.html" -->

This function is also subject to timefmt format configurations.

Including the results of a CGI program

This is one of the more common uses of SSI - to output the results of a CGI program, such as everybody's favorite, a "hit counter."

    <!--#include virtual="/cgi-bin/counter.pl" -->

Additional examples

Following are some specific examples of things you can do in your HTML documents with SSI.

When was this document modified?

Earlier, we mentioned that you could use SSI to inform the user when the document was most recently modified. However, the actual method for doing that was left somewhat in question. The following code, placed in your HTML document, will put such a time stamp on your page. Of course, you will have to have SSI correctly enabled, as discussed above.

    <!--#config timefmt="%A %B %d, %Y" -->
    This file last modified <!--#flastmod file="ssi.shtml" -->

Of course, you will need to replace the ssi.shtml with the actual name of the file that you're referring to. This can be inconvenient if you're just looking for a generic piece of code that you can paste into any file, so you probably want to use the LAST_MODIFIED variable instead:

    <!--#config timefmt="%D" -->
    This file last modified <!--#echo var="LAST_MODIFIED" -->

For more details on the timefmt format, go to your favorite search site and look for strftime. The syntax is the same.

Including a standard footer

If you are managing any site that is more than a few pages, you may find that making changes to all those pages can be a real pain, particularly if you are trying to maintain some kind of standard look across all those pages.

Using an include file for a header and/or a footer can reduce the burden of these updates. You just have to make one footer file, and then include it into each page with the include SSI command. The include function can determine what file to include with either the file attribute, or the virtual attribute. The file attribute is a file path, relative to the current directory. That means that it cannot be an absolute file path (starting with /), nor can it contain ../ as part of that path. The virtual attribute is probably more useful, and should specify a URL relative to the document being served. It can start with a /, but must be on the same server as the file being served.

    <!--#include virtual="/footer.html" -->

I'll frequently combine the last two things, putting a LAST_MODIFIED directive inside a footer file to be included. SSI directives can be contained in the included file, and includes can be nested - that is, the included file can include another file, and so on.

What else can I config?

In addition to being able to config the time format, you can also config two other things.

Usually, when something goes wrong with your SSI directive, you get the message

    [an error occurred while processing this directive]

If you want to change that message to something else, you can do so with the errmsg attribute to the config function:

    <!--#config errmsg="[It appears that you don't know how to use SSI]" -->

Hopefully, end users will never see this message, because you will have resolved all the problems with your SSI directives before your site goes live. (Right?)

And you can config the format in which file sizes are returned with the sizefmt attribute. You can specify bytes for a full count in bytes, or abbrev for an abbreviated number in Kb or Mb, as appropriate.
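Because timefmt takes strftime-style format strings, a format can be tried out before it goes into a page. The following is a small sketch in Python, whose standard library exposes the same strftime conversions; it is not part of mod_include. The preview function is a hypothetical helper, the two format strings are chosen purely for illustration, and the fixed timestamp is picked to match the sample date shown earlier in this tutorial.

```python
from datetime import datetime

# Fixed, illustrative timestamp (not read from any server), matching the
# sample date used earlier in this tutorial.
SAMPLE = datetime(2013, 1, 15, 19, 28, 54)


def preview(timefmt, when=SAMPLE):
    """Render a timestamp with a strftime-style format string (hypothetical helper)."""
    return when.strftime(timefmt)


# Weekday, day-month-year and time of day.
print(preview("%A, %d-%b-%Y %H:%M:%S"))  # Tuesday, 15-Jan-2013 19:28:54
# A longer, more readable date.
print(preview("%A %B %d, %Y"))           # Tuesday January 15, 2013
```

The output shown in the comments assumes the default C locale; day and month names change with the locale, both here and on the server.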
Executing commands

I expect that I'll have an article some time in the coming months about using SSI with small CGI programs. For now, here's something else that you can do with the exec function. You can actually have SSI execute a command using the shell (/bin/sh, to be precise - or the DOS shell, if you're on Win32). The following, for example, will give you a directory listing.

    <pre>
    <!--#exec cmd="ls" -->
    </pre>

or, on Windows

    <pre>
    <!--#exec cmd="dir" -->
    </pre>

You might notice some strange formatting with this directive on Windows, because the output from dir contains the string "<dir>" in it, which confuses browsers.

Note that this feature is exceedingly dangerous, as it will execute whatever code happens to be embedded in the exec tag. If you have any situation where users can edit content on your web pages, such as with a "guestbook", for example, make sure that you have this feature disabled. You can allow SSI, but not the exec feature, with the IncludesNOEXEC argument to the Options directive.

Advanced SSI techniques

In addition to spitting out content, Apache SSI gives you the option of setting variables, and using those variables in comparisons and conditionals.

Setting variables

Using the set directive, you can set variables for later use. We'll need this later in the discussion, so we'll talk about it here. The syntax of this is as follows:

    <!--#set var="name" value="Rich" -->

In addition to merely setting values literally like that, you can use any other variable, including environment variables (p. 92) or the variables discussed above (like LAST_MODIFIED, for example) to give values to your variables. You will specify that something is a variable, rather than a literal string, by using the dollar sign ($) before the name of the variable.

    <!--#set var="modified" value="$LAST_MODIFIED" -->

To put a literal dollar sign into the value of your variable, you need to escape the dollar sign with a backslash.

    <!--#set var="cost" value="\$100" -->

Finally, if you want to put a variable in the midst of a longer string, and there's a chance that the name of the variable will run up against some other characters, and thus be confused with those characters, you can place the name of the variable in braces, to remove this confusion. (It's hard to come up with a really good example of this, but hopefully you'll get the point.)

    <!--#set var="date" value="${DATE_LOCAL}_${DATE_GMT}" -->

Conditional expressions

Now that we have variables, and are able to set and compare their values, we can use them to express conditionals.
This lets SSI be a tiny programming language of sorts. mod_include provides an if, elif, else, endif structure for building conditional statements. This allows you to effectively generate multiple logical pages out of one actual page.

The structure of this conditional construct is:

    <!--#if expr="test_condition" -->
    <!--#elif expr="test_condition" -->
    <!--#else -->
    <!--#endif -->

A test condition can be any sort of logical comparison - either comparing values to one another, or testing the "truth" of a particular value. (A given string is true if it is nonempty.) For a full list of the comparison operators available to you, see the mod_include documentation.

For example, if you wish to customize the text on your web page based on the time of day, you could use the following recipe, placed in the HTML page:

    Good
    <!--#if expr="%{TIME_HOUR} <12" -->
    morning!
    <!--#else -->
    afternoon!
    <!--#endif -->

Any other variable (either ones that you define, or normal environment variables) can be used in conditional statements. See Expressions in Apache HTTP Server (p. 99) for more information on the expression evaluation engine.

With Apache's ability to set environment variables with the SetEnvIf directives, and other related directives, this functionality can let you do a wide variety of dynamic content on the server side without resorting to a full web application.

Conclusion

SSI is certainly not a replacement for CGI, or other technologies used for generating dynamic web pages. But it is a great way to add small amounts of dynamic content to pages, without doing a lot of extra work.

6.6 Apache HTTP Server Tutorial: .htaccess files

.htaccess files provide a way to make configuration changes on a per-directory basis.

.htaccess files

Related Modules: core, mod_authn_file, mod_authz_groupfile, mod_cgi, mod_include, mod_mime
Related Directives: AccessFileName, AllowOverride, Options, AddHandler, SetHandler, AuthType, AuthName, AuthUserFile, AuthGroupFile, Require

Note: You should avoid using .htaccess files completely if you have access to the httpd main server config file.
Using .htaccess files slows down your Apache HTTP Server. Any directive that you can include in a .htaccess file is better set in a <Directory> block, as it will have the same effect with better performance.

What they are/How to use them

.htaccess files (or "distributed configuration files") provide a way to make configuration changes on a per-directory basis. A file, containing one or more configuration directives, is placed in a particular document directory, and the directives apply to that directory, and all subdirectories thereof.

Note: If you want to call your .htaccess file something else, you can change the name of the file using the AccessFileName directive. For example, if you would rather call the file .config then you can put the following in your server configuration file:

    AccessFileName ".config"

In general, .htaccess files use the same syntax as the main configuration files (p. 32). What you can put in these files is determined by the AllowOverride directive. This directive specifies, in categories, what directives will be honored if they are found in a .htaccess file. If a directive is permitted in a .htaccess file, the documentation for that directive will contain an Override section, specifying what value must be in AllowOverride in order for that directive to be permitted.

For example, if you look at the documentation for the AddDefaultCharset directive, you will find that it is permitted in .htaccess files. (See the Context line in the directive summary.) The Override (p. 377) line reads FileInfo. Thus, you must have at least AllowOverride FileInfo in order for this directive to be honored in .htaccess files.

Example:

    Context (p. 377):   server config, virtual host, directory, .htaccess
    Override (p. 377):  FileInfo

If you are unsure whether a particular directive is permitted in a .htaccess file, look at the documentation for that directive, and check the Context line for ".htaccess".
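The relationship between a directive's Override category and the AllowOverride setting can be pictured with a toy model. The sketch below is in Python and is only an illustration of the rule described above, not how httpd implements it. The REQUIRED_OVERRIDE table and the is_honored function are hypothetical names, and the table lists just three directives whose categories are taken from their documentation.

```python
# Override category for a few directives, as given on the "Override" line
# of each directive's documentation. (Illustrative subset only.)
REQUIRED_OVERRIDE = {
    "AddDefaultCharset": "FileInfo",
    "Options": "Options",
    "AuthType": "AuthConfig",
}


def is_honored(directive, allow_override):
    """Toy check: would `directive` be honored in a .htaccess file?

    `allow_override` is the list of categories given to AllowOverride,
    for example ["FileInfo", "AuthConfig"], ["All"] or ["None"].
    """
    allowed = set(allow_override)
    if "None" in allowed:
        # AllowOverride None disables .htaccess files completely.
        return False
    if "All" in allowed:
        # AllowOverride All permits every directive that is valid in
        # .htaccess context.
        return True
    # Otherwise the directive's own category must be listed.
    return REQUIRED_OVERRIDE[directive] in allowed


print(is_honored("AddDefaultCharset", ["FileInfo"]))    # True
print(is_honored("AddDefaultCharset", ["AuthConfig"]))  # False
```

The model leaves out finer-grained forms of AllowOverride; the directive documentation remains the authority on what is permitted where.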
When (not) to use .htaccess files

In general, you should only use .htaccess files when you don't have access to the main server configuration file. There is, for example, a common misconception that user authentication should always be done in .htaccess files, and, in more recent years, another misconception that mod_rewrite directives must go in .htaccess files. This is simply not the case. You can put user authentication configurations in the main server configuration, and this is, in fact, the preferred way to do things. Likewise, mod_rewrite directives work better, in many respects, in the main server configuration.

.htaccess files should be used in a case where the content providers need to make configuration changes to the server on a per-directory basis, but do not have root access on the server system. In the event that the server administrator is not willing to make frequent configuration changes, it might be desirable to permit individual users to make these changes in .htaccess files for themselves. This is particularly true, for example, in cases where ISPs are hosting multiple user sites on a single machine, and want their users to be able to alter their configuration.

However, in general, use of .htaccess files should be avoided when possible. Any configuration that you would consider putting in a .htaccess file can just as effectively be made in a <Directory> section in your main server configuration file.

There are two main reasons to avoid the use of .htaccess files.

The first of these is performance. When AllowOverride is set to allow the use of .htaccess files, httpd will look in every directory for .htaccess files. Thus, permitting .htaccess files causes a performance hit, whether or not you actually even use them! Also, the .htaccess file is loaded every time a document is requested.

Further note that httpd must look for .htaccess files in all higher-level directories, in order to have a full complement of directives that it must apply.
(See section on how directives are applied.) Thus, if a file is requested out of a directory /www/htdocs/example, httpd must look for the following files:

    /.htaccess
    /www/.htaccess
    /www/htdocs/.htaccess
    /www/htdocs/example/.htaccess

And so, for each file access out of that directory, there are 4 additional file-system accesses, even if none of those files are present. (Note that this would only be the case if .htaccess files were enabled for /, which is not usually the case.)

In the case of RewriteRule directives, in .htaccess context these regular expressions must be re-compiled with every request to the directory, whereas in main server configuration context they are compiled once and cached. Additionally, the rules themselves are more complicated, as one must work around the restrictions that come with per-directory context and mod_rewrite. Consult the Rewrite Guide (p. 147) for more detail on this subject.

The second consideration is one of security. You are permitting users to modify server configuration, which may result in changes over which you have no control. Carefully consider whether you want to give your users this privilege. Note also that giving users fewer privileges than they need will lead to additional technical support requests. Make sure you clearly tell your users what level of privileges you have given them. Specifying exactly what you have set AllowOverride to, and pointing them to the relevant documentation, will save you a lot of confusion later.

Note that it is completely equivalent to put a .htaccess file in a directory /www/htdocs/example containing a directive, and to put that same directive in a <Directory "/www/htdocs/example"> section in your main server configuration:

Contents of .htaccess file in /www/htdocs/example:

    AddType text/example ".exm"

Section from your httpd.conf file:

    <Directory "/www/htdocs/example">
        AddType text/example ".exm"
    </Directory>

However, putting this configuration in your server configuration file will result in less of a performance hit, as the configuration is loaded once when httpd starts, rather than every time a file is requested.

The use of .htaccess files can be disabled completely by setting the AllowOverride directive to none:

    AllowOverride None

How directives are applied

The configuration directives found in a .htaccess file are applied to the directory in which the .htaccess file is found, and to all subdirectories thereof. However, it is important to also remember that there may have been .htaccess files in directories higher up. Directives are applied in the order that they are found. Therefore, a .htaccess file in a particular directory may override directives found in .htaccess files found higher up in the directory tree. And those, in turn, may have overridden directives found yet higher up, or in the main server configuration file itself.

Example:

In the directory /www/htdocs/example1 we have a .htaccess file containing the following:

    Options +ExecCGI

(Note: you must have "AllowOverride Options" in effect to permit the use of the "Options" directive in .htaccess files.)

In the directory /www/htdocs/example1/example2 we have a .htaccess file containing:

    Options Includes

Because of this second .htaccess file, in the directory /www/htdocs/example1/example2, CGI execution is not permitted, as only Options Includes is in effect, which completely overrides any earlier setting that may have been in place.

Merging of .htaccess with the main configuration files

As discussed in the documentation on Configuration Sections (p. 35), .htaccess files can override the <Directory> sections for the corresponding directory, but will be overridden by other types of configuration sections from the main configuration files. This fact can be used to enforce certain configurations, even in the presence of a liberal AllowOverride setting. For example, to prevent script execution while allowing anything else to be set in .htaccess you can use:

    <Directory "/www/htdocs">
        AllowOverride All
    </Directory>

    <Location "/">
        Options +IncludesNoExec -ExecCGI
    </Location>

=⇒ This example assumes that your DocumentRoot is /www/htdocs.

Authentication example

If you jumped directly to this part of the document to find out how to do authentication, it is important to note one thing. There is a common misconception that you are required to use .htaccess files in order to implement password authentication. This is not the case. Putting authentication directives in a <Directory> section, in your main server configuration file, is the preferred way to implement this, and .htaccess files should be used only if you don’t have access to the main server configuration file. See above for a discussion of when you should and should not use .htaccess files.

Having said that, if you still think you need to use a .htaccess file, you may find that a configuration such as the following works for you.

.htaccess file contents:

    AuthType Basic
    AuthName "Password Required"
    AuthUserFile "/www/passwords/password.file"
    AuthGroupFile "/www/passwords/group.file"
    Require group admins

Note that AllowOverride AuthConfig must be in effect for these directives to have any effect.

Please see the authentication tutorial (p. 227) for a more complete discussion of authentication and authorization.

Server Side Includes example

Another common use of .htaccess files is to enable Server Side Includes for a particular directory.
This may be done with the following configuration directives, placed in a .htaccess file in the desired directory:

    Options +Includes
    AddType text/html "shtml"
    AddHandler server-parsed shtml

Note that AllowOverride Options and AllowOverride FileInfo must both be in effect for these directives to have any effect.

Please see the SSI tutorial (p. 243) for a more complete discussion of server-side includes.

Rewrite Rules in .htaccess files

When using RewriteRule in .htaccess files, be aware that the per-directory context changes things a bit. In particular, rules are matched against a path relative to the current directory, rather than against the original requested URI. Consider the following examples:

    # In httpd.conf
    RewriteRule "^/images/(.+)\.jpg" "/images/$1.png"

    # In .htaccess in root dir
    RewriteRule "^images/(.+)\.jpg" "images/$1.png"

    # In .htaccess in images/
    RewriteRule "^(.+)\.jpg" "$1.png"

In a .htaccess in your document directory, the leading slash is removed from the value supplied to RewriteRule, and in the images subdirectory, /images/ is removed from it. Thus, your regular expression needs to omit that portion as well.

Consult the mod_rewrite documentation (p. 146) for further details on using mod_rewrite.

CGI example

Finally, you may wish to use a .htaccess file to permit the execution of CGI programs in a particular directory. This may be implemented with the following configuration:

    Options +ExecCGI
    AddHandler cgi-script "cgi" "pl"

Alternatively, if you wish to have all files in the given directory be considered to be CGI programs, this may be done with the following configuration:

    Options +ExecCGI
    SetHandler cgi-script

Note that AllowOverride Options and AllowOverride FileInfo must both be in effect for these directives to have any effect.

Please see the CGI tutorial (p. 236) for a more complete discussion of CGI programming and configuration.
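For comparison, the first CGI example above can be moved into the main server configuration, which is what the earlier sections of this tutorial recommend whenever you have access to it. The following is a minimal sketch, assuming the directory /www/htdocs/example used elsewhere in this tutorial; adjust the path to your own layout.

```apache
# Sketch only: /www/htdocs/example is the illustrative path used in this tutorial.
<Directory "/www/htdocs/example">
    # With overrides disabled, httpd does not look for .htaccess files in this tree.
    AllowOverride None

    # Same directives as the .htaccess CGI example, now read once at startup.
    Options +ExecCGI
    AddHandler cgi-script "cgi" "pl"
</Directory>
```

Because the directives now live in a <Directory> section, no AllowOverride Options or AllowOverride FileInfo setting is needed for them to take effect.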
Troubleshooting

When you put configuration directives in a .htaccess file, and you don’t get the desired effect, there are a number of things that may be going wrong.

Most commonly, the problem is that AllowOverride is not set such that your configuration directives are being honored. Make sure that you don’t have an AllowOverride None in effect for the file scope in question. A good test for this is to put garbage in your .htaccess file and reload the page. If a server error is not generated, then you almost certainly have AllowOverride None in effect.

If, on the other hand, you are getting server errors when trying to access documents, check your httpd error log. It will likely tell you that the directive used in your .htaccess file is not permitted.

    [Fri Sep 17 18:43:16 2010] [alert] [client 192.168.200.51] /var/www/html/.htaccess: DirectoryIndex not allowed here

This will indicate either that you’ve used a directive that is never permitted in .htaccess files, or that you simply don’t have AllowOverride set to a level sufficient for the directive you’ve used. Consult the documentation for that particular directive to determine which is the case.

Alternatively, it may tell you that you had a syntax error in your usage of the directive itself.

    [Sat Aug 09 16:22:34 2008] [alert] [client 192.168.200.51] /var/www/html/.htaccess: RewriteCond: bad flag delimiters

In this case, the error message should be specific to the particular syntax error that you have committed.

6.7 Per-user web directories

On systems with multiple users, each user can be permitted to have a web site in their home directory using the UserDir directive. Visitors to a URL http://example.com/~username/ will get content out of the home directory of the user "username", out of the subdirectory specified by the UserDir directive.

Note that, by default, access to these directories is not enabled.
You can enable access when using UserDir by uncommenting the line:

    #Include conf/extra/httpd-userdir.conf

in the default config file conf/httpd.conf, and adapting the httpd-userdir.conf file as necessary, or by including the appropriate directives in a <Directory> block within the main config file.

See also
• Mapping URLs to the Filesystem (p. 64)

Per-user web directories

Related Modules: mod_userdir
Related Directives: UserDir, DirectoryMatch, AllowOverride

Setting the file path with UserDir

The UserDir directive specifies a directory out of which per-user content is loaded. This directive may take several different forms.

If a path is given which does not start with a leading slash, it is assumed to be a directory path relative to the home directory of the specified user. Given this configuration:

    UserDir public_html

the URL http://example.com/~rbowen/file.html will be translated to the file path /home/rbowen/public_html/file.html

If a path is given starting with a slash, a directory path will be constructed using that path, plus the username specified. Given this configuration:

    UserDir /var/html

the URL http://example.com/~rbowen/file.html will be translated to the file path /var/html/rbowen/file.html

If a path is provided which contains an asterisk (*), a path is used in which the asterisk is replaced with the username. Given this configuration:

    UserDir /var/www/*/docs

the URL http://example.com/~rbowen/file.html will be translated to the file path /var/www/rbowen/docs/file.html

Multiple directories or directory paths can also be set.

    UserDir public_html /var/html

For the URL http://example.com/~rbowen/file.html, Apache will search for ~rbowen. If it isn’t found, Apache will search for rbowen in /var/html. If found, the above URL will then be translated to the file path /var/html/rbowen/file.html

Redirecting to external URLs

The UserDir directive can be used to redirect user directory requests to external URLs.
    UserDir http://example.org/users/*/

The above example will redirect a request for http://example.com/~bob/abc.html to http://example.org/users/bob/abc.html.

Restricting what users are permitted to use this feature

Using the syntax shown in the UserDir documentation, you can restrict what users are permitted to use this functionality:

    UserDir disabled root jro fish

The configuration above will enable the feature for all users except for those listed in the disabled statement. You can, likewise, disable the feature for all but a few users by using a configuration like the following:

    UserDir disabled
    UserDir enabled rbowen krietz

See UserDir documentation for additional examples.

Enabling a cgi directory for each user

In order to give each user their own cgi-bin directory, you can use a <Directory> directive to make a particular subdirectory of a user’s home directory cgi-enabled.

    <Directory "/home/*/public_html/cgi-bin/">
        Options ExecCGI
        SetHandler cgi-script
    </Directory>

Then, presuming that UserDir is set to public_html, a cgi program example.cgi could be loaded from that directory as:

    http://example.com/~rbowen/cgi-bin/example.cgi

Allowing users to alter configuration

If you want to allow users to modify the server configuration in their web space, they will need to use .htaccess files to make these changes. Ensure that you have set AllowOverride to a value sufficient for the directives that you want to permit the users to modify. See the .htaccess tutorial (p. 249) for additional details on how this works.

6.8 Reverse Proxy Guide

In addition to being a "basic" web server, and providing static and dynamic content to end-users, Apache httpd (as well as most other web servers) can also act as a reverse proxy server, also known as a "gateway" server.

In such scenarios, httpd itself does not generate or host the data, but rather the content is obtained from one or several backend servers, which normally have no direct connection to the external network.
As httpd receives a request from a client, the request itself is proxied to one of these backend servers, which then handles the request, generates the content and then sends this content back to httpd, which then generates the actual HTTP response back to the client.

There are numerous reasons for such an implementation, but the typical rationales are security, high availability, load balancing and centralized authentication/authorization. It is critical in these implementations that the layout, design and architecture of the backend infrastructure (those servers which actually handle the requests) are insulated and protected from the outside; as far as the client is concerned, the reverse proxy server is the sole source of all content.

A typical implementation is below:

[Figure: typical reverse proxy implementation]

Reverse Proxy

Related Modules: mod_proxy, mod_proxy_balancer, mod_proxy_hcheck
Related Directives: ProxyPass, BalancerMember

Simple reverse proxying

The ProxyPass directive specifies the mapping of incoming requests to the backend server (or a cluster of servers known as a Balancer group). The simplest example proxies all requests ("/") to a single backend:

    ProxyPass "/" "http://www.example.com/"

To ensure that Location: headers generated by the backend are modified to point to the reverse proxy, instead of back to the backend itself, the ProxyPassReverse directive is most often required:

    ProxyPass "/" "http://www.example.com/"
    ProxyPassReverse "/" "http://www.example.com/"

Only specific URIs can be proxied, as shown in this example:

    ProxyPass "/images" "http://www.example.com/"
    ProxyPassReverse "/images" "http://www.example.com/"

In the above, any requests which start with the /images path will be proxied to the specified backend; all other requests will be handled locally.
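As an illustration of path-specific proxying, the sketch below places the mapping inside a virtual host and keeps the trailing slashes consistent on both arguments, which the mod_proxy documentation advises in order to avoid doubled or missing slashes in backend URLs. The host names rproxy.example.com and backend.example.com are placeholders chosen for this sketch, not values prescribed by this guide.

```apache
# Sketch only: server and backend names are placeholders.
<VirtualHost *:80>
    ServerName rproxy.example.com

    # Reverse proxying does not require forward proxying; keep it off.
    ProxyRequests Off

    # /images/foo.png is fetched from http://backend.example.com/images/foo.png
    ProxyPass        "/images/" "http://backend.example.com/images/"
    # Rewrite Location: headers from the backend so clients stay on the proxy.
    ProxyPassReverse "/images/" "http://backend.example.com/images/"
</VirtualHost>
```

Requests outside /images/ are not matched by ProxyPass and are served by the virtual host itself.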
Clusters and Balancers

As useful as the above is, it still has a deficiency: should the (single) backend node go down, or become heavily loaded, proxying those requests provides no real advantage. What is needed is the ability to define a set or group of backend servers which can handle such requests and for the reverse proxy to load balance and failover among them. This group is sometimes called a cluster but Apache httpd’s term is a balancer. One defines a balancer by leveraging the <Proxy> and BalancerMember directives as shown:

    <Proxy balancer://myset>
        BalancerMember http://www2.example.com:8080
        BalancerMember http://www3.example.com:8080
        ProxySet lbmethod=bytraffic
    </Proxy>

    ProxyPass "/images/" "balancer://myset/"
    ProxyPassReverse "/images/" "balancer://myset/"

The balancer:// scheme is what tells httpd that we are creating a balancer set, with the name myset. It includes 2 backend servers, which httpd calls BalancerMembers. In this case, any requests for /images will be proxied to one of the 2 backends. The ProxySet directive specifies that the myset Balancer use a load balancing algorithm that balances based on I/O bytes.

=⇒ Hint: BalancerMembers are also sometimes referred to as workers.

Balancer and BalancerMember configuration

You can adjust numerous configuration details of the balancers and the workers via the various parameters defined in ProxyPass. For example, assuming we would want http://www3.example.com:8080 to handle 3x the traffic with a timeout of 1 second, we would adjust the configuration as follows:

    <Proxy balancer://myset>
        BalancerMember http://www2.example.com:8080
        BalancerMember http://www3.example.com:8080 loadfactor=3 timeout=1
        ProxySet lbmethod=bytraffic
    </Proxy>

    ProxyPass "/images" "balancer://myset/"
    ProxyPassReverse "/images" "balancer://myset/"

Failover

You can also fine-tune various failover scenarios, detailing which workers and even which balancers should be accessed in such cases. For example, the below setup implements 2 failover cases: In the first, http://hstandby.example.com:8080 is only sent traffic if all other workers in the myset balancer are not available.
If that worker itself is not available, only then will the http://bkup1.example.com:8080 and http://bkup2.example.com:8080 workers be brought into rotation:

    <Proxy balancer://myset>
        BalancerMember http://www2.example.com:8080
        BalancerMember http://www3.example.com:8080 loadfactor=3 timeout=1
        BalancerMember http://hstandby.example.com:8080 status=+H
        BalancerMember http://bkup1.example.com:8080 lbset=1
        BalancerMember http://bkup2.example.com:8080 lbset=1
        ProxySet lbmethod=byrequests
    </Proxy>

    ProxyPass "/images/" "balancer://myset/"
    ProxyPassReverse "/images/" "balancer://myset/"

The key to this failover setup is setting http://hstandby.example.com:8080 with the +H status flag, which puts it in hot standby mode, and making the 2 bkup# servers part of load balancer set 1 (the default set is 0). For failover, hot standbys (if they exist) are used first, when all regular workers are unavailable; load balancer sets are always tried lowest number first.

Balancer Manager

One of the most distinctive and useful features of Apache httpd’s reverse proxy is the embedded balancer-manager application. Similar to mod_status, balancer-manager displays the current working configuration and status of the enabled balancers and workers currently in use. However, not only does it display these parameters, it also allows for dynamic, runtime, on-the-fly reconfiguration of almost all of them, including adding new BalancerMembers (workers) to an existing balancer. To enable this capability, the following needs to be added to your configuration:

    <Location "/balancer-manager">
        SetHandler balancer-manager
        Require host localhost
    </Location>

! Warning: Do not enable the balancer-manager until you have secured your server (p. 787). In particular, ensure that access to the URL is tightly restricted.
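One way to follow the warning above is to combine an address restriction with password authentication. The following is a sketch under stated assumptions: the address range 192.0.2.0/24 is a documentation placeholder, and the password file path is borrowed from the authentication example in the .htaccess tutorial rather than being a required location.

```apache
# Sketch only: the network range and password file are placeholders.
<Location "/balancer-manager">
    SetHandler balancer-manager

    AuthType Basic
    AuthName "Balancer Manager"
    AuthUserFile "/www/passwords/password.file"

    # Both conditions must hold: trusted address AND a valid login.
    <RequireAll>
        Require ip 192.0.2.0/24
        Require valid-user
    </RequireAll>
</Location>
```

Basic authentication sends credentials in a form that is trivially decoded, so a setup like this is best served over TLS only.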
When the reverse proxy server is accessed at that URL (e.g. http://rproxy.example.com/balancer-manager/), you will see a page similar to the below:

[Screenshot: balancer-manager overview page]

This form allows the devops admin to adjust various parameters, take workers offline, change load balancing methods and add new workers. For example, clicking on the balancer itself, you will get the following page:

[Screenshot: balancer settings page]

Whereas clicking on a worker displays this page:

[Screenshot: worker settings page]

To have these changes persist across restarts of the reverse proxy, ensure that BalancerPersist is enabled.

Dynamic Health Checks

Before httpd proxies a request to a worker, it can "test" if that worker is available via setting the ping parameter for that worker using ProxyPass. Oftentimes it is more useful to check the health of the workers out of band, in a dynamic fashion. This is achieved in Apache httpd by the mod_proxy_hcheck module.

BalancerMember status flags

In the balancer-manager the current state, or status, of a worker is displayed and can be set/reset. The meanings of these statuses are as follows:

    Flag  String  Description
          Ok      Worker is available
          Init    Worker has been initialized
    D     Dis     Worker is disabled and will not accept any requests; will be automatically retried.
    S     Stop    Worker is administratively stopped; will not accept requests and will not be automatically retried
    I     Ign     Worker is in ignore-errors mode and will always be considered available.
    H     Stby    Worker is in hot-standby mode and will only be used if no other viable workers are available.
    E     Err     Worker is in an error state, usually due to failing pre-request check; requests will not be proxied to this worker, but it will be retried depending on the retry setting of the worker.
    N     Drn     Worker is in drain mode and will only accept existing sticky sessions destined for itself and ignore all other requests.
    C     HcFl    Worker has failed dynamic health check and will not be used until it passes subsequent health checks.

Chapter 7 Platform-specific Notes

7.1 Platform Specific Notes

Microsoft Windows

Using Apache
This document explains how to install, configure and run Apache 2.4 under Microsoft Windows.
See: Using Apache with Microsoft Windows (p. 267)

Compiling Apache
There are many important points to consider before you begin compiling Apache. This document explains them.
See: Compiling Apache for Microsoft Windows (p. 275)

Unix Systems

RPM Based Systems (Redhat / CentOS / Fedora)
This document explains how to build, install, and run Apache 2.4 on systems supporting the RPM packaging format.
See: Using Apache With RPM Based Systems (p. 281)

Other Platforms

Novell NetWare
This document explains how to install, configure and run Apache 2.4 under Novell NetWare 5.1 and above.
See: Using Apache With Novell NetWare (p. 284)

7.2 Using Apache HTTP Server on Microsoft Windows

This document explains how to install, configure and run Apache 2.5 under Microsoft Windows. If you have questions after reviewing the documentation (and any event and error logs), you should consult the peer-supported users’ mailing list [1].

This document assumes that you are installing a binary distribution of Apache. If you want to compile Apache yourself (possibly to help with development or tracking down bugs), see Compiling Apache for Microsoft Windows (p. 275).

Operating System Requirements

The primary Windows platform for running Apache 2.5 is Windows 2000 or later. Always obtain and install the current service pack to avoid operating system bugs.

=⇒ Apache HTTP Server versions later than 2.2 will not run on any operating system earlier than Windows 2000.

Downloading Apache for Windows

The Apache HTTP Server Project itself does not provide binary releases of software, only source code.
Individual committers may provide binary packages as a convenience, but it is not a release deliverable.

If you cannot compile the Apache HTTP Server yourself, you can obtain a binary package from numerous binary distributions available on the Internet.

Popular options for deploying Apache httpd, and, optionally, PHP and MySQL, on Microsoft Windows, include:

• ApacheHaus [2]
• Apache Lounge [3]
• Bitnami WAMP Stack [4]
• WampServer [5]
• XAMPP [6]

[1] http://httpd.apache.org/userslist.html
[2] http://www.apachehaus.com/cgi-bin/download.plx
[3] http://www.apachelounge.com/download/
[4] http://bitnami.com/stack/wamp
[5] http://www.wampserver.com/
[6] http://www.apachefriends.org/en/xampp.html

Customizing Apache for Windows

Apache is configured by the files in the conf subdirectory. These are the same files used to configure the Unix version, but there are a few different directives for Apache on Windows. See the directive index (p. 1106) for all the available directives.

The main differences in Apache for Windows are:

• Because Apache for Windows is multithreaded, it does not use a separate process for each request, as Apache can on Unix. Instead there are usually only two Apache processes running: a parent process, and a child which handles the requests. Within the child process each request is handled by a separate thread.

  The process management directives are also different:

  MaxConnectionsPerChild: Like the Unix directive, this controls how many connections a single child process will serve before exiting. However, unlike on Unix, a replacement process is not instantly available. Use the default MaxConnectionsPerChild 0, unless instructed to change the behavior to overcome a memory leak in third party modules or in-process applications.

  ! Warning: The server configuration file is reread when a new child process is started. If you have modified httpd.conf, the new child may not start or you may receive unexpected results.
  ThreadsPerChild: This directive is new. It tells the server how many threads it should use. This is the maximum number of connections the server can handle at once, so be sure to set this number high enough for your site if you get a lot of hits. The recommended default is ThreadsPerChild 150, but this must be adjusted to reflect the greatest anticipated number of simultaneous connections to accept.

• The directives that accept filenames as arguments must use Windows filenames instead of Unix ones. However, because Apache may interpret backslashes as an "escape character" sequence, you should consistently use forward slashes in path names, not backslashes.

• While filenames are generally case-insensitive on Windows, URLs are still treated internally as case-sensitive before they are mapped to the filesystem. For example, the <Location>, Alias, and ProxyPass directives all use case-sensitive arguments. For this reason, it is particularly important to use the <Directory> directive when attempting to limit access to content in the filesystem, since this directive applies to any content in a directory, regardless of how it is accessed. If you wish to ensure that only lowercase is used in URLs, you can use something like:

    RewriteEngine On
    RewriteMap lowercase "int:tolower"
    RewriteCond "%{REQUEST_URI}" "[A-Z]"
    RewriteRule "(.*)" "${lowercase:$1}" [R,L]

• When running, Apache needs write access only to the logs directory and any configured cache directory tree. Due to the issue of case insensitive and short 8.3 format names, Apache must validate all path names given. This means that each directory which Apache evaluates, from the drive root up to the directory leaf, must have read, list and traverse directory permissions. If Apache2.5 is installed at C:\Program Files, then the root directory, Program Files and Apache2.5 must all be visible to Apache.

• Apache for Windows contains the ability to load modules at runtime, without recompiling the server.
  If Apache is compiled normally, it will install a number of optional modules in the \Apache2.5\modules directory. To activate these or other modules, the new LoadModule directive must be used. For example, to activate the status module, use the following (in addition to the status-activating directives in access.conf):

    LoadModule status_module "modules/mod_status.so"

  Information on creating loadable modules (p. 908) is also available.

• Apache can also load ISAPI (Internet Server Application Programming Interface) extensions such as those used by Microsoft IIS and other Windows servers. More information is available (p. 683). Note that Apache cannot load ISAPI Filters, and ISAPI Handlers with some Microsoft feature extensions will not work.

• When running CGI scripts, the method Apache uses to find the interpreter for the script is configurable using the ScriptInterpreterSource directive.

• Since it is often difficult to manage files with names like .htaccess in Windows, you may find it useful to change the name of this per-directory configuration file using the AccessFileName directive.

• Any errors during Apache startup are logged into the Windows event log when running on Windows NT. This mechanism acts as a backup for those situations where Apache is not yet prepared to use the error.log file. You can review the Windows Application Event Log by using the Event Viewer, e.g. Start - Settings - Control Panel - Administrative Tools - Event Viewer.

Running Apache as a Service

Apache comes with a utility called the Apache Service Monitor. With it you can see and manage the state of all installed Apache services on any machine on your network. To be able to manage an Apache service with the monitor, you have to first install the service (either automatically via the installation or manually).
You can install Apache as a Windows NT service as follows from the command prompt at the Apache bin subdirectory:

    httpd.exe -k install

If you need to specify the name of the service you want to install, use the following command. You have to do this if you have several different service installations of Apache on your computer. If you specify a name during the install, you have to also specify it during any other -k operation.

    httpd.exe -k install -n "MyServiceName"

If you need to have specifically named configuration files for different services, you must use this:

    httpd.exe -k install -n "MyServiceName" -f "c:\files\my.conf"

If you use the first command without any special parameters except -k install, the service will be called Apache2.5 and the configuration will be assumed to be conf\httpd.conf.

Removing an Apache service is easy. Just use:

    httpd.exe -k uninstall

The specific Apache service to be uninstalled can be specified by using:

    httpd.exe -k uninstall -n "MyServiceName"

Normal starting, restarting and shutting down of an Apache service is usually done via the Apache Service Monitor, by using commands like NET START Apache2.5 and NET STOP Apache2.5, or via normal Windows service management. Before starting Apache as a service by any means, you should test the service’s configuration file by using:

    httpd.exe -n "MyServiceName" -t

You can control an Apache service by its command line switches, too. To start an installed Apache service you’ll use this:

    httpd.exe -k start -n "MyServiceName"

To stop an Apache service via the command line switches, use this:

    httpd.exe -k stop -n "MyServiceName"

or

    httpd.exe -k shutdown -n "MyServiceName"

You can also restart a running service and force it to reread its configuration file by using:

    httpd.exe -k restart -n "MyServiceName"

By default, all Apache services are registered to run as the system user (the LocalSystem account).
The LocalSystem account has no privileges to your network via any Windows-secured mechanism, including the file system, named pipes, DCOM, or secure RPC. It has, however, wide privileges locally.

! Never grant any network privileges to the LocalSystem account! If you need Apache to be able to access network resources, create a separate account for Apache as noted below.

It is recommended that users create a separate account for running Apache service(s). If you have to access network resources via Apache, this is required.

1. Create a normal domain user account, and be sure to memorize its password.
2. Grant the newly-created user the privileges Log on as a service and Act as part of the operating system. On Windows NT 4.0 these privileges are granted via User Manager for Domains, but on Windows 2000 and XP you probably want to use Group Policy for propagating these settings. You can also manually set these via the Local Security Policy MMC snap-in.
3. Confirm that the created account is a member of the Users group.
4. Grant the account read and execute (RX) rights to all document and script folders (htdocs and cgi-bin for example).
5. Grant the account change (RWXD) rights to the Apache logs directory.
6. Grant the account read and execute (RX) rights to the httpd.exe binary executable.

=⇒ It is usually a good practice to grant the user the Apache service runs as read and execute (RX) access to the whole Apache2.5 directory, except the logs subdirectory, where the user has to have at least change (RWXD) rights.

If you allow the account to log in as a user and as a service, then you can log on with that account and test that the account has the privileges to execute the scripts, read the web pages, and that you can start Apache in a console window. If this works, and you have followed the steps above, Apache should execute as a service with no problems.
=⇒ Error code 2186 is a good indication that you need to review the "Log On As" configuration for the service, since Apache cannot access a required network resource. Also, pay close attention to the privileges of the user Apache is configured to run as.

When starting Apache as a service you may encounter an error message from the Windows Service Control Manager. For example, if you try to start Apache by using the Services applet in the Windows Control Panel, you may get the following message:

    Could not start the Apache2.5 service on \\COMPUTER
    Error 1067; The process terminated unexpectedly.

You will get this generic error if there is any problem with starting the Apache service. In order to see what is really causing the problem you should follow the instructions for Running Apache for Windows from the Command Prompt.

If you are having problems with the service, it is suggested you follow the instructions below to try starting httpd.exe from a console window, and work out the errors before struggling to start it as a service again.

Running Apache as a Console Application

Running Apache as a service is usually the recommended way to use it, but it is sometimes easier to work from the command line, especially during initial configuration and testing.

To run Apache from the command line as a console application, use the following command:

    httpd.exe

Apache will execute, and will remain running until it is stopped by pressing Control-C.

You can also run Apache via the shortcut Start Apache in Console placed in Start Menu --> Programs --> Apache HTTP Server 2.5.xx --> Control Apache Server during the installation. This will open a console window and start Apache inside it. If you don’t have Apache installed as a service, the window will remain visible until you stop Apache by pressing Control-C in the console window in which Apache is running. The server will exit in a few seconds.
However, if you do have Apache installed as a service, the shortcut starts the service. If the Apache service is running already, the shortcut doesn’t do anything.

If Apache is running as a service, you can tell it to stop by opening another console window and entering:

    httpd.exe -k shutdown

Running as a service should be preferred over running in a console window because this lets Apache end any current operations and clean up gracefully.

But if the server is running in a console window, you can only stop it by pressing Control-C in the same window.

You can also tell Apache to restart. This forces it to reread the configuration file. Any operations in progress are allowed to complete without interruption. To restart Apache, either press Control-Break in the console window you used for starting Apache, or enter

    httpd.exe -k restart

if the server is running as a service.

=⇒ Note for people familiar with the Unix version of Apache: these commands provide a Windows equivalent to kill -TERM pid and kill -USR1 pid. The command line option used, -k, was chosen as a reminder of the kill command used on Unix.

If the Apache console window closes immediately or unexpectedly after startup, open the Command Prompt from the Start Menu --> Programs. Change to the folder to which you installed Apache, type the command httpd.exe, and read the error message. Then change to the logs folder, and review the error.log file for configuration mistakes. Assuming httpd was installed into C:\Program Files\Apache Software Foundation\Apache2.5\, you can do the following:

    c:
    cd "\Program Files\Apache Software Foundation\Apache2.5\bin"
    httpd.exe

Then wait for Apache to stop, or press Control-C. Then enter the following:

    cd ..\logs
    more < error.log

When working with Apache it is important to know how it will find the configuration file.
You can specify a configuration file on the command line in two ways:

• -f specifies an absolute or relative path to a particular configuration file:

    httpd.exe -f "c:\my server files\anotherconfig.conf"

  or

    httpd.exe -f files\anotherconfig.conf

• -n specifies the installed Apache service whose configuration file is to be used:

    httpd.exe -n "MyServiceName"

In both of these cases, the proper ServerRoot should be set in the configuration file.

If you don’t specify a configuration file with -f or -n, Apache will use the file name compiled into the server, such as conf\httpd.conf. This built-in path is relative to the installation directory. You can verify the compiled file name from a value labelled as SERVER_CONFIG_FILE when invoking Apache with the -V switch, like this:

    httpd.exe -V

Apache will then try to determine its ServerRoot by trying the following, in this order:

1. A ServerRoot directive via the -C command line switch.
2. The -d switch on the command line.
3. Current working directory.
4. A registry entry which was created if you did a binary installation.
5. The server root compiled into the server. This is /apache by default; you can verify it by using httpd.exe -V and looking for a value labelled as HTTPD_ROOT.

If you did not do a binary install, Apache will in some scenarios complain about the missing registry key. This warning can be ignored if the server was otherwise able to find its configuration file.

The value of this key is the ServerRoot directory which contains the conf subdirectory. When Apache starts it reads the httpd.conf file from that directory. If this file contains a ServerRoot directive which names a different directory from the one obtained from the registry key above, Apache will forget the registry key and use the directory from the configuration file.
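The lookup order described above can be modeled as a simple first-match search. The following Python sketch is an illustration only and is not taken from the httpd sources: the function name resolve_server_root, its parameters, and the usable predicate (which stands in for whatever test httpd applies to a candidate directory) are all assumptions.

```python
def resolve_server_root(c_directive=None, d_switch=None, cwd=None,
                        registry=None, compiled="/apache",
                        usable=lambda path: True):
    """Return (source, path) for the first usable ServerRoot candidate.

    Candidates are tried in the order given in the text. `usable` is a
    hypothetical check, for example "does path contain a conf
    subdirectory"; by default every candidate is accepted.
    """
    candidates = [
        ("-C ServerRoot directive", c_directive),
        ("-d switch", d_switch),
        ("current working directory", cwd),
        ("registry entry", registry),
        ("compiled-in HTTPD_ROOT", compiled),
    ]
    for source, path in candidates:
        if path is not None and usable(path):
            return source, path
    return None


if __name__ == "__main__":
    # Only a registry entry and the compiled default are available.
    print(resolve_server_root(registry=r"C:\Apache2.5"))
    # A -d switch takes precedence over the working directory.
    print(resolve_server_root(d_switch=r"D:\srv", cwd=r"C:\tmp"))
```

The sketch does not model the final step in the text, where a ServerRoot directive inside httpd.conf replaces the directory obtained from the registry.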
If you copy the Apache directory or configuration files to a new location it is vital that you update the ServerRoot directive in the httpd.conf file to reflect the new location.

Testing the Installation

After starting Apache (either in a console window or as a service) it will be listening on port 80 (unless you changed the Listen directive in the configuration files or installed Apache only for the current user). To connect to the server and access the default page, launch a browser and enter this URL:

    http://localhost/

Apache should respond with a welcome page and you should see "It Works!". If nothing happens or you get an error, look in the error.log file in the logs subdirectory. If your host is not connected to the net, or if you have serious problems with your DNS (Domain Name Service) configuration, you may have to use this URL:

    http://127.0.0.1/

If you happen to be running Apache on an alternate port, you need to explicitly put that in the URL:

    http://127.0.0.1:8080/

Once your basic installation is working, you should configure it properly by editing the files in the conf subdirectory. Again, if you change the configuration of the Windows NT service for Apache, first attempt to start it from the command line to make sure that the service starts with no errors.

Because Apache cannot share the same port with another TCP/IP application, you may need to stop, uninstall or reconfigure certain other services before running Apache. These conflicting services include other WWW servers, some firewall implementations, and even some client applications (such as Skype) which will use port 80 to attempt to bypass firewall issues.
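Since a port conflict is a common reason for a failed start, it can help to check whether something already accepts connections on the intended port before launching Apache. The following Python sketch uses only the standard socket module. The helper names port_in_use and local_url are assumptions for illustration, and a successful connection shows only that some program is listening, not which one.

```python
import socket


def port_in_use(host, port, timeout=1.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.settimeout(timeout)
        # connect_ex returns 0 on success and an error code otherwise.
        return s.connect_ex((host, port)) == 0


def local_url(host="127.0.0.1", port=80):
    """Build the test URL, adding the port only when it is not 80."""
    return f"http://{host}/" if port == 80 else f"http://{host}:{port}/"


if __name__ == "__main__":
    for port in (80, 8080):
        state = "in use" if port_in_use("127.0.0.1", port) else "free"
        print(f"port {port}: {state}, test URL {local_url(port=port)}")
```

Run before starting Apache, a port reported as in use suggests another service has to be stopped or reconfigured, or that the Listen directive should name a different port.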
Configuring Access to Network Resources

Access to files over the network can be specified using two mechanisms provided by Windows:

Mapped drive letters
    e.g., Alias /images/ Z:/
UNC paths
    e.g., Alias /images/ //imagehost/www/images/

Mapped drive letters allow the administrator to maintain the mapping to a specific machine and path outside of the Apache httpd configuration. However, these mappings are associated only with interactive sessions and are not directly available to Apache httpd when it is started as a service. Use only UNC paths for network resources in httpd.conf so that the resources can be accessed consistently regardless of how Apache httpd is started. (Arcane and error prone procedures may work around the restriction on mapped drive letters, but this is not recommended.)

Example DocumentRoot with UNC path

    DocumentRoot "//dochost/www/html/"

Example DocumentRoot with IP address in UNC path

    DocumentRoot "//192.168.1.50/docs/"

Example Alias and corresponding Directory with UNC path

    Alias "/images/" "//imagehost/www/images/"

    <Directory "//imagehost/www/images/">
    #...
    </Directory>

When running Apache httpd as a service, you must create a separate account in order to access network resources, as described above.

Windows Tuning

• If more than a few dozen piped loggers are used on an operating system instance, scaling up the "desktop heap" is often necessary. For more detailed information, refer to the piped logging (p. 56) documentation.

7.3 Compiling Apache for Microsoft Windows

There are many important points to consider before you begin compiling Apache HTTP Server (httpd). See Using Apache HTTP Server on Microsoft Windows (p. 267) before you begin.

httpd can be built on Windows using a cmake-based build system or with Visual Studio project files maintained by httpd developers. The cmake-based build system directly supports more versions of Visual Studio but currently has considerable functional limitations.
Building httpd with the included Visual Studio project files

Requirements

Compiling Apache requires the following environment to be properly installed:

• Disk Space
  Make sure you have at least 200 MB of free disk space available. After installation Apache requires approximately 80 MB of disk space, plus space for log and cache files, which can grow rapidly. The actual disk space requirements will vary considerably based on your chosen configuration and any third-party modules or libraries, especially when OpenSSL is also built. Because many files are text and very easily compressed, NTFS filesystem compression cuts these requirements in half.

• Appropriate Patches
  The httpd binary is built with the help of several patches to third party packages, which ensure the released code is buildable and debuggable. These patches are available and distributed from http://www.apache.org/dist/httpd/binaries/win32/patches_applied/ and are recommended to be applied to obtain identical results as the "official" ASF distributed binaries.

• Microsoft Visual C++ 6.0 (Visual Studio 97) or later.
  Apache can be built using the command line tools, or from within the Visual Studio IDE Workbench. The command line build requires the environment to reflect the PATH, INCLUDE, LIB and other variables that can be configured with the vcvars32.bat script.

  =⇒ You may want the Visual Studio Processor Pack for your older version of Visual Studio, or a full (not Express) version of newer Visual Studio editions, for the ml.exe assembler. This will allow you to build OpenSSL, if desired, using the more efficient assembly code implementation.

  =⇒ Only the Microsoft compiler tool chain is actively supported by the active httpd contributors. Although the project regularly accepts patches to ensure MinGW and other alternative builds work and improve upon them, they are not actively maintained and are often broken in the course of normal development.
• Updated Microsoft Windows Platform SDK, February 2003 or later.
  An appropriate Windows Platform SDK is included by default in the full (not express/lite) versions of Visual C++ 7.1 (Visual Studio 2002) and later; these users can ignore these steps unless explicitly choosing a newer or different version of the Platform SDK.

  To use Visual C++ 6.0 or 7.0 (Studio 2000 .NET), the Platform SDK environment must be prepared using the setenv.bat script (installed by the Platform SDK) before starting the command line build or launching the msdev/devenv GUI environment. Installing the Platform SDK for Visual Studio Express versions (2003 and later) should adjust the default environment appropriately.

    "c:\Program Files\Microsoft Visual Studio\VC98\Bin\VCVARS32"
    "c:\Program Files\Platform SDK\setenv.bat"

• Perl and awk
  Several steps recommended here require a perl interpreter during the build preparation process, but it is otherwise not required.

  To install Apache within the build system, several files are modified using the awk.exe utility. awk was chosen since it is a very small download (compared with Perl or WSH/VB) and accomplishes the task of modifying configuration files upon installation. Brian Kernighan’s http://www.cs.princeton.edu/~bwk/btl.mirror/ site has a compiled native Win32 binary, http://www.cs.princeton.edu/~bwk/btl.mirror/awk95.exe which you must save with the name awk.exe (rather than awk95.exe).

  =⇒ If awk.exe is not found, Makefile.win’s install target will not perform substitutions in the installed .conf files. You must manually modify the installed .conf files to allow the server to start. Search and replace all "@token@" tags as appropriate.

  =⇒ The Visual Studio IDE will only find awk.exe from the PATH, or executable path specified in the menu option Tools -> Options -> (Projects ->) Directories. Ensure awk.exe is in your system path.
  =⇒ Also note that if you are using Cygwin tools (http://www.cygwin.com/) the awk utility is named gawk.exe and that the file awk.exe is really a symlink to the gawk.exe file. The Windows command shell does not recognize symlinks, and because of this building InstallBin will fail. A workaround is to delete awk.exe from the cygwin installation and copy gawk.exe to awk.exe. Also note the cygwin/mingw ports of gawk 3.0.x were buggy; please upgrade to 3.1.x before attempting to use any gawk port.

• [Optional] zlib library (for mod_deflate)
  Zlib must be installed into a srclib subdirectory named zlib. This must be built in-place. Zlib can be obtained from http://www.zlib.net/. mod_deflate is confirmed to work correctly with version 1.2.3.

    nmake -f win32\Makefile.msc
    nmake -f win32\Makefile.msc test

• [Optional] OpenSSL libraries (for mod_ssl and ab.exe with ssl support)

  =⇒ The OpenSSL library is cryptographic software. The country in which you currently reside may have restrictions on the import, possession, use, and/or re-export to another country, of encryption software. BEFORE using any encryption software, please check your country’s laws, regulations and policies concerning the import, possession, or use, and re-export of encryption software, to see if this is permitted. See http://www.wassenaar.org/ for more information.

  Configuring and building OpenSSL requires perl to be installed.

  OpenSSL must be installed into a srclib subdirectory named openssl, obtained from http://www.openssl.org/source/, in order to compile mod_ssl or the abs.exe project, which is ab.c with SSL support enabled.
  To prepare OpenSSL to be linked to Apache mod_ssl or abs.exe, and disable patent encumbered features in OpenSSL, you might use the following build commands:

    perl Configure no-rc5 no-idea enable-mdc2 enable-zlib VC-WIN32 -Ipath/to/srclib/zlib -Lpath/to/srclib/zlib
    ms\do_masm.bat
    nmake -f ms\ntdll.mak

  =⇒ It is not advisable to use zlib-dynamic, as that transfers the cost of deflating SSL streams to the first request which must load the zlib dll. Note the suggested patch enables the -L flag to work with windows builds, corrects the name of zdll.lib and ensures .pdb files are generated for troubleshooting. If the assembler is not installed, you would add no-asm above and use ms\do_ms.bat instead of the ms\do_masm.bat script.

• [Optional] Database libraries (for mod_dbd and mod_authn_dbm)
  The apr-util library exposes dbm (keyed database) and dbd (query oriented database) client functionality to the httpd server and its modules, such as authentication and authorization. The sdbm dbm and odbc dbd providers are compiled unconditionally.

  The dbd support includes the Oracle instantclient package, MySQL, PostgreSQL and sqlite. To build these all, for example, set up the LIB to include the library path, INCLUDE to include the headers path, and PATH to include the dll bin path of all four SDK’s, and set the DBD_LIST environment variable to inform the build which client driver SDKs are installed correctly, e.g.:

    set DBD_LIST=sqlite3 pgsql oracle mysql

  Similarly, the dbm support can be extended with DBM_LIST to build a Berkeley DB provider (db) and/or gdbm provider, by similarly configuring LIB, INCLUDE and PATH first to ensure the client library libs and headers are available.

    set DBM_LIST=db gdbm

  =⇒ Depending on the choice of database distributions, it may be necessary to change the actual link target name (e.g. gdbm.lib vs.
libgdb.lib) that are listed in the corresponding .dsp/.mak files within the directories srclib\apr-util\dbd or ...\dbm.

  See the README-win32.txt file for more hints on obtaining the various database driver SDKs.

Building from Unix sources

The policy of the Apache HTTP Server project is to only release Unix sources. Windows source packages made available for download have been supplied by volunteers and may not be available for every release. You can still build the server on Windows from the Unix source tarball with just a few additional steps.

1. Download and unpack the Unix source tarball for the latest version.
2. Download and unpack the Unix source tarballs for the latest versions of APR, APR-Util and APR-Iconv, and place these sources in the directories httpd-2.x.x\srclib\apr, httpd-2.x.x\srclib\apr-util and httpd-2.x.x\srclib\apr-iconv.
3. Open a Command Prompt and CD to the httpd-2.x.x folder.
4. Run the line endings conversion utility at the prompt:

    perl srclib\apr\build\lineends.pl

You can now build the server with the Visual Studio 6.0 development environment using the IDE. Command-Line builds of the server are not possible from Unix sources unless you export .mak files as explained below.

Command-Line Build

Makefile.win is the top level Apache makefile. To compile Apache on Windows, simply use one of the following commands to build the release or debug flavor:

    nmake /f Makefile.win apacher
    nmake /f Makefile.win apached

Either command will compile Apache. The latter will disable optimization of the resulting files, making it easier to single step the code to find bugs and track down problems.

You can add your apr-util dbd and dbm provider choices with the additional make (environment) variables DBD_LIST and DBM_LIST; see the comments about [Optional] Database libraries, above. Review the initial comments in Makefile.win for additional options that can be provided when invoking the build.
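When the command-line build is driven from a script, the target name and the optional provider lists can be assembled in one place. The following Python sketch only constructs the nmake command and the extra environment variables described above; it does not run anything. The function name nmake_build and its parameters are assumptions for illustration and are not part of the httpd build system.

```python
def nmake_build(flavor="release", dbd_list=(), dbm_list=()):
    """Return (command, extra_env) for a command-line build.

    flavor   -- "release" selects the apacher target, "debug" apached
    dbd_list -- dbd driver names to pass through DBD_LIST
    dbm_list -- dbm provider names to pass through DBM_LIST
    """
    targets = {"release": "apacher", "debug": "apached"}
    command = ["nmake", "/f", "Makefile.win", targets[flavor]]
    extra_env = {}
    if dbd_list:
        extra_env["DBD_LIST"] = " ".join(dbd_list)
    if dbm_list:
        extra_env["DBM_LIST"] = " ".join(dbm_list)
    return command, extra_env


if __name__ == "__main__":
    cmd, env = nmake_build("debug", dbd_list=["sqlite3", "pgsql"],
                           dbm_list=["db"])
    print(" ".join(cmd))
    print(env)
```

The result could be handed to subprocess.run(cmd, env={**os.environ, **env}) from a prompt where vcvars32.bat has already prepared PATH, INCLUDE and LIB; that step is left out here because it depends on the local Visual Studio installation.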
Developer Studio Workspace IDE Build

Apache can also be compiled using VC++’s Visual Studio development environment. To simplify this process, a Visual Studio workspace, Apache.dsw, is provided. This workspace exposes the entire list of working .dsp projects that are required for the complete Apache binary release. It includes dependencies between the projects to assure that they are built in the appropriate order.

Open the Apache.dsw workspace, and select InstallBin (Release or Debug build, as desired) as the Active Project. InstallBin causes all related projects to be built, and then invokes Makefile.win to move the compiled executables and dlls. You may personalize the INSTDIR= choice by changing InstallBin’s Settings, General tab, Build command line entry. INSTDIR defaults to the /Apache2 directory. If you only want a test compile (without installing) you may build the BuildBin project instead.

The .dsp project files are distributed in Visual Studio 6.0 (98) format. Visual C++ 5.0 (97) will recognize them. Visual Studio 2002 (.NET) and later users must convert Apache.dsw plus the .dsp files into an Apache.sln plus .vcproj files. Be sure you reconvert the .vcproj file again if its source .dsp file changes! This is really trivial: just open Apache.dsw in the VC++ 7.0 IDE once again and reconvert.

=⇒ There is a flaw in the .vcproj conversion of .dsp files. devenv.exe will mis-parse the /D flag for RC flags containing long quoted /D’efines which contain spaces. The command:

    perl srclib\apr\build\cvtdsp.pl -2005

will convert the /D flags for RC flags to use an alternate, parseable syntax; unfortunately this syntax isn’t supported by Visual Studio 97 or its exported .mak files. These /D flags are used to pass the long description of the mod_apachemodule.so files to the shared .rc resource version-identifier build.
Visual Studio 2002 (.NET) and later users should also use the Build menu, Configuration Manager dialog to uncheck both the Debug and Release Solution modules abs, mod_deflate and mod_ssl components, as well as every component starting with apr_db*. These modules are built by invoking nmake, or the IDE directly with the BinBuild target, which builds those modules conditionally if the srclib directories openssl and/or zlib exist, and based on the setting of DBD_LIST and DBM_LIST environment variables.

Exporting command-line .mak files

Exported .mak files pose a greater hassle, but they are required for Visual C++ 5.0 users to build mod_ssl, abs (ab with SSL support) and/or mod_deflate. The .mak files also support a broader range of C++ tool chain distributions, such as Visual Studio Express.

You must first build all projects in order to create all dynamic auto-generated targets, so that dependencies can be parsed correctly. Build the entire project from within the Visual Studio 6.0 (98) IDE, using the BuildAll target, then use the Project Menu Export for all makefiles (checking on "with dependencies"). Run the following command to correct absolute paths into relative paths so they will build anywhere:

    perl srclib\apr\build\fixwin32mak.pl

You must type this command from the top level directory of the httpd source tree. Every .mak and .dep project file within the current directory and below will be corrected, and the timestamps adjusted to reflect the .dsp.

Always review the generated .mak and .dep files for Platform SDK or other local, machine specific file paths. The DevStudio\Common\MSDev98\bin\ (VC6) directory contains a sysincl.dat file, which lists all exceptions. Update this file (including both forward and backslashed paths, such as both sys/time.h and sys\time.h) to ignore such newer dependencies. Including local-install paths in a distributed .mak file will cause the build to fail completely.
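The idea behind correcting absolute paths into relative ones, and behind reviewing what remains machine specific, can be shown with the standard ntpath module. This Python sketch illustrates the concept only; it does not reproduce what fixwin32mak.pl actually does. The function names relativize and outside_tree and the example directories are assumptions.

```python
import ntpath  # Windows path semantics on any platform


def relativize(abs_path, project_dir):
    """Rewrite abs_path relative to project_dir when possible.

    Paths on a different drive cannot be made relative and are
    returned unchanged.
    """
    try:
        return ntpath.relpath(abs_path, project_dir)
    except ValueError:
        return abs_path


def outside_tree(path, tree_root):
    """Return True if path lies outside the source tree.

    Such paths (a Platform SDK header, for instance) are the local,
    machine specific entries the text says to review.
    """
    try:
        rel = ntpath.relpath(path, tree_root)
    except ValueError:
        return True  # different drive
    return rel.split("\\")[0] == ".."


if __name__ == "__main__":
    tree = r"C:\src\httpd"  # hypothetical source tree location
    print(relativize(r"C:\src\httpd\srclib\apr\include\apr.h",
                     r"C:\src\httpd\modules\ssl"))
    print(outside_tree(r"C:\Program Files\Platform SDK\include\windows.h",
                       tree))
```

A relative path that stays inside the tree builds anywhere the tree is unpacked, while a path flagged by outside_tree is a candidate for the sysincl.dat exception list.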
If you contribute back a patch that revises project files, we must commit project files in Visual Studio 6.0 format. Changes should be simple, with minimal compilation and linkage flags that can be recognized by all Visual Studio environments.

Installation

Once Apache has been compiled, it needs to be installed in its server root directory. The default is the \Apache2 directory, of the same drive.

To build and install all the files into the desired folder dir automatically, use one of the following nmake commands:

    nmake /f Makefile.win installr INSTDIR=dir
    nmake /f Makefile.win installd INSTDIR=dir

The dir argument to INSTDIR provides the installation directory; it can be omitted if Apache is to be installed into \Apache22 (of the current drive).

Warning about building Apache from the development tree

=⇒ Note only the .dsp files are maintained between release builds. The .mak files are NOT regenerated, due to the tremendous waste of reviewer’s time. Therefore, you cannot rely on the NMAKE commands above to build revised .dsp project files unless you then export all .mak files yourself from the project. This is unnecessary if you build from within the Microsoft Developer Studio environment.

Building httpd with cmake

The primary documentation for this build mechanism is in the README.cmake file in the source distribution. Refer to that file for detailed instructions.

Building httpd with cmake requires building APR and APR-util separately. Refer to their README.cmake files for instructions.

The primary limitations of the cmake-based build are inherited from the APR-util project, and are listed below because of their impact on httpd:

• No cmake build for the APR-iconv subproject is available, and the APR-util cmake build cannot consume an existing APR-iconv build. Thus, mod_charset_lite and possibly some third-party modules cannot be used.
• The cmake build for the APR-util subproject does not support most of the optional DBM and DBD libraries supported by the included Visual Studio project files. This limits the database backends supported by a number of bundled and third-party modules.

7.4 Using Apache With RPM Based Systems (Redhat / CentOS / Fedora)

While many distributions make Apache httpd available as operating system supported packages, it can sometimes be desirable to install and use the canonical version of Apache httpd on these systems, replacing the natively provided versions of the packages.

While the Apache httpd project does not currently create binary RPMs for the various distributions out there, it is easy to build your own binary RPMs from the canonical Apache httpd tarball.

This document explains how to build, install, configure and run Apache httpd 2.4 under Unix systems supporting the RPM packaging format.

Creating a Source RPM

The Apache httpd source tarball can be converted into an SRPM as follows:

    rpmbuild -ts httpd-2.4.x.tar.bz2

Building RPMs

RPMs can be built directly from the Apache httpd source tarballs using the following command:

    rpmbuild -tb httpd-2.4.x.tar.bz2

Corresponding "-devel" packages must be installed on your build system prior to building the RPMs. The rpmbuild command will automatically calculate what RPMs are required and will list any dependencies that are missing on your system. These "-devel" packages will not be required after the build is completed, and can be safely removed.

If successful, the following RPMs will be created:

httpd-2.4.x-1.i686.rpm
    The core server and basic module set.
httpd-debuginfo-2.4.x-1.i686.rpm
    Debugging symbols for the server and all modules.
httpd-devel-2.4.x-1.i686.rpm
    Headers and development files for the server.
httpd-manual-2.4.x-1.i686.rpm
    The webserver manual.
httpd-tools-2.4.x-1.i686.rpm
    Supporting tools for the webserver.
mod_authnz_ldap-2.4.x-1.i686.rpm
    mod_ldap and mod_authnz_ldap, with corresponding dependency on openldap.
mod_lua-2.4.x-1.i686.rpm
    mod_lua module, with corresponding dependency on lua.
mod_proxy_html-2.4.x-1.i686.rpm
    mod_proxy_html module, with corresponding dependency on libxml2.
mod_socache_dc-2.4.x-1.i686.rpm
    mod_socache_dc module, with corresponding dependency on distcache.
mod_ssl-2.4.x-1.i686.rpm
    mod_ssl module, with corresponding dependency on openssl.

Installing the Server

The httpd RPM is the only RPM necessary to get a basic server to run. Install it as follows:

    rpm -U httpd-2.4.x-1.i686.rpm

Self contained modules are included with the server. Modules that depend on external libraries are provided as separate RPMs to install if needed.

Configuring the Default Instance of Apache httpd

The default configuration for the server is installed by default beneath the /etc/httpd directory, with logs written by default to /var/log/httpd. The environment for the webserver is set by default within the optional /etc/sysconfig/httpd file.

Start the server as follows:

    service httpd restart

Configuring Additional Instances of Apache httpd on the Same Machine

It is possible to configure additional instances of the Apache httpd server running independently alongside each other on the same machine. These instances can have independent configurations, and can potentially run as separate users if so configured.

This was done by making the httpd startup script aware of its own name. This name is then used to find the environment file for the server, and in turn, the server root of the server instance.
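The name-based lookup just described can be modeled in a few lines. The real startup script is a shell script, so the following Python sketch is only a model of the mechanism: the function names environment_file and server_root are assumptions, and the parsing of the OPTIONS value (which an instance's environment file sets) covers only the -d switch.

```python
import posixpath
import shlex


def environment_file(script_path, sysconfig_dir="/etc/sysconfig"):
    """Map the name a startup script was invoked under to its
    environment file, e.g. httpd-additional -> /etc/sysconfig/httpd-additional.
    """
    return posixpath.join(sysconfig_dir, posixpath.basename(script_path))


def server_root(options):
    """Extract the value of the -d switch from an OPTIONS string,
    or return None when no -d switch is present.
    """
    args = shlex.split(options)
    for flag, value in zip(args, args[1:]):
        if flag == "-d":
            return value
    return None


if __name__ == "__main__":
    print(environment_file("/etc/rc.d/init.d/httpd-additional"))
    print(server_root("-d /etc/httpd-additional -f conf/httpd-additional.conf"))
```

Because the lookup uses the name the script was invoked under, a symbolic link with a new name is enough to give an instance its own environment file and, through it, its own server root.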
To create an additional instance called httpd-additional, follow these steps:

• Create a symbolic link to the startup script for the additional server:

    ln -s /etc/rc.d/init.d/httpd /etc/rc.d/init.d/httpd-additional
    chkconfig --add httpd-additional

• Create an environment file for the server, using the /etc/sysconfig/httpd file as a template:

    # template from httpd
    cp /etc/sysconfig/httpd /etc/sysconfig/httpd-additional

    # blank template
    touch /etc/sysconfig/httpd-additional

  Edit /etc/sysconfig/httpd-additional and pass the server root of the new server instance within the OPTIONS environment variable.

    OPTIONS="-d /etc/httpd-additional -f conf/httpd-additional.conf"

• Edit the server configuration file /etc/httpd-additional/conf/httpd-additional.conf to ensure the correct ports and paths are configured.

• Start the server as follows:

    service httpd-additional restart

• Repeat this process as required for each server instance.

7.5 Using Apache With Novell NetWare

This document explains how to install, configure and run Apache 2.0 under Novell NetWare 6.0 and above. If you find any bugs, or wish to contribute in other ways, please use our bug reporting page [7].

The bug reporting page and dev-httpd mailing list are not provided to answer questions about configuration or running Apache. Before you submit a bug report or request, first consult this document, the Frequently Asked Questions [8] page and the other relevant documentation topics. If you still have a question or problem, post it to the novell.devsup.webserver [9] newsgroup, where many Apache users are more than willing to answer new and obscure questions about using Apache on NetWare.

Most of this document assumes that you are installing Apache from a binary distribution.
If you want to compile Apache yourself (possibly to help with development, or to track down bugs), see the section on Compiling Apache for NetWare below.

Requirements

Apache 2.0 is designed to run on NetWare 6.0 service pack 3 and above. If you are running a service pack less than SP3, you must install the latest NetWare Libraries for C (LibC) [10].

NetWare service packs are available here [11].

Apache 2.0 for NetWare can also be run in a NetWare 5.1 environment as long as the latest service pack or the latest version of the NetWare Libraries for C (LibC) [12] has been installed. WARNING: Apache 2.0 for NetWare has not been targeted for or tested in this environment.

Downloading Apache for NetWare

Information on the latest version of Apache can be found on the Apache web server at http://www.apache.org/. This will list the current release, any more recent alpha or beta-test releases, together with details of mirror web and anonymous ftp sites. Binary builds of the latest releases of Apache 2.0 for NetWare can be downloaded from here [13].

Installing Apache for NetWare

There is no Apache install program for NetWare currently. If you are building Apache 2.0 for NetWare from source, you will need to copy the files over to the server manually.
Follow these steps to install Apache on NetWare from the binary download (assuming you will install to sys:/apache2):

• Unzip the binary download file to the root of the SYS: volume (may be installed to any volume)
• Edit the httpd.conf file setting ServerRoot and ServerName along with any file path values to reflect your correct server settings
• Add SYS:/APACHE2 to the search path, for example:

    SEARCH ADD SYS:\APACHE2

[7] http://httpd.apache.org/bug_report.html
[8] http://wiki.apache.org/httpd/FAQ
[9] news://developer-forums.novell.com/novell.devsup.webserver
[10] http://developer.novell.com/ndk/libc.htm
[11] http://support.novell.com/misc/patlst.htm#nw
[12] http://developer.novell.com/ndk/libc.htm
[13] http://www.apache.org/dist/httpd/binaries/netware

Follow these steps to install Apache on NetWare manually from your own build source (assuming you will install to sys:/apache2):

• Create a directory called Apache2 on a NetWare volume
• Copy APACHE2.NLM, APRLIB.NLM to SYS:/APACHE2
• Create a directory under SYS:/APACHE2 called BIN
• Copy HTDIGEST.NLM, HTPASSWD.NLM, HTDBM.NLM, LOGRES.NLM, ROTLOGS.NLM to SYS:/APACHE2/BIN
• Create a directory under SYS:/APACHE2 called CONF
• Copy the HTTPD-STD.CONF file to the SYS:/APACHE2/CONF directory and rename to HTTPD.CONF
• Copy the MIME.TYPES, CHARSET.CONV and MAGIC files to SYS:/APACHE2/CONF directory
• Copy all files and subdirectories in \HTTPD-2.0\DOCS\ICONS to SYS:/APACHE2/ICONS
• Copy all files and subdirectories in \HTTPD-2.0\DOCS\MANUAL to SYS:/APACHE2/MANUAL
• Copy all files and subdirectories in \HTTPD-2.0\DOCS\ERROR to SYS:/APACHE2/ERROR
• Copy all files and subdirectories in \HTTPD-2.0\DOCS\DOCROOT to SYS:/APACHE2/HTDOCS
• Create the directory SYS:/APACHE2/LOGS on the server
• Create the directory SYS:/APACHE2/CGI-BIN on the server
• Create the directory SYS:/APACHE2/MODULES and copy all nlm modules into the modules directory
• Edit the HTTPD.CONF file searching for all @@Value@@
markers and replacing them with the appropriate setting
• Add SYS:/APACHE2 to the search path, for example:

    SEARCH ADD SYS:\APACHE2

Apache may be installed to other volumes besides the default SYS volume.

During the build process, adding the keyword "install" to the makefile command line will automatically produce a complete distribution package under the subdirectory DIST. Install Apache by simply copying the distribution that was produced by the makefiles to the root of a NetWare volume (see: Compiling Apache for NetWare below).

Running Apache for NetWare

To start Apache just type apache at the console. This will load apache in the OS address space. If you prefer to load Apache in a protected address space you may specify the address space with the load statement as follows:

    load address space = apache2 apache2

This will load Apache into an address space called apache2. Running multiple instances of Apache concurrently on NetWare is possible by loading each instance into its own protected address space.

After starting Apache, it will be listening to port 80 (unless you changed the Listen directive in the configuration files). To connect to the server and access the default page, launch a browser and enter the server’s name or address. This should respond with a welcome page, and a link to the Apache manual. If nothing happens or you get an error, look in the error log file in the logs directory.

Once your basic installation is working, you should configure it properly by editing the files in the conf directory.

To unload Apache running in the OS address space just type the following at the console:

    unload apache2

or

    apache2 shutdown

If apache is running in a protected address space specify the address space in the unload statement:

    unload address space = apache2 apache2

When working with Apache it is important to know how it will find the configuration files.
You can specify a configuration file on the command line with the -f switch:

• -f specifies a path to a particular configuration file

    apache2 -f "vol:/my server/conf/my.conf"
    apache -f test/test.conf

In these cases, the proper ServerRoot should be set in the configuration file.

If you don't specify a configuration file name with -f, Apache will use the file name compiled into the server, usually conf/httpd.conf. Invoking Apache with the -V switch will display this value labeled as SERVER_CONFIG_FILE. Apache will then determine its ServerRoot by trying the following, in this order:

• A ServerRoot directive via a -C switch.
• The -d switch on the command line.
• Current working directory
• The server root compiled into the server.

The server root compiled into the server is usually sys:/apache2. Invoking apache with the -V switch will display this value labeled as HTTPD_ROOT.

Apache 2.0 for NetWare includes a set of command line directives that can be used to modify or display information about the running instance of the web server. These directives are only available while Apache is running. Each of these directives must be preceded by the keyword APACHE2.

RESTART
    Instructs Apache to terminate all running worker threads as they become idle, reread the configuration file and restart each worker thread based on the new configuration.
VERSION
    Displays version information about the currently running instance of Apache.
MODULES
    Displays a list of loaded modules both built-in and external.
DIRECTIVES
    Displays a list of all available directives.
SETTINGS
    Enables or disables the thread status display on the console. When enabled, the state of each running thread is displayed on the Apache console screen.
SHUTDOWN
    Terminates the running instance of the Apache web server.
HELP
    Describes each of the runtime directives.

By default these directives are issued against the instance of Apache running in the OS address space. To issue a directive against a specific instance running in a protected address space, include the -p parameter along with the name of the address space. For more information type "apache2 Help" on the command line.

Configuring Apache for NetWare

Apache is configured by reading configuration files usually stored in the conf directory. These are the same as files used to configure the Unix version, but there are a few different directives for Apache on NetWare. See the Apache module documentation (p. 1101) for all the available directives.

The main differences in Apache for NetWare are:

• Because Apache for NetWare is multithreaded, it does not use a separate process for each request, as Apache does on some Unix implementations. Instead there are only threads running: a parent thread, and multiple child or worker threads which handle the requests.

  Therefore the "process"-management directives are different:

  MaxConnectionsPerChild - Like the Unix directive, this controls how many connections a worker thread will serve before exiting. The recommended default, MaxConnectionsPerChild 0, causes the thread to continue servicing requests indefinitely. It is recommended on NetWare, unless there is some specific reason, that this directive always remain set to 0.

  StartThreads - This directive tells the server how many threads it should start initially. The recommended default is StartThreads 50.

  MinSpareThreads - This directive instructs the server to spawn additional worker threads if the number of idle threads ever falls below this value. The recommended default is MinSpareThreads 10.

  MaxSpareThreads - This directive instructs the server to begin terminating worker threads if the number of idle threads ever exceeds this value. The recommended default is MaxSpareThreads 100.

  MaxThreads - This directive limits the total number of worker threads to a maximum value. The recommended default is MaxThreads 250.
  ThreadStackSize - This directive tells the server what size of stack to use for the individual worker thread. The recommended default is ThreadStackSize 65536.

• The directives that accept filenames as arguments must use NetWare filenames instead of Unix names. However, because Apache uses Unix-style names internally, forward slashes must be used rather than backslashes. It is recommended that all rooted file paths begin with a volume name. If omitted, Apache will assume the SYS: volume which may not be correct.

• Apache for NetWare has the ability to load modules at runtime, without recompiling the server. If Apache is compiled normally, it will install a number of optional modules in the \Apache2\modules directory. To activate these, or other modules, the LoadModule directive must be used. For example, to activate the status module, use the following:

    LoadModule status_module modules/status.nlm

  Information on creating loadable modules (p. 908) is also available.

Additional NetWare specific directives:

• CGIMapExtension - This directive maps a CGI file extension to a script interpreter.
• SecureListen - Enables SSL encryption for a specified port.
• NWSSLTrustedCerts - Adds trusted certificates that are used to create secure connections to proxied servers.
• NWSSLUpgradeable - Allow a connection created on the specified address/port to be upgraded to an SSL connection.

Compiling Apache for NetWare

Compiling Apache requires MetroWerks CodeWarrior 6.x or higher. Once Apache has been built, it can be installed to the root of any NetWare volume. The default is the sys:/Apache2 directory.

Before running the server you must fill out the conf directory. Copy the file HTTPD-STD.CONF from the distribution conf directory and rename it to HTTPD.CONF. Edit the HTTPD.CONF file searching for all @@Value@@ markers and replacing them with the appropriate setting. Copy over the conf/magic and conf/mime.types files as well.
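The marker substitution described above can be scripted rather than done by hand. The following is only a minimal sketch in Python: the marker names ServerRoot and Port and the sample template are hypothetical stand-ins, so check your own HTTPD-STD.CONF for the markers it actually contains.

```python
import re

def fill_markers(template: str, values: dict) -> str:
    """Replace every @@Name@@ marker in template with values["Name"].

    Raises KeyError for a marker that has no entry in values, so that no
    placeholder is silently left behind in the generated file.
    """
    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError("no value supplied for marker @@%s@@" % name)
        return values[name]

    return re.sub(r"@@(\w+)@@", substitute, template)

# Hypothetical template fragment and settings, for illustration only.
sample = 'ServerRoot "@@ServerRoot@@"\nListen @@Port@@\n'
settings = {"ServerRoot": "sys:/apache2", "Port": "80"}
filled = fill_markers(sample, settings)
print(filled)
```

Failing loudly on an unknown marker is a deliberate choice here: a leftover @@Value@@ string in HTTPD.CONF would otherwise only surface when the server tries to start.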
Alternatively, a complete distribution can be built by including the keyword install when invoking the makefiles.

Requirements:

The following development tools are required to build Apache 2.0 for NetWare:

• Metrowerks CodeWarrior 6.0 or higher with the NetWare PDK 3.0 [14] or higher.
• NetWare Libraries for C (LibC) [15]
• LDAP Libraries for C [16]
• ZLIB Compression Library source code [17]
• AWK utility (awk, gawk or similar). AWK can be downloaded from http://developer.novell.com/ndk/apache.htm. The utility must be found in your windows path and must be named awk.exe.
• To build using the makefiles, you will need GNU make version 3.78.1 (GMake) available at http://developer.novell.com/ndk/apache.htm.

[14] http://developer.novell.com/ndk/cwpdk.htm
[15] http://developer.novell.com/ndk/libc.htm
[16] http://developer.novell.com/ndk/cldap.htm
[17] http://www.gzip.org/zlib/

Building Apache using the NetWare makefiles:

• Set the environment variable NOVELLLIBC to the location of the NetWare Libraries for C SDK, for example:
    Set NOVELLLIBC=c:\novell\ndk\libc
• Set the environment variable METROWERKS to the location where you installed the Metrowerks CodeWarrior compiler, for example:
    Set METROWERKS=C:\Program Files\Metrowerks\CodeWarrior
  If you installed to the default location C:\Program Files\Metrowerks\CodeWarrior, you don't need to set this.
• Set the environment variable LDAPSDK to the location where you installed the LDAP Libraries for C, for example:
    Set LDAPSDK=c:\Novell\NDK\cldapsdk\NetWare\libc
• Set the environment variable ZLIBSDK to the location where you installed the source code for the ZLib Library, for example:
    Set ZLIBSDK=D:\NOVELL\zlib
• Set the environment variable PCRESDK to the location where you installed the source code for the PCRE Library, for example:
    Set PCRESDK=D:\NOVELL\pcre
• Set the environment variable AP_WORK to the full path of the httpd source code directory.
    Set AP_WORK=D:\httpd-2.0.x
• Set the environment variable APR_WORK to the full path of the apr source code directory. Typically \httpd\srclib\apr but the APR project can be outside of the httpd directory structure.
    Set APR_WORK=D:\apr-1.x.x
• Set the environment variable APU_WORK to the full path of the apr-util source code directory. Typically \httpd\srclib\apr-util but the APR-UTIL project can be outside of the httpd directory structure.
    Set APU_WORK=D:\apr-util-1.x.x
• Make sure that the paths to the AWK utility and the GNU make utility (gmake.exe) have been included in the system's PATH environment variable.
• Download the source code and unzip to an appropriate directory on your workstation.
• Change directory to \httpd-2.0 and build the prebuild utilities by running "gmake -f nwgnumakefile prebuild". This target will create the directory \httpd-2.0\nwprebuild and copy to it each of the utilities needed to complete the following build steps.
• Copy the files \httpd-2.0\nwprebuild\GENCHARS.nlm and \httpd-2.0\nwprebuild\DFTABLES.nlm to the SYS: volume of a NetWare server and run them using the following commands:
    SYS:\genchars > sys:\test_char.h
    SYS:\dftables sys:\chartables.c
• Copy the files test_char.h and chartables.c to the directory \httpd-2.0\os\netware on the build machine.
• Change directory to \httpd-2.0 and build Apache by running "gmake -f nwgnumakefile". You can create a distribution directory by adding an install parameter to the command, for example:
    gmake -f nwgnumakefile install

Additional make options

• gmake -f nwgnumakefile
    Builds release versions of all of the binaries and copies them to a \release destination directory.
• gmake -f nwgnumakefile DEBUG=1
    Builds debug versions of all of the binaries and copies them to a \debug destination directory.
• gmake -f nwgnumakefile install
    Creates a complete Apache distribution with binaries, docs and additional support files in a \dist\Apache2 directory.
• gmake -f nwgnumakefile prebuild
    Builds all of the prebuild utilities and copies them to the \nwprebuild directory.
• gmake -f nwgnumakefile installdev
    Same as install but also creates a \lib and \include directory in the destination directory and copies headers and import files.
• gmake -f nwgnumakefile clean
    Cleans all object files and binaries from the \release.o or \debug.o build areas depending on whether DEBUG has been defined.
• gmake -f nwgnumakefile clobber_all
    Same as clean and also deletes the distribution directory if it exists.

Additional environment variable options

• To build all of the experimental modules, set the environment variable EXPERIMENTAL:
    Set EXPERIMENTAL=1
• To build Apache using standard BSD style sockets rather than Winsock, set the environment variable USE_STDSOCKETS:
    Set USE_STDSOCKETS=1

Building mod_ssl for the NetWare platform

By default Apache for NetWare uses the built-in module mod_nw_ssl to provide SSL services. This module simply enables the native SSL services implemented in NetWare OS to handle all encryption for a given port. Alternatively, mod_ssl can also be used in the same manner as on other platforms.

Before mod_ssl can be built for the NetWare platform, the OpenSSL libraries must be provided. This can be done through the following steps:

• Download the recent OpenSSL 0.9.8 release source code from the OpenSSL Source [18] page (older 0.9.7 versions need to be patched and are therefore not recommended).
• Edit the file NetWare/set_env.bat and modify any tools and utilities paths so that they correspond to your build environment.
• From the root of the OpenSSL source directory, run the following scripts:
    Netware\set_env netware-libc
    Netware\build netware-libc
  For performance reasons you should build with ASM code enabled. Download NASM from the SF site [19]. Then configure OpenSSL to use ASM code:
    Netware\build netware-libc nw-nasm enable-mdc2 enable-md5
  Warning: don't use the CodeWarrior Assembler - it produces broken code!
• Before building Apache, set the environment variable OSSLSDK to the full path to the root of the openssl source code directory, and set WITH_MOD_SSL to 1.
    Set OSSLSDK=d:\openssl-0.9.8x
    Set WITH_MOD_SSL=1

[18] http://www.openssl.org/source/
[19] http://nasm.sourceforge.net/

7.6 Running a High-Performance Web Server on HPUX

Date: Wed, 05 Nov 1997 16:59:34 -0800
From: Rick Jones
Reply-To: raj@cup.hp.com
Organization: Network Performance
Subject: HP-UX tuning tips

Here are some tuning tips for HP-UX to add to the tuning page.

For HP-UX 9.X: Upgrade to 10.20
For HP-UX 10.[00|01|10]: Upgrade to 10.20

For HP-UX 10.20:

Install the latest cumulative ARPA Transport Patch. This will allow you to configure the size of the TCP connection lookup hash table. The default is 256 buckets and must be set to a power of two. This is accomplished with adb against the *disc* image of the kernel. The variable name is tcp_hash_size. Notice that it's critically important that you use "W" to write a 32 bit quantity, not "w" to write a 16 bit value when patching the disc image because the tcp_hash_size variable is a 32 bit quantity.

How to pick the value? Examine the output of ftp://ftp.cup.hp.com/dist/networking/tools/connhist and see how many total TCP connections exist on the system. You probably want that number divided by the hash table size to be reasonably small, say less than 10. Folks can look at HP's SPECweb96 disclosures for some common settings. These can be found at http://www.specbench.org/. If an HP-UX system was performing at 1000 SPECweb96 connections per second, the TIME_WAIT time of 60 seconds would mean 60,000 TCP "connections" being tracked.
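As a rough sketch of the sizing rule above: take the number of tracked connections, require fewer than 10 per bucket on average, and round the bucket count up to a power of two. The function names and the strict "fewer than 10" reading are assumptions made for illustration; the result is only a starting point for tcp_hash_size.

```python
def next_power_of_two(n: int) -> int:
    """Return the smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()

def suggest_tcp_hash_size(connections: int, max_per_bucket: int = 10) -> int:
    """Suggest a hash table size so that connections / size < max_per_bucket.

    The size is rounded up to a power of two, as the text requires.
    """
    # Smallest bucket count that keeps the average chain strictly below the limit.
    min_buckets = connections // max_per_bucket + 1
    return next_power_of_two(min_buckets)

# 1000 connections/second with a 60 second TIME_WAIT, as in the text.
tracked = 1000 * 60
size = suggest_tcp_hash_size(tracked)
print(size, tracked / size)   # chosen size and the resulting average chain length
print(tracked / 256)          # average chain length with the default 256 buckets
```

For the 60,000-connection case this yields 8192 buckets, about 7 connections per bucket on average, compared with over 200 per bucket at the default size of 256.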
Folks can check their listen queue depths with ftp://ftp.cup.hp.com/dist/networking/misc/listenq.

If folks are running Apache on a PA-8000 based system, they should consider "chatr'ing" the Apache executable to have a large page size. This would be "chatr +pi L <BINARY>". The GID of the running executable must have MLOCK privileges. Setprivgrp(1m) should be consulted for assigning MLOCK. The change can be validated by running Glance and examining the memory regions of the server(s) to make sure that they show a non-trivial fraction of the text segment being locked.

If folks are running Apache on MP systems, they might consider writing a small program that uses mpctl() to bind processes to processors. A simple pid % numcpu algorithm is probably sufficient. This might even go into the source code.

If folks are concerned about the number of FIN_WAIT_2 connections, they can use nettune to shrink the value of tcp_keepstart. However, they should be careful there - certainly do not make it less than oh two to four minutes. If tcp_hash_size has been set well, it is probably OK to let the FIN_WAIT_2's take longer to timeout (perhaps even the default two hours) - they will not on average have a big impact on performance.

There are other things that could go into the code base, but that might be left for another email. Feel free to drop me a message if you or others are interested.

sincerely,
rick jones
http://www.netperf.org/netperf/

Chapter 8
Apache HTTP Server and Supporting Programs

8.1 Server and Supporting Programs

This page documents all the executable programs included with the Apache HTTP Server.
Index

httpd               Apache hypertext transfer protocol server
apachectl           Apache HTTP server control interface
ab                  Apache HTTP server benchmarking tool
apxs                APache eXtenSion tool
configure           Configure the source tree
dbmmanage           Create and update user authentication files in DBM format for basic authentication
fcgistarter         Start a FastCGI program
firehose            Demultiplex a firehose from mod_firehose
htcacheclean        Clean up the disk cache
htdigest            Create and update user authentication files for digest authentication
htdbm               Manipulate DBM password databases.
htpasswd            Create and update user authentication files for basic authentication
httxt2dbm           Create dbm files for use with RewriteMap
logresolve          Resolve hostnames for IP-addresses in Apache logfiles
log_server_status   Periodically log the server's status
rotatelogs          Rotate Apache logs without having to kill the server
split-logfile       Split a multi-vhost logfile into per-host logfiles
suexec              Switch User For Exec

8.2 httpd - Apache Hypertext Transfer Protocol Server

httpd is the Apache HyperText Transfer Protocol (HTTP) server program. It is designed to be run as a standalone daemon process. When used like this it will create a pool of child processes or threads to handle requests.

In general, httpd should not be invoked directly, but rather should be invoked via apachectl on Unix-based systems or as a service on Windows NT, 2000 and XP (p. 267) and as a console application on Windows 9x and ME (p. 267).

See also

• Starting Apache httpd (p. 27)
• Stopping Apache httpd (p. 29)
• Configuration Files (p. 32)
• Platform-specific Documentation (p. 266)
• apachectl

Synopsis

httpd [ -d serverroot ] [ -f config ] [ -C directive ] [ -c directive ] [ -D parameter ] [ -e level ] [ -E file ] [ -k start|restart|graceful|stop|graceful-stop ] [ -R directory ] [ -h ] [ -l ] [ -L ] [ -S ] [ -t ] [ -v ] [ -V ] [ -X ] [ -M ] [ -T ]

On Windows systems (p. 267), the following additional arguments are available:

httpd [ -k install|config|uninstall ] [ -n name ] [ -w ]

Options

-d serverroot
    Set the initial value for the ServerRoot directive to serverroot. This can be overridden by the ServerRoot directive in the configuration file. The default is /usr/local/apache2.
-f config
    Uses the directives in the file config on startup. If config does not begin with a /, then it is taken to be a path relative to the ServerRoot. The default is conf/httpd.conf.
-k start|restart|graceful|stop|graceful-stop
    Signals httpd to start, restart, or stop. See Stopping Apache httpd (p. 29) for more information.
-C directive
    Process the configuration directive before reading config files.
-c directive
    Process the configuration directive after reading config files.
-D parameter
    Sets a configuration parameter which can be used with <IfDefine> sections in the configuration files to conditionally skip or process commands at server startup and restart. Also can be used to set certain less-common startup parameters including -DNO_DETACH (prevent the parent from forking) and -DFOREGROUND (prevent the parent from calling setsid() et al).
-e level
    Sets the LogLevel to level during server startup. This is useful for temporarily increasing the verbosity of the error messages to find problems during startup.
-E file
    Send error messages during server startup to file.
-h
    Output a short summary of available command line options.
-l
    Output a list of modules compiled into the server. This will not list dynamically loaded modules included using the LoadModule directive.
-L
    Output a list of directives provided by static modules, together with expected arguments and places where the directive is valid. Directives provided by shared modules are not listed.
-M
    Dump a list of loaded Static and Shared Modules.
-S
    Show the settings as parsed from the config file (currently only shows the virtualhost settings).
-T (Available in 2.3.8 and later)
    Skip document root check at startup/restart.
-t
    Run syntax tests for configuration files only. The program immediately exits after these syntax parsing tests with either a return code of 0 (Syntax OK) or return code not equal to 0 (Syntax Error). If -D DUMP_VHOSTS is also set, details of the virtual host configuration will be printed. If -D DUMP_MODULES is set, all loaded modules will be printed. If -D DUMP_CERTS is set and mod_ssl is used, configured SSL certificates will be printed. If -D DUMP_CA_CERTS is set and mod_ssl is used, configured SSL CA certificates and configured directories containing SSL CA certificates will be printed.
-v
    Print the version of httpd, and then exit.
-V
    Print the version and build parameters of httpd, and then exit.
-X
    Run httpd in debug mode. Only one worker will be started and the server will not detach from the console.

The following arguments are available only on the Windows platform (p. 267):

-k install|config|uninstall
    Install Apache httpd as a Windows NT service; change startup options for the Apache httpd service; and uninstall the Apache httpd service.
-n name
    The name of the Apache httpd service to signal.
-w
    Keep the console window open on error so that the error message can be read.

8.3 ab - Apache HTTP server benchmarking tool

ab is a tool for benchmarking your Apache Hypertext Transfer Protocol (HTTP) server. It is designed to give you an impression of how your current Apache installation performs. In particular, it shows how many requests per second your Apache installation is capable of serving.
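The headline figures ab reports are simple ratios. As a rough sketch, the formulas given in the Output section of this page can be reproduced as follows. The variable names follow those formulas (concurrency, timetaken, done, totalread); the function name and the sample numbers are made up for illustration and are not measurements.

```python
def ab_summary(concurrency: int, timetaken: float, done: int, totalread: int) -> dict:
    """Recompute ab's summary figures from the raw counters of a run.

    concurrency: number of concurrent clients (-c)
    timetaken:   total time taken for the test, in seconds
    done:        number of completed requests
    totalread:   total number of bytes received from the server
    """
    return {
        # Number of requests divided by the total time taken.
        "requests_per_second": done / timetaken,
        # First "Time per request" value, in milliseconds.
        "time_per_request_ms": concurrency * timetaken * 1000 / done,
        # Second "Time per request" value, in milliseconds.
        "time_per_request_across_ms": timetaken * 1000 / done,
        # Transfer rate in kilobytes per second.
        "transfer_rate_kb_per_s": totalread / 1024 / timetaken,
    }

# Hypothetical run: 1000 requests, 10 concurrent clients, 2 seconds, 2,048,000 bytes read.
summary = ab_summary(concurrency=10, timetaken=2.0, done=1000, totalread=2048000)
for name, value in summary.items():
    print(name, value)
```

Note that the first "Time per request" value is simply the second one multiplied by the concurrency level.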
See also

• httpd

Synopsis

ab [ -A auth-username:password ] [ -b windowsize ] [ -B local-address ] [ -c concurrency ] [ -C cookie-name=value ] [ -d ] [ -e csv-file ] [ -f protocol ] [ -g gnuplot-file ] [ -h ] [ -H custom-header ] [ -i ] [ -k ] [ -l ] [ -m HTTP-method ] [ -n requests ] [ -p POST-file ] [ -P proxy-auth-username:password ] [ -q ] [ -r ] [ -s timeout ] [ -S ] [ -t timelimit ] [ -T content-type ] [ -u PUT-file ] [ -v verbosity] [ -V ] [ -w ] [ -x <table>-attributes ] [ -X proxy[:port] ] [ -y <tr>-attributes ] [ -z <td>-attributes ] [ -Z ciphersuite ] [http[s]://]hostname[:port]/path

Options

-A auth-username:password
    Supply BASIC Authentication credentials to the server. The username and password are separated by a single : and sent on the wire base64 encoded. The string is sent regardless of whether the server needs it (i.e., has sent a 401 authentication needed).
-b windowsize
    Size of TCP send/receive buffer, in bytes.
-B local-address
    Address to bind to when making outgoing connections.
-c concurrency
    Number of multiple requests to perform at a time. Default is one request at a time.
-C cookie-name=value
    Add a Cookie: line to the request. The argument is typically in the form of a name=value pair. This field is repeatable.
-d
    Do not display the "percentage served within XX [ms] table". (legacy support).
-e csv-file
    Write a Comma separated value (CSV) file which contains for each percentage (from 1% to 100%) the time (in milliseconds) it took to serve that percentage of the requests. This is usually more useful than the 'gnuplot' file, as the results are already 'binned'.
-f protocol
    Specify SSL/TLS protocol (SSL2, SSL3, TLS1, TLS1.1, TLS1.2, or ALL). TLS1.1 and TLS1.2 support available in 2.4.4 and later.
-g gnuplot-file
    Write all measured values out as a 'gnuplot' or TSV (Tab separated values) file. This file can easily be imported into packages like Gnuplot, IDL, Mathematica, Igor or even Excel. The labels are on the first line of the file.
-h
    Display usage information.
-H custom-header
    Append extra headers to the request. The argument is typically in the form of a valid header line, containing a colon-separated field-value pair (i.e., "Accept-Encoding: zip/zop;8bit").
-i
    Do HEAD requests instead of GET.
-k
    Enable the HTTP KeepAlive feature, i.e., perform multiple requests within one HTTP session. Default is no KeepAlive.
-l
    Do not report errors if the length of the responses is not constant.
    This can be useful for dynamic pages. Available in 2.4.7 and later.
-m HTTP-method
    Custom HTTP method for the requests. Available in 2.4.10 and later.
-n requests
    Number of requests to perform for the benchmarking session. The default is to just perform a single request which usually leads to non-representative benchmarking results.
-p POST-file
    File containing data to POST. Remember to also set -T.
-P proxy-auth-username:password
    Supply BASIC Authentication credentials to a proxy en-route. The username and password are separated by a single : and sent on the wire base64 encoded. The string is sent regardless of whether the proxy needs it (i.e., has sent a 407 proxy authentication needed).
-q
    When processing more than 150 requests, ab outputs a progress count on stderr every 10% or 100 requests or so. The -q flag will suppress these messages.
-r
    Don't exit on socket receive errors.
-s timeout
    Maximum number of seconds to wait before the socket times out. Default is 30 seconds. Available in 2.4.4 and later.
-S
    Do not display the median and standard deviation values, nor display the warning/error messages when the average and median are more than one or two times the standard deviation apart. And default to the min/avg/max values. (legacy support).
-t timelimit
    Maximum number of seconds to spend for benchmarking. This implies a -n 50000 internally. Use this to benchmark the server within a fixed total amount of time. Per default there is no timelimit.
-T content-type
    Content-type header to use for POST/PUT data, eg. application/x-www-form-urlencoded. Default is text/plain.
-u PUT-file
    File containing data to PUT. Remember to also set -T.
-v verbosity
    Set verbosity level - 4 and above prints information on headers, 3 and above prints response codes (404, 200, etc.), 2 and above prints warnings and info.
-V
    Display version number and exit.
-w
    Print out results in HTML tables. Default table is two columns wide, with a white background.
-x <table>-attributes
    String to use as attributes for <table>. Attributes are inserted <table here >.
-X proxy[:port]
    Use a proxy server for the requests.
-y <tr>-attributes
    String to use as attributes for <tr>.
-z <td>-attributes
    String to use as attributes for <td>.
-Z ciphersuite
    Specify SSL/TLS cipher suite (See openssl ciphers)

Output

The following list describes the values returned by ab:

Server Software
    The value, if any, returned in the server HTTP header of the first successful response. This includes all characters in the header from beginning to the point a character with decimal value of 32 (most notably: a space or CR/LF) is detected.
Server Hostname
    The DNS or IP address given on the command line
Server Port
    The port to which ab is connecting. If no port is given on the command line, this will default to 80 for http and 443 for https.
SSL/TLS Protocol
    The protocol parameters negotiated between the client and server. This will only be printed if SSL is used.
Document Path
    The request URI parsed from the command line string.
Document Length
    This is the size in bytes of the first successfully returned document. If the document length changes during testing, the response is considered an error.
Concurrency Level
    The number of concurrent clients used during the test
Time taken for tests
    This is the time taken from the moment the first socket connection is created to the moment the last response is received
Complete requests
    The number of successful responses received
Failed requests
    The number of requests that were considered a failure. If the number is greater than zero, another line will be printed showing the number of requests that failed due to connecting, reading, incorrect content length, or exceptions.
Write errors
    The number of errors that failed during write (broken pipe).
Non-2xx responses
    The number of responses that were not in the 200 series of response codes. If all responses were 200, this field is not printed.
Keep-Alive requests
    The number of connections that resulted in Keep-Alive requests
Total body sent
    If configured to send data as part of the test, this is the total number of bytes sent during the tests. This field is omitted if the test did not include a body to send.
Total transferred
    The total number of bytes received from the server. This number is essentially the number of bytes sent over the wire.
HTML transferred
    The total number of document bytes received from the server. This number excludes bytes received in HTTP headers
Requests per second
    This is the number of requests per second. This value is the result of dividing the number of requests by the total time taken
Time per request
    The average time spent per request. The first value is calculated with the formula concurrency * timetaken * 1000 / done while the second value is calculated with the formula timetaken * 1000 / done
Transfer rate
    The rate of transfer as calculated by the formula totalread / 1024 / timetaken

Bugs

There are various statically declared buffers of fixed length. Combined with the lazy parsing of the command line arguments, the response headers from the server and other external inputs, this might bite you.

It does not implement HTTP/1.x fully; only accepts some 'expected' forms of responses. The rather heavy use of strstr(3) shows up top in profile, which might indicate a performance problem; i.e., you would measure the ab performance rather than the server's.

8.4 apachectl - Apache HTTP Server Control Interface

apachectl is a front end to the Apache HyperText Transfer Protocol (HTTP) server. It is designed to help the administrator control the functioning of the Apache httpd daemon.

The apachectl script can operate in two modes.
First, it can act as a simple front-end to the httpd command that simply sets any necessary environment variables and then invokes httpd, passing through any command line arguments. Second, apachectl can act as a SysV init script, taking simple one-word arguments like start, restart, and stop, and translating them into appropriate signals to httpd.

If your Apache installation uses non-standard paths, you will need to edit the apachectl script to set the appropriate paths to the httpd binary. You can also specify any necessary httpd command line arguments. See the comments in the script for details.

The apachectl script returns a 0 exit value on success, and >0 if an error occurs. For more details, view the comments in the script.

See also

• Starting Apache (p. 27)
• Stopping Apache (p. 29)
• Configuration Files (p. 32)
• Platform Docs (p. 266)
• httpd

Synopsis

When acting in pass-through mode, apachectl can take all the arguments available for the httpd binary.

apachectl [ httpd-argument ]

When acting in SysV init mode, apachectl takes simple, one-word commands, defined below.

apachectl command

Options

Only the SysV init-style options are defined here. Other arguments are defined on the httpd manual page.

start
    Start the Apache httpd daemon. Gives an error if it is already running. This is equivalent to apachectl -k start.
stop
    Stops the Apache httpd daemon. This is equivalent to apachectl -k stop.
restart
    Restarts the Apache httpd daemon. If the daemon is not running, it is started. This command automatically checks the configuration files as in configtest before initiating the restart to make sure the daemon doesn't die. This is equivalent to apachectl -k restart.
fullstatus
    Displays a full status report from mod_status. For this to work, you need to have mod_status enabled on your server and a text-based browser such as lynx available on your system. The URL used to access the status report can be set by editing the STATUSURL variable in the script.
status
    Displays a brief status report. Similar to the fullstatus option, except that the list of requests currently being served is omitted.
graceful
    Gracefully restarts the Apache httpd daemon. If the daemon is not running, it is started. This differs from a normal restart in that currently open connections are not aborted. A side effect is that old log files will not be closed immediately. This means that if used in a log rotation script, a substantial delay may be necessary to ensure that the old log files are closed before processing them. This command automatically checks the configuration files as in configtest before initiating the restart to make sure Apache doesn't die. This is equivalent to apachectl -k graceful.
graceful-stop
    Gracefully stops the Apache httpd daemon. This differs from a normal stop in that currently open connections are not aborted. A side effect is that old log files will not be closed immediately. This is equivalent to apachectl -k graceful-stop.
configtest
    Run a configuration file syntax test. It parses the configuration files and either reports Syntax Ok or detailed information about the particular syntax error. This is equivalent to apachectl -t.

The following option was available in earlier versions but has been removed.

startssl
    To start httpd with SSL support, you should edit your configuration file to include the relevant directives and then use the normal apachectl start.

8.5 apxs - APache eXtenSion tool

apxs is a tool for building and installing extension modules for the Apache HyperText Transfer Protocol (HTTP) server. This is achieved by building a dynamic shared object (DSO) from one or more source or object files which then can be loaded into the Apache server at runtime via the LoadModule directive from mod_so.
To use this extension mechanism, your platform has to support the DSO feature and your Apache httpd binary has to be built with the mod_so module. The apxs tool automatically complains if this is not the case. You can check this yourself by manually running the command

    $ httpd -l

The module mod_so should be part of the displayed list. If these requirements are fulfilled, you can easily extend your Apache server's functionality by installing your own modules with the DSO mechanism with the help of the apxs tool:

    $ apxs -i -a -c mod_foo.c
    gcc -fpic -DSHARED_MODULE -I/path/to/apache/include -c mod_foo.c
    ld -Bshareable -o mod_foo.so mod_foo.o
    cp mod_foo.so /path/to/apache/modules/mod_foo.so
    chmod 755 /path/to/apache/modules/mod_foo.so
    [activating module ‘foo’ in /path/to/apache/etc/httpd.conf]
    $ apachectl restart
    /path/to/apache/sbin/apachectl restart: httpd not running, trying to start
    [Tue Mar 31 11:27:55 1998] [debug] mod_so.c(303): loaded module foo_module
    /path/to/apache/sbin/apachectl restart: httpd started
    $

The files arguments can be any C source file (.c), an object file (.o), or even a library archive (.a). The apxs tool automatically recognizes these extensions and uses the C source files for compilation, while using the object and archive files only for the linking phase. When using such pre-compiled objects, make sure they are compiled as position independent code (PIC) so that they can be used in a dynamically loaded shared object. For instance, with GCC you always just have to use -fpic. For other C compilers, consult the compiler's manual page or watch for the flags apxs uses to compile the object files.

For more details about DSO support in Apache, read the documentation of mod_so or perhaps even read the src/modules/standard/mod_so.c source file.

See also
• apachectl
• httpd

Synopsis

    apxs -g [ -S name=value ] -n modname
    apxs -q [ -v ] [ -S name=value ] query ...
    apxs -c [ -S name=value ] [ -o dsofile ] [ -I incdir ] [ -D name=value ] [ -L libdir ] [ -l libname ] [ -Wc,compiler-flags ] [ -Wl,linker-flags ] files ...
    apxs -i [ -S name=value ] [ -n modname ] [ -a ] [ -A ] dso-file ...
    apxs -e [ -S name=value ] [ -n modname ] [ -a ] [ -A ] dso-file ...

Options

Common Options

-n modname
    This explicitly sets the module name for the -i (install) and -g (template generation) options. For option -g this is required. For option -i the apxs tool tries to determine the name from the source or (as a fallback) by guessing it from the filename.

Query Options

-q
    Performs a query for variables and environment settings used to build httpd. When invoked without query parameters, it prints all known variables and their values. The optional -v parameter formats the list output. Use this to manually determine settings used to build the httpd that will load your module. For instance, use INC=-I`apxs -q INCLUDEDIR` inside your own Makefiles if you need manual access to Apache's C header files.

Configuration Options

-S name=value
    This option changes the apxs settings described above.

Template Generation Options

-g
    This generates a subdirectory name (see option -n) containing two files: a sample module source file named mod_name.c, which can be used as a template for creating your own modules or as a quick start for playing with the apxs mechanism, and a corresponding Makefile for even easier building and installing of this module.

DSO Compilation Options

-c
    This indicates the compilation operation. It first compiles the C source files (.c) of files into corresponding object files (.o) and then builds a dynamically shared object in dsofile by linking these object files plus the remaining object files (.o and .a) of files.
    If no -o option is specified, the output file is guessed from the first filename in files and thus usually defaults to mod_name.so.
-o dsofile
    Explicitly specifies the filename of the created dynamically shared object. If not specified and the name cannot be guessed from the files list, the fallback name mod_unknown.so is used.
-D name=value
    This option is directly passed through to the compilation command(s). Use this to add your own defines to the build process.
-I incdir
    This option is directly passed through to the compilation command(s). Use this to add your own include directories to search to the build process.
-L libdir
    This option is directly passed through to the linker command. Use this to add your own library directories to search to the build process.
-l libname
    This option is directly passed through to the linker command. Use this to add your own libraries to search to the build process.
-Wc,compiler-flags
    This option passes compiler-flags as additional flags to the libtool --mode=compile command. Use this to add local compiler-specific options.
-Wl,linker-flags
    This option passes linker-flags as additional flags to the libtool --mode=link command. Use this to add local linker-specific options.
-p
    This option causes apxs to link against the apr/apr-util libraries. This is useful when compiling helper programs that use the apr/apr-util libraries.

DSO Installation and Configuration Options

-i
    This indicates the installation operation and installs one or more dynamically shared objects into the server's modules directory.
-a
    This activates the module by automatically adding a corresponding LoadModule line to Apache's httpd.conf configuration file, or by enabling it if it already exists.
-A
    Same as option -a, but the created LoadModule directive is prefixed with a hash sign (#), i.e., the module is just prepared for later activation but initially disabled.
-e
    This indicates the editing operation, which can be used with the -a and -A options similarly to the -i operation to edit Apache's httpd.conf configuration file without attempting to install the module.

Examples

Assume you have an Apache module named mod_foo.c available which should extend Apache's server functionality. To accomplish this, you first have to compile the C source into a shared object suitable for loading into the Apache server at runtime via the following command:

    $ apxs -c mod_foo.c
    /path/to/libtool --mode=compile gcc ... -c mod_foo.c
    /path/to/libtool --mode=link gcc ... -o mod_foo.la mod_foo.slo
    $

Then you have to update the Apache configuration by making sure a LoadModule directive is present to load this shared object. To simplify this step, apxs provides an automatic way to install the shared object in its "modules" directory and update the httpd.conf file accordingly. This can be achieved by running:

    $ apxs -i -a mod_foo.la
    /path/to/instdso.sh mod_foo.la /path/to/apache/modules
    /path/to/libtool --mode=install cp mod_foo.la /path/to/apache/modules
    ...
    chmod 755 /path/to/apache/modules/mod_foo.so
    [activating module ‘foo’ in /path/to/apache/conf/httpd.conf]
    $

This way a line reading

    LoadModule foo_module modules/mod_foo.so

is added to the configuration file if it is not already present. If you want to have this disabled by default, use the -A option, i.e.

    $ apxs -i -A mod_foo.c

For a quick test of the apxs mechanism you can create a sample Apache module template plus a corresponding Makefile via:

    $ apxs -g -n foo
    Creating [DIR] foo
    Creating [FILE] foo/Makefile
    Creating [FILE] foo/modules.mk
    Creating [FILE] foo/mod_foo.c
    Creating [FILE] foo/.deps
    $

Then you can immediately compile this sample module into a shared object and load it into the Apache server:

    $ cd foo
    $ make all reload
    apxs -c mod_foo.c
    /path/to/libtool --mode=compile gcc ... -c mod_foo.c
    /path/to/libtool --mode=link gcc ... -o mod_foo.la mod_foo.slo
    apxs -i -a -n "foo" mod_foo.la
    /path/to/instdso.sh mod_foo.la /path/to/apache/modules
    /path/to/libtool --mode=install cp mod_foo.la /path/to/apache/modules
    ...
    chmod 755 /path/to/apache/modules/mod_foo.so
    [activating module ‘foo’ in /path/to/apache/conf/httpd.conf]
    apachectl restart
    /path/to/apache/sbin/apachectl restart: httpd not running, trying to start
    [Tue Mar 31 11:27:55 1998] [debug] mod_so.c(303): loaded module foo_module
    /path/to/apache/sbin/apachectl restart: httpd started
    $

8.6 configure - Configure the source tree

The configure script configures the source tree for compiling and installing the Apache HTTP Server on your particular platform. Various options allow the compilation of a server corresponding to your personal requirements.

This script, included in the root directory of the source distribution, is for compilation on Unix and Unix-like systems only. For other platforms, see the platform (p. 266) documentation.

See also
• Compiling and Installing (p. 22)

Synopsis

You should call the configure script from within the root directory of the distribution.

    ./configure [OPTION]... [VAR=VALUE]...

To assign environment variables (e.g. CC, CFLAGS ...), specify them as VAR=VALUE. See below for descriptions of some of the useful variables.

Options
• Configuration options
• Installation directories
• System types
• Optional features
• Options for support programs

Configuration options

The following options influence the behavior of configure itself.

-C --config-cache
    This is an alias for --cache-file=config.cache.
--cache-file=FILE
    The test results will be cached in file FILE. This option is disabled by default.
-h --help [short|recursive]
    Output the help and exit. With the argument short, only options specific to this package will be displayed. The argument recursive displays the short help of all the included packages.
-n --no-create
    The configure script is run normally but does not create output files. This is useful to check the test results before generating makefiles for compilation.
-q --quiet
    Do not print checking ... messages during the configure process.
--srcdir=DIR
    Defines directory DIR to be the source file directory. Default is the directory where configure is located, or the parent directory.
--silent
    Same as --quiet.
-V --version
    Display copyright information and exit.

Installation directories

These options define the installation directory. The installation tree depends on the selected layout.

--prefix=PREFIX
    Install architecture-independent files in PREFIX. By default the installation directory is set to /usr/local/apache2.
--exec-prefix=EPREFIX
    Install architecture-dependent files in EPREFIX. By default the installation directory is set to the PREFIX directory.

By default, make install will install all the files in /usr/local/apache2/bin, /usr/local/apache2/lib etc. You can specify an installation prefix other than /usr/local/apache2 using --prefix, for instance --prefix=$HOME.

Define a directory layout

--enable-layout=LAYOUT
    Configure the source code and build scripts to assume an installation tree based on the layout LAYOUT. This allows you to separately specify the locations for each type of file within the Apache HTTP Server installation. The config.layout file contains several example configurations, and you can also create your own custom configuration following the examples. The different layouts in this file are grouped into <Layout FOO>...</Layout> sections and referred to by name as in FOO. The default layout is Apache.

Fine tuning of the installation directories

For better control of the installation directories, use the options below. Please note that the directory defaults are set by autoconf and are overwritten by the corresponding layout setting.

--bindir=DIR
    Install user executables in DIR.
    The user executables are supporting programs like htpasswd, dbmmanage, etc. which are useful for site administrators. By default DIR is set to EPREFIX/bin.
--datadir=DIR
    Install read-only architecture-independent data in DIR. By default datadir is set to PREFIX/share. This option is offered by autoconf and currently unused.
--includedir=DIR
    Install C header files in DIR. By default includedir is set to EPREFIX/include.
--infodir=DIR
    Install info documentation in DIR. By default infodir is set to PREFIX/info. This option is currently unused.
--libdir=DIR
    Install object code libraries in DIR. By default libdir is set to EPREFIX/lib.
--libexecdir=DIR
    Install the program executables (i.e., shared modules) in DIR. By default libexecdir is set to EPREFIX/modules.
--localstatedir=DIR
    Install modifiable single-machine data in DIR. By default localstatedir is set to PREFIX/var. This option is offered by autoconf and currently unused.
--mandir=DIR
    Install the man documentation in DIR. By default mandir is set to EPREFIX/man.
--oldincludedir=DIR
    Install C header files for non-gcc in DIR. By default oldincludedir is set to /usr/include. This option is offered by autoconf and currently unused.
--sbindir=DIR
    Install the system administrator executables in DIR. Those are server programs like httpd, apachectl, suexec, etc. which are necessary to run the Apache HTTP Server. By default sbindir is set to EPREFIX/sbin.
--sharedstatedir=DIR
    Install modifiable architecture-independent data in DIR. By default sharedstatedir is set to PREFIX/com. This option is offered by autoconf and currently unused.
--sysconfdir=DIR
    Install read-only single-machine data like the server configuration files httpd.conf, mime.types, etc. in DIR. By default sysconfdir is set to PREFIX/conf.

System types

These options are used to cross-compile the Apache HTTP Server to run on another system.
In normal cases, when building and running the server on the same system, these options are not used.

--build=BUILD
    Defines the system type of the system on which the tools are being built. It defaults to the result of the script config.guess.
--host=HOST
    Defines the system type of the system on which the server will run. HOST defaults to BUILD.
--target=TARGET
    Configure for building compilers for the system type TARGET. It defaults to HOST. This option is offered by autoconf and not necessary for the Apache HTTP Server.

Optional Features

These options are used to fine tune the features your HTTP server will have.

General syntax

Generally you can use the following syntax to enable or disable a feature:

--disable-FEATURE
    Do not include FEATURE. This is the same as --enable-FEATURE=no.
--enable-FEATURE[=ARG]
    Include FEATURE. The default value for ARG is yes.
--enable-MODULE=shared
    The corresponding module will be built as a DSO module. By default enabled modules are linked dynamically.
--enable-MODULE=static
    The corresponding module will be linked statically.

Note: configure will not complain about --enable-foo even if foo doesn't exist, so you need to type carefully.

Choosing modules to compile

Most modules are compiled by default and have to be disabled explicitly or by using the keywords few or none (see --enable-modules, --enable-mods-shared and --enable-mods-static below for further explanation) to be removed.

Other modules are not compiled by default and have to be enabled explicitly or by using the keywords all or reallyall to be available.

To find out which modules are compiled by default, run ./configure -h or ./configure --help and look under Optional Features. Suppose you are interested in mod_example1 and mod_example2, and you see this:

    Optional Features:
      ...
      --disable-example1     example module 1
      --enable-example2      example module 2
      ...

Then mod_example1 is enabled by default, and you would use --disable-example1 to not compile it. mod_example2 is disabled by default, and you would use --enable-example2 to compile it.

Multi-Processing Modules

Multi-Processing Modules (p. 90), or MPMs, implement the basic behavior of the server. A single MPM must be active in order for the server to function. The list of available MPMs appears on the module index page (p. 1101).

MPMs can be built as DSOs for dynamic loading or statically linked with the server, and are enabled using the following options:

--with-mpm=MPM
    Choose the default MPM for your server. If MPMs are built as DSO modules (see --enable-mpms-shared), this directive selects the MPM which will be loaded in the default configuration file. Otherwise, this directive selects the only available MPM, which will be statically linked into the server.
    If this option is omitted, the default MPM (p. 90) for your operating system will be used.
--enable-mpms-shared=MPM-LIST
    Enable a list of MPMs as dynamic shared modules. One of these modules must be loaded dynamically using the LoadModule directive.
    MPM-LIST is a space-separated list of MPM names enclosed by quotation marks. For example:

        --enable-mpms-shared='prefork worker'

    Additionally you can use the special keyword all, which will select all MPMs which support dynamic loading on the current platform and build them as DSO modules. For example:

        --enable-mpms-shared=all

Third-party modules

To add additional third-party modules use the following options:

--with-module=module-type:module-file[, module-type:module-file]
    Add one or more third-party modules to the list of statically linked modules. The module source file module-file will be searched for in the modules/module-type subdirectory of your Apache HTTP server source tree.
    If it is not found there, configure treats module-file as an absolute file path and tries to copy the source file into the module-type subdirectory. If the subdirectory doesn't exist, it will be created and populated with a standard Makefile.in.
    This option is useful to add small external modules consisting of one source file. For more complex modules you should read the vendor's documentation.

    Note: If you want to build a DSO module instead of a statically linked one, use apxs.

Cumulative and other options

--enable-maintainer-mode
    Turn on debugging and compile time warnings and load all compiled modules.
--enable-mods-shared=MODULE-LIST
    Defines a list of modules to be enabled and built as dynamic shared modules. This means these modules have to be loaded dynamically by using the LoadModule directive.
    MODULE-LIST is a space-separated list of module names enclosed by quotation marks. The module names are given without the preceding mod_. For example:

        --enable-mods-shared='headers rewrite dav'

    Additionally you can use the special keywords reallyall, all, most, few and none. For example,

        --enable-mods-shared=most

    will compile most modules and build them as DSO modules,

        --enable-mods-shared=few

    will only compile a very basic set of modules.
    The default set is most.
    The LoadModule directives for the chosen modules will be automatically generated in the main configuration file. By default, all those directives will be commented out except for the modules that are either required or explicitly selected by a configure --enable-foo argument. You can change the set of loaded modules by activating or deactivating the LoadModule directives in httpd.conf. In addition, the LoadModule directives for all built modules can be activated via the configure option --enable-load-all-modules.
--enable-mods-static=MODULE-LIST
    This option behaves similarly to --enable-mods-shared, but will link the given modules statically.
    This means these modules will always be present while running httpd. They need not be loaded with LoadModule.
--enable-modules=MODULE-LIST
    This option behaves like --enable-mods-shared, and will also link the given modules dynamically. The special keyword none disables the build of all modules.
--enable-v4-mapped
    Allow IPv6 sockets to handle IPv4 connections.
--with-port=PORT
    This defines the port on which httpd will listen. This port number is used when generating the configuration file httpd.conf. The default is 80.
--with-program-name
    Define an alternative executable name. The default is httpd.

Optional packages

These options are used to define optional packages.

General syntax

Generally you can use the following syntax to define an optional package:

--with-PACKAGE[=ARG]
    Use the package PACKAGE. The default value for ARG is yes.
--without-PACKAGE
    Do not use the package PACKAGE. This is the same as --with-PACKAGE=no. This option is provided by autoconf but not very useful for the Apache HTTP Server.

Specific packages

--with-apr=DIR|FILE
    The Apache Portable Runtime (APR) is part of the httpd source distribution and will automatically be built together with the HTTP server. If you want to use an already installed APR instead, you have to tell configure the path to the apr-config script. You may set the absolute path and name or the directory to the installed APR. apr-config must exist within this directory or the subdirectory bin.
--with-apr-util=DIR|FILE
    The Apache Portable Runtime Utilities (APU) are part of the httpd source distribution and will automatically be built together with the HTTP server. If you want to use an already installed APU instead, you have to tell configure the path to the apu-config script. You may set the absolute path and name or the directory to the installed APU. apu-config must exist within this directory or the subdirectory bin.
--with-ssl=DIR
    If mod_ssl has been enabled, configure searches for an installed OpenSSL. You can set the directory path to the SSL/TLS toolkit instead.
--with-z=DIR
    configure searches automatically for an installed zlib library if your source configuration requires one (e.g., when mod_deflate is enabled). You can set the directory path to the compression library instead.

Several features of the Apache HTTP Server, including mod_authn_dbm and mod_rewrite's DBM RewriteMap, use simple key/value databases for quick lookups of information. SDBM is included in the APU, so this database is always available. If you would like to use other database types, use the following options to enable them:

--with-gdbm[=path]
    If no path is specified, configure will search for the include files and libraries of a GNU DBM installation in the usual search paths. An explicit path will cause configure to look in path/lib and path/include for the relevant files. Finally, the path may specify specific include and library paths separated by a colon.
--with-ndbm[=path]
    Like --with-gdbm, but searches for a New DBM installation.
--with-berkeley-db[=path]
    Like --with-gdbm, but searches for a Berkeley DB installation.

Note: The DBM options are provided by the APU and passed through to its configuration script. They are useless when using an already installed APU defined by --with-apr-util.

You may use more than one DBM implementation together with your HTTP server. The appropriate DBM type is then selected in the runtime configuration each time.

Options for support programs

--enable-static-support
    Build a statically linked version of the support binaries. This means a stand-alone executable will be built with all the necessary libraries integrated. Otherwise the support binaries are linked dynamically by default.
--enable-suexec
    Use this option to enable suexec, which allows you to set uid and gid for spawned processes.
    Do not use this option unless you understand all the security implications of running a suid binary on your server. Further options to configure suexec are described below.

It is possible to create a statically linked binary of a single support program by using the following options:

--enable-static-ab
    Build a statically linked version of ab.
--enable-static-checkgid
    Build a statically linked version of checkgid.
--enable-static-htdbm
    Build a statically linked version of htdbm.
--enable-static-htdigest
    Build a statically linked version of htdigest.
--enable-static-htpasswd
    Build a statically linked version of htpasswd.
--enable-static-logresolve
    Build a statically linked version of logresolve.
--enable-static-rotatelogs
    Build a statically linked version of rotatelogs.

suexec configuration options

The following options are used to fine tune the behavior of suexec. See Configuring and installing suEXEC (p. 335) for further information.

--with-suexec-bin
    This defines the path to the suexec binary. Default is --sbindir (see Fine tuning of installation directories).
--with-suexec-caller
    This defines the user allowed to call suexec. It should be the same as the user under which httpd normally runs.
--with-suexec-docroot
    This defines the directory tree under which suexec access is allowed for executables. Default value is --datadir/htdocs.
--with-suexec-gidmin
    Define this as the lowest GID allowed to be a target user for suexec. The default value is 100.
--with-suexec-logfile
    This defines the filename of the suexec logfile. By default the logfile is named suexec_log and located in --logfiledir.
--with-suexec-safepath
    Define the value of the environment variable PATH to be set for processes started by suexec. Default value is /usr/local/bin:/usr/bin:/bin.
--with-suexec-userdir
    This defines the subdirectory under the user's directory that contains all executables for which suexec access is allowed.
    This setting is necessary when you want to use suexec together with user-specific directories (as provided by mod_userdir). The default is public_html.
--with-suexec-uidmin
    Define this as the lowest UID allowed to be a target user for suexec. The default value is 100.
--with-suexec-umask
    Set umask for processes started by suexec. It defaults to your system settings.

Environment variables

There are some useful environment variables to override the choices made by configure or to help it to find libraries and programs with nonstandard names or locations.

CC
    Define the C compiler command to be used for compilation.
CFLAGS
    Set C compiler flags you want to use for compilation.
CPP
    Define the C preprocessor command to be used.
CPPFLAGS
    Set C/C++ preprocessor flags, e.g. -Iincludedir if you have headers in a nonstandard directory includedir.
LDFLAGS
    Set linker flags, e.g. -Llibdir if you have libraries in a nonstandard directory libdir.

8.7 dbmmanage - Manage user authentication files in DBM format

dbmmanage is used to create and update the DBM format files used to store usernames and passwords for basic authentication of HTTP users via mod_authn_dbm. Resources available from the Apache HTTP server can be restricted to just the users listed in the files created by dbmmanage. This program can only be used when the usernames are stored in a DBM file. To use a flat-file database see htpasswd.

Another tool to maintain a DBM password database is htdbm.

This manual page only lists the command line arguments. For details of the directives necessary to configure user authentication in httpd see the httpd manual, which is part of the Apache distribution or can be found at http://httpd.apache.org/.
See also
• httpd
• htdbm
• mod_authn_dbm
• mod_authz_dbm

Synopsis

    dbmmanage [ encoding ] filename add|adduser|check|delete|update username [ encpasswd [ group[,group...] [ comment ] ] ]
    dbmmanage filename view [ username ]
    dbmmanage filename import

Options

filename
    The filename of the DBM format file. Usually without the extension .db, .pag, or .dir.
username
    The user for which the operations are performed. The username may not contain a colon (:).
encpasswd
    This is the already encrypted password to use for the update and add commands. You may use a hyphen (-) if you want to get prompted for the password, but fill in the fields afterwards. Additionally when using the update command, a period (.) keeps the original password untouched.
group
    A group which the user is a member of. A groupname may not contain a colon (:). You may use a hyphen (-) if you don't want to assign the user to a group, but fill in the comment field. Additionally when using the update command, a period (.) keeps the original groups untouched.
comment
    This is the place for your opaque comments about the user, like real name, mail address or such things. The server will ignore this field.

Encodings

-d
    crypt encryption (default, except on Win32, Netware)
-m
    MD5 encryption (default on Win32, Netware)
-s
    SHA1 encryption
-p
    plaintext (not recommended)

Commands

add
    Adds an entry for username to filename using the encrypted password encpasswd.

        dbmmanage passwords.dat add rbowen foKntnEF3KSXA

adduser
    Asks for a password and then adds an entry for username to filename.

        dbmmanage passwords.dat adduser krietz

check
    Asks for a password and then checks if username is in filename and if its password matches the specified one.

        dbmmanage passwords.dat check rbowen

delete
    Deletes the username entry from filename.

        dbmmanage passwords.dat delete rbowen

import
    Reads username:password entries (one per line) from STDIN and adds them to filename.
    The passwords must already be encrypted.
update
    Same as the adduser command, except that it makes sure username already exists in filename.

        dbmmanage passwords.dat update rbowen

view
    Just displays the contents of the DBM file. If you specify a username, it displays that particular record only.

        dbmmanage passwords.dat view

Bugs

One should be aware that there are a number of different DBM file formats in existence, and in all likelihood, libraries for more than one format may exist on your system. The four primary examples are SDBM, NDBM, the GNU project's GDBM, and Berkeley DB 2. Unfortunately, all these libraries use different file formats, and you must make sure that the file format used by filename is the same format that dbmmanage expects to see. dbmmanage currently has no way of determining what type of DBM file it is looking at. If used against the wrong format, dbmmanage will simply return nothing, or may create a different DBM file with a different name, or at worst, it may corrupt the DBM file if you were attempting to write to it.

dbmmanage has a list of DBM format preferences, defined by the @AnyDBM::ISA array near the beginning of the program. Since we prefer the Berkeley DB 2 file format, the order in which dbmmanage will look for system libraries is Berkeley DB 2, then NDBM, then GDBM and then SDBM. The first library found will be the library dbmmanage will attempt to use for all DBM file transactions. This ordering is slightly different than the standard @AnyDBM::ISA ordering in Perl, as well as the ordering used by the simple dbmopen() call in Perl, so if you use any other utilities to manage your DBM files, they must also follow this preference ordering. Similar care must be taken if using programs in other languages, like C, to access these files.

One can usually use the file program supplied with most Unix systems to see what format a DBM file is in.
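The input expected by the import command can be prepared with a few lines of shell. The sketch below is only an illustration: the helper name import_line is made up here, and the hash is the sample value from the add example above. It emits one username:encpasswd line and rejects usernames containing a colon, which dbmmanage does not allow.

```shell
#!/bin/sh
# Hypothetical helper: print one "username:encpasswd" line for dbmmanage import.
# Usernames containing a colon are rejected, since the colon is the separator.
import_line() {
    user=$1
    hash=$2   # must already be encrypted; dbmmanage import does not hash it
    case $user in
        *:*)
            echo "invalid username (contains ':'): $user" >&2
            return 1
            ;;
    esac
    printf '%s:%s\n' "$user" "$hash"
}

import_line rbowen foKntnEF3KSXA
```

Lines produced this way could then be piped into dbmmanage passwords.dat import, assuming the DBM format caveats described under Bugs are taken into account.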
8.8 fcgistarter - Start a FastCGI program

See also
• mod_proxy_fcgi

Note

Currently only works on Unix systems.

Synopsis

    fcgistarter -c command -p port [ -i interface ] -N num

Options

-c command
    FastCGI program
-p port
    Port which the program will listen on
-i interface
    Interface which the program will listen on
-N num
    Number of instances of the program

8.9 firehose - Demultiplex a firehose stream

firehose demultiplexes the given stream of multiplexed connections, and writes each connection to an individual file.

When writing to files, each connection is placed into a dedicated file named after the UUID of the connection within the stream. Separate files will be created if requests and responses are found in the stream.

If an optional prefix is specified as a parameter, connections that start with the given prefix will be included. The prefix needs to fit completely within the first fragment for a successful match to occur.

See also
• mod_firehose

Synopsis

    firehose [ -f input ] [ -o output-directory ] [ -u uuid ] [ -h ] [ --version ] [prefix1 [...]]

Options

--file, -f filename
    File to read the firehose from. Defaults to stdin.
--output-directory, -o output-directory
    Directory to write demultiplexed connections to.
--uuid, -u uuid
    The UUID of the connection to demultiplex. Can be specified more than once. If not specified, all UUIDs will be demultiplexed.
--help, -h
    This help text.
--version
    Display the version of the program.

8.10 htcacheclean - Clean up the disk cache

htcacheclean is used to keep the size of mod_cache_disk's storage within a given size limit, or limit on inodes in use. This tool can run either manually or in daemon mode. When running in daemon mode, it sleeps in the background and checks the cache directory at regular intervals for cached content to be removed.
You can stop the daemon cleanly by sending it a TERM or INT signal. When run manually, a one-off check of the cache directory is made for cached content to be removed. If one or more URLs are specified, each URL will be deleted from the cache, if present.

See also
• mod_cache_disk

Synopsis

htcacheclean [ -D ] [ -v ] [ -t ] [ -r ] [ -n ] [ -Rround ] -ppath [-llimit| -Llimit]
htcacheclean [ -n ] [ -t ] [ -i ] [ -Ppidfile ] [ -Rround ] -dinterval -ppath [-llimit| -Llimit]
htcacheclean [ -v ] [ -Rround ] -ppath [ -a ] [ -A ]
htcacheclean [ -D ] [ -v ] [ -t ] [ -Rround ] -ppath url

Options

-dinterval
    Daemonize and repeat cache cleaning every interval minutes. This option is mutually exclusive with the -D, -v and -r options. To shut down the daemon cleanly, just send it a SIGTERM or SIGINT.
-D
    Do a dry run and don't delete anything. This option is mutually exclusive with the -d option. When doing a dry run and deleting directories with -t, the inodes reported deleted in the stats cannot take into account the directories deleted, and will be marked as an estimate.
-v
    Be verbose and print statistics. This option is mutually exclusive with the -d option.
-r
    Clean thoroughly. This assumes that the Apache web server is not running (otherwise you may get garbage in the cache). This option is mutually exclusive with the -d option and implies the -t option.
-n
    Be nice. This causes slower processing in favour of other processes. htcacheclean will sleep from time to time so that (a) the disk IO will be delayed and (b) the kernel can schedule other processes in the meantime.
-t
    Delete all empty directories. By default only cache files are removed; however, with some configurations the large number of directories created may require attention. If your configuration requires a very large number of directories, to the point that inode or file allocation table exhaustion may become an issue, use of this option is advised.
-ppath
    Specify path as the root directory of the disk cache.
    This should be the same value as specified with the CacheRoot directive.
-Ppidfile
    Specify pidfile as the name of the file to write the process ID to when daemonized.
-Rround
    Specify round as the amount to round sizes up to, to compensate for disk block sizes. Set to the block size of the cache partition.
-llimit
    Specify limit as the total disk cache size limit. The value is expressed in bytes by default (or attaching B to the number). Attach K for Kbytes or M for MBytes.
-Llimit
    Specify limit as the total disk cache inode limit.
-i
    Be intelligent and run only when there was a modification of the disk cache. This option is only possible together with the -d option.
-a
    List the URLs currently stored in the cache. Variants of the same URL will be listed once for each variant.
-A
    List the URLs currently stored in the cache, along with their attributes in the following order: url, header size, body size, status, entity version, date, expiry, request time, response time, body present, head request.

Deleting a specific URL

If htcacheclean is passed one or more URLs, each URL will be deleted from the cache. If multiple variants of a URL exist, all variants will be deleted.

When a reverse proxied URL is to be deleted, the effective URL is constructed from the Host header, the port, the path and the query. Note that the '?' in the URL must always be specified explicitly, whether a query string is present or not. For example, to delete the path / from the server localhost, the URL to delete would be http://localhost:80/?.

Listing URLs in the Cache

By passing the -a or -A options to htcacheclean, the URLs within the cache will be listed as they are found, one URL per line. The -A option dumps the full cache entry after the URL, with fields in the following order:

url
    The URL of the entry.
header size
    The size of the header in bytes.
body size
    The size of the body in bytes.
status
    Status of the cached response.
entity version
    The number of times this entry has been revalidated without being deleted.
date
    Date of the response.
expiry
    Expiry date of the response.
request time
    Time of the start of the request.
response time
    Time of the end of the request.
body present
    If 0, no body is stored with this request, 1 otherwise.
head request
    If 1, the entry contains a cached HEAD request with no body, 0 otherwise.

Exit Status

htcacheclean returns a zero status ("true") if all operations were successful, 1 otherwise. If a URL is specified, and the URL was cached and successfully removed, 0 is returned, 2 otherwise. If an error occurred during URL removal, 1 is returned.

8.11 htdbm - Manipulate DBM password databases

htdbm is used to manipulate the DBM format files used to store usernames and passwords for basic authentication of HTTP users via mod_authn_dbm. See the dbmmanage documentation for more information about these DBM files.

See also
• httpd
• dbmmanage
• mod_authn_dbm

Synopsis

htdbm [ -TDBTYPE ] [ -i ] [ -c ] [ -m | -B | -d | -s | -p ] [ -C cost ] [ -t ] [ -v ] filename username
htdbm -b [ -TDBTYPE ] [ -c ] [ -m | -B | -d | -s | -p ] [ -C cost ] [ -t ] [ -v ] filename username password
htdbm -n [ -i ] [ -c ] [ -m | -B | -d | -s | -p ] [ -C cost ] [ -t ] [ -v ] username
htdbm -nb [ -c ] [ -m | -B | -d | -s | -p ] [ -C cost ] [ -t ] [ -v ] username password
htdbm -v [ -TDBTYPE ] [ -i ] [ -c ] [ -m | -B | -d | -s | -p ] [ -C cost ] [ -t ] [ -v ] filename username
htdbm -vb [ -TDBTYPE ] [ -c ] [ -m | -B | -d | -s | -p ] [ -C cost ] [ -t ] [ -v ] filename username password
htdbm -x [ -TDBTYPE ] filename username
htdbm -l [ -TDBTYPE ]

Options

-b
    Use batch mode; i.e., get the password from the command line rather than prompting for it. This option should be used with extreme care, since the password is clearly visible on the command line. For script use see the -i option.
-i
    Read the password from stdin without verification (for script usage).
-c
    Create the passwdfile. If passwdfile already exists, it is rewritten and truncated. This option cannot be combined with the -n option.
-n
    Display the results on standard output rather than updating a database. This option changes the syntax of the command line, since the passwdfile argument (usually the first one) is omitted. It cannot be combined with the -c option.
-m
    Use MD5 encryption for passwords. On Windows and Netware, this is the default.
-B
    Use bcrypt encryption for passwords. This is currently considered to be very secure.
-C
    This flag is only allowed in combination with -B (bcrypt encryption). It sets the computing time used for the bcrypt algorithm (higher is more secure but slower, default: 5, valid: 4 to 31).
-d
    Use crypt() encryption for passwords. The default on all platforms but Windows and Netware. Though possibly supported by htdbm on all platforms, it is not supported by the httpd server on Windows and Netware. This algorithm is insecure by today's standards.
-s
    Use SHA encryption for passwords. Facilitates migration from/to Netscape servers using the LDAP Directory Interchange Format (ldif). This algorithm is insecure by today's standards.
-p
    Use plaintext passwords. Though htdbm will support creation on all platforms, the httpd daemon will only accept plain text passwords on Windows and Netware.
-l
    Print each of the usernames and comments from the database on stdout.
-v
    Verify the username and password. The program will print a message indicating whether the supplied password is valid. If the password is invalid, the program exits with error code 3.
-x
    Delete user. If the username exists in the specified DBM file, it will be deleted.
-t
    Interpret the final parameter as a comment.
    When this option is specified, an additional string can be appended to the command line; this string will be stored in the "Comment" field of the database, associated with the specified username.
filename
    The filename of the DBM format file. Usually without the extension .db, .pag, or .dir. If -c is given, the DBM file is created if it does not already exist, or updated if it does exist.
username
    The username to create or update in passwdfile. If username does not exist in this file, an entry is added. If it does exist, the password is changed.
password
    The plaintext password to be encrypted and stored in the DBM file. Used only with the -b flag.
-TDBTYPE
    Type of DBM file (SDBM, GDBM, DB, or "default").

Bugs

One should be aware that there are a number of different DBM file formats in existence, and in all likelihood, libraries for more than one format may exist on your system. The four primary examples are SDBM, NDBM, GNU GDBM, and Berkeley/Sleepycat DB 2/3/4. Unfortunately, all these libraries use different file formats, and you must make sure that the file format used by filename is the same format that htdbm expects to see. htdbm currently has no way of determining what type of DBM file it is looking at. If used against the wrong format, it will simply return nothing, or may create a different DBM file with a different name, or at worst, it may corrupt the DBM file if you were attempting to write to it.

One can usually use the file program supplied with most Unix systems to see what format a DBM file is in.

Exit Status

htdbm returns a zero status ("true") if the username and password have been successfully added or updated in the DBM file.
htdbm returns 1 if it encounters some problem accessing files, 2 if there was a syntax problem with the command line, 3 if the password was entered interactively and the verification entry didn't match, 4 if its operation was interrupted, 5 if a value is too long (username, filename, password, or final computed record), 6 if the username contains illegal characters (see the Restrictions section), and 7 if the file is not a valid DBM password file.

Examples

    htdbm /usr/local/etc/apache/.htdbm-users jsmith

Adds or modifies the password for user jsmith. The user is prompted for the password. If executed on a Windows system, the password will be encrypted using the modified Apache MD5 algorithm; otherwise, the system's crypt() routine will be used. If the file does not exist, htdbm will do nothing except return an error.

    htdbm -c /home/doe/public_html/.htdbm jane

Creates a new file and stores a record in it for user jane. The user is prompted for the password. If the file exists and cannot be read, or cannot be written, it is not altered and htdbm will display a message and return an error status.

    htdbm -mb /usr/web/.htdbm-all jones Pwd4Steve

Encrypts the password from the command line (Pwd4Steve) using the MD5 algorithm, and stores it in the specified file.

Security Considerations

Web password files such as those managed by htdbm should not be within the Web server's URI space; that is, they should not be fetchable with a browser.

The use of the -b option is discouraged, since when it is used the unencrypted password appears on the command line.

When using the crypt() algorithm, note that only the first 8 characters of the password are used to form the password. If the supplied password is longer, the extra characters will be silently discarded.

The SHA encryption format does not use salting: for a given password, there is only one encrypted representation.
The crypt() and MD5 formats permute the representation by prepending a random salt string, to make dictionary attacks against the passwords more difficult.

The SHA and crypt() formats are insecure by today's standards.

Restrictions

On the Windows platform, passwords encrypted with htdbm are limited to no more than 255 characters in length. Longer passwords will be truncated to 255 characters.

The MD5 algorithm used by htdbm is specific to the Apache software; passwords encrypted using it will not be usable with other Web servers.

Usernames are limited to 255 bytes and may not include the character :.

8.12 htdigest - manage user files for digest authentication

htdigest is used to create and update the flat-files used to store usernames, realm and password for digest authentication of HTTP users. Resources available from the Apache HTTP server can be restricted to just the users listed in the files created by htdigest.

This manual page only lists the command line arguments. For details of the directives necessary to configure digest authentication in httpd see the Apache manual, which is part of the Apache distribution or can be found at http://httpd.apache.org/.

See also
• httpd
• mod_auth_digest

Synopsis

htdigest [ -c ] passwdfile realm username

Options

-c
    Create the passwdfile. If passwdfile already exists, it is deleted first.
passwdfile
    Name of the file to contain the username, realm and password. If -c is given, this file is created if it does not already exist, or deleted and recreated if it does exist.
realm
    The realm name to which the user name belongs. See http://tools.ietf.org/html/rfc2617#section-3.2.1 for more details.
username
    The user name to create or update in passwdfile. If username does not exist in this file, an entry is added. If it does exist, the password is changed.

Security Considerations

This program is not safe as a setuid executable. Do not make it setuid.
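To make the role of the realm argument concrete, the following is a minimal sketch of how one record in an htdigest-style file can be built, assuming the usual `username:realm:MD5(username:realm:password)` layout used for RFC 2617 digest authentication. It is an illustration only, not the htdigest source; the helper name `digest_record` and the sample credentials are hypothetical.

```python
import hashlib


def digest_record(username: str, realm: str, password: str) -> str:
    """Build one 'username:realm:hash' line (hypothetical helper).

    Assumption: the stored hash is the hex MD5 of 'username:realm:password',
    i.e. the HA1 value from RFC 2617.
    """
    ha1 = hashlib.md5(f"{username}:{realm}:{password}".encode("utf-8")).hexdigest()
    return f"{username}:{realm}:{ha1}"


if __name__ == "__main__":
    # Sample values are made up for illustration.
    print(digest_record("jsmith", "Private Area", "example-password"))
```

Because the realm is part of the hashed string, the same username and password produce a different record for each realm, which is why the realm must be given on the command line.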
8.13 htpasswd - Manage user files for basic authentication

htpasswd is used to create and update the flat-files used to store usernames and passwords for basic authentication of HTTP users. If htpasswd cannot access a file, such as not being able to write to the output file or not being able to read the file in order to update it, it returns an error status and makes no changes.

Resources available from the Apache HTTP server can be restricted to just the users listed in the files created by htpasswd. This program can only manage usernames and passwords stored in a flat-file. It can encrypt and display password information for use in other types of data stores, though. To use a DBM database see dbmmanage or htdbm.

htpasswd encrypts passwords using either bcrypt, a version of MD5 modified for Apache, SHA1, or the system's crypt() routine. Files managed by htpasswd may contain a mixture of different encoding types of passwords; some user records may have bcrypt or MD5-encrypted passwords while others in the same file may have passwords encrypted with crypt().

This manual page only lists the command line arguments. For details of the directives necessary to configure user authentication in httpd see the Apache manual, which is part of the Apache distribution or can be found at http://httpd.apache.org/.

See also
• httpd
• htdbm
• The scripts in support/SHA1 which come with the distribution.

Synopsis

htpasswd [ -c ] [ -i ] [ -m | -B | -d | -s | -p ] [ -C cost ] [ -D ] [ -v ] passwdfile username
htpasswd -b [ -c ] [ -m | -B | -d | -s | -p ] [ -C cost ] [ -D ] [ -v ] passwdfile username password
htpasswd -n [ -i ] [ -m | -B | -d | -s | -p ] [ -C cost ] username
htpasswd -nb [ -m | -B | -d | -s | -p ] [ -C cost ] username password

Options

-b
    Use batch mode; i.e., get the password from the command line rather than prompting for it.
    This option should be used with extreme care, since the password is clearly visible on the command line. For script use see the -i option. Available in 2.4.4 and later.
-i
    Read the password from stdin without verification (for script usage).
-c
    Create the passwdfile. If passwdfile already exists, it is rewritten and truncated. This option cannot be combined with the -n option.
-n
    Display the results on standard output rather than updating a file. This is useful for generating password records acceptable to Apache for inclusion in non-text data stores. This option changes the syntax of the command line, since the passwdfile argument (usually the first one) is omitted. It cannot be combined with the -c option.
-m
    Use MD5 encryption for passwords. This is the default (since version 2.2.18).
-B
    Use bcrypt encryption for passwords. This is currently considered to be very secure.
-C
    This flag is only allowed in combination with -B (bcrypt encryption). It sets the computing time used for the bcrypt algorithm (higher is more secure but slower, default: 5, valid: 4 to 31).
-d
    Use crypt() encryption for passwords. This is not supported by the httpd server on Windows and Netware. This algorithm limits the password length to 8 characters. This algorithm is insecure by today's standards. It used to be the default algorithm until version 2.2.17.
-s
    Use SHA encryption for passwords. Facilitates migration from/to Netscape servers using the LDAP Directory Interchange Format (ldif). This algorithm is insecure by today's standards.
-p
    Use plaintext passwords. Though htpasswd will support creation on all platforms, the httpd daemon will only accept plain text passwords on Windows and Netware.
-D
    Delete user. If the username exists in the specified htpasswd file, it will be deleted.
-v
    Verify password. Verify that the given password matches the password of the user stored in the specified htpasswd file.
    Available in 2.4.5 and later.
passwdfile
    Name of the file to contain the user name and password. If -c is given, this file is created if it does not already exist, or rewritten and truncated if it does exist.
username
    The username to create or update in passwdfile. If username does not exist in this file, an entry is added. If it does exist, the password is changed.
password
    The plaintext password to be encrypted and stored in the file. Only used with the -b flag.

Exit Status

htpasswd returns a zero status ("true") if the username and password have been successfully added or updated in the passwdfile. htpasswd returns 1 if it encounters some problem accessing files, 2 if there was a syntax problem with the command line, 3 if the password was entered interactively and the verification entry didn't match, 4 if its operation was interrupted, 5 if a value is too long (username, filename, password, or final computed record), 6 if the username contains illegal characters (see the Restrictions section), and 7 if the file is not a valid password file.

Examples

    htpasswd /usr/local/etc/apache/.htpasswd-users jsmith

Adds or modifies the password for user jsmith. The user is prompted for the password. The password will be encrypted using the modified Apache MD5 algorithm. If the file does not exist, htpasswd will do nothing except return an error.

    htpasswd -c /home/doe/public_html/.htpasswd jane

Creates a new file and stores a record in it for user jane. The user is prompted for the password. If the file exists and cannot be read, or cannot be written, it is not altered and htpasswd will display a message and return an error status.

    htpasswd -db /usr/web/.htpasswd-all jones Pwd4Steve

Encrypts the password from the command line (Pwd4Steve) using the crypt() algorithm, and stores it in the specified file.
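The -s option listed above stores an unsalted digest, which is why it is flagged as insecure. The sketch below illustrates the point under the assumption that an SHA record has the form `username:{SHA}` followed by the Base64-encoded SHA-1 of the password. The helper name `sha_entry` is hypothetical, and this is not the htpasswd implementation.

```python
import base64
import hashlib


def sha_entry(username: str, password: str) -> str:
    """Build a 'username:{SHA}...' record (hypothetical helper).

    Assumption: the stored value is Base64(SHA-1(password)) with no salt.
    """
    digest = hashlib.sha1(password.encode("utf-8")).digest()
    return f"{username}:{{SHA}}{base64.b64encode(digest).decode('ascii')}"


if __name__ == "__main__":
    # Two different users with the same password get the same hash part,
    # so a precomputed dictionary works against every record at once.
    a = sha_entry("jones", "Pwd4Steve").split(":", 1)[1]
    b = sha_entry("jane", "Pwd4Steve").split(":", 1)[1]
    print(a == b)
```

With a salted format such as bcrypt or the Apache MD5 variant, the two hash parts would differ even though the passwords are equal.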
Security Considerations

Web password files such as those managed by htpasswd should not be within the Web server's URI space; that is, they should not be fetchable with a browser.

This program is not safe as a setuid executable. Do not make it setuid.

The use of the -b option is discouraged, since when it is used the unencrypted password appears on the command line.

When using the crypt() algorithm, note that only the first 8 characters of the password are used to form the password. If the supplied password is longer, the extra characters will be silently discarded.

The SHA encryption format does not use salting: for a given password, there is only one encrypted representation. The crypt() and MD5 formats permute the representation by prepending a random salt string, to make dictionary attacks against the passwords more difficult.

The SHA and crypt() formats are insecure by today's standards.

Restrictions

On the Windows platform, passwords encrypted with htpasswd are limited to no more than 255 characters in length. Longer passwords will be truncated to 255 characters.

The MD5 algorithm used by htpasswd is specific to the Apache software; passwords encrypted using it will not be usable with other Web servers.

Usernames are limited to 255 bytes and may not include the character :.

8.14 httxt2dbm - Generate dbm files for use with RewriteMap

httxt2dbm is used to generate dbm files from text input, for use in RewriteMap with the dbm map type.

If the output file already exists, it will not be truncated. New keys will be added and existing keys will be updated.

See also
• httpd
• mod_rewrite

Synopsis

httxt2dbm [ -v ] [ -f DBM_TYPE ] -i SOURCE_TXT -o OUTPUT_DBM

Options

-v
    More verbose output
-f DBM_TYPE
    Specify the DBM type to be used for the output. If not specified, will use the APR Default.
    Available types are: GDBM for GDBM files, SDBM for SDBM files, DB for Berkeley DB files, NDBM for NDBM files, default for the default DBM type.
-i SOURCE_TXT
    Input file from which the dbm is to be created. The file should be formatted with one record per line, of the form: key value. See the documentation for RewriteMap for further details of this file's format and meaning.
-o OUTPUT_DBM
    Name of the output dbm files.

Examples

    httxt2dbm -i rewritemap.txt -o rewritemap.dbm
    httxt2dbm -f SDBM -i rewritemap.txt -o rewritemap.dbm

8.15 logresolve - Resolve IP-addresses to hostnames in Apache log files

logresolve is a post-processing program to resolve IP-addresses in Apache's access logfiles. To minimize impact on your nameserver, logresolve has its very own internal hash-table cache. This means that each IP number will only be looked up the first time it is found in the log file.

Takes an Apache log file on standard input. The IP addresses must be the first thing on each line and must be separated from the remainder of the line by a space.

Synopsis

logresolve [ -s filename ] [ -c ] < access_log > access_log.new

Options

-s filename
    Specifies a filename to record statistics.
-c
    This causes logresolve to apply some DNS checks: after finding the hostname from the IP address, it looks up the IP addresses for the hostname and checks that one of these matches the original address.

8.16 log_server_status - Log periodic status summaries

This perl script is designed to be run at a frequent interval by something like cron. It connects to the server and downloads the status information. It reformats the information to a single line and logs it to a file. Adjust the variables at the top of the script to specify the location of the resulting logfile. mod_status will need to be loaded and configured in order for this script to do its job.
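The "reformat to a single line" step can be pictured with the following sketch. It is not the perl script itself: it assumes the machine-readable status output consists of `Key: Value` lines, and the function name `summarize_status`, the chosen field names, and the colon-separated output layout are illustrative assumptions.

```python
def summarize_status(auto_text: str,
                     fields=("Total Accesses", "BusyWorkers", "IdleWorkers")) -> str:
    """Collapse 'Key: Value' status lines into one colon-separated line.

    Hypothetical helper: missing fields are written as '-'.
    """
    values = {}
    for line in auto_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that are not 'Key: Value'
            values[key.strip()] = value.strip()
    return ":".join(values.get(name, "-") for name in fields)


if __name__ == "__main__":
    # Sample text is made up for illustration.
    sample = "Total Accesses: 10\nBusyWorkers: 2\nIdleWorkers: 8\n"
    print(summarize_status(sample))
```

A cron job could append one such line per run to a dated file, which gives a compact series suitable for later statistical analysis.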
Usage

The script contains the following section.

    my $wherelog = "/usr/local/apache2/logs/"; # Logs will be like "/usr/local/apache2/logs/19
    my $server = "localhost"; # Name of server, could be "www.foo.com"
    my $port = "80"; # Port on server
    my $request = "/server-status/?auto"; # Request to send

You'll need to ensure that these variables have the correct values, and you'll need to have the /server-status handler configured at the location specified, and the specified log location needs to be writable by the user which will run the script.

Run the script periodically via cron to produce a daily log file, which can then be used for statistical analysis.

8.17 rotatelogs - Piped logging program to rotate Apache logs

rotatelogs is a simple program for use in conjunction with Apache's piped logfile feature. It supports rotation based on a time interval or maximum size of the log.

Synopsis

rotatelogs [ -l ] [ -L linkname ] [ -p program ] [ -f ] [ -D ] [ -t ] [ -v ] [ -e ] [ -c ] [ -n number-of-files ] logfile rotationtime|filesize(B|K|M|G) [ offset ]

Options

-l
    Causes the use of local time rather than GMT as the base for the interval or for strftime(3) formatting with size-based rotation.
-L linkname
    Causes a hard link to be made from the current logfile to the specified link name. This can be used to watch the log continuously across rotations using a command like tail -F linkname.
-p program
    If given, rotatelogs will execute the specified program every time a new log file is opened. The filename of the newly opened file is passed as the first argument to the program. If executing after a rotation, the old log file is passed as the second argument. rotatelogs does not wait for the specified program to terminate before continuing to operate, and will not log any error code returned on termination.
    The spawned program uses the same stdin, stdout, and stderr as rotatelogs itself, and also inherits the environment.
-f
    Causes the logfile to be opened immediately, as soon as rotatelogs starts, instead of waiting for the first logfile entry to be read (for non-busy sites, there may be a substantial delay between when the server is started and when the first request is handled, meaning that the associated logfile does not "exist" until then, which causes problems for some automated logging tools).
-D
    Creates the parent directories of the path that the log file will be placed in if they do not already exist. This allows strftime(3) formatting to be used in the path and not just the filename.
-t
    Causes the logfile to be truncated instead of rotated. This is useful when a log is processed in real time by a command like tail, and there is no need for archived data. No suffix will be added to the filename; however, format strings containing '%' characters will be respected.
-v
    Produce verbose output on STDERR. The output contains the result of the configuration parsing, and all file open and close actions.
-e
    Echo logs through to stdout. Useful when logs need to be further processed in real time by a further tool in the chain.
-c
    Create log file for each interval, even if empty.
-n number-of-files
    Use a circular list of filenames without timestamps. With -n 3, the series of log files opened would be "logfile", "logfile.1", "logfile.2", then overwriting "logfile". Available in 2.4.5 and later.
logfile
    The path plus basename of the logfile. If logfile includes any '%' characters, it is treated as a format string for strftime(3). Otherwise, the suffix .nnnnnnnnnn is automatically added and is the time in seconds (unless the -t option is used). Both formats compute the start time from the beginning of the current period.
    For example, if a rotation time of 86400 is specified, the hour, minute, and second fields created from the strftime(3) format will all be zero, referring to the beginning of the current 24-hour period (midnight).

    When using strftime(3) filename formatting, be sure the log file format has enough granularity to produce a different file name each time the logs are rotated. Otherwise rotation will overwrite the same file instead of starting a new one. For example, if logfile was /var/log/errorlog.%Y-%m-%d with log rotation at 5 megabytes, but 5 megabytes was reached twice in the same day, the same log file name would be produced and log rotation would keep writing to the same file.
rotationtime
    The time between log file rotations in seconds. The rotation occurs at the beginning of this interval. For example, if the rotation time is 3600, the log file will be rotated at the beginning of every hour; if the rotation time is 86400, the log file will be rotated every night at midnight. (If no data is logged during an interval, no file will be created.)
filesize(B|K|M|G)
    The maximum file size, followed by exactly one of the letters B (Bytes), K (KBytes), M (MBytes) or G (GBytes).

    When time and size are specified, the size must be given after the time. Rotation will occur whenever either time or size limits are reached.
offset
    The number of minutes offset from UTC. If omitted, zero is assumed and UTC is used. For example, to use local time in the zone UTC -5 hours, specify a value of -300 for this argument. In most cases, -l should be used instead of specifying an offset.

Examples

    CustomLog "|bin/rotatelogs /var/log/logfile 86400" common

This creates the files /var/log/logfile.nnnn where nnnn is the system time at which the log nominally starts (this time will always be a multiple of the rotation time, so you can synchronize cron scripts with it).
At the end of each rotation time (here after 24 hours) a new log is started.

    CustomLog "|bin/rotatelogs -l /var/log/logfile.%Y.%m.%d 86400" common

This creates the files /var/log/logfile.yyyy.mm.dd where yyyy is the year, mm is the month, and dd is the day of the month. Logging will switch to a new file every day at midnight, local time.

    CustomLog "|bin/rotatelogs /var/log/logfile 5M" common

This configuration will rotate the logfile whenever it reaches a size of 5 megabytes.

    ErrorLog "|bin/rotatelogs /var/log/errorlog.%Y-%m-%d-%H_%M_%S 5M"

This configuration will rotate the error logfile whenever it reaches a size of 5 megabytes, and the suffix to the logfile name will be created of the form errorlog.YYYY-mm-dd-HH_MM_SS.

    CustomLog "|bin/rotatelogs -t /var/log/logfile 86400" common

This creates the file /var/log/logfile, truncating the file at startup and then truncating the file once per day. It is expected in this scenario that a separate process (such as tail) would process the file in real time.

Portability

The following logfile format string substitutions should be supported by all strftime(3) implementations; see the strftime(3) man page for library-specific extensions.

    %A    full weekday name (localized)
    %a    3-character weekday name (localized)
    %B    full month name (localized)
    %b    3-character month name (localized)
    %c    date and time (localized)
    %d    2-digit day of month
    %H    2-digit hour (24 hour clock)
    %I    2-digit hour (12 hour clock)
    %j    3-digit day of year
    %M    2-digit minute
    %m    2-digit month
    %p    am/pm of 12 hour clock (localized)
    %S    2-digit second
    %U    2-digit week of year (Sunday first day of week)
    %W    2-digit week of year (Monday first day of week)
    %w    1-digit weekday (Sunday first day of week)
    %X    time (localized)
    %x    date (localized)
    %Y    4-digit year
    %y    2-digit year
    %Z    time zone name
    %%    literal '%'
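The naming rules for rotatelogs can be checked with a little arithmetic. The sketch below is an illustration, not the rotatelogs source: it assumes UTC with no offset, assumes the default suffix is the period start zero-padded to ten digits (matching the .nnnnnnnnnn form mentioned above), and uses hypothetical helper names.

```python
import time


def period_start(now: int, rotation_time: int) -> int:
    """Start of the rotation period containing `now` (UTC, no offset assumed)."""
    return now - (now % rotation_time)


def default_name(logfile: str, now: int, rotation_time: int) -> str:
    """Name with the numeric suffix; ten-digit padding is an assumption."""
    return f"{logfile}.{period_start(now, rotation_time):010d}"


def strftime_name(pattern: str, when: int) -> str:
    """Expand a strftime(3)-style pattern for the given UTC timestamp."""
    return time.strftime(pattern, time.gmtime(when))


if __name__ == "__main__":
    # The suffix is always a multiple of the rotation time.
    print(default_name("/var/log/logfile", 90061, 86400))

    # Two size-triggered rotations on the same day: a day-level pattern
    # yields the same name twice, a second-level pattern does not.
    first, second = 3600, 7200
    print(strftime_name("errorlog.%Y-%m-%d", first)
          == strftime_name("errorlog.%Y-%m-%d", second))
    print(strftime_name("errorlog.%Y-%m-%d-%H_%M_%S", first)
          == strftime_name("errorlog.%Y-%m-%d-%H_%M_%S", second))
```

This is the reason the size-based ErrorLog example uses a pattern that goes down to seconds: the file name must change whenever a rotation can occur.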
8.18 split-logfile - Split up multi-vhost logfiles

This perl script will take a combined Web server access log file and break its contents into separate files. It assumes that the first field of each line is the virtual host identity, put there using the "%v" variable in LogFormat.

Usage

Create a log file with virtual host information in it:

    LogFormat "%v %h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\"" combined_plus_v
    CustomLog "logs/access_log" combined_plus_vhost

Log files will be created, in the directory where you run the script, for each virtual host name that appears in the combined log file. These logfiles will be named after the hostname, with a .log file extension.

The combined log file is read from stdin. Records read will be appended to any existing log files.

    split-logfile < access_log

8.19 suexec - Switch user before executing external programs

suexec is used by the Apache HTTP Server to switch to another user before executing CGI programs. In order to achieve this, it must run as root. Since the HTTP daemon normally doesn't run as root, the suexec executable needs the setuid bit set and must be owned by root. It should never be writable by anyone other than root.

For further information about the concepts and the security model of suexec please refer to the suexec documentation (http://httpd.apache.org/docs/trunk/suexec.html).

Synopsis

suexec -V

Options

-V
    If you are root, this option displays the compile options of suexec. For security reasons all configuration options are changeable only at compile time.

8.20 Other Programs

This page used to contain documentation for programs which now have their own docs pages. Please update any links.

• log_server_status
• split-logfile

Chapter 9 Apache Miscellaneous Documentation
9.1 Apache Miscellaneous Documentation

Below is a list of additional documentation pages that apply to the Apache web server development project.

Warning: The documents below have not been fully updated to take into account changes made in the 2.1 version of the Apache HTTP Server. Some of the information may still be relevant, but please use it with care.

Performance Notes - Apache Tuning (p. 339)
    Notes about how to (run-time and compile-time) configure Apache for highest performance. Notes explaining why Apache does some things, and why it doesn’t do other things (which make it slower/faster).

Performance Scaling (p. 350)
    Some easily accessible configuration and tuning options for Apache httpd 2.2 and 2.4 as well as monitoring tools.

Security Tips (p. 364)
    Some "do"s - and "don’t"s - for keeping your Apache web site secure.

Relevant Standards (p. 369)
    This document acts as a reference page for most of the relevant standards that Apache follows.

Password Encryption Formats (p. 371)
    Discussion of the various ciphers supported by Apache for authentication purposes.

9.2 Apache Performance Tuning

Apache 2.x is a general-purpose webserver, designed to provide a balance of flexibility, portability, and performance. Although it has not been designed specifically to set benchmark records, Apache 2.x is capable of high performance in many real-world situations.

Compared to Apache 1.3, release 2.x contains many additional optimizations to increase throughput and scalability. Most of these improvements are enabled by default. However, there are compile-time and run-time configuration choices that can significantly affect performance. This document describes the options that a server administrator can configure to tune the performance of an Apache 2.x installation.
Some of these configuration options enable the httpd to better take advantage of the capabilities of the hardware and OS, while others allow the administrator to trade functionality for speed.

Hardware and Operating System Issues

The single biggest hardware issue affecting webserver performance is RAM. A webserver should never ever have to swap, as swapping increases the latency of each request beyond a point that users consider "fast enough". This causes users to hit stop and reload, further increasing the load. You can, and should, control the MaxRequestWorkers setting so that your server does not spawn so many children that it starts swapping. The procedure for doing this is simple: determine the size of your average Apache process, by looking at your process list via a tool such as top, and divide this into your total available memory, leaving some room for other processes.

Beyond that the rest is mundane: get a fast enough CPU, a fast enough network card, and fast enough disks, where "fast enough" is something that needs to be determined by experimentation.

Operating system choice is largely a matter of local concerns. But some guidelines that have proven generally useful are:

• Run the latest stable release and patch level of the operating system that you choose. Many OS suppliers have introduced significant performance improvements to their TCP stacks and thread libraries in recent years.

• If your OS supports a sendfile(2) system call, make sure you install the release and/or patches needed to enable it. (With Linux, for example, this means using Linux 2.4 or later. For early releases of Solaris 8, you may need to apply a patch.) On systems where it is available, sendfile enables Apache 2 to deliver static content faster and with lower CPU utilization.
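The MaxRequestWorkers sizing procedure described in this section reduces to a single division. The following C sketch shows the arithmetic only; the function name estimate_max_request_workers and the figures in the usage note are hypothetical, and the average process size has to come from your own process list.

```c
/*
 * Rough upper bound for MaxRequestWorkers (illustrative helper, not part
 * of httpd): the memory left after reserving room for other processes,
 * divided by the average size of one Apache process. All sizes in MB.
 * Returns 0 when the inputs cannot yield a sensible value.
 */
unsigned long estimate_max_request_workers(unsigned long total_mb,
                                           unsigned long reserved_mb,
                                           unsigned long avg_process_mb)
{
    if (avg_process_mb == 0 || reserved_mb >= total_mb) {
        return 0;
    }
    return (total_mb - reserved_mb) / avg_process_mb;
}
```

With 4096 MB of RAM, 1024 MB set aside for the OS and other processes, and an average httpd process of 32 MB, estimate_max_request_workers(4096, 1024, 32) returns 96. Treat the result as a starting point to be confirmed by observing the server under load, not as a guarantee that swapping cannot occur.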
Run-Time Configuration Issues

Related Modules: mod_dir, mpm_common, mod_status

Related Directives: AllowOverride, DirectoryIndex, HostnameLookups, EnableMMAP, EnableSendfile, KeepAliveTimeout, MaxSpareServers, MinSpareServers, Options, StartServers

HostnameLookups and other DNS considerations

Prior to Apache 1.3, HostnameLookups defaulted to On. This adds latency to every request because it requires a DNS lookup to complete before the request is finished. In Apache 1.3 this setting defaults to Off. If you need to have addresses in your log files resolved to hostnames, use the logresolve program that comes with Apache, or one of the numerous log reporting packages which are available.

It is recommended that you do this sort of postprocessing of your log files on some machine other than the production web server machine, so that this activity does not adversely affect server performance.

If you use any Allow from domain or Deny from domain directives (i.e., using a hostname, or a domain name, rather than an IP address) then you will pay for two DNS lookups (a reverse, followed by a forward lookup to make sure that the reverse is not being spoofed). For best performance, therefore, use IP addresses, rather than names, when using these directives, if possible.

Note that it’s possible to scope the directives, such as within a <Location "/server-status"> section. In this case the DNS lookups are only performed on requests matching the criteria. Here’s an example which disables lookups except for .html and .cgi files:

    HostnameLookups off
    <Files ~ "\.(html|cgi)$">
        HostnameLookups on
    </Files>

But even still, if you just need DNS names in some CGIs you could consider doing the gethostbyname call in the specific CGIs that need it.

FollowSymLinks and SymLinksIfOwnerMatch

Wherever in your URL-space you do not have an Options FollowSymLinks, or you do have an Options SymLinksIfOwnerMatch, Apache will need to issue extra system calls to check up on symlinks.
(One extra call per filename component.) For example, if you had:

    DocumentRoot "/www/htdocs"
    <Directory "/">
        Options SymLinksIfOwnerMatch
    </Directory>

and a request is made for the URI /index.html, then Apache will perform lstat(2) on /www, /www/htdocs, and /www/htdocs/index.html. The results of these lstats are never cached, so they will occur on every single request. If you really desire the symlinks security checking, you can do something like this:

    DocumentRoot "/www/htdocs"
    <Directory "/">
        Options FollowSymLinks
    </Directory>

    <Directory "/www/htdocs">
        Options -FollowSymLinks +SymLinksIfOwnerMatch
    </Directory>

This at least avoids the extra checks for the DocumentRoot path. Note that you’ll need to add similar sections if you have any Alias or RewriteRule paths outside of your document root. For highest performance, and no symlink protection, set FollowSymLinks everywhere, and never set SymLinksIfOwnerMatch.

AllowOverride

Wherever in your URL-space you allow overrides (typically .htaccess files), Apache will attempt to open .htaccess for each filename component. For example, if you had:

    DocumentRoot "/www/htdocs"
    <Directory "/">
        AllowOverride all
    </Directory>

and a request is made for the URI /index.html, then Apache will attempt to open /.htaccess, /www/.htaccess, and /www/htdocs/.htaccess. The solutions are similar to the previous case of Options FollowSymLinks. For highest performance use AllowOverride None everywhere in your filesystem.

Negotiation

If at all possible, avoid content negotiation if you’re really interested in every last ounce of performance. In practice the benefits of negotiation outweigh the performance penalties. There’s one case where you can speed up the server. Instead of using a wildcard such as:

    DirectoryIndex index

Use a complete list of options:

    DirectoryIndex index.cgi index.pl index.shtml index.html

where you list the most common choice first.
Also note that explicitly creating a type-map file provides better performance than using MultiViews, as the necessary information can be determined by reading this single file, rather than having to scan the directory for files.

If your site needs content negotiation, consider using type-map files, rather than the Options MultiViews directive to accomplish the negotiation. See the Content Negotiation (p. 78) documentation for a full discussion of the methods of negotiation, and instructions for creating type-map files.

Memory-mapping

In situations where Apache 2.x needs to look at the contents of a file being delivered (for example, when doing server-side-include processing), it normally memory-maps the file if the OS supports some form of mmap(2).

On some platforms, this memory-mapping improves performance. However, there are cases where memory-mapping can hurt the performance or even the stability of the httpd:

• On some operating systems, mmap does not scale as well as read(2) when the number of CPUs increases. On multiprocessor Solaris servers, for example, Apache 2.x sometimes delivers server-parsed files faster when mmap is disabled.

• If you memory-map a file located on an NFS-mounted filesystem and a process on another NFS client machine deletes or truncates the file, your process may get a bus error the next time it tries to access the mapped file content.

For installations where either of these factors applies, you should use EnableMMAP off to disable the memory-mapping of delivered files. (Note: This directive can be overridden on a per-directory basis.)

Sendfile

In situations where Apache 2.x can ignore the contents of the file to be delivered (for example, when serving static file content), it normally uses the kernel sendfile support for the file if the OS supports the sendfile(2) operation.

On most platforms, using sendfile improves performance by eliminating separate read and send mechanics.
However, there are cases where using sendfile can harm the stability of the httpd:

• Some platforms may have broken sendfile support that the build system did not detect, especially if the binaries were built on another box and moved to such a machine with broken sendfile support.

• With an NFS-mounted filesystem, the kernel may be unable to reliably serve the network file through its own cache.

For installations where either of these factors applies, you should use EnableSendfile off to disable sendfile delivery of file contents. (Note: This directive can be overridden on a per-directory basis.)

Process Creation

Prior to Apache 1.3 the MinSpareServers, MaxSpareServers, and StartServers settings all had drastic effects on benchmark results. In particular, Apache required a "ramp-up" period in order to reach a number of children sufficient to serve the load being applied. After the initial spawning of StartServers children, only one child per second would be created to satisfy the MinSpareServers setting. So a server being accessed by 100 simultaneous clients, using the default StartServers of 5 would take on the order of 95 seconds to spawn enough children to handle the load. This works fine in practice on real-life servers because they aren’t restarted frequently. But it does really poorly on benchmarks which might only run for ten minutes.

The one-per-second rule was implemented in an effort to avoid swamping the machine with the startup of new children. If the machine is busy spawning children, it can’t service requests. But it has such a drastic effect on the perceived performance of Apache that it had to be replaced. As of Apache 1.3, the code will relax the one-per-second rule. It will spawn one, wait a second, then spawn two, wait a second, then spawn four, and it will continue exponentially until it is spawning 32 children per second. It will stop whenever it satisfies the MinSpareServers setting.
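The difference between the two spawning rules can be made concrete with a small model. The C sketch below is illustrative only and is not taken from the httpd source; the function names are hypothetical, and each "round" stands for one spawn-then-wait-a-second step as described above.

```c
/*
 * Count the one-second spawn rounds needed to create `needed` children.
 * Illustrative model only, not httpd source code.
 */

/* Pre-1.3 rule: exactly one child is spawned per round. */
int rounds_one_per_second(int needed)
{
    return needed > 0 ? needed : 0;
}

/* 1.3 rule as described in the text: 1, 2, 4, ... capped at 32 per round. */
int rounds_exponential(int needed)
{
    int rate = 1;     /* children spawned in the current round */
    int spawned = 0;  /* children created so far */
    int rounds = 0;

    while (spawned < needed) {
        spawned += rate;
        rounds++;
        if (rate < 32) {
            rate *= 2;
        }
    }
    return rounds;
}
```

For the 95 additional children in the example above, rounds_one_per_second(95) gives 95 while rounds_exponential(95) gives 7, since 1 + 2 + 4 + 8 + 16 + 32 + 32 = 95.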
This exponential ramp-up appears to be responsive enough that it’s almost unnecessary to twiddle the MinSpareServers, MaxSpareServers and StartServers knobs. When more than 4 children are spawned per second, a message will be emitted to the ErrorLog. If you see a lot of these errors, then consider tuning these settings. Use the mod_status output as a guide.

Related to process creation is process death induced by the MaxConnectionsPerChild setting. By default this is 0, which means that there is no limit to the number of connections handled per child. If your configuration currently has this set to some very low number, such as 30, you may want to bump this up significantly. If you are running SunOS or an old version of Solaris, limit this to 10000 or so because of memory leaks.

When keep-alives are in use, children will be kept busy doing nothing waiting for more requests on the already open connection. The default KeepAliveTimeout of 5 seconds attempts to minimize this effect. The tradeoff here is between network bandwidth and server resources. In no event should you raise this above about 60 seconds, as most of the benefits are lost [1].

[1] http://www.hpl.hp.com/techreports/Compaq-DEC/WRL-95-4.html

Compile-Time Configuration Issues

Choosing an MPM

Apache 2.x supports pluggable concurrency models, called Multi-Processing Modules (p. 90) (MPMs). When building Apache, you must choose an MPM to use. There are platform-specific MPMs for some platforms: mpm_netware, mpmt_os2, and mpm_winnt. For general Unix-type systems, there are several MPMs from which to choose. The choice of MPM can affect the speed and scalability of the httpd:

• The worker MPM uses multiple child processes with many threads each. Each thread handles one connection at a time. Worker generally is a good choice for high-traffic servers because it has a smaller memory footprint than the prefork MPM.
• The event MPM is threaded like the worker MPM, but is designed to allow more requests to be served simultaneously by passing off some processing work to supporting threads, freeing up the main threads to work on new requests.

• The prefork MPM uses multiple child processes with one thread each. Each process handles one connection at a time. On many systems, prefork is comparable in speed to worker, but it uses more memory. Prefork’s threadless design has advantages over worker in some situations: it can be used with non-thread-safe third-party modules, and it is easier to debug on platforms with poor thread debugging support.

For more information on these and other MPMs, please see the MPM documentation (p. 90).

Modules

Since memory usage is such an important consideration in performance, you should attempt to eliminate modules that you are not actually using. If you have built the modules as DSOs (p. 68), eliminating modules is a simple matter of commenting out the associated LoadModule directive for that module. This allows you to experiment with removing modules and seeing if your site still functions in their absence.

If, on the other hand, you have modules statically linked into your Apache binary, you will need to recompile Apache in order to remove unwanted modules.

An associated question that arises here is, of course, what modules you need, and which ones you don’t. The answer here will, of course, vary from one web site to another. However, the minimal list of modules which you can get by with tends to include mod_mime, mod_dir, and mod_log_config. mod_log_config is, of course, optional, as you can run a web site without log files. This is, however, not recommended.

Atomic Operations

Some modules, such as mod_cache and recent development builds of the worker MPM, use APR’s atomic API. This API provides atomic operations that can be used for lightweight thread synchronization.
By default, APR implements these operations using the most efficient mechanism available on each target OS/CPU platform. Many modern CPUs, for example, have an instruction that does an atomic compare-and-swap (CAS) operation in hardware. On some platforms, however, APR defaults to a slower, mutex-based implementation of the atomic API in order to ensure compatibility with older CPU models that lack such instructions. If you are building Apache for one of these platforms, and you plan to run only on newer CPUs, you can select a faster atomic implementation at build time by configuring Apache with the --enable-nonportable-atomics option:

    ./buildconf
    ./configure --with-mpm=worker --enable-nonportable-atomics=yes

The --enable-nonportable-atomics option is relevant for the following platforms:

• Solaris on SPARC
  By default, APR uses mutex-based atomics on Solaris/SPARC. If you configure with --enable-nonportable-atomics, however, APR generates code that uses a SPARC v8plus opcode for fast hardware compare-and-swap. If you configure Apache with this option, the atomic operations will be more efficient (allowing for lower CPU utilization and higher concurrency), but the resulting executable will run only on UltraSPARC chips.

• Linux on x86
  By default, APR uses mutex-based atomics on Linux. If you configure with --enable-nonportable-atomics, however, APR generates code that uses a 486 opcode for fast hardware compare-and-swap. This will result in more efficient atomic operations, but the resulting executable will run only on 486 and later chips (and not on 386).

mod_status and ExtendedStatus On

If you include mod_status and you also set ExtendedStatus On when building and running Apache, then on every request Apache will perform two calls to gettimeofday(2) (or times(2) depending on your operating system), and (pre-1.3) several extra calls to time(2). This is all done so that the status report contains timing indications.
For highest performance, set ExtendedStatus off (which is the default).

accept Serialization - Multiple Sockets

Warning: This section has not been fully updated to take into account changes made in the 2.x version of the Apache HTTP Server. Some of the information may still be relevant, but please use it with care.

This discusses a shortcoming in the Unix socket API. Suppose your web server uses multiple Listen statements to listen on either multiple ports or multiple addresses. In order to test each socket to see if a connection is ready, Apache uses select(2). select(2) indicates that a socket has zero or at least one connection waiting on it. Apache’s model includes multiple children, and all the idle ones test for new connections at the same time. A naive implementation looks something like this (these examples do not match the code, they’re contrived for pedagogical purposes):

    for (;;) {
        for (;;) {
            fd_set accept_fds;

            /* watch every listening socket */
            FD_ZERO (&accept_fds);
            for (i = first_socket; i <= last_socket; ++i) {
                FD_SET (i, &accept_fds);
            }
            /* block until at least one socket has a connection waiting */
            rc = select (last_socket+1, &accept_fds, NULL, NULL, NULL);
            if (rc < 1) continue;
            /* try to accept from the first ready socket */
            new_connection = -1;
            for (i = first_socket; i <= last_socket; ++i) {
                if (FD_ISSET (i, &accept_fds)) {
                    new_connection = accept (i, NULL, NULL);
                    if (new_connection != -1) break;
                }
            }
            if (new_connection != -1) break;
        }
        process_the(new_connection);
    }

But this naive implementation has a serious starvation problem. Recall that multiple children execute this loop at the same time, and so multiple children will block at select when they are in between requests. All those blocked children will awaken and return from select when a single request appears on any socket. (The number of children which awaken varies depending on the operating system and timing issues.) They will all then fall down into the loop and try to accept the connection. But only one will succeed (assuming there’s still only one connection ready).
The rest will be blocked in accept. This effectively locks those children into serving requests from that one socket and no other sockets, and they’ll be stuck there until enough new requests appear on that socket to wake them all up. This starvation problem was first documented in PR#467 [2]. There are at least two solutions.

One solution is to make the sockets non-blocking. In this case the accept won’t block the children, and they will be allowed to continue immediately. But this wastes CPU time. Suppose you have ten idle children in select, and one connection arrives. Then nine of those children will wake up, try to accept the connection, fail, and loop back into select, accomplishing nothing. Meanwhile none of those children are servicing requests that occurred on other sockets until they get back up to the select again. Overall this solution does not seem very fruitful unless you have as many idle CPUs (in a multiprocessor box) as you have idle children (not a very likely situation).

Another solution, the one used by Apache, is to serialize entry into the inner loop. The loop looks like this (differences from the naive version are marked with comments):

    for (;;) {
        accept_mutex_on ();     /* added: only one child enters at a time */
        for (;;) {
            fd_set accept_fds;

            FD_ZERO (&accept_fds);
            for (i = first_socket; i <= last_socket; ++i) {
                FD_SET (i, &accept_fds);
            }
            rc = select (last_socket+1, &accept_fds, NULL, NULL, NULL);
            if (rc < 1) continue;
            new_connection = -1;
            for (i = first_socket; i <= last_socket; ++i) {
                if (FD_ISSET (i, &accept_fds)) {
                    new_connection = accept (i, NULL, NULL);
                    if (new_connection != -1) break;
                }
            }
            if (new_connection != -1) break;
        }
        accept_mutex_off ();    /* added: let the next child in */
        process_the(new_connection);
    }

The functions accept_mutex_on and accept_mutex_off implement a mutual exclusion semaphore. Only one child can have the mutex at any time. There are several choices for implementing these mutexes. The choice is defined in src/conf.h (pre-1.3) or src/include/ap_config.h (1.3 or later).
Some architectures do not have any locking choice made; on these architectures it is unsafe to use multiple Listen directives.

The Mutex directive can be used to change the mutex implementation of the mpm-accept mutex at run-time. Special considerations for different mutex implementations are documented with that directive.

Another solution that has been considered but never implemented is to partially serialize the loop, that is, let in a certain number of processes. This would only be of interest on multiprocessor boxes where it’s possible that multiple children could run simultaneously, and the serialization actually doesn’t take advantage of the full bandwidth. This is a possible area of future investigation, but priority remains low because highly parallel web servers are not the norm.

Ideally you should run servers without multiple Listen statements if you want the highest performance. But read on.

[2] http://bugs.apache.org/index/full/467

accept Serialization - Single Socket

The above is fine and dandy for multiple socket servers, but what about single socket servers? In theory they shouldn’t experience any of these same problems because all children can just block in accept(2) until a connection arrives, and no starvation results. In practice this hides almost the same "spinning" behavior discussed above in the non-blocking solution. The way that most TCP stacks are implemented, the kernel actually wakes up all processes blocked in accept when a single connection arrives. One of those processes gets the connection and returns to user-space. The rest spin in the kernel and go back to sleep when they discover there’s no connection for them. This spinning is hidden from the user-land code, but it’s there nonetheless. This can result in the same load-spiking wasteful behavior that a non-blocking solution to the multiple sockets case can.
For this reason we have found that many architectures behave more "nicely" if we serialize even the single socket case. So this is actually the default in almost all cases. Crude experiments under Linux (2.0.30 on a dual Pentium pro 166 w/128MB RAM) have shown that the serialization of the single socket case causes less than a 3% decrease in requests per second over unserialized single-socket. But unserialized single-socket showed an extra 100ms latency on each request. This latency is probably a wash on long haul lines, and only an issue on LANs. If you want to override the single socket serialization, you can define SINGLE_LISTEN_UNSERIALIZED_ACCEPT, and then single-socket servers will not serialize at all.

Lingering Close

As discussed in draft-ietf-http-connection-00.txt [3] section 8, in order for an HTTP server to reliably implement the protocol, it needs to shut down each direction of the communication independently. (Recall that a TCP connection is bi-directional. Each half is independent of the other.)

When this feature was added to Apache, it caused a flurry of problems on various versions of Unix because of shortsightedness. The TCP specification does not state that the FIN_WAIT_2 state has a timeout, but it doesn’t prohibit it. On systems without the timeout, Apache 1.2 induces many sockets stuck forever in the FIN_WAIT_2 state. In many cases this can be avoided by simply upgrading to the latest TCP/IP patches supplied by the vendor. In cases where the vendor has never released patches (i.e., SunOS4, although folks with a source license can patch it themselves), we have decided to disable this feature.

There are two ways to accomplish this. One is the socket option SO_LINGER. But as fate would have it, this has never been implemented properly in most TCP/IP stacks. Even on those stacks with a proper implementation (i.e., Linux 2.0.31), this method proves to be more expensive (cputime) than the next solution.
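For comparison, the SO_LINGER approach mentioned above amounts to a single setsockopt(2) call. The following is a minimal sketch assuming a POSIX sockets API; the helper name enable_so_linger is hypothetical, and this is shown only to illustrate the option, not as the way Apache closes connections.

```c
#include <sys/socket.h>

/*
 * Ask the kernel to linger on close() for up to `seconds` seconds so
 * that unsent data can still be delivered.
 * Returns 0 on success, -1 on error (errno is set by setsockopt).
 */
int enable_so_linger(int fd, int seconds)
{
    struct linger lg;

    lg.l_onoff = 1;         /* enable lingering */
    lg.l_linger = seconds;  /* upper bound on how long close() may block */
    return setsockopt(fd, SOL_SOCKET, SO_LINGER, &lg, sizeof(lg));
}
```

A caller would invoke enable_so_linger(fd, 30) on a connected socket before calling close(fd). As the text notes, the behavior of this option differs between TCP/IP stacks, which is one reason the hand-written approach is preferred.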
For the most part, Apache implements this feature in a function called lingering_close (in http_main.c). The function looks roughly like this:

    void lingering_close (int s)
    {
        char junk_buffer[2048];
        fd_set read_fds;
        struct timeval timeout;
        int rc;

        /* shutdown the sending side */
        shutdown (s, 1);

        /* give up after 30 seconds in total */
        signal (SIGALRM, lingering_death);
        alarm (30);

        for (;;) {
            /* wait up to 2 seconds for s to become readable */
            FD_ZERO (&read_fds);
            FD_SET (s, &read_fds);
            timeout.tv_sec = 2;
            timeout.tv_usec = 0;
            rc = select (s + 1, &read_fds, NULL, NULL, &timeout);
            if (rc < 0) break;      /* error */
            if (rc > 0 && FD_ISSET (s, &read_fds)) {
                if (read (s, junk_buffer, sizeof (junk_buffer)) <= 0) {
                    break;
                }
                /* just toss away whatever is here */
            }
        }
        close (s);
    }

This naturally adds some expense at the end of a connection, but it is required for a reliable implementation. As HTTP/1.1 becomes more prevalent, and all connections are persistent, this expense will be amortized over more requests. If you want to play with fire and disable this feature, you can define NO_LINGCLOSE, but this is not recommended at all. In particular, as HTTP/1.1 pipelined persistent connections come into use, lingering close is an absolute necessity (and pipelined connections are faster [4], so you want to support them).

[3] http://www.ics.uci.edu/pub/ietf/http/draft-ietf-http-connection-00.txt

Scoreboard File

Apache’s parent and children communicate with each other through something called the scoreboard. Ideally this should be implemented in shared memory. For those operating systems that we either have access to, or have been given detailed ports for, it typically is implemented using shared memory. The rest default to using an on-disk file. The on-disk file is not only slow, but it is unreliable (and less featured). Peruse the src/main/conf.h file for your architecture, and look for either USE_MMAP_SCOREBOARD or USE_SHMGET_SCOREBOARD. Defining one of those two (as well as their companions HAVE_MMAP and HAVE_SHMGET respectively) enables the supplied shared memory code. If your system has another type of shared memory, edit the file src/main/http_main.c and add the hooks necessary to use it in Apache.
(Send us back a patch too, please.)

Historical note: The Linux port of Apache didn’t start to use shared memory until version 1.2 of Apache. This oversight resulted in really poor and unreliable behavior of earlier versions of Apache on Linux.

DYNAMIC_MODULE_LIMIT

If you have no intention of using dynamically loaded modules (you probably don’t if you’re reading this and tuning your server for every last ounce of performance), then you should add -DDYNAMIC_MODULE_LIMIT=0 when building your server. This will save RAM that’s allocated only for supporting dynamically loaded modules.

[4] http://www.w3.org/Protocols/HTTP/Performance/Pipeline.html

Appendix: Detailed Analysis of a Trace

Here is a system call trace of Apache 2.0.38 with the worker MPM on Solaris 8. This trace was collected using:

    truss -l -p httpd_child_pid

The -l option tells truss to log the ID of the LWP (lightweight process, Solaris’ form of kernel-level thread) that invokes each system call.

Other systems may have different system call tracing utilities such as strace, ktrace, or par. They all produce similar output.

In this trace, a client has requested a 10KB static file from the httpd. Traces of non-static requests or requests with content negotiation look wildly different (and quite ugly in some cases).

    /67:    accept(3, 0x00200BEC, 0x00200C0C, 1) (sleeping...)
    /67:    accept(3, 0x00200BEC, 0x00200C0C, 1)            = 9

In this trace, the listener thread is running within LWP #67.

Note the lack of accept(2) serialization. On this particular platform, the worker MPM uses an unserialized accept by default unless it is listening on multiple ports.

    /65:    lwp_park(0x00000000, 0)                         = 0
    /67:    lwp_unpark(65, 1)                               = 0

Upon accepting the connection, the listener thread wakes up a worker thread to do the request processing. In this trace, the worker thread that handles the request is mapped to LWP #65.
    /65:    getsockname(9, 0x00200BA4, 0x00200BC4, 1)       = 0

In order to implement virtual hosts, Apache needs to know the local socket address used to accept the connection. It is possible to eliminate this call in many situations (such as when there are no virtual hosts, or when Listen directives are used which do not have wildcard addresses). But no effort has yet been made to do these optimizations.

    /65:    brk(0x002170E8)                                 = 0
    /65:    brk(0x002190E8)                                 = 0

The brk(2) calls allocate memory from the heap. It is rare to see these in a system call trace, because the httpd uses custom memory allocators (apr_pool and apr_bucket_alloc) for most request processing. In this trace, the httpd has just been started, so it must call malloc(3) to get the blocks of raw memory with which to create the custom memory allocators.

    /65:    fcntl(9, F_GETFL, 0x00000000)                   = 2
    /65:    fstat64(9, 0xFAF7B818)                          = 0
    /65:    getsockopt(9, 65535, 8192, 0xFAF7B918, 0xFAF7B910, 2190656) = 0
    /65:    fstat64(9, 0xFAF7B818)                          = 0
    /65:    getsockopt(9, 65535, 8192, 0xFAF7B918, 0xFAF7B914, 2190656) = 0
    /65:    setsockopt(9, 65535, 8192, 0xFAF7B918, 4, 2190656) = 0
    /65:    fcntl(9, F_SETFL, 0x00000082)                   = 0

Next, the worker thread puts the connection to the client (file descriptor 9) in non-blocking mode. The setsockopt(2) and getsockopt(2) calls are a side-effect of how Solaris’ libc handles fcntl(2) on sockets.

    /65:    read(9, " G E T   / 1 0 k . h t m".., 8000)     = 97

The worker thread reads the request from the client.

    /65:    stat("/var/httpd/apache/httpd-8999/htdocs/10k.html", 0xFAF7B978) = 0
    /65:    open("/var/httpd/apache/httpd-8999/htdocs/10k.html", O_RDONLY) = 10

This httpd has been configured with Options FollowSymLinks and AllowOverride None. Thus it doesn’t need to lstat(2) each directory in the path leading up to the requested file, nor check for .htaccess files. It simply calls stat(2) to verify that the file: 1) exists, and 2) is a regular file, not a directory.
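The two-part check performed by that single stat(2) call can be sketched in C as follows. This is an illustration assuming a POSIX system, not the httpd implementation, and the helper name is_regular_file is hypothetical.

```c
#include <sys/stat.h>

/*
 * Returns 1 if `path` exists and is a regular file, 0 otherwise
 * (missing file, directory, device, or any other stat(2) failure).
 * stat(2) follows symlinks, which corresponds to the FollowSymLinks
 * configuration discussed in the text.
 */
int is_regular_file(const char *path)
{
    struct stat st;

    if (stat(path, &st) != 0) {
        return 0;   /* does not exist or cannot be examined */
    }
    return S_ISREG(st.st_mode) ? 1 : 0;
}
```

Because one call answers both questions, no per-directory lstat(2) calls appear in the trace for this configuration.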
    /65:    sendfilev(0, 9, 0x00200F90, 2, 0xFAF7B53C)      = 10269

In this example, the httpd is able to send the HTTP response header and the requested file with a single sendfilev(2) system call. Sendfile semantics vary among operating systems. On some other systems, it is necessary to do a write(2) or writev(2) call to send the headers before calling sendfile(2).

    /65:    write(4, " 1 2 7 . 0 . 0 . 1   -  ".., 78)      = 78

This write(2) call records the request in the access log. Note that one thing missing from this trace is a time(2) call. Unlike Apache 1.3, Apache 2.x uses gettimeofday(3) to look up the time. On some operating systems, like Linux or Solaris, gettimeofday has an optimized implementation that doesn’t require as much overhead as a typical system call.

    /65:    shutdown(9, 1, 1)                               = 0
    /65:    poll(0xFAF7B980, 1, 2000)                       = 1
    /65:    read(9, 0xFAF7BC20, 512)                        = 0
    /65:    close(9)                                        = 0

The worker thread does a lingering close of the connection.

    /65:    close(10)                                       = 0
    /65:    lwp_park(0x00000000, 0)         (sleeping...)

Finally the worker thread closes the file that it has just delivered and blocks until the listener assigns it another connection.

    /67:    accept(3, 0x001FEB74, 0x001FEB94, 1) (sleeping...)

Meanwhile, the listener thread is able to accept another connection as soon as it has dispatched this connection to a worker thread (subject to some flow-control logic in the worker MPM that throttles the listener if all the available workers are busy). Though it isn’t apparent from this trace, the next accept(2) can (and usually does, under high load conditions) occur in parallel with the worker thread’s handling of the just-accepted connection.

9.3 Performance Scaling

The Performance Tuning page in the Apache 1.3 documentation says: "Apache is a general webserver, which is designed to be correct first, and fast second. Even so, its performance is quite satisfactory.
Most sites have less than 10Mbits of outgoing bandwidth, which Apache can fill using only a low end Pentium-based webserver."

However, this sentence was written a few years ago, and in the meantime several things have happened. On one hand, web server hardware has become much faster. On the other hand, many sites are now allowed much more than ten megabits per second of outgoing bandwidth. In addition, web applications have become more complex. The classic brochureware site is alive and well, but the web has grown up substantially as a computing application platform, and webmasters may find themselves running dynamic content in Perl, PHP or Java, all of which take a toll on performance.

Therefore, in spite of strides forward in machine speed and bandwidth allowances, web server performance and web application performance remain areas of concern. This documentation discusses several aspects of web server performance.

What Will and Will Not Be Discussed

This discussion will focus on easily accessible configuration and tuning options for Apache httpd 2.2 and 2.4, as well as monitoring tools. Monitoring tools allow you to observe your web server and gather information about its performance, or lack thereof. We’ll assume that you don’t have an unlimited budget for server hardware, so the existing infrastructure will have to do the job. We also assume that you have no desire to compile your own Apache, or to recompile the operating system kernel. We do assume, though, that you have some familiarity with the Apache httpd configuration file.

Monitoring Your Server

The first task when sizing or performance-tuning your server is to find out how your system is currently performing. By monitoring your server under real-world load, or artificially generated load, you can extrapolate its behavior under stress, such as when your site is mentioned on Slashdot.

Monitoring Tools

top

The top tool ships with Linux and FreeBSD. Solaris offers prstat(1).
It collects a number of statistics for the system and for each running process, then displays them interactively on your terminal. The data displayed is refreshed every second and varies by platform, but typically includes system load average, number of processes and their current states, the percentage of CPU time spent executing user and system code, and the state of the virtual memory system. The data displayed for each process is typically configurable and includes its process name and ID, priority and nice values, memory footprint, and percentage CPU usage. The following example shows multiple httpd processes (with MPM worker and event) running on a Linux (Xen) system:

    top - 23:10:58 up 71 days,  6:14,  4 users,  load average: 0.25, 0.53, 0.47
    Tasks: 163 total,   1 running, 162 sleeping,   0 stopped,   0 zombie
    Cpu(s): 11.6%us,  0.7%sy,  0.0%ni, 87.3%id,  0.4%wa,  0.0%hi,  0.0%si,  0.0%st
    Mem:   2621656k total,  2178684k used,   442972k free,   100500k buffers
    Swap:  4194296k total,   860584k used,  3333712k free,  1157552k cached

      PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
    16687 example_  20   0 1200m 547m 179m S   45 21.4  1:09.59 httpd-worker
    15195 www       20   0  441m  33m 2468 S    0  1.3  0:41.41 httpd-worker
        1 root      20   0 10312  328  308 S    0  0.0  0:33.17 init
        2 root      15  -5     0    0    0 S    0  0.0  0:00.00 kthreadd
        3 root      RT  -5     0    0    0 S    0  0.0  0:00.14 migration/0
        4 root      15  -5     0    0    0 S    0  0.0  0:04.58 ksoftirqd/0
        5 root      RT  -5     0    0    0 S    0  0.0  4:45.89 watchdog/0
        6 root      15  -5     0    0    0 S    0  0.0  1:42.52 events/0
        7 root      15  -5     0    0    0 S    0  0.0  0:00.00 khelper
       19 root      15  -5     0    0    0 S    0  0.0  0:00.00 xenwatch
       20 root      15  -5     0    0    0 S    0  0.0  0:00.00 xenbus
       28 root      RT  -5     0    0    0 S    0  0.0  0:00.14 migration/1
       29 root      15  -5     0    0    0 S    0  0.0  0:00.20 ksoftirqd/1
       30 root      RT  -5     0    0    0 S    0  0.0  0:05.96 watchdog/1
       31 root      15  -5     0    0    0 S    0  0.0  1:18.35 events/1
       32 root      RT  -5     0    0    0 S    0  0.0  0:00.08 migration/2
       33 root      15  -5     0    0    0 S    0  0.0  0:00.18 ksoftirqd/2
       34 root      RT  -5     0    0    0 S    0  0.0  0:06.00 watchdog/2
       35 root      15  -5     0    0    0 S    0  0.0  1:08.39 events/2
       36 root      RT  -5     0    0    0 S    0  0.0  0:00.10 migration/3
       37 root      15  -5     0    0    0 S    0  0.0  0:00.16 ksoftirqd/3
       38 root      RT  -5     0    0    0 S    0  0.0  0:06.08 watchdog/3
       39 root      15  -5     0    0    0 S    0  0.0  1:22.81 events/3
       68 root      15  -5     0    0    0 S    0  0.0  0:06.28 kblockd/0
       69 root      15  -5     0    0    0 S    0  0.0  0:00.04 kblockd/1
       70 root      15  -5     0    0    0 S    0  0.0  0:00.04 kblockd/2

Top is a wonderful tool even though it’s slightly resource intensive (when running, its own process is usually in the top ten CPU gluttons). It is indispensable in determining the size of a running process, which comes in handy when determining how many server processes you can run on your machine. How to do this is described in sizing MaxClients. Top is, however, an interactive tool and running it continuously has few if any advantages.

free

This command is only available on Linux. It shows how much memory and swap space is in use. Linux allocates unused memory as file system cache. The free command shows usage both with and without this cache. The free command can be used to find out how much memory the operating system is using, as described in the paragraph sizing MaxClients. The output of free looks like this:

    sctemme@brutus:~$ free
                 total       used       free     shared    buffers     cached
    Mem:       4026028    3901892     124136          0     253144     841044
    -/+ buffers/cache:    2807704    1218324
    Swap:      3903784      12540    3891244

vmstat

This command is available on many unix platforms. It displays a large number of operating system metrics. Run without arguments, it displays a status line for that moment. When a numeric argument is added, the status is redisplayed at designated intervals. For example, vmstat 5 causes the information to reappear every five seconds. Vmstat displays the amount of virtual memory in use, how much memory is swapped in and out each second, the number of processes currently running and sleeping, the number of interrupts and context switches per second and the usage percentages of the CPU.
The following is vmstat output of an idle server:

    [sctemme@GayDeceiver sctemme]$ vmstat 5 3
       procs                      memory    swap          io     system         cpu
     r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
     0  0  0      0 186252   6688  37516   0   0    12     5   47   311   0   1  99
     0  0  0      0 186244   6696  37516   0   0     0    16   41   314   0   0 100
     0  0  0      0 186236   6704  37516   0   0     0     9   44   314   0   0 100

And this is output of a server that is under a load of one hundred simultaneous connections fetching static content:

    [sctemme@GayDeceiver sctemme]$ vmstat 5 3
       procs                      memory    swap          io     system         cpu
     r  b  w   swpd   free   buff  cache  si  so    bi    bo   in    cs  us  sy  id
     1  0  1      0 162580   6848  40056   0   0    11     5  150   324   1   1  98
     6  0  1      0 163280   6856  40248   0   0     0    66 6384  1117  42  25  32
    11  0  0      0 162780   6864  40436   0   0     0    61 6309  1165  33  28  40

The first line gives averages since the last reboot. The subsequent lines give information for five second intervals. The second argument tells vmstat to generate three reports and then exit.

SE Toolkit

The SE Toolkit is a system monitoring toolkit for Solaris. Its programming language is based on the C preprocessor and comes with a number of sample scripts. It can use both the command line and the GUI to display information. It can also be programmed to apply rules to the system data. The example script shown in Figure 2, Zoom.se, shows green, orange or red indicators when utilization of various parts of the system rises above certain thresholds. Another included script, Virtual Adrian, applies performance tuning metrics according to.

The SE Toolkit has drifted around for a while and has had several owners since its inception. It seems that it has now found a final home at Sunfreeware.com, where it can be downloaded at no charge. There is a single package for Solaris 8, 9 and 10 on SPARC and x86, which includes source code. SE Toolkit author Richard Pettit has started a new company, Captive Metrics [4], that plans to bring to market a multiplatform monitoring tool built on the same principles as SE Toolkit, written in Java.
DTrace

Given that DTrace is available for Solaris, FreeBSD and OS X, it might be worth exploring it. There’s also mod_dtrace available for httpd.

mod_status

The mod_status module gives an overview of the server performance at a given moment. It generates an HTML page with, among others, the number of Apache processes running and how many bytes each has served, and the CPU load caused by httpd and the rest of the system. The Apache Software Foundation uses mod_status on its own web site [5]. If you put the ExtendedStatus On directive in your httpd.conf, the mod_status page will give you more information at the cost of a little extra work per request.

Web Server Log Files

Monitoring and analyzing the log files httpd writes is one of the most effective ways to keep track of your server health and performance. Monitoring the error log allows you to detect error conditions, discover attacks and find performance issues. Analyzing the access logs tells you how busy your server is, which resources are the most popular and where your users come from. Historical log file data can give you invaluable insight into trends in access to your server, which allows you to predict when your performance needs will overtake your server capacity.

Error Log

The error log will contain messages if the server has reached the maximum number of active processes or the maximum number of concurrently open files. The error log also reflects when processes are being spawned at a higher-than-usual rate in response to a sudden increase in load. When the server starts, the stderr file descriptor is redirected to the error logfile, so any error encountered by httpd after it opens its logfiles will appear in this log. This makes it good practice to review the error log frequently.

Before Apache httpd opens its logfiles, any errors will be written to the stderr stream.
If you start httpd manually, this error information will appear on your terminal and you can use it directly to troubleshoot your server. If your httpd is started by a startup script, the destination of early error messages depends on the design of that script. The /var/log/messages file is usually a good bet. On Windows, early error messages are written to the Applications Event Log, which can be viewed through the Event Viewer in Administrative Tools.

The Error Log is configured through the ErrorLog and LogLevel configuration directives. The error log of httpd’s main server configuration receives the log messages that pertain to the entire server: startup, shutdown, crashes, excessive process spawns, etc. The ErrorLog directive can also be used in virtual host containers. The error log of a virtual host receives only log messages specific to that virtual host, such as authentication failures and ’File not Found’ errors.

On a server that is visible to the Internet, expect to see a lot of exploit attempts and worm attacks in the error log. A lot of these will be targeted at other server platforms instead of Apache, but the current state of affairs is that attack scripts just throw everything they have at any open port, regardless of which server is actually running or what applications might be installed. You could block these attempts using a firewall or mod_security [6], but this falls outside the scope of this discussion.

The LogLevel directive determines the level of detail included in the logs. There are eight log levels as described here:

    Level     Description
    emerg     Emergencies - system is unusable.
    alert     Action must be taken immediately.
    crit      Critical Conditions.
    error     Error conditions.
    warn      Warning conditions.
    notice    Normal but significant condition.
    info      Informational.
    debug     Debug-level messages

    [5] http://apache.org/server-status
    [6] http://www.modsecurity.org/

The default log level is warn.
A production server should not be run on debug, but increasing the level of detail in the error log can be useful during troubleshooting. Starting with 2.3.8, LogLevel can be specified on a per-module basis:

    LogLevel debug mod_ssl:warn

This puts all of the server in debug mode, except for mod_ssl, which tends to be very noisy.

Access Log

Apache httpd keeps track of every request it services in its access log file. In addition to the time and nature of a request, httpd can log the client IP address, date and time of the request, the result and a host of other information. The various logging format features are documented in the manual. This file exists by default for the main server and can be configured per virtual host by using the TransferLog or CustomLog configuration directive.

The access logs can be analyzed with any of several free and commercially available programs. Popular free analysis packages include Analog and Webalizer. Log analysis should be done offline so the web server machine is not burdened by processing the log files. Most log analysis packages understand the Common Log Format.
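As a rough illustration of what an offline analysis tool does with a Common Log Format line, the following Python sketch splits one line into its fields. The regular expression, the function name parse_clf_line and the field names are assumptions made for this example and are not part of httpd or of any particular analysis package; the sample line is taken from the access log excerpt in this section.

```python
import re
from datetime import datetime

# Common Log Format: host ident user [time] "request" status bytes
# (pattern and group names are illustrative assumptions)
CLF_PATTERN = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) '
    r'\[(?P<time>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_clf_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = CLF_PATTERN.match(line)
    if match is None:
        return None
    fields = match.groupdict()
    # Timestamps look like 24/Mar/2007:23:05:11 -0400
    fields["time"] = datetime.strptime(fields["time"], "%d/%b/%Y:%H:%M:%S %z")
    fields["status"] = int(fields["status"])
    # httpd logs "-" instead of a byte count when no body was sent
    fields["size"] = 0 if fields["size"] == "-" else int(fields["size"])
    return fields


sample = ('195.54.228.42 - - [24/Mar/2007:23:05:11 -0400] '
          '"GET /sander/feed/ HTTP/1.1" 200 9747')
entry = parse_clf_line(sample)
print(entry["host"], entry["status"], entry["size"])
```

A script along these lines would normally run against rotated log files on a separate machine, in keeping with the advice to do log analysis offline.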
The fields in the log lines are explained in the following:

    195.54.228.42 - - [24/Mar/2007:23:05:11 -0400] "GET /sander/feed/ HTTP/1.1" 200 9747
    64.34.165.214 - - [24/Mar/2007:23:10:11 -0400] "GET /sander/feed/atom HTTP/1.1" 200 9068
    60.28.164.72 - - [24/Mar/2007:23:11:41 -0400] "GET / HTTP/1.0" 200 618
    85.140.155.56 - - [24/Mar/2007:23:14:12 -0400] "GET /sander/2006/09/27/44/ HTTP/1.1" 200 141
    85.140.155.56 - - [24/Mar/2007:23:14:15 -0400] "GET /sander/2006/09/21/gore-tax-pollution/ H
    74.6.72.187 - - [24/Mar/2007:23:18:11 -0400] "GET /sander/2006/09/27/44/ HTTP/1.0" 200 14172
    74.6.72.229 - - [24/Mar/2007:23:24:22 -0400] "GET /sander/2006/11/21/os-java/ HTTP/1.0" 200

    Field            Content                        Explanation
    Client IP        195.54.228.42                  IP address where the request originated
    RFC 1413 ident   -                              Remote user identity as reported by their identd
    username         -                              Remote username as authenticated by Apache
    timestamp        [24/Mar/2007:23:05:11 -0400]   Date and time of request
    Request          "GET /sander/feed/ HTTP/1.1"   Request line
    Status Code      200                            Response code
    Content Bytes    9747                           Bytes transferred w/o headers

Rotating Log Files

There are several reasons to rotate logfiles. Even though almost no operating systems out there have a hard file size limit of two Gigabytes anymore, log files simply become too large to handle over time. Additionally, any periodic log file analysis should not be performed on files to which the server is actively writing. Periodic logfile rotation helps keep the analysis job manageable, and allows you to keep a closer eye on usage trends.

On unix systems, you can simply rotate logfiles by giving the old file a new name using mv. The server will keep writing to the open file even though it has a new name. When you send a graceful restart signal to the server, it will open a new logfile with the configured name.
For example, you could run a script from cron like this:

    APACHE=/usr/local/apache2
    HTTPD=$APACHE/bin/httpd
    mv $APACHE/logs/access_log $APACHE/logarchive/access_log-`date +%F`
    $HTTPD -k graceful

This approach also works on Windows, just not as smoothly. While the httpd process on your Windows server will keep writing to the log file after it has been renamed, the Windows Service that runs Apache cannot do a graceful restart. Restarting a Service on Windows means stopping it and then starting it again. The advantage of a graceful restart is that the httpd child processes get to complete responding to their current requests before they exit. Meanwhile, the httpd server becomes immediately available again to serve new requests. The stop-start that the Windows Service has to perform will interrupt any requests currently in progress, and the server is unavailable until it is started again. Plan for this when you decide the timing of your restarts.

A second approach is to use piped logs. From the CustomLog, TransferLog or ErrorLog directives you can send the log data into any program using a pipe character (|). For instance:

    CustomLog "|/usr/local/apache2/bin/rotatelogs /var/log/access_log 86400" common

The program on the other end of the pipe will receive the Apache log data on its stdin stream, and can do with this data whatever it wants. The rotatelogs program that comes with Apache seamlessly turns over the log file based on time elapsed or the amount of data written, and leaves the old log files with a timestamp suffix added to their names. This method for rotating logfiles works well on unix platforms, but is currently broken on Windows.

Logging and Performance

Writing entries to the Apache log files obviously takes some effort, but the information gathered from the logs is so valuable that under normal circumstances logging should not be turned off.
For optimal performance, you should put your disk-based site content on a different physical disk than the server log files: the access patterns are very different. Retrieving content from disk is a read operation in a fairly random pattern, and log files are written to disk sequentially.

Do not run a production server with your error LogLevel set to debug. This log level causes a vast amount of information to be written to the error log, including, in the case of SSL access, complete dumps of BIO read and write operations. The performance implications are significant: use the default warn level instead.

If your server has more than one virtual host, you may give each virtual host a separate access logfile. This makes it easier to analyze the logfile later. However, if your server has many virtual hosts, all the open logfiles put a resource burden on your system, and it may be preferable to log to a single file. Use the %v format character at the start of your LogFormat and, starting with 2.3.8, of your ErrorLog to make httpd print the hostname of the virtual host that received the request or the error at the beginning of each log line. A simple Perl script can split out the log file after it rotates: one is included with the Apache source under support/split-logfile.

You can use the BufferedLogs directive to have Apache collect several log lines in memory before writing them to disk. This might yield better performance, but could affect the order in which the server’s log is written.

Generating A Test Load

It is useful to generate a test load to monitor system performance under realistic operating circumstances. Besides commercial packages such as LoadRunner [7], there are a number of freely available tools to generate a test load against your web server.

• Apache ships with a test program called ab, short for Apache Bench. It can generate a web server load by repeatedly asking for the same file in rapid succession. You can specify a number of concurrent connections and have the program run for either a given amount of time or a specified number of requests.

• Another freely available load generator is http_load [11]. This program works with a URL file and can be compiled with SSL support.

• The Apache Software Foundation offers a tool named flood [12]. Flood is a fairly sophisticated program that is configured through an XML file.

• Finally, JMeter [13], a Jakarta subproject, is an all-Java load-testing tool. While early versions of this application were slow and difficult to use, the current version 2.1.1 seems to be versatile and useful.

• Projects outside the ASF that have proven to be quite good: grinder, httperf, tsung, FunkLoad [8]

When you load-test your web server, please keep in mind that if that server is in production, the test load may negatively affect the server’s response. Also, any data traffic you generate may be charged against your monthly traffic allowance.

Configuring for Performance

Httpd Configuration

The Apache 2.2 httpd is by default a pre-forking web server. When the server starts, the parent process spawns a number of child processes that do the actual work of servicing requests. But Apache httpd 2.0 introduced the concept of the Multi-Processing Module (MPM). Developers can write MPMs to suit the process or threading architecture of their specific operating system. Apache 2 comes with special MPMs for Windows, OS/2, Netware and BeOS. On unix-like platforms, the two most popular MPMs are Prefork and Worker. The Prefork MPM offers the same preforking process model that Apache 1.3 uses. The Worker MPM runs a smaller number of child processes, and spawns multiple request handling threads within each child process.

In 2.4, MPMs are no longer hard-wired. They too can be exchanged via LoadModule. The default MPM in 2.4 is the event MPM.
The maximum number of workers, be they pre-forked child processes or threads within a process, is an indication of how many requests your server can manage concurrently. It is merely a rough estimate because the kernel can queue connection attempts for your web server. When your site becomes busy and the maximum number of workers is running, the machine doesn’t hit a hard limit beyond which clients will be denied access. However, once requests start backing up, system performance is likely to degrade.

Finally, if the httpd server in question is not executing any third-party code via mod_php, mod_perl or similar, we recommend the use of the event MPM. This MPM is ideal for situations where httpd serves as a thin layer between clients and backend servers doing the real job, such as a proxy or cache.

MaxClients

The MaxClients directive in your Apache httpd configuration file specifies the maximum number of workers your server can create. It has two related directives, MinSpareServers and MaxSpareServers, which specify the number of workers Apache keeps waiting in the wings ready to serve requests. The absolute maximum number of processes is configurable through the ServerLimit directive.

    [7] http://learnloadrunner.com/
    [8] http://funkload.nuxeo.org/

Spinning Threads

For the prefork MPM, the above directives are all there is to determining the process limit. However, if you are running a threaded MPM the situation is a little more complicated. Threaded MPMs support the ThreadsPerChild directive [1]. Apache requires that MaxClients is evenly divisible by ThreadsPerChild. If you set either directive to a number that doesn’t meet this requirement, Apache will send a message of complaint to the error log and adjust the ThreadsPerChild value downwards until it is an even factor of MaxClients.

Sizing MaxClients

Optimally, the maximum number of processes should be set so that all the memory on your system is used, but no more.
If your system gets so overloaded that it needs to heavily swap core memory out to disk, performance will degrade quickly. The formula for determining MaxClients is fairly simple:

                 total RAM - RAM for OS - RAM for external programs
    MaxClients = --------------------------------------------------
                               RAM per httpd process

The various amounts of memory allocated for the OS, external programs and the httpd processes are best determined by observation: use the top and free commands described above to determine the memory footprint of the OS without the web server running. You can also determine the footprint of a typical web server process from top: most top implementations have a Resident Size (RSS) column and a Shared Memory column. The difference between these two is the amount of memory per process. The shared segment really exists only once and is used for the code and libraries loaded and the dynamic inter-process tally, or ’scoreboard,’ that Apache keeps. How much memory each process takes for itself depends heavily on the number and kind of modules you use. The best approach to use in determining this need is to generate a typical test load against your web site and see how large the httpd processes become.

The RAM for external programs parameter is intended mostly for CGI programs and scripts that run outside the web server process. However, if you have a Java virtual machine running Tomcat on the same box it will need a significant amount of memory as well. The above assessment should give you an idea how far you can push MaxClients, but it is not an exact science. When in doubt, be conservative and use a low MaxClients value. The Linux kernel will put extra memory to good use for caching disk access. On Solaris you need enough available real RAM memory to create any process.
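The sizing formula above can be tried out with a few lines of Python. This is only a sketch under stated assumptions: the function names are made up for this example, and the memory figures are hypothetical placeholders rather than measurements from a real server. The per-process footprint is taken as resident size minus shared memory, as described for the top columns.

```python
def per_process_footprint_mb(resident_mb, shared_mb):
    """Private memory of one httpd process: resident size minus the shared segment."""
    return resident_mb - shared_mb


def estimate_max_clients(total_ram_mb, os_ram_mb, external_ram_mb, per_process_mb):
    """Apply (total RAM - RAM for OS - RAM for external programs) / RAM per httpd process."""
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    available_mb = total_ram_mb - os_ram_mb - external_ram_mb
    # Round down, and never report a negative number of workers.
    return max(0, int(available_mb // per_process_mb))


# Hypothetical figures: 4096 MB of RAM, 512 MB for the OS,
# 384 MB for external programs, 30 MB resident / 10 MB shared per process.
footprint = per_process_footprint_mb(resident_mb=30, shared_mb=10)
print(estimate_max_clients(4096, 512, 384, footprint))  # prints 160
```

The result is an upper bound for a first configuration attempt; observing the server under a test load remains the way to confirm it.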
If no real memory is available, httpd will start writing ’No space left on device’ messages to the error log and be unable to fork additional child processes, so a higher MaxClients value may actually be a disadvantage.

Selecting your MPM

The prime reason for selecting a threaded MPM is that threads consume fewer system resources than processes, and it takes less effort for the system to switch between threads. This is more true for some operating systems than for others. On systems like Solaris and AIX, manipulating processes is relatively expensive in terms of system resources. On these systems, running a threaded MPM makes sense. On Linux, the threading implementation actually uses one process for each thread. Linux processes are relatively lightweight, but it means that a threaded MPM offers less of a performance advantage than in other environments.

Running a threaded MPM can cause stability problems in some situations. For instance, should a child process of a preforked MPM crash, at most one client connection is affected. However, if a threaded child crashes, all the threads in that process disappear, which means all the clients currently being served by that process will see their connection aborted. Additionally, there may be so-called "thread-safety" issues, especially with third-party libraries. In threaded applications, threads may access the same variables indiscriminately, not knowing whether a variable may have been changed by another thread.

This has been a sore point within the PHP community. The PHP processor heavily relies on third-party libraries and cannot guarantee that all of these are thread-safe. The good news is that if you are running Apache on Linux, you can run PHP in the preforked MPM without fear of losing too much performance relative to the threaded option.

Spinning Locks

Apache httpd maintains an inter-process lock around its network listener.
For all practical purposes, this means that only one httpd child process can receive a request at any given time. The other processes are either servicing requests already received or are ’camping out’ on the lock, waiting for the network listener to become available. This process is best visualized as a revolving door, with only one process allowed in the door at any time. On a heavily loaded web server with requests arriving constantly, the door spins quickly and requests are accepted at a steady rate. On a lightly loaded web server, the process that currently "holds" the lock may have to stay in the door for a while, during which all the other processes sit idle, waiting to acquire the lock. At this time, the parent process may decide to terminate some children based on its MaxSpareServers directive.

The Thundering Herd

The function of the ’accept mutex’ (as this inter-process lock is called) is to keep request reception moving along in an orderly fashion. If the lock is absent, the server may exhibit the Thundering Herd syndrome.

Consider an American Football team poised on the line of scrimmage. If the football players were Apache processes, all team members would go for the ball simultaneously at the snap. One process would get it, and all the others would have to lumber back to the line for the next snap. In this metaphor, the accept mutex acts as the quarterback, delivering the connection "ball" to the appropriate player process.

Moving this much information around is obviously a lot of work, and, like a smart person, a smart web server tries to avoid it whenever possible. Hence the revolving door construction. In recent years, many operating systems, including Linux and Solaris, have put code in place to prevent the Thundering Herd syndrome. Apache recognizes this and if you run with just one network listener, meaning one virtual host or just the main server, Apache will refrain from using an accept mutex.
If you run with multiple listeners (for instance because you have a virtual host serving SSL requests), it will activate the accept mutex to avoid internal conflicts.

You can manipulate the accept mutex with the AcceptMutex directive. Besides turning the accept mutex off, you can select the locking mechanism. Common locking mechanisms include fcntl, System V Semaphores and pthread locking. Not all are available on every platform, and their availability also depends on compile-time settings. The various locking mechanisms may place specific demands on system resources: manipulate them with care.

There is no compelling reason to disable the accept mutex. Apache automatically recognizes the single listener situation described above and knows if it is safe to run without mutex on your platform.

Tuning the Operating System

People often look for the ’magic tune-up’ that will make their system perform four times as fast by tweaking just one little setting. The truth is, present-day UNIX derivatives are pretty well adjusted straight out of the box and there is not a lot that needs to be done to make them perform optimally. However, there are a few things that an administrator can do to improve performance.

RAM and Swap Space

The usual mantra regarding RAM is "more is better". As discussed above, unused RAM is put to good use as file system cache. The Apache processes get bigger if you load more modules, especially if you use modules that generate dynamic page content within the processes, like PHP and mod_perl. A large configuration file, with many virtual hosts, also tends to inflate the process footprint. Having ample RAM allows you to run Apache with more child processes, which allows the server to process more concurrent requests.

While the various platforms treat their virtual memory in different ways, it is never a good idea to run with less disk-based swap space than RAM.
The virtual memory system is designed to provide a fallback for RAM, but when you don’t have disk space available and run out of swappable memory, your machine grinds to a halt. This can crash your box, requiring a physical reboot for which your hosting facility may charge you.

Also, such an outage naturally occurs when you least want it: when the world has found your website and is beating a path to your door. If you have enough disk-based swap space available and the machine gets overloaded, it may get very, very slow as the system needs to swap memory pages to disk and back, but when the load decreases the system should recover. Remember, you still have MaxClients to keep things in hand.

Most unix-like operating systems use designated disk partitions for swap space. When a system starts up it finds all swap partitions on the disk(s), by partition type or because they are listed in the file /etc/fstab, and automatically enables them. When adding a disk or installing the operating system, be sure to allocate enough swap space to accommodate eventual RAM upgrades. Reassigning disk space on a running system is a cumbersome process.

Plan for available hard drive swap space of at least twice your amount of RAM, perhaps up to four times in situations with frequent peaking loads. Remember to adjust this configuration whenever you upgrade RAM on your system. In a pinch, you can use a regular file as swap space. For instructions on how to do this, see the manual pages for the mkswap and swapon or swap programs.

ulimit: Files and Processes

Given a machine with plenty of RAM and processor capacity, you can run hundreds of Apache processes if necessary... and if your kernel allows it.

Consider a situation in which several hundred web servers are running; if some of these need to spawn CGI processes, the maximum number of processes would be reached quickly.
However, you can change this limit with the command

    ulimit [-H|-S] -u [newvalue]

This must be changed before starting the server, since the new value will only be available to the current shell and programs started from it. In newer Linux kernels the default has been raised to 2048. On FreeBSD, the number seems to be the rather unusual 513. In the default user shell on this system, csh, the equivalent is limit and works analogously to the Bourne-like ulimit:

    limit [-h] maxproc [newvalue]

Similarly, the kernel may limit the number of open files per process. This is generally not a problem for pre-forked servers, which just handle one request at a time per process. Threaded servers, however, serve many requests per process and much more easily run out of available file descriptors. You can increase the maximum number of open files per process by running the command

    ulimit -n [newvalue]

Once again, this must be done prior to starting Apache.

Setting User Limits on System Startup

Under Linux, you can set the ulimit parameters on bootup by editing the /etc/security/limits.conf file. This file allows you to set soft and hard limits on a per-user or per-group basis; the file contains commentary explaining the options. To enable this, make sure that the file /etc/pam.d/login contains the line

    session required /lib/security/pam_limits.so

All items can have a 'soft' and a 'hard' limit: the first is the default setting and the second the maximum value for that item.

In FreeBSD's /etc/login.conf these resources can be limited or extended system wide, analogously to limits.conf. 'Soft' limits can be specified with -cur and 'hard' limits with -max.

Solaris has a similar mechanism for manipulating limit values at boot time: in /etc/system you can set kernel tunables valid for the entire system at boot time. These are the same tunables that can be set with the mdb kernel debugger during run time.
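The limits that ulimit -u and ulimit -n report can also be inspected from a script, which may be handy when checking the environment a server will be started from. The following is a minimal sketch using Python's standard resource module (Unix only); the helper names describe_limits and fmt are illustrative and not part of any Apache tool.

```python
import resource


def describe_limits():
    """Return (soft, hard) limit pairs for open files and, where defined, processes."""
    limits = {"nofile": resource.getrlimit(resource.RLIMIT_NOFILE)}
    # RLIMIT_NPROC (the counterpart of ulimit -u) is not defined on every platform.
    if hasattr(resource, "RLIMIT_NPROC"):
        limits["nproc"] = resource.getrlimit(resource.RLIMIT_NPROC)
    return limits


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    for name, (soft, hard) in describe_limits().items():
        print(f"{name}: soft={fmt(soft)} hard={fmt(hard)}")
```

As with ulimit itself, the values seen are those of the process running the script, so it should be run from the same shell or startup environment as the server.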
In /etc/system on Solaris, the hard and soft limits corresponding to ulimit -n can be set via:

    set rlim_fd_max=65536
    set rlim_fd_cur=2048

Solaris calculates the maximum number of allowed processes per user (maxuprc) based on the total amount of available memory on the system (maxusers). You can review the numbers with

    sysdef -i | grep maximum

but it is not recommended to change them.

Turn Off Unused Services and Modules

Many UNIX and Linux distributions come with a slew of services turned on by default. You probably need few of them. For example, your web server does not need to be running sendmail, nor is it likely to be an NFS server, etc. Turn them off.

On Red Hat Linux, the chkconfig tool will help you do this from the command line. On Solaris systems svcs and svcadm will show which services are enabled and disable them, respectively.

In a similar fashion, cast a critical eye on the Apache modules you load. Most binary distributions of Apache httpd, and pre-installed versions that come with Linux distributions, have their modules enabled through the LoadModule directive.

Unused modules may be culled: if you don't rely on their functionality and configuration directives, you can turn them off by commenting out the corresponding LoadModule lines. Read the documentation on each module's functionality before deciding whether to keep it enabled. While the performance overhead of an unused module is small, it's also unnecessary.

Caching Content

Requests for dynamically generated content usually take significantly more resources than requests for static content. Static content consists of simple files (pages, images, etc.) on disk that are very efficiently served. Many operating systems also automatically cache the contents of frequently accessed files in memory.

Processing dynamic requests, on the contrary, can be much more involved.
Running CGI scripts, handing off requests to an external application server and accessing database content can introduce significant latency and processing load to a busy web server. Under many circumstances, performance can be improved by turning popular dynamic requests into static requests. In this section, two approaches to this will be discussed.

Making Popular Pages Static

By pre-rendering the response pages for the most popular queries in your application, you can gain a significant performance improvement without giving up the flexibility of dynamically generated content. For instance, if your application is a flower delivery service, you would probably want to pre-render your catalog pages for red roses during the weeks leading up to Valentine's Day. When the user searches for red roses, they are served the pre-rendered page. Queries for, say, yellow roses will be generated directly from the database. The mod_rewrite module included with Apache is a great tool to implement these substitutions.

Example: A Statically Rendered Blog

Blosxom is a lightweight web log package that runs as a CGI. It is written in Perl and uses plain text files for entry input. Besides running as CGI, Blosxom can be run from the command line to pre-render blog pages. Pre-rendering pages to static HTML can yield a significant performance boost in the event that large numbers of people actually start reading your blog.

To run blosxom for static page generation, edit the CGI script according to the documentation. Set the $static_dir variable to the DocumentRoot of the web server, and run the script from the command line as follows:

    $ perl blosxom.cgi -password='whateveryourpassword'

This can be run periodically from cron, after you upload content, etc. To make Apache substitute the statically rendered pages for the dynamic content, we'll use mod_rewrite. This module is included with the Apache source code, but is not compiled by default.
It can be built with the server by passing the option --enable-rewrite[=shared] to the configure command. Many binary distributions of Apache come with mod_rewrite included. The following is an example of an Apache virtual host that takes advantage of pre-rendered blog pages:

    Listen *:8001
    <VirtualHost *:8001>
        ServerName blog.sandla.org:8001
        ServerAdmin sander@temme.net
        DocumentRoot "/home/sctemme/inst/blog/httpd/htdocs"
        <Directory "/home/sctemme/inst/blog/httpd/htdocs">
            Options +Indexes
            Require all granted
            RewriteEngine on
            RewriteCond "%{REQUEST_FILENAME}" "!-f"
            RewriteCond "%{REQUEST_FILENAME}" "!-d"
            RewriteRule "^(.*)$" "/cgi-bin/blosxom.cgi/$1" [L,QSA]
        </Directory>
        RewriteLog "/home/sctemme/inst/blog/httpd/logs/rewrite_log"
        RewriteLogLevel 9
        ErrorLog "/home/sctemme/inst/blog/httpd/logs/error_log"
        LogLevel debug
        CustomLog "/home/sctemme/inst/blog/httpd/logs/access_log" common
        ScriptAlias "/cgi-bin/" "/home/sctemme/inst/blog/bin/"
        <Directory "/home/sctemme/inst/blog/bin">
            Options +ExecCGI
            Require all granted
        </Directory>
    </VirtualHost>

The RewriteCond and RewriteRule directives say that, if the requested resource does not exist as a file or a directory, its path is passed to the Blosxom CGI for rendering. Blosxom uses Path Info to specify blog entries and index pages, so this means that if a particular path under Blosxom exists as a static file in the file system, the file is served instead. Any request that isn't pre-rendered is served by the CGI. This means that individual entries, which show the comments, are always served by the CGI, which in turn means that your comment spam is always visible. This configuration also hides the Blosxom CGI from the user-visible URL in their Location bar. mod_rewrite is a fantastically powerful and versatile module: investigate it to arrive at a configuration that is best for your situation.

Caching Content With mod_cache

The mod_cache module provides intelligent caching of HTTP responses: it is aware of the expiration timing and content requirements that are part of the HTTP specification.
The mod_cache module caches URL response content. If content sent to the client is considered cacheable, it is saved to disk. Subsequent requests for that URL will be served directly from the cache. The provider module for mod_cache, mod_cache_disk (called mod_disk_cache before version 2.4), determines how the cached content is stored on disk. Most server systems will have more disk available than memory, and it's good to note that some operating system kernels cache frequently accessed disk content transparently in memory, so replicating this in the server is not very useful.

To enable efficient content caching and avoid presenting the user with stale or invalid content, the application that generates the actual content has to send the correct response headers. Without headers like ETag:, Last-Modified: or Expires:, mod_cache cannot make the right decision on whether to cache the content, serve it from cache or leave it alone. When testing content caching, you may find that you need to modify your application or, if this is impossible, selectively disable caching for URLs that cause problems. The mod_cache modules are not compiled by default, but can be enabled by passing the option --enable-cache[=shared] to the configure script. If you use a binary distribution of Apache httpd, or it came with your port or package collection, it may have mod_cache already included.

Example: wiki.apache.org

The Apache Software Foundation Wiki is served by MoinMoin. MoinMoin is written in Python and runs as a CGI. To date, all attempts to run it under mod_python have been unsuccessful. The CGI proved to place an untenably high load on the server machine, especially when the Wiki was being indexed by search engines like Google. To lighten the load on the server machine, the Apache Infrastructure team turned to mod_cache.
It turned out MoinMoin needed a small patch to ensure proper behavior behind the caching server: certain requests can never be cached and the corresponding Python modules were patched to send the proper HTTP response headers. After this modification, the cache in front of the Wiki was enabled with the following configuration snippet in httpd.conf:

    CacheRoot /raid1/cacheroot
    CacheEnable disk /
    # A page modified 100 minutes ago will expire in 10 minutes
    CacheLastModifiedFactor .1
    # Always check again after 6 hours
    CacheMaxExpire 21600

This configuration will try to cache any and all content within its virtual host. It will never cache content for more than six hours (the CacheMaxExpire directive). If no Expires: header is present in the response, mod_cache will compute an expiration period from the Last-Modified: header. The computation using CacheLastModifiedFactor is based on the assumption that if a page was recently modified, it is likely to change again in the near future and will have to be re-cached.

Do note that it can pay off to disable the ETag: header: for files smaller than 1k the server has to calculate the checksum (usually MD5) and then send out a 304 Not Modified response, which will use up some CPU and still saturate the same amount of network resources for the transfer (one TCP packet). For resources larger than 1k it might prove CPU expensive to calculate the header for each request. Unfortunately, there is currently no way to cache these headers.

    FileETag None

This will disable the generation of the ETag: header for most static resources. The server does not calculate these headers for dynamic resources.

Further Considerations

Armed with the knowledge of how to tune a system to deliver the desired performance, we will soon discover that one system might prove a bottleneck.
How to make a system fit for growth, or how to put a number of systems into tune, will be discussed in PerformanceScalingOut [9].

[9] http://wiki.apache.org/httpd/PerformanceScalingOut

9.4 Security Tips

Some hints and tips on security issues in setting up a web server. Some of the suggestions will be general, others specific to Apache.

Keep up to Date

The Apache HTTP Server has a good record for security and a developer community highly concerned about security issues. But it is inevitable that some problems – small or large – will be discovered in software after it is released. For this reason, it is crucial to keep aware of updates to the software. If you have obtained your version of the HTTP Server directly from Apache, we highly recommend you subscribe to the Apache HTTP Server Announcements List [10] where you can keep informed of new releases and security updates. Similar services are available from most third-party distributors of Apache software.

Of course, most times that a web server is compromised, it is not because of problems in the HTTP Server code. Rather, it comes from problems in add-on code, CGI scripts, or the underlying Operating System. You must therefore stay aware of problems and updates with all the software on your system.

Denial of Service (DoS) attacks

All network servers can be subject to denial of service attacks that attempt to prevent responses to clients by tying up the resources of the server. It is not possible to prevent such attacks entirely, but you can do certain things to mitigate the problems that they create.

Often the most effective anti-DoS tool will be a firewall or other operating-system configurations. For example, most firewalls can be configured to restrict the number of simultaneous connections from any individual IP address or network, thus preventing a range of simple attacks. Of course this is no help against Distributed Denial of Service attacks (DDoS).
[10] http://httpd.apache.org/lists.html#http-announce

There are also certain Apache HTTP Server configuration settings that can help mitigate problems:

• The RequestReadTimeout directive allows you to limit the time a client may take to send the request.
• The TimeOut directive should be lowered on sites that are subject to DoS attacks. Setting this to as low as a few seconds may be appropriate. As TimeOut is currently used for several different operations, setting it to a low value introduces problems with long running CGI scripts.
• The KeepAliveTimeout directive may also be lowered on sites that are subject to DoS attacks. Some sites even turn off keepalives completely via KeepAlive, which of course has other drawbacks on performance.
• The values of various timeout-related directives provided by other modules should be checked.
• The directives LimitRequestBody, LimitRequestFields, LimitRequestFieldSize, LimitRequestLine, and LimitXMLRequestBody should be carefully configured to limit resource consumption triggered by client input.
• On operating systems that support it, make sure that you use the AcceptFilter directive to offload part of the request processing to the operating system. This is active by default in Apache httpd, but may require reconfiguration of your kernel.
• Tune the MaxRequestWorkers directive to allow the server to handle the maximum number of simultaneous connections without running out of resources. See also the performance tuning documentation (p. 339).
• The use of a threaded mpm (p. 90) may allow you to handle more simultaneous connections, thereby mitigating DoS attacks. Further, the event mpm uses asynchronous processing to avoid devoting a thread to each connection. Due to the nature of the OpenSSL library the event mpm is currently incompatible with mod_ssl and other input filters. In these cases it falls back to the behaviour of the worker mpm.
• There are a number of third-party modules available through http://modules.apache.org/ that can restrict certain client behaviors and thereby mitigate DoS problems.

Permissions on ServerRoot Directories

In typical operation, Apache is started by the root user, and it switches to the user defined by the User directive to serve hits. As is the case with any command that root executes, you must take care that it is protected from modification by non-root users. Not only must the files themselves be writeable only by root, but so must the directories, and parents of all directories. For example, if you choose to place ServerRoot in /usr/local/apache then it is suggested that you create that directory as root, with commands like these:

    mkdir /usr/local/apache
    cd /usr/local/apache
    mkdir bin conf logs
    chown 0 . bin conf logs
    chgrp 0 . bin conf logs
    chmod 755 . bin conf logs

It is assumed that /, /usr, and /usr/local are only modifiable by root. When you install the httpd executable, you should ensure that it is similarly protected:

    cp httpd /usr/local/apache/bin
    chown 0 /usr/local/apache/bin/httpd
    chgrp 0 /usr/local/apache/bin/httpd
    chmod 511 /usr/local/apache/bin/httpd

You can create an htdocs subdirectory which is modifiable by other users – since root never executes any files out of there, and shouldn't be creating files in there.

If you allow non-root users to modify any files that root either executes or writes on then you open your system to root compromises. For example, someone could replace the httpd binary so that the next time you start it, it will execute some arbitrary code. If the logs directory is writeable (by a non-root user), someone could replace a log file with a symlink to some other system file, and then root might overwrite that file with arbitrary data. If the log files themselves are writeable (by a non-root user), then someone may be able to overwrite the log itself with bogus data.
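The rule above (files, their directories, and all parent directories writeable only by root) can be checked with a short script. The following is a minimal sketch and not part of Apache: the function name find_non_root_writable is hypothetical, and the check is deliberately conservative, flagging any path that is not owned by uid 0 or whose mode grants write permission to group or others.

```python
import os
import stat


def find_non_root_writable(path):
    """Return path and any of its parent directories that a non-root user could modify.

    Hypothetical helper: a path is flagged when it is not owned by uid 0
    or when its mode grants write permission to group or others.
    """
    flagged = []
    current = os.path.abspath(path)
    while True:
        info = os.stat(current)
        loosely_writable = info.st_mode & (stat.S_IWGRP | stat.S_IWOTH)
        if info.st_uid != 0 or loosely_writable:
            flagged.append(current)
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root
            break
        current = parent
    return flagged
```

On an installation set up with the commands shown above, calling find_non_root_writable("/usr/local/apache/bin/httpd") would be expected to return an empty list; any path it does return deserves a closer look.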
Server Side Includes

Server Side Includes (SSI) present a server administrator with several potential security risks.

The first risk is the increased load on the server. All SSI-enabled files have to be parsed by Apache, whether or not there are any SSI directives included within the files. While this load increase is minor, in a shared server environment it can become significant.

SSI files also pose the same risks that are associated with CGI scripts in general. Using the exec cmd element, SSI-enabled files can execute any CGI script or program under the permissions of the user and group Apache runs as, as configured in httpd.conf.

There are ways to enhance the security of SSI files while still taking advantage of the benefits they provide.

To isolate the damage a wayward SSI file can cause, a server administrator can enable suexec (p. 115) as described in the CGI in General section.

Enabling SSI for files with .html or .htm extensions can be dangerous. This is especially true in a shared, or high traffic, server environment. SSI-enabled files should have a separate extension, such as the conventional .shtml. This helps keep server load at a minimum and allows for easier management of risk.

Another solution is to disable the ability to run scripts and programs from SSI pages. To do this, replace Includes with IncludesNOEXEC in the Options directive. Note that users may still use <!--#include virtual="..." --> to execute CGI scripts if these scripts are in directories designated by a ScriptAlias directive.

CGI in General

First of all, you always have to remember that you must trust the writers of the CGI scripts/programs or your ability to spot potential security holes in CGI, whether they were deliberate or accidental. CGI scripts can run essentially arbitrary commands on your system with the permissions of the web server user and can therefore be extremely dangerous if they are not carefully checked.
All the CGI scripts will run as the same user, so they have potential to conflict (accidentally or deliberately) with other scripts, e.g. User A hates User B, so he writes a script to trash User B's CGI database. One program which can be used to allow scripts to run as different users is suEXEC (p. 115) which is included with Apache as of 1.2 and is called from special hooks in the Apache server code. Another popular way of doing this is with CGIWrap [11].

[11] http://cgiwrap.sourceforge.net/

Non Script Aliased CGI

Allowing users to execute CGI scripts in any directory should only be considered if:

• You trust your users not to write scripts which will deliberately or accidentally expose your system to an attack.
• You consider security at your site to be so feeble in other areas, as to make one more potential hole irrelevant.
• You have no users, and nobody ever visits your server.

Script Aliased CGI

Limiting CGI to special directories gives the admin control over what goes into those directories. This is inevitably more secure than non script aliased CGI, but only if users with write access to the directories are trusted or the admin is willing to test each new CGI script/program for potential security holes.

Most sites choose this option over the non script aliased CGI approach.

Other sources of dynamic content

Embedded scripting options which run as part of the server itself, such as mod_php, mod_perl, mod_tcl, and mod_python, run under the identity of the server itself (see the User directive), and therefore scripts executed by these engines potentially can access anything the server user can. Some scripting engines may provide restrictions, but it is better to be safe and assume not.

Dynamic content security

When setting up dynamic content, such as mod_php, mod_perl or mod_python, many security considerations get out of the scope of httpd itself, and you need to consult documentation from those modules.
For example, PHP lets you set up Safe Mode [12], which is most usually disabled by default. Another example is Suhosin [13], a PHP addon for more security. For more information about those, consult each project's documentation.

At the Apache level, a module named mod_security [14] can be seen as an HTTP firewall and, provided you configure it finely enough, can help you enhance your dynamic content security.

[12] http://www.php.net/manual/en/ini.sect.safe-mode.php
[13] http://www.hardened-php.net/suhosin/
[14] http://modsecurity.org/

Protecting System Settings

To run a really tight ship, you'll want to stop users from setting up .htaccess files which can override security features you've configured. Here's one way to do it.

In the server configuration file, put

    <Directory "/">
        AllowOverride None
    </Directory>

This prevents the use of .htaccess files in all directories apart from those specifically enabled.

Note that this setting is the default since Apache 2.3.9.

Protect Server Files by Default

One aspect of Apache which is occasionally misunderstood is the feature of default access. That is, unless you take steps to change it, if the server can find its way to a file through normal URL mapping rules, it can serve it to clients.

For instance, consider the following example:

    # cd /; ln -s / public_html
    Accessing http://localhost/~root/

This would allow clients to walk through the entire filesystem. To work around this, add the following block to your server's configuration:

    <Directory "/">
        Require all denied
    </Directory>

This will forbid default access to filesystem locations. Add appropriate Directory blocks to allow access only in those areas you wish. For example,

    <Directory "/usr/users/*/public_html">
        Require all granted
    </Directory>
    <Directory "/usr/local/httpd">
        Require all granted
    </Directory>

Pay particular attention to the interactions of Location and Directory directives; for instance, even if <Directory "/"> denies access, a <Location "/"> directive might overturn it.
Also be wary of playing games with the UserDir directive; setting it to something like ./ would have the same effect, for root, as the first example above. We strongly recommend that you include the following line in your server configuration files:

    UserDir disabled root

Watching Your Logs

To keep up to date with what is actually going on against your server you have to check the Log Files (p. 56). Even though the log files only report what has already happened, they will give you some understanding of what attacks are thrown against the server and allow you to check if the necessary level of security is present.

A couple of examples:

    grep -c "/jsp/source.jsp?/jsp/ /jsp/source.jsp??" access_log
    grep "client denied" error_log | tail -n 10

The first example will list the number of attacks trying to exploit the Apache Tomcat Source.JSP Malformed Request Information Disclosure Vulnerability [15], the second example will list the ten last denied clients, for example:

    [Thu Jul 11 17:18:39 2002] [error] [client foo.example.com] client denied by server configuration: /usr/local/apache/htdocs/.htpasswd

As you can see, the log files only report what already has happened, so if the client had been able to access the .htpasswd file you would have seen something similar to:

    foo.example.com - - [12/Jul/2002:01:59:13 +0200] "GET /.htpasswd HTTP/1.1"

in your Access Log (p. 56). This means you probably commented out the following in your server configuration file:

    <Files ".ht*">
        Require all denied
    </Files>

Merging of configuration sections

The merging of configuration sections is complicated and sometimes directive specific. Always test your changes when creating dependencies on how directives are merged.

For modules that don't implement any merging logic, such as mod_access_compat, the behavior in later sections depends on whether the later section has any directives from the module. The configuration is inherited until a change is made, at which point the configuration is replaced and not merged.
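The two grep commands in the Watching Your Logs section can also be reproduced in a script, for instance as part of a periodic report. The following is a minimal sketch using plain substring matching; the function names count_matching and last_matching are illustrative, not an existing tool.

```python
from collections import deque


def count_matching(lines, needle):
    """Count the lines that contain needle, like `grep -c needle` with a literal pattern."""
    return sum(1 for line in lines if needle in line)


def last_matching(lines, needle, n=10):
    """Return the last n lines that contain needle, like `grep needle | tail -n n`."""
    matches = (line for line in lines if needle in line)
    # A bounded deque keeps only the most recent n matches in memory.
    return list(deque(matches, maxlen=n))
```

Both functions accept any iterable of lines, so an open log file can be passed directly, for example last_matching(open("error_log"), "client denied").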
[15] http://online.securityfocus.com/bid/4876/info/

9.5 Relevant Standards

This page documents all the relevant standards that the Apache HTTP Server follows, along with brief descriptions.

In addition to the information listed below, the following resources should be consulted:

• http://purl.org/NET/http-errata - HTTP/1.1 Specification Errata
• http://www.rfc-editor.org/errata.php - RFC Errata
• http://ftp.ics.uci.edu/pub/ietf/http/#RFC - A pre-compiled list of HTTP related RFCs

Notice: This document is not yet complete.

HTTP Recommendations

Regardless of what modules are compiled and used, Apache as a basic web server complies with the following IETF recommendations:

RFC 1945 (Informational)
    The Hypertext Transfer Protocol (HTTP) is an application-level protocol with the lightness and speed necessary for distributed, collaborative, hypermedia information systems. This documents HTTP/1.0.
RFC 2616 (Standards Track)
    The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems. This documents HTTP/1.1.
RFC 2396 (Standards Track)
    A Uniform Resource Identifier (URI) is a compact string of characters for identifying an abstract or physical resource.
RFC 4346 (Standards Track)
    The TLS protocol provides communications security over the Internet. It provides encryption, and is designed to prevent eavesdropping, tampering, and message forgery.

HTML Recommendations

Regarding the Hypertext Markup Language, Apache complies with the following IETF and W3C recommendations:

RFC 2854 (Informational)
    This document summarizes the history of HTML development, and defines the "text/html" MIME type by pointing to the relevant W3C recommendations.
HTML 4.01 Specification (Errata)
    This specification defines the HyperText Markup Language (HTML), the publishing language of the World Wide Web. This specification defines HTML 4.01, which is a subversion of HTML 4.
HTML 3.2 Reference Specification
    The HyperText Markup Language (HTML) is a simple markup language used to create hypertext documents that are portable from one platform to another. HTML documents are SGML documents.
XHTML 1.1 - Module-based XHTML (Errata)
    This Recommendation defines a new XHTML document type that is based upon the module framework and modules defined in Modularization of XHTML.
XHTML 1.0 The Extensible HyperText Markup Language (Second Edition) (Errata)
    This specification defines the Second Edition of XHTML 1.0, a reformulation of HTML 4 as an XML 1.0 application, and three DTDs corresponding to the ones defined by HTML 4.

Authentication

Concerning the different methods of authentication, Apache follows the following IETF recommendations:

RFC 2617 (Standards Track)
    "HTTP/1.0", includes the specification for a Basic Access Authentication scheme.

Language/Country Codes

The following links document ISO and other language and country code information:

ISO 639-2
    ISO 639 provides two sets of language codes, one as a two-letter code set (639-1) and another as a three-letter code set (this part of ISO 639) for the representation of names of languages.
ISO 3166-1
    These pages document the country names (official short names in English) in alphabetical order as given in ISO 3166-1 and the corresponding ISO 3166-1-alpha-2 code elements.
BCP 47 (Best Current Practice), RFC 3066
    This document describes a language tag for use in cases where it is desired to indicate the language used in an information object, how to register values for use in this language tag, and a construct for matching such language tags.
RFC 3282 (Standards Track)
    This document defines a "Content-language:" header, for use in cases where one desires to indicate the language of something that has RFC 822-like headers, like MIME body parts or Web documents, and an "Accept-Language:" header for use in cases where one wishes to indicate one's preferences with regard to language.

9.6 Password Formats

Notes about the password encryption formats generated and understood by Apache.

Basic Authentication

There are five formats that Apache recognizes for basic-authentication passwords. Note that not all formats work on every platform:

bcrypt
    "$2y$" + the result of the crypt_blowfish algorithm. See the APR source file crypt_blowfish.c [37] for the details of the algorithm.
MD5
    "$apr1$" + the result of an Apache-specific algorithm using an iterated (1,000 times) MD5 digest of various combinations of a random 32-bit salt and the password. See the APR source file apr_md5.c [38] for the details of the algorithm.
SHA1
    "{SHA}" + Base64-encoded SHA-1 digest of the password. Insecure.
CRYPT
    Unix only. Uses the traditional Unix crypt(3) function with a randomly-generated 32-bit salt (only 12 bits used) and the first 8 characters of the password. Insecure.
PLAIN TEXT (i.e. unencrypted)
    Windows & Netware only. Insecure.

[37] http://svn.apache.org/viewvc/apr/apr/trunk/crypto/crypt_blowfish.c?view=markup
[38] http://svn.apache.org/viewvc/apr/apr/trunk/crypto/apr_md5.c?view=markup

Generating values with htpasswd

bcrypt
    $ htpasswd -nbB myName myPassword
    myName:$2y$05$c4WoMPo3SXsafkva.HHa6uXQZWr7oboPiC2bT/r7q1BB8I2s0BRqC
MD5
    $ htpasswd -nbm myName myPassword
    myName:$apr1$r31.....$HqJZimcKQFAMYayBlzkrA/
SHA1
    $ htpasswd -nbs myName myPassword
    myName:{SHA}VBPuJHI7uixaa6LQGWx4s+5GKNE=
CRYPT
    $ htpasswd -nbd myName myPassword
    myName:rqXexS6ZhobKA

Generating CRYPT and MD5 values with the OpenSSL command-line program

OpenSSL knows the Apache-specific MD5 algorithm.
MD5
    $ openssl passwd -apr1 myPassword
    $apr1$qHDFfhPC$nITSVHgYbDAK1Y0acGRnY0
CRYPT
    openssl passwd -crypt myPassword
    qQ5vTYO3c8dsU

Validating CRYPT or MD5 passwords with the OpenSSL command line program

The salt for a CRYPT password is the first two characters (converted to a binary value). To validate myPassword against rqXexS6ZhobKA

CRYPT
    $ openssl passwd -crypt -salt rq myPassword
    Warning: truncating password to 8 characters
    rqXexS6ZhobKA

Note that using myPasswo instead of myPassword will produce the same result because only the first 8 characters of CRYPT passwords are considered.

The salt for an MD5 password is between $apr1$ and the following $ (as a Base64-encoded binary value - max 8 chars). To validate myPassword against $apr1$r31.....$HqJZimcKQFAMYayBlzkrA/

MD5
    $ openssl passwd -apr1 -salt r31..... myPassword
    $apr1$r31.....$HqJZimcKQFAMYayBlzkrA/

Database password fields for mod_dbd

The SHA1 variant is probably the most useful format for DBD authentication. Since the SHA1 and Base64 functions are commonly available, other software can populate a database with encrypted passwords that are usable by Apache basic authentication.

To create Apache SHA1-variant basic-authentication passwords in various languages:

PHP
    '{SHA}' . base64_encode(sha1($password, TRUE))
Java
    "{SHA}" + new sun.misc.BASE64Encoder().encode(java.security.MessageDigest.getInstance("SHA1").digest(passw
ColdFusion
    "{SHA}" & ToBase64(BinaryDecode(Hash(password, "SHA1"), "Hex"))
Ruby
    require 'digest/sha1'
    require 'base64'
    # strict_encode64 avoids the trailing newline that encode64 appends
    '{SHA}' + Base64.strict_encode64(Digest::SHA1.digest(password))
C or C++
    Use the APR function: apr_sha1_base64
Python
    import base64
    import hashlib
    # password is a str; hash its UTF-8 bytes and decode the Base64 result back to text
    "{SHA}" + base64.b64encode(hashlib.sha1(password.encode("utf-8")).digest()).decode("ascii")
PostgreSQL (with the contrib/pgcrypto functions installed)
    '{SHA}'||encode(digest(password,'sha1'),'base64')

Digest Authentication

Apache recognizes one format for digest-authentication passwords: the MD5 hash of the string user:realm:password as a 32-character string of hexadecimal digits. realm is the Authorization Realm argument to the AuthName directive in httpd.conf.

Database password fields for mod_dbd

Since the MD5 function is commonly available, other software can populate a database with encrypted passwords that are usable by Apache digest authentication.

To create Apache digest-authentication passwords in various languages:

PHP
    md5($user . ':' . $realm . ':' . $password)
Java
    byte b[] = java.security.MessageDigest.getInstance("MD5").digest(
        (user + ":" + realm + ":" + password).getBytes());
    java.math.BigInteger bi = new java.math.BigInteger(1, b);
    String s = bi.toString(16);
    while (s.length() < 32)
        s = "0" + s;
    // String s is the encrypted password
ColdFusion
    LCase(Hash( (user & ":" & realm & ":" & password) , "MD5"))
Ruby
    require 'digest/md5'
    Digest::MD5.hexdigest(user + ':' + realm + ':' + password)
PostgreSQL (with the contrib/pgcrypto functions installed)
    encode(digest( user || ':' || realm || ':' || password , 'md5'), 'hex')

Chapter 10

Apache modules

10.1 Terms Used to Describe Modules

This document describes the terms that are used to describe each Apache module (p. 1101).

Description
    A brief description of the purpose of the module.
Status
    This indicates how tightly bound into the Apache Web server the module is; in other words, you may need to recompile the server in order to gain access to the module and its functionality. Possible values for this attribute are:

    MPM
        A module with status "MPM" is a Multi-Processing Module (p. 90). Unlike the other types of modules, Apache must have one and only one MPM in use at any time. This type of module is responsible for basic request handling and dispatching.
    Base
        A module labeled as having "Base" status is compiled and loaded into the server by default, and is therefore normally available unless you have taken steps to remove the module from your configuration.
    Extension
        A module with "Extension" status is not normally compiled and loaded into the server. To enable the module and its functionality, you may need to change the server build configuration files and re-compile Apache.
    Experimental
        "Experimental" status indicates that the module is available as part of the Apache kit, but you are on your own if you try to use it. The module is being documented for completeness, and is not necessarily supported.
    External
        Modules which are not included with the base Apache distribution ("third-party modules") may use the "External" status. We are not responsible for, nor do we support, such modules.

Source File
    This lists the name of the source file which contains the code for the module. This is also the name used by the <IfModule> directive.

Module Identifier
    This is a string which identifies the module for use in the LoadModule directive when dynamically loading modules. In particular, it is the name of the external variable of type module in the source file.

Compatibility
    If the module was not part of the original Apache version 2 distribution, the version in which it was introduced should be listed here. In addition, if the module is limited to particular platforms, the details will be listed here.
10.2 Terms Used to Describe Directives

This document describes the terms that are used to describe each Apache configuration directive (p. 1106).

See also
• Configuration files (p. 32)

Description
    A brief description of the purpose of the directive.

Syntax
    This indicates the format of the directive as it would appear in a configuration file. This syntax is extremely directive-specific, and is described in detail in the directive's definition. Generally, the directive name is followed by a series of one or more space-separated arguments. If an argument contains a space, the argument must be enclosed in double quotes. Optional arguments are enclosed in square brackets. Where an argument can take on more than one possible value, the possible values are separated by vertical bars "|". Literal text is presented in the default font, while argument-types for which substitution is necessary are emphasized. Directives which can take a variable number of arguments will end in "..." indicating that the last argument is repeated.

    Directives use a great number of different argument types. A few common ones are defined below.

    URL
        A complete Uniform Resource Locator including a scheme, hostname, and optional pathname as in http://www.example.com/path/to/file.html
    URL-path
        The part of a URL which follows the scheme and hostname as in /path/to/file.html. The URL-path represents a web-view of a resource, as opposed to a file-system view.
    file-path
        The path to a file in the local file-system beginning with the root directory as in /usr/local/apache/htdocs/path/to/file.html. Unless otherwise specified, a file-path which does not begin with a slash will be treated as relative to the ServerRoot (p. 380).
    directory-path
        The path to a directory in the local file-system beginning with the root directory as in /usr/local/apache/htdocs/path/to/.
    filename
        The name of a file with no accompanying path information as in file.html.
    regex
        A Perl-compatible regular expression. The directive definition will specify what the regex is matching against.
    extension
        In general, this is the part of the filename which follows the last dot. However, Apache recognizes multiple filename extensions, so if a filename contains more than one dot, each dot-separated part of the filename following the first dot is an extension. For example, the filename file.html.en contains two extensions: .html and .en. For Apache directives, you may specify extensions with or without the leading dot. In addition, extensions are not case sensitive.
    MIME-type
        A method of describing the format of a file which consists of a major format type and a minor format type, separated by a slash as in text/html.
    env-variable
        The name of an environment variable (p. 92) defined in the Apache configuration process. Note this is not necessarily the same as an operating system environment variable. See the environment variable documentation (p. 92) for more details.

Default
    If the directive has a default value (i.e., if you omit it from your configuration entirely, the Apache Web server will behave as though you set it to a particular value), it is described here. If there is no default value, this section should say "None". Note that the default listed here is not necessarily the same as the value the directive takes in the default httpd.conf distributed with the server.

Context
    This indicates where in the server's configuration files the directive is legal. It's a comma-separated list of one or more of the following values:

    server config
        This means that the directive may be used in the server configuration files (e.g., httpd.conf), but not within any <VirtualHost> or <Directory> containers. It is not allowed in .htaccess files at all.
    virtual host
        This context means that the directive may appear inside <VirtualHost> containers in the server configuration files.
    directory
        A directive marked as being valid in this context may be used inside <Directory>, <Location>, <Files>, <If>, and <Proxy> containers in the server configuration files, subject to the restrictions outlined in Configuration Sections (p. 35).
    .htaccess
        If a directive is valid in this context, it means that it can appear inside per-directory .htaccess files. It may not be processed, though, depending upon the overrides currently active.

    The directive is only allowed within the designated context; if you try to use it elsewhere, you'll get a configuration error that will either prevent the server from handling requests in that context correctly, or will keep the server from operating at all, i.e., the server won't even start.

    The valid locations for the directive are actually the result of a Boolean OR of all of the listed contexts. In other words, a directive that is marked as being valid in "server config, .htaccess" can be used in the httpd.conf file and in .htaccess files, but not within any <Directory> or <VirtualHost> containers.

Override
    This directive attribute indicates which configuration override must be active in order for the directive to be processed when it appears in a .htaccess file. If the directive's context doesn't permit it to appear in .htaccess files, then no context will be listed.

    Overrides are activated by the AllowOverride directive, and apply to a particular scope (such as a directory) and all descendants, unless further modified by other AllowOverride directives at lower levels. The documentation for that directive also lists the possible override names available.

Status
    This indicates how tightly bound into the Apache Web server the directive is; in other words, you may need to recompile the server with an enhanced set of modules in order to gain access to the directive and its functionality. Possible values for this attribute are:

    Core
        If a directive is listed as having "Core" status, that means it is part of the innermost portions of the Apache Web server, and is always available.
    MPM
        A directive labeled as having "MPM" status is provided by a Multi-Processing Module (p. 90). This type of directive will be available if and only if you are using one of the MPMs listed on the Module line of the directive definition.
    Base
        A directive labeled as having "Base" status is supported by one of the standard Apache modules which is compiled into the server by default, and is therefore normally available unless you've taken steps to remove the module from your configuration.
    Extension
        A directive with "Extension" status is provided by one of the modules included with the Apache server kit, but the module isn't normally compiled into the server. To enable the directive and its functionality, you will need to change the server build configuration files and re-compile Apache.
    Experimental
        "Experimental" status indicates that the directive is available as part of the Apache kit, but you're on your own if you try to use it. The directive is being documented for completeness, and is not necessarily supported. The module which provides the directive may or may not be compiled in by default; check the top of the page which describes the directive and its module to see if it remarks on the availability.

Module
    This lists the name of the source module which defines the directive.

Compatibility
    If the directive wasn't part of the original Apache version 2 distribution, the version in which it was introduced should be listed here. In addition, if the directive is available only on certain platforms, it will be noted here.
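The multiple-extension rule given for the "extension" argument type (every dot-separated part after the first dot counts, and case is ignored) can be illustrated with a short Python sketch. This is only an illustration of the rule as worded in this section, not the server's own parsing code; the function name filename_extensions is hypothetical.

```python
def filename_extensions(filename):
    """Return every dot-separated part after the first dot, lower-cased.

    Hypothetical helper: it mirrors the wording of the documentation,
    not the actual implementation inside the server.
    """
    parts = filename.split(".")
    # parts[0] is the base name; everything after it is an extension.
    # Extensions are not case sensitive, so normalise to lower case.
    return [part.lower() for part in parts[1:] if part]


if __name__ == "__main__":
    print(filename_extensions("file.html.en"))
    print(filename_extensions("Index.HTML"))
```

For file.html.en the sketch yields the two extensions html and en, matching the example in the text.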
10.3 Apache Module core

Description: Core Apache HTTP Server features that are always available
Status: Core

Directives
• AcceptFilter
• AcceptPathInfo
• AccessFileName
• AddDefaultCharset
• AllowEncodedSlashes
• AllowOverride
• AllowOverrideList
• AsyncFilter
• CGIMapExtension
• CGIPassAuth
• CGIVar
• ContentDigest
• DefaultRuntimeDir
• DefaultType
• Define
• <Directory>
• <DirectoryMatch>
• DocumentRoot
• <Else>
• <ElseIf>
• EnableMMAP
• EnableSendfile
• Error
• ErrorDocument
• ErrorLog
• ErrorLogFormat
• ExtendedStatus
• FileETag
• <Files>
• <FilesMatch>
• ForceType
• GprofDir
• HostnameLookups
• <If>
• <IfDefine>
• <IfModule>
• Include
• IncludeOptional
• KeepAlive
• KeepAliveTimeout
• <Limit>
• <LimitExcept>
• LimitInternalRecursion
• LimitRequestBody
• LimitRequestFields
• LimitRequestFieldSize
• LimitRequestLine
• LimitXMLRequestBody
• <Location>
• <LocationMatch>
• LogLevel
• LogLevelOverride
• MaxKeepAliveRequests
• MaxRangeOverlaps
• MaxRangeReversals
• MaxRanges
• MergeTrailers
• Mutex
• NameVirtualHost
• Options
• Protocol
• Protocols
• ProtocolsHonorOrder
• QualifyRedirectURL
• RegisterHttpMethod
• RLimitCPU
• RLimitMEM
• RLimitNPROC
• ScriptInterpreterSource
• SeeRequestTail
• ServerAdmin
• ServerAlias
• ServerName
• ServerPath
• ServerRoot
• ServerSignature
• ServerTokens
• SetHandler
• SetInputFilter
• SetOutputFilter
• TimeOut
• TraceEnable
• UnDefine
• UseCanonicalName
• UseCanonicalPhysicalPort
• <VirtualHost>
• Warning

AcceptFilter Directive

Description: Configures optimizations for a Protocol's Listener Sockets
Syntax: AcceptFilter protocol accept_filter
Context: server config
Status: Core
Module: core

This directive enables operating system specific optimizations for a listening socket by the Protocol type. The basic premise is for the kernel to not send a socket to the server process until either data is received or an entire HTTP Request is buffered. Only FreeBSD's Accept Filters [1], Linux's more primitive TCP_DEFER_ACCEPT, and Windows' optimized AcceptEx() are currently supported.
Using none for an argument will disable any accept filters for that protocol. This is useful for protocols that require a server send data first, such as ftp: or nntp:

    AcceptFilter nntp none

The default protocol names are https for port 443 and http for all other ports. To specify that another protocol is being used with a listening port, add the protocol argument to the Listen directive.

The default values on FreeBSD are:

    AcceptFilter http httpready
    AcceptFilter https dataready

The httpready accept filter buffers entire HTTP requests at the kernel level. Once an entire request is received, the kernel then sends it to the server. See the accf_http(9) [2] man page for more details. Since HTTPS requests are encrypted, only the accf_data(9) [3] filter is used.

The default values on Linux are:

    AcceptFilter http data
    AcceptFilter https data

Linux's TCP_DEFER_ACCEPT does not support buffering http requests. Any value besides none will enable TCP_DEFER_ACCEPT on that listener. For more details see the Linux tcp(7) [4] man page.

The default values on Windows are:

    AcceptFilter http data
    AcceptFilter https data

The Windows mpm_winnt interprets the AcceptFilter to toggle the AcceptEx() API, and does not support http protocol buffering. There are two values which utilize the Windows AcceptEx() API and will recycle network sockets between connections. data waits until data has been transmitted as documented above, and the initial data buffer and network endpoint addresses are all retrieved from the single AcceptEx() invocation. connect will use the AcceptEx() API and also retrieve the network endpoint addresses, but like none the connect option does not wait for the initial data transmission.

[1] http://www.freebsd.org/cgi/man.cgi?query=accept_filter&sektion=9
[2] http://www.freebsd.org/cgi/man.cgi?query=accf_http&sektion=9
[3] http://www.freebsd.org/cgi/man.cgi?query=accf_data&sektion=9
On Windows, none uses accept() rather than AcceptEx() and will not recycle sockets between connections. This is useful for network adapters with broken driver support, as well as some virtual network providers such as vpn drivers, or spam, virus or spyware filters.

See also
• Protocol

AcceptPathInfo Directive

Description: Resources accept trailing pathname information
Syntax: AcceptPathInfo On|Off|Default
Default: AcceptPathInfo Default
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Core
Module: core

This directive controls whether requests that contain trailing pathname information that follows an actual filename (or non-existent file in an existing directory) will be accepted or rejected. The trailing pathname information can be made available to scripts in the PATH_INFO environment variable.

For example, assume the location /test/ points to a directory that contains only the single file here.html. Then requests for /test/here.html/more and /test/nothere.html/more both collect /more as PATH_INFO.

The three possible arguments for the AcceptPathInfo directive are:

Off
    A request will only be accepted if it maps to a literal path that exists. Therefore a request with trailing pathname information after the true filename such as /test/here.html/more in the above example will return a 404 NOT FOUND error.
On
    A request will be accepted if a leading path component maps to a file that exists. The above example /test/here.html/more will be accepted if /test/here.html maps to a valid file.
Default
    The treatment of requests with trailing pathname information is determined by the handler (p. 108) responsible for the request. The core handler for normal files defaults to rejecting PATH_INFO requests. Handlers that serve scripts, such as cgi-script (p. 580) and isapi-handler (p. 683), generally accept PATH_INFO by default.

[4] http://homepages.cwi.nl/~aeb/linux/man2html/man7/tcp.7.html
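The way trailing pathname information is collected in the /test/here.html/more example can be sketched in Python. This is a simplified illustration under stated assumptions, not the server's directory-walk code: the function name split_path_info and its three-value return are hypothetical, and details such as ".." segments, symbolic links and trailing slashes are ignored.

```python
import os


def split_path_info(docroot, url_path):
    """Split url_path into (file part, trailing path info, file exists).

    Hypothetical helper: walk the URL segments while they name existing
    directories; the first segment that is not a directory is treated as
    the filename, and whatever follows it is the PATH_INFO.
    """
    segments = [s for s in url_path.split("/") if s]
    current = docroot
    for i, seg in enumerate(segments):
        candidate = os.path.join(current, seg)
        if os.path.isdir(candidate):
            current = candidate
            continue
        # First non-directory component: the (possibly missing) file.
        file_part = "/" + "/".join(segments[: i + 1])
        rest = segments[i + 1:]
        path_info = "/" + "/".join(rest) if rest else ""
        return file_part, path_info, os.path.isfile(candidate)
    # Every segment named a directory, so there is no path info.
    return "/" + "/".join(segments), "", False
```

With a document root containing only test/here.html, both /test/here.html/more and /test/nothere.html/more yield /more as the path info; the third value distinguishes the existing file from the missing one, which is the distinction the On argument relies on.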
The primary purpose of the AcceptPathInfo directive is to allow you to override the handler's choice of accepting or rejecting PATH_INFO. This override is required, for example, when you use a filter (p. 110), such as INCLUDES (p. 667), to generate content based on PATH_INFO. The core handler would usually reject the request, so you can use the following configuration to enable such a script:

    Options +Includes
    SetOutputFilter INCLUDES
    AcceptPathInfo On

AccessFileName Directive

Description: Name of the distributed configuration file
Syntax: AccessFileName filename [filename] ...
Default: AccessFileName .htaccess
Context: server config, virtual host
Status: Core
Module: core

While processing a request, the server looks for the first existing configuration file from this list of names in every directory of the path to the document, if distributed configuration files are enabled for that directory. For example:

    AccessFileName .acl

Before returning the document /usr/local/web/index.html, the server will read /.acl, /usr/.acl, /usr/local/.acl and /usr/local/web/.acl for directives unless they have been disabled with:

    AllowOverride None

See also
• AllowOverride
• Configuration Files (p. 32)
• .htaccess Files (p. 249)

AddDefaultCharset Directive

Description: Default charset parameter to be added when a response content-type is text/plain or text/html
Syntax: AddDefaultCharset On|Off|charset
Default: AddDefaultCharset Off
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Core
Module: core

This directive specifies a default value for the media type charset parameter (the name of a character encoding) to be added to a response if and only if the response's content-type is either text/plain or text/html. This should override any charset specified in the body of the response via a META element, though the exact behavior is often dependent on the user's client configuration.
A setting of AddDefaultCharset Off disables this functionality. AddDefaultCharset On enables a default charset of iso-8859-1. Any other value is assumed to be the charset to be used, which should be one of the IANA registered charset values [5] for use in Internet media types (MIME types). For example:

    AddDefaultCharset utf-8

AddDefaultCharset should only be used when all of the text resources to which it applies are known to be in that character encoding and it is too inconvenient to label their charset individually. One such example is to add the charset parameter to resources containing generated content, such as legacy CGI scripts, that might be vulnerable to cross-site scripting attacks due to user-provided data being included in the output. Note, however, that a better solution is to just fix (or delete) those scripts, since setting a default charset does not protect users that have enabled the "auto-detect character encoding" feature on their browser.

See also
• AddCharset

AllowEncodedSlashes Directive

Description: Determines whether encoded path separators in URLs are allowed to be passed through
Syntax: AllowEncodedSlashes On|Off|NoDecode
Default: AllowEncodedSlashes Off
Context: server config, virtual host
Status: Core
Module: core
Compatibility: NoDecode option available in 2.3.12 and later.

The AllowEncodedSlashes directive allows URLs which contain encoded path separators (%2F for / and additionally %5C for \ on accordant systems) to be used in the path info.

With the default value, Off, such URLs are refused with a 404 (Not found) error.

With the value On, such URLs are accepted, and encoded slashes are decoded like all other encoded characters.

With the value NoDecode, such URLs are accepted, but encoded slashes are not decoded but left in their encoded state.

Turning AllowEncodedSlashes On is mostly useful when used in conjunction with PATH_INFO.
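The difference between the three values can be made concrete with a small Python sketch. It is an illustration of the behaviour described above, not the server's URL-unescaping code: the function name decode_path is hypothetical, only %2F is handled (the %5C case is left out), and a refused request is modelled by returning None.

```python
import re
from urllib.parse import unquote

# Matches an encoded forward slash in either letter case.
_ENCODED_SLASH = re.compile(r"(%2f)", re.IGNORECASE)


def decode_path(raw_path, mode="Off"):
    """Decode a URL path the way each AllowEncodedSlashes value is described.

    Hypothetical helper. Returns the decoded path, or None where the text
    says the request is refused with a 404.
    """
    if mode == "Off":
        if _ENCODED_SLASH.search(raw_path):
            return None  # encoded slash present: refused
        return unquote(raw_path)
    if mode == "On":
        # Encoded slashes are decoded like every other escape.
        return unquote(raw_path)
    if mode == "NoDecode":
        # Decode everything except the encoded slashes themselves.
        pieces = _ENCODED_SLASH.split(raw_path)
        return "".join(
            piece if index % 2 else unquote(piece)
            for index, piece in enumerate(pieces)
        )
    raise ValueError("unknown mode: " + mode)
```

For the raw path /a%2Fb%20c, the sketch returns None under Off, /a/b c under On, and /a%2Fb c under NoDecode, which shows why NoDecode keeps the original segment boundaries intact.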
=⇒ Note
If encoded slashes are needed in path info, use of NoDecode is strongly recommended as a security measure. Allowing slashes to be decoded could potentially allow unsafe paths.

See also
• AcceptPathInfo

[5] http://www.iana.org/assignments/character-sets

AllowOverride Directive

Description: Types of directives that are allowed in .htaccess files
Syntax: AllowOverride All|None|directive-type [directive-type] ...
Default: AllowOverride None (2.3.9 and later), AllowOverride All (2.3.8 and earlier)
Context: directory
Status: Core
Module: core

When the server finds an .htaccess file (as specified by AccessFileName), it needs to know which directives declared in that file can override earlier configuration directives.

=⇒ Only available in <Directory> sections
AllowOverride is valid only in <Directory> sections specified without regular expressions, not in <Location>, <DirectoryMatch> or <Files> sections.

When this directive is set to None and AllowOverrideList is set to None, .htaccess files are completely ignored. In this case, the server will not even attempt to read .htaccess files in the filesystem.

When this directive is set to All, then any directive which has the .htaccess Context (p. 377) is allowed in .htaccess files.

The directive-type can be one of the following groupings of directives.

AuthConfig
    Allow use of the authorization directives (AuthDBMGroupFile, AuthDBMUserFile, AuthGroupFile, AuthName, AuthType, AuthUserFile, Require, etc.).
FileInfo
    Allow use of the directives controlling document types (ErrorDocument, ForceType, LanguagePriority, SetHandler, SetInputFilter, SetOutputFilter, and mod_mime Add* and Remove* directives), document meta data (Header, RequestHeader, SetEnvIf, SetEnvIfNoCase, BrowserMatch, CookieExpires, CookieDomain, CookieStyle, CookieTracking, CookieName), mod_rewrite directives (RewriteEngine, RewriteOptions, RewriteBase, RewriteCond, RewriteRule), mod_alias directives (Redirect, RedirectTemp, RedirectPermanent, RedirectMatch), and Action from mod_actions.
Indexes
    Allow use of the directives controlling directory indexing (AddDescription, AddIcon, AddIconByEncoding, AddIconByType, DefaultIcon, DirectoryIndex, FallbackResource, FancyIndexing (p. 542), HeaderName, IndexIgnore, IndexOptions, ReadmeName, etc.).
Limit
    Allow use of the directives controlling host access (Allow, Deny and Order).
Nonfatal=[Override|Unknown|All]
    Allow use of AllowOverride option to treat invalid (unrecognized or disallowed) directives in .htaccess as nonfatal. Instead of causing an Internal Server Error, disallowed or unrecognized directives will be ignored and a warning logged:

    • Nonfatal=Override treats directives forbidden by AllowOverride as nonfatal.
    • Nonfatal=Unknown treats unknown directives as nonfatal. This covers typos and directives implemented by a module that's not present.
    • Nonfatal=All treats both the above as nonfatal.

    Note that a syntax error in a valid directive will still cause an Internal Server Error.

    ! Security
    Nonfatal errors may have security implications for .htaccess users. For example, if AllowOverride disallows AuthConfig, users' configuration designed to restrict access to a site will be disabled.
Options[=Option,...]
    Allow use of the directives controlling specific directory features (Options and XBitHack). An equal sign may be given followed by a comma-separated list, without spaces, of options that may be set using the Options command.

    =⇒ Implicit disabling of Options
    Even though the list of options that may be used in .htaccess files can be limited with this directive, as long as any Options directive is allowed any other inherited option can be disabled by using the non-relative syntax. In other words, this mechanism cannot force a specific option to remain set while allowing any others to be set.

        AllowOverride Options=Indexes,MultiViews

Example:

    AllowOverride AuthConfig Indexes

In the example above, all directives that are neither in the group AuthConfig nor Indexes cause an internal server error.

=⇒ For security and performance reasons, do not set AllowOverride to anything other than None in your <Directory "/"> block. Instead, find (or create) the <Directory> block that refers to the directory where you're actually planning to place a .htaccess file.

See also
• AccessFileName
• AllowOverrideList
• Configuration Files (p. 32)
• .htaccess Files (p. 249)

AllowOverrideList Directive

Description: Individual directives that are allowed in .htaccess files
Syntax: AllowOverrideList None|directive [directive-type] ...
Default: AllowOverrideList None
Context: directory
Status: Core
Module: core

When the server finds an .htaccess file (as specified by AccessFileName), it needs to know which directives declared in that file can override earlier configuration directives.

=⇒ Only available in <Directory> sections
AllowOverrideList is valid only in <Directory> sections specified without regular expressions, not in <Location>, <DirectoryMatch> or <Files> sections.

When this directive is set to None and AllowOverride is set to None, then .htaccess files are completely ignored. In this case, the server will not even attempt to read .htaccess files in the filesystem.

Example:

    AllowOverride None
    AllowOverrideList Redirect RedirectMatch
In the example above, only the Redirect and RedirectMatch directives are allowed. All others will cause an Internal Server Error.

Example:

    AllowOverride AuthConfig
    AllowOverrideList CookieTracking CookieName

In the example above, AllowOverride grants permission to the AuthConfig directive grouping and AllowOverrideList grants permission to only two directives from the FileInfo directive grouping. All others will cause an Internal Server Error.

See also
• AccessFileName
• AllowOverride
• Configuration Files (p. 32)
• .htaccess Files (p. 249)

AsyncFilter Directive

Description: Set the minimum filter type eligible for asynchronous handling
Syntax: AsyncFilter request|connection|network
Default: AsyncFilter request
Context: server config, virtual host
Status: Core
Module: core
Compatibility: Only available from Apache 2.5.0 and later.

This directive controls the minimum filter levels that are eligible for asynchronous handling. This may be necessary to support legacy external filters that did not handle meta buckets correctly.

If set to "network", asynchronous handling will be limited to the network filter only. If set to "connection", all connection and network filters will be eligible for asynchronous handling, including mod_ssl. If set to "request", all filters will be eligible for asynchronous handling.

CGIMapExtension Directive

Description: Technique for locating the interpreter for CGI scripts
Syntax: CGIMapExtension cgi-path .extension
Context: directory, .htaccess
Override: FileInfo
Status: Core
Module: core
Compatibility: NetWare only

This directive is used to control how Apache httpd finds the interpreter used to run CGI scripts. For example, setting CGIMapExtension sys:\foo.nlm .foo will cause all CGI script files with a .foo extension to be passed to the FOO interpreter.
CGIPassAuth Directive

Description: Enables passing HTTP authorization headers to scripts as CGI variables
Syntax: CGIPassAuth On|Off
Default: CGIPassAuth Off
Context: directory, .htaccess
Override: AuthConfig
Status: Core
Module: core
Compatibility: Available in Apache HTTP Server 2.4.13 and later

CGIPassAuth allows scripts access to HTTP authorization headers such as Authorization, which is required for scripts that implement HTTP Basic authentication. Normally these HTTP headers are hidden from scripts. This is to disallow scripts from seeing user ids and passwords used to access the server when HTTP Basic authentication is enabled in the web server. This directive should be used when scripts are allowed to implement HTTP Basic authentication.

This directive can be used instead of the compile-time setting SECURITY_HOLE_PASS_AUTHORIZATION which has been available in previous versions of Apache HTTP Server.

The setting is respected by any modules which use ap_add_common_vars(), such as mod_cgi, mod_cgid, mod_proxy_fcgi, mod_proxy_scgi, and so on. Notably, it affects modules which don't handle the request in the usual sense but still use this API; examples of this are mod_include and mod_ext_filter. Third-party modules that don't use ap_add_common_vars() may choose to respect the setting as well.

CGIVar Directive

Description: Controls how some CGI variables are set
Syntax: CGIVar variable rule
Context: directory, .htaccess
Override: FileInfo
Status: Core
Module: core
Compatibility: Available in Apache HTTP Server 2.4.21 and later

This directive controls how some CGI variables are set.

REQUEST_URI rules:

original-uri (default)
    The value is taken from the original request line, and will not reflect internal redirects or subrequests which change the requested resource.
current-uri
    The value reflects the resource currently being processed, which may be different than the original request from the client due to internal redirects or subrequests.
ContentDigest Directive

Description: Enables the generation of Content-MD5 HTTP Response headers
Syntax: ContentDigest On|Off
Default: ContentDigest Off
Context: server config, virtual host, directory, .htaccess
Override: Options
Status: Core
Module: core

This directive enables the generation of Content-MD5 headers as defined in RFC 1864 and RFC 2616.

MD5 is an algorithm for computing a "message digest" (sometimes called "fingerprint") of arbitrary-length data, with a high degree of confidence that any alterations in the data will be reflected in alterations in the message digest.

The Content-MD5 header provides an end-to-end message integrity check (MIC) of the entity-body. A proxy or client may check this header for detecting accidental modification of the entity-body in transit. Example header:

    Content-MD5: AuLb7Dp1rqtRtxz2m9kRpA==

Note that this can cause performance problems on your server since the message digest is computed on every request (the values are not cached).

Content-MD5 is only sent for documents served by the core, and not by any module. For example, SSI documents, output from CGI scripts, and byte range responses do not have this header.

DefaultRuntimeDir Directive

Description: Base directory for the server run-time files
Syntax: DefaultRuntimeDir directory-path
Default: DefaultRuntimeDir DEFAULT_REL_RUNTIMEDIR (logs/)
Context: server config
Status: Core
Module: core
Compatibility: Available in Apache 2.4.2 and later

The DefaultRuntimeDir directive sets the directory in which the server will create various run-time files (shared memory, locks, etc.). If set as a relative path, the full path will be relative to ServerRoot.

Example

    DefaultRuntimeDir scratch/

The default location of DefaultRuntimeDir may be modified by changing the DEFAULT_REL_RUNTIMEDIR #define at build time.

Note: ServerRoot should be specified before this directive is used.
Otherwise, the default value of ServerRoot would be used to set the base directory.

See also
• the security tips (p. 364) for information on how to properly set permissions on the ServerRoot

DefaultType Directive

Description: This directive has no effect other than to emit warnings if the value is not none. In prior versions, DefaultType would specify a default media type to assign to response content for which no other media type configuration could be found.
Syntax: DefaultType media-type|none
Default: DefaultType none
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Core
Module: core
Compatibility: All choices except none are DISABLED for 2.3.x and later.

This directive has been disabled. For backwards compatibility of configuration files, it may be specified with the value none, meaning no default media type. For example:

    DefaultType None

DefaultType None is only available in httpd-2.2.7 and later.

Use the mime.types configuration file and the AddType directive to configure media type assignments via file extensions, or the ForceType directive to configure the media type for specific resources. Otherwise, the server will send the response without a Content-Type header field and the recipient may attempt to guess the media type.

Define Directive

Description: Define a variable
Syntax: Define parameter-name [parameter-value]
Context: server config, virtual host
Status: Core
Module: core

In its one-parameter form, Define is equivalent to passing the -D argument to httpd. It can be used to toggle the use of <IfDefine> sections without needing to alter -D arguments in any startup scripts.

In addition, if the second parameter is given, a config variable is set to this value. The variable can be used in the configuration using the ${VAR} syntax. The variable is always globally defined and not limited to the scope of the surrounding config section.
    <IfDefine TEST>
      Define servername test.example.com
    </IfDefine>
    <IfDefine !TEST>
      Define servername www.example.com
      Define SSL
    </IfDefine>

    DocumentRoot "/var/www/${servername}/htdocs"

Variable names may not contain colon ":" characters, to avoid clashes with RewriteMap's syntax.

While this directive is supported in virtual host context, the changes it makes are visible to any later configuration directives, beyond any enclosing virtual host.

Directory Directive

Description: Enclose a group of directives that apply only to the named file-system directory, subdirectories, and their contents.
Syntax: <Directory directory-path> ... </Directory>
Context: server config, virtual host
Status: Core
Module: core

<Directory> and </Directory> are used to enclose a group of directives that will apply only to the named directory, sub-directories of that directory, and the files within the respective directories. Any directive that is allowed in a directory context may be used. Directory-path is either the full path to a directory, or a wild-card string using Unix shell-style matching. In a wild-card string, ? matches any single character, and * matches any sequence of characters. You may also use [] character ranges. None of the wildcards match a '/' character, so <Directory "/*/public_html"> will not match /home/user/public_html, but <Directory "/home/*/public_html"> will match. Example:

    <Directory "/usr/local/httpd/htdocs">
      Options Indexes FollowSymLinks
    </Directory>

Directory paths may be quoted if you like; a path must be quoted if it contains spaces, because a space would otherwise indicate the end of an argument.

=⇒ Be careful with the directory-path arguments: They have to literally match the filesystem path which Apache httpd uses to access the files. Directives applied to a particular <Directory> will not apply to files accessed from that same directory via a different path, such as via different symbolic links.

Regular expressions can also be used, with the addition of the ~ character. For example:

    <Directory ~ "^/www/[0-9]{3}">

    </Directory>

would match directories in /www/ that consisted of three numbers.
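The three ways of naming a directory described above (a quoted literal path, a shell-style wildcard, and a regular expression) can be sketched side by side. The paths and the Options values here are hypothetical and chosen only to illustrate the matching rules.

```apache
# Quoting is required because the path contains a space.
<Directory "/srv/web/shared files">
    Options Indexes
</Directory>

# "*" matches within a single path segment and never crosses a "/",
# so this covers /home/alice/public_html but not /home/a/b/public_html.
<Directory "/home/*/public_html">
    Options Indexes FollowSymLinks
</Directory>

# The "~" marker switches the argument to a regular expression:
# directories directly under /www/ whose name starts with three digits.
<Directory ~ "^/www/[0-9]{3}">
    Options None
</Directory>
```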
If multiple (non-regular expression) <Directory> sections match the directory (or one of its parents) containing a document, then the directives are applied in the order of shortest match first, interspersed with the directives from the .htaccess files. For example, with

    <Directory "/">
      AllowOverride None
    </Directory>

    <Directory "/home">
      AllowOverride FileInfo
    </Directory>

for access to the document /home/web/dir/doc.html the steps are:

• Apply directive AllowOverride None (disabling .htaccess files).
• Apply directive AllowOverride FileInfo (for directory /home).
• Apply any FileInfo directives in /home/.htaccess, /home/web/.htaccess and /home/web/dir/.htaccess in that order.

Regular expressions are not considered until after all of the normal sections have been applied. Then all of the regular expressions are tested in the order they appeared in the configuration file. For example, with

    <Directory ~ "abc$">
      # ... directives here ...
    </Directory>

the regular expression section won't be considered until after all normal <Directory>s and .htaccess files have been applied. Then the regular expression will match on /home/abc/public_html/abc and the corresponding <Directory> will be applied.

Note that the default access for <Directory "/"> is to permit all access. This means that Apache httpd will serve any file mapped from a URL. It is recommended that you change this with a block such as

    <Directory "/">
      Require all denied
    </Directory>

and then override this for directories you want accessible. See the Security Tips (p. 364) page for more details.

The directory sections occur in the httpd.conf file. <Directory> directives cannot nest, and cannot appear in a <Limit> or <LimitExcept> section.

See also
• How <Directory>, <Location> and <Files> sections work (p. 35) for an explanation of how these different sections are combined when a request is received

DirectoryMatch Directive

Description: Enclose directives that apply to the contents of file-system directories matching a regular expression.
Syntax: <DirectoryMatch regex> ... </DirectoryMatch>
Context: server config, virtual host
Status: Core
Module: core

<DirectoryMatch> and </DirectoryMatch> are used to enclose a group of directives which will apply only to the named directory (and the files within), the same as <Directory>. However, it takes as an argument a regular expression. For example:

    <DirectoryMatch "^/www/(.+/)?[0-9]{3}/">
      # ...
    </DirectoryMatch>

matches directories in /www/ (or any subdirectory thereof) that consist of three numbers.

=⇒ Compatibility
Prior to 2.3.9, this directive implicitly applied to sub-directories (like <Directory>) and could not match the end of line symbol ($). In 2.3.9 and later, only directories that match the expression are affected by the enclosed directives.

=⇒ Trailing Slash
This directive applies to requests for directories that may or may not end in a trailing slash, so expressions that are anchored to the end of line ($) must be written with care.

From 2.4.8 onwards, named groups and backreferences are captured and written to the environment with the corresponding name prefixed with "MATCH_" and in upper case. This allows elements of paths to be referenced from within expressions (p. 99) and modules like mod_rewrite. In order to prevent confusion, numbered (unnamed) backreferences are ignored. Use named groups instead.

    <DirectoryMatch "^/var/www/combined/(?<sitename>[^/]+)">
      Require ldap-group cn=%{env:MATCH_SITENAME},ou=combined,o=Example
    </DirectoryMatch>

See also
• <Directory> for a description of how regular expressions are mixed in with normal <Directory>s
• How <Directory>, <Location> and <Files> sections work (p. 35) for an explanation of how these different sections are combined when a request is received

DocumentRoot Directive

Description: Directory that forms the main document tree visible from the web
Syntax: DocumentRoot directory-path
Default: DocumentRoot /usr/local/apache/htdocs
Context: server config, virtual host
Status: Core
Module: core

This directive sets the directory from which httpd will serve files. Unless matched by a directive like Alias, the server appends the path from the requested URL to the document root to make the path to the document.
Example:

    DocumentRoot "/usr/web"

then an access to http://my.example.com/index.html refers to /usr/web/index.html. If the directory-path is not absolute then it is assumed to be relative to the ServerRoot.

The DocumentRoot should be specified without a trailing slash.

See also
• Mapping URLs to Filesystem Locations (p. 64)

Else Directive

Description: Contains directives that apply only if the condition of a previous <If> or <ElseIf> section is not satisfied by a request at runtime
Syntax: <Else> ... </Else>
Context: server config, virtual host, directory, .htaccess
Override: All
Status: Core
Module: core

The <Else> applies the enclosed directives if and only if the most recent <If> or <ElseIf> section in the same scope has not been applied. For example, in

    <If "-z req('Host')">
      # ...
    </If>
    <Else>
      # ...
    </Else>

the <If> would match HTTP/1.0 requests without a Host: header and the <Else> would match requests with a Host: header.

See also
• How <Directory>, <Location>, <Files> sections work (p. 35) for an explanation of how these different sections are combined when a request is received. <If>, <ElseIf>, and <Else> are applied last.

ElseIf Directive

Description: Contains directives that apply only if a condition is satisfied by a request at runtime while the condition of a previous <If> or <ElseIf> section is not satisfied
Syntax: <ElseIf expression> ... </ElseIf>
Context: server config, virtual host, directory, .htaccess
Override: All
Status: Core
Module: core

The <ElseIf> applies the enclosed directives if and only if both the given condition evaluates to true and the most recent <If> or <ElseIf> section in the same scope has not been applied. For example, in

    <If "-R '10.1.0.0/16'">
      #...
    </If>
    <ElseIf "-R '10.0.0.0/8'">
      #...
    </ElseIf>
    <Else>
      #...
    </Else>

the <ElseIf> would match if the remote address of a request belongs to the subnet 10.0.0.0/8 but not to the subnet 10.1.0.0/16.

See also
• Expressions in Apache HTTP Server (p. 99), for a complete reference and more examples.
• How <Directory>, <Location>, <Files> sections work (p. 35) for an explanation of how these different sections are combined when a request is received. <If>, <ElseIf>, and <Else> are applied last.
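To illustrate how a conditional chain is evaluated at request time, here is a hedged sketch that tags each response with exactly one of three values. It assumes mod_headers is loaded (the Header directive is not part of the core module), and the subnet, header name, and user-agent pattern are hypothetical.

```apache
# Assumes mod_headers is loaded; X-Client-Class is a made-up header name.
<If "-R '192.0.2.0/24'">
    # Remote address is inside the (hypothetical) internal network.
    Header set X-Client-Class "internal"
</If>
<ElseIf "%{HTTP_USER_AGENT} =~ /bot/i">
    # Considered only when the <If> above did not apply.
    Header set X-Client-Class "crawler"
</ElseIf>
<Else>
    # Reached only when neither condition above matched.
    Header set X-Client-Class "external"
</Else>
```

Because each later section is applied only when the preceding one in the same scope was not, a request from the internal network with a "bot" user agent is tagged "internal", never "crawler".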
EnableMMAP Directive

Description: Use memory-mapping to read files during delivery
Syntax: EnableMMAP On|Off
Default: EnableMMAP On
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Core
Module: core

This directive controls whether httpd may use memory-mapping if it needs to read the contents of a file during delivery. By default, when the handling of a request requires access to the data within a file (for example, when delivering a server-parsed file using mod_include), Apache httpd memory-maps the file if the OS supports it.

This memory-mapping sometimes yields a performance improvement. But in some environments, it is better to disable the memory-mapping to prevent operational problems:

• On some multiprocessor systems, memory-mapping can reduce the performance of httpd.
• Deleting or truncating a file while httpd has it memory-mapped can cause httpd to crash with a segmentation fault.

For server configurations that are vulnerable to these problems, you should disable memory-mapping of delivered files by specifying:

    EnableMMAP Off

For NFS mounted files, this feature may be disabled explicitly for the offending files by specifying:

    <Directory "/path-to-nfs-files">
      EnableMMAP Off
    </Directory>

EnableSendfile Directive

Description: Use the kernel sendfile support to deliver files to the client
Syntax: EnableSendfile On|Off
Default: EnableSendfile Off
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Core
Module: core
Compatibility: Default changed to Off in version 2.3.9.

This directive controls whether httpd may use the sendfile support from the kernel to transmit file contents to the client. By default, when the handling of a request requires no access to the data within a file (for example, when delivering a static file), Apache httpd uses sendfile to deliver the file contents without ever reading the file if the OS supports it.
This sendfile mechanism avoids separate read and send operations, and buffer allocations. But on some platforms or within some filesystems, it is better to disable this feature to avoid operational problems:

• Some platforms may have broken sendfile support that the build system did not detect, especially if the binaries were built on another box and moved to such a machine with broken sendfile support.
• On Linux the use of sendfile triggers TCP-checksum offloading bugs on certain networking cards when using IPv6.
• On Linux on Itanium, sendfile may be unable to handle files over 2GB in size.
• With a network-mounted DocumentRoot (e.g., NFS, SMB, CIFS, FUSE), the kernel may be unable to serve the network file through its own cache.

For server configurations that are not vulnerable to these problems, you may enable this feature by specifying:

    EnableSendfile On

For network mounted files, this feature may be disabled explicitly for the offending files by specifying:

    <Directory "/path-to-nfs-files">
      EnableSendfile Off
    </Directory>

Please note that the per-directory and .htaccess configuration of EnableSendfile is not supported by mod_cache_disk. Only the global definition of EnableSendfile is taken into account by the module.

Error Directive

Description: Abort configuration parsing with a custom error message
Syntax: Error message
Context: server config, virtual host, directory, .htaccess
Status: Core
Module: core
Compatibility: 2.3.9 and later

If an error can be detected within the configuration, this directive can be used to generate a custom error message, and halt configuration parsing. The typical use is for reporting required modules which are missing from the configuration.

    # Example
    # ensure that mod_include is loaded
    <IfModule !include_module>
      Error "mod_include is required by mod_foo. Load it with LoadModule."
    </IfModule>

    # ensure that exactly one of SSL,NOSSL is defined
    <IfDefine SSL>
    <IfDefine NOSSL>
      Error "Both SSL and NOSSL are defined. Define only one of them."
    </IfDefine>
    </IfDefine>
    <IfDefine !SSL>
    <IfDefine !NOSSL>
      Error "Either SSL or NOSSL must be defined."
    </IfDefine>
    </IfDefine>
ErrorDocument Directive

Description: What the server will return to the client in case of an error
Syntax: ErrorDocument error-code document
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Core
Module: core

In the event of a problem or error, Apache httpd can be configured to do one of four things:

1. output a simple hardcoded error message
2. output a customized message
3. internally redirect to a local URL-path to handle the problem/error
4. redirect to an external URL to handle the problem/error

The first option is the default, while options 2-4 are configured using the ErrorDocument directive, which is followed by the HTTP response code and a URL or a message. Apache httpd will sometimes offer additional information regarding the problem/error.

From 2.4.13, expression syntax (p. 99) can be used inside the directive to produce dynamic strings and URLs.

URLs can begin with a slash (/) for local web-paths (relative to the DocumentRoot), or be a full URL which the client can resolve. Alternatively, a message can be provided to be displayed by the browser. Note that deciding whether the parameter is a URL, a path or a message is performed before any expression is parsed. Examples:

    ErrorDocument 500 http://example.com/cgi-bin/server-error.cgi
    ErrorDocument 404 /errors/bad_urls.php
    ErrorDocument 401 /subscription_info.html
    ErrorDocument 403 "Sorry, can't allow you access today"
    ErrorDocument 403 Forbidden!
    ErrorDocument 403 /errors/forbidden.py?referrer=%{escape:%{HTTP_REFERER}}

Additionally, the special value default can be used to specify Apache httpd's simple hardcoded message. While not required under normal circumstances, default will restore Apache httpd's simple hardcoded message for configurations that would otherwise inherit an existing ErrorDocument.
    ErrorDocument 404 /cgi-bin/bad_urls.pl

    <Directory "/web/docs">
      ErrorDocument 404 default
    </Directory>

Note that when you specify an ErrorDocument that points to a remote URL (i.e., anything with a scheme such as http in front of it), Apache HTTP Server will send a redirect to the client to tell it where to find the document, even if the document ends up being on the same server. This has several implications, the most important being that the client will not receive the original error status code, but instead will receive a redirect status code. This in turn can confuse web robots and other clients which try to determine if a URL is valid using the status code. In addition, if you use a remote URL in an ErrorDocument 401, the client will not know to prompt the user for a password since it will not receive the 401 status code. Therefore, if you use an ErrorDocument 401 directive, then it must refer to a local document.

Microsoft Internet Explorer (MSIE) will by default ignore server-generated error messages when they are "too small" and substitute its own "friendly" error messages. The size threshold varies depending on the type of error, but in general, if you make your error document greater than 512 bytes, then MSIE will show the server-generated error rather than masking it. More information is available in Microsoft Knowledge Base article Q294807 [6].

Although most error messages can be overridden, there are certain circumstances where the internal messages are used regardless of the setting of ErrorDocument. In particular, if a malformed request is detected, normal request processing will be immediately halted and the internal error message returned. This is necessary to guard against security problems caused by bad requests.

If you are using mod_proxy, you may wish to enable ProxyErrorOverride so that you can provide custom error messages on behalf of your origin servers. If you don't enable ProxyErrorOverride, Apache httpd will not generate custom error documents for proxied content.
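The difference between local and remote error documents, and the role of ProxyErrorOverride, can be sketched as follows. All paths and host names below are hypothetical, and the proxy lines assume mod_proxy and mod_proxy_http are loaded.

```apache
# Local path: handled by internal redirect, so the client still receives
# the 401 status and knows to prompt for credentials.
ErrorDocument 401 /errors/auth-required.html

# Remote URL: the client is sent a redirect, so it sees a redirect status
# instead of the original 404.
ErrorDocument 404 http://errors.example.com/not-found.html

# Proxied content: without ProxyErrorOverride, error responses from the
# origin server pass through unchanged and the ErrorDocument is not used.
ProxyPass "/app/" "http://backend.example.com/app/"
ProxyErrorOverride On
ErrorDocument 500 /errors/backend-error.html
```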
See also
• documentation of customizable responses (p. 85)

[6] http://support.microsoft.com/default.aspx?scid=kb;en-us;Q294807

ErrorLog Directive

Description: Location where the server will log errors
Syntax: ErrorLog file-path|syslog[:facility]
Default: ErrorLog logs/error_log (Unix) ErrorLog logs/error.log (Windows and OS/2)
Context: server config, virtual host
Status: Core
Module: core

The ErrorLog directive sets the name of the file to which the server will log any errors it encounters. If the file-path is not absolute then it is assumed to be relative to the ServerRoot.

    ErrorLog "/var/log/httpd/error_log"

If the file-path begins with a pipe character "|" then it is assumed to be a command to spawn to handle the error log.

    ErrorLog "|/usr/local/bin/httpd_errors"

See the notes on piped logs (p. 56) for more information.

Using syslog instead of a filename enables logging via syslogd(8) if the system supports it and if mod_syslog is loaded. The default is to use syslog facility local7, but you can override this by using the syslog:facility syntax where facility can be one of the names usually documented in syslog(1). The facility is effectively global, and if it is changed in individual virtual hosts, the final facility specified affects the entire server.

    ErrorLog syslog:user

Additional modules can provide their own ErrorLog providers. The syntax is similar to the syslog example above.

SECURITY: See the security tips (p. 364) document for details on why your security could be compromised if the directory where log files are stored is writable by anyone other than the user that starts the server.

! Note
When entering a file path on non-Unix platforms, care should be taken to make sure that only forward slashes are used even though the platform may allow the use of back slashes. In general it is a good idea to always use forward slashes throughout the configuration files.

See also
• LogLevel
• Apache HTTP Server Log Files (p. 56)

ErrorLogFormat Directive

Description: Format specification for error log entries
Syntax: ErrorLogFormat [connection|request] format
Context: server config, virtual host
Status: Core
Module: core

ErrorLogFormat lets you specify what supplementary information is logged in the error log in addition to the actual log message.

    #Simple example
    ErrorLogFormat "[%t] [%l] [pid %P] %F: %E: [client %a] %M"

Specifying connection or request as the first parameter lets you define additional formats, causing additional information to be logged when the first message is logged for a specific connection or request, respectively. This additional information is only logged once per connection/request. If a connection or request is processed without causing any log message, the additional information is not logged either.

It can happen that some format string items do not produce output. For example, the Referer header is only present if the log message is associated with a request and the log message happens at a time when the Referer header has already been read from the client. If no output is produced, the default behavior is to delete everything from the preceding space character to the next space character. This means the log line is implicitly divided into fields on non-whitespace to whitespace transitions. If a format string item does not produce output, the whole field is omitted. For example, if the remote address %a in the log format [%t] [%l] [%a] %M is not available, the surrounding brackets are not logged either. Space characters can be escaped with a backslash to prevent them from delimiting a field. The combination '% ' (percent space) is a zero-width field delimiter that does not produce any output.

The above behavior can be changed by adding modifiers to the format string item. A - (minus) modifier causes a minus to be logged if the respective item does not produce any output.
In once-per-connection/request formats, it is also possible to use the + (plus) modifier. If an item with the plus modifier does not produce any output, the whole line is omitted.

A number as modifier can be used to assign a log severity level to a format item. The item will only be logged if the severity of the log message is not higher than the specified log severity level. The number can range from 1 (alert) through 4 (warn) and 7 (debug) to 15 (trace8).

For example, here's what would happen if you added modifiers to the %{Referer}i token, which logs the Referer request header.

    Modified Token    Meaning
    %-{Referer}i      Logs a - if Referer is not set.
    %+{Referer}i      Omits the entire line if Referer is not set.
    %4{Referer}i      Logs the Referer only if the log message severity is higher than 4.

Some format string items accept additional parameters in braces.

    Format String         Description
    %%                    The percent sign
    %a                    Client IP address and port of the request
    %{c}a                 Underlying peer IP address and port of the connection (see the mod_remoteip module)
    %A                    Local IP-address and port
    %{name}e              Request environment variable name
    %E                    APR/OS error status code and string
    %F                    Source file name and line number of the log call
    %{name}i              Request header name
    %k                    Number of keep-alive requests on this connection
    %l                    Loglevel of the message
    %L                    Log ID of the request
    %{c}L                 Log ID of the connection
    %{C}L                 Log ID of the connection if used in connection scope, empty otherwise
    %m                    Name of the module logging the message
    %M                    The actual log message
    %{name}n              Request note name
    %P                    Process ID of current process
    %T                    Thread ID of current thread
    %{g}T                 System unique thread ID of current thread (the same ID as displayed by e.g. top; currently Linux only)
    %t                    The current time
    %{u}t                 The current time including micro-seconds
    %{cu}t                The current time in compact ISO 8601 format, including micro-seconds
    %v                    The canonical ServerName of the current server.
    %V                    The server name of the server serving the request according to the UseCanonicalName setting.
    \  (backslash space)  Non-field delimiting space
    %  (percent space)    Field delimiter (no output)

The log ID format %L produces a unique id for a connection or request. This can be used to correlate which log lines belong to the same connection or request, and which request happens on which connection. A %L format string is also available in mod_log_config to allow access log entries to be correlated with error log lines. If mod_unique_id is loaded, its unique id will be used as log ID for requests.

    #Example (default format for threaded MPMs)
    ErrorLogFormat "[%{u}t] [%-m:%l] [pid %P:tid %T] %7F: %E: [client\ %a] %M% ,\ referer\ %{Refer

This would result in error messages such as:

    [Thu May 12 08:28:57.652118 2011] [core:error] [pid 8777:tid 4326490112] [client ::1:58619] File does not exist: /usr/local/apache2/htdocs/favicon.ico

Notice that, as discussed above, some fields are omitted entirely because they are not defined.

    #Example (similar to the 2.2.x format)
    ErrorLogFormat "[%t] [%l] %7F: %E: [client\ %a] %M% ,\ referer\ %{Referer}i"

    #Advanced example with request/connection log IDs
    ErrorLogFormat "[%{uc}t] [%-m:%-l] [R:%L] [C:%{C}L] %7F: %E: %M"
    ErrorLogFormat request "[%{uc}t] [R:%L] Request %k on C:%{c}L pid:%P tid:%T"
    ErrorLogFormat request "[%{uc}t] [R:%L] UA:'%+{User-Agent}i'"
    ErrorLogFormat request "[%{uc}t] [R:%L] Referer:'%+{Referer}i'"
    ErrorLogFormat connection "[%{uc}t] [C:%{c}L] local\ %a remote\ %A"

See also
• ErrorLog
• LogLevel
• Apache HTTP Server Log Files (p. 56)

ExtendedStatus Directive

Description: Keep track of extended status information for each request
Syntax: ExtendedStatus On|Off
Default: ExtendedStatus Off[*]
Context: server config
Status: Core
Module: core

This option tracks additional data per worker about the currently executing request and creates a utilization summary. You can see these variables during runtime by configuring mod_status.
Note that other modules may rely on this scoreboard.

This setting applies to the entire server and cannot be enabled or disabled on a virtualhost-by-virtualhost basis. The collection of extended status information can slow down the server. Also note that this setting cannot be changed during a graceful restart.

=⇒ Note that loading mod_status will change the default behavior to ExtendedStatus On, while other third party modules may do the same. Such modules rely on collecting detailed information about the state of all workers. The default is changed by mod_status beginning with version 2.3.6. The previous default was always Off.

FileETag Directive

Description: File attributes used to create the ETag HTTP response header for static files
Syntax: FileETag component ...
Default: FileETag MTime Size
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Core
Module: core
Compatibility: The default used to be "INode MTime Size" in 2.3.14 and earlier.

The FileETag directive configures the file attributes that are used to create the ETag (entity tag) response header field when the document is based on a static file. (The ETag value is used in cache management to save network bandwidth.) The FileETag directive allows you to choose which of these, if any, should be used. The recognized keywords are:

INode The file's i-node number will be included in the calculation
MTime The date and time the file was last modified will be included
Size The number of bytes in the file will be included
All All available fields will be used. This is equivalent to:

    FileETag INode MTime Size

None If a document is file-based, no ETag field will be included in the response

The INode, MTime, and Size keywords may be prefixed with either + or -, which allow changes to be made to the default setting inherited from a broader scope. Any keyword appearing without such a prefix immediately and completely cancels the inherited setting.
If a directory's configuration includes FileETag INode MTime Size, and a subdirectory's includes FileETag -INode, the setting for that subdirectory (which will be inherited by any sub-subdirectories that don't override it) will be equivalent to FileETag MTime Size.

! Warning
Do not change the default for directories or locations that have WebDAV enabled and use mod_dav_fs as a storage provider. mod_dav_fs uses MTime Size as a fixed format for ETag comparisons on conditional requests. These conditional requests will break if the ETag format is changed via FileETag.

=⇒ Server Side Includes
An ETag is not generated for responses parsed by mod_include since the response entity can change without a change of the INode, MTime, or Size of the static file with embedded SSI directives.

Files Directive

Description: Contains directives that apply to matched filenames
Syntax: <Files filename> ... </Files>
Context: server config, virtual host, directory, .htaccess
Override: All
Status: Core
Module: core

The <Files> directive limits the scope of the enclosed directives by filename. It is comparable to the <Directory> and <Location> directives. It should be matched with a </Files> directive. The directives given within this section will be applied to any object with a basename (last component of filename) matching the specified filename. <Files> sections are processed in the order they appear in the configuration file, after the <Directory> sections and .htaccess files are read, but before <Location> sections. Note that <Files> can be nested inside <Directory> sections to restrict the portion of the filesystem they apply to.

The filename argument should include a filename, or a wild-card string, where ? matches any single character, and * matches any sequence of characters.

    <Files "cat.html">
      # Insert stuff that applies to cat.html here
    </Files>

    <Files "?at.*">
      # This would apply to cat.html, bat.html, hat.php and so on.
    </Files>

Regular expressions can also be used, with the addition of the ~ character. For example:

    <Files ~ "\.(gif|jpe?g|png)$">
      #...
    </Files>

would match most common Internet graphics formats. <FilesMatch> is preferred, however.
Note that unlike <Directory> and <Location> sections, <Files> sections can be used inside .htaccess files. This allows users to control access to their own files, at a file-by-file level.

See also
• How <Directory>, <Location> and <Files> sections work (p. 35) for an explanation of how these different sections are combined when a request is received

FilesMatch Directive

Description: Contains directives that apply to regular-expression matched filenames
Syntax: <FilesMatch regex> ... </FilesMatch>
Context: server config, virtual host, directory, .htaccess
Override: All
Status: Core
Module: core

The <FilesMatch> directive limits the scope of the enclosed directives by filename, just as the <Files> directive does. However, it accepts a regular expression. For example:

    <FilesMatch ".+\.(gif|jpe?g|png)$">
      # ...
    </FilesMatch>

would match most common Internet graphics formats.

=⇒ The .+ at the start of the regex ensures that files named .png, or .gif, for example, are not matched.

From 2.4.8 onwards, named groups and backreferences are captured and written to the environment with the corresponding name prefixed with "MATCH_" and in upper case. This allows elements of files to be referenced from within expressions (p. 99) and modules like mod_rewrite. In order to prevent confusion, numbered (unnamed) backreferences are ignored. Use named groups instead.

    <FilesMatch "^(?<sitename>[^/]+)">
      require ldap-group cn=%{env:MATCH_SITENAME},ou=combined,o=Example
    </FilesMatch>

See also
• How <Directory>, <Location> and <Files> sections work (p. 35) for an explanation of how these different sections are combined when a request is received

ForceType Directive

Description: Forces all matching files to be served with the specified media type in the HTTP Content-Type header field
Syntax: ForceType media-type|None
Context: directory, .htaccess
Override: FileInfo
Status: Core
Module: core

When placed into an .htaccess file or a <Directory>, <Location>, or <Files> section, this directive forces all matching files to be served with the content type identification given by media-type.
For example, if you had a directory full of GIF files, but did not want to label them all with .gif, you might want to use:

    ForceType image/gif

Note that this directive overrides other indirect media type associations defined in mime.types or via the AddType directive.

You can also override more general ForceType settings by using the value of None:

    # force all files to be image/gif:
    <Location "/images">
      ForceType image/gif
    </Location>

    # but normal mime-type associations here:
    <Location "/images/mixed">
      ForceType None
    </Location>

This directive primarily overrides the content types generated for static files served out of the filesystem. For resources other than static files, where the generator of the response typically specifies a Content-Type, this directive has no effect.

=⇒ Note
If no handler is explicitly set for a request, the specified content type will also be used as the handler name. When explicit directives such as SetHandler or AddHandler do not apply to the current request, the internal handler name normally set by those directives is instead set to the content type specified by this directive. This is a historical behavior that some third-party modules (such as mod_php) may rely on: they look for a "synthetic" content type used only to signal the module to take responsibility for the matching request. Configurations that rely on such "synthetic" types should be avoided. Additionally, configurations that restrict access to SetHandler or AddHandler should restrict access to this directive as well.

GprofDir Directive

Description: Directory to write gmon.out profiling data to.
Syntax: GprofDir /tmp/gprof/|/tmp/gprof/%
Context: server config, virtual host
Status: Core
Module: core

When the server has been compiled with gprof profiling support, GprofDir causes gmon.out files to be written to the specified directory when the process exits. If the argument ends with a percent symbol ('%'), subdirectories are created for each process id.

This directive currently only works with the prefork MPM.
HostnameLookups Directive

Description: Enables DNS lookups on client IP addresses
Syntax: HostnameLookups On|Off|Double
Default: HostnameLookups Off
Context: server config, virtual host, directory
Status: Core
Module: core

This directive enables DNS lookups so that host names can be logged (and passed to CGIs/SSIs in REMOTE_HOST). The value Double refers to doing double-reverse DNS lookup. That is, after a reverse lookup is performed, a forward lookup is then performed on that result. At least one of the IP addresses in the forward lookup must match the original address. (In "tcpwrappers" terminology this is called PARANOID.)

Regardless of the setting, when mod_authz_host is used for controlling access by hostname, a double reverse lookup will be performed. This is necessary for security. Note that the result of this double-reverse isn't generally available unless you set HostnameLookups Double. For example, if only HostnameLookups On is set and a request is made to an object that is protected by hostname restrictions, regardless of whether the double-reverse fails or not, CGIs will still be passed the single-reverse result in REMOTE_HOST.

The default is Off in order to save the network traffic for those sites that don't truly need the reverse lookups done. It is also better for the end users because they don't have to suffer the extra latency that a lookup entails. Heavily loaded sites should leave this directive Off, since DNS lookups can take considerable amounts of time. The utility logresolve, compiled by default to the bin subdirectory of your installation directory, can be used to look up host names from logged IP addresses offline.

Finally, if you have hostname-based Require directives (p. 536), a hostname lookup will be performed regardless of the setting of HostnameLookups.
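The interaction between HostnameLookups and hostname-based access control described above can be sketched as follows. The directory path and the host name are hypothetical and chosen only for illustration.

```
# Keep reverse DNS lookups off for ordinary requests.
HostnameLookups Off

# Hypothetical directory and host name, for illustration only.
<Directory "/srv/www/internal">
    # A hostname-based Require still causes a lookup for requests
    # to this directory, regardless of the HostnameLookups setting.
    Require host example.org
</Directory>
```

With this layout the DNS lookup cost is paid only for requests that reach the protected directory.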
If Directive

Description: Contains directives that apply only if a condition is satisfied by a request at runtime
Syntax: <If expression> ... </If>
Context: server config, virtual host, directory, .htaccess
Override: All
Status: Core
Module: core

The <If> directive evaluates an expression at runtime, and applies the enclosed directives if and only if the expression evaluates to true. For example:

    <If "-z req('Host')">
        # ...
    </If>

would match HTTP/1.0 requests without a Host: header. Expressions may contain various shell-like operators for string comparison (==, !=, <, ...), integer comparison (-eq, -ne, ...), and others (-n, -z, -f, ...). It is also possible to use regular expressions, shell-like pattern matches and many other operations. These operations can be done on request headers (req), environment variables (env), and a large number of other properties. The full documentation is available in Expressions in Apache HTTP Server (p. 99).

Only directives that support the directory context (p. 377) can be used within this configuration section.

Warning: Certain variables, such as CONTENT_TYPE and other response headers, are set after <If> conditions have already been evaluated, and so will not be available to use in this directive.

See also
• Expressions in Apache HTTP Server (p. 99), for a complete reference and more examples.
• How <Directory>, <Location>, <Files> sections work (p. 35) for an explanation of how these different sections are combined when a request is received. <If>, <ElseIf>, and <Else> are applied last.

IfDefine Directive

Description: Encloses directives that will be processed only if a test is true at startup
Syntax: <IfDefine [!]parameter-name> ... </IfDefine>
Context: server config, virtual host, directory, .htaccess
Override: All
Status: Core
Module: core

The <IfDefine test>...</IfDefine> section is used to mark directives that are conditional. The directives within an <IfDefine> section are only processed if the test is true. If test is false, everything between the start and end markers is ignored.
The test in the <IfDefine> section directive can be one of two forms:

• parameter-name
• !parameter-name

In the former case, the directives between the start and end markers are only processed if the parameter named parameter-name is defined. The second format reverses the test, and only processes the directives if parameter-name is not defined.

The parameter-name argument is a define as given on the httpd command line via -Dparameter at the time the server was started or by the Define directive.

<IfDefine> sections are nest-able, which can be used to implement simple multiple-parameter tests. Example:

    httpd -DReverseProxy -DUseCache -DMemCache ...

    <IfDefine ReverseProxy>
        LoadModule proxy_module modules/mod_proxy.so
        LoadModule proxy_http_module modules/mod_proxy_http.so
        <IfDefine UseCache>
            LoadModule cache_module modules/mod_cache.so
            <IfDefine MemCache>
                LoadModule mem_cache_module modules/mod_mem_cache.so
            </IfDefine>
            <IfDefine !MemCache>
                LoadModule cache_disk_module modules/mod_cache_disk.so
            </IfDefine>
        </IfDefine>
    </IfDefine>

IfModule Directive

Description: Encloses directives that are processed conditional on the presence or absence of a specific module
Syntax: <IfModule [!]module-file|module-identifier> ... </IfModule>
Context: server config, virtual host, directory, .htaccess
Override: All
Status: Core
Module: core

The <IfModule test>...</IfModule> section is used to mark directives that are conditional on the presence of a specific module. The directives within an <IfModule> section are only processed if the test is true. If test is false, everything between the start and end markers is ignored.

The test in the <IfModule> section directive can be one of two forms:

• module
• !module

In the former case, the directives between the start and end markers are only processed if the module named module is included in Apache httpd, either compiled in or dynamically loaded using LoadModule. The second format reverses the test, and only processes the directives if module is not included.

The module argument can be either the module identifier or the file name of the module, at the time it was compiled. For example, rewrite_module is the identifier and mod_rewrite.c is the file name.
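The two ways of naming a module, and the negated form, can be sketched as below. This is only an illustration of the test syntax; RewriteEngine stands in for whatever mod_rewrite directives a real configuration would guard.

```
# Test by module identifier.
<IfModule rewrite_module>
    RewriteEngine On
</IfModule>

# The same test, written with the module's source file name.
<IfModule mod_rewrite.c>
    RewriteEngine On
</IfModule>

# Negated form: processed only when mod_rewrite is not included.
<IfModule !rewrite_module>
    # fallback directives would go here
</IfModule>
```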
If a module consists of several source files, use the name of the file containing the string STANDARD20_MODULE_STUFF.

<IfModule> sections are nest-able, which can be used to implement simple multiple-module tests.

Note: This section should only be used if you need to have one configuration file that works whether or not a specific module is available. In normal operation, directives need not be placed in <IfModule> sections.

Include Directive

Description: Includes other configuration files from within the server configuration files
Syntax: Include file-path|directory-path|wildcard
Context: server config, virtual host, directory
Status: Core
Module: core
Compatibility: Directory wildcard matching available in 2.3.6 and later

This directive allows inclusion of other configuration files from within the server configuration files.

Shell-style (fnmatch()) wildcard characters can be used in the filename or directory parts of the path to include several files at once, in alphabetical order. In addition, if Include points to a directory, rather than a file, Apache httpd will read all files in that directory and any subdirectory. However, including entire directories is not recommended, because it is easy to accidentally leave temporary files in a directory that can cause httpd to fail. Instead, we encourage you to use the wildcard syntax shown below, to include files that match a particular pattern, such as *.conf, for example.

The Include directive will fail with an error if a wildcard expression does not match any file. The IncludeOptional directive can be used if non-matching wildcards should be ignored.

The file path specified may be an absolute path, or may be relative to the ServerRoot directory.

Examples:

    Include /usr/local/apache2/conf/ssl.conf
    Include /usr/local/apache2/conf/vhosts/*.conf

Or, providing paths relative to your ServerRoot directory:

    Include conf/ssl.conf
    Include conf/vhosts/*.conf

Wildcards may be included in the directory or file portion of the path.
This example will fail if there is no subdirectory in conf/vhosts that contains at least one *.conf file:

    Include conf/vhosts/*/*.conf

Alternatively, the following directive will simply be ignored if files or directories are missing:

    IncludeOptional conf/vhosts/*/*.conf

See also
• IncludeOptional
• apachectl

IncludeOptional Directive

Description: Includes other configuration files from within the server configuration files
Syntax: IncludeOptional file-path|directory-path|wildcard
Context: server config, virtual host, directory
Status: Core
Module: core
Compatibility: Available in 2.3.6 and later

This directive allows inclusion of other configuration files from within the server configuration files. It works identically to the Include directive, with the exception that if wildcards do not match any file or directory, the IncludeOptional directive will be silently ignored instead of causing an error.

See also
• Include
• apachectl

KeepAlive Directive

Description: Enables HTTP persistent connections
Syntax: KeepAlive On|Off
Default: KeepAlive On
Context: server config, virtual host
Status: Core
Module: core

The Keep-Alive extension to HTTP/1.0 and the persistent connection feature of HTTP/1.1 provide long-lived HTTP sessions which allow multiple requests to be sent over the same TCP connection. In some cases this has been shown to result in an almost 50% speedup in latency times for HTML documents with many images. To enable Keep-Alive connections, set KeepAlive On.

For HTTP/1.0 clients, Keep-Alive connections will only be used if they are specifically requested by a client. In addition, a Keep-Alive connection with an HTTP/1.0 client can only be used when the length of the content is known in advance. This implies that dynamic content such as CGI output, SSI pages, and server-generated directory listings will generally not use Keep-Alive connections to HTTP/1.0 clients.
For HTTP/1.1 clients, persistent connections are the default unless otherwise specified. If the client requests it, chunked encoding will be used in order to send content of unknown length over persistent connections.

When a client uses a Keep-Alive connection, it will be counted as a single "request" for the MaxConnectionsPerChild directive, regardless of how many requests are sent using the connection.

See also
• MaxKeepAliveRequests

KeepAliveTimeout Directive

Description: Amount of time the server will wait for subsequent requests on a persistent connection
Syntax: KeepAliveTimeout num[ms]
Default: KeepAliveTimeout 5
Context: server config, virtual host
Status: Core
Module: core

The number of seconds Apache httpd will wait for a subsequent request before closing the connection. Adding the suffix ms sets the timeout in milliseconds instead. Once a request has been received, the timeout value specified by the Timeout directive applies.

Setting KeepAliveTimeout to a high value may cause performance problems in heavily loaded servers. The higher the timeout, the more server processes will be kept occupied waiting on connections with idle clients.

If KeepAliveTimeout is not set for a name-based virtual host, the value of the first defined virtual host best matching the local IP and port will be used.

Limit Directive

Description: Restrict enclosed access controls to only certain HTTP methods
Syntax: <Limit method [method] ... > ... </Limit>
Context: directory, .htaccess
Override: AuthConfig, Limit
Status: Core
Module: core

Access controls are normally effective for all access methods, and this is the usual desired behavior. In the general case, access control directives should not be placed within a <Limit> section.

The purpose of the <Limit> directive is to restrict the effect of the access controls to the nominated HTTP methods. For all other methods, the access restrictions that are enclosed in the <Limit> bracket will have no effect.
The following example applies the access control only to the methods POST, PUT, and DELETE, leaving all other methods unprotected:

    <Limit POST PUT DELETE>
        Require valid-user
    </Limit>

The method names listed can be one or more of: GET, POST, PUT, DELETE, CONNECT, OPTIONS, PATCH, PROPFIND, PROPPATCH, MKCOL, COPY, MOVE, LOCK, and UNLOCK. The method name is case-sensitive. If GET is used, it will also restrict HEAD requests. The TRACE method cannot be limited (see TraceEnable).

Warning: A <LimitExcept> section should always be used in preference to a <Limit> section when restricting access, since a <LimitExcept> section provides protection against arbitrary methods.

The <Limit> and <LimitExcept> directives may be nested. In this case, each successive level of <Limit> or <LimitExcept> directives must further restrict the set of methods to which access controls apply.

Warning: When using <Limit> or <LimitExcept> directives with the Require directive, note that the first Require to succeed authorizes the request, regardless of the presence of other Require directives.

For example, given the following configuration, all users will be authorized for POST requests, and the Require group editors directive will be ignored in all cases:

    <LimitExcept GET>
        Require valid-user
    </LimitExcept>
    <Limit POST>
        Require group editors
    </Limit>

LimitExcept Directive

Description: Restrict access controls to all HTTP methods except the named ones
Syntax: <LimitExcept method [method] ... > ... </LimitExcept>
Context: directory, .htaccess
Override: AuthConfig, Limit
Status: Core
Module: core

<LimitExcept> and </LimitExcept> are used to enclose a group of access control directives which will then apply to any HTTP access method not listed in the arguments; i.e., it is the opposite of a <Limit> section and can be used to control both standard and nonstandard/unrecognized methods. See the documentation for <Limit> for more details.
For example:

    <LimitExcept POST GET>
        Require valid-user
    </LimitExcept>

LimitInternalRecursion Directive

Description: Determine maximum number of internal redirects and nested subrequests
Syntax: LimitInternalRecursion number [number]
Default: LimitInternalRecursion 10
Context: server config, virtual host
Status: Core
Module: core

An internal redirect happens, for example, when using the Action directive, which internally redirects the original request to a CGI script. A subrequest is Apache httpd's mechanism to find out what would happen for some URI if it were requested. For example, mod_dir uses subrequests to look for the files listed in the DirectoryIndex directive.

LimitInternalRecursion prevents the server from crashing when entering an infinite loop of internal redirects or subrequests. Such loops are usually caused by misconfigurations.

The directive stores two different limits, which are evaluated on a per-request basis. The first number is the maximum number of internal redirects that may follow each other. The second number determines how deeply subrequests may be nested. If you specify only one number, it will be assigned to both limits.

    LimitInternalRecursion 5

LimitRequestBody Directive

Description: Restricts the total size of the HTTP request body sent from the client
Syntax: LimitRequestBody bytes
Default: LimitRequestBody 0
Context: server config, virtual host, directory, .htaccess
Override: All
Status: Core
Module: core

This directive specifies the number of bytes from 0 (meaning unlimited) to 2147483647 (2GB) that are allowed in a request body. See the note below for the limited applicability to proxy requests.

The LimitRequestBody directive allows the user to set a limit on the allowed size of an HTTP request message body within the context in which the directive is given (server, per-directory, per-file or per-location). If the client request exceeds that limit, the server will return an error response instead of servicing the request.
The size of a normal request message body will vary greatly depending on the nature of the resource and the methods allowed on that resource. CGI scripts typically use the message body for retrieving form information. Implementations of the PUT method will require a value at least as large as any representation that the server wishes to accept for that resource.

This directive gives the server administrator greater control over abnormal client request behavior, which may be useful for avoiding some forms of denial-of-service attacks.

If, for example, you are permitting file upload to a particular location and wish to limit the size of the uploaded file to 100K, you might use the following directive:

    LimitRequestBody 102400

Note: For a full description of how this directive is interpreted by proxy requests, see the mod_proxy documentation.

LimitRequestFields Directive

Description: Limits the number of HTTP request header fields that will be accepted from the client
Syntax: LimitRequestFields number
Default: LimitRequestFields 100
Context: server config, virtual host
Status: Core
Module: core

Number is an integer from 0 (meaning unlimited) to 32767. The default value is defined by the compile-time constant DEFAULT_LIMIT_REQUEST_FIELDS (100 as distributed).

The LimitRequestFields directive allows the server administrator to modify the limit on the number of request header fields allowed in an HTTP request. A server needs this value to be larger than the number of fields that a normal client request might include. The number of request header fields used by a client rarely exceeds 20, but this may vary among different client implementations, often depending upon the extent to which a user has configured their browser to support detailed content negotiation. Optional HTTP extensions are often expressed using request header fields.
This directive gives the server administrator greater control over abnormal client request behavior, which may be useful for avoiding some forms of denial-of-service attacks. The value should be increased if normal clients see an error response from the server that indicates too many fields were sent in the request.

For example:

    LimitRequestFields 50

Warning: When name-based virtual hosting is used, the value for this directive is taken from the default (first-listed) virtual host for the local IP and port combination.

LimitRequestFieldSize Directive

Description: Limits the size of the HTTP request header allowed from the client
Syntax: LimitRequestFieldSize bytes
Default: LimitRequestFieldSize 8190
Context: server config, virtual host
Status: Core
Module: core

This directive specifies the number of bytes that will be allowed in an HTTP request header.

The LimitRequestFieldSize directive allows the server administrator to set the limit on the allowed size of an HTTP request header field. A server needs this value to be large enough to hold any one header field from a normal client request. The size of a normal request header field will vary greatly among different client implementations, often depending upon the extent to which a user has configured their browser to support detailed content negotiation. SPNEGO authentication headers can be up to 12392 bytes.

This directive gives the server administrator greater control over abnormal client request behavior, which may be useful for avoiding some forms of denial-of-service attacks.

For example:

    LimitRequestFieldSize 4094

Note: Under normal conditions, the value should not be changed from the default.

Warning: When name-based virtual hosting is used, the value for this directive is taken from the default (first-listed) virtual host best matching the current IP address and port combination.
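Because the warning above says the value is taken from the default (first-listed) virtual host, a site that needs room for larger headers, such as the SPNEGO case mentioned earlier, would presumably set the limit there. The following is a sketch under that assumption. The host names are hypothetical.

```
# First-listed (default) virtual host for this IP address and port.
<VirtualHost *:80>
    ServerName default.example.com
    # Leave room for SPNEGO authentication headers (up to 12392 bytes).
    LimitRequestFieldSize 12392
</VirtualHost>

# A later name-based virtual host on the same address and port.
# Per the warning above, the limit set in the first-listed host applies.
<VirtualHost *:80>
    ServerName app.example.com
</VirtualHost>
```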
LimitRequestLine Directive

Description: Limit the size of the HTTP request line that will be accepted from the client
Syntax: LimitRequestLine bytes
Default: LimitRequestLine 8190
Context: server config, virtual host
Status: Core
Module: core

This directive sets the number of bytes that will be allowed on the HTTP request-line.

The LimitRequestLine directive allows the server administrator to set the limit on the allowed size of a client's HTTP request-line. Since the request-line consists of the HTTP method, URI, and protocol version, the LimitRequestLine directive places a restriction on the length of a request-URI allowed for a request on the server. A server needs this value to be large enough to hold any of its resource names, including any information that might be passed in the query part of a GET request.

This directive gives the server administrator greater control over abnormal client request behavior, which may be useful for avoiding some forms of denial-of-service attacks.

For example:

    LimitRequestLine 4094

Note: Under normal conditions, the value should not be changed from the default.

Warning: When name-based virtual hosting is used, the value for this directive is taken from the default (first-listed) virtual host best matching the current IP address and port combination.

LimitXMLRequestBody Directive

Description: Limits the size of an XML-based request body
Syntax: LimitXMLRequestBody bytes
Default: LimitXMLRequestBody 1000000
Context: server config, virtual host, directory, .htaccess
Override: All
Status: Core
Module: core

Limit (in bytes) on maximum size of an XML-based request body. A value of 0 will disable any checking.

Example:

    LimitXMLRequestBody 0

Location Directive

Description: Applies the enclosed directives only to matching URLs
Syntax: <Location URL-path|URL> ... </Location>
Context: server config, virtual host
Status: Core
Module: core

The <Location> directive limits the scope of the enclosed directives by URL.
It is similar to the <Directory> directive, and starts a subsection which is terminated with a </Location> directive. <Location> sections are processed in the order they appear in the configuration file, after the <Directory> sections and .htaccess files are read, and after the <Files> sections.

<Location> sections operate completely outside the filesystem. This has several consequences. Most importantly, <Location> directives should not be used to control access to filesystem locations. Since several different URLs may map to the same filesystem location, such access controls may be circumvented.

The enclosed directives will be applied to the request if the path component of the URL meets any of the following criteria:

• The specified location matches exactly the path component of the URL.
• The specified location, which ends in a forward slash, is a prefix of the path component of the URL (treated as a context root).
• The specified location, with the addition of a trailing slash, is a prefix of the path component of the URL (also treated as a context root).

In the example below, where no trailing slash is used, requests to /private1, /private1/ and /private1/file.txt will have the enclosed directives applied, but /private1other would not.

    <Location "/private1">
        # ...
    </Location>

In the example below, where a trailing slash is used, requests to /private2/ and /private2/file.txt will have the enclosed directives applied, but /private2 and /private2other would not.

    <Location "/private2/">
        # ...
    </Location>

Note: When to use <Location>
Use <Location> to apply directives to content that lives outside the filesystem. For content that lives in the filesystem, use <Directory> and <Files>. An exception is <Location "/">, which is an easy way to apply a configuration to the entire server.

For all origin (non-proxy) requests, the URL to be matched is a URL-path of the form /path/. No scheme, hostname, port, or query string may be included. For proxy requests, the URL to be matched is of the form scheme://servername/path, and you must include the prefix.

The URL may use wildcards. In a wild-card string, ? matches any single character, and * matches any sequence of characters. Neither wildcard character matches a / in the URL-path.

Regular expressions can also be used, with the addition of the ~ character. For example:

    <Location ~ "/(extra|special)/data">
        #...
    </Location>

would match URLs that contained the substring /extra/data or /special/data. The <LocationMatch> directive behaves identically to the regex version of <Location>, and is preferred, for the simple reason that ~ is hard to distinguish from - in many fonts.

The <Location> functionality is especially useful when combined with the SetHandler directive. For example, to enable status requests but allow them only from browsers at example.com, you might use:

    <Location "/status">
        SetHandler server-status
        Require host example.com
    </Location>

Note about / (slash)
The slash character has special meaning depending on where in a URL it appears. People may be used to its behavior in the filesystem where multiple adjacent slashes are frequently collapsed to a single slash (i.e., /home///foo is the same as /home/foo). In URL-space this is not necessarily true. The <LocationMatch> directive and the regex version of <Location> require you to explicitly specify multiple slashes if that is your intention.
For example, <LocationMatch "^/abc"> would match the request URL /abc but not the request URL //abc. The (non-regex) <Location> directive behaves similarly when used for proxy requests. But when (non-regex) <Location> is used for non-proxy requests it will implicitly match multiple slashes with a single slash. For example, if you specify <Location "/abc/def"> and the request is to /abc//def then it will match.

See also
• How <Directory>, <Location> and <Files> sections work (p. 35) for an explanation of how these different sections are combined when a request is received.
• LocationMatch

LocationMatch Directive

Description: Applies the enclosed directives only to regular-expression matching URLs
Syntax: <LocationMatch regex> ... </LocationMatch>
Context: server config, virtual host
Status: Core
Module: core

The <LocationMatch> directive limits the scope of the enclosed directives by URL, in an identical manner to <Location>.
However, it takes a regular expression as an argument instead of a simple string. For example:

    <LocationMatch "/(extra|special)/data">
        # ...
    </LocationMatch>

would match URLs that contained the substring /extra/data or /special/data.

Note: If the intent is that a URL starts with /extra/data, rather than merely contains /extra/data, prefix the regular expression with a ^ to require this.

From 2.4.8 onwards, named groups and backreferences are captured and written to the environment with the corresponding name prefixed with "MATCH_" and in upper case. This allows elements of URLs to be referenced from within expressions (p. 99) and modules like mod_rewrite. In order to prevent confusion, numbered (unnamed) backreferences are ignored. Use named groups instead.

    <LocationMatch "^/combined/(?<sitename>[^/]+)">
        require ldap-group cn=%{env:MATCH_SITENAME},ou=combined,o=Example
    </LocationMatch>

See also
• How <Directory>, <Location> and <Files> sections work (p. 35) for an explanation of how these different sections are combined when a request is received

LogLevel Directive

Description: Controls the verbosity of the ErrorLog
Syntax: LogLevel [module:]level [module:level] ...
Default: LogLevel warn
Context: server config, virtual host, directory
Status: Core
Module: core
Compatibility: Per-module and per-directory configuration is available in Apache HTTP Server 2.3.6 and later

LogLevel adjusts the verbosity of the messages recorded in the error logs (see ErrorLog directive). The following levels are available, in order of decreasing significance:

    Level    Description                                     Example
    emerg    Emergencies - system is unusable.               "Child cannot open lock file. Exiting"
    alert    Action must be taken immediately.               "getpwuid: couldn't determine user name from uid"
    crit     Critical Conditions.                            "socket: Failed to get a socket, exiting child"
    error    Error conditions.                               "Premature end of script headers"
    warn     Warning conditions.                             "child process 1234 did not exit, sending another SIGHUP"
    notice   Normal but significant condition.               "httpd: caught SIGBUS, attempting to dump core in ..."
    info     Informational.                                  "Server seems busy, (you may need to increase StartServers, or Min/MaxSpareServers)..."
    debug    Debug-level messages                            "Opening config file ..."
    trace1   Trace messages                                  "proxy: FTP: control connection complete"
    trace2   Trace messages                                  "proxy: CONNECT: sending the CONNECT request to the remote proxy"
    trace3   Trace messages                                  "openssl: Handshake: start"
    trace4   Trace messages                                  "read from buffered SSL brigade, mode 0, 17 bytes"
    trace5   Trace messages                                  "map lookup FAILED: map=rewritemap key=keyname"
    trace6   Trace messages                                  "cache lookup FAILED, forcing new map lookup"
    trace7   Trace messages, dumping large amounts of data   "| 0000: 02 23 44 30 13 40 ac 34 df 3d bf 9a 19 49 39 15 |"
    trace8   Trace messages, dumping large amounts of data   "| 0000: 02 23 44 30 13 40 ac 34 df 3d bf 9a 19 49 39 15 |"

When a particular level is specified, messages from all other levels of higher significance will be reported as well. E.g., when LogLevel info is specified, then messages with log levels of notice and warn will also be posted.

Using a level of at least crit is recommended.

For example:

    LogLevel notice

Note: When logging to a regular file, messages of the level notice cannot be suppressed and thus are always logged. However, this doesn't apply when logging is done using syslog.

Specifying a level without a module name will reset the level for all modules to that level. Specifying a level with a module name will set the level for that module only. It is possible to use the module source file name, the module identifier, or the module identifier with the trailing _module omitted as module specification. This means the following three specifications are equivalent:

    LogLevel info ssl:warn
    LogLevel info mod_ssl.c:warn
    LogLevel info ssl_module:warn

It is also possible to change the level per directory:

    LogLevel info
    <Directory "/usr/local/apache/htdocs/app">
        LogLevel debug
    </Directory>

Note: Per directory loglevel configuration only affects messages that are logged after the request has been parsed and that are associated with the request.
Log messages which are associated with the server or the connection are not affected. The latter can be influenced by the LogLevelOverride directive, though.

See also
• ErrorLog
• ErrorLogFormat
• LogLevelOverride
• Apache HTTP Server Log Files (p. 56)

LogLevelOverride Directive

Description: Override the verbosity of the ErrorLog for certain clients
Syntax: LogLevelOverride ipaddress[/prefixlen] [module:]level [module:level] ...
Default: unset
Context: server config, virtual host
Status: Core
Module: core
Compatibility: Available in Apache HTTP Server 2.5.0 and later

LogLevelOverride adjusts the LogLevel for requests coming from certain client IP addresses. This makes it possible to enable verbose logging only for certain test clients. The IP address is checked at a very early stage in the connection processing. Therefore, LogLevelOverride can change the log level for things like the SSL handshake, which happen before a LogLevel directive in an <If> container would be evaluated.

LogLevelOverride accepts either a single IP address or a CIDR IP-address/len subnet specification. For the syntax of the loglevel specification, see the LogLevel directive.

For requests that match a LogLevelOverride directive, per-directory specifications of LogLevel are ignored.

Examples:

    LogLevelOverride 192.0.2.0/24 ssl:trace6
    LogLevelOverride 192.0.2.7 ssl:trace8

Note: LogLevelOverride only affects log messages that are associated with the request or the connection. Log messages which are associated with the server are not affected.

See also
• LogLevel

MaxKeepAliveRequests Directive

Description: Number of requests allowed on a persistent connection
Syntax: MaxKeepAliveRequests number
Default: MaxKeepAliveRequests 100
Context: server config, virtual host
Status: Core
Module: core

The MaxKeepAliveRequests directive limits the number of requests allowed per connection when KeepAlive is on.
If it is set to 0, unlimited requests will be allowed. We recommend that this setting be kept to a high value for maximum server performance.

For example:

    MaxKeepAliveRequests 500

MaxRangeOverlaps Directive

Description: Number of overlapping ranges (eg: 100-200,150-300) allowed before returning the complete resource
Syntax: MaxRangeOverlaps default | unlimited | none | number-of-ranges
Default: MaxRangeOverlaps 20
Context: server config, virtual host, directory
Status: Core
Module: core
Compatibility: Available in Apache HTTP Server 2.3.15 and later

The MaxRangeOverlaps directive limits the number of overlapping HTTP ranges the server is willing to return to the client. If more overlapping ranges than permitted are requested, the complete resource is returned instead.

default
    Limits the number of overlapping ranges to a compile-time default of 20.
none
    No overlapping Range headers are allowed.
unlimited
    The server does not limit the number of overlapping ranges it is willing to satisfy.
number-of-ranges
    A positive number representing the maximum number of overlapping ranges the server is willing to satisfy.

MaxRangeReversals Directive

Description: Number of range reversals (eg: 100-200,50-70) allowed before returning the complete resource
Syntax: MaxRangeReversals default | unlimited | none | number-of-ranges
Default: MaxRangeReversals 20
Context: server config, virtual host, directory
Status: Core
Module: core
Compatibility: Available in Apache HTTP Server 2.3.15 and later

The MaxRangeReversals directive limits the number of HTTP Range reversals the server is willing to return to the client. If more range reversals than permitted are requested, the complete resource is returned instead.

default
    Limits the number of range reversals to a compile-time default of 20.
none
    No Range reversals headers are allowed.
unlimited
    The server does not limit the number of range reversals it is willing to satisfy.
number-of-ranges: A positive number representing the maximum number of range reversals the server is willing to satisfy.

MaxRanges Directive

Description: Number of ranges allowed before returning the complete resource
Syntax: MaxRanges default | unlimited | none | number-of-ranges
Default: MaxRanges 200
Context: server config, virtual host, directory
Status: Core
Module: core
Compatibility: Available in Apache HTTP Server 2.3.15 and later

The MaxRanges directive limits the number of HTTP ranges the server is willing to return to the client. If more ranges than permitted are requested, the complete resource is returned instead.

default: Limits the number of ranges to a compile-time default of 200.
none: Range headers are ignored.
unlimited: The server does not limit the number of ranges it is willing to satisfy.
number-of-ranges: A positive number representing the maximum number of ranges the server is willing to satisfy.

MergeTrailers Directive

Description: Determines whether trailers are merged into headers
Syntax: MergeTrailers [on|off]
Default: MergeTrailers off
Context: server config, virtual host
Status: Core
Module: core
Compatibility: 2.4.11 and later

This directive controls whether HTTP trailers are copied into the internal representation of HTTP headers. This merging occurs when the request body has been completely consumed, long after most header processing would have a chance to examine or modify request headers.

This option is provided for compatibility with releases prior to 2.4.11, where trailers were always merged.

Mutex Directive

Description: Configures mutex mechanism and lock file directory for all or specified mutexes
Syntax: Mutex mechanism [default|mutex-name] ... [OmitPID]
Default: Mutex default
Context: server config
Status: Core
Module: core
Compatibility: Available in Apache HTTP Server 2.3.4 and later

The Mutex directive sets the mechanism, and optionally the lock file location, that httpd and modules use to serialize access to resources. Specify default as the second argument to change the settings for all mutexes; specify a mutex name (see table below) as the second argument to override defaults only for that mutex.

The Mutex directive is typically used in the following exceptional situations:

• change the mutex mechanism when the default mechanism selected by APR has a functional or performance problem
• change the directory used by file-based mutexes when the default directory does not support locking

Supported modules: This directive only configures mutexes which have been registered with the core server using the ap_mutex_register() API. All modules bundled with httpd support the Mutex directive, but third-party modules may not. Consult the documentation of the third-party module, which must indicate the mutex name(s) which can be configured if this directive is supported.

The following mutex mechanisms are available:

• default | yes
This selects the default locking implementation, as determined by APR. The default locking implementation can be displayed by running httpd with the -V option.

• none | no
This effectively disables the mutex, and is only allowed for a mutex if the module indicates that it is a valid choice. Consult the module documentation for more information.

• posixsem
This is a mutex variant based on a Posix semaphore.
Warning: The semaphore ownership is not recovered if a thread in the process holding the mutex segfaults, resulting in a hang of the web server.

• sysvsem
This is a mutex variant based on a SystemV IPC semaphore.
Warning: It is possible to "leak" SysV semaphores if processes crash before the semaphore is removed.
Security: The semaphore API allows for a denial of service attack by any CGIs running under the same uid as the webserver (i.e., all CGIs, unless you use something like suexec or cgiwrapper).

• sem
This selects the "best" available semaphore implementation, choosing between Posix and SystemV IPC semaphores, in that order.

• pthread
This is a mutex variant based on cross-process Posix thread mutexes.
Warning: On most systems, if a child process terminates abnormally while holding a mutex that uses this implementation, the server will deadlock and stop responding to requests. When this occurs, the server will require a manual restart to recover. Solaris and Linux are notable exceptions as they provide a mechanism which usually allows the mutex to be recovered after a child process terminates abnormally while holding a mutex. If your system is POSIX compliant or if it implements the pthread_mutexattr_setrobust_np() function, you may be able to use the pthread option safely.

• fcntl:/path/to/mutex
This is a mutex variant where a physical (lock-)file and the fcntl() function are used as the mutex.
Warning: When multiple mutexes based on this mechanism are used within multi-threaded, multi-process environments, deadlock errors (EDEADLK) can be reported for valid mutex operations if fcntl() is not thread-aware, such as on Solaris.

• flock:/path/to/mutex
This is similar to the fcntl:/path/to/mutex method with the exception that the flock() function is used to provide file locking.

• file:/path/to/mutex
This selects the "best" available file locking implementation, choosing between fcntl and flock, in that order.

Most mechanisms are only available on selected platforms, where the underlying platform and APR support it. Mechanisms which aren't available on all platforms are posixsem, sysvsem, sem, pthread, fcntl, flock, and file.

With the file-based mechanisms fcntl and flock, the path, if provided, is a directory where the lock file will be created.
The default directory is httpd's run-time file directory, DefaultRuntimeDir. If a relative path is provided, it is relative to DefaultRuntimeDir. Always use a local disk filesystem for /path/to/mutex and never a directory residing on an NFS or AFS filesystem. The basename of the file consists of the mutex type and an optional instance string provided by the module. Unless the OmitPID keyword is specified, the process id of the httpd parent process is appended to make the file name unique, avoiding conflicts when multiple httpd instances share a lock file directory. For example, if the mutex name is mpm-accept and the lock file directory is /var/httpd/locks, the lock file name for the httpd instance with parent process id 12345 would be /var/httpd/locks/mpm-accept.12345.

Security: It is best to avoid putting mutex files in a world-writable directory such as /var/tmp because someone could create a denial of service attack and prevent the server from starting by creating a lockfile with the same name as the one the server will try to create.

The following table documents the names of mutexes used by httpd and bundled modules.

Mutex name | Module(s) | Protected resource
mpm-accept | prefork and worker MPMs | incoming connections, to avoid the thundering herd problem; for more information, refer to the performance tuning (p. 339) documentation
authdigest-client | mod_auth_digest | client list in shared memory
authdigest-opaque | mod_auth_digest | counter in shared memory
ldap-cache | mod_ldap | LDAP result cache
rewrite-map | mod_rewrite | communication with external mapping programs, to avoid intermixed I/O from multiple requests
ssl-cache | mod_ssl | SSL session cache
ssl-stapling | mod_ssl | OCSP stapling response cache
watchdog-callback | mod_watchdog | callback function of a particular client module

The OmitPID keyword suppresses the addition of the httpd parent process id to the lock file name.
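As a minimal sketch of the OmitPID keyword, the fragment below reuses the /var/httpd/locks directory from the naming example above. It assumes that this directory exists on a local disk, that the file mechanism is available on the platform, and that no other httpd instance shares the directory; none of these assumptions comes from the reference text.

```apache
# Assumption: /var/httpd/locks is a local, non-world-writable directory
# used by exactly one httpd instance.
# With OmitPID, the parent process id is not appended, so by the naming
# rule described above the lock file would be expected at
# /var/httpd/locks/mpm-accept rather than /var/httpd/locks/mpm-accept.<pid>.
Mutex file:/var/httpd/locks mpm-accept OmitPID
```

Because the file name is no longer unique per instance, this form only seems appropriate when a fixed, predictable lock file name is actually needed.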
In the following example, the mutex mechanism for the MPM accept mutex will be changed from the compiled-in default to fcntl, with the associated lock file created in directory /var/httpd/locks. The mutex mechanism for all other mutexes will be changed from the compiled-in default to sysvsem.

    Mutex sysvsem default
    Mutex fcntl:/var/httpd/locks mpm-accept

NameVirtualHost Directive

Description: DEPRECATED: Designates an IP address for name-virtual hosting
Syntax: NameVirtualHost addr[:port]
Context: server config
Status: Core
Module: core

Prior to 2.3.11, NameVirtualHost was required to instruct the server that a particular IP address and port combination was usable as a name-based virtual host. In 2.3.11 and later, any time an IP address and port combination is used in multiple virtual hosts, name-based virtual hosting is automatically enabled for that address.

This directive currently has no effect.

See also
• Virtual Hosts documentation (p. 124)

Options Directive

Description: Configures what features are available in a particular directory
Syntax: Options [+|-]option [[+|-]option] ...
Default: Options FollowSymlinks
Context: server config, virtual host, directory, .htaccess
Override: Options
Status: Core
Module: core
Compatibility: The default was changed from All to FollowSymlinks in 2.3.11

The Options directive controls which server features are available in a particular directory.

option can be set to None, in which case none of the extra features are enabled, or one or more of the following:

All: All options except for MultiViews.

ExecCGI: Execution of CGI scripts using mod_cgi is permitted.

FollowSymLinks: The server will follow symbolic links in this directory. This is the default setting.
Note: Even though the server follows the symlink, it does not change the pathname used to match against <Directory> sections.
The FollowSymLinks and SymLinksIfOwnerMatch Options work only in <Directory> sections or .htaccess files.
Omitting this option should not be considered a security restriction, since symlink testing is subject to race conditions that make it circumventable.

Includes: Server-side includes provided by mod_include are permitted.

IncludesNOEXEC: Server-side includes are permitted, but the #exec cmd and #exec cgi are disabled. It is still possible to #include virtual CGI scripts from ScriptAliased directories.

Indexes: If a URL which maps to a directory is requested and there is no DirectoryIndex (e.g., index.html) in that directory, then mod_autoindex will return a formatted listing of the directory.

MultiViews: Content negotiated (p. 78) "MultiViews" are allowed using mod_negotiation.
Note: This option gets ignored if set anywhere other than <Directory>, as mod_negotiation needs real resources to compare against and evaluate from.

SymLinksIfOwnerMatch: The server will only follow symbolic links for which the target file or directory is owned by the same user id as the link.
Note: The FollowSymLinks and SymLinksIfOwnerMatch Options work only in <Directory> sections or .htaccess files.

Normally, if multiple Options could apply to a directory, then the most specific one is used and others are ignored; the options are not merged. (See how sections are merged (p. 35).) However, if all the options on the Options directive are preceded by a + or - symbol, the options are merged. Any options preceded by a + are added to the options currently in force, and any options preceded by a - are removed from the options currently in force.

Note: Mixing Options that carry a + or - with options that do not is not valid syntax; the syntax check rejects it and aborts server startup.

For example, without any + and - symbols:

    <Directory "/web/docs">
        Options Indexes FollowSymLinks
    </Directory>

    <Directory "/web/docs/spec">
        Options Includes
    </Directory>

then only Includes will be set for the /web/docs/spec directory.
However, if the second Options directive uses the + and - symbols:

    <Directory "/web/docs">
        Options Indexes FollowSymLinks
    </Directory>

    <Directory "/web/docs/spec">
        Options +Includes -Indexes
    </Directory>

then the options FollowSymLinks and Includes are set for the /web/docs/spec directory.

Note: Using -IncludesNOEXEC or -Includes disables server-side includes completely regardless of the previous setting.

The default in the absence of any other settings is FollowSymlinks.

Protocol Directive

Description: Protocol for a listening socket
Syntax: Protocol protocol
Context: server config, virtual host
Status: Core
Module: core
Compatibility: On Windows, only available from Apache 2.3.3 and later.

This directive specifies the protocol used for a specific listening socket. The protocol is used to determine which module should handle a request and to apply protocol specific optimizations with the AcceptFilter directive.

You only need to set the protocol if you are running on non-standard ports; otherwise, http is assumed for port 80 and https for port 443.

For example, if you are running https on a non-standard port, specify the protocol explicitly:

    Protocol https

You can also specify the protocol using the Listen directive.

See also
• AcceptFilter
• Listen

Protocols Directive

Description: Protocols available for a server/virtual host
Syntax: Protocols protocol ...
Default: Protocols http/1.1
Context: server config, virtual host
Status: Core
Module: core
Compatibility: Only available from Apache 2.4.17 and later.

This directive specifies the list of protocols supported for a server/virtual host. The list determines the allowed protocols a client may negotiate for this server/host.

You need to set protocols if you want to extend the available protocols for a server/host. By default, only the http/1.1 protocol (which includes the compatibility with 1.0 and 0.9 clients) is allowed.
For example, if you want to support HTTP/2 for a server with TLS, specify:

    Protocols h2 http/1.1

Valid protocols are http/1.1 for http and https connections, h2 on https connections and h2c for http connections. Modules may enable more protocols.

It is safe to specify protocols that are unavailable or disabled. Such protocol names will simply be ignored.

Protocols specified in base servers are inherited by virtual hosts only if the virtual host has no Protocols directive of its own. Put the other way around, a Protocols directive in a virtual host replaces any such directive in the base server.

See also
• ProtocolsHonorOrder

ProtocolsHonorOrder Directive

Description: Determines if order of Protocols determines precedence during negotiation
Syntax: ProtocolsHonorOrder On|Off
Default: ProtocolsHonorOrder On
Context: server config, virtual host
Status: Core
Module: core
Compatibility: Only available from Apache 2.4.17 and later.

This directive specifies if the server should honor the order in which the Protocols directive lists protocols.

If configured Off, the client supplied list order of protocols has precedence over the order in the server configuration.

With ProtocolsHonorOrder set to On (the default), the client ordering does not matter and only the ordering in the server settings influences the outcome of the protocol negotiation.

See also
• Protocols

QualifyRedirectURL Directive

Description: Controls whether the REDIRECT_URL environment variable is fully qualified
Syntax: QualifyRedirectURL ON|OFF
Default: QualifyRedirectURL OFF
Context: server config, virtual host, directory
Override: FileInfo
Status: Core
Module: core
Compatibility: Directive supported in 2.4.18 and later. 2.4.17 acted as if 'QualifyRedirectURL ON' was configured.

This directive controls whether the server will ensure that the REDIRECT_URL environment variable is fully qualified.
By default, the variable contains the verbatim URL requested by the client, such as "/index.html". With QualifyRedirectURL ON, the same request would result in a value such as "http://www.example.com/index.html".

Even without this directive set, when a request is issued against a fully qualified URL, REDIRECT_URL will remain fully qualified.

RegisterHttpMethod Directive

Description: Register non-standard HTTP methods
Syntax: RegisterHttpMethod method [method [...]]
Context: server config
Status: Core
Module: core

HTTP methods that do not conform to the relevant RFCs are normally rejected by request processing in Apache HTTPD. To avoid this, modules can register non-standard HTTP methods they support. The RegisterHttpMethod directive makes it possible to register such methods manually. This can be useful if such methods are forwarded for external processing, e.g. to a CGI script.

RLimitCPU Directive

Description: Limits the CPU consumption of processes launched by Apache httpd children
Syntax: RLimitCPU seconds|max [seconds|max]
Default: Unset; uses operating system defaults
Context: server config, virtual host, directory, .htaccess
Override: All
Status: Core
Module: core

Takes 1 or 2 parameters. The first parameter sets the soft resource limit for all processes and the second parameter sets the maximum resource limit. Either parameter can be a number, or max to indicate to the server that the limit should be set to the maximum allowed by the operating system configuration. Raising the maximum resource limit requires that the server is running as root or in the initial startup phase.

This applies to processes forked from Apache httpd children servicing requests, not the Apache httpd children themselves. This includes CGI scripts and SSI exec commands, but not any processes forked from the Apache httpd parent, such as piped logs.

CPU resource limits are expressed in seconds per process.

See also
• RLimitMEM
• RLimitNPROC

RLimitMEM Directive

Description: Limits the memory consumption of processes launched by Apache httpd children
Syntax: RLimitMEM bytes|max [bytes|max]
Default: Unset; uses operating system defaults
Context: server config, virtual host, directory, .htaccess
Override: All
Status: Core
Module: core

Takes 1 or 2 parameters. The first parameter sets the soft resource limit for all processes and the second parameter sets the maximum resource limit. Either parameter can be a number, or max to indicate to the server that the limit should be set to the maximum allowed by the operating system configuration. Raising the maximum resource limit requires that the server is running as root or in the initial startup phase.

This applies to processes forked from Apache httpd children servicing requests, not the Apache httpd children themselves. This includes CGI scripts and SSI exec commands, but not any processes forked from the Apache httpd parent, such as piped logs.

Memory resource limits are expressed in bytes per process.

See also
• RLimitCPU
• RLimitNPROC

RLimitNPROC Directive

Description: Limits the number of processes that can be launched by processes launched by Apache httpd children
Syntax: RLimitNPROC number|max [number|max]
Default: Unset; uses operating system defaults
Context: server config, virtual host, directory, .htaccess
Override: All
Status: Core
Module: core

Takes 1 or 2 parameters. The first parameter sets the soft resource limit for all processes, and the second parameter sets the maximum resource limit. Either parameter can be a number, or max to indicate to the server that the limit should be set to the maximum allowed by the operating system configuration. Raising the maximum resource limit requires that the server is running as root or in the initial startup phase.

This applies to processes forked from Apache httpd children servicing requests, not the Apache httpd children themselves.
This includes CGI scripts and SSI exec commands, but not any processes forked from the Apache httpd parent, such as piped logs.

Process limits control the number of processes per user.

Note: If CGI processes are not running under user ids other than the web server user id, this directive will limit the number of processes that the server itself can create. Evidence of this situation will be indicated by cannot fork messages in the error log.

See also
• RLimitMEM
• RLimitCPU

ScriptInterpreterSource Directive

Description: Technique for locating the interpreter for CGI scripts
Syntax: ScriptInterpreterSource Registry|Registry-Strict|Script
Default: ScriptInterpreterSource Script
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Core
Module: core
Compatibility: Win32 only.

This directive is used to control how Apache httpd finds the interpreter used to run CGI scripts. The default setting is Script. This causes Apache httpd to use the interpreter pointed to by the shebang line (first line, starting with #!) in the script. On Win32 systems this line usually looks like:

    #!C:/Perl/bin/perl.exe

or, if perl is in the PATH, simply:

    #!perl

Setting ScriptInterpreterSource Registry will cause the Windows Registry tree HKEY_CLASSES_ROOT to be searched using the script file extension (e.g., .pl) as a search key. The command defined by the registry subkey Shell\ExecCGI\Command or, if it does not exist, by the subkey Shell\Open\Command is used to open the script file. If the registry keys cannot be found, Apache httpd falls back to the behavior of the Script option.

Security: Be careful when using ScriptInterpreterSource Registry with ScriptAlias'ed directories, because Apache httpd will try to execute every file within this directory. The Registry setting may cause undesired program calls on files which are typically not executed.
For example, the default open command on .htm files on most Windows systems will execute Microsoft Internet Explorer, so any HTTP request for an .htm file existing within the script directory would start the browser in the background on the server. This is a good way to crash your system within a minute or so.

The option Registry-Strict, which is new in Apache HTTP Server 2.0, does the same thing as Registry but uses only the subkey Shell\ExecCGI\Command. The ExecCGI key is not a common one. It must be configured manually in the Windows registry and hence prevents accidental program calls on your system.

SeeRequestTail Directive

Description: Determine if mod_status displays the first 63 characters of a request or the last 63, assuming the request itself is greater than 63 chars.
Syntax: SeeRequestTail On|Off
Default: SeeRequestTail Off
Context: server config
Status: Core
Module: core

mod_status with ExtendedStatus On displays the actual request being handled. For historical reasons, only 63 characters of the request are actually stored for display purposes. This directive controls whether the first 63 characters are stored (the previous behavior and the default) or the last 63 characters. This is only applicable, of course, if the length of the request is 64 characters or greater.

If Apache httpd is handling

    GET /disk1/storage/apache/htdocs/images/imagestore1/food/apples.jpg HTTP/1.1

mod_status displays as follows:

    Off (default): GET /disk1/storage/apache/htdocs/images/imagestore1/food/apples
    On:            orage/apache/htdocs/images/imagestore1/food/apples.jpg HTTP/1.1

ServerAdmin Directive

Description: Email address that the server includes in error messages sent to the client
Syntax: ServerAdmin email-address|URL
Context: server config, virtual host
Status: Core
Module: core

The ServerAdmin directive sets the contact address that the server includes in any error messages it returns to the client.
If httpd doesn't recognize the supplied argument as a URL, it assumes that it is an email address and prepends mailto: to it in hyperlink targets. However, it's recommended to actually use an email address, since there are a lot of CGI scripts that make that assumption. If you want to use a URL, it should point to another server under your control. Otherwise users may not be able to contact you in case of errors.

It may be worth setting up a dedicated address for this, e.g.

    ServerAdmin www-admin@foo.example.com

as users do not always mention that they are talking about the server!

ServerAlias Directive

Description: Alternate names for a host used when matching requests to name-virtual hosts
Syntax: ServerAlias hostname [hostname] ...
Context: virtual host
Status: Core
Module: core

The ServerAlias directive sets the alternate names for a host, for use with name-based virtual hosts (p. 125). The ServerAlias may include wildcards, if appropriate.

    <VirtualHost *:80>
        ServerName server.example.com
        ServerAlias server server2.example.com server2
        ServerAlias *.example.com
        UseCanonicalName Off
        # ...
    </VirtualHost>

Name-based virtual hosts for the best-matching set of <VirtualHost>s are processed in the order they appear in the configuration. The first matching ServerName or ServerAlias is used, with no different precedence for wildcards (nor for ServerName vs. ServerAlias).

The complete list of names in the <VirtualHost> directive is treated just like a (non-wildcard) ServerAlias.

See also
• UseCanonicalName
• Apache HTTP Server Virtual Host documentation (p. 124)

ServerName Directive

Description: Hostname and port that the server uses to identify itself
Syntax: ServerName [scheme://]domain-name|ip-address[:port]
Context: server config, virtual host
Status: Core
Module: core

The ServerName directive sets the request scheme, hostname and port that the server uses to identify itself.
ServerName is used (possibly in conjunction with ServerAlias) to uniquely identify a virtual host, when using name-based virtual hosts (p. 125).

Additionally, this is used when creating self-referential redirection URLs when UseCanonicalName is set to a non-default value.

For example, if the name of the machine hosting the web server is simple.example.com, but the machine also has the DNS alias www.example.com and you wish the web server to be so identified, the following directive should be used:

    ServerName www.example.com

The ServerName directive may appear anywhere within the definition of a server. However, each appearance overrides the previous appearance (within that server).

If no ServerName is specified, the server attempts to deduce the client visible hostname by first asking the operating system for the system hostname, and if that fails, performing a reverse lookup on an IP address present on the system.

If no port is specified in the ServerName, then the server will use the port from the incoming request. For optimal reliability and predictability, you should specify an explicit hostname and port using the ServerName directive.

If you are using name-based virtual hosts (p. 125), the ServerName inside a <VirtualHost> section specifies what hostname must appear in the request's Host: header to match this virtual host.

Sometimes, the server runs behind a device that processes SSL, such as a reverse proxy, load balancer or SSL offload appliance. When this is the case, specify the https:// scheme and the port number to which the clients connect in the ServerName directive to make sure that the server generates the correct self-referential URLs.

See the description of the UseCanonicalName and UseCanonicalPhysicalPort directives for settings which determine whether self-referential URLs (e.g., by the mod_dir module) will refer to the specified port, or to the port number given in the client's request.

Warning: Failure to set ServerName to a name that your server can resolve to an IP address will result in a startup warning. httpd will then use whatever hostname it can determine, using the system's hostname command. This will almost never be the hostname you actually want.

    httpd: Could not reliably determine the server's fully qualified domain name, using rocinante.local for ServerName

See also
• Issues Regarding DNS and Apache HTTP Server (p. 121)
• Apache HTTP Server virtual host documentation (p. 124)
• UseCanonicalName
• UseCanonicalPhysicalPort
• ServerAlias

ServerPath Directive

Description: Legacy URL pathname for a name-based virtual host that is accessed by an incompatible browser
Syntax: ServerPath URL-path
Context: virtual host
Status: Core
Module: core

The ServerPath directive sets the legacy URL pathname for a host, for use with name-based virtual hosts (p. 124).

See also
• Apache HTTP Server Virtual Host documentation (p. 124)

ServerRoot Directive

Description: Base directory for the server installation
Syntax: ServerRoot directory-path
Default: ServerRoot /usr/local/apache
Context: server config
Status: Core
Module: core

The ServerRoot directive sets the directory in which the server lives. Typically it will contain the subdirectories conf/ and logs/. Relative paths in other configuration directives (such as Include or LoadModule, for example) are taken as relative to this directory.

    ServerRoot "/home/httpd"

The default location of ServerRoot may be modified by using the --prefix argument to configure (p. 307), and most third-party distributions of the server have a different default location from the one listed above.

See also
• the -d option to httpd (p. 27)
• the security tips (p. 364) for information on how to properly set permissions on the ServerRoot

ServerSignature Directive

Description: Configures the footer on server-generated documents
Syntax: ServerSignature On|Off|EMail
Default: ServerSignature Off
Context: server config, virtual host, directory, .htaccess
Override: All
Status: Core
Module: core

The ServerSignature directive allows the configuration of a trailing footer line under server-generated documents (error messages, mod_proxy ftp directory listings, mod_info output, ...). The reason to enable such a footer line is that, in a chain of proxies, the user often cannot tell which of the chained servers actually produced a returned error message.

The Off setting, which is the default, suppresses the footer line (and is therefore compatible with the behavior of Apache-1.2 and below). The On setting simply adds a line with the server version number and ServerName of the serving virtual host, and the EMail setting additionally creates a "mailto:" reference to the ServerAdmin of the referenced document.

After version 2.0.44, the details of the server version number presented are controlled by the ServerTokens directive.

See also
• ServerTokens

ServerTokens Directive

Description: Configures the Server HTTP response header
Syntax: ServerTokens Major|Minor|Min[imal]|Prod[uctOnly]|OS|Full
Default: ServerTokens Full
Context: server config
Status: Core
Module: core

This directive controls whether the Server response header field which is sent back to clients includes a description of the generic OS-type of the server as well as information about compiled-in modules.
ServerTokens Full (or not specified)
    Server sends (e.g.): Server: Apache/2.4.2 (Unix) PHP/4.2.2 MyMod/1.2
ServerTokens Prod[uctOnly]
    Server sends (e.g.): Server: Apache
ServerTokens Major
    Server sends (e.g.): Server: Apache/2
ServerTokens Minor
    Server sends (e.g.): Server: Apache/2.4
ServerTokens Min[imal]
    Server sends (e.g.): Server: Apache/2.4.2
ServerTokens OS
    Server sends (e.g.): Server: Apache/2.4.2 (Unix)

This setting applies to the entire server, and cannot be enabled or disabled on a virtualhost-by-virtualhost basis.

After version 2.0.44, this directive also controls the information presented by the ServerSignature directive.

Note: Setting ServerTokens to less than minimal is not recommended because it makes it more difficult to debug interoperational problems. Also note that disabling the Server: header does nothing at all to make your server more secure. The idea of "security through obscurity" is a myth and leads to a false sense of safety.

See also
• ServerSignature

SetHandler Directive

Description: Forces all matching files to be processed by a handler
Syntax: SetHandler handler-name|none|expression
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Core
Module: core
Compatibility: 2.5 and later

When placed into an .htaccess file or a <Directory> or <Location> section, this directive forces all matching files to be parsed through the handler (p. 108) given by handler-name. For example, if you had a directory you wanted to be parsed entirely as imagemap rule files, regardless of extension, you might put the following into an .htaccess file in that directory:

    SetHandler imap-file

Another example: if you wanted to have the server display a status report whenever a URL of http://servername/status was called, you might put the following into httpd.conf:

    <Location "/status">
        SetHandler server-status
    </Location>

You could also use this directive to configure a particular handler for files with a particular file extension.
For example:

    <FilesMatch "\.php$">
        SetHandler application/x-httpd-php
    </FilesMatch>

String-valued expressions can be used to reference per-request variables, including backreferences to named regular expressions:

    <LocationMatch ^/app/(?<sub>[^/]+)/>
        SetHandler "proxy:unix:/var/run/app_%{env:MATCH_sub}.sock|fcgi://localhost:8080"
    </LocationMatch>

You can override an earlier defined SetHandler directive by using the value None.

=⇒ Note
Because SetHandler overrides default handlers, normal behavior such as handling of URLs ending in a slash (/) as directories or index files is suppressed.

See also
• AddHandler

SetInputFilter Directive

Description: Sets the filters that will process client requests and POST input
Syntax: SetInputFilter filter[;filter...]
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Core
Module: core

The SetInputFilter directive sets the filter or filters which will process client requests and POST input when they are received by the server. This is in addition to any filters defined elsewhere, including the AddInputFilter directive.

If more than one filter is specified, they must be separated by semicolons in the order in which they should process the content.

See also
• Filters (p. 110) documentation

SetOutputFilter Directive

Description: Sets the filters that will process responses from the server
Syntax: SetOutputFilter filter[;filter...]
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Core
Module: core

The SetOutputFilter directive sets the filters which will process responses from the server before they are sent to the client. This is in addition to any filters defined elsewhere, including the AddOutputFilter directive.

For example, the following configuration will process all files in the /www/data/ directory for server-side includes.

    <Directory "/www/data/">
        SetOutputFilter INCLUDES
    </Directory>

If more than one filter is specified, they must be separated by semicolons in the order in which they should process the content.
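To illustrate the semicolon-separated form, here is a minimal sketch that chains two output filters. It assumes that mod_include and mod_deflate are both loaded (they provide the INCLUDES and DEFLATE filters respectively); the choice of DEFLATE as the second filter and the Options line are assumptions made for illustration, not part of the text above.

```apache
# Sketch only: assumes mod_include and mod_deflate are loaded.
<Directory "/www/data/">
    # Server-side includes must be permitted for the INCLUDES filter to act.
    Options +Includes
    # Filters are listed in the order they should process the response:
    # first expand server-side includes, then compress the result.
    SetOutputFilter INCLUDES;DEFLATE
</Directory>
```

Listing the filters in the opposite order would hand already compressed content to the include parser, which is why the order of the list matters.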
See also
• Filters (p. 110) documentation

TimeOut Directive

Description: Amount of time the server will wait for certain events before failing a request
Syntax: TimeOut seconds
Default: TimeOut 60
Context: server config, virtual host
Status: Core
Module: core

The TimeOut directive defines the length of time Apache httpd will wait for I/O in various circumstances:

• When reading data from the client, the length of time to wait for a TCP packet to arrive if the read buffer is empty. For initial data on a new connection, this directive doesn't take effect until after any configured AcceptFilter has passed the new connection to the server.
• When writing data to the client, the length of time to wait for an acknowledgement of a packet if the send buffer is full.
• In mod_cgi and mod_cgid, the length of time to wait for output from a CGI script.
• In mod_ext_filter, the length of time to wait for output from a filtering process.
• In mod_proxy, the default timeout value if ProxyTimeout is not configured.

TraceEnable Directive

Description: Determines the behavior on TRACE requests
Syntax: TraceEnable [on|off|extended]
Default: TraceEnable on
Context: server config, virtual host
Status: Core
Module: core

This directive overrides the behavior of TRACE for both the core server and mod_proxy. The default TraceEnable on permits TRACE requests per RFC 2616, which disallows any request body to accompany the request. TraceEnable off causes the core server and mod_proxy to return a 405 (Method not allowed) error to the client.

Finally, for testing and diagnostic purposes only, request bodies may be allowed using the non-compliant TraceEnable extended directive. The core (as an origin server) will restrict the request body to 64k (plus 8k for chunk headers if Transfer-Encoding: chunked is used). The core will reflect the full headers and all chunk headers with the response body. As a proxy server, the request body is not restricted to 64k.
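As a sketch of how the three settings might be combined, the fragment below keeps the default for the main server and enables the extended mode only on a separate diagnostic virtual host. The loopback address, port and host name are hypothetical values chosen for illustration.

```apache
# Main server: the default, TRACE permitted without a request body.
TraceEnable on

# Hypothetical diagnostic-only virtual host bound to the loopback interface.
Listen 127.0.0.1:8081
<VirtualHost 127.0.0.1:8081>
    ServerName trace-test.example.com
    # Non-compliant mode: TRACE requests may carry a request body.
    TraceEnable extended
</VirtualHost>
```

Replacing either setting with TraceEnable off would make that server answer TRACE requests with a 405 response instead.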
=⇒ Note
Despite claims to the contrary, TRACE is not a security vulnerability, and there is no viable reason for it to be disabled. Doing so necessarily makes your server noncompliant.

UnDefine Directive

Description: Undefine the existence of a variable
Syntax: UnDefine parameter-name
Context: server config, virtual host
Status: Core
Module: core

Undoes the effect of a Define or of passing a -D argument to httpd.

This directive can be used to toggle the use of <IfDefine> sections without needing to alter -D arguments in any startup scripts.

While this directive is supported in virtual host context, the changes it makes are visible to any later configuration directives, beyond any enclosing virtual host.

UseCanonicalName Directive

Description: Configures how the server determines its own name and port
Syntax: UseCanonicalName On|Off|DNS
Default: UseCanonicalName Off
Context: server config, virtual host, directory
Status: Core
Module: core

In many situations Apache httpd must construct a self-referential URL, that is, a URL that refers back to the same server. With UseCanonicalName On Apache httpd will use the hostname and port specified in the ServerName directive to construct the canonical name for the server. This name is used in all self-referential URLs, and for the values of SERVER_NAME and SERVER_PORT in CGIs.

With UseCanonicalName Off Apache httpd will form self-referential URLs using the hostname and port supplied by the client if any are supplied (otherwise it will use the canonical name, as defined above). These values are the same that are used to implement name-based virtual hosts (p. 125) and are available with the same clients. The CGI variables SERVER_NAME and SERVER_PORT will be constructed from the client supplied values as well.

An example where this may be useful is on an intranet server where you have users connecting to the machine using short names such as www.
You'll notice that if the users type a shortname and a URL which is a directory, such as http://www/splat, without the trailing slash, then Apache httpd will redirect them to http://www.example.com/splat/. If you have authentication enabled, this will cause the user to have to authenticate twice (once for www and once again for www.example.com; see the FAQ on this subject for more information [7]). But if UseCanonicalName is set Off, then Apache httpd will redirect to http://www/splat/.

There is a third option, UseCanonicalName DNS, which is intended for use with mass IP-based virtual hosting to support ancient clients that do not provide a Host: header. With this option, Apache httpd does a reverse DNS lookup on the server IP address that the client connected to in order to work out self-referential URLs.

! Warning
If CGIs make assumptions about the values of SERVER_NAME, they may be broken by this option. The client is essentially free to give whatever value they want as a hostname. But if the CGI is only using SERVER_NAME to construct self-referential URLs, then it should be just fine.

See also
• UseCanonicalPhysicalPort
• ServerName
• Listen

UseCanonicalPhysicalPort Directive

Description: Configures how the server determines its own port
Syntax: UseCanonicalPhysicalPort On|Off
Default: UseCanonicalPhysicalPort Off
Context: server config, virtual host, directory
Status: Core
Module: core

In many situations Apache httpd must construct a self-referential URL, that is, a URL that refers back to the same server. With UseCanonicalPhysicalPort On, Apache httpd will, when constructing the canonical port for the server to honor the UseCanonicalName directive, provide the actual physical port number being used by this request as a potential port. With UseCanonicalPhysicalPort Off, Apache httpd will never use the actual physical port number, instead relying on all configured information to construct a valid port number.
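A minimal sketch of a setup where the physical port matters is shown here. The host name and the second port are assumptions for illustration; the expected outcome described in the comments follows from the documented lookup order for UseCanonicalName On (port from ServerName, then physical port, then default port) and has not been verified against a running server.

```apache
# Hypothetical server listening on two ports.
Listen 80
Listen 8080

# No port is given in ServerName, so the first step of the lookup yields nothing.
ServerName www.example.com
UseCanonicalName On

# With this set to On, the port the request actually arrived on becomes the
# next candidate, so a request received on 8080 would be expected to produce
# self-referential URLs of the form http://www.example.com:8080/...
UseCanonicalPhysicalPort On
```

With UseCanonicalPhysicalPort Off in the same configuration, the physical port is skipped and the default port would be used instead.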
[7] http://wiki.apache.org/httpd/FAQ#Why_does_Apache_ask_for_my_password_twice_before_serving_a_file.3F

=⇒ Note
The ordering of the lookup when the physical port is used is as follows:

UseCanonicalName On
1. Port provided in ServerName
2. Physical port
3. Default port

UseCanonicalName Off | DNS
1. Parsed port from Host: header
2. Physical port
3. Port provided in ServerName
4. Default port

With UseCanonicalPhysicalPort Off, the physical ports are removed from the ordering.

See also
• UseCanonicalName
• ServerName
• Listen

VirtualHost Directive

Description: Contains directives that apply only to a specific hostname or IP address
Syntax: <VirtualHost addr[:port] [addr[:port]] ...> ... </VirtualHost>
Context: server config
Status: Core
Module: core

<VirtualHost> and </VirtualHost> are used to enclose a group of directives that will apply only to a particular virtual host. Any directive that is allowed in a virtual host context may be used. When the server receives a request for a document on a particular virtual host, it uses the configuration directives enclosed in the <VirtualHost> section. Addr can be any of the following, optionally followed by a colon and a port number (or *):

• The IP address of the virtual host;
• A fully qualified domain name for the IP address of the virtual host (not recommended);
• The character *, which acts as a wildcard and matches any IP address.
• The string _default_, which is an alias for *

    <VirtualHost 10.1.2.3:80>
        ServerAdmin webmaster@host.example.com
        DocumentRoot "/www/docs/host.example.com"
        ServerName host.example.com
        ErrorLog "logs/host.example.com-error_log"
        TransferLog "logs/host.example.com-access_log"
    </VirtualHost>

IPv6 addresses must be specified in square brackets because the optional port number could not be determined otherwise. An IPv6 example is shown below:
    <VirtualHost [2001:db8::a00:20ff:fea7:ccea]:80>
        ServerAdmin webmaster@host.example.com
        DocumentRoot "/www/docs/host.example.com"
        ServerName host.example.com
        ErrorLog "logs/host.example.com-error_log"
        TransferLog "logs/host.example.com-access_log"
    </VirtualHost>

Each Virtual Host must correspond to a different IP address, different port number, or a different host name for the server; in the former case the server machine must be configured to accept IP packets for multiple addresses. (If the machine does not have multiple network interfaces, then this can be accomplished with the ifconfig alias command, if your OS supports it.)

=⇒ Note
The use of <VirtualHost> does not affect what addresses Apache httpd listens on. You may need to ensure that Apache httpd is listening on the correct addresses using Listen.

A ServerName should be specified inside each <VirtualHost> block. If it is absent, the ServerName from the "main" server configuration will be inherited.

When a request is received, the server first maps it to the best matching <VirtualHost> based on the local IP address and port combination only. Non-wildcards have a higher precedence. If no match based on IP and port occurs at all, the "main" server configuration is used.

If multiple virtual hosts contain the best matching IP address and port, the server selects from these virtual hosts the best match based on the requested hostname. If no matching name-based virtual host is found, then the first listed virtual host that matched the IP address will be used. As a consequence, the first listed virtual host for a given IP address and port combination is the default virtual host for that IP and port combination.

! Security
See the security tips (p. 364) document for details on why your security could be compromised if the directory where log files are stored is writable by anyone other than the user that starts the server.

See also
• Apache HTTP Server Virtual Host documentation (p. 124)
• Issues Regarding DNS and Apache HTTP Server (p. 121)
• Setting which addresses and ports Apache HTTP Server uses (p. 88)
• How <Directory>, <Location> and <Files> sections work (p. 35) for an explanation of how these different sections are combined when a request is received

Warning Directive

Description: Warn from configuration parsing with a custom message
Syntax: Warning message
Context: server config, virtual host, directory, .htaccess
Status: Core
Module: core
Compatibility: 2.5 and later

If an issue can be detected from within the configuration, this directive can be used to generate a custom warning message. The configuration parsing is not halted. The typical use is to check whether some user-defined options are set, and warn if not.

    # Example
    # tell when ReverseProxy is not set
    <IfDefine !ReverseProxy>
        Warning "reverse proxy is not started, hope this is okay!"
    </IfDefine>

    <IfDefine ReverseProxy>
        # define custom proxy configuration
    </IfDefine>

10.4 Apache Module mod_access_compat

Description: Group authorizations based on host (name or IP address)
Status: Extension
ModuleIdentifier: access_compat_module
SourceFile: mod_access_compat.c
Compatibility: Available in Apache HTTP Server 2.3 as a compatibility module with previous versions of Apache httpd 2.x. The directives provided by this module have been deprecated by the new authz refactoring. Please see mod_authz_host

Summary

The directives provided by mod_access_compat are used in <Directory>, <Files>, and <Location> sections as well as .htaccess (p. 380) files to control access to particular parts of the server. Access can be controlled based on the client hostname, IP address, or other characteristics of the client request, as captured in environment variables (p. 92). The Allow and Deny directives are used to specify which clients are or are not allowed access to the server, while the Order directive sets the default access state, and configures how the Allow and Deny directives interact with each other.

Both host-based access restrictions and password-based authentication may be implemented simultaneously.
In that case, the Satisfy directive is used to determine how the two sets of restrictions interact.

! Note
The directives provided by mod_access_compat have been deprecated by mod_authz_host. Mixing old directives like Order, Allow or Deny with new ones like Require is technically possible but discouraged. This module was created to support configurations containing only old directives to facilitate the 2.4 upgrade. Please check the upgrading (p. 2) guide for more information.

In general, access restriction directives apply to all access methods (GET, PUT, POST, etc). This is the desired behavior in most cases. However, it is possible to restrict some methods, while leaving other methods unrestricted, by enclosing the directives in a <Limit> section.

=⇒ Merging of configuration sections
When any directive provided by this module is used in a new configuration section, no directives provided by this module are inherited from previous configuration sections.

Directives
• Allow
• Deny
• Order
• Satisfy

See also
• Require
• mod_authz_host
• mod_authz_core

Allow Directive

Description: Controls which hosts can access an area of the server
Syntax: Allow from all|host|env=[!]env-variable [host|env=[!]env-variable] ...
Context: directory, .htaccess
Override: Limit
Status: Extension
Module: mod_access_compat

The Allow directive affects which hosts can access an area of the server. Access can be controlled by hostname, IP address, IP address range, or by other characteristics of the client request captured in environment variables.

The first argument to this directive is always from. The subsequent arguments can take three different forms. If Allow from all is specified, then all hosts are allowed access, subject to the configuration of the Deny and Order directives as discussed below.
To allow only particular hosts or groups of hosts to access the server, the host can be specified in any of the following formats:

A (partial) domain-name

    Allow from example.org
    Allow from .net example.edu

  Hosts whose names match, or end in, this string are allowed access. Only complete components are matched, so the above example will match foo.example.org but it will not match fooexample.org. This configuration will cause Apache httpd to perform a double DNS lookup on the client IP address, regardless of the setting of the HostnameLookups directive. It will do a reverse DNS lookup on the IP address to find the associated hostname, and then do a forward lookup on the hostname to assure that it matches the original IP address. Only if the forward and reverse DNS are consistent and the hostname matches will access be allowed.

A full IP address

    Allow from 10.1.2.3
    Allow from 192.168.1.104 192.168.1.205

  An IP address of a host allowed access

A partial IP address

    Allow from 10.1
    Allow from 10 172.20 192.168.2

  The first 1 to 3 bytes of an IP address, for subnet restriction.

A network/netmask pair

    Allow from 10.1.0.0/255.255.0.0

  A network a.b.c.d, and a netmask w.x.y.z. For more fine-grained subnet restriction.

A network/nnn CIDR specification

    Allow from 10.1.0.0/16

  Similar to the previous case, except the netmask consists of nnn high-order 1 bits.

Note that the last three examples above match exactly the same set of hosts.

IPv6 addresses and IPv6 subnets can be specified as shown below:

    Allow from 2001:db8::a00:20ff:fea7:ccea
    Allow from 2001:db8::a00:20ff:fea7:ccea/10

The third format of the arguments to the Allow directive allows access to the server to be controlled based on the existence of an environment variable (p. 92). When Allow from env=env-variable is specified, then the request is allowed access if the environment variable env-variable exists.
When Allow from env=!env-variable is specified, then the request is allowed access if the environment variable env-variable doesn't exist. The server provides the ability to set environment variables in a flexible way based on characteristics of the client request using the directives provided by mod_setenvif. Therefore, this directive can be used to allow access based on such factors as the client's User-Agent (browser type), Referer, or other HTTP request header fields.

    SetEnvIf User-Agent ^KnockKnock/2\.0 let_me_in
    <Directory "/docroot">
        Order Deny,Allow
        Deny from all
        Allow from env=let_me_in
    </Directory>

In this case, browsers with a user-agent string beginning with KnockKnock/2.0 will be allowed access, and all others will be denied.

=⇒ Merging of configuration sections
When any directive provided by this module is used in a new configuration section, no directives provided by this module are inherited from previous configuration sections.

Deny Directive

Description: Controls which hosts are denied access to the server
Syntax: Deny from all|host|env=[!]env-variable [host|env=[!]env-variable] ...
Context: directory, .htaccess
Override: Limit
Status: Extension
Module: mod_access_compat

This directive allows access to the server to be restricted based on hostname, IP address, or environment variables. The arguments for the Deny directive are identical to the arguments for the Allow directive.

Order Directive

Description: Controls the default access state and the order in which Allow and Deny are evaluated.
Syntax: Order ordering
Default: Order Deny,Allow
Context: directory, .htaccess
Override: Limit
Status: Extension
Module: mod_access_compat

The Order directive, along with the Allow and Deny directives, controls a three-pass access control system. The first pass processes either all Allow or all Deny directives, as specified by the Order directive. The second pass parses the rest of the directives (Deny or Allow).
The third pass applies to all requests which do not match either of the first two.

Note that all Allow and Deny directives are processed, unlike a typical firewall, where only the first match is used. The last match is effective (also unlike a typical firewall). Additionally, the order in which lines appear in the configuration files is not significant: all Allow lines are processed as one group, all Deny lines are considered as another, and the default state is considered by itself.

Ordering is one of:

Allow,Deny
  First, all Allow directives are evaluated; at least one must match, or the request is rejected. Next, all Deny directives are evaluated. If any matches, the request is rejected. Last, any requests which do not match an Allow or a Deny directive are denied by default.

Deny,Allow
  First, all Deny directives are evaluated; if any match, the request is denied unless it also matches an Allow directive. Any requests which do not match any Allow or Deny directives are permitted.

Mutual-failure
  This order has the same effect as Order Allow,Deny and is deprecated in its favor.

Keywords may only be separated by a comma; no whitespace is allowed between them.

Match                     Allow,Deny result                      Deny,Allow result
Match Allow only          Request allowed                        Request allowed
Match Deny only           Request denied                         Request denied
No match                  Default to second directive: Denied    Default to second directive: Allowed
Match both Allow & Deny   Final match controls: Denied           Final match controls: Allowed

In the following example, all hosts in the example.org domain are allowed access; all other hosts are denied access.

    Order Deny,Allow
    Deny from all
    Allow from example.org

In the next example, all hosts in the example.org domain are allowed access, except for the hosts which are in the foo.example.org subdomain, who are denied access. All hosts not in the example.org domain are denied access because the default state is to Deny access to the server.
    Order Allow,Deny
    Allow from example.org
    Deny from foo.example.org

On the other hand, if the Order in the last example is changed to Deny,Allow, all hosts will be allowed access. This happens because, regardless of the actual ordering of the directives in the configuration file, the Allow from example.org will be evaluated last and will override the Deny from foo.example.org. All hosts not in the example.org domain will also be allowed access because the default state is Allow.

The presence of an Order directive can affect access to a part of the server even in the absence of accompanying Allow and Deny directives because of its effect on the default access state. For example,

    <Directory "/www">
        Order Allow,Deny
    </Directory>

will Deny all access to the /www directory because the default access state is set to Deny.

The Order directive controls the order of access directive processing only within each phase of the server's configuration processing. This implies, for example, that an Allow or Deny directive occurring in a <Location> section will always be evaluated after an Allow or Deny directive occurring in a <Directory> section or .htaccess file, regardless of the setting of the Order directive. For details on the merging of configuration sections, see the documentation on How Directory, Location and Files sections work (p. 35).

=⇒ Merging of configuration sections
When any directive provided by this module is used in a new configuration section, no directives provided by this module are inherited from previous configuration sections.

Satisfy Directive

Description: Interaction between host-level access control and user authentication
Syntax: Satisfy Any|All
Default: Satisfy All
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_access_compat

Access policy if both Allow and Require are used. The parameter can be either All or Any.
This directive is only useful if access to a particular area is being restricted by both username/password and client host address. In this case the default behavior (All) is to require that the client passes the address access restriction and enters a valid username and password. With the Any option the client will be granted access if they either pass the host restriction or enter a valid username and password. This can be used to password restrict an area, but to let clients from particular addresses in without prompting for a password.

For example, if you wanted to let people on your network have unrestricted access to a portion of your website, but require that people outside of your network provide a password, you could use a configuration similar to the following:

    Require valid-user
    Allow from 192.168.1
    Satisfy Any

Another frequent use of the Satisfy directive is to relax access restrictions for a subdirectory:

    <Directory "/var/www/private">
        Require valid-user
    </Directory>

    <Directory "/var/www/private/public">
        Allow from all
        Satisfy Any
    </Directory>

In the above example, authentication will be required for the /var/www/private directory, but will not be required for the /var/www/private/public directory.

Since version 2.0.51 Satisfy directives can be restricted to particular methods by <Limit> and <LimitExcept> sections.

=⇒ Merging of configuration sections
When any directive provided by this module is used in a new configuration section, no directives provided by this module are inherited from previous configuration sections.

See also
• Allow
• Require

10.5 Apache Module mod_actions

Description: Execute CGI scripts based on media type or request method.
Status: Base
ModuleIdentifier: actions_module
SourceFile: mod_actions.c

Summary

This module has two directives. The Action directive lets you run CGI scripts whenever a file of a certain MIME content type is requested. The Script directive lets you run CGI scripts whenever a particular method is used in a request. This makes it much easier to execute scripts that process files.
Directives
• Action
• Script

See also
• mod_cgi
• Dynamic Content with CGI (p. 236)
• Apache httpd's Handler Use (p. 108)

Action Directive

Description: Activates a CGI script for a particular handler or content-type
Syntax: Action action-type cgi-script [virtual]
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_actions

This directive adds an action, which will activate cgi-script when action-type is triggered by the request. The cgi-script is the URL-path to a resource that has been designated as a CGI script using ScriptAlias or AddHandler. The action-type can be either a handler (p. 108) or a MIME content type. It sends the URL and file path of the requested document using the standard CGI PATH_INFO and PATH_TRANSLATED environment variables. The handler used for the particular request is passed using the REDIRECT_HANDLER variable.

Example: MIME type

    # Requests for files of a particular MIME content type:
    Action image/gif /cgi-bin/images.cgi

In this example, requests for files with a MIME content type of image/gif will be handled by the specified cgi script /cgi-bin/images.cgi.

Example: File extension

    # Files of a particular file extension
    AddHandler my-file-type .xyz
    Action my-file-type /cgi-bin/program.cgi

In this example, requests for files with a file extension of .xyz are handled by the specified cgi script /cgi-bin/program.cgi.

The optional virtual modifier turns off the check whether the requested file really exists. This is useful, for example, if you want to use the Action directive in virtual locations.

    <Location "/news">
        SetHandler news-handler
        Action news-handler /cgi-bin/news.cgi virtual
    </Location>

See also
• AddHandler

Script Directive

Description: Activates a CGI script for a particular request method.
Syntax: Script method cgi-script
Context: server config, virtual host, directory
Status: Base
Module: mod_actions

This directive adds an action, which will activate cgi-script when a file is requested using the method of method. The cgi-script is the URL-path to a resource that has been designated as a CGI script using ScriptAlias or AddHandler. The URL and file path of the requested document are sent using the standard CGI PATH_INFO and PATH_TRANSLATED environment variables.

=⇒ Any arbitrary method name may be used. Method names are case-sensitive, so Script PUT and Script put have two entirely different effects.

Note that the Script command defines default actions only. If a CGI script is called, or some other resource that is capable of handling the requested method internally, it will do so. Also note that Script with a method of GET will only be called if there are query arguments present (e.g., foo.html?hi). Otherwise, the request will proceed normally.

    # All GET requests go here
    Script GET /cgi-bin/search

    # A CGI PUT handler
    Script PUT /~bob/put.cgi

10.6 Apache Module mod_alias

Description: Provides for mapping different parts of the host filesystem in the document tree and for URL redirection
Status: Base
ModuleIdentifier: alias_module
SourceFile: mod_alias.c

Summary

The directives contained in this module allow for manipulation and control of URLs as requests arrive at the server. The Alias and ScriptAlias directives are used to map between URLs and filesystem paths. This allows for content which is not directly under the DocumentRoot to be served as part of the web document tree. The ScriptAlias directive has the additional effect of marking the target directory as containing only CGI scripts.

The Redirect directives are used to instruct clients to make a new request with a different URL. They are often used when a resource has moved to a new location.
When the Alias, ScriptAlias and Redirect directives are used within a <Location> or <LocationMatch> section, expression syntax (p. 99) can be used to manipulate the destination path or URL.

mod_alias is designed to handle simple URL manipulation tasks. For more complicated tasks such as manipulating the query string, use the tools provided by mod_rewrite.

Directives
• Alias
• AliasMatch
• Redirect
• RedirectMatch
• RedirectPermanent
• RedirectTemp
• ScriptAlias
• ScriptAliasMatch

See also
• mod_rewrite
• Mapping URLs to the filesystem (p. 64)

Order of Processing

Aliases and Redirects occurring in different contexts are processed like other directives according to standard merging rules (p. 35). But when multiple Aliases or Redirects occur in the same context (for example, in the same <VirtualHost> section) they are processed in a particular order.

First, all Redirects are processed before Aliases are processed, and therefore a request that matches a Redirect or RedirectMatch will never have Aliases applied. Second, the Aliases and Redirects are processed in the order they appear in the configuration files, with the first match taking precedence.

For this reason, when two or more of these directives apply to the same sub-path, you must list the most specific path first in order for all the directives to have an effect. For example, the following configuration will work as expected:

    Alias "/foo/bar" "/baz"
    Alias "/foo" "/gaq"

But if the above two directives were reversed in order, the /foo Alias would always match before the /foo/bar Alias, so the latter directive would be ignored.

When the Alias, ScriptAlias and Redirect directives are used within a <Location> or <LocationMatch> section, these directives will take precedence over any globally defined Alias, ScriptAlias and Redirect directives.
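The two remaining ordering rules can be sketched in the same way. All paths and host names below are hypothetical, and the comments restate the rules described above rather than observed server output.

```apache
# Rule 1: within one context, Redirects are processed before Aliases.
# A request for /old/page.html is redirected, so the Alias for /old
# is never applied to it, whatever order the two lines appear in.
Alias    "/old" "/srv/www/old"
Redirect "/old" "http://www.example.com/new"

# Rule 2: a directive inside <Location> takes precedence over a
# globally defined one.
Alias "/downloads" "/srv/www/downloads"

<Location "/downloads/beta">
    # Inside <Location> the URL-path argument is omitted.
    Alias "/srv/www/beta-builds"
</Location>
```

Under these assumptions a request for /downloads/beta/ would be served from /srv/www/beta-builds, while other requests under /downloads would still use the global mapping.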
Alias Directive

Description: Maps URLs to filesystem locations
Syntax: Alias [URL-path] file-path|directory-path
Context: server config, virtual host, directory
Status: Base
Module: mod_alias

The Alias directive allows documents to be stored in the local filesystem other than under the DocumentRoot. URLs with a (%-decoded) path beginning with URL-path will be mapped to local files beginning with directory-path. The URL-path is case-sensitive, even on case-insensitive file systems.

    Alias "/image" "/ftp/pub/image"

A request for http://example.com/image/foo.gif would cause the server to return the file /ftp/pub/image/foo.gif. Only complete path segments are matched, so the above alias would not match a request for http://example.com/imagefoo.gif. For more complex matching using regular expressions, see the AliasMatch directive.

Note that if you include a trailing / on the URL-path then the server will require a trailing / in order to expand the alias. That is, if you use

    Alias "/icons/" "/usr/local/apache/icons/"

then the URL /icons will not be aliased, as it lacks that trailing /. Likewise, if you omit the slash on the URL-path then you must also omit it from the file-path.

Note that you may need to specify additional <Directory> sections which cover the destination of aliases. Aliasing occurs before <Directory> sections are checked, so only the destination of aliases is affected. (Note however that <Location> sections are run through once before aliases are performed, so they will apply.)

In particular, if you are creating an Alias to a directory outside of your DocumentRoot, you may need to explicitly permit access to the target directory.

    Alias "/image" "/ftp/pub/image"
    <Directory "/ftp/pub/image">
        Require all granted
    </Directory>

Any number of slashes in the URL-path parameter matches any number of slashes in the requested URL-path.

If the Alias directive is used within a <Location> or <LocationMatch> section the URL-path is omitted, and the file-path is interpreted using expression syntax (p. 99). This syntax is available in Apache 2.4.19 and later.
    <Location "/image">
        Alias "/ftp/pub/image"
    </Location>
    <LocationMatch "/error/(?<NUMBER>[0-9]+)">
        Alias "/usr/local/apache/errors/%{env:MATCH_NUMBER}.html"
    </LocationMatch>

AliasMatch Directive

Description: Maps URLs to filesystem locations using regular expressions
Syntax: AliasMatch regex file-path|directory-path
Context: server config, virtual host
Status: Base
Module: mod_alias

This directive is equivalent to Alias, but makes use of regular expressions, instead of simple prefix matching. The supplied regular expression is matched against the URL-path, and if it matches, the server will substitute any parenthesized matches into the given string and use it as a filename. For example, to activate the /icons directory, one might use:

    AliasMatch "^/icons(/|$)(.*)" "/usr/local/apache/icons$1$2"

The full range of regular expression power is available. For example, it is possible to construct an alias with case-insensitive matching of the URL-path:

    AliasMatch "(?i)^/image(.*)" "/ftp/pub/image$1"

One subtle difference between Alias and AliasMatch is that Alias will automatically copy any additional part of the URI, past the part that matched, onto the end of the file path on the right side, while AliasMatch will not. This means that in almost all cases, you will want the regular expression to match the entire request URI from beginning to end, and to use substitution on the right side.

In other words, just changing Alias to AliasMatch will not have the same effect. At a minimum, you need to add ^ to the beginning of the regular expression and add (.*)$ to the end, and add $1 to the end of the replacement.

For example, suppose you want to replace this with AliasMatch:

    Alias "/image/" "/ftp/pub/image/"

This is NOT equivalent - don’t do this! This will send all requests that have /image/ anywhere in them to /ftp/pub/image/:

    AliasMatch "/image/" "/ftp/pub/image/"

This is what you need to get the same effect:

    AliasMatch "^/image/(.*)$" "/ftp/pub/image/$1"

Of course, there’s no point in using AliasMatch where Alias would work. AliasMatch lets you do more complicated things. For example, you could serve different kinds of files from different directories:

    AliasMatch "^/image/(.*)\.jpg$" "/files/jpg.images/$1.jpg"
    AliasMatch "^/image/(.*)\.gif$" "/files/gif.images/$1.gif"

Multiple leading slashes in the requested URL are discarded by the server before directives from this module compare against the requested URL-path.

Redirect Directive

Description: Sends an external redirect asking the client to fetch a different URL
Syntax: Redirect [status] [URL-path] URL
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_alias

The Redirect directive maps an old URL into a new one by asking the client to refetch the resource at the new location.

The old URL-path is a case-sensitive (%-decoded) path beginning with a slash. A relative path is not allowed.

The new URL may be either an absolute URL beginning with a scheme and hostname, or a URL-path beginning with a slash. In this latter case the scheme and hostname of the current server will be added.

Then any request beginning with URL-path will return a redirect request to the client at the location of the target URL. Additional path information beyond the matched URL-path will be appended to the target URL.

    # Redirect to a URL on a different host
    Redirect "/service" "http://foo2.example.com/service"

    # Redirect to a URL on the same host
    Redirect "/one" "/two"

If the client requests http://example.com/service/foo.txt, it will be told to access http://foo2.example.com/service/foo.txt instead.
This includes requests with GET parameters: a request for http://example.com/service/foo.pl?q=23&a=42 will be redirected to http://foo2.example.com/service/foo.pl?q=23&a=42. Note that POSTs will be discarded.

Only complete path segments are matched, so the above example would not match a request for http://example.com/servicefoo.txt. For more complex matching using the expression syntax (p. 99), omit the URL-path argument as described below. Alternatively, for matching using regular expressions, see the RedirectMatch directive.

Note: Redirect directives take precedence over Alias and ScriptAlias directives, irrespective of their ordering in the configuration file. Redirect directives inside a Location take precedence over Redirect and Alias directives with a URL-path.

If no status argument is given, the redirect will be "temporary" (HTTP status 302). This indicates to the client that the resource has moved temporarily. The status argument can be used to return other HTTP status codes:

permanent  Returns a permanent redirect status (301) indicating that the resource has moved permanently.
temp  Returns a temporary redirect status (302). This is the default.
seeother  Returns a "See Other" status (303) indicating that the resource has been replaced.
gone  Returns a "Gone" status (410) indicating that the resource has been permanently removed. When this status is used the URL argument should be omitted.

Other status codes can be returned by giving the numeric status code as the value of status. If the status is between 300 and 399, the URL argument must be present. If the status is not between 300 and 399, the URL argument must be omitted. The status must be a valid HTTP status code, known to the Apache HTTP Server (see the function send_error_response in http_protocol.c).

    Redirect permanent "/one" "http://example.com/two"
    Redirect 303 "/three" "http://example.com/other"
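The rules for the status argument can be summarized in a few lines of code. The following is a minimal sketch of the rules as stated above, not code taken from httpd; the name redirect_status and the use of ValueError are assumptions made for illustration, and the check that a numeric code is one known to the server is deliberately left out.

```python
# Keyword-to-status mapping as listed in the text above.
STATUS_KEYWORDS = {"permanent": 301, "temp": 302, "seeother": 303, "gone": 410}


def redirect_status(status=None, url=None):
    """Return the numeric HTTP status for a Redirect, or raise ValueError.

    Hypothetical helper: it models only the documented argument rules.
    """
    if status is None:
        code = 302  # no status argument: the redirect is "temporary"
    elif status in STATUS_KEYWORDS:
        code = STATUS_KEYWORDS[status]
    else:
        code = int(status)  # numeric status given as a string

    url_required = 300 <= code <= 399
    if url_required and url is None:
        raise ValueError("a 3xx status requires the URL argument")
    if not url_required and url is not None:
        raise ValueError("the URL argument must be omitted outside 300-399")
    return code


print(redirect_status("permanent", "http://example.com/two"))
print(redirect_status("gone"))
```

Under this model, Redirect gone with a target URL and Redirect 301 without one are both rejected, matching the two "must" rules in the paragraph above.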
If the Redirect directive is used within a <Location> or <LocationMatch> section with the URL-path omitted, then the URL parameter will be interpreted using expression syntax (p. 99). This syntax is available in Apache 2.4.19 and later.

    <Location "/one">
        Redirect permanent "http://example.com/two"
    </Location>
    <Location "/three">
        Redirect 303 "http://example.com/other"
    </Location>
    <LocationMatch "/error/(?<NUMBER>[0-9]+)">
        Redirect permanent "http://example.com/errors/%{env:MATCH_NUMBER}.html"
    </LocationMatch>

RedirectMatch Directive

Description: Sends an external redirect based on a regular expression match of the current URL
Syntax: RedirectMatch [status] regex URL
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_alias

This directive is equivalent to Redirect, but makes use of regular expressions, instead of simple prefix matching. The supplied regular expression is matched against the URL-path, and if it matches, the server will substitute any parenthesized matches into the given string and use it as the redirect target. For example, to redirect all GIF files to like-named JPEG files on another server, one might use:

    RedirectMatch "(.*)\.gif$" "http://other.example.com$1.jpg"

The considerations related to the difference between Alias and AliasMatch also apply to the difference between Redirect and RedirectMatch. See AliasMatch for details.

RedirectPermanent Directive

Description: Sends an external permanent redirect asking the client to fetch a different URL
Syntax: RedirectPermanent URL-path URL
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_alias

This directive tells the client that the redirect is permanent (status 301). It is exactly equivalent to Redirect permanent.

RedirectTemp Directive

Description: Sends an external temporary redirect asking the client to fetch a different URL
Syntax: RedirectTemp URL-path URL
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_alias
This directive tells the client that the redirect is only temporary (status 302). It is exactly equivalent to Redirect temp.

ScriptAlias Directive

Description: Maps a URL to a filesystem location and designates the target as a CGI script
Syntax: ScriptAlias [URL-path] file-path|directory-path
Context: server config, virtual host, directory
Status: Base
Module: mod_alias

The ScriptAlias directive has the same behavior as the Alias directive, except that in addition it marks the target directory as containing CGI scripts that will be processed by mod_cgi’s cgi-script handler. URLs with a case-sensitive (%-decoded) path beginning with URL-path will be mapped to scripts beginning with the second argument, which is a full pathname in the local filesystem.

    ScriptAlias "/cgi-bin/" "/web/cgi-bin/"

A request for http://example.com/cgi-bin/foo would cause the server to run the script /web/cgi-bin/foo. This configuration is essentially equivalent to:

    Alias "/cgi-bin/" "/web/cgi-bin/"
    <Location "/cgi-bin">
        SetHandler cgi-script
        Options +ExecCGI
    </Location>

ScriptAlias can also be used in conjunction with a script or handler you have. For example:

    ScriptAlias "/cgi-bin/" "/web/cgi-handler.pl"

In this scenario all files requested in /cgi-bin/ will be handled by the file you have configured, which allows you to use your own custom handler. You may want to use this as a wrapper for CGI so that you can add content, or some other bespoke action.

Warning: It is safer to avoid placing CGI scripts under the DocumentRoot in order to avoid accidentally revealing their source code if the configuration is ever changed. The ScriptAlias directive makes this easy by mapping a URL and designating CGI scripts at the same time. If you do choose to place your CGI scripts in a directory already accessible from the web, do not use ScriptAlias.
Instead, use <Directory>, SetHandler, and Options as in:

    <Directory "/usr/local/apache2/htdocs/cgi-bin">
        SetHandler cgi-script
        Options ExecCGI
    </Directory>

This is necessary since multiple URL-paths can map to the same filesystem location, potentially bypassing the ScriptAlias and revealing the source code of the CGI scripts if they are not restricted by a Directory section.

If the ScriptAlias directive is used within a <Location> or <LocationMatch> section with the URL-path omitted, then the URL parameter will be interpreted using expression syntax (p. 99). This syntax is available in Apache 2.4.19 and later.

    <Location "/cgi-bin">
        ScriptAlias "/web/cgi-bin/"
    </Location>
    <LocationMatch "/cgi-bin/errors/(?<NUMBER>[0-9]+)">
        ScriptAlias "/web/cgi-bin/errors/%{env:MATCH_NUMBER}.cgi"
    </LocationMatch>

See also
• CGI Tutorial (p. 236)

ScriptAliasMatch Directive

Description: Maps a URL to a filesystem location using a regular expression and designates the target as a CGI script
Syntax: ScriptAliasMatch regex file-path|directory-path
Context: server config, virtual host
Status: Base
Module: mod_alias

This directive is equivalent to ScriptAlias, but makes use of regular expressions, instead of simple prefix matching. The supplied regular expression is matched against the URL-path, and if it matches, the server will substitute any parenthesized matches into the given string and use it as a filename. For example, to activate the standard /cgi-bin, one might use:

    ScriptAliasMatch "^/cgi-bin(.*)" "/usr/local/apache/cgi-bin$1"

As for AliasMatch, the full range of regular expression power is available. For example, it is possible to construct an alias with case-insensitive matching of the URL-path:

    ScriptAliasMatch "(?i)^/cgi-bin(.*)" "/usr/local/apache/cgi-bin$1"

The considerations related to the difference between Alias and AliasMatch also apply to the difference between ScriptAlias and ScriptAliasMatch. See AliasMatch for details.
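The difference between prefix matching and the regular-expression variants (AliasMatch, RedirectMatch, ScriptAliasMatch) comes down to anchoring. The sketch below uses Python's re module as a stand-in for httpd's regular expression engine, with \1 in place of $1; the helper name alias_match is hypothetical, and the code only models the "substitute captures into the given string" behavior described for these directives.

```python
import re


def alias_match(regex, replacement, url_path):
    """Model of a *Match directive: search the URL-path, then substitute
    captured groups into the replacement. Nothing is appended automatically.
    (Hypothetical helper; Python's re stands in for httpd's regex engine.)
    """
    match = re.search(regex, url_path)
    if match is None:
        return None
    return match.expand(replacement)


# Unanchored: matches /image/ anywhere and drops the rest of the path.
print(alias_match(r"/image/", "/ftp/pub/image/", "/other/image/x.gif"))

# Anchored with a capture: behaves like the prefix-based Alias.
print(alias_match(r"^/image/(.*)$", r"/ftp/pub/image/\1", "/image/x.gif"))
print(alias_match(r"^/image/(.*)$", r"/ftp/pub/image/\1", "/other/image/x.gif"))
```

The unanchored pattern maps an unrelated path to /ftp/pub/image/ and loses the file name, while the anchored pattern with a capture group keeps it and rejects the unrelated path.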
10.7 Apache Module mod_allowhandlers

Description: Easily restrict what HTTP handlers can be used on the server
Status: Experimental
ModuleIdentifier: allowhandlers_module
SourceFile: mod_allowhandlers.c

Summary

This module makes it easy to restrict which handlers may be used for a request. A possible configuration would be:

    AllowHandlers not server-info server-status balancer-manager ldap-status

It also registers a handler named forbidden that simply returns 403 FORBIDDEN to the client. This can be used with directives like AddHandler.

Directives
• AllowHandlers

See also
• SetHandler
• AddHandler

AllowHandlers Directive

Description: Restrict access to the listed handlers
Syntax: AllowHandlers [not] none|handler-name [none|handler-name]...
Default: AllowHandlers all
Context: directory
Status: Experimental
Module: mod_allowhandlers

The handler names are case sensitive. The special name none can be used to match the case where no handler has been set. The special value all can be used to allow all handlers again in a later config section, even if some handlers were denied earlier in the configuration merge order:

    AllowHandlers all
    SetHandler server-status

10.8 Apache Module mod_allowmethods

Description: Easily restrict what HTTP methods can be used on the server
Status: Experimental
ModuleIdentifier: allowmethods_module
SourceFile: mod_allowmethods.c

Summary

This module makes it easy to restrict what HTTP methods can be used on a server. The most common configuration would be:

    AllowMethods GET POST OPTIONS

Directives
• AllowMethods

AllowMethods Directive

Description: Restrict access to the listed HTTP methods
Syntax: AllowMethods reset|HTTP-method [HTTP-method]...
Default: AllowMethods reset
Context: directory
Status: Experimental
Module: mod_allowmethods

The HTTP-methods are case sensitive and are generally, as per RFC, given in upper case. The GET and HEAD methods are treated as equivalent.
The reset keyword can be used to turn off mod_allowmethods in a deeper nested context:

    AllowMethods reset

Caution: The TRACE method cannot be denied by this module; use TraceEnable instead.

mod_allowmethods was written to replace the rather kludgy implementation of <Limit> and <LimitExcept>.

10.9 Apache Module mod_asis

Description: Sends files that contain their own HTTP headers
Status: Base
ModuleIdentifier: asis_module
SourceFile: mod_asis.c

Summary

This module provides the handler send-as-is which causes Apache HTTP Server to send the document without adding most of the usual HTTP headers.

This can be used to send any kind of data from the server, including redirects and other special HTTP responses, without requiring a cgi-script or an nph script.

For historical reasons, this module will also process any file with the mime type httpd/send-as-is.

Directives

This module provides no directives.

See also
• mod_headers
• mod_cern_meta
• Apache httpd’s Handler Use (p. 108)

Usage

In the server configuration file, associate files with the send-as-is handler, e.g.

    AddHandler send-as-is asis

The contents of any file with a .asis extension will then be sent by Apache httpd to the client with almost no changes. In particular, HTTP headers are derived from the file itself according to mod_cgi rules, so an asis file must include valid headers, and may also use the CGI Status: header to determine the HTTP response code. The Content-Length: header will automatically be inserted or, if included, corrected by httpd.

Here’s an example of a file whose contents are sent as is so as to tell the client that a file has redirected.

Status: 301 Now where did I leave that URL
Location: http://xyz.example.com/foo/bar.html
Content-type: text/html

Lame excuses’R’us

Fred’s exceptionally wonderful page has moved to Joe’s site.
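To make the layout of such a file concrete, the sketch below splits an asis-style document into its header block and body and reads the CGI Status: line. It is a simplified illustration, not the parser used by mod_asis or mod_cgi; the function name parse_asis, the shortened sample body, and the fallback to 200 when no Status: header is present are assumptions of this sketch.

```python
def parse_asis(text):
    """Split an asis-style document into (status, headers, body).

    Headers end at the first blank line. The Status: header, if present,
    starts with the numeric response code. (Hypothetical helper.)
    """
    head, _, body = text.partition("\n\n")
    headers = {}
    for line in head.splitlines():
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Assumption of this sketch: fall back to 200 when Status: is absent.
    status = int(headers.get("status", "200").split()[0])
    return status, headers, body


sample = (
    "Status: 301 Now where did I leave that URL\n"
    "Location: http://xyz.example.com/foo/bar.html\n"
    "Content-type: text/html\n"
    "\n"
    "Fred's exceptionally wonderful page has moved.\n"
)

status, headers, body = parse_asis(sample)
print(status, headers["location"])
```

The blank line between the headers and the body is what separates the two parts; a file without it would be read as headers only.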

Notes: The server always adds a Date: and Server: header to the data returned to the client, so these should not be included in the file. The server does not add a Last-Modified header; it probably should.

10.10 Apache Module mod_auth_basic

Description: Basic HTTP authentication
Status: Base
ModuleIdentifier: auth_basic_module
SourceFile: mod_auth_basic.c

Summary

This module allows the use of HTTP Basic Authentication to restrict access by looking up users in the given providers. HTTP Digest Authentication is provided by mod_auth_digest. This module should usually be combined with at least one authentication module such as mod_authn_file and one authorization module such as mod_authz_user.

Directives
• AuthBasicAuthoritative
• AuthBasicFake
• AuthBasicProvider
• AuthBasicUseDigestAlgorithm

See also
• AuthName
• AuthType
• Require
• Authentication howto (p. 227)

AuthBasicAuthoritative Directive

Description: Sets whether authorization and authentication are passed to lower level modules
Syntax: AuthBasicAuthoritative On|Off
Default: AuthBasicAuthoritative On
Context: directory, .htaccess
Override: AuthConfig
Status: Base
Module: mod_auth_basic

Normally, each authorization module listed in AuthBasicProvider will attempt to verify the user, and if the user is not found in any provider, access will be denied. Setting the AuthBasicAuthoritative directive explicitly to Off allows for both authentication and authorization to be passed on to other non-provider-based modules if there is no userID or rule matching the supplied userID. This should only be necessary when combining mod_auth_basic with third-party modules that are not configured with the AuthBasicProvider directive. When using such modules, the order of processing is determined in the modules’ source code and is not configurable.
AuthBasicFake Directive

Description: Fake basic authentication using the given expressions for username and password
Syntax: AuthBasicFake off|username [password]
Default: none
Context: directory, .htaccess
Override: AuthConfig
Status: Base
Module: mod_auth_basic
Compatibility: Apache HTTP Server 2.4.5 and later

The username and password specified are combined into an Authorization header, which is passed to the server or service behind the webserver. Both the username and password fields are interpreted using the expression parser (p. 99), which allows both the username and password to be set based on request parameters.

If the password is not specified, the default value "password" will be used. To disable fake basic authentication for a URL space, specify "AuthBasicFake off".

In this example, we pass a fixed username and password to a backend server.

Fixed Example

    AuthBasicFake demo demopass

In this example, we pass the email address extracted from a client certificate, extending the functionality of the FakeBasicAuth option within the SSLOptions directive. Like the FakeBasicAuth option, the password is set to the fixed string "password".

Certificate Example

    AuthBasicFake "%{SSL_CLIENT_S_DN_Email}"

Extending the above example, we generate a password by hashing the email address with a fixed passphrase, and passing the hash to the backend server. This can be used to gate into legacy systems that do not support client certificates.

Password Example

    AuthBasicFake "%{SSL_CLIENT_S_DN_Email}" "%{sha1:passphrase-%{SSL_CLIENT_S_DN_Email}}"

Exclusion Example

    AuthBasicFake off

AuthBasicProvider Directive

Description: Sets the authentication provider(s) for this location
Syntax: AuthBasicProvider provider-name [provider-name] ...
Default: AuthBasicProvider file
Context: directory, .htaccess
Override: AuthConfig
Status: Base
Module: mod_auth_basic

The AuthBasicProvider directive sets which provider is used to authenticate the users for this location. The default file provider is implemented by the mod_authn_file module. Make sure that the chosen provider module is present in the server.

Example

    AuthType basic
    AuthName "private area"
    AuthBasicProvider dbm
    AuthDBMType SDBM
    AuthDBMUserFile "/www/etc/dbmpasswd"
    Require valid-user

Providers are queried in order until a provider finds a match for the requested username, at which point this sole provider will attempt to check the password. A failure to verify the password does not result in control being passed on to subsequent providers.

Providers are implemented by mod_authn_dbm, mod_authn_file, mod_authn_dbd, mod_authnz_ldap and mod_authn_socache.

AuthBasicUseDigestAlgorithm Directive

Description: Check passwords against the authentication providers as if Digest Authentication was in force instead of Basic Authentication.
Syntax: AuthBasicUseDigestAlgorithm MD5|Off
Default: AuthBasicUseDigestAlgorithm Off
Context: directory, .htaccess
Override: AuthConfig
Status: Base
Module: mod_auth_basic
Compatibility: Apache HTTP Server 2.4.7 and later

Normally, when using Basic Authentication, the providers listed in AuthBasicProvider attempt to verify a user by checking their data stores for a matching username and associated password. The stored passwords are usually encrypted, but not necessarily so; each provider may choose its own storage scheme for passwords.

When using AuthDigestProvider and Digest Authentication, providers perform a similar check to find a matching username in their data stores. However, unlike in the Basic Authentication case, the value associated with each stored username must be an encrypted string composed from the username, realm name, and password. (See RFC 2617, Section 3.2.2.2 [8] for more details on the format used for this encrypted string.)
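The stored value described here is the MD5 hash of the string username:realm:password, as defined in the RFC section cited above. The sketch below computes it with Python's hashlib; the function names and the choice of UTF-8 for encoding the input are assumptions of this sketch, and the user:realm:hash line layout is the one used by htdigest password files.

```python
import hashlib


def digest_ha1(username, realm, password):
    """Return the hex MD5 of "username:realm:password" (RFC 2617 A1 hash).

    The UTF-8 encoding is an assumption of this sketch.
    """
    a1 = f"{username}:{realm}:{password}"
    return hashlib.md5(a1.encode("utf-8")).hexdigest()


def htdigest_line(username, realm, password):
    """Format one password-file entry as user:realm:hash."""
    return f"{username}:{realm}:{digest_ha1(username, realm, password)}"


# Illustrative values only; "private area" matches the AuthName used above.
print(htdigest_line("alice", "private area", "s3cret"))
```

Because the hash covers the realm as well as the password, changing AuthName invalidates the stored values even when the passwords themselves are unchanged.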
As a consequence of the difference in the stored values between Basic and Digest Authentication, converting from Digest Authentication to Basic Authentication generally requires that all users be assigned new passwords, as their existing passwords cannot be recovered from the password storage scheme imposed on those providers which support Digest Authentication.

Setting the AuthBasicUseDigestAlgorithm directive to MD5 will cause the user’s Basic Authentication password to be checked using the same encrypted format as for Digest Authentication. First a string composed from the username, realm name, and password is hashed with MD5; then the username and this encrypted string are passed to the providers listed in AuthBasicProvider as if AuthType was set to Digest and Digest Authentication was in force.

Through the use of AuthBasicUseDigestAlgorithm a site may switch from Digest to Basic Authentication without requiring users to be assigned new passwords.

Note: The inverse process of switching from Basic to Digest Authentication without assigning new passwords is generally not possible. Only if the Basic Authentication passwords have been stored in plain text or with a reversible encryption scheme will it be possible to recover them and generate a new data store following the Digest Authentication password storage scheme.

Note: Only providers which support Digest Authentication will be able to authenticate users when AuthBasicUseDigestAlgorithm is set to MD5. Use of other providers will result in an error response and the client will be denied access.

[8] http://tools.ietf.org/html/rfc2617#section-3.2.2.2
10.11 Apache Module mod_auth_digest

Description: User authentication using MD5 Digest Authentication
Status: Extension
ModuleIdentifier: auth_digest_module
SourceFile: mod_auth_digest.c

Summary

This module implements HTTP Digest Authentication (RFC 2617 [9]), and provides an alternative to mod_auth_basic where the password is not transmitted as cleartext. However, this does not lead to a significant security advantage over basic authentication. On the other hand, the password storage on the server is much less secure with digest authentication than with basic authentication. Therefore, using basic auth and encrypting the whole connection using mod_ssl is a much better alternative.

[9] http://www.faqs.org/rfcs/rfc2617.html

Directives
• AuthDigestAlgorithm
• AuthDigestDomain
• AuthDigestNcCheck
• AuthDigestNonceFormat
• AuthDigestNonceLifetime
• AuthDigestProvider
• AuthDigestQop
• AuthDigestShmemSize

See also
• AuthName
• AuthType
• Require
• Authentication howto (p. 227)

Using Digest Authentication

To use MD5 Digest authentication, configure the location to be protected as shown in the example below:

Example:

    AuthType Digest
    AuthName "private area"
    AuthDigestDomain "/private/" "http://mirror.my.dom/private2/"
    AuthDigestProvider file
    AuthUserFile "/web/auth/.digest_pw"
    Require valid-user

AuthDigestDomain should list the locations that will be protected by this configuration.

The password file referenced in the AuthUserFile directive may be created and managed using the htdigest tool.

Note: Digest authentication was intended to be more secure than basic authentication, but no longer fulfills that design goal. A man-in-the-middle attacker can trivially force the browser to downgrade to basic authentication. And even a passive eavesdropper can brute-force the password using today’s graphics hardware, because the hashing algorithm used by digest authentication is too fast.
Another problem is that the storage of the passwords on the server is insecure. The contents of a stolen htdigest file can be used directly for digest authentication. Therefore using mod_ssl to encrypt the whole connection is strongly recommended.

mod_auth_digest only works properly on platforms where APR supports shared memory.

AuthDigestAlgorithm Directive

Description: Selects the algorithm used to calculate the challenge and response hashes in digest authentication
Syntax: AuthDigestAlgorithm MD5|MD5-sess
Default: AuthDigestAlgorithm MD5
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_auth_digest

The AuthDigestAlgorithm directive selects the algorithm used to calculate the challenge and response hashes.

Note: MD5-sess is not correctly implemented yet.

AuthDigestDomain Directive

Description: URIs that are in the same protection space for digest authentication
Syntax: AuthDigestDomain URI [URI] ...
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_auth_digest

The AuthDigestDomain directive allows you to specify one or more URIs which are in the same protection space (i.e. use the same realm and username/password info). The specified URIs are prefixes; the client will assume that all URIs "below" these are also protected by the same username/password. The URIs may be either absolute URIs (i.e. including a scheme, host, port, etc.) or relative URIs.

This directive should always be specified and contain at least the (set of) root URI(s) for this space. Omitting to do so will cause the client to send the Authorization header for every request sent to this server. Apart from increasing the size of the request, it may also have a detrimental effect on performance if AuthDigestNcCheck is on.

The URIs specified can also point to different servers, in which case clients (which understand this) will then share username/password info across multiple servers without prompting the user each time.
AuthDigestNcCheck Directive

Description: Enables or disables checking of the nonce-count sent by the server
Syntax: AuthDigestNcCheck On|Off
Default: AuthDigestNcCheck Off
Context: server config
Status: Extension
Module: mod_auth_digest

Note: Not implemented yet.

AuthDigestNonceFormat Directive

Description: Determines how the nonce is generated
Syntax: AuthDigestNonceFormat format
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_auth_digest

Note: Not implemented yet.

AuthDigestNonceLifetime Directive

Description: How long the server nonce is valid
Syntax: AuthDigestNonceLifetime seconds
Default: AuthDigestNonceLifetime 300
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_auth_digest

The AuthDigestNonceLifetime directive controls how long the server nonce is valid. When the client contacts the server using an expired nonce the server will send back a 401 with stale=true. If seconds is greater than 0 then it specifies the amount of time for which the nonce is valid; this should probably never be set to less than 10 seconds. If seconds is less than 0 then the nonce never expires.

AuthDigestProvider Directive

Description: Sets the authentication provider(s) for this location
Syntax: AuthDigestProvider provider-name [provider-name] ...
Default: AuthDigestProvider file
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_auth_digest

The AuthDigestProvider directive sets which provider is used to authenticate the users for this location. The default file provider is implemented by the mod_authn_file module. Make sure that the chosen provider module is present in the server.

See mod_authn_dbm, mod_authn_file, mod_authn_dbd and mod_authn_socache for providers.
AuthDigestQop Directive

Description: Determines the quality-of-protection to use in digest authentication
Syntax: AuthDigestQop none|auth|auth-int [auth|auth-int]
Default: AuthDigestQop auth
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_auth_digest

The AuthDigestQop directive determines the quality-of-protection to use. auth will only do authentication (username/password); auth-int is authentication plus integrity checking (an MD5 hash of the entity is also computed and checked); none will cause the module to use the old RFC-2069 digest algorithm (which does not include integrity checking). Both auth and auth-int may be specified, in which case the browser will choose which of these to use. none should only be used if the browser for some reason does not like the challenge it receives otherwise.

Note: auth-int is not implemented yet.

AuthDigestShmemSize Directive

Description: The amount of shared memory to allocate for keeping track of clients
Syntax: AuthDigestShmemSize size
Default: AuthDigestShmemSize 1000
Context: server config
Status: Extension
Module: mod_auth_digest

The AuthDigestShmemSize directive defines the amount of shared memory that will be allocated at server startup for keeping track of clients.

Note that the shared memory segment cannot be set to less than the space that is necessary for tracking at least one client. This value is dependent on your system. If you want to find out the exact value, you may simply set AuthDigestShmemSize to the value of 0 and read the error message after trying to start the server.

The size is normally expressed in Bytes, but you may follow the number with a K or an M to express your value as KBytes or MBytes. For example, the following directives are all equivalent:

    AuthDigestShmemSize 1048576
    AuthDigestShmemSize 1024K
    AuthDigestShmemSize 1M
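The equivalence of the three size values can be checked with a few lines of arithmetic. This is a minimal sketch of the suffix rule as stated above (K for KBytes, M for MBytes, otherwise bytes), assuming the binary multiples implied by the equivalent examples; parse_shmem_size is a hypothetical name and the code is not httpd's own parser.

```python
def parse_shmem_size(value):
    """Convert a size string such as "1024K" or "1M" to a byte count.

    Hypothetical helper: it handles only the plain-number, K and M forms
    described in the text.
    """
    value = value.strip()
    multipliers = {"K": 1024, "M": 1024 * 1024}
    suffix = value[-1:]
    if suffix in multipliers:
        return int(value[:-1]) * multipliers[suffix]
    return int(value)


for text in ("1048576", "1024K", "1M"):
    print(text, "->", parse_shmem_size(text))
```

All three inputs yield the same byte count, and the default of 1000 is simply 1000 bytes.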
10.12 Apache Module mod_auth_form

Description: Form authentication
Status: Base
ModuleIdentifier: auth_form_module
SourceFile: mod_auth_form.c
Compatibility: Available in Apache 2.3 and later

Summary

Warning: Form authentication depends on the mod_session modules, and these modules make use of HTTP cookies, and as such can fall victim to Cross Site Scripting attacks, or expose potentially private information to clients. Please ensure that the relevant risks have been taken into account before enabling the session functionality on your server.

This module allows the use of an HTML login form to restrict access by looking up users in the given providers. HTML forms require significantly more configuration than the alternatives; however, an HTML login form can provide a much friendlier experience for end users.

HTTP basic authentication is provided by mod_auth_basic, and HTTP digest authentication is provided by mod_auth_digest. This module should be combined with at least one authentication module such as mod_authn_file and one authorization module such as mod_authz_user.

Once the user has been successfully authenticated, the user’s login details will be stored in a session provided by mod_session.

Directives
• AuthFormAuthoritative
• AuthFormBody
• AuthFormDisableNoStore
• AuthFormFakeBasicAuth
• AuthFormLocation
• AuthFormLoginRequiredLocation
• AuthFormLoginSuccessLocation
• AuthFormLogoutLocation
• AuthFormMethod
• AuthFormMimetype
• AuthFormPassword
• AuthFormProvider
• AuthFormSitePassphrase
• AuthFormSize
• AuthFormUsername

See also
• mod_session
• AuthName
• AuthType
• Require
• Authentication howto (p. 227)

Basic Configuration

To protect a particular URL with mod_auth_form, you need to decide where you will store your session, and you will need to decide what method you will use to authenticate.
In this simple example, the login details will be stored in a session based on mod_session_cookie, and authentication will be attempted against a file using mod_authn_file. If authentication is unsuccessful, the user will be redirected to the form login page.

Basic example

    <Location "/admin">
        AuthFormProvider file
        AuthUserFile "conf/passwd"
        AuthType form
        AuthName "/admin"
        AuthFormLoginRequiredLocation "http://example.com/login.html"
        Session On
        SessionCookieName session path=/
        Require valid-user
    </Location>

The directive AuthType will enable the mod_auth_form authentication when set to the value form. The directives AuthFormProvider and AuthUserFile specify that usernames and passwords should be checked against the chosen file.

The directives Session and SessionCookieName specify that the session is stored within an HTTP cookie on the browser. For more information on the different options for configuring a session, read the documentation for mod_session.

You can optionally add a SessionCryptoPassphrase to create an encrypted session cookie. This requires the additional module mod_session_crypto to be loaded.

In the simple example above, a URL has been protected by mod_auth_form, but the user has yet to be given an opportunity to enter their username and password. Options for doing so include providing a dedicated standalone login page for this purpose, or providing the login page inline.

Standalone Login

The login form can be hosted as a standalone page, or can be provided inline on the same page.

When configuring the login as a standalone page, unsuccessful authentication attempts should be redirected to a login form created by the website for this purpose, using the AuthFormLoginRequiredLocation directive. Typically this login page will contain an HTML form, asking the user to provide their username and password.

Example login form
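    <form method="POST" action="/dologin.html">
      Username: <input type="text" name="httpd_username" value="" />
      Password: <input type="password" name="httpd_password" value="" />
      <input type="submit" name="login" value="Login" />
    </form>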
The part that does the actual login is handled by the form-login-handler. The action of the form should point at this handler, which is configured within Apache httpd as follows:

Form login handler example

    <Location "/dologin.html">
        SetHandler form-login-handler
        AuthFormLoginRequiredLocation "http://example.com/login.html"
        AuthFormLoginSuccessLocation "http://example.com/admin/index.html"
        AuthFormProvider file
        AuthUserFile "conf/passwd"
        AuthType form
        AuthName /admin
        Session On
        SessionCookieName session path=/
    </Location>

The URL specified by the AuthFormLoginRequiredLocation directive will typically point to a page explaining to the user that their login attempt was unsuccessful, and that they should try again. The AuthFormLoginSuccessLocation directive specifies the URL the user should be redirected to upon successful login.

Alternatively, the URL to redirect the user to on success can be embedded within the login form, as in the example below. As a result, the same form-login-handler can be reused for different areas of a website.

Example login form with location
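    <form method="POST" action="/dologin.html">
      Username: <input type="text" name="httpd_username" value="" />
      Password: <input type="password" name="httpd_password" value="" />
      <input type="submit" name="login" value="Login" />
      <input type="hidden" name="httpd_location" value="http://example.com/success.html" />
    </form>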
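As a rough sketch of the reuse described above, a single form-login-handler can leave out AuthFormLoginSuccessLocation and rely on the location field submitted by each login form. The handler path /dologin.html is a hypothetical choice for illustration, and httpd_location is simply the documented default field name for AuthFormLocation.

```apache
# Hypothetical handler URL; use whatever URL the login forms post to.
<Location "/dologin.html">
    SetHandler form-login-handler
    AuthType form
    AuthName "/admin"
    AuthFormProvider file
    AuthUserFile "conf/passwd"
    # Where to send the browser when the login attempt fails.
    AuthFormLoginRequiredLocation "http://example.com/login.html"
    # No AuthFormLoginSuccessLocation here: the destination is taken from
    # the form field named below (httpd_location is the default name).
    AuthFormLocation httpd_location
    Session On
    SessionCookieName session path=/
</Location>
```

Each area of the site can then ship its own login form that posts to the same handler but carries a different destination URL in the location field.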
Inline Login

Warning: A risk exists that under certain circumstances, the login form configured using inline login may be submitted more than once, revealing login credentials to the application running underneath. The administrator must ensure that the underlying application is properly secured to prevent abuse. If in doubt, use the standalone login configuration.

As an alternative to having a dedicated login page for a website, it is possible to configure mod_auth_form to authenticate users inline, without being redirected to another page. This allows the state of the current page to be preserved during the login attempt. This can be useful in a situation where a time-limited session is in force, and the session times out in the middle of the user request. The user can be re-authenticated in place, and they can continue where they left off.

If a non-authenticated user attempts to access a page protected by mod_auth_form that isn't configured with an AuthFormLoginRequiredLocation directive, an HTTP_UNAUTHORIZED status code is returned to the browser indicating to the user that they are not authorized to view the page.

To configure inline authentication, the administrator overrides the error document returned by the HTTP_UNAUTHORIZED status code with a custom error document containing the login form, as follows:

Basic inline example

    AuthFormProvider file
    ErrorDocument 401 "/login.shtml"
    AuthUserFile "conf/passwd"
    AuthType form
    AuthName /admin
    AuthFormLoginRequiredLocation "http://example.com/login.html"
    Session On
    SessionCookieName session path=/

The error document page should contain a login form with an empty action property, as per the example below. This has the effect of submitting the form to the original protected URL, without the page having to know what that URL is.

Example inline login form
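    <form method="POST" action="">
      Username: <input type="text" name="httpd_username" value="" />
      Password: <input type="password" name="httpd_password" value="" />
      <input type="submit" name="login" value="Login" />
    </form>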
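The time-limited session scenario mentioned above might be configured roughly as follows. This is a sketch under assumptions: the /admin path and the 1800-second limit are illustrative values, and AuthFormLoginRequiredLocation is deliberately left out so that an expired session produces the 401 error document rather than a redirect.

```apache
# Hypothetical protected area with inline re-authentication.
<Location "/admin">
    AuthType form
    AuthName "/admin"
    AuthFormProvider file
    AuthUserFile "conf/passwd"
    # The 401 error document carries the inline login form.
    ErrorDocument 401 "/login.shtml"
    Session On
    SessionCookieName session path=/
    # Assumed limit: the session expires after 1800 seconds, after which
    # the next request is answered with the login form in place.
    SessionMaxAge 1800
    Require valid-user
</Location>
```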
When the end user has filled in their login details, the form will make an HTTP POST request to the original password protected URL. mod_auth_form will intercept this POST request, and if HTML fields for the username and password are present, the user will be logged in, and the original password protected URL will be returned to the user as a GET request.

Inline Login with Body Preservation

A limitation of the inline login technique described above is that, should an HTML form POST have resulted in the request to authenticate or reauthenticate, the contents of the original form posted by the browser will be lost. Depending on the function of the website, this could present significant inconvenience for the end user.

mod_auth_form addresses this by allowing the method and body of the original request to be embedded in the login form. If authentication is successful, the original method and body will be retried by Apache httpd, preserving the state of the original request.

To enable body preservation, add three additional fields to the login form as per the example below.

Example with body preservation
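    <form method="POST" action="">
      Username: <input type="text" name="httpd_username" value="" />
      Password: <input type="password" name="httpd_password" value="" />
      <input type="submit" name="login" value="Login" />
      <input type="hidden" name="httpd_method" value="POST" />
      <input type="hidden" name="httpd_mimetype" value="application/x-www-form-urlencoded" />
      <input type="hidden" name="httpd_body" value="name1=value1&name2=value2" />
    </form>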
How the method, mimetype and body of the original request are embedded within the login form will depend on the platform and technology being used within the website.

One option is to use the mod_include module together with the KeptBodySize directive and a suitable CGI script to embed the variables in the form.

Another option is to render the login form using a CGI script or other dynamic technology.

CGI example

    AuthFormProvider file
    ErrorDocument 401 "/cgi-bin/login.cgi"
    ...

Logging Out

To enable a user to log out of a particular session, configure a page to be handled by the form-logout-handler. Any attempt to access this URL will cause the username and password to be removed from the current session, effectively logging the user out.

By setting the AuthFormLogoutLocation directive, a URL can be specified that the browser will be redirected to on successful logout. This URL might explain to the user that they have been logged out, and give the user the option to log in again.

Basic logout example

    SetHandler form-logout-handler
    AuthName realm
    AuthFormLogoutLocation "http://example.com/loggedout.html"
    Session On
    SessionCookieName session path=/

Note that logging a user out does not delete the session; it merely removes the username and password from the session. If this results in an empty session, the net effect will be the removal of that session, but this is not guaranteed. If you want to guarantee the removal of a session, set the SessionMaxAge directive to a small value, like 1 (setting the directive to zero would mean no session age limit).

Basic session expiry example

    SetHandler form-logout-handler
    AuthFormLogoutLocation "http://example.com/loggedout.html"
    Session On
    SessionMaxAge 1
    SessionCookieName session path=/

Usernames and Passwords

Note that form submission involves URLEncoding the form data: in this case the username and password.
You should therefore pick usernames and passwords that avoid characters that are URL-encoded in form submission, or you may get unexpected results.

AuthFormAuthoritative Directive

Description: Sets whether authorization and authentication are passed to lower level modules
Syntax: AuthFormAuthoritative On|Off
Default: AuthFormAuthoritative On
Context: directory, .htaccess
Override: AuthConfig
Status: Base
Module: mod_auth_form

Normally, each provider listed in AuthFormProvider will attempt to verify the user, and if the user is not found in any provider, access will be denied. Setting the AuthFormAuthoritative directive explicitly to Off allows for both authentication and authorization to be passed on to other non-provider-based modules if there is no userID or rule matching the supplied userID. This should only be necessary when combining mod_auth_form with third-party modules that are not configured with the AuthFormProvider directive. When using such modules, the order of processing is determined in the modules' source code and is not configurable.

AuthFormBody Directive

Description: The name of a form field carrying the body of the request to attempt on successful login
Syntax: AuthFormBody fieldname
Default: httpd_body
Context: directory
Status: Base
Module: mod_auth_form
Compatibility: Available in Apache HTTP Server 2.3.0 and later

The AuthFormBody directive specifies the name of an HTML field which, if present, will contain the body of the request to submit should login be successful.

By populating the form with fields described by AuthFormMethod, AuthFormMimetype and AuthFormBody, a website can retry a request that may have been interrupted by the login screen, or by a session timeout.
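A minimal configuration sketch for body preservation, assuming an inline login on a hypothetical /admin area. The three field names are the documented defaults written out explicitly; the 32768-byte limits are an assumed value, chosen only to show AuthFormSize being kept in step with KeptBodySize (a directive provided by mod_request).

```apache
# Sketch only: the /admin path and the 32768-byte limits are assumptions.
<Location "/admin">
    AuthType form
    AuthName "/admin"
    AuthFormProvider file
    AuthUserFile "conf/passwd"
    # Inline login: the 401 error document carries the login form.
    ErrorDocument 401 "/login.shtml"
    Session On
    SessionCookieName session path=/

    # Form fields that carry the original request (default names).
    AuthFormMethod httpd_method
    AuthFormMimetype httpd_mimetype
    AuthFormBody httpd_body

    # Keep the original request body available, and let the login
    # form be large enough to carry that body back to the server.
    KeptBodySize 32768
    AuthFormSize 32768

    Require valid-user
</Location>
```

If the login form is posted with a body larger than AuthFormSize, the request is rejected, so a preserved body that approaches the limit needs a correspondingly larger value.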
AuthFormDisableNoStore Directive

Description: Disable the Cache-Control no-store header on the login page
Syntax: AuthFormDisableNoStore On|Off
Default: AuthFormDisableNoStore Off
Context: directory
Status: Base
Module: mod_auth_form
Compatibility: Available in Apache HTTP Server 2.3.0 and later

The AuthFormDisableNoStore flag disables the sending of a Cache-Control no-store header with the error 401 page returned when the user is not yet logged in. The purpose of the header is to make it difficult for an ECMAScript application to attempt to resubmit the login form, and reveal the username and password to the backend application. Disable at your own risk.

AuthFormFakeBasicAuth Directive

Description: Fake a Basic Authentication header
Syntax: AuthFormFakeBasicAuth On|Off
Default: AuthFormFakeBasicAuth Off
Context: directory
Status: Base
Module: mod_auth_form
Compatibility: Available in Apache HTTP Server 2.3.0 and later

The AuthFormFakeBasicAuth flag determines whether a Basic Authentication header will be added to the request headers. This can be used to expose the username and password to an underlying application, without the underlying application having to be aware of how the login was achieved.

AuthFormLocation Directive

Description: The name of a form field carrying a URL to redirect to on successful login
Syntax: AuthFormLocation fieldname
Default: httpd_location
Context: directory
Status: Base
Module: mod_auth_form
Compatibility: Available in Apache HTTP Server 2.3.0 and later

The AuthFormLocation directive specifies the name of an HTML field which, if present, will contain a URL to redirect the browser to should login be successful.
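To illustrate the AuthFormFakeBasicAuth flag described above, the following sketch protects a hypothetical /app area with form login while presenting the credentials to the application as if Basic Authentication had been used. The path and the login page URL are assumptions for illustration.

```apache
# Hypothetical application path.
<Location "/app">
    AuthType form
    AuthName "/app"
    AuthFormProvider file
    AuthUserFile "conf/passwd"
    AuthFormLoginRequiredLocation "http://example.com/login.html"
    Session On
    SessionCookieName session path=/
    # Add a Basic Authentication header to the request, so the
    # application sees the username and password without having to
    # know that a login form was used.
    AuthFormFakeBasicAuth On
    Require valid-user
</Location>
```

Because this exposes the password to whatever runs behind the URL, it is best limited to locations where the underlying application is trusted.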
AuthFormLoginRequiredLocation Directive

Description: The URL of the page to be redirected to should login be required
Syntax: AuthFormLoginRequiredLocation url
Default: none
Context: directory
Status: Base
Module: mod_auth_form
Compatibility: Available in Apache HTTP Server 2.3.0 and later. The use of the expression parser has been added in 2.4.4.

The AuthFormLoginRequiredLocation directive specifies the URL to redirect to should the user not be authorised to view a page. The value is parsed using the ap_expr (p. 99) parser before being sent to the client. By default, if a user is not authorised to view a page, the HTTP response code HTTP_UNAUTHORIZED will be returned with the page specified by the ErrorDocument directive. This directive overrides this default.

Use this directive if you have a dedicated login page to redirect users to.

AuthFormLoginSuccessLocation Directive

Description: The URL of the page to be redirected to should login be successful
Syntax: AuthFormLoginSuccessLocation url
Default: none
Context: directory
Status: Base
Module: mod_auth_form
Compatibility: Available in Apache HTTP Server 2.3.0 and later. The use of the expression parser has been added in 2.4.4.

The AuthFormLoginSuccessLocation directive specifies the URL to redirect to should the user have logged in successfully. The value is parsed using the ap_expr (p. 99) parser before being sent to the client. This directive can be overridden if a form field has been defined containing another URL using the AuthFormLocation directive.

Use this directive if you have a dedicated login URL, and you have not embedded the destination page in the login form.

AuthFormLogoutLocation Directive

Description: The URL to redirect to after a user has logged out
Syntax: AuthFormLogoutLocation uri
Default: none
Context: directory
Status: Base
Module: mod_auth_form
Compatibility: Available in Apache HTTP Server 2.3.0 and later.
The use of the expression parser has been added in 2.4.4.

The AuthFormLogoutLocation directive specifies the URL of a page on the server to redirect to should the user attempt to log out. The value is parsed using the ap_expr (p. 99) parser before being sent to the client.

When a URI is accessed that is served by the handler form-logout-handler, the page specified by this directive will be shown to the end user. For example:

Example

    <Location "/logout">
        SetHandler form-logout-handler
        AuthFormLogoutLocation "http://example.com/loggedout.html"
        Session on
        #...
    </Location>

An attempt to access the URI /logout/ will result in the user being logged out, and the page /loggedout.html will be displayed. Make sure that the page loggedout.html is not password protected, otherwise the page will not be displayed.

AuthFormMethod Directive

Description: The name of a form field carrying the method of the request to attempt on successful login
Syntax: AuthFormMethod fieldname
Default: httpd_method
Context: directory
Status: Base
Module: mod_auth_form
Compatibility: Available in Apache HTTP Server 2.3.0 and later

The AuthFormMethod directive specifies the name of an HTML field which, if present, will contain the method of the request to submit should login be successful.

By populating the form with fields described by AuthFormMethod, AuthFormMimetype and AuthFormBody, a website can retry a request that may have been interrupted by the login screen, or by a session timeout.

AuthFormMimetype Directive

Description: The name of a form field carrying the mimetype of the body of the request to attempt on successful login
Syntax: AuthFormMimetype fieldname
Default: httpd_mimetype
Context: directory
Status: Base
Module: mod_auth_form
Compatibility: Available in Apache HTTP Server 2.3.0 and later

The AuthFormMimetype directive specifies the name of an HTML field which, if present, will contain the mimetype of the request to submit should login be successful.
By populating the form with fields described by AuthFormMethod, AuthFormMimetype and AuthFormBody, a website can retry a request that may have been interrupted by the login screen, or by a session timeout.

AuthFormPassword Directive

Description: The name of a form field carrying the login password
Syntax: AuthFormPassword fieldname
Default: httpd_password
Context: directory
Status: Base
Module: mod_auth_form
Compatibility: Available in Apache HTTP Server 2.3.0 and later

The AuthFormPassword directive specifies the name of an HTML field which, if present, will contain the password to be used to log in.

AuthFormProvider Directive

Description: Sets the authentication provider(s) for this location
Syntax: AuthFormProvider provider-name [provider-name] ...
Default: AuthFormProvider file
Context: directory, .htaccess
Override: AuthConfig
Status: Base
Module: mod_auth_form

The AuthFormProvider directive sets which provider is used to authenticate the users for this location. The default file provider is implemented by the mod_authn_file module. Make sure that the chosen provider module is present in the server.

Example

    AuthType form
    AuthName "private area"
    AuthFormProvider dbm
    AuthDBMType SDBM
    AuthDBMUserFile "/www/etc/dbmpasswd"
    Require valid-user
    #...

Providers are implemented by mod_authn_dbm, mod_authn_file, mod_authn_dbd, mod_authnz_ldap and mod_authn_socache.

AuthFormSitePassphrase Directive

Description: Bypass authentication checks for high traffic sites
Syntax: AuthFormSitePassphrase secret
Default: none
Context: directory
Status: Base
Module: mod_auth_form
Compatibility: Available in Apache HTTP Server 2.3.0 and later

The AuthFormSitePassphrase directive specifies a passphrase which, if present in the user session, causes Apache httpd to bypass authentication checks for the given URL. It can be used on high traffic websites to reduce the load induced on authentication infrastructure.
The passphrase can be inserted into a user session by adding this directive to the configuration for the form-login-handler. The form-login-handler itself will always run the authentication checks, regardless of whether a passphrase is specified or not.

Warning: If the session is exposed to the user through the use of mod_session_cookie, and the session is not protected with mod_session_crypto, the passphrase is open to potential exposure through a dictionary attack. Regardless of how the session is configured, ensure that this directive is not used within URL spaces where private user data could be exposed, or sensitive transactions can be conducted. Use at your own risk.

AuthFormSize Directive

Description: The largest size of the form in bytes that will be parsed for the login details
Syntax: AuthFormSize size
Default: 8192
Context: directory
Status: Base
Module: mod_auth_form
Compatibility: Available in Apache HTTP Server 2.3.0 and later

The AuthFormSize directive specifies the maximum size of the body of the request that will be parsed to find the login form.

If a login request arrives that exceeds this size, the whole request will be aborted with the HTTP response code HTTP_REQUEST_TOO_LARGE.

If you have populated the form with fields described by AuthFormMethod, AuthFormMimetype and AuthFormBody, you probably want to set this field to a similar size as the KeptBodySize directive.

AuthFormUsername Directive

Description: The name of a form field carrying the login username
Syntax: AuthFormUsername fieldname
Default: httpd_username
Context: directory
Status: Base
Module: mod_auth_form
Compatibility: Available in Apache HTTP Server 2.3.0 and later

The AuthFormUsername directive specifies the name of an HTML field which, if present, will contain the username to be used to log in.
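The field-name and size directives above could be combined as in this sketch. The names login_user and login_pass, the handler path and the 4096-byte limit are all hypothetical; the only requirement taken from the text is that the login form's input names match whatever is configured here.

```apache
# Hypothetical handler URL and field names; the login form must use
# the same names for its username and password inputs.
<Location "/dologin.html">
    SetHandler form-login-handler
    AuthType form
    AuthName "/admin"
    AuthFormProvider file
    AuthUserFile "conf/passwd"
    AuthFormLoginRequiredLocation "http://example.com/login.html"
    AuthFormUsername login_user
    AuthFormPassword login_pass
    # Assumed limit: abort login requests whose body exceeds 4096 bytes
    # (the default is 8192).
    AuthFormSize 4096
    Session On
    SessionCookieName session path=/
</Location>
```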
10.13 Apache Module mod_authn_anon

Description: Allows "anonymous" user access to authenticated areas
Status: Extension
ModuleIdentifier: authn_anon_module
SourceFile: mod_authn_anon.c

Summary

This module allows authentication front-ends such as mod_auth_basic to authenticate users in a manner similar to anonymous FTP sites, i.e. with a 'magic' user ID 'anonymous' and the email address as a password. These email addresses can be logged.

Combined with other (database) access control methods, this allows for effective user tracking and customization according to a user profile while still keeping the site open for 'unregistered' users. One advantage of using Auth-based user tracking is that, unlike magic-cookies and funny URL pre/postfixes, it is completely browser independent and it allows users to share URLs.

When using mod_auth_basic, this module is invoked via the AuthBasicProvider directive with the anon value.

Directives
• Anonymous
• Anonymous_LogEmail
• Anonymous_MustGiveEmail
• Anonymous_NoUserID
• Anonymous_VerifyEmail

Example

The example below is combined with "normal" htpasswd-file based authentication and additionally allows users in as 'guests' with the following properties:

• It insists that the user enters a userID. (Anonymous_NoUserID)
• It insists that the user enters a password. (Anonymous_MustGiveEmail)
• The password entered must be a valid email address, i.e. contain at least one '@' and a '.'. (Anonymous_VerifyEmail)
• The userID must be one of anonymous guest www test welcome, and the comparison is not case sensitive. (Anonymous)
• The email addresses entered in the passwd field are logged to the error log file. (Anonymous_LogEmail)
Example

    AuthName "Use 'anonymous' & Email address for guest entry"
    AuthType Basic
    AuthBasicProvider file anon
    AuthUserFile "/path/to/your/.htpasswd"
    Anonymous_NoUserID off
    Anonymous_MustGiveEmail on
    Anonymous_VerifyEmail on
    Anonymous_LogEmail on
    Anonymous anonymous guest www test welcome
    Require valid-user

Anonymous Directive

Description: Specifies userIDs that are allowed access without password verification
Syntax: Anonymous user [user] ...
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authn_anon

A list of one or more 'magic' userIDs which are allowed access without password verification. The userIDs are space separated. It is possible to use the ' and " quotes to allow a space in a userID, as well as the \ escape character.

Please note that the comparison is case-insensitive. It is strongly recommended that the magic username 'anonymous' is always one of the allowed userIDs.

Example:

    Anonymous anonymous "Not Registered" "I don't know"

This would allow the user to enter without password verification by using the userIDs "anonymous", "AnonyMous", "Not Registered" and "I Don't Know".

As of Apache 2.1 it is possible to specify the userID as "*". That allows any supplied userID to be accepted.

Anonymous_LogEmail Directive

Description: Sets whether the password entered will be logged in the error log
Syntax: Anonymous_LogEmail On|Off
Default: Anonymous_LogEmail On
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authn_anon

When set On, the default, the 'password' entered (which hopefully contains a sensible email address) is logged in the error log.
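The "*" wildcard described under the Anonymous directive might be used as in the sketch below, which accepts any userID as a guest provided the password looks like an email address. The directory path and realm text are hypothetical, and only the anon provider is listed, so no htpasswd file is consulted.

```apache
# Hypothetical directory for guest-only access.
<Directory "/var/www/guest">
    AuthName "Guest area: any user ID, email address as password"
    AuthType Basic
    AuthBasicProvider anon
    # Accept whatever userID the client supplies.
    Anonymous "*"
    # Still insist on a non-empty password that resembles an email
    # address, and record it in the error log.
    Anonymous_MustGiveEmail On
    Anonymous_VerifyEmail On
    Anonymous_LogEmail On
    Require valid-user
</Directory>
```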
Anonymous_MustGiveEmail Directive

Description: Specifies whether blank passwords are allowed
Syntax: Anonymous_MustGiveEmail On|Off
Default: Anonymous_MustGiveEmail On
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authn_anon

Specifies whether the user must specify an email address as the password. This prohibits blank passwords.

Anonymous_NoUserID Directive

Description: Sets whether the userID field may be empty
Syntax: Anonymous_NoUserID On|Off
Default: Anonymous_NoUserID Off
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authn_anon

When set On, users can leave the userID (and perhaps the password field) empty. This can be very convenient for MS-Explorer users who can just hit return or click directly on the OK button, which seems a natural reaction.

Anonymous_VerifyEmail Directive

Description: Sets whether to check the password field for a correctly formatted email address
Syntax: Anonymous_VerifyEmail On|Off
Default: Anonymous_VerifyEmail Off
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authn_anon

When set On, the 'password' entered is checked for at least one '@' and a '.' to encourage users to enter valid email addresses (see Anonymous_LogEmail above).

10.14 Apache Module mod_authn_core

Description: Core Authentication
Status: Base
ModuleIdentifier: authn_core_module
SourceFile: mod_authn_core.c
Compatibility: Available in Apache 2.3 and later

Summary

This module provides core authentication capabilities to allow or deny access to portions of the web site. mod_authn_core provides directives that are common to all authentication providers.

Directives
• AuthName
• <AuthnProviderAlias>
• AuthType

Creating Authentication Provider Aliases

Extended authentication providers can be created within the configuration file and assigned an alias name.
The alias providers can then be referenced through the directives AuthBasicProvider or AuthDigestProvider in the same way as a base authentication provider. Besides the ability to create and alias an extended provider, it also allows the same extended authentication provider to be referenced by multiple locations.

Examples

This example checks for passwords in two different text files.

Checking multiple text password files

    # Check here first
    <AuthnProviderAlias file file1>
        AuthUserFile "/www/conf/passwords1"
    </AuthnProviderAlias>

    # Then check here
    <AuthnProviderAlias file file2>
        AuthUserFile "/www/conf/passwords2"
    </AuthnProviderAlias>

    AuthBasicProvider file1 file2
    AuthType Basic
    AuthName "Protected Area"
    Require valid-user

The example below creates two different ldap authentication provider aliases based on the ldap provider. This allows a single authenticated location to be serviced by multiple ldap hosts:

Checking multiple LDAP servers

    <AuthnProviderAlias ldap ldap-alias1>
        AuthLDAPBindDN "cn=youruser,o=ctx"
        AuthLDAPBindPassword yourpassword
        AuthLDAPURL "ldap://ldap.host/o=ctx"
    </AuthnProviderAlias>
    <AuthnProviderAlias ldap ldap-other-alias>
        AuthLDAPBindDN "cn=yourotheruser,o=dev"
        AuthLDAPBindPassword yourotherpassword
        AuthLDAPURL "ldap://other.ldap.host/o=dev?cn"
    </AuthnProviderAlias>

    Alias "/secure" "/webpages/secure"
    <Directory "/webpages/secure">
        AuthBasicProvider ldap-other-alias ldap-alias1
        AuthType Basic
        AuthName "LDAP Protected Place"
        Require valid-user
        # Note that Require ldap-* would not work here, since the
        # AuthnProviderAlias does not provide the config to authorization providers
        # that are implemented in the same module as the authentication provider.
    </Directory>

AuthName Directive

Description: Authorization realm for use in HTTP authentication
Syntax: AuthName auth-domain
Context: directory, .htaccess
Override: AuthConfig
Status: Base
Module: mod_authn_core

This directive sets the name of the authorization realm for a directory. This realm is given to the client so that the user knows which username and password to send. AuthName takes a single argument; if the realm name contains spaces, it must be enclosed in quotation marks.
It must be accompanied by AuthType and Require directives, and directives such as AuthUserFile and AuthGroupFile to work.

For example:

    AuthName "Top Secret"

The string provided for the AuthName is what will appear in the password dialog provided by most browsers.

The expression syntax (p. 99) can be used inside the directive to produce the name dynamically.

For example:

    AuthName "%{HTTP_HOST}"

See also
• Authentication, Authorization, and Access Control (p. 227)
• mod_authz_core

AuthnProviderAlias Directive

Description: Enclose a group of directives that represent an extension of a base authentication provider and referenced by the specified alias
Syntax: <AuthnProviderAlias baseProvider Alias> ... </AuthnProviderAlias>
Context: server config
Status: Base
Module: mod_authn_core

<AuthnProviderAlias> and </AuthnProviderAlias> are used to enclose a group of authentication directives that can be referenced by the alias name using one of the directives AuthBasicProvider or AuthDigestProvider.

Note: This directive has no effect on authorization, even for modules that provide both authentication and authorization.

AuthType Directive

Description: Type of user authentication
Syntax: AuthType None|Basic|Digest|Form
Context: directory, .htaccess
Override: AuthConfig
Status: Base
Module: mod_authn_core

This directive selects the type of user authentication for a directory. The authentication types available are None, Basic (implemented by mod_auth_basic), Digest (implemented by mod_auth_digest), and Form (implemented by mod_auth_form).

To implement authentication, you must also use the AuthName and Require directives. In addition, the server must have an authentication-provider module such as mod_authn_file and an authorization module such as mod_authz_user.

The authentication type None disables authentication. When authentication is enabled, it is normally inherited by each subsequent configuration section (p. 35), unless a different authentication type is specified.
If no authentication is desired for a subsection of an authenticated section, the authentication type None may be used; in the following example, clients may access the /www/docs/public directory without authenticating:

    <Directory "/www/docs">
        AuthType Basic
        AuthName Documents
        AuthBasicProvider file
        AuthUserFile "/usr/local/apache/passwd/passwords"
        Require valid-user
    </Directory>

    <Directory "/www/docs/public">
        AuthType None
        Require all granted
    </Directory>

From 2.4.13, expression syntax (p. 99) can be used inside the directive to specify the type dynamically.

Note: When disabling authentication, note that clients which have already authenticated against another portion of the server's document tree will typically continue to send authentication HTTP headers or cookies with each request, regardless of whether the server actually requires authentication for every resource.

See also
• Authentication, Authorization, and Access Control (p. 227)

10.15 Apache Module mod_authn_dbd

Description: User authentication using an SQL database
Status: Extension
ModuleIdentifier: authn_dbd_module
SourceFile: mod_authn_dbd.c

Summary

This module allows authentication front-ends such as mod_auth_digest and mod_auth_basic to authenticate users by looking them up in SQL tables. Similar functionality is provided by, for example, mod_authn_file.

This module relies on mod_dbd to specify the backend database driver and connection parameters, and manage the database connections.

When using mod_auth_basic or mod_auth_digest, this module is invoked via the AuthBasicProvider or AuthDigestProvider with the dbd value.

Directives
• AuthDBDUserPWQuery
• AuthDBDUserRealmQuery

See also
• AuthName
• AuthType
• AuthBasicProvider
• AuthDigestProvider
• DBDriver
• DBDParams
• Password Formats (p. 371)

Performance and Caching

Some users of DBD authentication in HTTPD 2.2/2.4 have reported that it imposes a problematic load on the database.
This is most likely to occur where an HTML page contains hundreds of objects (e.g. images, scripts, etc.), each of which requires authentication. Users affected (or concerned) by this kind of problem should use mod_authn_socache to cache credentials and take most of the load off the database.

Configuration Example

This simple example shows use of this module in the context of the Authentication and DBD frameworks.

    # mod_dbd configuration
    # UPDATED to include authentication caching
    DBDriver pgsql
    DBDParams "dbname=apacheauth user=apache password=xxxxxx"
    DBDMin 4
    DBDKeep 8
    DBDMax 20
    DBDExptime 300

    # mod_authn_core and mod_auth_basic configuration
    # for mod_authn_dbd
    AuthType Basic
    AuthName "My Server"

    # To cache credentials, put socache ahead of dbd here
    AuthBasicProvider socache dbd

    # Also required for caching: tell the cache to cache dbd lookups!
    AuthnCacheProvideFor dbd
    AuthnCacheContext my-server

    # mod_authz_core configuration
    Require valid-user

    # mod_authn_dbd SQL query to authenticate a user
    AuthDBDUserPWQuery "SELECT password FROM authn WHERE user = %s"

Exposing Login Information

If httpd was built against APR version 1.3.0 or higher, then whenever a query is made to the database server, all column values in the first row returned by the query are placed in the environment, using environment variables with the prefix "AUTHENTICATE_".

If a database query, for example, returned the username, full name and telephone number of a user, a CGI program will have access to this information without the need to make a second independent database query to gather this additional information.

This has the potential to dramatically simplify the coding and configuration required in some web applications.

Preventing SQL injections

Whether you need to care about SQL security depends on what DBD driver and backend you use.
With most drivers you don’t have to do anything: the statement is prepared by the database at startup, and user input is used only as data. With some drivers, however, you may need to untaint your input. At the time of writing, the only driver that requires you to take care is FreeTDS. Please read the mod_dbd documentation for more information about security in this area.

AuthDBDUserPWQuery Directive

Description: SQL query to look up a password for a user
Syntax: AuthDBDUserPWQuery query
Context: directory
Status: Extension
Module: mod_authn_dbd

The AuthDBDUserPWQuery directive specifies an SQL query to look up a password for a specified user. The user’s ID will be passed as a single string parameter when the SQL query is executed. It may be referenced within the query statement using a %s format specifier.

    AuthDBDUserPWQuery "SELECT password FROM authn WHERE user = %s"

The first column value of the first row returned by the query statement should be a string containing the encrypted password. Subsequent rows will be ignored. If no rows are returned, the user will not be authenticated through mod_authn_dbd.

If httpd was built against APR version 1.3.0 or higher, any additional column values in the first row returned by the query statement will be stored as environment variables with names of the form AUTHENTICATE_COLUMN.

The encrypted password format depends on which authentication frontend (e.g. mod_auth_basic or mod_auth_digest) is being used. See Password Formats (p. 371) for more information.

AuthDBDUserRealmQuery Directive

Description: SQL query to look up a password hash for a user and realm.
Syntax: AuthDBDUserRealmQuery query
Context: directory
Status: Extension
Module: mod_authn_dbd

The AuthDBDUserRealmQuery directive specifies an SQL query to look up a password for a specified user and realm in a digest authentication process. The user’s ID and the realm, in that order, will be passed as string parameters when the SQL query is executed.
They may be referenced within the query statement using %s format specifiers.

    AuthDBDUserRealmQuery "SELECT password FROM authn WHERE user = %s AND realm = %s"

The first column value of the first row returned by the query statement should be a string containing the encrypted password. Subsequent rows will be ignored. If no rows are returned, the user will not be authenticated through mod_authn_dbd.

If httpd was built against APR version 1.3.0 or higher, any additional column values in the first row returned by the query statement will be stored as environment variables with names of the form AUTHENTICATE_COLUMN.

The encrypted password format depends on which authentication frontend (e.g. mod_auth_basic or mod_auth_digest) is being used. See Password Formats (p. 371) for more information.

10.16 Apache Module mod_authn_dbm

Description: User authentication using DBM files
Status: Extension
ModuleIdentifier: authn_dbm_module
SourceFile: mod_authn_dbm.c

Summary

This module enables authentication front-ends such as mod_auth_digest and mod_auth_basic to authenticate users by looking them up in dbm password files. Similar functionality is provided by mod_authn_file.

When using mod_auth_basic or mod_auth_digest, this module is invoked via the AuthBasicProvider or AuthDigestProvider directive with the dbm value.

Directives
• AuthDBMType
• AuthDBMUserFile

See also
• AuthName
• AuthType
• AuthBasicProvider
• AuthDigestProvider
• htpasswd
• htdbm
• Password Formats (p. 371)

AuthDBMType Directive

Description: Sets the type of database file that is used to store passwords
Syntax: AuthDBMType default|SDBM|GDBM|NDBM|DB
Default: AuthDBMType default
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authn_dbm

Sets the type of database file that is used to store the passwords. The default database type is determined at compile time.
The availability of other types of database files also depends on compile-time settings (p. 307).

For example, to enable support for Berkeley DB (corresponding to the db type), the --with-berkeley-db option needs to be added to httpd’s configure to generate the necessary DSO.

It is crucial that whatever program you use to create your password files is configured to use the same type of database.

AuthDBMUserFile Directive

Description: Sets the name of a database file containing the list of users and passwords for authentication
Syntax: AuthDBMUserFile file-path
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authn_dbm

The AuthDBMUserFile directive sets the name of a DBM file containing the list of users and passwords for user authentication. File-path is the absolute path to the user file.

The user file is keyed on the username. The value for a user is the encrypted password, optionally followed by a colon and arbitrary data. The colon and the data following it will be ignored by the server.

Security: Make sure that the AuthDBMUserFile is stored outside the document tree of the web server; do not put it in the directory that it protects. Otherwise, clients will be able to download the AuthDBMUserFile.

The encrypted password format depends on which authentication frontend (e.g. mod_auth_basic or mod_auth_digest) is being used. See Password Formats (p. 371) for more information.

Important compatibility note: The implementation of dbmopen in the Apache modules reads the string length of the hashed values from the DBM data structures, rather than relying upon the string being NULL-appended. Some applications, such as the Netscape web server, rely upon the string being NULL-appended, so if you are having trouble using DBM files interchangeably between applications this may be a part of the problem.

A perl script called dbmmanage is included with Apache.
This program can be used to create and update DBM format password files for use with this module. Another tool for maintaining the DBM files is the included program htdbm.

10.17 Apache Module mod_authn_file

Description: User authentication using text files
Status: Base
ModuleIdentifier: authn_file_module
SourceFile: mod_authn_file.c

Summary

This module enables authentication front-ends such as mod_auth_digest and mod_auth_basic to authenticate users by looking them up in plain text password files. Similar functionality is provided by mod_authn_dbm.

When using mod_auth_basic or mod_auth_digest, this module is invoked via the AuthBasicProvider or AuthDigestProvider directive with the file value.

Directives
• AuthUserFile

See also
• AuthBasicProvider
• AuthDigestProvider
• htpasswd
• htdigest
• Password Formats (p. 371)

AuthUserFile Directive

Description: Sets the name of a text file containing the list of users and passwords for authentication
Syntax: AuthUserFile file-path
Context: directory, .htaccess
Override: AuthConfig
Status: Base
Module: mod_authn_file

The AuthUserFile directive sets the name of a textual file containing the list of users and passwords for user authentication. File-path is the path to the user file. If it is not absolute, it is treated as relative to the ServerRoot.

Each line of the user file contains a username followed by a colon, followed by the encrypted password. If the same user ID is defined multiple times, mod_authn_file will use the first occurrence to verify the password.

The encrypted password format depends on which authentication frontend (e.g. mod_auth_basic or mod_auth_digest) is being used. See Password Formats (p. 371) for more information.

For mod_auth_basic, use the utility htpasswd which is installed as part of the binary distribution, or which can be found in src/support. See the man page (p. 325) for more details.
In short:

Create a password file Filename with username as the initial ID. It will prompt for the password:

    htpasswd -c Filename username

Add or modify username2 in the password file Filename:

    htpasswd Filename username2

Note that searching large text files is very inefficient; AuthDBMUserFile should be used instead.

For mod_auth_digest, use htdigest instead. Note that you cannot mix user data for Digest Authentication and Basic Authentication within the same file.

Security: Make sure that the AuthUserFile is stored outside the document tree of the web server. Do not put it in the directory that it protects. Otherwise, clients may be able to download the AuthUserFile.

10.18 Apache Module mod_authn_socache

Description: Manages a cache of authentication credentials to relieve the load on backends
Status: Base
ModuleIdentifier: authn_socache_module
SourceFile: mod_authn_socache.c
Compatibility: Version 2.3 and later

Summary

Maintains a cache of authentication credentials, so that a new backend lookup is not required for every authenticated request.

Directives
• AuthnCacheContext
• AuthnCacheEnable
• AuthnCacheProvideFor
• AuthnCacheSOCache
• AuthnCacheTimeout

Authentication Cacheing

Some users of more heavyweight authentication such as SQL database lookups (mod_authn_dbd) have reported it putting an unacceptable load on their authentication provider. A typical case in point is where an HTML page contains hundreds of objects (images, scripts, stylesheets, media, etc.), and a request to the page generates hundreds of effectively-immediate requests for authenticated additional contents.

mod_authn_socache provides a solution to this problem by maintaining a cache of authentication credentials.

Usage

The authentication cache should be used where authentication lookups impose a significant load on the server, or a backend or network.
Authentication by file (mod_authn_file) or dbm (mod_authn_dbm) is unlikely to benefit, as these are fast and lightweight in their own right (though in some cases, such as a network-mounted file, cacheing may be worthwhile). Other providers such as SQL or LDAP based authentication are more likely to benefit, particularly where there is an observed performance issue. Amongst the standard modules, mod_authnz_ldap manages its own cache, so only mod_authn_dbd will usually benefit from this cache.

The basic rules to cache for a provider are:

1. Include the provider you’re cacheing for in an AuthnCacheProvideFor directive.
2. List socache ahead of the provider you’re cacheing for in your AuthBasicProvider or AuthDigestProvider directive.

A simple usage example to accelerate mod_authn_dbd using dbm as a cache engine:

    #AuthnCacheSOCache is optional. If specified, it is server-wide
    AuthnCacheSOCache dbm

    AuthType Basic
    AuthName "Cached Authentication Example"
    AuthBasicProvider socache dbd
    AuthDBDUserPWQuery "SELECT password FROM authn WHERE user = %s"
    AuthnCacheProvideFor dbd
    Require valid-user
    #Optional
    AuthnCacheContext dbd-authn-example

Cacheing with custom modules

Module developers should note that their modules must be enabled for cacheing with mod_authn_socache. A single optional API function ap_authn_cache_store is provided to cache credentials a provider has just looked up or generated. Usage examples are available in r957072 [10], in which three authn providers are enabled for cacheing.

AuthnCacheContext Directive

Description: Specify a context string for use in the cache key
Syntax: AuthnCacheContext directory|server|custom-string
Default: directory
Context: directory
Status: Base
Module: mod_authn_socache

This directive specifies a string to be used along with the supplied username (and realm in the case of Digest Authentication) in constructing a cache key.
This serves to disambiguate identical usernames serving different authentication areas on the server.

Two special values for this are directory, which uses the directory context of the request as a string, and server, which uses the virtual host name.

The default is directory, which is also the most conservative setting. This is likely to be less than optimal, as it (for example) causes $app-base, $app-base/images, $app-base/scripts and $app-base/media each to have its own separate cache key. A better policy is to name the AuthnCacheContext for the password provider: for example a htpasswd file or database table.

Contexts can be shared across different areas of a server, where credentials are shared. However, this has potential to become a vector for cross-site or cross-application security breaches, so this directive is not permitted in .htaccess contexts.

AuthnCacheEnable Directive

Description: Enable Authn caching configured anywhere
Syntax: AuthnCacheEnable
Context: server config
Override: None
Status: Base
Module: mod_authn_socache

This directive is not normally necessary: it is implied if authentication cacheing is enabled anywhere in httpd.conf. However, if it is not enabled anywhere in httpd.conf it will by default not be initialised, and is therefore not available in a .htaccess context. This directive ensures it is initialised so it can be used in .htaccess.

[10] http://svn.eu.apache.org/viewvc?view=revision&revision=957072

AuthnCacheProvideFor Directive

Description: Specify which authn provider(s) to cache for
Syntax: AuthnCacheProvideFor authn-provider [...]
Default: None
Context: directory, .htaccess
Override: AuthConfig
Status: Base
Module: mod_authn_socache

This directive specifies an authentication provider or providers to cache for. Credentials found by a provider not listed in an AuthnCacheProvideFor directive will not be cached.
For example, to cache credentials found by mod_authn_dbd or by a custom provider myprovider, but leave those looked up by lightweight providers like file or dbm lookup alone:

    AuthnCacheProvideFor dbd myprovider

AuthnCacheSOCache Directive

Description: Select socache backend provider to use
Syntax: AuthnCacheSOCache provider-name[:provider-args]
Context: server config
Override: None
Status: Base
Module: mod_authn_socache
Compatibility: Optional provider arguments are available in Apache HTTP Server 2.4.7 and later

This is a server-wide setting to select a provider for the shared object cache (p. 114), followed by optional arguments for that provider. Some possible values for provider-name are "dbm", "dc", "memcache", or "shmcb", each subject to the appropriate module being loaded. If not set, your platform’s default will be used.

AuthnCacheTimeout Directive

Description: Set a timeout for cache entries
Syntax: AuthnCacheTimeout timeout (seconds)
Default: 300 (5 minutes)
Context: directory, .htaccess
Override: AuthConfig
Status: Base
Module: mod_authn_socache

Cacheing authentication data can be a security issue, though short-term cacheing is unlikely to be a problem. Typically a good solution is to cache credentials for as long as it takes to relieve the load on a backend, but no longer, though if changes to your users and passwords are infrequent then a longer timeout may suit you. The default 300 seconds (5 minutes) is both cautious and ample to keep the load on a backend such as dbd (SQL database queries) down.

This should not be confused with session timeout, which is an entirely separate issue. However, you may wish to check your session-management software for whether cached credentials can "accidentally" extend a session, and bear it in mind when setting your timeout.
10.19 Apache Module mod_authnz_fcgi

Description: Allows a FastCGI authorizer application to handle Apache httpd authentication and authorization
Status: Extension
ModuleIdentifier: authnz_fcgi_module
SourceFile: mod_authnz_fcgi.c
Compatibility: Available in version 2.4.10 and later

Summary

This module allows FastCGI authorizer applications to authenticate users and authorize access to resources. It supports generic FastCGI authorizers which participate in a single phase for authentication and authorization as well as Apache httpd-specific authenticators and authorizers which participate in one or both phases.

FastCGI authorizers can authenticate using user id and password, such as for Basic authentication, or can authenticate using arbitrary mechanisms.

Directives
• AuthnzFcgiCheckAuthnProvider
• AuthnzFcgiDefineProvider

See also
• Authentication, Authorization, and Access Control (p. 227)
• mod_auth_basic
• fcgistarter
• mod_proxy_fcgi

Invocation modes

The invocation modes for FastCGI authorizers supported by this module are distinguished by two characteristics, type and auth mechanism.

Type is simply authn for authentication, authz for authorization, or authnz for combined authentication and authorization.

Auth mechanism refers to the Apache httpd configuration mechanisms and processing phases, and can be AuthBasicProvider, Require, or check_user_id. The first two of these correspond to the directives used to enable participation in the appropriate processing phase.

Descriptions of each mode:

Type authn, mechanism AuthBasicProvider

In this mode, FCGI_ROLE is set to AUTHORIZER and FCGI_APACHE_ROLE is set to AUTHENTICATOR. The application must be defined as provider type authn using AuthnzFcgiDefineProvider and enabled with AuthBasicProvider. When invoked, the application is expected to authenticate the client using the provided user id and password. Example application:

    #!/usr/bin/perl
    use FCGI;
    my $request = FCGI::Request();
    while ($request->Accept() >= 0) {
        die if $ENV{'FCGI_APACHE_ROLE'} ne "AUTHENTICATOR";
        die if $ENV{'FCGI_ROLE'} ne "AUTHORIZER";
        die if !$ENV{'REMOTE_PASSWD'};
        die if !$ENV{'REMOTE_USER'};

        print STDERR "This text is written to the web server error log.\n";

        if ( ($ENV{'REMOTE_USER'} eq "foo" || $ENV{'REMOTE_USER'} eq "foo1") &&
             $ENV{'REMOTE_PASSWD'} eq "bar" ) {
            print "Status: 200\n";
            print "Variable-AUTHN_1: authn_01\n";
            print "Variable-AUTHN_2: authn_02\n";
            print "\n";
        }
        else {
            print "Status: 401\n\n";
        }
    }

Example configuration:

    AuthnzFcgiDefineProvider authn FooAuthn fcgi://localhost:10102/
    AuthType Basic
    AuthName "Restricted"
    AuthBasicProvider FooAuthn
    Require ...

Type authz, mechanism Require

In this mode, FCGI_ROLE is set to AUTHORIZER and FCGI_APACHE_ROLE is set to AUTHORIZER. The application must be defined as provider type authz using AuthnzFcgiDefineProvider. When invoked, the application is expected to authorize the client using the provided user id and other request data. Example application:

    #!/usr/bin/perl
    use FCGI;
    my $request = FCGI::Request();
    while ($request->Accept() >= 0) {
        die if $ENV{'FCGI_APACHE_ROLE'} ne "AUTHORIZER";
        die if $ENV{'FCGI_ROLE'} ne "AUTHORIZER";
        die if $ENV{'REMOTE_PASSWD'};

        print STDERR "This text is written to the web server error log.\n";

        if ($ENV{'REMOTE_USER'} eq "foo1") {
            print "Status: 200\n";
            print "Variable-AUTHZ_1: authz_01\n";
            print "Variable-AUTHZ_2: authz_02\n";
            print "\n";
        }
        else {
            print "Status: 403\n\n";
        }
    }

Example configuration:

    AuthnzFcgiDefineProvider authz FooAuthz fcgi://localhost:10103/
    AuthType ...
    AuthName ...
    AuthBasicProvider ...
    Require FooAuthz

Type authnz, mechanism AuthBasicProvider + Require

In this mode, which supports the web server-agnostic FastCGI AUTHORIZER protocol, FCGI_ROLE is set to AUTHORIZER and FCGI_APACHE_ROLE is not set. The application must be defined as provider type authnz using AuthnzFcgiDefineProvider.
The application is expected to handle both authentication and authorization in the same invocation using the user id, password, and other request data. The invocation occurs during the Apache httpd API authentication phase. If the application returns 200 and the same provider is invoked during the authorization phase (via Require), mod_authnz_fcgi will return success for the authorization phase without invoking the application. Example application:

    #!/usr/bin/perl
    use FCGI;
    my $request = FCGI::Request();
    while ($request->Accept() >= 0) {
        die if $ENV{'FCGI_APACHE_ROLE'};
        die if $ENV{'FCGI_ROLE'} ne "AUTHORIZER";
        die if !$ENV{'REMOTE_PASSWD'};
        die if !$ENV{'REMOTE_USER'};

        print STDERR "This text is written to the web server error log.\n";

        if ( ($ENV{'REMOTE_USER'} eq "foo" || $ENV{'REMOTE_USER'} eq "foo1") &&
             $ENV{'REMOTE_PASSWD'} eq "bar" &&
             $ENV{'REQUEST_URI'} =~ m%/bar/.*%) {
            print "Status: 200\n";
            print "Variable-AUTHNZ_1: authnz_01\n";
            print "Variable-AUTHNZ_2: authnz_02\n";
            print "\n";
        }
        else {
            print "Status: 401\n\n";
        }
    }

Example configuration:

    AuthnzFcgiDefineProvider authnz FooAuthnz fcgi://localhost:10103/
    AuthType Basic
    AuthName "Restricted"
    AuthBasicProvider FooAuthnz
    Require FooAuthnz

Type authn, mechanism check_user_id

In this mode, FCGI_ROLE is set to AUTHORIZER and FCGI_APACHE_ROLE is set to AUTHENTICATOR. The application must be defined as provider type authn using AuthnzFcgiDefineProvider. AuthnzFcgiCheckAuthnProvider specifies when it is called.
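All four modes described here set FCGI_ROLE to AUTHORIZER and differ only in FCGI_APACHE_ROLE, so an application that serves more than one provider type can tell from its environment which kind of check is being requested. The following is a minimal sketch of that dispatch in Python, for illustration only: the function name provider_type is hypothetical and not part of mod_authnz_fcgi, and a real authorizer would read these variables from the FastCGI request environment as the Perl examples do.

```python
from typing import Mapping, Optional


def provider_type(env: Mapping[str, str]) -> Optional[str]:
    """Map the role variables to a provider type: authn, authz or authnz.

    Returns None when the environment does not match any of the
    invocation modes described in this section.
    """
    if env.get("FCGI_ROLE") != "AUTHORIZER":
        return None
    apache_role = env.get("FCGI_APACHE_ROLE")
    if not apache_role:
        # Unset (or empty): the generic, web server-agnostic authorizer mode.
        return "authnz"
    if apache_role == "AUTHENTICATOR":
        # Set for both the AuthBasicProvider and check_user_id mechanisms.
        return "authn"
    if apache_role == "AUTHORIZER":
        return "authz"
    # Any other role value is outside the modes described here.
    return None
```

The example application for the check_user_id mode, which follows, performs the same role checks in Perl.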
Example application:

    #!/usr/bin/perl
    use FCGI;
    my $request = FCGI::Request();
    while ($request->Accept() >= 0) {
        die if $ENV{'FCGI_APACHE_ROLE'} ne "AUTHENTICATOR";
        die if $ENV{'FCGI_ROLE'} ne "AUTHORIZER";

        # This authorizer assumes that the RequireBasicAuth option of
        # AuthnzFcgiCheckAuthnProvider is On:
        die if !$ENV{'REMOTE_PASSWD'};
        die if !$ENV{'REMOTE_USER'};

        print STDERR "This text is written to the web server error log.\n";

        if ( ($ENV{'REMOTE_USER'} eq "foo" || $ENV{'REMOTE_USER'} eq "foo1") &&
             $ENV{'REMOTE_PASSWD'} eq "bar" ) {
            print "Status: 200\n";
            print "Variable-AUTHNZ_1: authnz_01\n";
            print "Variable-AUTHNZ_2: authnz_02\n";
            print "\n";
        }
        else {
            print "Status: 401\n\n";
            # If a response body is written here, it will be returned to
            # the client.
        }
    }

Example configuration:

    AuthnzFcgiDefineProvider authn FooAuthn fcgi://localhost:10103/
    AuthType ...
    AuthName ...
    AuthnzFcgiCheckAuthnProvider FooAuthn \
                                 Authoritative On \
                                 RequireBasicAuth Off \
                                 UserExpr "%{reqenv:REMOTE_USER}"
    Require ...

Additional examples

1. If your application supports the separate authentication and authorization roles (AUTHENTICATOR and AUTHORIZER), define separate providers as follows, even if they map to the same application:

    AuthnzFcgiDefineProvider authn FooAuthn fcgi://localhost:10102/
    AuthnzFcgiDefineProvider authz FooAuthz fcgi://localhost:10102/

   Specify the authn provider on AuthBasicProvider and the authz provider on Require:

    AuthType Basic
    AuthName "Restricted"
    AuthBasicProvider FooAuthn
    Require FooAuthz

2. If your application supports the generic AUTHORIZER role (authentication and authorization in one invocation), define a single provider as follows:

    AuthnzFcgiDefineProvider authnz FooAuthnz fcgi://localhost:10103/

   Specify the authnz provider on both AuthBasicProvider and Require:

    AuthType Basic
    AuthName "Restricted"
    AuthBasicProvider FooAuthnz
    Require FooAuthnz

Limitations

The following are potential features which are not currently implemented:

Apache httpd access checker
    The Apache httpd API access check phase is a separate phase from authentication and authorization. Some other FastCGI implementations implement this phase, which is denoted by the setting of FCGI_APACHE_ROLE to ACCESS_CHECKER.

Local (Unix) sockets or pipes
    Only TCP sockets are currently supported.

Support for mod_authn_socache
    mod_authn_socache interaction should be implemented for applications which participate in Apache httpd-style authentication.

Support for digest authentication using AuthDigestProvider
    This is expected to be a permanent limitation as there is no authorizer flow for retrieving a hash.

Application process management
    This is expected to be permanently out of scope for this module. Application processes must be controlled by other means. For example, fcgistarter can be used to start them.

AP_AUTH_INTERNAL_PER_URI
    All providers are currently registered as AP_AUTH_INTERNAL_PER_CONF, which means that checks are not performed again for internal subrequests with the same access control configuration as the initial request.

Protocol data charset conversion
    If mod_authnz_fcgi runs in an EBCDIC compilation environment, all FastCGI protocol data is written in EBCDIC and expected to be received in EBCDIC.

Multiple requests per connection
    Currently the connection to the FastCGI authorizer is closed after every phase of processing. For example, if the authorizer handles separate authn and authz phases then two connections will be used.
URI Mapping
    URIs from clients can’t be mapped, such as with the ProxyPass used with FastCGI responders.

Logging

1. Processing errors are logged at log level error and higher.
2. Messages written by the application are logged at log level warn.
3. General messages for debugging are logged at log level debug.
4. Environment variables passed to the application are logged at log level trace2. The value of the REMOTE_PASSWD variable will be obscured, but any other sensitive data will be visible in the log.
5. All I/O between the module and the FastCGI application, including all environment variables, will be logged in printable and hex format at log level trace5. All sensitive data will be visible in the log.

LogLevel can be used to configure a log level specific to mod_authnz_fcgi. For example:

    LogLevel info authnz_fcgi:trace8

AuthnzFcgiCheckAuthnProvider Directive

Description: Enables a FastCGI application to handle the check_authn authentication hook.
Syntax: AuthnzFcgiCheckAuthnProvider provider-name|None option ...
Default: none
Context: directory
Status: Extension
Module: mod_authnz_fcgi

This directive is used to enable a FastCGI authorizer to handle a specific processing phase of authentication or authorization.

Some capabilities of FastCGI authorizers require enablement using this directive instead of AuthBasicProvider:

• Non-Basic authentication; generally, determining the user id of the client and returning it from the authorizer; see the UserExpr option below
• Selecting a custom response code; for a non-200 response from the authorizer, the code from the authorizer will be the status of the response
• Setting the body of a non-200 response; if the authorizer provides a response body with a non-200 response, that body will be returned to the client; up to 8192 bytes of text are supported

provider-name
    This is the name of a provider defined with AuthnzFcgiDefineProvider.
None
    Specify None to disable a provider enabled with this directive in an outer scope, such as in a parent directory.

option
    The following options are supported:

    Authoritative On|Off (default On)
        This controls whether or not other modules are allowed to run when this module has a FastCGI authorizer configured and it fails the request.

    DefaultUser userid
        When the authorizer returns success and UserExpr is configured and evaluates to an empty string (e.g., authorizer didn’t return a variable), this value will be used as the user id. This is typically used when the authorizer has a concept of guest, or unauthenticated, users and guest users are mapped to some specific user id for logging and other purposes.

    RequireBasicAuth On|Off (default Off)
        This controls whether or not Basic auth is required before passing the request to the authorizer. If required, the authorizer won’t be invoked without a user id and password; 401 will be returned for a request without that.

    UserExpr expr (no default)
        When Basic authentication isn’t provided by the client and the authorizer determines the user, this expression, evaluated after calling the authorizer, determines the user. The expression follows ap_expr syntax (p. 99) and must resolve to a string. A typical use is to reference a Variable-XXX setting returned by the authorizer using an option like UserExpr "%{reqenv:XXX}". If this option is specified and the user id can’t be retrieved using the expression after a successful authentication, the request will be rejected with a 500 error.

AuthnzFcgiDefineProvider Directive

Description: Defines a FastCGI application as a provider for authentication and/or authorization
Syntax: AuthnzFcgiDefineProvider type provider-name backend-address
Default: none
Context: server config
Status: Extension
Module: mod_authnz_fcgi

This directive is used to define a FastCGI application as a provider for a particular phase of authentication or authorization.
type
    This must be set to authn for authentication, authz for authorization, or authnz for a generic FastCGI authorizer which performs both checks.

provider-name
    This is used to assign a name to the provider which is used in other directives such as AuthBasicProvider and Require.

backend-address
    This specifies the address of the application, in the form fcgi://hostname:port/. The application process(es) must be managed independently, such as with fcgistarter.

10.20 Apache Module mod_authnz_ldap

Description: Allows an LDAP directory to be used to store the database for HTTP Basic authentication.
Status: Extension
ModuleIdentifier: authnz_ldap_module
SourceFile: mod_authnz_ldap.c

Summary

This module allows authentication front-ends such as mod_auth_basic to authenticate users through an LDAP directory.

mod_authnz_ldap supports the following features:

• Known to support the OpenLDAP SDK [11] (both 1.x and 2.x), Novell LDAP SDK [12] and the iPlanet (Netscape) [13] SDK.
• Complex authorization policies can be implemented by representing the policy with LDAP filters.
• Uses extensive caching of LDAP operations via mod_ldap (p. 693).
• Support for LDAP over SSL (requires the Netscape SDK) or TLS (requires the OpenLDAP 2.x SDK or Novell LDAP SDK).

[11] http://www.openldap.org/
[12] http://developer.novell.com/ndk/cldap.htm
[13] http://www.iplanet.com/downloads/developer/

When using mod_auth_basic, this module is invoked via the AuthBasicProvider directive with the ldap value.

Directives
• AuthLDAPAuthorizePrefix
• AuthLDAPBindAuthoritative
• AuthLDAPBindDN
• AuthLDAPBindPassword
• AuthLDAPCharsetConfig
• AuthLDAPCompareAsUser
• AuthLDAPCompareDNOnServer
• AuthLDAPDereferenceAliases
• AuthLDAPGroupAttribute
• AuthLDAPGroupAttributeIsDN
• AuthLDAPInitialBindAsUser
• AuthLDAPInitialBindPattern
• AuthLDAPMaxSubGroupDepth
• AuthLDAPRemoteUserAttribute
• AuthLDAPRemoteUserIsDN
• AuthLDAPSearchAsUser
• AuthLDAPSubGroupAttribute
• AuthLDAPSubGroupClass
• AuthLDAPUrl

See also
• mod_ldap
• mod_auth_basic
• mod_authz_user
• mod_authz_groupfile

Contents
• General caveats
• Operation
    – The Authentication Phase
    – The Authorization Phase
• The Require Directives
    – Require ldap-user
    – Require ldap-group
    – Require ldap-dn
    – Require ldap-attribute
    – Require ldap-filter
    – Require ldap-search
• Examples
• Using TLS
• Using SSL
• Exposing Login Information
• Using Active Directory
• Using Microsoft FrontPage with mod_authnz_ldap
    – How It Works
    – Caveats

General caveats

This module caches authentication and authorization results based on the configuration of mod_ldap. Changes made to the backing LDAP server will not be immediately reflected on the HTTP Server, including but not limited to user lockouts/revocations, password changes, or changes to group memberships. Consult the directives in mod_ldap for details of the cache tunables.

Operation

There are two phases in granting access to a user. The first phase is authentication, in which the mod_authnz_ldap authentication provider verifies that the user’s credentials are valid. This is also called the search/bind phase. The second phase is authorization, in which mod_authnz_ldap determines if the authenticated user is allowed access to the resource in question. This is also known as the compare phase.

mod_authnz_ldap registers both an authn_ldap authentication provider and an authz_ldap authorization handler. The authn_ldap authentication provider can be enabled through the AuthBasicProvider directive using the ldap value. The authz_ldap handler extends the Require directive’s authorization types by adding ldap-user, ldap-dn and ldap-group values.

The Authentication Phase

During the authentication phase, mod_authnz_ldap searches for an entry in the directory that matches the username that the HTTP client passes.
If a single unique match is found, then mod_authnz_ldap attempts to bind to the directory server using the DN of the entry plus the password provided by the HTTP client. Because it does a search, then a bind, it is often referred to as the search/bind phase. Here are the steps taken during the search/bind phase.

1. Generate a search filter by combining the attribute and filter provided in the AuthLDAPURL directive with the username passed by the HTTP client.
2. Search the directory using the generated filter. If the search does not return exactly one entry, deny or decline access.
3. Fetch the distinguished name of the entry retrieved from the search and attempt to bind to the LDAP server using that DN and the password passed by the HTTP client. If the bind is unsuccessful, deny or decline access.

The following directives are used during the search/bind phase:

AuthLDAPURL
    Specifies the LDAP server, the base DN, the attribute to use in the search, as well as the extra search filter to use.
AuthLDAPBindDN
    An optional DN to bind with during the search phase.
AuthLDAPBindPassword
    An optional password to bind with during the search phase.

The Authorization Phase

During the authorization phase, mod_authnz_ldap attempts to determine if the user is authorized to access the resource. Many of these checks require mod_authnz_ldap to do a compare operation on the LDAP server. This is why this phase is often referred to as the compare phase. mod_authnz_ldap accepts the following Require directives to determine if the credentials are acceptable:

• Grant access if there is a Require ldap-user directive, and the username in the directive matches the username passed by the client.
• Grant access if there is a Require ldap-dn directive, and the DN in the directive matches the DN fetched from the LDAP directory.
• Grant access if there is a Require ldap-group directive, and the DN fetched from the LDAP directory (or the username passed by the client) occurs in the LDAP group or, potentially, in one of its sub-groups.
• Grant access if there is a Require ldap-attribute directive, and the attribute fetched from the LDAP directory matches the given value.
• Grant access if there is a Require ldap-filter directive, and the search filter successfully finds a single user object that matches the DN of the authenticated user.
• Grant access if there is a Require ldap-search directive, and the search filter successfully returns a single matching object with any distinguished name.
• Otherwise, deny or decline access.

Other Require values may also be used which may require loading additional authorization modules.

• Grant access to all successfully authenticated users if there is a Require valid-user directive (requires mod_authz_user).
• Grant access if there is a Require group directive, and mod_authz_groupfile has been loaded with the AuthGroupFile directive set.
• others...

mod_authnz_ldap uses the following directives during the compare phase:

AuthLDAPURL
    The attribute specified in the URL is used in compare operations for the Require ldap-user operation.
AuthLDAPCompareDNOnServer
    Determines the behavior of the Require ldap-dn directive.
AuthLDAPGroupAttribute
    Determines the attribute to use for comparisons in the Require ldap-group directive.
AuthLDAPGroupAttributeIsDN
    Specifies whether to use the user DN or the username when doing comparisons for the Require ldap-group directive.
AuthLDAPMaxSubGroupDepth
    Determines the maximum depth of sub-groups that will be evaluated during comparisons in the Require ldap-group directive.
AuthLDAPSubGroupAttribute
    Determines the attribute to use when obtaining sub-group members of the current group during comparisons in the Require ldap-group directive.
AuthLDAPSubGroupClass
    Specifies the LDAP objectClass values used to identify if queried directory objects really are group objects (as opposed to user objects) during the Require ldap-group directive's sub-group processing.

The Require Directives

Apache's Require directives are used during the authorization phase to ensure that a user is allowed to access a resource. mod_authnz_ldap extends the authorization types with ldap-user, ldap-dn, ldap-group, ldap-attribute and ldap-filter. Other authorization types may also be used but may require that additional authorization modules be loaded.

Since v2.4.8, expressions (p. 99) are supported within the LDAP require directives.

Require ldap-user

The Require ldap-user directive specifies what usernames can access the resource. Once mod_authnz_ldap has retrieved a unique DN from the directory, it does an LDAP compare operation using the username specified in the Require ldap-user to see if that username is part of the just-fetched LDAP entry. Multiple users can be granted access by putting multiple usernames on the line, separated with spaces. If a username has a space in it, then it must be surrounded with double quotes. Multiple users can also be granted access by using multiple Require ldap-user directives, with one user per line. For example, with an AuthLDAPURL of ldap://ldap/o=Example?cn (i.e., cn is used for searches), the following Require directives could be used to restrict access:

    Require ldap-user "Barbara Jenson"
    Require ldap-user "Fred User"
    Require ldap-user "Joe Manager"

Because of the way that mod_authnz_ldap handles this directive, Barbara Jenson could sign on as Barbara Jenson, Babs Jenson or any other cn that she has in her LDAP entry. Only the single Require ldap-user line is needed to support all values of the attribute in the user's entry.
If the uid attribute was used instead of the cn attribute in the URL above, the above three lines could be condensed to

    Require ldap-user bjenson fuser jmanager

Require ldap-group

This directive specifies an LDAP group whose members are allowed access. It takes the distinguished name of the LDAP group. Note: Do not surround the group name with quotes. For example, assume that the following entry existed in the LDAP directory:

    dn: cn=Administrators, o=Example
    objectClass: groupOfUniqueNames
    uniqueMember: cn=Barbara Jenson, o=Example
    uniqueMember: cn=Fred User, o=Example

The following directive would grant access to both Fred and Barbara:

    Require ldap-group cn=Administrators, o=Example

Members can also be found within sub-groups of a specified LDAP group if AuthLDAPMaxSubGroupDepth is set to a value greater than 0. For example, assume the following entries exist in the LDAP directory:

    dn: cn=Employees, o=Example
    objectClass: groupOfUniqueNames
    uniqueMember: cn=Managers, o=Example
    uniqueMember: cn=Administrators, o=Example
    uniqueMember: cn=Users, o=Example

    dn: cn=Managers, o=Example
    objectClass: groupOfUniqueNames
    uniqueMember: cn=Bob Ellis, o=Example
    uniqueMember: cn=Tom Jackson, o=Example

    dn: cn=Administrators, o=Example
    objectClass: groupOfUniqueNames
    uniqueMember: cn=Barbara Jenson, o=Example
    uniqueMember: cn=Fred User, o=Example

    dn: cn=Users, o=Example
    objectClass: groupOfUniqueNames
    uniqueMember: cn=Allan Jefferson, o=Example
    uniqueMember: cn=Paul Tilley, o=Example
    uniqueMember: cn=Temporary Employees, o=Example

    dn: cn=Temporary Employees, o=Example
    objectClass: groupOfUniqueNames
    uniqueMember: cn=Jim Swenson, o=Example
    uniqueMember: cn=Elliot Rhodes, o=Example
The following directives would allow access for Bob Ellis, Tom Jackson, Barbara Jenson, Fred User, Allan Jefferson, and Paul Tilley but would not allow access for Jim Swenson or Elliot Rhodes (since they are at a sub-group depth of 2):

    Require ldap-group cn=Employees, o=Example
    AuthLDAPMaxSubGroupDepth 1

Behavior of this directive is modified by the AuthLDAPGroupAttribute, AuthLDAPGroupAttributeIsDN, AuthLDAPMaxSubGroupDepth, AuthLDAPSubGroupAttribute, and AuthLDAPSubGroupClass directives.

Require ldap-dn

The Require ldap-dn directive allows the administrator to grant access based on distinguished names. It specifies a DN that must match for access to be granted. If the distinguished name that was retrieved from the directory server matches the distinguished name in the Require ldap-dn, then authorization is granted. Note: Do not surround the distinguished name with quotes.

The following directive would grant access to a specific DN:

    Require ldap-dn cn=Barbara Jenson, o=Example

Behavior of this directive is modified by the AuthLDAPCompareDNOnServer directive.

Require ldap-attribute

The Require ldap-attribute directive allows the administrator to grant access based on attributes of the authenticated user in the LDAP directory. If the attribute in the directory matches the value given in the configuration, access is granted.

The following directive would grant access to anyone with the attribute employeeType = active:

    Require ldap-attribute "employeeType=active"

Multiple attribute/value pairs can be specified on the same line separated by spaces, or they can be specified in multiple Require ldap-attribute directives. The effect of listing multiple attribute/value pairs is an OR operation. Access will be granted if any of the listed attribute values match the value of the corresponding attribute in the user object. If the value of the attribute contains a space, only the value must be within double quotes.
The following directive would grant access to anyone with the city attribute equal to "San Jose" or status equal to "Active":

    Require ldap-attribute city="San Jose" "status=active"

Require ldap-filter

The Require ldap-filter directive allows the administrator to grant access based on a complex LDAP search filter. If the DN returned by the filter search matches the authenticated user DN, access is granted.

The following directive would grant access to anyone who has a cell phone and is in the marketing department:

    Require ldap-filter "&(cell=*)(department=marketing)"

The difference between the Require ldap-filter directive and the Require ldap-attribute directive is that ldap-filter performs a search operation on the LDAP directory using the specified search filter rather than a simple attribute comparison. If a simple attribute comparison is all that is required, the comparison operation performed by ldap-attribute will be faster than the search operation used by ldap-filter, especially within a large directory.

When using an expression (p. 99) within the filter, care must be taken to ensure that LDAP filters are escaped correctly to guard against LDAP injection. The ldap function can be used for this purpose.

    [^/]+)/">
    Require ldap-filter "(memberOf=cn=%{ldap:%{unescape:%{env:MATCH_SITENAME}},ou=Websites,o=

Require ldap-search

The Require ldap-search directive allows the administrator to grant access based on a generic LDAP search filter using an expression (p. 99). If there is exactly one match to the search filter, regardless of the distinguished name, access is granted.

The following directive would grant access to URLs that match the given objects in the LDAP server:

    [^/]+)/">
    Require ldap-search "(cn=%{ldap:%{unescape:%{env:MATCH_SITENAME}} Website)"

Note: care must be taken to ensure that any expressions are properly escaped to guard against LDAP injection. The ldap function can be used as per the example above.
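To make the injection risk concrete, the following Python sketch shows the kind of escaping that a search filter value needs before it is interpolated into a filter string. It follows the escaping rules of RFC 4515 and is an illustration only: it is not the httpd implementation of the ldap expression function, and the names escape_filter_value and site_filter are hypothetical.

```python
# Illustrative sketch only: RFC 4515 escaping of a value placed inside an
# LDAP search filter. This is NOT the code behind httpd's "ldap" function.

_FILTER_ESCAPES = {
    "\\": r"\5c",   # backslash must be escaped first-class, it starts escapes
    "*": r"\2a",    # wildcard
    "(": r"\28",    # opens a filter component
    ")": r"\29",    # closes a filter component
    "\x00": r"\00", # NUL
}


def escape_filter_value(value: str) -> str:
    """Replace each filter metacharacter with its backslash-hex form."""
    return "".join(_FILTER_ESCAPES.get(ch, ch) for ch in value)


def site_filter(site_name: str) -> str:
    """Build a filter shaped like the ldap-search example, value escaped."""
    return "(cn=" + escape_filter_value(site_name) + " Website)"


if __name__ == "__main__":
    # A harmless name passes through unchanged.
    print(site_filter("intranet"))
    # A hostile name cannot close the component or add a wildcard.
    print(site_filter("x)(uid=*"))
```

Without the escaping step, the second input would close the cn component early and append a clause of the attacker's choosing; with it, the whole input stays inside the cn value.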
Examples

• Grant access to anyone who exists in the LDAP directory, using their UID for searches.

    AuthLDAPURL "ldap://ldap1.example.com:389/ou=People, o=Example?uid?sub?(objectClass=*)"
    Require valid-user

• The next example is the same as above, but with the fields that have useful defaults omitted. Also, note the use of a redundant LDAP server.

    AuthLDAPURL "ldap://ldap1.example.com ldap2.example.com/ou=People, o=Example"
    Require valid-user

• The next example is similar to the previous one, but it uses the common name instead of the UID. Note that this could be problematic if multiple people in the directory share the same cn, because a search on cn must return exactly one entry. That's why this approach is not recommended: it's a better idea to choose an attribute that is guaranteed unique in your directory, such as uid.

    AuthLDAPURL "ldap://ldap.example.com/ou=People, o=Example?cn"
    Require valid-user

• Grant access to anybody in the Administrators group. The users must authenticate using their UID.

    AuthLDAPURL ldap://ldap.example.com/o=Example?uid
    Require ldap-group cn=Administrators, o=Example

• Grant access to anybody in the group whose name matches the hostname of the virtual host. In this example an expression (p. 99) is used to build the filter.

    AuthLDAPURL ldap://ldap.example.com/o=Example?uid
    Require ldap-group cn=%{SERVER_NAME}, o=Example

• The next example assumes that everyone at Example who carries an alphanumeric pager will have an LDAP attribute of qpagePagerID. The example will grant access only to people (authenticated via their UID) who have alphanumeric pagers:

    AuthLDAPURL ldap://ldap.example.com/o=Example?uid??(qpagePagerID=*)
    Require valid-user

• The next example demonstrates the power of using filters to accomplish complicated administrative requirements. Without filters, it would have been necessary to create a new LDAP group and ensure that the group's members remain synchronized with the pager users.
This becomes trivial with filters. The goal is to grant access to anyone who has a pager, plus grant access to Joe Manager, who doesn't have a pager, but does need to access the same resource:

    AuthLDAPURL ldap://ldap.example.com/o=Example?uid??(|(qpagePagerID=*)(uid=jmanager))
    Require valid-user

This last example may look confusing at first, so it helps to evaluate what the search filter will look like based on who connects, as shown below. If Fred User connects as fuser, the filter would look like

    (&(|(qpagePagerID=*)(uid=jmanager))(uid=fuser))

The above search will only succeed if fuser has a pager. When Joe Manager connects as jmanager, the filter looks like

    (&(|(qpagePagerID=*)(uid=jmanager))(uid=jmanager))

The above search will succeed whether jmanager has a pager or not.

Using TLS

To use TLS, see the mod_ldap directives LDAPTrustedClientCert, LDAPTrustedGlobalCert and LDAPTrustedMode.

An optional second parameter can be added to the AuthLDAPURL to override the default connection type set by LDAPTrustedMode. This will allow the connection established by an ldap:// URL to be upgraded to a secure connection on the same port.

Using SSL

To use SSL, see the mod_ldap directives LDAPTrustedClientCert, LDAPTrustedGlobalCert and LDAPTrustedMode.

To specify a secure LDAP server, use ldaps:// in the AuthLDAPURL directive, instead of ldap://.

Exposing Login Information

When this module performs authentication, LDAP attributes specified in the AuthLDAPURL directive are placed in environment variables with the prefix "AUTHENTICATE_".

When this module performs authorization, LDAP attributes specified in the AuthLDAPURL directive are placed in environment variables with the prefix "AUTHORIZE_".
If the attribute field contains the username, common name and telephone number of a user, a CGI program will have access to this information without the need to make a second independent LDAP query to gather this additional information.

This has the potential to dramatically simplify the coding and configuration required in some web applications.

Using Active Directory

An Active Directory installation may support multiple domains at the same time. To distinguish users between domains, an identifier called a User Principal Name (UPN) can be added to a user's entry in the directory. This UPN usually takes the form of the user's account name, followed by the domain components of the particular domain, for example somebody@nz.example.com.

You may wish to configure the mod_authnz_ldap module to authenticate users present in any of the domains making up the Active Directory forest. In this way both somebody@nz.example.com and someone@au.example.com can be authenticated using the same query at the same time.

To make this practical, Active Directory supports the concept of a Global Catalog. This Global Catalog is a read-only copy of selected attributes of all the Active Directory servers within the Active Directory forest. Querying the Global Catalog allows all the domains to be queried in a single query, without the query spanning servers over potentially slow links.

If enabled, the Global Catalog is an independent directory server that runs on port 3268 (3269 for SSL). To search for a user, do a subtree search for the attribute userPrincipalName, with an empty search root, like so:

    AuthLDAPBindDN apache@example.com
    AuthLDAPBindPassword password
    AuthLDAPURL ldap://10.0.0.1:3268/?userPrincipalName?sub

Users will need to enter their User Principal Name as a login, in the form somebody@nz.example.com.
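The empty search root in the URL above is easy to miss. The following Python sketch splits an AuthLDAPURL value into its fields so that the empty base DN becomes visible. It is a simplified illustration, not the parser used by the server: the function name parse_auth_ldap_url and the LdapUrl type are hypothetical, and the defaults (uid, sub, (objectClass=*)) are taken from the first two entries in the Examples section, where the second URL is described as the first with its defaulted fields omitted.

```python
# Illustrative sketch only: split an AuthLDAPURL value of the form
#   scheme://host[:port]/basedn?attribute?scope?filter
# into its fields. Not the parser used by httpd.
from typing import NamedTuple


class LdapUrl(NamedTuple):
    scheme: str
    hosts: str      # one or more space-separated host[:port] entries
    base_dn: str    # may be empty, as in the Global Catalog example
    attribute: str
    scope: str
    filter: str


def parse_auth_ldap_url(url: str) -> LdapUrl:
    scheme, _, rest = url.partition("://")
    hosts, _, query = rest.partition("/")
    # At most four '?'-separated fields; pad the ones that were omitted.
    fields = query.split("?", 3)
    fields += [""] * (4 - len(fields))
    base_dn, attribute, scope, flt = fields
    return LdapUrl(
        scheme=scheme,
        hosts=hosts,
        base_dn=base_dn,
        attribute=attribute or "uid",       # default per the Examples section
        scope=scope or "sub",               # default per the Examples section
        filter=flt or "(objectClass=*)",    # default per the Examples section
    )


if __name__ == "__main__":
    print(parse_auth_ldap_url("ldap://10.0.0.1:3268/?userPrincipalName?sub"))
```

For the Global Catalog URL this yields an empty base_dn, the attribute userPrincipalName and the scope sub, which matches the "subtree search with an empty search root" described above.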
Using Microsoft FrontPage with mod_authnz_ldap

Normally, FrontPage uses FrontPage-web-specific user/group files (i.e., the mod_authn_file and mod_authz_groupfile modules) to handle all authentication. Unfortunately, it is not possible to just change to LDAP authentication by adding the proper directives, because it will break the Permissions forms in the FrontPage client, which attempt to modify the standard text-based authorization files.

Once a FrontPage web has been created, adding LDAP authentication to it is a matter of adding the following directives to every .htaccess file that gets created in the web:

    AuthLDAPURL       "the url"
    AuthGroupFile     "mygroupfile"
    Require group     "mygroupfile"

How It Works

FrontPage restricts access to a web by adding the Require valid-user directive to the .htaccess files. The Require valid-user directive will succeed for any user who is valid as far as LDAP is concerned. This means that anybody who has an entry in the LDAP directory is considered a valid user, whereas FrontPage considers only those people in the local user file to be valid. By substituting the ldap-group with group file authorization, Apache is allowed to consult the local user file (which is managed by FrontPage) - instead of LDAP - when authorizing the user.

Once directives have been added as specified above, FrontPage users will be able to perform all management operations from the FrontPage client.

Caveats

• When choosing the LDAP URL, the attribute to use for authentication should be something that will also be valid for putting into a mod_authn_file user file. The user ID is ideal for this.
• When adding users via FrontPage, FrontPage administrators should choose usernames that already exist in the LDAP directory (for obvious reasons).
Also, the password that the administrator enters into the form is ignored, since Apache will actually be authenticating against the password in the LDAP database, and not against the password in the local user file. This could cause confusion for web administrators.
• Apache must be compiled with mod_auth_basic, mod_authn_file and mod_authz_groupfile in order to use FrontPage support. This is because Apache will still use the mod_authz_groupfile group file to determine the extent of a user's access to the FrontPage web.
• The directives must be put in the .htaccess files. Attempting to put them inside <Location> or <Directory> directives won't work. This is because mod_authnz_ldap has to be able to grab the AuthGroupFile directive that is found in FrontPage .htaccess files so that it knows where to look for the valid user list. If the mod_authnz_ldap directives aren't in the same .htaccess file as the FrontPage directives, then the hack won't work, because mod_authnz_ldap will never get a chance to process the .htaccess file, and won't be able to find the FrontPage-managed user file.

AuthLDAPAuthorizePrefix Directive

Description: Specifies the prefix for environment variables set during authorization
Syntax: AuthLDAPAuthorizePrefix prefix
Default: AuthLDAPAuthorizePrefix AUTHORIZE_
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap
Compatibility: Available in version 2.3.6 and later

This directive allows you to override the prefix used for environment variables set during LDAP authorization. If AUTHENTICATE_ is specified, consumers of these environment variables see the same information whether LDAP has performed authentication, authorization, or both.

=⇒ Note
No authorization variables are set when a user is authorized on the basis of Require valid-user.
AuthLDAPBindAuthoritative Directive

Description: Determines if other authentication providers are used when a user can be mapped to a DN but the server cannot successfully bind with the user's credentials.
Syntax: AuthLDAPBindAuthoritative off|on
Default: AuthLDAPBindAuthoritative on
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap

By default, subsequent authentication providers are only queried if a user cannot be mapped to a DN, but not if the user can be mapped to a DN and their password cannot be verified with an LDAP bind. If AuthLDAPBindAuthoritative is set to off, other configured authentication modules will have a chance to validate the user if the LDAP bind (with the current user's credentials) fails for any reason.

This allows users present in both LDAP and AuthUserFile to authenticate when the LDAP server is available but the user's account is locked or password is otherwise unusable.

See also

• AuthUserFile
• AuthBasicProvider

AuthLDAPBindDN Directive

Description: Optional DN to use in binding to the LDAP server
Syntax: AuthLDAPBindDN distinguished-name
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap

An optional DN used to bind to the server when searching for entries. If not provided, mod_authnz_ldap will use an anonymous bind.

AuthLDAPBindPassword Directive

Description: Password used in conjunction with the bind DN
Syntax: AuthLDAPBindPassword password
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap
Compatibility: exec: was added in 2.4.5.

A bind password to use in conjunction with the bind DN. Note that the bind password is probably sensitive data, and should be properly protected. You should only use the AuthLDAPBindDN and AuthLDAPBindPassword if you absolutely need them to search the directory.
If the value begins with exec: the resulting command will be executed and the first line returned to standard output by the program will be used as the password.

    #Password used as-is
    AuthLDAPBindPassword secret

    #Run /path/to/program to get my password
    AuthLDAPBindPassword exec:/path/to/program

    #Run /path/to/otherProgram and provide arguments
    AuthLDAPBindPassword "exec:/path/to/otherProgram argument1"

AuthLDAPCharsetConfig Directive

Description: Language to charset conversion configuration file
Syntax: AuthLDAPCharsetConfig file-path
Context: server config
Status: Extension
Module: mod_authnz_ldap

The AuthLDAPCharsetConfig directive sets the location of the language to charset conversion configuration file. File-path is relative to the ServerRoot. This file specifies the list of language extensions to character sets. Most administrators use the provided charset.conv file, which associates common language extensions to character sets.

The file contains lines in the following format:

    Language-Extension charset [Language-String] ...

The case of the extension does not matter. Blank lines, and lines beginning with a hash character (#) are ignored.

AuthLDAPCompareAsUser Directive

Description: Use the authenticated user's credentials to perform authorization comparisons
Syntax: AuthLDAPCompareAsUser on|off
Default: AuthLDAPCompareAsUser off
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap
Compatibility: Available in version 2.3.6 and later

When set, and mod_authnz_ldap has authenticated the user, LDAP comparisons for authorization use the queried distinguished name (DN) and HTTP basic authentication password of the authenticated user instead of the server's configured credentials.

The ldap-attribute, ldap-user, and ldap-group (single-level only) authorization checks use comparisons.
This directive only has effect on the comparisons performed during nested group processing when AuthLDAPSearchAsUser is also enabled.

This directive should only be used when your LDAP server doesn't accept anonymous comparisons and you cannot use a dedicated AuthLDAPBindDN.

See also

• AuthLDAPInitialBindAsUser
• AuthLDAPSearchAsUser

AuthLDAPCompareDNOnServer Directive

Description: Use the LDAP server to compare the DNs
Syntax: AuthLDAPCompareDNOnServer on|off
Default: AuthLDAPCompareDNOnServer on
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap

When set, mod_authnz_ldap will use the LDAP server to compare the DNs. This is the only foolproof way to compare DNs. mod_authnz_ldap will search the directory for the DN specified with the Require ldap-dn directive, then retrieve the DN and compare it with the DN retrieved from the user entry. If this directive is not set, mod_authnz_ldap simply does a string comparison. It is possible to get false negatives with this approach, but it is much faster. Note the mod_ldap cache can speed up DN comparison in most situations.

AuthLDAPDereferenceAliases Directive

Description: When will the module de-reference aliases
Syntax: AuthLDAPDereferenceAliases never|searching|finding|always
Default: AuthLDAPDereferenceAliases always
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap

This directive specifies when mod_authnz_ldap will de-reference aliases during LDAP operations. The default is always.

AuthLDAPGroupAttribute Directive

Description: LDAP attributes used to identify the user members of groups.
Syntax: AuthLDAPGroupAttribute attribute
Default: AuthLDAPGroupAttribute member uniquemember
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap

This directive specifies which LDAP attributes are used to check for user members within groups.
Multiple attributes can be used by specifying this directive multiple times. If not specified, then mod_authnz_ldap uses the member and uniquemember attributes.

AuthLDAPGroupAttributeIsDN Directive

Description: Use the DN of the client username when checking for group membership
Syntax: AuthLDAPGroupAttributeIsDN on|off
Default: AuthLDAPGroupAttributeIsDN on
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap

When set on, this directive says to use the distinguished name of the client username when checking for group membership. Otherwise, the username will be used. For example, assume that the client sent the username bjenson, which corresponds to the LDAP DN cn=Babs Jenson, o=Example. If this directive is set, mod_authnz_ldap will check if the group has cn=Babs Jenson, o=Example as a member. If this directive is not set, then mod_authnz_ldap will check if the group has bjenson as a member.

AuthLDAPInitialBindAsUser Directive

Description: Determines if the server does the initial DN lookup using the basic authentication users' own username, instead of anonymously or with hard-coded credentials for the server
Syntax: AuthLDAPInitialBindAsUser off|on
Default: AuthLDAPInitialBindAsUser off
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap
Compatibility: Available in version 2.3.6 and later

By default, the server either anonymously, or with a dedicated user and password, converts the basic authentication username into an LDAP distinguished name (DN). This directive forces the server to use the verbatim username and password provided by the incoming user to perform the initial DN search.

If the verbatim username can't directly bind, but needs some cosmetic transformation, see AuthLDAPInitialBindPattern.

This directive should only be used when your LDAP server doesn't accept anonymous searches and you cannot use a dedicated AuthLDAPBindDN.
=⇒ Not available with authorization-only
This directive can only be used if this module authenticates the user, and has no effect when this module is used exclusively for authorization.

See also

• AuthLDAPInitialBindPattern
• AuthLDAPBindDN
• AuthLDAPCompareAsUser
• AuthLDAPSearchAsUser

AuthLDAPInitialBindPattern Directive

Description: Specifies the transformation of the basic authentication username to be used when binding to the LDAP server to perform a DN lookup
Syntax: AuthLDAPInitialBindPattern regex substitution
Default: AuthLDAPInitialBindPattern (.*) $1 (remote username used verbatim)
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap
Compatibility: Available in version 2.3.6 and later

If AuthLDAPInitialBindAsUser is set to ON, the basic authentication username will be transformed according to the regular expression and substitution arguments.

The regular expression argument is compared against the current basic authentication username. The substitution argument may contain backreferences, but has no other variable interpolation.

This directive should only be used when your LDAP server doesn't accept anonymous searches and you cannot use a dedicated AuthLDAPBindDN.

    AuthLDAPInitialBindPattern (.+) $1@example.com

    AuthLDAPInitialBindPattern (.+) cn=$1,dc=example,dc=com

=⇒ Not available with authorization-only
This directive can only be used if this module authenticates the user, and has no effect when this module is used exclusively for authorization.

=⇒ debugging
The substituted DN is recorded in the environment variable LDAP_BINDASUSER. If the regular expression does not match the input, the verbatim username is used.
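The effect of the two example patterns can be tried out with the following Python sketch. It is an approximation under stated assumptions: Python's re syntax stands in for the regular expression engine used by the server, the function name initial_bind_name is hypothetical, and only the behaviour described above is modelled (backreferences of the form $N are expanded, and a non-matching username is used verbatim).

```python
# Illustrative sketch only: model the regex/substitution transformation
# described for AuthLDAPInitialBindPattern. Python's "re" module stands in
# for the server's regex engine, so edge cases may differ.
import re


def initial_bind_name(pattern: str, substitution: str, username: str) -> str:
    """Return the name to bind with for a given basic-auth username."""
    match = re.search(pattern, username)
    if match is None:
        # No match: the verbatim username is used.
        return username

    def expand(ref: "re.Match[str]") -> str:
        index = int(ref.group(1))
        if index > match.re.groups:
            return ""  # reference to a group the pattern does not define
        return match.group(index) or ""

    # Expand $0..$9 backreferences; nothing else is interpolated.
    return re.sub(r"\$(\d)", expand, substitution)


if __name__ == "__main__":
    print(initial_bind_name(r"(.+)", "$1@example.com", "jdoe"))
    print(initial_bind_name(r"(.+)", "cn=$1,dc=example,dc=com", "jdoe"))
```

With the first pattern the username jdoe becomes jdoe@example.com, and with the second it becomes cn=jdoe,dc=example,dc=com.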
See also

• AuthLDAPInitialBindAsUser
• AuthLDAPBindDN

AuthLDAPMaxSubGroupDepth Directive

Description: Specifies the maximum sub-group nesting depth that will be evaluated before the user search is discontinued.
Syntax: AuthLDAPMaxSubGroupDepth Number
Default: AuthLDAPMaxSubGroupDepth 0
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap
Compatibility: Available in version 2.3.0 and later, defaulted to 10 in 2.4.x and early 2.5

When this directive is set to a non-zero value X combined with use of the Require ldap-group someGroupDN directive, the provided user credentials will be searched for as a member of the someGroupDN directory object or of any group member of the current group up to the maximum nesting level X specified by this directive.

See the Require ldap-group section for a more detailed example.

=⇒ Nested groups performance
When AuthLDAPSubGroupAttribute overlaps with AuthLDAPGroupAttribute (as it does by default and as required by common LDAP schemas), uncached searching for subgroups in large groups can be very slow. If you use large, non-nested groups, keep AuthLDAPMaxSubGroupDepth set to zero.

AuthLDAPRemoteUserAttribute Directive

Description: Use the value of the attribute returned during the user query to set the REMOTE_USER environment variable
Syntax: AuthLDAPRemoteUserAttribute uid
Default: none
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap

If this directive is set, the value of the REMOTE_USER environment variable will be set to the value of the attribute specified. Make sure that this attribute is included in the list of attributes in the AuthLDAPUrl definition, otherwise this directive will have no effect. This directive, if present, takes precedence over AuthLDAPRemoteUserIsDN.
This directive is useful should you want people to log into a website using an email address, but a backend application expects the username as a userid.

This directive only has effect when this module is used for authentication.

AuthLDAPRemoteUserIsDN Directive

Description: Use the DN of the client username to set the REMOTE_USER environment variable
Syntax: AuthLDAPRemoteUserIsDN on|off
Default: AuthLDAPRemoteUserIsDN off
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap

If this directive is set to on, the value of the REMOTE_USER environment variable will be set to the full distinguished name of the authenticated user, rather than just the username that was passed by the client. It is turned off by default.

This directive only has effect when this module is used for authentication.

AuthLDAPSearchAsUser Directive

Description: Use the authenticated user's credentials to perform authorization searches
Syntax: AuthLDAPSearchAsUser on|off
Default: AuthLDAPSearchAsUser off
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap
Compatibility: Available in version 2.3.6 and later

When set, and mod_authnz_ldap has authenticated the user, LDAP searches for authorization use the queried distinguished name (DN) and HTTP basic authentication password of the authenticated user instead of the server's configured credentials.

The ldap-filter and ldap-dn authorization checks use searches.

This directive only has effect on the comparisons performed during nested group processing when AuthLDAPCompareAsUser is also enabled.

This directive should only be used when your LDAP server doesn't accept anonymous searches and you cannot use a dedicated AuthLDAPBindDN.
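
The following fragment is a minimal sketch of AuthLDAPSearchAsUser in context, assuming the user is also authenticated by mod_authnz_ldap. The path, host name, DN layout, and filter attributes are hypothetical placeholders rather than values taken from this manual.

```apache
# Hypothetical example: all names below are placeholders.
<Directory "/var/www/reports">
    AuthType Basic
    AuthName "Reports"
    AuthBasicProvider ldap
    AuthLDAPUrl "ldap://ldap.example.com/dc=example,dc=com?uid?sub"

    # No anonymous searches and no dedicated bind DN: bind as the user.
    AuthLDAPInitialBindAsUser on
    AuthLDAPInitialBindPattern (.+) uid=$1,ou=people,dc=example,dc=com

    # Run authorization searches, and the comparisons done during
    # nested group processing, with the authenticated user's credentials.
    AuthLDAPSearchAsUser on
    AuthLDAPCompareAsUser on

    # ldap-filter is evaluated with a search, so it is affected
    # by AuthLDAPSearchAsUser.
    Require ldap-filter "&(objectClass=person)(departmentNumber=42)"
</Directory>
```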
See also
• AuthLDAPInitialBindAsUser
• AuthLDAPCompareAsUser

AuthLDAPSubGroupAttribute Directive

Description: Specifies the attribute labels, one value per directive line, used to distinguish the members of the current group that are groups.
Syntax: AuthLDAPSubGroupAttribute attribute
Default: AuthLDAPSubGroupAttribute member uniquemember
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap
Compatibility: Available in version 2.3.0 and later

An LDAP group object may contain members that are users and members that are groups (called nested or sub groups). The AuthLDAPSubGroupAttribute directive identifies the labels of group members and the AuthLDAPGroupAttribute directive identifies the labels of the user members. Multiple attributes can be used by specifying this directive multiple times. If not specified, then mod_authnz_ldap uses the member and uniqueMember attributes.

AuthLDAPSubGroupClass Directive

Description: Specifies which LDAP objectClass values identify directory objects that are groups during sub-group processing.
Syntax: AuthLDAPSubGroupClass LdapObjectClass
Default: AuthLDAPSubGroupClass groupOfNames groupOfUniqueNames
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap
Compatibility: Available in version 2.3.0 and later

An LDAP group object may contain members that are users and members that are groups (called nested or sub groups). The AuthLDAPSubGroupAttribute directive identifies the labels of members that may be sub-groups of the current group (as opposed to user members). The AuthLDAPSubGroupClass directive specifies the LDAP objectClass values used in verifying that these potential sub-groups are in fact group objects. Verified sub-groups can then be searched for more user or sub-group members. Multiple objectClass values can be used by specifying this directive multiple times.
If not specified, then mod_authnz_ldap uses the groupOfNames and groupOfUniqueNames values.

AuthLDAPUrl Directive

Description: URL specifying the LDAP search parameters
Syntax: AuthLDAPUrl url [NONE|SSL|TLS|STARTTLS]
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authnz_ldap

An RFC 2255 URL which specifies the LDAP search parameters to use. The syntax of the URL is

    ldap://host:port/basedn?attribute?scope?filter

If you want to specify more than one LDAP URL that Apache should try in turn, the syntax is:

    AuthLDAPUrl "ldap://ldap1.example.com ldap2.example.com/dc=..."

Caveat: If you specify multiple servers, you need to enclose the entire URL string in quotes; otherwise you will get an error: "AuthLDAPURL takes one argument, URL to define LDAP connection.." You can of course use search parameters on each of these.

ldap
For regular ldap, use the string ldap. For secure LDAP, use ldaps instead. Secure LDAP is only available if Apache was linked to an LDAP library with SSL support.

host:port
The name/port of the ldap server (defaults to localhost:389 for ldap, and localhost:636 for ldaps). To specify multiple, redundant LDAP servers, just list all servers, separated by spaces. mod_authnz_ldap will try connecting to each server in turn, until it makes a successful connection. If multiple ldap servers are specified, then the entire LDAP URL must be enclosed in double quotes.
Once a connection has been made to a server, that connection remains active for the life of the httpd process, or until the LDAP server goes down.
If the LDAP server goes down and breaks an existing connection, mod_authnz_ldap will attempt to reconnect, starting with the primary server, and trying each redundant server in turn. Note that this is different than a true round-robin search.

basedn
The DN of the branch of the directory where all searches should start from.
At the very least, this must be the top of your directory tree, but could also specify a subtree in the directory.

attribute
The attribute to search for. Although RFC 2255 allows a comma-separated list of attributes, only the first attribute will be used, no matter how many are provided. If no attributes are provided, the default is to use uid. It's a good idea to choose an attribute that will be unique across all entries in the subtree you will be using. All attributes listed will be put into the environment with an AUTHENTICATE_ prefix for use by other modules.

scope
The scope of the search. Can be either one or sub. Note that a scope of base is also supported by RFC 2255, but is not supported by this module. If the scope is not provided, or if base scope is specified, the default is to use a scope of sub.

filter
A valid LDAP search filter. If not provided, defaults to (objectClass=*), which will search for all objects in the tree. Filters are limited to approximately 8000 characters (the definition of MAX_STRING_LEN in the Apache source code). This should be more than sufficient for any application. In 2.4.10 and later, the keyword none disables the use of a filter; this is required by some primitive LDAP servers.

When doing searches, the attribute, filter and username passed by the HTTP client are combined to create a search filter that looks like (&(filter)(attribute=username)).

For example, consider a URL of ldap://ldap.example.com/o=Example?cn?sub?(posixid=*). When a client attempts to connect using a username of Babs Jenson, the resulting search filter will be (&(posixid=*)(cn=Babs Jenson)).

An optional parameter can be added to allow the LDAP URL to override the connection type. This parameter can be one of the following:

NONE
Establish an unsecured connection on the default LDAP port. This is the same as ldap:// on port 389.

SSL
Establish a secure connection on the default secure LDAP port.
This is the same as ldaps://

TLS | STARTTLS
Establish an upgraded secure connection on the default LDAP port. This connection will be initiated on port 389 by default and then upgraded to a secure connection on the same port.

See above for examples of AuthLDAPUrl URLs.

10.21 Apache Module mod_authz_core

Description: Core Authorization
Status: Base
ModuleIdentifier: authz_core_module
SourceFile: mod_authz_core.c
Compatibility: Available in Apache HTTPD 2.3 and later

Summary

This module provides core authorization capabilities so that authenticated users can be allowed or denied access to portions of the web site. mod_authz_core provides the functionality to register various authorization providers. It is usually used in conjunction with an authentication provider module such as mod_authn_file and an authorization module such as mod_authz_user. It also allows for advanced logic to be applied to the authorization processing.

Directives
• AuthMerging
• <AuthzProviderAlias>
• AuthzSendForbiddenOnFailure
• Require
• <RequireAll>
• <RequireAny>
• <RequireNone>

Authorization Containers

The authorization container directives <RequireAll>, <RequireAny> and <RequireNone> may be combined with each other and with the Require directive to express complex authorization logic.

The example below expresses the following authorization logic. In order to access the resource, the user must either be the superadmin user, or belong to both the admins group and the Administrators LDAP group and either belong to the sales group or have the LDAP dept attribute sales. Furthermore, in order to access the resource, the user must not belong to either the temps group or the LDAP group Temporary Employees.

    <RequireAll>
        <RequireAny>
            Require user superadmin
            <RequireAll>
                Require group admins
                Require ldap-group "cn=Administrators,o=Airius"
                <RequireAny>
                    Require group sales
                    Require ldap-attribute dept="sales"
                </RequireAny>
            </RequireAll>
        </RequireAny>
        <RequireNone>
            Require group temps
            Require ldap-group "cn=Temporary Employees,o=Airius"
        </RequireNone>
    </RequireAll>

The Require Directives

mod_authz_core provides some generic authorization providers which can be used with the Require directive.

Require env

The env provider allows access to the server to be controlled based on the existence of an environment variable (p. 92). When Require env env-variable is specified, then the request is allowed access if the environment variable env-variable exists. The server provides the ability to set environment variables in a flexible way based on characteristics of the client request using the directives provided by mod_setenvif. Therefore, this directive can be used to allow access based on such factors as the client's User-Agent (browser type), Referer, or other HTTP request header fields.

    SetEnvIf User-Agent "^KnockKnock/2\.0" let_me_in
    Require env let_me_in

In this case, browsers with a user-agent string beginning with KnockKnock/2.0 will be allowed access, and all others will be denied.

When the server looks up a path via an internal subrequest such as looking for a DirectoryIndex or generating a directory listing with mod_autoindex, per-request environment variables are not inherited in the subrequest. Additionally, SetEnvIf directives are not separately evaluated in the subrequest due to the API phases mod_setenvif takes action in.

Require all

The all provider mimics the functionality that was previously provided by the 'Allow from all' and 'Deny from all' directives. This provider can take one of two arguments which are 'granted' or 'denied'. The following examples will grant or deny access to all requests.

    Require all granted

    Require all denied

Require method

The method provider allows using the HTTP method in authorization decisions. The GET and HEAD methods are treated as equivalent. The TRACE method is not available to this provider; use TraceEnable instead.
The following example will only allow GET, HEAD, POST, and OPTIONS requests:

    Require method GET POST OPTIONS

The following example will allow GET, HEAD, POST, and OPTIONS requests without authentication, and require a valid user for all other methods:

    Require method GET POST OPTIONS
    Require valid-user

Require expr

The expr provider allows basing authorization decisions on arbitrary expressions.

    Require expr %{TIME_HOUR} -ge 9 && %{TIME_HOUR} -le 17

    Require expr "!(%{QUERY_STRING} =~ /secret/)"
    Require expr "%{REQUEST_URI} in { '/example.cgi', '/other.cgi' }"

    Require expr "!(%{QUERY_STRING} =~ /secret/) && %{REQUEST_URI} in { '/example.cgi', '/other

The syntax is described in the ap_expr (p. 99) documentation.

Normally, the expression is evaluated before authentication. However, if the expression returns false and references the variable %{REMOTE_USER}, authentication will be performed and the expression will be re-evaluated.

Creating Authorization Provider Aliases

Extended authorization providers can be created within the configuration file and assigned an alias name. The alias providers can then be referenced through the Require directive in the same way as a base authorization provider. Besides the ability to create and alias an extended provider, it also allows the same extended authorization provider to be referenced by multiple locations.

Example

The example below creates two different ldap authorization provider aliases based on the ldap-group authorization provider. This example allows a single authorization location to check group membership within multiple ldap hosts:

    AuthLDAPBindDN "cn=youruser,o=ctx"
    AuthLDAPBindPassword yourpassword
    AuthLDAPURL "ldap://ldap.host/o=ctx"

    AuthLDAPBindDN "cn=yourotheruser,o=dev"
    AuthLDAPBindPassword yourotherpassword
    AuthLDAPURL "ldap://other.ldap.host/o=dev?cn"

    Alias "/secure" "/webpages/secure"
    Require all granted
    AuthBasicProvider file
    AuthType Basic
    AuthName LDAP_Protected_Place
    #implied OR operation
    Require ldap-group-alias1
    Require ldap-group-alias2

AuthMerging Directive

Description: Controls the manner in which each configuration section's authorization logic is combined with that of preceding configuration sections.
Syntax: AuthMerging Off | And | Or
Default: AuthMerging Off
Context: directory, .htaccess
Override: AuthConfig
Status: Base
Module: mod_authz_core

When authorization is enabled, it is normally inherited by each subsequent configuration section (p. 35), unless a different set of authorization directives is specified. This is the default action, which corresponds to an explicit setting of AuthMerging Off.

However, there may be circumstances in which it is desirable for a configuration section's authorization to be combined with that of its predecessor while configuration sections are being merged. Two options are available for this case, And and Or.

When a configuration section contains AuthMerging And or AuthMerging Or, its authorization logic is combined with that of the nearest predecessor (according to the overall order of configuration sections) which also contains authorization logic as if the two sections were jointly contained within a <RequireAll> or <RequireAny> directive, respectively.

=⇒The setting of AuthMerging is not inherited outside of the configuration section in which it appears. In the following example, only users belonging to group alpha may access /www/docs. Users belonging to either group alpha or group beta may access /www/docs/ab. However, the default Off setting of AuthMerging applies to the configuration section for /www/docs/ab/gamma, so that section's authorization directives override those of the preceding sections. Thus only users belonging to the group gamma may access /www/docs/ab/gamma.

    <Directory "/www/docs">
        AuthType Basic
        AuthName Documents
        AuthBasicProvider file
        AuthUserFile "/usr/local/apache/passwd/passwords"
        Require group alpha
    </Directory>

    <Directory "/www/docs/ab">
        AuthMerging Or
        Require group beta
    </Directory>

    <Directory "/www/docs/ab/gamma">
        Require group gamma
    </Directory>

AuthzProviderAlias Directive

Description: Enclose a group of directives that represent an extension of a base authorization provider and are referenced by the specified alias
Syntax: <AuthzProviderAlias baseProvider Alias Require-Parameters> ... </AuthzProviderAlias>
Context: server config
Status: Base
Module: mod_authz_core

<AuthzProviderAlias> and </AuthzProviderAlias> are used to enclose a group of authorization directives that can be referenced by the alias name using the Require directive.

AuthzSendForbiddenOnFailure Directive

Description: Send '403 FORBIDDEN' instead of '401 UNAUTHORIZED' if authentication succeeds but authorization fails
Syntax: AuthzSendForbiddenOnFailure On|Off
Default: AuthzSendForbiddenOnFailure Off
Context: directory, .htaccess
Status: Base
Module: mod_authz_core
Compatibility: Available in Apache HTTPD 2.3.11 and later

If authentication succeeds but authorization fails, Apache HTTPD will respond with an HTTP response code of '401 UNAUTHORIZED' by default. This usually causes browsers to display the password dialogue to the user again, which is not wanted in all situations. AuthzSendForbiddenOnFailure allows the response code to be changed to '403 FORBIDDEN'.

! Security Warning
Modifying the response in case of missing authorization weakens the security of the password, because it reveals to a possible attacker that the guessed password was right.

Require Directive

Description: Tests whether an authenticated user is authorized by an authorization provider.
Syntax: Require [not] entity-name [entity-name] ...
Context: directory, .htaccess
Override: AuthConfig
Status: Base
Module: mod_authz_core

This directive tests whether an authenticated user is authorized according to a particular authorization provider and the specified restrictions. mod_authz_core provides the following generic authorization providers:

Require all granted
Access is allowed unconditionally.

Require all denied
Access is denied unconditionally.
Require env env-var [env-var] ...
Access is allowed only if one of the given environment variables is set.

Require method http-method [http-method] ...
Access is allowed only for the given HTTP methods.

Require expr expression
Access is allowed if expression evaluates to true.

Some of the allowed syntaxes provided by mod_authz_user, mod_authz_host, and mod_authz_groupfile are:

Require user userid [userid] ...
Only the named users can access the resource.

Require group group-name [group-name] ...
Only users in the named groups can access the resource.

Require valid-user
All valid users can access the resource.

Require ip 10 172.20 192.168.2
Clients in the specified IP address ranges can access the resource.

Other authorization modules that implement require options include mod_authnz_ldap, mod_authz_dbm, mod_authz_dbd, mod_authz_owner and mod_ssl.

In most cases, for a complete authentication and authorization configuration, Require must be accompanied by AuthName, AuthType and AuthBasicProvider or AuthDigestProvider directives, and directives such as AuthUserFile and AuthGroupFile (to define users and groups) in order to work correctly. Example:

    AuthType Basic
    AuthName "Restricted Resource"
    AuthBasicProvider file
    AuthUserFile "/web/users"
    AuthGroupFile "/web/groups"
    Require group admin

Access controls which are applied in this way are effective for all methods. This is what is normally desired. If you wish to apply access controls only to specific methods, while leaving other methods unprotected, then place the Require statement into a <Limit> section.

The result of the Require directive may be negated through the use of the not option. As with the other negated authorization directive <RequireNone>, when the Require directive is negated it can only fail or return a neutral result, and therefore may never independently authorize a request.
In the following example, all users in the alpha and beta groups are authorized, except for those who are also in the reject group.

    <RequireAll>
        Require group alpha beta
        Require not group reject
    </RequireAll>

When multiple Require directives are used in a single configuration section (p. 35) and are not contained in another authorization directive like <RequireAll>, they are implicitly contained within a <RequireAny> directive. Thus the first one to authorize a user authorizes the entire request, and subsequent Require directives are ignored.

! Security Warning
Exercise caution when setting authorization directives in <Location> sections that overlap with content served out of the filesystem. By default, these configuration sections (p. 35) overwrite authorization configuration in <Directory> and <Files> sections.
The AuthMerging directive can be used to control how authorization configuration sections are merged.

See also
• Access Control howto (p. 234)
• Authorization Containers
• mod_authn_core
• mod_authz_host

RequireAll Directive

Description: Enclose a group of authorization directives of which none must fail and at least one must succeed for the enclosing directive to succeed.
Syntax: <RequireAll> ... </RequireAll>
Context: directory, .htaccess
Override: AuthConfig
Status: Base
Module: mod_authz_core

<RequireAll> and </RequireAll> are used to enclose a group of authorization directives of which none must fail and at least one must succeed in order for the <RequireAll> directive to succeed.

If none of the directives contained within the <RequireAll> directive fails, and at least one succeeds, then the <RequireAll> directive succeeds. If none succeed and none fail, then it returns a neutral result. In all other cases, it fails.

See also
• Authorization Containers
• Authentication, Authorization, and Access Control (p. 227)

RequireAny Directive

Description: Enclose a group of authorization directives of which one must succeed for the enclosing directive to succeed.
Syntax: <RequireAny> ... </RequireAny>
Context: directory, .htaccess
Override: AuthConfig
Status: Base
Module: mod_authz_core

<RequireAny> and </RequireAny> are used to enclose a group of authorization directives of which one must succeed in order for the <RequireAny> directive to succeed.

If one or more of the directives contained within the <RequireAny> directive succeed, then the <RequireAny> directive succeeds. If none succeed and none fail, then it returns a neutral result. In all other cases, it fails.

=⇒Because negated authorization directives are unable to return a successful result, they cannot significantly influence the result of a <RequireAny> directive. (At most they could cause the directive to fail in the case where they failed and all other directives returned a neutral value.) Therefore negated authorization directives are not permitted within a <RequireAny> directive.

See also
• Authorization Containers
• Authentication, Authorization, and Access Control (p. 227)

RequireNone Directive

Description: Enclose a group of authorization directives of which none must succeed for the enclosing directive to not fail.
Syntax: <RequireNone> ... </RequireNone>
Context: directory, .htaccess
Override: AuthConfig
Status: Base
Module: mod_authz_core

<RequireNone> and </RequireNone> are used to enclose a group of authorization directives of which none must succeed in order for the <RequireNone> directive to not fail.

If one or more of the directives contained within the <RequireNone> directive succeed, then the <RequireNone> directive fails. In all other cases, it returns a neutral result. Thus as with the other negated authorization directive Require not, it can never independently authorize a request because it can never return a successful result. It can be used, however, to restrict the set of users who are authorized to access a resource.

=⇒Because negated authorization directives are unable to return a successful result, they cannot significantly influence the result of a <RequireNone> directive. Therefore negated authorization directives are not permitted within a <RequireNone> directive.

See also
• Authorization Containers
• Authentication, Authorization, and Access Control (p. 227)
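
Because a RequireNone container can only fail or stay neutral, it is normally nested inside a RequireAll container next to at least one positive check. The fragment below is a small sketch of that pattern; the directory path, file locations, group names, and address range are assumptions made up for illustration.

```apache
# Hypothetical example: paths, groups and the IP range are placeholders.
<Directory "/var/www/intranet">
    AuthType Basic
    AuthName "Intranet"
    AuthBasicProvider file
    AuthUserFile "/etc/httpd/auth/users"
    AuthGroupFile "/etc/httpd/auth/groups"

    <RequireAll>
        # Positive check: this is what can actually grant access.
        Require group staff

        # Restriction: if either line matches, the whole RequireAll fails.
        <RequireNone>
            Require group contractors
            Require ip 192.0.2
        </RequireNone>
    </RequireAll>
</Directory>
```

Here a member of staff is admitted unless they are also in contractors or connect from the listed range; without the Require group staff line nothing inside the container could succeed, so the RequireAll would never authorize a request.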
10.22 Apache Module mod_authz_dbd

Description: Group Authorization and Login using SQL
Status: Extension
ModuleIdentifier: authz_dbd_module
SourceFile: mod_authz_dbd.c
Compatibility: Available in Apache 2.4 and later

Summary

This module provides authorization capabilities so that authenticated users can be allowed or denied access to portions of the web site by group membership. Similar functionality is provided by mod_authz_groupfile and mod_authz_dbm, with the exception that this module queries a SQL database to determine whether a user is a member of a group.

This module can also provide database-backed user login/logout capabilities. These are likely to be of most value when used in conjunction with mod_authn_dbd.

This module relies on mod_dbd to specify the backend database driver and connection parameters, and manage the database connections.

Directives
• AuthzDBDLoginToReferer
• AuthzDBDQuery
• AuthzDBDRedirectQuery

See also
• Require
• AuthDBDUserPWQuery
• DBDriver
• DBDParams

The Require Directives

Apache's Require directives are used during the authorization phase to ensure that a user is allowed to access a resource. mod_authz_dbd extends the authorization types with dbd-group, dbd-login and dbd-logout.

Since v2.4.8, expressions (p. 99) are supported within the DBD require directives.

Require dbd-group

This directive specifies group membership that is required for the user to gain access.

    Require dbd-group team
    AuthzDBDQuery "SELECT group FROM authz WHERE user = %s"

Require dbd-login

This directive specifies a query to be run indicating the user has logged in.

    Require dbd-login
    AuthzDBDQuery "UPDATE authn SET login = 'true' WHERE user = %s"

Require dbd-logout

This directive specifies a query to be run indicating the user has logged out.
    Require dbd-logout
    AuthzDBDQuery "UPDATE authn SET login = 'false' WHERE user = %s"

Database Login

In addition to the standard authorization function of checking group membership, this module can also provide server-side user session management via database-backed login/logout capabilities.

Specifically, it can update a user's session status in the database whenever the user visits designated URLs (subject of course to users supplying the necessary credentials).

This works by defining two special Require types: Require dbd-login and Require dbd-logout. For usage details, see the configuration example below.

Client Login integration

Some administrators may wish to implement client-side session management that works in concert with the server-side login/logout capabilities offered by this module, for example, by setting or unsetting an HTTP cookie or other such token when a user logs in or out.

To support such integration, mod_authz_dbd exports an optional hook that will be run whenever a user's status is updated in the database. Other session management modules can then use the hook to implement functions that start and end client-side sessions.

Configuration example

    # mod_dbd configuration
    DBDriver pgsql
    DBDParams "dbname=apacheauth user=apache pass=xxxxxx"

    DBDMin 4
    DBDKeep 8
    DBDMax 20
    DBDExptime 300

    # mod_authn_core and mod_auth_basic configuration
    # for mod_authn_dbd
    AuthType Basic
    AuthName Team
    AuthBasicProvider dbd

    # mod_authn_dbd SQL query to authenticate a logged-in user
    AuthDBDUserPWQuery \
        "SELECT password FROM authn WHERE user = %s AND login = 'true'"

    # mod_authz_core configuration for mod_authz_dbd
    Require dbd-group team

    # mod_authz_dbd configuration
    AuthzDBDQuery "SELECT group FROM authz WHERE user = %s"

    # when a user fails to be authenticated or authorized,
    # invite them to login; this page should provide a link
    # to /team-private/login.html
    ErrorDocument 401 /login-info.html

    # don't require user to already be logged in!
    AuthDBDUserPWQuery "SELECT password FROM authn WHERE user = %s"

    # dbd-login action executes a statement to log user in
    Require dbd-login
    AuthzDBDQuery "UPDATE authn SET login = 'true' WHERE user = %s"

    # return user to referring page (if any) after
    # successful login
    AuthzDBDLoginToReferer On

    # dbd-logout action executes a statement to log user out
    Require dbd-logout
    AuthzDBDQuery "UPDATE authn SET login = 'false' WHERE user = %s"

Preventing SQL injections

Whether you need to care about SQL security depends on what DBD driver and backend you use. With most drivers you don't have to do anything: the statement is prepared by the database at startup, and user input is used only as data. But you may need to untaint your input. At the time of writing, the only driver that requires you to take care is FreeTDS.

Please read the mod_dbd documentation for more information about security in this area.

AuthzDBDLoginToReferer Directive

Description: Determines whether to redirect the Client to the Referring page on successful login or logout if a Referer request header is present
Syntax: AuthzDBDLoginToReferer On|Off
Default: AuthzDBDLoginToReferer Off
Context: directory
Status: Extension
Module: mod_authz_dbd

In conjunction with Require dbd-login or Require dbd-logout, this provides the option to redirect the client back to the Referring page (the URL in the Referer HTTP request header, if present).
When there is no Referer header, AuthzDBDLoginToReferer On will be ignored.

AuthzDBDQuery Directive

Description: Specify the SQL Query for the required operation
Syntax: AuthzDBDQuery query
Context: directory
Status: Extension
Module: mod_authz_dbd

The AuthzDBDQuery directive specifies an SQL query to run. The purpose of the query depends on the Require directive in effect.

• When used with a Require dbd-group directive, it specifies a query to look up groups for the current user. This is the standard functionality of other authorization modules such as mod_authz_groupfile and mod_authz_dbm. The first column value of each row returned by the query statement should be a string containing a group name. Zero, one, or more rows may be returned.

    Require dbd-group
    AuthzDBDQuery "SELECT group FROM groups WHERE user = %s"

• When used with a Require dbd-login or Require dbd-logout directive, it will never deny access, but will instead execute a SQL statement designed to log the user in or out. The user must already be authenticated with mod_authn_dbd.

    Require dbd-login
    AuthzDBDQuery "UPDATE authn SET login = 'true' WHERE user = %s"

In all cases, the user's ID will be passed as a single string parameter when the SQL query is executed. It may be referenced within the query statement using a %s format specifier.

AuthzDBDRedirectQuery Directive

Description: Specify a query to look up a login page for the user
Syntax: AuthzDBDRedirectQuery query
Context: directory
Status: Extension
Module: mod_authz_dbd

Specifies an optional SQL query to use after successful login (or logout) to redirect the user to a URL, which may be specific to the user.

The user's ID will be passed as a single string parameter when the SQL query is executed. It may be referenced within the query statement using a %s format specifier.

    AuthzDBDRedirectQuery "SELECT userpage FROM userpages WHERE user = %s"

The first column value of the first row returned by the query statement should be a string containing a URL to which to redirect the client. Subsequent rows will be ignored. If no rows are returned, the client will not be redirected.

Note that AuthzDBDLoginToReferer takes precedence if both are set.

10.23 Apache Module mod_authz_dbm

Description: Group authorization using DBM files
Status: Extension
ModuleIdentifier: authz_dbm_module
SourceFile: mod_authz_dbm.c

Summary

This module provides authorization capabilities so that authenticated users can be allowed or denied access to portions of the web site by group membership. Similar functionality is provided by mod_authz_groupfile.

Directives
• AuthDBMGroupFile
• AuthzDBMType

See also
• Require

The Require Directives

Apache's Require directives are used during the authorization phase to ensure that a user is allowed to access a resource. mod_authz_dbm extends the authorization types with dbm-group.

Since v2.4.8, expressions (p. 99) are supported within the DBM require directives.

Require dbm-group

This directive specifies group membership that is required for the user to gain access.

    Require dbm-group admin

Require dbm-file-group

When this directive is specified, the user must be a member of the group assigned to the file being accessed.

    Require dbm-file-group

Example usage

Note that using mod_authz_dbm requires you to require dbm-group instead of group:

    AuthType Basic
    AuthName "Secure Area"
    AuthBasicProvider dbm
    AuthDBMUserFile "site/data/users"
    AuthDBMGroupFile "site/data/users"
    Require dbm-group admin

AuthDBMGroupFile Directive

Description: Sets the name of the database file containing the list of user groups for authorization
Syntax: AuthDBMGroupFile file-path
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_authz_dbm

The AuthDBMGroupFile directive sets the name of a DBM file containing the list of user groups for user authorization. File-path is the absolute path to the group file.

The group file is keyed on the username. The value for a user is a comma-separated list of the groups to which the user belongs. There must be no whitespace within the value, and it must never contain any colons.

! Security
Make sure that the AuthDBMGroupFile is stored outside the document tree of the web server. Do not put it in the directory that it protects. Otherwise, clients will be able to download the AuthDBMGroupFile unless otherwise protected.

Combining Group and Password DBM files: In some cases it is easier to manage a single database which contains both the password and group details for each user. This simplifies any support programs that need to be written: they now only have to deal with writing to and locking a single DBM file. This can be accomplished by first setting the group and password files to point to the same DBM:

    AuthDBMGroupFile "/www/userbase"
    AuthDBMUserFile "/www/userbase"

The key for the single DBM is the username. The value consists of

    Encrypted Password : List of Groups [ : (ignored) ]

The password section contains the encrypted password as before. This is followed by a colon and the comma-separated list of groups. Other data may optionally be left in the DBM file after another colon; it is ignored by the authorization module. This is what www.telescope.org uses for its combined password and group database.
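
Put together, a combined password-and-group DBM might be wired up roughly as follows. This is only a sketch: the /www/userbase file comes from the text above, while the directory path, realm name, and group name are assumptions added for illustration.

```apache
# Hypothetical protected area; only "/www/userbase" is taken from the text above.
<Directory "/var/www/members">
    AuthType Basic
    AuthName "Members"
    AuthBasicProvider dbm

    # One DBM file serves both modules:
    #   key   = username
    #   value = encrypted-password:group1,group2
    AuthDBMUserFile  "/www/userbase"
    AuthDBMGroupFile "/www/userbase"

    # mod_authz_dbm is selected with dbm-group, not group.
    Require dbm-group admin
</Directory>
```

Keeping the DBM file outside the document tree, as the security note above recommends, applies to the combined file as well.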
AuthzDBMType Directive

Description: Sets the type of database file that is used to store list of user groups
Syntax:      AuthzDBMType default|SDBM|GDBM|NDBM|DB
Default:     AuthzDBMType default
Context:     directory, .htaccess
Override:    AuthConfig
Status:      Extension
Module:      mod_authz_dbm

Sets the type of database file that is used to store the list of user groups. The default database type is determined at compile time. The availability of other types of database files also depends on compile-time settings.

It is crucial that whatever program you use to create your group files is configured to use the same type of database.

10.24 Apache Module mod_authz_groupfile

Description:      Group authorization using plaintext files
Status:           Base
ModuleIdentifier: authz_groupfile_module
SourceFile:       mod_authz_groupfile.c

Summary

This module provides authorization capabilities so that authenticated users can be allowed or denied access to portions of the web site by group membership. Similar functionality is provided by mod_authz_dbm.

Directives
• AuthGroupFile

See also
• Require

The Require Directives

Apache’s Require directives are used during the authorization phase to ensure that a user is allowed to access a resource. mod_authz_groupfile extends the authorization types with group and group-file.

Since v2.4.8, expressions are supported within the groupfile require directives.

Require group

This directive specifies group membership that is required for the user to gain access.

    Require group admin

Require file-group

When this directive is specified, the filesystem permissions on the file being accessed are consulted. The user must be a member of a group with the same name as the group that owns the file. See mod_authz_owner for more details.
    Require file-group

AuthGroupFile Directive

Description: Sets the name of a text file containing the list of user groups for authorization
Syntax:      AuthGroupFile file-path
Context:     directory, .htaccess
Override:    AuthConfig
Status:      Base
Module:      mod_authz_groupfile

The AuthGroupFile directive sets the name of a textual file containing the list of user groups for user authorization. File-path is the path to the group file. If it is not absolute, it is treated as relative to the ServerRoot.

Each line of the group file contains a groupname followed by a colon, followed by the member usernames separated by spaces.

Example:

    mygroup: bob joe anne

Note that searching large text files is very inefficient; AuthDBMGroupFile provides much better performance.

Security: Make sure that the AuthGroupFile is stored outside the document tree of the web server; do not put it in the directory that it protects. Otherwise, clients may be able to download the AuthGroupFile.

10.25 Apache Module mod_authz_host

Description:      Group authorizations based on host (name or IP address)
Status:           Base
ModuleIdentifier: authz_host_module
SourceFile:       mod_authz_host.c
Compatibility:    Available in Apache 2.3 and later

Summary

The authorization providers implemented by mod_authz_host are registered using the Require directive. The directive can be referenced within a <Directory>, <Files>, or <Location> section as well as .htaccess files to control access to particular parts of the server. Access can be controlled based on the client hostname or IP address.

In general, access restriction directives apply to all access methods (GET, PUT, POST, etc). This is the desired behavior in most cases. However, it is possible to restrict some methods, while leaving other methods unrestricted, by enclosing the directives in a <Limit> section.

Directives

This module provides no directives.

See also
• Authentication, Authorization, and Access Control
• Require

The Require Directives

Apache’s Require directive is used during the authorization phase to ensure that a user is allowed or denied access to a resource. mod_authz_host extends the authorization types with ip, host, forward-dns and local. Other authorization types may also be used but may require that additional authorization modules be loaded.

These authorization providers affect which hosts can access an area of the server. Access can be controlled by hostname, IP address, or IP address range.

Since v2.4.8, expressions are supported within the host require directives.

Require ip

The ip provider allows access to the server to be controlled based on the IP address of the remote client. When Require ip ip-address is specified, then the request is allowed access if the IP address matches.

A full IP address:

    Require ip 10.1.2.3
    Require ip 192.168.1.104 192.168.1.205

An IP address of a host allowed access.

A partial IP address:

    Require ip 10.1
    Require ip 10 172.20 192.168.2

The first 1 to 3 bytes of an IP address, for subnet restriction.

A network/netmask pair:

    Require ip 10.1.0.0/255.255.0.0

A network a.b.c.d, and a netmask w.x.y.z. For more fine-grained subnet restriction.

A network/nnn CIDR specification:

    Require ip 10.1.0.0/16

Similar to the previous case, except the netmask consists of nnn high-order 1 bits.

Note that Require ip 10.1, Require ip 10.1.0.0/255.255.0.0 and Require ip 10.1.0.0/16 match exactly the same set of hosts.

IPv6 addresses and IPv6 subnets can be specified as shown below:

    Require ip 2001:db8::a00:20ff:fea7:ccea
    Require ip 2001:db8:1:1::a
    Require ip 2001:db8:2:1::/64
    Require ip 2001:db8:3::/48

Note: As the IP addresses are parsed on startup, expressions are not evaluated at request time.

Require host

The host provider allows access to the server to be controlled based on the host name of the remote client. When Require host host-name is specified, then the request is allowed access if the host name matches.
A (partial) domain-name:

    Require host example.org
    Require host .net example.edu

Hosts whose names match, or end in, this string are allowed access. Only complete components are matched, so the above example will match foo.example.org but it will not match fooexample.org. This configuration will cause Apache to perform a double reverse DNS lookup on the client IP address, regardless of the setting of the HostnameLookups directive. It will do a reverse DNS lookup on the IP address to find the associated hostname, and then do a forward lookup on the hostname to ensure that it matches the original IP address. Only if the forward and reverse DNS are consistent and the hostname matches will access be allowed.

Require forward-dns

The forward-dns provider allows access to the server to be controlled based on simple host names. When Require forward-dns host-name is specified, all IP addresses corresponding to host-name are allowed access.

In contrast to the host provider, this provider does not rely on reverse DNS lookups: it simply queries the DNS for the host name and allows a client if its IP matches. As a consequence, it will only work with host names, not domain names. However, as the reverse DNS is not used, it will work with clients which use a dynamic DNS service.

    Require forward-dns bla.example.org

A client whose IP address is resolved from the name bla.example.org will be granted access.
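The ip, host and forward-dns providers can be combined in one section. The following is a minimal sketch, assuming a hypothetical directory /var/www/internal and the placeholder host names already used in this section; the <RequireAny> container (from mod_authz_core) states explicitly that a match on any one line is sufficient.

```apache
# Hypothetical directory; the names below are placeholders
<Directory "/var/www/internal">
    <RequireAny>
        # CIDR form: every address from 10.1.0.0 through 10.1.255.255
        Require ip 10.1.0.0/16

        # Triggers a double reverse DNS lookup;
        # matches foo.example.org but not fooexample.org
        Require host example.org

        # Forward lookup only, so it also suits a client
        # that sits behind a dynamic DNS name
        Require forward-dns bla.example.org
    </RequireAny>
</Directory>
```

Where DNS latency or spoofed reverse records are a concern, the ip form is the one that involves no lookup at request time.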
Require local

The local provider allows access to the server if any of the following conditions is true:
• the client address matches 127.0.0.0/8
• the client address is ::1
• both the client and the server address of the connection are the same

This allows a convenient way to match connections that originate from the local host:

    Require local

Security Note: If you are proxying content to your server, you need to be aware that the client address will be the address of your proxy server, not the address of the client, and so using the Require directive in this context may not do what you mean. See mod_remoteip for one possible solution to this problem.

10.26 Apache Module mod_authz_owner

Description:      Authorization based on file ownership
Status:           Extension
ModuleIdentifier: authz_owner_module
SourceFile:       mod_authz_owner.c

Summary

This module authorizes access to files by comparing the userid used for HTTP authentication (the web userid) with the file-system owner or group of the requested file. The supplied username and password must be already properly verified by an authentication module, such as mod_auth_basic or mod_auth_digest. mod_authz_owner recognizes two arguments for the Require directive, file-owner and file-group, as follows:

file-owner
The supplied web-username must match the system’s name for the owner of the file being requested. That is, if the operating system says the requested file is owned by jones, then the username used to access it through the web must be jones as well.

file-group
The name of the system group that owns the file must be present in a group database, which is provided, for example, by mod_authz_groupfile or mod_authz_dbm, and the web-username must be a member of that group.
For example, if the operating system says the requested file is owned by (system) group accounts, the group accounts must appear in the group database and the web-username used in the request must be a member of that group.

Note: If mod_authz_owner is used in order to authorize a resource that is not actually present in the filesystem (i.e. a virtual resource), it will deny the access. In particular, it will never authorize content-negotiated "MultiViews" resources.

Directives

This module provides no directives.

See also
• Require

Configuration Examples

Require file-owner

Consider a multi-user system running the Apache Web server, with each user having his or her own files in ~/public_html/private. Assuming that there is a single AuthDBMUserFile database that lists all of their web-usernames, and that these usernames match the system’s usernames that actually own the files on the server, then the following stanza would allow each user access only to his or her own files. User jones would not be allowed to access files in /home/smith/public_html/private unless they were owned by jones instead of smith.

    AuthType Basic
    AuthName "MyPrivateFiles"
    AuthBasicProvider dbm
    AuthDBMUserFile "/usr/local/apache2/etc/.htdbm-all"
    Require file-owner

Require file-group

Consider a system similar to the one described above, but with some users that share their project files in ~/public_html/project-foo. The files are owned by the system group foo and there is a single AuthDBMGroupFile database that contains all of the web-usernames and their group membership, i.e. each of them must be a member of at least a group named foo. So if jones and smith are both members of the group foo, then both will be authorized to access the project-foo directories of each other.
    AuthType Basic
    AuthName "Project Foo Files"
    AuthBasicProvider dbm

    # combined user/group database
    AuthDBMUserFile "/usr/local/apache2/etc/.htdbm-all"
    AuthDBMGroupFile "/usr/local/apache2/etc/.htdbm-all"

    Satisfy All
    Require file-group

10.27 Apache Module mod_authz_user

Description:      User Authorization
Status:           Base
ModuleIdentifier: authz_user_module
SourceFile:       mod_authz_user.c

Summary

This module provides authorization capabilities so that authenticated users can be allowed or denied access to portions of the web site. mod_authz_user grants access if the authenticated user is listed in a Require user directive. Alternatively Require valid-user can be used to grant access to all successfully authenticated users.

Directives

This module provides no directives.

See also
• Require

The Require Directives

Apache’s Require directives are used during the authorization phase to ensure that a user is allowed to access a resource. mod_authz_user extends the authorization types with user and valid-user.

Since v2.4.8, expressions are supported within the user require directives.

Require user

This directive specifies a list of users that are allowed to gain access.

    Require user john paul george ringo

Require valid-user

When this directive is specified, any successfully authenticated user will be allowed to gain access.

    Require valid-user

10.28 Apache Module mod_autoindex

Description:      Generates directory indexes, automatically, similar to the Unix ls command or the Win32 dir shell command
Status:           Base
ModuleIdentifier: autoindex_module
SourceFile:       mod_autoindex.c

Summary

The index of a directory can come from one of two sources:
• A file located in that directory, typically called index.html. The DirectoryIndex directive sets the name of the file or files to be used. This is controlled by mod_dir.
• Otherwise, a listing generated by the server. The other directives control the format of this listing.
  The AddIcon, AddIconByEncoding and AddIconByType directives are used to set a list of icons to display for various file types; for each file listed, the first icon listed that matches the file is displayed. These are controlled by mod_autoindex.

The two functions are separated so that you can completely remove (or replace) automatic index generation should you want to.

Automatic index generation is enabled using Options +Indexes. See the Options directive for more details.

If the FancyIndexing option is given with the IndexOptions directive, the column headers are links that control the order of the display. If you select a header link, the listing will be regenerated, sorted by the values in that column. Selecting the same header repeatedly toggles between ascending and descending order. These column header links are suppressed with the IndexOptions directive’s SuppressColumnSorting option.

Note that when the display is sorted by "Size", it is the actual size of the files that is used, not the displayed value, so a 1010-byte file will always be displayed before a 1011-byte file (if in ascending order) even though they both are shown as "1K".

Directives
• AddAlt
• AddAltByEncoding
• AddAltByType
• AddDescription
• AddIcon
• AddIconByEncoding
• AddIconByType
• DefaultIcon
• HeaderName
• IndexHeadInsert
• IndexIgnore
• IndexIgnoreReset
• IndexOptions
• IndexOrderDefault
• IndexStyleSheet
• ReadmeName

Autoindex Request Query Arguments

Various query string arguments are available to give the client some control over the ordering of the directory listing, as well as what files are listed. If you do not wish to give the client this control, the IndexOptions IgnoreClient option disables that functionality.

The column sorting headers themselves are self-referencing hyperlinks that add the sort query options shown below. Any option below may be added to any request for the directory resource.
• C=N sorts the directory by file name
• C=M sorts the directory by last-modified date, then file name
• C=S sorts the directory by size, then file name
• C=D sorts the directory by description, then file name
• O=A sorts the listing in Ascending Order
• O=D sorts the listing in Descending Order
• F=0 formats the listing as a simple list (not FancyIndexed)
• F=1 formats the listing as a FancyIndexed list
• F=2 formats the listing as an HTMLTable FancyIndexed list
• V=0 disables version sorting
• V=1 enables version sorting
• P=pattern lists only files matching the given pattern

Note that the ’P’attern query argument is tested after the usual IndexIgnore directives are processed, and all file names are still subjected to the same criteria as any other autoindex listing. The Query Arguments parser in mod_autoindex will stop abruptly when an unrecognized option is encountered. The Query Arguments must be well formed, according to the list above.

The simple example below, which can be clipped and saved in a header.html file, illustrates these query options. Note that the unknown "X" argument, for the submit button, is listed last to ensure the arguments are all parsed before mod_autoindex encounters the X=Go input.
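[Form example: markup not preserved. Surviving field labels: "Show me a", "Sorted by", "Matching"; the submit button sends X=Go.]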
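The query arguments listed above can be tried out with a configuration along the following lines. This is only a sketch: the directory paths are assumptions, and header.html stands for the file holding the form described above.

```apache
# Hypothetical download area with client-controlled sorting
<Directory "/var/www/downloads">
    Options +Indexes
    IndexOptions FancyIndexing VersionSort
    # Inserted at the top of every listing in this directory
    HeaderName header.html
</Directory>

# Hypothetical subtree where the listing order must not be changed
<Directory "/var/www/downloads/fixed">
    # C=, O=, F=, V= and P= from the client are ignored here
    IndexOptions +IgnoreClient
</Directory>
```

Under these assumptions, a request for /downloads/?C=S;O=D asks for the listing sorted by size in descending order, while the same query string sent to /downloads/fixed/ has no effect.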
AddAlt Directive

Description: Alternate text to display for a file, instead of an icon selected by filename
Syntax:      AddAlt string file [file] ...
Context:     server config, virtual host, directory, .htaccess
Override:    Indexes
Status:      Base
Module:      mod_autoindex

AddAlt provides the alternate text to display for a file, instead of an icon, for FancyIndexing. File is a file extension, partial filename, wild-card expression or full filename for files to describe. If String contains any whitespace, you have to enclose it in quotes (" or ’). This alternate text is displayed if the client is image-incapable, has image loading disabled, or fails to retrieve the icon.

    AddAlt "PDF file" *.pdf
    AddAlt Compressed *.gz *.zip *.Z

AddAltByEncoding Directive

Description: Alternate text to display for a file instead of an icon selected by MIME-encoding
Syntax:      AddAltByEncoding string MIME-encoding [MIME-encoding] ...
Context:     server config, virtual host, directory, .htaccess
Override:    Indexes
Status:      Base
Module:      mod_autoindex

AddAltByEncoding provides the alternate text to display for a file, instead of an icon, for FancyIndexing. MIME-encoding is a valid content-encoding, such as x-compress. If String contains any whitespace, you have to enclose it in quotes (" or ’). This alternate text is displayed if the client is image-incapable, has image loading disabled, or fails to retrieve the icon.

    AddAltByEncoding gzip x-gzip

AddAltByType Directive

Description: Alternate text to display for a file, instead of an icon selected by MIME content-type
Syntax:      AddAltByType string MIME-type [MIME-type] ...
Context:     server config, virtual host, directory, .htaccess
Override:    Indexes
Status:      Base
Module:      mod_autoindex

AddAltByType sets the alternate text to display for a file, instead of an icon, for FancyIndexing. MIME-type is a valid content-type, such as text/html. If String contains any whitespace, you have to enclose it in quotes (" or ’).
This alternate text is displayed if the client is image-incapable, has image loading disabled, or fails to retrieve the icon.

    AddAltByType ’plain text’ text/plain

AddDescription Directive

Description: Description to display for a file
Syntax:      AddDescription string file [file] ...
Context:     server config, virtual host, directory, .htaccess
Override:    Indexes
Status:      Base
Module:      mod_autoindex

This sets the description to display for a file, for FancyIndexing. File is a file extension, partial filename, wild-card expression or full filename for files to describe. String is enclosed in double quotes (").

    AddDescription "The planet Mars" mars.gif
    AddDescription "My friend Marshall" friends/mars.gif

The typical, default description field is 23 bytes wide. 6 more bytes are added by the IndexOptions SuppressIcon option, 7 bytes are added by the IndexOptions SuppressSize option, and 19 bytes are added by the IndexOptions SuppressLastModified option. Therefore, the widest default the description column is ever assigned is 55 bytes.

Since the File argument may be a partial file name, please remember that a too-short partial filename may match unintended files. For example, le.html will match the file le.html but will also match the file example.html. In the event that there may be ambiguity, use as complete a filename as you can, but keep in mind that the first match encountered will be used, and order your list of AddDescription directives accordingly.

See the DescriptionWidth IndexOptions keyword for details on overriding the size of this column, or allowing descriptions of unlimited length.

Caution: Descriptive text defined with AddDescription may contain HTML markup, such as tags and character entities. If the width of the description column should happen to truncate a tagged element (such as cutting off the end of a bolded phrase), the results may affect the rest of the directory listing.
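The first-match rule and the truncation caution for AddDescription can be illustrated with a short sketch. The file names and description strings here are hypothetical; only the directive and option names come from this section.

```apache
# The first matching directive wins, so the more specific name comes first
AddDescription "Release notes" release-notes.html
AddDescription "HTML document" .html

# Let the column grow to fit the longest description,
# so that markup inside a description is never cut off
IndexOptions FancyIndexing DescriptionWidth=*
```

If the two AddDescription lines were swapped, release-notes.html would be described as "HTML document", because .html also matches it and would be encountered first.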
Arguments with path information: Absolute paths are not currently supported and do not match anything at runtime. Arguments with relative path information, which would normally only be used in htaccess context, are implicitly prefixed with ’*/’ to avoid matching partial directory names.

AddIcon Directive

Description: Icon to display for a file selected by name
Syntax:      AddIcon icon name [name] ...
Context:     server config, virtual host, directory, .htaccess
Override:    Indexes
Status:      Base
Module:      mod_autoindex

This sets the icon to display next to a file ending in name for FancyIndexing. Icon is either a (%-escaped) relative URL to the icon, a fully qualified remote URL, or of the format (alttext,url) where alttext is the text tag given for an icon for non-graphical browsers.

Name is either ^^DIRECTORY^^ for directories, ^^BLANKICON^^ for blank lines (to format the list correctly), a file extension, a wildcard expression, a partial filename or a complete filename.

^^BLANKICON^^ is only used for formatting, and so is unnecessary if you’re using IndexOptions HTMLTable.

    #Examples
    AddIcon (IMG,/icons/image.png) .gif .jpg .png
    AddIcon /icons/dir.png ^^DIRECTORY^^
    AddIcon /icons/backup.png *~

AddIconByType should be used in preference to AddIcon, when possible.

AddIconByEncoding Directive

Description: Icon to display next to files selected by MIME content-encoding
Syntax:      AddIconByEncoding icon MIME-encoding [MIME-encoding] ...
Context:     server config, virtual host, directory, .htaccess
Override:    Indexes
Status:      Base
Module:      mod_autoindex

This sets the icon to display next to files with FancyIndexing. Icon is either a (%-escaped) relative URL to the icon, a fully qualified remote URL, or of the format (alttext,url) where alttext is the text tag given for an icon for non-graphical browsers.

MIME-encoding is a valid content-encoding, such as x-compress.

    AddIconByEncoding /icons/compress.png x-compress
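Because the first icon listed that matches a file is the one displayed, the order of the icon directives matters. The fragment below is a sketch of one possible ordering; it assumes the icon files exist under an /icons/ URL path, and the alternate-text tags IMG and CMP are arbitrary labels chosen for illustration.

```apache
# Type- and encoding-based rules first, as recommended above
AddIconByType     (IMG,/icons/image.png)    image/*
AddIconByEncoding (CMP,/icons/compress.png) x-compress x-gzip

# Name-based rules afterwards, for anything the rules above miss
AddIcon /icons/backup.png *~
AddIcon /icons/dir.png    ^^DIRECTORY^^

# Only needed for the preformatted layout, not with IndexOptions HTMLTable
AddIcon /icons/blank.png  ^^BLANKICON^^
```

Listing the MIME-based rules first means a file such as photo.png picks up the image icon from its content type before any name-based pattern is consulted.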
AddIconByType Directive

Description: Icon to display next to files selected by MIME content-type
Syntax:      AddIconByType icon MIME-type [MIME-type] ...
Context:     server config, virtual host, directory, .htaccess
Override:    Indexes
Status:      Base
Module:      mod_autoindex

This sets the icon to display next to files of type MIME-type for FancyIndexing. Icon is either a (%-escaped) relative URL to the icon, a fully qualified remote URL, or of the format (alttext,url) where alttext is the text tag given for an icon for non-graphical browsers.

MIME-type is a wildcard expression matching the required MIME types.

    AddIconByType (IMG,/icons/image.png) image/*

DefaultIcon Directive

Description: Icon to display for files when no specific icon is configured
Syntax:      DefaultIcon url-path
Context:     server config, virtual host, directory, .htaccess
Override:    Indexes
Status:      Base
Module:      mod_autoindex

The DefaultIcon directive sets the icon to display for files when no specific icon is known, for FancyIndexing. Url-path is a (%-escaped) relative URL to the icon, or a fully qualified remote URL.

    DefaultIcon /icon/unknown.png

HeaderName Directive

Description: Name of the file that will be inserted at the top of the index listing
Syntax:      HeaderName filename
Context:     server config, virtual host, directory, .htaccess
Override:    Indexes
Status:      Base
Module:      mod_autoindex

The HeaderName directive sets the name of the file that will be inserted at the top of the index listing. Filename is the name of the file to include.

    HeaderName HEADER.html

Note: Both HeaderName and ReadmeName now treat Filename as a URI path relative to the one used to access the directory being indexed. If Filename begins with a slash, it will be taken to be relative to the DocumentRoot.

    HeaderName /include/HEADER.html

Filename must resolve to a document with a major content type of text/* (e.g., text/html, text/plain, etc.).
This means that filename may refer to a CGI script if the script’s actual file type (as opposed to its output) is marked as text/html, such as with a directive like:

    AddType text/html .cgi

Content negotiation will be performed if Options MultiViews is in effect. If filename resolves to a static text/html document (not a CGI script) and either one of the Options Includes or IncludesNOEXEC is enabled, the file will be processed for server-side includes (see the mod_include documentation).

If the file specified by HeaderName contains the beginnings of an HTML document (<html>, <head>, etc.) then you will probably want to set IndexOptions +SuppressHTMLPreamble, so that these tags are not repeated.

See also
• ReadmeName

IndexHeadInsert Directive

Description: Inserts text in the HEAD section of an index page.
Syntax:      IndexHeadInsert "markup ..."
Context:     server config, virtual host, directory, .htaccess
Override:    Indexes
Status:      Base
Module:      mod_autoindex

The IndexHeadInsert directive specifies a string to insert in the <head> section of the HTML generated for the index page.

    IndexHeadInsert ""

IndexIgnore Directive

Description: Adds to the list of files to hide when listing a directory
Syntax:      IndexIgnore file [file] ...
Default:     IndexIgnore "."
Context:     server config, virtual host, directory, .htaccess
Override:    Indexes
Status:      Base
Module:      mod_autoindex

The IndexIgnore directive adds to the list of files to hide when listing a directory. File is a shell-style wildcard expression or full filename. Multiple IndexIgnore directives add to the list, rather than replacing the list of ignored files. By default, the list contains . (the current directory).

    IndexIgnore .??* *~ *# HEADER* README* RCS CVS *,v *,t
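The additive behaviour of IndexIgnore can be sketched with two nested sections. The directory paths and the patterns chosen here are assumptions for illustration only.

```apache
# Hypothetical parent directory
<Directory "/var/www/files">
    Options +Indexes
    IndexIgnore *.bak *~
</Directory>

# Hypothetical subdirectory: the pattern below is added to the
# inherited list, so ., *.bak, *~ and *.tmp are all hidden here
<Directory "/var/www/files/archive">
    IndexIgnore *.tmp
</Directory>
```

In this sketch nothing in the subdirectory can make *.bak files visible again by listing fewer patterns, since each IndexIgnore line only ever extends the list.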
Regular Expressions: This directive does not currently work in configuration sections that have regular expression arguments, such as <DirectoryMatch>.

IndexIgnoreReset Directive

Description:   Empties the list of files to hide when listing a directory
Syntax:        IndexIgnoreReset ON|OFF
Context:       server config, virtual host, directory, .htaccess
Override:      Indexes
Status:        Base
Module:        mod_autoindex
Compatibility: 2.3.10 and later

The IndexIgnoreReset directive removes any files ignored by IndexIgnore otherwise inherited from other configuration sections.

    IndexIgnore *.bak .??* *~ *# HEADER* README* RCS CVS *,v *,t
    IndexIgnoreReset ON
    IndexIgnore .??* *# HEADER* README* RCS CVS *,v *,t

Note: Review the default configuration for a list of patterns that you might want to explicitly ignore after using this directive.

IndexOptions Directive

Description: Various configuration settings for directory indexing
Syntax:      IndexOptions [+|-]option [[+|-]option] ...
Default:     By default, no options are enabled.
Context:     server config, virtual host, directory, .htaccess
Override:    Indexes
Status:      Base
Module:      mod_autoindex

The IndexOptions directive specifies the behavior of the directory indexing. Option can be one of:

AddAltClass
Adds an additional CSS class declaration to each row of the directory listing table when IndexOptions HTMLTable is in effect and an IndexStyleSheet is defined. Rather than the standard even and odd classes that would otherwise be applied to each row of the table, a class of even-ALT or odd-ALT is used, where ALT is either the standard alt text associated with the file style (e.g. snd, txt, img, etc.) or the alt text defined by one of the various AddAlt* directives.

Charset=character-set (Apache HTTP Server 2.0.61 and later)
The Charset keyword allows you to specify the character set of the generated page. The default is UTF-8 on Windows and Mac OS X, and ISO-8859-1 elsewhere.
(It depends on whether the underlying file system uses Unicode filenames or not.)

    IndexOptions Charset=UTF-8

DescriptionWidth=[n | *]
The DescriptionWidth keyword allows you to specify the width of the description column in characters.
-DescriptionWidth (or unset) allows mod_autoindex to calculate the best width.
DescriptionWidth=n fixes the column width to n bytes wide.
DescriptionWidth=* grows the column to the width necessary to accommodate the longest description string.
See the section on AddDescription for dangers inherent in truncating descriptions.

FancyIndexing
This turns on fancy indexing of directories.

FoldersFirst
If this option is enabled, subdirectory listings will always appear first, followed by normal files in the directory. The listing is basically broken into two components, the files and the subdirectories, and each is sorted separately and then displayed subdirectories-first. For instance, if the sort order is descending by name, and FoldersFirst is enabled, subdirectory Zed will be listed before subdirectory Beta, which will be listed before normal files Gamma and Alpha. This option only has an effect if FancyIndexing is also enabled.

HTMLTable
This option with FancyIndexing constructs a simple table for the fancy directory listing. It is necessary for utf-8 enabled platforms or if file names or description text will alternate between left-to-right and right-to-left reading order.

IconsAreLinks
This makes the icons part of the anchor for the filename, for fancy indexing.

IconHeight[=pixels]
Presence of this option, when used with IconWidth, will cause the server to include height and width attributes in the img tag for the file icon. This allows the browser to precalculate the page layout without having to wait until all the images have been loaded. If no value is given for the option, it defaults to the standard height of the icons supplied with the Apache httpd software.
This option only has an effect if FancyIndexing is also enabled.

IconWidth[=pixels]
Presence of this option, when used with IconHeight, will cause the server to include height and width attributes in the img tag for the file icon. This allows the browser to precalculate the page layout without having to wait until all the images have been loaded. If no value is given for the option, it defaults to the standard width of the icons supplied with the Apache httpd software.

IgnoreCase
If this option is enabled, names are sorted in a case-insensitive manner. For instance, if the sort order is ascending by name, and IgnoreCase is enabled, file Zeta will be listed after file alfa (Note: file GAMMA will always be listed before file gamma).

IgnoreClient
This option causes mod_autoindex to ignore all query variables from the client, including sort order (implies SuppressColumnSorting).

NameWidth=[n | *]
The NameWidth keyword allows you to specify the width of the filename column in bytes.
-NameWidth (or unset) allows mod_autoindex to calculate the best width, but only up to 20 bytes wide.
NameWidth=n fixes the column width to n bytes wide.
NameWidth=* grows the column to the necessary width.

ScanHTMLTitles
This enables the extraction of the title from HTML documents for fancy indexing. If the file does not have a description given by AddDescription then httpd will read the document for the value of the title element. This is CPU and disk intensive.

ShowForbidden
If specified, Apache httpd will show files normally hidden because the subrequest returned HTTP_UNAUTHORIZED or HTTP_FORBIDDEN.

SuppressColumnSorting
If specified, Apache httpd will not make the column headings in a FancyIndexed directory listing into links for sorting. The default behavior is for them to be links; selecting the column heading will sort the directory listing by the values in that column.
However, query string arguments which are appended to the URL will still be honored. That behavior is controlled by IndexOptions IgnoreClient.

SuppressDescription
This will suppress the file description in fancy indexing listings. By default, no file descriptions are defined, and so the use of this option will regain 23 characters of screen space to use for something else. See AddDescription for information about setting the file description. See also the DescriptionWidth index option to limit the size of the description column. This option only has an effect if FancyIndexing is also enabled.

SuppressHTMLPreamble
If the directory actually contains a file specified by the HeaderName directive, the module usually includes the contents of the file after a standard HTML preamble (<html>, <head>, et cetera). The SuppressHTMLPreamble option disables this behaviour, causing the module to start the display with the header file contents. The header file must contain appropriate HTML instructions in this case. If there is no header file, the preamble is generated as usual. If you also specify a ReadmeName, and if that file exists, the closing </body></html> tags are also omitted from the output, under the assumption that you’ll likely put those closing tags in that file.

SuppressIcon
This will suppress the icon in fancy indexing listings. Combining both SuppressIcon and SuppressRules yields proper HTML 3.2 output, which by the final specification prohibits img and hr elements from the pre block (used to format FancyIndexed listings).

SuppressLastModified
This will suppress the display of the last modification date, in fancy indexing listings. This option only has an effect if FancyIndexing is also enabled.

SuppressRules
This will suppress the horizontal rule lines (hr elements) in directory listings.
Combining both SuppressIcon and SuppressRules yields proper HTML 3.2 output, which by the final specification prohibits img and hr elements from the pre block (used to format FancyIndexed listings). This option only has an effect if FancyIndexing is also enabled.

SuppressSize
This will suppress the file size in fancy indexing listings. This option only has an effect if FancyIndexing is also enabled.

TrackModified
This returns the Last-Modified and ETag values for the listed directory in the HTTP header. It is only valid if the operating system and file system return appropriate stat() results. Some Unix systems do so, as do OS/2's JFS and Win32's NTFS volumes. OS/2 and Win32 FAT volumes, for example, do not. Once this feature is enabled, the client or proxy can track changes to the list of files when they perform a HEAD request. Note that some operating systems correctly track new and removed files, but do not track changes for sizes or dates of the files within the directory. Changes to the size or date stamp of an existing file will not update the Last-Modified header on all Unix platforms. If this is a concern, leave this option disabled.

Type=MIME content-type (Apache HTTP Server 2.0.61 and later)
The Type keyword allows you to specify the MIME content-type of the generated page. The default is text/html.

    IndexOptions Type=text/plain

VersionSort (Apache HTTP Server 2.0a3 and later)
The VersionSort keyword causes files containing version numbers to sort in a natural way. Strings are sorted as usual, except that substrings of digits in the name and description are compared according to their numeric value.

Example:

    foo-1.7
    foo-1.7.2
    foo-1.7.12
    foo-1.8.2
    foo-1.8.2a
    foo-1.12

If the number starts with a zero, then it is considered to be a fraction:

    foo-1.001
    foo-1.002
    foo-1.030
    foo-1.04

XHTML (Apache HTTP Server 2.0.49 and later)
The XHTML keyword forces mod_autoindex to emit XHTML 1.0 code instead of HTML 3.2.
This option only has an effect if FancyIndexing is also enabled.

Incremental IndexOptions

Be aware of how multiple IndexOptions are handled.

• Multiple IndexOptions directives for a single directory are now merged together. The result of:

    IndexOptions HTMLTable
    IndexOptions SuppressColumnsorting

  will be the equivalent of

    IndexOptions HTMLTable SuppressColumnsorting

• The addition of the incremental syntax (i.e., prefixing keywords with + or -).

Whenever a '+' or '-' prefixed keyword is encountered, it is applied to the current IndexOptions settings (which may have been inherited from an upper-level directory). However, whenever an unprefixed keyword is processed, it clears all inherited options and any incremental settings encountered so far. Consider the following example:

    IndexOptions +ScanHTMLTitles -IconsAreLinks FancyIndexing
    IndexOptions +SuppressSize

The net effect is equivalent to IndexOptions FancyIndexing +SuppressSize, because the unprefixed FancyIndexing discarded the incremental keywords before it, but allowed them to start accumulating again afterward.

To unconditionally set the IndexOptions for a particular directory, clearing the inherited settings, specify keywords without any + or - prefixes.

IndexOrderDefault Directive

Description: Sets the default ordering of the directory index
Syntax: IndexOrderDefault Ascending|Descending Name|Date|Size|Description
Default: IndexOrderDefault Ascending Name
Context: server config, virtual host, directory, .htaccess
Override: Indexes
Status: Base
Module: mod_autoindex

The IndexOrderDefault directive is used in combination with the FancyIndexing index option. By default, fancyindexed directory listings are displayed in ascending order by filename; IndexOrderDefault allows you to change this initial display order.

IndexOrderDefault takes two arguments. The first must be either Ascending or Descending, indicating the direction of the sort.
The second argument must be one of the keywords Name, Date, Size, or Description, and identifies the primary key. The secondary key is always the ascending filename.

You can, if desired, prevent the client from reordering the list by also adding the SuppressColumnSorting index option to remove the sort link from the top of the column, along with the IgnoreClient index option to prevent the client from manually adding sort options to the query string in order to override your ordering preferences.

IndexStyleSheet Directive

Description: Adds a CSS stylesheet to the directory index
Syntax: IndexStyleSheet url-path
Context: server config, virtual host, directory, .htaccess
Override: Indexes
Status: Base
Module: mod_autoindex

The IndexStyleSheet directive sets the name of the file that will be used as the CSS for the index listing.

    IndexStyleSheet "/css/style.css"

Using this directive in conjunction with IndexOptions HTMLTable adds a number of CSS classes to the resulting HTML. The entire table is given a CSS id of indexlist and the following classes are associated with the various parts of the listing:

    Class                                      Definition
    tr.indexhead                               Header row of listing
    th.indexcolicon and td.indexcolicon        Icon column
    th.indexcolname and td.indexcolname        File name column
    th.indexcollastmod and td.indexcollastmod  Last modified column
    th.indexcolsize and td.indexcolsize        File size column
    th.indexcoldesc and td.indexcoldesc        Description column
    tr.breakrow                                Horizontal rule at the bottom of the table
    tr.odd and tr.even                         Alternating even and odd rows

ReadmeName Directive

Description: Name of the file that will be inserted at the end of the index listing
Syntax: ReadmeName filename
Context: server config, virtual host, directory, .htaccess
Override: Indexes
Status: Base
Module: mod_autoindex

The ReadmeName directive sets the name of the file that will be appended to the end of the index listing. Filename is the name of the file to include, and is taken to be relative to the location being indexed.
If Filename begins with a slash, as in example 2, it will be taken to be relative to the DocumentRoot.

    # Example 1
    ReadmeName FOOTER.html

    # Example 2
    ReadmeName /include/FOOTER.html

See also HeaderName, where this behavior is described in greater detail.

10.29 Apache Module mod_buffer

Description: Support for request buffering
Status: Extension
ModuleIdentifier: buffer_module
SourceFile: mod_buffer.c
Compatibility: Available in Apache 2.3 and later

Summary

This module provides the ability to buffer the input and output filter stacks.

Under certain circumstances, content generators might create content in small chunks. In order to promote memory reuse, in-memory chunks are always 8k in size, regardless of the size of the chunk itself. When many small chunks are generated by a request, this can create a large memory footprint while the request is being processed, and an unnecessarily large amount of data on the wire. The addition of a buffer collapses the response into the fewest chunks possible.

When httpd is used in front of an expensive content generator, buffering the response may allow the backend to complete processing and release resources sooner, depending on how the backend is designed.

The buffer filter may be added to either the input or the output filter stacks, as appropriate, using the SetInputFilter, SetOutputFilter, AddOutputFilter or AddOutputFilterByType directives.

Using buffer with mod_include

    AddOutputFilterByType INCLUDES;BUFFER text/html

Warning: The buffer filters read the request/response into RAM and then repack the request/response into the fewest memory buckets possible, at the cost of CPU time. When the request/response is already efficiently packed, buffering the request/response could cause the request/response to be slower than not using a buffer at all. These filters should be used with care, and only where necessary.

Directives
• BufferSize

See also
• Filters (p.
110)

BufferSize Directive

Description: Maximum size in bytes to buffer by the buffer filter
Syntax: BufferSize integer
Default: BufferSize 131072
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_buffer

The BufferSize directive specifies the amount of data in bytes that will be buffered before being read from or written to each request. The default is 128 kilobytes.

10.30 Apache Module mod_cache

Description: RFC 2616 compliant HTTP caching filter.
Status: Extension
ModuleIdentifier: cache_module
SourceFile: mod_cache.c

Summary

Warning: This module should be used with care, as when the CacheQuickHandler directive is in its default value of on, the Allow and Deny directives will be circumvented. You should not enable quick handler caching for any content to which you wish to limit access by client host name, address or environment variable.

mod_cache implements an RFC 2616 [14] compliant HTTP content caching filter, with support for the caching of content negotiated responses containing the Vary header.

RFC 2616 compliant caching provides a mechanism to verify whether stale or expired content is still fresh, and can represent a significant performance boost when the origin server supports conditional requests by honouring the If-None-Match [15] HTTP request header. Content is only regenerated from scratch when the content has changed, and not when the cached entry expires.

As a filter, mod_cache can be placed in front of content originating from any handler, including flat files (served from a slow disk cached on a fast disk), the output of a CGI script or dynamic content generator, or content proxied from another server.

In the default configuration, mod_cache inserts the caching filter as far forward as possible within the filter stack, utilising the quick handler to bypass all per request processing when returning content to the client.
In this mode of operation, mod_cache may be thought of as a caching proxy server bolted to the front of the webserver, while running within the webserver itself.

When the quick handler is switched off using the CacheQuickHandler directive, it becomes possible to insert the CACHE filter at a point in the filter stack chosen by the administrator. This provides the opportunity to cache content before that content is personalised by the mod_include filter, or optionally compressed by the mod_deflate filter.

Under normal operation, mod_cache will respond to and can be controlled by the Cache-Control [16] and Pragma [17] headers sent from a client in a request, or from a server within a response. Under exceptional circumstances, mod_cache can be configured to override these headers and force site specific behaviour. Such behaviour is limited to this cache only and does not affect the operation of other caches that may exist between the client and server; as a result, it is not recommended unless strictly necessary.

RFC 2616 allows for the cache to return stale data while the existing stale entry is refreshed from the origin server, and this is supported by mod_cache when the CacheLock directive is suitably configured. Such responses will contain a Warning [18] HTTP header with a 110 response code. RFC 2616 also allows a cache to return stale data when the attempt made to refresh the stale data returns an error 500 or above, and this behaviour is supported by default by mod_cache. Such responses will contain a Warning [19] HTTP header with a 111 response code.

mod_cache requires the services of one or more storage management modules.
The following storage management modules are included in the base Apache distribution:

mod_cache_disk
Implements a disk based storage manager. Headers and bodies are stored separately on disk, in a directory structure derived from the md5 hash of the cached URL. Multiple content negotiated responses can be stored concurrently, however the caching of partial content is not supported by this module. The htcacheclean tool is provided to list cached URLs, remove cached URLs, or to maintain the size of the disk cache within size and inode limits.

mod_cache_socache
Implements a shared object cache based storage manager. Headers and bodies are stored together beneath a single key based on the URL of the response being cached. Multiple content negotiated responses can be stored concurrently, however the caching of partial content is not supported by this module.

Further details, discussion, and examples are provided in the Caching Guide (p. 43).

[14] http://www.ietf.org/rfc/rfc2616.txt
[15] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.26
[16] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9
[17] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.32
[18] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.46
[19] http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.46

Directives
• CacheDefaultExpire
• CacheDetailHeader
• CacheDisable
• CacheEnable
• CacheHeader
• CacheIgnoreCacheControl
• CacheIgnoreHeaders
• CacheIgnoreNoLastMod
• CacheIgnoreQueryString
• CacheIgnoreURLSessionIdentifiers
• CacheKeyBaseURL
• CacheLastModifiedFactor
• CacheLock
• CacheLockMaxAge
• CacheLockPath
• CacheMaxExpire
• CacheMinExpire
• CacheQuickHandler
• CacheStaleOnError
• CacheStoreExpired
• CacheStoreNoStore
• CacheStorePrivate

See also
• Caching Guide (p. 43)
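The summary above names two storage managers. As a rough sketch of how the shared object cache variant might be enabled: the module file paths, the choice of the shmcb provider, and the size value below are assumptions for illustration only, not recommended settings.

```apache
# Sketch only: module paths and the shmcb provider are assumed,
# adjust them to match the local installation.
LoadModule cache_module modules/mod_cache.so
LoadModule cache_socache_module modules/mod_cache_socache.so
LoadModule socache_shmcb_module modules/mod_socache_shmcb.so

# Select the shared object cache provider used for storage
CacheSocache shmcb

# Illustrative upper bound, in bytes, on a single cached entry
CacheSocacheMaxSize 102400

# Cache everything at or below / using the socache storage manager
CacheEnable socache /
```

The only change in the CacheEnable line relative to a disk based setup is the cache type argument.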
Related Modules and Directives

Related Modules:
    mod_cache_disk
    mod_cache_socache

Related Directives:
    CacheRoot
    CacheDirLevels
    CacheDirLength
    CacheMinFileSize
    CacheMaxFileSize
    CacheSocache
    CacheSocacheMaxTime
    CacheSocacheMinTime
    CacheSocacheMaxSize
    CacheSocacheReadSize
    CacheSocacheReadTime

Sample Configuration

Sample httpd.conf

    #
    # Sample Cache Configuration
    #
    LoadModule cache_module modules/mod_cache.so
    LoadModule cache_disk_module modules/mod_cache_disk.so
    CacheRoot c:/cacheroot
    CacheEnable disk /
    CacheDirLevels 5
    CacheDirLength 3

    # When acting as a proxy, don't cache the list of security updates
    CacheDisable http://security.update.server/update-list/

Avoiding the Thundering Herd

When a cached entry becomes stale, mod_cache will submit a conditional request to the backend, which is expected to confirm whether the cached entry is still fresh, and send an updated entity if not.

A small but finite amount of time exists between the time the cached entity becomes stale, and the time the stale entity is fully refreshed. On a busy server, a significant number of requests might arrive during this time, and cause a thundering herd of requests to strike the backend suddenly and unpredictably.

To keep the thundering herd at bay, the CacheLock directive can be used to define a directory in which locks are created for URLs in flight. The lock is used as a hint by other requests to either suppress an attempt to cache (someone else has gone to fetch the entity), or to indicate that a stale entry is being refreshed (stale content will be returned in the mean time).

Initial caching of an entry

When an entity is cached for the first time, a lock will be created for the entity until the response has been fully cached. During the lifetime of the lock, the cache will suppress the second and subsequent attempt to cache the same entity.
While this doesn't hold back the thundering herd, it does stop the cache attempting to cache the same entity multiple times simultaneously.

Refreshment of a stale entry

When an entity reaches its freshness lifetime and becomes stale, a lock will be created for the entity until the response has either been confirmed as still fresh, or replaced by the backend. During the lifetime of the lock, the second and subsequent incoming request will cause stale data to be returned, and the thundering herd is kept at bay.

Locks and Cache-Control: no-cache

Locks are used as a hint only to enable the cache to be more gentle on backend servers, however the lock can be overridden if necessary. If the client sends a request with a Cache-Control header forcing a reload, any lock that may be present will be ignored, and the client's request will be honored immediately and the cached entry refreshed.

As a further safety mechanism, locks have a configurable maximum age. Once this age has been reached, the lock is removed, and a new request is given the opportunity to create a new lock. This maximum age can be set using the CacheLockMaxAge directive, and defaults to 5 seconds.

Example configuration

Enabling the cache lock

    #
    # Enable the cache lock
    #
    CacheLock on
    CacheLockPath /tmp/mod_cache-lock
    CacheLockMaxAge 5

Fine Control with the CACHE Filter

Under the default mode of cache operation, the cache runs as a quick handler, short circuiting the majority of server processing and offering the highest cache performance available.

In this mode, the cache bolts onto the front of the server, acting as if a free standing RFC 2616 caching proxy had been placed in front of the server.

While this mode offers the best performance, the administrator may find that under certain circumstances they may want to perform further processing on the request after the request is cached, such as to inject personalisation into the cached page, or to apply authorization restrictions to the content.
Under these circumstances, an administrator is often forced to place independent reverse proxy servers either behind or in front of the caching server to achieve this.

To solve this problem the CacheQuickHandler directive can be set to off, and the server will process all phases normally handled by a non-cached request, including the authentication and authorization phases.

In addition, the administrator may optionally specify the precise point within the filter chain where caching is to take place by adding the CACHE filter to the output filter chain.

For example, to cache content before applying compression to the response, place the CACHE filter before the DEFLATE filter as in the example below:

    # Cache content before optional compression
    CacheQuickHandler off
    AddOutputFilterByType CACHE;DEFLATE text/plain

Another option is to have content cached before personalisation is applied by mod_include (or another content processing filter). In this example templates containing tags understood by mod_include are cached before being parsed:

    # Cache content before mod_include and mod_deflate
    CacheQuickHandler off
    AddOutputFilterByType CACHE;INCLUDES;DEFLATE text/html

You may place the CACHE filter anywhere you wish within the filter chain. In this example, content is cached after being parsed by mod_include, but before being processed by mod_deflate:

    # Cache content between mod_include and mod_deflate
    CacheQuickHandler off
    AddOutputFilterByType INCLUDES;CACHE;DEFLATE text/html

Warning: If the location of the CACHE filter in the filter chain is changed for any reason, you may need to flush your cache to ensure that your data served remains consistent. mod_cache is not in a position to enforce this for you.
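The normal handler mode described in this section is also what allows authorization restrictions to apply to cached content. A minimal sketch, assuming that mod_cache_disk, mod_authz_core and mod_authz_host are loaded and that CacheRoot is configured elsewhere; the /internal path and the 192.0.2.0/24 address range are placeholders, not values taken from this manual:

```apache
# Sketch only: the path and the address range are placeholders.
# Run the cache as a normal handler so that the authentication and
# authorization phases are processed before cached content is served.
CacheQuickHandler off

<Location "/internal">
    # Cache responses for this URL space with the disk storage manager
    CacheEnable disk

    # Host based restriction; with the quick handler left on, a
    # restriction of this kind could be bypassed for cached entries
    Require ip 192.0.2.0/24
</Location>
```

The trade-off is the one stated above: each request passes through the normal request phases, so some of the performance gained from the quick handler is given up in exchange for access control.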
Cache Status and Logging

Once mod_cache has made a decision as to whether or not an entity is to be served from cache, the detailed reason for the decision is written to the subprocess environment within the request under the cache-status key. This reason can be logged by the LogFormat directive as follows:

    LogFormat "%{cache-status}e ..."

Based on the caching decision made, the reason is also written to the subprocess environment under one of the following four keys, as appropriate:

cache-hit
The response was served from cache.

cache-revalidate
The response was stale and was successfully revalidated, then served from cache.

cache-miss
The response was served from the upstream server.

cache-invalidate
The cached entity was invalidated by a request method other than GET or HEAD.

This makes it possible to support conditional logging of cached requests as per the following example:

    CustomLog "cached-requests.log" common env=cache-hit
    CustomLog "uncached-requests.log" common env=cache-miss
    CustomLog "revalidated-requests.log" common env=cache-revalidate
    CustomLog "invalidated-requests.log" common env=cache-invalidate

For module authors, a hook called cache_status is available, allowing modules to respond to the caching outcomes above in customised ways.

CacheDefaultExpire Directive

Description: The default duration to cache a document when no expiry date is specified.
Syntax: CacheDefaultExpire seconds
Default: CacheDefaultExpire 3600 (one hour)
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache

The CacheDefaultExpire directive specifies a default time, in seconds, to cache a document if neither an expiry date nor a last-modified date is provided with the document. The value specified with the CacheMaxExpire directive does not override this setting.
    CacheDefaultExpire 86400

CacheDetailHeader Directive

Description: Add an X-Cache-Detail header to the response.
Syntax: CacheDetailHeader on|off
Default: CacheDetailHeader off
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache
Compatibility: Available in Apache 2.3.9 and later

When the CacheDetailHeader directive is switched on, an X-Cache-Detail header will be added to the response containing the detailed reason for a particular caching decision.

It can be useful during development of cached RESTful services to have additional information about the caching decision written to the response headers, so as to confirm whether Cache-Control and other headers have been correctly used by the service and client.

If the normal handler is used, this directive may appear within a <Directory> or <Location> directive. If the quick handler is used, this directive must appear within a server or virtual host context, otherwise the setting will be ignored.

    # Enable the X-Cache-Detail header
    CacheDetailHeader on

    X-Cache-Detail: "conditional cache hit: entity refreshed" from localhost

CacheDisable Directive

Description: Disable caching of specified URLs
Syntax: CacheDisable url-string | on
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache

The CacheDisable directive instructs mod_cache to not cache URLs at or below url-string.

Example

    CacheDisable /local_files

If used in a <Location> directive, the path needs to be specified below the Location, or if the word "on" is used, caching for the whole location will be disabled.

Example

    <Location "/foo">
        CacheDisable on
    </Location>

The no-cache environment variable can be set to disable caching on a finer grained set of resources in versions 2.2.12 and later.

See also
• Environment Variables in Apache (p.
92)

CacheEnable Directive

Description: Enable caching of specified URLs using a specified storage manager
Syntax: CacheEnable cache_type [url-string]
Context: server config, virtual host, directory
Status: Extension
Module: mod_cache
Compatibility: A url-string of '/' applied to forward proxy content in 2.2 and earlier.

The CacheEnable directive instructs mod_cache to cache URLs at or below url-string. The cache storage manager is specified with the cache_type argument. The CacheEnable directive can alternatively be placed inside either <Location> or <LocationMatch> sections to indicate the content is cacheable. cache_type disk instructs mod_cache to use the disk based storage manager implemented by mod_cache_disk. cache_type socache instructs mod_cache to use the shared object cache based storage manager implemented by mod_cache_socache.

In the event that the URL space overlaps between different CacheEnable directives (as in the example below), each possible storage manager will be run until the first one that actually processes the request. The order in which the storage managers are run is determined by the order of the CacheEnable directives in the configuration file. CacheEnable directives within <Location> or <LocationMatch> sections are processed before globally defined CacheEnable directives.

When acting as a forward proxy server, url-string must minimally begin with a protocol for which caching should be enabled.

    # Cache content (normal handler only)
    CacheQuickHandler off
    <Location "/foo">
        CacheEnable disk
    </Location>

    # Cache regex (normal handler only)
    CacheQuickHandler off
    <LocationMatch "foo$">
        CacheEnable disk
    </LocationMatch>

    # Cache all but forward proxy url's (normal or quick handler)
    CacheEnable disk /

    # Cache FTP-proxied url's (normal or quick handler)
    CacheEnable disk ftp://

    # Cache forward proxy content from www.example.org (normal or quick handler)
    CacheEnable disk http://www.example.org/

A hostname starting with a "*" matches all hostnames with that suffix. A hostname starting with "."
matches all hostnames containing the domain components that follow.

    # Match www.example.org, and fooexample.org
    CacheEnable disk http://*example.org/

    # Match www.example.org, but not fooexample.org
    CacheEnable disk http://.example.org/

The no-cache environment variable can be set to disable caching on a finer grained set of resources in versions 2.2.12 and later.

See also
• Environment Variables in Apache (p. 92)

CacheHeader Directive

Description: Add an X-Cache header to the response.
Syntax: CacheHeader on|off
Default: CacheHeader off
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache
Compatibility: Available in Apache 2.3.9 and later

When the CacheHeader directive is switched on, an X-Cache header will be added to the response with the cache status of this response. If the normal handler is used, this directive may appear within a <Directory> or <Location> directive. If the quick handler is used, this directive must appear within a server or virtual host context, otherwise the setting will be ignored.

HIT
The entity was fresh, and was served from cache.

REVALIDATE
The entity was stale, was successfully revalidated and was served from cache.

MISS
The entity was fetched from the upstream server and was not served from cache.

    # Enable the X-Cache header
    CacheHeader on

    X-Cache: HIT from localhost

CacheIgnoreCacheControl Directive

Description: Ignore request to not serve cached content to client
Syntax: CacheIgnoreCacheControl On|Off
Default: CacheIgnoreCacheControl Off
Context: server config, virtual host
Status: Extension
Module: mod_cache

Ordinarily, requests containing a Cache-Control: no-cache or Pragma: no-cache header value will not be served from the cache. The CacheIgnoreCacheControl directive allows this behavior to be overridden. CacheIgnoreCacheControl On tells the server to attempt to serve the resource from the cache even if the request contains no-cache header values.
    CacheIgnoreCacheControl On

Warning: This directive will allow serving from the cache even if the client has requested that the document not be served from the cache. This might result in stale content being served.

See also
• CacheStorePrivate
• CacheStoreNoStore

CacheIgnoreHeaders Directive

Description: Do not store the given HTTP header(s) in the cache.
Syntax: CacheIgnoreHeaders header-string [header-string] ...
Default: CacheIgnoreHeaders None
Context: server config, virtual host
Status: Extension
Module: mod_cache

According to RFC 2616, hop-by-hop HTTP headers are not stored in the cache. The following HTTP headers are hop-by-hop headers and thus do not get stored in the cache, regardless of the setting of CacheIgnoreHeaders:

• Connection
• Keep-Alive
• Proxy-Authenticate
• Proxy-Authorization
• TE
• Trailers
• Transfer-Encoding
• Upgrade

CacheIgnoreHeaders specifies additional HTTP headers that should not be stored in the cache. For example, it makes sense in some cases to prevent cookies from being stored in the cache.

CacheIgnoreHeaders takes a space separated list of HTTP headers that should not be stored in the cache. If only hop-by-hop headers should not be stored in the cache (the RFC 2616 compliant behaviour), CacheIgnoreHeaders can be set to None.

Example 1

    CacheIgnoreHeaders Set-Cookie

Example 2

    CacheIgnoreHeaders None

Warning: If headers like Expires which are needed for proper cache management are not stored due to a CacheIgnoreHeaders setting, the behaviour of mod_cache is undefined.

CacheIgnoreNoLastMod Directive

Description: Ignore the fact that a response has no Last Modified header.
Syntax: CacheIgnoreNoLastMod On|Off
Default: CacheIgnoreNoLastMod Off
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache

Ordinarily, documents without a last-modified date are not cached.
Under some circumstances the last-modified date is removed (during mod_include processing for example) or not provided at all. The CacheIgnoreNoLastMod directive provides a way to specify that documents without last-modified dates should be considered for caching. If neither a last-modified date nor an expiry date is provided with the document then the value specified by the CacheDefaultExpire directive will be used to generate an expiration date.

    CacheIgnoreNoLastMod On

CacheIgnoreQueryString Directive

Description: Ignore query string when caching
Syntax: CacheIgnoreQueryString On|Off
Default: CacheIgnoreQueryString Off
Context: server config, virtual host
Status: Extension
Module: mod_cache

Ordinarily, requests with query string parameters are cached separately for each unique query string. According to RFC 2616 section 13.9, this is done only if an expiration time is specified. The CacheIgnoreQueryString directive tells the cache to cache requests even if no expiration time is specified, and to reply with a cached reply even if the query string differs. From a caching point of view the request is treated as if having no query string when this directive is enabled.

    CacheIgnoreQueryString On

CacheIgnoreURLSessionIdentifiers Directive

Description: Ignore defined session identifiers encoded in the URL when caching
Syntax: CacheIgnoreURLSessionIdentifiers identifier [identifier] ...
Default: CacheIgnoreURLSessionIdentifiers None
Context: server config, virtual host
Status: Extension
Module: mod_cache

Sometimes applications encode the session identifier into the URL, as in the following examples:

• /someapplication/image.gif;jsessionid=123456789
• /someapplication/image.gif?PHPSESSIONID=12345678

This causes cacheable resources to be stored separately for each session, which is often not desired.
CacheIgnoreURLSessionIdentifiers lets you define a list of identifiers that are removed from the key that is used to identify an entity in the cache, such that cacheable resources are not stored separately for each session.

CacheIgnoreURLSessionIdentifiers None clears the list of ignored identifiers. Otherwise, each identifier is added to the list.

Example 1

    CacheIgnoreURLSessionIdentifiers jsessionid

Example 2

    CacheIgnoreURLSessionIdentifiers None

CacheKeyBaseURL Directive

Description: Override the base URL of reverse proxied cache keys.
Syntax: CacheKeyBaseURL URL
Default: CacheKeyBaseURL http://example.com
Context: server config, virtual host
Status: Extension
Module: mod_cache
Compatibility: Available in Apache 2.3.9 and later

When the CacheKeyBaseURL directive is specified, the URL provided will be used as the base URL to calculate the URL of the cache keys in the reverse proxy configuration. When not specified, the scheme, hostname and port of the current virtual host are used to construct the cache key. When a cluster of machines is present, and all cached entries should be cached beneath the same cache key, a new base URL can be specified with this directive.

    # Override the base URL of the cache key.
    CacheKeyBaseURL http://www.example.com/

Warning: Take care when setting this directive. If two separate virtual hosts are accidentally given the same base URL, entries from one virtual host will be served to the other.

CacheLastModifiedFactor Directive

Description: The factor used to compute an expiry date based on the LastModified date.
Syntax: CacheLastModifiedFactor float
Default: CacheLastModifiedFactor 0.1
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache

In the event that a document does not provide an expiry date but does provide a last-modified date, an expiry date can be calculated based on the time since the document was last modified.
The CacheLastModifiedFactor directive specifies a factor to be used in the generation of this expiry date according to the following formula:

    expiry-period = time-since-last-modified-date * factor
    expiry-date = current-date + expiry-period

For example, if the document was last modified 10 hours ago, and factor is 0.1 then the expiry-period will be set to 10 * 0.1 = 1 hour. If the current time was 3:00pm then the computed expiry-date would be 3:00pm + 1 hour = 4:00pm.

If the expiry-period would be longer than that set by CacheMaxExpire, then the latter takes precedence.

    CacheLastModifiedFactor 0.5

CacheLock Directive

Description: Enable the thundering herd lock.
Syntax: CacheLock on|off
Default: CacheLock off
Context: server config, virtual host
Status: Extension
Module: mod_cache

The CacheLock directive enables the thundering herd lock for the given URL space.

In a minimal configuration the following directive is all that is needed to enable the thundering herd lock in the default run-time file directory.

    # Enable cache lock
    CacheLock on

Locks consist of empty files that only exist for stale URLs in flight, so this is significantly less resource intensive than the traditional disk cache.

CacheLockMaxAge Directive

Description: Set the maximum possible age of a cache lock.
Syntax: CacheLockMaxAge integer
Default: CacheLockMaxAge 5
Context: server config, virtual host
Status: Extension
Module: mod_cache

The CacheLockMaxAge directive specifies the maximum age of any cache lock.

A lock older than this value in seconds will be ignored, and the next incoming request will be given the opportunity to re-establish the lock. This mechanism prevents a slow client taking an excessively long time to refresh an entity.

CacheLockPath Directive

Description: Set the lock path directory.
Syntax: CacheLockPath directory
Default: CacheLockPath mod_cache-lock
Context: server config, virtual host
Status: Extension
Module: mod_cache

The CacheLockPath directive allows you to specify the directory in which the locks are created. If directory is not an absolute path, the location specified will be relative to the value of DefaultRuntimeDir.

CacheMaxExpire Directive

Description: The maximum time in seconds to cache a document
Syntax: CacheMaxExpire seconds
Default: CacheMaxExpire 86400 (one day)
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache

The CacheMaxExpire directive specifies the maximum number of seconds for which cacheable HTTP documents will be retained without checking the origin server. Thus, documents will be out of date at most this number of seconds. This maximum value is enforced even if an expiry date was supplied with the document.

    CacheMaxExpire 604800

CacheMinExpire Directive

Description: The minimum time in seconds to cache a document
Syntax: CacheMinExpire seconds
Default: CacheMinExpire 0
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache

The CacheMinExpire directive specifies the minimum number of seconds for which cacheable HTTP documents will be retained without checking the origin server. This is only used if no valid expire time was supplied with the document.

    CacheMinExpire 3600

CacheQuickHandler Directive

Description: Run the cache from the quick handler.
Syntax: CacheQuickHandler on|off
Default: CacheQuickHandler on
Context: server config, virtual host
Status: Extension
Module: mod_cache
Compatibility: Apache HTTP Server 2.3.3 and later

The CacheQuickHandler directive controls the phase in which the cache is handled.

In the default enabled configuration, the cache operates within the quick handler phase.
This phase short circuits the majority of server processing, and represents the most performant mode of operation for a typical server. The cache bolts onto the front of the server, and the majority of server processing is avoided.

When disabled, the cache operates as a normal handler, and is subject to the full set of phases when handling a server request. While this mode is slower than the default, it allows the cache to be used in cases where full processing is required, such as when content is subject to authorization.

    # Run cache as a normal handler
    CacheQuickHandler off

It is also possible, when the quick handler is disabled, for the administrator to choose the precise location within the filter chain where caching is to be performed, by adding the CACHE filter to the chain.

    # Cache content before mod_include and mod_deflate
    CacheQuickHandler off
    AddOutputFilterByType CACHE;INCLUDES;DEFLATE text/html

If the CACHE filter is specified more than once, the last instance will apply.

CacheStaleOnError Directive

Description: Serve stale content in place of 5xx responses.
Syntax: CacheStaleOnError on|off
Default: CacheStaleOnError on
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache
Compatibility: Available in Apache 2.3.9 and later

When the CacheStaleOnError directive is switched on, and when stale data is available in the cache, the cache will respond to 5xx responses from the backend by returning the stale data instead of the 5xx response. While the Cache-Control headers sent by clients will be respected, and the raw 5xx responses returned to the client on request, the 5xx response so returned to the client will not invalidate the content in the cache.

    # Serve stale data on error.
    CacheStaleOnError on

CacheStoreExpired Directive

Description: Attempt to cache responses that the server reports as expired
Syntax: CacheStoreExpired On|Off
Default: CacheStoreExpired Off
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache

Since httpd 2.2.4, responses which have already expired are not stored in the cache. The CacheStoreExpired directive allows this behavior to be overridden. CacheStoreExpired On tells the server to attempt to cache the resource if it is stale. Subsequent requests would trigger an If-Modified-Since request of the origin server, and the response may be fulfilled from cache if the backend resource has not changed.

    CacheStoreExpired On

CacheStoreNoStore Directive

Description: Attempt to cache requests or responses that have been marked as no-store.
Syntax: CacheStoreNoStore On|Off
Default: CacheStoreNoStore Off
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache

Ordinarily, requests or responses with Cache-Control: no-store header values will not be stored in the cache. The CacheStoreNoStore directive allows this behavior to be overridden. CacheStoreNoStore On tells the server to attempt to cache the resource even if it contains no-store header values.

    CacheStoreNoStore On

Warning: As described in RFC 2616, the no-store directive is intended to "prevent the inadvertent release or retention of sensitive information (for example, on backup tapes)." Enabling this option could store sensitive information in the cache. You are hereby warned.
See also

• CacheIgnoreCacheControl
• CacheStorePrivate

CacheStorePrivate Directive

Description: Attempt to cache responses that the server has marked as private
Syntax: CacheStorePrivate On|Off
Default: CacheStorePrivate Off
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache

Ordinarily, responses with Cache-Control: private header values will not be stored in the cache. The CacheStorePrivate directive allows this behavior to be overridden. CacheStorePrivate On tells the server to attempt to cache the resource even if it contains private header values.

    CacheStorePrivate On

Warning: This directive will allow caching even if the upstream server has requested that the resource not be cached. This directive is only ideal for a 'private' cache.

See also

• CacheIgnoreCacheControl
• CacheStoreNoStore

10.31 Apache Module mod_cache_disk

Description: Disk based storage module for the HTTP caching filter.
Status: Extension
ModuleIdentifier: cache_disk_module
SourceFile: mod_cache_disk.c

Summary

mod_cache_disk implements a disk based storage manager for mod_cache.

The headers and bodies of cached responses are stored separately on disk, in a directory structure derived from the md5 hash of the cached URL.

Multiple content negotiated responses can be stored concurrently, however the caching of partial content is not yet supported by this module.

Atomic cache updates to both header and body files are achieved without the need for locking by storing the device and inode numbers of the body file within the header file. This has the side effect that cache entries manually moved into the cache will be ignored.

The htcacheclean tool is provided to list cached URLs, remove cached URLs, or to maintain the size of the disk cache within size and/or inode limits. The tool can be run on demand, or can be daemonized to offer continuous monitoring of directory sizes.
Note: mod_cache_disk requires the services of mod_cache, which must be loaded before mod_cache_disk.

Note: mod_cache_disk uses the sendfile feature to serve files from the cache when supported by the platform, and when enabled with EnableSendfile. However, per-directory and .htaccess configuration of EnableSendfile are ignored by mod_cache_disk as the corresponding settings are not available to the module when a request is being served from the cache.

Directives

• CacheDirLength
• CacheDirLevels
• CacheMaxFileSize
• CacheMinFileSize
• CacheReadSize
• CacheReadTime
• CacheRoot

See also

• mod_cache
• mod_cache_socache
• Caching Guide (p. 43)

CacheDirLength Directive

Description: The number of characters in subdirectory names
Syntax: CacheDirLength length
Default: CacheDirLength 2
Context: server config, virtual host
Status: Extension
Module: mod_cache_disk

The CacheDirLength directive sets the number of characters for each subdirectory name in the cache hierarchy. It can be used in conjunction with CacheDirLevels to determine the approximate structure of your cache hierarchy.

A high value for CacheDirLength combined with a low value for CacheDirLevels will result in a relatively flat hierarchy, with a large number of subdirectories at each level.

Note: The result of CacheDirLevels * CacheDirLength must not be higher than 20.

CacheDirLevels Directive

Description: The number of levels of subdirectories in the cache.
Syntax: CacheDirLevels levels
Default: CacheDirLevels 2
Context: server config, virtual host
Status: Extension
Module: mod_cache_disk

The CacheDirLevels directive sets the number of subdirectory levels in the cache. Cached data will be saved this many directory levels below the CacheRoot directory.
A high value for CacheDirLevels combined with a low value for CacheDirLength will result in a relatively deep hierarchy, with a small number of subdirectories at each level.

Note: The result of CacheDirLevels * CacheDirLength must not be higher than 20.

CacheMaxFileSize Directive

Description: The maximum size (in bytes) of a document to be placed in the cache
Syntax: CacheMaxFileSize bytes
Default: CacheMaxFileSize 1000000
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache_disk

The CacheMaxFileSize directive sets the maximum size, in bytes, for a document to be considered for storage in the cache.

    CacheMaxFileSize 64000

CacheMinFileSize Directive

Description: The minimum size (in bytes) of a document to be placed in the cache
Syntax: CacheMinFileSize bytes
Default: CacheMinFileSize 1
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache_disk

The CacheMinFileSize directive sets the minimum size, in bytes, for a document to be considered for storage in the cache.

    CacheMinFileSize 64

CacheReadSize Directive

Description: The minimum size (in bytes) of the document to read and be cached before sending the data downstream
Syntax: CacheReadSize bytes
Default: CacheReadSize 0
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache_disk

The CacheReadSize directive sets the minimum amount of data, in bytes, to be read from the backend before the data is sent to the client. The default of zero causes all data read of any size to be passed downstream to the client immediately as it arrives. Setting this to a higher value causes the disk cache to buffer at least this amount before sending the result to the client. This can improve performance when caching content from a reverse proxy.
This directive only takes effect when the data is being saved to the cache, as opposed to data being served from the cache.

    CacheReadSize 102400

CacheReadTime Directive

Description: The minimum time (in milliseconds) that should elapse while reading before data is sent downstream
Syntax: CacheReadTime milliseconds
Default: CacheReadTime 0
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache_disk

The CacheReadTime directive sets the minimum amount of elapsed time that should pass before making an attempt to send data downstream to the client. During the time period, data will be buffered before sending the result to the client. This can improve performance when caching content from a reverse proxy.

The default of zero disables this option.

This directive only takes effect when the data is being saved to the cache, as opposed to data being served from the cache. It is recommended that this option be used alongside the CacheReadSize directive to ensure that the server does not buffer excessively should data arrive faster than expected.

    CacheReadTime 1000

CacheRoot Directive

Description: The directory root under which cache files are stored
Syntax: CacheRoot directory
Context: server config, virtual host
Status: Extension
Module: mod_cache_disk

The CacheRoot directive defines the name of the directory on the disk to contain cache files. If the mod_cache_disk module has been loaded or compiled in to the Apache server, this directive must be defined. Failing to provide a value for CacheRoot will result in a configuration file processing error. The CacheDirLevels and CacheDirLength directives define the structure of the directories under the specified root directory.

    CacheRoot c:/cacheroot
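
As a rough illustration of how the mod_cache_disk directives fit together, a minimal configuration might look like the following sketch. The cache path and the URL prefix passed to CacheEnable are assumptions chosen for the example, and the size limits are arbitrary; CacheEnable itself is a mod_cache directive.

```apache
# Hypothetical cache location (assumption, not a default)
CacheRoot /var/cache/httpd/mod_cache_disk
# Enable the disk storage provider for the whole URL space (assumed prefix "/")
CacheEnable disk /
# Two directory levels of one character each: 2 * 1 = 2, well under the limit of 20
CacheDirLevels 2
CacheDirLength 1
# Only consider documents between 64 and 64000 bytes (illustrative values)
CacheMinFileSize 64
CacheMaxFileSize 64000
```

Raising CacheDirLength while keeping CacheDirLevels low flattens the hierarchy; doing the opposite deepens it, as long as the product of the two stays at or below 20.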
10.32 Apache Module mod_cache_socache

Description: Shared object cache (socache) based storage module for the HTTP caching filter.
Status: Extension
ModuleIdentifier: cache_socache_module
SourceFile: mod_cache_socache.c

Summary

mod_cache_socache implements a shared object cache (socache) based storage manager for mod_cache.

The headers and bodies of cached responses are combined, and stored underneath a single key in the shared object cache. A number of implementations (p. 114) of shared object caches are available to choose from.

Multiple content negotiated responses can be stored concurrently, however the caching of partial content is not yet supported by this module.

    # Turn on caching
    CacheSocache shmcb
    CacheSocacheMaxSize 102400
    CacheEnable socache

    # Fall back to the disk cache
    CacheSocache shmcb
    CacheSocacheMaxSize 102400
    CacheEnable socache
    CacheEnable disk

Note: mod_cache_socache requires the services of mod_cache, which must be loaded before mod_cache_socache.

Directives

• CacheSocache
• CacheSocacheMaxSize
• CacheSocacheMaxTime
• CacheSocacheMinTime
• CacheSocacheReadSize
• CacheSocacheReadTime

See also

• mod_cache
• mod_cache_disk
• Caching Guide (p. 43)

CacheSocache Directive

Description: The shared object cache implementation to use
Syntax: CacheSocache type[:args]
Context: server config, virtual host
Status: Extension
Module: mod_cache_socache
Compatibility: Available in Apache 2.4.5 and later

The CacheSocache directive defines the name of the shared object cache implementation to use, followed by optional arguments for that implementation. A number of implementations (p. 114) of shared object caches are available to choose from.
    CacheSocache shmcb

CacheSocacheMaxSize Directive

Description: The maximum size (in bytes) of an entry to be placed in the cache
Syntax: CacheSocacheMaxSize bytes
Default: CacheSocacheMaxSize 102400
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache_socache
Compatibility: Available in Apache 2.4.5 and later

The CacheSocacheMaxSize directive sets the maximum size, in bytes, for the combined headers and body of a document to be considered for storage in the cache. The larger the headers that are stored alongside the body, the smaller the body may be.

The mod_cache_socache module will only attempt to cache responses that have an explicit content length, or that are small enough to be written in one pass. This is done to allow the mod_cache_disk module to have an opportunity to cache responses larger than those cacheable within mod_cache_socache.

    CacheSocacheMaxSize 102400

CacheSocacheMaxTime Directive

Description: The maximum time (in seconds) for a document to be placed in the cache
Syntax: CacheSocacheMaxTime seconds
Default: CacheSocacheMaxTime 86400
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache_socache
Compatibility: Available in Apache 2.4.5 and later

The CacheSocacheMaxTime directive sets the maximum freshness lifetime, in seconds, for a document to be stored in the cache. This value overrides the freshness lifetime defined for the document by the HTTP protocol.

    CacheSocacheMaxTime 86400
CacheSocacheMinTime Directive

Description: The minimum time (in seconds) for a document to be placed in the cache
Syntax: CacheSocacheMinTime seconds
Default: CacheSocacheMinTime 600
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache_socache
Compatibility: Available in Apache 2.4.5 and later

The CacheSocacheMinTime directive sets the number of seconds beyond the freshness lifetime of the response that the response should be cached for in the shared object cache. If a response is only stored for its freshness lifetime, there will be no opportunity to revalidate the response to make it fresh again.

    CacheSocacheMinTime 600

CacheSocacheReadSize Directive

Description: The minimum size (in bytes) of the document to read and be cached before sending the data downstream
Syntax: CacheSocacheReadSize bytes
Default: CacheSocacheReadSize 0
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache_socache
Compatibility: Available in Apache 2.4.5 and later

The CacheSocacheReadSize directive sets the minimum amount of data, in bytes, to be read from the backend before the data is sent to the client. The default of zero causes all data read of any size to be passed downstream to the client immediately as it arrives. Setting this to a higher value causes the cache to buffer at least this amount before sending the result to the client. This can improve performance when caching content from a slow reverse proxy.

This directive only takes effect when the data is being saved to the cache, as opposed to data being served from the cache.
    CacheSocacheReadSize 102400

CacheSocacheReadTime Directive

Description: The minimum time (in milliseconds) that should elapse while reading before data is sent downstream
Syntax: CacheSocacheReadTime milliseconds
Default: CacheSocacheReadTime 0
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_cache_socache
Compatibility: Available in Apache 2.4.5 and later

The CacheSocacheReadTime directive sets the minimum amount of elapsed time that should pass before making an attempt to send data downstream to the client. During the time period, data will be buffered before sending the result to the client. This can improve performance when caching content from a reverse proxy.

The default of zero disables this option.

This directive only takes effect when the data is being saved to the cache, as opposed to data being served from the cache. It is recommended that this option be used alongside the CacheSocacheReadSize directive to ensure that the server does not buffer excessively should data arrive faster than expected.

    CacheSocacheReadTime 1000

10.33 Apache Module mod_cern_meta

Description: CERN httpd metafile semantics
Status: Extension
ModuleIdentifier: cern_meta_module
SourceFile: mod_cern_meta.c

Summary

Emulate the CERN HTTPD Meta file semantics. Meta files are HTTP headers that can be output in addition to the normal range of headers for each file accessed. They appear rather like the Apache .asis files, and are able to provide a crude way of influencing the Expires: header, as well as providing other curiosities. There are many ways to manage meta information; this one was chosen because there is already a large number of CERN users who can exploit this module.

More information on the CERN metafile semantics [20] is available.
Directives

• MetaDir
• MetaFiles
• MetaSuffix

See also

• mod_headers
• mod_asis

MetaDir Directive

Description: Name of the directory to find CERN-style meta information files
Syntax: MetaDir directory
Default: MetaDir .web
Context: server config, virtual host, directory, .htaccess
Override: Indexes
Status: Extension
Module: mod_cern_meta

Specifies the name of the directory in which Apache can find meta information files. The directory is usually a 'hidden' subdirectory of the directory that contains the file being accessed. Set to "." to look in the same directory as the file:

    MetaDir .

Or, to set it to a subdirectory of the directory containing the files:

    MetaDir .meta

[20] http://www.w3.org/pub/WWW/Daemon/User/Config/General.html#MetaDir

MetaFiles Directive

Description: Activates CERN meta-file processing
Syntax: MetaFiles on|off
Default: MetaFiles off
Context: server config, virtual host, directory, .htaccess
Override: Indexes
Status: Extension
Module: mod_cern_meta

Turns on/off Meta file processing on a per-directory basis.

MetaSuffix Directive

Description: File name suffix for the file containing CERN-style meta information
Syntax: MetaSuffix suffix
Default: MetaSuffix .meta
Context: server config, virtual host, directory, .htaccess
Override: Indexes
Status: Extension
Module: mod_cern_meta

Specifies the file name suffix for the file containing the meta information. For example, the default values for the two directives will cause a request to DOCUMENT_ROOT/somedir/index.html to look in DOCUMENT_ROOT/somedir/.web/index.html.meta and will use its contents to generate additional MIME header information.

Example:

    MetaSuffix .meta
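
To make the lookup described for MetaDir and MetaSuffix concrete, the sketch below enables meta file processing for a single directory. The directory path is an assumption made purely for illustration; the MetaDir and MetaSuffix values are simply the documented defaults spelled out.

```apache
# Hypothetical document directory (assumption for this example)
<Directory "/usr/local/apache2/htdocs/somedir">
    # Turn on CERN meta file processing for this directory
    MetaFiles on
    # Look in the hidden ".web" subdirectory (the default)
    MetaDir .web
    # Meta file name is the requested file name plus ".meta" (the default)
    MetaSuffix .meta
</Directory>
```

With these settings, a request for somedir/index.html would consult somedir/.web/index.html.meta for additional header information.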
10.34 Apache Module mod_cgi

Description: Execution of CGI scripts
Status: Base
ModuleIdentifier: cgi_module
SourceFile: mod_cgi.c

Summary

Any file that has the handler cgi-script will be treated as a CGI script, and run by the server, with its output being returned to the client. Files acquire this handler either by having a name containing an extension defined by the AddHandler directive, or by being in a ScriptAlias directory.

For an introduction to using CGI scripts with Apache, see our tutorial on Dynamic Content With CGI (p. 236).

When using a multi-threaded MPM under Unix, the module mod_cgid should be used in place of this module. At the user level, the two modules are essentially identical.

For backward-compatibility, the cgi-script handler will also be activated for any file with the mime-type application/x-httpd-cgi. The use of the magic mime-type is deprecated.

Directives

• ScriptLog
• ScriptLogBuffer
• ScriptLogLength

See also

• AcceptPathInfo
• Options ExecCGI
• ScriptAlias
• AddHandler
• Running CGI programs under different user IDs (p. 115)
• CGI Specification (http://www.ietf.org/rfc/rfc3875)

CGI Environment variables

The server will set the CGI environment variables as described in the CGI specification (http://www.ietf.org/rfc/rfc3875), with the following provisions:

PATH_INFO
    This will not be available if the AcceptPathInfo directive is explicitly set to off. The default behavior, if AcceptPathInfo is not given, is that mod_cgi will accept path info (trailing /more/path/info following the script filename in the URI), while the core server will return a 404 NOT FOUND error for requests with additional path info. Omitting the AcceptPathInfo directive has the same effect as setting it On for mod_cgi requests.

REMOTE_HOST
    This will only be set if HostnameLookups is set to on (it is off by default), and if a reverse DNS lookup of the accessing host's address indeed finds a host name.
REMOTE_IDENT
    This will only be set if IdentityCheck is set to on and the accessing host supports the ident protocol. Note that the contents of this variable cannot be relied upon because it can easily be faked, and if there is a proxy between the client and the server, it is usually totally useless.

REMOTE_USER
    This will only be set if the CGI script is subject to authentication.

This module also leverages the core functions ap_add_common_vars [23] and ap_add_cgi_vars [24] to add environment variables like:

DOCUMENT_ROOT
    Set with the content of the related DocumentRoot directive.

SERVER_NAME
    The fully qualified domain name related to the request.

SERVER_ADDR
    The IP address of the Virtual Host serving the request.

SERVER_ADMIN
    Set with the content of the related ServerAdmin directive.

For an exhaustive list it is suggested to write a basic CGI script that dumps all the environment variables passed by Apache in a convenient format.

CGI Debugging

Debugging CGI scripts has traditionally been difficult, mainly because it has not been possible to study the output (standard output and error) for scripts which are failing to run properly. These directives provide more detailed logging of errors when they occur.

CGI Logfile Format

When configured, the CGI error log logs any CGI which does not execute properly. Each CGI script which fails to operate causes several lines of information to be logged.
The first two lines are always of the format:

    %% [time] request-line
    %% HTTP-status CGI-script-filename

If the error is that the CGI script cannot be run, the log file will contain an extra two lines:

    %%error
    error-message

Alternatively, if the error is the result of the script returning incorrect header information (often due to a bug in the script), the following information is logged:

    %request
    All HTTP request headers received
    POST or PUT entity (if any)
    %response
    All headers output by the CGI script
    %stdout
    CGI standard output
    %stderr
    CGI standard error

(The %stdout and %stderr parts may be missing if the script did not output anything on standard output or standard error).

[23] https://ci.apache.org/projects/httpd/trunk/doxygen/group__APACHE__CORE__SCRIPT.html#ga0e81f9571a8a73f5da0e89e1f46d34b1
[24] https://ci.apache.org/projects/httpd/trunk/doxygen/group__APACHE__CORE__SCRIPT.html#ga6b975cd7ff27a338cb8752381a4cc14f

ScriptLog Directive

Description: Location of the CGI script error logfile
Syntax: ScriptLog file-path
Context: server config, virtual host
Status: Base
Module: mod_cgi, mod_cgid

The ScriptLog directive sets the CGI script error logfile. If no ScriptLog is given, no error log is created. If given, any CGI errors are logged into the filename given as argument. If this is a relative file or path it is taken relative to the ServerRoot.

Example

    ScriptLog logs/cgi_log

This log will be opened as the user the child processes run as, i.e. the user specified in the main User directive. This means that either the directory the script log is in needs to be writable by that user or the file needs to be manually created and set to be writable by that user. If you place the script log in your main logs directory, do NOT change the directory permissions to make it writable by the user the child processes run as.
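
A debugging setup along these lines could be sketched as follows. The log path repeats the ScriptLog example and is relative to ServerRoot; the ScriptLogBuffer and ScriptLogLength values are arbitrary choices for illustration, using the two related directives provided by this module.

```apache
# Debugging only: log failing CGI requests (path relative to ServerRoot)
ScriptLog logs/cgi_log
# Record up to 2048 bytes of each PUT or POST body (illustrative value)
ScriptLogBuffer 2048
# Stop writing once the logfile exceeds 1048576 bytes (illustrative value)
ScriptLogLength 1048576
```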
Note that script logging is meant to be a debugging feature when writing CGI scripts, and is not meant to be activated continuously on running servers. It is not optimized for speed or efficiency, and may have security problems if used in a manner other than that for which it was designed.

ScriptLogBuffer Directive

Description: Maximum amount of PUT or POST requests that will be recorded in the scriptlog
Syntax: ScriptLogBuffer bytes
Default: ScriptLogBuffer 1024
Context: server config, virtual host
Status: Base
Module: mod_cgi, mod_cgid

The size of any PUT or POST entity body that is logged to the file is limited, to prevent the log file growing too big too quickly if large bodies are being received. By default, up to 1024 bytes are logged, but this can be changed with this directive.

ScriptLogLength Directive

Description: Size limit of the CGI script logfile
Syntax: ScriptLogLength bytes
Default: ScriptLogLength 10385760
Context: server config, virtual host
Status: Base
Module: mod_cgi, mod_cgid

ScriptLogLength can be used to limit the size of the CGI script logfile. Since the logfile logs a lot of information per CGI error (all request headers, all script output) it can grow to be a big file. To prevent problems due to unbounded growth, this directive can be used to set a maximum file size for the CGI logfile. If the file exceeds this size, no more information will be written to it.

10.35 Apache Module mod_cgid

Description: Execution of CGI scripts using an external CGI daemon
Status: Base
ModuleIdentifier: cgid_module
SourceFile: mod_cgid.c
Compatibility: Unix threaded MPMs only

Summary

Except for the optimizations and the additional ScriptSock directive noted below, mod_cgid behaves similarly to mod_cgi. See the mod_cgi summary for additional details about Apache and CGI.
On certain Unix operating systems, forking a process from a multi-threaded server is a very expensive operation because the new process will replicate all the threads of the parent process. In order to avoid incurring this expense on each CGI invocation, mod_cgid creates an external daemon that is responsible for forking child processes to run CGI scripts. The main server communicates with this daemon using a Unix domain socket.

This module is used by default instead of mod_cgi whenever a multi-threaded MPM is selected during the compilation process. At the user level, this module is identical in configuration and operation to mod_cgi. The only exception is the additional directive ScriptSock which gives the name of the socket to use for communication with the CGI daemon.

Directives

• CGIDScriptTimeout
• ScriptLog (p. 582)
• ScriptLogBuffer (p. 582)
• ScriptLogLength (p. 582)
• ScriptSock

See also

• mod_cgi
• Running CGI programs under different user IDs (p. 115)

CGIDScriptTimeout Directive

Description: The length of time to wait for more output from the CGI program
Syntax: CGIDScriptTimeout time[s|ms]
Default: value of Timeout directive when unset
Context: server config, virtual host, directory, .htaccess
Status: Base
Module: mod_cgid
Compatibility: CGIDScriptTimeout defaults to zero in releases 2.4 and earlier

This directive limits the length of time to wait for more output from the CGI program. If the time is exceeded, the request and CGI are terminated.

Example

    CGIDScriptTimeout 20

ScriptSock Directive

Description: The filename prefix of the socket to use for communication with the cgi daemon
Syntax: ScriptSock file-path
Default: ScriptSock cgisock
Context: server config
Status: Base
Module: mod_cgid

This directive sets the filename prefix of the socket to use for communication with the CGI daemon. An extension corresponding to the process ID of the server will be appended.
The socket will be opened using the permissions of the user who starts Apache (usually root). To maintain the security of communications with CGI scripts, it is important that no other user has permission to write in the directory where the socket is located.

If file-path is not an absolute path, the location specified will be relative to the value of DefaultRuntimeDir.

Example
    ScriptSock /var/run/cgid.sock

10.36 Apache Module mod_charset_lite

Description:      Specify character set translation or recoding
Status:           Extension
ModuleIdentifier: charset_lite_module
SourceFile:       mod_charset_lite.c

Summary

mod_charset_lite allows the server to change the character set of responses before sending them to the client. In an EBCDIC environment, Apache always translates HTTP protocol content (e.g. response headers) from the code page of the Apache process locale to ISO-8859-1, but not the body of responses. In any environment, mod_charset_lite can be used to specify that response bodies should be translated. For example, if files are stored in EBCDIC, then mod_charset_lite can translate them to ISO-8859-1 before sending them to the client.

This module provides a small subset of configuration mechanisms implemented by Russian Apache and its associated mod_charset.

Directives
• CharsetDefault
• CharsetOptions
• CharsetSourceEnc

Common Problems

Invalid character set names

The character set name parameters of CharsetSourceEnc and CharsetDefault must be acceptable to the translation mechanism used by APR on the system where mod_charset_lite is deployed. These character set names are not standardized and are usually not the same as the corresponding values used in HTTP headers.
Currently, APR can only use iconv(3), so you can easily test your character set names using the iconv(1) program, as follows:

    iconv -f charsetsourceenc-value -t charsetdefault-value

Mismatch between character set of content and translation rules

If the translation rules don't make sense for the content, translation can fail in various ways, including:
• The translation mechanism may return a bad return code, and the connection will be aborted.
• The translation mechanism may silently place special characters (e.g., question marks) in the output buffer when it cannot translate the input buffer.

CharsetDefault Directive

Description: Charset to translate into
Syntax:      CharsetDefault charset
Context:     server config, virtual host, directory, .htaccess
Override:    FileInfo
Status:      Extension
Module:      mod_charset_lite

The CharsetDefault directive specifies the charset that content in the associated container should be translated to.

The value of the charset argument must be accepted as a valid character set name by the character set support in APR. Generally, this means that it must be supported by iconv.

Example
    CharsetSourceEnc UTF-16BE
    CharsetDefault ISO-8859-1

Note: Specifying the same charset for both CharsetSourceEnc and CharsetDefault disables translation. The charset need not match the charset of the response, but it must be a valid charset on the system.

CharsetOptions Directive

Description: Configures charset translation behavior
Syntax:      CharsetOptions option [option] ...
Default:     CharsetOptions ImplicitAdd
Context:     server config, virtual host, directory, .htaccess
Override:    FileInfo
Status:      Extension
Module:      mod_charset_lite

The CharsetOptions directive configures certain behaviors of mod_charset_lite.
Option can be one of:

ImplicitAdd | NoImplicitAdd
    The ImplicitAdd keyword specifies that mod_charset_lite should implicitly insert its filter when the configuration specifies that the character set of content should be translated. If the filter chain is explicitly configured using the AddOutputFilter directive, NoImplicitAdd should be specified so that mod_charset_lite doesn't add its filter.

TranslateAllMimeTypes | NoTranslateAllMimeTypes
    Normally, mod_charset_lite will only perform translation on a small subset of possible mimetypes. When the TranslateAllMimeTypes keyword is specified for a given configuration section, translation is performed without regard for mimetype.

CharsetSourceEnc Directive

Description: Source charset of files
Syntax:      CharsetSourceEnc charset
Context:     server config, virtual host, directory, .htaccess
Override:    FileInfo
Status:      Extension
Module:      mod_charset_lite

The CharsetSourceEnc directive specifies the source charset of files in the associated container.

The value of the charset argument must be accepted as a valid character set name by the character set support in APR. Generally, this means that it must be supported by iconv.

Example
    CharsetSourceEnc UTF-16BE
    CharsetDefault ISO-8859-1

The character set names in this example work with the iconv translation support in Solaris 8.

Note: Specifying the same charset for both CharsetSourceEnc and CharsetDefault disables translation. The charset need not match the charset of the response, but it must be a valid charset on the system.

10.37 Apache Module mod_data

Description:      Convert response body into an RFC2397 data URL
Status:           Extension
ModuleIdentifier: data_module
SourceFile:       mod_data.c
Compatibility:    Available in Apache 2.3 and later

Summary

This module provides the ability to convert a response into an RFC2397 data URL [25].
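As a rough illustration of what such a URL looks like, the following Python sketch builds one from raw bytes using the data:<mediatype>;base64,<data> form defined in RFC 2397. The function name to_data_url and its default media type are assumptions made for this example; it is not a description of how mod_data is implemented.

```python
import base64


def to_data_url(body: bytes, media_type: str = "application/octet-stream") -> str:
    """Return an RFC 2397 data URL carrying `body` as base64 text.

    `to_data_url` is a hypothetical helper for illustration only.
    """
    # base64 keeps arbitrary binary content (for example a small GIF)
    # safe to embed in an HTML attribute or a CSS url(...) value.
    payload = base64.b64encode(body).decode("ascii")
    return f"data:{media_type};base64,{payload}"


if __name__ == "__main__":
    print(to_data_url(b"hello", "text/plain"))
```

The resulting string can be placed wherever a page would otherwise reference a separate small resource, which is the use case the summary describes.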
Data URLs can be embedded inline within web pages using something like the mod_include module, to remove the need for clients to make separate connections to fetch what may potentially be many small images. Data URLs may also be included into pages generated by scripting languages such as PHP.

An example of a data URL
    AAAC8IyPqcvt3wCcDkiLc7C0qwyGHhSWpjQu5yqmCYsapyuvUUlvONmOZtfzgFz
    ByTB10QgxOR0TqBQejhRNzOfkVJ+5YiUqrXF5Y5lKh/DeuNcP5yLWGsEbtLiOSp
    a/TPg7JpJHxyendzWTBfX0cxOnKPjgBzi4diinWGdkF8kjdfnycQZXZeYGejmJl
    ZeGl9i2icVqaNVailT6F5iJ90m6mvuTS4OK05M0vDk0Q4XUtwvKOzrcd3iq9uis
    F81M1OIcR7lEewwcLp7tuNNkM3uNna3F2JQFo97Vriy/Xl4/f1cf5VWzXyym7PH
    hhx4dbgYKAAA7

The filter takes no parameters, and can be added to the filter stack using the SetOutputFilter directive, or any of the directives supported by the mod_filter module.

Configuring the filter
    SetOutputFilter DATA

Directives

This module provides no directives.

See also
• Filters (p. 110)

[25] http://tools.ietf.org/html/rfc2397

10.38 Apache Module mod_dav

Description:      Distributed Authoring and Versioning (WebDAV [26]) functionality
Status:           Extension
ModuleIdentifier: dav_module
SourceFile:       mod_dav.c

Summary

This module provides class 1 and class 2 WebDAV [27] ('Web-based Distributed Authoring and Versioning') functionality for Apache. This extension to the HTTP protocol allows creating, moving, copying, and deleting resources and collections on a remote web server.

Directives
• Dav
• DavDepthInfinity
• DavMinTimeout

See also
• DavLockDB
• LimitXMLRequestBody
• WebDAV Resources [28]

Enabling WebDAV

To enable mod_dav, add the following to a container in your httpd.conf file:

    Dav On

This enables the DAV file system provider, which is implemented by the mod_dav_fs module. Therefore, that module must be compiled into the server or loaded at runtime using the LoadModule directive.
In addition, a location for the DAV lock database must be specified in the global section of your httpd.conf file using the DavLockDB directive:

    DavLockDB "/usr/local/apache2/var/DavLock"

The directory containing the lock database file must be writable by the User and Group under which Apache is running.

You may wish to add a <Limit> clause inside the <Location> directive to limit access to DAV-enabled locations. If you want to set the maximum number of bytes that a DAV client can send in one request, you have to use the LimitXMLRequestBody directive. The "normal" LimitRequestBody directive has no effect on DAV requests.

[27] http://www.webdav.org
[28] http://www.webdav.org

Full Example
    DavLockDB "/usr/local/apache2/var/DavLock"

    Require all granted
    Dav On

    AuthType Basic
    AuthName "DAV"
    AuthUserFile "user.passwd"

    Require user admin

Security Issues

Since DAV access methods allow remote clients to manipulate files on the server, you must take particular care to ensure that your server is secure before enabling mod_dav.

Any location on the server where DAV is enabled should be protected by authentication. The use of HTTP Basic Authentication is not recommended. You should use at least HTTP Digest Authentication, which is provided by the mod_auth_digest module. Nearly all WebDAV clients support this authentication method. An alternative is Basic Authentication over an SSL (p. 192) enabled connection.

In order for mod_dav to manage files, it must be able to write to the directories and files under its control using the User and Group under which Apache is running. New files created will also be owned by this User and Group. For this reason, it is important to control access to this account. The DAV repository is considered private to Apache; modifying files outside of Apache (for example using FTP or filesystem-level tools) should not be allowed.

mod_dav may be subject to various kinds of denial-of-service attacks.
The LimitXMLRequestBody directive can be used to limit the amount of memory consumed in parsing large DAV requests. The DavDepthInfinity directive can be used to prevent PROPFIND requests on a very large repository from consuming large amounts of memory. Another possible denial-of-service attack involves a client simply filling up all available disk space with many large files. There is no direct way to prevent this in Apache, so you should avoid giving DAV access to untrusted users.

Complex Configurations

One common request is to use mod_dav to manipulate dynamic files (PHP scripts, CGI scripts, etc). This is difficult because a GET request will always run the script, rather than downloading its contents. One way to avoid this is to map two different URLs to the content, one of which will run the script, and one of which will allow it to be downloaded and manipulated with DAV.

    Alias "/phparea" "/home/gstein/php_files"
    Alias "/php-source" "/home/gstein/php_files"
    <Location "/php-source">
        Dav On
        ForceType text/plain
    </Location>

With this setup, http://example.com/phparea can be used to access the output of the PHP scripts, and http://example.com/php-source can be used with a DAV client to manipulate them.

Dav Directive

Description: Enable WebDAV HTTP methods
Syntax:      Dav On|Off|provider-name
Default:     Dav Off
Context:     directory
Status:      Extension
Module:      mod_dav

Use the Dav directive to enable the WebDAV HTTP methods for the given container:

    Dav On

The value On is actually an alias for the default provider filesystem, which is served by the mod_dav_fs module. Note that once you have DAV enabled for some location, it cannot be disabled for sublocations. For a complete configuration example, have a look at the section above.

Warning: Do not enable WebDAV until you have secured your server. Otherwise everyone will be able to distribute files on your system.
DavDepthInfinity Directive

Description: Allow PROPFIND, Depth: Infinity requests
Syntax:      DavDepthInfinity on|off
Default:     DavDepthInfinity off
Context:     server config, virtual host, directory
Status:      Extension
Module:      mod_dav

Use the DavDepthInfinity directive to allow the processing of PROPFIND requests containing the header 'Depth: Infinity'. Because this type of request could constitute a denial-of-service attack, by default it is not allowed.

DavMinTimeout Directive

Description: Minimum amount of time the server holds a lock on a DAV resource
Syntax:      DavMinTimeout seconds
Default:     DavMinTimeout 0
Context:     server config, virtual host, directory
Status:      Extension
Module:      mod_dav

When a client requests a DAV resource lock, it can also specify a time when the lock will be automatically removed by the server. This value is only a request, and the server can ignore it or inform the client of an arbitrary value.

Use the DavMinTimeout directive to specify, in seconds, the minimum lock timeout to return to a client. Microsoft Web Folders defaults to a timeout of 120 seconds; DavMinTimeout can override this to a higher value (like 600 seconds) to reduce the chance of the client losing the lock due to network latency.

Example
    DavMinTimeout 600

10.39 Apache Module mod_dav_fs

Description:      Filesystem provider for mod_dav
Status:           Extension
ModuleIdentifier: dav_fs_module
SourceFile:       mod_dav_fs.c

Summary

This module requires the service of mod_dav. It acts as a support module for mod_dav and provides access to resources located in the server's file system. The formal name of this provider is filesystem. mod_dav backend providers will be invoked by using the Dav directive:

Example
    Dav filesystem

Since filesystem is the default provider for mod_dav, you may simply use the value On instead.
Directives
• DavLockDB

See also
• mod_dav

DavLockDB Directive

Description: Location of the DAV lock database
Syntax:      DavLockDB file-path
Context:     server config, virtual host
Status:      Extension
Module:      mod_dav_fs

Use the DavLockDB directive to specify the full path to the lock database, excluding an extension. If the path is not absolute, it will be taken relative to ServerRoot. The implementation of mod_dav_fs uses an SDBM database to track user locks.

Example
    DavLockDB var/DavLock

The directory containing the lock database file must be writable by the User and Group under which Apache is running. For security reasons, you should create a directory for this purpose rather than changing the permissions on an existing directory. In the above example, Apache will create files in the var/ directory under the ServerRoot with the base filename DavLock and an extension chosen by the server.

10.40 Apache Module mod_dav_lock

Description:      Generic locking module for mod_dav
Status:           Extension
ModuleIdentifier: dav_lock_module
SourceFile:       mod_dav_lock.c

Summary

This module implements a generic locking API which can be used by any backend provider of mod_dav. It requires at least the service of mod_dav, but without a backend provider that makes use of it, it is useless and should not be loaded into the server. A sample backend module which actually utilizes mod_dav_lock is mod_dav_svn [29], the subversion provider module.

Note that mod_dav_fs does not need this generic locking module, because it uses its own more specialized version.

In order to make mod_dav_lock functional, you just have to specify the location of the lock database using the DavGenericLockDB directive described below.

Developer's Note: In order to retrieve the pointer to the locking provider function, you have to use the ap_lookup_provider API with the arguments dav-lock, generic, and 0.
Directives
• DavGenericLockDB

See also
• mod_dav

DavGenericLockDB Directive

Description: Location of the DAV lock database
Syntax:      DavGenericLockDB file-path
Context:     server config, virtual host, directory
Status:      Extension
Module:      mod_dav_lock

Use the DavGenericLockDB directive to specify the full path to the lock database, excluding an extension. If the path is not absolute, it will be interpreted relative to ServerRoot. The implementation of mod_dav_lock uses an SDBM database to track user locks.

Example
    DavGenericLockDB var/DavLock

The directory containing the lock database file must be writable by the User and Group under which Apache is running. For security reasons, you should create a directory for this purpose rather than changing the permissions on an existing directory. In the above example, Apache will create files in the var/ directory under the ServerRoot with the base filename DavLock and an extension added by the server.

[29] http://subversion.apache.org/

10.41 Apache Module mod_dbd

Description:      Manages SQL database connections
Status:           Extension
ModuleIdentifier: dbd_module
SourceFile:       mod_dbd.c

Summary

mod_dbd manages SQL database connections using APR. It provides database connections on request to modules requiring SQL database functions, and takes care of managing databases with optimal efficiency and scalability for both threaded and non-threaded MPMs. For details, see the APR [30] website and this overview of the Apache DBD Framework [31] by its original developer.

Directives
• DBDExptime
• DBDInitSQL
• DBDKeep
• DBDMax
• DBDMin
• DBDParams
• DBDPersist
• DBDPrepareSQL
• DBDriver

See also
• Password Formats (p. 371)

Connection Pooling

This module manages database connections, in a manner optimised for the platform. On non-threaded platforms, it provides a persistent connection in the manner of classic LAMP (Linux, Apache, MySQL, Perl/PHP/Python).
On threaded platforms, it provides an altogether more scalable and efficient connection pool, as described in this article at ApacheTutor [32]. Note that mod_dbd supersedes the modules presented in that article.

[30] http://apr.apache.org/
[31] http://people.apache.org/~niq/dbd.html
[32] http://www.apachetutor.org/dev/reslist

Apache DBD API

mod_dbd exports five functions for other modules to use. The API is as follows:

    typedef struct {
        apr_dbd_t *handle;
        apr_dbd_driver_t *driver;
        apr_hash_t *prepared;
    } ap_dbd_t;

    /* Export functions to access the database */

    /* acquire a connection that MUST be explicitly closed.
     * Returns NULL on error
     */
    AP_DECLARE(ap_dbd_t*) ap_dbd_open(apr_pool_t*, server_rec*);

    /* release a connection acquired with ap_dbd_open */
    AP_DECLARE(void) ap_dbd_close(server_rec*, ap_dbd_t*);

    /* acquire a connection that will have the lifetime of a request
     * and MUST NOT be explicitly closed. Return NULL on error.
     * This is the preferred function for most applications.
     */
    AP_DECLARE(ap_dbd_t*) ap_dbd_acquire(request_rec*);

    /* acquire a connection that will have the lifetime of a connection
     * and MUST NOT be explicitly closed. Return NULL on error.
     */
    AP_DECLARE(ap_dbd_t*) ap_dbd_cacquire(conn_rec*);

    /* Prepare a statement for use by a client module */
    AP_DECLARE(void) ap_dbd_prepare(server_rec*, const char*, const char*);

    /* Also export them as optional functions for modules that prefer it */
    APR_DECLARE_OPTIONAL_FN(ap_dbd_t*, ap_dbd_open, (apr_pool_t*, server_rec*));
    APR_DECLARE_OPTIONAL_FN(void, ap_dbd_close, (server_rec*, ap_dbd_t*));
    APR_DECLARE_OPTIONAL_FN(ap_dbd_t*, ap_dbd_acquire, (request_rec*));
    APR_DECLARE_OPTIONAL_FN(ap_dbd_t*, ap_dbd_cacquire, (conn_rec*));
    APR_DECLARE_OPTIONAL_FN(void, ap_dbd_prepare, (server_rec*, const char*, const char*));

SQL Prepared Statements

mod_dbd supports SQL prepared statements on behalf of modules that may wish to use them.
Each prepared statement must be assigned a name (label), and they are stored in a hash: the prepared field of an ap_dbd_t. Hash entries are of type apr_dbd_prepared_t and can be used in any of the apr_dbd prepared statement SQL query or select commands.

It is up to dbd user modules to use the prepared statements and document what statements can be specified in httpd.conf, or to provide their own directives and use ap_dbd_prepare.

Caveat: When using prepared statements with a MySQL database, it is preferred to set reconnect to 0 in the connection string, so as to avoid errors that arise from the MySQL client reconnecting without properly resetting the prepared statements. If set to 1, an attempt will be made to repair any broken connection, but because mod_dbd is not informed, the prepared statements will be invalidated.

SECURITY WARNING

Any web/database application needs to secure itself against SQL injection attacks. In most cases, Apache DBD is safe, because applications use prepared statements, and untrusted inputs are only ever used as data. Of course, if you use it via third-party modules, you should ascertain what precautions they may require.

However, the FreeTDS driver is inherently unsafe. The underlying library doesn't support prepared statements, so the driver emulates them, and the untrusted input is merged into the SQL statement.

It can be made safe by untainting all inputs: a process inspired by Perl's taint checking. Each input is matched against a regexp, and only the match is used, according to the Perl idiom:

    $untrusted =~ /([a-z]+)/;
    $trusted = $1;

To use this, the untainting regexps must be included in the prepared statements configured. The regexp follows immediately after the % in the prepared statement, and is enclosed in curly brackets {}. For example, if your application expects alphanumeric input, you can use:

    "SELECT foo FROM bar WHERE input = %s"

with other drivers, and suffer nothing worse than a failed query.
But with FreeTDS you'd need:

    "SELECT foo FROM bar WHERE input = %{([A-Za-z0-9]+)}s"

Now anything that doesn't match the regexp's $1 match is discarded, so the statement is safe.

An alternative to this may be the third-party ODBC driver, which offers the security of genuine prepared statements.

DBDExptime Directive

Description: Keepalive time for idle connections
Syntax:      DBDExptime time-in-seconds
Default:     DBDExptime 300
Context:     server config, virtual host
Status:      Extension
Module:      mod_dbd

Set the time to keep idle connections alive when the number of connections specified in DBDKeep has been exceeded (threaded platforms only).

DBDInitSQL Directive

Description: Execute an SQL statement after connecting to a database
Syntax:      DBDInitSQL "SQL statement"
Context:     server config, virtual host
Status:      Extension
Module:      mod_dbd

Modules that wish to can have one or more SQL statements executed when a connection to a database is created. Example usage could be initializing certain values or adding a log entry when a new connection is made to the database.

DBDKeep Directive

Description: Maximum sustained number of connections
Syntax:      DBDKeep number
Default:     DBDKeep 2
Context:     server config, virtual host
Status:      Extension
Module:      mod_dbd

Set the maximum number of connections per process to be sustained, other than for handling peak demand (threaded platforms only).

DBDMax Directive

Description: Maximum number of connections
Syntax:      DBDMax number
Default:     DBDMax 10
Context:     server config, virtual host
Status:      Extension
Module:      mod_dbd

Set the hard maximum number of connections per process (threaded platforms only).

DBDMin Directive

Description: Minimum number of connections
Syntax:      DBDMin number
Default:     DBDMin 1
Context:     server config, virtual host
Status:      Extension
Module:      mod_dbd

Set the minimum number of connections per process (threaded platforms only).
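The interplay of DBDKeep and DBDExptime on threaded platforms can be pictured with a small model. The Python sketch below is a simplified illustration inferred from the directive descriptions above, not mod_dbd's actual code; the function and parameter names are hypothetical. It assumes that idle connections are closed only while more than DBDKeep of them are idle, and only once they have been idle for at least DBDExptime seconds.

```python
from typing import List


def connections_to_close(idle_ages: List[float], keep: int, exptime: float) -> List[float]:
    """Pick which idle connections a pool would close during maintenance.

    idle_ages: seconds each idle connection has been unused.
    keep:      sustained maximum of idle connections (models DBDKeep).
    exptime:   idle time after which a surplus connection may be closed
               (models DBDExptime).
    Returns the idle ages of the connections selected for closing.
    """
    # Only the surplus above `keep` is eligible for closing at all.
    surplus = max(0, len(idle_ages) - keep)
    # Among idle connections, only those past `exptime` may be closed,
    # and the longest-idle ones go first.
    expired = sorted((age for age in idle_ages if age >= exptime), reverse=True)
    return expired[:surplus]


if __name__ == "__main__":
    # Four idle connections, keep 2, expire after 300 s: the two oldest close.
    print(connections_to_close([400, 350, 10, 5], keep=2, exptime=300))
```

Under this model, raising DBDKeep retains more warm connections between bursts, while lowering DBDExptime releases surplus connections sooner after a burst ends.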
DBDParams Directive

Description: Parameters for database connection
Syntax:      DBDParams param1=value1[,param2=value2]
Context:     server config, virtual host
Status:      Extension
Module:      mod_dbd

As required by the underlying driver. Typically this will be used to pass whatever cannot be defaulted amongst username, password, database name, hostname and port number for connection.

Connection string parameters for current drivers include:

FreeTDS (for MSSQL and SyBase)
    username, password, appname, dbname, host, charset, lang, server
MySQL
    host, port, user, pass, dbname, sock, flags, fldsz, group, reconnect
Oracle
    user, pass, dbname, server
PostgreSQL
    The connection string is passed straight through to PQconnectdb
SQLite2
    The connection string is split on a colon, and part1:part2 is used as sqlite_open(part1, atoi(part2), NULL)
SQLite3
    The connection string is passed straight through to sqlite3_open
ODBC
    datasource, user, password, connect, ctimeout, stimeout, access, txmode, bufsize

DBDPersist Directive

Description: Whether to use persistent connections
Syntax:      DBDPersist On|Off
Context:     server config, virtual host
Status:      Extension
Module:      mod_dbd

If set to Off, persistent and pooled connections are disabled. A new database connection is opened when requested by a client, and closed immediately on release. This option is for debugging and low-usage servers.

The default is to enable a pool of persistent connections (or a single LAMP-style persistent connection in the case of a non-threaded server), and should almost always be used in operation.

Prior to version 2.2.2, this directive accepted only the values 0 and 1 instead of Off and On, respectively.
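For drivers that take a comma-separated param=value list, the DBDParams syntax shown earlier can be pictured as a simple key/value mapping. The Python helper below, parse_dbd_params, is hypothetical and deliberately simplified; each real driver does its own parsing and may treat whitespace, quoting, or unknown keys differently.

```python
from typing import Dict


def parse_dbd_params(params: str) -> Dict[str, str]:
    """Split 'param1=value1,param2=value2' into a dictionary.

    Hypothetical helper for illustration; not part of mod_dbd or APR.
    """
    result: Dict[str, str] = {}
    for item in params.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate a trailing comma
        key, sep, value = item.partition("=")
        if not sep:
            raise ValueError(f"expected param=value, got {item!r}")
        result[key.strip()] = value.strip()
    return result


if __name__ == "__main__":
    # Keys taken from the MySQL parameter list above; values are made up.
    print(parse_dbd_params("host=localhost,port=3306,dbname=test"))
```

A check like this can be handy for validating a connection string in a deployment script before it is written into httpd.conf.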
DBDPrepareSQL Directive

Description: Define an SQL prepared statement
Syntax:      DBDPrepareSQL "SQL statement" label
Context:     server config, virtual host
Status:      Extension
Module:      mod_dbd

For modules such as authentication that repeatedly use a single SQL statement, optimum performance is achieved by preparing the statement at startup rather than every time it is used. This directive prepares an SQL statement and assigns it a label.

DBDriver Directive

Description: Specify an SQL driver
Syntax:      DBDriver name
Context:     server config, virtual host
Status:      Extension
Module:      mod_dbd

Selects an apr_dbd driver by name. The driver must be installed on your system (on most systems, it will be a shared object or dll). For example, DBDriver mysql will select the MySQL driver in apr_dbd_mysql.so.

10.42 Apache Module mod_deflate

Description:      Compress content before it is delivered to the client
Status:           Extension
ModuleIdentifier: deflate_module
SourceFile:       mod_deflate.c

Summary

The mod_deflate module provides the DEFLATE output filter that allows output from your server to be compressed before being sent to the client over the network.

Directives
• DeflateAlterETag
• DeflateBufferSize
• DeflateCompressionLevel
• DeflateFilterNote
• DeflateInflateLimitRequestBody
• DeflateInflateRatioBurst
• DeflateInflateRatioLimit
• DeflateMemLevel
• DeflateWindowSize

See also
• Filters (p. 110)

Supported Encodings

The gzip encoding is the only one supported to ensure complete compatibility with old browser implementations. The deflate encoding is not supported; please check the zlib documentation [33] for a complete explanation.

Sample Configurations

Compression and TLS: Some web applications are vulnerable to an information disclosure attack when a TLS connection carries deflate compressed data. For more information, review the details of the "BREACH" family of attacks.

This is a simple configuration that compresses common text-based content types.
Compress only a few types
    AddOutputFilterByType DEFLATE text/html text/plain text/xml text/css text/javascript applica

[33] http://www.gzip.org/zlib/zlib_faq.html#faq38

Enabling Compression

Compression and TLS: Some web applications are vulnerable to an information disclosure attack when a TLS connection carries deflate compressed data. For more information, review the details of the "BREACH" family of attacks.

Output Compression

Compression is implemented by the DEFLATE filter (p. 110). The following directive will enable compression for documents in the container where it is placed:

    SetOutputFilter DEFLATE
    SetEnvIfNoCase Request_URI \.(?:gif|jpe?g|png)$ no-gzip

If you want to restrict the compression to particular MIME types in general, you may use the AddOutputFilterByType directive. Here is an example of enabling compression only for the html files of the Apache documentation:

    AddOutputFilterByType DEFLATE text/html

Note: The DEFLATE filter is always inserted after RESOURCE filters like PHP or SSI. It never touches internal subrequests.

Note: There is an environment variable force-gzip, set via SetEnv, which will ignore the accept-encoding setting of your browser and will send compressed output.

Output Decompression

The mod_deflate module also provides a filter for inflating/uncompressing a gzip compressed response body. In order to activate this feature you have to insert the INFLATE filter into the output filter chain using SetOutputFilter or AddOutputFilter, for example:

    ProxyPass "http://example.com/"
    SetOutputFilter INFLATE

This example will uncompress gzip'ed output from example.com, so other filters can do further processing with it.

Input Decompression

The mod_deflate module also provides a filter for decompressing a gzip compressed request body.
In order to activate this feature you have to insert the DEFLATE filter into the input filter chain using SetInputFilter or AddInputFilter, for example:

    SetInputFilter DEFLATE

Now if a request contains a Content-Encoding: gzip header, the body will be automatically decompressed. Few browsers have the ability to gzip request bodies. However, some special applications actually do support request compression, for instance some WebDAV [34] clients.

Note on Content-Length: If you evaluate the request body yourself, don't trust the Content-Length header! The Content-Length header reflects the length of the incoming data from the client and not the byte count of the decompressed data stream.

Dealing with proxy servers

The mod_deflate module sends a Vary: Accept-Encoding HTTP response header to alert proxies that a cached response should be sent only to clients that send the appropriate Accept-Encoding request header. This prevents compressed content from being sent to a client that will not understand it.

If you use some special exclusions dependent on, for example, the User-Agent header, you must manually configure an addition to the Vary header to alert proxies of the additional restrictions. For example, in a typical configuration where the addition of the DEFLATE filter depends on the User-Agent, you should add:

    Header append Vary User-Agent

If your decision about compression depends on other information than request headers (e.g. HTTP version), you have to set the Vary header to the value *. This prevents compliant proxies from caching entirely.

Example
    Header set Vary *

Serving pre-compressed content

Since mod_deflate re-compresses content each time a request is made, some performance benefit can be derived by pre-compressing the content and telling mod_deflate to serve them without re-compressing them.
This may be accomplished using a configuration like the following:

    # Serve gzip compressed CSS files if they exist
    # and the client accepts gzip.
    RewriteCond "%{HTTP:Accept-encoding}" "gzip"
    RewriteCond "%{REQUEST_FILENAME}\.gz" "-s"
    RewriteRule "^(.*)\.css" "$1\.css\.gz" [QSA]

    # Serve gzip compressed JS files if they exist
    # and the client accepts gzip.
    RewriteCond "%{HTTP:Accept-encoding}" "gzip"
    RewriteCond "%{REQUEST_FILENAME}\.gz" "-s"
    RewriteRule "^(.*)\.js" "$1\.js\.gz" [QSA]

    # Serve correct content types, and prevent mod_deflate double gzip.
    RewriteRule "\.css\.gz$" "-" [T=text/css,E=no-gzip:1]
    RewriteRule "\.js\.gz$" "-" [T=text/javascript,E=no-gzip:1]

    # Serve correct encoding type.
    Header append Content-Encoding gzip

    # Force proxies to cache gzipped &
    # non-gzipped css/js files separately.
    Header append Vary Accept-Encoding

[34] http://www.webdav.org

DeflateAlterETag Directive

Description: How the outgoing ETag header should be modified during compression
Syntax:      DeflateAlterETag AddSuffix|NoChange|Remove
Default:     DeflateAlterETag AddSuffix
Context:     server config, virtual host
Status:      Extension
Module:      mod_deflate

The DeflateAlterETag directive specifies how the ETag header should be altered when a response is compressed.

AddSuffix
    Append the compression method onto the end of the ETag, causing compressed and uncompressed representations to have unique ETags. This has been the default since 2.4.0, but prevents serving "HTTP Not Modified" (304) responses to conditional requests for compressed content.
NoChange
    Don't change the ETag on a compressed response. This was the default prior to 2.4.0, but does not satisfy the HTTP/1.1 property that all representations of the same resource have unique ETags.
Remove
    Remove the ETag header from compressed responses. This prevents some conditional requests from being possible, but avoids the shortcomings of the preceding options.
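The three DeflateAlterETag options can be summarized in a short Python sketch. It is illustrative only: the function alter_etag is hypothetical, and the exact suffix format used here (a hyphen plus the method name, placed before the closing quote) is an assumption rather than something stated above.

```python
from typing import Optional


def alter_etag(etag: Optional[str], mode: str, method: str = "gzip") -> Optional[str]:
    """Model the effect of DeflateAlterETag on a compressed response.

    Returns the ETag value to send, or None when no ETag header is sent.
    The '-<method>' suffix layout is an assumption made for this sketch.
    """
    if etag is None or mode == "Remove":
        return None  # no ETag header on the compressed response
    if mode == "NoChange":
        return etag  # same ETag as the uncompressed representation
    if mode == "AddSuffix":
        # Keep the value quoted: "abc" becomes "abc-gzip".
        if len(etag) > 2 and etag.endswith('"'):
            return f'{etag[:-1]}-{method}"'
        return etag
    raise ValueError(f"unknown DeflateAlterETag mode: {mode!r}")


if __name__ == "__main__":
    for mode in ("AddSuffix", "NoChange", "Remove"):
        print(mode, alter_etag('"5e-1b2c"', mode))
```

The AddSuffix case shows why conditional requests suffer: the client sends back the suffixed value, which no longer equals the ETag of the stored, uncompressed resource.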
DeflateBufferSize Directive

Description: Fragment size to be compressed at one time by zlib
Syntax: DeflateBufferSize value
Default: DeflateBufferSize 8096
Context: server config, virtual host
Status: Extension
Module: mod_deflate

The DeflateBufferSize directive specifies the size in bytes of the fragments that zlib should compress at one time. If the compressed response is larger than the size specified by this directive, httpd will switch to chunked encoding (HTTP header Transfer-Encoding set to chunked), with the side effect of not setting any Content-Length HTTP header. This is particularly important when httpd works behind reverse caching proxies or when httpd is configured with mod_cache and mod_cache_disk, because HTTP responses without any Content-Length header might not be cached.

DeflateCompressionLevel Directive

Description: How much compression do we apply to the output
Syntax: DeflateCompressionLevel value
Default: Zlib's default
Context: server config, virtual host
Status: Extension
Module: mod_deflate

The DeflateCompressionLevel directive specifies what level of compression should be used. The higher the value, the better the compression, but the more CPU time is required to achieve it.

The value must be between 1 (less compression) and 9 (more compression).

DeflateFilterNote Directive

Description: Places the compression ratio in a note for logging
Syntax: DeflateFilterNote [type] notename
Context: server config, virtual host
Status: Extension
Module: mod_deflate

The DeflateFilterNote directive specifies that a note about compression ratios should be attached to the request. The name of the note is the value specified for the directive. You can use that note for statistical purposes by adding the value to your access log.
Example

    DeflateFilterNote ratio

    LogFormat '"%r" %b (%{ratio}n) "%{User-agent}i"' deflate
    CustomLog "logs/deflate_log" deflate

If you want to extract more accurate values from your logs, you can use the type argument to specify the type of data left as a note for logging. type can be one of:

Input
    Store the byte count of the filter's input stream in the note.
Output
    Store the byte count of the filter's output stream in the note.
Ratio
    Store the compression ratio (output/input * 100) in the note. This is the default, if the type argument is omitted.

Thus you may log it this way:

Accurate Logging

    DeflateFilterNote Input instream
    DeflateFilterNote Output outstream
    DeflateFilterNote Ratio ratio

    LogFormat '"%r" %{outstream}n/%{instream}n (%{ratio}n%%)' deflate
    CustomLog "logs/deflate_log" deflate

See also
  • mod_log_config

DeflateInflateLimitRequestBody Directive

Description: Maximum size of inflated request bodies
Syntax: DeflateInflateLimitRequestBody value
Default: None, but LimitRequestBody applies after deflation
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_deflate
Compatibility: 2.4.10 and later

The DeflateInflateLimitRequestBody directive specifies the maximum size of an inflated request body. If it is unset, LimitRequestBody is applied to the inflated body.

DeflateInflateRatioBurst Directive

Description: Maximum number of times the inflation ratio for request bodies can be crossed
Syntax: DeflateInflateRatioBurst value
Default: 3
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_deflate
Compatibility: 2.4.10 and later

The DeflateInflateRatioBurst directive specifies the maximum number of times the DeflateInflateRatioLimit can be crossed before terminating the request.
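The request-body limits are usually considered together with DeflateInflateRatioLimit, which is described next. The following is only a sketch: the 10 MiB figure is an arbitrary value chosen for illustration, and it assumes the size is given as a byte count in the same way as LimitRequestBody. The ratio and burst values simply restate the documented defaults.

```apache
# Decompress gzip-encoded request bodies.
SetInputFilter DEFLATE

# Illustrative cap on the inflated body: 10 * 1024 * 1024 = 10485760.
# Assumption: the value is a byte count, as with LimitRequestBody.
DeflateInflateLimitRequestBody 10485760

# Documented defaults, written out explicitly: the request is terminated
# once the ratio limit of 200 has been crossed more than 3 times.
DeflateInflateRatioLimit 200
DeflateInflateRatioBurst 3
```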
DeflateInflateRatioLimit Directive

Description: Maximum inflation ratio for request bodies
Syntax: DeflateInflateRatioLimit value
Default: 200
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_deflate
Compatibility: 2.4.10 and later

The DeflateInflateRatioLimit directive specifies the maximum ratio of inflated to deflated size of a request body. This ratio is checked as the body is streamed in, and if it is crossed more than DeflateInflateRatioBurst times, the request will be terminated.

DeflateMemLevel Directive

Description: How much memory should be used by zlib for compression
Syntax: DeflateMemLevel value
Default: DeflateMemLevel 9
Context: server config, virtual host
Status: Extension
Module: mod_deflate

The DeflateMemLevel directive specifies how much memory should be used by zlib for compression (a value between 1 and 9).

DeflateWindowSize Directive

Description: Zlib compression window size
Syntax: DeflateWindowSize value
Default: DeflateWindowSize 15
Context: server config, virtual host
Status: Extension
Module: mod_deflate

The DeflateWindowSize directive specifies the zlib compression window size (a value between 1 and 15). Generally, the larger the window size, the higher the compression ratio that can be expected.

10.43 Apache Module mod_dialup

Description: Send static content at a bandwidth rate limit, defined by the various old modem standards
Status: Experimental
ModuleIdentifier: dialup_module
SourceFile: mod_dialup.c

Summary

This module sends static content at a bandwidth rate limit, defined by the various old modem standards. So, you can browse your site with a 56k V.92 modem by adding something like this:

    ModemStandard "V.92"

Previously, to do bandwidth rate limiting, modules would have to block an entire thread for each client and insert sleeps to slow the bandwidth down.
Using the new suspend feature, a handler can request a callback N milliseconds in the future, and it will be invoked by the Event MPM on a different thread once the timer fires. From there the handler can continue to send data to the client.

Directives
  • ModemStandard

ModemStandard Directive

Description: Modem standard to simulate
Syntax: ModemStandard V.21|V.26bis|V.32|V.34|V.92
Context: directory
Status: Experimental
Module: mod_dialup

Specify what modem standard you wish to simulate.

    ModemStandard "V.26bis"

10.44 Apache Module mod_dir

Description: Provides for "trailing slash" redirects and serving directory index files
Status: Base
ModuleIdentifier: dir_module
SourceFile: mod_dir.c

Summary

The index of a directory can come from one of two sources:

  • A file written by the user, typically called index.html. The DirectoryIndex directive sets the name of this file. This is controlled by mod_dir.
  • Otherwise, a listing generated by the server. This is provided by mod_autoindex.

The two functions are separated so that you can completely remove (or replace) automatic index generation should you want to.

A "trailing slash" redirect is issued when the server receives a request for a URL http://servername/foo/dirname where dirname is a directory. Directories require a trailing slash, so mod_dir issues a redirect to http://servername/foo/dirname/.

Directives
  • DirectoryCheckHandler
  • DirectoryIndex
  • DirectoryIndexRedirect
  • DirectorySlash
  • FallbackResource

DirectoryCheckHandler Directive

Description: Toggle how this module responds when another handler is configured
Syntax: DirectoryCheckHandler On|Off
Default: DirectoryCheckHandler Off
Context: server config, virtual host, directory, .htaccess
Override: Indexes
Status: Base
Module: mod_dir
Compatibility: Available in 2.4.8 and later. Releases prior to 2.4 implicitly act as if "DirectoryCheckHandler ON" was specified.
The DirectoryCheckHandler directive determines whether mod_dir should check for directory indexes or add trailing slashes when some other handler has been configured for the current URL. Handlers can be set by directives such as SetHandler or by other modules at runtime.

In releases prior to 2.4, this module did not take any action if any other handler was configured for a URL. This allows directory indexes to be served even when a SetHandler directive is specified for an entire directory, but it can also result in some conflicts with other modules.

DirectoryIndex Directive

Description: List of resources to look for when the client requests a directory
Syntax: DirectoryIndex disabled | local-url [local-url] ...
Default: DirectoryIndex index.html
Context: server config, virtual host, directory, .htaccess
Override: Indexes
Status: Base
Module: mod_dir

The DirectoryIndex directive sets the list of resources to look for when the client requests an index of the directory by specifying a / at the end of the directory name. Local-url is the (%-encoded) URL of a document on the server relative to the requested directory; it is usually the name of a file in the directory. Several URLs may be given, in which case the server will return the first one that it finds. If none of the resources exist and the Indexes option is set, the server will generate its own listing of the directory.

Example

    DirectoryIndex index.html

then a request for http://example.com/docs/ would return http://example.com/docs/index.html if it exists, or would list the directory if it did not.

Note that the documents do not need to be relative to the directory;

    DirectoryIndex index.html index.txt /cgi-bin/index.pl

would cause the CGI script /cgi-bin/index.pl to be executed if neither index.html nor index.txt existed in a directory.

A single argument of "disabled" prevents mod_dir from searching for an index.
An argument of "disabled" will be interpreted literally if it has any arguments before or after it, even if they are "disabled" as well.

Note: Multiple DirectoryIndex directives within the same context will add to the list of resources to look for rather than replace it:

    # Example A: Set index.html as an index page, then add index.php to that list as well.
    DirectoryIndex index.html
    DirectoryIndex index.php

    # Example B: This is identical to example A, except it's done with a single directive.
    DirectoryIndex index.html index.php

    # Example C: To replace the list, you must explicitly reset it first:
    # In this example, only index.php will remain as an index resource.
    DirectoryIndex index.html
    DirectoryIndex disabled
    DirectoryIndex index.php

DirectoryIndexRedirect Directive

Description: Configures an external redirect for directory indexes.
Syntax: DirectoryIndexRedirect on | off | permanent | temp | seeother | 3xx-code
Default: DirectoryIndexRedirect off
Context: server config, virtual host, directory, .htaccess
Override: Indexes
Status: Base
Module: mod_dir
Compatibility: Available in version 2.3.14 and later

By default, the DirectoryIndex is selected and returned transparently to the client. DirectoryIndexRedirect causes an external redirect to be issued instead.

The argument can be:

  • on: issues a 302 redirection to the index resource.
  • off: does not issue a redirection. This is the legacy behaviour of mod_dir.
  • permanent: issues a 301 (permanent) redirection to the index resource.
  • temp: this has the same effect as on.
  • seeother: issues a 303 redirection (also known as "See Other") to the index resource.
  • 3xx-code: issues a redirection marked by the chosen 3xx code.

Example

    DirectoryIndexRedirect on

A request for http://example.com/docs/ would return a temporary redirect to http://example.com/docs/index.html if it exists.

DirectorySlash Directive

Description: Toggle trailing slash redirects on or off
Syntax: DirectorySlash On|Off
Default: DirectorySlash On
Context: server config, virtual host, directory, .htaccess
Override: Indexes
Status: Base
Module: mod_dir

The DirectorySlash directive determines whether mod_dir should fix up URLs pointing to a directory or not.

Typically, if a user requests a resource without a trailing slash, and that resource is a directory, mod_dir redirects them to the same resource with a trailing slash, for some good reasons:

  • The user ends up requesting the canonical URL of the resource.
  • mod_autoindex works correctly. Since it doesn't emit the path in the link, it would point to the wrong path.
  • DirectoryIndex will be evaluated only for directories requested with a trailing slash.
  • Relative URL references inside HTML pages will work correctly.

If you don't want this effect and the reasons above don't apply to you, you can turn off the redirect as shown below. However, be aware that there are possible security implications to doing this.

    # see security warning below!
    <Location "/some/path">
        DirectorySlash Off
        SetHandler some-handler
    </Location>

Security Warning
Turning off the trailing slash redirect may result in an information disclosure. Consider a situation where mod_autoindex is active (Options +Indexes), DirectoryIndex is set to a valid resource (say, index.html), and there's no other special handler defined for that URL. In this case a request with a trailing slash would show the index.html file, but a request without a trailing slash would list the directory contents.

Also note that some browsers may erroneously change POST requests into GET (thus discarding POST data) when a redirect is issued.
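One way to reduce the exposure described in the security warning, assuming automatic directory listings are not needed for the affected URLs, is to switch off the Indexes option in the same scope so that mod_autoindex has nothing to list. This is a sketch only; the path and handler name are placeholders, not values to copy.

```apache
# Placeholder path and handler name, for illustration only.
<Location "/some/path">
    # Without Indexes, a request lacking the trailing slash cannot
    # fall through to a generated directory listing.
    Options -Indexes
    DirectorySlash Off
    SetHandler some-handler
</Location>
```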
FallbackResource Directive

Description: Define a default URL for requests that don't map to a file
Syntax: FallbackResource disabled | local-url
Default: disabled - httpd will return 404 (Not Found)
Context: server config, virtual host, directory, .htaccess
Override: Indexes
Status: Base
Module: mod_dir
Compatibility: The disabled argument is available in version 2.4.4 and later

Use this to set a handler for any URL that doesn't map to anything in your filesystem, and would otherwise return HTTP 404 (Not Found). For example

    FallbackResource /not-404.php

will cause requests for non-existent files to be handled by not-404.php, while requests for files that exist are unaffected.

It is frequently desirable to have a single file or resource handle all requests to a particular directory, except those requests that correspond to an existing file or script. This is often referred to as a "front controller".

In earlier versions of httpd, this effect typically required mod_rewrite, and the use of the -f and -d tests for file and directory existence. This now requires only one line of configuration.

    FallbackResource /index.php

Existing files, such as images, CSS files, and so on, will be served normally.

Use the disabled argument to disable that feature if inheritance from a parent directory is not desired.

In a sub-URI, such as http://example.com/blog/, this sub-URI has to be supplied as local-url:

    <Directory "/web/example.com/htdocs/blog">
        FallbackResource /blog/index.php
    </Directory>
    <Directory "/web/example.com/htdocs/blog/images">
        FallbackResource disabled
    </Directory>

10.45 Apache Module mod_dumpio

Description: Dumps all I/O to error log as desired.
Status: Extension
ModuleIdentifier: dumpio_module
SourceFile: mod_dumpio.c

Summary

mod_dumpio allows all input received by Apache and/or all output sent by Apache to be logged (dumped) to the error.log file.

The data logging is done right after SSL decoding (for input) and right before SSL encoding (for output).
As can be expected, this can produce extreme volumes of data, and should only be used when debugging problems.

Directives
  • DumpIOInput
  • DumpIOOutput

Enabling dumpio Support

To enable the module, it should be compiled and loaded into your running Apache configuration. Logging can then be enabled or disabled separately for input and output via the directives below. Additionally, mod_dumpio needs to be configured to LogLevel trace7:

    LogLevel dumpio:trace7

DumpIOInput Directive

Description: Dump all input data to the error log
Syntax: DumpIOInput On|Off
Default: DumpIOInput Off
Context: server config
Status: Extension
Module: mod_dumpio

Enable dumping of all input.

Example

    DumpIOInput On

DumpIOOutput Directive

Description: Dump all output data to the error log
Syntax: DumpIOOutput On|Off
Default: DumpIOOutput Off
Context: server config
Status: Extension
Module: mod_dumpio

Enable dumping of all output.

Example

    DumpIOOutput On

10.46 Apache Module mod_echo

Description: A simple echo server to illustrate protocol modules
Status: Experimental
ModuleIdentifier: echo_module
SourceFile: mod_echo.c

Summary

This module provides an example protocol module to illustrate the concept. It provides a simple echo server. Telnet to it and type stuff, and it will echo it.

Directives
  • ProtocolEcho

ProtocolEcho Directive

Description: Turn the echo server on or off
Syntax: ProtocolEcho On|Off
Default: ProtocolEcho Off
Context: server config, virtual host
Status: Experimental
Module: mod_echo

The ProtocolEcho directive enables or disables the echo server.

Example

    ProtocolEcho On

10.47 Apache Module mod_env

Description: Modifies the environment which is passed to CGI scripts and SSI pages
Status: Base
ModuleIdentifier: env_module
SourceFile: mod_env.c

Summary

This module allows for control of internal environment variables that are used by various Apache HTTP Server modules.
These variables are also provided to CGI scripts as native system environment variables, and available for use in SSI pages. Environment variables may be passed from the shell which invoked the httpd process. Alternatively, environment variables may be set or unset within the configuration process.

Directives
  • PassEnv
  • SetEnv
  • UnsetEnv

See also
  • Environment Variables
  • SetEnvIf

PassEnv Directive

Description: Passes environment variables from the shell
Syntax: PassEnv env-variable [env-variable] ...
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_env

Specifies one or more native system environment variables to make available as internal environment variables, which are available to Apache HTTP Server modules as well as propagated to CGI scripts and SSI pages. Values come from the native OS environment of the shell which invoked the httpd process.

Example

    PassEnv LD_LIBRARY_PATH

SetEnv Directive

Description: Sets environment variables
Syntax: SetEnv env-variable [value]
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_env

Sets an internal environment variable, which is then available to Apache HTTP Server modules, and passed on to CGI scripts and SSI pages.

Example

    SetEnv SPECIAL_PATH /foo/bin

If you omit the value argument, the variable is set to an empty string.

Note: The internal environment variables set by this directive are set after most early request processing directives are run, such as access control and URI-to-filename mapping. If the environment variable you're setting is meant as input into this early phase of processing, such as the RewriteRule directive, you should instead set the environment variable with SetEnvIf.

See also
  • Environment Variables

UnsetEnv Directive

Description: Removes variables from the environment
Syntax: UnsetEnv env-variable [env-variable] ...
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_env

Removes one or more internal environment variables from those passed on to CGI scripts and SSI pages.

Example

    UnsetEnv LD_LIBRARY_PATH

10.48 Apache Module mod_example_hooks

Description: Illustrates the Apache module API
Status: Experimental
ModuleIdentifier: example_hooks_module
SourceFile: mod_example_hooks.c

Summary

The files in the modules/examples directory under the Apache distribution directory tree are provided as an example to those that wish to write modules that use the Apache API.

The main file is mod_example_hooks.c, which illustrates all the different callback mechanisms and call syntaxes. By no means does an add-on module need to include routines for all of the callbacks - quite the contrary!

The example module is an actual working module. If you link it into your server, enable the "example-hooks-handler" handler for a location, and then browse to that location, you will see a display of some of the tracing the example module did as the various callbacks were made.

Directives
  • Example

Compiling the example_hooks module

To include the example_hooks module in your server, follow the steps below:

1. Run configure with the --enable-example-hooks option.
2. Make the server (run "make").

To add another module of your own:

1. cp modules/examples/mod_example_hooks.c modules/new_module/mod_myexample.c
2. Modify the file.
3. Create modules/new_module/config.m4.
   (a) Add APACHE_MODPATH_INIT(new_module).
   (b) Copy the APACHE_MODULE line with "example_hooks" from modules/examples/config.m4.
   (c) Replace the first argument "example_hooks" with myexample.
   (d) Replace the second argument with a brief description of your module. It will be used in configure --help.
   (e) If your module needs additional C compiler flags, linker flags or libraries, add them to CFLAGS, LDFLAGS and LIBS accordingly. See other config.m4 files in the modules directory for examples.
   (f) Add APACHE_MODPATH_FINISH.
4. Create modules/new_module/Makefile.in. If your module doesn't need special build instructions, all you need to have in that file is include $(top_srcdir)/build/special.mk.
5. Run ./buildconf from the top-level directory.
6. Build the server with --enable-myexample.

Using the mod_example_hooks Module

To activate the example_hooks module, include a block similar to the following in your httpd.conf file:

    <Location "/example-hooks-info">
        SetHandler example-hooks-handler
    </Location>

As an alternative, you can put the following into a .htaccess file and then request the file "test.example" from that location:

    AddHandler example-hooks-handler .example

After reloading/restarting your server, you should be able to browse to this location and see the brief display mentioned earlier.

Example Directive

Description: Demonstration directive to illustrate the Apache module API
Syntax: Example
Context: server config, virtual host, directory, .htaccess
Status: Experimental
Module: mod_example_hooks

The Example directive just sets a demonstration flag which the example module's content handler displays. It takes no arguments. If you browse to a URL to which the example-hooks content handler applies, you will get a display of the routines within the module and how and in what order they were called to service the document request. The effect of this directive can be observed under the point "Example directive declared here: YES/NO".

10.49 Apache Module mod_expires

Description: Generation of Expires and Cache-Control HTTP headers according to user-specified criteria
Status: Extension
ModuleIdentifier: expires_module
SourceFile: mod_expires.c

Summary

This module controls the setting of the Expires HTTP header and the max-age directive of the Cache-Control HTTP header in server responses. The expiration date can be set relative to either the time the source file was last modified, or to the time of the client access.

These HTTP headers are an instruction to the client about the document's validity and persistence. If cached, the document may be fetched from the cache rather than from the source until this time has passed. After that, the cache copy is considered "expired" and invalid, and a new copy must be obtained from the source.

To modify Cache-Control directives other than max-age (see RFC 2616 section 14.9, http://www.w3.org/Protocols/rfc2616/rfc2616-sec14.html#sec14.9), you can use the Header directive.

When the Expires header is already part of the response generated by the server, for example when generated by a CGI script or proxied from an origin server, this module does not change or add an Expires or Cache-Control header.

Directives
  • ExpiresActive
  • ExpiresByType
  • ExpiresDefault

Alternate Interval Syntax

The ExpiresDefault and ExpiresByType directives can also be defined in a more readable syntax of the form:

    ExpiresDefault "base [plus num type] [num type] ..."
    ExpiresByType type/encoding "base [plus num type] [num type] ..."

where base is one of:

  • access
  • now (equivalent to 'access')
  • modification

The plus keyword is optional. num should be an integer value [acceptable to atoi()], and type is one of:

  • years
  • months
  • weeks
  • days
  • hours
  • minutes
  • seconds

For example, any of the following directives can be used to make documents expire 1 month after being accessed, by default:

    ExpiresDefault "access plus 1 month"
    ExpiresDefault "access plus 4 weeks"
    ExpiresDefault "access plus 30 days"

The expiry time can be fine-tuned by adding several 'num type' clauses:

    ExpiresByType text/html "access plus 1 month 15 days 2 hours"
    ExpiresByType image/gif "modification plus 5 hours 3 minutes"

Note that if you use a modification date based setting, the Expires header will not be added to content that does not come from a file on disk. This is due to the fact that there is no modification time for such content.

ExpiresActive Directive

Description: Enables generation of Expires headers
Syntax: ExpiresActive On|Off
Default: ExpiresActive Off
Context: server config, virtual host, directory, .htaccess
Override: Indexes
Status: Extension
Module: mod_expires

This directive enables or disables the generation of the Expires and Cache-Control headers for the document realm in question. (That is, if found in an .htaccess file, for instance, it applies only to documents generated from that directory.) If set to Off, the headers will not be generated for any document in the realm (unless overridden at a lower level, such as an .htaccess file overriding a server config file). If set to On, the headers will be added to served documents according to the criteria defined by the ExpiresByType and ExpiresDefault directives (q.v.).

Note that this directive does not guarantee that an Expires or Cache-Control header will be generated. If the criteria aren't met, no header will be sent, and the effect will be as though this directive wasn't even specified.
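To show how ExpiresActive works together with the alternate interval syntax described earlier, here is a minimal sketch. The MIME types and intervals are arbitrary choices for illustration and carry no recommendation.

```apache
# Turn on Expires and Cache-Control: max-age generation for this scope.
ExpiresActive On

# Fallback for any type not listed below: 24 hours after client access.
ExpiresDefault "access plus 24 hours"

# HTML: 7 days after the file on disk was last modified.
ExpiresByType text/html "modification plus 7 days"

# PNG images: 1 month after client access.
ExpiresByType image/png "access plus 1 month"
```

Because the text/html rule is based on the modification time, it only produces a header for content that comes from a file on disk, as noted above.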
ExpiresByType Directive

Description: Value of the Expires header configured by MIME type
Syntax: ExpiresByType MIME-type <code>seconds
Context: server config, virtual host, directory, .htaccess
Override: Indexes
Status: Extension
Module: mod_expires

This directive defines the value of the Expires header and the max-age directive of the Cache-Control header generated for documents of the specified type (e.g., text/html). The second argument sets the number of seconds that will be added to a base time to construct the expiration date. The Cache-Control: max-age is calculated by subtracting the request time from the expiration date and expressing the result in seconds.

The base time is either the last modification time of the file, or the time of the client's access to the document. Which should be used is specified by the <code> field; M means that the file's last modification time should be used as the base time, and A means the client's access time should be used.

The difference in effect is subtle. If M is used, all current copies of the document in all caches will expire at the same time, which can be good for something like a weekly notice that's always found at the same URL. If A is used, the date of expiration is different for each client; this can be good for image files that don't change very often, particularly for a set of related documents that all refer to the same images (i.e., the images will be accessed repeatedly within a relatively short timespan).

Example:

    # enable expirations
    ExpiresActive On
    # expire GIF images after a month in the client's cache
    ExpiresByType image/gif A2592000
    # HTML documents are good for a week from the
    # time they were changed
    ExpiresByType text/html M604800

Note that this directive only has effect if ExpiresActive On has been specified. It overrides, for the specified MIME type only, any expiration date set by the ExpiresDefault directive.
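The second counts in the example above are plain arithmetic on the number of seconds in a day (24 * 60 * 60 = 86400). The sketch below writes the calculation out and shows, in comments, what should be the matching alternate-syntax form for each line; it is meant for checking the numbers, not as additional required configuration.

```apache
# 30 days: 30 * 86400 = 2592000 seconds, base A (client access time)
ExpiresByType image/gif A2592000
# Alternate-syntax form of the same rule:
#   ExpiresByType image/gif "access plus 30 days"

# 7 days: 7 * 86400 = 604800 seconds, base M (file modification time)
ExpiresByType text/html M604800
# Alternate-syntax form of the same rule:
#   ExpiresByType text/html "modification plus 7 days"
```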
You can also specify the expiration time calculation using an alternate syntax, described earlier in this document.

ExpiresDefault Directive

Description: Default algorithm for calculating expiration time
Syntax: ExpiresDefault <code>seconds
Context: server config, virtual host, directory, .htaccess
Override: Indexes
Status: Extension
Module: mod_expires

This directive sets the default algorithm for calculating the expiration time for all documents in the affected realm. It can be overridden on a type-by-type basis by the ExpiresByType directive. See the description of that directive for details about the syntax of the argument, and the alternate syntax description as well.

10.50 Apache Module mod_ext_filter

Description: Pass the response body through an external program before delivery to the client
Status: Extension
ModuleIdentifier: ext_filter_module
SourceFile: mod_ext_filter.c

Summary

mod_ext_filter presents a simple and familiar programming model for filters. With this module, a program which reads from stdin and writes to stdout (i.e., a Unix-style filter command) can be a filter for Apache. This filtering mechanism is much slower than using a filter which is specially written for the Apache API and runs inside of the Apache server process, but it does have the following benefits:

  • the programming model is much simpler
  • any programming/scripting language can be used, provided that it allows the program to read from standard input and write to standard output
  • existing programs can be used unmodified as Apache filters

Even when the performance characteristics are not suitable for production use, mod_ext_filter can be used as a prototype environment for filters.

Directives
  • ExtFilterDefine
  • ExtFilterOptions

See also
  • Filters

Examples

Generating HTML from some other type of response

    # mod_ext_filter directive to define a filter
    # to HTML-ize text/c files using the external
    # program /usr/bin/enscript, with the type of
    # the result set to text/html
    ExtFilterDefine c-to-html mode=output \
        intype=text/c outtype=text/html \
        cmd="/usr/bin/enscript --color -W html -Ec -o - -"

    # core directive to cause the new filter to
    # be run on output
    SetOutputFilter c-to-html

    # mod_mime directive to set the type of .c
    # files to text/c
    AddType text/c .c

Implementing a content encoding filter

Note: this gzip example is just for the purposes of illustration. Please refer to mod_deflate for a practical implementation.

    # mod_ext_filter directive to define the external filter
    ExtFilterDefine gzip mode=output cmd=/bin/gzip

    # core directive to cause the gzip filter to be
    # run on output
    SetOutputFilter gzip

    # mod_headers directive to add
    # "Content-Encoding: gzip" header field
    Header set Content-Encoding gzip

Slowing down the server

    # mod_ext_filter directive to define a filter
    # which runs everything through cat; cat doesn't
    # modify anything; it just introduces extra pathlength
    # and consumes more resources
    ExtFilterDefine slowdown mode=output cmd=/bin/cat \
        preservescontentlength

    # core directive to cause the slowdown filter to
    # be run several times on output
    #
    SetOutputFilter slowdown;slowdown;slowdown

Using sed to replace text in the response

    # mod_ext_filter directive to define a filter which
    # replaces text in the response
    #
    ExtFilterDefine fixtext mode=output intype=text/html \
        cmd="/bin/sed s/verdana/arial/g"

    # core directive to cause the fixtext filter to
    # be run on output
    SetOutputFilter fixtext

Note: You can do the same thing using mod_substitute without invoking an external process.

Tracing another filter

    # Trace the data read and written by mod_deflate
    # for a particular client (IP 192.168.1.31)
    # experiencing compression problems.
    # This filter will trace what goes into mod_deflate.
    ExtFilterDefine tracebefore \
        cmd="/bin/tracefilter.pl /tmp/tracebefore" \
        EnableEnv=trace_this_client

    # This filter will trace what goes after mod_deflate.
    # Note that without the ftype parameter, the default
    # filter type of AP_FTYPE_RESOURCE would cause the
    # filter to be placed *before* mod_deflate in the filter
    # chain. Giving it a numeric value slightly higher than
    # AP_FTYPE_CONTENT_SET will ensure that it is placed
    # after mod_deflate.
    ExtFilterDefine traceafter \
        cmd="/bin/tracefilter.pl /tmp/traceafter" \
        EnableEnv=trace_this_client ftype=21

    SetEnvIf Remote_Addr 192.168.1.31 trace_this_client
    SetOutputFilter tracebefore;deflate;traceafter

Here is the filter which traces the data:

    #!/usr/local/bin/perl -w
    use strict;

    # Copy everything read from standard input both to the trace
    # file named by the first argument and to standard output.
    open(SAVE, ">$ARGV[0]")
        or die "can't open $ARGV[0]: $!";

    while (<STDIN>) {
        print SAVE $_;
        print $_;
    }

    close(SAVE);

ExtFilterDefine Directive

Description: Define an external filter
Syntax: ExtFilterDefine filtername parameters
Context: server config
Status: Extension
Module: mod_ext_filter

The ExtFilterDefine directive defines the characteristics of an external filter, including the program to run and its arguments.

filtername specifies the name of the filter being defined. This name can then be used in SetOutputFilter directives. It must be unique among all registered filters. At the present time, no error is reported by the register-filter API, so a problem with duplicate names isn't reported to the user.

Subsequent parameters can appear in any order and define the external command to run and certain other characteristics. The only required parameter is cmd=. These parameters are:

cmd=cmdline
    The cmd= keyword allows you to specify the external command to run.
If there are arguments after the program name, the command line should be surrounded in quotation marks (e.g., cmd="/bin/mypgm arg1 arg2"). Normal shell quoting is not necessary since the program is run directly, bypassing the shell. Program arguments are blank-delimited. A backslash can be used to escape blanks which should be part of a program argument. Any backslashes which are part of the argument must be escaped with backslash themselves. In addition to the standard CGI environment variables, DOCUMENT_URI, DOCUMENT_PATH_INFO, and QUERY_STRING_UNESCAPED will also be set for the program.

mode=mode
Use mode=output (the default) for filters which process the response. Use mode=input for filters which process the request. mode=input is available in Apache 2.1 and later.

intype=imt
This parameter specifies the internet media type (i.e., MIME type) of documents which should be filtered. By default, all documents are filtered. If intype= is specified, the filter will be disabled for documents of other types.

outtype=imt
This parameter specifies the internet media type (i.e., MIME type) of filtered documents. It is useful when the filter changes the internet media type as part of the filtering operation. By default, the internet media type is unchanged.

PreservesContentLength
The PreservesContentLength keyword specifies that the filter preserves the content length. This is not the default, as most filters change the content length. In the event that the filter doesn't modify the length, this keyword should be specified.

ftype=filtertype
This parameter specifies the numeric value for filter type that the filter should be registered as. The default value, AP_FTYPE_RESOURCE, is sufficient in most cases. If the filter needs to operate at a different point in the filter chain than resource filters, then this parameter will be necessary. See the AP_FTYPE_foo definitions in util_filter.h for appropriate values.
disableenv=env
This parameter specifies the name of an environment variable which, if set, will disable the filter.

enableenv=env
This parameter specifies the name of an environment variable which must be set, or the filter will be disabled.

ExtFilterOptions Directive

Description: Configure mod_ext_filter options
Syntax: ExtFilterOptions option [option] ...
Default: ExtFilterOptions NoLogStderr
Context: directory
Status: Extension
Module: mod_ext_filter

The ExtFilterOptions directive specifies special processing options for mod_ext_filter. Option can be one of

LogStderr | NoLogStderr
The LogStderr keyword specifies that messages written to standard error by the external filter program will be saved in the Apache error log. NoLogStderr disables this feature.

Onfail=[abort|remove]
Determines how to proceed if the external filter program cannot be started. With abort (the default value) the request will be aborted. With remove, the filter is removed and the request continues without it.

    ExtFilterOptions LogStderr

Messages written to the filter's standard error will be stored in the Apache error log.

10.51 Apache Module mod_file_cache

Description: Caches a static list of files in memory
Status: Experimental
ModuleIdentifier: file_cache_module
SourceFile: mod_file_cache.c

Summary

Warning: This module should be used with care. You can easily create a broken site using mod_file_cache, so read this document carefully.

Caching frequently requested files that change very infrequently is a technique for reducing server load. mod_file_cache provides two techniques for caching frequently requested static files. Through configuration directives, you can direct mod_file_cache to either open then mmap() a file, or to pre-open a file and save the file's open file handle.
Both techniques reduce server load when processing requests for these files by doing part of the work (specifically, the file I/O) for serving the file when the server is started rather than during each request.

Notice: You cannot use this for speeding up CGI programs or other files which are served by special content handlers. It can only be used for regular files which are usually served by the Apache core content handler.

This module is an extension of and borrows heavily from the mod_mmap_static module in Apache 1.3.

Directives
• CacheFile
• MMapFile

Using mod_file_cache

mod_file_cache caches a list of statically configured files via MMapFile or CacheFile directives in the main server configuration.

Not all platforms support both directives. If you attempt to use an unsupported directive, the server will still start, but the file will not be cached and an error message will be written to the server error log. On platforms that support both directives, you should experiment with both to see which works best for you.

MMapFile Directive

The MMapFile directive of mod_file_cache maps a list of statically configured files into memory through the system call mmap(). This system call is available on most modern Unix derivatives, but not on all. There are sometimes system-specific limits on the size and number of files that can be mmap()ed; experimentation is probably the easiest way to find out.

This mmap()ing is done once at server start or restart, only. So whenever one of the mapped files changes on the filesystem you have to restart the server (see the Stopping and Restarting (p. 29) documentation). To reiterate that point: if the files are modified in place without restarting the server you may end up serving requests that are completely bogus. You should update files by unlinking the old copy and putting a new copy in place. Most tools such as rdist and mv do this.
The reason this module doesn't take care of changes to the files is that the check would need an extra stat() on every request, which is wasteful and works against the goal of reducing I/O.

CacheFile Directive

The CacheFile directive of mod_file_cache opens an active handle or file descriptor to the file (or files) listed in the configuration directive and places these open file handles in the cache. When the file is requested, the server retrieves the handle from the cache and passes it to the sendfile() (or TransmitFile() on Windows) socket API.

This file handle caching is done once at server start or restart, only. So whenever one of the cached files changes on the filesystem you have to restart the server (see the Stopping and Restarting (p. 29) documentation). To reiterate that point: if the files are modified in place without restarting the server you may end up serving requests that are completely bogus. You should update files by unlinking the old copy and putting a new copy in place. Most tools such as rdist and mv do this.

=⇒ Note
Don't bother asking for a directive which recursively caches all the files in a directory. Try this instead... See the Include directive, and consider this command:

    find /www/htdocs -type f -print \
    | sed -e 's/.*/mmapfile &/' > /www/conf/mmap.conf

CacheFile Directive

Description: Cache a list of file handles at startup time
Syntax: CacheFile file-path [file-path] ...
Context: server config
Status: Experimental
Module: mod_file_cache

The CacheFile directive opens handles to one or more files (given as whitespace-separated arguments) and places these handles into the cache at server startup time. Handles to cached files are automatically closed on a server shutdown. When the files have changed on the filesystem, the server should be restarted to re-cache them.
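The note above generates a file of mmapfile lines but does not show how that file reaches the server. As a minimal sketch, assuming the output path used by the find/sed command (/www/conf/mmap.conf) and that the command is re-run before each restart, the main server configuration could load the list like this:

```apache
# Load the generated list of MMapFile directives at startup.
# Assumption: /www/conf/mmap.conf was written by the find/sed
# command shown in the note. Directive names are case-insensitive,
# so the lowercase "mmapfile" lines it contains are accepted.
Include /www/conf/mmap.conf
```

Because the list is read only at server start or restart, files added under /www/htdocs afterwards are not cached until the command has been run again and the server restarted.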
Be careful with the file-path arguments: They have to literally match the filesystem path Apache's URL-to-filename translation handlers create. We cannot compare inodes or other stuff to match paths through symbolic links etc. because that again would cost extra stat() system calls which is not acceptable. This module may or may not work with filenames rewritten by mod_alias or mod_rewrite.

Example

    CacheFile /usr/local/apache/htdocs/index.html

MMapFile Directive

Description: Map a list of files into memory at startup time
Syntax: MMapFile file-path [file-path] ...
Context: server config
Status: Experimental
Module: mod_file_cache

The MMapFile directive maps one or more files (given as whitespace-separated arguments) into memory at server startup time. They are automatically unmapped on a server shutdown. When the files have changed on the filesystem, at least a HUP or USR1 signal should be sent to the server to re-mmap() them.

Be careful with the file-path arguments: They have to literally match the filesystem path Apache's URL-to-filename translation handlers create. We cannot compare inodes or other stuff to match paths through symbolic links etc. because that again would cost extra stat() system calls which is not acceptable. This module may or may not work with filenames rewritten by mod_alias or mod_rewrite.

Example

    MMapFile /usr/local/apache/htdocs/index.html

10.52 Apache Module mod_filter

Description: Context-sensitive smart filter configuration module
Status: Base
ModuleIdentifier: filter_module
SourceFile: mod_filter.c

Summary

This module enables smart, context-sensitive configuration of output content filters. For example, Apache can be configured to process different content-types through different filters, even when the content-type is not known in advance (e.g. in a proxy).

mod_filter works by introducing indirection into the filter chain.
Instead of inserting filters in the chain, we insert a filter harness which in turn dispatches conditionally to a filter provider. Any content filter may be used as a provider to mod_filter; no change to existing filter modules is required (although it may be possible to simplify them).

Directives
• AddOutputFilterByType
• FilterChain
• FilterDeclare
• FilterProtocol
• FilterProvider
• FilterTrace

Smart Filtering

In the traditional filtering model, filters are inserted unconditionally using AddOutputFilter and family. Each filter then needs to determine whether to run, and there is little flexibility available for server admins to allow the chain to be configured dynamically.

mod_filter by contrast gives server administrators a great deal of flexibility in configuring the filter chain. In fact, filters can be inserted based on complex boolean expressions (p. 99). This generalises the limited flexibility offered by AddOutputFilterByType.

Filter Declarations, Providers and Chains

Figure 1: The traditional filter model

In the traditional model, output filters are a simple chain from the content generator (handler) to the client. This works well provided the filter chain can be correctly configured, but presents problems when the filters need to be configured dynamically based on the outcome of the handler.

Figure 2: The mod_filter model

mod_filter works by introducing indirection into the filter chain. Instead of inserting filters in the chain, we insert a filter harness which in turn dispatches conditionally to a filter provider. Any content filter may be used as a provider to mod_filter; no change to existing filter modules is required (although it may be possible to simplify them). There can be multiple providers for one filter, but no more than one provider will run for any single request.
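As an illustration of several providers sharing one harness, the following sketch declares a single smart filter and registers two providers whose expressions cannot both be true for the same response. The filter name textproc is hypothetical. INCLUDES and SUBSTITUTE are the filter names registered by mod_include and mod_substitute, and both modules are assumed to be loaded.

```apache
# One harness ("textproc", a made-up name) with two providers.
FilterDeclare textproc

# HTML responses are dispatched to mod_include's INCLUDES filter.
FilterProvider textproc INCLUDES "%{CONTENT_TYPE} =~ m|^text/html|"

# Plain-text responses are dispatched to mod_substitute's SUBSTITUTE filter.
FilterProvider textproc SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/plain|"

# Nothing runs until the harness is placed in the chain.
FilterChain textproc
```

For any one response at most one of the two expressions matches, so at most one provider runs. A response of any other type matches neither expression, so the harness dispatches to no provider.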
A filter chain comprises any number of instances of the filter harness, each of which may have any number of providers. A special case is that of a single provider with unconditional dispatch: this is equivalent to inserting the provider filter directly into the chain.

Configuring the Chain

There are three stages to configuring a filter chain with mod_filter. For details of the directives, see below.

Declare Filters
The FilterDeclare directive declares a filter, assigning it a name and filter type. Required only if the filter is not the default type AP_FTYPE_RESOURCE.

Register Providers
The FilterProvider directive registers a provider with a filter. The filter may have been declared with FilterDeclare; if not, FilterProvider will implicitly declare it with the default type AP_FTYPE_RESOURCE. The provider must have been registered with ap_register_output_filter by some module. The final argument to FilterProvider is an expression: the provider will be selected to run for a request if and only if the expression evaluates to true. The expression may evaluate HTTP request or response headers, environment variables, or the Handler used by this request. Unlike earlier versions, mod_filter now supports complex expressions involving multiple criteria with AND / OR logic (&& / ||) and brackets. The details of the expression syntax are described in the ap_expr documentation (p. 99).

Configure the Chain
The above directives build components of a smart filter chain, but do not configure it to run. The FilterChain directive builds a filter chain from smart filters declared, offering the flexibility to insert filters at the beginning or end of the chain, remove a filter, or clear the chain.

Filtering and Response Status

mod_filter normally only runs filters on responses with HTTP status 200 (OK).
If you want to filter documents with other response statuses, you can set the filter-errordocs environment variable, and it will work on all responses regardless of status. To refine this further, you can use expression conditions with FilterProvider.

Upgrading from Apache HTTP Server 2.2 Configuration

The FilterProvider directive has changed from httpd 2.2: the match and dispatch arguments are replaced with a single but more versatile expression. In general, you can convert a match/dispatch pair to the two sides of an expression, using something like:

    "dispatch = 'match'"

The Request headers, Response headers and Environment variables are now interpreted from syntax %{req:foo}, %{resp:foo} and %{env:foo} respectively. The variables %{HANDLER} and %{CONTENT_TYPE} are also supported.

Note that match no longer supports substring matches. They can be replaced by regular expression matches.

Examples

Server side Includes (SSI)
A simple case of replacing AddOutputFilterByType

    FilterDeclare SSI
    FilterProvider SSI INCLUDES "%{CONTENT_TYPE} =~ m|^text/html|"
    FilterChain SSI

Server side Includes (SSI)
The same as the above but dispatching on handler (classic SSI behaviour; .shtml files get processed).

    FilterProvider SSI INCLUDES "%{HANDLER} = 'server-parsed'"
    FilterChain SSI

Emulating mod_gzip with mod_deflate
Insert INFLATE filter only if "gzip" is NOT in the Accept-Encoding header. This filter runs with ftype CONTENT_SET.

    FilterDeclare gzip CONTENT_SET
    FilterProvider gzip inflate "%{req:Accept-Encoding} !~ /gzip/"
    FilterChain gzip

Image Downsampling
Suppose we want to downsample all web images, and have filters for GIF, JPEG and PNG.

    FilterProvider unpack jpeg_unpack "%{CONTENT_TYPE} = 'image/jpeg'"
    FilterProvider unpack gif_unpack "%{CONTENT_TYPE} = 'image/gif'"
    FilterProvider unpack png_unpack "%{CONTENT_TYPE} = 'image/png'"

    FilterProvider downsample downsample_filter "%{CONTENT_TYPE} =~ m|^image/(jpeg|gif|png)|"
    FilterProtocol downsample "change=yes"

    FilterProvider repack jpeg_pack "%{CONTENT_TYPE} = 'image/jpeg'"
    FilterProvider repack gif_pack "%{CONTENT_TYPE} = 'image/gif'"
    FilterProvider repack png_pack "%{CONTENT_TYPE} = 'image/png'"

    FilterChain unpack downsample repack

Protocol Handling

Historically, each filter is responsible for ensuring that whatever changes it makes are correctly represented in the HTTP response headers, and that it does not run when it would make an illegal change. This imposes a burden on filter authors to re-implement some common functionality in every filter:

• Many filters will change the content, invalidating existing content tags, checksums, hashes, and lengths.
• Filters that require an entire, unbroken response in input need to ensure they don't get byteranges from a backend.
• Filters that transform output in a filter need to ensure they don't violate a Cache-Control: no-transform header from the backend.
• Filters may make responses uncacheable.

mod_filter aims to offer generic handling of these details of filter implementation, reducing the complexity required of content filter modules. This is work-in-progress; the FilterProtocol directive implements some of this functionality for back-compatibility with Apache 2.0 modules. For httpd 2.1 and later, the ap_register_output_filter_protocol and ap_filter_protocol APIs enable filter modules to declare their own behaviour.

At the same time, mod_filter should not interfere with a filter that wants to handle all aspects of the protocol. By default (i.e. in the absence of any FilterProtocol directives), mod_filter will leave the headers untouched.

At the time of writing, this feature is largely untested, as modules in common use are designed to work with 2.0. Modules using it should test it carefully.
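To make the protocol flags more concrete, here is a hedged sketch of a smart filter whose provider rewrites text and therefore needs the complete response. The filter name fixtext is hypothetical, mod_substitute is assumed to be loaded with its Substitute rules configured elsewhere, and byteranges=no is one of the proto-flags documented for the FilterProtocol directive.

```apache
# "fixtext" is an illustrative name, not something mod_filter defines.
FilterProvider fixtext SUBSTITUTE "%{CONTENT_TYPE} =~ m|^text/html|"

# Declare that this filter cannot work on byteranges and
# requires complete input.
FilterProtocol fixtext "byteranges=no"

FilterChain fixtext
```

Whether a flag like this is needed depends on what the provider already declares for itself, so treat the fragment as a starting point for testing rather than a recommended setting.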
AddOutputFilterByType Directive

Description: assigns an output filter to a particular media-type
Syntax: AddOutputFilterByType filter[;filter...] media-type [media-type] ...
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_filter
Compatibility: Had severe limitations before being moved to mod_filter in version 2.3.7

This directive activates a particular output filter (p. 110) for a request depending on the response media-type.

The following example uses the DEFLATE filter, which is provided by mod_deflate. It will compress all output (either static or dynamic) which is labeled as text/html or text/plain before it is sent to the client.

    AddOutputFilterByType DEFLATE text/html text/plain

If you want the content to be processed by more than one filter, their names have to be separated by semicolons. It's also possible to use one AddOutputFilterByType directive for each of these filters.

The configuration below causes all script output labeled as text/html to be processed at first by the INCLUDES filter and then by the DEFLATE filter.

    Options Includes
    AddOutputFilterByType INCLUDES;DEFLATE text/html

See also
• AddOutputFilter
• SetOutputFilter
• filters (p. 110)

FilterChain Directive

Description: Configure the filter chain
Syntax: FilterChain [+=-@!]filter-name ...
Context: server config, virtual host, directory, .htaccess
Override: Options
Status: Base
Module: mod_filter

This configures an actual filter chain, from declared filters. FilterChain takes any number of arguments, each optionally preceded with a single-character control that determines what to do:

+filter-name
Add filter-name to the end of the filter chain

@filter-name
Insert filter-name at the start of the filter chain

-filter-name
Remove filter-name from the filter chain

=filter-name
Empty the filter chain and insert filter-name

!
Empty the filter chain

filter-name
Equivalent to +filter-name

FilterDeclare Directive

Description: Declare a smart filter
Syntax: FilterDeclare filter-name [type]
Context: server config, virtual host, directory, .htaccess
Override: Options
Status: Base
Module: mod_filter

This directive declares an output filter together with a header or environment variable that will determine runtime configuration. The first argument is a filter-name for use in FilterProvider, FilterChain and FilterProtocol directives.

The final (optional) argument is the type of filter, and takes values of ap_filter_type, namely RESOURCE (the default), CONTENT_SET, PROTOCOL, TRANSCODE, CONNECTION or NETWORK.

FilterProtocol Directive

Description: Deal with correct HTTP protocol handling
Syntax: FilterProtocol filter-name [provider-name] proto-flags
Context: server config, virtual host, directory, .htaccess
Override: Options
Status: Base
Module: mod_filter

This directs mod_filter to deal with ensuring the filter doesn't run when it shouldn't, and that the HTTP response headers are correctly set taking into account the effects of the filter.

There are two forms of this directive. With three arguments, it applies specifically to a filter-name and a provider-name for that filter. With two arguments it applies to a filter-name whenever the filter runs any provider.

Flags specified with this directive are merged with the flags that underlying providers may have registered with mod_filter. For example, a filter may internally specify the equivalent of change=yes, but a particular configuration of the module can override with change=no.

proto-flags is one or more of

change=yes|no
Specifies whether the filter changes the content, including possibly the content length. The "no" argument is supported in 2.4.7 and later.
change=1:1
The filter changes the content, but will not change the content length

byteranges=no
The filter cannot work on byteranges and requires complete input

proxy=no
The filter should not run in a proxy context

proxy=transform
The filter transforms the response in a manner incompatible with the HTTP Cache-Control: no-transform header.

cache=no
The filter renders the output uncacheable (e.g. by introducing randomised content changes)

FilterProvider Directive

Description: Register a content filter
Syntax: FilterProvider filter-name provider-name expression
Context: server config, virtual host, directory, .htaccess
Override: Options
Status: Base
Module: mod_filter

This directive registers a provider for the smart filter. The provider will be called if and only if the expression declared evaluates to true when the harness is first called.

provider-name must have been registered by loading a module that registers the name with ap_register_output_filter.

expression is an ap_expr (p. 99).

See also
• Expressions in Apache HTTP Server (p. 99), for a complete reference and examples.
• mod_include

FilterTrace Directive

Description: Get debug/diagnostic information from mod_filter
Syntax: FilterTrace filter-name level
Context: server config, virtual host, directory
Status: Base
Module: mod_filter

This directive generates debug information from mod_filter. It is designed to help test and debug providers (filter modules), although it may also help with mod_filter itself.

The debug output depends on the level set:

0 (default)
No debug information is generated.

1
mod_filter will record buckets and brigades passing through the filter to the error log, before the provider has processed them. This is similar to the information generated by mod_diagnostics (footnote 36).

2 (not yet implemented)
Will dump the full data passing through to a tempfile before the provider. For single-user debug only; this will not support concurrent hits.
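A short sketch of how tracing might be switched on while debugging a provider, reusing the SSI smart filter from the examples earlier in this section. Level 1 is used because level 2 is described above as not yet implemented. mod_include is assumed to be loaded.

```apache
# Smart filter from the SSI example.
FilterProvider SSI INCLUDES "%{CONTENT_TYPE} =~ m|^text/html|"
FilterChain SSI

# Record the buckets and brigades reaching the SSI harness in the
# error log, before the INCLUDES provider has processed them.
FilterTrace SSI 1
```

Setting the level back to 0, or removing the FilterTrace line, turns the extra logging off again.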
36 http://apache.webthing.com/mod_diagnostics/

10.53 Apache Module mod_firehose

Description: Multiplexes all I/O to a given file or pipe.
Status: Extension
ModuleIdentifier: firehose_module
SourceFile: mod_firehose.c

Summary

mod_firehose provides a mechanism to record data being passed between the httpd server and the client at the raw connection level to either a file or a pipe in such a way that the data can be analysed or played back to the server at a future date. It can be thought of as "tcpdump for httpd".

Connections are recorded after the SSL has been stripped, and can be used for forensic debugging.

The firehose tool can be used to demultiplex the recorded stream back into individual files for analysis, or playback using a tool like netcat.

=⇒ WARNING
This module IGNORES all request level mechanisms to keep data private. It is the responsibility of the administrator to ensure that private data is not inadvertently exposed using this module.

Directives
• FirehoseConnectionInput
• FirehoseConnectionOutput
• FirehoseProxyConnectionInput
• FirehoseProxyConnectionOutput
• FirehoseRequestInput
• FirehoseRequestOutput

See also
• firehose

Enabling a Firehose

To enable the module, it should be compiled and loaded into your running Apache configuration, and the directives below used to record the data you are interested in.

It is possible to record both incoming and outgoing data to the same filename if desired, as the direction of flow is recorded within each fragment.

It is possible to write to both normal files and fifos (pipes). In the case of fifos, mod_firehose ensures that the packet size is no larger than PIPE_BUF to ensure writes are atomic.

If a pipe is being used, something must be reading from the pipe before httpd is started for the pipe to be successfully opened for write.
If the request to open the pipe fails, mod_firehose will silently stand down and not record anything, and the server will keep running as normal.

By default, all attempts to write will block the server. If the webserver has been built against APR v2.0 or later, and an optional "nonblock" parameter is specified, all file writes will be non-blocking, and buffer overflows will cause debugging data to be lost. In this case it is possible to prioritise the running of the server over the recording of firehose data.

Stream Format

The server typically serves multiple connections simultaneously, and as a result requests and responses need to be multiplexed before being written to the firehose.

The fragment format is designed as clear text, so that a firehose can be opened with and inspected by a normal text editor. Alternatively, the firehose tool can be used to demultiplex the firehose back into individual requests or connections.

The size of the multiplexed fragments is governed by PIPE_BUF, the maximum size of write the system is prepared to perform atomically. By keeping the multiplexed fragments below PIPE_BUF in size, the module guarantees that data from different fragments does not interleave. The size of PIPE_BUF varies on different operating systems.

The BNF for the fragment format is as follows:

    stream = 0*(fragment)
    fragment = header CRLF body CRLF
    header = length SPC timestamp SPC ( request | response ) SPC uuid SPC count
    length =
    timestamp =
    request = "<"
    response = ">"
    uuid =
    count =
    body =
    SPC =
    CRLF =

All fragments for a connection or a request will share the same UUID, depending on whether connections or requests are being recorded. If connections are being recorded, multiple requests may appear within a connection. A fragment with a zero length indicates the end of the connection.

Fragments may go missing or be dropped if the process reading the fragments is too slow.
If this happens, gaps will exist in the connection counter numbering. A warning will be logged in the error log to indicate the UUID and counter of the dropped fragment, so it can be confirmed the fragment was dropped.

It is possible that the terminating empty fragment may not appear, caused by the httpd process crashing, or being terminated ungracefully. The terminating fragment may be dropped if the process reading the fragments is not fast enough.

FirehoseConnectionInput Directive

Description: Capture traffic coming into the server on each connection
Syntax: FirehoseConnectionInput [ block | nonblock ] filename
Default: none
Context: server config
Status: Extension
Module: mod_firehose
Compatibility: FirehoseConnectionInput is only available in Apache 2.5.0 and later.

Capture traffic coming into the server on each connection. Multiple requests will be captured within the same connection if keepalive is present.

Example

    FirehoseConnectionInput connection-input.firehose

FirehoseConnectionOutput Directive

Description: Capture traffic going out of the server on each connection
Syntax: FirehoseConnectionOutput [ block | nonblock ] filename
Default: none
Context: server config
Status: Extension
Module: mod_firehose
Compatibility: FirehoseConnectionOutput is only available in Apache 2.5.0 and later.

Capture traffic going out of the server on each connection. Multiple requests will be captured within the same connection if keepalive is present.

Example

    FirehoseConnectionOutput connection-output.firehose

FirehoseProxyConnectionInput Directive

Description: Capture traffic coming into the back of mod_proxy
Syntax: FirehoseProxyConnectionInput [ block | nonblock ] filename
Default: none
Context: server config
Status: Extension
Module: mod_firehose
Compatibility: FirehoseProxyConnectionInput is only available in Apache 2.5.0 and later.

Capture traffic being received by mod_proxy.
Example

    FirehoseProxyConnectionInput proxy-input.firehose

FirehoseProxyConnectionOutput Directive

Description: Capture traffic sent out from the back of mod_proxy
Syntax: FirehoseProxyConnectionOutput [ block | nonblock ] filename
Default: none
Context: server config
Status: Extension
Module: mod_firehose
Compatibility: FirehoseProxyConnectionOutput is only available in Apache 2.5.0 and later.

Capture traffic being sent out by mod_proxy.

Example

    FirehoseProxyConnectionOutput proxy-output.firehose

FirehoseRequestInput Directive

Description: Capture traffic coming into the server on each request
Syntax: FirehoseRequestInput [ block | nonblock ] filename
Default: none
Context: server config
Status: Extension
Module: mod_firehose
Compatibility: FirehoseRequestInput is only available in Apache 2.5.0 and later.

Capture traffic coming into the server on each request. Requests will be captured separately, regardless of the presence of keepalive.

Example

    FirehoseRequestInput request-input.firehose

FirehoseRequestOutput Directive

Description: Capture traffic going out of the server on each request
Syntax: FirehoseRequestOutput [ block | nonblock ] filename
Default: none
Context: server config
Status: Extension
Module: mod_firehose
Compatibility: FirehoseRequestOutput is only available in Apache 2.5.0 and later.

Capture traffic going out of the server on each request. Requests will be captured separately, regardless of the presence of keepalive.

Example

    FirehoseRequestOutput request-output.firehose

10.54 Apache Module mod_headers

Description: Customization of HTTP request and response headers
Status: Extension
ModuleIdentifier: headers_module
SourceFile: mod_headers.c

Summary

This module provides directives to control and modify HTTP request and response headers. Headers can be merged, replaced or removed.
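The three kinds of change mentioned above can be sketched with one Header directive each. The X-Example-* header names and their values are illustrative assumptions, not headers that the server or any module defines.

```apache
# Merge: add "no-cache" to Cache-Control unless that value is
# already present in the header's comma-delimited list.
Header merge Cache-Control no-cache

# Replace: set the header, overwriting any existing value.
Header set X-Example-Served-By "frontend-1"

# Remove: drop the header from the response if it is present.
Header unset X-Example-Debug
```

RequestHeader accepts the same merge, set and unset actions for headers on the incoming request.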
Directives
• Header
• RequestHeader

Order of Processing

The directives provided by mod_headers can occur almost anywhere within the server configuration, and can be limited in scope by enclosing them in configuration sections (p. 35).

Order of processing is important and is affected both by the order in the configuration file and by placement in configuration sections (p. 35). These two directives have a different effect if reversed:

    RequestHeader append MirrorID "mirror 12"
    RequestHeader unset MirrorID

This way round, the MirrorID header is not set. If reversed, the MirrorID header is set to "mirror 12".

Early and Late Processing

mod_headers can be applied either early or late in the request. The normal mode is late, when Request Headers are set immediately before running the content generator and Response Headers just as the response is sent down the wire. Always use Late mode in an operational server.

Early mode is designed as a test/debugging aid for developers. Directives defined using the early keyword are set right at the beginning of processing the request. This means they can be used to simulate different requests and set up test cases, but it also means that headers may be changed at any time by other modules before generating a Response.

Because early directives are processed before the request path's configuration is traversed, early headers can only be set in a main server or virtual host context. Early directives cannot depend on a request path, so they will fail in contexts such as <Directory> or <Location>.

Examples

1. Copy all request headers that begin with "TS" to the response headers:

    Header echo ^TS

2. Add a header, MyHeader, to the response including a timestamp for when the request was received and how long it took to begin serving the request. This header can be used by the client to intuit load on the server or in isolating bottlenecks between the client and the server.
    Header set MyHeader "%D %t"

   results in this header being added to the response:

    MyHeader: D=3775428 t=991424704447256

3. Say hello to Joe:

    Header set MyHeader "Hello Joe. It took %D microseconds for Apache to serve this request."

   results in this header being added to the response:

    MyHeader: Hello Joe. It took D=3775428 microseconds for Apache to serve this request.

4. Conditionally send MyHeader on the response if and only if header MyRequestHeader is present on the request. This is useful for constructing headers in response to some client stimulus. Note that this example requires the services of the mod_setenvif module.

    SetEnvIf MyRequestHeader myvalue HAVE_MyRequestHeader
    Header set MyHeader "%D %t mytext" env=HAVE_MyRequestHeader

   If the header MyRequestHeader: myvalue is present on the HTTP request, the response will contain the following header:

    MyHeader: D=3775428 t=991424704447256 mytext

5. Enable DAV to work with Apache running HTTP through SSL hardware (problem description [37]) by replacing https: with http: in the Destination header:

    RequestHeader edit Destination ^https: http: early

   [37] http://svn.haxx.se/users/archive-2006-03/0549.shtml

6. Set the same header value under multiple nonexclusive conditions, but do not duplicate the value in the final header. If all of the following conditions applied to a request (i.e., if the CGI, NO_CACHE and NO_STORE environment variables all existed for the request):

    Header merge Cache-Control no-cache env=CGI
    Header merge Cache-Control no-cache env=NO_CACHE
    Header merge Cache-Control no-store env=NO_STORE

   then the response would contain the following header:

    Cache-Control: no-cache, no-store

   If append was used instead of merge, then the response would contain the following header:

    Cache-Control: no-cache, no-cache, no-store

7. Set a test cookie if and only if the client didn't send us a cookie:

    Header set Set-Cookie testcookie "expr=-z %{req:Cookie}"

8. Append a caching header for responses with an HTTP status code of 200:

    Header append Cache-Control s-maxage=600 "expr=%{REQUEST_STATUS} == 200"

Header Directive

Description:    Configure HTTP response headers
Syntax:         Header [condition] add|append|echo|edit|edit*|merge|set|setifempty|unset|note header [[expr=]value [replacement] [early|env=[!]varname|expr=expression]]
Context:        server config, virtual host, directory, .htaccess
Override:       FileInfo
Status:         Extension
Module:         mod_headers
Compatibility:  SetIfEmpty available in 2.4.7 and later, expr=value available in 2.4.10 and later

This directive can replace, merge or remove HTTP response headers. The header is modified just after the content handler and output filters are run, allowing outgoing headers to be modified.

The optional condition argument determines which internal table of response headers this directive will operate against. Despite the name, the default value of onsuccess does not limit an action to responses with a 2xx status code. Headers set under this condition are still used when, for example, a request is successfully proxied or generated by CGI, even when they have generated a failing status code.

When your action is a function of an existing header, you may need to specify a condition of always, depending on which internal table the original header was set in. The table that corresponds to always is used for locally generated error responses as well as successful responses. Note also that repeating this directive with both conditions makes sense in some scenarios because always is not a superset of onsuccess with respect to existing headers:

• You're adding a header to a locally generated non-success (non-2xx) response, such as a redirect, in which case only the table corresponding to always is used in the ultimate response.
• You're modifying or removing a header generated by a CGI script, in which case the headers from the CGI script are in the table corresponding to always and not in the default table.
• You're modifying or removing a header generated by some piece of the server but that header is not being found by the default onsuccess condition.

Separately from the condition parameter described above, you can limit an action based on HTTP status codes for e.g. proxied or CGI requests. See the example that uses %{REQUEST_STATUS} in the section above.

The action it performs is determined by the first argument (second argument if a condition is specified). This can be one of the following values:

add
    The response header is added to the existing set of headers, even if this header already exists. This can result in two (or more) headers having the same name. This can lead to unforeseen consequences, and in general set, append or merge should be used instead.

append
    The response header is appended to any existing header of the same name. When a new value is merged onto an existing header it is separated from the existing header with a comma. This is the HTTP standard way of giving a header multiple values.

echo
    Request headers with this name are echoed back in the response headers. header may be a regular expression. value must be omitted.

edit
edit*
    If this response header exists, its value is transformed according to a regular expression search-and-replace. The value argument is a regular expression, and the replacement is a replacement string, which may contain backreferences or format specifiers. The edit form will match and replace exactly once in a header value, whereas the edit* form will replace every instance of the search pattern if it appears more than once.

merge
    The response header is appended to any existing header of the same name, unless the value to be appended already appears in the header's comma-delimited list of values. When a new value is merged onto an existing header it is separated from the existing header with a comma. This is the HTTP standard way of giving a header multiple values.
    Values are compared in a case sensitive manner, and after all format specifiers have been processed. Values in double quotes are considered different from otherwise identical unquoted values.

set
    The response header is set, replacing any previous header with this name. The value may be a format string.

setifempty
    The response header is set, but only if there is no previous header with this name. Available in 2.4.7 and later.

unset
    The response header of this name is removed, if it exists. If there are multiple headers of the same name, all will be removed. value must be omitted.

note
    The value of the named response header is copied into an internal note whose name is given by value. This is useful if a header sent by a CGI or proxied resource is configured to be unset but should also be logged. Available in 2.4.7 and later.

This argument is followed by a header name, which can include the final colon, but it is not required. Case is ignored for set, append, merge, add, unset and edit. The header name for echo is case sensitive and may be a regular expression.

For set, append, merge and add a value is specified as the next argument. If value contains spaces, it should be surrounded by double quotes. value may be a character string, a string containing mod_headers specific format specifiers (and character literals), or an ap_expr (p. 99) expression prefixed with expr=.

The following format specifiers are supported in value:

Format         Description
%%             The percent sign.
%t             The time the request was received in Universal Coordinated Time since the epoch (Jan. 1, 1970) measured in microseconds. The value is preceded by t=.
%D             The time from when the request was received to the time the headers are sent on the wire. This is a measure of the duration of the request. The value is preceded by D=. The value is measured in microseconds.
%l             The current load averages of the actual server itself. It is designed to expose the values obtained by getloadavg() and this represents the current load average, the 5 minute average, and the 15 minute average. The value is preceded by l= with each average separated by /. Available in 2.4.4 and later.
%i             The current idle percentage of httpd (0 to 100) based on available processes and threads. The value is preceded by i=. Available in 2.4.4 and later.
%b             The current busy percentage of httpd (0 to 100) based on available processes and threads. The value is preceded by b=. Available in 2.4.4 and later.
%{VARNAME}e    The contents of the environment variable (p. 92) VARNAME.
%{VARNAME}s    The contents of the SSL environment variable (p. 916) VARNAME, if mod_ssl is enabled.

=⇒ Note
The %s format specifier is only available in Apache 2.1 and later; it can be used instead of %e to avoid the overhead of enabling SSLOptions +StdEnvVars. If SSLOptions +StdEnvVars must be enabled anyway for some other reason, %e will be more efficient than %s.

=⇒ Note on expression values
When the value parameter uses the ap_expr (p. 99) parser, some expression syntax will differ from examples that evaluate boolean expressions:
• The starting point of the grammar is 'string' rather than 'expr'.
• Function calls use the %{funcname:arg} syntax rather than funcname(arg).
• Multi-argument functions are not currently accessible from this starting point.
• Quote the entire parameter, such as

    Header set foo-checksum "expr=%{md5:foo}"

For edit there is both a value argument which is a regular expression, and an additional replacement string. As of version 2.4.7 the replacement string may also contain format specifiers.

The Header directive may be followed by an additional argument, which may be any of:

early
    Specifies early processing.

env=[!]varname
    The directive is applied if and only if the environment variable (p. 92) varname exists. A !
    in front of varname reverses the test, so the directive applies only if varname is unset.

expr=expression
    The directive is applied if and only if expression evaluates to true. Details of expression syntax and evaluation are documented in the ap_expr (p. 99) documentation.

    # This delays the evaluation of the condition clause compared to a conditional section
    Header always set CustomHeader my-value "expr=%{REQUEST_URI} =~ m#^/special_path.php$#"

Except in early mode, the Header directives are processed just before the response is sent to the network. This means that it is possible to set and/or override most headers, except for some headers added by the HTTP header filter. Prior to 2.2.12, it was not possible to change the Content-Type header with this directive.

RequestHeader Directive

Description:    Configure HTTP request headers
Syntax:         RequestHeader add|append|edit|edit*|merge|set|setifempty|unset header [[expr=]value [replacement] [early|env=[!]varname|expr=expression]]
Context:        server config, virtual host, directory, .htaccess
Override:       FileInfo
Status:         Extension
Module:         mod_headers
Compatibility:  SetIfEmpty available in 2.4.7 and later, expr=value available in 2.4.10 and later

This directive can replace, merge, change or remove HTTP request headers. The header is modified just before the content handler is run, allowing incoming headers to be modified. The action it performs is determined by the first argument. This can be one of the following values:

add
    The request header is added to the existing set of headers, even if this header already exists. This can result in two (or more) headers having the same name. This can lead to unforeseen consequences, and in general set, append or merge should be used instead.

append
    The request header is appended to any existing header of the same name. When a new value is merged onto an existing header it is separated from the existing header with a comma.
    This is the HTTP standard way of giving a header multiple values.

edit
edit*
    If this request header exists, its value is transformed according to a regular expression search-and-replace. The value argument is a regular expression, and the replacement is a replacement string, which may contain backreferences or format specifiers. The edit form will match and replace exactly once in a header value, whereas the edit* form will replace every instance of the search pattern if it appears more than once.

merge
    The request header is appended to any existing header of the same name, unless the value to be appended already appears in the existing header's comma-delimited list of values. When a new value is merged onto an existing header it is separated from the existing header with a comma. This is the HTTP standard way of giving a header multiple values. Values are compared in a case sensitive manner, and after all format specifiers have been processed. Values in double quotes are considered different from otherwise identical unquoted values.

set
    The request header is set, replacing any previous header with this name.

setifempty
    The request header is set, but only if there is no previous header with this name. Available in 2.4.7 and later.

unset
    The request header of this name is removed, if it exists. If there are multiple headers of the same name, all will be removed. value must be omitted.

This argument is followed by a header name, which can include the final colon, but it is not required. Case is ignored. For set, append, merge and add a value is given as the third argument. If a value contains spaces, it should be surrounded by double quotes. For unset, no value should be given. value may be a character string, a string containing format specifiers or a combination of both. The supported format specifiers are the same as for the Header directive; see there for details.
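As a brief illustration of the actions described so far, the following sketch shows a few RequestHeader lines. The header names (X-Forwarded-Proto, X-Debug-Trace, X-Request-Start, X-Internal-Token) and their values are assumptions chosen for illustration; they do not come from this documentation.

```apache
# Hypothetical header names and values, for illustration only.

# Replace any X-Forwarded-Proto header sent by the client.
RequestHeader set X-Forwarded-Proto "https"

# Append a value; an existing value is kept and separated by a comma.
RequestHeader append X-Debug-Trace "frontend-1"

# Set a header only when the request does not already carry one.
# %t expands to the request receive time (t=...), as described for Header.
RequestHeader setifempty X-Request-Start "%t"

# Remove a header entirely; no value is given for unset.
RequestHeader unset X-Internal-Token
```

Because order of processing matters, directives that touch the same header within one context are best kept next to each other so that their combined effect is easy to read.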
For edit both a value and a replacement are required, and are a regular expression and a replacement string respectively.

The RequestHeader directive may be followed by an additional argument, which may be any of:

early
    Specifies early processing.

env=[!]varname
    The directive is applied if and only if the environment variable (p. 92) varname exists. A ! in front of varname reverses the test, so the directive applies only if varname is unset.

expr=expression
    The directive is applied if and only if expression evaluates to true. Details of expression syntax and evaluation are documented in the ap_expr (p. 99) documentation.

Except in early mode, the RequestHeader directive is processed just before the request is run by its handler in the fixup phase. This should allow headers generated by the browser, or by Apache input filters, to be overridden or modified.

10.55 Apache Module mod_heartbeat

Description:       Sends messages with server status to frontend proxy
Status:            Experimental
ModuleIdentifier:  heartbeat_module
SourceFile:        mod_heartbeat
Compatibility:     Available in Apache 2.3 and later

Summary

mod_heartbeat sends multicast messages that advertise the server's current connection count to a mod_heartmonitor listener. Usually, mod_heartmonitor will be running on a proxy server with mod_lbmethod_heartbeat loaded, which allows the "heartbeat" lbmethod to be used inside of ProxyPass.

mod_heartbeat itself is loaded on the origin server(s) that serve requests through the proxy server(s).

! To use mod_heartbeat, mod_status and mod_watchdog must either be static modules or, if they are dynamic modules, be loaded before mod_heartbeat.

Directives
• HeartbeatAddress

Consuming mod_heartbeat Output

Every second, this module generates a single multicast UDP packet containing the number of busy and idle workers. The packet is a simple ASCII format, similar to GET query parameters in HTTP.
An Example Packet
    v=1&ready=75&busy=0

Consumers should handle new variables besides busy and ready, separated by '&', being added in the future.

HeartbeatAddress Directive

Description:  Multicast address for heartbeat packets
Syntax:       HeartbeatAddress addr:port
Default:      disabled
Context:      server config
Status:       Experimental
Module:       mod_heartbeat

The HeartbeatAddress directive specifies the multicast address to which mod_heartbeat will send status information. This address will usually correspond to a configured HeartbeatListen on a frontend proxy system.

    HeartbeatAddress 239.0.0.1:27999

10.56 Apache Module mod_heartmonitor

Description:       Centralized monitor for mod_heartbeat origin servers
Status:            Experimental
ModuleIdentifier:  heartmonitor_module
SourceFile:        mod_heartmonitor.c
Compatibility:     Available in Apache 2.3 and later

Summary

mod_heartmonitor listens for server status messages generated by mod_heartbeat enabled origin servers and makes their status available to mod_lbmethod_heartbeat. This allows the "heartbeat" lbmethod to be used inside of ProxyPass.

This module uses the services of mod_slotmem_shm when available instead of flat-file storage. No configuration is required to use mod_slotmem_shm.

! To use mod_heartmonitor, mod_status and mod_watchdog must either be static modules or, if they are dynamic modules, be loaded before mod_heartmonitor.

Directives
• HeartbeatListen
• HeartbeatMaxServers
• HeartbeatStorage

HeartbeatListen Directive

Description:  Multicast address to listen for incoming heartbeat requests
Syntax:       HeartbeatListen addr:port
Default:      disabled
Context:      server config
Status:       Experimental
Module:       mod_heartmonitor

The HeartbeatListen directive specifies the multicast address on which the server will listen for status information from mod_heartbeat-enabled servers. This address will usually correspond to a configured HeartbeatAddress on an origin server.
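The pairing between HeartbeatAddress and HeartbeatListen could look like the sketch below. The multicast address is the one used in the examples of this documentation; the balancer name, the backend host names, the /app/ path and the split into two configuration files are assumptions made for illustration.

```apache
# --- Origin server (mod_heartbeat loaded) ---
# Send status packets to the multicast group.
HeartbeatAddress 239.0.0.1:27999

# --- Frontend proxy (mod_heartmonitor and mod_lbmethod_heartbeat loaded) ---
# Listen on the same multicast address and port.
HeartbeatListen 239.0.0.1:27999

# Hypothetical balancer and backend names.
<Proxy "balancer://origins">
    BalancerMember "http://origin1.example.com:80"
    BalancerMember "http://origin2.example.com:80"
</Proxy>
ProxyPass "/app/" "balancer://origins/" lbmethod=heartbeat
```

The point of the sketch is only that the address and port on both sides must match; everything else would follow the usual mod_proxy_balancer setup.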
    HeartbeatListen 239.0.0.1:27999

This module is inactive until this directive is used.

HeartbeatMaxServers Directive

Description:  Specifies the maximum number of servers that will be sending heartbeat requests to this server
Syntax:       HeartbeatMaxServers number-of-servers
Default:      HeartbeatMaxServers 10
Context:      server config
Status:       Experimental
Module:       mod_heartmonitor

The HeartbeatMaxServers directive specifies the maximum number of servers that will be sending requests to this monitor server. It is used to control the size of the shared memory allocated to store the heartbeat info when mod_slotmem_shm is in use.

HeartbeatStorage Directive

Description:  Path to store heartbeat data
Syntax:       HeartbeatStorage file-path
Default:      HeartbeatStorage logs/hb.dat
Context:      server config
Status:       Experimental
Module:       mod_heartmonitor

The HeartbeatStorage directive specifies the path to store heartbeat data. This flat-file is used only when mod_slotmem_shm is not loaded.

10.57 Apache Module mod_http2

Description:       Support for the HTTP/2 transport layer
Status:            Extension
ModuleIdentifier:  http2_module
SourceFile:        mod_http2.c
Compatibility:     Available in version 2.4.17 and later

Summary

This module provides HTTP/2 (RFC 7540 [38]) support for the Apache HTTP Server.

This module relies on libnghttp2 [39] to provide the core HTTP/2 engine.

! Warning
This module is experimental. Its behaviors, directives, and defaults are subject to more change from release to release relative to other standard modules. Users are encouraged to consult the "CHANGES" file for potential updates.

You must enable HTTP/2 via Protocols in order to use the functionality described in this document. The HTTP/2 protocol does not require [40] the use of encryption so two schemes are available: h2 (HTTP/2 over TLS) and h2c (HTTP/2 over TCP).
Two useful configuration schemes are:

=⇒ HTTP/2 in a VirtualHost context (TLS only)

    Protocols h2 http/1.1

Allows HTTP/2 negotiation (h2) via TLS ALPN in a secure virtual host. HTTP/2 preamble checking (Direct mode, see H2Direct) is disabled by default for h2.

=⇒ HTTP/2 in a Server context (TLS and cleartext)

    Protocols h2 h2c http/1.1

Allows HTTP/2 negotiation (h2) via TLS ALPN for secure virtual hosts. Allows HTTP/2 cleartext negotiation (h2c) upgrading from an initial HTTP/1.1 connection or via HTTP/2 preamble checking (Direct mode, see H2Direct).

Refer to the official HTTP/2 FAQ [41] for any doubt about the protocol.

[38] https://tools.ietf.org/html/rfc7540
[39] http://nghttp2.org/
[40] https://http2.github.io/faq/#does-http2-require-encryption
[41] https://http2.github.io/faq

Directives
• H2Direct
• H2MaxSessionStreams
• H2MaxWorkerIdleSeconds
• H2MaxWorkers
• H2MinWorkers
• H2ModernTLSOnly
• H2Push
• H2PushDiarySize
• H2PushPriority
• H2SerializeHeaders
• H2SessionExtraFiles
• H2StreamMaxMemSize
• H2TLSCoolDownSecs
• H2TLSWarmUpSize
• H2Upgrade
• H2WindowSize

How it works

HTTP/2 Dimensioning

Enabling HTTP/2 on your Apache Server has an impact on resource consumption, and if you have a busy site, you may need to consider the implications carefully.

The first noticeable thing after enabling HTTP/2 is that your server processes will start additional threads. The reason for this is that HTTP/2 gives all requests that it receives to its own worker threads for processing, collects the results and streams them out to the client.

In the current implementation, these workers use a separate thread pool from the MPM workers that you might be familiar with. This is just how things are right now and not intended to be like this forever. (It might be forever for the 2.4.x release line, though.) So, HTTP/2 workers, or H2Workers for short, will not show up in mod_status. They are also not counted against directives such as ThreadsPerChild.
However, they take ThreadsPerChild as their default if you have not configured something else via H2MinWorkers and H2MaxWorkers.

Another thing to watch out for is memory consumption. Since HTTP/2 keeps more state on the server to manage all the open requests, and the priorities for and dependencies between them, it will always need more memory than HTTP/1.1 processing. There are three directives which steer the memory footprint of an HTTP/2 connection: H2MaxSessionStreams, H2WindowSize and H2StreamMaxMemSize.

H2MaxSessionStreams limits the number of parallel requests that a client can make on an HTTP/2 connection. It depends on your site how many you should allow. The default is 100 which is plenty and unless you run into memory problems, I would keep it this way. Most requests that browsers send are GETs without a body, so they use up only a little bit of memory until the actual processing starts.

H2WindowSize controls how much the client is allowed to send as the body of a request before it waits for the server to encourage more. Or, the other way around, it is the amount of request body data the server needs to be able to buffer. This is per request.

And last, but not least, H2StreamMaxMemSize controls how much response data shall be buffered. The request sits in an H2Worker thread and is producing data, and the HTTP/2 connection tries to send this to the client. If the client does not read fast enough, the connection will buffer this amount of data and then suspend the H2Worker.

If you serve a lot of static files, H2SessionExtraFiles is of interest. This tells the server how many file handles per HTTP/2 connection it is allowed to waste for better performance. When a request produces a static file as the response, the file handle gets passed around and is buffered, not the file contents. That allows serving many large files without wasting memory or copying data unnecessarily.
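The dimensioning directives discussed above could be combined as in the following sketch. The numbers are illustrative assumptions rather than recommendations (only the value 100 for H2MaxSessionStreams is stated as a default in this section); suitable values depend on a site's traffic and available memory.

```apache
# Illustrative values only; tune them for your own workload.

# Worker threads per child process used for HTTP/2 request processing.
H2MinWorkers 10
H2MaxWorkers 40

# Parallel requests a client may have open on one HTTP/2 connection.
H2MaxSessionStreams 100

# Request body bytes a client may send per stream before it has to
# wait for the server to allow more.
H2WindowSize 65535

# Response bytes buffered per stream before the H2Worker is suspended.
H2StreamMaxMemSize 65536

# File handles per connection that may be held open for static files.
H2SessionExtraFiles 5
```

A rough upper bound on per-connection buffering follows from multiplying the stream limit by the two buffer sizes, which is why lowering H2MaxSessionStreams is usually the first lever when memory becomes tight.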
However, file handles are a limited resource for a process, and if too many are used this way, requests may fail under load because the limit on open handles has been exceeded.

Multiple Hosts and Misdirected Requests

Many sites use the same TLS certificate for multiple virtual hosts. The certificate either has a wildcard name, such as '*.example.org', or carries several alternate names. Browsers using HTTP/2 will recognize that and reuse an already opened connection for such hosts.

While this is great for performance, it comes at a price: such vhosts need more care in their configuration. The problem is that you will have multiple requests for multiple hosts on the same TLS connection. That makes renegotiation impossible; in fact, the HTTP/2 standard forbids it.

So, if you have several virtual hosts using the same certificate and want to use HTTP/2 for them, you need to make sure that all vhosts have exactly the same SSL configuration. You need the same protocol, ciphers and settings for client verification.

If you mix things, Apache httpd will detect it and return a special response code, 421 Misdirected Request, to the client.

Environment Variables

This module can be configured to provide HTTP/2 related information as additional environment variables to the SSI and CGI namespace, as well as in custom log configurations (see %{VAR_NAME}e).

Variable Name    Value Type   Description
HTTP2            flag         HTTP/2 is being used.
H2PUSH           flag         HTTP/2 Server Push is enabled for this connection and also supported by the client.
H2_PUSH          flag         Alternate name for H2PUSH.
H2_PUSHED        string       Empty or PUSHED for a request being pushed by the server.
H2_PUSHED_ON     number       HTTP/2 stream number that triggered the push of this request.
H2_STREAM_ID     number       HTTP/2 stream number of this request.
H2_STREAM_TAG    string       HTTP/2 process unique stream identifier, consisting of connection id and stream id separated by -.

H2Direct Directive

Description:  H2 Direct Protocol Switch
Syntax:       H2Direct on|off
Default:      H2Direct on for h2c, off for h2 protocol
Context:      server config, virtual host
Status:       Extension
Module:       mod_http2

This directive toggles the usage of the HTTP/2 Direct Mode. This should be used inside a virtual host section to enable direct HTTP/2 communication for that virtual host.

Direct communication means that if the first bytes received by the server on a connection match the HTTP/2 preamble, the HTTP/2 protocol is switched to immediately without further negotiation. This mode is defined in RFC 7540 for the cleartext (h2c) case. Its use on TLS connections is not mandated by the standard.

When a server/vhost does not have h2 or h2c enabled via Protocols, the connection is never inspected for an HTTP/2 preamble. H2Direct does not matter then. This is important for connections that use protocols where an initial read might hang indefinitely, such as NNTP.

For clients that have out-of-band knowledge about a server supporting h2c, direct HTTP/2 saves the client from having to perform an HTTP/1.1 upgrade, resulting in better performance and avoiding the Upgrade restrictions on request bodies.

This makes direct h2c attractive for server to server communication as well, when the connection can be trusted or is secured by other means.

Example
    H2Direct on

H2MaxSessionStreams Directive

Description:  Maximum number of active streams per HTTP/2 session.
Syntax:       H2MaxSessionStreams n
Default:      H2MaxSessionStreams 100
Context:      server config, virtual host
Status:       Extension
Module:       mod_http2

This directive sets the maximum number of active streams per HTTP/2 session (i.e. connection) that the server allows. A stream is active if it is not idle or closed according to RFC 7540.
Example
    H2MaxSessionStreams 20

H2MaxWorkerIdleSeconds Directive

Description:  Maximum number of seconds h2 workers remain idle until shut down.
Syntax:       H2MaxWorkerIdleSeconds n
Default:      H2MaxWorkerIdleSeconds 600
Context:      server config
Status:       Extension
Module:       mod_http2

This directive sets the maximum number of seconds an h2 worker may idle until it shuts itself down. This only happens while the number of h2 workers exceeds H2MinWorkers.

Example
    H2MaxWorkerIdleSeconds 20

H2MaxWorkers Directive

Description:  Maximum number of worker threads to use per child process.
Syntax:       H2MaxWorkers n
Context:      server config
Status:       Extension
Module:       mod_http2

This directive sets the maximum number of worker threads to spawn per child process for HTTP/2 processing. If this directive is not used, mod_http2 will choose a value suitable for the MPM module loaded.

Example
    H2MaxWorkers 20

H2MinWorkers Directive

Description:  Minimum number of worker threads to use per child process.
Syntax:       H2MinWorkers n
Context:      server config
Status:       Extension
Module:       mod_http2

This directive sets the minimum number of worker threads to spawn per child process for HTTP/2 processing. If this directive is not used, mod_http2 will choose a value suitable for the MPM module loaded.

Example
    H2MinWorkers 10

H2ModernTLSOnly Directive

Description:    Require HTTP/2 connections to be "modern TLS" only
Syntax:         H2ModernTLSOnly on|off
Default:        H2ModernTLSOnly on
Context:        server config, virtual host
Status:         Extension
Module:         mod_http2
Compatibility:  Available in version 2.4.18 and later.

This directive toggles the security checks on HTTP/2 connections in TLS mode (https:). This can be used server wide or for specific virtual hosts.

The security checks require that the TLS protocol is at least TLSv1.2 and that none of the ciphers listed in RFC 7540, Appendix A is used. These checks will be extended once new security requirements come into place.
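A mod_ssl configuration intended to pass these checks might look like the sketch below. The cipher list is an assumption chosen for illustration (AEAD suites with ephemeral key exchange, which do not appear on the RFC 7540 Appendix A list); current TLS guidance should be consulted before adopting it.

```apache
# Sketch only: restrict TLS to a protocol version acceptable for HTTP/2.
SSLProtocol all -SSLv3 -TLSv1 -TLSv1.1

# Illustrative cipher selection; the exact list is an assumption.
SSLCipherSuite ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256:ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384
SSLHonorCipherOrder on

# Offer HTTP/2 over TLS and keep the checks enabled (the default).
Protocols h2 http/1.1
H2ModernTLSOnly on
```

With such a setup, a client that can only negotiate an older protocol or a listed cipher would be expected to fall back to HTTP/1.1 rather than obtain an HTTP/2 connection.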
The name stems from the Security/Server Side TLS [42] definitions at Mozilla where "modern compatibility" is defined. Mozilla Firefox and other browsers require modern compatibility for HTTP/2 connections. As with everything in OpSec, this is a moving target and can be expected to evolve in the future.

One purpose of having these checks in mod_http2 is to enforce this security level for all connections, not only those from browsers. The other purpose is to prevent the negotiation of HTTP/2 as a protocol should the requirements not be met.

Ultimately, the security of the TLS connection is determined by the server configuration directives for mod_ssl.

Example
    H2ModernTLSOnly off

[42] https://wiki.mozilla.org/Security/Server_Side_TLS

H2Push Directive

Description:    H2 Server Push Switch
Syntax:         H2Push on|off
Default:        H2Push on
Context:        server config, virtual host
Status:         Extension
Module:         mod_http2
Compatibility:  Available in version 2.4.18 and later.

This directive toggles the usage of the HTTP/2 server push protocol feature. This should be used inside a virtual host section to control server push for that virtual host.

The HTTP/2 protocol allows the server to push other resources to a client when it asked for a particular one. This is helpful if those resources are connected in some way and the client can be expected to ask for them anyway. The pushing then saves the time it takes the client to ask for the resources itself. On the other hand, pushing resources the client never needs or already has is a waste of bandwidth.

Server pushes are detected by inspecting the Link headers of responses (see https://tools.ietf.org/html/rfc5988 for the specification). When a link thus specified has the rel=preload attribute, it is treated as a resource to be pushed.
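As a sketch of what such a Link header could look like when set through mod_headers, the lines below announce two resources for a single page. The paths /page.html, /css/site.css and /js/site.js are hypothetical and serve only to show the shape of the header value.

```apache
# Hypothetical paths; the Link headers are added only to this one page.
<Location "/page.html">
    # The target URI goes in angle brackets, followed by rel=preload.
    Header add Link "</css/site.css>;rel=preload"
    Header add Link "</js/site.js>;rel=preload"
</Location>
```

Scoping the headers to the page that actually references the resources keeps pushes from being triggered on unrelated responses.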
Link headers in responses are either set by the application or can be configured via mod_headers as:

mod_headers example
    Header add Link ";rel=preload"
    Header add Link ";rel=preload"

As the example shows, there can be several Link headers added to a response, resulting in several pushes being triggered. There are no checks in the module to avoid pushing the same resource twice or more to one client. Use with care.

HTTP/2 server pushes are enabled by default. This directive allows them to be switched off on all resources of this server/virtual host.

Example
    H2Push off

Last but not least, pushes happen only when the client signals its willingness to accept them. Most browsers do; some, like Safari 9, do not. Also, pushes only happen for resources from the same authority as the original response.

H2PushDiarySize Directive

Description:    H2 Server Push Diary Size
Syntax:         H2PushDiarySize n
Default:        H2PushDiarySize 256
Context:        server config, virtual host
Status:         Extension
Module:         mod_http2
Compatibility:  Available in version 2.4.19 and later.

This directive sets the maximum number of HTTP/2 server pushes that are remembered per HTTP/2 connection. This can be used inside a virtual host section to influence the number for all connections to that virtual host.

The push diary records a digest (currently using a 64 bit number) of pushed resources (their URL) to avoid duplicate pushes on the same connection. These values are not persisted, so clients opening a new connection will experience known pushes again. There is ongoing work to enable a client to disclose a digest of the resources it already has, so the diary may be initialized by the client on each connection setup.

If the maximum size is reached, newer entries replace the oldest ones. A diary entry uses 8 bytes, letting a default diary with 256 entries consume around 2 KB of memory.

A size of 0 will effectively disable the push diary.
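The memory cost of the diary scales linearly with the configured size. As a sketch, assuming the 8 bytes per entry stated above, doubling the diary to 512 entries would use about 4 KB per connection. The value 512 is an illustrative choice, not a recommendation.

```apache
# 512 entries x 8 bytes = 4096 bytes (about 4 KB) per HTTP/2 connection.
H2PushDiarySize 512

# Alternatively, a size of 0 disables the diary, so duplicate pushes
# on the same connection are no longer suppressed:
# H2PushDiarySize 0
```

A larger diary mainly pays off on long-lived connections where many distinct resources are pushed; on short connections the default is likely sufficient.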
H2PushPriority Directive

Description: H2 Server Push Priority
Syntax: H2PushPriority mime-type [after|before|interleaved] [weight]
Default: H2PushPriority * After 16
Context: server config, virtual host
Status: Extension
Module: mod_http2
Compatibility: Available in version 2.4.18 and later. For having an effect, a nghttp2 library version 1.5.0 or newer is necessary.

This directive defines the priority handling of pushed responses based on the content-type of the response. This is usually defined per server config, but may also appear in a virtual host.

HTTP/2 server pushes are always related to a client request. Each such request/response pair, or stream, has a dependency and a weight, together defining the priority of the stream.

When a stream depends on another, say X depends on Y, then Y gets all bandwidth before X gets any. Note that this does not mean that Y will block X. If Y has no data to send, all bandwidth allocated to Y can be used by X.

When a stream has more than one dependant, say X1 and X2 both depend on Y, the weight determines the bandwidth allocation. If X1 and X2 have the same weight, they both get half of the available bandwidth. If the weight of X1 is twice as large as that for X2, X1 gets twice the bandwidth of X2.

Ultimately, every stream depends on the root stream, which gets all the bandwidth available but never sends anything. So all its bandwidth is distributed by weight among its children, which either have data to send or distribute the bandwidth to their own children, and so on. If none of the children have data to send, that bandwidth gets distributed somewhere else according to the same rules.

The purpose of this priority system is to always make use of available bandwidth while allowing precedence and weight to be given to specific streams. Since, normally, all streams are initiated by the client, it is also the one that sets these priorities.
Only when such a stream results in a PUSH does the server get to decide what the initial priority of the pushed stream is.

In the examples below, X is the client stream. It depends on Y and the server decides to PUSH streams P1 and P2 onto X.

The default priority rule is:

Default Priority Rule
H2PushPriority * After 16

which reads as 'Send a pushed stream of any content-type depending on the client stream with weight 16'. And so P1 and P2 will be sent after X and, as they have equal weight, share bandwidth equally among themselves.

Interleaved Priority Rule
H2PushPriority text/css Interleaved 256

which reads as 'Send any CSS resource on the same dependency and weight as the client stream'. If P1 has content-type 'text/css', it will depend on Y (as does X) and its effective weight will be calculated as P1ew = Xw * (P1w / 256). With P1w being 256, this will make the effective weight the same as the weight of X. If both X and P1 have data to send, bandwidth will be allocated to both equally.

With Pw specified as 512, a pushed, interleaved stream would get double the weight of X. With 128, only half as much. Note that effective weights are always capped at 256.

Before Priority Rule
H2PushPriority application/json Before

This says that any pushed stream of content type 'application/json' should be sent out before X. This makes P1 dependent on Y and X dependent on P1. So, X will be stalled as long as P1 has data to send. The effective weight is inherited from the client stream. Specifying a weight is not allowed.

Be aware that the effect of priority specifications is limited by the available server resources. If a server does not have workers available for pushed streams, the data for the stream may only ever arrive when other streams have been finished.

Last, but not least, there are some specifics of the syntax to be used in this directive:

1. '*' is the only special content-type that matches all others. 'image/*' will not work.
2. The default dependency is 'After'.
3. There are also default weights: for 'After' it is 16, 'interleaved' is 256.

Shorter Priority Rules
H2PushPriority application/json 32         # an After rule
H2PushPriority image/jpeg before           # weight inherited
H2PushPriority text/css   interleaved      # weight 256 default

H2SerializeHeaders Directive

Description: Serialize Request/Response Processing Switch
Syntax: H2SerializeHeaders on|off
Default: H2SerializeHeaders off
Context: server config, virtual host
Status: Extension
Module: mod_http2

This directive toggles whether HTTP/2 requests shall be serialized in HTTP/1.1 format for processing by httpd core or whether received binary data shall be passed into the request_recs directly.

Serialization will lower performance, but gives more backward compatibility in case custom filters/hooks need it.

Example
H2SerializeHeaders on

H2SessionExtraFiles Directive

Description: Number of Extra File Handles
Syntax: H2SessionExtraFiles n
Context: server config, virtual host
Status: Extension
Module: mod_http2

This directive sets the maximum number of extra file handles an HTTP/2 session is allowed to use. A file handle is counted as extra when it is transferred from an h2 worker thread to the main HTTP/2 connection handling. This commonly happens when serving static files.

Depending on the processing model configured on the server, the number of connections times the number of active streams may exceed the number of file handles for the process. On the other hand, converting every file into memory bytes early results in too many buffer writes. This option helps to mitigate that.

The number of file handles used by a server process is then in the order of:

(h2_connections * extra_files) + (h2_max_worker)

Example
H2SessionExtraFiles 10

If nothing is configured, the module tries to make a conservative guess at how many files are safe to use. This depends largely on the MPM chosen.
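As a rough aid for sizing H2SessionExtraFiles, the order-of-magnitude formula above can be evaluated with a few lines of Python. The function name and the sample numbers are hypothetical; the result is an estimate in the sense of the formula, not a value computed by the module.

```python
def estimated_file_handles(h2_connections, extra_files, h2_max_worker):
    """Evaluate (h2_connections * extra_files) + (h2_max_worker)."""
    return h2_connections * extra_files + h2_max_worker


# Hypothetical deployment: 100 HTTP/2 connections, H2SessionExtraFiles 10,
# and 25 h2 worker threads.
estimate = estimated_file_handles(100, 10, 25)
print(estimate)  # 1025
```

Comparing such an estimate with the per-process file descriptor limit of the operating system gives a first indication of whether a chosen value is plausible.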
H2StreamMaxMemSize Directive

Description: Maximum amount of output data buffered per stream.
Syntax: H2StreamMaxMemSize bytes
Default: H2StreamMaxMemSize 65536
Context: server config, virtual host
Status: Extension
Module: mod_http2

This directive sets the maximum number of outgoing data bytes buffered in memory for an active stream. This memory is not allocated per stream as such. Allocations are counted against this limit when they are about to be done. Stream processing freezes when the limit has been reached and will only continue when buffered data has been sent out to the client.

Example
H2StreamMaxMemSize 128000

H2TLSCoolDownSecs Directive

Description:
Syntax: H2TLSCoolDownSecs seconds
Default: H2TLSCoolDownSecs 1
Context: server config, virtual host
Status: Extension
Module: mod_http2
Compatibility: Available in version 2.4.18 and later.

This directive sets the number of seconds of idle time on a TLS connection before the TLS write size falls back to small (~1300 bytes) length. This can be used server wide or for specific <VirtualHost>s.

See H2TLSWarmUpSize for a description of TLS warmup. H2TLSCoolDownSecs reflects the fact that connections may deteriorate over time (and TCP flow adjusts) for idle connections as well. It is beneficial to overall performance to fall back to the pre-warmup phase after a number of seconds in which no data has been sent.

In deployments where connections can be considered reliable, this timer can be disabled by setting it to 0.

The following example sets the seconds to zero, effectively disabling any cool down. Warmed up TLS connections stay on maximum record size.

Example
H2TLSCoolDownSecs 0

H2TLSWarmUpSize Directive

Description:
Syntax: H2TLSWarmUpSize amount
Default: H2TLSWarmUpSize 1048576
Context: server config, virtual host
Status: Extension
Module: mod_http2
Compatibility: Available in version 2.4.18 and later.
This directive sets the number of bytes to be sent in small TLS records (~1300 bytes) until doing maximum sized writes (16k) on https: HTTP/2 connections. This can be used server wide or for specific <VirtualHost>s.

Measurements by Google performance labs [43] show that the best performance on TLS connections is reached if initial record sizes stay below the MTU level, to allow a complete record to fit into an IP packet.

While TCP adjusts its flow-control and window sizes, longer TLS records can get stuck in queues or get lost and need retransmission. This is of course true for all packets. TLS, however, needs the whole record in order to decrypt it. Any missing bytes at the end will stall usage of the received ones.

After a sufficient number of bytes have been sent successfully, the TCP state of the connection is stable and maximum TLS record sizes (16 KB) can be used for optimal performance.

In deployments where servers are reached locally or over reliable connections only, the value might be decreased, with 0 disabling any warmup phase altogether.

The following example sets the size to zero, effectively disabling any warmup phase.

Example
H2TLSWarmUpSize 0

H2Upgrade Directive

Description: H2 Upgrade Protocol Switch
Syntax: H2Upgrade on|off
Default: H2Upgrade on for h2c, off for h2 protocol
Context: server config, virtual host
Status: Extension
Module: mod_http2

This directive toggles the usage of the HTTP/1.1 Upgrade method for switching to HTTP/2. This should be used inside a <VirtualHost> section to enable Upgrades to HTTP/2 for that virtual host.

This method of switching protocols is defined in HTTP/1.1 and uses the "Upgrade" header (thus the name) to announce willingness to use another protocol. This may happen on any request of an HTTP/1.1 connection.

This method of protocol switching is enabled by default on cleartext (potential h2c) connections and disabled on TLS (potential h2), as mandated by RFC 7540.

Please be aware that Upgrades are only accepted for requests that carry no body.
POSTs and PUTs with content will never trigger an upgrade to HTTP/2. See H2Direct for an alternative to Upgrade.

This mode only has an effect when h2 or h2c is enabled via the Protocols directive.

Example
H2Upgrade on

[43] https://www.igvita.com

H2WindowSize Directive

Description: Size of Stream Window for upstream data.
Syntax: H2WindowSize bytes
Default: H2WindowSize 65535
Context: server config, virtual host
Status: Extension
Module: mod_http2

This directive sets the size of the window that is used for flow control from client to server and limits the amount of data the server has to buffer. The client will stop sending on a stream once the limit has been reached until the server announces more available space (as it has processed some of the data).

This limit affects only request bodies, not their metadata such as headers. Also, it has no effect on response bodies, as the window size for those is managed by the clients.

Example
H2WindowSize 128000

10.58 Apache Module mod_ident

Description: RFC 1413 ident lookups
Status: Extension
ModuleIdentifier: ident_module
SourceFile: mod_ident.c

Summary
This module queries an RFC 1413 [44] compatible daemon on a remote host to look up the owner of a connection.

Directives
• IdentityCheck
• IdentityCheckTimeout

See also
• mod_log_config

IdentityCheck Directive

Description: Enables logging of the RFC 1413 identity of the remote user
Syntax: IdentityCheck On|Off
Default: IdentityCheck Off
Context: server config, virtual host, directory
Status: Extension
Module: mod_ident

This directive enables RFC 1413 [45]-compliant logging of the remote user name for each connection, where the client machine runs identd or something similar. This information is logged in the access log using the %...l format string (p. 705).

=⇒ The information should not be trusted in any way except for rudimentary usage tracking.
Note that this can cause serious latency problems when accessing your server, since every request requires one of these lookups to be performed. When firewalls or proxy servers are involved, each lookup may fail and add a delay, as defined by the IdentityCheckTimeout directive, to each hit. So in general this is not very useful on public servers accessible from the Internet.

IdentityCheckTimeout Directive

Description: Determines the timeout duration for ident requests
Syntax: IdentityCheckTimeout seconds
Default: IdentityCheckTimeout 30
Context: server config, virtual host, directory
Status: Extension
Module: mod_ident

This directive specifies the timeout duration of an ident request. The default value of 30 seconds is recommended by RFC 1413 [46], mainly because of possible network latency. However, you may want to adjust the timeout value according to your local network speed.

[44] http://www.ietf.org/rfc/rfc1413.txt
[45] http://www.ietf.org/rfc/rfc1413.txt
[46] http://www.ietf.org/rfc/rfc1413.txt

10.59 Apache Module mod_imagemap

Description: Server-side imagemap processing
Status: Base
ModuleIdentifier: imagemap_module
SourceFile: mod_imagemap.c

Summary
This module processes .map files, thereby replacing the functionality of the imagemap CGI program. Any directory or document type configured to use the handler imap-file (using either AddHandler or SetHandler) will be processed by this module.

The following directive will activate files ending with .map as imagemap files:

AddHandler imap-file map

Note that the following is still supported:

AddType application/x-httpd-imap map

However, we are trying to phase out "magic MIME types", so we are deprecating this method.

Directives
• ImapBase
• ImapDefault
• ImapMenu

New Features
The imagemap module adds some new features that were not possible with previously distributed imagemap programs.
• URL references relative to the Referer: information.
• Default assignment through a new map directive base.
• No need for imagemap.conf file.
• Point references.
• Configurable generation of imagemap menus.

Imagemap File
The lines in the imagemap files can have one of several formats:

directive value [x,y ...]
directive value "Menu text" [x,y ...]
directive value x,y ... "Menu text"

The directive is one of base, default, poly, circle, rect, or point. The value is an absolute or relative URL, or one of the special values listed below. The coordinates are x,y pairs separated by whitespace. The quoted text is used as the text of the link if an imagemap menu is generated. Lines beginning with '#' are comments.

Imagemap File Directives
There are six directives allowed in the imagemap file. The directives can come in any order, but are processed in the order they are found in the imagemap file.

base Directive
Has the effect of <base href="value">. The non-absolute URLs of the map-file are taken relative to this value. The base directive overrides ImapBase as set in a .htaccess file or in the server configuration files. In the absence of an ImapBase configuration directive, base defaults to http://server_name/. base_uri is synonymous with base. Note that a trailing slash on the URL is significant.

default Directive
The action taken if the coordinates given do not fit any of the poly, circle or rect directives, and there are no point directives. Defaults to nocontent in the absence of an ImapDefault configuration setting, causing a status code of 204 No Content to be returned. The client should keep the same page displayed.

poly Directive
Takes three to one-hundred points, and is obeyed if the user selected coordinates fall within the polygon defined by these points.

circle
Takes the center coordinates of a circle and a point on the circle. Is obeyed if the user selected point is within the circle.
rect Directive
Takes the coordinates of two opposing corners of a rectangle. Obeyed if the point selected is within this rectangle.

point Directive
Takes a single point. The point directive closest to the user selected point is obeyed if no other directives are satisfied. Note that default will not be followed if a point directive is present and valid coordinates are given.

Values
The values for each of the directives can be any of the following:

a URL
The URL can be relative or absolute. Relative URLs can contain '..' syntax and will be resolved relative to the base value. base itself will not be resolved according to the current value. A statement base mailto: will work properly, though.

map
Equivalent to the URL of the imagemap file itself. No coordinates are sent with this, so a menu will be generated unless ImapMenu is set to none.

menu
Synonymous with map.

referer
Equivalent to the URL of the referring document. Defaults to http://servername/ if no Referer: header was present.

nocontent
Sends a status code of 204 No Content, telling the client to keep the same page displayed. Valid for all but base.

error
Fails with a 500 Server Error. Valid for all but base, but sort of silly for anything but default.

Coordinates
0,0 200,200
A coordinate consists of an x and a y value separated by a comma. The coordinates are separated from each other by whitespace. To accommodate the way Lynx handles imagemaps, should a user select the coordinate 0,0, it is as if no coordinate had been selected.

Quoted Text
"Menu Text"
After the value or after the coordinates, the line optionally may contain text within double quotes. This string is used as the text for the link if a menu is generated:

Menu text

If no quoted text is present, the name of the link will be used as the text:

http://example.com

If you want to use double quotes within this text, you have to write them as &quot;.
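The matching rules for the rect, circle, poly and point directives described above can be illustrated with the following Python sketch. It is not the algorithm as implemented in mod_imagemap; the function names, the ray-casting polygon test and the tuple layout of the regions list are assumptions chosen for illustration.

```python
import math


def in_rect(p, corner_a, corner_b):
    """True if p lies within the rectangle spanned by two opposing corners."""
    (x, y), (ax, ay), (bx, by) = p, corner_a, corner_b
    return min(ax, bx) <= x <= max(ax, bx) and min(ay, by) <= y <= max(ay, by)


def in_circle(p, center, edge_point):
    """True if p lies within the circle given by its center and a point on it."""
    return math.dist(center, p) <= math.dist(center, edge_point)


def in_poly(p, vertices):
    """Ray-casting test: count polygon edges crossed by a ray going right from p."""
    x, y = p
    inside = False
    for i, (x1, y1) in enumerate(vertices):
        x2, y2 = vertices[(i + 1) % len(vertices)]
        if (y1 > y) != (y2 > y):  # edge spans the horizontal line through p
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def resolve(p, regions, default="nocontent"):
    """Return the value of the first matching area in file order, else the
    value of the closest point directive, else the default action."""
    nearest = None
    for kind, value, coords in regions:
        if kind == "rect" and in_rect(p, *coords):
            return value
        if kind == "circle" and in_circle(p, *coords):
            return value
        if kind == "poly" and in_poly(p, coords):
            return value
        if kind == "point":
            distance = math.dist(p, coords[0])
            if nearest is None or distance < nearest[0]:
                nearest = (distance, value)
    return nearest[1] if nearest else default


# Hypothetical regions, listed in the order they would appear in a map file
regions = [
    ("rect", "..", [(0, 0), (77, 27)]),
    ("circle", "/feedback/", [(195, 0), (305, 27)]),
    ("point", "/first", [(100, 100)]),
    ("point", "/second", [(200, 200)]),
]
print(resolve((50, 20), regions), resolve((120, 110), regions))
```

Note how the default action is only reached when no point directive is present, in line with the description of the point directive.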
Example Mapfile
#Comments are printed in a 'formatted' or 'semiformatted' menu.
#And can contain html tags.
base referer
poly map "Could I have a menu, please?" 0,0 0,10 10,10 10,0
rect .. 0,0 77,27 "the directory of the referer"
circle http://www.inetnebr.example.com/lincoln/feedback/ 195,0 305,27
rect another_file "in same directory as referer" 306,0 419,27
point http://www.zyzzyva.example.com/ 100,100
point http://www.tripod.example.com/ 200,200
rect mailto:nate@tripod.example.com 100,150 200,0 "Bugs?"

Referencing your mapfile
HTML example
XHTML example

ImapBase Directive

Description: Default base for imagemap files
Syntax: ImapBase map|referer|URL
Default: ImapBase http://servername/
Context: server config, virtual host, directory, .htaccess
Override: Indexes
Status: Base
Module: mod_imagemap

The ImapBase directive sets the default base used in the imagemap files. Its value is overridden by a base directive within the imagemap file. If not present, the base defaults to http://servername/.

See also
• UseCanonicalName

ImapDefault Directive

Description: Default action when an imagemap is called with coordinates that are not explicitly mapped
Syntax: ImapDefault error|nocontent|map|referer|URL
Default: ImapDefault nocontent
Context: server config, virtual host, directory, .htaccess
Override: Indexes
Status: Base
Module: mod_imagemap

The ImapDefault directive sets the default value of the default directive used in the imagemap files. Its value is overridden by a default directive within the imagemap file. If not present, the default action is nocontent, which means that a 204 No Content is sent to the client. In this case, the client should continue to display the original page.
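Why a trailing slash on the base URL matters can be shown with standard URL resolution. The sketch below uses Python's urllib.parse.urljoin, which follows the generic URI resolution rules; that mod_imagemap resolves relative map values in exactly the same way is an assumption, and the host and path names are hypothetical.

```python
from urllib.parse import urljoin


def resolve_map_value(base, value):
    """Resolve a relative map value against a base URL (generic URI rules)."""
    return urljoin(base, value)


# With a trailing slash, "maps" is treated as a directory ...
print(resolve_map_value("http://example.com/maps/", "page.html"))
# ... without it, "maps" is the last path segment and gets replaced.
print(resolve_map_value("http://example.com/maps", "page.html"))
# '..' refers to the parent of the base directory.
print(resolve_map_value("http://example.com/maps/", ".."))
```

The first call yields a URL below /maps/, while the second resolves next to it, which is the kind of difference the note on the base directive warns about.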
ImapMenu Directive

Description: Action if no coordinates are given when calling an imagemap
Syntax: ImapMenu none|formatted|semiformatted|unformatted
Default: ImapMenu formatted
Context: server config, virtual host, directory, .htaccess
Override: Indexes
Status: Base
Module: mod_imagemap

The ImapMenu directive determines the action taken if an imagemap file is called without valid coordinates.

none
If ImapMenu is none, no menu is generated, and the default action is performed.

formatted
A formatted menu is the simplest menu. Comments in the imagemap file are ignored. A level one header is printed, then an hrule, then the links each on a separate line. The menu has a consistent, plain look close to that of a directory listing.

semiformatted
In the semiformatted menu, comments are printed where they occur in the imagemap file. Blank lines are turned into HTML breaks. No header or hrule is printed, but otherwise the menu is the same as a formatted menu.

unformatted
Comments are printed, blank lines are ignored. Nothing is printed that does not appear in the imagemap file. All breaks and headers must be included as comments in the imagemap file. This gives you the most flexibility over the appearance of your menus, but requires you to treat your map files as HTML instead of plaintext.

10.60 Apache Module mod_include

Description: Server-parsed html documents (Server Side Includes)
Status: Base
ModuleIdentifier: include_module
SourceFile: mod_include.c

Summary
This module provides a filter which will process files before they are sent to the client. The processing is controlled by specially formatted SGML comments, referred to as elements. These elements allow conditional text, the inclusion of other files or programs, as well as the setting and printing of environment variables.
Directives
• SSIEndTag
• SSIErrorMsg
• SSIETag
• SSILastModified
• SSILegacyExprParser
• SSIStartTag
• SSITimeFormat
• SSIUndefinedEcho
• XBitHack

See also
• Options
• AcceptPathInfo
• Filters (p. 110)
• SSI Tutorial (p. 243)

Enabling Server-Side Includes
Server Side Includes are implemented by the INCLUDES filter (p. 110). If documents containing server-side include directives are given the extension .shtml, the following directives will make Apache parse them and assign the resulting document the mime type of text/html:

AddType text/html .shtml
AddOutputFilter INCLUDES .shtml

The following directive must be given for the directories containing the shtml files (typically in a <Directory> section, but this directive is also valid in .htaccess files if AllowOverride Options is set):

Options +Includes

For backwards compatibility, the server-parsed handler (p. 108) also activates the INCLUDES filter. In addition, Apache will activate the INCLUDES filter for any document with mime type text/x-server-parsed-html or text/x-server-parsed-html3 (and the resulting output will have the mime type text/html).

For more information, see our Tutorial on Server Side Includes (p. 243).

PATH_INFO with Server Side Includes
Files processed for server-side includes no longer accept requests with PATH_INFO (trailing pathname information) by default. You can use the AcceptPathInfo directive to configure the server to accept requests with PATH_INFO.

Available Elements
The document is parsed as an HTML document, with special commands embedded as SGML comments. A command has the syntax:

<!--#element attribute=value attribute=value ... -->

The value will often be enclosed in double quotes, but single quotes (') and backticks (`) are also possible. Many commands only allow a single attribute-value pair. Note that the comment terminator (-->) should be preceded by whitespace to ensure that it isn't considered part of an SSI token.
Note that the leading <!--# is one token and may not contain any whitespace.

The config Element
This command controls various aspects of the parsing. The valid attributes are:

echomsg (Apache 2.1 and later)
The value is a message that is sent back to the client if the echo element attempts to echo an undefined variable. This overrides any SSIUndefinedEcho directives.

errmsg
The value is a message that is sent back to the client if an error occurs while parsing the document. This overrides any SSIErrorMsg directives.

sizefmt
The value sets the format to be used when displaying the size of a file. Valid values are bytes for a count in bytes, or abbrev for a count in Kb or Mb as appropriate; for example, a size of 1024 bytes will be printed as "1K".

timefmt
The value is a string to be used by the strftime(3) library routine when printing dates.

The echo Element
This command prints one of the include variables defined below. If the variable is unset, the result is determined by the SSIUndefinedEcho directive. Any dates printed are subject to the currently configured timefmt.

Attributes:

var
The value is the name of the variable to print.

decoding
Specifies whether Apache should strip an encoding from the variable before processing the variable further. The default is none, where no decoding will be done. If set to url, then URL decoding (also known as %-encoding; this is appropriate for use within URLs in links, etc.) will be performed. If set to urlencoded, application/x-www-form-urlencoded compatible encoding (found in query strings) will be stripped. If set to base64, base64 will be decoded, and if set to entity, HTML entity encoding will be stripped. Decoding is done prior to any further encoding on the variable. Multiple encodings can be stripped by specifying more than one comma separated encoding. The decoding setting will remain in effect until the next decoding attribute is encountered, or the element ends.
The decoding attribute must precede the corresponding var attribute to be effective.

encoding
Specifies how Apache should encode special characters contained in the variable before outputting them. If set to none, no encoding will be done. If set to url, then URL encoding (also known as %-encoding; this is appropriate for use within URLs in links, etc.) will be performed. If set to urlencoded, application/x-www-form-urlencoded compatible encoding will be performed instead, and should be used with query strings. If set to base64, base64 encoding will be performed. At the start of an echo element, the default is set to entity, resulting in entity encoding (which is appropriate in the context of a block-level HTML element, e.g. a paragraph of text). This can be changed by adding an encoding attribute, which will remain in effect until the next encoding attribute is encountered or the element ends, whichever comes first. The encoding attribute must precede the corresponding var attribute to be effective.

! In order to avoid cross-site scripting issues, you should always encode user supplied data.

Example

The exec Element
The exec command executes a given shell command or CGI script. It requires mod_cgi to be present in the server. If Options IncludesNOEXEC is set, this command is completely disabled. The valid attributes are:

cgi
The value specifies a (%-encoded) URL-path to the CGI script. If the path does not begin with a slash (/), then it is taken to be relative to the current document. The document referenced by this path is invoked as a CGI script, even if the server would not normally recognize it as such. However, the directory containing the script must be enabled for CGI scripts (with ScriptAlias or Options ExecCGI).

The CGI script is given the PATH_INFO and query string (QUERY_STRING) of the original request from the client; these cannot be specified in the URL path.
The include variables will be available to the script in addition to the standard CGI (p. 580) environment.

Example

If the script returns a Location: header instead of output, then this will be translated into an HTML anchor.

The include virtual element should be used in preference to exec cgi. In particular, if you need to pass additional arguments to a CGI program, using the query string, this cannot be done with exec cgi, but can be done with include virtual, as shown here:

cmd
The server will execute the given string using /bin/sh. The include variables are available to the command, in addition to the usual set of CGI variables.

The use of #include virtual is almost always preferred to using either #exec cgi or #exec cmd. The former (#include virtual) uses the standard Apache sub-request mechanism to include files or scripts. It is much better tested and maintained.

In addition, on some platforms, like Win32, and on unix when using suexec (p. 115), you cannot pass arguments to a command in an exec directive, or otherwise include spaces in the command. Thus, while the following will work under a non-suexec configuration on unix, it will not produce the desired result under Win32, or when running suexec:

The fsize Element
This command prints the size of the specified file, subject to the sizefmt format specification. Attributes:

file
The value is a path relative to the directory containing the current document being parsed.

This file is bytes.

The value of file cannot start with a slash (/), nor can it contain ../ so as to refer to a file above the current directory or outside of the document root. Attempting to do so will result in the error message: The given path was above the root path.

virtual
The value is a (%-encoded) URL-path. If it does not begin with a slash (/) then it is taken to be relative to the current document. Note that this does not print the size of any CGI output, but the size of the CGI script itself.
This file is bytes.

Note that in many cases these two are exactly the same thing. However, the file attribute doesn't respect URL-space aliases.

The flastmod Element
This command prints the last modification date of the specified file, subject to the timefmt format specification. The attributes are the same as for the fsize command.

The include Element
This command inserts the text of another document or file into the parsed file. Any included file is subject to the usual access control. If the directory containing the parsed file has Options (p. 380) IncludesNOEXEC set, then only documents with a text MIME-type (text/plain, text/html etc.) will be included. Otherwise CGI scripts are invoked as normal using the complete URL given in the command, including any query string.

An attribute defines the location of the document, and may appear more than once in an include element; an inclusion is done for each attribute given to the include command in turn. The valid attributes are:

file
The value is a path relative to the directory containing the current document being parsed. It cannot contain ../, nor can it be an absolute path. Therefore, you cannot include files that are outside of the document root, or above the current document in the directory structure. The virtual attribute should always be used in preference to this one.

virtual
The value is a (%-encoded) URL-path. The URL cannot contain a scheme or hostname, only a path and an optional query string. If it does not begin with a slash (/) then it is taken to be relative to the current document.

A URL is constructed from the attribute, and the output the server would return if the URL were accessed by the client is included in the parsed output. Thus included files can be nested.

If the specified URL is a CGI program, the program will be executed and its output inserted in place of the directive in the parsed file.
You may include a query string in a CGI url:

include virtual should be used in preference to exec cgi to include the output of CGI programs into an HTML document.

If the KeptBodySize directive is correctly configured and valid for this included file, attempts to POST requests to the enclosing HTML document will be passed through to subrequests as POST requests as well. Without the directive, all subrequests are processed as GET requests.

onerror
The value is a (%-encoded) URL-path which is shown should a previous attempt to include a file or virtual attribute fail. To be effective, this attribute must be specified after the file or virtual attributes being covered. If the attempt to include the onerror path fails, or if onerror is not specified, the default error message will be included.

# Simple example
# Dedicated onerror paths

The printenv Element
This prints out a plain text listing of all existing variables and their values. Special characters are entity encoded (see the echo element for details) before being output. There are no attributes.

Example
  
The set Element
This sets the value of a variable. Attributes:

var
The name of the variable to set.

value
The value to give a variable.

decoding
Specifies whether Apache should strip an encoding from the variable before processing the variable further. The default is none, where no decoding will be done. If set to url, urlencoded, base64 or entity, URL decoding, application/x-www-form-urlencoded decoding, base64 decoding or HTML entity decoding will be performed respectively. More than one decoding can be specified by separating with commas. The decoding setting will remain in effect until the next decoding attribute is encountered, or the element ends. The decoding attribute must precede the corresponding var attribute to be effective.

encoding
Specifies how Apache should encode special characters contained in the variable before setting them. The default is none, where no encoding will be done. If set to url, urlencoded, base64 or entity, URL encoding, application/x-www-form-urlencoded encoding, base64 encoding or HTML entity encoding will be performed respectively. More than one encoding can be specified by separating with commas. The encoding setting will remain in effect until the next encoding attribute is encountered, or the element ends. The encoding attribute must precede the corresponding var attribute to be effective. Encodings are applied after all decodings have been stripped.

Example

Include Variables
In addition to the variables in the standard CGI environment, these are available for the echo command, for if and elif, and to any program invoked by the document.

DATE_GMT
The current date in Greenwich Mean Time.

DATE_LOCAL
The current date in the local time zone.

DOCUMENT_ARGS
This variable contains the query string of the active SSI document, or the empty string if a query string is not included.
For subrequests invoked through the include SSI directive, QUERY_STRING will represent the query string of the subrequest and DOCUMENT_ARGS will represent the query string of the SSI document. (Available in Apache HTTP Server 2.4.19 and later.)

DOCUMENT_NAME
The filename (excluding directories) of the document requested by the user.

DOCUMENT_URI
The (%-decoded) URL path of the document requested by the user. Note that in the case of nested include files, this is not the URL for the current document. Note also that if the URL is modified internally (e.g. by an Alias or DirectoryIndex), the modified URL is shown.

LAST_MODIFIED
The last modification date of the document requested by the user.

QUERY_STRING_UNESCAPED
If a query string is present in the request for the active SSI document, this variable contains the (%-decoded) query string, which is escaped for shell usage (special characters like & etc. are preceded by backslashes). It is not set if a query string is not present. Use DOCUMENT_ARGS if shell escaping is not desired.

Variable Substitution

Variable substitution is done within quoted strings in most cases where they may reasonably occur as an argument to an SSI directive. This includes the config, exec, flastmod, fsize, include, echo, and set directives. If SSILegacyExprParser is set to on, substitution also occurs in the arguments to conditional operators. You can insert a literal dollar sign into the string using backslash quoting:

If a variable reference needs to be substituted in the middle of a character sequence that might otherwise be considered a valid identifier in its own right, it can be disambiguated by enclosing the reference in braces, a la shell substitution:

This will result in the Zed variable being set to "X_Y" if REMOTE_HOST is "X" and REQUEST_METHOD is "Y".

Flow Control Elements

The basic flow control elements are:

<!--#if expr="test_condition" -->
<!--#elif expr="test_condition" -->
<!--#else -->
<!--#endif -->

The if element works like an if statement in a programming language.
The test condition is evaluated and if the result is true, then the text until the next elif, else or endif element is included in the output stream.

The elif or else statements are used to put text into the output stream if the original test_condition was false. These elements are optional.

The endif element ends the if element and is required.

test_condition is a boolean expression which follows the ap_expr (p. 99) syntax. The syntax can be changed to be compatible with Apache HTTPD 2.2.x using SSILegacyExprParser.

The SSI variables set with the var element are exported into the request environment and can be accessed with the reqenv function. As a short-cut, the function name v is also available inside mod_include.

The example below will print "from local net" if the client IP address belongs to the 10.0.0.0/8 subnet.

<!--#if expr='-R "10.0.0.0/8"' -->
    from local net
<!--#else -->
    from somewhere else
<!--#endif -->

The example below will print "foo is bar" if the variable foo is set to the value "bar".

<!--#if expr='v("foo") = "bar"' -->
    foo is bar
<!--#endif -->

=⇒ Reference Documentation
See also: Expressions in Apache HTTP Server (p. 99), for a complete reference and examples. The restricted functions are not available inside mod_include.

Legacy expression syntax

This section describes the syntax of the #if expr element if SSILegacyExprParser is set to on.

string
true if string is not empty

-A string
true if the URL represented by the string is accessible by configuration, false otherwise. This is useful where content on a page is to be hidden from users who are not authorized to view the URL, such as a link to that URL. Note that the URL is only tested for whether access would be granted, not whether the URL exists.

Example
Click here to access private information.

string1 = string2
string1 == string2
string1 != string2
Compare string1 with string2. If string2 has the form /string2/ then it is treated as a regular expression. Regular expressions are implemented by the PCRE[47] engine and have the same syntax as those in Perl 5[48].
Note that == is just an alias for = and behaves exactly the same way.

[47] http://www.pcre.org
[48] http://www.perl.com

If you are matching positive (= or ==), you can capture grouped parts of the regular expression. The captured parts are stored in the special variables $1 .. $9. The whole string matched by the regular expression is stored in the special variable $0.

Example

string1 < string2
string1 <= string2
string1 > string2
string1 >= string2
Compare string1 with string2. Note that strings are compared literally (using strcmp(3)). Therefore the string "100" is less than "20".

( test_condition )
true if test_condition is true

! test_condition
true if test_condition is false

test_condition1 && test_condition2
true if both test_condition1 and test_condition2 are true

test_condition1 || test_condition2
true if either test_condition1 or test_condition2 is true

"=" and "!=" bind more tightly than "&&" and "||". "!" binds most tightly. Thus, the following are equivalent:

The boolean operators && and || share the same priority. So if you want to bind such an operator more tightly, you should use parentheses.

Anything that's not recognized as a variable or an operator is treated as a string. Strings can also be quoted: 'string'. Unquoted strings can't contain whitespace (blanks and tabs) because it is used to separate tokens such as variables. If multiple strings are found in a row, they are concatenated using blanks. So, string1    string2 results in string1 string2 and 'string1    string2' results in string1    string2.

=⇒ Optimization of Boolean Expressions
If the expressions become more complex and slow down processing significantly, you can try to optimize them according to the evaluation rules:

• Expressions are evaluated from left to right.
• Binary boolean operators (&& and ||) are short circuited wherever possible. Together with the rule above, this means that mod_include first evaluates the left expression.
If the left result is sufficient to determine the end result, processing stops here. Otherwise it evaluates the right side and computes the end result from both left and right results.
• Short circuit evaluation is turned off as long as there are regular expressions to deal with. These must be evaluated to fill in the backreference variables ($1 .. $9).

If you want to see how a particular expression is handled, you can recompile mod_include using the -DDEBUG_INCLUDE compiler option. For every parsed expression, this inserts tokenizer information, the parse tree, and how it is evaluated into the output sent to the client.

=⇒ Escaping slashes in regex strings
All slashes which are not intended to act as delimiters in your regex must be escaped. This is regardless of their meaning to the regex engine.

SSIEndTag Directive

Description: String that ends an include element
Syntax: SSIEndTag tag
Default: SSIEndTag "-->"
Context: server config, virtual host
Status: Base
Module: mod_include

This directive changes the string that mod_include looks for to mark the end of an include element.

SSIEndTag "%>"

See also
• SSIStartTag

SSIErrorMsg Directive

Description: Error message displayed when there is an SSI error
Syntax: SSIErrorMsg message
Default: SSIErrorMsg "[an error occurred while processing this directive]"
Context: server config, virtual host, directory, .htaccess
Override: All
Status: Base
Module: mod_include

The SSIErrorMsg directive changes the error message displayed when mod_include encounters an error. For production servers you may consider changing the default error message to "" so that the message is not presented to the user.

This directive has the same effect as the element.

SSIErrorMsg ""

SSIETag Directive

Description: Controls whether ETags are generated by the server.
Syntax: SSIETag on|off
Default: SSIETag off
Context: directory, .htaccess
Status: Base
Module: mod_include

Under normal circumstances, a file filtered by mod_include may contain elements that are either dynamically generated, or that may have changed independently of the original file. As a result, by default the server is asked not to generate an ETag header for the response by adding no-etag to the request notes.

The SSIETag directive suppresses this behaviour, and allows the server to generate an ETag header. This can be used to enable caching of the output. Note that a backend server or dynamic content generator may generate an ETag of its own, ignoring no-etag, and this ETag will be passed by mod_include regardless of the value of this setting. SSIETag can take on the following values:

off
no-etag will be added to the request notes, and the server is asked not to generate an ETag. Where a server ignores the value of no-etag and generates an ETag anyway, the ETag will be respected.

on
Existing ETags will be respected, and ETags generated by the server will be passed on in the response.

SSILastModified Directive

Description: Controls whether Last-Modified headers are generated by the server.
Syntax: SSILastModified on|off
Default: SSILastModified off
Context: directory, .htaccess
Status: Base
Module: mod_include

Under normal circumstances, a file filtered by mod_include may contain elements that are either dynamically generated, or that may have changed independently of the original file. As a result, by default the Last-Modified header is stripped from the response.

The SSILastModified directive overrides this behaviour, and allows the Last-Modified header to be respected if already present, or set if the header is not already present. This can be used to enable caching of the output.
SSILastModified can take on the following values:

off
The Last-Modified header will be stripped from responses, unless the XBitHack directive is set to full as described below.

on
The Last-Modified header will be respected if already present in a response, and added to the response if the response is a file and the header is missing. The SSILastModified directive takes precedence over XBitHack.

SSILegacyExprParser Directive

Description: Enable compatibility mode for conditional expressions.
Syntax: SSILegacyExprParser on|off
Default: SSILegacyExprParser off
Context: directory, .htaccess
Status: Base
Module: mod_include
Compatibility: Available in version 2.3.13 and later.

As of version 2.3.13, mod_include has switched to the new ap_expr (p. 99) syntax for conditional expressions in #if flow control elements. This directive allows switching to the old syntax, which is compatible with Apache HTTPD version 2.2.x and earlier.

SSIStartTag Directive

Description: String that starts an include element
Syntax: SSIStartTag tag
Default: SSIStartTag " element.

SSITimeFormat "%R, %B %d, %Y"

The above directive would cause times to be displayed in the format "22:26, June 14, 2002".

SSIUndefinedEcho Directive

Description: String displayed when an unset variable is echoed
Syntax: SSIUndefinedEcho string
Default: SSIUndefinedEcho "(none)"
Context: server config, virtual host, directory, .htaccess
Override: All
Status: Base
Module: mod_include

This directive changes the string that mod_include displays when a variable is not set and "echoed".

SSIUndefinedEcho ""

XBitHack Directive

Description: Parse SSI directives in files with the execute bit set
Syntax: XBitHack on|off|full
Default: XBitHack off
Context: server config, virtual host, directory, .htaccess
Override: Options
Status: Base
Module: mod_include

The XBitHack directive controls the parsing of ordinary html documents.
This directive only affects files associated with the MIME-type text/html. XBitHack can take on the following values:

off
No special treatment of executable files.

on
Any text/html file that has the user-execute bit set will be treated as a server-parsed html document.

full
As for on but also test the group-execute bit. If it is set, then set the Last-modified date of the returned file to be the last modified time of the file. If it is not set, then no last-modified date is sent. Setting this bit allows clients and proxies to cache the result of the request.

=⇒ Note
You would not want to use the full option unless you ensure the group-execute bit is unset for every SSI script which might #include a CGI or otherwise produce different output on each hit (or could potentially change on subsequent requests).

The SSILastModified directive takes precedence over the XBitHack directive when SSILastModified is set to on.

10.61 Apache Module mod_info

Description: Provides a comprehensive overview of the server configuration
Status: Extension
ModuleIdentifier: info_module
SourceFile: mod_info.c

Summary

To configure mod_info, add the following to your httpd.conf file.

<Location "/server-info">
    SetHandler server-info
</Location>

You may wish to use mod_authz_host inside the <Location> directive to limit access to your server configuration information:

<Location "/server-info">
    SetHandler server-info
    Require host example.com
</Location>

Once configured, the server information is obtained by accessing http://your.host.example.com/server-info

Directives
• AddModuleInfo

Security Issues

Once mod_info is loaded into the server, its handler capability is available in all configuration files, including per-directory files (e.g., .htaccess). This may have security-related ramifications for your site.

In particular, this module can leak sensitive information from the configuration directives of other Apache modules such as system paths, usernames/passwords, database names, etc.
Therefore, this module should only be used in a controlled environment and always with caution.

You will probably want to use mod_authz_host to limit access to your server configuration information.

Access control

<Location "/server-info">
    SetHandler server-info
    # Allow access from server itself
    Require ip 127.0.0.1
    # Additionally, allow access from local workstation
    Require ip 192.168.1.17
</Location>

Selecting the information shown

By default, the server information includes a list of all enabled modules, and for each module, a description of the directives understood by that module, the hooks implemented by that module, and the relevant directives from the current configuration.

Other views of the configuration information are available by appending a query to the server-info request. For example, http://your.host.example.com/server-info?config will show all configuration directives.

?<module-name>
Only information relevant to the named module

?config
Just the configuration directives, not sorted by module

?hooks
Only the list of Hooks each module is attached to

?list
Only a simple list of enabled modules

?server
Only the basic server information

Dumping the configuration on startup

If the config define -DDUMP_CONFIG is set, mod_info will dump the pre-parsed configuration to stdout during server startup. Pre-parsed means that directives like <IfDefine> and <IfModule> are evaluated and environment variables are replaced. However, it does not represent the final state of the configuration. In particular, it does not represent the merging or overriding that may happen for repeated directives.

This is roughly equivalent to the ?config query.

Known Limitations

mod_info provides its information by reading the parsed configuration, rather than reading the original configuration file. There are a few limitations as a result of the way the parsed configuration tree is created:

• Directives which are executed immediately rather than being stored in the parsed configuration are not listed.
These include ServerRoot, LoadModule, and LoadFile.
• Directives which control the configuration file itself, such as Include, <IfModule> and <IfDefine> are not listed, but the included configuration directives are.
• Comments are not listed. (This may be considered a feature.)
• Configuration directives from .htaccess files are not listed (since they do not form part of the permanent server configuration).
• Container directives such as <Directory> are listed normally, but mod_info cannot figure out the line number for the closing </Directory>.
• Directives generated by third party modules such as mod_perl[49] might not be listed.

[49] http://perl.apache.org

AddModuleInfo Directive

Description: Adds additional information to the module information displayed by the server-info handler
Syntax: AddModuleInfo module-name string
Context: server config, virtual host
Status: Extension
Module: mod_info

This allows the content of string to be shown as HTML-interpreted Additional Information for the module module-name. Example:

AddModuleInfo mod_deflate.c 'See \
http://httpd.apache.org/docs/trunk/mod/mod_deflate.html'

10.62 Apache Module mod_isapi

Description: ISAPI Extensions within Apache for Windows
Status: Base
ModuleIdentifier: isapi_module
SourceFile: mod_isapi.c
Compatibility: Win32 only

Summary

This module implements the Internet Server extension API. It allows Internet Server extensions (e.g. ISAPI .dll modules) to be served by Apache for Windows, subject to the noted restrictions.

ISAPI extension modules (.dll files) are written by third parties. The Apache Group does not author these modules, so we provide no support for them. Please contact the ISAPI's author directly if you are experiencing problems running their ISAPI extension. Please do not post such problems to Apache's lists or bug reporting pages.
Directives
• ISAPIAppendLogToErrors
• ISAPIAppendLogToQuery
• ISAPICacheFile
• ISAPIFakeAsync
• ISAPILogNotSupported
• ISAPIReadAheadBuffer

Usage

In the server configuration file, use the AddHandler directive to associate ISAPI files with the isapi-handler handler, and map it to them with their file extensions. To enable any .dll file to be processed as an ISAPI extension, edit the httpd.conf file and add the following line:

AddHandler isapi-handler .dll

=⇒ In older versions of the Apache server, isapi-isa was the proper handler name, rather than isapi-handler. As of 2.3 development versions of the Apache server, isapi-isa is no longer valid. You will need to change your configuration to use isapi-handler instead.

There is no capability within the Apache server to leave a requested module loaded. However, you may preload and keep a specific module loaded by using the following syntax in your httpd.conf:

ISAPICacheFile c:/WebWork/Scripts/ISAPI/mytest.dll

Whether or not you have preloaded an ISAPI extension, all ISAPI extensions are governed by the same permissions and restrictions as CGI scripts. That is, Options ExecCGI must be set for the directory that contains the ISAPI .dll file.

Review the Additional Notes and the Programmer's Journal for additional details and clarification of the specific ISAPI support offered by mod_isapi.

Additional Notes

Apache's ISAPI implementation conforms to all of the ISAPI 2.0 specification, except for some "Microsoft-specific" extensions dealing with asynchronous I/O. Apache's I/O model does not allow asynchronous reading and writing in a manner that the ISAPI could access. If an ISA tries to access unsupported features, including async I/O, a message is placed in the error log to help with debugging. Since these messages can become a flood, the directive ISAPILogNotSupported Off exists to quiet this noise.
Some servers, like Microsoft IIS, load the ISAPI extension into the server and keep it loaded until memory usage is too high, or unless configuration options are specified. Apache currently loads and unloads the ISAPI extension each time it is requested, unless the ISAPICacheFile directive is specified. This is inefficient, but Apache's memory model makes this the most effective method. Many ISAPI modules are subtly incompatible with the Apache server, and unloading these modules helps to ensure the stability of the server.

Also, remember that while Apache supports ISAPI Extensions, it does not support ISAPI Filters. Support for filters may be added at a later date, but no support is planned at this time.

Programmer's Journal

If you are programming Apache 2.0 mod_isapi modules, you must limit your calls to ServerSupportFunction to the following directives:

HSE_REQ_SEND_URL_REDIRECT_RESP
Redirect the user to another location. This must be a fully qualified URL (e.g. http://server/location).

HSE_REQ_SEND_URL
Redirect the user to another location. This cannot be a fully qualified URL; you are not allowed to pass the protocol or a server name (e.g. simply /location). This redirection is handled by the server, not the browser.

! Warning
In their recent documentation, Microsoft appears to have abandoned the distinction between the two HSE_REQ_SEND_URL functions. Apache continues to treat them as two distinct functions with different requirements and behaviors.

HSE_REQ_SEND_RESPONSE_HEADER
Apache accepts a response body following the header if it follows the blank line (two consecutive newlines) in the headers string argument. This body cannot contain NULLs, since the headers argument is NULL terminated.

HSE_REQ_DONE_WITH_SESSION
Apache considers this a no-op, since the session will be finished when the ISAPI returns from processing.

HSE_REQ_MAP_URL_TO_PATH
Apache will translate a virtual name to a physical name.
HSE_APPEND_LOG_PARAMETER
This logged message may be captured in any of the following logs:
• in the \"%{isapi-parameter}n\" component in a CustomLog directive
• in the %q log component with the ISAPIAppendLogToQuery On directive
• in the error log with the ISAPIAppendLogToErrors On directive
The first option, the %{isapi-parameter}n component, is always available and preferred.

HSE_REQ_IS_KEEP_CONN
Will return the negotiated Keep-Alive status.

HSE_REQ_SEND_RESPONSE_HEADER_EX
Will behave as documented, although the fKeepConn flag is ignored.

HSE_REQ_IS_CONNECTED
Will report false if the request has been aborted.

Apache returns FALSE to any unsupported call to ServerSupportFunction, and sets the GetLastError value to ERROR_INVALID_PARAMETER.

ReadClient retrieves the request body exceeding the initial buffer (defined by ISAPIReadAheadBuffer). Based on the ISAPIReadAheadBuffer setting (number of bytes to buffer prior to calling the ISAPI handler) shorter requests are sent complete to the extension when it is invoked. If the request is longer, the ISAPI extension must use ReadClient to retrieve the remaining request body.

WriteClient is supported, but only with the HSE_IO_SYNC flag or no option flag (value of 0). Any other WriteClient request will be rejected with a return value of FALSE, and a GetLastError value of ERROR_INVALID_PARAMETER.

GetServerVariable is supported, although extended server variables do not exist (as defined by other servers). All the usual Apache CGI environment variables are available from GetServerVariable, as well as the ALL_HTTP and ALL_RAW values.

Since httpd 2.0, mod_isapi supports additional features introduced in later versions of the ISAPI specification, as well as limited emulation of async I/O and the TransmitFile semantics. Apache httpd also supports preloading ISAPI .dlls for performance.
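The read-ahead behaviour described above for ReadClient can be pictured with a small Python sketch. It is only an illustration of the split between the initial buffer and the remainder, not how mod_isapi is implemented; the function name split_request_body is hypothetical, and 49152 is the documented default of ISAPIReadAheadBuffer.

```python
# Illustration only: models which part of a request body an ISAPI extension
# receives when it is invoked and which part it must fetch with ReadClient.
READ_AHEAD_BUFFER = 49152  # documented default of ISAPIReadAheadBuffer, in bytes


def split_request_body(body: bytes, read_ahead: int = READ_AHEAD_BUFFER):
    """Return (bytes available at invocation, bytes left for ReadClient).

    Hypothetical helper: a body no longer than the read-ahead size is handed
    over complete, so the second element is empty in that case.
    """
    return body[:read_ahead], body[read_ahead:]


initial, remaining = split_request_body(b"x" * 60000)
print(len(initial), len(remaining))  # 49152 10848
```

A short request such as a small form post falls entirely inside the first element, which is why extensions that never call ReadClient still work for small bodies.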
ISAPIAppendLogToErrors Directive

Description: Record HSE_APPEND_LOG_PARAMETER requests from ISAPI extensions to the error log
Syntax: ISAPIAppendLogToErrors on|off
Default: ISAPIAppendLogToErrors off
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_isapi

Record HSE_APPEND_LOG_PARAMETER requests from ISAPI extensions to the server error log.

ISAPIAppendLogToQuery Directive

Description: Record HSE_APPEND_LOG_PARAMETER requests from ISAPI extensions to the query field
Syntax: ISAPIAppendLogToQuery on|off
Default: ISAPIAppendLogToQuery on
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_isapi

Record HSE_APPEND_LOG_PARAMETER requests from ISAPI extensions to the query field (appended to the CustomLog %q component).

ISAPICacheFile Directive

Description: ISAPI .dll files to be loaded at startup
Syntax: ISAPICacheFile file-path [file-path] ...
Context: server config, virtual host
Status: Base
Module: mod_isapi

Specifies a space-separated list of file names to be loaded when the Apache server is launched, and remain loaded until the server is shut down. This directive may be repeated for every ISAPI .dll file desired. The full path name of each file should be specified. If the path name is not absolute, it will be treated relative to ServerRoot.

ISAPIFakeAsync Directive

Description: Fake asynchronous support for ISAPI callbacks
Syntax: ISAPIFakeAsync on|off
Default: ISAPIFakeAsync off
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_isapi

While set to on, asynchronous support for ISAPI callbacks is simulated.
ISAPILogNotSupported Directive

Description: Log unsupported feature requests from ISAPI extensions
Syntax: ISAPILogNotSupported on|off
Default: ISAPILogNotSupported off
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_isapi

Logs all requests for unsupported features from ISAPI extensions in the server error log. This may help administrators to track down problems. Once set to on and all desired ISAPI modules are functioning, it should be set back to off.

ISAPIReadAheadBuffer Directive

Description: Size of the Read Ahead Buffer sent to ISAPI extensions
Syntax: ISAPIReadAheadBuffer size
Default: ISAPIReadAheadBuffer 49152
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_isapi

Defines the maximum size of the Read Ahead Buffer sent to ISAPI extensions when they are initially invoked. All remaining data must be retrieved using the ReadClient callback; some ISAPI extensions may not support the ReadClient function. Refer questions to the ISAPI extension's author.

10.63 Apache Module mod_journald

Description: Provides "journald" ErrorLog provider
Status: Extension
ModuleIdentifier: journald_module
SourceFile: mod_journald.c

Summary

This module provides the "journald" ErrorLog provider. It allows logging error messages and CustomLog/TransferLog via systemd-journald(8).

Directives
This module provides no directives.

Structured logging

Systemd-journald allows structured logging and therefore it is possible to filter logged messages according to various variables. Currently supported variables are:

LOG
The name of the log. For ErrorLog, the value is "error_log". For CustomLog or TransferLog, the value is the first argument of these directives.

REQUEST_HOSTNAME
Host, as set by full URI or Host: header in the request.

REQUEST_USER
If an authentication check was made, this gets set to the user name.
REQUEST_USERAGENT_IP
The address that originated the request.

REQUEST_URI
The path portion of the URI, or "/" if no path provided.

SERVER_HOSTNAME
The hostname of server for which the log message has been generated.

These variables can be used, for example, to show only the log messages for a particular URI using journalctl:

journalctl REQUEST_URI=/index.html -a

For more examples, see systemd-journalctl documentation.

Examples

Using journald in ErrorLog directive (see core) instead of a filename enables logging via systemd-journald(8) if the system supports it.

ErrorLog journald

Using journald as an error log provider in CustomLog directive (see mod_log_config) enables logging via systemd-journald(8) if the system supports it.

CustomLog "journald" "%h %l %u %t \"%r\" %>s %b"

! Performance warning
Currently, systemd-journald is not designed for high-throughput logging, and sending the access log to systemd-journald could decrease performance considerably.

10.64 Apache Module mod_lbmethod_bybusyness

Description: Pending Request Counting load balancer scheduler algorithm for mod_proxy_balancer
Status: Extension
ModuleIdentifier: lbmethod_bybusyness_module
SourceFile: mod_lbmethod_bybusyness.c
Compatibility: Split off from mod_proxy_balancer in 2.3

Summary

This module does not provide any configuration directives of its own. It requires the services of mod_proxy_balancer, and provides the bybusyness load balancing method.

Directives
This module provides no directives.

See also
• mod_proxy
• mod_proxy_balancer

Pending Request Counting Algorithm

Enabled via lbmethod=bybusyness, this scheduler keeps track of how many requests each worker is currently assigned. A new request is automatically assigned to the worker with the lowest number of active requests.
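As a rough illustration of that assignment rule, the following Python sketch picks the worker with the fewest active requests. The Worker class, its busy field and the function name are hypothetical, and a tie is simply resolved in favour of the first worker in the list, which is an assumption of this sketch rather than the module's behaviour.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    name: str
    busy: int = 0  # requests currently assigned (hypothetical field name)


def pick_least_busy(workers):
    """Return the worker with the fewest active requests.

    min() keeps the first minimal element, so ties go to the earliest
    worker in the list (a simplification made for this sketch).
    """
    return min(workers, key=lambda w: w.busy)


workers = [Worker("a", busy=3), Worker("b", busy=1), Worker("c", busy=2)]
chosen = pick_least_busy(workers)
chosen.busy += 1  # the new request is now in flight on the chosen worker
print(chosen.name)  # b
```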
This is useful in the case of workers that queue incoming requests independently of Apache, to ensure that queue length stays even and a request is always given to the worker most likely to service it the fastest and reduce latency.

In the case of multiple least-busy workers, the statistics (and weightings) used by the Request Counting method are used to break the tie. Over time, the distribution of work will come to resemble that characteristic of byrequests (as implemented by mod_lbmethod_byrequests).

10.65 Apache Module mod_lbmethod_byrequests

Description: Request Counting load balancer scheduler algorithm for mod_proxy_balancer
Status: Extension
ModuleIdentifier: lbmethod_byrequests_module
SourceFile: mod_lbmethod_byrequests.c
Compatibility: Split off from mod_proxy_balancer in 2.3

Summary

This module does not provide any configuration directives of its own. It requires the services of mod_proxy_balancer, and provides the byrequests load balancing method.

Directives
This module provides no directives.

See also
• mod_proxy
• mod_proxy_balancer

Request Counting Algorithm

Enabled via lbmethod=byrequests, the idea behind this scheduler is that we distribute the requests among the various workers to ensure that each gets their configured share of the number of requests. It works as follows:

lbfactor is how much we expect this worker to work, or the worker's work quota. This is a normalized value representing their "share" of the amount of work to be done.

lbstatus is how urgently this worker has to work to fulfill its quota of work.

The worker is a member of the load balancer, usually a remote host serving one of the supported protocols.

We distribute each worker's work quota to the worker, and then look which of them needs to work most urgently (biggest lbstatus). This worker is then selected for work, and its lbstatus reduced by the total work quota we distributed to all workers.
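The selection step just described can be simulated with a short Python sketch. It follows the description literally (add each lbfactor to lbstatus, pick the largest lbstatus, subtract the total); the dictionary layout, the enabled flag and the function names are assumptions made for illustration, and the sketch expects at least one enabled worker.

```python
def pick_worker(workers):
    """Run one scheduling step and return the name of the selected worker.

    workers: list of dicts with 'name', 'lbfactor', 'lbstatus', 'enabled'
    (a hypothetical layout chosen for this sketch).
    """
    total_factor = 0
    candidate = None
    for w in workers:
        if not w["enabled"]:
            continue  # disabled workers receive no quota and are never picked
        w["lbstatus"] += w["lbfactor"]
        total_factor += w["lbfactor"]
        # strict comparison: on a tie the earlier worker stays the candidate
        if candidate is None or w["lbstatus"] > candidate["lbstatus"]:
            candidate = w
    candidate["lbstatus"] -= total_factor
    return candidate["name"]


def schedule(factors, rounds, disabled=()):
    """Return the sequence of workers chosen over the given number of rounds."""
    workers = [
        {"name": n, "lbfactor": f, "lbstatus": 0, "enabled": n not in disabled}
        for n, f in factors
    ]
    return [pick_worker(workers) for _ in range(rounds)]


print(schedule([("a", 70), ("b", 30)], 10))
# ['a', 'b', 'a', 'a', 'a', 'b', 'a', 'a', 'b', 'a']
```

With a 70/30 split, seven of every ten picks go to a and three to b, and after ten rounds every lbstatus is back at zero so the pattern repeats.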
Thus the sum of all lbstatus does not change(*) and we distribute the requests as desired.

If some workers are disabled, the others will still be scheduled correctly.

for each worker in workers
    worker lbstatus += worker lbfactor
    total factor    += worker lbfactor
    if worker lbstatus > candidate lbstatus
        candidate = worker

candidate lbstatus -= total factor

If a balancer is configured as follows:

worker     a    b    c    d
lbfactor   25   25   25   25
lbstatus   0    0    0    0

And b gets disabled, the following schedule is produced:

worker     a    b    c    d
lbstatus   -50  0    25   25
lbstatus   -25  0    -25  50
lbstatus   0    0    0    0
(repeat)

That is it schedules: a c d a c d a c d ... Please note that:

worker     a    b    c    d
lbfactor   25   25   25   25

Has the exact same behavior as:

worker     a    b    c    d
lbfactor   1    1    1    1

This is because all values of lbfactor are normalized with respect to the others. For:

worker     a    b    c
lbfactor   1    4    1

worker b will, on average, get 4 times the requests that a and c will.

The following asymmetric configuration works as one would expect:

worker     a    b
lbfactor   70   30
lbstatus   -30  30
lbstatus   40   -40
lbstatus   10   -10
lbstatus   -20  20
lbstatus   -50  50
lbstatus   20   -20
lbstatus   -10  10
lbstatus   -40  40
lbstatus   30   -30
lbstatus   0    0
(repeat)

That is, after 10 schedules the schedule repeats, and 7 a are selected with 3 b interspersed.

10.66 Apache Module mod_lbmethod_bytraffic

Description: Weighted Traffic Counting load balancer scheduler algorithm for mod_proxy_balancer
Status: Extension
ModuleIdentifier: lbmethod_bytraffic_module
SourceFile: mod_lbmethod_bytraffic.c
Compatibility: Split off from mod_proxy_balancer in 2.3

Summary

This module does not provide any configuration directives of its own. It requires the services of mod_proxy_balancer, and provides the bytraffic load balancing method.

Directives
This module provides no directives.
See also

• mod_proxy
• mod_proxy_balancer

Weighted Traffic Counting Algorithm

Enabled via lbmethod=bytraffic, the idea behind this scheduler is very similar to the Request Counting method, with the following changes:

lbfactor is how much traffic, in bytes, we want this worker to handle. This is also a normalized value representing its "share" of the amount of work to be done, but instead of simply counting the number of requests, we take into account the amount of traffic this worker has either seen or produced.

If a balancer is configured as follows:

    worker      a     b     c
    lbfactor    1     2     1

Then we mean that we want b to process twice as many bytes as a or c. It does not necessarily mean that b would handle twice as many requests, but it would process twice the I/O. Thus, the sizes of the request and the response are applied to the weighting and selection algorithm.

Note: input and output bytes are weighted the same.

10.67 Apache Module mod_lbmethod_heartbeat

Description: Heartbeat Traffic Counting load balancer scheduler algorithm for mod_proxy_balancer
Status: Experimental
ModuleIdentifier: lbmethod_heartbeat_module
SourceFile: mod_lbmethod_heartbeat.c
Compatibility: Available in version 2.3 and later

Summary

lbmethod=heartbeat uses the services of mod_heartmonitor to balance between origin servers that are providing heartbeat info via the mod_heartbeat module.

This module's load balancing algorithm favors servers with more ready (idle) capacity over time, but does not select the server with the most ready capacity every time. Servers that have 0 active clients are penalized, with the assumption that they are not fully initialized.
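One simple scheme with this property is weighted random selection, in which each server's chance of being picked is proportional to its ready count. The sketch below illustrates that idea only. It is not taken from the module's source, the server names and ready counts are invented, and it does not model the penalty for servers that report 0 active clients.

```python
import random


def pick_server(ready_by_server, rng=random):
    """Pick a server with probability proportional to its ready (idle) slots.

    `ready_by_server` maps a (hypothetical) server name to the ready count
    it last reported. Returns None when no server has ready capacity.
    """
    names = [name for name, ready in ready_by_server.items() if ready > 0]
    if not names:
        return None
    weights = [ready_by_server[name] for name in names]
    # random.choices draws according to the given relative weights.
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    rng = random.Random(2016)  # fixed seed so that runs are repeatable
    ready = {"app1": 8, "app2": 2, "app3": 0}
    counts = {name: 0 for name in ready}
    for _ in range(10_000):
        counts[pick_server(ready, rng)] += 1
    print(counts)
```

Over many requests the server with more ready slots receives most of the traffic, yet the less idle server is still chosen from time to time, which is the behavior the summary describes.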
Directives

• HeartbeatStorage

See also

• mod_proxy
• mod_proxy_balancer
• mod_heartbeat
• mod_heartmonitor

HeartbeatStorage Directive

Description: Path to read heartbeat data
Syntax: HeartbeatStorage file-path
Default: HeartbeatStorage logs/hb.dat
Context: server config
Status: Experimental
Module: mod_lbmethod_heartbeat

The HeartbeatStorage directive specifies the path to read heartbeat data. This flat-file is used only when mod_slotmem_shm is not loaded.

10.68 Apache Module mod_ldap

Description: LDAP connection pooling and result caching services for use by other LDAP modules
Status: Extension
ModuleIdentifier: ldap_module
SourceFile: util_ldap.c

Summary

This module was created to improve the performance of websites relying on backend connections to LDAP servers. In addition to the functions provided by the standard LDAP libraries, this module adds an LDAP connection pool and an LDAP shared memory cache.

To enable this module, LDAP support must be compiled into apr-util. This is achieved by adding the --with-ldap flag to the configure script when building Apache.

SSL/TLS support is dependent on which LDAP toolkit has been linked to APR. As of this writing, APR-util supports: OpenLDAP SDK (http://www.openldap.org/) (2.x or later), Novell LDAP SDK (http://developer.novell.com/ndk/cldap.htm), Mozilla LDAP SDK (https://wiki.mozilla.org/LDAP_C_SDK), native Solaris LDAP SDK (Mozilla based) or the native Microsoft LDAP SDK. See the APR website (http://apr.apache.org) for details.

Directives

• LDAPCacheEntries
• LDAPCacheTTL
• LDAPConnectionPoolTTL
• LDAPConnectionTimeout
• LDAPLibraryDebug
• LDAPOpCacheEntries
• LDAPOpCacheTTL
• LDAPReferralHopLimit
• LDAPReferrals
• LDAPRetries
• LDAPRetryDelay
• LDAPSharedCacheFile
• LDAPSharedCacheSize
• LDAPTimeout
• LDAPTrustedClientCert
• LDAPTrustedGlobalCert
• LDAPTrustedMode
• LDAPVerifyServerCert
Example Configuration

The following is an example configuration that uses mod_ldap to increase the performance of HTTP Basic authentication provided by mod_authnz_ldap.

    # Enable the LDAP connection pool and shared
    # memory cache. Enable the LDAP cache status
    # handler. Requires that mod_ldap and
    # mod_authnz_ldap be loaded. Change the
    # "yourdomain.example.com" to match your domain.

    LDAPSharedCacheSize 500000
    LDAPCacheEntries 1024
    LDAPCacheTTL 600
    LDAPOpCacheEntries 1024
    LDAPOpCacheTTL 600

    <Location "/ldap-status">
        SetHandler ldap-status

        Require host yourdomain.example.com

        Satisfy any
        AuthType Basic
        AuthName "LDAP Protected"
        AuthBasicProvider ldap
        AuthLDAPURL ldap://127.0.0.1/dc=example,dc=com?uid?one
        Require valid-user
    </Location>

LDAP Connection Pool

LDAP connections are pooled from request to request. This allows the LDAP server to remain connected and bound ready for the next request, without the need to unbind/connect/rebind. The performance advantages are similar to the effect of HTTP keepalives.

On a busy server it is possible that many requests will try to access the same LDAP server connection simultaneously. Where an LDAP connection is in use, Apache will create a new connection alongside the original one. This ensures that the connection pool does not become a bottleneck.

There is no need to manually enable connection pooling in the Apache configuration. Any module using this module for access to LDAP services will share the connection pool.

LDAP connections can keep track of the LDAP client credentials used when binding to an LDAP server. These credentials can be provided to LDAP servers that do not allow anonymous binds during referral chasing. To control this feature, see the LDAPReferrals and LDAPReferralHopLimit directives. By default, this feature is enabled.

LDAP Cache

For improved performance, mod_ldap uses an aggressive caching strategy to minimize the number of times that the LDAP server must be contacted.
Caching can easily double or triple the throughput of Apache when it is serving pages protected with mod_authnz_ldap. In addition, the load on the LDAP server will be significantly decreased.

mod_ldap supports two types of LDAP caching: during the search/bind phase with a search/bind cache, and during the compare phase with two operation caches. Each LDAP URL that is used by the server has its own set of these three caches.

The Search/Bind Cache

The process of doing a search and then a bind is the most time-consuming aspect of LDAP operation, especially if the directory is large. The search/bind cache is used to cache all searches that resulted in successful binds. Negative results (i.e., unsuccessful searches, or searches that did not result in a successful bind) are not cached. The rationale behind this decision is that connections with invalid credentials are only a tiny percentage of the total number of connections, so by not caching invalid credentials, the size of the cache is reduced.

mod_ldap stores the username, the DN retrieved, the password used to bind, and the time of the bind in the cache. Whenever a new connection is initiated with the same username, mod_ldap compares the password of the new connection with the password in the cache. If the passwords match, and if the cached entry is not too old, mod_ldap bypasses the search/bind phase.

The search and bind cache is controlled with the LDAPCacheEntries and LDAPCacheTTL directives.

Operation Caches

During attribute and distinguished name comparison functions, mod_ldap uses two operation caches to cache the compare operations. The first compare cache is used to cache the results of compares done to test for LDAP group membership. The second compare cache is used to cache the results of comparisons done between distinguished names.

Note that, when group membership is being checked, any sub-group comparison results are cached to speed future sub-group comparisons.
The behavior of both of these caches is controlled with the LDAPOpCacheEntries and LDAPOpCacheTTL directives.

Monitoring the Cache

mod_ldap has a content handler that allows administrators to monitor the cache performance. The name of the content handler is ldap-status, so the following directives could be used to access the mod_ldap cache information:

    <Location "/cache-info">
        SetHandler ldap-status
    </Location>

By fetching the URL http://servername/cache-info, the administrator can get a status report of every cache that is used by the mod_ldap cache. Note that if Apache does not support shared memory, then each httpd instance has its own cache, so reloading the URL will result in different information each time, depending on which httpd instance processes the request.

Using SSL/TLS

The ability to create SSL and TLS connections to an LDAP server is defined by the directives LDAPTrustedGlobalCert, LDAPTrustedClientCert and LDAPTrustedMode. These directives specify the CA and optional client certificates to be used, as well as the type of encryption to be used on the connection (none, SSL or TLS/STARTTLS).

    # Establish an SSL LDAP connection on port 636. Requires that
    # mod_ldap and mod_authnz_ldap be loaded. Change the
    # "yourdomain.example.com" to match your domain.

    LDAPTrustedGlobalCert CA_DER /certs/certfile.der

    <Location "/ldap-status">
        SetHandler ldap-status

        Require host yourdomain.example.com

        Satisfy any
        AuthType Basic
        AuthName "LDAP Protected"
        AuthBasicProvider ldap
        AuthLDAPURL ldaps://127.0.0.1/dc=example,dc=com?uid?one
        Require valid-user
    </Location>

    # Establish a TLS LDAP connection on port 389. Requires that
    # mod_ldap and mod_authnz_ldap be loaded. Change the
    # "yourdomain.example.com" to match your domain.

    LDAPTrustedGlobalCert CA_DER /certs/certfile.der

    <Location "/ldap-status">
        SetHandler ldap-status

        Require host yourdomain.example.com

        Satisfy any
        AuthType Basic
        AuthName "LDAP Protected"
        AuthBasicProvider ldap
        AuthLDAPURL ldap://127.0.0.1/dc=example,dc=com?uid?one TLS
        Require valid-user
    </Location>

SSL/TLS Certificates

The different LDAP SDKs have widely different methods of setting and handling both CA and client side certificates.

If you intend to use SSL or TLS, read this section CAREFULLY so as to understand the differences between configurations on the different LDAP toolkits supported.

Netscape/Mozilla/iPlanet SDK

CA certificates are specified within a file called cert7.db. The SDK will not talk to any LDAP server whose certificate was not signed by a CA specified in this file. If client certificates are required, an optional key3.db file may be specified with an optional password. The secmod file can be specified if required. These files are in the same format as used by the Netscape Communicator or Mozilla web browsers. The easiest way to obtain these files is to grab them from your browser installation.

Client certificates are specified per connection using the LDAPTrustedClientCert directive by referring to the certificate "nickname". An optional password may be specified to unlock the certificate's private key.

The SDK supports SSL only. An attempt to use STARTTLS will cause an error when an attempt is made to contact the LDAP server at runtime.
    # Specify a Netscape CA certificate file
    LDAPTrustedGlobalCert CA_CERT7_DB /certs/cert7.db
    # Specify an optional key3.db file for client certificate support
    LDAPTrustedGlobalCert CERT_KEY3_DB /certs/key3.db
    # Specify the secmod file if required
    LDAPTrustedGlobalCert CA_SECMOD /certs/secmod

    <Location "/ldap-status">
        SetHandler ldap-status

        Require host yourdomain.example.com

        Satisfy any
        AuthType Basic
        AuthName "LDAP Protected"
        AuthBasicProvider ldap
        LDAPTrustedClientCert CERT_NICKNAME <nickname> [password]
        AuthLDAPURL ldaps://127.0.0.1/dc=example,dc=com?uid?one
        Require valid-user
    </Location>

Novell SDK

One or more CA certificates must be specified for the Novell SDK to work correctly. These certificates can be specified as binary DER or Base64 (PEM) encoded files.

Note: Client certificates are specified globally rather than per connection, and so must be specified with the LDAPTrustedGlobalCert directive as below. Trying to set client certificates via the LDAPTrustedClientCert directive will cause an error to be logged when an attempt is made to connect to the LDAP server.

The SDK supports both SSL and STARTTLS, set using the LDAPTrustedMode parameter. If an ldaps:// URL is specified, SSL mode is forced, overriding this directive.

    # Specify two CA certificate files
    LDAPTrustedGlobalCert CA_DER /certs/cacert1.der
    LDAPTrustedGlobalCert CA_BASE64 /certs/cacert2.pem
    # Specify a client certificate file and key
    LDAPTrustedGlobalCert CERT_BASE64 /certs/cert1.pem
    LDAPTrustedGlobalCert KEY_BASE64 /certs/key1.pem [password]
    # Do not use this directive, as it will throw an error
    #LDAPTrustedClientCert CERT_BASE64 /certs/cert1.pem

OpenLDAP SDK

One or more CA certificates must be specified for the OpenLDAP SDK to work correctly. These certificates can be specified as binary DER or Base64 (PEM) encoded files.

Both CA and client certificates may be specified globally (LDAPTrustedGlobalCert) or per-connection (LDAPTrustedClientCert). When any settings are specified per-connection, the global settings are superseded.
The documentation for the SDK claims to support both SSL and STARTTLS; however, STARTTLS does not seem to work on all versions of the SDK. The SSL/TLS mode can be set using the LDAPTrustedMode parameter. If an ldaps:// URL is specified, SSL mode is forced. The OpenLDAP documentation notes that SSL (ldaps://) support has been deprecated in favor of TLS, although the SSL functionality still works.

    # Specify two CA certificate files
    LDAPTrustedGlobalCert CA_DER /certs/cacert1.der
    LDAPTrustedGlobalCert CA_BASE64 /certs/cacert2.pem

    <Location "/ldap-status">
        SetHandler ldap-status

        Require host yourdomain.example.com

        LDAPTrustedClientCert CERT_BASE64 /certs/cert1.pem
        LDAPTrustedClientCert KEY_BASE64 /certs/key1.pem
        # CA certs respecified due to per-directory client certs
        LDAPTrustedClientCert CA_DER /certs/cacert1.der
        LDAPTrustedClientCert CA_BASE64 /certs/cacert2.pem
        Satisfy any
        AuthType Basic
        AuthName "LDAP Protected"
        AuthBasicProvider ldap
        AuthLDAPURL ldaps://127.0.0.1/dc=example,dc=com?uid?one
        Require valid-user
    </Location>

Solaris SDK

SSL/TLS for the native Solaris LDAP libraries is not yet supported. If required, install and use the OpenLDAP libraries instead.

Microsoft SDK

SSL/TLS certificate configuration for the native Microsoft LDAP libraries is done inside the system registry, and no configuration directives are required.

Both SSL and TLS are supported by using the ldaps:// URL format, or by using the LDAPTrustedMode directive accordingly.

Note: The status of support for client certificates is not yet known for this toolkit.

LDAPCacheEntries Directive

Description: Maximum number of entries in the primary LDAP cache
Syntax: LDAPCacheEntries number
Default: LDAPCacheEntries 1024
Context: server config
Status: Extension
Module: mod_ldap

Specifies the maximum size of the primary LDAP cache. This cache contains successful search/binds. Set it to 0 to turn off search/bind caching. The default size is 1024 cached searches.
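To make the interaction between the entry limit and the search/bind cache described earlier more concrete, here is a small Python model. It is a sketch under stated assumptions, not mod_ldap's implementation: the class and method names are invented, eviction simply drops the oldest entry, and the ttl_seconds parameter stands in for the validity period of a cached item.

```python
import time
from collections import OrderedDict


class SearchBindCache:
    """Toy model of a search/bind cache with an entry limit and a TTL."""

    def __init__(self, max_entries=1024, ttl_seconds=600, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        # username -> (dn, password, time of the successful bind)
        self._entries = OrderedDict()

    def store_successful_bind(self, username, dn, password):
        """Record a successful search/bind; failures are never stored."""
        if self.max_entries <= 0:
            return  # a limit of 0 turns search/bind caching off
        self._entries.pop(username, None)
        self._entries[username] = (dn, password, self.clock())
        while len(self._entries) > self.max_entries:
            # Assumption for this sketch: evict the oldest entry first.
            self._entries.popitem(last=False)

    def lookup(self, username, password):
        """Return the cached DN, or None if a fresh search/bind is needed."""
        entry = self._entries.get(username)
        if entry is None:
            return None
        dn, cached_password, bind_time = entry
        if self.clock() - bind_time > self.ttl_seconds:
            del self._entries[username]  # the entry is too old
            return None
        if cached_password != password:
            return None
        return dn
```

A lookup that returns a DN corresponds to bypassing the search/bind phase; any None result means the LDAP server has to be contacted again.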
LDAPCacheTTL Directive

Description: Time that cached items remain valid
Syntax: LDAPCacheTTL seconds
Default: LDAPCacheTTL 600
Context: server config
Status: Extension
Module: mod_ldap

Specifies the time (in seconds) that an item in the search/bind cache remains valid. The default is 600 seconds (10 minutes).

LDAPConnectionPoolTTL Directive

Description: Discard backend connections that have been sitting in the connection pool too long
Syntax: LDAPConnectionPoolTTL n
Default: LDAPConnectionPoolTTL -1
Context: server config, virtual host
Status: Extension
Module: mod_ldap
Compatibility: Apache HTTP Server 2.3.12 and later

Specifies the maximum age, in seconds, that a pooled LDAP connection can remain idle and still be available for use. Connections are cleaned up when they are next needed, not asynchronously.

A setting of 0 causes connections to never be saved in the backend connection pool. The default value of -1, and any other negative value, allows connections of any age to be reused.

For performance reasons, the reference time used by this directive is based on when the LDAP connection is returned to the pool, not the time of the last successful I/O with the LDAP server.

Since 2.4.10, new measures are in place to prevent the reference time from being inflated by cache hits or slow requests. First, the reference time is not updated if no backend LDAP connections were needed. Second, the reference time uses the time the HTTP request was received instead of the time the request is completed.

Note: This timeout defaults to units of seconds, but accepts suffixes for milliseconds (ms), minutes (min), and hours (h).

LDAPConnectionTimeout Directive

Description: Specifies the socket connection timeout in seconds
Syntax: LDAPConnectionTimeout seconds
Context: server config
Status: Extension
Module: mod_ldap

This directive configures the LDAP_OPT_NETWORK_TIMEOUT (or LDAP_OPT_CONNECT_TIMEOUT) option in the underlying LDAP client library, when available.
This value typically controls how long the LDAP client library will wait for the TCP connection to the LDAP server to complete.

If a connection is not successful within the timeout period, either an error will be returned or the LDAP client library will attempt to connect to a secondary LDAP server if one is specified (via a space-separated list of hostnames in the AuthLDAPURL).

The default is 10 seconds, if the LDAP client library linked with the server supports the LDAP_OPT_NETWORK_TIMEOUT option.

Note: LDAPConnectionTimeout is only available when the LDAP client library linked with the server supports the LDAP_OPT_NETWORK_TIMEOUT (or LDAP_OPT_CONNECT_TIMEOUT) option, and the ultimate behavior is dictated entirely by the LDAP client library.

LDAPLibraryDebug Directive

Description: Enable debugging in the LDAP SDK
Syntax: LDAPLibraryDebug 7
Default: disabled
Context: server config
Status: Extension
Module: mod_ldap

Turns on SDK-specific LDAP debug options that generally cause the LDAP SDK to log verbose trace information to the main Apache error log. The trace messages from the LDAP SDK provide gory details that can be useful during debugging of connectivity problems with backend LDAP servers.

This option is only configurable when Apache HTTP Server is linked with an LDAP SDK that implements LDAP_OPT_DEBUG or LDAP_OPT_DEBUG_LEVEL, such as OpenLDAP (a value of 7 is verbose) or Tivoli Directory Server (a value of 65535 is verbose).

Warning: The logged information will likely contain plaintext credentials being used or validated by LDAP authentication, so care should be taken in protecting and purging the error log when this directive is used.

LDAPOpCacheEntries Directive

Description: Number of entries used to cache LDAP compare operations
Syntax: LDAPOpCacheEntries number
Default: LDAPOpCacheEntries 1024
Context: server config
Status: Extension
Module: mod_ldap

This specifies the number of entries mod_ldap will use to cache LDAP compare operations.
The default is 1024 entries. Setting it to 0 disables operation caching.

LDAPOpCacheTTL Directive

Description: Time that entries in the operation cache remain valid
Syntax: LDAPOpCacheTTL seconds
Default: LDAPOpCacheTTL 600
Context: server config
Status: Extension
Module: mod_ldap

Specifies the time (in seconds) that entries in the operation cache remain valid. The default is 600 seconds.

LDAPReferralHopLimit Directive

Description: The maximum number of referral hops to chase before terminating an LDAP query.
Syntax: LDAPReferralHopLimit number
Default: SDK dependent, typically between 5 and 10
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_ldap

This directive, if enabled by the LDAPReferrals directive, limits the number of referral hops that are followed before terminating an LDAP query.

Warning: Support for this tunable is uncommon in LDAP SDKs.

LDAPReferrals Directive

Description: Enable referral chasing during queries to the LDAP server.
Syntax: LDAPReferrals On|Off|default
Default: LDAPReferrals On
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_ldap
Compatibility: The default parameter is available in Apache 2.4.7 and later

Some LDAP servers divide their directory among multiple domains and use referrals to direct a client when a domain boundary is crossed. This is similar to an HTTP redirect. LDAP client libraries may or may not chase referrals by default. This directive explicitly configures the referral chasing in the underlying SDK.

LDAPReferrals takes the following values:

"on"
    When set to "on", the underlying SDK's referral chasing state is enabled, LDAPReferralHopLimit is used to override the SDK's hop limit, and an LDAP rebind callback is registered.

"off"
    When set to "off", the underlying SDK's referral chasing state is disabled completely.

"default"
    When set to "default", the underlying SDK's referral chasing state is not changed, LDAPReferralHopLimit is not used to override the SDK's hop limit, and no LDAP rebind callback is registered.

The directive LDAPReferralHopLimit works in conjunction with this directive to limit the number of referral hops to follow before terminating the LDAP query. When referral processing is enabled by a value of "On", client credentials will be provided, via a rebind callback, for any LDAP server requiring them.

LDAPRetries Directive

Description: Configures the number of LDAP server retries.
Syntax: LDAPRetries number-of-retries
Default: LDAPRetries 3
Context: server config
Status: Extension
Module: mod_ldap

The server will retry failed LDAP requests up to LDAPRetries times. Setting this directive to 0 disables retries.

LDAP errors such as timeouts and refused connections are retryable.

LDAPRetryDelay Directive

Description: Configures the delay between LDAP server retries.
Syntax: LDAPRetryDelay seconds
Default: LDAPRetryDelay 0
Context: server config
Status: Extension
Module: mod_ldap

If LDAPRetryDelay is set to a non-zero value, the server will delay retrying an LDAP request for the specified amount of time. Setting this directive to 0 causes any retry to occur without delay.

LDAP errors such as timeouts and refused connections are retryable.

LDAPSharedCacheFile Directive

Description: Sets the shared memory cache file
Syntax: LDAPSharedCacheFile file-path
Context: server config
Status: Extension
Module: mod_ldap

Specifies the path of the shared memory cache file. If not set, anonymous shared memory will be used if the platform supports it.

If file-path is not an absolute path, the location specified will be relative to the value of DefaultRuntimeDir.
LDAPSharedCacheSize Directive

Description: Size in bytes of the shared-memory cache
Syntax: LDAPSharedCacheSize bytes
Default: LDAPSharedCacheSize 500000
Context: server config
Status: Extension
Module: mod_ldap

Specifies the number of bytes to allocate for the shared memory cache. The default is 500kb. If set to 0, shared memory caching will not be used and every httpd process will create its own cache.

LDAPTimeout Directive

Description: Specifies the timeout for LDAP search and bind operations, in seconds
Syntax: LDAPTimeout seconds
Default: LDAPTimeout 60
Context: server config
Status: Extension
Module: mod_ldap
Compatibility: Apache HTTP Server 2.3.5 and later

This directive configures the timeout for bind and search operations, as well as the LDAP_OPT_TIMEOUT option in the underlying LDAP client library, when available.

If the timeout expires, httpd will retry in case an existing connection has been silently dropped by a firewall. However, performance will be much better if the firewall is configured to send TCP RST packets instead of silently dropping packets.

Note: Timeouts for LDAP compare operations require an SDK with LDAP_OPT_TIMEOUT, such as OpenLDAP >= 2.4.4.

LDAPTrustedClientCert Directive

Description: Sets the file containing or nickname referring to a per connection client certificate. Not all LDAP toolkits support per connection client certificates.
Syntax: LDAPTrustedClientCert type directory-path/filename/nickname [password]
Context: directory, .htaccess
Status: Extension
Module: mod_ldap

It specifies the directory path, file name or nickname of a per connection client certificate used when establishing an SSL or TLS connection to an LDAP server. Different locations or directories may have their own independent client certificate settings.
Some LDAP toolkits (notably Novell) do not support per connection client certificates, and will throw an error on LDAP server connection if you try to use this directive. (Use the LDAPTrustedGlobalCert directive instead for Novell client certificates. See the SSL/TLS certificate guide above for details.) The type specifies the kind of certificate parameter being set, depending on the LDAP toolkit being used. Supported types are:

• CA_DER - binary DER encoded CA certificate
• CA_BASE64 - PEM encoded CA certificate
• CERT_DER - binary DER encoded client certificate
• CERT_BASE64 - PEM encoded client certificate
• CERT_NICKNAME - Client certificate "nickname" (Netscape SDK)
• KEY_DER - binary DER encoded private key
• KEY_BASE64 - PEM encoded private key

LDAPTrustedGlobalCert Directive

Description: Sets the file or database containing global trusted Certificate Authority or global client certificates
Syntax: LDAPTrustedGlobalCert type directory-path/filename [password]
Context: server config
Status: Extension
Module: mod_ldap

It specifies the directory path and file name of the trusted CA certificates and/or system wide client certificates mod_ldap should use when establishing an SSL or TLS connection to an LDAP server. Note that all certificate information specified using this directive is applied globally to the entire server installation. Some LDAP toolkits (notably Novell) require all client certificates to be set globally using this directive. Most other toolkits require client certificates to be set per Directory or per Location using LDAPTrustedClientCert. If you get this wrong, an error may be logged when an attempt is made to contact the LDAP server, or the connection may silently fail. (See the SSL/TLS certificate guide above for details.) The type specifies the kind of certificate parameter being set, depending on the LDAP toolkit being used.
Supported types are:

• CA_DER - binary DER encoded CA certificate
• CA_BASE64 - PEM encoded CA certificate
• CA_CERT7_DB - Netscape cert7.db CA certificate database file
• CA_SECMOD - Netscape secmod database file
• CERT_DER - binary DER encoded client certificate
• CERT_BASE64 - PEM encoded client certificate
• CERT_KEY3_DB - Netscape key3.db client certificate database file
• CERT_NICKNAME - Client certificate "nickname" (Netscape SDK)
• CERT_PFX - PKCS#12 encoded client certificate (Novell SDK)
• KEY_DER - binary DER encoded private key
• KEY_BASE64 - PEM encoded private key
• KEY_PFX - PKCS#12 encoded private key (Novell SDK)

LDAPTrustedMode Directive

Description: Specifies the SSL/TLS mode to be used when connecting to an LDAP server.
Syntax: LDAPTrustedMode type
Context: server config, virtual host
Status: Extension
Module: mod_ldap

The following modes are supported:

• NONE - no encryption
• SSL - ldaps:// encryption on default port 636
• TLS - STARTTLS encryption on default port 389

Not all LDAP toolkits support all the above modes. An error message will be logged at runtime if a mode is not supported, and the connection to the LDAP server will fail.

If an ldaps:// URL is specified, the mode becomes SSL and the setting of LDAPTrustedMode is ignored.

LDAPVerifyServerCert Directive

Description: Force server certificate verification
Syntax: LDAPVerifyServerCert On|Off
Default: LDAPVerifyServerCert On
Context: server config
Status: Extension
Module: mod_ldap

Specifies whether to force the verification of a server certificate when establishing an SSL connection to the LDAP server.

10.69 Apache Module mod_log_config

Description: Logging of the requests made to the server
Status: Base
ModuleIdentifier: log_config_module
SourceFile: mod_log_config.c

Summary

This module provides for flexible logging of client requests.
Logs are written in a customizable format, and may be written directly to a file, or to an external program. Conditional logging is provided so that individual requests may be included or excluded from the logs based on characteristics of the request.

Three directives are provided by this module: TransferLog to create a log file, LogFormat to set a custom format, and CustomLog to define a log file and format in one step. The TransferLog and CustomLog directives can be used multiple times in each server to cause each request to be logged to multiple files.

Directives

• BufferedLogs
• CustomLog
• GlobalLog
• LogFormat
• TransferLog

See also

• Apache Log Files

Custom Log Formats

The format argument to the LogFormat and CustomLog directives is a string. This string is used to log each request to the log file. It can contain literal characters copied into the log files and the C-style control characters "\n" and "\t" to represent new-lines and tabs. Literal quotes and backslashes should be escaped with backslashes.

The characteristics of the request itself are logged by placing "%" directives in the format string, which are replaced in the log file by the values as follows:

Format String    Description

%%               The percent sign.
%a               Client IP address of the request (see the mod_remoteip module).
%{c}a            Underlying peer IP address of the connection (see the mod_remoteip module).
%A               Local IP-address.
%B               Size of response in bytes, excluding HTTP headers.
%b               Size of response in bytes, excluding HTTP headers. In CLF format, i.e. a '-' rather than a 0 when no bytes are sent.
%{VARNAME}C      The contents of cookie VARNAME in the request sent to the server. Only version 0 cookies are fully supported.
%D               The time taken to serve the request, in microseconds.
%{VARNAME}e      The contents of the environment variable VARNAME.
%f               Filename.
%h               Remote hostname. Will log the IP address if HostnameLookups is set to Off, which is the default. If it logs the hostname for only a few hosts, you probably have access control directives mentioning them by name. See the Require host documentation.
%H               The request protocol.
%{VARNAME}i      The contents of VARNAME: header line(s) in the request sent to the server. Changes made by other modules (e.g. mod_headers) affect this. If you're interested in what the request header was prior to when most modules would have modified it, use mod_setenvif to copy the header into an internal environment variable and log that value with the %{VARNAME}e described above.
%k               Number of keepalive requests handled on this connection. Interesting if KeepAlive is being used, so that, for example, a '1' means the first keepalive request after the initial one, '2' the second, etc.; otherwise this is always 0 (indicating the initial request).
%l               Remote logname (from identd, if supplied). This will return a dash unless mod_ident is present and IdentityCheck is set On.
%L               The request log ID from the error log (or '-' if nothing has been logged to the error log for this request). Look for the matching error log line to see what request caused what error.
%m               The request method.
%{VARNAME}n      The contents of note VARNAME from another module.
%{VARNAME}o      The contents of VARNAME: header line(s) in the reply.
%p               The canonical port of the server serving the request.
%{format}p       The canonical port of the server serving the request, or the server's actual port, or the client's actual port. Valid formats are canonical, local, or remote.
%P               The process ID of the child that serviced the request.
%{format}P       The process ID or thread ID of the child that serviced the request. Valid formats are pid, tid, and hextid. hextid requires APR 1.2.0 or higher.
%q               The query string (prepended with a ? if a query string exists, otherwise an empty string).
%r               First line of request.
%R               The handler generating the response (if any).
%s               Status. For requests that have been internally redirected, this is the status of the original request. Use %>s for the final status.
%t               Time the request was received, in the format [18/Sep/2011:19:18:28 -0400]. The last number indicates the timezone offset from GMT.
%{format}t       The time, in the form given by format, which should be in an extended strftime(3) format (potentially localized). If the format starts with begin: (default) the time is taken at the beginning of the request processing. If it starts with end: it is the time when the log entry gets written, close to the end of the request processing. In addition to the formats supported by strftime(3), the following format tokens are supported:
                     sec          number of seconds since the Epoch
                     msec         number of milliseconds since the Epoch
                     usec         number of microseconds since the Epoch
                     msec_frac    millisecond fraction
                     usec_frac    microsecond fraction
                 These tokens cannot be combined with each other or strftime(3) formatting in the same format string. You can use multiple %{format}t tokens instead.
%T               The time taken to serve the request, in seconds.
%{UNIT}T         The time taken to serve the request, in a time unit given by UNIT. Valid units are ms for milliseconds, us for microseconds, and s for seconds. Using s gives the same result as %T without any format; using us gives the same result as %D. Combining %T with a unit is available in 2.4.13 and later.
%u               Remote user if the request was authenticated. May be bogus if return status (%s) is 401 (unauthorized).
%U               The URL path requested, not including any query string.
%v               The canonical ServerName of the server serving the request.
%V               The server name according to the UseCanonicalName setting.
%X               Connection status when response is completed:
                     X = Connection aborted before the response completed.
                     + = Connection may be kept alive after the response is sent.
                     - = Connection will be closed after the response is sent.
%I               Bytes received, including request and headers. Cannot be zero. You need to enable mod_logio to use this.
%O               Bytes sent, including headers. May be zero in rare cases such as when a request is aborted before a response is sent. You need to enable mod_logio to use this.
%S               Bytes transferred (received and sent), including request and headers. Cannot be zero. This is the combination of %I and %O. You need to enable mod_logio to use this.
%{VARNAME}^ti    The contents of VARNAME: trailer line(s) in the request sent to the server.
%{VARNAME}^to    The contents of VARNAME: trailer line(s) in the response sent from the server.

Modifiers

Particular items can be restricted to print only for responses with specific HTTP status codes by placing a comma-separated list of status codes immediately following the "%". The status code list may be preceded by a "!" to indicate negation.

Format String              Meaning

%400,501{User-agent}i      Logs User-agent on 400 errors and 501 errors only. For other status codes, the literal string "-" will be logged.
%!200,304,302{Referer}i    Logs Referer on all requests that do not return one of the three specified codes, "-" otherwise.

The modifiers "<" and ">" can be used for requests that have been internally redirected to choose whether the original or final (respectively) request should be consulted. By default, the % directives %s, %U, %T, %D, and %r look at the original request while all others look at the final request. So for example, %>s can be used to record the final status of the request and %<u can be used to record the original authenticated user on a request that is internally redirected to an unauthenticated resource.

Examples

Some commonly used log format strings are:

Common Log Format (CLF)
    "%h %l %u %t \"%r\" %>s %b"

Common Log Format with Virtual Host
    "%v %h %l %u %t \"%r\" %>s %b"
NCSA extended/combined log format
    "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\""
Referer log format
    "%{Referer}i -> %U"
Agent (Browser) log format
    "%{User-agent}i"

You can use the %{format}t directive multiple times to build up a time format using the extended format tokens like msec_frac:

Timestamp including milliseconds
    "%{%d/%b/%Y %T}t.%{msec_frac}t %{%z}t"

Security Considerations

See the security tips (p. 364) document for details on why your security could be compromised if the directory where logfiles are stored is writable by anyone other than the user that starts the server.

BufferedLogs Directive

Description:    Buffer log entries in memory before writing to disk
Syntax:         BufferedLogs On|Off
Default:        BufferedLogs Off
Context:        server config
Status:         Base
Module:         mod_log_config

The BufferedLogs directive causes mod_log_config to store several log entries in memory and write them together to disk, rather than writing them after each request. On some systems, this may result in more efficient disk access and hence higher performance. It may be set only once for the entire server; it cannot be configured per virtual-host.

Note: This directive should be used with caution as a crash might cause loss of logging data.

CustomLog Directive

Description:    Sets filename and format of log file
Syntax:         CustomLog file|pipe|provider format|nickname [env=[!]environment-variable| expr=expression]
Context:        server config, virtual host
Status:         Base
Module:         mod_log_config

The CustomLog directive is used to log requests to the server. A log format is specified, and the logging can optionally be made conditional on request characteristics using environment variables.

The first argument, which specifies the location to which the logs will be written, can take one of the following three types of values:

file
    A filename, relative to the ServerRoot.
pipe
    The pipe character "|", followed by the path to a program to receive the log information on its standard input. See the notes on piped logs (p. 56) for more information.

    Security: If a program is used, then it will be run as the user who started httpd. This will be root if the server was started by root; be sure that the program is secure.

    Note: When entering a file path on non-Unix platforms, care should be taken to make sure that only forward slashes are used even though the platform may allow the use of back slashes. In general it is a good idea to always use forward slashes throughout the configuration files.

provider
    Modules implementing ErrorLog providers can also be used as a target for CustomLog messages. To use an ErrorLog provider as a target, the "provider:argument" syntax must be used. You can for example use mod_journald or mod_syslog as a provider:

        # CustomLog logging to journald
        CustomLog "journald" "%h %l %u %t \"%r\" %>s %b"

        # CustomLog logging to syslog with "user" facility
        CustomLog "syslog:user" "%h %l %u %t \"%r\" %>s %b"

The second argument specifies what will be written to the log file. It can specify either a nickname defined by a previous LogFormat directive, or it can be an explicit format string as described in the log formats section.

For example, the following two sets of directives have exactly the same effect:

    # CustomLog with format nickname
    LogFormat "%h %l %u %t \"%r\" %>s %b" common
    CustomLog "logs/access_log" common

    # CustomLog with explicit format string
    CustomLog "logs/access_log" "%h %l %u %t \"%r\" %>s %b"

The third argument is optional and controls whether or not to log a particular request. The condition can be the presence or absence (in the case of an 'env=!name' clause) of a particular variable in the server environment (p. 92). Alternatively, the condition can be expressed as an arbitrary boolean expression (p. 99).
If the condition is not satisfied, the request will not be logged. References to HTTP headers in the expression will not cause the header names to be added to the Vary header.

Environment variables can be set on a per-request basis using the mod_setenvif and/or mod_rewrite modules. For example, if you want to record requests for all GIF images on your server in a separate logfile but not in your main log, you can use:

    SetEnvIf Request_URI \.gif$ gif-image
    CustomLog "gif-requests.log" common env=gif-image
    CustomLog "nongif-requests.log" common env=!gif-image

Or, to reproduce the behavior of the old RefererIgnore directive, you might use the following:

    SetEnvIf Referer example\.com localreferer
    CustomLog "referer.log" referer env=!localreferer

GlobalLog Directive

Description:    Sets filename and format of log file
Syntax:         GlobalLog file|pipe|provider format|nickname [env=[!]environment-variable| expr=expression]
Context:        server config
Status:         Base
Module:         mod_log_config
Compatibility:  Available in Apache HTTP Server 2.4.19 and later

The GlobalLog directive defines a log shared by the main server configuration and all defined virtual hosts.

The GlobalLog directive is identical to the CustomLog directive, apart from the following differences:

  • GlobalLog is not valid in virtual host context.
  • GlobalLog is used by virtual hosts that define their own CustomLog, unlike a globally specified CustomLog.

LogFormat Directive

Description:    Describes a format for use in a log file
Syntax:         LogFormat format|nickname [nickname]
Default:        LogFormat "%h %l %u %t \"%r\" %>s %b"
Context:        server config, virtual host
Status:         Base
Module:         mod_log_config

This directive specifies the format of the access log file.

The LogFormat directive can take one of two forms. In the first form, where only one argument is specified, this directive sets the log format which will be used by logs specified in subsequent TransferLog directives.
The single argument can specify an explicit format as discussed in the custom log formats section above. Alternatively, it can use a nickname to refer to a log format defined in a previous LogFormat directive as described below.

The second form of the LogFormat directive associates an explicit format with a nickname. This nickname can then be used in subsequent LogFormat or CustomLog directives rather than repeating the entire format string. A LogFormat directive that defines a nickname does nothing else: it only defines the nickname, it does not apply the format or make it the default. Therefore, it will not affect subsequent TransferLog directives. In addition, LogFormat cannot use one nickname to define another nickname. Note that the nickname should not contain percent signs (%).

Example
    LogFormat "%v %h %l %u %t \"%r\" %>s %b" vhost_common

TransferLog Directive

Description:    Specify location of a log file
Syntax:         TransferLog file|pipe
Context:        server config, virtual host
Status:         Base
Module:         mod_log_config

This directive has exactly the same arguments and effect as the CustomLog directive, except that it allows neither an explicit log format nor conditional logging of requests. Instead, the log format is determined by the most recently specified LogFormat directive which does not define a nickname. Common Log Format is used if no other format has been specified.

Example
    LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\""
    TransferLog "logs/access_log"

10.70 Apache Module mod_log_debug

Description:        Additional configurable debug logging
Status:             Experimental
ModuleIdentifier:   log_debug_module
SourceFile:         mod_log_debug.c
Compatibility:      Available in Apache 2.3.14 and later

Directives
  • LogMessage

Examples

1. Log message after request to /foo/* is processed:

       <Location "/foo/">
           LogMessage "/foo/ has been requested"
       </Location>

2.
Log message if request to /foo/* is processed in a sub-request:

       <Location "/foo/">
           LogMessage "subrequest to /foo/" hook=type_checker "expr=-T %{IS_SUBREQ}"
       </Location>

   The default log_transaction hook is not executed for sub-requests, therefore we have to use a different hook.

3. Log message if an IPv6 client causes a request timeout:

       LogMessage "IPv6 timeout from %{REMOTE_ADDR}" "expr=-T %{IPV6} && %{REQUEST_STATUS} = 4

   Note the placing of the double quotes for the expr= argument.

4. Log the value of the "X-Foo" request environment variable in each stage of the request:

       LogMessage "%{reqenv:X-Foo}" hook=all

   Together with microsecond time stamps in the error log, hook=all also lets you determine the times spent in the different parts of the request processing.

LogMessage Directive

Description:    Log user-defined message to error log
Syntax:         LogMessage message [hook=hook] [expr=expression]
Default:        Unset
Context:        directory
Status:         Experimental
Module:         mod_log_debug

This directive causes a user-defined message to be logged to the error log. The message can use variables and functions from the ap_expr syntax (p. 99). References to HTTP headers will not cause header names to be added to the Vary header. The messages are logged at loglevel info.

The hook specifies before which phase of request processing the message will be logged. The following hooks are supported:

Name
    translate_name
    type_checker
    quick_handler
    map_to_storage
    check_access
    check_access_ex
    insert_filter
    check_authn
    check_authz
    fixups
    handler
    log_transaction

The default is log_transaction. The special value all is also supported, causing a message to be logged at each phase. Not all hooks are executed for every request.

The optional expression restricts logging to requests for which the condition is met. The details of the expression syntax are described in the ap_expr documentation (p. 99). References to HTTP headers will not cause the header names to be added to the Vary header.
10.71 Apache Module mod_log_forensic

Description:        Forensic Logging of the requests made to the server
Status:             Extension
ModuleIdentifier:   log_forensic_module
SourceFile:         mod_log_forensic.c

Summary

This module provides for forensic logging of client requests. Create the log file using the ForensicLog directive:

    ForensicLog logs/forensic_log

Logging is done before and after processing a request, so the forensic log contains two log lines for each request. The forensic logger is very strict, which means:

  • The format is fixed. You cannot modify the logging format at runtime.
  • If it cannot write its data, the child process exits immediately and may dump core (depending on your CoreDumpDirectory configuration).

The check_forensic script, which can be found in the distribution's support directory, processes the resulting log file to identify the requests that didn't complete.

    check_forensic forensic_log

Directives
  • ForensicLog

See also
  • Apache Log Files (p. 56)
  • mod_log_config

Forensic Log Format

Each request is logged two times. The first time is before it is processed further (that is, after receiving the headers). The second log entry is written after the request has been processed, at the same time as normal logging occurs.

In order to identify each request, a unique request ID is assigned. This forensic ID can be cross-logged in the normal transfer log using the %{forensic-id}n format string. If you're using mod_unique_id, its generated ID will be used.

The first line logs the forensic ID, the request line and all received headers, separated by pipe characters (|). A sample line looks like the following (all on one line):

    +yQtJf8CoAB4AAFNXBIEAAAAA|GET /manual/de/images/down.gif HTTP/1.1|Host:localhost%3a8080|User-Agent:Mozilla/5.0 (X11; U; Linux i686; en-US; rv%3a1.6) Gecko/20040216 Firefox/0.8|Accept:image/png, etc...
The plus character at the beginning indicates that this is the first log line of this request. The second line just contains a minus character and the ID again:

    -yQtJf8CoAB4AAFNXBIEAAAAA

The check_forensic script takes as its argument the name of the logfile. It looks for those +/- ID pairs and complains if a request was not completed.

Security Considerations

See the security tips (p. 364) document for details on why your security could be compromised if the directory where logfiles are stored is writable by anyone other than the user that starts the server.

The log files may contain sensitive data such as the contents of Authorization: headers (which can contain passwords), so they should not be readable by anyone except the user that starts the server.

ForensicLog Directive

Description:    Sets filename of the forensic log
Syntax:         ForensicLog filename|pipe
Context:        server config, virtual host
Status:         Extension
Module:         mod_log_forensic

The ForensicLog directive is used to log requests to the server for forensic analysis. Each log entry is assigned a unique ID which can be associated with the request using the normal CustomLog directive. mod_log_forensic creates a token called forensic-id, which can be added to the transfer log using the %{forensic-id}n format string.

The argument, which specifies the location to which the logs will be written, can take one of the following two types of values:

filename
    A filename, relative to the ServerRoot.

pipe
    The pipe character "|", followed by the path to a program to receive the log information on its standard input. The program name can be specified relative to the ServerRoot directive.

    Security: If a program is used, then it will be run as the user who started httpd. This will be root if the server was started by root; be sure that the program is secure or switches to a less privileged user.
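A program named after the pipe character receives the same "+" and "-" lines that would otherwise be written to the file. As a rough sketch of what such a program might do with them, the following standalone Lua function pairs the entries by forensic ID and returns the requests that were started but never completed. The function name find_incomplete and the second sample ID are made up for illustration; this is not the check_forensic script and is not part of httpd.

```lua
-- Sketch only: pairs the "+" and "-" lines written by mod_log_forensic and
-- returns the requests that have a "+" entry but no matching "-" entry.
-- find_incomplete is a hypothetical helper name, not part of httpd.
function find_incomplete(lines)
    local pending = {}  -- forensic ID -> request line taken from the "+" entry
    local order = {}    -- forensic IDs in the order their "+" entries appeared

    for _, line in ipairs(lines) do
        local sign = line:sub(1, 1)
        if sign == "+" then
            -- First entry: +<id>|<request line>|<header>|<header>...
            local id, request = line:match("^%+([^|]+)|?([^|]*)")
            if id and pending[id] == nil then
                pending[id] = request
                order[#order + 1] = id
            end
        elseif sign == "-" then
            -- Second entry: -<id>
            local id = line:match("^%-([^|%s]+)")
            if id then
                pending[id] = nil
            end
        end
    end

    local incomplete = {}
    for _, id in ipairs(order) do
        if pending[id] ~= nil then
            incomplete[#incomplete + 1] = { id = id, request = pending[id] }
            pending[id] = nil  -- report each ID only once
        end
    end
    return incomplete
end

-- Demo on in-memory lines. The first pair is the sample from the text above;
-- the last entry uses a made-up ID and has no "-" line.
local sample = {
    "+yQtJf8CoAB4AAFNXBIEAAAAA|GET /manual/de/images/down.gif HTTP/1.1|Host:localhost%3a8080",
    "-yQtJf8CoAB4AAFNXBIEAAAAA",
    "+hypotheticalID0001|GET /slow HTTP/1.1|Host:localhost%3a8080",
}
for _, entry in ipairs(find_incomplete(sample)) do
    print(entry.id, entry.request)
end
```

A piped program would collect its lines from standard input (for example with io.lines()) instead of a fixed table. Keeping only the request line from each "+" entry, rather than the full header list, limits how much potentially sensitive header data the checker holds in memory.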
    Note: When entering a file path on non-Unix platforms, care should be taken to make sure that only forward slashes are used even though the platform may allow the use of back slashes. In general it is a good idea to always use forward slashes throughout the configuration files.

10.72 Apache Module mod_logio

Description:        Logging of input and output bytes per request
Status:             Extension
ModuleIdentifier:   logio_module
SourceFile:         mod_logio.c

Summary

This module logs the number of bytes received and sent per request. The numbers reflect the actual bytes as received on the network, and therefore take into account the headers and bodies of requests and responses. The counting is done before SSL/TLS on input and after SSL/TLS on output, so the numbers will correctly reflect any changes made by encryption.

This module requires mod_log_config.

Note: When KeepAlive connections are used with SSL, the overhead of the SSL handshake is reflected in the byte count of the first request on the connection. When per-directory SSL renegotiation occurs, the bytes are associated with the request that triggered the renegotiation.

Directives
  • LogIOTrackTTFB

See also
  • mod_log_config
  • Apache Log Files (p. 56)

Custom Log Formats

This module adds four new logging directives. The characteristics of the request itself are logged by placing "%" directives in the format string, which are replaced in the log file by the values as follows:

Format String   Description
%I              Bytes received, including request and headers, cannot be zero.
%O              Bytes sent, including headers, cannot be zero.
%S              Bytes transferred (received and sent), including request and headers, cannot be zero. This is the combination of %I and %O. Available in Apache 2.4.7 and later.
%^FB            Delay in microseconds between when the request arrived and when the first byte of the response headers is written. Only available if LogIOTrackTTFB is set to ON. Available in Apache 2.4.13 and later.

Usually, the functionality is used like this:

Combined I/O log format:
    "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-agent}i\" %I %O"

LogIOTrackTTFB Directive

Description:    Enable tracking of time to first byte (TTFB)
Syntax:         LogIOTrackTTFB ON|OFF
Default:        LogIOTrackTTFB OFF
Context:        server config, virtual host, directory, .htaccess
Override:       none
Status:         Extension
Module:         mod_logio

This directive configures whether this module tracks the delay between the request being read and the first byte of the response headers being written. The resulting value may be logged with the %^FB format.

10.73 Apache Module mod_lua

Description:        Provides Lua hooks into various portions of the httpd request processing
Status:             Experimental
ModuleIdentifier:   lua_module
SourceFile:         mod_lua.c
Compatibility:      2.3 and later

Summary

This module allows the server to be extended with scripts written in the Lua programming language. The extension points (hooks) available with mod_lua include many of the hooks available to natively compiled Apache HTTP Server modules, such as mapping requests to files, generating dynamic responses, access control, authentication, and authorization.

More information on the Lua programming language can be found at the Lua website [54].

Note: mod_lua is still in experimental state. Until it is declared stable, usage and behavior may change at any time, even between stable releases of the 2.4.x series. Be sure to check the CHANGES file before upgrading.

Warning: This module holds a great deal of power over httpd, which is both a strength and a potential security risk. It is not recommended that you use this module on a server that is shared with users you do not trust, as it can be abused to change the internal workings of httpd.
Directives
  • LuaAuthzProvider
  • LuaCodeCache
  • LuaHookAccessChecker
  • LuaHookAuthChecker
  • LuaHookCheckUserID
  • LuaHookFixups
  • LuaHookInsertFilter
  • LuaHookLog
  • LuaHookMapToStorage
  • LuaHookTranslateName
  • LuaHookTypeChecker
  • LuaInherit
  • LuaInputFilter
  • LuaMapHandler
  • LuaOutputFilter
  • LuaPackageCPath
  • LuaPackagePath
  • LuaQuickHandler
  • LuaRoot
  • LuaScope

[54] http://www.lua.org/

Basic Configuration

The basic module loading directive is

    LoadModule lua_module modules/mod_lua.so

mod_lua provides a handler named lua-script, which can be used with a SetHandler or AddHandler directive:

    <Files "*.lua">
        SetHandler lua-script
    </Files>

This will cause mod_lua to handle requests for files ending in .lua by invoking that file's handle function.

For more flexibility, see LuaMapHandler.

Writing Handlers

In the Apache HTTP Server API, the handler is a specific kind of hook responsible for generating the response. Examples of modules that include a handler are mod_proxy, mod_cgi, and mod_status.

mod_lua always looks to invoke a Lua function for the handler, rather than just evaluating a script body CGI style. A handler function looks something like this:

example.lua

    -- example handler

    require "string"

    --[[
         This is the default method name for Lua handlers, see the optional
         function-name in the LuaMapHandler directive to choose a different
         entry point.
    --]]
    function handle(r)
        r.content_type = "text/plain"

        if r.method == 'GET' then
            r:puts("Hello Lua World!\n")
            for k, v in pairs( r:parseargs() ) do
                r:puts( string.format("%s: %s\n", k, v) )
            end
        elseif r.method == 'POST' then
            r:puts("Hello Lua World!\n")
            for k, v in pairs( r:parsebody() ) do
                r:puts( string.format("%s: %s\n", k, v) )
            end
        elseif r.method == 'PUT' then
            -- use our own Error contents
            r:puts("Unsupported HTTP method " ..
                r.method)
            r.status = 405
            return apache2.OK
        else
            -- use the ErrorDocument
            return 501
        end
        return apache2.OK
    end

This handler function just prints out the URI or form-encoded arguments to a plaintext page.

Because mod_lua invokes a named function rather than evaluating the script body, you can (and are in fact encouraged to) have multiple handlers (or hooks, or filters) in the same script.

Writing Authorization Providers

mod_authz_core provides a high-level interface to authorization that is much easier to use than hooking into the relevant hooks directly. The first argument to the Require directive gives the name of the responsible authorization provider. For any Require line, mod_authz_core will call the authorization provider of the given name, passing the rest of the line as parameters. The provider will then check authorization and pass the result as return value.

The authz provider is normally called before authentication. If it needs to know the authenticated user name (or if the user will be authenticated at all), the provider must return apache2.AUTHZ_DENIED_NO_USER. This will cause authentication to proceed and the authz provider to be called a second time.

The following authz provider function takes two arguments, one IP address and one user name. It will allow access from the given IP address without authentication, or if the authenticated user matches the second argument:

authz_provider.lua

    require 'apache2'

    function authz_check_foo(r, ip, user)
        if r.useragent_ip == ip then
            return apache2.AUTHZ_GRANTED
        elseif r.user == nil then
            return apache2.AUTHZ_DENIED_NO_USER
        elseif r.user == user then
            return apache2.AUTHZ_GRANTED
        else
            return apache2.AUTHZ_DENIED
        end
    end

The following configuration registers this function as provider foo and configures it for URL /:

    LuaAuthzProvider foo authz_provider.lua authz_check_foo
    <Location "/">
        Require foo 10.1.2.3 john_doe
    </Location>

Writing Hooks

Hook functions are how modules (and Lua scripts) participate in the processing of requests.
Each type of hook exposed by the server exists for a specific purpose, such as mapping requests to the file system, performing access control, or setting MIME types:

Hook phase            mod_lua directive                          Description
Quick handler         LuaQuickHandler                            This is the first hook that will be called after a request has been mapped to a host or virtual host.
Translate name        LuaHookTranslateName                       This phase translates the requested URI into a filename on the system. Modules such as mod_alias and mod_rewrite operate in this phase.
Map to storage        LuaHookMapToStorage                        This phase maps files to their physical, cached or external/proxied storage. It can be used by proxy or caching modules.
Check Access          LuaHookAccessChecker                       This phase checks whether a client has access to a resource. This phase is run before the user is authenticated, so beware.
Check User ID         LuaHookCheckUserID                         This phase is used to check the negotiated user ID.
Check Authorization   LuaHookAuthChecker or LuaAuthzProvider     This phase authorizes a user based on the negotiated credentials, such as user ID, client certificate etc.
Check Type            LuaHookTypeChecker                         This phase checks the requested file and assigns a content type and a handler to it.
Fixups                LuaHookFixups                              This is the final "fix anything" phase before the content handlers are run. Any last-minute changes to the request should be made here.
Content handler       e.g. .lua files or through LuaMapHandler   This is where the content is handled. Files are read, parsed, some are run, and the result is sent to the client.
Logging               LuaHookLog                                 Once a request has been handled, it enters several logging phases, which log the request in either the error or access log. mod_lua is able to hook into the start of this and control logging output.

Hook functions are passed the request object as their only argument (except for LuaAuthzProvider, which also gets passed the arguments from the Require directive).
They can return any value, depending on the hook, but most commonly they'll return OK, DONE, or DECLINED, which you can write in Lua as apache2.OK, apache2.DONE, or apache2.DECLINED, or else an HTTP status code.

translate_name.lua

    -- example hook that rewrites the URI to a filesystem path.

    require 'apache2'

    function translate_name(r)
        if r.uri == "/translate-name" then
            r.filename = r.document_root .. "/find_me.txt"
            return apache2.OK
        end
        -- we don't care about this URL, give another module a chance
        return apache2.DECLINED
    end

translate_name2.lua

    --[[ example hook that rewrites one URI to another URI. It returns an
         apache2.DECLINED to give other URL mappers a chance to work on the
         substitution, including the core translate_name hook which maps based
         on the DocumentRoot.

         Note: Use the early/late flags in the directive to make it run before
               or after mod_alias.
    --]]

    require 'apache2'

    function translate_name(r)
        if r.uri == "/translate-name" then
            r.uri = "/find_me.txt"
            return apache2.DECLINED
        end
        return apache2.DECLINED
    end

Data Structures

request_rec

The request_rec is mapped in as a userdata. It has a metatable which lets you do useful things with it. For the most part it has the same fields as the request_rec struct, many of which are writable as well as readable. (The table fields' content can be changed, but the fields themselves cannot be set to different tables.)

Name                    Lua type  Writable  Description
allowoverrides          string    no        The AllowOverride options applied to the current request.
ap_auth_type            string    no        If an authentication check was made, this is set to the type of authentication (e.g. basic)
args                    string    yes       The query string arguments extracted from the request (e.g. foo=bar&name=johnsmith)
assbackwards            boolean   no        Set to true if this is an HTTP/0.9 style request (e.g. GET /foo (with no headers))
auth_name               string    no        The realm name used for authorization (if applicable).
banner                  string    no        The server banner, e.g. Apache HTTP Server/2.4.3 openssl/0.9.8c
basic_auth_pw           string    no        The basic auth password sent with this request, if any
canonical_filename      string    no        The canonical filename of the request
content_encoding        string    no        The content encoding of the current request
content_type            string    yes       The content type of the current request, as determined in the type check phase (e.g. image/gif or text/html)
context_prefix          string    no
context_document_root   string    no
document_root           string    no        The document root of the host
err_headers_out         table     no        MIME header environment for the response, printed even on errors and persisting across internal redirects
filename                string    yes       The file name that the request maps to, e.g. /www/example.com/foo.txt. This can be changed in the translate-name or map-to-storage phases of a request to allow the default handler (or script handlers) to serve a different file than what was requested.
handler                 string    yes       The name of the handler (p. 108) that should serve this request, e.g. lua-script if it is to be served by mod_lua. This is typically set by the AddHandler or SetHandler directives, but could also be set via mod_lua to allow another handler to serve up a specific request that would otherwise not be served by it.
headers_in              table     yes       MIME header environment from the request. This contains headers such as Host, User-Agent, Referer and so on.
headers_out             table     yes       MIME header environment for the response.
hostname                string    no        The host name, as set by the Host: header or by a full URI.
is_https                boolean   no        Whether or not this request is done via HTTPS
is_initial_req          boolean   no        Whether this request is the initial request or a sub-request
limit_req_body          number    no        The size limit of the request body for this request, or 0 if no limit.
log_id                  string    no        The ID to identify request in access and error log.
method                  string    no        The request method, e.g. GET or POST.
notes                   table     yes       A list of notes that can be passed on from one module to another.
options                 string    no        The Options directive applied to the current request.
path_info               string    no        The PATH_INFO extracted from this request.
port                    number    no        The server port used by the request.
protocol                string    no        The protocol used, e.g. HTTP/1.1
proxyreq                string    yes       Denotes whether this is a proxy request or not. This value is generally set in the post_read_request/translate_name phase of a request.
range                   string    no        The contents of the Range: header.
remaining               number    no        The number of bytes remaining to be read from the request body.
server_built            string    no        The time the server executable was built.
server_name             string    no        The server name for this request.
some_auth_required      boolean   no        Whether some authorization is/was required for this request.
subprocess_env          table     yes       The environment variables set for this request.
started                 number    no        The time the server was (re)started, in seconds since the epoch (Jan 1st, 1970)
status                  number    yes       The (current) HTTP return code for this request, e.g. 200 or 404.
the_request             string    no        The request string as sent by the client, e.g. GET /foo/bar HTTP/1.1.
unparsed_uri            string    no        The unparsed URI of the request
uri                     string    yes       The URI after it has been parsed by httpd
user                    string    yes       If an authentication check has been made, this is set to the name of the authenticated user.
useragent_ip            string    no        The IP of the user agent making the request

Built in functions

The request_rec object has (at least) the following methods:

    r:flush()   -- flushes the output buffer.
                -- Returns true if the flush was successful, false otherwise.

    while we_have_stuff_to_send do
        r:puts("Bla bla bla\n") -- print something to client
        r:flush() -- flush the buffer (send to client)
        r.usleep(500000) -- fake processing time for 0.5 sec. and repeat
    end

    r:addoutputfilter(name|function) -- add an output filter:

    r:addoutputfilter("fooFilter") -- add the fooFilter to the output stream
    r:sendfile(filename) -- sends an entire file to the client, using sendfile if supported by

    if use_sendfile_thing then
        r:sendfile("/var/www/large_file.img")
    end

    r:parseargs() -- returns two tables; one standard key/value table for regular GET data,
                  -- and one for multi-value data (e.g. foo=1&foo=2&foo=3):

    local GET, GETMULTI = r:parseargs()
    -- parentheses are needed: ".." binds more tightly than "or"
    r:puts("Your name is: " .. (GET['name'] or "Unknown"))

    r:parsebody([sizeLimit]) -- parse the request body as a POST and return two lua tables,
                             -- just like r:parseargs().
                             -- An optional number may be passed to specify the maximum number
                             -- of bytes to parse. Default is 8192 bytes:

    local POST, POSTMULTI = r:parsebody(1024*1024)
    r:puts("Your name is: " .. (POST['name'] or "Unknown"))

    r:puts("hello", " world", "!") -- print to response body, self explanatory

    r:write("a single string") -- print to response body, self explanatory

    r:escape_html("test") -- Escapes HTML code and returns the escaped result

    r:base64_encode(string) -- Encodes a string using the Base64 encoding standard:

    local encoded = r:base64_encode("This is a test") -- returns VGhpcyBpcyBhIHRlc3Q=

    r:base64_decode(string) -- Decodes a Base64-encoded string:

    local decoded = r:base64_decode("VGhpcyBpcyBhIHRlc3Q=") -- returns 'This is a test'

    r:md5(string) -- Calculates and returns the MD5 digest of a string (binary safe):

    local hash = r:md5("This is a test") -- returns ce114e4501d2f4e2dcea3e17b546f339

    r:sha1(string) -- Calculates and returns the SHA1 digest of a string (binary safe):

    local hash = r:sha1("This is a test") -- returns a54d88e06612d820bc3be72877c74f257b561b19

    r:escape(string) -- URL-Escapes a string:

    local url = "http://foo.bar/1 2 3 & 4 + 5"
    local escaped = r:escape(url) -- returns 'http%3a%2f%2ffoo.bar%2f1+2+3+%26+4+%2b+5'

    r:unescape(string) -- Unescapes an URL-escaped string:

    local url = "http%3a%2f%2ffoo.bar%2f1+2+3+%26+4+%2b+5"
    local unescaped = r:unescape(url) -- returns 'http://foo.bar/1 2 3 & 4 + 5'
APACHE MODULES r:construct_url(string) -- Constructs an URL from an URI local url = r:construct_url(r.uri) r.mpm_query(number) -- Queries the server for MPM information using ap_mpm_query: local mpm = r.mpm_query(14) if mpm == 1 then r:puts("This server uses the Event MPM") end r:expr(string) -- Evaluates an expr string. if r:expr("%{HTTP_HOST} =˜ /ˆwww/") then r:puts("This host name starts with www") end r:scoreboard_process(a) -- Queries the server for information about the process at position local process = r:scoreboard_process(1) r:puts("Server 1 has PID " .. process.pid) r:scoreboard_worker(a, b) -- Queries for information about the worker thread, b, in process local thread = r:scoreboard_worker(1, 1) r:puts("Server 1’s thread 1 has thread ID " .. thread.tid .. " and is in " .. thread.status r:clock() -- Returns the current time with microsecond precision r:requestbody(filename) -- Reads and returns the request body of a request. -- If ’filename’ is specified, it instead saves the -- contents to that file: local input = r:requestbody() r:puts("You sent the following request body to me:\n") r:puts(input) r:add_input_filter(filter_name) -- Adds ’filter_name’ as an input filter r.module_info(module_name) -- Queries the server for information about a module local mod = r.module_info("mod_lua.c") if mod then for k, v in pairs(mod.commands) do r:puts( ("%s: %s\n"):format(k,v)) -- print out all directives accepted by this modul end end r:loaded_modules() -- Returns a list of modules loaded by httpd: for k, module in pairs(r:loaded_modules()) do r:puts("I have loaded module " .. module .. "\n") end 10.73. APACHE MODULE MOD LUA 727 r:runtime_dir_relative(filename) -- Compute the name of a run-time file (e.g., shared memor -- relative to the appropriate run-time directory. r:server_info() -- Returns a table containing server information, such as -- the name of the httpd executable file, mpm used etc. 
    r:set_document_root(file_path) -- Sets the document root for the request to file_path

    r:set_context_info(prefix, docroot) -- Sets the context prefix and context document root for a request

    r:os_escape_path(file_path) -- Converts an OS path to a URL in an OS dependent way

    r:escape_logitem(string) -- Escapes a string for logging

    r.strcmp_match(string, pattern) -- Checks if 'string' matches 'pattern' using strcmp_match (globs),
                                    -- e.g. whether 'www.example.com' matches '*.example.com':

    local match = r.strcmp_match("foobar.com", "foo*.com")
    if match then
        r:puts("foobar.com matches foo*.com")
    end

    r:set_keepalive() -- Sets the keepalive status for a request. Returns true if possible, false otherwise.

    r:make_etag() -- Constructs and returns the etag for the current request.

    r:send_interim_response(clear) -- Sends an interim (1xx) response to the client.
                                   -- if 'clear' is true, available headers will be sent and cleared.

    r:custom_response(status_code, string) -- Construct and set a custom response for a given status code.
                                           -- This works much like the ErrorDocument directive:

    r:custom_response(404, "Baleted!")

    r.exists_config_define(string) -- Checks whether a configuration definition exists or not:

    if r.exists_config_define("FOO") then
        r:puts("httpd was probably run with -DFOO, or it was defined in the configuration")
    end

    r:state_query(string) -- Queries the server for state information

    r:stat(filename [,wanted]) -- Runs stat() on a file, and returns a table with file information:

    local info = r:stat("/var/www/foo.txt")
    if info then
        r:puts("This file exists and was last modified at: " .. info.modified)
    end

    r:regex(string, pattern [,flags]) -- Runs a regular expression match on a string, returning captures if matched:

    local matches = r:regex("foo bar baz", [[foo (\w+) (\S*)]])
    if matches then
        r:puts("The regex matched, and the last word captured ($2) was: " ..
            matches[2])
    end

    -- Example ignoring case sensitivity:
    local matches = r:regex("FOO bar BAz", [[(foo) bar]], 1)

    -- Flags can be a bitwise combination of:
    -- 0x01: Ignore case
    -- 0x02: Multiline search

    r.usleep(number_of_microseconds) -- Puts the script to sleep for a given number of microseconds.

    r:dbacquire(dbType[, dbParams]) -- Acquires a connection to a database and returns a database class.
                                    -- See 'Database connectivity' for details.

    r:ivm_set("key", value) -- Set an Inter-VM variable to hold a specific value.
                            -- These values persist even though the VM is gone or not being used,
                            -- and so should only be used if MaxConnectionsPerChild is > 0
                            -- Values can be numbers, strings and booleans, and are stored on a
                            -- per process basis (so they won't do much good with a prefork mpm)

    r:ivm_get("key")        -- Fetches a variable set by ivm_set. Returns the contents of the variable
                            -- if it exists or nil if no such variable exists.

    -- An example getter/setter that saves a global variable outside the VM:
    function handle(r)
        -- First VM to call this will get no value, and will have to create it
        local foo = r:ivm_get("cached_data")
        if not foo then
            foo = do_some_calcs() -- fake some return value
            r:ivm_set("cached_data", foo) -- set it globally
        end
        r:puts("Cached data is: ", foo)
    end

    r:htpassword(string [,algorithm [,cost]]) -- Creates a password hash from a string.
                                              -- algorithm: 0 = APMD5 (default), 1 = SHA, 2 = BCRYPT
                                              -- cost: only valid with BCRYPT algorithm (default ...)

    r:mkdir(dir [,mode]) -- Creates a directory and sets mode to the optional mode parameter.

    r:mkrdir(dir [,mode]) -- Creates directories recursively and sets mode to the optional mode parameter.

    r:rmdir(dir) -- Removes a directory.

    r:touch(file [,mtime]) -- Sets the file modification time to current time or to the optional mtime value.

    r:get_direntries(dir) -- Returns a table with all directory entries.

    function handle(r)
        local fmt = "%Y-%m-%d %H:%M:%S" -- timestamp format for os.date (assumed; any strftime format works)
        local dir = r.context_document_root
        for _, f in ipairs(r:get_direntries(dir)) do
            local info = r:stat(dir .. "/" ..
                f)
            if info then
                local mtime = os.date(fmt, info.mtime / 1000000)
                local ftype = (info.filetype == 2) and "[dir] " or "[file]"
                r:puts( ("%s %s %10i %s\n"):format(ftype, mtime, info.size, f) )
            end
        end
    end

    r.date_parse_rfc(string) -- Parses a date/time string and returns seconds since epoch.

    r:getcookie(key) -- Gets an HTTP cookie

    r:setcookie{
        key = [key],
        value = [value],
        expires = [expiry],
        secure = [boolean],
        httponly = [boolean],
        path = [path],
        domain = [domain]
    } -- Sets an HTTP cookie, for instance:

    r:setcookie{ key = "cookie1", value = "HDHfa9eyffh396rt", expires = os.time() + 86400, secure = true }

    r:wsupgrade() -- Upgrades a connection to WebSockets if possible (and requested):

    if r:wsupgrade() then -- if we can upgrade:
        r:wswrite("Welcome to websockets!") -- write something to the client
        r:wsclose() -- goodbye!
    end

    r:wsread() -- Reads a WebSocket frame from a WebSocket upgraded connection (see above):

    local line, isFinal = r:wsread() -- isFinal denotes whether this is the final frame.
                                     -- If it isn't, then more frames can be read
    r:wswrite("You wrote: " .. line)

    r:wswrite(line) -- Writes a frame to a WebSocket client:

    r:wswrite("Hello, world!")

    r:wsclose() -- Closes a WebSocket request and terminates it for httpd:

    if r:wsupgrade() then
        r:wswrite("Write something: ")
        local line = r:wsread() or "nothing"
        r:wswrite("You wrote: " .. line)
        r:wswrite("Goodbye!")
        r:wsclose()
    end

    r:wspeek() -- Checks if any data is ready to be read

    -- Sleep while nothing is being sent to us...
    while r:wspeek() == false do
        r.usleep(50000)
    end
    -- We have data ready!
    local line = r:wsread()

    r:config() -- Get a walkable tree of the entire httpd configuration

    r:activeconfig() -- Get a walkable tree of the active (virtualhost-specific) httpd configuration

Logging Functions

    -- examples of logging messages
    r:trace1("This is a trace log message") -- trace1 through trace8 can be used
    r:debug("This is a debug log message")
    r:info("This is an info log message")
    r:notice("This is a notice log message")
    r:warn("This is a warn log message")
    r:err("This is an err log message")
    r:alert("This is an alert log message")
    r:crit("This is a crit log message")
    r:emerg("This is an emerg log message")

apache2 Package

A package named apache2 is available with (at least) the following contents.

apache2.OK
    internal constant OK. Handlers should return this if they've handled the request.
apache2.DECLINED
    internal constant DECLINED. Handlers should return this if they are not going to handle the request.
apache2.DONE
    internal constant DONE.
apache2.version
    Apache HTTP server version string
apache2.HTTP_MOVED_TEMPORARILY
    HTTP status code
apache2.PROXYREQ_NONE, apache2.PROXYREQ_PROXY, apache2.PROXYREQ_REVERSE, apache2.PROXYREQ_RESPO...
    internal constants used by mod_proxy
apache2.AUTHZ_DENIED, apache2.AUTHZ_GRANTED, apache2.AUTHZ_NEUTRAL, apache2.AUTHZ_GENERAL_ERROR
    internal constants used by mod_authz_core

(Other HTTP status codes are not yet implemented.)

Modifying contents with Lua filters

Filter functions implemented via LuaInputFilter or LuaOutputFilter are designed as three-stage non-blocking functions using coroutines to suspend and resume a function as buckets are sent down the filter chain. The core structure of such a function is:

    function filter(r)
        -- Our first yield is to signal that we are ready to receive buckets.
        -- Before this yield, we can set up our environment, check for conditions,
        -- and, if we deem it necessary, decline filtering a request altogether:
        if something_bad then
            return -- This would skip this filter.
        end
        -- Regardless of whether we have data to prepend, a yield MUST be called here.
        -- Note that only output filters can prepend data. Input filters must use the
        -- final stage to append data to the content.
        coroutine.yield([optional header to be prepended to the content])

        -- After we have yielded, buckets will be sent to us, one by one, and we can
        -- do whatever we want with them and then pass on the result.
        -- Buckets are stored in the global variable 'bucket', so we create a loop
        -- that checks if 'bucket' is not nil:
        while bucket ~= nil do
            local output = mangle(bucket) -- Do some stuff to the content
            coroutine.yield(output) -- Return our new content to the filter chain
        end

        -- Once the buckets are gone, 'bucket' is set to nil, which will exit the
        -- loop and land us here. Anything extra we want to append to the content
        -- can be done by doing a final yield here. Both input and output filters
        -- can append data to the content in this phase.
        coroutine.yield([optional footer to be appended to the content])
    end

Database connectivity

mod_lua implements a simple database feature for querying and running commands on the most popular database engines (mySQL, PostgreSQL, FreeTDS, ODBC, SQLite, Oracle) as well as mod_dbd.

The example below shows how to acquire a database handle and return information from a table:

    function handle(r)
        -- Acquire a database handle
        local database, err = r:dbacquire("mysql", "server=localhost,user=someuser,pass=somepas...")
        if not err then
            -- Select some information from it
            local results, err = database:select(r, "SELECT `name`, `age` FROM `people` WHERE 1")
            if not err then
                local rows = results(0) -- fetch all rows synchronously
                for k, row in pairs(rows) do
                    r:puts( string.format("Name: %s, Age: %s<br/>", row[1], row[2]) )
                end
            else
                r:puts("Database query error: " .. err)
            end
            database:close()
        else
            r:puts("Could not connect to the database: " .. err)
        end
    end

To utilize mod_dbd, specify mod_dbd as the database type, or leave the field blank:

    local database = r:dbacquire("mod_dbd")

Database object and contained functions

The database object returned by dbacquire has the following methods:

Normal select and query from a database:

    -- Run a statement and return the number of rows affected:
    local affected, errmsg = database:query(r, "DELETE FROM `tbl` WHERE 1")

    -- Run a statement and return a result set that can be used synchronously or async:
    local result, errmsg = database:select(r, "SELECT * FROM `people` WHERE 1")

Using prepared statements (recommended):

    -- Create and run a prepared statement:
    local statement, errmsg = database:prepare(r, "DELETE FROM `tbl` WHERE `age` > %u")
    if not errmsg then
        local result, errmsg = statement:query(20) -- run the statement with age > 20
    end

    -- Fetch a prepared statement from a DBDPrepareSQL directive:
    local statement, errmsg = database:prepared(r, "someTag")
    if not errmsg then
        local result, errmsg = statement:select("John Doe", 123) -- inject the values "John Doe" and 123 into the statement
    end

Escaping values, closing databases etc:

    -- Escape a value for use in a statement:
    local escaped = database:escape(r, [["'|blabla]])

    -- Close a database connection and free up handles:
    database:close()

    -- Check whether a database connection is up and running:
    local connected = database:active()

Working with result sets

The result set returned by db:select or by the prepared statement functions created through db:prepare can be used to fetch rows synchronously or asynchronously, depending on the row number specified:

result(0) fetches all rows in a synchronous manner, returning a table of rows.
result(-1) fetches the next available row in the set, asynchronously.
result(N) fetches row number N, asynchronously:

    -- fetch a result set using a regular query:
    local result, err = db:select(r, "SELECT * FROM `tbl` WHERE 1")

    local rows = result(0) -- Fetch ALL rows synchronously
    local row = result(-1) -- Fetch the next available row, asynchronously
    local row = result(1234) -- Fetch row number 1234, asynchronously
    local row = result(-1, true) -- Fetch the next available row, using row names as key indexes.

One can construct a function that returns an iterative function to iterate over all rows in a synchronous or asynchronous way, depending on the async argument:

    function rows(resultset, async)
        local a = 0
        local function getnext()
            a = a + 1
            local row = resultset(-1)
            return row and a or nil, row
        end
        if not async then
            return pairs(resultset(0))
        else
            return getnext
        end
    end

    local statement, err = db:prepare(r, "SELECT * FROM `tbl` WHERE `age` > %u")
    if not err then
        -- fetch rows asynchronously:
        local result, err = statement:select(20)
        if not err then
            for index, row in rows(result, true) do
                ....
            end
        end

        -- fetch rows synchronously:
        local result, err = statement:select(20)
        if not err then
            for index, row in rows(result, false) do
                ....
            end
        end
    end

Closing a database connection

Database handles should be closed using database:close() when they are no longer needed. If you do not close them manually, they will eventually be garbage collected and closed by mod_lua, but you may end up having too many unused connections to the database if you leave the closing up to mod_lua.
Essentially, the following two measures are the same:

    -- Method 1: Manually close a handle
    local database = r:dbacquire("mod_dbd")
    database:close() -- All done

    -- Method 2: Letting the garbage collector close it
    local database = r:dbacquire("mod_dbd")
    database = nil -- throw away the reference
    collectgarbage() -- close the handle via GC

Precautions when working with databases

Although the standard query and run functions are freely available, it is recommended that you use prepared statements whenever possible, to both optimize performance (if your db handle lives on for a long time) and to minimize the risk of SQL injection attacks. run and query should only be used when there are no variables inserted into a statement (a static statement). When using dynamic statements, use db:prepare or db:prepared.

LuaAuthzProvider Directive

Description:    Plug an authorization provider function into mod_authz_core
Syntax:         LuaAuthzProvider provider_name /path/to/lua/script.lua function_name
Context:        server config
Status:         Experimental
Module:         mod_lua
Compatibility:  2.4.3 and later

After a Lua function has been registered as an authorization provider, it can be used with the Require directive:

    LuaRoot /usr/local/apache2/lua
    LuaAuthzProvider foo authz.lua authz_check_foo
    Require foo johndoe

    require "apache2"
    function authz_check_foo(r, who)
        if r.user ~= who then
            return apache2.AUTHZ_DENIED
        end
        return apache2.AUTHZ_GRANTED
    end

LuaCodeCache Directive

Description:    Configure the compiled code cache.
Syntax:         LuaCodeCache stat|forever|never
Default:        LuaCodeCache stat
Context:        server config, virtual host, directory, .htaccess
Override:       All
Status:         Experimental
Module:         mod_lua

Specify the behavior of the in-memory code cache. The default is stat, which stats the top level script (not any included ones) each time that file is needed, and reloads it if the modified time indicates it is newer than the one it has already loaded.
The other values cause it to keep the file cached forever (don't stat and replace) or to never cache the file.

In general stat or forever is good for production, and stat or never for development.

Examples:

    LuaCodeCache stat
    LuaCodeCache forever
    LuaCodeCache never

LuaHookAccessChecker Directive

Description:    Provide a hook for the access_checker phase of request processing
Syntax:         LuaHookAccessChecker /path/to/lua/script.lua hook_function_name [early|late]
Context:        server config, virtual host, directory, .htaccess
Override:       All
Status:         Experimental
Module:         mod_lua
Compatibility:  The optional third argument is supported in 2.3.15 and later

Add your hook to the access_checker phase. An access checker hook function usually returns OK, DECLINED, or HTTP_FORBIDDEN.

Ordering
    The optional arguments "early" or "late" control when this script runs relative to other modules.

LuaHookAuthChecker Directive

Description:    Provide a hook for the auth_checker phase of request processing
Syntax:         LuaHookAuthChecker /path/to/lua/script.lua hook_function_name [early|late]
Context:        server config, virtual host, directory, .htaccess
Override:       All
Status:         Experimental
Module:         mod_lua
Compatibility:  The optional third argument is supported in 2.3.15 and later

Invoke a Lua function in the auth_checker phase of processing a request. This can be used to implement arbitrary authentication and authorization checking. A very simple example:

    require 'apache2'

    -- fake authcheck hook
    -- If request has no auth info, set the response header and
    -- return a 401 to ask the browser for basic auth info.
    -- If request has auth info, don't actually look at it, just
    -- pretend we got userid 'foo' and validated it.
    -- Then check if the userid is 'foo' and accept the request.
    function authcheck_hook(r)

        -- look for auth info
        local auth = r.headers_in['Authorization']
        if auth ~= nil then
            -- fake the user
            r.user = 'foo'
        end

        if r.user == nil then
            r:debug("authcheck: user is nil, returning 401")
            r.err_headers_out['WWW-Authenticate'] = 'Basic realm="WallyWorld"'
            return 401
        elseif r.user == "foo" then
            r:debug('user foo: OK')
        else
            r:debug("authcheck: user='" .. r.user .. "'")
            r.err_headers_out['WWW-Authenticate'] = 'Basic realm="WallyWorld"'
            return 401
        end
        return apache2.OK
    end

Ordering
    The optional arguments "early" or "late" control when this script runs relative to other modules.

LuaHookCheckUserID Directive

Description:    Provide a hook for the check_user_id phase of request processing
Syntax:         LuaHookCheckUserID /path/to/lua/script.lua hook_function_name
Context:        server config, virtual host, directory, .htaccess
Override:       All
Status:         Experimental
Module:         mod_lua

LuaHookFixups Directive

Description:    Provide a hook for the fixups phase of a request processing
Syntax:         LuaHookFixups /path/to/lua/script.lua hook_function_name
Context:        server config, virtual host, directory, .htaccess
Override:       All
Status:         Experimental
Module:         mod_lua

Just like LuaHookTranslateName, but executed at the fixups phase

LuaHookInsertFilter Directive

Description:    Provide a hook for the insert_filter phase of request processing
Syntax:         LuaHookInsertFilter /path/to/lua/script.lua hook_function_name
Context:        server config, virtual host, directory, .htaccess
Override:       All
Status:         Experimental
Module:         mod_lua

Not Yet Implemented

LuaHookLog Directive

Description:    Provide a hook for the access log phase of a request processing
Syntax:         LuaHookLog /path/to/lua/script.lua log_function_name
Context:        server config, virtual host, directory, .htaccess
Override:       All
Status:         Experimental
Module:         mod_lua

This simple logging hook allows you to run a function when httpd enters the logging phase of a request.
With it, you can append data to your own logs, manipulate data before the regular log is written, or prevent a log entry from being created. To prevent the usual logging from happening, simply return apache2.DONE in your logging handler, otherwise return apache2.OK to tell httpd to log as normal.

Example:

    LuaHookLog /path/to/script.lua logger

    -- /path/to/script.lua --
    function logger(r)
        -- flip a coin:
        -- If 1, then we write to our own Lua log and tell httpd not to log
        -- in the main log.
        -- If 2, then we just sanitize the output a bit and tell httpd to
        -- log the sanitized bits.

        if math.random(1,2) == 1 then
            -- Log stuff ourselves and don't log in the regular log
            local f = io.open("/foo/secret.log", "a")
            if f then
                f:write("Something secret happened at " .. r.uri .. "\n")
                f:close()
            end
            return apache2.DONE -- Tell httpd not to use the regular logging functions
        else
            r.uri = r.uri:gsub("somesecretstuff", "") -- sanitize the URI
            return apache2.OK -- tell httpd to log it.
        end
    end

LuaHookMapToStorage Directive

Description:    Provide a hook for the map_to_storage phase of request processing
Syntax:         LuaHookMapToStorage /path/to/lua/script.lua hook_function_name
Context:        server config, virtual host, directory, .htaccess
Override:       All
Status:         Experimental
Module:         mod_lua

Like LuaHookTranslateName but executed at the map-to-storage phase of a request.
Modules like mod_cache run at this phase, which makes for an interesting example on what to do here:

    LuaHookMapToStorage /path/to/lua/script.lua check_cache

    require "apache2"
    cached_files = {}

    function read_file(filename)
        local input = io.open(filename, "r")
        if input then
            local data = input:read("*a")
            cached_files[filename] = data
            input:close()
        end
        return cached_files[filename]
    end

    function check_cache(r)
        if r.filename:match("%.png$") then -- Only match PNG files
            local file = cached_files[r.filename] -- Check cache entries
            if not file then
                file = read_file(r.filename) -- Read file into cache
            end
            if file then -- If file exists, write it out
                r.status = 200
                r:write(file)
                r:info(("Sent %s to client from cache"):format(r.filename))
                return apache2.DONE -- skip default handler for PNG files
            end
        end
        return apache2.DECLINED -- If we had nothing to do, let others serve this.
    end

LuaHookTranslateName Directive

Description:    Provide a hook for the translate name phase of request processing
Syntax:         LuaHookTranslateName /path/to/lua/script.lua hook_function_name [early|late]
Context:        server config, virtual host
Override:       All
Status:         Experimental
Module:         mod_lua
Compatibility:  The optional third argument is supported in 2.3.15 and later

Add a hook (at APR_HOOK_MIDDLE) to the translate name phase of request processing. The hook function receives a single argument, the request_rec, and should return a status code, which is either an HTTP error code, or the constants defined in the apache2 module: apache2.OK, apache2.DECLINED, or apache2.DONE.

For those new to hooks, basically each hook will be invoked until one of them returns apache2.OK. If your hook doesn't want to do the translation it should just return apache2.DECLINED. If the request should stop processing, then return apache2.DONE.
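The dispatch rule described above can be tried outside httpd. The following is a minimal sketch in plain Lua, not httpd's actual hook machinery: run_hooks, home_mapper and maintenance_gate are hypothetical names, the request is a mock table, and the local apache2 table only stands in for the real package. It models one behaviour: hooks are called in order, DECLINED hands over to the next hook, and any other return value ends the chain.

```lua
-- Stand-in constants for illustration only; inside mod_lua these come from
-- the real apache2 package (the numeric values here are assumptions).
local apache2 = { OK = 0, DECLINED = -1, DONE = -2 }

-- Hypothetical helper: call each hook in order until one returns something
-- other than DECLINED, and report which hook answered.
local function run_hooks(hooks, r)
    for index, hook in ipairs(hooks) do
        local status = hook(r)
        if status ~= apache2.DECLINED then
            return status, index
        end
    end
    return apache2.DECLINED, nil
end

-- Two example translate-name hooks operating on a mock request table.
local function maintenance_gate(r)
    if r.uri == "/closed" then
        return apache2.DONE      -- stop processing this request
    end
    return apache2.DECLINED      -- not ours, let the next hook try
end

local function home_mapper(r)
    if r.uri == "/" then
        r.filename = "/var/www/home.lua"
        return apache2.OK        -- translation done
    end
    return apache2.DECLINED
end

local hooks = { maintenance_gate, home_mapper }

local r1 = { uri = "/" }
local status1, who1 = run_hooks(hooks, r1)

local r2 = { uri = "/other" }
local status2, who2 = run_hooks(hooks, r2)

print(status1 == apache2.OK, who1, r1.filename)
print(status2 == apache2.DECLINED, who2)
```

In the first call the gate declines and the second hook maps the URI; in the second call every hook declines, which in a real server would leave the translation to other modules.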
Example:

    # httpd.conf
    LuaHookTranslateName /scripts/conf/hooks.lua silly_mapper

    -- /scripts/conf/hooks.lua --
    require "apache2"
    function silly_mapper(r)
        if r.uri == "/" then
            r.filename = "/var/www/home.lua"
            return apache2.OK
        else
            return apache2.DECLINED
        end
    end

Context
    This directive is not valid in <Directory>, <Files>, or htaccess context.

Ordering
    The optional arguments "early" or "late" control when this script runs relative to other modules.

LuaHookTypeChecker Directive

Description:    Provide a hook for the type_checker phase of request processing
Syntax:         LuaHookTypeChecker /path/to/lua/script.lua hook_function_name
Context:        server config, virtual host, directory, .htaccess
Override:       All
Status:         Experimental
Module:         mod_lua

This directive provides a hook for the type_checker phase of the request processing. This phase is where requests are assigned a content type and a handler, and thus can be used to modify the type and handler based on input:

    LuaHookTypeChecker /path/to/lua/script.lua type_checker

    function type_checker(r)
        if r.uri:match("%.to_gif$") then -- match foo.png.to_gif
            r.content_type = "image/gif" -- assign it the image/gif type
            r.handler = "gifWizard" -- tell the gifWizard module to handle this
            r.filename = r.uri:gsub("%.to_gif$", "") -- fix the filename requested
            return apache2.OK
        end

        return apache2.DECLINED
    end

LuaInherit Directive

Description:    Controls how parent configuration sections are merged into children
Syntax:         LuaInherit none|parent-first|parent-last
Default:        LuaInherit parent-first
Context:        server config, virtual host, directory, .htaccess
Override:       All
Status:         Experimental
Module:         mod_lua
Compatibility:  2.4.0 and later

By default, if LuaHook* directives are used in overlapping Directory or Location configuration sections, the scripts defined in the more specific section are run after those defined in the more generic section (LuaInherit parent-first).
You can reverse this order, or make the parent context not apply at all.

In previous 2.3.x releases, the default was effectively to ignore LuaHook* directives from parent configuration sections.

LuaInputFilter Directive

Description:    Provide a Lua function for content input filtering
Syntax:         LuaInputFilter filter_name /path/to/lua/script.lua function_name
Context:        server config
Status:         Experimental
Module:         mod_lua
Compatibility:  2.4.5 and later

Provides a means of adding a Lua function as an input filter. As with output filters, input filters work as coroutines, first yielding before buffers are sent, then yielding whenever a bucket needs to be passed down the chain, and finally (optionally) yielding anything that needs to be appended to the input data. The global variable bucket holds the buckets as they are passed onto the Lua script:

    LuaInputFilter myInputFilter /www/filter.lua input_filter
    SetInputFilter myInputFilter

    --[[
        Example input filter that converts all POST data to uppercase.
    ]]--
    function input_filter(r)
        print("luaInputFilter called") -- debug print
        coroutine.yield() -- Yield and wait for buckets
        while bucket do -- For each bucket, do...
            local output = string.upper(bucket) -- Convert all POST data to uppercase
            coroutine.yield(output) -- Send converted data down the chain
        end
        -- No more buckets available.
        coroutine.yield("&filterSignature=1234") -- Append signature at the end
    end

The input filter supports denying/skipping a filter if it is deemed unwanted:

    function input_filter(r)
        if not good then
            return -- Simply deny filtering, passing on the original content instead
        end
        coroutine.yield() -- wait for buckets
        ... -- insert filter stuff here
    end

See "Modifying contents with Lua filters" for more information.
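To watch the three stages run without a server, the coroutine protocol can be simulated in a standalone Lua interpreter. The sketch below is an illustration under stated assumptions, not mod_lua's internals: drive_filter is a hypothetical helper that plays the part of the filter chain by setting the global bucket, resuming the coroutine once per bucket, and collecting whatever the filter yields. A filter that returns before its first yield is treated as having declined, so the data passes through unchanged.

```lua
-- The uppercase filter from the example above, minus the debug print.
function input_filter(r)
    coroutine.yield()                        -- stage 1: signal readiness
    while bucket do                          -- stage 2: one yield per bucket
        coroutine.yield(string.upper(bucket))
    end
    coroutine.yield("&filterSignature=1234") -- stage 3: optional trailer
end

-- Hypothetical driver standing in for httpd's filter chain. It feeds data
-- through the global variable 'bucket' and collects every yielded value.
local function drive_filter(filter, buckets)
    local out = {}
    local co = coroutine.create(filter)

    local ok, value = coroutine.resume(co, {})   -- run up to the first yield
    if not ok then error(value) end
    if coroutine.status(co) == "dead" then
        -- The filter returned before yielding: pass the content on untouched.
        return table.concat(buckets)
    end
    if value then out[#out + 1] = value end      -- prepended data, if any

    for _, data in ipairs(buckets) do
        bucket = data
        ok, value = coroutine.resume(co)
        if not ok then error(value) end
        if value then out[#out + 1] = value end
    end

    bucket = nil                                 -- signal end of stream
    ok, value = coroutine.resume(co)
    if not ok then error(value) end
    if value then out[#out + 1] = value end      -- appended data, if any

    return table.concat(out)
end

local result = drive_filter(input_filter, { "name=", "john" })
print(result) -- prints NAME=JOHN&filterSignature=1234
```

The same driver can be pointed at a filter that bails out early, for example function(r) return end, to confirm that the original buckets come back unmodified.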
LuaMapHandler Directive

Description:    Map a path to a lua handler
Syntax:         LuaMapHandler uri-pattern /path/to/lua/script.lua [function-name]
Context:        server config, virtual host, directory, .htaccess
Override:       All
Status:         Experimental
Module:         mod_lua

This directive matches a uri pattern to invoke a specific handler function in a specific file. It uses PCRE regular expressions to match the uri, and supports interpolating match groups into both the file path and the function name. Be careful writing your regular expressions to avoid security issues.

Examples:

    LuaMapHandler /(\w+)/(\w+) /scripts/$1.lua handle_$2

This would match URIs such as /photos/show?id=9 to the file /scripts/photos.lua and invoke the handler function handle_show on the Lua VM after loading that file.

    LuaMapHandler /bingo /scripts/wombat.lua

This would invoke the "handle" function, which is the default if no specific function name is provided.

LuaOutputFilter Directive

Description:    Provide a Lua function for content output filtering
Syntax:         LuaOutputFilter filter_name /path/to/lua/script.lua function_name
Context:        server config
Status:         Experimental
Module:         mod_lua
Compatibility:  2.4.5 and later

Provides a means of adding a Lua function as an output filter. As with input filters, output filters work as coroutines, first yielding before buffers are sent, then yielding whenever a bucket needs to be passed down the chain, and finally (optionally) yielding anything that needs to be appended to the output data. The global variable bucket holds the buckets as they are passed onto the Lua script:

    LuaOutputFilter myOutputFilter /www/filter.lua output_filter
    SetOutputFilter myOutputFilter

    --[[
        Example output filter that escapes all HTML entities in the output
    ]]--
    function output_filter(r)
        coroutine.yield("(Handled by myOutputFilter)<br/>\n") -- Prepend some data to the output,
                                                              -- yield and wait for buckets.
        while bucket do -- For each bucket, do...
            local output = r:escape_html(bucket) -- Escape all output
            coroutine.yield(output) -- Send converted data down the chain
        end
        -- No more buckets available.
    end

As with the input filter, the output filter supports denying/skipping a filter if it is deemed unwanted:

    function output_filter(r)
        if not r.content_type:match("text/html") then
            return -- Simply deny filtering, passing on the original content instead
        end
        coroutine.yield() -- wait for buckets
        ... -- insert filter stuff here
    end

Lua filters with mod_filter
    When a Lua filter is used as the underlying provider via the FilterProvider directive, filtering will only work when the filter-name is identical to the provider-name.

See "Modifying contents with Lua filters" for more information.

LuaPackageCPath Directive

Description:    Add a directory to lua's package.cpath
Syntax:         LuaPackageCPath /path/to/include/?.soa
Context:        server config, virtual host, directory, .htaccess
Override:       All
Status:         Experimental
Module:         mod_lua

Add a path to Lua's shared library search path. Follows the same conventions as Lua. This just munges the package.cpath in the Lua VMs.

LuaPackagePath Directive

Description:    Add a directory to lua's package.path
Syntax:         LuaPackagePath /path/to/include/?.lua
Context:        server config, virtual host, directory, .htaccess
Override:       All
Status:         Experimental
Module:         mod_lua

Add a path to Lua's module search path. Follows the same conventions as Lua. This just munges the package.path in the Lua VMs.

Examples:

    LuaPackagePath /scripts/lib/?.lua
    LuaPackagePath /scripts/lib/?/init.lua

LuaQuickHandler Directive

Description:    Provide a hook for the quick handler of request processing
Syntax:         LuaQuickHandler /path/to/script.lua hook_function_name
Context:        server config, virtual host
Override:       All
Status:         Experimental
Module:         mod_lua

This phase is run immediately after the request has been mapped to a virtual host, and can be used to either do some request processing before the other phases kick in, or to serve a request without the need to translate, map to storage et cetera. As this phase is run before anything else, directives such as <Location> or <Directory> are void in this phase, just as URIs have not been properly parsed yet.

Context
    This directive is not valid in <Directory>, <Files>, or htaccess context.

LuaRoot Directive

Description:    Specify the base path for resolving relative paths for mod_lua directives
Syntax:         LuaRoot /path/to/a/directory
Context:        server config, virtual host, directory, .htaccess
Override:       All
Status:         Experimental
Module:         mod_lua

Specify the base path which will be used to evaluate all relative paths within mod_lua. If not specified they will be resolved relative to the current working directory, which may not always work well for a server.

LuaScope Directive

Description:    One of once, request, conn, thread -- default is once
Syntax:         LuaScope once|request|conn|thread|server [min] [max]
Default:        LuaScope once
Context:        server config, virtual host, directory, .htaccess
Override:       All
Status:         Experimental
Module:         mod_lua

Specify the life cycle scope of the Lua interpreter which will be used by handlers in this "Directory." The default is "once".

once: use the interpreter once and throw it away.
request: use the interpreter to handle anything based on the same file within this request, which is also request scoped.
conn: Same as request but attached to the connection_rec.
thread: Use the interpreter for the lifetime of the thread handling the request (only available with threaded MPMs).
server: This one is different from the others because the server scope is quite long lived, and multiple threads will have the same server_rec. To accommodate this, server scoped Lua states are stored in an apr resource list. The min and max arguments specify the minimum and maximum number of Lua states to keep in the pool.

Generally speaking, the thread and server scopes execute roughly 2-3 times faster than the rest, because they don't have to spawn new Lua states on every request (especially with the event MPM, as even keepalive requests will use a new thread for each request). If you are satisfied that your scripts will not have problems reusing a state, then the thread or server scopes should be used for maximum performance. While the thread scope will provide the fastest responses, the server scope will use less memory, as states are pooled, allowing, for example, 1000 threads to share only 100 Lua states, thus using only 10% of the memory required by the thread scope.

10.74 Apache Module mod_macro

Description:      Provides macros within apache httpd runtime configuration files
Status:           Base
ModuleIdentifier: macro_module
SourceFile:       mod_macro.c

Summary
Provides macros within Apache httpd runtime configuration files, to ease the process of creating numerous similar configuration blocks. When the server starts up, the macros are expanded using the provided parameters, and the result is processed along with the rest of the configuration file.

Directives
• <Macro>
• UndefMacro
• Use

Usage
Macros are defined using <Macro> blocks, which contain the portion of your configuration that needs to be repeated, complete with variables for those parts that will need to be substituted.
For example, you might use a macro to define a <VirtualHost> block, in order to define multiple similar virtual hosts:

    <Macro VHost $name $domain>
    <VirtualHost *:80>
        ServerName $domain
        ServerAlias www.$domain

        DocumentRoot "/var/www/vhosts/$name"
        ErrorLog "/var/log/httpd/$name.error_log"
        CustomLog "/var/log/httpd/$name.access_log" combined
    </VirtualHost>
    </Macro>

Macro names are case-insensitive, like httpd configuration directives. However, variable names are case sensitive.

You would then invoke this macro several times to create virtual hosts:

    Use VHost example example.com
    Use VHost myhost hostname.org
    Use VHost apache apache.org

    UndefMacro VHost

At server startup time, each of these Use invocations would be expanded into a full virtual host, as described by the <Macro> definition.

The UndefMacro directive is used so that later macros using the same variable names don't result in conflicting definitions.

A more elaborate version of this example may be seen below in the Examples section.

Tips
Parameter names should begin with a sigil such as $, %, or @, so that they are clearly identifiable, and also in order to help deal with interactions with other directives, such as the core Define directive. Failure to do so will result in a warning. Nevertheless, you are encouraged to have a good knowledge of your entire server configuration in order to avoid reusing the same variables in different scopes, which can cause confusion.

Parameters prefixed with either $ or % are not escaped. Parameters prefixed with @ are escaped in quotes.

Avoid using a parameter which contains another parameter as a prefix (for example, $win and $winter), as this may cause confusion at expression evaluation time. In the event of such confusion, the longest possible parameter name is used.
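The longest-name rule described above can be pictured with a small Python model. This is only a sketch of the substitution idea and not mod_macro's actual implementation; the function name expand and the dictionary of parameters are assumptions made for illustration.

```python
def expand(text, params):
    """Replace each parameter name found in text with its value.

    Hypothetical model: at every position the longest parameter name
    that matches is the one that gets substituted.
    """
    names = sorted(params, key=len, reverse=True)  # try longest names first
    out = []
    i = 0
    while i < len(text):
        for name in names:
            if text.startswith(name, i):
                out.append(params[name])
                i += len(name)
                break
        else:
            # No parameter starts here; copy the character unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


params = {"$win": "windows", "$winter": "cold"}
print(expand("Define SEASON $winter", params))
print(expand("Define OS $win", params))
```

In this model "$winter" is never read as "$win" followed by "ter", which is the behavior the tip warns about.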
If you want to use a value within another string, it is useful to surround the parameter in braces, to avoid confusion:

    <Macro DocRoot ${docroot}>
        DocumentRoot "/var/www/${docroot}/htdocs"
    </Macro>

Examples

Virtual Host Definition
A common usage of mod_macro is for the creation of dynamically-generated virtual hosts.

    ## Define a VHost Macro for repetitive configurations

    <Macro VHost $host $port $dir>
      Listen $port
      <VirtualHost *:$port>

        ServerName $host
        DocumentRoot "$dir"

        # Public document root
        <Directory "$dir">
          Require all granted
        </Directory>

        # limit access to intranet subdir.
        <Directory "$dir/intranet">
          Require ip 10.0.0.0/8
        </Directory>
      </VirtualHost>
    </Macro>

    ## Use of VHost with different arguments.

    Use VHost www.apache.org 80 /vhosts/apache/htdocs
    Use VHost example.org 8080 /vhosts/example/htdocs
    Use VHost www.example.fr 1234 /vhosts/example.fr/htdocs

Removal of a macro definition
It's recommended that you undefine a macro once you've used it. This avoids confusion in a complex configuration file where there may be conflicts in variable names.

    <Macro DirGroup $dir $group>
      <Directory "$dir">
        Require group $group
      </Directory>
    </Macro>

    Use DirGroup /www/apache/private private
    Use DirGroup /www/apache/server admin

    UndefMacro DirGroup

Macro Directive

Description: Define a configuration file macro
Syntax:      <Macro name [par1 .. parN]> ... </Macro>
Context:     server config, virtual host, directory
Status:      Base
Module:      mod_macro

The <Macro> directive controls the definition of a macro within the server runtime configuration files. The first argument is the name of the macro. Other arguments are parameters to the macro. It is good practice to prefix parameter names with any of '$%@', and not to prefix macro names with such characters.

    <Macro LocalAccessPolicy>
        Require ip 10.2.16.0/24
    </Macro>

    <Macro RestrictedAccessPolicy $ipnumbers>
        Require ip $ipnumbers
    </Macro>

UndefMacro Directive

Description: Undefine a macro
Syntax:      UndefMacro name
Context:     server config, virtual host, directory
Status:      Base
Module:      mod_macro

The UndefMacro directive undefines a macro which has been defined beforehand.

    UndefMacro LocalAccessPolicy
    UndefMacro RestrictedAccessPolicy

Use Directive

Description: Use a macro
Syntax:      Use name [value1 ... valueN]
Context:     server config, virtual host, directory
Status:      Base
Module:      mod_macro

The Use directive controls the use of a macro. The specified macro is expanded. It must be given the same number of arguments as in the macro definition. The provided values are associated with their corresponding initial parameters and are substituted before processing.

    Use LocalAccessPolicy
    ...
    Use RestrictedAccessPolicy "192.54.172.0/24 192.54.148.0/24"

is equivalent, with the macros defined above, to:

    Require ip 10.2.16.0/24
    ...
    Require ip 192.54.172.0/24 192.54.148.0/24

10.75 Apache Module mod_mime

Description:      Associates the requested filename's extensions with the file's behavior (handlers and filters) and content (mime-type, language, character set and encoding)
Status:           Base
ModuleIdentifier: mime_module
SourceFile:       mod_mime.c

Summary
This module is used to assign content metadata to the content selected for an HTTP response by mapping patterns in the URI or filenames to the metadata values. For example, the filename extensions of content files often define the content's Internet media type, language, character set, and content-encoding. This information is sent in HTTP messages containing that content and used in content negotiation when selecting alternatives, such that the user's preferences are respected when choosing one of several possible contents to serve. See mod_negotiation for more information about content negotiation.

The directives AddCharset, AddEncoding, AddLanguage and AddType are all used to map file extensions onto the metadata for that file. Respectively they set the character set, content-encoding, content-language, and media-type (content-type) of documents. The directive TypesConfig is used to specify a file which also maps extensions onto media types.

In addition, mod_mime may define the handler and filters that originate and process content.
The directives AddHandler, AddOutputFilter, and AddInputFilter control the modules or scripts that serve the document. The MultiviewsMatch directive allows mod_negotiation to consider these file extensions to be included when testing Multiviews matches.

While mod_mime associates metadata with filename extensions, the core server provides directives that are used to associate all the files in a given container (e.g., <Location>, <Directory>, or <Files>) with particular metadata. These directives include ForceType, SetHandler, SetInputFilter, and SetOutputFilter. The core directives override any filename extension mappings defined in mod_mime.

Note that changing the metadata for a file does not change the value of the Last-Modified header. Thus, previously cached copies may still be used by a client or proxy, with the previous headers. If you change the metadata (language, content type, character set or encoding) you may need to 'touch' affected files (updating their last modified date) to ensure that all visitors receive the corrected content headers.

Directives
• AddCharset
• AddEncoding
• AddHandler
• AddInputFilter
• AddLanguage
• AddOutputFilter
• AddType
• DefaultLanguage
• ModMimeUsePathInfo
• MultiviewsMatch
• RemoveCharset
• RemoveEncoding
• RemoveHandler
• RemoveInputFilter
• RemoveLanguage
• RemoveOutputFilter
• RemoveType
• TypesConfig

See also
• MimeMagicFile
• AddDefaultCharset
• ForceType
• SetHandler
• SetInputFilter
• SetOutputFilter

Files with Multiple Extensions
Files can have more than one extension; the order of the extensions is normally irrelevant. For example, if the file welcome.html.fr maps onto content type text/html and language French then the file welcome.fr.html will map onto exactly the same information. If more than one extension is given that maps onto the same type of metadata, then the one to the right will be used, except for languages and content encodings.
For example, if .gif maps to the media-type image/gif and .html maps to the media-type text/html, then the file welcome.gif.html will be associated with the media-type text/html.

Languages and content encodings are treated cumulatively, because one can assign more than one language or encoding to a particular resource. For example, the file welcome.html.en.de will be delivered with Content-Language: en, de and Content-Type: text/html.

Care should be taken when a file with multiple extensions gets associated with both a media-type and a handler. This will usually result in the request being handled by the module associated with the handler. For example, if the .imap extension is mapped to the handler imap-file (from mod_imagemap) and the .html extension is mapped to the media-type text/html, then the file world.imap.html will be associated with both the imap-file handler and text/html media-type. When it is processed, the imap-file handler will be used, and so it will be treated as a mod_imagemap imagemap file.

If you would prefer only the last dot-separated part of the filename to be mapped to a particular piece of meta-data, then do not use the Add* directives. For example, if you wish to have the file foo.html.cgi processed as a CGI script, but not the file bar.cgi.html, then instead of using AddHandler cgi-script .cgi, use

Configure handler based on final extension only

    <FilesMatch "[^.]+\.cgi$">
        SetHandler cgi-script
    </FilesMatch>

Content encoding
A file of a particular media-type can additionally be encoded a particular way to simplify transmission over the Internet. While this usually will refer to compression, such as gzip, it can also refer to encryption, such as pgp, or to an encoding such as UUencoding, which is designed for transmitting a binary file in an ASCII (text) format.

The HTTP/1.1 RFC (RFC 2616) [55], section 14.11, puts it this way:

    The Content-Encoding entity-header field is used as a modifier to the media-type.
    When present, its value indicates what additional content codings have been applied to the entity-body, and thus what decoding mechanisms must be applied in order to obtain the media-type referenced by the Content-Type header field. Content-Encoding is primarily used to allow a document to be compressed without losing the identity of its underlying media type.

By using more than one file extension (see the section above about multiple file extensions), you can indicate that a file is of a particular type, and also has a particular encoding.

For example, you may have a file which is a Microsoft Word document, which is pkzipped to reduce its size. If the .doc extension is associated with the Microsoft Word file type, and the .zip extension is associated with the pkzip file encoding, then the file Resume.doc.zip would be known to be a pkzip'ed Word document.

Apache sends a Content-encoding header with the resource, in order to tell the client browser about the encoding method.

    Content-encoding: pkzip

Character sets and languages
In addition to file type and the file encoding, another important piece of information is what language a particular document is in, and in what character set the file should be displayed. For example, the document might be written in the Vietnamese alphabet, or in Cyrillic, and should be displayed as such. This information, also, is transmitted in HTTP headers.

The character set, language, encoding and mime type are all used in the process of content negotiation (see mod_negotiation) to determine which document to give to the client, when there are alternative documents in more than one character set, language, encoding or mime type. All filename extension associations created with the AddCharset, AddEncoding, AddLanguage and AddType directives (and extensions listed in the MimeMagicFile) participate in this selection process.
Filename extensions that are only associated using the AddHandler, AddInputFilter or AddOutputFilter directives may be included or excluded from matching by using the MultiviewsMatch directive.

Charset
To convey this further information, Apache optionally sends a Content-Language header, to specify the language that the document is in, and can append additional information onto the Content-Type header to indicate the particular character set that should be used to correctly render the information.

    Content-Language: en, fr
    Content-Type: text/plain; charset=ISO-8859-1

The language specification is the two-letter abbreviation for the language. The charset is the name of the particular character set which should be used.

[55] http://www.ietf.org/rfc/rfc2616.txt

AddCharset Directive

Description: Maps the given filename extensions to the specified content charset
Syntax:      AddCharset charset extension [extension] ...
Context:     server config, virtual host, directory, .htaccess
Override:    FileInfo
Status:      Base
Module:      mod_mime

The AddCharset directive maps the given filename extensions to the specified content charset (the Internet registered name for a given character encoding). charset is the media type's charset parameter (see http://www.iana.org/assignments/character-sets) for resources with filenames containing extension. This mapping is added to any already in force, overriding any mappings that already exist for the same extension.

Example

    AddLanguage ja .ja
    AddCharset EUC-JP .euc
    AddCharset ISO-2022-JP .jis
    AddCharset SHIFT_JIS .sjis

Then the document xxxx.ja.jis will be treated as being a Japanese document whose charset is ISO-2022-JP (as will the document xxxx.jis.ja). The AddCharset directive is useful both for informing the client about the character encoding of the document, so that the document can be interpreted and displayed appropriately, and for content negotiation, where the server returns one from several documents based on the client's charset preference.

The extension argument is case-insensitive and can be specified with or without a leading dot. Filenames may have multiple extensions and the extension argument will be compared against each of them.

See also
• mod_negotiation
• AddDefaultCharset

AddEncoding Directive

Description: Maps the given filename extensions to the specified encoding type
Syntax:      AddEncoding encoding extension [extension] ...
Context:     server config, virtual host, directory, .htaccess
Override:    FileInfo
Status:      Base
Module:      mod_mime

The AddEncoding directive maps the given filename extensions to the specified HTTP content-encoding. encoding is the HTTP content coding to append to the value of the Content-Encoding header field for documents named with the extension. This mapping is added to any already in force, overriding any mappings that already exist for the same extension.

Example

    AddEncoding x-gzip .gz
    AddEncoding x-compress .Z

This will cause filenames containing the .gz extension to be marked as encoded using the x-gzip encoding, and filenames containing the .Z extension to be marked as encoded with x-compress.

Old clients expect x-gzip and x-compress; however, the standard dictates that they're equivalent to gzip and compress respectively. Apache does content encoding comparisons by ignoring any leading x-. When responding with an encoding, Apache will use whatever form (i.e., x-foo or foo) the client requested. If the client didn't specifically request a particular form, Apache will use the form given by the AddEncoding directive. To make this long story short, you should always use x-gzip and x-compress for these two specific encodings. More recent encodings, such as deflate, should be specified without the x-.
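The comparison rule described above can be sketched in a few lines of Python. This is an illustrative model, not Apache's code: the function names canonical, same_coding and response_coding are hypothetical, and the lower-casing of coding names is an assumption added here.

```python
def canonical(coding):
    """Strip one leading 'x-' so that 'x-gzip' and 'gzip' compare equal.

    Lower-casing is an assumption of this sketch.
    """
    coding = coding.strip().lower()
    return coding[2:] if coding.startswith("x-") else coding


def same_coding(a, b):
    """Compare two content codings, ignoring any leading 'x-'."""
    return canonical(a) == canonical(b)


def response_coding(configured, requested):
    """Choose the spelling to send in the response.

    configured: the form given to AddEncoding, e.g. 'x-gzip'
    requested:  codings named by the client, in the client's own spelling
    """
    for form in requested:
        if same_coding(form, configured):
            return form.strip()   # use the form the client asked for
    return configured             # otherwise fall back to the configured form


print(response_coding("x-gzip", ["gzip", "deflate"]))
print(response_coding("x-gzip", []))
```

The first call models a client that asked for gzip, the second a client that named no form, in which case the configured x-gzip spelling is kept.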
The extension argument is case-insensitive and can be specified with or without a leading dot. Filenames may have multiple extensions and the extension argument will be compared against each of them.

AddHandler Directive

Description: Maps the filename extensions to the specified handler
Syntax:      AddHandler handler-name extension [extension] ...
Context:     server config, virtual host, directory, .htaccess
Override:    FileInfo
Status:      Base
Module:      mod_mime

Files having the name extension will be served by the specified handler-name. This mapping is added to any already in force, overriding any mappings that already exist for the same extension. For example, to activate CGI scripts with the file extension .cgi, you might use:

    AddHandler cgi-script .cgi

Once that has been put into your httpd.conf file, any file containing the .cgi extension will be treated as a CGI program.

The extension argument is case-insensitive and can be specified with or without a leading dot. Filenames may have multiple extensions and the extension argument will be compared against each of them.

See also
• SetHandler

AddInputFilter Directive

Description: Maps filename extensions to the filters that will process client requests
Syntax:      AddInputFilter filter[;filter...] extension [extension] ...
Context:     server config, virtual host, directory, .htaccess
Override:    FileInfo
Status:      Base
Module:      mod_mime

AddInputFilter maps the filename extension extension to the filters which will process client requests and POST input when they are received by the server. This is in addition to any filters defined elsewhere, including the SetInputFilter directive. This mapping is merged over any already in force, overriding any mappings that already exist for the same extension.

If more than one filter is specified, they must be separated by semicolons in the order in which they should process the content. The filter argument is case-insensitive.
The extension argument is case-insensitive and can be specified with or without a leading dot. Filenames may have multiple extensions and the extension argument will be compared against each of them.

See also
• RemoveInputFilter
• SetInputFilter

AddLanguage Directive

Description: Maps the given filename extension to the specified content language
Syntax:      AddLanguage language-tag extension [extension] ...
Context:     server config, virtual host, directory, .htaccess
Override:    FileInfo
Status:      Base
Module:      mod_mime

The AddLanguage directive maps the given filename extension to the specified content language. Files with the filename extension are assigned an HTTP Content-Language value of language-tag corresponding to the language identifiers defined by RFC 3066. This directive overrides any mappings that already exist for the same extension.

Example

    AddEncoding x-compress .Z
    AddLanguage en .en
    AddLanguage fr .fr

Then the document xxxx.en.Z will be treated as being a compressed English document (as will the document xxxx.Z.en). Although the content language is reported to the client, the browser is unlikely to use this information. The AddLanguage directive is more useful for content negotiation, where the server returns one from several documents based on the client's language preference.

If multiple language assignments are made for the same extension, the last one encountered is the one that is used. That is, for the case of:

    AddLanguage en .en
    AddLanguage en-gb .en
    AddLanguage en-us .en

documents with the extension .en would be treated as being en-us.

The extension argument is case-insensitive and can be specified with or without a leading dot. Filenames may have multiple extensions and the extension argument will be compared against each of them.
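The matching rules for AddLanguage (last assignment wins, case-insensitive extensions, optional leading dot, every extension of the filename examined) can be modeled with a short Python sketch. It is not how mod_mime is implemented; the helper names normalize, build_language_map and languages_for are hypothetical.

```python
def normalize(ext):
    """Extensions are case-insensitive; a leading dot is optional."""
    return ext.lower().lstrip(".")


def build_language_map(directives):
    """Build an extension -> language-tag map.

    directives: list of (language_tag, [extensions]) in configuration order.
    A later assignment for the same extension replaces the earlier one.
    """
    mapping = {}
    for tag, extensions in directives:
        for ext in extensions:
            mapping[normalize(ext)] = tag
    return mapping


def languages_for(filename, mapping):
    """Compare every dot-separated extension of filename against the map."""
    parts = filename.split(".")[1:]   # everything after the base name
    return [mapping[normalize(p)] for p in parts if normalize(p) in mapping]


config = [("en", [".en"]), ("en-gb", [".en"]), ("en-us", ["EN"])]
mapping = build_language_map(config)
print(languages_for("xxxx.en.Z", mapping))
print(languages_for("xxxx.Z.en", mapping))
```

Both calls report en-us for the .en extension, regardless of where it appears in the filename, which mirrors the three-line AddLanguage example above.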
See also
• mod_negotiation

AddOutputFilter Directive

Description: Maps filename extensions to the filters that will process responses from the server
Syntax:      AddOutputFilter filter[;filter...] extension [extension] ...
Context:     server config, virtual host, directory, .htaccess
Override:    FileInfo
Status:      Base
Module:      mod_mime

The AddOutputFilter directive maps the filename extension extension to the filters which will process responses from the server before they are sent to the client. This is in addition to any filters defined elsewhere, including the SetOutputFilter and AddOutputFilterByType directives. This mapping is merged over any already in force, overriding any mappings that already exist for the same extension.

For example, the following configuration will process all .shtml files for server-side includes and will then compress the output using mod_deflate.

    AddOutputFilter INCLUDES;DEFLATE shtml

If more than one filter is specified, they must be separated by semicolons in the order in which they should process the content. The filter argument is case-insensitive.

The extension argument is case-insensitive and can be specified with or without a leading dot. Filenames may have multiple extensions and the extension argument will be compared against each of them.

Note that when defining a set of filters using the AddOutputFilter directive, any definition made will replace any previous definition made by the AddOutputFilter directive.
    # Effective filter "DEFLATE"
    AddOutputFilter DEFLATE shtml
    <Location "/foo">
      # Effective filter "INCLUDES", replacing "DEFLATE"
      AddOutputFilter INCLUDES shtml
    </Location>
    <Location "/bar">
      # Effective filter "INCLUDES;DEFLATE", replacing "DEFLATE"
      AddOutputFilter INCLUDES;DEFLATE shtml
    </Location>
    <Location "/bar/baz">
      # Effective filter "BUFFER", replacing "INCLUDES;DEFLATE"
      AddOutputFilter BUFFER shtml
    </Location>
    <Location "/bar/baz/buz">
      # No effective filter, replacing "BUFFER"
      RemoveOutputFilter shtml
    </Location>

See also
• RemoveOutputFilter
• SetOutputFilter

AddType Directive

Description: Maps the given filename extensions onto the specified content type
Syntax:      AddType media-type extension [extension] ...
Context:     server config, virtual host, directory, .htaccess
Override:    FileInfo
Status:      Base
Module:      mod_mime

The AddType directive maps the given filename extensions onto the specified content type. media-type is the media type to use for filenames containing extension. This mapping is added to any already in force, overriding any mappings that already exist for the same extension.

Note
It is recommended that new media types be added using the AddType directive rather than changing the TypesConfig file.

Example

    AddType image/gif .gif

Or, to specify multiple file extensions in one directive:

Example

    AddType image/jpeg jpeg jpg jpe

The extension argument is case-insensitive and can be specified with or without a leading dot. Filenames may have multiple extensions and the extension argument will be compared against each of them.

A similar effect to mod_negotiation's LanguagePriority can be achieved by qualifying a media-type with qs:

Example

    AddType application/rss+xml;qs=0.8 .xml

This is useful in situations where, for example, a client requesting Accept: */* cannot actually process the content returned by the server.

This directive primarily configures the content types generated for static files served out of the filesystem.
For resources other than static files, where the generator of the response typically specifies a Content-Type, this directive has no effect.

Note
If no handler is explicitly set for a request, the specified content type will also be used as the handler name.
When explicit directives such as SetHandler or AddHandler do not apply to the current request, the internal handler name normally set by those directives is instead set to the content type specified by this directive.
This is a historical behavior that may be used by some third-party modules (such as mod_php) for taking responsibility for the matching request.
Configurations that rely on such "synthetic" types should be avoided. Additionally, configurations that restrict access to SetHandler or AddHandler should restrict access to this directive as well.

See also
• ForceType
• mod_negotiation

DefaultLanguage Directive

Description: Defines a default language-tag to be sent in the Content-Language header field for all resources in the current context that have not been assigned a language-tag by some other means.
Syntax:      DefaultLanguage language-tag
Context:     server config, virtual host, directory, .htaccess
Override:    FileInfo
Status:      Base
Module:      mod_mime

The DefaultLanguage directive tells Apache that all resources in the directive's scope (e.g., all resources covered by the current <Directory> container) that don't have an explicit language extension (such as .fr or .de as configured by AddLanguage) should be assigned a Content-Language of language-tag. This allows entire directory trees to be marked as containing Dutch content, for instance, without having to rename each file. Note that unlike using extensions to specify languages, DefaultLanguage can only specify a single language.
If no DefaultLanguage directive is in force and a file does not have any language extensions as configured by AddLanguage, then no Content-Language header field will be generated.

Example

    DefaultLanguage en

See also
• mod_negotiation

ModMimeUsePathInfo Directive

Description: Tells mod_mime to treat path_info components as part of the filename
Syntax:      ModMimeUsePathInfo On|Off
Default:     ModMimeUsePathInfo Off
Context:     directory
Status:      Base
Module:      mod_mime

The ModMimeUsePathInfo directive is used to combine the filename with the path_info URL component to apply mod_mime's directives to the request. The default value is Off; therefore, the path_info component is ignored.

This directive is recommended when you have a virtual filesystem.

Example

    ModMimeUsePathInfo On

If you have a request for /index.php/foo.shtml, mod_mime will now treat the incoming request as /index.php/foo.shtml, and directives like AddOutputFilter INCLUDES .shtml will add the INCLUDES filter to the request. If ModMimeUsePathInfo is not set, the INCLUDES filter will not be added. This will work analogously for virtual paths, such as those defined by <Location>.

See also
• AcceptPathInfo

MultiviewsMatch Directive

Description: The types of files that will be included when searching for a matching file with MultiViews
Syntax:      MultiviewsMatch Any|NegotiatedOnly|Filters|Handlers [Handlers|Filters]
Default:     MultiviewsMatch NegotiatedOnly
Context:     server config, virtual host, directory, .htaccess
Override:    FileInfo
Status:      Base
Module:      mod_mime

MultiviewsMatch permits three different behaviors for mod_negotiation's Multiviews feature. Multiviews allows a request for a file, e.g. index.html, to match any negotiated extensions following the base request, e.g. index.html.en, index.html.fr, or index.html.gz.
The NegotiatedOnly option provides that every extension following the base name must correlate to a recognized mod_mime extension for content negotiation, e.g. Charset, Content-Type, Language, or Encoding. This is the strictest implementation with the fewest unexpected side effects, and is the default behavior.

To include extensions associated with Handlers and/or Filters, set the MultiviewsMatch directive to either Handlers, Filters, or both option keywords. If all other factors are equal, the smallest file will be served; for example, in deciding between index.html.cgi of 500 bytes and index.html.pl of 1000 bytes, the .cgi file would win. Users of .asis files might prefer to use the Handlers option, if .asis files are associated with the asis-handler.

You may finally allow Any extensions to match, even if mod_mime doesn't recognize the extension. This can cause unpredictable results, such as serving .old or .bak files the webmaster never expected to be served.

For example, the following configuration will allow handlers and filters to participate in Multiviews, but will exclude unknown files:

    MultiviewsMatch Handlers Filters

MultiviewsMatch is not allowed in a <Location> or <LocationMatch> section.

See also
• Options
• mod_negotiation

RemoveCharset Directive

Description: Removes any character set associations for a set of file extensions
Syntax:      RemoveCharset extension [extension] ...
Context:     virtual host, directory, .htaccess
Override:    FileInfo
Status:      Base
Module:      mod_mime

The RemoveCharset directive removes any character set associations for files with the given extensions. This allows .htaccess files in subdirectories to undo any associations inherited from parent directories or the server config files.

The extension argument is case-insensitive and can be specified with or without a leading dot.
Example

    RemoveCharset .html .shtml

RemoveEncoding Directive

Description: Removes any content encoding associations for a set of file extensions
Syntax:      RemoveEncoding extension [extension] ...
Context:     virtual host, directory, .htaccess
Override:    FileInfo
Status:      Base
Module:      mod_mime

The RemoveEncoding directive removes any encoding associations for files with the given extensions. This allows .htaccess files in subdirectories to undo any associations inherited from parent directories or the server config files. An example of its use might be:

/foo/.htaccess:

    AddEncoding x-gzip .gz
    AddType text/plain .asc
    <Files "*.gz.asc">
        RemoveEncoding .gz
    </Files>

This will cause foo.gz to be marked as being encoded with the gzip method, but foo.gz.asc as an unencoded plaintext file.

Note
RemoveEncoding directives are processed after any AddEncoding directives, so it is possible they may undo the effects of the latter if both occur within the same directory configuration.

The extension argument is case-insensitive and can be specified with or without a leading dot.

RemoveHandler Directive

Description: Removes any handler associations for a set of file extensions
Syntax:      RemoveHandler extension [extension] ...
Context:     virtual host, directory, .htaccess
Override:    FileInfo
Status:      Base
Module:      mod_mime

The RemoveHandler directive removes any handler associations for files with the given extensions. This allows .htaccess files in subdirectories to undo any associations inherited from parent directories or the server config files. An example of its use might be:

/foo/.htaccess:

    AddHandler server-parsed .html

/foo/bar/.htaccess:

    RemoveHandler .html

This has the effect of returning .html files in the /foo/bar directory to being treated as normal files, rather than as candidates for parsing (see the mod_include module).

The extension argument is case-insensitive and can be specified with or without a leading dot.
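The way a subdirectory undoes an inherited association can be pictured with a small Python model. This is a sketch only: the function merge_handlers and the shape of the per-directory dictionaries are hypothetical, and it assumes that RemoveHandler is applied after additions at the same level, in the same way the note above describes for RemoveEncoding.

```python
def merge_handlers(levels):
    """Compute the effective extension -> handler map for a directory.

    levels: per-directory configurations, outermost directory first.
    Each entry is a dict with optional keys:
      'add'    -> {extension: handler}   (models AddHandler)
      'remove' -> [extension, ...]       (models RemoveHandler)
    Within one level, removals are applied after additions (assumption).
    """
    effective = {}
    for level in levels:
        effective.update(level.get("add", {}))
        for ext in level.get("remove", []):
            effective.pop(ext, None)
    return effective


foo = {"add": {"html": "server-parsed"}}   # models /foo/.htaccess
foo_bar = {"remove": ["html"]}             # models /foo/bar/.htaccess

print(merge_handlers([foo]))
print(merge_handlers([foo, foo_bar]))
```

In the model, /foo keeps the server-parsed handler for .html, while /foo/bar ends up with no handler for that extension, matching the example above.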
RemoveInputFilter Directive

Description: Removes any input filter associations for a set of file extensions
Syntax:      RemoveInputFilter extension [extension] ...
Context:     virtual host, directory, .htaccess
Override:    FileInfo
Status:      Base
Module:      mod_mime

The RemoveInputFilter directive removes any input filter associations for files with the given extensions. This allows .htaccess files in subdirectories to undo any associations inherited from parent directories or the server config files.

The extension argument is case-insensitive and can be specified with or without a leading dot.

See also
• AddInputFilter
• SetInputFilter

RemoveLanguage Directive

Description: Removes any language associations for a set of file extensions
Syntax:      RemoveLanguage extension [extension] ...
Context:     virtual host, directory, .htaccess
Override:    FileInfo
Status:      Base
Module:      mod_mime

The RemoveLanguage directive removes any language associations for files with the given extensions. This allows .htaccess files in subdirectories to undo any associations inherited from parent directories or the server config files.

The extension argument is case-insensitive and can be specified with or without a leading dot.

RemoveOutputFilter Directive

Description: Removes any output filter associations for a set of file extensions
Syntax:      RemoveOutputFilter extension [extension] ...
Context:     virtual host, directory, .htaccess
Override:    FileInfo
Status:      Base
Module:      mod_mime

The RemoveOutputFilter directive removes any output filter associations for files with the given extensions. This allows .htaccess files in subdirectories to undo any associations inherited from parent directories or the server config files.

The extension argument is case-insensitive and can be specified with or without a leading dot.
Example

    RemoveOutputFilter shtml

See also
• AddOutputFilter

RemoveType Directive

Description: Removes any content type associations for a set of file extensions
Syntax: RemoveType extension [extension] ...
Context: virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_mime

The RemoveType directive removes any media type associations for files with the given extensions. This allows .htaccess files in subdirectories to undo any associations inherited from parent directories or the server config files. An example of its use might be:

/foo/.htaccess:

    RemoveType .cgi

This will remove any special handling of .cgi files in the /foo/ directory and any beneath it, causing responses containing those files to omit the HTTP Content-Type header field.

Note
RemoveType directives are processed after any AddType directives, so it is possible they may undo the effects of the latter if both occur within the same directory configuration.

The extension argument is case-insensitive and can be specified with or without a leading dot.

TypesConfig Directive

Description: The location of the mime.types file
Syntax: TypesConfig file-path
Default: TypesConfig conf/mime.types
Context: server config
Status: Base
Module: mod_mime

The TypesConfig directive sets the location of the media types configuration file. File-path is relative to the ServerRoot. This file sets the default list of mappings from filename extensions to content types. Most administrators use the mime.types file provided by their OS, which associates common filename extensions with the official list of IANA registered media types maintained at http://www.iana.org/assignments/media-types/index.html as well as a large number of unofficial types. This simplifies the httpd.conf file by providing the majority of media-type definitions, and may be overridden by AddType directives as needed.
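For instance, a mapping can be added or changed in the server configuration rather than in mime.types itself. This is only a sketch: the .readme extension is a made-up example, and TypesConfig is shown with its documented default value.

```
# Keep the distributed media type list where it is (documented default)
TypesConfig conf/mime.types

# Hypothetical local addition: treat *.readme files as plain text
AddType text/plain .readme
```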
You should not edit the mime.types file, because it may be replaced when you upgrade your server.

The file contains lines in the format of the arguments to an AddType directive:

    media-type [extension] ...

The case of the extension does not matter. Blank lines, and lines beginning with a hash character (#) are ignored. Empty lines are there for completeness (of the mime.types file). Apache httpd can still determine these types with mod_mime_magic.

Note
Please do not send requests to the Apache HTTP Server Project to add any new entries in the distributed mime.types file unless (1) they are already registered with IANA, and (2) they use widely accepted, non-conflicting filename extensions across platforms. category/x-subtype requests will be automatically rejected, as will any new two-letter extensions as they will likely conflict later with the already crowded language and character set namespace.

See also
• mod_mime_magic

10.76 Apache Module mod_mime_magic

Description: Determines the MIME type of a file by looking at a few bytes of its contents
Status: Extension
ModuleIdentifier: mime_magic_module
SourceFile: mod_mime_magic.c

Summary

This module determines the MIME type of files in the same way the Unix file(1) command works: it looks at the first few bytes of the file. It is intended as a "second line of defense" for cases that mod_mime can’t resolve.

This module is derived from a free version of the file(1) command for Unix, which uses "magic numbers" and other hints from a file’s contents to figure out what the contents are. This module is active only if the magic file is specified by the MimeMagicFile directive.

Directives
• MimeMagicFile

Format of the Magic File

The contents of the file are plain ASCII text in 4-5 columns. Blank lines are allowed but ignored. Commented lines use a hash mark (#).
The remaining lines are parsed for the following columns:

Column  Description
1       byte number to begin checking from
        ">" indicates a dependency upon the previous non-">" line
2       type of data to match
            byte      single character
            short     machine-order 16-bit integer
            long      machine-order 32-bit integer
            string    arbitrary-length string
            date      long integer date (seconds since Unix epoch/1970)
            beshort   big-endian 16-bit integer
            belong    big-endian 32-bit integer
            bedate    big-endian 32-bit integer date
            leshort   little-endian 16-bit integer
            lelong    little-endian 32-bit integer
            ledate    little-endian 32-bit integer date
3       contents of data to match
4       MIME type if matched
5       MIME encoding if matched (optional)

For example, the following magic file lines would recognize some audio formats:

    # Sun/NeXT audio data
    0       string  .snd
    >12     belong  1       audio/basic
    >12     belong  2       audio/basic
    >12     belong  3       audio/basic
    >12     belong  4       audio/basic
    >12     belong  5       audio/basic
    >12     belong  6       audio/basic
    >12     belong  7       audio/basic
    >12     belong  23      audio/x-adpcm

Or these would recognize the difference between *.doc files containing Microsoft Word or FrameMaker documents. (These are incompatible file formats which use the same file suffix.)

    # Frame 0 string 0 string 0 string 0 string 0 string 0 string 0 string \

Content of the page.

----xyz----

Consider, for example, a resource called document.html which is available in English, French, and German. The files for each of these are called document.html.en, document.html.fr, and document.html.de, respectively. The type map file will be called document.html.var, and will contain the following:

    URI: document.html

    Content-language: en
    Content-type: text/html
    URI: document.html.en

    Content-language: fr
    Content-type: text/html
    URI: document.html.fr

    Content-language: de
    Content-type: text/html
    URI: document.html.de

All four of these files should be placed in the same directory, and the .var file should be associated with the type-map handler with an AddHandler directive:

    AddHandler type-map .var

A request for document.html.var in this directory will result in choosing the variant which most closely matches the language preference specified in the user’s Accept-Language request header.

If Multiviews is enabled, and MultiviewsMatch is set to "handlers" or "any", a request to document.html will discover document.html.var and continue negotiating with the explicit type map.

Other configuration directives, such as Alias, can be used to map document.html to document.html.var.

Multiviews

A Multiviews search is enabled by the Multiviews Options. If the server receives a request for /some/dir/foo and /some/dir/foo does not exist, then the server reads the directory looking for all files named foo.*, and effectively fakes up a type map which names all those files, assigning them the same media types and content-encodings they would have had if the client had asked for one of them by name. It then chooses the best match to the client’s requirements, and returns that document.

The MultiviewsMatch directive configures whether Apache will consider files that do not have content negotiation meta-information assigned to them when choosing files.
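A minimal sketch of how the pieces above might be combined in one place. The directory path is a placeholder, and whether MultiviewsMatch should be set to Handlers or Any depends on your own content.

```
# Hypothetical directory holding document.html.var and its variants
<Directory "/usr/local/apache2/htdocs/docs">
    # Turn on the Multiviews search for this directory
    Options +MultiViews

    # Let a Multiviews search for document.html find document.html.var
    MultiviewsMatch Handlers

    # Treat .var files as type maps
    AddHandler type-map .var
</Directory>
```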
CacheNegotiatedDocs Directive

Description: Allows content-negotiated documents to be cached by proxy servers
Syntax: CacheNegotiatedDocs On|Off
Default: CacheNegotiatedDocs Off
Context: server config, virtual host
Status: Base
Module: mod_negotiation

If set, this directive allows content-negotiated documents to be cached by proxy servers. This could mean that clients behind those proxies could retrieve versions of the documents that are not the best match for their abilities, but it will make caching more efficient.

This directive only applies to requests which come from HTTP/1.0 browsers. HTTP/1.1 provides much better control over the caching of negotiated documents, and this directive has no effect in responses to HTTP/1.1 requests.

ForceLanguagePriority Directive

Description: Action to take if a single acceptable document is not found
Syntax: ForceLanguagePriority None|Prefer|Fallback [Prefer|Fallback]
Default: ForceLanguagePriority Prefer
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_negotiation

The ForceLanguagePriority directive uses the given LanguagePriority to satisfy negotiation where the server could otherwise not return a single matching document.

ForceLanguagePriority Prefer uses LanguagePriority to serve one valid result, rather than returning an HTTP result 300 (MULTIPLE CHOICES) when there are several equally valid choices. If the directives below were given, and the user’s Accept-Language header assigned en and de each as quality .500 (equally acceptable) then the first matching variant, en, will be served.

    LanguagePriority en fr de
    ForceLanguagePriority Prefer

ForceLanguagePriority Fallback uses LanguagePriority to serve a valid result, rather than returning an HTTP result 406 (NOT ACCEPTABLE).
If the directives below were given, and the user’s Accept-Language only permitted an es language response, but such a variant isn’t found, then the first variant from the LanguagePriority list below will be served.

    LanguagePriority en fr de
    ForceLanguagePriority Fallback

Both options, Prefer and Fallback, may be specified, so either the first matching variant from LanguagePriority will be served if more than one variant is acceptable, or the first available document will be served if none of the variants matched the client’s acceptable list of languages.

See also
• AddLanguage

LanguagePriority Directive

Description: The precedence of language variants for cases where the client does not express a preference
Syntax: LanguagePriority MIME-lang [MIME-lang] ...
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_negotiation

The LanguagePriority directive sets the precedence of language variants for the case where the client does not express a preference, when handling a Multiviews request. The list of MIME-lang values is in order of decreasing preference.

    LanguagePriority en fr de

For a request for foo.html, where foo.html.fr and foo.html.de both existed, but the browser did not express a language preference, then foo.html.fr would be returned.

Note that this directive only has an effect if a ’best’ language cannot be determined by any other means or the ForceLanguagePriority directive is not None. In general, the client determines the language preference, not the server.

See also
• AddLanguage

10.78 Apache Module mod_nw_ssl

Description: Enable SSL encryption for NetWare
Status: Base
ModuleIdentifier: nwssl_module
SourceFile: mod_nw_ssl.c
Compatibility: NetWare only

Summary

This module enables SSL encryption for a specified port. It takes advantage of the SSL encryption functionality that is built into the NetWare operating system.
Directives
• NWSSLTrustedCerts
• NWSSLUpgradeable
• SecureListen

NWSSLTrustedCerts Directive

Description: List of additional client certificates
Syntax: NWSSLTrustedCerts filename [filename] ...
Context: server config
Status: Base
Module: mod_nw_ssl

Specifies a list of client certificate files (DER format) that are used when creating a proxied SSL connection. Each client certificate used by a server must be listed separately in its own .der file.

NWSSLUpgradeable Directive

Description: Allows a connection to be upgraded to an SSL connection upon request
Syntax: NWSSLUpgradeable [IP-address:]portnumber
Context: server config
Status: Base
Module: mod_nw_ssl

Allows a connection that was created on the specified address and/or port to be upgraded to an SSL connection upon request from the client. The address and/or port must already have been defined with a Listen directive.

SecureListen Directive

Description: Enables SSL encryption for the specified port
Syntax: SecureListen [IP-address:]portnumber Certificate-Name [MUTUAL]
Context: server config
Status: Base
Module: mod_nw_ssl

Specifies the port and the eDirectory based certificate name that will be used to enable SSL encryption. An optional third parameter also enables mutual authentication.

10.79 Apache Module mod_policy

Description: HTTP protocol compliance enforcement.
Status: Extension
ModuleIdentifier: policy_module
SourceFile: mod_policy.c

Summary

The HTTP protocol recommends that clients should be "liberal in what they accept", and servers "strict with what they send". In some cases it can be difficult to detect when a server or an application has been misconfigured, is serving uncacheable content or is behaving suboptimally, as an HTTP client might be compensating for the server. These problems can potentially lead to excessive bandwidth consumption, or a server outage under load.
The mod_policy module consists of a set of filters that test servers for HTTP protocol compliance. These tests allow the server administrator to log violations, or to reject responses outright, where certain defined conditions exist.

This could be used as a way to set minimum HTTP protocol compliance criteria for a RESTful application. Alternatively, a reverse proxy or cache could be configured to protect itself from misconfigured origin servers or unexpectedly uncacheable content, or as a mechanism to detect configuration mistakes within the server itself.

Directives
• PolicyConditional
• PolicyConditionalURL
• PolicyEnvironment
• PolicyFilter
• PolicyKeepalive
• PolicyKeepaliveURL
• PolicyLength
• PolicyLengthURL
• PolicyMaxage
• PolicyMaxageURL
• PolicyNocache
• PolicyNocacheURL
• PolicyType
• PolicyTypeURL
• PolicyValidation
• PolicyValidationURL
• PolicyVary
• PolicyVaryURL
• PolicyVersion
• PolicyVersionURL

See also
• Filters (p. 110)
• HTTP Protocol Compliance (p. 71)

Actions

If a policy is violated, one of the following actions can be taken:

ignore
    The policy check will be ignored for the given URL space, even if the filter is present.
log
    The policy check will be executed, and if a violation is detected a warning will be logged to the server error log, and a Warning header added to the response for the benefit of the client.
enforce
    The policy check will be executed, and if a violation is detected an error will be logged to the server error log, a Warning header added to the response, and a 502 Bad Gateway will be returned to the client. Optional links to explanatory documentation can be added to each error message, detailing the origin of each policy.

It is also possible to selectively disable all policies for a given URL space, should the need arise, using the PolicyFilter directive.
Alternatively, the PolicyEnvironment directive can be used to specify an environment variable, which if present, will cause the policies to be selectively downgraded or bypassed.

Policy Tests

The following policy filters are available:

POLICY_TYPE (p. 71) : Enforce valid content types
    Content types that are syntactically invalid or blank can be detected and the request rejected. Types can be restricted to a specific list containing optional wildcards ? and *.

POLICY_LENGTH (p. 71) : Enforce the presence of a Content-Length
    The length of responses can be specified in one of three ways: by specifying an explicit length in advance, using chunked encoding to set the length, or by setting no length at all and terminating the request when complete. The absence of a specific content length can affect the cacheability of the response, and prevents the use of keepalive during HTTP/1.0 requests. This policy enforces the presence of an explicit content length on the response.

POLICY_KEEPALIVE (p. 71) : Enforce the option to keepalive
    Less restrictive than the POLICY_LENGTH test, this policy enforces the possibility that the response can be kept alive. If the response doesn’t have a protocol defined zero length, and the response isn’t already an error, and the response has no Content-Length and is not an HTTP/1.1 response using Transfer-Encoding: chunked, then this response will be rejected.

POLICY_VARY (p. 71) : Enforce the absence of certain headers within Vary headers
    If the Vary header contains any of the headers specified, this policy will reject the request. The typical case is the presence of the User-Agent within Vary, which is likely to cause a denial of service condition to a cache.

POLICY_VALIDATION (p. 71) : Enforce the presence of Etag and/or Last-Modified
    The ability for a cache to determine whether a cached entity can be refreshed is dependent on whether a valid Etag and/or Last-Modified header is present to revalidate against.
    The absence of both headers, or invalid syntax in either header, will cause this policy to reject the request.

POLICY_CONDITIONAL (p. 71) : Enforce correct operation of conditional requests
    When conditional headers are present in the request, a server should respond with a 304 Not Modified or 412 Precondition Failed response where appropriate. A server may ignore conditional headers, and this affects the efficiency of the HTTP caching mechanism. This policy rejects requests where a conditional header is present, and a 304 or 412 response code was expected, but a 2xx response was seen instead.

POLICY_NOCACHE (p. 71) : Enforce cacheable responses
    When a response is encountered that declares itself explicitly uncacheable, the request is rejected. A response is considered uncacheable if it specifies any of the following:
    • Cache-Control: no-cache
    • Pragma: no-cache
    • Cache-Control: no-store
    • Cache-Control: private

POLICY_MAXAGE (p. 71) : Enforce a minimum maxage
    When a response is encountered where the freshness lifetime is less than the given value, or the freshness lifetime is heuristic, the request is rejected. A response is checked in the following order:
    • If s-maxage is present but too small; or
    • If max-age is present but too small; or
    • If Expires is present and invalid; or
    • Date is present and invalid; or
    • Expires minus Date is too small; or
    • No s-maxage, maxage, or Expires/Date declared at all

POLICY_VERSION (p. 71) : Enforce a minimum HTTP version within a request
    When a request is encountered with an HTTP version number less than the required minimum version, the request is rejected.
    The following version numbers are recognised:
    • HTTP/1.1
    • HTTP/1.0
    • HTTP/0.9

Example Configuration

A typical configuration protecting a server serving static content might be as follows:

    SetOutputFilter POLICY_TYPE;POLICY_LENGTH;POLICY_KEEPALIVE;POLICY_VARY;POLICY_VALIDATION;POLICY_CONDITIONAL;POLICY_NOCACHE;POLICY_MAXAGE;POLICY_VERSION

    # content type must be present and valid, but can be anything
    PolicyType enforce */*

    # reject if no explicitly declared content length
    PolicyLength enforce

    # covered by the policy length filter
    PolicyKeepalive ignore

    # reject if User-Agent appears within Vary headers
    PolicyVary enforce User-Agent

    # we want to enforce validation
    PolicyValidation enforce

    # non-functional conditional responses should be rejected
    PolicyConditional enforce

    # no-cache responses should be rejected
    PolicyNocache enforce

    # maxage must be at least a day
    PolicyMaxage enforce 86400

    # request version can be anything
    PolicyVersion ignore HTTP/1.1

    # suppress policy protection for server-status
    <Location "/server-status">
        PolicyFilter off
    </Location>

PolicyConditional Directive

Description: Enable the conditional request policy.
Syntax: PolicyConditional ignore|log|enforce
Default: ignore
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyConditional is only available in Apache 2.5.0 and later.

When logged or enforced, a response that should have been conditional but wasn’t will be rejected.

Example

    # non-functional conditional responses should be rejected
    PolicyConditional enforce

PolicyConditionalURL Directive

Description: URL describing the conditional request policy.
Syntax: PolicyConditionalURL url
Default: none
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyConditionalURL is only available in Apache 2.5.0 and later.

Specify the URL of the documentation describing the conditional request policy, to appear within error messages.
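The two conditional request directives can be combined. The following sketch logs violations instead of rejecting them, which may be useful while a policy is being introduced; the documentation URL is a placeholder and not a real page.

```
# Activate the conditional request test for responses
SetOutputFilter POLICY_CONDITIONAL

# Log violations only; switch to enforce once the logs are clean
PolicyConditional log

# Hypothetical page explaining the policy, to appear within error messages
PolicyConditionalURL http://intranet.example.com/policies/conditional.html
```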
PolicyEnvironment Directive

Description: Override policies based on an environment variable.
Syntax: PolicyEnvironment variable log-value ignore-value
Default: none
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyEnvironment is only available in Apache 2.5.0 and later.

Downgrade policies to logging only or ignored based on the presence of an environment variable. If the given variable is present and equal to the log-value, enforced policies will be logged instead. If the given variable is present and equal to the ignore-value, all policies will be ignored.

Example

    # downgrade if POLICY_CONTROL was present
    PolicyEnvironment POLICY_CONTROL log ignore

PolicyFilter Directive

Description: Enable or disable policies for the given URL space.
Syntax: PolicyFilter on|off
Default: on
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyFilter is only available in Apache 2.5.0 and later.

Master switch to enable or disable policies for a given URL space.

Example

    # enabled by default
    PolicyFilter on

    # suppress policy protection for server-status
    <Location "/server-status">
        PolicyFilter off
    </Location>

PolicyKeepalive Directive

Description: Enable the keepalive policy.
Syntax: PolicyKeepalive ignore|log|enforce
Default: ignore
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyKeepalive is only available in Apache 2.5.0 and later.

When logged or enforced, a response that lacks both an explicit Content-Length header and a Transfer-Encoding of chunked will be rejected.

Example

    # missing Content-Length or Transfer-Encoding should be rejected
    PolicyKeepalive enforce

PolicyKeepaliveURL Directive

Description: URL describing the keepalive policy.
Syntax: PolicyKeepaliveURL url
Default: none
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyKeepaliveURL is only available in Apache 2.5.0 and later.

Specify the URL of the documentation describing the keepalive policy, to appear within error messages.

PolicyLength Directive

Description: Enable the content length policy.
Syntax: PolicyLength ignore|log|enforce
Default: ignore
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyLength is only available in Apache 2.5.0 and later.

When logged or enforced, a response that lacks an explicit Content-Length header will be rejected.

Example

    # missing Content-Length header should be rejected
    PolicyLength enforce

PolicyLengthURL Directive

Description: URL describing the content length policy.
Syntax: PolicyLengthURL url
Default: none
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyLengthURL is only available in Apache 2.5.0 and later.

Specify the URL of the documentation describing the content length policy, to appear within error messages.

PolicyMaxage Directive

Description: Enable the caching minimum max-age policy.
Syntax: PolicyMaxage ignore|log|enforce age
Default: ignore
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyMaxage is only available in Apache 2.5.0 and later.

When logged or enforced, a response that lacks an explicit freshness lifetime defined with max-age, s-maxage or an Expires header, or where the explicit freshness lifetime is smaller than the given value, will be rejected.

Example

    # reject responses with a freshness lifetime shorter than a day
    PolicyMaxage enforce 86400

PolicyMaxageURL Directive

Description: URL describing the caching minimum freshness lifetime policy.
Syntax: PolicyMaxageURL url
Default: none
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyMaxageURL is only available in Apache 2.5.0 and later.

Specify the URL of the documentation describing the caching minimum freshness lifetime policy, to appear within error messages.

PolicyNocache Directive

Description: Enable the caching no-cache policy.
Syntax: PolicyNocache ignore|log|enforce
Default: ignore
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyNocache is only available in Apache 2.5.0 and later.

When logged or enforced, a response that defines itself uncacheable using the Cache-Control or Pragma headers will be rejected.

Example

    # Cache-Control: no-cache will be rejected
    PolicyNocache enforce

PolicyNocacheURL Directive

Description: URL describing the caching no-cache policy.
Syntax: PolicyNocacheURL url
Default: none
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyNocacheURL is only available in Apache 2.5.0 and later.

Specify the URL of the documentation describing the caching no-cache policy, to appear within error messages.

PolicyType Directive

Description: Enable the content type policy.
Syntax: PolicyType ignore|log|enforce type [ type [ ... ]]
Default: ignore
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyType is only available in Apache 2.5.0 and later.

When logged or enforced, a response that lacks a Content-Type header, where the Content-Type header is malformed, or where the header does not match the given pattern or patterns will be rejected.

Example

    # enforce json or XML
    PolicyType enforce application/json text/xml

Example

    # malformed content type should be rejected
    PolicyType enforce */*

PolicyTypeURL Directive

Description: URL describing the content type policy.
Syntax: PolicyTypeURL url
Default: none
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyTypeURL is only available in Apache 2.5.0 and later.

Specify the URL of the documentation describing the content type policy, to appear within error messages.

PolicyValidation Directive

Description: Enable the validation policy.
Syntax: PolicyValidation ignore|log|enforce
Default: ignore
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyValidation is only available in Apache 2.5.0 and later.

When logged or enforced, a response that has neither a valid ETag header nor a Last-Modified header, or where either header is syntactically incorrect, will be rejected.

Example

    # no ETag or Last-Modified will be rejected
    PolicyValidation enforce

PolicyValidationURL Directive

Description: URL describing the validation policy.
Syntax: PolicyValidationURL url
Default: none
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyValidationURL is only available in Apache 2.5.0 and later.

Specify the URL of the documentation describing the validation policy, to appear within error messages.

PolicyVary Directive

Description: Enable the Vary policy.
Syntax: PolicyVary ignore|log|enforce header [ header [ ... ]]
Default: ignore
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyVary is only available in Apache 2.5.0 and later.

When logged or enforced, a response that contains a Vary header which in turn contains one of the headers listed, will be rejected.

Example

    # reject responses with "User-Agent" listed in the Vary header
    PolicyVary enforce User-Agent

PolicyVaryURL Directive

Description: URL describing the Vary policy.
Syntax: PolicyVaryURL url
Default: none
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyVaryURL is only available in Apache 2.5.0 and later.

Specify the URL of the documentation describing the vary policy, to appear within error messages.

PolicyVersion Directive

Description: Enable the version policy.
Syntax: PolicyVersion ignore|log|enforce HTTP/0.9|HTTP/1.0|HTTP/1.1
Default: ignore
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyVersion is only available in Apache 2.5.0 and later.

When logged or enforced, a request with a version lower than specified will be rejected.

Example

    # reject requests with an HTTP version older than HTTP/1.1
    PolicyVersion enforce HTTP/1.1

PolicyVersionURL Directive

Description: URL describing the minimum request HTTP version policy.
Syntax: PolicyVersionURL url
Default: none
Context: server config, virtual host, directory
Status: Extension
Module: mod_policy
Compatibility: PolicyVersionURL is only available in Apache 2.5.0 and later.

Specify the URL of the documentation describing the minimum request HTTP version policy, to appear within error messages.

10.80 Apache Module mod_privileges

Description: Support for Solaris privileges and for running virtual hosts under different user IDs.
Status: Experimental
ModuleIdentifier: privileges_module
SourceFile: mod_privileges.c
Compatibility: Available in Apache 2.3 and up, on Solaris 10 and OpenSolaris platforms

Summary

This module enables different Virtual Hosts to run with different Unix User and Group IDs, and with different Solaris Privileges [58]. In particular, it offers a solution to the problem of privilege separation between different Virtual Hosts, first promised by the abandoned perchild MPM. It also offers other security enhancements.

Unlike perchild, mod_privileges is not itself an MPM.
It works within a processing model to set privileges and User/Group per request in a running process. It is therefore not compatible with a threaded MPM, and will refuse to run under one.

mod_privileges raises security issues similar to those of suexec (p. 115). But unlike suexec, it applies not only to CGI programs but to the entire request processing cycle, including in-process applications and subprocesses. It is ideally suited to running PHP applications under mod_php, which is also incompatible with threaded MPMs. It is also well-suited to other in-process scripting applications such as mod_perl, mod_python, and mod_ruby, and to applications implemented in C as Apache modules where privilege separation is an issue.

Directives
• DTracePrivileges
• PrivilegesMode
• VHostCGIMode
• VHostCGIPrivs
• VHostGroup
• VHostPrivs
• VHostSecure
• VHostUser

Security Considerations

mod_privileges introduces new security concerns in situations where untrusted code may be run within the webserver process. This applies to untrusted modules, and scripts running under modules such as mod_php or mod_perl. Scripts running externally (e.g. as CGI or in an appserver behind mod_proxy or mod_jk) are NOT affected.

The basic security concerns with mod_privileges are:
• Running as a system user introduces the same security issues as mod_suexec, and near-equivalents such as cgiwrap and suphp.
• A privileges-aware malicious user extension (module or script) could escalate its privileges to anything available to the httpd process in any virtual host. This introduces new risks if (and only if) mod_privileges is compiled with the BIG_SECURITY_HOLE option.
• A privileges-aware malicious user extension (module or script) could escalate privileges to set its user ID to another system user (and/or group).

[58] http://sosc-dr.sun.com/bigadmin/features/articles/least_privilege.jsp

The PrivilegesMode directive allows you to select either FAST or SECURE mode.
You can mix modes, using FAST mode for trusted users and fully-audited code paths, while imposing SECURE mode where an untrusted user has scope to introduce code.

Before describing the modes, we should also introduce the target use cases: Benign vs Hostile. In a benign situation, you want to separate users for their convenience, and protect them and the server against the risks posed by honest mistakes, but you trust your users are not deliberately subverting system security. In a hostile situation - e.g. commercial hosting - you may have users deliberately attacking the system or each other.

FAST mode
    In FAST mode, requests are run in-process with the selected uid/gid and privileges, so the overhead is negligible. This is suitable for benign situations, but is not secure against an attacker escalating privileges with an in-process module or script.

SECURE mode
    A request in SECURE mode forks a subprocess, which then drops privileges. This is a very similar case to running CGI with suexec, but for the entire request cycle, and with the benefit of fine-grained control of privileges.

You can select different PrivilegesModes for each virtual host, and even in a directory context within a virtual host. FAST mode is appropriate where the user(s) are trusted and/or have no privilege to load in-process code. SECURE mode is appropriate to cases where untrusted code might be run in-process. However, even in SECURE mode, there is no protection against a malicious user who is able to introduce privileges-aware code running before the start of the request-processing cycle.

DTracePrivileges Directive

Description: Determines whether the privileges required by dtrace are enabled.
Syntax: DTracePrivileges On|Off
Default: DTracePrivileges Off
Context: server config
Status: Experimental
Module: mod_privileges
Compatibility: Available on Solaris 10 and OpenSolaris with non-threaded MPMs (prefork or custom MPM).
This server-wide directive determines whether Apache will run with the privileges [59] required to run dtrace [60]. Note that DTracePrivileges On will not in itself activate DTrace, but DTracePrivileges Off will prevent it working.

[59] http://sosc-dr.sun.com/bigadmin/features/articles/least_privilege.jsp
[60] http://sosc-dr.sun.com/bigadmin/content/dtrace/

PrivilegesMode Directive

Description: Trade off processing speed and efficiency vs security against malicious privileges-aware code.
Syntax: PrivilegesMode FAST|SECURE|SELECTIVE
Default: PrivilegesMode FAST
Context: server config, virtual host, directory
Status: Experimental
Module: mod_privileges
Compatibility: Available on Solaris 10 and OpenSolaris with non-threaded MPMs (prefork or custom MPM).

This directive trades off performance vs security against malicious, privileges-aware code. In SECURE mode, each request runs in a secure subprocess, incurring a substantial performance penalty. In FAST mode, the server is not protected against escalation of privileges as discussed above.

This directive differs slightly between a <Directory> context (including equivalents such as Location/Files/If) and a top-level or <VirtualHost> context.

At top-level, it sets a default that will be inherited by virtualhosts. In a virtual host, FAST or SECURE mode acts on the entire HTTP request, and any settings in a <Directory> context will be ignored. A third pseudo-mode SELECTIVE defers the choice of FAST vs SECURE to directives in a <Directory> context.

In a <Directory> context, it is applicable only where SELECTIVE mode was set for the VirtualHost. Only FAST or SECURE can be set in this context (SELECTIVE would be meaningless).

Warning: Where SELECTIVE mode is selected for a virtual host, the activation of privileges must be deferred until after the mapping phase of request processing has determined what <Directory> context applies to the request.
This might give an attacker opportunities to introduce code through a RewriteMap running at top-level or <VirtualHost> context before privileges have been dropped and userid/gid set.

VHostCGIMode Directive

Description: Determines whether the virtualhost can run subprocesses, and the privileges available to subprocesses.
Syntax: VHostCGIMode On|Off|Secure
Default: VHostCGIMode On
Context: virtual host
Status: Experimental
Module: mod_privileges
Compatibility: Available on Solaris 10 and OpenSolaris with non-threaded MPMs (prefork or custom MPM).

Determines whether the virtual host is allowed to run fork and exec, the privileges [61] required to run subprocesses. If this is set to Off the virtualhost is denied the privileges and will not be able to run traditional CGI programs or scripts under the traditional mod_cgi, nor similar external programs such as those created by mod_ext_filter or RewriteMap prog. Note that it does not prevent CGI programs running under alternative process and security models such as mod_fcgid [62], which is a recommended solution in Solaris.

[61] http://sosc-dr.sun.com/bigadmin/features/articles/least_privilege.jsp
[62] https://httpd.apache.org/mod_fcgid/

If set to On or Secure, the virtual host is permitted to run external programs and scripts as above. Setting VHostCGIMode Secure has the effect of denying privileges to the subprocesses, as described for VHostSecure.

VHostCGIPrivs Directive

Description: Assign arbitrary privileges to subprocesses created by a virtual host.
Syntax: VHostCGIPrivs [+-]?privilege-name [[+-]?privilege-name] ...
Default: None
Context: virtual host
Status: Experimental
Module: mod_privileges
Compatibility: Available on Solaris 10 and OpenSolaris with non-threaded MPMs (prefork or custom MPM) and when mod_privileges is compiled with the BIG_SECURITY_HOLE compile-time option.
VHostCGIPrivs can be used to assign arbitrary privileges [63] to subprocesses created by a virtual host, as discussed under VHostCGIMode. Each privilege-name is the name of a Solaris privilege, such as file_setid or sys_nfs.

A privilege-name may optionally be prefixed by + or -, which will respectively allow or deny a privilege. If used with neither + nor -, all privileges otherwise assigned to the virtualhost will be denied. You can use this to override any of the default sets and construct your own privilege set.

[63] http://sosc-dr.sun.com/bigadmin/features/articles/least_privilege.jsp

Security: This directive can open huge security holes in Apache subprocesses, up to and including running them with root-level powers. Do not use it unless you fully understand what you are doing!

VHostGroup Directive

Description: Sets the Group ID under which a virtual host runs.
Syntax: VHostGroup unix-groupid
Default: Inherits the group id specified in Group
Context: virtual host
Status: Experimental
Module: mod_privileges
Compatibility: Available on Solaris 10 and OpenSolaris with non-threaded MPMs (prefork or custom MPM).

The VHostGroup directive sets the Unix group under which the server will process requests to a virtualhost. The group is set before the request is processed and reset afterwards using Solaris Privileges [64]. Since the setting applies to the process, this is not compatible with threaded MPMs.

[64] http://sosc-dr.sun.com/bigadmin/features/articles/least_privilege.jsp

unix-groupid is one of:

A group name
    Refers to the given group by name.
# followed by a group number
    Refers to a group by its number.

Security: This directive cannot be used to run Apache as root! Nevertheless, it opens potential security issues similar to those discussed in the suexec (p. 115) documentation.

See also
• Group
• SuexecUserGroup

VHostPrivs Directive

Description: Assign arbitrary privileges to a virtual host.
Syntax: VHostPrivs [+-]?privilege-name [[+-]?privilege-name] ...
Default: None
Context: virtual host
Status: Experimental
Module: mod_privileges
Compatibility: Available on Solaris 10 and OpenSolaris with non-threaded MPMs (prefork or custom MPM) and when mod_privileges is compiled with the BIG_SECURITY_HOLE compile-time option.

VHostPrivs can be used to assign arbitrary privileges [65] to a virtual host. Each privilege-name is the name of a Solaris privilege, such as file_setid or sys_nfs.

A privilege-name may optionally be prefixed by + or -, which will respectively allow or deny a privilege. If used with neither + nor -, all privileges otherwise assigned to the virtualhost will be denied. You can use this to override any of the default sets and construct your own privilege set.

Security: This directive can open huge security holes in Apache, up to and including running requests with root-level powers. Do not use it unless you fully understand what you are doing!

VHostSecure Directive

Description: Determines whether the server runs with enhanced security for the virtualhost.
Syntax: VHostSecure On|Off
Default: VHostSecure On
Context: virtual host
Status: Experimental
Module: mod_privileges
Compatibility: Available on Solaris 10 and OpenSolaris with non-threaded MPMs (prefork or custom MPM).

Determines whether the virtual host processes requests with security enhanced by removal of Privileges [66] that are rarely needed in a webserver, but which are available by default to a normal Unix user and may therefore be required by modules and applications. It is recommended that you retain the default (On) unless it prevents an application running. Since the setting applies to the process, this is not compatible with threaded MPMs.
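As an illustration of how the per-virtual-host directives of this module might be combined, the following sketch configures one virtual host for a hostile (shared hosting) situation. It assumes mod_privileges is loaded on Solaris with a non-threaded MPM. The host name, user name, group name and document root are hypothetical placeholders; only the directive names and their values come from this section.

```apache
# Hypothetical virtual host; names and paths are placeholders.
<VirtualHost *:80>
    ServerName customer1.example.com
    DocumentRoot "/var/www/customer1"

    # Process requests for this host as a dedicated user and group.
    VHostUser  customer1
    VHostGroup customer1

    # Keep rarely needed privileges removed (this is the default).
    VHostSecure On

    # Untrusted users may introduce code, so run each request in a subprocess.
    PrivilegesMode SECURE

    # Allow CGI, but deny the extra privileges to the subprocesses.
    VHostCGIMode Secure
</VirtualHost>
```

For a benign situation with trusted users, the same block with PrivilegesMode FAST would avoid the per-request fork at the cost of the escalation risk described under Security Considerations.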
Note: If VHostSecure prevents an application running, this may be a warning sign that the application should be reviewed for security.

[65] http://sosc-dr.sun.com/bigadmin/features/articles/least_privilege.jsp
[66] http://sosc-dr.sun.com/bigadmin/features/articles/least_privilege.jsp

VHostUser Directive

Description: Sets the User ID under which a virtual host runs.
Syntax: VHostUser unix-userid
Default: Inherits the userid specified in User
Context: virtual host
Status: Experimental
Module: mod_privileges
Compatibility: Available on Solaris 10 and OpenSolaris with non-threaded MPMs (prefork or custom MPM).

The VHostUser directive sets the Unix userid under which the server will process requests to a virtualhost. The userid is set before the request is processed and reset afterwards using Solaris Privileges [67]. Since the setting applies to the process, this is not compatible with threaded MPMs.

[67] http://sosc-dr.sun.com/bigadmin/features/articles/least_privilege.jsp

unix-userid is one of:

A username
    Refers to the given user by name.
# followed by a user number
    Refers to a user by its number.

Security: This directive cannot be used to run Apache as root! Nevertheless, it opens potential security issues similar to those discussed in the suexec (p. 115) documentation.

See also
• User
• SuexecUserGroup

10.81 Apache Module mod_proxy

Description: Multi-protocol proxy/gateway server
Status: Extension
Module Identifier: proxy_module
Source File: mod_proxy.c

Summary

Warning: Do not enable proxying with ProxyRequests until you have secured your server. Open proxy servers are dangerous both to your network and to the Internet at large.

mod_proxy and related modules implement a proxy/gateway for Apache HTTP Server, supporting a number of popular protocols as well as several different load balancing algorithms.
Third-party modules can add support for additional protocols and load balancing algorithms.

A set of modules must be loaded into the server to provide the necessary features. These modules can be included statically at build time or dynamically via the LoadModule directive. The set must include:
• mod_proxy, which provides basic proxy capabilities
• mod_proxy_balancer and one or more balancer modules if load balancing is required. (See mod_proxy_balancer for more information.)
• one or more proxy scheme, or protocol, modules:

    Protocol                                      Module
    AJP13 (Apache JServe Protocol version 1.3)    mod_proxy_ajp
    CONNECT (for SSL)                             mod_proxy_connect
    FastCGI                                       mod_proxy_fcgi
    ftp                                           mod_proxy_ftp
    HTTP/0.9, HTTP/1.0, and HTTP/1.1              mod_proxy_http
    SCGI                                          mod_proxy_scgi
    WS and WSS (Web-sockets)                      mod_proxy_wstunnel

In addition, extended features are provided by other modules. Caching is provided by mod_cache and related modules. The ability to contact remote servers using the SSL/TLS protocol is provided by the SSLProxy* directives of mod_ssl. These additional modules will need to be loaded and configured to take advantage of these features.

Directives
• BalancerGrowth
• BalancerInherit
• BalancerMember
• BalancerPersist
• NoProxy
• <Proxy>
• ProxyAddHeaders
• ProxyBadHeader
• ProxyBlock
• ProxyDomain
• ProxyErrorOverride
• ProxyIOBufferSize
• <ProxyMatch>
• ProxyMaxForwards
• ProxyPass
• ProxyPassInherit
• ProxyPassInterpolateEnv
• ProxyPassMatch
• ProxyPassReverse
• ProxyPassReverseCookieDomain
• ProxyPassReverseCookiePath
• ProxyPreserveHost
• ProxyReceiveBufferSize
• ProxyRemote
• ProxyRemoteMatch
• ProxyRequests
• ProxySet
• ProxySourceAddress
• ProxyStatus
• ProxyTimeout
• ProxyVia

See also
• mod_cache
• mod_proxy_ajp
• mod_proxy_balancer
• mod_proxy_connect
• mod_proxy_fcgi
• mod_proxy_ftp
• mod_proxy_hcheck
• mod_proxy_http
• mod_proxy_scgi
• mod_proxy_wstunnel
• mod_ssl
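The module set described in the summary can be loaded dynamically with LoadModule. The sketch below assumes a reverse proxy that speaks HTTP to its back ends and balances load by request count. The modules/ path, and the choice of mod_lbmethod_byrequests and mod_slotmem_shm as the accompanying balancer modules, are assumptions about a typical 2.4 build rather than requirements stated in this section.

```apache
# Core proxy support, required by every other proxy module.
LoadModule proxy_module modules/mod_proxy.so

# One protocol module per scheme that the proxy must speak.
LoadModule proxy_http_module modules/mod_proxy_http.so

# Only needed when load balancing is required.
LoadModule proxy_balancer_module modules/mod_proxy_balancer.so
LoadModule slotmem_shm_module modules/mod_slotmem_shm.so
LoadModule lbmethod_byrequests_module modules/mod_lbmethod_byrequests.so
```

A server that only forwards to a single HTTP back end would need just the first two lines.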
Forward Proxies and Reverse Proxies/Gateways

Apache HTTP Server can be configured in both a forward and reverse proxy (also known as gateway) mode.

An ordinary forward proxy is an intermediate server that sits between the client and the origin server. In order to get content from the origin server, the client sends a request to the proxy naming the origin server as the target. The proxy then requests the content from the origin server and returns it to the client. The client must be specially configured to use the forward proxy to access other sites.

A typical usage of a forward proxy is to provide Internet access to internal clients that are otherwise restricted by a firewall. The forward proxy can also use caching (as provided by mod_cache) to reduce network usage.

The forward proxy is activated using the ProxyRequests directive. Because forward proxies allow clients to access arbitrary sites through your server and to hide their true origin, it is essential that you secure your server so that only authorized clients can access the proxy before activating a forward proxy.

A reverse proxy (or gateway), by contrast, appears to the client just like an ordinary web server. No special configuration on the client is necessary. The client makes ordinary requests for content in the namespace of the reverse proxy. The reverse proxy then decides where to send those requests and returns the content as if it were itself the origin.

A typical usage of a reverse proxy is to provide Internet users access to a server that is behind a firewall. Reverse proxies can also be used to balance load among several back-end servers or to provide caching for a slower back-end server. In addition, reverse proxies can be used simply to bring several servers into the same URL space.

A reverse proxy is activated using the ProxyPass directive or the [P] flag to the RewriteRule directive.
It is not necessary to turn ProxyRequests on in order to configure a reverse proxy.

Basic Examples

The examples below give only a very basic idea to help you get started. Please read the documentation on the individual directives.

In addition, if you wish to have caching enabled, consult the documentation from mod_cache.

Reverse Proxy

    ProxyPass "/foo" "http://foo.example.com/bar"
    ProxyPassReverse "/foo" "http://foo.example.com/bar"

Forward Proxy

    ProxyRequests On
    ProxyVia On

    <Proxy "*">
      Require host internal.example.com
    </Proxy>

Access via Handler

You can also force a request to be handled as a reverse-proxy request, by creating a suitable Handler pass-through. The example configuration below will pass all requests for PHP scripts to the specified FastCGI server using reverse proxy:

Reverse Proxy PHP scripts

    <FilesMatch "\.php$">
        SetHandler "proxy:unix:/path/to/app.sock|fcgi://localhost/"
    </FilesMatch>

This feature is available in Apache HTTP Server 2.4.10 and later.

Workers

The proxy manages the configuration of origin servers and their communication parameters in objects called workers. There are two built-in workers: the default forward proxy worker and the default reverse proxy worker. Additional workers can be configured explicitly.

The two default workers have a fixed configuration and will be used if no other worker matches the request. They do not use HTTP Keep-Alive or connection pooling. The TCP connections to the origin server will instead be opened and closed for each request.

Explicitly configured workers are identified by their URL. They are usually created and configured using ProxyPass or ProxyPassMatch when used for a reverse proxy:

    ProxyPass "/example" "http://backend.example.com" connectiontimeout=5 timeout=30

This will create a worker associated with the origin server URL http://backend.example.com that will use the given timeout values.
When used in a forward proxy, workers are usually defined via the ProxySet directive:

    ProxySet http://backend.example.com connectiontimeout=5 timeout=30

or alternatively using <Proxy> and ProxySet:

    <Proxy "http://backend.example.com">
      ProxySet connectiontimeout=5 timeout=30
    </Proxy>

Using explicitly configured workers in the forward mode is not very common, because forward proxies usually communicate with many different origin servers. Creating explicit workers for some of the origin servers can still be useful if they are used very often. Explicitly configured workers have no concept of forward or reverse proxying by themselves. They encapsulate a common concept of communication with origin servers. A worker created by ProxyPass for use in a reverse proxy will also be used for forward proxy requests whenever the URL to the origin server matches the worker URL, and vice versa.

The URL identifying a direct worker is the URL of its origin server including any path components given:

    ProxyPass "/examples" "http://backend.example.com/examples"
    ProxyPass "/docs" "http://backend.example.com/docs"

This example defines two different workers, each using a separate connection pool and configuration.

Warning: Worker Sharing

Worker sharing happens if the worker URLs overlap, which occurs when the URL of some worker is a leading substring of the URL of another worker defined later in the configuration file. In the following example

    ProxyPass "/apps" "http://backend.example.com/" timeout=60
    ProxyPass "/examples" "http://backend.example.com/examples" timeout=10

the second worker isn't actually created. Instead the first worker is used. The benefit is that there is only one connection pool, so connections are more often reused. Note that all configuration attributes given explicitly for the later worker will be ignored. This will be logged as a warning. In the above example, the resulting timeout value for the URL /examples will be 60 instead of 10!
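One possible reordering of the example above that keeps both timeouts is sketched here. It relies only on the overlap rule as stated in this warning: the worker with the longer URL is defined first, so no earlier worker URL is a leading substring of a later one.

```apache
# Longer worker URL first: http://backend.example.com/examples is not a
# leading substring of http://backend.example.com/, so each ProxyPass
# gets its own worker and keeps its own timeout.
ProxyPass "/examples" "http://backend.example.com/examples" timeout=10
ProxyPass "/apps"     "http://backend.example.com/"         timeout=60
```

The trade-off is two separate connection pools to the same back-end host instead of one shared pool.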
If you want to avoid worker sharing, sort your worker definitions by URL length, starting with the longest worker URLs. If you want to maximize worker sharing, use the reverse sort order. See also the related warning about ordering ProxyPass directives.

Explicitly configured workers come in two flavors: direct workers and (load) balancer workers. They support many important configuration attributes which are described below in the ProxyPass directive. The same attributes can also be set using ProxySet.

The set of options available for a direct worker depends on the protocol which is specified in the origin server URL. Available protocols include ajp, fcgi, ftp, http and scgi.

Balancer workers are virtual workers that use direct workers known as their members to actually handle the requests. Each balancer can have multiple members. When it handles a request, it chooses a member based on the configured load balancing algorithm.

A balancer worker is created if its worker URL uses balancer as the protocol scheme. The balancer URL uniquely identifies the balancer worker. Members are added to a balancer using BalancerMember.

Note: DNS resolution for origin domains

The DNS domain resolution happens when the socket to the origin server is created for the first time. When connection pooling is used, the DNS resolution is performed again only when the ttl of the connection expires (please check the ProxyPass parameters). This means that httpd does not perform any DNS resolution caching.

Controlling Access to Your Proxy

You can control who can access your proxy via the <Proxy> control block as in the following example:

    <Proxy "*">
      Require ip 192.168.0
    </Proxy>

For more information on access control directives, see mod_authz_host.

Strictly limiting access is essential if you are using a forward proxy (using the ProxyRequests directive). Otherwise, your server can be used by any client to access arbitrary hosts while hiding his or her true identity. This is dangerous both for your network and for the Internet at large. When using a reverse proxy (using the ProxyPass directive with ProxyRequests Off), access control is less critical because clients can only contact the hosts that you have specifically configured.

See also the Proxy-Chain-Auth (p. 850) environment variable.

Slow Startup

If you're using the ProxyBlock directive, hostnames' IP addresses are looked up and cached during startup for later match tests. This may take a few seconds (or more) depending on the speed with which the hostname lookups occur.

Intranet Proxy

An Apache httpd proxy server situated in an intranet needs to forward external requests through the company's firewall (for this, configure the ProxyRemote directive to forward the respective scheme to the firewall proxy). However, when it has to access resources within the intranet, it can bypass the firewall when accessing hosts. The NoProxy directive is useful for specifying which hosts belong to the intranet and should be accessed directly.

Users within an intranet tend to omit the local domain name from their WWW requests, thus requesting "http://somehost/" instead of http://somehost.example.com/. Some commercial proxy servers let them get away with this and simply serve the request, implying a configured local domain. When the ProxyDomain directive is used and the server is configured for proxy service, Apache httpd can return a redirect response and send the client to the correct, fully qualified, server address.
This is the preferred method since the user's bookmark files will then contain fully qualified hosts.

Protocol Adjustments

For circumstances where mod_proxy is sending requests to an origin server that doesn't properly implement keepalives or HTTP/1.1, there are two environment variables (p. 92) that can force the request to use HTTP/1.0 with no keepalive. These are the force-proxy-request-1.0 and proxy-nokeepalive notes, and they are set via the SetEnv directive.

    <Location "/buggyappserver/">
      ProxyPass "http://buggyappserver:7001/foo/"
      SetEnv force-proxy-request-1.0 1
      SetEnv proxy-nokeepalive 1
    </Location>

Request Bodies

Some request methods such as POST include a request body. The HTTP protocol requires that requests which include a body either use chunked transfer encoding or send a Content-Length request header. When passing these requests on to the origin server, mod_proxy_http will always attempt to send the Content-Length. But if the body is large and the original request used chunked encoding, then chunked encoding may also be used in the upstream request. You can control this selection using environment variables (p. 92). Setting proxy-sendcl ensures maximum compatibility with upstream servers by always sending the Content-Length, while setting proxy-sendchunked minimizes resource usage by using chunked encoding.

Under some circumstances, the server must spool request bodies to disk to satisfy the requested handling of request bodies. For example, this spooling will occur if the original body was sent with chunked encoding (and is large), but the administrator has asked for backend requests to be sent with Content-Length or as HTTP/1.0. This spooling can also occur if the request body already has a Content-Length header, but the server is configured to filter incoming request bodies.

LimitRequestBody only applies to request bodies that the server will spool to disk.
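As a sketch of the environment-variable control described in this section, the fragment below asks mod_proxy_http to always send Content-Length for one proxied path. The /upload/ path and the back-end host name are hypothetical; proxy-sendcl is the variable named above.

```apache
# Hypothetical path and back-end host.
<Location "/upload/">
    ProxyPass "http://backend.example.com/upload/"
    # Always send Content-Length upstream. A large chunked request body
    # may then be spooled to disk before it is forwarded.
    SetEnv proxy-sendcl 1
</Location>
```

Replacing proxy-sendcl with proxy-sendchunked would select the opposite trade-off, favouring lower resource usage over compatibility with the upstream server.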
Reverse Proxy Request Headers

When acting in a reverse-proxy mode (using the ProxyPass directive, for example), mod_proxy_http adds several request headers in order to pass information to the origin server. These headers are:

X-Forwarded-For
    The IP address of the client.
X-Forwarded-Host
    The original host requested by the client in the Host HTTP request header.
X-Forwarded-Server
    The hostname of the proxy server.

Be careful when using these headers on the origin server, since they will contain more than one (comma-separated) value if the original request already contained one of these headers. For example, you can use %{X-Forwarded-For}i in the log format string of the origin server to log the original client's IP address, but you may get more than one address if the request passes through several proxies.

See also the ProxyPreserveHost and ProxyVia directives, which control other request headers.

Note: If you need to specify custom request headers to be added to the forwarded request, use the RequestHeader directive.

BalancerGrowth Directive

Description: Number of additional Balancers that can be added Post-configuration
Syntax: BalancerGrowth #
Default: BalancerGrowth 5
Context: server config, virtual host
Status: Extension
Module: mod_proxy
Compatibility: BalancerGrowth is only available in Apache HTTP Server 2.3.13 and later.

This directive allows for growth potential in the number of Balancers available for a virtualhost in addition to the number pre-configured. It only takes effect if there is at least one pre-configured Balancer.

BalancerInherit Directive

Description: Inherit proxy Balancers/Workers defined from the main server
Syntax: BalancerInherit On|Off
Default: BalancerInherit On
Context: server config, virtual host
Status: Extension
Module: mod_proxy
Compatibility: BalancerInherit is only available in Apache HTTP Server 2.4.5 and later.
This directive will cause the current server/vhost to "inherit" Balancers and Workers defined in the main server. This can cause issues and inconsistent behavior if using the Balancer Manager for dynamic changes and so should be disabled if using that feature.

The setting in the global server defines the default for all vhosts.

Disabling ProxyPassInherit also disables BalancerInherit.

BalancerMember Directive

Description: Add a member to a load balancing group
Syntax: BalancerMember [balancerurl] url [key=value [key=value ...]]
Context: directory
Status: Extension
Module: mod_proxy

This directive adds a member to a load balancing group. It can be used within a <Proxy balancer://...> container directive and can take any of the key value pair parameters available to ProxyPass directives.

One additional parameter is available only to BalancerMember directives: loadfactor. This is the member load factor, a number between 1 (default) and 100, which defines the weighted load to be applied to the member in question.

The balancerurl is only needed when not within a <Proxy balancer://...> container directive. It corresponds to the URL of a balancer defined in a ProxyPass directive.

The path component of the balancer URL in any <Proxy balancer://...> container directive is ignored.

Trailing slashes should typically be removed from the URL of a BalancerMember.

BalancerPersist Directive

Description: Attempt to persist changes made by the Balancer Manager across restarts.
Syntax: BalancerPersist On|Off
Default: BalancerPersist Off
Context: server config, virtual host
Status: Extension
Module: mod_proxy
Compatibility: BalancerPersist is only available in Apache HTTP Server 2.4.4 and later.

This directive will cause the shared memory storage associated with the balancers and balancer members to be persisted across restarts. This ensures that these local changes are not lost during the normal restart/graceful state transitions.
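The balancer directives above might be combined as in the following sketch. The balancer name mycluster, the /app path and the back-end addresses are hypothetical, and the comment about the share of requests assumes a request-counting load balancing method; the directives and the loadfactor parameter are those described in this section.

```apache
# Hypothetical balancer name and back-end addresses.
<Proxy "balancer://mycluster">
    # With a request-counting method, this member should receive roughly
    # twice the share of requests of the second member (default loadfactor 1).
    BalancerMember "http://192.168.1.50:8080" loadfactor=2
    BalancerMember "http://192.168.1.51:8080"
</Proxy>

ProxyPass        "/app" "balancer://mycluster"
ProxyPassReverse "/app" "balancer://mycluster"

# Keep changes made through the Balancer Manager across restarts.
BalancerPersist On
```

The member URLs are written without trailing slashes, following the recommendation in the BalancerMember description.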
NoProxy Directive

Description: Hosts, domains, or networks that will be connected to directly
Syntax: NoProxy host [host] ...
Context: server config, virtual host
Status: Extension
Module: mod_proxy

This directive is only useful for Apache httpd proxy servers within intranets. The NoProxy directive specifies a list of subnets, IP addresses, hosts and/or domains, separated by spaces. A request to a host which matches one or more of these is always served directly, without forwarding to the configured ProxyRemote proxy server(s).

Example

    ProxyRemote * http://firewall.example.com:81
    NoProxy .example.com 192.168.112.0/21

Each host argument to the NoProxy directive is one of the following types:

Domain
    A Domain is a partially qualified DNS domain name, preceded by a period. It represents a list of hosts which logically belong to the same DNS domain or zone (i.e., the hostnames all end in Domain).

    Examples

        .com
        .example.org.

    To distinguish Domains from Hostnames (both syntactically and semantically; a DNS domain can have a DNS A record, too!), Domains are always written with a leading period.

    Note: Domain name comparisons are done without regard to the case, and Domains are always assumed to be anchored in the root of the DNS tree; therefore, the two domains .ExAmple.com and .example.com. (note the trailing period) are considered equal. Since a domain comparison does not involve a DNS lookup, it is much more efficient than subnet comparison.

SubNet
    A SubNet is a partially qualified internet address in numeric (dotted quad) form, optionally followed by a slash and the netmask, specified as the number of significant bits in the SubNet. It is used to represent a subnet of hosts which can be reached over a common network interface. In the absence of the explicit net mask it is assumed that omitted (or zero valued) trailing digits specify the mask.
    (In this case, the netmask can only be multiples of 8 bits wide.) Examples:

    192.168 or 192.168.0.0
        the subnet 192.168.0.0 with an implied netmask of 16 valid bits (sometimes used in the netmask form 255.255.0.0)
    192.168.112.0/21
        the subnet 192.168.112.0/21 with a netmask of 21 valid bits (also used in the form 255.255.248.0)

    As a degenerate case, a SubNet with 32 valid bits is the equivalent to an IPAddr, while a SubNet with zero valid bits (e.g., 0.0.0.0/0) is the same as the constant _Default_, matching any IP address.

IPAddr
    An IPAddr represents a fully qualified internet address in numeric (dotted quad) form. Usually, this address represents a host, but there need not necessarily be a DNS domain name connected with the address.

    Example

        192.168.123.7

    Note: An IPAddr does not need to be resolved by the DNS system, so it can result in more effective Apache performance.

Hostname
    A Hostname is a fully qualified DNS domain name which can be resolved to one or more IPAddrs via the DNS domain name service. It represents a logical host (in contrast to Domains, see above) and must be resolvable to at least one IPAddr (or often to a list of hosts with different IPAddrs).

    Examples

        prep.ai.example.edu
        www.example.org

    Note: In many situations, it is more effective to specify an IPAddr in place of a Hostname since a DNS lookup can be avoided. Name resolution in Apache httpd can take a considerable amount of time when the connection to the name server uses a slow PPP link.

    Hostname comparisons are done without regard to the case, and Hostnames are always assumed to be anchored in the root of the DNS tree; therefore, the two hosts WWW.ExAmple.com and www.example.com. (note the trailing period) are considered equal.

See also
• DNS Issues (p. 121)

Proxy Directive

Description: Container for directives applied to proxied resources
Syntax: <Proxy wildcard-url> ...</Proxy>
Context: server config, virtual host
Status: Extension
Module: mod_proxy

Directives placed in <Proxy> sections apply only to matching proxied content. Shell-style wildcards are allowed.

For example, the following will allow only hosts in yournetwork.example.com to access content via your proxy server:

    <Proxy "*">
      Require host yournetwork.example.com
    </Proxy>

The following example will process all files in the foo directory of example.com through the INCLUDES filter when they are sent through the proxy server:

    <Proxy "http://example.com/foo/*">
      SetOutputFilter INCLUDES
    </Proxy>

The next example will allow web clients from the specified IP addresses to issue CONNECT requests to access the https://www.example.com/ SSL server if mod_proxy_connect is enabled.

    <Proxy "www.example.com:443">
      Require ip 192.168.0.0/16
    </Proxy>

Note: Differences from the Location configuration section

A backend URL matches the configuration section if it begins with the wildcard-url string, even if the last path segment in the directive only matches a prefix of the backend URL. For example, <Proxy "http://example.com/foo"> matches all of http://example.com/foo, http://example.com/foo/bar, and http://example.com/foobar. The matching of the final URL differs from the behavior of the <Location> section, which for purposes of this note treats the final path component as if it ended in a slash.

For more control over the matching, see <ProxyMatch>.

See also
• <ProxyMatch>

ProxyAddHeaders Directive

Description: Add proxy information in X-Forwarded-* headers
Syntax: ProxyAddHeaders Off|On
Default: ProxyAddHeaders On
Context: server config, virtual host, directory
Status: Extension
Module: mod_proxy
Compatibility: Available in version 2.3.10 and later

This directive determines whether or not proxy related information should be passed to the backend server through X-Forwarded-For, X-Forwarded-Host and X-Forwarded-Server HTTP headers.

Note: Effectiveness

This option is of use only for HTTP proxying, as handled by mod_proxy_http.

ProxyBadHeader Directive

Description: Determines how to handle bad header lines in a response
Syntax: ProxyBadHeader IsError|Ignore|StartBody
Default: ProxyBadHeader IsError
Context: server config, virtual host
Status: Extension
Module: mod_proxy

The ProxyBadHeader directive determines the behavior of mod_proxy if it receives syntactically invalid response header lines (i.e. containing no colon) from the origin server. The following arguments are possible:

IsError
    Abort the request and end up with a 502 (Bad Gateway) response. This is the default behavior.
Ignore
    Treat bad header lines as if they weren't sent.
StartBody
    When receiving the first bad header line, finish reading the headers and treat the remainder as body. This helps to work around buggy backend servers which forget to insert an empty line between the headers and the body.

ProxyBlock Directive

Description: Disallow proxy requests to certain hosts
Syntax: ProxyBlock *|hostname|partial-hostname [hostname|partial-hostname]...
Context: server config, virtual host
Status: Extension
Module: mod_proxy

The ProxyBlock directive can be used to block FTP or HTTP access to certain hosts via the proxy, based on a full or partial hostname match, or, if applicable, an IP address comparison.

Each of the arguments to the ProxyBlock directive can be either * or an alphanumeric string.
At startup, the module will attempt to resolve every alphanumeric string from a DNS name to a set of IP addresses, but any DNS errors are ignored.

If an asterisk "*" argument is specified, mod_proxy will deny access to all FTP or HTTP sites.

Otherwise, for any request for an HTTP or FTP resource via the proxy, mod_proxy will check the hostname of the request URI against each specified string. If a partial string match is found, access is denied. If no matches against hostnames are found, and a remote (forward) proxy is configured using ProxyRemote or ProxyRemoteMatch, access is allowed. If no remote (forward) proxy is configured, the IP address of the hostname from the URI is compared against all resolved IP addresses determined at startup. Access is denied if any match is found.

Note that the DNS lookups may slow down the startup time of the server.

Example
    ProxyBlock news.example.com auctions.example.com friends.example.com

Note that example would also be sufficient to match any of these sites.

Hosts would also be matched if referenced by IP address.

Note also that

    ProxyBlock *

blocks connections to all sites.

ProxyDomain Directive

Description: Default domain name for proxied requests
Syntax: ProxyDomain Domain
Context: server config, virtual host
Status: Extension
Module: mod_proxy

This directive is only useful for Apache httpd proxy servers within intranets. The ProxyDomain directive specifies the default domain which the Apache proxy server will belong to. If a request to a host without a domain name is encountered, a redirection response to the same host with the configured Domain appended will be generated.
Example
    ProxyRemote "*" "http://firewall.example.com:81"
    NoProxy ".example.com" "192.168.112.0/21"
    ProxyDomain ".example.com"

ProxyErrorOverride Directive

Description: Override error pages for proxied content
Syntax: ProxyErrorOverride On|Off
Default: ProxyErrorOverride Off
Context: server config, virtual host, directory
Status: Extension
Module: mod_proxy

This directive is useful for reverse-proxy setups where you want to have a common look and feel on the error pages seen by the end user. This also allows for included files (via mod_include's SSI) to get the error code and act accordingly. (Default behavior would display the error page of the proxied server. Turning this on shows the SSI Error message.)

This directive does not affect the processing of informational (1xx), normal success (2xx), or redirect (3xx) responses.

ProxyIOBufferSize Directive

Description: Determine size of internal data throughput buffer
Syntax: ProxyIOBufferSize bytes
Default: ProxyIOBufferSize 8192
Context: server config, virtual host
Status: Extension
Module: mod_proxy

The ProxyIOBufferSize directive adjusts the size of the internal buffer which is used as a scratchpad for the data between input and output. The size must be at least 512.

In almost every case, there's no reason to change that value.

If used with AJP, this directive sets the maximum AJP packet size in bytes. Values larger than 65536 are set to 65536. If you change it from the default, you must also change the packetSize attribute of your AJP connector on the Tomcat side! The attribute packetSize is only available in Tomcat 5.5.20+ and 6.0.2+.

Normally it is not necessary to change the maximum packet size. Problems with the default value have been reported when sending certificates or certificate chains.

ProxyMatch Directive

Description: Container for directives applied to regular-expression-matched proxied resources
Syntax: <ProxyMatch regex> ...</ProxyMatch>
Context: server config, virtual host
Status: Extension
Module: mod_proxy

The <ProxyMatch> directive is identical to the <Proxy> directive, except that it matches URLs using regular expressions.

From 2.4.8 onwards, named groups and backreferences are captured and written to the environment with the corresponding name prefixed with "MATCH_" and in upper case. This allows elements of URLs to be referenced from within expressions (p. 99) and modules like mod_rewrite. In order to prevent confusion, numbered (unnamed) backreferences are ignored. Use named groups instead.

    <ProxyMatch ^http://(?<sitename>[^/]+)>
        require ldap-group cn=%{env:MATCH_SITENAME},ou=combined,o=Example
    </ProxyMatch>

See also
• <Proxy>

ProxyMaxForwards Directive

Description: Maximum number of proxies that a request can be forwarded through
Syntax: ProxyMaxForwards number
Default: ProxyMaxForwards -1
Context: server config, virtual host
Status: Extension
Module: mod_proxy

The ProxyMaxForwards directive specifies the maximum number of proxies through which a request may pass if there's no Max-Forwards header supplied with the request. This may be set to prevent infinite proxy loops or a DoS attack.

Example
    ProxyMaxForwards 15

Note that setting ProxyMaxForwards is a violation of the HTTP/1.1 protocol (RFC2616), which forbids a Proxy setting Max-Forwards if the Client didn't set it. Earlier Apache httpd versions would always set it. A negative ProxyMaxForwards value, including the default -1, gives you protocol-compliant behavior but may leave you open to loops.

ProxyPass Directive

Description: Maps remote servers into the local server URL-space
Syntax: ProxyPass [path] !|url [key=value [key=value ...]] [nocanon] [interpolate] [noquery]
Context: server config, virtual host, directory
Status: Extension
Module: mod_proxy
Compatibility: Unix Domain Socket (UDS) support added in 2.4.7

This directive allows remote servers to be mapped into the space of the local server. The local server does not act as a proxy in the conventional sense but appears to be a mirror of the remote server. The local server is often called a reverse proxy or gateway. The path is the name of a local virtual path; url is a partial URL for the remote server and cannot include a query string.

Note: This directive cannot be used within a <Directory> context.

Warning: The ProxyRequests directive should usually be set off when using ProxyPass.

In 2.4.7 and later, support for using a Unix Domain Socket is available by using a target which prepends unix:/path/lis.sock|. For example, to proxy HTTP and target the UDS at /home/www.socket, you would use unix:/home/www.socket|http://localhost/whatever/.
Since the socket is local, the hostname used (in this case localhost) is moot, but it is passed as the Host: header value of the request.

Note: The path associated with the unix: URL is DefaultRuntimeDir aware.

Note: RewriteRule requires the [P,NE] option to prevent the '|' character from being escaped.

When used inside a <Location> section, the first argument is omitted and the local directory is obtained from the <Location>. The same will occur inside a <LocationMatch> section; however, ProxyPass does not interpret the regexp as such, so it is necessary to use ProxyPassMatch in this situation instead.

Suppose the local server has address http://example.com/; then

    <Location "/mirror/foo/">
        ProxyPass "http://backend.example.com/"
    </Location>

will cause a local request for http://example.com/mirror/foo/bar to be internally converted into a proxy request to http://backend.example.com/bar.

The ProxyPass directive is not supported in <Directory> or <Files> sections.

If you require a more flexible reverse-proxy configuration, see the RewriteRule directive with the [P] flag.

The following alternative syntax is possible; however, it can carry a performance penalty when present in very large numbers. The advantage of the below syntax is that it allows for dynamic control via the Balancer Manager (p. 824) interface:

    ProxyPass "/mirror/foo/" "http://backend.example.com/"

Warning: If the first argument ends with a trailing /, the second argument should also end with a trailing /, and vice versa. Otherwise, the resulting requests to the backend may miss some needed slashes and not deliver the expected results.

The ! directive is useful in situations where you don't want to reverse-proxy a subdirectory, e.g.

    <Location "/mirror/foo/">
        ProxyPass "http://backend.example.com/"
    </Location>
    <Location "/mirror/foo/i">
        ProxyPass "!"
    </Location>

    ProxyPass "/mirror/foo/i" "!"
    ProxyPass "/mirror/foo" "http://backend.example.com"

will proxy all requests to /mirror/foo to backend.example.com except requests made to /mirror/foo/i.
Warning: Ordering ProxyPass Directives
The configured ProxyPass and ProxyPassMatch rules are checked in the order of configuration. The first rule that matches wins. So usually you should sort conflicting ProxyPass rules starting with the longest URLs first. Otherwise, later rules for longer URLs will be hidden by any earlier rule which uses a leading substring of the URL. Note that there is some relation with worker sharing. In contrast, only one ProxyPass directive can be placed in a Location block, and the most specific location will take precedence.

For the same reasons, exclusions must come before the general ProxyPass directives.

ProxyPass key=value Parameters

In Apache HTTP Server 2.1 and later, mod_proxy supports pooled connections to a backend server. Connections created on demand can be retained in a pool for future use. Limits on the pool size and other settings can be coded on the ProxyPass directive using key=value parameters, described in the tables below.

By default, mod_proxy will allow and retain the maximum number of connections that could be used simultaneously by that web server child process. Use the max parameter to reduce the number from the default. Use the ttl parameter to set an optional time to live; connections which have been unused for at least ttl seconds will be closed. ttl can be used to avoid using a connection which is subject to closing because of the backend server's keep-alive timeout.

The pool of connections is maintained per web server child process, and max and other settings are not coordinated among all child processes, except when only one child process is allowed by configuration or MPM design.

Example
    ProxyPass "/example" "http://backend.example.com" max=20 ttl=120 retry=300

BalancerMember parameters

Parameter: min
Default: 0
Description: Minimum number of connection pool entries, unrelated to the actual number of connections. This only needs to be modified from the default for special circumstances where heap memory associated with the backend connections should be preallocated or retained.

Parameter: max
Default: 1...n
Description: Maximum number of connections that will be allowed to the backend server. The default for this limit is the number of threads per process in the active MPM. In the Prefork MPM, this is always 1; while with other MPMs, it is controlled by the ThreadsPerChild directive.

Parameter: smax
Default: max
Description: Retained connection pool entries above this limit are freed during certain operations if they have been unused for longer than the time to live, controlled by the ttl parameter. If the connection pool entry has an associated connection, it will be closed. This only needs to be modified from the default for special circumstances where connection pool entries and any associated connections which have exceeded the time to live need to be freed or closed more aggressively.

Parameter: acquire
Default: -
Description: If set, this will be the maximum time to wait for a free connection in the connection pool, in milliseconds. If there are no free connections in the pool, Apache httpd will return SERVER_BUSY status to the client.

Parameter: connectiontimeout
Default: timeout
Description: Connect timeout in seconds. The number of seconds Apache httpd waits for the creation of a connection to the backend to complete. By adding a postfix of ms, the timeout can also be set in milliseconds.

Parameter: disablereuse
Default: Off
Description: This parameter should be used when you want to force mod_proxy to immediately close a connection to the backend after being used, and thus, disable its persistent connection and pool for that backend. This helps in various situations where a firewall between Apache httpd and the backend server (regardless of protocol) tends to silently drop connections or when backends themselves may be under round-robin DNS. To disable connection pooling reuse, set this property value to On.

Parameter: enablereuse
Default: On
Description: This is the inverse of 'disablereuse' above, provided as a convenience for scheme handlers that require opt-in for connection reuse (such as mod_proxy_fcgi).

Parameter: flushpackets
Default: off
Description: Determines whether the proxy module will auto-flush the output brigade after each "chunk" of data. 'off' means that it will flush only when needed; 'on' means after each chunk is sent; and 'auto' means poll/wait for a period of time and flush if no input has been received for 'flushwait' milliseconds. Currently, this is in effect only for AJP.

Parameter: flushwait
Default: 10
Description: The time to wait for additional input, in milliseconds, before flushing the output brigade if 'flushpackets' is 'auto'.

Parameter: iobuffersize
Default: 8192
Description: Adjusts the size of the internal scratchpad IO buffer. This allows you to override the ProxyIOBufferSize for a specific worker. This must be at least 512 or set to 0 for the system default of 8192.

Parameter: keepalive
Default: Off
Description: This parameter should be used when you have a firewall between your Apache httpd and the backend server, which tends to drop inactive connections. This flag will tell the Operating System to send KEEP_ALIVE messages on inactive connections and thus prevent the firewall from dropping the connection. To enable keepalive, set this property value to On. The frequency of initial and subsequent TCP keepalive probes depends on global OS settings, and may be as high as 2 hours. To be useful, the frequency configured in the OS must be smaller than the threshold used by the firewall.

Parameter: lbset
Default: 0
Description: Sets the load balancer cluster set that the worker is a member of. The load balancer will try all members of a lower numbered lbset before trying higher numbered ones.

Parameter: ping
Default: 0
Description: Ping property tells the webserver to "test" the connection to the backend before forwarding the request. For negative values, the test is a simple socket check; for positive values, it's a more functional check, dependent upon the protocol. For AJP, it causes mod_proxy_ajp to send a CPING request on the ajp13 connection (implemented on Tomcat 3.3.2+, 4.1.28+ and 5.0.13+). For HTTP, it causes mod_proxy_http to send a 100-Continue to the backend (only valid for HTTP/1.1; for non HTTP/1.1 backends, this property has no effect). In both cases, the parameter is the delay in seconds to wait for the reply. This feature has been added to avoid problems with hung and busy backends. This will increase the network traffic during normal operation, which could be an issue, but it will lower the traffic in case some of the cluster nodes are down or busy. By adding a postfix of ms, the delay can also be set in milliseconds.

Parameter: receivebuffersize
Default: 0
Description: Adjusts the size of the explicit (TCP/IP) network buffer size for proxied connections. This allows you to override the ProxyReceiveBufferSize for a specific worker. This must be at least 512 or set to 0 for the system default.

Parameter: redirect
Default: -
Description: Redirection Route of the worker. This value is usually set dynamically to enable safe removal of the node from the cluster. If set, all requests without session id will be redirected to the BalancerMember that has route parameter equal to this value.

Parameter: retry
Default: 60
Description: Connection pool worker retry timeout in seconds. If the connection pool worker to the backend server is in the error state, Apache httpd will not forward any requests to that server until the timeout expires. This makes it possible to shut down the backend server for maintenance and bring it back online later. A value of 0 means always retry workers in an error state with no timeout.

Parameter: route
Default: -
Description: Route of the worker when used inside load balancer. The route is a value appended to session id.

Parameter: status
Default: -
Description: Single letter value defining the initial status of this worker.
    D: Worker is disabled and will not accept any requests; will be automatically retried.
    S: Worker is administratively stopped; will not accept requests and will not be automatically retried.
    I: Worker is in ignore-errors mode and will always be considered available.
    H: Worker is in hot-standby mode and will only be used if no other viable workers are available.
    E: Worker is in an error state.
    N: Worker is in drain mode and will only accept existing sticky sessions destined for itself and ignore all other requests.
Status can be set (which is the default) by prepending with '+' or cleared by prepending with '-'. Thus, a setting of 'S-E' sets this worker to Stopped and clears the in-error flag.

Parameter: timeout
Default: ProxyTimeout
Description: Connection timeout in seconds. The number of seconds Apache httpd waits for data sent by / to the backend.

Parameter: ttl
Default: -
Description: Time to live for inactive connections and associated connection pool entries, in seconds. Once reaching this limit, a connection will not be used again; it will be closed at some later time.

Parameter: flusher
Default: flush
Description: Name of the provider used by mod_proxy_fdpass. See the documentation of this module for more details.

If the Proxy directive scheme starts with balancer:// (e.g. balancer://cluster; any path information is ignored), then a virtual worker that does not really communicate with the backend server will be created. Instead, it is responsible for the management of several "real" workers. In that case, a special set of parameters can be added to this virtual worker. See mod_proxy_balancer for more information about how the balancer works.

Balancer parameters

Parameter: lbmethod
Default: byrequests
Description: Balancer load-balance method. Select the load-balancing scheduler method to use. Either byrequests, to perform weighted request counting; bytraffic, to perform weighted traffic byte count balancing; or bybusyness, to perform pending request balancing. The default is byrequests.

Parameter: maxattempts
Default: One less than the number of workers, or 1 with a single worker.
Description: Maximum number of failover attempts before giving up.

Parameter: nofailover
Default: Off
Description: If set to On, the session will break if the worker is in error state or disabled. Set this value to On if backend servers do not support session replication.

Parameter: stickysession
Default: -
Description: Balancer sticky session name. The value is usually set to something like JSESSIONID or PHPSESSIONID, and it depends on the backend application server that supports sessions. If the backend application server uses different names for cookies and url encoded id (like servlet containers), use | to separate them. The first part is for the cookie; the second is for the path. Available in Apache HTTP Server 2.4.4 and later.

Parameter: stickysessionsep
Default: "."
Description: Sets the separation symbol in the session cookie. Some backend application servers do not use the '.' as the symbol. For example, the Oracle Weblogic server uses '!'. The correct symbol can be set using this option. The setting of 'Off' signifies that no symbol is used.

Parameter: scolonpathdelim
Default: Off
Description: If set to On, the semi-colon character ';' will be used as an additional sticky session path delimiter/separator. This is mainly used to emulate mod_jk's behavior when dealing with paths such as JSESSIONID=6736bcf34;foo=aabfa

Parameter: timeout
Default: 0
Description: Balancer timeout in seconds. If set, this will be the maximum time to wait for a free worker. The default is to not wait.

Parameter: failonstatus
Default: -
Description: A single or comma-separated list of HTTP status codes. If set, this will force the worker into error state when the backend returns any status code in the list. Worker recovery behaves the same as other worker errors.

Parameter: failontimeout
Default: Off
Description: If set, an IO read timeout after a request is sent to the backend will force the worker into error state. Worker recovery behaves the same as other worker errors. Available in Apache HTTP Server 2.4.5 and later.

Parameter: nonce
Default: <auto>
Description: The protective nonce used in the balancer-manager application page. The default is to use an automatically determined UUID-based nonce, to provide for further protection for the page. If set, then the nonce is set to that value. A setting of None disables all nonce checking.
Note: In addition to the nonce, the balancer-manager page [...]

A sample balancer setup:

    ProxyPass "/special-area" "http://special.example.com" smax=5 max=10
    ProxyPass "/" "balancer://mycluster/" stickysession=JSESSIONID|jsessionid nofailover=On
    <Proxy "balancer://mycluster">
        BalancerMember ajp://1.2.3.4:8009
        BalancerMember ajp://1.2.3.5:8009 loadfactor=20
        # Less powerful server, don't send as many requests there,
        BalancerMember ajp://1.2.3.6:8009 loadfactor=5
    </Proxy>

Setting up a hot-standby that will only be used if no other members are available:

    ProxyPass "/" "balancer://hotcluster/"
    <Proxy "balancer://hotcluster">
        BalancerMember ajp://1.2.3.4:8009 loadfactor=1
        BalancerMember ajp://1.2.3.5:8009 loadfactor=2
        # The server below is on hot standby
        BalancerMember ajp://1.2.3.6:8009 status=+H
        ProxySet lbmethod=bytraffic
    </Proxy>

Additional ProxyPass Keywords

Normally, mod_proxy will canonicalise ProxyPassed URLs. But this may be incompatible with some backends, particularly those that make use of PATH_INFO. The optional nocanon keyword suppresses this and passes the URL path "raw" to the backend. Note that this keyword may affect the security of your backend, as it removes the normal limited protection against URL-based attacks provided by the proxy.

Normally, mod_proxy will include the query string when generating the SCRIPT_FILENAME environment variable. The optional noquery keyword (available in httpd 2.4.1 and later) prevents this.

The optional interpolate keyword, in combination with ProxyPassInterpolateEnv, causes the ProxyPass to interpolate environment variables, using the syntax ${VARNAME}. Note that many of the standard CGI-derived environment variables will not exist when this interpolation happens, so you may still have to resort to mod_rewrite for complex rules.
Also note that interpolation is not supported within the scheme portion of a URL. Dynamic determination of the scheme can be accomplished with mod_rewrite as in the following example.

    RewriteEngine On

    RewriteCond %{HTTPS} =off
    RewriteRule . - [E=protocol:http]
    RewriteCond %{HTTPS} =on
    RewriteRule . - [E=protocol:https]

    RewriteRule ^/mirror/foo/(.*) %{ENV:protocol}://backend.example.com/$1 [P]
    ProxyPassReverse "/mirror/foo/" "http://backend.example.com/"
    ProxyPassReverse "/mirror/foo/" "https://backend.example.com/"

ProxyPassInherit Directive

Description: Inherit ProxyPass directives defined from the main server
Syntax: ProxyPassInherit On|Off
Default: ProxyPassInherit On
Context: server config, virtual host
Status: Extension
Module: mod_proxy
Compatibility: ProxyPassInherit is only available in Apache HTTP Server 2.4.5 and later.

This directive will cause the current server/vhost to "inherit" ProxyPass directives defined in the main server. This can cause issues and inconsistent behavior if using the Balancer Manager for dynamic changes and so should be disabled if using that feature.

The setting in the global server defines the default for all vhosts.

Disabling ProxyPassInherit also disables BalancerInherit.

ProxyPassInterpolateEnv Directive

Description: Enable Environment Variable interpolation in Reverse Proxy configurations
Syntax: ProxyPassInterpolateEnv On|Off
Default: ProxyPassInterpolateEnv Off
Context: server config, virtual host, directory
Status: Extension
Module: mod_proxy

This directive, together with the interpolate argument to ProxyPass, ProxyPassReverse, ProxyPassReverseCookieDomain, and ProxyPassReverseCookiePath, enables reverse proxies to be dynamically configured using environment variables which may be set by another module such as mod_rewrite.
It affects the ProxyPass, ProxyPassReverse, ProxyPassReverseCookieDomain, and ProxyPassReverseCookiePath directives and causes them to substitute the value of an environment variable varname for the string ${varname} in configuration directives if the interpolate option is set.

Keep this turned off (for server performance) unless you need it!

ProxyPassMatch Directive

Description: Maps remote servers into the local server URL-space using regular expressions
Syntax: ProxyPassMatch [regex] !|url [key=value [key=value ...]]
Context: server config, virtual host, directory
Status: Extension
Module: mod_proxy

This directive is equivalent to ProxyPass but makes use of regular expressions instead of simple prefix matching. The supplied regular expression is matched against the url, and if it matches, the server will substitute any parenthesized matches into the given string and use it as a new url.

Note: This directive cannot be used within a <Directory> context.

Suppose the local server has address http://example.com/; then

    ProxyPassMatch "^/(.*\.gif)$" "http://backend.example.com/$1"

will cause a local request for http://example.com/foo/bar.gif to be internally converted into a proxy request to http://backend.example.com/foo/bar.gif.

Note
The URL argument must be parsable as a URL before regexp substitutions (as well as after). This limits the matches you can use. For instance, if we had used

    ProxyPassMatch "^(/.*\.gif)$" "http://backend.example.com:8000$1"

in our previous example, it would fail with a syntax error at server startup. This is a bug (PR 46665 in the ASF bugzilla), and the workaround is to reformulate the match:

    ProxyPassMatch "^/(.*\.gif)$" "http://backend.example.com:8000/$1"

The ! directive is useful in situations where you don't want to reverse-proxy a subdirectory.

When used inside a <LocationMatch> section, the first argument is omitted and the regexp is obtained from the <LocationMatch>.
If you require a more flexible reverse-proxy configuration, see the RewriteRule directive with the [P] flag.

Note: Default Substitution
When the URL parameter doesn't use any backreferences into the regular expression, the original URL will be appended to the URL parameter.

Warning: Security Warning
Take care when constructing the target URL of the rule, considering the security impact from allowing the client influence over the set of URLs to which your server will act as a proxy. Ensure that the scheme and hostname part of the URL is either fixed or does not allow the client undue influence.

ProxyPassReverse Directive

Description: Adjusts the URL in HTTP response headers sent from a reverse proxied server
Syntax: ProxyPassReverse [path] url [interpolate]
Context: server config, virtual host, directory
Status: Extension
Module: mod_proxy

This directive lets Apache httpd adjust the URL in the Location, Content-Location and URI headers on HTTP redirect responses. This is essential when Apache httpd is used as a reverse proxy (or gateway) to avoid bypassing the reverse proxy because of HTTP redirects on the backend servers which stay behind the reverse proxy.

Only the HTTP response headers specifically mentioned above will be rewritten. Apache httpd will not rewrite other response headers, nor will it by default rewrite URL references inside HTML pages. This means that if the proxied content contains absolute URL references, they will bypass the proxy. To rewrite HTML content to match the proxy, you must load and enable mod_proxy_html.

path is the name of a local virtual path; url is a partial URL for the remote server. These parameters are used the same way as for the ProxyPass directive.
For example, suppose the local server has address http://example.com/; then

    ProxyPass "/mirror/foo/" "http://backend.example.com/"
    ProxyPassReverse "/mirror/foo/" "http://backend.example.com/"
    ProxyPassReverseCookieDomain backend.example.com public.example.com
    ProxyPassReverseCookiePath "/" "/mirror/foo/"

will not only cause a local request for http://example.com/mirror/foo/bar to be internally converted into a proxy request to http://backend.example.com/bar (the functionality which ProxyPass provides here). It also takes care of redirects which the server backend.example.com sends when redirecting http://backend.example.com/bar to http://backend.example.com/quux. Apache httpd adjusts this to http://example.com/mirror/foo/quux before forwarding the HTTP redirect response to the client. Note that the hostname used for constructing the URL is chosen with respect to the setting of the UseCanonicalName directive.

Note that this ProxyPassReverse directive can also be used in conjunction with the proxy feature (RewriteRule ... [P]) from mod_rewrite because it doesn't depend on a corresponding ProxyPass directive.

The optional interpolate keyword, used together with ProxyPassInterpolateEnv, enables interpolation of environment variables specified using the format ${VARNAME}. Note that interpolation is not supported within the scheme portion of a URL.

When used inside a <Location> section, the first argument is omitted and the local directory is obtained from the <Location>. The same occurs inside a <LocationMatch> section, but will probably not work as intended, as ProxyPassReverse will interpret the regexp literally as a path; if needed in this situation, specify the ProxyPassReverse outside the section or in a separate <Location> section.

This directive is not supported in <Directory> or <Files> sections.
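As a minimal sketch of the section form described above, ProxyPass and ProxyPassReverse can both be placed inside a single <Location> block, each with its first argument omitted. The path and backend host simply reuse the names from the earlier example; they are illustrative rather than required values.

```apache
# Sketch: reverse-proxy /mirror/foo/ and adjust redirect headers.
# Inside <Location>, the local path is taken from the section itself,
# so both directives are given only the backend URL.
<Location "/mirror/foo/">
    ProxyPass "http://backend.example.com/"
    ProxyPassReverse "http://backend.example.com/"
</Location>
```

Stating the local path once, in the section header, keeps the request mapping and the header rewriting pointed at the same prefix.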
ProxyPassReverseCookieDomain Directive

Description: Adjusts the Domain string in Set-Cookie headers from a reverse-proxied server
Syntax: ProxyPassReverseCookieDomain internal-domain public-domain [interpolate]
Context: server config, virtual host, directory
Status: Extension
Module: mod_proxy

Usage is basically similar to ProxyPassReverse, but instead of rewriting headers that are a URL, this rewrites the domain string in Set-Cookie headers.

ProxyPassReverseCookiePath Directive

Description: Adjusts the Path string in Set-Cookie headers from a reverse-proxied server
Syntax: ProxyPassReverseCookiePath internal-path public-path [interpolate]
Context: server config, virtual host, directory
Status: Extension
Module: mod_proxy

Useful in conjunction with ProxyPassReverse in situations where backend URL paths are mapped to public paths on the reverse proxy. This directive rewrites the path string in Set-Cookie headers. If the beginning of the cookie path matches internal-path, the cookie path will be replaced with public-path.

In the example given with ProxyPassReverse, the directive:

    ProxyPassReverseCookiePath "/" "/mirror/foo/"

will rewrite a cookie with backend path / (or /example or, in fact, anything) to /mirror/foo/.

ProxyPreserveHost Directive

Description: Use incoming Host HTTP request header for proxy request
Syntax: ProxyPreserveHost On|Off
Default: ProxyPreserveHost Off
Context: server config, virtual host, directory
Status: Extension
Module: mod_proxy
Compatibility: Usable in directory context in 2.3.3 and later.

When enabled, this option will pass the Host: line from the incoming request to the proxied host, instead of the hostname specified in the ProxyPass line.

This option should normally be turned Off. It is mostly useful in special configurations like proxied mass name-based virtual hosting, where the original Host header needs to be evaluated by the backend server.
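The mass virtual hosting case mentioned above might look roughly like the following. This is only a sketch: the front-end name proxy.example.com, the wildcard alias, and the backend address backend.example.com:8080 are hypothetical and not taken from the original text.

```apache
# Sketch (hypothetical hostnames): one front-end virtual host forwards
# every request to a single backend, which then picks its own name-based
# virtual host from the client's original Host header.
<VirtualHost *:80>
    ServerName proxy.example.com
    ServerAlias *.example.com

    # Pass the incoming Host: header through instead of
    # "backend.example.com:8080" from the ProxyPass target.
    ProxyPreserveHost On
    ProxyPass "/" "http://backend.example.com:8080/"
</VirtualHost>
```

With the default (Off), the backend in this sketch would see backend.example.com:8080 as the Host header for every request and could not tell the sites apart.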
ProxyReceiveBufferSize Directive

Description: Network buffer size for proxied HTTP and FTP connections
Syntax: ProxyReceiveBufferSize bytes
Default: ProxyReceiveBufferSize 0
Context: server config, virtual host
Status: Extension
Module: mod_proxy

The ProxyReceiveBufferSize directive specifies an explicit (TCP/IP) network buffer size for proxied HTTP and FTP connections, for increased throughput. It has to be greater than 512 or set to 0 to indicate that the system's default buffer size should be used.

Example
    ProxyReceiveBufferSize 2048

ProxyRemote Directive

Description: Remote proxy used to handle certain requests
Syntax: ProxyRemote match remote-server
Context: server config, virtual host
Status: Extension
Module: mod_proxy

This defines remote proxies to this proxy. match is either the name of a URL-scheme that the remote server supports, or a partial URL for which the remote server should be used, or * to indicate the server should be contacted for all requests. remote-server is a partial URL for the remote server. Syntax:

    remote-server = scheme://hostname[:port]

scheme is effectively the protocol that should be used to communicate with the remote server; only http and https are supported by this module. When using https, the requests are forwarded through the remote proxy using the HTTP CONNECT method.

Example
    ProxyRemote http://goodguys.example.com/ http://mirrorguys.example.com:8000
    ProxyRemote * http://cleverproxy.localdomain
    ProxyRemote ftp http://ftpproxy.mydomain:8080

In the last example, the proxy will forward FTP requests, encapsulated as yet another HTTP proxy request, to another proxy which can handle them.

This option also supports reverse proxy configuration; a backend webserver can be embedded within a virtualhost URL space even if that server is hidden by another forward proxy.
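The reverse proxy case in the last paragraph can be sketched as follows. All hostnames, the port, and the /hidden/ path are hypothetical, and the fragment assumes the backend is reachable only through the named forward proxy.

```apache
# Sketch (hypothetical hosts): hidden.example.com can be reached only
# through the forward proxy at forwardproxy.example.com:8080, so
# requests for that backend are relayed through the remote proxy.
ProxyRemote "http://hidden.example.com/" "http://forwardproxy.example.com:8080"

# Clients see the backend mapped under /hidden/ on this server.
ProxyPass "/hidden/" "http://hidden.example.com/"
ProxyPassReverse "/hidden/" "http://hidden.example.com/"
```

The partial-URL form of match is used here so that only traffic for the one hidden backend is sent to the remote proxy; other ProxyPass targets would still be contacted directly.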
ProxyRemoteMatch Directive

Description: Remote proxy used to handle requests matched by regular expressions
Syntax:      ProxyRemoteMatch regex remote-server
Context:     server config, virtual host
Status:      Extension
Module:      mod_proxy

The ProxyRemoteMatch directive is identical to the ProxyRemote directive, except that the first argument is a regular expression match against the requested URL.

ProxyRequests Directive

Description: Enables forward (standard) proxy requests
Syntax:      ProxyRequests On|Off
Default:     ProxyRequests Off
Context:     server config, virtual host
Status:      Extension
Module:      mod_proxy

This allows or prevents Apache httpd from functioning as a forward proxy server. (Setting ProxyRequests to Off does not disable use of the ProxyPass directive.)

In a typical reverse proxy or gateway configuration, this option should be set to Off.

In order to get the functionality of proxying HTTP or FTP sites, you also need mod_proxy_http or mod_proxy_ftp (or both) present in the server.

In order to get the functionality of (forward) proxying HTTPS sites, you need mod_proxy_connect enabled in the server.

Warning
Do not enable proxying with ProxyRequests until you have secured your server. Open proxy servers are dangerous both to your network and to the Internet at large.

See also

• Forward and Reverse Proxies/Gateways

ProxySet Directive

Description: Set various Proxy balancer or member parameters
Syntax:      ProxySet url key=value [key=value ...]
Context:     directory
Status:      Extension
Module:      mod_proxy

This directive is used as an alternate method of setting any of the parameters available to Proxy balancers and workers normally done via the ProxyPass directive. If used within a container directive, the url argument is not required. As a side effect the respective balancer or worker gets created. This can be useful when doing reverse proxying via a RewriteRule instead of a ProxyPass directive.
    BalancerMember http://www2.example.com:8080 loadfactor=1
    BalancerMember http://www3.example.com:8080 loadfactor=2
    ProxySet lbmethod=bytraffic

    ProxySet keepalive=On

    ProxySet balancer://foo lbmethod=bytraffic timeout=15

    ProxySet ajp://backend:7001 timeout=15

Warning
Keep in mind that the same parameter key can have a different meaning depending on whether it is applied to a balancer or a worker, as shown by the two examples above regarding timeout.

ProxySourceAddress Directive

Description:   Set local IP address for outgoing proxy connections
Syntax:        ProxySourceAddress address
Context:       server config, virtual host
Status:        Extension
Module:        mod_proxy
Compatibility: Available in version 2.3.9 and later

This directive sets a specific local address to bind to when connecting to a backend server.

ProxyStatus Directive

Description: Show Proxy LoadBalancer status in mod_status
Syntax:      ProxyStatus Off|On|Full
Default:     ProxyStatus Off
Context:     server config, virtual host
Status:      Extension
Module:      mod_proxy

This directive determines whether or not proxy loadbalancer status data is displayed via the mod_status server-status page.

Note
Full is synonymous with On.

ProxyTimeout Directive

Description: Network timeout for proxied requests
Syntax:      ProxyTimeout seconds
Default:     Value of Timeout
Context:     server config, virtual host
Status:      Extension
Module:      mod_proxy

This directive allows a user to specify a timeout on proxy requests. This is useful when you have a slow or buggy application server which hangs, and you would rather return a timeout and fail gracefully than wait however long it takes the server to respond.

ProxyVia Directive

Description: Information provided in the Via HTTP response header for proxied requests
Syntax:      ProxyVia On|Off|Full|Block
Default:     ProxyVia Off
Context:     server config, virtual host
Status:      Extension
Module:      mod_proxy

This directive controls the use of the Via: HTTP header by the proxy.
Its intended use is to control the flow of proxy requests along a chain of proxy servers. See RFC 2616 (HTTP/1.1, http://www.ietf.org/rfc/rfc2616.txt), section 14.45 for an explanation of Via: header lines.

• If set to Off, which is the default, no special processing is performed. If a request or reply contains a Via: header, it is passed through unchanged.
• If set to On, each request and reply will get a Via: header line added for the current host.
• If set to Full, each generated Via: header line will additionally have the Apache httpd server version shown as a Via: comment field.
• If set to Block, every proxy request will have all its Via: header lines removed. No new Via: header will be generated.

10.82 Apache Module mod_proxy_ajp

Description:      AJP support module for mod_proxy
Status:           Extension
ModuleIdentifier: proxy_ajp_module
SourceFile:       mod_proxy_ajp.c

Summary

This module requires the service of mod_proxy. It provides support for the Apache JServ Protocol version 1.3 (hereafter AJP13).

Thus, in order to handle the AJP13 protocol, mod_proxy and mod_proxy_ajp have to be present in the server.

Warning
Do not enable proxying until you have secured your server (p. 787). Open proxy servers are dangerous both to your network and to the Internet at large.

Directives

This module provides no directives.

See also

• mod_proxy
• Environment Variable documentation (p. 92)

Usage

This module is used to reverse proxy to a backend application server (e.g. Apache Tomcat) using the AJP13 protocol.
The usage is similar to an HTTP reverse proxy, but uses the ajp:// prefix:

Simple Reverse Proxy

    ProxyPass "/app" "ajp://backend.example.com:8009/app"

Balancers may also be used:

Balancer Reverse Proxy

    <Proxy "balancer://cluster">
        BalancerMember ajp://app1.example.com:8009 loadfactor=1
        BalancerMember ajp://app2.example.com:8009 loadfactor=2
        ProxySet lbmethod=bytraffic
    </Proxy>
    ProxyPass "/app" "balancer://cluster/app"

Note that usually no ProxyPassReverse directive is necessary. The AJP request includes the original host header given to the proxy, and the application server can be expected to generate self-referential headers relative to this host, so no rewriting is necessary.

The main exception is when the URL path on the proxy differs from that on the backend. In this case, a redirect header can be rewritten relative to the original host URL (not the backend ajp:// URL), for example:

Rewriting Proxied Path

    ProxyPass "/apps/foo" "ajp://backend.example.com:8009/foo"
    ProxyPassReverse "/apps/foo" "http://www.example.com/foo"

However, it is usually better to deploy the application on the backend server at the same path as the proxy rather than to take this approach.

Environment Variables

Environment variables whose names have the prefix AJP_ are forwarded to the origin server as AJP request attributes (with the AJP_ prefix removed from the name of the key).

Overview of the protocol

The AJP13 protocol is packet-oriented. A binary format was presumably chosen over the more readable plain text for reasons of performance. The web server communicates with the servlet container over TCP connections. To cut down on the expensive process of socket creation, the web server will attempt to maintain persistent TCP connections to the servlet container, and to reuse a connection for multiple request/response cycles.

Once a connection is assigned to a particular request, it will not be used for any others until the request-handling cycle has terminated.
In other words, requests are not multiplexed over connections. This makes for much simpler code at either end of the connection, although it does cause more connections to be open at once.

Once the web server has opened a connection to the servlet container, the connection can be in one of the following states:

• Idle
  No request is being handled over this connection.
• Assigned
  The connection is handling a specific request.

Once a connection is assigned to handle a particular request, the basic request information (e.g. HTTP headers, etc.) is sent over the connection in a highly condensed form (e.g. common strings are encoded as integers). Details of that format are below in Request Packet Structure. If there is a body to the request (content-length > 0), that is sent in a separate packet immediately after.

At this point, the servlet container is presumably ready to start processing the request. As it does so, it can send the following messages back to the web server:

• SEND_HEADERS
  Send a set of headers back to the browser.
• SEND_BODY_CHUNK
  Send a chunk of body data back to the browser.
• GET_BODY_CHUNK
  Get further data from the request if it hasn't all been transferred yet. This is necessary because the packets have a fixed maximum size and arbitrary amounts of data can be included in the body of a request (for uploaded files, for example). (Note: this is unrelated to HTTP chunked transfer.)
• END_RESPONSE
  Finish the request-handling cycle.

Each message is accompanied by a differently formatted packet of data. See Response Packet Structures below for details.

Basic Packet Structure

There is a bit of an XDR heritage to this protocol, but it differs in lots of ways (no 4 byte alignment, for example).

AJP13 uses network byte order for all data types.

There are four data types in the protocol: bytes, booleans, integers and strings.

Byte: A single byte.

Boolean: A single byte, 1 = true, 0 = false.
Using other non-zero values as true (i.e. C-style) may work in some places, but it won't in others.

Integer: A number in the range of 0 to 2^16 - 1 (65535). Stored in 2 bytes with the high-order byte first.

String: A variable-sized string (length bounded by 2^16). Encoded with the length packed into two bytes first, followed by the string (including the terminating '\0'). Note that the encoded length does not include the trailing '\0'; it is like strlen. This is a touch confusing on the Java side, which is littered with odd autoincrement statements to skip over these terminators. I believe the reason this was done was to allow the C code to be extra efficient when reading strings which the servlet container is sending back: with the terminating \0 character, the C code can pass around references into a single buffer, without copying. If the \0 was missing, the C code would have to copy things out in order to get its notion of a string.

Packet Size

According to much of the code, the max packet size is 8 * 1024 bytes (8K). The actual length of the packet is encoded in the header.

Packet Headers

Packets sent from the server to the container begin with 0x1234. Packets sent from the container to the server begin with AB (that's the ASCII code for A followed by the ASCII code for B). After those first two bytes, there is an integer (encoded as above) with the length of the payload. Although this might suggest that the maximum payload could be as large as 2^16, in fact, the code sets the maximum to be 8K.

Packet Format (Server->Container)

    Byte       0      1      2   3             4...(n+3)
    Contents   0x12   0x34   Data Length (n)   Data

Packet Format (Container->Server)

    Byte       0      1      2   3             4...(n+3)
    Contents   A      B      Data Length (n)   Data

For most packets, the first byte of the payload encodes the type of message. The exception is for request body packets sent from the server to the container: they are sent with a standard packet header (0x1234 and then length of the packet), but without any prefix code after that.
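The data types and packet headers described above can be sketched in a few lines of Python. This is an illustrative sketch only, not code from mod_proxy_ajp or any servlet container: the helper names are hypothetical, the text does not specify a character encoding for strings (UTF-8 is assumed here), and the 8K limit is applied to the whole packet including its 4-byte header, which is also an assumption.

```python
import struct

MAX_PACKET_SIZE = 8 * 1024      # maximum packet size mentioned in the text
SERVER_MAGIC = b"\x12\x34"      # web server -> container
CONTAINER_MAGIC = b"AB"         # container -> web server


def encode_integer(value: int) -> bytes:
    """Encode an AJP13 integer: 2 bytes, high-order byte first."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("AJP13 integers must fit in two bytes")
    return struct.pack(">H", value)


def encode_boolean(flag: bool) -> bytes:
    """Encode a boolean as a single byte: 1 = true, 0 = false."""
    return b"\x01" if flag else b"\x00"


def encode_string(text: str) -> bytes:
    """Length (2 bytes, excluding the terminator), the bytes, then '\\0'."""
    raw = text.encode("utf-8")  # assumption: encoding is not stated in the text
    return encode_integer(len(raw)) + raw + b"\x00"


def decode_string(buf: bytes, offset: int = 0):
    """Return (text, next_offset); next_offset skips the trailing '\\0'."""
    (length,) = struct.unpack_from(">H", buf, offset)
    start = offset + 2
    return buf[start:start + length].decode("utf-8"), start + length + 1


def wrap_server_packet(payload: bytes) -> bytes:
    """Prefix a payload with 0x1234 and its length (server -> container)."""
    if len(payload) + 4 > MAX_PACKET_SIZE:
        raise ValueError("packet would exceed the 8K maximum")
    return SERVER_MAGIC + encode_integer(len(payload)) + payload


def unwrap_container_packet(packet: bytes) -> bytes:
    """Check the 'AB' header of a container -> server packet, return payload."""
    if len(packet) < 4 or packet[:2] != CONTAINER_MAGIC:
        raise ValueError("not a container -> server packet")
    (length,) = struct.unpack_from(">H", packet, 2)
    payload = packet[4:4 + length]
    if len(payload) != length:
        raise ValueError("truncated packet")
    return payload
```

For example, under these assumptions encode_string("GET") yields the bytes 00 03 47 45 54 00, and wrapping an empty payload yields the four bytes 0x12 0x34 0x00 0x00.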
The web server can send the following messages to the servlet container:

    Code   Type of Packet    Meaning
    2      Forward Request   Begin the request-processing cycle with the following data
    7      Shutdown          The web server asks the container to shut itself down.
    8      Ping              The web server asks the container to take control (secure login phase).
    10     CPing             The web server asks the container to respond quickly with a CPong.
    none   Data              Size (2 bytes) and corresponding body data.

To ensure some basic security, the container will only actually do the Shutdown if the request comes from the same machine on which it's hosted.

The first Data packet is sent immediately after the Forward Request by the web server.

The servlet container can send the following types of messages to the web server:

    Code   Type of Packet    Meaning
    3      Send Body Chunk   Send a chunk of the body from the servlet container to the web server (and presumably, onto the browser).
    4      Send Headers      Send the response headers from the servlet container to the web server (and presumably, onto the browser).
    5      End Response      Marks the end of the response (and thus the request-handling cycle).
    6      Get Body Chunk    Get further data from the request if it hasn't all been transferred yet.
    9      CPong Reply       The reply to a CPing request

Each of the above messages has a different internal structure, detailed below.
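As a rough illustration of how the prefix codes in these two tables are used, the following Python sketch builds a CPing packet and classifies a packet received from the container. The dictionaries simply restate the tables; the function names are hypothetical and the error handling is a guess at what a cautious reader of the protocol might do, not what mod_proxy_ajp does.

```python
import struct
from typing import Tuple

# Prefix codes as listed in the tables above
SERVER_TO_CONTAINER = {2: "Forward Request", 7: "Shutdown", 8: "Ping", 10: "CPing"}
CONTAINER_TO_SERVER = {
    3: "Send Body Chunk",
    4: "Send Headers",
    5: "End Response",
    6: "Get Body Chunk",
    9: "CPong Reply",
}


def build_cping() -> bytes:
    """Build a CPing packet: 0x1234, payload length 1, prefix code 10."""
    payload = bytes([10])
    return b"\x12\x34" + struct.pack(">H", len(payload)) + payload


def classify_container_packet(packet: bytes) -> Tuple[str, bytes]:
    """Return (message type, remaining payload) for a container packet."""
    if len(packet) < 5 or packet[:2] != b"AB":
        raise ValueError("not a container -> server packet")
    (length,) = struct.unpack_from(">H", packet, 2)
    payload = packet[4:4 + length]
    if length == 0 or len(payload) != length:
        raise ValueError("empty or truncated payload")
    name = CONTAINER_TO_SERVER.get(payload[0])
    if name is None:
        raise ValueError("unknown prefix code %d" % payload[0])
    return name, payload[1:]
```

A health check along these lines would send build_cping() and expect the reply to classify as "CPong Reply" with no further payload.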
Request Packet Structure

For messages from the server to the container of type Forward Request:

    AJP13_FORWARD_REQUEST :=
        prefix_code         (byte) 0x02 = JK_AJP13_FORWARD_REQUEST
        method              (byte)
        protocol            (string)
        req_uri             (string)
        remote_addr         (string)
        remote_host         (string)
        server_name         (string)
        server_port         (integer)
        is_ssl              (boolean)
        num_headers         (integer)
        request_headers    *(req_header_name req_header_value)
        attributes         *(attribute_name attribute_value)
        request_terminator  (byte) 0xFF

The request headers have the following structure:

    req_header_name :=
        sc_req_header_name | (string)  [see below for how this is parsed]

    sc_req_header_name := 0xA0xx (integer)

    req_header_value := (string)

The attributes are optional and have the following structure:

    attribute_name := sc_a_name | (sc_a_req_attribute string)

    attribute_value := (string)

Note that the all-important header is content-length, because it determines whether or not the container looks for another packet immediately.

Detailed description of the elements of Forward Request

Request prefix

For all requests, this will be 2. See above for details on other Prefix codes.

Method

The HTTP method, encoded as a single byte:

    Command Name       Code
    OPTIONS            1
    GET                2
    HEAD               3
    POST               4
    PUT                5
    DELETE             6
    TRACE              7
    PROPFIND           8
    PROPPATCH          9
    MKCOL              10
    COPY               11
    MOVE               12
    LOCK               13
    UNLOCK             14
    ACL                15
    REPORT             16
    VERSION-CONTROL    17
    CHECKIN            18
    CHECKOUT           19
    UNCHECKOUT         20
    SEARCH             21
    MKWORKSPACE        22
    UPDATE             23
    LABEL              24
    MERGE              25
    BASELINE_CONTROL   26
    MKACTIVITY         27

Later versions of ajp13 will transport additional methods, even if they are not in this list.

protocol, req_uri, remote_addr, remote_host, server_name, server_port, is_ssl

These are all fairly self-explanatory. Each of these is required, and will be sent for every request.

Headers

The structure of request headers is the following: First, the number of headers num_headers is encoded.
Then, a series of header name req_header_name / value req_header_value pairs follows. Common header names are encoded as integers, to save space. If the header name is not in the list of basic headers, it is encoded normally (as a string, with prefixed length). The list of common headers sc_req_header_name and their codes is as follows (all are case-sensitive):

    Name              Code value   Code name
    accept            0xA001       SC_REQ_ACCEPT
    accept-charset    0xA002       SC_REQ_ACCEPT_CHARSET
    accept-encoding   0xA003       SC_REQ_ACCEPT_ENCODING
    accept-language   0xA004       SC_REQ_ACCEPT_LANGUAGE
    authorization     0xA005       SC_REQ_AUTHORIZATION
    connection        0xA006       SC_REQ_CONNECTION
    content-type      0xA007       SC_REQ_CONTENT_TYPE
    content-length    0xA008       SC_REQ_CONTENT_LENGTH
    cookie            0xA009       SC_REQ_COOKIE
    cookie2           0xA00A       SC_REQ_COOKIE2
    host              0xA00B       SC_REQ_HOST
    pragma            0xA00C       SC_REQ_PRAGMA
    referer           0xA00D       SC_REQ_REFERER
    user-agent        0xA00E       SC_REQ_USER_AGENT

The Java code that reads this grabs the first two-byte integer and if it sees an '0xA0' in the most significant byte, it uses the integer in the second byte as an index into an array of header names. If the first byte is not 0xA0, it assumes that the two-byte integer is the length of a string, which is then read in.

This works on the assumption that no header names will have length greater than 0x9FFF (==0xA000 - 1), which is perfectly reasonable, though somewhat arbitrary.

Note
The content-length header is extremely important. If it is present and non-zero, the container assumes that the request has a body (a POST request, for example), and immediately reads a separate packet off the input stream to get that body.

Attributes

The attributes prefixed with a ? (e.g. ?context) are all optional. For each, there is a single byte code to indicate the type of attribute, and then its value (string or integer). They can be sent in any order (though the C code always sends them in the order listed below). A special terminating code is sent to signal the end of the list of optional attributes.
The list of byte codes is:

    Information      Code Value   Type Of Value   Note
    ?context         0x01         -               Not currently implemented
    ?servlet_path    0x02         -               Not currently implemented
    ?remote_user     0x03         String
    ?auth_type       0x04         String
    ?query_string    0x05         String
    ?jvm_route       0x06         String
    ?ssl_cert        0x07         String
    ?ssl_cipher      0x08         String
    ?ssl_session     0x09         String
    ?req_attribute   0x0A         String          Name (the name of the attribute follows)
    ?ssl_key_size    0x0B         Integer
    are_done         0xFF         -               request_terminator

The context and servlet_path are not currently set by the C code, and most of the Java code completely ignores whatever is sent over for those fields (and some of it will actually break if a string is sent along after one of those codes). I don't know if this is a bug or an unimplemented feature or just vestigial code, but it's missing from both sides of the connection.

The remote_user and auth_type presumably refer to HTTP-level authentication, and communicate the remote user's username and the type of authentication used to establish their identity (e.g. Basic, Digest).

The query_string, ssl_cert, ssl_cipher, and ssl_session refer to the corresponding pieces of HTTP and HTTPS.

The jvm_route is used to support sticky sessions: associating a user's session with a particular Tomcat instance in the presence of multiple, load-balancing servers.

Beyond this list of basic attributes, any number of other attributes can be sent via the req_attribute code 0x0A. A pair of strings to represent the attribute name and value are sent immediately after each instance of that code. Environment values are passed in via this method.

Finally, after all the attributes have been sent, the attribute terminator, 0xFF, is sent. This signals both the end of the list of attributes and also the end of the Request Packet.

Response Packet Structure

For messages which the container can send back to the server:
    AJP13_SEND_BODY_CHUNK :=
        prefix_code        3
        chunk_length       (integer)
        chunk             *(byte)
        chunk_terminator   (byte) 0x00

    AJP13_SEND_HEADERS :=
        prefix_code        4
        http_status_code   (integer)
        http_status_msg    (string)
        num_headers        (integer)
        response_headers  *(res_header_name header_value)

    res_header_name :=
        sc_res_header_name | (string)  [see below for how this is parsed]

    sc_res_header_name := 0xA0 (byte)

    header_value := (string)

    AJP13_END_RESPONSE :=
        prefix_code        5
        reuse              (boolean)

    AJP13_GET_BODY_CHUNK :=
        prefix_code        6
        requested_length   (integer)

Details:

Send Body Chunk

The chunk is basically binary data, and is sent directly back to the browser.

Send Headers

The status code and message are the usual HTTP things (e.g. 200 and OK). The response header names are encoded the same way the request header names are. See header encoding above for details about how the codes are distinguished from the strings.

The codes for common headers are:

    Name               Code value
    Content-Type       0xA001
    Content-Language   0xA002
    Content-Length     0xA003
    Date               0xA004
    Last-Modified      0xA005
    Location           0xA006
    Set-Cookie         0xA007
    Set-Cookie2        0xA008
    Servlet-Engine     0xA009
    Status             0xA00A
    WWW-Authenticate   0xA00B

After the code or the string header name, the header value is immediately encoded.

End Response

Signals the end of this request-handling cycle. If the reuse flag is true (anything other than 0 in the actual C code), this TCP connection can now be used to handle new incoming requests. If reuse is false (==0), the connection should be closed.

Get Body Chunk

The container asks for more data from the request (if the body was too large to fit in the first packet sent over, or when the request is chunked). The server will send a body packet back with an amount of data which is the minimum of the requested length, the maximum send body size (8186 (8 Kbytes - 6)), and the number of bytes actually left to send from the request body. If there is no more data in the body (i.e.
the servlet container is trying to read past the end of the body), the server will send back an empty packet, which is a body packet with a payload length of 0 (0x12,0x34,0x00,0x00).

10.83 Apache Module mod_proxy_balancer

Description:      mod_proxy extension for load balancing
Status:           Extension
ModuleIdentifier: proxy_balancer_module
SourceFile:       mod_proxy_balancer.c

Summary

This module requires the service of mod_proxy and it provides load balancing for all the supported protocols. The most important ones are:

• HTTP, using mod_proxy_http
• FTP, using mod_proxy_ftp
• AJP13, using mod_proxy_ajp
• WebSocket, using mod_proxy_wstunnel

The load balancing scheduler algorithm is not provided by this module but by other ones such as:

• mod_lbmethod_byrequests
• mod_lbmethod_bytraffic
• mod_lbmethod_bybusyness
• mod_lbmethod_heartbeat

Thus, in order to get the ability of load balancing, mod_proxy, mod_proxy_balancer and at least one of the load balancing scheduler algorithm modules have to be present in the server.

Warning
Do not enable proxying until you have secured your server (p. 787). Open proxy servers are dangerous both to your network and to the Internet at large.

Directives

This module provides no directives.

See also

• mod_proxy

Load balancer scheduler algorithm

At present, there are 3 load balancer scheduler algorithms available for use: Request Counting, Weighted Traffic Counting and Pending Request Counting. These are controlled via the lbmethod value of the Balancer definition. See the ProxyPass directive for more information, especially regarding how to configure the Balancer and BalancerMembers.

Load balancer stickiness

The balancer supports stickiness. When a request is proxied to some back-end, then all following requests from the same user should be proxied to the same back-end. Many load balancers implement this feature via a table that maps client IP addresses to back-ends.
This approach is transparent to clients and back-ends, but suffers from some problems: unequal load distribution if clients are themselves hidden behind proxies, stickiness errors when a client uses a dynamic IP address that changes during a session, and loss of stickiness if the mapping table overflows.

The module mod_proxy_balancer implements stickiness on top of two alternative means: cookies and URL encoding. Providing the cookie can be done either by the back-end or by the Apache web server itself. The URL encoding is usually done on the back-end.

Examples of a balancer configuration

Before we dive into the technical details, here's an example of how you might use mod_proxy_balancer to provide load balancing between two back-end servers:

    <Proxy "balancer://mycluster">
        BalancerMember http://192.168.1.50:80
        BalancerMember http://192.168.1.51:80
    </Proxy>
    ProxyPass "/test" "balancer://mycluster"
    ProxyPassReverse "/test" "balancer://mycluster"

Another example of how to provide load balancing with stickiness using mod_headers, even if the back-end server does not set a suitable session cookie:

    Header add Set-Cookie "ROUTEID=.%{BALANCER_WORKER_ROUTE}e; path=/" env=BALANCER_ROUTE_CHANG
    <Proxy "balancer://mycluster">
        BalancerMember http://192.168.1.50:80 route=1
        BalancerMember http://192.168.1.51:80 route=2
        ProxySet stickysession=ROUTEID
    </Proxy>
    ProxyPass "/test" "balancer://mycluster"
    ProxyPassReverse "/test" "balancer://mycluster"

Exported Environment Variables

At present there are 6 environment variables exported:

BALANCER_SESSION_STICKY: This is assigned the stickysession value used for the current request. It is the name of the cookie or request parameter used for sticky sessions.

BALANCER_SESSION_ROUTE: This is assigned the route parsed from the current request.

BALANCER_NAME: This is assigned the name of the balancer used for the current request. The value is something like balancer://foo.

BALANCER_WORKER_NAME: This is assigned the name of the worker used for the current request. The value is something like http://hostA:1234.
BALANCER_WORKER_ROUTE: This is assigned the route of the worker that will be used for the current request.

BALANCER_ROUTE_CHANGED: This is set to 1 if the session route does not match the worker route (BALANCER_SESSION_ROUTE != BALANCER_WORKER_ROUTE) or the session does not yet have an established route. This can be used to determine when/if the client needs to be sent an updated route when sticky sessions are used.

Enabling Balancer Manager Support

This module requires the service of mod_status. Balancer manager enables dynamic update of balancer members. You can use balancer manager to change the balance factor of a particular member, or put it in offline mode.

Thus, in order to get the ability of load balancer management, mod_status and mod_proxy_balancer have to be present in the server.

To enable load balancer management for browsers from the example.com domain, add this code to your httpd.conf configuration file:

    <Location "/balancer-manager">
        SetHandler balancer-manager
        Require host example.com
    </Location>

You can now access load balancer manager by using a Web browser to access the page http://your.server.name/balancer-manager. Please note that only Balancers defined outside of containers can be dynamically controlled by the Manager.

Details on load balancer stickiness

When using cookie based stickiness, you need to configure the name of the cookie that contains the information about which back-end to use. This is done via the stickysession attribute added to either ProxyPass or ProxySet. The name of the cookie is case-sensitive. The balancer extracts the value of the cookie and looks for a member worker with route equal to that value. The route must also be set in either ProxyPass or ProxySet. The cookie can either be set by the back-end, or, as shown in the above example, by the Apache web server itself.

Some back-ends use a slightly different form of stickiness cookie, for instance Apache Tomcat.
Tomcat adds the name of the Tomcat instance to the end of its session id cookie, separated with a dot (.) from the session id. Thus if the Apache web server finds a dot in the value of the stickiness cookie, it only uses the part behind the dot to search for the route. In order to let Tomcat know about its instance name, you need to set the attribute jvmRoute inside the Tomcat configuration file conf/server.xml to the value of the route of the worker that connects to the respective Tomcat. The name of the session cookie used by Tomcat (and more generally by Java web applications based on servlets) is JSESSIONID (upper case) but can be configured to something else.

The second way of implementing stickiness is URL encoding. The web server searches for a query parameter in the URL of the request. The name of the parameter is specified again using stickysession. The value of the parameter is used to look up a member worker with route equal to that value. Since it is not easy to extract and manipulate all URL links contained in responses, generally the work of adding the parameters to each link is done by the back-end generating the content. In some cases it might be feasible to do this via the web server using mod_substitute or mod_sed. This can have a negative impact on performance though.

The Java standards implement URL encoding slightly differently. They use a path info appended to the URL using a semicolon (;) as the separator and add the session id behind. As in the cookie case, Apache Tomcat can include the configured jvmRoute in this path info. To let Apache find this sort of path info, you need to set scolonpathdelim to On in ProxyPass or ProxySet.

Finally, you can support cookies and URL encoding at the same time, by configuring the name of the cookie and the name of the URL parameter separated by a vertical bar (|) as in the following example:
    ProxyPass "/test" "balancer://mycluster" stickysession=JSESSIONID|jsessionid scolonpathdeli
    <Proxy "balancer://mycluster">
        BalancerMember http://192.168.1.50:80 route=node1
        BalancerMember http://192.168.1.51:80 route=node2
    </Proxy>

If the cookie and the request parameter both provide routing information for the same request, the information from the request parameter is used.

Troubleshooting load balancer stickiness

If you experience stickiness errors, e.g. users lose their application sessions and need to log in again, you first want to check whether this is because the back-ends are sometimes unavailable or whether your configuration is wrong. To find out about possible stability problems with the back-ends, check your Apache error log for proxy error messages.

To verify your configuration, first check whether the stickiness is based on a cookie or on URL encoding. The next step is to log the appropriate data in the access log by using an enhanced LogFormat. The following fields are useful:

%{MYCOOKIE}C
    The value contained in the cookie with name MYCOOKIE. The name should be the same as given in the stickysession attribute.

%{Set-Cookie}o
    This logs any cookie set by the back-end. You can track whether the back-end sets the session cookie you expect, and to which value it is set.

%{BALANCER_SESSION_STICKY}e
    The name of the cookie or request parameter used to look up the routing information.

%{BALANCER_SESSION_ROUTE}e
    The route information found in the request.

%{BALANCER_WORKER_ROUTE}e
    The route of the worker chosen.

%{BALANCER_ROUTE_CHANGED}e
    Set to 1 if the route in the request is different from the route of the worker, i.e. the request couldn't be handled sticky.

Common reasons for loss of session are session timeouts, which are usually configurable on the back-end server.

The balancer also logs detailed information about handling stickiness to the error log, if the log level is set to debug or higher.
This is an easy way to troubleshoot stickiness problems, but the log volume might be too high for production servers under high load.

10.84 Apache Module mod_proxy_connect

Description:      mod_proxy extension for CONNECT request handling
Status:           Extension
ModuleIdentifier: proxy_connect_module
SourceFile:       mod_proxy_connect.c

Summary

This module requires the service of mod_proxy. It provides support for the CONNECT HTTP method. This method is mainly used to tunnel SSL requests through proxy servers.

Thus, in order to get the ability of handling CONNECT requests, mod_proxy and mod_proxy_connect have to be present in the server.

CONNECT is also used when the server needs to send an HTTPS request through a forward proxy. In this case the server acts as a CONNECT client. This functionality is part of mod_proxy, and mod_proxy_connect is not needed in this case.

Warning
Do not enable proxying until you have secured your server (p. 787). Open proxy servers are dangerous both to your network and to the Internet at large.

Directives

• AllowCONNECT

See also

• mod_proxy

Request notes

mod_proxy_connect creates the following request notes for logging using the %{VARNAME}n format in LogFormat or ErrorLogFormat:

proxy-source-port
    The local port used for the connection to the backend server.

CONNECT method requests are controlled by the Proxy block like any other HTTP request going through the proxy. SSL connections through a proxy may be filtered explicitly by specifying the target host and port, for instance:

    Require ip 192.168.0.0/16

AllowCONNECT Directive

Description:   Ports that are allowed to CONNECT through the proxy
Syntax:        AllowCONNECT port[-port] [port[-port]] ...
Default:       AllowCONNECT 443 563
Context:       server config, virtual host
Status:        Extension
Module:        mod_proxy_connect
Compatibility: Moved from mod_proxy in Apache 2.3.5. Port ranges available since Apache 2.3.7.
The AllowCONNECT directive specifies a list of port numbers or ranges to which the proxy CONNECT method may connect. Today's browsers use this method when an https connection is requested and proxy tunneling over HTTP is in effect.

By default, only the default https port (443) and the default snews port (563) are enabled. Use the AllowCONNECT directive to override this default and allow connections to the listed ports only.

10.85 Apache Module mod_proxy_express

Description: Dynamic mass reverse proxy extension for mod_proxy
Status: Extension
ModuleIdentifier: proxy_express_module
SourceFile: mod_proxy_express.c

Summary

This module creates dynamically configured mass reverse proxies by mapping the Host: header of the HTTP request to a server name and backend URL stored in a DBM file. This allows for easy use of a huge number of reverse proxies with no configuration changes. It is much less feature-full than mod_proxy_balancer, which also provides dynamic growth, but is intended to handle much, much larger numbers of backends. It is ideally suited as a front-end HTTP switch.

This module requires the service of mod_proxy.

Warning
Do not enable proxying until you have secured your server (p. 787). Open proxy servers are dangerous both to your network and to the Internet at large.

Limitations
• This module is not intended to replace the dynamic capability of mod_proxy_balancer. Instead, it is intended to be mostly a lightweight and fast alternative to using mod_rewrite with RewriteMap and the [P] flag for mapped reverse proxying.
• It does not support regex or pattern matching at all.
• It emulates:

      ServerName front.end.server
      ProxyPass "/" "back.end.server:port"
      ProxyPassReverse "/" "back.end.server:port"

  That is, the entire URL is appended to the mapped backend URL. This is in keeping with the intent of being a simple but fast reverse proxy switch.
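The lookup described above can be pictured with a small Python sketch. This is not the module's implementation: the dictionary merely stands in for the DBM file, map_request is a hypothetical helper name, and the host names and addresses are illustrative assumptions. It only shows the two properties stated in the limitations: an exact lookup on the Host: value, and the request path appended to the mapped backend URL.

```python
from typing import Mapping, Optional

# Stand-in for the DBM file: Host header value -> backend URL.
# Host names and addresses are illustrative assumptions.
EXPRESS_MAP = {
    "www1.example.com": "http://192.168.211.2:8080",
    "www2.example.com": "http://192.168.211.12:8088",
}


def map_request(host: str, path: str,
                table: Mapping[str, str] = EXPRESS_MAP) -> Optional[str]:
    """Return the backend URL for a request, or None if the host is unmapped."""
    # Exact key lookup only: no regex or pattern matching.
    backend = table.get(host)
    if backend is None:
        return None
    # The whole request path is appended to the mapped backend URL.
    return backend.rstrip("/") + path
```

With this table, map_request("www1.example.com", "/app/index.html") yields "http://192.168.211.2:8080/app/index.html", while a host that has no entry yields None.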
Directives
• ProxyExpressDBMFile
• ProxyExpressDBMType
• ProxyExpressEnable

See also
• mod_proxy

ProxyExpressDBMFile Directive

Description: Pathname to DBM file.
Syntax: ProxyExpressDBMFile pathname
Default: None
Context: server config, virtual host
Status: Extension
Module: mod_proxy_express
Compatibility: Available in Apache 2.3.13 and later

The ProxyExpressDBMFile directive points to the location of the Express map DBM file. This file serves to map the incoming server name, obtained from the Host: header, to a backend URL.

Note
The file is constructed from a plain text file format using the httxt2dbm (p. 328) utility.

ProxyExpress map file

    ##
    ##express-map.txt:
    ##

    www1.example.com http://192.168.211.2:8080
    www2.example.com http://192.168.211.12:8088
    www3.example.com http://192.168.212.10

Create DBM file

    httxt2dbm -i express-map.txt -o emap

Configuration

    ProxyExpressEnable on
    ProxyExpressDBMFile emap

ProxyExpressDBMType Directive

Description: DBM type of file.
Syntax: ProxyExpressDBMType type
Default: "default"
Context: server config, virtual host
Status: Extension
Module: mod_proxy_express
Compatibility: Available in Apache 2.3.13 and later

The ProxyExpressDBMType directive controls the DBM type expected by the module. The default is the default DBM type created with httxt2dbm (p. 328).

Possible values are (not all may be available at run time):

Value     Description
db        Berkeley DB files
gdbm      GDBM files
ndbm      NDBM files
sdbm      SDBM files (always available)
default   default DBM type

ProxyExpressEnable Directive

Description: Enable the module functionality.
Syntax: ProxyExpressEnable [on|off]
Default: off
Context: server config, virtual host
Status: Extension
Module: mod_proxy_express
Compatibility: Available in Apache 2.3.13 and later

The ProxyExpressEnable directive controls whether the module will be active.
10.86 Apache Module mod_proxy_fcgi

Description: FastCGI support module for mod_proxy
Status: Extension
ModuleIdentifier: proxy_fcgi_module
SourceFile: mod_proxy_fcgi.c
Compatibility: Available in version 2.3 and later

Summary

This module requires the service of mod_proxy. It provides support for the FastCGI [69] protocol.

To handle the FastCGI protocol, both mod_proxy and mod_proxy_fcgi have to be present in the server.

Unlike mod_fcgid [70] and mod_fastcgi [71], mod_proxy_fcgi has no provision for starting the application process; fcgistarter is provided (on some platforms) for that purpose. Alternatively, external launching or process management may be available in the FastCGI application framework in use.

Warning
Do not enable proxying until you have secured your server (p. 787). Open proxy servers are dangerous both to your network and to the Internet at large.

Directives
This module provides no directives.

See also
• fcgistarter
• mod_proxy
• mod_authnz_fcgi

Examples

Remember, in order to make the following examples work, you have to enable mod_proxy and mod_proxy_fcgi.

Single application instance

    ProxyPass "/myapp/" "fcgi://localhost:4000/"

mod_proxy_fcgi disables connection reuse by default, so after a request has been completed the connection will NOT be held open by that httpd child process and won't be reused. If the FastCGI application is able to handle concurrent connections from httpd, you can opt in to connection reuse as shown in the following example:

Single application instance, connection reuse (2.4.11 and later)

    ProxyPass "/myapp/" "fcgi://localhost:4000/" enablereuse=on

The following example passes the request URI as a filesystem path for the PHP-FPM daemon to run. The request URL is implicitly added to the 2nd parameter. The hostname and port following fcgi:// are where PHP-FPM is listening. Connection pooling is enabled.
[69] http://www.fastcgi.com/
[70] http://httpd.apache.org/mod_fcgid/
[71] http://www.fastcgi.com/

PHP-FPM

    ProxyPassMatch "^/myapp/.*\.php(/.*)?$" "fcgi://localhost:9000/var/www/" enablereuse=on

The following example passes the request URI as a filesystem path for the PHP-FPM daemon to run. In this case, PHP-FPM is listening on a unix domain socket (UDS). Requires 2.4.9 or later. With this syntax, the hostname and optional port following fcgi:// are ignored.

PHP-FPM with UDS

    # UDS does not currently support connection reuse
    ProxyPassMatch "^/(.*\.php(/.*)?)$" "unix:/var/run/php5-fpm.sock|fcgi://localhost/var/www/"

The balanced gateway needs mod_proxy_balancer and at least one load balancer algorithm module, such as mod_lbmethod_byrequests, in addition to the proxy modules listed above. mod_lbmethod_byrequests is the default, and will be used for this example configuration.

Balanced gateway to multiple application instances

    ProxyPass "/myapp/" "balancer://myappcluster/"
    <Proxy "balancer://myappcluster/">
        BalancerMember "fcgi://localhost:4000"
        BalancerMember "fcgi://localhost:4001"
    </Proxy>

You can also force a request to be handled as a reverse-proxy request by creating a suitable Handler pass-through. The example configuration below will pass all requests for PHP scripts to the specified FastCGI server using reverse proxy. This feature is available in Apache HTTP Server 2.4.10 and later. For performance reasons, you will want to define a worker (p. 787) representing the same fcgi:// backend. The benefit of this form is that it allows the normal mapping of URI to filename to occur in the server, and the local filesystem result is passed to the backend. When FastCGI is configured this way, the server can calculate the most accurate PATH_INFO.

Proxy via Handler

    # Note: The only part that varies is /path/to/app.sock
    SetHandler "proxy:unix:/path/to/app.sock|fcgi://localhost/"

    # Define a matching worker.
    # The part that is matched to the SetHandler is the part that
    # follows the pipe.
    # If you need to distinguish, "localhost" can
    # be anything unique.

    SetHandler "proxy:fcgi://localhost:9000"

    SetHandler "proxy:balancer://myappcluster/"

Environment Variables

In addition to the configuration directives that control the behaviour of mod_proxy, there are a number of environment variables that control the FCGI protocol provider:

proxy-fcgi-pathinfo
    When configured via ProxyPass or ProxyPassMatch, mod_proxy_fcgi will not set the PATH_INFO environment variable. This allows the backend FCGI server to correctly determine SCRIPT_NAME and Script-URI and be compliant with RFC 3875 section 3.3. If instead you need mod_proxy_fcgi to generate a "best guess" for PATH_INFO, set this env-var. This is a workaround for a bug in some FCGI implementations. This variable can be set to multiple values to tweak how the best guess is chosen (in 2.4.11 and later only):

    first-dot        PATH_INFO is split from the slash following the first "." in the URL.
    last-dot         PATH_INFO is split from the slash following the last "." in the URL.
    full             PATH_INFO is calculated by an attempt to map the URL to the local filesystem.
    unescape         PATH_INFO is the path component of the URL, unescaped / decoded.
    any other value  PATH_INFO is the same as the path component of the URL. Originally, this was the only proxy-fcgi-pathinfo option.

10.87 Apache Module mod_proxy_fdpass

Description: fdpass external process support module for mod_proxy
Status: Extension
ModuleIdentifier: proxy_fdpass_module
SourceFile: mod_proxy_fdpass.c
Compatibility: Available for Unix in version 2.3 and later

Summary

This module requires the service of mod_proxy. It provides support for passing the socket of the client to another process.

mod_proxy_fdpass uses the ability of AF_UNIX domain sockets to pass an open file descriptor [72] to allow another process to finish handling a request.
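The descriptor passing mentioned above is a general AF_UNIX facility, and a short Python sketch may make it concrete. This is not how the module itself is written: it is a single-process demonstration, assuming a Unix platform and Python 3.9 or later (for socket.send_fds and socket.recv_fds), and pass_descriptor_demo is a hypothetical name. One socket pair plays the control channel between httpd and the external process, a second pair stands in for the client connection being handed over.

```python
import socket


def pass_descriptor_demo() -> bytes:
    """Hand an open socket to a 'helper' over an AF_UNIX control channel."""
    # Control channel between the two cooperating parties.
    sender, receiver = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    # Stand-in for the client connection that is being handed over.
    client_side, server_side = socket.socketpair()
    try:
        # Sender: ship the open descriptor as ancillary data. One byte of
        # ordinary data accompanies it (an assumption of this sketch).
        socket.send_fds(sender, [b"x"], [server_side.fileno()])
        # Receiver: read that byte plus at most one descriptor.
        _data, fds, _flags, _addr = socket.recv_fds(receiver, 1, 1)
        handed_over = socket.socket(fileno=fds[0])
        try:
            # The receiver can now answer the client directly.
            handed_over.sendall(b"response body")
        finally:
            handed_over.close()
        return client_side.recv(64)
    finally:
        for s in (sender, receiver, client_side, server_side):
            s.close()
```

The received descriptor is a duplicate of the sender's, so the receiving side can finish the exchange even after the sender closes its own copy.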
The module has a proxy_fdpass_flusher provider interface, which allows another module to optionally send the response headers, or even the start of the response body. The default flush provider disables keep-alive and sends the response headers, letting the external process just send a response body.

In order to use another provider, you have to set the flusher parameter in the ProxyPass directive.

At this time the only data passed to the external process is the client socket. To receive a client socket, call recvmsg with an allocated struct cmsghdr [73]. Future versions of this module may include more data after the client socket, but this is not implemented at this time.

Directives
This module provides no directives.

See also
• mod_proxy

[72] http://www.freebsd.org/cgi/man.cgi?query=recv
[73] http://www.kernel.org/doc/man-pages/online/pages/man3/cmsg.3.html

10.88 Apache Module mod_proxy_ftp

Description: FTP support module for mod_proxy
Status: Extension
ModuleIdentifier: proxy_ftp_module
SourceFile: mod_proxy_ftp.c

Summary

This module requires the service of mod_proxy. It provides support for proxying FTP sites. Note that FTP support is currently limited to the GET method.

To handle FTP proxy requests, both mod_proxy and mod_proxy_ftp have to be present in the server.

Warning
Do not enable proxying until you have secured your server (p. 787). Open proxy servers are dangerous both to your network and to the Internet at large.

Directives
• ProxyFtpDirCharset
• ProxyFtpEscapeWildcards
• ProxyFtpListOnWildcard

See also
• mod_proxy

Why doesn't file type xxx download via FTP?

You probably don't have that particular file type defined as application/octet-stream in your proxy's mime.types configuration file.
A useful line can be

    application/octet-stream bin dms lha lzh exe class tgz taz

Alternatively you may prefer to default everything to binary:

    ForceType application/octet-stream

How can I force an FTP ASCII download of File xxx?

In the rare situation where you must download a specific file using the FTP ASCII transfer method (while the default transfer is in binary mode), you can override mod_proxy's default by suffixing the request with ;type=a to force an ASCII transfer. (FTP directory listings are always executed in ASCII mode, however.)

How can I do FTP upload?

Currently, only GET is supported for FTP in mod_proxy. You can of course use HTTP upload (POST or PUT) through an Apache proxy.

How can I access FTP files outside of my home directory?

An FTP URI is interpreted relative to the home directory of the user who is logging in. Alas, to reach higher directory levels you cannot use /../, as the dots are interpreted by the browser and not actually sent to the FTP server. To address this problem, the so-called Squid %2f hack was implemented in the Apache FTP proxy; it is a solution which is also used by other popular proxy servers like the Squid Proxy Cache [74]. By prepending /%2f to the path of your request, you can make such a proxy change the FTP starting directory to / (instead of the home directory). For example, to retrieve the file /etc/motd, you would use the URL:

    ftp://user@host/%2f/etc/motd

How can I hide the FTP cleartext password in my browser's URL line?

To log in to an FTP server by username and password, Apache uses different strategies. In the absence of a user name and password in the URL altogether, Apache sends an anonymous login to the FTP server, i.e.,

    user: anonymous
    password: apache_proxy@

This works for all popular FTP servers which are configured for anonymous access.
For a personal login with a specific username, you can embed the user name into the URL, as in:

    ftp://username@host/myfile

If the FTP server asks for a password when given this username (which it should), then Apache will reply with a 401 (Authorization required) response, which causes the browser to pop up the username/password dialog. Upon entering the password, the connection attempt is retried, and if successful, the requested resource is presented. The advantage of this procedure is that your browser does not display the password in cleartext (which it would if you had used ftp://username:password@host/myfile in the first place).

Note
The password which is transmitted in such a way is not encrypted on its way. It travels between your browser and the Apache proxy server in a base64-encoded cleartext string, and between the Apache proxy and the FTP server as plaintext. You should therefore think twice before accessing your FTP server via HTTP (or before accessing your personal files via FTP at all!). When using insecure channels, an eavesdropper might intercept your password on its way.

[74] http://www.squid-cache.org/

Why do I get a file listing when I expected a file to be downloaded?

In order to allow both browsing the directories on an FTP server and downloading files, Apache looks at the request URL. If it looks like a directory, or contains wildcard characters ("*?[{~"), then it guesses that a listing is wanted instead of a download.

You can disable the special handling of names with wildcard characters. See the ProxyFtpListOnWildcard directive.

ProxyFtpDirCharset Directive

Description: Define the character set for proxied FTP listings
Syntax: ProxyFtpDirCharset character_set
Default: ProxyFtpDirCharset ISO-8859-1
Context: server config, virtual host, directory
Status: Extension
Module: mod_proxy_ftp
Compatibility: Moved from mod_proxy in Apache 2.3.5.
The ProxyFtpDirCharset directive defines the character set to be set for FTP directory listings in HTML generated by mod_proxy_ftp.

ProxyFtpEscapeWildcards Directive

Description: Whether wildcards in requested filenames are escaped when sent to the FTP server
Syntax: ProxyFtpEscapeWildcards [on|off]
Default: on
Context: server config, virtual host, directory
Status: Extension
Module: mod_proxy_ftp
Compatibility: Available in Apache 2.3.3 and later

The ProxyFtpEscapeWildcards directive controls whether wildcard characters ("*?[{~") in requested filenames are escaped with backslash before sending them to the FTP server. That is the default behavior, but many FTP servers don't know about the escaping and try to serve the literal filenames they were sent, including the backslashes in the names.

Set to "off" to allow downloading files with wildcards in their names from FTP servers that don't understand wildcard escaping.

ProxyFtpListOnWildcard Directive

Description: Whether wildcards in requested filenames trigger a file listing
Syntax: ProxyFtpListOnWildcard [on|off]
Default: on
Context: server config, virtual host, directory
Status: Extension
Module: mod_proxy_ftp
Compatibility: Available in Apache 2.3.3 and later

The ProxyFtpListOnWildcard directive controls whether wildcard characters ("*?[{~") in requested filenames cause mod_proxy_ftp to return a listing of files instead of downloading a file. By default (value on), they do. Set to "off" to allow downloading files even if they have wildcard characters in their names.

10.89 Apache Module mod_proxy_hcheck

Description: Dynamic health check of Balancer members (workers) for mod_proxy
Status: Extension
ModuleIdentifier: proxy_hcheck_module
SourceFile: mod_proxy_hcheck.c
Compatibility: Available in Apache 2.4.21 and later

Summary

This module provides for dynamic health checking of balancer members (workers). This can be enabled on a worker-by-worker basis.
The health check is done independently of the actual reverse proxy requests.

This module requires the service of mod_watchdog.

Parameters

The health check mechanism is enabled via the use of additional BalancerMember parameters, which are configured in the standard way via ProxyPass.

A new BalancerMember status state (flag) is defined via this module: "C". When the worker is taken offline due to failures as determined by the health check module, this flag is set, and can be seen (and modified) via the balancer-manager.

Parameter    Default   Description
hcmethod     None      No dynamic health check performed. Choices are listed in the method table below.
hcpasses     1         Number of successful health check tests before worker is re-enabled
hcfails      1         Number of failed health check tests before worker is disabled
hcinterval   30        Period of health checks in seconds (e.g. performed every 30 seconds)
hcuri                  Additional URI to be appended to the worker URL for the health check.
hctemplate             Name of template, created via ProxyHCTemplate, to use for setting health check parameters for this worker
hcexpr                 Name of expression, created via ProxyHCExpr, used to check response headers for health. If not used, 2xx through 3xx status codes imply success

Choices for hcmethod:

Method    Description                                                             Note
None      No dynamic health checking done
TCP       Check that a socket to the backend can be created: e.g. "are you up"
OPTIONS   Send an HTTP OPTIONS request to the backend                             *
HEAD      Send an HTTP HEAD request to the backend                                *
GET       Send an HTTP GET request to the backend                                 *

*: Unless hcexpr is used, a 2xx or 3xx HTTP status will be interpreted as passing the health check.
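The interplay of hcpasses and hcfails in the table above can be pictured as a pair of counters per worker. The following Python sketch is only an illustration under stated assumptions, not the module's code: WorkerHealth and status_passes are hypothetical names, and the sketch assumes that only consecutive results count, so a success resets the failure count and vice versa.

```python
from dataclasses import dataclass


@dataclass
class WorkerHealth:
    """Tracks consecutive health check results for one balancer member."""
    hcpasses: int = 1    # successes needed to re-enable a disabled worker
    hcfails: int = 1     # failures needed to disable an enabled worker
    enabled: bool = True
    passes: int = 0      # current run of consecutive successes
    fails: int = 0       # current run of consecutive failures

    def record(self, ok: bool) -> bool:
        """Record one check result and return whether the worker is enabled."""
        if ok:
            self.fails = 0
            self.passes += 1
            if not self.enabled and self.passes >= self.hcpasses:
                self.enabled = True
        else:
            self.passes = 0
            self.fails += 1
            if self.enabled and self.fails >= self.hcfails:
                self.enabled = False
        return self.enabled


def status_passes(status: int) -> bool:
    """Default rule when no hcexpr is given: 2xx and 3xx count as passing."""
    return 200 <= status < 400
```

With hcpasses=2 and hcfails=3, three failures in a row take the worker out of rotation, and it returns only after two successes in a row.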
Directives
• ProxyHCExpr
• ProxyHCTemplate
• ProxyHCTPsize

See also
• mod_proxy

Usage examples

The following example shows how one might configure health checking for various backend servers:

    ProxyHCExpr ok234 {%{REQUEST_STATUS} =~ /^[234]/}
    ProxyHCExpr gdown {%{REQUEST_STATUS} =~ /^[5]/}
    ProxyHCExpr in_maint {hc('body') !~ /Under maintenance/}

    <Proxy balancer://foo>
        BalancerMember http://www.example.com/ hcmethod=GET hcexpr=in_maint hcuri=/status.php
        BalancerMember http://www2.example.com/ hcmethod=HEAD hcexpr=ok234 hcinterval=10
        BalancerMember http://www3.example.com/ hcmethod=TCP hcinterval=5 hcpasses=2 hcfails=3
        BalancerMember http://www4.example.com/
    </Proxy>

    ProxyPass "/" "balancer://foo"
    ProxyPassReverse "/" "balancer://foo"

In this scenario, http://www.example.com/ is health checked by sending a GET /status.php request to that server and seeing that the returned page does not include the string Under maintenance. If it does, that server is put in health-check fail mode, and disabled. This dynamic check is performed every 30 seconds, which is the default.

http://www2.example.com/ is checked by sending a simple HEAD request every 10 seconds and making sure that the response status is 2xx, 3xx or 4xx. http://www3.example.com/ is checked every 5 seconds by simply ensuring that the socket to that server is up. If the backend is marked as "down" and it passes 2 health checks, it will be re-enabled and added back into the load balancer. It takes 3 back-to-back health check failures to disable the server and move it out of rotation. Finally, http://www4.example.com/ is not dynamically checked at all.

ProxyHCExpr Directive

Description: Creates a named condition expression to use to determine health of the backend based on its response.
Syntax: ProxyHCExpr name {ap_expr expression}
Context: server config, virtual host
Status: Extension
Module: mod_proxy_hcheck

The ProxyHCExpr directive allows for creating a named condition expression that checks the response headers of the backend server to determine its health. This named condition can then be assigned to balancer members via the hcexpr parameter.

ProxyHCExpr: Allow for 2xx/3xx/4xx as passing

    ProxyHCExpr ok234 {%{REQUEST_STATUS} =~ /^[234]/}
    ProxyPass "/apps" "http://backend.example.com/" hcexpr=ok234

Note
The expression (p. 99) can use curly-parens ("{}") as quoting delimiters in addition to quotes.

If using a health check method (e.g. GET) which results in a response body, that body itself can be checked via ap_expr using the hc() expression function, which is unique to this module.

In the following example, we send the backend a GET request and if the response body contains the phrase Under maintenance, we want to disable the backend.

ProxyHCExpr: Checking response body

    ProxyHCExpr in_maint {hc('body') !~ /Under maintenance/}
    ProxyPass "/apps" "http://backend.example.com/" hcexpr=in_maint hcmethod=get hcuri=/stat

NOTE: Since the response body can be quite large, it is best if used against specific status pages.

ProxyHCTemplate Directive

Description: Creates a named template for setting various health check parameters
Syntax: ProxyHCTemplate name parameter=setting <...>
Context: server config, virtual host
Status: Extension
Module: mod_proxy_hcheck

The ProxyHCTemplate directive allows for creating a named set (template) of health check parameters that can then be assigned to balancer members via the hctemplate parameter.

ProxyHCTemplate

    ProxyHCTemplate tcp5 hcmethod=tcp hcinterval=5
    ProxyPass "/apps" "http://backend.example.com/" hctemplate=tcp5

ProxyHCTPsize Directive

Description: Sets the size of the threadpool used for the health check workers.
Syntax: ProxyHCTPsize size
Context: server config, virtual host
Status: Extension
Module: mod_proxy_hcheck

If Apache httpd and APR are built with thread support, the health check module will offload the work of the actual checking to a threadpool associated with the Watchdog process, allowing for parallel checks. The ProxyHCTPsize directive determines the size of this threadpool. If set to 0, no threadpool is used at all, resulting in serialized health checks. The default size is 16.

ProxyHCTPsize

    ProxyHCTPsize 32

10.90 Apache Module mod_proxy_html

Description: Rewrite HTML links to ensure they are addressable from clients' networks in a proxy context.
Status: Base
ModuleIdentifier: proxy_html_module
SourceFile: mod_proxy_html.c
Compatibility: Version 2.4 and later. Available as a third-party module for earlier 2.x versions

Summary

This module provides an output filter to rewrite HTML links in a proxy situation, to ensure that links work for users outside the proxy. It serves the same purpose as Apache's ProxyPassReverse directive does for HTTP headers, and is an essential component of a reverse proxy.

For example, if a company has an application server at appserver.example.com that is only visible from within the company's internal network, and a public webserver www.example.com, they may wish to provide a gateway to the application server at http://www.example.com/appserver/. When the application server links to itself, those links need to be rewritten to work through the gateway. mod_proxy_html serves to rewrite links that point at appserver.example.com so that they point at http://www.example.com/appserver/ instead, making them accessible from outside.

mod_proxy_html was originally developed at WebThing, whose extensive documentation [75] may be useful to users.
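The rewriting described in the summary above can be sketched in a few lines of Python. This is a simplified illustration, not the filter's implementation: it assumes plain prefix rules applied to a link target that has already been extracted from the HTML, with the first matching rule winning. rewrite_link and RULES are hypothetical names, and the /foo/bar.html path used below is an invented example.

```python
from typing import Sequence, Tuple

# (from-prefix, to-prefix) pairs; the hosts follow the gateway example above.
RULES = [
    ("http://appserver.example.com/", "http://www.example.com/appserver/"),
]


def rewrite_link(url: str, rules: Sequence[Tuple[str, str]] = RULES) -> str:
    """Rewrite a link target using the first rule whose prefix matches."""
    for from_prefix, to_prefix in rules:
        if url.startswith(from_prefix):
            # Only the matching portion is replaced; the rest is kept as-is.
            return to_prefix + url[len(from_prefix):]
    # Links that match no rule pass through unchanged.
    return url
```

Under these rules an internal link such as http://appserver.example.com/foo/bar.html becomes http://www.example.com/appserver/foo/bar.html, while links to unrelated hosts are left alone.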
[75] http://apache.webthing.com/mod_proxy_html/

Directives
• ProxyHTMLBufSize
• ProxyHTMLCharsetOut
• ProxyHTMLDocType
• ProxyHTMLEnable
• ProxyHTMLEvents
• ProxyHTMLExtended
• ProxyHTMLFixups
• ProxyHTMLInterp
• ProxyHTMLLinks
• ProxyHTMLMeta
• ProxyHTMLStripComments
• ProxyHTMLURLMap

ProxyHTMLBufSize Directive

Description: Sets the buffer size increment for buffering inline scripts and stylesheets.
Syntax: ProxyHTMLBufSize bytes
Context: server config, virtual host, directory
Status: Base
Module: mod_proxy_html
Compatibility: Version 2.4 and later; available as a third-party for earlier 2.x versions

In order to parse non-HTML content (stylesheets and scripts) embedded in HTML documents, mod_proxy_html has to read the entire script or stylesheet into a buffer. This buffer will be expanded as necessary to hold the largest script or stylesheet in a page, in increments of bytes as set by this directive.

The default is 8192, and will work well for almost all pages. However, if you know you're proxying pages containing stylesheets and/or scripts bigger than 8K (that is, for a single script or stylesheet, NOT in total), it will be more efficient to set a larger buffer size and avoid the need to resize the buffer dynamically during a request.

ProxyHTMLCharsetOut Directive

Description: Specify a charset for mod_proxy_html output.
Syntax: ProxyHTMLCharsetOut Charset | *
Context: server config, virtual host, directory
Status: Base
Module: mod_proxy_html
Compatibility: Version 2.4 and later; available as a third-party for earlier 2.x versions

This selects an encoding for mod_proxy_html output. It should not normally be used, as any change from the default UTF-8 (Unicode, as used internally by libxml2) will impose an additional processing overhead. The special token ProxyHTMLCharsetOut * will generate output using the same encoding as the input.

Note that this relies on mod_xml2enc being loaded.
ProxyHTMLDocType Directive

Description: Sets an HTML or XHTML document type declaration.
Syntax: ProxyHTMLDocType HTML|XHTML [Legacy]
        OR ProxyHTMLDocType fpi [SGML|XML]
        OR ProxyHTMLDocType html5
        OR ProxyHTMLDocType auto
Default: ProxyHTMLDocType auto (2.5/trunk versions); no FPI (2.4.x)
Context: server config, virtual host, directory
Status: Base
Module: mod_proxy_html
Compatibility: Version 2.4 and later; available as a third-party for earlier 2.x versions

In the first form, documents will be declared as HTML 4.01 or XHTML 1.0 according to the option selected. This option also determines whether HTML or XHTML syntax is used for output. Note that the format of the documents coming from the backend server is immaterial: the parser will deal with it automatically. If the optional second argument is set to "Legacy", documents will be declared "Transitional", an option that may be necessary if you are proxying pre-1998 content or working with defective authoring/publishing tools.

In the second form, it will insert your own FPI. The optional second argument determines whether SGML/HTML or XML/XHTML syntax will be used.

The third form declares documents as HTML 5.

The fourth form is new in HTTPD trunk and not yet available in released versions, and uses libxml2's HTML parser to detect the doctype.

If the first form is used, mod_proxy_html will also clean up the HTML to the specified standard. It cannot fix every error, but it will strip out bogus elements and attributes. It will also optionally log other errors at LogLevel Debug.

ProxyHTMLEnable Directive

Description: Turns the proxy_html filter on or off.
Syntax: ProxyHTMLEnable On|Off
Default: ProxyHTMLEnable Off
Context: server config, virtual host, directory
Status: Base
Module: mod_proxy_html
Compatibility: Version 2.4 and later; available as a third-party module for earlier 2.x versions.

A simple switch to enable or disable the proxy_html filter.
If mod_xml2enc is loaded it will also automatically set up internationalisation support.

Note that the proxy_html filter will only act on HTML data (Content-Type text/html or application/xhtml+xml) and when the data are proxied. You can override this (at your own risk) by setting the PROXY_HTML_FORCE environment variable.

ProxyHTMLEvents Directive

Description: Specify attributes to treat as scripting events.
Syntax: ProxyHTMLEvents attribute [attribute ...]
Context: server config, virtual host, directory
Status: Base
Module: mod_proxy_html
Compatibility: Version 2.4 and later; available as a third-party for earlier 2.x versions

Specifies one or more attributes to treat as scripting events and apply ProxyHTMLURLMaps to where enabled. You can specify any number of attributes in one or more ProxyHTMLEvents directives.

Normally you'll set this globally. If you set ProxyHTMLEvents in more than one scope so that one overrides the other, you'll need to specify a complete set in each of those scopes.

A default configuration is supplied in proxy-html.conf and defines the events in standard HTML 4 and XHTML 1.

ProxyHTMLExtended Directive

Description: Determines whether to fix links in inline scripts, stylesheets, and scripting events.
Syntax: ProxyHTMLExtended On|Off
Default: ProxyHTMLExtended Off
Context: server config, virtual host, directory
Status: Base
Module: mod_proxy_html
Compatibility: Version 2.4 and later; available as a third-party for earlier 2.x versions

Set to Off, HTML links are rewritten according to the ProxyHTMLURLMap directives, but links appearing in Javascript and CSS are ignored.

Set to On, all scripting events (as determined by ProxyHTMLEvents) and embedded scripts or stylesheets are also processed by the ProxyHTMLURLMap rules, according to the flags set for each rule. Since this requires more parsing, performance will be best if you only enable it when strictly necessary.
You’ll also need to take care over patterns matched, since the parser has no knowledge of what is a URL within an embedded script or stylesheet. In particular, extended matching of / is likely to lead to false matches. 10.90. APACHE MODULE MOD PROXY HTML 847 ProxyHTMLFixups Directive Description: Syntax: Context: Status: Module: Compatibility: Fixes for simple HTML errors. ProxyHTMLFixups [lowercase] [dospath] [reset] server config, virtual host, directory Base mod proxy html Version 2.4 and later; available as a third-party for earlier 2.x versions This directive takes one to three arguments as follows: • lowercase Urls are rewritten to lowercase • dospath Backslashes in URLs are rewritten to forward slashes. • reset Unset any options set at a higher level in the configuration. Take care when using these. The fixes will correct certain authoring mistakes, but risk also erroneously fixing links that were correct to start with. Only use them if you know you have a broken backend server. ProxyHTMLInterp Directive Description: Syntax: Default: Context: Status: Module: Compatibility: Enables per-request interpolation of P ROXY HTMLURLM AP rules. ProxyHTMLInterp On|Off ProxyHTMLInterp Off server config, virtual host, directory Base mod proxy html Version 2.4 and later; available as a third-party for earlier 2.x versions This enables per-request interpolation in P ROXY HTMLURLM AP to- and from- patterns. If interpolation is not enabled, all rules are pre-compiled at startup. With interpolation, they must be re-compiled for every request, which implies an extra processing overhead. It should therefore be enabled only when necessary. ProxyHTMLLinks Directive Description: Syntax: Context: Status: Module: Compatibility: Specify HTML elements that have URL attributes to be rewritten. ProxyHTMLLinks element attribute [attribute2 ...] 
Context: server config, virtual host, directory
Status: Base
Module: mod_proxy_html
Compatibility: Version 2.4 and later; available as a third-party module for earlier 2.x versions

Specifies elements that have URL attributes that should be rewritten using standard ProxyHTMLURLMaps. You will need one ProxyHTMLLinks directive per element, but it can have any number of attributes.

Normally you’ll set this globally. If you set ProxyHTMLLinks in more than one scope so that one overrides the other, you’ll need to specify a complete set in each of those scopes.

A default configuration is supplied in proxy-html.conf and defines the HTML links for standard HTML 4 and XHTML 1.

ProxyHTMLMeta Directive
Description: Turns on or off extra pre-parsing of metadata in HTML <head> sections.
Syntax: ProxyHTMLMeta On|Off
Default: ProxyHTMLMeta Off
Context: server config, virtual host, directory
Status: Base
Module: mod_proxy_html
Compatibility: Version 2.4 and later; available as a third-party module for earlier 2.x versions.

This turns on or off pre-parsing of metadata in HTML <head> sections.

If not required, turning ProxyHTMLMeta Off will give a small performance boost by skipping this parse step. However, it is sometimes necessary for internationalisation to work correctly.

ProxyHTMLMeta has two effects. Firstly and most importantly it enables detection of character encodings declared in the form <meta http-equiv="Content-Type" content="text/html;charset=foo"> or, in the case of an XHTML document, an XML declaration. It is NOT required if the charset is declared in a real HTTP header (which is always preferable) from the backend server, nor if the document is UTF-8 (Unicode) or a subset such as ASCII. You may also be able to dispense with it where documents use a default declared using xml2EncDefault, but that risks propagating an incorrect declaration. A ProxyHTMLCharsetOut can remove that risk, but is likely to be a bigger processing overhead than enabling ProxyHTMLMeta.
The other effect of enabling ProxyHTMLMeta is to parse all <meta http-equiv=...> declarations and convert them to real HTTP headers, in keeping with the original purpose of this form of the HTML <meta> element.

ProxyHTMLStripComments Directive
Description: Determines whether to strip HTML comments.
Syntax: ProxyHTMLStripComments On|Off
Default: ProxyHTMLStripComments Off
Context: server config, virtual host, directory
Status: Base
Module: mod_proxy_html
Compatibility: Version 2.4 and later; available as a third-party module for earlier 2.x versions

This directive will cause mod_proxy_html to strip HTML comments. Note that this will also kill off any scripts or styles embedded in comments (a bogosity introduced in 1995/6 with Netscape 2 for the benefit of then-older browsers, but still in use today). It may also interfere with comment-based processors such as SSI or ESI: be sure to run any of those before mod_proxy_html in the filter chain if stripping comments!

ProxyHTMLURLMap Directive
Description: Defines a rule to rewrite HTML links
Syntax: ProxyHTMLURLMap from-pattern to-pattern [flags] [cond]
Context: server config, virtual host, directory
Status: Base
Module: mod_proxy_html
Compatibility: Version 2.4 and later; available as a third-party module for earlier 2.x versions.

This is the key directive for rewriting HTML links. When parsing a document, whenever a link target matches from-pattern, the matching portion will be rewritten to to-pattern, as modified by any flags supplied and by the ProxyHTMLExtended directive.

The optional third argument may define any of the following Flags. Flags are case-sensitive.

h  Ignore HTML links (pass through unchanged).
e  Ignore scripting events (pass through unchanged).
c  Pass embedded script and style sections through untouched.
L  Last-match. If this rule matches, no more rules are applied (note that this happens automatically for HTML links).
l  Opposite to L.
   Overrides the one-change-only default behaviour with HTML links.
R  Use Regular Expression matching-and-replace. from-pattern is a regexp, and to-pattern a replacement string that may be based on the regexp. Regexp memory is supported: you can use brackets () in the from-pattern and retrieve the matches with $1 to $9 in the to-pattern.
   If R is not set, it will use string-literal search-and-replace. The logic is starts-with in HTML links, but contains in scripting events and embedded script and style sections.
x  Use POSIX extended Regular Expressions. Only applicable with R.
i  Case-insensitive matching. Only applicable with R.
n  Disable regexp memory (for speed). Only applicable with R.
s  Line-based regexp matching. Only applicable with R.
^  Match at start only. This applies only to string matching (not regexps) and is irrelevant to HTML links.
$  Match at end only. This applies only to string matching (not regexps) and is irrelevant to HTML links.
V  Interpolate environment variables in to-pattern. A string of the form ${varname|default} will be replaced by the value of environment variable varname. If that is unset, it is replaced by default. The |default is optional.
   NOTE: interpolation will only be enabled if ProxyHTMLInterp is On.
v  Interpolate environment variables in from-pattern. Patterns supported are as above.
   NOTE: interpolation will only be enabled if ProxyHTMLInterp is On.

The optional fourth cond argument defines a condition that will be evaluated per request, provided ProxyHTMLInterp is On. If the condition evaluates FALSE the map will not be applied in this request. If TRUE, or if no condition is defined, the map is applied.

A cond is evaluated by the Expression Parser (p. 99). In addition, the simpler syntax of conditions in mod_proxy_html 3.x for HTTPD 2.0 and 2.2 is also supported.
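The flags above are easier to read in context. The following fragment is a sketch under stated assumptions, not a recommended rule set: the host internal.example.com and the /old/, /new/, /images/ and /static/images/ paths are hypothetical, and only flags documented in the list above are used.

```apache
# Literal (non-regexp) rule: starts-with match in HTML links.
ProxyHTMLURLMap "http://internal.example.com/" "/"

# R: regexp match-and-replace; i: case-insensitive matching.
# $1 retrieves the text matched by the brackets in the from-pattern.
ProxyHTMLURLMap "^/old/(.*)" "/new/$1" Ri

# c: pass embedded script and style sections through untouched for this
# rule, even where ProxyHTMLExtended is On.
ProxyHTMLURLMap "/images/" "/static/images/" c
```

Flags are concatenated into a single third argument, as in Ri. Because flags are case-sensitive, L and l select opposite behaviours and should not be confused.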
10.91 Apache Module mod_proxy_http

Description: HTTP support module for mod_proxy
Status: Extension
ModuleIdentifier: proxy_http_module
SourceFile: mod_proxy_http.c

Summary
This module requires the service of mod_proxy. It provides the features used for proxying HTTP and HTTPS requests. mod_proxy_http supports HTTP/0.9, HTTP/1.0 and HTTP/1.1. It does not provide any caching abilities. If you want to set up a caching proxy, you might want to use the additional service of the mod_cache module.

To handle HTTP proxy requests, both mod_proxy and mod_proxy_http must be present in the server.

Warning
Do not enable proxying until you have secured your server (p. 787). Open proxy servers are dangerous both to your network and to the Internet at large.

Directives
This module provides no directives.

See also
• mod_proxy
• mod_proxy_connect

Environment Variables
In addition to the configuration directives that control the behaviour of mod_proxy, there are a number of environment variables that control the HTTP protocol provider. Environment variables below that don’t specify specific values are enabled when set to any value.

proxy-sendextracrlf
   Causes proxy to send an extra CR-LF newline on the end of a request. This is a workaround for a bug in some browsers.
force-proxy-request-1.0
   Forces the proxy to send requests to the backend as HTTP/1.0 and disables HTTP/1.1 features.
proxy-nokeepalive
   Forces the proxy to close the backend connection after each request.
proxy-chain-auth
   If the proxy requires authentication, it will read and consume the proxy authentication credentials sent by the client. With proxy-chain-auth it will also forward the credentials to the next proxy in the chain. This may be necessary if you have a chain of proxies that share authentication information. Security Warning: Do not set this unless you know you need it, as it forwards sensitive information!
proxy-sendcl
   HTTP/1.0 required all HTTP requests that include a body (e.g. POST requests) to include a Content-Length header. This environment variable forces the Apache proxy to send this header to the backend server, regardless of what the client sent to the proxy. It ensures compatibility when proxying for an HTTP/1.0 or unknown backend. However, it may require the entire request to be buffered by the proxy, so it becomes very inefficient for large requests.
proxy-sendchunks or proxy-sendchunked
   This is the opposite of proxy-sendcl. It allows request bodies to be sent to the backend using chunked transfer encoding. This allows the request to be efficiently streamed, but requires that the backend server supports HTTP/1.1.
proxy-interim-response
   This variable takes values RFC (the default) or Suppress. Earlier httpd versions would suppress HTTP interim (1xx) responses sent from the backend. This is technically a violation of the HTTP protocol. In practice, if a backend sends an interim response, it may itself be extending the protocol in a manner we know nothing about, or just broken. So this is now configurable: set proxy-interim-response RFC to be fully protocol compliant, or proxy-interim-response Suppress to suppress interim responses.
proxy-initial-not-pooled
   If this variable is set, no pooled connection will be reused if the client request is the initial request on the frontend connection. This avoids the "proxy: error reading status line from remote server" error message, which is caused by a race condition: the backend server closes the pooled connection after the proxy has checked the connection but before the data sent by the proxy reaches the backend. Keep in mind that setting this variable degrades performance, especially with HTTP/1.0 clients.
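The variables above are ordinary environment variables, so they can be set with SetEnv. The fragment below is a minimal sketch; the /legacy/ path and the host legacy.example.com are placeholders, and the particular combination of variables is chosen only to illustrate the syntax.

```apache
# Hypothetical backend that only speaks HTTP/1.0.
<Location "/legacy/">
    ProxyPass "http://legacy.example.com/"
    # Send requests to the backend as HTTP/1.0.
    SetEnv force-proxy-request-1.0 1
    # Close the backend connection after each request.
    SetEnv proxy-nokeepalive 1
    # Always send a Content-Length header with request bodies.
    SetEnv proxy-sendcl 1
</Location>

# Do not reuse a pooled backend connection for the initial request on a
# frontend connection.
SetEnv proxy-initial-not-pooled 1

# Suppress interim (1xx) responses from the backend instead of relaying them.
SetEnv proxy-interim-response Suppress
```

Variables that take no specific value, such as proxy-sendcl, are enabled by being set to any value; proxy-interim-response is the exception and expects RFC or Suppress.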
Request notes
mod_proxy_http creates the following request notes for logging using the %{VARNAME}n format in LogFormat or ErrorLogFormat:

proxy-source-port
   The local port used for the connection to the backend server.
proxy-status
   The HTTP status received from the backend server.

10.92 Apache Module mod_proxy_http2

Description: HTTP/2 support module for mod_proxy
Status: Extension
ModuleIdentifier: proxy_http2_module
SourceFile: mod_proxy_http2.c

Summary
This module requires the service of mod_proxy. It provides the features used for proxying HTTP/2 requests. mod_proxy_http2 supports HTTP/2 only. It does not provide any downgrades to HTTP/1.1.

To handle HTTP/2 proxy requests, both mod_proxy and mod_proxy_http2 must be present in the server.

mod_proxy_http2 works with incoming requests over both HTTP/1.1 and HTTP/2. If mod_http2 handles the frontend connection, requests against the same HTTP/2 backend are sent over a single connection, whenever possible.

This module relies on libnghttp2 (http://nghttp2.org/) to provide the core HTTP/2 engine.

Warning
This module is experimental. Its behaviors, directives, and defaults are subject to more change from release to release relative to other standard modules. Users are encouraged to consult the "CHANGES" file for potential updates.

Warning
Do not enable proxying until you have secured your server (p. 787). Open proxy servers are dangerous both to your network and to the Internet at large.

Directives
This module provides no directives.

See also
• mod_http2
• mod_proxy
• mod_proxy_connect

Request notes
mod_proxy_http creates the following request notes for logging using the %{VARNAME}n format in LogFormat or ErrorLogFormat:

proxy-source-port
   The local port used for the connection to the backend server.
proxy-status
   The HTTP/2 status received from the backend server.
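The request notes listed above can be written to an access log with the %{VARNAME}n format. This is a sketch only: the nickname proxylog and the log file path are assumptions, and the rest of the format string is an arbitrary common layout.

```apache
# "proxylog" is a hypothetical nickname; the log path is a placeholder.
# %{proxy-status}n      : status received from the backend server
# %{proxy-source-port}n : local port used for the backend connection
LogFormat "%h %l %u %t \"%r\" %>s %b backend_status=%{proxy-status}n backend_port=%{proxy-source-port}n" proxylog
CustomLog "logs/proxy_access_log" proxylog
```

Logging the backend status next to the final status (%>s) can help to tell errors generated by the backend apart from errors generated by the proxy itself.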
10.93 Apache Module mod_proxy_scgi

Description: SCGI gateway module for mod_proxy
Status: Extension
ModuleIdentifier: proxy_scgi_module
SourceFile: mod_proxy_scgi.c

Summary
This module requires the service of mod_proxy. It provides support for the SCGI protocol, version 1 (http://python.ca/scgi/protocol.txt).

To handle the SCGI protocol, both mod_proxy and mod_proxy_scgi must be present in the server.

Warning
Do not enable proxying until you have secured your server (p. 787). Open proxy servers are dangerous both to your network and to the Internet at large.

Directives
• ProxySCGIInternalRedirect
• ProxySCGISendfile

See also
• mod_proxy
• mod_proxy_balancer

Examples
Remember, in order to make the following examples work, you have to enable mod_proxy and mod_proxy_scgi.

Simple gateway
    ProxyPass "/scgi-bin/" "scgi://localhost:4000/"

The balanced gateway needs mod_proxy_balancer and at least one load balancer algorithm module, such as mod_lbmethod_byrequests, in addition to the proxy modules listed above. mod_lbmethod_byrequests is the default, and will be used for this example configuration.

Balanced gateway
    ProxyPass "/scgi-bin/" "balancer://somecluster/"
    <Proxy "balancer://somecluster">
        BalancerMember "scgi://localhost:4000"
        BalancerMember "scgi://localhost:4001"
    </Proxy>

Environment Variables
In addition to the configuration directives that control the behaviour of mod_proxy, an environment variable may also control the SCGI protocol provider:

proxy-scgi-pathinfo
   By default mod_proxy_scgi will neither create nor export the PATH_INFO environment variable. This allows the backend SCGI server to correctly determine SCRIPT_NAME and Script-URI and be compliant with RFC 3875 section 3.3. If instead you need mod_proxy_scgi to generate a "best guess" for PATH_INFO, set this env-var. The variable must be set before SetEnv is effective. SetEnvIf can be used instead:
       SetEnvIf Request_URI . proxy-scgi-pathinfo

ProxySCGIInternalRedirect Directive
Description: Enable or disable internal redirect responses from the backend
Syntax: ProxySCGIInternalRedirect On|Off|Headername
Default: ProxySCGIInternalRedirect On
Context: server config, virtual host, directory
Status: Extension
Module: mod_proxy_scgi
Compatibility: The Headername feature is available in Apache httpd 2.4.13 and later.

The ProxySCGIInternalRedirect directive enables the backend to internally redirect the gateway to a different URL. This feature originates in mod_cgi, which internally redirects the response if the response status is OK (200) and the response contains a Location (or configured alternate header) and its value starts with a slash (/). This value is interpreted as a new local URL that Apache httpd internally redirects to.

mod_proxy_scgi does the same as mod_cgi in this regard, except that you can turn off the feature or specify the use of a header other than Location.

Example
    ProxySCGIInternalRedirect Off

    # Django and some other frameworks will fully qualify "local URLs"
    # set by the application, so an alternate header must be used.
    ProxySCGIInternalRedirect X-Location

ProxySCGISendfile Directive
Description: Enable evaluation of X-Sendfile pseudo response header
Syntax: ProxySCGISendfile On|Off|Headername
Default: ProxySCGISendfile Off
Context: server config, virtual host, directory
Status: Extension
Module: mod_proxy_scgi

The ProxySCGISendfile directive enables the SCGI backend to let files be served directly by the gateway. This is useful for performance purposes: httpd can use sendfile or other optimizations, which are not possible if the file comes over the backend socket. Additionally, the file contents are not transmitted twice.

The ProxySCGISendfile argument determines the gateway behaviour:

Off
   No special handling takes place.
On
   The gateway looks for a backend response header called X-Sendfile and interprets the value as the filename to serve. The header is removed from the final response headers. This is equivalent to ProxySCGISendfile X-Sendfile.
anything else
   Similar to On, but instead of the hardcoded header name X-Sendfile, the argument is used as the header name.

Example
    # Use the default header (X-Sendfile)
    ProxySCGISendfile On

    # Use a different header
    ProxySCGISendfile X-Send-Static

10.94 Apache Module mod_proxy_wstunnel

Description: Websockets support module for mod_proxy
Status: Extension
ModuleIdentifier: proxy_wstunnel_module
SourceFile: mod_proxy_wstunnel.c
Compatibility: Available in httpd 2.4.5 and later

Summary
This module requires the service of mod_proxy. It provides support for the tunnelling of web socket connections to a backend websockets server. The connection is automatically upgraded to a websocket connection:

HTTP Response
    Upgrade: WebSocket
    Connection: Upgrade

Proxying requests to a websockets server like echo.websocket.org can be done using the ProxyPass directive:

    ProxyPass "/ws2/" "ws://echo.websocket.org/"
    ProxyPass "/wss2/" "wss://echo.websocket.org/"

Load balancing for multiple backends can be achieved using mod_proxy_balancer.

Directives
• ProxyWebsocketAsync
• ProxyWebsocketAsyncDelay
• ProxyWebsocketIdleTimeout

See also
• mod_proxy

ProxyWebsocketAsync Directive
Description: Instructs this module to try to create an asynchronous tunnel
Syntax: ProxyWebsocketAsync ON|OFF
Context: server config, virtual host
Status: Extension
Module: mod_proxy_wstunnel

This directive instructs the server to try to create an asynchronous tunnel. If the current MPM does not support the necessary features, a synchronous tunnel is used.

Note: Async support is experimental and subject to change.
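As a hedged sketch of where ProxyWebsocketAsync fits, the fragment below requests an asynchronous tunnel for one virtual host. The server name ws.example.com, the /ws/ path and the backend host are placeholders, and since the directive is documented above as experimental, its behaviour may differ between releases.

```apache
<VirtualHost *:80>
    # Hypothetical names; replace with your own host and backend.
    ServerName ws.example.com

    # Tunnel websocket connections under /ws/ to the backend.
    ProxyPass "/ws/" "ws://backend.example.com/ws/"

    # Try to create an asynchronous tunnel; a synchronous tunnel is used
    # if the current MPM does not support the necessary features.
    ProxyWebsocketAsync ON
</VirtualHost>
```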
ProxyWebsocketAsyncDelay Directive
Description: Sets the amount of time the tunnel waits synchronously for data
Syntax: ProxyWebsocketAsyncDelay num[ms]
Default: ProxyWebsocketAsyncDelay 0
Context: server config, virtual host
Status: Extension
Module: mod_proxy_wstunnel

If ProxyWebsocketAsync is enabled, this directive controls how long the server synchronously waits for more data.

Note: Async support is experimental and subject to change.

ProxyWebsocketIdleTimeout Directive
Description: Sets the maximum amount of time to wait for data on the websockets tunnel
Syntax: ProxyWebsocketIdleTimeout num[ms]
Default: ProxyWebsocketIdleTimeout 0
Context: server config, virtual host
Status: Extension
Module: mod_proxy_wstunnel

This directive imposes a maximum amount of time for the tunnel to be left open while idle.

10.95 Apache Module mod_ratelimit

Description: Bandwidth Rate Limiting for Clients
Status: Extension
ModuleIdentifier: ratelimit_module
SourceFile: mod_ratelimit.c

Summary
Provides a filter named RATE_LIMIT to limit client bandwidth. The connection speed to be simulated is specified, in KiB/s, using the environment variable rate-limit.

Example Configuration
    SetOutputFilter RATE_LIMIT
    SetEnv rate-limit 400

Directives
This module provides no directives.

10.96 Apache Module mod_reflector

Description: Reflect a request body as a response via the output filter stack.
Status: Base
ModuleIdentifier: reflector_module
SourceFile: mod_reflector.c
Compatibility: Version 2.3 and later

Summary
This module allows request bodies to be reflected back to the client, in the process passing the request through the output filter stack. A suitably configured chain of filters can be used to transform the request into a response. This module can be used to turn an output filter into an HTTP service.
Directives
• ReflectorHeader

Examples

Compression service
Pass the request body through the DEFLATE filter to compress the body. This request requires a Content-Encoding request header containing "gzip" for the filter to return compressed data.
    SetHandler reflector
    SetOutputFilter DEFLATE

Image downsampling service
Pass the request body through an image downsampling filter, and reflect the results to the caller.
    SetHandler reflector
    SetOutputFilter DOWNSAMPLE

ReflectorHeader Directive
Description: Reflect an input header to the output headers
Syntax: ReflectorHeader inputheader [outputheader]
Context: server config, virtual host, directory, .htaccess
Override: Options
Status: Base
Module: mod_reflector

This directive controls the reflection of request headers to the response. The first argument is the name of the request header to copy. If the optional second argument is specified, it will be used as the name of the response header, otherwise the original request header name will be used.

10.97 Apache Module mod_remoteip

Description: Replaces the original client IP address for the connection with the useragent IP address list presented by a proxy or a load balancer via the request headers.
Status: Base
ModuleIdentifier: remoteip_module
SourceFile: mod_remoteip.c

Summary
This module is used to treat the useragent which initiated the request as the originating useragent as identified by httpd for the purposes of authorization and logging, even where that useragent is behind a load balancer, front end server, or proxy server.

The module overrides the client IP address for the connection with the useragent IP address reported in the request header configured with the RemoteIPHeader directive.

Once replaced as instructed, this overridden useragent IP address is then used for the mod_authz_host Require ip feature, is reported by mod_status, and is recorded by mod_log_config %a and core %a format strings.
The underlying client IP of the connection is available in the %{c}a format string.

Warning
It is critical to only enable this behavior from intermediate hosts (proxies, etc) which are trusted by this server, since it is trivial for the remote useragent to impersonate another useragent.

Directives
• RemoteIPHeader
• RemoteIPInternalProxy
• RemoteIPInternalProxyList
• RemoteIPProxiesHeader
• RemoteIPTrustedProxy
• RemoteIPTrustedProxyList

See also
• mod_authz_host
• mod_status
• mod_log_config

Remote IP Processing
Apache by default identifies the useragent with the connection’s client_ip value, and the connection remote_host and remote_logname are derived from this value. These fields play a role in authentication, authorization and logging and other purposes by other loadable modules.

mod_remoteip overrides the client IP of the connection with the advertised useragent IP as provided by a proxy or load balancer, for the duration of the request. A load balancer might establish a long lived keepalive connection with the server, and each request will have the correct useragent IP, even though the underlying client IP address of the load balancer remains unchanged.

When multiple, comma delimited useragent IP addresses are listed in the header value, they are processed in Right-to-Left order. Processing halts when a given useragent IP address is not trusted to present the preceding IP address. The header field is updated to this remaining list of unconfirmed IP addresses, or if all IP addresses were trusted, this header is removed from the request altogether.

In overriding the client IP, the module stores the list of intermediate hosts in a remoteip-proxy-ip-list note, which mod_log_config can record using the %{remoteip-proxy-ip-list}n format token. If the administrator needs to store this as an additional header, this same value can also be recorded as a header using the directive RemoteIPProxiesHeader.
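The logging tokens mentioned above can be combined in a single log format. The sketch below is illustrative only: 192.0.2.10 stands in for a proxy you actually trust, and the nickname remoteip and the log path are assumptions.

```apache
RemoteIPHeader X-Forwarded-For
# 192.0.2.10 is a placeholder for a trusted front-end proxy.
RemoteIPTrustedProxy 192.0.2.10

# %a      : useragent IP after mod_remoteip has overridden it
# %{c}a   : underlying client IP of the connection (the proxy itself)
# %{remoteip-proxy-ip-list}n : intermediate hosts that were trusted
LogFormat "%a peer=%{c}a via=\"%{remoteip-proxy-ip-list}n\" %t \"%r\" %>s %b" remoteip
CustomLog "logs/access_log" remoteip
```

Under the Right-to-Left rule described above, a request arriving from 192.0.2.10 with the header value "198.51.100.7, 203.0.113.5" would be expected to log 203.0.113.5 as %a. Since 203.0.113.5 is not itself a trusted proxy, processing halts there and 198.51.100.7 is left in the header as unconfirmed.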
Note: IPv4-over-IPv6 Mapped Addresses
As with httpd in general, any IPv4-over-IPv6 mapped addresses are recorded in their IPv4 representation.

Note: Internal (Private) Addresses
All internal addresses in the 10/8, 172.16/12, 192.168/16, 169.254/16 and 127/8 blocks (and IPv6 addresses outside of the public 2000::/3 block) are only evaluated by mod_remoteip when RemoteIPInternalProxy internal (intranet) proxies are registered.

RemoteIPHeader Directive
Description: Declare the header field which should be parsed for useragent IP addresses
Syntax: RemoteIPHeader header-field
Context: server config, virtual host
Status: Base
Module: mod_remoteip

The RemoteIPHeader directive triggers mod_remoteip to treat the value of the specified header-field header as the useragent IP address, or list of intermediate useragent IP addresses, subject to further configuration of the RemoteIPInternalProxy and RemoteIPTrustedProxy directives.

Warning
Unless these other directives are used, mod_remoteip will trust all hosts presenting a non-internal address in the RemoteIPHeader header value.

Internal (Load Balancer) Example
    RemoteIPHeader X-Client-IP

Proxy Example
    RemoteIPHeader X-Forwarded-For

RemoteIPInternalProxy Directive
Description: Declare client intranet IP addresses trusted to present the RemoteIPHeader value
Syntax: RemoteIPInternalProxy proxy-ip|proxy-ip/subnet|hostname ...
Context: server config, virtual host
Status: Base
Module: mod_remoteip

The RemoteIPInternalProxy directive adds one or more addresses (or address blocks) to trust as presenting a valid RemoteIPHeader value of the useragent IP. Unlike the RemoteIPTrustedProxy directive, any IP address presented in this header, including private intranet addresses, is trusted when passed from these proxies.
Internal (Load Balancer) Example
    RemoteIPHeader X-Client-IP
    RemoteIPInternalProxy 10.0.2.0/24
    RemoteIPInternalProxy gateway.localdomain

RemoteIPInternalProxyList Directive
Description: Declare client intranet IP addresses trusted to present the RemoteIPHeader value
Syntax: RemoteIPInternalProxyList filename
Context: server config, virtual host
Status: Base
Module: mod_remoteip

The RemoteIPInternalProxyList directive specifies a file parsed at startup, and builds a list of addresses (or address blocks) to trust as presenting a valid RemoteIPHeader value of the useragent IP.

The ’#’ hash character designates a comment line, otherwise each whitespace or newline separated entry is processed identically to the RemoteIPInternalProxy directive.

Internal (Load Balancer) Example
    RemoteIPHeader X-Client-IP
    RemoteIPInternalProxyList conf/trusted-proxies.lst

conf/trusted-proxies.lst contents
    # Our internally trusted proxies;
    10.0.2.0/24         #Everyone in the testing group
    gateway.localdomain #The front end balancer

RemoteIPProxiesHeader Directive
Description: Declare the header field which will record all intermediate IP addresses
Syntax: RemoteIPProxiesHeader HeaderFieldName
Context: server config, virtual host
Status: Base
Module: mod_remoteip

The RemoteIPProxiesHeader directive specifies a header into which mod_remoteip will collect a list of all of the intermediate client IP addresses trusted to resolve the useragent IP of the request. Note that intermediate RemoteIPTrustedProxy addresses are recorded in this header, while any intermediate RemoteIPInternalProxy addresses are discarded.

Example
    RemoteIPHeader X-Forwarded-For
    RemoteIPProxiesHeader X-Forwarded-By

RemoteIPTrustedProxy Directive
Description: Restrict client IP addresses trusted to present the RemoteIPHeader value
Syntax: RemoteIPTrustedProxy proxy-ip|proxy-ip/subnet|hostname ...
Context: server config, virtual host
Status: Base
Module: mod_remoteip

The RemoteIPTrustedProxy directive restricts which peer IP addresses (or address blocks) will be trusted to present a valid RemoteIPHeader value of the useragent IP. Unlike the RemoteIPInternalProxy directive, any intranet or private IP address reported by such proxies, including the 10/8, 172.16/12, 192.168/16, 169.254/16 and 127/8 blocks (or outside of the IPv6 public 2000::/3 block), is not trusted as the useragent IP, and is left in the RemoteIPHeader header’s value.

Warning
By default, mod_remoteip will trust all hosts presenting a non-internal address in the RemoteIPHeader header value.

Trusted (Load Balancer) Example
    RemoteIPHeader X-Forwarded-For
    RemoteIPTrustedProxy 10.0.2.16/28
    RemoteIPTrustedProxy proxy.example.com

RemoteIPTrustedProxyList Directive
Description: Restrict client IP addresses trusted to present the RemoteIPHeader value
Syntax: RemoteIPTrustedProxyList filename
Context: server config, virtual host
Status: Base
Module: mod_remoteip

The RemoteIPTrustedProxyList directive specifies a file parsed at startup, and builds a list of addresses (or address blocks) to trust as presenting a valid RemoteIPHeader value of the useragent IP.

The ’#’ hash character designates a comment line, otherwise each whitespace or newline separated entry is processed identically to the RemoteIPTrustedProxy directive.

Trusted (Load Balancer) Example
    RemoteIPHeader X-Forwarded-For
    RemoteIPTrustedProxyList conf/trusted-proxies.lst

conf/trusted-proxies.lst contents
    # Identified external proxies;
    192.0.2.16/28         #wap phone group of proxies
    proxy.isp.example.com #some well known ISP

10.98 Apache Module mod_reqtimeout

Description: Set timeout and minimum data rate for receiving requests
Status: Extension
ModuleIdentifier: reqtimeout_module
SourceFile: mod_reqtimeout.c

Directives
• RequestReadTimeout

Examples
1. Allow 10 seconds to receive the request including the headers and 30 seconds for receiving the request body:
       RequestReadTimeout header=10 body=30
2. Allow at least 10 seconds to receive the request body. If the client sends data, increase the timeout by 1 second for every 1000 bytes received, with no upper limit for the timeout (except for the limit given indirectly by LimitRequestBody):
       RequestReadTimeout body=10,MinRate=1000
3. Allow at least 10 seconds to receive the request including the headers. If the client sends data, increase the timeout by 1 second for every 500 bytes received. But do not allow more than 30 seconds for the request including the headers:
       RequestReadTimeout header=10-30,MinRate=500
4. Usually, a server should have both header and body timeouts configured. If a common configuration is used for http and https virtual hosts, the timeouts should not be set too low:
       RequestReadTimeout header=20-40,MinRate=500 body=20,MinRate=500

RequestReadTimeout Directive
Description: Set timeout values for receiving request headers and body from client.
Syntax: RequestReadTimeout [header=timeout[-maxtimeout][,MinRate=rate]] [body=timeout[-maxtimeout][,MinRate=rate]]
Default: header=20-40,MinRate=500 body=20,MinRate=500
Context: server config, virtual host
Status: Extension
Module: mod_reqtimeout
Compatibility: Defaulted to disabled in version 2.3.14 and earlier.

This directive can set various timeouts for receiving the request headers and the request body from the client. If the client fails to send headers or body within the configured time, a 408 REQUEST TIME OUT error is sent.

For SSL virtual hosts, the header timeout values include the time needed to do the initial SSL handshake. If the user’s browser is configured to query certificate revocation lists and the CRL server is not reachable, the initial SSL handshake may take a significant time until the browser gives up waiting for the CRL.
Therefore the header timeout values should not be set to very low values for SSL virtual hosts. The body timeout values include the time needed for SSL renegotiation (if necessary).

When an AcceptFilter is in use (usually the case on Linux and FreeBSD), the socket is not sent to the server process before at least one byte (or the whole request for httpready) is received. The header timeout configured with RequestReadTimeout is only effective after the server process has received the socket.

For each of the two timeout types (header or body), there are three ways to specify the timeout:

• Fixed timeout value:
      type=timeout
  The time in seconds allowed for reading all of the request headers or body, respectively. A value of 0 means no limit.
• Disable module for a vhost:
      header=0 body=0
  This disables mod_reqtimeout completely.
• Timeout value that is increased when data is received:
      type=timeout,MinRate=data_rate
  Same as above, but whenever data is received, the timeout value is increased according to the specified minimum data rate (in bytes per second).
• Timeout value that is increased when data is received, with an upper bound:
      type=timeout-maxtimeout,MinRate=data_rate
  Same as above, but the timeout will not be increased above the second value of the specified timeout range.

10.99 Apache Module mod_request

Description: Filters to handle and make available HTTP request bodies
Status: Base
ModuleIdentifier: request_module
SourceFile: mod_request.c
Compatibility: Available in Apache 2.3 and later

Directives
• KeptBodySize

KeptBodySize Directive
Description: Keep the request body instead of discarding it up to the specified maximum size, for potential use by filters such as mod_include.
KeptBodySize maximum size in bytes KeptBodySize 0 directory Base mod request Under normal circumstances, request handlers such as the default handler for static files will discard the request body when it is not needed by the request handler. As a result, filters such as mod include are limited to making GET requests only when including other URLs as subrequests, even if the original request was a POST request, as the discarded request body is no longer available once filter processing is taking place. When this directive has a value greater than zero, request handlers that would otherwise discard request bodies will instead set the request body aside for use by filters up to the maximum size specified. In the case of the mod include filter, an attempt to POST a request to the static shtml file will cause any subrequests to be POST requests, instead of GET requests as before. This feature makes it possible to break up complex web pages and web applications into small individual components, and combine the components and the surrounding web page structure together using MOD INCLUDE. The components can take the form of CGI programs, scripted languages, or URLs reverse proxied into the URL space from another server using MOD PROXY. Note: Each request set aside has to be set aside in temporary RAM until the request is complete. As a result, care should be taken to ensure sufficient RAM is available on the server to support the intended load. Use of this directive should be limited to where needed on targeted parts of your URL space, and with the lowest possible value that is still big enough to hold a request body. If the request size sent by the client exceeds the maximum size allocated by this directive, the server will return 413 Request Entity Too Large. See also • mod include (p. 667) documentation • mod auth form (p. 466) documentation 10.100. 
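As a rough illustration of the KeptBodySize description above, the following configuration fragment is a minimal sketch rather than a recommended setup. It assumes that mod_request and mod_include are both loaded; the /forms/ URL prefix and the 16384-byte limit are hypothetical values chosen only for this example.

```apache
# Hypothetical URL space: server-parsed pages under /forms/ that receive POSTs.
<Location "/forms/">
    # Let mod_include parse the responses (assumes mod_include is loaded).
    Options +Includes
    SetOutputFilter INCLUDES

    # Set aside up to 16 KiB of each request body so that it stays
    # available to filters. The value 16384 is an arbitrary example.
    KeptBodySize 16384
</Location>
```

Scoping the directive to a single location and keeping the limit small follows the note above about RAM use. With this sketch, a request body larger than the configured limit would be answered with 413 Request Entity Too Large.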
10.100 Apache Module mod_rewrite

Description: Provides a rule-based rewriting engine to rewrite requested URLs on the fly
Status: Extension
ModuleIdentifier: rewrite_module
SourceFile: mod_rewrite.c

Summary

The mod_rewrite module uses a rule-based rewriting engine, based on a PCRE regular-expression parser, to rewrite requested URLs on the fly. By default, mod_rewrite maps a URL to a filesystem path. However, it can also be used to redirect one URL to another URL, or to invoke an internal proxy fetch.

mod_rewrite provides a flexible and powerful way to manipulate URLs using an unlimited number of rules. Each rule can have an unlimited number of attached rule conditions, to allow you to rewrite URLs based on server variables, environment variables, HTTP headers, or time stamps.

mod_rewrite operates on the full URL path, including the path-info section. A rewrite rule can be invoked in httpd.conf or in .htaccess. The path generated by a rewrite rule can include a query string, or can lead to internal sub-processing, external request redirection, or internal proxy throughput.

Further details, discussion, and examples are provided in the detailed mod_rewrite documentation (p. 146).

Directives
• RewriteBase
• RewriteCond
• RewriteEngine
• RewriteMap
• RewriteOptions
• RewriteRule

Logging

mod_rewrite offers detailed logging of its actions at the trace1 to trace8 log levels. The log level can be set specifically for mod_rewrite using the LogLevel directive: Up to level debug, no actions are logged, while trace8 means that practically all actions are logged.

=⇒Using a high trace log level for mod_rewrite will slow down your Apache HTTP Server dramatically! Use a log level higher than trace2 only for debugging!

Example
LogLevel alert rewrite:trace3

=⇒RewriteLog
Those familiar with earlier versions of mod_rewrite will no doubt be looking for the RewriteLog and RewriteLogLevel directives.
This functionality has been completely replaced by the new per-module logging configuration mentioned above. To get just the mod_rewrite-specific log messages, pipe the log file through grep:

tail -f error_log|fgrep '[rewrite:'

RewriteBase Directive

Description: Sets the base URL for per-directory rewrites
Syntax: RewriteBase URL-path
Default: None
Context: directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_rewrite

The RewriteBase directive specifies the URL prefix to be used for per-directory (htaccess) RewriteRule directives that substitute a relative path.

This directive is required when you use a relative path in a substitution in per-directory (htaccess) context unless any of the following conditions is true:

• The original request, and the substitution, are underneath the DocumentRoot (as opposed to reachable by other means, such as Alias).
• The filesystem path to the directory containing the RewriteRule, suffixed by the relative substitution, is also valid as a URL path on the server (this is rare).
• In Apache HTTP Server 2.4.16 and later, this directive may be omitted when the request is mapped via Alias or mod_userdir.

In the example below, RewriteBase is necessary to avoid rewriting to http://example.com/opt/myapp-1.2.3/welcome.html since the resource was not relative to the document root. This misconfiguration would normally cause the server to look for an "opt" directory under the document root.

DocumentRoot "/var/www/example.com"
AliasMatch "^/myapp" "/opt/myapp-1.2.3"
<Directory "/opt/myapp-1.2.3">
    RewriteEngine On
    RewriteBase "/myapp/"
    RewriteRule "^index\.html$" "welcome.html"
</Directory>

RewriteCond Directive

Description: Defines a condition under which rewriting will take place
Syntax: RewriteCond TestString CondPattern
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_rewrite

The RewriteCond directive defines a rule condition.
One or more RewriteCond can precede a RewriteRule directive. The following rule is then only used if both the current state of the URI matches its pattern, and if these conditions are met.

TestString is a string which can contain the following expanded constructs in addition to plain text:

• RewriteRule backreferences: These are backreferences of the form $N (0 <= N <= 9). $1 to $9 provide access to the grouped parts (in parentheses) of the pattern, from the RewriteRule which is subject to the current set of RewriteCond conditions. $0 provides access to the whole string matched by that pattern.
• RewriteCond backreferences: These are backreferences of the form %N (0 <= N <= 9). %1 to %9 provide access to the grouped parts (again, in parentheses) of the pattern, from the last matched RewriteCond in the current set of conditions. %0 provides access to the whole string matched by that pattern.
• RewriteMap expansions: These are expansions of the form ${mapname:key|default}. See the documentation for RewriteMap for more details.
• Server-Variables: These are variables of the form %{ NAME_OF_VARIABLE } where NAME_OF_VARIABLE can be a string taken from the following list:

HTTP headers: HTTP_ACCEPT, HTTP_COOKIE, HTTP_FORWARDED, HTTP_HOST, HTTP_PROXY_CONNECTION, HTTP_REFERER, HTTP_USER_AGENT

connection & request: AUTH_TYPE, CONN_REMOTE_ADDR, CONTEXT_PREFIX, CONTEXT_DOCUMENT_ROOT, IPV6, PATH_INFO, QUERY_STRING, REMOTE_ADDR, REMOTE_HOST, REMOTE_IDENT, REMOTE_PORT, REMOTE_USER, REQUEST_METHOD, SCRIPT_FILENAME

server internals: DOCUMENT_ROOT, SCRIPT_GROUP, SCRIPT_USER, SERVER_ADDR, SERVER_ADMIN, SERVER_NAME, SERVER_PORT, SERVER_PROTOCOL, SERVER_SOFTWARE

date and time: TIME_YEAR, TIME_MON, TIME_DAY, TIME_HOUR, TIME_MIN, TIME_SEC, TIME_WDAY, TIME

specials: API_VERSION, CONN_REMOTE_ADDR, HTTPS, IS_SUBREQ, REMOTE_ADDR, REQUEST_FILENAME, REQUEST_SCHEME, REQUEST_URI, THE_REQUEST

These variables all correspond to the similarly named HTTP MIME-headers, C variables of the Apache HTTP Server or struct tm fields of the Unix system. Most are documented here (p. 99) or elsewhere in the Manual or in the CGI specification.

SERVER_NAME and SERVER_PORT depend on the values of UseCanonicalName and UseCanonicalPhysicalPort respectively.

Those that are special to mod_rewrite include those below.

API_VERSION
This is the version of the Apache httpd module API (the internal interface between server and module) in the current httpd build, as defined in include/ap_mmn.h. The module API version corresponds to the version of Apache httpd in use (in the release version of Apache httpd 1.3.14, for instance, it is 19990320:10), but is mainly of interest to module authors.

CONN_REMOTE_ADDR
Since 2.4.8: The peer IP address of the connection (see the mod_remoteip module).

HTTPS
Will contain the text "on" if the connection is using SSL/TLS, or "off" otherwise. (This variable can be safely used regardless of whether or not mod_ssl is loaded).

IS_SUBREQ
Will contain the text "true" if the request currently being processed is a sub-request, "false" otherwise.
Sub-requests may be generated by modules that need to resolve additional files or URIs in order to complete their tasks.

REMOTE_ADDR
The IP address of the remote host (see the mod_remoteip module).

REQUEST_FILENAME
The full local filesystem path to the file or script matching the request, if this has already been determined by the server at the time REQUEST_FILENAME is referenced. Otherwise, such as when used in virtual host context, the same value as REQUEST_URI. Depending on the value of AcceptPathInfo, the server may have only used some leading components of the REQUEST_URI to map the request to a file.

REQUEST_SCHEME
Will contain the scheme of the request (usually "http" or "https"). This value can be influenced with ServerName.

REQUEST_URI
The path component of the requested URI, such as "/index.html". This notably excludes the query string which is available as its own variable named QUERY_STRING.

THE_REQUEST
The full HTTP request line sent by the browser to the server (e.g., "GET /index.html HTTP/1.1"). This does not include any additional headers sent by the browser. This value has not been unescaped (decoded), unlike most other variables below.

If the TestString has the special value expr, the CondPattern will be treated as an ap_expr (p. 99). HTTP headers referenced in the expression will be added to the Vary header if the novary flag is not given.

Other things you should be aware of:

1. The variables SCRIPT_FILENAME and REQUEST_FILENAME contain the same value - the value of the filename field of the internal request_rec structure of the Apache HTTP Server. The first name is the commonly known CGI variable name while the second is the appropriate counterpart of REQUEST_URI (which contains the value of the uri field of request_rec). If a substitution occurred and the rewriting continues, the value of both variables will be updated accordingly.
If used in per-server context (i.e., before the request is mapped to the filesystem) SCRIPT_FILENAME and REQUEST_FILENAME cannot contain the full local filesystem path since the path is unknown at this stage of processing. Both variables will initially contain the value of REQUEST_URI in that case. In order to obtain the full local filesystem path of the request in per-server context, use a URL-based look-ahead %{LA-U:REQUEST_FILENAME} to determine the final value of REQUEST_FILENAME.

2. %{ENV:variable}, where variable can be any environment variable, is also available. This is looked up via internal Apache httpd structures and (if not found there) via getenv() from the Apache httpd server process.

3. %{SSL:variable}, where variable is the name of an SSL environment variable (p. 916), can be used whether or not mod_ssl is loaded, but will always expand to the empty string if it is not. Example: %{SSL:SSL_CIPHER_USEKEYSIZE} may expand to 128. These variables are available even without setting the StdEnvVars option of the SSLOptions directive.

4. %{HTTP:header}, where header can be any HTTP MIME-header name, can always be used to obtain the value of a header sent in the HTTP request. Example: %{HTTP:Proxy-Connection} is the value of the HTTP header “Proxy-Connection:”. If an HTTP header is used in a condition, this header is added to the Vary header of the response in case the condition evaluates to true for the request. It is not added if the condition evaluates to false for the request. Adding the HTTP header to the Vary header of the response is needed for proper caching. Keep in mind that conditions follow a short circuit logic in the case of the 'ornext|OR' flag, so certain conditions might not be evaluated at all.

5. %{LA-U:variable} can be used for look-aheads which perform an internal (URL-based) sub-request to determine the final value of variable.
This can be used to access a variable for rewriting that is not available at the current stage, but will be set in a later phase. For instance, to rewrite according to the REMOTE_USER variable from within the per-server context (httpd.conf file) you must use %{LA-U:REMOTE_USER}, since this variable is set by the authorization phases, which come after the URL translation phase (during which mod_rewrite operates). On the other hand, because mod_rewrite implements its per-directory context (.htaccess file) via the Fixup phase of the API and because the authorization phases come before this phase, you can simply use %{REMOTE_USER} in that context.

6. %{LA-F:variable} can be used to perform an internal (filename-based) sub-request, to determine the final value of variable. Most of the time, this is the same as LA-U above.

CondPattern is the condition pattern, a regular expression which is applied to the current instance of the TestString. TestString is first evaluated, before being matched against CondPattern.

CondPattern is usually a perl compatible regular expression, but there is additional syntax available to perform other useful tests against the TestString:

1. You can prefix the pattern string with a '!' character (exclamation mark) to negate the result of the condition, no matter what kind of CondPattern is used.

2. You can perform lexicographical string comparisons:

<CondPattern Lexicographically precedes. Treats the CondPattern as a plain string and compares it lexicographically to TestString. True if TestString lexicographically precedes CondPattern.

>CondPattern Lexicographically follows. Treats the CondPattern as a plain string and compares it lexicographically to TestString. True if TestString lexicographically follows CondPattern.

=CondPattern Lexicographically equal. Treats the CondPattern as a plain string and compares it lexicographically to TestString. True if TestString is lexicographically equal to CondPattern (the two strings are exactly equal, character for character). If CondPattern is "" (two quotation marks) this compares TestString to the empty string.

<=CondPattern Lexicographically less than or equal to. Treats the CondPattern as a plain string and compares it lexicographically to TestString. True if TestString lexicographically precedes CondPattern, or is equal to CondPattern (the two strings are equal, character for character).

>=CondPattern Lexicographically greater than or equal to. Treats the CondPattern as a plain string and compares it lexicographically to TestString. True if TestString lexicographically follows CondPattern, or is equal to CondPattern (the two strings are equal, character for character).

3. You can perform integer comparisons:

-eq Is numerically equal to. The TestString is treated as an integer, and is numerically compared to the CondPattern. True if the two are numerically equal.

-ge Is numerically greater than or equal to. The TestString is treated as an integer, and is numerically compared to the CondPattern. True if the TestString is numerically greater than or equal to the CondPattern.

-gt Is numerically greater than. The TestString is treated as an integer, and is numerically compared to the CondPattern. True if the TestString is numerically greater than the CondPattern.

-le Is numerically less than or equal to. The TestString is treated as an integer, and is numerically compared to the CondPattern. True if the TestString is numerically less than or equal to the CondPattern. Avoid confusion with the -l by using the -L or -h variant.

-lt Is numerically less than. The TestString is treated as an integer, and is numerically compared to the CondPattern. True if the TestString is numerically less than the CondPattern. Avoid confusion with the -l by using the -L or -h variant.

-ne Is numerically not equal to. The TestString is treated as an integer, and is numerically compared to the CondPattern. True if the two are numerically different. This is equivalent to !-eq.

4. You can perform various file attribute tests:

-d Is directory.
Treats the TestString as a pathname and tests whether or not it exists, and is a directory.

-f Is regular file. Treats the TestString as a pathname and tests whether or not it exists, and is a regular file.

-F Is existing file, via subrequest. Checks whether or not TestString is a valid file, accessible via all the server’s currently-configured access controls for that path. This uses an internal subrequest to do the check, so use it with care - it can impact your server’s performance!

-h Is symbolic link, bash convention. See -l.

-l Is symbolic link. Treats the TestString as a pathname and tests whether or not it exists, and is a symbolic link. May also use the bash convention of -L or -h if there’s a possibility of confusion such as when using the -lt or -le tests.

-L Is symbolic link, bash convention. See -l.

-s Is regular file, with size. Treats the TestString as a pathname and tests whether or not it exists, and is a regular file with size greater than zero.

-U Is existing URL, via subrequest. Checks whether or not TestString is a valid URL, accessible via all the server’s currently-configured access controls for that path. This uses an internal subrequest to do the check, so use it with care - it can impact your server’s performance! This flag only returns information about things like access control, authentication, and authorization. This flag does not return information about the status code the configured handler (static file, CGI, proxy, etc.) would have returned.

-x Has executable permissions. Treats the TestString as a pathname and tests whether or not it exists, and has executable permissions. These permissions are determined according to the underlying OS.

For example:

RewriteCond /var/www/%{REQUEST_URI} !-f
RewriteRule ^(.+) /other/archive/$1 [R]

5. If the TestString has the special value expr, the CondPattern will be treated as an ap_expr (p. 99).
In the example below, -strmatch is used to compare the REFERER against the site hostname, to block unwanted hotlinking.

RewriteCond expr "! %{HTTP_REFERER} -strmatch '*://%{HTTP_HOST}/*'"
RewriteRule "^/images" "-" [F]

6. You can also set special flags for CondPattern by appending [flags] as the third argument to the RewriteCond directive, where flags is a comma-separated list of any of the following flags:

• 'nocase|NC' (no case)
  This makes the test case-insensitive - differences between 'A-Z' and 'a-z' are ignored, both in the expanded TestString and the CondPattern. This flag is effective only for comparisons between TestString and CondPattern. It has no effect on filesystem and subrequest checks.

• 'ornext|OR' (or next condition)
  Use this to combine rule conditions with a local OR instead of the implicit AND. Typical example:

  RewriteCond "%{REMOTE_HOST}" "^host1" [OR]
  RewriteCond "%{REMOTE_HOST}" "^host2" [OR]
  RewriteCond "%{REMOTE_HOST}" "^host3"
  RewriteRule ...some special stuff for any of these hosts...

  Without this flag you would have to write the condition/rule pair three times.

• 'novary|NV' (no vary)
  If an HTTP header is used in the condition, this flag prevents this header from being added to the Vary header of the response. Using this flag might break proper caching of the response if the representation of this response varies on the value of this header. So this flag should be only used if the meaning of the Vary header is well understood.

Example: To rewrite the Homepage of a site according to the “User-Agent:” header of the request, you can use the following:

RewriteCond "%{HTTP_USER_AGENT}" "(iPhone|Blackberry|Android)"
RewriteRule "^/$" "/homepage.mobile.html" [L]

RewriteRule "^/$" "/homepage.std.html" [L]

Explanation: If you use a browser which identifies itself as a mobile browser (note that the example is incomplete, as there are many other mobile platforms), the mobile version of the homepage is served.
Otherwise, the standard page is served.

RewriteEngine Directive

Description: Enables or disables runtime rewriting engine
Syntax: RewriteEngine on|off
Default: RewriteEngine off
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_rewrite

The RewriteEngine directive enables or disables the runtime rewriting engine. If it is set to off this module does no runtime processing at all. It does not even update the SCRIPT_URx environment variables.

Use this directive to disable rules in a particular context, rather than commenting out all the RewriteRule directives.

Note that rewrite configurations are not inherited by virtual hosts. This means that you need to have a RewriteEngine on directive for each virtual host in which you wish to use rewrite rules.

RewriteMap directives of the type prg are not started during server initialization if they’re defined in a context that does not have RewriteEngine set to on.

RewriteMap Directive

Description: Defines a mapping function for key-lookup
Syntax: RewriteMap MapName MapType:MapSource MapTypeOptions
Context: server config, virtual host
Status: Extension
Module: mod_rewrite

The RewriteMap directive defines a Rewriting Map which can be used inside rule substitution strings by the mapping-functions to insert/substitute fields through a key lookup. The source of this lookup can be of various types.

The MapName is the name of the map and will be used to specify a mapping-function for the substitution strings of a rewriting rule via one of the following constructs:

${ MapName : LookupKey }
${ MapName : LookupKey | DefaultValue }

When such a construct occurs, the map MapName is consulted and the key LookupKey is looked up. If the key is found, the map-function construct is substituted by SubstValue. If the key is not found then it is substituted by DefaultValue or by the empty string if no DefaultValue was specified.
Empty values behave as if the key was absent, therefore it is not possible to distinguish between empty-valued keys and absent keys.

For example, you might define a RewriteMap as:

RewriteMap examplemap "txt:/path/to/file/map.txt"

You would then be able to use this map in a RewriteRule as follows:

RewriteRule "^/ex/(.*)" "${examplemap:$1}"

The meaning of the MapTypeOptions argument depends on the particular MapType. See Using RewriteMap (p. 166) for more information.

The following combinations for MapType and MapSource can be used:

txt
A plain text file containing space-separated key-value pairs, one per line. (Details ... (p. 166))

rnd
Randomly selects an entry from a plain text file. (Details ... (p. 166))

dbm
Looks up an entry in a dbm file containing name, value pairs. Hash is constructed from a plain text file format using the httxt2dbm (p. 328) utility. (Details ... (p. 166))

int
One of the four available internal functions provided by RewriteMap: toupper, tolower, escape or unescape. (Details ... (p. 166))

prg
Calls an external program or script to process the rewriting. (Details ... (p. 166))

dbd or fastdbd
A SQL SELECT statement to be performed to look up the rewrite target. (Details ... (p. 166))

Further details, and numerous examples, may be found in the RewriteMap HowTo (p. 166).

RewriteOptions Directive

Description: Sets some special options for the rewrite engine
Syntax: RewriteOptions Options
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_rewrite

The RewriteOptions directive sets some special options for the current per-server or per-directory configuration. The Option string can currently only be one of the following:

Inherit
This forces the current configuration to inherit the configuration of the parent. In per-virtual-server context, this means that the maps, conditions and rules of the main server are inherited.
In per-directory context this means that conditions and rules of the parent directory’s .htaccess configuration or <Directory> sections are inherited. The inherited rules are virtually copied to the section where this directive is being used. If used in combination with local rules, the inherited rules are copied behind the local rules. The position of this directive - below or above local rules - has no influence on this behavior. If local rules forced the rewriting to stop, the inherited rules won’t be processed.

! Rules inherited from the parent scope are applied after rules specified in the child scope.

InheritBefore
Like Inherit above, but the rules from the parent scope are applied before rules specified in the child scope. Available in Apache HTTP Server 2.3.10 and later.

InheritDown
If this option is enabled, all child configurations will inherit the configuration of the current configuration. It is equivalent to specifying RewriteOptions Inherit in all child configurations. See the Inherit option for more details on how the parent-child relationships are handled. Available in Apache HTTP Server 2.4.8 and later.

InheritDownBefore
Like InheritDown above, but the rules from the current scope are applied before rules specified in any child’s scope. Available in Apache HTTP Server 2.4.8 and later.

IgnoreInherit
This option forces the current and child configurations to ignore all rules that would be inherited from a parent specifying InheritDown or InheritDownBefore. Available in Apache HTTP Server 2.4.8 and later.

AllowNoSlash
By default, mod_rewrite will ignore URLs that map to a directory on disk but lack a trailing slash, in the expectation that the mod_dir module will issue the client with a redirect to the canonical URL with a trailing slash. When the DirectorySlash directive is set to off, the AllowNoSlash option can be enabled to ensure that rewrite rules are no longer ignored.
This option makes it possible to apply rewrite rules within .htaccess files that match the directory without a trailing slash, if so desired. Available in Apache HTTP Server 2.4.0 and later.

AllowAnyURI
When RewriteRule is used in VirtualHost or server context with version 2.2.22 or later of httpd, mod_rewrite will only process the rewrite rules if the request URI is a URL-path (p. 377). This avoids some security issues where particular rules could allow "surprising" pattern expansions (see CVE-2011-3368 [78] and CVE-2011-4317 [79]). To lift the restriction on matching a URL-path, the AllowAnyURI option can be enabled, and mod_rewrite will apply the rule set to any request URI string, regardless of whether that string matches the URL-path grammar required by the HTTP specification. Available in Apache HTTP Server 2.4.3 and later.

! Security Warning
Enabling this option will make the server vulnerable to security issues if used with rewrite rules which are not carefully authored. It is strongly recommended that this option is not used. In particular, beware of input strings containing the '@' character which could change the interpretation of the transformed URI, as per the above CVE names.

[78] http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2011-3368
[79] http://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2011-4317

MergeBase
With this option, the value of RewriteBase is copied from where it’s explicitly defined into any subdirectory or sub-location that doesn’t define its own RewriteBase. This was the default behavior in 2.4.0 through 2.4.3, and the flag to restore it is available in Apache HTTP Server 2.4.4 and later.

IgnoreContextInfo
When a relative substitution is made in directory (htaccess) context and RewriteBase has not been set, this module uses some extended URL and filesystem context information to change the relative substitution back into a URL. Modules such as mod_userdir and mod_alias supply this extended context info.
Available in 2.4.16 and later.

RewriteRule Directive

Description: Defines rules for the rewriting engine
Syntax: RewriteRule Pattern Substitution [flags]
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_rewrite

The RewriteRule directive is the real rewriting workhorse. The directive can occur more than once, with each instance defining a single rewrite rule. The order in which these rules are defined is important - this is the order in which they will be applied at run-time.

Pattern is a perl compatible regular expression. On the first RewriteRule, it is matched against the (%-decoded) URL-path (p. 377) of the request, or, in per-directory context (see below), the URL path relative to that per-directory context. Subsequent patterns are matched against the output of the last matching RewriteRule.

=⇒What is matched?
In VirtualHost context, the Pattern will initially be matched against the part of the URL after the hostname and port, and before the query string (e.g. "/app1/index.html").

In Directory and htaccess context, the Pattern will initially be matched against the filesystem path, after removing the prefix that led the server to the current RewriteRule (e.g. "app1/index.html" or "index.html" depending on where the directives are defined).

If you wish to match against the hostname, port, or query string, use a RewriteCond with the %{HTTP_HOST}, %{SERVER_PORT}, or %{QUERY_STRING} variables respectively.

In any case, remember that regular expressions are substring matches. That is, you don’t need the regex to describe the entire string, just the part that you wish to match. Thus, using a regex of . is often sufficient rather than .*, and the regex abc is not the same as ^abc$.

=⇒Per-directory Rewrites
• The rewrite engine may be used in .htaccess (p. 249) files and in <Directory> sections, with some additional complexity.
• To enable the rewrite engine in this context, you need to set "RewriteEngine On" and "Options FollowSymLinks" must be enabled. If your administrator has disabled override of FollowSymLinks for a user’s directory, then you cannot use the rewrite engine. This restriction is required for security reasons.
• When using the rewrite engine in .htaccess files the per-directory prefix (which always is the same for a specific directory) is automatically removed for the RewriteRule pattern matching and automatically added after any relative (not starting with a slash or protocol name) substitution encounters the end of a rule set. See the RewriteBase directive for more information regarding what prefix will be added back to relative substitutions.
• If you wish to match against the full URL-path in a per-directory (htaccess) RewriteRule, use the %{REQUEST_URI} variable in a RewriteCond.
• The removed prefix always ends with a slash, meaning the matching occurs against a string which never has a leading slash. Therefore, a Pattern with ^/ never matches in per-directory context.
• Although rewrite rules are syntactically permitted in <Location> and <Files> sections (including their regular expression counterparts), this should never be necessary and is unsupported. A likely feature to break in these contexts is relative substitutions.

For some hints on regular expressions, see the mod_rewrite Introduction (p. 147).

In mod_rewrite, the NOT character ('!') is also available as a possible pattern prefix. This enables you to negate a pattern; to say, for instance: “if the current URL does NOT match this pattern”. This can be used for exceptional cases, where it is easier to match the negative pattern, or as a last default rule.

=⇒Note
When using the NOT character to negate a pattern, you cannot include grouped wildcard parts in that pattern. This is because, when the pattern does NOT match (i.e., the negation matches), there are no contents for the groups.
Thus, if negated patterns are used, you cannot use $N in the substitution string!

The Substitution of a rewrite rule is the string that replaces the original URL-path that was matched by Pattern. The Substitution may be a:

file-system path
    Designates the location on the file-system of the resource to be delivered to the client. Substitutions are only treated as a file-system path when the rule is configured in server (virtualhost) context and the first component of the path in the substitution exists in the file-system.

URL-path
    A DocumentRoot-relative path to the resource to be served. Note that mod_rewrite tries to guess whether you have specified a file-system path or a URL-path by checking to see if the first segment of the path exists at the root of the file-system. For example, if you specify a Substitution string of /www/file.html, then this will be treated as a URL-path unless a directory named www exists at the root of your file-system (or, in the case of using rewrites in a .htaccess file, relative to your document root), in which case it will be treated as a file-system path. If you wish other URL-mapping directives (such as Alias) to be applied to the resulting URL-path, use the [PT] flag as described below.

Absolute URL
    If an absolute URL is specified, mod_rewrite checks to see whether the hostname matches the current host. If it does, the scheme and hostname are stripped out and the resulting path is treated as a URL-path. Otherwise, an external redirect is performed for the given URL. To force an external redirect back to the current host, see the [R] flag below.

- (dash)
    A dash indicates that no substitution should be performed (the existing path is passed through untouched). This is used when a flag (see below) needs to be applied without changing the path.

In addition to plain text, the Substitution string can include
1. back-references ($N) to the RewriteRule pattern
2. back-references (%N) to the last matched RewriteCond pattern
3. server-variables as in rule condition test-strings (%{VARNAME})
4. mapping-function calls (${mapname:key|default})

Back-references are identifiers of the form $N (N=0..9), which will be replaced by the contents of the Nth group of the matched Pattern. The server-variables are the same as for the TestString of a RewriteCond directive. The mapping-functions come from the RewriteMap directive and are explained there. These three types of variables are expanded in the order above.

Rewrite rules are applied to the results of previous rewrite rules, in the order in which they are defined in the config file. The URL-path or file-system path (see "What is matched?", above) is completely replaced by the Substitution and the rewriting process continues until all rules have been applied, or it is explicitly terminated by an L flag (p. 178), or other flag which implies immediate termination, such as END or F.

=⇒ Modifying the Query String
By default, the query string is passed through unchanged. You can, however, create URLs in the substitution string containing a query string part. Simply use a question mark inside the substitution string to indicate that the following text should be re-injected into the query string. When you want to erase an existing query string, end the substitution string with just a question mark. To combine new and old query strings, use the [QSA] flag.

Additionally you can set special actions to be performed by appending [flags] as the third argument to the RewriteRule directive. Flags is a comma-separated list, surrounded by square brackets, of any of the flags in the following table. More details, and examples, for each flag, are available in the Rewrite Flags document (p. 178).

Flag and syntax
    Function

B
    Escape non-alphanumeric characters in backreferences before applying the transformation.
backrefnoplus|BNP
    If backreferences are being escaped, spaces should be escaped to %20 instead of +. Useful when the backreference will be used in the path component rather than the query string.
chain|C
    Rule is chained to the following rule. If the rule fails, the rule(s) chained to it will be skipped.
cookie|CO=NAME:VAL
    Sets a cookie in the client browser. Full syntax is: CO=NAME:VAL:domain[:lifetime[:path[:secure[:httponly]]]]
discardpath|DPI
    Causes the PATH_INFO portion of the rewritten URI to be discarded.
END
    Stop the rewriting process immediately and don’t apply any more rules. Also prevents further execution of rewrite rules in per-directory and .htaccess context. (Available in 2.3.9 and later)
env|E=[!]VAR[:VAL]
    Causes an environment variable VAR to be set (to the value VAL if provided). The form !VAR causes the environment variable VAR to be unset.
forbidden|F
    Returns a 403 FORBIDDEN response to the client browser.
gone|G
    Returns a 410 GONE response to the client browser.
Handler|H=Content-handler
    Causes the resulting URI to be sent to the specified Content-handler for processing.
last|L
    Stop the rewriting process immediately and don’t apply any more rules. Especially note caveats for per-directory and .htaccess context (see also the END flag).
next|N
    Re-run the rewriting process, starting again with the first rule, using the result of the ruleset so far as a starting point.
nocase|NC
    Makes the pattern comparison case-insensitive.
noescape|NE
    Prevent mod_rewrite from applying hexcode escaping of special characters in the result of the rewrite.
nosubreq|NS
    Causes a rule to be skipped if the current request is an internal sub-request.
proxy|P
    Force the substitution URL to be internally sent as a proxy request.
passthrough|PT
    Forces the resulting URI to be passed back to the URL mapping engine for processing of other URI-to-filename translators, such as Alias or Redirect.
qsappend|QSA
    Appends any query string from the original request URL to any query string created in the rewrite target.
qsdiscard|QSD
    Discard any query string attached to the incoming URI.
qslast|QSL
    Interpret the last (right-most) question mark as the query string delimiter, instead of the first (left-most) as normally used. Available in 2.4.19 and later.
redirect|R[=code]
    Forces an external redirect, optionally with the specified HTTP status code.
skip|S=num
    Tells the rewriting engine to skip the next num rules if the current rule matches.
type|T=MIME-type
    Force the MIME-type of the target file to be the specified type.

=⇒ Home directory expansion
When the substitution string begins with a string resembling "/~user" (via explicit text or backreferences), mod_rewrite performs home directory expansion independent of the presence or configuration of mod_userdir. This expansion does not occur when the PT flag is used on the RewriteRule directive.

Here are all possible substitution combinations and their meanings:

Inside per-server configuration (httpd.conf) for request “GET /somepath/pathinfo”:

Given Rule
    Resulting Substitution

^/somepath(.*) otherpath$1
    invalid, not supported
^/somepath(.*) otherpath$1 [R]
    invalid, not supported
^/somepath(.*) otherpath$1 [P]
    invalid, not supported
^/somepath(.*) /otherpath$1
    /otherpath/pathinfo
^/somepath(.*) /otherpath$1 [R]
    http://thishost/otherpath/pathinfo via external redirection
^/somepath(.*) /otherpath$1 [P]
    doesn’t make sense, not supported
^/somepath(.*) http://thishost/otherpath$1
    /otherpath/pathinfo
^/somepath(.*) http://thishost/otherpath$1 [R]
    http://thishost/otherpath/pathinfo via external redirection
^/somepath(.*) http://thishost/otherpath$1 [P]
    doesn’t make sense, not supported
^/somepath(.*) http://otherhost/otherpath$1
    http://otherhost/otherpath/pathinfo via external redirection
^/somepath(.*) http://otherhost/otherpath$1 [R]
    http://otherhost/otherpath/pathinfo via external redirection (the [R] flag is redundant)
^/somepath(.*) http://otherhost/otherpath$1 [P]
    http://otherhost/otherpath/pathinfo via internal proxy

Inside per-directory configuration for /somepath (/physical/path/to/somepath/.htaccess, with RewriteBase "/somepath") for request “GET /somepath/localpath/pathinfo”:

Given Rule
    Resulting Substitution

^localpath(.*) otherpath$1
    /somepath/otherpath/pathinfo
^localpath(.*) otherpath$1 [R]
    http://thishost/somepath/otherpath/pathinfo via external redirection
^localpath(.*) otherpath$1 [P]
    doesn’t make sense, not supported
^localpath(.*) /otherpath$1
    /otherpath/pathinfo
^localpath(.*) /otherpath$1 [R]
    http://thishost/otherpath/pathinfo via external redirection
^localpath(.*) /otherpath$1 [P]
    doesn’t make sense, not supported
^localpath(.*) http://thishost/otherpath$1
    /otherpath/pathinfo
^localpath(.*) http://thishost/otherpath$1 [R]
    http://thishost/otherpath/pathinfo via external redirection
^localpath(.*) http://thishost/otherpath$1 [P]
    doesn’t make sense, not supported
^localpath(.*) http://otherhost/otherpath$1
    http://otherhost/otherpath/pathinfo via external redirection
^localpath(.*) http://otherhost/otherpath$1 [R]
    http://otherhost/otherpath/pathinfo via external redirection (the [R] flag is redundant)
^localpath(.*) http://otherhost/otherpath$1 [P]
    http://otherhost/otherpath/pathinfo via internal proxy

10.101 Apache Module mod_sed

Description: Filter Input (request) and Output (response) content using sed syntax
Status: Experimental
ModuleIdentifier: sed_module
SourceFile: mod_sed.c sed0.c sed1.c regexp.c regexp.h sed.h
Compatibility: Available in Apache 2.3 and later

Summary

mod_sed is an in-process content filter. The mod_sed filter implements the sed editing commands implemented by the Solaris 10 sed program as described in the manual page [80]. However, unlike sed, mod_sed doesn’t take data from standard input. Instead, the filter acts on the entity data sent between client and server. mod_sed can be used as an input or output filter. mod_sed is a content filter, which means that it cannot be used to modify client or server HTTP headers.

The mod_sed output filter accepts a chunk of data, executes the sed scripts on the data, and generates the output which is passed to the next filter in the chain.

The mod_sed input filter reads the data from the next filter in the chain, executes the sed scripts, and returns the generated data to the caller filter in the filter chain.

Both the input and output filters only process the data if newline characters are seen in the content. At the end of the data, the rest of the data is treated as the last line.

A tutorial article on mod_sed, and why it is more powerful than simple string or regular expression search and replace, is available on the author’s blog [81].

Directives
• InputSed
• OutputSed

Sample Configuration

Adding an output filter
    # In the following example, the sed filter will change the string
    # "monday" to "MON" and the string "sunday" to SUN in html documents
    # before sending to the client.
    AddOutputFilter Sed html
    OutputSed "s/monday/MON/g"
    OutputSed "s/sunday/SUN/g"

[80] http://www.gnu.org/software/sed/manual/sed.txt
[81] https://blogs.oracle.com/basant/entry/using_mod_sed_to_filter

Adding an input filter
    # In the following example, the sed filter will change the string
    # "monday" to "MON" and the string "sunday" to SUN in the POST data
    # sent to PHP.
    AddInputFilter Sed php
    InputSed "s/monday/MON/g"
    InputSed "s/sunday/SUN/g"

Sed Commands

Complete details of the sed command can be found from the sed manual page [82].

b
    Branch to the label specified (similar to goto).
h
    Copy the current line to the hold buffer.
H
    Append the current line to the hold buffer.
g
    Copy the hold buffer to the current line.
G
    Append the hold buffer to the current line.
x
    Swap the contents of the hold buffer and the current line.

InputSed Directive

Description: Sed command to filter request data (typically POST data)
Syntax: InputSed sed-command
Context: directory, .htaccess
Status: Experimental
Module: mod_sed

The InputSed directive specifies the sed command to execute on the request data e.g., POST data.

OutputSed Directive

Description: Sed command for filtering response content
Syntax: OutputSed sed-command
Context: directory, .htaccess
Status: Experimental
Module: mod_sed

The OutputSed directive specifies the sed command to execute on the response.

[82] http://www.gnu.org/software/sed/manual/sed.txt

10.102 Apache Module mod_session

Description: Session support
Status: Extension
ModuleIdentifier: session_module
SourceFile: mod_session.c
Compatibility: Available in Apache 2.3 and later

Summary

! Warning
The session modules make use of HTTP cookies, and as such can fall victim to Cross Site Scripting attacks, or expose potentially private information to clients. Please ensure that the relevant risks have been taken into account before enabling the session functionality on your server.
This module provides support for a server wide per user session interface. Sessions can be used for keeping track of whether a user has been logged in, or for other per user information that should be kept available across requests.

Sessions may be stored on the server, or may be stored on the browser. Sessions may also be optionally encrypted for added security. These features are divided into several modules in addition to mod_session: mod_session_crypto, mod_session_cookie and mod_session_dbd. Depending on the server requirements, load the appropriate modules into the server (either statically at compile time or dynamically via the LoadModule directive).

Sessions may be manipulated from other modules that depend on the session, or the session may be read from and written to using environment variables and HTTP headers, as appropriate.

Directives
• Session
• SessionEnv
• SessionExclude
• SessionExpiryUpdateInterval
• SessionHeader
• SessionInclude
• SessionMaxAge

See also
• mod_session_cookie
• mod_session_crypto
• mod_session_dbd

What is a session?

At the core of the session interface is a table of key and value pairs that are made accessible across browser requests. These pairs can be set to any valid string, as needed by the application making use of the session.

The "session" is an application/x-www-form-urlencoded string containing these key value pairs, as defined by the HTML specification [83].

[83] http://www.w3.org/TR/html4/

The session can optionally be encrypted and base64 encoded before being written to the storage mechanism, as defined by the administrator.

Who can use a session?

The session interface is primarily developed for use by other server modules, such as mod_auth_form; however, CGI based applications can optionally be granted access to the contents of the session via the HTTP_SESSION environment variable.
Sessions have the option to be modified and/or updated by inserting an HTTP response header containing the new session parameters.

Keeping sessions on the server

Apache can be configured to keep track of per user sessions stored on a particular server or group of servers. This functionality is similar to the sessions available in typical application servers.

If configured, sessions are tracked through the use of a session ID that is stored inside a cookie, or extracted from the parameters embedded within the URL query string, as found in a typical GET request.

As the contents of the session are stored exclusively on the server, there is an expectation of privacy of the contents of the session. This does have performance and resource implications should a large number of sessions be present, or where a large number of webservers have to share sessions with one another.

The mod_session_dbd module allows the storage of user sessions within a SQL database via mod_dbd.

Keeping sessions on the browser

In high traffic environments where keeping track of a session on a server is too resource intensive or inconvenient, the option exists to store the contents of the session within a cookie on the client browser instead.

This has the advantage that minimal resources are required on the server to keep track of sessions, and multiple servers within a server farm have no need to share session information.

The contents of the session however are exposed to the client, with a corresponding risk of a loss of privacy. The mod_session_crypto module can be configured to encrypt the contents of the session before writing the session to the client.

The mod_session_cookie module allows the storage of user sessions on the browser within an HTTP cookie.

Basic Examples

Creating a session is as simple as turning the session on, and deciding where the session will be stored. In this example, the session will be stored on the browser, in a cookie called session.
Browser based session
    Session On
    SessionCookieName session path=/

The session is not useful unless it can be written to or read from. The following example shows how values can be injected into the session through the use of a predetermined HTTP response header called X-Replace-Session.

Writing to a session
    Session On
    SessionCookieName session path=/
    SessionHeader X-Replace-Session

The header should contain name value pairs expressed in the same format as a query string in a URL, as in the example below. Setting a key to the empty string has the effect of removing that key from the session.

CGI to write to a session
    #!/bin/bash
    echo "Content-Type: text/plain"
    echo "X-Replace-Session: key1=foo&key2=&key3=bar"
    echo
    env

If configured, the session can be read back from the HTTP_SESSION environment variable. By default, the session is kept private, so this has to be explicitly turned on with the SessionEnv directive.

Read from a session
    Session On
    SessionEnv On
    SessionCookieName session path=/
    SessionHeader X-Replace-Session

Once read, the CGI variable HTTP_SESSION should contain the value key1=foo&key3=bar.

Session Privacy

Using the "show cookies" feature of your browser, you would have seen a clear text representation of the session. This could potentially be a problem should the end user need to be kept unaware of the contents of the session, or where a third party could gain unauthorised access to the data within the session.

The contents of the session can be optionally encrypted before being placed on the browser using the mod_session_crypto module.

Browser based encrypted session
    Session On
    SessionCryptoPassphrase secret
    SessionCookieName session path=/

The session will be automatically decrypted on load, and encrypted on save by Apache; the underlying application using the session need have no knowledge that encryption is taking place.
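As a rough illustration of the session string format used in the CGI examples above, the following Python sketch decodes a header value such as the one sent in X-Replace-Session and re-encodes the resulting session. It is not the module's implementation: the function names are hypothetical, and it assumes that keys absent from the header are left unchanged, which the text does not state either way.

```python
from urllib.parse import parse_qsl, urlencode


def apply_session_header(session, header_value):
    """Return a copy of `session` updated from a URL-query-format header value.

    Assumption: keys not mentioned in the header are left untouched. The
    documentation only specifies that an empty value removes a key.
    """
    updated = dict(session)
    # keep_blank_values=True so that "key2=" is reported instead of dropped
    for key, value in parse_qsl(header_value, keep_blank_values=True):
        if value == "":
            updated.pop(key, None)  # empty string removes the key
        else:
            updated[key] = value
    return updated


def encode_session(session):
    """Encode the session in URL query format, the form seen in HTTP_SESSION."""
    return urlencode(session)


session = apply_session_header({}, "key1=foo&key2=&key3=bar")
print(encode_session(session))  # key1=foo&key3=bar
```

The printed value matches the HTTP_SESSION content shown in the "Read from a session" example: key2 was sent with an empty value and therefore does not appear.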
Sessions stored on the server rather than on the browser can also be encrypted as needed, offering privacy where potentially sensitive information is being shared between webservers in a server farm using the mod_session_dbd module.

Cookie Privacy

The HTTP cookie mechanism also offers privacy features, such as the ability to restrict cookie transport to SSL protected pages only, or to prevent browser based javascript from gaining access to the contents of the cookie.

! Warning
Some of the HTTP cookie privacy features are either non-standard, or are not implemented consistently across browsers. The session modules allow you to set cookie parameters, but they make no guarantee that privacy will be respected by the browser. If security is a concern, use mod_session_crypto to encrypt the contents of the session, or store the session on the server using the mod_session_dbd module.

Standard cookie parameters can be specified after the name of the cookie, as in the example below.

Setting cookie parameters
    Session On
    SessionCryptoPassphrase secret
    SessionCookieName session path=/private;domain=example.com;httponly;secure;

In cases where the Apache server forms the frontend for backend origin servers, it is possible to have the session cookies removed from the incoming HTTP headers using the SessionCookieRemove directive. This keeps the contents of the session cookies from becoming accessible from the backend server.

Session Support for Authentication

As is possible within many application servers, authentication modules can use a session for storing the username and password after login. The mod_auth_form module saves the user’s login name and password within the session.

Form based authentication
    Session On
    SessionCryptoPassphrase secret
    SessionCookieName session path=/
    AuthFormProvider file
    AuthUserFile "conf/passwd"
    AuthType form
    AuthName "realm"
    #...

See the mod_auth_form module for documentation and complete examples.
Integrating Sessions with External Applications

In order for sessions to be useful, it must be possible to share the contents of a session with external applications, and it must be possible for an external application to write a session of its own.

A typical example might be an application that changes a user’s password set by mod_auth_form. This application would need to read the current username and password from the session, make the required changes to the user’s password, and then write the new password to the session in order to provide a seamless transition to the new password.

A second example might involve an application that registers a new user for the first time. When registration is complete, the username and password are written to the session, providing a seamless transition to being logged in.

Apache modules
    Modules within the server that need access to the session can use the mod_session.h API in order to read from and write to the session. This mechanism is used by modules like mod_auth_form.

CGI programs and scripting languages
    Applications that run within the webserver can optionally retrieve the value of the session from the HTTP_SESSION environment variable. The session should be encoded as an application/x-www-form-urlencoded string as described by the HTML specification [84]. The environment variable is controlled by the setting of the SessionEnv directive. The session can be written to by the script by returning an application/x-www-form-urlencoded response header with a name set by the SessionHeader directive. In both cases, any encryption or decryption, and reading the session from or writing the session to the chosen storage mechanism, is handled by the mod_session modules and corresponding configuration.

[84] http://www.w3.org/TR/html4/
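The CGI mechanism described above can be sketched in Python. The example assumes a configuration with SessionEnv On and SessionHeader X-Replace-Session (the header name used in the earlier examples); the visits key and the function names are purely illustrative and not part of mod_session.

```python
#!/usr/bin/env python3
import os
from urllib.parse import parse_qsl, urlencode

# Assumption: the server is configured with "SessionEnv On" and
# "SessionHeader X-Replace-Session", as in the examples in this section.
SESSION_HEADER = "X-Replace-Session"


def read_session(environ):
    """Decode HTTP_SESSION (URL query format) into a dict."""
    raw = environ.get("HTTP_SESSION", "")
    return dict(parse_qsl(raw, keep_blank_values=True))


def build_response(environ):
    """Build a CGI response that counts visits in the session."""
    session = read_session(environ)
    visits = int(session.get("visits", "0")) + 1  # "visits" is a made-up key
    session["visits"] = str(visits)
    lines = [
        "Content-Type: text/plain",
        f"{SESSION_HEADER}: {urlencode(session)}",
        "",  # blank line ends the CGI headers
        f"visit number {visits}",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_response(os.environ), end="")
```

The script never touches the cookie or any encryption itself; it only reads the environment variable and returns the header, leaving storage to the session modules as the text describes.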
Applications behind mod_proxy
    If the SessionHeader directive is used to define an HTTP request header, the session, encoded as an application/x-www-form-urlencoded string, will be made available to the application. If the same header is provided in the response, the value of this response header will be used to replace the session. As above, any encryption or decryption, and reading the session from or writing the session to the chosen storage mechanism, is handled by the mod_session modules and corresponding configuration.

Standalone applications
    Applications might choose to manipulate the session outside the control of the Apache HTTP server. In this case, it is the responsibility of the application to read the session from the chosen storage mechanism, decrypt the session, update the session, encrypt the session and write the session to the chosen storage mechanism, as appropriate.

Session Directive

Description: Enables a session for the current directory or location
Syntax: Session On|Off
Default: Session Off
Context: server config, virtual host, directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_session

The Session directive enables a session for the directory or location container. Further directives control where the session will be stored and how privacy is maintained.

SessionEnv Directive

Description: Control whether the contents of the session are written to the HTTP_SESSION environment variable
Syntax: SessionEnv On|Off
Default: SessionEnv Off
Context: server config, virtual host, directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_session

If set to On, the SessionEnv directive causes the contents of the session to be written to a CGI environment variable called HTTP_SESSION.
The string is written in the URL query format, for example:

    key1=foo&key3=bar

SessionExclude Directive

Description: Define URL prefixes for which a session is ignored
Syntax: SessionExclude path
Default: none
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_session

The SessionExclude directive allows sessions to be disabled for specific URL prefixes only. This can be used to make a website more efficient, by targeting a more precise URL space for which a session should be maintained. By default, all URLs within the directory or location are included in the session. The SessionExclude directive takes precedence over the SessionInclude directive.

! Warning
This directive has a similar purpose to the path attribute in HTTP cookies, but should not be confused with this attribute. This directive does not set the path attribute, which must be configured separately.

SessionExpiryUpdateInterval Directive

Description: Define the number of seconds a session’s expiry may change without the session being updated
Syntax: SessionExpiryUpdateInterval interval
Default: SessionExpiryUpdateInterval 0 (always update)
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_session

The SessionExpiryUpdateInterval directive allows sessions to avoid the cost associated with writing the session on each request when only the expiry time has changed. This can be used to make a website more efficient or reduce load on a database when using mod_session_dbd. The session is always written if the data stored in the session has changed or the expiry has changed by more than the configured interval.

Setting the interval to zero disables this directive, and the session expiry is refreshed for each request.

This directive only has an effect when combined with SessionMaxAge to enable session expiry.
Sessions without an expiry are only written when the data stored in the session has changed.

! Warning
Because the session expiry may not be refreshed with each request, it’s possible for sessions to expire up to interval seconds early. Using a small interval usually provides sufficient savings while having a minimal effect on expiry resolution.

SessionHeader Directive

Description: Import session updates from a given HTTP response header
Syntax: SessionHeader header
Default: none
Context: server config, virtual host, directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_session

The SessionHeader directive defines the name of an HTTP response header which, if present, will be parsed and written to the current session.

The header value is expected to be in the URL query format, for example:

    key1=foo&key2=&key3=bar

Where a key is set to the empty string, that key will be removed from the session.

SessionInclude Directive

Description: Define URL prefixes for which a session is valid
Syntax: SessionInclude path
Default: all URLs
Context: server config, virtual host, directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_session

The SessionInclude directive allows sessions to be made valid for specific URL prefixes only. This can be used to make a website more efficient, by targeting a more precise URL space for which a session should be maintained. By default, all URLs within the directory or location are included in the session.

! Warning
This directive has a similar purpose to the path attribute in HTTP cookies, but should not be confused with this attribute. This directive does not set the path attribute, which must be configured separately.
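The interaction between SessionInclude and SessionExclude described above can be sketched as a small Python function. This is a reading of the documented rules only, not the module's code: the function name is hypothetical, and plain string-prefix comparison is an assumption about how "URL prefixes" are matched.

```python
def session_applies(url_path, includes=(), excludes=()):
    """Decide whether a session is maintained for `url_path`.

    Sketch of the documented rules:
      * with no SessionInclude prefixes, every URL is included;
      * otherwise the URL must start with one of the include prefixes;
      * any matching SessionExclude prefix takes precedence over an include.
    Assumption: prefixes are compared as plain strings.
    """
    included = True
    if includes:
        included = any(url_path.startswith(prefix) for prefix in includes)
    if included and any(url_path.startswith(prefix) for prefix in excludes):
        included = False
    return included


# Hypothetical layout: sessions for /app, except its static files.
print(session_applies("/app/login", ["/app"], ["/app/static"]))            # True
print(session_applies("/app/static/logo.png", ["/app"], ["/app/static"]))  # False
```

Narrowing the URL space in this way means requests for static resources need not load or save a session at all, which is the efficiency gain the directive descriptions refer to.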
SessionMaxAge Directive

Description: Define a maximum age in seconds for a session
Syntax: SessionMaxAge maxage
Default: SessionMaxAge 0
Context: server config, virtual host, directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_session

The SessionMaxAge directive defines a time limit for which a session will remain valid. When a session is saved, this time limit is reset and an existing session can be continued. If a session becomes older than this limit without a request to the server to refresh the session, the session will time out and be removed. Where a session is used to store user login details, this has the effect of logging the user out automatically after the given time.

Setting the maxage to zero disables session expiry.

10.103 Apache Module mod_session_cookie

Description: Cookie based session support
Status: Extension
ModuleIdentifier: session_cookie_module
SourceFile: mod_session_cookie.c
Compatibility: Available in Apache 2.3 and later

Summary

! Warning
The session modules make use of HTTP cookies, and as such can fall victim to Cross Site Scripting attacks, or expose potentially private information to clients. Please ensure that the relevant risks have been taken into account before enabling the session functionality on your server.

This submodule of mod_session provides support for the storage of user sessions on the remote browser within HTTP cookies.

Using cookies to store a session removes the need for the server or a group of servers to store the session locally, or collaborate to share a session, and can be useful for high traffic environments where a server based session might be too resource intensive.

If session privacy is required, the mod_session_crypto module can be used to encrypt the contents of the session before writing the session to the client.

For more details on the session interface, see the documentation for the mod_session module.
Directives
• SessionCookieName
• SessionCookieName2
• SessionCookieRemove

See also
• mod_session
• mod_session_crypto
• mod_session_dbd

Basic Examples

To create a simple session and store it in a cookie called session, configure the session as follows:

Browser based session
    Session On
    SessionCookieName session path=/

For more examples on how the session can be configured to be read from and written to by a CGI application, see the mod_session examples section.

For documentation on how the session can be used to store username and password details, see the mod_auth_form module.

SessionCookieName Directive

Description: Name and attributes for the RFC2109 cookie storing the session
Syntax: SessionCookieName name attributes
Default: none
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_session_cookie

The SessionCookieName directive specifies the name and optional attributes of an RFC2109 compliant cookie inside which the session will be stored. RFC2109 cookies are set using the Set-Cookie HTTP header.

An optional list of cookie attributes can be specified, as per the example below. These attributes are inserted into the cookie as is, and are not interpreted by Apache. Ensure that your attributes are defined correctly as per the cookie specification.

Cookie with attributes
    Session On
    SessionCookieName session path=/private;domain=example.com;httponly;secure;version=1;

SessionCookieName2 Directive

Description: Name and attributes for the RFC2965 cookie storing the session
Syntax: SessionCookieName2 name attributes
Default: none
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_session_cookie

The SessionCookieName2 directive specifies the name and optional attributes of an RFC2965 compliant cookie inside which the session will be stored. RFC2965 cookies are set using the Set-Cookie2 HTTP header.
An optional list of cookie attributes can be specified, as per the example below. These attributes are inserted into the cookie as is, and are not interpreted by Apache. Ensure that your attributes are defined correctly as per the cookie specification.

Cookie2 with attributes
    Session On
    SessionCookieName2 session path=/private;domain=example.com;httponly;secure;version=1;

SessionCookieRemove Directive

Description: Control for whether session cookies should be removed from incoming HTTP headers
Syntax: SessionCookieRemove On|Off
Default: SessionCookieRemove Off
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_session_cookie

The SessionCookieRemove flag controls whether the cookies containing the session will be removed from the headers during request processing.

In a reverse proxy situation where the Apache server acts as a server frontend for a backend origin server, revealing the contents of the session cookie to the backend could be a potential privacy violation. When set to on, the session cookie will be removed from the incoming HTTP headers.

10.104 Apache Module mod_session_crypto

Description: Session encryption support
Status: Experimental
ModuleIdentifier: session_crypto_module
SourceFile: mod_session_crypto.c
Compatibility: Available in Apache 2.3 and later

Summary

Warning
The session modules make use of HTTP cookies, and as such can fall victim to Cross Site Scripting attacks, or expose potentially private information to clients. Please ensure that the relevant risks have been taken into account before enabling the session functionality on your server.

This submodule of mod_session provides support for the encryption of user sessions before being written to a local database, or written to a remote browser via an HTTP cookie.
This can help provide privacy to user sessions where the contents of the session should be kept private from the user, or where protection is needed against the effects of cross site scripting attacks.

For more details on the session interface, see the documentation for the mod_session module.

Directives
• SessionCryptoCipher
• SessionCryptoDriver
• SessionCryptoPassphrase
• SessionCryptoPassphraseFile

See also
• mod_session
• mod_session_cookie
• mod_session_dbd

Basic Usage

To create a simple encrypted session and store it in a cookie called session, configure the session as follows:

Browser based encrypted session
    Session On
    SessionCookieName session path=/
    SessionCryptoPassphrase secret

The session will be encrypted with the given key. Different servers can be configured to share sessions by ensuring the same encryption key is used on each server.

If the encryption key is changed, sessions will be invalidated automatically.

For documentation on how the session can be used to store username and password details, see the mod_auth_form module.

SessionCryptoCipher Directive

Description: The crypto cipher to be used to encrypt the session
Syntax: SessionCryptoCipher name
Default: aes256
Context: server config, virtual host, directory, .htaccess
Status: Experimental
Module: mod_session_crypto
Compatibility: Available in Apache 2.3.0 and later

The SessionCryptoCipher directive specifies the cipher to be used during encryption. If not specified, the cipher defaults to aes256.
Possible values depend on the crypto driver in use, and could be one of:
• 3des192
• aes128
• aes192
• aes256

SessionCryptoDriver Directive

Description: The crypto driver to be used to encrypt the session
Syntax: SessionCryptoDriver name [param[=value]]
Default: none
Context: server config
Status: Experimental
Module: mod_session_crypto
Compatibility: Available in Apache 2.3.0 and later

The SessionCryptoDriver directive specifies the name of the crypto driver to be used for encryption. If not specified, the driver defaults to the recommended driver compiled into APR-util.

The NSS crypto driver requires some parameters for configuration, which are specified as parameters with optional values after the driver name.

NSS without a certificate database
    SessionCryptoDriver nss

NSS with certificate database
    SessionCryptoDriver nss dir=certs

NSS with certificate database and parameters
    SessionCryptoDriver nss dir=certs key3=key3.db cert7=cert7.db secmod=secmod

NSS with paths containing spaces
    SessionCryptoDriver nss "dir=My Certs" key3=key3.db cert7=cert7.db secmod=secmod

The NSS crypto driver might have already been configured by another part of the server, for example from mod_nss or mod_ldap. If it is found to have already been configured, a warning will be logged, and the existing configuration will remain in effect. To avoid this warning, use the noinit parameter as follows.

NSS with certificate database
    SessionCryptoDriver nss noinit

To prevent confusion, ensure that all modules requiring NSS are configured with identical parameters.

The openssl crypto driver supports an optional parameter to specify the engine to be used for encryption.

OpenSSL with engine support
    SessionCryptoDriver openssl engine=name

SessionCryptoPassphrase Directive

Description: The key used to encrypt the session
Syntax: SessionCryptoPassphrase secret [ secret ... ]
Default: none
Context: server config, virtual host, directory, .htaccess
Status: Experimental
Module: mod_session_crypto
Compatibility: Available in Apache 2.3.0 and later

The SessionCryptoPassphrase directive specifies the keys to be used for symmetrical encryption of the contents of the session before the session is written, and for decryption of the contents after the session is read.

Keys are more secure when they are long, and consist of truly random characters. Changing the key on a server has the effect of invalidating all existing sessions.

Multiple keys can be specified in order to support key rotation. The first key listed will be used for encryption, while all keys listed will be attempted for decryption. To rotate keys across multiple servers over a period of time, add a new secret to the end of the list, and once rolled out completely to all servers, remove the first key from the start of the list.

As of version 2.4.7, if the value begins with exec: the resulting command will be executed and the first line returned to standard output by the program will be used as the key.

    # key used as-is
    SessionCryptoPassphrase secret

    # Run /path/to/program to get key
    SessionCryptoPassphrase exec:/path/to/program

    # Run /path/to/otherProgram and provide arguments
    SessionCryptoPassphrase "exec:/path/to/otherProgram argument1"

SessionCryptoPassphraseFile Directive

Description: File containing keys used to encrypt the session
Syntax: SessionCryptoPassphraseFile filename
Default: none
Context: server config, virtual host, directory
Status: Experimental
Module: mod_session_crypto
Compatibility: Available in Apache 2.3.0 and later

The SessionCryptoPassphraseFile directive specifies the name of a configuration file containing the keys to use for encrypting or decrypting the session, specified one per line. The file is read on server start, and a graceful restart will be necessary for httpd to pick up changes to the keys.
Unlike the SessionCryptoPassphrase directive, the keys are not exposed within the httpd configuration and can be hidden by protecting the file appropriately.

Multiple keys can be specified in order to support key rotation. The first key listed will be used for encryption, while all keys listed will be attempted for decryption. To rotate keys across multiple servers over a period of time, add a new secret to the end of the list, and once rolled out completely to all servers, remove the first key from the start of the list.

10.105 Apache Module mod_session_dbd

Description: DBD/SQL based session support
Status: Extension
ModuleIdentifier: session_dbd_module
SourceFile: mod_session_dbd.c
Compatibility: Available in Apache 2.3 and later

Summary

Warning
The session modules make use of HTTP cookies, and as such can fall victim to Cross Site Scripting attacks, or expose potentially private information to clients. Please ensure that the relevant risks have been taken into account before enabling the session functionality on your server.

This submodule of mod_session provides support for the storage of user sessions within a SQL database using the mod_dbd module.

Sessions can either be anonymous, where the session is keyed by a unique UUID string stored on the browser in a cookie, or per user, where the session is keyed against the userid of the logged in user.

SQL based sessions are hidden from the browser, and so offer a measure of privacy without the need for encryption.

Different webservers within a server farm may choose to share a database, and so share sessions with one another.

For more details on the session interface, see the documentation for the mod_session module.
Directives
• SessionDBDCookieName
• SessionDBDCookieName2
• SessionDBDCookieRemove
• SessionDBDDeleteLabel
• SessionDBDInsertLabel
• SessionDBDPerUser
• SessionDBDSelectLabel
• SessionDBDUpdateLabel

See also
• mod_session
• mod_session_crypto
• mod_session_cookie
• mod_dbd

DBD Configuration

Before the mod_session_dbd module can be configured to maintain a session, the mod_dbd module must be configured to make the various database queries available to the server.

Four queries are required to maintain a session: one to select an existing session, one to update an existing session, one to insert a new session, and one to delete an expired or empty session. These queries are configured as per the example below.

Sample DBD configuration
    DBDriver pgsql
    DBDParams "dbname=apachesession user=apache password=xxxxx host=localhost"
    DBDPrepareSQL "delete from session where key = %s" deletesession
    DBDPrepareSQL "update session set value = %s, expiry = %lld, key = %s where key = %s" update
    DBDPrepareSQL "insert into session (value, expiry, key) values (%s, %lld, %s)" insertsession
    DBDPrepareSQL "select value from session where key = %s and (expiry = 0 or expiry > %lld)" s
    DBDPrepareSQL "delete from session where expiry != 0 and expiry < %lld" cleansession

Anonymous Sessions

Anonymous sessions are keyed against a unique UUID, and stored on the browser within an HTTP cookie. This method is similar to that used by most application servers to store session information.

To create a simple anonymous session and store it in a postgres database called apachesession, and save the session ID in a cookie called session, configure the session as follows:

SQL based anonymous session
    Session On
    SessionDBDCookieName session path=/

For more examples on how the session can be configured to be read from and written to by a CGI application, see the mod_session examples section.
For documentation on how the session can be used to store username and password details, see the mod_auth_form module.

Per User Sessions

Per user sessions are keyed against the username of a successfully authenticated user. They offer the most privacy, as no external handle to the session exists outside of the authenticated realm.

Per user sessions work within a correctly configured authenticated environment, be that using basic authentication, digest authentication or SSL client certificates. Because of a chicken-and-egg dependency (the user must already be authenticated before the session can be located), per user sessions cannot be used to store authentication credentials from a module like mod_auth_form.

To create a simple per user session and store it in a postgres database called apachesession, and with the session keyed to the userid, configure the session as follows:

SQL based per user session
    Session On
    SessionDBDPerUser On

Database Housekeeping

Over the course of time, the database can be expected to start accumulating expired sessions. At this point, the mod_session_dbd module is not yet able to handle session expiry automatically.

Warning
The administrator will need to set up an external process via cron to clean out expired sessions.

SessionDBDCookieName Directive

Description: Name and attributes for the RFC2109 cookie storing the session ID
Syntax: SessionDBDCookieName name attributes
Default: none
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_session_dbd

The SessionDBDCookieName directive specifies the name and optional attributes of an RFC2109 compliant cookie inside which the session ID will be stored. RFC2109 cookies are set using the Set-Cookie HTTP header.

An optional list of cookie attributes can be specified, as per the example below. These attributes are inserted into the cookie as is, and are not interpreted by Apache.
Ensure that your attributes are defined correctly as per the cookie specification.

Cookie with attributes
    Session On
    SessionDBDCookieName session path=/private;domain=example.com;httponly;secure;version=1;

SessionDBDCookieName2 Directive

Description: Name and attributes for the RFC2965 cookie storing the session ID
Syntax: SessionDBDCookieName2 name attributes
Default: none
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_session_dbd

The SessionDBDCookieName2 directive specifies the name and optional attributes of an RFC2965 compliant cookie inside which the session ID will be stored. RFC2965 cookies are set using the Set-Cookie2 HTTP header.

An optional list of cookie attributes can be specified, as per the example below. These attributes are inserted into the cookie as is, and are not interpreted by Apache. Ensure that your attributes are defined correctly as per the cookie specification.

Cookie2 with attributes
    Session On
    SessionDBDCookieName2 session path=/private;domain=example.com;httponly;secure;version=1;

SessionDBDCookieRemove Directive

Description: Control for whether session ID cookies should be removed from incoming HTTP headers
Syntax: SessionDBDCookieRemove On|Off
Default: SessionDBDCookieRemove On
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_session_dbd

The SessionDBDCookieRemove flag controls whether the cookies containing the session ID will be removed from the headers during request processing.

In a reverse proxy situation where the Apache server acts as a server frontend for a backend origin server, revealing the contents of the session ID cookie to the backend could be a potential privacy violation. When set to on, the session ID cookie will be removed from the incoming HTTP headers.
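The reverse proxy situation described above might be sketched as follows. This is an illustration only: the /app/ path and the backend.example.com host are hypothetical, ProxyPass comes from mod_proxy rather than from this module, and mod_dbd is assumed to be configured elsewhere with the required queries.

```apache
<Location "/app/">
    # Keep an anonymous SQL based session for requests under /app/
    Session On
    SessionDBDCookieName session path=/app
    # Strip the session ID cookie before the request is passed on
    # (On is already the documented default; stated here for clarity)
    SessionDBDCookieRemove On
    # Forward to a hypothetical backend origin server (mod_proxy)
    ProxyPass "http://backend.example.com/app/"
</Location>
```

With this arrangement the backend should not see the session ID cookie in the forwarded request headers.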
SessionDBDDeleteLabel Directive

Description: The SQL query to use to remove sessions from the database
Syntax: SessionDBDDeleteLabel label
Default: SessionDBDDeleteLabel deletesession
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_session_dbd

The SessionDBDDeleteLabel directive sets the default delete query label to be used to delete an expired or empty session. This label must have been previously defined using the DBDPrepareSQL directive.

SessionDBDInsertLabel Directive

Description: The SQL query to use to insert sessions into the database
Syntax: SessionDBDInsertLabel label
Default: SessionDBDInsertLabel insertsession
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_session_dbd

The SessionDBDInsertLabel directive sets the default insert query label to be used to insert a new session. This label must have been previously defined using the DBDPrepareSQL directive.

If an attempt to update the session affects no rows, this query will be called to insert the session into the database.

SessionDBDPerUser Directive

Description: Enable a per user session
Syntax: SessionDBDPerUser On|Off
Default: SessionDBDPerUser Off
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_session_dbd

The SessionDBDPerUser flag enables a per user session keyed against the user's login name. If the user is not logged in, this directive will be ignored.

SessionDBDSelectLabel Directive

Description: The SQL query to use to select sessions from the database
Syntax: SessionDBDSelectLabel label
Default: SessionDBDSelectLabel selectsession
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_session_dbd

The SessionDBDSelectLabel directive sets the default select query label to be used to load in a session. This label must have been previously defined using the DBDPrepareSQL directive.
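To illustrate how a label ties a prepared statement to the session module, the sketch below registers the select query from the sample DBD configuration under a non-default label and then points mod_session_dbd at it. The label name myselect is a hypothetical name chosen for this example.

```apache
# Register the select query under a custom label (hypothetical name)
DBDPrepareSQL "select value from session where key = %s and (expiry = 0 or expiry > %lld)" myselect

# Use that label instead of the default selectsession
SessionDBDSelectLabel myselect
```

If the label given to SessionDBDSelectLabel has not been defined with DBDPrepareSQL, the module has no query to run, so the two names need to match exactly.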
SessionDBDUpdateLabel Directive

Description: The SQL query to use to update existing sessions in the database
Syntax: SessionDBDUpdateLabel label
Default: SessionDBDUpdateLabel updatesession
Context: server config, virtual host, directory, .htaccess
Status: Extension
Module: mod_session_dbd

The SessionDBDUpdateLabel directive sets the default update query label to be used to update an existing session. This label must have been previously defined using the DBDPrepareSQL directive.

If an attempt to update the session affects no rows, the insert query will be called to insert the session into the database. If the database supports InsertOrUpdate, override this query to perform the update in one query instead of two.

10.106 Apache Module mod_setenvif

Description: Allows the setting of environment variables based on characteristics of the request
Status: Base
ModuleIdentifier: setenvif_module
SourceFile: mod_setenvif.c

Summary

The mod_setenvif module allows you to set internal environment variables according to whether different aspects of the request match regular expressions you specify. These environment variables can be used by other parts of the server to make decisions about actions to be taken, as well as becoming available to CGI scripts and SSI pages.

The directives are considered in the order they appear in the configuration files, so more complex sequences can be used, such as this example, which sets netscape if the browser is mozilla but not MSIE.

    BrowserMatch ^Mozilla netscape
    BrowserMatch MSIE !netscape

When the server looks up a path via an internal subrequest such as looking for a DirectoryIndex or generating a directory listing with mod_autoindex, per-request environment variables are not inherited in the subrequest. Additionally, SetEnvIf directives are not separately evaluated in the subrequest due to the API phases mod_setenvif takes action in.
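As one hedged illustration of another part of the server acting on such a variable, the fragment below uses the env= condition of CustomLog (from mod_log_config) to leave image requests out of the access log. The variable name image_request and the log path are assumptions, and the common log format nickname is assumed to be defined elsewhere with LogFormat.

```apache
# Mark requests whose URI ends in .gif (variable name is illustrative)
SetEnvIf Request_URI "\.gif$" image_request

# Log only requests for which image_request is NOT set
# (log path and "common" nickname are assumptions)
CustomLog "logs/access_log" common env=!image_request
```

The same variable would also be visible to CGI scripts and SSI pages handling the request, as noted in the summary.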
Directives
• BrowserMatch
• BrowserMatchNoCase
• SetEnvIf
• SetEnvIfExpr
• SetEnvIfNoCase

See also
• Environment Variables in Apache HTTP Server (p. 92)

BrowserMatch Directive

Description: Sets environment variables conditional on HTTP User-Agent
Syntax: BrowserMatch regex [!]env-variable[=value] [[!]env-variable[=value]] ...
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_setenvif

The BrowserMatch directive is a special case of the SetEnvIf directive that sets environment variables conditional on the User-Agent HTTP request header. The following two lines have the same effect:

    BrowserMatch Robot is_a_robot
    SetEnvIf User-Agent Robot is_a_robot

Some additional examples:

    BrowserMatch ^Mozilla forms jpeg=yes browser=netscape
    BrowserMatch "^Mozilla/[2-3]" tables agif frames javascript
    BrowserMatch MSIE !javascript

BrowserMatchNoCase Directive

Description: Sets environment variables conditional on User-Agent without respect to case
Syntax: BrowserMatchNoCase regex [!]env-variable[=value] [[!]env-variable[=value]] ...
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_setenvif

The BrowserMatchNoCase directive is semantically identical to the BrowserMatch directive. However, it provides for case-insensitive matching. For example:

    BrowserMatchNoCase mac platform=macintosh
    BrowserMatchNoCase win platform=windows

The BrowserMatch and BrowserMatchNoCase directives are special cases of the SetEnvIf and SetEnvIfNoCase directives. The following two lines have the same effect:

    BrowserMatchNoCase Robot is_a_robot
    SetEnvIfNoCase User-Agent Robot is_a_robot

SetEnvIf Directive

Description: Sets environment variables based on attributes of the request
Syntax: SetEnvIf attribute regex [!]env-variable[=value] [[!]env-variable[=value]] ...
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_setenvif

The SetEnvIf directive defines environment variables based on attributes of the request. The attribute specified in the first argument can be one of three things:

1. An HTTP request header field (see RFC2616 [85] for more information about these); for example: Host, User-Agent, Referer, and Accept-Language. A regular expression may be used to specify a set of request headers.

2. One of the following aspects of the request:
   • Remote_Host - the hostname (if available) of the client making the request
   • Remote_Addr - the IP address of the client making the request
   • Server_Addr - the IP address of the server on which the request was received (only with versions later than 2.0.43)
   • Request_Method - the name of the method being used (GET, POST, et cetera)
   • Request_Protocol - the name and version of the protocol with which the request was made (e.g., "HTTP/0.9", "HTTP/1.1", etc.)
   • Request_URI - the resource requested on the HTTP request line, generally the portion of the URL following the scheme and host portion without the query string. See the RewriteCond directive of mod_rewrite for extra information on how to match your query string.

3. The name of an environment variable in the list of those associated with the request. This allows SetEnvIf directives to test against the result of prior matches. Only those environment variables defined by earlier SetEnvIf[NoCase] directives are available for testing in this manner. 'Earlier' means that they were defined at a broader scope (such as server-wide) or previously in the current directive's scope. Environment variables will be considered only if there was no match among request characteristics and a regular expression was not used for the attribute.

[85] http://www.rfc-editor.org/rfc/rfc2616.txt

The second argument (regex) is a regular expression.
If the regex matches against the attribute, then the remainder of the arguments are evaluated.

The rest of the arguments give the names of variables to set, and optionally values to which they should be set. These take the form of

1. varname, or
2. !varname, or
3. varname=value

In the first form, the value will be set to "1". The second will remove the given variable if already defined, and the third will set the variable to the literal value given by value. Since version 2.0.51, Apache httpd will recognize occurrences of $1..$9 within value and replace them by parenthesized subexpressions of regex. $0 provides access to the whole string matched by that pattern.

    SetEnvIf Request_URI "\.gif$" object_is_image=gif
    SetEnvIf Request_URI "\.jpg$" object_is_image=jpg
    SetEnvIf Request_URI "\.xbm$" object_is_image=xbm

    SetEnvIf Referer www\.mydomain\.example\.com intra_site_referral

    SetEnvIf object_is_image xbm XBIT_PROCESSING=1

    SetEnvIf Request_URI "\.(.*)$" EXTENSION=$1

    SetEnvIf ^TS ^[a-z] HAVE_TS

The first three will set the environment variable object_is_image if the request was for an image file, and the fourth sets intra_site_referral if the referring page was somewhere on the www.mydomain.example.com Web site.

The last example will set environment variable HAVE_TS if the request contains any headers that begin with "TS" whose values begin with any character in the set [a-z].

See also
• Environment Variables in Apache HTTP Server (p. 92), for additional examples.

SetEnvIfExpr Directive

Description: Sets environment variables based on an ap_expr expression
Syntax: SetEnvIfExpr expr [!]env-variable[=value] [[!]env-variable[=value]] ...
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_setenvif

The SetEnvIfExpr directive defines environment variables based on an ap_expr.
These expressions will be evaluated at runtime, and env-variable will be applied in the same fashion as SetEnvIf.

    SetEnvIfExpr "tolower(req('X-Sendfile')) == 'd:\images\very_big.iso'" iso_delivered

This would set the environment variable iso_delivered every time our application attempts to send it via X-Sendfile.

A more useful example would be to set the variable rfc1918 if the remote IP address is a private address according to RFC 1918:

    SetEnvIfExpr "-R '10.0.0.0/8' || -R '172.16.0.0/12' || -R '192.168.0.0/16'" rfc1918

See also
• Expressions in Apache HTTP Server (p. 99), for a complete reference and more examples.
• <If> can be used to achieve similar results.
• mod_filter

SetEnvIfNoCase Directive

Description: Sets environment variables based on attributes of the request without respect to case
Syntax: SetEnvIfNoCase attribute regex [!]env-variable[=value] [[!]env-variable[=value]] ...
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Base
Module: mod_setenvif

The SetEnvIfNoCase directive is semantically identical to the SetEnvIf directive, and differs only in that the regular expression matching is performed in a case-insensitive manner. For example:

    SetEnvIfNoCase Host Example\.Org site=example

This will cause the site environment variable to be set to "example" if the HTTP request header field Host: was included and contained Example.Org, example.org, or any other combination.

10.107 Apache Module mod_slotmem_plain

Description: Slot-based shared memory provider.
Status: Extension
ModuleIdentifier: slotmem_plain_module
SourceFile: mod_slotmem_plain.c

Summary

mod_slotmem_plain is a memory provider which provides for creation and access to a plain memory segment in which the datasets are organized in "slots."

If the memory needs to be shared between threads and processes, a better provider would be mod_slotmem_shm.
mod_slotmem_plain provides the following API functions:

apr_status_t doall(ap_slotmem_instance_t *s, ap_slotmem_callback_fn_t *func, void *data, apr_pool_t *pool)
    call the callback on all worker slots

apr_status_t create(ap_slotmem_instance_t **new, const char *name, apr_size_t item_size, unsigned int item_num, ap_slotmem_ty
    create a new slotmem in which each item has size item_size.

apr_status_t attach(ap_slotmem_instance_t **new, const char *name, apr_size_t *item_size, unsigned int *item_num, apr_pool_t *
    attach to an existing slotmem.

apr_status_t dptr(ap_slotmem_instance_t *s, unsigned int item_id, void **mem)
    get the direct pointer to the memory associated with this worker slot.

apr_status_t get(ap_slotmem_instance_t *s, unsigned int item_id, unsigned char *dest, apr_size_t dest_len)
    get/read the memory from this slot to dest

apr_status_t put(ap_slotmem_instance_t *slot, unsigned int item_id, unsigned char *src, apr_size_t src_len)
    put/write the data from src to this slot

unsigned int num_slots(ap_slotmem_instance_t *s)
    return the total number of slots in the segment

apr_size_t slot_size(ap_slotmem_instance_t *s)
    return the total data size, in bytes, of a slot in the segment

apr_status_t grab(ap_slotmem_instance_t *s, unsigned int *item_id);
    grab or allocate the first free slot and mark as in-use (does not do any data copying)

apr_status_t fgrab(ap_slotmem_instance_t *s, unsigned int item_id);
    forced grab or allocate the specified slot and mark as in-use (does not do any data copying)

apr_status_t release(ap_slotmem_instance_t *s, unsigned int item_id);
    release or free a slot and mark as not in-use (does not do any data copying)

Directives

This module provides no directives.

10.108 Apache Module mod_slotmem_shm

Description: Slot-based shared memory provider.
Status: Extension
ModuleIdentifier: slotmem_shm_module
SourceFile: mod_slotmem_shm.c

Summary

mod_slotmem_shm is a memory provider which provides for creation and access to a shared memory segment in which the datasets are organized in "slots."

All shared memory is cleared and cleaned with each restart, whether graceful or not. The data itself is stored and restored within a file noted by the name parameter in the create and attach calls. If not specified with an absolute path, the file will be created relative to the path specified by the DefaultRuntimeDir directive.

mod_slotmem_shm provides the following API functions:

apr_status_t doall(ap_slotmem_instance_t *s, ap_slotmem_callback_fn_t *func, void *data, apr_pool_t *pool)
    call the callback on all worker slots

apr_status_t create(ap_slotmem_instance_t **new, const char *name, apr_size_t item_size, unsigned int item_num, ap_slotmem_ty
    create a new slotmem in which each item has size item_size. name is used to generate a filename for the persistent store of the shared memory if configured. Values are:
    "none"                 Anonymous shared memory and no persistent store
    "file-name"            [DefaultRuntimeDir]/file-name
    "/absolute-file-name"  Absolute file name

apr_status_t attach(ap_slotmem_instance_t **new, const char *name, apr_size_t *item_size, unsigned int *item_num, apr_pool_t *
    attach to an existing slotmem. See create for description of name parameter.

apr_status_t dptr(ap_slotmem_instance_t *s, unsigned int item_id, void **mem)
    get the direct pointer to the memory associated with this worker slot.
apr_status_t get(ap_slotmem_instance_t *s, unsigned int item_id, unsigned char *dest, apr_size_t dest_len)
    get/read the memory from this slot to dest

apr_status_t put(ap_slotmem_instance_t *slot, unsigned int item_id, unsigned char *src, apr_size_t src_len)
    put/write the data from src to this slot

unsigned int num_slots(ap_slotmem_instance_t *s)
    return the total number of slots in the segment

apr_size_t slot_size(ap_slotmem_instance_t *s)
    return the total data size, in bytes, of a slot in the segment

apr_status_t grab(ap_slotmem_instance_t *s, unsigned int *item_id);
    grab or allocate the first free slot and mark as in-use (does not do any data copying)

apr_status_t fgrab(ap_slotmem_instance_t *s, unsigned int item_id);
    forced grab or allocate the specified slot and mark as in-use (does not do any data copying)

apr_status_t release(ap_slotmem_instance_t *s, unsigned int item_id);
    release or free a slot and mark as not in-use (does not do any data copying)

Directives

This module provides no directives.

10.109 Apache Module mod_so

Description: Loading of executable code and modules into the server at start-up or restart time
Status: Extension
ModuleIdentifier: so_module
SourceFile: mod_so.c
Compatibility: This is a Base module (always included) on Windows

Summary

On selected operating systems this module can be used to load modules into Apache HTTP Server at runtime via the Dynamic Shared Object (p. 68) (DSO) mechanism, rather than requiring a recompilation.

On Unix, the loaded code typically comes from shared object files (usually with the .so extension). On Windows, the files may have either the .so or the .dll extension.

Warning
Modules built for one major version of the Apache HTTP Server will generally not work on another (e.g. 1.3 vs. 2.0, or 2.0 vs. 2.2). There are usually API changes between one major version and another that require that modules be modified to work with the new version.
Directives • LoadFile • LoadModule Creating Loadable Modules for Windows =⇒Note On Windows, where loadable files typically have a file extension of .dll, Apache httpd modules are called mod whatever.so, just as they are on other platforms. However, you may encounter third-party modules, such as PHP for example, that continue to use the .dll convention. While mod so still loads modules with ApacheModuleFoo.dll names, the new naming convention is preferred; if you are converting your loadable module for 2.0, please fix the name to this 2.0 convention. The Apache httpd module API is unchanged between the Unix and Windows versions. Many modules will run on Windows with no or little change from Unix, although others rely on aspects of the Unix architecture which are not present in Windows, and will not work. When a module does work, it can be added to the server in one of two ways. As with Unix, it can be compiled into the server. Because Apache httpd for Windows does not have the Configure program of Apache httpd for Unix, the module’s source file must be added to the ApacheCore project file, and its symbols must be added to the os\win32\modules.c file. The second way is to compile the module as a DLL, a shared library that can be loaded into the server at runtime, using the L OAD M ODULE directive. These module DLLs can be distributed and run on any Apache httpd for Windows installation, without recompilation of the server. To create a module DLL, a small change is necessary to the module’s source file: The module record must be exported from the DLL (which will be created later; see below). To do this, add the AP MODULE DECLARE DATA (defined in the Apache httpd header files) to your module’s module record definition. For example, if your module has: 10.109. 
    module foo_module;

Replace the above with:

    module AP_MODULE_DECLARE_DATA foo_module;

Note that this will only be activated on Windows, so the module can continue to be used, unchanged, with Unix if needed. Also, if you are familiar with .DEF files, you can export the module record with that method instead.

Now, create a DLL containing your module. You will need to link this against the libhttpd.lib export library that is created when the libhttpd.dll shared library is compiled. You may also have to change the compiler settings to ensure that the Apache httpd header files are correctly located. You can find the libhttpd.lib export library in your server root's modules directory. It is best to grab an existing module .dsp file from the tree to ensure the build environment is configured correctly, or alternatively to compare the compiler and link options to your .dsp.

This should create a DLL version of your module. Now simply place it in the modules directory of your server root, and use the LoadModule directive to load it.

LoadFile Directive

Description: Link in the named object file or library
Syntax:      LoadFile filename [filename] ...
Context:     server config, virtual host
Status:      Extension
Module:      mod_so

The LoadFile directive links in the named object files or libraries when the server is started or restarted; this is used to load additional code which may be required for some module to work. Filename is either an absolute path or relative to ServerRoot (p. 380).

For example:

    LoadFile "libexec/libxmlparse.so"

LoadModule Directive

Description: Links in the object file or library, and adds to the list of active modules
Syntax:      LoadModule module filename
Context:     server config, virtual host
Status:      Extension
Module:      mod_so

The LoadModule directive links in the object file or library filename and adds the module structure named module to the list of active modules.
Module is the name of the external variable of type module in the file, and is listed as the Module Identifier (p. 376) in the module documentation.

For example:

    LoadModule status_module "modules/mod_status.so"

loads the named module from the modules subdirectory of the ServerRoot.

10.110 Apache Module mod_socache_dbm

Description:      DBM based shared object cache provider.
Status:           Extension
ModuleIdentifier: socache_dbm_module
SourceFile:       mod_socache_dbm.c

Summary

mod_socache_dbm is a shared object cache provider which provides for creation and access to a cache backed by a DBM database.

    dbm:/path/to/datafile

If the path is not absolute then it is assumed to be relative to the DefaultRuntimeDir.

Details of other shared object cache providers can be found here (p. 114).

Directives

This module provides no directives.

10.111 Apache Module mod_socache_dc

Description:      Distcache based shared object cache provider.
Status:           Extension
ModuleIdentifier: socache_dc_module
SourceFile:       mod_socache_dc.c

Summary

mod_socache_dc is a shared object cache provider which provides for creation and access to a cache backed by the distcache [86] distributed session caching libraries.

[86] http://distcache.sourceforge.net/

Details of other shared object cache providers can be found here (p. 114).

Directives

This module provides no directives.

10.112 Apache Module mod_socache_memcache

Description:      Memcache based shared object cache provider.
Status:           Extension
ModuleIdentifier: socache_memcache_module
SourceFile:       mod_socache_memcache.c

Summary

mod_socache_memcache is a shared object cache provider which provides for creation and access to a cache backed by the memcached [87] high-performance, distributed memory object caching system.

This shared object cache provider's "create" method requires a comma separated list of memcached host/port specifications.
If using this provider via another module's configuration (such as SSLSessionCache), provide the list of servers as the optional "arg" parameter.

    SSLSessionCache memcache:memcache.example.com:12345,memcache2.example.com:12345

Details of other shared object cache providers can be found here (p. 114).

Directives
• MemcacheConnTTL

MemcacheConnTTL Directive

Description:   Keepalive time for idle connections
Syntax:        MemcacheConnTTL num[units]
Default:       MemcacheConnTTL 15s
Context:       server config, virtual host
Status:        Extension
Module:        mod_socache_memcache
Compatibility: Available in Apache 2.4.17 and later

Set the time to keep idle connections with the memcache server(s) alive (threaded platforms only).

Valid values for MemcacheConnTTL are times up to one hour. 0 means no timeout.

Note
This timeout defaults to units of seconds, but accepts suffixes for milliseconds (ms), seconds (s), minutes (min), and hours (h).

Before Apache 2.4.17, this timeout was hardcoded and its value was 600 usec. So, the closest configuration to match the legacy behaviour is to set MemcacheConnTTL to 1ms.

    # Set a timeout of 10 minutes
    MemcacheConnTTL 10min
    # Set a timeout of 60 seconds
    MemcacheConnTTL 60

[87] http://memcached.org/

10.113 Apache Module mod_socache_shmcb

Description:      shmcb based shared object cache provider.
Status:           Extension
ModuleIdentifier: socache_shmcb_module
SourceFile:       mod_socache_shmcb.c

Summary

mod_socache_shmcb is a shared object cache provider which provides for creation and access to a cache backed by a high-performance cyclic buffer inside a shared memory segment.

    shmcb:/path/to/datafile(512000)

If the path is not absolute then it is assumed to be relative to the DefaultRuntimeDir.

Details of other shared object cache providers can be found here (p. 114).

Directives

This module provides no directives.
10.114 Apache Module mod_speling

Description:      Attempts to correct mistaken URLs by ignoring capitalization, or attempting to correct various minor misspellings.
Status:           Extension
ModuleIdentifier: speling_module
SourceFile:       mod_speling.c

Summary

Requests to documents sometimes cannot be served by the core Apache server because the request was misspelled or miscapitalized. This module addresses this problem by trying to find a matching document, even after all other modules gave up. It does its work by comparing each document name in the requested directory against the requested document name without regard to case, and allowing up to one misspelling (character insertion / omission / transposition or wrong character). A list is built with all document names which were matched using this strategy.

If, after scanning the directory,
• no matching document was found, Apache will proceed as usual and return a "document not found" error.
• only one document is found that "almost" matches the request, then it is returned in the form of a redirection response.
• more than one document with a close match was found, then the list of the matches is returned to the client, and the client can select the correct candidate.

Directives
• CheckBasenameMatch
• CheckCaseOnly
• CheckSpelling

CheckBasenameMatch Directive

Description: Also match files with differing file name extensions.
Syntax:      CheckBasenameMatch on|off
Default:     CheckBasenameMatch Off
Context:     server config, virtual host, directory, .htaccess
Override:    Options
Status:      Extension
Module:      mod_speling

When set, this directive extends the action of the spelling correction to the file name extension. For example, a file foo.gif will match a request for foo or foo.jpg. This can be particularly useful in conjunction with MultiViews (p. 78).
CheckCaseOnly Directive

Description: Limits the action of the speling module to case corrections
Syntax:      CheckCaseOnly on|off
Default:     CheckCaseOnly Off
Context:     server config, virtual host, directory, .htaccess
Override:    Options
Status:      Extension
Module:      mod_speling

When set, this directive limits the action of the spelling correction to lower/upper case changes. Other potential corrections are not performed, except when CheckBasenameMatch is also set.

CheckSpelling Directive

Description: Enables the spelling module
Syntax:      CheckSpelling on|off
Default:     CheckSpelling Off
Context:     server config, virtual host, directory, .htaccess
Override:    Options
Status:      Extension
Module:      mod_speling

This directive enables or disables the spelling module. When enabled, keep in mind that
• the directory scan which is necessary for the spelling correction will have an impact on the server's performance when many spelling corrections have to be performed at the same time.
• the document trees should not contain sensitive files which could be matched inadvertently by a spelling "correction".
• the module is unable to correct misspelled user names (as in http://my.host/~apahce/), just file names or directory names.
• spelling corrections apply strictly to existing files, so a request for the <Location /status> may get incorrectly treated as the negotiated file "/stats.html".

mod_speling should not be enabled in DAV (p. 589) enabled directories, because it will try to "spell fix" newly created resource names against existing filenames, e.g., when trying to upload a new document doc43.html it might redirect to an existing document doc34.html, which is not what was intended.
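As a rough illustration of how the mod_speling directives combine, the fragment below enables correction for one directory only and restricts it to case differences. This is a minimal sketch, not a recommended setup: the directory path is a hypothetical placeholder, and the LoadModule line assumes mod_speling was built as a shared module in the modules directory.

```apache
# Assumption: mod_speling is available as a DSO in the modules directory.
LoadModule speling_module "modules/mod_speling.so"

# Hypothetical document tree; substitute your own path.
<Directory "/usr/local/apache2/htdocs/docs">
    # Enable URL correction for requests under this directory.
    CheckSpelling On
    # Only fix upper/lower case mismatches; one-character typos are
    # left alone, which narrows the set of files that can be matched.
    CheckCaseOnly On
</Directory>
```

Scoping the directives to a single directory keeps the directory scan, and the risk of matching an unintended file, away from the rest of the document tree. Leave DAV-enabled directories out of any such block.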
10.115 Apache Module mod_ssl

Description:      Strong cryptography using the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols
Status:           Extension
ModuleIdentifier: ssl_module
SourceFile:       mod_ssl.c

Summary

This module provides SSL v3 and TLS v1.x support for the Apache HTTP Server. SSL v2 is no longer supported.

This module relies on OpenSSL [88] to provide the cryptography engine.

[88] http://www.openssl.org/

Further details, discussion, and examples are provided in the SSL documentation (p. 192).

Directives
• SSLCACertificateFile
• SSLCACertificatePath
• SSLCADNRequestFile
• SSLCADNRequestPath
• SSLCARevocationCheck
• SSLCARevocationFile
• SSLCARevocationPath
• SSLCertificateChainFile
• SSLCertificateFile
• SSLCertificateKeyFile
• SSLCipherSuite
• SSLCompression
• SSLCryptoDevice
• SSLEngine
• SSLFIPS
• SSLHonorCipherOrder
• SSLInsecureRenegotiation
• SSLOCSPDefaultResponder
• SSLOCSPEnable
• SSLOCSPOverrideResponder
• SSLOCSPProxyURL
• SSLOCSPResponderTimeout
• SSLOCSPResponseMaxAge
• SSLOCSPResponseTimeSkew
• SSLOCSPUseRequestNonce
• SSLOpenSSLConfCmd
• SSLOptions
• SSLPassPhraseDialog
• SSLProtocol
• SSLProxyCACertificateFile
• SSLProxyCACertificatePath
• SSLProxyCARevocationCheck
• SSLProxyCARevocationFile
• SSLProxyCARevocationPath
• SSLProxyCheckPeerCN
• SSLProxyCheckPeerExpire
• SSLProxyCheckPeerName
• SSLProxyCipherSuite
• SSLProxyEngine
• SSLProxyMachineCertificateChainFile
• SSLProxyMachineCertificateFile
• SSLProxyMachineCertificatePath
• SSLProxyProtocol
• SSLProxyVerify
• SSLProxyVerifyDepth
• SSLRandomSeed
• SSLRenegBufferSize
• SSLRequire
• SSLRequireSSL
• SSLSessionCache
• SSLSessionCacheTimeout
• SSLSessionTicketKeyFile
• SSLSessionTickets
• SSLSRPUnknownUserSeed
• SSLSRPVerifierFile
• SSLStaplingCache
• SSLStaplingErrorCacheTimeout
• SSLStaplingFakeTryLater
• SSLStaplingForceURL
• SSLStaplingResponderTimeout
• SSLStaplingResponseMaxAge
• SSLStaplingResponseTimeSkew
• SSLStaplingReturnResponderErrors
• SSLStaplingStandardCacheTimeout
• SSLStrictSNIVHostCheck
• SSLUserName
• SSLUseStapling
• SSLVerifyClient
• SSLVerifyDepth

Environment Variables

This module can be configured to provide several items of SSL information as additional environment variables to the SSI and CGI namespace. This information is not provided by default for performance reasons. (See SSLOptions StdEnvVars, below.) The generated variables are listed in the table below. For backward compatibility the information can be made available under different names, too. Look in the Compatibility (p. 202) chapter for details on the compatibility variables.

Variable Name                  Value Type  Description
HTTPS                          flag        HTTPS is being used.
SSL_PROTOCOL                   string      The SSL protocol version (SSLv3, TLSv1, TLSv1.1, TLSv1.2)
SSL_SESSION_ID                 string      The hex-encoded SSL session id
SSL_SESSION_RESUMED            string      Initial or Resumed SSL Session. Note: multiple requests may be served over the same (Initial or Resumed) SSL session if HTTP KeepAlive is in use
SSL_SECURE_RENEG               string      true if secure renegotiation is supported, else false
SSL_CIPHER                     string      The cipher specification name
SSL_CIPHER_EXPORT              string      true if cipher is an export cipher
SSL_CIPHER_USEKEYSIZE          number      Number of cipher bits (actually used)
SSL_CIPHER_ALGKEYSIZE          number      Number of cipher bits (possible)
SSL_COMPRESS_METHOD            string      SSL compression method negotiated
SSL_VERSION_INTERFACE          string      The mod_ssl program version
SSL_VERSION_LIBRARY            string      The OpenSSL program version
SSL_CLIENT_M_VERSION           string      The version of the client certificate
SSL_CLIENT_M_SERIAL            string      The serial of the client certificate
SSL_CLIENT_S_DN                string      Subject DN in client's certificate
SSL_CLIENT_S_DN_x509           string      Component of client's Subject DN
SSL_CLIENT_SAN_Email_n         string      Client certificate's subjectAltName extension entries of type rfc822Name
SSL_CLIENT_SAN_DNS_n           string      Client certificate's subjectAltName extension entries of type dNSName
SSL_CLIENT_SAN_OTHER_msUPN_n   string      Client certificate's subjectAltName extension entries of type otherName, Microsoft User Principal Name form (OID 1.3.6.1.4.1.311.20.2.3)
SSL_CLIENT_I_DN                string      Issuer DN of client's certificate
SSL_CLIENT_I_DN_x509           string      Component of client's Issuer DN
SSL_CLIENT_V_START             string      Validity of client's certificate (start time)
SSL_CLIENT_V_END               string      Validity of client's certificate (end time)
SSL_CLIENT_V_REMAIN            string      Number of days until client's certificate expires
SSL_CLIENT_A_SIG               string      Algorithm used for the signature of client's certificate
SSL_CLIENT_A_KEY               string      Algorithm used for the public key of client's certificate
SSL_CLIENT_CERT                string      PEM-encoded client certificate
SSL_CLIENT_CERT_CHAIN_n        string      PEM-encoded certificates in client certificate chain
SSL_CLIENT_CERT_RFC4523_CEA    string      Serial number and issuer of the certificate. The format matches that of the CertificateExactAssertion in RFC4523
SSL_CLIENT_VERIFY              string      NONE, SUCCESS, GENEROUS or FAILED:reason
SSL_SERVER_M_VERSION           string      The version of the server certificate
SSL_SERVER_M_SERIAL            string      The serial of the server certificate
SSL_SERVER_S_DN                string      Subject DN in server's certificate
SSL_SERVER_SAN_Email_n         string      Server certificate's subjectAltName extension entries of type rfc822Name
SSL_SERVER_SAN_DNS_n           string      Server certificate's subjectAltName extension entries of type dNSName
SSL_SERVER_SAN_OTHER_dnsSRV_n  string      Server certificate's subjectAltName extension entries of type otherName, SRVName form (OID 1.3.6.1.5.5.7.8.7, RFC 4985)
SSL_SERVER_S_DN_x509           string      Component of server's Subject DN
SSL_SERVER_I_DN                string      Issuer DN of server's certificate
SSL_SERVER_I_DN_x509           string      Component of server's Issuer DN
SSL_SERVER_V_START             string      Validity of server's certificate (start time)
SSL_SERVER_V_END               string      Validity of server's certificate (end time)
SSL_SERVER_A_SIG               string      Algorithm used for the signature of server's certificate
SSL_SERVER_A_KEY               string      Algorithm used for the public key of server's certificate
SSL_SERVER_CERT                string      PEM-encoded server certificate
SSL_SRP_USER                   string      SRP username
SSL_SRP_USERINFO               string      SRP user info
SSL_TLS_SNI                    string      Contents of the SNI TLS extension (if supplied with ClientHello)

x509 specifies a component of an X.509 DN; one of C,ST,L,O,OU,CN,T,I,G,S,D,UID,Email. In Apache 2.1 and later, x509 may also include a numeric _n suffix. If the DN in question contains multiple attributes of the same name, this suffix is used as a zero-based index to select a particular attribute. For example, where the server certificate subject DN included two OU attributes, SSL_SERVER_S_DN_OU_0 and SSL_SERVER_S_DN_OU_1 could be used to reference each. A variable name without a _n suffix is equivalent to that name with a _0 suffix; the first (or only) attribute. When the environment table is populated using the StdEnvVars option of the SSLOptions directive, the first (or only) attribute of any DN is added only under a non-suffixed name; i.e. no _0 suffixed entries are added.

The format of the *_DN variables has changed in Apache HTTPD 2.3.11. See the LegacyDNStringFormat option for SSLOptions for details.

SSL_CLIENT_V_REMAIN is only available in version 2.1 and later.
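To make the variable naming concrete, the fragment below sketches two ways of consuming these values: exporting them to CGI/SSI scripts with SSLOptions +StdEnvVars, and reading individual DN components in a log line. It is only an illustration under assumptions: the file pattern and the log file name are hypothetical, and the second OU entry is only meaningful if the server certificate subject really carries two OU attributes.

```apache
# Export the SSL_* variables only for handlers that need them,
# since populating the environment has a performance cost.
# The file pattern is an example; adapt it to your own handlers.
<FilesMatch "\.(cgi|shtml)$">
    SSLOptions +StdEnvVars
</FilesMatch>

# Log the first OU attribute (non-suffixed name) and the second one
# (_1 suffix, zero-based index) of the server certificate subject DN.
# "logs/ssl_dn_log" is a hypothetical file name.
CustomLog "logs/ssl_dn_log" "%t %h %{SSL_SERVER_S_DN_OU}x %{SSL_SERVER_S_DN_OU_1}x"
```

A script that reads the exported environment should look up the first attribute as SSL_SERVER_S_DN_OU, because StdEnvVars adds no _0 suffixed entries.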
A number of additional environment variables can also be used in SSLRequire expressions, or in custom log formats:

    HTTP_USER_AGENT        PATH_INFO              AUTH_TYPE
    HTTP_REFERER           QUERY_STRING           SERVER_SOFTWARE
    HTTP_COOKIE            REMOTE_HOST            API_VERSION
    HTTP_FORWARDED         REMOTE_IDENT           TIME_YEAR
    HTTP_HOST              IS_SUBREQ              TIME_MON
    HTTP_PROXY_CONNECTION  DOCUMENT_ROOT          TIME_DAY
    HTTP_ACCEPT            SERVER_ADMIN           TIME_HOUR
    THE_REQUEST            SERVER_NAME            TIME_MIN
    REQUEST_FILENAME       SERVER_PORT            TIME_SEC
    REQUEST_METHOD         SERVER_PROTOCOL        TIME_WDAY
    REQUEST_SCHEME         REMOTE_ADDR            TIME
    REQUEST_URI            REMOTE_USER

In these contexts, two special formats can also be used:

ENV:variablename
    This will expand to the standard environment variable variablename.
HTTP:headername
    This will expand to the value of the request header with name headername.

Custom Log Formats

When mod_ssl is built into Apache or at least loaded (under DSO situation) additional functions exist for the Custom Log Format (p. 705) of mod_log_config. First there is an additional "%{varname}x" eXtension format function which can be used to expand any variables provided by any module, especially those provided by mod_ssl which you can find in the above table.

For backward compatibility there is additionally a special "%{name}c" cryptography format function provided. Information about this function is provided in the Compatibility (p. 202) chapter.

Example
    CustomLog "logs/ssl_request_log" "%t %h %{SSL_PROTOCOL}x %{SSL_CIPHER}x \"%r\" %b"

These formats even work without setting the StdEnvVars option of the SSLOptions directive.

Request Notes

mod_ssl sets "notes" for the request which can be used in logging with the %{name}n format string in mod_log_config.

The notes supported are as follows:

ssl-access-forbidden
    This note is set to the value 1 if access was denied due to an SSLRequire or SSLRequireSSL directive.
ssl-secure-reneg
    If mod_ssl is built against a version of OpenSSL which supports the secure renegotiation extension, this note is set to the value 1 if SSL is in use for the current connection, and the client also supports the secure renegotiation extension. If the client does not support the secure renegotiation extension, the note is set to the value 0. If mod_ssl is not built against a version of OpenSSL which supports secure renegotiation, or if SSL is not in use for the current connection, the note is not set.

Expression Parser Extension

When mod_ssl is built into Apache or at least loaded (under DSO situation) any variables provided by mod_ssl can be used in expressions for the ap_expr Expression Parser (p. 99). The variables can be referenced using the syntax "%{varname}". Starting with version 2.4.18 one can also use the mod_rewrite style syntax "%{SSL:varname}" or the function style syntax "ssl(varname)".

Example (using mod_headers)
    Header set X-SSL-PROTOCOL "expr=%{SSL_PROTOCOL}"
    Header set X-SSL-CIPHER "expr=%{SSL:SSL_CIPHER}"

This feature even works without setting the StdEnvVars option of the SSLOptions directive.

Authorization providers for use with Require

mod_ssl provides a few authorization providers for use with mod_authz_core's Require directive.

Require ssl

The ssl provider denies access if a connection is not encrypted with SSL. This is similar to the SSLRequireSSL directive.

    Require ssl

Require ssl-verify-client

The ssl-verify-client provider allows access if the user is authenticated with a valid client certificate. This is only useful if SSLVerifyClient optional is in effect.

The following example grants access if the user is authenticated either with a client certificate or by username and password.
    Require ssl-verify-client
    Require valid-user

SSLCACertificateFile Directive

Description: File of concatenated PEM-encoded CA Certificates for Client Auth
Syntax:      SSLCACertificateFile file-path
Context:     server config, virtual host
Status:      Extension
Module:      mod_ssl

This directive sets the all-in-one file where you can assemble the Certificates of Certification Authorities (CA) whose clients you deal with. These are used for Client Authentication. Such a file is simply the concatenation of the various PEM-encoded Certificate files, in order of preference. This can be used alternatively and/or additionally to SSLCACertificatePath.

Example
    SSLCACertificateFile /usr/local/apache2/conf/ssl.crt/ca-bundle-client.crt

SSLCACertificatePath Directive

Description: Directory of PEM-encoded CA Certificates for Client Auth
Syntax:      SSLCACertificatePath directory-path
Context:     server config, virtual host
Status:      Extension
Module:      mod_ssl

This directive sets the directory where you keep the Certificates of Certification Authorities (CAs) whose clients you deal with. These are used to verify the client certificate on Client Authentication.

The files in this directory have to be PEM-encoded and are accessed through hash filenames. So usually you can't just place the Certificate files there: you also have to create symbolic links named hash-value.N. And you should always make sure this directory contains the appropriate symbolic links.

Example
    SSLCACertificatePath /usr/local/apache2/conf/ssl.crt/

SSLCADNRequestFile Directive

Description: File of concatenated PEM-encoded CA Certificates for defining acceptable CA names
Syntax:      SSLCADNRequestFile file-path
Context:     server config, virtual host
Status:      Extension
Module:      mod_ssl

When a client certificate is requested by mod_ssl, a list of acceptable Certificate Authority names is sent to the client in the SSL handshake.
These CA names can be used by the client to select an appropriate client certificate out of those it has available.

If neither of the directives SSLCADNRequestPath or SSLCADNRequestFile is given, then the set of acceptable CA names sent to the client is the names of all the CA certificates given by the SSLCACertificateFile and SSLCACertificatePath directives; in other words, the names of the CAs which will actually be used to verify the client certificate.

In some circumstances, it is useful to be able to send a set of acceptable CA names which differs from the actual CAs used to verify the client certificate - for example, if the client certificates are signed by intermediate CAs. In such cases, SSLCADNRequestPath and/or SSLCADNRequestFile can be used; the acceptable CA names are then taken from the complete set of certificates in the directory and/or file specified by this pair of directives.

SSLCADNRequestFile must specify an all-in-one file containing a concatenation of PEM-encoded CA certificates.

Example
    SSLCADNRequestFile /usr/local/apache2/conf/ca-names.crt

SSLCADNRequestPath Directive

Description: Directory of PEM-encoded CA Certificates for defining acceptable CA names
Syntax:      SSLCADNRequestPath directory-path
Context:     server config, virtual host
Status:      Extension
Module:      mod_ssl

This optional directive can be used to specify the set of acceptable CA names which will be sent to the client when a client certificate is requested. See the SSLCADNRequestFile directive for more details.

The files in this directory have to be PEM-encoded and are accessed through hash filenames. So usually you can't just place the Certificate files there: you also have to create symbolic links named hash-value.N. And you should always make sure this directory contains the appropriate symbolic links.
Example
    SSLCADNRequestPath /usr/local/apache2/conf/ca-names.crt/

SSLCARevocationCheck Directive

Description:   Enable CRL-based revocation checking
Syntax:        SSLCARevocationCheck chain|leaf|none flags
Default:       SSLCARevocationCheck none
Context:       server config, virtual host
Status:        Extension
Module:        mod_ssl
Compatibility: Optional flags available in httpd 2.5-dev or later

Enables certificate revocation list (CRL) checking. At least one of SSLCARevocationFile or SSLCARevocationPath must be configured. When set to chain (recommended setting), CRL checks are applied to all certificates in the chain, while setting it to leaf limits the checks to the end-entity cert.

The available flags are:
• no_crl_for_cert_ok
  Prior to version 2.3.15, CRL checking in mod_ssl also succeeded when no CRL(s) for the checked certificate(s) were found in any of the locations configured with SSLCARevocationFile or SSLCARevocationPath.
  With the introduction of SSLCARevocationFile, the behavior has been changed: by default with chain or leaf, CRLs must be present for the validation to succeed - otherwise it will fail with an "unable to get certificate CRL" error.
  The flag no_crl_for_cert_ok restores the previous behaviour.

Example
    SSLCARevocationCheck chain

Compatibility with versions 2.2
    SSLCARevocationCheck chain no_crl_for_cert_ok

SSLCARevocationFile Directive

Description: File of concatenated PEM-encoded CA CRLs for Client Auth
Syntax:      SSLCARevocationFile file-path
Context:     server config, virtual host
Status:      Extension
Module:      mod_ssl

This directive sets the all-in-one file where you can assemble the Certificate Revocation Lists (CRL) of Certification Authorities (CA) whose clients you deal with. These are used for Client Authentication. Such a file is simply the concatenation of the various PEM-encoded CRL files, in order of preference. This can be used alternatively and/or additionally to SSLCARevocationPath.
Example
    SSLCARevocationFile /usr/local/apache2/conf/ssl.crl/ca-bundle-client.crl

SSLCARevocationPath Directive

Description: Directory of PEM-encoded CA CRLs for Client Auth
Syntax:      SSLCARevocationPath directory-path
Context:     server config, virtual host
Status:      Extension
Module:      mod_ssl

This directive sets the directory where you keep the Certificate Revocation Lists (CRL) of Certification Authorities (CAs) whose clients you deal with. These are used to revoke the client certificate on Client Authentication.

The files in this directory have to be PEM-encoded and are accessed through hash filenames. So usually it is not enough to place the CRL files there: you also have to create symbolic links named hash-value.rN. And you should always make sure this directory contains the appropriate symbolic links.

Example
    SSLCARevocationPath /usr/local/apache2/conf/ssl.crl/

SSLCertificateChainFile Directive

Description: File of PEM-encoded Server CA Certificates
Syntax:      SSLCertificateChainFile file-path
Context:     server config, virtual host
Status:      Extension
Module:      mod_ssl

Note: SSLCertificateChainFile is deprecated
SSLCertificateChainFile became obsolete with version 2.4.8, when SSLCertificateFile was extended to also load intermediate CA certificates from the server certificate file.

This directive sets the optional all-in-one file where you can assemble the certificates of Certification Authorities (CA) which form the certificate chain of the server certificate. This starts with the issuing CA certificate of the server certificate and can range up to the root CA certificate. Such a file is simply the concatenation of the various PEM-encoded CA Certificate files, usually in certificate chain order.

This should be used alternatively and/or additionally to SSLCACertificatePath for explicitly constructing the server certificate chain which is sent to the browser in addition to the server certificate.
It is especially useful to avoid conflicts with CA certificates when using client authentication: although placing a CA certificate of the server certificate chain into SSLCACertificatePath has the same effect for the certificate chain construction, it has the side-effect that client certificates issued by this same CA certificate are also accepted on client authentication.

But be careful: providing the certificate chain works only if you are using a single RSA or DSA based server certificate. If you are using a coupled RSA+DSA certificate pair, this will work only if both certificates actually use the same certificate chain. Otherwise the browsers will be confused in this situation.

Example
    SSLCertificateChainFile /usr/local/apache2/conf/ssl.crt/ca.crt

SSLCertificateFile Directive

Description: Server PEM-encoded X.509 certificate data file
Syntax:      SSLCertificateFile file-path
Context:     server config, virtual host
Status:      Extension
Module:      mod_ssl

This directive points to a file with certificate data in PEM format. At a minimum, the file must include an end-entity (leaf) certificate. The directive can be used multiple times (referencing different filenames) to support multiple algorithms for server authentication - typically RSA, DSA, and ECC. The number of supported algorithms depends on the OpenSSL version being used for mod_ssl: with version 1.0.0 or later, openssl list-public-key-algorithms will output a list of supported algorithms. See also the note below about limitations of OpenSSL versions prior to 1.0.2 and the ways to work around them.

The files may also include intermediate CA certificates, sorted from leaf to root. This is supported with version 2.4.8 and later, and obsoletes SSLCertificateChainFile. When running with OpenSSL 1.0.2 or later, this makes it possible to configure the intermediate CA chain on a per-certificate basis.
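The multi-certificate setup described above could look roughly like the fragment below. This is a sketch under assumptions rather than a reference configuration: the four file names are hypothetical, and each certificate file is assumed to hold the leaf certificate followed by its intermediate CA certificates (which requires version 2.4.8 or later).

```apache
# RSA certificate (leaf first, then intermediates) and its key.
# All four paths are hypothetical examples.
SSLCertificateFile    "/usr/local/apache2/conf/ssl.crt/server-rsa.crt"
SSLCertificateKeyFile "/usr/local/apache2/conf/ssl.key/server-rsa.key"

# ECC certificate and its key, configured alongside the RSA pair so
# that clients can negotiate either authentication algorithm.
SSLCertificateFile    "/usr/local/apache2/conf/ssl.crt/server-ecc.crt"
SSLCertificateKeyFile "/usr/local/apache2/conf/ssl.key/server-ecc.key"
```

Keeping each key in its own SSLCertificateKeyFile, rather than embedding it in the certificate file, follows the recommendation given for these directives.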
Custom DH parameters and an EC curve name for ephemeral keys can also be added to the end of the first file configured using SSLCertificateFile. This is supported in version 2.4.7 or later. Such parameters can be generated using the commands openssl dhparam and openssl ecparam. The parameters can be added as-is to the end of the first certificate file. Only the first file can be used for custom parameters, as they are applied independently of the authentication algorithm type.

Finally, the end-entity certificate's private key can also be added to the certificate file instead of using a separate SSLCertificateKeyFile directive. This practice is highly discouraged. If it is used, the certificate files using such an embedded key must be configured after the certificates using a separate key file. If the private key is encrypted, the pass phrase dialog is forced at startup time.

Note: DH parameter interoperability with primes > 1024 bit
Beginning with version 2.4.7, mod_ssl makes use of standardized DH parameters with prime lengths of 2048, 3072 and 4096 bits and with additional prime lengths of 6144 and 8192 bits beginning with version 2.4.10 (from RFC 3526 [a]), and hands them out to clients based on the length of the certificate's RSA/DSA key. With Java-based clients in particular (Java 7 or earlier), this may lead to handshake failures - see this FAQ answer (p. 212) for working around such issues.

[a] http://www.ietf.org/rfc/rfc3526.txt

Note: Default DH parameters when using multiple certificates and OpenSSL versions prior to 1.0.2
When using multiple certificates to support different authentication algorithms (like RSA, DSA, but mainly ECC) and OpenSSL prior to 1.0.2, it is recommended to either use custom DH parameters (preferably) by adding them to the first certificate file (as described above), or to order the SSLCertificateFile directives such that RSA/DSA certificates are placed after the ECC one.
This is due to a limitation in older versions of OpenSSL which don't let the Apache HTTP Server determine the currently selected certificate at handshake time (when the DH parameters must be sent to the peer) but instead always provide the last configured certificate. Consequently, the server may select default DH parameters based on the length of the wrong certificate's key (ECC keys are much smaller than RSA/DSA ones and their length is not relevant for selecting DH primes).
Since custom DH parameters always take precedence over the default ones, this issue can be avoided by creating and configuring them (as described above), thus using a custom/suitable length.

Example
    SSLCertificateFile /usr/local/apache2/conf/ssl.crt/server.crt

SSLCertificateKeyFile Directive

Description: Server PEM-encoded private key file
Syntax:      SSLCertificateKeyFile file-path
Context:     server config, virtual host
Status:      Extension
Module:      mod_ssl

This directive points to the PEM-encoded private key file for the server. If the contained private key is encrypted, the pass phrase dialog is forced at startup time.

The directive can be used multiple times (referencing different filenames) to support multiple algorithms for server authentication. For each SSLCertificateKeyFile directive, there must be a matching SSLCertificateFile directive.

The private key may also be combined with the certificate in the file given by SSLCertificateFile, but this practice is highly discouraged. If it is used, the certificate files using such an embedded key must be configured after the certificates using a separate key file.
Example
    SSLCertificateKeyFile /usr/local/apache2/conf/ssl.key/server.key

SSLCipherSuite Directive

Description: Cipher Suite available for negotiation in SSL handshake
Syntax: SSLCipherSuite cipher-spec
Default: SSLCipherSuite DEFAULT (depends on OpenSSL version)
Context: server config, virtual host, directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_ssl

This complex directive uses a colon-separated cipher-spec string consisting of OpenSSL cipher specifications to configure the Cipher Suite the client is permitted to negotiate in the SSL handshake phase. Notice that this directive can be used both in per-server and per-directory context. In per-server context it applies to the standard SSL handshake when a connection is established. In per-directory context it forces an SSL renegotiation with the reconfigured Cipher Suite after the HTTP request was read but before the HTTP response is sent.

An SSL cipher specification in cipher-spec is composed of 4 major attributes plus a few extra minor ones:

• Key Exchange Algorithm: RSA, Diffie-Hellman, Elliptic Curve Diffie-Hellman, Secure Remote Password.
• Authentication Algorithm: RSA, Diffie-Hellman, DSS, ECDSA, or none.
• Cipher/Encryption Algorithm: AES, DES, Triple-DES, RC4, RC2, IDEA, etc.
• MAC Digest Algorithm: MD5, SHA or SHA1, SHA256, SHA384.

An SSL cipher can also be an export cipher. SSLv2 ciphers are no longer supported. To specify which ciphers to use, one can either specify all the ciphers, one at a time, or use aliases to specify the preference and order for the ciphers (see Table 1). The ciphers and aliases actually available depend on the OpenSSL version in use. Newer OpenSSL versions may include additional ciphers.
Tag         Description

Key Exchange Algorithm:
kRSA        RSA key exchange
kDHr        Diffie-Hellman key exchange with RSA key
kDHd        Diffie-Hellman key exchange with DSA key
kEDH        Ephemeral (temp.key) Diffie-Hellman key exchange (no cert)
kSRP        Secure Remote Password (SRP) key exchange

Authentication Algorithm:
aNULL       No authentication
aRSA        RSA authentication
aDSS        DSS authentication
aDH         Diffie-Hellman authentication

Cipher Encoding Algorithm:
eNULL       No encryption
NULL        alias for eNULL
AES         AES encryption
DES         DES encryption
3DES        Triple-DES encryption
RC4         RC4 encryption
RC2         RC2 encryption
IDEA        IDEA encryption

MAC Digest Algorithm:
MD5         MD5 hash function
SHA1        SHA1 hash function
SHA         alias for SHA1
SHA256      SHA256 hash function
SHA384      SHA384 hash function

Aliases:
SSLv3       all SSL version 3.0 ciphers
TLSv1       all TLS version 1.0 ciphers
EXP         all export ciphers
EXPORT40    all 40-bit export ciphers only
EXPORT56    all 56-bit export ciphers only
LOW         all low strength ciphers (no export, single DES)
MEDIUM      all ciphers with 128 bit encryption
HIGH        all ciphers using Triple-DES
RSA         all ciphers using RSA key exchange
DH          all ciphers using Diffie-Hellman key exchange
EDH         all ciphers using Ephemeral Diffie-Hellman key exchange
ECDH        Elliptic Curve Diffie-Hellman key exchange
ADH         all ciphers using Anonymous Diffie-Hellman key exchange
AECDH       all ciphers using Anonymous Elliptic Curve Diffie-Hellman key exchange
SRP         all ciphers using Secure Remote Password (SRP) key exchange
DSS         all ciphers using DSS authentication
ECDSA       all ciphers using ECDSA authentication
aNULL       all ciphers using no authentication

Where this becomes interesting is that these tags can be put together to specify the order and ciphers you wish to use. To speed this up there are also aliases (SSLv3, TLSv1, EXP, LOW, MEDIUM, HIGH) for certain groups of ciphers. These tags can be joined together with prefixes to form the cipher-spec.
Available prefixes are:

• none: add cipher to list
• +: move matching ciphers to the current location in list
• -: remove cipher from list (can be added again later)
• !: kill cipher from list completely (cannot be added again later)

=⇒ aNULL, eNULL and EXP ciphers are always disabled
Beginning with version 2.4.7, null and export-grade ciphers are always disabled, as mod_ssl unconditionally adds !aNULL:!eNULL:!EXP to any cipher string at initialization.

A simpler way to look at all of this is to use the “openssl ciphers -v” command, which provides a nice way to successively create the correct cipher-spec string. The default cipher-spec string depends on the version of the OpenSSL libraries used. Let’s suppose it is “RC4-SHA:AES128-SHA:HIGH:MEDIUM:!aNULL:!MD5”, which means the following: Put RC4-SHA and AES128-SHA at the beginning. We do this because these ciphers offer a good compromise between speed and security. Next, include high and medium security ciphers. Finally, remove all ciphers which do not authenticate, i.e. for SSL the Anonymous Diffie-Hellman ciphers, as well as all ciphers which use MD5 as hash algorithm, because it has been proven insufficient.

    $ openssl ciphers -v 'RC4-SHA:AES128-SHA:HIGH:MEDIUM:!aNULL:!MD5'
    RC4-SHA             SSLv3 Kx=RSA  Au=RSA  Enc=RC4(128)  Mac=SHA1
    AES128-SHA          SSLv3 Kx=RSA  Au=RSA  Enc=AES(128)  Mac=SHA1
    DHE-RSA-AES256-SHA  SSLv3 Kx=DH   Au=RSA  Enc=AES(256)  Mac=SHA1
    ...                 ...   ...     ...     ...           ...
    SEED-SHA            SSLv3 Kx=RSA  Au=RSA  Enc=SEED(128) Mac=SHA1
    PSK-RC4-SHA         SSLv3 Kx=PSK  Au=PSK  Enc=RC4(128)  Mac=SHA1
    KRB5-RC4-SHA        SSLv3 Kx=KRB5 Au=KRB5 Enc=RC4(128)  Mac=SHA1

The complete list of particular RSA & DH ciphers for SSL is given in Table 2.
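The prefix rules above can be sketched as a small configuration fragment. The cipher selection shown here is an assumption chosen only to exercise each prefix, not a recommendation; the resulting list depends on the OpenSSL version and should be checked with openssl ciphers -v before use.

```apache
# Illustrative sketch only; the selection below is an assumption, not a
# recommendation. Check the expansion with: openssl ciphers -v '<cipher-spec>'
#
#   HIGH, MEDIUM   no prefix: add these groups to the list
#   -LOW           remove low strength ciphers (could be added again later)
#   !aNULL, !MD5   exclude permanently; later tags cannot add them again
SSLCipherSuite HIGH:MEDIUM:-LOW:!aNULL:!MD5

# Use the server's preference order rather than the client's.
SSLHonorCipherOrder on
```

Because ! is permanent while - is not, an exclusion that must hold regardless of later tags is better written with !.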
Example
    SSLCipherSuite RSA:!EXP:!NULL:+HIGH:+MEDIUM:-LOW

Cipher-Tag                 Protocol  Key Ex.   Auth.  Enc.       MAC   Type

RSA Ciphers:
DES-CBC3-SHA               SSLv3     RSA       RSA    3DES(168)  SHA1
IDEA-CBC-SHA               SSLv3     RSA       RSA    IDEA(128)  SHA1
RC4-SHA                    SSLv3     RSA       RSA    RC4(128)   SHA1
RC4-MD5                    SSLv3     RSA       RSA    RC4(128)   MD5
DES-CBC-SHA                SSLv3     RSA       RSA    DES(56)    SHA1
EXP-DES-CBC-SHA            SSLv3     RSA(512)  RSA    DES(40)    SHA1  export
EXP-RC2-CBC-MD5            SSLv3     RSA(512)  RSA    RC2(40)    MD5   export
EXP-RC4-MD5                SSLv3     RSA(512)  RSA    RC4(40)    MD5   export
NULL-SHA                   SSLv3     RSA       RSA    None       SHA1
NULL-MD5                   SSLv3     RSA       RSA    None       MD5

Diffie-Hellman Ciphers:
ADH-DES-CBC3-SHA           SSLv3     DH        None   3DES(168)  SHA1
ADH-DES-CBC-SHA            SSLv3     DH        None   DES(56)    SHA1
ADH-RC4-MD5                SSLv3     DH        None   RC4(128)   MD5
EDH-RSA-DES-CBC3-SHA       SSLv3     DH        RSA    3DES(168)  SHA1
EDH-DSS-DES-CBC3-SHA       SSLv3     DH        DSS    3DES(168)  SHA1
EDH-RSA-DES-CBC-SHA        SSLv3     DH        RSA    DES(56)    SHA1
EDH-DSS-DES-CBC-SHA        SSLv3     DH        DSS    DES(56)    SHA1
EXP-EDH-RSA-DES-CBC-SHA    SSLv3     DH(512)   RSA    DES(40)    SHA1  export
EXP-EDH-DSS-DES-CBC-SHA    SSLv3     DH(512)   DSS    DES(40)    SHA1  export
EXP-ADH-DES-CBC-SHA        SSLv3     DH(512)   None   DES(40)    SHA1  export
EXP-ADH-RC4-MD5            SSLv3     DH(512)   None   RC4(40)    MD5   export

SSLCompression Directive

Description: Enable compression on the SSL level
Syntax: SSLCompression on|off
Default: SSLCompression off
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available in httpd 2.4.3 and later, if using OpenSSL 0.9.8 or later; virtual host scope available if using OpenSSL 1.0.0 or later. The default used to be on in version 2.4.3.

This directive enables compression on the SSL level.

! Enabling compression causes security issues in most setups (the so-called CRIME attack).
SSLCryptoDevice Directive

Description: Enable use of a cryptographic hardware accelerator
Syntax: SSLCryptoDevice engine
Default: SSLCryptoDevice builtin
Context: server config
Status: Extension
Module: mod_ssl

This directive enables use of a cryptographic hardware accelerator board to offload some of the SSL processing overhead. This directive can only be used if the SSL toolkit is built with "engine" support; OpenSSL 0.9.7 and later releases have "engine" support by default, while for OpenSSL 0.9.6 the separate "-engine" releases must be used.

To discover which engine names are supported, run the command "openssl engine".

Example
    # For a Broadcom accelerator:
    SSLCryptoDevice ubsec

SSLEngine Directive

Description: SSL Engine Operation Switch
Syntax: SSLEngine on|off|optional
Default: SSLEngine off
Context: server config, virtual host
Status: Extension
Module: mod_ssl

This directive toggles the usage of the SSL/TLS Protocol Engine. It should be used inside a <VirtualHost> section to enable SSL/TLS for that virtual host. By default the SSL/TLS Protocol Engine is disabled for both the main server and all configured virtual hosts.

Example
    SSLEngine on
    #...

In Apache 2.1 and later, SSLEngine can be set to optional. This enables support for RFC 2817 [89], Upgrading to TLS Within HTTP/1.1. At this time no web browsers support RFC 2817.

SSLFIPS Directive

Description: SSL FIPS mode Switch
Syntax: SSLFIPS on|off
Default: SSLFIPS off
Context: server config
Status: Extension
Module: mod_ssl

This directive toggles the usage of the SSL library FIPS mode flag. It must be set in the global server context and cannot be configured with conflicting settings (SSLFIPS on followed by SSLFIPS off or similar). The mode applies to all SSL library operations.

If httpd was compiled against an SSL library which did not support the FIPS mode flag, SSLFIPS on will fail.
Refer to the FIPS 140-2 Security Policy document of the SSL provider library for specific requirements to use mod_ssl in a FIPS 140-2 approved mode of operation; note that mod_ssl itself is not validated, but may be described as using a FIPS 140-2 validated cryptographic module, when all components are assembled and operated under the guidelines imposed by the applicable Security Policy.

SSLHonorCipherOrder Directive

Description: Option to prefer the server’s cipher preference order
Syntax: SSLHonorCipherOrder on|off
Default: SSLHonorCipherOrder off
Context: server config, virtual host
Status: Extension
Module: mod_ssl

When choosing a cipher during an SSLv3 or TLSv1 handshake, normally the client’s preference is used. If this directive is enabled, the server’s preference will be used instead.

Example
    SSLHonorCipherOrder on

SSLInsecureRenegotiation Directive

Description: Option to enable support for insecure renegotiation
Syntax: SSLInsecureRenegotiation on|off
Default: SSLInsecureRenegotiation off
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available if using OpenSSL 0.9.8m or later

As originally specified, all versions of the SSL and TLS protocols (up to and including TLS/1.2) were vulnerable to a Man-in-the-Middle attack (CVE-2009-3555 [90]) during a renegotiation. This vulnerability allowed an attacker to "prefix" a chosen plaintext to the HTTP request as seen by the web server. A protocol extension was developed which fixed this vulnerability if supported by both client and server.

[89] http://www.ietf.org/rfc/rfc2817.txt
[90] http://cve.mitre.org/cgi-bin/cvename.cgi?name=CAN-2009-3555

If mod_ssl is linked against OpenSSL version 0.9.8m or later, by default renegotiation is only supported with clients supporting the new protocol extension. If this directive is enabled, renegotiation will be allowed with old (unpatched) clients, albeit insecurely.

!
Security warning
If this directive is enabled, SSL connections will be vulnerable to the Man-in-the-Middle prefix attack as described in CVE-2009-3555 [a].
[a] http://cve.mitre.org/cgi-bin/cvename.cgi?name=CAN-2009-3555

Example
    SSLInsecureRenegotiation on

The SSL_SECURE_RENEG environment variable can be used from an SSI or CGI script to determine whether secure renegotiation is supported for a given SSL connection.

SSLOCSPDefaultResponder Directive

Description: Set the default responder URI for OCSP validation
Syntax: SSLOCSPDefaultResponder uri
Context: server config, virtual host
Status: Extension
Module: mod_ssl

This option sets the default OCSP responder to use. If SSLOCSPOverrideResponder is not enabled, the URI given will be used only if no responder URI is specified in the certificate being verified.

SSLOCSPEnable Directive

Description: Enable OCSP validation of the client certificate chain
Syntax: SSLOCSPEnable on|off
Default: SSLOCSPEnable off
Context: server config, virtual host
Status: Extension
Module: mod_ssl

This option enables OCSP validation of the client certificate chain. If this option is enabled, certificates in the client’s certificate chain will be validated against an OCSP responder after normal verification (including CRL checks) has taken place.

The OCSP responder used is either extracted from the certificate itself, or derived by configuration; see the SSLOCSPDefaultResponder and SSLOCSPOverrideResponder directives.

Example
    SSLVerifyClient on
    SSLOCSPEnable on
    SSLOCSPDefaultResponder http://responder.example.com:8888/responder
    SSLOCSPOverrideResponder on
SSLOCSPOverrideResponder Directive

Description: Force use of the default responder URI for OCSP validation
Syntax: SSLOCSPOverrideResponder on|off
Default: SSLOCSPOverrideResponder off
Context: server config, virtual host
Status: Extension
Module: mod_ssl

This option forces the configured default OCSP responder to be used during OCSP certificate validation, regardless of whether the certificate being validated references an OCSP responder.

SSLOCSPProxyURL Directive

Description: Proxy URL to use for OCSP requests
Syntax: SSLOCSPProxyURL url
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available in httpd 2.4.19 and later

This option sets the URL of an HTTP proxy that should be used for all queries to OCSP responders.

SSLOCSPResponderTimeout Directive

Description: Timeout for OCSP queries
Syntax: SSLOCSPResponderTimeout seconds
Default: SSLOCSPResponderTimeout 10
Context: server config, virtual host
Status: Extension
Module: mod_ssl

This option sets the timeout for queries to OCSP responders, when SSLOCSPEnable is turned on.

SSLOCSPResponseMaxAge Directive

Description: Maximum allowable age for OCSP responses
Syntax: SSLOCSPResponseMaxAge seconds
Default: SSLOCSPResponseMaxAge -1
Context: server config, virtual host
Status: Extension
Module: mod_ssl

This option sets the maximum allowable age ("freshness") for OCSP responses. The default value (-1) does not enforce a maximum age, which means that OCSP responses are considered valid as long as their nextUpdate field is in the future.

SSLOCSPResponseTimeSkew Directive

Description: Maximum allowable time skew for OCSP response validation
Syntax: SSLOCSPResponseTimeSkew seconds
Default: SSLOCSPResponseTimeSkew 300
Context: server config, virtual host
Status: Extension
Module: mod_ssl

This option sets the maximum allowable time skew for OCSP responses (when checking their thisUpdate and nextUpdate fields).
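As a rough sketch of how the OCSP directives above combine, the fragment below tightens the timeout, freshness and skew settings for client certificate validation. The numeric values and the use of SSLVerifyClient require are assumptions chosen for illustration; the responder URL is the placeholder from the SSLOCSPEnable example.

```apache
# Sketch only: the values below are illustrative assumptions, not defaults.
SSLVerifyClient require
SSLOCSPEnable on

# Used only when the client certificate names no responder of its own,
# because SSLOCSPOverrideResponder is left at its default (off).
SSLOCSPDefaultResponder http://responder.example.com:8888/responder

# Give up on a responder after 5 seconds instead of the default 10.
SSLOCSPResponderTimeout 5

# Reject responses older than one day (default -1: no maximum age).
SSLOCSPResponseMaxAge 86400

# Tolerate 60 seconds of clock difference (default 300).
SSLOCSPResponseTimeSkew 60
```

Shorter timeouts and tighter freshness limits trade availability for stricter validation, so suitable values depend on how reliable the responder is.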
SSLOCSPUseRequestNonce Directive

Description: Use a nonce within OCSP queries
Syntax: SSLOCSPUseRequestNonce on|off
Default: SSLOCSPUseRequestNonce on
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available in httpd 2.4.10 and later

This option determines whether queries to OCSP responders should contain a nonce or not. By default, a query nonce is always used and checked against the one in the response. When the responder does not use nonces (e.g. Microsoft OCSP Responder), this option should be turned off.

SSLOpenSSLConfCmd Directive

Description: Configure OpenSSL parameters through its SSL_CONF API
Syntax: SSLOpenSSLConfCmd command-name command-value
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available in httpd 2.4.8 and later, if using OpenSSL 1.0.2 or later

This directive exposes OpenSSL’s SSL_CONF API to mod_ssl, allowing a flexible configuration of OpenSSL parameters without the need to implement additional mod_ssl directives when new features are added to OpenSSL.

The set of available SSLOpenSSLConfCmd commands depends on the OpenSSL version being used for mod_ssl (at least version 1.0.2 is required). For a list of supported command names, see the section Supported configuration file commands in the SSL_CONF_cmd(3) [91] manual page for OpenSSL.

Some of the SSLOpenSSLConfCmd commands can be used as an alternative to existing directives (such as SSLCipherSuite or SSLProtocol), though it should be noted that the syntax / allowable values for the parameters may sometimes differ.
Examples
    SSLOpenSSLConfCmd Options -SessionTicket,ServerPreference
    SSLOpenSSLConfCmd ECDHParameters brainpoolP256r1
    SSLOpenSSLConfCmd ServerInfoFile /usr/local/apache2/conf/server-info.pem
    SSLOpenSSLConfCmd Protocol "-ALL, TLSv1.2"
    SSLOpenSSLConfCmd SignatureAlgorithms RSA+SHA384:ECDSA+SHA256

[91] http://www.openssl.org/docs/man1.0.2/ssl/SSL_CONF_cmd.html#SUPPORTED-CONFIGURATION-FILE-COMMANDS

SSLOptions Directive

Description: Configure various SSL engine run-time options
Syntax: SSLOptions [+|-]option ...
Context: server config, virtual host, directory, .htaccess
Override: Options
Status: Extension
Module: mod_ssl

This directive can be used to control various run-time options on a per-directory basis. Normally, if multiple SSLOptions could apply to a directory, then the most specific one is taken completely; the options are not merged. However, if all the options on the SSLOptions directive are preceded by a plus (+) or minus (-) symbol, the options are merged. Any options preceded by a + are added to the options currently in force, and any options preceded by a - are removed from the options currently in force.

The available options are:

• StdEnvVars
When this option is enabled, the standard set of SSL related CGI/SSI environment variables are created. This is disabled by default for performance reasons, because the information extraction step is a rather expensive operation. So one usually enables this option for CGI and SSI requests only.

• ExportCertData
When this option is enabled, additional CGI/SSI environment variables are created: SSL_SERVER_CERT, SSL_CLIENT_CERT and SSL_CLIENT_CERT_CHAIN_n (with n = 0,1,2,..). These contain the PEM-encoded X.509 Certificates of server and client for the current HTTPS connection and can be used by CGI scripts for deeper Certificate checking. Additionally all other certificates of the client certificate chain are provided, too.
This bloats up the environment a little bit, which is why you have to use this option to enable it on demand.

• FakeBasicAuth
When this option is enabled, the Subject Distinguished Name (DN) of the Client X509 Certificate is translated into an HTTP Basic Authorization username. This means that the standard Apache authentication methods can be used for access control. The user name is just the Subject of the Client’s X509 Certificate (can be determined by running OpenSSL’s openssl x509 command: openssl x509 -noout -subject -in certificate.crt). The optional SSLUserName directive can be used to specify which part of the certificate Subject is embedded in the username. Note that no password is obtained from the user. Every entry in the user file needs this password: “xxj31ZMTZzkVA”, which is the DES-encrypted version of the word “password”. Those who live under MD5-based encryption (for instance under FreeBSD or BSD/OS, etc.) should use the following MD5 hash of the same word: “$1$OXLyS...$Owx8s2/m9/gfkcRVXzgoE/”.
Note that the AuthBasicFake directive within mod_auth_basic can be used as a more general mechanism for faking basic authentication, giving control over the structure of both the username and password.

• StrictRequire
This forces forbidden access when SSLRequireSSL or SSLRequire successfully decided that access should be forbidden. Usually the default is that in the case where a “Satisfy any” directive is used, and other access restrictions are passed, denial of access due to SSLRequireSSL or SSLRequire is overridden (because that’s how the Apache Satisfy mechanism should work). But for strict access restriction you can use SSLRequireSSL and/or SSLRequire in combination with an “SSLOptions +StrictRequire”. Then an additional “Satisfy Any” has no chance once mod_ssl has decided to deny access.

• OptRenegotiate
This enables optimized SSL connection renegotiation handling when SSL directives are used in per-directory context.
By default a strict scheme is enabled where every per-directory reconfiguration of SSL parameters causes a full SSL renegotiation handshake. When this option is used mod_ssl tries to avoid unnecessary handshakes by doing more granular (but still safe) parameter checks. Nevertheless these granular checks sometimes may not be what the user expects, so enable this on a per-directory basis only, please.

• LegacyDNStringFormat
This option influences how values of the SSL_{CLIENT,SERVER}_{I,S}_DN variables are formatted. Since version 2.3.11, Apache HTTPD uses an RFC 2253 compatible format by default. This uses commas as delimiters between the attributes, allows the use of non-ASCII characters (which are converted to UTF8), escapes various special characters with backslashes, and sorts the attributes with the "C" attribute last.
If LegacyDNStringFormat is set, the old format will be used which sorts the "C" attribute first, uses slashes as separators, and does not handle non-ASCII and special characters in any consistent way.

Example
    SSLOptions +FakeBasicAuth -StrictRequire
    SSLOptions +StdEnvVars -ExportCertData

SSLPassPhraseDialog Directive

Description: Type of pass phrase dialog for encrypted private keys
Syntax: SSLPassPhraseDialog type
Default: SSLPassPhraseDialog builtin
Context: server config
Status: Extension
Module: mod_ssl

When Apache starts up it has to read the various Certificate (see SSLCertificateFile) and Private Key (see SSLCertificateKeyFile) files of the SSL-enabled virtual servers. Because for security reasons the Private Key files are usually encrypted, mod_ssl needs to query the administrator for a Pass Phrase in order to decrypt those files. This query can be done in the following ways, which can be configured by type:

• builtin
This is the default where an interactive terminal dialog occurs at startup time just before Apache detaches from the terminal.
Here the administrator has to manually enter the Pass Phrase for each encrypted Private Key file. Because a lot of SSL-enabled virtual hosts can be configured, the following reuse-scheme is used to minimize the dialog: When a Private Key file is encrypted, all known Pass Phrases (at the beginning there are none, of course) are tried. If one of those known Pass Phrases succeeds no dialog pops up for this particular Private Key file. If none succeeded, another Pass Phrase is queried on the terminal and remembered for the next round (where it perhaps can be reused).
This scheme allows mod_ssl to be maximally flexible (because for N encrypted Private Key files you can use N different Pass Phrases - but then you have to enter all of them, of course) while minimizing the terminal dialog (i.e. when you use a single Pass Phrase for all N Private Key files this Pass Phrase is queried only once).

• |/path/to/program [args...]
This mode allows an external program to be used which acts as a pipe to a particular input device; the program is sent the standard prompt text used for the builtin mode on stdin, and is expected to write password strings on stdout. If several passwords are needed (or an incorrect password is entered), additional prompt text will be written subsequent to the first password being returned, and more passwords must then be written back.

• exec:/path/to/program
Here an external program is configured which is called at startup for each encrypted Private Key file. It is called with one argument, a string of the form “servername:portnumber:index” (with index being a zero-based sequence number), which indicates for which server, TCP port and certificate number it has to print the corresponding Pass Phrase to stdout. The intent is that this external program first runs security checks to make sure that the system is not compromised by an attacker, and only when these checks were passed successfully it provides the Pass Phrase.
Both these security checks, and the way the Pass Phrase is determined, can be as complex as you like. mod_ssl just defines the interface: an executable program which provides the Pass Phrase on stdout. Nothing more or less! So, if you’re really paranoid about security, here is your interface. Anything else has to be left as an exercise to the administrator, because local security requirements are so different.
The reuse-algorithm above is used here, too. In other words: The external program is called only once per unique Pass Phrase.

Example
    SSLPassPhraseDialog exec:/usr/local/apache/sbin/pp-filter

SSLProtocol Directive

Description: Configure usable SSL/TLS protocol versions
Syntax: SSLProtocol [+|-]protocol ...
Default: SSLProtocol all -SSLv3
Context: server config, virtual host
Status: Extension
Module: mod_ssl

This directive can be used to control which versions of the SSL/TLS protocol will be accepted in new connections.

The available (case-insensitive) protocols are:

• SSLv3
This is the Secure Sockets Layer (SSL) protocol, version 3.0, from the Netscape Corporation. It is the successor to SSLv2 and the predecessor to TLSv1, but is deprecated in RFC 7568 [92].

• TLSv1
This is the Transport Layer Security (TLS) protocol, version 1.0. It is the successor to SSLv3 and is defined in RFC 2246 [93]. It is supported by nearly every client.

• TLSv1.1 (when using OpenSSL 1.0.1 and later)
A revision of the TLS 1.0 protocol, as defined in RFC 4346 [94].

• TLSv1.2 (when using OpenSSL 1.0.1 and later)
A revision of the TLS 1.1 protocol, as defined in RFC 5246 [95].

• all
This is a shortcut for “+SSLv3 +TLSv1” or - when using OpenSSL 1.0.1 and later - “+SSLv3 +TLSv1 +TLSv1.1 +TLSv1.2”, respectively (except for OpenSSL versions compiled with the “no-ssl3” configuration option, where all does not include +SSLv3).
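The [+|-] syntax can be illustrated with a short fragment that restricts connections to TLSv1.1 and TLSv1.2. Whether TLSv1 can be dropped depends on the clients that must be served, so the choice here is an assumption made for illustration, and it requires OpenSSL 1.0.1 or later.

```apache
# Sketch only: start from the "all" shortcut and subtract the versions
# that should not be accepted. Requires OpenSSL 1.0.1 or later for
# TLSv1.1 and TLSv1.2 to be part of "all".
SSLProtocol all -SSLv3 -TLSv1

# Alternative form: start from nothing and add versions explicitly.
# SSLProtocol -all +TLSv1.1 +TLSv1.2
```

The subtractive form picks up newer protocol versions automatically when the library adds them to all, whereas the additive form accepts only the versions that are listed.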
Example
    SSLProtocol TLSv1

[92] http://www.ietf.org/rfc/rfc7568.txt
[93] http://www.ietf.org/rfc/rfc2246.txt
[94] http://www.ietf.org/rfc/rfc4346.txt
[95] http://www.ietf.org/rfc/rfc5246.txt

SSLProxyCACertificateFile Directive

Description: File of concatenated PEM-encoded CA Certificates for Remote Server Auth
Syntax: SSLProxyCACertificateFile file-path
Context: server config, virtual host, proxy section
Override: Not applicable
Status: Extension
Module: mod_ssl

This directive sets the all-in-one file where you can assemble the Certificates of Certification Authorities (CA) whose remote servers you deal with. These are used for Remote Server Authentication. Such a file is simply the concatenation of the various PEM-encoded Certificate files, in order of preference. This can be used alternatively and/or additionally to SSLProxyCACertificatePath.

Example
    SSLProxyCACertificateFile /usr/local/apache2/conf/ssl.crt/ca-bundle-remote-server.crt

SSLProxyCACertificatePath Directive

Description: Directory of PEM-encoded CA Certificates for Remote Server Auth
Syntax: SSLProxyCACertificatePath directory-path
Context: server config, virtual host, proxy section
Override: Not applicable
Status: Extension
Module: mod_ssl

This directive sets the directory where you keep the Certificates of Certification Authorities (CAs) whose remote servers you deal with. These are used to verify the remote server certificate on Remote Server Authentication.

The files in this directory have to be PEM-encoded and are accessed through hash filenames. So usually you can’t just place the Certificate files there: you also have to create symbolic links named hash-value.N. And you should always make sure this directory contains the appropriate symbolic links.
Example
    SSLProxyCACertificatePath /usr/local/apache2/conf/ssl.crt/

SSLProxyCARevocationCheck Directive

Description: Enable CRL-based revocation checking for Remote Server Auth
Syntax: SSLProxyCARevocationCheck chain|leaf|none
Default: SSLProxyCARevocationCheck none
Context: server config, virtual host, proxy section
Override: Not applicable
Status: Extension
Module: mod_ssl

Enables certificate revocation list (CRL) checking for the remote servers you deal with. At least one of SSLProxyCARevocationFile or SSLProxyCARevocationPath must be configured. When set to chain (recommended setting), CRL checks are applied to all certificates in the chain, while setting it to leaf limits the checks to the end-entity cert.

=⇒ When set to chain or leaf, CRLs must be available for successful validation
Prior to version 2.3.15, CRL checking in mod_ssl also succeeded when no CRL(s) were found in any of the locations configured with SSLProxyCARevocationFile or SSLProxyCARevocationPath. With the introduction of this directive, the behavior has been changed: when checking is enabled, CRLs must be present for the validation to succeed - otherwise it will fail with an "unable to get certificate CRL" error.

Example
    SSLProxyCARevocationCheck chain

SSLProxyCARevocationFile Directive

Description: File of concatenated PEM-encoded CA CRLs for Remote Server Auth
Syntax: SSLProxyCARevocationFile file-path
Context: server config, virtual host, proxy section
Override: Not applicable
Status: Extension
Module: mod_ssl

This directive sets the all-in-one file where you can assemble the Certificate Revocation Lists (CRL) of Certification Authorities (CA) whose remote servers you deal with. These are used for Remote Server Authentication. Such a file is simply the concatenation of the various PEM-encoded CRL files, in order of preference. This can be used alternatively and/or additionally to SSLProxyCARevocationPath.
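Since SSLProxyCARevocationCheck only works when a CRL source is configured, the two directives are normally set together. The fragment below is a minimal sketch of that pairing; it reuses the file names from the examples in this section, which should be read as placeholders.

```apache
# Sketch only: file paths are placeholders taken from the examples in
# this section.
SSLProxyEngine on

# CA certificates used to verify the remote server.
SSLProxyCACertificateFile /usr/local/apache2/conf/ssl.crt/ca-bundle-remote-server.crt

# CRLs for those CAs. With "chain", a CRL must be present for every
# certificate in the chain, or validation fails with
# "unable to get certificate CRL".
SSLProxyCARevocationFile /usr/local/apache2/conf/ssl.crl/ca-bundle-remote-server.crl
SSLProxyCARevocationCheck chain
```

Where CRLs are only available for the issuing CA, leaf may be the more practical setting, at the cost of not checking intermediate certificates.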
Example
    SSLProxyCARevocationFile /usr/local/apache2/conf/ssl.crl/ca-bundle-remote-server.crl

SSLProxyCARevocationPath Directive

Description: Directory of PEM-encoded CA CRLs for Remote Server Auth
Syntax: SSLProxyCARevocationPath directory-path
Context: server config, virtual host, proxy section
Override: Not applicable
Status: Extension
Module: mod_ssl

This directive sets the directory where you keep the Certificate Revocation Lists (CRL) of Certification Authorities (CAs) whose remote servers you deal with. These are used to revoke the remote server certificate on Remote Server Authentication.

The files in this directory have to be PEM-encoded and are accessed through hash filenames. So usually you cannot just place the CRL files there: you also have to create symbolic links named hash-value.rN. And you should always make sure this directory contains the appropriate symbolic links.

Example
    SSLProxyCARevocationPath /usr/local/apache2/conf/ssl.crl/

SSLProxyCheckPeerCN Directive

Description: Whether to check the remote server certificate’s CN field
Syntax: SSLProxyCheckPeerCN on|off
Default: SSLProxyCheckPeerCN on
Context: server config, virtual host, proxy section
Override: Not applicable
Status: Extension
Module: mod_ssl

This directive sets whether the remote server certificate’s CN field is compared against the hostname of the request URL. If they are not equal, a 502 status code (Bad Gateway) is sent.

In 2.4.5 and later, SSLProxyCheckPeerCN has been superseded by SSLProxyCheckPeerName, and its setting is only taken into account when SSLProxyCheckPeerName off is specified at the same time.
Example
    SSLProxyCheckPeerCN on

SSLProxyCheckPeerExpire Directive

Description: Whether to check if remote server certificate is expired
Syntax: SSLProxyCheckPeerExpire on|off
Default: SSLProxyCheckPeerExpire on
Context: server config, virtual host, proxy section
Override: Not applicable
Status: Extension
Module: mod_ssl

This directive sets whether it is checked if the remote server certificate is expired or not. If the check fails a 502 status code (Bad Gateway) is sent.

Example
    SSLProxyCheckPeerExpire on

SSLProxyCheckPeerName Directive

Description: Configure host name checking for remote server certificates
Syntax: SSLProxyCheckPeerName on|off
Default: SSLProxyCheckPeerName on
Context: server config, virtual host, proxy section
Override: Not applicable
Status: Extension
Module: mod_ssl
Compatibility: Apache HTTP Server 2.4.5 and later

This directive configures host name checking for server certificates when mod_ssl is acting as an SSL client. The check will succeed if the host name from the request URI is found in either the subjectAltName extension or (one of) the CN attribute(s) in the certificate’s subject. If the check fails, the SSL request is aborted and a 502 status code (Bad Gateway) is returned. The directive supersedes SSLProxyCheckPeerCN, which only checks for the expected host name in the first CN attribute.

Wildcard matching is supported in one specific flavor: subjectAltName entries of type dNSName or CN attributes starting with *. will match for any DNS name with the same number of labels and the same suffix (i.e., *.example.org matches for foo.example.org, but not for foo.bar.example.org).
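To make the host name check concrete, the following sketch shows a reverse proxy forwarding to an HTTPS back end. The host name backend.example.org and the /app/ path are hypothetical, and ProxyPass and ProxyPassReverse come from mod_proxy, which is assumed to be loaded.

```apache
# Sketch only: backend.example.org and /app/ are hypothetical.
SSLProxyEngine on

# Both checks are on by default; they are spelled out here for clarity.
SSLProxyCheckPeerName on
SSLProxyCheckPeerExpire on

# The back end's certificate must carry backend.example.org in a
# subjectAltName entry or a CN attribute (a wildcard such as
# *.example.org would also match); otherwise the request is aborted
# with a 502 (Bad Gateway).
ProxyPass        "/app/" "https://backend.example.org/"
ProxyPassReverse "/app/" "https://backend.example.org/"
```

If the back end is addressed by IP address or by an internal alias that does not appear in its certificate, the check fails, so the URL given to ProxyPass should use a name the certificate actually contains.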
SSLProxyCipherSuite Directive

Description: Cipher Suite available for negotiation in SSL proxy handshake
Syntax: SSLProxyCipherSuite cipher-spec
Default: SSLProxyCipherSuite ALL:!ADH:RC4+RSA:+HIGH:+MEDIUM:+LOW:+EXP
Context: server config, virtual host, proxy section
Override: Not applicable
Status: Extension
Module: mod_ssl

Equivalent to SSLCipherSuite, but for the proxy connection. Please refer to SSLCipherSuite for additional information.

SSLProxyEngine Directive

Description: SSL Proxy Engine Operation Switch
Syntax: SSLProxyEngine on|off
Default: SSLProxyEngine off
Context: server config, virtual host, proxy section
Override: Not applicable
Status: Extension
Module: mod_ssl

This directive toggles the usage of the SSL/TLS Protocol Engine for proxy. This is usually used inside a <VirtualHost> section to enable SSL/TLS for proxy usage in a particular virtual host. By default the SSL/TLS Protocol Engine is disabled for proxy both for the main server and all configured virtual hosts.

Note that the SSLProxyEngine directive should not, in general, be included in a virtual host that will be acting as a forward proxy (using <Proxy> or ProxyRequests directives). SSLProxyEngine is not required to enable a forward proxy server to proxy SSL/TLS requests.

Example
    <VirtualHost _default_:443>
        SSLProxyEngine on
        #...
    </VirtualHost>

SSLProxyMachineCertificateChainFile Directive

Description: File of concatenated PEM-encoded CA certificates to be used by the proxy for choosing a certificate
Syntax: SSLProxyMachineCertificateChainFile filename
Context: server config, virtual host, proxy section
Override: Not applicable
Status: Extension
Module: mod_ssl

This directive sets the all-in-one file where you keep the certificate chain for all of the client certs in use. This directive will be needed if the remote server presents a list of CA certificates that are not direct signers of one of the configured client certificates.

This referenced file is simply the concatenation of the various PEM-encoded certificate files.
Upon startup, each client certificate configured will be examined and a chain of trust will be constructed.

Security warning: If this directive is enabled, all of the certificates in the file will be trusted as if they were also in SSLProxyCACertificateFile.

Example
    SSLProxyMachineCertificateChainFile /usr/local/apache2/conf/ssl.crt/proxyCA.pem

SSLProxyMachineCertificateFile Directive

Description: File of concatenated PEM-encoded client certificates and keys to be used by the proxy
Syntax: SSLProxyMachineCertificateFile filename
Context: server config, virtual host, proxy section
Override: Not applicable
Status: Extension
Module: mod_ssl

This directive sets the all-in-one file where you keep the certificates and keys used for authentication of the proxy server to remote servers.

This referenced file is simply the concatenation of the various PEM-encoded certificate files, in order of preference. Use this directive alternatively or additionally to SSLProxyMachineCertificatePath.

Warning: Currently there is no support for encrypted private keys.

Example
    SSLProxyMachineCertificateFile /usr/local/apache2/conf/ssl.crt/proxy.pem

SSLProxyMachineCertificatePath Directive

Description: Directory of PEM-encoded client certificates and keys to be used by the proxy
Syntax: SSLProxyMachineCertificatePath directory
Context: server config, virtual host, proxy section
Override: Not applicable
Status: Extension
Module: mod_ssl

This directive sets the directory where you keep the certificates and keys used for authentication of the proxy server to remote servers.

The files in this directory must be PEM-encoded and are accessed through hash filenames. Additionally, you must create symbolic links named hash-value.N, and you should always make sure this directory contains the appropriate symbolic links.

Warning: Currently there is no support for encrypted private keys.

Example
    SSLProxyMachineCertificatePath /usr/local/apache2/conf/proxy.crt/

SSLProxyProtocol Directive

Description: Configure usable SSL protocol flavors for proxy usage
Syntax: SSLProxyProtocol [+|-]protocol ...
Default: SSLProxyProtocol all -SSLv3
Context: server config, virtual host, proxy section
Override: Not applicable
Status: Extension
Module: mod_ssl

This directive can be used to control the SSL protocol flavors mod_ssl should use when establishing its server environment for proxy. It will only connect to servers using one of the provided protocols.

Please refer to SSLProtocol for additional information.

SSLProxyVerify Directive

Description: Type of remote server Certificate verification
Syntax: SSLProxyVerify level
Default: SSLProxyVerify none
Context: server config, virtual host, proxy section
Override: Not applicable
Status: Extension
Module: mod_ssl

When a proxy is configured to forward requests to a remote SSL server, this directive can be used to configure certificate verification of the remote server.

The following levels are available for level:
• none: no remote server Certificate is required at all
• optional: the remote server may present a valid Certificate
• require: the remote server has to present a valid Certificate
• optional_no_ca: the remote server may present a valid Certificate but it need not be (successfully) verifiable.

In practice only levels none and require are really interesting, because level optional doesn’t work with all servers and level optional_no_ca is actually against the idea of authentication (but can be used to establish SSL test pages, etc.)

Example
    SSLProxyVerify require

SSLProxyVerifyDepth Directive

Description: Maximum depth of CA Certificates in Remote Server Certificate verification
Syntax: SSLProxyVerifyDepth number
Default: SSLProxyVerifyDepth 1
Context: server config, virtual host, proxy section
Override: Not applicable
Status: Extension
Module: mod_ssl

This directive sets how deeply mod_ssl should verify before deciding that the remote server does not have a valid certificate.

The depth actually is the maximum number of intermediate certificate issuers, i.e. the maximum number of CA certificates which may be followed while verifying the remote server certificate. A depth of 0 means that only self-signed remote server certificates are accepted, the default depth of 1 means the remote server certificate can be self-signed or has to be signed by a CA which is directly known to the server (i.e. the CA’s certificate is under SSLProxyCACertificatePath), etc.

Example
    SSLProxyVerifyDepth 10

SSLRandomSeed Directive

Description: Pseudo Random Number Generator (PRNG) seeding source
Syntax: SSLRandomSeed context source [bytes]
Context: server config
Status: Extension
Module: mod_ssl

This configures one or more sources for seeding the Pseudo Random Number Generator (PRNG) in OpenSSL at startup time (context is startup) and/or just before a new SSL connection is established (context is connect). This directive can only be used in the global server context because the PRNG is a global facility.

The following source variants are available:
• builtin
  This is the always available builtin seeding source. Its usage consumes minimum CPU cycles under runtime and hence can be always used without drawbacks. The source used for seeding the PRNG consists of the current time, the current process id and (when applicable) a randomly chosen 1KB extract of the inter-process scoreboard structure of Apache.
  The drawback is that this is not really a strong source and at startup time (where the scoreboard is still not available) this source just produces a few bytes of entropy. So you should always, at least for the startup, use an additional seeding source.
• file:/path/to/source
  This variant uses an external file /path/to/source as the source for seeding the PRNG. When bytes is specified, only the first bytes number of bytes of the file form the entropy (and bytes is given to /path/to/source as the first argument). When bytes is not specified the whole file forms the entropy (and 0 is given to /path/to/source as the first argument). Use this especially at startup time, for instance with an available /dev/random and/or /dev/urandom device (which usually exist on modern Unix derivatives like FreeBSD and Linux).
  But be careful: Usually /dev/random provides only as much entropy data as it actually has, i.e. when you request 512 bytes of entropy, but the device currently has only 100 bytes available, two things can happen: On some platforms you receive only the 100 bytes while on other platforms the read blocks until enough bytes are available (which can take a long time). Here using an existing /dev/urandom is better, because it never blocks and actually gives the amount of requested data. The drawback is just that the quality of the received data may not be the best.
• exec:/path/to/program
  This variant uses an external executable /path/to/program as the source for seeding the PRNG. When bytes is specified, only the first bytes number of bytes of its stdout contents form the entropy. When bytes is not specified, the entirety of the data produced on stdout forms the entropy. Use this only at startup time when you need a very strong seeding with the help of an external program (for instance as in the example below with the truerand utility you can find in the mod_ssl distribution which is based on the AT&T truerand library).
  Using this in the connection context slows down the server too dramatically, of course. So usually you should avoid using external programs in that context.
• egd:/path/to/egd-socket (Unix only)
  This variant uses the Unix domain socket of the external Entropy Gathering Daemon (EGD) (see http://www.lothar.com/tech/crypto/) to seed the PRNG. Use this if no random device exists on your platform.

Example
    SSLRandomSeed startup builtin
    SSLRandomSeed startup file:/dev/random
    SSLRandomSeed startup file:/dev/urandom 1024
    SSLRandomSeed startup exec:/usr/local/bin/truerand 16
    SSLRandomSeed connect builtin
    SSLRandomSeed connect file:/dev/random
    SSLRandomSeed connect file:/dev/urandom 1024

SSLRenegBufferSize Directive

Description: Set the size for the SSL renegotiation buffer
Syntax: SSLRenegBufferSize bytes
Default: SSLRenegBufferSize 131072
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_ssl

If an SSL renegotiation is required in per-location context, for example, any use of SSLVerifyClient in a Directory or Location block, then mod_ssl must buffer any HTTP request body into memory until the new SSL handshake can be performed. This directive can be used to set the amount of memory that will be used for this buffer.

Warning: Note that in many configurations, the client sending the request body will be untrusted so a denial of service attack by consumption of memory must be considered when changing this configuration setting.

Example
    SSLRenegBufferSize 262144

SSLRequire Directive

Description: Allow access only when an arbitrarily complex boolean expression is true
Syntax: SSLRequire expression
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_ssl

Note: SSLRequire is deprecated
SSLRequire is deprecated and should in general be replaced by Require expr (p. 519). The so-called ap_expr (p.
99) syntax of Require expr is a superset of the syntax of SSLRequire, with the following exception:

In SSLRequire, the comparison operators <, <=, ... are completely equivalent to the operators lt, le, ... and work in a somewhat peculiar way that first compares the length of two strings and then the lexical order. On the other hand, ap_expr (p. 99) has two sets of comparison operators: The operators <, <=, ... do lexical string comparison, while the operators -lt, -le, ... do integer comparison. For the latter, there are also aliases without the leading dashes: lt, le, ...

This directive specifies a general access requirement which has to be fulfilled in order to allow access. It is a very powerful directive because the requirement specification is an arbitrarily complex boolean expression containing any number of access checks.

The expression must match the following syntax (given as a BNF grammar notation):

    expr     ::= "true" | "false"
               | "!" expr
               | expr "&&" expr
               | expr "||" expr
               | "(" expr ")"
               | comp

    comp     ::= word "==" word | word "eq" word
               | word "!=" word | word "ne" word
               | word "<"  word | word "lt" word
               | word "<=" word | word "le" word
               | word ">"  word | word "gt" word
               | word ">=" word | word "ge" word
               | word "in" "{" wordlist "}"
               | word "in" "PeerExtList(" word ")"
               | word "=~" regex
               | word "!~" regex

    wordlist ::= word
               | wordlist "," word

    word     ::= digit
               | cstring
               | variable
               | function

    digit    ::= [0-9]+
    cstring  ::= "..."
    variable ::= "%{" varname "}"
    function ::= funcname "(" funcargs ")"

For varname any of the variables described in Environment Variables can be used. For funcname the available functions are listed in the ap_expr documentation (p. 99).

The expression is parsed into an internal machine representation when the configuration is loaded, and then evaluated during request processing. In .htaccess context, the expression is both parsed and executed each time the .htaccess file is encountered during request processing.
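Because SSLRequire is deprecated, a restriction of this kind might instead be expressed with Require expr. The following is a hedged sketch rather than an example from this documentation: the directory path is a placeholder, and it assumes mod_ssl is loaded so that the SSL variables are available to ap_expr. It uses == for string equality and -ge / -le for integer comparison, which reflects the operator difference described above.

```apache
# Illustrative only: the directory path is an assumption.
<Directory "/usr/local/apache2/htdocs/secure">
    # String comparison on the organization, integer comparison on the hour.
    Require expr "%{SSL_CLIENT_S_DN_O} == 'Snake Oil, Ltd.' && %{TIME_HOUR} -ge 8 && %{TIME_HOUR} -le 20"
</Directory>
```

Writing the hour test with < or <= instead would request a lexical string comparison in ap_expr, which is usually not what is intended for numeric values.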
Example
    SSLRequire (    %{SSL_CIPHER} !~ m/^(EXP|NULL)-/                   \
                and %{SSL_CLIENT_S_DN_O} eq "Snake Oil, Ltd."          \
                and %{SSL_CLIENT_S_DN_OU} in {"Staff", "CA", "Dev"}    \
                and %{TIME_WDAY} -ge 1 and %{TIME_WDAY} -le 5          \
                and %{TIME_HOUR} -ge 8 and %{TIME_HOUR} -le 20       ) \
               or %{REMOTE_ADDR} =~ m/^192\.76\.162\.[0-9]+$/

The PeerExtList(object-ID) function expects to find zero or more instances of the X.509 certificate extension identified by the given object ID (OID) in the client certificate. The expression evaluates to true if the left-hand side string matches exactly against the value of an extension identified with this OID. (If multiple extensions with the same OID are present, at least one extension must match).

Example
    SSLRequire "foobar" in PeerExtList("1.2.3.4.5.6")

Notes on the PeerExtList function
• The object ID can be specified either as a descriptive name recognized by the SSL library, such as "nsComment", or as a numeric OID, such as "1.2.3.4.5.6".
• Expressions with types known to the SSL library are rendered to a string before comparison. For an extension with a type not recognized by the SSL library, mod_ssl will parse the value if it is one of the primitive ASN.1 types UTF8String, IA5String, VisibleString, or BMPString. For an extension of one of these types, the string value will be converted to UTF-8 if necessary, then compared against the left-hand-side expression.

See also
• Environment Variables in Apache HTTP Server (p. 92), for additional examples.
• Require expr (p. 519)
• Generic expression syntax in Apache HTTP Server (p. 99)

SSLRequireSSL Directive

Description: Deny access when SSL is not used for the HTTP request
Syntax: SSLRequireSSL
Context: directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_ssl

This directive forbids access unless HTTP over SSL (i.e. HTTPS) is enabled for the current connection.
This is very handy inside the SSL-enabled virtual host or directories for defending against configuration errors that expose content that should be protected. When this directive is present, all requests that are not using SSL are denied.

Example
    SSLRequireSSL

SSLSessionCache Directive

Description: Type of the global/inter-process SSL Session Cache
Syntax: SSLSessionCache type
Default: SSLSessionCache none
Context: server config
Status: Extension
Module: mod_ssl

This configures the storage type of the global/inter-process SSL Session Cache. This cache is an optional facility which speeds up parallel request processing. For requests to the same server process (via HTTP keep-alive), OpenSSL already caches the SSL session information locally. But because modern clients request inlined images and other data via parallel requests (usually up to four parallel requests are common), those requests are served by different pre-forked server processes. Here an inter-process cache helps to avoid unnecessary session handshakes.

The following five storage types are currently supported:
• none
  This disables the global/inter-process Session Cache. This will incur a noticeable speed penalty and may cause problems if using certain browsers, particularly if client certificates are enabled. This setting is not recommended.
• nonenotnull
  This disables any global/inter-process Session Cache. However it does force OpenSSL to send a non-null session ID to accommodate buggy clients that require one.
• dbm:/path/to/datafile
  This makes use of a DBM hashfile on the local disk to synchronize the local OpenSSL memory caches of the server processes. This session cache may suffer reliability issues under high load. To use this, ensure that mod_socache_dbm is loaded.
• shmcb:/path/to/datafile[(size)]
  This makes use of a high-performance cyclic buffer (approx.
  size bytes in size) inside a shared memory segment in RAM (established via /path/to/datafile) to synchronize the local OpenSSL memory caches of the server processes. This is the recommended session cache. To use this, ensure that mod_socache_shmcb is loaded.
• dc:UNIX:/path/to/socket
  This makes use of the distcache (http://distcache.sourceforge.net/) distributed session caching libraries. The argument should specify the location of the server or proxy to be used using the distcache address syntax; for example, UNIX:/path/to/socket specifies a UNIX domain socket (typically a local dc client proxy); IP:server.example.com:9001 specifies an IP address. To use this, ensure that mod_socache_dc is loaded.

Examples
    SSLSessionCache dbm:/usr/local/apache/logs/ssl_gcache_data
    SSLSessionCache shmcb:/usr/local/apache/logs/ssl_gcache_data(512000)

The ssl-cache mutex is used to serialize access to the session cache to prevent corruption. This mutex can be configured using the Mutex directive.

SSLSessionCacheTimeout Directive

Description: Number of seconds before an SSL session expires in the Session Cache
Syntax: SSLSessionCacheTimeout seconds
Default: SSLSessionCacheTimeout 300
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Applies also to RFC 5077 TLS session resumption in Apache 2.4.10 and later

This directive sets the timeout in seconds for the information stored in the global/inter-process SSL Session Cache, the OpenSSL internal memory cache and for sessions resumed by TLS session resumption (RFC 5077). It can be set as low as 15 for testing, but should be set to higher values like 300 in real life.
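The session cache type and its timeout are normally configured together in the global server context. A minimal sketch, assuming a conventional module layout; the LoadModule path and the cache file location are illustrative placeholders rather than required values:

```apache
# Illustrative only: module path and cache file location are assumptions.
LoadModule socache_shmcb_module modules/mod_socache_shmcb.so

# Shared-memory cyclic buffer of roughly 512000 bytes, shared by all server processes.
SSLSessionCache shmcb:/usr/local/apache/logs/ssl_gcache_data(512000)

# Sessions expire from the cache after five minutes.
SSLSessionCacheTimeout 300
```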
Example
    SSLSessionCacheTimeout 600

SSLSessionTicketKeyFile Directive

Description: Persistent encryption/decryption key for TLS session tickets
Syntax: SSLSessionTicketKeyFile file-path
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available in httpd 2.4.0 and later, if using OpenSSL 0.9.8h or later

Optionally configures a secret key for encrypting and decrypting TLS session tickets, as defined in RFC 5077 (http://www.ietf.org/rfc/rfc5077.txt). Primarily suitable for clustered environments where TLS session information should be shared between multiple nodes. For single-instance httpd setups, it is recommended to not configure a ticket key file, but to rely on (random) keys generated by mod_ssl at startup, instead.

The ticket key file must contain 48 bytes of random data, preferably created from a high-entropy source. On a Unix-based system, a ticket key file can be created as follows:

    dd if=/dev/random of=/path/to/file.tkey bs=1 count=48

Ticket keys should be rotated (replaced) on a frequent basis, as this is the only way to invalidate an existing session ticket - OpenSSL does not currently allow specifying a limit for ticket lifetimes. A new ticket key only gets used after restarting the web server. All existing session tickets become invalid after a restart.

Warning: The ticket key file contains sensitive keying material and should be protected with file permissions similar to those used for SSLCertificateKeyFile.

SSLSessionTickets Directive

Description: Enable or disable use of TLS session tickets
Syntax: SSLSessionTickets on|off
Default: SSLSessionTickets on
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available in httpd 2.4.11 and later, if using OpenSSL 0.9.8f or later.

This directive enables or disables the use of TLS session tickets (RFC 5077).

Warning: TLS session tickets are enabled by default.
Using them without restarting the web server with an appropriate frequency (e.g. daily) compromises perfect forward secrecy.

SSLSRPUnknownUserSeed Directive

Description: SRP unknown user seed
Syntax: SSLSRPUnknownUserSeed secret-string
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available in httpd 2.4.4 and later, if using OpenSSL 1.0.1 or later

This directive sets the seed used to fake SRP user parameters for unknown users, to avoid leaking whether a given user exists. Specify a secret string. If this directive is not used, then Apache will return the UNKNOWN_PSK_IDENTITY alert to clients who specify an unknown username.

Example
    SSLSRPUnknownUserSeed "secret"

SSLSRPVerifierFile Directive

Description: Path to SRP verifier file
Syntax: SSLSRPVerifierFile file-path
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available in httpd 2.4.4 and later, if using OpenSSL 1.0.1 or later

This directive enables TLS-SRP and sets the path to the OpenSSL SRP (Secure Remote Password) verifier file containing TLS-SRP usernames, verifiers, salts, and group parameters.

Example
    SSLSRPVerifierFile "/path/to/file.srpv"

The verifier file can be created with the openssl command line utility:

Creating the SRP verifier file
    openssl srp -srpvfile passwd.srpv -userinfo "some info" -add username

The value given with the optional -userinfo parameter is available in the SSL_SRP_USERINFO request environment variable.

SSLStaplingCache Directive

Description: Configures the OCSP stapling cache
Syntax: SSLStaplingCache type
Context: server config
Status: Extension
Module: mod_ssl
Compatibility: Available if using OpenSSL 0.9.8h or later

Configures the cache used to store OCSP responses which get included in the TLS handshake if SSLUseStapling is enabled. Configuration of a cache is mandatory for OCSP stapling.
With the exception of none and nonenotnull, the same storage types are supported as with SSLSessionCache.

SSLStaplingErrorCacheTimeout Directive

Description: Number of seconds before expiring invalid responses in the OCSP stapling cache
Syntax: SSLStaplingErrorCacheTimeout seconds
Default: SSLStaplingErrorCacheTimeout 600
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available if using OpenSSL 0.9.8h or later

Sets the timeout in seconds before invalid responses in the OCSP stapling cache (configured through SSLStaplingCache) will expire. To set the cache timeout for valid responses, see SSLStaplingStandardCacheTimeout.

SSLStaplingFakeTryLater Directive

Description: Synthesize "tryLater" responses for failed OCSP stapling queries
Syntax: SSLStaplingFakeTryLater on|off
Default: SSLStaplingFakeTryLater on
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available if using OpenSSL 0.9.8h or later

When enabled and a query to an OCSP responder for stapling purposes fails, mod_ssl will synthesize a "tryLater" response for the client. Only effective if SSLStaplingReturnResponderErrors is also enabled.

SSLStaplingForceURL Directive

Description: Override the OCSP responder URI specified in the certificate’s AIA extension
Syntax: SSLStaplingForceURL uri
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available if using OpenSSL 0.9.8h or later

This directive overrides the URI of an OCSP responder as obtained from the authorityInfoAccess (AIA) extension of the certificate. One potential use is when a proxy is used for retrieving OCSP queries.

SSLStaplingResponderTimeout Directive

Description: Timeout for OCSP stapling queries
Syntax: SSLStaplingResponderTimeout seconds
Default: SSLStaplingResponderTimeout 10
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available if using OpenSSL 0.9.8h or later

This option sets the timeout for queries to OCSP responders when SSLUseStapling is enabled and mod_ssl is querying a responder for OCSP stapling purposes.

SSLStaplingResponseMaxAge Directive

Description: Maximum allowable age for OCSP stapling responses
Syntax: SSLStaplingResponseMaxAge seconds
Default: SSLStaplingResponseMaxAge -1
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available if using OpenSSL 0.9.8h or later

This option sets the maximum allowable age ("freshness") when considering OCSP responses for stapling purposes, i.e. when SSLUseStapling is turned on. The default value (-1) does not enforce a maximum age, which means that OCSP responses are considered valid as long as their nextUpdate field is in the future.

SSLStaplingResponseTimeSkew Directive

Description: Maximum allowable time skew for OCSP stapling response validation
Syntax: SSLStaplingResponseTimeSkew seconds
Default: SSLStaplingResponseTimeSkew 300
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available if using OpenSSL 0.9.8h or later

This option sets the maximum allowable time skew when mod_ssl checks the thisUpdate and nextUpdate fields of OCSP responses which get included in the TLS handshake (OCSP stapling). Only applicable if SSLUseStapling is turned on.
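The stapling directives above are usually set together. The following is a minimal sketch, not a tuned configuration: the server name, certificate paths, cache location, cache size and the specific timeout values are assumptions chosen for illustration, and the shmcb cache type assumes mod_socache_shmcb is loaded.

```apache
# Illustrative only: names, paths, sizes and timeout values are assumptions.
# SSLStaplingCache is only valid in the global server config.
SSLStaplingCache shmcb:/usr/local/apache/logs/ssl_stapling(32768)

<VirtualHost *:443>
    ServerName www.example.com
    SSLEngine on
    SSLCertificateFile /usr/local/apache2/conf/ssl.crt/server.crt
    SSLCertificateKeyFile /usr/local/apache2/conf/ssl.key/server.key

    SSLUseStapling on
    # Give up on a slow OCSP responder after 5 seconds.
    SSLStaplingResponderTimeout 5
    # Enforce a maximum age of one day for OCSP responses.
    SSLStaplingResponseMaxAge 86400
    # Tolerate up to 5 minutes of clock difference when checking thisUpdate/nextUpdate.
    SSLStaplingResponseTimeSkew 300
</VirtualHost>
```

Leaving SSLStaplingResponseMaxAge at its default of -1 would instead accept any response whose nextUpdate field is still in the future.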
SSLStaplingReturnResponderErrors Directive

Description: Pass stapling related OCSP errors on to client
Syntax: SSLStaplingReturnResponderErrors on|off
Default: SSLStaplingReturnResponderErrors on
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available if using OpenSSL 0.9.8h or later

When enabled, mod_ssl will pass responses from unsuccessful stapling related OCSP queries (such as responses with an overall status other than "successful", responses with a certificate status other than "good", expired responses etc.) on to the client. If set to off, only responses indicating a certificate status of "good" will be included in the TLS handshake.

SSLStaplingStandardCacheTimeout Directive

Description: Number of seconds before expiring responses in the OCSP stapling cache
Syntax: SSLStaplingStandardCacheTimeout seconds
Default: SSLStaplingStandardCacheTimeout 3600
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available if using OpenSSL 0.9.8h or later

Sets the timeout in seconds before responses in the OCSP stapling cache (configured through SSLStaplingCache) will expire. This directive applies to valid responses, while SSLStaplingErrorCacheTimeout is used for controlling the timeout for invalid/unavailable responses.

SSLStrictSNIVHostCheck Directive

Description: Whether to allow non-SNI clients to access a name-based virtual host.
Syntax: SSLStrictSNIVHostCheck on|off
Default: SSLStrictSNIVHostCheck off
Context: server config, virtual host
Status: Extension
Module: mod_ssl

This directive sets whether a non-SNI client is allowed to access a name-based virtual host. If set to on in the default name-based virtual host, clients that are SNI unaware will not be allowed to access any virtual host, belonging to this particular IP / port combination.
If set to on in any other virtual host, SNI unaware clients are not allowed to access this particular virtual host.

Warning: This option is only available if httpd was compiled against an SNI capable version of OpenSSL.

Example
    SSLStrictSNIVHostCheck on

SSLUserName Directive

Description: Variable name to determine user name
Syntax: SSLUserName varname
Context: server config, directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_ssl

This directive sets the "user" field in the Apache request object. This is used by lower modules to identify the user with a character string. In particular, this may cause the environment variable REMOTE_USER to be set. The varname can be any of the SSL environment variables.

When the FakeBasicAuth option is enabled, this directive instead controls the value of the username embedded within the basic authentication header (see SSLOptions).

Example
    SSLUserName SSL_CLIENT_S_DN_CN

SSLUseStapling Directive

Description: Enable stapling of OCSP responses in the TLS handshake
Syntax: SSLUseStapling on|off
Default: SSLUseStapling off
Context: server config, virtual host
Status: Extension
Module: mod_ssl
Compatibility: Available if using OpenSSL 0.9.8h or later

This option enables OCSP stapling, as defined by the "Certificate Status Request" TLS extension specified in RFC 6066. If enabled (and requested by the client), mod_ssl will include an OCSP response for its own certificate in the TLS handshake. Configuring an SSLStaplingCache is a prerequisite for enabling OCSP stapling.

OCSP stapling relieves the client of querying the OCSP responder on its own, but it should be noted that with the RFC 6066 specification, the server’s CertificateStatus reply may only include an OCSP response for a single certificate.
For server certificates with intermediate CA certificates in their chain (the typical case nowadays), stapling in its current implementation therefore only partially achieves the stated goal of "saving roundtrips and resources" - see also RFC 6961 (TLS Multiple Certificate Status Extension, http://www.ietf.org/rfc/rfc6961.txt).

When OCSP stapling is enabled, the ssl-stapling mutex is used to control access to the OCSP stapling cache in order to prevent corruption, and the ssl-stapling-refresh mutex is used to control refreshes of OCSP responses. These mutexes can be configured using the Mutex directive.

SSLVerifyClient Directive

Description: Type of Client Certificate verification
Syntax: SSLVerifyClient level
Default: SSLVerifyClient none
Context: server config, virtual host, directory, .htaccess
Override: AuthConfig
Status: Extension
Module: mod_ssl

This directive sets the Certificate verification level for the Client Authentication. Notice that this directive can be used both in per-server and per-directory context. In per-server context it applies to the client authentication process used in the standard SSL handshake when a connection is established. In per-directory context it forces an SSL renegotiation with the reconfigured client verification level after the HTTP request was read but before the HTTP response is sent.

The following levels are available for level:
• none: no client Certificate is required at all
• optional: the client may present a valid Certificate
• require: the client has to present a valid Certificate
• optional_no_ca: the client may present a valid Certificate but it need not be (successfully) verifiable.

In practice only levels none and require are really interesting, because level optional doesn’t work with all browsers and level optional_no_ca is actually against the idea of authentication (but can be used to establish SSL test pages, etc.)
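The difference between per-server and per-directory context can be sketched as follows. This is an illustrative fragment under stated assumptions: the CA certificate path and the /secure/area location are placeholders, and the fragment is meant to sit inside an SSL-enabled virtual host.

```apache
# Illustrative only: the CA file path and the location are assumptions.
# Per-server context: no client certificate is requested in the initial handshake.
SSLVerifyClient none
SSLCACertificateFile /usr/local/apache2/conf/ssl.crt/ca.crt

<Location "/secure/area">
    # Per-directory context: forces a renegotiation for requests under this location.
    SSLVerifyClient require
    SSLVerifyDepth 1
</Location>
```

Since the stricter level is applied only after the request has been read, any request body has to be buffered during the renegotiation, which is where SSLRenegBufferSize comes into play.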
Example SSLVerifyClient require SSLVerifyDepth Directive Description: Syntax: Default: Context: Override: Status: Module: Maximum depth of CA Certificates in Client Certificate verification SSLVerifyDepth number SSLVerifyDepth 1 server config, virtual host, directory, .htaccess AuthConfig Extension mod ssl This directive sets how deeply mod ssl should verify before deciding that the clients don’t have a valid certificate. Notice that this directive can be used both in per-server and per-directory context. In per-server context it applies to the client authentication process used in the standard SSL handshake when a connection is established. In per-directory context it forces a SSL renegotiation with the reconfigured client verification depth after the HTTP request was read but before the HTTP response is sent. The depth actually is the maximum number of intermediate certificate issuers, i.e. the number of CA certificates which are max allowed to be followed while verifying the client certificate. A depth of 0 means that self-signed client certificates are accepted only, the default depth of 1 means the client certificate can be self-signed or has to be signed by a CA which is directly known to the server (i.e. the CA’s certificate is under SSLCAC ERTIFICATE PATH), etc. Example SSLVerifyDepth 10 10.116. APACHE MODULE MOD SSL CT 10.116 955 Apache Module mod ssl ct Description: Status: ModuleIdentifier: SourceFile: Implementation of Certificate Transparency (RFC 6962) Extension ssl ct module mod ssl ct.c Summary This module provides an implementation of Certificate Transparency, in conjunction with MOD SSL and command-line tools from the certificate-transparency100 open source project. The goal of Certificate Transparency is to expose the use of server certificates which are trusted by browsers but were mistakenly or maliciously issued. More information about Certificate Transparency is available at http://www.certificate-transparency.org/101 . 
Key terminology used in this documentation:

Certificate log: A certificate log, referred to simply as log in this documentation, is a network service to which server certificates have been submitted. A user agent can confirm that the certificate of a server which it accesses has been submitted to a log which it trusts, and that the log itself has not been tampered with.

Signed Certificate Timestamp (SCT): This is an acknowledgement from a log that it has accepted a valid certificate. It is signed with the log's private key and can be verified with the log's public key. One or more SCTs are passed to clients during the handshake, either in the ServerHello (TLS extension), in a certificate extension, or in a stapled OCSP response.

This implementation for Apache httpd provides these features for TLS servers and proxies:

• Signed Certificate Timestamps (SCTs) can be obtained from logs automatically and, in conjunction with any statically configured SCTs, sent to aware clients in the ServerHello (during the handshake).
• SCTs can be received by the proxy from origin servers in the ServerHello, in a certificate extension, and/or within stapled OCSP responses; any SCTs received can be partially validated on-line and optionally queued for off-line audit.
• The proxy can be configured to disallow communication with an origin server which does not provide an SCT which passes on-line validation.

Configuration information about logs can be defined statically in the web server configuration or maintained in a SQLite3 database. In the latter case, mod_ssl_ct will reload the database periodically, so any site-specific infrastructure for maintaining and propagating log configuration information does not also have to restart httpd to make it take effect.
Note: This module is experimental for the following reasons:

• Insufficient test and review
• Reliance on an unreleased version of OpenSSL (1.0.2, Beta 3 or later) for basic operation
• Incomplete off-line audit capability

Configuration mechanisms, format of data saved for off-line audit, and other characteristics are subject to change based on further feedback and testing.

Directives

• CTAuditStorage
• CTLogClient
• CTLogConfigDB
• CTMaxSCTAge
• CTProxyAwareness
• CTSCTStorage
• CTServerHelloSCTLimit
• CTStaticLogConfig
• CTStaticSCTs

[100] https://code.google.com/p/certificate-transparency/

Server processing overview

Servers need to send SCTs to their clients. SCTs in a certificate extension or stapled OCSP response will be sent without any special program logic. This module handles sending SCTs configured by the administrator or received from configured logs.

The number of SCTs sent in the ServerHello (i.e., not including those in a certificate extension or stapled OCSP response) can be limited by the CTServerHelloSCTLimit directive.

For each server certificate, a daemon process maintains an SCT list to be sent in the ServerHello, created from statically configured SCTs as well as those received from logs. Logs marked as untrusted or with a maximum valid timestamp before the present time will be ignored. Periodically the daemon will submit certificates to a log as necessary (due to changed log configuration or age) and rebuild the concatenation of SCTs.

The SCT list for a server certificate will be sent to any client that indicates awareness in the ClientHello when that particular server certificate is used.

Proxy processing overview

The proxy indicates Certificate Transparency awareness in the ClientHello by including the signed_certificate_timestamp extension.
It can recognize SCTs received in the ServerHello, in an extension in the certificate for an origin server, or in a stapled OCSP response.

On-line verification is attempted for each received SCT:

• For any SCT, the timestamp can be checked to see if it is not yet valid based on the current time as well as any configured valid time interval for the log.
• For an SCT from a log for which a public key is configured, the server signature will be checked.

If verification fails for at least one SCT and does not succeed for any SCT, the connection is aborted if CTProxyAwareness is set to require.

Additionally, the server certificate chain and SCTs are stored for off-line verification if the CTAuditStorage directive is configured.

As an optimization, on-line verification and storing of data from the server is only performed the first time a web server child process receives the data. This saves some processing time as well as disk space. For typical reverse proxy setups, very little processing overhead will be required.

Log configuration

Servers and proxies use different information about logs for their processing. This log configuration can be set in two ways:

• Create a log configuration database using ctlogconfig, and configure the path to that database using the CTLogConfigDB directive. This method of configuration supports dynamic updates; mod_ssl_ct will re-read the database at intervals. Additionally, the off-line audit program ctauditscts can use this configuration to find the URL of logs.
• Configure information about logs statically using the CTStaticLogConfig directive. As with all other directives, the server must be restarted in order to pick up changes to the directives.

The information that can be configured about a log using either mechanism is described below:

log id: The log id is the SHA-256 hash of the log's public key, and is part of every SCT.
This is a convenient way to identify a particular log when configuring valid timestamp ranges or certain other information.

public key of the log: A proxy must have the public key of the log in order to check the signature in SCTs it receives which were obtained from the log. A server must have the public key of the log in order to submit certificates to it.

general trust/distrust setting: This is a mechanism to distrust or restore trust in a particular log, for whatever reason (including simply avoiding interaction with the log in situations where it is off-line).

minimum and/or maximum valid timestamps: When configured, the proxy will check that timestamps from SCTs are within the valid range.

log URL: The URL of the log (for its API) is required by a server in order to submit server certificates to the log. The server will submit each server certificate in order to obtain an SCT for each log with a configured URL, except when the log is also marked as distrusted or the current time is not within any configured valid timestamp range. The log URL is also needed by off-line auditing of SCTs received by a proxy.

Generally, only a small subset of this information is configured for a particular log. Refer to the documentation for the CTStaticLogConfig directive and the ctlogconfig command for more specific information.

Storing SCTs in a form consumable by mod_ssl_ct

mod_ssl_ct allows you to configure SCTs statically using the CTStaticSCTs directive. These must be in binary form, ready to send to a client.

Sample code in the form of a Python script to build an SCT in the correct format from data received from a log can be found in Tom Ritter's ct-tools repository [102]. Refer to write-sct.py.

Logging CT status in the access log

Proxy and server modes set the SSL_CT_PROXY_STATUS and SSL_CT_CLIENT_STATUS variables, respectively, to indicate if the corresponding peer is CT-aware.
Proxy mode sets the SSL_CT_PROXY_SCT_SOURCES variable to indicate whether and where SCTs were obtained (ServerHello, certificate extension, etc.).

These variables can be logged with the %{varname}e format of mod_log_config.

[102] https://github.com/tomrittervg/ct-tools

Off-line audit for proxy

Experimental support for this is implemented in the ctauditscts command, which itself relies on the verify_single_proof.py tool in the certificate-transparency open source project. ctauditscts can parse data for off-line audit (enabled with the CTAuditStorage directive) and invoke verify_single_proof.py.

Here are rough notes for using ctauditscts:

• Create a virtualenv using the requirements.txt file from the certificate-transparency project and run the following steps with that virtualenv activated.
• Set PYTHONPATH to include the python directory within the certificate-transparency tools.
• Set PATH to include the python/ct/client/tools directory.
• Run ctauditscts, passing the value of the CTAuditStorage directive and, optionally, the path to the log configuration database. The latter will be used to look up log URLs by log id.

The data saved for audit can also be used by other programs; refer to the ctauditscts source code for details on processing the data.

CTAuditStorage Directive

Description: Existing directory where data for off-line audit will be stored
Syntax: CTAuditStorage directory
Default: none
Context: server config
Status: Extension
Module: mod_ssl_ct

The CTAuditStorage directive sets the name of a directory where data will be stored for off-line audit. If directory is not absolute then it is assumed to be relative to DefaultRuntimeDir.

If this directive is not specified, data will not be stored for off-line audit.

The directory will contain files named PID.tmp for active child processes and files named PID.out for exited child processes. These .out files are ready for off-line audit.
The experimental command ctauditscts (in the httpd source tree, not currently installed) interfaces with certificate-transparency tools to perform the audit.

CTLogClient Directive

Description: Location of certificate-transparency log client tool
Syntax: CTLogClient executable
Default: none
Context: server config
Status: Extension
Module: mod_ssl_ct

executable is the full path to the log client tool, which is normally file cpp/client/ct (or ct.exe) within the source tree of the certificate-transparency open source project (https://code.google.com/p/certificate-transparency/).

An alternative implementation could be used to retrieve SCTs for a server certificate as long as the command-line interface is equivalent.

If this directive is not configured, server certificates cannot be submitted to logs in order to obtain SCTs; thus, only admin-managed SCTs or SCTs in certificate extensions will be provided to clients.

CTLogConfigDB Directive

Description: Log configuration database supporting dynamic updates
Syntax: CTLogConfigDB filename
Default: none
Context: server config
Status: Extension
Module: mod_ssl_ct

The CTLogConfigDB directive sets the name of a database containing configuration about known logs. If filename is not absolute then it is assumed to be relative to ServerRoot.

Refer to the documentation for the ctlogconfig program, which manages the database.

CTMaxSCTAge Directive

Description: Maximum age of SCT obtained from a log, before it will be refreshed
Syntax: CTMaxSCTAge num-seconds
Default: 1 day
Context: server config
Status: Extension
Module: mod_ssl_ct

Server certificates with SCTs which are older than this maximum age will be resubmitted to configured logs. Generally the log will return the same SCT as before, but that is subject to log operation. SCTs will be refreshed as necessary during normal server operation, with new SCTs returned to clients as they become available.
CTProxyAwareness Directive

Description: Level of CT awareness and enforcement for a proxy
Syntax: CTProxyAwareness oblivious|aware|require
Default: aware
Context: server config, virtual host
Status: Extension
Module: mod_ssl_ct

This directive controls awareness and checks for valid SCTs for a proxy. Several options are available:

oblivious: The proxy will neither ask for nor examine SCTs. Certificate Transparency processing for the proxy is completely disabled.

aware: The proxy will perform all appropriate Certificate Transparency processing, such as asking for and examining SCTs. However, the proxy will not disallow communication if the origin server does not provide any valid SCTs.

require: The proxy will abort communication with the origin server if it does not provide at least one SCT which passes on-line validation.

CTSCTStorage Directive

Description: Existing directory where SCTs are managed
Syntax: CTSCTStorage directory
Default: none
Context: server config
Status: Extension
Module: mod_ssl_ct

The CTSCTStorage directive sets the name of a directory where SCTs and SCT lists will be stored. If directory is not absolute then it is assumed to be relative to DefaultRuntimeDir.

A subdirectory for each server certificate contains information relating to that certificate; the name of the subdirectory is the SHA-256 hash of the certificate.

The certificate-specific directory contains SCTs retrieved from configured logs, SCT lists prepared from statically configured SCTs and retrieved SCTs, and other information used for managing SCTs.
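Returning to the three CTProxyAwareness levels: combined with the on-line checks from the proxy processing overview, the decision can be sketched in a few lines of Python. This is only an illustration, not the module's code. The Sct record, the function names, and the use of a single validity window for all SCTs are hypothetical simplifications, and the signature check is reduced to a precomputed flag.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Sct:
    """Hypothetical, already-parsed SCT as seen by the proxy."""
    timestamp_ms: int   # milliseconds since the epoch
    signature_ok: bool  # result of the signature check (assumed done elsewhere)


def sct_passes(sct: Sct, now_ms: int,
               min_ms: Optional[int] = None,
               max_ms: Optional[int] = None) -> bool:
    """On-line validation: timestamp window first, then the signature result."""
    if sct.timestamp_ms > now_ms:          # not yet valid
        return False
    if min_ms is not None and sct.timestamp_ms < min_ms:
        return False
    if max_ms is not None and sct.timestamp_ms > max_ms:
        return False
    return sct.signature_ok


def allow_origin(mode: str, scts: Iterable[Sct], now_ms: int,
                 min_ms: Optional[int] = None,
                 max_ms: Optional[int] = None) -> bool:
    """Decide whether the proxy keeps talking to the origin server."""
    if mode == "oblivious":
        return True                        # SCTs are neither requested nor examined
    passed = any(sct_passes(s, now_ms, min_ms, max_ms) for s in scts)
    if mode == "aware":
        return True                        # SCTs are examined, but never enforced
    if mode == "require":
        return passed                      # at least one SCT must validate
    raise ValueError(f"unknown CTProxyAwareness level: {mode}")
```

With require, an origin server that sends no SCTs at all is rejected just like one whose SCTs all fail validation; with aware, both are still accepted.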
CTServerHelloSCTLimit Directive

Description: Limit on number of SCTs that can be returned in ServerHello
Syntax: CTServerHelloSCTLimit limit
Default: 100
Context: server config
Status: Extension
Module: mod_ssl_ct

This directive can be used to limit the number of SCTs which can be returned by a TLS server in ServerHello, in case the number of configured logs and statically-defined SCTs is relatively high.

Typically only a few SCTs would be available, so this directive is only needed in special circumstances.

The directive does not take into account SCTs which may be provided in certificate extensions or in stapled OCSP responses.

CTStaticLogConfig Directive

Description: Static configuration of information about a log
Syntax: CTStaticLogConfig log-id|- public-key-file|- 1|0|- min-timestamp|- max-timestamp|- log-URL|-
Default: none
Context: server config
Status: Extension
Module: mod_ssl_ct

This directive is used to configure information about a particular log. This directive is appropriate when configuration information changes rarely. If dynamic configuration updates must be supported, refer to the CTLogConfigDB directive.

Each of the six fields must be specified, but usually only a small amount of information must be configured for each log; use - when no information is available for the field. For example, in support of a server-only configuration (i.e., no proxy), the administrator might configure only the log URL to be used when submitting server certificates and obtaining a Signed Certificate Timestamp.

The fields are defined as follows:

log-id: This is the id of the log, which is the SHA-256 hash of the log's public key, provided in hexadecimal format. This string is 64 characters in length. This field should be omitted when public-key-file is provided.

public-key-file: This is the name of a file containing the PEM encoding of the log's public key.
If the name is not absolute, then it is assumed to be relative to ServerRoot.

trust/distrust: Set this field to 1 to distrust this log, or to otherwise avoid using it for server certificate submission. Set this to - or 0 (the default) to treat the log normally.

min-timestamp and max-timestamp: A timestamp is a time expressed as the number of milliseconds since the epoch, ignoring leap seconds. This is the form of time used in Signed Certificate Timestamps. This must be provided as a decimal number. Specify - for one of the timestamps if it is unknown. For example, when configuring the minimum valid timestamp for a log which remains valid, specify - for max-timestamp. SCTs received from this log by the proxy are invalid if the timestamp is older than min-timestamp or newer than max-timestamp.

log-URL: This is the URL of the log, for use in submitting server certificates and in turn obtaining an SCT to be sent to clients.

See also

• Log configuration contains more general information about the fields which can be configured with this directive.

CTStaticSCTs Directive

Description: Static configuration of one or more SCTs for a server certificate
Syntax: CTStaticSCTs certificate-pem-file sct-directory
Default: none
Context: server config
Status: Extension
Module: mod_ssl_ct

This directive is used to statically define one or more SCTs corresponding to a server certificate. This mechanism can be used instead of or in addition to dynamically obtaining SCTs from configured logs. Any changes to the set of SCTs for a particular server certificate will be adopted dynamically without the need to restart the server.

certificate-pem-file refers to the server certificate in PEM format. If the name is not absolute, then it is assumed to be relative to ServerRoot.

sct-directory should contain one or more files with extension .sct, representing one or more SCTs corresponding to the server certificate.
If sct-directory is not absolute, then it is assumed to be relative to ServerRoot.

If sct-directory is empty, no error will be raised.

This directive could be used to identify directories of SCTs maintained by other infrastructure, provided that they are saved in binary format with file extension .sct.

10.117 Apache Module mod_status

Description: Provides information on server activity and performance
Status: Base
ModuleIdentifier: status_module
SourceFile: mod_status.c

Summary

The Status module allows a server administrator to find out how well their server is performing. An HTML page is presented that gives the current server statistics in an easily readable form. If required, this page can be made to refresh automatically (given a compatible browser). Another page gives a simple machine-readable list of the current server state.

The details given are:

• The number of workers serving requests
• The number of idle workers
• The status of each worker, the number of requests that worker has performed and the total number of bytes served by the worker (*)
• A total number of accesses and byte count served (*)
• The time the server was started/restarted and the time it has been running for
• Averages giving the number of requests per second, the number of bytes served per second and the average number of bytes per request (*)
• The current percentage CPU used by each worker and in total by all workers combined (*)
• The current hosts and requests being processed (*)

The lines marked "(*)" are only available if ExtendedStatus is On. Since version 2.3.6, loading mod_status toggles ExtendedStatus On by default.

Directives

This module provides no directives.
Enabling Status Support

To enable status reports only for browsers from the example.com domain, add this code to your httpd.conf configuration file:

    <Location "/server-status">
        SetHandler server-status
        Require host example.com
    </Location>

You can now access server statistics by using a Web browser to access the page http://your.server.name/server-status

Automatic Updates

You can get the status page to update itself automatically if you have a browser that supports "refresh". Access the page http://your.server.name/server-status?refresh=N to refresh the page every N seconds.

Machine Readable Status File

A machine-readable version of the status file is available by accessing the page http://your.server.name/server-status?auto. This is useful when run automatically; see the Perl program log_server_status, which you will find in the /support directory of your Apache HTTP Server installation.

Note: It should be noted that if mod_status is loaded into the server, its handler capability is available in all configuration files, including per-directory files (e.g., .htaccess). This may have security-related ramifications for your site.

Using server-status to troubleshoot

The server-status page may be used as a starting place for troubleshooting a situation where your server is consuming all available resources (CPU or memory), and you wish to identify which requests or clients are causing the problem.

First, ensure that you have ExtendedStatus set on, so that you can see the full request and client information for each child or thread.

Now look in your process list (using top, or similar process viewing utility) to identify the specific processes that are the main culprits. Order the output of top by CPU usage, or memory usage, depending on what problem you're trying to address.

Reload the server-status page, and look for those process ids, and you'll be able to see what request is being served by that process, for what client.
Requests are transient, so you may need to try several times before you catch it in the act, so to speak.

This process should give you some idea what client, or what type of requests, are primarily responsible for your load problems. Often you will identify a particular web application that is misbehaving, or a particular client that is attacking your site.

10.118 Apache Module mod_substitute

Description: Perform search and replace operations on response bodies
Status: Extension
ModuleIdentifier: substitute_module
SourceFile: mod_substitute.c

Summary

mod_substitute provides a mechanism to perform both regular expression and fixed string substitutions on response bodies.

Directives

• Substitute
• SubstituteInheritBefore
• SubstituteMaxLineLength

Substitute Directive

Description: Pattern to filter the response content
Syntax: Substitute s/pattern/substitution/[infq]
Context: directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_substitute

The Substitute directive specifies a search and replace pattern to apply to the response body.

The meaning of the pattern can be modified by using any combination of these flags:

i: Perform a case-insensitive match.

n: By default the pattern is treated as a regular expression. Using the n flag forces the pattern to be treated as a fixed string.

f: The f flag causes mod_substitute to flatten the result of a substitution, allowing for later substitutions to take place on the boundary of this one. This is the default.

q: The q flag causes mod_substitute to not flatten the buckets after each substitution. This can result in much faster response and a decrease in memory utilization, but should only be used if there is no possibility that the result of one substitution will ever match a pattern or regex of a subsequent one.
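The effect of the i and n flags on matching can be imitated outside the server. The following Python sketch is only an approximation under stated assumptions: Python's re module stands in for httpd's regular expression engine, the function name substitute is hypothetical, the replacement is always inserted literally (capture references are not modeled), and the f and q flags are ignored because they only affect how the server handles its buckets.

```python
import re


def substitute(text: str, pattern: str, replacement: str, flags: str = "") -> str:
    """Apply one s/pattern/replacement/flags rule to a string."""
    if "n" in flags:
        # n: treat the pattern as a fixed string, not a regular expression
        pattern = re.escape(pattern)
    # i: case-insensitive match
    re_flags = re.IGNORECASE if "i" in flags else 0
    # A callable replacement keeps the replacement text literal.
    return re.sub(pattern, lambda match: replacement, text, flags=re_flags)


print(substitute("a.c abc", "a.c", "X"))        # regex: "." matches any character
print(substitute("a.c abc", "a.c", "X", "n"))   # fixed string: only "a.c" matches
print(substitute("Foo foo", "foo", "bar", "ni"))
```

The first two calls differ only in the n flag: as a regular expression, a.c also matches abc, while as a fixed string it does not.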
Example

    AddOutputFilterByType SUBSTITUTE text/html
    Substitute "s/foo/bar/ni"

If either the pattern or the substitution contains a slash character then an alternative delimiter should be used:

Example of using an alternate delimiter

    AddOutputFilterByType SUBSTITUTE text/html
    Substitute "s|
|
|i"
Backreferences can be used in the comparison and in the substitution, when regular expressions are used, as illustrated in the following example:

Example of using backreferences and captures

    AddOutputFilterByType SUBSTITUTE text/html
    # "foo=k,bar=k" -> "foo/bar=k"
    Substitute "s|foo=(\w+),bar=\1|foo/bar=$1|"

A common use scenario for mod_substitute is the situation in which a front-end server proxies requests to a back-end server which returns HTML with hard-coded embedded URLs that refer to the back-end server. These URLs don't work for the end-user, since the back-end server is unreachable.

In this case, mod_substitute can be used to rewrite those URLs into something that will work from the front end:

Rewriting URLs embedded in proxied content

    ProxyPass "/blog/" "http://internal.blog.example.com/"
    ProxyPassReverse "/blog/" "http://internal.blog.example.com/"

    Substitute "s|http://internal.blog.example.com/|http://www.example.com/blog/|i"

ProxyPassReverse modifies any Location (redirect) headers that are sent by the back-end server, and, in this example, Substitute takes care of the rest of the problem by fixing up the HTML response as well.

SubstituteInheritBefore Directive

Description: Change the merge order of inherited patterns
Syntax: SubstituteInheritBefore on|off
Default: SubstituteInheritBefore on
Context: directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_substitute
Compatibility: Available in httpd 2.4.17 and later

Whether to apply the inherited Substitute patterns first (on), or after the ones of the current context (off). The latter was the default in versions 2.4 and earlier, but changed starting with 2.5, so setting SubstituteInheritBefore to off restores the legacy behaviour. SubstituteInheritBefore is itself inherited, hence contexts that inherit it (those that don't specify their own SubstituteInheritBefore value) will apply the closest defined merge order.
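Why the merge order matters is easiest to see when one pattern's output can be matched by another. The Python sketch below is a simplified model, not the module's code: the rule lists, the apply_rules function and its inherit_before parameter are hypothetical names, Python's re module stands in for the server's regex engine, and each rule is assumed to see the output of the previous one (the default, flattening behaviour).

```python
import re
from typing import List, Tuple

Rule = Tuple[str, str]  # (regular expression, replacement)


def apply_rules(text: str, inherited: List[Rule], current: List[Rule],
                inherit_before: bool = True) -> str:
    """Apply inherited and current-context rules in the selected merge order."""
    ordered = inherited + current if inherit_before else current + inherited
    for pattern, replacement in ordered:
        text = re.sub(pattern, replacement, text)
    return text


parent_rules = [("foo", "bar")]   # defined in the enclosing context
child_rules = [("bar", "baz")]    # defined in the current context

# Inherited rules first: foo -> bar, then bar -> baz.
print(apply_rules("foo", parent_rules, child_rules, inherit_before=True))
# Current rules first: bar -> baz finds nothing, then foo -> bar.
print(apply_rules("foo", parent_rules, child_rules, inherit_before=False))
```

The same input yields baz in the first case and bar in the second, which is the kind of difference to check for before changing SubstituteInheritBefore on an existing configuration.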
SubstituteMaxLineLength Directive

Description: Set the maximum line size
Syntax: SubstituteMaxLineLength bytes(b|B|k|K|m|M|g|G)
Default: SubstituteMaxLineLength 1m
Context: directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_substitute
Compatibility: Available in httpd 2.4.11 and later

The maximum line size handled by mod_substitute is limited to restrict memory use. The limit can be configured using SubstituteMaxLineLength. The value can be given as the number of bytes and can be suffixed with a single letter b, B, k, K, m, M, g, G to provide the size in bytes, kilobytes, megabytes or gigabytes respectively.

Example

    AddOutputFilterByType SUBSTITUTE text/html
    SubstituteMaxLineLength 10m
    Substitute "s/foo/bar/ni"

10.119 Apache Module mod_suexec

Description: Allows CGI scripts to run as a specified user and group
Status: Extension
ModuleIdentifier: suexec_module
SourceFile: mod_suexec.c

Summary

This module, in combination with the suexec support program, allows CGI scripts to run as a specified user and group.

Directives

• SuexecUserGroup

See also

• SuEXEC support

SuexecUserGroup Directive

Description: User and group for CGI programs to run as
Syntax: SuexecUserGroup User Group
Context: server config, virtual host
Status: Extension
Module: mod_suexec

The SuexecUserGroup directive allows you to specify a user and group for CGI programs to run as. Non-CGI requests are still processed with the user specified in the User directive.

Example

    SuexecUserGroup nobody nogroup

Startup will fail if this directive is specified but the suEXEC feature is disabled.

See also

• Suexec

10.120 Apache Module mod_syslog

Description: Provides "syslog" ErrorLog provider
Status: Extension
ModuleIdentifier: syslog_module
SourceFile: mod_syslog.c

Summary

This module provides the "syslog" ErrorLog provider. It allows logging error messages via syslogd(8).
Directives

This module provides no directives.

Examples

Using syslog in the ErrorLog directive (see core) instead of a filename enables logging via syslogd(8) if the system supports it. The default is to use syslog facility local7, but you can override this by using the syslog:facility syntax where facility can be one of the names usually documented in syslog(1). The facility is effectively global, and if it is changed in individual virtual hosts, the final facility specified affects the entire server.

    ErrorLog syslog:user

10.121 Apache Module mod_systemd

Description: Provides better support for systemd integration
Status: Extension
ModuleIdentifier: systemd_module
SourceFile: mod_systemd.c

Summary

This module provides support for systemd integration. It allows starting httpd as a service with systemd Type=notify (see the systemd.service(5) manual page for more information). It also provides statistics in systemctl status output and adds various directives useful for systemd integration.

Directives

• IdleShutdown

IdleShutdown Directive

Description: Enable shutting down the httpd when it is idle for some time.
Syntax: IdleShutdown seconds
Default: IdleShutdown 0
Context: server config
Status: Extension
Module: mod_systemd

The IdleShutdown directive enables shutting down the httpd when it is idle for some time. The idleness is based on bytes served, so if no bytes are sent for the time defined by this directive, httpd will shut down. By default, IdleShutdown is set to 0, meaning this feature is disabled.

This feature is useful in combination with systemd socket activation (see the systemd.socket(5) manual page). When httpd is started by systemd on some request, you can use this directive to stop httpd automatically when all the requests are served.

Implementation warning

Because of implementation details, idleness is checked only every 10 seconds.
That means that if you specify IdleShutdown 14, httpd will stop itself after 20 seconds of idleness.

10.122 Apache Module mod_unique_id

Description: Provides an environment variable with a unique identifier for each request
Status: Extension
ModuleIdentifier: unique_id_module
SourceFile: mod_unique_id.c

Summary

This module provides a magic token for each request which is guaranteed to be unique across "all" requests under very specific conditions. The unique identifier is even unique across multiple machines in a properly configured cluster of machines. The environment variable UNIQUE_ID is set to the identifier for each request. Unique identifiers are useful for various reasons which are beyond the scope of this document.

Directives

This module provides no directives.

Theory

First a brief recap of how the Apache server works on Unix machines. This feature currently isn't supported on Windows NT. On Unix machines, Apache creates several children; the children process requests one at a time. Each child can serve multiple requests in its lifetime. For the purpose of this discussion, the children don't share any data with each other. We'll refer to the children as httpd processes.

Your website has one or more machines under your administrative control; together we'll call them a cluster of machines. Each machine can possibly run multiple instances of Apache. All of these collectively are considered "the universe", and with certain assumptions we'll show that in this universe we can generate unique identifiers for each request, without extensive communication between machines in the cluster.

The machines in your cluster should satisfy these requirements. (Even if you have only one machine you should synchronize its clock with NTP.)

• The machines' times are synchronized via NTP or other network time protocol.
• The machines' hostnames all differ, such that the module can do a hostname lookup on the hostname and receive a different IP address for each machine in the cluster.

As far as operating system assumptions go, we assume that pids (process ids) fit in 32 bits. If the operating system uses more than 32 bits for a pid, the fix is trivial but must be performed in the code.

Given those assumptions, at a single point in time we can distinguish any httpd process on any machine in the cluster from all other httpd processes. The machine's IP address and the pid of the httpd process are sufficient to do this. An httpd process can handle multiple requests simultaneously if you use a multi-threaded MPM. In order to identify threads, we use a thread index Apache httpd uses internally. So in order to generate unique identifiers for requests we need only distinguish between different points in time.

To distinguish time we will use a Unix timestamp (seconds since January 1, 1970 UTC), and a 16-bit counter. The timestamp has only one second granularity, so the counter is used to represent up to 65536 values during a single second. The quadruple (ip_addr, pid, time_stamp, counter) is sufficient to enumerate 65536 requests per second per httpd process. There are issues however with pid reuse over time, and the counter is used to alleviate this issue.

When an httpd child is created, the counter is initialized with (current microseconds divided by 10) modulo 65536 (this formula was chosen to eliminate some variance problems with the low order bits of the microsecond timers on some systems). When a unique identifier is generated, the time stamp used is the time the request arrived at the web server. The counter is incremented every time an identifier is generated (and allowed to roll over).
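The counter rule is small enough to write down directly. The Python sketch below mirrors only the formula and the roll-over described above; it is not the module's C code. The names initial_counter and RequestCounter are hypothetical, and "current microseconds" is assumed to mean the sub-second microsecond part of the clock (0 to 999999).

```python
COUNTER_VALUES = 65536  # the counter is 16 bits wide


def initial_counter(microseconds: int) -> int:
    """(current microseconds divided by 10) modulo 65536."""
    return (microseconds // 10) % COUNTER_VALUES


class RequestCounter:
    """Per-child counter: seeded once at child creation, incremented per identifier."""

    def __init__(self, microseconds: int) -> None:
        self.value = initial_counter(microseconds)

    def next(self) -> int:
        """Return the current value, then advance and roll over at 65536."""
        current = self.value
        self.value = (self.value + 1) % COUNTER_VALUES
        return current


counter = RequestCounter(microseconds=655_350)  # seeds the counter at 65535
print(counter.next(), counter.next())           # the second value has rolled over
```

Dividing by 10 discards the lowest decimal digit of the microsecond value, which is the part the text describes as unreliable on some systems.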
The kernel generates a pid for each process as it forks the process, and pids are allowed to roll over (they’re 16 bits on many Unixes, but newer systems have expanded to 32 bits). So over time the same pid will be reused. However unless it is reused within the same second, it does not destroy the uniqueness of our quadruple. That is, we assume the system does not spawn 65536 processes in a one second interval (it may even be 32768 processes on some Unixes, but even this isn’t likely to happen).

Suppose that time repeats itself for some reason. That is, suppose that the system’s clock is screwed up and it revisits a past time (or it is too far forward, is reset correctly, and then revisits the future time). In this case we can easily show that we can get pid and time stamp reuse. The choice of initializer for the counter is intended to help defeat this. Note that we really want a random number to initialize the counter, but there aren’t any readily available numbers on most systems (i.e., you can’t use rand() because you need to seed the generator, and can’t seed it with the time because time, at least at one second resolution, has repeated itself). This is not a perfect defense.

How good a defense is it? Suppose that one of your machines serves at most 500 requests per second (which is a very reasonable upper bound at this writing, because systems generally do more than just shovel out static files). To do that it will require a number of children which depends on how many concurrent clients you have. But we’ll be pessimistic and suppose that a single child is able to serve 500 requests per second. There are 1000 possible starting counter values such that two sequences of 500 requests overlap. So there is a 1.5% chance that if time (at one second resolution) repeats itself this child will repeat a counter value, and uniqueness will be broken.
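The 1.5% figure can be checked with a short calculation. The sketch below assumes the second starting value is uniformly distributed over the 65536 counter values; the function names are hypothetical. By this count there are 2 × 500 − 1 = 999 overlapping starting values, which the text rounds to 1000.

```python
COUNTER_VALUES = 65536  # 16-bit counter


def overlapping_starts(run_length, counter_values=COUNTER_VALUES):
    """Number of starting values whose run of `run_length` consecutive
    counter values shares at least one value with a given run of the
    same length (runs wrap around modulo `counter_values`)."""
    return min(2 * run_length - 1, counter_values)


def overlapping_starts_bruteforce(run_length, counter_values):
    """Same count by direct enumeration; only practical for small inputs."""
    first = {k % counter_values for k in range(run_length)}
    count = 0
    for start in range(counter_values):
        run = {(start + k) % counter_values for k in range(run_length)}
        if run & first:
            count += 1
    return count


def repeat_probability(requests_per_second, counter_values=COUNTER_VALUES):
    """Chance that a uniformly chosen start repeats a counter value."""
    return overlapping_starts(requests_per_second, counter_values) / counter_values


print(f"{repeat_probability(500):.2%}")
```

For 500 requests per second this gives 999/65536, which is about 1.5%.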
This was a very pessimistic example, and with real world values it’s even less likely to occur. If your system is such that it’s still likely to occur, then perhaps you should make the counter 32 bits (by editing the code).

You may be concerned about the clock being "set back" during summer daylight savings. However this isn’t an issue because the times used here are UTC, which "always" go forward. Note that x86 based Unixes may need proper configuration for this to be true – they should be configured to assume that the motherboard clock is on UTC and compensate appropriately. But even still, if you’re running NTP then your UTC time will be correct very shortly after reboot.

The UNIQUE_ID environment variable is constructed by encoding the 144-bit tuple (32-bit IP address, 32-bit pid, 32-bit time stamp, 16-bit counter, 32-bit thread index) using the alphabet [A-Za-z0-9@-] in a manner similar to MIME base64 encoding, producing 24 characters. The MIME base64 alphabet is actually [A-Za-z0-9+/] however + and / need to be specially encoded in URLs, which makes them less desirable.

All values are encoded in network byte ordering so that the encoding is comparable across architectures of different byte ordering. The actual ordering of the encoding is: time stamp, IP address, pid, counter. This ordering has a purpose, but it should be emphasized that applications should not dissect the encoding. Applications should treat the entire encoded UNIQUE_ID as an opaque token, which can be compared against other UNIQUE_IDs for equality only.

The ordering was chosen such that it’s possible to change the encoding in the future without worrying about collision with an existing database of UNIQUE_IDs. The new encodings should also keep the time stamp as the first element, and can otherwise use the same alphabet and bit length.
Since the time stamps are essentially an increasing sequence, it’s sufficient to have a flag second in which all machines in the cluster stop serving any request, and stop using the old encoding format. Afterwards they can resume requests and begin issuing the new encodings.

This we believe is a relatively portable solution to this problem. The identifiers generated have essentially an infinite life-time because future identifiers can be made longer as required. Essentially no communication is required between machines in the cluster (only NTP synchronization is required, which is low overhead), and no communication between httpd processes is required (the communication is implicit in the pid value assigned by the kernel). In very specific situations the identifier can be shortened, but more information needs to be assumed (for example the 32-bit IP address is overkill for any site, but there is no portable shorter replacement for it).

10.123 Apache Module mod_unixd

Description: Basic (required) security for Unix-family platforms.
Status: Base
ModuleIdentifier: unixd_module
SourceFile: mod_unixd.c

Directives

• ChrootDir
• Group
• Suexec
• User

See also

• suEXEC support (p. 115)

ChrootDir Directive

Description: Directory for apache to run chroot(8) after startup.
Syntax: ChrootDir /path/to/directory
Default: none
Context: server config
Status: Base
Module: mod_unixd

This directive tells the server to chroot(8) to the specified directory after startup, but before accepting requests over the ’net.

Note that running the server under chroot is not simple, and requires additional setup, particularly if you are running scripts such as CGI or PHP. Please make sure you are properly familiar with the operation of chroot before attempting to use this feature.
Group Directive

Description: Group under which the server will answer requests
Syntax: Group unix-group
Default: Group #-1
Context: server config
Status: Base
Module: mod_unixd

The Group directive sets the group under which the server will answer requests. In order to use this directive, the server must be run initially as root. If you start the server as a non-root user, it will fail to change to the specified group, and will instead continue to run as the group of the original user. Unix-group is one of:

A group name
    Refers to the given group by name.
# followed by a group number.
    Refers to a group by its number.

Example

Group www-group

It is recommended that you set up a new group specifically for running the server. Some admins use user nobody, but this is not always possible or desirable.

Security
Don’t set Group (or User) to root unless you know exactly what you are doing, and what the dangers are.

See also

• VHostGroup
• SuexecUserGroup

Suexec Directive

Description: Enable or disable the suEXEC feature
Syntax: Suexec On|Off
Default: On if suexec binary exists with proper owner and mode, Off otherwise
Context: server config
Status: Base
Module: mod_unixd

When On, startup will fail if the suexec binary doesn’t exist or has an invalid owner or file mode.

When Off, suEXEC will be disabled even if the suexec binary exists and has a valid owner and file mode.

User Directive

Description: The userid under which the server will answer requests
Syntax: User unix-userid
Default: User #-1
Context: server config
Status: Base
Module: mod_unixd

The User directive sets the user ID under which the server will answer requests. In order to use this directive, the server must be run initially as root. If you start the server as a non-root user, it will fail to change to the lesser privileged user, and will instead continue to run as that original user.
If you do start the server as root, then it is normal for the parent process to remain running as root. Unix-userid is one of:

A username
    Refers to the given user by name.
# followed by a user number.
    Refers to a user by its number.

The user should have no privileges that result in it being able to access files that are not intended to be visible to the outside world, and similarly, the user should not be able to execute code that is not meant for HTTP requests. It is recommended that you set up a new user and group specifically for running the server. Some admins use user nobody, but this is not always desirable, since the nobody user can have other uses on the system.

Security
Don’t set User (or Group) to root unless you know exactly what you are doing, and what the dangers are.

See also

• VHostUser
• SuexecUserGroup

10.124 Apache Module mod_userdir

Description: User-specific directories
Status: Base
ModuleIdentifier: userdir_module
SourceFile: mod_userdir.c

Summary

This module allows user-specific directories to be accessed using the http://example.com/~user/ syntax.

Directives

• UserDir

See also

• Mapping URLs to the Filesystem (p. 64)
• public_html tutorial (p. 258)

UserDir Directive

Description: Location of the user-specific directories
Syntax: UserDir directory-filename [directory-filename] ...
Context: server config, virtual host
Status: Base
Module: mod_userdir

The UserDir directive sets the real directory in a user’s home directory to use when a request for a document for a user is received. Directory-filename is one of the following:

• The name of a directory or a pattern such as those shown below.
• The keyword disabled. This turns off all username-to-directory translations except those explicitly named with the enabled keyword (see below).
• The keyword disabled followed by a space-delimited list of usernames.
Usernames that appear in such a list will never have directory translation performed, even if they appear in an enabled clause.
• The keyword enabled followed by a space-delimited list of usernames. These usernames will have directory translation performed even if a global disable is in effect, but not if they also appear in a disabled clause.

If neither the enabled nor the disabled keywords appear in the UserDir directive, the argument is treated as a filename pattern, and is used to turn the name into a directory specification. A request for http://www.example.com/~bob/one/two.html will be translated to:

UserDir directive used        Translated path
UserDir public_html           ~bob/public_html/one/two.html
UserDir /usr/web              /usr/web/bob/one/two.html
UserDir /home/*/www           /home/bob/www/one/two.html

The following directives will send redirects to the client:

UserDir directive used                  Translated path
UserDir http://www.example.com/users    http://www.example.com/users/bob/one/two.html
UserDir http://www.example.com/*/usr    http://www.example.com/bob/usr/one/two.html
UserDir http://www.example.com/~*/      http://www.example.com/~bob/one/two.html

Note: Be careful when using this directive; for instance, "UserDir ./" would map "/~root" to "/" - which is probably undesirable. It is strongly recommended that your configuration include a "UserDir disabled root" declaration. See also the Directory directive and the Security Tips (p. 364) page for more information.

Additional examples:

To allow a few users to have UserDir directories, but not anyone else, use the following:

UserDir disabled
UserDir enabled user1 user2 user3

To allow most users to have UserDir directories, but deny this to a few, use the following:

UserDir disabled user4 user5 user6

It is also possible to specify alternative user directories.
If you use a command like:

UserDir "public_html" "/usr/web" "http://www.example.com/"

then for a request for http://www.example.com/~bob/one/two.html, the server will try to find the page at ~bob/public_html/one/two.html first, then /usr/web/bob/one/two.html, and finally it will send a redirect to http://www.example.com/bob/one/two.html.

If you add a redirect, it must be the last alternative in the list. Apache httpd cannot determine if the redirect succeeded or not, so if you have the redirect earlier in the list, that will always be the alternative that is used.

User directory substitution is not active by default in versions 2.1.4 and later. In earlier versions, UserDir public_html was assumed if no UserDir directive was present.

Merging details
Lists of specific enabled and disabled users are replaced, not merged, from global to virtual host scope.

See also

• Per-user web directories tutorial (p. 258)

10.125 Apache Module mod_usertrack

Description: Clickstream logging of user activity on a site
Status: Extension
ModuleIdentifier: usertrack_module
SourceFile: mod_usertrack.c

Summary

Provides tracking of a user through your website via browser cookies.

Directives

• CookieDomain
• CookieExpires
• CookieName
• CookieStyle
• CookieTracking

Logging

mod_usertrack sets a cookie which can be logged via mod_log_config configurable logging formats:

LogFormat "%{Apache}n %r %t" usertrack
CustomLog "logs/clickstream.log" usertrack

CookieDomain Directive

Description: The domain to which the tracking cookie applies
Syntax: CookieDomain domain
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_usertrack

This directive controls the setting of the domain to which the tracking cookie applies. If not present, no domain is included in the cookie header field.

The domain string must begin with a dot, and must include at least one embedded dot.
That is, .example.com is legal, but www.example.com and .com are not.

Note: Most browsers in use today will not allow cookies to be set for a two-part top level domain, such as .co.uk, although such a domain ostensibly fulfills the requirements above. These domains are equivalent to top level domains such as .com, and allowing such cookies may be a security risk. Thus, if you are under a two-part top level domain, you should still use your actual domain, as you would with any other top level domain (for example .example.co.uk).

CookieDomain .example.com

CookieExpires Directive

Description: Expiry time for the tracking cookie
Syntax: CookieExpires expiry-period
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_usertrack

When used, this directive sets an expiry time on the cookie generated by the usertrack module. The expiry-period can be given either as a number of seconds, or in the format such as "2 weeks 3 days 7 hours". Valid denominations are: years, months, weeks, days, hours, minutes and seconds. If the expiry time is in any format other than one number indicating the number of seconds, it must be enclosed by double quotes.

If this directive is not used, cookies last only for the current browser session.

CookieExpires "3 weeks"

CookieName Directive

Description: Name of the tracking cookie
Syntax: CookieName token
Default: CookieName Apache
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_usertrack

This directive allows you to change the name of the cookie this module uses for its tracking purposes. By default the cookie is named "Apache".

You must specify a valid cookie name; results are unpredictable if you use a name containing unusual characters. Valid characters include A-Z, a-z, 0-9, "_", and "-".
CookieName clicktrack

CookieStyle Directive

Description: Format of the cookie header field
Syntax: CookieStyle Netscape|Cookie|Cookie2|RFC2109|RFC2965
Default: CookieStyle Netscape
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_usertrack

This directive controls the format of the cookie header field. The three formats allowed are:

• Netscape, which is the original but now deprecated syntax. This is the default, and the syntax Apache has historically used.
• Cookie or RFC2109, which is the syntax that superseded the Netscape syntax.
• Cookie2 or RFC2965, which is the most current cookie syntax.

Not all clients can understand all of these formats, but you should use the newest one that is generally acceptable to your users’ browsers. At the time of writing, most browsers support all three of these formats, with Cookie2 being the preferred format.

CookieStyle Cookie2

CookieTracking Directive

Description: Enables tracking cookie
Syntax: CookieTracking on|off
Default: CookieTracking off
Context: server config, virtual host, directory, .htaccess
Override: FileInfo
Status: Extension
Module: mod_usertrack

When mod_usertrack is loaded, and CookieTracking on is set, Apache will send a user-tracking cookie for all new requests. This directive can be used to turn this behavior on or off on a per-server or per-directory basis. By default, enabling mod_usertrack will not activate cookies.

CookieTracking on

10.126 Apache Module mod_version

Description: Version dependent configuration
Status: Extension
ModuleIdentifier: version_module
SourceFile: mod_version.c

Summary

This module is designed for use in test suites and large networks which have to deal with different httpd versions and different configurations. It provides a new container – <IfVersion>, which allows flexible version checking including numeric comparisons and regular expressions.
Examples

<IfVersion 2.4.2>
    # current httpd version is exactly 2.4.2
</IfVersion>

<IfVersion >= 2.5>
    # use really new features :-)
</IfVersion>

See below for further possibilities.

Directives

• <IfVersion>

<IfVersion> Directive

Description: contains version dependent configuration
Syntax: <IfVersion [[!]operator] version> ... </IfVersion>
Context: server config, virtual host, directory, .htaccess
Override: All
Status: Extension
Module: mod_version

The <IfVersion> section encloses configuration directives which are executed only if the httpd version matches the desired criteria. For normal (numeric) comparisons the version argument has the format major[.minor[.patch]], e.g. 2.1.0 or 2.2. minor and patch are optional. If these numbers are omitted, they are assumed to be zero. The following numerical operators are possible:

operator    description
= or ==     httpd version is equal
>           httpd version is greater than
>=          httpd version is greater or equal
<           httpd version is less than
<=          httpd version is less or equal

Example

<IfVersion >= 2.3>
    # this happens only in versions greater or
    # equal 2.3.0.
</IfVersion>

Besides the numerical comparison it is possible to match a regular expression against the httpd version. There are two ways to write it:

operator    description
= or ==     version has the form /regex/
~           version has the form regex

Example

<IfVersion = /regex/>
    # e.g. workaround for buggy versions
</IfVersion>

In order to reverse the meaning, all operators can be preceded by an exclamation mark (!):

<IfVersion !~ regex>
    # not for those versions
</IfVersion>

If the operator is omitted, it is assumed to be =.

10.127 Apache Module mod_vhost_alias

Description: Provides for dynamically configured mass virtual hosting
Status: Extension
ModuleIdentifier: vhost_alias_module
SourceFile: mod_vhost_alias.c

Summary

This module creates dynamically configured virtual hosts, by allowing the IP address and/or the Host: header of the HTTP request to be used as part of the pathname to determine what files to serve. This allows for easy use of a huge number of virtual hosts with similar configurations.
Note
If mod_alias or mod_userdir are used for translating URIs to filenames, they will override the directives of mod_vhost_alias described below. For example, the following configuration will map /cgi-bin/script.pl to /usr/local/apache2/cgi-bin/script.pl in all cases:

ScriptAlias "/cgi-bin/" "/usr/local/apache2/cgi-bin/"
VirtualScriptAlias "/never/found/%0/cgi-bin/"

Directives

• VirtualDocumentRoot
• VirtualDocumentRootIP
• VirtualScriptAlias
• VirtualScriptAliasIP

See also

• UseCanonicalName
• Dynamically configured mass virtual hosting (p. 130)

Directory Name Interpolation

All the directives in this module interpolate a string into a pathname. The interpolated string (henceforth called the "name") may be either the server name (see the UseCanonicalName directive for details on how this is determined) or the IP address of the virtual host on the server in dotted-quad format. The interpolation is controlled by specifiers inspired by printf which have a number of formats:

%%      insert a %
%p      insert the port number of the virtual host
%N.M    insert (part of) the name

N and M are used to specify substrings of the name. N selects from the dot-separated components of the name, and M selects characters within whatever N has selected. M is optional and defaults to zero if it isn’t present; the dot must be present if and only if M is present. The interpretation is as follows:

0             the whole name
1             the first part
2             the second part
-1            the last part
-2            the penultimate part
2+            the second and all subsequent parts
-2+           the penultimate and all preceding parts
1+ and -1+    the same as 0

If N or M is greater than the number of parts available a single underscore is interpolated.
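The interpolation rules above can be sketched in Python to experiment with a format string before putting it in a configuration. This is an illustration built only from the rules in this section, not the module's implementation: the function names are hypothetical, N and M are assumed to be single digits, the format string is not validated, and no case folding is applied to the name.

```python
import re

# %% | %p | %N[.M], where N and M are: optional "-", one digit, optional "+"
_SPEC = re.compile(r"%(?:(%)|(p)|(-?\d)(\+?)(?:\.(-?\d)(\+?))?)")


def _select(items, index, plus):
    """Pick from a sequence (a list of parts, or a string of characters).

    Index 0 selects everything; a positive index counts from the front and
    a negative one from the back. `plus` extends the selection to all
    subsequent (positive index) or all preceding (negative index) items.
    Returns None when the index is out of range.
    """
    if index == 0:
        return items
    if abs(index) > len(items):
        return None
    if index > 0:
        return items[index - 1:] if plus else items[index - 1:index]
    end = len(items) + index + 1
    return items[:end] if plus else items[end - 1:end]


def interpolate(template, name, port=80):
    """Expand the specifiers in `template` for one server name or IP address."""

    def expand(match):
        if match.group(1):
            return "%"
        if match.group(2):
            return str(port)
        parts = _select(name.split("."), int(match.group(3)), match.group(4) == "+")
        if parts is None:
            return "_"  # N out of range
        text = ".".join(parts)
        if match.group(5) is not None:
            text = _select(text, int(match.group(5)), match.group(6) == "+")
            if text is None:
                return "_"  # M out of range
        return text

    return _SPEC.sub(expand, template)
```

For instance, interpolate("%3+/%2.1/%2.2/%2.3/%2", "www.domain.example.com") yields example.com/d/o/m/domain, and interpolate("%-2.0.%-1.0", "www.sub.example.com") yields example.com.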
Examples

For simple name-based virtual hosts you might use the following directives in your server configuration file:

UseCanonicalName Off
VirtualDocumentRoot "/usr/local/apache/vhosts/%0"

A request for http://www.example.com/directory/file.html will be satisfied by the file /usr/local/apache/vhosts/www.example.com/directory/file.html.

For a very large number of virtual hosts it is a good idea to arrange the files to reduce the size of the vhosts directory. To do this you might use the following in your configuration file:

UseCanonicalName Off
VirtualDocumentRoot "/usr/local/apache/vhosts/%3+/%2.1/%2.2/%2.3/%2"

A request for http://www.domain.example.com/directory/file.html will be satisfied by the file /usr/local/apache/vhosts/example.com/d/o/m/domain/directory/file.html.

A more even spread of files can be achieved by hashing from the end of the name, for example:

VirtualDocumentRoot "/usr/local/apache/vhosts/%3+/%2.-1/%2.-2/%2.-3/%2"

The example request would come from /usr/local/apache/vhosts/example.com/n/i/a/domain/directory/file.

Alternatively you might use:

VirtualDocumentRoot "/usr/local/apache/vhosts/%3+/%2.1/%2.2/%2.3/%2.4+"

The example request would come from /usr/local/apache/vhosts/example.com/d/o/m/ain/directory/file.htm

A very common request by users is the ability to point multiple domains to multiple document roots without having to worry about the length or number of parts of the hostname being requested. If the requested hostname is sub.www.domain.example.com instead of simply www.domain.example.com, then using %3+ will result in the document root being /usr/local/apache/vhosts/domain.example.com/... instead of the intended example.com directory. In such cases, it can be beneficial to use the combination %-2.0.%-1.0, which will always yield the domain name and the tld, for example example.com regardless of the number of subdomains appended to the hostname.
As such, one can make a configuration that will direct all first, second or third level subdomains to the same directory:

VirtualDocumentRoot "/usr/local/apache/vhosts/%-2.0.%-1.0"

In the example above, www.example.com, www.sub.example.com and example.com will all point to /usr/local/apache/vhosts/example.com.

For IP-based virtual hosting you might use the following in your configuration file:

UseCanonicalName DNS
VirtualDocumentRootIP "/usr/local/apache/vhosts/%1/%2/%3/%4/docs"
VirtualScriptAliasIP "/usr/local/apache/vhosts/%1/%2/%3/%4/cgi-bin"

A request for http://www.domain.example.com/directory/file.html would be satisfied by the file /usr/local/apache/vhosts/10/20/30/40/docs/directory/file.html if the IP address of www.domain.example.com were 10.20.30.40. A request for http://www.domain.example.com/cgi-bin/script.pl would be satisfied by executing the program /usr/local/apache/vhosts/10/20/30/40/cgi-bin/script.pl.

If you want to include the . character in a VirtualDocumentRoot directive, but it clashes with a % directive, you can work around the problem in the following way:

VirtualDocumentRoot "/usr/local/apache/vhosts/%2.0.%3.0"

A request for http://www.domain.example.com/directory/file.html will be satisfied by the file /usr/local/apache/vhosts/domain.example/directory/file.html.

The LogFormat directives %V and %A are useful in conjunction with this module.

VirtualDocumentRoot Directive

Description: Dynamically configure the location of the document root for a given virtual host
Syntax: VirtualDocumentRoot interpolated-directory|none
Default: VirtualDocumentRoot none
Context: server config, virtual host
Status: Extension
Module: mod_vhost_alias

The VirtualDocumentRoot directive allows you to determine where Apache HTTP Server will find your documents based on the value of the server name.
The result of expanding interpolated-directory is used as the root of the document tree in a similar manner to the DocumentRoot directive’s argument. If interpolated-directory is none then VirtualDocumentRoot is turned off. This directive cannot be used in the same context as VirtualDocumentRootIP.

Note
VirtualDocumentRoot will override any DocumentRoot directives you may have put in the same context or child contexts. Putting a VirtualDocumentRoot in the global server scope will effectively override DocumentRoot directives in any virtual hosts defined later on, unless you set VirtualDocumentRoot to None in each virtual host.

VirtualDocumentRootIP Directive

Description: Dynamically configure the location of the document root for a given virtual host
Syntax: VirtualDocumentRootIP interpolated-directory|none
Default: VirtualDocumentRootIP none
Context: server config, virtual host
Status: Extension
Module: mod_vhost_alias

The VirtualDocumentRootIP directive is like the VirtualDocumentRoot directive, except that it uses the IP address of the server end of the connection for directory interpolation instead of the server name.

VirtualScriptAlias Directive

Description: Dynamically configure the location of the CGI directory for a given virtual host
Syntax: VirtualScriptAlias interpolated-directory|none
Default: VirtualScriptAlias none
Context: server config, virtual host
Status: Extension
Module: mod_vhost_alias

The VirtualScriptAlias directive allows you to determine where Apache httpd will find CGI scripts, in a similar manner to what VirtualDocumentRoot does for other documents. It matches requests for URIs starting /cgi-bin/, much like ScriptAlias /cgi-bin/ would.
VirtualScriptAliasIP Directive

Description: Dynamically configure the location of the CGI directory for a given virtual host
Syntax: VirtualScriptAliasIP interpolated-directory|none
Default: VirtualScriptAliasIP none
Context: server config, virtual host
Status: Extension
Module: mod_vhost_alias

The VirtualScriptAliasIP directive is like the VirtualScriptAlias directive, except that it uses the IP address of the server end of the connection for directory interpolation instead of the server name.

10.128 Apache Module mod_watchdog

Description: provides infrastructure for other modules to periodically run tasks
Status: Base
ModuleIdentifier: watchdog_module
SourceFile: mod_watchdog.c
Compatibility: Available in Apache 2.3 and later

Summary

mod_watchdog defines programmatic hooks for other modules to periodically run tasks. These modules can register handlers for mod_watchdog hooks. Currently, the following modules in the Apache distribution use this functionality:

• mod_heartbeat
• mod_heartmonitor

Warning: To allow a module to use mod_watchdog functionality, mod_watchdog itself must be statically linked to the server core or, if a dynamic module, be loaded before the calling module.

Directives

• WatchdogInterval

WatchdogInterval Directive

Description: Watchdog interval in seconds
Syntax: WatchdogInterval number-of-seconds
Default: WatchdogInterval 1
Context: server config
Status: Base
Module: mod_watchdog

Sets the interval at which the watchdog_step hook runs. Default is to run every second.

10.129 Apache Module mod_xml2enc

Description: Enhanced charset/internationalisation support for libxml2-based filter modules
Status: Base
ModuleIdentifier: xml2enc_module
SourceFile: mod_xml2enc.c
Compatibility: Version 2.4 and later. Available as a third-party module for 2.2.x versions

Summary

This module provides enhanced internationalisation support for markup-aware filter modules such as mod_proxy_html.
It can automatically detect the encoding of input data and ensure they are correctly processed by the libxml2 (http://xmlsoft.org/) parser, including converting to Unicode (UTF-8) where necessary. It can also convert data to an encoding of choice after markup processing, and will ensure the correct charset value is set in the HTTP Content-Type header.

Directives

• xml2EncAlias
• xml2EncDefault
• xml2StartParse

Usage

There are two usage scenarios: with modules programmed to work with mod_xml2enc, and with those that are not aware of it:

Filter modules enabled for mod_xml2enc
    Modules such as mod_proxy_html version 3.1 and up use the xml2enc_charset optional function to retrieve the charset argument to pass to the libxml2 parser, and may use the xml2enc_filter optional function to postprocess to another encoding. Using mod_xml2enc with an enabled module, no configuration is necessary: the other module will configure mod_xml2enc for you (though you may still want to customise it using the configuration directives below).

Non-enabled modules
    To use it with a libxml2-based module that isn’t explicitly enabled for mod_xml2enc, you will have to configure the filter chain yourself. So to use it with a filter foo provided by a module mod_foo to improve the latter’s i18n support with HTML and XML, you could use

    FilterProvider iconv xml2enc Content-Type $text/html
    FilterProvider iconv xml2enc Content-Type $xml
    FilterProvider markup foo Content-Type $text/html
    FilterProvider markup foo Content-Type $xml
    FilterChain iconv markup

    mod_foo will now support any character set supported by either (or both) of libxml2 or apr_xlate/iconv.

Programming API

Programmers writing libxml2-based filter modules are encouraged to enable them for mod_xml2enc, to provide strong i18n support for your users without reinventing the wheel. The programming API is exposed in mod_xml2enc.h, and a usage example is mod_proxy_html.
Detecting an Encoding

Unlike mod_charset_lite, mod_xml2enc is designed to work with data whose encoding cannot be known in advance and thus configured. It therefore uses ’sniffing’ techniques to detect the encoding of HTTP data as follows:

1. If the HTTP Content-Type header includes a charset parameter, that is used.
2. If the data start with an XML Byte Order Mark (BOM) or an XML encoding declaration, that is used.
3. If an encoding is declared in an HTML <META> element, that is used.
4. If none of the above match, the default value set by xml2EncDefault is used.

The rules are applied in order. As soon as a match is found, it is used and detection is stopped.

Output Encoding

libxml2 (http://xmlsoft.org/) always uses UTF-8 (Unicode) internally, and libxml2-based filter modules will output that by default. mod_xml2enc can change the output encoding through the API, but there is currently no way to configure that directly.

Changing the output encoding should (in theory, at least) never be necessary, and is not recommended due to the extra processing load on the server of an unnecessary conversion.

Unsupported Encodings

If you are working with encodings that are not supported by any of the conversion methods available on your platform, you can still alias them to a supported encoding using xml2EncAlias.

xml2EncAlias Directive

Description: Recognise Aliases for encoding values
Syntax: xml2EncAlias charset alias [alias ...]
Context: server config
Status: Base
Module: mod_xml2enc

This server-wide directive aliases one or more encodings to another encoding. This enables encodings not recognised by libxml2 to be handled internally by libxml2’s encoding support using the translation table for a recognised encoding. This serves two purposes: to support character sets (or names) not recognised either by libxml2 or iconv, and to skip conversion for an encoding where it is known to be unnecessary.

xml2EncDefault Directive
Description: Sets a default encoding to assume when absolutely no information can be automatically detected
Syntax: xml2EncDefault name
Context: server config, virtual host, directory, .htaccess
Status: Base
Module: mod_xml2enc
Compatibility: Version 2.4.0 and later; available as a third-party module for earlier versions.

If you are processing data with known encoding but no encoding information, you can set this default to help mod_xml2enc process the data correctly. For example, to work with the default value of Latin1 (iso-8859-1) specified in HTTP/1.0, use

    xml2EncDefault iso-8859-1

xml2StartParse Directive
Description: Advise the parser to skip leading junk.
Syntax: xml2StartParse element [element ...]
Context: server config, virtual host, directory, .htaccess
Status: Base
Module: mod_xml2enc

Specify that the markup parser should start at the first instance of any of the elements specified. This can be used as a workaround where a broken backend inserts leading junk that messes up the parser (see the example at http://bahumbug.wordpress.com/2006/10/12/mod_proxy_html-revisited/).
It should never be used for XML, nor well-formed HTML.
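
The two directives above can be combined per location. The fragment below is a minimal sketch only: the path /legacy-app/ is hypothetical, and it assumes element names are given to xml2StartParse without angle brackets.

```apache
# Hypothetical path; both directives are allowed in directory context.
<Location "/legacy-app/">
    # Assume Latin-1 when none of the detection rules finds a charset.
    xml2EncDefault iso-8859-1
    # Begin parsing at the first html element, skipping any bytes before it.
    # Only for broken backends; not for XML or well-formed HTML.
    xml2StartParse html
</Location>
```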

10.130 Apache Module mpm_common
Description: A collection of directives that are implemented by more than one multi-processing module (MPM)
Status: MPM

Directives
• CoreDumpDirectory
• EnableExceptionHook
• GracefulShutdownTimeout
• Listen
• ListenBackLog
• ListenCoresBucketsRatio
• MaxConnectionsPerChild
• MaxMemFree
• MaxRequestWorkers
• MaxSpareThreads
• MinSpareThreads
• PidFile
• ReceiveBufferSize
• ScoreBoardFile
• SendBufferSize
• ServerLimit
• StartServers
• StartThreads
• ThreadLimit
• ThreadsPerChild
• ThreadStackSize

CoreDumpDirectory Directive
Description: Directory where Apache HTTP Server attempts to switch before dumping core
Syntax: CoreDumpDirectory directory
Default: See usage for the default setting
Context: server config
Status: MPM
Module: event, worker, prefork

This controls the directory to which Apache httpd attempts to switch before dumping core. If your operating system is configured to create core files in the working directory of the crashing process, CoreDumpDirectory is necessary to change working directory from the default ServerRoot directory, which should not be writable by the user the server runs as.
If you want a core dump for debugging, you can use this directive to place it in a different location. This directive has no effect if your operating system is not configured to write core files to the working directory of the crashing processes.

=⇒Core Dumps on Linux
If Apache httpd starts as root and switches to another user, the Linux kernel disables core dumps even if the directory is writable for the process. Apache httpd (2.0.46 and later) reenables core dumps on Linux 2.4 and beyond, but only if you explicitly configure a CoreDumpDirectory.

=⇒Core Dumps on BSD
To enable core-dumping of suid-executables on BSD-systems (such as FreeBSD), set kern.sugid_coredump to 1.

=⇒Specific signals
CoreDumpDirectory processing only occurs for a select set of fatal signals: SIGFPE, SIGILL, SIGABRT, SIGSEGV, and SIGBUS.
On some operating systems, SIGQUIT also results in a core dump but does not go through CoreDumpDirectory or EnableExceptionHook processing, so the core location is dictated entirely by the operating system.

EnableExceptionHook Directive
Description: Enables a hook that runs exception handlers after a crash
Syntax: EnableExceptionHook On|Off
Default: EnableExceptionHook Off
Context: server config
Status: MPM
Module: event, worker, prefork

For safety reasons this directive is only available if the server was configured with the --enable-exception-hook option. It enables a hook that allows external modules to plug in and do something after a child has crashed.
There are already two modules, mod_whatkilledus and mod_backtrace, that make use of this hook. Please have a look at Jeff Trawick’s EnableExceptionHook site (http://people.apache.org/~trawick/exception_hook.html) for more information about these.

GracefulShutdownTimeout Directive
Description: Specify a timeout after which a gracefully shutdown server will exit.
Syntax: GracefulShutdownTimeout seconds
Default: GracefulShutdownTimeout 0
Context: server config
Status: MPM
Module: event, worker, prefork

The GracefulShutdownTimeout specifies how many seconds after receiving a "graceful-stop" signal, a server should continue to run, handling the existing connections.
Setting this value to zero means that the server will wait indefinitely until all remaining requests have been fully served.
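
A minimal sketch of how CoreDumpDirectory and GracefulShutdownTimeout might appear together in the main server configuration. The directory path and the 30-second value are assumptions made for illustration, not recommendations.

```apache
# Hypothetical directory. It should exist and be writable by the user the
# child processes run as, unlike ServerRoot.
CoreDumpDirectory "/var/tmp/httpd-cores"

# After a graceful-stop signal, keep handling existing connections for at
# most 30 seconds, then exit. The default of 0 waits indefinitely.
GracefulShutdownTimeout 30
```

On Linux, setting CoreDumpDirectory explicitly also matters for the reason given in the note above: httpd only re-enables core dumps after switching user when the directive is configured.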

Listen Directive
Description: IP addresses and ports that the server listens to
Syntax: Listen [IP-address:]portnumber [protocol]
Context: server config
Status: MPM
Module: event, worker, prefork, mpm_winnt, mpm_netware, mpmt_os2

The Listen directive instructs Apache httpd to listen to only specific IP addresses or ports; by default it responds to requests on all IP interfaces. Listen is now a required directive. If it is not in the config file, the server will fail to start. This is a change from previous versions of Apache httpd.
The Listen directive tells the server to accept incoming requests on the specified port or address-and-port combination. If only a port number is specified, the server listens to the given port on all interfaces. If an IP address is given as well as a port, the server will listen on the given port and interface.
Multiple Listen directives may be used to specify a number of addresses and ports to listen to. The server will respond to requests from any of the listed addresses and ports.
For example, to make the server accept connections on both port 80 and port 8000, use:

    Listen 80
    Listen 8000

To make the server accept connections on two specified interfaces and port numbers, use

    Listen 192.170.2.1:80
    Listen 192.170.2.5:8000

IPv6 addresses must be surrounded in square brackets, as in the following example:

    Listen [2001:db8::a00:20ff:fea7:ccea]:80

The optional protocol argument is not required for most configurations. If not specified, https is the default for port 443 and http the default for all other ports. The protocol is used to determine which module should handle a request, and to apply protocol specific optimizations with the AcceptFilter directive.
You only need to set the protocol if you are running on non-standard ports.
For example, running an https site on port 8443:

    Listen 192.170.2.1:8443 https

=⇒Error condition
Multiple Listen directives for the same IP address and port will result in an Address already in use error message.

See also
• DNS Issues (p. 121)
• Setting which addresses and ports Apache HTTP Server uses (p. 88)
• Further discussion of the Address already in use error message, including other causes (http://wiki.apache.org/httpd/CouldNotBindToAddress).

ListenBackLog Directive
Description: Maximum length of the queue of pending connections
Syntax: ListenBacklog backlog
Default: ListenBacklog 511
Context: server config
Status: MPM
Module: event, worker, prefork, mpm_winnt, mpm_netware, mpmt_os2

The maximum length of the queue of pending connections. Generally no tuning is needed or desired; however, on some systems it is desirable to increase this when under a TCP SYN flood attack. See the backlog parameter to the listen(2) system call.
This will often be limited to a smaller number by the operating system. This varies from OS to OS. Also note that many OSes do not use exactly what is specified as the backlog, but use a number based on (but normally larger than) what is set.

ListenCoresBucketsRatio Directive
Description: Ratio between the number of CPU cores (online) and the number of listeners’ buckets
Syntax: ListenCoresBucketsRatio ratio
Default: ListenCoresBucketsRatio 0 (disabled)
Context: server config
Status: MPM
Module: event, worker, prefork
Compatibility: Available in Apache HTTP Server 2.4.17, with a kernel supporting the socket option SO_REUSEPORT and distributing new connections evenly across listening processes’ (or threads’) sockets using it (e.g. Linux 3.9 and later, but not the current implementations of SO_REUSEPORT in *BSDs).

A ratio between the number of (online) CPU cores and the number of listeners’ buckets can be used to make Apache HTTP Server create num_cpu_cores / ratio listening buckets, each containing its own Listen-ing socket(s) on the same port(s), and then make each child handle a single bucket (with round-robin distribution of the buckets at children creation time).

=⇒Meaning of "online" CPU core
On Linux (and also BSD) a CPU core can be turned on/off if Hotplug (https://www.kernel.org/doc/Documentation/cpu-hotplug.txt) is configured, therefore ListenCoresBucketsRatio needs to take this parameter into account while calculating the number of buckets to create.

ListenCoresBucketsRatio can improve the scalability when accepting new connections is/becomes the bottleneck. On systems with a large number of CPU cores, enabling this feature has been tested to show significant performance improvements and shorter response times.
There must be at least twice as many CPU cores as the configured ratio for this to be active. The recommended ratio is 8, hence at least 16 cores should be available at runtime when this value is used. The right ratio to obtain maximum performance needs to be calculated for each target system, testing multiple values and observing the variations in your key performance metrics.

MaxConnectionsPerChild Directive
Description: Limit on the number of connections that an individual child server will handle during its life
Syntax: MaxConnectionsPerChild number
Default: MaxConnectionsPerChild 0
Context: server config
Status: MPM
Module: event, worker, prefork, mpm_winnt, mpm_netware, mpmt_os2
Compatibility: Available in Apache HTTP Server 2.3.9 and later. The old name MaxRequestsPerChild is still supported.

The MaxConnectionsPerChild directive sets the limit on the number of connections that an individual child server process will handle.
After MaxConnectionsPerChild connections, the child process will die. If MaxConnectionsPerChild is 0, then the process will never expire.
Setting MaxConnectionsPerChild to a non-zero value limits the amount of memory that process can consume by (accidental) memory leakage.

MaxMemFree Directive
Description: Maximum amount of memory that the main allocator is allowed to hold without calling free()
Syntax: MaxMemFree KBytes
Default: MaxMemFree 2048
Context: server config
Status: MPM
Module: event, worker, prefork, mpm_winnt, mpm_netware

The MaxMemFree directive sets the maximum number of free Kbytes that every allocator is allowed to hold without calling free(). In threaded MPMs, every thread has its own allocator. When set to zero, the threshold will be set to unlimited.

MaxRequestWorkers Directive
Description: Maximum number of connections that will be processed simultaneously
Syntax: MaxRequestWorkers number
Default: See usage for details
Context: server config
Status: MPM
Module: event, worker, prefork

The MaxRequestWorkers directive sets the limit on the number of simultaneous requests that will be served. Any connection attempts over the MaxRequestWorkers limit will normally be queued, up to a number based on the ListenBacklog directive. Once a child process is freed at the end of a different request, the connection will then be serviced.
For non-threaded servers (i.e., prefork), MaxRequestWorkers translates into the maximum number of child processes that will be launched to serve requests. The default value is 256; to increase it, you must also raise ServerLimit.
For threaded and hybrid servers (e.g. event or worker) MaxRequestWorkers restricts the total number of threads that will be available to serve clients. For hybrid MPMs the default value is 16 (ServerLimit) multiplied by the value of 25 (ThreadsPerChild).
Therefore, to increase MaxRequestWorkers to a value that requires more than 16 processes, you must also raise ServerLimit.
MaxRequestWorkers was called MaxClients before version 2.3.13. The old name is still supported.

MaxSpareThreads Directive
Description: Maximum number of idle threads
Syntax: MaxSpareThreads number
Default: See usage for details
Context: server config
Status: MPM
Module: event, worker, mpm_netware, mpmt_os2

Maximum number of idle threads. Different MPMs deal with this directive differently.
For worker and event, the default is MaxSpareThreads 250. These MPMs deal with idle threads on a server-wide basis. If there are too many idle threads in the server then child processes are killed until the number of idle threads is less than this number.
For mpm_netware the default is MaxSpareThreads 100. Since this MPM runs a single process, the spare thread count is also server-wide.
mpmt_os2 works similarly to mpm_netware. For mpmt_os2 the default value is 10.

=⇒Restrictions
The range of the MaxSpareThreads value is restricted. Apache httpd will correct the given value automatically according to the following rules:
• mpm_netware wants the value to be greater than MinSpareThreads.
• For worker and event, the value must be greater than or equal to the sum of MinSpareThreads and ThreadsPerChild.

See also
• MinSpareThreads
• StartServers
• MaxSpareServers

MinSpareThreads Directive
Description: Minimum number of idle threads available to handle request spikes
Syntax: MinSpareThreads number
Default: See usage for details
Context: server config
Status: MPM
Module: event, worker, mpm_netware, mpmt_os2

Minimum number of idle threads to handle request spikes. Different MPMs deal with this directive differently.
worker and event use a default of MinSpareThreads 75 and deal with idle threads on a server-wide basis.
If there aren’t enough idle threads in the server then child processes are created until the number of idle threads is greater than number. Please also note that additional processes/threads might be created if ListenCoresBucketsRatio is enabled.
mpm_netware uses a default of MinSpareThreads 10 and, since it is a single-process MPM, tracks this on a server-wide basis.
mpmt_os2 works similarly to mpm_netware. For mpmt_os2 the default value is 5.

See also
• MaxSpareThreads
• StartServers
• MinSpareServers

PidFile Directive
Description: File where the server records the process ID of the daemon
Syntax: PidFile filename
Default: PidFile httpd.pid
Context: server config
Status: MPM
Module: event, worker, prefork, mpm_winnt, mpmt_os2

The PidFile directive sets the file to which the server records the process id of the daemon. If the filename is not absolute then it is assumed to be relative to the DefaultRuntimeDir.

Example

    PidFile /var/run/apache.pid

It is often useful to be able to send the server a signal, so that it closes and then re-opens its ErrorLog and TransferLog, and re-reads its configuration files. This is done by sending a SIGHUP (kill -1) signal to the process id listed in the PidFile.
The PidFile is subject to the same warnings about log file placement and security (p. 364).

=⇒Note
As of Apache HTTP Server 2, we recommend that you only use the apachectl script, or the init script that your OS provides, for (re-)starting or stopping the server.

ReceiveBufferSize Directive
Description: TCP receive buffer size
Syntax: ReceiveBufferSize bytes
Default: ReceiveBufferSize 0
Context: server config
Status: MPM
Module: event, worker, prefork, mpm_winnt, mpm_netware, mpmt_os2

The server will set the TCP receive buffer size to the number of bytes specified.
If set to the value of 0, the server will use the OS default.

ScoreBoardFile Directive
Description: Location of the file used to store coordination data for the child processes
Syntax: ScoreBoardFile file-path
Default: ScoreBoardFile apache_runtime_status
Context: server config
Status: MPM
Module: event, worker, prefork, mpm_winnt

Apache HTTP Server uses a scoreboard to communicate between its parent and child processes. Some architectures require a file to facilitate this communication. If the file is left unspecified, Apache httpd first attempts to create the scoreboard entirely in memory (using anonymous shared memory) and, failing that, will attempt to create the file on disk (using file-based shared memory). Specifying this directive causes Apache httpd to always create the file on the disk.
If file-path is not an absolute path, the location specified will be relative to the value of DefaultRuntimeDir.

Example

    ScoreBoardFile /var/run/apache_runtime_status

File-based shared memory is useful for third-party applications that require direct access to the scoreboard.
If you use a ScoreBoardFile then you may see improved speed by placing it on a RAM disk. But be careful that you heed the same warnings about log file placement and security (p. 364).

See also
• Stopping and Restarting Apache HTTP Server (p. 29)

SendBufferSize Directive
Description: TCP buffer size
Syntax: SendBufferSize bytes
Default: SendBufferSize 0
Context: server config
Status: MPM
Module: event, worker, prefork, mpm_winnt, mpm_netware, mpmt_os2

Sets the server’s TCP send buffer size to the number of bytes specified. It is often useful to set this past the OS’s standard default value on high speed, high latency connections (i.e., 100ms or so, such as transcontinental fast pipes).
If set to the value of 0, the server will use the default value provided by your OS.
Further configuration of your operating system may be required to elicit better performance on high speed, high latency connections.

=⇒Note
On some operating systems, changes in TCP behavior resulting from a larger SendBufferSize may not be seen unless EnableSendfile is set to OFF. This interaction applies only to static files.

ServerLimit Directive
Description: Upper limit on configurable number of processes
Syntax: ServerLimit number
Default: See usage for details
Context: server config
Status: MPM
Module: event, worker, prefork

For the prefork MPM, this directive sets the maximum configured value for MaxRequestWorkers for the lifetime of the Apache httpd process. For the worker and event MPMs, this directive in combination with ThreadLimit sets the maximum configured value for MaxRequestWorkers for the lifetime of the Apache httpd process. Any attempts to change this directive during a restart will be ignored, but MaxRequestWorkers can be modified during a restart.
Special care must be taken when using this directive. If ServerLimit is set to a value much higher than necessary, extra, unused shared memory will be allocated. If both ServerLimit and MaxRequestWorkers are set to values higher than the system can handle, Apache httpd may not start or the system may become unstable.
With the prefork MPM, use this directive only if you need to set MaxRequestWorkers higher than 256 (default). Do not set the value of this directive any higher than what you might want to set MaxRequestWorkers to.
With worker and event, use this directive only if your MaxRequestWorkers and ThreadsPerChild settings require more than 16 server processes (default). Do not set the value of this directive any higher than the number of server processes required by what you may want for MaxRequestWorkers and ThreadsPerChild.

=⇒Note
There is a hard limit of ServerLimit 20000 compiled into the server (for the prefork MPM 200000). This is intended to avoid nasty effects caused by typos.
To increase it even further past this limit, you will need to modify the value of MAX_SERVER_LIMIT in the mpm source file and rebuild the server.

See also
• Stopping and Restarting Apache HTTP Server (p. 29)

StartServers Directive
Description: Number of child server processes created at startup
Syntax: StartServers number
Default: See usage for details
Context: server config
Status: MPM
Module: event, worker, prefork, mpmt_os2

The StartServers directive sets the number of child server processes created on startup. As the number of processes is dynamically controlled depending on the load (see MinSpareThreads, MaxSpareThreads, MinSpareServers, MaxSpareServers), there is usually little reason to adjust this parameter.
The default value differs from MPM to MPM. worker and event default to StartServers 3; prefork defaults to 5; mpmt_os2 defaults to 2.

StartThreads Directive
Description: Number of threads created on startup
Syntax: StartThreads number
Default: See usage for details
Context: server config
Status: MPM
Module: mpm_netware

Number of threads created on startup. As the number of threads is dynamically controlled depending on the load (see MinSpareThreads, MaxSpareThreads, MinSpareServers, MaxSpareServers), there is usually little reason to adjust this parameter.
For mpm_netware the default is StartThreads 50 and, since there is only a single process, this is the total number of threads created at startup to serve requests.

ThreadLimit Directive
Description: Sets the upper limit on the configurable number of threads per child process
Syntax: ThreadLimit number
Default: See usage for details
Context: server config
Status: MPM
Module: event, worker, mpm_winnt

This directive sets the maximum configured value for ThreadsPerChild for the lifetime of the Apache httpd process.
Any attempts to change this directive during a restart will be ignored, but ThreadsPerChild can be modified during a restart up to the value of this directive.
Special care must be taken when using this directive. If ThreadLimit is set to a value much higher than ThreadsPerChild, extra unused shared memory will be allocated. If both ThreadLimit and ThreadsPerChild are set to values higher than the system can handle, Apache httpd may not start or the system may become unstable. Do not set the value of this directive any higher than your greatest predicted setting of ThreadsPerChild for the current run of Apache httpd.
The default value for ThreadLimit is 1920 when used with mpm_winnt and 64 when used with the others.

=⇒Note
There is a hard limit of ThreadLimit 20000 (or ThreadLimit 100000 with event, ThreadLimit 15000 with mpm_winnt) compiled into the server. This is intended to avoid nasty effects caused by typos. To increase it even further past this limit, you will need to modify the value of MAX_THREAD_LIMIT in the mpm source file and rebuild the server.

ThreadsPerChild Directive
Description: Number of threads created by each child process
Syntax: ThreadsPerChild number
Default: See usage for details
Context: server config
Status: MPM
Module: event, worker, mpm_winnt

This directive sets the number of threads created by each child process. The child creates these threads at startup and never creates more. If using an MPM like mpm_winnt, where there is only one child process, this number should be high enough to handle the entire load of the server. If using an MPM like worker, where there are multiple child processes, the total number of threads should be high enough to handle the common load on the server.
The default value for ThreadsPerChild is 64 when used with mpm_winnt and 25 when used with the others.
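
The relationships among ServerLimit, ThreadLimit, ThreadsPerChild, MaxRequestWorkers and the spare-thread directives described in this section can be sketched for a hybrid MPM as follows. The numbers are illustrative assumptions chosen so that the arithmetic is easy to check; they are not tuning advice.

```apache
# Illustrative values only. Comments are on their own lines because httpd
# does not allow a comment on the same line as a directive.
<IfModule mpm_event_module>
    # Upper bound for ThreadsPerChild (64 is the non-WinNT default).
    ThreadLimit          64
    # 1600 workers / 50 threads per child = 32 processes, which is more
    # than the default of 16, so ServerLimit has to be raised as well.
    ServerLimit          32
    StartServers          3
    # Must not exceed ThreadLimit.
    ThreadsPerChild      50
    # ServerLimit * ThreadsPerChild = 32 * 50.
    MaxRequestWorkers  1600
    MinSpareThreads      75
    # Must be >= MinSpareThreads + ThreadsPerChild = 75 + 50 = 125.
    MaxSpareThreads     250
</IfModule>
```

ThreadLimit and ServerLimit are placed first in this sketch because they bound the directives that follow, and because changes to them are ignored on restart.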

ThreadStackSize Directive
Description: The size in bytes of the stack used by threads handling client connections
Syntax: ThreadStackSize size
Default: 65536 on NetWare; varies on other operating systems
Context: server config
Status: MPM
Module: event, worker, mpm_winnt, mpm_netware, mpmt_os2

The ThreadStackSize directive sets the size of the stack (for autodata) of threads which handle client connections and call modules to help process those connections. In most cases the operating system default for stack size is reasonable, but there are some conditions where it may need to be adjusted:
• On platforms with a relatively small default thread stack size (e.g., HP-UX), Apache httpd may crash when using some third-party modules which use a relatively large amount of autodata storage. Those same modules may have worked fine on other platforms where the default thread stack size is larger. This type of crash is resolved by setting ThreadStackSize to a value higher than the operating system default. This type of adjustment is necessary only if the provider of the third-party module specifies that it is required, or if diagnosis of an Apache httpd crash indicates that the thread stack size was too small.
• On platforms where the default thread stack size is significantly larger than necessary for the web server configuration, a higher number of threads per child process will be achievable if ThreadStackSize is set to a value lower than the operating system default. This type of adjustment should only be made in a test environment which allows the full set of web server processing to be exercised, as there may be infrequent requests which require more stack to process. The minimum required stack size strongly depends on the modules used, but any change in the web server configuration can invalidate the current ThreadStackSize setting.
• On Linux, this directive can only be used to increase the default stack size, as the underlying system call uses the value as a minimum stack size. The (often large) soft limit for ulimit -s (8MB if unlimited) is used as the default stack size.

=⇒Note
It is recommended to not reduce ThreadStackSize unless a high number of threads per child process is needed. On some platforms (including Linux), a setting of 128000 is already too low and causes crashes with some common modules.

10.131 Apache Module event
Description: A variant of the worker MPM with the goal of consuming threads only for connections with active processing
Status: MPM
ModuleIdentifier: mpm_event_module
SourceFile: event.c

Summary
The event Multi-Processing Module (MPM) is, as its name implies, an asynchronous, event-based implementation designed to allow more requests to be served simultaneously by passing off some processing work to the listener threads, freeing up the worker threads to serve new requests.
To use the event MPM, add --with-mpm=event to the configure script’s arguments when building the httpd.

Directives
• AsyncRequestWorkerFactor
• CoreDumpDirectory (p. 990)
• EnableExceptionHook (p. 991)
• Group (p. 972)
• Listen (p. 992)
• ListenBacklog (p. 993)
• MaxConnectionsPerChild (p. 994)
• MaxMemFree (p. 994)
• MaxRequestWorkers (p. 994)
• MaxSpareThreads (p. 995)
• MinSpareThreads (p. 995)
• PidFile (p. 996)
• ScoreBoardFile (p. 996)
• SendBufferSize (p. 997)
• ServerLimit (p. 997)
• StartServers (p. 998)
• ThreadLimit (p. 999)
• ThreadsPerChild (p. 999)
• ThreadStackSize (p. 999)
• User (p. 973)

See also
• The worker MPM (p. 1014)

Relationship with the Worker MPM
event is based on the worker MPM, which implements a hybrid multi-process multi-threaded server. A single control process (the parent) is responsible for launching child processes.
Each child process creates a fixed number of server threads as specified in the ThreadsPerChild directive, as well as a listener thread which listens for connections and passes them to a worker thread for processing when they arrive.
Run-time configuration directives are identical to those provided by worker, with the sole addition of AsyncRequestWorkerFactor.

How it Works
The original goal of this MPM was to fix the 'keep alive problem' in HTTP. After a client completes the first request, it can keep the connection open, sending further requests using the same socket and saving significant overhead in creating TCP connections. However, Apache HTTP Server traditionally keeps an entire child process/thread waiting for data from the client, which brings its own disadvantages. To solve this problem, this MPM uses a dedicated listener thread for each process along with a pool of worker threads, sharing queues specific for those requests in keep-alive mode (or, more simply, "readable"), those in write-completion mode, and those in the process of shutting down ("closing"). An event loop, triggered on the status of the socket’s availability, adjusts these queues and pushes work to the worker pool.
This new architecture, leveraging non-blocking sockets and modern kernel features exposed by APR (like Linux’s epoll), no longer requires the mpm-accept Mutex to be configured to avoid the thundering herd problem.
The total number of connections that a single process/threads block can handle is regulated by the AsyncRequestWorkerFactor directive.

Async connections
Async connections would need a fixed dedicated worker thread with the previous MPMs but not with event. The status page of mod_status shows new columns under the Async connections section:

Writing
While sending the response to the client, it might happen that the TCP write buffer fills up because the connection is too slow.
Usually in this case a write() to the socket returns EWOULDBLOCK or EAGAIN, and the socket becomes writable again after an idle time. The worker holding the socket might be able to offload the waiting task to the listener thread, which in turn will re-assign it to the first idle worker thread available once an event is raised for the socket (for example, "the socket is now writable"). Please check the Limitations section for more information.

Keep-alive
Keep-alive handling is the most basic improvement over the worker MPM. Once a worker thread finishes flushing the response to the client, it can offload the socket handling to the listener thread, which in turn will wait for any event from the OS, like "the socket is readable". If any new request comes from the client, then the listener will forward it to the first worker thread available. Conversely, if the KeepAliveTimeout occurs then the socket will be closed by the listener. In this way the worker threads are not responsible for idle sockets and they can be re-used to serve other requests.

Closing
Sometimes the MPM needs to perform a lingering close, namely sending back an early error to the client while it is still transmitting data to httpd. Sending the response and then closing the connection immediately is not the correct thing to do since the client (still trying to send the rest of the request) would get a connection reset and could not read the httpd’s response. So in such cases, httpd tries to read the rest of the request to allow the client to consume the response. The lingering close is time bounded but it can take a relatively long time, so a worker thread can offload this work to the listener.

These improvements are valid for both HTTP and HTTPS connections.

Limitations
The improved connection handling may not work for certain connection filters that have declared themselves as incompatible with event.
In these cases, this MPM will fall back to the behaviour of the worker MPM and reserve one worker thread per connection. All modules shipped with the server are compatible with the event MPM.
A similar restriction is currently present for requests involving an output filter that needs to read and/or modify the whole response body. If the connection to the client blocks while the filter is processing the data, and the amount of data produced by the filter is too big to be buffered in memory, the thread used for the request is not freed while httpd waits until the pending data is sent to the client.
To illustrate this point we can think about the following two situations: serving a static asset (like a CSS file) versus serving content retrieved from FCGI/CGI or a proxied server. The former is predictable, namely the event MPM has full visibility on the end of the content and it can use events: the worker thread serving the response content can flush the first bytes until EWOULDBLOCK or EAGAIN is returned, delegating the rest to the listener. This one in turn waits for an event on the socket, and delegates the work to flush the rest of the content to the first idle worker thread. Meanwhile in the latter example (FCGI/CGI/proxied content) the MPM can’t predict the end of the response and a worker thread has to finish its work before returning the control to the listener. The only alternative is to buffer the response in memory, but it wouldn’t be the safest option for the sake of the server’s stability and memory footprint.

Background material
The event model was made possible by the introduction of new APIs into the supported operating systems:
• epoll (Linux)
• kqueue (BSD)
• event ports (Solaris)
Before these new APIs were made available, the traditional select and poll APIs had to be used. Those APIs get slow if used to handle many connections or if the rate of change of the set of connections is high.
The new APIs can monitor many more connections, and they perform much better when the set of connections to monitor changes frequently. So these APIs made it possible to write the event MPM, which scales much better with the typical HTTP pattern of many idle connections.

The MPM assumes that the underlying apr_pollset implementation is reasonably threadsafe. This enables the MPM to avoid excessive high level locking, or having to wake up the listener thread in order to send it a keep-alive socket. This is currently only compatible with KQueue and EPoll.

Requirements

This MPM depends on APR's atomic compare-and-swap operations for thread synchronization. If you are compiling for an x86 target and you don't need to support 386s, or you are compiling for a SPARC and you don't need to run on pre-UltraSPARC chips, add --enable-nonportable-atomics=yes to the configure script's arguments. This will cause APR to implement atomic operations using efficient opcodes not available in older CPUs.

This MPM does not perform well on older platforms which lack good threading, but the requirement for EPoll or KQueue makes this moot.
• To use this MPM on FreeBSD, FreeBSD 5.3 or higher is recommended. However, it is possible to run this MPM on FreeBSD 5.2.1, if you use libkse (see man libmap.conf).
• For NetBSD, at least version 2.0 is recommended.
• For Linux, a 2.6 kernel is recommended. It is also necessary to ensure that your version of glibc has been compiled with support for EPoll.

AsyncRequestWorkerFactor Directive

Description:   Limit concurrent connections per process
Syntax:        AsyncRequestWorkerFactor factor
Default:       2
Context:       server config
Status:        MPM
Module:        event
Compatibility: Available in version 2.3.13 and later

The event MPM handles some connections in an asynchronous way, where request worker threads are only allocated for short periods of time as needed, and other connections with one request worker thread reserved per connection.
This can lead to situations where all workers are tied up and no worker thread is available to handle new work on established async connections.

To mitigate this problem, the event MPM does two things:
• it limits the number of connections accepted per process, depending on the number of idle request workers;
• if all workers are busy, it will close connections in keep-alive state even if the keep-alive timeout has not expired. This allows the respective clients to reconnect to a different process which may still have worker threads available.

This directive can be used to fine-tune the per-process connection limit. A process will only accept new connections if the current number of connections (not counting connections in the "closing" state) is lower than:

    ThreadsPerChild + (AsyncRequestWorkerFactor * number of idle workers)

An estimate of the maximum concurrent connections across all the processes, given an average number of idle worker threads, can be calculated with:

    (ThreadsPerChild + (AsyncRequestWorkerFactor * number of idle workers)) * ServerLimit

Example

    ThreadsPerChild = 10
    ServerLimit = 4
    AsyncRequestWorkerFactor = 2
    MaxRequestWorkers = 40

    idle_workers = 4 (average for all the processes to keep it simple)

    max_connections = (ThreadsPerChild + (AsyncRequestWorkerFactor * idle_workers)) * ServerLimit
                    = (10 + (2 * 4)) * 4 = 72

When all the worker threads are idle, the absolute maximum number of concurrent connections can be calculated in a simpler way:

    (AsyncRequestWorkerFactor + 1) * MaxRequestWorkers
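The limits above reduce to simple arithmetic. The following is a minimal C sketch, not httpd code: the function names per_process_limit and estimated_max_connections are hypothetical, and the idle-worker count is assumed to be the same for every process, as in the example.

```c
#include <assert.h>

/* Hypothetical helper (not part of httpd): the per-process limit
 * ThreadsPerChild + (AsyncRequestWorkerFactor * idle workers). */
double per_process_limit(int threads_per_child, double factor,
                         int idle_workers)
{
    /* A process cannot have more idle workers than worker threads. */
    assert(idle_workers >= 0 && idle_workers <= threads_per_child);
    return threads_per_child + factor * idle_workers;
}

/* Hypothetical helper: estimate across all processes, assuming every
 * process has the same number of idle workers. */
double estimated_max_connections(int threads_per_child, double factor,
                                 int idle_workers, int server_limit)
{
    return per_process_limit(threads_per_child, factor, idle_workers)
           * server_limit;
}
```

With ThreadsPerChild 10, a factor of 2, 4 idle workers and ServerLimit 4, estimated_max_connections(10, 2.0, 4, 4) evaluates to 72, the figure given in the example. The factor is a double because the directive accepts non-integer values.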
Example

    ThreadsPerChild = 10
    ServerLimit = 4
    MaxRequestWorkers = 40
    AsyncRequestWorkerFactor = 2

If all the processes have all threads idle then:

    idle_workers = 10

We can calculate the absolute maximum number of concurrent connections in two ways:

    max_connections = (ThreadsPerChild + (AsyncRequestWorkerFactor * idle_workers)) * ServerLimit
                    = (10 + (2 * 10)) * 4 = 120

    max_connections = (AsyncRequestWorkerFactor + 1) * MaxRequestWorkers
                    = (2 + 1) * 40 = 120

Tuning AsyncRequestWorkerFactor requires knowledge about the traffic handled by httpd in each specific use case, so changing the default value requires extensive testing and data gathering from mod_status.

MaxRequestWorkers was called MaxClients prior to version 2.3.13. The above value shows that the old name did not accurately describe its meaning for the event MPM.

AsyncRequestWorkerFactor can take non-integer arguments, e.g. "1.5".

10.132 Apache Module mpm_netware

Description:      Multi-Processing Module implementing an exclusively threaded web server optimized for Novell NetWare
Status:           MPM
ModuleIdentifier: mpm_netware_module
SourceFile:       mpm_netware.c

Summary

This Multi-Processing Module (MPM) implements an exclusively threaded web server that has been optimized for Novell NetWare.

The main thread is responsible for launching child worker threads which listen for connections and serve them when they arrive. Apache HTTP Server always tries to maintain several spare or idle worker threads, which stand ready to serve incoming requests. In this way, clients do not need to wait for new child threads to be spawned before their requests can be served.

The StartThreads, MinSpareThreads, MaxSpareThreads, and MaxThreads directives regulate how the main thread creates worker threads to serve requests. In general, Apache httpd is very self-regulating, so most sites do not need to adjust these directives from their default values.
Sites with limited memory may need to decrease MaxThreads to keep the server from thrashing (spawning and terminating idle threads). More information about tuning process creation is provided in the performance hints (p. 339) documentation.

MaxConnectionsPerChild controls how frequently the server recycles processes by killing old ones and launching new ones. On the NetWare OS it is highly recommended that this directive remain set to 0. This allows worker threads to continue servicing requests indefinitely.

Directives
• Listen (p. 992)
• ListenBacklog (p. 993)
• MaxConnectionsPerChild (p. 994)
• MaxMemFree (p. 994)
• MaxSpareThreads (p. 995)
• MaxThreads
• MinSpareThreads (p. 995)
• ReceiveBufferSize (p. 996)
• SendBufferSize (p. 997)
• StartThreads (p. 998)
• ThreadStackSize (p. 999)

See also
• Setting which addresses and ports Apache httpd uses (p. 88)

MaxThreads Directive

Description: Set the maximum number of worker threads
Syntax:      MaxThreads number
Default:     MaxThreads 2048
Context:     server config
Status:      MPM
Module:      mpm_netware

The MaxThreads directive sets the desired maximum number of worker threads allowable. The default value is also the compiled-in hard limit. Therefore it can only be lowered, for example:

    MaxThreads 512

10.133 Apache Module mpmt_os2

Description:      Hybrid multi-process, multi-threaded MPM for OS/2
Status:           MPM
ModuleIdentifier: mpm_mpmt_os2_module
SourceFile:       mpmt_os2.c

Summary

The Server consists of a main, parent process and a small, static number of child processes.

The parent process' job is to manage the child processes. This involves spawning children as required to ensure there are always StartServers processes accepting connections.

Each child process consists of a pool of worker threads and a main thread that accepts connections and passes them to the workers via a work queue.
The worker thread pool is dynamic, managed by a maintenance thread so that the number of idle threads is kept between MinSpareThreads and MaxSpareThreads.

Directives
• Group (p. 972)
• Listen (p. 992)
• ListenBacklog (p. 993)
• MaxConnectionsPerChild (p. 994)
• MaxSpareThreads (p. 995)
• MinSpareThreads (p. 995)
• PidFile (p. 996)
• ReceiveBufferSize (p. 996)
• SendBufferSize (p. 997)
• StartServers (p. 998)
• User (p. 973)

See also
• Setting which addresses and ports Apache uses (p. 88)

10.134 Apache Module prefork

Description:      Implements a non-threaded, pre-forking web server
Status:           MPM
ModuleIdentifier: mpm_prefork_module
SourceFile:       prefork.c

Summary

This Multi-Processing Module (MPM) implements a non-threaded, pre-forking web server. Each server process may answer incoming requests, and a parent process manages the size of the server pool. It is appropriate for sites that need to avoid threading for compatibility with non-thread-safe libraries. It is also the best MPM for isolating each request, so that a problem with a single request will not affect any other.

This MPM is very self-regulating, so it is rarely necessary to adjust its configuration directives. Most important is that MaxRequestWorkers be big enough to handle as many simultaneous requests as you expect to receive, but small enough to assure that there is enough physical RAM for all processes.

Directives
• CoreDumpDirectory (p. 990)
• EnableExceptionHook (p. 991)
• Group (p. 972)
• Listen (p. 992)
• ListenBacklog (p. 993)
• MaxConnectionsPerChild (p. 994)
• MaxMemFree (p. 994)
• MaxRequestWorkers (p. 994)
• MaxSpareServers
• MinSpareServers
• PidFile (p. 996)
• ReceiveBufferSize (p. 996)
• ScoreBoardFile (p. 996)
• SendBufferSize (p. 997)
• ServerLimit (p. 997)
• StartServers (p. 998)
• User (p. 973)

See also
• Setting which addresses and ports Apache HTTP Server uses (p.
88)

How it Works

A single control process is responsible for launching child processes which listen for connections and serve them when they arrive. Apache httpd always tries to maintain several spare or idle server processes, which stand ready to serve incoming requests. In this way, clients do not need to wait for new child processes to be forked before their requests can be served.

The StartServers, MinSpareServers, MaxSpareServers, and MaxRequestWorkers directives regulate how the parent process creates children to serve requests. In general, Apache httpd is very self-regulating, so most sites do not need to adjust these directives from their default values. Sites which need to serve more than 256 simultaneous requests may need to increase MaxRequestWorkers, while sites with limited memory may need to decrease MaxRequestWorkers to keep the server from thrashing (swapping memory to disk and back). More information about tuning process creation is provided in the performance hints (p. 339) documentation.

While the parent process is usually started as root under Unix in order to bind to port 80, the child processes are launched by Apache httpd as a less-privileged user. The User and Group directives are used to set the privileges of the Apache httpd child processes. The child processes must be able to read all the content that will be served, but should have as few privileges beyond that as possible.

MaxConnectionsPerChild controls how frequently the server recycles processes by killing old ones and launching new ones.

This MPM uses the mpm-accept mutex to serialize access to incoming connections when subject to the thundering herd problem (generally, when there are multiple listening sockets). The implementation aspects of this mutex can be configured with the Mutex directive. The performance hints (p. 339) documentation has additional information about this mutex.
MaxSpareServers Directive

Description: Maximum number of idle child server processes
Syntax:      MaxSpareServers number
Default:     MaxSpareServers 10
Context:     server config
Status:      MPM
Module:      prefork

The MaxSpareServers directive sets the desired maximum number of idle child server processes. An idle process is one which is not handling a request. If there are more than MaxSpareServers idle, then the parent process will kill off the excess processes.

Tuning of this parameter should only be necessary on very busy sites. Setting this parameter to a large number is almost always a bad idea. If you are trying to set the value equal to or lower than MinSpareServers, Apache HTTP Server will automatically adjust it to MinSpareServers + 1.

See also
• MinSpareServers
• StartServers
• MaxSpareThreads

MinSpareServers Directive

Description: Minimum number of idle child server processes
Syntax:      MinSpareServers number
Default:     MinSpareServers 5
Context:     server config
Status:      MPM
Module:      prefork

The MinSpareServers directive sets the desired minimum number of idle child server processes. An idle process is one which is not handling a request. If there are fewer than MinSpareServers idle, then the parent process creates new children: it will spawn one, wait a second, then spawn two, wait a second, then spawn four, and it will continue exponentially until it is spawning 32 children per second. It will stop whenever it satisfies the MinSpareServers setting.

Tuning of this parameter should only be necessary on very busy sites. Setting this parameter to a large number is almost always a bad idea.

See also
• MaxSpareServers
• StartServers
• MinSpareThreads

10.135 Apache Module mpm_winnt

Description: Status: ModuleIdentifier: SourceFile: Multi-Processing Module optimized for Windows NT.
MPM mpm_winnt_module mpm_winnt.c

Summary

This Multi-Processing Module (MPM) is the default for the Windows NT operating systems. It uses a single control process which launches a single child process which in turn creates threads to handle requests.

Capacity is configured using the ThreadsPerChild directive, which sets the maximum number of concurrent client connections.

By default, this MPM uses advanced Windows APIs for accepting new client connections. In some configurations, third-party products may interfere with this implementation, with the following messages written to the web server log:

    Child: Encountered too many AcceptEx faults accepting client connections.
    winnt_mpm: falling back to 'AcceptFilter none'.

The MPM falls back to a safer implementation, but some client requests were not processed correctly. In order to avoid this error, use AcceptFilter with accept filter none.

    AcceptFilter http none
    AcceptFilter https none

In Apache httpd 2.0 and 2.2, Win32DisableAcceptEx was used for this purpose.

The WinNT MPM differs from the Unix MPMs such as worker and event in several areas:
• When a child process is exiting due to shutdown, restart, or MaxConnectionsPerChild, active requests in the exiting process have TimeOut seconds to finish before processing is aborted. Alternate types of restart and shutdown are not implemented.
• New child processes read the configuration files instead of inheriting the configuration from the parent. The behavior will be the same as on Unix if the child process is created at startup or restart, but if a child process is created because the prior one crashed or reached MaxConnectionsPerChild, any pending changes to the configuration will become active in the child at that point, and the parent and child will be using a different configuration.
If planned configuration changes have been partially implemented and the current configuration cannot be parsed, the replacement child process cannot start up and the server will halt. Because of this behavior, configuration files should not be changed until the time of a server restart.
• The monitor and fatal exception hooks are not currently implemented.
• AcceptFilter is implemented in the MPM and has a different type of control over handling of new connections. (Refer to the AcceptFilter documentation for details.)

Directives
• AcceptFilter (p. 382)
• CoreDumpDirectory (p. 990)
• Listen (p. 992)
• ListenBacklog (p. 993)
• MaxConnectionsPerChild (p. 994)
• MaxMemFree (p. 994)
• PidFile (p. 996)
• ReceiveBufferSize (p. 996)
• ScoreBoardFile (p. 996)
• SendBufferSize (p. 997)
• ThreadLimit (p. 999)
• ThreadsPerChild (p. 999)
• ThreadStackSize (p. 999)

See also
• Using Apache HTTP Server on Microsoft Windows (p. 267)

10.136 Apache Module worker

Description:      Multi-Processing Module implementing a hybrid multi-threaded multi-process web server
Status:           MPM
ModuleIdentifier: mpm_worker_module
SourceFile:       worker.c

Summary

This Multi-Processing Module (MPM) implements a hybrid multi-process multi-threaded server. By using threads to serve requests, it is able to serve a large number of requests with fewer system resources than a process-based server. However, it retains much of the stability of a process-based server by keeping multiple processes available, each with many threads.

The most important directives used to control this MPM are ThreadsPerChild, which controls the number of threads deployed by each child process, and MaxRequestWorkers, which controls the maximum total number of threads that may be launched.

Directives
• CoreDumpDirectory (p. 990)
• EnableExceptionHook (p. 991)
• Group (p. 972)
• Listen (p. 992)
• ListenBacklog (p. 993)
• MaxConnectionsPerChild (p.
994)
• MaxMemFree (p. 994)
• MaxRequestWorkers (p. 994)
• MaxSpareThreads (p. 995)
• MinSpareThreads (p. 995)
• PidFile (p. 996)
• ReceiveBufferSize (p. 996)
• ScoreBoardFile (p. 996)
• SendBufferSize (p. 997)
• ServerLimit (p. 997)
• StartServers (p. 998)
• ThreadLimit (p. 999)
• ThreadsPerChild (p. 999)
• ThreadStackSize (p. 999)
• User (p. 973)

See also
• Setting which addresses and ports Apache HTTP Server uses (p. 88)

How it Works

A single control process (the parent) is responsible for launching child processes. Each child process creates a fixed number of server threads as specified in the ThreadsPerChild directive, as well as a listener thread which listens for connections and passes them to a server thread for processing when they arrive.

Apache HTTP Server always tries to maintain a pool of spare or idle server threads, which stand ready to serve incoming requests. In this way, clients do not need to wait for new threads or processes to be created before their requests can be served. The number of processes that will initially launch is set by the StartServers directive. During operation, the server assesses the total number of idle threads in all processes, and forks or kills processes to keep this number within the boundaries specified by MinSpareThreads and MaxSpareThreads. Since this process is very self-regulating, it is rarely necessary to modify these directives from their default values. The maximum number of clients that may be served simultaneously (i.e., the maximum total number of threads in all processes) is determined by the MaxRequestWorkers directive. The maximum number of active child processes is determined by the MaxRequestWorkers directive divided by the ThreadsPerChild directive.
Two directives set hard limits on the number of active child processes and the number of server threads in a child process, and can only be changed by fully stopping the server and then starting it again. ServerLimit is a hard limit on the number of active child processes, and must be greater than or equal to the MaxRequestWorkers directive divided by the ThreadsPerChild directive. ThreadLimit is a hard limit on the number of server threads, and must be greater than or equal to the ThreadsPerChild directive.

In addition to the set of active child processes, there may be additional child processes which are terminating, but where at least one server thread is still handling an existing client connection. Up to MaxRequestWorkers terminating processes may be present, though the actual number can be expected to be much smaller. This behavior can be avoided by disabling the termination of individual child processes, which is achieved using the following:
• set the value of MaxConnectionsPerChild to zero
• set the value of MaxSpareThreads to the same value as MaxRequestWorkers

A typical configuration of the process-thread controls in the worker MPM could look as follows:

    ServerLimit         16
    StartServers         2
    MaxRequestWorkers  150
    MinSpareThreads     25
    MaxSpareThreads     75
    ThreadsPerChild     25

While the parent process is usually started as root under Unix in order to bind to port 80, the child processes and threads are launched by the server as a less-privileged user. The User and Group directives are used to set the privileges of the Apache HTTP Server child processes. The child processes must be able to read all the content that will be served, but should have as few privileges beyond that as possible. In addition, unless suexec is used, these directives also set the privileges which will be inherited by CGI scripts.
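The relationships between the process-thread directives described above can be checked with a few lines of arithmetic. The sketch below is illustrative C, not httpd's own validation code: the struct and function names are hypothetical, and ThreadLimit has to be supplied by the caller because the typical configuration does not set it.

```c
#include <assert.h>

/* Hypothetical container for the worker MPM process-thread controls. */
struct worker_limits {
    int server_limit;
    int thread_limit;
    int max_request_workers;
    int threads_per_child;
};

/* Maximum number of active child processes:
 * MaxRequestWorkers divided by ThreadsPerChild. */
int max_active_children(const struct worker_limits *w)
{
    assert(w->threads_per_child > 0);
    return w->max_request_workers / w->threads_per_child;
}

/* Returns 1 when both hard limits described in the text hold:
 * ServerLimit >= MaxRequestWorkers / ThreadsPerChild and
 * ThreadLimit >= ThreadsPerChild. Returns 0 otherwise. */
int limits_are_consistent(const struct worker_limits *w)
{
    return w->server_limit >= max_active_children(w)
        && w->thread_limit >= w->threads_per_child;
}
```

For the typical configuration (ServerLimit 16, MaxRequestWorkers 150, ThreadsPerChild 25), max_active_children gives 6, which is within ServerLimit.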
MaxConnectionsPerChild controls how frequently the server recycles processes by killing old ones and launching new ones.

This MPM uses the mpm-accept mutex to serialize access to incoming connections when subject to the thundering herd problem (generally, when there are multiple listening sockets). The implementation aspects of this mutex can be configured with the Mutex directive. The performance hints (p. 339) documentation has additional information about this mutex.

Chapter 11
Developer Documentation

11.1 Developer Documentation for the Apache HTTP Server 2.4

Warning
Many of the documents listed here are in need of update. They are in different stages of progress. Please be patient and follow this link (https://httpd.apache.org/docs-project/) to propose a fix or point out any error/discrepancy.

2.4 development documents
• Developing modules for the Apache HTTP Server 2.4 (p. 1042)
• Hook Functions in 2.4 (p. 1071)
• Request Processing in 2.4 (p. 1078)
• How filters work in 2.4 (p. 1081)
• Guidelines for output filters in 2.4 (p. 1084)
• Documenting code in 2.4 (p. 1070)
• Thread Safety Issues in 2.4 (p. 1091)

Upgrading to 2.4
• API changes in 2.3/2.4 (p. 1035)
• Converting Modules from 1.3 to 2.x (p. 1074)

External Resources
• Autogenerated Apache HTTP Server (trunk) code documentation (http://ci.apache.org/projects/httpd/trunk/doxygen/); the link is built by this job (https://ci.apache.org/builders/httpd-doxygen-nightly).
• Developer articles at apachetutor (http://www.apachetutor.org/) include:
  – Request Processing (http://www.apachetutor.org/dev/request)
  – Configuration for Modules (http://www.apachetutor.org/dev/config)
  – Resource Management (http://www.apachetutor.org/dev/pools)
  – Connection Pooling (http://www.apachetutor.org/dev/reslist)
  – Introduction to Buckets and Brigades (http://www.apachetutor.org/dev/brigades)

11.2
Apache 1.3 API notes

Warning
This document has not been updated to take into account changes made in the 2.0 version of the Apache HTTP Server. Some of the information may still be relevant, but please use it with care.

These are some notes on the Apache API and the data structures you have to deal with, etc. They are not yet nearly complete, but hopefully, they will help you get your bearings. Keep in mind that the API is still subject to change as we gain experience with it. (See the TODO file for what might be coming). However, it will be easy to adapt modules to any changes that are made. (We have more modules to adapt than you do).

A few notes on general pedagogical style here. In the interest of conciseness, all structure declarations here are incomplete: the real ones have more slots that I'm not telling you about. For the most part, these are reserved to one component of the server core or another, and should be altered by modules with caution. However, in some cases, they really are things I just haven't gotten around to yet. Welcome to the bleeding edge.

Finally, here's an outline, to give you some bare idea of what's coming up, and in what order:
• Basic concepts
  – Handlers, Modules, and Requests
  – A brief tour of a module
• How handlers work
  – A brief tour of the request_rec
  – Where request_rec structures come from
  – Handling requests, declining, and returning error codes
  – Special considerations for response handlers
  – Special considerations for authentication handlers
  – Special considerations for logging handlers
• Resource allocation and resource pools
• Configuration, commands and the like
  – Per-directory configuration structures
  – Command handling
  – Side notes: per-server configuration, virtual servers, etc.

Basic concepts

We begin with an overview of the basic concepts behind the API, and how they are manifested in the code.
Handlers, Modules, and Requests

Apache breaks down request handling into a series of steps, more or less the same way the Netscape server API does (although this API has a few more stages than NetSite does, as hooks for stuff I thought might be useful in the future). These are:
• URI -> Filename translation
• Auth ID checking [is the user who they say they are?]
• Auth access checking [is the user authorized here?]
• Access checking other than auth
• Determining MIME type of the object requested
• 'Fixups': there aren't any of these yet, but the phase is intended as a hook for possible extensions like SetEnv, which don't really fit well elsewhere.
• Actually sending a response back to the client.
• Logging the request

These phases are handled by looking at each of a succession of modules, checking whether each of them has a handler for the phase, and attempting to invoke it if so. The handler can typically do one of three things:
• Handle the request, and indicate that it has done so by returning the magic constant OK.
• Decline to handle the request, by returning the magic integer constant DECLINED. In this case, the server behaves in all respects as if the handler simply hadn't been there.
• Signal an error, by returning one of the HTTP error codes. This terminates normal handling of the request, although an ErrorDocument may be invoked to try to mop up, and it will be logged in any case.

Most phases are terminated by the first module that handles them; however, for logging, 'fixups', and non-access authentication checking, all handlers always run (barring an error). Also, the response phase is unique in that modules may declare multiple handlers for it, via a dispatch table keyed on the MIME type of the requested object. Modules may declare a response-phase handler which can handle any request, by giving it the key */* (i.e., a wildcard MIME type specification).
However, wildcard handlers are only invoked if the server has already tried and failed to find a more specific response handler for the MIME type of the requested object (either none existed, or they all declined).

The handlers themselves are functions of one argument (a request_rec structure; vide infra), which return an integer, as above.

A brief tour of a module

At this point, we need to explain the structure of a module. Our candidate will be one of the messier ones, the CGI module, which handles both CGI scripts and the ScriptAlias config file command. It's actually a great deal more complicated than most modules, but if we're going to have only one example, it might as well be the one with its fingers in every place.

Let's begin with handlers. In order to handle the CGI scripts, the module declares a response handler for them. Because of ScriptAlias, it also has handlers for the name translation phase (to recognize ScriptAliased URIs) and the type-checking phase (any ScriptAliased request is typed as a CGI script).

The module needs to maintain some per (virtual) server information, namely, the ScriptAliases in effect; the module structure therefore contains pointers to a function which builds these structures, and to another which combines two of them (in case the main server and a virtual server both have ScriptAliases declared).

Finally, this module contains code to handle the ScriptAlias command itself. This particular module only declares one command, but there could be more, so modules have command tables which declare their commands, and describe where they are permitted, and how they are to be invoked.

A final note on the declared types of the arguments of some of these commands: a pool is a pointer to a resource pool structure; these are used by the server to keep track of the memory which has been allocated, files opened, etc., either to service a particular request, or to handle the process of configuring itself.
That way, when the request is over (or, for the configuration pool, when the server is restarting), the memory can be freed, and the files closed, en masse, without anyone having to write explicit code to track them all down and dispose of them. Also, a cmd_parms structure contains various information about the config file being read, and other status information, which is sometimes of use to the function which processes a config-file command (such as ScriptAlias). With no further ado, the module itself:

    /* Declarations of handlers. */

    int translate_scriptalias (request_rec *);
    int type_scriptalias (request_rec *);
    int cgi_handler (request_rec *);

    /* Subsidiary dispatch table for response-phase
     * handlers, by MIME type */

    handler_rec cgi_handlers[] = {
        { "application/x-httpd-cgi", cgi_handler },
        { NULL }
    };

    /* Declarations of routines to manipulate the
     * module's configuration info. Note that these are
     * returned, and passed in, as void *'s; the server
     * core keeps track of them, but it doesn't, and can't,
     * know their internal structure.
     */

    void *make_cgi_server_config (pool *);
    void *merge_cgi_server_config (pool *, void *, void *);

    /* Declarations of routines to handle config-file commands */

    extern char *script_alias(cmd_parms *, void *per_dir_config,
                              char *fake, char *real);

    command_rec cgi_cmds[] = {
        { "ScriptAlias", script_alias, NULL, RSRC_CONF, TAKE2,
          "a fakename and a realname"},
        { NULL }
    };

    module cgi_module = {
        STANDARD_MODULE_STUFF,
        NULL,                     /* initializer */
        NULL,                     /* dir config creator */
        NULL,                     /* dir merger */
        make_cgi_server_config,   /* server config */
        merge_cgi_server_config,  /* merge server config */
        cgi_cmds,                 /* command table */
        cgi_handlers,             /* handlers */
        translate_scriptalias,    /* filename translation */
        NULL,                     /* check_user_id */
        NULL,                     /* check auth */
        NULL,                     /* check access */
        type_scriptalias,         /* type_checker */
        NULL,                     /* fixups */
        NULL,                     /* logger */
        NULL                      /* header parser */
    };
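The handler return convention described under "Handlers, Modules, and Requests" (OK, DECLINED, or an HTTP error code, with most phases ending at the first module that does not decline) can be sketched as follows. This is a simplified illustration, not the server's actual dispatch code: mock_request, run_first and the three sample handlers are hypothetical, and the constant values are stand-ins defined locally so the example compiles without the Apache headers.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define OK        0     /* stand-in values, defined here for illustration */
#define DECLINED  (-1)
#define FORBIDDEN 403   /* an HTTP status code used as an error return */

/* Hypothetical, cut-down request: only the URI. */
typedef struct { const char *uri; } mock_request;

typedef int (*phase_handler)(mock_request *);

/* Walk the handlers for one phase in module order. A NULL slot means
 * the module has no handler for this phase. The first result other
 * than DECLINED (OK or an error code) ends the phase. */
int run_first(const phase_handler *handlers, size_t n, mock_request *r)
{
    size_t i;
    assert(r != NULL);
    for (i = 0; i < n; i++) {
        int rv;
        if (handlers[i] == NULL)
            continue;
        rv = handlers[i](r);
        if (rv != DECLINED)
            return rv;
    }
    return DECLINED;   /* assumption: report that nobody handled it */
}

/* Sample handlers (hypothetical). */
int decline_everything(mock_request *r) { (void)r; return DECLINED; }

int deny_private(mock_request *r)
{
    return strncmp(r->uri, "/private/", 9) == 0 ? FORBIDDEN : DECLINED;
}

int accept_everything(mock_request *r) { (void)r; return OK; }
```

Given the chain { NULL, decline_everything, deny_private, accept_everything }, a request for a URI under /private/ stops at deny_private with the error code, while any other URI falls through to accept_everything. Phases in which all handlers always run (logging, fixups) would need a different loop that does not return on OK.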
How handlers work

The sole argument to handlers is a request_rec structure. This structure describes a particular request which has been made to the server, on behalf of a client. In most cases, each connection to the client generates only one request_rec structure.

A brief tour of the request_rec

The request_rec contains pointers to a resource pool which will be cleared when the server is finished handling the request; to structures containing per-server and per-connection information, and most importantly, information on the request itself.

The most important such information is a small set of character strings describing attributes of the object being requested, including its URI, filename, content-type and content-encoding (these being filled in by the translation and type-check handlers which handle the request, respectively).

Other commonly used data items are tables giving the MIME headers on the client's original request, MIME headers to be sent back with the response (which modules can add to at will), and environment variables for any subprocesses which are spawned off in the course of servicing the request. These tables are manipulated using the ap_table_get and ap_table_set routines.

Note that the Content-type header value cannot be set by module content-handlers using the ap_table_*() routines. Rather, it is set by pointing the content_type field in the request_rec structure to an appropriate string, e.g.,

    r->content_type = "text/html";

Finally, there are pointers to two data structures which, in turn, point to per-module configuration structures. Specifically, these hold pointers to the data structures which the module has built to describe the way it has been configured to operate in a given directory (via .htaccess files or <Directory> sections), and for private data it has built in the course of servicing the request (so modules' handlers for one phase can pass 'notes' to their handlers for other phases).
There is another such configuration vector in the server_rec data structure pointed to by the request_rec, which contains per (virtual) server configuration data.

Here is an abridged declaration, giving the fields most commonly used:

    struct request_rec {

        pool *pool;
        conn_rec *connection;
        server_rec *server;

        /* What object is being requested */

        char *uri;
        char *filename;
        char *path_info;
        char *args;             /* QUERY_ARGS, if any */
        struct stat finfo;      /* Set by server core;
                                 * st_mode set to zero if no such file */

        char *content_type;
        char *content_encoding;

        /* MIME header environments, in and out. Also,
         * an array containing environment variables to
         * be passed to subprocesses, so people can write
         * modules to add to that environment.
         *
         * The difference between headers_out and
         * err_headers_out is that the latter are printed
         * even on error, and persist across internal
         * redirects (so the headers printed for
         * ErrorDocument handlers will have them).
         */

        table *headers_in;
        table *headers_out;
        table *err_headers_out;
        table *subprocess_env;

        /* Info about the request itself... */

        int header_only;        /* HEAD request, as opposed to GET */
        char *protocol;         /* Protocol, as given to us, or HTTP/0.9 */
        char *method;           /* GET, HEAD, POST, etc. */
        int method_number;      /* M_GET, M_POST, etc. */

        /* Info for logging */

        char *the_request;
        int bytes_sent;

        /* A flag which modules can set, to indicate that
         * the data being returned is volatile, and clients
         * should be told not to cache it.
         */

        int no_cache;

        /* Various other config info which may change
         * with .htaccess files
         * These are config vectors, with one void*
         * pointer for each module (the thing pointed
         * to being the module's business).
         */

        void *per_dir_config;   /* Options set in config files, etc. */
        void *request_config;   /* Notes on *this* request */

    };

Where request_rec structures come from

Most request_rec structures are built by reading an HTTP request from a client, and filling in the fields. However, there are a few exceptions:

• If the request is to an imagemap, a type map (i.e., a *.var file), or a CGI script which returned a local ‘Location:’, then the resource which the user requested is going to be ultimately located by some URI other than what the client originally supplied. In this case, the server does an internal redirect, constructing a new request_rec for the new URI, and processing it almost exactly as if the client had requested the new URI directly.

• If some handler signaled an error, and an ErrorDocument is in scope, the same internal redirect machinery comes into play.

• Finally, a handler occasionally needs to investigate ‘what would happen if’ some other request were run. For instance, the directory indexing module needs to know what MIME type would be assigned to a request for each directory entry, in order to figure out what icon to use.

Such handlers can construct a sub-request, using the functions ap_sub_req_lookup_file, ap_sub_req_lookup_uri, and ap_sub_req_method_uri; these construct a new request_rec structure and process it as you would expect, up to but not including the point of actually sending a response. (These functions skip over the access checks if the sub-request is for a file in the same directory as the original request).

(Server-side includes work by building sub-requests and then actually invoking the response handler for them, via the function ap_run_sub_req).

Handling requests, declining, and returning error codes

As discussed above, each handler, when invoked to handle a particular request_rec, has to return an int to indicate what happened. That can either be

• OK – the request was handled successfully. This may or may not terminate the phase.
• DECLINED – no erroneous condition exists, but the module declines to handle the phase; the server tries to find another.

• an HTTP error code, which aborts handling of the request.

Note that if the error code returned is REDIRECT, then the module should put a Location in the request’s headers_out, to indicate where the client should be redirected to.

Special considerations for response handlers

Handlers for most phases do their work by simply setting a few fields in the request_rec structure (or, in the case of access checkers, simply by returning the correct error code). However, response handlers have to actually send a response back to the client.

They should begin by sending an HTTP response header, using the function ap_send_http_header. (You don’t have to do anything special to skip sending the header for HTTP/0.9 requests; the function figures out on its own that it shouldn’t do anything). If the request is marked header_only, that’s all they should do; they should return after that, without attempting any further output.

Otherwise, they should produce a response body which responds to the client as appropriate. The primitives for this are ap_rputc and ap_rprintf, for internally generated output, and ap_send_fd, to copy the contents of some FILE * straight to the client.

At this point, you should more or less understand the following piece of code, which is the handler for GET requests that have no more specific handler. It also shows how conditional GETs can be handled, if it’s desirable to do so in a particular response handler: ap_set_last_modified checks against the If-modified-since value supplied by the client, if any, and returns an appropriate code (which will, if nonzero, be USE_LOCAL_COPY). No similar considerations apply for ap_set_content_length, but it returns an error code for symmetry.
    int default_handler (request_rec *r)
    {
        int errstatus;
        FILE *f;

        if (r->method_number != M_GET) return DECLINED;
        if (r->finfo.st_mode == 0) return NOT_FOUND;

        if ((errstatus = ap_set_content_length (r, r->finfo.st_size))
            || (errstatus = ap_set_last_modified (r, r->finfo.st_mtime)))
            return errstatus;

        /* Open through the pool so that ap_pfclose below matches */
        f = ap_pfopen (r->pool, r->filename, "r");

        if (f == NULL) {
            log_reason("file permissions deny server access",
                       r->filename, r);
            return FORBIDDEN;
        }

        register_timeout ("send", r);
        ap_send_http_header (r);

        if (!r->header_only) ap_send_fd (f, r);
        ap_pfclose (r->pool, f);
        return OK;
    }

Finally, if all of this is too much of a challenge, there are a few ways out of it. First off, as shown above, a response handler which has not yet produced any output can simply return an error code, in which case the server will automatically produce an error response. Secondly, it can punt to some other handler by invoking ap_internal_redirect, which is how the internal redirection machinery discussed above is invoked. A response handler which has internally redirected should always return OK.

(Invoking ap_internal_redirect from handlers which are not response handlers will lead to serious confusion).

Special considerations for authentication handlers

Stuff that should be discussed here in detail:

• Authentication-phase handlers are not invoked unless auth is configured for the directory.

• Common auth configuration is stored in the core per-dir configuration; it has accessors ap_auth_type, ap_auth_name, and ap_requires.

• Common routines handle the protocol end of things, at least for HTTP basic authentication (ap_get_basic_auth_pw, which sets the connection->user structure field automatically, and ap_note_basic_auth_failure, which arranges for the proper WWW-Authenticate: header to be sent back).

Special considerations for logging handlers

When a request has internally redirected, there is the question of what to log.
Apache handles this by bundling the entire chain of redirects into a list of request_rec structures which are threaded through the r->prev and r->next pointers. The request_rec which is passed to the logging handlers in such cases is the one which was originally built for the initial request from the client; note that the bytes_sent field will only be correct in the last request in the chain (the one for which a response was actually sent).

Resource allocation and resource pools

One of the problems of writing and designing a server-pool server is that of preventing leakage, that is, allocating resources (memory, open files, etc.), without subsequently releasing them. The resource pool machinery is designed to make it easy to prevent this from happening, by allowing resources to be allocated in such a way that they are automatically released when the server is done with them.

The way this works is as follows: the memory which is allocated, files which are opened, etc., to deal with a particular request are tied to a resource pool which is allocated for the request. The pool is a data structure which itself tracks the resources in question.

When the request has been processed, the pool is cleared. At that point, all the memory associated with it is released for reuse, all files associated with it are closed, and any other clean-up functions which are associated with the pool are run. When this is over, we can be confident that all the resources tied to the pool have been released, and that none of them have leaked.

Server restarts, and allocation of memory and resources for per-server configuration, are handled in a similar way. There is a configuration pool, which keeps track of resources which were allocated while reading the server configuration files, and handling the commands therein (for instance, the memory that was allocated for per-server module configuration, log files and other files that were opened, and so forth).
When the server restarts, and has to reread the configuration files, the configuration pool is cleared, and so the memory and file descriptors which were taken up by reading them the last time are made available for reuse.

It should be noted that use of the pool machinery isn’t generally obligatory, except for situations like logging handlers, where you really need to register cleanups to make sure that the log file gets closed when the server restarts (this is most easily done by using the function ap_pfopen, which also arranges for the underlying file descriptor to be closed before any child processes, such as for CGI scripts, are execed), or in case you are using the timeout machinery (which isn’t yet even documented here). However, there are two benefits to using it: resources allocated to a pool never leak (even if you allocate a scratch string, and just forget about it); also, for memory allocation, ap_palloc is generally faster than malloc.

We begin here by describing how memory is allocated to pools, and then discuss how other resources are tracked by the resource pool machinery.

Allocation of memory in pools

Memory is allocated to pools by calling the function ap_palloc, which takes two arguments, one being a pointer to a resource pool structure, and the other being the amount of memory to allocate (in chars). Within handlers for handling requests, the most common way of getting a resource pool structure is by looking at the pool slot of the relevant request_rec; hence the repeated appearance of the following idiom in module code:

    int my_handler(request_rec *r)
    {
        struct my_structure *foo;
        ...

        foo = (struct my_structure *)ap_palloc (r->pool,
                                                sizeof(struct my_structure));
    }

Note that there is no ap_pfree: memory obtained from ap_palloc is freed only when the associated resource pool is cleared.
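A toy model may help to show why a pool can do without a per-object free function. The sketch below is not Apache's allocator: toy_pool, toy_palloc, and toy_clear_pool are hypothetical names, and the single fixed-size arena and 8-byte rounding are assumptions chosen to keep the example short. Allocation only advances an offset into the arena, and clearing the pool releases everything at once by resetting that offset.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_POOL_SIZE 1024  /* assumed arena size, for illustration only */
#define TOY_ALIGN     8     /* assumed rounding unit */

/* Hypothetical stand-in for a resource pool: one fixed arena. */
typedef struct {
    union {
        unsigned char bytes[TOY_POOL_SIZE];
        long double   force_align_ld;  /* keeps the arena suitably aligned */
        void         *force_align_p;
    } arena;
    size_t used;                       /* offset of the first free byte */
} toy_pool;

void toy_pool_init(toy_pool *p)
{
    assert(p != NULL);
    p->used = 0;
}

/* Round the size up, check that it fits, and bump the offset.
 * Returns NULL when the arena is exhausted. */
void *toy_palloc(toy_pool *p, size_t nbytes)
{
    size_t rounded = (nbytes + TOY_ALIGN - 1) & ~(size_t)(TOY_ALIGN - 1);
    void *result;

    assert(p != NULL);
    if (rounded < nbytes || rounded > TOY_POOL_SIZE - p->used)
        return NULL;                   /* overflow or out of space */
    result = p->arena.bytes + p->used;
    p->used += rounded;
    return result;
}

/* No per-object free: clearing the pool releases everything at once. */
void toy_clear_pool(toy_pool *p)
{
    assert(p != NULL);
    p->used = 0;
}
```

After toy_clear_pool the same arena is handed out again from the start, which mirrors the way a request pool is reused once the request is finished.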
Because memory is never freed piecemeal, ap_palloc does not have to do as much accounting as malloc(); all it does in the typical case is to round up the size, bump a pointer, and do a range check.

(The absence of ap_pfree also raises the possibility that heavy use of ap_palloc could cause a server process to grow excessively large. There are two ways to deal with this, which are dealt with below; briefly, you can use malloc, and try to be sure that all of the memory gets explicitly freed, or you can allocate a sub-pool of the main pool, allocate your memory in the sub-pool, and clear it out periodically. The latter technique is discussed in the section on sub-pools below, and is used in the directory-indexing code, in order to avoid excessive storage allocation when listing directories with thousands of files).

Allocating initialized memory

There are functions which allocate initialized memory, and are frequently useful. The function ap_pcalloc has the same interface as ap_palloc, but clears out the memory it allocates before it returns it. The function ap_pstrdup takes a resource pool and a char * as arguments, and allocates memory for a copy of the string the pointer points to, returning a pointer to the copy. Finally ap_pstrcat is a varargs-style function, which takes a pointer to a resource pool, and at least two char * arguments, the last of which must be NULL. It allocates enough memory to fit copies of each of the strings, as a unit; for instance:

    ap_pstrcat (r->pool, "foo", "/", "bar", NULL);

returns a pointer to 8 bytes worth of memory, initialized to "foo/bar" (seven characters plus the terminating NUL).

Commonly-used pools in the Apache Web server

A pool is really defined by its lifetime more than anything else. There are some static pools in http_main which are passed to various non-http_main functions as arguments at opportune times.
Here they are:

permanent_pool
• never passed to anything else, this is the ancestor of all pools

pconf
• subpool of permanent_pool
• created at the beginning of a config "cycle"; exists until the server is terminated or restarts; passed to all config-time routines, either via cmd->pool, or as the "pool *p" argument on those which don’t take pools
• passed to the module init() functions

ptemp
• sorry I lie, this pool isn’t called this currently in 1.3, I renamed it this in my pthreads development. I’m referring to the use of ptrans in the parent... contrast this with the later definition of ptrans in the child.
• subpool of permanent_pool
• created at the beginning of a config "cycle"; exists until the end of config parsing; passed to config-time routines via cmd->temp_pool. Somewhat of a "bastard child" because it isn’t available everywhere. Used for temporary scratch space which may be needed by some config routines but which is deleted at the end of config.

pchild
• subpool of permanent_pool
• created when a child is spawned (or a thread is created); lives until that child (thread) is destroyed
• passed to the module child_init functions
• destruction happens right after the child_exit functions are called... (which may explain why I think child_exit is redundant and unneeded)

ptrans
• should be a subpool of pchild, but currently is a subpool of permanent_pool, see above
• cleared by the child before going into the accept() loop to receive a connection
• used as connection->pool

r->pool
• for the main request this is a subpool of connection->pool; for subrequests it is a subpool of the parent request’s pool.
• exists until the end of the request (i.e., ap_destroy_sub_req, or in child_main after process_request has finished)
• note that r itself is allocated from r->pool; i.e., r->pool is first created and then r is the first thing palloc()d from it

For almost everything folks do, r->pool is the pool to use.
But you can see how other lifetimes, such as pchild, are useful to some modules... such as modules that need to open a database connection once per child, and wish to clean it up when the child dies.

You can also see how some bugs have manifested themselves, such as setting connection->user to a value from r->pool. In this case connection exists for the lifetime of ptrans, which is longer than r->pool (especially if r->pool is a subrequest!). So the correct thing to do is to allocate from connection->pool.

And there was another interesting bug in mod_include / mod_cgi. You’ll see in those that they do this test to decide if they should use r->pool or r->main->pool. In this case the resource that they are registering for cleanup is a child process. If it were registered in r->pool, then the code would wait() for the child when the subrequest finishes. With mod_include this could be any old #include, and the delay can be up to 3 seconds... and happened quite frequently. Instead the subprocess is registered in r->main->pool which causes it to be cleaned up when the entire request is done, i.e., after the output has been sent to the client and logging has happened.

Tracking open files, etc.

As indicated above, resource pools are also used to track other sorts of resources besides memory. The most common are open files. The routine which is typically used for this is ap_pfopen, which takes a resource pool and two strings as arguments; the strings are the same as the typical arguments to fopen, e.g.,

    ...
    FILE *f = ap_pfopen (r->pool, r->filename, "r");

    if (f == NULL) { ... } else { ... }

There is also an ap_popenf routine, which parallels the lower-level open system call. Both of these routines arrange for the file to be closed when the resource pool in question is cleared.

Unlike the case for memory, there are functions to close files allocated with ap_pfopen and ap_popenf, namely ap_pfclose and ap_pclosef.
(This is because, on many systems, the number of files which a single process can have open is quite limited). It is important to use these functions, rather than fclose or close directly, to close files allocated with ap_pfopen and ap_popenf, since to do otherwise could cause fatal errors on systems such as Linux, which react badly if the same FILE* is closed more than once.

(Using the close functions is not mandatory, since the file will eventually be closed regardless, but you should consider it in cases where your module is opening, or could open, a lot of files).

Other sorts of resources – cleanup functions

More text goes here. Describe the cleanup primitives in terms of which the file stuff is implemented; also, spawn_process.

Pool cleanups live until clear_pool() is called: clear_pool(a) recursively calls destroy_pool() on all subpools of a; then calls all the cleanups for a; then releases all the memory for a. destroy_pool(a) calls clear_pool(a) and then releases the pool structure itself. i.e., clear_pool(a) doesn’t delete a, it just frees up all the resources and you can start using it again immediately.

Fine control – creating and dealing with sub-pools, with a note on sub-requests

On rare occasions, too-free use of ap_palloc() and the associated primitives may result in undesirably profligate resource allocation. You can deal with such a case by creating a sub-pool, allocating within the sub-pool rather than the main pool, and clearing or destroying the sub-pool, which releases the resources which were associated with it. (This really is a rare situation; the only case in which it comes up in the standard module set is in case of listing directories, and then only with very large directories. Unnecessary use of the primitives discussed here can hair up your code quite a bit, with very little gain).

The primitive for creating a sub-pool is ap_make_sub_pool, which takes another pool (the parent pool) as an argument.
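The clear and destroy ordering described above can be modelled in a few lines. This is a sketch under stated assumptions, not the server's implementation: all toy_ names are hypothetical, pools are obtained with calloc, and only cleanups and sub-pools are modelled (memory owned by a pool is left out). Clearing a pool destroys its sub-pools, runs its cleanups, and leaves the pool usable; destroying it additionally releases the pool structure.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct toy_cleanup {
    void (*fn)(void *data);
    void *data;
    struct toy_cleanup *next;
} toy_cleanup;

typedef struct toy_pool {
    struct toy_pool *parent;
    struct toy_pool *first_child;
    struct toy_pool *next_sibling;
    toy_cleanup *cleanups;
} toy_pool;

/* Create a pool; pass NULL for a root pool. Returns NULL on failure. */
toy_pool *toy_make_sub_pool(toy_pool *parent)
{
    toy_pool *p = calloc(1, sizeof *p);
    if (p == NULL)
        return NULL;
    p->parent = parent;
    if (parent != NULL) {
        p->next_sibling = parent->first_child;
        parent->first_child = p;
    }
    return p;
}

/* Tie a cleanup to the pool. Returns 0 on success, -1 on failure. */
int toy_register_cleanup(toy_pool *p, void (*fn)(void *), void *data)
{
    toy_cleanup *c;
    assert(p != NULL && fn != NULL);
    c = malloc(sizeof *c);
    if (c == NULL)
        return -1;
    c->fn = fn;
    c->data = data;
    c->next = p->cleanups;
    p->cleanups = c;
    return 0;
}

void toy_destroy_pool(toy_pool *p);

/* Destroy all sub-pools, then run this pool's cleanups; p stays usable. */
void toy_clear_pool(toy_pool *p)
{
    assert(p != NULL);
    while (p->first_child != NULL)
        toy_destroy_pool(p->first_child);  /* unlinks itself from p */
    while (p->cleanups != NULL) {
        toy_cleanup *c = p->cleanups;
        p->cleanups = c->next;
        c->fn(c->data);
        free(c);
    }
}

/* Clear the pool, detach it from its parent, and release the structure. */
void toy_destroy_pool(toy_pool *p)
{
    toy_clear_pool(p);
    if (p->parent != NULL) {
        toy_pool **link = &p->parent->first_child;
        while (*link != p)
            link = &(*link)->next_sibling;
        *link = p->next_sibling;
    }
    free(p);
}

/* Example cleanup: counts how often it ran via the int passed as data. */
void toy_count_cleanup(void *data)
{
    ++*(int *)data;
}
```

A cleanup registered on a sub-pool therefore runs when either the sub-pool or any ancestor is cleared, which is the property the file-handling routines rely on.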
When the main pool is cleared, the sub-pool will be destroyed. The sub-pool may also be cleared or destroyed at any time, by calling the functions ap_clear_pool and ap_destroy_pool, respectively. (The difference is that ap_clear_pool frees resources associated with the pool, while ap_destroy_pool also deallocates the pool itself. In the former case, you can allocate new resources within the pool, and clear it again, and so forth; in the latter case, it is simply gone).

One final note: sub-requests have their own resource pools, which are sub-pools of the resource pool for the main request. The polite way to reclaim the resources associated with a sub-request which you have allocated (using the ap_sub_req_... functions) is ap_destroy_sub_req, which frees the resource pool. Before calling this function, be sure to copy anything that you care about which might be allocated in the sub-request’s resource pool into someplace a little less volatile (for instance, the filename in its request_rec structure).

(Again, under most circumstances, you shouldn’t feel obliged to call this function; only 2K of memory or so are allocated for a typical sub-request, and it will be freed anyway when the main request pool is cleared. It is only when you are allocating many, many sub-requests for a single main request that you should seriously consider the ap_destroy_... functions).

Configuration, commands and the like

One of the design goals for this server was to maintain external compatibility with the NCSA 1.3 server, that is, to read the same configuration files, to process all the directives therein correctly, and in general to be a drop-in replacement for NCSA. On the other hand, another design goal was to move as much of the server’s functionality into modules which have as little as possible to do with the monolithic server core. The only way to reconcile these goals is to move the handling of most commands from the central server into the modules.
However, just giving the modules command tables is not enough to divorce them completely from the server core. The server has to remember the commands in order to act on them later. That involves maintaining data which is private to the modules, and which can be either per-server, or per-directory. Most things are per-directory, including in particular access control and authorization information, but also information on how to determine file types from suffixes, which can be modified by AddType and ForceType directives, and so forth. In general, the governing philosophy is that anything which can be made configurable by directory should be; per-server information is generally used in the standard set of modules for information like Aliases and Redirects which come into play before the request is tied to a particular place in the underlying file system.

Another requirement for emulating the NCSA server is being able to handle the per-directory configuration files, generally called .htaccess files, though even in the NCSA server they can contain directives which have nothing at all to do with access control. Accordingly, after URI -> filename translation, but before performing any other phase, the server walks down the directory hierarchy of the underlying filesystem, following the translated pathname, to read any .htaccess files which might be present. The information which is read in then has to be merged with the applicable information from the server’s own config files (either from the <Directory> sections in access.conf, or from defaults in srm.conf, which actually behaves for most purposes almost exactly like <Directory />).

Finally, after having served a request which involved reading .htaccess files, we need to discard the storage allocated for handling them. That is solved the same way it is solved wherever else similar problems come up, by tying those structures to the per-transaction resource pool.
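The directory walk described above can be pictured with a small self-contained model. Everything in it is hypothetical: the toy_dir_config fields, the in-memory toy_override table standing in for .htaccess files, and the rule that a NULL field means "not set here" are assumptions made for the sketch, not the server's data structures. It walks from the root towards the requested file and merges each directory's overrides onto what was inherited from above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-directory settings; NULL means "not set here". */
typedef struct {
    const char *default_type;
    const char *charset;
} toy_dir_config;

/* Hypothetical stand-in for the contents of one .htaccess file. */
typedef struct {
    const char *dir;        /* directory the overrides apply to */
    toy_dir_config conf;
} toy_override;

/* Settings from the subdirectory win; unset fields are inherited. */
toy_dir_config toy_merge(toy_dir_config parent, toy_dir_config sub)
{
    toy_dir_config merged;
    merged.default_type = sub.default_type ? sub.default_type
                                           : parent.default_type;
    merged.charset = sub.charset ? sub.charset : parent.charset;
    return merged;
}

/* Visit "/", then each deeper directory on the way to `path`, merging
 * any override registered for exactly that directory. The last path
 * component is treated as the file and is not visited. */
toy_dir_config toy_walk(toy_dir_config base, const toy_override *ov,
                        size_t n_ov, const char *path)
{
    toy_dir_config current = base;
    size_t len, dlen, i;

    assert(path != NULL && path[0] == '/');
    for (len = 0; path[len] != '\0'; ++len) {
        if (path[len] != '/')
            continue;
        dlen = (len == 0) ? 1 : len;    /* the root directory is "/" */
        for (i = 0; i < n_ov; ++i) {
            if (strlen(ov[i].dir) == dlen
                && strncmp(ov[i].dir, path, dlen) == 0)
                current = toy_merge(current, ov[i].conf);
        }
    }
    return current;
}
```

Because merging happens in path order, a setting made in a deeper directory replaces the same setting made higher up, while settings the deeper directory leaves alone are inherited.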
Per-directory configuration structures

Let’s look at how all of this plays out in mod_mime.c, which defines the file typing handler which emulates the NCSA server’s behavior of determining file types from suffixes. What we’ll be looking at, here, is the code which implements the AddType and AddEncoding commands. These commands can appear in .htaccess files, so they must be handled in the module’s private per-directory data, which in fact, consists of two separate tables for MIME types and encoding information, and is declared as follows:

    typedef struct {
        table *forced_types;      /* Additional AddTyped stuff */
        table *encoding_types;    /* Added with AddEncoding... */
    } mime_dir_config;

When the server is reading a configuration file, or <Directory> section, which includes one of the MIME module’s commands, it needs to create a mime_dir_config structure, so those commands have something to act on. It does this by invoking the function it finds in the module’s ‘create per-dir config slot’, with two arguments: the name of the directory to which this configuration information applies (or NULL for srm.conf), and a pointer to a resource pool in which the allocation should happen.

(If we are reading a .htaccess file, that resource pool is the per-request resource pool for the request; otherwise it is a resource pool which is used for configuration data, and cleared on restarts. Either way, it is important for the structure being created to vanish when the pool is cleared, by registering a cleanup on the pool if necessary).

For the MIME module, the per-dir config creation function just ap_pallocs the structure above, and creates a couple of tables to fill it. That looks like this:

    void *create_mime_dir_config (pool *p, char *dummy)
    {
        mime_dir_config *new =
            (mime_dir_config *) ap_palloc (p, sizeof(mime_dir_config));

        new->forced_types = ap_make_table (p, 4);
        new->encoding_types = ap_make_table (p, 4);

        return new;
    }

Now, suppose we’ve just read in a .htaccess file.
We already have the per-directory configuration structure for the next directory up in the hierarchy. If the .htaccess file we just read in didn’t have any AddType or AddEncoding commands, its per-directory config structure for the MIME module is still valid, and we can just use it. Otherwise, we need to merge the two structures somehow.

To do that, the server invokes the module’s per-directory config merge function, if one is present. That function takes three arguments: the two structures being merged, and a resource pool in which to allocate the result. For the MIME module, all that needs to be done is overlay the tables from the new per-directory config structure with those from the parent:

    void *merge_mime_dir_configs (pool *p, void *parent_dirv, void *subdirv)
    {
        mime_dir_config *parent_dir = (mime_dir_config *)parent_dirv;
        mime_dir_config *subdir = (mime_dir_config *)subdirv;
        mime_dir_config *new =
            (mime_dir_config *)ap_palloc (p, sizeof(mime_dir_config));

        new->forced_types = ap_overlay_tables (p, subdir->forced_types,
                                               parent_dir->forced_types);
        new->encoding_types = ap_overlay_tables (p, subdir->encoding_types,
                                                 parent_dir->encoding_types);

        return new;
    }

As a note: if there is no per-directory merge function present, the server will just use the subdirectory’s configuration info, and ignore the parent’s. For some modules, that works just fine (e.g., for the includes module, whose per-directory configuration information consists solely of the state of the XBITHACK), and for those modules, you can just not declare one, and leave the corresponding structure slot in the module itself NULL.

Command handling

Now that we have these structures, we need to be able to figure out how to fill them. That involves processing the actual AddType and AddEncoding commands. To find commands, the server looks in the module’s command table.
That table contains information on how many arguments the commands take, and in what formats, where it is permitted, and so forth. That information is sufficient to allow the server to invoke most command-handling functions with pre-parsed arguments. Without further ado, let’s look at the AddType command handler, which looks like this (the AddEncoding command looks basically the same, and won’t be shown here):

    char *add_type(cmd_parms *cmd, mime_dir_config *m, char *ct, char *ext)
    {
        if (*ext == '.') ++ext;
        ap_table_set (m->forced_types, ext, ct);
        return NULL;
    }

This command handler is unusually simple. As you can see, it takes four arguments, two of which are pre-parsed arguments, the third being the per-directory configuration structure for the module in question, and the fourth being a pointer to a cmd_parms structure. That structure contains a bunch of arguments which are frequently of use to some, but not all, commands, including a resource pool (from which memory can be allocated, and to which cleanups should be tied), and the (virtual) server being configured, from which the module’s per-server configuration data can be obtained if required.

Another way in which this particular command handler is unusually simple is that there are no error conditions which it can encounter. If there were, it could return an error message instead of NULL; this causes an error to be printed out on the server’s stderr, followed by a quick exit, if it is in the main config files; for a .htaccess file, the syntax error is logged in the server error log (along with an indication of where it came from), and the request is bounced with a server error response (HTTP error status, code 500).
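The table-driven dispatch and the NULL-or-error-message convention can be sketched in a standalone model. None of this is the server's code: the toy_ types, the whitespace splitting with sscanf, the 63-character argument limit, the exact-match name lookup, and the extra "empty extension" check in the handler are all assumptions introduced to make the example self-contained and to show an error return.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* How many pre-parsed arguments a handler wants (toy subset). */
enum toy_args_how { TOY_TAKE1 = 1, TOY_TAKE2 = 2 };

/* Handlers return NULL on success or a message describing the error. */
typedef const char *(*toy_cmd_func)(void *conf, const char *arg1,
                                    const char *arg2);

typedef struct {
    const char *name;
    toy_cmd_func func;
    enum toy_args_how args_how;
    const char *errmsg;            /* describes the expected arguments */
} toy_command;

/* Hypothetical module configuration: remembers one type mapping. */
typedef struct {
    char ext[64];
    char type[64];
} toy_mime_conf;

/* Modelled on add_type above, plus an invented error condition. */
const char *toy_add_type(void *confv, const char *ct, const char *ext)
{
    toy_mime_conf *conf = confv;
    if (*ext == '.')
        ++ext;
    if (*ext == '\0')
        return "AddType requires a non-empty extension";
    snprintf(conf->type, sizeof conf->type, "%s", ct);
    snprintf(conf->ext, sizeof conf->ext, "%s", ext);
    return NULL;
}

const toy_command toy_cmds[] = {
    { "AddType", toy_add_type, TOY_TAKE2,
      "a mime type followed by a file extension" },
    { NULL, NULL, TOY_TAKE1, NULL }
};

/* Split one config line, find the command, check the argument count,
 * and call the handler with pre-parsed arguments. */
const char *toy_invoke(const toy_command *table, void *conf,
                       const char *line)
{
    char name[64], arg1[64], arg2[64], extra[64];
    const toy_command *cmd;
    int n;

    assert(table != NULL && line != NULL);
    n = sscanf(line, "%63s %63s %63s %63s", name, arg1, arg2, extra);
    if (n < 1)
        return "empty line";
    for (cmd = table; cmd->name != NULL; ++cmd) {
        if (strcmp(cmd->name, name) != 0)
            continue;
        if (n - 1 != (int)cmd->args_how)
            return cmd->errmsg;    /* wrong number of arguments */
        return cmd->func(conf, arg1, n > 2 ? arg2 : NULL);
    }
    return "unknown command";
}
```

In this model the caller decides what to do with a non-NULL result, which corresponds to the two behaviours described above: abort at startup for the main config files, or log the error and fail the request for a .htaccess file.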
The MIME module’s command table has entries for these commands, which look like this:

    command_rec mime_cmds[] = {
        { "AddType", add_type, NULL, OR_FILEINFO, TAKE2,
          "a mime type followed by a file extension" },
        { "AddEncoding", add_encoding, NULL, OR_FILEINFO, TAKE2,
          "an encoding (e.g., gzip), followed by a file extension" },
        { NULL }
    };

The entries in these tables are:

• The name of the command

• The function which handles it

• A (void *) pointer, which is passed in the cmd_parms structure to the command handler; this is useful in case many similar commands are handled by the same function.

• A bit mask indicating where the command may appear. There are mask bits corresponding to each AllowOverride option, and an additional mask bit, RSRC_CONF, indicating that the command may appear in the server’s own config files, but not in any .htaccess file.

• A flag indicating how many arguments the command handler wants pre-parsed, and how they should be passed in. TAKE2 indicates two pre-parsed arguments. Other options are TAKE1, which indicates one pre-parsed argument, FLAG, which indicates that the argument should be On or Off, and is passed in as a boolean flag, and RAW_ARGS, which causes the server to give the command the raw, unparsed arguments (everything but the command name itself). There is also ITERATE, which means that the handler looks the same as TAKE1, but that if multiple arguments are present, it should be called multiple times, and finally ITERATE2, which indicates that the command handler looks like a TAKE2, but if more arguments are present, then it should be called multiple times, holding the first argument constant.

• Finally, we have a string which describes the arguments that should be present. If the arguments in the actual config file are not as required, this string will be used to help give a more specific error message. (You can safely leave this NULL).

Finally, having set this all up, we have to use it.
This is ultimately done in the module’s handlers, specifically for its file-typing handler, which looks more or less like this; note that the per-directory configuration structure is extracted from the request_rec’s per-directory configuration vector by using the ap_get_module_config function.

    int find_ct(request_rec *r)
    {
        int i;
        char *fn = ap_pstrdup (r->pool, r->filename);
        mime_dir_config *conf = (mime_dir_config *)
            ap_get_module_config(r->per_dir_config, &mime_module);
        char *type;

        if (S_ISDIR(r->finfo.st_mode)) {
            r->content_type = DIR_MAGIC_TYPE;
            return OK;
        }

        if((i=ap_rind(fn,'.')) < 0) return DECLINED;
        ++i;

        if ((type = ap_table_get (conf->encoding_types, &fn[i])))
        {
            r->content_encoding = type;

            /* go back to previous extension to try to use it as a type */
            fn[i-1] = '\0';
            if((i=ap_rind(fn,'.')) < 0) return OK;
            ++i;
        }

        if ((type = ap_table_get (conf->forced_types, &fn[i])))
        {
            r->content_type = type;
        }

        return OK;
    }

Side notes – per-server configuration, virtual servers, etc.

The basic ideas behind per-server module configuration are basically the same as those for per-directory configuration; there is a creation function and a merge function, the latter being invoked where a virtual server has partially overridden the base server configuration, and a combined structure must be computed. (As with per-directory configuration, the default if no merge function is specified, and a module is configured in some virtual server, is that the base configuration is simply ignored).

The only substantial difference is that when a command needs to configure the per-server private module data, it needs to go to the cmd_parms data to get at it. Here’s an example, from the alias module, which also indicates how a syntax error can be returned (note that the per-directory configuration argument to the command handler is declared as a dummy, since the module doesn’t actually have per-directory config data):

    char *add_redirect(cmd_parms *cmd, void *dummy, char *f, char *url)
    {
        server_rec *s = cmd->server;
        alias_server_conf *conf = (alias_server_conf *)
            ap_get_module_config(s->module_config,&alias_module);
        alias_entry *new = ap_push_array (conf->redirects);

        if (!ap_is_url (url)) return "Redirect to non-URL";

        new->fake = f; new->real = url;
        return NULL;
    }

11.3 API Changes in Apache HTTP Server 2.4 since 2.2

This document describes changes to the Apache HTTPD API from version 2.2 to 2.4, that may be of interest to module/application developers and core hacks. As of the first GA release of the 2.4 branch API compatibility is preserved for the life of the 2.4 branch. (The VERSIONING [9] description for the 2.4 release provides more information about API compatibility.)

API changes fall into two categories: APIs that are altogether new, and existing APIs that are expanded or changed. The latter are further divided into those where all changes are backwards-compatible (so existing modules can ignore them), and those that might require attention by maintainers. As with the transition from HTTPD 2.0 to 2.2, existing modules and applications will require recompiling and may call for some attention, but most should not require any substantial updating (although some may be able to take advantage of API changes to offer significant improvements).

For the purpose of this document, the API is split according to the public header files. These headers are themselves the reference documentation, and can be used to generate a browsable HTML reference with make docs.

Changed APIs

ap_expr (NEW!)
Introduces a new API to parse and evaluate boolean and algebraic expressions, including provision for a standard syntax and customised variants.

ap_listen (changed; backwards-compatible)
Introduces a new API to enable httpd child processes to serve different purposes.
ap_mpm (changed)
    ap_mpm_run is replaced by a new mpm hook. Also ap_graceful_stop_signalled is lost, and ap_mpm_register_timed_callback is new.

ap_regex (changed)
    In addition to the existing regexp wrapper, a new higher-level API ap_rxplus is now provided. This provides the capability to compile Perl-style expressions like s/regexp/replacement/flags and to execute them against arbitrary strings. Support for regexp backreferences is also added.

ap_slotmem (NEW!)
    Introduces an API for modules to allocate and manage memory slots, most commonly for shared memory.

ap_socache (NEW!)
    API to manage a shared object cache.

heartbeat (NEW!)
    Common structures for heartbeat modules.

[9] http://svn.apache.org/repos/asf/httpd/httpd/branches/2.4.x/VERSIONING

ap_parse_htaccess (changed)
    The function signature for ap_parse_htaccess has been changed. An apr_table_t of individual directives allowed for override must now be passed (override remains).

http_config (changed)
    • Introduces per-module, per-directory loglevels, including macro wrappers.
    • New AP_DECLARE_MODULE macro to declare all modules.
    • New APLOG_USE_MODULE macro necessary for per-module loglevels in multi-file modules.
    • New API to retain data across module unload/load
    • New check_config hook
    • New ap_process_fnmatch_configs() function to process wildcards
    • Change ap_configfile_t, ap_cfg_getline(), ap_cfg_getc() to return error codes, and add ap_pcfg_strerror() for retrieving an error description.
    • Any config directive permitted in ACCESS_CONF context must now correctly handle being called from an .htaccess file via the new AllowOverrideList directive. ap_check_cmd_context() accepts a new flag NOT_IN_HTACCESS to detect this case.

http_core (changed)
    • REMOVED ap_default_type, ap_requires, all 2.2 authnz API
    • Introduces Optional Functions for logio and authnz
    • New function ap_get_server_name_for_url to support IPv6 literals.
    • New function ap_register_errorlog_handler to register error log format string handlers.
    • Arguments of error_log hook have changed. Declaration has moved to http_core.h.
    • New function ap_state_query to determine if the server is in the initial configuration preflight phase or not. This is both easier to use and more correct than the old method of creating a pool userdata entry in the process pool.
    • New function ap_get_conn_socket to get the socket descriptor for a connection. This should be used instead of accessing the core connection config directly.

httpd (changed)
    • Introduce per-directory, per-module loglevel
    • New loglevels APLOG_TRACEn
    • Introduce errorlog ids for requests and connections
    • Support for mod_request kept_body
    • Support buffering filter data for async requests
    • New CONN_STATE values
    • Function changes: ap_escape_html updated; ap_unescape_all, ap_escape_path_segment_buffer
    • Modules that load other modules later than the EXEC_ON_READ config reading stage need to call ap_reserve_module_slots() or ap_reserve_module_slots_directive() in their pre_config hook.
    • The useragent IP address per request can now be tracked independently of the client IP address of the connection, for support of deployments with load balancers.

http_log (changed)
    • Introduce per-directory, per-module loglevel
    • New loglevels APLOG_TRACEn
    • ap_log_*error become macro wrappers (backwards-compatible if APLOG_MARK macro is used, except that it is no longer possible to use #ifdef inside the argument list)
    • piped logging revamped
    • module_index added to error_log hook
    • new function: ap_log_command_line

http_request (changed)
    • New auth_internal API and auth_provider API
    • New EOR bucket type
    • New function ap_process_async_request
    • New flags AP_AUTH_INTERNAL_PER_CONF and AP_AUTH_INTERNAL_PER_URI
    • New access_checker_ex hook to apply additional access control and/or bypass authentication.
    • New functions ap_hook_check_access_ex, ap_hook_check_access, ap_hook_check_authn, ap_hook_check_authz which accept AP_AUTH_INTERNAL_PER_* flags
    • DEPRECATED direct use of ap_hook_access_checker, access_checker_ex, ap_hook_check_user_id, ap_hook_auth_checker

    When possible, registering all access control hooks (including authentication and authorization hooks) using AP_AUTH_INTERNAL_PER_CONF is recommended. If all modules’ access control hooks are registered with this flag, then whenever the server handles an internal sub-request that matches the same set of access control configuration directives as the initial request (which is the common case), it can avoid invoking the access control hooks another time.

    If your module requires the old behavior and must perform access control checks on every sub-request with a different URI from the initial request, even if that URI matches the same set of access control configuration directives, then use AP_AUTH_INTERNAL_PER_URI.

mod_auth (NEW!)
    Introduces the new provider framework for authn and authz.

mod_cache (changed)
    Introduces a commit_entity() function to the cache provider interface, allowing atomic writes to cache. Adds a cache_status() hook to report the cache decision. All private structures and functions were removed.

mod_core (NEW!)
    This introduces low-level APIs to send arbitrary headers, and exposes functions to handle HTTP OPTIONS and TRACE.

mod_cache_disk (changed)
    Changes the disk format of the disk cache to support atomic cache updates without locking. The device/inode pair of the body file is embedded in the header file, allowing confirmation that the header and body belong to one another.

mod_disk_cache (renamed)
    The mod_disk_cache module has been renamed to mod_cache_disk in order to be consistent with the naming of other modules within the server.

mod_request (NEW!)
    The API for mod_request, to make input data available to multiple application/handler modules where required, and to parse HTML form data.

mpm_common (changed)
    • REMOVES: accept, lockfile, lock_mech, set_scoreboard (locking uses the new ap_mutex API)
    • NEW API to drop privileges (delegates this platform-dependent function to modules)
    • NEW Hooks: mpm_query, timed_callback, and get_name
    • CHANGED interfaces: monitor hook, ap_reclaim_child_processes, ap_relieve_child_processes

scoreboard (changed)
    ap_get_scoreboard_worker is made non-backwards-compatible as an alternative version is introduced. Additional proxy_balancer support. Child status stuff revamped.

util_cookies (NEW!)
    Introduces a new API for managing HTTP Cookies.

util_ldap (changed)
    no description available

util_mutex (NEW!)
    A wrapper for APR proc and global mutexes in httpd, providing common configuration for the underlying mechanism and location of lock files.

util_script (changed)
    NEW: ap_args_to_table

util_time (changed)
    NEW: ap_recent_ctime_ex

Specific information on upgrading modules from 2.2

Logging

In order to take advantage of per-module loglevel configuration, any source file that calls the ap_log_* functions should declare which module it belongs to. If the module’s module_struct is called foo_module, the following code can be used to remain backward compatible with HTTPD 2.0 and 2.2:

    #include <http_log.h>

    #ifdef APLOG_USE_MODULE
    APLOG_USE_MODULE(foo);
    #endif

Note: This is absolutely required for C++-language modules. It can be skipped for C-language modules, though that breaks module-specific log level support for files without it.

The number of parameters of the ap_log_* functions and the definition of APLOG_MARK have changed. Normally, the change is completely transparent. However, changes are required if a module uses APLOG_MARK as a parameter to its own functions or if a module calls ap_log_* without passing APLOG_MARK.
A module which uses wrappers around ap_log_* typically uses both of these constructs.

The easiest way to change code which passes APLOG_MARK to its own functions is to define and use a different macro that expands to the parameters required by those functions, as APLOG_MARK should only be used when calling ap_log_* directly. In this way, the code will remain compatible with HTTPD 2.0 and 2.2.

Code which calls ap_log_* without passing APLOG_MARK will necessarily differ between 2.4 and earlier releases, as 2.4 requires a new third argument, APLOG_MODULE_INDEX.

    /* code for httpd 2.0/2.2 */
    ap_log_perror(file, line, APLOG_ERR, 0, p, "Failed to allocate dynamic lock structure");

    /* code for httpd 2.4 */
    ap_log_perror(file, line, APLOG_MODULE_INDEX, APLOG_ERR, 0, p, "Failed to allocate dynamic lock structure");

ap_log_*error are now implemented as macros. This means that it is no longer possible to use #ifdef inside the argument list of ap_log_*error, as this would cause undefined behavior according to C99.

A server_rec pointer must be passed to ap_log_error() when called after startup. This was always appropriate, but there are even more limitations with a NULL server_rec in 2.4 than in previous releases. Beginning with 2.3.12, the global variable ap_server_conf can always be used as the server_rec parameter, as it will be NULL only when it is valid to pass NULL to ap_log_error(). ap_server_conf should be used only when a more appropriate server_rec is not available.

Consider the following changes to take advantage of the new APLOG_TRACE1..8 log levels:

• Check current use of APLOG_DEBUG and consider if one of the APLOG_TRACEn levels is more appropriate.
• If your module currently has a mechanism for configuring the amount of debug logging which is performed, consider eliminating that mechanism and relying on the use of different APLOG_TRACEn levels.
If expensive trace processing needs to be bypassed depending on the configured log level, use the APLOGtracen and APLOGrtracen macros to first check if tracing is enabled.

Modules sometimes add process id and/or thread id to their log messages. These ids are now logged by default, so it may not be necessary for the module to log them explicitly. (Users may remove them from the error log format, but they can be instructed to add them back if necessary for problem diagnosis.)

If your module uses these existing APIs...

ap_default_type()
    This is no longer available; Content-Type must be configured explicitly or added by the application.

ap_get_server_name()
    If the returned server name is used in a URL, use ap_get_server_name_for_url() instead. This new function handles the odd case where the server name is an IPv6 literal address.

ap_get_server_version()
    For logging purposes, where detailed information is appropriate, use ap_get_server_description(). When generating output, where the amount of information should be configurable by ServerTokens, use ap_get_server_banner().

ap_graceful_stop_signalled()
    Replace with a call to ap_mpm_query(AP_MPMQ_MPM_STATE) and check for state AP_MPMQ_STOPPING.

ap_max_daemons_limit, ap_my_generation, and ap_threads_per_child
    Use ap_mpm_query() query codes AP_MPMQ_MAX_DAEMON_USED, AP_MPMQ_GENERATION, and AP_MPMQ_MAX_THREADS, respectively.

ap_mpm_query()
    Ensure that it is not used until after the register-hooks hook has completed. Otherwise, an MPM built as a DSO would not have had a chance to enable support for this function.

ap_requires()
    The core server now provides better infrastructure for handling Require configuration. Register an auth provider function for each supported entity using ap_register_auth_provider(). The function will be called as necessary during Require processing. (Consult bundled modules for detailed examples.)
ap_server_conf->process->pool userdata
    Optional:
    • If your module uses this to determine which pass of the startup hooks is being run, use ap_state_query(AP_SQ_MAIN_STATE).
    • If your module uses this to maintain data across the unloading and reloading of your module, use ap_retained_data_create() and ap_retained_data_get().

apr_global_mutex_create(), apr_proc_mutex_create()
    Optional: See ap_mutex_register(), ap_global_mutex_create(), and ap_proc_mutex_create(); these allow your mutexes to be configurable with the Mutex directive; you can also remove any configuration mechanisms in your module for such mutexes.

CORE_PRIVATE
    This is now unnecessary and ignored.

dav_new_error() and dav_new_error_tag()
    Previously, these assumed that errno contained information describing the failure. Now, an apr_status_t parameter must be provided. Pass 0/APR_SUCCESS if there is no such error information, or a valid apr_status_t value otherwise.

mpm_default.h, DEFAULT_LOCKFILE, DEFAULT_THREAD_LIMIT, DEFAULT_PIDLOG, etc.
    The header file and most of the default configuration values set in it are no longer visible to modules. (Most can still be overridden at build time.) DEFAULT_PIDLOG and DEFAULT_REL_RUNTIMEDIR are now universally available via ap_config.h.

unixd_config
    This has been renamed to ap_unixd_config.

conn_rec->remote_ip and conn_rec->remote_addr
    These fields have been renamed in order to distinguish between the client IP address of the connection and the useragent IP address of the request (potentially overridden by a load balancer or proxy). References to either of these fields must be updated with one of the following options, as appropriate for the module:
    • When you require the IP address of the user agent, which might be connected directly to the server, or might optionally be separated from the server by a transparent load balancer or proxy, use request_rec->useragent_ip and request_rec->useragent_addr.
    • When you require the IP address of the client that is connected directly to the server, which might be the useragent or might be the load balancer or proxy itself, use conn_rec->client_ip and conn_rec->client_addr.

If your module interfaces with this feature...

suEXEC
    Optional: If your module logs an error when ap_unixd_config.suexec_enabled is 0, also log the value of the new field suexec_disabled_reason, which contains an explanation of why it is not available.

Extended status data in the scoreboard
    In previous releases, ExtendedStatus had to be set to On, which in turn required that mod_status was loaded. In 2.4, just set ap_extended_status to 1 in a pre-config hook and the extended status data will be available.

Does your module...

Parse query args
    Consider if ap_args_to_table() would be helpful.

Parse form data...
    Use ap_parse_form_data().

Check for request header fields Content-Length and Transfer-Encoding to see if a body was specified
    Use ap_request_has_body().

Implement cleanups which clear pointer variables
    Use ap_pool_cleanup_set_null().

Create run-time files such as shared memory files, pid files, etc.
    Use ap_runtime_dir_relative() so that the global configuration for the location of such files, either by the DEFAULT_REL_RUNTIMEDIR compile setting or the DefaultRuntimeDir directive, will be respected. Apache httpd 2.4.2 and above.

11.4 Developing modules for the Apache HTTP Server 2.4

This document explains how you can develop modules for the Apache HTTP Server 2.4.

See also
• Request Processing in Apache 2.4 (p. 1078)
• Apache 2.x Hook Functions (p. 1071)

Introduction

What we will be discussing in this document

This document will discuss how you can create modules for the Apache HTTP Server 2.4, by exploring an example module called mod_example.
In the first part of this document, the purpose of this module will be to calculate and print out various digest values for existing files on your web server, whenever we access the URL http://hostname/filename.sum. For instance, if we want to know the MD5 digest value of the file located at http://www.example.com/index.html, we would visit http://www.example.com/index.html.sum.

In the second part of this document, which deals with configuration directive and context awareness, we will be looking at a module that simply writes out its own configuration to the client.

Prerequisites

First and foremost, you are expected to have a basic knowledge of how the C programming language works. In most cases, we will try to be as pedagogical as possible and link to documents describing the functions used in the examples, but there are also many cases where it is necessary to either just assume that "it works" or do some digging yourself into the hows and whys of various function calls.

Lastly, you will need to have a basic understanding of how modules are loaded and configured in the Apache HTTP Server, as well as how to get the headers for Apache if you do not have them already, as these are needed for compiling new modules.

Compiling your module

To compile the source code we are building in this document, we will be using APXS (p. 303). Assuming your source file is called mod_example.c, compiling, installing and activating the module is as simple as:

    apxs -i -a -c mod_example.c

Defining a module

Every module starts with the same declaration, or name tag if you will, that defines a module as a separate entity within Apache:

    module AP_MODULE_DECLARE_DATA example_module =
    {
        STANDARD20_MODULE_STUFF,
        create_dir_conf, /* Per-directory configuration handler */
        merge_dir_conf,  /* Merge handler for per-directory configurations */
        create_svr_conf, /* Per-server configuration handler */
        merge_svr_conf,  /* Merge handler for per-server configurations */
        directives,      /* Any directives we may have for httpd */
        register_hooks   /* Our hook registering function */
    };

This bit of code lets the server know that we have now registered a new module in the system, and that its name is example_module. The name of the module is used primarily for two things:

• Letting the server know how to load the module using the LoadModule directive
• Setting up a namespace for the module to use in configurations

For now, we’re only concerned with the first purpose of the module name, which comes into play when we need to load the module:

    LoadModule example_module "modules/mod_example.so"

In essence, this tells the server to open up mod_example.so and look for a module called example_module.

Within this name tag of ours is also a bunch of references to how we would like to handle things: which directives we respond to in a configuration file or .htaccess, how we operate within specific contexts, and what handlers we are interested in registering with the Apache HTTP service. We’ll return to all these elements later in this document.

Getting started: Hooking into the server

An introduction to hooks

When handling requests in Apache HTTP Server 2.4, the first thing you will need to do is create a hook into the request handling process. A hook is essentially a message telling the server that you are willing to either serve or at least take a glance at certain requests given by clients.
All handlers, whether it’s mod_rewrite, mod_authn_*, mod_proxy and so on, are hooked into specific parts of the request process. As you are probably aware, modules serve different purposes: some are authentication/authorization handlers, others are file or script handlers, while others rewrite URIs or proxy content. Furthermore, in the end, it is up to the user of the server how and when each module will come into play. Thus, the server itself does not presume to know which module is responsible for handling a specific request, and will ask each module whether it has an interest in a given request or not. It is then up to each module to either gently decline serving a request, accept serving it, or flat out deny the request from being served, as authentication/authorization modules do.

To make it a bit easier for handlers such as our mod_example to know whether the client is requesting content we should handle or not, the server has directives for hinting to modules whether their assistance is needed or not. Two of these are AddHandler and SetHandler. Let’s take a look at an example using AddHandler. In our example case, we want every request ending with .sum to be served by mod_example, so we’ll add a configuration directive that tells the server to do just that:

    AddHandler example-handler ".sum"

What this tells the server is the following: Whenever we receive a request for a URI ending in .sum, we are to let all modules know that we are looking for whoever goes by the name of "example-handler". Thus, when a request is being served that ends in .sum, the server will let all modules know that this request should be served by "example-handler". As you will see later, when we start building mod_example, we will check for this handler tag relayed by AddHandler and reply to the server based on the value of this tag.
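To make the idea of a handler tag more concrete, here is a minimal, self-contained sketch of the mapping that an AddHandler line sets up: a URI suffix on one side, a handler name on the other. This is not the server’s own code; the handler_mapping type and the handler_for_uri function are hypothetical names invented for this illustration, and the matching is simplified to the final suffix only (the real server may consider every extension in a filename).

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one AddHandler line: a suffix and the handler
 * name the server would relay for matching requests. */
typedef struct {
    const char *suffix;
    const char *handler;
} handler_mapping;

/* Mirrors: AddHandler example-handler ".sum" */
static const handler_mapping mappings[] = {
    { ".sum", "example-handler" },
};

/* Return the handler name for a URI, or NULL when no mapping applies.
 * Simplification: only the end of the URI is compared. */
static const char *handler_for_uri(const char *uri)
{
    size_t uri_len = strlen(uri);
    size_t i;

    for (i = 0; i < sizeof(mappings) / sizeof(mappings[0]); i++) {
        size_t suffix_len = strlen(mappings[i].suffix);

        if (uri_len >= suffix_len &&
            strcmp(uri + uri_len - suffix_len, mappings[i].suffix) == 0) {
            return mappings[i].handler;
        }
    }
    return NULL;
}
```

With this sketch, handler_for_uri("/index.html.sum") yields "example-handler", while a plain "/index.html" yields NULL, which corresponds to the server not hinting at our module at all.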
Hooking into httpd

To begin with, we only want to create a simple handler that replies to the client browser when a specific URL is requested, so we won’t bother setting up configuration handlers and directives just yet. Our initial module definition will look like this:

    module AP_MODULE_DECLARE_DATA example_module =
    {
        STANDARD20_MODULE_STUFF,
        NULL,
        NULL,
        NULL,
        NULL,
        NULL,
        register_hooks /* Our hook registering function */
    };

This lets the server know that we are not interested in anything fancy; we just want to hook onto the requests and possibly handle some of them.

The reference in our example declaration, register_hooks, is the name of a function we will create to manage how we hook onto the request process. In this example module, the function has just one purpose: to create a simple hook that gets called after all the rewrites, access control etc. have been handled. Thus, we will let the server know that we want to hook into its process as one of the last modules:

    static void register_hooks(apr_pool_t *pool)
    {
        /* Create a hook in the request handler, so we get called when a request arrives */
        ap_hook_handler(example_handler, NULL, NULL, APR_HOOK_LAST);
    }

The example_handler reference is the function that will handle the request. We will discuss how to create a handler in the next chapter.

Other useful hooks

Hooking into the request handling phase is but one of many hooks that you can create.
Some other ways of hooking are:

• ap_hook_child_init: Place a hook that executes when a child process is spawned (commonly used for initializing modules after the server has forked)
• ap_hook_pre_config: Place a hook that executes before any configuration data has been read (very early hook)
• ap_hook_post_config: Place a hook that executes after configuration has been parsed, but before the server has forked
• ap_hook_translate_name: Place a hook that executes when a URI needs to be translated into a filename on the server (think mod_rewrite)
• ap_hook_quick_handler: Similar to ap_hook_handler, except it is run before any other request hooks (translation, auth, fixups etc.)
• ap_hook_log_transaction: Place a hook that executes when the server is about to add a log entry of the current request

Building a handler

A handler is essentially a function that receives a callback when a request to the server is made. It is passed a record of the current request (how it was made, which headers and requests were passed along, who’s giving the request and so on), and is put in charge of either telling the server that it’s not interested in the request or handling the request with the tools provided.

A simple "Hello, world!" handler

Let’s start off by making a very simple request handler that does the following:

1. Check that this is a request that should be served by "example-handler"
2. Set the content type of our output to text/html
3. Write "Hello, world!" back to the client browser
4. Let the server know that we took care of this request and everything went fine

In C code, our example handler will now look like this:

    static int example_handler(request_rec *r)
    {
        /* First off, we need to check if this is a call for the "example-handler" handler.
         * If it is, we accept it and do our things, if not, we simply return DECLINED,
         * and the server will try somewhere else.
         */
        if (!r->handler || strcmp(r->handler, "example-handler")) return (DECLINED);

        /* Now that we are handling this request, we'll write out "Hello, world!" to the client.
         * To do so, we must first set the appropriate content type, followed by our output.
         */
        ap_set_content_type(r, "text/html");
        ap_rprintf(r, "Hello, world!");

        /* Lastly, we must tell the server that we took care of this request and everything went fine.
         * We do so by simply returning the value OK to the server.
         */
        return OK;
    }

Now, we put all we have learned together and end up with a program that looks like mod_example_1.c [10]. The functions used in this example will be explained later in the section "Some useful functions you should know".

[10] http://people.apache.org/~humbedooh/mods/examples/mod_example_1.c

The request_rec structure

The most essential part of any request is the request record. In a call to a handler function, this is represented by the request_rec* structure passed along with every call that is made. This struct, typically just referred to as r in modules, contains all the information you need for your module to fully process any HTTP request and respond accordingly.

Some key elements of the request_rec structure are:

• r->handler (char*): Contains the name of the handler the server is currently asking to do the handling of this request
• r->method (char*): Contains the HTTP method being used, e.g. GET or POST
• r->filename (char*): Contains the translated filename the client is requesting
• r->args (char*): Contains the query string of the request, if any
• r->headers_in (apr_table_t*): Contains all the headers sent by the client
• r->connection (conn_rec*): A record containing information about the current connection
• r->user (char*): If the URI requires authentication, this is set to the username provided
• r->useragent_ip (char*): The IP address of the client connecting to us
• r->pool (apr_pool_t*): The memory pool of this request. We’ll discuss this in the "Memory management" chapter.

A complete list of all the values contained within the request_rec structure can be found in the httpd.h [11] header file or at http://ci.apache.org/projects/httpd/trunk/doxygen/structrequest__rec.html.

[11] http://svn.apache.org/repos/asf/httpd/httpd/trunk/include/httpd.h

Let’s try out some of these variables in another example handler:

    static int example_handler(request_rec *r)
    {
        /* Set the appropriate content type */
        ap_set_content_type(r, "text/html");

        /* Print out the IP address of the client connecting to us: */
        ap_rprintf(r, "<h2>Hello, %s!</h2>", r->useragent_ip);

        /* If we were reached through a GET or a POST request, be happy, else sad. */
        if ( !strcmp(r->method, "POST") || !strcmp(r->method, "GET") ) {
            ap_rputs("You used a GET or a POST method, that makes us happy!<br/>", r);
        }
        else {
            ap_rputs("You did not use POST or GET, that makes us sad :(<br/>", r);
        }

        /* Lastly, if there was a query string, let's print that too! */
        if (r->args) {
            ap_rprintf(r, "Your query string was: %s", r->args);
        }
        return OK;
    }

Return values

Apache relies on return values from handlers to signify whether a request was handled or not, and if so, whether the request went well or not. If a module is not interested in handling a specific request, it should always return the value DECLINED. If it is handling a request, it should either return the generic value OK, or a specific HTTP status code, for example:

    static int example_handler(request_rec *r)
    {
        /* Return 404: Not found */
        return HTTP_NOT_FOUND;
    }

Returning OK or an HTTP status code does not necessarily mean that the request will end. The server may still have other handlers that are interested in this request, for instance the logging modules which, upon a successful request, will write down a summary of what was requested and how it went. To do a full stop and prevent any further processing after your module is done, you can return the value DONE to let the server know that it should cease all activity on this request and carry on with the next, without informing other handlers.
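The DECLINED convention can be pictured as the server walking down a list of handlers until one of them claims the request. The following is a minimal sketch of that idea under stated assumptions: fake_request, handler_fn, sum_handler, default_handler and run_handlers are hypothetical names made up for this illustration, and the three constants are local stand-ins so that the sketch compiles without the httpd headers. It is not the server’s actual dispatch code.

```c
#include <stddef.h>
#include <string.h>

/* Local stand-ins for the constants a real module gets from httpd.h. */
#define DECLINED        (-1)
#define OK              0
#define HTTP_NOT_FOUND  404

/* Hypothetical, stripped-down request record for this sketch only. */
typedef struct {
    const char *handler;   /* plays the role of r->handler */
    const char *filename;  /* plays the role of r->filename; NULL means "no file" */
} fake_request;

typedef int (*handler_fn)(const fake_request *r);

/* Declines anything not tagged for us; otherwise reports success or 404. */
static int sum_handler(const fake_request *r)
{
    if (!r->handler || strcmp(r->handler, "example-handler"))
        return DECLINED;
    if (!r->filename)
        return HTTP_NOT_FOUND;
    return OK;
}

/* A catch-all that accepts whatever reaches it. */
static int default_handler(const fake_request *r)
{
    (void)r;
    return OK;
}

/* Ask each handler in turn; the first answer other than DECLINED wins. */
static int run_handlers(const handler_fn *handlers, size_t n, const fake_request *r)
{
    size_t i;

    for (i = 0; i < n; i++) {
        int rv = handlers[i](r);
        if (rv != DECLINED)
            return rv;
    }
    return DECLINED;
}
```

In this sketch a request tagged "example-handler" with no file behind it stops at sum_handler with HTTP_NOT_FOUND, while a request tagged for some other handler falls through to default_handler.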
General response codes:

• DECLINED: We are not handling this request
• OK: We handled this request and it went well
• DONE: We handled this request and the server should just close this thread without further processing

HTTP specific return codes (excerpt):

• HTTP_OK (200): Request was okay
• HTTP_MOVED_PERMANENTLY (301): The resource has moved to a new URL
• HTTP_UNAUTHORIZED (401): Client is not authorized to visit this page
• HTTP_FORBIDDEN (403): Permission denied
• HTTP_NOT_FOUND (404): File not found
• HTTP_INTERNAL_SERVER_ERROR (500): Internal server error (self explanatory)

Some useful functions you should know

• ap_rputs(const char *string, request_rec *r): Sends a string of text to the client. This is a shorthand version of ap_rwrite [12].

    ap_rputs("Hello, world!", r);

• ap_rprintf [13]: This function works just like printf, except it sends the result to the client.

    ap_rprintf(r, "Hello, %s!", r->useragent_ip);

• ap_set_content_type [14] (request_rec *r, const char *type): Sets the content type of the output you are sending.

    ap_set_content_type(r, "text/plain"); /* force a raw text output */

Memory management

Managing your resources in Apache HTTP Server 2.4 is quite easy, thanks to the memory pool system. In essence, each server, connection and request has its own memory pool that gets cleaned up when its scope ends, e.g. when a request is done or when a server process shuts down. All your module needs to do is latch onto this memory pool, and you won’t have to worry about having to clean up after yourself - pretty neat, huh?

In our module, we will primarily be allocating memory for each request, so it’s appropriate to use the r->pool reference when creating new objects.
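To get a feel for why a pool removes the need for individual cleanup, here is a toy sketch of the idea: every allocation is chained onto the pool, and a single call releases them all. This is emphatically not how APR implements pools; toy_pool, toy_palloc, toy_pstrdup and toy_pool_clear are hypothetical names used only for this illustration.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Header placed in front of every allocation. The union with max_align_t
 * keeps the memory that follows suitably aligned for any object type. */
typedef union toy_block {
    union toy_block *next;
    max_align_t align;
} toy_block;

/* Hypothetical toy pool: a chain of blocks plus a count of live allocations. */
typedef struct {
    toy_block *blocks;
    size_t live;
} toy_pool;

/* Allocate size bytes and remember the block in the pool. */
static void *toy_palloc(toy_pool *p, size_t size)
{
    toy_block *b = malloc(sizeof(toy_block) + size);

    if (!b)
        return NULL;
    b->next = p->blocks;
    p->blocks = b;
    p->live++;
    return b + 1;   /* usable memory starts right after the header */
}

/* Duplicate a string into pool-owned memory so it can be edited. */
static char *toy_pstrdup(toy_pool *p, const char *s)
{
    size_t n = strlen(s) + 1;
    char *copy = toy_palloc(p, n);

    if (copy)
        memcpy(copy, s, n);
    return copy;
}

/* Release everything the pool handed out, in one go. */
static void toy_pool_clear(toy_pool *p)
{
    while (p->blocks) {
        toy_block *next = p->blocks->next;
        free(p->blocks);
        p->blocks = next;
        p->live--;
    }
}
```

A caller never frees the individual results of toy_palloc or toy_pstrdup; it only clears the pool when the scope ends, which is the same division of labour a module enjoys with r->pool.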
[12] http://ci.apache.org/projects/httpd/trunk/doxygen/group__APACHE__CORE__PROTO.html#gac827cd0537d2b6213a7c06d7c26cc36e
[13] http://ci.apache.org/projects/httpd/trunk/doxygen/group__APACHE__CORE__PROTO.html#ga5e91eb6ca777c9a427b2e82bf1eeb81d
[14] http://ci.apache.org/projects/httpd/trunk/doxygen/group__APACHE__CORE__PROTO.html#gaa2f8412c400197338ec509f4a45e4579

A few of the functions for allocating memory within a pool are:

• void* apr_palloc [15] (apr_pool_t *p, apr_size_t size): Allocates size number of bytes in the pool for you
• void* apr_pcalloc [16] (apr_pool_t *p, apr_size_t size): Allocates size number of bytes in the pool for you and sets all bytes to 0
• char* apr_pstrdup [17] (apr_pool_t *p, const char *s): Creates a duplicate of the string s. This is useful for copying constant values so you can edit them
• char* apr_psprintf [18] (apr_pool_t *p, const char *fmt, ...): Similar to sprintf, except the server supplies you with an appropriately allocated target variable

Let's put these functions into an example handler:

static int example_handler(request_rec *r)
{
    const char *original = "You can't edit this!";
    char *copy;
    int *integers;

    /* Allocate space for 10 integer values and set them all to zero. */
    integers = apr_pcalloc(r->pool, sizeof(int)*10);

    /* Create a copy of the 'original' variable that we can edit. */
    copy = apr_pstrdup(r->pool, original);
    return OK;
}

This is all well and good for our module, which won't need any pre-initialized variables or structures.
However, if we wanted to initialize something early on, before the requests come rolling in, we could simply add a call to a function in our register_hooks function to sort it out:

static void register_hooks(apr_pool_t *pool)
{
    /* Call a function that initializes some stuff */
    example_init_function(pool);
    /* Create a hook in the request handler, so we get called when a request arrives */
    ap_hook_handler(example_handler, NULL, NULL, APR_HOOK_LAST);
}

In this pre-request initialization function we would not be using the same pool as we did when allocating resources for request-based functions. Instead, we would use the pool given to us by the server for allocating memory on a per-process level.

[15] http://apr.apache.org/docs/apr/1.4/group__apr__pools.html#ga85f1e193c31d109affda72f9a92c6915
[16] http://apr.apache.org/docs/apr/1.4/group__apr__pools.html#gaf61c098ad258069d64cdf8c0a9369f9e
[17] http://apr.apache.org/docs/apr/1.4/group__apr__strings.html#gabc79e99ff19abbd7cfd18308c5f85d47
[18] http://apr.apache.org/docs/apr/1.4/group__apr__strings.html#ga3eca76b8d293c5c3f8021e45eda813d8

Parsing request data

In our example module, we would like to add a feature that checks which type of digest, MD5 or SHA1, the client would like to see. This could be solved by adding a query string to the request. A query string is typically composed of several keys and values put together in a string, for instance valueA=yes&valueB=no&valueC=maybe. It is up to the module itself to parse these and get the data it requires. In our example, we'll be looking for a key called digest, and if set to md5, we'll produce an MD5 digest, otherwise we'll produce a SHA1 digest.

Since the introduction of Apache HTTP Server 2.4, parsing request data from GET and POST requests has never been easier.
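For comparison, this is roughly what picking one key out of a query string by hand looks like. It is a plain C sketch with a hypothetical helper, toy_query_get, that is not part of the server API; it does no URL-decoding and simply returns the first matching key.

```c
#include <stddef.h>
#include <string.h>

/*
 * Look up `key` in a query string such as "valueA=yes&valueB=no" and copy
 * its value into `out` (at most outsize-1 characters, always terminated).
 * Returns 1 if the key was found, 0 otherwise.
 */
static int toy_query_get(const char *query, const char *key,
                         char *out, size_t outsize)
{
    size_t keylen = strlen(key);
    const char *p = query;

    if (outsize == 0)
        return 0;

    while (p && *p) {
        const char *end = strchr(p, '&');   /* end of this key=value pair */
        size_t pairlen = end ? (size_t)(end - p) : strlen(p);

        /* A match is "key=" at the very start of the pair. */
        if (pairlen > keylen && strncmp(p, key, keylen) == 0
            && p[keylen] == '=') {
            size_t vallen = pairlen - keylen - 1;
            if (vallen >= outsize)
                vallen = outsize - 1;       /* truncate to fit the buffer */
            memcpy(out, p + keylen + 1, vallen);
            out[vallen] = '\0';
            return 1;
        }
        p = end ? end + 1 : NULL;           /* move on to the next pair */
    }
    return 0;
}
```

Calling toy_query_get("valueA=yes&digest=md5", "digest", buf, sizeof buf) copies "md5" into buf. Even this small helper has to deal with pair boundaries, prefix collisions such as "digestive" versus "digest", and buffer sizes, which is the kind of work the server's own parsing helpers take off your hands.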
All we require to parse both GET and POST data is four simple lines:

apr_table_t *GET;
apr_array_header_t *POST;

ap_args_to_table(r, &GET);
ap_parse_form_data(r, NULL, &POST, -1, 8192);

In our specific example module, we're looking for the digest value from the query string, which now resides inside a table called GET. To extract this value, we need only perform a simple operation:

/* Get the "digest" key from the query string, if any. */
const char *digestType = apr_table_get(GET, "digest");

/* If no key was returned, we will set a default value instead. */
if (!digestType) digestType = "sha1";

The structures used for the POST and GET data are not exactly the same, so if we were to fetch a value from POST data instead of the query string, we would have to resort to a few more lines, as outlined in the example "Retrieve variables from POST form data" in the last chapter of this document.

Making an advanced handler

Now that we have learned how to parse form data and manage our resources, we can move on to creating an advanced version of our module that spits out the MD5 or SHA1 digest of files:

static int example_handler(request_rec *r)
{
    int rc, exists;
    apr_finfo_t finfo;
    apr_file_t *file;
    char *filename;
    char buffer[256];
    apr_size_t readBytes;
    int n;
    apr_table_t *GET;
    apr_array_header_t *POST;
    const char *digestType;

    /* Check that the "example-handler" handler is being called. */
    if (!r->handler || strcmp(r->handler, "example-handler")) return (DECLINED);

    /* Figure out which file is being requested by removing the .sum from it */
    filename = apr_pstrdup(r->pool, r->filename);
    filename[strlen(filename)-4] = 0; /* Cut off the last 4 characters. */

    /* Figure out if the file we request a sum on exists and isn't a directory */
    rc = apr_stat(&finfo, filename, APR_FINFO_MIN, r->pool);
    if (rc == APR_SUCCESS) {
        exists = ( (finfo.filetype != APR_NOFILE) && !(finfo.filetype & APR_DIR) );
        if (!exists) return HTTP_NOT_FOUND; /* Return a 404 if not found. */
    }
    /* If apr_stat failed, we're probably not allowed to check this file. */
    else return HTTP_FORBIDDEN;

    /* Parse the GET and, optionally, the POST data sent to us */
    ap_args_to_table(r, &GET);
    ap_parse_form_data(r, NULL, &POST, -1, 8192);

    /* Set the appropriate content type */
    ap_set_content_type(r, "text/html");

    /* Print a title and some general information.
     * finfo.size is an apr_off_t, so it is printed with APR_OFF_T_FMT. */
    ap_rprintf(r, "<h2>Information on %s:</h2>", filename);
    ap_rprintf(r, "<b>Size:</b> %" APR_OFF_T_FMT " bytes<br/>", finfo.size);

    /* Get the digest type the client wants to see */
    digestType = apr_table_get(GET, "digest");
    if (!digestType) digestType = "MD5";

    rc = apr_file_open(&file, filename, APR_READ, APR_OS_DEFAULT, r->pool);
    if (rc == APR_SUCCESS) {

        /* Are we trying to calculate the MD5 or the SHA1 digest? */
        if (!strcasecmp(digestType, "md5")) {
            /* Calculate the MD5 sum of the file */
            union {
                char chr[16];
                uint32_t num[4];
            } digest;
            apr_md5_ctx_t md5;
            apr_md5_init(&md5);
            readBytes = 256;
            while ( apr_file_read(file, buffer, &readBytes) == APR_SUCCESS ) {
                apr_md5_update(&md5, buffer, readBytes);
            }
            apr_md5_final(digest.chr, &md5);

            /* Print out the MD5 digest */
            ap_rputs("<b>MD5: </b><code>", r);
            for (n = 0; n < APR_MD5_DIGESTSIZE/4; n++) {
                ap_rprintf(r, "%08x", digest.num[n]);
            }
            ap_rputs("</code>", r);
            /* Print a link to the SHA1 version */
            ap_rputs("<br/><a href='?digest=sha1'>View the SHA1 hash instead</a>", r);
        }
        else {
            /* Calculate the SHA1 sum of the file */
            union {
                char chr[20];
                uint32_t num[5];
            } digest;
            apr_sha1_ctx_t sha1;
            apr_sha1_init(&sha1);
            readBytes = 256;
            while ( apr_file_read(file, buffer, &readBytes) == APR_SUCCESS ) {
                apr_sha1_update(&sha1, buffer, readBytes);
            }
            apr_sha1_final(digest.chr, &sha1);

            /* Print out the SHA1 digest */
            ap_rputs("<b>SHA1: </b><code>", r);
            for (n = 0; n < APR_SHA1_DIGESTSIZE/4; n++) {
                ap_rprintf(r, "%08x", digest.num[n]);
            }
            ap_rputs("</code>", r);
            /* Print a link to the MD5 version */
            ap_rputs("<br/><a href='?digest=md5'>View the MD5 hash instead</a>", r);
        }
        apr_file_close(file);
    }

    /* Let the server know that we responded to this request. */
    return OK;
}

This version in its entirety can be found here: mod_example_2.c [19].

[19] http://people.apache.org/~humbedooh/mods/examples/mod_example_2.c

Adding configuration options

In this next segment of this document, we will turn our eyes away from the digest module and create a new example module, whose only function is to write out its own configuration. The purpose of this is to examine how the server works with configuration, and what happens when you start writing advanced configurations for your modules.

An introduction to configuration directives

If you are reading this, then you probably already know what a configuration directive is. Simply put, a directive is a way of telling an individual module (or a set of modules) how to behave. For example, these directives control how mod_rewrite works:

RewriteEngine On
RewriteCond "%{REQUEST_URI}" "^/foo/bar"
RewriteRule "^/foo/bar/(.*)$" "/foobar?page=$1"

Each of these configuration directives is handled by a separate function that parses the parameters given and sets up a configuration accordingly.

Making an example configuration

To begin with, we'll create a basic configuration in C-space:

typedef struct {
    int enabled;      /* Enable or disable our module */
    const char *path; /* Some path to...something */
    int typeOfAction; /* 1 means action A, 2 means action B and so on */
} example_config;

Now, let's put this into perspective by creating a very small module that just prints out a hard-coded configuration.
You'll notice that we use the register_hooks function for initializing the configuration values to their defaults:

typedef struct {
    int enabled;      /* Enable or disable our module */
    const char *path; /* Some path to...something */
    int typeOfAction; /* 1 means action A, 2 means action B and so on */
} example_config;

static example_config config;

static int example_handler(request_rec *r)
{
    if (!r->handler || strcmp(r->handler, "example-handler")) return(DECLINED);
    ap_set_content_type(r, "text/plain");
    ap_rprintf(r, "Enabled: %u\n", config.enabled);
    ap_rprintf(r, "Path: %s\n", config.path);
    ap_rprintf(r, "TypeOfAction: %x\n", config.typeOfAction);
    return OK;
}

static void register_hooks(apr_pool_t *pool)
{
    config.enabled = 1;
    config.path = "/foo/bar";
    config.typeOfAction = 0x00;
    ap_hook_handler(example_handler, NULL, NULL, APR_HOOK_LAST);
}

/* Define our module as an entity and assign a function for registering hooks */
module AP_MODULE_DECLARE_DATA example_module =
{
    STANDARD20_MODULE_STUFF,
    NULL,            /* Per-directory configuration handler */
    NULL,            /* Merge handler for per-directory configurations */
    NULL,            /* Per-server configuration handler */
    NULL,            /* Merge handler for per-server configurations */
    NULL,            /* Any directives we may have for httpd */
    register_hooks   /* Our hook registering function */
};

So far so good. To access our new handler, we could add the following to our configuration:

<Location "/example">
    SetHandler example-handler
</Location>

When we visit /example, we'll see our current configuration being spit out by our module.

Registering directives with the server

What if we want to change our configuration, not by hard-coding new values into the module, but by using either the httpd.conf file or possibly a .htaccess file? It's time to let the server know that we want this to be possible.
To do so, we must first change our name tag to include a reference to the configuration directives we want to register with the server:

module AP_MODULE_DECLARE_DATA example_module =
{
    STANDARD20_MODULE_STUFF,
    NULL,               /* Per-directory configuration handler */
    NULL,               /* Merge handler for per-directory configurations */
    NULL,               /* Per-server configuration handler */
    NULL,               /* Merge handler for per-server configurations */
    example_directives, /* Any directives we may have for httpd */
    register_hooks      /* Our hook registering function */
};

This will tell the server that we are now accepting directives from the configuration files, and that the structure called example_directives holds information on what our directives are and how they work. Since we have three different variables in our module configuration, we will add a structure with three directives and a NULL at the end:

static const command_rec example_directives[] =
{
    AP_INIT_TAKE1("exampleEnabled", example_set_enabled, NULL, RSRC_CONF, "Enable or disabl
    AP_INIT_TAKE1("examplePath", example_set_path, NULL, RSRC_CONF, "The path to whatever")
    AP_INIT_TAKE2("exampleAction", example_set_action, NULL, RSRC_CONF, "Special action val
    { NULL }
};

As you can see, each directive needs at least 5 parameters set:

1. AP_INIT_TAKE1 [20]: This is a macro that tells the server that this directive takes one and only one argument. If we required two arguments, we could use the macro AP_INIT_TAKE2 [21] and so on (refer to http_config.h for more macros).
2. exampleEnabled: This is the name of our directive. More precisely, it is what the user must put in their configuration in order to invoke a configuration change in our module.
3. example_set_enabled: This is a reference to a C function that parses the directive and sets the configuration accordingly. We will discuss how to make this in the following paragraph.
4. RSRC_CONF: This tells the server where the directive is permitted.
We'll go into details on this value in the later chapters, but for now, RSRC_CONF means that the server will only accept these directives in a server context.
5. "Enable or disable....": This is simply a brief description of what the directive does.

(The "missing" parameter in our definition, which is usually set to NULL, is an optional function that can be run after the initial function to parse the arguments has been run. This is usually omitted, as the function for verifying arguments might as well be used to set them.)

[20] http://ci.apache.org/projects/httpd/trunk/doxygen/group__APACHE__CORE__CONFIG.html#ga07c7d22ae17805e61204463326cf9c34
[21] http://ci.apache.org/projects/httpd/trunk/doxygen/group__APACHE__CORE__CONFIG.html#gafaec43534fcf200f37d9fecbf9247c21

The directive handler function

Now that we have told the server to expect some directives for our module, it's time to make a few functions for handling these. What the server reads in the configuration file(s) is text, and so naturally, what it passes along to our directive handler is one or more strings that we ourselves need to recognize and act upon. You'll notice that since we set our exampleAction directive to accept two arguments, its C function also has an additional parameter defined:

/* Handler for the "exampleEnabled" directive */
const char *example_set_enabled(cmd_parms *cmd, void *cfg, const char *arg)
{
    if(!strcasecmp(arg, "on")) config.enabled = 1;
    else config.enabled = 0;
    return NULL;
}

/* Handler for the "examplePath" directive */
const char *example_set_path(cmd_parms *cmd, void *cfg, const char *arg)
{
    config.path = arg;
    return NULL;
}

/* Handler for the "exampleAction" directive */
/* Let's pretend this one takes one argument (file or db), and a second (deny or allow), */
/* and we store it in a bit-wise manner. */
const char *example_set_action(cmd_parms *cmd, void *cfg, const char *arg1, const char *arg2)
{
    if(!strcasecmp(arg1, "file")) config.typeOfAction = 0x01;
    else config.typeOfAction = 0x02;

    if(!strcasecmp(arg2, "deny")) config.typeOfAction += 0x10;
    else config.typeOfAction += 0x20;
    return NULL;
}

Putting it all together

Now that we have our directives set up, and handlers configured for them, we can assemble our module into one big file:

/* mod_example_config_simple.c: */
#include <stdio.h>
#include "apr_hash.h"
#include "ap_config.h"
#include "ap_provider.h"
#include "httpd.h"
#include "http_core.h"
#include "http_config.h"
#include "http_log.h"
#include "http_protocol.h"
#include "http_request.h"

/*
 ==============================================================================
 Our configuration prototype and declaration:
 ==============================================================================
 */
typedef struct {
    int enabled;      /* Enable or disable our module */
    const char *path; /* Some path to...something */
    int typeOfAction; /* 1 means action A, 2 means action B and so on */
} example_config;

static example_config config;

/*
 ==============================================================================
 Our directive handlers:
 ==============================================================================
 */
/* Handler for the "exampleEnabled" directive */
const char *example_set_enabled(cmd_parms *cmd, void *cfg, const char *arg)
{
    if(!strcasecmp(arg, "on")) config.enabled = 1;
    else config.enabled = 0;
    return NULL;
}

/* Handler for the "examplePath" directive */
const char *example_set_path(cmd_parms *cmd, void *cfg, const char *arg)
{
    config.path = arg;
    return NULL;
}

/* Handler for the "exampleAction" directive */
/* Let's pretend this one takes one argument (file or db), and a second (deny or allow), */
/* and we store it in a bit-wise manner. */
const char *example_set_action(cmd_parms *cmd, void *cfg, const char *arg1, const char *arg2)
{
    if(!strcasecmp(arg1, "file")) config.typeOfAction = 0x01;
    else config.typeOfAction = 0x02;

    if(!strcasecmp(arg2, "deny")) config.typeOfAction += 0x10;
    else config.typeOfAction += 0x20;
    return NULL;
}

/*
 ==============================================================================
 The directive structure for our name tag:
 ==============================================================================
 */
static const command_rec example_directives[] =
{
    AP_INIT_TAKE1("exampleEnabled", example_set_enabled, NULL, RSRC_CONF, "Enable or disabl
    AP_INIT_TAKE1("examplePath", example_set_path, NULL, RSRC_CONF, "The path to whatever")
    AP_INIT_TAKE2("exampleAction", example_set_action, NULL, RSRC_CONF, "Special action val
    { NULL }
};

/*
 ==============================================================================
 Our module handler:
 ==============================================================================
 */
static int example_handler(request_rec *r)
{
    if(!r->handler || strcmp(r->handler, "example-handler")) return(DECLINED);
    ap_set_content_type(r, "text/plain");
    ap_rprintf(r, "Enabled: %u\n", config.enabled);
    ap_rprintf(r, "Path: %s\n", config.path);
    ap_rprintf(r, "TypeOfAction: %x\n", config.typeOfAction);
    return OK;
}

/*
 ==============================================================================
 The hook registration function (also initializes the default config values):
 ==============================================================================
 */
static void register_hooks(apr_pool_t *pool)
{
    config.enabled = 1;
    config.path = "/foo/bar";
    config.typeOfAction = 3;
    ap_hook_handler(example_handler, NULL, NULL, APR_HOOK_LAST);
}

/*
 ==============================================================================
 Our module name tag:
 ==============================================================================
 */
module AP_MODULE_DECLARE_DATA example_module =
{
    STANDARD20_MODULE_STUFF,
    NULL,               /* Per-directory configuration handler */
    NULL,               /* Merge handler for per-directory configurations */
    NULL,               /* Per-server configuration handler */
    NULL,               /* Merge handler for per-server configurations */
    example_directives, /* Any directives we may have for httpd */
    register_hooks      /* Our hook registering function */
};

In our httpd.conf file, we can now change the hard-coded configuration by adding a few lines:

ExampleEnabled On
ExamplePath "/usr/bin/foo"
ExampleAction file allow

And thus we apply the configuration, visit /example on our web site, and we see the configuration has adapted to what we wrote in our configuration file.

Context aware configurations

Introduction to context aware configurations

In Apache HTTP Server 2.4, different URLs, virtual hosts, directories etc. can have very different meanings to the user of the server, and thus different contexts within which modules must operate. For example, let's assume you have this configuration set up for mod_rewrite:

<Directory "/var/www">
    RewriteCond "%{HTTP_HOST}" "^example.com$"
    RewriteRule "(.*)" "http://www.example.com/$1"
</Directory>
<Directory "/var/www/sub">
    RewriteRule "^foobar$" "index.php?foobar=true"
</Directory>

In this example, you will have set up two different contexts for mod_rewrite:

1. Inside /var/www, all requests for http://example.com must go to http://www.example.com
2. Inside /var/www/sub, all requests for foobar must go to index.php?foobar=true

If mod_rewrite (or the entire server for that matter) wasn't context aware, then these rewrite rules would just apply to each and every request made, regardless of where and how it was made. Since the module can pull the context-specific configuration straight from the server, it does not need to know by itself which of the directives are valid in this context; the server takes care of this.

So how does a module get the specific configuration for the server, directory or location in question? It does so by making one simple call:

example_config *config = (example_config*) ap_get_module_config(r->per_dir_config, &example_module);

That's it! Of course, a whole lot goes on behind the scenes, which we will discuss in this chapter, starting with how the server came to know what our configuration looks like, and how it came to be set up as it is in the specific context.

Our basic configuration setup

In this chapter, we will be working with a slightly modified version of our previous context structure.
We will set a context variable that we can use to track which context configuration is being used by the server in various places:

typedef struct {
    char context[256];
    char path[256];
    int typeOfAction;
    int enabled;
} example_config;

Our handler for requests will also be modified, yet still very simple:

static int example_handler(request_rec *r)
{
    if(!r->handler || strcmp(r->handler, "example-handler")) return(DECLINED);
    example_config *config = (example_config*) ap_get_module_config(r->per_dir_config, &example_module);
    ap_set_content_type(r, "text/plain");
    ap_rprintf(r, "Enabled: %u\n", config->enabled);
    ap_rprintf(r, "Path: %s\n", config->path);
    ap_rprintf(r, "TypeOfAction: %x\n", config->typeOfAction);
    ap_rprintf(r, "Context: %s\n", config->context);
    return OK;
}

Choosing a context

Before we can start making our module context aware, we must first define which contexts we will accept. As we saw in the previous chapter, defining a directive required five elements be set:

AP_INIT_TAKE1("exampleEnabled", example_set_enabled, NULL, RSRC_CONF, "Enable or disable mo

The RSRC_CONF definition told the server that we would only allow this directive in a global server context, but since we are now trying out a context aware version of our module, we should set this to something more lenient, namely the value ACCESS_CONF, which lets us use the directive inside <Directory> and <Location> blocks.
For more control over the placement of your directives, you can combine the following restrictions together to form a specific rule:

• RSRC_CONF: Allow in .conf files (not .htaccess) outside <Directory> or <Location>
• ACCESS_CONF: Allow in .conf files (not .htaccess) inside <Directory> or <Location>
• OR_OPTIONS: Allow in .conf files and .htaccess when AllowOverride Options is set
• OR_FILEINFO: Allow in .conf files and .htaccess when AllowOverride FileInfo is set
• OR_AUTHCFG: Allow in .conf files and .htaccess when AllowOverride AuthConfig is set
• OR_INDEXES: Allow in .conf files and .htaccess when AllowOverride Indexes is set
• OR_ALL: Allow anywhere in .conf files and .htaccess

Using the server to allocate configuration slots

A much smarter way to manage your configurations is by letting the server help you create them. To do so, we must first change our name tag to let the server know that it should assist us in creating and managing our configurations. Since we have chosen the per-directory (or per-location) context for our module configurations, we'll add a per-directory creator and merger function reference in our tag:

module AP_MODULE_DECLARE_DATA example_module =
{
    STANDARD20_MODULE_STUFF,
    create_dir_conf, /* Per-directory configuration handler */
    merge_dir_conf,  /* Merge handler for per-directory configurations */
    NULL,            /* Per-server configuration handler */
    NULL,            /* Merge handler for per-server configurations */
    directives,      /* Any directives we may have for httpd */
    register_hooks   /* Our hook registering function */
};

Creating new context configurations

Now that we have told the server to help us create and manage configurations, our first step is to make a function for creating new, blank configurations. We do so by creating the function we just referenced in our name tag as the Per-directory configuration handler:

void *create_dir_conf(apr_pool_t *pool, char *context)
{
    context = context ? context : "(undefined context)";
    example_config *cfg = apr_pcalloc(pool, sizeof(example_config));
    if(cfg) {
        /* Set some default values */
        strcpy(cfg->context, context);
        cfg->enabled = 0;
        strcpy(cfg->path, "/foo/bar"); /* path is a char array, so copy into it */
        cfg->typeOfAction = 0x11;
    }
    return cfg;
}

Merging configurations

Our next step in creating a context aware configuration is merging configurations. This part of the process particularly applies to scenarios where you have a parent configuration and a child, such as the following:

<Directory "/var/www">
    ExampleEnabled On
    ExamplePath "/foo/bar"
    ExampleAction file allow
</Directory>
<Directory "/var/www/subdir">
    ExampleAction file deny
</Directory>

In this example, it is natural to assume that the directory /var/www/subdir should inherit the values set for the /var/www directory, as we did not specify an ExampleEnabled nor an ExamplePath for this directory. The server does not presume to know if this is true, but cleverly does the following:

1. Creates a new configuration for /var/www
2. Sets the configuration values according to the directives given for /var/www
3. Creates a new configuration for /var/www/subdir
4. Sets the configuration values according to the directives given for /var/www/subdir
5. Proposes a merge of the two configurations into a new configuration for /var/www/subdir

This proposal is handled by the merge_dir_conf function we referenced in our name tag. The purpose of this function is to assess the two configurations and decide how they are to be merged:

void *merge_dir_conf(apr_pool_t *pool, void *BASE, void *ADD)
{
    example_config *base = (example_config *) BASE; /* This is what was set in the parent context */
    example_config *add = (example_config *) ADD;   /* This is what is set in the new context */
    example_config *conf = (example_config *) create_dir_conf(pool, "Merged configuration");

    /* Merge configurations */
    conf->enabled = ( add->enabled == 0 ) ? base->enabled : add->enabled;
    conf->typeOfAction = add->typeOfAction ? add->typeOfAction : base->typeOfAction;
    strcpy(conf->path, strlen(add->path) ? add->path : base->path);
    return conf;
}

Trying out our new context aware configurations

Now, let's try putting it all together to create a new module that is context aware. First off, we'll create a configuration that lets us test how the module works:

SetHandler example-handler
ExampleEnabled on
ExamplePath "/foo/bar"
ExampleAction file allow
ExampleAction file deny
ExampleEnabled off
ExampleAction db deny
ExamplePath "/foo/bar/baz"
ExampleEnabled on

Then we'll assemble our module code. Note that since we are now using our name tag as reference when fetching configurations in our handler, I have added some prototypes to keep the compiler happy:

/*$6
 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 * mod_example_config.c
 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 */

#include <stdio.h>
#include "apr_hash.h"
#include "ap_config.h"
#include "ap_provider.h"
#include "httpd.h"
#include "http_core.h"
#include "http_config.h"
#include "http_log.h"
#include "http_protocol.h"
#include "http_request.h"

/*$1
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Configuration structure
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 */
typedef struct
{
    char context[256];
    char path[256];
    int typeOfAction;
    int enabled;
} example_config;

/*$1
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Prototypes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 */
static int  example_handler(request_rec *r);
const char  *example_set_enabled(cmd_parms *cmd, void *cfg, const char *arg);
const char  *example_set_path(cmd_parms *cmd, void *cfg, const char *arg);
const char  *example_set_action(cmd_parms *cmd, void *cfg, const char *arg1, const char *arg2);
void        *create_dir_conf(apr_pool_t *pool, char *context);
void        *merge_dir_conf(apr_pool_t *pool, void *BASE, void *ADD);
static void register_hooks(apr_pool_t *pool);

/*$1
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Configuration directives
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 */
static const command_rec directives[] =
{
    AP_INIT_TAKE1("exampleEnabled", example_set_enabled, NULL, ACCESS_CONF, "Enable or disa
    AP_INIT_TAKE1("examplePath", example_set_path, NULL, ACCESS_CONF, "The path to whatever
    AP_INIT_TAKE2("exampleAction", example_set_action, NULL, ACCESS_CONF, "Special action v
    { NULL }
};

/*$1
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 Our name tag
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 */
module AP_MODULE_DECLARE_DATA example_module =
{
    STANDARD20_MODULE_STUFF,
    create_dir_conf, /* Per-directory configuration handler */
    merge_dir_conf,  /* Merge handler for per-directory configurations */
    NULL,            /* Per-server configuration handler */
    NULL,            /* Merge handler for per-server configurations */
    directives,      /* Any directives we may have for httpd */
    register_hooks   /* Our hook registering function */
};

/*
 ==============================================================================
 Hook registration function
 ==============================================================================
 */
static void register_hooks(apr_pool_t *pool)
{
    ap_hook_handler(example_handler, NULL, NULL, APR_HOOK_LAST);
}

/*
 ==============================================================================
 Our example web service handler
 ==============================================================================
 */
static int example_handler(request_rec *r)
{
    if(!r->handler || strcmp(r->handler, "example-handler")) return(DECLINED);

    example_config *config = (example_config *) ap_get_module_config(r->per_dir_config, &example_module);

    ap_set_content_type(r, "text/plain");
    ap_rprintf(r, "Enabled: %u\n", config->enabled);
    ap_rprintf(r, "Path: %s\n", config->path);
    ap_rprintf(r, "TypeOfAction: %x\n", config->typeOfAction);
    ap_rprintf(r, "Context: %s\n", config->context);
    return OK;
}

/*
 ==============================================================================
 Handler for the "exampleEnabled" directive
 ==============================================================================
 */
const char *example_set_enabled(cmd_parms *cmd, void *cfg, const char *arg)
{
    example_config *conf = (example_config *) cfg;

    if(conf)
    {
        if(!strcasecmp(arg, "on"))
            conf->enabled = 1;
        else
            conf->enabled = 0;
    }
    return NULL;
}

/*
 ==============================================================================
 Handler for the "examplePath" directive
 ==============================================================================
 */
const char *example_set_path(cmd_parms *cmd, void *cfg, const char *arg)
{
    example_config *conf = (example_config *) cfg;

    if(conf)
    {
        strcpy(conf->path, arg);
    }
    return NULL;
}

/*
 ==============================================================================
 Handler for the "exampleAction" directive ;
 Let's pretend this one takes one argument (file or db), and a second (deny or allow), ;
 and we store it in a bit-wise manner.
 ==============================================================================
 */
const char *example_set_action(cmd_parms *cmd, void *cfg, const char *arg1, const char *arg2)
{
    example_config *conf = (example_config *) cfg;

    if(conf)
    {
        if(!strcasecmp(arg1, "file"))
            conf->typeOfAction = 0x01;
        else
            conf->typeOfAction = 0x02;
        if(!strcasecmp(arg2, "deny"))
            conf->typeOfAction += 0x10;
        else
            conf->typeOfAction += 0x20;
    }
    return NULL;
}

/*
 ==============================================================================
 Function for creating new configurations for per-directory contexts
 ==============================================================================
 */
void *create_dir_conf(apr_pool_t *pool, char *context)
{
    context = context ? context : "Newly created configuration";

    example_config *cfg = apr_pcalloc(pool, sizeof(example_config));

    if(cfg)
    {
        /* Set some default values */
        strcpy(cfg->context, context);
        cfg->enabled = 0;
        memset(cfg->path, 0, 256);
        cfg->typeOfAction = 0x00;
    }
    return cfg;
}

/*
 ==============================================================================
 Merging function for configurations
 ==============================================================================
 */
void *merge_dir_conf(apr_pool_t *pool, void *BASE, void *ADD)
{
    example_config *base = (example_config *) BASE;
    example_config *add = (example_config *) ADD;
    example_config *conf = (example_config *) create_dir_conf(pool, "Merged configuration");

    conf->enabled = (add->enabled == 0) ? base->enabled : add->enabled;
    conf->typeOfAction = add->typeOfAction ? add->typeOfAction : base->typeOfAction;
    strcpy(conf->path, strlen(add->path) ? add->path : base->path);
    return conf;
}

Summing up

We have now looked at how to create simple modules for Apache HTTP Server 2.4 and how to configure them. What you do next is entirely up to you, but it is my hope that something valuable has come out of reading this documentation. If you have questions on how to further develop modules, you are welcome to join our mailing lists [22] or check out the rest of our documentation for further tips.

[22] http://httpd.apache.org/lists.html

Some useful snippets of code

Retrieve variables from POST form data

typedef struct {
    const char *key;
    const char *value;
} keyValuePair;

keyValuePair *readPost(request_rec *r)
{
    apr_array_header_t *pairs = NULL;
    apr_off_t len;
    apr_size_t size;
    int res;
    int i = 0;
    char *buffer;
    keyValuePair *kvp;

    res = ap_parse_form_data(r, NULL, &pairs, -1, HUGE_STRING_LEN);
    if (res != OK || !pairs) return NULL; /* Return NULL if we failed or if there is no POST data */
    kvp = apr_pcalloc(r->pool, sizeof(keyValuePair) * (pairs->nelts + 1));
    while (pairs && !apr_is_empty_array(pairs)) {
        ap_form_pair_t *pair = (ap_form_pair_t *) apr_array_pop(pairs);
        apr_brigade_length(pair->value, 1, &len);
        size = (apr_size_t) len;
        buffer = apr_palloc(r->pool, size + 1);
        apr_brigade_flatten(pair->value, buffer, &size);
        buffer[len] = 0;
        kvp[i].key = apr_pstrdup(r->pool, pair->name);
        kvp[i].value = buffer;
        i++;
    }
    return kvp;
}

static int example_handler(request_rec *r)
{
    keyValuePair *formData;

    formData = readPost(r);
    if (formData) {
        int i;
        for (i = 0; &formData[i]; i++) {
            if (formData[i].key && formData[i].value) {
                ap_rprintf(r, "%s = %s\n", formData[i].key, formData[i].value);
            } else if (formData[i].key) {
                ap_rprintf(r, "%s\n", formData[i].key);
            } else if (formData[i].value) {
                ap_rprintf(r, "= %s\n", formData[i].value);
            } else {
                break; /* An entry with neither key nor value marks the end */
            }
        }
    }
    return OK;
}

Printing out every HTTP header received

static int example_handler(request_rec *r)
{
    const apr_array_header_t *fields;
    int i; 1068 CHAPTER 11. 
DEVELOPER DOCUMENTATION apr_table_entry_t *e = 0; /*˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜*/ fields = apr_table_elts(r->headers_in); e = (apr_table_entry_t *) fields->elts; for(i = 0; i < fields->nelts; i++) { ap_rprintf(r, "%s: %s\n", e[i].key, e[i].val); } return OK; } Reading the request body into memory static int util_read(request_rec *r, const char **rbuf, apr_off_t *size) { /*˜˜˜˜˜˜˜˜*/ int rc = OK; /*˜˜˜˜˜˜˜˜*/ if((rc = ap_setup_client_block(r, REQUEST_CHUNKED_ERROR))) { return(rc); } if(ap_should_client_block(r)) { /*˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜*/ char argsbuffer[HUGE_STRING_LEN]; apr_off_t rsize, len_read, rpos = 0; apr_off_t length = r->remaining; /*˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜*/ *rbuf = (const char *) apr_pcalloc(r->pool, (apr_size_t) (length + 1)); *size = length; while((len_read = ap_get_client_block(r, argsbuffer, sizeof(argsbuffer))) > 0) { if((rpos + len_read) > length) { rsize = length - rpos; } else { rsize = len_read; } memcpy((char *) *rbuf + rpos, argsbuffer, (size_t) rsize); rpos += rsize; } } return(rc); } static int example_handler(request_rec *r) { /*˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜*/ apr_off_t size; const char *buffer; 11.4. DEVELOPING MODULES FOR THE APACHE HTTP SERVER 2.4 1069 /*˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜˜*/ if(util_read(r, &buffer, &size) == OK) { ap_rprintf(r, "We read a request body that was %" APR_OFF_T_FMT " bytes long", size } return OK; } 1070 11.5 CHAPTER 11. DEVELOPER DOCUMENTATION Documenting code in Apache 2.4 Apache 2.4 uses Doxygen23 to document the APIs and global variables in the code. This will explain the basics of how to document using Doxygen. Brief Description To start a documentation block, use /** To end a documentation block, use */ In the middle of the block, there are multiple tags we can use: Description of this functions purpose @param parameter name description @return description @deffunc signature of the function The deffunc is not always necessary. 
Doxygen does not have a full parser in it, so any prototype that uses a macro in the return type declaration is too complex for scandoc. Those functions require a deffunc. An example (using &gt; rather than >):

    /**
     * return the final element of the pathname
     * @param pathname The path to get the final element of
     * @return the final element of the path
     * @tip Examples:
     * <pre>
     *                 "/foo/bar/gum"   -&gt; "gum"
     *                 "/foo/bar/gum/"  -&gt; ""
     *                 "gum"            -&gt; "gum"
     *                 "wi\\n32\\stuff" -&gt; "stuff"
     * </pre>
     * @deffunc const char * ap_filename_of_pathname(const char *pathname)
     */

At the top of the header file, always include:

    /**
     * @package Name of library header
     */

Doxygen uses a new HTML file for each package. The HTML files are named {Name of library header}.html, so try to be concise with your names.

For a further discussion of the possibilities please refer to the Doxygen site [24].

[23] http://www.doxygen.org/
[24] http://www.doxygen.org/

11.6 Hook Functions in the Apache HTTP Server 2.x

Warning: This document is still in development and may be partially out of date.

In general, a hook function is one that the Apache HTTP Server will call at some point during the processing of a request. Modules can provide functions that are called, and specify when they get called in comparison to other modules.

Core Hooks

The httpd's core modules offer a predefined list of hooks used during the standard request processing (p. 1078) phase. Creating a new hook will expose a function that implements it (see sections below), but it is essential to understand that you will not extend the httpd's core hooks. Their presence and order in the request processing is in fact a consequence of how they are called in server/request.c (check this section (p. 1042) for an overview). The core hooks are listed in the doxygen documentation [25].

Reading the guide for developing modules (p. 1042) and request processing (p. 1078) before proceeding is highly recommended.

Creating a hook function

In order to create a new hook, four things need to be done:

Declare the hook function

Use the AP_DECLARE_HOOK macro, which needs to be given the return type of the hook function, the name of the hook, and the arguments.
For example, if the hook returns an int and takes a request_rec * and an int and is called do_something, then declare it like this:

    AP_DECLARE_HOOK(int, do_something, (request_rec *r, int n))

This should go in a header which modules will include if they want to use the hook.

Create the hook structure

Each source file that exports a hook has a private structure which is used to record the module functions that use the hook. This is declared as follows:

    APR_HOOK_STRUCT(
        APR_HOOK_LINK(do_something)
        ...
    )

Implement the hook caller

The source file that exports the hook has to implement a function that will call the hook. There are currently three possible ways to do this. In all cases, the calling function is called ap_run_hookname().

[25] https://ci.apache.org/projects/httpd/trunk/doxygen/group__hooks.html

Void hooks

If the return value of a hook is void, then all the hooks are called, and the caller is implemented like this:

    AP_IMPLEMENT_HOOK_VOID(do_something, (request_rec *r, int n), (r, n))

The second and third arguments are the dummy argument declaration and the dummy arguments as they will be used when calling the hook. In other words, this macro expands to something like this:

    void ap_run_do_something(request_rec *r, int n)
    {
        ...
        do_something(r, n);
    }

Hooks that return a value

If the hook returns a value, then it can either be run until the first hook that does something interesting, like so:

    AP_IMPLEMENT_HOOK_RUN_FIRST(int, do_something, (request_rec *r, int n), (r, n), DECLINED)

The first hook that does not return DECLINED stops the loop and its return value is returned from the hook caller. Note that DECLINED is the traditional hook return value meaning "I didn't do anything", but it can be whatever suits you.

Alternatively, all hooks can be run until an error occurs. This boils down to permitting two return values, one of which means "I did something, and it was OK" and the other meaning "I did nothing".
The first function that returns a value other than one of those two stops the loop, and its return is the return value. Declare these like so:

    AP_IMPLEMENT_HOOK_RUN_ALL(int, do_something, (request_rec *r, int n), (r, n), OK, DECLINED)

Again, OK and DECLINED are the traditional values. You can use what you want.

Call the hook callers

At appropriate moments in the code, call the hook caller, like so:

    int n, ret;
    request_rec *r;

    ret=ap_run_do_something(r, n);

Hooking the hook

A module that wants a hook to be called needs to do two things.

Implement the hook function

Include the appropriate header, and define a static function of the correct type:

    static int my_something_doer(request_rec *r, int n)
    {
        ...
        return OK;
    }

Add a hook registering function

During initialisation, the server will call each module's hook registering function, which is included in the module structure:

    static void my_register_hooks()
    {
        ap_hook_do_something(my_something_doer, NULL, NULL, APR_HOOK_MIDDLE);
    }

    module MODULE_VAR_EXPORT my_module =
    {
        ...
        my_register_hooks       /* register hooks */
    };

Controlling hook calling order

In the example above, we didn't use the three arguments in the hook registration function that control calling order of all the functions registered within the hook. There are two mechanisms for doing this. The first, rather crude, method allows us to specify roughly where the hook is run relative to other modules. The final argument controls this. There are three possible values: APR_HOOK_FIRST, APR_HOOK_MIDDLE and APR_HOOK_LAST.

All modules using any particular value may be run in any order relative to each other, but, of course, all modules using APR_HOOK_FIRST will be run before APR_HOOK_MIDDLE which are before APR_HOOK_LAST. Modules that don't care when they are run should use APR_HOOK_MIDDLE. These values are spaced out, so that positions like APR_HOOK_FIRST-2 are possible to hook slightly earlier than other functions.
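
The effect of the position argument can be illustrated with a small stand-alone model. This is only a sketch, not the server's hook machinery: the toy_ names are hypothetical, the numeric values are assumed to mirror the ones in apr_hooks.h, and the predecessor and successor lists are ignored. It shows that a stable sort on the position value keeps registration order among equal positions, and that FIRST-2 lands ahead of FIRST.

```c
#include <stddef.h>

/* Hypothetical stand-ins; the values are assumed to mirror apr_hooks.h. */
enum {
    TOY_HOOK_REALLY_FIRST = -10,
    TOY_HOOK_FIRST        = 0,
    TOY_HOOK_MIDDLE       = 10,
    TOY_HOOK_LAST         = 20,
    TOY_HOOK_REALLY_LAST  = 30
};

/* One registered hook function: who registered it and where it wants to run. */
typedef struct {
    const char *module;
    int order;
} toy_hook_entry;

/* Stable insertion sort by position: entries with equal positions keep
 * the order in which they were registered. */
static void toy_sort_hooks(toy_hook_entry *hooks, size_t n)
{
    size_t i, j;

    for (i = 1; i < n; i++) {
        toy_hook_entry cur = hooks[i];
        /* Strict '>' is what makes the sort stable. */
        for (j = i; j > 0 && hooks[j - 1].order > cur.order; j--) {
            hooks[j] = hooks[j - 1];
        }
        hooks[j] = cur;
    }
}
```

Registering five hypothetical modules as MIDDLE, LAST, FIRST, MIDDLE and FIRST-2 and then sorting yields the FIRST-2 entry, then FIRST, then the two MIDDLE entries in their registration order, then LAST.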
Note that there are two more values, APR_HOOK_REALLY_FIRST and APR_HOOK_REALLY_LAST. These should only be used by the hook exporter.

The other method allows finer control. When a module knows that it must be run before (or after) some other modules, it can specify them by name. The second (third) argument is a NULL-terminated array of strings consisting of the names of modules that must be run before (after) the current module. For example, suppose we want "mod_xyz.c" and "mod_abc.c" to run before we do, then we'd hook as follows:

    static void register_hooks()
    {
        static const char * const aszPre[] = { "mod_xyz.c", "mod_abc.c", NULL };

        ap_hook_do_something(my_something_doer, aszPre, NULL, APR_HOOK_MIDDLE);
    }

Note that the sort used to achieve this is stable, so ordering set by APR_HOOK_ORDER is preserved, as far as is possible.

11.7 Converting Modules from Apache 1.3 to Apache 2.0

This is a first attempt at writing the lessons I learned when trying to convert the mod_mmap_static module to Apache 2.0. It's by no means definitive and probably won't even be correct in some ways, but it's a start.

The easier changes ...

Cleanup Routines

These now need to be of type apr_status_t and return a value of that type. Normally the return value will be APR_SUCCESS unless there is some need to signal an error in the cleanup. Be aware that even though you signal an error, not all code yet checks and acts upon the error.

Initialisation Routines

These should now be renamed to better signify where they sit in the overall process. So the name gets a small change from mmap_init to mmap_post_config. The arguments passed have undergone a radical change and now look like

• apr_pool_t *p
• apr_pool_t *plog
• apr_pool_t *ptemp
• server_rec *s

Data Types

A lot of the data types have been moved into the APR [26]. This means that some have had a name change, such as the one shown above.
The following is a brief list of some of the changes that you are likely to have to make.

• pool becomes apr_pool_t
• table becomes apr_table_t

[26] http://apr.apache.org/

The messier changes...

Register Hooks

The new architecture uses a series of hooks to provide for calling your functions. These you'll need to add to your module by way of a new function, static void register_hooks(void). The function is really reasonably straightforward once you understand what needs to be done. Each function that needs calling at some stage in the processing of a request needs to be registered; handlers do not. There are a number of phases where functions can be added, and for each you can specify with a high degree of control the relative order that the function will be called in.

This is the code that was added to mod_mmap_static:

    static void register_hooks(void)
    {
        static const char * const aszPre[]={ "http_core.c",NULL };
        ap_hook_post_config(mmap_post_config,NULL,NULL,HOOK_MIDDLE);
        ap_hook_translate_name(mmap_static_xlat,aszPre,NULL,HOOK_LAST);
    }

This registers 2 functions that need to be called, one in the post_config stage (virtually every module will need this one) and one for the translate_name phase. Note that while there are different function names, the format of each is identical. So what is the format?

    ap_hook_phase_name(function_name, predecessors, successors, position);

There are 3 hook positions defined...

• HOOK_FIRST
• HOOK_MIDDLE
• HOOK_LAST

To define the position you use the position and then modify it with the predecessors and successors. Each of the modifiers can be a list of functions that should be called, either before the function is run (predecessors) or after the function has run (successors).
In the mod_mmap_static case I didn't care about the post_config stage, but the mmap_static_xlat must be called after the core module had done its name translation, hence the use of the aszPre to define a modifier to the position HOOK_LAST.

Module Definition

There are now a lot fewer stages to worry about when creating your module definition. The old definition looked like

    module MODULE_VAR_EXPORT module_name_module =
    {
        STANDARD_MODULE_STUFF,
        /* initializer */
        /* dir config creater */
        /* dir merger --- default is to override */
        /* server config */
        /* merge server config */
        /* command handlers */
        /* handlers */
        /* filename translation */
        /* check_user_id */
        /* check auth */
        /* check access */
        /* type_checker */
        /* fixups */
        /* logger */
        /* header parser */
        /* child_init */
        /* child_exit */
        /* post read-request */
    };

The new structure is a great deal simpler...

    module MODULE_VAR_EXPORT module_name_module =
    {
        STANDARD20_MODULE_STUFF,
        /* create per-directory config structures */
        /* merge per-directory config structures  */
        /* create per-server config structures    */
        /* merge per-server config structures     */
        /* command handlers */
        /* handlers */
        /* register hooks */
    };

Some of these read directly across, some don't. I'll try to summarise what should be done below.

The stages that read directly across (old on the left, new on the right):

    /* dir config creater */     /* create per-directory config structures */
    /* server config */          /* create per-server config structures */
    /* dir merger */             /* merge per-directory config structures */
    /* merge server config */    /* merge per-server config structures */
    /* command table */          /* command apr_table_t */
    /* handlers */               /* handlers */

The remainder of the old functions should be registered as hooks. There are the following hook stages defined so far...
ap_hook_pre_config
    do any setup required prior to processing configuration directives
ap_hook_check_config
    review configuration directive interdependencies
ap_hook_test_config
    executes only with -t option
ap_hook_open_logs
    open any specified logs
ap_hook_post_config
    this is where the old init routines get registered
ap_hook_http_method
    retrieve the http method from a request. (legacy)
ap_hook_auth_checker
    check if the resource requires authorization
ap_hook_access_checker
    check for module-specific restrictions
ap_hook_check_user_id
    check the user-id and password
ap_hook_default_port
    retrieve the default port for the server
ap_hook_pre_connection
    do any setup required just before processing, but after accepting
ap_hook_process_connection
    run the correct protocol
ap_hook_child_init
    call as soon as the child is started
ap_hook_create_request
    ??
ap_hook_fixups
    last chance to modify things before generating content
ap_hook_handler
    generate the content
ap_hook_header_parser
    lets modules look at the headers, not used by most modules, because they use post_read_request for this
ap_hook_insert_filter
    to insert filters into the filter chain
ap_hook_log_transaction
    log information about the request
ap_hook_optional_fn_retrieve
    retrieve any functions registered as optional
ap_hook_post_read_request
    called after reading the request, before any other phase
ap_hook_quick_handler
    called before any request processing, used by cache modules.
ap_hook_translate_name
    translate the URI into a filename
ap_hook_type_checker
    determine and/or set the doc type

11.8 Request Processing in the Apache HTTP Server 2.x

Warning: this is a first (fast) draft that needs further revision!

Several changes in 2.0 and above affect the internal request processing mechanics. Module authors need to be aware of these changes so they may take advantage of the optimizations and security enhancements.
The first major change is to the subrequest and redirect mechanisms. There were a number of different code paths in the Apache HTTP Server 1.3 to attempt to optimize subrequest or redirect behavior. As patches were introduced to 2.0, these optimizations (and the server behavior) were quickly broken due to this duplication of code. All duplicate code has been folded back into ap_process_request_internal() to prevent the code from falling out of sync again.

This means that much of the existing code was 'unoptimized'. It is the Apache HTTP Project's first goal to create a robust and correct implementation of the HTTP server RFC. Additional goals include security, scalability and optimization. New methods were sought to optimize the server (beyond the performance of 1.3) without introducing fragile or insecure code.

The Request Processing Cycle

All requests pass through ap_process_request_internal() in server/request.c, including subrequests and redirects. If a module doesn't pass generated requests through this code, the author is cautioned that the module may be broken by future changes to request processing.

To streamline requests, the module author can take advantage of the hooks offered (p. 1042) to drop out of the request cycle early, or to bypass core hooks which are irrelevant (and costly in terms of CPU).

The Request Parsing Phase

Unescapes the URL

The request's parsed_uri path is unescaped, once and only once, at the beginning of internal request processing.

This step is bypassed if the proxyreq flag is set, or the parsed_uri.path element is unset. The module has no further control of this one-time unescape operation; either failing to unescape or multiply unescaping the URL leads to security repercussions.

Strips Parent and This Elements from the URI

All /../ and /./ elements are removed by ap_getparents(). This helps to ensure the path is (nearly) absolute before the request processing continues.

This step cannot be bypassed.
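
To make the effect of this step concrete, here is a minimal sketch of removing /./ and /../ elements from an absolute path. It is a simplified model written for illustration, not the server's ap_getparents() implementation; the function name toy_strip_dot_segments is hypothetical, and the sketch assumes the path starts with '/'.

```c
#include <string.h>

/*
 * Hypothetical helper: collapse "." and ".." path elements in place.
 * A simplified model of the step described above, NOT httpd's
 * ap_getparents(); it assumes an absolute path starting with '/'.
 */
static void toy_strip_dot_segments(char *path)
{
    const char *in = path;   /* read position, always at '/' or '\0' */
    char *out = path;        /* write position, never ahead of 'in'  */

    if (path[0] != '/') {
        return;              /* relative paths are left untouched    */
    }

    while (*in != '\0') {
        const char *seg = in + 1;           /* text after the '/'    */
        size_t len = strcspn(seg, "/");     /* length of the element */
        int is_dot = (len == 1 && seg[0] == '.');
        int is_dotdot = (len == 2 && seg[0] == '.' && seg[1] == '.');

        if (is_dotdot) {
            /* Drop the previously written element, if there is one. */
            while (out > path && *--out != '/') {
                /* keep backing up to the preceding '/' */
            }
        }
        else if (!is_dot) {
            /* Ordinary element: copy "/element" to the output.      */
            memmove(out, in, len + 1);
            out += len + 1;
        }

        /* A trailing "." or ".." names a directory, so keep a '/'.  */
        if ((is_dot || is_dotdot) && seg[len] == '\0') {
            *out++ = '/';
        }
        in = seg + len;
    }
    *out = '\0';
}
```

With this sketch, "/foo/./bar/../gum" becomes "/foo/gum", and a leading "/../" cannot climb above the root: "/../a" becomes "/a".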
Initial URI Location Walk

Every request is subject to an ap_location_walk() call. This ensures that <Location> sections are consistently enforced for all requests. If the request is an internal redirect or a sub-request, it may borrow some or all of the processing from the previous or parent request's ap_location_walk, so this step is generally very efficient after processing the main request.

Hook: translate_name

Modules can determine the file name, or alter the given URI in this step. For example, mod_vhost_alias will translate the URI's path into the configured virtual host, mod_alias will translate the path to an alias path, and if the request falls back on the core, the DocumentRoot is prepended to the request resource.

If all modules DECLINE this phase, an error 500 is returned to the browser, and a "couldn't translate name" error is logged automatically.

Hook: map_to_storage

After the file or correct URI was determined, the appropriate per-dir configurations are merged together. For example, mod_proxy compares and merges the appropriate <Proxy> sections. If the URI is nothing more than a local (non-proxy) TRACE request, the core handles the request and returns DONE. If no module answers this hook with OK or DONE, the core will run the request filename against the <Directory> and <Files> sections. If the request 'filename' isn't an absolute, legal filename, a note is set for later termination.

URI Location Walk

Every request is hardened by a second ap_location_walk() call. This ensures that a translated request is still subjected to the configured <Location> sections. The request again borrows some or all of the processing from its previous location walk above, so this step is almost always very efficient unless the translated URI mapped to a substantially different path or Virtual Host.

Hook: header_parser

The main request then parses the client's headers. This prepares the remaining request processing steps to better serve the client's request.

The Security Phase

Needs Documentation. Code is:

    if ((access_status = ap_run_access_checker(r)) != 0) {
        return decl_die(access_status, "check access", r);
    }

    if ((access_status = ap_run_check_user_id(r)) != 0) {
        return decl_die(access_status, "check user", r);
    }

    if ((access_status = ap_run_auth_checker(r)) != 0) {
        return decl_die(access_status, "check authorization", r);
    }

The Preparation Phase

Hook: type_checker

The modules have an opportunity to test the URI or filename against the target resource, and set mime information for the request. Both mod_mime and mod_mime_magic use this phase to compare the file name or contents against the administrator's configuration and set the content type, language, character set and request handler. Some modules may set up their filters or other request handling parameters at this time.

If all modules DECLINE this phase, an error 500 is returned to the browser, and a "couldn't find types" error is logged automatically.

Hook: fixups

Many modules are 'trounced' by some phase above.
The fixups phase is used by modules to 'reassert' their ownership or force the request's fields to their appropriate values. It isn't always the cleanest mechanism, but occasionally it's the only option.

The Handler Phase

This phase is not part of the processing in ap_process_request_internal(). Many modules prepare one or more subrequests prior to creating any content at all. After the core or a module calls ap_process_request_internal(), it then calls ap_invoke_handler() to generate the request.

Hook: insert_filter

Modules that transform the content in some way can insert their values and override existing filters, such that if the user configured a more advanced filter out-of-order, then the module can move its order as need be. There is no result code, so actions in this hook had better be trusted to always succeed.

Hook: handler

The module finally has a chance to serve the request in its handler hook. Note that not every prepared request is sent to the handler hook. Many modules, such as mod_autoindex, will create subrequests for a given URI, and then never serve the subrequest, but simply list it for the user. Remember not to put required teardown from the hooks above into this module, but register pool cleanups against the request pool to free resources as required.

11.9 How filters work in Apache 2.0

Warning: This is a cut 'n paste job from an email (<022501c1c529$f63a9550$7f00000a@KOJ>) and only reformatted for better readability. It's not up to date but may be a good start for further research.

Filter Types

There are three basic filter types (each of these is actually broken down into two categories, but that comes later).

CONNECTION
    Filters of this type are valid for the lifetime of this connection. (AP_FTYPE_CONNECTION, AP_FTYPE_NETWORK)
PROTOCOL
    Filters of this type are valid for the lifetime of this request from the point of view of the client; this means that the request is valid from the time that the request is sent until the time that the response is received. (AP_FTYPE_PROTOCOL, AP_FTYPE_TRANSCODE)
RESOURCE
    Filters of this type are valid for the time that this content is used to satisfy a request. For simple requests, this is identical to PROTOCOL, but internal redirects and sub-requests can change the content without ending the request. (AP_FTYPE_RESOURCE, AP_FTYPE_CONTENT_SET)

It is important to make the distinction between a protocol and a resource filter. A resource filter is tied to a specific resource; it may also be tied to header information, but the main binding is to a resource. If you are writing a filter and you want to know if it is resource or protocol, the correct question to ask is: "Can this filter be removed if the request is redirected to a different resource?" If the answer is yes, then it is a resource filter. If it is no, then it is most likely a protocol or connection filter. I won't go into connection filters, because they seem to be well understood. With this definition, a few examples might help:

Byterange
    We have coded it to be inserted for all requests, and it is removed if not used. Because this filter is active at the beginning of all requests, it can not be removed if it is redirected, so this is a protocol filter.
http_header
    This filter actually writes the headers to the network. This is obviously a required filter (except in the asis case which is special and will be dealt with below) and so it is a protocol filter.
Deflate
    The administrator configures this filter based on which file has been requested. If we do an internal redirect from an autoindex page to an index.html page, the deflate filter may be added or removed based on config, so this is a resource filter.
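
The "can it be removed on redirect?" test can be expressed as a small stand-alone model. This is a sketch under stated assumptions rather than the server's code: the toy_ names are hypothetical, and the numeric values are assumed to follow the ordering of httpd's ap_filter_type enum (only their relative order matters here). It drops resource-level filters from a chain and keeps protocol and connection filters, which is what the classification above predicts for an internal redirect.

```c
#include <stddef.h>

/* Toy stand-ins for the six filter types named in the text. The values
 * are an assumption; only their relative order is used below. */
typedef enum {
    TOY_FTYPE_RESOURCE    = 10,
    TOY_FTYPE_CONTENT_SET = 20,
    TOY_FTYPE_PROTOCOL    = 30,
    TOY_FTYPE_TRANSCODE   = 40,
    TOY_FTYPE_CONNECTION  = 50,
    TOY_FTYPE_NETWORK     = 60
} toy_ftype;

typedef struct {
    const char *name;
    toy_ftype type;
} toy_filter;

/* A filter is resource-level if it sorts below the protocol types. */
static int toy_is_resource_filter(toy_ftype type)
{
    return type < TOY_FTYPE_PROTOCOL;
}

/* Model of an internal redirect: resource filters are dropped, the
 * rest are compacted to the front. Returns the number kept. */
static size_t toy_redirect(toy_filter *chain, size_t n)
{
    size_t i, kept = 0;

    for (i = 0; i < n; i++) {
        if (!toy_is_resource_filter(chain[i].type)) {
            chain[kept++] = chain[i];
        }
    }
    return kept;
}
```

Applied to a chain of deflate (resource), byterange (protocol), http_header (protocol) and a network-level core filter, the model keeps the last three and discards deflate, matching the three examples above.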
The further breakdown of each category into two more filter types is strictly for ordering. We could remove it, and only allow for one filter type, but the order would tend to be wrong, and we would need to hack things to make it work. Currently, the RESOURCE filters only have one filter type, but that should change.

How are filters inserted?

This is actually rather simple in theory, but the code is complex. First of all, it is important that everybody realize that there are three filter lists for each request, but they are all concatenated together:

• r->output_filters (corresponds to RESOURCE)
• r->proto_output_filters (corresponds to PROTOCOL)
• r->connection->output_filters (corresponds to CONNECTION)

The problem previously was that we used a singly linked list to create the filter stack, and we started from the "correct" location. This means that if I had a RESOURCE filter on the stack, and I added a CONNECTION filter, the CONNECTION filter would be ignored. This should make sense, because we would insert the connection filter at the top of the c->output_filters list, but the end of r->output_filters pointed to the filter that used to be at the front of c->output_filters. This is obviously wrong. The new insertion code uses a doubly linked list. This has the advantage that we never lose a filter that has been inserted. Unfortunately, it comes with a separate set of headaches.

The problem is that we have two different cases where we use subrequests. The first is to insert more data into a response. The second is to replace the existing response with an internal redirect. These are two different cases and need to be treated as such.

In the first case, we are creating the subrequest from within a handler or filter. This means that the next filter should be passed to the make_sub_request function, and the last resource filter in the sub-request will point to the next filter in the main request.
This makes sense, because the sub-request's data needs to flow through the same set of filters as the main request. A graphical representation might help:

    Default_handler --> includes_filter --> byterange --> ...

If the includes filter creates a sub request, then we don't want the data from that sub-request to go through the includes filter, because it might not be SSI data. So, the subrequest adds the following:

    Default_handler --> includes_filter -/-> byterange --> ...
                                        /
    Default_handler --> sub_request_core

What happens if the subrequest is SSI data? Well, that's easy: the includes_filter is a resource filter, so it will be added to the sub request in between the Default_handler and the sub_request_core filter.

The second case for sub-requests is when one sub-request is going to become the real request. This happens whenever a sub-request is created outside of a handler or filter, and NULL is passed as the next filter to the make_sub_request function.

In this case, the resource filters no longer make sense for the new request, because the resource has changed. So, instead of starting from scratch, we simply point the front of the resource filters for the sub-request to the front of the protocol filters for the old request. This means that we won't lose any of the protocol filters, neither will we try to send this data through a filter that shouldn't see it.

The problem is that we are using a doubly-linked list for our filter stacks now. But, you should notice that it is possible for two lists to intersect in this model. So, how do you handle the previous pointer? This is a very difficult question to answer, because there is no "right" answer; either method is equally valid. I looked at why we use the previous pointer. The only reason for it is to allow for easier addition of new servers. With that being said, the solution I chose was to make the previous pointer always stay on the original request.

This causes some more complex logic, but it works for all cases.
My concern in having it move to the sub-request is that for the more common case (where a sub-request is used to add data to a response), the main filter chain would be wrong. That didn't seem like a good idea to me.

Asis

The final topic. :-) Mod_Asis is a bit of a hack, but the handler needs to remove all filters except for connection filters, and send the data. If you are using mod_asis, all other bets are off.

Explanations

The absolutely last point is that the reason this code was so hard to get right was because we had hacked so much to force it to work. I wrote most of the hacks originally, so I am very much to blame. However, now that the code is right, I have started to remove some hacks. Most people should have seen that the reset_filters and add_required_filters functions are gone. Those inserted protocol level filters for error conditions; in fact, both functions did the same thing, one after the other, it was really strange. Because we don't lose protocol filters for error cases any more, those hacks went away. The HTTP_HEADER, Content-length, and Byterange filters are all added in the insert_filters phase, because if they were added earlier, we had some interesting interactions. Now, those could all be moved to be inserted with the HTTP_IN, CORE, and CORE_IN filters. That would make the code easier to follow.

11.10 Guide to writing output filters

There are a number of common pitfalls encountered when writing output filters; this page aims to document best practice for authors of new or existing filters.

This document is applicable to both version 2.0 and version 2.2 of the Apache HTTP Server; it specifically targets RESOURCE-level or CONTENT_SET-level filters, though some advice is generic to all types of filter.
Filters and bucket brigades

Each time a filter is invoked, it is passed a bucket brigade, containing a sequence of buckets which represent both data content and metadata. Every bucket has a bucket type; a number of bucket types are defined and used by the httpd core modules (and the apr-util library which provides the bucket brigade interface), but modules are free to define their own types.

Note: Output filters must be prepared to process buckets of non-standard types; with a few exceptions, a filter need not care about the types of buckets being filtered.

A filter can tell whether a bucket represents either data or metadata using the APR_BUCKET_IS_METADATA macro. Generally, all metadata buckets should be passed down the filter chain by an output filter. Filters may transform, delete, and insert data buckets as appropriate.

There are two metadata bucket types which all filters must pay attention to: the EOS bucket type, and the FLUSH bucket type. An EOS bucket indicates that the end of the response has been reached and no further buckets need be processed. A FLUSH bucket indicates that the filter should flush any buffered buckets (if applicable) down the filter chain immediately.

Note: FLUSH buckets are sent when the content generator (or an upstream filter) knows that there may be a delay before more content can be sent. By passing FLUSH buckets down the filter chain immediately, filters ensure that the client is not kept waiting for pending data longer than necessary.

Filters can create FLUSH buckets and pass these down the filter chain if desired. Generating FLUSH buckets unnecessarily, or too frequently, can harm network utilisation since it may force large numbers of small packets to be sent, rather than a small number of larger packets. The section on Non-blocking bucket reads covers a case where filters are encouraged to generate FLUSH buckets.
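The rules above (transform data buckets, forward every metadata bucket even when its type is unfamiliar, and stop at EOS) can be sketched with a toy model in plain C. This is only an illustration under stated assumptions: the toy_kind, toy_bucket, toy_is_metadata and toy_filter names are hypothetical and are not part of APR or httpd. A real filter would walk an apr_bucket_brigade and use the APR_BUCKET_IS_METADATA, APR_BUCKET_IS_EOS and APR_BUCKET_IS_FLUSH macros instead.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Toy stand-ins for bucket types; these are NOT the APR definitions. */
typedef enum { TOY_DATA, TOY_FLUSH, TOY_EOS, TOY_OTHER_META } toy_kind;

typedef struct {
    toy_kind kind;
    char     byte;   /* payload; meaningful only for TOY_DATA buckets */
} toy_bucket;

/* Hypothetical analogue of APR_BUCKET_IS_METADATA. */
static int toy_is_metadata(const toy_bucket *b)
{
    return b->kind != TOY_DATA;
}

/*
 * Copy buckets from in[0..n) to out[], upper-casing data payloads and
 * forwarding every metadata bucket unchanged, including types this
 * filter knows nothing about. Processing stops after the first EOS
 * bucket; anything following it is ignored.
 * Returns the number of buckets written to out[].
 */
size_t toy_filter(const toy_bucket *in, size_t n,
                  toy_bucket *out, size_t out_cap)
{
    size_t written = 0;
    size_t i;

    assert(out_cap >= n);   /* caller must supply enough room */

    for (i = 0; i < n; i++) {
        toy_bucket b = in[i];

        if (!toy_is_metadata(&b)) {
            /* Data bucket: the filter is free to transform it. */
            b.byte = (char)toupper((unsigned char)b.byte);
        }
        /* Metadata buckets fall through and are passed on as-is. */
        out[written++] = b;

        if (b.kind == TOY_EOS) {
            break;          /* end of response: ignore later buckets */
        }
    }
    return written;
}
```

This toy filter does no buffering, so a FLUSH bucket needs no action beyond being forwarded in order; a filter that holds data back would have to emit that data ahead of the FLUSH bucket.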
Example bucket brigade

    HEAP FLUSH FILE EOS

This shows a bucket brigade which may be passed to a filter; it contains two metadata buckets (FLUSH and EOS), and two data buckets (HEAP and FILE).

Filter invocation

For any given request, an output filter might be invoked only once and be given a single brigade representing the entire response. It is also possible that the number of times a filter is invoked for a single response is proportional to the size of the content being filtered, with the filter being passed a brigade containing a single bucket each time. Filters must operate correctly in either case.

Warning: An output filter which allocates long-lived memory every time it is invoked may consume memory proportional to response size. Output filters which need to allocate memory should do so once per response; see Maintaining state below.

An output filter can distinguish the final invocation for a given response by the presence of an EOS bucket in the brigade. Any buckets in the brigade after an EOS should be ignored.

An output filter should never pass an empty brigade down the filter chain. To be defensive, filters should be prepared to accept an empty brigade, and should return success without passing this brigade on down the filter chain. The handling of an empty brigade should have no side effects (such as changing any state private to the filter).

How to handle an empty brigade

    apr_status_t dummy_filter(ap_filter_t *f, apr_bucket_brigade *bb)
    {
        if (APR_BRIGADE_EMPTY(bb)) {
            return APR_SUCCESS;
        }
        ...

Brigade structure

A bucket brigade is a doubly-linked list of buckets. The list is terminated (at both ends) by a sentinel which can be distinguished from a normal bucket by comparing it with the pointer returned by APR_BRIGADE_SENTINEL. The list sentinel is in fact not a valid bucket structure; any attempt to call normal bucket functions (such as apr_bucket_read) on the sentinel will have undefined behaviour (i.e. will crash the process).

There are a variety of functions and macros for traversing and manipulating bucket brigades; see the apr_buckets.h [27] header for complete coverage. Commonly used macros include:

    APR_BRIGADE_FIRST(bb)   returns the first bucket in brigade bb
    APR_BRIGADE_LAST(bb)    returns the last bucket in brigade bb
    APR_BUCKET_NEXT(e)      gives the next bucket after bucket e
    APR_BUCKET_PREV(e)      gives the bucket before bucket e

The apr_bucket_brigade structure itself is allocated out of a pool, so if a filter creates a new brigade, it must ensure that memory use is correctly bounded. A filter which allocates a new brigade out of the request pool (r->pool) on every invocation, for example, will fall foul of the warning above concerning memory use. Such a filter should instead create a brigade on the first invocation per request, and store that brigade in its state structure.

Warning: It is generally never advisable to use apr_brigade_destroy to "destroy" a brigade unless you know for certain that the brigade will never be used again; even then, it should be used rarely. The memory used by the brigade structure will not be released by calling this function (since it comes from a pool), but the associated pool cleanup is unregistered. Using apr_brigade_destroy can in fact cause memory leaks; if a "destroyed" brigade contains buckets when its containing pool is destroyed, those buckets will not be immediately destroyed. In general, filters should use apr_brigade_cleanup in preference to apr_brigade_destroy.

[27] http://apr.apache.org/docs/apr-util/trunk/group apr util bucket brigades.html

Processing buckets

When dealing with non-metadata buckets, it is important to understand that the "apr_bucket *" object is an abstract representation of data:

1. The amount of data represented by the bucket may or may not have a determinate length; for a bucket which represents data of indeterminate length, the ->length field is set to the value (apr_size_t)-1. For example, buckets of the PIPE bucket type have an indeterminate length; they represent the output from a pipe.
2. The data represented by a bucket may or may not be mapped into memory. The FILE bucket type, for example, represents data stored in a file on disk.

Filters read the data from a bucket using the apr_bucket_read function. When this function is invoked, the bucket may morph into a different bucket type, and may also insert a new bucket into the bucket brigade. This must happen for buckets which represent data not mapped into memory.

To give an example, consider a bucket brigade containing a single FILE bucket representing an entire file, 24 kilobytes in size:

    FILE(0K-24K)

When this bucket is read, it will read a block of data from the file, morph into a HEAP bucket to represent that data, and return the data to the caller. It also inserts a new FILE bucket representing the remainder of the file; after the apr_bucket_read call, the brigade looks like:

    HEAP(8K) FILE(8K-24K)

Filtering brigades

The basic function of any output filter will be to iterate through the passed-in brigade and transform (or simply examine) the content in some manner. The implementation of the iteration loop is critical to producing a well-behaved output filter.

Take an example which loops through the entire brigade as follows:

Bad output filter: do not imitate!

    apr_bucket *e = APR_BRIGADE_FIRST(bb);
    const char *data;
    apr_size_t length;

    while (e != APR_BRIGADE_SENTINEL(bb)) {
        /* Every bucket that is read stays in bb, now held in memory. */
        apr_bucket_read(e, &data, &length, APR_BLOCK_READ);
        e = APR_BUCKET_NEXT(e);
    }

    return ap_pass_brigade(f->next, bb);

The above implementation would consume memory proportional to content size.
If passed a FILE bucket, for example, the entire file contents would be read into memory as each apr_bucket_read call morphed a FILE bucket into a HEAP bucket.

In contrast, the implementation below will consume a fixed amount of memory to filter any brigade; a temporary brigade is needed and must be allocated only once per response, see the Maintaining state section.

Better output filter

    apr_bucket *e;
    const char *data;
    apr_size_t length;
    apr_status_t rv;

    while ((e = APR_BRIGADE_FIRST(bb)) != APR_BRIGADE_SENTINEL(bb)) {
        rv = apr_bucket_read(e, &data, &length, APR_BLOCK_READ);
        if (rv) ...;
        /* Remove bucket e from bb. */
        APR_BUCKET_REMOVE(e);
        /* Insert it into temporary brigade. */
        APR_BRIGADE_INSERT_HEAD(tmpbb, e);
        /* Pass brigade downstream. */
        rv = ap_pass_brigade(f->next, tmpbb);
        if (rv) ...;
        apr_brigade_cleanup(tmpbb);
    }

Maintaining state

A filter which needs to maintain state over multiple invocations per response can use the ->ctx field of its ap_filter_t structure. It is typical to store a temporary brigade in such a structure, to avoid having to allocate a new brigade per invocation as described in the Brigade structure section.

Example code to maintain filter state

    struct dummy_state {
        apr_bucket_brigade *tmpbb;
        int filter_state;
        ...
    };

    apr_status_t dummy_filter(ap_filter_t *f, apr_bucket_brigade *bb)
    {
        struct dummy_state *state;

        state = f->ctx;
        if (state == NULL) {
            /* First invocation for this response: initialise state structure. */
            f->ctx = state = apr_palloc(f->r->pool, sizeof *state);
            state->tmpbb = apr_brigade_create(f->r->pool, f->c->bucket_alloc);
            state->filter_state = ...;
        }
        ...

Buffering buckets

If a filter decides to store buckets beyond the duration of a single filter function invocation (for example storing them in its ->ctx state structure), those buckets must be set aside.
This is necessary because some bucket types provide buckets which represent temporary resources (such as stack memory) which will fall out of scope as soon as the filter chain completes processing the brigade.

To set aside a bucket, the apr_bucket_setaside function can be called. Not all bucket types can be set aside, but if successful, the bucket will have morphed to ensure it has a lifetime at least as long as the pool given as an argument to the apr_bucket_setaside function.

Alternatively, the ap_save_brigade function can be used, which will move all the buckets into a separate brigade containing buckets with a lifetime as long as the given pool argument. This function must be used with care, taking into account the following points:

1. On return, ap_save_brigade guarantees that all the buckets in the returned brigade will represent data mapped into memory. If given an input brigade containing, for example, a PIPE bucket, ap_save_brigade will consume an arbitrary amount of memory to store the entire output of the pipe.
2. When ap_save_brigade reads from buckets which cannot be set aside, it will always perform blocking reads, removing the opportunity to use Non-blocking bucket reads.
3. If ap_save_brigade is used without passing a non-NULL "saveto" (destination) brigade parameter, the function will create a new brigade, which may cause memory use to be proportional to content size as described in the Brigade structure section.

Warning: Filters must ensure that any buffered data is processed and passed down the filter chain during the last invocation for a given response (a brigade containing an EOS bucket). Otherwise such data will be lost.

Non-blocking bucket reads

The apr_bucket_read function takes an apr_read_type_e argument which determines whether a blocking or non-blocking read will be performed from the data source.
A good filter will first attempt to read from every data bucket using a non-blocking read; if that fails with APR_EAGAIN, it should send a FLUSH bucket down the filter chain, and retry using a blocking read.

This mode of operation ensures that any filters further down the filter chain will flush any buffered buckets if a slow content source is being used.

A CGI script is an example of a slow content source which is implemented as a bucket type. mod_cgi will send PIPE buckets which represent the output from a CGI script; reading from such a bucket will block when waiting for the CGI script to produce more output.

Example code using non-blocking bucket reads

    apr_bucket *e;
    const char *data;
    apr_size_t length;
    apr_read_type_e mode = APR_NONBLOCK_READ;

    while ((e = APR_BRIGADE_FIRST(bb)) != APR_BRIGADE_SENTINEL(bb)) {
        apr_status_t rv;

        rv = apr_bucket_read(e, &data, &length, mode);
        if (rv == APR_EAGAIN && mode == APR_NONBLOCK_READ) {
            /* Pass down a brigade containing a flush bucket: */
            APR_BRIGADE_INSERT_TAIL(tmpbb, apr_bucket_flush_create(...));
            rv = ap_pass_brigade(f->next, tmpbb);
            apr_brigade_cleanup(tmpbb);
            if (rv != APR_SUCCESS) return rv;

            /* Retry, using a blocking read. */
            mode = APR_BLOCK_READ;
            continue;
        }
        else if (rv != APR_SUCCESS) {
            /* handle errors */
        }

        /* Next time, try a non-blocking read first. */
        mode = APR_NONBLOCK_READ;
        ...
    }

Ten rules for output filters

In summary, here is a set of rules for all output filters to follow:

1. Output filters should not pass empty brigades down the filter chain, but should be tolerant of being passed empty brigades.
2. Output filters must pass all metadata buckets down the filter chain; FLUSH buckets should be respected by passing any pending or buffered buckets down the filter chain.
3. Output filters should ignore any buckets following an EOS bucket.
4. Output filters must process a fixed amount of data at a time, to ensure that memory consumption is not proportional to the size of the content being filtered.
5. Output filters should be agnostic with respect to bucket types, and must be able to process buckets of unfamiliar type.
6. After calling ap_pass_brigade to pass a brigade down the filter chain, output filters should call apr_brigade_cleanup to ensure the brigade is empty before reusing that brigade structure; output filters should never use apr_brigade_destroy to "destroy" brigades.
7. Output filters must set aside any buckets which are preserved beyond the duration of the filter function.
8. Output filters must not ignore the return value of ap_pass_brigade, and must return appropriate errors back up the filter chain.
9. Output filters must only create a fixed number of bucket brigades for each response, rather than one per invocation.
10. Output filters should first attempt non-blocking reads from each data bucket, and send a FLUSH bucket down the filter chain if the read blocks, before retrying with a blocking read.

11.11 Apache HTTP Server 2.x Thread Safety Issues

When using any of the threaded MPMs in the Apache HTTP Server 2.x, it is important that every function called from Apache be thread safe. When linking in 3rd party extensions it can be difficult to determine whether the resulting server will be thread safe. Casual testing generally won’t tell you this either, as thread safety problems can lead to subtle race conditions that may only show up in certain conditions under heavy load.

Global and static variables

When writing your module, or when trying to determine if a module or 3rd party library is thread safe, there are some common things to keep in mind.

First, you need to recognize that in a threaded model each individual thread has its own program counter, stack and registers. Local variables live on the stack, so those are fine. You need to watch out for any static or global variables.
This doesn’t mean that you are absolutely not allowed to use static or global variables. There are times when you actually want something to affect all threads, but generally you need to avoid using them if you want your code to be thread safe.

In the case where you have a global variable that needs to be global and accessed by all threads, be very careful when you update it. If, for example, it is an incrementing counter, you need to atomically increment it to avoid race conditions with other threads. You do this using a mutex (mutual exclusion). Lock the mutex, read the current value, increment it and write it back, and then unlock the mutex. Any other thread that wants to modify the value has to first check the mutex and block until it is cleared.

If you are using APR [28], have a look at the apr_atomic_* functions and the apr_thread_mutex_* functions.

errno

This is a common global variable that holds the error number of the last error that occurred. If one thread calls a low-level function that sets errno and then another thread checks it, we are bleeding error numbers from one thread into another. To solve this, make sure your module or library defines _REENTRANT or is compiled with -D_REENTRANT. This will make errno a per-thread variable and should hopefully be transparent to the code. It does this by doing something like this:

    #define errno (*(__errno_location()))

which means that accessing errno will call __errno_location(), which is provided by the libc. Setting _REENTRANT also forces redefinition of some other functions to their *_r equivalents and sometimes changes the common getc/putc macros into safer function calls. Check your libc documentation for specifics. Instead of, or in addition to, _REENTRANT, the symbols that may affect this are _POSIX_C_SOURCE, _THREAD_SAFE, _SVID_SOURCE, and _BSD_SOURCE.

Common standard troublesome functions

Not only do things have to be thread safe, but they also have to be reentrant. strtok() is an obvious one.
You call it the first time with your delimiter, which it then remembers, and on each subsequent call it returns the next token. Obviously if multiple threads are calling it you will have a problem. Most systems have a reentrant version of the function called strtok_r() where you pass in an extra argument which contains an allocated char * which the function will use instead of its own static storage for maintaining the tokenizing state. If you are using APR [29] you can use apr_strtok().

[28] http://apr.apache.org/
[29] http://apr.apache.org/

crypt() is another function that tends to not be reentrant, so if you run across calls to that function in a library, watch out. On some systems it is reentrant though, so it is not always a problem. If your system has crypt_r(), chances are you should be using that, or if possible simply avoid the whole mess by using md5 instead.

Common 3rd Party Libraries

The following is a list of common libraries that are used by 3rd party Apache modules. You can check to see if your module is using a potentially unsafe library by using tools such as ldd(1) and nm(1).
For PHP [30], for example, try this:

    % ldd libphp4.so
    libsablot.so.0 => /usr/local/lib/libsablot.so.0 (0x401f6000)
    libexpat.so.0 => /usr/lib/libexpat.so.0 (0x402da000)
    libsnmp.so.0 => /usr/lib/libsnmp.so.0 (0x402f9000)
    libpdf.so.1 => /usr/local/lib/libpdf.so.1 (0x40353000)
    libz.so.1 => /usr/lib/libz.so.1 (0x403e2000)
    libpng.so.2 => /usr/lib/libpng.so.2 (0x403f0000)
    libmysqlclient.so.11 => /usr/lib/libmysqlclient.so.11 (0x40411000)
    libming.so => /usr/lib/libming.so (0x40449000)
    libm.so.6 => /lib/libm.so.6 (0x40487000)
    libfreetype.so.6 => /usr/lib/libfreetype.so.6 (0x404a8000)
    libjpeg.so.62 => /usr/lib/libjpeg.so.62 (0x404e7000)
    libcrypt.so.1 => /lib/libcrypt.so.1 (0x40505000)
    libssl.so.2 => /lib/libssl.so.2 (0x40532000)
    libcrypto.so.2 => /lib/libcrypto.so.2 (0x40560000)
    libresolv.so.2 => /lib/libresolv.so.2 (0x40624000)
    libdl.so.2 => /lib/libdl.so.2 (0x40634000)
    libnsl.so.1 => /lib/libnsl.so.1 (0x40637000)
    libc.so.6 => /lib/libc.so.6 (0x4064b000)
    /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x80000000)

In addition to these libraries you will need to have a look at any libraries linked statically into the module. You can use nm(1) to look for individual symbols in the module.

Library List

Please drop a note to dev@httpd.apache.org [31] if you have additions or corrections to this list.

[30] http://www.php.net/
[31] http://httpd.apache.org/lists.html#http-dev

Library (URL) | Version | Thread Safe? | Notes
ASpell/PSpell (http://aspell.sourceforge.net/) | | ? |
Berkeley DB (http://www.sleepycat.com/) | 3.x, 4.x | Yes | Be careful about sharing a connection across threads.
bzip2 (http://sources.redhat.com/bzip2/index.html) | | Yes | Both low-level and high-level APIs are thread-safe. However, high-level API requires thread-safe access to errno.
cdb (http://cr.yp.to/cdb.html) | | ? |
C-Client (http://www.washington.edu/imap/) | | Perhaps | c-client uses strtok() and gethostbyname() which are not thread-safe on most C library implementations. c-client’s static data is meant to be shared across threads. If strtok() and gethostbyname() are thread-safe on your OS, c-client may be thread-safe.
libcrypt (http://www.ijg.org/files/) | | ? |
Expat (http://expat.sourceforge.net/) | | Yes | Need a separate parser instance per thread
FreeTDS (http://www.freetds.org/) | | ? |
FreeType (http://www.freetype.org/) | | ? |
GD 1.8.x (http://www.boutell.com/gd/) | | ? |
GD 2.0.x (http://www.boutell.com/gd/) | | ? |
gdbm (http://www.gnu.org/software/gdbm/gdbm.html) | | No | Errors returned via a static gdbm_error variable
ImageMagick (http://www.imagemagick.org/) | 5.2.2 | Yes | ImageMagick docs claim it is thread safe since version 5.2.2 (see Change log: http://www.imagemagick.com/www/changelog).
Imlib2 (http://www.enlightenment.org/p.php?p=about/efl&l=en) | | ? |
libjpeg (http://www.ijg.org/files/) | v6b | ? |
libmysqlclient (http://mysql.com) | | Yes | Use mysqlclient_r library variant to ensure thread-safety. For more information, please read http://dev.mysql.com/doc/mysql/en/Threaded
Ming (http://www.opaque.net/ming/) | 0.2a | ? |
Net-SNMP (http://net-snmp.sourceforge.net/) | 5.0.x | ? |
OpenLDAP (http://www.openldap.org/) | 2.1.x | Yes | Use ldap_r library variant to ensure thread-safety.
OpenSSL (http://www.openssl.org/) | 0.9.6g | Yes | Requires proper usage of CRYPTO_num_locks, CRYPTO_set_locking_callback, CRYPTO_set_id_callback
liboci8 (Oracle 8+) (http://www.oracle.com/) | 8.x,9.x | ? |
pdflib (http://pdflib.com/) | 5.0.x | Yes | PDFLib docs claim it is thread safe; changes.txt indicates it has been partially thread-safe since V1.91: http://www.pdflib.com/products/pdflibfamily/pdflib/.
libpng (http://www.libpng.org/pub/png/libpng.html) | 1.0.x | ? |
libpng (http://www.libpng.org/pub/png/libpng.html) | 1.2.x | ? |
libpq (PostgreSQL) (http://www.postgresql.org/docs/8.4/static/libpqthreading.html) | 8.x | Yes | Don’t share connections across threads and watch out for crypt() calls
Sablotron (http://www.gingerall.com/charlie/ga/xml/p_sab.xml) | 0.95 | ? |
zlib (http://www.gzip.org/zlib/) | 1.1.4 | Yes | Relies upon thread-safe zalloc and zfree functions. Default is to use libc’s calloc/free which are thread-safe.

Chapter 12 Glossary and Index

12.1 Glossary

This glossary defines some of the common terminology related to Apache in particular, and web serving in general. More information on each concept is provided in the links.

Definitions

Access Control
The restriction of access to network realms. In an Apache context usually the restriction of access to certain URLs. See: Authentication, Authorization, and Access Control (p. 227)

Algorithm
An unambiguous formula or set of rules for solving a problem in a finite number of steps. Algorithms for encryption are usually called Ciphers.

APache eXtension Tool (apxs)
A Perl script that aids in compiling module sources into Dynamic Shared Objects (DSOs) and helps install them in the Apache Web server. See: Manual Page: apxs

Apache Portable Runtime (APR)
A set of libraries providing many of the basic interfaces between the server and the operating system. APR is developed parallel to the Apache HTTP Server as an independent project. See: Apache Portable Runtime Project [1]

Authentication
The positive identification of a network entity such as a server, a client, or a user. See: Authentication, Authorization, and Access Control (p. 227)

Certificate
A data record used for authenticating network entities such as a server or a client. A certificate contains X.509 information pieces about its owner (called the subject) and the signing Certification Authority (called the issuer), plus the owner’s public key and the signature made by the CA.
Network entities verify these signatures using CA certificates. See: SSL/TLS Encryption (p. 192)

Certificate Signing Request (CSR)
An unsigned certificate for submission to a Certification Authority, which signs it with the Private Key of their CA Certificate. Once the CSR is signed, it becomes a real certificate. See: SSL/TLS Encryption (p. 192)

Certification Authority (CA)
A trusted third party whose purpose is to sign certificates for network entities it has authenticated using secure means. Other network entities can check the signature to verify that a CA has authenticated the bearer of a certificate. See: SSL/TLS Encryption (p. 192)

Cipher
An algorithm or system for data encryption. Examples are DES, IDEA, RC4, etc. See: SSL/TLS Encryption (p. 192)

Ciphertext
The result after Plaintext is passed through a Cipher. See: SSL/TLS Encryption (p. 192)

Common Gateway Interface (CGI)
A standard definition for an interface between a web server and an external program that allows the external program to service requests. There is an Informational RFC [2] which covers the specifics. See: Dynamic Content with CGI (p. 236)

Configuration Directive
See: Directive

Configuration File
A text file containing Directives that control the configuration of Apache. See: Configuration Files (p. 32)

[1] http://apr.apache.org/
[2] http://www.ietf.org/rfc/rfc3875

CONNECT
An HTTP method for proxying raw data channels over HTTP. It can be used to encapsulate other protocols, such as the SSL protocol.

Context
An area in the configuration files where certain types of directives are allowed. See: Terms Used to Describe Apache Directives (p. 377)

Digital Signature
An encrypted text block that validates a certificate or other file. A Certification Authority creates a signature by generating a hash of the Public Key embedded in a Certificate, then encrypting the hash with its own Private Key.
Only the CA’s public key can decrypt the signature, verifying that the CA has authenticated the network entity that owns the Certificate. See: SSL/TLS Encryption (p. 192)

Directive
A configuration command that controls one or more aspects of Apache’s behavior. Directives are placed in the Configuration File. See: Directive Index (p. 1106)

Dynamic Shared Object (DSO)
Modules compiled separately from the Apache httpd binary that can be loaded on-demand. See: Dynamic Shared Object Support (p. 68)

Environment Variable (env-variable)
Named variables managed by the operating system shell and used to store information and communicate between programs. Apache also contains internal variables that are referred to as environment variables, but are stored in internal Apache structures, rather than in the shell environment. See: Environment Variables in Apache (p. 92)

Export-Crippled
Diminished in cryptographic strength (and security) in order to comply with the United States’ Export Administration Regulations (EAR). Export-crippled cryptographic software is limited to a small key size, resulting in Ciphertext which usually can be decrypted by brute force. See: SSL/TLS Encryption (p. 192)

Filter
A process that is applied to data that is sent or received by the server. Input filters process data sent by the client to the server, while output filters process documents on the server before they are sent to the client. For example, the INCLUDES output filter processes documents for Server Side Includes. See: Filters (p. 110)

Fully-Qualified Domain-Name (FQDN)
The unique name of a network entity, consisting of a hostname and a domain name that can resolve to an IP address. For example, www is a hostname, example.com is a domain name, and www.example.com is a fully-qualified domain name.

Handler
An internal Apache representation of the action to be performed when a file is called. Generally, files have implicit handlers, based on the file type.
Normally, all files are simply served by the server, but certain file types are "handled" separately. For example, the cgi-script handler designates files to be processed as CGIs. See: Apache’s Handler Use (p. 108)

Hash
A mathematical one-way, irreversible algorithm generating a string with fixed-length from another string of any length. Different input strings will usually produce different hashes (depending on the hash function).

Header
The part of the HTTP request and response that is sent before the actual content, and that contains meta-information describing the content.

.htaccess
A configuration file that is placed inside the web tree and applies configuration directives to the directory where it is placed and all sub-directories. Despite its name, this file can hold almost any type of directive, not just access-control directives. See: Configuration Files (p. 32)

httpd.conf
The main Apache configuration file. The default location is /usr/local/apache2/conf/httpd.conf, but it may be moved using run-time or compile-time configuration. See: Configuration Files (p. 32)

HyperText Transfer Protocol (HTTP)
The standard transmission protocol used on the World Wide Web. Apache implements version 1.1 of the protocol, referred to as HTTP/1.1 and defined by RFC 2616 [3].

HTTPS
The HyperText Transfer Protocol (Secure), the standard encrypted communication mechanism on the World Wide Web. This is actually just HTTP over SSL. See: SSL/TLS Encryption (p. 192)

Method
In the context of HTTP, an action to perform on a resource, specified on the request line by the client. Some of the methods available in HTTP are GET, POST, and PUT.

Message Digest
A hash of a message, which can be used to verify that the contents of the message have not been altered in transit. See: SSL/TLS Encryption (p. 192)

MIME-type
A way to describe the kind of document being transmitted.
Its name comes from the fact that its format is borrowed from the Multipurpose Internet Mail Extensions. It consists of a major type and a minor type, separated by a slash. Some examples are text/html, image/gif, and application/octet-stream. In HTTP, the MIME-type is transmitted in the Content-Type header. See: mod_mime (p. 749)

Module
An independent part of a program. Much of Apache’s functionality is contained in modules that you can choose to include or exclude. Modules that are compiled into the Apache httpd binary are called static modules, while modules that are stored separately and can be optionally loaded at run-time are called dynamic modules or DSOs. Modules that are included by default are called base modules. Many modules are available for Apache that are not distributed as part of the Apache HTTP Server tarball. These are referred to as third-party modules. See: Module Index (p. 1101)

Module Magic Number (MMN)
Module Magic Number is a constant defined in the Apache source code that is associated with binary compatibility of modules. It is changed when internal Apache structures, function calls and other significant parts of the API change in such a way that binary compatibility cannot be guaranteed any more. On an MMN change, all third-party modules have to be at least recompiled, and sometimes even slightly changed, in order to work with the new version of Apache.

OpenSSL
The Open Source toolkit for SSL/TLS. See: http://www.openssl.org/#

Pass Phrase
The word or phrase that protects private key files. It prevents unauthorized users from encrypting them. Usually it’s just the secret encryption/decryption key used for Ciphers. See: SSL/TLS Encryption (p. 192)

Plaintext
The unencrypted text.

Private Key
The secret key in a Public Key Cryptography system, used to decrypt incoming messages and sign outgoing ones. See: SSL/TLS Encryption (p. 192)

Proxy
An intermediate server that sits between the client and the origin server.
It accepts requests from clients, transmits those requests on to the origin server, and then returns the response from the origin server to the client. If several clients request the same content, the proxy can deliver that content from its cache, rather than requesting it from the origin server each time, thereby reducing response time. See: mod_proxy (p. 787)

Public Key
The publicly available key in a Public Key Cryptography system, used to encrypt messages bound for its owner and to decrypt signatures made by its owner. See: SSL/TLS Encryption (p. 192)

[3] http://ietf.org/rfc/rfc2616.txt

Public Key Cryptography
The study and application of asymmetric encryption systems, which use one key for encryption and another for decryption. A corresponding pair of such keys constitutes a key pair. Also called Asymmetric Cryptography. See: SSL/TLS Encryption (p. 192)

Regular Expression (Regex)
A way of describing a pattern in text, for example "all the words that begin with the letter A" or "every 10-digit phone number" or even "Every sentence with two commas in it, and no capital letter Q". Regular expressions are useful in Apache because they let you apply certain attributes against collections of files or resources in very flexible ways; for example, all .gif and .jpg files under any "images" directory could be written as "/images/.*(jpg|gif)$". In places where regular expressions are used to replace strings, the special variables $1 ... $9 contain backreferences to the grouped parts (in parentheses) of the matched expression. The special variable $0 contains a backreference to the whole matched expression. To write a literal dollar sign in a replacement string, it can be escaped with a backslash. Historically, the variable & could be used as an alias for $0 in some places. This is no longer possible since version 2.3.6. Apache uses Perl Compatible Regular Expressions provided by the PCRE [4] library.
You can find more documentation about PCRE’s regular expression syntax at the PCRE site, or at Wikipedia (http://en.wikipedia.org/wiki/PCRE).

Reverse Proxy
A proxy server that appears to the client as if it is an origin server. This is useful to hide the real origin server from the client for security reasons, or to load balance.

Secure Sockets Layer (SSL)
A protocol created by Netscape Communications Corporation for general communication authentication and encryption over TCP/IP networks. The most popular usage is HTTPS, i.e. the HyperText Transfer Protocol (HTTP) over SSL.
See: SSL/TLS Encryption (p. 192)

Server Name Indication (SNI)
An SSL function that allows passing the desired server hostname in the initial SSL handshake message, so that the web server can select the correct virtual host configuration to use in processing the SSL handshake. It was added to SSL starting with the TLS extensions, RFC 3546.
See: the SSL FAQ (p. 212) and RFC 3546 (http://www.ietf.org/rfc/rfc3546.txt)

Server Side Includes (SSI)
A technique for embedding processing directives inside HTML files.
See: Introduction to Server Side Includes (p. 243)

Session
The context information of a communication in general.

SSLeay
The original SSL/TLS implementation library developed by Eric A. Young.

Subrequest
Apache provides a subrequest API to modules that allows other filesystem or URL paths to be partially or fully evaluated by the server. Example consumers of this API are DirectoryIndex, mod_autoindex, and mod_include.

Symmetric Cryptography
The study and application of Ciphers that use a single secret key for both encryption and decryption operations.
See: SSL/TLS Encryption (p. 192)

Tarball
A package of files gathered together using the tar utility. Apache distributions are stored in compressed tar archives or using pkzip.

Transport Layer Security (TLS)
The successor protocol to SSL, created by the Internet Engineering Task Force (IETF) for general communication authentication and encryption over TCP/IP networks.
TLS version 1 is nearly identical to SSL version 3.
See: SSL/TLS Encryption (p. 192)

4 http://www.pcre.org/
5 http://en.wikipedia.org/wiki/PCRE
6 http://www.ietf.org/rfc/rfc3546.txt

Uniform Resource Locator (URL)
The name/address of a resource on the Internet. This is the common informal term for what is formally called a Uniform Resource Identifier. URLs are usually made up of a scheme, like http or https, a hostname, and a path. A URL for this page might be http://httpd.apache.org/docs/trunk/glossary.html.

Uniform Resource Identifier (URI)
A compact string of characters for identifying an abstract or physical resource. It is formally defined by RFC 2396 (http://www.ietf.org/rfc/rfc2396.txt). URIs used on the world-wide web are commonly referred to as URLs.

Virtual Hosting
Serving multiple websites using a single instance of Apache. IP virtual hosting differentiates between websites based on their IP address, while name-based virtual hosting uses only the name of the host and can therefore host many sites on the same IP address.
See: Apache Virtual Host documentation (p. 124)

X.509
An authentication certificate scheme recommended by the International Telecommunication Union (ITU-T) which is used for SSL/TLS authentication.
See: SSL/TLS Encryption (p. 192)

12.2 Module Index

Below is a list of all of the modules that come as part of the Apache HTTP Server distribution. See also the complete alphabetical list of all Apache HTTP Server directives (p. 1106).

See also
• Multi-Processing Modules (MPMs) (p. 90)
• Directive Quick Reference (p. 1106)

Core Features and Multi-Processing Modules

core (p. 380) Core Apache HTTP Server features that are always available
mpm_common (p. 990) A collection of directives that are implemented by more than one multi-processing module (MPM)
event (p.
1001) A variant of the worker MPM with the goal of consuming threads only for connections with active processing
mpm_netware (p. 1006) Multi-Processing Module implementing an exclusively threaded web server optimized for Novell NetWare
mpmt_os2 (p. 1008) Hybrid multi-process, multi-threaded MPM for OS/2
prefork (p. 1009) Implements a non-threaded, pre-forking web server
mpm_winnt (p. 1012) Multi-Processing Module optimized for Windows NT.
worker (p. 1014) Multi-Processing Module implementing a hybrid multi-threaded multi-process web server

Other Modules

mod_access_compat (p. 440) Group authorizations based on host (name or IP address)
mod_actions (p. 445) Execute CGI scripts based on media type or request method.
mod_alias (p. 447) Provides for mapping different parts of the host filesystem in the document tree and for URL redirection
mod_allowhandlers (p. 454) Easily restrict what HTTP handlers can be used on the server
mod_allowmethods (p. 455) Easily restrict what HTTP methods can be used on the server
mod_asis (p. 456) Sends files that contain their own HTTP headers
mod_auth_basic (p. 458) Basic HTTP authentication
mod_auth_digest (p. 462) User authentication using MD5 Digest Authentication
mod_auth_form (p. 466) Form authentication
mod_authn_anon (p. 477) Allows "anonymous" user access to authenticated areas
mod_authn_core (p. 480) Core Authentication
mod_authn_dbd (p. 484) User authentication using an SQL database
mod_authn_dbm (p. 487) User authentication using DBM files
mod_authn_file (p. 489) User authentication using text files
mod_authn_socache (p. 491) Manages a cache of authentication credentials to relieve the load on backends
mod_authnz_fcgi (p. 494) Allows a FastCGI authorizer application to handle Apache httpd authentication and authorization
mod_authnz_ldap (p. 501) Allows an LDAP directory to be used to store the database for HTTP Basic authentication.
mod_authz_core (p.
519) Core Authorization
mod_authz_dbd (p. 527) Group Authorization and Login using SQL
mod_authz_dbm (p. 532) Group authorization using DBM files
mod_authz_groupfile (p. 534) Group authorization using plaintext files
mod_authz_host (p. 536) Group authorizations based on host (name or IP address)
mod_authz_owner (p. 539) Authorization based on file ownership
mod_authz_user (p. 541) User Authorization
mod_autoindex (p. 542) Generates directory indexes, automatically, similar to the Unix ls command or the Win32 dir shell command
mod_buffer (p. 554) Support for request buffering
mod_cache (p. 555) RFC 2616 compliant HTTP caching filter.
mod_cache_disk (p. 570) Disk based storage module for the HTTP caching filter.
mod_cache_socache (p. 574) Shared object cache (socache) based storage module for the HTTP caching filter.
mod_cern_meta (p. 578) CERN httpd metafile semantics
mod_cgi (p. 580) Execution of CGI scripts
mod_cgid (p. 583) Execution of CGI scripts using an external CGI daemon
mod_charset_lite (p. 585) Specify character set translation or recoding
mod_data (p. 588) Convert response body into an RFC2397 data URL
mod_dav (p. 589) Distributed Authoring and Versioning (WebDAV, http://www.webdav.org/) functionality
mod_dav_fs (p. 592) Filesystem provider for mod_dav
mod_dav_lock (p. 593) Generic locking module for mod_dav
mod_dbd (p. 594) Manages SQL database connections
mod_deflate (p. 599) Compress content before it is delivered to the client
mod_dialup (p. 606) Send static content at a bandwidth rate limit, defined by the various old modem standards
mod_dir (p. 607) Provides for "trailing slash" redirects and serving directory index files
mod_dumpio (p. 612) Dumps all I/O to error log as desired.
mod_echo (p. 614) A simple echo server to illustrate protocol modules
mod_env (p. 615) Modifies the environment which is passed to CGI scripts and SSI pages
mod_example_hooks (p. 617) Illustrates the Apache module API
mod_expires (p.
619) Generation of Expires and Cache-Control HTTP headers according to user-specified criteria
mod_ext_filter (p. 622) Pass the response body through an external program before delivery to the client
mod_file_cache (p. 626) Caches a static list of files in memory
mod_filter (p. 629) Context-sensitive smart filter configuration module
mod_firehose (p. 637) Multiplexes all I/O to a given file or pipe.
mod_headers (p. 641) Customization of HTTP request and response headers
mod_heartbeat (p. 647) Sends messages with server status to frontend proxy
mod_heartmonitor (p. 648) Centralized monitor for mod_heartbeat origin servers
mod_http2 (p. 650) Support for the HTTP/2 transport layer
mod_ident (p. 661) RFC 1413 ident lookups
mod_imagemap (p. 663) Server-side imagemap processing
mod_include (p. 667) Server-parsed HTML documents (Server Side Includes)
mod_info (p. 680) Provides a comprehensive overview of the server configuration
mod_isapi (p. 683) ISAPI Extensions within Apache for Windows
mod_journald (p. 687) Provides "journald" ErrorLog provider
mod_lbmethod_bybusyness (p. 688) Pending Request Counting load balancer scheduler algorithm for mod_proxy_balancer
mod_lbmethod_byrequests (p. 689) Request Counting load balancer scheduler algorithm for mod_proxy_balancer
mod_lbmethod_bytraffic (p. 691) Weighted Traffic Counting load balancer scheduler algorithm for mod_proxy_balancer
mod_lbmethod_heartbeat (p. 692) Heartbeat Traffic Counting load balancer scheduler algorithm for mod_proxy_balancer
mod_ldap (p. 693) LDAP connection pooling and result caching services for use by other LDAP modules
mod_log_config (p. 705) Logging of the requests made to the server
mod_log_debug (p. 712) Additional configurable debug logging
mod_log_forensic (p. 714) Forensic Logging of the requests made to the server
mod_logio (p. 716) Logging of input and output bytes per request
mod_lua (p. 718) Provides Lua hooks into various portions of the httpd request processing
mod_macro (p.
745) Provides macros within Apache httpd runtime configuration files
mod_mime (p. 749) Associates the requested filename’s extensions with the file’s behavior (handlers and filters) and content (mime-type, language, character set and encoding)
mod_mime_magic (p. 762) Determines the MIME type of a file by looking at a few bytes of its contents
mod_negotiation (p. 766) Provides for content negotiation (p. 78)
mod_nw_ssl (p. 770) Enable SSL encryption for NetWare
mod_policy (p. 771) HTTP protocol compliance enforcement.
mod_privileges (p. 781) Support for Solaris privileges and for running virtual hosts under different user IDs.
mod_proxy (p. 787) Multi-protocol proxy/gateway server
mod_proxy_ajp (p. 815) AJP support module for mod_proxy
mod_proxy_balancer (p. 824) mod_proxy extension for load balancing
mod_proxy_connect (p. 828) mod_proxy extension for CONNECT request handling
mod_proxy_express (p. 830) Dynamic mass reverse proxy extension for mod_proxy
mod_proxy_fcgi (p. 833) FastCGI support module for mod_proxy
mod_proxy_fdpass (p. 836) fdpass external process support module for mod_proxy
mod_proxy_ftp (p. 837) FTP support module for mod_proxy
mod_proxy_hcheck (p. 840) Dynamic health check of Balancer members (workers) for mod_proxy
mod_proxy_html (p. 844) Rewrite HTML links to ensure they are addressable from clients’ networks in a proxy context.
mod_proxy_http (p. 850) HTTP support module for mod_proxy
mod_proxy_http2 (p. 852) HTTP/2 support module for mod_proxy
mod_proxy_scgi (p. 853) SCGI gateway module for mod_proxy
mod_proxy_wstunnel (p. 856) Websockets support module for mod_proxy
mod_ratelimit (p. 858) Bandwidth Rate Limiting for Clients
mod_reflector (p. 859) Reflect a request body as a response via the output filter stack.
mod_remoteip (p. 860) Replaces the original client IP address for the connection with the useragent IP address list presented by a proxy or a load balancer via the request headers.
mod_reqtimeout (p. 864) Set timeout and minimum data rate for receiving requests
mod_request (p. 866) Filters to handle and make available HTTP request bodies
mod_rewrite (p. 867) Provides a rule-based rewriting engine to rewrite requested URLs on the fly
mod_sed (p. 881) Filter Input (request) and Output (response) content using sed syntax
mod_session (p. 883) Session support
mod_session_cookie (p. 890) Cookie based session support
mod_session_crypto (p. 893) Session encryption support
mod_session_dbd (p. 897) DBD/SQL based session support
mod_setenvif (p. 902) Allows the setting of environment variables based on characteristics of the request
mod_slotmem_plain (p. 906) Slot-based shared memory provider.
mod_slotmem_shm (p. 907) Slot-based shared memory provider.
mod_so (p. 908) Loading of executable code and modules into the server at start-up or restart time
mod_socache_dbm (p. 910) DBM based shared object cache provider.
mod_socache_dc (p. 911) Distcache based shared object cache provider.
mod_socache_memcache (p. 912) Memcache based shared object cache provider.
mod_socache_shmcb (p. 913) shmcb based shared object cache provider.
mod_speling (p. 914) Attempts to correct mistaken URLs by ignoring capitalization, or attempting to correct various minor misspellings.
mod_ssl (p. 916) Strong cryptography using the Secure Sockets Layer (SSL) and Transport Layer Security (TLS) protocols
mod_ssl_ct (p. 955) Implementation of Certificate Transparency (RFC 6962)
mod_status (p. 962) Provides information on server activity and performance
mod_substitute (p. 964) Perform search and replace operations on response bodies
mod_suexec (p. 967) Allows CGI scripts to run as a specified user and group
mod_syslog (p. 968) Provides "syslog" ErrorLog provider
mod_systemd (p. 969) Provides better support for systemd integration
mod_unique_id (p. 970) Provides an environment variable with a unique identifier for each request
mod_unixd (p.
972) Basic (required) security for Unix-family platforms.
mod_userdir (p. 975) User-specific directories
mod_usertrack (p. 977) Clickstream logging of user activity on a site
mod_version (p. 980) Version dependent configuration
mod_vhost_alias (p. 982) Provides for dynamically configured mass virtual hosting
mod_watchdog (p. 986) Provides infrastructure for other modules to periodically run tasks
mod_xml2enc (p. 987) Enhanced charset/internationalisation support for libxml2-based filter modules

12.3 Directive Quick Reference

The directive quick reference shows the usage, default, status, and context of each Apache configuration directive. For more information about each of these, see the Directive Dictionary (p. 377).

The first column gives the directive name and usage. The second column shows the default value of the directive, if a default exists. If the default is too large to display, it will be truncated and followed by "+". The third and fourth columns list the contexts where the directive is allowed and the status of the directive according to the legend tables below.

AcceptFilter protocol accept_filter
Configures optimizations for a Protocol’s Listener Sockets
AcceptPathInfo On|Off|Default Default
Resources accept trailing pathname information
AccessFileName filename [filename] ... .htaccess
Name of the distributed configuration file
Action action-type cgi-script [virtual]
Activates a CGI script for a particular handler or content-type
AddAlt string file [file] ...
Alternate text to display for a file, instead of an icon selected by filename
AddAltByEncoding string MIME-encoding [MIME-encoding] ...
Alternate text to display for a file instead of an icon selected by MIME-encoding
AddAltByType string MIME-type [MIME-type] ...
Alternate text to display for a file, instead of an icon selected by MIME content-type
AddCharset charset extension [extension] ...
Maps the given filename extensions to the specified content charset AddDefaultCharset On|Off|charset Off Default charset parameter to be added when a response content-type is text/plain or text/html AddDescription string file [file] ... Description to display for a file AddEncoding encoding extension [extension] ... Maps the given filename extensions to the specified encoding type AddHandler handler-name extension [extension] ... Maps the filename extensions to the specified handler AddIcon icon name [name] ... Icon to display for a file selected by name AddIconByEncoding icon MIME-encoding [MIME-encoding] ... Icon to display next to files selected by MIME content-encoding AddIconByType icon MIME-type [MIME-type] ... Icon to display next to files selected by MIME content-type AddInputFilter filter[;filter...] extension [extension] ... Maps filename extensions to the filters that will process client requests AddLanguage language-tag extension [extension] ... Maps the given filename extension to the specified content language AddModuleInfo module-name string Adds additional information to the module information displayed by the server-info handler AddOutputFilter filter[;filter...] extension [extension] ... Maps filename extensions to the filters that will process responses from the server AddOutputFilterByType filter[;filter...] media-type [media-type] ... assigns an output filter to a particular media-type AddType media-type extension [extension] ... Maps the given filename extensions onto the specified content type Alias [URL-path] file-path|directory-path Maps URLs to filesystem locations AliasMatch regex file-path|directory-path Maps URLs to filesystem locations using regular expressions Allow from all|host|env=[!]env-variable [host|env=[!]env-variable] ... Controls which hosts can access an area of the server AllowCONNECT port[-port] [port[-port]] ... 443 563 Ports that are allowed to CONNECT through the proxy s p. 382 svdh p. 383 sv p. 384 svdh p. 445 svdh p. 
544 svdh p. 544 svdh p. 545 svdh p. 752 svdh p. 384 svdh p. 545 svdh p. 752 svdh p. 753 svdh p. 546 svdh p. 546 svdh p. 547 svdh C C C B B B B B C B B B B B B B p. 753 svdh B p. 754 sv E p. 682 svdh B p. 754 svdh B p. 633 svdh p. 755 svd p. 448 sv p. 449 dh B B B E p. 441 sv E p. 829 12.3. DIRECTIVE QUICK REFERENCE AllowEncodedSlashes On|Off|NoDecode Off Determines whether encoded path separators in URLs are allowed to be passed through AllowHandlers [not] none|handler-name all [none|handler-name]... Restrict access to the listed handlers AllowMethods reset|HTTP-method [HTTP-method]... reset Restrict access to the listed HTTP methods AllowOverride All|None|directive-type [directive-type] ... None (2.3.9 and lat + Types of directives that are allowed in .htaccess files AllowOverrideList None|directive [directive-type] ... None Individual directives that are allowed in .htaccess files Anonymous user [user] ... Specifies userIDs that are allowed access without password verification Anonymous LogEmail On|Off On Sets whether the password entered will be logged in the error log Anonymous MustGiveEmail On|Off On Specifies whether blank passwords are allowed Anonymous NoUserID On|Off Off Sets whether the userID field may be empty Anonymous VerifyEmail On|Off Off Sets whether to check the password field for a correctly formatted email address AsyncFilter request|connection|network request Set the minimum filter type eligible for asynchronous handling AsyncRequestWorkerFactor factor Limit concurrent connections per process AuthBasicAuthoritative On|Off On Sets whether authorization and authentication are passed to lower level modules AuthBasicFake off|username [password] Fake basic authentication using the given expressions for username and password AuthBasicProvider provider-name [provider-name] ... 
file Sets the authentication provider(s) for this location AuthBasicUseDigestAlgorithm MD5|Off Off Check passwords against the authentication providers as if Digest Authentication was in force instead of Basic Authentication. AuthDBDUserPWQuery query SQL query to look up a password for a user AuthDBDUserRealmQuery query SQL query to look up a password hash for a user and realm. AuthDBMGroupFile file-path Sets the name of the database file containing the list of user groups for authorization AuthDBMType default|SDBM|GDBM|NDBM|DB default Sets the type of database file that is used to store passwords AuthDBMUserFile file-path Sets the name of a database file containing the list of users and passwords for authentication AuthDigestAlgorithm MD5|MD5-sess MD5 Selects the algorithm used to calculate the challenge and response hashes in digest authentication AuthDigestDomain URI [URI] ... URIs that are in the same protection space for digest authentication AuthDigestNcCheck On|Off Off Enables or disables checking of the nonce-count sent by the server AuthDigestNonceFormat format Determines how the nonce is generated AuthDigestNonceLifetime seconds 300 How long the server nonce is valid AuthDigestProvider provider-name [provider-name] ... file Sets the authentication provider(s) for this location AuthDigestQop none|auth|auth-int [auth|auth-int] auth Determines the quality-of-protection to use in digest authentication AuthDigestShmemSize size 1000 The amount of shared memory to allocate for keeping track of clients AuthFormAuthoritative On|Off On Sets whether authorization and authentication are passed to lower level modules AuthFormBody fieldname The name of a form field carrying the body of the request to attempt on successful login AuthFormDisableNoStore On|Off Off Disable the CacheControl no-store header on the login page 1107 sv C p. 385 d X p. 454 d p. 455 d p. 386 d p. 387 dh p. 478 dh p. 478 dh p. 479 dh p. 479 dh p. 479 sv p. 388 s p. 1004 dh p. 458 dh p. 459 dh p. 
460 dh p. 460 d p. 485 d p. 486 dh p. 533 dh p. 487 dh p. 488 dh p. 463 dh p. 463 s p. 464 dh p. 464 dh p. 464 dh p. 464 dh p. 465 s p. 465 dh p. 470 d p. 471 d p. 471 X C C E E E E E C M B B B B E E E E E E E E E E E E E B B B 1108 CHAPTER 12. GLOSSARY AND INDEX AuthFormFakeBasicAuth On|Off Off d Fake a Basic Authentication header p. 471 AuthFormLocation fieldname d The name of a form field carrying a URL to redirect to on successful login p. 472 AuthFormLoginRequiredLocation url d The URL of the page to be redirected to should login be required p. 472 AuthFormLoginSuccessLocation url d The URL of the page to be redirected to should login be successful p. 472 AuthFormLogoutLocation uri d The URL to redirect to after a user has logged out p. 473 AuthFormMethod fieldname d The name of a form field carrying the method of the request to attempt on successful login p. 473 AuthFormMimetype fieldname d The name of a form field carrying the mimetype of the body of the request to attempt on successful login p. 473 AuthFormPassword fieldname d The name of a form field carrying the login password p. 474 AuthFormProvider provider-name [provider-name] ... file dh Sets the authentication provider(s) for this location p. 474 AuthFormSitePassphrase secret d Bypass authentication checks for high traffic sites p. 475 AuthFormSize size d The largest size of the form in bytes that will be parsed for the login details p. 475 AuthFormUsername fieldname d The name of a form field carrying the login username p. 475 AuthGroupFile file-path dh Sets the name of a text file containing the list of user groups for authorization p. 534 AuthLDAPAuthorizePrefix prefix AUTHORIZE dh Specifies the prefix for environment variables set during authorization p. 510 AuthLDAPBindAuthoritative off|on on dh Determines if other authentication providers are used when a user can be mapped to a DN but the server cannot successfully bind with the user’s credentials. p. 
511 AuthLDAPBindDN distinguished-name dh Optional DN to use in binding to the LDAP server p. 511 AuthLDAPBindPassword password dh Password used in conjuction with the bind DN p. 511 AuthLDAPCharsetConfig file-path s Language to charset conversion configuration file p. 512 AuthLDAPCompareAsUser on|off off dh Use the authenticated user’s credentials to perform authorization comparisons p. 512 AuthLDAPCompareDNOnServer on|off on dh Use the LDAP server to compare the DNs p. 513 AuthLDAPDereferenceAliases never|searching|finding|always always dh When will the module de-reference aliases p. 513 AuthLDAPGroupAttribute attribute member uniquemember + dh LDAP attributes used to identify the user members of groups. p. 513 AuthLDAPGroupAttributeIsDN on|off on dh Use the DN of the client username when checking for group membership p. 513 AuthLDAPInitialBindAsUser off|on off dh Determines if the server does the initial DN lookup using the basic authentication users’ own username, instead of anonymously or with hard-coded credentials for the server p. 514 AuthLDAPInitialBindPattern regex substitution (.*) $1 (remote use + dh Specifies the transformation of the basic authentication username to be used when binding to the LDAP server to perform a DN lookup p. 514 AuthLDAPMaxSubGroupDepth Number 0 dh Specifies the maximum sub-group nesting depth that will be evaluated before the user search is discontinued. p. 515 AuthLDAPRemoteUserAttribute uid dh Use the value of the attribute returned during the user query to set the REMOTE USER environment variable p. 516 AuthLDAPRemoteUserIsDN on|off off dh Use the DN of the client username to set the REMOTE USER environment variable p. 516 AuthLDAPSearchAsUser on|off off dh Use the authenticated user’s credentials to perform authorization searches p. 516 AuthLDAPSubGroupAttribute attribute dh Specifies the attribute labels, one value per directive line, used to distinguish the members of the current group that are groups. p. 
517 AuthLDAPSubGroupClass LdapObjectClass groupOfNames groupO + dh Specifies which LDAP objectClass values identify directory objects that are groups during sub-group processing. p. 517 B B B B B B B B B B B B B E E E E E E E E E E E E E E E E E E 12.3. DIRECTIVE QUICK REFERENCE 1109 AuthLDAPUrl url [NONE|SSL|TLS|STARTTLS] dh URL specifying the LDAP search parameters p. 517 AuthMerging Off | And | Or Off dh Controls the manner in which each configuration section’s authorization logic is combined with that of preceding configuration sections. p. 522 AuthName auth-domain dh Authorization realm for use in HTTP authentication p. 481 AuthnCacheContext directory|server|custom-string d Specify a context string for use in the cache key p. 492 AuthnCacheEnable s Enable Authn caching configured anywhere p. 492 AuthnCacheProvideFor authn-provider [...] dh Specify which authn provider(s) to cache for p. 493 AuthnCacheSOCache provider-name[:provider-args] s Select socache backend provider to use p. 493 AuthnCacheTimeout timeout (seconds) dh Set a timeout for cache entries p. 493 ... s Enclose a group of directives that represent an extension of a base authentication provider and referenced by the specified alias p. 482 AuthnzFcgiCheckAuthnProvider provider-name|None option ... d Enables a FastCGI application to handle the check authn authentication hook. p. 499 AuthnzFcgiDefineProvider type provider-name s backend-address Defines a FastCGI application as a provider for authentication and/or authorization p. 500 AuthType None|Basic|Digest|Form dh Type of user authentication p. 482 AuthUserFile file-path dh Sets the name of a text file containing the list of users and passwords for authentication p. 489 AuthzDBDLoginToReferer On|Off Off d Determines whether to redirect the Client to the Referring page on successful login or logout if a Referer request header is presentp. 530 AuthzDBDQuery query d Specify the SQL Query for the required operation p. 
530 AuthzDBDRedirectQuery query d Specify a query to look up a login page for the user p. 530 AuthzDBMType default|SDBM|GDBM|NDBM|DB default dh Sets the type of database file that is used to store list of user groups p. 533 ... Enclose a group of directives that represent an extension of a base authorization provider and referenced by the specified alias p. 523 AuthzSendForbiddenOnFailure On|Off Off dh Send ’403 FORBIDDEN’ instead of ’401 UNAUTHORIZED’ if authentication succeeds but authorization fails p. 523 BalancerGrowth # 5 sv Number of additional Balancers that can be added Post-configuration p. 793 BalancerInherit On|Off On sv Inherit proxy Balancers/Workers defined from the main server p. 793 BalancerMember [balancerurl] url [key=value [key=value d ...]] Add a member to a load balancing group p. 794 BalancerPersist On|Off Off sv Attempt to persist changes made by the Balancer Manager across restarts. p. 794 BrowserMatch regex [!]env-variable[=value] svdh [[!]env-variable[=value]] ... Sets environment variables conditional on HTTP User-Agent p. 902 BrowserMatchNoCase regex [!]env-variable[=value] svdh [[!]env-variable[=value]] ... Sets environment variables conditional on User-Agent without respect to case p. 903 BufferedLogs On|Off Off s Buffer log entries in memory before writing to disk p. 708 BufferSize integer 131072 svdh Maximum size in bytes to buffer by the buffer filter p. 554 CacheDefaultExpire seconds 3600 (one hour) svdh The default duration to cache a document when no expiry date is specified. p. 560 CacheDetailHeader on|off off svdh Add an X-Cache-Detail header to the response. p. 560 E B B B B B B B B E E B B E E E E B B E E E E B B B E E E 1110 CHAPTER 12. GLOSSARY AND INDEX CacheDirLength length 2 The number of characters in subdirectory names CacheDirLevels levels 2 The number of levels of subdirectories in the cache. 
sv p. 571 E
sv p. 571 E

CacheDisable url-string | on
    Context: svdh; Status: E; p. 560
    Disable caching of specified URLs
CacheEnable cache_type [url-string]
    Context: svd; Status: E; p. 561
    Enable caching of specified URLs using a specified storage manager
CacheFile file-path [file-path] ...
    Context: s; Status: X; p. 627
    Cache a list of file handles at startup time
CacheHeader on|off
    Default: off; Context: svdh; Status: E; p. 562
    Add an X-Cache header to the response.
CacheIgnoreCacheControl On|Off
    Default: Off; Context: sv; Status: E; p. 563
    Ignore request to not serve cached content to client
CacheIgnoreHeaders header-string [header-string] ...
    Default: None; Context: sv; Status: E; p. 563
    Do not store the given HTTP header(s) in the cache.
CacheIgnoreNoLastMod On|Off
    Default: Off; Context: svdh; Status: E; p. 564
    Ignore the fact that a response has no Last Modified header.
CacheIgnoreQueryString On|Off
    Default: Off; Context: sv; Status: E; p. 564
    Ignore query string when caching
CacheIgnoreURLSessionIdentifiers identifier [identifier] ...
    Default: None; Context: sv; Status: E; p. 565
    Ignore defined session identifiers encoded in the URL when caching
CacheKeyBaseURL URL
    Default: http://example.com; Context: sv; Status: E; p. 565
    Override the base URL of reverse proxied cache keys.
CacheLastModifiedFactor float
    Default: 0.1; Context: svdh; Status: E; p. 566
    The factor used to compute an expiry date based on the LastModified date.
CacheLock on|off
    Default: off; Context: sv; Status: E; p. 566
    Enable the thundering herd lock.
CacheLockMaxAge integer
    Default: 5; Context: sv; Status: E; p. 566
    Set the maximum possible age of a cache lock.
CacheLockPath directory
    Default: mod_cache-lock; Context: sv; Status: E; p. 567
    Set the lock path directory.
CacheMaxExpire seconds
    Default: 86400 (one day); Context: svdh; Status: E; p. 567
    The maximum time in seconds to cache a document
CacheMaxFileSize bytes
    Default: 1000000; Context: svdh; Status: E; p. 571
    The maximum size (in bytes) of a document to be placed in the cache
CacheMinExpire seconds
    Default: 0; Context: svdh; Status: E; p. 567
    The minimum time in seconds to cache a document
CacheMinFileSize bytes
    Default: 1; Context: svdh; Status: E; p. 572
    The minimum size (in bytes) of a document to be placed in the cache
CacheNegotiatedDocs On|Off
    Default: Off; Context: sv; Status: B; p. 768
    Allows content-negotiated documents to be cached by proxy servers
CacheQuickHandler on|off
    Default: on; Context: sv; Status: E; p. 567
    Run the cache from the quick handler.
CacheReadSize bytes
    Default: 0; Context: svdh; Status: E; p. 572
    The minimum size (in bytes) of the document to read and be cached before sending the data downstream
CacheReadTime milliseconds
    Default: 0; Context: svdh; Status: E; p. 572
    The minimum time (in milliseconds) that should elapse while reading before data is sent downstream
CacheRoot directory
    Context: sv; Status: E; p. 573
    The directory root under which cache files are stored
CacheSocache type[:args]
    Context: sv; Status: E; p. 575
    The shared object cache implementation to use
CacheSocacheMaxSize bytes
    Default: 102400; Context: svdh; Status: E; p. 575
    The maximum size (in bytes) of an entry to be placed in the cache
CacheSocacheMaxTime seconds
    Default: 86400; Context: svdh; Status: E; p. 575
    The maximum time (in seconds) for a document to be placed in the cache
CacheSocacheMinTime seconds
    Default: 600; Context: svdh; Status: E; p. 576
    The minimum time (in seconds) for a document to be placed in the cache
CacheSocacheReadSize bytes
    Default: 0; Context: svdh; Status: E; p. 576
    The minimum size (in bytes) of the document to read and be cached before sending the data downstream
CacheSocacheReadTime milliseconds
    Default: 0; Context: svdh; Status: E; p. 576
    The minimum time (in milliseconds) that should elapse while reading before data is sent downstream
CacheStaleOnError on|off
    Default: on; Context: svdh; Status: E; p. 568
    Serve stale content in place of 5xx responses.
CacheStoreExpired On|Off
    Default: Off; Context: svdh; Status: E; p. 568
    Attempt to cache responses that the server reports as expired
CacheStoreNoStore On|Off
    Default: Off; Context: svdh; Status: E; p. 569
    Attempt to cache requests or responses that have been marked as no-store.
CacheStorePrivate On|Off
    Default: Off; Context: svdh; Status: E; p. 569
    Attempt to cache responses that the server has marked as private
CGIDScriptTimeout time[s|ms]
    Context: svdh; Status: B; p. 583
    The length of time to wait for more output from the CGI program
CGIMapExtension cgi-path .extension
    Context: dh; Status: C; p. 388
    Technique for locating the interpreter for CGI scripts
CGIPassAuth On|Off
    Default: Off; Context: dh; Status: C; p. 389
    Enables passing HTTP authorization headers to scripts as CGI variables
CGIVar variable rule
    Context: dh; Status: C; p. 389
    Controls how some CGI variables are set
CharsetDefault charset
    Context: svdh; Status: E; p. 586
    Charset to translate into
CharsetOptions option [option] ...
    Default: ImplicitAdd; Context: svdh; Status: E; p. 586
    Configures charset translation behavior
CharsetSourceEnc charset
    Context: svdh; Status: E; p. 586
    Source charset of files
CheckBasenameMatch on|off
    Default: Off; Context: svdh; Status: E; p. 914
    Also match files with differing file name extensions.
CheckCaseOnly on|off
    Default: Off; Context: svdh; Status: E; p. 915
    Limits the action of the speling module to case corrections
CheckSpelling on|off
    Default: Off; Context: svdh; Status: E; p. 915
    Enables the spelling module
ChrootDir /path/to/directory
    Context: s; Status: B; p. 972
    Directory for apache to run chroot(8) after startup.
ContentDigest On|Off
    Default: Off; Context: svdh; Status: C; p. 389
    Enables the generation of Content-MD5 HTTP Response headers
CookieDomain domain
    Context: svdh; Status: E; p. 977
    The domain to which the tracking cookie applies
CookieExpires expiry-period
    Context: svdh; Status: E; p. 978
    Expiry time for the tracking cookie
CookieName token
    Default: Apache; Context: svdh; Status: E; p. 978
    Name of the tracking cookie
CookieStyle Netscape|Cookie|Cookie2|RFC2109|RFC2965
    Default: Netscape; Context: svdh; Status: E; p. 978
    Format of the cookie header field
CookieTracking on|off
    Default: off; Context: svdh; Status: E; p. 979
    Enables tracking cookie
CoreDumpDirectory directory
    Context: s; Status: M; p. 990
    Directory where Apache HTTP Server attempts to switch before dumping core
CTAuditStorage directory
    Context: s; Status: E; p. 958
    Existing directory where data for off-line audit will be stored
CTLogClient executable
    Context: s; Status: E; p. 958
    Location of certificate-transparency log client tool
CTLogConfigDB filename
    Context: s; Status: E; p. 959
    Log configuration database supporting dynamic updates
CTMaxSCTAge num-seconds
    Context: s; Status: E; p. 959
    Maximum age of SCT obtained from a log, before it will be refreshed
CTProxyAwareness oblivious|aware|require
    Context: sv; Status: E; p. 959
    Level of CT awareness and enforcement for a proxy
CTSCTStorage directory
    Context: s; Status: E; p. 960
    Existing directory where SCTs are managed
CTServerHelloSCTLimit limit
    Context: s; Status: E; p. 960
    Limit on number of SCTs that can be returned in ServerHello
CTStaticLogConfig log-id|- public-key-file|- 1|0|- min-timestamp|- max-timestamp|- log-URL|-
    Context: s; Status: E; p. 960
    Static configuration of information about a log
CTStaticSCTs certificate-pem-file sct-directory
    Context: s; Status: E; p. 961
    Static configuration of one or more SCTs for a server certificate
CustomLog file|pipe|provider format|nickname [env=[!]environment-variable| expr=expression]
    Context: sv; Status: B; p. 708
    Sets filename and format of log file
Dav On|Off|provider-name
    Default: Off; Context: d; Status: E; p. 591
    Enable WebDAV HTTP methods
DavDepthInfinity on|off
    Default: off; Context: svd; Status: E; p. 591
    Allow PROPFIND, Depth: Infinity requests
DavGenericLockDB file-path
    Context: svd; Status: E; p. 593
    Location of the DAV lock database
DavLockDB file-path
    Context: sv; Status: E; p. 592
    Location of the DAV lock database
DavMinTimeout seconds
    Default: 0; Context: svd; Status: E; p. 591
    Minimum amount of time the server holds a lock on a DAV resource
DBDExptime time-in-seconds
    Default: 300; Context: sv; Status: E; p. 596
    Keepalive time for idle connections
DBDInitSQL "SQL statement"
    Context: sv; Status: E; p. 596
    Execute an SQL statement after connecting to a database
DBDKeep number
    Default: 2; Context: sv; Status: E; p. 597
    Maximum sustained number of connections
DBDMax number
    Default: 10; Context: sv; Status: E; p. 597
    Maximum number of connections
DBDMin number
    Default: 1; Context: sv; Status: E; p. 597
    Minimum number of connections
DBDParams param1=value1[,param2=value2]
    Context: sv; Status: E; p. 597
    Parameters for database connection
DBDPersist On|Off
    Context: sv; Status: E; p. 598
    Whether to use persistent connections
DBDPrepareSQL "SQL statement" label
    Context: sv; Status: E; p. 598
    Define an SQL prepared statement
DBDriver name
    Context: sv; Status: E; p. 598
    Specify an SQL driver
DefaultIcon url-path
    Context: svdh; Status: B; p. 547
    Icon to display for files when no specific icon is configured
DefaultLanguage language-tag
    Context: svdh; Status: B; p. 756
    Defines a default language-tag to be sent in the Content-Language header field for all resources in the current context that have not been assigned a language-tag by some other means.
DefaultRuntimeDir directory-path
    Default: DEFAULT_REL_RUNTIME +; Context: s; Status: C; p. 390
    Base directory for the server run-time files
DefaultType media-type|none
    Default: none; Context: svdh; Status: C; p. 390
    This directive has no effect other than to emit warnings if the value is not none. In prior versions, DefaultType would specify a default media type to assign to response content for which no other media type configuration could be found.
Define parameter-name [parameter-value]
    Context: sv; Status: C; p. 391
    Define a variable
DeflateAlterETag AddSuffix|NoChange|Remove
    Default: AddSuffix; Context: sv; Status: E; p. 602
    How the outgoing ETag header should be modified during compression
DeflateBufferSize value
    Default: 8096; Context: sv; Status: E; p. 602
    Fragment size to be compressed at one time by zlib
DeflateCompressionLevel value
    Context: sv; Status: E; p. 603
    How much compression do we apply to the output
DeflateFilterNote [type] notename
    Context: sv; Status: E; p. 603
    Places the compression ratio in a note for logging
DeflateInflateLimitRequestBody value
    Context: svdh; Status: E; p. 604
    Maximum size of inflated request bodies
DeflateInflateRatioBurst value
    Context: svdh; Status: E; p. 604
    Maximum number of times the inflation ratio for request bodies can be crossed
DeflateInflateRatioLimit value
    Context: svdh; Status: E; p. 604
    Maximum inflation ratio for request bodies
DeflateMemLevel value
    Default: 9; Context: sv; Status: E; p. 604
    How much memory should be used by zlib for compression
DeflateWindowSize value
    Default: 15; Context: sv; Status: E; p. 605
    Zlib compression window size
Deny from all|host|env=[!]env-variable [host|env=[!]env-variable] ...
    Context: dh; Status: E; p. 442
    Controls which hosts are denied access to the server
<Directory directory-path> ... </Directory>
    Context: sv; Status: C; p. 391
    Enclose a group of directives that apply only to the named file-system directory, sub-directories, and their contents.
DirectoryCheckHandler On|Off
    Default: Off; Context: svdh; Status: B; p. 607
    Toggle how this module responds when another handler is configured
DirectoryIndex disabled | local-url [local-url] ...
    Default: index.html; Context: svdh; Status: B; p. 608
    List of resources to look for when the client requests a directory
DirectoryIndexRedirect on | off | permanent | temp | seeother | 3xx-code
    Default: off; Context: svdh; Status: B; p. 609
    Configures an external redirect for directory indexes.
<DirectoryMatch regex> ... </DirectoryMatch>
    Context: sv; Status: C; p. 393
    Enclose directives that apply to the contents of file-system directories matching a regular expression.
DirectorySlash On|Off
    Default: On; Context: svdh; Status: B; p. 609
    Toggle trailing slash redirects on or off
DocumentRoot directory-path
    Default: /usr/local/apache/h +; Context: sv; Status: C; p. 394
    Directory that forms the main document tree visible from the web
DTracePrivileges On|Off
    Default: Off; Context: s; Status: X; p. 782
    Determines whether the privileges required by dtrace are enabled.
DumpIOInput On|Off
    Default: Off; Context: s; Status: E; p. 612
    Dump all input data to the error log
DumpIOOutput On|Off
    Default: Off; Context: s; Status: E; p. 612
    Dump all output data to the error log
<Else> ... </Else>
    Context: svdh; Status: C; p. 394
    Contains directives that apply only if the condition of a previous <If> or <ElseIf> section is not satisfied by a request at runtime
<ElseIf expression> ... </ElseIf>
    Context: svdh; Status: C; p. 395
    Contains directives that apply only if a condition is satisfied by a request at runtime while the condition of a previous <If> or <ElseIf> section is not satisfied
EnableExceptionHook On|Off
    Default: Off; Context: s; Status: M; p. 991
    Enables a hook that runs exception handlers after a crash
EnableMMAP On|Off
    Default: On; Context: svdh; Status: C; p. 395
    Use memory-mapping to read files during delivery
EnableSendfile On|Off
    Default: Off; Context: svdh; Status: C; p. 396
    Use the kernel sendfile support to deliver files to the client
Error message
    Context: svdh; Status: C; p. 397
    Abort configuration parsing with a custom error message
ErrorDocument error-code document
    Context: svdh; Status: C; p. 397
    What the server will return to the client in case of an error
ErrorLog file-path|syslog[:facility]
    Default: logs/error_log (Uni +; Context: sv; Status: C; p. 399
    Location where the server will log errors
ErrorLogFormat [connection|request] format
    Context: sv; Status: C; p. 399
    Format specification for error log entries
Example
    Context: svdh; Status: X; p. 618
    Demonstration directive to illustrate the Apache module API
ExpiresActive On|Off
    Default: Off; Context: svdh; Status: E; p. 620
    Enables generation of Expires headers
ExpiresByType MIME-type seconds
    Context: svdh; Status: E; p. 620
    Value of the Expires header configured by MIME type
ExpiresDefault seconds
    Context: svdh; Status: E; p. 621
    Default algorithm for calculating expiration time
ExtendedStatus On|Off
    Default: Off[*]; Context: s; Status: C; p. 401
    Keep track of extended status information for each request
ExtFilterDefine filtername parameters
    Context: s; Status: E; p. 624
    Define an external filter
ExtFilterOptions option [option] ...
    Default: NoLogStderr; Context: d; Status: E; p. 625
    Configure mod_ext_filter options
FallbackResource disabled | local-url
    Context: svdh; Status: B; p. 610
    Define a default URL for requests that don’t map to a file
FileETag component ...
    Default: MTime Size; Context: svdh; Status: C; p. 402
    File attributes used to create the ETag HTTP response header for static files
<Files filename> ... </Files>
    Context: svdh; Status: C; p. 403
    Contains directives that apply to matched filenames
<FilesMatch regex> ... </FilesMatch>
    Context: svdh; Status: C; p. 403
    Contains directives that apply to regular-expression matched filenames
FilterChain [+=-@!]filter-name ...
    Context: svdh; Status: B; p. 634
    Configure the filter chain
FilterDeclare filter-name [type]
    Context: svdh; Status: B; p. 634
    Declare a smart filter
FilterProtocol filter-name [provider-name] proto-flags
    Context: svdh; Status: B; p. 635
    Deal with correct HTTP protocol handling
FilterProvider filter-name provider-name expression
    Context: svdh; Status: B; p. 635
    Register a content filter
FilterTrace filter-name level
    Context: svd; Status: B; p. 636
    Get debug/diagnostic information from mod_filter
FirehoseConnectionInput [ block | nonblock ] filename
    Context: s; Status: E; p. 638
    Capture traffic coming into the server on each connection
FirehoseConnectionOutput [ block | nonblock ] filename
    Context: s; Status: E; p. 639
    Capture traffic going out of the server on each connection
FirehoseProxyConnectionInput [ block | nonblock ] filename
    Context: s; Status: E; p. 639
    Capture traffic coming into the back of mod_proxy
FirehoseProxyConnectionOutput [ block | nonblock ] filename
    Context: s; Status: E; p. 639
    Capture traffic sent out from the back of mod_proxy
FirehoseRequestInput [ block | nonblock ] filename
    Context: s; Status: E; p. 640
    Capture traffic coming into the server on each request
FirehoseRequestOutput [ block | nonblock ] filename
    Context: s; Status: E; p. 640
    Capture traffic going out of the server on each request
ForceLanguagePriority None|Prefer|Fallback [Prefer|Fallback]
    Default: Prefer; Context: svdh; Status: B; p. 768
    Action to take if a single acceptable document is not found
ForceType media-type|None
    Context: dh; Status: C; p. 404
    Forces all matching files to be served with the specified media type in the HTTP Content-Type header field
ForensicLog filename|pipe
    Context: sv; Status: E; p. 715
    Sets filename of the forensic log
GlobalLog file|pipe|provider format|nickname [env=[!]environment-variable| expr=expression]
    Context: s; Status: B; p. 710
    Sets filename and format of log file
GprofDir /tmp/gprof/|/tmp/gprof/%
    Context: sv; Status: C; p. 405
    Directory to write gmon.out profiling data to.
GracefulShutdownTimeout seconds
    Default: 0; Context: s; Status: M; p. 991
    Specify a timeout after which a gracefully shutdown server will exit.
Group unix-group
    Default: #-1; Context: s; Status: B; p. 972
    Group under which the server will answer requests
H2Direct on|off
    Default: on for h2c, off for +; Context: sv; Status: E; p. 652
    H2 Direct Protocol Switch
H2MaxSessionStreams n
    Default: 100; Context: sv; Status: E; p. 653
    Maximum number of active streams per HTTP/2 session.
H2MaxWorkerIdleSeconds n
    Default: 600; Context: s; Status: E; p. 653
    Maximum number of seconds h2 workers remain idle until shut down.
H2MaxWorkers n
    Context: s; Status: E; p. 653
    Maximum number of worker threads to use per child process.
H2MinWorkers n
    Context: s; Status: E; p. 654
    Minimal number of worker threads to use per child process.
H2ModernTLSOnly on|off
    Default: on; Context: sv; Status: E; p. 654
    Require HTTP/2 connections to be "modern TLS" only
H2Push on|off
    Default: on; Context: sv; Status: E; p. 654
    H2 Server Push Switch
H2PushDiarySize n
    Default: 256; Context: sv; Status: E; p. 655
    H2 Server Push Diary Size
H2PushPriority mime-type [after|before|interleaved] [weight]
    Default: * After 16; Context: sv; Status: E; p. 656
    H2 Server Push Priority
H2SerializeHeaders on|off
    Default: off; Context: sv; Status: E; p. 657
    Serialize Request/Response Processing Switch
H2SessionExtraFiles n
    Context: sv; Status: E; p. 657
    Number of Extra File Handles
H2StreamMaxMemSize bytes
    Default: 65536; Context: sv; Status: E; p. 658
    Maximum amount of output data buffered per stream.
H2TLSCoolDownSecs seconds
    Default: 1; Context: sv; Status: E; p. 658
H2TLSWarmUpSize amount
    Default: 1048576; Context: sv; Status: E; p. 659
H2Upgrade on|off
    Default: on for h2c, off for +; Context: sv; Status: E; p. 659
    H2 Upgrade Protocol Switch
H2WindowSize bytes
    Default: 65535
    Size of Stream Window for upstream data.
Header [condition] add|append|echo|edit|edit*|merge|set|setifempty|unset|note header [[expr=]value [replacement] [early|env=[!]varname|expr=expression]] Configure HTTP response headers HeaderName filename Name of the file that will be inserted at the top of the index listing HeartbeatAddress addr:port Multicast address for heartbeat packets HeartbeatListenaddr:port multicast address to listen for incoming heartbeat requests HeartbeatMaxServers number-of-servers 10 Specifies the maximum number of servers that will be sending heartbeat requests to this server HeartbeatStorage file-path logs/hb.dat Path to store heartbeat data HeartbeatStorage file-path logs/hb.dat Path to read heartbeat data HostnameLookups On|Off|Double Off Enables DNS lookups on client IP addresses IdentityCheck On|Off Off Enables logging of the RFC 1413 identity of the remote user IdentityCheckTimeout seconds 30 Determines the timeout duration for ident requests IdleShutdown seconds 0 Enable shutting down the httpd when it is idle for some time. ... Contains directives that apply only if a condition is satisfied by a request at runtime ... Encloses directives that will be processed only if a test is true at startup ... Encloses directives that are processed conditional on the presence or absence of a specific module ... contains version dependent configuration ImapBase map|referer|URL http://servername/ Default base for imagemap files ImapDefault error|nocontent|map|referer|URL nocontent Default action when an imagemap is called with coordinates that are not explicitly mapped ImapMenu none|formatted|semiformatted|unformatted formatted Action if no coordinates are given when calling an imagemap Include file-path|directory-path|wildcard Includes other configuration files from within the server configuration files IncludeOptional file-path|directory-path|wildcard Includes other configuration files from within the server configuration files IndexHeadInsert "markup ..." 
Inserts text in the HEAD section of an index page. IndexIgnore file [file] ... "." Adds to the list of files to hide when listing a directory IndexIgnoreReset ON|OFF Empties the list of files to hide when listing a directory IndexOptions [+|-]option [[+|-]option] ... Various configuration settings for directory indexing IndexOrderDefault Ascending|Descending Ascending Name Name|Date|Size|Description Sets the default ordering of the directory index IndexStyleSheet url-path Adds a CSS stylesheet to the directory index InputSed sed-command Sed command to filter request data (typically POST data) ISAPIAppendLogToErrors on|off off Record HSE APPEND LOG PARAMETER requests from ISAPI extensions to the error log ISAPIAppendLogToQuery on|off on Record HSE APPEND LOG PARAMETER requests from ISAPI extensions to the query field ISAPICacheFile file-path [file-path] ... ISAPI .dll files to be loaded at startup 1115 sv E p. 660 svdh E p. 643 svdh p. 547 s p. 647 s p. 648 s p. 648 s p. 649 s p. 692 svd p. 405 svd p. 661 svd p. 661 s p. 969 svdh p. 406 svdh p. 406 svdh p. 407 svdh p. 980 svdh p. 665 svdh p. 666 svdh p. 666 svd p. 408 svd p. 409 svdh p. 548 svdh p. 548 svdh p. 549 svdh p. 549 svdh p. 552 svdh p. 553 dh p. 882 svdh p. 685 svdh p. 685 sv p. 685 B X X X X X C E E E C C C E B B B C C B B B B B B X B B B 1116 CHAPTER 12. GLOSSARY AND INDEX ISAPIFakeAsync on|off off svdh Fake asynchronous support for ISAPI callbacks p. 686 ISAPILogNotSupported on|off off svdh Log unsupported feature requests from ISAPI extensions p. 686 ISAPIReadAheadBuffer size 49152 svdh Size of the Read Ahead Buffer sent to ISAPI extensions p. 686 KeepAlive On|Off On sv Enables HTTP persistent connections p. 409 KeepAliveTimeout num[ms] 5 sv Amount of time the server will wait for subsequent requests on a persistent connection p. 410 KeptBodySize maximum size in bytes 0 d Keep the request body instead of discarding it up to the specified maximum size, for potential use by filters such as mod include. 
p. 866 LanguagePriority MIME-lang [MIME-lang] ... svdh The precendence of language variants for cases where the client does not express a preference p. 769 LDAPCacheEntries number 1024 s Maximum number of entries in the primary LDAP cache p. 698 LDAPCacheTTL seconds 600 s Time that cached items remain valid p. 699 LDAPConnectionPoolTTL n -1 sv Discard backend connections that have been sitting in the connection pool too long p. 699 LDAPConnectionTimeout seconds s Specifies the socket connection timeout in seconds p. 699 LDAPLibraryDebug 7 s Enable debugging in the LDAP SDK p. 700 LDAPOpCacheEntries number 1024 s Number of entries used to cache LDAP compare operations p. 700 LDAPOpCacheTTL seconds 600 s Time that entries in the operation cache remain valid p. 700 LDAPReferralHopLimit number dh The maximum number of referral hops to chase before terminating an LDAP query. p. 701 LDAPReferrals On|Off|default On dh Enable referral chasing during queries to the LDAP server. p. 701 LDAPRetries number-of-retries 3 s Configures the number of LDAP server retries. p. 701 LDAPRetryDelay seconds 0 s Configures the delay between LDAP server retries. p. 702 LDAPSharedCacheFile file-path s Sets the shared memory cache file p. 702 LDAPSharedCacheSize bytes 500000 s Size in bytes of the shared-memory cache p. 702 LDAPTimeout seconds 60 s Specifies the timeout for LDAP search and bind operations, in seconds p. 702 LDAPTrustedClientCert type directory-path/filename/nickname dh [password] Sets the file containing or nickname referring to a per connection client certificate. Not all LDAP toolkits support per connection client certificates. p. 703 LDAPTrustedGlobalCert type directory-path/filename s [password] Sets the file or database containing global trusted Certificate Authority or global client certificates p. 703 LDAPTrustedMode type sv Specifies the SSL/TLS mode to be used when connecting to an LDAP server. p. 
704 LDAPVerifyServerCert On|Off On s Force server certificate verification p. 704 ... dh Restrict enclosed access controls to only certain HTTP methods p. 410 ... dh Restrict access controls to all HTTP methods except the named ones p. 411 LimitInternalRecursion number [number] 10 sv Determine maximum number of internal redirects and nested subrequests p. 411 LimitRequestBody bytes 0 svdh Restricts the total size of the HTTP request body sent from the client p. 412 LimitRequestFields number 100 sv Limits the number of HTTP request header fields that will be accepted from the client p. 412 LimitRequestFieldSize bytes 8190 sv Limits the size of the HTTP request header allowed from the client p. 413 B B B C C B B E E E E E E E E E E E E E E E E E E C C C C C C 12.3. DIRECTIVE QUICK REFERENCE LimitRequestLine bytes Limit the size of the HTTP request line that will be accepted from the client LimitXMLRequestBody bytes Limits the size of an XML-based request body Listen [IP-address:]portnumber [protocol] IP addresses and ports that the server listens to ListenBacklog backlog Maximum length of the queue of pending connections ListenCoresBucketsRatio ratio Ratio between the number of CPU cores (online) and the number of listeners’ buckets LoadFile filename [filename] ... Link in the named object file or library LoadModule module filename Links in the object file or library, and adds to the list of active modules ... Applies the enclosed directives only to matching URLs ... Applies the enclosed directives only to regular-expression matching URLs LogFormat format|nickname [nickname] Describes a format for use in a log file LogIOTrackTTFB ON|OFF Enable tracking of time to first byte (TTFB) LogLevel [module:]level [module:level] ... Controls the verbosity of the ErrorLog LogLevel ipaddress[/prefixlen] [module:]level [module:level] ... 
Override the verbosity of the ErrorLog for certain clients LogMessage message [hook=hook] [expr=expression] Log user-defined message to error log LuaAuthzProvider provider name /path/to/lua/script.lua function name Plug an authorization provider function into MOD AUTHZ CORE LuaCodeCache stat|forever|never Configure the compiled code cache. LuaHookAccessChecker /path/to/lua/script.lua hook function name [early|late] Provide a hook for the access checker phase of request processing LuaHookAuthChecker /path/to/lua/script.lua hook function name [early|late] Provide a hook for the auth checker phase of request processing LuaHookCheckUserID /path/to/lua/script.lua hook function name Provide a hook for the check user id phase of request processing LuaHookFixups /path/to/lua/script.lua hook function name Provide a hook for the fixups phase of a request processing LuaHookInsertFilter /path/to/lua/script.lua hook function name Provide a hook for the insert filter phase of request processing LuaHookLog /path/to/lua/script.lua log function name Provide a hook for the access log phase of a request processing LuaHookMapToStorage /path/to/lua/script.lua hook function name Provide a hook for the map to storage phase of request processing LuaHookTranslateName /path/to/lua/script.lua hook function name [early|late] Provide a hook for the translate name phase of request processing LuaHookTypeChecker /path/to/lua/script.lua hook function name Provide a hook for the type checker phase of request processing LuaInherit none|parent-first|parent-last Controls how parent configuration sections are merged into children LuaInputFilter filter name /path/to/lua/script.lua function name Provide a Lua function for content input filtering 1117 8190 1000000 0 (disabled) "%h %l %u %t \"%r\" + OFF warn sv p. 413 svdh p. 414 s p. 992 s p. 993 s p. 993 sv p. 909 sv p. 909 sv p. 414 sv p. 416 sv p. 710 svdh p. 717 svd p. 416 sv C C M M M E E C C B E C C p. 418 d X p. 712 s X stat p. 734 svdh X p. 
735 svdh X p. 735 svdh X p. 735 svdh X p. 736 svdh X p. 736 svdh X p. 737 svdh X p. 737 svdh X p. 738 sv X p. 738 svdh X parent-first p. 739 svdh X p. 740 s X p. 740 1118 CHAPTER 12. GLOSSARY AND INDEX LuaMapHandler uri-pattern /path/to/lua/script.lua [function-name] Map a path to a lua handler LuaOutputFilter filter name /path/to/lua/script.lua function name Provide a Lua function for content output filtering LuaPackageCPath /path/to/include/?.soa Add a directory to lua’s package.cpath LuaPackagePath /path/to/include/?.lua Add a directory to lua’s package.path LuaQuickHandler /path/to/script.lua hook function name Provide a hook for the quick handler of request processing LuaRoot /path/to/a/directory Specify the base path for resolving relative paths for mod lua directives LuaScope once|request|conn|thread|server [min] [max] once One of once, request, conn, thread – default is once ... Define a configuration file macro MaxConnectionsPerChild number 0 Limit on the number of connections that an individual child server will handle during its life MaxKeepAliveRequests number 100 Number of requests allowed on a persistent connection MaxMemFree KBytes 2048 Maximum amount of memory that the main allocator is allowed to hold without calling free() MaxRangeOverlaps default | unlimited | none | 20 number-of-ranges Number of overlapping ranges (eg: 100-200,150-300) allowed before returning the complete resource MaxRangeReversals default | unlimited | none | 20 number-of-ranges Number of range reversals (eg: 100-200,50-70) allowed before returning the complete resource MaxRanges default | unlimited | none | number-of-ranges 200 Number of ranges allowed before returning the complete resource MaxRequestWorkers number Maximum number of connections that will be processed simultaneously MaxSpareServers number 10 Maximum number of idle child server processes MaxSpareThreads number Maximum number of idle threads MaxThreads number 2048 Set the maximum number of worker threads 
MemcacheConnTTL num[units] 15s Keepalive time for idle connections MergeTrailers [on|off] off Determines whether trailers are merged into headers MetaDir directory .web Name of the directory to find CERN-style meta information files MetaFiles on|off off Activates CERN meta-file processing MetaSuffix suffix .meta File name suffix for the file containing CERN-style meta information MimeMagicFile file-path Enable MIME-type determination based on file contents using the specified magic file MinSpareServers number 5 Minimum number of idle child server processes MinSpareThreads number Minimum number of idle threads available to handle request spikes MMapFile file-path [file-path] ... Map a list of files into memory at startup time ModemStandard V.21|V.26bis|V.32|V.34|V.92 Modem standard to simulate ModMimeUsePathInfo On|Off Off Tells MOD MIME to treat path info components as part of the filename MultiviewsMatch Any|NegotiatedOnly|Filters|Handlers NegotiatedOnly [Handlers|Filters] The types of files that will be included when searching for a matching file with MultiViews svdh X p. 741 s X p. 741 svdh p. 742 svdh p. 742 sv p. 743 svdh p. 743 svdh p. 743 svd p. 747 s p. 994 sv p. 418 s p. 994 svd X X X X X B M C M C p. 419 svd C p. 419 svd p. 419 s p. 994 s p. 1010 s p. 995 s p. 1007 sv p. 912 sv p. 420 svdh p. 578 svdh p. 579 svdh p. 579 sv p. 765 s p. 1010 s p. 995 s p. 627 d p. 606 d p. 757 svdh p. 757 C M M M M E C E E E E M M X X B B 12.3. DIRECTIVE QUICK REFERENCE Mutex mechanism [default|mutex-name] ... [OmitPID] default Configures mutex mechanism and lock file directory for all or specified mutexes NameVirtualHost addr[:port] DEPRECATED: Designates an IP address for name-virtual hosting NoProxy host [host] ... Hosts, domains, or networks that will be connected to directly NWSSLTrustedCerts filename [filename] ... 
List of additional client certificates NWSSLUpgradeable [IP-address:]portnumber Allows a connection to be upgraded to an SSL connection upon request Options [+|-]option [[+|-]option] ... FollowSymlinks Configures what features are available in a particular directory Order ordering Deny,Allow Controls the default access state and the order in which A LLOW and D ENY are evaluated. OutputSed sed-command Sed command for filtering response content PassEnv env-variable [env-variable] ... Passes environment variables from the shell PidFile filename httpd.pid File where the server records the process ID of the daemon PolicyConditional ignore|log|enforce Enable the conditional request policy. PolicyConditionalURL url URL describing the conditional request policy. PolicyEnvironment variable log-value ignore-value Override policies based on an environment variable. PolicyFilter on|off Enable or disable policies for the given URL space. PolicyKeepalive ignore|log|enforce Enable the keepalive policy. PolicyKeepaliveURL url URL describing the keepalive policy. PolicyLength ignore|log|enforce Enable the content length policy. PolicyLengthURL url URL describing the content length policy. PolicyMaxage ignore|log|enforce age Enable the caching minimum max-age policy. PolicyMaxageURL url URL describing the caching minimum freshness lifetime policy. PolicyNocache ignore|log|enforce Enable the caching no-cache policy. PolicyNocacheURL url URL describing the caching no-cache policy. PolicyType ignore|log|enforce type [ type [ ... ]] Enable the content type policy. PolicyTypeURL url URL describing the content type policy. PolicyValidation ignore|log|enforce Enable the validation policy. PolicyValidationURL url URL describing the content type policy. PolicyVary ignore|log|enforce header [ header [ ... ]] Enable the Vary policy. PolicyVaryURL url URL describing the content type policy. PolicyVersion ignore|log|enforce HTTP/0.9|HTTP/1.0|HTTP/1.1 Enable the version policy. 
PolicyVersionURL url URL describing the minimum request HTTP version policy. PrivilegesMode FAST|SECURE|SELECTIVE FAST Trade off processing speed and efficiency vs security against malicious privileges-aware code. Protocol protocol Protocol for a listening socket 1119 s p. 420 s p. 422 sv p. 794 s p. 770 s p. 770 svdh p. 423 dh p. 442 dh p. 882 svdh p. 615 s p. 996 svd p. 774 svd p. 774 svd p. 774 svd p. 775 svd p. 775 svd p. 776 svd p. 776 svd p. 776 svd p. 776 svd p. 777 svd p. 777 svd p. 777 svd p. 778 svd p. 778 svd p. 778 svd p. 779 svd p. 779 svd p. 779 svd p. 779 svd p. 780 svd p. 782 sv p. 424 C C E B B C E X B M E E E E E E E E E E E E E E E E E E E E X C 1120 CHAPTER 12. GLOSSARY AND INDEX ProtocolEcho On|Off Off Turn the echo server on or off Protocols protocol ... http/1.1 Protocols available for a server/virtual host ProtocolsHonorOrder On|Off On Determines if order of Protocols determines precedence during negotiation ... Container for directives applied to proxied resources ProxyAddHeaders Off|On On Add proxy information in X-Forwarded-* headers ProxyBadHeader IsError|Ignore|StartBody IsError Determines how to handle bad header lines in a response ProxyBlock *|hostname|partial-hostname [hostname|partial-hostname]... Disallow proxy requests to certain hosts ProxyDomain Domain Default domain name for proxied requests ProxyErrorOverride On|Off Off Override error pages for proxied content ProxyExpressDBMFile Pathname to DBM file. ProxyExpressDBMFile DBM type of file. ProxyExpressEnable [on|off] Enable the module functionality. 
sv p. 614 sv p. 425 sv p. 425 sv p. 796 svd p. 796 sv p. 797 sv p. 797 sv p. 798 svd p. 798 sv p. 831 sv p. 831 sv p. 832
X C C E E E E E E E E E

ProxyFtpDirCharset character_set    ISO-8859-1    svd    E
    Define the character set for proxied FTP listings    p. 839
ProxyFtpEscapeWildcards [on|off]    svd    E
    Whether wildcards in requested filenames are escaped when sent to the FTP server    p. 839
ProxyFtpListOnWildcard [on|off]    svd    E
    Whether wildcards in requested filenames trigger a file listing    p. 839
ProxyHCExpr name {ap_expr expression}    sv    E
    Creates a named condition expression to use to determine health of the backend based on its response.    p. 842
ProxyHCTemplate name parameter=setting <...>    sv    E
    Creates a named template for setting various health check parameters    p. 843
ProxyHCTPsize    sv    E
    Sets the size of the threadpool used for the health check workers.    p. 843
ProxyHTMLBufSize bytes    svd    B
    Sets the buffer size increment for buffering inline scripts and stylesheets.    p. 844
ProxyHTMLCharsetOut Charset | *    svd    B
    Specify a charset for mod_proxy_html output.    p. 845
ProxyHTMLDocType HTML|XHTML [Legacy] OR ProxyHTMLDocType fpi [SGML|XML] OR ProxyHTMLDocType html5 OR ProxyHTMLDocType auto    auto (2.5/trunk ver +    svd    B
    Sets an HTML or XHTML document type declaration.    p. 845
ProxyHTMLEnable On|Off    Off    svd    B
    Turns the proxy_html filter on or off.    p. 846
ProxyHTMLEvents attribute [attribute ...]    svd    B
    Specify attributes to treat as scripting events.    p. 846
ProxyHTMLExtended On|Off    Off    svd    B
    Determines whether to fix links in inline scripts, stylesheets, and scripting events.    p. 846
ProxyHTMLFixups [lowercase] [dospath] [reset]    svd    B
    Fixes for simple HTML errors.    p. 847
ProxyHTMLInterp On|Off    Off    svd    B
    Enables per-request interpolation of ProxyHTMLURLMap rules.    p. 847
ProxyHTMLLinks element attribute [attribute2 ...]    svd    B
    Specify HTML elements that have URL attributes to be rewritten.    p. 847
ProxyHTMLMeta On|Off    Off    svd    B
    Turns on or off extra pre-parsing of metadata in HTML sections.    p. 848
ProxyHTMLStripComments On|Off    Off    svd    B
    Determines whether to strip HTML comments.    p. 848
ProxyHTMLURLMap from-pattern to-pattern [flags] [cond]    svd    B
    Defines a rule to rewrite HTML links    p. 848
ProxyIOBufferSize bytes    8192    sv    E
    Determine size of internal data throughput buffer    p. 798
...    sv    E
    Container for directives applied to regular-expression-matched proxied resources    p. 799
ProxyMaxForwards number    -1    sv    E
    Maximum number of proxies that a request can be forwarded through    p. 799
ProxyPass [path] !|url [key=value [key=value ...]] [nocanon] [interpolate] [noquery]    svd    E
    Maps remote servers into the local server URL-space    p. 800
ProxyPassInherit On|Off    On    sv    E
    Inherit ProxyPass directives defined from the main server    p. 808
ProxyPassInterpolateEnv On|Off    Off    svd    E
    Enable Environment Variable interpolation in Reverse Proxy configurations    p. 808
ProxyPassMatch [regex] !|url [key=value [key=value ...]]    svd    E
    Maps remote servers into the local server URL-space using regular expressions    p. 808
ProxyPassReverse [path] url [interpolate]    svd    E
    Adjusts the URL in HTTP response headers sent from a reverse proxied server    p. 809
ProxyPassReverseCookieDomain internal-domain public-domain [interpolate]    svd    E
    Adjusts the Domain string in Set-Cookie headers from a reverse-proxied server    p. 810
ProxyPassReverseCookiePath internal-path public-path [interpolate]    svd    E
    Adjusts the Path string in Set-Cookie headers from a reverse-proxied server    p. 810
ProxyPreserveHost On|Off    Off    svd    E
    Use incoming Host HTTP request header for proxy request    p. 811
ProxyReceiveBufferSize bytes    0    sv    E
    Network buffer size for proxied HTTP and FTP connections    p. 811
ProxyRemote match remote-server    sv    E
    Remote proxy used to handle certain requests    p. 811
ProxyRemoteMatch regex remote-server    sv    E
    Remote proxy used to handle requests matched by regular expressions    p. 812
ProxyRequests On|Off    Off    sv    E
    Enables forward (standard) proxy requests    p. 812
ProxySCGIInternalRedirect On|Off|Headername    On    svd    E
    Enable or disable internal redirect responses from the backend    p. 854
ProxySCGISendfile On|Off|Headername    Off    svd    E
    Enable evaluation of X-Sendfile pseudo response header    p. 854
ProxySet url key=value [key=value ...]    d    E
    Set various Proxy balancer or member parameters    p. 813
ProxySourceAddress address    sv    E
    Set local IP address for outgoing proxy connections    p. 813
ProxyStatus Off|On|Full    Off    sv    E
    Show Proxy LoadBalancer status in mod_status    p. 813
ProxyTimeout seconds    sv    E
    Network timeout for proxied requests    p. 814
ProxyVia On|Off|Full|Block    Off    sv    E
    Information provided in the Via HTTP response header for proxied requests    p. 814
ProxyWebsocketAsync ON|OFF    sv    E
    Instructs this module to try to create an asynchronous tunnel    p. 856
ProxyWebsocketAsyncDelay num[ms]    0    sv    E
    Sets the amount of time the tunnel waits synchronously for data    p. 857
ProxyWebsocketIdleTimeout num[ms]    0    sv    E
    Sets the maximum amount of time to wait for data on the websockets tunnel    p. 857
QualifyRedirectURL ON|OFF    OFF    svd    C
    Controls whether the REDIRECT_URL environment variable is fully qualified    p. 426
ReadmeName filename    svdh    B
    Name of the file that will be inserted at the end of the index listing    p. 553
ReceiveBufferSize bytes    0    s    M
    TCP receive buffer size    p. 996
Redirect [status] [URL-path] URL    svdh    B
    Sends an external redirect asking the client to fetch a different URL    p. 450
RedirectMatch [status] regex URL    svdh    B
    Sends an external redirect based on a regular expression match of the current URL    p. 451
RedirectPermanent URL-path URL    svdh    B
    Sends an external permanent redirect asking the client to fetch a different URL    p. 451
RedirectTemp URL-path URL    svdh    B
    Sends an external temporary redirect asking the client to fetch a different URL    p. 451
ReflectorHeader inputheader [outputheader]    svdh    B
    Reflect an input header to the output headers    p. 859
RegisterHttpMethod method [method [...]]    s    C
    Register non-standard HTTP methods    p. 426
RemoteIPHeader header-field    sv    B
    Declare the header field which should be parsed for useragent IP addresses    p. 861
RemoteIPInternalProxy proxy-ip|proxy-ip/subnet|hostname ...    sv    B
    Declare client intranet IP addresses trusted to present the RemoteIPHeader value    p. 861
RemoteIPInternalProxyList filename    sv    B
    Declare client intranet IP addresses trusted to present the RemoteIPHeader value    p. 862
RemoteIPProxiesHeader HeaderFieldName    sv    B
    Declare the header field which will record all intermediate IP addresses    p. 862
RemoteIPTrustedProxy proxy-ip|proxy-ip/subnet|hostname ...    sv    B
    Restrict client IP addresses trusted to present the RemoteIPHeader value    p. 863
RemoteIPTrustedProxyList filename    sv    B
    Restrict client IP addresses trusted to present the RemoteIPHeader value    p. 863
RemoveCharset extension [extension] ...    vdh    B
    Removes any character set associations for a set of file extensions    p. 758
RemoveEncoding extension [extension] ...    vdh    B
    Removes any content encoding associations for a set of file extensions    p. 758
RemoveHandler extension [extension] ...    vdh    B
    Removes any handler associations for a set of file extensions    p. 759
RemoveInputFilter extension [extension] ...    vdh    B
    Removes any input filter associations for a set of file extensions    p. 759
RemoveLanguage extension [extension] ...    vdh    B
    Removes any language associations for a set of file extensions    p. 760
RemoveOutputFilter extension [extension] ...    vdh    B
    Removes any output filter associations for a set of file extensions    p. 760
RemoveType extension [extension] ...    vdh    B
    Removes any content type associations for a set of file extensions    p. 760
RequestHeader add|append|edit|edit*|merge|set|setifempty|unset header [[expr=]value [replacement] [early|env=[!]varname|expr=expression]]    svdh    E
    Configure HTTP request headers    p. 645
RequestReadTimeout [header=timeout[-maxtimeout][,MinRate=rate] [body=timeout[-maxtimeout][,MinRate=rate]    sv    E
    Set timeout values for receiving request headers and body from client.    p. 864
Require [not] entity-name [entity-name] ...    dh    B
    Tests whether an authenticated user is authorized by an authorization provider.    p. 523
...    dh    B
    Enclose a group of authorization directives of which none must fail and at least one must succeed for the enclosing directive to succeed.    p. 525
...    dh    B
    Enclose a group of authorization directives of which one must succeed for the enclosing directive to succeed.    p. 525
...    dh    B
    Enclose a group of authorization directives of which none must succeed for the enclosing directive to not fail.    p. 526
RewriteBase URL-path    dh    E
    Sets the base URL for per-directory rewrites    p. 868
RewriteCond TestString CondPattern    svdh    E
    Defines a condition under which rewriting will take place    p. 868
RewriteEngine on|off    off    svdh    E
    Enables or disables runtime rewriting engine    p. 873
RewriteMap MapName MapType:MapSource MapTypeOptions    sv    E
    Defines a mapping function for key-lookup    p. 874
RewriteOptions Options    svdh    E
    Sets some special options for the rewrite engine    p. 874
RewriteRule Pattern Substitution [flags]    svdh    E
    Defines rules for the rewriting engine    p. 876
RLimitCPU seconds|max [seconds|max]    svdh    C
    Limits the CPU consumption of processes launched by Apache httpd children    p. 426
RLimitMEM bytes|max [bytes|max]    svdh    C
    Limits the memory consumption of processes launched by Apache httpd children    p. 427
RLimitNPROC number|max [number|max]    svdh    C
    Limits the number of processes that can be launched by processes launched by Apache httpd children    p. 427
Satisfy Any|All    All    dh    E
    Interaction between host-level access control and user authentication    p. 444
ScoreBoardFile file-path    apache runtime stat +    s    M
    Location of the file used to store coordination data for the child processes    p. 996
Script method cgi-script    svd    B
    Activates a CGI script for a particular request method.    p. 446
ScriptAlias [URL-path] file-path|directory-path    svd    B
    Maps a URL to a filesystem location and designates the target as a CGI script    p. 452
ScriptAliasMatch regex file-path|directory-path    sv    B
    Maps a URL to a filesystem location using a regular expression and designates the target as a CGI script    p. 453
ScriptInterpreterSource Registry|Registry-Strict|Script    Script    svdh    C
    Technique for locating the interpreter for CGI scripts    p. 428
ScriptLog file-path    sv    B
    Location of the CGI script error logfile    p. 582
ScriptLogBuffer bytes    1024    sv    B
    Maximum amount of PUT or POST requests that will be recorded in the scriptlog    p. 582
ScriptLogLength bytes    10385760    sv    B
    Size limit of the CGI script logfile    p. 582
ScriptSock file-path    cgisock    s    B
    The filename prefix of the socket to use for communication with the cgi daemon    p. 584
SecureListen [IP-address:]portnumber Certificate-Name [MUTUAL]    s    B
    Enables SSL encryption for the specified port    p. 770
SeeRequestTail On|Off    Off    s    C
    Determine if mod_status displays the first 63 characters of a request or the last 63, assuming the request itself is greater than 63 chars.    p. 428
SendBufferSize bytes    0    s    M
    TCP buffer size    p. 997
ServerAdmin email-address|URL    sv    C
    Email address that the server includes in error messages sent to the client    p. 429
ServerAlias hostname [hostname] ...    v    C
    Alternate names for a host used when matching requests to name-virtual hosts    p. 429
ServerLimit number    s    M
    Upper limit on configurable number of processes    p. 997
ServerName [scheme://]domain-name|ip-address[:port]    sv    C
    Hostname and port that the server uses to identify itself    p. 430
ServerPath URL-path    v    C
    Legacy URL pathname for a name-based virtual host that is accessed by an incompatible browser    p. 431
ServerRoot directory-path    /usr/local/apache    s    C
    Base directory for the server installation    p. 431
ServerSignature On|Off|EMail    Off    svdh    C
    Configures the footer on server-generated documents    p. 431
ServerTokens Major|Minor|Min[imal]|Prod[uctOnly]|OS|Full    Full    s    C
    Configures the Server HTTP response header    p. 432
Session On|Off    Off    svdh    E
    Enables a session for the current directory or location    p. 887
SessionCookieName name attributes    svdh    E
    Name and attributes for the RFC2109 cookie storing the session    p. 891
SessionCookieName2 name attributes    svdh    E
    Name and attributes for the RFC2965 cookie storing the session    p. 891
SessionCookieRemove On|Off    Off    svdh    E
    Control for whether session cookies should be removed from incoming HTTP headers    p. 891
SessionCryptoCipher name    svdh    X
    The crypto cipher to be used to encrypt the session    p. 894
SessionCryptoDriver name [param[=value]]    s    X
    The crypto driver to be used to encrypt the session    p. 894
SessionCryptoPassphrase secret [ secret ... ]    svdh    X
    The key used to encrypt the session    p. 895
SessionCryptoPassphraseFile filename    svd    X
    File containing keys used to encrypt the session    p. 896
SessionDBDCookieName name attributes    svdh    E
    Name and attributes for the RFC2109 cookie storing the session ID    p. 899
SessionDBDCookieName2 name attributes
    Name and attributes for the RFC2965 cookie storing the session ID
SessionDBDCookieRemove On|Off    On
    Control for whether session ID cookies should be removed from incoming HTTP headers
SessionDBDDeleteLabel label    deletesession
    The SQL query to use to remove sessions from the database
SessionDBDInsertLabel label    insertsession
    The SQL query to use to insert sessions into the database
SessionDBDPerUser On|Off    Off
    Enable a per user session
SessionDBDSelectLabel label    selectsession
    The SQL query to use to select sessions from the database
SessionDBDUpdateLabel label    updatesession
    The SQL query to use to update existing sessions in the database
SessionEnv On|Off    Off
    Control whether the contents of the session are written to the HTTP_SESSION environment variable
SessionExclude path
    Define URL prefixes for which a session is ignored
SessionExpiryUpdateInterval interval    0 (always update)
    Define the number of seconds a session's expiry may change without the session being updated
SessionHeader header
    Import session updates from a given HTTP response header
SessionInclude path
    Define URL prefixes for which a session is valid
SessionMaxAge maxage    0
    Define a maximum age in seconds for a session
SetEnv env-variable [value]
    Sets environment variables
SetEnvIf attribute regex [!]env-variable[=value] [[!]env-variable[=value]] ...
    Sets environment variables based on attributes of the request
SetEnvIfExpr expr [!]env-variable[=value] [[!]env-variable[=value]] ...
    Sets environment variables based on an ap_expr expression
SetEnvIfNoCase attribute regex [!]env-variable[=value] [[!]env-variable[=value]] ...
    Sets environment variables based on attributes of the request without respect to case
SetHandler handler-name|none|expression
    Forces all matching files to be processed by a handler
SetInputFilter filter[;filter...]
    Sets the filters that will process client requests and POST input
SetOutputFilter filter[;filter...]
    Sets the filters that will process responses from the server
SSIEndTag tag    "-->"
    String that ends an include element
SSIErrorMsg message    "[an error occurred +
    Error message displayed when there is an SSI error
SSIETag on|off    off
    Controls whether ETags are generated by the server.
SSILastModified on|off    off
    Controls whether Last-Modified headers are generated by the server.
SSILegacyExprParser on|off    off
    Enable compatibility mode for conditional expressions.
SSIStartTag tag    "
...    s    C
    Contains directives that apply only to a specific hostname or IP address    p. 437
VirtualScriptAlias interpolated-directory|none    none    sv    E
    Dynamically configure the location of the CGI directory for a given virtual host    p. 985
VirtualScriptAliasIP interpolated-directory|none    none    sv    E
    Dynamically configure the location of the CGI directory for a given virtual host    p. 985
Warning message    svdh    C
    Warn from configuration parsing with a custom message    p. 438
WatchdogInterval number-of-seconds    1    s    B
    Watchdog interval in seconds    p. 986
XBitHack on|off|full    off    svdh    B
    Parse SSI directives in files with the execute bit set    p. 679
xml2EncAlias charset alias [alias ...]    s    B
    Recognise Aliases for encoding values    p. 988
xml2EncDefault name    svdh    B
    Sets a default encoding to assume when absolutely no information can be automatically detected    p. 988
xml2StartParse element [element ...]    svdh    B
    Advise the parser to skip leading junk.    p. 989
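
As a rough illustration of how the syntax entries in this quick reference translate into a configuration file, the sketch below combines a few of the directives listed above (ServerName, ServerAdmin, ProxyRequests, ProxyPreserveHost, ProxyPass, ProxyPassReverse, ProxyTimeout, RewriteEngine, RewriteRule) inside a virtual host. The host names, port, paths and timeout value are hypothetical placeholders, and the fragment assumes that mod_proxy, mod_proxy_http and mod_rewrite are loaded; treat it as a minimal sketch rather than a recommended setup.

```apache
# Hypothetical virtual host; all names and paths are placeholders.
<VirtualHost *:80>
    ServerName www.example.com
    ServerAdmin webmaster@example.com

    # Act as a reverse proxy only: leave forward proxying disabled.
    ProxyRequests Off
    # Pass the incoming Host header through to the backend.
    ProxyPreserveHost On
    # Map /app/ onto an assumed backend server and fix up
    # Location-style response headers on the way back.
    ProxyPass "/app/" "http://backend.example.com:8080/"
    ProxyPassReverse "/app/" "http://backend.example.com:8080/"
    # Give up on a proxied request after 30 seconds (assumed value).
    ProxyTimeout 30

    # Externally redirect a legacy URL prefix to its new location.
    RewriteEngine on
    RewriteRule "^/old/(.*)$" "/new/$1" [R=301,L]
</VirtualHost>
```

Each directive here is used in a context that the quick reference permits for it; the individual directive pages remain the place to check defaults and the finer points of argument syntax.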


Source Exif Data:
File Type                       : PDF
File Type Extension             : pdf
MIME Type                       : application/pdf
PDF Version                     : 1.5
Linearized                      : No
Page Count                      : 1138
Page Mode                       : UseOutlines
Author                          : Apache Software Foundation
Title                           : Apache HTTP Server Documentation Version 2.5
Subject                         : 
Creator                         : LaTeX with hyperref package
Producer                        : pdfTeX-1.40.15
Create Date                     : 2016:06:01 12:37:14-04:00
Modify Date                     : 2016:06:01 12:37:14-04:00
Trapped                         : False
PTEX Fullbanner                 : This is pdfTeX, Version 3.14159265-2.6-1.40.15 (TeX Live 2014) kpathsea version 6.2.0