The Windows NT Device Driver Book:
A Guide for Programmers

Art Baker
Cydonix Corporation

To join a Prentice Hall PTR Internet mailing list, point to:
http://www.prenhall.com/register

Prentice Hall PTR
Upper Saddle River, New Jersey 07458
http://www.prenhall.com

Library of Congress Cataloging-in-Publication Data
Baker, Art (Arthur H.)

The Windows NT Device Driver Book: A Guide for Programmers / Art Baker
p. cm.
Includes index.
ISBN 0-13-184474-1
1. Microsoft Windows NT device drivers (Computer programs) I. Title.
QA76.76.D46B355 1996
005.7'126--dc20
96-22449
CIP

Editorial/production supervision and interior design: Joanne Anzalone
Manufacturing manager: Alexis R. Heydt
Acquisitions editor: Mike Meehan
Marketing Manager: Stephen Soloman
Editorial assistant: Kate Hargett
Cover design: Design Source
Cover design director: Jerry Votta

© 1997 by Prentice Hall PTR
Prentice-Hall, Inc.
A Simon & Schuster Company
Upper Saddle River, New Jersey 07458
The publisher offers discounts on this book when ordered in bulk quantities.
For more information, contact:
Corporate Sales Department
Prentice Hall PTR
1 Lake Street
Upper Saddle River, NJ 07458
Phone: 800-382-3419, Fax: 201-236-7141
E-mail: corpsales@prenhall.com

All product names mentioned herein are the trademarks of their respective owners.
All rights reserved. No part of this book may be
reproduced, in any form or by any means,
without permission in writing from the publisher.

Printed in the United States of America

10 9 8 7 6 5 4

ISBN 0-13-184474-1

Prentice-Hall International (UK) Limited, London
Prentice-Hall of Australia Pty. Limited, Sydney
Prentice-Hall Canada Inc., Toronto
Prentice-Hall Hispanoamericana, S.A., Mexico
Prentice-Hall of India Private Limited, New Delhi
Prentice-Hall of Japan, Inc., Tokyo
Simon & Schuster Asia Pte. Ltd., Singapore

Editora Prentice-Hall do Brasil, Ltda., Rio de Janeiro

Contents

PREFACE

ACKNOWLEDGMENTS

CHAPTER 1 INTRODUCTION TO WINDOWS NT DRIVERS

1.1 OVERALL SYSTEM ARCHITECTURE

Design Goals for Windows NT
Hardware Privilege Levels in Windows NT
Base Operating System Components
What's in the Executive
Extensions to the Base Operating System
More about the Win32 Subsystem

1.2 KERNEL-MODE I/O COMPONENTS

Design Goals for the I/O Subsystem
Layered Drivers in Windows NT
SCSI Drivers
Network Drivers

1.3 SPECIAL DRIVER ARCHITECTURES

Video Drivers
Printer Drivers
Multimedia Drivers
Drivers for Legacy 16-bit Applications

1.4 SUMMARY

CHAPTER 2 THE HARDWARE ENVIRONMENT

2.1 HARDWARE BASICS

Device Registers
Accessing Device Registers
Device Interrupts
Data Transfer Mechanisms
Direct Memory Access (DMA) Mechanisms
Device-Dedicated Memory
Requirements for Autoconfiguration

2.2 BUSES AND WINDOWS NT

ISA: The Industry Standard Architecture
MCA: The Micro Channel Architecture
EISA: The Extended Industry Standard Architecture
PCI: The Peripheral Component Interconnect

2.3 HINTS FOR WORKING WITH HARDWARE

Learn about the Hardware
Make Use of Hardware Intelligence
Test the Hardware

2.4 SUMMARY

CHAPTER 3 KERNEL-MODE I/O PROCESSING

3.1 HOW KERNEL-MODE CODE EXECUTES

Exceptions
Interrupts
Kernel-Mode Threads

3.2 USE OF INTERRUPTS BY NT

CPU Priority Levels
Interrupt Processing Sequence
Software-Generated Interrupts

3.3 DEFERRED PROCEDURE CALLS (DPCs)

Operation of a DPC
Behavior of DPCs

3.4 ACCESS TO USER BUFFERS

Buffer-Access Mechanisms

3.5 STRUCTURE OF A KERNEL-MODE DRIVER

Driver Initialization and Cleanup Routines
I/O System Service Dispatch Routines
Data Transfer Routines
Resource Synchronization Callbacks
Other Driver Routines

3.6 I/O PROCESSING SEQUENCE

Request Preprocessing by NT
Request Preprocessing by the Driver
Data Transfer
Postprocessing by the Driver
Postprocessing by the I/O Manager

3.7 SUMMARY

CHAPTER 4 DRIVERS AND KERNEL-MODE OBJECTS

4.1 DATA OBJECTS AND WINDOWS NT

Windows NT and OOP
NT Objects and Win32 Objects

4.2 I/O REQUEST PACKETS (IRPs)

Layout of an IRP
Manipulating IRPs

4.3 DRIVER OBJECTS

Layout of a Driver Object

4.4 DEVICE OBJECTS AND DEVICE EXTENSIONS

Layout of a Device Object
Manipulating Device Objects
Device Extensions

4.5 CONTROLLER OBJECTS AND CONTROLLER EXTENSIONS

Layout of a Controller Object
Manipulating Controller Objects
Controller Extensions

4.6 ADAPTER OBJECTS

Layout of an Adapter Object
Manipulating Adapter Objects

4.7 INTERRUPT OBJECTS

Layout of an Interrupt Object
Manipulating Interrupt Objects

4.8 SUMMARY

CHAPTER 5 GENERAL DEVELOPMENT ISSUES

5.1 DRIVER DESIGN STRATEGIES

Use Formal Design Models
Use Incremental Development
Use the Sample Drivers

5.2 CODING CONVENTIONS AND TECHNIQUES

General Recommendations
Naming Conventions
Header Files
Status Return Values
NT Driver Support Routines
Discarding Initialization Routines
Controlling Driver Paging

5.3 DRIVER MEMORY ALLOCATION

Memory Available to Drivers
Working with the Kernel Stack
Working with the Pool Areas
System Support for Memory Suballocation

5.4 UNICODE STRINGS

Unicode String Datatypes
Working with Unicode

5.5 INTERRUPT SYNCHRONIZATION

The Problem
Interrupt Blocking
Rules for Blocking Interrupts
Synchronization Using Deferred Procedure Calls

5.6 SYNCHRONIZING MULTIPLE CPUs

How Spin Locks Work
Using Spin Locks
Rules for Using Spin Locks

5.7 LINKED LISTS

Singly-Linked Lists
Doubly-Linked Lists
Removing Blocks from a List

5.8 SUMMARY

CHAPTER 6 INITIALIZATION AND CLEANUP ROUTINES

6.1 WRITING A DRIVERENTRY ROUTINE

Execution Context
What a DriverEntry Routine Does
Initializing DriverEntry Points
Creating Device Objects
Choosing a Buffering Strategy
NT and Win32 Device Names

6.2 CODE EXAMPLE: DRIVER INITIALIZATION

INIT.C

6.3 WRITING REINITIALIZE ROUTINES

Execution Context
What a Reinitialize Routine Does

6.4 WRITING AN UNLOAD ROUTINE

Execution Context
What an Unload Routine Does

6.5 CODE EXAMPLE: DRIVER CLEANUP

UNLOAD.C

6.6 WRITING SHUTDOWN ROUTINES

Execution Context
What a Shutdown Routine Does
Enabling Shutdown Notification

6.7 TESTING THE DRIVER

Testing Procedure
The WINOBJ Utility

6.8 SUMMARY
CHAPTER 7 HARDWARE INITIALIZATION

7.1 FINDING AUTO-DETECTED HARDWARE

How Auto-Detection Works
Auto-Detected Hardware and the Registry
Querying the Hardware Database
What a ConfigCallback Routine Does
Using Configuration Data
Translating Configuration Data

7.2 CODE EXAMPLE: LOCATING AUTO-DETECTED HARDWARE

AUTOCON.C

7.3 FINDING UNRECOGNIZED HARDWARE

Adding Driver Parameters to the Registry
Retrieving Parameters from the Registry
Other Sources of Device Information

7.4 CODE EXAMPLE: QUERYING THE REGISTRY

REGCON.C

7.5 ALLOCATING AND RELEASING HARDWARE

How Resource Allocation Works
How to Claim Hardware Resources
How to Release Hardware
Mapping Device Memory
Loading Device Microcode

7.6 CODE EXAMPLE: ALLOCATING HARDWARE

RESALLOC.C

7.7 SUMMARY

CHAPTER 8 DRIVER DISPATCH ROUTINES

8.1 ENABLING DRIVER DISPATCH ROUTINES

I/O Request Dispatching Mechanism
Enabling Specific Function Codes
Deciding Which Function Codes to Support

8.2 EXTENDING THE DISPATCH INTERFACE

Defining Private IOCTL Values
IOCTL Argument-Passing Methods
Writing IOCTL Header Files

8.3 WRITING DRIVER DISPATCH ROUTINES

Execution Context
What Dispatch Routines Do
Exiting the Dispatch Routine

8.4 PROCESSING SPECIFIC KINDS OF REQUESTS

Processing Read and Write Requests
Processing IOCTL Requests
Managing IOCTL Buffers

8.5 TESTING DRIVER DISPATCH ROUTINES

Testing Procedure
Sample Test Program

8.6 SUMMARY

CHAPTER 9

PROGRAMMED I/O DATA TRANSFERS ..... 180

9.1 HOW PROGRAMMED I/O WORKS ..... 180
    What Happens during Programmed I/O ..... 180
    Synchronizing Various Driver Routines ..... 181
9.2 DRIVER INITIALIZATION AND CLEANUP ..... 182
    Initializing the Start I/O Entry Point ..... 182
    Initializing a DpcForIsr Routine ..... 183
    Connecting to an Interrupt Source ..... 183
    Disconnecting from an Interrupt Source ..... 185
9.3 WRITING A START I/O ROUTINE ..... 185
    Execution Context ..... 185
    What the Start I/O Routine Does ..... 186
9.4 WRITING AN INTERRUPT SERVICE ROUTINE (ISR) ..... 186
    Execution Context ..... 186
    What the Interrupt Service Routine Does ..... 187
9.5 WRITING A DPCFORISR ROUTINE ..... 188
    Execution Context ..... 188
    What the DpcForIsr Routine Does ..... 188
    Priority Increments ..... 189
9.6 SOME HARDWARE: THE PARALLEL PORT ..... 189
    How the Parallel Port Works ..... 189
    Device Registers ..... 191
    Interrupt Behavior ..... 192
    A Driver for the Parallel Port ..... 192
9.7 CODE EXAMPLE: PARALLEL PORT DRIVER ..... 192
    XXDRIVER.H ..... 192
    INIT.C ..... 193
    TRANSFER.C ..... 195
9.8 TESTING THE DATA TRANSFER ROUTINES ..... 201
    Testing Procedure ..... 201
9.9 SUMMARY ..... 202

CHAPTER 10

TIMERS ..... 203

10.1 HANDLING DEVICE TIMEOUTS ..... 203
    How I/O Timer Routines Work ..... 203
    How to Catch Device Timeout Conditions ..... 204
10.2 CODE EXAMPLE: CATCHING DEVICE TIMEOUTS ..... 205
    XXDRIVER.H ..... 206
    INIT.C ..... 206
    TRANSFER.C ..... 207
    TIMER.C ..... 209
10.3 MANAGING DEVICES WITHOUT INTERRUPTS ..... 211
    Working with Noninterrupting Devices ..... 211
    How CustomTimerDpc Routines Work ..... 212
    How to Set Up a CustomTimerDpc Routine ..... 213
    How to Specify Expiration Times ..... 214
    Other Uses for CustomTimerDpc Routines ..... 215


10.4 CODE EXAMPLE: A TIMER-BASED DRIVER ..... 215
    XXDRIVER.H ..... 216
    INIT.C ..... 216
    TRANSFER.C ..... 217
10.5 SUMMARY ..... 221

CHAPTER 11

FULL-DUPLEX DRIVERS ..... 222

11.1 DOING TWO THINGS AT ONCE ..... 222
    Do You Need to Process Concurrent IRPs? ..... 223
    How the Modified Driver Architecture Works ..... 223
    Data Structures for a Full-Duplex Driver ..... 224
    Implementing the Alternate Path ..... 225
11.2 USING DEVICE QUEUE OBJECTS ..... 225
    How Device Queue Objects Work ..... 225
    How to Use Device Queue Objects ..... 226
11.3 WRITING CUSTOMDPC ROUTINES ..... 228
    How to Use a CustomDpc Routine ..... 228
    Execution Context ..... 229
11.4 CANCELING I/O REQUESTS ..... 229
    How IRP Cancellation Works ..... 230
    Synchronization Issues ..... 231
    What a Cancel Routine Does ..... 232
    What a Dispatch Cleanup Routine Does ..... 234

11.5 SOME MORE HARDWARE: THE 16550 UART ..... 236
    What the 16550 UART Does ..... 236
    Device Registers ..... 236
    Interrupt Behavior ..... 238

11.6 CODE EXAMPLE: FULL-DUPLEX UART DRIVER ..... 239
    What to Expect ..... 240
    DEVICE_EXTENSION in XXDRIVER.H ..... 240
    DISPATCH.C ..... 241
    DEVQUEUE.C ..... 244
    INPUT.C ..... 247
    ISR.C ..... 249
    CANCEL.C ..... 253
11.7 SUMMARY ..... 257

CHAPTER 12

DMA DRIVERS ..... 258

12.1 HOW DMA WORKS UNDER WINDOWS NT ..... 258
    Hiding DMA Hardware Variations with Adapter Objects ..... 258
    Solving the Scatter/Gather Problem with Mapping Registers ..... 259
    Managing I/O Buffers with Memory Descriptor Lists ..... 261
    Maintaining Cache Coherency ..... 263
    Categorizing DMA Drivers ..... 265
    Limitations of the NT DMA Architecture ..... 265

12.2 WORKING WITH ADAPTER OBJECTS ..... 266
    Finding the Right Adapter Object ..... 266
    Acquiring and Releasing the Adapter Object ..... 268
    Setting Up the DMA Hardware ..... 270
    Flushing the Adapter Object Cache ..... 271

12.3 WRITING A PACKET-BASED SLAVE DMA DRIVER ..... 272
    How Packet-Based Slave DMA Works ..... 272
    Splitting DMA Transfers ..... 274
12.4 CODE EXAMPLE: A PACKET-BASED SLAVE DMA DRIVER ..... 276
    XXDRIVER.H ..... 276
    REGCON.C ..... 277
    TRANSFER.C ..... 278

12.5 WRITING A PACKET-BASED BUS MASTER DMA DRIVER ..... 285
    Setting Up Bus Master Hardware ..... 286
    Hardware with Scatter/Gather Support ..... 288
    Building Scatter/Gather Lists with IoMapTransfer ..... 289

12.6 WRITING A COMMON BUFFER SLAVE DMA DRIVER ..... 291
    Allocating a Common Buffer ..... 291
    Using Common Buffer Slave DMA to Maintain Throughput ..... 292
12.7 WRITING A COMMON BUFFER BUS MASTER DMA DRIVER ..... 296
    How Common-Buffer Bus Master DMA Works ..... 296
12.8 SUMMARY ..... 297

CHAPTER 13

LOGGING DEVICE ERRORS ..... 299

13.1 EVENT-LOGGING IN WINDOWS NT ..... 299
    Deciding What to Log ..... 299
    How Event Logging Works ..... 300
13.2 WORKING WITH MESSAGES ..... 301
    How Message Codes Work ..... 302
    Writing Message Definition Files ..... 303
    A Small Example: XXMSG.MC ..... 305
    Compiling a Message Definition File ..... 307
    Adding Message Resources to a Driver ..... 308
    Registering a Driver as an Event Source ..... 309
13.3 GENERATING LOG ENTRIES ..... 310
    Preparing a Driver for Error Logging ..... 310
    Allocating an Error-Log Packet ..... 311
    Logging the Error ..... 312
13.4 CODE EXAMPLE: AN ERROR-LOGGING ROUTINE ..... 313
    EVENTLOG.C ..... 313
13.5 SUMMARY ..... 319

CHAPTER 14

SYSTEM THREADS ..... 320

14.1 SYSTEM THREADS ..... 320
    When to Use Threads ..... 320
    Creating and Terminating System Threads ..... 321
    Managing Thread Priority ..... 322
    System Worker Threads ..... 322


14.2 THREAD SYNCHRONIZATION .............................................................................. 323

    Time Synchronization ..... 323
    General Synchronization ..... 323

14.3 USING DISPATCHER OBJECTS ............................................................................. 325

    Event Objects ..... 325
    Sharing Events between Drivers ..... 327
    Mutex Objects ..... 327
    Semaphore Objects ..... 329
    Timer Objects ..... 330
    Thread Objects ..... 331
    Variations on the Mutex ..... 332
    Synchronization Deadlocks ..... 333

14.4 CODE EXAMPLE: A THREAD-BASED DRIVER .................................................... 334

    How the Driver Works ..... 334
    The DEVICE_EXTENSION Structure in XXDRIVER.H ..... 335
    The XxCreateDevice Function in INIT.C ..... 336
    The XxDispatchReadWrite Function in DISPATCH.C ..... 338
    THREAD.C ..... 339
    TRANSFER.C ..... 341

14.5 SUMMARY ........................................................................................................... 349

CHAPTER 15

HIGHER-LEVEL DRIVERS ..... 350

15.1 AN OVERVIEW OF INTERMEDIATE DRIVERS ..... 350
    What Are Intermediate Drivers? ..... 350
    Should You Use a Layered Architecture? ..... 351

15.2 WRITING LAYERED DRIVERS ............................................................................. 352

    How Layered Drivers Work ..... 352
    Initialization and Cleanup in Layered Drivers ..... 353
    Code Fragment: Connecting to Another Driver ..... 354
    Other Initialization Concerns for Layered Drivers ..... 356
    I/O Request Processing in Layered Drivers ..... 357
    Code Fragment: Calling a Lower-Level Driver ..... 359
15.3 WRITING I/O COMPLETION ROUTINES ..... 360
    Requesting an I/O Completion Callback ..... 360
    Execution Context ..... 361
    What I/O Completion Routines Do ..... 362
    Code Fragment: An I/O Completion Routine ..... 363

15.4 ALLOCATING ADDITIONAL IRPs ..... 364
    The IRP's I/O Stack Revisited ..... 364
    Controlling the Size of the IRP Stack ..... 365
    Creating IRPs with IoBuildSynchronousFsdRequest ..... 367
    Creating IRPs with IoBuildAsynchronousFsdRequest ..... 368
    Creating IRPs with IoBuildDeviceIoControlRequest ..... 369
    Creating IRPs from Scratch ..... 371
    Setting Up Buffers for Lower Drivers ..... 374
    Keeping Track of Driver-Allocated IRPs ..... 375

15.5 WRITING FILTER DRIVERS ................................................................................. 376

    How Filter Drivers Work ..... 377
    Initialization and Cleanup in Filter Drivers ..... 378
    What Happens behind the Scenes ..... 380
    Making the Attachment Transparent ..... 380

15.6 CODE EXAMPLE: A FILTER DRIVER ..... 381
    YYDRIVER.H-Driver Data Structures ..... 381
    INIT.C-Initialization Code ..... 381
    DISPATCH.C-Filter Dispatch Routines ..... 386
    COMPLETE.C-I/O Completion Routines ..... 390
15.7 WRITING TIGHTLY COUPLED DRIVERS ..... 394
    How Tightly Coupled Drivers Work ..... 394
    Initialization and Cleanup in Tightly Coupled Drivers ..... 395
    I/O Request Processing in Tightly Coupled Drivers ..... 396

15.8 SUMMARY ..... 397

CHAPTER 16

BUILDING AND INSTALLING DRIVERS ..... 398

16.1 BUILDING DRIVERS ..... 398
    What BUILD Does ..... 398
    How to Build a Driver ..... 400
    Writing a SOURCES File ..... 401
    Log Files Generated by BUILD ..... 403
    Recursive BUILD Operations ..... 403
16.2 MISCELLANEOUS BUILD-TIME ACTIVITIES ..... 404
    Using Precompiled Headers ..... 404
    Including Version Information in a Driver ..... 405
    Including Nonstandard Components in a BUILD ..... 407
    Moving Driver Symbol Data into .DBG Files ..... 408
16.3 INSTALLING DRIVERS ..... 409
    How to Install a Driver by Hand ..... 409
    Driver Registry Entries ..... 410
    End-User Installation of Standard Drivers ..... 410
    End-User Installation of Nonstandard Drivers ..... 412
16.4 CONTROLLING DRIVER LOAD SEQUENCE ..... 413
    Changing the Driver's Start Value ..... 413
    Creating Explicit Dependencies between Drivers ..... 414
    Establishing Global Group Dependencies ..... 415
    Controlling Load Sequence within a Group ..... 416
16.5 SUMMARY ..... 418

CHAPTER 17

TESTING AND DEBUGGING DRIVERS ..... 419

17.1 SOME GUIDELINES FOR DRIVER TESTING ..... 419
    The General Approach to Testing Drivers ..... 419
    Using the Microsoft Hardware Compatibility Tests (HCTs) ..... 421
17.2 SOME THOUGHTS ABOUT DRIVER BUGS ..... 422
    Categories of Driver Errors ..... 422
    Reproducing Driver Errors ..... 424
    Coding Strategies That Reduce Debugging ..... 425
    Keeping Track of Driver Bugs ..... 425
17.3 READING CRASH SCREENS ..... 426

    What Happens When the System Crashes ..... 426
    Layout of a STOP Message ..... 427
    Deciphering STOP Messages ..... 429
17.4 AN OVERVIEW OF WINDBG ..... 430
    The Key to Source-Code Debugging ..... 430
    A Few WINDBG Commands ..... 431
17.5 ANALYZING A CRASH DUMP ..... 433
    Goals of the Analysis ..... 433
    Starting the Analysis ..... 433
    Tracing the Stack ..... 434
    Indirect Methods of Investigation ..... 436
    Analyzing Crashes with DUMPEXAM ..... 439

17.6 INTERACTIVE DEBUGGING .................................................................................. 440

    Starting and Stopping a Debug Session ..... 440
    Setting Breakpoints ..... 441
    Setting Hard Breakpoints ..... 442
17.7 WRITING WINDBG EXTENSIONS ..... 442
    How WINDBG Extensions Work ..... 442
    Initialization and Version-Checking Functions ..... 443
    Writing Extension Commands ..... 444
    WINDBG Helper Functions ..... 445
    Building and Using an Extension DLL ..... 446
17.8 CODE EXAMPLE: A WINDBG EXTENSION ..... 446
    XXDBG.C ..... 446
    XXDBG.DEF ..... 451
    SOURCES file ..... 451
    Sample Output ..... 452

17 .9 MISCELLANEOUS DEBUGGING TECHNIQUES ..
.

Leaving Debug Code in the Driver
Catching Incorrect Assumptions
Using BugCheck Callbacks.... .
Catching Memory Leaks
Using Counters, Bits, and Buffers .

.........

.

......................................

.....................................

. . . . .

................

.............................

......................................................

.

.

. . .

..

. . .

. ....
.

..........

. . ...
.

.

. . . .

. ... .... 453
.

.

452
453
. 453
454
. 455

......................................................

. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . .

.

..............

.

.......

.

...................................................

.

...........

.................................................................................................
.

17.10 SUMMARY. . .
.

CHAPTER 19

.

. . . .

. . ..
.

.

.....

. .. .
. .

..

. . . .

.....................

..
.

. . . . .

. .. .
.

.

. . . . . .

.

. . . . . . .. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .

.. . .
.

..

..

. . . . . . . . . . .

.

. . . . . . . .

.

. . . . . . . . . . . . . . . . . . . . . .

. 458

DRIVER PERFORMANCE ............................................................ 459

18.1 GENERAL GUIDELINES
Know Where You're Going
Get to Know the Hardware
Explore Creative Driver Designs
Optimize Code Creatively
Measure Everything You Do

18.2 PERFORMANCE MONITORING IN WINDOWS NT
Some Terminology
How Performance Monitoring Works
How Drivers Export Performance Data

18.3 ADDING COUNTER NAMES TO THE REGISTRY
Counter Definitions in the Registry
Writing LODCTR Command Files
Using LODCTR and UNLODCTR

18.4 THE FORMAT OF PERFORMANCE DATA
Overall Structure of Performance Data
Types of Counters
Objects with Multiple Instances

18.5 WRITING THE DATA-COLLECTION DLL
Contents of the Data-Collection DLL
Error Handling in a Data-Collection DLL
Installing the DLL

18.6 CODE EXAMPLE: A DATA-COLLECTION DLL
XXPERF.C
Building and Installing this Example

18.7 SUMMARY

APPENDIX A THE DEVELOPMENT ENVIRONMENT
A.1 HARDWARE AND SOFTWARE REQUIREMENTS
Connecting the Host and Target
A.2 DEBUG SYMBOL FILES
A.3 ENABLING CRASH DUMPS ON THE TARGET SYSTEM
If You Don't Get Any Crash Dump Files
A.4 ENABLING THE TARGET SYSTEM'S DEBUG CLIENT

APPENDIX B COMMON BUGCHECK CODES
B.1 GENERAL PROBLEMS WITH DRIVERS
B.2 SYNCHRONIZATION PROBLEMS
B.3 CORRUPTED DRIVER DATA STRUCTURES
B.4 MEMORY PROBLEMS
B.5 HARDWARE FAILURES
B.6 CONFIGURATION MANAGER AND REGISTRY PROBLEMS
B.7 FILE SYSTEM PROBLEMS
B.8 SYSTEM INITIALIZATION FAILURES
B.9 INTERNAL SYSTEM FAILURES

BIBLIOGRAPHY

ABOUT THE AUTHOR

INDEX

Preface

In case you haven't guessed, this book explains how to
write, install, and debug kernel-mode device drivers for Windows NT. If you're in the
process of designing or coding an NT driver, or if you're porting an existing driver from
some other operating system, this book is a valuable companion to the Microsoft DDK
documentation.
This book might also have something to say to you if you just need a little more
insight into the workings of Windows NT, particularly the I/O subsystem. Perhaps
you're trying to decide if NT is a reasonable platform for some specific purpose. Or you
may be studying operating systems, and you want to see how theory gets applied in the
real world.
And of course, we mustn't discount the power of morbid curiosity. The same fas­
cination that forces us to slow down as we drive past a car accident can also motivate
us to pull a volume off the bookstore shelf.

What You Should Already Know
Throughout this book, I make several assumptions about what you already know.
First of all, you need to have all the basic Windows NT user skills such as logging in
and running various utilities. Since driver installation requires you to have
administrator-level privileges, you can trash things pretty badly if you don't know how
to use the system.
Second, you'll need decent C-language programming skills. I've tried to avoid the
use of "cleverness" in my code examples, but you still have to be able to read them.
Next, some experience with Win32 user-mode programming is helpful, but it isn't
really required. If you haven't worked with the Win32 API, you might want to browse
through volume two of the Win32 Programmers Reference. This is the one that describes
system services. Take a look at the chapters on the I/O primitives (CreateFile,
ReadFile, WriteFile, and DeviceIoControl) and the thread model. See the bibliography
for other books on Win32 programming.
Finally, you need to understand something about hardware in order to write drivers.
It would be helpful if you already had some experience working with hardware, but
if not, Chapter 2 will give you a basic introduction. Again, the bibliography will point
you toward other, more-detailed sources for this kind of information.

What You'll Find Here
One of the most difficult choices any author has to make is deciding what to write
about and what to leave out. In general, I've attempted to focus on core issues that are
crucial to kernel-mode driver development. I've also tried to provide enough background
information so that you'll be able to read the sample code supplied with the NT
DDK, and make intelligent design choices for your own drivers.
The overall flow of the book goes from the theoretical to the practical, with earlier
chapters providing the underpinnings for later topics. Here's what's covered:

Chapters 1-5
The first part of this book provides the basic foundation you'll need if you plan to
write drivers. This includes a general examination of the Windows NT driver
architecture, a little bit about hardware, and a rather detailed look at the
NT I/O Manager and its data structures. This group of topics ends with some general
kernel-mode coding guidelines and techniques.

Chapters 6-13
These eight chapters form the nucleus of the book and present all the details of
writing kernel-mode NT device drivers. You'll also find discussions here of
full-duplex driver architectures, handling timeout conditions, and logging device
errors. Unless you're already familiar with NT's driver architecture, you should
probably read these chapters in order.

Chapters 14 and 15
The next two chapters deal with alternative driver architectures supported by
Windows NT. This includes the use of kernel-mode threads in drivers and higher-level
drivers.

Chapters 16-18
The final part of the book deals with various practical details of writing NT drivers.
Chapter 16 takes a look at all the things your mother never told you about the BUILD
utility. Chapter 17 covers various aspects of testing and debugging drivers, including
how to analyze crash dumps and how to really get WINDBG to work. If you're actually
writing a driver while you read this book, you may want to read these chapters out of
order. Chapter 18 examines the crucial issue of driver performance and how to tie your
driver into NT's performance monitoring mechanisms.

Appendices
The appendices cover various topics that people in my classes have asked about. The
first one deals with the mechanics of setting up a driver development environment.
The second appendix contains a list of the bugcheck codes you're most likely to
encounter, along with descriptions of their various parameters. Used in conjunction with
the material in Chapter 17, this may help you track down the cause of a blue screen or
two.

What You Won't Find
I excluded topics from this book for several reasons. Some subjects were just too
large to cover. Others addressed the needs of too small a segment of the driver-writing
community. Finally, some areas of driver development are simply unsupported by
Microsoft. Specifically, you won't find anything here about the following items:

File system drivers
At the time this book went to press, Microsoft still hadn't released any kind of
developer's kit for NT file system drivers. In fact, there seemed to be a great deal of
resistance to the idea within Microsoft. Until this situation changes, there's not much
point in talking about the architecture of file system drivers.

Net-card and network protocol drivers
NDIS and TDI drivers are both very large topics - large enough to fill a book of their
own. Unfortunately, there just wasn't enough room for all of it here. I can offer one bit
of consolation: The material in this book will give you much of the background you need
in order to understand what's happening inside the NDIS/TDI framework.

SCSI miniport and class drivers
Although SCSI HBA miniport drivers are vital system components, the number of people
actually writing them is (I suspect) rather small. Consequently, the only reference to
SCSI miniports is the overview material in Chapter 1.
I would have liked to include a discussion of SCSI class drivers in this book, but
unfortunately there just wasn't any time to write it. The material on developing
intermediate drivers in Chapter 15 will give you much of the necessary background. From
there, take a look at the sample SCSI class driver for CD-ROMs that comes with the
NT DDK.

Video, display, and printer drivers
This is another area where I had to make a tradeoff between the number of people
writing these kinds of drivers and the time available to finish the book. Unfortunately,
graphics drivers for video and hardcopy devices didn't make the cut this time. Perhaps
in a later, expanded version of the book...

Virtual DOS device drivers
In my opinion, the best way to run 16-bit MS-DOS and Windows applications under
Windows NT is to port the source code to Win32. In any event, the Microsoft
documentation does a decent job describing the mechanics of writing VDDs, so I haven't
included anything about them here.

About the Sample Code
There's a great deal of sample driver code scattered throughout this book. You'll
find all of it on the accompanying floppy disk. I've created separate directories on the
floppy for each chapter, and where appropriate, subdirectories for each component or
driver in the chapter.

Coding style
Since the purpose of this book is instruction, I've done a couple of things to improve
the clarity of the samples. First, I've adopted a coding style that avoids smart tricks.
Some of the examples could probably have been written in fewer lines of code, but I
don't think they would have been as easy to understand.
Also in the name of clarity, I've eliminated everything except the bare essentials
from each sample. For example, most of the drivers don't contain any error-logging or
debugging code, although a real driver ought to include these things. These topics have
their own chapters, and you shouldn't have too much trouble back-fitting the code into
other sample drivers.

Naming conventions
You'll notice that almost all the sample drivers appearing in this book are called
"XXDRIVER." (The only exception is the higher-level driver in Chapter 15. Its name is
"YYDRIVER.") This makes it somewhat easier to interchange the parts of different
samples. It also reduces the amount of clutter that you'll be adding to the Registry
while you're playing with these drivers.
Within any particular driver, I've also adopted the convention of adding the prefix
Xx to the names of any driver-defined functions. Similarly, device registers, driver
structures, and constants are also prefixed with XX_. This makes it easy to see which
things you have to write and which ones come from the folks at Microsoft.
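
As a rough illustration of this naming convention, here is a minimal sketch. It assumes the structure and constant prefix reads XX_, and every register offset, bit value, structure field, and function name in it is invented for illustration; none of it is taken from the sample drivers on the floppy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Driver-defined constants and device registers carry the XX_ prefix.
   These particular values are hypothetical. */
#define XX_STATUS_REG    0x00u   /* offset of a status register  */
#define XX_STATUS_READY  0x01u   /* "device ready" bit           */

/* Driver-defined structures carry the XX_ prefix as well. */
typedef struct {
    uint8_t last_status;         /* most recent status register value */
} XX_DEVICE_STATE;

/* Driver-defined functions carry the Xx prefix. */
static int XxIsDeviceReady(const XX_DEVICE_STATE *state)
{
    assert(state != NULL);
    return (state->last_status & XX_STATUS_READY) != 0;
}
```

Anything in a listing without one of these prefixes can then be assumed to come from the system headers rather than from the driver itself.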

Platform dependencies
It's worth mentioning that these samples have been targeted to run on Intel 80x86
platforms. In particular, the drivers all assume that device registers live in I/O space
rather than being memory-mapped. This is relatively easy to fix with a little bit of
coding and some modifications to each driver's hardware-specific header file.

To build and run the examples
You'll need several tools if you plan to do any driver development for Windows NT.
First, get yourself a Level II subscription to the Microsoft Developer Network CDs.
This is the only source for the NT DDK and the Win32 SDK.
You'll also need a C compiler. I've chosen to use the Microsoft compiler for
developing and testing all the code in this book. Your mileage may vary if you're using
some other vendor's tools. See Appendix A for more information on setting up your driver
development environment.

Training and Consulting Services
The material in this book is based on classes that I've been delivering for several
years through Cydonix Corporation - a training and consulting firm whose goal is to
help its clients develop device drivers and other high-performance Windows NT soft­
ware. Cydonix offers services that range from formal classroom training to direct par­
ticipation in software design and coding.
For the past three years, Cydonix has been helping companies like Adaptec,
AT&T, Compaq Computers, Hewlett-Packard, and Intel to learn more about the work­
ings of Windows NT. We have training available in a number of areas including:
• Windows NT device driver programming
• Win32 system service programming
• Advanced server development techniques
Cydonix offers both onsite training at customer facilities and open enrollment
classes that are available to the general public. The public classes are hosted by
training vendors in several geographic areas.
For more information about training and consulting from Cydonix Corporation,
visit our Web site at http://www.cydonix.com or send email to info@cydonix.com.
You can also contact us through more earthbound means using this postal address:
Cydonix Corporation
Suite 304
2117 L Street, N.W.
Washington, DC 20037

Acknowledgments

Many people have kindly contributed to the creation of this volume. First and
foremost, I want to thank David Lucas (to whom this book is dedicated) for his
steadfast friendship and unfaltering faith in me over the years. David, so many
things have been possible in my life only because of you...
My gratitude also goes to the editorial and production staff at Prentice Hall.
Mike Meehan and Joanne Anzalone have shown infinite patience while I tried to
balance my training and consulting schedule with the demands of writing a book.
I'm sure you're glad it's over.
I would be remiss if I didn't acknowledge all the people who've been stu­
dents in my various driver classes over the last twelve years. Your questions and
insights have helped me understand how to communicate this kind of material to
others, and I'm grateful.
Finally, I'm very pleased to say that all crash sequences were performed by
stunt doubles and no programmers or other small animals were actually harmed.

CHAPTER 1

Introduction to Windows NT Drivers

Tradition demands that any book about writing device drivers starts out by answering
the question, "What is a driver?" Unfortunately, asking this question in Windows NT
is a little like asking "What color is
plaid?" because there are at least a dozen different software components that can
rightfully be called drivers. This chapter takes a roundabout look at the different
kinds of drivers supported by Windows NT, and along the way, presents some of
the design philosophy that makes this operating system such an intriguing beast.

1.1 OVERALL SYSTEM ARCHITECTURE
Windows NT drivers don't live in isolation, of course. Rather, they are just one
part of a large and complex operating system. This section takes you on a quick
tour of the Windows NT architecture and points out those features that will be of
most interest to driver writers.

Design Goals for Windows NT

Like every other commercial operating system, Windows NT is the result of
a complex interaction between idealized goals and market-driven realities. The
Windows NT design team set their sights on the following:
• Compatibility - The operating system should support a wide range of
  existing software and legacy hardware.
• Robustness and reliability - The operating system has to resist the
  attacks of naive or malicious users, and individual applications should be
  as isolated from one another as possible.
• Portability - The operating system should be able to run on a wide variety
  of current and future hardware platforms.
• Extendibility - It should be possible to add new features and support
  new I/O devices without perturbing the existing code base.
• Performance - The operating system should be able to give reasonable
  performance on commonly available hardware. It should also be able to
  take advantage of features like multiprocessing hardware.

Trying to balance all these goals with a reasonable time to market was a
complex process. The rest of this section describes the solution that the system
designers came up with - beginning with a look at the protection mechanisms
that keep the operating system safe.

Hardware Privilege Levels in Windows NT

There are any number of things that application programs shouldn't be
allowed to do in a multitasking environment. Fooling with the memory manage­
ment hardware or halting the processor are just two examples of actions that
would cause serious problems. Rather than depending on the kindness of strange
applications, Windows NT takes advantage of hardware-enforced privilege­
checking mechanisms to guarantee system integrity.
To avoid hardware dependencies, Windows NT uses a simplified model to
describe hardware privileges. This model then maps onto whatever privilege­
checking mechanisms are available on a given CPU. A CPU must be able to oper­
ate in two modes if it's going to support the Windows NT hardware privilege
model.

Kernel mode
Anything goes when the CPU runs in kernel mode. A task can execute privileged
instructions, and it has complete access to any I/O devices. It can also touch any
virtual address and fiddle with the virtual memory hardware. This mode corresponds
to Ring 0 on an Intel 80x86.

User mode
In this mode, the hardware prevents execution of privileged instructions and performs
access checks on references to memory and I/O space. This allows the operating system
to restrict a task's access to various I/O operations, and trap any other behavior
that might violate system integrity. Code running in user mode can't get itself into
kernel mode without going through some kind of gate mechanism in the operating system.
On an Intel 80x86 processor, this mode corresponds to Ring 3.

Base Operating System Components
The base components of Windows NT implement a general operating system
platform on which to build more complex environments. As you can see from
Figure 1.1, these base components consist of three major blocks of kernel-mode
code.

[Figure 1.1: Overall architecture of the NT kernel-mode components. Blocks shown: NT Executive, Kernel, Hardware Abstraction Layer (HAL), Hardware Platform. Copyright © 1994 by Cydonix Corporation.]

Hardware Abstraction Layer (HAL)
The HAL is a thin layer of software that presents the rest of the system with an
abstract model of any hardware that's not part of the CPU itself. The HAL exposes a
well-defined set of functions that manage such items as:
• Off-chip caches
• Timers
• I/O buses
• Device registers
• Interrupt controllers
• DMA controllers

Various system components use these HAL functions to interact with off-CPU
hardware. This essentially hides platform-specific details from the rest of the
system and removes the need to have different versions of the operating system
for platforms from different system vendors. In particular, the use of HAL routines
makes the Kernel and device drivers binary-compatible across platforms
with the same CPU architecture.
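
The idea of hiding platform details behind a fixed set of functions can be sketched in ordinary user-mode C. This is only a simulation under stated assumptions: the `hal_ops` table, both "platforms," and `xx_enable_device` are invented names for illustration and are not the HAL's actual interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical operations table: the "abstract model" that callers see. */
typedef struct {
    uint8_t (*read_reg)(uint16_t reg);
    void    (*write_reg)(uint16_t reg, uint8_t value);
} hal_ops;

/* Platform A (simulated): device registers live in a 256-byte port space. */
static uint8_t port_space[256];
static uint8_t a_read(uint16_t reg)                 { return port_space[reg & 0xFF]; }
static void    a_write(uint16_t reg, uint8_t value) { port_space[reg & 0xFF] = value; }

/* Platform B (simulated): the same registers are memory-mapped, 4 bytes apart. */
static uint8_t mapped_space[256 * 4];
static uint8_t b_read(uint16_t reg)                 { return mapped_space[(reg & 0xFF) * 4]; }
static void    b_write(uint16_t reg, uint8_t value) { mapped_space[(reg & 0xFF) * 4] = value; }

static const hal_ops platform_a = { a_read, a_write };
static const hal_ops platform_b = { b_read, b_write };

/* "Driver" code: sets the enable bit (bit 0) in a control register and
   returns the register's new value. It never learns which platform it
   is running on; it only calls through the table. */
static uint8_t xx_enable_device(const hal_ops *hal, uint16_t ctrl_reg)
{
    uint8_t value;

    assert(hal != NULL);
    value = hal->read_reg(ctrl_reg);
    hal->write_reg(ctrl_reg, (uint8_t)(value | 0x01));
    return hal->read_reg(ctrl_reg);
}
```

Calling `xx_enable_device(&platform_a, 0x10)` and `xx_enable_device(&platform_b, 0x10)` runs the same compiled driver routine against two different register layouts, which is the sense in which code written against such an interface can stay binary-compatible across platforms.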

Kernel
Where the HAL is an abstraction of the platform, the Kernel presents an idealized
view of the CPU itself. Among other things, the Kernel provides mechanisms for
• Interrupt and exception dispatching
• Thread scheduling and synchronization
• Multiprocessor synchronization
• Time keeping

By using these Kernel services, upper layers of the operating system can (for
the most part) ignore the architecture of the underlying CPU. This makes it possible
for drivers and higher-level operating system components to be source-code
portable across different CPU architectures. [1]

[1] It also means that much of the work of porting Windows NT to a new CPU is really a
matter of rewriting the Kernel. To make this process easier, Microsoft has adopted a
microkernel approach that tries to keep the Kernel as small as possible.

An interesting feature of the Kernel is that it presents an object-based interface
to its clients. When other parts of the operating system need help from the
Kernel, they request its services by calling functions that create and manipulate
various kinds of objects. These Kernel objects fall into two main categories:
• Dispatcher objects - These are used primarily for managing and synchronizing
  threads.
• Control objects - These objects affect the behavior of the operating system
  itself in some way.

Device drivers don't have much use for dispatcher objects. Those that do are
described in Chapter 14. Control objects are another matter, however. In particular,
device drivers make frequent use of Deferred Procedure Call objects and
Interrupt objects (described in Chapters 3 and 4 respectively).

Executive
The Executive is by far the largest and most complex kernel-mode component in
Windows NT. Its job is to implement many of the basic functions normally associated
with an operating system. Like the Kernel, the Executive uses the HAL to interact
with any off-CPU hardware and so becomes binary compatible across platforms from
different system vendors. By relying on Kernel objects, the Executive gains the
additional advantage of being source-code portable across different CPU
architectures. Because it's such a key part of Windows NT, it's worth exploring the
Executive a little more.

What's in the Executive
As you can see from Figure 1.2, the Executive actually consists of several
distinct software components that offer their services both to user-mode processes
and to one another. These Executive components are completely independent and
communicate only through well-defined interfaces. This modularity
makes it possible to replace an existing Executive component without perturbing
any other parts of the operating system. As long as the replacement exposes the
same interface, the change will be transparent. The remainder of this subsection
gives cursory descriptions of the various Executive modules.

[Figure 1.2: Detailed view of the Executive. Blocks shown: System Service Interface, Object Mgr, Process Mgr, Security Monitor, Config Mgr, I/O Mgr, Virtual Memory Mgr, Local Proc Call. Copyright © 1994 by Cydonix Corporation.]

System service interface
All operating systems have to give user-mode
processes a limited ability to execute kernel-mode code. In particular, there must
be a controlled path from user to kernel mode that applications can follow when
they call system services. In Windows NT, the system service dispatcher uses a
technique based on the CPU's hardware exception mechanism to give user-mode
code access to Executive services.
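
The notion of a single controlled path from user to kernel mode can be sketched as a table-driven gate. This is a user-mode simulation only: the mode flag, the two toy services, and the `GATE_*` codes are invented for illustration, and the sketch does not model the hardware exception mechanism that the real dispatcher relies on.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { MODE_USER, MODE_KERNEL } cpu_mode;

/* Simulated processor mode; a real CPU tracks this in hardware. */
static cpu_mode current_mode = MODE_USER;

typedef long (*service_fn)(long arg);

/* Two toy "system services". Each insists on running in kernel mode. */
static long svc_double(long arg) { assert(current_mode == MODE_KERNEL); return arg * 2; }
static long svc_negate(long arg) { assert(current_mode == MODE_KERNEL); return -arg; }

static const service_fn service_table[] = { svc_double, svc_negate };

#define SERVICE_COUNT    (sizeof service_table / sizeof service_table[0])
#define GATE_OK          0L
#define GATE_BAD_REQUEST (-1L)   /* hypothetical error code */

/* The only way for "user-mode" callers to reach a service: validate the
   request, switch modes, call through the table, and switch back. */
static long system_service_gate(size_t service_number, long arg, long *result)
{
    if (service_number >= SERVICE_COUNT || result == NULL)
        return GATE_BAD_REQUEST;         /* rejected before any mode switch */

    current_mode = MODE_KERNEL;
    *result = service_table[service_number](arg);
    current_mode = MODE_USER;
    return GATE_OK;
}
```

Because callers supply only a service number, they can never choose an arbitrary kernel-mode address to execute; the gate decides what runs.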

Object Manager
The Executive offers its services to user-mode processes
through an object-based interface. These Executive objects represent things such
as files, processes, threads, and shared memory segments. This use of objects pro­
vides a unified mechanism for tracking resources and enforcing security.
The Object Manager does all the grunt work of managing these Executive
objects. This includes creating and deleting objects, maintaining the global object
namespace, and keeping track of how many outstanding references there are to
any given object.
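
The reference-tracking part of that job can be sketched with a simple counted header. This is a user-mode illustration under stated assumptions: the `object_header` layout and the `ob_*` helper names are invented here and say nothing about how the Object Manager is actually implemented.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical header carried by every managed object. */
typedef struct {
    char     name[32];     /* entry in a (simulated) object namespace */
    unsigned ref_count;    /* outstanding references to this object   */
} object_header;

static unsigned live_objects;   /* how many objects currently exist */

/* Create a named object holding one reference for its creator. */
static object_header *ob_create(const char *name)
{
    object_header *obj = calloc(1, sizeof *obj);
    if (obj == NULL)
        return NULL;
    strncpy(obj->name, name, sizeof obj->name - 1);  /* calloc keeps it NUL-terminated */
    obj->ref_count = 1;
    live_objects++;
    return obj;
}

/* Take an additional reference. */
static void ob_reference(object_header *obj)
{
    obj->ref_count++;
}

/* Drop a reference. Returns 1 if this was the last one and the object
   was deleted, 0 if other references are still outstanding. */
static int ob_dereference(object_header *obj)
{
    assert(obj->ref_count > 0);
    if (--obj->ref_count > 0)
        return 0;
    live_objects--;
    free(obj);
    return 1;
}
```

The object goes away only when the last holder lets go, so no component can delete something that another component is still using.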

Configuration Manager
From a driver writer's perspective, the main job of the Configuration Manager is to
maintain a model of all the hardware and software installed on the machine. It does
this using a database called the Registry. As you read through the rest of this book,
you'll see that drivers are linked to the Registry through an intricate web of
connections. Among other things, drivers use the Registry to
• Identify themselves as trusted system components
• Find and allocate peripheral hardware
• Set up error-logging message files
• Enable driver-performance measurement

Process Manager
A process is the unit of resource-tracking and security
access checking in Windows NT. Along with any resources it might be holding,
each process has its own virtual address space and security identity. A process
also contains one or more executable entities called threads. It is the thread (and
not the process) that receives ownership of a CPU and does actual work.
The Process Manager is the Executive component that handles the creation,
management, and deletion of processes and threads. It also provides a standard
set of services for synchronizing the activities of threads. Most of the features
exposed by the Process Manager are just fancy versions of mechanisms imple­
mented by the Kernel.

Security Reference Monitor
This Executive component enforces the system's security policies. The Security
Reference Monitor doesn't actually define security policy; that job belongs to the
Local Security Authority subsystem (described later in this chapter). Rather, the
Security Reference Monitor simply provides a set of primitives that both kernel- and
user-mode components can call to validate access to objects, check for user
privileges, and generate audit messages.
For the most part, device drivers don't concern themselves with security issues and
don't do much with the Security Reference Monitor. The I/O Manager handles those
kinds of details before it calls any routines in your driver.
Virtual Memory Manager Under Windows NT, each process has a flat 4gigabyte virtual address space. The lower half of this space contains process-pri­
vate code and data along with the process's stack and heap space. It also holds
any File Mapping objects and DLLs the process is using. The upper half of every
process's address space contains nothing but kernel-mode code. One of the jobs of
the Executive's Virtual Memory Manager is to maintain this illusion of a huge
address space using demand-paged virtual memory management techniques.
From a driver writer 's point of view, the Virtual Memory Manager is more
important as a memory allocator because it maintains the system heap areas. The
Virtual Memory Manager also builds and manipulates various buffer descriptors
that are crucial to the operation of DMA drivers. Both these topics are covered in
more detail later.
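To make the split of the address space concrete, here is a minimal sketch in portable C. It assumes a 32-bit address space divided exactly at its midpoint, as described above. The constant and function names are hypothetical and are not part of any Windows NT header.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed boundary: the midpoint of a 4-gigabyte (32-bit) address space. */
#define KERNEL_SPACE_START 0x80000000u

/* True if a single virtual address lies in the lower, process-private half. */
bool is_user_address(uint32_t va)
{
    return va < KERNEL_SPACE_START;
}

/* True if the whole range [base, base + length) stays in the lower half.
   Written so that base + length can never wrap around 32 bits. */
bool is_user_range(uint32_t base, uint32_t length)
{
    return length <= KERNEL_SPACE_START &&
           base <= KERNEL_SPACE_START - length;
}
```

A range check of this kind is the sort of test a kernel-mode component might apply to a buffer pointer supplied by an application. The real system performs its own validation; this sketch only illustrates the arithmetic.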
Local Procedure Call facility The Local Procedure Call (LPC) facility is a
message-passing mechanism used for communication between processes on the
same machine. LPCs are used primarily by protected subsystems (described later)
and their clients. Device drivers have no access to the LPC facility.

I/O Manager This Executive component converts I/O requests from user-
and kernel-mode threads into properly sequenced calls to various driver
routines. Through the use of a well-defined formal interface, the I/O Manager is
able to communicate with all drivers the same way. This makes it unnecessary
for the I/O Manager to know anything about the underlying hardware managed
by a given driver. The rest of this book describes the operation of the I/O
Manager in gory detail.
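The idea of one formal interface shared by every driver can be sketched in portable C as a table of routines indexed by request type. This is only a toy model: the request codes, structures, and function names below are invented for illustration and are not the real NT definitions, which later chapters cover.

```c
#include <stddef.h>

/* Hypothetical request codes; not the real NT function codes. */
enum request_type { REQ_OPEN, REQ_READ, REQ_WRITE, REQ_CLOSE, REQ_COUNT };
enum { IO_OK = 0, IO_NOT_SUPPORTED = -1 };

struct io_request {
    enum request_type type;
    size_t length;        /* bytes the caller wants to move */
    size_t transferred;   /* bytes the driver actually moved */
};

typedef int (*driver_routine)(struct io_request *req);

/* Every driver looks the same from above: a name and a routine table. */
struct driver {
    const char *name;
    driver_routine dispatch[REQ_COUNT];
};

/* The "I/O Manager" picks a routine by request type and knows nothing
   about the hardware behind it. */
int io_manager_call(const struct driver *drv, struct io_request *req)
{
    if ((unsigned)req->type >= REQ_COUNT || drv->dispatch[req->type] == NULL)
        return IO_NOT_SUPPORTED;
    return drv->dispatch[req->type](req);
}

/* A stand-in driver: reads return nothing, writes are swallowed. */
static int null_read(struct io_request *req)
{
    req->transferred = 0;
    return IO_OK;
}

static int null_write(struct io_request *req)
{
    req->transferred = req->length;
    return IO_OK;
}

const struct driver null_driver = {
    "null", { NULL, null_read, null_write, NULL }
};
```

Because the caller only ever touches the table, adding a new device in this model means supplying another `struct driver`, with no change to `io_manager_call`.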

Extensions to the Base Operating System
The Executive components of Windows NT present a fairly neutral face to
the world. They don't implement a user interface nor do they define any external
policies like security. They don't even offer a programming interface since the
Executive's system service calls are not publicly documented. The base kernel­
mode components simply provide a generic operating system platform.
Defining the look and feel of the operating system - both to users and pro­
grammers - is the job of some extended components known collectively as pro­
tected subsystems. Rather than dealing directly with the Executive, users and
programmers of Windows NT interact with these subsystems.
In the original architecture of Windows NT, protected subsystems were
implemented entirely as a group of privileged user-mode processes. This rather
elegant design made it possible to extend the base operating system without risk­
ing any damage to the underlying kernel-mode components. For performance
reasons, Windows NT 4.0 has moved away from this pure user-mode model and
shifted some subsystem components into kernel mode.
Depending on the kind of work they do, all protected subsystems can be
divided into two major categories. The following subsections describe each cate­
gory in more detail.

Integral subsystems An integral subsystem performs some necessary
system function. The responsibilities of these subsystems actually cover quite a lot
of territory. The following are just a few examples of what they do.
• Together with the Security Accounts Manager and the Logon process, the
  Local Security Authority defines security policy for the system.
• The Service Control Manager loads, supervises, and unloads trusted
  system components like services and drivers.
• The RPC Locator and RPC Service processes give support to distributed
  applications that use remote procedure calls.

Environment subsystems The other kind of protected subsystem is
called an environment subsystem. The job of an environment subsystem is to pro­
vide a programming interface and execution environment for application pro­
grams native to some specific operating system. Currently, Windows NT provides
the following subsystems:


• The Win32 subsystem implements the native-mode programming interface
  for Windows NT. A more detailed description of this subsystem appears
  below.
• The Virtual DOS Machine (VDM) subsystem allows 16-bit MS-DOS
  applications to run under Windows NT. Unlike other subsystems, the VDM
  software is actually part of the process where the MS-DOS application is
  running.
• The Windows on Windows (WOW) subsystem supports the execution of
  16-bit Windows applications. The default behavior of the WOW subsystem
  is to run all 16-bit Windows applications as separate threads within the
  address space of a single VDM process. This helps to mimic the 16-bit
  Windows environment more closely.
• The POSIX subsystem provides API support for programs conforming to
  the POSIX 1003.1 source-code standard. Because POSIX 1003.1 is not a
  binary standard, applications must be compiled and linked on Windows
  NT in order to run under this subsystem.
• The OS/2 subsystem creates an execution environment for 16-bit OS/2
  applications. This subsystem is available only for the 80x86 version of
  Windows NT.

A given application is always tightly coupled to one specific subsystem and
can use only the features of that subsystem. For example, a POSIX application
can't make calls to Win32 API functions. Also keep in mind that applications run­
ning under any subsystem other than Win32 will experience some performance
degradation. These other subsystems are provided mainly for compatibility.
More about the Win32 Subsystem

All environment subsystems are not created equal. In particular, the services
provided by the Win32 subsystem are crucial to the operation of Windows NT.
The duties of this subsystem include the following:
• As the owner of the screen, keyboard, and mouse, it manages all console
  and GUI I/O for the entire system. This includes I/O for other subsystems
  as well as user applications.
• The Win32 subsystem implements the GUI seen by programmers and
  users. As the screen and window manager for Windows NT, it defines
  GUI policy and style for the whole system.
• It exposes the Win32 API that both application programs and other
  subsystems use to interact with the Executive.

Because of its special status, the Win32 subsystem is implemented in a
different way from any of the others. Figure 1.3 shows the organization of the Win32
subsystem.


[Diagram labels: Win32 Client; Win32 API DLL; NT System Service Interface;
WIN32K.SYS; LPC Facility; User Mode / Kernel Mode boundary.
Copyright © 1996 by Cydonix Corporation.]
Figure 1.3 The Win32 subsystem has both user- and kernel-mode components

Unlike its counterparts, the Win32 subsystem doesn't run entirely in user
mode. Instead, it consists of both user- and kernel-mode components. To under­
stand how it all fits together, you need to know a little bit about the organization
of the Win32 API itself. Broadly speaking, you can divide Win32 functions into
three categories:
• The USER functions manage GUI objects like menus and buttons.
• The GDI functions perform low-level drawing operations on graphical
  devices like displays and printers.
• The KERNEL functions manage such things as processes, threads,
  synchronization objects, shared memory, and files. They map very directly
  onto the system services provided by the Executive.

In the original design of Windows NT, one of the goals was to confine all
GUI policy-making code to the Win32 server process, CSRSS. The developers
believed this would make the system more robust and easier to modify. As a
result, calls to many USER and GDI functions required some interaction with the
CSRSS process. This is a rather expensive operation since it involves a process
context switch between the Win32 client and the CSRSS server. By comparison,
KERNEL functions could be handled in the context of the calling process. Their
only overhead was the transition to and from kernel mode.
This architecture has been replaced in Windows NT 4.0 because of the per­
formance limitations it put on graphically-based Win32 programs. Now, a new
kernel-mode component called WIN32K.SYS has taken over most of the work
formerly done by CSRSS. With this approach, calls to USER and GDI functions can
execute in the context of the calling process. The result is that the speed of
graphically intensive applications improves significantly.
This shift from user- to kernel-mode graphic support also had implications
for the architecture of video and printer drivers under Windows NT. The next sec­
tion of this chapter will provide some more details on this subject.

1.2 KERNEL-MODE I/O COMPONENTS
Here we're going to take a look at the general layered driver model used by the
kernel-mode portions of Windows NT. We'll also be examining variations on this
architecture that support specific kinds of I/O devices.
Design Goals for the I/O Subsystem
In addition to the general Windows NT design goals, there were several
additional requirements that the I/O subsystem had to satisfy:
• Ease of development - It shouldn't take unreasonable amounts of
  work to provide support for a new device.
• Portability - It should be relatively easy to move drivers to new
  platforms. In the best case, this would mean simply compiling and linking
  the driver.
• Extendibility - It should be easy to add support for new devices and file
  systems without breaking anything that already works.
• Robustness - The I/O architecture should offer clean, well-defined
  interfaces and minimize the use of backdoor mechanisms.
• Security - It must be possible to allow or deny various kinds of access
  to I/O objects on a user-by-user basis.
• Multithreaded operation - Drivers should be able to handle overlapping
  requests from multiple threads, even if the threads are running
  simultaneously on multiple CPUs.
• Performance - I/O throughput must be consistent with the needs of
  large-scale client-server applications.

As if all this isn't enough, the I/O architecture has to work with all the leg­
acy devices that people have been attaching to PCs for the last decade. Some of
these devices have characteristics that don't blend well with modern, large-scale
operating systems.
Layered Drivers in Windows NT

In most operating systems, the term driver refers to a piece of code that
manages some peripheral device. Windows NT takes a more flexible approach which
allows several driver layers (shown in Figure 1.4) to exist between an application
program and a piece of hardware. This layering permits Windows NT to define a
driver in much broader terms that include file systems, logical volume managers,
and various network components as well as physical device drivers.

[Diagram labels: I/O Manager; File System Driver; Intermediate Driver;
Device Driver. Copyright © 1994 by Cydonix Corporation.]
Figure 1.4 Layered kernel-mode drivers

Device drivers These are the drivers that manage actual data transfer and
control operations for a specific type of physical device. This includes starting and
completing I/O operations, handling interrupts, and performing any error
processing required by the device.

Intermediate drivers Windows NT allows you to layer any number of
intermediate drivers on top of a physical device driver. These intermediate layers
provide a way of extending the capabilities of the I/O system without having to
modify the drivers below them. For example, the fault-tolerant disk driver in
Windows NT Server is implemented as a layer that sits between the file system
and the drivers for any physical disks.
Another use for intermediate drivers is to separate hardware-specific oper­
ations from more general management issues. In this kind of arrangement, the
intermediate driver is referred to as a class driver and the hardware driver is
called a port driver. For example, the keyboard class driver handles general key­
stroke processing while the keyboard port driver worries about the details of
specific keyboard controllers. The use of separate class and port drivers makes it
easier to target a wider range of hardware since only the port driver needs to be
rewritten.
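The class/port division of labor can be modeled in a few lines of portable C. In this sketch, the "port" side knows only how to fetch raw scan codes from a controller, while the "class" side tracks shift state and translates codes into characters. All structure and function names are hypothetical, and the handful of scan codes is used purely for illustration; real keyboard class and port drivers are considerably more involved.

```c
#include <stddef.h>

/* What a port driver offers to the class driver above it. */
struct kbd_port {
    int (*next_scan_code)(void *ctx);   /* returns -1 when input runs out */
    void *ctx;
};

/* Class driver state: hardware-independent keystroke processing. */
struct kbd_class {
    const struct kbd_port *port;
    int shift_down;
};

/* Returns the next character, or -1 when the port has no more input. */
int kbd_class_read_char(struct kbd_class *kc)
{
    for (;;) {
        int code = kc->port->next_scan_code(kc->port->ctx);
        char c;

        if (code < 0)
            return -1;
        if (code == 0x2A) { kc->shift_down = 1; continue; }  /* shift pressed  */
        if (code == 0xAA) { kc->shift_down = 0; continue; }  /* shift released */

        switch (code) {              /* tiny illustrative translation table */
        case 0x1E: c = 'a'; break;
        case 0x30: c = 'b'; break;
        case 0x2E: c = 'c'; break;
        default:   continue;         /* ignore codes this sketch doesn't know */
        }
        return kc->shift_down ? c - 'a' + 'A' : c;
    }
}

/* A stand-in "port driver" that replays scan codes from an array. */
struct fake_controller {
    const unsigned char *codes;
    size_t count;
    size_t pos;
};

int fake_next_scan_code(void *ctx)
{
    struct fake_controller *fc = ctx;
    if (fc->pos >= fc->count)
        return -1;
    return fc->codes[fc->pos++];
}
```

Supporting a different keyboard controller in this model means writing another `next_scan_code` routine. The translation logic in `kbd_class_read_char` stays untouched, which is the point of the split.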
File-system drivers (FSDs) This kind of driver is generally responsible
for maintaining the on-disk structures needed by various file systems. For design
reasons, some other system components are implemented as file-system drivers,
even though they aren't file systems as such. Microsoft currently supplies the
following FSDs:
• FAT - Windows 95 extended MS-DOS file system
• NTFS - Windows NT high reliability file system
• HPFS - OS/2 high performance file system
• CDFS - ISO 9660 CD-ROM file system
• MSFS - Mailslot file system
• NPFS - Named pipe file system
• RDR - LAN Manager redirector

Unfortunately, you can't develop file-system drivers using the standard NT
DDK. Microsoft released a beta version of a file system developer's kit at a confer­
ence in 1994, but at the time of this writing, they hadn't committed to any release
date for the final version of this kit.

SCSI Drivers
The Windows NT SCSI architecture uses layered drivers to separate the man­
agement of specific devices from the control of the SCSI host bus adapter (HBA)
itself. Figure 1.5 shows the components of the Windows NT SCSI architecture.

[Diagram labels: Filter Driver; Class Driver; NT SCSI Port Driver;
Miniport Driver; SCSI Adapter; SCSI Device.
Copyright © 1996 by Cydonix Corporation.]
Figure 1.5 Architecture of Windows NT SCSI drivers

SCSI port and miniport drivers The port driver is a Microsoft-supplied
component that acts as an interface between a SCSI miniport driver and the oper­
ating system. By handling common SCSI grunt work and hiding the details of the
local operating system, the SCSI port driver makes it easier to write drivers for
new SCSI HBAs. It also reduces the overall size of a miniport and makes it easier
to move the miniport to other operating systems (like Windows 95).
SCSI miniports supply the port driver with routines that perform any HBA­
specific control operations. Generally, the only people writing SCSI miniport
drivers are HBA vendors who want to sell their products in the Windows NT
marketplace.
SCSI class drivers Class drivers manage all the SCSI devices of a particu­
lar type, regardless of what HBA they're attached to. For example, there are SCSI
class drivers for tapes, disks, and CD-ROM drives. Separating device control from
HBA control makes it possible to mix and match SCSI devices and adapters from
different vendors. If you have a device that attaches to a SCSI bus, this is the only
kind of driver you'll need to write.
SCSI filter drivers Filters are optional SCSI components that intercept and
modify requests sent to a SCSI class driver. This allows you to take advantage of exist­
ing class driver capabilities without writing everything from scratch. Filters are useful
if you're developing a class driver for hardware that's similar to some other device.
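Here is a rough model of the filter idea in portable C. The filter sits above a stand-in class driver, adjusts each request, and then forwards it, so the lower layer's logic is reused rather than rewritten. Everything here is an assumption made for illustration: the request structure, the two quirks the filter handles (a block offset and a maximum transfer size), and all of the names.

```c
/* A simplified disk-style request. */
struct disk_request {
    unsigned long block;   /* starting block number */
    unsigned long count;   /* number of blocks */
};

typedef int (*request_handler)(void *ctx, const struct disk_request *req);

/* One layer in the stack: a handler plus its private state. */
struct driver_layer {
    request_handler handle;
    void *ctx;
};

/* Stand-in class driver: just records what it was asked to do. */
struct class_state {
    struct disk_request last;
    int calls;
};

int class_handle(void *ctx, const struct disk_request *req)
{
    struct class_state *cs = ctx;
    cs->last = *req;
    cs->calls++;
    return 0;
}

/* Filter state: the layer below plus the device-specific quirks. */
struct filter_state {
    struct driver_layer lower;
    unsigned long block_offset;  /* hypothetical reserved area to skip */
    unsigned long max_count;     /* hypothetical transfer limit */
};

/* Intercept the request, modify a private copy, and pass it down. */
int filter_handle(void *ctx, const struct disk_request *req)
{
    struct filter_state *fs = ctx;
    struct disk_request adjusted = *req;

    if (req->count > fs->max_count)
        return -1;                       /* rejected before reaching the class driver */
    adjusted.block += fs->block_offset;
    return fs->lower.handle(fs->lower.ctx, &adjusted);
}
```

Note that the filter works on a copy, so the caller's request is left unchanged. Whether a real filter may modify a request in place depends on the driver model, which this sketch does not attempt to capture.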
Network Drivers
In an effort to get better performance, many of the networking components
in Windows NT are implemented as kernel-mode drivers. As you can see from
Figure 1.6, Windows NT uses driver layering to disengage network protocol
management from actual network data transfers. The result is much greater flexibility
and support for a wider range of network protocols and hardware.

Network interface card (NIC) drivers At the bottom of the stack are the
NIC drivers that manage the actual networking hardware. NIC drivers present a
standard interface at their top edge that allows higher-level drivers to send and
receive packets, to reset or halt the NIC, and to query and set the characteristics of
the NIC. The interface to a NIC driver is defined by the network driver interface
specification (NDIS).
NDIS NIC drivers rely heavily on the services provided by the NDIS inter­
face library. This library (sometimes referred to as the NDIS wrapper) handles
many of the nasty details involved in managing asynchronous communications
across a network. The NDIS library also exports a complete set of kernel-mode
system functions so that a properly written NDIS driver doesn't need to deal with
the operating system.
Based on the amount of help they get from the NDIS interface library, you
can classify NIC drivers as either miniports or full drivers. NIC miniports perform
only those hardware-specific operations needed to manage a particular NIC. Code
in the NDIS library takes care of issues common to all NIC miniports such as
synchronization, notification of packet arrival, and queuing of outgoing packets. This
is the preferred type of NIC driver for any new hardware.

[Diagram labels: Sockets Emulator; NetBEUI Emulator; Other kernel-mode
TDI clients; Transport Driver Interface (TDI); Legacy Protocol Driver;
Media-Aware Protocol Driver; NDIS Intermediate Driver; NDIS Miniport Driver;
NDIS Library. Copyright © 1996 by Cydonix Corporation.]
Figure 1.6 Architecture of kernel-mode networking components in Windows NT

By comparison, full NIC drivers do almost everything on their own. This
makes them much harder to write and debug and often slower than NIC
miniports. Originally introduced in the first release of Windows NT, full NIC
drivers are supported only to maintain backward compatibility. No one in their
right mind is developing full NIC drivers anymore.

NDIS intermediate drivers Version 4.0 of NDIS (the one included with
Windows NT 4.0) includes a new kind of component: the NDIS intermediate
driver. NDIS intermediate drivers are sandwiched between transport drivers and
NDIS NIC miniports. To the transport driver, they appear to be NDIS miniports
while to the NIC driver, they look like transport drivers.
NDIS intermediate layers are useful if you have a legacy transport driver
and you want to connect it to some new type of media unknown to the transport
driver. In this situation, the intermediate driver performs any necessary transla­
tions between the transport driver and the NIC miniport managing the new
media.
Transport drivers A transport driver is responsible for implementing a
specific network protocol such as TCP/IP or IPX/SPX. It is independent of the
underlying network hardware and uses NDIS NIC or intermediate drivers to
transfer packets over one or more physical network connections.


All Windows NT transport drivers offer their services to kernel-mode
networking clients through the transport driver interface (TDI). The TDI
specification defines a low-level interface that supports both connection-based and
connectionless (i.e., datagram) protocols. Having all transport drivers expose a
single, common interface simplifies the development of both the transport drivers
and the clients they support.

Kernel-mode networking clients Various kernel-mode components that
access the network use the TDI interface to communicate with protocol drivers.
These kernel-mode TDI clients fall into two broad categories. First, there are
system components whose operation is transparent to user-mode applications. One
example would be the Server and Redirector that handle requests for remote file
access.
The other kind of TDI client is an emulator that exposes some well-known
programming interface. User-mode applications access the network through one
of these standard APIs rather than working directly with TDI. This approach
makes it easier to port existing software to Windows NT and prevents the
needless proliferation of networking APIs. Windows NT currently supports interfaces
for sockets, NetBIOS calls, named pipes, and mailslots.
1.3 SPECIAL DRIVER ARCHITECTURES
Along with the relatively straightforward kernel-mode drivers described in
section 1.2, Windows NT depends on a number of very specialized driver
architectures. The following subsections describe each of them in detail.

Video Drivers
Video support in Windows NT is complicated by the fact that Win32
applications can use three different graphics APIs. First, there's the graphical device
interface (GDI). This API provides a set of device-independent rendering func­
tions for generating two-dimensional output on display or hardcopy devices.
Most Win32 applications use this programming interface because it simplifies the
task of producing identical display and printer output.
For programs that need to produce three-dimensional graphics, Win32 also
supports the OpenGL API. These functions generate the kind of high-quality
output needed by CAD software or scientific visualization tools. In return for the
quality of the output, however, the OpenGL API demands a great deal of CPU
horsepower or hardware rendering assistance.
Finally, for consumer applications (i.e., games), Windows NT supports a
subset of the DirectDraw API included in Windows 95. DirectDraw is one piece of
Microsoft's DirectX game-programming architecture. Its goal is to give user-mode
applications more direct access to video and audio hardware without compromis­
ing the integrity of the system.


[Diagram labels: System Service Interface; GDI Rendering Engine;
I/O Manager; DirectDraw HAL; Video Port; Video Miniport Driver;
Video Hardware. Copyright © 1996 by Cydonix Corporation.]
Figure 1.7 Architecture of NT kernel-mode video drivers

Supporting multiple APIs on video hardware from multiple vendors is a
complex problem. Solving it in a flexible and portable manner requires the inter­
action of a number of software components. Figure 1.7 shows what they are.

GDI engine The GDI engine is the key to Windows NT's device-indepen­
dent output strategy. This Microsoft-supplied component provides full software
rendering support for Win32 GDI calls. In response to a Win32 drawing request,
the GDI engine uses the appropriate display or printer driver to generate com­
mands for a specific piece of hardware.
Display drivers Display drivers are vendor-supplied components that do
the actual work of drawing on the display screen. By selectively overriding the
rendering functions in the GDI engine, they also give Win32 access to any hard­
ware acceleration features provided by the video card.2 Along with a display
driver for a specific piece of video hardware, vendors need to provide a corre­
sponding video miniport (described below).
DirectDraw HAL This vendor-supplied component exposes an abstract
version of the video hardware. This includes the video frame buffer plus any
hardware acceleration mechanisms supported by the DirectDraw API. Any features of
the DirectDraw hardware model not supported by the video device are emulated
by Microsoft's DirectDraw software.

2 In earlier versions of Windows NT, both the GDI engine and the display driver were user-mode
components running in the context of the Win32 subsystem process. To improve graphics
performance, this code runs in kernel mode in Windows NT 4.0.

Video port and miniport drivers The main responsibility of these two
drivers is to manage state changes in the system's video hardware. The video port
and miniport do not take part in any drawing operations. The work of these driv­
ers includes doing such things as:
• Finding and initializing the video controller.
• Managing any cursor or pointer hardware located on the video card.
• Handling mode-set and palette operations when a full-screen MS-DOS
  session is running. (This only applies to 80x86 platforms.)
• Making the video frame buffer available to user-mode processes.

The video port and miniport are actually a tightly-coupled pair of drivers.
The port driver is a Microsoft-supplied framework that simplifies the task of writ­
ing video drivers. It contains only generic, hardware-independent code that is
common to all video drivers.
The miniport is a vendor-supplied driver whose job is to manage a specific
type of video card. In response to calls from the video port driver, it is the
miniport that actually changes the state of the device. This division of labor
between the port and miniport makes it easier to add support for new video cards
to Windows NT.

Printer Drivers
In Windows NT, hardcopy devices are considered to be just another kind of
graphical output hardware. Unlike display devices, however, there can be more
than one printer on the system, and these printers may not all use the same kind
of physical connection. Some of them may even be located somewhere else on the
network. The Windows NT printing architecture (pictured in Figure 1.8) is an
attempt to deal with all this variety.

Printer drivers A printer driver is very much like a display driver in that
it runs in kernel mode and helps the GDI engine convert Win32 API graphics calls
into rendering commands. The difference is that a printer driver sends its output
to the spooler (described below) rather than to a video device.
A printer driver is responsible for supporting a particular printer or family
of printers. The Windows NT DDK contains sample drivers for raster-based
printers, PostScript printers, and plotters. Most printers available today fall into one of
these categories. Unless your printer uses some completely alien technology, it's
unlikely that you'd need to write an entire driver from scratch.
For raster-based printers, most of the rendering operation is simply a matter
of converting a specific drawing command into the proper set of printer escape
codes. Because this is such a well-defined problem, you can use a
Microsoft-supplied framework called the Unidriver to do most of the work. In this case, you
only need to write the device-specific pieces of code in the form of a miniprint
driver. Adding support for printers based on a page description language like
PostScript is a more complicated task.

[Diagram labels: Spooler; Config DLL; Print Proc; Spool API; Lang Monitor;
Graphics API; System Service Interface; GDI Engine; Printer Driver;
Serial/Parallel/Network Device Driver.
Copyright © 1996 by Cydonix Corporation.]
Figure 1.8 Architecture of the Windows NT printing components

Configuration DLL To support a printer under Windows NT, it's not
enough to write a printer driver. You also have to supply a user-mode configura­
tion DLL. The job of this DLL is to display the property-sheet dialog box that
changes the printer's settings. Application programs use the configuration DLL to
set up the printing environment for specific documents. It also appears when you
select one of the icons in the Windows NT shell's Printers folder.
Spooler The spooler is the central component of Windows NT's printing
mechanism. It takes the output generated by a printer driver and either sends it to
the appropriate printer or stores it in a temporary file for later printing. The
spooler works either with local or networked printers.
The spooler is one of the integral subsystem processes that starts when the
operating system loads. Its architecture is very modular so that it can accommo­
date a wide variety of printing devices and environments. Printer vendors can
customize the spooler by supplying three different kinds of components: print
processors, language monitors, and port monitors.
Print processor DLL A print processor is a DLL that reads the spooled data
produced by a specific printer driver and converts it into actual output. At its upper
edge, the print processor DLL exposes a standard set of functions to the spooler. It
generates output using the services provided by a language or port monitor.


The standard printer drivers can spool their output as text, as raw data
(already rendered by the GDI engine), or as a series of enhanced metafile (EMF)
commands to be rendered by the spooler.3 Microsoft supplies a print processor
that can interpret any of these three data formats. If you write a printer driver that
uses a proprietary format for spooled data, you'll also have to write a print pro­
cessor for it.

Language monitor DLL In workgroup situations, it's very common for
several users to be sharing a single printer or print server. Consequently, it's
important to keep their jobs clearly separated and to be able to determine the sta­
tus of a particular job at any point in time. It also may be necessary to set up a dif­
ferent printing environment for each job being output.
To meet these kinds of needs, many vendors offer smart, bidirectional print­
ers that accept commands and report status over the same connection on which
they receive output data. Normally, these command and status messages are in
some kind of control language defined by the printer's manufacturer. For
example, Hewlett Packard LaserJet printers use something called the Printer Job
Language (PJL).
A language monitor is a DLL that allows the spooler to communicate with a
bidirectional printer in a standardized way. It exposes a well-defined set of func­
tions that the spooler can call to control and monitor a job on one of these printers.
The language monitor then converts these requests into the proper stream of job­
language commands and uses the port monitor (described below) to send them to
the printer.
Windows NT comes with a language monitor for the Hewlett Packard PJL
language. If your printer uses some home-brew set of commands, you'll need to
write a language monitor for it.
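To give a feel for the translation step a language monitor performs, the following portable C sketch turns two abstract requests ("start job" and "end job") into PJL-style command strings. The function names are hypothetical, and this is not the interface the spooler actually calls. The command text follows the commonly documented PJL form (an escape sequence followed by `@PJL` commands), but treat the exact syntax as an assumption and check the printer vendor's PJL reference before relying on it.

```c
#include <stddef.h>
#include <stdio.h>

/* Escape sequence that introduces PJL commands (assumed form). */
#define PJL_UEL "\x1B%-12345X"

/* Write a "start of job" command sequence into out.
   Returns the number of characters written, or -1 if out is too small. */
int pjl_job_start(char *out, size_t size, const char *job_name)
{
    int n = snprintf(out, size, "%s@PJL JOB NAME = \"%s\"\r\n",
                     PJL_UEL, job_name);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}

/* Write an "end of job" command sequence into out.
   Returns the number of characters written, or -1 if out is too small. */
int pjl_job_end(char *out, size_t size, const char *job_name)
{
    int n = snprintf(out, size, "%s@PJL EOJ NAME = \"%s\"\r\n%s",
                     PJL_UEL, job_name, PJL_UEL);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

In the real architecture, strings like these would be handed to a port monitor for delivery rather than written into a caller's buffer. The buffer-based interface is just a convenient way to show the conversion in isolation.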
Port monitor DLL A port monitor is a DLL that manages a particular kind
of output channel on behalf of the spooler subsystem. The monitor exposes a stan­
dard set of functions which the spooler invokes in order to generate output. The
port monitor then converts these calls into the appropriate set of Win32 I/O
requests.
Allowing the spooler to work with an abstraction of the output device
makes it easier to add support for a variety of printer connections. Microsoft sup­
plies the following port monitors with Windows NT:

• The local port monitor that communicates with the parallel and serial
  ports as well as printing data to a file.
• The LPR monitor that manages LPD printers and print-servers using a
  TCP/IP network connection.
• Port monitors from Hewlett Packard, Apple, and Digital Equipment
  Corporation that control network-based printers and print-servers from
  these vendors.

3 The use of EMF data for printing allows the program generating the output to finish its print
request more quickly since the rendering operation takes place later in the context of the spooler
process. Raw data slows the application because it's rendered before being sent to the spooler.

Normally, you won't need to write a port monitor unless you've developed
some new and strange way to link a printer to a computer. For example, an out­
put device connected to a SCSI controller would need a new port monitor.
Multimedia Drivers

Multimedia is going to change our lives one day - if only someone can fig­
ure out how. For those who'd like to try, Windows NT supports a wide range of
multimedia devices, including:
• Waveform audio hardware that samples and reconstructs analog audio
  signals
• MIDI ports that connect to external musical devices like keyboards,
  synthesizers, and drum machines
• Onboard MIDI synthesizers that are part of the computer itself
• Video capture devices that digitize either single frame or continuous
  video signals
• Related devices like CD players, video-disk players, and joysticks

Most application programs don't interact with multimedia hardware by
calling such functions as CreateFile or DeviceIoControl. Instead they use some of
the special-purpose multimedia functions provided by Win32. This indirect
approach reduces their dependency on hardware from a specific vendor. Figure
1.9 shows the components involved in multimedia operations.
WINMM To meet the requirements of different kinds of software, Win32
actually contains two separate multimedia APIs. The media control interface
(MCI) functions provide high-level access to a wide variety of multimedia
devices while hiding many of the details from the programmer. MCI is the inter­
face used by most applications. For software needing more direct hardware con­
trol, Win32 also provides a group of low-level audio functions. Programs such
as MIDI sequencers or waveform editors are more likely to use this low-level
interface.
Support for both sets of multimedia functions comes from the WINMM sys­
tem component. WINMM is a user-mode DLL that acts as a translation layer
between the application and the vendor-supplied drivers that actually control the
multimedia hardware. To do its job, WINMM relies on three kinds of drivers.
MCI drivers
An MCI driver is just a user-mode DLL that WINMM loads at runtime to
process MCI commands for a specific device. In response to calls from a
multimedia application, WINMM sends various messages to the proper MCI
driver. Depending on the device, the MCI driver then uses either the
low-level audio interface (described below) or Win32 I/O functions to
control the hardware.

Figure 1.9  Multimedia driver architecture
(Diagram not reproduced. Components shown: Application, WINMM DLL, MCI
Driver, Low-Level Audio Driver, System Service Interface, I/O Manager,
Multimedia Device Driver. Copyright © 1996 by Cydonix Corporation.)

Low-level audio drivers
When an application calls a low-level audio function, WINMM loads a
vendor-supplied user-mode DLL (the low-level audio driver) and sends it
various messages. The low-level audio driver then uses Win32 I/O functions
to communicate with the audio hardware. This is very similar to the
operation of the MCI drivers described previously.
Kernel-mode device drivers
Management of the multimedia hardware itself comes from a kernel-mode
device driver. This includes data transfer operations, handling interrupts,
processing errors, and so on.

Drivers for Legacy 16-bit Applications

When Microsoft first introduced Windows NT, a vast amount of software
already existed for MS-DOS and 16-bit Windows. Any new operating system
hoping to be a commercial success would have to be able to run the majority
of this code without modification. At the same time, it would be necessary
to protect system integrity by denying these 16-bit programs the kind of
unlimited hardware access they enjoyed under MS-DOS and Windows. As you saw
earlier in this chapter, Microsoft's solution was to run 16-bit code in the
context of one or more virtual DOS machine (VDM) processes.

Figure 1.10  Relationship of VDDs and kernel-mode drivers
(Diagram not reproduced. Components shown: a VDM containing the MS-DOS App,
16-bit MS-DOS Emulation, Instruction Emulation, 32-bit MS-DOS Emulation,
and Virtual Device Drivers; Win32 API calls; I/O; System Service Interface;
I/O Manager; Device Driver. Copyright © 1996 by Cydonix Corporation.)

To meet the challenge of allowing VDMs to perform I/O without giving
them direct access to any hardware, Windows NT uses a piece of software
called a virtual DOS driver (VDD). Figure 1.10 shows the relationship of
such a VDD to the other parts of the operating system.
The VDD essentially acts as a translation layer between a 16-bit application
and some custom piece of hardware. Whenever the application tries to touch
the hardware directly, the VDD intercepts the request and turns it into a
series of Win32 calls. These Win32 calls are then processed by a standard
Windows NT kernel-mode driver.
A VDD can intercept a 16-bit program's attempts to access I/O ports and
specific ranges of memory. It also has the ability to perform DMA transfers on
behalf of the application, read and set the contents of CPU registers, and simulate
the arrival of interrupts. All this makes it possible to fool the 16-bit application
into thinking it's still running under MS-DOS or Windows.
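
To make the interception idea more concrete, here is a minimal, portable C
sketch of the bookkeeping a VDD-like layer might do. It is a simulation, not
the actual VDD programming interface: the PortHook structure, the handler
types, the 0x300-0x307 port range, and the decision to return 0xFF for
unclaimed ports are all assumptions made for illustration. In a real VDD, the
handlers would issue Win32 I/O calls that end up in a kernel-mode driver.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical handler types: called when the 16-bit program does IN or OUT. */
typedef uint8_t (*PortReadHandler)(uint16_t port);
typedef void (*PortWriteHandler)(uint16_t port, uint8_t value);

/* One trapped range of I/O ports and the routines that emulate it. */
typedef struct {
    uint16_t first;             /* first port in the range (inclusive) */
    uint16_t last;              /* last port in the range (inclusive)  */
    PortReadHandler on_read;
    PortWriteHandler on_write;
} PortHook;

/* Stand-in for the real device; a real VDD would call Win32 I/O here. */
uint8_t fake_device_latch;

uint8_t widget_read(uint16_t port)
{
    (void)port;
    return fake_device_latch;
}

void widget_write(uint16_t port, uint8_t value)
{
    (void)port;
    fake_device_latch = value;
}

/* Table of trapped ranges (the range itself is an arbitrary example). */
const PortHook hooks[] = {
    { 0x300, 0x307, widget_read, widget_write },
};

/* Find the hook that claims a port, or NULL if nobody does. */
const PortHook *find_hook(uint16_t port)
{
    for (size_t i = 0; i < sizeof hooks / sizeof hooks[0]; i++) {
        if (port >= hooks[i].first && port <= hooks[i].last)
            return &hooks[i];
    }
    return NULL;
}

/* Emulate an IN instruction; unclaimed ports read as 0xFF (assumption). */
uint8_t emulate_in(uint16_t port)
{
    const PortHook *hook = find_hook(port);
    return hook ? hook->on_read(port) : 0xFF;
}

/* Emulate an OUT instruction; writes to unclaimed ports are dropped. */
void emulate_out(uint16_t port, uint8_t value)
{
    const PortHook *hook = find_hook(port);
    if (hook)
        hook->on_write(port, value);
}
```

The 16-bit program never learns that its IN and OUT instructions were routed
through a lookup table instead of reaching the bus.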
The advantage of this approach is that the original 16-bit executable doesn't
need to be modified to run under Windows NT. The disadvantage is that the extra
layer of software can add significant amounts of processing overhead. Since you
have to write a kernel-mode driver to support the underlying hardware, the real
solution is to port the application to the Win32 environment.
One other point to make here: This technique supports the execution of
MS-DOS programs that touch hardware directly. It also supports 16-bit DLLs
that play with hardware (a common form of driver in the 16-bit Windows
environment). It does not allow you to run Windows or Windows 95 VxDs under
Windows NT.

1.4 SUMMARY
As you can see, Windows NT's rich architecture and multiple API environments
add a certain amount of complexity to I/O processing. In particular, Windows
NT uses a much broader definition of what constitutes a driver than many
other operating systems. If you're in the process of adding support for a
specific piece of hardware, you should have a good idea at this point of
just what kind of driver(s) you'll need to write.
In the next chapter we'll start our descent into kernel-mode driver
development by examining some of the hardware issues facing NT driver writers.

CHAPTER 2

The Hardware Environment

For some people (you know who you are), hot solder is the only true
programming language. If you're not in that category, this chapter will give
you a gentle introduction to those aspects of hardware that have an impact
on writing drivers. You'll also find here a quick tour of the major bus
architectures supported by Windows NT, and a few words to the wise about
dealing with hardware in general.

2.1 HARDWARE BASICS
There are a number of things you need to know about a peripheral device
before you can design a driver for it. At the very least, the following items are
important:
• How to use the device's control and status registers

• What causes the device to generate an interrupt

• How the device transfers data

• Whether the device uses any dedicated memory

• Whether the device can be autoconfigured

The following subsections discuss each of these topics in a general way.

Device Registers

Drivers communicate with a peripheral by reading and writing various bits
in a group of registers associated with the device. Each of these device registers
will generally perform one of the following functions:
• Command - Setting and clearing bits in command registers causes the
  device to start an operation or change its behavior in some way.

• Status - The bits in a status register contain information about the
  current state of the device.

• Data buffer - Output devices accept data to be transmitted when it's
  written to their output buffer registers. Data coming from an input
  device will appear in the device's input buffer register.

Simple devices (like the parallel port interface in Table 2.1) have only a few
registers, while complex hardware (like a graphics adapter or a network card)
has a large set of registers. In the absence of any industry standard, the
engineer designing the interface card is the one who decides how these
registers are going to be used. So, if you expect to write a device driver,
you'll need detailed information about all of the device's control and data
registers.
Table 2.1  These registers control a parallel port interface

Parallel port registers
Offset  Register     Access  Description
------  -----------  ------  -------------------------------------------
0       Data         R/W     Data byte transferred through parallel port
1       Status       R/O     Current parallel port status
          Bits 0-1             Reserved
          Bit 2                0 - interrupt has been requested by port
          Bit 3                0 - an error has occurred
          Bit 4                1 - printer is selected
          Bit 5                1 - printer is out of paper
          Bit 6                0 - acknowledge
          Bit 7                0 - printer is busy
2       Control      R/W     Commands sent to parallel port
          Bit 0                1 - strobe data to/from parallel port
          Bit 1                1 - automatic line feed
          Bit 2                0 - initialize printer
          Bit 3                1 - select printer
          Bit 4                1 - enable interrupts
          Bits 5-7             Reserved
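
As a small illustration of what "reading bits in a status register" means in
practice, the following sketch decodes the status byte laid out in Table 2.1.
The macro names, the PpStatus structure, and the pp_decode_status function
are made up for this example; only the bit positions and polarities come from
the table. Notice that several of the bits are active-low, so a zero means
the condition is present.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit masks for the status register at offset 1 (names are illustrative). */
#define PP_STATUS_NOT_IRQ    0x04u  /* bit 2: 0 = interrupt requested   */
#define PP_STATUS_NOT_ERROR  0x08u  /* bit 3: 0 = an error has occurred */
#define PP_STATUS_SELECTED   0x10u  /* bit 4: 1 = printer is selected   */
#define PP_STATUS_PAPER_OUT  0x20u  /* bit 5: 1 = printer out of paper  */
#define PP_STATUS_NOT_ACK    0x40u  /* bit 6: 0 = acknowledge           */
#define PP_STATUS_NOT_BUSY   0x80u  /* bit 7: 0 = printer is busy       */

/* Decoded view of the status byte, with every field "true = condition". */
typedef struct {
    bool interrupt_requested;
    bool error;
    bool selected;
    bool paper_out;
    bool acknowledge;
    bool busy;
} PpStatus;

/* Turn a raw status byte into named conditions, inverting active-low bits. */
PpStatus pp_decode_status(uint8_t raw)
{
    PpStatus s;
    s.interrupt_requested = (raw & PP_STATUS_NOT_IRQ) == 0;
    s.error               = (raw & PP_STATUS_NOT_ERROR) == 0;
    s.selected            = (raw & PP_STATUS_SELECTED) != 0;
    s.paper_out           = (raw & PP_STATUS_PAPER_OUT) != 0;
    s.acknowledge         = (raw & PP_STATUS_NOT_ACK) == 0;
    s.busy                = (raw & PP_STATUS_NOT_BUSY) == 0;
    return s;
}
```

Hiding the polarity quirks behind one function like this keeps the rest of a
driver from having to remember which bits are inverted.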

Accessing Device Registers

Once you know what a set of device registers does, you still need two
additional pieces of information before you can work with the device:

• The address of the device's first register

• The address space where these registers live

Since a given device's registers usually occupy consecutive locations, the
address of the first register will get you to all the others. Unfortunately, finding
the register base address is a rather involved process that will have to wait for
Chapter 7.
That still doesn't answer the question of where these registers live. As you
can see from Figure 2.1, device registers can occupy either of two different
address spaces. The following subsections describe each of them.
I/O space registers
Some CPU architectures map device registers into a set of addresses known as
I/O space. These I/O space addresses (often referred to as ports) are not
part of the memory space seen by the CPU, and they can only be accessed with
special machine instructions. For example, the 80x86 architecture has a
64-kilobyte I/O space, and IN and OUT instructions for reading and writing
I/O ports.

One extra twist: To promote platform independence, an NT driver shouldn't
actually use hardware instructions to touch I/O ports. Instead, it ought to
use the HAL functions listed in Table 2.2.

Figure 2.1  Memory-mapped device registers and I/O space ports
(Diagram not reproduced. Labels shown: CPU, Device, Register, LOAD/STORE,
IN/OUT. Copyright © 1994 by Cydonix Corporation.)

Table 2.2  Use these HAL functions to access ports in I/O space

HAL I/O space functions
Function                Description
----------------------  --------------------------------------------------
READ_PORT_XXX           Read a single value from an I/O port
WRITE_PORT_XXX          Write a single value to an I/O port
READ_PORT_BUFFER_XXX    Read an array of values from consecutive I/O ports
WRITE_PORT_BUFFER_XXX   Write an array of values to consecutive I/O ports

Substitute one of the following for XXX: UCHAR, USHORT, or ULONG.

Memory-mapped registers
CPU architectures without a separate I/O space generally map device
registers into some range of physical memory addresses. Access to these
memory-mapped device registers is accomplished with the same load and store
instructions used for normal memory operations (for example, MOV on the
80x86 platform).
Even on CPUs with a separate I/O space, some peripherals memory-map
their control registers anyway. This improves the performance of high-speed
devices with large register sets, since I/O instructions are typically much slower
than memory-access instructions. For example, many SVGA video adapters for
80x86 machines can use memory addresses not only for their video buffers, but
for their control registers as well.
Once again, the HAL provides a set of support functions (listed in Table 2.3)
for accessing memory-mapped registers. Notice that these are not the same
functions you use on a CPU with a separate I/O space. So, if you plan to
support your driver on both kinds of architecture, you'll need to take this
difference into account. Chapter 5 presents some coding techniques that make
this easier to do.
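
One way to picture the problem is the sketch below, which hides the choice of
address space behind a pair of function pointers so that the device logic is
written only once. This is not necessarily the technique Chapter 5 presents;
it is a portable simulation in which sim_io_space, DeviceRegs, and all the
function names are hypothetical. In a real driver, the port routines would
call READ_PORT_UCHAR and WRITE_PORT_UCHAR, and the memory routines would call
READ_REGISTER_UCHAR and WRITE_REGISTER_UCHAR.

```c
#include <stdint.h>

/* Simulated 64-kilobyte I/O space, standing in for real ports. */
uint8_t sim_io_space[0x10000];

/* I/O space back end (a real driver would use the READ_PORT/WRITE_PORT HAL calls). */
uint8_t port_read8(uintptr_t base, uint32_t offset)
{
    return sim_io_space[(base + offset) & 0xFFFF];
}

void port_write8(uintptr_t base, uint32_t offset, uint8_t value)
{
    sim_io_space[(base + offset) & 0xFFFF] = value;
}

/* Memory-mapped back end (a real driver would use the READ_REGISTER/WRITE_REGISTER calls). */
uint8_t mem_read8(uintptr_t base, uint32_t offset)
{
    return *(volatile uint8_t *)(base + offset);
}

void mem_write8(uintptr_t base, uint32_t offset, uint8_t value)
{
    *(volatile uint8_t *)(base + offset) = value;
}

/* Per-device description: where the registers are and how to reach them. */
typedef struct {
    uintptr_t base;
    uint8_t (*read8)(uintptr_t base, uint32_t offset);
    void (*write8)(uintptr_t base, uint32_t offset, uint8_t value);
} DeviceRegs;

/* Device logic written once: set some bits in a register. */
void device_set_bits(const DeviceRegs *dev, uint32_t offset, uint8_t bits)
{
    uint8_t old = dev->read8(dev->base, offset);
    dev->write8(dev->base, offset, (uint8_t)(old | bits));
}
```

The cost of this design is an indirect call on every register access, which
may or may not matter for a given device.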
Table 2.3  Use these HAL functions to access memory-mapped device registers

HAL memory-mapped register functions
Function                    Description
--------------------------  ----------------------------------------------
READ_REGISTER_XXX           Read a single value from an I/O register
WRITE_REGISTER_XXX          Write a single value to an I/O register
READ_REGISTER_BUFFER_XXX    Read an array of values from consecutive I/O
                            registers
WRITE_REGISTER_BUFFER_XXX   Write an array of values to consecutive I/O
                            registers

Substitute one of the following for XXX: UCHAR, USHORT, or ULONG.

Device Interrupts

Most reasonable pieces of hardware generate an interrupt request when
they need some kind of attention from the CPU. This request takes the form
of an electrical signal on the interrupt lines in the bus. A device might
yank on its interrupt line for any number of reasons, including:
• The device has completed a previously requested input or output
  operation and is now idle.

• A buffer or FIFO associated with the device is almost full (for input
  operations) or almost empty (for output operations). The device uses an
  interrupt to notify the driver that it must process the buffer if it
  wants the I/O to continue without a pause.

• The device encountered some kind of error during an I/O operation.

Some legacy devices don't use interrupts at all. Drivers for this kind of
hardware usually have to poll their devices until some kind of interesting
event occurs. Under single-tasking operating systems like MS-DOS, this
behavior wasn't a problem, but in an environment like Windows NT, it would
seriously degrade system performance. Chapters 10 and 14 will present some
techniques you can use with non-interrupting hardware.
The various bus architectures supported by Windows NT take slightly
different approaches to interrupts. Nonetheless, they all share several
common features, which are described below.
Interrupt priorities
When several devices are connected to the same bus, the CPU needs some way
to rank the importance of their interrupt requests. This allows devices that
need immediate servicing to access the CPU ahead of devices that can afford
to wait. Although the exact mechanism depends on the bus, this ranking
generally works by assigning a priority value to each of the interrupt
request lines.
When the CPU accepts an interrupt request, it blocks out any further
interrupts at or below the same priority and transfers control to an
interrupt service routine. Until the interrupt service routine handles and
dismisses the interrupt, only requests of a higher priority can take control
of the CPU. Lower-priority requests remain pending until the more important
activity is finished.
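
The blocking rule just described can be modeled in a few lines of C. The
sketch below is a toy simulation, not a description of any real interrupt
controller: the Cpu structure, the function names, and the convention that
larger numbers mean higher priority (1 through 16) are all assumptions made
for the example.

```c
#include <stdbool.h>

#define NUM_LEVELS 16   /* assumed number of priority levels (1..16) */

/* Toy model of a CPU's interrupt state. */
typedef struct {
    int stack[NUM_LEVELS + 1];      /* priorities of nested handlers in progress */
    int depth;                      /* how many handlers are currently nested    */
    bool pending[NUM_LEVELS + 1];   /* requests waiting, indexed by priority     */
} Cpu;

/* Priority currently being serviced; 0 means no handler is running. */
int current_level(const Cpu *cpu)
{
    return cpu->depth ? cpu->stack[cpu->depth - 1] : 0;
}

/* A device raises a request. Returns true if the CPU accepts it at once,
 * false if it is blocked (or out of range) and, if valid, left pending. */
bool raise_request(Cpu *cpu, int priority)
{
    if (priority < 1 || priority > NUM_LEVELS)
        return false;
    if (priority > current_level(cpu)) {
        cpu->stack[cpu->depth++] = priority;    /* preempt the current work */
        return true;
    }
    cpu->pending[priority] = true;              /* at or below: must wait   */
    return false;
}

/* The running handler dismisses its interrupt. Returns the priority of the
 * pending request that is accepted next, or 0 if none can run yet. */
int dismiss(Cpu *cpu)
{
    if (cpu->depth == 0)
        return 0;
    cpu->depth--;
    for (int p = NUM_LEVELS; p > current_level(cpu); p--) {
        if (cpu->pending[p]) {
            cpu->pending[p] = false;
            cpu->stack[cpu->depth++] = p;
            return p;
        }
    }
    return 0;
}
```

In this model a request at priority 3 that arrives while a priority-5 handler
is running simply waits; it gets the CPU only after everything above it has
been dismissed.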
Interrupt vectors
An interrupt vector is a unique, bus-relative number which allows the CPU to
identify the source of an interrupt and call the appropriate service
routine. The interrupt controller usually passes this vector to the CPU when
it accepts an interrupt request. The CPU then uses the vector as an index
into a table containing the addresses of interrupt service routines.
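
In C terms, the table lookup amounts to indexing an array of function
pointers. The following sketch shows the idea; the table size of 256, the
vector numbers used in the usage note, and all of the function names are
assumptions for illustration, and the two service routines just count calls.

```c
#include <stddef.h>

#define NUM_VECTORS 256     /* assumed table size */

typedef void (*Isr)(void);

/* The table of service-routine addresses, indexed by vector number. */
Isr vector_table[NUM_VECTORS];

/* Two stand-in service routines that merely count their invocations. */
int timer_ticks;
int keys_seen;

void timer_isr(void)    { timer_ticks++; }
void keyboard_isr(void) { keys_seen++; }

/* Install a service routine for a vector; out-of-range vectors are ignored. */
void connect_isr(unsigned vector, Isr isr)
{
    if (vector < NUM_VECTORS)
        vector_table[vector] = isr;
}

/* What the CPU does with a vector: look up the routine and call it.
 * Returns 1 if a routine was called, 0 if the vector is unclaimed. */
int dispatch_interrupt(unsigned vector)
{
    if (vector >= NUM_VECTORS || vector_table[vector] == NULL)
        return 0;
    vector_table[vector]();
    return 1;
}
```

For example, after connect_isr(0x20, timer_isr), every call to
dispatch_interrupt(0x20) runs timer_isr once.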
Signaling mechanisms
Hardware designers have developed two basic strategies that devices can use
when they want to generate an interrupt. The older mechanism defines an
interrupt request as a transition from zero to one on the interrupt signal
line. These are called edge-triggered (or latched) interrupts because they
depend only on the leading edge of the pulse.

Unfortunately, this scheme has two problems. First, it's very sensitive to
electrical noise - a random spike can easily be mistaken for an interrupt
request. Second, if an interrupt arrives while another one is being serviced
at the same priority, the second interrupt will be ignored. This limits
sharing to situations where simultaneous interrupts will never occur on the
same line.

These limitations led to the development of another signaling mechanism
called a level-sensitive (or level-triggered) interrupt. This approach
requires the device to send a continuous signal down the wire until the
interrupt service routine explicitly dismisses the interrupt. In addition to
greater noise immunity, this scheme makes it possible for multiple devices
to share the same interrupt request line.
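
The sharing problem is easier to see in a toy model. The sketch below
simulates two devices wired to one request line and compares what an
edge-triggered and a level-sensitive controller would notice. It is a
deliberately simplified illustration: the SharedLine structure and function
names are invented, and real edge-triggered devices usually pulse the line
rather than hold it.

```c
#include <stdbool.h>

/* Two devices sharing one interrupt request line. */
typedef struct {
    bool asserting[2];  /* is each device currently driving the line?   */
    bool prev_line;     /* line state at the controller's last sample   */
} SharedLine;

/* The line is high if either device is asserting it. */
bool line_level(const SharedLine *line)
{
    return line->asserting[0] || line->asserting[1];
}

/* Edge-triggered controller: sees a request only on a 0 -> 1 transition. */
bool edge_sample(SharedLine *line)
{
    bool now = line_level(line);
    bool request = now && !line->prev_line;
    line->prev_line = now;
    return request;
}

/* Level-sensitive controller: sees a request for as long as the line is high. */
bool level_sample(const SharedLine *line)
{
    return line_level(line);
}
```

If device 0 raises the line and device 1 joins in before device 0 has been
serviced, edge_sample reports only the first request, because the line never
makes a second zero-to-one transition. With level_sample the request stays
visible until both devices have been serviced and released the line.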
Processor affinity
To improve overall performance, multiprocessor platforms often contain
special interrupt-routing hardware. The purpose of this hardware is to
distribute interrupt requests from a given device to one or more specific
CPUs. If a particular CPU can service interrupts from a device, those
interrupts are said to have affinity for that CPU.

Data Transfer Mechanisms

Hardware designers have three basic options when it comes to moving data
between a peripheral and memory.
• Programmed I/O

• Direct memory access

• Shared buffers

The transfer mechanism used by a given device usually depends on the
device's speed, the amount of data it needs to transfer, and any applicable
industry standards. In some cases, a complex piece of hardware may actually
use more than one of these techniques.
The following subsections explain the differences between programmed I/O
and direct memory access (illustrated in Figure 2.2). Shared memory buffers are
covered later in the discussion of device-specific memory.
Programmed I/O (PIO)
PIO devices need the help of the CPU to perform data transfers. Their
drivers are responsible for sending or receiving each byte of data, keeping
track of the buffer in memory, and maintaining a running count of the number
of bytes transferred.

PIO devices typically generate an interrupt after each byte or word of data
is transferred. Some PIO devices have an internal buffer or a hardware FIFO
that helps to reduce the interrupt count. Even so, lengthy transfers need a
lot of attention from the CPU and produce a flood of interrupts. This can
lead to very poor system performance.

Figure 2.2  Paths followed by data in DMA and programmed I/O transfers
(Diagram not reproduced. Labels shown: DMA Controller, Count Register,
Address Register, Device, Data Register. Copyright © 1994 by Cydonix
Corporation.)

This style of I/O is best suited to slower devices that don't move large
amounts of data in a single operation. Parallel ports, pointing devices, and the
keyboard are all examples of PIO hardware. Chapter 9 will explain how to work
with PIO devices.
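
To show how much bookkeeping falls on the driver, here is a small simulation
of a PIO output transfer in which the device interrupts after every byte.
Nothing here is real driver code: sim_data_register stands in for the
device's data register, pio_run plays the part of the hardware, and all the
names are hypothetical. Chapter 9 covers how an actual driver does this.

```c
#include <stddef.h>
#include <stdint.h>

/* State the driver must keep for one transfer. */
typedef struct {
    const uint8_t *next;    /* next byte in the caller's buffer */
    size_t remaining;       /* bytes still to send              */
    size_t transferred;     /* running count of bytes sent      */
} PioTransfer;

uint8_t sim_data_register;      /* simulated device data register     */
unsigned sim_interrupt_count;   /* how many interrupts the device made */

/* Send one byte. Called to start the transfer and again from each
 * "ready for more" interrupt. Returns 1 if a byte was sent, 0 if done. */
int pio_send_next(PioTransfer *t)
{
    if (t->remaining == 0)
        return 0;
    sim_data_register = *t->next++;
    t->remaining--;
    t->transferred++;
    return 1;
}

/* Simulate the whole transfer: the device interrupts after each byte and
 * the driver responds by sending the next one. Returns the byte count. */
size_t pio_run(PioTransfer *t)
{
    if (!pio_send_next(t))          /* driver primes the device */
        return 0;
    for (;;) {
        sim_interrupt_count++;      /* device: "byte consumed"  */
        if (!pio_send_next(t))      /* driver: send next, if any */
            break;
    }
    return t->transferred;
}
```

In this model a five-byte transfer costs five interrupts, which is exactly
the overhead that internal buffers and FIFOs are meant to reduce.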

Direct memory access (DMA)
DMA devices take advantage of special hardware called a DMA controller
(DMAC). A DMAC is actually a very simple auxiliary processor with just
enough intelligence to transfer a specified number of bytes between a
peripheral device and memory.

At the beginning of an I/O operation, the driver loads a transfer count and
a memory address into the DMAC and then starts the device. All by itself,
the DMAC moves data to or from successive memory locations, and when the
transfer is complete, it generates an interrupt request. During the actual
operation, the driver is suspended and the CPU can work on other tasks.
High-speed devices that perform large transfers generally use DMA because
it significantly reduces driver overhead and system interrupt activity. Disks,
sound samplers, and network cards are examples of DMA devices.
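
For comparison with programmed I/O, the following sketch models the two
registers a driver programs and the work the DMAC then does without help.
It is a simplified simulation under stated assumptions: the Dmac structure,
the 64-byte sim_memory array, and the function names are invented, and real
controllers have more registers and modes than this.

```c
#include <stdint.h>

/* Toy DMA controller: just the two registers the text describes. */
typedef struct {
    uint32_t address;       /* address register: next memory location */
    uint32_t count;         /* count register: bytes still to move    */
    int interrupt_pending;  /* set when the transfer is complete      */
} Dmac;

uint8_t sim_memory[64];     /* simulated system memory */

/* Driver side: load the address and count, then let the hardware run. */
void driver_start_read(Dmac *dmac, uint32_t address, uint32_t count)
{
    dmac->address = address;
    dmac->count = count;
    dmac->interrupt_pending = 0;
}

/* Hardware side: one transfer cycle of a device-to-memory operation.
 * The DMAC stores the byte, advances, and interrupts on the last one. */
void dmac_cycle(Dmac *dmac, uint8_t byte_from_device)
{
    if (dmac->count == 0 || dmac->address >= sizeof sim_memory)
        return;
    sim_memory[dmac->address++] = byte_from_device;
    if (--dmac->count == 0)
        dmac->interrupt_pending = 1;
}
```

However long the transfer is, the driver sees a single interrupt at the end
rather than one per byte.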
Direct Memory Access (DMA) Mechanisms
Chapter 12 will have a lot more to say about the mechanics of working with
this kind of hardware. There are a number of twists and turns that aren't relevant
here. At this point, it's only necessary to draw a distinction between two general
kinds of DMA.

System DMA
Some devices are connected to the shared DMACs on the motherboard. These
controllers each have a fixed number of data-transfer paths (called
channels) that can all work simultaneously. More than one device can be
attached to the same channel, but only one device at a time can transfer
data over the channel. This is known as system DMA or slave DMA. By sharing
hardware, slave DMA devices have a simpler architecture and lower chip
count. On the downside, they may have to wait for a DMA channel to become
available before they can start an operation. The floppy controller on most
PCs is a slave DMA device.
Bus master DMA
Other devices (called bus masters) have their own DMAC hardware built into
the peripheral card itself. This guarantees that high-speed devices won't
have to wait for a system DMA channel to become free. The AHA-1742 SCSI
controller from Adaptec is one example of a bus mastering device.

Device-Dedicated Memory
Some devices insist on having a private range of addresses in physical
memory. There are several reasons why a peripheral card might need dedicated
address space:

• Its control registers might be memory-mapped.

• It might have an internal ROM containing start-up code and data. For the
  CPU to execute this code, it has to appear somewhere in memory address
  space.

• It might use a block of memory as a temporary buffer for data that's being
  sent or received. High-speed devices like video capture boards and
  Ethernet adapters often use this technique.

Peripheral cards generally take one of two approaches to dedicated memory.
Some insist on using a specific range of physical addresses. For example, VGA
cards expect a 128-kilobyte block of addresses beginning at 0xA0000 to belong
to them.

Alternatively, the card might have an address register that holds the base
physical address of its dedicated memory. During initialization, the driver
for the card will load this register with a pointer to some block of
available memory. Figure 2.3 illustrates each of these two possible designs.
Regardless of which approach a card takes, it's important to remember that
the card will be working with physical addresses. Since the only addresses
available to a device driver are virtual addresses, drivers have to map any
device memory somewhere into system virtual space before they can access it.
Chapter 7 explains how all this works.

Figure 2.3  How drivers access device memory
(Diagram not reproduced. Labels shown: Contiguous Buffer, in each of the two
designs. Copyright © 1994 by Cydonix Corporation.)

Requirements for Autoconfiguration
Ever since the first add-on card hit the market, PC users have been
struggling with ports, IRQs, and DMA channel assignments. In the beginning,
things weren't too bad, and it usually didn't take too long to find an
appropriate combination of DIP-switch and jumper settings. However, as
people started attaching more and more optional equipment to their PCs,
getting everything to work became a real nightmare.
To get around these problems, some bus architectures support various levels
of automatic hardware recognition and configuration. The next section of this
chapter will describe specific autoconfiguration capabilities of the major buses.
Here, it's enough to introduce the kinds of features that make autoconfiguration
possible.

Device resource lists
At the very least, a device must identify itself and provide the system with
a description of the resources it needs. In the ideal case, this resource
list contains the following information:

• Manufacturer ID

• Device type ID

• I/O space requirements

• Interrupt requirements

• DMA channels

• Device memory requirements

No jumpers or switches
Self-identification isn't enough, however. For true autoconfiguration, a
device must be able to change its port, interrupt, and DMA channel
assignments dynamically under software control. This allows a driver or some
other part of the operating system to arbitrate resource conflicts among
competing devices.
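
As a rough illustration of what arbitration might look like, the sketch
below keeps a cut-down resource list for each device and hands out interrupt
lines from a pool. Everything about it is an assumption made for the example:
the DeviceResources layout, the idea of encoding acceptable IRQs as a bit
mask, and the simple first-fit policy. No particular bus specification is
being described, and a real arbiter might have to back out of an earlier
choice when a later device can't be satisfied.

```c
#include <stdint.h>

/* Cut-down resource list: identification plus interrupt requirements. */
typedef struct {
    uint16_t manufacturer_id;
    uint16_t device_type_id;
    uint16_t acceptable_irqs;   /* bit n set = device can use IRQ n (assumed encoding) */
} DeviceResources;

/* First-fit arbitration: give the device the lowest-numbered IRQ that it
 * accepts and that is still free. Marks the IRQ as in use and returns it,
 * or returns -1 if every acceptable IRQ is already taken. */
int assign_irq(const DeviceResources *dev, uint16_t *irqs_in_use)
{
    uint16_t candidates = (uint16_t)(dev->acceptable_irqs & ~*irqs_in_use);

    for (int irq = 0; irq < 16; irq++) {
        if (candidates & (1u << irq)) {
            *irqs_in_use = (uint16_t)(*irqs_in_use | (1u << irq));
            return irq;
        }
    }
    return -1;
}
```

The order of assignment matters with a policy this simple. A card that can
live only on IRQ 5 should be placed before a card that accepts either 5 or 7.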
Change notification
Finally, the highest level of support also requires the bus to generate a
notification signal whenever a card is plugged in or removed. Without this
kind of mechanism, it's not possible to implement any of the Plug and Play
hot-swapping features. Since the current release of Windows NT doesn't
support Plug and Play, this isn't an issue right now. But it will be in the
future.

2.2 BUSES AND WINDOWS NT
A bus is just a collection of data, address, and control lines that allows a
peripheral device to communicate with memory and the CPU. The specification
for a bus defines such things as the shape and size of physical connectors,
the functions performed by each of the lines in the bus, and the timing and
signaling protocols used by devices attached to the bus.

Over the last decade, hardware vendors have developed a wide variety of
bus architectures with differing electrical and logical characteristics. As
of version 4.0, Windows NT supports many of these buses. What follows are
brief descriptions of the buses you're most likely to encounter. For more
detailed information, see some of the books listed in the bibliography.

ISA - The Industry Standard Architecture
This is the old standby that made its first appearance on the IBM PC/AT. It
was derived from the original IBM PC bus by adding extra data and address lines
and increasing the number of IRQ levels and DMA channels. Both 16-bit ISA
cards and the older IBM PC 8-bit cards fit into ISA sockets. Figure 2.4 shows the
organization of an ISA-based machine.
The ISA bus isn't especially fast. To maintain backward compatibility with
the IBM PC, the ISA bus clock rate is limited to 8.33 MHz. In the best case, a 16-bit
transfer takes two clock cycles, so the maximum data rate is only about 8 MB/sec.
This limit applies regardless of the clock rate of the CPU itself. That's why the
CPU and memory communicate over a high-speed local bus (sometimes called
the X bus).
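
The arithmetic behind that figure is short enough to write out. Using only
the numbers quoted above (an 8.33-MHz clock, two clock cycles per transfer,
and two bytes per 16-bit transfer), the best-case rate works out as follows:

```latex
\[
\frac{8.33 \times 10^{6}\ \text{cycles/sec}}{2\ \text{cycles/transfer}}
\times 2\ \text{bytes/transfer}
\approx 8.33 \times 10^{6}\ \text{bytes/sec}
\approx 8\ \text{MB/sec}
\]
```

Any wait states a card inserts add cycles to the denominator, so real
transfers fall below this best case.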

Register access
There are very few rules when it comes to the layout of I/O space on ISA
systems. Beyond some industry conventions, there aren't any real standards
for the kinds of registers an ISA card should implement, nor what addresses
they should use. Generally, I/O addresses between 0x0000 and 0x00FF belong
only to devices on the system board, while the territory between 0x0100 and
0x03FF is available for add-on cards. The space used by expansion cards is
doled out in 32-byte chunks.

Figure 2.4  Layout of an ISA system
(Diagram not reproduced. Labels shown: Local Bus, ISA Card. Copyright © 1996
by Cydonix Corporation.)

Unfortunately, many ISA add-on cards don't pay attention to all 16 I/O
address bits. Instead, they look only at bits 5-9 to see if an I/O space
reference belongs to them. If it does, they decode bits 0-4 to determine the
exact register. Cards like this are a problem because they respond to
multiple addresses in the 64-kilobyte I/O space, which can lead to some
nasty behavior. The only way to prevent conflicts on a system with ISA
boards is to avoid these alias addresses altogether.[1]
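
The aliasing is easy to demonstrate with a few bit masks. In the sketch
below, the function and macro names are invented for illustration; the bit
ranges come straight from the paragraph above, and the last function encodes
the rule that addresses above 0x03FF stay safe only if bits 8 and 9 are
zero.

```c
#include <stdbool.h>
#include <stdint.h>

#define ISA_CARD_SELECT_MASK 0x03E0u   /* bits 5-9 select the card     */
#define ISA_REGISTER_MASK    0x001Fu   /* bits 0-4 select the register */

/* Would a card that decodes only bits 5-9, based at card_base, respond
 * to an access at this address? Bits 10-15 are simply ignored. */
bool isa_card_responds(uint16_t card_base, uint16_t accessed)
{
    return ((accessed ^ card_base) & ISA_CARD_SELECT_MASK) == 0;
}

/* Which of the card's 32 registers does the access land on? */
uint16_t isa_register_index(uint16_t accessed)
{
    return (uint16_t)(accessed & ISA_REGISTER_MASK);
}

/* Is an address above 0x03FF free of aliases with the add-on card
 * range? True only when bits 8 and 9 are both zero. */
bool isa_safe_high_address(uint16_t address)
{
    return address > 0x03FFu && (address & 0x0300u) == 0;
}
```

For example, a card based at 0x300 also answers at 0x700 and 0xF305, because
those addresses differ from its own only in the bits it never looks at.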

Interrupt mechanisms
Interrupts on an ISA bus are normally handled by a pair of Intel 8259A
programmable interrupt controller (PIC) chips, each of which provides eight
levels of interrupt priority. These two chips are tied together in a
master-slave configuration that leaves fifteen available priority levels.
Table 2.4 lists the ISA priority levels and describes how they are normally
used.

The 8259A chip can be programmed to respond to either edge-triggered or
level-sensitive interrupts. This choice must be made for the entire chip; it
can't be set on an IRQ-by-IRQ basis. The power-on self-test (POST) code in
the ISA BIOS programs both chips to use edge-triggered interrupts. This
means that multiple ISA cards cannot normally share the same IRQ levels.
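
The odd-looking priority sequence in Table 2.4 follows from the cascade
arrangement: the slave controller feeds the master's IRQ 2 input, so the
slave's eight lines (IRQ 8 through 15) slot in between IRQ 1 and IRQ 3. The
helper below captures that ordering as a rank from 0 (highest) to 15
(lowest). The function name is made up for this example.

```c
/* Rank an ISA IRQ line by priority: 0 is highest, 15 is lowest.
 * Returns -1 for a line number outside 0..15. Note that IRQ 2 has a
 * rank but is not available to devices; it is the cascade input. */
int isa_irq_rank(int irq)
{
    if (irq < 0 || irq > 15)
        return -1;
    if (irq <= 2)
        return irq;         /* IRQ 0, 1, 2 -> ranks 0, 1, 2      */
    if (irq >= 8)
        return irq - 5;     /* IRQ 8..15   -> ranks 3..10        */
    return irq + 8;         /* IRQ 3..7    -> ranks 11..15       */
}
```

So the hard disk's usual line, IRQ 14, outranks the first serial port on
IRQ 4 even though its number is larger.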

[1] In other words, the control registers of any cards using the range above
0x03FF have to use I/O space addresses with zeroes in bits 8 and 9.

Table 2.4  Interrupt priorities on ISA systems

ISA interrupt priority sequence
Priority  IRQ line  Controller  Used for...
--------  --------  ----------  ---------------------------------------
Highest   0         Master      System timer
          1         Master      Keyboard
          2         Master      (Unavailable - pass-through from slave)
          8         Slave       Real-time clock alarm
          9         Slave       (Available)
          10        Slave       (Available)
          11        Slave       (Available)
          12        Slave       (Available - usually the mouse)
          13        Slave       Error output of numeric coprocessor
          14        Slave       (Available - usually the hard disk)
          15        Slave       (Available)
          3         Master      2nd serial port
          4         Master      1st serial port
          5         Master      2nd parallel port
          6         Master      Floppy disk controller
Lowest    7         Master      1st parallel port

DMA capabilities
The standard implementation of ISA DMA uses a pair of Intel 8237 DMAC chips
(or their functional equivalent). Each of these chips provides four
independent DMA channels. When they're ganged together in a master-slave
configuration, the first slave channel (number 4) serves as a pass-through
and becomes unavailable. Table 2.5 describes the capabilities of these DMA
channels.
When several DMA channels request the bus simultaneously, the DMAC
chips use a software-selected arbitration scheme to resolve the conflict. The
ISA BIOS POST-code normally programs the DMACs for fixed-priority
arbitration. This means that channel 0 always gets first crack at the bus,
and channel 7 always goes last.

Also notice from Table 2.5 that the lower channels transfer individual bytes,
while the upper ones move data only in words. Since the DMAC uses a 16-bit
count register in both cases, the upper channels can transfer twice as much
data in a single operation.

Table 2.5  DMA architecture on the ISA bus

ISA DMA channels
Channel  Controller  Transfers...   Max transfer
-------  ----------  -------------  -------------
0-3      Master      Bytes only     64 kilobytes
4        Slave       (Unavailable)
5-7      Slave       Words only     128 kilobytes

One other significant item about DMA operations: The ISA bus has only 24
address lines. This means that DMACs can access only the first 16 megabytes
of system memory. Any DMA buffers outside this range are unavailable. In
Chapter 12 you'll see how NT deals with this complication.
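
The limits described above reduce to a little arithmetic: a 16-bit count
register allows 65,536 transfers, each of them a byte or a word, and 24
address lines reach 2^24 bytes (16 megabytes). The sketch below turns those
numbers into a pair of checks. The function names are hypothetical, and the
checks cover only the constraints mentioned in this section; Chapter 12 deals
with the rest.

```c
#include <stdbool.h>
#include <stdint.h>

#define ISA_DMA_ADDRESS_LIMIT (UINT64_C(1) << 24)   /* 24 address lines -> 16 MB */

/* Largest single transfer, in bytes, for an ISA system DMA channel.
 * Returns 0 for channel 4 (the pass-through) or an invalid channel. */
uint32_t isa_dma_max_bytes(int channel)
{
    if (channel >= 0 && channel <= 3)
        return 65536u;          /* 2^16 byte transfers            */
    if (channel >= 5 && channel <= 7)
        return 2u * 65536u;     /* 2^16 word transfers = 128 KB   */
    return 0;
}

/* Can this buffer be used for a single transfer on this channel? */
bool isa_dma_buffer_ok(int channel, uint32_t physical, uint32_t length)
{
    uint32_t max = isa_dma_max_bytes(channel);

    if (max == 0 || length == 0 || length > max)
        return false;
    if (channel >= 5 && (length & 1u))
        return false;           /* word channels move whole words only */
    return (uint64_t)physical + length <= ISA_DMA_ADDRESS_LIMIT;
}
```

A buffer that starts just below the 16-megabyte line fails the last test if
any part of it pokes above the line.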

Device memory
The 24 address lines on the ISA bus have an impact on device memory as well
as DMA buffers. Any device-dedicated memory must live in the first 16
megabytes of physical address space. This applies to any onboard ROM as
well.

Autoconfiguration
Unfortunately, the ISA specification says nothing about autoconfiguration.
ISA devices don't identify themselves (either by manufacturer or device
type), nor do they provide a resource list. Since ISA cards aren't required
to have any software configuration registers, users normally have to
configure the card with DIP switches and jumpers.
Sometimes it's possible to make educated guesses about the presence of a
particular device by tickling various addresses in I/O space and listening for an
appropriate giggle from a device. This is generally not a very reliable way to do
things. Even if you do manage to locate a piece of hardware using this technique,
you still don't know anything about its DMA or interrupt settings.
The proposed Plug and Play extensions to ISA are intended to correct such
problems. Until these extensions become available, you'll have to use some of the
cruder methods described in Chapter 7.
MCA - The Micro Channel Architecture
IBM developed the Micro Channel architecture as a replacement for the
aging ISA bus. In a bold move, they dumped ISA altogether and proposed a
vastly improved architecture. Progress isn't cheap, however, and the cost of
adopting this new design was that all legacy ISA or IBM PC adapter cards would
have to be trashed. Most people were unconvinced, and the MCA bus hasn't
achieved great popularity among hardware vendors.[2] Figure 2.5 shows the
organization of a typical MCA system.
Since they weren't constrained by the 8.33-MHz clock rate of the ISA bus,
IBM was able to design a pretty snappy architecture. Although the original MCA
implementation[3] only supported data transfer rates of 10 megabytes/sec, later
versions of the bus specification incorporated a streaming data protocol that
raised this number by a factor of 16. Table 2.6 summarizes the data rates available
from the MCA bus.

[2] Political problems also contributed to the failure of MCA. IBM patented
the architecture and tried to impose licensing conditions that many hardware
vendors found objectionable.

[3] This was the 16-bit version used for the original IBM PS/2.

Figure 2.5  Layout of a Micro Channel system
(Diagram not reproduced. Labels shown: Serial & Parallel, VGA, Keyboard,
MCA Slot 0 through MCA Slot 7. Copyright © 1996 by Cydonix Corporation.)

Register access
An MCA bus can have at most eight card sockets, referred to as slots.[4]
Each slot has an associated set of programmable option select (POS)
registers that are used to configure the card. These POS registers replace
the jumpers and DIP switches found on ISA devices. At the very least, an MCA
card must implement a POS register that identifies the card.
Other than the POS registers (which are always at a fixed location), I/O
space under MCA is just about as chaotic as it is on an ISA system. (The
problem with ISA alias addresses doesn't occur, however.) At the option of
the designer, MCA cards can have either fixed or programmable register
addresses in I/O space. The only requirement is that if more than one of the
same card will be plugged into an MCA bus, the card must have a 3-bit POS
field for setting the card's base register address.

Table 2.6  MCA buses support a wide range of transfer speeds

MCA data transfer speed
Protocol   Data width  Transfer rate
---------  ----------  -------------
Basic      16 bits     10 MB/sec
           32 bits     20 MB/sec
Streaming  16 bits     20 MB/sec
           32 bits     40 MB/sec
           64 bits     80 MB/sec
           64 bits     160 MB/sec

[4] Additional devices can live on the motherboard itself.

Interrupt mechanisms The Micro Channel architecture supports 15 interrupt
request levels. Their functions and relative priorities follow the same pattern
used by the ISA bus (refer back to Table 2.4). The only improvement is that MCA
cards use level-sensitive interrupt signals, thus allowing more than one device to
share a single IRQ line.

DMA capabilities The MCA bus was designed to be shared. The system
board can support up to eight system DMA channels, and there's room on the bus
for an additional seven bus masters. Six of the system DMA channels follow a
fixed priority arbitration scheme, while channels 0 and 4 have assignable priorities.
The seven bus masters also have assignable priorities, although they will
always defer to the system DMA hardware.
Older implementations of the system DMAC were limited to 16-bit transfers
(even though the bus itself has a 32-bit data path), and buffers had to fall in the
first 16 megabytes of physical memory. (Bus master cards didn't have this limitation.)
Proposed improvements to the MCA specification allowed for 32- and even
64-bit data transfers.[5] These changes also gave the system DMAC access to a full
4-gigabyte address range.
Device memory The MCA specification dictates that any device with
onboard ROM must use 4 bits in one of its POS registers to select a starting
address for the ROM. This gives card designers the option of mapping the ROM
into any of 16 separate locations in physical memory.
Since the MCA bus has 32 address lines, device memory can exist anywhere
in a 4-gigabyte address space.
Autoconfiguration MCA autoconfiguration involves the POS registers
and a card-specific script called an adapter description file (ADF). Whenever an
MCA system bootstraps, it checks each slot to see what's there. If it finds a previously
configured card, it downloads configuration data from nonvolatile RAM
(NVRAM) into the card's POS registers.
If something appears in a slot that had previously been empty, the bootstrap
configuration program uses the card's POS ID register to generate the name of the
device's ADF file. After prompting the user for the floppy containing the ADF, the
configuration program selects resource assignments for the new card that don't
conflict with the resources used by any existing cards. These assignments are copied
into NVRAM.
Windows NT can recognize many kinds of MCA devices all by itself. If you
need to touch MCA slots directly, you can use HalGetBusData and HalSetBusData
to access them.

[5] The extra 32 bits came from multiplexing the address lines on the MCA bus.


EISA - The Extended Industry Standard Architecture
The PC industry responded to IBM's Micro Channel architecture with the
EISA bus. Most people simply weren't willing to throw away all their old hardware.
The EISA bus reflects this sentiment by removing some of the ISA limitations
while still allowing the use of legacy devices.
However, EISA's emphasis on compatibility limits the architecture in certain
ways. For example, even though the bus supports 32-bit data transfers, the bus
clock still runs at 8.33 MHz so the maximum transfer rate is only about 33 megabytes/sec.
Also, since EISA sockets had to be able to accept ISA cards, it was
impossible to fix some of the electrical noise problems caused by the layout of the
ISA wiring. See Figure 2.6 for the layout of a typical EISA system.

Register access Like MCA, the EISA bus contains a number of slots, each
of which corresponds to one physical socket on the bus. As you can see from Table
2.7, each of the 15 EISA sockets has its own particular range of addresses in I/O
space. Within the 4-kilobyte area assigned to a particular slot, four 256-byte
ranges are guaranteed to be available to the card in that socket.[6]

Interrupt mechanisms EISA's interrupt capabilities are a superset of the
ISA mechanisms. Although EISA interrupt controllers provide the same 15 levels
available on the ISA bus (see Table 2.4), each IRQ line can be individually programmed
for edge-triggered or level-sensitive behavior. This allows both ISA
cards and EISA cards to coexist on the same bus.

[Figure 2.6: Layout of a typical EISA system. Surviving labels: Memory, EISA Slot 15.]

Chapter 3

54

Kernel-Mode I/O Processing

Buffered I/O (BIO) Under this scheme, the I/O Manager allocates a buffer
from nonpaged pool at the start of each I/O operation and passes the address of
this buffer to the driver. The driver uses this buffer for any data transfer
operations to or from the device.
For output requests, the I/O Manager copies the contents of the user's buffer
into the system buffer before passing it to the driver. For input requests, the driver
fills the system buffer with data from the device, and the I/O Manager copies it
back into user space at the end of the operation.
There are two disadvantages to this technique. One is that all the memory-to-memory
copying of data can slow things down, particularly for devices that
transfer large amounts of data on a frequent basis. The other is that it can use up a
lot of nonpaged pool. For these reasons, drivers should limit the use of Buffered I/O
to slow devices that don't transfer a lot of data at one time, and you should
never use Buffered I/O to perform transfers larger than one page of memory.
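The copy step that makes Buffered I/O expensive can be illustrated with a small user-mode sketch. Nothing here is DDK code: BufferedRead, FakeDeviceRead, and the DEVICE_READ type are hypothetical names, and malloc merely stands in for a nonpaged pool allocation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a driver routine that fills a buffer from a device. */
typedef size_t (*DEVICE_READ)(unsigned char *systemBuffer, size_t length);

/* Simulated device: produces the byte values 0, 1, 2, ... */
static size_t FakeDeviceRead(unsigned char *systemBuffer, size_t length)
{
    for (size_t i = 0; i < length; i++)
        systemBuffer[i] = (unsigned char)i;
    return length;
}

/* Buffered input: the driver only ever sees the system buffer. The copy
   back into the caller's buffer happens after the driver has finished. */
static size_t BufferedRead(DEVICE_READ driverRead,
                           unsigned char *userBuffer, size_t length)
{
    assert(userBuffer != NULL);

    unsigned char *systemBuffer = malloc(length); /* stands in for nonpaged pool */
    if (systemBuffer == NULL)
        return 0;

    size_t transferred = driverRead(systemBuffer, length);
    memcpy(userBuffer, systemBuffer, transferred); /* the extra copy BIO pays for */
    free(systemBuffer);
    return transferred;
}
```

Calling BufferedRead(FakeDeviceRead, buffer, 4) fills the caller's buffer with 0 through 3 by way of the intermediate allocation. Every byte is touched twice, which is why the technique suits small, infrequent transfers.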
Direct I/O (DIO) This scheme avoids the need for copying user data by
giving the driver direct access to the physical pages of memory where the user
buffer lives. At the beginning of an I/O operation, the I/O Manager locks the
entire user buffer into memory to prevent deadly page faults. It then builds a list
that identifies the physical pages making up the user buffer. The driver uses this
list to perform an I/O operation using the actual pages of the user's buffer. When
the I/O operation is complete, the I/O Manager will unlock the pages.
You should use Direct I/O for high-speed devices that need to transfer large
amounts of data at once, particularly devices that perform DMA. The mechanics
of Direct I/O are described in Chapter 12.
3.5

STRUCTURE OF A KERNEL-MODE DRIVER
One of the biggest differences between a driver and an application program is the
driver's control structure. Application programs run from beginning to end under
the control of a main or WinMain function that determines the sequence in which
various subroutines are called.
A kernel-mode driver, on the other hand, has no main or WinMain function.
Instead, it's just a collection of subroutines that are called as needed by the I/O
Manager. Depending on the driver, the I/O Manager might call a driver routine in
any of the following situations:
• When a driver is being loaded
• When the driver is being unloaded or the system is shutting down
• When a user-mode program issues an I/O system service call
• When a shared hardware resource becomes available to the driver
• At various points during an actual device operation


The remainder of this section briefly describes the major categories of
routines making up a kernel-mode driver.

Driver Initialization and Cleanup Routines
Before any driver can begin processing I/O requests, there are a number of
initialization tasks it must perform. Likewise, drivers need to clean things up
when they leave the system. There are several routines a driver can use to
perform these operations.

DriverEntry routine The I/O Manager calls this routine when it loads the
driver, either at system boot time if the driver is loaded automatically, or later if
you load the driver manually from the Control Panel. The DriverEntry routine
performs a wide range of initialization functions, including setting up pointers to
other driver routines, finding and allocating any hardware resources used by the
driver, and making the name of the device visible to the rest of the system.
Reinitialize routine Some drivers may not be able to complete their
initialization during the DriverEntry routine. This could happen if the driver
depended on some other driver that wasn't yet loaded, or if the driver needed to
initialize itself during different phases of the system boot. These kinds of drivers
can use Reinitialize routines to spread out their initialization functions over time.
Unload routine The I/O Manager calls a driver's Unload routine when a
driver is unloaded manually using the Control Panel. The Unload routine is
responsible for undoing everything that was done by the DriverEntry routine,
including deallocating any hardware resources belonging to the driver and
destroying any kernel objects that belong to the driver.
Shutdown routine When the system goes through a user-initiated
shutdown, the I/O Manager will call the Shutdown routines registered by any
currently loaded drivers. The primary purpose of a Shutdown routine is to put
the hardware into a known state. System resource cleanup is not as important
here because the system is about to disappear anyway.
Bugcheck callback routine If a driver needs to get control in the event of
a system crash, it can register a Bugcheck callback routine with the Kernel. This
mechanism gives the driver a chance to put its devices into a known state, and
perhaps record some state information that will be helpful in debugging the
crash.
I/O System Service Dispatch Routines
When the I/O Manager gets a request, it uses the function code of the
request to call one of several Dispatch routines in the driver. The Dispatch routine
verifies the request and may have the I/O Manager send it to the device for
processing.

Open and close operations All drivers must provide a Dispatch routine
that handles Win32 CreateFile requests. Drivers that need to perform cleanup
operations can supply a routine to handle CloseHandle calls, as well as separate
Dispatch routines that perform special processing when the last handle on a
shared device is closed.
Device operations Depending on the device, a driver may have one or
more Dispatch routines for handling actual data transfer and control operations.
The I/O Manager calls these routines in response to Win32 ReadFile, WriteFile,
and DeviceIoControl requests, or in response to an I/O request from a higher-level
driver. These routines perform any final verification of the request and then
pass it to the driver's device management routines for actual processing.
Data Transfer Routines
Device operations involve a number of different driver routines, depending
on the nature and complexity of the device.

Start I/O routine The I/O Manager calls the driver's Start I/O routine
when it's time to begin a device operation. This routine allocates any resources
needed to process the request and sets the device in motion. The I/O Manager
provides simplified support for half-duplex drivers that only need a single Start
I/O routine. Drivers of full-duplex devices that have to manage simultaneous
input and output requests need a somewhat more complex architecture.
Interrupt Service routine (ISR) The Kernel's interrupt dispatcher calls a
driver's Interrupt Service routine whenever the driver's device generates an
interrupt. The ISR is responsible for acknowledging the device, gathering any
volatile state information needed by other parts of the driver, and asking the I/O
Manager to execute a DPC routine.
DPC routine(s) A driver can have one or more DPC routines that clean up
after a device operation. Depending on the driver, this can involve releasing
various system resources, reporting errors, handing completed I/O requests back
to the I/O Manager, and starting the next device operation if one is waiting.
If you can do everything with a single DPC, the I/O Manager provides a
simplified mechanism called a DpcForIsr routine. However, some drivers are
easier to write and maintain if they have separate DPC routines for different kinds
of processing. For example, drivers that perform full-duplex I/O might have one
DPC routine that completes input operations, and another DPC routine for
outputs. At your option, your driver can have any number of these CustomDpc
routines.


Resource Synchronization Callbacks
As an extension of the I/O Manager, a driver must be ready to run as
needed at the request of more than one user-mode process. For example, it could
be asked to send data to one device while waiting for a previous operation to
complete on the same or another device. Since there's only one copy of the driver
in memory, it has to handle any contention issues that might result from
processing overlapping requests.
The I/O Manager makes it easier for drivers to handle these kinds of
problems through the use of various synchronization callback routines. When a
driver needs to access some shared resource, it queues a request for that resource.
When the resource becomes available, the I/O Manager invokes a driver callback
routine associated with the request. This has the effect of serializing access to the
resource and avoiding collisions. There are three types of synchronization
callback routines a driver might use.

ControllerControl routine If a peripheral card supports multiple physical
devices, it's important that only one hardware operation is being performed at a
time. Before doing anything to the controller's registers, the Start I/O routine
requests exclusive ownership of the controller. If ownership is granted, the
ControllerControl callback routine executes; otherwise the ownership request
waits until the current owner releases the controller.
AdapterControl routine DMA hardware is another shared resource that
must be passed around from driver to driver. Before doing any DMA operations,
the driver requests ownership of the proper DMA hardware. If ownership is
granted, the AdapterControl callback routine executes; otherwise the ownership
request waits until the current owner releases the DMA hardware.
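The ownership pattern shared by the ControllerControl and AdapterControl routines (run the callback at once if the resource is free, otherwise wait in line) can be modeled in a few lines of portable C. This is only a sketch under stated assumptions: SHARED_RESOURCE, RequestOwnership, ReleaseOwnership, and RecordOwner are hypothetical names, not I/O Manager functions, and the real mechanism involves IRQL and queueing rules that are ignored here.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WAITERS 8

typedef void (*OWNER_CALLBACK)(void *context);

/* Hypothetical shared resource with a FIFO of waiting ownership requests. */
typedef struct {
    int busy;
    OWNER_CALLBACK waiters[MAX_WAITERS];
    void *contexts[MAX_WAITERS];
    size_t head, count;
} SHARED_RESOURCE;

/* Run the callback now if the resource is free; otherwise queue the request.
   Returns 0 on success, -1 if the wait queue is full. */
static int RequestOwnership(SHARED_RESOURCE *r, OWNER_CALLBACK cb, void *ctx)
{
    if (!r->busy) {
        r->busy = 1;
        cb(ctx);
        return 0;
    }
    if (r->count == MAX_WAITERS)
        return -1;
    size_t tail = (r->head + r->count) % MAX_WAITERS;
    r->waiters[tail] = cb;
    r->contexts[tail] = ctx;
    r->count++;
    return 0;
}

/* Release the resource, handing it straight to the next waiter if there is one. */
static void ReleaseOwnership(SHARED_RESOURCE *r)
{
    assert(r->busy);
    if (r->count == 0) {
        r->busy = 0;
        return;
    }
    OWNER_CALLBACK cb = r->waiters[r->head];
    void *ctx = r->contexts[r->head];
    r->head = (r->head + 1) % MAX_WAITERS;
    r->count--;
    cb(ctx); /* the resource stays busy, now owned by the next requester */
}

/* Example callback: records the order in which owners were granted access. */
static void RecordOwner(void *context)
{
    static int order = 0;
    *(int *)context = ++order;
}
```

With two requests outstanding, the first callback runs immediately and the second runs only when the first owner calls ReleaseOwnership, which is the serialization effect described above.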
SynchCritSection routines The parts of your driver that service device
interrupts run at DIRQL while other pieces of driver code execute at or below
DISPATCH_LEVEL. If these low-IRQL sections of code need to touch any
resources used by the Interrupt Service routine, they perform the operation inside
a SynchCritSection routine. Resources in this category include all device control
registers and any other context or state information shared with the Interrupt
Service routine.
Other Driver Routines
In addition to the basic set of routines described above, your driver may
contain some of the following additional functions.

Timer routines Drivers that need to keep track of the passage of time during
a device operation can do so using either an I/O Timer or a CustomTimerDpc
routine. Chapter 10 describes both these mechanisms.


I/O completion routines Higher-level drivers may want to
receive notification when a request they've sent to a lower-level driver has
completed. This notification will come in the form of a call to the higher-level
driver's I/O Completion routine. Chapter 15 discusses these routines in more detail.
Cancel I/O routines Any driver that holds on to pending requests for a
long time must attach a Cancel I/O routine to the request. If the request is
canceled, the I/O Manager calls the Cancel I/O routine to perform any necessary
cleanup operations. Chapter 11 describes the operation of these routines.

3.6

I/O PROCESSING SEQUENCE
When a user-mode thread requests an I/O operation, the request goes through
several processing stages:
• Request preprocessing by NT and the I/O Manager
• Driver-specific preprocessing
• Device activation and interrupt servicing
• Driver-specific postprocessing
• Request postprocessing by the I/O Manager

The following sections describe these stages in more detail.

Request Preprocessing by NT
This phase takes care of all the device-independent setup and verification
required by an I/O request.

1. The Win32 subsystem converts the request into a native NT system service
   call. This triggers a change to kernel mode which is trapped by NT's system
   service dispatcher. Eventually, the call ends up inside the I/O Manager.

2. The I/O Manager allocates a data structure called an I/O Request Packet (IRP).
   Subsequent chapters will have a lot to say about IRPs, but for now, just think
   of them as work orders that describe what the driver is supposed to do. The
   I/O Manager fills in the IRP with various pieces of information including a
   function code indicating what operation the user requested.

3. The I/O Manager performs a number of validity checks on the arguments
   supplied by the caller. This involves verifying the file handle, checking access
   rights to the file object, making sure the device supports the requested function,
   and probing any input or output buffer addresses passed by the caller.

4. If this is a Buffered I/O operation, the I/O Manager allocates a nonpaged
   pool buffer, and for outputs, copies data from user space into the system
   buffer. If this is a Direct I/O operation, the user's buffer pages are faulted into
   memory and locked down, and the I/O Manager builds a list of the buffer's
   physical pages.

5. The I/O Manager calls one of the driver's Dispatch routines.

Request Preprocessing by the Driver
Each driver provides a dispatch table that controls the device-dependent
preprocessing of I/O requests. The I/O Manager uses the function code of the
requested operation as an index into this table and calls the corresponding driver
Dispatch routine. These routines might perform any of the following operations:
• Do any device-dependent parameter validation. An example would be
  testing whether the size of the request falls within the range of any limitations
  imposed by the device itself.
• If the request is such that it can be handled without any device activity,
  the Dispatch routine completes the request and sends it back to the I/O
  Manager.
• If device operation is required, the Dispatch routine marks the request
  as pending and tells the I/O Manager to send it to the driver's Start I/O
  routine.
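The idea of a dispatch table indexed by function code, together with the three outcomes listed above, can be sketched as follows. This is an illustrative user-mode model rather than DDK code: the FN_ codes, REQ_STATUS values, REQUEST structure, and the MAX_TRANSFER limit are all hypothetical (the real function codes are the IRP_MJ_XXX values in NTDDK.H).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function codes standing in for IRP_MJ_XXX values. */
enum { FN_CREATE, FN_READ, FN_WRITE, FN_MAXIMUM };

typedef enum { REQ_COMPLETED, REQ_PENDING, REQ_REJECTED } REQ_STATUS;

typedef struct {
    int function;  /* index into the dispatch table */
    size_t length; /* size of the requested transfer */
} REQUEST;

typedef REQ_STATUS (*DISPATCH_ROUTINE)(const REQUEST *request);

#define MAX_TRANSFER 4096 /* assumed device limit, for illustration only */

static REQ_STATUS DispatchCreate(const REQUEST *request)
{
    (void)request;
    return REQ_COMPLETED; /* no device activity needed */
}

static REQ_STATUS DispatchReadWrite(const REQUEST *request)
{
    if (request->length == 0)
        return REQ_COMPLETED; /* nothing to transfer */
    if (request->length > MAX_TRANSFER)
        return REQ_REJECTED;  /* device-dependent validation failed */
    return REQ_PENDING;       /* would be handed on to Start I/O */
}

/* One table per driver: function code in, Dispatch routine out. */
static const DISPATCH_ROUTINE DispatchTable[FN_MAXIMUM] = {
    [FN_CREATE] = DispatchCreate,
    [FN_READ]   = DispatchReadWrite,
    [FN_WRITE]  = DispatchReadWrite,
};

/* Plays the part of the I/O Manager selecting a Dispatch routine. */
static REQ_STATUS CallDispatch(const REQUEST *request)
{
    assert(request->function >= 0 && request->function < FN_MAXIMUM);
    return DispatchTable[request->function](request);
}
```

In this model, an open request completes immediately, a 512-byte read comes back pending, and an 8192-byte write is rejected by the size check.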

Data Transfer
Data transfers and other device operations are managed by the driver's Start
I/O and Interrupt Service routines.

Start I/O When a Dispatch routine tells the I/O Manager to start a device
operation, the I/O Manager checks to see if the target device is currently busy. If it
is, the request is queued to the device for later processing. Otherwise, the I/O
Manager calls the driver's Start I/O routine. Depending on the device, the driver's
Start I/O routine performs some or all of the following steps:

1. It checks the IRP function (read, write, device control, etc.) and performs any
   setup work specific to that type of operation.

2. If the device is attached to a multiunit controller, the ControllerControl routine
   asks for exclusive ownership of the controller hardware.

3. If the operation is a DMA transfer, the AdapterControl routine allocates DMA
   adapter resources.

4. It uses a SynchCritSection routine to start the device.

5. It returns control to the I/O Manager and waits for a device interrupt.
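The busy check and per-device queue described at the start of this subsection can be modeled with a small sketch. The names below (DEVICE_QUEUE, StartPacket, StartNextPacket, StartIo) are hypothetical and only echo the roles of the real routines; the model ignores IRQL, cancellation, and everything else a real driver must consider.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_DEPTH 8

/* Hypothetical per-device state: one active request plus a FIFO of waiters. */
typedef struct {
    int busy;               /* nonzero while a request is on the device */
    int current;            /* id of the request being processed */
    int queue[QUEUE_DEPTH]; /* ids of waiting requests */
    size_t head, count;
    int started;            /* how many times Start I/O has run */
} DEVICE_QUEUE;

/* Stand-in for the driver's Start I/O routine. */
static void StartIo(DEVICE_QUEUE *dev, int requestId)
{
    dev->busy = 1;
    dev->current = requestId;
    dev->started++;
}

/* Dispatch side: start at once if the device is idle, otherwise queue.
   Returns 0 on success, -1 if the queue is full. */
static int StartPacket(DEVICE_QUEUE *dev, int requestId)
{
    assert(dev != NULL);
    if (!dev->busy) {
        StartIo(dev, requestId);
        return 0;
    }
    if (dev->count == QUEUE_DEPTH)
        return -1;
    dev->queue[(dev->head + dev->count) % QUEUE_DEPTH] = requestId;
    dev->count++;
    return 0;
}

/* Completion side: the current request is done; start the next if one waits. */
static void StartNextPacket(DEVICE_QUEUE *dev)
{
    dev->busy = 0;
    if (dev->count == 0)
        return;
    int next = dev->queue[dev->head];
    dev->head = (dev->head + 1) % QUEUE_DEPTH;
    dev->count--;
    StartIo(dev, next);
}
```

Submitting request 1 and then request 2 leaves 1 on the device and 2 in the queue; request 2 reaches StartIo only after StartNextPacket is called for request 1.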


ISR When an interrupt occurs, the Kernel's interrupt dispatcher calls the
driver's ISR. Depending on the device, the ISR performs some of the following
steps:

1. It checks to see if the interrupt was expected.

2. It stops the device from interrupting.

3. If this is a programmed I/O operation and more data remains to be transferred,
   it moves the next chunk of data to or from the device and waits for the
   next interrupt.

4. If this is a DMA operation and more data remains to be transferred, it queues
   a DPC request to set up the DMA hardware for the next chunk of data.

5. If an error occurs or the data transfer is complete, it queues a DPC request to
   perform I/O postprocessing at a lower IRQL.

6. It dismisses the interrupt.

Postprocessing by the Driver
The Kernel's DPC dispatcher eventually calls the driver's DPC routine to
perform device-specific postprocessing operations, including some or all of the
following:

1. If this is a DMA operation and more data remains to be transferred, it sets up
   the DMA hardware for the next piece of data, starts the device, and waits for
   an interrupt. It then returns to the I/O Manager without performing any of
   the following steps.

2. If there was an error or timeout, the DPC routine might record it in the system
   event log and either retry or abort the I/O request.

3. It releases any DMA and controller resources being held by the driver.

4. Next, the DPC routine puts the size of the transfer and final status information
   into the IRP.

5. Finally, it tells the I/O Manager to complete the current request and start the
   next one, if one is waiting in the queue for this device.
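The division of labor between the ISR and the DPC routine (capture volatile state quickly, finish the work later at lower priority) can be sketched in ordinary C. Everything named here is hypothetical: DEVICE_STATE, the DEVSTAT_ bits, and DrainDpcQueue, which stands in for the Kernel's DPC dispatcher.

```c
#include <assert.h>
#include <stddef.h>

#define DEVSTAT_IRQ  0x1u /* hypothetical "interrupt pending" status bit */
#define DEVSTAT_DONE 0x2u /* hypothetical "transfer complete" status bit */

/* Hypothetical state shared between the ISR and the DPC routine. */
typedef struct {
    unsigned pendingStatus; /* volatile device status captured by the ISR */
    int dpcQueued;          /* nonzero once postprocessing was requested */
    size_t bytesDone;       /* filled in by the DPC routine */
    int requestComplete;
} DEVICE_STATE;

/* ISR: do the minimum while the interrupt is being serviced.
   Returns nonzero if the interrupt belonged to this device. */
static int InterruptService(DEVICE_STATE *dev, unsigned hardwareStatus)
{
    assert(dev != NULL);
    if (!(hardwareStatus & DEVSTAT_IRQ))
        return 0;                        /* not expected, not ours */
    dev->pendingStatus = hardwareStatus; /* save volatile state */
    dev->dpcQueued = 1;                  /* ask for deferred processing */
    return 1;
}

/* DPC routine: runs later and records the outcome of the request. */
static void DeferredProcedure(DEVICE_STATE *dev, size_t requestedBytes)
{
    if (dev->pendingStatus & DEVSTAT_DONE) {
        dev->bytesDone = requestedBytes;
        dev->requestComplete = 1;
    }
}

/* Stand-in for the DPC dispatcher: runs a queued DPC, if there is one.
   Returns nonzero if a DPC routine was executed. */
static int DrainDpcQueue(DEVICE_STATE *dev, size_t requestedBytes)
{
    if (!dev->dpcQueued)
        return 0;
    dev->dpcQueued = 0;
    DeferredProcedure(dev, requestedBytes);
    return 1;
}
```

Note that the request is not marked complete when InterruptService returns; that happens only once DrainDpcQueue runs the deferred routine, mirroring the two-stage sequence in the lists above.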

Postprocessing by the I/O Manager
Once the driver's DPC routine releases an IRP, the I/O Manager performs
various device-independent cleanup operations. These include the following.

1. If this was a Buffered I/O output operation, the I/O Manager releases the
   nonpaged pool buffer used during the transfer.

2. If this was a Direct I/O operation, it unlocks the user's buffer pages.

3. It queues a request to the original thread for a kernel-mode asynchronous
   procedure call (APC). This APC will execute a piece of I/O Manager code in the
   context of the thread that issued the original I/O request.

4. When the kernel-mode APC runs, it copies status and transfer-size information
   back into user space.

5. If this was a buffered input, the APC routine copies the contents of the
   nonpaged pool buffer into the caller's user-space buffer. Then it frees the system
   buffer.

6. If the original request was for an overlapped operation, the APC routine sets
   the associated Event object into the signaled state.

7. If the original request included a completion routine (for example, from a
   ReadFileEx or WriteFileEx call), the kernel-mode APC requests a user-mode
   APC to execute the completion routine.

3.7

SUMMARY
That completes our quick tour of NT and the I/O subsystem. At this point, you
should have a good sense of how various driver routines interact with the I/O
Manager. Later chapters will explain how to apply this understanding.
Keeping track of all the details involved in I/O processing obviously
requires a lot of bookkeeping. In the next chapter, we'll take a look at the data
structures used by the I/O Manager and your driver.

CHAPTER 4

Drivers and Kernel-Mode Objects

Data structures are the lifeblood of most operating systems, and Windows NT is
no exception. What's interesting about NT is its
use of object technology to manage all this data. After a quick look at NT's
approach to objects, this chapter introduces the major structures involved in
processing I/O requests. Later chapters will introduce additional data objects as they
become necessary.

4.1

DATA OBJECTS AND WINDOWS NT
Just in case you've been living on Mars for the last decade, object-oriented
programming (OOP) is one of the currently fashionable software design methodologies.
In this scheme, data structures are viewed as black boxes (objects) whose
contents are invisible, and any interaction with these data structures occurs
through a limited set of access functions (methods). The goal is to improve the
reliability and robustness of the resulting software by hiding implementation details
from the users of an object, and by reducing unplanned dependencies between
software modules.

Windows NT and OOP
Using a strict definition of OOP, the design of NT isn't truly object-oriented.
Rather, you should think of it as being object-based, because it manages
its internal data structures in an objectlike way. In particular, the Kernel and the
various Executive modules each define their own sets of data structures, along
with a corresponding group of access functions. All other modules are expected
to use those access functions to manipulate the contents of the structure. The
data structures themselves are supposed to be opaque outside the module that
defines them.
That's the idea anyway. When it comes to drivers, things get a little fuzzy
since a driver is essentially a trusted add-on component of the I/O Manager.
Because of this special status, a driver is allowed to touch some object fields
directly but must use access functions for other operations on the object. So, I/O
Manager objects available to a driver are partially opaque. Objects defined by
other NT components are entirely opaque.

NT Objects and Win32 Objects
If you compare internal NT objects with the Win32 user-mode objects,
you'll see a couple of differences. First, with a couple of exceptions, most of
these NT objects have no externally visible names. This is because these objects
aren't being exported to user mode and don't need to be managed by the Object
Manager.
Second, you don't use handles to access internal NT objects. Instead, you use
a pointer to the object body itself. In some cases, NT will create the object for you
and give you the pointer. In other cases, you'll need to allocate and initialize
storage for the object.

4.2

I/O REQUEST PACKETS (IRPs)
Almost all I/O is packet-driven under Windows NT. Each separate I/O transaction
is described by a work order that tells the driver what to do and tracks the
progress of the request through the I/O subsystem. These work orders take the
form of a data structure called an I/O Request Packet (IRP), and this is how they're
used.

1. The I/O Manager allocates an IRP from nonpaged system memory in
   response to an I/O request. Based on the I/O function specified by the user, it
   passes the IRP to the appropriate driver Dispatch routine.

2. The Dispatch routine checks the parameters of the request, and if they're
   valid, passes the IRP to the driver's Start I/O routine.

3. The Start I/O routine uses the contents of the IRP to set up a device operation.

4. When the operation is complete, the driver's DpcForIsr routine stores a final
   status code in the IRP and sends it back to the I/O Manager.

5. The I/O Manager uses the information in the IRP to complete the request and
   send the user the final status.


This describes what happens when requests are being sent to a lowest-level
driver. If the initial request is sent to a higher-level driver, things get a little more
complex, and a single IRP may travel through several layers of drivers before the
request is finished. Higher-level drivers can also create additional IRPs and send
them to other drivers.

Layout of an IRP
An IRP is a variable-sized structure allocated from nonpaged pool. As you
can see from Figure 4.1, an IRP has two sections:
• A header area containing general bookkeeping information
• One or more parameter blocks called I/O stack locations

IRP header This area of the IRP holds various pieces of information about
the overall I/O request. Some parts of the header are directly accessible to your
driver, while other pieces are the exclusive property of the I/O Manager. Table 4.1
lists the fields in the header that your driver is allowed to touch.
The IoStatus member holds the final status of the I/O operation. When
your driver is ready to complete the processing of an IRP, it sets the Status field
of this block to a STATUS_XXX value. At the same time, your driver should set
the Information field of the status block either to 0 (if there's an error) or to a
function-code-specific value (for example, the number of bytes transferred).
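The rule for filling in the status block can be shown with a minimal model. The types and constants below (SKETCH_IRP, SKETCH_STATUS_BLOCK, SKETCH_SUCCESS, SKETCH_DEVICE_ERROR) and the FinishRequest helper are simplified, hypothetical stand-ins; the real IRP, IO_STATUS_BLOCK, and STATUS_XXX definitions come from NTDDK.H and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for the real status type and values. */
typedef long SKETCH_STATUS;
#define SKETCH_SUCCESS      ((SKETCH_STATUS)0)
#define SKETCH_DEVICE_ERROR ((SKETCH_STATUS)-1)

typedef struct {
    SKETCH_STATUS Status; /* final outcome of the operation */
    size_t Information;   /* e.g. number of bytes transferred */
} SKETCH_STATUS_BLOCK;

typedef struct {
    SKETCH_STATUS_BLOCK IoStatus;
} SKETCH_IRP;

/* Record the final status; Information is forced to 0 on any error. */
static void FinishRequest(SKETCH_IRP *irp, SKETCH_STATUS status,
                          size_t bytesTransferred)
{
    assert(irp != NULL);
    irp->IoStatus.Status = status;
    irp->IoStatus.Information =
        (status == SKETCH_SUCCESS) ? bytesTransferred : 0;
}
```

A successful 128-byte transfer leaves Information at 128, while the same call with an error status leaves it at 0.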

[Figure 4.1: The structure of an IRP. The figure shows an I/O stack location
containing MajorFunction, MinorFunction, and a Parameters union with Read, Write,
and DeviceIoControl members. Copyright © 1994 by Cydonix Corporation.]

Table 4.1  Externally visible fields in an IRP header

IRP header fields
Field                               Description
IO_STATUS_BLOCK IoStatus            Contains status of the I/O request
PVOID AssociatedIrp.SystemBuffer    Points to a system space buffer if
                                    device performs Buffered I/O
PMDL MdlAddress                     Points to a Memory Descriptor List
                                    for a user-space buffer if device
                                    performs Direct I/O
PVOID UserBuffer                    User-space address of I/O buffer
BOOLEAN Cancel                      Indicates the IRP has been canceled

The AssociatedIrp.SystemBuffer, MdlAddress, and UserBuffer fields play
various roles in managing the driver's access to data buffers. Later chapters will
explain how to use these fields when your driver performs either Buffered or
Direct I/O.

I/O stack locations The main purpose of an I/O stack location is to hold
the function code and parameters for an I/O request. By examining the MajorFunction
field of the stack location, a driver can decide what operation to perform
and how to interpret the Parameters union. Table 4.2 describes some of the
commonly used members of an I/O stack location.
For requests sent directly to a lowest-level driver, the corresponding IRP
will have only one I/O stack location. For requests sent to a higher-level driver,
the I/O Manager creates an IRP with separate I/O stack locations for each driver
layer. Every driver in the hierarchy is allowed to touch only its own stack location,
and if it's not at the bottom of the pile, to set up the stack location for the
next driver beneath it.
When a driver passes an IRP to a lower-level driver, the I/O Manager
automatically "pushes" the I/O stack pointer so that it points at the I/O stack location
belonging to the lower driver. When the lower driver releases the IRP, the I/O
stack pointer is "popped" so that it again points to the stack location of the higher
driver. Chapter 15 will explain how to work with this mechanism.
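The push and pop behavior of the stack pointer can be pictured with a toy model in which each driver layer owns one slot. The PACKET and STACK_SLOT types and the four helpers are hypothetical; in particular, the indexing direction (slot 0 for the topmost driver) is an assumption made for readability and says nothing about how the real IRP lays out its stack locations.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_LAYERS 4

typedef struct {
    int majorFunction; /* what this layer is being asked to do */
    size_t length;     /* parameter for this layer */
} STACK_SLOT;

/* Hypothetical packet: one slot per driver layer, slot 0 on top. */
typedef struct {
    int stackCount;      /* number of driver layers */
    int currentLocation; /* slot owned by the driver now running */
    STACK_SLOT slots[MAX_LAYERS];
} PACKET;

/* The running driver's own slot. */
static STACK_SLOT *CurrentSlot(PACKET *p)
{
    assert(p->currentLocation >= 0 && p->currentLocation < p->stackCount);
    return &p->slots[p->currentLocation];
}

/* The slot for the next lower driver, or NULL at the bottom of the pile. */
static STACK_SLOT *NextSlot(PACKET *p)
{
    int next = p->currentLocation + 1;
    return (next < p->stackCount) ? &p->slots[next] : NULL;
}

/* "Push": done on the way down to the lower driver. */
static void PushToLowerDriver(PACKET *p)
{
    assert(p->currentLocation + 1 < p->stackCount);
    p->currentLocation++;
}

/* "Pop": done when the lower driver releases the packet. */
static void PopToUpperDriver(PACKET *p)
{
    assert(p->currentLocation > 0);
    p->currentLocation--;
}
```

In a two-layer packet, the upper driver can copy its own slot into NextSlot, adjust the parameters, and push; after the pop, its original slot is untouched, which is why each layer may safely rely on its own stack location.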
Manipulating IRPs
IRP access functions fall into two general categories: Those that operate on
the IRP as a whole, and those that deal specifically with the IRP's I/O stack locations.
The following subsections describe each of these groups.

IRPs as a whole The I/O Manager exports a variety of functions that
work with IRPs. Table 4.3 lists the most common ones. Later chapters will explain
how to use them.

Table 4.2  Selected contents of an IRP stack location

IO_STACK_LOCATION, *PIO_STACK_LOCATION
Field                          Contents
UCHAR MajorFunction            IRP_MJ_XXX function specifying the operation
UCHAR MinorFunction            Used by file system and SCSI drivers
union Parameters               Typed union keyed to MajorFunction code
  struct Read                  Parameters for IRP_MJ_READ
                               • ULONG Length
                               • ULONG Key
                               • LARGE_INTEGER ByteOffset
  struct Write                 Parameters for IRP_MJ_WRITE
                               • ULONG Length
                               • ULONG Key
                               • LARGE_INTEGER ByteOffset
  struct DeviceIoControl       Parameters for IRP_MJ_DEVICE_CONTROL
                               and IRP_MJ_INTERNAL_DEVICE_CONTROL
                               • ULONG OutputBufferLength
                               • ULONG InputBufferLength
                               • ULONG IoControlCode
                               • PVOID Type3InputBuffer
  struct Others                Available to driver
                               • PVOID Argument1-Argument4
PDEVICE_OBJECT DeviceObject    Target device for this I/O request
PFILE_OBJECT FileObject        File object for this request, if any

Note: See NTDDK.H for additional members of the Parameters union.

Table 4.3  Functions that work with the whole IRP

IRP functions
Function             Description                             Called by...
IoStartPacket        Sends IRP to Start I/O routine          Dispatch
IoCompleteRequest    Indicates that all processing is done   DpcForIsr
IoStartNextPacket    Sends next IRP to Start I/O             DpcForIsr
IoCallDriver*        Sends IRP to another driver             Dispatch
IoAllocateIrp*       Requests additional IRPs                Dispatch
IoFreeIrp*           Releases driver-allocated IRPs          I/O Completion

*These functions are used primarily by layered drivers.

Table 4.4  IO_STACK_LOCATION access functions

IO_STACK_LOCATION access functions

Function                       Description                              Called by...
-----------------------------  ---------------------------------------  ------------
IoGetCurrentIrpStackLocation   Gets pointer to caller's stack slot      (Various)
IoMarkIrpPending               Marks caller's stack slot as needing     Dispatch
                               further processing
IoGetNextIrpStackLocation*     Gets pointer to stack slot for next      Dispatch
                               lower driver
IoSetNextIrpStackLocation*     Pushes the I/O stack pointer one         Dispatch
                               location
IoSetCompletionRoutine*        Attaches I/O Completion routine to the   Dispatch
                               next lower driver's I/O stack slot

*These functions are used primarily by layered drivers.

IRP stack locations The I/O Manager also provides several functions
that drivers can use to access an IRP's stack locations. These functions are listed in
Table 4.4.

4.3 DRIVER OBJECTS
DriverEntry is the only driver routine with an exported name. When the I/O
Manager needs to locate other driver functions, it uses the Driver object associated
with a specific device. This object is basically a catalog that contains pointers
to various driver functions. Here's how it works.

1. The I/O Manager creates a Driver object whenever it loads a driver. If the
   driver fails during initialization, the I/O Manager deletes the object.

2. During initialization, the DriverEntry routine loads pointers to other driver
   functions into the Driver object.

3. When an IRP is sent to a specific device, the I/O Manager uses the associated
   Driver object to find the right Dispatch routine.

4. If a request involves an actual device operation, the I/O Manager uses the
   Driver object to locate the driver's Start I/O routine.

5. If the driver is unloaded, the I/O Manager uses the Driver object to find an
   Unload routine. When the Unload routine is done, the I/O Manager deletes
   the Driver object.
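
The "catalog of pointers" idea in steps 2, 3, and 5 can be modeled in ordinary user-mode C. This is a sketch, not kernel code: the `Sketch...` types, the function-code values, and the default "invalid request" handler are all assumptions made for illustration, and the real definitions come from NTDDK.H.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function codes; the real IRP_MJ_XXX values live in NTDDK.H. */
enum { SKETCH_MJ_CREATE, SKETCH_MJ_CLOSE, SKETCH_MJ_READ, SKETCH_MJ_WRITE,
       SKETCH_MJ_COUNT };

enum { SKETCH_OK = 0, SKETCH_INVALID_REQUEST = -1 };

struct SketchDriverObject;
typedef int  (*SketchDispatch)(int functionCode);
typedef void (*SketchUnload)(struct SketchDriverObject *driver);

/* Hypothetical stand-in for a Driver object: a table of routine pointers. */
typedef struct SketchDriverObject {
    SketchDispatch MajorFunction[SKETCH_MJ_COUNT];
    SketchUnload   DriverUnload;
    int            Loaded;
} SketchDriverObject;

static int SketchInvalidRequest(int functionCode)
{
    (void)functionCode;
    return SKETCH_INVALID_REQUEST;
}

static int SketchDispatchCreateClose(int functionCode)
{
    (void)functionCode;
    return SKETCH_OK;
}

static int SketchDispatchRead(int functionCode)
{
    (void)functionCode;
    return SKETCH_OK;
}

static void SketchDriverUnload(SketchDriverObject *driver)
{
    driver->Loaded = 0;
}

/* Step 2: the initialization routine loads pointers into the Driver object. */
int SketchDriverEntry(SketchDriverObject *driver)
{
    int i;
    for (i = 0; i < SKETCH_MJ_COUNT; i++)
        driver->MajorFunction[i] = SketchInvalidRequest;  /* model's default */
    driver->MajorFunction[SKETCH_MJ_CREATE] = SketchDispatchCreateClose;
    driver->MajorFunction[SKETCH_MJ_CLOSE]  = SketchDispatchCreateClose;
    driver->MajorFunction[SKETCH_MJ_READ]   = SketchDispatchRead;
    driver->DriverUnload = SketchDriverUnload;
    driver->Loaded = 1;
    return SKETCH_OK;
}

/* Step 3: the function code indexes the table to find the Dispatch routine. */
int SketchSendRequest(const SketchDriverObject *driver, int functionCode)
{
    if (functionCode < 0 || functionCode >= SKETCH_MJ_COUNT)
        return SKETCH_INVALID_REQUEST;
    return driver->MajorFunction[functionCode](functionCode);
}

/* Step 5: the Unload routine is also found through the Driver object. */
void SketchUnloadDriver(SketchDriverObject *driver)
{
    if (driver->DriverUnload != NULL)
        driver->DriverUnload(driver);
}
```

Because one routine can be stored in several slots, a single function can serve both the create and close codes, as it does here.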

[Figure 4.2: Structure of a Driver object. Diagram labels: DriverUnload,
MajorFunction[ ], Start I/O routine, Unload routine, Dispatch routines,
Device object.]

Layout of a Driver Object
There is a unique Driver object for each driver currently loaded in the system.
Figure 4.2 illustrates the structure of a Driver object. As you can see, the
Driver object also contains a pointer to a linked list of devices serviced by this
driver. A driver's Unload routine can use this list to locate any devices it needs to
delete.
Unlike other objects, the Driver object has no access functions for modifying
it. Instead, the DriverEntry routine sets various fields directly. Table 4.5 lists
the fields you're allowed to touch.

Table 4.5  Externally visible fields of a Driver object

Driver object fields

Field                               Description
----------------------------------  ----------------------------------------
PDRIVER_STARTIO DriverStartIo       Address of driver's Start I/O routine
PDRIVER_UNLOAD DriverUnload         Address of driver's Unload routine
PDRIVER_DISPATCH MajorFunction[ ]   Table of driver's Dispatch routines,
                                    indexed by I/O operation code
PDEVICE_OBJECT DeviceObject         Linked list of Device objects created by
                                    this driver

4.4 DEVICE OBJECTS AND DEVICE EXTENSIONS
Both the I/O Manager and the driver need to know what's going on with an I/O
device at all times. Device objects make this possible by keeping information
about a device's characteristics and state. There is one Device object for each
virtual, logical, and physical device on the system. Here's how they're used.

1. The DriverEntry routine creates a Device object for each of its devices.

2. The I/O Manager uses a pointer in the Device object to locate the corresponding
   Driver object. There it can find driver routines to operate on I/O requests.
   It also maintains a queue of current and pending IRPs attached to the Device
   object.

3. Various driver routines use the Device object to locate the corresponding
   Device Extension. As an I/O request is processed, the driver uses the
   Extension to store any device-specific state information.

4. The driver's Unload routine deletes the Device object when the driver is
   unloaded.

Physical device drivers aren't the only ones that use these objects. Chapter
15 describes the way higher-level drivers use Device objects.

Layout of a Device Object
Figure 4.3 illustrates the structure of a Device object and its relation to other
structures.

[Figure 4.3: Structure of a Device object. Diagram labels: IRP, Flags,
DriverObject, CurrentIrp, DeviceExtension.]

Table 4.6  Externally visible fields of a Device object

Device object fields

Field                         Description
----------------------------  ------------------------------------------
PVOID DeviceExtension         Points to Device Extension structure
PDRIVER_OBJECT DriverObject   Points to Driver object for this device
ULONG Flags                   Specifies buffering strategy for device
                              • DO_BUFFERED_IO
                              • DO_DIRECT_IO
PDEVICE_OBJECT NextDevice     Points to next device belonging to this
                              driver
CCHAR StackSize               Minimum number of I/O stack locations
                              needed by IRPs sent to this device
ULONG AlignmentRequirement    Memory alignment required for buffers

Although the Device object contains a lot of data, much of it is the exclusive
property of the I/O Manager. Your driver should limit its access to only those
fields listed in Table 4.6.

Manipulating Device Objects
Table 4.7 lists many of the I/O Manager functions that operate on Device
objects. The I/O Manager also passes a Device object pointer as an argument to
most of the routines in your driver.

Table 4.7  Access functions for a Device object

Device object access functions

Function                    Description                            Called by...
--------------------------  -------------------------------------  ------------
IoCreateDevice              Creates a Device object                DriverEntry
IoCreateSymbolicLink        Makes Device object visible to Win32   DriverEntry
IoAttachDevice*             Attaches a filter to a Device object   DriverEntry
IoAttachDeviceByPointer*    Attaches a filter to a Device object   DriverEntry
IoGetDeviceObjectPointer*   Layers one driver on top of another    DriverEntry
IoCallDriver*               Sends request to another driver        Dispatch
IoDetachDevice*             Disconnects from a lower driver        Unload
IoDeleteSymbolicLink        Removes Device object from the Win32   Unload
                            namespace
IoDeleteDevice              Removes Device object from system      Unload

*These functions are used primarily by layered drivers.

Device Extensions
Connected to the Device object is another important data structure, the
Device Extension. The Extension is simply a block of nonpaged pool that the I/O
Manager automatically attaches to any Device object you create. You choose both
the size and the contents of the Device Extension. Typically, you use it to hold any
information associated with a particular device.
Drivers have to be fully reentrant, so global or static variables are a very bad
idea. Any information that you might be tempted to keep in global or static storage
probably belongs in the Device Extension. Other things you might want to
store in the Extension include

• A back pointer to the Device object

• Any device state or driver context information

• A pointer to an Interrupt object and an interrupt-expected flag

• A pointer to a Controller object

• A pointer to an Adapter object and a count of mapping registers

Since the Device Extension is driver-specific, you'll need to define its structure
in one of your driver's header files. Although the Extension's exact contents
will depend on what your driver does, its general layout will look something
like this:

    typedef struct _DEVICE_EXTENSION {
        PDEVICE_OBJECT DeviceObject;
        //
        // Other driver-specific declarations
        //
    } DEVICE_EXTENSION, *PDEVICE_EXTENSION;
In later chapters of this book, you'll see a great many uses for the Device
Extension.
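
To see why per-device state belongs in the Extension rather than in global storage, here is a small user-mode model. It is a sketch under assumptions: the `Sketch...` names are hypothetical, `calloc` stands in for the nonpaged pool that the I/O Manager would supply, and the `BytesTransferred` and `InterruptExpected` fields are invented examples of device state.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct SketchDeviceObject {
    void *DeviceExtension;            /* opaque to everyone but the driver */
} SketchDeviceObject;

/* Driver-defined layout, following the pattern shown in the text. */
typedef struct SketchDeviceExtension {
    SketchDeviceObject *DeviceObject; /* back pointer */
    unsigned long       BytesTransferred;
    int                 InterruptExpected;
} SketchDeviceExtension;

/* Stand-in for device creation: the caller chooses only the extension size. */
SketchDeviceObject *SketchCreateDevice(size_t extensionSize)
{
    SketchDeviceObject *dev = calloc(1, sizeof *dev);
    if (dev == NULL)
        return NULL;
    dev->DeviceExtension = calloc(1, extensionSize);
    if (dev->DeviceExtension == NULL) {
        free(dev);
        return NULL;
    }
    return dev;
}

void SketchDeleteDevice(SketchDeviceObject *dev)
{
    if (dev != NULL) {
        free(dev->DeviceExtension);
        free(dev);
    }
}

SketchDeviceExtension *SketchGetExtension(SketchDeviceObject *dev)
{
    return (SketchDeviceExtension *)dev->DeviceExtension;
}

/* What initialization code would do for each device it manages. */
SketchDeviceObject *SketchAddDevice(void)
{
    SketchDeviceObject *dev = SketchCreateDevice(sizeof(SketchDeviceExtension));
    if (dev != NULL)
        SketchGetExtension(dev)->DeviceObject = dev;
    return dev;
}

/* The same code serves every device; all state is reached through the object. */
void SketchRecordTransfer(SketchDeviceObject *dev, unsigned long bytes)
{
    SketchGetExtension(dev)->BytesTransferred += bytes;
}
```

Two devices driven by the same routines keep separate counters here, which is exactly what a global variable could not provide.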

4.5 CONTROLLER OBJECTS AND CONTROLLER EXTENSIONS
Some peripheral adapters manage more than one physical device using the same
set of control registers. The floppy disk controller is one example of this
architecture. This kind of hardware causes the following synchronization problem: If the
driver tries to perform simultaneous operations on more than one of the connected
devices without first synchronizing its access to the shared register space,
the control registers may get trashed. To help with this problem, the I/O Manager
provides Controller objects.
The Controller object is a kind of token that can be owned by only one device
at a time. Before accessing any device registers, the driver asks that ownership of
the Controller object be given to a specific device. If the hardware is free, ownership
is granted. If not, the device's request is put on hold until the current owner releases
the hardware. By passing the Controller object around this way, the I/O Manager
guarantees that multiple devices will access the hardware in an orderly manner.
Here's a little more detail about how Controller objects are used.

1. The DriverEntry routine creates the Controller object and usually stores its
   address in a field of each device's Device Extension.

2. Before it starts a device operation, the Start I/O routine asks for exclusive
   ownership of the Controller object on behalf of a specific device.

3. When the Controller object becomes available, the I/O Manager grants
   ownership and calls the driver's ControllerControl routine. This routine sets up
   the device's registers and starts the I/O operation. As long as this device
   owns the Controller object, any further requests for ownership will block at
   step 2 until the object is released.

4. When the device operation is finished, the driver's DpcForIsr routine releases
   the Controller object, making it available for use by other pending requests.

5. The driver's Unload routine deletes the Controller object when the driver is
   unloaded.

Obviously, not all drivers need a Controller object. If your interface card
supports only one physical or virtual device, or if multiple devices on the same
card don't share any control registers, then you can ignore Controller objects.
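
The token-passing scheme in steps 2 through 4 can be sketched as a tiny user-mode model. Everything here is an assumption made for illustration: the `Sketch...` names, the fixed-size FIFO of waiters, and the return codes are hypothetical, and the real routines (IoAllocateController and IoFreeController) have different signatures.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_WAITERS 8

/* Stand-in for a ControllerControl routine. */
typedef void (*SketchControllerControl)(int deviceId, void *context);

typedef struct SketchWaiter {
    int                     DeviceId;
    SketchControllerControl Routine;
    void                   *Context;
} SketchWaiter;

typedef struct SketchController {
    int          Busy;                       /* is the token owned? */
    int          OwnerDeviceId;              /* -1 when free */
    SketchWaiter Queue[SKETCH_MAX_WAITERS];  /* FIFO of requests on hold */
    int          Head;
    int          Count;
} SketchController;

/* Example callback: records the order in which devices were started. */
typedef struct SketchLog {
    int Started[SKETCH_MAX_WAITERS];
    int Count;
} SketchLog;

void SketchStartDevice(int deviceId, void *context)
{
    SketchLog *log = (SketchLog *)context;
    if (log->Count < SKETCH_MAX_WAITERS)
        log->Started[log->Count++] = deviceId;
}

void SketchInitController(SketchController *ctl)
{
    ctl->Busy = 0;
    ctl->OwnerDeviceId = -1;
    ctl->Head = 0;
    ctl->Count = 0;
}

/* Returns 1 if ownership was granted at once, 0 if queued, -1 if the queue is full. */
int SketchAllocateController(SketchController *ctl, int deviceId,
                             SketchControllerControl routine, void *context)
{
    if (!ctl->Busy) {
        ctl->Busy = 1;
        ctl->OwnerDeviceId = deviceId;
        routine(deviceId, context);          /* step 3: start the operation */
        return 1;
    }
    if (ctl->Count == SKETCH_MAX_WAITERS)
        return -1;
    {
        SketchWaiter *slot =
            &ctl->Queue[(ctl->Head + ctl->Count) % SKETCH_MAX_WAITERS];
        slot->DeviceId = deviceId;
        slot->Routine = routine;
        slot->Context = context;
        ctl->Count++;
    }
    return 0;
}

/* Step 4: release the token; the oldest waiting request, if any, gets it next. */
void SketchFreeController(SketchController *ctl)
{
    if (!ctl->Busy)
        return;
    if (ctl->Count > 0) {
        SketchWaiter next = ctl->Queue[ctl->Head];
        ctl->Head = (ctl->Head + 1) % SKETCH_MAX_WAITERS;
        ctl->Count--;
        ctl->OwnerDeviceId = next.DeviceId;
        next.Routine(next.DeviceId, next.Context);
    } else {
        ctl->Busy = 0;
        ctl->OwnerDeviceId = -1;
    }
}
```

Note that in this model a request on hold does not spin or sleep; its callback simply runs later, when the current owner frees the controller.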

Layout of a Controller Object
Figure 4.4 shows the relationship of a Controller object to other system data
structures.
The only externally visible field in a Controller object is the PVOID
ControllerExtension field, which contains a pointer to the extension block.

Manipulating Controller Objects
The I/O Manager exports four functions that operate on Controller objects.
These functions are listed in Table 4.8.

Controller Extensions
Like Device objects, Controller objects contain a pointer to an Extension
structure that you can use to hold any controller-specific data. The Extension is
also a place to store any information that's global to all the devices attached to a
controller. Finally, if the controller (rather than individual devices) is the source of
interrupts, it makes sense to store pointers to Interrupt and Adapter objects in the
Controller Extension.

[Figure 4.4: Structure of a Controller object. Diagram labels: Device object,
Device Extension, ControllerExtension.]

Since the Controller Extension is driver-specific, you'll need to define its
structure in one of your driver's header files. Although the Extension's exact
contents will depend on what your driver does, its general layout will look
something like this:

    typedef struct _CONTROLLER_EXTENSION {
        PCONTROLLER_OBJECT ControllerObject;
        //
        // Other driver-specific declarations
        //
    } CONTROLLER_EXTENSION, *PCONTROLLER_EXTENSION;

Table 4.8  Access functions for a Controller object

Controller object access functions

Function               Description                                  Called by...
---------------------  -------------------------------------------  ------------
IoCreateController     Creates a Controller object                  DriverEntry
IoAllocateController   Requests exclusive ownership of controller   Start I/O
IoFreeController       Releases ownership of controller             DpcForIsr
IoDeleteController     Removes Controller object from the system    Unload

4.6 ADAPTER OBJECTS
Just as multiple devices on the same controller need to coordinate their hardware
access, so devices that perform DMA need an orderly way to share system DMA
resources. The I/O Manager uses Adapter objects to prevent arguments over
DMA hardware. There is one Adapter object for each DMA data transfer channel
on the system.
Like a Controller object, an Adapter object can be owned by only one device
at a time. Before starting a DMA transfer, the Start I/O routine asks for ownership
of the Adapter object. If the hardware is free, ownership is granted. If not, the
device's request is put on hold until the current owner releases the hardware.
Obviously, if your device supports only programmed I/O, you don't need to
bother with Adapter objects. Here's how Adapter objects work.

1. The HAL creates Adapter objects for any DMA data channels detected at
   bootstrap time.

2. The DriverEntry routine locates the Adapter object for its device and stores a
   pointer in the Device or Controller Extension. Adapter objects for unrecognized
   DMA hardware may be created on the fly at this point.

3. The Start I/O routine requests ownership of the Adapter object on behalf of a
   specific device.

4. When ownership is granted, the I/O Manager calls the driver's Adapter
   Control routine. This routine then uses the Adapter object to set up a DMA
   transfer.

5. The driver's DpcForIsr routine may use the Adapter object to perform additional
   operations in the case of a split transfer. When a transfer is finished,
   DpcForIsr releases the Adapter object.

Another important function of the Adapter object is to manage some things
called mapping registers. The HAL uses these registers to map the scattered physical
pages of a user's buffer onto the contiguous range of addresses required by
most DMA hardware. If that statement doesn't make any sense to you, don't
worry. We'll be looking at the mechanics of DMA transfers in much greater detail
in Chapter 12.
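
As a rough illustration of why the count of mapping registers matters, the sketch below estimates how many a transfer would need. It rests on two assumptions that are not stated in this section: that one mapping register covers one page, and that pages are 4 KB. The `Sketch...` names are hypothetical, and Chapter 12 remains the place for the real mechanics.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u   /* assumption: 4 KB pages */

/* Number of pages touched by a buffer that starts at va and is length bytes long. */
size_t SketchPagesSpanned(uintptr_t va, size_t length)
{
    uintptr_t firstPage;
    uintptr_t lastPage;

    if (length == 0)
        return 0;
    firstPage = va / SKETCH_PAGE_SIZE;
    lastPage  = (va + length - 1) / SKETCH_PAGE_SIZE;
    return (size_t)(lastPage - firstPage + 1);
}

/*
 * Registers usable in one pass. If the buffer needs more than the Adapter
 * object offers, the transfer has to be split into several passes.
 */
size_t SketchMapRegistersForPass(uintptr_t va, size_t length, size_t available)
{
    size_t needed = SketchPagesSpanned(va, length);
    return needed <= available ? needed : available;
}
```

A buffer that is not page-aligned can span one more page than its length alone suggests, which is why the starting address is part of the calculation.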

Layout of an Adapter Object
Figure 4.5 illustrates the relationship of Adapter objects to other structures.
As you can see from the diagram, the Adapter object is completely opaque and
has no externally visible fields. If you're working with DMA devices, you should
store the pointer to your Adapter object, as well as the number of mapping registers
it supports, either in a Device or Controller Extension.

[Figure 4.5: Structure of an Adapter object. Diagram labels: AdapterPtr,
MapRegCount.]

Manipulating Adapter Objects
Both the HAL and the I/O Manager export functions that you can use to
manipulate Adapter objects. Table 4.9 lists the ones you're most likely to encounter.

Table 4.9  Access functions for an Adapter object

Adapter object access functions

Function                   Description                         Called by...
-------------------------  ----------------------------------  ---------------------------
HalGetAdapter              Gets a pointer to an Adapter        DriverEntry
                           object
IoAllocateAdapterChannel   Requests exclusive ownership of     Start I/O (Controller
                           DMA hardware                        Control)
IoMapTransfer              Sets up DMA hardware for a data     Adapter Control / DpcForIsr
                           transfer
IoFlushAdapterBuffers      Flushes data after partial          DpcForIsr
                           transfers
IoFreeMapRegisters         Releases map registers              DpcForIsr
IoFreeAdapterChannel       Releases Adapter object             DpcForIsr

4.7 INTERRUPT OBJECTS
That brings us to the last of the NT objects we'll be looking at in this chapter, the
Interrupt object. Interrupt objects simply give the Kernel's interrupt dispatcher a
way to find the right service routine when an interrupt occurs. Here's how Interrupt
objects are used.

1. The DriverEntry routine creates an Interrupt object for each interrupt vector
   supported by the device or the Controller.

2. When an interrupt occurs, the Kernel's interrupt dispatcher uses the Interrupt
   object to locate the Interrupt Service routine.

3. The Unload routine deletes the Interrupt object after disabling interrupts
   from the device.

Other than creating and deleting them, your driver has very little direct
interaction with Interrupt objects. You will, however, need to store a pointer to the
Interrupt object in a convenient place like the Device or Controller Extension.

Layout of an Interrupt Object
Figure 4.6 illustrates the structure of an Interrupt object. Like Adapter
objects, Interrupt objects are completely opaque and have no externally visible fields.

[Figure 4.6: Structure of an Interrupt object. Diagram label: InterruptPtr.]

Table 4.10  Access functions for an Interrupt object

Interrupt object access functions

Function                 Description                            Called by...
-----------------------  -------------------------------------  ------------
HalGetInterruptVector    Converts bus-relative interrupt        DriverEntry
                         vector to systemwide value
IoConnectInterrupt       Associates Interrupt Service routine   DriverEntry
                         with a system interrupt vector
KeSynchronizeExecution   Synchronizes driver routines that      (Various)
                         run at different IRQLs
IoDisconnectInterrupt    Removes Interrupt object               Unload

Manipulating Interrupt Objects
Several system components export functions that work with Interrupt
objects. Table 4.10 lists the most common ones.

4.8 SUMMARY
Although it may seem as if there are a lot of objects involved in I/O processing,
they're all necessary and important. If you're feeling a little overwhelmed with all
this background material, you can relax. The next chapter will show you how to
put this information to work as we start writing some actual driver code.

CHAPTER 5

General Development Issues

Writing kernel-mode code is not the same as
writing an application program. Because your driver is a trusted component of
the system, you have to be much more careful about how you behave. This chapter
is a short manual of good etiquette for driver writers.

5.1 DRIVER DESIGN STRATEGIES
Like most other kinds of software, drivers benefit from an organized approach to
development. This section gives some guidelines that may help shorten development
time.

Use Formal Design Methods
There's a certain cowboy mentality that pervades the driver-writing world.
For some reason, it's easy to think that you can just sit down, scribble a flowchart
on an old candy wrapper, and just start coding. Unfortunately, when you're dealing
with a full-duplex driver for some asynchronous communication device, such
ad hoc methods just don't work. So many things are going on that it becomes
impossible to verify the flow of control.
A better approach is to use techniques that have proven helpful in other
areas of real-time design. Some suggestions follow.

• Data flow diagrams can help you break your driver into discrete functional
  units. These diagrams make it easier to visualize how the functional
  units in your driver relate to each other, and how they transform
  input data into output data.

• State-machine models are another good way to describe the flow of control
  in a driver, especially one that manages an elaborate hardware or
  software protocol. In the process of verifying the state machine, you can
  also ferret out synchronization issues within the driver.

• An analysis of expected data repetition rates or mandatory input-to-output
  response will give you a set of quantitative timing requirements.
  These are important when it comes time to tune the driver.

• Another useful tool is an explicit list of external events and the driver
  actions these events should trigger. This list ought to include both hardware
  events from the device and I/O requests from users.

Using these techniques will help you to separate your driver into well-defined
functional units, which makes the driver easier to develop. In some
cases, this might even mean breaking a single driver into a pair of port and class
drivers that handle hardware-dependent and hardware-independent functions.
In any event, the time you spend analyzing and designing your driver at the
start of the project will more than pay for itself in reduced debugging and
maintenance.
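
One way to make the state-machine and event-list suggestions concrete is a transition table that can be checked before any hardware code exists. The states, events, and transitions below are invented for illustration only; a real device would have its own.

```c
#include <assert.h>

/* Hypothetical states of a simple one-request-at-a-time device. */
typedef enum { ST_IDLE, ST_BUSY, ST_ERROR, ST_COUNT } SketchState;

/* Hypothetical external events: user requests and hardware events. */
typedef enum {
    EV_START_REQUEST,   /* I/O request arrives from a user */
    EV_INTERRUPT,       /* device signals completion */
    EV_TIMEOUT,         /* device failed to respond in time */
    EV_RESET,           /* driver resets the device */
    EV_COUNT
} SketchEvent;

/* Rows are current states, columns are events, entries are next states. */
static const SketchState SketchTransitions[ST_COUNT][EV_COUNT] = {
    /*              START     INTERRUPT  TIMEOUT   RESET   */
    /* IDLE  */ { ST_BUSY,  ST_IDLE,   ST_IDLE,  ST_IDLE },
    /* BUSY  */ { ST_BUSY,  ST_IDLE,   ST_ERROR, ST_IDLE },
    /* ERROR */ { ST_ERROR, ST_ERROR,  ST_ERROR, ST_IDLE },
};

SketchState SketchNextState(SketchState state, SketchEvent event)
{
    assert((int)state >= 0 && state < ST_COUNT);
    assert((int)event >= 0 && event < EV_COUNT);
    return SketchTransitions[state][event];
}
```

Writing the table out forces a decision for every state and event pair, including awkward ones such as an interrupt arriving while the device is idle.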

Use Incremental Development
Once you've completed your initial analysis and design, it's time to start the
actual development. Following the steps below can reduce your debugging time
by helping you to detect problems while they're still easy to find.

1. Decide which kinds of kernel-mode objects your driver will need.

2. Decide on any additional context or state information your driver will need,
   and decide where you're going to store it.

3. Write the DriverEntry and Unload routines. To test the driver at this point, see
   if you can load and unload it using the Control Panel.

4. Add code that finds and allocates the driver's hardware, as well as code to
   deallocate the hardware when the driver unloads. Again, the test is just
   whether you can load and unload the driver using the Control Panel. You can
   also use the Registry editor (REGEDT32) to see whether your driver is allocating
   and deallocating its resources properly.

5. Add driver Dispatch routines that process IRP_MJ_CREATE,
   IRP_MJ_CLOSE, and any other operations that don't require device access.
   You can test the driver with a simple Win32 program that calls CreateFile and
   CloseHandle.

6. Add Dispatch routines that process any other IRP_MJ_XXX function codes.
   Also, add the Start I/O logic but complete each I/O request without starting
   the device. Test these new code paths with a simple Win32 program that
   makes ReadFile and WriteFile calls, as appropriate.

7. Finally, implement the real Start I/O logic, the Interrupt Service routine, and
   the DPC routine. Now you can test the driver using live data.

Another tip: If you're unsure about the exact behavior of the hardware, add
a DeviceIoControl function that gives you direct access to the device registers.
This will allow you to find out how the device really works by writing a few
simple Win32 programs. Just remember to disable this function when you ship
the final version of the driver.

Use the Sample Drivers
The Windows NT device driver kit (DDK) contains a huge body of sample
code in the \DDK\SRC directory tree. There are many ways you can use all this
code to make driver development easier. At the very least, you should read it for
hints, clues, and comments. You might also want to be more direct about cutting
and pasting helpful chunks of code (a procedure encouraged by Microsoft). The
usual warning: If you do decide to cut and paste, make sure you thoroughly
understand the code you're grabbing.

5.2 CODING CONVENTIONS AND TECHNIQUES
Writing a trusted kernel-mode component is not the same as writing an application
program. This section presents some basic conventions and techniques that
will make it easier to code in this environment.

General Recommendations
First of all, here are some general recommendations for things you should
keep in mind when you're writing a driver:

• Avoid the use of assembly language in your driver. It makes the code
  hard to read, nonportable, and difficult to maintain. In those rare situations
  where it's unavoidable, isolate the code in its own module. Whatever
  you do, don't go sprinkling inline assembly throughout your driver.

• If you have any platform-specific code, either put it in its own module, or
  at the very least bracket it with #ifdef/#endif statements.

• Don't link your driver with the standard C runtime library. Some of those
  routines may hold state or context information in ways that are not driver
  safe. Instead use the RtlXxx support routines supplied for drivers.

• Commenting code is a religious issue. Some people swear by it; others
  think that out-of-date comments are worse than no comments at all.¹

• Manage your driver project with some kind of source-code control program.
  This is especially important for larger drivers, or drivers being
  developed by several people.

Naming Conventions
It's a good idea to adopt some standard naming convention for the routines
in your driver. This makes it easier to debug and test the driver during its initial
development. It also simplifies maintenance of the driver should you have to
reacquaint yourself with the code after being away from it for a year. Microsoft
recommends the following:

• Add a driver-specific prefix to each of your routines. Choose one prefix
  for standard driver routines and another, shorter prefix for any internal
  functions.

• Give the routine itself a name that describes what it does.

For example, the mouse class driver supplied with the NT DDK adds the prefix
MouseClass to all its standard routines, which gives names like MouseClassStartIo
and MouseClassUnload. The same class driver uses the prefix Mou for any internal
routines like MouConfiguration and MouConnectToPort.
Regardless of whether you follow these conventions or come up with some
of your own, it's important that you establish some consistent way of naming
your driver routines. When you come back to a driver that you haven't looked at
for six months, uniform naming will make it easier to figure out what you
originally had in mind.

Header Files
NTDDK.H defines all the data types, structures, and constants used by
base-level kernel-mode drivers. SCSI, network, and video drivers use other
header files. Be sure you've included the appropriate headers in your driver.
You can use private header files to hide various hardware and platform
dependencies. For example on 80x86 systems, you can address each byte in I/O
space, but on other architectures, I/O registers may need to be aligned on 4-byte
or 8-byte boundaries. Hiding these differences in a header file means you can
move your driver to a new platform just by redefining some symbols and
rebuilding the driver.

¹ Personally, I attend services at the Church of the Detailed Comment.

Even if your driver doesn't face any of these issues, writing a few register
access macros can make the driver itself easier to read. The following code
fragment is an example of some hardware beautification macros for a parallel port
device. This example assumes that some initialization code in the driver has put
the address of the first device register in the PortBase field of the Device
Extension.

    //
    // Define device registers as relative offsets
    //
    #define PAR_DATA        0
    #define PAR_STATUS      1
    #define PAR_CONTROL     2

    //
    // Define access macros for registers. Each macro takes
    // a pointer to a Device Extension as an argument
    //
    #define ParWriteData( pDevExt, bData )                      \
        ( WRITE_PORT_UCHAR(                                     \
            (pDevExt)->PortBase + PAR_DATA, (bData) ) )

    #define ParReadStatus( pDevExt )                            \
        ( READ_PORT_UCHAR(                                      \
            (pDevExt)->PortBase + PAR_STATUS ) )

    #define ParWriteControl( pDevExt, bData )                   \
        ( WRITE_PORT_UCHAR(                                     \
            (pDevExt)->PortBase + PAR_CONTROL, (bData) ) )

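
To show how such a header can absorb a platform difference like register alignment, here is a user-mode sketch of the same idea. It is not the driver code above: the `Sketch...` names, the `SKETCH_REG_STRIDE` symbol, and the two inline functions that stand in for the port-access routines are all assumptions made so the example can run against an ordinary array instead of real hardware.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Assumption: 1 where registers are byte-adjacent (as on 80x86),
 * 4 or 8 where each register must sit on its own aligned boundary.
 */
#ifndef SKETCH_REG_STRIDE
#define SKETCH_REG_STRIDE 4
#endif

/* Register offsets expressed in units of the platform stride. */
#define SKETCH_PAR_DATA     (0 * SKETCH_REG_STRIDE)
#define SKETCH_PAR_STATUS   (1 * SKETCH_REG_STRIDE)
#define SKETCH_PAR_CONTROL  (2 * SKETCH_REG_STRIDE)

/* Hypothetical extension: PortBase points at a fake register block. */
typedef struct SketchDeviceExtension {
    uint8_t *PortBase;
} SketchDeviceExtension;

/* User-mode stand-ins for the real port-access routines. */
static inline void SketchWritePortUchar(uint8_t *port, uint8_t value)
{
    *port = value;
}

static inline uint8_t SketchReadPortUchar(const uint8_t *port)
{
    return *port;
}

/* Access macros: callers never see the stride or the offsets. */
#define SketchParWriteData(pDevExt, bData) \
    (SketchWritePortUchar((pDevExt)->PortBase + SKETCH_PAR_DATA, (bData)))

#define SketchParReadStatus(pDevExt) \
    (SketchReadPortUchar((pDevExt)->PortBase + SKETCH_PAR_STATUS))

#define SketchParWriteControl(pDevExt, bData) \
    (SketchWritePortUchar((pDevExt)->PortBase + SKETCH_PAR_CONTROL, (bData)))
```

Code written against the three macros stays the same whether the stride is 1, 4, or 8; only the header changes when the driver moves to another platform.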
Status Return Values
The kernel-mode portions of the NT operating system use 32-bit status values to
describe the outcome of any particular operation. The data type of these codes is
NTSTATUS. There are three situations in which you'll need to use these status
codes:

• When you call one of the internal NT functions, it will communicate its
  displeasure at something you're trying to do by returning an NTSTATUS
  value.

• When NT calls some driver-specific callback routines, the routines often
  have to return an NTSTATUS value to the system.

• When you complete the processing of an I/O request, you need to mark it
  with an NTSTATUS value. This value will ultimately be mapped onto a
  Win32 ERROR_XXX code.²

NTSTATUS.H defines symbolic names for a large number of NTSTATUS
values. These names all have the form STATUS_XXX, where XXX describes the
actual status message. STATUS_SUCCESS, STATUS_NAME_EXISTS, and
STATUS_INSUFFICIENT_RESOURCES are all examples of these names.
When you call a system routine that returns an NTSTATUS value, you can
either check for specific values, or you can use the NT_SUCCESS macro to test for
general success or failure. The following code fragment illustrates this technique.

    NTSTATUS status;

    status = IoCreateDevice( ... );
    if( !NT_SUCCESS( status )) {
        // clean up and exit with failure
    }

Always, always, always check the return values you get from any system
routines you call. If you just assume that the call succeeded, your driver may
damage the system somewhere down the line. If you're lucky, this kind of thing
will crash the system and draw attention to itself; if not, it may just produce
sporadic, hard-to-find errors.
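
The difference between testing for one specific value and testing for general success can be seen in a small user-mode sketch. The `Sketch...` names are hypothetical stand-ins, and the numeric values and the sign-based success test are given as I recall them from the NT headers, so treat them as assumptions and check them against NTSTATUS.H and NTDDK.H before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for NTSTATUS: a signed 32-bit value. */
typedef int32_t SketchStatus;

/*
 * Values as recalled from NTSTATUS.H (assumption). The casts of values above
 * 0x7FFFFFFF rely on the usual two's-complement conversion.
 */
#define SKETCH_STATUS_SUCCESS                 ((SketchStatus)0x00000000)
#define SKETCH_STATUS_PENDING                 ((SketchStatus)0x00000103)
#define SKETCH_STATUS_BUFFER_OVERFLOW         ((SketchStatus)0x80000005)
#define SKETCH_STATUS_INSUFFICIENT_RESOURCES  ((SketchStatus)0xC000009A)

/* General success test: any non-negative status counts as success. */
#define SKETCH_NT_SUCCESS(status) (((SketchStatus)(status)) >= 0)

/* Top two bits: 0 = success, 1 = informational, 2 = warning, 3 = error. */
unsigned SketchSeverity(SketchStatus status)
{
    return (unsigned)(((uint32_t)status) >> 30);
}
```

Under these assumptions, a status such as "pending" passes the general success test even though it is not equal to the plain success code, which is why the two kinds of check are not interchangeable.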

NT Driver Support Routines
The I/O Manager and other kernel-mode components of NT export a large
number of support functions that your driver can call. The reference section of the
NT DDK documentation describes these functions, and you'll see plenty of
examples of their use throughout this book. For the moment, it's enough to point
out that these support routines fall into categories based on the NT module that
exports them. Table 5.1 gives a brief overview of the kinds of support that each
NT module provides.

Table 5.1  Categories of support routines available to drivers

NT driver support routines

Category           Supports ...                    Function names
-----------------  ------------------------------  ------------------
Executive          Memory allocation               ExXxx()
                   Interlocked queues
                   Zones
                   Lookaside lists
                   System worker threads
HAL                Device register access          HalXxx()
                   Bus access
I/O Manager        General driver support          IoXxx()
Kernel             Synchronization                 KeXxx()
                   DPC
Memory Manager     Virtual-to-physical mapping     MmXxx()
                   Memory allocation
Object Manager     Handle management               ObXxx()
Process Manager    System thread management        PsXxx()
Runtime library    String manipulation             RtlXxx() (mostly)
                   Large integer arithmetic
                   Registry access
                   Security functions
                   Time and date functions
                   Queue and list support
Security Monitor   Privilege checking              SeXxx()
                   Security descriptor functions
(All)              Internal system services        ZwXxx()

The ZwXxx functions need a little explanation. These are actually an internal
calling interface for all the NtXxx user-mode system services. The difference
between the user- and kernel-mode interfaces is that the ZwXxx functions don't
perform any argument checking. Although there are a large number of these
functions, the NT DDK reference material describes only a few of them. Microsoft
may eventually tell us about the rest, but for now, limit yourself to using the ones
that show up in the documentation.

² NTSTATUS codes and Win32 error codes are not the same thing. The knowledge base that comes
with the NT DDK has an article that shows the mapping between NTSTATUS values and their
corresponding Win32 ERROR_XXX codes. It's worth taking a look at this article because the mappings
from STATUS_XXX to ERROR_XXX codes don't always make a lot of sense.

One final point. To make life easier for driver writers, the I/O Manager
provides several convenience functions that are really just wrappers around one
or more lower-level calls to other NT modules. These wrappers usually offer a
simpler interface than their low-level counterparts, and you should use them
whenever you can.

Discarding Initialization Routines
Some compilers support the option of declaring certain functions as
discardable. Functions in this category will disappear from memory after your
driver has finished loading, making your driver smaller. If your development
environment offers this feature, you should use it.
Good candidates for discardable functions are DriverEntry and any
subroutines called only by DriverEntry. The following code fragment shows how
to take advantage of discardable code.

    #ifdef ALLOC_PRAGMA
    #pragma alloc_text( init, DriverEntry )
    #pragma alloc_text( init, XxStuffCalledByDriverEntry )
    #pragma alloc_text( init, XxAlsoCalledByDriverEntry )
    #endif

The alloc_text pragma must appear after the function name is declared but
before the function itself is defined, so remember to prototype the function at
the top of the code module (or in a suitable header file). Also, functions referenced
in the pragma statement must be defined in the same compilation unit as the
pragma. If you don't follow these rules, things break.
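The ordering rule is easy to get wrong, so here is a minimal sketch of it. The routine names and their plain int return types are hypothetical stand-ins chosen so the fragment compiles on its own; they are not DDK definitions. The sketch also assumes that ALLOC_PRAGMA is defined only by build environments that support the pragma, so other compilers simply skip it.

```c
#include <assert.h>

/* Hypothetical routines (names and int return types are illustrative,
 * not DDK definitions). Step 1: prototype before the pragma. */
int XxInitialize(void);
int XxReadSettings(void);

/* Step 2: the pragma follows the declarations and precedes the bodies.
 * ALLOC_PRAGMA is assumed to be defined only where the pragma works. */
#ifdef ALLOC_PRAGMA
#pragma alloc_text( init, XxInitialize )
#pragma alloc_text( init, XxReadSettings )
#endif

/* Step 3: the definitions live in the same compilation unit. */
int XxReadSettings(void)
{
    return 0;                 /* stand-in for reading configuration */
}

int XxInitialize(void)
{
    int status = XxReadSettings();
    assert(status == 0);      /* the stand-in above always succeeds */
    return status;
}
```

In a real driver the same three-step layout applies to DriverEntry and its helpers, with their actual DDK signatures.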

Controlling Driver Paging
Nonpaged system memory is a precious resource. You can further reduce
the burden your driver puts on nonpaged memory by putting appropriate
routines in paged memory. Any function that executes only at PASSIVE_LEVEL
IRQL can be paged. This includes Reinitialize routines, Unload and Shutdown
routines, Dispatch routines, thread functions, and any helper functions running
exclusively at PASSIVE_LEVEL IRQL. Once again, it's the alloc_text pragma that
performs this little miracle. Here's an example:

    #ifdef ALLOC_PRAGMA
    #pragma alloc_text( page, XxUnload )
    #pragma alloc_text( page, XxShutdown )
    #pragma alloc_text( page, XxDispatchRead )
    #pragma alloc_text( page, XxDispatchHelper )
    #endif

Finally, there's another trick you can play if you have a seldom-used device
driver and you want to get it out of the way. By calling the MmPageEntireDriver
function, you can override a driver's declared memory management attributes and
make the whole thing temporarily paged. Call this function at the end of the
DriverEntry routine and from the Dispatch routine for IRP_MJ_CLOSE when there
are no more open handles to any of your devices. Call MmResetDriverPaging from
the IRP_MJ_CREATE Dispatch routine to make the driver's page attributes revert
to normal.
If you use this technique, watch out for two things. First, make sure there
aren't any IRPs being processed by high-IRQL portions of the driver when you
make everything paged. Second, be certain that no device interrupts will arrive
while the driver's ISR is paged. Handling these details is left as an exercise for the
reader.

5.3 DRIVER MEMORY ALLOCATION
Just like application programs, drivers may need to allocate temporary storage
from time to time. Unfortunately, drivers don't have the luxury of making simple
calls to malloc and free. Instead, they have to be extremely careful about what
kind of memory they allocate and how much of it they use. Drivers must also be
sure to release any memory they may be holding, since there's no automatic
cleanup mechanism for kernel-mode code. This section describes techniques your
driver can use to work with temporary storage.

Memory Available to Drivers
You have three options when you need to allocate temporary storage in a
driver. Which one you select will depend on how long you plan to keep the data
around and what IRQL level your code is running at. You can choose from the
following:
•  Kernel stack - The kernel stack provides limited amounts of nonpaged
   storage for local variables during the execution of specific driver routines.

•  Paged pool - Driver routines running below DISPATCH_LEVEL IRQL
   can use a heap area called paged pool. As its name implies, memory in this
   area is pageable, and a page fault can occur when you touch it.

•  Nonpaged pool - Driver routines running at elevated IRQLs need to
   allocate temporary storage from another heap area called nonpaged pool.
   The system guarantees that the virtual memory in nonpaged pool is
   always physically resident. The Device and Controller Extensions created
   by the I/O Manager come from this pool area.

Global variables are absent from this list because they introduce major
synchronization problems. The problem is that everyone using a given driver is
sharing the same copy of the driver's code and global data. Since a driver might be
processing multiple requests at the same time, the contents of unprotected global
variables can become unpredictable.
Local static variables in a driver subroutine are just as bad. Don't try using
them to maintain state information between calls to a function. There's no
guarantee that two successive calls to a driver routine will be made in the context
of the same I/O request.
After saying that, it's worth pointing out that global variables can be helpful
for storing read-only parameters that affect the overall behavior of the driver. For
example, your DriverEntry routine might pull a value from the Registry that
controls the amount of detail you report to the error log. Storing this value in a
global variable is acceptable since it will essentially be constant for the life of the
driver. You could use a similar strategy for turning the collection of driver
performance data on and off.

Working with the Kernel Stack
On 80x86 and MIPS platforms, the kernel stack is only 12 kilobytes long. On
Alpha and PowerPC systems, the size is 16 kilobytes. This isn't a lot of space, so
be careful how you use the kernel stack. Dreadful things will happen if you run
out of space. You can avoid kernel stack overflow by following these guidelines.
•  Don't design your driver in such a way that internal routines need to
   make deeply-nested calls to one another. Try to keep the calling tree as
   flat as possible.

•  If any of your routines call themselves recursively, make sure you limit
   the depth of recursion. Drivers are not the place to be calculating
   Fibonacci numbers.

•  Don't build large temporary data structures on the kernel stack. Use one
   of the pool areas instead.

Another characteristic of the kernel stack is that it lives in cached memory.
This means you shouldn't use temporary buffers on the stack for DMA
operations. Instead, your driver should allocate the buffer from nonpaged pool.
Chapter 12 will describe DMA caching issues in more detail.

Working with the Pool Areas
Remember that kernel-mode drivers can't allocate memory by making calls
to malloc. Instead, they have to use the ExAllocatePool and ExFreePool
functions. These functions allocate the following kinds of memory:

•  NonPagedPool - Memory available to driver routines running at or
   above DISPATCH_LEVEL IRQL.

•  NonPagedPoolMustSucceed - Temporary memory that is crucial to
   the driver's continuing operation. If the allocation fails, the system will
   bugcheck. Use this memory for emergencies only and release it as quickly
   as possible.

•  NonPagedPoolCacheAligned - Memory that's guaranteed to be aligned
   on the natural boundary of a CPU data-cache line. A driver might use this
   kind of memory for a permanent I/O buffer.

•  NonPagedPoolCacheAlignedMustS - Storage for a temporary I/O
   buffer that is crucial to the operation of the driver.

•  PagedPool - Memory available only to driver routines running below
   DISPATCH_LEVEL IRQL. Normally, this includes the driver's
   initialization, cleanup, and Dispatch routines and any system threads the
   driver is using.

•  PagedPoolCacheAligned - I/O buffer memory used by file system
   drivers.

There are several things to keep in mind when you're working with the
system memory areas. First and foremost, the pools are precious system
resources, and you shouldn't be too extravagant in their use. This is especially
true of the NonPaged and MustSucceed pool areas.
Second, your driver must be executing at or below DISPATCH_LEVEL
IRQL when you allocate or free nonpaged memory, and at or below APC_LEVEL
IRQL to allocate or free paged pool.
Finally, release any memory you've allocated as soon as you have finished using
it. Otherwise, the system may start to perform badly because of low memory
conditions. In particular, be very sure to give back any pool memory when your
driver is unloaded.
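The IRQL rule for pool calls can be written down as a small predicate. What follows is a user-mode model, not DDK code: the ModelIrql and ModelPool names are invented for illustration, and the numeric levels are assumed values that only preserve the ordering PASSIVE < APC < DISPATCH < device levels.

```c
#include <assert.h>
#include <stdbool.h>

/* Model values only: they preserve the ordering of the real IRQLs
 * but are not taken from any DDK header. */
typedef enum {
    MODEL_PASSIVE_LEVEL  = 0,
    MODEL_APC_LEVEL      = 1,
    MODEL_DISPATCH_LEVEL = 2,
    MODEL_DEVICE_LEVEL   = 3      /* stands for any DIRQL */
} ModelIrql;

typedef enum { MODEL_NONPAGED, MODEL_PAGED } ModelPool;

/* May a routine running at 'irql' allocate or free memory in 'pool'? */
bool model_pool_call_allowed(ModelPool pool, ModelIrql irql)
{
    if (pool == MODEL_PAGED)
        return irql <= MODEL_APC_LEVEL;      /* paged: APC_LEVEL or below */
    return irql <= MODEL_DISPATCH_LEVEL;     /* nonpaged: DISPATCH or below */
}

/* Debug-style guard a routine could place ahead of a pool call. */
void model_check_pool_call(ModelPool pool, ModelIrql irql)
{
    assert(model_pool_call_allowed(pool, irql));
}
```

A check of this kind costs nothing in a free build and catches a whole class of mistakes during testing.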

System Support for Memory Suballocation
Generally, you should avoid driver designs that constantly allocate and
release blocks of pool memory smaller than PAGE_SIZE bytes. This kind of
behavior causes fragmentation of the pool areas and can make it impossible for
other parts of NT to allocate memory. Instead, if your driver needs to create and
destroy lots of little dynamic data structures, you should allocate a single, large
chunk of pool and write your own suballocation routines to carve it up.
Some kinds of drivers need to manage a collection of small, fixed-size
memory blocks. For example, SCSI class drivers maintain a supply of SCSI
Request Blocks (SRBs) which they use repeatedly to send commands to any
devices under their control. If your driver needs to do something similar, the
system provides two different mechanisms you can use to handle all the details of
suballocation.

Zone buffers A zone buffer is just a chunk of driver-allocated pool. By
calling various Executive routines, your driver can use the zone buffer to manage
collections of fixed-size blocks in paged or nonpaged memory.
If you plan to access a zone buffer at or above DISPATCH_LEVEL IRQL, you
must also set up an Executive spin lock to guard it and use the interlocked
versions of the zone management functions. Zone buffers used only below
DISPATCH_LEVEL IRQL can be guarded with a Fast Mutex. [3] In this case, use the
noninterlocked set of functions.
To set up a zone buffer, you must declare a structure of type ZONE_HEADER.
You may also need to declare and initialize a spin lock or Fast Mutex object. Then
follow these steps to manage the zone buffer.

1. Call ExAllocatePool to claim space for the zone buffer itself. Then initialize
   the zone buffer with ExInitializeZone. Both these steps are normally
   performed in your DriverEntry routine.

2. To allocate a block from a zone, call either ExAllocateFromZone or
   ExInterlockedAllocateFromZone. The interlocked version of the function
   uses a spin lock to synchronize access to the zone buffer. The noninterlocked
   function leaves synchronization entirely up to your driver.

3. To release a block back to the zone, use either ExFreeToZone or
   ExInterlockedFreeToZone. Again, the interlocked version of the function
   synchronizes access to the zone, while the noninterlocked version does not.

4. In your driver Unload routine, use ExFreePool to release the memory used
   for the zone buffer. Your driver has to make sure that no blocks from the zone
   buffer are in use when you deallocate the zone buffer.

Zone buffers that are too large put a strain on the system's memory
resources, so don't make a zone buffer any bigger than necessary. Try to pick a size
that will allow your driver to handle the I/O demand level you expect on an
average system. This is a more system-friendly approach than making the zone
buffer big enough to handle the worst possible case.
If you're feeling really clever, you can try to base the size of your zone buffer
on the characteristics of the local system. MmQuerySystemSize will give you a
hint about the total amount of memory available. Systems with more memory can
support larger zone buffers. MmIsThisAnNtAsSystem will tell you whether your
driver is running under Windows NT Workstation or Server. Servers are likely to
have more memory and higher I/O demand levels. Calling these functions in your
DriverEntry routine may help you pick an appropriate zone buffer size.
If you try to allocate a block from a zone buffer and the allocation fails, your
driver should use ExAllocatePool (or ExAllocatePoolWithTag) to get the block
from one of the pool areas instead. To use this strategy, you'll need some kind of
flag bit in the allocated structure to indicate whether it came from the zone buffer
or from the general pool; otherwise you won't know what function to call when
you want to release the block.
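The fallback strategy can be sketched in ordinary user-mode C. This is a model of the idea only: ModelZone, ModelBlock, and the model_ functions are hypothetical names, malloc and free stand in for the general pool, and the model performs no synchronization at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_ZONE_BLOCKS 4            /* deliberately tiny for the demo */

typedef struct ModelBlock {
    struct ModelBlock *next;           /* free-list link while unused     */
    bool fromZone;                     /* flag: zone block or pool block? */
    char payload[32];
} ModelBlock;

typedef struct {
    ModelBlock storage[MODEL_ZONE_BLOCKS];   /* the preallocated chunk */
    ModelBlock *freeList;
} ModelZone;

/* Thread every block of the chunk onto the free list. */
void model_zone_init(ModelZone *z)
{
    z->freeList = NULL;
    for (size_t i = 0; i < MODEL_ZONE_BLOCKS; i++) {
        z->storage[i].fromZone = true;
        z->storage[i].next = z->freeList;
        z->freeList = &z->storage[i];
    }
}

/* Take a block from the zone; fall back to the heap when it is empty. */
ModelBlock *model_block_alloc(ModelZone *z)
{
    ModelBlock *b = z->freeList;
    if (b != NULL) {
        z->freeList = b->next;
        return b;
    }
    b = malloc(sizeof *b);
    if (b != NULL)
        b->fromZone = false;           /* remember where it came from */
    return b;
}

/* The flag decides which release path to take. */
void model_block_free(ModelZone *z, ModelBlock *b)
{
    assert(b != NULL);
    if (b->fromZone) {
        b->next = z->freeList;
        z->freeList = b;
    } else {
        free(b);
    }
}
```

Keeping the flag inside the block means the caller never has to track the origin of a block separately.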

[3] Spin locks are described later in this chapter. Fast Mutexes show up in Chapter 14.

You can make an existing zone buffer larger by calling ExExtendZone or
ExInterlockedExtendZone, but this is generally a bad thing to do. If you enlarge a
zone buffer this way, the additional memory that the system gives to the zone will
not be reclaimed until the next bootstrap. Don't do this unless the performance
gains from using zone allocation (compared to repeated ExAllocatePool calls)
significantly outweigh the damage it does to the system.

Lookaside lists  Windows NT 4.0 provides a more efficient mechanism
called a lookaside list for managing driver-allocated memory. A lookaside list is a
linked list of fixed-size memory blocks. Unlike zone buffers, lookaside lists can
grow and shrink dynamically in response to changing system conditions.
Therefore, properly-sized lookaside lists are less likely to waste memory than zone
buffers are.
Compared to zone buffers, the synchronization mechanism used with
lookaside lists is also more efficient. If the CPU architecture has an 8-byte compare
exchange instruction, the Executive uses it to guard access to the list. On
platforms without such an instruction, it reverts to using a spin lock for lookaside lists
in nonpaged pool and a Fast Mutex for lists in paged pool. Since most common
platforms do have the necessary compare exchange instruction, lookaside lists
have lower synchronization latency than zone buffers.
To use a lookaside list, you need to declare a header structure of type
NPAGED_LOOKASIDE_LIST or PAGED_LOOKASIDE_LIST (depending on
whether your list will be nonpaged or paged). Then follow these steps to manage
the lookaside list.

1. Use one of the ExInitializeXxxLookasideList functions to initialize the list
   header structure. [4] Normally, this is done in your DriverEntry routine.

2. Call ExAllocateFromXxxLookasideList to allocate a block from a lookaside
   list.

3. Call ExFreeToXxxLookasideList when you want to release a block.

4. Use ExDeleteXxxLookasideList to release any resources associated with
   the lookaside list. Usually, this is something you do in the driver's Unload
   routine.

The operation of lookaside lists is rather interesting and deserves a little
attention. For starters, the ExInitializeXxxLookasideList functions just set up the
list header; they don't actually allocate any memory for the list. When you call
one of these initialization functions, you can specify the maximum number of
blocks that the list can hold. (This is referred to as the depth of the list.) You can
also pass pointers to memory allocation and deallocation routines in your driver.
The system will call these functions when it needs to add or remove memory from
the list. [5]

[4] In this series of instructions, replace the Xxx in the function name with either
NPaged or Paged, depending on the location of the list.
Later, when you call one of the ExAllocateFromXxxLookasideList
functions, the system allocates memory as needed. As you release blocks with
ExFreeToXxxLookasideList, they are added to the lookaside list until it reaches its
maximum allowable depth. At that point, any additional calls to
ExFreeToXxxLookasideList result in memory being released back to the system. This behavior
guarantees that, after a while, the number of available blocks in the lookaside list
will tend to remain near the depth of the list.
You should choose the depth value very carefully. If it's too shallow, the
system will be performing expensive allocation and deallocation operations too
often. If it's too deep, you'll be wasting memory by tying it up in the list and not
using it. The statistics maintained in the list header structure can help you
determine a proper value for the depth of the list.
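The effect of the depth value is easier to see in a small model. The following user-mode sketch is not the Executive's implementation: ModelLookaside and its functions are hypothetical names, malloc and free stand in for the pool routines, there is no synchronization, and the two miss counters are simply an assumed analogue of the statistics kept in the real list header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct ModelEntry { struct ModelEntry *next; } ModelEntry;

typedef struct {
    ModelEntry *head;       /* cached free blocks                     */
    unsigned count;         /* blocks currently cached                */
    unsigned depth;         /* maximum number of blocks to cache      */
    size_t blockSize;
    unsigned allocMisses;   /* allocations that had to hit the heap   */
    unsigned freeMisses;    /* frees that went back to the heap       */
} ModelLookaside;

/* Set up the header only; no blocks are allocated yet. */
void model_lookaside_init(ModelLookaside *l, size_t blockSize, unsigned depth)
{
    assert(blockSize >= sizeof(ModelEntry));
    l->head = NULL;
    l->count = 0;
    l->depth = depth;
    l->blockSize = blockSize;
    l->allocMisses = 0;
    l->freeMisses = 0;
}

/* Reuse a cached block if there is one, otherwise allocate. */
void *model_lookaside_alloc(ModelLookaside *l)
{
    ModelEntry *e = l->head;
    if (e != NULL) {
        l->head = e->next;
        l->count--;
        return e;
    }
    l->allocMisses++;
    return malloc(l->blockSize);
}

/* Cache the block until the list is full, then really release it. */
void model_lookaside_free(ModelLookaside *l, void *block)
{
    if (l->count < l->depth) {
        ModelEntry *e = block;
        e->next = l->head;
        l->head = e;
        l->count++;
    } else {
        l->freeMisses++;
        free(block);
    }
}

/* Release whatever is still cached. */
void model_lookaside_delete(ModelLookaside *l)
{
    while (l->head != NULL) {
        ModelEntry *e = l->head;
        l->head = e->next;
        free(e);
    }
    l->count = 0;
}
```

If allocMisses and freeMisses both keep climbing under a steady load, the depth is probably too shallow; if the list is always full and rarely drained, it is probably too deep.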

5.4 UNICODE STRINGS
All character strings in the NT operating system are stored internally as Unicode.
The Unicode scheme uses 16 bits to represent each character and makes it easier
to move NT to language environments not based on the Latin alphabet. Unless
otherwise noted, any character strings your driver sends to or receives from NT
will be Unicode. [6]

Unicode String Datatypes
When you're working with Unicode, remember to do the following:
•  Prefix Unicode string constants with the letter L to let the compiler know
   you want wide characters. For example, L"some text" generates Unicode
   text, whereas "some text" produces 8-bit ANSI.

•  Use the WCHAR data type for Unicode characters and PWSTR to point to
   an array of Unicode characters.

•  Use the constant UNICODE_NULL to terminate a Unicode string.

Many NT system routines work with counted Unicode strings described by
a UNICODE_STRING structure (see Table 5.2 for the contents).

[5] If you don't pass the addresses of driver-defined memory management functions, the system uses
ExAllocatePoolWithTag and ExFreePool by default.
[6] Note that this does not include data passed between a user's buffer and a device, unless the
device specifically works with Unicode.

Table 5.2  This structure defines the basic string object used by drivers

UNICODE_STRING, *PUNICODE_STRING

Field                  Contents
---------------------  -----------------------------------------------------
USHORT Length          Current string length, in bytes
USHORT MaximumLength   Maximum string length, in bytes
PWSTR Buffer           Address of driver-allocated buffer holding the string

It's up to you to allocate memory for the string buffer itself. If the Buffer
field points to a NULL-terminated string, the Length field does not include the
NULL character. Notice that the two length fields in the UNICODE_STRING
structure specify a count in bytes, not characters.
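The byte-versus-character distinction is worth a short worked example. The sketch below is a user-mode model rather than the real structure: ModelUnicodeString mirrors the three fields of Table 5.2, ModelWchar is an assumed 16-bit character type (the host compiler's wchar_t may be wider), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint16_t ModelWchar;        /* 16-bit character, as described above */

typedef struct {
    uint16_t Length;                /* bytes in use, terminator excluded */
    uint16_t MaximumLength;         /* bytes available in Buffer         */
    ModelWchar *Buffer;
} ModelUnicodeString;

/* Describe an existing zero-terminated 16-bit string without copying it. */
void model_init_unicode_string(ModelUnicodeString *s, ModelWchar *text)
{
    size_t chars = 0;

    assert(s != NULL && text != NULL);
    while (text[chars] != 0)
        chars++;
    assert(chars <= 32766);         /* both byte counts must fit in 16 bits */

    s->Length = (uint16_t)(chars * sizeof(ModelWchar));
    s->MaximumLength = (uint16_t)((chars + 1) * sizeof(ModelWchar));
    s->Buffer = text;
}

/* Convert the byte count back into a character count. */
size_t model_char_count(const ModelUnicodeString *s)
{
    return s->Length / sizeof(ModelWchar);
}
```

For the two-character string "NT", this model records a Length of 4 bytes and a MaximumLength of 6 bytes, the extra 2 bytes being the terminator.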

Working with Unicode
The NT runtime library provides a number of functions for working with
ANSI and Unicode strings. Table 5.3 presents a few of them. See the
documentation for a complete list. Some of these functions have restrictions on the IRQL
levels from which they can be called, so be careful when you're using them.

Table 5.3  The NT runtime library provides these Unicode manipulation functions

Unicode string manipulation functions

Function                        Description
------------------------------  ----------------------------------------------
RtlInitUnicodeString            Initializes a UNICODE_STRING from a
                                NULL-terminated Unicode string
RtlAnsiStringToUnicodeSize      Calculates number of bytes required to hold a
                                converted ANSI string
RtlAnsiStringToUnicodeString    Converts ANSI string to Unicode
RtlIntegerToUnicodeString       Converts an integer to Unicode text
RtlAppendUnicodeStringToString  Concatenates two Unicode strings
RtlCopyUnicodeString            Copies a source string to a destination
RtlUpcaseUnicodeString          Converts Unicode string to uppercase
RtlCompareUnicodeString         Compares two Unicode strings
RtlEqualUnicodeString           Tests equality of two Unicode strings

If you've never worked with Unicode before, you may have some
programming habits that will cause you problems. Most of them result from
making the assumption that a character and a byte are the same size. Watch out
for the following when you start working with Unicode:

•  Remember that the number of characters in a Unicode string is not the
   same as the number of bytes. Be very careful about any arithmetic you do
   that calculates the length of a Unicode string.

•  Don't assume anything about the collating sequence of the characters or
   the relationship of upper- and lowercase characters.

•  Don't assume that a table with 256 entries is large enough to hold the
   entire character set.

5.5 INTERRUPT SYNCHRONIZATION
Writing code that executes at multiple IRQL levels requires some attention to
proper synchronization. This section examines the issues that arise in this kind of
environment.

The Problem
If code executing at two different IRQLs attempts to access the same data
structure simultaneously, the structure can be corrupted. Figure 5.1 illustrates the
details of this synchronization problem.

[Figure 5.1  Data structures can be corrupted by unsynchronized access. The
diagram shows code executing foo.x = 10; foo.y = 20; against a structure with
the fields int x; and int y;. Copyright © 1994 by Cydonix Corporation.]

To see the exact problem, consider this sequence of events:

1. Imagine that some piece of code executing at a low IRQL decides to modify
   several fields in the foo data structure. It gets as far as setting the field foo.x to 1.

2. Suddenly an interrupt occurs, and a higher-IRQL piece of code gets control of
   the CPU. This code also decides to modify foo, and it sets foo.x to 10 and
   foo.y to 20.

3. The higher-IRQL code dismisses its interrupt, and control returns to the lower
   IRQL routine which finishes its modifications to foo by setting foo.y to 2. The
   lower-IRQL code is completely unaware that it was interrupted.

4. The foo structure is now corrupted, with 10 in x and 2 in y.

In the following sections, you'll see some techniques your driver can use to
avoid these kinds of collisions.
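The four-step sequence can be replayed deterministically in plain C. This is only a simulation of the interleaving: the interrupt is modeled as an ordinary function call placed between the two stores, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int x; int y; } Foo;

/* High-IRQL writer: runs to completion once it gets the CPU. */
void model_high_irql_update(Foo *foo)
{
    assert(foo != NULL);
    foo->x = 10;
    foo->y = 20;
}

/* Low-IRQL writer. When interruptMidway is nonzero, the "interrupt"
 * is delivered between its two stores, as in steps 1 through 3. */
void model_low_irql_update(Foo *foo, int interruptMidway)
{
    assert(foo != NULL);
    foo->x = 1;                         /* step 1                        */
    if (interruptMidway)
        model_high_irql_update(foo);    /* step 2: interrupt arrives     */
    foo->y = 2;                         /* step 3: low-IRQL code resumes */
}
```

Called with interruptMidway set, the function leaves the structure holding 10 in x and 2 in y, a combination that neither writer intended.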

Interrupt Blocking

In the previous example, the lower-IRQL routine could have avoided these
synchronization problems by preventing itself from being interrupted. It can do
this by temporarily raising the IRQL of the CPU and then lowering it back to its
initial level after completing the modification. This technique is called interrupt
blocking. If you look at Table 5.4, you'll see the Kernel functions that your driver
can use to manipulate a CPU's IRQL value.
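The raise-then-restore pattern looks like this in a user-mode model. Nothing here touches real hardware: a single variable stands in for the CPU's IRQL, the model_ functions are hypothetical stand-ins for the Kernel calls, and the numeric levels are assumed values that only preserve the ordering of the real ones.

```c
#include <assert.h>

typedef unsigned char ModelIrql;
enum {
    MODEL_PASSIVE_LEVEL  = 0,
    MODEL_DISPATCH_LEVEL = 2,
    MODEL_DEVICE_LEVEL   = 5      /* stands for some device's DIRQL */
};

static ModelIrql g_modelIrql = MODEL_PASSIVE_LEVEL;  /* this CPU's IRQL */

/* Raise the IRQL and hand back the level that was in force before. */
void model_raise_irql(ModelIrql newIrql, ModelIrql *oldIrql)
{
    assert(newIrql >= g_modelIrql);    /* raising must never lower */
    *oldIrql = g_modelIrql;
    g_modelIrql = newIrql;
}

/* Return to a level saved by an earlier raise. */
void model_lower_irql(ModelIrql restoreTo)
{
    assert(restoreTo <= g_modelIrql);
    g_modelIrql = restoreTo;
}

ModelIrql model_current_irql(void)
{
    return g_modelIrql;
}

typedef struct { int x; int y; } Foo;

/* Update both fields with interrupts at or below syncIrql blocked. */
void model_update_foo(Foo *foo, ModelIrql syncIrql)
{
    ModelIrql oldIrql;

    model_raise_irql(syncIrql, &oldIrql);
    foo->x = 1;
    foo->y = 2;
    model_lower_irql(oldIrql);         /* restore the saved level */
}
```

Note that the routine restores the value it saved rather than a hard-coded level, so it behaves correctly no matter what IRQL its caller was running at.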
Rules for Blocking Interrupts

If you plan to use any of these functions to block interrupts, there are certain
rules you need to follow:

•  Every piece of code touching a protected data structure has to agree on
   the IRQL to use for synchronization and must only touch the structure
   when it's running at the chosen IRQL.

•  Drivers using this technique shouldn't spend too much time at the
   elevated IRQL level. Depending on the blocking level, this can have a
   negative impact on NT's ability to service other interrupts quickly.

•  Although your driver can raise the CPU's IRQL to a higher level and
   reduce it back to its previous value, you must never drop the CPU's IRQL
   below the level where you found it. Disobeying this rule will compromise
   the entire interrupt priority mechanism.

Table 5.4  These Kernel functions control the CPU's IRQL level

Interrupt Blocking Functions

Function          Description
----------------  -------------------------------------------------------
KeRaiseIrql       Changes the CPU IRQL to a specified value, blocking
                  interrupts at or below that IRQL level
KeLowerIrql       Lowers the CPU IRQL value
KeGetCurrentIrql  Returns the IRQL value of the CPU on which this call
                  is made

5.6 SYNCHRONIZING MULTIPLE CPUS
But everything is not yet safe. Modifying the IRQL of one CPU has no effect on
other CPUs in a multiprocessor system. Consequently, IRQLs provide only local
protection to shared data. To prevent corruption of data structures accessed by
multiple CPUs, NT uses synchronization objects called spin locks.

How Spin Locks Work
A spin lock is simply a mutual-exclusion object that you associate with a
specific group of data structures. When a piece of kernel-mode code wants to
touch any of the guarded data structures, it must first request ownership of the
associated spin lock. Since only one CPU at a time can own the spin lock, the data
structure is safe from collisions. Any CPU requesting an already-owned spin lock
will busy-wait until the spin lock becomes available. Look at Figure 5.2 to see how
this works.
A given spin lock is always acquired and released at a specific IRQL level.
This has the effect of blocking potentially dangerous interrupts on the local CPU
and preventing the synchronization problems we saw in the last section. While a
CPU is waiting for a spin lock, all activity at or below the IRQL of the spin lock is
blocked on that CPU. Once the IRQL level has been raised, the CPU can request
ownership of the spin lock, which will guarantee protection against other CPUs.
Fortunately, all these details are hidden inside the Kernel's spin lock routines.
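The busy-wait at the heart of a spin lock can be sketched with C11 atomics. This is a user-mode model of the idea, not the Kernel's implementation: ModelSpinLock and its functions are hypothetical names, and the model leaves out the IRQL manipulation that the real routines perform before they start spinning.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { atomic_flag held; } ModelSpinLock;

/* Put the lock into the unowned state. */
void model_spin_init(ModelSpinLock *lock)
{
    assert(lock != NULL);
    atomic_flag_clear(&lock->held);
}

/* One attempt: returns true if the caller now owns the lock. */
bool model_spin_try_acquire(ModelSpinLock *lock)
{
    return !atomic_flag_test_and_set_explicit(&lock->held,
                                              memory_order_acquire);
}

/* Busy-wait until the lock becomes ours. */
void model_spin_acquire(ModelSpinLock *lock)
{
    while (!model_spin_try_acquire(lock)) {
        /* spin: another owner must release the lock first */
    }
}

/* Give up ownership so that a waiting CPU can proceed. */
void model_spin_release(ModelSpinLock *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

The test-and-set operation is atomic, so when two processors attempt it at the same moment exactly one of them sees the flag clear and becomes the owner.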

Using Spin Locks
There are two major kinds of spin locks provided by the Kernel. They are
distinguished by the IRQL level at which you use them.
•  Interrupt spin locks - These synchronize access to driver data
   structures shared by multiple driver routines. Interrupt spin locks are acquired
   at the DIRQL associated with the device.

•  Executive spin locks - These guard various operating system data
   structures and their associated IRQL is DISPATCH_LEVEL.

[Figure 5.2  How spin locks synchronize multiple CPUs. The diagram shows two
CPUs, each running the sequence: Raise IRQL; repeat Request Spin Lock until
ACQUIRED; modify foo.x and foo.y (1 and 2 on one CPU, 10 and 20 on the other);
Release Spin Lock; Restore IRQL. A single spin lock for "foo" guards the
structure with the fields int x; and int y;. Copyright © 1994 by Cydonix
Corporation.]

When your driver uses Interrupt spin locks, most of the work happens
behind the scenes. When we look at KeSynchronizeExecution in Chapter 9, you'll
see the exact details.
Executive spin locks are another story. When you use them, you'll need to
follow these steps:

1. Decide what data items you need to guard and how many spin locks to use.
   The tradeoff is that a larger number of spin locks may allow more of your
   driver to run in parallel, but it increases the chance of deadlocking if you need
   to acquire multiple locks at the same time.

2. Declare a data item of type KSPIN_LOCK for each lock. Storage for the spin
   lock must be permanently resident. Usually, you store spin locks in the Device
   or Controller Extension.

3. Initialize the spin lock once by calling KeInitializeSpinLock. You can call this
   function from any IRQL level, though most often you set up all your spin
   locks in the DriverEntry routine.

4. Call KeAcquireSpinLock before you touch any resource guarded by a spin
   lock. This function raises IRQL to DISPATCH_LEVEL, acquires the spin lock,
   and returns the previous IRQL value to you. To call this function, you must be
   at or below DISPATCH_LEVEL IRQL. If you're already running at
   DISPATCH_LEVEL, you can save some work by calling
   KeAcquireSpinLockAtDpcLevel instead.

5. When you've finished using the protected resource, call the
   KeReleaseSpinLock function to let go of the lock. You call this function from
   DISPATCH_LEVEL IRQL and it restores IRQL to its previous value. If you were
   already at DISPATCH_LEVEL when you acquired the lock, you can save
   some work by calling KeReleaseSpinLockFromDpcLevel, which releases the
   lock but doesn't change IRQL.

Some other driver support routines (like the interlocked lists and queues
described in the next section) use Executive spin locks for protection. In these cases,
your only responsibility is to initialize the spin lock object. The routines that manage
the interlocked object will acquire and release the spin lock on your behalf.

Rules for Using Spin Locks
Spin locks aren't terribly difficult to use, but you do have to keep a few
things in mind when you're working with them:

•  Be sure to release a spin lock as quickly as possible, because while you're
   holding it, you may be blocking all activity on other CPUs. The official
   recommendation is not to hold a spin lock for more than about 25 microseconds.

•  Don't cause any hardware or software exceptions while you're holding a
   spin lock. This is a sure way to crash the system.

•  Don't try to access any paged code or data while you're holding a spin
   lock. This may result in a page fault exception, which is another quick
   way to crash the system.

•  Don't try to acquire a spin lock that your CPU already owns. This will
   lead to a deadlock situation where the CPU freezes up waiting for itself to
   release the spin lock.

•  Avoid driver designs that depend on holding multiple spin locks at the
   same time. Unless you're careful, this can also lead to deadlocks. If you
   must use multiple spin locks, be sure that everyone agrees to acquire
   them in a fixed order and release them in reverse order.

•  Don't call any routines that violate the above rules.

5.7 LINKED LISTS
Drivers sometimes need to maintain various kinds of linked lists. You'll see
examples of this in later chapters. The following subsections describe the support
available from NT for managing singly- and doubly-linked lists.

Singly-Linked Lists
To use singly-linked lists, begin by declaring a list head of type
SINGLE_LIST_ENTRY. This is also the data type of the link pointer itself. You
need to initialize the list by setting the head to NULL, as demonstrated in the
following code fragment.

    typedef struct _DEVICE_EXTENSION {
        SINGLE_LIST_ENTRY listHead;     // Declare head pointer
    } DEVICE_EXTENSION, *PDEVICE_EXTENSION;

    pDevExt->listHead.Next = NULL;      // Initialize the list

To add or remove entries from the front of the list, call PushEntryList and
PopEntryList. Depending on how you're using the list, the actual entries can be in
either paged or nonpaged memory. Just remember that these functions don't
perform any synchronization of their own.
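The push and pop operations amount to a couple of pointer assignments each. The sketch below is a user-mode model with hypothetical names (ModelSingleEntry, model_push_entry, model_pop_entry); it mirrors the shape of a list head whose only field is a Next pointer and, like the routines it models, does no locking.

```c
#include <assert.h>
#include <stddef.h>

typedef struct ModelSingleEntry {
    struct ModelSingleEntry *Next;
} ModelSingleEntry;

/* Link 'entry' in at the front of the list. */
void model_push_entry(ModelSingleEntry *head, ModelSingleEntry *entry)
{
    assert(head != NULL && entry != NULL);
    entry->Next = head->Next;
    head->Next = entry;
}

/* Unlink and return the front entry, or NULL if the list is empty. */
ModelSingleEntry *model_pop_entry(ModelSingleEntry *head)
{
    ModelSingleEntry *first;

    assert(head != NULL);
    first = head->Next;
    if (first != NULL)
        head->Next = first->Next;
    return first;
}
```

Because entries go on and come off at the same end, a list used this way behaves as a stack: the last entry pushed is the first one popped.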
NT also provides convenient support for singly-linked lists guarded by an
Executive spin lock. This kind of protection is important if you're sharing a linked
list among driver routines running at or below DISPATCH_LEVEL IRQL. To use
one of these lists, set up the list head in the usual way, and then initialize an
Executive spin lock that will guard the list.

    typedef struct _DEVICE_EXTENSION {
        SINGLE_LIST_ENTRY listHead;     // Declare head pointer
        KSPIN_LOCK listLock;            // and the lock
    } DEVICE_EXTENSION, *PDEVICE_EXTENSION;

    KeInitializeSpinLock( &pDevExt->listLock );
    pDevExt->listHead.Next = NULL;

You pass a pointer to this spin lock as an explicit argument to
ExInterlockedPushEntryList and ExInterlockedPopEntryList. To make these interlocked
calls, you must be running at or below DISPATCH_LEVEL IRQL. The list entries
themselves must reside in nonpaged memory, since the system will be linking
and unlinking them from DISPATCH_LEVEL IRQL.

Doubly-Linked Lists
To use doubly-linked lists, declare a list head of type LIST_ENTRY. This is
also the data type of the link pointer itself. You need to initialize the list head, as
demonstrated in the following code fragment.

    typedef struct _DEVICE_EXTENSION {
        LIST_ENTRY listHead;            // Declare head pointer
    } DEVICE_EXTENSION, *PDEVICE_EXTENSION;

    InitializeListHead( &pDevExt->listHead );

To add entries to the list, call InsertHeadList or InsertTailList, and to pull
entries out, call RemoveHeadList or RemoveTailList. You can determine if there's
anything in a list by calling IsListEmpty. Again, the entries can be paged or
nonpaged, but these functions don't perform any synchronization.
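A doubly-linked list of this kind is circular: an empty list is a head whose two links point back at the head itself. The following user-mode model uses hypothetical names (ModelListEntry and the model_ functions) and shows only the tail-insert and head-remove operations; it also assumes the caller checks for an empty list before removing.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct ModelListEntry {
    struct ModelListEntry *Flink;   /* forward link  */
    struct ModelListEntry *Blink;   /* backward link */
} ModelListEntry;

/* An empty list: both links point back at the head. */
void model_init_list_head(ModelListEntry *head)
{
    head->Flink = head;
    head->Blink = head;
}

bool model_is_list_empty(const ModelListEntry *head)
{
    return head->Flink == head;
}

/* Link 'entry' in just before the head, which makes it the tail. */
void model_insert_tail(ModelListEntry *head, ModelListEntry *entry)
{
    ModelListEntry *last = head->Blink;

    entry->Flink = head;
    entry->Blink = last;
    last->Flink = entry;
    head->Blink = entry;
}

/* Unlink and return the first entry. The caller checks for empty first. */
ModelListEntry *model_remove_head(ModelListEntry *head)
{
    ModelListEntry *first = head->Flink;

    assert(first != head);          /* model assumption: list not empty */
    head->Flink = first->Flink;
    first->Flink->Blink = head;
    return first;
}
```

Inserting at the tail and removing from the head gives first-in, first-out behavior, which is the usual arrangement for a queue of pending requests.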
Not surprisingly, NT supports interlocked doubly-linked lists. To use these,
set up the list head in the usual way, and then initialize an Executive spin lock that
will guard the list.

    typedef struct _DEVICE_EXTENSION {
        LIST_ENTRY listHead;            // Declare head pointer
        KSPIN_LOCK listLock;            // and the lock
    } DEVICE_EXTENSION, *PDEVICE_EXTENSION;

    KeInitializeSpinLock( &pDevExt->listLock );
    InitializeListHead( &pDevExt->listHead );

You pass this spin lock in calls to ExInterlockedInsertTailList,
ExInterlockedInsertHeadList, and ExInterlockedRemoveHeadList. To make these
interlocked calls, you must be running at or below DISPATCH_LEVEL IRQL. Just
like their singly-linked cousins, entries for doubly-linked interlocked lists have to
live in nonpaged memory.

Removing Blocks from a List
When you pull a block out of a list, what the system gives you is a pointer
to the LIST_ENTRY or SINGLE_LIST_ENTRY structure within the block. What
you probably want is the address of the block itself. If the XXX_LIST_ENTRY
structure is at the top of the block, everything is easy. If it's buried in the block
somewhere, you need to do a little arithmetic to get the address of the containing
structure. Fortunately, NT provides a macro to make this easier. See Table 5.5 for
the details.
The following code fragment shows how to use this macro. It assumes
you're using the Tail.Overlay.ListEntry field of an IRP to maintain your own
linked list of IRPs, and that the listHead field of your Device Extension points to
the beginning of this list.

Table 5.5 CONTAINING_RECORD macro arguments

CONTAINING_RECORD
Parameter       Description
Address         Address of a field within a data structure
Type            The data type of the structure
Field           Field in structure pointed at by the Address argument
Return value    Base address of structure containing Field

PIRP pIrp;
PLIST_ENTRY pEntry;

pEntry = RemoveHeadList( &pDevExt->listHead );
pIrp = CONTAINING_RECORD( pEntry, IRP,
                          Tail.Overlay.ListEntry );
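
The "little arithmetic" behind the macro is easy to try out in user mode. The following sketch assumes nothing about NTDDK.H: CONTAINER_OF, node_link, request, and request_from_link are hypothetical names invented for this example, and the macro is built on the standard offsetof rather than copied from the DDK headers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical link field, standing in for a list entry. */
typedef struct node_link {
    struct node_link *next;
} node_link;

/* Hypothetical request block with the link buried in the middle. */
typedef struct request {
    int id;
    node_link entry;
    char payload[16];
} request;

/* Same idea as CONTAINING_RECORD: back up from the field's address
   by the field's offset to reach the start of the structure. */
#define CONTAINER_OF(address, type, field) \
    ((type *)((char *)(address) - offsetof(type, field)))

/* Recover the request that owns a given link. */
request *request_from_link(node_link *p) {
    assert(p != NULL);
    return CONTAINER_OF(p, request, entry);
}
```

If the link were the first member of the structure, the offset would be zero and the two addresses would be identical, which is the easy case mentioned earlier.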

5.8 SUMMARY
In this chapter we've looked at some general guidelines for designing and coding
your driver. We've also covered a number of basic techniques that will show up
again and again throughout this book.
This is all just foundation material for the work ahead. In the next chapter,
we'll start to implement some actual driver routines.

CHAPTER 6

Initialization and Cleanup Routines

Everything has to start somewhere. In the case of
an NT kernel-mode driver, the starting point is a function called DriverEntry.
This chapter will show you how to write a DriverEntry routine along with various
other pieces of initialization and cleanup code. By the time you finish this
chapter, you'll be able to write a minimal driver that you can actually load into
the system.

6.1 WRITING A DRIVERENTRY ROUTINE
Every NT kernel-mode driver, regardless of its purpose, has to expose a routine
whose name is DriverEntry. This routine initializes various driver data structures
and prepares the environment for all the other driver components.

Execution Context
The I/O Manager calls your DriverEntry routine once when it loads your
driver. As you can see from Table 6.1, the DriverEntry routine runs at
PASSIVE_LEVEL IRQL, which means it has access to paged system resources.
The DriverEntry routine receives a pointer to its own Driver object, which it
must initialize. It also gets a UNICODE_STRING containing the path to the driver's
service key in the Registry. This string takes the form
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\DriverName, and
DriverEntry can use it to extract any driver-specific parameters stored in the
Registry.1

Table 6.1 Function prototype for a DriverEntry routine

NTSTATUS DriverEntry                IRQL == PASSIVE_LEVEL
Parameter                           Description
IN PDRIVER_OBJECT DriverObject      Driver object for this driver
IN PUNICODE_STRING RegistryPath     Registry path string for this driver's key
Return value                        • STATUS_SUCCESS - success
                                    • STATUS_XXX - some error code

What a DriverEntry Routine Does

Although the exact details will vary slightly from driver to driver, in general
you should perform the following steps in your DriverEntry routine.
1. If you're writing a device driver, start by finding and allocating any hardware
   that the driver is supposed to manage.
2. Initialize the Driver object with pointers to other driver entry points.
3. If your driver manages a multiunit controller, call IoCreateController to
   create a Controller object and then initialize its Controller Extension.
4. Call IoCreateDevice to create a Device object and then initialize its Device
   Extension.
5. Make the device visible to the Win32 subsystem by calling
   IoCreateSymbolicLink.
6. Connect the device to an Interrupt object and initialize any DPC objects
   needed by the driver.
7. Repeat steps 3-6 for all controllers and devices that belong to your driver.
8. Return STATUS_SUCCESS to the I/O Manager.

If you run into problems during initialization, your DriverEntry routine
should release any system resources it may have allocated and return an
appropriate NTSTATUS failure code to the I/O Manager.
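
The release-on-failure rule is easier to follow with the bookkeeping stripped down to its bones. The sketch below is a user-mode simulation, not driver code: init_state, simulated_driver_entry, and the SIM_XXX status values are all hypothetical, and the three "steps" merely set flags where a real DriverEntry would claim hardware, create a Device object, and create a symbolic link. It shows one common way to organize the unwinding, with each failure path releasing only what earlier steps acquired.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of what the simulated driver currently holds. */
typedef struct init_state {
    bool hardware_claimed;
    bool device_created;
    bool link_created;
} init_state;

enum { SIM_SUCCESS = 0, SIM_FAILURE = -1 };

/* fail_at selects which step (1-3) reports an error; 0 means none does. */
int simulated_driver_entry(init_state *s, int fail_at) {
    int status = SIM_SUCCESS;

    assert(s != NULL);
    s->hardware_claimed = s->device_created = s->link_created = false;

    if (fail_at == 1) return SIM_FAILURE;       /* nothing acquired yet */
    s->hardware_claimed = true;

    if (fail_at == 2) { status = SIM_FAILURE; goto release_hardware; }
    s->device_created = true;

    if (fail_at == 3) { status = SIM_FAILURE; goto delete_device; }
    s->link_created = true;

    return SIM_SUCCESS;

    /* Unwind in the reverse order of acquisition. */
delete_device:
    s->device_created = false;
release_hardware:
    s->hardware_claimed = false;
    return status;
}
```

Whatever structure you choose, the test is the same: after a failed call, the state should look exactly as it did before the call started.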
The following sections describe some of these steps in greater detail. The
process of finding and allocating hardware is complex enough that it needs to
wait until the next chapter. We'll also have to postpone the discussion of interrupt
processing and DPCs until we look at data transfer routines in Chapter 9.

1 Chapter 7 explains how to extract these parameters from a driver's service key.

Initializing Driver Entry Points

The I/O Manager is able to locate the DriverEntry routine because it has a
well-known name. Other driver routines don't have fixed names, so the I/O
Manager needs some other way to find them. It does this by looking in the Driver
object for pointers to specific functions. Your DriverEntry routine is responsible
for setting up these function pointers.
These function pointers fall into two categories:
• Functions with explicit slots in the Driver object.
• IRP Dispatch functions that are listed in the Driver object's MajorFunction
  array. These are discussed in more detail in Chapter 8.

The following code fragment shows how a DriverEntry routine initializes
both kinds of function pointers.

pDO->DriverStartIo = XxStartIo;
pDO->DriverUnload = XxUnload;
//
// Initialize the function dispatch array
//
pDO->MajorFunction[ IRP_MJ_CREATE ] = XxDispatchCreate;
pDO->MajorFunction[ IRP_MJ_CLOSE ] = XxDispatchClose;
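
If the idea of a table of function pointers indexed by request type is new to you, the following user-mode sketch may help. It is only a model: driver_object, dispatch_fn, the MJ_XXX codes, and send_request are hypothetical stand-ins, and the way this model rejects an empty slot is an assumption of the sketch rather than a description of what the I/O Manager does.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical function codes; the real IRP_MJ_XXX values are in NTDDK.H. */
enum { MJ_CREATE, MJ_CLOSE, MJ_READ, MJ_WRITE, MJ_MAXIMUM };

typedef int (*dispatch_fn)(int request);

/* Hypothetical stand-in for the dispatch array in a Driver object. */
typedef struct driver_object {
    dispatch_fn major_function[MJ_MAXIMUM];
} driver_object;

int xx_dispatch_create(int request) { return request + 100; }
int xx_dispatch_close(int request)  { return request + 200; }

/* Plays the role of DriverEntry: fill in only the supported slots. */
void xx_driver_entry(driver_object *drv) {
    assert(drv != NULL);
    for (int i = 0; i < MJ_MAXIMUM; i++)
        drv->major_function[i] = NULL;
    drv->major_function[MJ_CREATE] = xx_dispatch_create;
    drv->major_function[MJ_CLOSE]  = xx_dispatch_close;
}

/* Plays the role of the caller: find the routine by its function code. */
int send_request(const driver_object *drv, int major, int request) {
    if (major < 0 || major >= MJ_MAXIMUM || drv->major_function[major] == NULL)
        return -1;                      /* no routine registered for this code */
    return drv->major_function[major](request);
}
```

The caller never needs to know the names of the driver's routines. All it needs is the function code and the table that the initialization routine filled in.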

Creating Device Objects

Once you've found and allocated all your hardware, you need to create a
Device object for each physical or virtual device you want to expose to the rest of
the system. Most of the work is done by the IoCreateDevice function, which takes
a description of your device and returns a Device object, complete with an
attached Device Extension. IoCreateDevice also links the new Device object into
the list of devices managed by this Driver object. Table 6.2 contains a description
of this function.

Table 6.2 Function prototype for IoCreateDevice

NTSTATUS IoCreateDevice             IRQL == PASSIVE_LEVEL
Parameter                           Description
IN PDRIVER_OBJECT DriverObject      Pointer to Driver object
IN ULONG DeviceExtensionSize        Desired size of Device Extension in bytes
IN PUNICODE_STRING DeviceName       NT device name (see below)
IN DEVICE_TYPE DeviceType           FILE_DEVICE_XXX (see NTDDK.H)
IN ULONG DeviceCharacteristics      Characteristics for mass-storage device
                                    • FILE_REMOVABLE_MEDIA
                                    • FILE_READ_ONLY_DEVICE
                                    • FILE_FLOPPY_DISKETTE
                                    • FILE_WRITE_ONCE_MEDIA
                                    • FILE_REMOTE_DEVICE
IN BOOLEAN Exclusive                TRUE if device is nonshareable
OUT PDEVICE_OBJECT *DeviceObject    Variable that receives Device object
Return value                        • STATUS_SUCCESS - success
                                    • STATUS_XXX - some failure code

Take a look at the NTDDK.H header file to see the standard definitions for
the DeviceType argument. Try to choose a value that's as close as possible to your
device.
If you truly believe your nuclear-powered laser retroscope is unlike any
existing device, you can define a private device type value. Just remember that
Microsoft reserves values in the range 0-32767 and leaves numbers between
32768 and 65535 for you. They also leave the bookkeeping up to you, so there's no
guarantee that the number you choose for your retroscope won't be used by some
other driver to refer to its microwave popcorn warmer.
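
Since the bookkeeping is yours, a compile-time constant plus a small range check is a cheap safeguard. The sketch below uses only the two ranges quoted above; FILE_DEVICE_RETROSCOPE, its value of 40000, and is_private_device_type are hypothetical names chosen for this example and do not appear in NTDDK.H.

```c
#include <stdbool.h>

/* Ranges as given in the text: 0-32767 reserved, 32768-65535 private. */
#define DEVICE_TYPE_PRIVATE_MIN 32768UL
#define DEVICE_TYPE_PRIVATE_MAX 65535UL

/* Hypothetical private device type for the retroscope. */
#define FILE_DEVICE_RETROSCOPE 40000UL

/* True if the value lies in the range left open for private types. */
bool is_private_device_type(unsigned long type) {
    return type >= DEVICE_TYPE_PRIVATE_MIN && type <= DEVICE_TYPE_PRIVATE_MAX;
}
```

A check like this catches a value that strays into the reserved range, but it can't tell you whether another vendor picked the same number.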
One final point about creating Device objects. Although the vast majority of
drivers call IoCreateDevice from their DriverEntry routines, it is possible to make
this call from a Dispatch routine instead. For example, a driver that managed
pseudo-devices could use this technique to dynamically create Device objects in
response to a driver-defined DeviceIoControl request.
If you do create Device objects somewhere other than in your DriverEntry
routine, you have to clear the DO_DEVICE_INITIALIZING bit in the Flags field
of the object. In the normal course of events, the I/O Manager automatically
clears this bit for a driver's Device objects when the DriverEntry routine is
finished. Until this bit is cleared, the Device object can't be used, and CreateFile calls
referencing it will fail. The following code fragment shows what you need to do.

pDevObj->Flags &= ~DO_DEVICE_INITIALIZING;

Don't clear this bit until the Device object is actually initialized and ready to
process requests.
Choosing a Buffering Strategy

If the IoCreateDevice call succeeds, you need to let the I/O Manager know
whether you want to do Buffered or Direct I/O with this device. You make this
choice by ORing one of the following bits into the Flags field of the new Device
object.2
• DO_BUFFERED_IO - If you want the I/O Manager to copy data back
  and forth between user and system-space buffers.
• DO_DIRECT_IO - If you want the I/O Manager to lock user buffers into
  physical memory for the duration of an I/O, and build a descriptor list of
  the pages in the buffer.

Chapter 8 will explain how to work with user buffers in both of these cases.
If you don't set either of these bits, the I/O Manager will assume that you're
handling everything yourself. Making user data available to a driver is a nasty
process, so it's best to let the I/O Manager do the work for you.
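
Both this step and the earlier DO_DEVICE_INITIALIZING fragment come down to changing one bit while leaving its neighbors alone. Here is a small user-mode sketch of the right and wrong ways to do it. The SIM_DO_XXX constants are stand-ins with values picked for the example (a driver always takes the real definitions from NTDDK.H), and the three function names are hypothetical.

```c
/* Stand-in flag values for this sketch only. */
#define SIM_DO_BUFFERED_IO          0x00000004UL
#define SIM_DO_DIRECT_IO            0x00000010UL
#define SIM_DO_DEVICE_INITIALIZING  0x00000080UL

/* Right: OR sets one bit and keeps every other bit as it was. */
unsigned long select_buffered_io(unsigned long flags) {
    return flags | SIM_DO_BUFFERED_IO;
}

/* Right: AND with the complement clears one bit and keeps the rest. */
unsigned long finish_initializing(unsigned long flags) {
    return flags & ~SIM_DO_DEVICE_INITIALIZING;
}

/* Wrong: plain assignment throws away every bit that was already set. */
unsigned long select_buffered_io_wrong(unsigned long flags) {
    (void)flags;
    return SIM_DO_BUFFERED_IO;
}
```

In driver source the same two idioms appear as the |= and &= ~ operators applied directly to the Flags field.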
NT and Win32 Device Names

Just like T.S. Eliot's cats, NT devices have more than one name. The one you
specify to IoCreateDevice is the name by which the device is known to the NT
Executive itself. If you want to make the device available to the Win32 subsystem,
the Win16 subsystem, and virtual DOS machines, you have to give the device a
DOS name as well.
These two types of names live in different parts of the Object Manager's
namespace. You'll find NT device names dangling beneath the \Device section of
the tree, while the Win32 name appears beneath the \DosDevices area. Notice
that the DOS name is actually a symbolic link that connects it to the NT device.
Figure 6.1 illustrates this relationship.
Also notice that NT and DOS follow different device naming conventions.
NT device names tend to be longer, and they always end in a zero-based number
(FloppyDisk0, FloppyDisk1, etc.). DOS devices follow the usual pattern of A
through Z for file-system devices, and names ending in a one-based number for
any other devices (LPT1, LPT2, etc.).
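
The off-by-one between the two numbering schemes is a classic source of bugs, so it's worth seeing in isolation. This user-mode sketch builds both names for a given unit using ordinary C strings. The base names \Device\Xx and \DosDevices\XX and the function format_device_names are assumptions made for the example; a real driver works with UNICODE_STRINGs and its own names.

```c
#include <stdio.h>

/* Hypothetical base names for this sketch. */
#define XX_NT_BASE_NAME    "\\Device\\Xx"
#define XX_WIN32_BASE_NAME "\\DosDevices\\XX"

/* Build both names for the unit with the given zero-based NT number.
   Returns 0 on success, -1 if either buffer is too small. */
int format_device_names(unsigned nt_number,
                        char *nt_name, size_t nt_size,
                        char *win32_name, size_t win32_size) {
    /* NT name: zero-based number. */
    int n = snprintf(nt_name, nt_size, "%s%u", XX_NT_BASE_NAME, nt_number);
    if (n < 0 || (size_t)n >= nt_size) return -1;

    /* Win32 name: one-based number, so add one. */
    n = snprintf(win32_name, win32_size, "%s%u",
                 XX_WIN32_BASE_NAME, nt_number + 1);
    if (n < 0 || (size_t)n >= win32_size) return -1;

    return 0;
}
```

Under these assumptions the first unit ends up as \Device\Xx0 on the NT side and \DosDevices\XX1 on the Win32 side.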

6.2 CODE EXAMPLE: DRIVER INITIALIZATION

This example shows how a basic device driver initializes itself. You can find the
code for this example in the CH06 directory on the disk that accompanies this
book.

2 Make sure you use a bitwise OR to set bits in the Flags field of the Device object. The I/O
Manager uses other bits in this field to synchronize its own operation, and if you accidentally
clear some of them, bad things will happen.

[Figure: Object Manager namespace tree with a \Device directory and a \DosDevices directory; a symbolic link connects an entry under \DosDevices to a device such as Xx0 under \Device.]

Copyright © 1996 by Cydonix Corporation.

Figure 6.1 NT and Win32 device names in the Object Manager's namespace

INIT.C

The functions in this module perform all the essential setup tasks needed to
manage one or more physical devices. Although the code supports multiple
devices, it assumes they are all on separate controllers, so it doesn't create any
Controller objects.
DriverEntry This particular implementation isn't very forgiving of
initialization errors. If anything fails along the way, the whole driver refuses to load. A
real driver might take a more flexible approach.

//
// Header files...
//
#include "xxdriver.h"                               // (1)

//
// Forward declarations of local functions
//
static NTSTATUS
XxCreateDevice(
    IN PDRIVER_OBJECT DriverObject,
    IN INTERFACE_TYPE BusType,
    IN ULONG BusNumber,
    IN PDEVICE_BLOCK DeviceBlock,
    IN ULONG NtDeviceNumber
    );

//
// If the platform can handle it, make the DriverEntry
// routine discardable, so that it doesn't waste space
//
#ifdef ALLOC_PRAGMA
#pragma alloc_text( init, DriverEntry )             // (2)
#pragma alloc_text( init, XxCreateDevice )
#endif

//++
// Function:
//      DriverEntry
//
// Description:
//      This function initializes the driver, locates
//      and claims hardware resources, and creates
//      various NT objects needed to process I/O
//      requests.
//
// Arguments:
//      Pointer to the Driver object
//      Registry path string for driver service key
//
// Return Value:
//      NTSTATUS signaling success or failure
//--
NTSTATUS
DriverEntry(
    IN PDRIVER_OBJECT DriverObject,
    IN PUNICODE_STRING RegistryPath
    )
{
    PCONFIG_ARRAY ConfigList;                       // (3)
    PCONFIG_ARRAY ConfigArray;
    ULONG NtDeviceNumber;
    NTSTATUS status;
    ULONG i;

    //
    // Load up the Config list...
    //
    status = XxGetHardwareInfo(                     // (4)
                RegistryPath,
                &ConfigList );
    if( !NT_SUCCESS( status ) )
    {
        return status;
    }

    //
    // Allocate the hardware...
    //
    status = XxReportHardwareUsage(
                DriverObject,
                ConfigList );
    if( !NT_SUCCESS( status ) )
    {
        XxReleaseHardwareInfo( ConfigList );
        return status;
    }

    //
    // Export other driver entry points...
    //
    DriverObject->DriverUnload = XxDriverUnload;

    DriverObject->MajorFunction[ IRP_MJ_CREATE ] =
                XxDispatchOpen;
    DriverObject->MajorFunction[ IRP_MJ_CLOSE ] =
                XxDispatchClose;
    DriverObject->MajorFunction[ IRP_MJ_WRITE ] =
                XxDispatchWrite;
    DriverObject->MajorFunction[ IRP_MJ_READ ] =
                XxDispatchRead;

    //
    // Initialize a Device object for each piece
    // of hardware we've found
    //
    ConfigArray = ConfigList;
    NtDeviceNumber = 0;
    while( ConfigArray != NULL )
    {
        for( i = 0;
             i < ConfigArray->Count;
             i++ )
        {
            status = XxCreateDevice(
                        DriverObject,
                        ConfigArray->BusType,
                        ConfigArray->BusNumber,
                        &ConfigArray->Device[i],
                        NtDeviceNumber );
            if( !NT_SUCCESS( status ) ) break;
            NtDeviceNumber++;
        }
        if( !NT_SUCCESS( status ) ) break;

        //
        // Get next array in the chain
        //
        ConfigArray = ConfigArray->NextConfigArray;
    }

    if( !NT_SUCCESS( status ) )
    {
        XxReleaseHardware( DriverObject );
        XxReleaseHardwareInfo( ConfigList );
        return status;
    }

    return STATUS_SUCCESS;
}

(1) This header includes both the system-supplied NTDDK.H and our private
    HARDWARE.H file. It also contains definitions of any driver-defined
    structures.
(2) NT will discard these routines after DriverEntry executes. You should
    also include any functions called only by the DriverEntry routine. Do not
    discard any code needed after driver initialization.
(3) The Config list is a driver-defined data structure that will follow us
    through the DriverEntry routine. It holds information about any hardware
    that this driver manages. Chapter 7 will show you how to use this
    structure.
(4) We'll see this routine in the next chapter. It uses one of two techniques to
    locate any hardware this driver is responsible for and put a description of
    that hardware into the Config list.

XxCreateDevice This is a helper function that does all the grunt work. It
creates and initializes a single Device object using one of the hardware
descriptions in the Config list.

static NTSTATUS
XxCreateDevice(
    IN PDRIVER_OBJECT DriverObject,
    IN INTERFACE_TYPE BusType,
    IN ULONG BusNumber,
    IN PDEVICE_BLOCK DeviceBlock,
    IN ULONG NtDeviceNumber
    )
{
    NTSTATUS status;
    PDEVICE_OBJECT pDevObj;
    PDEVICE_EXTENSION pDevExt;
    UNICODE_STRING deviceName;
    WCHAR deviceNameBuffer[ XX_MAX_NAME_LENGTH ];
    UNICODE_STRING linkName;
    WCHAR linkNameBuffer[ XX_MAX_NAME_LENGTH ];
    UNICODE_STRING number;
    WCHAR numberBuffer[10];

    number.Buffer = numberBuffer;
    number.MaximumLength = 10;

    //
    // Form the base NT device name...
    //
    deviceName.Buffer = deviceNameBuffer;
    deviceName.MaximumLength = XX_MAX_NAME_LENGTH;
    deviceName.Length = 0;
    RtlAppendUnicodeToString(
        &deviceName,
        XX_NT_DEVICE_NAME );

    //
    // Convert the device number into a string and
    // attach it to the end of the device name.
    //
    number.Length = 0;
    RtlIntegerToUnicodeString(
        NtDeviceNumber,
        10,
        &number );
    RtlAppendUnicodeStringToString(
        &deviceName,
        &number );

    //
    // Create a Device object for this device...
    //
    status = IoCreateDevice(
                DriverObject,
                sizeof( DEVICE_EXTENSION ),
                &deviceName,
                FILE_DEVICE_UNKNOWN,                // (1)
                0,
                TRUE,
                &pDevObj );
    if( !NT_SUCCESS( status ) )
    {
        return status;
    }
    pDevObj->Flags |= DO_BUFFERED_IO;               // (2)

    //
    // Initialize the Device Extension
    //
    pDevExt = pDevObj->DeviceExtension;
    pDevExt->DeviceObject = pDevObj;
    pDevExt->NtDeviceNumber = NtDeviceNumber;

    //
    // Copy things from Device Block                   (3)
    //
    pDevExt->PortBase = DeviceBlock->PortBase;

    //
    // Prepare a DPC object for later use
    //
    IoInitializeDpcRequest(
        pDevObj,
        XxDpcForIsr );

    //
    // Form the Win32 symbolic link name.
    //
    linkName.Buffer = linkNameBuffer;
    linkName.MaximumLength = XX_MAX_NAME_LENGTH;
    linkName.Length = 0;
    RtlAppendUnicodeToString(
        &linkName,
        XX_WIN32_DEVICE_NAME );

    //
    // Reset the number string and do another
    // conversion. Win32 device numbers are
    // one greater than the NT equivalent.
    //
    number.Length = 0;
    RtlIntegerToUnicodeString(
        NtDeviceNumber + 1,
        10,
        &number );
    RtlAppendUnicodeStringToString(
        &linkName,
        &number );

    //
    // Create a symbolic link so our device is
    // visible to Win32...
    //
    status = IoCreateSymbolicLink(
                &linkName,
                &deviceName );

    //
    // See if the symbolic link was created...
    //
    if( !NT_SUCCESS( status ) )
    {
        IoDeleteDevice( pDevObj );
        return status;
    }

    //
    // Make sure device interrupts are OFF
    //
    XxDisableInterrupts( pDevExt );

    //
    // Connect to an Interrupt object...               (4)
    //
    status = IoConnectInterrupt(
                &pDevExt->pInterrupt,
                XxIsr,
                pDevExt,
                NULL,
                DeviceBlock->SystemVector,
                DeviceBlock->Dirql,
                DeviceBlock->Dirql,
                DeviceBlock->InterruptMode,
                DeviceBlock->ShareVector,
                DeviceBlock->Affinity,
                DeviceBlock->FloatingSave );
    if( !NT_SUCCESS( status ) )
    {
        IoDeleteSymbolicLink( &linkName );
        IoDeleteDevice( pDevObj );
        return status;
    }

    //
    // Initialize the hardware and enable interrupts
    //
    KeSynchronizeExecution(
        pDevExt->pInterrupt,
        XxInitDevice,
        pDevExt );

    return status;
}


(1) Choose a FILE_DEVICE_XXX value that's as close as possible to the type
    of device your driver manages.
(2) Select an I/O method for data transfer operations. In this case, we'll let
    the I/O Manager copy things to and from user space for us.
(3) The Config list will be going away soon, so we need to move anything
    important into the Device Extension. At the least, this includes the control
    register base address; for DMA devices it would also include the Adapter
    object pointer and count of mapping registers. More on this in Chapter 12.
(4) Chapters 7 and 9 will explain more about interrupt processing.

6.3 WRITING REINITIALIZE ROUTINES
Intermediate-level drivers loading at system boot time may need to delay their
initialization until one or more lower-level drivers have finished loading. If all the
drivers belong to you, you can determine their load sequence by setting various
Registry entries at installation. But if you don't own all the underlying drivers,
your intermediate driver will need a Reinitialize routine.
Execution Context

If your DriverEntry routine discovers that it can't finish its initialization because
system bootstrapping hasn't yet gone far enough, it can register a Reinitialize routine
by calling IoRegisterDriverReinitialization. The I/O Manager will call the
Reinitialize routine at some later point during the bootstrap.
As you can see from Table 6.3, the Reinitialize routine runs at PASSIVE_LEVEL
IRQL, which means it has access to paged system resources. Reinitialize routines
are useful only for drivers that load automatically at system boot.

What a Reinitialize Routine Does

The Reinitialize routine can perform any driver initialization that the
DriverEntry routine was unable to complete. If the Reinitialize routine discovers that the
environment still isn't suitable, it can call IoRegisterDriverReinitialization to
register itself again.
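
The register, get called, register again cycle can be modeled in a few lines of user-mode C. Everything in this sketch is hypothetical: reinit_queue, register_reinit, run_pending_reinit, and xx_reinitialize are invented names, the "lower-level driver" is just a counter, and the sketch assumes that one registration buys exactly one call. It is meant to show the control flow, not the I/O Manager's actual bookkeeping.

```c
#include <stddef.h>

typedef void (*reinit_fn)(void *context, unsigned long count);

/* Hypothetical record of one pending reinitialization request. */
typedef struct reinit_queue {
    reinit_fn routine;
    void *context;
    unsigned long calls_made;
} reinit_queue;

reinit_queue g_queue;           /* zero-initialized: nothing registered */

void register_reinit(reinit_fn routine, void *context) {
    g_queue.routine = routine;
    g_queue.context = context;
}

/* One pass of the simulated bootstrap. Returns 1 if a routine ran. */
int run_pending_reinit(void) {
    reinit_fn routine = g_queue.routine;
    if (routine == NULL) return 0;
    g_queue.routine = NULL;     /* assumed: one registration, one call */
    routine(g_queue.context, g_queue.calls_made++);
    return 1;
}

/* Example routine: the environment is ready once *context reaches zero. */
void xx_reinitialize(void *context, unsigned long count) {
    int *passes_until_ready = context;
    (void)count;                /* zero-based call count, unused here */
    if (*passes_until_ready > 0) {
        (*passes_until_ready)--;
        register_reinit(xx_reinitialize, context);  /* not ready: ask again */
    }
}
```

The routine stops re-registering as soon as the condition it's waiting for is met, which is what keeps the cycle from running forever.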

Table 6.3 Function prototype for a Reinitialize routine

VOID XxReinitialize                 IRQL == PASSIVE_LEVEL
Parameter                           Description
IN PDRIVER_OBJECT DriverObject      Pointer to Driver object
IN PVOID Context                    Context block specified at registration
IN ULONG Count                      Zero-based count of reinitialization calls
Return value                        -

6.4 WRITING AN UNLOAD ROUTINE
By default, once a driver is loaded, it remains in the system until a reboot occurs.
To make a driver unloadable, you need to write an Unload routine and store a
pointer to the routine in your Driver object's DriverUnload field. The I/O
Manager will then call this routine in response to an unload request from the Control
Panel's Devices applet. If your driver will never be unloaded, then you can forget
about this routine.
Execution Context

The I/O Manager calls your Unload routine once when it unloads the
driver, usually because someone is playing with the Control Panel Devices applet.
As you can see from Table 6.4, the Unload routine runs at PASSIVE_LEVEL IRQL,
which means it has access to paged system resources.
What an Unload Routine Does

Although the exact details will vary slightly from driver to driver, in general
you should perform the following steps in your Unload routine:
1. For some kinds of hardware, you may need to save the state of the device in
   the Registry. That way, you'll be able to put the device back in the same state
   the next time your DriverEntry routine executes. For example, an audio card
   driver might save the current volume setting of the card.
2. Disable interrupts from the device and disconnect the device from its
   Interrupt object. It's crucial that the device not generate any interrupt requests
   once the Interrupt object is gone.
3. Deallocate any hardware belonging to your driver.
4. Use IoDeleteSymbolicLink to remove the device from the Win32 namespace.
5. Remove the Device object itself using IoDeleteDevice.
6. If you're managing multiunit controllers, repeat steps 4 and 5 for each device
   attached to the controller. Then remove the Controller object itself using
   IoDeleteController.
7. Repeat steps 4-6 for all controllers and devices that belong to your driver.
8. Deallocate any pool memory held by the driver.

Keep in mind that your Unload routine will not be called at system shutdown
time. If you need to do any special work at system shutdown, you'll need to
write a shutdown routine.

Table 6.4 Function prototype for an Unload routine

VOID XxUnload                       IRQL == PASSIVE_LEVEL
Parameter                           Description
IN PDRIVER_OBJECT DriverObject      Pointer to Driver object for this driver
Return value                        -

6.5 CODE EXAMPLE: DRIVER CLEANUP
This example shows how a simple driver removes itself from the system. You can
find the complete code for this example in the CH06 directory on the disk that
accompanies this book.

UNLOAD.C

The functions in this module basically just undo the work that was
performed in the DriverEntry code. Again, it assumes there aren't any Controller
objects to deal with.
XxDriverUnload In this case, the Unload routine is just a wrapper for calling
XxReleaseHardware.

VOID
XxDriverUnload( IN PDRIVER_OBJECT DriverObject )
{
    //
    // Stop interrupt processing and release hardware
    //
    XxReleaseHardware( DriverObject );
}

XxReleaseHardware The real cleanup work done by the driver happens
in this routine. It's been separated out as a helper routine because parts of the
driver initialization code need to perform the same kinds of cleanup.

VOID
XxReleaseHardware( IN PDRIVER_OBJECT DriverObject )
{
    PDEVICE_OBJECT pDevObj;
    PDEVICE_EXTENSION pDevExt;
    UNICODE_STRING linkName;
    WCHAR linkNameBuffer[ XX_MAX_NAME_LENGTH ];
    UNICODE_STRING number;
    WCHAR numberBuffer[10];
    CM_RESOURCE_LIST ResList;
    BOOLEAN bConflict;

    linkName.Buffer = linkNameBuffer;
    linkName.MaximumLength = XX_MAX_NAME_LENGTH;
    number.Buffer = numberBuffer;
    number.MaximumLength = 10;

    pDevObj = DriverObject->DeviceObject;           // (1)

    //
    // Traverse the list of Device objects
    // and clean up each one in turn...
    //
    while( pDevObj != NULL ) {
        pDevExt = pDevObj->DeviceExtension;

        //
        // Add code here to save the state of
        // the hardware in the Registry and/or
        // to set the hardware into a known condition.
        //

        //
        // Stop handling interrupts from device
        //
        XxDisableInterrupts( pDevExt );
        IoDisconnectInterrupt( pDevExt->pInterrupt );

        //
        // Deallocate hardware resources belonging     (2)
        // only to this Device object...
        //
        ResList.Count = 0;          // Build an empty list
        IoReportResourceUsage(
            NULL,                   // Default class name
            DriverObject,           // Ptr to Driver object
            NULL,                   // No driver resources
            0,
            pDevObj,                // Ptr to Device object
            &ResList,               // Device resources
            sizeof( ResList ),
            FALSE,
            &bConflict );           // Junk, but required

        //
        // Form the Win32 symbolic link name.
        //
        linkName.Length = 0;
        RtlAppendUnicodeToString(
            &linkName,
            XX_WIN32_DEVICE_NAME );

        //
        // Attach Win32 device number to the
        // end of the name; DOS device numbers
        // are one greater than NT numbers...
        //
        number.Length = 0;
        RtlIntegerToUnicodeString(
            pDevExt->NtDeviceNumber + 1,
            10,
            &number );
        RtlAppendUnicodeStringToString(
            &linkName,
            &number );

        //
        // Remove symbolic link from Object
        // namespace...
        //
        IoDeleteSymbolicLink( &linkName );

        //
        // Get address of next Device object
        // and get rid of the current one.
        //
        pDevObj = pDevObj->NextDevice;
        IoDeleteDevice( pDevExt->DeviceObject );
    }

    //
    // Deallocate hardware resources owned             (2)
    // by the Driver object...
    //
    ResList.Count = 0;              // Build an empty list
    IoReportResourceUsage(
        NULL,                       // Default class name
        DriverObject,               // Pointer to Driver object
        &ResList,                   // Driver resources
        sizeof( ResList ),
        pDevObj,                    // Pointer to Device object
        NULL,                       // Device resources
        0,
        FALSE,                      // Don't override conflicts
        &bConflict );               // Junk, but required
}


(1) We're going to run the linked list of Device objects in order to do our
    cleanup. Get the first Device object from the Driver object.
(2) The mechanics of actually releasing allocated hardware will be the subject
    of Chapter 7. For the moment, just treat these two calls to
    IoReportResourceUsage as a piece of necessary magic.
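
One detail in the loop deserves a second look: the routine fetches NextDevice before it deletes the current Device object. The user-mode sketch below isolates that pattern using malloc and free. The sim_device structure and the two functions are hypothetical stand-ins for illustration; they don't model anything else about Device objects.

```c
#include <stdlib.h>

/* Hypothetical stand-in for a Device object on the driver's chain. */
typedef struct sim_device {
    struct sim_device *next_device;
    int number;
} sim_device;

/* Push a new device on the front of the chain; NULL if allocation fails. */
sim_device *sim_create_device(sim_device **head, int number) {
    sim_device *dev = malloc(sizeof *dev);
    if (dev == NULL) return NULL;
    dev->number = number;
    dev->next_device = *head;
    *head = dev;
    return dev;
}

/* Delete every device on the chain; returns how many were released. */
int sim_release_all(sim_device **head) {
    int released = 0;
    sim_device *dev = *head;
    while (dev != NULL) {
        sim_device *next = dev->next_device;  /* read the link first... */
        free(dev);                            /* ...dev is invalid after this */
        dev = next;
        released++;
    }
    *head = NULL;
    return released;
}
```

Reversing the two statements in the loop body would read a pointer out of memory that has already been given back, which is the kind of bug that works fine in testing and fails in the field.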

6.6 WRITING SHUTDOWN ROUTINES
If your driver has any special processing to do before the operating system
disappears, you'll need to write a Shutdown routine.
Execution Context

The I/O Manager calls your Shutdown routine once during system
shutdown. As you can see from Table 6.5, the Shutdown routine runs at
PASSIVE_LEVEL IRQL, which means it has access to paged system resources.

Table 6.5 Function prototype for a Shutdown routine

NTSTATUS XxShutdown                 IRQL == PASSIVE_LEVEL
Parameter                           Description
IN PDEVICE_OBJECT DeviceObject      Pointer to the Device object receiving the request
IN PIRP Irp                         Pointer to shutdown IRP
Return value                        • STATUS_SUCCESS - success
                                    • STATUS_XXX - appropriate error code


What a Shutdown Routine Does

The main purpose of a Shutdown routine is to put the device into a known
state and perhaps store some device information in the Registry. Again, saving the
current volume settings from a sound card is a good example of something a
Shutdown routine would do.
Unlike the driver's Unload routine, Shutdown routines don't have to worry
about releasing driver resources because the operating system is about to
disappear anyway.
Enabling Shutdown Notification

If you examine the fields in the Driver object, it won't be obvious where the
address of your Shutdown routine should go. That's because shutdown
notifications are delivered to your driver in the form of an I/O request whose function
code is IRP_MJ_SHUTDOWN. This means that your Shutdown routine is really a
Dispatch routine which needs to be added to the Driver object's MajorFunction
array.
But wait, it doesn't stop there. You also need to tell the I/O Manager that
you're interested in receiving shutdown notifications. You do this by making a call
to IoRegisterShutdownNotification, passing it one of your Device objects.
The following code fragment, taken from a DriverEntry routine, shows how
to enable shutdown notifications in your driver.

NTSTATUS DriverEntry(
    IN PDRIVER_OBJECT pDO,
    IN PUNICODE_STRING RegistryPath
    )
{
    . . .
    pDO->MajorFunction[ IRP_MJ_SHUTDOWN ] = XxShutdown;
    //
    // The call takes a Device object, not the Driver
    // object; pDevObj is one created by IoCreateDevice
    //
    IoRegisterShutdownNotification( pDevObj );
    . . .
}

6.7 TESTING THE DRIVER
Even though your driver is far from being complete, there are still a few things
you can do at this point to verify its operation. In particular, you can test the
driver to be sure that it
• Compiles and links successfully
• Loads and unloads without crashing the system
• Creates Device objects and Win32 symbolic links
• Releases any resources when it unloads

These goals may not seem very ambitious, but once you've reached them,
you know you have a solid base on which to build the rest of your driver.
Testing Procedure

You can use the following procedure to test your driver. If any of the steps
fail, or if you crash the system, find and correct the problem before going on to the
next phase of the test.
1. Write a SOURCES file for your driver.
2. Use the BUILD utility to create the driver file.
3. Move the driver to its target destination.
4. Install the driver using REGEDT32. Specify manual loading.
5. Reboot the system.
6. Use the Control Panel Devices applet to load and start the driver.
7. Use WINOBJ to see if your driver has created a Device object and its Win32
   symbolic link.
8. Stop the driver using the Control Panel Devices applet.
9. Examine the Object Manager's namespace with WINOBJ to be certain the
   driver has removed any objects it created.

The WINOBJ Utility

WINOBJ is a tool that comes with the Win32 SDK (not the DDK). This little
gem lets you view the NT Object Manager's namespace and determine whether
your driver has created its Device object and symbolic link. Microsoft supplies
executable versions of WINOBJ for the Alpha, Intel, and MIPS architectures.
Unfortunately, you won't find any source code for WINOBJ since it makes direct
calls to some native NT system services.
To use WINOBJ, just run the executable. The program will display the window
pictured in Figure 6.2. The left pane shows the NT object directory in the
form of file folders. Double-clicking on a particular folder will show its contents
in the right window pane. Double-clicking on some objects in the right-hand pane
will display additional information about the object.3 As a driver writer, you'll be
mainly interested in the driver, DosDevices, and device directories.

3 WINOBJ is a little "throw-away" application that someone at Microsoft wrote. It doesn't know
how to display information about all object types, nor do all of its informational displays make
sense. Unfortunately, because it uses some of the "secret" NtXxx system calls, its source code isn't
included with the SDK.


Figure 6.2 Main window of the WINOBJ utility (left pane: object directories
such as \, ??, ArcName, BaseNamedObjects, Device, FileSystem, KnownDlls,
ObjectTypes, and Security; right pane: contents of the selected Device
directory, for example Beep, Floppy0, Floppy1, KeyboardClass0, Null, and
Parallel0)

6.8

SUMMARY
At this point, your driver is on its way. It can initialize itself and present both NT
and Win32 devices to the system. Depending on your specific needs, it may also
be able to perform various cleanup operations, either when it's unloaded manually
or when the system shuts down.
Unfortunately, your driver still can't locate the hardware it's supposed to be
managing. This is a serious deficiency for a device driver, and it's one we'll see
how to remedy in the next chapter.

CHAPTER 7

Hardware Initialization

One of the first things a device driver does is to
locate any devices it has to manage. This means finding their control registers,
determining their DMA capabilities and the IRQ levels at which they interrupt,
and locating any device-specific memory. In other words, the driver has to come
up with a list of the hardware resources used by its devices. This turns out to be a
much easier task if the hardware is auto-detectable. This chapter explains how to
determine the resources needed by a device regardless of whether it auto-detects
or not.
However, it's not enough to know what resources a device uses. Device
drivers also have to claim ownership of any hardware resources they plan to use,
in order to avoid collisions with other drivers. At the end of this chapter, you'll
learn how to allocate and deallocate system hardware.

7.1

FINDING AUTO-DETECTED HARDWARE
During system bootstrap, NT goes to a lot of trouble to figure out what kinds of
peripherals are attached to the system. This section explains how the process
works and how your driver can access auto-detected hardware information.
How Auto-Detection Works

The exact mechanism used for detecting hardware depends on the platform
architecture. On 80x86 systems, a bootstrap component called NTDETECT gathers
information about the hardware environment, while on RISC-based machines,
the ARC firmware performs a similar function. In either case, the detection
component makes this hardware data available to the operating system loader, which
in turn writes it into the \HARDWARE\DESCRIPTION area of the Registry.
Later, device drivers can use this information to control their initialization.
The detection components use whatever methods they can to determine the
identity and characteristics of a given system. This includes both interrogating the
hardware directly and using information in the ROM BIOS to draw conclusions
about devices attached to the system. Among other things, auto-detection
tries to determine
• The number and type of any I/O buses on the system
• Extended information about the bootstrap device itself
• Information about the monitor and video adapter used to display
  bootstrap messages
• The presence and location of keyboard and mouse hardware
• The number and location of serial and parallel controllers and any
  recognizable printers or terminals attached to them
• The presence and identity of any network cards
• Information about any other devices on each I/O bus

The specific kinds of data that auto-detection searches for include the
address and number of a device's control registers, hardware interrupt levels used
by the device, information about a device's DMA capabilities, and any ranges of
physical memory used by the device. If the hardware offers any device-specific
data, auto-detection will collect that as well.
This is a wonderful scheme, and it promises to make the lives of driver writers
much easier in the long run. Later releases of Windows NT will use this strategy
as a basis for supporting Plug and Play capabilities. At the moment, however,
most ISA devices don't have a lot to say for themselves and therefore don't show
up during auto-detection. This means that drivers of ISA devices have to use
other means for locating their hardware. Fortunately, PCI, native EISA, and MCA
devices are much more talkative.
Auto-Detected Hardware and the Registry

Regardless of how NT auto-detects a given piece of hardware, Registry
information about the hardware always has a standard format. This isolates drivers
from any bus or platform peculiarities and generally makes life easier for
driver writers. Figure 7.1 shows a portion of the Registry's hardware description
area.
The keys and subkeys below \System form a tree-structured model of
any auto-detectable hardware. Keys with alphanumeric names correspond to
general classes of hardware. Hanging from each of these keys will be one or more
subkeys whose names are integers. These numeric subkeys identify specific
instances of a CPU, a floating-point unit, a bus, a controller, or a device. In the
figure, the MultifunctionAdapter key represents a category of buses (in this case
ISA), and the subkey 0 below it represents the first actual instance of such a bus.
DiskController\0 is connected to this bus, and FloppyPeripheral\0 is attached to
this controller.

HKEY_LOCAL_MACHINE
  HARDWARE
    DESCRIPTION
      System
        MultifunctionAdapter
          0
            DiskController
              0
                FloppyPeripheral
                  0
                    ComponentInformation
                    ConfigurationData
                    Identifier

Figure 7.1 Auto-detected hardware data in the Registry
Tucked away in the numeric subkeys, you'll find value items containing any
information that NT was able to auto-detect. Three value items can show up in
one of these numeric subkeys:
• ComponentInformation - This is binary data that (hopefully) the driver
  will know how to interpret.
• ConfigurationData - This names the resources needed by the hardware
  in the form of a REG_FULL_RESOURCE_DESCRIPTOR item.
• Identifier - This is an identifier string generated by the hardware or the
  system BIOS. It's converted to Unicode when it goes into the Registry.

You can use the Registry editor, REGEDT32, to browse through this
auto-detected hardware data. This is very helpful if you're trying to resolve conflicts or
make sure that something is auto-detecting properly. Once you've selected a
controller's or peripheral's numeric subkey, double-clicking on the ConfigurationData
value will bring up a display of the resources needed by that piece of
hardware.
Querying the Hardware Database

Although you're free to wander through the hardware description area
using RtlXxx and ZwXxx routines, IoQueryDeviceDescription (shown in Table
7.1) makes the process a little less painful. You give this function a pattern
describing the kind of hardware information you want, and a callback routine.
IoQueryDeviceDescription will then rummage around in the Registry and invoke your
callback routine each time it finds something that matches the pattern.
You tell IoQueryDeviceDescription what level of detail you want by using the
XxxType arguments listed in Table 7.2. Only the following combinations will work:
• BusType alone gets just bus-level information.1
• BusType and ControllerType together get bus and controller information.
• BusType, ControllerType, and PeripheralType together will give you
  device-level information.

Table 7.1 Prototype for IoQueryDeviceDescription

NTSTATUS IoQueryDeviceDescription              IRQL == PASSIVE_LEVEL

Parameter                                      Description
IN PINTERFACE_TYPE BusType                     Desired bus architecture (see below)
IN PULONG BusNumber                            Zero-based bus number
IN PCONFIGURATION_TYPE ControllerType          Desired controller type (see below)
IN PULONG ControllerNumber                     Zero-based controller number
IN PCONFIGURATION_TYPE PeripheralType          Desired device type (see below)
IN PULONG PeripheralNumber                     Zero-based device number
IN PIO_QUERY_DEVICE_ROUTINE Callback           Address of ConfigCallback routine
IN PVOID Context                               Address of driver's configuration buffer
Return value                                   • STATUS_OBJECT_NAME_NOT_FOUND
                                               • STATUS_XXX from ConfigCallback

1 To get information about all the buses on a machine, call IoQueryDeviceDescription in a loop and
iterate the BusType from zero to MaximumInterfaceType. Alternatively, you can use the
HalQuerySystemInformation function to get an explicit list of the buses on the machine.

Table 7.2 Bus, controller, and peripheral types for IoQueryDeviceDescription

XxxType arguments for IoQueryDeviceDescription

BusType          ControllerType        PeripheralType
CBus             AudioController       DiskPeripheral
Eisa             CdromController       FloppyDiskPeripheral
Internal         DiskController        KeyboardPeripheral
Isa              DisplayController     LinePeripheral
MicroChannel     KeyboardController    ModemPeripheral
MPIBus           NetworkController     MonitorPeripheral
MPSABus          ParallelController    NetworkPeripheral
NuBus            PointerController     PointerPeripheral
PCIBus           SerialController      PrinterPeripheral
PCMCIABus        TapeController        TapePeripheral
TurboChannel     WormController        TerminalPeripheral
VMEBus           OtherController       OtherPeripheral

Notice that the XxxType arguments are pointers to variables and not the values
themselves. You pass a NULL pointer to indicate that you don't want a particular
kind of information.
You can get data about specific buses, controllers, or devices using one or
more of the XxxNumber parameters. These arguments are pointers to variables
containing the number of the bus, controller, or device that you're asking about.
Passing a NULL pointer causes the I/O Manager to enumerate all items of a
particular type.
To see how this works, suppose you call IoQueryDeviceDescription and
specify BusType as Eisa, BusNumber as 0, ControllerType as DiskController, and
NULL for the ControllerNumber. The I/O Manager will call your ConfigCallback
routine once for each disk controller on EISA bus 0. With each invocation, the
callback will receive data about EISA bus 0 and one particular controller, but
nothing about any devices connected to that controller. Since multiple disk
controllers can be attached to a single bus, the ConfigCallback might get the same
bus information more than once, even though the controller information will be
different each time.
Now, suppose you make the same call to IoQueryDeviceDescription, but
this time you further restrict the search by specifying PeripheralType as
FloppyDiskPeripheral and NULL for the PeripheralNumber. In this case, your
ConfigCallback will be called for each floppy drive on EISA bus 0. Along with bus and
controller data, each call will receive information about a different floppy disk
device. In this case, both the bus and controller information may be repeated for
multiple calls (because several floppies can share the same controller).


If IoQueryDeviceDescription can't find anything in the Registry that
matches your request, it returns STATUS_OBJECT_NAME_NOT_FOUND without
invoking the ConfigCallback routine. Otherwise, it continues to execute your
callback until it runs out of matching items, or until your callback returns a value
other than STATUS_SUCCESS. Either way, it's supposed to return the last
NTSTATUS value sent back by your callback routine.
That's the theory. In practice, if you pass a NULL BusNumber parameter,
you always get STATUS_OBJECT_NAME_NOT_FOUND from
IoQueryDeviceDescription. This value comes back regardless of whether your callback was
invoked, and it supersedes whatever status value your callback might have
returned. This problem doesn't occur with the other two XxxNumber arguments.
For this reason, the code example in the next section manually iterates both
BusType and BusNumber.
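
The enumeration rules described in this section are easier to see in a small
user-mode model. The following sketch is not kernel code and does not call
IoQueryDeviceDescription. The ToyQuery function, the TOY_ENTRY table, and the
TOY_ status values are all hypothetical stand-ins invented for illustration. It
assumes only the documented behavior: a NULL number pointer enumerates every
item, the callback runs once per match, enumeration stops when the callback
returns anything other than success, and a not-found status comes back when
nothing matches. It models the bus-plus-controller combination only, and it
does not reproduce the NULL BusNumber quirk.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical status codes for this model; not the real NTSTATUS values. */
typedef int TOY_STATUS;
#define TOY_SUCCESS        0
#define TOY_NAME_NOT_FOUND 1

/* One controller entry in a toy "hardware database". */
typedef struct {
    int      BusType;
    unsigned BusNumber;
    int      ControllerType;
    unsigned ControllerNumber;
} TOY_ENTRY;

typedef TOY_STATUS (*TOY_CALLBACK)(void *Context, const TOY_ENTRY *Entry);

/*
 * Toy model of the query. A NULL number pointer means "enumerate all";
 * a non-NULL pointer restricts the search to that one number.
 */
TOY_STATUS ToyQuery(const TOY_ENTRY *Db, size_t DbCount,
                    const int *BusType, const unsigned *BusNumber,
                    const int *ControllerType,
                    const unsigned *ControllerNumber,
                    TOY_CALLBACK Callback, void *Context)
{
    TOY_STATUS status = TOY_NAME_NOT_FOUND;   /* nothing matched yet */
    size_t i;

    assert(BusType != NULL && ControllerType != NULL && Callback != NULL);

    for (i = 0; i < DbCount; i++) {
        const TOY_ENTRY *e = &Db[i];

        if (e->BusType != *BusType)
            continue;
        if (BusNumber != NULL && e->BusNumber != *BusNumber)
            continue;
        if (e->ControllerType != *ControllerType)
            continue;
        if (ControllerNumber != NULL &&
            e->ControllerNumber != *ControllerNumber)
            continue;

        status = Callback(Context, e);        /* one call per match */
        if (status != TOY_SUCCESS)
            break;                            /* callback asked to stop */
    }
    return status;
}

/* Example callback: counts matches in the caller's context variable. */
TOY_STATUS ToyCountCallback(void *Context, const TOY_ENTRY *Entry)
{
    (void)Entry;
    (*(unsigned *)Context)++;
    return TOY_SUCCESS;
}
```

A caller that passes a bus number of 0 and a NULL controller number sees the
callback run once for each matching controller on bus 0, which mirrors the
EISA disk controller example earlier in this section.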
What a ConfigCallback Routine Does

Each time IoQueryDeviceDescription invokes your ConfigCallback routine,
it passes the arguments listed in Table 7.3. These arguments are valid only
within the ConfigCallback routine itself, so you have to store any configuration
data that you'll need later in a temporary buffer. Usually, you allocate this buffer
somewhere in your DriverEntry routine and pass its address as the Context
argument to IoQueryDeviceDescription.

Table 7.3 Function prototype for a configuration callback

NTSTATUS XxConfigCallback                                IRQL == PASSIVE_LEVEL

Parameter                                                Description
IN PVOID Context                                         Address of configuration buffer
IN PUNICODE_STRING PathName                              Registry path for bus, controller, or
                                                         device information
IN INTERFACE_TYPE BusType                                Bus architecture
IN ULONG BusNumber                                       Zero-based bus number
IN PKEY_VALUE_FULL_INFORMATION *BusInformation           Pointer to Registry information
IN CONFIGURATION_TYPE ControllerType                     Controller type
IN ULONG ControllerNumber                                Zero-based controller number
IN PKEY_VALUE_FULL_INFORMATION *ControllerInformation    Pointer to Registry information
IN CONFIGURATION_TYPE PeripheralType                     Device type
IN ULONG PeripheralNumber                                Zero-based device number
IN PKEY_VALUE_FULL_INFORMATION *PeripheralInformation    Pointer to Registry information
Return value                                             • STATUS_SUCCESS
                                                         • STATUS_XXX - error code
Although the specific steps will depend on the hardware you're working
with, a ConfigCallback routine generally does the following:
1. It scans the Registry information for base-register address, count of registers,
   interrupt level and vector information, and DMA channel requirements.
2. It stores the Registry values in the Config block allocated by DriverEntry.
3. It translates the Registry's bus-specific values into systemwide values that
   your driver can use and stores these values in the Config block as well.

Each time IoQueryDeviceDescription calls your ConfigCallback routine,
you repeat this procedure for a new controller or device that matches your query.
Using Configuration Data

Your main sources of information in a ConfigCallback routine come from the
various XxxType, XxxNumber, and XxxInformation arguments. The meaning of
the XxxType and XxxNumber items should be pretty obvious, but the
XxxInformation arguments need some explanation.
Each XxxInformation argument is actually a pointer which may or may not
be NULL, depending on what you've asked for. If you follow this pointer, you
come to an array of three items. Use one of these predefined constants to index
into this array:
• IoQueryDeviceIdentifier - Points to any auto-detected hardware name
  information stored in the Registry as a Unicode string.
• IoQueryDeviceConfigurationData - Points to any bus-relative Registry
  information about the bus, controller, or device that was discovered
  during auto-detection.
• IoQueryDeviceComponentInformation - Points to information about
  a device's subcomponents.

Of these, IoQueryDeviceConfigurationData is probably the most helpful.
Using this constant as an index into one of the XxxInformation arrays gets you a
pointer to a KEY_VALUE_FULL_INFORMATION structure which, in turn, contains
the actual Registry data about a bus, controller, or device. Figure 7.2 shows how this
works for the ControllerInformation argument to a ConfigCallback routine.
The group of CM_PARTIAL_RESOURCE_DESCRIPTOR items hanging
from the bottom of this whole mess contains the actual hardware information
you're looking for. As you can see from Table 7.4, each descriptor identifies one
kind of hardware resource used by the device. To extract this data, you need to
do a little pointer arithmetic and then examine each of the partial resource
descriptors.

ControllerInformation[ IoQueryDeviceConfigurationData ]
  -> KEY_VALUE_FULL_INFORMATION (DataOffset locates the data)
    -> CM_FULL_RESOURCE_DESCRIPTOR
      -> CM_PARTIAL_RESOURCE_LIST
        -> CM_PARTIAL_RESOURCE_DESCRIPTOR (one per resource)

Figure 7.2 Hardware information given to a configuration callback

Table 7.4 Contents of a partial resource descriptor

CM_PARTIAL_RESOURCE_DESCRIPTOR

Field                          Description
UCHAR Type                     Identifies resource being described:
                               • CmResourceTypePort
                               • CmResourceTypeInterrupt
                               • CmResourceTypeDma
                               • CmResourceTypeMemory
                               • CmResourceTypeDeviceSpecificData
UCHAR ShareDisposition         Level of sharing for this resource:
                               • CmResourceShareDeviceExclusive
                               • CmResourceShareDriverExclusive
                               • CmResourceShareShared
USHORT Flags                   Type-specific values
union u                        Union based on Type field
  struct Port                  • Control register address and span
  struct Interrupt             • Interrupt level and vector
  struct Dma                   • DMA channel and port
  struct Memory                • Device memory address and span
  struct DeviceSpecificData    • Device-specific information
There's something you need to be aware of when you start pulling information
from Partial Resource Descriptors: The partial descriptors are in no particular
order, so you need to walk through all of them to find the information you want.
The only exception to this is device-specific data, which, if present, will always be
the last partial descriptor.2
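
The pointer arithmetic and the unordered walk can be tried out in user mode
before they go anywhere near a driver. The sketch below uses MOCK_ structures
that are simplified stand-ins for the DDK types: the real structures carry more
fields, and the real full descriptor nests the partial list one level deeper.
MockBuildValue exists only to fabricate test data and has no kernel
counterpart. All of these names are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Mock resource type codes for this sketch. */
enum { MOCK_TYPE_PORT = 1, MOCK_TYPE_INTERRUPT = 2 };

/* Simplified partial descriptor: a type tag plus a type-specific union. */
typedef struct {
    unsigned char Type;
    union {
        struct { unsigned long Start; unsigned long Length; } Port;
        struct { unsigned long Level; unsigned long Vector; } Interrupt;
    } u;
} MOCK_PARTIAL_DESCRIPTOR;

/* Flattened full descriptor: a count followed by the descriptors. */
typedef struct {
    unsigned long Count;
    MOCK_PARTIAL_DESCRIPTOR PartialDescriptors[];   /* variable length */
} MOCK_FULL_DESCRIPTOR;

/* Header of the Registry value; the data sits DataOffset bytes from it. */
typedef struct {
    unsigned long DataOffset;
    unsigned long DataLength;
} MOCK_KEY_VALUE_INFO;

/* Test-data helper: builds header + data in one allocation. */
MOCK_KEY_VALUE_INFO *MockBuildValue(unsigned long HeaderBytes,
                                    const MOCK_PARTIAL_DESCRIPTOR *Descs,
                                    unsigned long Count)
{
    size_t dataBytes = sizeof(MOCK_FULL_DESCRIPTOR) +
                       Count * sizeof(MOCK_PARTIAL_DESCRIPTOR);
    MOCK_KEY_VALUE_INFO *info;
    MOCK_FULL_DESCRIPTOR *frd;

    assert(Descs != NULL && Count > 0);
    assert(HeaderBytes >= sizeof(MOCK_KEY_VALUE_INFO));
    assert(HeaderBytes % sizeof(unsigned long) == 0);  /* keep data aligned */

    info = calloc(1, HeaderBytes + dataBytes);
    if (info == NULL)
        return NULL;
    info->DataOffset = HeaderBytes;
    info->DataLength = (unsigned long)dataBytes;
    frd = (MOCK_FULL_DESCRIPTOR *)((unsigned char *)info + HeaderBytes);
    frd->Count = Count;
    memcpy(frd->PartialDescriptors, Descs,
           Count * sizeof(MOCK_PARTIAL_DESCRIPTOR));
    return info;
}

/* Skip the header: start of block plus DataOffset, counted in bytes. */
const MOCK_FULL_DESCRIPTOR *
MockLocateFullDescriptor(const MOCK_KEY_VALUE_INFO *Info)
{
    assert(Info != NULL);
    return (const MOCK_FULL_DESCRIPTOR *)
           ((const unsigned char *)Info + Info->DataOffset);
}

/* Descriptors are unordered, so scan all of them for the wanted type. */
const MOCK_PARTIAL_DESCRIPTOR *
MockFindDescriptor(const MOCK_FULL_DESCRIPTOR *Frd, unsigned char Type)
{
    unsigned long i;

    assert(Frd != NULL);
    for (i = 0; i < Frd->Count; i++) {
        if (Frd->PartialDescriptors[i].Type == Type)
            return &Frd->PartialDescriptors[i];
    }
    return NULL;    /* this kind of resource was not described */
}
```

The cast to a byte pointer before adding DataOffset is the important detail:
the offset is a byte count, so adding it to a structure pointer directly would
scale it by the structure size.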
Translating Configuration Data

After you've pulled all this data from the Registry, there's still one more step.
The information in the partial descriptors is all bus-relative, just the way the
auto-detection component found it. To use these values in your driver, you need to
translate them into their systemwide equivalents. Specifically, you need to call
some of the following functions:
• HalTranslateBusAddress - Converts device memory and register
  addresses from bus-relative to systemwide values.
• HalGetInterruptVector - Converts bus-specific interrupt information
  into system-assigned values for the vector, DIRQL, and affinity mask.
  Chapter 9 explains how to use these values to connect to an Interrupt
  object.
• HalGetAdapter - Locates an Adapter object your driver can use to
  perform DMA operations with a specific device. Chapter 12 explains how to
  use this function.

It's worth mentioning that, in some environments, some of these translations
may not do very much, but for portability, you need to perform them
anyway.
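
To make the portability point concrete, here is a toy user-mode model of
address translation. It is not HalTranslateBusAddress and makes no claim about
how any particular HAL works. The TOY_BUS table, the window base values, and
ToyTranslateBusAddress are hypothetical. The sketch assumes two kinds of
platform: one where a bus-relative port address is already the system address,
and one where ports appear inside a memory-mapped window, so the translation
both relocates the address and changes its address space.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_SPACE_MEMORY 0u
#define TOY_SPACE_IO     1u

/* Hypothetical description of one bus on a toy platform. */
typedef struct {
    int                BusType;
    unsigned           BusNumber;
    int                IoIsMemoryMapped;  /* nonzero: ports live in memory */
    unsigned long long IoWindowBase;      /* used only when memory-mapped */
} TOY_BUS;

/*
 * Translate a bus-relative address. Returns 1 on success and 0 if the bus
 * is unknown. *AddressSpace is both input and output: a port address may
 * come back as a memory address on a memory-mapped platform.
 */
int ToyTranslateBusAddress(const TOY_BUS *Buses, size_t BusCount,
                           int BusType, unsigned BusNumber,
                           unsigned long long BusAddress,
                           unsigned *AddressSpace,
                           unsigned long long *Translated)
{
    size_t i;

    assert(AddressSpace != NULL && Translated != NULL);

    for (i = 0; i < BusCount; i++) {
        const TOY_BUS *b = &Buses[i];

        if (b->BusType != BusType || b->BusNumber != BusNumber)
            continue;

        if (*AddressSpace == TOY_SPACE_IO && b->IoIsMemoryMapped) {
            *Translated = b->IoWindowBase + BusAddress;
            *AddressSpace = TOY_SPACE_MEMORY;
        } else {
            *Translated = BusAddress;     /* translation is a no-op here */
        }
        return 1;
    }
    return 0;
}
```

On the first kind of platform the call changes nothing, which is exactly the
case where skipping it would appear to work. The same driver source would
break on the second kind, so the call stays in.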

7.2

CODE EXAMPLE: LOCATING AUTO-DETECTED HARDWARE
This rather long example shows how to pull auto-detected hardware information
from the Registry. Specifically, it looks for all the hardware of type
ParallelController. You can find these files in the CH07 directory on the disk that accompanies
this book.

2 This is because device-specific data is variable in length. Another implication is that there can be
only one device-specific data item in a group of partial resource descriptors.


XXDRIVER.H

The following excerpts from the driver's header file show the driver-defined
data structures involved in hardware configuration.3

DEVICE_BLOCK This temporary structure is carved out of paged pool
and is used only during driver initialization. It holds information about one
specific piece of hardware. Some of the items in this block will later be copied into the
Device Extension block for safekeeping.

typedef struct _DEVICE_BLOCK {
    //
    // Original values pulled from the Registry
    //
    PHYSICAL_ADDRESS OriginalPortBase;
    ULONG PortSpan;
    ULONG OriginalIrql;
    ULONG OriginalVector;
    KINTERRUPT_MODE InterruptMode;
    BOOLEAN ShareVector;
    BOOLEAN FloatingSave;
    ULONG OriginalDmaChannel;
    //
    // Converted values that will be used by
    // the driver
    //
    PUCHAR PortBase;    // First control register
    ULONG SystemVector;
    KIRQL Dirql;
    KAFFINITY Affinity;
} DEVICE_BLOCK, *PDEVICE_BLOCK;

CONFIG_ARRAY This structure is an array of DEVICE_BLOCKs that hold
temporary information about all the hardware belonging to the driver on one
particular bus. In theory, multiple devices might show up on different buses, in
which case there would be a linked list of CONFIG_ARRAYs. The Count field
keeps track of how many DEVICE_BLOCKs actually contain valid data.

typedef struct _CONFIG_ARRAY {
    //
    // We keep a list of these arrays, one
    // for each bus-type/bus-number combination
    // where we find our hardware.
    //
    struct _CONFIG_ARRAY *NextConfigArray;
    //
    // The bus to which all the devices in this
    // array are attached.
    //
    INTERFACE_TYPE BusType;
    ULONG BusNumber;
    //
    // Number of devices in this array
    //
    ULONG Count;
    //
    // One array-element for each device
    //
    DEVICE_BLOCK Device[ XX_MAXIMUM_DEVICES ];
} CONFIG_ARRAY, *PCONFIG_ARRAY;

3 You'll notice some DMA-related fields in the following structures. Since the parallel port doesn't
perform any DMA, these won't be used. Chapter 12 will show you how to fill them in.

DEVICE_EXTENSION This driver-defined structure is created from
nonpaged pool by IoCreateDevice and automatically attached to our Device object.
It holds information that will be needed throughout the life of the driver.

typedef struct _DEVICE_EXTENSION {
    PDEVICE_OBJECT DeviceObject;    // Back pointer
    ULONG NtDeviceNumber;           // Zero-based device num
    PUCHAR PortBase;                // First control register

    PKINTERRUPT pInterrupt;         // Interrupt object

    PADAPTER_OBJECT pAdapter;       // DMA Adapter object
    ULONG cMapRegs;                 // Count of mapping regs
    UCHAR DeviceStatus;             // Most recent status

} DEVICE_EXTENSION, *PDEVICE_EXTENSION;

AUTOCON.C

This group of functions scans the Registry's hardware description map for
all the parallel controllers. It fills in a separate DEVICE_BLOCK for each piece of
hardware it finds. The result is a linked list of CONFIG_ARRAYs describing all
the parallel controllers on all buses in this machine.

XxGetHardwareInfo This routine just loops through all the known bus
types and checks to see if one or more of our devices live on each bus. This is
mainly a harness for the call to IoQueryDeviceDescription.


NTSTATUS
XxGetHardwareInfo(
    IN PUNICODE_STRING RegistryPath,    // (unused)
    OUT PCONFIG_ARRAY *ConfigList
    )
{
    INTERFACE_TYPE InterfaceType;
    ULONG InterfaceNumber;
    CONFIGURATION_TYPE CtrlrType = ParallelController;    // (1)
    PCONFIG_ARRAY ConfigArray;
    NTSTATUS status;

    *ConfigList = NULL;    // No devices located yet
    //
    // Run through all the various bus types and
    // see if our device is on any of them...
    //
    for( InterfaceType = 0;
         InterfaceType < MaximumInterfaceType;
         InterfaceType++ )
    {
        InterfaceNumber = 0;
        do {
            status = IoQueryDeviceDescription(             // (2)
                        &InterfaceType,
                        &InterfaceNumber,
                        &CtrlrType,
                        NULL,
                        NULL,
                        NULL,
                        XxConfigCallback,
                        ConfigList );
            //
            // Return to caller if a real
            // error occurs
            //
            if( !NT_SUCCESS( status )                      // (3)
                && status !=
                    STATUS_OBJECT_NAME_NOT_FOUND )
            {
                XxReleaseHardwareInfo(
                    *ConfigList );
                *ConfigList = NULL;
                return status;
            }
            InterfaceNumber++;
        } while( status !=
                    STATUS_OBJECT_NAME_NOT_FOUND );
    } // end of for loop

    if( *ConfigList == NULL )
        return STATUS_NO_SUCH_DEVICE;
    else
        return STATUS_SUCCESS;
}

(1) This is the hardware category. Notice that the parallel port is considered
    to be a controller rather than a device.
(2) Since we're specifying a controller type, our callback will be invoked once
    for each piece of hardware on the current bus that matches the
    ParallelController type.
(3) STATUS_OBJECT_NAME_NOT_FOUND simply means there is no such
    item on the current bus, so we keep looking. Other kinds of errors
    cause us to abort.

XxConfigCallback This routine gets called by the I/O Manager once for
each device that matches the category ParallelController. We have to scan through
the Registry data for information about I/O port addresses and interrupt behavior.

static NTSTATUS
XxConfigCallback(
    IN PVOID Context,
    IN PUNICODE_STRING PathName,
    IN INTERFACE_TYPE BusType,
    IN ULONG BusNumber,
    IN PKEY_VALUE_FULL_INFORMATION *BusInfo,
    IN CONFIGURATION_TYPE CtrlrType,
    IN ULONG CtrlrNumber,
    IN PKEY_VALUE_FULL_INFORMATION *CtrlrInfo,
    IN CONFIGURATION_TYPE DeviceType,
    IN ULONG DeviceNumber,
    IN PKEY_VALUE_FULL_INFORMATION *DeviceInfo
    )
{
    //
    // So we don't have to typecast the context.
    //
    PCONFIG_ARRAY *ConfigList = Context;
    //
    // Short-hand pointers to resource data
    //
    PCM_FULL_RESOURCE_DESCRIPTOR pFrd;
    PCM_PARTIAL_RESOURCE_DESCRIPTOR pPrd;
    PCONFIG_ARRAY ConfigArray;
    PDEVICE_BLOCK DeviceBlock;
    //
    // These booleans will tell us whether we got
    // all the information that we needed.
    //
    BOOLEAN bFoundPort = FALSE;
    BOOLEAN bFoundInterrupt = FALSE;
    NTSTATUS status;
    ULONG i;    // Generic loop control

    //
    // Locate the Config Array for this bus
    //
    status = XxFindMatchingConfigArray(                    // (1)
                BusType,
                BusNumber,
                ConfigList,
                &ConfigArray );
    if( !NT_SUCCESS( status ) )
    {
        return status;
    }
    //
    // See if there's any room left in the Config
    // Array; if not, just drop this device on the
    // floor
    //
    if( ConfigArray->Count >= XX_MAXIMUM_DEVICES )
    {
        return STATUS_SUCCESS;
    }
    //
    // Make it easier to refer to the slot in the
    // Config Array belonging to this device
    //
    DeviceBlock =
        &ConfigArray->Device[ ConfigArray->Count ];
    //
    // Get pointer to beginning of configuration
    // data for this device in the Registry
    //
    pFrd = (PCM_FULL_RESOURCE_DESCRIPTOR)                  // (2)
            (((PUCHAR)CtrlrInfo
                [ IoQueryDeviceConfigurationData ])
             + CtrlrInfo
                [ IoQueryDeviceConfigurationData ]
                    ->DataOffset );
    //
    // Loop through all Partial Resource Descriptors
    // looking for Port and Interrupt information
    //
    for( i = 0;                                            // (3)
         i < pFrd->PartialResourceList.Count;
         i++ )
    {
        pPrd = &pFrd->PartialResourceList
                    .PartialDescriptors[ i ];
        //
        // Switch on the various partial resource
        // types. Pull out the pieces we need...
        //
        switch( pPrd->Type )                               // (4)
        {
            case CmResourceTypePort:
                bFoundPort =
                    XxGetPortInfo(
                        pPrd,
                        BusType,
                        BusNumber,
                        DeviceBlock );
                break;
            case CmResourceTypeInterrupt:
                bFoundInterrupt =
                    XxGetInterruptInfo(
                        pPrd,
                        BusType,
                        BusNumber,
                        DeviceBlock );
                break;
            default:
                break;
        } // end of switch
    } // end of for-loop

    if( !( bFoundPort && bFoundInterrupt ) )               // (5)
    {
        return STATUS_NO_SUCH_DEVICE;
    }
    //
    // Account for the slot that we've just
    // filled up...
    //
    ConfigArray->Count++;                                  // (6)
    return STATUS_SUCCESS;
}

(1) XxFindMatchingConfigArray is a helper function that locates the Config
    Array for a specific bus type and number combination. If this is the first
    time a particular bus has been encountered, it creates an empty Config
    Array and links it into the caller-supplied Config List.
(2) Create a pointer to the Full Resource Descriptor for this device. To do this,
    we need to skip over the header information by adding the DataOffset
    field to the starting address of the block.
(3) The Partial Resource Descriptors are in no particular order, so we have to
    loop through all of them looking for information about ports and
    interrupts. Anything we don't recognize, we ignore.
(4) Switch on the Partial Resource type and call a helper function to extract
    the useful information from it. The parallel controller needs only port and
    interrupt data; for other devices you might need to add cases for
    CmResourceTypeDma, CmResourceTypeMemory, or
    CmResourceTypeDeviceSpecificData.
(5) When the entire scan is complete, check to be sure that all the components
    have been found. If anything is missing, signal an error.
(6) Each time we successfully locate a device, we use up one more slot in the
    Config Array. The Count field keeps track of this.

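
The find-or-create helper called at the top of XxConfigCallback is described in
the first note but not listed in this excerpt. One plausible shape for it,
written as a user-mode sketch, is shown below. This is an assumption about how
such a helper could work, not the listing from the accompanying disk. The TOY_
types are abbreviated stand-ins for CONFIG_ARRAY and DEVICE_BLOCK, calloc and
free stand in for kernel pool allocation, and the integer return codes stand in
for NTSTATUS values.

```c
#include <assert.h>
#include <stdlib.h>

#define TOY_MAXIMUM_DEVICES 4

/* Abbreviated stand-in for DEVICE_BLOCK. */
typedef struct {
    unsigned long PortSpan;
} TOY_DEVICE_BLOCK;

/* Stand-in for CONFIG_ARRAY: one node per bus type/number combination. */
typedef struct TOY_CONFIG_ARRAY_TAG {
    struct TOY_CONFIG_ARRAY_TAG *NextConfigArray;
    int              BusType;
    unsigned long    BusNumber;
    unsigned long    Count;
    TOY_DEVICE_BLOCK Device[TOY_MAXIMUM_DEVICES];
} TOY_CONFIG_ARRAY;

/*
 * Find the array for (BusType, BusNumber), creating an empty one and
 * linking it into *ConfigList if this bus has not been seen before.
 * Returns 0 on success, -1 if the allocation fails.
 */
int ToyFindMatchingConfigArray(int BusType, unsigned long BusNumber,
                               TOY_CONFIG_ARRAY **ConfigList,
                               TOY_CONFIG_ARRAY **ConfigArray)
{
    TOY_CONFIG_ARRAY *p;

    assert(ConfigList != NULL && ConfigArray != NULL);

    for (p = *ConfigList; p != NULL; p = p->NextConfigArray) {
        if (p->BusType == BusType && p->BusNumber == BusNumber) {
            *ConfigArray = p;             /* bus already known */
            return 0;
        }
    }

    p = calloc(1, sizeof *p);             /* zeroed, so Count starts at 0 */
    if (p == NULL)
        return -1;
    p->BusType = BusType;
    p->BusNumber = BusNumber;
    p->NextConfigArray = *ConfigList;     /* push onto the front */
    *ConfigList = p;
    *ConfigArray = p;
    return 0;
}

/* Free every array in the list, as a release routine would. */
void ToyReleaseConfigList(TOY_CONFIG_ARRAY *List)
{
    while (List != NULL) {
        TOY_CONFIG_ARRAY *next = List->NextConfigArray;
        free(List);
        List = next;
    }
}
```

Pushing new arrays onto the front keeps the insert trivial. Order does not
matter here, because later code walks the whole list anyway.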
XxGetPortInfo and XxGetInterruptInfo Here are the two helper functions.
Each one simply pulls information out of a specific kind of Partial Resource
Descriptor and stores it in the appropriate fields of a DEVICE_BLOCK. They also
translate bus-specific values into their systemwide equivalents.

//++
// Function:
//      XxGetPortInfo
//
// Description:
//      This function pulls I/O Port information
//      from a Partial Resource Descriptor
//
// Arguments:
//      Pointer to a Partial Resource Descriptor
//      Bus type for this device
//      Bus number for this device
//      Pointer to this device's slot in Config Array
//
// Return Value:
//      This function returns TRUE if we found the
//      data we wanted, FALSE otherwise.
//--
static BOOLEAN
XxGetPortInfo(
    IN PCM_PARTIAL_RESOURCE_DESCRIPTOR pPrd,
    IN INTERFACE_TYPE BusType,
    IN ULONG BusNumber,
    IN PDEVICE_BLOCK DeviceBlock
    )
{
    PHYSICAL_ADDRESS TranslatedPortBase;
    ULONG uAddressSpace = 1;

    DeviceBlock->OriginalPortBase =
        pPrd->u.Port.Start;

    DeviceBlock->PortSpan =
        pPrd->u.Port.Length;

    if( !HalTranslateBusAddress(
            BusType,
            BusNumber,
            DeviceBlock->OriginalPortBase,
            &uAddressSpace,
            &TranslatedPortBase ) )
    {
        return FALSE;
    }

    DeviceBlock->PortBase =
        (PUCHAR)TranslatedPortBase.LowPart;
    return TRUE;
}

//++
// Function:
//      XxGetInterruptInfo
//
// Description:
//      This function pulls Interrupt information
//      from a Partial Resource Descriptor
//
// Arguments:
//      Pointer to a Partial Resource Descriptor
//      Bus type for this device
//      Bus number for this device
//      Pointer to this device's slot in Config Array
//
// Return Value:
//      This function returns TRUE if we found the
//      data we wanted, FALSE otherwise.
//--
static BOOLEAN
XxGetInterruptInfo(
    IN PCM_PARTIAL_RESOURCE_DESCRIPTOR pPrd,
    IN INTERFACE_TYPE BusType,
    IN ULONG BusNumber,
    IN PDEVICE_BLOCK DeviceBlock
    )
{
    if( pPrd->Flags == CM_RESOURCE_INTERRUPT_LATCHED )
        DeviceBlock->InterruptMode = Latched;
    else
        DeviceBlock->InterruptMode = LevelSensitive;

    DeviceBlock->OriginalIrql =
        pPrd->u.Interrupt.Level;
    DeviceBlock->OriginalVector =
        pPrd->u.Interrupt.Vector;
    DeviceBlock->ShareVector = FALSE;
    DeviceBlock->FloatingSave = FALSE;

    DeviceBlock->SystemVector =
        HalGetInterruptVector(
            BusType,
            BusNumber,
            pPrd->u.Interrupt.Level,
            pPrd->u.Interrupt.Vector,
            &DeviceBlock->Dirql,
            &DeviceBlock->Affinity );
    return TRUE;
}

7.3

FINDING UNRECOGNIZED HARDWARE
If your device doesn't show up under auto-detection, or if you just need to
supplement the auto-detected information, you can hard-code additional information
into the Registry. This section explains how.


Adding Driver Parameters to the Registry

One way to tell your driver about hardware is to hard-code the information
in a nonvolatile area of the Registry. Although this doesn't seem like a very
elegant solution, in the absence of any auto-detection capabilities, it may be your
only option. Many ISA devices will require the use of this technique.
The standard convention is to store device information in one or more value
entries beneath a subkey called Parameters, which dangles off the driver's service
key in the Registry. Figure 7.3 shows how this works. It's usually up to the driver's
installation procedure to set up the Parameters area. For example, suppose your
driver works with a device that the user has to configure manually with DIP
switches. When the driver's installation program runs, it displays a dialog box
asking the user for the port address, IRQ, and DMA settings selected on the device. It
then stores this information in the Parameters area where the driver can find it.
There are no particular standards for the format of driver-specific parameter
data. You simply need to store the same kinds of information that your device
would generate if it auto-detected. As we've already seen, this can include the
addresses of any control registers, the IRQ level used by the device, information
about its DMA capabilities, and the address and span of any device memory. If
your driver supports multiple devices, it's probably a good idea to create separate
subkeys underneath Parameters for each individual device. In Figure 7.3, these
are the Device0 and Device1 subkeys.
Retrieving Parameters from the Registry

You use RtlQueryRegistryValues (described in Table 7.5) to retrieve values
from the Parameters subkey of your driver's Registry key. This is a very powerful
function, and if you're going to be doing anything fancy with the Registry, you
should become familiar with all its capabilities. For our purposes, we won't need
to do much with it except translate a few value names.

HKEY_LOCAL_MACHINE
  System
    CurrentControlSet
      Services
        XxDriver
          Parameters
            Device0
            Device1

Value entries shown in the figure: PORT: REG_DWORD: 0x378, SPAN: REG_DWORD: 0x3, IRQ: REG_DWORD: 0x7

Copyright © 1994 by Cydonix Corporation.

Figure 7.3  Registry path for driver-specific parameters

Table 7.5  Prototype for RtlQueryRegistryValues function

NTSTATUS RtlQueryRegistryValues             IRQL == PASSIVE_LEVEL

Parameter                                   Description
IN ULONG RelativeTo                         Specifies beginning of Registry path
                                            • RTL_REGISTRY_ABSOLUTE
                                            • RTL_REGISTRY_SERVICES
                                            • RTL_REGISTRY_CONTROL
                                            • RTL_REGISTRY_WINDOWS_NT
                                            • RTL_REGISTRY_DEVICE_MAP
                                            • RTL_REGISTRY_USER
                                            • RTL_REGISTRY_OPTIONAL
                                            • RTL_REGISTRY_HANDLE
IN PWSTR Path                               Identifies an absolute or relative path
IN PRTL_QUERY_REGISTRY_TABLE QueryTable     Address of a table describing the query
IN PVOID Context                            Context passed to a QueryRoutine
IN PVOID Environment                        Environment block used to expand any
                                            REG_EXPAND_SZ registry entries
Return value                                • STATUS_SUCCESS
                                            • STATUS_INVALID_PARAMETER
                                            • STATUS_OBJECT_NAME_NOT_FOUND
To work with RtlQueryRegistryValues, you need to construct a query table
describing the values you want to translate. The query table is an array of
RTL_QUERY_REGISTRY_TABLE items terminated with an entry containing
NULL QueryRoutine and Name fields. Table 7.6 shows the format of the
individual items.
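The terminator convention is easy to try outside the kernel. What follows is a minimal user-mode sketch rather than DDK code: QUERY_ENTRY is a hypothetical stand-in that borrows the field names of RTL_QUERY_REGISTRY_TABLE with simplified types, and BuildPortQuery and CountQueryEntries are illustrative helpers, not system routines. It shows why zeroing the whole array before filling it in leaves a valid terminator in the last slot.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for RTL_QUERY_REGISTRY_TABLE. The field names
 * follow the real structure; the types are simplified for user mode. */
typedef struct {
    void          *QueryRoutine;   /* optional callback            */
    unsigned long  Flags;          /* RTL_QUERY_REGISTRY_xxx bits  */
    const wchar_t *Name;           /* value name to look up        */
    void          *EntryContext;   /* where the result should go   */
    unsigned long  DefaultType;
    void          *DefaultData;
    unsigned long  DefaultLength;
} QUERY_ENTRY;

/* Count the entries in front of the terminator. The terminator is
 * the first entry whose QueryRoutine and Name are both NULL. */
size_t CountQueryEntries(const QUERY_ENTRY *table)
{
    size_t n = 0;

    assert(table != NULL);
    while (table[n].QueryRoutine != NULL || table[n].Name != NULL)
        n++;
    return n;
}

/* Fill a caller-supplied table with three value queries. Zeroing the
 * whole array first means the unused last slot is already a valid
 * terminator. Returns 0 on success, -1 if the table is too small. */
int BuildPortQuery(QUERY_ENTRY *table, size_t capacity,
                   unsigned long *port, unsigned long *span,
                   unsigned long *irq)
{
    if (capacity < 4)
        return -1;                 /* 3 queries + 1 terminator */
    memset(table, 0, capacity * sizeof(table[0]));
    table[0].Name = L"PORT";  table[0].EntryContext = port;
    table[1].Name = L"SPAN";  table[1].EntryContext = span;
    table[2].Name = L"IRQ";   table[2].EntryContext = irq;
    return 0;
}
```

Sizing the array one element larger than the number of queries is the part that is easy to forget; a table with no spare slot has no terminator, and the reader of the table runs off the end.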
As with auto-detected hardware information, it's a good idea to store the
Registry data in a configuration buffer that other parts of your DriverEntry
routine can use. That way, you can move the driver to an auto-detecting environment
without having to rewrite too much code. Also remember that values from the
Registry still must be translated into systemwide values.
Other Sources of Device Information

Before we look at an example of using the Registry, it's worth mentioning
some other sources of hardware information. The first is the HalGetBusData
function, which allows you to interrogate a specific slot on a specific bus. This
function returns a buffer containing any device-specific data available from a
device. HalGetBusData is only useful if you're working with buses like PCI or
EISA that generate a lot of information.

Table 7.6  Query table entries

RTL_QUERY_REGISTRY_TABLE

Field                                       Description
PRTL_QUERY_REGISTRY_ROUTINE QueryRoutine    Optional query routine to be called for each item
                                            found in the Registry
ULONG Flags                                 Control interpretation of other fields
                                            • RTL_QUERY_REGISTRY_SUBKEY
                                            • RTL_QUERY_REGISTRY_TOPKEY
                                            • RTL_QUERY_REGISTRY_REQUIRED
                                            • RTL_QUERY_REGISTRY_NOVALUE
                                            • RTL_QUERY_REGISTRY_NOEXPAND
                                            • RTL_QUERY_REGISTRY_DIRECT
PWSTR Name                                  Name of the value caller wants to query
PVOID EntryContext                          32-bit value to be passed to QueryRoutine
ULONG DefaultType                           Type of data
PVOID DefaultData                           Data item to be used if queried item not present
ULONG DefaultLength                         Default length of data item

Also, the I/O Manager keeps a data structure that tracks the number of disk,
tape, floppy, SCSI-HBA, serial, and parallel Device objects that have been created
by various drivers. Calling IoGetConfigurationInformation returns a pointer to
this structure, which you can use to pick an appropriate number for a new device
name. It's also your responsibility to increment the counts in this structure if you
create any of the device types listed above.
Finally, if none of the techniques we've looked at will work, you may have
no alternative but to locate your hardware by poking various control register
addresses. This is a potentially dangerous and error-prone way to do things. If you
take this approach, make sure you temporarily allocate the hardware before you
fiddle with it. If the allocation fails, don't touch the hardware. Otherwise, you
may be doing something that confuses an already-loaded driver that owns the
hardware and has put it into a specific state.

7.4

CODE EXAMPLE: QUERYING THE REGISTRY
Here is another hardware locator. This one pulls information about ISA cards
from the Parameters subkey of the driver's service key. You can find this code in
the CH07 directory on the disk that accompanies this book.


REGCON.C

This group of functions scans the driver's Parameters key looking for
subkeys with names like Device0, Device1, and so on. Each time it finds one, it fills
out another DEVICE_BLOCK using values from the Registry.

XxGetHardwareInfo This routine checks for the existence of an ISA bus on
the machine; if no ISA bus shows up, it checks for an EISA bus where the ISA card
might live. If neither type of bus exists on this machine, the routine fails. This
indirect approach is necessary because ISA cards don't give any feedback about
their presence.

NTSTATUS
XxGetHardwareInfo(
    IN PUNICODE_STRING RegistryPath,
    OUT PCONFIG_ARRAY *ConfigList
    )
{
    NTSTATUS status;
    PCONFIG_ARRAY ConfigArray;
    INTERFACE_TYPE BusType;
    ULONG BusNumber;
    UNICODE_STRING TempString;

    //
    // Check for a bus we can use. Look for an ISA bus
    // first, then look for an EISA bus. If neither one
    // shows up, quit.
    //
    BusType = Isa;
    BusNumber = 0;
    status = XxCheckForBus( Isa, BusNumber );
    if( !NT_SUCCESS( status ))
    {
        BusType = Eisa;
        status = XxCheckForBus( Eisa, BusNumber );
    }
    if( !NT_SUCCESS( status ))
    {
        *ConfigList = NULL;
        return STATUS_NO_SUCH_DEVICE;
    }

    //
    // We found a compatible bus. Allocate
    // space for the (single) Config array
    // that we'll be passing back to the
    // caller.
    //
    if(( ConfigArray =
            ExAllocatePool(
                PagedPool,
                sizeof( CONFIG_ARRAY )))
        == NULL )
    {
        *ConfigList = NULL;
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    RtlZeroMemory(
        ConfigArray,
        sizeof( CONFIG_ARRAY ));
    *ConfigList = ConfigArray;
    ConfigArray->BusType = BusType;
    ConfigArray->BusNumber = BusNumber;

    //
    // Make a copy of the Registry path name
    // and be sure it has a terminator at the
    // end...
    //
    TempString.Length = 0;                          // ❶
    TempString.MaximumLength =
        RegistryPath->Length +
        sizeof( UNICODE_NULL );
    if(( TempString.Buffer =
            ExAllocatePool(
                PagedPool,
                TempString.MaximumLength ))
        == NULL )
    {
        *ConfigList = NULL;
        ExFreePool( ConfigArray );
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    RtlCopyUnicodeString( &TempString, RegistryPath );
    // Length is a byte count, so convert it to a character index
    TempString.Buffer[ TempString.Length / sizeof( WCHAR ) ] =
        UNICODE_NULL;

    //
    // Keep looping until we run out of device
    // slots or Registry entries, or until an
    // error occurs.
    //
    ConfigArray->Count = 0;
    while( ConfigArray->Count < XX_MAXIMUM_DEVICES )    // ❷
    {
        status = XxFindNextDevice(
                    BusType,
                    BusNumber,
                    &TempString,
                    ConfigArray );

        if( !NT_SUCCESS( status )) break;
        ConfigArray->Count++;
    } // end while-loop

    ExFreePool( TempString.Buffer );
    if( !NT_SUCCESS( status ) &&
        status != STATUS_OBJECT_NAME_NOT_FOUND )        // ❸
    {
        *ConfigList = NULL;
        ExFreePool( ConfigArray );
        return status;
    }

    //
    // See if we found anything after all
    // that work
    //
    if( ConfigArray->Count == 0 )                       // ❹
    {
        *ConfigList = NULL;
        ExFreePool( ConfigArray );
        return STATUS_NO_SUCH_DEVICE;
    }

    //
    // Everything worked...
    //
    return STATUS_SUCCESS;
}


❶ We need to go through all these shenanigans because the RegistryPath
argument is a counted UNICODE_STRING object, but the Registry query
function wants a NULL-terminated array of Unicode characters.
❷ This loop keeps going until we run out of slots in the Configuration block,
or until we don't find a matching entry in the Registry. The organization
of this routine means that all the DeviceN subkeys must be consecutive.
❸ STATUS_OBJECT_NAME_NOT_FOUND means we ran out of DeviceN
subkeys, but it's not really an error.
❹ There must have been at least one valid set of parameter information, or
there's a problem somewhere.
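The conversion from a counted string to a terminated one is easy to get subtly wrong, because the length fields of a UNICODE_STRING count bytes while the buffer is indexed in characters. The following user-mode sketch illustrates the arithmetic under stated assumptions: COUNTED_STRING is a hypothetical stand-in for UNICODE_STRING, DupTerminated is an illustrative helper, and malloc takes the place of ExAllocatePool.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical stand-in for UNICODE_STRING. As in the real structure,
 * both length fields are byte counts, not character counts. */
typedef struct {
    unsigned short Length;         /* bytes in use, no terminator */
    unsigned short MaximumLength;  /* bytes available in Buffer   */
    wchar_t       *Buffer;
} COUNTED_STRING;

/* Return a heap copy of src with a terminating L'\0' appended, or
 * NULL if the allocation fails. The caller frees the result. */
wchar_t *DupTerminated(const COUNTED_STRING *src)
{
    size_t   chars = src->Length / sizeof(wchar_t);
    wchar_t *copy  = malloc((chars + 1) * sizeof(wchar_t));

    if (copy == NULL)
        return NULL;
    memcpy(copy, src->Buffer, chars * sizeof(wchar_t));
    copy[chars] = L'\0';   /* index by characters, not by bytes */
    return copy;
}
```

Writing the terminator at index Length instead of Length divided by the character size would land well past the end of the allocation for any non-empty string.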
XxFindNextDevice This function extracts information about one device
from the driver's service key and stores it in a slot in the Configuration block.

static NTSTATUS
XxFindNextDevice(
    IN INTERFACE_TYPE BusType,
    IN ULONG BusNumber,
    IN PUNICODE_STRING RegistryPath,
    IN PCONFIG_ARRAY ConfigArray
    )
{
    UNICODE_STRING SubPath;
    WCHAR PathNameBuffer[30];
    UNICODE_STRING Number;
    WCHAR NumberBuffer[10];
    RTL_QUERY_REGISTRY_TABLE Table[5];              // ❶
    NTSTATUS status;
    PDEVICE_BLOCK pDevice =
        &ConfigArray->Device[ ConfigArray->Count ];

    //
    // Prepare to interrogate the Registry by
    // setting up the query-table
    //
    RtlZeroMemory( Table, sizeof( Table ));

    //
    // Create a name string for the
    // query table. Start by forming
    // the base path name
    //
    SubPath.Buffer = PathNameBuffer;                // ❷
    SubPath.MaximumLength = sizeof( PathNameBuffer );
    SubPath.Length = 0;
    RtlAppendUnicodeToString(
        &SubPath,
        L"Parameters\\Device" );

    //
    // Convert the device number into a string and
    // attach it to the end of the path name.
    //
    Number.Buffer = NumberBuffer;
    Number.MaximumLength = sizeof( NumberBuffer );
    Number.Length = 0;

    RtlIntegerToUnicodeString(
        ConfigArray->Count,
        10,     // base-10 conversion
        &Number );
    RtlAppendUnicodeStringToString(
        &SubPath,
        &Number );

    //
    // Fabricate the query
    //
    Table[0].Name = SubPath.Buffer;
    Table[0].Flags = RTL_QUERY_REGISTRY_SUBKEY;     // ❸

    Table[1].Name = L"PORT";    // I/O port addr
    Table[1].Flags = RTL_QUERY_REGISTRY_DIRECT;
    Table[1].EntryContext =
        &pDevice->OriginalPortBase;

    Table[2].Name = L"SPAN";    // Number of ports
    Table[2].Flags = RTL_QUERY_REGISTRY_DIRECT;
    Table[2].EntryContext =
        &pDevice->PortSpan;

    Table[3].Name = L"IRQ";     // IRQ level
    Table[3].Flags = RTL_QUERY_REGISTRY_DIRECT;
    Table[3].EntryContext =
        &pDevice->OriginalIrql;

    //
    // Query the Registry...
    //
    status = RtlQueryRegistryValues(                // ❹
                RTL_REGISTRY_ABSOLUTE,
                RegistryPath->Buffer,
                Table,
                NULL, NULL );

    if( !NT_SUCCESS( status )) return status;

    //
    // Fix up and translate the information
    // from the Registry
    //
    status = XxGetPortInfo(                         // ❺
                BusType,
                BusNumber,
                pDevice );

    if( !NT_SUCCESS( status )) return status;

    status = XxGetInterruptInfo(
                BusType,
                BusNumber,
                pDevice );
    return status;
}


❶ We need four entries in the query table for our own use, plus one extra to
terminate the query request.
❷ We need to create a string that looks like "Parameters\DeviceN" to
represent the subkey under the driver's service entry.
❸ This query just moves us down a level in the Registry so that all future
queries will be taken from the Parameters\DeviceN subkey.
❹ One call to RtlQueryRegistryValues does it all. It adds the subkey to the
end of the driver's service key name, looks for all four value items, and
dumps their contents back into the Configuration block.
❺ From here on, we use some helper functions to make the data from the
Registry usable.
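The subkey-naming step can be tried in user mode with the standard C library. This is only a sketch of the idea: MakeDeviceSubkey is a hypothetical helper, and swprintf stands in for the RtlAppendUnicodeToString and RtlIntegerToUnicodeString calls that the driver uses. The explicit capacity check mirrors the fixed-size path buffer in the listing.

```c
#include <assert.h>
#include <stdio.h>
#include <wchar.h>

/* Format "Parameters\DeviceN" into buf, which holds cap characters
 * including the terminator. Returns 0 on success, -1 if the name
 * does not fit. */
int MakeDeviceSubkey(wchar_t *buf, size_t cap, unsigned long n)
{
    int written = swprintf(buf, cap, L"Parameters\\Device%lu", n);

    return (written < 0 || (size_t)written >= cap) ? -1 : 0;
}
```

The prefix "Parameters\Device" takes 17 characters and a 32-bit device number needs at most 10 decimal digits, so a 30-character buffer leaves room for the terminator with a little to spare.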
XxGetPortInfo and XxGetInterruptInfo Here are the helper functions
again. You'll notice that XxGetInterruptInfo has to do some fix-up work on the
data it gets from the Registry.

//++
// Function:
//      XxGetPortInfo
//
// Description:
//      This function fixes up I/O port information
//      pulled from the driver's Registry service key
//
// Arguments:
//      Bus type
//      Bus number
//      Pointer to this device's slot in Config Array
//
// Return Value:
//      STATUS_SUCCESS
//      STATUS_XXX if error
//--
static NTSTATUS
XxGetPortInfo(
    IN INTERFACE_TYPE BusType,
    IN ULONG BusNumber,
    IN PDEVICE_BLOCK pDevice
    )
{
    ULONG AddressSpace;
    PHYSICAL_ADDRESS TranslatedPortBase;

    //
    // Convert bus-relative port-information into NT
    // system-mapped values, and save the results...
    //
    AddressSpace = 1;   // Ports should be in I/O space.

    if( !HalTranslateBusAddress(
            BusType,
            BusNumber,
            pDevice->OriginalPortBase,
            &AddressSpace,
            &TranslatedPortBase ))
    {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    pDevice->PortBase =
        (PUCHAR)TranslatedPortBase.LowPart;
    return STATUS_SUCCESS;
}

//++
// Function:
//      XxGetInterruptInfo
//
// Description:
//      This function fixes up IRQ information
//      pulled from the driver's Registry service key
//
// Arguments:
//      Bus type
//      Bus number
//      Pointer to this device's slot in Config array
//
// Return Value:
//      STATUS_SUCCESS
//      STATUS_XXX if error
//--
static NTSTATUS
XxGetInterruptInfo(
    IN INTERFACE_TYPE BusType,
    IN ULONG BusNumber,
    IN PDEVICE_BLOCK pDevice
    )
{
    //
    // Fill in the gaps by providing values for things
    // that aren't in the Registry...
    //
    pDevice->InterruptMode = Latched;
    pDevice->OriginalVector = pDevice->OriginalIrql;
    pDevice->ShareVector = FALSE;
    pDevice->FloatingSave = FALSE;

    //
    // Convert bus-relative interrupt information into
    // NT system-mapped values, and save the results...
    //
    pDevice->SystemVector =
        HalGetInterruptVector(
            BusType,
            BusNumber,
            pDevice->OriginalIrql,
            pDevice->OriginalVector,
            &pDevice->Dirql,
            &pDevice->Affinity );
    return STATUS_SUCCESS;
}


XxCheckForBus and XxBusCallback These little functions allow you to
check for the existence of a particular bus on the system. They make use of
IoQueryDeviceDescription to test for the presence of the bus.

//++
// Function:
//      XxCheckForBus
//
// Description:
//      This function verifies the existence of a
//      particular bus-type and number.
//
// Arguments:
//      BusType -- Isa, Eisa, etc
//      BusNumber -- 0, 1, etc
//
// Return Value:
//      STATUS_SUCCESS or some error condition.
//--
static NTSTATUS
XxCheckForBus(
    IN INTERFACE_TYPE BusType,
    IN ULONG BusNumber )
{
    return( IoQueryDeviceDescription(
                &BusType, &BusNumber,
                NULL, NULL,
                NULL, NULL,
                XxBusCallback,
                NULL ));
}

//++
// Function:
//      XxBusCallback
//
// Description:
//      This is a dummy function. The fact that the
//      system calls it means that the bus type and
//      number both exist, so all that's necessary
//      is to return STATUS_SUCCESS.
//
// Arguments:
//      (Unused)
//
// Return Value:
//      This function always returns STATUS_SUCCESS
//--
static NTSTATUS
XxBusCallback(
    IN PVOID Context,
    IN PUNICODE_STRING PathName,
    IN INTERFACE_TYPE BusType,
    IN ULONG BusNumber,
    IN PKEY_VALUE_FULL_INFORMATION *BusInfo,
    IN CONFIGURATION_TYPE CtrlrType,
    IN ULONG CtrlrNumber,
    IN PKEY_VALUE_FULL_INFORMATION *CtrlrInfo,
    IN CONFIGURATION_TYPE DeviceType,
    IN ULONG DeviceNumber,
    IN PKEY_VALUE_FULL_INFORMATION *DeviceInfo )
{
    return STATUS_SUCCESS;
}

7.5

ALLOCATING AND RELEASING HARDWARE
At this point, your driver has gone to a lot of trouble to locate some hardware.
Before you can use any of it, though, you have to make sure the hardware doesn't
belong to any other driver. This section explains how to allocate hardware for
your driver's exclusive use.

How Resource Allocation Works
NT maintains a central database of all currently owned hardware in the
...\HARDWARE\RESOURCEMAP section of the Registry. Before touching any
hardware resources, a driver checks this map to be sure someone else isn't using
them. If everything is free, the driver claims the hardware by adding a description
of its resource requirements to the resource map. If the resources aren't free, the
driver must leave them alone. 4
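The check-then-claim rule can be modeled in a few lines of ordinary C. The sketch below is a toy model, not the I/O Manager's implementation: PORT_RANGE, RESOURCE_MAP, RangesOverlap, and ClaimRange are all hypothetical names, and the model covers only I/O port ranges. It shows the essential behavior, which is that a conflicting claim is refused and leaves the map untouched.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of one claimed I/O port range. Port addresses
 * are small, so Start + Length is assumed not to overflow. */
typedef struct {
    unsigned long Start;    /* first port address    */
    unsigned long Length;   /* number of ports, > 0  */
} PORT_RANGE;

#define MAP_CAPACITY 16

/* Toy resource map: the ranges that have already been claimed. */
typedef struct {
    PORT_RANGE Claimed[MAP_CAPACITY];
    size_t     Count;
} RESOURCE_MAP;

/* Two ranges conflict when they share at least one port. */
int RangesOverlap(PORT_RANGE a, PORT_RANGE b)
{
    return a.Start < b.Start + b.Length &&
           b.Start < a.Start + a.Length;
}

/* Claim a range only if nothing in the map overlaps it. Returns 1 if
 * the claim was recorded and 0 if it was refused (conflict or map
 * full). A refused claim leaves the map unchanged. */
int ClaimRange(RESOURCE_MAP *map, PORT_RANGE want)
{
    size_t i;

    for (i = 0; i < map->Count; i++)
        if (RangesOverlap(map->Claimed[i], want))
            return 0;
    if (map->Count == MAP_CAPACITY)
        return 0;
    map->Claimed[map->Count++] = want;
    return 1;
}
```

In this model a three-port claim starting at 0x378 blocks a later claim that starts at 0x37A, because the two share one port, while a claim starting at 0x37B succeeds.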
Resources owned by a particular driver are recorded in a key with the same
name as the driver. In the resource map, these resource keys are organized in
arbitrary classes. Your driver has the option of declaring its own class, using an
existing class declared by another driver, or using the default resource class called
OtherDrivers. Resource classes are purely decorative and have no effect on
resource allocation or conflict detection.
Within a driver's resource key, there are two values called .Raw and
.Translated. Each of these items is a list describing the resources owned by the driver.
The raw list contains bus-specific information returned by routines like
IoQueryDeviceDescription, while the translated list holds the systemwide numbers
returned by the HalTranslateXxx functions.
Drivers can also declare some resources as the property of the whole driver,
and others as belonging to individual devices. In this case, resources shared by
multiple devices go into the driver's .Raw and .Translated values, while
device-specific resources have their own value items in the resource key. These
device-specific values are called \Device\DeviceName.Raw and
\Device\DeviceName.Translated. Figure 7.4 shows how all this works.

4 For the stability of the operating system, it's vital that all device drivers abide by this arbitration
scheme. Because a driver is a trusted kernel-mode component, no one can stop it from touching
hardware without allocating it. However, this can lead to confusing, unpredictable interactions
between multiple drivers that think they each have exclusive access to a piece of hardware.

HKEY_LOCAL_MACHINE
  HARDWARE
    RESOURCEMAP
      XX DRIVER RESOURCES
        XxDriver
          .Raw
          .Translated
          \Device\Xx0.Raw
          \Device\Xx0.Translated
      OtherDrivers
        YyDriver
          .Raw
          .Translated
          \Device\Yy0.Raw
          \Device\Yy0.Translated

Copyright © 1996 by Cydonix Corporation.

Figure 7.4  Format of hardware-allocation data in the Registry

In the figure, XXDRIVER has declared a private class (called XX DRIVER
RESOURCES) for its resource list. Some resources are allocated to the driver
itself, while others belong only to the device Xx0. YYDRIVER, being somewhat
more shy, doesn't use a private class for its resources, so its resource key ends up
in the OtherDrivers class. Again, some resources belong to the entire driver while
others have been claimed only for one device.
Again, the Registry editor, REGEDT32, gives you an easy way to poke
around in the system resource map. In the initial phases of driver development,
you can use this tool to make sure your driver is allocating all the right resources.
REGEDT32 also lets you verify that an unloadable driver has released whatever
hardware it may have claimed.

How to Claim Hardware Resources
To claim hardware, your driver needs to build a list of the resources it wants to
allocate. Figure 7.5 shows one of these lists. At the very top is a structure called a
CM_RESOURCE_LIST. As you can see, a Resource List is basically an array of the
CM_FULL_RESOURCE_DESCRIPTOR structures that you saw back in Figure 7.2.
Each Full Resource Descriptor in this array identifies all the resources used by the
driver on a single bus type and bus number. Collectively, all the Full Resource
Descriptors in a single Resource List describe the resources used on multiple buses.
As with the data passed to a ConfigCallback routine, individual resources
are identified by Partial Resource Descriptors. The only difference is that the
information given to a ConfigCallback routine is about one specific device or
controller. When you fabricate a Full Resource Descriptor to allocate hardware, you
have to group together the Partial Descriptors for all resources on one bus in the
same Full Resource Descriptor. 5

CM_RESOURCE_LIST
  CM_FULL_RESOURCE_DESCRIPTOR               (1st bus)
    CM_PARTIAL_RESOURCE_LIST
      CM_PARTIAL_RESOURCE_DESCRIPTOR
      CM_PARTIAL_RESOURCE_DESCRIPTOR
  CM_FULL_RESOURCE_DESCRIPTOR               (2nd bus)
    CM_PARTIAL_RESOURCE_LIST
      CM_PARTIAL_RESOURCE_DESCRIPTOR
      CM_PARTIAL_RESOURCE_DESCRIPTOR

Copyright © 1994 by Cydonix Corporation.

Figure 7.5  Structures passed to IoReportResourceUsage
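Because every level of a Resource List ends in a variable-length array, its total size has to be computed rather than taken from sizeof. The sketch below works through that arithmetic in user mode under stated assumptions: PARTIAL_DESC, PARTIAL_LIST, FULL_DESC, and RESOURCE_LIST are hypothetical, simplified stand-ins for the CM_xxx structures, and ResourceListSize is an illustrative helper. Like the real structures, each stand-in declares a one-element array that is over-allocated at run time.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the CM_xxx structures. */
typedef struct {
    unsigned char Type;
    unsigned long Start;
    unsigned long Length;
} PARTIAL_DESC;

typedef struct {
    unsigned long Count;
    PARTIAL_DESC  PartialDescriptors[1];   /* really Count items */
} PARTIAL_LIST;

typedef struct {
    int           InterfaceType;
    unsigned long BusNumber;
    PARTIAL_LIST  PartialResourceList;
} FULL_DESC;

typedef struct {
    unsigned long Count;
    FULL_DESC     List[1];                 /* really Count items */
} RESOURCE_LIST;

/* Bytes needed for a list that describes busCount buses, where
 * bus i uses perBus[i] partial descriptors. */
size_t ResourceListSize(const unsigned long *perBus, size_t busCount)
{
    size_t size = offsetof(RESOURCE_LIST, List);
    size_t i;

    for (i = 0; i < busCount; i++) {
        /* sizeof(FULL_DESC) already includes one partial descriptor */
        size_t extra = perBus[i] > 1 ? perBus[i] - 1 : 0;
        size += sizeof(FULL_DESC) + extra * sizeof(PARTIAL_DESC);
    }
    return size;
}
```

One consequence of this layout is that the second Full Resource Descriptor starts immediately after the last Partial Resource Descriptor of the first, so code that fills in the list has to advance a pointer past each variable-length entry instead of indexing List[1].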
You request ownership of the items in a CM_RESOURCE_LIST by passing
the list to IoReportResourceUsage (described in Table 7.7). This function
checks for any conflicts with previously allocated hardware and adds your
claims to the Registry's resource map. When you call this function, it
completely replaces any existing resource list associated with the specified Driver
or Device object.
If you include a class-name string, the I/O Manager will create a private
class key for your driver's resources. Passing NULL puts your driver's
resource key in the OtherDrivers class. If you allocate resources using a private
class, you'll also need to specify the class name when you release these
resources.
Remember that you can associate a resource list either with the Driver object
itself or with a particular Device object. Any resources being used by multiple
devices should be in the DriverList, while device-dedicated resources should go
in the DeviceList. If you break your resources up this way, you'll need to call
IoReportResourceUsage several times: once for the DriverList and once for each
DeviceList.
If IoReportResourceUsage returns STATUS_SUCCESS, you have to check
the value returned in the ConflictDetected boolean. If this variable is TRUE, it
means that one or more items in your resource list already belong to someone
else. In this case, your driver mustn't use any of the hardware in the list.

5 It's also worth emphasizing that these Partial Resource Descriptors contain the original
bus-relative values for such things as the I/O port base and the IRQ level - not the translated values
returned by functions like HalTranslateBusAddress.

Table 7.7  Prototype for IoReportResourceUsage

NTSTATUS IoReportResourceUsage              IRQL == PASSIVE_LEVEL

Parameter                                   Description
IN PUNICODE_STRING ClassName                Optional class name for driver
IN PDRIVER_OBJECT DriverObject              Driver object associated with this driver
IN PCM_RESOURCE_LIST DriverList             Resources used by all driver's devices
IN ULONG DriverListSize                     Size of list in bytes
IN PDEVICE_OBJECT DeviceObject              Device that will own the resources
IN PCM_RESOURCE_LIST DeviceList             Resources used by a single device
IN ULONG DeviceListSize                     Size of list in bytes
IN BOOLEAN OverrideConflict                 • TRUE - ignore resource conflicts
                                            • FALSE - return error if conflict
OUT PBOOLEAN ConflictDetected               • TRUE - resources already claimed
                                            • FALSE - no conflict
Return value                                • STATUS_SUCCESS
                                            • STATUS_INSUFFICIENT_RESOURCES

The OverrideConflict parameter determines the behavior of
IoReportResourceUsage when it detects a conflict. If you pass FALSE, the function makes no
changes to the Registry's resource map. Instead, it puts a message in the event log
identifying the conflicting resources and their current owner. 6 If OverrideConflict
is TRUE, IoReportResourceUsage does add your resource list to the resource map
but doesn't send a message to the system event log. However, even though your
resource list is in the Registry, your driver mustn't touch any hardware in the list;
someone else thinks they own it.
One odd bit of behavior is worth mentioning: Sometimes when there's a
resource conflict, IoReportResourceUsage returns an unsuccessful status code
that has no corresponding Win32 error number. The sample code in the next
section shows how to handle this situation properly.

How to Release Hardware
When you want to free up resources held by your driver, you build an empty
resource list and call IoReportResourceUsage. Since the new list completely
replaces the previous one, this has the effect of releasing any resources described in
the old list. If you allocated hardware on a device-specific or driver-wide basis, you
need to release it the same way. Also, if you used a private class name to allocate the
hardware, you'll need to use the same class name to free it.

6 Your driver has to be identified in the Registry as a system event logging component in order for
the Event Viewer to display these messages. Chapter 13 explains how to set this up. These
messages can be very helpful for debugging resource conflicts.

The following code fragment shows how a driver's Unload routine might
release hardware resources associated with a specific Device object.
CM_RESOURCE_LIST ResList;
BOOLEAN bConflict;

ResList.Count = 0;

IoReportResourceUsage(
    NULL,                   // Default class name
    pDriverObject,          // Pointer to Driver object
    NULL,                   // No driver-wide resources
    0,
    pDeviceObject,          // Pointer to Device object
    &ResList,               // Device-specific resources
    sizeof( ResList ),
    FALSE,                  // Don't override conflict
    &bConflict );           // Junk, but required

Mapping Device Memory

If your device uses a range of dedicated memory addresses, your driver will
need to make that memory available during initialization. Depending on the
architecture of the device, your driver will need to perform one of the following
two procedures.
Driver-chosen addresses Some devices (like Ethernet adapters) have a
control register that specifies the starting address of a device-specific memory
area. In this case, your driver needs to allocate memory for the device and let the
device know where the memory is located. 7 Follow these steps to set up this
memory area:

1. Call IoReportResourceUsage to allocate the device's control registers.
2. Call HalGetAdapter to find the Adapter object associated with your device.
3. Call HalAllocateCommonBuffer to allocate buffer space for your device's
   memory. This function returns both a system virtual address and a physical
   address.
4. Save the system virtual address of this buffer somewhere in your Device
   Extension. Use this virtual address from within your driver whenever you
   need to reference the device's memory area.
5. Write the buffer's physical address into whatever device registers control
   access to the device memory.
6. When your driver unloads, call HalFreeCommonBuffer to release the buffer.

7 This is actually just a special case of something called common buffer bus master DMA, which is
described in Chapter 12.

Hard-wired addresses Some pieces of hardware (like VGA controllers)
have very specific ideas about where their shared buffers should be located. If
your device needs to use a particular range of physical addresses for device
memory, follow these steps to make the memory available to your driver:

1. Call IoReportResourceUsage to request exclusive ownership of the range of
   physical addresses belonging to the device.
2. Call HalTranslateBusAddress to convert the device's bus-relative physical
   addresses into systemwide values.
3. Call MmMapIoSpace to map the device's memory into system virtual space.
   Save the address returned by this function and use it to access device memory
   from within your driver.
4. When your driver unloads, call MmUnmapIoSpace to break the connection
   between the device's memory and system virtual space.

Loading Device Microcode
As part of their initialization, some complex devices need to have microcode
loaded into them from a disk file. If the quantity of microcode is small, you can
store it as a REG_BINARY value in the driver's Parameters subkey. For a device
that needs large amounts of microcode, this may not be feasible.
Fortunately, NT provides several functions that give drivers handle-based
access to files and directories. As you can see from Table 7.8, these routines bear a
strong resemblance to the Win32 user-mode file API. Using these functions, a
driver could load vast quantities of microcode into a device without
overburdening the Configuration Manager. In this case, only the path-name for the microcode
file would need to be stored in the driver's Parameters subkey.
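The shape of such a loader is the familiar open, read in a loop, close pattern. The sketch below is a user-mode analogy only, written with C stdio rather than the ZwXxx routines: StreamMicrocode, CHUNK_SINK, and CountingSink are hypothetical names, and in a driver the calls to fread and fclose would correspond roughly to ZwReadFile and ZwClose on a handle obtained from ZwCreateFile.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical callback that hands one chunk of microcode to the
 * device. A real driver would write the bytes to device registers
 * or device memory here. */
typedef void (*CHUNK_SINK)(const unsigned char *data, size_t len,
                           void *ctx);

/* Sample sink for experiments: it only totals the bytes it is given. */
void CountingSink(const unsigned char *data, size_t len, void *ctx)
{
    (void)data;
    *(size_t *)ctx += len;
}

/* Stream an open file to the sink in fixed-size chunks, so the image
 * never has to fit in memory all at once. Returns the number of bytes
 * delivered, or (size_t)-1 on a read error. */
size_t StreamMicrocode(FILE *fp, CHUNK_SINK sink, void *ctx)
{
    unsigned char chunk[256];
    size_t total = 0;
    size_t got;

    while ((got = fread(chunk, 1, sizeof(chunk), fp)) > 0) {
        sink(chunk, got, ctx);
        total += got;
    }
    return ferror(fp) ? (size_t)-1 : total;
}
```

Reading in small fixed-size chunks keeps the memory footprint constant no matter how large the microcode image is, which is the point of moving it out of the Registry in the first place.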
There are three important things to keep in mind if you decide to use these
functions. First, you can only call them from parts of your code running at
PASSIVE_LEVEL IRQL. This effectively limits their use to DriverEntry, the
Unload routine, Dispatch routines, and any thread-based parts of your driver.
Second, you can't access any files with these calls until the file-system
driver for the target volume has finished initializing itself. If your driver loads
during system bootstrap, you can guarantee that it loads after any file systems by
setting up proper group dependencies in the Registry. Chapter 16 explains how
to do this.
Finally, avoid the temptation to store driver initialization parameters in disk
files. That kind of thing belongs only in the Registry. The proliferation of .INI files
in earlier versions of Windows was a bad thing; don't litter NT with them.

Table 7.8  Kernel-mode code can access files using these functions

ZwXxx file functions                            IRQL == PASSIVE_LEVEL

IF you want to...                               THEN call...
Create or open a file, device, or directory     ZwCreateFile
Read data into memory from a file               ZwReadFile
Write data from memory to a file                ZwWriteFile
Get file size, position, attribute information  ZwQueryInformationFile
Set file size, position, attribute information  ZwSetInformationFile
Close an open file handle                       ZwClose

For more information about the functions listed in Table 7.8, take a look at
the online documentation in the NT DDK. The DDK also contains some sample
code that shows how to use these routines.

7.6

CODE EXAMPLE: ALLOCATING HARDWARE
This example illustrates the hardware allocation techniques we've just been
looking at. It assumes that the device uses a DMA channel, but no device-specific
memory or other device-specific data. You can find this code in the CH07
directory on the disk that accompanies this book.

RESALLOC.C
The functions in this file allocate a group of resources for exclusive use by a
specific Driver object.

XxReportHardwareUsage. Given a linked list of CONFIG_ARRAYs, this
routine builds a Resource List and marks the resources as belonging to the entire
Driver object. No resources are tagged as belonging to specific Device objects.

NTSTATUS
XxReportHardwareUsage(
    IN PDRIVER_OBJECT DriverObject,
    IN PCONFIG_ARRAY ConfigList
    )
{
    ULONG ListSize;
    PCM_RESOURCE_LIST ResourceList;
    PCM_FULL_RESOURCE_DESCRIPTOR Frd;
    PCM_PARTIAL_RESOURCE_DESCRIPTOR Prd;
    PCONFIG_ARRAY CurrentArray;
    BOOLEAN bConflictDetected;
    NTSTATUS status;
    ULONG i;

    //
    // Calculate size of resource list              (1)
    //
    ListSize =
        FIELD_OFFSET( CM_RESOURCE_LIST, List[0] );
    CurrentArray = ConfigList;
    while( CurrentArray != NULL )
    {
        ListSize +=
            sizeof( CM_FULL_RESOURCE_DESCRIPTOR ) +
            ((( CurrentArray->Count *
                XX_RESOURCE_ITEMS_PER_DEVICE ) - 1 ) *
                sizeof(
                    CM_PARTIAL_RESOURCE_DESCRIPTOR ));
        CurrentArray =
            CurrentArray->NextConfigArray;
    }

    //
    // Try and allocate paged memory for the resource
    // list. If it works, zero out the list.
    //
    ResourceList =
        ExAllocatePool( PagedPool, ListSize );      // (2)
    if( ResourceList == NULL )
    {
        return STATUS_INSUFFICIENT_RESOURCES;
    }
    RtlZeroMemory( ResourceList, ListSize );

    CurrentArray = ConfigList;                      // (3)
    Frd = &ResourceList->List[0];
    while( CurrentArray != NULL )
    {
        ResourceList->Count++;
        Frd->InterfaceType = CurrentArray->BusType;
        Frd->BusNumber = CurrentArray->BusNumber;
        //
        // Set the number of Partial Resource
        // Descriptors in this FRD.
        //
        Frd->PartialResourceList.Count =
            CurrentArray->Count *
            XX_RESOURCE_ITEMS_PER_DEVICE;
        //
        // Get pointer to first Partial Resource
        // Descriptor in this FRD.
        //
        Prd = &Frd->PartialResourceList.
                    PartialDescriptors[0];
        for( i = 0; i < CurrentArray->Count; i++ )  // (4)
        {
            Prd = XxBuildPartialDescriptors(
                      &CurrentArray->Device[i],
                      Prd );
        }
        //
        // Point to beginning of next Full Resource
        // Descriptor. (Assign through a cast of the
        // sum; a cast on the left side of += is a
        // compiler extension, not standard C.)
        //
        Frd = (PCM_FULL_RESOURCE_DESCRIPTOR)
            ( (PUCHAR)Frd +
              (( Frd->PartialResourceList.Count - 1 ) *
                 sizeof( CM_PARTIAL_RESOURCE_DESCRIPTOR ))
              + sizeof( CM_FULL_RESOURCE_DESCRIPTOR ));
        //
        // Get next Config array from linked-list
        //
        CurrentArray = CurrentArray->NextConfigArray;
    }

    status = IoReportResourceUsage(                 // (5)
                 NULL,
                 DriverObject,
                 ResourceList,
                 ListSize,
                 NULL,
                 NULL,
                 0,
                 FALSE,         // Don't override
                 &bConflictDetected );
    ExFreePool( ResourceList );

    if( !NT_SUCCESS( status ) || bConflictDetected )
        return STATUS_INSUFFICIENT_RESOURCES;
    else
        return STATUS_SUCCESS;
}

(1) Start by accounting for header space between the beginning of the
    Resource List and the first Full Resource Descriptor (FRD). For the whole
    Resource List, we need one FRD per bus type and bus number. We have
    to run the Config List to find them all. Each FRD contains a separate
    group of Partial Resource Descriptors (PRDs) for each device we're
    allocating. Since an FRD has one PRD already embedded in it, we subtract
    one from the total PRD count for each FRD.

(2) Once the hideous calculations are complete, we allocate a chunk of paged
    pool that's large enough to hold the whole thing. As always, it's
    important to zero out any memory allocated from the system pool areas. You
    don't know where they've been.

(3) Run the Config List again. This time, build a separate FRD for each
    Config Array in the list.

(4) Loop through all the Device Blocks in the current Config Array. For each
    Device Block, call a helper function to create PRDs for any resources used
    by that device.

(5) Once the Resource List is complete, call IoReportResourceUsage to
    request ownership of the hardware. Afterwards, release the pool memory
    used for the Resource List.
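
The size arithmetic in the first step is easy to get wrong, so here is a minimal user-mode sketch of just that calculation. It does not use the DDK: the MOCK_ structures are simplified stand-ins invented for this illustration (the real CM_ types in NTDDK.H have more fields), and ITEMS_PER_DEVICE plays the role of XX_RESOURCE_ITEMS_PER_DEVICE. The only thing it is meant to show is the "one element already embedded in the structure" bookkeeping.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the DDK resource-list types.
   Like the real ones, each container embeds exactly one element. */
typedef struct { unsigned char Type; unsigned long Start, Length; } MOCK_PRD;
typedef struct { unsigned long Count; MOCK_PRD PartialDescriptors[1]; } MOCK_PRL;
typedef struct {
    int           InterfaceType;
    unsigned long BusNumber;
    MOCK_PRL      PartialResourceList;
} MOCK_FRD;
typedef struct { unsigned long Count; MOCK_FRD List[1]; } MOCK_RESOURCE_LIST;

/* Assumed number of resources (for example, port + interrupt) per device. */
#define ITEMS_PER_DEVICE 2

/* deviceCounts[i] is the number of devices in the i-th config array.
   Each entry is assumed to be at least 1. */
size_t MockResourceListSize(const unsigned long *deviceCounts,
                            size_t arrayCount)
{
    /* Header space in front of the first full descriptor. */
    size_t size = offsetof(MOCK_RESOURCE_LIST, List);
    size_t i;

    for (i = 0; i < arrayCount; i++) {
        size_t prds = deviceCounts[i] * ITEMS_PER_DEVICE;
        /* One partial descriptor already lives inside the full
           descriptor, so only (prds - 1) extra ones are added. */
        size += sizeof(MOCK_FRD) + (prds - 1) * sizeof(MOCK_PRD);
    }
    return size;
}
```

With one config array holding a single device, the result is the header offset plus one full descriptor plus one extra partial descriptor, which matches the subtraction described in step (1).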

XxBuildPartialDescriptors. Given a Device Block and a pointer to the first
available Partial Resource Descriptor in an FRD, this function adds all the PRDs
for one device to the current FRD.

static PCM_PARTIAL_RESOURCE_DESCRIPTOR
XxBuildPartialDescriptors(
    IN PDEVICE_BLOCK Device,
    IN PCM_PARTIAL_RESOURCE_DESCRIPTOR Prd
    )
{
    //
    // Set up PRD for control registers
    //
    Prd->Type = CmResourceTypePort;
    Prd->ShareDisposition =
        CmResourceShareDriverExclusive;
    Prd->Flags =
        CM_RESOURCE_PORT_IO;                        // (1)
    Prd->u.Port.Start =
        Device->OriginalPortBase;
    Prd->u.Port.Length = Device->PortSpan;
    Prd++;                                          // (2)

    //
    // Set up PRD for Interrupt resource
    //
    Prd->Type = CmResourceTypeInterrupt;
    Prd->ShareDisposition =
        CmResourceShareDriverExclusive;
    if( Device->InterruptMode == Latched )
        Prd->Flags =
            CM_RESOURCE_INTERRUPT_LATCHED;
    else
        Prd->Flags =
            CM_RESOURCE_INTERRUPT_LEVEL_SENSITIVE;
    Prd->u.Interrupt.Level =
        Device->OriginalIrql;                       // (3)
    Prd->u.Interrupt.Vector =
        Device->OriginalVector;
    Prd++;

    return Prd;
}

(1) This example assumes that device control registers are always in I/O
    space. A truly general driver would need to take a more flexible
    approach.

(2) Point to the beginning of the next PRD. (C is a wonderful language.)

(3) The setup operations for all the PRDs are very similar; just fill in the
    necessary fields of the PRD. Remember to use the original values, and not the
    ones returned by translation functions such as HalGetInterruptVector or
    HalTranslateBusAddress.

7.7

SUMMARY
In this chapter, we've looked at various techniques your driver can use to locate
the hardware it has to manage. For some kinds of devices, the hardware will
identify itself and provide the system with a lot of information. Other devices
(including most ISA cards) are very shy, so you'll need to supplement any auto-detected
information with other data sources, including hard-wired Registry values.
Whatever method you use to find your hardware, you absolutely must claim it for your
driver's exclusive use.
Now that we have a driver that loads and unloads without crashing the
system, the next step is to make a connection with the NT system service dispatcher.
That's the subject of Chapter 8.

CHAPTER 8

Driver Dispatch Routines

When an I/O request begins its arduous journey
through the NT I/O subsystem, the first challenge it faces is to get by one of your
driver's Dispatch routines. The Dispatch routine decides whether the request
should go any further, or whether it should be sent back to the original caller in
disgrace. This chapter will help you set up your Dispatch routines and explain
how these routines should behave in various situations. It also fills in some of the
details involved in processing buffered and direct I/O requests.

8.1

ENABLING DRIVER DISPATCH ROUTINES
Before your driver can receive I/O requests, you need to tell the I/O Manager
what kinds of operations the driver supports. This section describes the I/O
Manager's dispatching mechanism and explains how to enable receipt of specific I/O
function codes. It also presents some guidelines for deciding which function
codes your driver needs to support.

I/O Request Dispatching Mechanism
Recall from earlier chapters that most I/O operations under NT are
packet-driven. When a user-mode application issues an I/O request, the I/O Manager
first builds an IRP to keep track of the request. Among other things, it stores an
IRP_MJ_XXX code in the MajorFunction field of the IRP's I/O stack location to
identify the exact operation being performed.

[Figure 8.1  How the I/O Manager selects Dispatch routines. An IRP carrying
IRP_MJ_WRITE indexes the Driver object's MajorFunction[] table; that entry points
to XxDispatchWrite, while the entries for unsupported codes point to
_IopInvalidDeviceRequest. Copyright © 1994 by Cydonix Corporation.]

When it's time to process the IRP, the I/O Manager uses the IRP_MJ_XXX
value as an index into the Driver object's MajorFunction table. From the table, it
gets a pointer to a routine that handles this specific IRP_MJ_XXX code, which it
then calls. If the driver doesn't support the requested operation, the table entry
points to the I/O Manager's internal _IopInvalidDeviceRequest function, which
returns an error to the original caller. If the driver does support the operation,
the table entry points to one of the driver's own Dispatch routines. Figure 8.1
illustrates this process.

Enabling Specific Function Codes
To enable dispatching for a specific IRP_MJ_XXX function code, your
DriverEntry routine must put the address of a Dispatch routine into the
MajorFunction table of the Driver object. You use the I/O function code itself as an index
into the dispatching table. The following code fragment illustrates how to do this.

NTSTATUS
DriverEntry(
    IN PDRIVER_OBJECT pDO,
    IN PUNICODE_STRING RegistryPath
    )
{
    pDO->MajorFunction[ IRP_MJ_CREATE ]  = XxDispCreate;
    pDO->MajorFunction[ IRP_MJ_CLOSE ]   = XxDispClose;
    pDO->MajorFunction[ IRP_MJ_CLEANUP ] = XxDispCleanup;
    pDO->MajorFunction[ IRP_MJ_READ ]    = XxDispRead;
    pDO->MajorFunction[ IRP_MJ_WRITE ]   = XxDispWrite;

    return STATUS_SUCCESS;
}

Note that you can use the same Dispatch routine to service more than one
I/O function code. The choice of how many Dispatch routines to implement is
entirely up to you.
Also, you can ignore MajorFunction table entries corresponding to function
codes your driver doesn't support. By the time the I/O Manager calls your
DriverEntry routine, it has already filled every entry in the table with pointers to
_IopInvalidDeviceRequest, so any slots you don't explicitly fill will appear as
unsupported device operations.
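
If the table-driven mechanism still feels abstract, the following user-mode sketch may help. It is only a toy model: every MOCK_ name, the small function-code values, and the status numbers are invented for the illustration and are not the DDK's definitions. It shows the two ideas from this section, namely that the table starts out filled with a "reject everything" routine, and that the function code is nothing more than an array index.

```c
/* Toy model of the MajorFunction table. All names and values here
   are made up for illustration; they are not the DDK definitions. */
#define MOCK_MJ_CREATE          0
#define MOCK_MJ_READ            1
#define MOCK_MJ_WRITE           2
#define MOCK_MJ_DEVICE_CONTROL  3
#define MOCK_MJ_MAXIMUM         3

#define MOCK_STATUS_SUCCESS                  0
#define MOCK_STATUS_INVALID_DEVICE_REQUEST (-1)

typedef int (*MOCK_DISPATCH)(void);

typedef struct {
    MOCK_DISPATCH MajorFunction[MOCK_MJ_MAXIMUM + 1];
} MOCK_DRIVER_OBJECT;

/* Plays the role of the I/O Manager's default "unsupported" routine. */
static int MockInvalidDeviceRequest(void)
{
    return MOCK_STATUS_INVALID_DEVICE_REQUEST;
}

static int MockDispCreate(void) { return MOCK_STATUS_SUCCESS; }
static int MockDispWrite(void)  { return MOCK_STATUS_SUCCESS; }

/* What happens before DriverEntry runs: every slot rejects. */
void MockIoManagerInit(MOCK_DRIVER_OBJECT *drv)
{
    unsigned i;
    for (i = 0; i <= MOCK_MJ_MAXIMUM; i++)
        drv->MajorFunction[i] = MockInvalidDeviceRequest;
}

/* The driver fills in only the slots it supports. */
void MockDriverEntry(MOCK_DRIVER_OBJECT *drv)
{
    drv->MajorFunction[MOCK_MJ_CREATE] = MockDispCreate;
    drv->MajorFunction[MOCK_MJ_WRITE]  = MockDispWrite;
}

/* Dispatching is just an indexed call through the table. */
int MockDispatch(const MOCK_DRIVER_OBJECT *drv, unsigned major)
{
    return drv->MajorFunction[major]();
}
```

After MockDriverEntry runs, a write request reaches MockDispWrite, while a read or device-control request still lands in the default routine and fails, without the driver having written a line of code for those cases.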

Deciding Which Function Codes to Support
All drivers must support the IRP_MJ_CREATE function code, since this is
the one generated by a Win32 CreateFile call. If you don't process this function
code, Win32 programs will have no way to get a handle to your device.
The choice of other function codes will depend on the nature of your device
and the kinds of operations it can perform. Use Table 8.1 to decide which IRP
function codes might be appropriate. If you're writing an intermediate driver, you
must provide Dispatch entry points for all the I/O function codes supported by
any drivers below yours in the chain.
If you're writing a driver for one of the standard system devices, or if you're
writing a layered driver that sits on top of such a device, it's important that you
support a specific set of required IRP function codes. Part II of the Windows NT
DDK Kernel-mode Driver Reference contains extensive descriptions of the
IRP_MJ_XXX function codes your driver must process if it supports one of the
standard devices.

8.2

EXTENDING THE DISPATCH INTERFACE
What do you do if you need to perform a device operation other than the ones
listed in Table 8.1? The I/O Manager doesn't permit you to add any new IRP
function codes, so that's not an option. Fortunately, two of the standard IRP_MJ_XXX
values are escape codes that allow you to define any number of driver-specific
operations:

• IRP_MJ_DEVICE_CONTROL - Lets you define functions that are available
  to user-mode clients through the Win32 DeviceIoControl function.
  Other drivers can also issue these control requests by building
  appropriate IRPs.

• IRP_MJ_INTERNAL_DEVICE_CONTROL - Lets you define functions
  that are only available to kernel-mode clients (usually other drivers).
  There is no user-mode API function that can generate one of these
  requests.

Both these functions pass a driver-defined 32-bit value as a parameter in the
IRP. This value is referred to as an I/O control code (IOCTL), and your driver uses
it to determine just what operation it should perform. The rest of this section
explains how this interface works. Later in the chapter, you'll see how to process
these functions when they appear in an IRP.

Table 8.1  Commonly used IRP function codes and their Win32 functions

IRP_MJ_XXX function codes
Function code                     Description                              Win32 functions
IRP_MJ_CREATE                     Request for a handle.                    CreateFile
IRP_MJ_CLEANUP                    Cancel pending IRPs when handle          CloseHandle
                                  closes.
IRP_MJ_CLOSE                      Close the handle.                        CloseHandle
IRP_MJ_READ                       Get data from device.                    ReadFile
IRP_MJ_WRITE                      Send data to device.                     WriteFile
IRP_MJ_DEVICE_CONTROL             Control operation available to user-     DeviceIoControl
                                  or kernel-mode clients.
IRP_MJ_INTERNAL_DEVICE_CONTROL    Control operation only available to      (No Win32 call)
                                  kernel-mode clients.
IRP_MJ_QUERY_INFORMATION          Get length of file.                      GetFileSize
IRP_MJ_SET_INFORMATION            Set length of file.                      SetEndOfFile
IRP_MJ_FLUSH_BUFFERS              Write output buffers or discard input    FlushFileBuffers
                                  buffers.                                 FlushConsoleInputBuffer
                                                                           PurgeComm
IRP_MJ_SHUTDOWN                   System shutting down.                    InitiateSystemShutdown

Note: See NTDDK.H or the online documentation for a complete list of IRP_MJ_XXX codes.

Defining Private IOCTL Values
The IOCTL values passed to your driver have a very specific structure.
Figure 8.2 illustrates the fields that make up one of these codes.
Although you can fabricate these control codes by hand, it's much easier to
generate them using the CTL_CODE macro that comes with the DDK. As you can
see from Table 8.2, the arguments to this macro parallel the fields of an IOCTL code.

IOCTL Argument-Passing Methods
In many situations, you'll want to define IOCTL codes that either need
additional arguments from the caller, or that need to pass information back to the
caller. For example, an IOCTL that queried a driver for performance data would
need some way to return the data. The Win32 DeviceIoControl function solves
this problem by letting the user specify a pair of input and output buffer addresses
along with the IOCTL code. The question then becomes: Does the I/O Manager
pass these buffers to your driver using Buffered or Direct I/O?
You may be tempted to think that the buffering method used for IOCTLs will
be the same one you specified with the DO_BUFFERED_IO or DO_DIRECT_IO
flags in the Device object. However, the method used for a device's IOCTLs is not
necessarily the same as the method used for data transfers. For greater flexibility,
the I/O Manager uses a field in the IOCTL code itself to determine the buffering
method. This allows you to choose different buffering methods for each individual
IOCTL.

Bits 31-16      Bits 15-14         Bits 13-2       Bits 1-0
Device Type     Required Access    Control Code    Transfer Type

Figure 8.2  Layout of an IOCTL code. Copyright © 1996 by Cydonix Corporation.

Table 8.2  Use the CTL_CODE macro to define IOCTL codes

CTL_CODE macro
Parameter         Description
DeviceType        FILE_DEVICE_XXX value given to IoCreateDevice
                  • 0x0000 to 0x7FFF - reserved for Microsoft
                  • 0x8000 to 0xFFFF - available for customer device types
ControlCode       Driver-defined IOCTL code
                  • 0x000 to 0x7FF - reserved for Microsoft
                  • 0x800 to 0xFFF - available for customer IOCTLs
TransferType      Buffer-passing mechanism for this control code (see below)
                  • METHOD_BUFFERED
                  • METHOD_IN_DIRECT
                  • METHOD_OUT_DIRECT
                  • METHOD_NEITHER
RequiredAccess    Access that must be requested when user calls Win32 CreateFile
                  • FILE_ANY_ACCESS
                  • FILE_READ_DATA
                  • FILE_WRITE_DATA
                  • FILE_READ_DATA | FILE_WRITE_DATA

As you can see from Figure 8.2, the TransferType field is located in the
lowest two bits of the IOCTL code. It can take on one of the following values:

• METHOD_BUFFERED - The I/O Manager moves IOCTL data to and
  from the driver using an intermediate nonpaged pool buffer.

• METHOD_IN_DIRECT - IOCTL data coming from the caller is passed
  using Direct I/O; data going from the driver back to the caller is passed
  through an intermediate system-space buffer.

• METHOD_OUT_DIRECT - Data coming from the caller passes through a
  system-space buffer; data going back to the caller is passed using Direct I/O.

• METHOD_NEITHER - The I/O Manager simply gives the driver raw
  user-space addresses for the caller's incoming and outgoing IOCTL buffers.
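
To make the bit layout concrete, here is a small stand-alone sketch that packs and unpacks an IOCTL code by hand. It deliberately does not include the DDK headers: the XX_ macro and constants are local stand-ins that follow the layout in Figure 8.2, and the numeric values chosen for the transfer methods, FILE_ANY_ACCESS, and FILE_DEVICE_UNKNOWN are assumptions about what the DDK headers define, so check DEVIOCTL.H before relying on them. The decoding helpers (XxIoctlMethod and friends) are hypothetical names invented for this example.

```c
#include <stdint.h>

/* Local stand-ins; values assumed to match the DDK headers. */
#define XX_METHOD_BUFFERED      0u
#define XX_METHOD_IN_DIRECT     1u
#define XX_METHOD_OUT_DIRECT    2u
#define XX_METHOD_NEITHER       3u
#define XX_FILE_ANY_ACCESS      0u
#define XX_FILE_DEVICE_UNKNOWN  0x22u

/* Pack the four fields: bits 31-16 device type, 15-14 required
   access, 13-2 control code, 1-0 transfer type. */
#define XX_CTL_CODE(dev, func, method, access)            \
    ((uint32_t)(((uint32_t)(dev)    << 16) |              \
                ((uint32_t)(access) << 14) |              \
                ((uint32_t)(func)   <<  2) |              \
                 (uint32_t)(method)))

/* Hypothetical helpers that pull the fields back out. */
uint32_t XxIoctlDeviceType(uint32_t code) { return code >> 16; }
uint32_t XxIoctlAccess(uint32_t code)     { return (code >> 14) & 0x3u; }
uint32_t XxIoctlFunction(uint32_t code)   { return (code >> 2) & 0xFFFu; }
uint32_t XxIoctlMethod(uint32_t code)     { return code & 0x3u; }
```

Under these assumptions, a private control code 0x801 using METHOD_BUFFERED on an unknown device type packs to 0x00222004, and masking the low two bits of any code recovers the buffering method the I/O Manager will apply to it.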

If your driver supports a public IOCTL defined by Windows NT, it has to
use the method embedded in the IOCTL. [1] For private IOCTLs, you can choose the
I/O method that makes the most sense for the operation. The guidelines for
choosing an IOCTL buffering method are the same as those for choosing a data
transfer buffering method. Buffered I/O is suitable for small amounts of data (less
than PAGE_SIZE bytes), while Direct I/O is a better approach for large buffers or
DMA operations.

[1] For a complete list of public IOCTLs, see the header file MSTOOLS\H\WINIOCTL.H.

Writing IOCTL Header Files
It's a good idea to write a separate header file for your control-code
definitions. This header file should also contain any structures that describe the
contents of the IOCTL's input or output buffers. You'll need to include this header file
in both the driver and any user-mode programs that issue Win32 DeviceIoControl
calls to the driver. [2] The following is an example of an IOCTL header file:

#define IOCTL_XXDEVICE_AIM /* [3] */ CTL_CODE(  \
            FILE_DEVICE_UNKNOWN,                \
            0x801,                              \
            METHOD_BUFFERED,                    \
            FILE_ANY_ACCESS )
//
// Structures used by IOCTL_XXDEVICE_AIM
//
typedef struct _XX_AIM_IN_BUFF {
    ULONG Longitude;
    ULONG Latitude;
} XX_AIM_IN_BUFF, *PXX_AIM_IN_BUFF;

typedef struct _XX_AIM_OUT_BUFF {
    ULONG ExtendedStatus;
} XX_AIM_OUT_BUFF, *PXX_AIM_OUT_BUFF;

#define IOCTL_XXDEVICE_LAUNCH CTL_CODE(         \
            FILE_DEVICE_UNKNOWN,                \
            0x802,                              \
            METHOD_NEITHER,                     \
            FILE_ANY_ACCESS )

[2] Additionally, the Win32 program will need to include WINIOCTL.H and the driver will need to
include DEVIOCTL.H to get the definition of the CTL_CODE macro. These header files need to be
included before you include the file with your IOCTL definitions.

[3] Microsoft recommends that the names you give to private IOCTLs look like IOCTL_Device_Function,
where Device identifies the device that supports the IOCTL, and Function describes the effect of the
IOCTL.

8.3

WRITING DRIVER DISPATCH ROUTINES
Once you've chosen an appropriate set of I/O function codes, you need to
write the Dispatch routines themselves. This section explains how to code these
routines.


Execution Context
By the time it calls your Dispatch routine, the I/O Manager has already
checked the accessibility of the caller's buffer. If this is a Buffered I/O operation, it
has also allocated a system buffer from nonpaged pool and, for write requests,
copied the caller's data into the system buffer. For Direct I/O operations, the
caller's buffer has been faulted into physical memory and locked down.
Like your driver's initialization and cleanup routines, Dispatch routines run
at PASSIVE_LEVEL IRQL, which means they can access paged system resources.
Table 8.3 shows the prototype for a Dispatch routine.
Normally, a Dispatch routine works only with the contents of the IRP. If a
Dispatch routine touches any data structures shared with other parts of the driver,
it has to synchronize itself properly. This means using a spin lock to coordinate
with driver routines running at DISPATCH_LEVEL IRQL and
KeSynchronizeExecution to synchronize with the Interrupt Service code.
Never forget that you're sharing the IRP with the I/O Manager. In
particular, the system uses various fields in the Parameters union to clean up after I/O
operations. For example, after a Buffered I/O, it eventually needs to deallocate its
nonpaged pool buffer. A field in the IRP gives it the location of this buffer.
Changing the contents of the IRP can lead to unspecified (but dreadful) results when the
I/O Manager tries to finish processing the request.
If you need to modify any IRP fields, make working copies in local variables
or in the Device Extension. Modify these working copies and not the data in the
IRP. The only exceptions to this rule are the I/O status block and the Others
structure in the Parameters union. Chapter 15 will discuss the use of this structure by
higher-level drivers.

What Dispatch Routines Do
Keep in mind that the exact behavior of a Dispatch routine will depend on
the function code it supports. However, the general responsibilities of these
routines include the following:

1. Call IoGetCurrentIrpStackLocation to get a pointer to the IRP stack location
   belonging to this driver.

2. Perform any additional sanity checking or parameter validation specific to
   this function code and device.

3. If this is an intermediate-level driver, and there are limitations on the
   underlying physical device (for example, its maximum transfer size), the Dispatch
   routine may need to split the caller's request into multiple requests to the
   device driver. Chapter 15 explains how to do this.

4. Continue processing the IRP until one of three exit conditions occurs.

The following subsections describe some of these steps in greater detail.

Table 8.3  Function prototype for a Dispatch routine

NTSTATUS XxDispatch                  IRQL == PASSIVE_LEVEL
Parameter                            Description
IN PDEVICE_OBJECT DeviceObject       Pointer to target device for this request
IN PIRP Irp                          Pointer to IRP describing this request
Return value                         • STATUS_SUCCESS - request complete
                                     • STATUS_PENDING - request pending
                                     • STATUS_XXX - appropriate error code

Exiting the Dispatch Routine
When a Dispatch routine processes an IRP, there are only three possible
outcomes:

• The IRP's request parameters don't pass whatever validation tests you're
  applying and you need to reject the request.

• You can complete the request entirely in the Dispatch routine without
  performing any device operations.

• You need to start a device operation in order to complete the request.

Signaling an error. If your Dispatch routine uncovers a problem with the
IRP parameters, you need to send the request back to the caller with a nasty
message. Follow these steps to reject an IRP:

1. Put an appropriate error code in the Status field of the IRP's I/O status block
   and clear the Information field.

2. Call IoCompleteRequest to release the IRP with no priority increment.

3. When you exit the Dispatch routine, return the same error code you put in the
   IRP.

The code fragment below shows how a Dispatch routine rejects an I/O
request.

NTSTATUS
XxDispWhatever(
    IN PDEVICE_OBJECT pDO,
    IN PIRP Irp )
{
    Irp->IoStatus.Status = STATUS_BADVIBES;     // see footnote [4]
    Irp->IoStatus.Information = 0;
    IoCompleteRequest( Irp, IO_NO_INCREMENT );
    return STATUS_BADVIBES;
}

[4] No, STATUS_BADVIBES isn't a real NTSTATUS code.

Completing a request. You can process some kinds of IRP function codes
without actually performing any device operations. Opening a handle to a device,
or returning information stored in the Device object are examples of these kinds of
requests. To complete a request in the Dispatch routine, do the following:

1. Put a successful completion code in the Status field of the IRP's I/O status
   block, and set the Information field to some appropriate value.

2. Call IoCompleteRequest to release the IRP with no priority increment.

3. Exit the Dispatch routine with a value of STATUS_SUCCESS.

The code fragment below shows how a Dispatch routine completes a
request.

NTSTATUS
XxDispClose(
    IN PDEVICE_OBJECT pDO,
    IN PIRP Irp )
{
    Irp->IoStatus.Status = STATUS_SUCCESS;
    Irp->IoStatus.Information = 0;
    IoCompleteRequest( Irp, IO_NO_INCREMENT );
    return STATUS_SUCCESS;
}

Starting a device operation. The last possibility is that the IRP is
requesting an actual device operation. This could be either a data transfer, a control
function, or an informational query. In this case, the Dispatch routine has to pass the
IRP to the driver's Start I/O routine. To start a device operation, do the following:

1. Call IoMarkIrpPending so that the I/O Manager won't try to complete the
   request.

2. Call IoStartPacket to send the request to your driver's Start I/O routine. If
   you manage your own IRP queues, call your driver's internal routine to start
   the I/O.

3. Exit the Dispatch routine with a value of STATUS_PENDING.

The following code fragment shows how a Dispatch routine starts a device
operation.


NTSTATUS
XxDispWrite(
    IN PDEVICE_OBJECT pDO,
    IN PIRP Irp )
{
    IoMarkIrpPending( Irp );
    IoStartPacket( pDO, Irp, 0, NULL );
    return STATUS_PENDING;
}

It's a little-known fact that the I/O Manager automatically completes any
IRP that isn't marked pending as soon as the Dispatch function returns.
Unfortunately, this automatic mechanism doesn't work the same way as an explicit call to
IoCompleteRequest. In particular, it doesn't include calling any I/O Completion
routines attached to the IRP by higher-level drivers. Consequently, it's important
that your driver either marks an IRP as pending or completes it explicitly with
IoCompleteRequest.

8.4

PROCESSING SPECIFIC KINDS OF REQUESTS
The previous section described the general kinds of processing done by a driver's
Dispatch routines. These routines may also need to perform various operations
that depend on the IRP's function code and the buffering strategy used with the
device. This section discusses some of these request-specific issues. This material
is also relevant to the Start I/O routine and other parts of a driver, but it appears
here because this is the first place where you might run into it.

Processing Read and Write Requests
Chapter 6 explained how to create Device objects, which included setting
the DO_BUFFERED_IO or DO_DIRECT_IO bits in the Device object's Flags field.
These bits control the I/O Manager's behavior for all IRP_MJ_READ and
IRP_MJ_WRITE requests sent to the device. Here's what happens once you've set
these flags.

Buffered I/O. At the start of both read and write requests, the I/O Manager
checks the accessibility of the user buffer. It then allocates a piece of nonpaged
pool as big as the caller's buffer and puts its address in the
AssociatedIrp.SystemBuffer field of the IRP. This is the buffer your driver should use for the actual data
transfer.
For IRP_MJ_READ operations, the I/O Manager also sets the IRP's
UserBuffer field to the user-space address of the caller's buffer. Later, when the request
is completed, it will use this address to copy data from the driver's system-space
buffer back to the caller's buffer. For an IRP_MJ_WRITE request, the I/O Manager
sets the IRP's UserBuffer field to NULL and copies the contents of the user buffer
into the system buffer.
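
The copy-in and copy-out behavior described above can be modeled in a few lines of ordinary user-mode C. The sketch below is only an analogy: MOCK_IRP, MockBeginBuffered, and MockCompleteBuffered are names invented for this illustration, malloc stands in for nonpaged pool, and nothing here reflects how the I/O Manager is actually implemented. What it does show is the direction of each copy and why the UserBuffer address only matters for reads.

```c
#include <stdlib.h>
#include <string.h>

/* Invented stand-in for the handful of IRP fields discussed above. */
typedef struct {
    void  *SystemBuffer;  /* models AssociatedIrp.SystemBuffer        */
    void  *UserBuffer;    /* caller's address, remembered for reads   */
    size_t Length;        /* size of the caller's buffer              */
} MOCK_IRP;

/* Start of a buffered request: allocate the intermediate buffer.
   For a write, copy the caller's data in right away. */
int MockBeginBuffered(MOCK_IRP *irp, void *userBuf, size_t len, int isWrite)
{
    irp->SystemBuffer = malloc(len ? len : 1);
    if (irp->SystemBuffer == NULL)
        return -1;
    irp->Length = len;
    if (isWrite) {
        memcpy(irp->SystemBuffer, userBuf, len);
        irp->UserBuffer = NULL;          /* nothing to copy back later */
    } else {
        irp->UserBuffer = userBuf;       /* needed at completion time  */
    }
    return 0;
}

/* Completion: for a read, copy the transferred bytes back to the
   caller, then release the intermediate buffer in either case. */
void MockCompleteBuffered(MOCK_IRP *irp, size_t bytesTransferred)
{
    if (irp->UserBuffer != NULL) {
        if (bytesTransferred > irp->Length)
            bytesTransferred = irp->Length;
        memcpy(irp->UserBuffer, irp->SystemBuffer, bytesTransferred);
    }
    free(irp->SystemBuffer);
    irp->SystemBuffer = NULL;
}
```

In this model the "driver" only ever touches SystemBuffer between the two calls, which is exactly the discipline the real mechanism expects of you.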

Direct I/O. The I/O Manager checks the accessibility of the user buffer and
locks it in physical memory. It then builds a Memory Descriptor List (MDL) for
the buffer and stores the address of the MDL in the IRP's MdlAddress field. Both
the AssociatedIrp.SystemBuffer and UserBuffer fields are set to NULL.
Normally, you use the MDL to set up a DMA operation, as you'll see in
Chapter 12. If you're performing Direct I/O with a programmed I/O device, you
can use the MmGetSystemAddressForMdl function to get a system-space
address for the user buffer. This function doubly maps the caller's buffer into a
range of nonpaged system space. (In effect, the buffer lives at two virtual
addresses at one time.) When your driver completes the I/O request, the system
automatically unmaps the buffer from system space. [5]

[5] Drivers ought to avoid this technique, because releasing the doubly-mapped pages causes every
CPU in the system to flush its data cache. This is terrible for system performance.

Neither method. If you specify neither Buffered nor Direct I/O when you
create a Device object, it's up to your driver to decide how to handle buffering
issues. The I/O Manager simply puts the user-space address of the caller's buffer
into the IRP's UserBuffer field. In this case, the IRP's AssociatedIrp.SystemBuffer
and MdlAddress fields have no meaning and are set to NULL.
Be very careful about accessing the caller's buffer in user space with the
UserBuffer field of the IRP - even if the buffer is locked down. Since IRPs are
processed asynchronously, there's no guarantee that the calling process will still
be mapped into user space by the time your driver executes. The only exception
to this rule is that the Dispatch routines (and only the Dispatch routines) of a
highest-level driver can use UserBuffer to access the caller's buffer. This is
because these routines always run in the context of the thread issuing the I/O
request. Other routines in a highest-level driver (and any routine in a lower
driver) don't have this guarantee.

Processing IOCTL Requests
Once your driver has filled in either the IRP_MJ_DEVICE_CONTROL or the
IRP_MJ_INTERNAL_DEVICE_CONTROL slots in the Driver object's
MajorFunction table, the I/O Manager starts passing these requests to the associated
Dispatch routines. At this point, your driver has to decide what to do with the
request.
Other than buffer access checking (described later), the I/O Manager does
no validation of either the IOCTL control code itself or the contents of the caller's
buffers. (For example, the FILE_DEVICE_XXX field of the IOCTL does not have to
match that of the target Device object.) The caller could pass any random number
as an IOCTL code, and it would find its way to your IOCTL Dispatch routine. So,
it's up to you to do any necessary sanity checking.
IOCTL dispatchers usually turn into one of those horrendous switch
statements that Microsoft finds so intriguing. The following skeleton of code shows
the general layout of a Dispatch routine that processes IOCTL requests.

NTSTATUS
XxDispIoControl(                                    // (1)
    IN PDEVICE_OBJECT pDO,
    IN PIRP Irp )
{
    PIO_STACK_LOCATION IrpStack;
    ULONG ControlCode;
    ULONG InputLength, OutputLength;
    NTSTATUS Status;

    IrpStack = IoGetCurrentIrpStackLocation( Irp );

    // Extract useful information from the I/O stack
    //
    ControlCode = IrpStack->
        Parameters.DeviceIoControl.IoControlCode;
    InputLength = IrpStack->
        Parameters.DeviceIoControl.InputBufferLength;
    OutputLength = IrpStack->
        Parameters.DeviceIoControl.OutputBufferLength;

    switch( ControlCode ) {

    case IOCTL_XXDEVICE_AIM:                        // (2)
        // Check buffer sizes and fail if
        // not enough space...
        //
        if(( InputLength <                          // (3)
                sizeof( XX_AIM_IN_BUFF )) ||
           ( OutputLength <
                sizeof( XX_AIM_OUT_BUFF )))
        {
            Status = STATUS_INVALID_BUFFER_SIZE;
            break;
        }

        // Everything's OK; pass IRP to Start I/O
        //
        IoMarkIrpPending( Irp );                    // (4)
        IoStartPacket( pDO, Irp, 0, NULL );
        return STATUS_PENDING;

    case IOCTL_XXDEVICE_LAUNCH:
        if( InputLength > 0                         // (5)
            || OutputLength > 0 )
        {
            Status = STATUS_INVALID_PARAMETER;
            break;
        }
        // Same kind of processing as the case
        // above; written out here so this case
        // doesn't fall through to the default.
        //
        IoMarkIrpPending( Irp );
        IoStartPacket( pDO, Irp, 0, NULL );
        return STATUS_PENDING;

    // It's not a recognized control code...
    //
    default:
        Status = STATUS_INVALID_DEVICE_REQUEST;
        break;
    }

    // We only wind up here if there's an error
    //
    Irp->IoStatus.Status = Status;                  // (6)
    Irp->IoStatus.Information = 0;
    IoCompleteRequest( Irp, IO_NO_INCREMENT );
    return Status;
}

(1) If you support both external IRP_MJ_DEVICE_CONTROL and internal
    IRP_MJ_INTERNAL_DEVICE_CONTROL (kernel-mode only) interfaces,
    you'll probably want individual IOCTL Dispatch routines for each major
    function code.

(2) Include a separate case for each IOCTL code that might appear. Any code
    that isn't supported will end up in the default case and fail.

(3) You have to make sure that any buffers associated with the IOCTL are big
    enough. This has to be checked individually for each IOCTL code, since
    different control codes may have different input and output structures.

(4) If the IOCTL makes it through all the validation checks, it gets sent to the
    driver's Start I/O routine. This assumes that the IOCTL causes some kind
    of device operation. For IOCTLs that don't require device activity, you
    can perform the operation and complete the IRP successfully from the
    XxDispIoControl routine.

(5) If you're not expecting any buffers for a particular IOCTL code, you might
    want to return STATUS_INVALID_PARAMETER and fail. This isn't really
    an error, but it makes you wonder if the caller is missing a clue or two.

(6) If something is wrong with this IOCTL request, fail the IRP using
    whatever status value was generated by the switch statement.


Managing IOCTL Buffers
IOCTL requests can involve both an input buffer coming from the caller and
an output buffer being returned to the caller. As a result, they act like a
combination of a write operation followed by a read. From previous sections of
this chapter, you know that the buffering strategy used for an IOCTL request is
determined by the low-order 2 bits of the IOCTL code itself. The following
paragraphs describe how the various buffering methods work.

METHOD_BUFFERED The I/O Manager starts by allocating a single
chunk of nonpaged pool that's big enough to hold either the caller's input or
output buffer (whichever is larger). It puts the address of the nonpaged pool
buffer in the IRP's AssociatedIrp.SystemBuffer field. It then copies the IOCTL's
input data into the system buffer and sets the UserBuffer field of the IRP to the
user-space output buffer address. When your driver completes the IOCTL IRP,
the I/O Manager copies the contents of the system buffer back into the caller's
output buffer.
Since the same piece of nonpaged pool is being used for both the input and
output buffers, your driver should read all incoming data before it writes any
output data to the buffer.
METHOD_IN_DIRECT The I/O Manager allocates an input buffer from nonpaged
pool and stores its address in the IRP's AssociatedIrp.SystemBuffer field. It
copies the contents of the caller's original input buffer into the system
buffer.
It then checks that the buffer the caller supplied as the output buffer is
readable, locks it into physical memory, builds an MDL for it, and stores a
pointer to the MDL in the MdlAddress field of the IRP. With this method, the
driver treats the locked-down buffer as a second source of data going to the
device, not as a place to return results.
METHOD_OUT_DIRECT The I/O Manager checks the accessibility of the
caller's output buffer and locks it into physical memory. It then builds an MDL for
the output buffer and stores a pointer to the MDL in the MdlAddress field of the
IRP.
The I/O Manager also allocates an input buffer from nonpaged pool and
stores its address in the IRP's AssociatedIrp.SystemBuffer field. It copies the
contents of the caller's original input buffer into the system buffer and sets
the IRP's UserBuffer field to NULL.
METHOD_NEITHER The I/O Manager puts the address of the caller's
input buffer in the Parameters.DeviceIoControl.Type3InputBuffer field of the
IRP's current I/O stack location. It stores the address of the output buffer in the
IRP's UserBuffer field. Both of these are user-space addresses.

8.5 TESTING DRIVER DISPATCH ROUTINES
Your driver still has a long way to go, but once again, you can verify some aspects
of its operation. In particular, you can test the driver to be sure that it

• Opens and closes a handle to the device

• Supports Win32 I/O function calls that return successfully

• Manages requests from multiple callers

Still not very ambitious goals, but if you complete these tests successfully,
your driver will be one step closer to full operation.

Testing Procedure
The following procedure will let you check all the code paths through your
driver's Dispatch routines.

1. Write IRP_MJ_CREATE and IRP_MJ_CLOSE Dispatch routines for your
   driver.

2. Test the driver with a simple Win32 console program that gets a handle to
   your device and then closes the handle.

3. Write other Dispatch routines but modify them so that they always call
   IoCompleteRequest rather than starting any device operations.

4. Modify your Win32 test program to make ReadFile, WriteFile, and
   DeviceIoControl calls that exercise each driver Dispatch routine.

5. If your device is shareable, run several copies of the test program at once to be
   sure the driver works with multiple open handles.

6. If your driver supports multiple physical devices, repeat the tests with each
   device unit.

Sample Test Program
This is an example of the kind of test program you can use to verify the code
paths through a driver's Dispatch routines.

#include <windows.h>
#include <stdio.h>

VOID main( VOID )
{
    HANDLE hDevice;
    BOOL status;

    hDevice = CreateFile( "\\\\.\\XX1" ... );

    status = ReadFile( hDevice ... );

    status = WriteFile( hDevice ... );

    status = DeviceIoControl( hDevice ... );

    status = CloseHandle( hDevice );
}

8.6 SUMMARY
In this chapter, you've seen the beginning of the I/O processing cycle. By now,
you should have a good idea of what IRP function codes your driver will need to
support. If some of these functions include IOCTLs, the information in this
chapter will help you implement them correctly. If you're writing a higher-level
driver, that may be the end of the story.
For device drivers, however, there's still more to do. In the next chapter,
you'll see how to perform actual data transfer operations.

CHAPTER 9

Programmed I/O Data Transfers

Devices that do programmed I/O need a great
deal of attention from the CPU while they transfer data. Usually, these are slow
devices (like the mouse or keyboard) that don't move large amounts of data in a
single operation. This chapter explains how to write the data transfer sections of
drivers for this kind of hardware.

9.1 HOW PROGRAMMED I/O WORKS

This section describes the events that occur during a programmed I/O operation,
as well as describing some of the other issues a driver will have to face.

What Happens during Programmed I/O
In a programmed I/O operation, the CPU transfers each unit of data to or
from the device in response to an interrupt. Referring to Figure 9.1, the following
sequence of events takes place:

1. The Start I/O routine performs any necessary preprocessing and setup based
   on the IRP_MJ_XXX function code in the IRP. It then starts the device.

2. Eventually, the device generates an interrupt which the Kernel passes to the
   driver's Interrupt Service routine.

3. If there is any more data, the Interrupt Service routine starts the next transfer.
   Steps 2 and 3 may repeat any number of times until the operation is complete.

4. When the operation completes, either because there's no more data or because
   an error occurs, the Interrupt Service routine queues a request to fire off the
   driver's DpcForIsr routine.

5. The DPC dispatcher eventually runs the DpcForIsr which releases the current
   IRP back to the I/O Manager. If there are any more IRPs waiting, the
   DpcForIsr sends the next packet to the driver's Start I/O routine, and the whole
   cycle repeats.

[Figure 9.1 Sequence of events in a programmed I/O. Labels in the figure:
Dispatch; (start device); Interrupt; Interrupt Service, IoRequestDpc;
DpcForIsr, IoCompleteRequest, IoStartNextPacket.
Copyright © 1994 by Cydonix Corporation.]

Synchronizing Various Driver Routines
Driver routines running at an IRQL below DIRQL must synchronize their
access to any device registers or memory areas shared with the driver's Interrupt
Service routine. Without this protection, an interrupt might arrive while a
low-IRQL routine was using the shared resource, and the outcome would be
unpredictable (but probably nothing good). You solve this synchronization
problem by putting code that touches these shared resources in a
SynchCritSection routine. Table 9.1 shows you the prototype for one of these
routines.
When you need to execute a SynchCritSection routine, you pass its address
as an argument to KeSynchronizeExecution (see Table 9.2). This function raises
IRQL to the DIRQL level of the Interrupt object, acquires the object's Interrupt
spin lock and then calls your SynchCritSection routine. While it's running, your
SynchCritSection code is guaranteed not to be interrupted by the device
associated with the Interrupt object. When your routine finishes,
KeSynchronizeExecution releases the spin lock, drops IRQL back to its original
level, and returns to the caller.
Notice that you're allowed to pass some context information to the
SynchCritSection routine. Typically, this will be a pointer to the Device or
Controller Extension structure.

Table 9.1  Function prototype for a SynchCritSection routine

BOOLEAN XxSynchCritSection           IRQL == DIRQL
Parameter                            Description
IN PVOID Context                     Pointer to context passed to
                                     KeSynchronizeExecution
Return value                         • TRUE - success
                                     • FALSE - something failed

Table 9.2  Function prototype for KeSynchronizeExecution

BOOLEAN KeSynchronizeExecution       IRQL < DIRQL
Parameter                            Description
IN PKINTERRUPT Interrupt             Address of an Interrupt object
IN PKSYNCHRONIZE_ROUTINE Routine     SynchCritSection callback routine
IN PVOID Context                     Argument for SynchCritSection routine
Return value                         Value returned by SynchCritSection
                                     routine

9.2 DRIVER INITIALIZATION AND CLEANUP
Along with the general initialization and cleanup issues we've seen in previous
chapters, there are some specific things that a programmed I/O device driver
needs to take care of. The following subsections describe them in detail.

Initializing the Start I/O Entry Point
If your driver has a Start I/O routine, you need to let the I/O Manager know
where to find it. You do this by putting the address of the Start I/O routine into
the DriverStartIo field of the Driver object, as in the following code fragment:

NTSTATUS
DriverEntry(
    IN PDRIVER_OBJECT DriverObject,
    IN PUNICODE_STRING RegistryPath
    )
{
    //
    // Export other driver entry points
    //
    DriverObject->DriverStartIo = XxStartIo;
    DriverObject->DriverUnload = XxDriverUnload;
    .
    .
    .
    DriverObject->MajorFunction[ IRP_MJ_CREATE ] =
        XxDispatchOpenClose;
}

If you forget to initialize this entry point, you'll get an access violation (and a
bright blue screen) when your Dispatch routines call IoStartPacket.

Initializing a DpcForIsr Routine
The I/O Manager provides you with a simplified version of the DPC
mechanism. Tucked away inside each Device object is a single DPC object. To use
it, your DriverEntry routine just calls IoInitializeDpcRequest and associates a
DpcForIsr callback with the Device object. Later, your driver's Interrupt
Service routine can trigger this DPC by calling IoRequestDpc.
For some kinds of drivers, this simplified mechanism is too limited. In
Chapter 11, you'll see how to set up your own DPC objects if you need the
flexibility of multiple DPCs.

Connecting to an Interrupt Source
Before you can process interrupts, you have to establish a connection
between your device's interrupt vector and an Interrupt Service routine in your
driver. You do this by calling the IoConnectInterrupt¹ function described in Table
9.3. Given an Interrupt Service routine and some of the translated information
generated by your hardware location code, this function adds your ISR to the
Kernel's list of interrupt handlers.

1 If you recall, we first bumped into this function in the driver initialization
code in Chapter 6, where we treated it as a necessary bit of magic.

Table 9.3  Function prototype for IoConnectInterrupt

NTSTATUS IoConnectInterrupt              IRQL == PASSIVE_LEVEL
Parameter                                Description
OUT PKINTERRUPT *InterruptObject         Address of variable that receives
                                         pointer to Interrupt object
IN PKSERVICE_ROUTINE ServiceRoutine      ISR that handles this interrupt
IN PVOID ServiceContext                  Context argument passed to ISR;
                                         usually the Device Extension
IN PKSPIN_LOCK SpinLock                  Initialized spin lock (see below)
IN ULONG Vector                          Translated interrupt vector value
IN KIRQL Irql                            DIRQL value for device
IN KIRQL SynchronizeIrql                 Usually same as DIRQL (see below)
IN KINTERRUPT_MODE InterruptMode         • LevelSensitive
                                         • Latched
IN BOOLEAN ShareVector                   If TRUE, identifies this vector as
                                         shareable
IN KAFFINITY ProcessorEnableMask         Set of CPUs on which device interrupt
                                         can occur
IN BOOLEAN FloatingSave                  If TRUE, save the state of the FPU
                                         during an interrupt
Return value                             • STATUS_SUCCESS
                                         • STATUS_INVALID_PARAMETER
                                         • STATUS_INSUFFICIENT_RESOURCES

If it works, IoConnectInterrupt returns a pointer to an Interrupt object. You
should store this pointer in your Device or Controller Extension because you'll
need it in order to disconnect from the interrupt source or to execute any
SynchCritSection routines.
Three things are worth mentioning about IoConnectInterrupt. First, if your
ISR handles more than one interrupt vector, or if your driver has more than one
ISR, you need to supply the system with a spin lock to prevent collisions over the
ISR's ServiceContext. If you're not doing either of those things, then this spin lock
is unnecessary.²
Second, if the ISR manages more than one interrupt vector, or your driver
has more than one ISR, make sure that the value you specify for SynchronizeIrql
is the highest DIRQL value of any of the vectors you're using.

2 Normally, you declare storage space for this spin lock in the Device or
Controller Extension. Remember to call KeInitializeSpinLock before you connect
to an interrupt source.

Finally, your driver's Interrupt Service routine must be ready to run as soon
as you call this function. Interrupts from your device (or from other devices at the
same IRQL) may preempt any additional initialization done by your driver, and
the ISR has to handle these interrupts correctly. So, make sure all the necessary
driver setup work is done before you connect to an interrupt. In general, you
should follow this kind of sequence:

1. Call IoInitializeDpcRequest to initialize the Device object's DPC and perform
   any initialization needed to make the DpcForIsr routine execute properly.

2. Disable interrupts from the device by setting appropriate bits in the device's
   control registers.

3. Perform any driver initialization required by the ISR in order for it to run
   properly.

4. Call IoConnectInterrupt to attach your ISR to an interrupt source and store
   the address of the Interrupt object in the Device Extension.

5. Use a SynchCritSection routine to put the device into a known initial state
   and enable device interrupts.

Disconnecting from an Interrupt Source
If your driver is unloadable, you need to detach its Interrupt Service routine
from the Kernel's list of interrupt handlers before the driver is removed from
memory. If you forget to do this and your device generates an interrupt after the
driver is unloaded, the Kernel will try to call the address in nonpaged pool where
your ISR used to live. Nothing good will happen.
Disconnecting from an interrupt is a two-step procedure. First, use
KeSynchronizeExecution and a SynchCritSection routine to disable the device and
prevent it from generating any further interrupts. Second, remove your ISR from
the Kernel's list of handlers by passing the device's Interrupt object to
IoDisconnectInterrupt.

9.3 WRITING A START I/O ROUTINE

In the rest of this chapter, we'll be developing a programmed I/O driver for a
parallel port. To keep things simple, this driver ignores many of the details
you'd have to consider if you were writing a commercial driver. Take a look at
the sample driver that comes with the NT DDK to see what's involved in managing
these devices.

Execution Context
The I/O Manager calls your Start I/O routine (described in Table 9.4) either
when a Dispatch routine calls IoStartPacket (if the device was idle), or when
some other part of the driver calls IoStartNextPacket. In either case, Start I/O
runs at DISPATCH_LEVEL IRQL, so it mustn't do anything that causes a page
fault.

Table 9.4  Function prototype for a Start I/O routine

VOID XxStartIo                       IRQL == DISPATCH_LEVEL
Parameter                            Description
IN PDEVICE_OBJECT DeviceObject       Target device for this request
IN PIRP Irp                          IRP describing the request
Return value                         (none)

What the Start I/O Routine Does
Your driver's Start I/O routine is responsible for doing any
function-code-specific processing needed by the current IRP and then starting
the actual device operation. In general terms, a Start I/O routine will do the
following:

1. Call IoGetCurrentIrpStackLocation to get a pointer to the IRP's stack location.

2. If your device supports more than one IRP_MJ_XXX function code, examine
   the I/O stack location's MajorFunction field to determine the operation.

3. Make working copies of the system buffer pointer and byte count stored in
   the IRP. The Device Extension is the best place to keep these items.

4. Set a flag in the Device Extension indicating that you expect an interrupt.

5. Begin the actual device operation.

To guarantee proper synchronization, any of these steps that access data
shared with the ISR should be performed inside a SynchCritSection routine rather
than in Start I/O itself.

9.4 WRITING AN INTERRUPT SERVICE ROUTINE (ISR)
Once a device operation begins, the actual data transfer is driven by the arrival of
hardware interrupts. When an interrupt arrives, the driver's Interrupt Service
routine acknowledges the request and either transfers the next piece of data or
invokes a DPC routine.

Execution Context
When the Kernel gets a device interrupt, it uses its collection of Interrupt
objects to locate an ISR willing to service the event. It does this by running
through all the Interrupt objects attached to the DIRQL of the interrupt and
calling ISRs until one of them claims the interrupt.
The Kernel interrupt dispatcher calls your ISR at the synchronization IRQL
you specified in the call to IoConnectInterrupt. Usually this will be the DIRQL
level of the device. The Kernel dispatcher also acquires and releases the device
spin lock for you.
Running at such a high IRQL, there are lots of things your ISR isn't allowed
to do. In addition to the usual warning about page faults, your ISR shouldn't try
to allocate or free various system resources (like memory). If you plan to call any
system support routines from your ISR, check for restrictions on the level at
which they can run. You may need to perform those kinds of operations in a DPC
routine rather than in the ISR itself.
As you can see from Table 9.5, the Kernel passes you a pointer to whatever
context information you identified in IoConnectInterrupt. Most often, this will be
a pointer to the Device or Controller Extension.

What the Interrupt Service Routine Does
The Interrupt Service routine is the real workhorse in a programmed I/O
driver. In general, one of these routines will do the following:

1. Determine if the interrupt belongs to this driver. If not, immediately return a
   value of FALSE.

2. Perform any operations needed by the device to acknowledge the interrupt.

3. Determine if any more data remains to be transferred. If there is, start the next
   device operation. This will eventually result in another interrupt.

4. If all the data has been transferred (or if a device error occurred), queue up a
   DPC request by calling IoRequestDpc.

5. Return a value of TRUE.

Table 9.5  Function prototype for an Interrupt Service routine

BOOLEAN XxISR                        IRQL == DIRQL
Parameter                            Description
IN PKINTERRUPT Interrupt             Interrupt object generating the interrupt
IN PVOID ServiceContext              Context area passed to IoConnectInterrupt
Return value                         • TRUE - interrupt was serviced by XxISR
                                     • FALSE - interrupt not serviced

Always code an ISR for speed. Any work that isn't absolutely essential
should go in a DPC routine. It's especially important that your ISR doesn't drag its
feet while determining whether or not to service an interrupt. There may be any
number of other ISRs waiting in line behind yours for a given interrupt, and if
you do a lot of processing before you decide not to handle the event, you can slow
them down.

9.5 WRITING A DPCFORISR ROUTINE
Your driver's DpcForIsr routine is responsible for determining a final status for
the current request, completing the IRP, and starting the next one.

Execution Context
In response to the ISR's call to IoRequestDpc, your driver's DpcForIsr
routine (described in Table 9.6) is added to the DPC dispatch queue. When the
CPU's IRQL value drops below DISPATCH_LEVEL, the DPC dispatcher calls the
DpcForIsr routine. Your DpcForIsr routine runs at DISPATCH_LEVEL IRQL, which
means it has no access to pageable addresses.
Once you call IoRequestDpc for a given device, the I/O Manager ignores
any further IoRequestDpc calls for that device until the DpcForIsr routine
executes. This is standard behavior for DPC objects. If your driver design is
such that you might issue overlapping DPC requests for the same device, then
it's up to you to handle this situation properly. You'll need to keep track of
the pending requests and have the DPC routine perform the work for all of them
each time it executes.

What the DpcForIsr Routine Does
Since most of the work happens during interrupt processing, the DpcForIsr
routine in a programmed I/O driver doesn't have a lot to do. In particular, this
routine should

1. Set the IRP's I/O status block. Put an appropriate STATUS_XXX code in the
   Status field and the actual number of bytes transferred in the Information
   field.

2. Call IoCompleteRequest to complete the IRP with an appropriate priority
   boost. Once you've made this call, don't touch the IRP again.

3. Call IoStartNextPacket to send the next IRP to Start I/O.

Table 9.6  Function prototype for a DpcForIsr routine

VOID XxDpcForIsr                     IRQL == DISPATCH_LEVEL
Parameter                            Description
IN PKDPC Dpc                         DPC object responsible for this call
IN PDEVICE_OBJECT DeviceObject       Target device for I/O request
IN PIRP Irp                          IRP describing the current request
IN PVOID Context                     Context passed to IoRequestDpc
Return value                         (none)

Priority Increments
The NT thread-scheduler uses a priority-boosting strategy to keep the CPU
and I/O devices as busy as possible. As you can see from the boost values listed
in Table 9.7, priority increments are weighted so as to favor threads working with
interactive devices like the mouse and keyboard.
As part of this strategy, your driver should compensate any thread that waits
for an actual device operation by giving it a priority boost. Choose an appropriate
increment from the table and specify it as an argument to IoCompleteRequest.

Table 9.7  Specify one of these values when you complete an I/O request

Priority increment values
Symbol                     Boost   Use when completing ...
IO_NO_INCREMENT            0       Requests involving no device I/O
IO_CD_ROM_INCREMENT        1       CD-ROM input
IO_DISK_INCREMENT          1       Disk I/O
IO_PARALLEL_INCREMENT      1       Parallel-port I/O
IO_VIDEO_INCREMENT         1       Video output
IO_MAILSLOT_INCREMENT      2       Mailslot I/O
IO_NAMED_PIPE_INCREMENT    2       Named pipe I/O
IO_NETWORK_INCREMENT       2       Network I/O
IO_SERIAL_INCREMENT        2       Serial-port I/O
IO_MOUSE_INCREMENT         6       Pointing-device input
IO_KEYBOARD_INCREMENT      6       Keyboard input
IO_SOUND_INCREMENT         6       Sound board I/O

9.6 SOME HARDWARE: THE PARALLEL PORT
Before we walk through an example of a programmed I/O driver, it will be
helpful to look at some actual hardware. This serves the dual purpose of showing
you what kinds of devices tend to perform programmed I/O and of giving us
something to control with our driver.

How the Parallel Port Works
The parallel interface found on most PCs is based on an ancient standard
from the Centronics Company. Although its original purpose was to communicate
with printers, clever people have found ways of attaching everything from disks to
optical scanners to the parallel port. The DB-25 connector on this port carries a
number of signals, the most important ones being:
• Initialize: The CPU sends a pulse down this line when it wants to
  initialize the printer.

• Data: The CPU uses these eight lines to send one byte of data to the
  printer. On systems with extended parallel interfaces, these lines can also
  be used for input.

• Strobe#: The CPU pulses this line once to let the printer know that valid
  information is available on the data lines.³

• Busy: The printer uses this line to let the CPU know that it can't accept
  any data.

• Ack#: The printer sends a single pulse down this line when it is no
  longer busy.

• Errors: The printer can use several lines to indicate a variety of
  not-ready and error conditions to the CPU.

3 Following the standard convention, a line with # in its name means that
ground indicates a logic-1, while presence of a signal on the line indicates a
logic-0.

The following sequence of events occurs during a data transfer from the
CPU to a printer attached to the parallel port:

1. The CPU places a byte on the eight data lines and lets the data settle for at
   least half a microsecond.

2. The CPU grounds the STROBE# line for at least half a microsecond and then
   raises it again. This is the signal to the printer that it should latch the byte on
   the data lines.

3. In response to the new data, the printer raises the BUSY line and starts to
   process the byte. This usually means moving the byte to an internal buffer.

4. After it processes the character (which may take microseconds or seconds,
   depending on how full the printer's buffer is), the printer lowers the BUSY
   line and pulses the ACK# wire by grounding it briefly.⁴

4 Yes, using two lines to indicate a ready status is redundant.

You can see from this description that the parallel port offers a very
low-level interface to the outside world. Most of the signaling protocol
involved in a data transfer has to be implemented by the CPU itself. This is
going to have a major impact on the design of our driver.

Device Registers
The software interface to the parallel port is through a set of three registers,
described in Table 9.8. Since the parallel port is one of the things detected by
autoconfiguration (even on an ISA system), our driver will be able to use the
Configuration Manager to find the base address of the data register.
If you look at the bit settings in Table 9.8, you'll notice that some of the bits
have the opposite polarity from the signals they represent. For example, you need
to set the STROBE bit to 1 if you want to ground the STROBE# wire and get the
printer to accept your data. Also, the BUSY wire going to ground causes the BUSY
bit in the status register to set itself, so it's really a NOT-BUSY bit. The "solder
people" may have a good explanation for all this, but it's usually best to hide
these oddities in a hardware header file.⁵

Table 9.8  These registers control a parallel port interface

Parallel port registers
Offset  Register   Access  Description
0       Data       R/W     Data byte transferred through parallel port
1       Status     R/O     Current parallel port status
        Bits 0-1           Reserved; normally contain a 1
        Bit 2              0 - interrupt has been requested by port
        Bit 3              0 - an error occurred
        Bit 4              1 - printer is selected
        Bit 5              1 - printer is out of paper
        Bit 6              0 - acknowledge
        Bit 7              0 - printer is busy
2       Control    R/W     Commands sent to parallel port
        Bit 0              1 - strobe data to/from parallel port
        Bit 1              1 - automatic line feed
        Bit 2              0 - initialize printer
        Bit 3              1 - select printer
        Bit 4              1 - enable interrupts
        Bit 5              1 - read data from parallel port*
        Bits 6-7           Reserved; must be 1

*Only valid for extended parallel ports; otherwise this must be 0.

5 See the HARDWARE.H header file included in the on-disk version of the sample
source code that accompanies this chapter.

Interrupt Behavior
On ISA machines, the parallel port designated as LPT1 normally uses IRQ 7
and LPT2 uses IRQ 5. A device connected to a parallel port generates an interrupt
by grounding the ACK# line momentarily. Most printers yank on this line for any
of the following reasons:

• The printer has finished initializing itself.

• The printer has processed one character and is now ready for another.

• Power to the printer has been switched off.

• The printer has gone offline or has run out of paper.

There's some variability in the way different printers implement these
features. For example, not all of them generate an interrupt when they've
completed their initialization, nor do all printers interrupt when they go
offline or run out of paper. The driver developed later in this chapter assumes
that all these conditions produce interrupts.

A Driver for the Parallel Port
So, just what is it about the parallel port that makes it a good candidate for
programmed I/O? Looking at the device's behavior, one clue is that each byte
sent to the device has to be transferred through the CPU. DMA devices work
independently of the CPU and don't demand this much attention.
Another hint is that it generates an interrupt after each byte is accepted by
the device. This means a large number of interrupts will probably occur before an
operation is complete. DMA devices typically generate only a single interrupt
when a transfer is complete.

9.7 CODE EXAMPLE: PARALLEL PORT DRIVER
This example shows how to write a basic programmed I/O driver for the parallel
port. You can find the code for this example in the CH09\DRIVER directory on
the disk that accompanies this book.

XXDRIVER.H
This version of the main header file builds on the ones seen in previous
chapters. Only one structure from this file is of much interest.

DEVICE_EXTENSION The following excerpt shows the changes in the
Device Extension needed to support the parallel port.

typedef struct _DEVICE_EXTENSION {
    ULONG FifoSize;               // Bytes to send at once
    ULONG BytesRequested;         // Requested transfer size
    ULONG BytesRemaining;         // Chars left to transfer  ❶
    PUCHAR pBuffer;               // Next char to send
    BOOLEAN TransferInProgress;   //                         ❷
    UCHAR DeviceStatus;           // Most recent status      ❸
} DEVICE_EXTENSION, *PDEVICE_EXTENSION;

❶ These two fields are working copies of the requested transfer size and the
system buffer pointer taken from the IRP. They are used to keep track of
where we are in the transfer. Modifying the IRP itself would be a disaster
because the I/O Manager uses it to clean up after the request.

❷ This flag is used to detect spurious interrupts. It's set at the beginning of a
transfer and cleared when the request is completed.

❸ This field keeps track of the most recent status of the parallel port. The
DpcForIsr routine uses it to figure out what kind of status to give back to
the caller.

INIT.C
Most of the code in this module is the same as it was in Chapter 6. The
changes have to do with some hardware-specific initialization.

XxCreateDevice This excerpt shows the proper sequence of operations for
enabling interrupts and initializing a piece of hardware.
static NTSTATUS
XxCreateDevice(
    IN PDRIVER_OBJECT DriverObject,
    IN PCONFIG_BLOCK pConfig,    // Config block
    IN ULONG uNum                // Device number
    )
{
    .
    .
    status = IoCreateSymbolicLink(
                &linkName,
                &deviceName );
    //
    // See if the symbolic link was created.
    //
    if( !NT_SUCCESS( status )) {
        IoDeleteDevice( pDevObj );
        return status;
    }
    .
    .
    //
    // Make sure device interrupts are OFF
    //
    XxWriteControl(                              // (1)
        pDevExt,
        XX_CTL_DEFAULT | XX_CTL_NOT_INI );
    //
    // Connect to an Interrupt object...
    //
    status = IoConnectInterrupt(                 // (2)
                &pDevExt->pInterrupt,
                XxIsr,
                pDevExt,
                NULL,
                pConfig->Device[uNum].SystemVector,
                pConfig->Device[uNum].Dirql,
                pConfig->Device[uNum].Dirql,
                pConfig->Device[uNum].InterruptMode,
                pConfig->Device[uNum].ShareVector,
                pConfig->Device[uNum].Affinity,
                pConfig->Device[uNum].FloatingSave );
    if( !NT_SUCCESS( status )) {
        IoDeleteSymbolicLink( &linkName );
        IoDeleteDevice( pDevObj );
        return status;
    }
    //
    // Initialize the hardware and enable interrupts
    //
    KeSynchronizeExecution(                      // (3)
        pDevExt->pInterrupt,
        XxInitDevice,
        pDevExt );
    return status;
}

(1) It's important to put the device into a known state. This includes
    disabling interrupts from the port.

(2) The driver uses values recovered by XxGetHardwareInfo to attach its
    Interrupt Service Routine to the device's interrupt vector.

(3) Finally, the driver uses a Synch Critical Section routine to initialize the
    device, including turning on its interrupts. Keep in mind that the
    Interrupt Service Routine may actually get called as soon as the
    KeSynchronizeExecution function returns.

XxInitDevice This function cycles the INIT line, causing the printer to
start initializing itself. This will eventually produce an interrupt. The function
then sets the SELECT line and enables interrupts from the port. This might result
in an immediate interrupt. However, since this function is being called by
KeSynchronizeExecution, it's not in any danger of being disturbed by parallel port
interrupts.

static BOOLEAN
XxInitDevice(
    IN PVOID SynchContext
    )
{
    PDEVICE_EXTENSION pDE =
        (PDEVICE_EXTENSION) SynchContext;

    XxWriteControl( pDE, XX_CTL_DEFAULT );       // (1)
    KeStallExecutionProcessor( 60 );
    XxWriteControl(                              // (2)
        pDE,
        XX_CTL_DEFAULT
        | XX_CTL_NOT_INI
        | XX_CTL_SELECT
        | XX_CTL_INTENB );
    KeStallExecutionProcessor( 60 );
    return TRUE;
}

(1) Clear the NOT_INIT bit. This begins the printer's initialization cycle. The
    driver waits 60 microseconds to be sure the signal has stabilized.

(2) To complete the cycle, the driver sets the NOT_INIT bit. It also enables
    interrupts and tells the printer to select itself. Again, it's necessary to wait
    a little while the signals stabilize.
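The two-step INIT cycle described above can be tried out in user mode by recording the control-port writes instead of touching hardware. The following is only a sketch under stated assumptions: the SIM_CTL_ values are hypothetical stand-ins chosen to follow the conventional PC parallel port control register layout (the book's HARDWARE.H definitions are not shown in this excerpt), and SimPort, sim_write_control, and sim_init_device are illustrative names, not part of the driver.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit values (assumption): they follow the conventional PC
 * parallel port control register layout, not the book's HARDWARE.H. */
#define SIM_CTL_DEFAULT  0x00u
#define SIM_CTL_NOT_INI  0x04u   /* 1 = INIT line released */
#define SIM_CTL_SELECT   0x08u   /* 1 = printer selected */
#define SIM_CTL_INTENB   0x10u   /* 1 = port interrupts enabled */

#define SIM_MAX_WRITES   8

typedef struct {
    unsigned char writes[SIM_MAX_WRITES];  /* history of control-port writes */
    size_t        count;
} SimPort;

/* Stand-in for XxWriteControl: record the value rather than do port I/O. */
static void sim_write_control(SimPort *port, unsigned char value)
{
    assert(port->count < SIM_MAX_WRITES);
    port->writes[port->count++] = value;
}

/* Follows the order of operations in XxInitDevice. The stall calls are
 * omitted because there are no real signals to settle. */
static void sim_init_device(SimPort *port)
{
    /* Step 1: NOT_INI clear asserts INIT and starts the printer's reset. */
    sim_write_control(port, SIM_CTL_DEFAULT);

    /* Step 2: release INIT, select the printer, and enable interrupts. */
    sim_write_control(port, SIM_CTL_DEFAULT | SIM_CTL_NOT_INI
                            | SIM_CTL_SELECT | SIM_CTL_INTENB);
}
```

After sim_init_device runs on a zeroed SimPort, the recorded history holds exactly two writes: the first with the NOT_INI bit clear, the second with NOT_INI, SELECT, and INTENB all set. Interrupts are enabled only in the final write, which is why the real routine is run under KeSynchronizeExecution.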

TRANSFER.C
The routines in this file do the actual work of transferring data out to the
parallel port. This includes starting each operation, handling interrupts, and
cleaning up with a DPC.

XxStartIo This function does any preprocessing needed by the current
IRP and then starts the actual device operation.

VOID
XxStartIo(
    IN PDEVICE_OBJECT DeviceObject,
    IN PIRP Irp
    )
{
    PIO_STACK_LOCATION IrpStack =
        IoGetCurrentIrpStackLocation( Irp );
    PDEVICE_EXTENSION pDE =
        DeviceObject->DeviceExtension;

    switch( IrpStack->MajorFunction ) {          // (1)
        //
        // Use a SynchCritSection routine to
        // start the write operation...
        //
        case IRP_MJ_WRITE:
            //
            // Set up counts and byte pointer (2)
            //
            pDE->BytesRequested =
                IrpStack->Parameters.Write.Length;
            pDE->BytesRemaining =
                pDE->BytesRequested;
            pDE->pBuffer =
                Irp->AssociatedIrp.SystemBuffer;
            if( !KeSynchronizeExecution(         // (3)
                    pDE->pInterrupt,
                    XxTransmitBytes,
                    pDE ))
                XxDpcForIsr(
                    NULL,
                    DeviceObject,
                    Irp,
                    pDE );
            break;
        default:                                 // (4)
            Irp->IoStatus.Status =
                STATUS_NOT_SUPPORTED;
            Irp->IoStatus.Information = 0;
            IoCompleteRequest(
                Irp,
                IO_NO_INCREMENT );
            IoStartNextPacket( DeviceObject, FALSE );
            break;
    }
}

(1) Since all requests get funneled through a single Start I/O routine, it's
    necessary to switch on the major-function code if you have to do any
    function-specific preprocessing.

(2) These are the private copies of the pointer and byte counts that the driver
    uses to keep track of its place in the system buffer.

(3) The driver tries to send some number of bytes out to the device. If
    anything goes wrong, it calls XxDpcForIsr as a regular subroutine to
    complete the request.

(4) The driver should never get to the default case, because unsupported
    functions have been filtered out by the I/O Manager during the
    dispatching process. But it's better to be safe than sorry.

XxTransmitBytes This function sends as many bytes as possible to the
parallel port. This will be either one FIFO's worth, or as many as are left in the
system buffer. Both XxStartIo and XxIsr call this function. In either case, it expects
to be running at DIRQL, synchronized with the driver's ISR.

static BOOLEAN
XxTransmitBytes(
    IN PVOID Context  // Pointer to the Device Extension
    )
{
    PDEVICE_EXTENSION pDE =
        (PDEVICE_EXTENSION) Context;
    ULONG XferSize;
    UCHAR Control = XxReadControl( pDE );

    pDE->DeviceStatus = XxReadStatus( pDE );     // (1)
    if(( pDE->BytesRemaining == 0 )              // (2)
        || !XX_OK( pDE->DeviceStatus )) {
        pDE->TransferInProgress = FALSE;
        return FALSE;
    }
    //
    // A transfer is happening. Calculate the number
    // of bytes to send in one bunch.
    //
    pDE->TransferInProgress = TRUE;              // (3)
    if( pDE->BytesRemaining < pDE->FifoSize )
        XferSize = pDE->BytesRemaining;
    else
        XferSize = pDE->FifoSize;
    //
    // Send as many bytes to the device as it
    // can handle. Each byte must be strobed
    // out.
    //
    while( XferSize > 0 ) {                      // (4)
        //
        // Make sure the STROBE bit is off
        //
        XxWriteControl(
            pDE,
            Control & ~XX_CTL_STROBE );
        //
        // Send a byte and hold it for at least
        // 500 nanoseconds
        //
        XxWriteData( pDE, *pDE->pBuffer );
        KeStallExecutionProcessor( 1 );
        //
        // Turn on the STROBE bit and hold it
        // for at least 500 nanoseconds
        //
        XxWriteControl(
            pDE,
            Control | XX_CTL_STROBE );
        KeStallExecutionProcessor( 1 );
        //
        // Turn off the STROBE line
        //
        XxWriteControl(
            pDE,
            Control & ~XX_CTL_STROBE );
        KeStallExecutionProcessor( 1 );
        //
        // Update pointer and counters
        //
        pDE->pBuffer++;
        XferSize--;
        pDE->BytesRemaining--;
    }
    return TRUE;
}

(1) The XxDpcForIsr routine will use this status field to figure out what
    happened during the I/O processing cycle.

(2) If all the bytes have been sent, or there was a problem with the printer,
    just return a FALSE and quit.

(3) Send either one FIFO's worth of data, or as many bytes as are left in the
    buffer, whichever is less.

(4) This loop sends out one bucketful of data to the port. The body of the
    loop incorporates the strobing protocol required for sending data to the
    parallel port.
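To see how the FIFO-sized bursts add up over a whole request, the transmit logic can be modeled in portable C. This is a user-mode sketch, not driver code: SimExtension, sim_transmit_bytes, and sim_write are hypothetical names, the port is just an array, the strobe cycle is reduced to a single store, and the device-status check is left out.

```c
#include <assert.h>
#include <stddef.h>

#define SIM_PORT_CAPACITY 64

typedef struct {
    size_t               fifo_size;        /* bytes the device takes per burst */
    size_t               bytes_remaining;  /* working copy of request length */
    const unsigned char *next;             /* next byte to send */
    int                  transfer_in_progress;
    unsigned char        port[SIM_PORT_CAPACITY]; /* bytes "seen" by the device */
    size_t               port_len;
} SimExtension;

/* Same shape as XxTransmitBytes: returns 0 when nothing is left to send,
 * otherwise sends one burst and returns 1. */
static int sim_transmit_bytes(SimExtension *x)
{
    size_t burst;

    if (x->bytes_remaining == 0) {
        x->transfer_in_progress = 0;
        return 0;
    }
    x->transfer_in_progress = 1;
    burst = (x->bytes_remaining < x->fifo_size) ? x->bytes_remaining
                                                : x->fifo_size;
    while (burst > 0) {
        assert(x->port_len < SIM_PORT_CAPACITY);
        x->port[x->port_len++] = *x->next++;  /* stands in for the strobe cycle */
        x->bytes_remaining--;
        burst--;
    }
    return 1;
}

/* Plays the roles of Start I/O plus the stream of device interrupts.
 * Returns how many simulated interrupts it took to finish the request. */
static unsigned sim_write(SimExtension *x, const unsigned char *buf, size_t len)
{
    unsigned interrupts = 0;

    assert(x->fifo_size > 0);        /* a zero-size FIFO would never finish */
    x->bytes_remaining = len;
    x->next = buf;
    x->port_len = 0;
    if (!sim_transmit_bytes(x))      /* Start I/O sends the first burst */
        return 0;
    do {
        interrupts++;                /* device signals it is ready again */
    } while (sim_transmit_bytes(x)); /* the ISR sends the next burst */
    return interrupts;
}
```

With fifo_size set to 4, a 10-byte request goes out as bursts of 4, 4, and 2 bytes and takes three simulated interrupts. The third interrupt finds nothing left to send, which corresponds to the point where the real ISR requests the DPC.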

XxIsr The Kernel calls this function in response to a device interrupt. If
XxIsr processes the interrupt, it returns TRUE; otherwise, FALSE. It runs at
DIRQL, holding the Interrupt spin lock for this device.

BOOLEAN
XxIsr(
    IN PKINTERRUPT Interrupt,
    IN PVOID ServiceContext  // Ptr to Device Extension
    )
{
    PDEVICE_EXTENSION pDE = ServiceContext;
    PDEVICE_OBJECT pDevice = pDE->DeviceObject;
    PIRP Irp = pDevice->CurrentIrp;
    UCHAR Status = XxReadStatus( pDE );

    if(( Status & XX_STS_NOT_IRQ ) != 0 )        // (1)
        return FALSE;

    if( pDE->TransferInProgress )                // (2)
        if( !XxTransmitBytes( pDE ))
            IoRequestDpc( pDevice, Irp, (PVOID) pDE );
    return TRUE;
}

(1) Check the parallel port to see if it generated an interrupt. Not all parallel
    devices support this bit, but the ones that don't hold it at 0. If the device
    didn't request an interrupt, leave the ISR as soon as possible.

(2) The port interrupted. If there's no transfer in progress, just ignore the
    interrupt; otherwise try to send the next chunk of data. If XxTransmitBytes
    fails, it means either an error occurred, or there are no more bytes to send.
XxDpcForIsr Once the data transfer finishes, this function performs any
required cleanup operations. XxStartIo also calls this function if it needs to fail
an IRP before starting a transfer. XxDpcForIsr runs at DISPATCH_LEVEL IRQL.

VOID
XxDpcForIsr(
    IN PKDPC Dpc,
    IN PDEVICE_OBJECT DeviceObject,
    IN PIRP Irp,
    IN PVOID Context  // Pointer to Device Extension
    )
{
    PDEVICE_EXTENSION pDE = Context;

    Irp->IoStatus.Information =
        pDE->BytesRequested - pDE->BytesRemaining;   // (1)
    //
    // Figure out what the final status
    // should be
    //
    if( XX_OK( pDE->DeviceStatus ))                  // (2)
        Irp->IoStatus.Status = STATUS_SUCCESS;
    else if( XX_POWERED_OFF( pDE->DeviceStatus ))
        Irp->IoStatus.Status =
            STATUS_DEVICE_POWERED_OFF;
    else if( XX_NOT_CONNECTED( pDE->DeviceStatus ))
        Irp->IoStatus.Status =
            STATUS_DEVICE_NOT_CONNECTED;
    else if( XX_OFF_LINE( pDE->DeviceStatus ))
        Irp->IoStatus.Status =
            STATUS_DEVICE_OFF_LINE;
    else if( XX_PAPER_EMPTY( pDE->DeviceStatus ))
        Irp->IoStatus.Status =
            STATUS_DEVICE_PAPER_EMPTY;
    else Irp->IoStatus.Status =
            STATUS_DEVICE_DATA_ERROR;
    //
    // If we're being called directly from Start I/O,
    // don't give the user any priority boost.
    //
    if( Dpc == NULL )                                // (3)
        IoCompleteRequest( Irp, IO_NO_INCREMENT );
    else
        IoCompleteRequest( Irp, IO_PARALLEL_INCREMENT );
    IoStartNextPacket( DeviceObject, FALSE );        // (4)
}

(1) The Information field should contain the number of bytes actually
    transferred when the IRP goes back to the I/O Manager.

(2) This section of code uses several macros defined in HARDWARE.H to
    figure out what the final status should be.

(3) It's necessary to know whether this function is being called directly from
    XxStartIo or by the system DPC dispatcher. In the former case, the
    original thread gets no priority boost. The NULL DPC argument means
    XxDpcForIsr is being called from XxStartIo.

(4) Once the current IRP is completed, it's necessary to tell the I/O Manager
    to start the next one.

9.8

TESTING THE DATA TRANSFER ROUTINES
At this point, you've got a real driver to work with and you can do serious testing.
Among other things, you can verify that the driver

• Sends IRPs from its Dispatch routines to its Start I/O routine

• Responds to device interrupts

• Transfers data successfully

• Completes requests

• Manages requests from multiple callers

Testing Procedure
The following procedure will let you check all the code paths through your
driver's data transfer routines.

1. Write a minimal Start I/O routine that simply completes each IRP as soon as
   it arrives. This will allow you to test the linkage between the driver's Dispatch
   and Start I/O routines.

2. Write the real Start I/O routine, the ISR, and the DpcForIsr routine. If the
   driver supports both read and write operations, implement and test each path
   separately.

3. Exercise all the data transfer paths through the driver with a simple Win32
   program that makes ReadFile, WriteFile, and DeviceIoControl calls.

4. Stress test the driver with a program that generates large numbers of I/O
   requests as quickly as possible. Run this test on a busy system.

5. If your device is shareable, run several copies of the test program at once to be
   sure the driver works with multiple open handles.

6. If your driver supports multiple physical devices, repeat the tests with each
   device unit.

7. If possible, repeat steps 4-6 on a multiprocessor system to verify SMP
   synchronization.

9.9

SUMMARY
At this point, it looks as if you have all the components of a working driver. Its
Start I/O routine is setting up each request, its ISR is servicing interrupts, and its
DpcForIsr is handling all the details of I/O postprocessing. What more could you
want?
Unfortunately, the little parallel port driver we built in this chapter isn't
ready for prime time distribution. In particular, it doesn't handle device timeouts,
so if an interrupt never arrives, the request will simply lock up. In the next
chapter, you'll see how to remedy this situation.

CHAPTER 10

Timers

It's a sad fact, but true: Hardware is perverse stuff
that doesn't necessarily behave the way it should. For example, error conditions
may prevent a device from generating an interrupt when you're expecting one.
Even worse, some devices don't even use interrupts to signal interesting state
changes. Handling these situations often requires some kind of timer or polling
mechanism, and that's just what we're going to look at in this chapter.

10.1

HANDLING DEVICE TIMEOUTS
Your driver should never assume that an expected device interrupt will arrive.
The device might be offline, it might be waiting for some kind of operator
intervention, or perhaps it's just broken. This section explains how to use I/O Timer
routines to detect unresponsive devices.

How I/O Timer Routines Work
An I/O Timer routine is an optional piece of driver code that your DriverEntry
routine attaches to a specific Device object. After you start the Device
object's timer, the I/O Manager begins calling the I/O Timer routine once every
second. These calls continue until you stop the timer. Table 10.1 lists the functions
available for working with I/O timers.
Table 10.2 shows the prototype for the I/O Timer routine itself. When it
executes, it receives a pointer to the associated Device object and whatever context
information you passed to IoInitializeTimer. As always, the address of the Device
Extension is a good choice for context.

Table 10.1  How to use an I/O Timer routine

Using I/O timers
IF you want to...                     THEN call...         IRQL
Attach a timer routine to a device    IoInitializeTimer    PASSIVE_LEVEL
Start a device's timer                IoStartTimer         ≤ DISPATCH_LEVEL
Stop a device's timer                 IoStopTimer          ≤ DISPATCH_LEVEL

Table 10.2  Function prototype of an I/O Timer routine

VOID XxIoTimer                        IRQL == DISPATCH_LEVEL
Parameter                             Description
IN PDEVICE_OBJECT DeviceObject        Device object whose timer just fired
IN PVOID Context                      Context passed to IoInitializeTimer
Return value                          (none)

How to Catch Device Timeout Conditions
In general terms, a driver that wants to catch device timeouts should do the
following:

1. Its DriverEntry routine calls IoInitializeTimer to associate an I/O Timer
   routine with a specific device.

2. When a user-mode program attaches a handle to the device by calling
   CreateFile, the Dispatch routine for IRP_MJ_CREATE calls IoStartTimer. As long as
   this handle is open, the device will receive I/O Timer calls. This same
   Dispatch routine also sets a timeout counter in the Device Extension to -1,
   a "do nothing" value.

3. When the Start I/O routine starts the device, it also sets the timeout counter to
   the maximum number of seconds the driver is willing to wait for an interrupt.

4. The ISR will do one of two things when an interrupt arrives. If there's more
   data, it resets the timeout counter to its maximum value and transfers the next
   piece of data. Otherwise, it sets the timeout counter to -1 and issues a DPC
   request to complete the IRP.

5. Meanwhile the system is calling the driver's I/O Timer routine once every
   second. When it executes, the I/O Timer routine checks the timeout counter.
   A value of -1 means "ignore the I/O Timer call." A positive value causes the
   I/O Timer routine to decrement the device's timeout counter. If the counter
   reaches zero before an interrupt arrives, the I/O Timer routine stops the
   device, sets the timeout counter to -1, and processes the request as a timed
   out operation.

6. When the user-mode program calls CloseHandle, the Dispatch routine for
   IRP_MJ_CLOSE calls IoStopTimer and disables I/O Timer callbacks for the
   device.
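The counter protocol in the steps above can be modeled as a small state machine in portable C. This is a single-threaded sketch under stated assumptions: SimDevice and the sim_ functions are hypothetical names, SIM_TIMEOUT_SECS is an arbitrary limit, a late interrupt is assumed to be ignored once the counter is back at -1, and the locking that a real driver needs around the counter is left out entirely.

```c
#include <assert.h>

#define SIM_TIMER_IDLE    (-1L)  /* the "do nothing" value from step 2 */
#define SIM_TIMEOUT_SECS  3L     /* hypothetical maximum wait, in seconds */

typedef struct {
    long time_remaining;         /* seconds left, or SIM_TIMER_IDLE */
} SimDevice;

/* Step 2: the create path parks the counter in its idle state. */
static void sim_create(SimDevice *d)
{
    d->time_remaining = SIM_TIMER_IDLE;
}

/* Step 3: Start I/O arms the counter when it starts the device. */
static void sim_start_io(SimDevice *d)
{
    assert(d->time_remaining == SIM_TIMER_IDLE);  /* one request at a time */
    d->time_remaining = SIM_TIMEOUT_SECS;
}

/* Step 4: an interrupt either re-arms the counter (more data to send)
 * or parks it (request finished). Late interrupts are ignored. */
static void sim_interrupt(SimDevice *d, int more_data)
{
    if (d->time_remaining == SIM_TIMER_IDLE)
        return;
    d->time_remaining = more_data ? SIM_TIMEOUT_SECS : SIM_TIMER_IDLE;
}

/* Step 5: the once-per-second tick. Returns 1 if the request timed out
 * on this tick, 0 otherwise. */
static int sim_timer_tick(SimDevice *d)
{
    if (d->time_remaining == SIM_TIMER_IDLE)
        return 0;                        /* nothing in progress */
    if (--d->time_remaining > 0)
        return 0;                        /* still waiting */
    d->time_remaining = SIM_TIMER_IDLE;  /* block further processing */
    return 1;
}
```

With a three-second limit, three ticks without an interrupt report a timeout on the third tick, while an interrupt that arrives after one tick restores the full three seconds. Ticks that arrive while the counter is -1 do nothing.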

Notice that the Start I/O and I/O Timer routines (running at
DISPATCH_LEVEL IRQL), and the ISR (running at DIRQL) all have access to the
timeout counter in the Device Extension. This can lead to problems unless these
driver routines synchronize themselves. The code example that appears later in
this chapter shows how to do this properly.
It's also worth pointing out that not all drivers use their Dispatch routines to
start and stop the I/O Timer calls. Some drivers just start a device's I/O Timer in
DriverEntry and stop it in the Unload routine. While the driver is loaded, it
simply ignores I/O Timer callbacks whenever the timeout counter is set to -1. The
only disadvantage of this scheme is that it incurs some system overhead even
when the device isn't being used.
Your driver has a number of options for processing a request that has timed
out. Some of the common things drivers do include:

• Retrying the device operation some fixed number of times before failing
  the IRP that generated it.

• Failing the IRP by calling IoCompleteRequest with an appropriate final
  status value.1

• Logging a timeout error for the device in the system event log. This can
  help system administrators to track down flaky hardware.

10.2

CODE EXAMPLE: CATCHING DEVICE TIMEOUTS
This example shows how to add timeout support to the simple parallel port
driver developed in the previous chapter. You can find the code for this example
in the CH10\TIME-OUT\DRIVER directory on the disk that accompanies this
book.

1 Watch out if you're tempted to use STATUS_IO_TIMEOUT as the final status for a timed-out IRP.
  Unfortunately, this status code maps onto ERROR_SEM_TIMEOUT in Win32. The message for
  this code ("The semaphore timeout period has expired.") may be a little confusing to users of your
  driver, so it's usually best to find some other NT status code.

XXDRIVER.H
This version of the main header file builds on the ones seen in previous
chapters. Only one structure from this file is of much interest.

DEVICE_EXTENSION The following excerpt shows the changes in the
Device Extension needed to catch parallel port timeout errors.

typedef struct _DEVICE_EXTENSION {
    PUCHAR pBuffer;          // Working buffer pointer
    LONG   TimeRemaining;    // Seconds until timeout (1)
    UCHAR  DeviceStatus;     // Most recent status byte
} DEVICE_EXTENSION, *PDEVICE_EXTENSION;

(1) This counter keeps track of the number of seconds remaining until the
    driver declares a timeout condition. If it's set to -1, I/O Timer callbacks
    are ignored. Anyone accessing this variable needs to be synchronized
    with the ISR.

INIT.C
Here's an excerpt from the driver initialization code. Only a few changes are
necessary to prepare for I/O Timer support.

XxCreateDevice In this modified version of the function that creates
Device objects, notice the addition of code to set up the I/O timer.

static NTSTATUS
XxCreateDevice(
    IN PDRIVER_OBJECT DriverObject,
    IN PCONFIG_BLOCK pConfig,    // Config block
    IN ULONG uNum                // Device number
    )
{
    .
    .
    //
    // Initialize the device extension structure
    //
    pDevExt = pDevObj->DeviceExtension;
    pDevExt->DeviceObject = pDevObj;
    pDevExt->NtDeviceNumber = uNum;
    pDevExt->FifoSize = XX_FIFO_SIZE;
    pDevExt->TimeRemaining = -1;                 // (1)
    //
    // Prepare the device's DPC object for later use
    //
    IoInitializeDpcRequest(
        pDevObj,
        XxDpcForIsr );
    //
    // Initialize the device's timeout clock
    //
    IoInitializeTimer(
        pDevObj,
        XxIoTimer,
        pDevExt );                               // (2)
    .
    .

(1) Set the initial value of the timeout counter to its "do nothing" state.
    There's no need to synchronize here because the driver's ISR hasn't been
    activated yet with a call to IoConnectInterrupt.

(2) Associate the Device object with the driver's I/O Timer routine. Each
    time XxIoTimer is called, pass it a pointer to the Device Extension.

TRANSFER.C
Most of the changes in these versions of the data transfer routines involve
checking and setting the state of the timeout counter.

XxTransmitBytes For proper synchronization, this function expects to be
holding the Interrupt spin lock when it runs. This means it either must be called
from XxIsr or as a Synch Critical Section routine.

static BOOLEAN
XxTransmitBytes(
    IN PVOID Context
    )
{
    PDEVICE_EXTENSION pDE =
        (PDEVICE_EXTENSION) Context;
    UCHAR Control;
    ULONG i;

    pDE->DeviceStatus = XxReadStatus( pDE );
    //
    // If all the bytes have been sent or the
    // device is unhappy, inhibit any further
    // processing of this request and just quit.
    //
    if(( pDE->BytesRemaining == 0 )
        || !XX_OK( pDE->DeviceStatus ))
    {
        pDE->TimeRemaining = -1;                 // (1)
        return FALSE;
    }
    //
    // Send as many bytes to the device as it
    // can handle. Each byte must be strobed
    // out.
    //
    Control = XxReadControl( pDE );
    for( i=0; i < pDE->XferSize; i++ ) {         // (2)
        //
        // Make sure the STROBE line is off
        //
        XxWriteControl(
            pDE,
            Control & ~XX_CTL_STROBE );
        .
        .
        //
        // Update pointer and counters
        //
        pDE->pBuffer++;
    }
    //
    // Start the timeout clock and wait
    // for an interrupt
    //
    pDE->TimeRemaining = XX_TIMEOUT_VALUE;       // (3)
    return TRUE;
}

(1) If the device is unhappy or there are no more bytes to transfer, this is the
    end of the request. Disable the timeout counter.

(2) There's no danger of the timeout routine failing the IRP during the data
    transfer loop. This is because the I/O Timer routine won't access the timeout
    counter variable until it acquires the Interrupt spin lock.

(3) Now that more data has been sent, reset the timeout counter and wait for
    the next interrupt to arrive.

XxIsr This function responds to interrupts from the parallel port. It differs
from the previous version in that it uses the timeout counter variable to determine
if a transfer is currently in progress.

BOOLEAN
XxIsr(
    IN PKINTERRUPT Interrupt,
    IN PVOID ServiceContext
    )
{
    PDEVICE_EXTENSION pDE = ServiceContext;
    PDEVICE_OBJECT pDevice = pDE->DeviceObject;
    UCHAR Status = XxReadStatus( pDE );
    //
    // See if this device requested an interrupt
    //
    if(( Status & XX_STS_NOT_IRQ ) != 0 )
        return FALSE;
    if( pDE->TimeRemaining == -1 )
        return TRUE;                             // (1)
    //
    // Otherwise, try to send the next bunch of
    // bytes. If XxTransmitBytes fails, it means
    // either an error occurred or there's no more
    // data to send.
    //
    pDE->BytesRemaining -= pDE->XferSize;
    if( !XxTransmitBytes( pDE ))                 // (2)
    {
        IoRequestDpc(
            pDevice,
            pDevice->CurrentIrp,
            (PVOID) pDE );
    }
    return TRUE;
}

(1) If the timeout clock is -1, either there's no transfer in progress, or the device
    has already timed out. In either case, there's nothing to be done here.

(2) After the return from XxTransmitBytes, the timeout counter has either
    been set to its maximum value (if the next piece of data has been sent), or
    -1 (if there was no more data to send or the device had an error).

TIMER.C
Here are the routines that actually process the timer events. In this particular
driver, the Dispatch routine for IRP_MJ_CREATE starts the device's timer, and the
Dispatch routine for IRP_MJ_CLOSE stops it. While the timer is running, the
contents of the timeout counter variable determine the behavior of the I/O Timer
routines.

XxIoTimer As long as the I/O Timer for a device is running, the system
will call this routine once every second.

VOID
XxIoTimer(
    IN PDEVICE_OBJECT DeviceObject,
    IN PVOID Context
    )
{
    PDEVICE_EXTENSION pDE = Context;

    if( pDE->TimeRemaining == -1 )
        return;                                  // (1)
    else if( !KeSynchronizeExecution(            // (2)
                pDE->pInterrupt,
                XxProcessTimerEvent,
                pDE ))
        //
        // Call the DPC routine to figure out a
        // final status and complete the IRP.
        //
        XxDpcForIsr(                             // (3)
            NULL,
            DeviceObject,
            DeviceObject->CurrentIrp,
            pDE );
}

(1) Do a quick check of the timeout counter. If it's -1, either there's no data
    transfer in progress, or the expected interrupt has already arrived. Making
    this quick check at DISPATCH_LEVEL avoids needless trips up to DIRQL.

(2) The timeout counter appears to contain some value other than -1. To
    process the timer event safely, synchronize with XxIsr using a Synch Critical
    Section routine.

(3) The Synch Critical Section routine returns FALSE if the current IRP has
    timed out. In this case, we just fail the IRP. Other options might include
    retrying the operation a fixed number of times, logging an error, and so
    forth.

XxProcessTimerEvent This function does the real work of processing
timer events. It runs as a Synch Critical Section routine because it has to
synchronize its access to the timeout counter with XxIsr.

static BOOLEAN
XxProcessTimerEvent(
    IN PDEVICE_EXTENSION pDE
    )
{
    if( pDE->TimeRemaining == -1 ) return TRUE;      // (1)
    //
    // Decrement and test the timer.
    //
    if( --pDE->TimeRemaining > 0 ) return TRUE;      // (2)
    //
    // A timeout has occurred. Prevent further
    // processing of this request
    //
    pDE->TimeRemaining = -1;                         // (3)
    pDE->DeviceStatus = XxReadStatus( pDE );
    return FALSE;
}

(1) It's necessary to test the timeout counter again because the ISR may have
    changed it while we were waiting for the Interrupt spin lock. If that's the
    case, do nothing.

(2) The timeout counter contains something other than -1. In this case,
    decrement the count. If it's still above 0, the IRP hasn't timed out yet.

(3) The counter hit 0, so the IRP has timed out. Setting the timeout counter to
    -1 blocks further processing of this request by the ISR (should an
    interrupt just happen to arrive). Returning FALSE will cause the IRP to be
    completed by XxIoTimer.

10.3

MANAGING DEVICES WITHOUT INTERRUPTS
Some devices don't generate interrupts every time they make a significant state
change. Legacy ISA devices can be especially bad about this kind of thing. This
section presents alternative ways of working with noninterrupting devices.

Working with Noninterrupting Devices
Under operating systems like MS-DOS, a driver managing a noninterrupting
device could simply poll the device or busy-wait until it has changed state.
However, this kind of behavior would cause serious performance problems for
NT. Instead, NT drivers can use one of the following techniques for suspending
their execution during a repeated polling operation:

• Driver routines running at PASSIVE_LEVEL IRQL can call
  KeDelayExecutionThread to introduce a time delay. This method can only be used by
  the driver's initialization and cleanup code, or any Kernel-mode threads
  the driver has created.

• If you occasionally have to delay execution for intervals less than about
  50 microseconds, you can call KeStallExecutionProcessor. This is better
  than busy-waiting because the delay interval doesn't depend on a specific
  CPU architecture.

• If parts of your driver running at DISPATCH_LEVEL IRQL need to
  introduce a time delay, you can use a CustomTimerDpc routine.

If your device needs to be polled repeatedly and the delay interval between
each polling operation is over 50 microseconds, base your driver design on the
use of system threads (discussed in Chapter 14).

How CustomTimerDpc Routines Work
A CustomTimerDpc routine is just a DPC routine that you associate with a
Kernel Timer object. You get the CustomTimerDpc routine to run by setting the
Timer's timeout value. When it expires, the Kernel automatically queues your
DPC routine for execution. Eventually, the Kernel's DPC dispatcher pulls your
request from the queue and executes the CustomTimerDpc routine. Keep in mind
that, depending on system activity, there could be some delay between the
moment the Timer object expires and the actual execution of the DPC routine.
In earlier versions of Windows NT, a CustomTimerDpc routine would fire
only once. If you wanted one of these routines to execute repeatedly, you had to
manually reset the Timer object each time it fired. With NT 4.0, you have the
option of specifying a repetition interval when you set the Timer object's initial
timeout value. Each time it fires, the Timer object will automatically reset itself to
fire again when the repetition interval has elapsed.3
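One way to see why automatic repetition matters is to compare expiration times under the two schemes. The sketch below is plain arithmetic, not kernel code. It assumes a constant delay between each expiration and the moment a hand-written DPC re-arms the Timer, which is a simplification (real DPC latency varies with system activity). The function names and the abstract time units are hypothetical.

```c
#include <assert.h>

/* Expiration time of the k-th firing (k = 0, 1, 2, ...) when the Timer
 * object re-arms itself: firings stay on a fixed grid. */
static long periodic_expiry(long first_due, long period, unsigned k)
{
    assert(period > 0);
    return first_due + (long)k * period;
}

/* The same quantity when the DPC routine re-arms the Timer by hand.
 * Each re-arm is assumed to happen `latency` time units after the
 * previous expiration, so the error accumulates with every firing. */
static long manual_expiry(long first_due, long period, long latency,
                          unsigned k)
{
    assert(period > 0 && latency >= 0);
    return first_due + (long)k * (period + latency);
}
```

For a first expiration at time 100 and a period of 50, the self-resetting timer's fourth firing (k = 3) lands at 250. With a re-arm delay of 2 units per firing, the manually reset timer lands at 256, and the gap keeps growing with each further firing.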
Like all other DPC routines, a CustomTimerDpc runs at DISPATCH_LEVEL
IRQL. Table 10.3 shows the prototype for one of these routines. Notice that a
CustomTimerDpc routine always receives two junk arguments from the system. The
contents of these two system arguments are undefined, so don't use them.4 With
CustomTimerDpc routines, you're limited to just a single context argument that is
permanently associated with the DPC object.

2 Don't use this function too often. It essentially freezes the CPU on which it's called at whatever
  IRQL level it's called from.
3 If you need to implement a repeating CustomTimerDpc routine, it's generally a good idea to use
  the Timer object's automatic repetition feature rather than resetting the Timer yourself each time it
  fires. It's more efficient because your driver won't be making so many calls to Kernel support
  routines. It also guarantees that there won't be any skewing of the timeout interval.

Table 10.3  Function prototype of a CustomTimerDpc routine

VOID XxCustomTimerDpc                 IRQL == DISPATCH_LEVEL
Parameter                             Description
IN PKDPC Dpc                          DPC object generating the request
IN PVOID Context                      Context passed to KeInitializeDpc
IN PVOID SystemArg1                   (Not used - contents unspecified)
IN PVOID SystemArg2                   (Not used - contents unspecified)
Return value                          (none)
It's worth comparing CustomTimerDpc routines with the I/O Timers you
saw in the first part of this chapter. Although both mechanisms operate with time,
they differ in several significant ways. In particular:
• Unlike I/O Timer routines, a CustomTimerDpc is not associated with any
  particular Device object. You can have as many or as few of them as you like.

• The minimum resolution of an I/O Timer is one second; you specify the
  expiration time of a CustomTimerDpc in units of 100 nanoseconds.

• The I/O Timer always uses a one-second interval. You can specify
  different expiration intervals each time you start a CustomTimerDpc.

• The storage for an I/O Timer object is automatically part of the Device
  object. You need to declare nonpaged storage for both a KDPC and a
  KTIMER object if you want to use a CustomTimerDpc.

How to Set Up a CustomTimerDpc Routine
Working with CustomTimerDpc routines is very straightforward. Your
driver simply needs to follow these steps:
1. Allocate nonpaged storage (usually in a Device or Controller Extension) for
   both a KDPC and a KTIMER object.

2. DriverEntry calls KeInitializeDpc to associate a DPC routine and a context item
   with the DPC object. This context item will be passed to your CustomTimerDpc
   routine when it fires. The address of the Device or Controller Extension is a
   good choice for the context item.

3. DriverEntry also calls KeInitializeTimer just once to set up the Timer
   object.

4. To start a one-shot Timer, call KeSetTimer; to set up a repeating Timer, use
   KeSetTimerEx instead. If you call these functions using a Timer object that is
   currently active, the previous request is canceled and the new expiration time
   replaces the old one.

4 Regular CustomDpc routines (not associated with a Timer object) can make use of these
arguments. The discussion of CustomDpc routines in the next chapter shows how to use them.

If you want to keep a Timer from firing, call KeCancelTimer before the
Timer object expires. This also cancels a repeating Timer. If you need to find out
whether a Timer has already expired, call KeReadStateTimer.
You must be executing at PASSIVE_LEVEL IRQL when you initialize the
DPC and Timer object. To set, cancel, or read the state of a Timer, you must be
running at or below DISPATCH_LEVEL IRQL. In general, you should avoid calling
KeInsertQueueDpc with a DPC object being used for a CustomTimerDpc
routine. This can lead to race conditions in your driver.
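
The difference between one-shot and repeating Timers, and the rule that setting an active Timer replaces its pending expiration, can be pictured with a small user-mode model. This is only a sketch under stated assumptions: ModelTimer, model_set, model_cancel, and model_advance are hypothetical names invented for illustration, time is a plain counter of 100-nanosecond units supplied by the caller, and nothing here calls the real Kernel routines.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy user-mode model of a Timer object (not the real KTIMER). */
typedef struct {
    bool    active;   /* true while an expiration is pending        */
    int64_t due;      /* absolute model time of the next expiration */
    int64_t period;   /* 0 = one-shot, >0 = repetition interval     */
} ModelTimer;

/* Arm the timer. Setting a timer that is already active simply
 * replaces the pending expiration. Returns whether the timer was
 * already active when it was set. */
static bool model_set(ModelTimer *t, int64_t now, int64_t interval,
                      int64_t period)
{
    bool was_active = t->active;
    t->active = true;
    t->due    = now + interval;
    t->period = period;
    return was_active;
}

/* Cancel the timer; this stops one-shot and repeating timers alike. */
static bool model_cancel(ModelTimer *t)
{
    bool was_active = t->active;
    t->active = false;
    return was_active;
}

/* Advance the model clock to 'now' and return how many times the
 * timer would have expired along the way. */
static int model_advance(ModelTimer *t, int64_t now)
{
    int fired = 0;
    while (t->active && t->due <= now) {
        fired++;
        if (t->period > 0)
            t->due += t->period;   /* re-arm from the due time: no skew */
        else
            t->active = false;     /* one-shot: done after first firing */
    }
    return fired;
}
```

In this model a repeating timer is re-armed from its previous due time rather than from the moment it is serviced, which is the reason automatic repetition avoids skew in the timeout interval.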

How to Specify Expiration Times
Internally, NT maintains the current system time by counting the number of
100-nanosecond intervals since January 1, 1601. This is a very big number, so NT
defines a 64-bit data type called a LARGE_INTEGER to hold it. Table 10.4 lists the
functions drivers can use to work with time values.

Table 10.4  Functions that operate on system time values

Time functions

Function                        Description
KeQuerySystemTime               Return 64-bit absolute system time
RtlTimeToTimeFields             Break 64-bit time into date and time fields
RtlTimeFieldsToTime             Convert date and time into 64-bit system time
KeQueryTickCount                Return number of clock interrupts since boot
KeQueryTimeIncrement            Return number of 100-nanosecond units added
                                to system time for each clock interrupt
RtlConvertLongToLargeInteger    Create a signed LARGE_INTEGER
RtlConvertUlongToLargeInteger   Create a positive LARGE_INTEGER
RtlLargeIntegerXxx              Perform various arithmetic and logical
                                operations on LARGE_INTEGERs

Note: Callers of these functions can be running at any IRQL level.

When you call KeSetTimer to start the clock ticking on a Timer object, you
can specify the expiration time in one of two ways:
• A positive LARGE_INTEGER value represents an absolute system time at
  which you want the Timer to expire. Absolute times correspond to some
  exact moment in the future, like "February 23, 2051 at 6:45 PM."

• A negative LARGE_INTEGER value represents the length of an interval
  measured from the current moment, like "10 seconds from now." This is
  the form you're most likely to use.

This fragment of code shows how to set a Timer object to expire after an
interval of 75 microseconds. It assumes that pDE holds a pointer to a Device
Extension, and that the Extension contains initialized Timer and DPC objects.

    LARGE_INTEGER DueTime;

    DueTime = RtlConvertLongToLargeInteger(-75 * 10);
    KeSetTimer(&pDE->Timer, DueTime, &pDE->DPC);

Since the number is negative, the system will interpret it as a relative time
value. Scaling the number by ten is necessary because the basic unit of system
time is 100 nanoseconds (or 0.1 microseconds).
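
The unit arithmetic is easy to get wrong, so here is a minimal user-mode sketch of it. The helper names (relative_due_time_us, relative_due_time_sec, absolute_expiration) are hypothetical and exist only for this illustration; a plain int64_t stands in for the 64-bit value a driver would keep in a LARGE_INTEGER.

```c
#include <stdint.h>

/* System time is counted in 100-nanosecond units. */
#define UNITS_PER_MICROSECOND 10LL
#define UNITS_PER_SECOND      10000000LL

/* Build a relative due time: a negative count of 100-ns units. */
static int64_t relative_due_time_us(int64_t microseconds)
{
    return -(microseconds * UNITS_PER_MICROSECOND);
}

static int64_t relative_due_time_sec(int64_t seconds)
{
    return -(seconds * UNITS_PER_SECOND);
}

/* Interpret a due time the way the text describes: negative values are
 * intervals measured from 'now', positive values are absolute system
 * times. Returns the absolute expiration time in 100-ns units. */
static int64_t absolute_expiration(int64_t due_time, int64_t now)
{
    return (due_time < 0) ? now - due_time : due_time;
}
```

For the 75-microsecond interval used above, relative_due_time_us(75) yields -750, the same value the fragment builds with -75 * 10.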

Other Uses for CustomTimerDpc Routines
In the next section, you'll see an example of a driver that performs data
transfers using a CustomTimerDpc instead of device interrupts. It's worth
pointing out that, in some situations, you might want to use this kind of technique
even with devices that do generate interrupts. This could be helpful if your device
generates so many interrupts that it overwhelms the Kernel's interrupt dispatcher
and degrades system performance.
The sample parallel port driver that comes with the NT DDK is one example
of a driver that uses this technique. This driver monitors the arrival rate of
interrupts for its device. When a flood of interrupts threatens to drown the system,
the driver intentionally disables parallel port interrupts and uses a CustomTimerDpc
to send data to the device. Depending on the device you're working with, this
kind of adaptive behavior might be something you want to consider.

10.4 CODE EXAMPLE: A TIMER-BASED DRIVER
This modified version of the parallel port driver disables interrupts and uses a
CustomTimerDpc routine to transfer data at fixed intervals. You can find the code
for this example in the CH10\POLLING\DRIVER directory on the disk that
accompanies this book.

XXDRIVER.H
This version of the main header file builds on the ones seen in previous
chapters. Only one structure from this file is of much interest.

DEVICE_EXTENSION The following excerpt shows the changes in the
Device Extension needed to support polling.

    typedef struct _DEVICE_EXTENSION {
        PDEVICE_OBJECT DeviceObject;    // Back pointer
        ULONG NtDeviceNumber;           // Zero-based device num

        PUCHAR PortBase;                // First control register (1)

        KDPC PollingDpc;                // Components of the (2)
        KTIMER PollingTimer;            // polling mechanism
        LARGE_INTEGER PollingInterval;  // (3)

        ULONG FifoSize;                 // Bytes to send at once (4)
        ULONG BytesRequested;           // Requested transfer size
        ULONG BytesRemaining;           // Chars left to transfer
        PUCHAR pBuffer;                 // Next char to send

        UCHAR DeviceStatus;             // Most recent status

    } DEVICE_EXTENSION, *PDEVICE_EXTENSION;

(1) While we need to have access to the device's control registers, we're not
    keeping a pointer to an Interrupt object in this driver. All interrupts from
    this device will be turned off.

(2) The Dpc and Timer objects together will activate the CustomTimerDpc
    routine.

(3) The PollingInterval field holds the expiration interval for the polling
    timer. For convenience, the driver specifies the interval
    (XX_POLLING_INTERVAL) in microseconds rather than tenths of microseconds,
    and converts it when it initializes this field.

(4) The rest of the structure is the same as the interrupt-driven version.

INIT.C
Here is a tiny excerpt from the driver's initialization code. The rest of it is the
same boilerplate we've been looking at for several chapters. Not shown (but
equally important) is the hardware initialization code that disables interrupts
from the parallel port.

XxCreateDevice This function sets up the Device object. It differs from the
interrupt-driven version in that it never calls IoConnectInterrupt, and it has to
initialize the polling timer.

    static NTSTATUS
    XxCreateDevice(
        IN PDRIVER_OBJECT DriverObject,
        IN PCONFIG_BLOCK pConfig,       // Config block
        IN ULONG uNum                   // Device number
        )
    {
        //
        // Copy things from Config block
        //
        pDevExt->PortBase =
            pConfig->Device[uNum].PortBase;

        //
        // Calculate the polling interval
        //
        pDevExt->PollingInterval =
            RtlConvertLongToLargeInteger(
                XX_POLLING_INTERVAL * -10);         // (1)

        //
        // Prepare the polling timer and its DPC object
        //
        KeInitializeTimer(&pDevExt->PollingTimer);  // (2)
        KeInitializeDpc(                            // (3)
            &pDevExt->PollingDpc,
            XxPollingTimerDpc,
            (PVOID)pDevObj);

        //
        // Form the Win32 symbolic link name.
        //
    }

(1) We use an RtlXxx convenience function to create the polling interval.
    Since the number is negative, the timeout will be measured relative to the
    moment the Timer object is started. Multiplying the value by ten allows
    us to specify XX_POLLING_INTERVAL in microseconds.

(2) Get the Kernel to turn the blob of memory into a Timer object.

(3) Attach the CustomTimerDpc routine to the DPC object. Pass a pointer to
    the Device object each time the CustomTimerDpc is called.

TRANSFER.C
Since this driver uses polling rather than interrupts to send data, you won't
find any Interrupt Service routine here.

XxStartIo This function is called to begin the processing of each IRP. It
looks very much like the interrupt-driven version.

    VOID
    XxStartIo(
        IN PDEVICE_OBJECT DeviceObject,
        IN PIRP Irp
        )
    {
        PIO_STACK_LOCATION IrpStack =
            IoGetCurrentIrpStackLocation(Irp);
        PDEVICE_EXTENSION pDE =
            DeviceObject->DeviceExtension;

        switch (IrpStack->MajorFunction) {

            case IRP_MJ_WRITE:                      // (1)
                pDE->BytesRequested =
                    IrpStack->Parameters.Write.Length;
                pDE->BytesRemaining =
                    pDE->BytesRequested;
                pDE->pBuffer =
                    Irp->AssociatedIrp.SystemBuffer;
                if (!XxTransmitBytes(pDE))          // (2)
                {
                    XxFinishCurrentRequest(
                        DeviceObject,
                        pDE,
                        Irp,
                        IO_NO_INCREMENT);
                }
                break;

            //
            // Should never get here -- just get rid
            // of the packet...
            //
            default:
                Irp->IoStatus.Status =
                    STATUS_NOT_SUPPORTED;
                Irp->IoStatus.Information = 0;
                IoCompleteRequest(
                    Irp,
                    IO_NO_INCREMENT);
                IoStartNextPacket(DeviceObject, FALSE);
                break;
        }
    }

(1) If this turns out to be an IRP_MJ_WRITE request, just set up the necessary
    counters and pointers, and try to send the first bunch of bytes to the
    device.

(2) Notice that XxTransmitBytes is being called directly from
    DISPATCH_LEVEL IRQL. There's no need to synchronize it using a
    Synch Critical Section routine because there is no interrupt activity from
    the device.

XxTransmitBytes This routine sends a fixed number of bytes out to the
parallel port. If the device has a personal problem or there are no more bytes left
in the buffer, it returns a FALSE. This data

    static BOOLEAN
    XxTransmitBytes(
        IN PDEVICE_EXTENSION pDE
        )
    {
        ULONG XferSize;
        UCHAR Control = XxReadControl(pDE);

        pDE->DeviceStatus = XxReadStatus(pDE);

        //
        // If all the bytes have been sent or the
        // device is unhappy, just quit
        //
        if ((pDE->BytesRemaining == 0)
            || !XX_OK(pDE->DeviceStatus))
        {
            return FALSE;
        }

        //
        // Calculate the number of bytes to
        // send in one bunch.
        //
        if (pDE->BytesRemaining < pDE->FifoSize)
            XferSize = pDE->BytesRemaining;
        else
            XferSize = pDE->FifoSize;

        while (XferSize > 0)                        // (1)
        {
            //
            // Make sure the STROBE line is off
            //
            XxWriteControl(
                pDE,
                Control & ~XX_CTL_STROBE);

            //
            // Update pointer and counters
            //
            pDE->pBuffer++;
            XferSize--;
            pDE->BytesRemaining--;
        }

        //
        // Start the polling timer
        //
        KeSetTimer(                                 // (2)
            &pDE->PollingTimer,
            pDE->PollingInterval,
            &pDE->PollingDpc);
        return TRUE;
    }

(1) Send as many bytes to the device as it can handle. Since this is a parallel
    port device, each byte has to be strobed out.

(2) Start the polling Timer object. When the Timer expires, the associated
    DPC routine will be queued automatically.

XxPollingTimerDpc This function runs each time the polling timer
expires. It replaces both the ISR and the DpcForIsr routines in the interrupt-driven
version of this driver.

    VOID
    XxPollingTimerDpc(
        IN PKDPC Dpc,
        IN PVOID Context,
        IN PVOID SystemArgument1,                   // (1)
        IN PVOID SystemArgument2
        )
    {
        PDEVICE_OBJECT DeviceObject = Context;

        if (!XxTransmitBytes(                       // (2)
                DeviceObject->DeviceExtension))
        {
            XxFinishCurrentRequest(                 // (3)
                DeviceObject,
                DeviceObject->DeviceExtension,
                DeviceObject->CurrentIrp,
                IO_PARALLEL_INCREMENT);
        }
    }

(1) Remember that the contents of the two system arguments are undefined
    in a CustomTimerDpc routine.

(2) Try to send the next bunch of bytes. If XxTransmitBytes fails, it means
    either an error occurred, or there is no more data to send. If it succeeds, it
    restarts the polling timer, which will eventually result in another call to
    XxPollingTimerDpc.

(3) Call XxFinishCurrentRequest to come up with an appropriate status
    code and complete the IRP. Again, notice that everything is happening at
    DISPATCH_LEVEL IRQL. XxFinishCurrentRequest runs in response to a
    regular function-call, not a DPC request.
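
To see how XxStartIo, XxTransmitBytes, and XxPollingTimerDpc cooperate over the life of one request, it can help to step through the control flow in user mode. The following is a sketch only: ModelDevice, model_transmit, and model_run_request are hypothetical names, the device is assumed to be always ready (no status check), and a boolean flag stands in for the polling Timer.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for the Device Extension fields used here. */
typedef struct {
    size_t fifo_size;        /* bytes sent per polling interval */
    size_t bytes_remaining;  /* bytes still waiting to be sent  */
    bool   timer_pending;    /* model of the polling timer      */
} ModelDevice;

/* Mirrors the control flow of XxTransmitBytes: return false when there
 * is nothing left to send, otherwise send one batch and arm the timer.
 * A zero FIFO size is treated as a failure in this model. */
static bool model_transmit(ModelDevice *d)
{
    size_t xfer;

    if (d->bytes_remaining == 0 || d->fifo_size == 0)
        return false;

    xfer = (d->bytes_remaining < d->fifo_size) ? d->bytes_remaining
                                               : d->fifo_size;
    d->bytes_remaining -= xfer;
    d->timer_pending = true;
    return true;
}

/* Run one whole request and return how many times the timer DPC ran. */
static unsigned model_run_request(ModelDevice *d, size_t length)
{
    unsigned dpc_calls = 0;

    d->bytes_remaining = length;
    d->timer_pending = false;

    if (!model_transmit(d))          /* the Start I/O step */
        return dpc_calls;            /* request finished at once */

    while (d->timer_pending) {       /* each pass = one timer expiration */
        d->timer_pending = false;
        dpc_calls++;
        (void)model_transmit(d);     /* false here means "complete the IRP" */
    }
    return dpc_calls;
}
```

With a 16-byte FIFO, a 40-byte request takes three timer expirations in this model: two that send further batches and a final one that finds nothing left and completes the request.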

10.5 SUMMARY
This chapter has presented two different aspects of using time in your driver.
Handling device timeouts is something that will always be important, while the
use of CustomTimerDpc routines may only be useful for certain kinds of devices.
One important use of CustomTimerDpc routines is to implement various polling
strategies.
You now have enough tools to write reasonable drivers for many simple
pieces of hardware. In the next chapter, we'll look at some additional techniques
for managing full-duplex devices and devices that generate asynchronous events.

CHAPTER 11

Full-Duplex Drivers

The driver model described in the last few chapters has one significant
limitation: It allows you to process only a single IRP at a time per Device object.
While this is fine for many situations, it doesn't cut it if
your driver has to perform both input and output operations simultaneously.
This chapter presents a modified driver architecture that lifts the single-IRP
restriction. To implement this architecture, it uses several new techniques (like
CustomDpc and Cancel routines) that can be helpful in any kind of driver. At the
end of the chapter, sample code for a tiny serial port driver will tie all the loose
ends together.

11.1 DOING TWO THINGS AT ONCE
Just what is it about the standard driver architecture that prevents a single Device
object from processing two IRPs at once? The problem becomes clear if you
consider what happens when a Dispatch routine sends an IRP to IoStartPacket.
Calling IoStartPacket with a pointer to an idle Device object makes the
object busy and invokes the driver's Start I/O routine. From then on, any calls to
IoStartPacket targeting the same Device object result in IRPs being added to the
object's queue of pending requests. This continues until the Start I/O or
DpcForIsr routines call IoStartNextPacket to mark the Device object as being idle.
This kind of behavior makes it very difficult to start another IRP before the
current one is completed.
Do You Need to Process Concurrent IRPs?
The first thing to ask yourself is whether your driver really needs to process
multiple IRPs concurrently. This is actually a question about the kind of software
interface your driver is going to expose. For purposes of this discussion, you can
divide driver interfaces into the following categories:

• Simplex interface - These drivers can transfer data only in one direction.

• Half-duplex interface - These drivers manage hardware that transfers
  data in both directions, but (for whatever reason) the drivers only process
  one request at a time.

• Full-duplex interface - Here, the driver can perform both inputs and
  outputs simultaneously.

The standard driver model easily supports both the simplex and half-duplex
cases. Unfortunately, since it can't handle two requests at the same time, you can't
use this model to provide a full-duplex driver interface.
An important factor in choosing a software interface is the behavior of the
underlying hardware. Usually, this will tell you what kind of driver is most
appropriate. Broadly speaking, you can divide hardware into three families.

Simplex devices These devices can transfer data in only one direction.
The standard parallel port and the mouse are both examples of simplex hardware.
It's very unlikely that you'd need a full-duplex driver for a simplex device.

Half-duplex devices This type of device can transfer data in both
directions, but only one transfer can take place at a time. Disk controllers and Ethernet
cards are both examples of half-duplex hardware. The choice of driver interface
will depend on how the device is used. It's natural for disk controllers to process
only one request at a time. Network cards need to give the appearance of
performing simultaneous input and output operations, even though the device itself
can only send or receive one packet at a time.

Full-duplex devices These devices can perform simultaneous input and
output operations. The standard serial port exhibits this kind of behavior. A
full-duplex driver is almost always a necessity for this type of device.

How the Modified Driver Architecture Works
In a nutshell, if you want a single Device object to process two concurrent
IRPs, you need to establish a complete secondary path through your driver. IRPs
taking this alternate route will be processed in parallel with those following the
standard path. To do this, you must:

1. Divide the IRP_MJ_XXX functions supported by your driver into two
   categories: Those to be processed by the standard Start I/O routine (the primary
   IRPs) and those that will travel the alternate path (the alternate IRPs).

2. Set up various bookkeeping structures to handle IRPs with alternate function
   codes. This involves maintaining a queue of alternate IRPs, as well as keeping
   track of the current alternate IRP. In this chapter, we'll be using Device Queue
   objects to hold the alternate IRPs.

3. Duplicate some of the logic in the I/O Manager's IoStartPacket and
   IoStartNextPacket functions. Your versions of these routines will be responsible for
   controlling the flow of IRPs along the alternate path.

4. Write additional Start I/O and DPC routines to handle alternate IRPs.

5. Modify the Interrupt Service routine so that it can process both primary and
   alternate DPC functions.

Data Structures for a Full-Duplex Driver
In Chapter 4 you saw that a standard Device object contains a CurrentIrp
field that keeps track of the primary IRP being processed. Although it wasn't
discussed in any detail, you also saw that the Device object contains an embedded
Device Queue object for holding primary IRPs that arrive after the Device object
has become busy. In a full-duplex driver, you need to set up parallel structures to
manage the alternate IRPs. Normally, this bookkeeping happens in the Device
Extension, as shown in Figure 11.1.

[Diagram labels: Primary IRPs, Alternate IRPs, Current IRP, Alternate Queue.
Copyright © 1996 by Cydonix Corporation.]

Figure 11.1  A full-duplex driver uses these data structures


Along with the alternate IRP pointer and the Device Queue object, there are
some other changes to the Device Extension. If you're doing Buffered I/O, you'll
need two sets of buffer pointers and counters to keep track of your progress
through an I/O request. In addition, the strategy adopted in this chapter uses
separate DPC routines for completing primary and alternate IRPs, so you'll need to
leave room in the Device Extension for a KDPC object.

Implementing the Alternate Path
Setting up the alternate path requires changes to several parts of your driver's
code. The following subsections describe the modifications you'll need to make.

Dispatch routines In a full-duplex driver, Dispatch routines for the
alternate IRP_MJ_XXX function codes don't use the IoStartPacket function. Instead,
they call the driver-defined start-packet routine to send IRPs down the alternate
processing path.

Start I/O routines The modified driver architecture is going to use two
Start I/O routines: One for IRPs with primary IRP_MJ_XXX function codes and
another for the IRPs with alternate codes. Implementing these functions as
separate pieces of code usually makes them easier to manage.

Interrupt Service routine When an interrupt arrives, the Interrupt Service
routine has to perform different kinds of processing for primary and alternate
operations. It needs to send primary and alternate IRPs to different DPC routines
for postprocessing.

DPC routines Although you could write a full-duplex driver with only a
single DpcForIsr routine, it's usually easier to have a separate CustomDpc routine
for the alternate IRPs. When this CustomDpc routine completes an IRP, it calls the
driver-defined version of IoStartNextPacket to begin processing the next
alternate IRP.

11.2 USING DEVICE QUEUE OBJECTS
A full-duplex driver needs some way to keep track of pending IRPs that arrive
while the driver is already processing an alternate IRP. Although there are several
ways to handle this situation, the driver model developed in this chapter is going
to use a Device Queue object to hold on to pending alternate IRPs. This is the
same strategy that the I/O Manager uses for the driver's primary IRPs.

How Device Queue Objects Work
A Device Queue is a Kernel object that contains a linked list guarded by an
embedded Executive spin lock. Although a Device Queue object can hold any
structure with a KDEVICE_QUEUE_ENTRY in it, they are most commonly used
to store a Device object's pending IRPs.
A Device Queue object is always in one of two states: Busy if there's been at
least one attempt to insert an entry into the queue and Not Busy if there's been an
attempt to remove an entry from an empty queue. Table 11.1 shows how Device
Queue state transitions work.
The basic pattern is fairly simple: If you try to insert an entry into a Device
Queue that isn't Busy, the insertion fails but the queue becomes Busy. Once it has
become Busy, insertion operations succeed. Removing entries from a Busy Device
Queue causes no change in the object's state. Once the Device Queue has no more
entries, the next attempt to remove one causes the object to return to the Not-Busy
state.
The IoStartPacket and IoStartNextPacket functions use the state of the
Device object's built-in Device Queue to guarantee that a driver's Start I/O
routine receives only one IRP at a time per Device object. The Device Queue is Not
Busy if the associated Device object is ready to process another IRP, and Busy if
the Device object is currently working on an IRP.
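
The Busy/Not-Busy rules are easier to follow when written out as code. What follows is a user-mode model of the state transitions only, not the Kernel's implementation: ModelQueue, ModelEntry, and the model_queue_* functions are hypothetical names, and the spin lock that guards a real Device Queue is deliberately left out.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct ModelEntry {
    struct ModelEntry *next;
} ModelEntry;

typedef struct {
    ModelEntry *head, *tail;
    bool busy;
} ModelQueue;

static void model_queue_init(ModelQueue *q)
{
    q->head = q->tail = NULL;
    q->busy = false;
}

/* Returns true if the entry was queued. Inserting into a Not-Busy
 * queue queues nothing, but flips the queue to Busy. */
static bool model_queue_insert(ModelQueue *q, ModelEntry *e)
{
    if (!q->busy) {
        q->busy = true;
        return false;
    }
    e->next = NULL;
    if (q->tail != NULL)
        q->tail->next = e;
    else
        q->head = e;
    q->tail = e;
    return true;
}

/* Returns the first entry, or NULL if the queue is empty. Removing
 * from an empty queue returns it to the Not-Busy state. */
static ModelEntry *model_queue_remove(ModelQueue *q)
{
    ModelEntry *e = q->head;
    if (e == NULL) {
        q->busy = false;
        return NULL;
    }
    q->head = e->next;
    if (q->head == NULL)
        q->tail = NULL;
    return e;
}
```

In this model, a false return from the insert function is the signal a start-packet routine would use to begin work on the request immediately instead of leaving it queued.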

Table 11.1  State transitions in Device Queue objects

Device Queue state transitions

Initial state   Action                   Final state   Entry is...
Not Busy        Insert into empty        Busy          Not inserted
Busy            Insert into empty        Busy          Inserted
Busy            Insert into non-empty    Busy          Inserted
Busy            Remove from non-empty    Busy          Removed
Busy            Remove from empty        Not Busy      -

How to Use Device Queue Objects
It's fairly easy to work with Device Queue objects. The code example
appearing later in this chapter will show you the specific details. In general, what
you do is:

1. Declare a KDEVICE_QUEUE item in your Device Extension structure.

2. In your DriverEntry routine, call KeInitializeDeviceQueue. This sets up both
   the Device Queue object and its associated Executive spin lock.

3. Use the functions in Table 11.2 to insert or remove IRPs. These routines
   automatically acquire and release the Executive spin lock hidden in the Device
   Queue object.

Table 11.2  Use these functions to work with Device Queue objects

How to use Device Queue objects

IF you want to...           THEN call...                IRQL
Create a Device Queue       KeInitializeDeviceQueue     PASSIVE_LEVEL
Insert an IRP at the end    KeInsertDeviceQueue         DISPATCH_LEVEL
Insert IRP in sort-order    KeInsertByKeyDeviceQueue    DISPATCH_LEVEL
Remove first IRP            KeRemoveDeviceQueue         DISPATCH_LEVEL
Remove specific IRP         KeRemoveEntryDeviceQueue    DISPATCH_LEVEL

There are two things to notice about Device Queue objects. First, you must
be at DISPATCH_LEVEL IRQL in order to call the functions that insert and
remove Device Queue entries. To call these functions from some part of your
driver running at PASSIVE_LEVEL IRQL, you have to change levels by calling
KeRaiseIrql and KeLowerIrql.
Second, Device Queue objects must live in nonpaged storage. Since you
normally declare them as part of your Device Extension structure, this poses no
particular problem.
To link an IRP into a Device Queue, you use a predefined Device Queue
entry that's a standard part of the IRP. The code looks like this:

    KeInsertDeviceQueue(
        &pDevExt->AlternateIrpQueue,
        &Irp->Tail.Overlay.DeviceQueueEntry);

Here, AlternateIrpQueue is a KDEVICE_QUEUE structure that's part of the
Device Extension.
When you remove an item from a Device Queue, you get a pointer to the
queue entry. As this fragment of code illustrates, you still need to use the
CONTAINING_RECORD macro to convert this entry back into the address of an
IRP:

    PIRP Irp;
    PKDEVICE_QUEUE_ENTRY QueueEntry;

    QueueEntry = KeRemoveDeviceQueue(
                    &pDevExt->AlternateIrpQueue);
    if (QueueEntry != NULL)
    {
        Irp = CONTAINING_RECORD(
                QueueEntry,
                IRP,
                Tail.Overlay.DeviceQueueEntry);
        // Do something with the IRP
    }

Also remember to check for a NULL return value. There's always the
possibility that the queue might be empty.
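
If the pointer arithmetic behind CONTAINING_RECORD is unfamiliar, this user-mode sketch shows the idea using the standard offsetof macro. MODEL_CONTAINING_RECORD, ModelQueueEntry, ModelIrp, and model_irp_from_entry are hypothetical stand-ins invented for illustration; the real macro and the real IRP layout come from the DDK headers.

```c
#include <stddef.h>

/* A user-mode equivalent of the CONTAINING_RECORD idea: subtract the
 * field's offset from the field's address to recover the outer struct. */
#define MODEL_CONTAINING_RECORD(address, type, field) \
    ((type *)((char *)(address) - offsetof(type, field)))

typedef struct ModelQueueEntry {
    struct ModelQueueEntry *next;
} ModelQueueEntry;

/* Hypothetical stand-in for an IRP with an embedded queue entry. */
typedef struct {
    int id;
    struct {
        struct {
            ModelQueueEntry DeviceQueueEntry;
        } Overlay;
    } Tail;
} ModelIrp;

/* Convert a pointer to the embedded entry back into its ModelIrp. */
static ModelIrp *model_irp_from_entry(ModelQueueEntry *entry)
{
    return MODEL_CONTAINING_RECORD(entry, ModelIrp,
                                   Tail.Overlay.DeviceQueueEntry);
}
```

Because the queue entry is embedded in the larger structure, the queue itself never needs to know what kind of structure it is holding; only the code that removes an entry does.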

11.3 WRITING CUSTOM DPC ROUTINES
Chapter 3 briefly introduced DPC objects as a general-purpose way for
high-IRQL code to perform less-important processing at a lower IRQL level. All the
drivers you've seen since then have taken advantage of the I/O Manager's
DpcForIsr mechanism to simplify the use of DPCs. For many situations, this may
provide all the functionality you'll need. In the case of full-duplex drivers, however,
funneling everything through a single DpcForIsr routine adds unnecessary
complications to the design of the software.
This section explains how to work directly with Kernel DPC objects using
CustomDpc routines. Although the main focus will be on their use in full-duplex
drivers, CustomDpc routines can be valuable in any situation where a driver's
Interrupt Service routine needs to perform some action that isn't allowed at
DIRQL.

How to Use a CustomDpc Routine
Working directly with Kernel DPC objects isn't terribly difficult. This is what
you need to do:
1. When you define your Device or Controller Extension, declare a separate
   KDPC item for each CustomDpc routine you plan to use.

2. In your DriverEntry routine, initialize each KDPC object by calling
   KeInitializeDpc. This sets up a correspondence between the KDPC object and a
   specific CustomDpc routine in your driver.

3. When you want to fire off the CustomDpc routine (usually from the driver's
   ISR), call KeInsertQueueDpc (see Table 11.3). To cancel a pending DPC
   request, you can call KeRemoveQueueDpc.

Table 11.3  Function prototype for KeInsertQueueDpc

BOOLEAN KeInsertQueueDpc                IRQL >= DISPATCH_LEVEL

Parameter               Description
IN PKDPC Dpc            Address of initialized DPC object to be queued
IN PVOID SystemArg1     First call-specific DPC parameter
IN PVOID SystemArg2     Second call-specific DPC parameter
Return value            • TRUE - the DPC was successfully queued
                        • FALSE - the DPC is already in the queue

Table 11.4  Prototype for a CustomDpc routine

VOID XxCustomDpc                        IRQL == DISPATCH_LEVEL

Parameter               Description
IN PKDPC Dpc            DPC object that generated the call
IN PVOID Context        Context parameter passed to KeInitializeDpc
IN PVOID SystemArg1     1st DPC parameter passed to KeInsertQueueDpc
IN PVOID SystemArg2     2nd DPC parameter passed to KeInsertQueueDpc
Return value            -

Remember that you can't queue a DPC object that's already in the queue. If
you try, KeInsertQueueDpc will return FALSE. This kind of thing might happen
if your device has such a high interrupt rate that the DPC routine doesn't get a
chance to run before the next interrupt arrives. In this case, it's up to your driver
to decide what to do. Depending on the design of your driver, one solution might
be to initialize a pool of DPCs for the ISR to use. In any event, remember that it's
your responsibility to take care of this situation.

Execution Context
The Kernel's DPC dispatcher eventually removes your DPC routine from
the queue and calls the associated CustomDpc routine. Table 11.4 shows the
prototype for the DPC routine itself.
Notice that you can pass three driver-specific parameters to a CustomDpc
routine. Along with the Context item that KeInitializeDpc associates with the
DPC object, you can pass two additional parameters each time you call
KeInsertQueueDpc. This is a little more flexible than the I/O Manager's DpcForIsr
mechanism, which always passes the Device object, the IRP, and one call-specific
argument. Depending on what you're trying to do, this can be useful.
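
The way the fixed Context item and the two per-call arguments reach a CustomDpc routine, and what happens when a DPC object is queued twice, can be illustrated with a small user-mode model. Everything here is hypothetical (ModelDpc, model_dpc_init, model_dpc_insert, model_dpc_dispatch, model_count_dpc); the model only imitates the behavior described in the text and is not how the Kernel implements DPCs.

```c
#include <stdbool.h>
#include <stddef.h>

struct ModelDpc;
typedef void (*ModelDpcRoutine)(struct ModelDpc *dpc, void *context,
                                void *arg1, void *arg2);

typedef struct ModelDpc {
    ModelDpcRoutine routine;  /* fixed when the object is initialized */
    void *context;            /* fixed when the object is initialized */
    void *arg1, *arg2;        /* supplied on each queue request       */
    bool  queued;
} ModelDpc;

static void model_dpc_init(ModelDpc *d, ModelDpcRoutine r, void *context)
{
    d->routine = r;
    d->context = context;
    d->arg1 = d->arg2 = NULL;
    d->queued = false;
}

/* Returns false, and changes nothing, if the object is already queued. */
static bool model_dpc_insert(ModelDpc *d, void *arg1, void *arg2)
{
    if (d->queued)
        return false;
    d->arg1 = arg1;
    d->arg2 = arg2;
    d->queued = true;
    return true;
}

/* Stand-in for the DPC dispatcher: dequeue, then call the routine. */
static void model_dpc_dispatch(ModelDpc *d)
{
    if (!d->queued)
        return;
    d->queued = false;
    d->routine(d, d->context, d->arg1, d->arg2);
}

/* Example routine: adds the per-call argument to a running total
 * that is reached through the context pointer. */
static void model_count_dpc(ModelDpc *dpc, void *context,
                            void *arg1, void *arg2)
{
    (void)dpc;
    (void)arg2;
    *(int *)context += *(int *)arg1;
}
```

Note that a rejected insert leaves the arguments of the first request in place, so the information carried by the second request is lost unless the caller records it somewhere else.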

11.4 CANCELING I/O REQUESTS
One issue we haven't addressed yet is how to deal with I/O requests that get
abandoned. Although there's nothing about full-duplex drivers that makes them
more prone to canceled requests, this is as good a place as any to bring up the
subject. Specifically, a driver has to be prepared for any of the following situations:
• A thread issues one or more overlapped I/O requests to a Device object.
  Before the driver processes these IRPs, the thread either terminates or
  closes its handle to the Device object.

• A thread issues one or more overlapped I/O requests and then calls some
  other Win32 function that cancels any previous requests. For example,
  one side-effect of calling SetupComm is that it automatically cancels all
  pending IRPs.

• A higher-level driver allocates an IRP and sends it to another driver using
  IoCallDriver. Before the IRP completes, the higher-level driver calls the
  IoCancelIrp function to cancel the request.1

In all three of these cases, the I/O Manager will notify the driver that the
IRPs involved in the I/O need to be cancelled. Once it's been notified, the
driver's job is to complete the affected IRPs with an IoStatus.Status value of
STATUS_CANCELLED and an IoStatus.Information value of zero.
This section explains the mechanics of canceling I/O requests. You'll see
examples of how to implement cancellation routines in the sample UART driver
at the end of the chapter.

How IRP Cancellation Works
In general, any driver that's going to hold IRPs in a pending state for a long
time needs to support cancellation. This really includes most device drivers, since
any Device object can have multiple overlapped requests waiting in its Device
Queue for the Start I/O routine. Cancellation support is also necessary in any
driver that stores IRPs temporarily in a driver-defined queue during the course of
processing.
Some of the issues will become a little more clear if you think about just
what might be going on when a cancel notification arrives. An IRP can be in one
of the following states at the time it gets cancelled:
• It might be in a queue waiting for the driver to get to it. This could be the
  Device object's Device Queue of pending requests (waiting for the Start
  I/O routine), or some private, internal queue managed by the driver.

• The IRP might have been removed from a queue, but the driver hasn't
  started to work on it yet. For example, an IRP might have become the
  Device object's current IRP but the Start I/O routine hasn't quite begun
  processing it.

• It might have been removed from a queue, and the next driver routine has
  begun processing it.

The I/O Manager's philosophy is that if an IRP is waiting in a queue when a
cancellation request occurs, then the driver should dequeue the IRP and cancel it.
Similarly, if the IRP has just been removed from a queue but processing hasn't
begun, the driver should cancel it. On the other hand, if the IRP has already been
started (and if it won't take too long to complete), then the driver should finish
processing the request the normal way.

1 Incidentally, a driver is only allowed to cancel IRPs that it has allocated and sent to a lower-level
driver. It must not try to cancel any IRPs sent to it by the I/O Manager or by a higher-level driver.

The I/O Manager provides two independent mechanisms for cancelling
IRPs. First of all, a driver can attach a Cancel routine to an IRP before it puts the
IRP into any queue. If the IRP is cancelled while it's still in the queue, the Cancel
routine dequeues the IRP and performs the cancellation. If there's no cancellation
request, some other part of the driver eventually dequeues the IRP, removes its
Cancel routine, and continues processing it. This allows individual IRPs to be
cancelled selectively.
Second, a driver can have a Cleanup Dispatch routine that processes the
IRP_MJ_CLEANUP major function code. The I/O Manager automatically sends
an IRP with this function code whenever a thread terminates or closes a handle.
The job of the Cleanup Dispatch is to cancel any queued IRPs belonging to the
thread. This is a more general mechanism that all drivers ought to support.

Synchronization Issues
Keep in mind that a driver's I/O processing, Cleanup Dispatch, and Cancel
routines all execute asynchronously. On a multi-processor system, they could
literally be running at the same time. As a result, various driver routines have to
coordinate their activities with care. Otherwise, there's always the chance one
part of a driver might keep working on an IRP that another part of the driver has
already cancelled.
For example, imagine that an IRP with an attached Cancel routine is sitting
in the Device Queue of some Device object. The I/O Manager dequeues the IRP,
makes it the Device object's current IRP, and then calls the driver's Start I/O
routine. Start I/O gets control, but before it can remove the IRP's Cancel routine, the
IRP is cancelled (perhaps on another CPU) and the Cancel routine executes. Now,
the Start I/O routine will begin processing the IRP and the Cancel routine will
cancel it. 2
Or consider the case where a driver's Cleanup Dispatch routine is in the
process of cancelling an IRP with an attached Cancel routine. If the Cancel routine
starts running before the Cleanup Dispatch function can disable it, again there
will be a very nasty collision.
The I/O Manager uses two mechanisms to prevent these kinds of
synchronization problems. The following subsections describe how they work.

The Cancel spin lock
The I/O Manager's Cancel spin lock is the primary
safeguard against collisions during IRP cancellation. Ownership of this spin lock
guarantees exclusive access to any IRP fields involved in cancelling a request. It
also protects the IRP from the time it leaves a Device object's Device Queue until
it becomes the current IRP.

2 This race condition is not limited to the Start I/O routine. Any time a queued IRP has an attached
Cancel routine, there's the chance that the Cancel routine may execute between the moment when
the IRP is dequeued and the moment when its Cancel routine is disabled.

Any driver-defined data structures that are shared among the Cleanup
Dispatch routine, a Cancel routine, and some other driver routine should also be
guarded by this spin lock. This includes any internal queues where the driver
might be holding IRPs.
To use this lock, you need to call IoAcquireCancelSpinLock before you
touch any of the various CancelXxx fields of the IRP and
IoReleaseCancelSpinLock when you're finished. This is an Executive spin lock, so you have to be at or
below DISPATCH_LEVEL IRQL when you acquire it. During the time you
actually hold the Cancel spin lock, you'll be running at DISPATCH_LEVEL IRQL, so
it's important not to cause any page faults.
Two important points about working with the Cancel spin lock: First, make
sure you release it before you call IoCompleteRequest. If you break this rule, you
can cause a system deadlock.
Second, remember that there's only one of these locks for the whole system,
so don't hold on to it for too long. Doing so can prevent other drivers from
running, which can degrade system performance.

The IRP Cancel flag
Any time a driver removes an IRP with a Cancel routine
from a queue, there's always the danger that the Cancel routine will execute
in the brief interval before it can be disabled. This would lead to a situation where
the driver continued processing an IRP that had already been completed by the
Cancel function.
To avoid this problem, each IRP contains a Boolean Cancel flag. By setting
this flag to TRUE before it calls the IRP's Cancel routine, the I/O Manager lets
other parts of the driver know that the IRP has already been completed. Like
other cancellation fields in the IRP, the Cancel flag is guarded by the Cancel spin
lock.
A driver's processing routines check the Cancel flag after they remove an
IRP from a queue. If the flag is TRUE, it means the Cancel routine has already
grabbed the IRP and nothing more should be done with it. If the flag is FALSE, the
processing routine sets the CancelRoutine field of the IRP to NULL using
IoSetCancelRoutine and starts to work on it. 3 From this point on, the Cancel routine
can't run anymore, so there's no more danger.
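
The check described above can be sketched in ordinary user-mode C. This is only a model, not kernel code: ModelIrp, model_begin_processing and the lock helpers are hypothetical names, and the Cancel spin lock is reduced to a single flag so the logic can be exercised without the DDK. In a real driver the same test would be made on Irp->Cancel while holding the Cancel spin lock.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the cancellation fields of a real IRP. */
typedef struct ModelIrp ModelIrp;
typedef void (*ModelCancelRoutine)(ModelIrp *irp);

struct ModelIrp {
    bool               Cancel;        /* set TRUE by the I/O Manager on cancel */
    ModelCancelRoutine CancelRoutine; /* NULL once processing has begun */
};

/* The Cancel spin lock, modelled as a simple "held" flag (single-threaded). */
static bool g_cancel_lock_held = false;

static void model_acquire_cancel_lock(void)
{
    assert(!g_cancel_lock_held);
    g_cancel_lock_held = true;
}

static void model_release_cancel_lock(void)
{
    assert(g_cancel_lock_held);
    g_cancel_lock_held = false;
}

/* Placeholder Cancel routine so the model IRP has something attached. */
static void model_dummy_cancel(ModelIrp *irp) { (void)irp; }

/*
 * Called right after an IRP has been removed from a queue.
 * Returns true if the caller may start processing the IRP,
 * false if the Cancel routine has already taken it.
 */
static bool model_begin_processing(ModelIrp *irp)
{
    bool may_process;

    assert(irp != NULL);
    model_acquire_cancel_lock();
    if (irp->Cancel) {
        may_process = false;        /* already cancelled: hands off */
    } else {
        irp->CancelRoutine = NULL;  /* disable further cancellation */
        may_process = true;
    }
    model_release_cancel_lock();
    return may_process;
}
```

Both the flag test and the clearing of the Cancel routine happen inside one locked region, which is what closes the window described in the text.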
What a Cancel Routine Does
Whenever a driver puts an IRP into a queue where it might remain for an
indefinite time, the driver should give the I/O Manager the option of canceling
the IRP. To do this, the driver attaches a queue-specific Cancel routine to the IRP
by calling IoSetCancelRoutine. The Cancel routine is responsible for doing
whatever is necessary to cancel the IRP. The exact actions it takes will depend on
where the IRP is in its processing cycle. If the driver has multiple internal
queues, it can attach different Cancel routines to an IRP at different stages of
processing.

3 Calling IoSetCancelRoutine requires that you first become the owner of the Cancel spin lock.

A driver can have a Cancel routine attached to an IRP only while the driver
actually owns the IRP. In other words, the IRP is only cancelable between the time
the driver receives the IRP and when it either completes the IRP or sends it to
another driver with IoCallDriver. Before releasing an IRP, the driver must set its
CancelRoutine field to NULL using IoSetCancelRoutine.
As described at the beginning of this section, the I/O Manager will call an
IRP's Cancel routine if the thread issuing the request terminates or closes its
handle before the request completes. The Cancel routine will also execute if a
higher-level driver explicitly cancels the request with IoCancelIrp.
An IRP's Cancel routine runs at DISPATCH_LEVEL IRQL. As input, it
receives a pointer to the Device object and a pointer to the IRP being cancelled.
Before calling a Cancel routine, the I/O Manager acquires the Cancel spin lock,
sets the IRP's Cancel flag to TRUE and its CancelRoutine field to NULL. The
Cancel routine has to release the Cancel lock before it returns.
The specific actions taken by a Cancel routine will depend on the state of
the IRP at the time it gets cancelled. The following subsections describe each of
the possibilities.

IRP is in the Device Queue
If the IRP has not become current yet, then it
must still be in the Device object's Device Queue. In this case, the Cancel routine
takes the following actions:

1. It calls KeRemoveEntryDeviceQueue to pull the IRP from the Device Queue.

2. The Cancel routine then calls IoReleaseCancelSpinLock to let go of the
   Cancel lock.

3. Next, it puts STATUS_CANCELLED in the IRP's IoStatus.Status field and 0
   in its IoStatus.Information field.

4. The Cancel routine calls IoCompleteRequest to give the IRP back to the I/O
   Manager. The priority boost is set to IO_NO_INCREMENT.

There's no need to call IoStartNextPacket since the IRP was canceled while
it was still in the Device Queue and hadn't yet entered the Start I/O routine.

IRP is current
A Cancel routine might run in the brief interval after the
I/O Manager has put an IRP's address into a Device object's CurrentIrp field
but before the Start I/O routine has set the IRP's CancelRoutine field to NULL.
The Cancel routine normally checks to see if the IRP being cancelled is the
Device object's current IRP, and if it is, it does the following:

1. It calls IoReleaseCancelSpinLock to let go of the Cancel lock.

2. The Cancel routine next sets the IRP's IoStatus.Status field to
   STATUS_CANCELLED and its IoStatus.Information field to 0.

3. Next, it calls IoCompleteRequest to give the IRP back to the I/O Manager.
   The priority boost is set to IO_NO_INCREMENT.

4. Finally, the Cancel routine calls IoStartNextPacket to make the driver start
   the next IRP.
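
The two cases above (IRP still in the Device Queue, IRP already current) differ only in how the IRP is detached and in whether the next packet is started. The following user-mode model walks through both orderings. It is a sketch under stated assumptions: ModelIrp, ModelDevice and the model_ functions are hypothetical stand-ins, the status constants are placeholders rather than real NTSTATUS values, and each kernel call is reduced to a field update noted in the comments.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_STATUS_PENDING   1L
#define MODEL_STATUS_CANCELLED 2L  /* placeholder, not the real NTSTATUS value */

typedef struct ModelIrp {
    long   Status;
    size_t Information;
    bool   Queued;      /* true while the IRP sits in the Device Queue */
    bool   Completed;
} ModelIrp;

typedef struct ModelDevice {
    ModelIrp *CurrentIrp;
    bool      CancelLockHeld;
    int       StartNextCalls;
} ModelDevice;

/* Stands in for IoCompleteRequest. */
static void model_complete(ModelDevice *dev, ModelIrp *irp)
{
    /* Rule from the text: release the Cancel lock before completing. */
    assert(!dev->CancelLockHeld);
    irp->Completed = true;
}

/* Stands in for IoStartNextPacket; the model queue is always empty. */
static void model_start_next_packet(ModelDevice *dev)
{
    dev->CurrentIrp = NULL;
    dev->StartNextCalls++;
}

/* Entered with the Cancel lock held, as the I/O Manager would call it. */
static void model_cancel_routine(ModelDevice *dev, ModelIrp *irp)
{
    bool was_current = (dev->CurrentIrp == irp);

    assert(dev->CancelLockHeld);
    if (!was_current)
        irp->Queued = false;     /* stands in for KeRemoveEntryDeviceQueue */

    dev->CancelLockHeld = false; /* stands in for IoReleaseCancelSpinLock */

    irp->Status = MODEL_STATUS_CANCELLED;
    irp->Information = 0;
    model_complete(dev, irp);

    if (was_current)
        model_start_next_packet(dev); /* only when the current IRP was cancelled */
}
```

Note that the dequeue happens while the lock is still held, and everything that touches only the cancelled IRP happens after the lock is released.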

IRP is in some other queue
A driver can always maintain its own
private queue of IRPs. If an IRP is in such a queue at the time it gets cancelled, its
queue-specific Cancel routine does the following:

1. It calls RemoveEntryList to dequeue it. 4

2. The Cancel routine then calls IoReleaseCancelSpinLock to let go of the
   Cancel lock.

3. Next, it puts STATUS_CANCELLED in the IRP's IoStatus.Status field, and
   zero in its IoStatus.Information field.

4. The Cancel routine calls IoCompleteRequest to give the IRP back to the I/O
   Manager. The priority boost is set to IO_NO_INCREMENT.

5. Depending on the design of the driver, it may be necessary to call
   IoStartNextPacket to get the driver working on the next request.
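
To make the dequeue step for a private queue concrete, here is a user-mode model of a doubly linked queue in the style of NT's LIST_ENTRY. The names ModelListEntry, ModelIrp and the model_ helpers are hypothetical; the real RemoveEntryList and CONTAINING_RECORD come from the DDK headers. Like the real function, the removal here is not interlocked, so a driver would call it only while holding the lock that guards the queue.

```c
#include <assert.h>
#include <stddef.h>

/* User-mode copy of the doubly linked list shape used by LIST_ENTRY. */
typedef struct ModelListEntry {
    struct ModelListEntry *Flink;  /* forward link */
    struct ModelListEntry *Blink;  /* backward link */
} ModelListEntry;

typedef struct ModelIrp {
    int            Id;
    ModelListEntry ListEntry;      /* links the IRP into the private queue */
} ModelIrp;

/* Recover the enclosing structure from a pointer to one of its fields. */
#define MODEL_CONTAINING_RECORD(addr, type, field) \
    ((type *)((char *)(addr) - offsetof(type, field)))

static void model_init_list(ModelListEntry *head)
{
    head->Flink = head->Blink = head;
}

static void model_insert_tail(ModelListEntry *head, ModelListEntry *entry)
{
    entry->Flink = head;
    entry->Blink = head->Blink;
    head->Blink->Flink = entry;
    head->Blink = entry;
}

/* Not interlocked: the caller must hold the lock that guards the queue. */
static void model_remove_entry(ModelListEntry *entry)
{
    assert(entry != NULL && entry->Flink != entry); /* must be linked */
    entry->Blink->Flink = entry->Flink;
    entry->Flink->Blink = entry->Blink;
    entry->Flink = entry->Blink = entry;
}

static size_t model_count(const ModelListEntry *head)
{
    size_t n = 0;
    for (const ModelListEntry *e = head->Flink; e != head; e = e->Flink)
        n++;
    return n;
}
```

Because each entry carries its own links, a Cancel routine can unlink one IRP from the middle of the queue without walking the list.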

What a Dispatch Cleanup Routine Does
At the time a user-mode thread terminates (either normally or abnormally),
it may still have incomplete I/O requests associated with one or more Device
objects. Similarly, it's possible for a thread to close a Device object handle with
requests pending. In both these cases, the I/O Manager will try to clean up the
outstanding I/O requests by doing two things: It sends the Device object an IRP
with the major function code IRP_MJ_CLEANUP and it calls the Cancel routine of
any IRPs associated with the thread.
After this, the I/O Manager delays execution of the thread, giving the driver
time to process the IRPs. If the driver completes the IRPs, the I/O Manager
responds by sending an IRP_MJ_CLOSE IRP to the Device object.
If the IRPs aren't completed during the timeout interval (which can last
more than five minutes), things get ugly. In this case, the I/O Manager displays a
message box for each IRP (naming the offending driver) and detaches the IRP
from the thread. These zombie IRPs are lost to the system, as is any system buffer
space associated with them. Another side-effect is that the driver can't be
unloaded since it still has outstanding IRPs. No IRP_MJ_CLOSE ever gets sent.

4 This assumes the driver-defined queue is protected by the Cancel spin lock. The RemoveEntryList
function is not interlocked.

From this description, you can see how important it is for a driver to clean
up pending I/O requests. As you already know, one way to do this is to attach
Cancel routines to every IRP. For some drivers, this may be overkill, and a simpler
method is just to ask for cleanup notifications. To receive these notifications, the
Driver object has to have a Dispatch routine registered for IRP_MJ_CLEANUP in
its MajorFunction table.
The job of the Cleanup Dispatch routine is to cancel any queued requests
associated with a specific Device object. For nonshareable Device objects, this
means flushing all IRPs out of the Device Queue and any other driver-defined
queues where they may be hiding. Depending on the nature of the device, the
driver might also abort a request in progress, or let it complete normally.
If a Device object is shareable, cleanup involves a little more work. In this
case, the Cleanup Dispatch routine must cancel only those IRPs associated with
the specific user-mode handle being closed. To do this, it uses the File object
pointer stored in the I/O stack location of each IRP. This pointer uniquely
identifies the user-mode handle that issued the request. The Cleanup Dispatch routine
simply has to compare the File object pointer in each queued IRP with the
pointer in the IRP_MJ_CLEANUP IRP. If they match, the queued IRP needs to be
cancelled.
Like any Dispatch routine, the Cleanup Dispatch executes at PASSIVE_LEVEL
IRQL. Although the specific steps will depend on the driver, a Cleanup Dispatch
routine generally has to do the following:

1. It calls IoAcquireCancelSpinLock to acquire the Cancel spin lock. Unlike a
   Cancel routine, the Cleanup Dispatch routine doesn't automatically hold this
   spin lock when it's called.

2. Next, it scans the Device Queue of the target Device object looking for IRPs
   whose File object pointer matches the File object pointer of the
   IRP_MJ_CLEANUP IRP itself.

3. The Dispatch Cleanup routine removes each matching IRP from the Device
   Queue and sets the IRP's CancelRoutine field to NULL. It also sets the IRP's
   Cancel flag to TRUE and its CancelIrql field to DISPATCH_LEVEL. The IRP
   is then added to a list of requests to be cancelled.

4. If the driver maintains any private queues where IRPs might be held, the
   Dispatch Cleanup routine performs a similar scan. Any IRPs with matching File
   object pointers are removed from these queues, their various CancelXxx fields
   are modified and they are also put in the list of requests to be cancelled.

5. After releasing the Cancel spin lock, the Dispatch Cleanup routine completes
   all the IRPs in its cancellation list with a status of STATUS_CANCELLED and
   a boost of IO_NO_INCREMENT.

6. Finally, it completes the IRP_MJ_CLEANUP request itself with a status of
   STATUS_SUCCESS and a priority boost value of IO_NO_INCREMENT.
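
The steps above amount to a two-phase pattern: collect the matching IRPs while the lock is held, then complete them after it has been released. The sketch below models that pattern in user-mode C. Everything in it is an assumption made for illustration: queues are small arrays rather than Device Queues, ModelIrp and ModelQueue are hypothetical, and the status constant is a placeholder rather than the real STATUS_CANCELLED value.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_IRPS 8
#define MODEL_STATUS_CANCELLED 2L  /* placeholder, not the real NTSTATUS value */

typedef struct ModelIrp {
    const void *FileObject;       /* identifies the handle that issued it */
    bool        Cancel;
    bool        HasCancelRoutine;
    long        Status;
    bool        Completed;
} ModelIrp;

typedef struct ModelQueue {
    ModelIrp *Items[MODEL_MAX_IRPS];
    size_t    Count;
} ModelQueue;

/* Phase 1 helper: move IRPs that match file_object onto the cancel list. */
static size_t model_collect(ModelQueue *q, const void *file_object,
                            ModelIrp **cancel_list, size_t n_cancel)
{
    size_t kept = 0;

    assert(q != NULL && q->Count <= MODEL_MAX_IRPS);
    for (size_t i = 0; i < q->Count; i++) {
        ModelIrp *irp = q->Items[i];
        if (irp->FileObject == file_object) {
            assert(n_cancel < MODEL_MAX_IRPS);
            irp->HasCancelRoutine = false; /* CancelRoutine = NULL */
            irp->Cancel = true;
            cancel_list[n_cancel++] = irp;
        } else {
            q->Items[kept++] = irp;        /* belongs to another handle */
        }
    }
    q->Count = kept;
    return n_cancel;
}

/* Returns the number of IRPs cancelled on behalf of file_object. */
static size_t model_cleanup(ModelQueue *queues[], size_t n_queues,
                            const void *file_object)
{
    ModelIrp *cancel_list[MODEL_MAX_IRPS];
    size_t n = 0;

    /* Phase 1: a driver would hold the Cancel spin lock around this loop. */
    for (size_t i = 0; i < n_queues; i++)
        n = model_collect(queues[i], file_object, cancel_list, n);

    /* Phase 2: lock released; now it is safe to complete each IRP. */
    for (size_t i = 0; i < n; i++) {
        cancel_list[i]->Status = MODEL_STATUS_CANCELLED;
        cancel_list[i]->Completed = true;
    }
    return n;
}
```

Keeping a separate cancellation list is what lets the completions happen outside the lock, in line with the rule about never calling IoCompleteRequest while holding the Cancel spin lock.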

11.5 SOME MORE HARDWARE: THE 16550 UART
This section describes the operation of the 16550 UART (Universal Asynchronous
Receiver/Transmitter), a typical full-duplex device. Knowing how this hardware
works will make it easier to understand the sample driver in the next section.

What the 16550 UART Does
The 16550 UART is an integrated circuit that performs serial input and
output. Normally the UART is coupled to some kind of line-driver chip that
interfaces with the outside world. For example, this is how the RS-232 serial ports on
most computers are implemented.
The beauty of the UART is that it hides all the unpleasant details of framing
the data with START and STOP bits, as well as generating parity and making sure
all the bits are shifted out at the proper rate. To perform serial data transfers, you
just move individual bytes to or from the UART's buffer registers.
On output, you send a data byte to the UART's Transmit Data register from
which it is moved into a 16-byte FIFO on the chip. When the data byte makes it to
the other end of the FIFO, it goes into a shift register that sends it out over the
serial line one bit at a time. When the FIFO empties out, the UART sets its TBE
(transmit buffer empty) flag.
Meanwhile, the UART's receiver section is constantly monitoring the serial
line for input. As bits appear, they are added to a shift register that assembles
them into a single byte of data. When the byte is complete, it goes into the input
FIFO and the UART sets its RxRDY (receive data ready) flag to indicate that data is
available. This flag stays up as long as any data remains in the FIFO. You pull data
bytes one by one from the UART's Receive Data register.

Device Registers
You interact with the 16550 UART by reading and writing a set of one-byte
registers, which are described briefly in Table 11.5. Although this chapter gives
you enough information to talk intelligently about the 16550 at a dinner party,
you should read the data sheets from National Semiconductor if you want the
whole story. 5
If you count carefully, you'll notice that there are twelve registers
sandwiched into an eight-byte span. How can this be? Actually, it's the hardware
people playing those little tricks they like so much. The first trick is that some
addresses go to different registers on the UART depending on whether you're
reading or writing them. For example, if you read from offset 0, you get the
contents of the Receive Data register, but if you write to 0, your byte goes to the
Transmit Data register instead.

5 Joe Campbell's definitive book on serial communications (listed in the bibliography) is another
excellent source of information.

Table 11.5  Control and status registers for a 16550 UART

UART register definitions

Offset  Register           Access  Description
0       Receive Data       R/O     Fetches first byte from input FIFO
0       Transmit Data      W/O     Sends byte to output FIFO
0       Baud rate LSB      R/W     Low byte of baud rate divisor*
1       Interrupt Enable   R/W     Enables various interrupts
          Bit 0                      Received data ready
          Bit 1                      Transmit buffer empty
          Bit 2                      Error or BREAK
          Bit 3                      RS-232 input has changed state
          Bits 4-7                   Always zero
1       Baud rate MSB      R/W     High byte of baud rate divisor*
2       Interrupt ID       R/O     Identifies source of an interrupt
          Bit 0                      If set, no interrupts pending
          Bits 1-2                   Source of interrupt (see below)
          Bit 3                      FIFO timeout interrupt
          Bits 4-5                   Reserved
          Bits 6-7                   Set if FIFOs are enabled
2       FIFO Control       W/O     Controls FIFO behavior
          Bit 0                      Enable both FIFOs
          Bit 1                      Clear all bytes from Receive FIFO
          Bit 2                      Clear all bytes from Transmit FIFO
          Bit 3                      Enable DMA support
          Bits 4-5                   Reserved
          Bits 6-7                   Trigger-level of Receive FIFO
3       Line Control       R/W     Controls data bits, stop bits, parity
          Bits 0-1                   Number of data bits
          Bit 2                      Number of STOP bits
          Bits 3-5                   Parity control
          Bit 6                      BREAK control
          Bit 7                      Divisor latch access bit (DLAB)
4       Modem Control      R/W     Controls state of DTR and RTS lines
5       Line Status        R/W     Reports status of I/O operation
6       Modem Status       R/W     Reports state changes in DTR, RTS
7       Scratch-pad        R/W     Unused, possibly not implemented

*Accessible only when DLAB in the Line Control register is 1.

That accounts for ten registers, but what about the remaining two? The other
trick is that when you set the DLAB bit of register 3, the low and high bytes of the
baud-rate control mysteriously appear at offsets 0 and 1. You restore things to
normal by clearing the bit. Since you're not likely to change baud rates frequently,
this doesn't cause much of a problem.
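
The DLAB banking just described can be tried out against a small simulated register file. The sketch below is a user-mode model, not driver code: ModelUart and the model_ functions are hypothetical, and only offsets 0, 1 and 3 are modelled. The divisor formula assumes the 1.8432 MHz clock commonly used on PC serial ports, which gives divisor = 115200 / baud; hardware with a different clock needs a different constant.

```c
#include <assert.h>
#include <stdint.h>

enum { UART_LCR = 3, UART_LCR_DLAB = 0x80 };

/* Simulated chip: offsets 0 and 1 are banked by the DLAB bit. */
typedef struct ModelUart {
    uint8_t lcr;
    uint8_t divisor_lsb;
    uint8_t divisor_msb;
    uint8_t transmit_data;
    uint8_t interrupt_enable;
} ModelUart;

static void model_write(ModelUart *u, unsigned offset, uint8_t value)
{
    int dlab = (u->lcr & UART_LCR_DLAB) != 0;

    assert(offset <= 3);
    switch (offset) {
    case 0:
        if (dlab) u->divisor_lsb = value; else u->transmit_data = value;
        break;
    case 1:
        if (dlab) u->divisor_msb = value; else u->interrupt_enable = value;
        break;
    case 3:
        u->lcr = value;
        break;
    default:
        break;  /* offset 2 is not modelled here */
    }
}

/* Assumption: 1.8432 MHz clock, so divisor = 115200 / baud. */
static uint16_t model_divisor(uint32_t baud)
{
    assert(baud >= 2 && baud <= 115200u);
    return (uint16_t)(115200u / baud);
}

/* Set DLAB, load both divisor bytes, then restore the Line Control value. */
static void model_set_baud(ModelUart *u, uint32_t baud)
{
    uint16_t d = model_divisor(baud);
    uint8_t saved = u->lcr;

    model_write(u, UART_LCR, (uint8_t)(saved | UART_LCR_DLAB));
    model_write(u, 0, (uint8_t)(d & 0xFF));
    model_write(u, 1, (uint8_t)(d >> 8));
    model_write(u, UART_LCR, (uint8_t)(saved & ~UART_LCR_DLAB));
}
```

After model_set_baud returns, a write to offset 0 reaches the Transmit Data register again, because the DLAB bit has been cleared.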
One other thing to watch out for is register 7. Although the official data
sheets say you should be able to use it as a one-byte store for anything you like,
the truth is that it may not work. National Semiconductor licenses this UART
design to a number of other manufacturers, and they don't all implement the
scratch-pad.
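
As a small illustration of how the Line Control fields in Table 11.5 fit together, the following helper packs data bits, stop bits and parity into one byte. The field positions come from the table; the specific encodings (data bits stored as length minus 5, parity enable in bit 3, even-parity select in bit 4) are taken from the 16550 data sheet rather than from this chapter, and the names ModelParity and model_line_control are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    MODEL_PARITY_NONE,
    MODEL_PARITY_ODD,
    MODEL_PARITY_EVEN
} ModelParity;

/* Build a Line Control value; BREAK (bit 6) and DLAB (bit 7) stay clear. */
static uint8_t model_line_control(unsigned data_bits, unsigned stop_bits,
                                  ModelParity parity)
{
    uint8_t lcr;

    assert(data_bits >= 5 && data_bits <= 8);
    assert(stop_bits == 1 || stop_bits == 2);

    lcr = (uint8_t)(data_bits - 5);      /* bits 0-1: number of data bits */
    if (stop_bits == 2)
        lcr |= 0x04;                     /* bit 2: extra STOP bit */

    switch (parity) {                    /* bits 3-5: parity control */
    case MODEL_PARITY_ODD:
        lcr |= 0x08;                     /* parity enabled, odd */
        break;
    case MODEL_PARITY_EVEN:
        lcr |= 0x18;                     /* parity enabled, even select */
        break;
    case MODEL_PARITY_NONE:
        break;
    }
    return lcr;
}
```

For example, the common 8 data bits, no parity, 1 stop bit setting packs to 0x03.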

Interrupt Behavior
The 16550 UART uses interrupts to let the CPU know about a number of
interesting conditions. Specifically, it generates an interrupt whenever:
•

A framing error or a BREAK occurs.

•

The Receive FIFO reaches the trigger level set by the FIFO Control
register.

•

There is at least one character in the Receive FIFO, no other characters
have arrived recently, and the CPU hasn't read the Receive Data register
for a while. This FIFO timeout interrupt prevents data from wasting away
in the FIFO.

•

The transmit FIFO is empty. Usually, this is the signal to send more data.
A single, spurious FIFO empty interrupt can occur when you first enable
transmitter interrupts.

•

Any of the RS-232 input lines changes state.

Your interrupt service routine determines the cause of the interrupt by
examining the UART's Interrupt ID register. Notice the use of negative logic in
this register: The UART clears the low-order bit when an interrupt occurs and sets
it when all pending interrupts have been serviced. The remaining bits in this
register describe the exact source of the interrupt. See Table 11.6 for more information
about UART interrupts.
Since several of these conditions might occur simultaneously, the UART
imposes a priority arbitration scheme on interrupt events. When an interrupt
occurs, the 16550 locks out UART events of equal or lower priority until the
current interrupt has been dismissed.

Table 11.6  Determining the cause of a 16550 interrupt

UART interrupts

Cause                         Priority  ID register
(No interrupt)                          1
Error or BREAK                0         6
FIFO receiver trigger level   1         4
Receive-FIFO timeout          1         12
Transmitter buffer empty      2         2
RS-232 input                  3         0

Priority 0 is the most important, priority 3 the least.
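
The ID register values in Table 11.6 translate directly into a small decoder. This is a sketch only: the ModelIntCause names are hypothetical, and the choice to mask off bits 6-7 (which Table 11.5 says are set when the FIFOs are enabled) before comparing is an assumption about how a driver would use the table.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    MODEL_INT_NONE,         /* low bit set: nothing pending */
    MODEL_INT_LINE_STATUS,  /* error or BREAK */
    MODEL_INT_RX_TRIGGER,   /* Receive FIFO reached its trigger level */
    MODEL_INT_RX_TIMEOUT,   /* Receive FIFO timeout */
    MODEL_INT_TX_EMPTY,     /* transmitter buffer empty */
    MODEL_INT_MODEM,        /* RS-232 input changed state */
    MODEL_INT_UNKNOWN       /* bit pattern not listed in Table 11.6 */
} ModelIntCause;

static ModelIntCause model_decode_iir(uint8_t iir)
{
    if (iir & 0x01)                 /* negative logic: 1 means no interrupt */
        return MODEL_INT_NONE;

    switch (iir & 0x0E) {           /* ignore the FIFO-enabled bits 6-7 */
    case 0x06: return MODEL_INT_LINE_STATUS;
    case 0x04: return MODEL_INT_RX_TRIGGER;
    case 0x0C: return MODEL_INT_RX_TIMEOUT;
    case 0x02: return MODEL_INT_TX_EMPTY;
    case 0x00: return MODEL_INT_MODEM;
    default:   return MODEL_INT_UNKNOWN;
    }
}

/* Priority from Table 11.6 (0 is most important); -1 if not applicable. */
static int model_priority(ModelIntCause cause)
{
    assert(cause >= MODEL_INT_NONE && cause <= MODEL_INT_UNKNOWN);
    switch (cause) {
    case MODEL_INT_LINE_STATUS: return 0;
    case MODEL_INT_RX_TRIGGER:  return 1;
    case MODEL_INT_RX_TIMEOUT:  return 1;
    case MODEL_INT_TX_EMPTY:    return 2;
    case MODEL_INT_MODEM:       return 3;
    default:                    return -1;
    }
}
```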

The Interrupt ID register only shows you the highest-priority UART event.
After you service this event, any other pending interrupts appear in the ID
register in order of priority. This means that when you service a single UART interrupt,
you need to check for any other events that might be pending before you dismiss
the interrupt. Your service routine isn't really finished until the UART sets the low
bit of the ID register.
The action your service routine takes to clear an interrupt depends on the
cause of the interrupt. Table 11.7 shows how to clear various UART interrupts.
Notice that you can clear Transmit interrupts either by sending more data, or (if
this is the end of the I/O operation) simply by reading the ID register again.
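
The "keep going until the low bit is set" rule can be shown with a simulated UART that tracks its pending events as a bit mask. This is a user-mode model under stated assumptions: ModelUart, the EV_ flags and the model_ functions are hypothetical, the receive trigger and timeout events are folded into one flag, and each "service" step simply clears the event instead of touching real registers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pending-event flags in the simulated chip, highest priority first. */
enum { EV_LINE = 1, EV_RX = 2, EV_TX = 4, EV_MODEM = 8 };

typedef struct ModelUart {
    unsigned pending;
} ModelUart;

/* Reports only the highest-priority pending event, like the ID register. */
static uint8_t model_read_iir(const ModelUart *u)
{
    if (u->pending & EV_LINE)  return 0x06;
    if (u->pending & EV_RX)    return 0x04;
    if (u->pending & EV_TX)    return 0x02;
    if (u->pending & EV_MODEM) return 0x00;
    return 0x01;                /* low bit set: nothing left to service */
}

/*
 * Services every pending event and records the ID values seen, in order.
 * Returns the number of events handled.
 */
static size_t model_service(ModelUart *u, uint8_t *log, size_t log_size)
{
    size_t n = 0;
    uint8_t iir;

    assert(u != NULL && log != NULL);
    while (((iir = model_read_iir(u)) & 0x01) == 0) {
        assert(n < log_size);
        log[n++] = iir;
        switch (iir & 0x0E) {
        case 0x06: u->pending &= ~(unsigned)EV_LINE;  break; /* read Line Status */
        case 0x04: u->pending &= ~(unsigned)EV_RX;    break; /* read received data */
        case 0x02: u->pending &= ~(unsigned)EV_TX;    break; /* send more, or done */
        default:   u->pending &= ~(unsigned)EV_MODEM; break; /* read RS-232 status */
        }
    }
    return n;
}
```

With line-status, transmit and RS-232 events all pending at once, the loop sees them one at a time in priority order and exits only when the ID register reports nothing pending.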

Table 11.7  Clearing interrupts on the 16550 UART

Clearing UART interrupts

Interrupt source          To clear it ...
Receiver error or BREAK   Read the Line Status register
Received data             Read data from the Receiver register
Transmit buffer empty     • Write to the Transmit buffer
                          • Read the Interrupt ID register
RS-232 input              Read the RS-232 Status register

11.6 CODE EXAMPLE: FULL-DUPLEX UART DRIVER
This is an example of a simple driver that performs simultaneous input and
output operations using a 16550 UART. Because the driver is rather large, only
selected pieces will appear here. You can find the complete code for this example
in the CH11\DRIVER directory on the disk accompanying this book.


What to Expect
As you're poking around in the code, keep in mind that this is a toy driver
whose real purpose is to illustrate the techniques presented earlier in this chapter.
As a result, it ignores a number of issues that a real serial port driver needs to
worry about. 6 Before examining the driver itself, it's a good idea to describe some
of the things it doesn't do.
Perhaps this driver's biggest limitation is that it doesn't handle unsolicited
input. In other words, it only accepts data from the device when an
IRP_MJ_READ IRP is pending. Data arriving at any other time is simply dropped
on the floor. In a real serial port driver, the Interrupt Service routine would
probably dump unsolicited input into a type-ahead buffer, where it could be used to
satisfy IRP_MJ_READ requests as they arrived.
Secondly, this driver uses a very simple signaling protocol between the
sender and the receiver: It relies on the timeout interrupt from the UART's
input FIFO to terminate a read request. If the sender slows down enough to trigger this
interrupt, the receiver will essentially ignore the rest of the transmission.
Conversely, if the sender doesn't leave enough of a gap between successive
transmissions, the receiver will run them together. This is the only kind of flow-control
supported by the driver.
Finally, as a concession to simplicity, this driver doesn't worry about
device operations that time out. Since this can lead to situations where an IRP
never gets completed, it's definitely something you'd want to handle in a real
driver. The first code example in Chapter 10 shows how to deal with device
time-outs.

DEVICE_EXTENSION in XXDRIVER.H
The following excerpt from the driver-specific header file shows the layout
of the Device Extension.

typedef struct _DEVICE_EXTENSION {
    PDEVICE_OBJECT DeviceObject;        // Back pointer
    ULONG NtDeviceNumber;               // Zero-based device number
    PUCHAR PortBase;                    // First control register
    PKINTERRUPT pInterrupt;             // Interrupt object
    //
    // Current UART settings
    //
    ULONG InputFifoTriggerLevel;
    ULONG DataBits;
    ULONG StopBits;
    ULONG Parity;

    KDEVICE_QUEUE AlternateIrpQueue;    // (1)
    KDPC AlternateDpc;
    PIRP CurrentAlternateIrp;

    ULONG OutputFifoSize;               // Bytes to send at once (2)
    ULONG OutputBytesRequested;         // Output buffer size
    ULONG OutputBytesRemaining;         // Chars left to send
    PUCHAR pOutputBuffer;               // Next char to send
    BOOLEAN OutputInterruptsValid;

    ULONG InputFifoSize;                // Count of bytes (3)
    ULONG InputBytesRequested;          // Input buffer size
    ULONG InputBytesRemaining;          // Space left in buffer
    PUCHAR pInputBuffer;                // Next available slot
    BOOLEAN InputInterruptsValid;

    UCHAR DeviceStatus;                 // Most recent status
} DEVICE_EXTENSION, *PDEVICE_EXTENSION;

(1) The Device Queue object and CurrentAlternateIrp pointer keep track of
input requests. In this driver, all input operations will follow the alternate
processing path. The DPC object is used to request I/O postprocessing of
alternate IRPs.

(2) Here are the bookkeeping items used for output requests. Since the driver
is using Buffered I/O, it has to keep a count of the bytes left to be
transferred and a pointer to the location of the next output byte in the system
buffer. The OutputInterruptsValid flag is set to TRUE whenever an
output operation is in progress.

(3) These items do the bookkeeping for input requests. Notice how they
parallel the output items.

6 If you want to see what really goes into managing a standard COM port, take a look at the serial
port driver source code included in the NT DDK.

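
To show how the output bookkeeping fields might work together, here is a user-mode sketch that feeds a buffer to a transmitter one FIFO-load at a time. It is an assumption about usage, since the driver routines that consume these fields are not shown in this section. ModelExtension reuses the field names from the Device Extension, while ModelSink and the model_ functions are hypothetical; the sink simply records the bytes that would be written to the Transmit Data register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Output half of the Device Extension, using the same field names. */
typedef struct ModelExtension {
    uint32_t       OutputFifoSize;       /* bytes to send at once */
    uint32_t       OutputBytesRequested; /* output buffer size */
    uint32_t       OutputBytesRemaining; /* chars left to send */
    const uint8_t *pOutputBuffer;        /* next char to send */
} ModelExtension;

/* Records the bytes that would go to the Transmit Data register. */
typedef struct ModelSink {
    uint8_t data[64];
    size_t  len;
} ModelSink;

static void model_put_byte(ModelSink *sink, uint8_t byte)
{
    assert(sink->len < sizeof sink->data);
    sink->data[sink->len++] = byte;
}

/* Set up the bookkeeping at the start of a write request. */
static void model_start_output(ModelExtension *ext, const uint8_t *buffer,
                               uint32_t length)
{
    ext->OutputBytesRequested = length;
    ext->OutputBytesRemaining = length;
    ext->pOutputBuffer = buffer;
}

/* Send at most one FIFO-load; returns the number of bytes sent. */
static uint32_t model_send_chunk(ModelExtension *ext, ModelSink *sink)
{
    uint32_t n = ext->OutputBytesRemaining < ext->OutputFifoSize
                     ? ext->OutputBytesRemaining
                     : ext->OutputFifoSize;

    for (uint32_t i = 0; i < n; i++) {
        model_put_byte(sink, *ext->pOutputBuffer++);
        ext->OutputBytesRemaining--;
    }
    return n;
}
```

In this model, a return value of zero means the request has no bytes left, and OutputBytesRequested minus OutputBytesRemaining gives the count of bytes sent so far.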
DISPATCH.C
This portion of the example shows the Dispatch routines for writing,
reading, and performing IRP cleanup operations.

XxDispatchWrite
This function processes Win32 WriteFile calls by
sending the IRP along the standard driver processing path.

NTSTATUS
XxDispatchWrite(
    IN PDEVICE_OBJECT DeviceObject,
    IN PIRP Irp
    )
{
    PIO_STACK_LOCATION IrpStack =
        IoGetCurrentIrpStackLocation( Irp );

    if( IrpStack->Parameters.Write.Length == 0 )    // (1)
    {
        Irp->IoStatus.Status = STATUS_SUCCESS;
        Irp->IoStatus.Information = 0;
        IoCompleteRequest( Irp, IO_NO_INCREMENT );
        return STATUS_SUCCESS;
    }
    //
    // Start device operation
    //
    IoMarkIrpPending( Irp );

    IoStartPacket(                                  // (2)
        DeviceObject,
        Irp,
        0,
        XxCancelPrimaryIrp );                       // (3)
    return STATUS_PENDING;
}

(1) This driver doesn't consider zero-length transfers to be an error, so the
IRP is just completed immediately.

(2) To send an IRP along the standard processing path, the driver calls
IoStartPacket.

(3) While the IRP is waiting in the Device object's pending queue, this Cancel
routine will be responsible for canceling it.

XxDispatchRead
This function processes Win32 ReadFile calls by sending
the IRP along the alternate driver processing path.

NTSTATUS
XxDispatchRead(
    IN PDEVICE_OBJECT DeviceObject,
    IN PIRP Irp
    )
{
    PIO_STACK_LOCATION IrpStack =
        IoGetCurrentIrpStackLocation( Irp );
    //
    // Check for zero-length transfers
    //
    if( IrpStack->Parameters.Read.Length == 0 )
    {
        Irp->IoStatus.Status = STATUS_SUCCESS;
        Irp->IoStatus.Information = 0;
        IoCompleteRequest( Irp, IO_NO_INCREMENT );
        return STATUS_SUCCESS;
    }

    IoMarkIrpPending( Irp );                        // (1)
    XxAlternateStartPacket(                         // (2)
        DeviceObject,
        Irp,
        XxCancelAlternateIrp );                     // (3)

    return STATUS_PENDING;
}

(1) Begin the device operation. As always, the IRP must be marked pending.

(2) Unlike the previous Dispatch routine, this one uses a driver-defined
function to send the IRP along the alternate processing path.

(3) Once again, there's a Cancel routine to process the IRP if it should be
canceled before the driver actually starts working on it.

XxDispatchCleanup
This Dispatch routine gets called when a thread that
opened a handle either calls CloseHandle or terminates. Its job is to pull any IRPs
associated with the handle from the two Device Queues and cancel them.

NTSTATUS
XxDispatchCleanup(
    IN PDEVICE_OBJECT DeviceObject,
    IN PIRP Irp
    )
{
    PIO_STACK_LOCATION CleanupIrpStack =
        IoGetCurrentIrpStackLocation( Irp );
    PDEVICE_EXTENSION DeviceExtension =
        DeviceObject->DeviceExtension;

    XxCleanupDeviceQueue(                           // (1)
        &DeviceObject->DeviceQueue,
        CleanupIrpStack->FileObject );

    XxCleanupDeviceQueue(                           // (2)
        &DeviceExtension->AlternateIrpQueue,
        CleanupIrpStack->FileObject );

    Irp->IoStatus.Status = STATUS_SUCCESS;          // (3)
    Irp->IoStatus.Information = 0;
    IoCompleteRequest( Irp, IO_NO_INCREMENT );
    return STATUS_SUCCESS;
}

(1) XxCleanupDeviceQueue, a helper function that appears later in this
example, does the actual work. Here, it's being called to cancel
IRP_MJ_WRITE IRPs waiting in the Device object's primary queue. The
File object pointer identifies the handle to look for when canceling IRPs.

(2) Here, XxCleanupDeviceQueue will cancel IRP_MJ_READ IRPs
associated with the handle.

(3) Finally, the IRP_MJ_CLEANUP IRP itself is completed. Once this IRP is
passed back to the I/O Manager, it will be followed by an
IRP_MJ_CLOSE request for the same handle.

DEVQUEUE.C
The routines in this file manage the Device Queue object used for processing
alternate IRPs.

XxAlternateStartPacket Given an IRP, this function either sends it to the
alternate Start 1/0 routine or queues it for later processing if the alternate path is
busy. In many ways, this function resembles the 1/0 Manager 's IoStartPacket
routine.

VOID
XxAlternateStartPacket(
    IN PDEVICE_OBJECT DeviceObject,
    IN PIRP Irp,
    IN PDRIVER_CANCEL CancelFunction
    )
{
    KIRQL OldIrql;
    PDEVICE_EXTENSION DeviceExtension =
        DeviceObject->DeviceExtension;

    IoAcquireCancelSpinLock( &OldIrql );              // (1)
    IoSetCancelRoutine( Irp, CancelFunction );
    if( KeInsertDeviceQueue(                          // (2)
            &DeviceExtension->AlternateIrpQueue,
            &Irp->Tail.Overlay.DeviceQueueEntry ))
        IoReleaseCancelSpinLock( OldIrql );
    else                                              // (3)
    {
        DeviceExtension->CurrentAlternateIrp = Irp;   // (4)
        IoReleaseCancelSpinLock( OldIrql );

        KeRaiseIrql( DISPATCH_LEVEL, &OldIrql );      // (5)
        XxAlternateStartIo( DeviceObject, Irp );
        KeLowerIrql( OldIrql );
    }
}

(1) It's necessary to be holding the Cancel spin lock in order to modify the
IRP's CancelRoutine field. This driver also uses the Cancel spin lock to
guard the alternate IRP queue and the pointer to the alternate IRP
currently being processed.
(2) Try to put the IRP into the alternate queue. If the Device Queue object was
already busy, the IRP will be inserted and the driver will simply release
the Cancel spin lock.
(3) If the Device Queue was not-busy, KeInsertDeviceQueue will fail, and
the Device Queue will flip into the busy state. In that case, it's necessary
to start processing the IRP.
(4) The first step is to record the IRP as the current alternate IRP. Once this is
done, it's safe to release the Cancel spin lock.
(5) The next step is to call the alternate Start I/O routine. Since
XxAlternateStartPacket runs at PASSIVE_LEVEL IRQL, and the alternate Start I/O
routine runs at DISPATCH_LEVEL, it's necessary for requests to raise and
lower the CPU's IRQL value.
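
The busy/not-busy behavior that the notes above rely on can be tried out in user mode. What follows is only a minimal sketch, not kernel code: DeviceQueueModel, dq_init, dq_insert, and dq_remove are hypothetical names invented for this illustration, requests are plain integers rather than IRPs, and the fixed capacity is an assumption made to keep the example short.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DQ_CAPACITY 8   /* hypothetical fixed capacity for this sketch */

/* User-mode stand-in for a Device Queue object. */
typedef struct {
    bool   busy;                  /* true while a request is in progress */
    int    pending[DQ_CAPACITY];  /* FIFO ring of waiting request ids    */
    size_t head;                  /* index of the oldest waiting request */
    size_t count;                 /* number of waiting requests          */
} DeviceQueueModel;

void dq_init(DeviceQueueModel *q)
{
    q->busy = false;
    q->head = 0;
    q->count = 0;
}

/* Modeled on the behavior described for KeInsertDeviceQueue: if the
 * queue is not busy, nothing is queued, the queue becomes busy, and the
 * call returns false so the caller starts the request itself. If the
 * queue is busy, the request is queued and the call returns true. */
bool dq_insert(DeviceQueueModel *q, int request)
{
    if (!q->busy) {
        q->busy = true;
        return false;
    }
    assert(q->count < DQ_CAPACITY);
    q->pending[(q->head + q->count) % DQ_CAPACITY] = request;
    q->count++;
    return true;
}

/* Modeled on the behavior described for KeRemoveDeviceQueue: returns the
 * oldest waiting request, or -1 (standing in for NULL) after marking the
 * queue not busy when nothing is waiting. */
int dq_remove(DeviceQueueModel *q)
{
    if (q->count == 0) {
        q->busy = false;
        return -1;
    }
    int request = q->pending[q->head];
    q->head = (q->head + 1) % DQ_CAPACITY;
    q->count--;
    return request;
}
```

In this model the first call to dq_insert on an idle queue returns false, which corresponds to the else branch in XxAlternateStartPacket where the driver starts the IRP itself.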
XxAlternateStartNextPacket This routine does the same job as the I/O
Manager's IoStartNextPacket function. If there is an available IRP in the queue of
pending alternate IRPs, this function sends it to the alternate Start I/O entry
point. This piece of code expects to run at DISPATCH_LEVEL IRQL only.

VOID
XxAlternateStartNextPacket(
    IN PDEVICE_OBJECT DeviceObject,
    IN BOOLEAN Cancelable
    )
{
    PDEVICE_EXTENSION DeviceExtension =
        DeviceObject->DeviceExtension;
    PKDEVICE_QUEUE_ENTRY QueueEntry;
    PIRP Irp;
    KIRQL OldIrql;

    if( Cancelable )
        IoAcquireCancelSpinLock( &OldIrql );          // (1)

    QueueEntry =
        KeRemoveDeviceQueue(
            &DeviceExtension->AlternateIrpQueue );    // (2)
    if( QueueEntry != NULL )
    {
        Irp = CONTAINING_RECORD(                      // (3)
                QueueEntry,
                IRP,
                Tail.Overlay.DeviceQueueEntry );
        DeviceExtension->CurrentAlternateIrp = Irp;
        if( Cancelable )
            IoReleaseCancelSpinLock( OldIrql );       // (4)
        XxAlternateStartIo( DeviceObject, Irp );
    }
    else
    {
        DeviceExtension->CurrentAlternateIrp = NULL;  // (5)
        if( Cancelable )
            IoReleaseCancelSpinLock( OldIrql );
    }
}

(1) In imitation of the I/O Manager's routine, this function uses an explicit
argument to decide whether the whole operation should be protected by
the Cancel spin lock. Since this driver always attaches a Cancel routine to
an alternate IRP, this argument will always be TRUE.
(2) Try to get the next pending IRP from the alternate Device Queue. If the
queue was empty, KeRemoveDeviceQueue sets the Device Queue's state
to not-busy and returns NULL.
(3) There was something in the queue. Reconstitute the address of the IRP
itself and make it the new current IRP for the alternate path.
(4) If necessary, let go of the Cancel spin lock, then call the driver's alternate
Start I/O entry point.
(5) If the queue was empty, the only work to do is to clear out the current-IRP
slot for the alternate path and drop the Cancel spin lock.
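
Reconstituting the address of an enclosing structure from a pointer to one of its members is plain pointer arithmetic. The sketch below rebuilds the idea with the standard offsetof macro. CONTAINER_OF, QueueEntry, Request, and request_from_entry are hypothetical names used only for illustration; they are not the DDK's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Same idea as CONTAINING_RECORD: step back from the address of a member
 * to the address of the structure that contains it. */
#define CONTAINER_OF(ptr, type, field) \
    ((type *)((char *)(ptr) - offsetof(type, field)))

/* Hypothetical queue linkage embedded inside a larger request. */
typedef struct QueueEntry {
    struct QueueEntry *next;
} QueueEntry;

/* Hypothetical request structure standing in for an IRP. */
typedef struct {
    int        id;
    long       status;
    QueueEntry entry;   /* the member that is linked into a queue */
} Request;

/* Given the embedded entry handed back by a queue, recover the request. */
Request *request_from_entry(QueueEntry *e)
{
    assert(e != NULL);
    return CONTAINER_OF(e, Request, entry);
}
```

Because the queue stores only the embedded entry, no separate lookup table is needed to get from a queue entry back to its request.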


INPUT.C
In this driver, IRP_MJ_READ requests are sent down the alternate path. This
file contains routines that process these alternate IRPs. You'll find similar code for
handling IRP_MJ_WRITE requests in OUTPUT.C.

XxAlternateStartIo Like any Start I/O routine, this one is responsible for
setting up various bookkeeping values and then starting the actual device
operation.

VOID
XxAlternateStartIo(
    IN PDEVICE_OBJECT DeviceObject,
    IN PIRP Irp
    )
{
    KIRQL OldIrql;
    PIO_STACK_LOCATION IrpStack =
        IoGetCurrentIrpStackLocation( Irp );
    PDEVICE_EXTENSION DeviceExtension =
        DeviceObject->DeviceExtension;

    IoAcquireCancelSpinLock( &OldIrql );              // (1)
    if( Irp->Cancel )
    {
        IoReleaseCancelSpinLock( OldIrql );           // (2)
        return;
    }
    else                                              // (3)
    {
        IoSetCancelRoutine( Irp, NULL );
        IoReleaseCancelSpinLock( OldIrql );
    }

    switch( IrpStack->MajorFunction )
    {
        case IRP_MJ_READ:                             // (4)
            DeviceExtension->InputBytesRequested =
                IrpStack->Parameters.Read.Length;
            DeviceExtension->InputBytesRemaining =
                DeviceExtension->InputBytesRequested;
            DeviceExtension->pInputBuffer =
                Irp->AssociatedIrp.SystemBuffer;
            if( !KeSynchronizeExecution(              // (5)
                    DeviceExtension->pInterrupt,
                    XxReceiveBytes,
                    DeviceExtension ))
                XxDpcForInputs(
                    NULL,
                    DeviceObject,
                    Irp,
                    DeviceExtension );
            break;

        //
        // Should never get here -- just get rid
        // of the packet...
        //
        default:
            //
            // Fail the IRP and start the next one.
            //
            Irp->IoStatus.Status =
                STATUS_NOT_SUPPORTED;
            Irp->IoStatus.Information = 0;
            IoCompleteRequest(
                Irp,
                IO_NO_INCREMENT );
            XxAlternateStartNextPacket(
                DeviceObject,
                TRUE );
            break;
    }
}

(1) Before starting the operation, see if the Cancel routine has run between
the time the IRP was removed from the Device Queue and now. This
requires ownership of the Cancel spin lock.
(2) If the Cancel flag is set, it means the IRP has already been processed by
the Cancel routine. In this case, the only thing to do is to release the spin
lock and return immediately.
(3) The Cancel flag is clear. Remove the IRP from the cancelable state by
setting its Cancel routine to NULL, then start to process it. From this point
on, only normal completion or an error can stop this request.
(4) Set up various pointers and counters in preparation for the data transfer
operation.
(5) Next, start the device. If something goes wrong, use the DPC routine to fail
the IRP. Pass NULL for the DPC object argument to let the DPC routine
know that it's been called early and not as part of a normal I/O completion.

XxDpcForInputs Here's the CustomDpc routine used for inputs. It does
the usual work of putting a final status in the IRP, completing the current request,
and trying to start another.

VOID
XxDpcForInputs(
    IN PKDPC Dpc,
    IN PDEVICE_OBJECT DeviceObject,
    IN PIRP Irp,
    IN PVOID Context
    )
{
    PDEVICE_EXTENSION DeviceExtension = Context;

    Irp->IoStatus.Information =                       // (1)
        DeviceExtension->InputBytesRequested -
        DeviceExtension->InputBytesRemaining;
    Irp->IoStatus.Status = STATUS_SUCCESS;            // (2)

    if( Dpc == NULL )                                 // (3)
        IoCompleteRequest( Irp, IO_NO_INCREMENT );
    else
        IoCompleteRequest( Irp, IO_SERIAL_INCREMENT );
    XxAlternateStartNextPacket( DeviceObject, TRUE ); // (4)
}

(1) Calculate the number of bytes actually transferred.
(2) Come up with a final status code for the IRP. A real driver would
probably use the last recorded contents of the device's status register (stored in
the Device Extension) to produce a real status value.
(3) If this routine is being called directly from the alternate Start I/O routine,
the DPC argument will be NULL. This means the IRP is being failed before
it got started. In that case, don't give the calling thread a priority boost.
(4) This request is done. Use a driver-defined routine to start the next
alternate IRP (if there is one).

ISR.C
This file contains the interrupt service code for the UART driver. To make
things a little more readable, processing for input events happens in some
auxiliary subroutines.

XxIsr The Kernel's interrupt dispatcher calls this function at DIRQL,
holding the Interrupt spin lock for the device. Since the UART can request multiple
kinds of interrupts at the same time, XxIsr has to keep checking for possible
interrupts until nothing more shows up.

BOOLEAN
XxIsr(
    IN PKINTERRUPT Interrupt,
    IN PVOID ServiceContext
    )
{
    PDEVICE_EXTENSION pDE = ServiceContext;
    PDEVICE_OBJECT pDevice = pDE->DeviceObject;
    UCHAR InterruptId = XxReadIntId( pDE );

    if(( InterruptId & XX_IIR_NO_INTERRUPT ) != 0 )   // (1)
        return FALSE;

    do
    {
        InterruptId &= XX_IIR_INTERRUPT_ID_MASK;      // (2)
        switch( InterruptId )
        {
            case XX_IIR_ERR:
                XxReadLineStatus( pDE );              // (3)
                break;
            case XX_IIR_RDA:
                XxHandleInputFifoTrigger( pDE );      // (4)
                break;
            case XX_IIR_FIFO_TMO:
                XxHandleInputFifoTimeOut( pDE );      // (5)
                break;
            case XX_IIR_TBE:
                if( pDE->OutputInterruptsValid )      // (6)
                    if( !XxTransmitBytes( pDE ))
                        IoRequestDpc(
                            pDevice,
                            pDevice->CurrentIrp,
                            (PVOID)pDE );
                break;
            case XX_IIR_RS232:
                XxReadModemStatus( pDE );             // (7)
                break;
        }
        InterruptId = XxReadIntId( pDE );             // (8)
    } while(( InterruptId &
              XX_IIR_NO_INTERRUPT ) == 0 );
    return TRUE;
}

(1) If the low-order bit of the Interrupt ID register is set, then this device
didn't generate an interrupt. Return control to the Kernel's interrupt
dispatcher.
(2) The UART interrupted. Enter a loop that will keep processing interrupts
until there's nothing left to do. Begin by masking out any irrelevant bits,
then switch on the interrupt-type.
(3) This driver doesn't process any device errors. Just read the status register
to clear the pending interrupt.
(4) This interrupt means that the input FIFO hit its trigger level. Call a helper
routine to get the input characters from the FIFO.
(5) This interrupt means there's been a little data (less than the trigger level)
sitting and aging in the input FIFO. For this driver, that's a signal to end
an input operation. Call a helper function to empty the FIFO and
complete the IRP.
(6) During an output operation, this interrupt means that it's time to refill the
output FIFO and send more data. The interrupt-valid flag in the Device
Extension prevents the driver from responding to spurious Transmit
Buffer Empty interrupts when no output request is being processed.
(7) This driver ignores modem events, but it's still necessary to read the
Modem Status register in order to clear the interrupt.
(8) That ends the processing for the first UART interrupt. There might be
more waiting in line behind it. Read the Interrupt ID register to get the
next one and do the whole thing over again. If there is no other interrupt
pending in the UART, drop out of the loop and return.
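
The "keep polling until the device reports nothing pending" structure of the ISR can be exercised against a simulated device. This is a hedged sketch only. The IIR_* values below are the usual 16550-style encodings and are an assumption, since the listing never shows the values of the XX_IIR_* constants. UartModel, read_int_id, and model_isr are hypothetical names, and the handlers are reduced to counters.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed 16550-style Interrupt ID encodings (not taken from the driver). */
#define IIR_NO_INTERRUPT 0x01   /* bit 0 set: nothing pending */
#define IIR_ID_MASK      0x0E
#define IIR_ERR          0x06
#define IIR_RDA          0x04
#define IIR_FIFO_TMO     0x0C
#define IIR_TBE          0x02
#define IIR_RS232        0x00

/* Simulated UART: a scripted list of pending interrupt ids. */
typedef struct {
    uint8_t pending[8];   /* ids the device will report, in order   */
    int     count;        /* how many entries of pending[] are used */
    int     next;         /* index of the next id to report         */
    int     rx;           /* input events serviced                  */
    int     tx;           /* output events serviced                 */
    int     status_reads; /* error/modem events cleared             */
} UartModel;

static uint8_t read_int_id(const UartModel *u)
{
    if (u->next >= u->count)
        return IIR_NO_INTERRUPT;
    return u->pending[u->next];
}

/* Returns 1 if the device was interrupting, 0 if it was not ours. */
int model_isr(UartModel *u)
{
    uint8_t id = read_int_id(u);

    assert(u->count <= 8);
    if (id & IIR_NO_INTERRUPT)
        return 0;

    do {
        id &= IIR_ID_MASK;
        switch (id) {
        case IIR_RDA:
        case IIR_FIFO_TMO:
            u->rx++;
            break;
        case IIR_TBE:
            u->tx++;
            break;
        case IIR_ERR:
        case IIR_RS232:
            u->status_reads++;
            break;
        default:
            break;
        }
        u->next++;              /* servicing clears this interrupt */
        id = read_int_id(u);    /* is anything else waiting?       */
    } while ((id & IIR_NO_INTERRUPT) == 0);

    return 1;
}
```

A second call to model_isr after the scripted interrupts are drained returns 0, which corresponds to the early "return FALSE" at callout (1).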

XxHandleInputFifoTrigger This function is called by XxIsr during an
input operation to get the next bunch of characters from the UART.

static VOID
XxHandleInputFifoTrigger(
    IN PDEVICE_EXTENSION pDE
    )
{
    ULONG i;
    //
    // Read one less than the number of bytes in
    // the FIFO; this guarantees a FIFO time-out
    // which will end the read request.
    //
    for( i = 0; i < pDE->InputFifoSize - 1; i++ )     // (1)
    {
        if( pDE->InputInterruptsValid &&
            pDE->InputBytesRemaining > 0 )            // (2)
        {
            *pDE->pInputBuffer++ =
                XxReadDataBuffer( pDE );
            pDE->InputBytesRemaining--;
        }
        else XxReadDataBuffer( pDE );
    }
}

(1) This loop reads one less than the number of bytes in the FIFO. This last
lonely byte, pining away in the FIFO, will eventually generate a FIFO
timeout interrupt and terminate the input operation.
(2) If an input operation is in progress, and if there's room left in the buffer,
move a byte from the FIFO to the input buffer. Otherwise, drop the byte
on the floor. This behavior throws away both excess characters and
unsolicited input.

XxHandleInputFifoTimeOut This function is called from XxIsr when
some bytes have been languishing in the input FIFO for more than four character
periods. In this driver, the FIFO timeout interrupt signals the end of an input
operation.

static VOID
XxHandleInputFifoTimeOut(
    IN PDEVICE_EXTENSION pDE
    )
{
    while( XxReadLineStatus( pDE ) & XX_LSR_DATA_RDY ) // (1)
    {
        if( pDE->InputInterruptsValid &&
            pDE->InputBytesRemaining > 0 )
        {
            *pDE->pInputBuffer++ =
                XxReadDataBuffer( pDE );
            pDE->InputBytesRemaining--;
        }
        else XxReadDataBuffer( pDE );
    }

    if( pDE->InputInterruptsValid )                   // (2)
    {
        pDE->InputInterruptsValid = FALSE;
        KeInsertQueueDpc(                             // (3)
            &pDE->AlternateDpc,
            (PVOID)pDE->CurrentAlternateIrp,
            (PVOID)pDE );
    }
}

(1) Read bytes from the FIFO until it's empty. If this is a genuine input
operation and there's still some room left in the buffer, store the bytes.
Otherwise, drop them on the floor.
(2) If this was a spurious interrupt, there's nothing more to do. If an input
operation really was in progress, clear the interrupt-valid flag (so
additional interrupts will be ignored). Then complete the current input IRP.
(3) Input operations use a CustomDpc routine to complete the IRP.

CANCEL.C
This file contains routines that support IRP cancellation.

XxCleanupDeviceQueue This function is called by the driver's Cleanup
Dispatch routine. Its job is to cancel any IRPs in a Device Queue whose File object
pointer matches the one passed as an argument.

VOID
XxCleanupDeviceQueue(
    IN PKDEVICE_QUEUE DeviceQueue,
    IN PFILE_OBJECT FileObject
    )
{
    KIRQL OldIrql;
    PIRP CancelIrp;
    PIRP RequeueIrp;
    PIO_STACK_LOCATION CancelIrpStack;
    LIST_ENTRY CancelList;
    LIST_ENTRY RequeueList;
    PLIST_ENTRY ListHead;
    PKDEVICE_QUEUE_ENTRY QueueEntry;

    InitializeListHead( &CancelList );                // (1)
    InitializeListHead( &RequeueList );

    IoAcquireCancelSpinLock( &OldIrql );
    if( IsListEmpty( &DeviceQueue->DeviceListHead ))  // (2)
    {
        IoReleaseCancelSpinLock( OldIrql );
        return;
    }

    while(( QueueEntry =
                KeRemoveDeviceQueue(
                    DeviceQueue )) != NULL )          // (3)
    {
        CancelIrp =
            CONTAINING_RECORD(
                QueueEntry,
                IRP,
                Tail.Overlay.DeviceQueueEntry );

        CancelIrpStack =
            IoGetCurrentIrpStackLocation( CancelIrp );
        if( CancelIrpStack->FileObject == FileObject ) // (4)
        {
            CancelIrp->Cancel = TRUE;
            CancelIrp->CancelIrql = OldIrql;
            CancelIrp->CancelRoutine = NULL;
            InsertTailList(
                &CancelList,
                &CancelIrp->Tail.Overlay.ListEntry );
        }
        else                                          // (5)
        {
            InsertTailList(
                &RequeueList,
                &CancelIrp->Tail.Overlay.ListEntry );
        }
    }

    while( !IsListEmpty( &RequeueList ))              // (6)
    {
        ListHead = RemoveHeadList( &RequeueList );
        RequeueIrp =
            CONTAINING_RECORD(
                ListHead,
                IRP,
                Tail.Overlay.ListEntry );
        if( !KeInsertDeviceQueue(
                DeviceQueue,
                &RequeueIrp->
                    Tail.Overlay.DeviceQueueEntry ))
        {
            KeInsertDeviceQueue(
                DeviceQueue,
                &RequeueIrp->
                    Tail.Overlay.DeviceQueueEntry );
        }
    }

    //
    // Then release the Cancel spin lock
    //
    IoReleaseCancelSpinLock( OldIrql );

    //
    // Run the length of the holding queue and
    // complete every IRP that we found in it.
    //
    while( !IsListEmpty( &CancelList ))               // (7)
    {
        ListHead = RemoveHeadList( &CancelList );
        CancelIrp =
            CONTAINING_RECORD(
                ListHead,
                IRP,
                Tail.Overlay.ListEntry );
        CancelIrp->IoStatus.Status =
            STATUS_CANCELLED;
        CancelIrp->IoStatus.Information = 0;

        IoCompleteRequest(
            CancelIrp,
            IO_NO_INCREMENT );
    }
}

(1) These temporary work-lists will hold IRPs that are chosen for cancellation
and for requeuing. The list-heads need to be initialized. It's also necessary to
acquire the Cancel spin lock and hold it until all the IRPs in the Device
Queue have been processed.
(2) See if there are any IRPs in the Device Queue. If it's empty, there's no
work to do, so just quit.
(3) Loop until every IRP has been removed from the Device Queue. For each
IRP, decide whether to cancel it or requeue it. At the end of this loop, the
Device Queue has been emptied, hence its state will be Not Busy.
(4) If the IRP's File object pointer is the same as the one in the
IRP_MJ_CLEANUP IRP, set the IRP's various CancelXxx fields. Then put
the IRP into a holding queue of requests to be canceled.
(5) If the File object pointer doesn't match, this IRP should not be canceled. In
that case, add it to the list of IRPs to be put back in the Device Queue.
        DeviceExtension;

    if( Irp == DeviceExtension->CurrentAlternateIrp ) // (1)
    {
        IoReleaseCancelSpinLock( Irp->CancelIrql );   // (2)
        Irp->IoStatus.Status = STATUS_CANCELLED;
        Irp->IoStatus.Information = 0;
        IoCompleteRequest( Irp, IO_NO_INCREMENT );
        XxAlternateStartNextPacket(                   // (3)
            DeviceObject,
            TRUE );
    }
    else                                              // (4)
    {
        KeRemoveEntryDeviceQueue(
            &DeviceExtension->AlternateIrpQueue,
            &Irp->Tail.Overlay.DeviceQueueEntry );
        IoReleaseCancelSpinLock( Irp->CancelIrql );
        Irp->IoStatus.Status = STATUS_CANCELLED;
        Irp->IoStatus.Information = 0;
        //
        // Complete this IRP, but don't start the
        // next one.
        //
        IoCompleteRequest( Irp, IO_NO_INCREMENT );
    }
}

(1) If the IRP is already in the CurrentAlternateIrp slot, but not yet started, it
can still be canceled.
(2) Release the Cancel spin lock before completing the IRP. Notice that the I/O
Manager has loaded the CancelIrql field of the IRP with the IRQL to
which the driver should return when it releases the lock.
(3) Since the current alternate IRP has been removed, it's necessary to see if
another one is waiting in the wings.
(4) The IRP wasn't current, so it must still be sitting in the Device Queue.
Simply remove it from the queue and complete it. In this case, the driver
doesn't try to start the next IRP.

7 CANCEL.C contains a similar function for canceling IRP_MJ_WRITE IRPs.

11.7 SUMMARY
This chapter has presented a slightly different driver architecture that allows you
to process more than one IRP at a time. Implementing this architecture required
that we set up a Device Queue object to hold alternate IRPs. CustomDpc and
Cancel routines also proved helpful, although their usefulness goes far beyond
full-duplex drivers.
So much for drivers that manage Programmed I/O devices. The next step is
to see what kind of support NT provides for DMA hardware. That will be the
subject of the coming chapter.

CHAPTER 12

DMA Drivers

One way or another, all the drivers we've seen so
far have depended on the CPU to move data between memory and the peripheral
device. This technique is fine for slower hardware, but for fast devices that
transfer large amounts of data, it would introduce too much overhead. Such devices
are usually capable of directly accessing system memory and transferring data
without the CPU's intervention. This chapter explains how to write drivers for
these kinds of devices.

12.1 HOW DMA WORKS UNDER WINDOWS NT
As you saw in Chapter 1, insulating drivers from CPU- and platform-dependencies
was a major design goal of the NT I/O subsystem. One way that NT does this
is by using an abstract model of DMA operations. Drivers that perform DMA
work within the framework of this abstract model and can ignore many of the
hardware-specific aspects of what's going on. This section presents the major
features of the NT DMA framework.

Hiding DMA Hardware Variations with Adapter Objects
The purpose of using DMA is to minimize the CPU's involvement in data
transfer operations. To do this, DMA devices use an auxiliary processor (called a
DMA controller) to move data between memory and the peripheral device. This
allows the CPU to continue doing other useful work in parallel with the I/O
operation.
Although the exact details will vary, most DMA controllers have a very
similar architecture. In its simplest form, this consists of an address register for the
starting address of the DMA buffer, and a count register for the number of bytes
or words to transfer. When you set these registers and start the attached device,
the DMA controller begins moving data on its own. With each transfer, it
increments the memory address register and decrements the count register. When the
count register empties out, the DMA controller generates an interrupt, and the
device is ready for another transfer.
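
The address-register and count-register cycle just described can be modeled in a few lines. This is only a user-mode sketch of the idea, not a description of any real controller: DmaControllerModel, dma_program, and dma_run_input are hypothetical names, memory is a small byte array, and the transfer is assumed to be one byte wide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Minimal model of the two registers described above. */
typedef struct {
    uint32_t address;    /* next memory location to access       */
    uint32_t count;      /* transfers still to be performed      */
    bool     interrupt;  /* raised when the count register empties */
} DmaControllerModel;

/* "Set these registers": load the starting address and the count. */
void dma_program(DmaControllerModel *c, uint32_t start, uint32_t count)
{
    assert(count > 0);
    c->address = start;
    c->count = count;
    c->interrupt = false;
}

/* Device-to-memory transfer: for each byte, store it, increment the
 * address register, and decrement the count register. When the count
 * reaches zero, signal completion with an interrupt flag. */
void dma_run_input(DmaControllerModel *c, uint8_t *memory,
                   const uint8_t *device_data)
{
    while (c->count > 0) {
        memory[c->address] = *device_data++;
        c->address++;
        c->count--;
    }
    c->interrupt = true;
}
```

Note that the model only ever generates consecutive addresses, which is exactly the property that causes trouble once buffers are scattered across physical pages.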
Unfortunately, the needs of real-world hardware design complicate this
simple picture. Consider the DMA implementation on ISA-based machines,
described back in Chapter 2. These systems use a pair of Intel 8237 controller
chips cascaded to provide four primary and three secondary DMA data channels.
The primary channels (identified as zero through three) can perform single-byte
transfers, while the secondary channels (five through seven) always transfer two
bytes at a time. Since the 8237 uses a 16-bit transfer counter, the primary and
secondary channels can handle only 64K bytes or 128K bytes per operation,
respectively. Due to limitations of the ISA architecture, the DMA buffer must be located
in the first sixteen megabytes of physical memory.
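
The 64K and 128K figures follow directly from the counter width: a 16-bit counter allows 65,536 transfers, and each transfer moves one byte on the primary channels or two on the secondary ones. The sketch below works through that arithmetic and the sixteen-megabyte limit. The function names are hypothetical, and the check deliberately ignores any other constraints a real ISA DMA channel may impose.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define ISA_DMA_COUNTER_BITS  16
#define ISA_DMA_ADDRESS_LIMIT (16u * 1024u * 1024u)   /* first 16 MB */

/* Largest single transfer, in bytes, for an ISA DMA channel.
 * Channels 0-3 move one byte per transfer; channels 5-7 move two.
 * Channel 4 is excluded because it is not one of the data channels
 * listed in the text. */
uint32_t isa_dma_max_bytes(int channel)
{
    assert(channel >= 0 && channel <= 7 && channel != 4);
    uint32_t width = (channel <= 3) ? 1u : 2u;
    return (1u << ISA_DMA_COUNTER_BITS) * width;
}

/* Does a physically contiguous buffer satisfy the two limits above? */
bool isa_dma_buffer_ok(int channel, uint32_t phys_start, uint32_t length)
{
    if (length == 0 || length > isa_dma_max_bytes(channel))
        return false;
    if (phys_start >= ISA_DMA_ADDRESS_LIMIT)
        return false;
    /* The whole buffer, not just its start, must sit below 16 MB. */
    return length <= ISA_DMA_ADDRESS_LIMIT - phys_start;
}
```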
Contrast this with the DMA architecture used by EISA systems. The Intel
82357 EISA I/O controller extends ISA capabilities by supporting one-, two-, or
four-byte transfers on any DMA channel, as well as allowing DMA buffers to be
located anywhere in a 32-bit address space. In addition, EISA introduces three
new DMA bus-cycle formats (known as types A, B, and C) that give peripheral
designers the ability to work with faster devices.
Even on the same ISA or EISA bus, different devices can use different DMA
techniques. Remember the discussion of DMA slaves and bus masters from
Chapter 2. Slave devices compete for shareable system DMA hardware on the
motherboard, while bus masters avoid bottlenecks by using their own built-in DMA
controllers.
The problem with all this variety is that it tends to make DMA drivers very
platform dependent. To avoid this trap, NT drivers don't manipulate DMA
hardware directly. Instead, they work with an abstract representation of the hardware in
the form of an NT Adapter object. Chapter 4 briefly introduced these objects and
said they help with orderly sharing of system DMA resources. It turns out that
Adapter objects also simplify the task of writing platform-independent drivers by
hiding many of the details of setting up the DMA hardware. The rest of this section
will explain more about what Adapter objects do and how to use them in a driver.

Solving the Scatter/Gather Problem with Mapping Registers
Although virtual memory simplifies the lives of application developers, it
introduces two major complications for DMA-based drivers. The first problem is
that the buffer address passed to the I/O Manager is a virtual address. Since the
DMA controller works with physical addresses, DMA drivers need some way to
determine the physical pages making up a virtual buffer. You'll see how this
works when we look at Memory Descriptor Lists in the next section.
The other problem (illustrated in Figure 12.1) is that a process doesn't
necessarily occupy consecutive pages of physical memory, and what appears to be a
contiguous buffer in virtual space is probably scattered throughout physical
memory. The NT Virtual Memory Manager uses the platform's address
translation hardware (represented by a generic page table in the diagram) to give the
process the illusion of a single, unbroken virtual address space. Unfortunately, the
DMA controller doesn't participate in this illusion.
Since most DMA controllers can only generate sequential physical
addresses, buffers that span virtual page boundaries present a serious challenge.
Consider what happens if a DMA controller starts at the top of a multi-page
buffer and simply increments its way through successive pages of physical
memory. It's unlikely that any page after the first will actually correspond to one of the
caller's virtual buffer pages. In fact, the pages touched by the DMA controller
probably won't even belong to the process issuing the I/O request.
All virtual memory systems have to deal with the problem of scattering
and gathering physical buffer pages during a DMA operation. Support for
scatter/gather capabilities can come either from system DMA hardware or from
hardware built into a smart bus master device. Once again, NT tries to simplify
things by presenting drivers with a unified, abstract view of whatever
scatter/gather hardware happens to exist on the system. This model consists of a
contiguous range of addresses (called logical space) used by the DMA hardware and a
set of mapping registers that translate logical space addresses into physical space
addresses.

Figure 12.1 Address spaces involved in DMA operations (diagram relating
Virtual Space, Physical Space, and Logical Space; Copyright © 1994 by Cydonix
Corporation)

Here's how it works. Referring to Figure 12.1, each mapping register
corresponds to one page of DMA logical space, and a group of consecutively
numbered registers represents a contiguous range of logical addresses. To perform a
DMA transfer, a driver first allocates enough contiguous mapping registers to
account for all the pages in the caller's buffer. It then loads consecutive mapping
registers with the physical addresses of the caller's buffer pages. This has the
effect of mapping the physically noncontiguous user buffer into a contiguous area
of logical space. Finally, the driver loads the DMA controller with the starting
address of the buffer in logical space and starts the device. While the operation is
in progress, the DMA controller generates sequential, logical addresses that the
scatter/gather hardware maps to appropriate physical page references.
So much for the conceptual view of mapping registers. Like the DMA
controller, the actual implementation depends on the platform, the bus, and the I/O
device. To minimize the driver's awareness of these details, NT lumps the
mapping registers into the Adapter object and provides a set of routines for managing
them.
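
To make the conceptual view concrete, here is a small user-mode model of mapping registers. It is a sketch under stated assumptions, not NT's implementation: the 4096-byte page size, the sixteen-register bank, and the names MapRegisterModel, map_buffer, and logical_to_physical are all invented for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_MODEL 4096u   /* assumed page size for this sketch */
#define MAP_REGISTERS   16u     /* assumed size of the register bank */

/* Each mapping register holds the physical base address of the page
 * that backs one page of logical space. */
typedef struct {
    uint32_t phys_page[MAP_REGISTERS];
} MapRegisterModel;

/* Load consecutive registers, starting at first_reg, with the physical
 * pages of a buffer. Returns the buffer's starting logical address,
 * which is what the DMA controller would be programmed with. */
uint32_t map_buffer(MapRegisterModel *m, uint32_t first_reg,
                    const uint32_t *pages, uint32_t page_count,
                    uint32_t byte_offset)
{
    uint32_t i;

    assert(first_reg + page_count <= MAP_REGISTERS);
    assert(byte_offset < PAGE_SIZE_MODEL);
    for (i = 0; i < page_count; i++)
        m->phys_page[first_reg + i] = pages[i];
    return first_reg * PAGE_SIZE_MODEL + byte_offset;
}

/* The translation performed on each sequential logical address
 * generated by the DMA controller. */
uint32_t logical_to_physical(const MapRegisterModel *m, uint32_t logical)
{
    uint32_t reg = logical / PAGE_SIZE_MODEL;

    assert(reg < MAP_REGISTERS);
    return m->phys_page[reg] + logical % PAGE_SIZE_MODEL;
}
```

Stepping a logical address across a page boundary in this model lands on whatever physical page the next register names, so the controller sees one contiguous buffer even though the pages are scattered.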

Managing I/O Buffers with Memory Descriptor Lists
As you've just seen, loading physical addresses into mapping registers is an
important part of setting up a DMA transfer. To make this process easier, the I/O
Manager uses a structure called a Memory Descriptor List (MDL). An MDL keeps
track of the physical pages associated with a virtual buffer. The buffer described
by an MDL can be in either user- or system-address space.
Direct I/O operations are one place where MDLs play a major role. If a
Device object has the DO_DIRECT_IO bit set in its Flags field, the I/O Manager
automatically builds an MDL describing the caller's buffer each time an I/O
request is sent to the device. It stores the address of this MDL in the IRP's
MdlAddress field, and the driver uses it to prepare the DMA hardware for a transfer.
As you can see from Figure 12.2, the MDL consists of a header describing the
virtual buffer, followed by an array that lists the physical pages associated with
the buffer. Given a virtual address within the buffer, it's possible to determine the
corresponding physical page. Some of the fields in the header help clarify the use
of an MDL.

StartVa and ByteOffset The StartVa field contains the address of the
buffer described by the MDL, rounded down to the nearest virtual page
boundary. Since the buffer doesn't necessarily start on a page boundary, the ByteOffset
field specifies the distance from this page boundary to the actual beginning of the
buffer. Keep in mind that if the buffer is in user space, your driver can use the
StartVa field to calculate indexes into the MDL but not as an actual address
pointer.
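
The relationship between a buffer address, StartVa, and ByteOffset is simple arithmetic, sketched below in user-mode C. This is an illustration only and does not touch a real MDL, which must be accessed through system support functions. MdlHeaderModel, describe_buffer, and page_index are hypothetical names, the 4096-byte page size is an assumption, and page_count is a derived convenience value rather than a field named in the text.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_MODEL 4096u   /* assumed page size for this sketch */

/* Hypothetical stand-in for the header fields discussed above. */
typedef struct {
    uintptr_t start_va;     /* buffer address rounded down to a page */
    uint32_t  byte_offset;  /* distance from start_va to the buffer  */
    uint32_t  byte_count;   /* length of the buffer in bytes         */
    uint32_t  page_count;   /* derived: pages the buffer touches     */
} MdlHeaderModel;

MdlHeaderModel describe_buffer(uintptr_t va, uint32_t length)
{
    MdlHeaderModel h;

    assert(length > 0);
    h.start_va = va & ~(uintptr_t)(PAGE_SIZE_MODEL - 1);
    h.byte_offset = (uint32_t)(va - h.start_va);
    h.byte_count = length;
    /* Round up: a buffer that starts mid-page can spill into an
     * extra page even when it is shorter than one page. */
    h.page_count = (uint32_t)(((uint64_t)h.byte_offset + length +
                               PAGE_SIZE_MODEL - 1) / PAGE_SIZE_MODEL);
    return h;
}

/* Index into the physical page array for an address inside the buffer.
 * This is the "calculate indexes" use of StartVa mentioned above. */
uint32_t page_index(const MdlHeaderModel *h, uintptr_t va)
{
    assert(va >= h->start_va + h->byte_offset);
    assert(va < h->start_va + h->byte_offset + h->byte_count);
    return (uint32_t)((va - h->start_va) / PAGE_SIZE_MODEL);
}
```

In the example exercised by the test, a 0x300-byte buffer that begins 0xF00 bytes into a page spans two pages, so two physical page entries would be needed to describe it.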

Figure 12.2 Structure of a Memory Descriptor List (MDL) (diagram showing the
header fields ByteOffset, ByteCount, Size, Process, and MappedSystemVa,
followed by the entries Phys Addr 1 through Phys Addr 3, alongside virtual
space and physical memory; Copyright © 1996 by Cydonix Corporation)

MappedSystemVa If the buffer described by the MDL is in user space and
you need to access the contents of the buffer itself, you first have to map the buffer
into system space with MmGetSystemAddressForMdl. This field of the MDL is
used to hold the system-space address where the user-space buffer has been
mapped.1
ByteCount and Size These fields contain the number of bytes in the
buffer described by the MDL and the size of the MDL itself, respectively.
Process If the buffer lives in user space, the Process field points to the
Process object that owns the buffer. The I/O Manager will use this information
when it cleans up the I/O operation.
Keep in mind that MDLs are opaque data objects defined by the NT Virtual
Memory Manager. Their actual contents may vary from platform to platform and
they might also change in future versions of NT. Consequently, you must access
an MDL using system support functions. Any other approach could lead to
disaster. Table 12.1 lists the MDL functions you're most likely to encounter in a driver.
See the DDK documentation for others. It's worth pointing out that some of the
functions in this table are implemented as macros for speed.

1 Using doubly-mapped buffers is generally a bad idea. Unmapping the buffer can cause a great deal
of system overhead.

Table 12.1 Functions that work with Memory Descriptor Lists

MDL access functions

Function                     Description
IoAllocateMdl                Allocates an empty MDL
IoFreeMdl                    Releases MDL allocated by IoAllocateMdl
MmBuildMdlForNonPagedPool    Builds MDL for an existing nonpaged pool
                             buffer
MmGetSystemAddressForMdl     Returns a nonpaged system space address for
                             the buffer described by an MDL
IoBuildPartialMdl            Builds an MDL describing part of a buffer
MmGetMdlByteCount            Returns count of bytes in buffer described by
                             MDL
MmGetMdlByteOffset           Returns page-offset of buffer described by MDL
MmGetMdlVirtualAddress       Returns starting VA of buffer described by MDL

MDLs give drivers a convenient, platform-independent way of describing
buffers located either in user- or system-address space. For drivers that perform
DMA operations, MDLs are important because they make it easier to set up an
Adapter object's mapping registers. Later parts of this chapter will show you how
to use MDLs to set up DMA transfers.

Maintaining Cache Coherency
The final thing we need to consider is the impact of various caches on DMA
operations. During a DMA transfer, data may be getting cached in various places,
and if everything isn't coordinated properly, someone might end up with stale
data. Figure 12.3 shows who the players are in this drama.

CPU data cache Modern CPUs support both on-chip and external caches
for holding copies of recently-used data. When the CPU wants something from
physical memory, it first looks for the data in the cache. If the CPU finds what it
wants, it doesn't have to make the long, slow trip down the system memory bus.
For write operations, data moves from the CPU to the cache, where (depending
on the caching policy) it may stay for a while before making its way out to main
memory.
The problem is that, on some architectures (primarily RISC platforms), the
CPU's cache controller and the DMA hardware are unaware of each other. This
lack of awareness can lead to incoherent views of memory. For instance, if the
CPU cache is holding part of a buffer and that buffer is overwritten in physical
memory by a DMA input, the CPU cache will contain stale data. Similarly, if
modified data hasn't been flushed from the CPU cache when a DMA output begins,
the DMA controller will be sending stale data from physical memory out to the
device.

Figure 12.3 Caches involved in DMA processing (diagram showing the DMA
Buffer, the Duplicate DMA Buffer, and the Adapter Object Cache; Copyright ©
1996 by Cydonix Corporation)
One way of handling this problem is to make sure that any portions of a
DMA buffer residing in the CPU's data cache are flushed before a DMA operation
begins.² Your driver can do this by calling KeFlushIoBuffers and giving it the
MDL describing the DMA buffer. This function flushes any pages in the MDL
from the data cache of every processor on the system. The code example later in
this chapter shows how this works.
If you know something about hardware, you may be horrified by the overhead
of flushing every CPU's data cache before every DMA transfer. It's important
to emphasize that the cache coherency problem described above is only an
issue on some platforms. On machines that automatically maintain cache coherency,
KeFlushIoBuffers is a no-op. You should always call it, however, just in case
your driver ends up on a platform that doesn't handle caching properly.
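
To make the stale-data problem concrete, here is a minimal sketch in portable C. It is a toy model, not driver code: the ToySystem structure and every function in it are hypothetical stand-ins for a one-entry write-back cache sitting in front of one byte of physical memory, and flush_cache only plays the role that a cache flush such as KeFlushIoBuffers plays on a non-coherent platform.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model: one byte of physical memory shadowed by a
 * one-entry write-back CPU cache. The DMA side never sees the cache. */
typedef struct {
    uint8_t memory;   /* what the DMA hardware reads and writes */
    uint8_t cached;   /* what the CPU reads and writes */
    bool    valid;    /* cache entry holds a copy of memory */
    bool    dirty;    /* cache entry is newer than memory */
} ToySystem;

/* CPU load: served from the cache when the entry is valid. */
static uint8_t cpu_read(ToySystem *s)
{
    assert(s != NULL);
    if (!s->valid) {
        s->cached = s->memory;
        s->valid = true;
    }
    return s->cached;
}

/* CPU store: lands in the cache only (write-back policy). */
static void cpu_write(ToySystem *s, uint8_t value)
{
    assert(s != NULL);
    s->cached = value;
    s->valid = true;
    s->dirty = true;
}

/* Flush: write dirty data back to memory and drop the cached copy. */
static void flush_cache(ToySystem *s)
{
    assert(s != NULL);
    if (s->dirty)
        s->memory = s->cached;
    s->valid = false;
    s->dirty = false;
}

/* DMA input: the device overwrites physical memory behind the CPU's back. */
static void dma_input(ToySystem *s, uint8_t value) { s->memory = value; }

/* DMA output: the device reads physical memory, not the cache. */
static uint8_t dma_output(const ToySystem *s) { return s->memory; }
```

In this model, a cpu_read that follows a dma_input keeps returning the old value until flush_cache runs, and a dma_output that follows a cpu_write sends the old value until the dirty entry has been flushed. Those are the two failure cases described above.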

Adapter object cache The Adapter object is another place where data
may be cached during a DMA transfer. Unlike the CPU cache, which is always a
real piece of hardware, the Adapter object's cache is an abstraction representing
platform-dependent hardware or software. It might be an actual cache in a system
DMA controller or a software buffer maintained by the I/O Manager. In fact, for
some combinations of hardware, there might not even be a cache, but your driver
has to act as if there were in order to guarantee portability.

² Another option is to use non-cached memory for your DMA buffers.


If this sounds strange, consider a DMA controller attached to an ISA bus.
Such a controller can access only the first sixteen megabytes of physical memory.
If the pages of a user buffer are outside this range, the I/O Manager allocates
another buffer in low memory when your driver sets up its DMA mapping registers.
If you're setting up an output operation, the I/O Manager also copies the
contents of the user buffer pages into this Adapter object buffer.
You need to flush the Adapter object cache of this ISA DMA controller in
two cases. First, after an input operation, your driver must tell the I/O Manager
to copy data from the Adapter buffer back to the user buffer. Second, when you
complete any data transfer, you have to let the I/O Manager know that it can
release the memory in the Adapter buffer. The function that does the work is
IoFlushAdapterBuffers.
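
The ISA case boils down to a little address arithmetic plus a rule about which way the copy goes. The following sketch models both in portable C. It is only an illustration under stated assumptions: the names (needs_adapter_buffer, copy_at_map_time, copy_at_flush_time) are invented for this example, and the real decisions are made inside the I/O Manager and the HAL, not by your driver.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* First physical address an ISA DMA controller cannot reach (16 MB). */
#define ISA_DMA_LIMIT 0x01000000u

/* Hypothetical helper: true if any byte of [phys, phys + length) lies at
 * or above the 16 MB line, so a low-memory Adapter buffer is needed. */
static bool needs_adapter_buffer(uint32_t phys, uint32_t length)
{
    assert(length > 0);
    /* 64-bit arithmetic so phys + length cannot wrap around. */
    uint64_t end = (uint64_t)phys + length;   /* one past the last byte */
    return end > ISA_DMA_LIMIT;
}

typedef enum {
    COPY_NONE,
    COPY_USER_TO_ADAPTER,   /* before an output transfer */
    COPY_ADAPTER_TO_USER    /* after an input transfer */
} ToyCopyStep;

/* Copy performed when the mapping registers are set up. */
static ToyCopyStep copy_at_map_time(bool uses_adapter_buffer, bool write_to_device)
{
    return (uses_adapter_buffer && write_to_device) ? COPY_USER_TO_ADAPTER
                                                    : COPY_NONE;
}

/* Copy performed when the Adapter buffers are flushed. */
static ToyCopyStep copy_at_flush_time(bool uses_adapter_buffer, bool write_to_device)
{
    return (uses_adapter_buffer && !write_to_device) ? COPY_ADAPTER_TO_USER
                                                     : COPY_NONE;
}
```

A page that ends exactly at the 16 MB line is still reachable; one more byte pushes it out of range. Outputs are copied at mapping time and inputs at flush time, which is why skipping the flush after an input leaves the caller's buffer unchanged.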
Categorizing DMA Drivers
The NT DMA model divides drivers into two categories, based on the location
of the DMA buffer itself. In packet-based DMA, data moves directly between
the device and the locked-down pages of a user-space buffer. This is the type of
DMA associated with Direct I/O operations. The main thing to notice here is that
each new I/O request will probably use a different set of physical pages for its
buffer. This has an impact on the kinds of setup and cleanup steps the driver will
have to take for each I/O.
The other possibility is that the driver sets up a single nonpaged buffer in
system space and uses it for all DMA transfers. This is referred to as common-buffer
DMA.
Packet-based and common-buffer DMA are not mutually exclusive categories.
Some complex devices perform both kinds of DMA. One example is the
Adaptec AHA-1742 controller, which uses packet-based DMA to transfer data
between SCSI devices and user buffers. This same controller exchanges command
and status information with its driver using a set of mailboxes kept in a
common-buffer area.
Although DMA drivers are all rather similar, certain implementation details
will depend on whether you're performing packet-based or common-buffer
DMA. Later sections of this chapter will present the specifics of writing each kind
of driver.

Limitations of the NT DMA Architecture
Although NT's use of an abstract DMA model makes some things easier, it
does have its drawbacks. For one thing, it tends to favor the notion of shared-system
DMA controllers. Much of the setup that goes on in an NT DMA driver is
based on the idea of passing a shared DMA channel from driver to driver. In an
age of dumb peripherals, this made sense, but as more bus-mastering devices
have appeared, the slave DMA model has become a little out of date.
A more significant problem is that NT doesn't allow you to perform DMA
operations directly from device to device. Instead, you have to read data from one
device, buffer it in system memory, and from there write it out to another device.³
This puts severe limitations on the available bandwidth and wastes one of the
main architectural features of modern buses like PCI. Sadly, Microsoft appears to
be adamantly opposed to direct device-to-device data transfers.

12.2 WORKING WITH ADAPTER OBJECTS
Although the specific details will vary according to the nature of the device and
the architecture of the driver, DMA drivers generally have to perform several
kinds of operations on Adapter objects.

• Locate the Adapter object associated with a specific device.
• Acquire and release ownership of Adapter objects and their mapping registers.
• Load the Adapter object's mapping registers at the start of a transfer.
• Flush the Adapter object's cache after a transfer completes.

The following subsections discuss these topics in general terms. Later sections
of this chapter will add more detail.

Finding the Right Adapter Object
All DMA drivers need to locate an Adapter object before they can perform
any I/O operations. To find the right one, a driver's initialization code needs to
call the HalGetAdapter function described in Table 12.2.
Given a description of some DMA hardware, HalGetAdapter returns a pointer
to the corresponding Adapter object and a count of the maximum number of mapping
registers available for a single transfer. The driver needs to save both these
items in nonpaged storage (usually the Device or Controller Extension) for later use.

Table 12.2 Function prototype for HalGetAdapter

PADAPTER_OBJECT HalGetAdapter                IRQL == PASSIVE_LEVEL
Parameter                                    Description
IN PDEVICE_DESCRIPTION DeviceDescription     Points to a structure describing device capabilities
IN OUT PULONG NumberOfMapRegisters           • IN - requested number of registers
                                             • OUT - maximum allowable number
Return value                                 • Non-NULL - address of Adapter object
                                             • NULL - no such Adapter object available

³ Part of the problem here is that you can only build MDLs for physical memory that's known to the
system at bootstrap time. There's simply no way to create an MDL describing memory that's actually
located on a peripheral or that's just a range of address space on some bus.

The main input to HalGetAdapter is the DEVICE_DESCRIPTION block
pictured in Table 12.3. It's important to set up this structure correctly, since most
of the failures of HalGetAdapter are due to bogus device descriptions. Also be
sure to clear the structure with RtlZeroMemory before you fill it in.

Table 12.3 The DEVICE_DESCRIPTION structure describes a piece of DMA hardware

DEVICE_DESCRIPTION, *PDEVICE_DESCRIPTION
Field                            Contents
ULONG Version                    • DEVICE_DESCRIPTION_VERSION
                                 • DEVICE_DESCRIPTION_VERSION1
BOOLEAN Master                   • TRUE - device is a bus master
                                 • FALSE - device uses system DMA
BOOLEAN ScatterGather            Slave device supports scatter/gather
BOOLEAN DemandMode               Slave device uses demand-mode
BOOLEAN AutoInitialize           Slave device uses autoinitialize mode
BOOLEAN Dma32BitAddresses        DMA logical space uses 32-bit addressing
BOOLEAN IgnoreCount              Platform's DMA controller doesn't maintain
                                 an accurate DMA count*
BOOLEAN Reserved1                -*
BOOLEAN Reserved2                -*
ULONG BusNumber                  System-assigned bus number
ULONG DmaChannel                 Slave device DMA channel number
INTERFACE_TYPE InterfaceType     Bus architecture
                                 • Internal
                                 • Isa
                                 • Eisa
                                 • MicroChannel
                                 • PCIBus
DMA_WIDTH DmaWidth               Width of a single transfer operation
                                 • Width8Bits
                                 • Width16Bits
                                 • Width32Bits
DMA_SPEED DmaSpeed               DMA bus-cycle speed
                                 • Compatible
                                 • TypeA
                                 • TypeB
                                 • TypeC
ULONG MaximumLength              Largest transfer size (in bytes) device can perform
ULONG DmaPort                    Micro Channel DMA port number

*Requires the use of DEVICE_DESCRIPTION_VERSION1

Most of these fields are self-explanatory, but the following ones may need a
little clarification.

ScatterGather For bus master devices, this says that the hardware has
some sort of built-in support for transferring data to and from noncontiguous
ranges of physical memory. A later section of this chapter will explain how to
write drivers that can take advantage of these capabilities.
For slave devices, setting this field to TRUE implies that the device can stop
and wait in the middle of a transfer while the I/O Manager reprograms the DMA
controller. Since the system DMA controllers on some platforms have only one
mapping register per channel, setting ScatterGather to TRUE would mean stopping
after each page of memory is transferred.

Demand transfer mode Some devices need to stop and "catch their
breath" during a DMA transfer. This gives them the chance to finish working
with one chunk of data before the next comes through. If your device behaves
this way, the DMA controller has to be programmed to work in demand mode.
Otherwise, the system DMA controller won't stop, no matter how much the
device screams.

Autoinitialization System DMA channels can be programmed to reinitial­
ize themselves when a transfer completes. In this mode, the DMA controller's
count and address registers are automatically reloaded from a pair of base count
and address registers at the end of each operation. This causes another transfer to
begin immediately. Typically, drivers using this mode of operation will also use a
common buffer for the data transfer.

IgnoreCount Setting this field to TRUE says that the platform's DMA
hardware doesn't maintain an accurate running count of the number of bytes
transferred. This forces the HAL to do some extra work during DMA operations,
which slows things down.
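
Of the fields above, autoinitialization is the easiest to picture as code. Here is a minimal sketch, in portable C, of a DMA channel whose working registers reload from a pair of base registers when the count runs out. It is a simplified toy model: the ToyDmaChannel structure and its functions are hypothetical, and real controllers differ in detail (for example, in how the count is encoded).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one system DMA channel. */
typedef struct {
    uint32_t base_address;     /* programmed once by the driver */
    uint32_t base_count;
    uint32_t current_address;  /* working registers, updated per unit moved */
    uint32_t current_count;
    bool     auto_initialize;
    bool     active;
} ToyDmaChannel;

/* Program the channel: base and working registers start out identical. */
static void channel_program(ToyDmaChannel *ch, uint32_t address,
                            uint32_t count, bool auto_initialize)
{
    assert(count > 0);
    ch->base_address = ch->current_address = address;
    ch->base_count = ch->current_count = count;
    ch->auto_initialize = auto_initialize;
    ch->active = true;
}

/* Move one unit of data and handle the end of the operation. */
static void channel_transfer_one(ToyDmaChannel *ch)
{
    if (!ch->active)
        return;
    ch->current_address++;
    ch->current_count--;
    if (ch->current_count == 0) {
        if (ch->auto_initialize) {
            /* Reload from the base registers; the next pass starts at once. */
            ch->current_address = ch->base_address;
            ch->current_count = ch->base_count;
        } else {
            ch->active = false;   /* driver must reprogram the channel */
        }
    }
}
```

With auto_initialize set, the channel keeps cycling over the same address range without driver intervention, which is why this mode pairs naturally with a common buffer.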
Acquiring and Releasing the Adapter Object
There's no guarantee that the DMA resources needed for a transfer will be
free when a driver's Start I/O routine runs. For example, a slave-device's DMA
channel may already be in use by another device, or there may not be enough
mapping registers to handle the request. Consequently, all packet-based DMA
drivers and drivers for common-buffer slave devices have to request ownership
of the Adapter object before starting a data transfer.
Since a Start I/O routine runs at DISPATCH_LEVEL IRQL, there's no way it
can stop and wait for the Adapter object. Instead, it calls the IoAllocateAdapterChannel
function (see Table 12.4) and then returns control to the I/O Manager.

Table 12.4 Prototype for IoAllocateAdapterChannel

NTSTATUS IoAllocateAdapterChannel        IRQL == DISPATCH_LEVEL
Parameter                                Description
IN PADAPTER_OBJECT AdapterObject         Adapter object from HalGetAdapter
IN PDEVICE_OBJECT DeviceObject           Target device for DMA operation
IN ULONG NumberOfMapRegisters            Count of map registers to allocate
IN PDRIVER_CONTROL ExecutionRoutine      Address of XxAdapterControl
IN PVOID Context                         Argument for XxAdapterControl
Return value                             • STATUS_SUCCESS
                                         • STATUS_INSUFFICIENT_RESOURCES

When the requested DMA resources become available, the I/O Manager notifies
the driver by calling its Adapter Control routine. It's important to keep in mind
that this is an asynchronous callback. It may happen as soon as Start I/O calls
IoAllocateAdapterChannel or it may not occur until some other driver releases
the Adapter resources.
Notice that you have to be at DISPATCH_LEVEL IRQL when you call this
function. Since you normally call it from the Start I/O routine, this poses no problem.
However, if you're using it in some weird way and you happen to be at
PASSIVE_LEVEL, make sure you use KeRaiseIrql and KeLowerIrql before and
after your call to IoAllocateAdapterChannel.
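
The raise-call-lower pattern can be sketched outside the kernel with a few stand-ins. Everything in the block below is hypothetical: toy_raise_irql and toy_lower_irql only imitate the shape of KeRaiseIrql and KeLowerIrql, and toy_allocate_adapter_channel does nothing except check the current level. In a real driver you would use the kernel routines and IoAllocateAdapterChannel itself.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two IRQLs that matter here. */
typedef enum { TOY_PASSIVE_LEVEL = 0, TOY_DISPATCH_LEVEL = 2 } ToyIrql;

static ToyIrql g_current_irql = TOY_PASSIVE_LEVEL;

/* Stand-in for KeRaiseIrql: saves the old level, then raises. */
static void toy_raise_irql(ToyIrql new_irql, ToyIrql *old_irql)
{
    assert(new_irql >= g_current_irql);   /* raising never lowers the level */
    *old_irql = g_current_irql;
    g_current_irql = new_irql;
}

/* Stand-in for KeLowerIrql: restores a previously saved level. */
static void toy_lower_irql(ToyIrql irql)
{
    assert(irql <= g_current_irql);
    g_current_irql = irql;
}

/* Stand-in for IoAllocateAdapterChannel: only checks the IRQL rule.
 * Returns 0 on success and -1 if called at the wrong level. */
static int toy_allocate_adapter_channel(void)
{
    return (g_current_irql == TOY_DISPATCH_LEVEL) ? 0 : -1;
}

/* The pattern: raise, make the call, then restore whatever level the
 * caller was running at. Raising to the current level is harmless. */
static int allocate_from_any_irql(void)
{
    ToyIrql old_irql;
    int status;

    toy_raise_irql(TOY_DISPATCH_LEVEL, &old_irql);
    status = toy_allocate_adapter_channel();
    toy_lower_irql(old_irql);
    return status;
}
```

Saving the old level and restoring exactly that value, rather than hard-coding PASSIVE_LEVEL, keeps the wrapper safe to call from a routine that is already at DISPATCH_LEVEL.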
The Adapter Control routine in a DMA driver is responsible for calling
IoMapTransfer to set up the DMA hardware and starting the actual device operation.
Table 12.5 contains a prototype of the Adapter Control callback.
The MapRegisterBase argument is an opaque value that identifies the mapping
registers assigned to your I/O request. In a sense, it's a kind of handle to a
specific group of registers. You use this handle to set up the DMA hardware for
the transfer. Normally, you should save this value in the Device or Controller
Extension because you'll need it in later parts of the DMA operation.

Table 12.5 Function prototype for an Adapter Control routine

IO_ALLOCATION_ACTION XxAdapterControl    IRQL == DISPATCH_LEVEL
Parameter                                Description
IN PDEVICE_OBJECT DeviceObject           Target device for DMA operation
IN PIRP Irp                              IRP describing this operation
IN PVOID MapRegisterBase                 Handle to a group of mapping registers
IN PVOID Context                         Driver-determined context
Return value                             • DeallocateObjectKeepRegisters
                                         • KeepObject

Watch out for the Irp argument. The IRP address sent to your Adapter Control
routine comes from the CurrentIrp field of the Device object. Since the CurrentIrp
field only gets set when the Start I/O routine is called, you can only use
this passed IRP pointer if IoAllocateAdapterChannel is called from the Start I/O
routine. If you're calling it from some other context, this pointer will be NULL. In
that case, you'll have to find another way to pass the IRP (and its associated MDL
address) to the Adapter Control routine.
After it programs the DMA controller and starts the data transfer, the
Adapter Control routine gives control back to the I/O Manager. Drivers of slave
devices should return a value of KeepObject from this function so that no one else
will be able to use the Adapter object until this request is finished. Bus master
drivers return DeallocateObjectKeepRegisters instead.
When the DpcForIsr routine in a DMA driver completes an I/O request, it
needs to release any Adapter resources it owns. Drivers of slave devices do this by
calling IoFreeAdapterChannel; bus master drivers call IoFreeMapRegisters.
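
The pairing of return value and release call is easy to get backwards, so here it is as a small lookup in portable C. The enum and function names are hypothetical labels invented for this sketch; they simply mirror the rules stated above and are not the kernel's own definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for what the Adapter Control routine returns. */
typedef enum {
    TOY_KEEP_OBJECT,
    TOY_DEALLOCATE_OBJECT_KEEP_REGISTERS
} ToyAllocationAction;

/* Hypothetical labels for the release call made by the DpcForIsr. */
typedef enum {
    TOY_FREE_ADAPTER_CHANNEL,
    TOY_FREE_MAP_REGISTERS
} ToyReleaseCall;

typedef struct {
    ToyAllocationAction adapter_control_returns;
    ToyReleaseCall      dpc_calls;
} ToyOwnershipRule;

/* Slave devices hold the whole Adapter object until the request is done;
 * bus masters give the object back and keep only their mapping registers. */
static ToyOwnershipRule ownership_rule(bool bus_master)
{
    ToyOwnershipRule rule;

    if (bus_master) {
        rule.adapter_control_returns = TOY_DEALLOCATE_OBJECT_KEEP_REGISTERS;
        rule.dpc_calls = TOY_FREE_MAP_REGISTERS;
    } else {
        rule.adapter_control_returns = TOY_KEEP_OBJECT;
        rule.dpc_calls = TOY_FREE_ADAPTER_CHANNEL;
    }
    return rule;
}
```

The two halves go together: whatever the Adapter Control routine chose to keep is what the DpcForIsr has to give back.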

Setting Up the DMA Hardware
All packet-based drivers, as well as common-buffer drivers for slave
devices, have to program the DMA hardware at the beginning of each data transfer.
In terms of the abstract DMA model used by NT, this means loading the
Adapter object's mapping registers with physical-page addresses taken from the
MDL. This setup work is done by the IoMapTransfer function described in Table
12.6.

Table 12.6 Prototype for IoMapTransfer

PHYSICAL_ADDRESS IoMapTransfer       IRQL <= DISPATCH_LEVEL
Parameter                            Description
IN PADAPTER_OBJECT AdapterObject     Allocated Adapter object
IN PMDL Mdl                          Memory Descriptor List for DMA buffer
IN PVOID MapRegisterBase             Handle to a group of mapping registers
IN PVOID CurrentVa                   Virtual address of buffer within the MDL
IN OUT PULONG Length                 • IN - count of bytes to be mapped
                                     • OUT - actual count of bytes mapped
IN BOOLEAN WriteToDevice             • TRUE - send data to device
                                     • FALSE - read data from device
Return value                         DMA logical address of the mapped region

IoMapTransfer uses the CurrentVa and Length arguments to figure out
what physical page addresses to put into the mapping registers. These values
must fall somewhere within the range of addresses described by the MDL.
Keep in mind that IoMapTransfer may actually move the contents of a
DMA output buffer from one place to another in memory. For example, on an ISA
machine, if the pages in the MDL are outside the 16-megabyte DMA limit, calling
this function results in data being copied to a buffer in low physical memory.
Similarly, if a DMA input buffer is out of range, IoMapTransfer will allocate a buffer
in low memory for the transfer. On buses that support 32-bit DMA addresses, no
copying or duplicate buffers are required.
Drivers of bus master devices also need to call IoMapTransfer. In this case,
however, the function behaves a little differently, since it doesn't know how to
program the bus master's control registers. Instead, IoMapTransfer simply
returns address and length values that your driver then loads into the device's
registers. For bus masters with built-in scatter/gather support, this same mechanism
allows your driver to create a scatter/gather list for the device. Later sections
of this chapter will explain how all this works.
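
To give a feel for what a scatter/gather list looks like, the following sketch builds one in portable C from a list of physical page numbers. It is a model under stated assumptions, not driver code: ToySgElement, build_sg_list, and the 4096-byte page size are all hypothetical, and in a real driver the address and length of each element would come back from IoMapTransfer rather than from arithmetic on page numbers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u   /* assumed page size for this sketch */

/* One entry of a hypothetical scatter/gather list. */
typedef struct {
    uint32_t address;   /* start of a physically contiguous run */
    uint32_t length;    /* bytes in the run */
} ToySgElement;

/* pfn[]  : physical page numbers backing the buffer, in buffer order
 * offset : byte offset of the buffer's first byte within pfn[0]
 * length : total bytes in the buffer (the pages must cover it)
 * Returns the number of elements written, or 0 if out[] is too small. */
static size_t build_sg_list(const uint32_t *pfn, size_t page_count,
                            uint32_t offset, uint32_t length,
                            ToySgElement *out, size_t max_elems)
{
    size_t n = 0;

    assert(offset < TOY_PAGE_SIZE);
    for (size_t i = 0; i < page_count && length > 0; i++) {
        uint32_t addr  = pfn[i] * TOY_PAGE_SIZE + offset;
        uint32_t chunk = TOY_PAGE_SIZE - offset;

        if (chunk > length)
            chunk = length;

        if (n > 0 && out[n - 1].address + out[n - 1].length == addr) {
            out[n - 1].length += chunk;   /* contiguous: extend the last run */
        } else {
            if (n == max_elems)
                return 0;                 /* list too small */
            out[n].address = addr;
            out[n].length  = chunk;
            n++;
        }
        length -= chunk;
        offset = 0;                       /* later pages start at byte 0 */
    }
    assert(length == 0);                  /* pages covered the whole buffer */
    return n;
}
```

For example, a buffer of 0x2000 bytes that starts 0x100 bytes into page 5 and continues through pages 6 and 9 yields two elements: one run of 0x1F00 bytes covering the adjacent pages 5 and 6, and one run of 0x100 bytes on page 9.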
Flushing the Adapter Object Cache
At the end of a data transfer, all packet-based DMA drivers and drivers for
common-buffer slave devices have to call IoFlushAdapterBuffers (see Table 12.7).
For devices using the system DMA controller, this function flushes any hardware
caches associated with the Adapter object.
In the case of ISA devices doing packet-based DMA, this call releases any
low memory used for auxiliary buffers. For input operations, it also copies data
back to the physical pages of the caller's input buffer. Refer back to the section on
cache coherency for a discussion of this process.

Table 12.7 Prototype for IoFlushAdapterBuffers

BOOLEAN IoFlushAdapterBuffers        IRQL <= DISPATCH_LEVEL
Parameter                            Description
IN PADAPTER_OBJECT AdapterObject     Adapter object used for this I/O
IN PMDL Mdl                          MDL describing the buffer
IN PVOID MapRegisterBase             Handle passed to XxAdapterControl
IN PVOID CurrentVa                   Starting VA where I/O operation took place
IN ULONG Length                      Length of buffer
IN BOOLEAN WriteToDevice             • TRUE - operation was an output
                                     • FALSE - operation was an input
Return value                         • TRUE - Adapter buffers flushed
                                     • FALSE - an error occurred


12.3 WRITING A PACKET-BASED SLAVE DMA DRIVER
In packet-based slave DMA, the device transfers data to or from the locked-down
pages of the caller's buffer using a shared DMA controller on the motherboard.
The system is also responsible for providing scatter/gather support.

How Packet-Based Slave DMA Works
Although the specifics will depend on the nature of your device, most
packet-based slave DMA drivers conform to a very similar pattern. The following
subsections describe what goes on in the routines making up one of these
drivers.

DriverEntry routine Along with its usual duties, the DriverEntry routine
has some extra work to do:

1. It finds the DMA channel used by the device. This can come either from
   auto-detected hardware information in the Registry or it can be hard-coded in the
   Parameters subkey of the driver's service key.

2. DriverEntry uses its hardware information to build a DEVICE_DESCRIPTION
   structure and calls HalGetAdapter to locate the Adapter object associated
   with the device.

3. It saves the address of the Adapter object and the count of mapping registers
   returned by HalGetAdapter for later use. Usually these are stored in the
   Device Extension.

4. It sets the DO_DIRECT_IO bit in the Flags field of any Device objects it creates.
   This causes the I/O Manager to lock user buffers in memory and create
   MDLs for them.

Start I/O routine Unlike its counterpart in a programmed I/O driver, this
Start I/O routine doesn't actually start the device. Instead, it just requests ownership
of the Adapter object and leaves the rest of the work to the Adapter Control
callback routine. Specifically, the Start I/O routine does the following:

1. It calls KeFlushIoBuffers to flush data from the CPU's cache out to physical
   memory.

2. Start I/O decides how many mapping registers to request. Initially, it calculates
   the number of registers needed to cover the entire user buffer. If this
   turns out to be more mapping registers than the Adapter object has, it will ask
   for as many as are available.

3. Based on the number of mapping registers and the size of the user buffer,
   Start I/O calculates the number of bytes to transfer in the first device operation.
   This may be the entire buffer or it may be only the first portion of a split
   transfer.

4. Next, it calls MmGetMdlVirtualAddress to recover the virtual address of the
   user buffer from the MDL. It stores this address in the Device Extension. Later
   parts of the driver will use this address as an offset into the MDL to set up the
   actual DMA transfer.

5. Start I/O then calls IoAllocateAdapterChannel to request ownership of the
   Adapter object. If this function succeeds, the rest of the setup work will be
   done by the Adapter Control routine, so Start I/O simply returns control to
   the I/O Manager.

6. If IoAllocateAdapterChannel returns an error, Start I/O puts an error code in
   the IRP's IoStatus block, calls IoCompleteRequest, and starts processing the
   next IRP.

Adapter Control routine The I/O Manager calls the Adapter Control routine
whenever the necessary Adapter resources have become available. Its job is to
initialize the DMA controller for the transfer and start the device itself. This routine
does the following:

1. It stores the value of the MapRegisterBase argument in the Device Extension
   for later use.

2. The Adapter Control routine then calls IoMapTransfer to load the Adapter
   object's mapping registers. To make this call, it uses the buffer's virtual
   address and the transfer size calculated by the Start I/O routine.

3. Next, it sends appropriate commands to the device to begin the transfer
   operation.

4. Finally, the Adapter Control routine returns the value KeepObject to retain
   ownership of the Adapter object.

At this point, the transfer is actually in progress, and the system can go off
and do other things until an interrupt arrives from the device.

Interrupt Service routine Compared to a programmed I/O driver, the
ISR in a packet-based DMA driver is not very complicated. Unless hardware limitations
force the driver to split a large transfer request across several device operations,
there will be only a single interrupt to service when the whole transfer
completes. When this interrupt arrives, the ISR does the following:

1. It issues whatever commands are necessary to acknowledge the device and
   prevent it from generating any more interrupts.

2. The ISR then stores device status (and any relevant error information) in the
   Device Extension.

3. It calls IoRequestDpc to continue processing the request in the driver's
   DpcForIsr routine.

4. The ISR returns a value of TRUE to indicate that it serviced the interrupt.

DpcForIsr routine The DpcForIsr routine is triggered by the ISR at the
end of each partial data transfer operation. Its job is to start the next partial transfer
(if there is one) or to complete the current request. Specifically, the DpcForIsr
routine in a packet-based DMA driver does the following:

1. It calls IoFlushAdapterBuffers to force any remaining data from the Adapter
   object's cache.

2. The DpcForIsr routine checks the Device Extension to see if there were any
   errors during the operation. If there were, it completes the request with an
   appropriate status code and length, and starts the next request.

3. Otherwise, it decrements the count of bytes remaining by the size of the last
   transfer. If the whole buffer has been processed, it completes the current
   request and starts the next.

4. If more data remains, the DpcForIsr routine increments the user-buffer
   address pointer (stored in the Device Extension) by the size of the last operation.
   It then calculates the number of bytes to transfer in the next device operation,
   calls IoMapTransfer to reset the mapping registers, and starts the
   device.

If the DpcForIsr routine started another partial transfer, the I/O Manager
will return control to the driver again when the device generates an interrupt.

Splitting DMA Transfers
When a packet-based DMA driver receives a buffer, it may not be able to
transfer all the data in a single device operation. It could be that the Adapter
object doesn't have enough mapping registers to handle the whole thing at once,
or there could be limitations on the device itself. In any event, the driver has to be
prepared to split the request across multiple data-transfer operations.
There are two solutions to this problem. One is to have the driver reject any
requests that it can't handle in a single I/O. With this approach, anyone using the
driver is responsible for breaking the request into chunks that are small enough to
process. Of course, the driver will have to provide some mechanism for letting its
clients know the maximum allowable buffer size (an IOCTL, for example). If you
decide to do things this way, you might want to write a higher-level driver that
sits on top of the DMA device driver and splits the requests. This has the advantage
of shielding application programs from the details of splitting the request.
Another approach is to write a single, monolithic driver that accepts
requests of any size and splits them into several I/O operations. This is the strategy
used by the sample driver in the next section of this chapter.
To do things this way, you need to maintain a pointer that tracks your position
in the user buffer as you transfer successive chunks of data. You also need to
maintain a count of the number of bytes left to process, as well as calculating the
amount of data to transfer in the current I/O operation. The following subsections
explain how to initialize and update these data items during an I/O request.

First transfer The Start I/O routine normally sets things up for the first
transfer. Initially, it tries to grab enough mapping registers to do everything in one
I/O. If the Adapter object doesn't have enough mapping registers for this to
work, Start I/O asks for as many as it can get and sets up the current transfer
accordingly. The following code fragment shows how it's done.

    pDE->TransferVA =
        MmGetMdlVirtualAddress( Irp->MdlAddress );
    pDE->BytesRemaining =
        MmGetMdlByteCount( Irp->MdlAddress );

    pDE->TransferSize = pDE->BytesRemaining;

    MapRegsNeeded =
        ADDRESS_AND_SIZE_TO_SPAN_PAGES(
            pDE->TransferVA,
            pDE->TransferSize );

    if( MapRegsNeeded > pDE->MapRegsAvailable )
    {
        MapRegsNeeded = pDE->MapRegsAvailable;
        pDE->TransferSize =
            MapRegsNeeded * PAGE_SIZE -
            MmGetMdlByteOffset( Irp->MdlAddress );
    }

    IoAllocateAdapterChannel( ... );

Additional transfers After each interrupt, the DpcForIsr checks to see if
there's any data left to process. If there is, it calculates the number of mapping
registers needed to transfer all the remaining bytes in a single I/O operation. If
there aren't enough mapping registers available, it sets up another partial transfer.
The following code fragment illustrates the procedure.

    pDE->BytesRemaining -= pDE->TransferSize;

    if( pDE->BytesRemaining > 0 )
    {
        pDE->TransferVA += pDE->TransferSize;
        pDE->TransferSize = pDE->BytesRemaining;

        MapRegsNeeded =
            ADDRESS_AND_SIZE_TO_SPAN_PAGES(
                pDE->TransferVA,
                pDE->TransferSize );

        if( MapRegsNeeded > pDE->MapRegsAvailable )
        {
            MapRegsNeeded = pDE->MapRegsAvailable;
            pDE->TransferSize =
                MapRegsNeeded * PAGE_SIZE -
                BYTE_OFFSET( pDE->TransferVA );
        }

        IoMapTransfer( ... );
    }
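
If you want to check the splitting arithmetic without any hardware, it can be replayed in ordinary user-mode C. The sketch below is only a model: span_pages imitates what ADDRESS_AND_SIZE_TO_SPAN_PAGES computes, and the names next_transfer_size and plan_transfers, along with the fixed 4096-byte page size, are assumptions made for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u   /* assumed page size for this sketch */

/* Number of pages touched by `size` bytes starting at address `va`. */
static uint32_t span_pages(uint32_t va, uint32_t size)
{
    return ((va % TOY_PAGE_SIZE) + size + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;
}

/* Size of the next device operation: everything that is left if it fits
 * in the available mapping registers; otherwise as many whole pages as
 * the registers cover, less the starting offset within the first page. */
static uint32_t next_transfer_size(uint32_t va, uint32_t bytes_remaining,
                                   uint32_t map_regs_available)
{
    assert(map_regs_available > 0);
    if (span_pages(va, bytes_remaining) <= map_regs_available)
        return bytes_remaining;
    return map_regs_available * TOY_PAGE_SIZE - (va % TOY_PAGE_SIZE);
}

/* Replays the whole request and records the size of each partial
 * transfer. Returns the number of transfers recorded (at most max). */
static size_t plan_transfers(uint32_t va, uint32_t total,
                             uint32_t map_regs_available,
                             uint32_t *sizes, size_t max)
{
    size_t n = 0;

    while (total > 0 && n < max) {
        uint32_t size = next_transfer_size(va, total, map_regs_available);
        sizes[n++] = size;
        va += size;       /* advance the buffer pointer */
        total -= size;    /* and the count of bytes remaining */
    }
    return n;
}
```

For a 0x5000-byte buffer that begins 0x800 bytes into a page, with only two mapping registers available, this plan produces three transfers of 0x1800, 0x2000, and 0x1800 bytes. Only the first transfer loses space to the starting offset; every later one begins on a page boundary.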


12.4 CODE EXAMPLE: A PACKET-BASED SLAVE DMA DRIVER
This example is a skeleton of a packet-based driver for a generic slave DMA
device. Although it doesn't actually manage a specific kind of hardware, it will
help you to understand how these drivers work. You can find the complete code
for this example in the CH12\PACKT-S directory on the disk that accompanies
this book.

XXDRIVER.H
This excerpt from the driver-specific header file shows the changes that need
to be made to support a DMA device.

DEVICE_EXTENSION The modified Device Extension structure contains
some extra items that are necessary for packet-based DMA.

    typedef struct _DEVICE_EXTENSION {
        PDEVICE_OBJECT DeviceObject;     // Back pointer
        ULONG NtDeviceNumber;            // Zero-based device num
        PUCHAR PortBase;                 // First control register
        PKINTERRUPT pInterrupt;          // Interrupt object

        PADAPTER_OBJECT AdapterObject;   // ❶
        ULONG MapRegisterCount;
        PVOID MapRegisterBase;           // ❷
        ULONG BytesRequested;            // ❸
        ULONG BytesRemaining;
        ULONG TransferSize;
        PUCHAR TransferVA;
        BOOLEAN WriteToDevice;           // ❹
        UCHAR DeviceStatus;
    } DEVICE_EXTENSION, *PDEVICE_EXTENSION;

❶ These are returned by HalGetAdapter. They identify the specific Adapter
  object and its maximum transfer size.

❷ This identifies a particular group of mapping registers that have been
  assigned to our driver during the course of an I/O request.

❸ These bookkeeping fields keep track of our progress through a split transfer
  operation.

❹ These items hold the direction of the current data transfer and the status
  of the DMA device itself.

REGCON.C
This sample uses the version of XxGetHardwareInfo that extracts hard-coded
information from the Parameters subkey of the driver's service key. You
could just as easily use auto-detected information.

XxGetDmaInfo This function uses information pulled from the Registry,
supplemented with a few assumptions about the hardware, to find the device's
Adapter object.

    static NTSTATUS
    XxGetDmaInfo(
        IN INTERFACE_TYPE BusType,
        IN ULONG BusNumber,
        IN PDEVICE_BLOCK pDevice
        )
    {
        DEVICE_DESCRIPTION Descrip;

        RtlZeroMemory(
            &Descrip,
            sizeof( DEVICE_DESCRIPTION ) );          // ❶

        Descrip.Version = DEVICE_DESCRIPTION_VERSION1;

        Descrip.Master            = FALSE;           // ❷
        Descrip.ScatterGather     = FALSE;
        Descrip.DemandMode        = FALSE;
        Descrip.AutoInitialize    = FALSE;
        Descrip.Dma32BitAddresses = FALSE;

        Descrip.InterfaceType = BusType;
        Descrip.BusNumber     = BusNumber;

        Descrip.DmaChannel    = pDevice->DmaChannel;
        Descrip.MaximumLength = XX_MAX_DMA_LENGTH;
        Descrip.DmaWidth      = Width16Bits;
        Descrip.DmaSpeed      = Compatible;

        pDevice->MapRegisterCount =
            ( XX_MAX_DMA_LENGTH / PAGE_SIZE ) + 2;   // ❸

        pDevice->AdapterObject =
            HalGetAdapter(
                &Descrip,
                &pDevice->MapRegisterCount );        // ❹

        if( pDevice->AdapterObject == NULL )         // ❺
            return STATUS_INSUFFICIENT_RESOURCES;
        else
            return STATUS_SUCCESS;
    }

❶ It's important to make sure that there aren't any spurious bits set in the
  DEVICE_DESCRIPTION structure.

❷ From this point on, start to build a description of the DMA device. In this
  case, it's a slave device that performs 16-bit transfers and needs an
  ISA-compatible bus cycle speed.

❸ Calculate the number of mapping registers that correspond to the largest
  possible transfer the device can handle. In the worst case, a buffer could
  occupy some integral number of pages plus one byte before the first page
  and one byte after the last page. To account for this possibility, request
  two additional mapping registers.

❹ Try to find the Adapter object for the device. Later parts of the driver will
  need a pointer to the object and information about the maximum number
  of available mapping registers.

❺ If HalGetAdapter fails, it usually means that the DEVICE_DESCRIPTION
  had some inconsistencies.

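
The "plus two" in the mapping-register calculation can be checked with a few lines of user-mode C. This is a sketch under stated assumptions: the 4096-byte page size and the function names (pages_spanned, map_regs_for_max_length, worst_case_span) are made up for the illustration and are not part of the sample driver.

```c
#include <assert.h>
#include <stdint.h>

#define TOY_PAGE_SIZE 4096u   /* assumed page size for this sketch */

/* Pages touched by `length` bytes starting `offset_in_page` bytes into
 * the first page. */
static uint32_t pages_spanned(uint32_t offset_in_page, uint32_t length)
{
    assert(offset_in_page < TOY_PAGE_SIZE);
    return (offset_in_page + length + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE;
}

/* The estimate used in the listing: whole pages, plus one register for
 * a partial first page and one for a partial last page. */
static uint32_t map_regs_for_max_length(uint32_t max_length)
{
    return max_length / TOY_PAGE_SIZE + 2;
}

/* Brute force: the most pages a buffer of `length` bytes can touch,
 * trying every possible starting offset within a page. */
static uint32_t worst_case_span(uint32_t length)
{
    uint32_t worst = 0;

    for (uint32_t offset = 0; offset < TOY_PAGE_SIZE; offset++) {
        uint32_t pages = pages_spanned(offset, length);
        if (pages > worst)
            worst = pages;
    }
    return worst;
}
```

A buffer of four pages plus two bytes that starts on the last byte of a page touches six pages, which is exactly what the estimate allows for. For a length that is an exact multiple of the page size, the worst case is one page fewer, so the estimate errs on the safe side.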
TRANSFER.C
This portion of the example performs the actual data transfers. If an I/O
request is too large for a single device operation, these routines split the request
over several transfers.

XxStartIo This function gets control at the beginning of each request. It
calculates the size of the first data transfer and requests ownership of the Adapter
object.

VOI D
XxS tart i o (
IN PDEVICE_OBJECT Devi ceObj e c t ,
IN PIRP I rp
)

Sec. 12.4 Code Example: A Packet-Based Slave DMA Driver

279

P I O_STACK_LOCATION I rpS tack =
I oGetCurrent i rp S tackLocat i on ( I rp ) ;
PDEVICE_EXTENSION pDE =
DeviceObj e c t - >Devi c eExtens i on ;
PMDL Mdl = I rp - >MdlAddres s ;
ULONG MapRegsNeeded ;
NTSTATUS s tatus ;
swi tch ( I rpStack->Maj orFunc t i on ) {
case I RP_MJ_WRITE :
case I RP_MJ_READ :
pDE- >Byt e sReque s ted
MmGetMdlByteCount ( Mdl ) ; 0
pDE - >Byt esRemaining =
pDE - >Byt esRequested ;
pDE - >Trans ferVA =
MmGe tMdlVi rtualAddr e s s ( Mdl ) ;
II
I I S e t the direc t i on f l ag
II

i f ( I rpStack- >Maj orFunc t i on
= = I RP_MJ_WRITE
)
pDE - >Wr i t eToDevi c e

TRUE ;

pDE - >Wr i t eToDevi c e

FALSE ;

else

pDE - >Trans ferS i z e
pDE - >Byt esRemaining ; @
MapRegsNeeded =
ADDRESS_AND_S I ZE_TO_S PAN_PAGES (
pDE - >Trans f e rVA ,
pDE - >Trans ferS i z e ) ;
i f ( MapRegsNeeded >
pDE - >MapRegi s t e rCount
MapRegsNeeded =
pDE - >MapRegi s te rCount ;

280

Chapter 12

DMA Drivers

pDE - >Trans ferS i z e =
MapRegsNeeded * PAGE_S I ZE MmGe tMdlByt eO f f s e t ( Mdl ) ;
s tatus

I oAl l ocateAdapt erChanne l ( e
pDE - >AdapterObj e c t ,
Devi c eObj e c t ,
MapRegsNeeded ,
XxAdapterC ontrol ,
pDE ) ;
i f ( ! NT_SUCCESS ( s t atus ) ) 0
{
I rp - > I o S tatus . S tatus = s tatus ;
I rp - > I o S tatus . Informat i on = O ;
I oComp l e teReque s t (
I rp ,
I O_NO_INCREMENT ) ;
I o S t ar tNext Packe t (
Devi c eObj ect ,
FALSE ) ;

break ;
II
I I Shou l d never get here - - j us t get r i d
I I o f the packet . . .
II

de f au l t :
I rp - > I o S tatus . S tatus
STATUS_NOT_SUPPORTED ;
I rp- > I o S tatus . Informat i on = O ;
I oComp l e t eReque s t (
I rp ,
IO_NO_INCREMENT ) ;
I o StartNextPacke t ( Devic eObj ect , FALSE ) ;
break ;
} I I end swi tch
(1) Set up various bookkeeping values. The size and address of the user
buffer come from the MDL built by the I/O Manager. Keep in mind that
you can use the virtual address as an index into the user buffer but you
can't actually dereference it.

(2) This section calculates the size of the first partial transfer. First, the
driver tries to transfer everything in a single DMA. If there aren't enough
mapping registers to handle the whole buffer, the driver asks for as many
mapping registers as it can get. Based on this smaller number, it calculates
a smaller size for the current transfer.

(3) Ask for the Adapter object using an asynchronous call. The Adapter
Control routine will execute when the DMA channel is available. It will start
the actual device operation.

(4) If the call to IoAllocateAdapterChannel fails, it usually means there
aren't enough mapping registers. In that case, the driver simply fails the
IRP and starts the next request.
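To see the size calculation with real numbers, here is a small user-mode model of it. This is only a sketch: span_pages stands in for ADDRESS_AND_SIZE_TO_SPAN_PAGES, first_transfer_size is an invented helper name, and a 4 KB page is assumed.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u   /* assumed page size */

/* User-mode stand-in for ADDRESS_AND_SIZE_TO_SPAN_PAGES. */
uint32_t span_pages(uintptr_t va, uint32_t size)
{
    uint32_t offset = (uint32_t)(va & (PAGE_SIZE_BYTES - 1));
    return (offset + size + PAGE_SIZE_BYTES - 1) / PAGE_SIZE_BYTES;
}

/* Returns the size of the first partial transfer and stores the number
 * of mapping registers to request in *regs_out. */
uint32_t first_transfer_size(uintptr_t va, uint32_t bytes_remaining,
                             uint32_t map_register_count,
                             uint32_t *regs_out)
{
    uint32_t size = bytes_remaining;          /* try for the whole buffer */
    uint32_t regs = span_pages(va, size);

    assert(map_register_count > 0);
    if (regs > map_register_count) {
        /* Not enough registers: take what is available and shrink the
         * transfer so it ends on the last mapped page boundary. */
        regs = map_register_count;
        size = regs * PAGE_SIZE_BYTES -
               (uint32_t)(va & (PAGE_SIZE_BYTES - 1));
    }
    *regs_out = regs;
    return size;
}
```

With a 20,000-byte buffer that starts 256 bytes into a page, five registers would cover everything. If only three are available, the first transfer shrinks to 3 * 4096 - 256 = 12,032 bytes.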
XxAdapterControl
This function programs the system DMA hardware and starts the device itself.
The I/O Manager calls it when the Adapter object belongs to our device and
there are enough mapping registers to handle the request.

    static IO_ALLOCATION_ACTION
    XxAdapterControl(
        IN PDEVICE_OBJECT DeviceObject,
        IN PIRP Irp,                            // (1)
        IN PVOID MapRegisterBase,
        IN PVOID Context
        )
    {
        PDEVICE_EXTENSION pDE = Context;

        pDE->MapRegisterBase = MapRegisterBase; // (2)

        KeFlushIoBuffers(
            Irp->MdlAddress,
            !pDE->WriteToDevice,
            TRUE );                             // (3)

        IoMapTransfer(
            pDE->AdapterObject,
            Irp->MdlAddress,
            pDE->MapRegisterBase,
            pDE->TransferVA,
            &pDE->TransferSize,
            pDE->WriteToDevice );               // (4)

        //
        // Start the device
        //
        XxWriteControl(
            pDE,
            XX_CTL_INTENB |
            XX_CTL_DMA_GO );

        return KeepObject;                      // (5)
    }

(1) The I/O Manager gets this IRP pointer from the CurrentIrp field of the
Device object. Normally, this field gets set when your driver uses
IoStartPacket or IoStartNextPacket to call a standard Start I/O routine. If
your driver doesn't have a Start I/O routine, it's up to you to make sure that
the CurrentIrp field gets set before you call IoAllocateAdapterChannel.
Or, you'll have to have some other way of getting the IRP pointer (and its
associated MDL address) into the Adapter Control routine.

(2) Save the value of the MapRegisterBase argument for use by later parts of
the driver.

(3) Flush any processor caches that might be holding parts of the DMA
buffer. This is a no-op on CPUs that handle their own cache coherency.
Notice the perverse way that the direction argument for this function is
TRUE for a read. Other I/O Manager functions use TRUE for write
requests.

(4) Set up the system DMA channel associated with the device.

(5) Return a value of KeepObject in order to retain ownership of the Adapter
object until the whole buffer has been transferred.
XxIsr
This function processes interrupts from the device. Normally, there will be a
single interrupt at the end of each partial transfer, or when an error
occurs.

    BOOLEAN
    XxIsr(
        IN PKINTERRUPT Interrupt,
        IN PVOID ServiceContext
        )
    {
        PDEVICE_EXTENSION pDE = ServiceContext;
        PDEVICE_OBJECT DeviceObject = pDE->DeviceObject;
        UCHAR Status = XxReadStatus( pDE );
        UCHAR Control;

        //
        // See if this device requested an interrupt
        //
        if(( Status & XX_STS_IRQ ) == 0 )
            return FALSE;

        Control = XxReadControl( pDE );         // (1)
        Control &= ~( XX_CTL_INTENB |
                      XX_CTL_DMA_GO );
        XxWriteControl( pDE, Control );

        pDE->DeviceStatus = Status;             // (2)

        IoRequestDpc(
            DeviceObject,
            DeviceObject->CurrentIrp,
            (PVOID)pDE );                       // (3)

        return TRUE;
    }
(1) When an interrupt arrives, issue some device-specific commands to
acknowledge the interrupt and prevent any further ones from coming in.

(2) Save the status of the hardware so that the DpcForIsr routine can figure
out whether the transfer was successful.

(3) There's not much more that can be done up at DIRQL. Issue a DPC
request and let the rest of the work happen at DISPATCH_LEVEL IRQL.
XxDpcForIsr
This function executes after the Interrupt Service routine runs. It either
sets up the next partial transfer or it completes the current request and
starts the next one.

    VOID
    XxDpcForIsr(
        IN PKDPC Dpc,
        IN PDEVICE_OBJECT DeviceObject,
        IN PIRP Irp,
        IN PVOID Context
        )
    {
        PDEVICE_EXTENSION pDE = Context;
        ULONG MapRegsNeeded;
        PMDL Mdl = Irp->MdlAddress;

        IoFlushAdapterBuffers(
            pDE->AdapterObject,
            Mdl,
            pDE->MapRegisterBase,
            pDE->TransferVA,
            pDE->TransferSize,
            pDE->WriteToDevice );               // (1)

        if( !XX_STS_OK( pDE->DeviceStatus ))    // (2)
        {
            IoFreeAdapterChannel( pDE->AdapterObject );

            Irp->IoStatus.Status =
                STATUS_DEVICE_DATA_ERROR;
            Irp->IoStatus.Information =
                pDE->BytesRequested -
                pDE->BytesRemaining;
            //
            // Complete this request and
            // start the next
            //
            IoCompleteRequest( Irp, IO_NO_INCREMENT );
            IoStartNextPacket( DeviceObject, FALSE );
            return;
        }

        pDE->BytesRemaining -= pDE->TransferSize;

        if( pDE->BytesRemaining > 0 )           // (3)
        {
            //
            // Update the pointer and try to
            // do all of it in one operation
            //
            pDE->TransferVA += pDE->TransferSize;
            pDE->TransferSize = pDE->BytesRemaining;

            MapRegsNeeded =
                ADDRESS_AND_SIZE_TO_SPAN_PAGES(
                    pDE->TransferVA,
                    pDE->TransferSize );
            //
            // If the remainder of the buffer is more
            // than we can handle in one I/O, reduce
            // our expectations.
            //
            if( MapRegsNeeded > pDE->MapRegisterCount )
            {
                MapRegsNeeded = pDE->MapRegisterCount;
                pDE->TransferSize =
                    MapRegsNeeded * PAGE_SIZE -
                    BYTE_OFFSET( pDE->TransferVA );
            }

            IoMapTransfer(
                pDE->AdapterObject,
                Mdl,
                pDE->MapRegisterBase,
                pDE->TransferVA,
                &pDE->TransferSize,
                pDE->WriteToDevice );           // (4)

            XxWriteControl(
                pDE,
                XX_CTL_INTENB |
                XX_CTL_DMA_GO );
        }
        else                                    // (5)
        {
            IoFreeAdapterChannel( pDE->AdapterObject );
            Irp->IoStatus.Status = STATUS_SUCCESS;
            Irp->IoStatus.Information =
                pDE->BytesRequested;

            IoCompleteRequest( Irp, IO_DISK_INCREMENT );    // (6)
            IoStartNextPacket( DeviceObject, FALSE );
        }
    }

(1) Flush any data out of the Adapter object's cache. On platforms with DMA
address limitations (ISA buses, for example), this may result in data being
copied from place to place in memory.

(2) Check for device errors. This driver simply fails the IRP if an error
occurred. A real driver might retry the operation some number of times
before failing it.

(3) At this point, the driver can assume the previous operation was a success.
It checks to see if there are any bytes left in the buffer, and if there are, it
sets up the next partial transfer. The logic here is similar to what goes on
in the Start I/O routine: Try to transfer all the remaining bytes, or as
much as the Adapter object can handle, whichever is less.

(4) Set up the system DMA controller for the next partial transfer, then start
the device.

(5) This else clause executes when the entire user buffer has been transferred.
It simply completes the IRP and starts the next one.

(6) Pick a priority-boost value that's appropriate for your device. Slower
devices can probably get by with IO_DISK_INCREMENT, while faster
hardware may need a heftier boost.
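If you want to predict how many device operations (and therefore interrupts) a given request will need, you can replay the splitting logic in user mode. The sketch below is a model only: chunk_size and count_partial_transfers are invented names, a 4 KB page is assumed, and the loop stands in for the sequence of Start I/O and DpcForIsr passes.

```c
#include <assert.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u   /* assumed page size */

/* Bytes one device operation can move, starting at va, when at most
 * map_regs mapping registers are available. */
static uint32_t chunk_size(uintptr_t va, uint32_t remaining,
                           uint32_t map_regs)
{
    uint32_t offset = (uint32_t)(va & (PAGE_SIZE_BYTES - 1));
    uint32_t pages  = (offset + remaining + PAGE_SIZE_BYTES - 1)
                      / PAGE_SIZE_BYTES;

    if (pages > map_regs)
        return map_regs * PAGE_SIZE_BYTES - offset;
    return remaining;
}

/* Number of partial transfers needed to move the whole buffer. */
uint32_t count_partial_transfers(uintptr_t va, uint32_t total,
                                 uint32_t map_regs)
{
    uint32_t remaining  = total;
    uint32_t operations = 0;

    assert(map_regs > 0);
    while (remaining > 0) {
        uint32_t size = chunk_size(va, remaining, map_regs);
        va        += size;      /* same update the DPC makes to TransferVA */
        remaining -= size;      /* ...and to BytesRemaining                */
        operations++;
    }
    return operations;
}
```

For a 20,000-byte buffer that begins 256 bytes into a page, three mapping registers give two operations (12,032 bytes, then 7,968). With a single mapping register the same request takes five.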

12.5 WRITING A PACKET-BASED BUS MASTER DMA DRIVER
In packet-based bus master DMA, the device transfers data to or from the
locked-down pages of the caller's buffer using DMA hardware that's part of
the device itself. Depending on the capabilities of the device, it might be
providing its own scatter/gather support as well.

The architecture of a packet-based bus master driver is almost identical to
that of a driver for a slave device. The only difference is the way the driver sets up
the bus master hardware. The following subsections describe these differences.
Setting Up Bus Master Hardware
A bus master device complicates things because the system doesn't know
how to program the device's onboard DMA controller. The most the I/O
Manager can do is to give the driver two things: an address in DMA logical
space where a contiguous segment of the buffer begins and a count indicating
the number of bytes in that segment. It then becomes the driver's
responsibility to load this information into the address and length registers
of the device and start the transfer.
The function that performs this little miracle is none other than our old
friend, IoMapTransfer. When you pass NULL for its AdapterObject pointer, its
return value will be the address in DMA logical space that corresponds to the
CurrentVa and Mdl arguments. You put this logical address into the device's
address register.
Furthermore, when AdapterObject is NULL, Length becomes both an input
and output argument. On input, you ask it to map all the bytes remaining
between CurrentVa and the end of the buffer. On output, Length contains the
number of contiguous bytes starting at the logical address returned by
IoMapTransfer. This number goes into your device's count register. Figure 12.4
shows how this works.
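The in/out behavior of Length is easier to picture with a toy model. The following user-mode sketch is not the real IoMapTransfer: BufferMap is an invented stand-in for an MDL, map_segment is a hypothetical helper, pages are assumed to be 4 KB, and logical addresses are assumed to equal physical addresses.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u   /* assumed page size */

/* Hypothetical stand-in for an MDL: one physical frame number per page. */
typedef struct {
    const uint32_t *pfn;               /* frame number of each buffer page */
    size_t          pages;             /* number of entries in pfn[]       */
    uint32_t        first_page_offset; /* buffer start within first page   */
} BufferMap;

/* Model of the bus master case: `pos` is a byte index into the buffer.
 * On input *length is the number of bytes wanted; on output it is the
 * number of physically contiguous bytes available at the returned
 * address. */
uint64_t map_segment(const BufferMap *map, uint32_t pos, uint32_t *length)
{
    uint32_t byte_index = map->first_page_offset + pos;
    size_t   first_page = byte_index / PAGE_SIZE_BYTES;
    size_t   page       = first_page;
    uint32_t offset     = byte_index % PAGE_SIZE_BYTES;
    uint32_t run        = PAGE_SIZE_BYTES - offset; /* rest of first page */

    assert(first_page < map->pages && *length > 0);

    /* Extend the run while the next page is physically adjacent. */
    while (run < *length && page + 1 < map->pages &&
           map->pfn[page + 1] == map->pfn[page] + 1) {
        run += PAGE_SIZE_BYTES;
        page++;
    }
    if (run < *length)
        *length = run;

    return (uint64_t)map->pfn[first_page] * PAGE_SIZE_BYTES + offset;
}
```

A driver built on this model would load the returned address and the updated length into its device registers, then call map_segment again at pos + length after the transfer completes.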
Supporting bus master devices requires some changes to the driver's
Adapter Control and DpcForIsr routines. The following subsections contain
fragments of these routines. Compare them with the corresponding routines in
the packet-based slave DMA driver in the previous section of this chapter.

[Figure 12.4: For bus masters, IoMapTransfer scans for contiguous buffer
segments. Labels: Virtual Space; Physical Memory; buffer segments A, B, C;
Address; Length: A+B. Copyright © 1996 by Cydonix Corporation.]
Adapter Control routine
Being optimistic, the Adapter Control routine asks IoMapTransfer to map the
entire buffer at the start of the first transfer. Instead, it tells the driver
how much contiguous memory is actually available in the first segment of the
buffer.

    PHYSICAL_ADDRESS DmaAddress;

    pDE->TransferVA =
        MmGetMdlVirtualAddress( Irp->MdlAddress );

    pDE->BytesRemaining =
        MmGetMdlByteCount( Irp->MdlAddress );

    pDE->TransferSize = pDE->BytesRemaining;

    DmaAddress =
        IoMapTransfer(
            NULL,
            Irp->MdlAddress,
            pDE->MapRegisterBase,
            pDE->TransferVA,
            &pDE->TransferSize,
            pDE->WriteRequest );

    XxWriteAddress( pDE, (PUCHAR)DmaAddress.LowPart );
    XxWriteCount( pDE, pDE->TransferSize );
    XxWriteControl( XX_CTL_DMA_GO );
    return DeallocateObjectKeepRegisters;
DpcForIsr routine
After each partial transfer, the DpcForIsr routine increments the CurrentVa
pointer by the previously returned Length value. It then calls IoMapTransfer
with this updated pointer and asks to map all the bytes remaining in the
buffer. IoMapTransfer returns another logical address and a new Length value
indicating the size of the next contiguous buffer segment. This continues
until the whole buffer has been processed.

    PHYSICAL_ADDRESS DmaAddress;

    IoFlushAdapterBuffers(
        NULL,
        Irp->MdlAddress,
        pDE->MapRegisterBase,
        pDE->TransferVA,
        pDE->TransferSize,
        pDE->WriteRequest );

    pDE->BytesRemaining -= pDE->TransferSize;

    if( pDE->BytesRemaining > 0 )
    {
        pDE->TransferVA += pDE->TransferSize;
        pDE->TransferSize = pDE->BytesRemaining;

        DmaAddress =
            IoMapTransfer(
                NULL,
                Irp->MdlAddress,
                pDE->MapRegisterBase,
                pDE->TransferVA,
                &pDE->TransferSize,
                pDE->WriteRequest );

        XxWriteAddress( pDE, (PUCHAR)DmaAddress.LowPart );
        XxWriteCount( pDE, pDE->TransferSize );
        XxWriteControl( XX_CTL_DMA_GO );
    }

Hardware with Scatter/Gather Support
Some bus master devices contain multiple pairs of address and length
registers, each one describing a single contiguous buffer segment. This allows
the device to perform I/O using buffers that are scattered throughout DMA
address space. These multiple address and count registers are often referred
to as a scatter/gather list, but you can also think of these bus masters as
having their own built-in mapping registers. Figure 12.5 shows how this works.

[Figure 12.5: Some bus masters have their own scatter/gather hardware.
Labels: Virtual Space; Physical Memory; buffer segments A, B, C; two
Address/Length register pairs, one with Length: A+B and one with Length: C.
Copyright © 1996 by Cydonix Corporation.]

Before each transfer, the driver loads as many pairs of address and count
registers as there are segments in the buffer. When the device is started, it
walks through the scatter/gather list entries in sequence, filling or emptying
each segment of the buffer and then moving on to the next. When all the list
entries have been processed, the device generates an interrupt.
Building Scatter/Gather Lists with IoMapTransfer
Once again, IoMapTransfer will be used to find contiguous segments of the
DMA buffer. In this case, however, the driver will call it several times before each
data transfer operation - once for each entry in the hardware scatter/gather list.
These fragments of an Adapter Control and a DpcForIsr routine show how it's
done.
Adapter Control routine
Before the first transfer operation, the Adapter Control routine loads the
hardware scatter/gather list and starts the device. The remainder of the
buffer will be handled by the ISR and DpcForIsr routines.

    PHYSICAL_ADDRESS DmaAddress;
    ULONG BytesLeftInBuffer;
    ULONG SegmentSize;
    PUCHAR SegmentVA;

    pDE->TransferVA =
        MmGetMdlVirtualAddress( Irp->MdlAddress );

    pDE->BytesRemaining =
        MmGetMdlByteCount( Irp->MdlAddress );

    pDE->TransferSize = 0;

    BytesLeftInBuffer = pDE->BytesRemaining;
    SegmentVA = pDE->TransferVA;

    XxClearSgList( pDE );

    while( pDE->AvailableSgEntries > 0 &&
           BytesLeftInBuffer > 0 )
    {
        SegmentSize = BytesLeftInBuffer;

        //
        // Map from the start of the current
        // segment, not the start of the transfer
        //
        DmaAddress =
            IoMapTransfer(
                NULL,
                Irp->MdlAddress,
                pDE->MapRegisterBase,
                SegmentVA,
                &SegmentSize,
                pDE->WriteRequest );

        XxAddToSgList(
            pDE,
            DmaAddress.LowPart,
            SegmentSize );

        pDE->TransferSize += SegmentSize;
        SegmentVA += SegmentSize;
        BytesLeftInBuffer -= SegmentSize;
        pDE->AvailableSgEntries--;
    }

    XxWriteControl( XX_CTL_DMA_GO );
    return DeallocateObjectKeepRegisters;
DpcForIsr routine
After each transfer is finished, the ISR issues a DPC request. The DpcForIsr
routine flushes the previous request, and if there are more bytes left to
transfer, it rebuilds the scatter/gather list.

    PHYSICAL_ADDRESS DmaAddress;
    ULONG BytesLeftInBuffer;
    ULONG SegmentSize;
    PUCHAR SegmentVA;

    IoFlushAdapterBuffers(
        NULL,
        Irp->MdlAddress,
        pDE->MapRegisterBase,
        pDE->TransferVA,
        pDE->TransferSize,
        pDE->WriteRequest );

    pDE->BytesRemaining -= pDE->TransferSize;

    if( pDE->BytesRemaining > 0 )
    {
        pDE->TransferVA += pDE->TransferSize;
        pDE->TransferSize = 0;

        BytesLeftInBuffer = pDE->BytesRemaining;
        SegmentVA = pDE->TransferVA;

        XxClearSgList( pDE );

        while( pDE->AvailableSgEntries > 0 &&
               BytesLeftInBuffer > 0 )
        {
            SegmentSize = BytesLeftInBuffer;

            //
            // Map from the start of the current
            // segment, not the start of the transfer
            //
            DmaAddress =
                IoMapTransfer(
                    NULL,
                    Irp->MdlAddress,
                    pDE->MapRegisterBase,
                    SegmentVA,
                    &SegmentSize,
                    pDE->WriteRequest );

            XxAddToSgList(
                pDE,
                DmaAddress.LowPart,
                SegmentSize );

            pDE->TransferSize += SegmentSize;
            SegmentVA += SegmentSize;
            BytesLeftInBuffer -= SegmentSize;
            pDE->AvailableSgEntries--;
        } // end while

        XxWriteControl( XX_CTL_DMA_GO );
    }
    else
    {
        IoFreeMapRegisters( ... );
        IoCompleteRequest( ... );
        IoStartNextPacket( ... );
    }
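The list-building loop can be exercised outside the kernel with a small model. Treat the following as a sketch under stated assumptions: SgEntry, SG_ENTRIES, contiguous_bytes, and build_sg_list are all invented for illustration, the buffer is assumed to start on a page boundary, pages are 4 KB, and an array of physical frame numbers plays the part of the MDL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE_BYTES 4096u   /* assumed page size */
#define SG_ENTRIES      2       /* hypothetical size of the hardware list */

typedef struct {
    uint64_t address;           /* start of one contiguous segment */
    uint32_t length;            /* bytes in that segment           */
} SgEntry;

/* Contiguous bytes available at buffer position `pos`, capped at `wanted`. */
static uint32_t contiguous_bytes(const uint32_t *pfn, size_t pages,
                                 uint32_t pos, uint32_t wanted)
{
    size_t   page = pos / PAGE_SIZE_BYTES;
    uint32_t run  = PAGE_SIZE_BYTES - pos % PAGE_SIZE_BYTES;

    while (run < wanted && page + 1 < pages &&
           pfn[page + 1] == pfn[page] + 1) {
        run += PAGE_SIZE_BYTES;
        page++;
    }
    return run < wanted ? run : wanted;
}

/* Fill `list` with up to SG_ENTRIES segments starting at *pos, advance
 * *pos past the bytes covered, and return the number of entries used. */
size_t build_sg_list(const uint32_t *pfn, size_t pages,
                     uint32_t total_bytes, uint32_t *pos,
                     SgEntry list[SG_ENTRIES])
{
    size_t used = 0;

    assert(total_bytes <= pages * PAGE_SIZE_BYTES);
    while (used < SG_ENTRIES && *pos < total_bytes) {
        uint32_t seg = contiguous_bytes(pfn, pages, *pos,
                                        total_bytes - *pos);

        list[used].address =
            (uint64_t)pfn[*pos / PAGE_SIZE_BYTES] * PAGE_SIZE_BYTES
            + *pos % PAGE_SIZE_BYTES;
        list[used].length = seg;
        *pos += seg;            /* next segment starts where this ended */
        used++;
    }
    return used;
}
```

Each call corresponds to one device operation: the first call plays the part of the Adapter Control routine, and later calls play the part of the DpcForIsr routine rebuilding the list. Notice that the position advances segment by segment, which is what keeps successive list entries from describing the same memory.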

12.6 WRITING A COMMON BUFFER SLAVE DMA DRIVER
In common buffer slave DMA, the device transfers data to or from a contiguous
buffer in nonpaged pool using a system DMA channel. Although originally
intended for devices that use the system DMA controller's autoinitialize mode,
common buffers can also improve throughput for some types of ISA-based slave
devices.

Allocating a Common Buffer
Memory for a common buffer has to be physically contiguous and visible in
the DMA logical space of a specific device. To guarantee that both these
conditions are met, you use the HalAllocateCommonBuffer function described in
Table 12.8 to allocate memory for the buffer.
Notice the CacheEnabled argument to this function. It's usually a good idea
to request non-cached memory for the common buffer since it eliminates the need
to call KeFlushIoBuffers. On some platforms, this can improve the performance
of both your driver and the system.

Table 12.8  Prototype for HalAllocateCommonBuffer

PVOID HalAllocateCommonBuffer          IRQL == PASSIVE_LEVEL

Parameter                              Description
-------------------------------------  --------------------------------------
IN PADAPTER_OBJECT AdapterObject       Adapter object associated with DMA
                                       device
IN ULONG Length                        Requested size of buffer in bytes
OUT PPHYSICAL_ADDRESS LogicalAddress   Address of the common buffer in the
                                       DMA controller's logical space
IN BOOLEAN CacheEnabled                • TRUE - memory is cacheable by the CPU
                                       • FALSE - memory is not cached
Return value                           • Non-NULL - system VA of common buffer
                                       • NULL - error

In the case of common buffer slave DMA, you'll need to build an MDL for
the buffer.[4] This MDL is a required argument for IoMapTransfer and
IoFlushAdapterBuffers. To set up the MDL, call IoAllocateMdl followed by
MmBuildMdlForNonPagedPool. When your driver unloads, call IoFreeMdl to
release the memory used for the MDL.

[4] The MDL is unnecessary if you plan to use the common buffer for bus
master DMA.

Using Common Buffer Slave DMA to Maintain Throughput
Common buffer slave DMA is useful if a driver can't afford to have
IoMapTransfer copy a DMA buffer from one place to another during a data
transfer. On ISA buses, this kind of copying is always a possibility with
packet-based DMA. Since common buffers are guaranteed to be accessible by
their associated DMA devices, there's never any danger of IoMapTransfer
moving data from one place to another.
For example, drivers of some ISA-based tape drives need to maintain very
high throughput if they want to keep the tape streaming. They won't be able to do
this if a buffer copy happens during a call to IoMapTransfer. To prevent this, the
driver uses a ring of common buffers for the actual DMA operation. Other, less
time-critical portions of the driver move data between these common buffers and
the actual user buffers.
To see how this might work, let's consider the operation of a driver for a
hypothetical ISA output device. To maintain a high DMA data rate, it uses a series
of common buffers that are shared between the driver's Dispatch and DpcForIsr
routines. The Dispatch routine copies user-output data into an available common
buffer and attaches the buffer to a queue of pending DMA requests. Once a DMA
is in progress, the DpcForIsr removes buffers from the queue and processes them
as fast as it can. Figure 12.6 shows the organization of this driver, and the
subsections below describe various driver routines.

[Figure 12.6: Using common buffers allows some ISA drivers to maintain higher
throughput. Boxes: Dispatch (allocate buffer; RtlMoveMemory; add buffer to
queue; if idle, IoAllocateAdapterChannel); Adapter Control (IoMapTransfer
first buffer; start device); Interrupt Service (IoRequestDpc); DpcForIsr
(release current buffer; IoMapTransfer next buffer; start device); Common
Buffers; Free. Copyright © 1996 by Cydonix Corporation.]
DriverEntry routine
As always, the DriverEntry routine has to find and allocate the driver's
hardware. Along with its usual responsibilities, DriverEntry also does the
following:

1. When it creates its Device object, it sets the DO_BUFFERED_IO bit in the
   Flags field. Although the underlying common buffers will be processed using
   DMA, the user data will initially be copied into system-space buffers.

2. DriverEntry initializes two queues in the Device Extension. One holds a list
   of free common buffers. The other is for work requests in progress.

3. Next, it creates separate spin locks to guard each queue. The spin lock for the
   work list also protects a flag in the Device Extension called DmaInProgress.

4. Then, DriverEntry calls HalGetAdapter to find the Adapter object associated
   with its device. It uses the count of mapping registers returned by this
   function to determine the size of its common buffers.

5. It allocates some number of common buffers and adds them to the free list in
   the Device Extension. (As an implementation detail, some of the space in each
   common buffer is used for a linked-list pointer, a pointer to the IRP associated
   with this request, and a pointer to the MDL for the common buffer.) For each
   buffer, it also calls IoAllocateMdl and MmBuildMdlForNonPagedPool to
   create an MDL.

6. Finally, DriverEntry initializes a Semaphore object and sets its initial count to
   the number of common buffers it has just created.

Dispatch routine
The Dispatch routine of this driver works differently than the ones you've
seen so far. Since the driver has no Start I/O routine, the Dispatch routine
is actually responsible for queuing or starting each request. This is what the
Dispatch routine does to process an output request:

1. It calls KeWaitForSingleObject to wait for the Semaphore object associated
   with the driver's list of free buffers. The thread issuing the call will freeze
   until there's at least one buffer in the queue.[5]

2. The Dispatch routine removes an available common buffer from the free list
   and (since we're only considering outputs here) uses RtlMoveMemory to fill
   it with data from the user's buffer.

3. It prevents the I/O Manager from completing the request by calling
   IoMarkIrpPending.

4. Next, it acquires the spin lock associated with the queue of active requests. As
   a side-effect, acquiring the spin lock raises IRQL up to DISPATCH_LEVEL.
   After it owns the spin lock, the Dispatch routine adds the new request to the
   list of buffers to be output.

5. Still holding the spin lock, the Dispatch routine checks an internal
   DmaInProgress flag to see if other parts of the driver are already doing an
   output. If the flag is TRUE, it simply releases the spin lock. If the flag is
   FALSE, the Dispatch routine sets it to TRUE and calls
   IoAllocateAdapterChannel to start the device. It then releases the spin lock.

6. Finally, it returns a value of STATUS_PENDING.

At this point, the work request for this buffer has been either started or
queued. The next phase of the transfer will take place after the device generates
an interrupt.

[5] Chapter 14 will explain how to use Semaphore objects. If you're familiar
with Win32 programming, you already have a good idea of how they work.

Adapter Control routine
If the device was idle, the Adapter Control routine is called to get it going.
This is what it does:

1. It removes the first request from the work queue and saves its address in the
   Device Extension as the current request.

2. Next, the Adapter Control routine saves the value of the MapRegisterBase
   argument in the Device Extension for later use.

3. It then calls IoMapTransfer to load the system DMA controller with the
   address of the current request's common buffer.

4. Finally, the Adapter Control routine starts the device and returns a value of
   KeepObject.

Once the driver owns the Adapter object, it will hold on to it as long as there
are work requests in the queue.
Interrupt Service routine
As with packet-based DMA, the ISR in a common-buffer driver for a slave device
just saves hardware status in the Device Extension. It then calls IoRequestDpc
to continue processing at DISPATCH_LEVEL IRQL.

DpcForIsr routine
In this driver, the DpcForIsr routine sets up each additional work request
after the first. Here's how it works:

1. It calls IoFlushAdapterBuffers to flush any data from the system DMA
   controller's hardware cache.

2. The DpcForIsr routine tries to remove the next I/O request from the work
   queue. If there is another request, the driver makes it the new current
   request, maps its buffer with IoMapTransfer, and starts the device. On the
   other hand, if the work queue is empty, the driver calls
   IoFreeAdapterChannel to release the Adapter object and clears the
   DmaInProgress flag in the Device Extension.

3. Next, it puts appropriate status information in the IRP for the just-completed
   request and calls IoCompleteRequest to give it back to the I/O Manager.

4. Finally, the DpcForIsr routine puts the just-completed common buffer back in
   the free list and calls KeReleaseSemaphore to increment the count of available
   buffers.

Each completed DMA operation causes another interrupt that brings the
driver back through the DpcForIsr routine. This loop continues until all the
requests in the work queue have been processed.
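The interplay between the free list, the work queue, and the DmaInProgress flag is easier to follow as a state machine. Here is a single-threaded, user-mode sketch of that bookkeeping. It is only a model: RingModel, BUFFER_COUNT, and the ring_* functions are invented names, there are no real locks or semaphores, and a dispatch call that would block in the real driver simply returns false.

```c
#include <assert.h>
#include <stdbool.h>

#define BUFFER_COUNT 3          /* hypothetical number of common buffers */

typedef struct {
    int  free_count;            /* plays the role of the semaphore count */
    int  queued;                /* requests waiting in the work queue    */
    bool dma_in_progress;       /* models the DmaInProgress flag         */
    int  device_starts;         /* how many times the device was started */
    int  completed;             /* how many requests were completed      */
} RingModel;

void ring_init(RingModel *r)
{
    r->free_count      = BUFFER_COUNT;
    r->queued          = 0;
    r->dma_in_progress = false;
    r->device_starts   = 0;
    r->completed       = 0;
}

/* Dispatch: take a free buffer, queue the request, and start the device
 * if it is idle. Returns false where the real driver would wait. */
bool ring_dispatch(RingModel *r)
{
    if (r->free_count == 0)
        return false;
    r->free_count--;            /* remove a buffer from the free list */
    r->queued++;                /* add the request to the work queue  */
    if (!r->dma_in_progress) {
        r->dma_in_progress = true;
        r->queued--;            /* Adapter Control takes the first request */
        r->device_starts++;
    }
    return true;
}

/* DpcForIsr: one transfer finished; start the next or go idle. */
void ring_dpc(RingModel *r)
{
    assert(r->dma_in_progress);
    if (r->queued > 0) {
        r->queued--;            /* next request becomes current */
        r->device_starts++;
    } else {
        r->dma_in_progress = false;   /* Adapter object is released */
    }
    r->completed++;             /* complete the finished request     */
    r->free_count++;            /* buffer goes back to the free list */
}
```

Only the first request after an idle period goes through the Adapter Control path. Every later request is started from ring_dpc, which mirrors the way the DpcForIsr routine keeps the device busy as long as the work queue is not empty.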
Unload routine
When a common buffer slave driver is unloaded, it first needs to stop the
device from trying to use the buffers. Once the device is silent, the Unload
routine calls HalFreeCommonBuffer to release the memory associated with the
ring of buffers. It also calls IoFreeMdl to release memory used for each
buffer's MDL.

12.7 WRITING A COMMON-BUFFER BUS MASTER DMA DRIVER
In common-buffer bus master DMA, the device transfers data to or from a
contiguous nonpaged pool buffer using a DMA controller that's part of the
device itself. Frequently, this kind of hardware will treat the common buffer
as a mailbox for exchanging control and status messages with the driver.

How Common-Buffer Bus Master DMA Works
The exact operation of a common-buffer bus master driver will depend on
the whims of the hardware designer. The description that follows is based on a
typical architecture. It assumes the device uses one mailbox for commands and
another to return status information. Figure 12.7 illustrates this arrangement.

DriverEntry routine
The DriverEntry routine does the following to set up a common buffer:

1. It calls HalGetAdapter to find an Adapter object for the device.

2. DriverEntry next calls HalAllocateCommonBuffer to get a block of
   contiguous, nonpaged memory that both the driver and the device can access.
   It usually simplifies things if the common buffer is allocated from
   non-cached memory.

3. It stores the virtual address of the common buffer in the Device Extension for
   later use.

4. DriverEntry also makes the device itself aware of the common buffer. This
   usually means storing the logical address and size of the buffer in a pair of
   device control registers.

[Figure 12.7: The driver and the device exchange messages using a common
buffer. Labels: Driver; Length; Status Mailbox. Copyright © 1996 by Cydonix
Corporation.]

Start I/O routine
When it wants to send a command to the device, the Start I/O routine does the
following:

1. It builds a command structure in the common buffer using the virtual address
   stored in the Device Extension.

2. If DriverEntry specified TRUE for the CacheEnabled parameter of
   HalAllocateCommonBuffer, Start I/O needs to call KeFlushIoBuffers to force
   data from the CPU's cache out to physical memory.

3. Finally, Start I/O sets a bit in a device control register to notify the device that
   there is a command waiting for it.

In response to the notification bit being set, the device begins processing the
command in the common buffer.
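One detail worth making concrete is that the driver and the device refer to the same mailboxes through different addresses: the driver uses the system virtual address of the common buffer, while the device uses its logical address. The offsets within the buffer are the same in both views. The layout below is purely hypothetical (real hardware defines its own mailbox format), and mailbox_logical_address is an invented helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mailbox formats; a real device dictates its own. */
typedef struct {
    uint32_t opcode;            /* what the device should do          */
    uint32_t data_address;      /* logical address of the data        */
    uint32_t data_length;       /* number of bytes to transfer        */
} CommandMailbox;

typedef struct {
    uint32_t status;            /* completion code from the device    */
    uint32_t bytes_transferred; /* bytes actually moved               */
} StatusMailbox;

/* The whole common buffer: one command mailbox, one status mailbox. */
typedef struct {
    CommandMailbox command;
    StatusMailbox  status;
} CommonBuffer;

/* Device-visible address of the field at `offset` within the common
 * buffer, given the buffer's logical base address. */
uint32_t mailbox_logical_address(uint32_t buffer_logical_base,
                                 size_t offset)
{
    assert(offset < sizeof(CommonBuffer));
    return buffer_logical_base + (uint32_t)offset;
}
```

With this layout, a driver would fill in the command mailbox through its virtual address, and the device would find the status mailbox at the logical base plus offsetof(CommonBuffer, status).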
Interrupt Service routine
When the device has finished processing the command in the common buffer, it
puts a message in the status mailbox and generates an interrupt. In response
to this interrupt, the driver's Interrupt Service routine does the following:

1. It copies the contents of the status mailbox into various fields of the Device
   Extension.

2. If necessary, the ISR sets another bit in the device control register to
   acknowledge that it has read the status message.

3. It calls IoRequestDpc to continue processing the request at a lower IRQL.

Unload routine
When a common-buffer bus master driver is unloaded, it first needs to stop the
device from trying to use the buffer. Once the device is silent, the Unload
routine calls HalFreeCommonBuffer to release the memory associated with the
buffer.

12.8 SUMMARY
Without a doubt, drivers for DMA devices are more complicated than drivers for
programmed I/O hardware. In return for this added complexity, the system
achieves greater throughput by overlapping CPU activity with data transfers. The
I/O Manager tries to simplify things by providing a generic framework in which
to perform DMA. This chapter has presented the details of NT's abstract DMA
model and shown how to perform various styles of DMA.
So far, we've been assuming that things have gone well during device
operations. But suppose something terrible happens? Something so terrible, in
fact, that you think the system administrator should hear about it. In the
next chapter, you'll see how to add error-logging capabilities to a driver.

CHAPTER 13

Logging Device Errors

System administrators are a nervous and paranoid lot. Like small mammals in
the Jurassic period, they scurry about - imagining the worst and waiting for
it to happen. Adding to their anxiety may seem
cruel, but if you're writing a commercial-quality driver, you really should tell
someone when serious hardware and software errors occur. This chapter explains
how to generate these notifications using NT's event-logging mechanism.

13.1 EVENT LOGGING IN WINDOWS NT
Built into Windows NT is a mechanism that allows software components to keep
a record of interesting events. This event-logging capability can help you monitor
the behavior of a piece of software that's under development. It can also give
support personnel crucial information once the software is out in the field. The
remainder of this section presents guidelines for deciding what information to log
and then describes how event logging works.
Deciding What to Log
For the most part, error logging is something that's best done by lowest-level
device drivers. Higher-level drivers usually don't have anything to say that's
worth putting in the log file, except possibly startup and shutdown notifications.
There are several kinds of events that a device driver might log:
299

Chapter 13

300

Logging Device Errors

•

Hard device errors that result in an IRP failing

•

Soft errors that are corrected after some number of retries

•

Device timeouts

•

Driver startup and shutdown

Along with various pieces of standard information, you're allowed to add
your own data to the messages in the event log. Useful items to include are
•

The contents of any device control or status registers that might indicate
the cause of the problem

•

Any fields from the Device or Controller Extension that indicate the state
of the driver when the error occurred

•

Any additional information about the request that would help with the
diagnosis. For example, logging the transfer size might lead you to dis­
cover that large requests always fail.

Two points are worth mentioning. First, don't get carried away with the idea
of adding driver-specific data to event-log messages. The amount of space
available for private data in a kernel-mode event-log message is rather limited. So,
stick to the essentials and only add things to your log packets that will be of true
diagnostic value.
Second, hardware that's on its last legs can generate a lot of error messages
as it fails and can easily overwhelm the log file. It's important to have some
strategy for dealing with this situation. For example, you might keep track of how
many messages a device is generating, and if it exceeds some threshold, reduce
the level of detail reported by your driver.
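The threshold idea can be sketched in a few lines of portable C. This is only an illustration, not code from the sample driver: the XX_LOG_THROTTLE structure, the XX_DETAIL_xxx levels, and xx_next_detail_level are all hypothetical names, and the sketch assumes the caller already serializes access to the counter (in a real driver, the same lock that protects the rest of the Device Extension would probably do).

```c
#include <assert.h>

/* Hypothetical detail levels for this sketch; not part of any NT header. */
enum { XX_DETAIL_NONE = 0, XX_DETAIL_BRIEF = 1, XX_DETAIL_FULL = 2 };

/* Hypothetical per-device throttle state. */
typedef struct {
    unsigned long messages_logged;   /* messages generated so far        */
    unsigned long brief_threshold;   /* above this, drop to brief detail */
    unsigned long quiet_threshold;   /* above this, stop logging         */
} XX_LOG_THROTTLE;

/* Count one more message and report how much detail it deserves. */
static int xx_next_detail_level(XX_LOG_THROTTLE *t)
{
    assert(t != 0);
    assert(t->brief_threshold <= t->quiet_threshold);

    t->messages_logged++;
    if (t->messages_logged > t->quiet_threshold)
        return XX_DETAIL_NONE;       /* device is flooding the log  */
    if (t->messages_logged > t->brief_threshold)
        return XX_DETAIL_BRIEF;      /* skip dump data and strings  */
    return XX_DETAIL_FULL;
}
```

A driver using something like this would call xx_next_detail_level just before building a log packet and skip the optional dump data (or the whole packet) according to the result. Whether and when to reset the counter is a policy decision the sketch leaves open.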
How Event Logging Works
The developers of Windows NT had several goals for the event-logging
architecture. The first was to provide application programs, drivers, and the operating
system with a unified framework for recording information. This framework
includes a simple yet flexible standard for the binary format of event-log entries.
Another goal was to give system administrators an easy way to view these
messages. As part of this goal, viewing utilities must be able to display event
messages in the currently selected national language. Under the American version of
NT, the message text should appear in English, while the French version of NT
should display French text. Figure 13.1 shows how it all works.
The following describes what happens when a kernel-mode driver decides
to log an error. The process is similar for a user-mode Win32 application, although
the specific API calls are different. 1

1 The data-collection DLL in Chapter 18 contains an example of using the Win32 event-logging API.

[Figure 13.1: NT event-logging components (Driver, Logging Thread, Event Log
File, Message File)]

1. All event messages take the form of packets in Windows NT. When a
   kernel-mode driver wants to log an event, it first calls the I/O Manager to
   allocate a message packet from nonpaged pool.
2. The driver fills in this packet with various pieces of descriptive information.
   One of the key items is a 32-bit message code number that identifies the text
   to be displayed for this packet. Once the packet's ready, the driver gives it
   back to the I/O Manager.
3. The I/O Manager takes the message packet and sends it to the system
   event-logging thread. This thread accumulates packets and periodically writes
   them to the proper event-log file. 2
4. The Event Viewer utility reads binary packets from the log files. To translate a
   packet's 32-bit message code into text, the Viewer goes to the Registry. There
   it finds the path names of one or more message files associated with the
   packet. These message files contain the actual message text (possibly in
   multiple languages) which the Viewer displays.

13.2 WORKING WITH MESSAGES
As you've just seen, your driver doesn't include the actual text for its messages in
an event-log entry. Instead, it identifies messages using code numbers. The text
associated with these code numbers takes the form of a message resource stored
somewhere on disk. This section describes how these message codes work and
explains how to generate your own message resources.

2 If the system crashes before a group of log packets have been written out, you can still see them by
using WINDBG's !errlog command. See Chapter 17 for more details.
How Message Codes Work
The code number identifying a specific message is a 32-bit value consisting
of several fields. Figure 13.2 shows the layout of a message code.
Table 13.1 gives a little more detail about the meaning of each of these fields.
Although you'll probably never need to decode these fields on sight, it's always
nice to be able to impress your friends.

[Figure 13.2: Layout of a message-code number. From the high-order end:
Severity (bits 31-30), Customer (bit 29), Facility (bits 28-16), Error Code
(bits 15-0)]

Table 13.1 The meaning of message-code fields

Message-code fields
Field       Bits     Description
Code        0-15     Code number identifying the error
Facility    16-28    Software component generating the message
Customer    29       If set, this is a customer-generated (non-Microsoft)
                     message
Severity    30-31    One of the following:
                     • 0 - success
                     • 1 - information
                     • 2 - warning
                     • 3 - error

The I/O Manager provides a number of standard messages that your driver
can use. The header file, NTIOLOGC.H, defines symbolic names for these
message codes, all of which begin with IO_ERR_ (for example, IO_ERR_TIMEOUT or
IO_ERR_NOT_READY). Browse through this header file for a complete list of
standard messages.
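If you ever do need to pick a message code apart, the field boundaries in Table 13.1 translate directly into shifts and masks. The following is a minimal sketch in portable C; the XX_MSG_xxx macros and xx_make_message_code are hypothetical helpers written for this illustration and do not come from any NT header.

```c
#include <assert.h>
#include <stdint.h>

/* Field extractors that follow the bit layout given in Table 13.1. */
#define XX_MSG_SEVERITY(code)  (((uint32_t)(code) >> 30) & 0x3u)     /* bits 30-31 */
#define XX_MSG_CUSTOMER(code)  (((uint32_t)(code) >> 29) & 0x1u)     /* bit 29     */
#define XX_MSG_FACILITY(code)  (((uint32_t)(code) >> 16) & 0x1FFFu)  /* bits 16-28 */
#define XX_MSG_CODE(code)      ((uint32_t)(code) & 0xFFFFu)          /* bits 0-15  */

/* Assemble a 32-bit message code from its four fields. */
static uint32_t xx_make_message_code(uint32_t severity, uint32_t customer,
                                     uint32_t facility, uint32_t code)
{
    assert(severity <= 3u && customer <= 1u);
    assert(facility <= 0x1FFFu && code <= 0xFFFFu);
    return (severity << 30) | (customer << 29) | (facility << 16) | code;
}
```

For example, a warning (severity 2) with the Customer bit set, facility 7, and code 6 assembles to 0xA0070006, and the extractor macros recover the same four values from that number.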
If you want to use these standard messages, you have to add your driver
to the list of event-logging system components in the Registry. You also have
to identify the file where the text for these messages is located
(%SystemRoot%\SYSTEM32\IOLOGMSG.DLL). The procedure for doing this is
described a little later in this chapter.
If the standard messages don't meet all your needs, you can supplement
them with driver-defined messages. To do this, you need to follow these steps:
1. Write a message definition file that associates your message codes with
   specific text strings.
2. Compile this file using the message compiler (MC) utility.
3. Incorporate the message resources generated by MC into your driver.
4. Register your driver as an event-logging system component and identify the
   driver executable as the file containing the text for these private messages.

Writing Message Definition Files
To use the MC utility, you first need to write a definition file describing all
your messages. This definition file is divided into two major sections.
Header section Keywords in the header define names for values that will
be used in the actual message definitions. Table 13.2 contains the keywords that
you can use in the header section of a message definition file.
Message section This portion of the message definition file contains the
actual text of the messages. Each message begins with the keywords listed in
Table 13.3.

Table 13.2 Keywords used in the header section of a message definition file

Header section keywords
Keyword                                    Description
MessageIdTypedef = DataType                Typecast applied to all message codes
SeverityNames = ( name=number[:name] )     Up to four severity values used in
                                           the Message section
FacilityNames = ( name=number[:name] )     Facility names used in the Message
                                           section
LanguageNames = ( name=number:filename )   Language names used in Message
                                           section

Table 13.3 Keywords used in the message section of a message definition file

Message section keywords
Keyword                            Description
MessageId = [number | +number]     16-bit value assigned to this message*
Severity = SeverityName            Severity level of this message
Facility = FacilityName            Facility generating the message
SymbolicName = SymbolName          Name of message code in generated header file
Language = LanguageName            Language ID associated with the message

*Required.

The message text itself begins after the last keyword. The text of a message
can occupy several lines. You end a message with a line containing only a single
period character.
The message compiler ignores any whitespace or carriage returns in a
message definition. If you want explicit control over the appearance of a message
when the Event Viewer displays it, you can include various escape sequences
(listed in Table 13.4) in the body of the message.
The %1-%99 escape codes represent Unicode strings (embedded in the
event log packet) that will be inserted in the message when the Event Viewer
displays it. If a kernel-mode driver associates an event packet with a Device object,
%1 will automatically contain the NT name of the device; if the driver associates
the packet with the Driver object, %1 will be blank. In either case, your first real
insertion string will be %2, your second one will be %3, and so on. The code
example appearing later in this chapter will explain how to add insertion strings
to an event packet.
Table 13.4 The effects of various escape codes on displayed message text

Message formatting escape codes
IF you use...    THEN it's replaced with...
%b               A single space character
%t               A single tab character
%r%n             Carriage return and linefeed
%1-%99           An insertion string

3 Remember that these insertion strings will always be displayed as raw text. There's no way for the
Event Viewer to translate them into the local language.
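To make the substitution rules concrete, here is a rough user-mode sketch of how a viewer might expand the escape codes in Table 13.4. It is not the Event Viewer's actual algorithm. The function names are hypothetical, plain char strings stand in for Unicode, %r and %n are assumed to expand independently to a carriage return and a linefeed, and an insertion number with no matching string is assumed to expand to nothing.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Append src to out (cap bytes, terminator included); 0 if it won't fit. */
static int xx_append(char *out, size_t cap, const char *src)
{
    size_t used = strlen(out), n = strlen(src);
    if (used + n + 1 > cap) return 0;
    memcpy(out + used, src, n + 1);
    return 1;
}

/* Expand %b, %t, %r, %n and %1-%99 in tmpl. inserts[0] plays the role of %1. */
static int xx_expand_message(const char *tmpl, const char *const inserts[],
                             size_t insert_count, char *out, size_t cap)
{
    assert(tmpl != NULL && out != NULL && cap > 0);
    out[0] = '\0';
    while (*tmpl != '\0') {
        char one[2] = { 0, 0 };
        const char *piece = one;

        if (*tmpl != '%')           { one[0] = *tmpl++; }
        else if (tmpl[1] == 'b')    { piece = " ";  tmpl += 2; }
        else if (tmpl[1] == 't')    { piece = "\t"; tmpl += 2; }
        else if (tmpl[1] == 'r')    { piece = "\r"; tmpl += 2; }
        else if (tmpl[1] == 'n')    { piece = "\n"; tmpl += 2; }
        else if (isdigit((unsigned char)tmpl[1])) {
            size_t idx = (size_t)(tmpl[1] - '0');
            tmpl += 2;
            if (isdigit((unsigned char)*tmpl))       /* second digit, if any */
                idx = idx * 10 + (size_t)(*tmpl++ - '0');
            piece = (idx >= 1 && idx <= insert_count) ? inserts[idx - 1] : "";
        }
        else                        { one[0] = *tmpl++; }  /* unknown escape */

        if (!xx_append(out, cap, piece)) return 0;
    }
    return 1;
}
```

Passing the device name as the first insertion string mirrors what the text describes for packets associated with a Device object, so the driver's own strings line up with %2 and %3.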


A Small Example: XXMSG.MC
Here is the message definition file for the example that goes with this chapter.
You can find it in the CH13\DRIVER directory on the floppy that accompanies
this book.
Header section The first part of the message definition file contains
header information.

    MessageIdTypedef = NTSTATUS                              (1)

    SeverityNames = (
        Success       = 0x0:STATUS_SEVERITY_SUCCESS
        Informational = 0x1:STATUS_SEVERITY_INFORMATIONAL
        Warning       = 0x2:STATUS_SEVERITY_WARNING
        Error         = 0x3:STATUS_SEVERITY_ERROR
        )

    FacilityNames = (                                        (2)
        System     = 0x0
        RpcRuntime = 0x2:FACILITY_RPC_RUNTIME
        RpcStubs   = 0x3:FACILITY_RPC_STUBS
        Io         = 0x4:FACILITY_IO_ERROR_CODE
        XXDriver   = 0x7:FACILITY_XX_ERROR_CODE
        )

(1) The definitions of any symbolic names generated by MC will include a
    typecast to NTSTATUS.
(2) You can find codes for Microsoft-defined facilities in the NTSTATUS.H
    header file. For your own facility number, pick something that isn't in
    use.
Message section Here's the message section of the file. It defines the
actual text to be associated with each message-code number.

    MessageId=0x0001                                         (1)
    Facility=XXDriver
    Severity=Informational
    SymbolicName=XX_MSG_LOGGING_ENABLED                      (2)
    Language=English
    Event logging enabled for XxDriver.                      (3)
    .

    MessageId=+1                                             (4)
    Facility=XXDriver
    Severity=Informational
    SymbolicName=XX_MSG_DRIVER_STARTING
    Language=English
    XxDriver has successfully initialized.
    .

    MessageId=+1
    Facility=XXDriver
    Severity=Informational
    SymbolicName=XX_MSG_DRIVER_STOPPING
    Language=English
    XxDriver has unloaded.
    .

    MessageId=+1
    Facility=XXDriver
    Severity=Informational
    SymbolicName=XX_MSG_OPENING_HANDLE
    Language=English
    Opening handle to %1.
    .

    MessageId=+1
    Facility=XXDriver
    Severity=Informational
    SymbolicName=XX_MSG_CLOSING_HANDLE
    Language=English
    Closing handle to %1.
    .

    MessageId=+1
    Facility=XXDriver
    Severity=Warning
    SymbolicName=XX_MSG_MULTIPLE_OCCUPANCY
    Language=English
    %1 contains multiple life-forms. Data
    specifies number of occupants.
    .

    MessageId=+1
    Facility=XXDriver
    Severity=Informational
    SymbolicName=XX_MSG_MERGING_DNA
    Language=English
    Merging DNA from %2 and %3 in %1.                        (5)
    .

(1) The MessageId keyword is required at the start of a message. This form
    of the keyword assigns an absolute number to the 16-bit Code field of the
    generated message code.
(2) This keyword tells the message compiler to define a symbol called
    XX_MSG_LOGGING_ENABLED in the header file it generates.
(3) The actual message text begins after the last keyword. A line containing
    only a single period character ends the text.
(4) This form of the MessageId keyword assigns a Code value to the
    message that's one greater than the previous message.
(5) This message contains placeholders for insertion strings. %1 will become
    the device name; %2 and %3 will be replaced with whatever insertion
    strings are embedded in the event-log packet.
Compiling a Message Definition File
Once you've written the message definition file, you use the message
compiler (MC) to process it. MC is another quirky little command-line utility that
comes with the Win32 SDK and Visual C++. 4 Table 13.5 shows the syntax of the
MC command.

Table 13.5 Syntax of the MC command

MC [-?cdosvw] [-herx argument] [-uU] filename.MC
Parameter       Description
-c              Set Customer bit in all message codes.
-d              Use decimal definitions of facility and severity codes in header.
-o              Generate OLE2 header file.
-s              Insert symbolic name as first line of each message.
-v              Generate verbose output.
-w              Give warning if message text is not OS/2 compatible.
-h pathname     Location of generated header file. (Default is current directory.)
-e extension    One- to three-character extension for header file.
-r pathname     Location of generated RC and binary message files.
-x pathname     Location of generated debug file.
-u              Input file is Unicode.
-U              Message text in binary output file should be Unicode.
filename        Name of the message definition file to compile.

4 Documentation for MC is rather sparse. One of the best sources is the MC.HLP help file that comes
with the compiler.


When you run the message compiler, it automatically generates the following
files:
• filename.RC - This is a resource control script that identifies all the
  languages used in the message definition file. For each language, it also
  identifies the binary message file containing the message text.
• filename.H - This header file contains #define statements for all the
  message code numbers in the MC input file. The compiler also puts a lot
  of inline commentary in the header, including the text of the
  corresponding message.
• MSGnnnnn.BIN - This binary file holds all the text for messages in one
  language. MC will generate separate files (beginning with MSG00001.BIN)
  for each national language used in the message definition file.

Although you can specify the paths where the header and RC files will go,
the actual names of these files will always be the same as the name of the message
definition file. You have no control over the names of the binary message files.
Adding Message Resources to a Driver
After you run the message compiler, you still need to do something with the
binary message resources it generates. You could put them in a separate DLL, the
way the I/O Manager does with IOLOGMSG.DLL, but for most drivers it makes
more sense to add the message resources to the driver executable itself. That way,
you won't have to worry about keeping track of multiple files when you send
your driver out into the world.
The BUILD utility (described in Chapter 16) understands how to process
resource control scripts. So, all you have to do is to add the name of the script to
the list of source files making up the driver. BUILD will then run the resource
compiler and link the resulting resources into your driver. For example, if you've
just compiled a message definition file called XXMSG.MC, you'll have a resource
script called XXMSG.RC. The following excerpt from a BUILD SOURCES file
shows how you would add this resource script to your driver.

    SOURCES= init.c unload.c \
             dispatch.c      \
             eventlog.c      \
             xxmsg.rc

There's one glitch in all this. BUILD doesn't know what to do with message
definition files, so you can't just add XXMSG.MC itself to the list of driver
sources. This means you need to run the message compiler by hand any time you
modify your message definition file. Fortunately, there's a way to extend the
capabilities of BUILD so that it will automatically maintain message resources for you.
Chapter 17 explains how to perform this little bit of magic.
Registering a Driver as an Event Source
So now you have a header file containing message codes, and a bunch of
message resources stuffed into your driver. But there's still a question: Just how
does the system know that it should look in your driver executable when it wants
to translate a particular message code into text? Once again, we're saved by the
Registry.
Any software component that plans to generate log entries must identify
itself to the system as an event source. Further, every event source has to specify
the location of the message files needed to translate any message codes appearing
in its log entries. Figure 13.3 shows the Registry entries that identify a driver as an
event source. 5
To register your driver as an event source, make the following changes to
the Registry:
1. Under ...\Services\EventLog\System, add the name of your driver's
   executable (without the extension) to the REG_MULTI_SZ value called Sources.
2. Under ...\Services\EventLog\System, add a key with the same name as your
   driver.
3. In this key, create a value called EventMessageFile. This is a
   REG_EXPAND_SZ containing the full path names of any message files used
   by your driver. If your driver uses multiple files, separate them with a
   semicolon. If you're using standard messages defined in NTIOLOGC.H, you'll also
   need to add IOLOGMSG.DLL to this list.
4. In this same key, create a value called TypesSupported. This is a
   REG_DWORD bit mask identifying the types of messages generated by your
   driver. A value of 0x7 gets everything.

    HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services
        EventLog
            System
                Sources: REG_MULTI_SZ: XXDRIVER YYDRIVER ...
                XXDRIVER
                    EventMessageFile: REG_EXPAND_SZ:
                        %SystemRoot%\System32\IOLOGMSG.DLL;
                        %SystemRoot%\System32\Drivers\XXDRIVER.SYS
                    TypesSupported: REG_DWORD: 0x7

Figure 13.3 Registering a kernel-mode driver as an event source

5 These entries apply only to kernel-mode event sources. Chapter 18 shows how to register a
user-mode component as an event source.

13.3 GENERATING LOG ENTRIES
The final piece of the puzzle is to add code to your driver that actually generates
event-log entries. This is a relatively straightforward process that involves
allocating an empty packet, filling it in, and sending it off to the system logging thread.
The rest of this section describes the major steps along the way.
Preparing a Driver for Error Logging
If you plan to support error logging, there are a few small changes you'll want
to make to your driver. In particular, it's a good idea to add the following items to
your Device Extension:
• A sequence number field that your driver increments for each IRP
  processed by the device. This value should remain constant for the life of the
  request.
• A retry count for the current request, if you retry device operations when
  an error occurs. Set it to zero each time you start processing an IRP and
  increment it for each repeated attempt.
• Copies of any device registers that would help diagnose the error. If your
  ISR decides to log an error, it should take a snapshot of the hardware
  registers for the logging routine.

You should also adopt some convention that assigns a unique identifying
number to each stage of processing an IRP. This number becomes part of the
error-log information, and it will help you figure out where in your driver the error
occurred. This fragment of a driver's header file shows how you might do this:

    #define XX_ERRORLOG_STARTIO              1
    #define XX_ERRORLOG_CONTROLLER_CONTROL   2
    #define XX_ERRORLOG_ADAPTER_CONTROL      3
    #define XX_ERRORLOG_ISR                  4
    #define XX_ERRORLOG_DPC_FOR_ISR          5
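The bookkeeping fields suggested above might look something like the following. This is a portable sketch rather than real Device Extension code: XX_LOG_STATE, XX_SNAPSHOT_REGISTERS, and the three helper functions are hypothetical, and fixed-width C types stand in for the NT ULONG and UCHAR types. Only the field names IrpSequenceNumber and IrpRetryCount match the names used by the logging routine later in this chapter.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define XX_SNAPSHOT_REGISTERS 4   /* hypothetical number of device registers */

/* Hypothetical error-logging fields of a Device Extension. */
typedef struct {
    uint32_t IrpSequenceNumber;   /* constant for the life of a request */
    uint8_t  IrpRetryCount;       /* zero-based count of retries        */
    uint32_t RegisterSnapshot[XX_SNAPSHOT_REGISTERS];
} XX_LOG_STATE;

/* Call once when the driver starts processing a new IRP. */
static void xx_begin_request(XX_LOG_STATE *s)
{
    assert(s != NULL);
    s->IrpSequenceNumber++;
    s->IrpRetryCount = 0;
}

/* Call each time the same request is attempted again after an error. */
static void xx_note_retry(XX_LOG_STATE *s)
{
    assert(s != NULL);
    s->IrpRetryCount++;
}

/* Save the register values read at the time of the error. */
static void xx_snapshot_registers(XX_LOG_STATE *s,
                                  const uint32_t regs[XX_SNAPSHOT_REGISTERS])
{
    assert(s != NULL && regs != NULL);
    memcpy(s->RegisterSnapshot, regs, sizeof(s->RegisterSnapshot));
}
```

In a driver, the snapshot would typically be taken in the ISR and consumed later by whatever routine builds the log packet, so the usual rules for sharing Device Extension data with an ISR would apply.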


Finally, you might want to define a value in the Parameters subkey of your
driver's Registry service key to control driver error logging. This could either be a
Boolean that simply enables and disables logging, or it could be an actual value
that determines the level of logging detail. The code example appearing later in
this chapter uses a value called EventLogLevel to control the quantity of event
messages it generates.
Allocating an Error-Log Packet
When your driver uncovers some terrible sin that needs reporting, it has to
prepare an error-log packet. There are three sections to an error-log packet:
• A standard header
• An array of driver-defined ULONGs (referred to as dump data)
• One or more NULL-terminated Unicode insertion strings 6

Both the dump-data and insertion strings are variable in length and are
optional. Figure 13.4 shows the structure of an error-log packet.

[Figure 13.4: Layout of an error-log packet. The standard header (which
includes the ErrorCode, DumpDataSize, NumberOfStrings, and StringOffset
fields) is followed by the DumpData[] array and then by the NULL-terminated
Unicode insertion strings located by StringOffset]

6 Don't confuse these with the counted UNICODE_STRING data structures used in other parts of
NT.

Before you can allocate an error-log packet, you need to determine how big
the packet should be. Remember to leave room for any dump-data and insertion
strings. You can calculate the size of the packet using a variation on the following
piece of code:

    PacketSize =
        sizeof( IO_ERROR_LOG_PACKET ) +
        ( sizeof( ULONG ) * ( DumpDataCount - 1 )) +
        sizeof( InsertionStrings );

Here, DumpDataCount is the number of driver-specific ULONG data items,
and InsertionStrings are any driver-supplied UNICODE strings to be inserted in
the error message. The requested size of the packet cannot exceed
ERROR_LOG_MAXIMUM_SIZE.
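The size arithmetic can be tried out away from the kernel with a stand-in structure. Everything in this sketch is an assumption made for illustration: XX_FAKE_LOG_PACKET only mimics the shape of IO_ERROR_LOG_PACKET (a fixed header followed by a one-element DumpData array), and XX_FAKE_MAXIMUM_SIZE is an arbitrary limit, not the real value of ERROR_LOG_MAXIMUM_SIZE.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for IO_ERROR_LOG_PACKET; only its general shape matters here. */
typedef struct {
    uint8_t  header[40];     /* placeholder for the fixed header fields    */
    uint32_t DumpData[1];    /* the first dump-data slot is in the struct  */
} XX_FAKE_LOG_PACKET;

#define XX_FAKE_MAXIMUM_SIZE 152u   /* hypothetical limit for this sketch */

/*
 * Total bytes needed for a packet with dump_count dump-data items and
 * string_bytes bytes of insertion strings (terminators included).
 * Returns 0 if the request would exceed the size limit.
 */
static size_t xx_packet_size(size_t dump_count, size_t string_bytes)
{
    size_t size = sizeof(XX_FAKE_LOG_PACKET);

    if (dump_count > 1)                       /* one slot is already counted */
        size += sizeof(uint32_t) * (dump_count - 1);
    size += string_bytes;

    return (size <= XX_FAKE_MAXIMUM_SIZE) ? size : 0;
}
```

Note that the sketch only subtracts the built-in slot when there is more than one dump-data item, which sidesteps the unsigned wraparound that DumpDataCount - 1 would produce for a count of zero.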
Use the IoAllocateErrorLogEntry function (described in Table 13.6) to
allocate the packet. As you can see from the table, you're allowed to associate the
packet either with the Driver object or with a particular Device object. Your choice
will determine how the Event Viewer utility displays your message. Overall
initialization and shutdown are good choices for Driver-level messages, while
problems involving specific IRPs or pieces of hardware ought to be associated with a
Device object.
Low memory conditions could make it impossible for the system to get a
packet for you, so don't assume that your allocation request will always
succeed. One easy way to handle these situations is just to forget about logging the
error, with the hope that it will happen again when the system isn't so pressed
for memory.
Finally, notice that you have to be at or below DISPATCH_LEVEL IRQL
when you allocate error-log packets. This means that if your ISR decides to log an
error (a common occurrence), you'll need a CustomDpc routine to do the actual
work.
Logging the Error
Once you've allocated the packet, you need to fill in all the relevant fields. In
addition to the fields listed in Table 13.7, you should also copy any driver-specific
data and strings into the packet.
Table 13.6 Use this function to allocate an error-log packet

PVOID IoAllocateErrorLogEntry        IRQL <= DISPATCH_LEVEL
Parameter             Description
IN PVOID IoObject     • Address of a Device object generating an error
                      • Address of a Driver object reporting an error
IN UCHAR EntrySize    Size in bytes of packet to be allocated
Return value          • PIO_ERROR_LOG_PACKET - success
                      • NULL - allocation failure

Table 13.7 Layout of an IO_ERROR_LOG_PACKET

IO_ERROR_LOG_PACKET, *PIO_ERROR_LOG_PACKET
Field                         Description
UCHAR MajorFunctionCode       IRP_MJ_XXX code of current IRP
UCHAR RetryCount              Zero-based count of consecutive retries
USHORT DumpDataSize           Bytes of driver-specific data
USHORT NumberOfStrings        Number of insertion strings
USHORT StringOffset           Byte offset of first insertion string
USHORT EventCategory          Event category from driver's message file
NTSTATUS ErrorCode            IO_ERR_XXX (see NTIOLOGC.H)
ULONG UniqueErrorValue        Indicates where in the driver the error occurred
NTSTATUS FinalStatus          STATUS_XXX value from the IRP
ULONG SequenceNumber          Driver-assigned number for current IRP
ULONG IoControlCode           IOCTL_XXX if this is a DeviceIoControl request
LARGE_INTEGER DeviceOffset    Device offset where error occurred, or zero
ULONG DumpData[1]             Driver-specific data if DumpDataSize is nonzero

When the packet is ready, call IoWriteErrorLogEntry to send it to the system
logging thread. The packet doesn't belong to you once you call this function, so
don't touch it again. As with packet allocation, you can only write an error-log
packet if you're at or below DISPATCH_LEVEL IRQL.

13.4 CODE EXAMPLE: AN ERROR-LOGGING ROUTINE

This example illustrates how to log event messages from a kernel-mode driver.
The complete example includes a driver that uses these event-logging functions,
as well as a test program that exercises the driver. You can find all of this in the
CH13 directory on the disk that accompanies this book.
EVENTLOG.C
This module provides a general event-logging mechanism that any driver
can use. In addition to the functions listed below, EVENTLOG.C also defines a
global variable called LogLevel that determines logging verbosity. Although
globals are generally a bad idea in drivers, this one's okay because its value doesn't
change once driver initialization is done.
XxInitializeEventLog This function is called from DriverEntry to set up
the driver's event-logging mechanism. Its main purpose is to retrieve a value
called EventLogLevel from the driver's Registry service key and store it in the
LogLevel variable.


    VOID
    XxInitializeEventLog(
        IN PDRIVER_OBJECT DriverObject
        )
    {
        RTL_QUERY_REGISTRY_TABLE QueryTable[2];        // (1)

        //
        // Fabricate a Registry query.
        //
        RtlZeroMemory( QueryTable, sizeof( QueryTable ));   // (2)

        QueryTable[0].Name = L"EventLogLevel";
        QueryTable[0].Flags = RTL_QUERY_REGISTRY_DIRECT;
        QueryTable[0].EntryContext = &LogLevel;

        //
        // Look for the EventLogLevel value
        // in the Registry.
        //
        if( !NT_SUCCESS(
                RtlQueryRegistryValues(                // (3)
                    RTL_REGISTRY_SERVICES,
                    XX_DRIVER_NAME
                    L"\\Parameters",
                    QueryTable,
                    NULL, NULL )))
        {
            LogLevel = DEFAULT_LOG_LEVEL;
        }

        //
        // Log a message saying that logging
        // is enabled.
        //
        XxReportEvent(                                 // (4)
            LOG_LEVEL_DEBUG,
            XX_MSG_LOGGING_ENABLED,
            XX_ERRORLOG_INIT,
            (PVOID)DriverObject,
            NULL,           // No IRP
            NULL, 0,        // No dump data
            NULL, 0 );      // No strings
    }

(1) This function uses our old friend RtlQueryRegistryValues to set the
    event-logging verbosity level. We need a query table with one entry for
    the value and another (NULL) entry for a terminator.
(2) It's a good idea to clear the table before using it. Otherwise, you can get
    some strange error messages resulting from random bit settings.
(3) Query the Registry. RTL_REGISTRY_SERVICES says that the path name
    (xxdriver\Parameters) should be treated as a subkey of the ...\Services
    key.
(4) If verbose logging is enabled, log a message indicating that logging is
    enabled.
XxReportEvent This function does the actual grunt work of allocating an
error-log packet, filling it in, and sending it off to the system logging thread. You
can only call this function at or below DISPATCH_LEVEL IRQL.

    BOOLEAN
    XxReportEvent(
        IN ULONG MessageLevel,
        IN NTSTATUS ErrorCode,
        IN ULONG UniqueErrorValue,
        IN PVOID IoObject,
        IN PIRP Irp,
        IN ULONG DumpData[],
        IN ULONG DumpDataCount,
        IN PWSTR Strings[],
        IN ULONG StringCount
        )
    {
        PIO_ERROR_LOG_PACKET Packet;
        PDEVICE_EXTENSION pDE;
        PIO_STACK_LOCATION IrpStack;
        PUCHAR pInsertionString;
        UCHAR PacketSize;
        UCHAR StringSize[ XX_MAX_INSERTION_STRINGS ];
        ULONG i;

        if( ( LogLevel == LOG_LEVEL_NONE ) ||
            ( MessageLevel > LogLevel ))               // (1)
            return TRUE;

        PacketSize = sizeof( IO_ERROR_LOG_PACKET );    // (2)

        if( DumpDataCount > 0 )                        // (3)
            PacketSize +=
                (UCHAR)( sizeof( ULONG ) *
                            ( DumpDataCount - 1 ));

        if( StringCount > 0 )                          // (4)
        {
            if( StringCount > XX_MAX_INSERTION_STRINGS )
                StringCount = XX_MAX_INSERTION_STRINGS;

            for( i=0; i<StringCount; i++ )             // (5)
            {
                StringSize[i] =
                    (UCHAR)XxGetStringSize( Strings[i] );
                PacketSize += StringSize[i];
            }
        }

        //
        // Try to allocate the packet
        //
        Packet = IoAllocateErrorLogEntry(
                        IoObject,
                        PacketSize );

        if( Packet == NULL ) return FALSE;

        //
        // Fill in standard parts of the packet
        //
        Packet->ErrorCode = ErrorCode;
        Packet->UniqueErrorValue = UniqueErrorValue;

        if( Irp != NULL )                              // (6)
        {
            IrpStack = IoGetCurrentIrpStackLocation( Irp );

            pDE = (PDEVICE_EXTENSION)
                    ((PDEVICE_OBJECT)IoObject)->
                        DeviceExtension;

            Packet->MajorFunctionCode =
                        IrpStack->MajorFunction;
            Packet->RetryCount = pDE->IrpRetryCount;
            Packet->FinalStatus = Irp->IoStatus.Status;
            Packet->SequenceNumber =
                        pDE->IrpSequenceNumber;

            if( IrpStack->MajorFunction ==
                    IRP_MJ_DEVICE_CONTROL ||
                IrpStack->MajorFunction ==
                    IRP_MJ_INTERNAL_DEVICE_CONTROL )

                Packet->IoControlCode =
                    IrpStack->Parameters.
                        DeviceIoControl.
                            IoControlCode;
            else Packet->IoControlCode = 0;
        }
        else // No IRP
        {
            Packet->MajorFunctionCode = 0;
            Packet->RetryCount = 0;
            Packet->FinalStatus = 0;
            Packet->SequenceNumber = 0;
            Packet->IoControlCode = 0;
        }

        //
        // Add the dump data
        //
        if( DumpDataCount > 0 )
        {
            Packet->DumpDataSize =
                (USHORT)( sizeof( ULONG ) *
                            DumpDataCount );

            for( i=0; i<DumpDataCount; i++ )
                Packet->DumpData[i] = DumpData[i];
        }
        else Packet->DumpDataSize = 0;

        //
        // Add the insertion strings
        //
        Packet->NumberOfStrings = (USHORT)StringCount;

        if( StringCount > 0 )
        {
            Packet->StringOffset =
                sizeof( IO_ERROR_LOG_PACKET ) +
                ( DumpDataCount - 1 ) * sizeof( ULONG );

            pInsertionString =
                (PUCHAR)Packet + Packet->StringOffset; // (7)

            for( i=0; i<StringCount; i++ )             // (8)
            {
                //
                // Add each new string to the end
                // of the existing stuff
                //
                RtlCopyBytes(
                    pInsertionString,
                    Strings[i],
                    StringSize[i] );

                pInsertionString += StringSize[i];
            }
        }

        //
        // Log the message
        //
        IoWriteErrorLogEntry( Packet );
        return TRUE;
    }

(1) If we're not logging or the message is out of range, return without doing
    anything.
(2) Begin calculating the packet size. Start with the minimum required
    number of bytes.
(3) Add in any dump data. Remember that the standard error-log packet
    already has one slot in its DumpData array.
(4) Determine the total space needed for any insertion strings. If the caller
    has sent too many strings, process only as many as this function can
    handle.
(5) Build a table containing the length of each individual string using
    XxGetStringSize, a local helper function. This table will be used again later to
    copy the insertion strings into the error-log packet. Also add the size of
    each string to the total packet requirement.
(6) If there's an IRP, then the IoObject argument must point to a Device
    object. In that case, use the IRP and the Device Extension to fill in
    additional parts of the error-log packet. If there's no IRP, then set the
    additional fields to 0.
(7) Insertion strings always go just after the DumpData array in the error-log
    packet. After setting the offset of the first string, calculate the address
    where the first string should go in the packet.
(8) This loop simply adds each new string to the end of the packet using
    RtlCopyBytes. It takes advantage of the table of string sizes generated
    earlier in the routine.


XxGetStringSize This little helper function calculates the amount of space
needed by a NULL-terminated Unicode string. The size includes space for the (2
bytes) UNICODE_NULL at the end of the string.

ULONG
XxGetStringSize(
    IN PWSTR String
    )
{
    UNICODE_STRING TempString;
    //
    // Use an RTL routine to get the length
    //
    RtlInitUnicodeString(&TempString, String);
    //
    // Size is actually two greater because
    // of the UNICODE_NULL at the end.
    //
    return (TempString.Length + sizeof(WCHAR));
}

13.5 SUMMARY
This chapter has presented NT's event-logging mechanisms. As you can see, it
isn't terribly difficult for drivers to leave a little trail when devices start generating
errors. These audit trails can be a useful diagnostic aid to system administrators.
This chapter also finishes our look at basic kernel-mode device driver
techniques. In the next chapter, you'll see the first of several variations on the driver
architecture we've developed so far.

CHAPTER 14

System Threads

Some types of legacy hardware can have a bad
effect on system performance if you manage them using the driver model we've
developed so far. System threads give you a way to keep these devices out of
everyone's way.

14.1 SYSTEM THREADS
A system thread is a thread that runs exclusively in kernel mode. It has no
user-mode context and can't access user address space. Just like a Win32 thread, a
system thread executes at or below APC_LEVEL IRQL and it competes for use of the
CPU based on its scheduling priority.
When to Use Threads
There are several reasons why you might use threads in a driver. The first
possibility is that you're working with a piece of hardware that has the following
characteristics:

• The device is slow and infrequently accessed.

• It takes a long time (more than 50 microseconds) for the device to make a
  state transition, and the driver has to wait for the transition to occur.

• The device needs to make several state transitions in order to complete a
  single operation.

• The device doesn't generate interrupts for some kinds of interesting state
  transitions, and the driver has to poll the device for extended periods.

You could, of course, manage a device like this using a CustomTimerDpc
routine. Depending on the amount of device activity, this approach could clog up
the DPC queues and slow down other drivers. Threads, on the other hand, run at
PASSIVE_LEVEL and won't interfere with DPC routines.
Fortunately, there aren't too many categories of hardware that behave this
rudely, and most of them are legacy devices that date from the early days of the
personal computer. The most notable examples are floppy disks and QIC tapes
attached to floppy controllers.
The second possibility is that you've got a device which takes a very long
time to initialize itself, and which your driver has to monitor throughout the
initialization. Certain kinds of optical jukeboxes behave this way. So might a
computer-controlled pottery kiln.
This kind of behavior is a problem because the Service Control Manager
gives a driver only about 30 seconds to execute its DriverEntry routine. If
DriverEntry hasn't returned by then, the Service Control Manager forcibly unloads the
driver. The only solution is to put the long-running device start-up code in a
separate thread, and return immediately from the DriverEntry routine with
STATUS_SUCCESS.[1]
Finally, you might need to perform some kind of operation that will only
work at PASSIVE_LEVEL IRQL. For example, if your driver had to access the
Registry on a regular basis, or write something to a file, a thread might be the answer.
Creating and Terminating System Threads
Call PsCreateSystemThread, described in Table 14.1, when you want to
create a system thread. Since you can only call this function at PASSIVE_LEVEL
IRQL, you will usually create driver threads in your DriverEntry routine.
When your driver unloads, it must kill any system threads it may have
created. The only way to do this is to have the thread itself call
PsTerminateSystemThread with an appropriate exit status. Unlike Win32 user-mode threads,
there is no way to forcibly terminate a system thread. This means you need to set
up some kind of signaling mechanism to let a thread know that it should exit. As
you'll see later in this chapter, Event objects provide a convenient way to do this.
[1] Of course, you'll have to figure out what to do if the device fails to initialize successfully. Once
DriverEntry has returned, there's no way for a driver to unload itself, so any cleanup will have to
be done by the thread itself. This includes things like deleting Device objects, freeing resources, etc.
If the driver finds it has no initialized devices, it might also make itself entirely paged in order to
reduce its impact on the system.

Table 14.1  Prototype for function that creates a system thread

NTSTATUS PsCreateSystemThread            IRQL == PASSIVE_LEVEL

Parameter                        Description
OUT PHANDLE ThreadHandle         Handle of new thread
IN ULONG DesiredAccess           0 for a driver-created thread
IN POBJECT_ATTRIBUTES Attrib     NULL for a driver-created thread
IN HANDLE ProcessHandle          NULL for a driver-created thread
OUT PCLIENT_ID ClientId          NULL for a driver-created thread
IN PKSTART_ROUTINE StartAddr     Entry point for thread
IN PVOID Context                 Argument passed to thread routine
Return value                     • STATUS_SUCCESS - thread was created
                                 • STATUS_XXX - an error code

Managing Thread Priority
In general, system threads running in a driver should set their thread
priority to the low end of the real-time range. The following code fragment shows how
to do this.

VOID ThreadStartRoutine(PVOID Context)
{
    KeSetPriorityThread(
        KeGetCurrentThread(),
        LOW_REALTIME_PRIORITY);
}

Remember that real-time threads have no quantum timeout. This means that
they only give up the CPU when they voluntarily go into a wait state, or when
they're preempted by a thread of higher priority. So don't design any drivers that
depend on automatic round-robin thread scheduling.
System Worker Threads
For occasional, quick operations at PASSIVE_LEVEL IRQL, creating and
terminating a separate thread may not be very efficient. The alternative is to have
one of NT's system worker threads perform the task. These threads use a callback
mechanism to do work on behalf of any driver.
It's not difficult to use system worker threads. First, allocate storage for a
WORK_QUEUE_ITEM structure. The system will use this block to keep track of
your work request. Next, call ExInitializeWorkItem to associate a callback
function in your driver with the WORK_QUEUE_ITEM.
Later, when you want a system thread to execute your callback function, call
ExQueueWorkItem to insert the request block into one of the system work
queues. You can choose to have your request executed either by a worker thread
with a real-time priority, or by one with a variable priority.
Keep in mind that all drivers are sharing the same group of system worker
threads. Requests that take a very long time to complete may delay the execution
of requests from other drivers. If you need to perform tasks involving lengthy
operations or long time delays, use a private driver thread rather than the system
work queues.

14.2 THREAD SYNCHRONIZATION
Like user-mode threads in a Win32 application, system threads may need to
suspend their execution until some other condition has been satisfied. This section
describes the basic synchronization techniques available to system threads.
Time Synchronization
The simplest kind of synchronization involves stopping a thread's execution
until a specific time interval elapses. Although you can use the Timer objects
described later in this chapter, the Kernel provides a convenience function
(described in Table 14.2) that's easier to use.
Table 14.2  Prototype for the KeDelayExecutionThread function

NTSTATUS KeDelayExecutionThread          IRQL == PASSIVE_LEVEL

Parameter                        Description
IN KPROCESSOR_MODE WaitMode      KernelMode for drivers
IN BOOLEAN Alertable             FALSE for drivers
IN PLARGE_INTEGER Interval       Absolute or relative due time
Return value                     STATUS_SUCCESS - wait completed
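
The Interval argument is easy to get wrong, so here is a minimal sketch of how a relative due time is built. NT expresses time in 100-nanosecond units, and a negative value means "relative to now" while a positive value is an absolute system time. The helper names below are hypothetical and the code is plain portable C rather than driver code; in a driver you would store the result in the QuadPart field of a LARGE_INTEGER and pass its address to KeDelayExecutionThread.

```c
#include <assert.h>
#include <stdint.h>

/* One millisecond expressed in the 100-nanosecond units NT uses for time. */
#define XX_100NS_PER_MS 10000LL

/* Hypothetical helper: a relative due time is a negative count of 100-ns units. */
static int64_t XxRelativeIntervalMs(uint32_t Milliseconds)
{
    int64_t Interval = -((int64_t)Milliseconds * XX_100NS_PER_MS);

    assert(Interval <= 0);   /* sanity check: never an absolute time */
    return Interval;
}

/* Hypothetical helper: nonzero when the value would be treated as relative. */
static int XxIsRelativeInterval(int64_t QuadPart)
{
    return QuadPart < 0;
}
```

For a 50-millisecond pause, XxRelativeIntervalMs(50) yields -500000.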

General Synchronization
System threads can synchronize their activities in more general ways by
waiting for things called dispatcher objects. Thread synchronization depends on the
fact that a dispatcher object is always in either the Signaled or Nonsignaled state.
When a thread asks to wait for a Nonsignaled dispatcher object, the thread's
execution stops until the object becomes Signaled. (Waiting for a dispatcher object
that's already Signaled is a no-op.) There are two different functions you can use
to wait for a dispatcher object.


KeWaitForSingleObject This function, described in Table 14.3, puts the
calling thread into a wait state until a specific dispatcher object is set to the
Signaled state.
Optionally, you can also specify a timeout value that will cause the thread to
awaken even if the dispatcher object is Nonsignaled. If you don't pass a timeout
argument, KeWaitForSingleObject will wait indefinitely.
Table 14.3  Prototype for the single object wait function

NTSTATUS KeWaitForSingleObject

Parameter                        Description
IN PVOID Object                  Pointer to an initialized dispatcher object
IN KWAIT_REASON Reason           Executive for drivers
IN KPROCESSOR_MODE WaitMode      KernelMode for drivers
IN BOOLEAN Alertable             FALSE for drivers
IN PLARGE_INTEGER Timeout        • Absolute or relative timeout value
                                 • NULL for an infinite wait
Return value                     • STATUS_SUCCESS
                                 • STATUS_ALERTED
                                 • STATUS_TIMEOUT

KeWaitForMultipleObjects This function, described in Table 14.4, puts
the calling thread into a wait state until any or all of a group of dispatcher objects
are set to the Signaled state. Again, you have the option of specifying a timeout
value for the wait.

Table 14.4  Prototype for the multiple-object wait function

NTSTATUS KeWaitForMultipleObjects

Parameter                        Description
IN ULONG Count                   Number of objects to wait for
IN PVOID Object[ ]               Array of pointers to dispatcher objects
IN WAIT_TYPE WaitType            • WaitAll - wait until all are Signaled
                                 • WaitAny - wait until one is Signaled
IN KWAIT_REASON Reason           Executive for drivers
IN KPROCESSOR_MODE WaitMode      KernelMode for drivers
IN BOOLEAN Alertable             FALSE for drivers
IN PLARGE_INTEGER Timeout        • Absolute or relative timeout value
                                 • NULL for an infinite wait
IN PKWAIT_BLOCK WaitBlocks[ ]    Array of wait blocks for this operation
Return value                     • STATUS_SUCCESS
                                 • STATUS_ALERTED
                                 • STATUS_TIMEOUT

Be aware that there are limits on how many objects your thread can wait
for at one time. Each thread has a built-in array of Wait blocks that it uses
for concurrent wait operations. The thread can use this array to wait for
THREAD_WAIT_OBJECTS number of objects. If you need to wait for more than
this number of objects, you must supply your own array of Wait blocks when you
call KeWaitForMultipleObjects. In either case, the number of objects you wait for
cannot exceed MAXIMUM_WAIT_OBJECTS.
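
The two limits above boil down to a three-way decision, sketched here in portable C. The numeric values (3 and 64) are the ones the NT DDK headers are assumed to define for THREAD_WAIT_OBJECTS and MAXIMUM_WAIT_OBJECTS, so check your own headers. The Xx names are hypothetical and exist only so the sketch compiles outside the kernel.

```c
#include <assert.h>

/* Assumed DDK header values, redefined under hypothetical Xx names. */
#define XX_THREAD_WAIT_OBJECTS   3
#define XX_MAXIMUM_WAIT_OBJECTS  64

typedef enum {
    XxWaitUseBuiltInBlocks,   /* pass NULL for the WaitBlocks argument   */
    XxWaitNeedsOwnBlocks,     /* caller supplies an array of wait blocks */
    XxWaitTooManyObjects      /* the request is not allowed at all       */
} XX_WAIT_PLAN;

/* Decide how a wait on Count objects has to be set up. */
static XX_WAIT_PLAN XxPlanWait(unsigned long Count)
{
    assert(Count > 0);
    if (Count > XX_MAXIMUM_WAIT_OBJECTS)
        return XxWaitTooManyObjects;
    if (Count > XX_THREAD_WAIT_OBJECTS)
        return XxWaitNeedsOwnBlocks;
    return XxWaitUseBuiltInBlocks;
}
```

If you go the XxWaitNeedsOwnBlocks route in a real driver, the wait-block array has to live in nonpaged memory for the duration of the wait.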
You can call the KeWaitForXxx functions either from PASSIVE_LEVEL or
DISPATCH_LEVEL IRQL. If you call them from DISPATCH_LEVEL IRQL,
however, you must specify a zero timeout value.[2] This can be useful when your real
goal is to cause some side effect produced by the KeWaitForXxx functions.

14.3 USING DISPATCHER OBJECTS
Except for Thread objects, it's up to you to allocate storage for any dispatcher
objects you plan to use. The objects must be permanently resident, so you have to
put them in the Device or Controller Extension, or in some other piece of
nonpaged memory.
You also have to initialize the dispatcher object once with the proper
KeInitializeXxx function before you use it. Since you can only call these functions at
PASSIVE_LEVEL IRQL, you should usually initialize all dispatcher objects in
your DriverEntry routine.
The following subsections describe each category of dispatcher object in
greater detail.
Event Objects
An Event is a dispatcher object that must be explicitly set to the Signaled or
Nonsignaled state. Events are useful for notifying one or more threads of some
specific occurrence. You can see this behavior in Figure 14.1, where thread A awakens
B, C, and D by setting an Event object.
These objects actually come in two different flavors: Notification Events and
Synchronization Events. You choose the type when you initialize the object. These
two types of Events exhibit different behavior when they're put into the Signaled
state. As long as a Notification Event remains Signaled, all threads waiting for the
Event come out of their wait-state. You have to explicitly reset a Notification
Event to put it into the Nonsignaled state. This is the same behavior exhibited by
Win32 manual-reset Events.
When you put a Synchronization Event into the Signaled state, it remains
there only long enough for one call to KeWaitForXxx to be satisfied. It then resets
itself to the Nonsignaled state automatically. In other words, the gate stays open
until one thread passes through, and then it slams shut. This is equivalent to a
Win32 auto-reset Event.

Figure 14.1  How Event objects synchronize system threads
(Diagram labels: Thread A, Set, Event.)
Copyright © 1994 by Cydonix Corporation.

[2] Keep in mind that specifying a timeout value of 0 is not the same as passing a NULL pointer for the
Timeout argument.

To use an Event, you need to declare some nonpaged storage for an item of
type KEVENT, and then call the functions listed in Table 14.5.
Notice that you can use either of two functions to put an Event object into
the Nonsignaled state. The difference is that KeResetEvent returns the state of the
Event before it became Nonsignaled, and KeClearEvent does not. KeClearEvent
is somewhat faster, so you should use it unless you specifically need to know the
previous state of the Event.
Table 14.5  Use these functions to work with Event objects

How to use Event objects

IF you want to ...       THEN call ...                    IRQL
Create an Event          KeInitializeEvent                PASSIVE_LEVEL
Create a named Event     IoCreateSynchronizationEvent     PASSIVE_LEVEL
                         IoCreateNotificationEvent
Modify Event state       KeSetEvent                       <= DISPATCH_LEVEL
                         KeClearEvent
                         KeResetEvent
Wait for an Event        KeWaitForSingleObject            PASSIVE_LEVEL
                         KeWaitForMultipleObjects
Interrogate an Event     KeReadStateEvent                 <= DISPATCH_LEVEL


The driver that we'll be examining later in this chapter provides a good
example of using Events. It has a worker thread that needs to pause until an
interrupt arrives, so the thread waits for an Event object. The driver's DpcForIsr
routine sets the Event into the Signaled state, waking up the worker thread.
Sharing Events between Drivers
Normally, it's rather awkward for two unrelated drivers to share an Event
object created with KeInitializeEvent. These Event objects are referenced only by
pointer, and without some kind of explicit agreement (an internal IOCTL for
example), there's no simple way to pass a pointer from one driver to another.
Even then, there's the issue of making sure that the driver creating the Event
object doesn't unload while some other driver is using the object. Overall, it's a
very messy problem. The IoCreateNotificationEvent and
IoCreateSynchronizationEvent functions make things easier by allowing you to create
named Event objects. As long as two drivers use the same Event name, they will
be able to get pointers to the same Event object.
Both IoCreateXxxEvent functions behave very much like the Win32
CreateEvent system service. In other words, the first driver to make a call with a
specific Event name causes the Event object to be created. Each additional call using
the same name simply returns a handle to the existing Event object.
There are two things to notice when you use the IoCreateXxxEvent
functions. First, you don't supply any memory to hold the KEVENT object itself.
Storage for these objects is provided by the system. When everyone using the Event
releases it, the system deletes the object automatically.
The second little twist is that IoCreateXxxEvent calls return a handle to the
Event object. If you want to use the Event object in calls to the KeXxx functions
listed in Table 14.5, you need a pointer to the object rather than a handle. To
convert a handle into an object pointer, do the following:

1. First, call ObReferenceObjectByHandle. This function gives you a pointer to
   the Event object itself and increments the object's pointer reference count.

2. If you don't need the handle for anything (and you probably don't), call
   ZwClose to release it. This reduces the object's handle reference count. (Don't
   do this until after you increment the pointer count; otherwise the object may
   be deleted.)

3. When you have finished using the Event object (normally in the driver's
   Unload routine), call ObDereferenceObject to decrement the Event object's
   pointer reference count and possibly delete the Event object.
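
The ordering warning in step 2 is easier to see with a toy model of the two counts. The sketch below is portable C, not Object Manager code: XX_OBJECT and the Xx functions are hypothetical, and the rule "delete when both counts reach zero" is a deliberate simplification of what the text describes, not a statement of how NT implements it internally.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    long HandleCount;    /* open handles to the object          */
    long PointerCount;   /* outstanding pointer references      */
    bool Deleted;        /* set once nothing refers to it       */
} XX_OBJECT;

static void XxCheckDelete(XX_OBJECT *Object)
{
    if (Object->HandleCount == 0 && Object->PointerCount == 0)
        Object->Deleted = true;
}

/* Stand-in for a create call that hands back one open handle. */
static void XxCreateWithHandle(XX_OBJECT *Object)
{
    Object->HandleCount = 1;
    Object->PointerCount = 0;
    Object->Deleted = false;
}

/* Step 1: turn the handle into a counted pointer reference. */
static void XxReferenceByHandle(XX_OBJECT *Object)
{
    assert(!Object->Deleted && Object->HandleCount > 0);
    Object->PointerCount++;
}

/* Step 2: give up the handle. */
static void XxCloseHandle(XX_OBJECT *Object)
{
    assert(Object->HandleCount > 0);
    Object->HandleCount--;
    XxCheckDelete(Object);
}

/* Step 3: give up the pointer reference. */
static void XxDereference(XX_OBJECT *Object)
{
    assert(Object->PointerCount > 0);
    Object->PointerCount--;
    XxCheckDelete(Object);
}
```

Done in the order 1, 2, 3, the model object survives until the final dereference. Closing the handle first drops both counts to zero at once, which is exactly the premature deletion the text warns about.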

You can call these functions only from PASSIVE_LEVEL IRQL, which limits
the places in your driver where you can use them.

Figure 14.2  How Mutex objects synchronize system threads
(Diagram labels: Thread A, Release, Mutex, Thread B, Thread C, Thread D.)
Copyright © 1994 by Cydonix Corporation.

Mutex Objects
A Mutex (short for mutual exclusion) is a dispatcher object that can be
owned by only one thread at a time. The object becomes Nonsignaled when a
thread owns it and Signaled when it's available. Mutexes provide an easy
mechanism for coordinating mutually exclusive access to some shared resource, usually
memory.
Figure 14.2 shows threads B, C, and D waiting for a Mutex owned by thread
A. When A releases the Mutex, one of the waiting threads will wake up and
become its new owner.
To use a Mutex, you need to declare some nonpaged storage for an item of
type KMUTEX, and then call the functions listed in Table 14.6. Be aware that
when you initialize a Mutex, it is always set to the Signaled state.

Table 14.6  Use these functions to work with Mutex objects

How to use Mutex objects

IF you want to ...          THEN call ...                 IRQL
Create a Mutex              KeInitializeMutex             PASSIVE_LEVEL
Request Mutex ownership     KeWaitForSingleObject         PASSIVE_LEVEL
                            KeWaitForMultipleObjects
Give up Mutex ownership     KeReleaseMutex                PASSIVE_LEVEL
Interrogate Mutex           KeReadStateMutex              <= DISPATCH_LEVEL


If a thread calls KeWaitForXxx on a Mutex it already owns, the thread never
waits. Instead, the Mutex increments an internal counter to record the fact that
this thread is making recursive ownership requests. When the thread wants to
free the Mutex, it has to call KeReleaseMutex as many times as it requested
ownership. Only then will the Mutex go into the Signaled state. This is the same
behavior exhibited by Win32 Mutex objects.
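
The recursive-ownership rule can be sketched as a small bookkeeping model. This is portable C with hypothetical names (XX_MUTEX, integer thread IDs), not the Kernel's KMUTEX. XxTryAcquire returns false where a real thread would block.

```c
#include <assert.h>
#include <stdbool.h>

#define XX_NO_OWNER 0    /* hypothetical "nobody owns it" thread ID */

typedef struct {
    int  Owner;       /* owning thread ID, or XX_NO_OWNER        */
    long Recursion;   /* ownership requests the owner has made   */
} XX_MUTEX;

static void XxInitMutex(XX_MUTEX *Mutex)
{
    Mutex->Owner = XX_NO_OWNER;   /* a new Mutex starts out Signaled */
    Mutex->Recursion = 0;
}

static bool XxMutexSignaled(const XX_MUTEX *Mutex)
{
    return Mutex->Owner == XX_NO_OWNER;
}

/* True if Thread gets (or already has) ownership without waiting. */
static bool XxTryAcquire(XX_MUTEX *Mutex, int Thread)
{
    assert(Thread != XX_NO_OWNER);
    if (Mutex->Owner == XX_NO_OWNER) {
        Mutex->Owner = Thread;
        Mutex->Recursion = 1;
        return true;
    }
    if (Mutex->Owner == Thread) {
        Mutex->Recursion++;       /* recursive request: never waits */
        return true;
    }
    return false;                 /* a real thread would block here */
}

/* One release per successful acquire; Signaled only after the last one. */
static void XxRelease(XX_MUTEX *Mutex, int Thread)
{
    assert(Mutex->Owner == Thread && Mutex->Recursion > 0);
    if (--Mutex->Recursion == 0)
        Mutex->Owner = XX_NO_OWNER;
}
```

After two acquires by the same thread, the model stays Nonsignaled through the first release and becomes available only after the second.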
It's also crucial that your driver release any Mutexes it might be holding
before it makes a transition back into user mode. The NT Kernel will bugcheck if
any of your driver threads attempt to return control to the I/O Manager while
owning a Mutex. So, for example, a DriverEntry or Dispatch routine isn't allowed
to acquire a Mutex which would later be released by some other Dispatch routine
or by a system thread.
Semaphore Objects
A Semaphore is a dispatcher object that maintains a count. The object
remains Signaled as long as its count is greater than zero, and Nonsignaled when
the count is zero.
Figure 14.3 shows the operation of a Semaphore. Threads B, C, and D are all
waiting for a Semaphore whose count is zero. When thread A calls
KeReleaseSemaphore twice, the count increments to two, and two of the waiting threads are
allowed to resume execution. Waking up the threads also causes the Semaphore
to decrement back to zero.
Again, the driver in Section 14.4 provides a good example. Its Dispatch
routines increment a Semaphore each time they add an IRP to an internal work
queue. As a worker thread removes IRPs from the queue, it decrements the
Semaphore and finally goes into a wait state when the queue is empty.
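
The counting behavior shown in Figure 14.3 can be modeled in a few lines of portable C. XX_SEMAPHORE and the Xx functions are hypothetical stand-ins for the Kernel's KSEMAPHORE and its routines; XxTryWaitSemaphore returns false where a real thread would go into a wait state.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    long Count;   /* Signaled while Count > 0     */
    long Limit;   /* largest value Count may take */
} XX_SEMAPHORE;

static void XxInitSemaphore(XX_SEMAPHORE *Sem, long Count, long Limit)
{
    assert(Limit > 0 && Count >= 0 && Count <= Limit);
    Sem->Count = Count;
    Sem->Limit = Limit;
}

static bool XxSemaphoreSignaled(const XX_SEMAPHORE *Sem)
{
    return Sem->Count > 0;
}

/* Add Adjustment to the count and return the count as it was before. */
static long XxReleaseSemaphore(XX_SEMAPHORE *Sem, long Adjustment)
{
    long Previous = Sem->Count;

    assert(Adjustment > 0 && Previous + Adjustment <= Sem->Limit);
    Sem->Count += Adjustment;
    return Previous;
}

/* A satisfied wait consumes one unit of the count. */
static bool XxTryWaitSemaphore(XX_SEMAPHORE *Sem)
{
    if (Sem->Count == 0)
        return false;   /* a real thread would block here */
    Sem->Count--;
    return true;
}
```

With three would-be waiters and two releases, two waits succeed and the third finds the count back at zero.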

Figure 14.3  How Semaphore objects synchronize system threads
(Diagram labels: Thread A, Release, Semaphore (Count == 2), Thread B, Thread C, Thread D.)
Copyright © 1994 by Cydonix Corporation.

Table 14.7  Use these functions to work with Semaphore objects

How to use Semaphore objects

IF you want to ...        THEN call ...                  IRQL
Create a Semaphore        KeInitializeSemaphore          PASSIVE_LEVEL
Decrement Semaphore       KeWaitForSingleObject          PASSIVE_LEVEL
                          KeWaitForMultipleObjects
Increment Semaphore       KeReleaseSemaphore             <= DISPATCH_LEVEL
Interrogate Semaphore     KeReadStateSemaphore           Any

To use a Semaphore, you need to allocate some storage for an item of type
KSEMAPHORE, then call the functions listed in Table 14.7.
Timer Objects
A Timer is a dispatcher object with a timeout value. When you start a Timer,
it goes into the Nonsignaled state until its timeout value expires. At that point, it
becomes Signaled. In Chapter 10, you saw that Timer objects can cause
CustomTimerDpc routines to execute. Since they are just Kernel dispatcher objects, you
can also use them in calls to KeWaitForXxx.
Figure 14.4 illustrates the operation of a Timer object. Thread A starts a
Timer and then calls KeWaitForSingleObject. The thread blocks until the Timer
expires. At that point, the Timer goes into the Signaled state and the thread
wakes up.

Figure 14.4  How Timer objects synchronize system threads
(Diagram labels: Thread A, SetTimer, Wait, Blocked, Timer, Continue.)
Copyright © 1994 by Cydonix Corporation.


Timer objects actually come in two different flavors: Notification Timers and
Synchronization Timers. You choose the type when you initialize the object.
Although both types of Timer go into the Signaled state when their timeout value
expires, their behavior from that point on is different.
When a Notification Timer times out, it remains in the Signaled state until
it's explicitly reset. While the Timer is Signaled, all threads waiting for the Timer
are awakened. Earlier versions of Windows NT supported only Notification
Timers.
When a Synchronization Timer expires, it remains in the Signaled state only
long enough to satisfy a single KeWaitForXxx request. At that point, the Timer
becomes Nonsignaled automatically. Synchronization Timers are a new feature of
Windows NT 4.0.
To use a Timer, you need to allocate some storage for an item of type
KTIMER and then call the functions listed in Table 14.8.
Thread Objects
System threads are also dispatcher objects, which means they have a signal
state. When a system thread terminates, its Thread object changes from the
Nonsignaled to the Signaled state. This allows your driver to synchronize its cleanup
operations by waiting for the Thread object.
One thing to notice is that when you call PsCreateSystemThread, you get a
handle to the Thread object. If you want to use a Thread object in a call to
KeWaitForXxx, you need a pointer to the object rather than a handle. To convert a handle
into an object pointer, do the following:

1. Call ObReferenceObjectByHandle. This function gives you a pointer to the
   Thread object itself and increments the object's pointer reference count.

2. If you don't need the handle for anything (and you probably don't), call
   ZwClose to release it. This decrements the object's handle reference count.

3. After the thread terminates, call ObDereferenceObject to decrement the
   Thread object's pointer reference count and possibly delete the Thread object.

You can call these functions only from PASSIVE_LEVEL IRQL, which limits
the places in your driver where you can use them.

Table 14.8  Use these functions to work with Timer objects

How to use Timer objects

IF you want to ...          THEN call ...                 IRQL
Create a Timer              KeInitializeTimerEx           PASSIVE_LEVEL
Start a one-shot Timer      KeSetTimer                    <= DISPATCH_LEVEL
Start a repeating Timer     KeSetTimerEx                  <= DISPATCH_LEVEL
Stop a Timer                KeCancelTimer                 <= DISPATCH_LEVEL
Wait for a Timer            KeWaitForSingleObject         PASSIVE_LEVEL
                            KeWaitForMultipleObjects
Interrogate a Timer         KeReadStateTimer              <= DISPATCH_LEVEL

Variations on the Mutex
The NT Executive supports two variations on Mutex objects. The following
subsections describe them briefly. In general, using these objects instead of Kernel
Mutexes can result in better driver performance. See the NT DDK documentation
for more complete information.
Fast Mutexes A Fast Mutex is a synchronization object that acts like a
Kernel Mutex, except that it doesn't allow recursive ownership requests. By
removing this feature, the Fast Mutex doesn't have to do as much work and its
speed improves.
The Fast Mutex itself is an object of type FAST_MUTEX that you associate
with one or more data items needing protection. Any code touching the data
items must acquire ownership of the corresponding FAST_MUTEX first. Use the
functions listed in Table 14.9 to work with Fast Mutexes. Notice that these objects
have their own functions for requesting ownership. You can't use KeWaitForXxx
to acquire Fast Mutexes.
Table 14.9  Use these functions to work with Fast Mutexes

How to use Fast Mutexes

IF you want to ...               THEN call ...             IRQL
Create a Fast Mutex              ExInitializeFastMutex     <= DISPATCH_LEVEL
Request Fast Mutex ownership     ExAcquireFastMutex        < DISPATCH_LEVEL
Give up Fast Mutex ownership     ExReleaseFastMutex        < DISPATCH_LEVEL

Executive Resources Another synchronization object that behaves very
much like a Kernel Mutex is an Executive Resource. Here, the main difference is
that a Resource can either be owned exclusively by a single thread, or shared by
multiple threads for read access. Since it's common (in the real world) for multiple
readers to request simultaneous access to a resource, Executive Resource objects
provide better throughput than standard Kernel Mutexes.
The Executive Resource itself is just an object of type ERESOURCE that you
associate with one or more data items needing protection. Any code planning to
touch the data items has to acquire ownership of the corresponding ERESOURCE
first. Table 14.10 lists the functions that work with Executive Resources. Notice
that these objects have their own functions for requesting ownership. You can't
use KeWaitForXxx to acquire Executive Resources.

Table 14.10  Use these functions to work with Executive Resources

How to use Executive Resources

IF you want to ...   THEN call ...                           IRQL
Create               ExInitializeResourceLite                <= DISPATCH_LEVEL
Acquire              ExAcquireResourceExclusiveLite          < DISPATCH_LEVEL
                     ExAcquireResourceSharedLite             < DISPATCH_LEVEL
                     ExTryToAcquireResourceExclusiveLite     < DISPATCH_LEVEL
                     ExConvertExclusiveToSharedLite          < DISPATCH_LEVEL
Release              ExReleaseResourceForThreadLite          <= DISPATCH_LEVEL
Interrogate          ExIsResourceAcquiredSharedLite          <= DISPATCH_LEVEL
                     ExIsResourceAcquiredExclusiveLite       <= DISPATCH_LEVEL
Delete               ExDeleteResourceLite                    <= DISPATCH_LEVEL

Synchronization Deadlocks
Deadlock situations can occur whenever multiple threads compete for
simultaneous ownership of multiple resources. Figure 14.5 shows the simplest
form of this problem:
1. Thread A acquires resource X.

2. Thread B acquires resource Y.

3. Thread A requests ownership of resource Y and goes into a wait state until B
   releases Y.

4. Thread B then requests ownership of resource X. This causes B to go into a
   wait state until A releases X. Deadlock.

Figure 14.5  How a multiple-resource deadlock occurs
(Diagram labels: Resource X, Resource Y.)
Copyright © 1995 by Cydonix Corporation.

You can cause this kind of deadlock using Events, Mutexes, or Semaphores.
Even Thread objects can deadlock waiting for each other to terminate. There are
two general approaches to solving deadlock problems:
• Use the Timeout argument of the KeWaitForXxx functions to limit the
  time you wait. While this technique may help you detect a deadlock, it
  doesn't really correct the underlying problem.

• Force all the threads using a given set of resources to acquire them in the
  same order. In the previous example, if A and B had both gone after
  resource X first and then Y second, there would have been no deadlock.
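
One way to enforce the second approach during development is to give every resource a rank and check each acquisition against what the thread already holds. The sketch below is a portable C model with hypothetical names and ranks (X is 1, Y is 2). It assumes resources are released in the reverse of the order they were acquired, so the most recently acquired rank is always the highest one held.

```c
#include <assert.h>
#include <stdbool.h>

#define XX_MAX_HELD 8

enum { XX_RANK_X = 1, XX_RANK_Y = 2 };   /* hypothetical ranks: X before Y */

typedef struct {
    int HeldRanks[XX_MAX_HELD];   /* ranks held, in acquisition order */
    int HeldCount;
} XX_THREAD_LOCKS;

/* A thread may only take a resource ranked above everything it holds. */
static bool XxOrderAllows(const XX_THREAD_LOCKS *Thread, int Rank)
{
    return Thread->HeldCount == 0 ||
           Rank > Thread->HeldRanks[Thread->HeldCount - 1];
}

/* Record an acquisition; false means it would break the ordering rule. */
static bool XxNoteAcquire(XX_THREAD_LOCKS *Thread, int Rank)
{
    if (Thread->HeldCount == XX_MAX_HELD || !XxOrderAllows(Thread, Rank))
        return false;
    Thread->HeldRanks[Thread->HeldCount++] = Rank;
    return true;
}

/* Record release of the most recently acquired resource. */
static void XxNoteRelease(XX_THREAD_LOCKS *Thread)
{
    assert(Thread->HeldCount > 0);
    Thread->HeldCount--;
}
```

In the example from the numbered steps, thread A (X, then Y) passes the check, while thread B (Y, then X) is flagged at its second request, before it ever gets the chance to deadlock.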

Mutex objects give you some protection against deadlocks through the use
of level numbers. When you initialize a Mutex, you have to assign a level number
to it. Later, when a thread attempts to acquire the Mutex, the Kernel will not grant
ownership if that thread is holding any Mutex with a lower level number. By
enforcing this policy, the Kernel avoids deadlocks involving multiple Mutexes.

14.4 CODE EXAMPLE: A THREAD-BASED DRIVER
This section presents a modified version of the packet-based slave DMA driver
that you saw back in Chapter 12. What's different about this driver is that it uses a
system thread to do most of the I/O processing. As a result, it spends very little
time at DISPATCH_LEVEL IRQL or DIRQL and doesn't interfere as much with
other system components. You can find the code for this example in the
CH14\DRIVER directory on the disk that accompanies this book.
How the Driver Works
The driver you're about to see is unlike anything that's appeared so far in
this book. Figure 14.6 gives a high-level view of its inner workings. One of the
first things to notice is that the driver has no Start I/O routine. When a user-mode
I/O request arrives, one of the driver's Dispatch routines simply adds the IRP to a
work queue associated with the Device object. Then the Dispatch routine calls
KeReleaseSemaphore to increment a Semaphore object that keeps track of the
number of IRPs in the work queue.
Each Device object has its own system thread that processes these I/O
requests. This thread is in an endless loop that begins with a call to
KeWaitForSingleObject on the Semaphore. If the Semaphore object has a nonzero count, the
thread will remove an IRP from the work queue and perform the I/O operation.
On the other hand, if the count is zero the thread will go into a wait state until the
Dispatch routine inserts another IRP in the queue.
When the thread needs to perform a data transfer, it starts the device and
then uses KeWaitForSingleObject to wait for an Event object. The driver's
DpcForIsr routine will set this Event into the Signaled state after an interrupt arrives.

Figure 14.6  Architecture of the thread-based DMA driver
(Diagram labels: Dispatch Routine, Queue, Semaphore, Thread, Wait, Interrupt
Service Routine, DPC Routine, Event.)
Copyright © 1994 by Cydonix Corporation.

When the driver's Unload routine needs to kill the system thread, it sets a
flag in the Device Extension and increments the Semaphore object. If the thread
was asleep waiting for the Semaphore object, it will wake up, see the flag, and
terminate itself. If it's in the middle of an I/O operation, it won't see the flag until it
completes the current IRP.
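
Before looking at the real code, it may help to see the queue, Semaphore, and stop flag working together in a single-threaded model. This is a portable C sketch under stated assumptions, not the driver: integers stand in for IRPs, a fixed-size ring stands in for the linked list, a plain counter stands in for the Semaphore, and XxThreadStep represents one pass through the thread's loop. The model also assumes the thread checks the stop flag right after each wake-up and before it dequeues anything, which is one reading of the description above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define XX_QUEUE_DEPTH 16   /* power of two, so index wraparound is harmless */

typedef struct {
    int    Irps[XX_QUEUE_DEPTH];   /* stand-ins for queued IRPs           */
    size_t Head, Tail;             /* remove at Head, insert at Tail      */
    long   SemaphoreCount;         /* models the IRP queue Semaphore      */
    bool   ThreadShouldStop;       /* set by the Unload routine           */
} XX_DEVICE;

typedef enum { XxWouldWait, XxProcessedIrp, XxTerminated } XX_STEP;

/* Dispatch routine: queue the request, then increment the Semaphore. */
static bool XxDispatch(XX_DEVICE *Dev, int Irp)
{
    if (Dev->Tail - Dev->Head == XX_QUEUE_DEPTH)
        return false;                          /* model queue is full */
    Dev->Irps[Dev->Tail++ % XX_QUEUE_DEPTH] = Irp;
    Dev->SemaphoreCount++;
    return true;
}

/* Unload routine: set the flag, then increment the Semaphore to wake the thread. */
static void XxRequestStop(XX_DEVICE *Dev)
{
    Dev->ThreadShouldStop = true;
    Dev->SemaphoreCount++;
}

/* One pass through the worker thread's loop. */
static XX_STEP XxThreadStep(XX_DEVICE *Dev, int *Irp)
{
    assert(Irp != NULL);
    if (Dev->SemaphoreCount == 0)
        return XxWouldWait;                    /* real thread sleeps here */
    Dev->SemaphoreCount--;                     /* the wait was satisfied  */
    if (Dev->ThreadShouldStop)
        return XxTerminated;                   /* real thread exits here  */
    *Irp = Dev->Irps[Dev->Head++ % XX_QUEUE_DEPTH];
    return XxProcessedIrp;
}
```

Notice that the extra increment in XxRequestStop is what guarantees a sleeping thread gets a chance to look at the flag.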
The DEVICE_EXTENSION Structure in XXDRIVER.H
This file contains all the usual driver-defined data structures. The following
excerpt shows only those fields that the driver needs in order to manage the
system thread and its work queue. Other fields are identical to those in the
packet-based slave DMA example of Chapter 12.

    typedef struct _DEVICE_EXTENSION
    {
        /* Other fields as in Chapter 12 (not shown) */
        PETHREAD ThreadObject;                  /* (1) */
        BOOLEAN ThreadShouldStop;
        KEVENT AdapterObjectIsAcquired;         /* (2) */
        KEVENT DeviceOperationComplete;
        KSEMAPHORE IrpQueueSemaphore;           /* (3) */
        LIST_ENTRY IrpQueueListHead;
        KSPIN_LOCK IrpQueueSpinLock;
    } DEVICE_EXTENSION, *PDEVICE_EXTENSION;

(1) Once the thread is running, other parts of the driver can use the Thread
    object pointer to synchronize with it. The BOOLEAN flag tells the thread
    when it's time to shut down.
(2) The thread waits for these Event objects at appropriate places in its
    processing cycle. Other parts of the driver set them into the Signaled
    state when interesting things happen.
(3) The work queue consists of a doubly-linked list guarded by a spin lock and
    a Semaphore object that keeps track of the number of IRPs in the queue.
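To make the work-queue layout concrete, here is a minimal user-mode sketch of an intrusive doubly-linked list of the kind LIST_ENTRY implements. It is an assumption-laden illustration rather than the kernel's code: the names ListEntry, Request, CONTAINER_OF, list_init, list_insert_tail, and list_remove_head are all hypothetical, and the spin lock that guards the real queue is left out because the sketch is single-threaded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for LIST_ENTRY: a link embedded in each item. */
typedef struct ListEntry {
    struct ListEntry *flink;   /* next entry     */
    struct ListEntry *blink;   /* previous entry */
} ListEntry;

/* Hypothetical stand-in for an IRP: the link lives inside the request. */
typedef struct Request {
    int       id;
    ListEntry link;
} Request;

/* Recover the enclosing structure from a pointer to its embedded link,
 * which is the job CONTAINING_RECORD does in the driver. */
#define CONTAINER_OF(ptr, type, field) \
    ((type *)((char *)(ptr) - offsetof(type, field)))

/* An empty list is a head that points at itself in both directions. */
void list_init(ListEntry *head)
{
    head->flink = head;
    head->blink = head;
}

/* Append an entry just before the head, that is, at the tail. */
void list_insert_tail(ListEntry *head, ListEntry *entry)
{
    entry->flink = head;
    entry->blink = head->blink;
    head->blink->flink = entry;
    head->blink = entry;
}

/* Detach and return the first entry, or NULL if the list is empty. */
ListEntry *list_remove_head(ListEntry *head)
{
    ListEntry *entry = head->flink;
    if (entry == head)
        return NULL;
    head->flink = entry->flink;
    entry->flink->blink = head;
    return entry;
}
```

Inserting at the tail and removing from the head gives first-in, first-out ordering, so requests are serviced in the order they were queued.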
The XxCreateDevice Function in INIT.C
This portion of the example shows the initialization code for the Thread
object, the work queue, and the various synchronization objects used to process
an I/O request. Remember that DriverEntry calls XxCreateDevice once for each
Device object.

    static NTSTATUS
    XxCreateDevice(
        IN PDRIVER_OBJECT DriverObject,
        IN INTERFACE_TYPE BusType,
        IN ULONG BusNumber,
        IN PDEVICE_BLOCK DeviceBlock,
        IN ULONG NtDeviceNumber
        )
    {
        . . .
        KeInitializeSpinLock(                   /* (1) */
            &pDevExt->IrpQueueSpinLock );

        InitializeListHead(
            &pDevExt->IrpQueueListHead );

        KeInitializeSemaphore(
            &pDevExt->IrpQueueSemaphore,
            0,
            MAXLONG );

        KeInitializeEvent(                      /* (2) */
            &pDevExt->AdapterObjectIsAcquired,
            SynchronizationEvent,
            FALSE );

        KeInitializeEvent(
            &pDevExt->DeviceOperationComplete,
            SynchronizationEvent,
            FALSE );

        pDevExt->ThreadShouldStop = FALSE;

        status = PsCreateSystemThread(          /* (3) */
                    &ThreadHandle,
                    (ACCESS_MASK)0,
                    NULL,
                    (HANDLE)0,
                    NULL,
                    XxThreadMain,
                    pDevExt );

        if( !NT_SUCCESS( status ))
        {
            IoDeleteSymbolicLink( &linkName );
            IoDeleteDevice( pDevObj );
            return status;
        }

        ObReferenceObjectByHandle(              /* (4) */
            ThreadHandle,
            THREAD_ALL_ACCESS,
            NULL,
            KernelMode,
            &pDevExt->ThreadObject,
            NULL );

        ZwClose( ThreadHandle );

        IoConnectInterrupt( . . . );
        . . .
    }

(1) This section of code sets up the work queue used by the thread.
(2) These calls initialize the Event objects that signal ownership of the
    Adapter object and the arrival of a device interrupt. Notice that they're
    both synchronization (i.e., auto-reset) Events.
(3) The call to PsCreateSystemThread starts the thread. The entry point
    function is XxThreadMain and it will receive a pointer to the Device
    Extension as its Context argument. Because this is an asynchronous
    operation, the status of PsCreateSystemThread is only telling you that the
    thread was started successfully. It says nothing about what happens to
    the thread afterwards.
(4) PsCreateSystemThread gives back a handle to the thread rather than a
    pointer to the Thread object itself. This section of code gets a pointer to
    the object and then releases the (unneeded) handle.


The XxDispatchReadWrite Function in DISPATCH.C
This portion of the example shows how the Dispatch routine of this driver
works. Its operation is relatively straightforward: After checking for a zero-length
transfer, it puts the IRP into the pending state and inserts it into the work queue
attached to the target Device object. It then increments the count in the work
queue's Semaphore object. Notice that there are no calls to IoStartPacket because
there is no Start I/O routine.

    NTSTATUS
    XxDispatchReadWrite(
        IN PDEVICE_OBJECT pDO,
        IN PIRP Irp
        )
    {
        PIO_STACK_LOCATION IrpStack =
            IoGetCurrentIrpStackLocation( Irp );
        PDEVICE_EXTENSION pDE = pDO->DeviceExtension;

        //
        // Check for zero-length transfers
        //
        if( IrpStack->Parameters.Read.Length == 0 )
        {
            Irp->IoStatus.Status = STATUS_SUCCESS;
            Irp->IoStatus.Information = 0;
            IoCompleteRequest( Irp, IO_NO_INCREMENT );
            return STATUS_SUCCESS;
        }

        //
        // Start device operation
        //
        IoMarkIrpPending( Irp );

        //
        // Add the IRP to the thread's work queue
        //
        ExInterlockedInsertTailList(
            &pDE->IrpQueueListHead,
            &Irp->Tail.Overlay.ListEntry,
            &pDE->IrpQueueSpinLock );

        KeReleaseSemaphore(
            &pDE->IrpQueueSemaphore,
            0,          // No priority boost
            1,          // Increment semaphore by 1
            FALSE );    // No WaitForXxx after this call

        return STATUS_PENDING;
    }
THREAD.C
This module contains the main thread function and any routines needed to
manage the thread.
XxThreadMain
Here is the IRP-processing engine itself. Its job is to pull I/O requests from
the work queue in the Device Extension and perform the data transfer operation.
This function continues to wait for new IRPs until the Unload routine tells it
to shut down.

    VOID
    XxThreadMain(
        IN PVOID Context
        )
    {
        PDEVICE_EXTENSION DevExtension = Context;
        PDEVICE_OBJECT DeviceObject =
            DevExtension->DeviceObject;
        PLIST_ENTRY ListEntry;
        PIRP Irp;
        CCHAR PriorityBoost;

        KeSetPriorityThread(                    /* (1) */
            KeGetCurrentThread(),
            LOW_REALTIME_PRIORITY );

        //
        // Now enter the main IRP-processing loop
        //
        while( TRUE )
        {
            KeWaitForSingleObject(              /* (2) */
                &DevExtension->IrpQueueSemaphore,
                Executive,
                KernelMode,
                FALSE,
                NULL );

            if( DevExtension->ThreadShouldStop )    /* (3) */
                PsTerminateSystemThread( STATUS_SUCCESS );

            //
            // It must be a real request. Get an IRP
            //
            ListEntry =
                ExInterlockedRemoveHeadList(
                    &DevExtension->IrpQueueListHead,
                    &DevExtension->IrpQueueSpinLock );

            Irp = CONTAINING_RECORD(
                    ListEntry,
                    IRP,
                    Tail.Overlay.ListEntry );

            PriorityBoost =
                XxPerformDataTransfer(          /* (4) */
                    DeviceObject,
                    Irp );

            IoCompleteRequest( Irp, PriorityBoost );
        }
    }

(1) System threads normally start running down in the variable priority
    range. The usual practice is to move the thread to the lowest of the
    time-critical scheduling priorities.
(2) The thread will wait here indefinitely for an IRP to appear in the work
    queue or for the Unload routine to stop the thread.
(3) When the thread awakens, it has to see whether the wake-up call was the
    result of an I/O request or a thread shutdown signal. The flag in the
    Device Extension will give a clue.
(4) This function processes the IRP. This is a synchronous call which doesn't
    return until the data transfer operation is done. It returns a priority boost
    value which the thread then uses when it completes the IRP. After
    releasing the IRP, the thread goes back to the top of the loop and waits
    for the Semaphore object again.
XxKillThread
This function notifies the thread associated with a particular Device object
that it's time to quit. To simplify things, this function stops and waits until
the target thread is gone. Consequently, it can only be called from
PASSIVE_LEVEL IRQL.

    VOID
    XxKillThread(
        IN PDEVICE_EXTENSION pDE
        )
    {
        //
        // Set the Stop flag
        //
        pDE->ThreadShouldStop = TRUE;

        //
        // Make sure the thread wakes up
        //
        KeReleaseSemaphore(
            &pDE->IrpQueueSemaphore,
            0,          // No priority boost
            1,          // Increment semaphore by 1
            TRUE );     // WaitForXxx after this call

        //
        // Wait for the thread to terminate. ThreadObject
        // already holds a pointer to the Thread object,
        // so pass it directly rather than its address.
        //
        KeWaitForSingleObject(
            pDE->ThreadObject,
            Executive,
            KernelMode,
            FALSE,
            NULL );

        ObDereferenceObject( pDE->ThreadObject );
    }

TRANSFER.C
This portion of the example contains the support routines that perform I/O
operations. A great deal of what's in here is derived from the packet-based slave
DMA driver in Chapter 12. Consequently, only those features that differ
significantly will be described in detail.
The main thing to notice is that very little work actually happens inside the
Adapter Control or DpcForIsr routines. Instead of doing their usual jobs, these
functions just set Event objects to signal the thread's data transfer routines that
they can proceed.

XxPerformDataTransfer
This function moves an entire buffer of data to or from the device. This may
include splitting the transfer over several device operations if there aren't
enough mapping registers to handle it all at once. This routine runs at
PASSIVE_LEVEL IRQL and doesn't return to the caller until everything is done.
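The arithmetic behind splitting a transfer can be tried out in user mode. The sketch below is an illustration under stated assumptions, not the driver's code: it assumes a 4096-byte page, models addresses as plain integers, and uses hypothetical names (MODEL_PAGE_SIZE, span_pages, next_transfer_size, count_transfers). span_pages follows the same idea as ADDRESS_AND_SIZE_TO_SPAN_PAGES: count the pages touched by a buffer that may start partway into a page.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u   /* assumed page size for this sketch */

/* Number of pages touched by the byte range [va, va + size). */
size_t span_pages(uintptr_t va, size_t size)
{
    size_t offset = (size_t)(va & (MODEL_PAGE_SIZE - 1));
    return (offset + size + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;
}

/* Size of the next partial transfer: everything that remains, unless
 * that would need more mapping registers than are available. In that
 * case, use all the registers and stop at the last page boundary they
 * cover. */
size_t next_transfer_size(uintptr_t va, size_t remaining, size_t map_regs)
{
    size_t size = remaining;
    assert(map_regs > 0);
    if (span_pages(va, size) > map_regs) {
        size = map_regs * MODEL_PAGE_SIZE
             - (size_t)(va & (MODEL_PAGE_SIZE - 1));
    }
    return size;
}

/* Count the device operations needed to move the whole buffer. */
size_t count_transfers(uintptr_t va, size_t length, size_t map_regs)
{
    size_t operations = 0;
    while (length > 0) {
        size_t size = next_transfer_size(va, length, map_regs);
        va += size;
        length -= size;
        operations++;
    }
    return operations;
}
```

For example, a 20000-byte buffer starting 0x100 bytes into a page, with two mapping registers available, splits into transfers of 7936, 8192, and 3872 bytes. Only the first transfer is shortened by the starting offset, because every later one begins on a page boundary.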

    CCHAR
    XxPerformDataTransfer(
        IN PDEVICE_OBJECT DeviceObject,
        IN PIRP Irp
        )
    {
        PIO_STACK_LOCATION IrpStack =
            IoGetCurrentIrpStackLocation( Irp );
        PDEVICE_EXTENSION pDE =
            DeviceObject->DeviceExtension;
        PMDL Mdl = Irp->MdlAddress;
        ULONG MapRegsNeeded;
        NTSTATUS status;

        //
        // Set the I/O direction flag
        //
        if( IrpStack->MajorFunction == IRP_MJ_WRITE )
            pDE->WriteToDevice = TRUE;
        else
            pDE->WriteToDevice = FALSE;

        //
        // Set up bookkeeping values
        //
        pDE->BytesRequested =
            MmGetMdlByteCount( Mdl );
        pDE->BytesRemaining =
            pDE->BytesRequested;
        pDE->TransferVA =
            MmGetMdlVirtualAddress( Mdl );

        //
        // Flush CPU cache if necessary
        //
        KeFlushIoBuffers(
            Irp->MdlAddress,
            !pDE->WriteToDevice,
            TRUE );

        //
        // Calculate size of first partial
        // transfer
        //
        pDE->TransferSize = pDE->BytesRemaining;

        MapRegsNeeded =
            ADDRESS_AND_SIZE_TO_SPAN_PAGES(
                pDE->TransferVA,
                pDE->TransferSize );

        if( MapRegsNeeded > pDE->MapRegisterCount )
        {
            MapRegsNeeded =
                pDE->MapRegisterCount;
            pDE->TransferSize =
                MapRegsNeeded * PAGE_SIZE -
                    MmGetMdlByteOffset( Mdl );
        }

        //
        // Acquire the adapter object.
        //
        status = XxAcquireAdapterObject(        /* (1) */
                    pDE,
                    MapRegsNeeded );
        if( !NT_SUCCESS( status ))
        {
            Irp->IoStatus.Status = status;
            Irp->IoStatus.Information = 0;
            return IO_NO_INCREMENT;
        }

        //
        // Try to perform the first partial
        // transfer
        //
        status =
            XxPerformSynchronousTransfer(       /* (2) */
                DeviceObject,
                Irp );
        if( !NT_SUCCESS( status ))
        {
            IoFreeAdapterChannel( pDE->AdapterObject );
            Irp->IoStatus.Status = status;
            Irp->IoStatus.Information = 0;
            return IO_NO_INCREMENT;
        }

        //
        // It worked. Update the bookkeeping
        // information.
        //
        pDE->TransferVA += pDE->TransferSize;
        pDE->BytesRemaining -= pDE->TransferSize;

        while( pDE->BytesRemaining > 0 )        /* (3) */
        {
            //
            // Try to do all of it in one operation
            //
            pDE->TransferSize = pDE->BytesRemaining;
            MapRegsNeeded =
                ADDRESS_AND_SIZE_TO_SPAN_PAGES(
                    pDE->TransferVA,
                    pDE->TransferSize );

            //
            // If the remainder of the buffer is more
            // than we can handle in one I/O, reduce
            // our expectations.
            //
            if( MapRegsNeeded > pDE->MapRegisterCount )
            {
                MapRegsNeeded =
                    pDE->MapRegisterCount;
                pDE->TransferSize =
                    MapRegsNeeded * PAGE_SIZE -
                        BYTE_OFFSET( pDE->TransferVA );
            }

            //
            // Try to perform a device operation.
            //
            status =
                XxPerformSynchronousTransfer(
                    DeviceObject,
                    Irp );
            if( !NT_SUCCESS( status )) break;

            //
            // It worked. Update the bookkeeping
            // information for the next cycle.
            //
            pDE->TransferVA += pDE->TransferSize;
            pDE->BytesRemaining -= pDE->TransferSize;
        }

        IoFreeAdapterChannel( pDE->AdapterObject );    /* (4) */

        Irp->IoStatus.Status = status;                  /* (5) */
        Irp->IoStatus.Information =
            pDE->BytesRequested - pDE->BytesRemaining;

        //
        // Since there has been at least one I/O
        // operation, give the IRP a priority boost.
        //
        return IO_DISK_INCREMENT;                       /* (6) */
    }

(1) Before starting a data transfer, the Device object has to acquire its Adapter
    object. The thread calls this synchronous helper function to grab the
    Adapter object. This is different from the callback model used by the
    DMA driver in Chapter 12.
(2) Once the Adapter object is secured, the driver can try to perform the first
    partial data transfer. Again, since this code is running in the context of a
    system thread, it can stop and wait for the I/O operation to complete. If
    there's an error, processing stops and the IRP is sent back with no priority
    boost.
(3) If there's more data to transfer, continue to step through the buffer and
    perform partial DMA transfers.
(4) When the last partial transfer is done, release the DMA Adapter object.
(5) The final status of the IRP will be the status of the last data transfer
    operation. Also calculate the number of bytes actually transferred.
(6) Tell the caller to apply a priority boost to the IRP. This makes sense since
    there has been at least one actual device operation.
XxAcquireAdapterObject and XxAdapterControl
These two functions work together to give the thread a synchronous mechanism
for acquiring ownership of the Adapter object. XxAcquireAdapterObject runs in
the context of a system thread so it can stop and wait for a nonzero time
interval.

    static NTSTATUS
    XxAcquireAdapterObject(
        IN PDEVICE_EXTENSION pDE,
        IN ULONG MapRegsNeeded
        )
    {
        KIRQL OldIrql;
        NTSTATUS status;

        KeRaiseIrql( DISPATCH_LEVEL, &OldIrql );    /* (1) */

        status = IoAllocateAdapterChannel(
                    pDE->AdapterObject,
                    pDE->DeviceObject,
                    MapRegsNeeded,
                    XxAdapterControl,
                    pDE );

        KeLowerIrql( OldIrql );

        //
        // If the call failed, it's because there
        // weren't enough mapping registers.
        //
        if( !NT_SUCCESS( status ))
        {
            return status;
        }

        KeWaitForSingleObject(                  /* (2) */
            &pDE->AdapterObjectIsAcquired,
            Executive,
            KernelMode,
            FALSE,
            NULL );

        return STATUS_SUCCESS;
    }

    static IO_ALLOCATION_ACTION
    XxAdapterControl(
        IN PDEVICE_OBJECT DeviceObject,
        IN PIRP Irp,
        IN PVOID MapRegisterBase,
        IN PVOID Context
        )
    {
        PDEVICE_EXTENSION pDE = Context;

        pDE->MapRegisterBase = MapRegisterBase;     /* (3) */

        KeSetEvent(                             /* (4) */
            &pDE->AdapterObjectIsAcquired,
            0,
            FALSE );

        return KeepObject;                      /* (5) */
    }

(1) Only code running at DISPATCH_LEVEL IRQL can request ownership of
    the Adapter object. Consequently, this routine raises its IRQL before
    calling IoAllocateAdapterChannel. Once it makes the call, it returns to
    PASSIVE_LEVEL IRQL.
(2) The function then stops and waits for the Adapter Control routine to set a
    synchronization Event. That will be the signal that the Adapter object has
    been acquired.
(3) It's important for the Adapter Control routine to store the mapping
    register handle because the thread will need it to set up any DMA data
    transfers.
(4) Next, let the waiting thread know that it can use the DMA hardware.
(5) Finally, return a value of KeepObject in order to hold on to the Adapter
    object.

XxPerformSynchronousTransfer
Running in the context of the system thread, this function performs a single
data transfer operation. It doesn't return to the caller until the transfer
finishes. The main thing to notice here is that the function uses an Event
object to wait for the arrival of a device interrupt.

    static NTSTATUS
    XxPerformSynchronousTransfer(
        IN PDEVICE_OBJECT DeviceObject,
        IN PIRP Irp
        )
    {
        PDEVICE_EXTENSION pDE =
            DeviceObject->DeviceExtension;

        //
        // Set up the system DMA controller
        // attached to this device.
        //
        IoMapTransfer(
            pDE->AdapterObject,
            Irp->MdlAddress,
            pDE->MapRegisterBase,
            pDE->TransferVA,
            &pDE->TransferSize,
            pDE->WriteToDevice );

        //
        // Start the device
        //
        XxWriteControl(
            pDE,
            XX_CTL_INTENB | XX_CTL_DMA_GO );

        //
        // The DPC routine will set an Event
        // object when the I/O operation is
        // done. Stop here and wait for it.
        //
        KeWaitForSingleObject(
            &pDE->DeviceOperationComplete,
            Executive,
            KernelMode,
            FALSE,
            NULL );

        //
        // Flush data out of the Adapter
        // object cache.
        //
        IoFlushAdapterBuffers(
            pDE->AdapterObject,
            Irp->MdlAddress,
            pDE->MapRegisterBase,
            pDE->TransferVA,
            pDE->TransferSize,
            pDE->WriteToDevice );

        //
        // Check for device errors
        //
        if( !XX_STS_OK( pDE->DeviceStatus ))
            return STATUS_DEVICE_DATA_ERROR;
        else
            return STATUS_SUCCESS;
    }

XxDpcForIsr
When the device generates an interrupt, the Interrupt Service routine (not
shown here) saves the status of the hardware and requests a DPC. Eventually,
XxDpcForIsr executes and just sets an Event object into the Signaled state.
XxPerformSynchronousTransfer (which has been waiting for this Event object)
wakes up and continues processing the current IRP.

    VOID
    XxDpcForIsr(
        IN PKDPC Dpc,
        IN PDEVICE_OBJECT DeviceObject,
        IN PIRP Irp,
        IN PVOID Context
        )
    {
        PDEVICE_EXTENSION pDE = Context;

        KeSetEvent(
            &pDE->DeviceOperationComplete,
            0,
            FALSE );

        return;
    }

14.5 SUMMARY
This chapter has presented you with an alternative driver architecture based on
the use of system threads. Although it's not a good choice for most drivers, this
model can be useful if you're trying to manage certain kinds of legacy devices, or
devices that would interfere with normal system operation if you used the
standard interrupt-driven architecture.
Now that you have a good understanding of how to work at the hardware
level, it's time to see how higher-level drivers are organized. That's the subject of
the next chapter.

CHAPTER 15

Higher-Level Drivers

One of the I/O Manager's nifty features is that it lets you stack drivers on top
of one another. This permits one driver to use another as a prepackaged
component and send requests to it just as a user-mode thread might. As you saw
back in Chapter 1, NT's SCSI and network driver architectures both rely on this
building-block approach. This chapter describes the techniques you need to use
if you want to design your own driver hierarchies.

15.1 AN OVERVIEW OF INTERMEDIATE DRIVERS
Before getting into a discussion of writing intermediate drivers, it's a good idea to
define just what they are. This section also explores some of the trade-offs
inherent in using a hierarchical driver architecture.
What Are Intermediate Drivers?
For the purposes of this chapter, an intermediate driver is any kernel-mode
driver that issues I/O requests to another driver. Intermediate drivers are not
usually responsible for any direct, register-level manipulation of hardware
resources. Instead, they often depend on a lower-level device driver to perform
hardware operations. This may seem like an overly broad definition, but the truth
is that intermediate drivers can assume a wide variety of shapes.

From an implementation standpoint, you can classify an intermediate driver
according to its relationship with the driver directly below it. Taking this
approach, you end up with three distinct groups:

• Layered drivers - This generic category includes just about any driver
  that uses the I/O Manager's standard calling mechanism to send requests
  to another driver.

• Filter drivers - This is a special category of intermediate drivers that
  transparently intercept requests intended for some other driver. These
  drivers also use the I/O Manager's standard calling mechanism.

• Tightly coupled drivers - This category includes any pair of drivers
  that define a private interface between themselves - one that doesn't use
  the I/O Manager's calling mechanism for the bulk of the communication.

Later parts of this chapter will explain how to develop drivers in each of
these families.

Should You Use a Layered Architecture?
One important thing to decide is whether your driver design would benefit
from being broken into a series of layers, or whether it should be structured as a
single monolithic unit. The following will help you understand the trade-offs of
taking a layered approach.

Why you should
Depending on your goals, using multiple driver layers can provide a number of
benefits. For example, it allows you to separate higher-level protocol issues
from management of the specific underlying hardware. This makes it possible to
support a wider variety of hardware without having to rewrite large amounts of
code. It also promotes flexibility by allowing the same protocol driver to plug
into different hardware drivers at runtime. This is the approach taken by NT
network drivers.
If several different kinds of peripherals can all be attached to the same
controller (as in the case of a SCSI adapter), layering allows you to decouple
management of the peripheral from management of the controller. To do this, you
write a single device driver for the controller (the port driver) and separate
higher-level class drivers for each type of attached peripheral. The two main
benefits here are that the class drivers are smaller and simpler and (assuming a
well-defined protocol) the class and port drivers can come from different
vendors.[1]

[1] This is exactly what NT's SCSI architecture does. Expect to see more of this
kind of thing in future versions of Windows NT when buses like the IEEE 1394 bus
and the Universal Serial Bus make their appearance.


Layering also makes it possible to hide hardware limitations from users of a
device, or to add features not supported by the hardware itself. For example, if a
given piece of hardware can only handle transfers of a certain size, you might
stack another driver on top of it that would break oversized transfers into smaller
pieces. Users of the device would be unaware of the device's shortcomings.
Inserting driver layers gives you a transparent way to add or remove
features from a product without having to maintain multiple code bases for the
same product. NT's fault-tolerant disks are one example of this. They're
implemented as a separate driver layer which is shipped with NT Server but not
with NT Workstation.

Why you shouldn't
Of course, there are costs you have to consider if you're thinking about a
layered architecture. First of all, I/O requests incur some extra overhead
because each IRP has to take a trip through the I/O Manager every time it passes
from one driver to another. To some extent, you can reduce this overhead by
defining a private interdriver interface that partially bypasses the I/O Manager.
It also takes somewhat more design effort to make sure that the separate driver
components fit together seamlessly. In the absence of an external standard, this can
be especially painful if some of the drivers are coming from different vendors.
Since the overall functionality is no longer contained in a single driver
executable, there's somewhat more bookkeeping involved in managing the drivers.
This also has some impact on maintaining version compatibility between various
members of the hierarchy.
Finally, installing layered drivers is a little more involved since each one will
need its own area in the Registry. In addition, it's necessary to set up dependency
relationships among the various drivers in the hierarchy to make sure they start in
the proper order.[2]

[2] See Chapter 16 for more information about creating startup dependencies
among drivers.

15.2 WRITING LAYERED DRIVERS
Layered drivers are the most general type of intermediate driver. They depend for
their operation on a well-defined interdriver calling mechanism provided by the I/O
Manager. This is the first of three sections that explain how this mechanism works,
and what a driver needs to do if it wants to use another driver as a component.

How Layered Drivers Work
As you can see from Figure 15.1, a layered driver exposes one or more
named Device objects to which clients send I/O requests. When an IRP
representing one of these requests arrives, the layered driver can process it in
two different ways: In some cases, it might send the IRP directly to a lower-level
driver. Alternatively, the layered driver might hold the IRP in a pending state
while it allocates additional IRPs and sends them to one or more lower-level
drivers.

Figure 15.1 How a layered driver works. Labels shown: IRP, IoCallDriver, return,
and IoCompleteRequest.
Copyright © 1996 by Cydonix Corporation.

If the layered driver needs to regain control after a lower-level driver
finishes with an IRP, it can attach an I/O Completion routine to the IRP. This
routine will execute when the lower driver calls IoCompleteRequest.
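The control flow of a completion routine can be illustrated with a deliberately simplified user-mode model. Nothing here is the I/O Manager's actual implementation: ModelIrp, ModelStackSlot, model_set_completion, model_call_lower, model_complete, and record_status are hypothetical names, and the model keeps only the idea that each layer owns a slot in the request, that passing the request down advances to the next slot, and that completion walks back up, calling any routine an upper layer registered.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_MAX_STACK 4   /* arbitrary depth for this sketch */

struct ModelIrp;
typedef void (*CompletionFn)(struct ModelIrp *irp, void *context);

/* One slot per driver layer. */
typedef struct {
    CompletionFn completion;   /* NULL if the layer registered nothing */
    void        *context;
} ModelStackSlot;

typedef struct ModelIrp {
    int            status;
    int            current;    /* index of the slot owned by the current layer */
    ModelStackSlot stack[MODEL_MAX_STACK];
} ModelIrp;

/* The current layer asks to be called back when lower layers finish. */
void model_set_completion(ModelIrp *irp, CompletionFn fn, void *context)
{
    irp->stack[irp->current].completion = fn;
    irp->stack[irp->current].context = context;
}

/* Hand the request to the next lower layer (the role IoCallDriver plays). */
void model_call_lower(ModelIrp *irp)
{
    assert(irp->current + 1 < MODEL_MAX_STACK);
    irp->current++;
}

/* The lowest layer finishes the request (the role IoCompleteRequest
 * plays); each upper layer's routine then runs, nearest layer first. */
void model_complete(ModelIrp *irp, int status)
{
    irp->status = status;
    while (irp->current > 0) {
        ModelStackSlot *slot;
        irp->current--;
        slot = &irp->stack[irp->current];
        if (slot->completion != NULL)
            slot->completion(irp, slot->context);
    }
}

/* Example completion routine: record the final status that was seen. */
void record_status(ModelIrp *irp, void *context)
{
    *(int *)context = irp->status;
}
```

In this model the upper layer regains control only through the routine it registered, which mirrors the reason a layered driver attaches an I/O Completion routine before passing an IRP down.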

Initialization and Cleanup in Layered Drivers
Like every other kernel-mode driver, a layered driver must have a main
entry point called DriverEntry. If the driver is to be unloaded while the system is
running, it needs an Unload routine as well. The following subsections describe
what these routines have to do.

DriverEntry routine
The initialization steps performed by a layered driver are similar to those of a
regular device driver. The main difference is that a layered driver doesn't have
any direct contact with hardware, so all the hardware detection and allocation
code that you saw in Chapter 7 will be missing. In general, the DriverEntry
routine of a layered driver will do the following:

1. It uses IoCreateDevice to build the upper-level Device object that will be
   seen by the outside world. Like the Device objects created by hardware
   drivers, this one has its own unique name.

2. DriverEntry then calls IoGetDeviceObjectPointer. Given a device name, this
   function returns the address of the target Device object and a pointer to a
   File object associated with the target Device. Normally, DriverEntry saves
   the target Device object pointer in the Device Extension of the upper-level
   Device object.

3. Next, it increments the pointer reference count on the target Device object
   by calling ObReferenceObjectByPointer. This is necessary because
   IoGetDeviceObjectPointer automatically increments the reference count on
   the File object pointer, but not the reference count on the target Device
   object.

4. Then, DriverEntry calls ObDereferenceObject to decrement the pointer
   reference count on the File object associated with the target Device object.

5. If the layered driver forwards incoming IRPs to the target Device object,
   DriverEntry should set the layered Device object's StackSize field to a value
   one greater than the StackSize field of the target Device object. This
   guarantees that there will be enough stack slots for all the drivers in the
   hierarchy.

6. If the lower-level driver requires it, DriverEntry can fabricate an IRP with
   IRP_MJ_CREATE as its major function code and send it to the target Device
   object.

7. If the Device object is going to be visible to Win32 applications, DriverEntry
   calls IoCreateSymbolicLink to add its Win32 name to the \DosDevices area
   of the Object Manager's namespace.

The layered driver can now use the target Device object pointer to make
calls to the lower-level driver.

Unload routine
When a layered driver unloads itself, it basically reverses the sequence of
operations it performed at initialization time. Once again, since the driver is
not working directly with the hardware, it won't need to release any hardware
resources. Although the exact steps may vary, a layered driver's Unload routine
will generally do the following:

1. It calls IoDeleteSymbolicLink to remove the upper-level Device object's
   Win32 name from the Object Manager's namespace.

2. If the lower-level driver requires it, the layered driver's Unload routine can
   fabricate an IRP with IRP_MJ_CLOSE as its major function code and send it to
   the target Device object.

3. Next, the Unload routine decrements the target Device object's pointer
   reference count by calling ObDereferenceObject. This effectively breaks the
   connection with the target Device object.

4. Finally, it destroys the upper-level Device object by calling IoDeleteDevice.

Code Fragment: Connecting to Another Driver
The following code fragment (taken from somewhere in the flow of a
DriverEntry routine) shows how one driver might layer itself on top of
another. In this example, the lower-level driver XXDRIVER exposes a device
called (what else) XX0 and the layered driver (YYDRIVER) exposes YY0.

    UNICODE_STRING UpperDeviceName;
    PDEVICE_OBJECT UpperDeviceObject;
    PDEVICE_EXTENSION UpperExtension;
    UNICODE_STRING LowerDeviceName;
    PDEVICE_OBJECT LowerDeviceObject;
    PFILE_OBJECT LowerFileObject;
    NTSTATUS status;

    RtlInitUnicodeString(                       /* (1) */
        &UpperDeviceName,
        L"\\Device\\YY0" );

    RtlInitUnicodeString(
        &LowerDeviceName,
        L"\\Device\\XX0" );

    //
    // Only the name and the returned object pointer
    // are shown; the other arguments are omitted
    // from this fragment.
    //
    status = IoCreateDevice(
                . . .
                &UpperDeviceName,
                . . .
                &UpperDeviceObject );

    UpperExtension = UpperDeviceObject->DeviceExtension;

    status = IoGetDeviceObjectPointer(          /* (2) */
                &LowerDeviceName,
                FILE_ALL_ACCESS,
                &LowerFileObject,
                &LowerDeviceObject );

    status = ObReferenceObjectByPointer(        /* (3) */
                LowerDeviceObject,
                FILE_ALL_ACCESS,
                NULL,
                KernelMode );

    ObDereferenceObject( LowerFileObject );

    UpperExtension->LowerDevice = LowerDeviceObject;    /* (4) */

    UpperDeviceObject->StackSize =
        LowerDeviceObject->StackSize + 1;               /* (5) */

    UpperDeviceObject->Flags |=
        ( LowerDeviceObject->Flags &
            ( DO_BUFFERED_IO | DO_DIRECT_IO ));

    UpperDeviceObject->AlignmentRequirement =
        LowerDeviceObject->AlignmentRequirement;

(1) The upper driver prepares Unicode names for both the upper and lower
    devices. Be careful: These names are case-sensitive.
(2) It then retrieves a pointer to the lower Device object. This function returns
    pointers to both a Device object and a File object.
(3) IoGetDeviceObjectPointer doesn't increment the pointer count on the
    Device object. The upper driver has to do that itself. Then, it decrements
    the pointer count on the lower driver's File object, since this isn't needed
    anymore.
(4) The upper driver needs to save the address of the lower Device object in
    its own Device Extension so that other routines will be able to find it.
(5) If the upper driver plans to forward IRPs directly to the lower one, these
    IRPs have to have enough I/O stack locations for all the drivers in the
    hierarchy. In this case, it's also important for the upper driver to duplicate
    the buffering strategy and alignment of the lower driver.

Other Initialization Concerns for Layered Drivers
You've just seen the general steps a layered driver needs to perform if it
wants to connect to another driver. Depending on how the layered driver oper­
ates, there may be some other issues that the initialization code has to deal with.
There are basically two cases to consider.

Transparent layer. Some layered drivers are intended to slip transparently
between some lower-level driver and its clients. Here, it's important for the
Device objects exposed by the layered driver to mimic the behavior of the lower
driver's Device objects. NT Server's fault-tolerant disk driver is one example
of a transparent layer.
To guarantee that the layered driver can be added or removed transparently,
its DriverEntry routine needs to perform the following extra initialization:

•  It should copy the DeviceType and Characteristics fields from the target
   Device object to the layered Device object.

•  DriverEntry should also copy the DO_DIRECT_IO and DO_BUFFERED_IO bits
   from the target Device's Flags field. This ensures that the layered Device
   object will use the same buffering strategy as the target.

•  It should copy the AlignmentRequirement field from the target to the
   upper-level Device object.

•  Finally, the MajorFunction table in the layered Driver object has to
   support the exact same set of IRP_MJ_XXX function codes as the lower-level
   Driver object. [3]

[3] The sample filter driver that appears later in this chapter shows how to
set up a layered driver's MajorFunction table dynamically.
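The copying described in the list above can be sketched in ordinary user-mode C. This is only a model under stated assumptions: SimDevice and SimMirrorLowerDevice are hypothetical names standing in for DEVICE_OBJECT and the relevant part of a DriverEntry routine, and the stack-size adjustment is included because the earlier connection fragment performs it alongside the flag copy.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the few DEVICE_OBJECT fields discussed here. */
typedef struct SimDevice {
    uint32_t DeviceType;
    uint32_t Characteristics;
    uint32_t Flags;
    uint32_t AlignmentRequirement;
    int      StackSize;
} SimDevice;

/* Bit values as commonly defined in the DDK headers; illustrative here. */
#define DO_BUFFERED_IO 0x00000004u
#define DO_DIRECT_IO   0x00000010u

/* Make the upper (layered) device look like the lower one to its clients. */
void SimMirrorLowerDevice(SimDevice *upper, const SimDevice *lower)
{
    assert(upper != NULL && lower != NULL);

    upper->DeviceType      = lower->DeviceType;
    upper->Characteristics = lower->Characteristics;

    /* Copy only the buffering bits; the upper device keeps its other flags. */
    upper->Flags &= ~(DO_BUFFERED_IO | DO_DIRECT_IO);
    upper->Flags |= lower->Flags & (DO_BUFFERED_IO | DO_DIRECT_IO);

    upper->AlignmentRequirement = lower->AlignmentRequirement;

    /* One extra I/O stack slot for the upper driver itself. */
    upper->StackSize = lower->StackSize + 1;
}
```

Masking with both buffering bits before the OR keeps the layered device from ending up with a buffering strategy the target doesn't use.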


Virtual or logical device layer. The other possibility is that the layered
driver exposes virtual or logical Device objects. [4] For example, NT's TDI
network protocol drivers present Device objects that have no particular
similarity to the network interface cards below them. Likewise, SCSI class
drivers export Device objects whose characteristics are those of the
peripheral attached to the SCSI bus, not those of the SCSI interface card.
In this case, the layered driver should pick appropriate values for the
DeviceType and Characteristics fields of the layered Device object. Also, the
exact set of IRP_MJ_XXX functions supported by the layered driver will be
ones appropriate to the layered Device object. There's also no requirement
for the layered and target Device objects to use the same buffering strategy.
I/O Request Processing in Layered Drivers
Since layered drivers don't directly manage any hardware, they don't need
any Start I/O, Interrupt Service, or DPC routines. Instead, most of the code
in a layered driver consists of Dispatch routines and I/O Completion
routines. Because they deserve some extra attention, I/O Completion routines
get their own section later in this chapter.
The subsections below describe the operation of a layered driver's Dispatch
routines. When one of these Dispatch routines receives an IRP, it can do one
of three things.

Complete the original IRP. The simplest case is the one where the Dispatch
routine is able to process the request all by itself and return either
success or failure notification to the original caller. The Dispatch routine
does the following:

1. It calls IoGetCurrentIrpStackLocation to get a pointer to this driver's
   I/O stack slot.

2. The Dispatch routine processes the request using various fields in the
   IRP and the I/O stack location.

3. It puts an appropriate value in the IoStatus.Information field of the IRP.

4. The Dispatch routine also fills the IoStatus.Status field of the IRP with
   a suitable STATUS_XXX code.

5. Then, it calls IoCompleteRequest with a priority-boost value of
   IO_NO_INCREMENT to send the IRP back to the I/O Manager.

6. As its return value, the Dispatch routine passes back the same STATUS_XXX
   code that it put into the IRP.

There's nothing at all mysterious going on here. In fact, it's the same
procedure any Dispatch routine follows when it wants to end the processing
of a request.

[4] A virtual device is one whose behavior is not tied to the characteristics
of the underlying peripheral hardware. This also includes things like RAM
disks which have no associated peripheral device. A logical device is a
temporary construct that maintains the context for a specific series of
transactions, usually occurring over a shared communication medium. For
example, when a client requests a connection to a Named Pipe object, the pipe
driver creates a separate instance of the pipe just for that client. This
pipe instance is a logical device. Logical devices normally have a limited
lifespan; the driver creates them when a series of transactions begins, and
destroys them when the last transaction is finished.

Pass the IRP to another driver. The second possibility is that the layered
driver's Dispatch routine needs to pass the IRP to the next lower driver. The
Dispatch routine does the following:

1. It calls IoGetCurrentIrpStackLocation to get a pointer to its own I/O
   stack location.

2. The Dispatch routine also calls IoGetNextIrpStackLocation to retrieve a
   pointer to the I/O stack location belonging to the next lower driver.

3. It sets up the next lower driver's I/O stack location, including the
   MajorFunction field and various members of the Parameters union.

4. The Dispatch routine calls IoSetCompletionRoutine to associate an I/O
   Completion routine with the IRP. At the very least, this I/O Completion
   routine is going to be responsible for marking the IRP as pending.

5. It sends the IRP to a lower-level driver using IoCallDriver. This is an
   asynchronous call that returns immediately regardless of whether the
   lower-level driver completed the IRP.

6. As its return value, the Dispatch routine passes back whatever status code
   was returned by IoCallDriver. This will be either STATUS_SUCCESS,
   STATUS_PENDING, or some STATUS_XXX error code.

Notice that the Dispatch routine does not call IoMarkIrpPending to put the
original IRP in the pending state before sending it to the lower driver. This
is because the Dispatch routine doesn't know whether the IRP should be marked
pending until after IoCallDriver returns. Unfortunately, by that time
IoCallDriver has already pushed the I/O stack pointer in the IRP, so a call
to IoMarkIrpPending (which always works with the current stack slot) would
mark the wrong stack location. The solution is to call IoMarkIrpPending in an
I/O Completion routine, after the IRP stack pointer has been reset to the
proper level.
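To see why the timing matters, here is a small user-mode model of the stack pointer. It is only a sketch: SimIrp and the SimXxx helpers are hypothetical names, the "stack" has just two slots, and nothing here touches a real IRP.

```c
#include <assert.h>
#include <stdbool.h>

#define SIM_SLOTS 2            /* slot 0: upper driver, slot 1: lower driver */

/* Hypothetical, stripped-down IRP: one "pending" bit per stack slot. */
typedef struct SimIrp {
    int  current;              /* index of the current stack slot */
    bool pending[SIM_SLOTS];
} SimIrp;

/* Like IoMarkIrpPending: always marks the current slot. */
void SimMarkPending(SimIrp *irp)
{
    irp->pending[irp->current] = true;
}

/* The push that IoCallDriver performs before the lower driver runs. */
void SimPush(SimIrp *irp)
{
    assert(irp->current + 1 < SIM_SLOTS);
    irp->current++;
}

/* The pop that happens before the upper driver's completion routine runs. */
void SimPop(SimIrp *irp)
{
    assert(irp->current > 0);
    irp->current--;
}

/* Marking from the Dispatch routine after the call: wrong slot. */
void SimMarkAfterCall(SimIrp *irp)
{
    SimPush(irp);
    SimMarkPending(irp);       /* lands on the lower driver's slot */
}

/* Marking from the completion routine: the pointer is back at our slot. */
void SimMarkInCompletion(SimIrp *irp)
{
    SimPush(irp);
    SimPop(irp);
    SimMarkPending(irp);       /* lands on the upper driver's slot */
}
```

In the model, SimMarkAfterCall sets the pending bit in slot 1 and leaves slot 0 untouched, which is exactly the mistake the text warns about.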

Allocate additional IRPs. Finally, the layered driver's Dispatch routine
may need to allocate one or more additional IRPs which it then sends to
lower-level drivers. The Dispatch routine has the option of waiting for these
additional IRPs to complete, or of issuing asynchronous requests to the lower
driver. In the asynchronous case, cleanup of the additional IRPs occurs in an
I/O Completion routine. The discussion of driver-allocated IRPs (appearing
later in this chapter) will explain how to use both these techniques.


Code Fragment: Calling a Lower-Level Driver
The code fragment below shows how the Dispatch routine in one driver
might forward an IRP to a lower-level driver. For purposes of example, it also
shows how the upper driver could store some context (in this case, a retry count)
in an unused field of its own I/O stack location.

NTSTATUS
YyDispatchRead(
    IN PDEVICE_OBJECT DeviceObject,
    IN PIRP Irp
    )
{
    PDEVICE_EXTENSION Extension =
        DeviceObject->DeviceExtension;
    PIO_STACK_LOCATION ThisIrpStack =
        IoGetCurrentIrpStackLocation( Irp );
    PIO_STACK_LOCATION NextIrpStack =
        IoGetNextIrpStackLocation( Irp );

    *NextIrpStack = *ThisIrpStack;              /* (1) */

    ThisIrpStack->Parameters.Read.Key =
        YY_RETRY_COUNT_MAXIMUM_VALUE;           /* (2) */

    IoSetCompletionRoutine(                     /* (3) */
        Irp,
        YyReadCompletion,
        NULL,
        TRUE, TRUE, TRUE );

    return IoCallDriver(                        /* (4) */
        Extension->LowerDevice,
        Irp );
}

(1) In this simple example, the upper driver just copies the entire I/O stack
    location from its own slot to the slot of the next lower driver. This is
    essentially just a pass-through operation.

(2) The upper driver's I/O Completion routine is going to use the count
    stored in the Parameters.Read.Key field of the upper driver's I/O stack
    slot to keep track of attempted retries. Since the upper driver isn't
    using this field for its intended purpose, it can get away with this
    trick.

(3) To recapture this IRP after the lower driver completes it, the upper
    driver attaches an I/O Completion routine. Since all three InvokeOnXxx
    arguments are TRUE, the I/O Manager will call this routine no matter
    what happens to the IRP.

(4) Finally, the upper driver sends the IRP to the lower driver. Notice that
    the return value of IoCallDriver becomes the return value of the Dispatch
    routine. Also, notice that the Dispatch routine doesn't call
    IoMarkIrpPending with the IRP; that will happen in the I/O Completion
    routine.

15.3 WRITING I/O COMPLETION ROUTINES
An I/O Completion routine is an I/O Manager callback that lets you recapture
an IRP after a lower-level driver has completed it. This section explains how
to use I/O Completion routines in intermediate drivers.

Requesting an I/O Completion Callback
If you want to regain control of an IRP after it's been processed, you need to
call IoSetCompletionRoutine (described in Table 15.1). This function puts the
address of an I/O Completion routine in the IRP stack location associated with
the next lower driver. When some lower-level driver calls IoCompleteRequest,
the I/O Completion routine will execute as the IRP bubbles its way back to the
top of the driver hierarchy.
Except for the driver on the bottom, each driver in the hierarchy can attach
its own I/O Completion routine to an IRP. This allows everyone to receive
notification when an IRP completes. The I/O Completion routines will execute
in driver-stacking order, from bottom to top.
Also notice the three BOOLEAN InvokeOnXxx arguments. These allow you
to specify the situations in which a particular I/O Completion routine will
run. The I/O Manager uses the IoStatus.Status field of the IRP to decide
whether it should call the I/O Completion routine.
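The effect of the three InvokeOnXxx arguments can be pictured with a short user-mode sketch. Everything named SimXxx is hypothetical, the status values are made up for illustration, and passing cancellation as a separate flag is an assumption of this model rather than a description of the I/O Manager's internals. The only thing borrowed from NT is the sign test that NT_SUCCESS applies to a status code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef int32_t SIM_STATUS;

/* Same sign test that NT_SUCCESS applies to an NTSTATUS value. */
#define SIM_SUCCESS(status) ((SIM_STATUS)(status) >= 0)

/* Illustrative status values, not the real NTSTATUS codes. */
#define SIM_STATUS_OK        ((SIM_STATUS)0)
#define SIM_STATUS_FAILED    ((SIM_STATUS)-1)
#define SIM_STATUS_CANCELED  ((SIM_STATUS)-2)

/* The three choices a driver makes when it attaches a completion routine. */
typedef struct SimCompletionFlags {
    bool onSuccess;
    bool onError;
    bool onCancel;
} SimCompletionFlags;

/* Decide whether the attached routine should run for this completion. */
bool SimShouldInvoke(const SimCompletionFlags *flags,
                     SIM_STATUS status, bool canceled)
{
    assert(flags != NULL);

    if (canceled && flags->onCancel)
        return true;

    return SIM_SUCCESS(status) ? flags->onSuccess : flags->onError;
}
```

Passing TRUE for all three arguments, as the YyDispatchRead fragment does, corresponds to a SimCompletionFlags with every member set: the routine runs no matter how the IRP ends.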

Table 15.1  Function prototype for IoSetCompletionRoutine

VOID IoSetCompletionRoutine         IRQL <= DISPATCH_LEVEL

Parameter                           Description
IN PIRP Irp                         Address of IRP the driver wants to track
IN PIO_COMPLETION_ROUTINE           Routine to call when a lower driver
   CompletionRoutine                completes the IRP
IN PVOID Context                    Argument passed to I/O Completion routine
IN BOOLEAN InvokeOnSuccess          Call routine if IRP completes successfully
IN BOOLEAN InvokeOnError            Call routine if IRP completes with error
IN BOOLEAN InvokeOnCancel           Call routine if IRP is canceled
Return value                        (none)


Execution Context
By the time it calls your I/O Completion routine, the I/O Manager has
already popped the I/O stack pointer, so that the current stack location is
the one belonging to your driver. Table 15.2 lists the arguments passed to an
I/O Completion routine.
One tricky item is the IRQL at which an I/O Completion routine executes. If
the lower-level driver calls IoCompleteRequest from PASSIVE_LEVEL IRQL, then
higher-level I/O Completion routines will also run at PASSIVE_LEVEL. On the
other hand, if the lower-level driver completes the request from
DISPATCH_LEVEL IRQL (from a DPC routine, for example), then higher-level I/O
Completion routines will execute at DISPATCH_LEVEL. Since DISPATCH_LEVEL IRQL
has more restrictions associated with it than PASSIVE_LEVEL IRQL, it's a good
idea to limit the actions of an I/O Completion routine to things that can
safely be done at DISPATCH_LEVEL. [5]
When an I/O Completion routine is finished, it should return one of two
status codes. Returning STATUS_SUCCESS causes the IRP to continue its journey
back toward the original caller. This includes the execution of any other I/O
Completion routines attached by drivers above this one. This is normally the
appropriate value to use if this is the original IRP that came from some
caller outside the driver.
To stop any further processing of this IRP, an I/O Completion routine can
return STATUS_MORE_PROCESSING_REQUIRED. This value blocks the execution of
any higher-level I/O Completion routines attached to the IRP. It also
prevents the original caller from receiving notification that the IRP has
completed. An I/O Completion routine should return this code if it either
plans to send the IRP back down to a lower-level driver (as in the case of a
split transfer), or if the IRP was allocated by this driver and the I/O
Completion routine is going to deallocate it.
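The two return codes amount to "keep going" and "stop here." A rough user-mode model of the bottom-to-top walk is shown below. It is a sketch only: SimCompleteRequest and the other SimXxx names are hypothetical, and the real I/O Manager does considerably more work during completion.

```c
#include <assert.h>
#include <stddef.h>

typedef enum {
    SIM_CONTINUE,                    /* plays the role of STATUS_SUCCESS */
    SIM_MORE_PROCESSING_REQUIRED     /* driver has taken the IRP back    */
} SimCompletionResult;

typedef SimCompletionResult (*SimCompletionFn)(void *context);

/*
 * Walk the attached routines from the lowest driver upward.
 * routines[0] belongs to the driver just above the bottom one.
 * Returns how many routines actually ran.
 */
size_t SimCompleteRequest(SimCompletionFn *routines, void **contexts,
                          size_t count)
{
    size_t ran = 0;

    assert(routines != NULL && contexts != NULL);

    for (size_t i = 0; i < count; i++) {
        if (routines[i] == NULL)
            continue;                /* this driver attached nothing */
        ran++;
        if (routines[i](contexts[i]) == SIM_MORE_PROCESSING_REQUIRED)
            break;                   /* nobody above gets called */
    }
    return ran;
}

/* Counts the call and lets completion continue upward. */
SimCompletionResult SimPassUp(void *context)
{
    ++*(int *)context;
    return SIM_CONTINUE;
}

/* Counts the call and halts completion at this level. */
SimCompletionResult SimReclaim(void *context)
{
    ++*(int *)context;
    return SIM_MORE_PROCESSING_REQUIRED;
}
```

With three drivers attached and the middle one using SimReclaim, only the first two routines run; the top driver never hears about the completion.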

Table 15.2  Function prototype for an I/O Completion routine

NTSTATUS XxIoCompletion             IRQL == PASSIVE_LEVEL or DISPATCH_LEVEL

Parameter                           Description
IN PDEVICE_OBJECT DeviceObject      Device object that just completed the
                                    request
IN PIRP Irp                         The IRP that's being completed
IN PVOID Context                    Context that was passed to
                                    IoSetCompletionRoutine
Return value                        One of the following:
                                    • STATUS_MORE_PROCESSING_REQUIRED
                                    • STATUS_SUCCESS

[5] For example, don't mark any I/O Completion routines as paged in an
alloc_text pragma.

What I/O Completion Routines Do
An intermediate driver can attach an I/O Completion routine to any IRP it
sends to another driver. This includes the original IRP that the driver
received from some outside caller, as well as any IRPs that the driver
allocates on its own. When an I/O Completion routine executes, there are
three general kinds of tasks it may need to perform.

Release the original IRP. If the completed IRP is one that came from an
outside caller, it may require some driver-specific cleanup. At the very
least, the I/O Completion routine for one of these IRPs needs to do the
following:

1. It tests the value of the IRP's PendingReturned flag.

2. If this flag is TRUE, the I/O Completion routine puts the current I/O
   stack location into the pending state with a call to IoMarkIrpPending.

3. Finally, it returns a value of STATUS_SUCCESS to allow completion
   processing to continue.

Deallocate the IRP. If the IRP was allocated by the driver, the I/O
Completion routine may be responsible for releasing it. Once again, this is a
rather involved topic because the I/O Manager supports several different IRP
allocation strategies. The next section of this chapter will explain all the
gory details of releasing driver-allocated IRPs.

Recycle the IRP. Some intermediate drivers have to split a transfer into
smaller pieces before sending it to a lower-level driver. Normally, the most
efficient way to do this is to send each partial transfer to the lower driver
by reusing the same IRP. To recycle an IRP, the I/O Completion routine does
the following:

1. It checks the context information stored with the IRP to see if this was
   the last partial transfer. If the whole transfer is finished and the IRP
   came from an outside caller, the driver performs any necessary cleanup and
   returns STATUS_SUCCESS to allow further completion processing.

2. If the whole transfer is finished and this is a driver-allocated IRP, the
   I/O Completion routine performs any necessary cleanup, frees the IRP, and
   returns STATUS_MORE_PROCESSING_REQUIRED to prevent any further completion
   processing.

3. If there's more work to be done, the I/O Completion routine calls
   IoGetNextIrpStackLocation and sets up the I/O stack slot for the next
   lower driver.

4. Next, it uses IoSetCompletionRoutine to attach the address of this I/O
   Completion routine to the IRP.

5. It passes the IRP to the target Device object using IoCallDriver.

6. Finally, it returns STATUS_MORE_PROCESSING_REQUIRED to prevent any
   further completion processing of this IRP.


An implementation detail: During each partial transfer, an intermediate
driver has to keep track of how much of the original caller's request has been
satisfied. One clever way to maintain this context information is to store it
in unused fields of the intermediate driver's I/O stack location. For example,
if the intermediate driver doesn't need the ByteOffset or Key fields, it can
use them to hold three longwords of context data. Of course, if your driver
does use these fields for their intended purpose, you can always allocate a
private block and pass it as the Context argument to IoSetCompletionRoutine.
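The bookkeeping for a split transfer fits comfortably in three longwords. The user-mode sketch below shows one way the arithmetic might go; SimSplitContext and SimNextPiece are hypothetical names, and the choice of what to store in each longword is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Three longwords of per-request context, as suggested in the text. */
typedef struct SimSplitContext {
    uint32_t nextOffset;   /* where the next partial transfer starts      */
    uint32_t remaining;    /* bytes of the caller's request still to move */
    uint32_t maxChunk;     /* largest transfer the lower driver accepts   */
} SimSplitContext;

/*
 * Work out the next partial transfer. Returns true and fills in
 * offset/length if another piece must be sent down (the completion
 * routine would then return STATUS_MORE_PROCESSING_REQUIRED), or
 * false when the whole transfer is finished.
 */
bool SimNextPiece(SimSplitContext *ctx, uint32_t *offset, uint32_t *length)
{
    assert(ctx != NULL && offset != NULL && length != NULL);
    assert(ctx->maxChunk > 0);

    if (ctx->remaining == 0)
        return false;

    *offset = ctx->nextOffset;
    *length = (ctx->remaining < ctx->maxChunk) ? ctx->remaining
                                               : ctx->maxChunk;

    ctx->nextOffset += *length;
    ctx->remaining  -= *length;
    return true;
}
```

A Dispatch routine would call something like this once to start the first piece, and the I/O Completion routine would call it again each time the recycled IRP comes back.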

Code Fragment: An I/O Completion Routine
Below you'll find a fragment of an I/O Completion routine. It complements
the YyDispatchRead function presented in the previous section of this chapter.
If the request completed normally, it sends it back to the original caller. If
something failed at a lower level, it retries the operation a fixed number of
times.

NTSTATUS
YyReadCompletion(
    IN PDEVICE_OBJECT DeviceObject,
    IN PIRP Irp,
    IN PVOID Context
    )
{
    PIO_STACK_LOCATION ThisIrpStack =
        IoGetCurrentIrpStackLocation( Irp );
    PIO_STACK_LOCATION NextIrpStack =
        IoGetNextIrpStackLocation( Irp );
    PDEVICE_EXTENSION Extension =
        DeviceObject->DeviceExtension;

    if(( NT_SUCCESS( Irp->IoStatus.Status ))
        || ( ThisIrpStack->Parameters.Read.Key == 0 ))     /* (1) */
    {
        if( Irp->PendingReturned )                          /* (2) */
            IoMarkIrpPending( Irp );
        return STATUS_SUCCESS;
    }

    ThisIrpStack->Parameters.Read.Key--;                    /* (3) */
    *NextIrpStack = *ThisIrpStack;
    NextIrpStack->Parameters.Read.Key = 0;

    IoSetCompletionRoutine(                                 /* (4) */
        Irp,
        YyReadCompletion,
        NULL,
        TRUE, TRUE, TRUE );

    IoCallDriver( Extension->LowerDevice, Irp );            /* (5) */
    return STATUS_MORE_PROCESSING_REQUIRED;
}

(1) If the lower driver completed the IRP with a successful status code, or
    if the IRP failed and it has run out of retries, this driver is about to
    send it on its way back up the driver hierarchy.

(2) It's necessary to see if the current I/O stack location should be marked
    pending. Because of the asynchronous nature of IoCallDriver, this can't
    be done until the completion routine runs.

(3) The lower driver failed the IRP but it still has some retries left. At
    this point, the upper driver decrements the retry count and prepares to
    send the IRP back down for another try.

(4) The I/O Completion routine address has to be reset each time the IRP is
    recycled.

(5) Finally, the I/O Completion routine sends the IRP back to the lower
    driver. As its return value, the I/O Completion routine sends back
    STATUS_MORE_PROCESSING_REQUIRED. This prevents the I/O Manager from
    continuing to complete the IRP.

15.4 ALLOCATING ADDITIONAL IRPS
There are some situations where an intermediate driver may need to allocate
additional IRPs to send to another driver. For example, the initialization
code in one driver might want to query the capabilities of a lower-level
driver by issuing an IOCTL request. The filter driver appearing later in this
chapter does exactly this.
Or, for purposes of fault tolerance, the intermediate driver might want to
duplicate an incoming request and send redundant copies to multiple
lower-level drivers. The fault-tolerant disk driver that comes with NT Server
uses this technique.
Finally, a command exposed by an intermediate driver might require
lower-level drivers to perform a complex sequence of operations. For example,
the class driver for a particular kind of SCSI device has to issue a whole
series of commands to the SCSI port driver to implement one of the class
driver's operations.

The IRP's I/O Stack Revisited
When you start to allocate additional IRPs, it's important to have a clear
understanding of just how the IRP's I/O stack works. As you already know, when
any driver receives an IRP from an outside caller, the I/O stack pointer
points to the stack location belonging to that driver. To retrieve this
pointer, the driver simply calls IoGetCurrentIrpStackLocation.
If an intermediate driver plans to pass an incoming IRP to a lower-level
driver, it has to set up the I/O stack location for the lower driver. To get a
pointer to the lower driver's I/O stack slot, the intermediate driver makes a
call to IoGetNextIrpStackLocation. After setting up the lower stack slot, the
intermediate driver uses IoCallDriver to pass the IRP on. This function
automatically pushes the I/O stack pointer so that when the lower driver calls
IoGetCurrentIrpStackLocation, it will get the right address.
When the lower driver calls IoCompleteRequest, the completed IRP's I/O
stack is popped. This allows an I/O Completion routine belonging to the
intermediate driver to call IoGetCurrentIrpStackLocation if it needs to access
its own stack location. As the IRP bubbles its way back up to the original
caller, the I/O stack is automatically popped again for each driver in the
hierarchy. Table 15.3 summarizes the effects of these functions on an IRP's
I/O stack pointer.
To maintain consistent behavior with driver-allocated IRPs, the I/O Manager
plays a little trick. When a driver allocates an IRP, the I/O Manager
initializes the new IRP's I/O stack pointer so that it points at a nonexistent
slot one location beyond the end of the stack. This guarantees that when the
driver passes the IRP to a lower-level driver, IoCallDriver's push operation
will set the stack pointer to the first real slot in the stack. This means the
higher-level driver must call IoGetNextIrpStackLocation to retrieve a pointer
to the I/O stack slot intended for the target driver.
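The "one past the end" trick is easier to follow with numbers. The user-mode model below is a sketch under stated assumptions: SimIrp and the SimXxx functions are hypothetical, and the slot numbering (1 through stackSize, counting down as the IRP descends) is simply a convenient choice for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of just the stack-pointer part of an IRP. */
typedef struct SimIrp {
    int stackSize;   /* real slots are numbered 1..stackSize       */
    int current;     /* stackSize + 1 means "no real slot yet"     */
} SimIrp;

/* A freshly allocated IRP points one location beyond the stack. */
void SimAllocateIrp(SimIrp *irp, int stackSize)
{
    assert(irp != NULL && stackSize >= 1);
    irp->stackSize = stackSize;
    irp->current   = stackSize + 1;
}

/* Counterparts of IoGetCurrentIrpStackLocation / IoGetNextIrpStackLocation. */
int SimCurrentSlot(const SimIrp *irp) { return irp->current; }
int SimNextSlot(const SimIrp *irp)    { return irp->current - 1; }

/* Counterpart of IoSetNextIrpStackLocation: push by one slot. */
void SimSetNextSlot(SimIrp *irp)
{
    assert(irp->current > 1);
    irp->current--;
}

/* The push that IoCallDriver performs on the way down. */
void SimCallDriver(SimIrp *irp)
{
    assert(irp->current > 1);
    irp->current--;
}

/* The pop that happens for each level during completion. */
void SimCompleteOneLevel(SimIrp *irp)
{
    assert(irp->current <= irp->stackSize);
    irp->current++;
}
```

For a target driver whose StackSize is 2, SimAllocateIrp(&irp, 2) leaves the current slot at 3 (which doesn't exist) and the next slot at 2, so the first SimCallDriver lands the target driver on a real slot. Allocating with 2 + 1 slots and calling SimSetNextSlot first gives the upper driver slot 3 for its own context, with slot 2 still reserved for the target.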

Controlling the Size of the IRP Stack
When a driver receives an IRP from an outside caller, the number of I/O
stack slots is determined by the StackSize field of the driver's Device
object. If an intermediate driver plans to pass incoming IRPs to a lower-level
driver, it needs to set this field equal to one more than the StackSize value
of the lower driver. This ensures that there will be enough I/O stack for all
the drivers in the hierarchy.

Table 15.3  What various functions do to the IRP's I/O stack pointer

Working with the IRP stack pointer

Function                            Effect on the IRP stack pointer
IoGetCurrentIrpStackLocation        No change
IoGetNextIrpStackLocation           No change
IoSetNextIrpStackLocation           Pushes stack pointer one location
IoCallDriver                        Pushes stack pointer one location
IoCompleteRequest                   Pops stack pointer one location

If an intermediate driver calls IoBuildAsynchronousFsdRequest,
IoBuildDeviceIoControlRequest, or IoBuildSynchronousFsdRequest to create an
IRP, the I/O Manager uses the StackSize field of the target Device object
(passed as an argument to all three functions) to determine the number of I/O
stack locations in the new IRP. These IRPs will have enough I/O stack slots
for the target driver and any drivers below it. There will not be a slot in
the I/O stack for the intermediate driver itself.
If an intermediate driver uses IoAllocateIrp, ExAllocatePool, or some
privately managed memory to create an IRP, the driver must explicitly specify
the number of I/O stack slots in the new IRP. Again, the common practice is to
use the StackSize field of the target Device object to determine the proper
number of slots.
Ordinarily, an intermediate driver won't need a stack slot for itself in any
IRPs it allocates. The one exception would be if the intermediate driver
needed to associate some per-request context with the IRP. In that case, the
driver could allocate an IRP with one extra stack slot and use the extra slot
for holding private context data. This code fragment shows how it's done:

    //
    // Allocate the IRP with one extra slot. The second
    // argument (ChargeQuota) is FALSE for a driver-owned IRP.
    //
    NewIrp = IoAllocateIrp( LowerDevice->StackSize + 1, FALSE );

    //
    // Push the I/O stack pointer so that it points
    // at the first valid slot. Use this slot to hold
    // context information needed by the upper driver.
    //
    IoSetNextIrpStackLocation( NewIrp );
    ContextArea = IoGetCurrentIrpStackLocation( NewIrp );
    NextDriverSlot = IoGetNextIrpStackLocation( NewIrp );

    //
    // Set up next driver's I/O stack slot
    //
    NextDriverSlot->MajorFunction = IRP_MJ_XXX;

    //
    // Attach an I/O Completion routine and
    // send the IRP to someone else
    //
    IoSetCompletionRoutine(
        NewIrp,
        YyIoCompletion,
        NULL,
        TRUE, TRUE, TRUE );

    IoCallDriver( LowerDevice, NewIrp );


Creating IRPs with IoBuildSynchronousFsdRequest
The I/O Manager provides three convenience functions that simplify the
process of building IRPs for standard kinds of I/O request. The first one is
IoBuildSynchronousFsdRequest, and it fabricates read, write, flush, or
shutdown IRPs. See Table 15.4 for a description of this function.
The number of I/O stack locations in IRPs created with this function is equal
to the StackSize field of the TargetDevice argument. There's no
straightforward way to leave room in the I/O stack for the intermediate driver
itself.
The Buffer, Length, and StartingOffset arguments to this function are
required for read and write operations. They must be NULL, 0, and NULL
(respectively) for flush or shutdown operations.
IoBuildSynchronousFsdRequest automatically sets up various fields in the
Parameters area of the next lower I/O stack location, so there's rarely any
need to touch the I/O stack. For read or write requests, this function also
allocates system buffer space or builds an MDL, depending on whether the
TargetDevice does Buffered or Direct I/O. For buffered outputs, it also copies
the contents of the caller's buffer into the system buffer; at the end of a
buffered input, data is automatically copied from the system buffer to the
caller's buffer.
As the function name suggests, you make requests for synchronous I/O
operations with the IRPs returned by IoBuildSynchronousFsdRequest. In other
words, the thread that calls IoCallDriver normally blocks itself until the I/O
operation completes. To do this, just pass the address of an initialized Event
object when you allocate the IRP. Then, after sending the IRP to a lower-level
driver with IoCallDriver, use KeWaitForSingleObject to wait for the Event
object. When a lower-level driver completes the IRP, the I/O Manager will put
this Event object into the Signaled state, which will awaken your driver. The
I/O status block will tell you whether everything worked.

Table 15.4  Function prototype for IoBuildSynchronousFsdRequest

PIRP IoBuildSynchronousFsdRequest   IRQL == PASSIVE_LEVEL

Parameter                           Description
IN ULONG MajorFunction              One of the following:
                                    • IRP_MJ_READ
                                    • IRP_MJ_WRITE
                                    • IRP_MJ_FLUSH_BUFFERS
                                    • IRP_MJ_SHUTDOWN
IN PDEVICE_OBJECT TargetDevice      Device object where IRP will be sent
IN OUT PVOID Buffer                 Address of I/O buffer
IN ULONG Length                     Length of buffer in bytes
IN PLARGE_INTEGER StartingOffset    Device offset where I/O will begin
IN PKEVENT Event                    Event object used to signal I/O completion
OUT PIO_STATUS_BLOCK Iosb           Receives final status of I/O operation
Return value                        • Non-NULL: address of new IRP
                                    • NULL: IRP could not be allocated
Two points are worth making about intermediate drivers that issue synchronous
I/O requests to other drivers. First, drivers that perform blocking I/O can be
rather sluggish because they prevent the calling thread from overlapping its
I/O operations. This is contrary to the philosophy of the NT I/O architecture,
so you shouldn't do it unless you really need to.
Second, the Event object used to wait for I/O completion needs to be
synchronized properly or there could be a nasty collision. Consider the case
where two threads in the same process issue a read request using the same
handle. The YyDispatchRead routine executes in the context of the first thread
and blocks itself waiting for the Event object. Then the same YyDispatchRead
routine executes in the context of the other thread and reuses the same Event
object to issue a second request. When the IRP for either request completes,
the Event object will be set, both threads will awaken, and nothing good will
happen. [6] The solution is to guard the Event object with a Fast Mutex.
The I/O Manager automatically cleans up and deallocates IRPs created with
IoBuildSynchronousFsdRequest after their completion processing is done. This
includes releasing any system buffer space or MDL attached to the IRP. To
trigger this cleanup, a lower-level driver simply has to call
IoCompleteRequest.
Normally, there won't be any need to attach an I/O Completion routine to
one of these IRPs, unless you need to do some driver-specific postprocessing.
If you do attach an I/O Completion routine, it should return STATUS_SUCCESS
when it's done. This lets the I/O Manager free the IRP.

Creating IRPs with IoBuildAsynchronousFsdRequest
The second convenience function, IoBuildAsynchronousFsdRequest, is
quite similar to the first. It lets you build read, write, flush, and shutdown
requests without worrying about too many of the details. The main difference
is that you have to process these IRPs asynchronously. You don't have the
option of stopping and waiting for the I/O to complete. Table 15.5 contains
the prototype for this function.
As with IoBuildSynchronousFsdRequest, the Buffer, Length, and StartingOffset
parameters to IoBuildAsynchronousFsdRequest are required for read and
write operations. They must be NULL, 0, and NULL (respectively) for flush or
shutdown operations.
[6] This problem isn't limited to threads in the same process, by the way. If
the intermediate driver's Device object is shareable, the same issue arises if
threads in two separate processes issue simultaneous requests that travel
through the YyDispatchRead routine.

Table 15.5  Function prototype for IoBuildAsynchronousFsdRequest

PIRP IoBuildAsynchronousFsdRequest  IRQL <= DISPATCH_LEVEL

Parameter                           Description
IN ULONG MajorFunction              One of the following:
                                    • IRP_MJ_READ
                                    • IRP_MJ_WRITE
                                    • IRP_MJ_FLUSH_BUFFERS
                                    • IRP_MJ_SHUTDOWN
IN PDEVICE_OBJECT TargetDevice      Device object where IRP will be sent
IN OUT PVOID Buffer                 Address of I/O buffer
IN ULONG Length                     Length of buffer in bytes
IN PLARGE_INTEGER StartingOffset    Device offset where I/O will begin
OUT PIO_STATUS_BLOCK Iosb           Receives final status of I/O operation
Return value                        • Non-NULL: address of new IRP
                                    • NULL: IRP could not be allocated

Notice that you can call IoBuildAsynchronousFsdRequest at or below
DISPATCH_LEVEL IRQL. IoBuildSynchronousFsdRequest works only at
PASSIVE_LEVEL.
Unlike the IRPs from IoBuildSynchronousFsdRequest, the ones from this
function are not released automatically when a lower-level driver completes
them. Instead, you must attach an I/O Completion routine to any IRP created
with IoBuildAsynchronousFsdRequest. The I/O Completion routine releases any
system buffer or MDL associated with the IRP and then calls IoFreeIrp to
deallocate the IRP itself. The return value of the I/O Completion routine
should be STATUS_MORE_PROCESSING_REQUIRED.

Creating IRPs with IoBuildDeviceIoControlRequest
The last convenience function, IoBuildDeviceIoControlRequest (described
in Table 15.6), simplifies the task of building IOCTL IRPs. This is useful
because it's a fairly common practice for drivers of odd pieces of hardware to
expose an interface composed almost entirely of IOCTLs. Some higher-level
drivers (like NT's TDI network protocol drivers) take this same approach.
The InternalDeviceIoControl argument lets you specify the major function
code in the target driver's I/O stack slot. FALSE produces an IRP with
IRP_MJ_DEVICE_CONTROL, while TRUE causes it to be set to
IRP_MJ_INTERNAL_DEVICE_CONTROL.
Table 15.6  Function prototype for IoBuildDeviceIoControlRequest

PIRP IoBuildDeviceIoControlRequest  IRQL == PASSIVE_LEVEL

Parameter                           Description
IN ULONG IoControlCode              IOCTL code recognized by target driver
IN PDEVICE_OBJECT TargetDevice      Device object where IRP will be sent
IN PVOID InputBuffer                Buffer of data passed to lower driver
IN ULONG InputBufferLength          Size of data buffer in bytes
OUT PVOID OutputBuffer              Data buffer filled by lower driver
IN ULONG OutputBufferLength         Size of data buffer in bytes
IN BOOLEAN InternalDeviceIoControl  (See text)
IN PKEVENT Event                    Event object used to signal I/O completion
OUT PIO_STATUS_BLOCK Iosb           Receives final status of I/O operation
Return value                        • Non-NULL: address of new IRP
                                    • NULL: IRP could not be allocated

Also notice that you can make either synchronous or asynchronous calls
with the IRPs returned by this function. If you want your Dispatch routine to
stop and wait until an I/O control operation completes, simply pass the
address of an initialized Event object when you allocate the IRP. Then, after
sending the IRP to a lower-level driver with IoCallDriver, use
KeWaitForSingleObject to wait for the Event object. When a lower-level driver
completes the IRP, the I/O Manager will put this Event object into the
Signaled state, which awakens your driver. The I/O status block will tell you
how everything went. As with IoBuildSynchronousFsdRequest, you have to be
careful about multiple threads using this Event object at the same time.
The I/O Manager automatically cleans up and deallocates IRPs created with
IoBuildDeviceIoControlRequest after their completion processing is done. This
includes releasing any system buffer space or MDL attached to the IRP. To trigger
this cleanup, a lower-level driver simply has to call IoCompleteRequest.
Normally, there's no need to attach an I/O Completion routine to one of
these IRPs, unless you need to do some driver-specific post-processing. If you do
attach an I/O Completion routine, it should return STATUS_SUCCESS when it's
done. This lets the I/O Manager free the IRP.
The one problem with this function is the way it handles the buffering-method
bits embedded in the IOCTL code. If an IOCTL code contains
METHOD_BUFFERED, IoBuildDeviceIoControlRequest allocates a nonpaged pool
buffer and copies the contents of the InputBuffer to it; when the IRP completes,
the contents of the nonpaged pool buffer are automatically copied to
OutputBuffer. So far, it behaves exactly like a Win32 DeviceIoControl call coming
from a user-mode application.
But, if you specify an IOCTL code containing one of the Direct I/O methods,
a nasty bug appears: IoBuildDeviceIoControlRequest always builds an MDL for the
OutputBuffer address and always uses a nonpaged pool buffer for the InputBuffer
address, regardless of whether the IOCTL code specifies METHOD_IN_DIRECT
or METHOD_OUT_DIRECT.
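
Since the buffering method is just a field inside the IOCTL code, a driver can check it before deciding how to build the IRP. The following user-mode sketch re-creates the DDK's CTL_CODE layout and METHOD_XXX values locally so that it compiles without the DDK headers. The two IOCTL_XX_EXAMPLE codes and the XxIoctlMethod and XxIsDirectMethod helpers are hypothetical names invented for this illustration; they are not part of the DDK or of this book's sample code.

```c
#include <assert.h>

/* Local copies of the DDK definitions (normally supplied by the DDK headers). */
#define CTL_CODE(DeviceType, Function, Method, Access)              \
    (((unsigned long)(DeviceType) << 16) |                          \
     ((unsigned long)(Access) << 14) |                              \
     ((unsigned long)(Function) << 2) |                             \
     (unsigned long)(Method))

#define METHOD_BUFFERED     0
#define METHOD_IN_DIRECT    1
#define METHOD_OUT_DIRECT   2
#define METHOD_NEITHER      3
#define FILE_ANY_ACCESS     0
#define FILE_DEVICE_UNKNOWN 0x22

/* Hypothetical IOCTL codes, used only in this example. */
#define IOCTL_XX_EXAMPLE_BUFFERED \
    CTL_CODE(FILE_DEVICE_UNKNOWN, 0x800, METHOD_BUFFERED, FILE_ANY_ACCESS)
#define IOCTL_XX_EXAMPLE_OUT_DIRECT \
    CTL_CODE(FILE_DEVICE_UNKNOWN, 0x801, METHOD_OUT_DIRECT, FILE_ANY_ACCESS)

/* The buffering method occupies the two low-order bits of the code. */
static unsigned long XxIoctlMethod(unsigned long IoControlCode)
{
    unsigned long Method = IoControlCode & 3UL;
    assert(Method <= METHOD_NEITHER);
    return Method;
}

/* Returns 1 for the two Direct I/O methods, which are the ones the
   text warns about, and 0 for everything else. */
static int XxIsDirectMethod(unsigned long IoControlCode)
{
    unsigned long Method = XxIoctlMethod(IoControlCode);
    return Method == METHOD_IN_DIRECT || Method == METHOD_OUT_DIRECT;
}
```

A driver that finds XxIsDirectMethod returning 1 for a code might choose to build that IRP by hand instead of relying on the convenience function.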

Creating IRPs from Scratch
The I/O Manager routines described above are the most convenient way to
work with driver-allocated IRPs. Every once in a while, however, they may not be
the right thing to use. For example, if you're trying to issue a request other than
read, write, flush, shutdown, or device I/O control, these functions aren't very
helpful. At that point, your only option is to allocate a blank IRP and set it up by
hand. The following subsections describe several ways to do this.

IRPs from IoAllocateIrp
The IoAllocateIrp function will allocate an IRP from an I/O Manager zone
buffer and perform certain basic kinds of initialization.[7] Your driver has to
fill in the I/O stack location for the target driver and set up whatever kind of
buffer the target driver is expecting to find. The following code fragment
illustrates the use of this function.

    PMDL NewMdl;
    PIRP NewIrp;
    PIO_STACK_LOCATION NextIrpStack;

    NewIrp = IoAllocateIrp(
                LowerDevice->StackSize,
                FALSE );    // No quota charge

    NewMdl = IoAllocateMdl(
                MmGetMdlVirtualAddress(
                    OriginalIrp->MdlAddress ),
                XX_SIZE_OF_BIGGEST_TRANSFER,
                FALSE,      // Primary buffer
                FALSE,      // No quota charge
                NewIrp );

    IoBuildPartialMdl(
        OriginalIrp->MdlAddress,
        NewMdl,
        MmGetMdlVirtualAddress( OriginalIrp->MdlAddress ),
        XX_SIZE_OF_BIGGEST_TRANSFER );

    NextIrpStack = IoGetNextIrpStackLocation( NewIrp );

    NextIrpStack->MajorFunction = IRP_MJ_XXX;

    NextIrpStack->Parameters.Xxx.Length =
        XX_SIZE_OF_BIGGEST_TRANSFER;

    NewIrp->Tail.Overlay.Thread =
        OriginalIrp->Tail.Overlay.Thread;

    IoSetCompletionRoutine(
        NewIrp,
        YyIoCompletion,
        NULL,
        TRUE, TRUE, TRUE );

    IoCallDriver( LowerDevice, NewIrp );

[7] There's a very serious error in the DDK documentation that's worth knowing about: The
documentation clearly states that you must pass any IRPs created with IoAllocateIrp to
IoInitializeIrp before you can use them. This turns out to be a lie. If you pass an IRP returned
from IoAllocateIrp to IoInitializeIrp, the system will crash when your driver tries to release the
IRP. So, don't do that.

One thing to mention here: If the new IRP is targeted at a disk device or a
device with removable media, the intermediate driver needs to copy the contents
of the original IRP's Tail.Overlay.Thread field into the new IRP. This guarantees
that the system will be able to pop up a dialog box for the user if the underlying
device driver calls IoSetHardErrorOrVerifyDevice.
Your driver is responsible for releasing any IRPs created with IoAllocateIrp.
It also has to free any other resources (MDLs or system buffers, for example)
associated with the IRP. Normally, this cleanup occurs in the IRP's I/O Completion
routine. The following code fragment shows what you need to do.

    NTSTATUS
    YyIoCompletion(
        IN PDEVICE_OBJECT DeviceObject,
        IN PIRP Irp,
        IN PVOID Context
        )
    {
        IoFreeMdl( Irp->MdlAddress );
        IoFreeIrp( Irp );
        return STATUS_MORE_PROCESSING_REQUIRED;
    }

IRPs from ExAllocatePool
If, for some odd reason, you'd prefer to get your IRPs directly from nonpaged
pool, you can allocate them with the standard ExAllocatePool function. Once you
have the block of pool, you still need to turn it into an IRP using
IoInitializeIrp. (This is the correct place to call this function.)
Filling in the I/O stack location and setting up appropriate buffers or MDLs is still
left to you.
Here's an example of what to do; in this fragment, the lower Device object is
expecting a nonpaged pool buffer rather than an MDL.

    NewIrp = ExAllocatePool(
                NonPagedPool,
                IoSizeOfIrp(
                    LowerDevice->StackSize ));

    IoInitializeIrp(
        NewIrp,
        IoSizeOfIrp( LowerDevice->StackSize ),
        LowerDevice->StackSize );

    NextIrpStack = IoGetNextIrpStackLocation( NewIrp );

    NextIrpStack->MajorFunction = IRP_MJ_XXX;

    NextIrpStack->Parameters.Xxx.Length = XX_BUFFER_SIZE;

    NewIrp->AssociatedIrp.SystemBuffer =
        ExAllocatePool( NonPagedPool, XX_BUFFER_SIZE );

    NewIrp->Tail.Overlay.Thread =
        OriginalIrp->Tail.Overlay.Thread;

    IoSetCompletionRoutine(
        NewIrp,
        YyIoCompletion,
        NULL,
        TRUE, TRUE, TRUE );

    IoCallDriver( LowerDevice, NewIrp );

Once again, it's the job of the I/O Completion routine attached to the IRP to
do all the cleanup and release the IRP. The following code fragment shows you
how.

    NTSTATUS
    YyIoCompletion(
        IN PDEVICE_OBJECT DeviceObject,
        IN PIRP Irp,
        IN PVOID Context
        )
    {
        ExFreePool( Irp->AssociatedIrp.SystemBuffer );
        IoFreeIrp( Irp );
        return STATUS_MORE_PROCESSING_REQUIRED;
    }

Notice that you use IoFreeIrp to get rid of the IRP, even though you allocated
it with ExAllocatePool. This is because a field in the IRP tells the I/O Manager
whether this IRP came directly from the pool, or whether it came from the I/O
Manager's private zone buffer.

IRPs from driver-managed memory
Finally, there's always the chance that you're keeping a private collection of
IRPs that you've carved out of a driver-specific zone buffer or a look-aside list.
This is really the same as the case where you allocate IRPs using ExAllocatePool,
in that you still need to initialize each IRP using IoInitializeIrp.
The big difference is the way you release these privately managed IRPs.
Since the I/O Manager doesn't know anything about your driver's memory
management strategy for these IRPs, the IoFreeIrp function wouldn't know what to
do with one of them. So, instead of calling IoFreeIrp, the I/O Completion routine
needs to call whatever internal driver function is responsible for releasing the IRP.

Setting Up Buffers for Lower Drivers
If you use any of the preceding techniques to create IRPs from scratch, it's
also your responsibility to initialize and clean up any buffers needed by those
IRPs.[8] How you do this will depend on whether the target Device object does
Buffered or Direct I/O.

Buffered I/O requests
Here, the Dispatch routine in the intermediate driver has to call ExAllocatePool
to allocate the buffer. It stores the address of this buffer in the
AssociatedIrp.SystemBuffer field of the driver-allocated IRP. Later, an I/O
Completion routine attached to the IRP has to release the buffer with a call to
ExFreePool.

Direct I/O requests
Handling these requests means the intermediate driver has to set up an MDL
describing the I/O buffer. In this case, the intermediate driver's Dispatch
routine would do the following:

1. It calls IoAllocateMdl to create an empty MDL large enough to map the buffer.
   It stores the address of this MDL in the MdlAddress field of the
   driver-allocated IRP.

2. The Dispatch routine fills in the MDL. To map a portion of the buffer
   associated with the original caller's IRP, it calls IoBuildPartialMdl. To map
   system memory into the MDL, it uses MmBuildMdlForNonPagedPool.

3. It then attaches an I/O Completion routine to the driver-allocated IRP using
   IoSetCompletionRoutine.

4. Finally, the Dispatch routine sends the IRP to a lower-level driver with
   IoCallDriver.

When the lower-level driver completes the IRP, the intermediate driver's I/O
Completion routine uses IoFreeMdl to release the MDL.

[8] This is one of the arguments in favor of using the convenience routines to build IRPs, since they
handle all this nastiness on their own.

Keeping Track of Driver-Allocated IRPs
Intermediate drivers have to be careful about how they handle incoming
I/O requests that result in multiple IRPs being sent simultaneously to some
other drivers. In particular, it's important for the original incoming IRP not to be
completed until all the allocated IRPs have finished their work. Exactly how the
intermediate driver does this will depend on whether it performs synchronous
or asynchronous I/O with the driver-allocated IRPs.

Synchronous I/O
This is the simpler of the two cases, since the intermediate driver's Dispatch
routine just has to stop and wait until all the allocated IRPs have been
completed. In general, the Dispatch routine would do the following:

1. It calls IoBuildSynchronousFsdRequest to create some number of
   driver-allocated IRPs.

2. Next, the Dispatch routine uses IoCallDriver to pass all the driver-allocated
   IRPs to other drivers.

3. It then calls KeWaitForMultipleObjects and freezes until all the allocated
   IRPs have completed.

4. Finally, it calls IoCompleteRequest with the original IRP to send it back to the
   caller.

Notice here that, since the original request is blocking inside the Dispatch
routine itself, there's no need to mark the original IRP pending.

Asynchronous I/O
This is a somewhat more complex case because there's no central point of
control where the driver can stop and wait for everything to finish. Instead, the
intermediate driver has to attach I/O Completion routines to each driver-allocated
IRP, and the completion routine will have to decide whether it's time to complete
the original caller's IRP.
Here's what happens in the Dispatch routine of an intermediate driver using
this kind of freewheeling approach:

1. It puts the original caller's IRP in the pending state by calling
   IoMarkIrpPending.

2. Next, the Dispatch routine uses one of the methods described in the previous
   section to allocate some additional IRPs.

3. It attaches an I/O Completion routine to each of these IRPs with
   IoSetCompletionRoutine. When it makes this call, the Dispatch routine passes a
   pointer to the original caller's IRP as the Context argument.

4. The Dispatch routine stores a count of outstanding allocated IRPs in an
   unused field of the original IRP. The Key field in the current I/O stack
   location's Parameters union is one possible place.

5. Next, it uses IoCallDriver to pass all the IRPs to other drivers.

6. Finally, the Dispatch routine passes back STATUS_PENDING as its return
   value. This is necessary because the original IRP isn't yet ready for
   completion processing.

As each of the other drivers completes one of these IRPs, the intermediate
driver's I/O Completion routine executes. That routine does the following:

1. First, it performs whatever cleanup is necessary and deletes the
   driver-allocated IRP.

2. The I/O Completion routine calls ExInterlockedDecrementLong to decrement
   the count of outstanding IRPs contained in the original caller's IRP.
   (Remember, it received a pointer to this original IRP as its Context argument.)

3. If the count equals zero, then this is the last outstanding driver-allocated IRP.
   In that case, the I/O Completion routine completes the original IRP by calling
   IoCompleteRequest.

4. Finally, it returns STATUS_MORE_PROCESSING_REQUIRED to prevent any
   further completion processing of the driver-allocated IRP (which has just
   been deleted).
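
The counting logic in the two lists above can be modeled outside the kernel. The following user-mode sketch uses C11 atomics as a stand-in for the interlocked decrement, and a small structure as a stand-in for the original caller's IRP. XX_ORIGINAL_REQUEST, XxStartRequest, and XxPieceDone are hypothetical names invented for this illustration; in a real driver the count would live in the IRP and the final step would be a call to IoCompleteRequest.

```c
#include <assert.h>
#include <stdatomic.h>

/* User-mode stand-in for the original caller's IRP (hypothetical). */
typedef struct XX_ORIGINAL_REQUEST {
    atomic_long Outstanding;  /* driver-allocated IRPs still in flight  */
    int Completed;            /* set where IoCompleteRequest would run  */
} XX_ORIGINAL_REQUEST;

/* Dispatch side: record the count BEFORE any piece is sent down,
   so an early completion can never see a stale value. */
static void XxStartRequest(XX_ORIGINAL_REQUEST *Original, long PieceCount)
{
    assert(PieceCount > 0);
    atomic_init(&Original->Outstanding, PieceCount);
    Original->Completed = 0;
}

/* Completion side: called once for each finished piece.
   Returns 1 if this call finished the original request, else 0. */
static int XxPieceDone(XX_ORIGINAL_REQUEST *Original)
{
    /* atomic_fetch_sub returns the value held before the subtraction,
       so a result of 1 means the count has just dropped to zero. */
    if (atomic_fetch_sub(&Original->Outstanding, 1) == 1) {
        Original->Completed = 1;   /* the real driver completes the IRP here */
        return 1;
    }
    return 0;
}
```

Because the decrement and the test happen as one atomic step, exactly one completion call sees the count reach zero, no matter what order the pieces finish in.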

15.5 WRITING FILTER DRIVERS
A filter driver is a special type of intermediate driver. What sets filters apart from
the layered drivers described earlier in this chapter is that they are invisible. They
sit on top of some other driver and intercept requests directed at the lower
driver's Device objects. Users of the lower driver are completely unaware that
this is going on. Some of the things you can do with filters include the following:

• Filters let you modify some aspects of an existing driver's behavior without
  rewriting the whole thing. SCSI filter drivers (described back in
  Chapter 1) work this way.

• They make it easier to hide the limitations of lower-level device drivers.
  For example, a filter could split large transfers into smaller pieces before
  passing them on to a driver with transfer size limits.

• Filters allow you to add features like compression or encryption to a
  device without modifying the underlying device driver or the programs
  that use the device.

• They let you add or remove expensive behavior like performance
  monitoring that you don't want included in a driver all the time. The disk
  performance monitoring tools in NT work this way.
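
The transfer-splitting idea from the second bullet comes down to a little arithmetic. Here is a minimal user-mode sketch of it, assuming the lower driver's limit is known as a byte count. The helper names XxNextPieceLength, XxPieceCount, and XxSplitTransfer are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Length of the next piece: the lower driver's limit, or whatever is left. */
static size_t XxNextPieceLength(size_t BytesRemaining, size_t MaxTransfer)
{
    assert(MaxTransfer > 0);
    return BytesRemaining < MaxTransfer ? BytesRemaining : MaxTransfer;
}

/* Number of pieces needed for the whole request (ceiling division). */
static size_t XxPieceCount(size_t BytesRequested, size_t MaxTransfer)
{
    assert(MaxTransfer > 0);
    return BytesRequested / MaxTransfer + (BytesRequested % MaxTransfer != 0);
}

/* Fill PieceLengths[] with the size of each piece, in order.
   Stops early if the array is too small. Returns the number of entries used. */
static size_t XxSplitTransfer(size_t BytesRequested, size_t MaxTransfer,
                              size_t *PieceLengths, size_t Capacity)
{
    size_t Count = 0;
    size_t Remaining = BytesRequested;

    assert(MaxTransfer > 0);
    while (Remaining > 0 && Count < Capacity) {
        size_t Piece = XxNextPieceLength(Remaining, MaxTransfer);
        PieceLengths[Count++] = Piece;
        Remaining -= Piece;
    }
    return Count;
}
```

A 10-byte request against a 4-byte limit, for example, becomes three pieces of 4, 4, and 2 bytes.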

The rest of this section explains how to write filter drivers. As you read it,
keep in mind that things like driver-allocated IRPs and I/O Completion routines
work the same way in a filter driver as they do in a regular layered driver.

How Filter Drivers Work

The main difference between filter drivers and other layered drivers is in the
Device objects they create. Whereas a layered driver exposes Device objects with
their own unique names, a filter driver's Device objects have no names at all.
Filter drivers work by attaching one of these nameless Device objects to a Device
object created by some lower-level driver. Figure 15.2 illustrates this relationship.
In the diagram, YYDRIVER has attached a filter Device object to XX0, one of
XXDRIVER's Device objects. Any IRPs sent to XX0 are automatically rerouted to
the Dispatch routines in YYDRIVER. Here's how it works.
1. The DriverEntry routine in the filter driver creates an invisible Device object
   and attaches it to a named Device object belonging to another driver.

2. A client of the lower-level driver opens a connection to XX0. This can be a
   user-mode program calling CreateFile to get a handle, or a kernel-mode client
   calling IoGetDeviceObjectPointer. In either case, the I/O Manager actually
   opens a connection between the client and the filter driver's invisible Device
   object.

3. When the client sends an I/O request to XX0, the I/O Manager sends it to the
   filter driver's unnamed Device object instead. The I/O Manager uses the
   MajorFunction table of the filter's Driver object to select an appropriate
   Dispatch routine.

4. The Dispatch routines in the filter driver either process the IRP on their own
   and complete it immediately, or they send the IRP down to XX0 with
   IoCallDriver. If the filter driver needs to regain control of the IRP when the
   lower-level driver completes it, the filter can associate an I/O Completion
   routine with the IRP.

Figure 15.2  How filter drivers work


Filters can also be layered above other filters. If you try to attach a filter to an
already filtered Device object, the new filter simply gets layered on top of the
highest existing filter. So, you can have essentially any number of filter levels.

Initialization and Cleanup in Filter Drivers

Like every other kernel-mode driver, a filter driver must have a main entry
point called DriverEntry. If the driver is to be unloaded while the system is
running, it needs an Unload routine as well. The following subsections describe what
these routines have to do.

DriverEntry routine
The initialization sequence in a filter driver will follow one of two basic
patterns. The first possibility is that the filter needs to intercept IRPs directed
at all the Device objects created by a lower-level driver. In that
case, the filter's DriverEntry routine will perform these steps:

1. It calls IoGetDeviceObjectPointer to get a pointer to one of the Device
   objects belonging to the lower-level driver.

2. From this Device object, the filter's DriverEntry routine gets a pointer to the
   target Driver object. It uses this pointer to scan the MajorFunction table of the
   target Driver object and make sure that every function code supported by the
   target is also supported by the filter driver.

3. Next, DriverEntry uses the DeviceObject field of the target Driver object to
   get the first target Device object.

4. The filter calls IoCreateDevice to create a filter Device object for this target
   Device object. This filter Device object has no NT name, nor does it have a
   symbolic link to give it a Win32 name.

5. It then calls IoAttachDeviceByPointer to attach the new filter Device object to
   the target Device object.

6. It stores the address of the target Device object in the Device Extension of the
   filter Device object. Other parts of the filter driver will need this pointer to call
   the target driver.

7. Next, DriverEntry copies the DeviceType and Characteristics fields from the
   target Device object to the filter Device object. It also copies the
   DO_DIRECT_IO and DO_BUFFERED_IO bits from the target Device object's
   Flags field. This guarantees that the filter will look the same and have the
   same buffering strategy as the target driver.

8. It uses the NextDevice field of the target Device object to get the next Device
   object in the chain and repeats steps 4-7.

9. Finally, it calls ObDereferenceObject to decrement the reference count on the
   File object returned by IoGetDeviceObjectPointer.

The second possibility is that the filter driver only wants to capture I/O
requests sent to a specific Device object belonging to a lower-level driver. In that
case, the filter's DriverEntry routine performs the following steps.

1. It calls IoCreateDevice to create a filter Device object. This object has no NT
   name, nor does it have a symbolic link to give it a Win32 name.

2. DriverEntry uses IoAttachDevice to connect the filter Device object to a
   specific target Device object. This function takes the case-sensitive NT name of
   the target device (for example, \Device\XX0) and a pointer to the filter
   Device object. After making the attachment, it returns a pointer to the target
   Device object.

3. It stores the address of the target Device object in the Device Extension of the
   filter Device object.

4. Next, DriverEntry copies the DeviceType and Characteristics fields from
   the target Device object to the filter Device object. It also copies the
   DO_DIRECT_IO and DO_BUFFERED_IO bits from the target Device object's
   Flags field.

5. From the target Device object, the filter's DriverEntry routine gets a pointer to
   the target Driver object. It uses this pointer to scan the MajorFunction table of
   the target Driver object and make sure that every function code supported by
   the target is also supported by the filter driver.

Unload routine
A filter driver's Unload routine has to disconnect the filter and target Device
objects. It does this by calling IoDetachDevice and passing a
pointer to the target Device object. Once the filter Device object has been
detached, the Unload routine calls IoDeleteDevice to get rid of it. If the filter
driver has attached itself to a number of target Device objects, it needs to repeat
this procedure for each filter Device object.
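
The one trap in this loop is that deleting a Device object invalidates its link to the next one, so the link has to be saved first. The following user-mode model shows the pattern with ordinary heap structures. XX_DEVICE, XX_DRIVER, XxCreateFilter, and XxFilterUnload are hypothetical stand-ins invented for this sketch; clearing AttachedDevice plays the part of IoDetachDevice and free plays the part of IoDeleteDevice. It is not real kernel code.

```c
#include <assert.h>
#include <stdlib.h>

/* Simplified stand-ins for DEVICE_OBJECT and DRIVER_OBJECT (hypothetical). */
typedef struct XX_DEVICE {
    struct XX_DEVICE *NextDevice;      /* next device owned by the same driver  */
    struct XX_DEVICE *AttachedDevice;  /* filter layered on this device, if any */
    struct XX_DEVICE *TargetDevice;    /* filters only: saved target pointer    */
} XX_DEVICE;

typedef struct XX_DRIVER {
    XX_DEVICE *DeviceObject;           /* head of this driver's device list */
} XX_DRIVER;

/* Create a filter device, link it into the driver's list, attach to Target. */
static XX_DEVICE *XxCreateFilter(XX_DRIVER *FilterDriver, XX_DEVICE *Target)
{
    XX_DEVICE *Filter = calloc(1, sizeof *Filter);
    if (Filter == NULL)
        return NULL;
    Filter->TargetDevice = Target;
    Filter->NextDevice = FilterDriver->DeviceObject;
    FilterDriver->DeviceObject = Filter;
    Target->AttachedDevice = Filter;
    return Filter;
}

/* Detach and delete every filter device. Returns how many were deleted. */
static int XxFilterUnload(XX_DRIVER *FilterDriver)
{
    int Deleted = 0;
    XX_DEVICE *Device;

    assert(FilterDriver != NULL);
    Device = FilterDriver->DeviceObject;
    while (Device != NULL) {
        XX_DEVICE *Next = Device->NextDevice;   /* save before deleting */
        if (Device->TargetDevice != NULL)
            Device->TargetDevice->AttachedDevice = NULL;  /* "detach" */
        free(Device);                                     /* "delete" */
        Deleted++;
        Device = Next;
    }
    FilterDriver->DeviceObject = NULL;
    return Deleted;
}
```

Notice that the saved target pointer comes from the filter's own per-device data, which is why the initialization steps store it in the Device Extension.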

What Happens behind the Scenes

A lot of undocumented activity occurs when a filter driver attaches itself to a
target Device object. In response to an IoAttachDeviceByPointer call, the I/O
Manager performs the following steps.

1. It sends an IRP to the target Device object. This IRP contains the function code
   IRP_MJ_CREATE. There are enough I/O stack locations in this IRP for the
   target driver plus any other drivers layered beneath it. This IRP does not pass
   through the filter driver's MajorFunction dispatch table.

2. Next, the I/O Manager sets the filter Device object's StackSize field to one
   greater than the StackSize field of the target Device object. This guarantees
   that IRPs created for the filter will have enough I/O stack locations for any
   lower-level drivers in the hierarchy.

3. It also sets the AlignmentRequirement field of the filter Device object equal
   to the AlignmentRequirement field of the target Device object.

4. The I/O Manager then sends an IRP to the filter Device object. This IRP
   contains the function code IRP_MJ_CLOSE. Regardless of what Dispatch routines
   are registered in the filter driver's MajorFunction table, this IRP_MJ_CLOSE
   IRP is not preceded by an IRP_MJ_CLEANUP IRP.

5. Finally, the I/O Manager returns a status value to the caller of
   IoAttachDeviceByPointer.


Unlike the attach function, the IoDetachDevice function doesn't send any
self-generated IRPs to the target Device object, nor does it reset the StackSize
field of the filter Device object.
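
The bookkeeping side of an attach (layering on top of any existing filters, then adjusting StackSize and AlignmentRequirement) can be pictured with a small user-mode model. XX_MODEL_DEVICE, XxModelTop, and XxModelAttach are hypothetical names invented for this sketch, and the model assumes that when a Device object is already filtered, the topmost filter is the one the new filter's fields are derived from. The IRPs that the I/O Manager sends during a real attach are not modeled.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the DEVICE_OBJECT fields involved (hypothetical). */
typedef struct XX_MODEL_DEVICE {
    struct XX_MODEL_DEVICE *AttachedDevice;  /* device layered above, if any */
    int StackSize;                           /* I/O stack locations needed   */
    unsigned long AlignmentRequirement;
} XX_MODEL_DEVICE;

/* Follow the AttachedDevice links to the highest device in the stack. */
static XX_MODEL_DEVICE *XxModelTop(XX_MODEL_DEVICE *Device)
{
    while (Device->AttachedDevice != NULL)
        Device = Device->AttachedDevice;
    return Device;
}

/* Layer Filter above whatever currently sits on top of Target.
   Returns the device that Filter ended up directly above. */
static XX_MODEL_DEVICE *XxModelAttach(XX_MODEL_DEVICE *Filter,
                                      XX_MODEL_DEVICE *Target)
{
    XX_MODEL_DEVICE *Top = XxModelTop(Target);

    assert(Filter != NULL && Filter->AttachedDevice == NULL);
    Top->AttachedDevice = Filter;
    Filter->StackSize = Top->StackSize + 1;   /* room for everything below */
    Filter->AlignmentRequirement = Top->AlignmentRequirement;
    return Top;
}
```

In this model a second filter attached to the same target lands above the first one, and its StackSize grows by one more, which matches the idea that every layer needs its own I/O stack location.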
Making the Attachment Transparent

Once a filter has attached itself to a target driver, any I/O requests sent to
the target have to pass through the Dispatch routines of the filter driver first. If
the MajorFunction table of the filter Driver object doesn't support the same set
of IRP_MJ_XXX codes as the target driver, clients of the target may experience
problems when the filter is attached. Specifically, some types of requests that
work without the filter will be rejected as illegal operations when the filter is in
place.
To avoid this kind of inconsistency, the filter driver's MajorFunction table
must contain a Dispatch routine for every IRP_MJ_XXX function supported by
the target driver. Even if the filter isn't interested in modifying a particular major
function code, it still has to supply a dummy Dispatch routine that just passes the
IRP on to the target driver.
The best way to set this up is for the filter driver to scan the MajorFunction
table of the target Driver object. If an entry in the target driver's table contains a
pointer to _IopInvalidDeviceRequest,[9] then the corresponding IRP_MJ_XXX
code is unsupported; if it contains anything else, then the target driver supports
the function code. In that case, the filter driver has to put a Dispatch routine in the
corresponding MajorFunction slot of its own Driver object. The sample driver in
the next section shows how to do this.

15.6 CODE EXAMPLE: A FILTER DRIVER
This example shows how a basic filter driver (called YYDRIVER) intercepts all
requests intended for a lower-level driver (XXDRIVER). The purpose of the filter
is to hide the lower driver 's limited output transfer size. To do this, it breaks large
outputs into smaller pieces. It also overrides an IOCTL from the lower driver that
returns the maximum size of an output buffer. All other major function codes sup­
ported by the lower driver are simply passed through from the filter.
You can find the code for this example in the CH15\FILTER\DRIVER
directory on the disk that accompanies this book. Code for the dummy device
driver sitting below it is in CH15\LOWER\DRIVER.
YYDRIVER.H - Driver Data Structures

Here's the Device Extension used by the filter driver. Notice that it contains
a pointer to the lower driver 's Device object. The filter uses this to send IRPs to
the lower driver.

    typedef struct _DEVICE_EXTENSION {
        PDEVICE_OBJECT DeviceObject;    // Back pointer
        PDEVICE_OBJECT TargetDevice;
        XX_BUFFER_SIZE_INFO BufferInfo;
    } DEVICE_EXTENSION, *PDEVICE_EXTENSION;

INIT.C - Initialization Code

Initialization in this filter follows the pattern described in the previous
section of this chapter. This driver takes the general approach of intercepting I/O
requests for all the Device objects created by the lower driver.

DriverEntry
This function is responsible for driver-level initialization. It
uses one of the lower driver's Device objects to locate all Device objects belonging
to the lower driver. It uses a helper function to attach filter Device objects to each
one. It also sets up the filter's MajorFunction table by scanning the slots in the
lower driver's table.

[9] Remember from Chapter 8 that this is the I/O Manager routine that rejects an IRP with an
unwanted function code. This is the default value for any slot in the MajorFunction table.


    NTSTATUS
    DriverEntry(
        IN PDRIVER_OBJECT DriverObject,
        IN PUNICODE_STRING RegistryPath
        )
    {
        PDEVICE_OBJECT TargetDevice;
        UNICODE_STRING TargetDeviceName;
        PDRIVER_OBJECT TargetDriver;
        PDRIVER_DISPATCH EmptyDispatchValue;
        XX_BUFFER_SIZE_INFO BufferInfo;
        PFILE_OBJECT FileObject;
        NTSTATUS status;
        ULONG i;

        EmptyDispatchValue =
            DriverObject->MajorFunction[ IRP_MJ_CREATE ];   // (1)

        //
        // Export other driver entry points...
        //
        DriverObject->DriverUnload = YyDriverUnload;

        DriverObject->MajorFunction[ IRP_MJ_WRITE ] =
            YyDispatchWrite;                                // (2)

        DriverObject->MajorFunction[ IRP_MJ_DEVICE_CONTROL ] =
            YyDispatchDeviceIoControl;

        RtlInitUnicodeString(
            &TargetDeviceName,
            TARGET_DEVICE_NAME );

        status = IoGetDeviceObjectPointer(                  // (3)
                    &TargetDeviceName,
                    FILE_ALL_ACCESS,
                    &FileObject,
                    &TargetDevice );

        if( !NT_SUCCESS( status ))
        {
            return status;
        }

        YyGetBufferLimits( TargetDevice, &BufferInfo );

        TargetDriver = TargetDevice->DriverObject;

        for( i = 0; i <= IRP_MJ_MAXIMUM_FUNCTION; i++ )     // (4)
        {
            if(( TargetDriver->MajorFunction[i]
                    != EmptyDispatchValue )
                && ( DriverObject->MajorFunction[i]
                    == EmptyDispatchValue ))
            {
                DriverObject->MajorFunction[i] =
                    YyDispatchPassThrough;
            }
        }

        TargetDevice = TargetDriver->DeviceObject;          // (5)
        while( TargetDevice != NULL )
        {
            status = YyAttachFilter(
                        DriverObject,
                        TargetDevice,
                        &BufferInfo );
            if( !NT_SUCCESS( status ))
            {
                YyDriverUnload( DriverObject );
                break;
            }
            TargetDevice = TargetDevice->NextDevice;
        }

        ObDereferenceObject( FileObject );                  // (6)
        return status;
    }

(1) The first step is to get the contents of an empty slot in the filter's
    MajorFunction table. This is actually the address of an internal system routine
    called _IopInvalidDeviceRequest. We can find its current value by looking
    in any slot of the filter's own table that it hasn't filled in yet.

(2) Next, overwrite slots in the filter's MajorFunction table that correspond
    to functions the filter wants to intercept and modify. In this driver, only
    write and IOCTL functions are being fooled with.

(3) Using the NT name of any device belonging to the lower driver, get a
    pointer to the Device object itself. It doesn't really matter which one, since
    it's only being used to query buffer size limits and to get a pointer to the
    lower Driver object.

(4) In this loop, see which IRP_MJ_XXX function codes the lower driver
    responds to. If the lower driver processes a given code and the filter
    doesn't explicitly intercept that code, fill the corresponding slot in the
    filter's MajorFunction table with the address of a generic pass-through
    Dispatch routine.

(5) Now, run the list of all Device objects attached to the lower Driver object.
    For each one, create and attach an invisible filter Device object.

(6) Finally, decrement the reference count on the unused File object and
    return the most recent status value. This is either STATUS_SUCCESS or
    some error code from YyAttachFilter.

YyAttachFilter
This is a little helper function that does the grunt work associated with
creating and attaching a filter Device object to a specific lower-level
Device object.

    static NTSTATUS
    YyAttachFilter(
        IN PDRIVER_OBJECT FilterDriver,
        IN PDEVICE_OBJECT TargetDevice,
        IN PXX_BUFFER_SIZE_INFO BufferInfo
        )
    {
        PDEVICE_OBJECT FilterDevice;
        PDEVICE_EXTENSION FilterExtension;
        ULONG TargetMethod;
        NTSTATUS status;

        status = IoCreateDevice(                            // (1)
                    FilterDriver,
                    sizeof( DEVICE_EXTENSION ),
                    NULL,
                    FILE_DEVICE_UNKNOWN,
                    0,
                    TRUE,
                    &FilterDevice );

        if( !NT_SUCCESS( status ))
        {
            return status;
        }

        status = IoAttachDeviceByPointer(                   // (2)
                    FilterDevice,
                    TargetDevice );

        if( !NT_SUCCESS( status ))
        {
            IoDeleteDevice( FilterDevice );
            return status;
        }

        FilterExtension = FilterDevice->DeviceExtension;    // (3)
        FilterExtension->DeviceObject = FilterDevice;
        FilterExtension->TargetDevice = TargetDevice;

        FilterExtension->BufferInfo.MaxWriteLength =
            BufferInfo->MaxWriteLength;

        FilterExtension->BufferInfo.MaxReadLength =
            BufferInfo->MaxReadLength;

        FilterDevice->DeviceType =
            TargetDevice->DeviceType;                       // (4)

        FilterDevice->Characteristics =
            TargetDevice->Characteristics;

        FilterDevice->Flags |=
            ( TargetDevice->Flags &
                ( DO_BUFFERED_IO | DO_DIRECT_IO ));         // (5)

        return STATUS_SUCCESS;
    }

(1) Create a Device object without an NT name. It doesn't matter what its
    type or characteristics are, since they'll be copied from the lower-level
    Device object.

(2) Attach the invisible Device object to the lower-level Device object. See the
    previous section in this chapter for a description of all the things that
    happen when you make this call.

(3) Set up the filter Device object's Device Extension structure. This includes
    storing the transfer size limitations queried from the lower driver.

(4) Copy various items from the lower-level Device object into the filter
    Device object. This is necessary to make the presence of the filter as
    transparent as possible.

(5) Last, select the same buffering strategy as the one used by the lower-level
    Device object.

YyGetBufferLimits
This is an even tinier helper function that queries the lower-level driver for
information about its buffer size limits. It shows how to make a synchronous
IOCTL call from one driver to another.

    static VOID
    YyGetBufferLimits(
        IN PDEVICE_OBJECT TargetDevice,
        IN OUT PXX_BUFFER_SIZE_INFO BufferInfo
        )
    {
        KEVENT IoctlComplete;
        IO_STATUS_BLOCK Iosb;
        PIRP Irp;
        NTSTATUS status;

        KeInitializeEvent(
            &IoctlComplete,
            NotificationEvent,
            FALSE );

        Irp = IoBuildDeviceIoControlRequest(
                IOCTL_XX_GET_MAX_BUFFER_SIZE,
                TargetDevice,
                NULL,
                0,
                BufferInfo,
                sizeof( XX_BUFFER_SIZE_INFO ),
                FALSE,
                &IoctlComplete,
                &Iosb );

        IoCallDriver( TargetDevice, Irp );

        KeWaitForSingleObject(
            &IoctlComplete,
            Executive,
            KernelMode,
            FALSE,
            NULL );
    }

DISPATCH.C - Filter Dispatch Routines
Here are the Dispatch routines for the filter driver. Only two major function
codes are actually modified by the filter. All the others are passed directly to the
lower-level driver.

YyDispatchWrite  The lower driver has a limit on the maximum size of an
output operation. The filter hides this by breaking writes into smaller pieces. This
Dispatch routine and the corresponding I/O Completion routine do the work of
splitting the transfer.

    NTSTATUS
    YyDispatchWrite(
        IN PDEVICE_OBJECT DeviceObject,
        IN PIRP Irp
        )
    {
        PDEVICE_EXTENSION FilterExtension =
            DeviceObject->DeviceExtension;
        PIO_STACK_LOCATION IrpStack =
            IoGetCurrentIrpStackLocation( Irp );
        PIO_STACK_LOCATION NextIrpStack =
            IoGetNextIrpStackLocation( Irp );
        ULONG MaxTransfer =
            FilterExtension->BufferInfo.MaxWriteLength;
        ULONG BytesRequested =
            IrpStack->Parameters.Write.Length;

        if( BytesRequested == 0 )                       // ❶
        {
            Irp->IoStatus.Status = STATUS_SUCCESS;
            Irp->IoStatus.Information = 0;
            IoCompleteRequest( Irp, IO_NO_INCREMENT );
            return STATUS_SUCCESS;
        }

        if( BytesRequested <= MaxTransfer )             // ❷
        {
            return YyDispatchPassThrough(
                        DeviceObject,
                        Irp );
        }

        NextIrpStack->MajorFunction = IRP_MJ_WRITE;     // ❸
        NextIrpStack->Parameters.Write.Length =
            MaxTransfer;

        IrpStack->Parameters.Write.ByteOffset.HighPart =
            BytesRequested;                             // ❹

        IrpStack->Parameters.Write.ByteOffset.LowPart =
            (ULONG)Irp->AssociatedIrp.SystemBuffer;     // ❺

        IoSetCompletionRoutine(                         // ❻
            Irp,
            YyWriteCompletion,
            NULL,
            TRUE, TRUE, TRUE );

        //
        // Pass the IRP to the target device
        //
        return IoCallDriver(                            // ❼
                    FilterExtension->TargetDevice,
                    Irp );
    }

❶ Check for zero-length transfers and complete them right here.

❷ If the requested length is within the lower driver's acceptable limits, just
   send the IRP right on through.

❸ Otherwise, set up the lower driver's I/O stack location in this IRP to
   transfer as much as possible in a single operation.

❹ Use the high-order part of the ByteOffset field in the filter driver's I/O
   stack location to hold the number of bytes remaining in the original
   caller's request. This is all right because this field isn't being used for
   anything else in this driver. Initially, this is the same as the number of
   bytes requested in the whole transfer.

❺ Save the original system buffer address in the low-order (unsigned) part
   of the ByteOffset field.

❻ Set up an I/O Completion routine to continue working on the split
   transfer. All the necessary context is stored somewhere in the IRP, so
   there's no need to pass any other context block.

❼ Finally, pass the IRP to the lower-level driver and begin the first partial
   transfer operation.

YyDispatchDeviceIoControl  To further hide the limitations of the lower-level
driver, the filter intercepts IOCTL queries about the driver's maximum
transfer size. Instead of returning the lower-level driver's limit values, it lies and
says there are no limits. Any other kind of IOCTL function is passed through.

    NTSTATUS
    YyDispatchDeviceIoControl(
        IN PDEVICE_OBJECT DeviceObject,
        IN PIRP Irp
        )
    {
        PIO_STACK_LOCATION IrpStack =
            IoGetCurrentIrpStackLocation( Irp );
        PXX_BUFFER_SIZE_INFO BufferInfo;

        if( IrpStack->Parameters.DeviceIoControl.IoControlCode
                == IOCTL_XX_GET_MAX_BUFFER_SIZE )       // ❶
        {
            BufferInfo =
                (PXX_BUFFER_SIZE_INFO)Irp->
                    AssociatedIrp.SystemBuffer;

            BufferInfo->MaxWriteLength = XX_NO_BUFFER_LIMIT;
            BufferInfo->MaxReadLength = XX_NO_BUFFER_LIMIT;

            Irp->IoStatus.Information =
                sizeof( XX_BUFFER_SIZE_INFO );
            Irp->IoStatus.Status = STATUS_SUCCESS;
            IoCompleteRequest( Irp, IO_NO_INCREMENT );
            return STATUS_SUCCESS;
        }
        else                                            // ❷
        {
            return YyDispatchPassThrough(
                        DeviceObject,
                        Irp );
        }
    }

❶ Intercept the buffer-size IOCTL code used by the lower-level driver and
   tell the caller that there are no size limits.

❷ If it's any other kind of IOCTL, just send it on to the lower driver for
   processing.

YyDispatchPassThrough  This is the "none of the above" Dispatch routine.
It simply passes everything on to the lower-level driver. It attaches a generic
I/O Completion routine to handle making the IRP pending.

    NTSTATUS
    YyDispatchPassThrough(
        IN PDEVICE_OBJECT DeviceObject,
        IN PIRP Irp
        )
    {
        PDEVICE_EXTENSION FilterExtension =
            DeviceObject->DeviceExtension;
        PIO_STACK_LOCATION IrpStack =
            IoGetCurrentIrpStackLocation( Irp );
        PIO_STACK_LOCATION NextIrpStack =
            IoGetNextIrpStackLocation( Irp );
        NTSTATUS status;

        //
        // Copy args to next level
        //
        *NextIrpStack = *IrpStack;

        //
        // Set up Completion routine to handle
        // marking the IRP pending.
        //
        IoSetCompletionRoutine(
            Irp,
            YyGenericCompletion,
            NULL,
            TRUE, TRUE, TRUE );

        //
        // Pass the IRP to the target
        //
        return IoCallDriver(
                    FilterExtension->TargetDevice,
                    Irp );
    }

COMPLETE.C - I/O Completion Routines
The functions in this file handle all the I/O completion performed by the
filter driver.

YyWriteCompletion  This is the real workhorse routine. Its job is to perform
all the additional partial transfers needed to satisfy the original caller's
request. If there's an error, or when the whole transfer is finished, it allows the IRP
to continue its journey back up the driver stack. Otherwise, it sets up the IRP for
another small transfer and sends it to the lower driver.

    NTSTATUS
    YyWriteCompletion(
        IN PDEVICE_OBJECT DeviceObject,
        IN PIRP Irp,
        IN PVOID Context
        )
    {
        PDEVICE_EXTENSION FilterExtension =
            DeviceObject->DeviceExtension;
        PIO_STACK_LOCATION IrpStack =
            IoGetCurrentIrpStackLocation( Irp );
        PIO_STACK_LOCATION NextIrpStack =
            IoGetNextIrpStackLocation( Irp );
        ULONG TransferSize =
            Irp->IoStatus.Information;
        ULONG BytesRequested =
            IrpStack->Parameters.Write.Length;
        ULONG BytesRemaining =
            (ULONG)IrpStack->
                Parameters.Write.ByteOffset.HighPart;
        ULONG MaxTransfer =
            FilterExtension->BufferInfo.MaxWriteLength;
        NTSTATUS status;

        if( NT_SUCCESS( Irp->IoStatus.Status ))         // ❶
        {
            BytesRemaining -= TransferSize;
            IrpStack->Parameters.Write.ByteOffset.HighPart =
                BytesRemaining;
        }

        if( NT_SUCCESS( Irp->IoStatus.Status )          // ❷
            && BytesRemaining > 0 )
        {
            Irp->AssociatedIrp.SystemBuffer =
                (PUCHAR)Irp->AssociatedIrp.SystemBuffer
                    + TransferSize;                     // ❸

            TransferSize = BytesRemaining;              // ❹
            if( TransferSize > MaxTransfer )
            {
                TransferSize = MaxTransfer;
            }

            NextIrpStack->MajorFunction = IRP_MJ_WRITE;
            NextIrpStack->Parameters.Write.Length =
                TransferSize;

            IoSetCompletionRoutine(                     // ❺
                Irp,
                YyWriteCompletion,
                NULL,
                TRUE, TRUE, TRUE );

            IoCallDriver(
                FilterExtension->TargetDevice,
                Irp );
            return STATUS_MORE_PROCESSING_REQUIRED;
        }
        else                                            // ❻
        {
            Irp->AssociatedIrp.SystemBuffer =
                (PVOID)IrpStack->
                    Parameters.Write.ByteOffset.LowPart; // ❼

            Irp->IoStatus.Information =
                BytesRequested - BytesRemaining;        // ❽

            if( Irp->PendingReturned )                  // ❾
            {
                IoMarkIrpPending( Irp );
            }
            return STATUS_SUCCESS;
        }
    }

❶ If the current transfer worked, reduce the count of bytes left to send and
   save the new count in an unused part of the filter driver's I/O stack
   location.

❷ If there's more data left to transfer, set up the next partial output
   operation.

❸ Increment the pointer into the system buffer to account for the data
   transfer that's just completed.

❹ Calculate the size of the next partial transfer. Start by assuming it can all
   be done in a single operation. Reduce that expectation if it proves to be
   too optimistic.

❺ After setting up the I/O stack location for the lower-level driver, attach
   this I/O Completion routine to catch the operation when it finishes.
        if( Irp->PendingReturned )
        {
            IoMarkIrpPending( Irp );
        }
        return STATUS_SUCCESS;
    }
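
The bookkeeping behind the split transfer is easier to follow outside the kernel. The following is a minimal user-mode sketch, not driver code: lower_write, split_write, and MAX_TRANSFER are hypothetical stand-ins for the lower-level driver, the filter's Dispatch and Completion routines, and the MaxWriteLength limit. A plain loop takes the place of the chain of completion callbacks.

```c
#include <stddef.h>
#include <string.h>

#define MAX_TRANSFER 4u   /* hypothetical stand-in for BufferInfo.MaxWriteLength */

/* Stand-in for the lower-level driver: refuses anything over its limit. */
static int lower_write(char *device, size_t *device_len,
                       const char *data, size_t len)
{
    if (len > MAX_TRANSFER)
        return -1;
    memcpy(device + *device_len, data, len);
    *device_len += len;
    return 0;
}

/*
 * Stand-in for YyDispatchWrite plus YyWriteCompletion. Sends the caller's
 * buffer to the lower driver in pieces no larger than MAX_TRANSFER.
 * Returns the number of partial transfers issued, or -1 on error.
 */
static int split_write(char *device, size_t *device_len,
                       const char *buffer, size_t bytes_requested)
{
    size_t bytes_remaining = bytes_requested;  /* kept in ByteOffset.HighPart */
    const char *cursor = buffer;               /* the advancing SystemBuffer  */
    int transfers = 0;

    while (bytes_remaining > 0) {
        /* Assume the rest fits in one operation, then trim if it doesn't. */
        size_t transfer_size = bytes_remaining;
        if (transfer_size > MAX_TRANSFER)
            transfer_size = MAX_TRANSFER;

        if (lower_write(device, device_len, cursor, transfer_size) != 0)
            return -1;

        cursor += transfer_size;
        bytes_remaining -= transfer_size;
        transfers++;
    }
    return transfers;
}
```

With a limit of 4 bytes, a 10-byte request goes down as three transfers of 4, 4, and 2 bytes, and a zero-length request never reaches the lower driver at all.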


15.7 WRITING TIGHTLY COUPLED DRIVERS
Unlike layered and filter drivers, tightly coupled drivers don't use the I/O
Manager's IoCallDriver function for most of their communications. Instead, they
define some kind of private calling interface. The advantage of this approach is
that it's usually faster than the IRP-passing model supported by the I/O Manager.
In trade for improved performance, however, you have to pay much more
attention to the mechanics of the interface. Also, unless the details of the interface are
well documented, it's difficult for drivers from different vendors to work with
each other this way.

How Tightly Coupled Drivers Work
Since the interface between two tightly coupled drivers is completely
determined by the driver designer, it's impossible to give a single, unified description
of how all tightly coupled drivers work. Instead, this subsection presents some
general architectural guidelines.¹⁰ Figure 15.3 shows one common method of
tightly coupling a pair of drivers.
In this picture, the lower driver has exposed a special setup function in the
form of an IRP_MJ_INTERNAL_DEVICE_CONTROL IOCTL. During the upper
driver's initialization, it calls this IOCTL function to retrieve a table of function
pointers from the lower driver. When the upper driver needs the services of the
lower driver, it calls one of the functions in this table directly, rather than using
IoCallDriver. Before unloading, the upper driver calls another function in the
function table to disconnect it from the lower driver.

Figure 15.3  How tightly coupled drivers work

¹⁰ For some concrete examples, see the source code for the mouse and keyboard
drivers that comes with the DDK.
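
As a rough illustration of this arrangement, here is a user-mode sketch of a function-table interface. Every name in it (XX_FUNCTION_TABLE, XxGetFunctionTable, YyInitialize, and so on) is hypothetical, and an ordinary function call stands in for the internal IOCTL that a real lower driver would use to hand out the table.

```c
#include <stddef.h>

/* Hypothetical table of entry points exported by the lower driver. */
typedef struct XX_FUNCTION_TABLE {
    int  (*Transfer)(const char *data, size_t length);
    void (*Disconnect)(void);
} XX_FUNCTION_TABLE;

/* ---- Lower driver ---- */
static int    xx_client_connected;
static size_t xx_bytes_transferred;

static int XxTransfer(const char *data, size_t length)
{
    (void)data;                     /* a real driver would move the data */
    if (!xx_client_connected)
        return -1;
    xx_bytes_transferred += length;
    return 0;
}

static void XxDisconnect(void)
{
    xx_client_connected = 0;
}

/* Stand-in for the IRP_MJ_INTERNAL_DEVICE_CONTROL setup IOCTL. */
static void XxGetFunctionTable(XX_FUNCTION_TABLE *table)
{
    table->Transfer = XxTransfer;
    table->Disconnect = XxDisconnect;
    xx_client_connected = 1;
}

/* ---- Upper driver ---- */
static XX_FUNCTION_TABLE yy_lower;  /* would live in the Device Extension */

static void YyInitialize(void)
{
    XxGetFunctionTable(&yy_lower);
}

static int YyWrite(const char *data, size_t length)
{
    /* Direct call through the table instead of IoCallDriver. */
    return yy_lower.Transfer(data, length);
}

static void YyUnload(void)
{
    yy_lower.Disconnect();
}
```

The upper driver never refers to the lower driver's functions by name. Everything goes through the table, which is why the table's layout has to be agreed upon and documented by both sides.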

Initialization and Cleanup in Tightly Coupled Drivers
The following subsections describe in general terms how a pair of tightly
coupled drivers might initialize and unload. Of course, the exact steps will
depend on the architecture chosen by the driver designer.

Lower DriverEntry routine  Assuming the lower driver manages some
specific piece of hardware, its DriverEntry routine will perform the following
steps.

1. Using the techniques described in Chapter 7, it finds and allocates any
   hardware for which it is responsible.

2. DriverEntry adds an IRP_MJ_INTERNAL_DEVICE_CONTROL Dispatch
   routine to the Driver object's MajorFunction table. One of the IOCTLs
   supported by this function code will be to export a table of pointers to various
   functions in the lower driver.

3. Next, it calls IoCreateDevice to build a Device object. Although this object
   has an NT name, it does not have a Win32 symbolic link. This Device object is
   used by the upper driver to establish its initial connection with the lower
   driver.

4. Finally, DriverEntry does any other driver-specific initialization. For example,
   it might set up a ring of buffers that it will share with its higher-level clients.

Upper DriverEntry routine  The upper driver makes its initial contact with
the lower driver using the standard I/O Manager interface described earlier in
this chapter. This is what its DriverEntry routine does.

1. It calls IoGetDeviceObjectPointer to get a pointer to the lower driver's
   Device object. As with a layered driver, this is followed by a call to
   ObReferenceObjectByPointer to increment the pointer reference count of the lower
   Device object, and a call to ObDereferenceObject to decrement the reference
   count of the File object returned by IoGetDeviceObjectPointer.

2. Next, DriverEntry issues a synchronous IOCTL request to the lower Device
   object. This IOCTL returns the address of the lower driver's table of exported
   functions.

3. It creates one or more Device objects with IoCreateDevice. If the upper driver
   is exposing these objects to user-mode applications, it calls
   IoCreateSymbolicLink to give them Win32 names.

4. Finally, DriverEntry stores the address of the lower driver's function table in
   the Device Extension of the upper Device objects.

Upper Unload routine  When the upper driver is stopped, its Unload
routine should perform the following general steps.

1. It releases any resources it might have acquired from the lower driver. For
   example, if it received a buffer from the lower driver, it returns it.

2. Next, the Unload routine issues a synchronous IOCTL to the lower Device
   object. This notifies the lower driver that the upper one is disconnecting and
   gives the lower driver a chance to release resources acquired from the upper
   driver.

3. It then calls ObDereferenceObject to decrement the pointer reference count
   on the lower Device object. This effectively breaks the connection with the
   lower driver.

4. Finally, the Unload routine performs the usual cleanup tasks, such as deleting
   its own Device objects and symbolic links.

Lower Unload routine  There's nothing particularly exciting about the lower
driver's Unload routine. It simply releases any hardware it might be holding,
releases any other system resources it has allocated, and deletes the Device object
that it exposed to the upper driver.

I/O Request Processing in Tightly Coupled Drivers
When a client of the upper driver issues an I/O request, the I/O Manager
sends an IRP representing the transaction to one of the upper driver's Dispatch
routines. Rather than using IoCallDriver to send this IRP to the lower driver, the
Dispatch routine directly calls one or more functions in the lower driver to service
the request. The exact processing sequence will depend on whether the request is
handled synchronously or asynchronously.

Synchronous I/O  For input operations, the upper driver uses a GetBuffer
function in the lower driver to dequeue a buffer of data from the ring of shared
buffers. Following the model described in Chapter 14, this queue has a
Semaphore object that keeps track of the number of full buffers. If the queue of ready
buffers is empty, the Semaphore will be in the Non-signaled state, and the upper
driver's Dispatch routine will wait. When the lower driver adds a full buffer to
the queue, it increments the Semaphore, which awakens the waiting Dispatch
routine. The Dispatch routine then formats and copies data from the shared buffer
into the buffer associated with the original caller's IRP, completes the IRP, and
releases the shared buffer using a PutBuffer function exposed by the lower driver.
Synchronous output operations just reverse the sequence. Here, the upper
driver's Dispatch routine calls a GetBuffer function in the lower driver to get an
empty buffer from the queue. Again, the queue has an attached Semaphore object
that counts the number of available buffers. If there are no empty buffers, the
upper driver's Dispatch routine waits until the lower driver adds one to the
queue and increments the Semaphore. Once it gets an empty buffer, the upper
driver fills it with data from the buffer associated with the original IRP. It then
calls a PutBuffer function exposed by the lower driver.
The PutBuffer function begins the actual data transfer and then waits for a
synchronization Event object embedded in the buffer. This causes the upper
driver's Dispatch routine to go to sleep. When the transfer operation completes,
some other part of the lower driver (a DPC routine, for example) sets the Event
object and returns the buffer to the queue of available blocks. At that point, the
upper driver's Dispatch routine wakes up and completes the original caller's IRP.
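
The input path can be sketched in user mode as a small ring of shared buffers guarded by a counter. This is only an illustration under several assumptions: all of the names are hypothetical, the xx_full_count variable plays the part of the Semaphore's count, and XxGetBuffer returns NULL at the point where a real Dispatch routine would block in a wait.

```c
#include <stddef.h>
#include <string.h>

#define XX_RING_SIZE    4
#define XX_BUFFER_BYTES 16

typedef struct XX_BUFFER {
    char   data[XX_BUFFER_BYTES];
    size_t length;
} XX_BUFFER;

static XX_BUFFER xx_ring[XX_RING_SIZE];
static size_t xx_write_index;   /* next slot the lower driver fills   */
static size_t xx_read_index;    /* next full slot handed to a client  */
static size_t xx_full_count;    /* stands in for the Semaphore count  */
static size_t xx_in_use;        /* buffers held by the upper driver   */

/* Lower driver: queue one full buffer, as its DPC routine might. */
static int XxProduce(const char *data, size_t length)
{
    XX_BUFFER *buffer;

    if (length > XX_BUFFER_BYTES ||
        xx_full_count + xx_in_use == XX_RING_SIZE)
        return -1;                  /* no free slot in the ring */

    buffer = &xx_ring[xx_write_index];
    memcpy(buffer->data, data, length);
    buffer->length = length;
    xx_write_index = (xx_write_index + 1) % XX_RING_SIZE;
    xx_full_count++;                /* "release" the Semaphore */
    return 0;
}

/* Lower driver: GetBuffer. NULL means the caller would have to wait. */
static XX_BUFFER *XxGetBuffer(void)
{
    XX_BUFFER *buffer;

    if (xx_full_count == 0)
        return NULL;

    buffer = &xx_ring[xx_read_index];
    xx_read_index = (xx_read_index + 1) % XX_RING_SIZE;
    xx_full_count--;
    xx_in_use++;
    return buffer;
}

/* Lower driver: PutBuffer returns a shared buffer to the ring. */
static void XxPutBuffer(XX_BUFFER *buffer)
{
    buffer->length = 0;
    xx_in_use--;
}

/* Upper driver: copy one shared buffer into the caller's buffer. */
static long YyRead(char *destination, size_t capacity)
{
    XX_BUFFER *buffer = XxGetBuffer();
    size_t length;

    if (buffer == NULL)
        return -1;                  /* a real driver would wait here */

    length = buffer->length < capacity ? buffer->length : capacity;
    memcpy(destination, buffer->data, length);
    XxPutBuffer(buffer);
    return (long)length;
}
```

Buffers come back in the same order they were handed out, which keeps the sketch short. A real implementation would also need a spin lock or similar protection around the indexes and counters.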

Asynchronous I/O  In this case, the upper driver's Dispatch routine calls
IoMarkIrpPending to put the original caller's IRP into the pending state. It then
calls a QueueRequest function exported by the lower driver. As arguments, this
function takes the address of the original IRP and a pointer to a callback routine in
the upper driver. QueueRequest stores the IRP address and callback pointer in a
driver-defined context block and adds it to a private queue of pending requests. It
then returns control to the upper driver, and the upper driver's Dispatch routine
returns STATUS_PENDING to the I/O Manager.
Meanwhile, the lower driver is busily pulling context blocks from its private
queue and performing I/O requests. As each one finishes, the lower driver
invokes the upper driver's callback routine and passes it the address of the
processed IRP. The callback routine in the upper driver does any postprocessing
needed by the request and calls IoCompleteRequest with the original caller's IRP.
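
The same flow can be modeled in user mode with a queue of context blocks. In this sketch FAKE_IRP, XX_CONTEXT, XxQueueRequest, XxProcessNext, and the Yy routines are all hypothetical names, the return value 1 stands for STATUS_PENDING, and setting a flag stands for the call to IoCompleteRequest.

```c
#include <stddef.h>

#define XX_QUEUE_DEPTH 8

typedef struct FAKE_IRP {           /* stand-in for a real IRP */
    int id;
    int completed;
} FAKE_IRP;

typedef void (*XX_CALLBACK)(FAKE_IRP *irp);

typedef struct XX_CONTEXT {         /* driver-defined context block */
    FAKE_IRP   *irp;
    XX_CALLBACK callback;
} XX_CONTEXT;

static XX_CONTEXT xx_queue[XX_QUEUE_DEPTH];
static size_t xx_head;
static size_t xx_count;

/* Lower driver: remember the request and return immediately. */
static int XxQueueRequest(FAKE_IRP *irp, XX_CALLBACK callback)
{
    size_t tail;

    if (xx_count == XX_QUEUE_DEPTH)
        return -1;

    tail = (xx_head + xx_count) % XX_QUEUE_DEPTH;
    xx_queue[tail].irp = irp;
    xx_queue[tail].callback = callback;
    xx_count++;
    return 0;
}

/* Lower driver: finish the oldest request and notify the upper driver.
   Returns 1 if a request was processed, 0 if the queue was empty. */
static int XxProcessNext(void)
{
    XX_CONTEXT context;

    if (xx_count == 0)
        return 0;

    context = xx_queue[xx_head];
    xx_head = (xx_head + 1) % XX_QUEUE_DEPTH;
    xx_count--;

    context.callback(context.irp);
    return 1;
}

/* Upper driver: callback, where IoCompleteRequest would be called. */
static void YyRequestDone(FAKE_IRP *irp)
{
    irp->completed = 1;
}

/* Upper driver: Dispatch routine. Returns 1 to stand for STATUS_PENDING. */
static int YyDispatch(FAKE_IRP *irp)
{
    irp->completed = 0;             /* where IoMarkIrpPending would go */
    return XxQueueRequest(irp, YyRequestDone) == 0 ? 1 : -1;
}
```

Notice that YyDispatch returns before any work has been done on the request. Completion happens later, in whatever context the lower driver uses to drain its queue.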

15.8 SUMMARY
The layered architecture in Windows NT allows you to simplify the design of
drivers that might otherwise be extremely complex. Breaking a monolithic driver
into smaller, logically distinct pieces makes implementation and maintenance
easier, reduces debugging time, and increases the likelihood that some of the
software will be reusable.
In this chapter, you've seen a number of different ways to stack drivers on
top of one another. Most of these techniques depend on the I/O Manager's
standard calling mechanism to send IRPs from one driver to another. If this proves not
to be fast enough, you can also define private interfaces between a pair of drivers.
In general, these privately defined interfaces are a bad idea because they make
the design more fragile and harder to maintain.
Regardless of how your drivers communicate with one another, you still
have to guarantee that they load in the proper order. Getting that to happen is one
of the topics discussed in the next chapter.

CHAPTER 16

Building and Installing Drivers

There's always a certain amount of grunt work
associated with any interesting activity. This chapter is about the mundane details
of building drivers and installing them on a system. Some of this information is
pretty straightforward stuff. Other bits of it have been teased painfully from
various header files, online sources, and tedious experimentation. So, even if you're
familiar with the DDK documentation, you may find something of value here.

16.1 BUILDING DRIVERS
One difficult aspect of writing drivers for Windows NT is that you need to
maintain separate versions of the driver for each hardware platform that you support.
Generating and keeping track of multiple binaries is especially troublesome
because you may need different sets of compiler and linker options for each
platform. The BUILD utility supplied with the NT DDK insulates you from most of
these platform dependencies.

What BUILD Does
The BUILD utility is just an elaborate wrapper around NMAKE. Using a set
of keywords, you describe the operation you want to perform. BUILD then scans
your source files for dependencies and constructs an appropriate set of NMAKE
commands. Next, it runs NMAKE to execute these commands, and the result is
one or more binary output files (referred to as BUILD products). Figure 16.1 shows
how this process works.

Figure 16.1  How the BUILD utility works

BUILD itself is actually a rather simple-minded piece of software. Most of
the build process is controlled by a set of standard command files that BUILD
passes to NMAKE. These files contain all the platform-specific rules and option
settings needed to create a BUILD product. Keeping these rules in a separate file
allows Microsoft to modify the build process without having to rewrite the
whole BUILD utility. Currently, BUILD uses these command files (located in
...\DDK\INC):
• MAKEFILE.DEF is the master control file. It uses several other files to do
  some of its work.

• MAKEFILE.PLT selects the target platform for a build operation.

• I386MK.INC, ALPHAMK.INC, MIPSMK.INC, and PPCMK.INC contain
  platform-specific compiler and linker switches for Intel, Alpha, MIPS,
  and PowerPC systems.

BUILD helps you manage multiplatform projects by separating binary files
according to their platform type. To do this, it uses different directories for Intel,
MIPS, Alpha, and PowerPC binaries. If you have cross-hosted compilers and
linkers, you can produce the binaries for all the supported platforms on one system
using a single BUILD command. Figure 16.2 shows the directory structure that
BUILD uses.

    ALPHA
    I386
        CHECKED
            XXDRIVER.SYS
        FREE
            XXDRIVER.SYS
    MIPS
    PPC

Figure 16.2  Directory structure for BUILD products

Notice that BUILD also uses separate directories for the checked and free
versions of your binaries. In the checked version, compiler optimization is disabled,
extra debugging information is added to the file, and the DBG symbol is defined
as 1 (allowing you to include conditional debugging code in your driver). By
contrast, free BUILD products are compiled with optimization turned on and the
DBG symbol is defined as 0. Checked builds are useful when you're debugging;
free builds are generally smaller and faster and should be used for the commercial
release of a driver.
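
Conditional debugging code keyed to DBG usually takes the form of a trace macro that disappears in the free build. The following is a small sketch of the idea, written so that it also compiles as an ordinary user-mode program: XX_TRACE, xx_trace, and XxStartDevice are hypothetical names, and fprintf stands in for the DbgPrint call a kernel-mode driver would make.

```c
#include <stdio.h>

/* BUILD normally defines DBG for you; this default is only for the sketch. */
#ifndef DBG
#define DBG 1
#endif

static int xx_trace_count;          /* lets a caller see whether tracing ran */

static void xx_trace(const char *message)
{
    xx_trace_count++;
    fprintf(stderr, "XXDRIVER: %s\n", message);
}

#if DBG
#define XX_TRACE(message) xx_trace(message)     /* checked build */
#else
#define XX_TRACE(message) ((void)0)             /* free build: no code */
#endif

static int XxStartDevice(void)
{
    XX_TRACE("starting device");
    /* ...real work would go here... */
    return 0;
}
```

Because the macro expands to nothing when DBG is 0, the trace calls can stay in the source permanently without adding size or overhead to the release version of the driver.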
One of BUILD's odd little quirks is that, while it creates the platform-specific
directories automatically, for some reason it doesn't create the CHECKED and FREE
subdirectories. This results in an error message from the linker when it tries to create
your driver. The easiest solution is to set up the directory structure by hand.

How to Build a Driver
Once you have some source code ready, follow these steps to generate your
driver. You only need to perform steps 1-3 the first time you build the driver.

1. In the directory where you keep your driver source code, create a file called
   SOURCES that identifies the components of the final driver. A discussion of
   what to put in this file appears later in this section.

2. In the same directory, create a file called MAKEFILE that contains only the
   following line:

       !INCLUDE $(NTMAKEENV)\MAKEFILE.DEF

   This stub invokes the standard makefile needed by any driver created with
   BUILD. Don't edit this stub makefile. If you want to add more source files to
   this driver, add them to the SOURCES file.

3. Use the File Manager or the MKDIR command to set up the directory tree for
   your BUILD products. Refer back to Figure 16.2.

4. In the Program Manager group for the Windows NT DDK, double-click on
   the icon for either the Checked Build or the Free Build environment. A
   command window will appear with the appropriate BUILD environment
   variables set for a debug or release version of your driver. It's important that you
   run the BUILD utility only from one of these windows.

5. When the Checked or Free command window opens, its default directory is
   the same as the installation directory for the NT DDK itself. Use the CD
   command to move to the directory where your driver's SOURCES file is located.

6. Run the BUILD utility to create the driver executable.

If all goes well, your driver will be in the CHECKED or FREE subdirectory
of the appropriate platform directory. If something goes awry, look at the various
BUILD log files to determine the problem.
You might be wondering whether you can build NT drivers on a Windows
95 system. The VC++ tools all run under Windows 95, so in theory it should work.
Unfortunately, when BUILD spawns NMAKE, it uses a command line that's too
long for Windows 95 to handle and the operation fails. Consequently, you have to
do your BUILDing on a Windows NT system.

Writing a SOURCES File
You describe your BUILD operation using a series of keywords. These
keywords specify things like the type of driver you want to generate, the source files
making up the BUILD product, and the directories for various files. Although you
can pass these keywords to BUILD as command-line options or environment
variables, the usual procedure is to put them in a SOURCES file. Keep the following
points in mind when you write one of these files:

• The filename must be SOURCES (without any extension).

• The file should contain some number of commands, each having the
  following format:

      keyword=value

• You can break a single BUILD command over multiple lines in the
  SOURCES file by putting a \ character at the end of each line except the last.

• The value of a BUILD keyword must be pure text. BUILD itself does only
  very limited processing of NMAKE macros and doesn't handle
  conditional statements at all.

• Make sure you don't leave any whitespace between a BUILD keyword
  and the = character. Whitespace after the = is acceptable.

• You can put comments in a SOURCES file by starting the line with a #
  character.

Table 16.1 lists the SOURCES keywords that you're most likely to use for
building drivers. If you're the sort of person who enjoys going to the dentist for
root-canal work, you may want to use the BUILD utility for maintaining
user-mode applications as well as drivers. In that case, see the BUILD documentation
for a list of additional keywords.

Table 16.1  BUILD utility keywords for maintaining drivers and libraries

Selected BUILD keywords

Keyword               Meaning

INCLUDES              List of paths containing header files
SOURCES               List of source files making up the BUILD product*
TARGETPATH            Top-level directory for BUILD product tree*
TARGETNAME            Name of the BUILD product, without an extension*
TARGETEXT             File extension for the BUILD product
TARGETTYPE            Case-sensitive keyword describing BUILD product*
                        • DRIVER
                        • GDI_DRIVER
                        • MINIPORT
                        • LIBRARY (for static libraries)
                        • DYNLINK (for DLLs)
TARGETLIBS            List of libraries to be linked with the driver
LINKER_FLAGS          Linker options of the form -flag:value
                        Example: -MAP:XXDRIVER.MAP
PRECOMPILED_INCLUDE   File containing #include directives
NTTARGETFILE0         List of nonstandard components to be built with
                        MAKEFILE.INC after initial dependency scan
NTTARGETFILE1         List of nonstandard components to be built with
                        MAKEFILE.INC before linking
NTTARGETFILES         List of nonstandard components to be built with
                        MAKEFILE.INC both before and after the link

*Required.

The following is an example of a minimal SOURCES file for building a
kernel-mode driver.

    TARGETNAME=XXDRIVER
    TARGETTYPE=DRIVER
    TARGETPATH=
    INCLUDES=$(BASEDIR)\inc;..\inc
    SOURCES=init.c config.c resalloc.c \
            dispatch.c xfer.c unload.c

One item to point out in this file is the INCLUDES= keyword. For some
reason, neither the DDK installation procedure nor the Free/Checked build icons
add the DDK header directory to the INCLUDE-path environment variable. By
naming it explicitly in SOURCES, you can avoid a number of miscellaneous
BUILD error messages.

Log Files Generated by BUILD
In addition to its screen output, the BUILD utility generates several text files
that you can use to determine the status of a BUILD product. These files are:

• BUILD.LOG - Lists the commands invoked by NMAKE.

• BUILD.WRN - Contains any warnings generated during the build.

• BUILD.ERR - Contains a list of errors generated during the build.

BUILD puts these files in the same directory as the SOURCES file. The
warning and error files appear only if something bad happened during the
BUILD operation.
One other point worth mentioning is BUILD's nasty habit of filtering out
some compiler and linker messages. These filtered messages don't appear on the
screen display, but they will show up in the log files. For that reason, it's
important to check the log files after each BUILD.

Recursive BUILD Operations
You can use BUILD to maintain an entire source code tree by creating a file
called DIRS. You put this file in a directory that contains nothing but
subdirectories. Each subdirectory can be a source directory (containing a SOURCES file) or
the root of another source tree (containing another DIRS file). When you run
BUILD from the topmost DIRS directory, it creates all the BUILD products
described in each SOURCES file.
The rules for writing a DIRS file are the same as those for a SOURCES file,
with the restriction that you're only allowed to use the following two keywords:

• DIRS - Lists subdirectories that should always be built. Entries in this
  list are separated by spaces or tabs.

• OPTIONAL_DIRS - Lists subdirectories that should be built only if they
  are named on the original BUILD command line.

This recursive BUILD feature can be useful for maintaining things like video
drivers that have both a user-mode and a kernel-mode component.
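
As an illustration, a DIRS file at the root of such a tree might look something like the fragment below. The subdirectory names (display, miniport, and tests) are hypothetical. The syntax simply follows the SOURCES rules described earlier: no whitespace before the = character, a trailing \ to continue a line, and # to begin a comment.

```
# DIRS file at the root of a hypothetical video driver tree
DIRS=display \
     miniport

# Built only when named on the BUILD command line
OPTIONAL_DIRS=tests
```

Each of the named subdirectories would then hold either its own SOURCES file or another DIRS file.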

16.2 MISCELLANEOUS BUILD-TIME ACTIVITIES
Along with the basic operations of getting your driver to compile and link, there
are several other kinds of activities that you may want to perform at BUILD time.
This section presents the ones that have proven to be the most useful.

Using Precompiled Headers
Much of the time consumed by a BUILD operation is spent compiling
various large header files. During a normal development cycle, your driver's code
will change frequently, but these headers will be relatively static. This leads to a
lot of wasted time as the headers are compiled again and again. By taking
advantage of the C compiler's precompiled header feature, you can significantly reduce
the BUILD time of your driver (at the expense of some disk space).
To use precompiled headers, you'll need to make some changes to your driver
sources and add a new keyword to the BUILD control file. Follow these steps:

1. Create a header file containing nothing but #include directives for any other
   headers used by your driver. For example, if you called this file PRECOMP.H,
   it would contain the following:

       #include <ntddk.h>
       #include "xxdriver.h"
       #include "hardware.h"

2. In all your other driver source files, replace all #include directives with

       #include "precomp.h"

3. Add the following statement to your SOURCES file:

       PRECOMPILED_INCLUDE=PRECOMP.H

When you run BUILD for the first time, the C compiler will save the
precompiled header information in a binary file called PRECOMP.PCH. As long as you
don't change the contents of your headers, the compiler will be able to save itself
some work by reusing the precompiled binary version.

Including Version Information in a Driver
How much time have you spent tracking down weird bugs, only to find
that the real problem was a software version mismatch? This can be a real time
waster, especially if you're trying to support a commercial product used by
hundreds of customers. You can avoid this situation altogether by putting explicit
version information in your drivers and checking it before you start looking for more
complex explanations.
You add version information to a driver using a resource script that defines
a version structure. An example later in this section shows how to do this, but the
basic steps you need to follow are:

1. Separate your version data into two categories: things that relate to your
   company as a whole (like the company name), and things that are
   product-specific.

2. Use the generic company information to write a header that can be included
   in the version resource scripts of all your products.

3. Write a resource script for your driver that contains product-specific version
   information. This file should be updated each time you release a version of
   your driver for testing.

4. Add the name of the resource script to the list of driver components identified
   by the SOURCES keyword in your SOURCES file.

When you want to examine the driver's version data, you can use the File
Manager's File Properties... menu item. To display this information in a more
complete form, you could also write a little Win32 program to read the version
data. The following Win32 API calls are relevant.

• GetFileVersionInfoSize - This tells you the number of bytes of version
  data that are associated with the driver.

• GetFileVersionInfo - This returns a buffer of version data.

• VerQueryValue - This extracts a specific piece of version information
  from the buffer returned by GetFileVersionInfo.

To make all this more concrete, here are examples of a vendor header file
and the corresponding product resource script.

Vendor information file. This header file contains version information
common to all the products from one vendor. Although you could include this
stuff in the RC file itself, if you're maintaining several products, it's less work to
keep it in one place for all of them. Below is a copy of CYDNXVER.H, the vendor
information file for Cydonix Corporation.

    #define VER_COMPANYNAME_STR "Cydonix Corporation"

    #define VER_LEGALTRADEMARKS_STR \
        "Cydonix\256 is a trademark of Cydonix Corporation."
    #define VER_LEGALCOPYRIGHT_YEARS "1994-1995"
    #define VER_LEGALCOPYRIGHT_STR \
        "Copyright \251 Cydonix Corp. " \
        VER_LEGALCOPYRIGHT_YEARS

    /* default is nodebug */
    #if DBG
    #define VER_DEBUG VS_FF_DEBUG
    #else
    #define VER_DEBUG 0
    #endif

    /* default is release */
    #if BETA
    #define VER_PRERELEASE VS_FF_PRERELEASE
    #else
    #define VER_PRERELEASE 0
    #endif

    #define VER_FILEFLAGSMASK VS_FFI_FILEFLAGSMASK
    #define VER_FILEOS VOS_NT_WINDOWS32
    #define VER_FILEFLAGS (VER_PRERELEASE|VER_DEBUG)

Product information file. This is the actual resource control script that sets
product-specific fields in the version resource. Notice that it includes the vendor
default values defined above. The actual version resource is built by including the
system-supplied COMMON.VER file. Any version information not defined by
the time you include COMMON.VER will be filled in with Microsoft-specific
information. The following is a copy of XXDRIVER.RC, the version resource
script for XXDRIVER.

    #include
    /*---------------------------------------------------*/
    /* Include default values for generic vendor info    */
    /*                                                   */
    /*---------------------------------------------------*/
    #include "cydnxver.h"
    /*---------------------------------------------------*/
    /* The following values should be modified only by   */
    /* the official builder, and they should be updated  */
    /* for each release                                  */
    /*---------------------------------------------------*/
    #define VER_PRODUCTBUILD 42
    #define VER_PRODUCTVERSION_STR "1.01"
    #define VER_PRODUCTVERSION 1,01,VER_PRODUCTBUILD,1
    #define VER_PRODUCTBETA_STR ""

    /*---------------------------------------------------*/
    /* Include product-specific default values           */
    /*                                                   */
    /*---------------------------------------------------*/
    #define VER_PRODUCTNAME_STR "XXDRIVER"
    #define VER_FILETYPE VFT_DRV
    #define VER_FILESUBTYPE VFT2_UNKNOWN
    #define VER_FILEDESCRIPTION_STR "Driver for XX"
    #define VER_INTERNALNAME_STR "xxdriver.sys"
    #define VER_ORIGINALFILENAME_STR "xxdriver.sys"

    /*---------------------------------------------------*/
    /* Define the version resource itself                */
    /*                                                   */
    /*---------------------------------------------------*/
    #include <common.ver>

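To see what becomes of the four-part VER_PRODUCTVERSION value, it helps to
know that the fixed portion of a Win32 version resource stores each version
number as two 32-bit values, with one 16-bit field in each half. The following C
sketch packs and unpacks the numbers from XXDRIVER.RC in that way. The type
and function names are hypothetical helpers invented for this illustration; they
are not part of the DDK or the Win32 API.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper type: the four 16-bit fields of a version number,
   in the order they appear in VER_PRODUCTVERSION. */
typedef struct {
    uint16_t major;
    uint16_t minor;
    uint16_t build;
    uint16_t revision;
} DriverVersion;

/* Most-significant 32 bits: major in the high word, minor in the low word. */
uint32_t version_pack_ms(DriverVersion v)
{
    return ((uint32_t)v.major << 16) | v.minor;
}

/* Least-significant 32 bits: build in the high word, revision in the low. */
uint32_t version_pack_ls(DriverVersion v)
{
    return ((uint32_t)v.build << 16) | v.revision;
}

/* Recover the four fields from the two packed 32-bit values. */
DriverVersion version_unpack(uint32_t ms, uint32_t ls)
{
    DriverVersion v;
    v.major    = (uint16_t)(ms >> 16);
    v.minor    = (uint16_t)(ms & 0xFFFFu);
    v.build    = (uint16_t)(ls >> 16);
    v.revision = (uint16_t)(ls & 0xFFFFu);
    return v;
}

/* Format as "major.minor.build.revision"; returns the snprintf result. */
int version_format(DriverVersion v, char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "%u.%u.%u.%u",
                    (unsigned)v.major, (unsigned)v.minor,
                    (unsigned)v.build, (unsigned)v.revision);
}
```

With the values from XXDRIVER.RC (1, 01, 42, 1), the two packed halves are
0x00010001 and 0x002A0001, and a program that unpacks them again would
display the version as 1.1.42.1.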
Including Nonstandard Components in a BUILD
Even though BUILD is the epitome of software maintenance technology,
there are still some things it doesn't do very well. For example, if you have a
nonstandard driver component (like a custom message file), BUILD won't know
what to do. It's your job to help BUILD out of these sticky situations by writing an
auxiliary makefile that tells it how to process the nonstandard components. These
are the steps you need to follow:

1. Decide what nonstandard target files need to be part of the driver.

2. In the same directory as the SOURCES file for your driver, create a makefile
   called MAKEFILE.INC. This makefile describes the dependencies among
   your driver's nonstandard components and gives instructions for building
   these components.

3. For each nonstandard component, decide when during the BUILD operation
   the component should be created.

4. Add the component to the list of files in the NTTARGETFILE0,
   NTTARGETFILE1, or NTTARGETFILES keyword of your BUILD control file.
   See Table 16.1 for a description of these keywords.

5. Run the BUILD utility.

Back in Chapter 13, you saw an example of a driver that defined some private
messages for logging events. Here are the auxiliary NMAKE and BUILD control
files that generate this driver's executable. You can find the complete example in
the CH13\DRIVER directory on the floppy that accompanies this book.

MAKEFILE.INC. Recall from Chapter 13 that the message compiler generates
a tiny resource script along with a binary message file and a header. You
include this stub resource script in the driver's main resource file, which leads to
the following dependencies in the auxiliary makefile:

    xxmsg.rc xxmsg.h msg00001.bin: xxmsg.mc
        mc -v -c xxmsg.mc

SOURCES. Since the dependent files must be generated before BUILD
runs the resource compiler or the C compiler, you use the NTTARGETFILE0
keyword. Identifying any one of the dependent files is enough to get BUILD to
invoke MAKEFILE.INC.

    TARGETTYPE= DRIVER
    TARGETNAME= xxdriver
    TARGETPATH=
    INCLUDES= $(BASEDIR)\inc;..\inc;.
    SOURCES= init.c unload.c \
             dispatch.c      \
             eventlog.c      \
             xxmsg.rc

    NTTARGETFILE0= xxmsg.h

Moving Driver Symbol Data into .DBG Files
Contrary to what the DDK documentation claims, both checked and free
versions of your driver contain symbol data, which greatly increases the size of
your driver executable. This section explains how to strip symbols from your
driver and put them into a separate file. Follow this procedure:

1. Use the following command to examine the header information in your
   driver's executable:

       DUMPBIN /HEADERS XXDRIVER.SYS | MORE

2. In the OPTIONAL HEADER VALUES section, look for the image base
   address. Usually this will be 0x10000 for kernel-mode drivers.

3. Strip symbol information from your driver and put it in a separate file using
   this command:

       REBASE -B 0x10000 -X .\SYMBOLS XXDRIVER.SYS

   The B option specifies the new base address for the driver (in this case, the
   same as the original value). The X option identifies the directory where the
   symbol file should go. The symbol file will have the same name as the driver
   executable, with the extension .DBG.

4. To use the symbol file for debugging, move it to the directory where you keep
   other .DBG files on the host machine.

If you look at Table 16.2, you'll see the impact symbol data can have on the
size of a driver. This table compares the sizes of checked and free builds of the
standard NT serial port driver with and without symbols. Stripping the symbols
shrinks the checked build by about 74 percent and the free build by about
40 percent.

Table 16.2 Effect of removing symbols on driver file sizes

Driver sizes with and without symbols
Version          Before REBASE    After REBASE
Checked build    376,476 bytes    96,544 bytes
Free build        77,600 bytes    46,368 bytes

16.3 INSTALLING DRIVERS
This section explains how to install a driver by hand, which is something you'll
need to do while you're developing your driver. It also presents some guidelines
for automating the driver installation process once the retail version is ready for
the world.

How to Install a Driver by Hand
Installing an NT driver is just a matter of copying some files to the right
directory and making a few entries in the system Registry. These are the basic
steps you need to follow:
1. Copy the driver to the %SystemRoot%\SYSTEM32\DRIVERS directory on
   the target system.

2. Add appropriate entries to the Registry of the target system using the
   REGEDT32 utility. These entries are described below.

3. Reboot the target system to make the Service Control Manager aware of the
   new driver. If the driver's Registry entries specify automatic startup, the
   driver will load during system boot.

4. If the driver's Registry entries specify manual startup, use the Control Panel
   Devices applet to start the driver.

If you find a nonfatal bug in your driver, you can load a corrected copy
without rebooting the system. Just use the Control Panel Devices applet to stop
the driver. Then, overwrite the driver executable in the \DRIVERS directory and
restart it using the Devices applet. Of course, this only works if the driver has an
XxUnload routine and if it isn't crucial to the operation of the system.

Driver Registry Entries
During system bootstrap, NT builds a list of available drivers by scanning
the Registry. This list identifies both the drivers that start automatically and
those that need to be started manually. To add your driver to this list, you need to
build the Registry entries that appear in Figure 16.3.
Table 16.3 describes these Registry keys and values. To bring a driver online,
you only need the driver's service key plus the Start, Type, and ErrorControl
values. The service key should have the same name as the driver executable,
without the file extension. As you saw in Chapter 7, the Parameters subkey is
normally used for device information that doesn't auto-detect, although you can
really put anything in it.
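The naming rule for the service key is easy to get wrong when an installation
program builds Registry paths by hand. As a hedged illustration, the following C
sketch derives the key path (relative to HKEY_LOCAL_MACHINE) from a driver
file name by dropping the directory and the extension. The function name and the
macro are assumptions made for this example only.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Registry path, relative to HKEY_LOCAL_MACHINE, that holds service keys. */
#define SERVICES_PREFIX "SYSTEM\\CurrentControlSet\\Services\\"

/* Hypothetical helper: build the service key path for a driver file.
   "xxdriver.sys" -> "SYSTEM\CurrentControlSet\Services\xxdriver".
   Returns 0 on success, -1 if the name is empty or the buffer is too small. */
int service_key_path(const char *driver_file, char *out, size_t out_len)
{
    const char *base;
    const char *dot;
    size_t name_len;
    int written;

    assert(driver_file != NULL && out != NULL);

    /* Skip any directory part so only the file name is considered. */
    base = strrchr(driver_file, '\\');
    base = (base != NULL) ? base + 1 : driver_file;

    /* Drop the extension, if there is one. */
    dot = strrchr(base, '.');
    name_len = (dot != NULL) ? (size_t)(dot - base) : strlen(base);
    if (name_len == 0)
        return -1;

    written = snprintf(out, out_len, "%s%.*s",
                       SERVICES_PREFIX, (int)name_len, base);
    if (written < 0 || (size_t)written >= out_len)
        return -1;
    return 0;
}
```

An installer would pass the resulting string to whatever Registry routine it uses
to create the key; the sketch deliberately stops short of touching the Registry.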

    HKEY_LOCAL_MACHINE
      System
        CurrentControlSet
          Services
            XXDRIVER
              ErrorControl: REG_DWORD: 0x1
              Start:        REG_DWORD: 0x3
              Type:         REG_DWORD: 0x1
              Parameters

    Copyright © 1994 by Cydonix Corporation.

Figure 16.3 Structure of a driver's Registry service key

Table 16.3 Kernel-driver Registry entries

Driver service key Registry entries
Name            Data type       Description
XXDRIVER        (Key)           Driver service key*
Type            REG_DWORD       What kind of driver this is*
                                  1 - kernel-mode driver
                                  2 - file-system driver
Start           REG_DWORD       When to start the driver (see below)*
ErrorControl    REG_DWORD       System response if driver fails to load*
                                  0 - log error and ignore
                                  1 - log error and put up a message box
                                  2 - log error and reboot with last-known
                                      good configuration
                                  3 - log error and fail if already using
                                      last-known good configuration
Group           REG_SZ          Driver's group name (see below)
DependOnGroup   REG_MULTI_SZ    Drivers needed by this one (see below)
Tag             REG_DWORD       Driver load order within a group (see below)
Parameters      (Key)           Key to hold driver-specific parameters
*These entries are required.

End-User Installation of Standard Drivers
Manual installation is fine while you're still developing a driver, but once
your code is ready for commercial release, it's a good idea to automate the whole
procedure. If your driver manages a standard piece of hardware (like a video or
network card), you can take advantage of NT's built-in driver installation
mechanisms. These built-in mechanisms run in three different situations.

During text setup. When end users perform a full installation of Windows
NT, the first piece of setup software runs in text mode. During this text phase, the
setup program installs drivers for the keyboard, the mouse, SCSI HBAs, and
video devices. If it can't find a driver for one of these devices (or if the user
chooses to replace the standard driver), the setup program will prompt the user
for an installation diskette.
The diskette contains a copy of the driver itself and a control script called
TXTSETUP.OEM. This script is just a text file that identifies the type of hardware
supported by the driver, lists the files that need to be copied from the floppy, and
names the keys and values that should be added to the Registry. The Windows
NT DDK Programmer's Guide describes the exact contents and format of a
TXTSETUP.OEM file.

During GUI setup. Once the text phase of Windows NT installation
finishes, a GUI-based setup program takes over. This GUI setup program can
install drivers for the keyboard and mouse, video and network cards, tape drives,
and SCSI HBAs. Just like its text-based counterpart, the GUI setup program
prompts the user for the location of any drivers it can't find; it also allows the user
to supply replacements for the standard drivers.
To install a driver during GUI setup, once again you'll need to write a
control file. This one is called OEMSETUP.INF, and it uses a much more
full-featured scripting language than TXTSETUP.OEM. The GUI scripting
language supports dialog boxes, message text in multiple national languages,
elaborate flow control, and commands for a variety of common installation tasks.
If the built-in commands aren't enough, you can call functions in DLLs or run
external programs from within the script. See the Windows NT DDK
Programmer's Guide for a description of the GUI scripting language.

After NT installation. Users can also install drivers for standard devices
after NT itself has been set up. This is referred to as maintenance mode installation,
and it uses the same OEMSETUP.INF script as the GUI setup phase of NT.
Depending on the type of hardware, the end user will have to run either the
Windows NT Setup program or a Control Panel applet to execute the script.
Table 16.4 shows the various options.

Table 16.4 How to install standard drivers in maintenance mode

Installation tools for standard drivers
Type of driver                   Installation tool
Keyboard                         Windows NT Setup
Mouse                            Windows NT Setup
Multimedia device                Control Panel Drivers applet
Net-card and network protocol    Control Panel Network applet
SCSI HBA                         Windows NT Setup
Tape drive                       Windows NT Setup
Video                            Control Panel Display applet

End-User Installation of Nonstandard Drivers
If your device isn't one of the types supported by TXTSETUP.OEM or
OEMSETUP.INF, you'll have to provide your own installation program. You can
either use commercial installation software, or you can roll your own using some
of the following Win32 API calls:

• CopyFile to move the driver file to the appropriate directory.

• RegCreateKeyEx and RegSetValueEx to set up the proper keys and
  values in the Registry.

• CreateProcess to run any external programs needed during installation.

• CreateService and StartService if you want to bring the driver online
  without rebooting the system. (See the INSTDRV sample that comes with the
  NT DDK for an example of using the Service Control Manager API to install
  a driver without forcing the user to reboot.)

As you've seen elsewhere in this book, you can customize the behavior of
your driver using values stored in the Parameters subkey of the driver's Registry
service key. If you have many of these parameters and you expect end users to
change them, you should consider writing either a Control Panel applet or a
standalone program to modify the Registry. This is much safer than asking an end
user to work with REGEDT32.
Finally, you'll make everyone's life easier if you supply software that allows
users to remove your driver from the system. This means cleaning up the Registry
as well as deleting any relevant files.

16.4 CONTROLLING DRIVER LOAD SEQUENCE
There are times when you may need to control the sequence in which NT loads
multiple drivers. For example, class drivers usually have to be loaded after the
port drivers that manage their underlying hardware. If your drivers load
automatically when the system boots, you can use various Registry entries to
control their load sequence. This section explains how.

Changing the Driver's Start Value
You can control when a driver loads by setting the Start value in the driver's
Registry service key. The number you assign to Start corresponds to one of the
Service startup types recognized by the NT Service Control Manager. Currently,
Start can take one of the following values.

0x0 (SERVICE_BOOT_START). This value specifies that a driver should
be started by the operating system loader. Since much of the system isn't
available, this value should be used only for drivers that are necessary to the
bootstrap operation itself (for example, the driver for the boot device).

0x1 (SERVICE_SYSTEM_START). This value identifies drivers that should
be started after the operating system has been loaded, but while it is still
initializing itself.

0x2 (SERVICE_AUTO_START). Drivers with this Start value are loaded
by the Service Control Manager after the entire system is up and running. Unless
your driver is crucial to the system bootstrap or initialization, this is probably the
most appropriate value to choose.

0x3 (SERVICE_DEMAND_START). These drivers have to be started
manually, either by using the Control Panel Devices applet or by making direct
calls to the Win32 Service Control Manager API.

0x4 (SERVICE_DISABLED). Disabled drivers cannot be started until their
Start value is changed to something else. Again, you change this value using the
Control Panel Devices applet or the Service Control Manager API, or by
modifying the Registry directly.

NT guarantees that drivers with lower Start values will be loaded ahead of
drivers with higher values. So all drivers with a value of 0 will load ahead of any
drivers with values of 1 or 2. Keep in mind that this only works for Start values of
0, 1, or 2, because drivers with other Start values require some kind of manual
intervention to get them going.
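
The ordering guarantee just described can be written down as a small
predicate. The following C sketch uses a hypothetical enumeration whose numeric
values mirror the Start values listed above; the identifier names are invented for
this example so that they don't collide with the SERVICE_xxx constants in the
Win32 headers.

```c
#include <assert.h>

/* Hypothetical names; the numeric values are the Start values listed above. */
typedef enum {
    START_BOOT     = 0,  /* started by the operating system loader      */
    START_SYSTEM   = 1,  /* started while the system initializes itself */
    START_AUTO     = 2,  /* started by the Service Control Manager      */
    START_DEMAND   = 3,  /* started manually                            */
    START_DISABLED = 4   /* cannot be started until Start is changed    */
} StartType;

/* Nonzero if a driver with this Start value loads without manual help. */
int loads_automatically(StartType s)
{
    return s == START_BOOT || s == START_SYSTEM || s == START_AUTO;
}

/* Nonzero only when the Start values alone guarantee that a driver with
   Start value a loads ahead of a driver with Start value b. */
int start_guarantees_order(StartType a, StartType b)
{
    assert((int)a >= 0 && (int)a <= 4);
    assert((int)b >= 0 && (int)b <= 4);
    return loads_automatically(a) && loads_automatically(b) && a < b;
}
```

Note that the predicate says nothing about two drivers with the same Start value,
and nothing at all once either driver needs a manual start.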

Creating Explicit Dependencies between Drivers
Setting Start values is fine if your drivers need to be loaded during different
phases of system startup, but what if you need to control the load order of
multiple drivers with the same Start value? For example, a SCSI class driver won't
be able to load successfully until all the SCSI miniport HBA drivers are available.
One solution to this problem is to use the Group and DependOnGroup values in
the driver service keys.
These are the steps you should follow if you want to establish an explicit
load-order dependency between two drivers:

1. Decide which driver needs to load first and choose a group name for this
   driver. In some cases (like the SCSI miniport), you may need to use a standard,
   system-defined group name. Otherwise, use a name of your own choosing.

2. Add a value called Group to the service key of the driver that loads first. The
   Group value is a REG_SZ containing the group name you've assigned to this
   driver.

3. Add a value called DependOnGroup to the service key of the driver that
   should load second. The DependOnGroup value is a REG_MULTI_SZ
   containing the names of any groups on which this driver depends. At least one
   driver in each named group must be started before the system will start any
   dependent driver.

Keep in mind that you can have as many drivers as you like with the same
Group value. This guarantees that all the members of the group will get a chance
to load ahead of any drivers depending on that group name. Again, SCSI
miniports are a good example.
To see how all this works, imagine that you have two drivers, XXDRIVER
and YYDRIVER, and that XXDRIVER is a member of the group called "Group W."
If you wanted XXDRIVER to load ahead of YYDRIVER, you'd need to set up the
following Registry entries:

    HKEY_LOCAL_MACHINE\...\Services\XXDRIVER
        Start: REG_DWORD: 2
        Group: REG_SZ: Group W
    HKEY_LOCAL_MACHINE\...\Services\YYDRIVER
        Start: REG_DWORD: 2
        DependOnGroup: REG_MULTI_SZ: Group W

With these values, both drivers will load during final stages of system
startup, after everything is running. Further, all the drivers in "Group W" will be
given a chance to load before YYDRIVER.
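
The dependency rule from step 3 (at least one driver in each named group must
be started) can be modeled in a few lines of C. This is only a sketch of the rule as
stated in the text, not of how NT implements it; the structure and function names
are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical model of the parts of a service key that matter here. */
typedef struct {
    const char *name;            /* service key name, e.g. "XXDRIVER"   */
    const char *group;           /* Group value, or NULL if none         */
    const char *const *depends;  /* DependOnGroup names, NULL-terminated,
                                    or NULL if the value is absent       */
    int started;                 /* nonzero once the driver has started  */
} DriverEntry;

/* Nonzero if at least one started driver belongs to the named group. */
int group_has_started_driver(const DriverEntry *all, size_t count,
                             const char *group)
{
    size_t i;
    for (i = 0; i < count; i++) {
        if (all[i].started && all[i].group != NULL &&
            strcmp(all[i].group, group) == 0)
            return 1;
    }
    return 0;
}

/* Nonzero if every group named in the driver's DependOnGroup list has at
   least one started member, so the driver itself may be started. */
int dependencies_satisfied(const DriverEntry *all, size_t count,
                           const DriverEntry *driver)
{
    size_t i;
    assert(all != NULL && driver != NULL);
    if (driver->depends == NULL)
        return 1;
    for (i = 0; driver->depends[i] != NULL; i++) {
        if (!group_has_started_driver(all, count, driver->depends[i]))
            return 0;
    }
    return 1;
}
```

Applied to the Registry entries above, YYDRIVER fails the check until
XXDRIVER (the only member of "Group W") has started, while XXDRIVER
itself has no dependencies and passes immediately.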

Establishing Global Group Dependencies
Another way to control the load order of your drivers is to modify the
ServiceGroupOrder key in the Registry. This key contains a single REG_MULTI_SZ
value called List that identifies group names in the order that they will be loaded.
The earlier a driver's group name appears in this list, the sooner it loads. NT will
try to load all the drivers in an earlier group ahead of any driver in a later group.
Figure 16.4 shows an excerpt of this part of the Registry. In this example,
drivers in the group "SCSI class" load after all drivers in the group "Primary
disk" and before any drivers in the group "SCSI CDROM class."

    HKEY_LOCAL_MACHINE
      System
        CurrentControlSet
          Control
            ServiceGroupOrder
              List: REG_MULTI_SZ:
                System Bus Extender
                SCSI miniport
                port
                Primary disk
                SCSI class
                SCSI CDROM class
                filter

    Copyright © 1995 by Cydonix Corporation.

Figure 16.4 The layout of the ServiceGroupOrder Registry key

Although you could achieve the same results using DependOnGroup, this
technique is useful for situations where you don't want to modify the Registry
values of some of the drivers. For example, if you wanted one of your drivers to
load earlier than a particular system-supplied driver group, you could simply
modify the ServiceGroupOrder key. There would be no need to change the
DependOnGroup value of each system-supplied driver.
The ServiceGroupOrder list is actually scanned several times during system
startup. First, at bootstrap time, all drivers with a Start value of 0 load according
to their ServiceGroupOrder sequence. Next, during system initialization, drivers
with a Start value of 1 load. Finally, when the system is up and running, any
drivers with a Start value of 2 are loaded. So, drivers with lower Start values load
before any drivers with higher Start values, no matter what their positions in the
ServiceGroupOrder list.
As an example, suppose you had a SCSI disk that needed a special driver.
Unfortunately, the standard SCSI disk class driver is going to allocate anything
that looks like a SCSI disk, including yours. The only way to prevent this is to
make sure that your driver loads ahead of the standard driver. You can do this by
modifying the ServiceGroupOrder list.
First, add a Group value to the Registry key for the driver that manages the
special disk. If this driver were XXDRIVER, and you wanted to add it to "Group
W," the Registry key would be

    HKEY_LOCAL_MACHINE\...\Services\XXDRIVER
        Group: REG_SZ: Group W
        Start: REG_DWORD: 0

Examining the Registry service key for the standard SCSI disk driver
(SCSIDISK), you find that it belongs to the group "SCSI class." So, you need to
edit the ServiceGroupOrder list and add "Group W" ahead of "SCSI class." The
Registry would then look like this:

    HKEY_LOCAL_MACHINE\...\Control\ServiceGroupOrder
        List: REG_MULTI_SZ:
            System Bus Extender
            SCSI miniport
            Group W
            SCSI class
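
To make the interaction between Start values and the ServiceGroupOrder list
concrete, here is a hedged C model of the ordering rules described in this section:
lower Start values first, then earlier groups among drivers with the same Start
value. The structure and function names are hypothetical, and sorting drivers
whose group is missing from the list after all listed groups is an assumption of
this sketch rather than something the text specifies.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a driver's load-order attributes. */
typedef struct {
    const char *name;   /* service key name                        */
    const char *group;  /* Group value, or NULL if none            */
    int start;          /* Start value: 0, 1, or 2                 */
    int group_rank;     /* filled in by sort_by_load_order()       */
} LoadEntry;

/* Position of a group in the ServiceGroupOrder list, or list_len if the
   group is NULL or not listed (assumption: unlisted groups sort last). */
static int group_position(const char *const *list, int list_len,
                          const char *group)
{
    int i;
    if (group != NULL) {
        for (i = 0; i < list_len; i++) {
            if (strcmp(list[i], group) == 0)
                return i;
        }
    }
    return list_len;
}

/* qsort comparator: Start value first, then group position. */
static int compare_entries(const void *pa, const void *pb)
{
    const LoadEntry *a = (const LoadEntry *)pa;
    const LoadEntry *b = (const LoadEntry *)pb;
    if (a->start != b->start)
        return (a->start < b->start) ? -1 : 1;
    if (a->group_rank != b->group_rank)
        return (a->group_rank < b->group_rank) ? -1 : 1;
    return 0;  /* same Start and group: relative order is not defined */
}

/* Reorder the array into the modeled load sequence. */
void sort_by_load_order(LoadEntry *drivers, size_t count,
                        const char *const *list, int list_len)
{
    size_t i;
    assert(drivers != NULL && list != NULL);
    for (i = 0; i < count; i++)
        drivers[i].group_rank =
            group_position(list, list_len, drivers[i].group);
    qsort(drivers, count, sizeof drivers[0], compare_entries);
}
```

Given the four-entry list above and an XXDRIVER in "Group W" with a Start
value of 0, the model places XXDRIVER ahead of any "SCSI class" driver that
also has a Start value of 0, and ahead of every driver with a higher Start value
regardless of group.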

Controlling Load Sequence within a Group
The techniques presented so far allow you to set up load-order relationships
among groups of drivers, but they make no promises about the load order of
drivers in the same group. By adding Tag values to the Registry keys of drivers
within a group, you can control their loading sequence. Here's what you need to do:

1. Modify the \CurrentControlSet\Control\GroupOrderList key in the
   Registry by adding a value with the same name as your driver group. Give this
   value a data type of REG_BINARY and make sure its contents follow the
   pattern described below. This value defines a series of tag numbers and their
   sequence.

2. Add a REG_DWORD value called Tag to the Registry service key of each
   driver in the group. Set this value to one of the tag numbers you defined for
   your group in GroupOrderList.

Within a single group, NT will load drivers according to the sequence of
their Tag values, as defined in the GroupOrderList. Drivers without a Tag value
(and drivers whose Tag value is not in the GroupOrderList) load after the drivers
with valid Tag values. For these drivers, the order of loading is not guaranteed,
other than that all drivers in a group load before the next group loads.
The tag definitions in the GroupOrderList are REG_BINARY data, and their
format needs a little explanation. As you can see from Figure 16.5, each definition
contains several fields. The first field is a 1-byte count of the number of tag values
to follow. Next come the tag numbers themselves, each one taking up a DWORD.
These are followed by 3 null bytes that round the whole entry up to an integral
number of DWORDs.

    Count     1st Tag    ...    Nth Tag    Filler
    1 byte    4 bytes           4 bytes    3 bytes

    Copyright © 1995 by Cydonix Corporation.

Figure 16.5 Layout of a tag definition in the GroupOrderList key
The following example of one of these values defines two tags: one with a
value of 0x44 and another with a value of 0x28.

    02 00 00 00 44 00 00 00 28 00 00 00

Note that it's the sequence of the tags (and not their actual numerical values)
that determines driver load order. With the example above, drivers in this group
with a Tag of 0x44 would load ahead of those with a Tag value of 0x28.
As an example of using these tags, imagine that you have two drivers,
XXDRIVER and YYDRIVER, both belonging to "Group W," and you want
XXDRIVER to load ahead of YYDRIVER. The first step is to add a value to the
GroupOrderList that defines the tags:

    HKEY_LOCAL_MACHINE\...\Control\GroupOrderList
        Group W: REG_BINARY: 02 00 00 00 44 00 00 00 28 ...

Next, modify the service keys for XXDRIVER and YYDRIVER by adding
Tag values to them. The Registry entries would look like this:

    HKEY_LOCAL_MACHINE\...\Services\XXDRIVER
        Start: REG_DWORD: 2
        Group: REG_SZ: Group W
        Tag: REG_DWORD: 0x44
    HKEY_LOCAL_MACHINE\...\Services\YYDRIVER
        Start: REG_DWORD: 2
        Group: REG_SZ: Group W
        Tag: REG_DWORD: 0x28
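
If you generate GroupOrderList values from an installation program, it helps to
build the bytes in code rather than by hand. Read as little-endian DWORDs, the
12 example bytes shown earlier are a count of 2 followed by the tags 0x44 and
0x28, and the following C sketch assumes that reading. The function names are
hypothetical, and the sketch only builds and searches the byte array; it does not
write anything to the Registry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value in little-endian byte order. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFFu);
    p[1] = (uint8_t)((v >> 8) & 0xFFu);
    p[2] = (uint8_t)((v >> 16) & 0xFFu);
    p[3] = (uint8_t)((v >> 24) & 0xFFu);
}

/* Load a 32-bit little-endian value. */
static uint32_t get_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper: build a tag definition (count, then each tag).
   Returns the number of bytes written, or 0 if the buffer is too small. */
size_t tag_list_encode(const uint32_t *tags, uint32_t count,
                       uint8_t *out, size_t out_len)
{
    size_t need = 4u * ((size_t)count + 1u);
    uint32_t i;
    assert(tags != NULL && out != NULL);
    if (out_len < need)
        return 0;
    put_le32(out, count);
    for (i = 0; i < count; i++)
        put_le32(out + 4u * ((size_t)i + 1u), tags[i]);
    return need;
}

/* Position of a tag within a definition (0 loads first), or -1 if the
   tag is not listed. */
int tag_list_position(const uint8_t *def, size_t def_len, uint32_t tag)
{
    uint32_t count, i;
    assert(def != NULL);
    if (def_len < 4u)
        return -1;
    count = get_le32(def);
    for (i = 0; i < count && 4u * ((size_t)i + 2u) <= def_len; i++) {
        if (get_le32(def + 4u * ((size_t)i + 1u)) == tag)
            return (int)i;
    }
    return -1;
}
```

Encoding the tags 0x44 and 0x28 in that order reproduces the example bytes, and
looking up each tag returns positions 0 and 1, which matches XXDRIVER
loading ahead of YYDRIVER.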


One final point: Not every group shows up in the GroupOrderList key.
When a group is not in the GroupOrderList, the order in which drivers load
within the group is undetermined.

16.5 SUMMARY
This chapter has presented a variety of different topics, all of which had to do
with building a driver and getting it online. But what if the driver has personal
problems? What if, in an occasional psychotic fit, it crashes the system or
mutilates some data? In the next chapter, you'll see some techniques you can use
to track down and eliminate bugs from your driver.

CHAPTER 17

Testing and Debugging Drivers

Where do they come from, these driver bugs? Do they hide beneath the bed
like mutant dust bunnies, scheming and plotting, waiting for nightfall so they can
sneak into our code? No, driver bugs are not random events. Instead, they
represent some coding or logic error, or some lack of understanding about how
the hardware or the system actually works. This chapter presents a number of
testing and debugging techniques you can use to catch both catastrophic and
subtle flaws in your driver.

17.1 SOME GUIDELINES FOR DRIVER TESTING
As in other areas of software development, a great deal of thought has gone
into the practice of software testing over the last three decades. It's a good idea
to take advantage of this thinking when you start to design a testing strategy
for your driver. The following sections present some of the major issues you
should consider. (See the Bibliography for some other references on software
testing.)

The General Approach to Testing Drivers
The first thing to do is to accept the hopelessness of your situation. It's
simply not possible to verify that a driver is free of bugs. To begin with, even
trivial pieces of software can have so many code paths that there's just no way to
exercise every one of them. Add to that all the various hardware and system-load
conditions your driver might encounter in the real world, and your chances of
catching every bug disappear pretty quickly.
As a tester, the best you can do is to show that a driver doesn't exhibit any of
the bugs detectable by your tests. If your tests represent a reasonable model of
conditions in the driver's target environment, then you'll probably be in good
shape. This points to the fact that designing good tests is just as important as
designing a good driver.

When to do the testing. Experience shows that it's more effective to test
individual driver components as they're developed, rather than waiting until the
whole driver is written to perform a single "big bang" test. Although incremental
testing means writing a larger number of small test programs, this strategy makes
it much easier to locate the source of a problem. The tiny test programs are also
helpful when you want to make sure that changes to a driver's code base haven't
introduced any new bugs.
Another advantage of testing during development is that it can point out
basic design flaws in the driver which might otherwise go undetected until the
end of the project. Correcting these kinds of fundamental errors late in the project
cycle is usually much more expensive than catching them early.

What to test. Later in this chapter, you'll see some specific types of driver
failures to watch out for, but you can generally divide driver tests into the
following categories:

• Hardware tests - These verify the operation of the hardware. This is
  especially important if both the device and the driver are being developed
  together. In some cases, this may actually mean using a logic analyzer to see
  what's going on.

• Normal response tests - These confirm that the driver executes the full
  range of commands it will have to perform once it's out in the real world.

• Error response tests - These check the reaction of the driver to bad input
  from a user program, as well as to device errors and timeout conditions.

• Boundary tests - If the device has any limitations on its maximum transfer
  size or speed, these tests make sure that the driver can handle them.

• Stress tests - These subject the driver and its devices to high levels of
  sustained activity. This category also includes tests where the overall system
  experiences high levels of CPU, memory, and I/O activity, or where
  resources like memory are in very short supply.

How to develop the tests. Writing test software is an art. Good tests must
be thorough enough to have a high probability of actually uncovering errors in
the driver. This means you need to analyze the kinds of errors you think the
driver might generate, and then write a test suite that will produce them.
Good test software also gives the tester enough information to pinpoint the
cause of the failure easily. The output generated by a test program should be easy
to read and should be formatted in such a way that important details aren't
hidden somewhere in a pile of extraneous information.
Finally, test software needs to be complex enough to model a real-world
situation, yet simple enough that it's easy to develop. If a test program is too
complex, it may take a long time just to write and debug the test itself.

How to perform the tests
It's important to automate the test procedure itself. This makes it easier to
guarantee that the same sequence of tests is being performed each time.
It's also a good idea to do regression testing. In other words, if you fix
something in the driver, run the tests again to make sure you haven't broken
anything else. This is another good reason to automate the test procedure.
When you run the tests, log the results and keep the output. This will give you
a good idea of whether you're actually getting closer to fixing things.
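
One way to get a repeatable, logged test run is a small table-driven harness.
The sketch below is an assumption about how such a harness might look, not code
from the DDK. TestCase, run_suite, and the two placeholder tests are
hypothetical names. In practice each test function would open the device and
issue real I/O requests.

```c
#include <assert.h>
#include <stdio.h>

/* One automated test case: a name plus a function returning 0 on success. */
typedef struct {
    const char *name;
    int (*run)(void);
} TestCase;

/* Runs every test in table order, writes one PASS/FAIL line per test to
   `log`, and returns the number of failures. Running from a fixed table
   guarantees the same sequence on every regression pass. */
int run_suite(const TestCase *tests, int count, FILE *log)
{
    int failures = 0;
    int i;

    assert(tests != NULL && log != NULL);
    for (i = 0; i < count; i++) {
        int status = tests[i].run();
        fprintf(log, "%-24s %s (status %d)\n",
                tests[i].name, status == 0 ? "PASS" : "FAIL", status);
        if (status != 0)
            failures++;
    }
    fprintf(log, "%d of %d tests failed\n", failures, count);
    return failures;
}

/* Placeholder tests standing in for real device requests. */
int test_always_passes(void) { return 0; }
int test_always_fails(void)  { return 1; }
```

Passing a file opened in append mode as the log keeps the output of earlier
runs, so you can compare results from one driver revision to the next.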
Who should do the testing
Remember that the goal of testing is to tear the driver to shreds. To find bugs
lurking under every line of code. To prove that only angelic intervention keeps
the driver working at all. This is very different from the goal of the driver
writer, who generally assumes that what he or she is producing will work
properly. Because coding and testing have this kind of adversarial
relationship, it's usually best if these jobs are performed by different
people. It's almost always unreasonable to expect a single person to be
objective about their own code.
Using the Microsoft Hardware Compatibility Tests (HCTs)
The hardware compatibility test suite (or simply, the HCTs) is a collection of
programs that allow platform vendors to see whether their systems will run
Windows NT. This suite contains a number of different components, including

•  General system tests that exercise the FPU, the onboard serial and parallel
   ports, the keyboard interface, and the HAL.

•  Tests that exercise drivers for specific kinds of hardware like video
   adapters, multimedia devices, network interface cards, tape drives, SCSI
   devices, etc.

•  General stress tests that put unusually high loads on system resources and
   I/O bandwidth.

•  A GUI-based test manager that automates test execution and data collection.

Even if you're not developing a driver for one of the types of hardware with
its own test, you can use the HCTs as part of your stress-testing strategy.


You can find the HCTs in the \HCT . . . directory tree on the CD containing
the NT DDK. Although they're distributed with the DDK, the HCTs are not
automatically installed. For installation instructions, see the README.TXT file
in the \HCT directory. Remember to put the HCTs on the target machine (where
your driver will be running), not on the host. For more information about using
the HCTs, look in the \HCT\DOC directory on the DDK CD. This directory contains
all the HCT documentation in Word for Windows format.
Finally, if you're writing a driver for a commercial product and you want it
to be logo-branded by Microsoft, you'll need to send your driver (and its
hardware) to the Microsoft Compatibility Labs for testing. Microsoft offers
Windows NT certification programs for several hardware categories including
video cards, network adapters, SCSI adapters, multimedia audio cards, and
printers. Once a driver passes the Microsoft certification tests, it's added to
the driver library that's distributed with Windows NT. At that point, you're
allowed to display a special logo on any product packaging. Contact your
friends at Microsoft for details and pricing.

17.2 SOME THOUGHTS ABOUT DRIVER BUGS
As you saw in the last section, successful testing and debugging depend on
figuring out ahead of time what might go wrong. The goal of this section is to
get you thinking about the specific kinds of problems drivers can have. It also
presents some techniques that can make bugs easier to detect and manage.

Categories of Driver Errors
Drivers can fail in any number of interesting ways. Although it's not possible
to give a complete list, the following subsections describe some of the more
common types of driver pathology.

Hardware problems
There's always a chance that the hardware itself might be causing problems.
This becomes even more likely if both the device and the driver are being
developed at the same time. Symptoms of hardware problems include

•  Errors occurring during data transmission.

•  Device status codes indicating an error.

•  Interrupts not arriving.

•  The device not responding properly to commands.

The cause might be as simple as undocumented behavioral quirks in the
device (for example, some kind of restriction on command timing or sequencing).
If it's a complex device, it might have bugs in its firmware (there simply is
no bug-free SCSI firmware in the world). It could also be the result of some
low-level bus contention or external signal noise. The device might just be
broken.
The best approach to these problems is to make the error reproducible and
then get as much information as you can. See if the manufacturer has any more
information on the behavior of the device, or on known bugs. Use any available
hardware diagnostics to verify that the device itself is working properly.

System crashes
It's easy for failures in kernel-mode code to kill the entire system. Many
kinds of driver logic errors can produce a crash, although the most common
problem seems to be access violations caused by a bogus pointer. It's also
possible for things like bad DMA addresses to corrupt system memory. The next
section of this chapter will have more to say about interpreting system
crashes.
Resource leaks
The system doesn't perform any resource tracking or
automatic cleanup for kernel-mode components. When a driver unloads, it's
responsible for releasing whatever it may have allocated. This includes both
memory from the pool areas plus any hardware the driver manages.
Even while a driver is running, it can leak memory if it regularly grabs pool
space for temporary use and doesn't release it. Higher-level drivers can also be a
source of leaks if they allocate their own IRPs and forget to free them. These kinds
of driver errors can lead to bad system performance, as the pools slowly dry up,
or to a complete system crash.
You can use the pool-tagging mechanism and sanity counters (described
later in this chapter) to catch pool leakage and lost IRPs. By examining the
RESOURCEMAP section of the Registry with REGEDT32, you can check for
hardware allocation problems.
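
The idea behind a sanity counter can be shown in a few lines. This is a
user-mode sketch under stated assumptions, not driver code. The wrapper names
xx_alloc, xx_free, and xx_outstanding are hypothetical, and malloc and free
stand in for the pool routines. In a real driver the wrappers would sit around
the pool allocation and release calls, and the counter would need an
interlocked update because several processors can allocate at once.

```c
#include <assert.h>
#include <stdlib.h>

/* Running count of blocks handed out and not yet returned. */
static long g_outstanding = 0;

/* Allocation wrapper: counts every successful allocation. */
void *xx_alloc(size_t bytes)
{
    void *p = malloc(bytes);
    if (p != NULL)
        g_outstanding++;
    return p;
}

/* Release wrapper: a count that would drop below zero means a block
   was freed twice or never came from xx_alloc. */
void xx_free(void *p)
{
    if (p != NULL) {
        assert(g_outstanding > 0);
        g_outstanding--;
        free(p);
    }
}

/* Checked at unload time: anything other than zero is a leak. */
long xx_outstanding(void)
{
    return g_outstanding;
}
```

A nonzero value from xx_outstanding when the driver unloads tells you that a
leak exists, though not where. Pool tags narrow it down from there.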
Thread hangs
Another kind of failure involves synchronous I/O requests that don't return. In
this case, the user-mode thread issuing the request is blocked forever and
never comes out of its wait state. This type of behavior can result from
several different driver problems.
The most obvious cause is not calling IoCompleteRequest to send the IRP
back to the I/O Manager. Not so obvious is the need to call IoStartNextPacket.
Even if there are no pending requests to be processed, your driver has to call
this function because it marks the Device object as idle. Without this call,
all new IRPs will go into the pending queue, rather than going to the Start I/O
routine.
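
The effect of the missing call is easier to see in a toy model. The sketch
below is not the I/O Manager's implementation. ToyDevice and both functions are
hypothetical stand-ins that mimic only the behavior described above: a busy
flag, a pending queue, and a call that either starts the next request or marks
the device idle.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the busy flag and pending queue behind a Device object. */
typedef struct {
    int busy;      /* nonzero while a request is considered in progress */
    int pending;   /* requests waiting in the queue */
    int started;   /* requests that reached the imaginary Start I/O routine */
} ToyDevice;

/* Models a Dispatch routine handing a new request to the queue. */
void toy_start_packet(ToyDevice *dev)
{
    assert(dev != NULL);
    if (dev->busy) {
        dev->pending++;          /* device looks busy: the request waits */
    } else {
        dev->busy = 1;
        dev->started++;          /* device idle: the request starts at once */
    }
}

/* Models the call a driver makes after completing the current request. */
void toy_start_next_packet(ToyDevice *dev)
{
    assert(dev != NULL);
    if (dev->pending > 0) {
        dev->pending--;
        dev->started++;          /* the next waiting request starts */
    } else {
        dev->busy = 0;           /* nothing waiting: mark the device idle */
    }
}
```

If toy_start_next_packet is never called, the busy flag stays set after the
first request, and every later request lands in the pending count and stays
there. That is the hang described in the text.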
The calling thread can hang in a driver's Dispatch routine if the driver is
trying to recursively acquire a Fast Mutex or an Executive Resource. Similarly,
if a kernel-mode thread acquires a Mutex or Executive Resource without
releasing it, Dispatch routines may hang up if they try to acquire the same
object.
DMA drivers that don't release the Adapter object or its mapping registers
can prevent IRPs from being processed. In the case of slave DMA devices, the
offending driver might even cause other drivers using the same DMA channel to
lock up.


Drivers that manage multiunit controllers can cause similar trouble by not
releasing the Controller object. In this case, new IRPs sent to any Device object
using the Controller object will freeze up.
Unfortunately, there's no convenient way to see who currently owns
Adapter or Controller objects, Mutexes or Executive resources. About the best you
can do is to use a counter to make sure you're releasing these objects as many
times as you're acquiring them. In some cases, the checked build of NT may flag
some of these errors with a crash.

System hangs
Occasionally, a driver error can cause the entire system to lock up. For
example, deadly embraces involving multiple spin locks (or attempts to acquire
the same spin lock multiple times on the same CPU) will bring everything to a
grinding halt. Endless loops in a driver's Interrupt Service routine or a
DPC routine could cause a similar failure.
Once this kind of collapse occurs, it's difficult (if not impossible) to regain
control of the system. The best approach is usually to debug the driver
interactively and see if you can trace the exact sequence of steps that lead to
the hang.
Reproducing Driver Errors
One of the keys to correcting a driver bug is being able to reproduce the
problem. Intermittent errors are the bane of a driver writer's existence. Be as
meticulous as possible in recording the exact circumstances at the time a bug
appears, so that you can track and correct it. Several factors can make bugs
intermittent.

Time dependencies
Some kinds of problems only show themselves when a driver is running at full
speed. This could mean large numbers of I/O requests per second, high data
rates, or both. Stress testing is usually a good way to make these kinds of
bugs appear.

Multiprocessor dependencies
Things don't behave the same way on single- and multiprocessor systems. For
example, ISR, DPC, and I/O Timer routines can all run simultaneously on an SMP
machine. This can lead to various problems that don't show up on a single CPU.
For this reason, it's important to make multiprocessor testing part of your
driver verification strategy. One warning: SMP debugging is very painful, so
it's a good idea to do the initial debugging on a single processor.

Multithreading dependencies
If your driver manages shareable Device objects, it's important to see what
happens when multiple threads are issuing requests at the same time.

Miscellaneous causes
Finally, intermittent errors can depend on a whole universe of other factors.
This includes sensitivity to system load conditions, problems caused by
specific combinations of hardware on the same machine, or specific combinations
of devices on the same bus. Once again, a detailed log is your best hope of
determining the factors that make the bug appear.

Coding Strategies That Reduce Debugging
There are several things you can do during the coding phase of driver
development that will reduce debugging time. Here are some of them:

•  Get someone else to look at your code. It's amazing how quickly an unbiased
   eye can sometimes see the cause of a problem that you haven't been able to
   find.

•  Use assertions (described later in this chapter) to check for various kinds
   of inconsistencies.

•  Leave the debug code in your driver, surrounded with appropriate #if and
   #endif statements.

•  Add a version resource to the driver so that you can determine exactly
   which version of the driver is having problems. Chapter 16 explains how to
   do this.

•  If you're working on a large driver project with other people, using
   version control software will help to maintain everyone's sanity.
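
Two of the items above, assertions and conditionally compiled debug code, can
be combined in a pair of macros. The following is a user-mode sketch only.
XX_DEBUG, XX_TRACE, XX_ASSERT, and xx_clamp_transfer are hypothetical names,
and the standard C assert and fprintf stand in for whatever assertion and
debug-print facilities the driver environment provides.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical build switch: define XX_DEBUG as 0 for a retail build. */
#ifndef XX_DEBUG
#define XX_DEBUG 1
#endif

#if XX_DEBUG
/* Debug build: trace messages carry file and line, assertions are live. */
#define XX_TRACE(msg) \
    fprintf(stderr, "XXDRIVER: %s (%s:%d)\n", (msg), __FILE__, __LINE__)
#define XX_ASSERT(expr) assert(expr)
#else
/* Retail build: both macros compile away to nothing. */
#define XX_TRACE(msg)   ((void)0)
#define XX_ASSERT(expr) ((void)0)
#endif

/* Example routine: limits a requested transfer to the device maximum. */
int xx_clamp_transfer(int requested, int max_transfer)
{
    XX_ASSERT(max_transfer > 0);      /* a nonpositive limit is a logic error */
    if (requested > max_transfer) {
        XX_TRACE("transfer clamped to device maximum");
        return max_transfer;
    }
    return requested;
}
```

Because the debug statements stay in the source, turning them back on when a
bug appears in the field is a matter of rebuilding, not rewriting.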

Keeping Track of Driver Bugs
Research has shown that bugs are not evenly distributed throughout a piece
of code. Rather, they tend to cluster in a few specific routines. Usually, this
will be some very complex piece of code, or code with complex (or questionable)
logic. A bug log can help you track these errors by drawing your attention to
the places where your driver tends to fail.
Such a log can also help you spot patterns of system loading or driver usage
that result in failures. Finally, you can use the bug log to decide which
errors are worth fixing (not all of them are) and to keep track of which errors
have already been corrected.
Individual needs vary, but at the very least, you should keep the following
kinds of information in a bug log:

•  An exact description of the failure.

•  As much detail as possible about the prevailing conditions at the time of
   the failure. This includes the version of the operating system and the
   driver and a description of the hardware configuration.

•  The importance of fixing this bug.

•  Current status of the bug.

17.3 READING CRASH SCREENS
System crashes (which Microsoft documentation euphemistically calls "STOP
messages") are perhaps the most dramatic sign that your driver has a bug. This
section describes how STOP messages are generated and explains how to get
useful information from them.

What Happens When the System Crashes
In spite of its name, a system crash is really a very orderly thing. It is NT's
way of telling you that something in the operating system has become so
unstable that rebooting is the only safe thing to do. Oddly enough, a crash
actually improves NT's reliability by preventing further damage to the system,
and by drawing attention to problems that might otherwise go unnoticed.
Two different sequences of events can lead to a system crash. In the first
scenario, some kernel-mode component happens to notice a horribly inconsistent
state of affairs and decides to take the system down. For example, the I/O
Manager might discover that a driver is trying to pass an already completed IRP
to IoCompleteRequest. The I/O Manager responds by initiating a crash.
The second path to a system crash is less direct. Here, a kernel-mode
component causes an exception which it does not or cannot handle. Code in the
Kernel traps the exception and initiates a crash. For example, a buggy driver
that generated an access violation would produce this kind of crash. So would a
driver that caused a page fault at an elevated IRQL level.
Regardless of who decides to crash the system, the deed is done by making
one of the following calls:¹

VOID KeBugCheck( Code );
VOID KeBugCheckEx( Code, Arg1, Arg2, Arg3, Arg4 );

These functions generate the STOP screen itself and (optionally) save a crash
file to disk. Then, depending on various system settings, they either reboot,
halt the system, or start up the Kernel's debug client.
The Code argument to KeBugCheck and KeBugCheckEx identifies the cause
of the crash. KeBugCheckEx takes an additional four arguments that appear as
part of the STOP message. KeBugCheck sets these values to zero. The BUGCODES.H
header file in the DDK defines all the standard bugcheck codes. You'll find
descriptions of the more common codes and their parameters in Appendix B of
this book.

¹ You can also call KeBugCheck and KeBugCheckEx in your own code if you
discover some terrible error. If you do make these functions part of your
debugging strategy, use conditional compilation to keep them out of the retail
version of the driver. Very, very few situations are serious enough to warrant
a system crash in a commercial driver.


Layout of a STOP Message
It's hard to miss the bright blue, character-mode screen on which STOP messages
appear. If you look at one of these "blue screens of death," you'll see four
distinct sections.²

Bugcheck information
The first part of the display identifies the cause of the crash. This includes
the bugcheck code, zero to four bugcheck parameters, and (if the bugcheck code
is one of Microsoft's) the symbolic name associated with the error. Here's a
sample:

*** STOP: 0x0000000A (0x00000000,0x00000002,0x00000000,0xFCE10796)
IRQL_NOT_LESS_OR_EQUAL *** Address fce10796 has base at fce10000 - XxDriver.SYS

p4-300 irql:1f SYSVER 0xf0000522

In this example, the bugcheck code is 0x0000000A and the associated symbolic
name is IRQL_NOT_LESS_OR_EQUAL. Fine, but just what does it mean? If you look
in Appendix B, you'll find that 0x0000000A is saying that the driver caused a
page fault at or above DISPATCH_LEVEL IRQL.
The four numbers in parentheses after the bugcheck code are the extra
arguments passed to KeBugCheckEx. Their significance depends on the bugcheck
code itself. Again consulting Appendix B, you'll see that the first parameter
contains the paged address (0), the second is the IRQL level at the time of the
reference (2), the third indicates the type of access (0 means "read"), and the
fourth is the address of the instruction that caused the fault (0xFCE10796).
Very thoughtfully, the display tells us that this address falls within the
range of the XXDRIVER.SYS module.
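
The parameter meanings just described can be written down as a small decoder.
This is an illustrative sketch, with StopInfo and describe_irql_stop as
hypothetical names. It handles only bugcheck code 0x0000000A, and because the
text defines only the value 0 ("read") for the access-type parameter, any other
value is reported simply as a non-read access.

```c
#include <assert.h>
#include <stdio.h>

/* The values shown on the first line of a STOP message. */
typedef struct {
    unsigned long code;       /* bugcheck code */
    unsigned long param[4];   /* the four values in parentheses */
} StopInfo;

/* Writes a one-line explanation of an IRQL_NOT_LESS_OR_EQUAL stop into buf.
   Returns 1 on success, or 0 (with an empty string in buf) if the bugcheck
   code is anything other than 0x0000000A. */
int describe_irql_stop(const StopInfo *stop, char *buf, size_t len)
{
    assert(stop != NULL && buf != NULL && len > 0);
    if (stop->code != 0x0000000AUL) {
        buf[0] = '\0';
        return 0;
    }
    snprintf(buf, len,
             "%s of address 0x%08lX at IRQL %lu, instruction at 0x%08lX",
             stop->param[2] == 0 ? "read" : "non-read access",
             stop->param[0],      /* address that was referenced */
             stop->param[1],      /* IRQL at the time of the reference */
             stop->param[3]);     /* address of the faulting instruction */
    return 1;
}
```

Fed the four parameters from the sample screen, the function reports a read of
address 0 at IRQL 2 by the instruction at 0xFCE10796.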
Next comes a line that seems to say something about the IRQL level of the
crash. This would be very useful to know, if it were correct. Sadly, KeBugCheck
always raises IRQL to HIGH_LEVEL for synchronization purposes, so the value in
a STOP message is always 0x1F.
On this same line, the SYSVER field tells you what version of NT was running.
This is just the build number in hex, with a 0xF or a 0xC in the highest nibble
to indicate whether it's the free or checked build of NT. In the sample above,
converting 0x522 to decimal says that this crash occurred under the free
version of build 1314.
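
The SYSVER arithmetic is simple enough to automate. The sketch below follows
the description in the text (build type in the highest nibble, build number in
the remaining bits). The names BuildType and decode_sysver are hypothetical,
and treating all of the low 28 bits as the build number is an assumption drawn
from that description.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { BUILD_UNKNOWN, BUILD_FREE, BUILD_CHECKED } BuildType;

/* Splits a SYSVER value into build type and build number.
   The highest nibble selects the build type (0xF free, 0xC checked);
   the remaining bits are stored in *build_number. */
BuildType decode_sysver(unsigned long sysver, unsigned long *build_number)
{
    assert(build_number != NULL);
    *build_number = sysver & 0x0FFFFFFFUL;
    switch ((sysver >> 28) & 0xFUL) {
    case 0xF: return BUILD_FREE;
    case 0xC: return BUILD_CHECKED;
    default:  return BUILD_UNKNOWN;
    }
}
```

For the sample value 0xF0000522, the low bits are 0x522 = 5 x 256 + 2 x 16 + 2
= 1314, and the 0xF nibble marks the free build.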
Most of the useful information comes from this section of the STOP message.
All by itself, it's often enough to give you a good idea of what caused the
crash. You should always take note of this part of the STOP screen before
rebooting the system.

² Under some conditions, the Kernel won't be able to display the entire screen.
This usually means that the services it needs to output some of the information
are not available.


Module list
Next comes a two-column display naming all the operating system modules and
drivers loaded at the time of the crash. It also lists each module's base
address in memory and a date-stamp indicating the module's file date.

Dll Base DateStmp - Name              Dll Base DateStmp - Name
80100000 2fc653bc - ntoskrnl.exe      80400000 2fb24f4a - hal.dll
80010000 2faae8b0 - Atdisk.sys        80686000 2fc15d19 - Fastfat.sys
fcc20000 00000000 - Floppy.SYS        fcc30000 00000000 - Fs_rec.SYS
fcc40000 00000000 - Null.SYS          fcc50000 00000000 - Beep.SYS
fcc60000 2faae8d9 - Sermouse.SYS      fcc70000 2faae8b2 - i8042prt.SYS
fcc80000 2faae8b5 - Mouclass.SYS      fcc90000 2faae8b4 - Kbdclass.SYS
fccb0000 2faae88d - VIDEOPRT.SYS      fcca0000 2faae892 - vga.sys
fccc0000 2faae8fd - Msfs.SYS          fccd0000 2faae8ec - Npfs.SYS
fccf0000 2fc12af6 - NDIS.SYS          fcce0000 2faae92d - am1500t.sys
fcd30000 2faae945 - TDI.SYS           fcd10000 2fae6a5f - nbf.sys
fcd40000 2faae94f - netbios.sys       fcd50000 00000000 - Parport.SYS
fcd60000 00000000 - Parallel.SYS      fcd70000 2faae8d8 - Serial.SYS
fcd80000 00000000 - afd.sys           fcd90000 2fba6818 - rdr.sys
fcdd0000 2fc3e3eb - srv.sys           fce10000 316aa594 - XxDriver.SYS

Occasionally, this part of the display can help you detect hostile interactions
between drivers. If driver X crashes the system if (and only if) driver Y is loaded,
there may be something going on between them. A written crash log will help
you to see these kinds of patterns.

Stack trace
The third part is a listing of the function calls on the stack that preceded
the STOP message.

Address  dword dump   Build [1314]                            - Name
ff416d18 fce10796 fce10796 ff4f9c10 e1304018 801862e3 00000246 - XxDriver.SYS
ff416d24 801862e3 801862e3 00000246 801316e6 ff416d4c ff4f9c10 - ntoskrnl.exe
ff416d2c 801316e6 801316e6 ff416d4c ff4f9c10 80175de6 ff538288 - ntoskrnl.exe
ff416d38 80175de6 80175de6 ff538288 ff416f04 00000000 00000000 - ntoskrnl.exe
ff416d84 fce10796 fce10796 fce10008 00010246 ff567ee8 00000000 - XxDriver.SYS
ff416d88 fce10008 fce10008 00010246 ff567ee8 00000000 ff58bc40 - XxDriver.SYS
ff416da4 fce1061f fce1061f 00000004 ff567ee8 00000000 ff58bc40 - XxDriver.SYS
ff416db8 80119b69 80119b69 ff58bcf8 ff567f58 ff416de4 80114d69 - ntoskrnl.exe
ff416dc8 80114d69 80114d69 ff58bc40 ff567ee8 ff567ee8 ff58bc40 - ntoskrnl.exe
ff416de8 fce105d3 fce105d3 ff58bc40 ff567ee8 00000000 00000000 - XxDriver.SYS
ff416e08 804042ac 804042ac 80102f48 ff58bc40 ff567ee8 ff567ee8 - hal.dll
ff416e0c 80102f48 80102f48 ff58bc40 8010d544 ff567ee8 804042a0 - ntoskrnl.exe
ff416e1c 804042a0 804042a0 8015b943 ff416ed8 ff593d8c 00403054 - hal.dll
ff416e20 8015b943 8015b943 ff416ed8 ff593d8c 00403054 ff567f64 - ntoskrnl.exe
ff416e3c 8010d544 8010d544 8015a348 ff58bc40 ff567ee8 ff4f9c28 - ntoskrnl.exe
ff416e40 8015a348 8015a348 ff58bc40 ff567ee8 ff4f9c28 00000001 - ntoskrnl.exe
ff416e68 80159c9c 80159c9c 00000000 00120196 01040864 ff416e08 - ntoskrnl.exe
ff416e84 80134f30 80134f30 80100c60 ffffffff ff416ed0 8015612d - ntoskrnl.exe
ff416e88 80100c60 80100c60 ffffffff ff416ed0 8015612d 0012ff30 - ntoskrnl.exe
ff416e94 8015612d 8015612d 0012ff30 40100080 00120196 0012ff14 - ntoskrnl.exe
ff416ecc 80134f30 80134f30 80100ec0 ffffffff ff416f04 80137fb5 - ntoskrnl.exe


Each line in this display represents one frame on the stack, with the most
recent frame being at the top of the display. This top frame is the one that
was active at the time of the crash. Reading down the display gives you a
history of the function calls that led to the crash.
On each line, the first column is the address of the stack frame itself. The
second and third columns both contain the return address of the function. The
remaining columns are the first four DWORD arguments passed to the function. If
a particular function takes more arguments, you won't see anything beyond the
fourth. If it takes fewer than four DWORDs, the information in some of the
rightmost columns will be bogus. The last column identifies the module in which
the return address (from column two) falls.
In the crash pictured above, you can see that code somewhere around
0xFCE10796 in XXDRIVER.SYS was executing at the time of the crash. This code
was called by a routine in NTOSKRNL.EXE (at 0x801862E3), which in turn was
called by another system routine at 0x801316E6. Unfortunately, without a linker
map, there's no way to turn these hideous addresses back into function names.
This seriously limits the value of this display.
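
The module name in the last column comes from matching an address against the
base addresses in the module list, and the same match is easy to do by hand or
in code. The following is a sketch under one stated assumption: since the STOP
screen shows only base addresses and not module sizes, the function simply
picks the module with the highest base that does not exceed the address. Module
and find_module are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* One entry from the module list on the STOP screen. */
typedef struct {
    unsigned long base;   /* load address of the module */
    const char *name;     /* module file name */
} Module;

/* Returns the module whose base is the highest one not above addr,
   or NULL if addr lies below every base in the table. Without module
   sizes this is a best guess, not a guarantee. */
const Module *find_module(const Module *mods, size_t count, unsigned long addr)
{
    const Module *best = NULL;
    size_t i;

    assert(mods != NULL || count == 0);
    for (i = 0; i < count; i++) {
        if (mods[i].base <= addr &&
            (best == NULL || mods[i].base > best->base))
            best = &mods[i];
    }
    return best;
}
```

With the bases from the sample screen, 0xFCE10796 resolves to XxDriver.SYS
(base fce10000) and 0x801862E3 to ntoskrnl.exe (base 80100000). Subtracting the
base gives the offset you would look up in a linker map.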
Also keep in mind that the call frames on the stack show you where the
problem was detected, not necessarily where it was caused. It's possible for a
driver to do horrible damage to a seldom-used part of the system and be long
gone before NT discovers it and crashes.

Recovery instructions
There's very little useful information in this part of the display. It
basically confirms the communication settings of the Kernel's debug client (if
it's enabled), lets you know when the crash dump is finished, and recommends a
response to the STOP message.

Beginning dump of physical memory
Physical memory dump complete. Contact your system administrator or
technical support group.

The actual text of this message will depend on the current option settings
selected for the system. For example, if you have disabled crash dumps, you'll
see a slightly different display.

Deciphering STOP Messages
If the truth be told, there's not all that much helpful information in a STOP
message. The top few lines, containing the bugcheck information, are perhaps
the most useful things to know. The stack trace (which at first glance looks so
promising) actually has very little to say, unless you can determine the
identities of the functions listed in the trace.
To do this, you need linker maps for the modules containing the functions.
This means you're out of luck if the functions are located in a Microsoft
module like NTOSKRNL.EXE or HAL.DLL, since these linker maps don't come with
the DDK. You can, however, generate a linker map for your own driver using the
following BUILD command:

BUILD -cef -nmake LINKER_FLAGS=-MAP:xxdriver.map

This is all a rather tedious process, and it still doesn't give you a great
deal of information. Fortunately, if you have a crash file handy, you can find
out much more with far less work. The next two sections of this chapter will
explain how to work with crash files.

17.4 AN OVERVIEW OF WINDBG
WINDBG is a kernel-mode debugger you can use to analyze both crash files
and running driver code. This section gives a brief overview of WINDBG. For
more information, see WINDBG's online help and the NT DDK Programmer's
Reference.
Although WINDBG is a helpful tool, it does have some problems. For one
thing, it's actually an amalgamation of an older console-based kernel-mode
debugger (KD) and a GUI-based source-code debugger that came with early
versions of the Win32 SDK. This double ancestry can make WINDBG a little
confusing to use, since there may be a console command, a menu option, and a
toolbar button that all do the same thing.
You may also experience occasional unexplainable WINDBG crashes, as well as
several other kinds of quirky behavior. For a complete list of known (or at
least, acknowledged) WINDBG bugs, look for an article on the Microsoft
Developers CD in the Win32 SDK Knowledge Base.³

The Key to Source-Code Debugging
One of WINDBG's most powerful features is its ability to debug kernel-mode
components at the source-code level. Sadly, the documentation isn't very clear
about how to accomplish this little miracle. Proper configuration of two sets
of directories is the key to making it all work.

Symbol directories
WINDBG gets very cranky if it can't find the symbol files for the modules
you're trying to debug. This includes both the symbols for your driver and
those for various operating system modules. See Appendix A for a description of
how to set up WINDBG symbol directories.

Source code directories
On the machine where you're running WINDBG, the directory path to your driver's
source code must exactly match the source-code path on the machine where the
driver was compiled and linked. Even the drive letter has to be the same. The
Linker stores this path information in the driver executable, and WINDBG uses
it to find the source code.⁴

³ Search for "WINDBG near bug" to find this article.
If you don't know the original source-code path for a kernel-mode component,
don't worry. As long as you have a checked build of your driver (and its
symbols haven't been stripped out), you can use the DUMPBIN utility to find the
path names. The command looks like this:

DUMPBIN /SYMBOLS XXDRIVER.SYS | MORE

This generates a lot of output. The important information is at the top of the
listing. The following excerpts show the things you should look for.

Dump of file xxdriver.sys
File Type: EXECUTABLE IMAGE
COFF SYMBOL TABLE
000 0000000B DEBUG  notype       Filename     | .file
    D:\users\art\drivers\ch18\crash\driver\crash.c
00B 00000015 DEBUG  notype       Filename     | .file
    D:\users\art\drivers\ch18\crash\driver\transfer.c
015 0000001D DEBUG  notype       Filename     | .file
    D:\users\art\drivers\ch18\crash\driver\dispatch.c
01D 00000023 DEBUG  notype       Filename     | .file
    D:\users\art\drivers\ch18\crash\driver\unload.c
023 00000000 DEBUG  notype       Filename     | .file
    D:\users\art\drivers\ch18\crash\driver\init.c

A Few WINDBG Commands
Although WINDBG is a GUI program, you really can't avoid using its command-line
window. This text-based interface supports several dozen built-in commands, and
as you'll see later, you can add extensions of your own. Table 17.1 gives an
overview of the more helpful WINDBG commands. See the online documentation and
the WINDBG Help file for more information.

⁴ WINDBG has a menu option that supposedly lets you change the source-code
path, but it doesn't seem to work.

Table 17.1  Some useful WINDBG commands

WINDBG commands and extensions

Command                   Description

help                      Print help on basic WINDBG commands
k, kb, and kv             Print a trace of the current kernel-mode stack
dd address                Dump the contents of memory
ln                        Print symbol names nearby a given value
.logopen                  Open a log file, replacing a previous version
.logappend                Add new log information to an existing file
.logclose                 Close the debug log file
!help                     Print help on standard WINDBG extensions
!handle 0 3 CID           Print verbose information about process handles
!process 0 0              List all processes on system
!process address flags    Print information about a process object
!process CID -1           Print detailed information about specific process
!thread address           Print information about a thread
!pcr                      Print context information for 80x86 CPU
!vm                       Print Virtual memory statistics
!sysptes 1                Print summary of system page table usage
!drivers                  List currently-loaded kernel-mode modules
!irpzone                  List IRPs in use in NT's IRP zone buffer
!irpzone full             (Same as above, but with more information)
!errlog                   List any unflushed messages in errorlog buffer
!bugdump ComponentName    Dump contents of bugcheck callback buffer
!irp address              Print formatted contents of an IRP
!devobj address           Print formatted contents of a Device object
!drvobj address           Print formatted contents of a Driver object
!srb address              Print formatted contents of an SRB
!trap address             Print formatted contents of 80x86 trap frame
!poolfind Tag             Print information about pool with a given tag
!poolused                 Print information about tagged pool
!reload                   Reload a particular module
!load ExtensionName       Load an extension DLL
!unload ExtensionName     Unload an extension DLL


17.5 ANALYZING A CRASH DUMP
When a crash occurs, Windows NT can save the state of the system in a dump file
on the boot partition.⁵ Crash dumps allow you to reboot almost immediately and
determine the cause of the crash at a later time. This section explains how to
analyze a system crash dump.

Goals of the Analysis
Using WINDBG, you can poke around in the remains of a dead system and
find out almost as much as if it were still running. This kind of forensic pathology
can help you develop a convincing explanation of what led to the crash. Some of
the questions you should ask when you're analyzing a crash include:
• Was my driver executing at the time of the crash?
• Was my driver responsible for the crash?
• What was the sequence of events that led to the crash?
• What operation was the driver trying to perform when the system
  crashed?
• Is there any information in the Device Extension that might tell me what
  was going on?
• What Device object was it working with?

Starting the Analysis
To begin analyzing a crash file, run WINDBG from the command line with
the -y and -z options. These specify the location of the crash symbols and the
dump file. For example,

    WINDBG -y c:\wnt\symbols -z c:\wnt\memory.dmp

For the crash that produced the STOP message you saw earlier, the initial
output from WINDBG looks like this:

    Thread Create: Process=0, Thread=0
    Module Load: d:\users\art\drivers\symbols\free\NTOSKRNL.DBG (symbols loaded)
    Kernel Debugger connection established for G:\WINNT\MEMORY.DMP
    Kernel Version 1314 Free loaded @ 0x80100000
    Bugcheck 0000000a : 00000000 00000002 00000000 fce10796
    Stopped at an unexpected exception: code=80000003 addr=8013b416
    Hard coded breakpoint hit

[5] See Appendix A for instructions on enabling crash dumps.

You'll recognize some of this information from the STOP message. The
bugcheck code is 0xA, which means the fourth parameter (0xFCE10796 in this
case) is the address of the instruction where the problem occurred. To see where
this instruction is in your source code, choose Goto Address from the View menu,
and enter the address from the bugcheck parameter. In this particular crash,
0xFCE10796 turns out to be inside a function called XxTryToCrash.
The second parameter for bugcheck 0xA is the IRQL at the time of
the crash. From NTDDK.H, two turns out to be DISPATCH_LEVEL, which gives
us a hint about what parts of the driver might have been executing at the time of
the crash.
One point: Don't be misled by the message about the unexpected exception
with a code of 0x80000003. This is just the breakpoint used by KeBugCheck itself
to halt the system, so it has no significance.
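
The parameter decoding described above can be sketched in a few lines of C. This is only an illustration that runs on the host, not part of WINDBG or the DDK: the BugcheckInfo type and the two helper functions are hypothetical names, and the only facts taken from the text are that, for bugcheck 0xA, the second parameter is the IRQL and the fourth is the address of the offending instruction.

```c
#include <assert.h>
#include <stdio.h>

/* Bugcheck 0xA is reported by NT as IRQL_NOT_LESS_OR_EQUAL. */
#define BUGCHECK_IRQL_NOT_LESS_OR_EQUAL 0x0000000AUL

/* Hypothetical container for the code and the four parameters that
   WINDBG prints on its "Bugcheck" line. Not a DDK structure. */
typedef struct {
    unsigned long code;
    unsigned long param[4];
} BugcheckInfo;

/* Parameter 2 of a 0xA bugcheck: the IRQL at the time of the crash. */
unsigned long bugcheck_irql(const BugcheckInfo *info)
{
    assert(info != NULL);
    assert(info->code == BUGCHECK_IRQL_NOT_LESS_OR_EQUAL);
    return info->param[1];
}

/* Parameter 4 of a 0xA bugcheck: the address where the problem occurred. */
unsigned long bugcheck_fault_address(const BugcheckInfo *info)
{
    assert(info != NULL);
    assert(info->code == BUGCHECK_IRQL_NOT_LESS_OR_EQUAL);
    return info->param[3];
}

/* Print a one-line summary of a 0xA bugcheck. */
void bugcheck_print(const BugcheckInfo *info)
{
    printf("IRQL %lu, fault at 0x%08lX\n",
           bugcheck_irql(info), bugcheck_fault_address(info));
}
```

Filling the structure with the values from the sample dump (0000000a : 00000000 00000002 00000000 fce10796) yields an IRQL of 2 and the address 0xFCE10796, which is the value you would give to Goto Address.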

Tracing the Stack
The stack trace is like a time line, showing you the sequence of function
calls leading up to the crash. By reading the trace from the oldest frame (at the
bottom) to the crash frame (at the top), you can come up with a coherent story
describing what happened. The trick is to find the stack.

High-IRQL crashes  If the system crashed while it was running at or
above DISPATCH_LEVEL IRQL, you can use the k command to get a trace of the
stack at the time of the bugcheck.

    KDx86> k
    ff416d1c fce10796 NT!KiTrap0E+0x252
    ff416da0 fce1061f XXDRIVER!XxTryToCrash+0x26 (0x00000004)
    ff416dc4 80114d69 XXDRIVER!XxStartIo+0x2f (0xFF58BC40, 0xFF567EE8)
    ff416de4 fce105d3 NT!IoStartPacket+0x9b
    ff416e08 80102f48 XXDRIVER!XxDispatchWrite+0x43 (0xFF58BC40, 0xFF567EE8)
    ff416e1c 8015b943 NT!@IofCallDriver@8+0x38
    ff416e3c 8015a348 NT!IopSynchronousServiceTail+0x6f
    ff416ed8 80137fb5 NT!NtWriteFile+0x6ac
    ff416ed8 77f89427 NT!KiSystemService+0xa5
    0012ff6c 00000000 NTDLL!ZwWriteFile+0xb

Each line shows the address of the stack frame, the return address of the
function, the name of the function, and (in parentheses) the arguments passed to
the function. You generally won't see any arguments for system functions. To
make them show up, use the kv version of the stack-trace command.
In this trace, a call to ZwWriteFile eventually found its way to XXDRIVER's
XxDispatchWrite routine. The first argument for a Dispatch routine is always the
Device object (here, 0xFF58BC40) and the second (0xFF567EE8) is the IRP.
XxDispatchWrite called IoStartPacket, which called the Start I/O routine in
XXDRIVER. Just before it died, XxStartIo made a call to XxTryToCrash and
passed it an argument with a value of four.

Another way to see the current stack is by selecting Calls from the WINDBG
Window menu. Double-clicking on one of the frames in the Calls display will take
you to the line of source code where the call originated. Once you've entered a
stack frame this way, you can examine the function's local variables at the time of
the crash by selecting Locals from the WINDBG Window menu.

Crashes below DISPATCH_LEVEL  When the system crashes because of
an unhandled exception below DISPATCH_LEVEL IRQL, the stack trace from the
k command won't tell you much about what was going on.[6] Instead, you need to
find the trap frame associated with the crash.
On 80x86 platforms, you can find the trap frame by using the kb command.
First, look for the stack frame associated with a function called
KiDispatchException.[7]

    KDx86> kb
    FramePtr RetAddr  Param1   Param2   Param3   Function Name
    fccccab8 80138f59 fccccad4 00000000 fccccb28 NT!KiDispatchException+0x366
    fccccc10 8015c542 80102f48 ff564410 ff4ebc88 NT!CommonDispatchException+0x4d
    fccccb28 ff56c860 ff50c1e8 00000000 00000006 NT!IopErrorLogQueueRequest+0x5c

Next, look down the left-hand column (the one labeled "FramePtr") for the
address of the frame two earlier than the KiDispatchException frame. In this
crash, the frame of interest has the address 0xFCCCCB28. What you've just found
is called the trap frame, and you can use the !trap command to format it.

    KDx86> !trap fccccb28
    eax=00000000 ebx=00000000 ecx=fccccb88 edx=00000000 esi=ff4ebec0 edi=ff567ee8
    eip=fce10796 esp=fccccb9c ebp=fccccbac iopl=0 nv up ei pl zr na po nc
    vip=0 vif=0
    cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
    ErrCode = 00000000

From the formatted trap frame, note the contents of the EBP (0xFCCCCBAC),
ESP (0xFCCCCB9C), and EIP (0xFCE10796) registers. Use these values in the k
command to specify the stack address. This displays the true stack trace at the time
of the crash.

    KDx86> k = fccccbac fccccb9c fce10796
    fccccbac fce1046a XXDRIVER!XxTryToCrash+0x26 (0x00000002)
    fccccbc4 80102f48 XXDRIVER!XxDispatchOpenClose+0x1a (0xFF4EBEC0, 0xFF567EE8)
    fccccbd8 8015ccca NT!@IofCallDriver@8+0x38
    fccccc9c 80179b00 NT!IopParseDevice+0x77e
    fccccd0c 80175cf6 NT!ObpLookupObjectName+0x480
    fccccde4 80151e33 NT!ObOpenObjectByName+0xa2
    fcccce90 8015612d NT!IoCreateFile+0x43d
    fcccced0 80137fb5 NT!NtCreateFile+0x2f
    fcccced0 77f889b3 NT!KiSystemService+0xa5
    fccccb98 ff567ee8 0x77f889b3

In this trace, it's obvious that the problem occurred during a call to
NtCreateFile in the driver's XxDispatchOpenClose function.

[6] If you have WINDBG connected to the target system and it catches the
exception, the stack trace will give you useful information.
[7] This sample output is from a different crash than the one we've been examining.

Using trap frames Another way to find the proper stack on 80x86
machines is to use the kv command. This displays a more detailed view of each
frame. Look for a function with KiTrap in its name. Next to this function, you'll
find the address of the trap frame.

    KDx86> kv
    ff416d1c fce10796 NT!KiTrap0E+0x252 (FPO: [0,0] TrapFrame @ ff416d1c)
    ff416da0 fce1061f XXDRIVER!XxTryToCrash+0x26 (0x00000004)
    ff416dc4 80114d69 XXDRIVER!XxStartIo+0x2f (0xFF58BC40, 0xFF567EE8)
    ff416de4 fce105d3 NT!IoStartPacket+0x9b

On the line for KiTrap, you'll find the address of the trap frame (in this case,
0xFF416D1C). Use the !trap command to format its contents.

    KDx86> !trap ff416d1c
    eax=00000000 ebx=ff58bc40 ecx=ff416d7c edx=00000000 esi=00000000 edi=ff567ee8
    eip=fce10796 esp=ff416d90 ebp=ff416da0 iopl=0 nv up ei pl zr na po nc
    vip=0 vif=0
    cs=0008 ss=0010 ds=0023 es=0023 fs=0030 gs=0000 efl=00010246
    ErrCode = 00000000
    0xfce10796 8a00 mov al,byte ptr [eax]

From the trap frame, note the contents of the EBP (0xFF416DA0), ESP
(0xFF416D90), and EIP (0xFCE10796) registers. Use these values in the k
command to specify the stack address. This displays the true stack trace at the
time of the crash.

    KDx86> k = ff416da0 ff416d90 fce10796
    ff416da0 fce1061f XXDRIVER!XxTryToCrash+0x26 (0x00000004)
    ff416dc4 80114d69 XXDRIVER!XxStartIo+0x2f (0xFF58BC40, 0xFF567EE8)
    ff416de4 fce105d3 NT!IoStartPacket+0x9b
    ff416e08 80102f48 XXDRIVER!XxDispatchWrite+0x43 (0xFF58BC40, 0xFF567EE8)
    ff416e1c 8015b943 NT!@IofCallDriver@8+0x38
    ff416e3c 8015a348 NT!IopSynchronousServiceTail+0x6f
    ff416ed8 80137fb5 NT!NtWriteFile+0x6ac
    ff416ed8 77f89427 NT!KiSystemService+0xa5
    ff416d8c ff567ee8 0x77f89427

You can see that this display matches the one generated by the k command,
verifying that we've found the right stack.
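
If you find yourself doing this often, the step from a formatted trap frame to the k command is mechanical enough to sketch in C. The TrapRegs type and format_k_command function below are hypothetical helpers for illustration only; the one thing taken from the text is the order of the values (EBP, then ESP, then EIP) after the equals sign.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical holder for the three registers read from !trap output. */
typedef struct {
    unsigned long ebp;
    unsigned long esp;
    unsigned long eip;
} TrapRegs;

/* Build the "k = ebp esp eip" command line in buf.
   Returns the number of characters written (excluding the terminator). */
int format_k_command(const TrapRegs *regs, char *buf, size_t len)
{
    assert(regs != NULL && buf != NULL && len > 0);
    return snprintf(buf, len, "k = %08lx %08lx %08lx",
                    regs->ebp, regs->esp, regs->eip);
}
```

With the registers from the second trap frame shown above (EBP 0xFF416DA0, ESP 0xFF416D90, EIP 0xFCE10796), the buffer receives the same command that was typed at the KDx86> prompt.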

Indirect Methods of Investigation
If your driver wasn't running at the time of the crash, the stack trace won't
contain any useful information and you'll need to take a more indirect approach
to find the problem. The goal is to gather as much information as possible about
what the driver was doing when the system crashed. This involves a certain
amount of creativity and imagination.

Finding I/O requests  One approach is to track down any IRPs the driver
was processing at the time it died, and then try to puzzle out what was
happening. Begin by getting a list of all the active IRPs on the system with the
!irpzone command:

    KDx86> !irpzone
    Small Irp list
    ff567ee8 Thread ff599be0 current stack belongs to \Driver\XxDriver
    ff56a708 Thread ff519620 current stack belongs to \Driver\Mouclass
    ff56ab08 Thread ff547a60 current stack belongs to \Driver\Kbdclass
    ff56bd08 Thread ff500500 current stack belongs to \FileSystem\Rdr
    Large Irp list

From this list, select the IRPs currently belonging to your driver. Next, use
the !irp command to format each one (this can be a rather tedious process if there
are a lot of IRPs). This is what the formatted IRP looks like:

    KDx86> !irp ff567ee8
    Irp is from zone and active with 1 stacks 1 is current
    No Mdl System buffer = ff593d88 Thread ff599be0: Irp stack trace.
      cmd flg cl Device   File     Completion-Context
    >   4   0  1 ff58bc40 ff4f9c28 00000000-00000000 pending
                 \Driver\XxDriver
                 Args: 00000004 00000000 00000000 00000000


The cmd field shows the major function, and the Args field displays the
Parameters union of the I/O stack location. The flg and cl fields show the stack
location flags and control bits, which you can find in NTSTATUS.H.
Here, you can see that the function code was a four (IRP_MJ_WRITE) and
Parameters.Write.Length was 4 bytes. Furthermore, no Completion routine (or
completion context) was associated with this I/O stack location, and it had
already been marked pending at the time of the crash.
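
Reading the cmd column by eye gets tedious when there are many IRPs. The following C sketch shows the kind of lookup you would do by hand. The numeric values are the IRP_MJ_xxx codes from NTDDK.H, but only a partial list is reproduced here, and the two functions (irp_major_name and describe_stack_location) are hypothetical helpers that run on the host, not WINDBG features.

```c
#include <assert.h>
#include <stdio.h>

/* Partial list of major function codes, as defined in NTDDK.H. */
#define IRP_MJ_CREATE          0x00
#define IRP_MJ_CLOSE           0x02
#define IRP_MJ_READ            0x03
#define IRP_MJ_WRITE           0x04
#define IRP_MJ_DEVICE_CONTROL  0x0e
#define IRP_MJ_CLEANUP         0x12

/* Translate the "cmd" value from !irp output into a symbolic name. */
const char *irp_major_name(unsigned char cmd)
{
    switch (cmd) {
    case IRP_MJ_CREATE:         return "IRP_MJ_CREATE";
    case IRP_MJ_CLOSE:          return "IRP_MJ_CLOSE";
    case IRP_MJ_READ:           return "IRP_MJ_READ";
    case IRP_MJ_WRITE:          return "IRP_MJ_WRITE";
    case IRP_MJ_DEVICE_CONTROL: return "IRP_MJ_DEVICE_CONTROL";
    case IRP_MJ_CLEANUP:        return "IRP_MJ_CLEANUP";
    default:                    return "(not in this partial table)";
    }
}

/* Describe one I/O stack location. For a write, the first Args value
   is Parameters.Write.Length. Returns the number of characters written. */
int describe_stack_location(unsigned char cmd, unsigned long arg0,
                            char *buf, size_t len)
{
    assert(buf != NULL && len > 0);
    if (cmd == IRP_MJ_WRITE)
        return snprintf(buf, len, "%s, Length=%lu bytes",
                        irp_major_name(cmd), arg0);
    return snprintf(buf, len, "%s", irp_major_name(cmd));
}
```

For the stack location shown above (cmd 4, first Args value 00000004), describe_stack_location reports a 4-byte IRP_MJ_WRITE.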
Finally, there is a system buffer associated with the IRP (at location
0xFF593D88) which you can examine using the dd command or the Memory
option in the WINDBG Window menu. This tells us that the Device object is doing
Buffered I/O.
To see exactly which device the IRP was sent to, use the !devobj command
on the address of the Device object from the IRP display. Here you can see that the
target device was Crash0, and that the IRP had already been made current when
the system crashed.

    KDx86> !devobj ff58bc40
    Device object is for:
     Crash0 \Driver\XxDriver DriverObject ff53e1d0
    Current Irp ff567ee8 RefCount 1 Type 00000022 DevExt ff58bcf8
    DeviceQueue:

Sometimes, you can find out even more information about what was going
on by dumping the contents of the Device Extension with the dd command. Later
in this chapter, you'll see how to write a WINDBG extension that makes the
Device Extension easier to dump.
Of course, this doesn't give us nearly as much information as the stack trace,
but it does tell us that the driver was trying to process a Buffered I/O
IRP_MJ_WRITE command. Since the IRP had been made current, we know that it
got at least as far as the driver's Start I/O routine. Often the best approach in this
case is to set up the system for interactive debugging and try to make the error
repeat.

Examining processes Occasionally, it's helpful to know what processes
were running on a system at the time of a crash. This could help you spot patterns
of system usage or even specific user programs that cause your driver to fail. For
general information, you can use the !process command like this:

    KDx86> !process 0 0
    **** NT ACTIVE PROCESS DUMP ****
    PROCESS ff578940 Cid: 0002 Peb: 00000000 ParentCid: 0000
        DirBase: 00030000 ObjectTable: e1000f88 TableSize: 64.
        Image: System
    PROCESS ff554360 Cid: 0013 Peb: 7ffdf000 ParentCid: 0002
        DirBase: 012ec000 ObjectTable: e10017c8 TableSize: 48.
        Image: smss.exe
    PROCESS ff58b6c0 Cid: 0090 Peb: 7ffdf000 ParentCid: 007b
        DirBase: 003b9000 ObjectTable: e11feee8 TableSize: 16.
        Image: Xxtest.exe

For more information, you can use the CID number of a specific process and
increase the level of verbosity with some flags.[8]

    KDx86> !process 90 -1
    Searching for Process with Cid == 90
    PROCESS ff58b6c0 Cid: 0090 Peb: 7ffdf000 ParentCid: 007b
        DirBase: 003b9000 ObjectTable: e11feee8 TableSize: 16.
        Image: Xxtest.exe
        VadRoot ff4fa668 Clone 0 Private 33. Modified 0. Locked 0.
        FF58B87C MutantState Signalled OwningThread 0
        Token                             e1304030
        ElapsedTime                       0:00:00.0110
        UserTime                          0:00:00.0020
        KernelTime                        0:00:00.0030
        QuotaPoolUsage[PagedPool]         6892
        QuotaPoolUsage[NonPagedPool]      1096
        Working Set Sizes (now,min,max)   (146, 50, 345)
        PeakWorkingSetSize                153
        VirtualSize                       8 Mb
        PeakVirtualSize                   8 Mb
        PageFaultCount                    159
        MemoryPriority                    FOREGROUND
        BasePriority                      9
        CommitCharge                      38

        THREAD ff599be0 Cid 90.88 Teb: 7ffde000 Win32Thread: 801448c0 RUNNING
        IRP List:
            ff567ee8: (0006,0094) Flags: 00000a30 Mdl: 00000000
        Not impersonating
        Owning Process ff58b6c0
        WaitTime (seconds)      107578
        Context Switch Count    12
        UserTime                0:00:00.0010
        KernelTime              0:00:00.0030
        Start Address 0x77f270a4
        Initial Sp ff417000 Current Sp ff416bec
        Priority 9 BasePriority 9 PriorityDecrement 0 DecrementCount 124

        ChildEBP RetAddr  Args to Child
        0012f750 00000000 00000000 00000000 00000000

[8] Kernel-mode threads always run in the process whose CID is 2.

For multithreaded processes, this form of the !process command will tell
you things about all the threads, including any objects they might be waiting for.
It also gives information about the I/O requests issued by a given thread, so if a
thread seems to be getting hung, you can see what IRPs it issued.

Analyzing Crashes with DUMPEXAM
DUMPEXAM is a command-line utility that you can use to analyze a crash
dump file. When you run this utility, it uses the kernel-mode debugger to execute
a standard series of commands and produces an output file called
MEMORY.TXT. The analysis performed by DUMPEXAM is intended to give support
personnel a fairly detailed snapshot of the state of the system at the time of the
crash. This can be useful if you're trying to support a driver out in the field.
You'll find DUMPEXAM on the Windows NT distribution CD in the
\SUPPORT\DEBUG\ directory. Along with the DUMPEXAM executable,
you have to install the KD EXTS.DLL extension DLL for the target
platform. Normally, these DLLs are copied along with everything else when you
install WINDBG from the Win32 SDK. You also need to copy IMAGEHLP.DLL
from the Windows NT distribution CD. It's in the same directory as the
DUMPEXAM executable.
Finally, make sure you mirror the debug symbol tree that's on the CD when
you run DUMPEXAM. Unfortunately, this tool isn't smart enough to handle the
situation where everything is in the same directory.

17.6 INTERACTIVE DEBUGGING
Poking around in the remains of a dead system can tell you a great deal, but some
problems are easier to diagnose while a driver is still running. This section briefly
describes how to debug driver code interactively.

Starting and Stopping a Debug Session
WINDBG is the primary tool for interactive debugging. To use it, you'll need
to set up host and target systems as described in Appendix A. As with crash
dump analysis, make certain the source-code path on the host exactly matches the
source-code path on the machine where the driver was built. Once everything is
configured, follow these steps to begin an interactive debug session:
1. Move a copy of your driver's executable (or the corresponding .DBG symbol
   file) into the symbol directory on the host. Repeat this step each time you
   rebuild the driver, or the symbols will be out of sync.

2. From the command line, run WINDBG using the -k and -y options, for
   example,

       WINDBG -k i386 com1 9600 -y c:\wnt\symbols ntoskrnl.exe

3. From the WINDBG Run menu, select Go. You'll see a message in the
   WINDBG command window saying that WINDBG is waiting to connect.

4. Reboot the target machine with the Kernel's debug client enabled. As the
   system boots, you'll see it trying to make a connection with the debugger on
   the host. When the systems connect, there will be a lot of activity in WINDBG's
   command window.

Once you've established a connection between the host and target machines,
you have a wide range of commands available to you. For the most part, the
interactive WINDBG commands are a superset of the ones you use to analyze a crash.
You also have the added capability of setting breakpoints on the target and
single-stepping through target code.
After you've completed a debugging session, you should follow these steps
to disconnect the host and the target:
1. If you've set any breakpoints in your driver, pause the target system by
   typing CTRL+C in the WINDBG command window. (Alternatively, you can
   press the SYSREQ key on the target itself.)

2. From the Debug menu, choose Breakpoints. When the breakpoint dialog
   appears, click on Clear All and OK.

3. From the Run menu, choose Go (or use the toolbar button) to let the target
   machine continue.

4. From the File menu, choose Exit.

After WINDBG has exited, the target machine may pause for 30 seconds
or so the first time it hits a KdPrint macro. This delay is the time it takes the
Kernel's debug client to realize there's no debugger to talk to. It occurs only
once.

Setting Breakpoints
One of the great things about WINDBG is its ability to set source-code
breakpoints in a driver. This can be immensely helpful for figuring out the exact
nature of a bug. To set a breakpoint with WINDBG, do the following:

1. If the target machine is currently running, type CTRL+C in the WINDBG
   command window to pause the target. (Alternatively, you can press
   the SYSREQ key on the target.) You can't set breakpoints if the target is
   running.

2. From the File menu, choose Open. The Open File dialog box will appear.
   Navigate to the directory containing your driver's source code. Double-click
   on a source file to open it.

3. Move the cursor to the source code line where you want to set the breakpoint.
   If you're breaking on a multiline C statement, make sure you position the
   cursor on the line containing the semicolon.

4. Click on the breakpoint button in the toolbar. (It's the one that looks like a
   little hand.) If your driver is currently loaded in memory, the source-code line
   will turn red; if it hasn't been loaded yet, the source line will turn magenta.

5. Click on the Go button in the toolbar to let the target machine continue. When
   the target machine hits the breakpoint, it will stop and the source-code line in
   WINDBG will turn green.

To remove a breakpoint, simply pause the target machine, select the source
code line containing the breakpoint, and click on the toolbar's breakpoint
button. You can also use the Debug Breakpoints menu item to remove multiple
breakpoints.
Breakpoints highlight another of WINDBG's little quirks. If you set several
breakpoints in a driver that hasn't been loaded yet, WINDBG won't be able to
resolve the first one that it hits. Instead it will display a dialog box asking you
how it should handle the breakpoint. You should select the Defer option. This will
cause WINDBG to instantiate all the breakpoints in the driver and proceed. When
WINDBG hits the next breakpoint, it will work correctly. (In fact, even if it hits the
first breakpoint again, it will work properly.) Breakpoints that you set after the
driver is loaded don't seem to have this problem.
This odd behavior can make it difficult to set breakpoints in the DriverEntry
routine. The easiest solution is just to set an extra (dummy) breakpoint
somewhere at the beginning of DriverEntry. This one will cause the others to behave
properly.

Setting Hard Breakpoints
With WINDBG, there aren't too many compelling reasons for putting hard
breakpoints into your driver. If you do find such a need, you can use the
following two calls:

    VOID DbgBreakPoint( VOID );
    VOID KdBreakPoint( VOID );

KdBreakPoint is just a macro that wraps a conditional compilation directive
around DbgBreakPoint. KdBreakPoint becomes a no-op if you build a free
version of your driver.
Beware: NT will crash with a KMODE_EXCEPTION_NOT_HANDLED
error if your driver hits a hard-coded breakpoint and the Kernel's debug client
isn't enabled. If your driver hits a breakpoint and there's no debugger on the
other end of the serial line, NT will hang the target machine. You can recover from
the hang by starting up WINDBG on the host machine.

Using Print Statements
Debugging code by peppering it with printf statements has a long and
honorable history. You can continue the tradition by calling either DbgPrint or
KdPrint. Both allow you to send a debug string from your driver (on the target
system) to the WINDBG command window (on the host machine). These calls
have the following syntax:

    ULONG DbgPrint( FormatString, arg1, arg2 ... );
    ULONG KdPrint(( FormatString, arg1, arg2 ... ));

DbgPrint and KdPrint take the same arguments as the standard printf
function. Since KdPrint is actually a macro (defined in NTDDK.H), you have to
include an extra set of parentheses in order to pass it a variable-length list of
arguments. KdPrint also becomes a no-op in free builds of a driver.
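
The reason for the double parentheses is easier to see in a small, stand-alone sketch. The code below is not the definition from NTDDK.H. MY_KDPRINT, my_dbg_print, g_last_message, and report_write are hypothetical stand-ins that run as an ordinary user-mode program, and the DBG symbol is set by hand here where a checked build would normally define it. The point is only that a one-argument macro can forward a whole parenthesized argument list to a variadic function, and can expand to nothing when DBG is zero.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Most recent message, kept so the caller can inspect what was printed. */
char g_last_message[128];

/* Stand-in for DbgPrint: formats like printf and writes to stdout. */
unsigned long my_dbg_print(const char *format, ...)
{
    va_list args;
    int n;

    assert(format != NULL);
    va_start(args, format);
    n = vsnprintf(g_last_message, sizeof g_last_message, format, args);
    va_end(args);
    fputs(g_last_message, stdout);
    return (unsigned long)(n < 0 ? 0 : n);
}

#ifndef DBG
#define DBG 1   /* assumption for this demo: behave like a checked build */
#endif

/* The macro takes ONE argument: the complete parenthesized argument list. */
#if DBG
#define MY_KDPRINT(_x_) my_dbg_print _x_
#else
#define MY_KDPRINT(_x_)
#endif

/* Example caller: note the extra set of parentheses. */
void report_write(unsigned long length)
{
    MY_KDPRINT(("XxDriver: write of %lu bytes\n", length));
}
```

Calling report_write(4) prints one line when DBG is nonzero. With DBG set to 0 the statement compiles to an empty statement, so the format string and its arguments never reach the binary.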

17.7 WRITING WINDBG EXTENSIONS
One of WINDBG's strengths is that you can expand its capabilities by writing
extension commands for it. This can be very helpful, particularly for printing out
the contents of driver-defined data structures. Unfortunately, the documentation
and sample extension code that come with the NT DDK are incorrect. This section
explains how to add extension commands to WINDBG.

How WINDBG Extensions Work
A WINDBG extension is just a user-mode DLL that exports various
commands in the form of DLL functions. The extension DLL also contains several
support routines that perform initialization and version-checking operations.
One of the tricky aspects of writing a WINDBG extension is gaining access
to memory in the target system (whether it's a crash file or a live machine). To
make this easy, WINDBG supplies a set of callback routines that the extension
DLLs use to touch the debug target. This means the DLL has the same view of
the target system's memory as WINDBG itself. In particular, extension
commands can't access anything that is paged out at the time a crash or breakpoint
occurs.

Initialization and Version-Checking Functions
When you write an extension DLL for WINDBG, there are two required
initialization functions that you must include. At your option, you can also include a
third version-checking function. These are described in the following subsections.

WinDbgExtensionDllInit  WINDBG calls this function when the user
loads the extension DLL. Its job is to save the address of the callback table so
that other parts of the DLL can use it. This function (shown in Table 17.2) is
required.

Table 17.2  Function prototype for WinDbgExtensionDllInit

VOID WinDbgExtensionDllInit

Parameter                                Description
PWINDBG_EXTENSION_APIS lpExtensionApis   Address of table containing pointers to
                                         WINDBG callback functions
USHORT MajorVersion                      • 0xF for free build of NT
                                         • 0xC for checked build of NT
USHORT MinorVersion                      Build number of NT
Return value                             (None)

ExtensionApiVersion  WINDBG calls this function when you try to load
an extension DLL. Its job is to convince WINDBG that the extension DLL has the
same version as WINDBG itself. It does this by returning a pointer to the version
structure associated with the extension DLL. This function (shown in Table 17.3)
is required.

Table 17.3  Function prototype for ExtensionApiVersion

LPEXT_API_VERSION ExtensionApiVersion

Parameter                                Description
VOID                                     (None)
Return value                             Address of the DLL's EXT_API_VERSION structure

CheckVersion  Each time WINDBG executes a command in the DLL, it
calls this function before calling the command routine. CheckVersion's job is to
make sure that the version of the extension DLL is compatible with the version of
NT being debugged. If not, it should complain loudly (and perhaps set a global
DLL variable to inhibit command execution). This function (shown in Table 17.4)
is optional.

Table 17.4  Function prototype for CheckVersion

VOID CheckVersion

Parameter                                Description
VOID                                     (None)
Return value                             (None)

Writing Extension Commands
Each command in your extension DLL is implemented as a separate
function. Define these command functions using the DECLARE_API macro, like this:

    DECLARE_API( command_name )
    {
        //
        // Your code
        //
        ...
    }

DECLARE_API gives your command function the prototype shown in Table
17.5. Be sure the names of your commands are entirely lower-case, or WINDBG
won't be able to find them.

Table 17.5  Commands declared with DECLARE_API have this prototype

VOID command_name

Parameter                                Description
IN HANDLE hCurrentProcess                Handle of current process on target machine
IN HANDLE hCurrentThread                 Handle of current thread on target machine
IN ULONG dwCurrentPc                     Current value of program counter
IN ULONG dwProcessor                     Number of current CPU
IN PCSTR args                            Argument string passed to the command
Return value                             (None)

These extension commands can perform any sort of operation that will
make debugging easier. Their most common use is to format and print the
contents of various driver-defined data structures, like the Device Extension.
Finally, if one of your extension commands is going to take a long time to
execute, or if it's going to generate a lot of output, it should periodically check to
see if the WINDBG user has typed CTRL+C. Otherwise, the user won't have any
way to abort the command until it completes. One of the WINDBG helper
functions described next lets you make this check.
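
The shape of such a polling loop can be sketched outside of WINDBG. In the code below nothing comes from WDBGEXTS.H: stub_check_control_c is a hypothetical stand-in for the CheckControlC helper, g_polls_until_ctrl_c is a test knob that simulates the user pressing CTRL+C, and plain printf takes the place of dprintf. A real extension command would make the same check at the same point in its loop.

```c
#include <assert.h>
#include <stdio.h>

/* Test knob: number of polls that report "no CTRL+C" before one reports
   "CTRL+C typed". A negative value means the user never interrupts. */
int g_polls_until_ctrl_c = -1;

/* Stand-in for the CheckControlC helper; nonzero means "abort requested". */
int stub_check_control_c(void)
{
    if (g_polls_until_ctrl_c < 0)
        return 0;
    if (g_polls_until_ctrl_c == 0)
        return 1;
    g_polls_until_ctrl_c--;
    return 0;
}

/* Dump count 32-bit values, four per line, polling for CTRL+C before
   each new line. Returns the number of values actually printed. */
unsigned dump_dwords(const unsigned long *data, unsigned count)
{
    unsigned i;

    assert(data != NULL);
    for (i = 0; i < count; i++) {
        if (i % 4 == 0 && stub_check_control_c()) {
            printf("<aborted by user>\n");
            break;
        }
        printf("%08lx%c", data[i],
               (i % 4 == 3 || i + 1 == count) ? '\n' : ' ');
    }
    return i;
}
```

Polling once per output line, not once per value, keeps the overhead small while still letting the user stop a long dump within a line or two.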

WINDBG Helper Functions
Your extension DLL gains access to the system being debugged by calling
various helper functions exported by WINDBG itself. These functions also give
your DLL access to the WINDBG command window for input and output. Table
17.6 contains a brief description of these helper functions.

Table 17.6  A WINDBG extension DLL can call these helper functions

WINDBG helper functions

Function            Description
dprintf             Print formatted text in WINDBG command window
CheckControlC       See if WINDBG user has typed CTRL+C
GetExpression       Convert a C expression into a DWORD value
GetSymbol           Locate name of symbol nearest a given address
Disassm             Generate string representation of machine instruction
StackTrace          Return stack-trace of current process
GetKDContext        Return current CPU number and count of CPUs
GetContext          Return CPU context of process being debugged
SetContext          Modify CPU context of process being debugged
ReadControlSpace    Get platform-specific CPU information
ReadMemory          Copy data from system virtual space into buffer
WriteMemory*        Copy data from buffer to system virtual space
ReadIoSpace*        Read I/O port
WriteIoSpace*       Write I/O port
ReadIoSpaceEx*      Read I/O port on specific bus-type and number
                    (Alpha only)
WriteIoSpaceEx*     Write I/O port on specific bus-type and number
                    (Alpha only)
ReadPhysical        Copy data from physical memory into buffer
WritePhysical*      Copy data from buffer to specific physical addresses

*These functions can only be used during an interactive debugging session.

The only complete documentation on these helper functions is in the
WINDBG online help. To find it, do the following:
1. From the WINDBG help Contents screen, click on the KD button.

2. Click on the "Creating Extensions" topic.

3. Scroll about halfway down this topic and you'll find a list of helper functions.

4. Click on the name of a function to see its prototype and a description.

Building and Using an Extension DLL
Although a WINDBG extension is just a user-mode DLL, you still need to
compile and link it using the BUILD utility. This is because it incorporates the
DDK header files, and it needs all the compile-time symbol definitions provided
by BUILD. Consequently, using Visual C++ projects to create an extension DLL
isn't easy. The example in the next section contains a SOURCES file that builds
one of these DLLs.
To use an extension DLL, you first load it using WINDBG's !load command.
Then you execute one of its functions with a command of the form !function. The
!unload command allows you to unload an extension DLL.
WINDBG allows you to have up to 32 extension DLLs loaded at one time.
When you execute a !function command, WINDBG searches the list of currently
loaded extensions, starting with the most recently loaded and going back to the
earliest. If two DLLs export a command with the same name, the one loaded
later is the one that runs.

17.8 CODE EXAMPLE: A WINDBG EXTENSION
This example shows how to write a simple WINDBG extension DLL. You can find
the code for this example in the CH17\XXDBG directory on the disk that
accompanies this book.

XXDBG.C
All the code for this extension DLL is in a single file. The following
subsections break it into easily digestible pieces.

Headers This part of the code contains all the headers and definitions
needed to make everything work. Warning: There is some odd stuff going on
here. Don't change the sequence of anything between ❶ and ❸.

#include <ntddk.h>                          // ❶
#include
#define LMEM_FIXED          0x0000          // ❷
#define LMEM_MOVEABLE       0x0002
#define LMEM_NOCOMPACT      0x0010
#define LMEM_NODISCARD      0x0020
#define LMEM_ZEROINIT       0x0040
#define LMEM_MODIFY         0x0080
#define LMEM_DISCARDABLE    0x0F00
#define LMEM_VALID_FLAGS    0x0F72
#define LMEM_INVALID_HANDLE 0x8000

#define LPTR (LMEM_FIXED | LMEM_ZEROINIT)
#define WINBASEAPI
WINBASEAPI
HLOCAL
WINAPI
LocalAlloc(
    UINT uFlags,
    UINT uBytes
    );
WINBASEAPI
HLOCAL
WINAPI
LocalFree(
    HLOCAL hMem
    );
#define CopyMemory RtlCopyMemory
#define FillMemory RtlFillMemory
#define ZeroMemory RtlZeroMemory
#include <wdbgexts.h>                       // ❸
//
// Other header files...
//
#include <stdlib.h>
#include <string.h>

#include "..\driver\xxdriver.h"             // ❹

❶ This is the beginning of some magic. The problem is that we're trying to
build a Win32 user-mode DLL, but we need access to things defined in
NTDDK.H and XXDRIVER.H. It takes a little trickery to get all the
header files to live together.

❷ The various definitions that follow are taken from WINBASE.H in the
Win32 SDK. The WINDBG extension definitions from WDBGEXTS.H
won't work without them. Unfortunately, NTDDK.H and WINBASE.H
can't coexist in the same source file. The only solution is to cut the
required pieces from WINBASE.H and include them here.

❸ Now it's safe to bring in the WINDBG extension definitions. This header
is located in MSTOOLS\H in the Win32 SDK. Here ends the magical
sequence of headers and definitions.

❹ Finally, bring in the driver-specific data structures and definitions.

Globals These global variables are necessary for the proper operation of
the extension library.

static EXT_API_VERSION
    ApiVersion = { 3, 5, EXT_API_VERSION_NUMBER, 0 };   // ❶
static WINDBG_EXTENSION_APIS ExtensionApis;             // ❷
static USHORT SavedMajorVersion;                        // ❸
static USHORT SavedMinorVersion;

❶ This structure identifies the version of WINDBG that this particular
extension library works with. WINDBG won't allow you to load an
incompatible extension DLL.

❷ This will hold a copy of the table of WINDBG callback functions. The
access macros defined in WDBGEXTS.H assume that this variable is
called ExtensionApis, so don't change the name.

❸ These variables will hold information about the version of NT that is
being debugged. You can use this information to verify that your library
is compatible with that version.

Required functions These functions perform various kinds of initialization
and version-checking.

VOID
WinDbgExtensionDllInit(
    PWINDBG_EXTENSION_APIS lpExtensionApis,
    USHORT MajorVersion,
    USHORT MinorVersion
    )
{
    //
    // Save a copy of the WINDBG callback
    // table and the NT version information
    //
    ExtensionApis = *lpExtensionApis;

    SavedMajorVersion = MajorVersion;
    SavedMinorVersion = MinorVersion;

    return;
}

VOID
CheckVersion(
    VOID
    )
{
    //
    // Replace this with your
    // version-checking code
    //
    dprintf(
        "CheckVersion called... [%1x; %d]\n",
        SavedMajorVersion,
        SavedMinorVersion
        );
}

LPEXT_API_VERSION
ExtensionApiVersion(
    VOID
    )
{
    return &ApiVersion;
}

Command routines Here is the code for a command that formats and
prints the contents of the Device Extension. It illustrates how to access memory on
the system being debugged.

DECLARE_API( devext )
{
    DWORD dwBytesRead;
    DWORD dwAddress;
    PDEVICE_OBJECT pDevObj;
    PDEVICE_EXTENSION pDevExt;

    if(( pDevObj = malloc(
            sizeof( DEVICE_OBJECT ))) == NULL )         // ❶
    {
        dprintf( "Can't allocate buffer.\n" );
        return;
    }

    dwAddress = GetExpression( args );                  // ❷

    if( !ReadMemory(
            dwAddress,
            pDevObj,
            sizeof( DEVICE_OBJECT ),
            &dwBytesRead ))                             // ❸
    {
        dprintf( "Can't get Device object.\n" );
        free( pDevObj );
        return;
    }

    if(( pDevExt = malloc(
            sizeof( DEVICE_EXTENSION ))) == NULL )      // ❹
    {
        dprintf( "Can't allocate buffer.\n" );
        free( pDevObj );
        return;
    }

    if( !ReadMemory(
            (DWORD)pDevObj->DeviceExtension,
            pDevExt,
            sizeof( DEVICE_EXTENSION ),
            &dwBytesRead ))                             // ❺
    {
        dprintf( "Can't get Device Extension.\n" );
        free( pDevExt );
        free( pDevObj );
        return;
    }

    dprintf(                                            // ❻
        "BytesRequested: %d\n"
        "BytesRemaining: %d\n"
        "TimeoutCounter: %d\n"
        "DeviceObject: %8x\n",
        pDevExt->BytesRequested,
        pDevExt->BytesRemaining,
        pDevExt->TimeoutCounter,
        pDevExt->DeviceObject
        );

    free( pDevExt );                                    // ❼
    free( pDevObj );
}

❶ Allocate memory for a copy of the Device object.

❷ Get the address of the Device object from the command line using a
WINDBG callback function.

❸ Use another WINDBG callback function to get a copy of the Device object
from the system being debugged.

❹ Allocate another buffer to hold the Device Extension.

❺ Get the address of the Device Extension (on the target system) from the
Device object. Copy the Extension from the target system into the buffer.

KDx86> !load xxdbg                          ❶
Debugger extension library [xxdbg] loaded
KDx86> !devext ff58bc40                     ❷
CheckVersion called... [f; 1057]
BytesRequested: 0
BytesRemaining: 0
TimeoutCounter: 0
DeviceObject: ff58bc40
KDx86> !unload                              ❸
Extension dll xxdbg unloaded

❶ The !load command brings XXDBG into memory and makes it the
default extension library. For this to work, XXDBG.DLL must be in one of
the directories where the system looks for DLLs.

❷ To execute a command, just prefix the command name with an
exclamation point.

❸ The !unload command unloads the current default extension library. To
unload some other extension DLL, specify the name of the library as an
argument to the command.

17.9 MISCELLANEOUS DEBUGGING TECHNIQUES
Often the main problem in correcting driver bugs is just getting enough
information to make an accurate diagnosis. This section presents a grab bag of
techniques that may help.

Leaving Debug Code in the Driver
In general, it's a good idea to leave debugging code in place, even when
you think the driver is ready for release. That way, you can reuse it when you
have to modify the driver at some later date. Conditional compilation makes
this easy to do.
The BUILD utility defines a compile-time symbol called DBG that you can
use to conditionally add debugging code to your driver. In the checked BUILD
environment, DBG has a value of one; in the free environment it has a value of
zero. Several of the macros described below use this symbol to suppress the
generation of extraneous debugging code in free versions of drivers. If you're adding
your own debugging code to a driver, you should wrap it in #if DBG and #endif
statements.
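
As a minimal sketch of this idea, the fragment below wraps a diagnostic print in a DBG test. It is written as ordinary user-mode C so it can be compiled anywhere. The names XX_DEBUG_PRINT and xx_start_device are hypothetical, and printf merely stands in for the kernel's debug print function. In a real driver, BUILD supplies DBG; the fallback definition here exists only so the sketch stands alone.

```c
#include <stdio.h>

/* BUILD normally defines DBG (1 = checked, 0 = free).
   This fallback exists only so the sketch compiles on its own. */
#ifndef DBG
#define DBG 1
#endif

#if DBG
/* Checked build: emit the message. printf stands in for DbgPrint. */
#define XX_DEBUG_PRINT(msg) printf("XXDRIVER: %s\n", (msg))
#else
/* Free build: the macro expands to an expression with no effect. */
#define XX_DEBUG_PRINT(msg) ((void)0)
#endif

/* Hypothetical routine; returns 0 on success, -1 on a bad unit. */
int xx_start_device(int unit)
{
    XX_DEBUG_PRINT("xx_start_device entered");
    if (unit < 0) {
        XX_DEBUG_PRINT("bad unit number");
        return -1;
    }
    return 0;
}
```

Because the free-build version of the macro expands to nothing useful, the calls can stay in the source permanently without costing anything in the released driver.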


Catching Incorrect Assumptions
As in real life, making unfounded assumptions in kernel-mode drivers is a
dangerous practice. For example, assuming that some function argument will
always be non-NULL, or that a piece of code will only be called at a specific IRQL
level can lead to disaster if these expectations aren't met.
To catch unforeseen conditions that could lead to driver failure, you need to
do two things. First, you have to document the explicit assumptions made by
your code. Second, you need to verify that these assumptions are actually true at
runtime. The ASSERT and ASSERTMSG macros will help you with both these
tasks. They have the following syntax:

ASSERT( Expression );
ASSERTMSG( Message, Expression );

If Expression evaluates to FALSE, ASSERT writes a message to WINDBG's
command window. The message contains the source code of the failing
expression, plus the file name and line number where the ASSERT macro was called. It
then gives you the option of taking a breakpoint at the point of the ASSERT,
ignoring the assertion failure, or terminating the process or thread in which the
assertion occurred.
ASSERTMSG exhibits the same behavior, except that it includes the text of
the Message argument with its output. Don't try getting too fancy with the Message
argument; it's just a simple string. Unlike the debug print functions described
earlier, ASSERTMSG doesn't allow you to include any printf-style substitutions.
Several things are worth mentioning here. First, both assertion macros
compile conditionally and disappear altogether in free builds of your driver. This
means it's a very bad idea to put any executable code in the Expression argument.
Another little twist is that RtlAssert (the underlying function used by these
macros) is a no-op in the free version of Windows NT itself. So, if you want to see
any assertion failures, you'll have to run a checked build of your driver under the
checked version of Windows NT.
Finally, a warning is in order: The checked build of Windows NT will crash
with a KMODE_EXCEPTION_NOT_HANDLED error if an assertion fails and the
Kernel's debug client isn't enabled. If the debug client is enabled, but there's no
debugger on the other end of the serial line, the target machine will simply hang
when an assertion fails. You can recover from the hang by starting up WINDBG
on the host machine, but you won't see the text of the assertion that failed.
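
The warning about executable code in the Expression argument can be illustrated with a small user-mode sketch. XX_ASSERT below is a hypothetical stand-in that only mimics the compile-away behavior described above; it is not the DDK's definition of ASSERT. The helper names are likewise invented, and DBG defaults to 0 here to imitate a free build.

```c
#include <stdio.h>
#include <stdlib.h>

/* Set to 0 to mimic a free build, where assertions vanish. */
#ifndef DBG
#define DBG 0
#endif

#if DBG
#define XX_ASSERT(exp)                                              \
    do {                                                            \
        if (!(exp)) {                                               \
            fprintf(stderr, "Assertion failed: %s, %s line %d\n",   \
                    #exp, __FILE__, __LINE__);                      \
            abort();                                                \
        }                                                           \
    } while (0)
#else
#define XX_ASSERT(exp) ((void)0)    /* expression is never evaluated */
#endif

static int xx_init_calls = 0;

/* Hypothetical initialization step that must always run. */
static int xx_init_hardware(void)
{
    xx_init_calls++;
    return 1;                       /* nonzero means success */
}

/* WRONG: the call lives inside the assertion, so a free
   build never initializes the hardware at all. */
int xx_setup_wrong(void)
{
    XX_ASSERT(xx_init_hardware() != 0);
    return xx_init_calls;
}

/* RIGHT: do the work unconditionally, then assert on the result. */
int xx_setup_right(void)
{
    int ok = xx_init_hardware();
    XX_ASSERT(ok != 0);
    (void)ok;                       /* quiets "unused" warnings when DBG is 0 */
    return xx_init_calls;
}
```

With DBG set to 0, xx_setup_wrong never calls xx_init_hardware, while xx_setup_right always does. The same hazard applies to any expression with side effects, such as an increment or an assignment.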

Using Bugcheck Callbacks
A bugcheck callback is an optional driver routine that gets called by the
Kernel when the system begins to crash. These routines give you a convenient way to
capture debugging information at the time of a crash. You can also use them to
put a piece of hardware in a known state before the system goes away. Here's how
they work.


1. In DriverEntry, call KeInitializeCallbackRecord to set up a
   KBUGCHECK_CALLBACK_RECORD structure. The space for this opaque structure
   must be nonpaged, and must be left alone until you call
   KeDeregisterBugCheckCallback.

2. Also in DriverEntry, call KeRegisterBugCheckCallback to request
   notification when a bugcheck occurs. The arguments to this function include the
   bugcheck-callback record, the address of a callback routine, the address and
   size of a driver-defined crash buffer, and a string that will be used to identify
   this driver's crash buffer. As with the bugcheck-callback record, memory for
   the driver's crash buffer must be nonpaged and can't be touched until the
   driver calls KeDeregisterBugCheckCallback.

3. Call KeDeregisterBugCheckCallback in your driver's Unload routine to
   disconnect from the bugcheck notification mechanism.

4. If a bugcheck occurs, the system will call the driver's bugcheck-callback
   routine and pass it the address and size of the driver's crash buffer. The job
   of the callback routine is to fill the crash buffer with any information that
   would not otherwise end up in the dump file (like the contents of device
   registers).

5. When you analyze the crash with WINDBG, use the !bugdump command to
   view the contents of the crash buffer.

There are some restrictions on what a bugcheck callback is allowed to do.
When it runs, the callback routine can't allocate any system resources (like
memory). It also can't use spin locks or any other synchronization mechanisms.⁹ It is
allowed to call Kernel routines that don't violate these restrictions, as well as the
HAL functions that access device registers.

⁹ Synchronization shouldn't be a problem, though, since nothing else is allowed to run while the
bugcheck callback is executing.

Catching Memory Leaks
A memory leak is one of the nastier kinds of driver pathology. Drivers that
allocate pool space and then forget to release it may just degrade system
performance over time, or they can lead to actual system crashes. You can use NT's
built-in pool-tagging mechanism to determine if your driver leaks memory.
Here's how it works.

1. Replace calls to ExAllocatePool with ExAllocatePoolWithTag calls. The extra
   4-byte tag argument to this function will be used to mark the block of
   memory allocated by your driver.

2. Run your driver under the checked build of NT. Keeping track of pool tags is
   an expensive activity, so it only works under the checked version of NT.¹⁰

3. When you're analyzing a crash, or when your driver is at a breakpoint, use
   the !poolused or !poolfind commands in WINDBG to examine the state of the
   pool areas. These commands sort the pool areas by tag value and display
   various memory statistics for each tag.

One easy way to use pool tagging is to replace the ExAllocatePool function
with ExAllocatePoolWithTag using conditional compilation. This way, you can
turn tagging on and off without too much trouble. Add something like the
following to your driver's header file:

#if DBG
#define ExAllocatePool(type, size) \
    ExAllocatePoolWithTag((type), (size), 'DCBA')
#endif
The tag argument to ExAllocatePoolWithTag consists of four case-sensitive
ANSI characters. Because of the way things work on little-endian machines, you
need to specify the characters in reverse order. Hence, the DCBA in the example
will become ABCD in the pool tag display.
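
The byte-order reversal can be checked with a short user-mode sketch. The helper names xx_make_tag and xx_tag_to_text are hypothetical. The sketch builds the tag value explicitly instead of relying on a multi-character constant, whose value is implementation-defined; the shift pattern assumes the usual compiler treatment, in which the first character of 'DCBA' ends up in the high-order byte.

```c
#include <stdint.h>

/* Build a 32-bit tag from four characters given in source order,
   e.g. xx_make_tag('D','C','B','A'). The first character lands in
   the high-order byte, as it typically does for 'DCBA'. */
uint32_t xx_make_tag(char c1, char c2, char c3, char c4)
{
    return ((uint32_t)(unsigned char)c1 << 24) |
           ((uint32_t)(unsigned char)c2 << 16) |
           ((uint32_t)(unsigned char)c3 << 8)  |
            (uint32_t)(unsigned char)c4;
}

/* Write the tag's bytes in little-endian memory order (lowest
   byte first), which is the order a byte-wise display shows. */
void xx_tag_to_text(uint32_t tag, char text[5])
{
    int i;
    for (i = 0; i < 4; i++) {
        text[i] = (char)((tag >> (8 * i)) & 0xFF);
    }
    text[4] = '\0';
}
```

Passing the characters D, C, B, A to xx_make_tag and converting the result with xx_tag_to_text yields the string ABCD, matching the behavior described above.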
In this example, we used the same tag value for all the allocations made by a
single driver. For some situations, you might also want to use different tag values
for different kinds of data structures, or for allocations made by different parts of
your driver. These kinds of strategies might help you see exactly what's been
leaking out of your driver.
The POOLMON utility that comes with the NT DDK also lets you look at
the pool tags dynamically, without the need for WINDBG. You run this
command-line utility on the target machine and it outputs a continuously updated
display of the pool tags. See Chapter 6 of the DDK Programmer's Guide for details
on running POOLMON.

Using Counters, Bits, and Buffers
There's no question that interactive driver debugging is a wonderful thing.
Unfortunately, some kinds of bugs are time-dependent, and they disappear when
you use breakpoints or single step through the code. This subsection presents
several techniques that may help you catch these bugs.

¹⁰ Chapter 6 of the DDK Programmer's Guide claims that you can enable this feature in the free
build of NT by ORing the FLG_POOL_ENABLE_TAGGING bit into the GlobalFlag value of the
HKEY_LOCAL_MACHINE\System\CurrentControlSet\Control\Session Manager key of the
Registry. Unfortunately, none of the currently available documentation or header files defines
what this value is.

Sanity counters You can use pairs of counters to perform several kinds of
sanity checks in your driver. For example, you might count how many IRPs arrive
at your driver and how many you send to IoCompleteRequest. Or, in a
higher-level driver, you could track the number of IRPs allocated versus the number
released. Checks like these can help you find subtle inconsistencies in the
behavior of your driver. The only disadvantage of sanity counters is that they don't
necessarily tell you where the problem is occurring.
Implementing a counter is very simple. Just declare a ULONG variable in
your Device Extension for each counter and then add appropriate code to
increment the counters throughout your driver. As with all debugging support, it's a
good idea to wrap sanity-counter code in conditional compilation statements that
depend on the DBG symbol.
If you're feeling really ambitious, you can write a WINDBG extension to
display the counters. As a simple alternative, your driver can force a bugcheck after
it has collected enough data, and simply use a bugcheck callback to save the
counter values.
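
A pair of sanity counters might look something like the following user-mode sketch. The structure, field, and macro names (XX_DEVICE_EXTENSION, IrpsReceived, IrpsCompleted, XX_COUNT) are hypothetical, and unsigned long stands in for ULONG so the fragment compiles outside the DDK.

```c
#ifndef DBG
#define DBG 1                      /* BUILD normally supplies this */
#endif

/* Hypothetical Device Extension holding a pair of sanity counters. */
typedef struct _XX_DEVICE_EXTENSION {
#if DBG
    unsigned long IrpsReceived;    /* bumped as each IRP arrives        */
    unsigned long IrpsCompleted;   /* bumped just before completing one */
#endif
    int Placeholder;               /* stands in for the real fields     */
} XX_DEVICE_EXTENSION;

#if DBG
#define XX_COUNT(pDE, Field) ((pDE)->Field++)
#else
#define XX_COUNT(pDE, Field) ((void)0)
#endif

/* Number of IRPs that arrived but were never completed.
   Always 0 in a free build, where the counters do not exist. */
unsigned long xx_outstanding_irps(const XX_DEVICE_EXTENSION *pDE)
{
#if DBG
    return pDE->IrpsReceived - pDE->IrpsCompleted;
#else
    (void)pDE;
    return 0;
#endif
}
```

A nonzero result from xx_outstanding_irps after the device has gone idle would suggest that some IRP was dropped. The plain increment used here is not safe if the same counter is updated from several CPUs at once, so a real driver would have to take that into account.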

Event bits Another useful technique is to keep a collection of bit flags that
track the occurrence of significant events in your driver. Each bit represents one
specific event, and when that event happens, your driver sets the corresponding
bit. Where sanity counters tell you about global-driver behavior, event bits can
give you an idea of what parts of your code have executed.
One of the decisions you'll have to make is whether to clear the event
variable during DriverEntry, during the Dispatch routine for IRP_MJ_CREATE, or
when you begin processing each new IRP. Each of these options can be useful in
different situations.
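
One possible shape for an event-bit mechanism is sketched below in portable C. The bit names and helper functions are hypothetical; a real driver would choose events that matter for its own device and would keep the variable in its Device Extension.

```c
/* Hypothetical event bits; each names one significant event. */
#define XX_EVT_STARTIO      0x00000001UL
#define XX_EVT_ISR          0x00000002UL
#define XX_EVT_DPC          0x00000004UL
#define XX_EVT_TIMEOUT      0x00000008UL

typedef struct _XX_EVENT_STATE {
    unsigned long EventBits;
} XX_EVENT_STATE;

/* Record that an event happened. */
void xx_set_event(XX_EVENT_STATE *s, unsigned long bit)
{
    s->EventBits |= bit;
}

/* Forget all recorded events (for example, at the start of each IRP). */
void xx_clear_events(XX_EVENT_STATE *s)
{
    s->EventBits = 0;
}

/* Nonzero if every bit in 'mask' has been recorded. */
int xx_events_seen(const XX_EVENT_STATE *s, unsigned long mask)
{
    return (s->EventBits & mask) == mask;
}
```

If, after a hung request, the Start I/O bit is set but the ISR bit is not, that points toward a missing interrupt without requiring a breakpoint that might disturb the timing.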

Trace buffers The problem with event bits and counters is that they don't
give you any idea of the sequence of execution of your code. To get around this
limitation, you can add a simple tracing mechanism that makes entries in a
special buffer as different parts of your driver execute.
Trace buffers can be very useful for tracking down unexpected interactions
in asynchronous or full-duplex drivers. On the downside, this extra information
isn't free. Trace buffers use more CPU time than counters or event bits, and this
could have an effect on time-sensitive bugs.
Implementing a trace buffer mechanism takes a little more work than the
other techniques we've looked at. Here are the basic steps you need to follow:
1. Add trace buffer data structures to your driver. Normally, you should put
   these structures in the Device Extension so you can trace things on a
   device-by-device basis. Every once in a while, you might find some value in a global
   buffer that traces the entire driver.

2. Define a macro to make entries in the trace buffer. As with other pieces of
   debug code, it's a good idea to bracket the trace macro with conditional
   compilation statements.

3. Insert calls to the trace macro at various strategic places in your driver.

4. Write a debugger extension to dump the contents of the trace buffer.

The trace buffer itself is just an array, coupled with a counter that keeps
track of the next free slot. The following code fragment illustrates the structure of
a basic trace buffer.

typedef struct _DEVICE_EXTENSION {
#if DBG
    ULONG TraceCount;
    ULONG TraceBuffer[XX_TRACE_BUFFER_SIZE];
#endif
} DEVICE_EXTENSION, *PDEVICE_EXTENSION;

Again, depending on what you're looking for, you can initialize the
TraceCount field once in your DriverEntry routine, each time you get an
IRP_MJ_CREATE request, or with each new IRP.
Adding entries to the buffer is just a matter of storing an item in the array
and incrementing the counter.¹¹ This code fragment shows how to implement a
basic trace macro.

#if DBG
#define XXTRACE(pDE, Tag)                                   \
    if ((pDE)->TraceCount >= XX_TRACE_BUFFER_SIZE)          \
        (pDE)->TraceCount = 0;                              \
    (pDE)->TraceBuffer[(pDE)->TraceCount++] =               \
        (ULONG)(Tag);
#else
#define XXTRACE(pDE, Tag) while (FALSE) {}
#endif

¹¹ If you have a large enough trace buffer and an accurate idea of how many events will be traced,
you can save some time by eliminating the test for a full buffer. This is a very dangerous
optimization, so use it with care.

Notice that this implementation ignores all the synchronization issues that
arise when you call XXTRACE from multiple IRQL levels (potentially on multiple
CPUs). Since the whole purpose of using trace buffers is to catch errors that are
sensitive to timing, putting synchronization mechanisms into XXTRACE would
probably make it useless. So, just how do you prevent the trace macro from
trashing itself?
One solution is to call XXTRACE only from places in your driver where
synchronization won't be a problem. For example, if you call XXTRACE from DPC
routines, synchronization is already being handled as part of the larger structure of
the driver itself. Similarly, if you call it from an ISR and a SyncCritSection routine,
synchronization is already guaranteed. If you can't live with these restrictions,
you'll have to add explicit synchronization to XXTRACE.
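
To see how the wraparound behaves, here is a user-mode sketch of the same circular-buffer logic together with a simple dump routine of the kind a debugger extension might provide. Everything named with an XX_ or xx_ prefix is hypothetical, unsigned long stands in for ULONG, and the buffer is deliberately tiny. Unlike the XXTRACE fragment above, this variant wraps the macro body in do/while(0) so that it acts as a single statement; that is a design choice of this sketch, not something the original code does.

```c
#include <stdio.h>

#define XX_TRACE_BUFFER_SIZE 4      /* tiny, to make wraparound visible */

typedef struct _XX_TRACE {
    unsigned long TraceCount;                       /* next free slot */
    unsigned long TraceBuffer[XX_TRACE_BUFFER_SIZE];
} XX_TRACE;

/* Same logic as XXTRACE, wrapped in do/while(0) so the macro
   behaves as a single statement inside if/else bodies. */
#define XX_TRACE_ADD(pT, Tag)                               \
    do {                                                    \
        if ((pT)->TraceCount >= XX_TRACE_BUFFER_SIZE)       \
            (pT)->TraceCount = 0;                           \
        (pT)->TraceBuffer[(pT)->TraceCount++] =             \
            (unsigned long)(Tag);                           \
    } while (0)

/* Print every slot, marking the slot that TraceCount points at.
   Returns the number of slots printed. */
int xx_dump_trace(const XX_TRACE *pT)
{
    int i;
    for (i = 0; i < XX_TRACE_BUFFER_SIZE; i++) {
        printf("%c[%d] %08lx\n",
               ((unsigned long)i == pT->TraceCount) ? '>' : ' ',
               i, pT->TraceBuffer[i]);
    }
    return XX_TRACE_BUFFER_SIZE;
}
```

After six entries are traced into this four-slot buffer, the two oldest entries have been overwritten and TraceCount points at slot 2, so the entries from that slot onward are the oldest ones still present. Like the original macro, this sketch does no synchronization.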

17.10 SUMMARY
When you write a driver, very few limits are placed on what you can do to the
system. With all this power comes the heavy burden of making sure that your
driver doesn't compromise system integrity. You need to correct not only overt,
catastrophic errors, but also subtle problems that may over time damage the
system. This chapter has presented some techniques you can use to diagnose and
eliminate bugs, both early in the development cycle, and later when the driver is
out in the world.
But suppose bugs aren't the problem. Suppose the driver works, but it just isn't
fast enough. The next chapter examines the important area of driver performance.

CHAPTER 18

Driver Performance

There's a certain feverish look - a kind of glassy
stare - that comes into the eyes of a programmer about to start tuning a piece of
code. You can almost hear their thoughts: "If I just squeeze out a few cycles here
and there, make this loop a little tighter, optimize the code by hand, maybe even
use some assembly language . . . " Through some kind of magic, everything will run
twice as fast.
Unfortunately, the results seldom meet these expectations, and after a lot of
effort, the code runs only a few percent faster. The problem is that no amount of
optimization or tuning will make up for an inherently slow design. Performance
is something you have to think about all the way through the development cycle.
If you've done that, then you can use the techniques described in this chapter to
verify that your driver meets its performance goals.

18.1 GENERAL GUIDELINES
Acceptable driver performance can mean different things in different situations.
As a result, the guidelines given in this section are necessarily a little fuzzy.
Hopefully, they'll act as a springboard for your own thinking on the subject.

Know Where You're Going
You have to know where you're trying to go or else you won't know when
you've gotten there. In the case of driver tuning, this means you should have
some specific performance targets in mind when you start. These targets can be
the result of a number of things:

• The device itself may have some timing needs. For example, it might
  need to be serviced within a certain minimum interval, or it may generate
  data at some particular rate. Understanding your device and how it will
  be used are important factors in setting performance targets.

• Application programs may have expectations of how quickly the device
  will respond, or how many transactions per second it should be able to
  handle.

• The user's perception may be the determining factor in choosing
  performance targets. The drivers of video cards, sound boards, and even pointing
  devices are judged by how they feel to the user more than anything else.

Very early in the design process, formulate your performance goals in the
most concrete terms possible. Come up with numbers if you can. Then look at
your overall driver design and see where these performance needs will have the
biggest impact.

Get to Know the Hardware
Learn as much as you can about the hardware your driver is managing.
Does it have any weird quirks that might impact driver performance? Are there
any specific sequences of operations that make things go faster or slower? Are
you making the most of any built-in processing capabilities of the device itself? If
you're working with a multiunit controller, does it support overlapped operations
on several devices at the same time? The more you know about the hardware
you're driving, the better you'll be able to see what your options are.

Explore Creative Driver Designs
Some of the most powerful optimizations come, not from tweaking code,
but from looking for a whole different approach to the problem. NT has a very
well-defined driver architecture, but it may not always be suitable for what you're
trying to do.
For example, look at the way video and display drivers work. Display speed
would be abysmal if Win32 went all the way through the I/O Manager every time
it touched the video hardware, so the drivers use a nonstandard architecture. In
some cases, it may make sense to map device registers or device memory into
user space if that's the only way to achieve acceptable performance. Real-time
device control might demand this kind of design.
The mouse class and port drivers provide another example of nonstandard
interfaces. In this case, the class driver gives the mouse port driver a pointer to a
function that it should call when mouse events arrive. This allows the port driver
to pass data using a common buffer and greatly reduces the system's overhead in
processing large numbers of events.
The downside of all this is that you may end up compromising system
integrity. Don't abandon the standard NT driver architecture right off the bat, but
if it's clear that nothing else will give you good performance, go for it.

Optimize Code Creatively
This is where everyone wants to look first, when in fact it's probably the last
place to focus your attention. It's worth repeating that no amount of clever
optimization will make up for an inherently bad design. If you do need to squeeze
more performance out of your code, here are some things to think about.
First, be very clear about what you're trying to accomplish. Your goal should
be to find new ways of doing things, not just ways to tweak existing code. Most
decent C compilers do a wonderful job of tweaking code. Your advantage as a
human is that you know the context in which the code will run. This allows you to
look for entirely different ways of accomplishing a particular task. Don't waste
this gift by turning yourself into a glorified peep-hole optimizer.
Also, focus your attention on the relatively small areas of code that really
determine overall performance. It's often the case that one or two tiny
subroutines, comprising maybe 10 percent of your overall driver, will be the gate that
controls the speed of the driver. Try to find those hot spots or critical code paths
and make them as fast as possible. The code paths through your driver's most
frequently executed operations are a good place to look.
Finally, don't assume that an optimization will have the same impact on all
NT platforms. Some kinds of optimizations may work only on a specific type of
CPU. If you plan to support your driver on more than one CPU or bus
architecture, be sure that improvements work equally well everywhere. At the very least,
make certain that an optimization on one configuration doesn't degrade
performance anywhere else.

Measure Everything You Do
Concrete measurement forms the basis of all good science. It's amazing how
much faster a piece of code can seem just because you've put several hours of
work into optimizing it. Don't get caught in the trap of wishful thinking; measure
the impact of everything you do. If you don't have any quantitative data to go by,
you won't know if you're helping or hurting.
Later in this chapter, you'll see one way to analyze a driver's behavior using
the PERFMON utility. You can also measure the speed of specific routines using
the profiling timer available in NT. The only limitation is that this counter's
resolution on 80x86 machines is only one microsecond, and on a 100 MHz Pentium, a
lot of instructions can flow by in that time.
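
One common way to cope with a coarse timer is to time many repetitions of a routine and divide by the repetition count. The user-mode sketch below illustrates the idea with the standard C clock function rather than NT's profiling timer, so it shows the technique only; all the xx_ names are hypothetical. It also includes the arithmetic behind the remark above: at 100 MHz, a one-microsecond tick spans 100 clock cycles.

```c
#include <time.h>

/* Time 'iterations' calls of fn and return the average cost of
   one call in microseconds. Running many iterations spreads the
   timer's coarse resolution over a large number of calls. */
double xx_average_usec(void (*fn)(void), unsigned long iterations)
{
    clock_t start, end;
    unsigned long i;

    if (iterations == 0) {
        return 0.0;
    }
    start = clock();
    for (i = 0; i < iterations; i++) {
        fn();
    }
    end = clock();

    return ((double)(end - start) * 1000000.0 / CLOCKS_PER_SEC)
           / (double)iterations;
}

/* Clock cycles that elapse during one timer tick. */
double xx_cycles_per_tick(double cpu_mhz, double tick_usec)
{
    return cpu_mhz * tick_usec;     /* MHz * usec = cycles */
}

/* Hypothetical routine under test. */
static volatile unsigned long xx_sink;
void xx_routine_under_test(void)
{
    xx_sink += 1;
}
```

The average includes the overhead of the loop and the function call itself, so for very short routines it is best treated as an upper bound and compared against a run of an empty routine.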


1 8.2 PERFORMANCE MONITORING IN WINDOWS NT
One of your options for observing a driver's behavior is to tie into NT's
performance monitoring system. The advantage of this technique is that you or anyone
else can use the PERFMON utility to collect and display data about your driver.
This section presents the overall architecture of NT's performance monitor system.

Some Terminology
Like other parts of NT, the performance system uses an object-based model
to describe its operation. Before we look at the actual steps involved in using the
performance system, it's a good idea to define some of the terms appearing in the
discussion.

Performance object This is any object that makes performance data
available through the Registry. System components, drivers, and services can all
export various performance objects. For example, the system exposes objects like
memory and CPU, and drivers can expose separate performance objects for each
device they support.

Performance counter Data about a given performance object takes the
form of counters. Although the name seems to imply the summing of discrete
events, these counters can actually represent a wide variety of measurements: an
absolute number of events, a rate of occurrence, a ratio of quantities, the average
availability of a resource, and so forth. For example, NT's Memory object exposes
counters representing the number of available bytes and the number of page
faults per second.

Object instance There may be more than one instance of some kinds of
objects on the system. For example, there can be several CPUs and several disk
drives. To distinguish among members of a set of identical objects, performance
monitoring components usually represent these objects as separate instances of
the object type. CPU performance data would show up as information about
CPU0, CPU1, CPU2, and so on.

Counter instance When a performance object supports multiple object
instances, each instance will have its own complete set of counters. Referring back
to the CPU object, there are separate interrupt rate counters for each CPU object
instance.

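To make the vocabulary above a little more concrete, the following sketch models one performance object with several instances, each carrying its own counter, and shows how a rate counter could be derived from two raw samples. This is purely an illustration: the structure and function names are hypothetical, and the layout has nothing to do with the data format NT actually uses for performance data.

```c
#define XX_MAX_INSTANCES 4

/* One raw sample of a hypothetical "interrupts" counter. */
typedef struct _XX_COUNTER_SAMPLE {
    unsigned long Count;        /* running total of events   */
    unsigned long TimeMs;       /* when the sample was taken */
} XX_COUNTER_SAMPLE;

/* A performance object with one counter per object instance. */
typedef struct _XX_PERF_OBJECT {
    const char *Name;                               /* e.g. "CPU"      */
    int InstanceCount;                              /* CPU0, CPU1, ... */
    XX_COUNTER_SAMPLE Interrupts[XX_MAX_INSTANCES]; /* one per instance */
} XX_PERF_OBJECT;

/* Turn two raw samples of one counter instance into a rate
   (events per second). Returns 0 if no time has elapsed. */
double xx_events_per_second(const XX_COUNTER_SAMPLE *before,
                            const XX_COUNTER_SAMPLE *after)
{
    unsigned long elapsedMs = after->TimeMs - before->TimeMs;

    if (elapsedMs == 0) {
        return 0.0;
    }
    return (double)(after->Count - before->Count) * 1000.0
           / (double)elapsedMs;
}
```

For example, a counter instance that goes from 1000 to 1500 events over a two-second interval corresponds to a rate of 250 events per second.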
How Performance Monitoring Works
Windows NT provides a common set of interfaces that drivers and
application programs can use if they want to participate in performance monitoring
operations. Figure 18.1 shows how these interfaces work.

Figure 18.1 NT performance monitoring components. (Diagram labels: PERFMON App,
Win32 Registry API, Data Collection DLL, File Mapping Object, User-mode Driver,
DeviceIoControl, Kernel-mode Driver. Copyright © 1994 by Cydonix Corporation.)

The following describes what happens when you run the PERFMON utility
(located in the Administrative Tools program group). The process would be the
same for any application program curious about system performance data.

1. The PERFMON utility uses the Win32 RegQueryValueEx function to access
   the HKEY_PERFORMANCE_DATA key.

2. The Registry API scans HKEY_LOCAL_MACHINE\...\Services for drivers
   and services with a Performance subkey. Having this subkey marks a driver
   or service as a performance monitoring component. Values contained in the
   Performance subkey identify a data-collection DLL that acts as an interface
   between the Registry API and the objects being monitored.

3. The Registry API maps these interface DLLs into the process requesting
   performance data. It then calls the Open and Collect functions in each DLL
   to determine what objects and counters the DLL supports.

4. Each time PERFMON wants updated performance information, it calls
   RegQueryValueEx again. This results in calls to the Collect function in
   each performance component's data-collection DLL. The Collect function
   gets a raw data sample from the object being monitored and sends it back
   to PERFMON.

5. When PERFMON closes the HKEY_PERFORMANCE_DATA key with RegCloseKey,
   the Registry API calls the DLL's Close function to do any necessary
   cleanup. It then unmaps the DLL from the process.
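
The open, collect, and close sequence in the steps above can be modeled in a few lines of portable C. This is only a sketch of the calling pattern, not the Registry API itself: the SketchCollector type, sketch_run_session, and the toy_ functions are all hypothetical names invented for illustration, and the toy collector simply hands back a number that grows by 10 on each call.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for a DLL's three entry points. */
typedef unsigned long (*sketch_open_fn)(void);
typedef unsigned long (*sketch_collect_fn)(unsigned long *sample);
typedef unsigned long (*sketch_close_fn)(void);

typedef struct {
    sketch_open_fn    open;     /* may be NULL (optional) */
    sketch_collect_fn collect;  /* required */
    sketch_close_fn   close;    /* may be NULL (optional) */
} SketchCollector;

/*
 * Plays the role of the Registry API: open once, collect once per
 * query, close once. Returns 0 on success or the first nonzero status.
 */
unsigned long sketch_run_session(const SketchCollector *c,
                                 unsigned long *samples,
                                 size_t nqueries)
{
    unsigned long status;
    size_t i;

    assert(c != NULL && c->collect != NULL && samples != NULL);

    status = c->open ? c->open() : 0;
    if (status != 0)
        return status;

    for (i = 0; i < nqueries; i++) {
        status = c->collect(&samples[i]);
        if (status != 0)
            break;
    }

    if (c->close)
        c->close();
    return status;
}

/* A toy collector whose "counter" grows by 10 on every sample. */
unsigned long toy_opens, toy_closes, toy_counter;

unsigned long toy_open(void)  { toy_opens++; toy_counter = 0; return 0; }
unsigned long toy_close(void) { toy_closes++; return 0; }
unsigned long toy_collect(unsigned long *sample)
{
    toy_counter += 10;
    *sample = toy_counter;
    return 0;
}
```

Passing a SketchCollector built from the three toy_ functions to sketch_run_session with three queries calls toy_open once, toy_collect three times, and toy_close once.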


You can see from this description that performance information isn't actu­
ally stored in the Registry in the same way that hardware or software configura­
tion data is. Rather, the Win32 Registry API calls gather performance data at the
time someone asks for it.

How Drivers Export Performance Data
Drivers that support monitoring have to maintain performance data about
themselves. They make this data available to their data-collection DLL using
either of two different techniques:
• IOCTLs - Kernel-mode drivers make their performance data available
  through a privately defined IOCTL function.
• File Mapping objects - User-mode drivers expose performance data
  through a File Mapping object (i.e., shared memory) that has a
  well-known name.

The example appearing later in this chapter shows how to implement a
data-collection DLL for a kernel-mode driver. A similar example in the NT DDK
illustrates how to set up monitoring for a user-mode system component.

18.3 ADDING COUNTER NAMES TO THE REGISTRY
One of the goals of NT's performance monitoring architecture was to make the
display names of performance objects and counters independent of any particular
national language. If you have the American version of NT installed, performance
monitoring tools should display counter names in English, while the French
version of NT should use French names.
To accomplish this, both the data-collection DLL and the PERFMON utility
refer to performance objects and counters using index numbers rather than
names. These index numbers are assigned when a driver is installed on a given
machine, and they are globally unique on that system. These object and counter
indexes are stored in the Registry along with their corresponding display names.
Tools like PERFMON use this area of the Registry to convert an object or counter
index into text. A similar mechanism allows PERFMON to display help text (in
the appropriate language) about a given counter.

Counter Definitions in the Registry
As you can see from Figure 18.2, individual counter definitions are stored
under the Perflib key, grouped according to their language ID. This scheme
allows you to support counter names and help text in multiple languages without
having to modify your driver.


Figure 18.2 Counter definition area in the Registry
(Diagram: HKEY_LOCAL_MACHINE\Software\Microsoft\Windows NT\CurrentVersion\Perflib
holds the Last Counter and Last Help values, plus one subkey per language ID,
such as American English and other languages, each containing Counters and
Help values.)

Look at Table 18.1 for a more detailed view of the individual Registry
entries. As you can see, each performance object or counter is coupled with a
unique, even integer. These pairs are stored under the Counters subkey for each
language. Help text for a given counter has an odd-numbered index one greater
than the index for the counter itself. Help text definitions are stored under the
Help subkey of each language.
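
As a rough illustration of the index/text pairing described above, the following sketch searches a buffer laid out the way a REG_MULTI_SZ value is: a run of NUL-terminated strings ending with an empty string. It assumes plain char strings for brevity (the Registry data itself is Unicode), and sketch_find_title is a hypothetical helper name, not part of any NT API.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Returns the text paired with `index` in a buffer laid out as
 * index\0text\0index\0text\0...\0\0, or NULL if the index is absent.
 */
const char *sketch_find_title(const char *multi_sz, unsigned long index)
{
    const char *p = multi_sz;

    assert(multi_sz != NULL);

    while (*p != '\0') {
        const char *text = p + strlen(p) + 1;   /* string after the index */

        if (*text == '\0')
            break;                              /* index with no text: stop */
        if (strtoul(p, NULL, 10) == index)
            return text;
        p = text + strlen(text) + 1;            /* move to the next pair */
    }
    return NULL;
}
```

Given a buffer holding the pairs 2/System, 4/Memory, and 6/% Processor time, looking up index 4 yields "Memory". The matching help text would be found the same way in the Help value under index 5.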
Table 18.1 Registry entries that define counter names and help text

Perflib Registry entries
| Entry | Contents | Example |
|---|---|---|
| \nnn | Names and help text for a specific language ID | 009 is the language ID for American English |
| \nnn\Counters | REG_MULTI_SZ string composed of index / name | 2\0System\0 4\0Memory\0 6\0% Processor time\0\0 |
| \nnn\Help | REG_MULTI_SZ string composed of index / help text | 3\0The System object type...\0 5\0The Memory object type...\0 7\0% Processor time is...\0\0 |
| Last Counter | Highest assigned name index | 0x330 |
| Last Help | Highest assigned help index | 0x331 |

Although you could do something disgusting such as using REGEDT32 to
add your counter definitions to the Registry, there is an easier way. The NT DDK
contains two utilities, LODCTR and UNLODCTR, that add and remove counter
definitions for you. In order to add counters with LODCTR, you need to do the
following:
1. Write a LODCTR command file.
2. Write a counter-offset header file.
3. Add a subkey called Performance to your driver's Registry service key.
4. Run the LODCTR utility to install the counter definitions.

Writing LODCTR Command Files
To use the LODCTR utility, you first need to write a command file describ­
ing the objects, counters, and help text you want to add to the Registry. The com­
mand file is divided into three sections and can contain the keywords listed in
Table 18.2.

Table 18.2 Section names and keywords in a LODCTR command file

LODCTR command file
| Section | Keywords | Description |
|---|---|---|
| [info] | DRIVERNAME=DriverName | Name, if driver |
| | APPLICATIONNAME=ProgName | Name, if service |
| | SYMBOLFILE=FileName.H | Counter-offset definition file |
| [languages] | langid=LanguageName | IDs of languages in this file (LanguageName is ignored) |
| [text] | symbol_langid_NAME=Name text | Name of one object or counter |
| | symbol_langid_HELP=Help text | Single line of explanatory text |

The LODCTR utility uses the Win32 profile functions to parse its command
file, so it should come as no surprise that these files usually have the extension
INI. Let's look at an example of the command and header files needed to define
some performance counters.

COUNTERS.INI The following example of a LODCTR command file adds
one object with two counters to the Registry. It supports only American English
versions of the counters.

    [info]
    drivername=XXDRIVER
    symbolfile=COUNTERS.H

    [languages]
    009=English

    [text]
    XXDEVICE_009_NAME=XX Device
    XXDEVICE_009_HELP=The XX Device does whatever it does.
    INTERRUPTS_009_NAME=Interrupts/sec
    INTERRUPTS_009_HELP=Measures the interrupt rate.
    OPERATIONS_009_NAME=Operations/sec
    OPERATIONS_009_HELP=Measures device activity.

COUNTERS.H You also need to write a header file containing the relative
index values of each object and counter that you plan to add to the Registry. This
header file defines relative offsets for the XXDEVICE object and its two counters.

    #define XXDEVICE    0
    #define INTERRUPTS  2
    #define OPERATIONS  4

These indexes must be even numbers starting at zero. The names in the
header file have to match the names in the [text] section of the LODCTR com­
mand file, and they are case-sensitive. This header file will also be included in
your data-collection DLL.

Using LODCTR and UNLODCTR
To add your counter names to the Registry, run LODCTR from the command
line and give it the name of the command file, like this:

    LODCTR COUNTERS.INI

When you run LODCTR, it uses the Last Counter and Last Help values in
the Perflib Registry key to assign absolute index numbers to your objects,
counters, and help text items. It also stores the first and last counter and help
indexes assigned to your driver in the Performance subkey of the driver's
Registry service key.
A single command file can contain object and counter definitions in more
than one language. However, LODCTR will only install counter definitions for
language IDs already listed under the Perflib Registry key.
To remove all the objects, counters, and help text associated with a particular
driver or service, run the UNLODCTR utility. Its only argument is the name of the
driver or service that you specified in the [info] section of the INI file.

UNLODCTR XXDRIVER
If you want to modify the object and counter names associated with a par­
ticular driver, you have to remove the existing counter definitions for the
driver with UNLODCTR and run LODCTR again. LODCTR performs only
minimal error checking, and if you run it twice for the same driver, the results
are unpredictable.


18.4 THE FORMAT OF PERFORMANCE DATA
When the Registry API calls your data-collection DLL, it expects you to return
counter information in a very specific format. This data format is one of the more
Byzantine things in NT, so it deserves a little motivating explanation.
Along with the goal of language-independent object and counter names, the
NT architects also wanted to make performance data totally self-descriptive. This
means that programs like PERFMON should be able to process and display a
block of performance data using only the contents of the block itself. This open­
ended, extensible architecture allows standard tools to monitor objects that they
know nothing about.
Unfortunately, data that's totally self-descriptive is also very complicated.
The following subsections describe the Registry's performance data format.

Overall Structure of Performance Data
Figure 18.3 illustrates the overall structure of the information returned by
your data-collection DLL. For each performance object in the DLL, you have to
provide

• Information about the object itself
• Definitions for each counter the object exposes
• A header for all the counter data
• A block containing the counters themselves

Figure 18.3 Structure of performance data for objects with single instances
(Diagram: the returned data is a sequence of object types 1 through N. Each
object type consists of a PERF_OBJECT_TYPE, then PERF_COUNTER_DEFINITION
1 through M, then a PERF_COUNTER_BLOCK followed by counters 1 through M.)


The following subsections describe these structures in more detail. You can
find additional information in the WINPERF.H header file that comes with the
Win32 SDK.[1]

[1] This header also contains a great deal of descriptive commentary. I
recommend reading it if you're going to be working with the performance
subsystem.

PERF_OBJECT_TYPE This structure acts as a header for information
about a single object type. You must provide one of these structures for each object
being exposed by your performance DLL. Table 18.3 lists the fields in this structure.

Table 18.3 Contents of a PERF_OBJECT_TYPE structure

PERF_OBJECT_TYPE, *PPERF_OBJECT_TYPE
| Field | Contents |
|---|---|
| DWORD TotalByteLength | sizeof( PERF_OBJECT_TYPE ) + NumCounters * sizeof( PERF_COUNTER_DEFINITION ) + sizeof( PERF_COUNTER_BLOCK ) + sizeof( allCounters ) |
| DWORD DefinitionLength | sizeof( PERF_OBJECT_TYPE ) + NumCounters * sizeof( PERF_COUNTER_DEFINITION ) |
| DWORD HeaderLength | sizeof( PERF_OBJECT_TYPE ) |
| DWORD ObjectNameTitleIndex | Index of this object's name in the title database |
| LPWSTR ObjectNameTitle | NULL |
| DWORD ObjectHelpTitleIndex | Index of object's description in the help database |
| LPWSTR ObjectHelpTitle | NULL |
| DWORD DetailLevel | Complexity level of information: PERF_DETAIL_NOVICE, PERF_DETAIL_ADVANCED, PERF_DETAIL_EXPERT, or PERF_DETAIL_WIZARD |
| DWORD NumCounters | Number of counters in each counter block |
| DWORD DefaultCounter | Default to display, or -1 |
| DWORD NumInstances | Number of instances of this object, or -1 if no separate instances |
| DWORD CodePage | 0 for drivers |
| LARGE_INTEGER PerfTime | Current value, in counts, of the high-resolution performance counter |
| LARGE_INTEGER PerfFreq | Current frequency, in counts per second, of the high-resolution performance counter |

Table 18.4 Contents of a PERF_COUNTER_DEFINITION structure

PERF_COUNTER_DEFINITION, *PPERF_COUNTER_DEFINITION
| Field | Contents |
|---|---|
| DWORD ByteLength | sizeof( PERF_COUNTER_DEFINITION ) |
| DWORD CounterNameTitleIndex | Index of this counter's name in the title database |
| LPWSTR CounterNameTitle | NULL |
| DWORD CounterHelpTitleIndex | Index of this counter's description in the help database |
| LPWSTR CounterHelpTitle | NULL |
| DWORD DefaultScale | Scaling factor for display, expressed as a power of 10 |
| DWORD DetailLevel | Complexity level of information: PERF_DETAIL_NOVICE, PERF_DETAIL_ADVANCED, PERF_DETAIL_EXPERT, or PERF_DETAIL_WIZARD |
| DWORD CounterType | (See below) |
| DWORD CounterSize | Size of counter in bytes |
| DWORD CounterOffset | Offset from start of PERF_COUNTER_BLOCK structure to the first byte of this counter |

PERF_COUNTER_DEFINITION You must supply a separate counter definition
for each counter in your DLL. This block (described in Table 18.4) pinpoints
the size and position of the counter data itself, as well as defining the type of
information the counter represents.
PERF_COUNTER_BLOCK This block (described in Table 18.5) is simply a
header for all the raw counter data itself. The counters come immediately after it.

Table 18.5 Contents of a PERF_COUNTER_BLOCK structure

PERF_COUNTER_BLOCK, *PPERF_COUNTER_BLOCK
| Field | Contents |
|---|---|
| DWORD ByteLength | sizeof( PERF_COUNTER_BLOCK ) + sizeof( allCounters ) |
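
The length and offset fields in Tables 18.3 through 18.5 are easier to follow with concrete types. The sketch below uses cut-down stand-in structures (the Sketch... names are hypothetical, and they carry only the length-related fields rather than the full WINPERF.H layouts) to show how the values relate for one object with two 32-bit counters.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Cut-down stand-ins: only the fields needed to show the arithmetic. */
typedef struct {
    uint32_t TotalByteLength;
    uint32_t DefinitionLength;
    uint32_t HeaderLength;
    uint32_t NumCounters;
} SketchObjectType;

typedef struct {
    uint32_t ByteLength;
    uint32_t CounterSize;
    uint32_t CounterOffset;
} SketchCounterDef;

typedef struct {
    uint32_t ByteLength;
} SketchCounterBlock;

/* Static part: object header followed by its counter definitions. */
typedef struct {
    SketchObjectType Object;
    SketchCounterDef Interrupts;
    SketchCounterDef Operations;
} SketchHeader;

/* Dynamic part: counter block header followed by the raw counters. */
typedef struct {
    SketchCounterBlock Block;
    uint32_t           Interrupts;
    uint32_t           Operations;
} SketchData;

void sketch_fill_lengths(SketchHeader *h, SketchData *d)
{
    assert(h != NULL && d != NULL);

    h->Object.HeaderLength     = (uint32_t)sizeof(SketchObjectType);
    h->Object.DefinitionLength = (uint32_t)sizeof(SketchHeader);
    h->Object.TotalByteLength  = (uint32_t)(sizeof(SketchHeader) +
                                            sizeof(SketchData));
    h->Object.NumCounters      = 2;

    h->Interrupts.ByteLength    = (uint32_t)sizeof(SketchCounterDef);
    h->Interrupts.CounterSize   = (uint32_t)sizeof(uint32_t);
    h->Interrupts.CounterOffset = (uint32_t)offsetof(SketchData, Interrupts);

    h->Operations.ByteLength    = (uint32_t)sizeof(SketchCounterDef);
    h->Operations.CounterSize   = (uint32_t)sizeof(uint32_t);
    h->Operations.CounterOffset = (uint32_t)offsetof(SketchData, Operations);

    /* The block length covers its own header plus all the counters. */
    d->Block.ByteLength = (uint32_t)sizeof(SketchData);
}
```

Note that CounterOffset is measured from the start of the counter block, not from the start of the object, so the first counter's offset equals the size of the block header.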

Types of Counters
The CounterType field of the counter definition block specifies the kind of
information represented by the counter. WINPERF.H contains a number of
predefined types, most of which are listed in Table 18.6. Your choice of a counter
type will determine not only the data you have to supply, but also how the
Performance Monitor displays that data.

Table 18.6 Use these values for the CounterType field of a PERF_COUNTER_DEFINITION

Predefined CounterType values
| Counter type | Description | Suffix |
|---|---|---|
| PERF_COUNTER_COUNTER | 32-bit event rate; LiCount / LiTime | /sec |
| PERF_COUNTER_TIMER | 64-bit Timer; LiCount / LiTime | % |
| PERF_COUNTER_QUEUELEN_TYPE | Average queue length; LiCount / LiTime | |
| PERF_COUNTER_BULK_COUNT | 64-bit event rate; LiCount / LiTime | /sec |
| PERF_COUNTER_TEXT | Unicode text | |
| PERF_COUNTER_RAWCOUNT | 32-bit counter; no time averaging | |
| PERF_SAMPLE_FRACTION | % Busy counter numerator; 1 or 0 on each sampling interrupt; LiCount / LiTime | % |
| PERF_SAMPLE_BASE | % Busy counter denominator; directly follows numerator counter | |
| PERF_SAMPLE_COUNTER | Sampled counter; LiCount / LiTime | |
| PERF_COUNTER_NODATA | Label only; no data | |
| PERF_COUNTER_TIMER_INV | 64-bit Timer inverse; measure % idle but display % busy; 100 - (LiCount / LiTime) | % |
| PERF_AVERAGE_BULK | A bulk count which, when divided (typically) by the number of operations, gives (typically) the number of bytes per operation; Count / Base | |
| PERF_AVERAGE_TIMER | A timer which, when divided by an average base, produces a time in seconds which is the average time of some operation. This timer times total operations, and the base is the number of operations; Timer / Base | sec |
| PERF_AVERAGE_BASE | Denominator of time or count averages; directly follows numerator counter | |
| PERF_100NSEC_TIMER | 64-bit Timer in 100 nsec units; LiCount / LiTime | % |
| PERF_100NSEC_TIMER_INV | 64-bit Timer inverse; 100 - (LiCount / LiTime) | % |
| PERF_COUNTER_MULTI_TIMER | 64-bit multi-instance Timer; LiCount / LiTime; result can exceed 100% | % |
| PERF_COUNTER_MULTI_TIMER_INV | 64-bit multi-instance Timer inverse; 100 * _MULTI_BASE - (LiCount / LiTime); result can exceed 100%; followed by a _MULTI_BASE counter | % |
| PERF_COUNTER_MULTI_BASE | Number of instances to which the preceding _MULTI_..._INV counter applies | |
| PERF_100NSEC_MULTI_TIMER | 64-bit multi-instance 100 nsec Timer; LiCount / LiTime; result can exceed 100% | % |
| PERF_100NSEC_MULTI_TIMER_INV | 64-bit Timer inverse; 100 * _MULTI_BASE - (LiCount / LiTime); result can exceed 100%; followed by a _MULTI_BASE counter | % |
| PERF_RAW_FRACTION | Counter is a fraction of base; no time averaging; Count / Base | % |
| PERF_RAW_BASE | Base for the preceding counter | |
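
To make the "LiCount / LiTime" notation concrete, here is a hedged sketch of how a monitoring tool might turn two raw samples of a 32-bit event-rate counter into a displayed events-per-second figure. The SketchSample type and sketch_events_per_second are hypothetical names; the time stamps are assumed to be the PerfTime values returned with each sample, and the frequency the PerfFreq value.

```c
#include <assert.h>
#include <stdint.h>

typedef struct {
    uint32_t count;      /* raw 32-bit event count from the counter block */
    int64_t  perf_time;  /* high-resolution time stamp taken with the sample */
} SketchSample;

/* Rate = change in count divided by elapsed seconds between samples. */
double sketch_events_per_second(SketchSample older,
                                SketchSample newer,
                                int64_t perf_freq)
{
    uint32_t delta_count;
    double   delta_seconds;

    assert(perf_freq > 0 && newer.perf_time > older.perf_time);

    /* Unsigned subtraction still works if the counter wrapped once. */
    delta_count   = newer.count - older.count;
    delta_seconds = (double)(newer.perf_time - older.perf_time) /
                    (double)perf_freq;

    return (double)delta_count / delta_seconds;
}
```

This is why the DLL only ever reports raw, ever-increasing counts for a rate counter: the subtraction and division happen in the display tool.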

Objects with Multiple Instances
If your data-collection DLL reports data separately for each instance of an
object, you need to use a slightly different data format. As you can see from Figure
18.4, the main change is that you have to supply a name for each object instance
and separate instances of each counter.


Figure 18.4 Modified structure of performance data for objects with multiple instances
(Diagram: a PERF_OBJECT_TYPE is followed by PERF_COUNTER_DEFINITION
1 through M and then by instances 1 through P. Each instance consists of a
PERF_INSTANCE_DEFINITION, a Unicode instance name, and a
PERF_COUNTER_BLOCK followed by counters 1 through M.)

You need to calculate slightly different values for two fields in the
PERF_OBJECT_TYPE if you're using multiple object instances. Table 18.7 lists
these changes.
The other new item for multi-instance objects is a block that describes each
object instance. See Table 18.8 for the contents of this block. Notice that you can
identify an instance either by a Unicode name or by a number. If you use a name,
the name string immediately follows the instance definition block. Keep in mind
that, since this Unicode name string is embedded in the data, it won't be trans­
lated into the local language.

Table 18.7 These fields of PERF_OBJECT_TYPE are different for multi-instance objects

PERF_OBJECT_TYPE fields
| Field | Contents |
|---|---|
| TotalByteLength | sizeof( PERF_OBJECT_TYPE ) + NumCounters * sizeof( PERF_COUNTER_DEFINITION ) + NumInstances * sizeof( PERF_INSTANCE_DEFINITION ) + sizeof( allInstanceNames ) + NumInstances * sizeof( PERF_COUNTER_BLOCK ) + NumInstances * sizeof( allCounters ) |
| NumInstances | Value ≠ -1 |

Table 18.8 Contents of a PERF_INSTANCE_DEFINITION structure

PERF_INSTANCE_DEFINITION, *PPERF_INSTANCE_DEFINITION
| Field | Contents |
|---|---|
| DWORD ByteLength | sizeof( PERF_INSTANCE_DEFINITION ) + sizeof( InstanceNameString ) |
| DWORD ParentObjectTitleIndex | Index in the title database of object type which is this object's parent, or 0 if no hierarchy |
| DWORD ParentObjectInstance | Index, starting at 0, into the instances being reported for the parent object type |
| DWORD UniqueID | Zero-based numerical identifier used in place of a name; PERF_NO_UNIQUE_ID if none |
| DWORD NameOffset | sizeof( PERF_INSTANCE_DEFINITION ) |
| DWORD NameLength | sizeof( InstanceNameString ) or 0 if no name |
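
The multi-instance TotalByteLength formula can be written out as a small function. This is only a sketch of the arithmetic: the structure sizes are passed in as parameters rather than taken from WINPERF.H, and SketchSizes and sketch_multi_instance_total are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Sizes of the four fixed structures, supplied by the caller. */
typedef struct {
    size_t object_type;          /* sizeof( PERF_OBJECT_TYPE ) */
    size_t counter_definition;   /* sizeof( PERF_COUNTER_DEFINITION ) */
    size_t instance_definition;  /* sizeof( PERF_INSTANCE_DEFINITION ) */
    size_t counter_block;        /* sizeof( PERF_COUNTER_BLOCK ) */
} SketchSizes;

size_t sketch_multi_instance_total(const SketchSizes *sz,
                                   size_t num_counters,
                                   size_t num_instances,
                                   size_t all_names_bytes,
                                   size_t counters_bytes_per_instance)
{
    assert(sz != NULL);

    return sz->object_type
         + num_counters  * sz->counter_definition
         + num_instances * sz->instance_definition
         + all_names_bytes
         + num_instances * sz->counter_block
         + num_instances * counters_bytes_per_instance;
}
```

The counter definitions appear once per object, while the instance definition, name, counter block header, and counters repeat for every instance.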

18.5 WRITING THE DATA-COLLECTION DLL
As we've already seen, the data-collection DLL acts as an interface between the
driver and the Registry API. This section describes the contents of the DLL and
explains what you have to do to make the DLL visible to the system.

Contents of the Data-Collection DLL
The data-collection DLL consists of three major functions. You can call these
routines anything you like, since their names will be recorded in the Performance
subkey of your driver's Registry service key. The following subsections describe
each of these functions.

Open The Open function queries the Registry to determine the proper
index values for each object and counter exported by the DLL. It also initializes
the static versions of PERF_OBJECT_TYPE and PERF_COUNTER_DEFINITION
structures used by the DLL's Collect function. Finally, it establishes a connection
with the specific devices being monitored. Table 18.9 contains the prototype for
the Open function.
Table 18.9 Prototype for data collector's Open function

DWORD XxPerfOpen
| Parameter | Description |
|---|---|
| IN LPWSTR lpDeviceNames | Unicode strings naming each device managed by this driver, or NULL |
| Return value | ERROR_SUCCESS - function succeeded; ERROR_XXX - some Win32 error code |

Table 18.10 Prototype for data collector's Collect function

DWORD XxPerfCollect
| Parameter | Description |
|---|---|
| IN LPWSTR lpwszValue | Unicode string identifying requested data: Global - data about all objects; index1 index2 ... - data about specific objects; Foreign ComputerName; Foreign ComputerName index1 index2 ...; Costly - data that's expensive to collect |
| IN OUT LPVOID *lppData | IN: Pointer to buffer pointer for returned data. OUT SUCCESS: Updated pointer. OUT ERROR: Unchanged from input |
| IN OUT LPDWORD lpcbBytes | IN: Pointer to DWORD containing buffer size. OUT SUCCESS: Number of data bytes in buffer. OUT ERROR: 0 |
| OUT LPDWORD lpcObjectTypes | OUT SUCCESS: Count of ObjectTypes. OUT ERROR: 0 |
| Return value | ERROR_MORE_DATA - buffer too small; ERROR_SUCCESS - all other cases |
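
The IN/OUT contract in Table 18.10 can be sketched in portable C as follows. The function name sketch_collect and the SKETCH_ constants are hypothetical; the numeric value 234 matches the Win32 ERROR_MORE_DATA code, and the payload argument stands in for whatever header and counter data a real DLL would assemble.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_ERROR_SUCCESS    0UL
#define SKETCH_ERROR_MORE_DATA  234UL  /* same value as Win32 ERROR_MORE_DATA */

unsigned long sketch_collect(void **data,
                             unsigned long *bytes,
                             unsigned long *object_types,
                             const void *payload,
                             unsigned long payload_len)
{
    assert(data != NULL && *data != NULL);
    assert(bytes != NULL && object_types != NULL);

    if (*bytes < payload_len) {
        /* Buffer too small: leave the pointer alone and report nothing. */
        *bytes = 0;
        *object_types = 0;
        return SKETCH_ERROR_MORE_DATA;
    }

    memcpy(*data, payload, payload_len);

    /* Success: advance the caller's pointer past what was written. */
    *data = (char *)*data + payload_len;
    *bytes = payload_len;
    *object_types = 1;
    return SKETCH_ERROR_SUCCESS;
}
```

Advancing the buffer pointer lets the caller hand the same buffer to the next data-collection DLL, which appends its own objects after yours.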

Collect The Collect function (described in Table 18.10) is called once when
the DLL is opened to get a list of all the objects supported by the DLL. From then
on, it is called periodically to retrieve current counter values from each object
being monitored.
The first argument to this function is a NULL-terminated Unicode string
describing the kind of data that the caller wants to receive. This argument can
either be a specific keyword (like Global), or it can be a list of index numbers that
identify particular object types. Your Collect function will need to parse this string
to see if it can provide data about any of the objects the caller is interested in.
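
A minimal sketch of that string parsing, using standard wide-character routines, might look like the following. The names sketch_query_type and sketch_is_number_in_list are hypothetical, and treating a NULL or empty string as a Global request is an assumption made here rather than something stated in the text.

```c
#include <assert.h>
#include <wchar.h>
#include <wctype.h>

typedef enum {
    QUERY_GLOBAL,
    QUERY_ITEMS,
    QUERY_FOREIGN,
    QUERY_COSTLY
} SketchQueryType;

/* True if s begins with kw and kw is followed by a space or the end. */
static int starts_with_keyword(const wchar_t *s, const wchar_t *kw)
{
    size_t n = wcslen(kw);

    return wcsncmp(s, kw, n) == 0 &&
           (s[n] == L'\0' || iswspace((wint_t)s[n]));
}

SketchQueryType sketch_query_type(const wchar_t *value)
{
    if (value == NULL || *value == L'\0')
        return QUERY_GLOBAL;              /* assumption: empty means Global */
    if (starts_with_keyword(value, L"Global"))
        return QUERY_GLOBAL;
    if (starts_with_keyword(value, L"Foreign"))
        return QUERY_FOREIGN;
    if (starts_with_keyword(value, L"Costly"))
        return QUERY_COSTLY;
    return QUERY_ITEMS;                   /* otherwise a list of indexes */
}

/* Scans a space-separated list of decimal indexes for `wanted`. */
int sketch_is_number_in_list(unsigned long wanted, const wchar_t *list)
{
    const wchar_t *p = list;

    assert(list != NULL);

    while (*p != L'\0') {
        wchar_t *end;
        unsigned long n;

        while (iswspace((wint_t)*p))
            p++;
        if (*p == L'\0')
            break;

        n = wcstoul(p, &end, 10);
        if (end == p) {
            /* Not a number: skip this token. */
            while (*p != L'\0' && !iswspace((wint_t)*p))
                p++;
            continue;
        }
        if (n == wanted)
            return 1;
        p = end;
    }
    return 0;
}
```

A Collect function would classify the request first and, for an index list, return data only when its own object's title index is present.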
Close This function is called when it's time to close the connection with
the monitored devices and release any resources held by the DLL. The prototype
for this function appears in Table 18.11.
Table 18.11 Prototype for data collector's Close function

DWORD XxPerfClose
| Parameter | Description |
|---|---|
| VOID | |
| Return value | ERROR_SUCCESS |


Error Handling in a Data-Collection DLL
It's a good idea for your data-collection DLL to record any problems it
encounters in the Event Log. That way, you or a system administrator can poke
around with the Event Viewer utility if your driver's performance objects aren't
showing up in PERFMON for some reason.
Since a data-collection DLL is running in user mode, it doesn't use the
kernel-mode event-logging interface described in Chapter 13. Instead, it works with
the Win32 event logging functions, RegisterEventSource, ReportEvent, and
DeregisterEventSource. The code example that accompanies this chapter shows
how to use these functions.
Another implication of the data-collection DLL's user-mode environment
is that you have to record its error message file (which is usually the DLL
itself) in a slightly different part of the Registry. Rather than dangling beneath
Services\EventLog\System, the DLL's message file is recorded in
Services\EventLog\Application.[2] Figure 18.5 shows how this works.
It's also polite behavior to give system administrators the ability to control
the amount of event logging your DLL performs. One way to do this is to put a
REG_DWORD value called EventLogLevel under the Parameters subkey of the
driver's Registry service key. The DLL's Open function retrieves this value from
the Registry and uses it as a logging threshold. The higher the number, the more
event-logging detail the DLL generates.
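
The threshold idea can be sketched in a few lines. The level names and the sketch_ functions below are hypothetical, and the value passed to sketch_set_log_threshold stands in for the EventLogLevel number that the Open function would have read from the Registry.

```c
#include <assert.h>

/* Hypothetical message levels: higher numbers mean more detail. */
enum {
    SKETCH_LOG_NONE    = 0,
    SKETCH_LOG_NORMAL  = 1,
    SKETCH_LOG_VERBOSE = 2,
    SKETCH_LOG_DEBUG   = 3
};

static unsigned long sketch_log_threshold = SKETCH_LOG_NORMAL;

/* Called once with the EventLogLevel value read during Open. */
void sketch_set_log_threshold(unsigned long level)
{
    /* A Registry value can be anything, so clamp it to the known range. */
    if (level > SKETCH_LOG_DEBUG)
        level = SKETCH_LOG_DEBUG;
    sketch_log_threshold = level;
}

/* A message is logged only if its level is within the threshold. */
int sketch_should_log(unsigned long message_level)
{
    assert(message_level >= SKETCH_LOG_NORMAL);  /* 0 is "log nothing" */
    return message_level <= sketch_log_threshold;
}
```

With this arrangement a threshold of 0 silences the DLL entirely, while each higher setting admits one more class of message.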
Figure 18.5 Adding a data-collection DLL's message file to the Registry
(Diagram: under HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\EventLog\Application,
the Sources value (REG_MULTI_SZ) lists XXPERF among its entries, and an XXPERF
subkey holds EventMessageFile: REG_EXPAND_SZ: %SystemRoot%\System32\XXPERF.DLL
and TypesSupported: REG_DWORD: 0x7.)

[2] This also means the DLL's event messages will show up in the Application
log rather than the System log when you use the Event Viewer utility.

Figure 18.6 Contents of a driver's Performance subkey
(Diagram: HKEY_LOCAL_MACHINE\System\CurrentControlSet\Services\XxDriver\Performance
contains the values Library: REG_SZ: XXPERF.DLL, Open: REG_SZ: XxPerfOpen,
Collect: REG_SZ: XxPerfCollect, Close: REG_SZ: XxPerfClose, plus First Counter,
First Help, Last Counter, and Last Help.)

Installing the DLL
Once you've built the data-collection DLL itself, you need to move it to the
%SystemRoot%\SYSTEM32 directory. To make NT aware of your DLL, you
have to add several values to the Performance subkey of your driver's Registry
service key. Figure 18.6 shows the structure of these Registry entries, and Table
18.12 describes them in detail.
The First Counter, Last Counter, First Help, and Last Help values were put
there by LODCTR. The data-collection DLL retrieves the two First values and
uses them to calculate the proper index numbers for each of its objects, counters,
and help text items. You only need to add the values that identify the DLL and its
entry points.
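
The index calculation amounts to adding each relative offset from the counter-offset header to the two base values. In the sketch below, the three #define lines repeat the COUNTERS.H example, while SketchTitleIndexes, sketch_absolute_indexes, and the sample base values are hypothetical.

```c
#include <assert.h>

/* Relative offsets, as in the COUNTERS.H example. */
#define XXDEVICE    0
#define INTERRUPTS  2
#define OPERATIONS  4

typedef struct {
    unsigned long name_index;  /* index of the display name */
    unsigned long help_index;  /* index of the help text */
} SketchTitleIndexes;

/* Absolute index = base value stored by LODCTR + relative offset. */
SketchTitleIndexes sketch_absolute_indexes(unsigned long first_counter,
                                           unsigned long first_help,
                                           unsigned long offset)
{
    SketchTitleIndexes t;

    assert(offset % 2 == 0);   /* relative offsets are even numbers */

    t.name_index = first_counter + offset;
    t.help_index = first_help + offset;
    return t;
}
```

For instance, if LODCTR had stored 820 as First Counter and 821 as First Help, the OPERATIONS counter would end up with name index 824 and help index 825.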

Table 18.12 Values in a driver's Performance subkey

Performance subkey values
| Value | Description | Example |
|---|---|---|
| Library | Full path name of data-collection DLL | XXPERF.DLL |
| Open | Name of DLL's (optional) Open function | XxPerfOpen |
| Collect | Name of DLL's Collect function | XxPerfCollect |
| Close | Name of DLL's (optional) Close function | XxPerfClose |

18.6 CODE EXAMPLE: A DATA-COLLECTION DLL
This example shows how to set up a data-collection DLL. It also illustrates the
modifications you'd need to make to a kernel-mode driver in order to retrieve
performance data from it.
It takes a fair amount of code to implement all the pieces of this example.
Unfortunately, not all of it will fit here. The complete code for all the components
can be found in the CH18 directory on the disk that accompanies this book. In this
directory, you'll find three subdirectories:
• Driver - This directory contains a version of XXDRIVER that supports an
  IOCTL_XX_GET_PERF_DATA I/O control code. The driver itself is just a
  stub that illustrates how to pass performance data back to the collection
  DLL. The performance measurements generated by the driver are all
  bogus.
• Ioctl - The only file in this directory is XXIOCTL.H, which contains the
  IOCTL definitions and structures used by both the driver and the
  collection DLL.
• Library - The files in this directory implement the data-collection DLL
  itself. This includes support for event logging, parsing the argument
  string of the DLL's Open function, and gathering and formatting
  performance data.


Again, because of space limitations, only selected portions of the data-collec­
tion DLL will appear here.
XXPERF.C

This file of the example contains the Open, Collect, and Close functions that
interface with the Win32 Registry API calls.
Preamble area This section of the data-collection DLL's source code con­
tains header files, data definitions, and function prototypes necessary to the
proper operation of the DLL.
    //
    // All-inclusive header file
    //
    #include "xxperf.h"                             // (1)

    //
    // Data global to this module                   // (2)
    //
    static HANDLE hDevice;
    static DWORD  dwOpenCount  = 0;
    static BOOL   bInitialized = FALSE;

    //
    // Initialized object header defined
    // in data.c
    //
    extern XX_HEADER_DEFINITION XxObjectHeader;     // (3)

    //
    // Forward declarations of routines             // (4)
    //
    PM_OPEN_PROC    XxPerfOpen;
    PM_COLLECT_PROC XxPerfCollect;
    PM_CLOSE_PROC   XxPerfClose;

(1) The master header file includes WINPERF.H from the Win32 SDK. This
    Win32 header defines all the performance data structures.
(2) Multiple functions in this source module need access to the device
    handle, the count of threads using the library, and the initialization flag.
    The easiest way to deal with this is to make the variables global.
(3) The modules DATA.C and DATA.H contain a single copy of all the static
    parts of the object-type and counter-definition data.
(4) The three exported functions in the DLL must be identified using these
    specific forward declarations if you want everything to work properly.

XxPerfOpen This function sets up the DLL. This includes getting a handle
to the target device and calculating the absolute index values for each object and
counter exported by the DLL. To simplify the collection process, the DLL keeps a
single, statically initialized copy of the data header information in a global struc­
ture defined in DATA.C and DATA.H.

    DWORD
    XxPerfOpen(
        LPWSTR lpDeviceNames
        )
    {
        HKEY  hKeyDriverPerf;
        DWORD dwFirstCounter;
        DWORD dwFirstHelp;
        DWORD dwType;
        DWORD dwSize;
        DWORD dwStatus;

        if( dwOpenCount == 0 )                      // (1)
        {
            XxOpenEventLog();                       // (2)

            hDevice = CreateFile(                   // (3)
                        XX_WIN32_DEVICE_NAME,
                        GENERIC_READ,
                        FILE_SHARE_READ | FILE_SHARE_WRITE,
                        NULL,
                        OPEN_EXISTING,
                        FILE_ATTRIBUTE_NORMAL,
                        NULL );

            if( hDevice == INVALID_HANDLE_VALUE )
            {
                dwStatus = GetLastError();
                XxLogErrorWithData(
                    LOG_LEVEL_NORMAL,
                    XXPERF_CANT_OPEN_DEVICE_HANDLE,
                    &dwStatus, sizeof( dwStatus ));
                XxCloseEventLog();
                return dwStatus;
            }

            //
            // Open the Performance subkey of
            // the driver's service key in the Registry.
            //
            dwStatus = RegOpenKeyEx(                // (4)
                        HKEY_LOCAL_MACHINE,
                        "SYSTEM\\CurrentControlSet"
                        "\\Services\\XxDriver"
                        "\\Performance",
                        0L,
                        KEY_ALL_ACCESS,
                        &hKeyDriverPerf );

            if( dwStatus != ERROR_SUCCESS )
            {
                XxLogErrorWithData(
                    LOG_LEVEL_NORMAL,
                    XXPERF_CANT_OPEN_DRIVER_KEY,
                    &dwStatus, sizeof( dwStatus ));
                CloseHandle( hDevice );
                XxCloseEventLog();
                return dwStatus;
            }

            //
            // Get base index of first object or counter
            //
            dwSize = sizeof( DWORD );
            dwStatus = RegQueryValueEx(
                        hKeyDriverPerf,
                        "First Counter",
                        0L,
                        &dwType,
                        (LPBYTE)&dwFirstCounter,
                        &dwSize );

            if( dwStatus != ERROR_SUCCESS )
            {
                XxLogErrorWithData(
                    LOG_LEVEL_NORMAL,
                    XXPERF_CANT_READ_FIRST_COUNTER,
                    &dwStatus, sizeof( dwStatus ));
                RegCloseKey( hKeyDriverPerf );
                CloseHandle( hDevice );
                XxCloseEventLog();
                return dwStatus;
            }

            //
            // Get base index of first help text
            //
            dwSize = sizeof( DWORD );
            dwStatus = RegQueryValueEx(
                        hKeyDriverPerf,
                        "First Help",
                        0L,
                        &dwType,
                        (LPBYTE)&dwFirstHelp,
                        &dwSize );

            if( dwStatus != ERROR_SUCCESS )
            {
                XxLogErrorWithData(
                    LOG_LEVEL_NORMAL,
                    XXPERF_CANT_READ_FIRST_HELP,
                    &dwStatus, sizeof( dwStatus ));
                RegCloseKey( hKeyDriverPerf );
                CloseHandle( hDevice );
                XxCloseEventLog();
                return dwStatus;
            }

            //
            // Don't need Registry handle anymore
            //
            RegCloseKey( hKeyDriverPerf );

            //
            // Initialize PERF_OBJECT_TYPE struct   // (5)
            //
            XxObjectHeader.XxDevice.
                ObjectNameTitleIndex =
                    dwFirstCounter + XXDEVICE;
            XxObjectHeader.XxDevice.
                ObjectHelpTitleIndex =
                    dwFirstHelp + XXDEVICE;

            //
            // Initialize 1st PERF_COUNTER_DEFINITION
            //
            XxObjectHeader.Interrupts.
                CounterNameTitleIndex =
                    dwFirstCounter + INTERRUPTS;
            XxObjectHeader.Interrupts.
                CounterHelpTitleIndex =
                    dwFirstHelp + INTERRUPTS;

            //
            // Initialize 2nd PERF_COUNTER_DEFINITION
            //
            XxObjectHeader.Operations.
                CounterNameTitleIndex =
                    dwFirstCounter + OPERATIONS;
            XxObjectHeader.Operations.
                CounterHelpTitleIndex =
                    dwFirstHelp + OPERATIONS;

            //
            // Mark DLL as successfully initialized
            //
            bInitialized = TRUE;
        }

        //
        // One way or another, there's one more
        // thread using the DLL.
        //
        dwOpenCount++;
        return ERROR_SUCCESS;
    }

❶ If the DLL is being called by SCREG from a remote computer, there may
  be multiple threads accessing it at the same time. Therefore the DLL
  needs to keep a count of how many times it's been opened. The first call
  causes the DLL to initialize itself; the rest simply bump the count.

❷ Any errors that occur will go to the Event Log. This helper function (defined
  in EVENTLOG.C) manages the details of setting up the connection.

❸ The kernel-mode driver will give performance data to the DLL in
  response to a special IOCTL code. To issue that IOCTL, the DLL needs a
  handle to the device. This handle is stored in a global variable (hDevice)
  where the rest of the DLL can get to it.

❹ This next section of code gets a handle to the Performance subkey below
  XXDRIVER's Registry service key. Then it recovers the base index number
  for XXDRIVER's objects and counters (from the First Counter value),
  and the base index number for help text (from the First Help value).

❺ Once the base index values are recovered, it's necessary to calculate the
  index number of every object, counter, and help text item supported by
  this DLL. The resulting indexes are put into the various ...TitleIndex
  fields of the statically initialized object header defined in DATA.C.
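The index arithmetic described in the last note can be sketched in isolation. This is only an illustration, not code from the XXDRIVER sample: the offset values below and the names TitleIndexes and make_title_indexes are hypothetical, and the real offsets would come from the driver's own counter-name header.

```c
/* Hypothetical item offsets; the real XXDRIVER values are not shown here. */
#define XXDEVICE    0
#define INTERRUPTS  2
#define OPERATIONS  4

/* The pair of indexes stored in the ...TitleIndex fields of one item. */
typedef struct {
    unsigned long name_index;   /* index of the item's display name */
    unsigned long help_index;   /* index of the item's help text    */
} TitleIndexes;

/* Add an item's offset to the two base values read from the Registry
   (the First Counter and First Help values). */
TitleIndexes make_title_indexes(unsigned long first_counter,
                                unsigned long first_help,
                                unsigned long offset)
{
    TitleIndexes t;
    t.name_index = first_counter + offset;
    t.help_index = first_help + offset;
    return t;
}
```

For instance, with assumed base values of 2000 (First Counter) and 2001 (First Help), the INTERRUPTS item above would get a name index of 2002 and a help index of 2003.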

XxPerfCollect: The Collect function retrieves one sample of data from the
object being monitored. After copying the static data header into the caller's
buffer, it uses an IOCTL to put the current counter values there as well.

DWORD
XxPerfCollect(
    IN LPWSTR lpValueName,
    IN OUT LPVOID *lppData,
    IN OUT LPDWORD lpcbTotalBytes,
    IN OUT LPDWORD lpNumObjectTypes
    )
{
    DWORD dwQueryType;
    DWORD dwStatus;
    DWORD dwBytesReturned;
    PPERF_COUNTER_BLOCK pPerfCounterBlock;
    PXX_HEADER_DEFINITION pXxObjectHeader;

    if( !bInitialized ) ❶
    {
        *lpcbTotalBytes = (DWORD) 0;
        *lpNumObjectTypes = (DWORD) 0;
        return ERROR_SUCCESS;
    }

    dwQueryType =
        XxGetPerfQueryType( lpValueName ); ❷

    if( dwQueryType == PERF_QUERY_TYPE_FOREIGN )
    {
        //
        // Can't service foreign requests.
        //
        *lpcbTotalBytes = (DWORD) 0;
        *lpNumObjectTypes = (DWORD) 0;
        return ERROR_SUCCESS;
    }

    if( dwQueryType == PERF_QUERY_TYPE_ITEMS )
    {
        if( !XxIsNumberInList( ❸
                XxObjectHeader.
                    XxDevice.
                        ObjectNameTitleIndex,
                lpValueName ))
        {
            *lpcbTotalBytes = (DWORD) 0;
            *lpNumObjectTypes = (DWORD) 0;
            return ERROR_SUCCESS;
        }
    }

    if( *lpcbTotalBytes < ❹
            ( sizeof( XX_HEADER_DEFINITION ) +
              sizeof( XX_PERF_DATA )))
    {
        *lpcbTotalBytes = (DWORD) 0;
        *lpNumObjectTypes = (DWORD) 0;
        return ERROR_MORE_DATA;
    }

    pXxObjectHeader = ❺
        (PXX_HEADER_DEFINITION) *lppData;

    memmove(
        pXxObjectHeader,
        &XxObjectHeader,
        sizeof( XX_HEADER_DEFINITION ));

    pPerfCounterBlock =

\SYMBOLS on the NT distribution CD, copy various symbol files to
...\SYMBOLS\FREE on the host. At a minimum, you'll need EXE\NTOSKRNL.DBG,
DLL\NTDLL.DBG, and DLL\HAL.DBG.

3. Copy the checked versions of the same symbol files from
   \CHECKED\SUPPORT\DEBUG\ \SYMBOLS on the NT distribution CD to
   ...\SYMBOLS\CHECKED on the host. You'll need these symbols when you
   run your driver under the checked build of NT.

4. Each time you rebuild your driver, copy the driver's symbol file into these
   directories. Refer back to Chapter 16 for an explanation of creating the
   driver's debug symbol file.

One thing to watch out for: Installing an NT service pack changes all the
symbol information. So, if you've upgraded NT on the target system with a
service pack, you have to get the operating system symbol files from the service
pack CD. The symbols on the standard distribution CD won't work. The symbol
directory paths on the service pack CD are the same as those on the NT
distribution disk.

A.3 ENABLING CRASH DUMPS ON THE TARGET SYSTEM

Crash dump files can be very helpful when you're tracking down bugs in a
kernel-mode driver. Refer back to Chapter 17 for information on reading these
files. Follow these steps on the target system if you want Windows NT to dump
crash information after a bugcheck.

1. In the Control Panel, double-click on the System applet.

2. Click on the Recovery button. The Recovery dialog box will appear.

3. Select the Write Debugging Information To check box. You can enter a path
   and filename for the crash file in the text box, or accept the default value
   (%SystemRoot%\MEMORY.DMP).

4. Select the Overwrite Any Existing File check box if you want new crashes to
   overwrite an existing dump file with the same name. If this check box is
   clear, you won't get any crash information if a dump file with the same name
   already exists.

5. Reboot the system to have these options take effect.

When a crash occurs, the system copies an image of physical memory into
the paging file located on the system root partition. During the reboot after a
crash, NT copies the crash image from the paging file to the target file specified in
the Recovery dialog.
If You Don't Get Any Crash Dump Files

Several things can prevent the system from creating a dump file after a
crash. If you're having troubles, here's what to look for.

Premature reboot: Make sure you don't hit the reboot switch until NT has
finished dumping memory into the crash file. If you reboot before the dump is
complete, you won't get any crash information. You can tell when NT has finished
by looking at the message at the bottom of the blue screen display.

Paging file issues: NT can only use the paging file on the system root
partition for storing the crash image. If you don't have a paging file there, NT
won't be able to save crash information.
Also, make sure there's enough space in this paging file. It must be big
enough to hold all of physical memory plus one additional megabyte. If the file is
too small, you won't get any crash information.

Lack of disk space: There has to be enough space on the system root
partition to hold the dump file itself. Although you can specify any target
directory for the dump file, NT initially creates it in the %SystemRoot%
directory and then copies it to its final destination. If there isn't enough free
space, NT won't be able to create the file.

Hardware issues: Certain specific hardware configurations have problems
generating crash files. Most of them (though not all) involve SCSI disk controllers.
If you search the Knowledge Base section of the Microsoft Developer CD for a title
containing the name of your system (or SCSI controller) and MEMORY.DMP, you
may find a helpful bug report. Other than getting some new hardware, there's not
much you can do in this case.

Even if your system isn't one of the ones with known problems, the lack of a
dump file may indicate that you're using an out-of-date driver for your system
disk. See if there's a newer version available.
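
The paging-file rule above (all of physical memory plus one additional megabyte) is easy to check ahead of time. The following is a minimal sketch of that arithmetic only; the function names are hypothetical, and a real tool would still have to obtain the two sizes from the system.

```c
#include <stdbool.h>

/* Smallest paging file, in megabytes, that can hold a crash image:
   all of physical memory plus one additional megabyte. */
unsigned long min_crash_pagefile_mb(unsigned long physical_memory_mb)
{
    return physical_memory_mb + 1;
}

/* True if a paging file of pagefile_mb megabytes on the system root
   partition is big enough to receive the crash image. */
bool pagefile_can_hold_crash(unsigned long pagefile_mb,
                             unsigned long physical_memory_mb)
{
    return pagefile_mb >= min_crash_pagefile_mb(physical_memory_mb);
}
```

By this rule, a 24 MB machine needs a paging file of at least 25 MB on the system root partition; a 24 MB paging file would be too small.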

A.4 ENABLING THE TARGET SYSTEM'S DEBUG CLIENT

Both the retail and checked versions of Windows NT include a debugging client
that allows NT to communicate over a serial line with the WINDBG debugger.
However, you have to enable this debugging client on the target system if you
want to debug the target system interactively with WINDBG.
Depending on the CPU architecture, you follow different procedures to
enable kernel-mode debugging on the target system. On RISC machines, you
need to modify the OSLOADOPTIONS environment variable in the ARC firmware.
See your system documentation for an explanation of how to do this.
To enable the debugger on 80x86-based machines, you edit the BOOT.INI
file located in the root directory of the boot partition. This is a hidden system file
that tells the NT loader what operating systems are available for booting. Follow
these steps to modify BOOT.INI:
1. Remove the read-only, hidden, and system attributes from the file using this
   command:

   attrib -r -h -s BOOT.INI

2. Open BOOT.INI for editing with your favorite text editor.

3. In the [operating systems] section, add appropriate options to the boot
   command line for the free and checked versions of Windows NT.

4. Save the changes and close the file.

5. Use the following command (or its File Manager equivalent) to restore the
   file's original attributes:

   attrib +r +h +s BOOT.INI

Regardless of the machine architecture, you can specify the options listed in
Table A.1. Keep the following things in mind when you're selecting bootstrap
options.

• If you specify NODEBUG, then DEBUGPORT, BAUDRATE, and CRASHDEBUG
  are ignored.

• If you specify BAUDRATE, kernel debugging is enabled; you do not also
  have to specify DEBUG. Select the highest baud rate that works for both
  machines.

Table A.1  Debugging options for BOOT.INI files or OSLOADOPTIONS

BOOT.INI options
Option                Description
DEBUG                 Enables kernel-mode debugging.
NODEBUG               Disables kernel-mode debugging. This is the default.
DEBUGPORT=PortName    Specifies debug serial port used by target machine.
BAUDRATE=BaudRate     Specifies baud rate used by target machine.
CRASHDEBUG            Causes debugger to activate only when the system
                      bugchecks.
MAXMEM=SizeInMB       Specifies the amount of memory to be made available to
                      the system.
SOS                   Displays the name of each module being loaded during
                      system bootstrap.

• On 80x86 machines, COM2 is the default debugger communications port,
  if it exists and if it isn't being used. In all other cases, COM1 is the default.

• The MAXMEM option can be useful for stress testing your driver in a
  low-memory environment. For example, you can limit a 24-megabyte
  machine to using only 12 megabytes.
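
Two of the rules above interact: NODEBUG overrides the other debugging switches, and BAUDRATE enables kernel debugging even without DEBUG. The sketch below models only those two stated rules for an upper-case option string. It is an illustration, not how the NT loader parses its options; the helper names are hypothetical, and any effect of other switches on debugging is not modeled.

```c
#include <stdbool.h>
#include <string.h>

/* True if `options` contains `name` as a complete keyword, for example
   "DEBUG" matches "/DEBUG" but not "/DEBUGPORT=COM1". */
static bool has_option(const char *options, const char *name)
{
    size_t n = strlen(name);
    const char *p = options;

    while (*p) {
        while (*p == ' ' || *p == '/')      /* skip separators        */
            p++;
        size_t len = strcspn(p, " /");      /* length of this switch  */
        size_t key = 0;
        while (key < len && p[key] != '=')  /* keyword part before '=' */
            key++;
        if (key == n && strncmp(p, name, n) == 0)
            return true;
        p += len;
    }
    return false;
}

/* Apply the two rules from the text: NODEBUG wins over everything,
   and either DEBUG or BAUDRATE turns kernel debugging on. */
bool kernel_debug_enabled(const char *options)
{
    if (has_option(options, "NODEBUG"))
        return false;
    return has_option(options, "DEBUG") || has_option(options, "BAUDRATE");
}
```

Under these assumptions, "/BAUDRATE=19200" alone counts as debugging enabled, while "/NODEBUG /BAUDRATE=19200" does not.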

The following example of a BOOT.INI file offers three choices at boot time: a
nondebugging, free version of NT; a free version of NT with the debugger
enabled; and a checked version of NT with debugging enabled. The checked
version is also restricted to a 12 MB environment.

[boot loader]
timeout=30
default=c:\
[operating systems]
multi(0)disk(0)rdisk(0)partition(1)\winnt="NT Free"
multi(0)disk(0)rdisk(0)partition(1)\winnt="NT Free" /DEBUGPORT=COM1
multi(0)disk(0)rdisk(0)partition(1)\wntchk="NT Check" /DEBUG=COM1 /MAXMEM=12

APPENDIX B

Common Bugcheck Codes

B.1 GENERAL PROBLEMS WITH DRIVERS

A variety of driver errors can produce the bugchecks in Table B.1. The
accompanying notes may help you locate the source of the problem.

Table B.1  General errors

Bugchecks caused by general driver problems
Code and parameters / Description

IRQL_NOT_LESS_OR_EQUAL (0x0A)
  1 - Address that was referenced
  2 - IRQL at time of reference
  3 - Type of access
      • 0 - Read
      • 1 - Write
  4 - Address where reference occurred
  CAUSE: A driver touched paged memory at or above DISPATCH_LEVEL IRQL.
  ACTION: The driver may be using a bogus pointer. Use the fourth bugcheck
  parameter to find the offending source code line.

KMODE_EXCEPTION_NOT_HANDLED (0x1E)
  1 - The exception code ¹
  2 - Address of the failing instruction
  3 - First exception parameter
  4 - Second exception parameter
  CAUSE: A driver generated an exception.
  ACTION: Use the second bugcheck parameter to locate the offending source
  code line.

UNEXPECTED_KERNEL_MODE_TRAP (0x7F)
  1 - Code number of trap ²
  CAUSE: On Intel platforms, this means the CPU generated a trap that it
  can't handle in kernel mode.
  ACTION: From WINDBG, find the trap frame address with kb. Use !trap to
  format the frame. ³ The contents of EIP will show where the trap was taken.

PANIC_STACK_SWITCH (0x2B)
  CAUSE: The kernel-mode stack has overflowed. This can mean other operating
  system data structures have been damaged.
  ACTION: In the stack trace, look for a driver that's using too much stack
  space. ⁴

PAGE_FAULT_WITH_INTERRUPTS_OFF (0x49)
  Same as 0x0A (above).

IRQL_NOT_DISPATCH_LEVEL (0x08)
IRQL_NOT_GREATER_OR_EQUAL (0x09)
IRQL_GT_ZERO_AT_SYSTEM_SERVICE (0x4A)
  CAUSE: Miscellaneous problems with IRQL level.
  ACTION: Use the stack trace to locate the code causing the crash.

INVALID_SOFTWARE_INTERRUPT (0x07)
SYSTEM_SERVICE_EXCEPTION (0x3B)
INVALID_DATA_ACCESS_TRAP (0x04)
NO_EXCEPTION_HANDLING_SUPPORT (0x0B)
TRAP_CAUSE_UNKNOWN (0x12)
LAST_CHANCE_CALLED_FROM_KMODE (0x15)
  CAUSE: Miscellaneous problems with exceptions.
  ACTION: Use the stack trace to locate the code executing at the time of
  the crash.

¹ You can determine what kind of exception it is by searching NTSTATUS.H for
  this number. A common exception code is 0x80000003. This means the system
  hit a hard-coded breakpoint or ASSERT while it was booted with the /NODEBUG
  switch. Connect a debugger and reboot with the /DEBUG switch to locate the
  problem.
  Another popular error is 0xC0000005, which is an access violation. In this
  case, argument 4 (the second exception parameter) is the address your driver
  was trying to touch.
² See the Intel486 Processor Family Programmer's Reference (listed in the
  bibliography) for a list of CPU trap codes.
³ On Intel platforms, the frame will be associated with a procedure called
  NT!KiTrap.
⁴ Keep in mind that the driver whose stack operation generated the bugcheck
  is not necessarily the driver that's using too much stack space.
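
As a small illustration of reading the IRQL_NOT_LESS_OR_EQUAL parameters from Table B.1, the sketch below turns the four values into one line of text. The function names are hypothetical and the output wording is an assumption; only the meaning of each parameter comes from the table.

```c
#include <stdio.h>

/* Translate the third parameter (type of access) into a word. */
const char *access_type_name(unsigned long p3)
{
    switch (p3) {
    case 0:  return "read";
    case 1:  return "write";
    default: return "unknown access";
    }
}

/* Format the four 0x0A parameters into buf. Returns what snprintf
   returns: the number of characters the full message needs. */
int describe_irql_bugcheck(char *buf, size_t size,
                           unsigned long p1,   /* address referenced       */
                           unsigned long p2,   /* IRQL at time of reference */
                           unsigned long p3,   /* type of access           */
                           unsigned long p4)   /* address of the code      */
{
    return snprintf(buf, size,
                    "%s of address 0x%08lX at IRQL %lu, from code at 0x%08lX",
                    access_type_name(p3), p1, p2, p4);
}
```

The fourth parameter is the one to carry into the debugger, since it locates the code that made the reference.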

B.2 SYNCHRONIZATION PROBLEMS

The bugchecks in Table B.2 are caused by improper use of various NT
synchronization mechanisms.

Table B.2  Synchronization problems

Bugchecks caused by synchronization problems
Code and parameters / Description

SPIN_LOCK_INIT_FAILURE (0x81)
SPIN_LOCK_ALREADY_OWNED (0x0F)
SPIN_LOCK_NOT_OWNED (0x10)
NO_SPIN_LOCK_AVAILABLE (0x1D)
  CAUSE: Misuse of spin locks.
  ACTION: Use the stack trace to locate the code executing at the time of
  the crash.

MAXIMUM_WAIT_OBJECTS_EXCEEDED (0x0C)
THREAD_NOT_MUTEX_OWNER (0x11)
SYSTEM_EXIT_OWNED_MUTEX (0x39)
  CAUSE: Improper use of Mutexes in kernel mode.
  ACTION: Fix the driver logic error causing the problem.

MUTEX_LEVEL_NUMBER_VIOLATION (0x0D)
  1 - Current thread's Mutex level
  2 - Mutex level of requested Mutex
  CAUSE: A driver thread has requested ownership of a Mutex that violates
  the level number sequence.
  ACTION: Use the stack trace to identify the driver. Use the level numbers
  to identify the Mutexes. ¹

¹ If the Mutexes belong to NT, use EXLEVELS.H to figure out which ones they
  are.

B.3 CORRUPTED DRIVER DATA STRUCTURES

The bugchecks in Table B.3 are caused by problems with various I/O Manager
data structures. In general, these problems indicate some kind of serious logic
error in a driver.

Table B.3  Driver data structure problems

Bugchecks caused by data structure problems
Code and parameters / Description

DEVICE_REFERENCE_COUNT_NOT_ZERO (0x36)
  1 - Address of Device object
  CAUSE: A driver has called IoDeleteDevice with a Device object that still
  has a nonzero reference count.
  ACTION: Locate the driver logic error leading to this situation.

NO_MORE_IRP_STACK_LOCATIONS (0x35)
  1 - Address of the IRP
  CAUSE: A higher-level driver has tried to pass an IRP to a lower-level
  driver using IoCallDriver, but there are no more stack locations in the
  IRP. ¹
  ACTION: If your driver allocated the IRP, examine how you're calculating
  the number of stack slots. If the IRP is being passed to you, your Device
  object's StackSize field is too small.

INCONSISTENT_IRP (0x2A)
  1 - Address of the IRP
  CAUSE: The I/O Manager has found an IRP with fields that are not
  internally consistent. ²
  ACTION: Make sure your driver isn't writing over the contents of the IRP.

MULTIPLE_IRP_COMPLETE_REQUESTS (0x44)
  1 - Address of the IRP
  CAUSE: A driver has called IoCompleteRequest with an IRP that's already
  been completed. Either one driver is trying to complete the same IRP twice,
  or two drivers both think they own the IRP. ³
  ACTION: The DeviceObject field of the IRP's stack locations will show you
  who was using the IRP. This may help.

CANCEL_STATE_IN_COMPLETED_IRP (0x48)
  1 - Address of the IRP
  CAUSE: A driver has called IoCompleteRequest with an IRP that still has a
  cancel routine.
  ACTION: This is a driver logic error. Take the IRP out of the cancelable
  state before you try to complete it.

DEVICE_QUEUE_NOT_BUSY (0x02)
  CAUSE: A Device Queue object is in an inconsistent state.
  ACTION: The Device Queue object is probably getting corrupted by
  inappropriate access or because of bogus use of pointers.

¹ This is really a disaster, since the higher-level driver thinks it has
  filled in the IRP parameter fields for the lower-level driver. However,
  there was no room in the IRP for these parameters, so the higher-level
  driver has actually written off the end of the IRP and mangled some
  unrelated piece of memory.
² For example, an IRP that was being completed but was still marked as being
  attached to a driver's Device Queue object.
³ Finding the two drivers is difficult, since the identity of the first one
  has already been covered up by the time the second driver makes the failing
  call to IoCompleteRequest.
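
The stack-slot bookkeeping behind NO_MORE_IRP_STACK_LOCATIONS can be shown without any kernel headers. This is a sketch under stated assumptions: DeviceInfo is a hypothetical stand-in for the StackSize field of a Device object, and the rule that a layered driver needs one more stack location than the driver beneath it is the usual convention, not something quoted from the table.

```c
/* Hypothetical stand-in for the one Device object field that matters here. */
typedef struct {
    signed char StackSize;   /* IRP stack locations this device needs */
} DeviceInfo;

/* A layered driver needs one stack location for itself plus however many
   the driver below it needs. */
signed char layered_stack_size(const DeviceInfo *lower)
{
    return (signed char)(lower->StackSize + 1);
}

/* An IRP can safely be sent to `target` only if it was allocated with at
   least target->StackSize stack locations. */
int irp_has_enough_locations(signed char irp_stack_count,
                             const DeviceInfo *target)
{
    return irp_stack_count >= target->StackSize;
}
```

If a driver that sits above a device with a StackSize of 2 advertises only 2 itself, an IRP sized for it runs out of locations one level down, which is the situation the bugcheck reports.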

B.4 MEMORY PROBLEMS

The bugchecks in Table B.4 are caused by driver memory problems. Drivers can
cause many subtle (and not so subtle) system failures through improper use of
memory.

Table B.4  Memory problems

Bugchecks caused by memory problems
Code and parameters / Description

NO_MORE_SYSTEM_PTES (0x3F)
  CAUSE: There are no system page table entries left. This often means a
  driver isn't cleaning up after itself.
  ACTION: The !sysptes command may give some insight.

TARGET_MDL_TOO_SMALL (0x40)
  CAUSE: A driver has called IoBuildPartialMdl and passed a target MDL that
  isn't large enough to map the entire range of addresses requested.
  ACTION: Locate the call to IoBuildPartialMdl in the stack trace. Its
  arguments identify the bad MDL. Also use the stack trace to see who called
  this function.

MUST_SUCCEED_POOL_EMPTY (0x41)
  1 - Size of unsatisfied request
  2 - Number of pages used of nonpaged pool
  3 - Number of too large PAGE_SIZE requests from nonpaged pool
  4 - Number of pages available
  CAUSE: There isn't enough memory to satisfy a request from one of the
  XxxMustSucceed pool areas.
  ACTION: Look for a driver that's leaking memory.

NO_PAGES_AVAILABLE (0x4D)
  1 - Number of dirty pages
  2 - Number of physical pages in machine
  3 - Extended commit value in pages
  4 - Total commit value in pages
  CAUSE: The system has run out of free pages.
  ACTION: Look for processes or drivers that are leaking memory.

PFN_LIST_CORRUPT (0x4E)
  1 - 1
  2 - ListHead value that was corrupt
  3 - Number of pages available
  4 - 0
  - OR -
  1 - 2
  2 - Entry in list being removed
  3 - Highest physical page number
  4 - Reference count of entry being removed
  CAUSE: A driver has probably corrupted an MDL.
  ACTION: Trace backward on the stack from the system routine that detected
  the error to the driver routine that passed the MDL. This may be the driver
  that corrupted the MDL.

PROCESS_HAS_LOCKED_PAGES (0x76)
  1 - Process address
  2 - Number of locked pages
  3 - Number of private pages
  4 - 0
  CAUSE: A driver hasn't released some locked pages at the end of an I/O
  operation.
  ACTION: Look for a driver that isn't cleaning up after an I/O.

BAD_POOL_HEADER (0x19)
MEMORY_MANAGEMENT (0x1A)
PFN_SHARE_COUNT (0x1B)
PFN_REFERENCE_COUNT (0x1C)
PAGE_FAULT_IN_NONPAGED_AREA (0x50)
INSUFFICIENT_SYSTEM_MAP_REGS (0x45)
  CAUSE: Miscellaneous memory errors.
  ACTION: Look for drivers active at the time of the crash. One of them may
  be corrupting memory.

B.5 HARDWARE FAILURES

The bugchecks in Table B.5 are the result of various hardware failures. Try to
locate and correct the problem.

Table B.5  Hardware problems

Bugchecks caused by hardware problems
Code and parameters / Description

KERNEL_STACK_INPAGE_ERROR (0x77)
  1 - 0
  2 - 0
  3 - PTE value at time of error
  4 - Address of Kernel stack signature
  - OR -
  1 - Status code
  2 - I/O status code
  3 - Page file number
  4 - Offset into page file
  CAUSE: A page of the kernel-mode stack couldn't be read because of a bad
  block in the paging file or a disk controller error.
  ACTION: If the first two parameters are zero, there is a hardware error.
  Else, look at the status code:
      • C000009C or C000016A: bad block
      • C0000185: SCSI cable or termination problem
      • C000009A: insufficient nonpaged pool

KERNEL_DATA_INPAGE_ERROR (0x7A)
  1 - Lock type that was held:
      • Value 1, 2, 3
      • PTE address
  2 - Error status
  3 - Current process
  4 - Virtual address that could not be read
  CAUSE: A page of kernel-mode data couldn't be read because of a bad block
  in the paging file or a disk controller error.
  ACTION: See error 0x77 (above).

DATA_BUS_ERROR (0x2E)
  1 - Virtual address that caused the fault
  2 - Physical address that caused the fault
  3 - Processor status register (PSR)
  4 - Faulting instruction register (FIR)
  CAUSE: Either there is a parity error in system memory or a driver is
  accessing a nonexistent system-space address.
  ACTION: If a memory test succeeds, then use stack trace to locate the
  driver making the reference.

MULTIPROCESSOR_CONFIGURATION_NOT_SUPPORTED (0x3E)
  CAUSE: NT has detected that all the CPUs in a multiprocessor system are
  not identical. This is not a supported configuration.
  ACTION: Correct the asymmetry.

INSTALL_MORE_MEMORY (0x7D)
  1 - Number of physical pages found
  2 - Lowest physical page
  3 - Highest physical page
  4 - 0
  CAUSE: There isn't enough memory available to boot the system.
  ACTION: Install more memory.

NMI_HARDWARE_FAILURE (0x80)
INSTRUCTION_BUS_ERROR (0x2F)
DATA_COHERENCY_EXCEPTION (0x55)
INSTRUCTION_COHERENCY_EXCEPTION (0x56)
  CAUSE: Miscellaneous hardware failures.
  ACTION: Use hardware diagnostics to locate and correct the problem.

B.6 CONFIGURATION MANAGER AND REGISTRY PROBLEMS

The bugchecks in Table B.6 result from problems with crucial Registry
information. If the failure occurs only when your driver is running, you may be
able to trace the problem back to bad calls to Registry functions. Since the
Registry is mapped into system space, drivers can also corrupt the Registry by
using bogus address pointers.

Table B.6  Registry problems

Bugchecks caused by Registry problems
Code and parameters / Description

CONFIG_INITIALIZATION_FAILED (0x67)
  1 - 5
  2 - Location where failure occurred
  CAUSE: Configuration Manager couldn't get enough paged pool for the
  Registry. ¹
  ACTION: Get a stack trace and call Microsoft.

CONFIG_LIST_FAILED (0x73)
  1 - 5
  2 - 2
  3 - Index of hive
  4 - Pointer to UNICODE_STRING containing filename of hive
  CAUSE: One of the core system Registry hives (SOFTWARE, SECURITY, or SAM)
  is unreadable or corrupted.
  ACTION: Get a stack trace and call Microsoft.

BAD_SYSTEM_CONFIG_INFO (0x74)
  CAUSE: Either the SYSTEM hive is corrupted, or various crucial keys and
  values are missing.
  ACTION: Try booting from the Last Known Good configuration. If that fails,
  use the emergency repair disk. If that fails, reinstall NT.

CANNOT_WRITE_CONFIGURATION (0x75)
  CAUSE: There is no room on the disk to increase the size of the SYSTEM
  hive files.
  ACTION: Free up space in the system partition.

REGISTRY_ERROR (0x51)
  1 - Indicates where error occurred
  2 - Indicates where error occurred
  3 - Pointer to hive
  4 - Internal error return code
  CAUSE: Something is seriously wrong with the Registry. It may be the result
  of an I/O error or file system corruption.
  ACTION: Try rebooting using the Last Known Good option or the emergency
  repair disk.

¹ This error should never occur, since Registry setup happens early enough
  during system initialization that there should always be enough pool space.

B.7 FILE SYSTEM PROBLEMS

The bugchecks in Table B.7 result from failures in a file-system driver or a related
component. Since Microsoft doesn't currently support customer-written FSDs,
there is little you can do to diagnose these problems.

Table B.7  File system problems

Bugchecks caused by file system problems
Code and parameters / Description

CACHE_MANAGER (0x34)
FILE_SYSTEM (0x22)
FAT_FILE_SYSTEM (0x23)
NTFS_FILE_SYSTEM (0x24)
NPFS_FILE_SYSTEM (0x25)
CDFS_FILE_SYSTEM (0x26)
RDR_FILE_SYSTEM (0x27)
MAILSLOT_FILE_SYSTEM (0x52)
PINBALL_FILE_SYSTEM (0x59)
LM_SERVER_INTERNAL_ERROR (0x54)
  CAUSE: Internal problems with a Microsoft-supplied file-system driver.
  ACTION: Get a stack trace and call Microsoft.

APC_INDEX_MISMATCH (0x01)
  CAUSE: This internal error could be the result of file system problems.
  ACTION: Get a stack trace and call Microsoft.

KERNEL_APC_PENDING_DURING_EXIT (0x20)
  1 - Address of pending APC
  2 - The thread's APC disable count
  3 - The current IRQL
  CAUSE: This indicates a logic error in a file system driver.
  ACTION: See if any third-party file system drivers were installed at the
  time of the crash. Be suspicious of them.

B.8 SYSTEM INITIALIZATION FAILURES

The bugchecks in Table B.8 occur only during system initialization. Some of them
are the result of mismatched software components, while others indicate
problems that can only be diagnosed by Microsoft.

Table B.8  Bootstrap and initialization failures

Bugchecks caused by bootstrap problems
Code and parameters / Description

MISMATCHED_HAL (0x79)
  1 - 1 (Release levels don't match)
  2 - Release level of Kernel
  3 - Release level of HAL
  - OR -
  1 - 2 (Build types don't match)
  2 - Kernel build type
      • 0 - Free multiprocessor-enabled build
      • 1 - Checked multiprocessor-enabled build
      • 2 - Free uniprocessor build
  3 - HAL build-type
  - OR -
  1 - 3 (MCA HAL required)
  2 - Machine type detected at bootstrap
      • 2 means MCA
  3 - HAL type
  CAUSE: The HAL revision level and HAL configuration type do not match
  those of the Kernel or the machine type. ¹
  ACTION: Make sure the proper versions of the HAL and NTOSKRNL are
  installed.

FTDISK_INTERNAL_ERROR (0x58)
  CAUSE: The system is trying to boot from the wrong copy of a mirrored
  partition.
  ACTION: Reboot from the shadow copy of the partition.

INACCESSIBLE_BOOT_DEVICE (0x7B)
  1 - Pointer to boot Device object
  - OR -
  1 - Pointer to UNICODE_STRING structure containing ARC name of volume
      that can't be mounted.
  CAUSE: Either the device driver for the boot device failed to initialize,
  or the file system driver for the boot device didn't recognize the file
  structures on the volume.
  ACTION: Be sure the right device driver is installed for the boot device,
  and that the system is trying to boot from the correct location.

PHASE0_EXCEPTION (0x78)
  CAUSE: Failure during initialization of a system component.
  ACTION: Get a stack trace and call Microsoft.

SESSION1_INITIALIZATION_FAILED (0x6D)
SESSION2_INITIALIZATION_FAILED (0x6E)
SESSION3_INITIALIZATION_FAILED (0x6F)
SESSION4_INITIALIZATION_FAILED (0x70)
SESSION5_INITIALIZATION_FAILED (0x71)
  1 - NT status code at time of failure
  CAUSE: Failure during initialization of a system component.
  ACTION: Get a stack trace and call Microsoft.

PHASE0_INITIALIZATION_FAILED (0x31)
PHASE1_INITIALIZATION_FAILED (0x32)
HAL_INITIALIZATION_FAILED (0x5C)
HEAP_INITIALIZATION_FAILED (0x5D)
OBJECT_INITIALIZATION_FAILED (0x5E)
SECURITY_INITIALIZATION_FAILED (0x5F)
PROCESS_INITIALIZATION_FAILED (0x60)
HAL1_INITIALIZATION_FAILED (0x61)
OBJECT1_INITIALIZATION_FAILED (0x62)
SECURITY1_INITIALIZATION_FAILED (0x63)
SYMBOLIC_INITIALIZATION_FAILED (0x64)
MEMORY1_INITIALIZATION_FAILED (0x65)
CACHE_INITIALIZATION_FAILED (0x66)
FILE_INITIALIZATION_FAILED (0x68)
IO1_INITIALIZATION_FAILED (0x69)
LPC_INITIALIZATION_FAILED (0x6A)
PROCESS1_INITIALIZATION_FAILED (0x6B)
REFMON_INITIALIZATION_FAILED (0x6C)
  1 - NT status code describing the failure
  2 - Indicator of location where failure occurred
WINDOWS_NT_BANNER (0x4000007E)
  CAUSE: Failure during initialization of a system component.
  ACTION: Get a stack trace and call Microsoft.

¹ This error probably means that someone has manually updated either
  NTOSKRNL.EXE or HAL.DLL. It can also result from mixing a uniprocessor HAL
  with a multiprocessor Kernel, or vice versa.
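
Because the meaning of the MISMATCHED_HAL parameters depends on the first one, a small decoder can save a trip back to the table. The sketch below only restates the values listed in Table B.8; the function names are hypothetical.

```c
/* Interpret the first MISMATCHED_HAL (0x79) parameter. */
const char *mismatched_hal_reason(unsigned long p1)
{
    switch (p1) {
    case 1:  return "release levels don't match";
    case 2:  return "build types don't match";
    case 3:  return "MCA HAL required";
    default: return "unknown";
    }
}

/* Interpret the second parameter when the first one is 2
   (build types don't match). */
const char *kernel_build_type(unsigned long p2)
{
    switch (p2) {
    case 0:  return "free multiprocessor-enabled build";
    case 1:  return "checked multiprocessor-enabled build";
    case 2:  return "free uniprocessor build";
    default: return "unknown";
    }
}
```

A crash showing a first parameter of 2 and a second of 2, for example, points at a free uniprocessor Kernel paired with a HAL of a different build type.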

B.9 INTERNAL SYSTEM FAILURES

The bugchecks in Table B.9 all come from fatal errors within a Microsoft-supplied
software component. For the most part, there's little you can do to track these
errors.

Table B.9  Internal system errors

Bugchecks caused by internal system problems
Code and parameters / Description

PORT_DRIVER_INTERNAL (0x2C)
SCSI_DISK_DRIVER_INTERNAL (0x2D)
FLOPPY_INTERNAL_ERROR (0x37)
SERIAL_DRIVER_INTERNAL (0x38)
ATDISK_DRIVER_INTERNAL (0x42)
  CAUSE: Miscellaneous errors from a system-supplied driver.
  ACTION: Get a stack trace and call Microsoft.

STREAMS_INTERNAL_ERROR (0x4B)
NDIS_INTERNAL_ERROR (0x4F)
XNS_INTERNAL_ERROR (0x57)
  CAUSE: Internal errors from system-supplied networking components.
  ACTION: Get a stack trace and call Microsoft.

CORRUPT_ACCESS_TOKEN (0x28)
SECURITY_SYSTEM (0x29)
  CAUSE: Internal security subsystem errors.
  ACTION: Get a stack trace and call Microsoft.

Bibliography

Books about Software Development
Hatley, Derek J., and Pirbhai, Imtiaz A. Strategies for Real-Time System Specification. New

York, NY. Dorset House Publishing, 1988. Device drivers are complex pieces of real-time
software. The techniques in this book can help in their design.

Kaner, Cem, et al. Testing Computer Software, 2nd ed. New York, NY. Van Nostrand Reinhold,
1993. This book gives a good overview of the software testing process. If you're responsible
for finding and fixing the bugs, this is a good place to start.

Books about Windows

NT and Win32

Custer, Helen. Inside Windows NT. Redmond, WA. Microsoft Press, 1993. This book (although
getting rather long in the tooth at this point) contains a good high-level overview of the orig­
inal Windows NT architecture. Unfortunately, it's somewhat lacking in specific implemen­
tation details.
Microsoft Corporation. Windows NT 3. 5 Resource Kit. Redmond, WA. Microsoft Press, 1994.
These volumes have been updated for NT 3 . 5 1 and presumably will be for NT 4.0 as well.
Richter, Jeffrey. Advanced Windows NT. Redmond, WA. Microsoft Press, 1994. This book will
give you a good background in Win32 user-mode programming.

Books about Bus Architectures
Anderson, Don. PCMCIA System Architecture, 2nd ed. Reading, MA. Addison-Wesley Publishing
Company, Inc., 1995. I can't say enough good things about this series of hardware books
from Shanley and Anderson. They're accurate, readable, and detailed enough to give driver

507

Bibliography

508

writers a comprehensive introduction to various bus and system architectures. To top it all
off, they're not even terribly expensive.
Bowlds, Pat A. Micro Channel Architecture. New York, NY. Van Nostrand Reinhold, 1991. If
you're in the unenviable position of writing a driver for an MCA device, this is one of the few
sources of information available. It's a little bit fluffy.
Schmidt, Friedhelm. The SCSI Bus and IDE Interface. Reading, MA. Addison-Wesley Publishing
Company, Inc., 1995. This provides a good introduction to the SCSI bus. It's worth reading
before you dive into the ANSI SCSI specification itself.
Shanley, Tom. Plug and Play System Architecture. Reading, MA. Addison-Wesley Publishing
Company, Inc., 1995.
Shanley, Tom and Anderson, Don. ISA System Architecture, 3rd ed. Reading, MA. Addison-Wesley
Publishing Company, Inc., 1995.
Shanley, Tom and Anderson, Don. EISA System Architecture, 2nd ed. Reading, MA. Addison-Wesley
Publishing Company, Inc., 1995.
Shanley, Tom and Anderson, Don. PCI System Architecture, 3rd ed. Reading, MA. Addison-Wesley
Publishing Company, Inc., 1995.
Shanley, Tom and Anderson, Don. CardBus System Architecture. Reading, MA. Addison-Wesley
Publishing Company, Inc., 1996.

Books about CPU Architectures
Heinrich, Joe. MIPS R4000 User's Manual. Englewood Cliffs, NJ. Prentice Hall Inc., 1993.
Intel Corporation. Intel486 Processor Family Programmer's Reference. Beaverton, OR. Intel
Corporation, 1992.
Shanley, Tom. PowerPC 601 System Architecture. Reading, MA. Addison-Wesley Publishing
Company, Inc., 1994.
Sites, Richard and Witek, Richard. Alpha AXP Architecture Reference Manual, 2nd ed. Newton,
MA. Digital Press, 1995.

Books about Miscellaneous Hardware
Campbell, Joe. Programmer's Guide to Serial Communications, 2nd ed. Indianapolis, IN. SAMS
Publishing, 1994. 1993. This is an incredibly comprehensive source of information about the
operation of UARTs and related devices.
Ferraro, Richard. Programmer's Guide to the EGA, VGA, and Super VGA Cards, 3rd ed. Reading,
MA. Addison-Wesley Publishing Company, Inc., 1994. This is probably the most comprehensive
source of information about PC video hardware and how to program it.
Intel Corporation. Intel Peripheral Components Handbook. Beaverton, OR. Intel Corporation,
1993. This handbook has information about common peripheral interface chips such as the
programmable interrupt controller and DMA controller.

About the Author
Art Baker has spent over twenty-five years in the computer industry, where
he's worked on everything from compilers to real-time data gathering software.
In 1984, he changed the focus of his career and began writing and teaching technical
training classes for Digital Equipment Corporation. His broad technical background
and good communication skills made him a consistent favorite with
students and won him several awards for instructor excellence. After leaving Digital,
Mr. Baker founded Cydonix Corporation, a Washington, DC training and
consulting firm.
In his spare time, Mr. Baker is an accomplished classical pianist and an avid
collector of old science fiction movies. He lives in Washington, DC.


INDEX
A
AdapterControl routine, 57
Adapter description file (ADF), 38
Adapter object cache, 264-65
flushing, 271
Adapter objects, 74-75, 266-71
access functions for, 75
acquiring/releasing, 268-70
DMA hardware, setting up, 270-71
finding/locating, 266-68
layout of, 74-75
manipulating, 75
structure of, 75
AlignmentRequirement field, 380
Allocating hardware, 152-62
code example, 158-62
RESALLOC.C, 158-62
XxBuildPartialDescriptors, 161-62
XxReportHardwareUsage, 158-61
alloc_text pragma, 85
AlternateCurrentIrp pointer, 241
AlternateIrpQueue, 227-28
Alternate IRPs, 224
ASSERT, 453
ASSERTMSG, 453
AssociatedIrp.SystemBuffer field, 65, 374
Asynchronous procedure call (APC), 61
AUTOCON.C, 132-37
Autoconfiguration:
EISA bus, 40-41
ISA bus, 36
MCA bus, 38
PCI bus, 44-45
requirements, 32-33
Auto-detected hardware:
code example, 130-39
AUTOCON.C, 132-37
CONFIG_ARRAY, 131-32
DEVICE_BLOCK, 131
DEVICE_EXTENSION, 132
XxConfigCallback, 134-37
XXDRIVER.H, 131-32
XxGetHardwareInfo, 132-34
XxGetInterruptInfo, 137-39
XxGetPortInfo, 137-39
ConfigCallback routine, 127-30
configuration data, translating, 130
finding, 122-30
hardware database, querying, 125-27
how auto-detection worl

D
Data buffer, 25
Data-collection DLL, 474-87
Close function, 475
code example, 478-87
building/installing, 486-87
preamble area, 478-79
XXPERF.C, 478-86
XxPerfClose, 486
XxPerfCollect, 483-86
XxPerfOpen, 479-83
Collect function, 475
contents of, 474-75
error handling in, 476
installing, 477
Open function, 474
writing, 474-87
Data objects, 62-63
Data transfer, 59-60
Data transfer mechanisms, 29-30, 46
Data transfer routines, 57
See also Programmed I/O data transfers
DbgBreakPoint, 442
DbgPrint, 442
DDK (Windows NT), 17-18, 80
Debugging:
coding strategies that reduce, 425
miscellaneous techniques, 452-58
catching incorrect assumptions, 453
catching memory leaks, 454-55
event bits, 456
leaving debugging code in place, 452
sanity counters, 456
trace buffers, 456-58
using bugcheck callbacks, 453-54
using counters/bits/buffers, 455-58
Debug symbol files, 490
Deferred procedure calls (DPCs), 4, 51-53
behavior of, 52-53
and interrupt synchronization, 95
operation of, 52-53
Demand transfer mode, 268
DependOnGroup, 415
DeregisterEventSource, 476
Development environment, 488-93
debug symbol files, 490
enabling crash dumps on target system, 490-92
enabling target system's debug client, 492-93
hardware/software requirements, 488-90
connecting host and target, 489-90
host system, 488
target system, 488-89
Development issues, 78-100
coding conventions / techniques, 80-86
driver design strategies, 78-80
driver memory allocation, 86-91
interrupt synchronization, 93-95
linked lists, 98-100
multiple CPUs, synchronizing, 95-97
Unicode strings, 91-93
DEVICE_BLOCK, 131
Device-dedicated memory, 31
Device drivers, 11
DEVICE_EXTENSION, 132
Device extensions, 71
Device interrupts, 27-29
interrupt priorities, 28
interrupt vectors, 28
processor affinity, 29
signaling mechanisms, 28-29
DeviceIoControl, 80, 104, 370
Device memory, 46
Device objects, 69-71
access functions for, 70
device extensions, 71
externally visible fields for, 70
layout of, 69-70
manipulating, 70
structure of, 69
Device operations, 56
Device queue objects, 225-28
how they work, 225-26
using, 226-28
Device registers, 25
accessing, 26-27
Device resource lists, 32
Device timeouts:
catching, 204-5
code example, 205-11
INIT.C, 206-7
TRANSFER.C, 207-11
XXDRIVER.H, 206
XxIoTimer, 210
XxIsr, 208-9
XxProcessTimerEvent, 211
XxTransmitBytes, 207-8
handling, 203-5
DirectDraw HAL, 15, 16-17
Direct I/O (DIO), 54
Direct memory access (DMA), 30
mechanisms, 30-31
DIRS (keyword), 404
DriverEntry routine, device objects, creating, 103-4
Dispatch cleanup routine, 234-36
Dispatcher objects, 4, 323, 325-34
Event objects, 325-26
Executive Resource, 332-33
Mutex objects, 327-29
Semaphore objects, 329-30
sharing events between drivers, 327
Thread object, 331-32
Timer objects, 330-31
DISPATCH_LEVEL, crashes below, 435-36
Display drivers, 16
DMA controller (DMAC), 30, 35-36
DMA drivers, 258-98
adapter objects, 266-71
cache coherency, maintaining, 263-65
categorizing, 265
common buffer slave DMA driver, 291-95
DMA hardware variations, hiding with adapter
objects, 258-59
I/O buffers, managing with MDLs, 261-63
NT DMA architecture, limitations of, 265-66
packet-based bus master, 285-91
packet-based slave, 272-85
scatter/gather problem, solving with mapping
registers, 259-61
See also Common buffer slave DMA driver; Packet-based
bus master DMA drivers; Packet-based slave
DMA drivers
DmaInProgress, 293
Doubly-linked lists, 99
DpcForIsr routine, 56, 72, 74
function of, 188-89
writing, 188-89
execution context, 188
priority increments, 189
DPC routine, 56
Driver bugs, keeping track, 425
Driver-chosen addresses, 156-57
Driver cleanup:
code example, 115-18
UNLOAD.C, 115-18
XxReleaseHardware, 115-18
XxUnload, 115
Driver design strategies, 78-80
formal design methods, 79
incremental development, 79-80
sample drivers, 80
Driver dispatch routines, 163-79
dispatch interface, extending, 165-69
execution context, 170
exiting, 171-73
completing a request, 172
signaling an error, 171-72
starting a device operation, 172-73
IOCTL argument-passing methods, 167-69
IOCTL buffers, managing, 177
IOCTL header files, writing, 169
IOCTL requests, processing, 174-76
I/O request dispatching mechanism, 163-64
IRP_MJ_DEVICE_CONTROL, 165
IRP_MJ_INTERNAL_DEVICE_CONTROL,
166-67
METHOD_BUFFERED, 168, 177
METHOD_IN_DIRECT, 168, 177
METHOD_NEITHER, 168, 177
METHOD_OUT_DIRECT, 168, 177
private IOCTL values, defining, 167
read and write requests, processing, 173-74
specific function codes:
deciding which to support, 165
enabling, 164-65
testing, 178-79
sample test program, 178-79
testing procedure, 178
what they do, 170-71
writing, 169-73
DriverEntry points, initializing, 103
DriverEntry routine, 55, 85, 157, 293, 295, 296-97, 353-54, 378, 395-96,
454, 456
buffer strategy, choosing, 104-5
DriverEntry points, initializing, 103
function of, 102-3
NT/Win32 device names, 105
writing, 101-5
execution context, 101-2
Driver errors:
categories of, 422-24
hardware problems, 422-23
resource leaks, 423
system crashes, 423
system hangs, 424
thread hangs, 423-24
coding strategies that reduce, 425
keeping track of, 425
reproducing, 424-25
miscellaneous causes, 424-25
multiprocessor dependencies, 424
multithreading dependencies, 424
time dependencies, 424
Driver initialization:
and cleanup routines, 55-56
code example, 105-13
DriverEntry routine, 106-9
INIT.C, 106-13
XxCreateDevice, 109-13
Driver load sequence, controlling, 413-18
changing driver start value, 413-14
controlling load sequence within a group, 416-18
creating explicit dependencies between drivers,
414-15
establishing global dependencies, 415-16
Driver memory allocation, 86-91
kernel stack, 86, 87
lookaside lists, 90-91
memory suballocation, system support for, 88-91
nonpaged pool, 86, 87-88
paged pool, 86, 87-88
zone buffers, 88-90
Driver objects, 67-68
externally visible fields of, 68
layout of, 68
structure of, 68
Driver paging, controlling, 85-86
Driver performance, 459-87
counter definitions in the Registry, 464-66
counter names, adding to the Registry, 464-66
COUNTERS.H, 467
COUNTERS.INI, 466-67
data-collection DLL, 474-87
Close function, 475
code example, 478-87
Collect function, 475
contents of, 474-75
error handling in, 476
installing, 477
Open function, 474
writing, 474-87
general guidelines, 459-61
concrete measurement, 461
explore creative driver designs, 460-61
know the hardware, 460
know where you're going, 459-60
optimize code, 461
LODCTR utility, 466-67
performance data:
format of, 468-74
objects with multiple instances, 472-74
overall structure of, 468
PERF_COUNTER_BLOCK, 470
PERF_COUNTER_DEFINITION, 470
PERF_OBJECT_TYPE, 469
types of counters, 470-72
performance monitoring, 462-64
how drivers export performance data, 464
how it works, 462-64
terminology, 462
UNLODCTR utility, 467
Drivers:
building, 398-409
installing, 409-13
testing/debugging, 419-58
Driver symbol data, moving into .DBG files, 408-9
Driver testing, 419-58
crash dump analysis, 433-40
developing tests, 420-21
driver errors:
categories of, 422-24
reproducing, 424-25
general approach to, 419-21
how to perform tests, 421
interactive debugging, 440-42
Microsoft Hardware Compatibility Tests (HCTs),
421-22
miscellaneous techniques, 452-58
catching incorrect assumptions, 453
catching memory leaks, 454-55
event bits, 456
leaving debugging code in place, 452
sanity counters, 456
trace buffers, 456-58
using bugcheck callbacks, 453-54
using counters/bits/buffers, 455-58
system crashes, 426-30
what to test, 420
when to test, 420
who should do testing, 421
WINDBG, 430-32
extensions, 442-46
See also Crash dump analysis; Driver errors; Interactive debugging; WINDBG utility
DriverUnload field, 114
DUMPEXAM, analyzing crashes with, 439

E
Edge-triggered interrupts, 28-29
EISA bus, 39-41
autoconfiguration, 40-41
device memory, 40
DMA capabilities, 40
interrupt mechanisms, 39-40
register access, 39
Environment subsystems, 7-8
Error logging, 312-13
code example, 313-19
EVENTLOG.C, 313-19
XxGetStringSize, 319
XxInitializeEventLog, 313-15
XxReportEvent, 315-18
preparing a driver for, 310-11
Error reporting, 46
Error response tests, 420
EVENTLOG.C, 313-19
Event logging, 299-301
deciding what to log, 299-300
process, 300-301
EventLogLevel, 311, 476, 487
Event Viewer utility, 301
ExAllocateFromXxxLookasideList, 90-91
ExAllocateFromZone, 89
ExAllocatePool, 87, 89, 366, 372-74, 454-55
ExAllocatePoolWithTag, 89, 454-55
Exceptions, 48-49
ExDeleteXxxLookasideList, 90
Executive, 4-7, 84
Configuration Manager, 5-6
I/O Manager, 7, 49, 55-72, 84, 101
Local Procedure Call (LPC) facility, 6
Object Manager, 5, 84
Process Manager, 6, 84
Security References Monitor, 6
system service interface, 5
Virtual Memory Manager, 6, 84
Executive Resources, 332-33
functions that work with, 333
ExExtendZone, 90
ExFreePool, 87, 374
ExFreeToXxxLookasideList, 90-91
ExFreeToZone, 89
ExInitializeWorkItem, 322
ExInitializeXxxLookasideList, 90
ExInitializeZone, 89
ExInterlockedAllocateFromZone, 89
ExInterlockedDecrementLong, 376
ExInterlockedExtendZone, 90
ExInterlockedFreeToZone, 89
ExInterlockedInsertHeadList, 99
ExInterlockedInsertTailList, 99
ExInterlockedPopEntryList, 98
ExInterlockedPushEntryList, 98
ExInterlockedRemoveHeadList, 99
Expiration times, specifying, 214-15
ExQueueWorkItem, 323
Extendibility, Windows NT, 2
ExtensionApis, 448
ExtensionApiVersion, 443

F
Fast Mutex, 332
File mapping objects, 464
Filename.H, 308
Filename.RC, 308
File-system drivers (FSDs), 11-12
File system problems, 503
Filter drivers, 351, 376-93
code example, 381-93
COMPLETE.C, 390-93
DISPATCH.C, 386-90
DriverEntry routine, 381-84
INIT.C, 381-86
YyAttachFilter, 384-85
YyDispatchDeviceIoControl, 388-89
YyDispatchPassThrough, 389-90
YyDispatchWrite, 386-88
YYDRIVER.H, 381
YyGenericCompletion, 393
YyGetBufferLimits, 385-86
YyWriteCompletion, 390-93
how they work, 377-78
initialization/cleanup in, 378-79
DriverEntry routine, 378-79
Unload routine, 379
MajorFunction table, 380-81
making the attachment transparent, 380-81
SCSI, 13
undocumented activity, 380
Flags field, 104, 272, 293
Formal design methods, 79
Full-duplex devices, 223
Full-duplex drivers, 222-57
alternate path, implementing, 225
CustomDpc routines, writing, 228-29
data structures for, 224-25
device queue objects, 225-28
dispatch cleanup routine, 234-36
I/O requests, canceling, 229-36
modified driver architecture, 223-24
16550 UART, 236-57
See also 16550 UART

G
GDI engine, 16
GDI functions, Win32 subsystem, 9
GetBuffer, 396-97
GetFileVersionInfo, 405
GetFileVersionInfoSize, 405
Graphical device interface (GDI), 15
GroupOrderList, 416-18

H
HAL, See Hardware Abstraction Layer (HAL)
HalAllocateCommonBuffer, 156, 291, 296-97
HalAssignSlotResources, 45
HAL.DLL, 429
Half-duplex devices, 223
HalFreeCommonBuffer, 157, 295, 297
HalGetAdapter, 75, 156, 266-69, 272, 277-78, 293, 296
HalGetBusData, 38, 41, 45, 141-42
HalGetInterruptVector, 77
HalSetBusData, 38, 41, 45
HalTranslateBusAddress, 157
Hardware, 24-47
allocating, 152-62
autoconfiguration requirements, 32-33
basics of, 24-33
bus architecture, understanding, 45
buses, 33-45
control registers, understanding, 45-46
data transfer mechanisms, 29-30
understanding, 46
device-dedicated memory, 31
device memory, understanding, 46
device microcode, loading, 157-58
device registers, 25
accessing, 26-27
direct memory access (DMA) mechanisms, 30-31
error and status reporting, understanding, 46
hints for working with, 45-47
problems with, 422-23
releasing, 155-56
testing, 46-47
See also Auto-detected hardware
Hardware Abstraction Layer (HAL), 3, 74, 84
Hardware database, querying, 125-27
Hardware failures, 500-501
Hardware initialization, 122-62
allocating/releasing hardware, 152-62
auto-detected hardware, finding, 122-30
device memory, mapping, 156-57
unrecognized hardware, finding, 139-52
Hardware resources, claiming, 153-55
Hardware tests, 420
Hard-wired addresses, 157
Header, 44
Header files, 81-82
Higher-level drivers, 350-97
allocating additional IRPs, 364-76
buffered I/O requests, 374
creating IRPs from scratch, 371-74
driver-managed memory, 374
ExAllocatePool, 372-74
IoAllocateIrp, 371-72
direct I/O requests, 374-75
driver-allocated IRPs, 375-76
asynchronous I/O, 375-76
synchronous I/O, 375
filter drivers, 376-93
intermediate drivers, 350-52
IoBuildAsynchronousFsdRequest, creating IRPs
with, 368-69
IoBuildDeviceIoControlRequest, creating IRPs
with, 369-71
IoBuildSynchronousFsdRequest, creating IRPs
with, 367-68
I/O completion routines, 360-64
code example, 363-64
execution context, 361
requesting I/O completion callback, 360
what they do, 362-63
IRP stack, controlling size of, 365-66
tightly coupled drivers, 394-97
High-IRQL crashes, 434-35

I
Identifier (subkey), 124
IgnoreCount field, 268
IMAGEHLP.DLL, 439
Incremental development, 79-80
Information field, 64
Initialization and cleanup routines, 101-21
driver cleanup example, 115-18
DriverEntry routine, writing, 101-5
driver initialization example, 105-13
reinitialize routines, writing, 113-14
shutdown routines, writing, 118-19
testing the driver, 119-21
unload routine, writing, 114-15
Initialization routines, discarding, 84-85
InsertHeadList, 99
InsertTailList, 99
Installing drivers, 409-13
by hand, 409-10
driver Registry entries, 410
end-user installation of nonstandard drivers, 412-13
end-user installation of standard drivers, 410-12
after NT installation, 412
during GUI setup, 411-12
during text setup, 410-11
Integral subsystems, 7
Interactive debugging, 440-42
breakpoints, setting, 441
debug session, starting/stopping, 440-41
hard breakpoints, setting, 442
print statements, using, 442
Intermediate drivers, 11, 350-52
definition of, 350
and layered architecture, 351-52
Interrupt behavior, 46
Interrupt objects, 76-77
layout of, 76
manipulation of, 77
Interrupt priorities, 28
Interrupt request level (IRQL), 49-51, 60, 88
Interrupts, 49
CPU priority levels, 49
interrupt processing sequence, 50-51
interrupt request level (IRQL), 49
software-generated interrupts, 51
Interrupt Service routine (ISR), 56, 297
function of, 187-88
writing, 186-88
execution context, 186-87
Interrupt synchronization, 93-95
and DPCs, 95
interrupt blocking, 94-95
Interrupt vectors, 28
IoAcquireCancelSpinLock, 232, 235
IoAllocateAdapterChannel, 75, 268-69, 273, 281, 294
IoAllocateController, 73
IoAllocateErrorLogEntry, 312
IoAllocateIrp, 66, 366, 371-72
IoAllocateMdl, 263, 292, 294, 374
IoAttachDevice, 70
IoAttachDeviceByPointer, 70, 378, 380
IoBuildAsynchronousFsdRequest, creating IRPs
with, 366-69
IoBuildDeviceIoControlRequest, creating IRPs with,
366-71
IoBuildPartialMdl, 263
IoBuildSynchronousFsdRequest, creating IRPs with,
366-68, 375
IoCallDriver, 66, 70, 230, 233, 358, 364-65, 367, 374-76,
394
IoCancelIrp, 230
IoCompleteRequest, 66, 205, 232-34, 273, 353, 357,
360-61, 365, 368, 370, 375, 376, 397, 423, 426
I/O completion routines, 58, 360-64
code example, 363-64
execution context, 361
requesting I/O completion callback, 360
what they do, 362-63
IoConnectInterrupt, 77, 207
IoCreateController, 73, 102
IoCreateDevice, 70, 102-5, 132, 353, 378-79, 395
IoCreateNotificationEvent, 326
IoCreateSymbolicLink, 70, 326
IoCreateSynchronizationEvent, 326
IOCTL argument-passing methods, 167-69
IOCTL buffers, Driver dispatch routines, managing,
177
IOCTL header files, writing, 169
IOCTL requests, processing, 174-76
IOCTLs, 464
IoDeleteController, 73
IoDeleteDevice, 70, 354, 379
IoDeleteSymbolicLink, 70, 102, 354
IoDetachDevice, 70, 379
IoDisconnectInterrupt, 77
IoFlushAdapterBuffers, 75, 265, 271, 274, 295
IoFreeAdapterChannel, 75, 270, 295
IoFreeController, 73
IoFreeIrp, 66, 369, 373-74
IoFreeMapRegisters, 75, 270
IoFreeMdl, 263, 295
IoGetConfigurationInformation, 142
IoGetCurrentIrpStackLocation, 67, 357, 365
IoGetDeviceObjectPointer, 70, 353, 356, 378-79, 395
IoGetNextIrpStackLocation, 67, 358, 362, 365
IoInitializeIrp, 372, 374
IoInitializeTimer, 204
IOLOGMSG.DLL, 308
I/O Manager, 7, 49, 55-72, 84, 101, 460
IoMapTransfer, 75, 269-71, 273, 274, 286-87, 292, 294,
295
IoMarkIrpPending, 67, 294, 358, 362, 375
IoQueryDeviceComponentInformation, 128
IoQueryDeviceConfigurationData, 128
IoQueryDeviceDescription, 125-28, 150, 152
IoQueryDeviceIdentifier, 128
IoRegisterDriverReinitialization, 113
IoRegisterShutdownNotification, 119
IoReleaseCancelSpinLock, 232-34
IoReportResourceUsage, 154-57
I/O request dispatching mechanism, 163-64
IoRequestDpc, 274, 295, 297
I/O request packets (IRPs), 58, 63-67
alternate IRPs, 224
IRP header, 64-65
externally visible fields in, 65
layout of, 64-65
manipulating, 65-67
primary IRPs, 224
stack locations, 65, 67
structure of, 64
I/O requests:
canceling, 229-36
cancel routine, 232-34
Cancel spin lock, 231-32
IRP Cancel flag, 232
synchronization issues, 231-32
IoSetCancelRoutine, 233
IoSetCompletionRoutine, 67, 358, 360, 362, 363, 374,
375
IoSetHardErrorOrVerifyDevice, 372
IoSetNextIrpStackLocation, 67
I/O space, definition of, 26
I/O space registers, 26
I/O stack locations, 64
IoStartNextPacket, 66, 224-26, 233-34, 242, 244, 282,
423
IoStartPacket, 66, 222, 224-26, 282, 338, 434
IoStartTimer, 204
IoStatus.Information, 230, 357
IoStatus member, 64, 273
IoStopTimer, 205
I/O system service dispatch routines, 55-56
IoWriteErrorLogEntry, 313
ISA bus, 33-36
autoconfiguration, 36
device memory, 36
DMA capabilities, 34-36
interrupt mechanisms, 34-35
register access, 33-34
ISR, 60

K
KdPrint, 442
KeAcquireSpinLock, 97
KeBugCheck, 426, 427, 434
KeBugCheckEx, 426, 427
KeCancelTimer, 214
KeClearEvent, 326
KeDelayExecutionThread, 212
KeDeregisterBugCheckCallback, 454
KeFlushIoBuffers, 264, 272, 291, 297
KeGetCurrentIrql, 94
KeInitializeCallbackRecord, 454
KeInitializeDeviceQueue, 226, 227, 228
KeInitializeDpc, 213, 229
KeInitializeEvent, 326
KeInitializeMutex, 328
KeInitializeSpinLock, 97
KeInitializeTimer, 214
KeInsertByKeyDeviceQueue, 227
KeInsertDeviceQueue, 227, 245
KeInsertQueueDpc, 228-29
KeLowerIrql, 94, 227, 269
KeQuerySystemTime, 214
KeQueryTickCount, 214
KeQueryTimeIncrement, 214
KeRaiseIrql, 94, 227, 269
KeReadStateMutex, 328
KeReadStateTimer, 214
KeRegisterBugCheckCallback, 454
KeReleaseMutex, 328-29
KeReleaseSemaphore, 295, 329
KeReleaseSpinLock, 97
KeReleaseSpinLockFromDpcLevel, 97
KeRemoveDeviceQueue, 227, 246
KeRemoveEntryDeviceQueue, 227, 233
KeRemoveQueueDpc, 228
KeResetEvent, 326
Kernel, 3-4
KERNEL functions, Win32 subsystem, 9
Kernel mode, 48-61
control objects, 4
data transfer, 59-60
deferred procedure calls (DPCs), 51-53
device drivers, 11
dispatcher objects, 4
exceptions, 48-49
Executive, 4-7
file-system drivers (FSDs), 11-12
Hardware Abstraction Layer (HAL), 3
intermediate drivers, 11
interrupts, 49-51
I/O components, 10-15
I/O subsystem design goals, 10
I/O processing sequence, 58-61
kernel, 3-4
kernel-mode threads, 49
layered drivers, 10-11
network drivers, 13-15
postprocessing:
by the driver, 60
by the I/O Manager, 60-61
request preprocessing:
by the driver, 59
by NT, 58-59
SCSI drivers, 12-13
user buffer access, 53-54
Windows NT, 2
Kernel-mode drivers, 21
data transfer routines, 56
driver initialization and cleanup routines, 55
I/O system service dispatch routines, 55-56
resource synchronization callbacks, 57
structure of, 54-58
Kernel-mode threads, 49
Kernel stack, 86, 87
KeSetTimer, 214-15
KeSetTimerEx, 214
KeStallExecutionProcessor, 212
KeSynchronizeExecution, 77, 96
KeWaitForMultipleObjects, 324-25, 328, 375
KeWaitForSingleObject, 324, 328, 330, 334, 368, 370
KiTrap, 436

L
Language monitor DLL, and printer drivers, 19
Latched interrupts, 28-29
Layered drivers, 10-11, 351, 352-60
code example, 354-56
how they work, 352-53
initialization/cleanup in, 353-54
DriverEntry routine, 353-54
Unload routine, 354
IRPs in, 357-58
lower-level driver, calling, 359-60
transparent layer, 356
virtual/logical device layer, 357
Legacy 16-bit applications, drivers for, 21-22
Level-sensitive (level-triggered) interrupt, 29
Linked lists, 98-100
doubly-linked lists, 99
removing blocks from, 99-100
singly-linked lists, 98
ListHead field, 100
LODCTR utility, 466-67
Logging device errors, 299-319
error logging, 312-13
code example, 313-19
preparing a driver for, 310-11
error-log packet, allocating, 311-12
event logging in Windows NT, 299-301
generating log entries, 310-13
messages, 301-10
adding message resources to a driver, 308-9
message codes, 302-3
message definition files, 303-8
registering drivers as event sources, 309-10
XXMSG.MC, 305-7
Logical space, 260
Local Procedure Call (LPC) facility, 6
Lookaside lists, 90-91
Low-level audio drivers, 21

M
MajorFunction field, 65
MajorFunction table, 235, 356, 378-81, 383-84, 395
filter driver object, 380-81
MAKEFILE.INC, 408
Mapping registers, 74
MCA bus, 36-38
autoconfiguration, 38
device memory, 38
DMA capabilities, 38
interrupt mechanisms, 38
register access, 37-38
MCI drivers, 20-21
MC utility, 307-8
MdlAddress field, 65, 374
Memory Description Lists (MDLs), 260
managing I/O buffers with, 261-63
Memory-mapped registers, 27
Memory suballocation, system support for, 88-91
MEMORY.TXT, 439
Message-code fields, meaning of, 302
Message definition files, 303-8
compiling, 307-8
header section, 303
keywords used in, 303
MC utility, 307-8
message section, 303-4
Messageld keyword, 307
Messages, 301-10
message codes, 302-3
message definition files, 303-8
compiling, 307-8
header section, 303
keywords used in header section of, 303
MC utility, 307-8
message section, 303-4
message resources, adding to a driver, 308-9
registering drivers as event sources, 309-10
XXMSG.MC, 305-7
METHOD_BUFFERED, 168, 177
METHOD_IN_DIRECT, 168, 177
METHOD_NEITHER, 168, 177
METHOD_OUT_DIRECT, 168, 177
Microsoft Hardware Compatibility Tests (HCTs), 421-22
MmBuildMdlForNonPagedPool, 263, 292, 294
MmGetMdlByteCount, 263
MmGetMdlByteOffset, 263
MmGetMdlVirtualAddress, 263, 273
MmGetSystemAddressForMdl, 263
MmGetSystemAddressForMdl field, 262
MmIsThisAnNtAsSystem, 89
MmMapIoSpace, 157
MmPageEntireDriver, 85
MmQuerySystemSize, 89
MmResetDriverPaging, 85
MmUnmapIoSpace, 157
Module list, 428
MouConfiguration, 81
MouConnectToPort, 81
MouseClassStartIo, 81
MouseClassUnload, 81
MSGnnnnn.BIN, 308
Multimedia drivers, 20-22
kernel-mode device drivers, 21
low-level audio drivers, 21
MCI drivers, 20-21
WINMM, 20
Multiple CPUs, synchronizing, 95-97
Multiprocessor dependencies, 424
Multithreading dependencies, 424
Mutex objects, 327-29
Fast Mutex, 332, 423-24

N
NDIS intermediate drivers, 14
Network driver interface specification (NDIS), 13-14
Network drivers, 13-15
kernel-mode networking clients, 15
NDIS intermediate drivers, 14
network interface card (NIC) drivers, 13-14
transport drivers, 14-15
NIC drivers, 13-14
Nonpaged pool, 86
NonPagedPool, 87
NonPagedPoolCacheAligned, 87
NonPagedPoolCacheAlignedMustS, 88
NonPagedPoolMustSucceed, 87
Nonpaged system memory, controlling, 85-86
Nonsignaled dispatcher objects, 323
Normal response tests, 420
NTDDK.H, 81, 103, 434, 442, 448
NTDETECT, 122-23
NT driver support routines, 83-84
NTOSKRNL.EXE, 429
NTSTATUS, 82-83, 102
NTSTATUS.H, 83, 437
NtXxx, 83

O
ObDereferenceObject, 327, 331, 354, 379, 395-96
Object instance, 462
Object Manager, 5, 84
ObReferenceObjectByHandle, 327, 331, 354, 395
OEMSETUP.INF (control script), 411-12
OOP, and Windows NT, 62-63
Open and close operations, 56
OpenGL API, 15
OPTIONAL_DIRS (keyword), 404
OS/2 subsystem, 8
OtherDrivers class, 152-54
OutputInterruptsValid flag, 241
Overall system architecture, Windows NT, 1-10
OverrideConflict parameter, 155

P
Packet-based bus master DMA drivers, 285-91
Adapter Control routine:
and bus master hardware, 287
and scatter/gather lists, 289-90
bus master hardware, setting up, 286-88
DpcForIsr routine:
and bus master hardware, 287-88
and scatter/gather support, 290-91
hardware with scatter/gather support, 288-89
scatter/gather lists, 288-91
building with IoMapTransfer, 289-91
Packet-based slave DMA drivers, 272-85
Adapter Control routine, 273
code example, 276-85
DEVICE_EXTENSION, 276-77
TRANSFER.C, 278-85
XxAdapterControl, 281-82
XxDpcForIsr, 283-85
XXDRIVER.H, 276
XxGetDmaInfo, 277-78
XxIsr, 282-83
XxStartIo, 278-81
DMA transfers, splitting, 274-76
DpcForIsr routine, 274
DriverEntry routine, 272
Interrupt Service routine, 273-74
Start I/O routine, 272-73
Paged pool, 86, 87-88
PagedPool, 88
PagedPoolCacheAligned, 88
Parallel port, 189-201
code example, 192-201
DEVICE_EXTENSION, 192-93
INIT.C, 193-95
TRANSFER.C, 193-201
XxCreateDevice, 193-95
XxDpcForIsr, 200-201
XXDRIVER.H, 192-93
XxInitHardware, 195
XxIsr, 199-200
XxStartIo, 195-97
XxTransmitBytes, 197-99
device registers, 191
driver for, 192
how it works, 189-90
interrupt behavior, 192
Partial resource descriptor, contents of, 129
PCI bus, 41-45
autoconfiguration, 44-45
device memory, 44
DMA capabilities, 43-44
interrupt mechanisms, 43
register access, 43
PCI (peripheral component interconnect), See PCI bus
PERF_COUNTER_BLOCK, 470
PERF_COUNTER_DEFINITION, 470
Perflib key, 464, 467
PERFMON utility, 461, 463, 464
PERF_OBJECT_TYPE, 469
Performance, Windows NT, 2
Performance counter, 462
Physical addresses, 31, 260
PollingInterval field, 216
PopEntryList, 98
Portability, Windows NT, 2
PortBase field, 82
Port drivers, 11
SCSI, 13
video, 17
Port monitor DLL, 19-20
Ports, definition of, 26
POSIX subsystem, 8
Precompiled headers, using, 404
Primary IRPs, 224
Printer drivers, 17-20
configuration DLL, 18
DDK, 17-18
language monitor DLL, 19
port monitor DLL, 19-20
print processor DLL, 18-19
spooler, 18
Printer Job Language (PJL), 19
Print processor DLL, 18-19
Process field, 262-63
Process Manager, 6, 84
Product information file, 406-7
Programmed I/O data transfers, 180-202
driver initialization/cleanup, 182-85
connecting to interrupt source, 183-85
disconnecting from interrupt source, 185
initializing DpcForIsr routine, 183
initializing Start I/O entry point, 182-83
interrupt service routine, writing, 186-88
parallel port, 189-201
code example, 192-201
device registers, 191
driver for, 192
how it works, 189-90
interrupt behavior, 192
Start I/O routine, writing, 185-86
synchronizing driver routines, 181-82
testing, 201-2
what happens during, 180-81
Programmed I/O (PIO), 29-30
Protected subsystems, 7
PsCreateSystemThread, 331, 337
PushEntryList, 98
PutBuffer, 396-97
PVOID ControllerExtension field, 72

Q
QueueRequest, 397

R
Read and write requests, processing, 173-74
buffered I/O, 173-74
direct I/O, 174
neither method, 174
Recursive BUILD operations, 403-4
REGCON.C, 143-52
RegCreateKeyEx, 412
RegisterEventSource, 476
Registry:
adding counter names to, 464-66
adding driver parameters to, 140
and auto-detected hardware, 123-25
and Configuration Manager, 501-2
counter definitions in, 464-66
driver entries, 410
RegQueryValueEx, 463
RegSetValueEx, 412
Reinitialize routines, 55, 113-14
writing, 113-14
execution context, 113
what it does, 113
Reliability, Windows NT, 1
RemoveHeadList, 99
RemoveTailList, 99
ReportEvent, 476
RESALLOC.C, 158-62
Resource leaks, 423
Resource allocation, 152-53
Resource lists, 32
Resource synchronization callbacks, 57
Robustness, Windows NT, 1
RtlConvertLongToLargeInteger, 214
RtlConvertUlongToLargeInteger, 214
RtlCopyBytes, 318
RtlLargeIntegerXx, 214
RtlMoveMemory, 294
RtlQueryRegistryValues, 140-41, 148, 314
RtlTimeFieldsToTime, 214
RtlTimeToTimeFields, 214
RtlZeroMemory, 268
Runtime library, 84

S
Sample drivers, 80
ScatterGather, 268
Scatter/gather list, 288
SCSI drivers, 12-13
class drivers, 13
filter drivers, 13
port and miniport drivers, 13
SCS Request Bloc1