OpenGL ES 3.0 Programming Guide


Praise for OpenGL® ES™ 3.0 Programming Guide,

Second Edition

“As a graphics technologist and intense OpenGL ES developer, I

can honestly say that if you buy only one book on OpenGL ES 3.0

programming, then this should be the book. Dan and Budirijanto have

written a book clearly by programmers for programmers. It is simply

required reading for anyone interested in OpenGL ES 3.0. It is informative,

well organized, and comprehensive, but best of all practical. You will find

yourself reaching for this book over and over again instead of the actual

OpenGL ES specication during your programming sessions. I give it my

highest recommendation.”

—Rick Tewell, Graphics Technology Architect, Freescale

“This book provides outstanding coverage of the latest version of OpenGL

ES, with clear, comprehensive explanations and extensive examples. It

belongs on the desk of anyone developing mobile applications.”

—Dave Astle, Graphics Tools Lead, Qualcomm Technologies, Inc.,

andFounder, GameDev.net

“The second edition of OpenGL® ES™ 3.0 Programming Guide provides a

solid introduction to OpenGL ES 3.0 specifications, along with a wealth

of practical information and examples to help any level of developer

begin programming immediately. We’d recommend this guide as a primer

on OpenGL ES 3.0 to any of the thousands of developers creating apps

for the many mobile and embedded products using our PowerVR Rogue

graphics.”

—Kristof Beets, Business Development, Imagination Technologies

“This is a solid OpenGL ES 3.0 reference book. It covers all aspects of the

API and will help any developer get familiar with and understand the API,

including specically the new ES 3.0 functionality.”

—Jed Fisher, Managing Partner, 4D Pipeline

“This is a clear and thorough reference for OpenGL ES 3.0, and an

excellent presentation of the concepts present in all modern OpenGL

programming. This is the guide I’d want by my side when diving into

embedded OpenGL.”

—Todd Furlong, President & Principal Engineer, Inv3rsion LLC


OpenGL® ES™ 3.0

Programming Guide

Second Edition

The OpenGL graphics system is a software interface to graphics hardware.

(“GL” stands for “Graphics Library”.) It allows you to create interactive programs

that produce color images of moving, three-dimensional objects. With OpenGL,

you can control computer-graphics technology to produce realistic pictures, or

ones that depart from reality in imaginative ways.

The OpenGL Series from Addison-Wesley Professional comprises tutorial and

reference books that help programmers gain a practical understanding of OpenGL

standards, along with the insight needed to unlock OpenGL’s full potential.

Visit informit.com/opengl for a complete list of available products.

Make sure to connect with us!

informit.com/socialconnect

OpenGL Series

from AddisonWesley

Dan Ginsburg

Budirijanto Purnomo

With Earlier Contributions From

Dave Shreiner

Aaftab Munshi

OpenGL® ES™ 3.0

Programming Guide

Second Edition

      

            

        

Many of the designations used by manufacturers and sellers to distinguish their

products are claimed as trademarks. Where those designations appear in this book,

and the publisher was aware of a trademark claim, the designations have been printed

with initial capital letters or in all capitals.

Front cover image is from Snapdragon Game Studio’s Fortress: Fire OpenGL® ES™

3.0 demo, courtesy of Qualcomm Technologies Inc.

OpenGL® is a registered trademark and the OpenGL® ES™ logo is a trademark of

Silicon Graphics Inc. used by permission by Khronos.

The OpenGL® ES™ shading language built-in functions described in Appendix B are

copyrighted by Khronos and are reprinted with permission from the OpenGL® ES™

3.00.4 Shading Language Specification.

The OpenGL® ES™ 3.0 Reference Card is copyrighted by Khronos and reprinted with

permission.

The authors and publisher have taken care in the preparation of this book, but make

no expressed or implied warranty of any kind and assume no responsibility for errors or

omissions. No liability is assumed for incidental or consequential damages in connection

with or arising out of the use of the information or programs contained herein.

For information about buying this title in bulk quantities, or for special sales

opportunities (which may include electronic versions; custom cover designs; and

content particular to your business, training goals, marketing focus, or branding

interests), please contact our corporate sales department at

corpsales@pearsoned.com or (800) 382-3419.

For government sales inquiries, please contact governmentsales@pearsoned.com.

For questions about sales outside the U.S., please contact international@pearsoned.com.

Visit us on the Web: informit.com/aw

Library of Congress Cataloging-in-Publication Data

Ginsburg, Dan.

OpenGL ES 3.0 programming guide / Dan Ginsburg, Budirijanto Purnomo ; with

earlier contributions from Dave Shreiner, Aaftab Munshi.—Second edition.

pages cm

Revised edition of: The OpenGL ES 2.0 programming guide / Aaftab Munshi,

Dan Ginsburg, Dave Shreiner. 2009.

Includes bibliographical references and index.

ISBN 978-0-321-93388-1 (paperback : alk. paper)

1. OpenGL. 2. Computer graphics—Specifications. 3. Application program

interfaces (Computer software) 4. Computer programming. I. Purnomo, Budirijanto.

II. Shreiner, Dave. III. Munshi, Aaftab. IV. Title.

T385.G5426 2014

006.6’6—dc23 2013049233

protected by copyright, and permission must be obtained from the publisher prior

to any prohibited reproduction, storage in a retrieval system, or transmission in any

form or by any means, electronic, mechanical, photocopying, recording, or likewise.

To obtain permissiontousematerial from this work, please submit a written request

to Pearson Education,Inc., Permissions Department, One Lake Street, Upper Saddle

River, NewJersey07458, or you may fax your request to (201) 236-3290.

ISBN-13: 978-0-321-93388-1

ISBN-10: 0-321-93388-5

Text printed in the United States on recycled paper at RR Donnelley in Crawfordsville,

Indiana.

First printing, March 2014

Editor-in-Chief

Mark L. Taub

Executive Editor

Laura Lewin

Development Editor

Sheri Cain

Managing Editor

John Fuller

Project Editor

Elizabeth Ryan

Copy Editor

Jill Hobbs

Indexer

Infodex Indexing Services,

Inc.

Proofreader

Linda Begley

Technical Reviewers

Emmanuel Agu

Peter Lohrmann

Maurice Ribble

Editorial Assistant

Olivia Basegio

Cover Designer

Chuti Prasertsith

Compositor

diacriTech


Contents

List of Figures .......................................................................................xvii

List of Examples .....................................................................................xxi

List of Tables ..........................................................................................xxv

Foreword ...............................................................................................xxix

Preface ..................................................................................................xxxi

Intended Audience ............................................................................xxxi

Organization of This Book ................................................................xxxii

Example Code and Shaders .............................................................xxxvi

Errata ................................................................................................xxxvi

Acknowledgments ............................................................................xxxvii

About the Authors ..............................................................................xxxix

1. Introduction to OpenGL ES 3.0 ................................................................1

OpenGL ES 3.0 ........................................................................................3

Vertex Shader ....................................................................................4

Primitive Assembly ...........................................................................7

Rasterization .....................................................................................7

Fragment Shader ...............................................................................8

Per-Fragment Operations .................................................................9

What’s New in OpenGL ES 3.0 .............................................................11

Texturing ........................................................................................11

Shaders ............................................................................................13


Geometry ........................................................................................15

Buffer Objects .................................................................................16

Framebuffer ....................................................................................17

OpenGL ES 3.0 and Backward Compatibility ......................................17

EGL .......................................................................................................19

Programming with OpenGL ES 3.0 ................................................20

Libraries and Include Files ..............................................................20

EGL Command Syntax .........................................................................20

OpenGL ES Command Syntax .............................................................21

Error Handling ......................................................................................22

Basic State Management .......................................................................23

Further Reading ....................................................................................25

2. Hello Triangle: An OpenGL ES 3.0 Example ..........................................27

Code Framework ...................................................................................28

Where to Download the Examples .......................................................28

Hello Triangle Example ........................................................................29

Using the OpenGL ES 3.0 Framework ..................................................34

Creating a Simple Vertex and Fragment Shader ...................................35

Compiling and Loading the Shaders ....................................................36

Creating a Program Object and Linking the Shaders ...........................38

Setting the Viewport and Clearing the Color Buffer ............................39

Loading the Geometry and Drawing a Primitive .................................40

Displaying the Back Buffer ...................................................................41

Summary ...............................................................................................42

3. An Introduction to EGL ...........................................................................43

Communicating with the Windowing System ....................................44

Checking for Errors ...............................................................................45

Initializing EGL .....................................................................................46

Determining the Available Surface Configurations .............................46

Querying EGLCong Attributes ...........................................................48

Letting EGL Choose the Configuration ................................................51

Creating an On-Screen Rendering Area: The EGL Window .................53

Creating an Off-Screen Rendering Area: EGL Pbuffers .........................56

Creating a Rendering Context ..............................................................60


Making an EGLContext Current ..........................................................62

Putting All Our EGL Knowledge Together ............................................63

Synchronizing Rendering .....................................................................66

Summary ...............................................................................................67

4. Shaders and Programs ...........................................................................69

Shaders and Programs ...........................................................................69

Creating and Compiling a Shader ..................................................70

Creating and Linking a Program ....................................................74

Uniforms and Attributes .......................................................................80

Getting and Setting Uniforms ........................................................81

Uniform Buffer Objects ..................................................................87

Getting and Setting Attributes .......................................................92

Shader Compiler ...................................................................................93

Program Binaries ...................................................................................94

Summary ...............................................................................................95

5. OpenGL ES Shading Language .............................................................97

OpenGL ES Shading Language Basics ...................................................98

Shader Version Specication ................................................................98

Variables and Variable Types ................................................................99

Variable Constructors .........................................................................100

Vector and Matrix Components .........................................................101

Constants ............................................................................................102

Structures ............................................................................................103

Arrays ..................................................................................................104

Operators ............................................................................................104

Functions ............................................................................................106

Built-In Functions ...............................................................................107

Control Flow Statements ....................................................................107

Uniforms .............................................................................................108

Uniform Blocks ...................................................................................109

Vertex and Fragment Shader Inputs/Outputs ....................................111

Interpolation Qualiers ......................................................................114

Preprocessor and Directives ................................................................115

Uniform and Interpolator Packing .....................................................117


Precision Qualiers .............................................................................119

Invariance ...........................................................................................121

Summary .............................................................................................123

6. Vertex Attributes, Vertex Arrays, and Buffer Objects ..........................125

Specifying Vertex Attribute Data ........................................................126

Constant Vertex Attribute ............................................................126

Vertex Arrays ................................................................................126

Declaring Vertex Attribute Variables in a Vertex Shader ....................135

Binding Vertex Attributes to Attribute Variables in a Vertex Shader .....................................................................137

Vertex Buffer Objects ..........................................................................140

Vertex Array Objects ...........................................................................150

Mapping Buffer Objects ......................................................................154

Flushing a Mapped Buffer ............................................................158

Copying Buffer Objects ......................................................................159

Summary .............................................................................................160

7. Primitive Assembly and Rasterization .................................................161

Primitives ............................................................................................161

Triangles .......................................................................................162

Lines .............................................................................................163

Point Sprites ..................................................................................164

Drawing Primitives .............................................................................165

Primitive Restart ...........................................................................168

Provoking Vertex ..........................................................................169

Geometry Instancing ....................................................................169

Performance Tips ..........................................................................172

Primitive Assembly .............................................................................174

Coordinate Systems ......................................................................175

Perspective Division .....................................................................178

Viewport Transformation .............................................................178

Rasterization .......................................................................................179

Culling ..........................................................................................180

Polygon Offset ..............................................................................181

Occlusion Queries ...............................................................................183

Summary .............................................................................................185


8. Vertex Shaders ............................................................................187

Vertex Shader Overview .....................................................................188

Vertex Shader Built-In Variables ...................................................189

Precision Qualiers .......................................................................192

Number of Uniforms Limitations in a Vertex Shader ..................193

Vertex Shader Examples .....................................................................196

Matrix Transformations ................................................................196

Lighting in a Vertex Shader ..........................................................199

Generating Texture Coordinates ........................................................205

Vertex Skinning ..................................................................................207

Transform Feedback ............................................................................211

Vertex Textures ...................................................................................214

OpenGL ES 1.1 Vertex Pipeline as an ES 3.0 Vertex Shader ...............215

Summary .............................................................................................223

9. Texturing .....................................................................................225

Texturing Basics ..................................................................................226

2D Textures ...................................................................................226

Cubemap Textures ........................................................................228

3D Textures ...................................................................................229

2D Texture Arrays .........................................................................230

Texture Objects and Loading Textures .........................................230

Texture Filtering and Mipmapping ..............................................237

Automatic Mipmap Generation ...................................................242

Texture Coordinate Wrapping ......................................................243

Texture Swizzles ............................................................................244

Texture Level of Detail .................................................................245

Depth Texture Compare (Percentage Closest Filtering) ...............245

Texture Formats ............................................................................246

Using Textures in a Shader ...........................................................255

Example of Using a Cubemap Texture .........................................258

Loading 3D Textures and 2D Texture Arrays ...............................260

Compressed Textures ..........................................................................262

Texture Subimage Specication ..........................................................266

Copying Texture Data from the Color Buffer .....................................269


Sampler Objects ..................................................................................273

Immutable Textures ............................................................................276

Pixel Unpack Buffer Objects ...............................................................277

Summary .............................................................................................278

10. Fragment Shaders .......................................................................279

Fixed-Function Fragment Shaders ......................................................280

Fragment Shader Overview ................................................................282

Built-In Special Variables ..............................................................283

Built-In Constants ........................................................................284

Precision Qualiers .......................................................................285

Implementing Fixed-Function Techniques Using Shaders ................286

Multitexturing ..............................................................................286

Fog ................................................................................................288

Alpha Test (Using Discard) ...........................................................291

User Clip Planes ............................................................................293

Summary .............................................................................................295

11. Fragment Operations ...................................................................297

Buffers .................................................................................................298

Requesting Additional Buffers ......................................................299

Clearing Buffers ............................................................................299

Using Masks to Control Writing to Framebuffers ........................301

Fragment Tests and Operations ..........................................................303

Using the Scissor Test ...................................................................304

Stencil Buffer Testing ....................................................................305

Blending ..............................................................................................311

Dithering .............................................................................................314

Multisampled Anti-Aliasing ................................................................314

Centroid Sampling .......................................................................316

Reading and Writing Pixels to the Framebuffer .................................316

Pixel Pack Buffer Objects ..............................................................320

Multiple Render Targets ......................................................................320

Summary .............................................................................................324


12. Framebuffer Objects ....................................................................325

Why Framebuffer Objects? .................................................................325

Framebuffer and Renderbuffer Objects ..............................................327

Choosing a Renderbuffer Versus a Texture as a Framebuffer Attachment ........................................................328

Framebuffer Objects Versus EGL Surfaces ....................................329

Creating Framebuffer and Renderbuffer Objects ...............................329

Using Renderbuffer Objects ................................................................330

Multisample Renderbuffers ..........................................................333

Renderbuffer Formats ...................................................................333

Using Framebuffer Objects .................................................................335

Attaching a Renderbuffer as a Framebuffer Attachment .............337

Attaching a 2D Texture as a Framebuffer Attachment .................338

Attaching an Image of a 3D Texture as a Framebuffer Attachment ...............................................................................339

Checking for Framebuffer Completeness .....................................341

Framebuffer Blits .................................................................................342

Framebuffer Invalidation ....................................................................344

Deleting Framebuffer and Renderbuffer Objects ................................346

Deleting Renderbuffer Objects That Are Used as Framebuffer Attachments .....................................................347

Reading Pixels and Framebuffer Objects ......................................347

Examples .............................................................................................348

Performance Tips and Tricks ...............................................................354

Summary .............................................................................................355

13. Sync Objects and Fences ............................................................357

Flush and Finish .................................................................................357

Why Use a Sync Object? .....................................................................358

Creating and Deleting a Sync Object .................................................358

Waiting for and Signaling a Sync Object ...........................................359

Example ..............................................................................................360

Summary .............................................................................................361


14. Advanced Programming with OpenGL ES 3.0 ...............................363

Per-Fragment Lighting ........................................................................363

Lighting with a Normal Map .......................................................364

Lighting Shaders ...........................................................................366

Lighting Equations .......................................................................369

Environment Mapping .......................................................................370

Particle System with Point Sprites ................................................374

Particle System Setup ....................................................................374

Particle System Vertex Shader ......................................................375

Particle System Fragment Shader .................................................377

Particle System Using Transform Feedback ........................................380

Particle System Rendering Algorithm ..........................................381

Particle Emission with Transform Feedback ................................381

Rendering the Particles .................................................................385

Image Postprocessing ..........................................................................387

Render-to-Texture Setup ...............................................................387

Blur Fragment Shader ...................................................................388

Projective Texturing ............................................................................390

Projective Texturing Basics ...........................................................391

Matrices for Projective Texturing .................................................392

Projective Spotlight Shaders .........................................................394

Noise Using a 3D Texture ...................................................................397

Generating Noise ..........................................................................397

Using Noise ...................................................................................402

Procedural Texturing ..........................................................................404

A Procedural Texture Example .....................................................405

Anti-Aliasing of Procedural Textures ............................................407

Further Reading on Procedural Textures ......................................410

Rendering Terrain with Vertex Texture Fetch ....................................410

Generating a Square Terrain Grid ................................................411

Computing Vertex Normal and Fetching Height Value in Vertex Shader ........................................................................412

Further Reading on Large Terrain Rendering ...............................413

Shadows Using a Depth Texture .........................................................414

Rendering from the Light Position Into a Depth Texture ...........415

Rendering from the Eye Position with the Depth Texture ..........418

Summary .............................................................................................420


15. State Queries ..............................................................................421

OpenGL ES 3.0 Implementation String Queries ................................421

Querying Implementation-Dependent Limits ...................................423

Querying OpenGL ES State .................................................................429

Hints ...................................................................................................435

Entity Name Queries ...........................................................................436

Nonprogrammable Operations Control and Queries ........................436

Shader and Program State Queries .....................................................438

Vertex Attribute Queries .....................................................................440

Texture State Queries ..........................................................................441

Sampler Queries ..................................................................................442

Asynchronous Object Queries ............................................................442

Sync Object Queries ............................................................................443

Vertex Buffer Queries ..........................................................................444

Renderbuffer and Framebuffer State Queries .....................................445

Summary .............................................................................................446

16. OpenGL ES Platforms ...........................................................................447

Building for Microsoft Windows with Visual Studio .........................447

Building for Ubuntu Linux .................................................................449

Building for Android 4.3+ NDK (C++) ................................................450

Prerequisites ..................................................................................451

Building the Example Code with Android NDK ..........................452

Building for Android 4.3+ SDK (Java).................................................452

Building for iOS 7 ...............................................................................453

Prerequisites ..................................................................................453

Building the Example Code with Xcode 5 ...................................453

Summary .............................................................................................455

A. GL_HALF_FLOAT ........................................................................457

16-Bit Floating-Point Number ............................................................458

Converting a Float to a Half-Float ......................................................459

B. Built-In Functions ........................................................................463

Angle and Trigonometry Functions ...................................................465

Exponential Functions .......................................................................466

Common Functions ............................................................................467

Floating-Point Pack and Unpack Functions .......................................471

Geometric Functions ..........................................................................472

Matrix Functions ................................................................................474

Vector Relational Functions ...............................................................475

Texture Lookup Functions ..................................................................476

Fragment Processing Functions ..........................................................483

C. ES Framework API .................................................................................485

Framework Core Functions ................................................................485

Transformation Functions ..................................................................490

Index ...........................................................................................495

List of Figures

Figure 1-1 OpenGL ES 3.0 Graphics Pipeline .........................................4

Figure 1-2 OpenGL ES 3.0 Vertex Shader ................................................5

Figure 1-3 OpenGL ES 3.0 Rasterization Stage ........................................7

Figure 1-4 OpenGL ES 3.0 Fragment Shader ...........................................8

Figure 1-5 OpenGL ES 3.0 Per-Fragment Operations ...........................10

Figure 2-1 Hello Triangle Example ........................................................33

Figure 5-1 Z Fighting Artifacts Due to Not Using Invariance .............121

Figure 5-2 Z Fighting Avoided Using Invariance ................................122

Figure 6-1 Triangle with a Constant Color Vertex and

Per-Vertex PositionAttributes ............................................125

Figure 6-2 Position, Normal, and Two Texture Coordinates

Stored as an Array ..............................................................128

Figure 6-3 Selecting Constant or Vertex Array Vertex Attribute ........133

Figure 6-4 Specifying and Binding Vertex Attributes for

Drawing One or More Primitives .......................................138

Figure 7-1 Triangle Primitive Types ....................................................162

Figure 7-2 Line Primitive Types ..........................................................163

Figure 7-3 gl_PointCoord Values ......................................................165

Figure 7-4 Cube ...................................................................................167

Figure 7-5 Connecting Triangle Strips ................................................173

Figure 7-6 OpenGL ES Primitive Assembly Stage................................175

Figure 7-7 Coordinate Systems ...........................................................175

Figure 7-8 Viewing Volume .................................................................176

Figure 7-9 OpenGL ES Rasterization Stage ..........................................179

Figure 7-10 Clockwise and Counterclockwise Triangles .......................180

Figure 7-11 Polygon Offset ....................................................................182

Figure 8-1 OpenGL ES 3.0 Programmable Pipeline ............................188

Figure 8-2 OpenGL ES 3.0 Vertex Shader ............................................189

Figure 8-3 Geometric Factors in Computing Lighting

Equation for a Directional Light ........................................199

Figure 8-4 Geometric Factors in Computing Lighting

Equation for a Spotlight .....................................................202

Figure 9-1 2D Texture Coordinates .....................................................227

Figure 9-2 3D Texture Coordinate for Cubemap ................................228

Figure 9-3 3D Texture ..........................................................................229

Figure 9-4 MipMap2D: Nearest Versus Trilinear Filtering ..................241

Figure 9-5 GL_REPEAT, GL_CLAMP_TO_EDGE, and

GL_MIRRORED_REPEAT Modes ............................................243

Figure 10-1 OpenGL ES 3.0 Programmable Pipeline ............................280

Figure 10-2 OpenGL ES 3.0 Fragment Shader .......................................283

Figure 10-3 Multitextured Quad ...........................................................287

Figure 10-4 Linear Fog on Torus in PVRShaman ..................................289

Figure 10-5 Alpha Test Using Discard ...................................................292

Figure 10-6 User Clip Plane Example ....................................................294

Figure 11-1 The Post-Shader Fragment Pipeline ...................................297

Figure 12-1 Framebuffer Objects, Renderbuffer Objects,

and Textures .......................................................................328

Figure 12-2 Render to Color Texture .....................................................350

Figure 12-3 Render to Depth Texture ....................................................353

Figure 14-1 Per-Fragment Lighting Example ........................................364

Figure 14-2 Environment Mapping Example .......................................370

Figure 14-3 Particle System Sample ......................................................374

Figure 14-4 Particle System with Transform Feedback .........................380

Figure 14-5 Image Postprocessing Example ..........................................387

Figure 14-6 Light Bloom Effect .............................................................389

Figure 14-7 Light Bloom Stages.............................................................390

Figure 14-8 Projective Spotlight Example .............................................391

Figure 14-9 2D Texture Projected onto Object .....................................392

Figure 14-10 Fog Distorted by 3D Noise Texture ....................................397

Figure 14-11 2D Slice of Gradient Noise .................................................402

Figure 14-12 Checkerboard Procedural Texture ......................................407

Figure 14-13 Anti-Aliased Checkerboard Procedural Texture .................409

Figure 14-14 Terrain Rendered with Vertex Texture Fetch .....................411

Figure 14-15 Shadow Rendering with a Depth Texture

and 6 × 6 PCF .....................................................................414

Figure 16-1 Building Samples with CMake GUI on Windows .............448

Figure 16-2 VertexArrayObjects Sample in Xcode Running

on iOS 7 Simulator .............................................................454

Figure A-1 A 16-Bit Floating-Point Number ........................................458

List of Examples

Example 1-1 A Vertex Shader Example .......................................................6

Example 1-2 A Fragment Shader Example ..................................................9

Example 2-1 Hello_Triangle.c Example ....................................................29

Example 3-1 Initializing EGL ....................................................................44

Example 3-2 Specifying EGL Attributes ....................................................51

Example 3-3 Querying EGL Surface Configurations.................................52

Example 3-4 Creating an EGL Window Surface .......................................55

Example 3-5 Creating an EGL Pixel Buffer ...............................................59

Example 3-6 Creating an EGL Context.....................................................62

Example 3-7 A Complete Routine for Creating an EGL Window ............64

Example 3-8 Creating a Window Using the esUtil Library ....................65

Example 4-1 Loading a Shader ..................................................................73

Example 4-2 Create, Attach Shaders to, and Link a Program ...................79

Example 4-3 Querying for Active Uniforms .............................................86

Example 5-1 Sample Vertex Shader ........................................................112

Example 5-2 Vertex and Fragment Shaders with Matching

Output/Input Declarations ................................................113

Example 6-1 Array of Structures .............................................................129

Example 6-2 Structure of Arrays .............................................................130

Example 6-3 Using Constant and Vertex Array Attributes .....................133

Example 6-4 Creating and Binding Vertex Buffer Objects .....................141

Example 6-5 Drawing with and without Vertex Buffer Objects .............146

Example 6-6 Drawing with a Buffer Object per Attribute ......................149

Example 6-7 Drawing with a Vertex Array Object ..................................152

Example 6-8 Mapping a Buffer Object for Writing .................................157

Example 8-1 Vertex Shader with Matrix Transform for the Position .....196

Example 8-2 Directional Light ................................................................200

Example 8-3 Spotlight .............................................................................203

Example 8-4 Sphere Map Texture Coordinate Generation .....................206

Example 8-5 Cubemap Texture Coordinate Generation ........................206

Example 8-6 Vertex Skinning Shader with No Check of

Whether Matrix Weight = 0 ...............................................208

Example 8-7 Vertex Skinning Shader with Checks of Whether

Matrix Weight = 0 ..............................................................210

Example 8-8 Displacement Mapping Vertex Shader ..............................214

Example 8-9 OpenGL ES 1.1 Fixed-Function Vertex Pipeline ................216

Example 9-1 Generating a Texture Object, Binding It, and

Loading Image Data ...........................................................234

Example 9-2 Loading a 2D Mipmap Chain ............................................238

Example 9-3 Vertex and Fragment Shaders for Performing

2D Texturing ......................................................................255

Example 9-4 Loading a Cubemap Texture ..............................................258

Example 9-5 Vertex and Fragment Shader Pair for

Cubemap Texturing ...........................................................259

Example 10-1 Multitexture Fragment Shader ...........................................287

Example 10-2 Vertex Shader for Computing Distance to Eye ..................289

Example 10-3 Fragment Shader for Rendering Linear Fog .......................290

Example 10-4 Fragment Shader for Alpha Test Using Discard .................292

Example 10-5 User Clip Plane Vertex Shader ...........................................294

Example 10-6 User Clip Plane Fragment Shader ......................................295

Example 11-1 Setting up Multiple Render Targets ...................................322

Example 11-2 Fragment Shader with Multiple Render Targets ................324

Example 12-1 Copying Pixels Using Framebuffer Blits ............................343

Example 12-2 Render to Texture ...............................................................348

Example 12-3 Render to Depth Texture ....................................................351

Example 13-1 Inserting a Fence Command and Waiting for

Its Result in Transform Feedback Example ........................361

Example 14-1 Per-Fragment Lighting Vertex Shader ................................366

Example 14-2 Per-Fragment Lighting Fragment Shader ...........................367

Example 14-3 Environment Mapping Vertex Shader ...............................371

Example 14-4 Environment Mapping Fragment Shader ..........................372

Example 14-5 Particle System Vertex Shader ............................................375

Example 14-6 Update Function for Particle System Sample ....................376

Example 14-7 Particle System Fragment Shader .......................................377

Example 14-8 Draw Function for Particle System Sample........................378

Example 14-9 Particle Emission Vertex Shader.........................................382

Example 14-10 Emit Particles with Transform Feedback ............................384

Example 14-11 Particle Rendering Vertex Shader .......................................386

Example 14-12 Blur Fragment Shader ........................................................388

Example 14-13 Projective Texturing Vertex Shader ....................................394

Example 14-14 Projective Texturing Fragment Shader ...............................396

Example 14-15 Generating Gradient Vectors .............................................398

Example 14-16 3D Noise .............................................................................400

Example 14-17 Noise-Distorted Fog Fragment Shader ...............................402

Example 14-18 Checker Vertex Shader .......................................................405

Example 14-19 Checker Fragment Shader with Conditional Checks ........406

Example 14-20 Checker Fragment Shader without

Conditional Checks ...........................................................406

Example 14-21 Anti-Aliased Checker Fragment Shader .............................407

Example 14-22 Terrain Rendering Flat Grid Generation ............................411

Example 14-23 Terrain Rendering Vertex Shader .......................................412

Example 14-24 Set up a MVP Matrix from the Light Position ...................415

Example 14-25 Create a Depth Texture and Attach

It to a Framebuffer Object ..................................................416

Example 14-26 Rendering to Depth Texture Shaders .................................417

Example 14-27 Rendering from the Eye Position Shaders .........................418

List of Tables

Table 1-1 EGL Data Types.....................................................................21

Table 1-2 OpenGL ES Command Suffixes and

Argument Data Types ...........................................................22

Table 1-3 OpenGL ES Basic Error Codes ..............................................23

Table 3-1 EGLConfig Attributes ...........................................................49

Table 3-2 Attributes for Window Creation Using

eglCreateWindowSurface ..................................................54

Table 3-3 Possible Errors When eglCreateWindowSurface Fails .......55

Table 3-4 EGL Pixel Buffer Attributes ...................................................57

Table 3-5 Possible Errors When eglCreatePbufferSurface Fails .....58

Table 3-6 Attributes for Context Creation Using

eglCreateContext ..............................................................61

Table 5-1 Data Types in the OpenGL ES Shading Language ................99

Table 5-2 OpenGL ES Shading Language Operators ..........................104

Table 5-3 OpenGL ES Shading Language Qualifiers ..........................106

Table 5-4 Uniform Block Layout Qualifiers .......................................111

Table 5-5 Extension Behaviors ...........................................................116

Table 5-6 Uniform Storage without Packing ......................................118

Table 5-7 Uniform Storage with Packing ...........................................119

Table 6-1 Data Conversions ...............................................................132

Table 6-2 Buffer Usage ........................................................................143

Table 7-1 Provoking Vertex Selection for the ith Primitive

Instance Where Vertices Are Numbered from 1 to n,

and n Is the Number of Vertices Drawn .............................169

Table 8-1 Transform Feedback Primitive Mode

and Allowed Draw Mode ....................................................213

Table 9-1 Texture Base Formats ..........................................................227

Table 9-2 Pixel Storage Options .........................................................236

Table 9-3 Texture Wrap Modes ...........................................................243

Table 9-4 Valid Unsized Internal Format Combinations

for glTexImage2D ...............................................................247

Table 9-5 Normalized Sized Internal Format Combinations

for glTexImage2D ...............................................................248

Table 9-6 Valid Sized Floating-Point Internal Format

Combinations for glTexImage2D ......................................249

Table 9-7 Valid Sized Internal Integer Texture Format

Combinations for glTexImage2D ......................................251

Table 9-8 Valid Shared Exponent Sized Internal Format

Combinations for glTexImage2D ......................................253

Table 9-9 Valid sRGB Sized Internal Format Combinations

for glTexImage2D ..............................................................254

Table 9-10 Valid Depth Sized Internal Format Combinations

for glTexImage2D ..............................................................255

Table 9-11 Mapping of Texture Formats to Colors ..............................257

Table 9-12 Standard Texture Compression Formats ............................264

Table 9-13 Valid Format Conversions for glCopyTex*Image* ...........273

Table 10-1 OpenGL ES 1.1 RGB Combine Functions ..........................281

Table 11-1 Fragment Test Enable Tokens .............................................304

Table 11-2 Stencil Operations ..............................................................306

Table 11-3 Blending Functions ............................................................312

Table 12-1 Renderbuffer Formats for Color-Renderable Buffer ...........333

Table 12-2 Renderbuffer Formats for Depth-Renderable

and Stencil-Renderable Buffer............................................335

Table 15-1 Implementation-Dependent State Queries ........................423

Table 15-2 Application-Modiable OpenGL ES State Queries ............429

Table 15-3 OpenGL ES 3.0 Capabilities Controlled by

glEnable and glDisable ..................................................437

Table B-1 Angle and Trigonometry Functions ...................................465

Table B-2 Exponential Functions .......................................................466

Table B-3 Common Functions ...........................................................467

Table B-4 Floating-Point Pack and Unpack Functions ......................471

Table B-5 Geometric Functions .........................................................473

Table B-6 Matrix Functions ................................................................474

Table B-7 Vector Relational Functions ...............................................475

Table B-8 Supported Combinations of Sampler and Internal

Texture Formats .................................................................476

Table B-9 Texture Lookup Functions .................................................478

Table B-10 Fragment Processing Functions .........................................484

Foreword

Five years have passed since the OpenGL ES 2.0 version of this reference

book helped alert developers everywhere that programmable 3D graphics

on mobile and embedded systems had not just arrived, but was here

to stay.

Five years later, more than 1 billion people around the world use

OpenGL ES every day to interact with their computing devices, for both

information and entertainment. Nearly every pixel on nearly every

smartphone screen has been generated, manipulated, or composited by

this ubiquitous graphics API.

Now, OpenGL ES 3.0 has been developed by Khronos Group and is shipping

on the latest mobile devices, continuing the steady flow of advanced

graphics features into the hands of consumers everywhere—features that

were rst developed and proven on high-end systems shipping with desktop

OpenGL.

In fact, OpenGL is now easily the most widely deployed family of 3D APIs,

with desktop OpenGL and OpenGL ES being joined by WebGL to bring

the power of OpenGL ES to web content everywhere. OpenGL ES 3.0 will

be instrumental in powering the evolution of WebGL, enabling HTML5

developers to tap directly into the power of the latest GPUs from the first

truly portable 3D applications.

OpenGL ES 3.0 not only places more graphics capabilities into the hands

of developers across a huge range of devices and platforms, but also

enables faster, more power-efcient 3D applications that are easier to

write, port, and maintain—and this book will show you how.

There has never been a more fascinating and rewarding time to be a 3D

developer. My thanks and congratulations go to the authors for continuing

to be a vital part of the evolving story of OpenGL ES, and for working hard

to produce this book that helps ensure developers everywhere can better

understand and leverage the full power of OpenGL ES 3.0.

—Neil Trevett

President, Khronos Group

Vice President Mobile Ecosystem, NVIDIA

Preface

OpenGL ES 3.0 is a software interface for rendering sophisticated 3D

graphics on handheld and embedded devices. OpenGL ES is the primary

graphics library for handheld and embedded devices with programmable

3D hardware including cell phones, personal digital assistants (PDAs),

consoles, appliances, vehicles, and avionics. This book details the entire

OpenGL ES 3.0 application programming interface (API) and pipeline,

including detailed examples, to provide a guide for developing a wide

range of high-performance 3D applications for handheld devices.

Intended Audience

This book is intended for programmers who are interested in learning

OpenGL ES 3.0. We expect the reader to have a solid grounding in

computer graphics. In the text we explain many of the relevant graphics

concepts as they relate to various parts of OpenGL ES 3.0, but we expect

the reader to understand basic 3D concepts. The code examples in the book

are all written in C. We assume that the reader is familiar with C or C++

and cover language topics only where they are relevant to OpenGL ES 3.0.

The reader will learn about setting up and programming every aspect

of the graphics pipeline. The book details how to write vertex and

fragment shaders and how to implement advanced rendering techniques

such as per-pixel lighting and particle systems. In addition, it provides

performance tips and tricks for efficient use of the API and hardware.

After finishing the book, the reader will be ready to write OpenGL ES 3.0

applications that fully harness the programmable power of embedded

graphics hardware.

Organization of This Book

This book is organized to cover the API in a sequential fashion, building

up your knowledge of OpenGL ES 3.0 as we go.

Chapter 1—Introduction to OpenGL ES 3.0

Chapter 1 introduces OpenGL ES and provides an overview of the

OpenGL ES 3.0 graphics pipeline. We discuss the philosophies and

constraints that went into the design of OpenGL ES 3.0. Finally, the

chapter covers some general conventions and types used in OpenGL

ES3.0.

Chapter 2—Hello Triangle: An OpenGL ES 3.0 Example

Chapter 2 walks through a simple OpenGL ES 3.0 example program

that draws a triangle. Our purpose here is to show what an OpenGL ES

3.0 program looks like, introduce the reader to some API concepts, and

describe how to build and run an example OpenGL ES 3.0 program.

Chapter 3—An Introduction to EGL

Chapter 3 presents EGL, the API for creating surfaces and rendering

contexts for OpenGL ES 3.0. We describe how to communicate with

the native windowing system, choose a configuration, and create EGL

rendering contexts and surfaces. We teach you enough EGL so that you

can do everything you will need to do to get up and rendering with

OpenGL ES 3.0.

Chapter 4—Shaders and Programs

Shader objects and program objects form the most fundamental objects in

OpenGL ES 3.0. In Chapter 4, we describe how to create a shader object,

compile a shader, and check for compile errors. The chapter also explains

how to create a program object, attach shader objects to it, and link a

final program object. We discuss how to query the program object for

information and how to load uniforms. In addition, you will learn about

the difference between source shaders and program binaries and how to

use each.

Chapter 5—OpenGL ES Shading Language

Chapter 5 covers the shading language basics needed for writing shaders.

These shading language basics include variables and types, constructors,

structures, arrays, uniforms, uniform blocks, and input/output variables.

This chapter also describes some more nuanced parts of the shading

language, such as precision qualifiers and invariance.

Chapter 6—Vertex Attributes, Vertex Arrays,

and Buffer Objects

Starting with Chapter 6 (and ending with Chapter 11), we begin our walk

through the pipeline to teach you how to set up and program each part

of the graphics pipeline. This journey begins with a description of how

geometry is input into the graphics pipeline, and includes discussion of

vertex attributes, vertex arrays, and buffer objects.

Chapter 7—Primitive Assembly and Rasterization

After discussing how geometry is input into the pipeline in the previous

chapter, in Chapter 7 we consider how that geometry is assembled into

primitives. All of the primitive types available in OpenGL ES 3.0, including

point sprites, lines, triangles, triangle strips, and triangle fans, are covered.

In addition, we describe how coordinate transformations are performed on

vertices and introduce the rasterization stage of the OpenGL ES 3.0 pipeline.

Chapter 8—Vertex Shaders

The next portion of the pipeline that is covered is the vertex shader.

Chapter 8 provides an overview of how vertex shaders fit into the pipeline

and the special variables available to vertex shaders in the OpenGL

ES Shading Language. Several examples of vertex shaders, including

computation of per-vertex lighting and skinning, are covered. We also

give examples of how the OpenGL ES 1.0 (and 1.1) fixed-function pipeline

can be implemented using vertex shaders.

Chapter 9—Texturing

Chapter 9 begins the introduction to fragment shaders by describing all

of the texturing functionality available in OpenGL ES 3.0. This chapter

provides details on how to create textures, how to load them with data,

and how to render with them. It describes texture wrap modes, texture

filtering, texture formats, compressed textures, sampler objects, immutable

textures, pixel unpack buffer objects, and mipmapping. This chapter

covers all of the texture types supported in OpenGL ES 3.0: 2D textures,

cubemaps, 2D texture arrays, and 3D textures.

Chapter 10—Fragment Shaders

Chapter 9 focused on how to use textures in a fragment shader;

Chapter10 covers the rest of what you need to know to write fragment

shaders. We give an overview of fragment shaders and all of the special

built-in variables available to them. We also demonstrate how to

implement all of the xed-function techniques that were available in

OpenGL ES 1.1 using fragment shaders. Examples of multitexturing, fog,

alpha test, and user clip planes are all implemented in fragment shaders.

Chapter 11—Fragment Operations

Chapter 11 discusses the operations that can be applied either to the

entire framebuffer, or to individual fragments after the execution of

the fragment shader in the OpenGL ES 3.0 fragment pipeline. These

operations include the scissor test, stencil test, depth test, multisampling,

blending, and dithering. This chapter covers the final phase in the

OpenGL ES 3.0 graphics pipeline.

Chapter 12—Framebuffer Objects

Chapter 12 discusses the use of framebuffer objects for rendering to

off-screen surfaces. Framebuffer objects have several uses, the most

common of which is for rendering to a texture. This chapter provides

a complete overview of the framebuffer object portion of the API.

Understanding framebuffer objects is critical for implementing many

advanced effects such as reflections, shadow maps, and postprocessing.

Chapter 13—Sync Objects and Fences

Chapter 13 provides an overview of sync objects and fences, which are

efcient primitives for synchronizing within the host application and

GPU execution in OpenGL ES 3.0. We discuss how to use sync objects and

fences and conclude with an example.

Chapter 14—Advanced Programming with OpenGL ES 3.0

Chapter 14 is the capstone chapter, tying together many of the topics

presented throughout the book. We have selected a sampling of advanced

rendering techniques and show examples that demonstrate how to

implement these features. This chapter includes rendering techniques

such as per-pixel lighting using normal maps, environment mapping,

particle systems, image postprocessing, procedural textures, shadow

mapping, terrain rendering, and projective texturing.

Chapter 15—State Queries

A large number of state queries are available in OpenGL ES 3.0. For just

about everything you set, there is a corresponding way to get the current

value. Chapter 15 is provided as a reference for the various state queries

available in OpenGL ES 3.0.

Chapter 16—OpenGL ES Platforms

In the nal chapter, we move away from the details of the API to talk

about how to build the OpenGL ES sample code in this book for iOS7,

Android 4.3 NDK, Android 4.3 SDK, Windows, and Linux. This chapter is

intended to serve as a reference to get you up and running with the book

sample code on the OpenGL ES 3.0 platform of your choosing.

Appendix A—GL_HALF_FLOAT_OES

Appendix A details the half-oat format and provides a reference for how

to convert from IEEE oating-point values into half-oats (and back).

Appendix B—Built-In Functions

Appendix B provides a reference for all of the built-in functions available

in the OpenGL ES Shading Language.

Appendix C—ES Framework API

Appendix C provides a reference for the utility framework we developed

for the book and describes what each function does.

OpenGL ES 3.0 Reference Card

Included as a color insert in the middle of the book is the OpenGL ES 3.0

Reference Card, copyrighted by Khronos and reprinted with permission.

This reference contains a complete list of all of the functions in OpenGL

ES 3.0, along with all of the types, operators, qualifiers,

built-ins, and functions in the OpenGL ES Shading Language.

Example Code and Shaders

This book is lled with example programs and shaders. You can download

the examples from the book’s website at opengles-book.com, which

provides a link to the github.com site hosting the book code. As of this

writing, the example programs have been built and tested on iOS7,

Android 4.3 NDK, Android 4.3 SDK, Windows (OpenGL ES 3.0 Emulation),

and Ubuntu Linux. Several of the advanced shader examples in the

book are implemented in PVRShaman, a shader development tool from

PowerVR available for Windows, Mac OS X, and Linux. The book’s website

(opengles-book.com) provides links through which to download any of

the required tools.

Errata

If you nd something in the book that you believe is in error, please send

us a note at errors@opengles-book.com. The list of errata for the book can

be found on the book’s website: opengles-book.com.

Acknowledgments

I want to thank Afe Munshi and Dave Shreiner for their enormous

contributions to the rst edition of this book. I am extremely grateful

to have Budi Purnomo join me to update the book for OpenGL ES 3.0.

Iwould also like to thank the many colleagues with whom I have worked

over the years, who have helped in my education on computer graphics,

OpenGL, and OpenGL ES. There are too many people to list all of them,

but special thanks go to Shawn Leaf, Bill Licea-Kane, Maurice Ribble, Benj

Lipchak, Roger Descheneaux, David Gosselin, Thorsten Scheuermann,

John Isidoro, Chris Oat, Jason Mitchell, Dan Gessel, and Evan Hart.

I would like to extend a special thanks to my wife, Sofia, for her support

while I worked on this book. I would also like to thank my son, Ethan,

who was born during the writing of this book. Your smile and laugh bring

me joy every single day.

— Dan Ginsburg

I would like to express my deepest gratitude to Dan Ginsburg for

providing me with an opportunity to contribute to this book. Thank you

to my manager, Callan McInally, and colleagues at AMD for supporting

this endeavor. I would also like to thank my past professors, Jonathan

Cohen, Subodh Kumar, Ching-Kuang Shene, and John Lowther, for

introducing me to the world of computer graphics and OpenGL.

I would like to thank my parents and sister for their unconditional

love. Special thanks to my wonderful wife, Liana Hadi, whose love and

support allowed me to complete this project. Thank you to my daughters,

Michelle Lo and Scarlett Lo. They are the sunshine in my life.

— Budi Purnomo

We all want to thank Neil Trevett for writing the Foreword and getting

approval from the Khronos Board of Promoters to allow us to use text

from the OpenGL ES Shading Language specification in Appendix B,

as well as the OpenGL ES 3.0 Reference Card. A special thank you and

debt of gratitude go to the reviewers for their enormously valuable

feedback—Maurice Ribble, Peter Lohrmann, and Emmanuel Agu. We

also wish to acknowledge the technical reviewers from the first edition

of the book—Brian Collins, Chris Grimm, Jeremy Sandmel, Tom Olson,

and Adam Smith.

We owe a huge amount of gratitude to our editor, Laura Lewin, at

Addison-Wesley, who was enormously helpful in every aspect of

creating this book. There were many others at Addison-Wesley who were

invaluable in putting together this book and whom we would like to

thank, including Debra Williams Cauley, Olivia Basegio, Sheri Cain, and

Curt Johnson.

We want to thank our readers from the first edition who have helped

us immensely by reporting errata and improving the sample code. We

would especially like to thank our reader Javed Rabbani Shah, who ported

the OpenGL ES 3.0 sample code to the Android 4.3 SDK in Java. He also

helped us with the Android NDK port and resolving many device-specific

issues. We thank Jarkko Vatjus-Anttila for providing the Linux X11 port,

and Eduardo Pelegri-Llopart and Darryl Gough for porting the first-edition

code to the BlackBerry Native SDK.

A big thank you to the OpenGL ARB, the OpenGL ES working group, and

everyone who contributed to the development of OpenGL ES.

About the Authors

Dan Ginsburg

Dan is the founder of Upsample Software, LLC, a software company

offering consulting services in 3D graphics and GPU computing. Dan has

coauthored several other books, including the OpenCL Programming Guide

and OpenGL Shading Language, Third Edition. In previous roles Dan has

worked on developing OpenGL drivers, desktop and handheld 3D demos,

GPU developer tools, 3D medical visualization, and games. He holds a B.S.

in computer science from Worcester Polytechnic Institute and an M.B.A.

from Bentley University.

Budirijanto Purnomo

Budi is a senior software architect at Advanced Micro Devices, Inc., where

he leads the software enablement efforts of GPU debugging and profiling

technology across multiple AMD software stacks. He collaborates with

many software and hardware architects within AMD to define future

hardware architectures for debugging and profiling GPU applications. He

has published many computer graphics technical articles at international

conferences. He received his B.S. and M.S. in computer science from

Michigan Technological University, and his M.S.E. and Ph.D. in computer

science from Johns Hopkins University.

Aaftab Munshi

Afe has been architecting GPUs for more than a decade. At ATI (now

AMD), he was a senior architect in the Handheld Group. He is the spec

editor for the OpenGL ES 1.1, OpenGL ES 2.0, and OpenCL specifications.

He currently works at Apple.

Dave Shreiner

Dave has been working with OpenGL for almost two decades, and more

recently with OpenGL ES. He authored the first commercial training

course on OpenGL while working at Silicon Graphics Computer Systems

(SGI), and has worked as an author on the OpenGL Programming Guide.

He has presented introductory and advanced courses on OpenGL

programming worldwide at numerous conferences, including SIGGRAPH.

Dave is now a media systems architect at ARM, Inc. He holds a B.S. in

mathematics from the University of Delaware.

Chapter 1

Introduction to OpenGL ES 3.0

OpenGL for Embedded Systems (OpenGL ES) is an application

programming interface (API) for advanced 3D graphics targeted at

handheld and embedded devices. OpenGL ES is the dominant graphics

API in today’s smartphones and has even extended its reach onto the

desktop. The list of platforms supporting OpenGL ES includes iOS,

Android, BlackBerry, bada, Linux, and Windows. OpenGL ES also

underpins WebGL, a web standard for browser-based 3D graphics.

Since the release of the iPhone 3GS in June 2009 and Android 2.0 in

March 2010, OpenGL ES 2.0 has been supported on iOS and Android

devices. The rst edition of this book covered OpenGL ES 2.0 in detail.

The current edition focuses on OpenGL ES 3.0, the next revision of

OpenGL ES. It is almost inevitable that every handheld platform that

continues to evolve will support OpenGL ES 3.0. Indeed, OpenGL ES 3.0

is already supported on devices using Android 4.3+ and on the iPhone 5s

with iOS7. OpenGL ES 3.0 is backward compatible with OpenGL ES 2.0,

meaning that applications written for OpenGL ES 2.0 will continue to

work with OpenGL ES 3.0.

OpenGL ES is one of a set of APIs created by the Khronos Group. The

Khronos Group, founded in January 2000, is a member-funded industry

consortium that is focused on the creation of open standard and royalty-

free APIs. The Khronos Group also manages OpenGL, a cross-platform

standard 3D API for desktop systems running Linux, various flavors of

UNIX, Mac OS X, and Microsoft Windows. It is a widely accepted standard

3D API that has seen significant real-world usage.

Due to the widespread adoption of OpenGL as a 3D API, it made sense to

start with the desktop OpenGL API in developing an open standard 3D

API for handheld and embedded devices and then modify it to meet the

needs and constraints of the handheld and embedded device space. In the

earlier versions of OpenGL ES (1.0, 1.1, and 2.0), the device constraints

that were considered in the design included limited processing capabilities

and memory availability, low memory bandwidth, and sensitivity to

power consumption. The working group used the following criteria in the

denition of the OpenGL ES specication(s):

The OpenGL API is very large and complex, and the goal of

the OpenGL ES working group was to create an API suitable for

constrained devices. To achieve this goal, the working group removed

any redundancy from the OpenGL API. In any case where the same

operation could be performed in more than one way, the most useful

method was taken and the redundant techniques were removed.

A good example of this is seen with specifying geometry, where in

OpenGL an application can use immediate mode, display lists, or

vertex arrays. In OpenGL ES, only vertex arrays exist; immediate mode

and display lists were removed.

Removing redundancy was an important goal, but maintaining

compatibility with OpenGL was also important. As much as possible,

OpenGL ES was designed so that applications written to the embedded

subset of functionality in OpenGL would also run on OpenGL ES.

This was an important goal because it allows developers to leverage

both APIs and to develop applications and tools that use the common

subset of functionality.

New features were introduced to address specific constraints of

handheld and embedded devices. For example, to reduce the power

consumption and increase the performance of shaders, precision

qualiers were introduced to the shading language.

The designers of OpenGL ES aimed to ensure a minimum set of

features for image quality. In early handheld devices, the screen sizes

were limited, making it essential that the quality of the pixels drawn

on the screen was as good as possible.

The OpenGL ES working group wanted to ensure that any OpenGL

ES implementation would meet certain acceptable and agreed-on

standards for image quality, correctness, and robustness. This was

achieved by developing appropriate conformance tests that an

OpenGL ES implementation must pass to be considered compliant.

Khronos has released four OpenGL ES specifications so far: OpenGL ES 1.0

and ES 1.1 (referred to jointly as OpenGL ES 1.x in this book), OpenGL

ES 2.0, and OpenGL ES 3.0. The OpenGL ES 1.0 and 1.1 specifications

implement a xed function pipeline and are derived from the OpenGL 1.3

and 1.5 specications, respectively.

The OpenGL ES 2.0 specication implements a programmable graphics

pipeline and is derived from the OpenGL 2.0 specication. Being derived

from a revision of the OpenGL specication means that the corresponding

OpenGL specication was used as the baseline for determining the feature

set included in the particular revision of OpenGL ES.

OpenGL ES 3.0 is the next step in the evolution of handheld graphics and

is derived from the OpenGL 3.3 specification. While OpenGL ES 2.0 was

successful in bringing capabilities similar to DirectX9 and the Microsoft

Xbox 360 to handheld devices, graphics capabilities have continued to

evolve on desktop GPUs. Significant features that enable techniques such

as shadow mapping, volume rendering, GPU-based particle animation,

geometry instancing, texture compression, and gamma correction were

missing from OpenGL ES 2.0. OpenGL ES 3.0 brings these features to

handheld devices, while continuing the philosophy of adapting to the

constraints of embedded systems.

Of course, some of the constraints that were taken into consideration

while designing previous versions of OpenGL ES are no longer relevant

today. For example, handheld devices now feature large screen sizes (some

offer a higher resolution than most desktop PC monitors). Additionally,

many handheld devices now feature high-performance multicore CPUs

and large amounts of memory. The focus for the Khronos Group in

developing OpenGL ES 3.0 shifted toward appropriate market timing of

features relevant to handheld applications rather than addressing the

limited capabilities of devices.

The following sections introduce the OpenGL ES 3.0 pipeline.

OpenGL ES 3.0

As noted earlier, OpenGL ES 3.0 is the API covered in this book. Our

goal is to cover the OpenGL ES 3.0 specification in thorough detail, give

specific examples of how to use the features in OpenGL ES 3.0, and discuss

various performance optimization techniques. After reading this book, you

should have an excellent grasp of the OpenGL ES 3.0 API, be able to easily

write compelling OpenGL ES 3.0 applications, and not have to worry

about reading multiple specifications to understand how a feature works.

OpenGL ES 3.0 implements a graphics pipeline with programmable

shading and consists of two specifications: the OpenGL ES 3.0

API specication and the OpenGL ES Shading Language 3.0

Specication (OpenGL ES SL). Figure 1-1 shows theOpenGLES 3.0

graphics pipeline. The shaded boxes in this gure indicate the

programmable stages of the pipeline in OpenGL ES 3.0. An overview of

each stage in the OpenGL ES 3.0 graphics pipeline is presented next.

Figure 1-1 OpenGL ES 3.0 Graphics Pipeline (boxes shown: API, Vertex Buffer/Arrays Objects, Vertex Shader, Primitive Assembly, Rasterization, Fragment Shader, Per-Fragment Operations, Framebuffer, Textures, Transform Feedback)

Vertex Shader

This section gives a high-level overview of vertex shaders. Vertex and

fragment shaders are covered in depth in later chapters. The vertex shader

implements a general-purpose programmable method for operating on

vertices.

The inputs to the vertex shader consist of the following:

Shader program—Vertex shader program source code or executable

that describes the operations that will be performed on the vertex.

Vertex shader inputs (or attributes)—Per-vertex data supplied using

vertex arrays.

Uniforms—Constant data used by the vertex (or fragment) shader.

Samplers—Specic types of uniforms that represent textures used by

the vertex shader.

The outputs of the vertex shader were called varying variables in OpenGL

ES 2.0, but were renamed vertex shader output variables in OpenGL ES

3.0. In the primitive rasterization stage, the vertex shader output values

are calculated for each generated fragment and are passed in as inputs to

the fragment shader. The mechanism used to generate a value for each

fragment from the vertex shader outputs that are assigned to each vertex

of the primitive is called interpolation. Additionally, OpenGL ES 3.0 adds

a new feature called transform feedback, which allows the vertex shader

outputs to be selectively written to an output buffer (in addition to, or

instead of, being passed to the fragment shader). For example, as covered

in the transform feedback example in Chapter 14, a particle system can be

implemented in the vertex shader in which particles are output to a buffer

object using transform feedback. The inputs and outputs of the vertex

shader are shown in Figure 1-2.

Vertex shaders can be used for traditional vertex-based operations such as

transforming the position by a matrix, computing the lighting equation

to generate a per-vertex color, and generating or transforming texture

coordinates. Alternatively, because the vertex shader is specified by the

application, vertex shaders can be used to perform custom math that

enables new transforms, lighting, or vertex-based effects not allowed in

more traditional fixed-function pipelines.

Figure 1-2 OpenGL ES 3.0 Vertex Shader (inputs: Input (Attribute) 0 through N, Uniforms, Samplers; outputs: Output (Varying) 0 through N, gl_Position, gl_PointSize)

Example 1-1 shows a vertex shader written using the OpenGL ES shading

language. We explain vertex shaders in significant detail later in the book.

We present this shader here just to give you an idea of what a vertex

shader looks like. The vertex shader in Example 1-1 takes a position and

its associated color data as input attributes, transforms the position using

a 4 × 4 matrix, and outputs the transformed position and color.

Example 1-1 A Vertex Shader Example

1.  #version 300 es
2.  uniform mat4 u_mvpMatrix; // matrix to convert a_position
3.                            // from model space to normalized
4.                            // device space
5.
6.  // attributes input to the vertex shader
7.  in vec4 a_position;       // position value
8.  in vec4 a_color;          // input vertex color
9.
10. // output of the vertex shader - input to fragment
11. // shader
12. out vec4 v_color;         // output vertex color
13. void main()
14. {
15.    v_color = a_color;
16.    gl_Position = u_mvpMatrix * a_position;
17. }

Line 1 provides the version of the Shading Language—information

that must appear on the rst line of the shader (#version 300 es

indicates the OpenGL ES Shading Language v3.00). Line 2 describes a

uniform variable u_mvpMatrix that stores the combined model view and

projection matrix. Lines 7 and 8 describe the inputs to the vertex shader

and are referred to as vertex attributes. a_position is the input vertex

position attribute and a_color is the input vertex color attribute. On

line 12, we declare the output v_color to store the output of the vertex

shader that describes the per-vertex color. The built-in variable called

gl_Position is declared automatically, and the shader must write the

transformed position to this variable. A vertex or fragment shader has

a single entry point called the main function. Lines 13–17 describe the

vertex shader main function. In line 15, we read the vertex attribute input

a_color and write it as the vertex output color v_color. In line 16, the

transformed vertex position is output by writing it to gl_Position.
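
To make the matrix multiplication on line 16 of Example 1-1 concrete, the following C sketch computes the same 4 × 4 matrix by 4-component vector product on the CPU. It is an illustration only: the helper name mat4_mul_vec4 is hypothetical and belongs neither to OpenGL ES nor to the book's framework, and the matrix is assumed to be stored in column-major order.

```c
#include <assert.h>
#include <stddef.h>

/* Multiply a 4x4 matrix by a 4-component column vector: out = m * v.
 * Assumption: the matrix is column-major, so element (row, col) lives at
 * m[col * 4 + row]. This is the layout glUniformMatrix4fv reads when its
 * transpose argument is GL_FALSE. */
void mat4_mul_vec4(const float m[16], const float v[4], float out[4])
{
    assert(m != NULL && v != NULL && out != NULL && out != v);
    for (int row = 0; row < 4; ++row) {
        out[row] = 0.0f;
        for (int col = 0; col < 4; ++col) {
            out[row] += m[col * 4 + row] * v[col];
        }
    }
}
```

With a matrix that only translates by (2, 3, 4), the position (1, 1, 1, 1) would come out as (3, 4, 5, 1), which is the kind of value the shader writes to gl_Position.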

Primitive Assembly

After the vertex shader, the next stage in the OpenGL ES 3.0 graphics

pipeline is primitive assembly. A primitive is a geometric object such

as a triangle, line, or point sprite. Each vertex of a primitive is sent to

a different copy of the vertex shader. During primitive assembly, these

vertices are grouped back into the primitive.

For each primitive, it must be determined whether the primitive lies

within the view frustum (the region of 3D space that is visible on the

screen). If the primitive is not completely inside the view frustum,

it might need to be clipped to the view frustum. If the primitive is

completely outside this region, it is discarded. After clipping, the vertex

position is converted to screen coordinates. A culling operation can also

be performed that discards primitives based on whether they face forward

or backward. After clipping and culling, the primitive is ready to be passed

to the next stage of the pipeline—the rasterization stage.
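
The frustum check and the conversion to screen coordinates described above can be sketched in C as follows. This is a simplified illustration, not a description of any particular implementation: the names Vec4, Vec3, inside_clip_volume, and clip_to_window are hypothetical, the depth range is assumed to be the default [0, 1], and the actual clipping of partially visible primitives is left out.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { float x, y, z, w; } Vec4;  /* clip-space position (gl_Position) */
typedef struct { float x, y, z; } Vec3;     /* window-space position */

/* A clip-space vertex lies inside the view volume when each of x, y, and z
 * is within [-w, w]. */
bool inside_clip_volume(Vec4 c)
{
    return -c.w <= c.x && c.x <= c.w &&
           -c.w <= c.y && c.y <= c.w &&
           -c.w <= c.z && c.z <= c.w;
}

/* Perspective divide followed by the viewport transform. The viewport
 * parameters mirror the arguments of glViewport(x, y, width, height). */
Vec3 clip_to_window(Vec4 c, float vx, float vy, float vw, float vh)
{
    assert(c.w != 0.0f);
    /* Perspective divide: clip space to normalized device coordinates. */
    float ndc_x = c.x / c.w;
    float ndc_y = c.y / c.w;
    float ndc_z = c.z / c.w;

    Vec3 win;
    win.x = vx + (ndc_x + 1.0f) * 0.5f * vw;
    win.y = vy + (ndc_y + 1.0f) * 0.5f * vh;
    win.z = (ndc_z + 1.0f) * 0.5f;  /* assumed default depth range [0, 1] */
    return win;
}
```

A primitive whose vertices all pass inside_clip_volume needs no clipping; one whose vertices all fail the same plane test can be discarded outright.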

Rasterization

The next stage, shown in Figure 1-3, is the rasterization phase, where the

appropriate primitive (point sprite, line, or triangle) is drawn. Rasterization

is the process that converts primitives into a set of two-dimensional

fragments, which are then processed by the fragment shader. These two-

dimensional fragments represent pixels that can be drawn on the screen.

Figure 1-3 OpenGL ES 3.0 Rasterization Stage (from primitive assembly, through point sprite, line, or triangle rasterization, to the fragment shader stage; output for each fragment: screen (xw, yw) coordinate and attributes such as color and texture coordinates)

Fragment Shader

The fragment shader implements a general-purpose programmable

method for operating on fragments. As shown in Figure 1-4, this shader is

executed for each generated fragment by the rasterization stage and takes

the following inputs:

Shader program—Fragment shader program source code or executable

that describes the operations that will be performed on the fragment.

Input variables—Outputs of the vertex shader that are generated by

the rasterization unit for each fragment using interpolation.

Uniforms—Constant data used by the fragment (or vertex) shader.

Samplers—Specic types of uniforms that represent textures used by

the fragment shader.

The fragment shader can either discard the fragment or generate one or more

color values referred to as outputs. Typically, the fragment shader outputs just

a single color value, except when rendering to multiple render targets (see

the section Multiple Render Targets in Chapter 11); in the latter case, a color

value is output for each render target. The color, depth, stencil, and screen

coordinate location (xw, yw) generated by the rasterization stage become

inputs to the per-fragment operations stage of the OpenGL ES 3.0 pipeline.

Figure 1-4 OpenGL ES 3.0 Fragment Shader (inputs: Input (Varying) 0 through N, Uniforms, Samplers, gl_FragCoord, gl_FrontFacing, gl_PointCoord; outputs: Output Color 0 through N, gl_FragDepth)

Example 1-2 describes a simple fragment shader that can be coupled with

the vertex shader described in Example 1-1 to draw a Gouraud-shaded

triangle. Again, we will go into much more detail on fragment shaders

later in the book. We present this example just to give you a basic idea of

what a fragment shader looks like.

Example 1-2 A Fragment Shader Example

1.  #version 300 es
2.  precision mediump float;
3.
4.  in vec4 v_color;    // input vertex color from vertex shader
5.
6.  out vec4 fragColor; // output fragment color
7.  void main()
8.  {
9.     fragColor = v_color;
10. }

Just as in the vertex shader, line 1 provides the version of the Shading

Language; this information must appear on the first line of the fragment

shader (#version 300 es indicates the OpenGL ES Shading Language

v3.00). Line 2 sets the default precision qualifier, which is explained in

detail in Chapter 4, “Shaders and Programs.” Line 4 describes the input

to the fragment shader. The vertex shader must write out the same set

of variables that are read in by the fragment shader. Line 6 provides the

declaration for the output variable of the fragment shader, which will be

the color passed on to the next stage. Lines 7–10 describe the fragment

shader main function. The output color is set to the input color v_color.

The inputs to the fragment shader are linearly interpolated across the

primitive before being passed into the fragment shader.
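
The interpolation that produces v_color for each fragment can be sketched in C with barycentric weights. This is a minimal illustration under stated assumptions: the names Color and interpolate_color are hypothetical, the weights are assumed to be non-negative and to sum to 1, and perspective correction, which the pipeline applies to smoothly interpolated outputs by default, is ignored.

```c
#include <assert.h>

typedef struct { float r, g, b, a; } Color;

/* Linearly interpolate the colors of a triangle's three vertices.
 * w0, w1, and w2 are the barycentric weights of the fragment with respect
 * to vertices 0, 1, and 2; they are assumed to sum to 1. */
Color interpolate_color(Color c0, Color c1, Color c2,
                        float w0, float w1, float w2)
{
    assert(w0 >= 0.0f && w1 >= 0.0f && w2 >= 0.0f);
    Color out;
    out.r = w0 * c0.r + w1 * c1.r + w2 * c2.r;
    out.g = w0 * c0.g + w1 * c1.g + w2 * c2.g;
    out.b = w0 * c0.b + w1 * c1.b + w2 * c2.b;
    out.a = w0 * c0.a + w1 * c1.a + w2 * c2.a;
    return out;
}
```

A fragment that coincides with a vertex has weight 1 for that vertex and receives exactly its color; fragments in the interior receive a blend, which is what gives a Gouraud-shaded triangle its smooth gradient.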

Per-Fragment Operations

After the fragment shader, the next stage is per-fragment operations. A

fragment produced by rasterization with (xw, yw) screen coordinates can

only modify the pixel at location (xw, yw) in the framebuffer. Figure 1-5

describes the OpenGL ES 3.0 per-fragment operations stage.

Figure 1-5 OpenGL ES 3.0 Per-Fragment Operations (in order: Fragment Data, Pixel Ownership Test, Scissor Test, Stencil Test, Depth Test, Blending, Dithering, To Framebuffer)

During the per-fragment operations stage, the following functions (and

tests) are performed on each fragment, as shown in Figure 1-5:

Pixel ownership test—This test determines whether the pixel at

location (xw, yw) in the framebuffer is currently owned by OpenGL

ES. This test allows the window system to control which pixels in the

framebuffer belong to the current OpenGL ES context. For example,

if a window displaying the OpenGL ES framebuffer is

obscured by another window, the windowing system may determine

that the obscured pixels are not owned by the OpenGL ES context

and, therefore, the pixels might not be displayed at all. While the

pixel ownership test is part of OpenGL ES, it is not controlled by the

developer, but rather takes place internally inside of OpenGL ES.

Scissor test—The scissor test determines whether (xw, yw) lies within

the scissor rectangle dened as part of the OpenGL ES state. If the

fragment is outside the scissor region, the fragment is discarded.

Stencil and depth tests—These tests are performed on the stencil and

depth value of the incoming fragment to determine whether the

fragment should be rejected.

Blending—Blending combines the newly generated fragment color value

with the color values stored in the framebuffer at location (xw, yw).

Dithering—Dithering can be used to minimize the artifacts that occur as

a result of using limited precision to store color values in the framebuffer.
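
Three of the operations listed above reduce to very small calculations, sketched here in C. The sketch rests on assumptions: the names ScissorBox, scissor_test, depth_test_less, and blend_src_alpha are hypothetical, the depth test assumes the GL_LESS comparison, and the blend assumes the common configuration glBlendFunc(GL_SRC_ALPHA, GL_ONE_MINUS_SRC_ALPHA) with the default additive blend equation. Real implementations perform these steps in hardware.

```c
#include <assert.h>
#include <stdbool.h>

/* Scissor rectangle, mirroring the arguments of glScissor(x, y, width, height). */
typedef struct { int x, y, width, height; } ScissorBox;

/* The fragment at (xw, yw) passes when it lies inside the rectangle;
 * the left/bottom edges are inclusive, the right/top edges exclusive. */
bool scissor_test(ScissorBox box, int xw, int yw)
{
    assert(box.width >= 0 && box.height >= 0);
    return xw >= box.x && xw < box.x + box.width &&
           yw >= box.y && yw < box.y + box.height;
}

/* Depth test, assuming GL_LESS: the fragment passes when it is closer
 * than the value already stored in the depth buffer. */
bool depth_test_less(float fragment_depth, float stored_depth)
{
    return fragment_depth < stored_depth;
}

/* Blend one color channel, assuming source factor GL_SRC_ALPHA and
 * destination factor GL_ONE_MINUS_SRC_ALPHA. */
float blend_src_alpha(float src, float dst, float src_alpha)
{
    return src * src_alpha + dst * (1.0f - src_alpha);
}
```

A fragment that fails the scissor or depth test is rejected and never reaches the blend step, so the pixel already in the framebuffer is left unchanged.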

At the end of the per-fragment stage, either the fragment is rejected or

a fragment color(s), depth, or stencil value is written to the framebuffer

at location (xw, yw). Writing of the fragment color(s), depth, and stencil

values depends on whether the appropriate write masks are enabled.

Write masks allow ner control over the color, depth, and stencil values

written into the associated buffers. For example, the write mask for the

color buffer could be set such that no red values are written into the color

buffer. In addition, OpenGL ES 3.0 provides an interface to read back the

pixels from the framebuffer.
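
The effect of a color write mask can be sketched in C as a per-channel selection between the stored and the incoming value. The names Pixel, ColorMask, and write_with_mask are hypothetical and used only for illustration; in OpenGL ES itself the mask is set with glColorMask, and the masking happens inside the implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

typedef struct { uint8_t r, g, b, a; } Pixel;
typedef struct { bool r, g, b, a; } ColorMask;  /* true = channel is writable */

/* Write an incoming color into a stored pixel, updating only the channels
 * whose mask entry is true. Masked-off channels keep their old value. */
void write_with_mask(Pixel *stored, Pixel incoming, ColorMask mask)
{
    assert(stored != NULL);
    if (mask.r) stored->r = incoming.r;
    if (mask.g) stored->g = incoming.g;
    if (mask.b) stored->b = incoming.b;
    if (mask.a) stored->a = incoming.a;
}
```

The "no red values" case from the text corresponds to a mask of {false, true, true, true}, the analogue of calling glColorMask(GL_FALSE, GL_TRUE, GL_TRUE, GL_TRUE).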

Note: Alpha test and LogicOp are no longer part of the per-fragment

operations stage. These two stages exist in OpenGL 2.0 and

OpenGL ES 1.x. The alpha test stage is no longer needed because

the fragment shader can discard fragments; thus the alpha test

can be performed in the fragment shader. In addition, LogicOp

was removed because it is used only rarely by applications, and

the OpenGL ES working group did not receive requests from

independent software vendors (ISVs) to support this feature in

OpenGL ES 2.0.

What’s New in OpenGL ES 3.0

OpenGL ES 2.0 ushered in the era of programmable shaders for handheld

devices and has been wildly successful in powering games, applications,

and user interfaces across a wide range of devices. OpenGL ES 3.0

extends OpenGL ES 2.0 to support many new rendering techniques,

optimizations, and visual quality enhancements. The following sections

provide a categorized overview of the major new features that have been

added to OpenGL ES 3.0. Each of these features will be described in detail

later in the book.

Texturing

OpenGL ES 3.0 introduces many new features related to texturing:

sRGB textures and framebuffers—Allow the application to perform

gamma-correct rendering. Textures can be stored in gamma-corrected

sRGB space, uncorrected to linear space upon being fetched in the

shader, and then converted back to sRGB gamma-corrected space on

output to the framebuffer. This enables potentially higher visual fidelity

by properly computing lighting and other calculations in linear space.

2D texture arrays—A texture target that stores an array of 2D textures.

Such arrays might, for example, be used to perform texture animation.

Prior to 2D texture arrays, such animation was typically done by tiling

the frames of an animation in a single 2D texture and modifying the

texture coordinates to change animation frames. With 2D texture

arrays, each frame of the animation can be specified in a 2D slice of

the array.

3D textures—While some OpenGL ES 2.0 implementations supported

3D textures through an extension, OpenGL ES 3.0 has made this a

mandatory feature. 3D textures are essential in many medical imaging

applications, such as those that perform direct volume rendering of

3D voxel data (e.g., CT, MRI, or PET data).

Depth textures and shadow comparison—Enable the depth buffer

to be stored in a texture. The most common use for depth textures

is in rendering shadows, where a depth buffer is rendered from the

viewpoint of the light source and then used for comparison when

rendering the scene to determine whether a fragment is in shadow.

In addition to depth textures, OpenGL ES 3.0 allows the comparison

against the depth texture to be done at the time of fetch, thereby

allowing bilinear ltering to be done on depth textures (also known as

percentage closest ltering [PCF]).

Seamless cubemaps—In OpenGL ES 2.0, rendering with cubemaps

could produce artifacts at the boundaries between cubemap faces. In

OpenGL ES 3.0, cubemaps can be sampled such that filtering uses data

from adjacent faces and removes the seaming artifact.

Floating-point textures—OpenGL ES 3.0 greatly expands on the

texture formats supported. Floating-point half-float (16-bit) textures

are supported and can be filtered, whereas full-float (32-bit) textures

are supported but not filterable. The ability to access floating-point

texture data has many applications, ranging from high dynamic range

texturing to general-purpose computation.

ETC2/EAC texture compression—While several OpenGL ES 2.0

implementations provided support for vendor-specific compressed

texture formats (e.g., ATC by Qualcomm, PVRTC by Imagination

Technologies, and Ericsson Texture Compression by Ericsson),

there was no standard compression format that developers could rely

on. In OpenGL ES 3.0, support for ETC2/EAC is mandatory. The ETC2/

EAC formats provide compression for RGB888, RGBA8888, and one-

and two-channel signed/unsigned texture data. Texture compression

offers several advantages, including better performance (due to better

utilization of the texture cache) as well as a reduction in GPU memory

utilization.

Integer textures—OpenGL ES 3.0 introduces the capability to render

to and fetch from textures stored as unnormalized signed or unsigned

8-bit, 16-bit, and 32-bit integer textures.

Additional texture formats—In addition to those formats already

mentioned, OpenGL ES 3.0 includes support for 11-11-10 RGB

floating-point textures, shared exponent RGB 9-9-9-5 textures,

10-10-10-2 integer textures, and 8-bit-per-component signed

normalized textures.

Non-power-of-2 textures (NPOT)—Textures can now be specified with

non-power-of-2 dimensions. This is useful in many situations, such as

when texturing from a video or camera feed that is captured/recorded

at a non-power-of-2 dimension.

Texture level of detail (LOD) features—The texture LOD parameter

used to determine which mipmap to fetch from can now be

clamped. Additionally, the base and maximum mipmap level can

be clamped. These two features, in combination, make it possible to

stream mipmaps. As larger mipmap levels become available, the base

level can be increased and the LOD value can be smoothly increased

to provide smooth-looking streaming textures. This is very useful, for

example, when downloading texture mipmap data over a network

connection.

Texture swizzles—A new texture object state was introduced to allow

independent control of where each channel (R, G, B, and A) of texture

data is mapped to in the shader.

Immutable textures—Provide a mechanism for the application to

specify the format and size of a texture before loading it with data. In

doing so, the texture format becomes immutable and the OpenGL ES

driver can perform all consistency and memory checks up-front. This

can improve performance by allowing the driver to skip consistency

checks at draw time.

Increased minimum sizes—All OpenGL ES 3.0 implementations are

required to support much larger texture resources than OpenGL ES

2.0. For example, the minimum supported 2D texture dimension in

OpenGL ES 2.0 was 64 but was increased to 2048 in OpenGL ES 3.0.
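To make the mipmap-related items above a little more concrete, the following is a minimal sketch in plain C (it makes no OpenGL ES calls) of how the length of a mipmap chain follows from the texture dimensions. This is the kind of arithmetic a streaming scheme might use when deciding which levels to expose through the base and maximum level. The helper names mipLevelCount, mipLevelSize, and printMipChain are invented for this illustration and are not part of the OpenGL ES API.

```c
#include <stdio.h>

/* Hypothetical helper (not an OpenGL ES function): number of levels in a
   complete mipmap chain for a width x height texture. The larger dimension
   is halved (rounding down) until it reaches 1. */
int mipLevelCount(int width, int height)
{
    int largest = (width > height) ? width : height;
    int levels = 1;
    while (largest > 1) {
        largest >>= 1;
        ++levels;
    }
    return levels;
}

/* Hypothetical helper: size of one dimension at a given mipmap level,
   clamped so that it never drops below 1 texel. */
int mipLevelSize(int baseSize, int level)
{
    int size = baseSize >> level;
    return (size > 0) ? size : 1;
}

/* Print every level of the chain, for example for a 640 x 480
   (non-power-of-2) video frame. */
void printMipChain(int width, int height)
{
    int levels = mipLevelCount(width, height);
    for (int level = 0; level < levels; ++level) {
        printf("level %d: %d x %d\n", level,
               mipLevelSize(width, level), mipLevelSize(height, level));
    }
}
```

Calling printMipChain(640, 480) lists ten levels, from 640 x 480 down to 1 x 1. An application streaming such a texture could upload the small levels first and adjust the base level as the larger ones arrive.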

Shaders

OpenGL ES 3.0 includes a major update to the OpenGL ES Shading

Language (ESSL; to v3.00) and new API features to support new shader

features:

Program binaries—In OpenGL ES 2.0, it was possible to store shaders
in a binary format, but it was still required to link them into a program
at runtime. In OpenGL ES 3.0, the entire linked program binary
(containing the vertex and fragment shader) can be stored in an
offline binary format with no link step required at runtime. This can
potentially help reduce the load time of applications. Additionally,
OpenGL ES 3.0 provides an interface to retrieve the program binary
from the driver so no offline tools are required to use program
binaries.

Mandatory online compiler—OpenGL ES 2.0 made it optional

whether the driver would support online compilation of shaders. The

intent was to reduce the memory requirements of the driver, but this

achievement came at a major cost to developers in terms of having to

rely on vendor-specic tools to generate shaders. In OpenGL ES 3.0, all

implementations will have an online shader compiler.

Non-square matrices—New matrix types other than square matrices

are supported, and associated uniform calls were added to the API to

support loading them. Non-square matrices can reduce the instruction

count required for performing transformations. For example, if

performing an afne transformation, a 4 × 3 matrix can be used in

place of a 4 × 4 where the last row is (0, 0, 0, 1), thus reducing the

instructions required to perform the transformation.

Full integer support—Integer (and unsigned integer) scalar and vector

types, along with full integer operations, are supported in ESSL 3.00.

There are various built-in functions such as conversion from int to

oat, and from oat to int, as well as the ability to read integer values

from textures and output integer values to integer color buffers.

Centroid sampling—To avoid rendering artifacts when multisampling,

the output variables from the vertex shader (and inputs to the

fragment shader) can be declared with centroid sampling.

Flat/smooth interpolators—In OpenGL ES 2.0, all interpolators were

implicitly linearly interpolated across the primitive. In ESSL 3.00,

interpolators (vertex shader outputs/fragment shader inputs) can be

explicitly declared to have either smooth or flat shading.

Uniform blocks—Uniform values can be grouped together into

uniform blocks. Uniform blocks can be loaded more efficiently and

also shared across multiple shader programs.

Layout qualiers—Vertex shader inputs can be declared with layout

qualiers to explicitly bind the location in the shader source without

requiring making API calls. Layout qualiers can also be used for

fragment shader outputs to bind the outputs to each target when

rendering to multiple render targets. Further, layout qualiers can be

used to control the memory layout for uniform blocks.

Instance and vertex ID—The vertex index is now accessible in the

vertex shader as well as the instance ID if using instance rendering.

Fragment depth—The fragment shader can explicitly control the

depth value for the current fragment rather than relying on the

interpolation of its depth value.

New built-in functions—ESSL 3.00 introduces many new built-in

functions to support new texture features, fragment derivatives,
half-float data conversion, and matrix and math operations.

Relaxed limitations—ESSL 3.00 greatly relaxes the restrictions on

shaders. Shaders are no longer limited in terms of instruction length,

fully support looping and branching on variables, and support

indexing on arrays.
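The saving claimed for non-square matrices in the list above can be sketched on the CPU side in C. The example assumes a 4 × 3 matrix in the ESSL mat4x3 sense (4 columns, 3 rows) stored column-major, which is the layout glUniformMatrix4x3fv accepts. The type name Affine4x3 and the function affineTransformPoint are hypothetical and exist only for this illustration.

```c
/* Column-major 4 x 3 affine matrix (4 columns, 3 rows): columns 0..2 hold
   the linear part and column 3 holds the translation. The implied fourth
   row (0, 0, 0, 1) is simply not stored. Hypothetical type name. */
typedef struct {
    float m[12];
} Affine4x3;

/* Transform the point (x, y, z, 1). This takes 9 multiplies and 9 adds,
   versus 16 multiplies and 12 adds for a full 4 x 4 matrix times a
   four-component vector. */
void affineTransformPoint(const Affine4x3 *a, const float in[3], float out[3])
{
    for (int row = 0; row < 3; ++row) {
        out[row] = a->m[0 + row] * in[0]
                 + a->m[3 + row] * in[1]
                 + a->m[6 + row] * in[2]
                 + a->m[9 + row];   /* translation; w is implicitly 1 */
    }
}
```

For instance, an Affine4x3 holding the identity in its first three columns and (5, -2, 3) in its last column moves the point (1, 2, 3) to (6, 0, 6). Whether a particular GPU actually saves instructions this way depends on its shader compiler, so the operation counts in the comment are only the arithmetic lower bound.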

Geometry

OpenGL ES 3.0 introduces several new features related to geometry

specication and control of primitive rendering:

Transform feedback—Allows the output of the vertex shader to

be captured in a buffer object. This is useful for a wide range of

techniques that perform animation on the GPU without any CPU

intervention—for example, particle animation or physics simulation

using render-to-vertex-buffer.

Boolean occlusion queries—Enable the application to query whether
any pixels of a draw call (or a set of draw calls) pass the depth
test. This feature can be used within a variety of techniques, such as
visibility determination for a lens flare effect as well as optimization
to avoid performing geometry processing on objects whose bounding
volume is obscured.

Instanced rendering—Efficiently renders objects that contain similar

geometry but differ by attributes (such as transformation matrix, color,

or size). This feature is useful in rendering large quantities of similar

objects, such as for crowd rendering.

Primitive restart—In OpenGL ES 2.0, starting a new primitive within a
triangle strip required the application to insert indices into the
index buffer to represent a degenerate triangle. In OpenGL ES 3.0, a
special index value can be used that indicates the beginning of a new
primitive. This obviates the need for generating degenerate triangles
when using triangle strips.

New vertex formats—New vertex formats, including 10-10-10-2 signed

and unsigned normalized vertex attributes; 8-bit, 16-bit, and 32-bit

integer attributes; and 16-bit half-float, are supported in OpenGL ES 3.0.
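The difference between the two ways of joining triangle strips mentioned under primitive restart can be sketched with plain index arrays. The C sketch below assumes 16-bit indices and the fixed restart index enabled with GL_PRIMITIVE_RESTART_FIXED_INDEX, which for this index type is 0xFFFF. The function names joinStripsDegenerate and joinStripsRestart are hypothetical helpers, not OpenGL ES calls.

```c
#include <stddef.h>

/* With GL_PRIMITIVE_RESTART_FIXED_INDEX enabled, the restart index is the
   largest value of the index type; for 16-bit indices that is 0xFFFF. */
#define RESTART_INDEX_U16 0xFFFFu

/* OpenGL ES 2.0 style (hypothetical helper): join two triangle strips with
   degenerate, zero-area triangles. Both strips must be non-empty and out
   needs room for na + nb + 3 indices. Returns the number written. */
size_t joinStripsDegenerate(const unsigned short *a, size_t na,
                            const unsigned short *b, size_t nb,
                            unsigned short *out)
{
    size_t n = 0;
    for (size_t i = 0; i < na; ++i) out[n++] = a[i];
    out[n++] = a[na - 1];        /* repeat last index of the first strip   */
    if (na % 2 != 0)
        out[n++] = a[na - 1];    /* extra repeat preserves winding order   */
    out[n++] = b[0];             /* repeat first index of the second strip */
    for (size_t i = 0; i < nb; ++i) out[n++] = b[i];
    return n;
}

/* OpenGL ES 3.0 style (hypothetical helper): separate the strips with a
   single restart index. out needs room for na + nb + 1 indices. */
size_t joinStripsRestart(const unsigned short *a, size_t na,
                         const unsigned short *b, size_t nb,
                         unsigned short *out)
{
    size_t n = 0;
    for (size_t i = 0; i < na; ++i) out[n++] = a[i];
    out[n++] = (unsigned short)RESTART_INDEX_U16;
    for (size_t i = 0; i < nb; ++i) out[n++] = b[i];
    return n;
}
```

Joining the strips {0, 1, 2, 3} and {4, 5, 6, 7} yields ten indices with the degenerate approach and nine with the restart index. The restart version also needs no special handling for strips with an odd number of indices.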

Buffer Objects

OpenGL ES 3.0 introduces many new buffer objects to increase the

efciency and exibility of specifying data to various parts of the graphics

pipeline:

Uniform buffer objects—Provide an efficient method for storing/

binding large blocks of uniforms. Uniform buffer objects can be used

to reduce the performance cost of binding uniform values to shaders,

which is a common bottleneck in OpenGL ES 2.0 applications.

Vertex array objects—Provide an efficient method for binding and

switching between vertex array states. Vertex array objects are

essentially container objects for vertex array states. Using them allows

an application to switch the vertex array state in a single API call

rather than making several calls.

Sampler objects—Separate the sampler state (texture wrap mode

and ltering) from the texture object. This provides a more efcient

method of sharing the sampler state across textures.

Sync objects—Provide a mechanism for the application to check on
whether a set of OpenGL ES operations has finished executing on
the GPU. A related new feature is a fence, which provides a way for
the application to inform the GPU that it should wait until a set of
OpenGL ES operations has finished executing before queuing up more
operations for execution.

Pixel buffer objects—Enable the application to perform asynchronous

transfer of data to pixel operations and texture transfer operations.

This optimization is primarily intended to provide faster transfer

of data between the CPU and the GPU, where the application can

continue doing work during the transfer operation.

Buffer subrange mapping—Allows the application to map a subregion

of a buffer for access by the CPU. This can provide better performance

than traditional buffer mapping, in which the whole buffer needs to

be available to the client.

Buffer object to buffer object copies—Provide a mechanism to

efciently transfer data from one buffer object to another without

intervention on the CPU.

Framebuffer

OpenGL ES 3.0 adds many new features related to off-screen rendering to

framebuffer objects:

Multiple render targets (MRTs)—Allow the application to render

simultaneously to several color buffers. With MRTs, the

fragment shader outputs several colors, one for each attached color

buffer. MRTs are used in many advanced rendering algorithms, such as

deferred shading.

Multisample renderbuffers—Enable the application to render to off-

screen framebuffers with multisample anti-aliasing. The multisample

renderbuffers cannot be directly bound to textures, but they can

be resolved to single-sample textures using the newly introduced

framebuffer blit.

Framebuffer invalidation hints—Many implementations of OpenGL

ES 3.0 are based on GPUs that use tile-based rendering (TBR;

explained in the Framebuffer Invalidation section in Chapter 12).

It is often the case that TBR incurs a significant performance cost

when having to unnecessarily restore the contents of the tiles for

further rendering to a framebuffer. Framebuffer invalidation gives

the application a mechanism to inform the driver that the contents

of the framebuffer are no longer needed. This allows the driver

to take optimization steps to skip unnecessary restore operations

on the tiles. Such functionality is very important to achieve

peak performance in many applications, especially those that do

signicant amounts of off-screen rendering.

New blend equations—The min/max functions are supported in

OpenGL ES 3.0 as a blend equation.
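As a rough picture of what the min/max blend equations compute, the following C sketch applies them to a single pixel on the CPU. With GL_MIN and GL_MAX the result is taken per channel from the source and destination colors and the blend factors are not applied. The Color type and the blendMin and blendMax functions are hypothetical names used only for this illustration.

```c
/* Hypothetical RGBA color type for the illustration. */
typedef struct { float r, g, b, a; } Color;

static float smaller(float x, float y) { return (x < y) ? x : y; }
static float larger(float x, float y)  { return (x > y) ? x : y; }

/* Per-channel equivalent of blending with the GL_MIN equation:
   each channel keeps the smaller of source and destination. */
Color blendMin(Color src, Color dst)
{
    Color out = { smaller(src.r, dst.r), smaller(src.g, dst.g),
                  smaller(src.b, dst.b), smaller(src.a, dst.a) };
    return out;
}

/* Per-channel equivalent of blending with the GL_MAX equation:
   each channel keeps the larger of source and destination. */
Color blendMax(Color src, Color dst)
{
    Color out = { larger(src.r, dst.r), larger(src.g, dst.g),
                  larger(src.b, dst.b), larger(src.a, dst.a) };
    return out;
}
```

In an application the same effect would be selected with glBlendEquation; the functions here only mirror the arithmetic.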

OpenGL ES 3.0 and Backward Compatibility

OpenGL ES 3.0 is backward compatible with OpenGL ES 2.0. This means

that just about any application written to use OpenGL ES 2.0 will run on

implementations of OpenGL ES 3.0. There are some very minor changes

to the later version that will affect a small number of applications in terms

of backward compatibility. Namely, framebuffer objects are no longer

shared between contexts, cubemaps are always filtered using seamless
filtering, and there are minor changes in the way signed fixed-point
numbers are converted to floating-point numbers.
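The last of these changes can be illustrated for signed normalized 8-bit values. The C sketch below assumes the conversion rules given in the respective specifications: f = (2c + 1) / (2^8 - 1) in OpenGL ES 2.0 and f = max(c / (2^7 - 1), -1.0) in OpenGL ES 3.0. The function names are hypothetical.

```c
/* Hypothetical helpers comparing the two conversion rules for a signed
   normalized 8-bit value c in the range [-128, 127]. */

/* OpenGL ES 2.0 rule: f = (2c + 1) / (2^8 - 1).
   Note that no input maps exactly to 0.0. */
float snorm8ToFloatES2(signed char c)
{
    return (2.0f * (float)c + 1.0f) / 255.0f;
}

/* OpenGL ES 3.0 rule: f = max(c / (2^7 - 1), -1.0).
   Zero maps exactly to 0.0, and both -128 and -127 map to -1.0. */
float snorm8ToFloatES3(signed char c)
{
    float f = (float)c / 127.0f;
    return (f < -1.0f) ? -1.0f : f;
}
```

Under the older rule an input of 0 converts to 1/255 rather than 0.0, which is the kind of small difference an application relying on exact values might notice when moving to OpenGL ES 3.0.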

The fact that OpenGL ES 3.0 is backward compatible with OpenGL ES
2.0 differs from what was done for OpenGL ES 2.0 with respect to its
backward compatibility with previous versions of OpenGL ES. OpenGL ES
2.0 is not backward compatible with OpenGL ES 1.x. OpenGL ES 2.0/3.0
do not support the fixed-function pipeline that OpenGL ES 1.x supports.
The OpenGL ES 2.0/3.0 programmable vertex shader replaces the
fixed-function vertex units implemented in OpenGL ES 1.x. The fixed-function
vertex units implement a specific vertex transformation and lighting
equation that can be used to transform the vertex position, transform
or generate texture coordinates, and calculate the vertex color. Similarly,
the programmable fragment shader replaces the fixed-function texture
combine units implemented in OpenGL ES 1.x. The fixed-function texture
combine units implement a texture combine stage for each texture unit.
The texture color is combined with the diffuse color and the output of the
previous texture combine stage with a fixed set of operations such as add,
modulate, subtract, and dot.

The OpenGL ES working group decided against backward compatibility

between OpenGL ES 2.0/3.0 and OpenGL ES 1.x for the following reasons:

Supporting the xed-function pipeline in OpenGL ES 2.0/3.0

implies that the API would support more than one way of

implementing a feature, in violation of one of the criteria used

by the working group in determining which features should be

supported. The programmable pipeline allows applications to

implement the xed-function pipeline using shaders, so there

is really no compelling reason to be backward compatible with

OpenGL ES 1.x.

Feedback from ISVs indicated that most games do not mix

programmable and xed-function pipelines. That is, games are written

either for a xed-function pipeline or for a programmable pipeline.

Once you have a programmable pipeline, there is no reason to use

a xed-function pipeline, as you have much more exibility in the

effects that can be rendered.

The OpenGL ES 2.0/3.0 driver’s memory footprint would be much

larger if it had to support both the fixed-function and programmable

pipelines. For the devices targeted by OpenGL ES, minimizing

memory footprint is an important design criterion. Separating the

fixed-function support into the OpenGL ES 1.x API and placing the

programmable shader support into the OpenGL ES 2.0/3.0 APIs meant

that vendors that do not require OpenGL ES 1.x support no longer

need to include this driver.

EGL

OpenGL ES commands require a rendering context and a drawing

surface. The rendering context stores the appropriate OpenGL ES state.

The drawing surface is the surface to which primitives will be drawn.

The drawing surface species the types of buffers that are required for

rendering, such as a color buffer, depth buffer, and stencil buffer. The

drawing surface also species the bit depths of each of the required buffers.

The OpenGL ES API does not mention how a rendering context is created

or how the rendering context gets attached to the native windowing

system. EGL is one interface between the Khronos rendering APIs such

as OpenGL ES and the native window system; there is no hard-and-fast

requirement to provide EGL when implementing OpenGL ES. Developers

should refer to the platform vendor’s documentation to determine which

interface is supported. As of this writing, the only known platform

supporting OpenGL ES that does not support EGL is iOS.

Any OpenGL ES application will need to perform the following tasks using

EGL before any rendering can begin:

Query the displays that are available on the device and initialize

them. For example, a ip phone might have two LCD panels, and it is

possible that we might use OpenGL ES to render to surfaces that can

be displayed on either or both panels.

Create a rendering surface. Surfaces created in EGL can be categorized

as on-screen surfaces or off-screen surfaces. On-screen surfaces are

attached to the native window system, whereas off-screen surfaces are

pixel buffers that do not get displayed but can be used as rendering

surfaces. These surfaces can be used to render into a texture and can

be shared across multiple Khronos APIs.

Create a rendering context. EGL is needed to create an OpenGL ES

rendering context. This context needs to be attached to an appropriate

surface before rendering can actually begin.

The EGL API implements the features just described as well as additional

functionality such as power management, support for multiple rendering

contexts in a process, sharing objects (such as textures or vertex buffers)

across rendering contexts in a process, and a mechanism to get function

pointers to EGL or OpenGL ES extension functions supported by a given

implementation.

As of this writing, the latest version of the EGL specification is EGL version 1.4.

Programming with OpenGL ES 3.0

To write any OpenGL ES 3.0 application, you need to know which header

files must be included and with which libraries your application needs to

link. It is also useful to understand the syntax used by the EGL and GL

command names and command parameters.

Libraries and Include Files

OpenGL ES 3.0 applications need to link with the following libraries: the

OpenGL ES 3.0 library named libGLESv2.lib and the EGL library named

libEGL.lib.

OpenGL ES 3.0 applications also need to include the appropriate ES 3.0

and EGL header les. The following include les must be included by any

OpenGL ES 3.0 application:

#include <EGL/egl.h>

#include <GLES3/gl3.h>

egl.h is the EGL header le and gl3.h is the OpenGL ES 3.0 header le.

Applications can optionally include gl2ext.h, which is the header le that

describes the list of Khronos-approved extensions for OpenGL ES 2.0/3.0.

The header le and library names are platform dependent. The OpenGL

ES working group has tried to dene the library and header names and

indicate how they should be organized, but this arrangement might not

be found on all OpenGL ES platforms. Developers should, however, refer

to the platform vendor’s documentation for information on how the

libraries and include les are named and organized. The ofcial OpenGL

ES header les are maintained by Khronos and available from http://

khronos.org/registry/gles/. The sample code for the book also includes a

copy of the header les (working with the sample code is described in the

next chapter).

EGL Command Syntax

All EGL commands begin with the prefix egl and use an initial
capital letter for each word making up the command name (e.g.,
eglCreateWindowSurface). Similarly, EGL data types begin with the
prefix EGL and use an initial capital letter for each word making up the
type name, except for EGLint and EGLenum.

Table 1-1 briey describes the EGL data types used.

OpenGL ES Command Syntax

All OpenGL ES commands begin with the prefix gl and use an initial capital
letter for each word making up the command name (e.g., glBlendEquation).
Similarly, OpenGL ES data types also begin with the prefix GL.

In addition, some commands might take arguments in different flavors.
The flavors or types vary in terms of the number of arguments taken
(one to four arguments), the data type of the arguments used (byte [b],
unsigned byte [ub], short [s], unsigned short [us], int [i], and float [f]),
and whether the arguments are passed as a vector (v). A few examples of
command flavors allowed in OpenGL ES follow.

The following two commands are equivalent except that one specifies the
uniform value as floats and the other as integers:

glUniform2f(location, 1.0f, 0.0f);
glUniform2i(location, 1, 0);

The following lines describe commands that are also equivalent, except

that one passes command arguments as a vector and the other does not:

GLfloat coord[4] = { 1.0f, 0.75f, 0.25f, 0.0f };
glUniform4fv(location, 1, coord);
glUniform4f(location, coord[0], coord[1], coord[2], coord[3]);

Table 1-2 describes the command suffixes and argument data types used in

OpenGL ES.

Finally, OpenGL ES denes the type GLvoid. This type is used for OpenGL

ES commands that accept pointers.

In the rest of this book, OpenGL ES commands are referred to by their base
names only, and an asterisk is used to indicate that this base name refers
to multiple flavors of the command name. For example, glUniform*()
stands for all variations of the command you use to specify uniforms and
glUniform*v() refers to all the vector versions of the command you use
to specify uniforms. If a particular version of a command needs to be
discussed, we use the full command name with the appropriate suffixes.

Table 1-1 EGL Data Types

Data Type                 C-Language Type   EGL Type
32-bit integer            int               EGLint
32-bit unsigned integer   unsigned int      EGLBoolean, EGLenum
Pointer                   void *            EGLConfig, EGLContext, EGLDisplay,
                                            EGLSurface, EGLClientBuffer

Error Handling

OpenGL ES commands incorrectly used by applications generate an error

code. This error code is recorded and can be queried using glGetError.

No other errors will be recorded until the application has queried the first

error code using glGetError. Once the error code has been queried, the

current error code is reset to GL_NO_ERROR. The command that generated

the error is ignored and does not affect the OpenGL ES state except for the

GL_OUT_OF_MEMORY error described later in this section.

The glGetError command is described next.

Table 1-2 OpenGL ES Command Suffixes and Argument Data Types

Suffix   Data Type                 C-Language Type                         GL Type
b        8-bit signed integer      signed char                             GLbyte
ub       8-bit unsigned integer    unsigned char                           GLubyte, GLboolean
s        16-bit signed integer     short                                   GLshort
us       16-bit unsigned integer   unsigned short                          GLushort
i        32-bit signed integer     int                                     GLint
ui       32-bit unsigned integer   unsigned int                            GLuint, GLbitfield, GLenum
x        16.16 fixed point         int                                     GLfixed
f        32-bit floating point     float                                   GLfloat, GLclampf
i64      64-bit integer            khronos_int64_t (platform dependent)    GLint64
ui64     64-bit unsigned integer   khronos_uint64_t (platform dependent)   GLuint64

Table 1-3 lists the basic error codes and their descriptions. Other error codes
besides the basic ones listed in this table are described in the chapters that
cover OpenGL ES commands that generate these specific errors.
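Because glGetError returns a numeric code, applications often wrap the query in a small helper that turns the basic codes into readable names. The following C sketch shows one way this might look. The helper names errorName and reportErrors are hypothetical, and the getError parameter merely stands in for glGetError so that the sketch does not need a rendering context. The numeric values are the ones defined in the Khronos headers and are repeated here only so the block compiles without <GLES3/gl3.h>.

```c
#include <stdio.h>

/* Values as defined in the Khronos headers; repeated here only so that the
   sketch compiles on its own. */
#ifndef GL_NO_ERROR
#define GL_NO_ERROR          0
#define GL_INVALID_ENUM      0x0500
#define GL_INVALID_VALUE     0x0501
#define GL_INVALID_OPERATION 0x0502
#define GL_OUT_OF_MEMORY     0x0505
#endif

/* Hypothetical helper: translate a basic error code into its name. */
const char *errorName(unsigned int code)
{
    switch (code) {
    case GL_NO_ERROR:          return "GL_NO_ERROR";
    case GL_INVALID_ENUM:      return "GL_INVALID_ENUM";
    case GL_INVALID_VALUE:     return "GL_INVALID_VALUE";
    case GL_INVALID_OPERATION: return "GL_INVALID_OPERATION";
    case GL_OUT_OF_MEMORY:     return "GL_OUT_OF_MEMORY";
    default:                   return "unknown error code";
    }
}

/* Hypothetical helper: query until GL_NO_ERROR is returned, logging each
   code. Every query resets the recorded error, so the loop ends once
   nothing is pending. Returns the number of errors reported. */
int reportErrors(unsigned int (*getError)(void), const char *where)
{
    int count = 0;
    unsigned int code;
    while ((code = getError()) != GL_NO_ERROR) {
        fprintf(stderr, "%s: %s (0x%04X)\n", where, errorName(code), code);
        ++count;
    }
    return count;
}
```

In a real program the loop body would typically call glGetError directly after a group of OpenGL ES commands, for example while debugging; the function-pointer parameter is just a convenience for this sketch.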

GLenum glGetError (void)

Returns the current error code and resets the current error code to
GL_NO_ERROR. If GL_NO_ERROR is returned, there has been no detectable
error since the last call to glGetError.

Table 1-3 OpenGL ES Basic Error Codes

Error Code             Description
GL_NO_ERROR            No error has been generated since the last call to glGetError.
GL_INVALID_ENUM        A GLenum argument is out of range. The command that generated
                       the error is ignored.
GL_INVALID_VALUE       A numeric argument is out of range. The command that generated
                       the error is ignored.
GL_INVALID_OPERATION   The specific command cannot be performed in the current OpenGL ES
                       state. The command that generated the error is ignored.
GL_OUT_OF_MEMORY       There is insufficient memory to execute this command. The state of
                       the OpenGL ES pipeline is considered to be undefined if this error
                       is encountered except for the current error code.

Basic State Management

Figure 1-1 showed the various pipeline stages in OpenGL ES 3.0. Each
pipeline stage has a state that can be enabled or disabled and appropriate
state values that are maintained per context. Examples of states are
blending enable, blend factors, cull enable, and cull face. The state is
initialized with default values when an OpenGL ES context (EGLContext)
is initialized. The state enables can be set using the glEnable and
glDisable commands.

Later chapters will describe the specific state enables for each pipeline
stage shown in Figure 1-1. You can also check whether a state is currently
enabled or disabled by using the glIsEnabled command.

void glEnable(GLenum cap)

void glDisable(GLenum cap)

glEnable and glDisable enable and disable various capabilities. The

initial value for each capability is set to GL_FALSE except for GL_DITHER,

which is set to GL_TRUE. The error code GL_INVALID_ENUM is generated if

cap is not a valid state enum.

cap state to enable or disable, can be:

GL_BLEND

GL_CULL_FACE

GL_DEPTH_TEST

GL_DITHER

GL_POLYGON_OFFSET_FILL

GL_PRIMITIVE_RESTART_FIXED_INDEX

GL_RASTERIZER_DISCARD

GL_SAMPLE_ALPHA_TO_COVERAGE

GL_SAMPLE_COVERAGE

GL_SCISSOR_TEST

GL_STENCIL_TEST

GLboolean glIsEnabled(GLenum cap)

Returns GL_TRUE or GL_FALSE depending on whether the state being
queried is enabled or disabled. Generates the error code
GL_INVALID_ENUM if cap is not a valid state enum.

Specic state values such as blend factor, depth test values, and so on can

also be queried using appropriate glGet*** commands. These commands

are described in detail in Chapter 15, “State Queries.”

Further Reading

The OpenGL ES 1.0, 1.1, 2.0, and 3.0 specifications can be found at
khronos.org/opengles/. In addition, the Khronos website (khronos.org)
has the latest information on all Khronos specifications, developer
message boards, tutorials, and examples.

Khronos OpenGL ES 1.1 website: http://khronos.org/opengles/1_X/

Khronos OpenGL ES 2.0 website: http://khronos.org/opengles/2_X/

Khronos OpenGL ES 3.0 website: http://khronos.org/opengles/3_X/

Khronos EGL website: http://khronos.org/egl/

Chapter 2

Hello Triangle: An OpenGL ES 3.0 Example

To introduce the basic concepts of OpenGL ES 3.0, we begin with a simple

example. This chapter shows what is required to create an OpenGL ES 3.0

program that draws a single triangle. The program we will write is just

about the most basic example of an OpenGL ES 3.0 application that draws

geometry. This chapter covers the following concepts:

Creating an on-screen render surface with EGL

Loading vertex and fragment shaders

Creating a program object, attaching vertex and fragment shaders, and

linking a program object

Setting the viewport

Clearing the color buffer

Rendering a simple primitive

Making the contents of the color buffer visible in the EGL window

surface

As it turns out, a signicant number of steps are required before we can

start drawing a triangle with OpenGL ES 3.0. This chapter goes over the

basics of each of these steps. Later in this book, we ll in the details on

each of these steps and further document the API. Our purpose here is to

get you up and running with your rst simple example so that you get an

idea of what goes into creating an application with OpenGL ES 3.0.

Code Framework

Throughout this book, we build a library of utility functions that form a

framework of useful functions for writing OpenGL ES 3.0 programs. In

developing example programs for the book, we had several goals for this

code framework:

1. It should be simple, small, and easy to understand. We wanted to

focus our examples on the relevant OpenGL ES 3.0 calls, rather than

on a large code framework that we invented. Thus we focused our

framework on simplicity and sought to make the example programs

easy to read and understand. The goal of the framework is to allow

you to focus your attention on the important OpenGL ES 3.0 API

concepts in each example.

2. It should be portable. To the extent possible, we wanted the sample

code to be available on all platforms where OpenGL ES 3.0 is present.

As we go through the examples in the book, we will formally introduce

any new code framework functions that we use. In addition, you can

find full documentation for the code framework in Appendix C. Any

functions called in the example code that have names beginning with es

(e.g., esCreateWindow()) are part of the code framework we wrote for the

sample programs in this book.

Where to Download the Examples

You can nd links to download the examples from the book website at

opengles-book.com.

As of this writing, the source code is available for Windows, Linux,

Android 4.3+ NDK, Android 4.3+ SDK (Java), and iOS 7. On Windows,

the code is compatible with the Qualcomm OpenGL ES 3.0 Emulator,

ARM OpenGL ES 3.0 Emulator, and PowerVR OpenGL ES 3.0 Emulator.

On Linux, the currently available emulators are the Qualcomm OpenGL

ES 3.0 Emulator and the PowerVR OpenGL ES 3.0 Emulator. The code

should be compatible with any Windows- or Linux-based OpenGL ES 3.0

implementations in addition to those mentioned here. The choice of

development tool is up to the reader. We have used cmake, a cross-

platform build generation tool, on Windows and Linux, which allows

you to use IDEs including Microsoft Visual Studio, Eclipse, Code::Blocks,

andXcode.

On Android and iOS, we provide projects compatible with those platforms

(Eclipse ADT and Xcode). As of this writing, many devices support OpenGL

ES 3.0, including iPhone 5s, Google Nexus 4 and 7, Nexus 10, HTC One,

LG G2, Samsung Galaxy S4 (Snapdragon), and Samsung Galaxy Note 3. On

iOS 7, you can run the OpenGL ES 3.0 examples on your Mac using the iOS

Simulator. On Android, you will need a device compatible with OpenGL

ES 3.0 to run the samples. Details on building the sample code for each

platform are provided in Chapter 16, “OpenGL ES Platforms.”

Hello Triangle Example

Let’s look at the full source code for our Hello Triangle example program,

which is listed in Example 2-1. Those readers who are familiar with

fixed-function desktop OpenGL will probably think this is a lot of code

just to draw a simple triangle. Those of you who are not familiar with

desktop OpenGL will also probably think this is a lot of code just to draw

a triangle! Remember, OpenGL ES 3.0 is fully shader based, which means

you cannot draw any geometry without having the appropriate shaders

loaded and bound. This means that more setup code is required to render

than in desktop OpenGL using fixed-function processing.

Example 2-1 Hello_Triangle.c Example

#include "esUtil.h"

typedef struct

{

// Handle to a program object

GLuint programObject;

} UserData;

///

// Create a shader object, load the shader source, and

// compile the shader

GLuint LoadShader ( GLenum type, const char *shaderSrc )

{

GLuint shader;

GLint compiled;

// Create the shader object

shader = glCreateShader ( type );

(continues)

30 Chapter 2: Hello Triangle: An OpenGL ES 3.0 Example

Example 2-1 Hello_Triangle.c Example (continued)

if ( shader == 0 )

return 0;

// Load the shader source

glShaderSource ( shader, 1, &shaderSrc, NULL );

// Compile the shader

glCompileShader ( shader );

// Check the compile status

glGetShaderiv ( shader, GL_COMPILE_STATUS, &compiled );

if ( !compiled )

{

GLint infoLen = 0;

glGetShaderiv ( shader, GL_INFO_LOG_LENGTH, &infoLen );

if ( infoLen > 1 )

{

char* infoLog = malloc (sizeof(char) * infoLen );

glGetShaderInfoLog( shader, infoLen, NULL, infoLog );

esLogMessage ( "Error compiling shader:\n%s\n", infoLog );

free ( infoLog );

}

glDeleteShader ( shader );

return 0;

}

return shader;

}

///

// Initialize the shader and program object

int Init ( ESContext *esContext )

{

UserData *userData = esContext->userData;

char vShaderStr[] =

"#version 300 es \n"

"layout(location = 0) in vec4 vPosition; \n"

"void main() \n"

"{ \n"

Hello Triangle Example 31

Example 2-1 Hello_Triangle.c Example (continued)

" gl_Position = vPosition; \n"

"} \n";

char fShaderStr[] =

"#version 300 es \n"

"precision mediump float; \n"

"out vec4 fragColor; \n"

"void main() \n"

"{ \n"

" fragColor = vec4 ( 1.0, 0.0, 0.0, 1.0 ); \n"

"} \n";

GLuint vertexShader;

GLuint fragmentShader;

GLuint programObject;

GLint linked;

// Load the vertex/fragment shaders

vertexShader = LoadShader ( GL_VERTEX_SHADER, vShaderStr );

fragmentShader = LoadShader ( GL_FRAGMENT_SHADER, fShaderStr );

// Create the program object

programObject = glCreateProgram ( );

if ( programObject == 0 )

return 0;

glAttachShader ( programObject, vertexShader );

glAttachShader ( programObject, fragmentShader );

// Link the program

glLinkProgram ( programObject );

// Check the link status

glGetProgramiv ( programObject, GL_LINK_STATUS, &linked );

if ( !linked )

{

GLint infoLen = 0;

glGetProgramiv ( programObject, GL_INFO_LOG_LENGTH, &infoLen );

if ( infoLen > 1 )

{

char* infoLog = malloc (sizeof(char) * infoLen );

glGetProgramInfoLog ( programObject, infoLen, NULL, infoLog );

(continues)

32 Chapter 2: Hello Triangle: An OpenGL ES 3.0 Example

Example 2-1 Hello_Triangle.c Example (continued)

esLogMessage ( "Error linking program:\n%s\n", infoLog );

free ( infoLog );

}

glDeleteProgram ( programObject );

return FALSE;

}

// Store the program object

userData->programObject = programObject;

glClearColor ( 0.0f, 0.0f, 0.0f, 0.0f );

return TRUE;

}

///

// Draw a triangle using the shader pair created in Init()

void Draw ( ESContext *esContext )

{

UserData *userData = esContext->userData;

GLfloat vVertices[] = { 0.0f, 0.5f, 0.0f,

-0.5f, -0.5f, 0.0f,

0.5f, -0.5f, 0.0f };

// Set the viewport

glViewport ( 0, 0, esContext->width, esContext->height );

// Clear the color buffer

glClear ( GL_COLOR_BUFFER_BIT );

// Use the program object

glUseProgram ( userData->programObject );

// Load the vertex data

glVertexAttribPointer ( 0, 3, GL_FLOAT, GL_FALSE, 0, vVertices );

glEnableVertexAttribArray ( 0 );

glDrawArrays ( GL_TRIANGLES, 0, 3 );

}

void Shutdown ( ESContext *esContext )

{

UserData *userData = esContext->userData;

glDeleteProgram( userData->programObject );

}

int esMain( ESContext *esContext )

{

esContext->userData = malloc ( sizeof( UserData ) );

esCreateWindow ( esContext, "Hello Triangle", 320, 240,

ES_WINDOW_RGB );

if ( !Init ( esContext ) )

return GL_FALSE;

esRegisterShutdownFunc( esContext, Shutdown );

esRegisterDrawFunc ( esContext, Draw );

return GL_TRUE;

}

The remainder of this chapter describes the code in this example. If you
run the Hello Triangle example, you should see the window shown in
Figure 2-1. Instructions on how to build and run the sample code for
Windows, Linux, Android 4.3+, and iOS are provided in Chapter 16,
“OpenGL ES Platforms.” Please refer to the instructions in that chapter for
your platform to get up and running with the sample code.

Figure 2-1 Hello Triangle Example

The standard GL3 (GLES3/gl3.h) and EGL (EGL/egl.h) header files

provided by Khronos are used as an interface to OpenGL ES 3.0 and EGL.

The OpenGL ES 3.0 examples are organized in the following directories:

Common/—Contains the OpenGL ES 3.0 Framework project, code, and

the emulator.

chapter_x/—Contains the example programs for each chapter.

Using the OpenGL ES 3.0 Framework

Each application that uses our code framework declares a main entry

point named esMain. In the main function in Hello Triangle, you will

see calls to several ES utility functions. The esMain function takes an

ESContext as an argument.

int esMain( ESContext *esContext )

The ESContext has a member variable named userData that is a void*.

Each of the sample programs will store any of the data that are needed

for the application in userData. The other elements in the ESContext

structure are described in the header file and are intended only to be read

by the user application. Other data in the ESContext structure include

information such as the window width and height, EGL context, and

callback function pointers.

The esMain function is responsible for allocating the userData, creating

the window, and initializing the draw callback function:

esContext->userData = malloc ( sizeof( UserData ) );

esCreateWindow( esContext, "Hello Triangle", 320, 240,

ES_WINDOW_RGB );

if ( !Init( esContext ) )

return GL_FALSE;

esRegisterDrawFunc(esContext, Draw);

The call to esCreateWindow creates a window of the specified width and

height (in this case, 320 × 240). The “Hello Triangle” parameter is used to

name the window; on platforms supporting it (Windows and Linux), this

name will be displayed in the top band of the window. The last parameter

is a bit eld that species options for the window creation. In this case,

we request an RGB framebuffer. Chapter 3, “An Introduction to EGL,”

discusses what esCreateWindow does in more detail. This function uses

EGL to create an on-screen render surface that is attached to a window.

EGL is a platform-independent API for creating rendering surfaces and

contexts. For now, we will simply say that this function creates a rendering

surface and leave the details on how it works for the next chapter.

After calling esCreateWindow, the main function next calls Init to

initialize everything needed to run the program. Finally, it registers a

callback function, Draw, that will be called to render the frame. After exiting

esMain, the framework enters into the main loop, which will call the

registered callback functions (Draw, Update) until the window is closed.
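The register-then-loop arrangement described above can be sketched in a few lines of plain C. This is only an illustrative model, not the framework's actual implementation: MiniContext, MiniRegisterDrawFunc, MiniMainLoop, and CountFrame are hypothetical names, and a simple frame counter stands in for "until the window is closed."

```c
#include <stddef.h>

typedef struct MiniContext MiniContext;
typedef void ( *MiniDrawFunc ) ( MiniContext *ctx );

// Hypothetical, stripped-down analogue of ESContext
struct MiniContext
{
   void         *userData;     // application data, owned by the application
   MiniDrawFunc  drawFunc;     // callback invoked once per frame
   int           framesLeft;   // stand-in for "the window is still open"
};

// Store the callback; nothing is called yet
void MiniRegisterDrawFunc ( MiniContext *ctx, MiniDrawFunc drawFunc )
{
   ctx->drawFunc = drawFunc;
}

// Call the registered callback once per frame until the loop ends
void MiniMainLoop ( MiniContext *ctx )
{
   while ( ctx->framesLeft > 0 )
   {
      if ( ctx->drawFunc != NULL )
         ctx->drawFunc ( ctx );

      ctx->framesLeft--;
   }
}

// Example callback: counts how many frames were "drawn"
void CountFrame ( MiniContext *ctx )
{
   int *frameCount = ctx->userData;
   ( *frameCount )++;
}
```

The point of the design is that the application never owns the loop: it hands over a function pointer and its own data through userData, and the framework decides when each frame is drawn.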

Creating a Simple Vertex and Fragment Shader

In OpenGL ES 3.0, no geometry can be drawn unless a valid vertex

and fragment shader have been loaded. In Chapter 1, “Introduction

to OpenGL ES 3.0,” we covered the basics of the OpenGL ES 3.0

programmable pipeline. There, you learned about the concepts of

vertex and fragment shaders. These two shader programs describe the

transformation of vertices and drawing of fragments. To do any rendering

at all, an OpenGL ES 3.0 program must have at least one vertex shader

and one fragment shader.

The biggest task that the Init function in Hello Triangle accomplishes is

the loading of a vertex shader and a fragment shader. The vertex shader

that is given in the program is very simple:

char vShaderStr[] =

"#version 300 es \n"

"layout(location = 0) in vec4 vPosition; \n"

"void main() \n"

"{ \n"

" gl_Position = vPosition; \n"

"} \n";

The rst line of the vertex shader declares the shader version that is being

used (#version 300 es indicates OpenGL ES Shading Language v3.00).

The vertex shader declares one input attribute array—a four-component

vector named vPosition. Later on, the Draw function in Hello Triangle

will send in positions for each vertex that will be placed in this variable.

The layout(location = 0) qualier signies that the location of this

variable is vertex attribute 0. The shader declares a main function that

marks the beginning of execution of the shader. The body of the shader is

very simple; it copies the vPosition input attribute into a special output

variable named gl_Position. Every vertex shader must output a position

into the gl_Position variable. This variable denes the position that

is passed through to the next stage in the pipeline. The topic of writing

shaders is a large part of what we cover in this book, but for now we just

want to give you a avor of what a vertex shader looks like. In Chapter5,

“OpenGL ES Shading Language,” we cover the OpenGL ES shading

36 Chapter 2: Hello Triangle: An OpenGL ES 3.0 Example

language; in Chapter 8, “Vertex Shaders,” we specically cover how to

write vertex shaders.

The fragment shader in the example is simple:

char fShaderStr[] =

"#version 300 es \n"

"precision mediump float; \n"

"out vec4 fragColor; \n"

"void main() \n"

"{ \n"

" fragColor = vec4 ( 1.0, 0.0, 0.0, 1.0 ); \n"

"} \n";

Just as in the vertex shader, the first line of the fragment shader declares

the shader version. The next statement in the fragment shader declares

the default precision for float variables in the shader. For more details

on this topic, please see the section on precision qualifiers in Chapter 5,

“OpenGL ES Shading Language.” The fragment shader declares a single

output variable fragColor, which is a vector of four components. The

value written to this variable is what will be written out into the color

buffer. In this case, the shader outputs a red color (1.0, 0.0, 0.0, 1.0) for

all fragments. The details of developing fragment shaders are covered in

Chapter 9, “Texturing,” and Chapter 10, “Fragment Shaders.” Again, here

we are just showing you what a fragment shader looks like.

Typically, a game or application would not place shader source strings

inline in the way we have done in this example. In most real-world

applications, the shader is loaded from some sort of text or data file and

then loaded to the API. However, for simplicity and to make the example

program self-contained, we provide the shader source strings directly in

the program code.
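For readers who do want to keep shaders in separate files, the following is a minimal sketch of reading a text file into a NUL-terminated string that could then be passed to a function such as LoadShader. The helper name ReadShaderFile is our own and is not part of the sample framework; error handling is deliberately simple.

```c
#include <stdio.h>
#include <stdlib.h>

// Read an entire file into a NUL-terminated heap buffer.
// Returns NULL on failure; the caller must free() the result.
char *ReadShaderFile ( const char *path )
{
   FILE *fp = fopen ( path, "rb" );
   char *src = NULL;
   long  size;

   if ( fp == NULL )
      return NULL;

   // Determine the file size by seeking to the end
   if ( fseek ( fp, 0, SEEK_END ) == 0 && ( size = ftell ( fp ) ) >= 0 )
   {
      rewind ( fp );
      src = malloc ( (size_t) size + 1 );

      if ( src != NULL )
      {
         // Terminate after the bytes actually read
         size_t got = fread ( src, 1, (size_t) size, fp );
         src[got] = '\0';
      }
   }

   fclose ( fp );
   return src;
}
```

The terminator matters because the example passes NULL as the length argument to glShaderSource, in which case each source string is expected to be null terminated.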

Compiling and Loading the Shaders

Now that we have the shader source code defined, we can go about loading

the shaders to OpenGL ES. The LoadShader function in the Hello Triangle

example is responsible for loading the shader source code, compiling it,

and checking it for errors. It returns a shader object, which is an OpenGL

ES3.0 object that can later be used for attachment to a program object

(these two objects are detailed in Chapter 4, “Shaders and Programs”).

Let’s look at how the LoadShader function works. First, glCreateShader

creates a new shader object of the type specified.

GLuint LoadShader(GLenum type, const char *shaderSrc)

{

GLuint shader;

GLint compiled;

// Create the shader object

shader = glCreateShader(type);

if(shader == 0)

return 0;

The shader source code itself is loaded to the shader object

using glShaderSource. The shader is then compiled using the

glCompileShader function.

// Load the shader source

glShaderSource(shader, 1, &shaderSrc, NULL);

// Compile the shader

glCompileShader(shader);

After compiling the shader, the status of the compile is determined and

any errors that were generated are printed out.

// Check the compile status

glGetShaderiv(shader, GL_COMPILE_STATUS, &compiled);

if(!compiled)

{

GLint infoLen = 0;

glGetShaderiv(shader, GL_INFO_LOG_LENGTH, &infoLen);

if(infoLen > 1)

{

char* infoLog = malloc(sizeof(char) * infoLen);

glGetShaderInfoLog(shader, infoLen, NULL, infoLog);

esLogMessage("Error compiling shader:\n%s\n", infoLog);

free(infoLog);

}

glDeleteShader(shader);

return 0;

}

return shader;

}

If the shader compiles successfully, a new shader object is returned that

will be attached to the program later. The details of these shader object

functions are covered in the first sections of Chapter 4, “Shaders and

Programs.”

Creating a Program Object and Linking the Shaders

Once the application has created a shader object for the vertex and

fragment shaders, it needs to create a program object. Conceptually, the

program object can be thought of as the final linked program. Once the

various shaders are compiled into a shader object, they must be attached

to a program object and linked together before drawing.

The process of creating program objects and linking is fully described in

Chapter 4, “Shaders and Programs.” For now, we provide a brief overview

of the process. The first step is to create the program object and attach the

vertex shader and fragment shader to it.

// Create the program object

programObject = glCreateProgram();

if(programObject == 0)

return 0;

glAttachShader(programObject, vertexShader);

glAttachShader(programObject, fragmentShader);

Finally, we are ready to link the program and check for errors:

// Link the program

glLinkProgram(programObject);

// Check the link status

glGetProgramiv(programObject, GL_LINK_STATUS, &linked);

if(!linked)

{

GLint infoLen = 0;

glGetProgramiv(programObject, GL_INFO_LOG_LENGTH,&infoLen);

if(infoLen > 1)

{

char* infoLog = malloc(sizeof(char) * infoLen);

glGetProgramInfoLog(programObject, infoLen, NULL,infoLog);

esLogMessage("Error linking program:\n%s\n", infoLog);

free(infoLog);

}

glDeleteProgram(programObject);

return FALSE;

}

// Store the program object

userData->programObject = programObject;

After all of these steps, we have compiled the shaders, checked for

compile errors, created the program object, attached the shaders, linked

the program, and checked for link errors. After successful linking of the

program object, we can finally use the program object for rendering!

To use the program object for rendering, we bind it using glUseProgram.

// Use the program object

glUseProgram(userData->programObject);

After calling glUseProgram with the program object handle, all

subsequent rendering will occur using the vertex and fragment shaders

attached to the program object.

Setting the Viewport and Clearing the Color Buffer

Now that we have created a rendering surface with EGL and initialized

and loaded shaders, we are ready to actually draw something. The Draw

callback function draws the frame. The first command that we execute in

Draw is glViewport, which informs OpenGL ES of the origin, width, and

height of the 2D rendering surface that will be drawn to. In OpenGL ES,

the viewport defines the 2D rectangle in which all OpenGL ES rendering

operations will ultimately be displayed.

// Set the viewport

glViewport(0, 0, esContext->width, esContext->height);

The viewport is defined by an origin (x, y) and a width and height. We

cover glViewport in more detail in Chapter 7, “Primitive Assembly and

Rasterization,” when we discuss coordinate systems and clipping.
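To make the role of the viewport concrete, the sketch below applies the standard viewport mapping from normalized device coordinates (each axis in the range -1 to 1) to window coordinates. It is only an illustration: the helper NdcToWindow and the WindowCoord type are our own names, and we assume, as in Hello Triangle, that positions are passed through with w = 1 so that the input positions are already normalized device coordinates.

```c
typedef struct
{
   float x;
   float y;
} WindowCoord;

// Map normalized device coordinates to window coordinates for a
// viewport set with glViewport ( vx, vy, width, height ).
// Window coordinates have their origin at the lower-left corner.
WindowCoord NdcToWindow ( float xNdc, float yNdc,
                          int vx, int vy, int width, int height )
{
   WindowCoord w;

   w.x = ( xNdc + 1.0f ) * ( width  / 2.0f ) + vx;
   w.y = ( yNdc + 1.0f ) * ( height / 2.0f ) + vy;

   return w;
}
```

With the 320 × 240 viewport used in this chapter, the top vertex of the triangle at (0.0, 0.5) lands at window position (160, 180), and the lower-left vertex at (-0.5, -0.5) lands at (80, 60).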

After setting the viewport, the next step is to clear the screen. In OpenGL

ES, multiple types of buffers are involved in drawing: color, depth, and

stencil. We cover these buffers in more detail in Chapter 11, “Fragment

Operations.” In the Hello Triangle example, only the color buffer is drawn

to. At the beginning of each frame, we clear the color buffer using the

glClear function.

// Clear the color buffer

glClear(GL_COLOR_BUFFER_BIT);

The buffer will be cleared to the color specified with glClearColor. In the

example program at the end of Init, the clear color was set to (1.0, 1.0,

1.0, 1.0), so the screen is cleared to white. The clear color should be set by

the application prior to calling glClear on the color buffer.

Loading the Geometry and Drawing a Primitive

Now that we have the color buffer cleared, viewport set, and program

object loaded, we need to specify the geometry for the triangle. The

vertices for the triangle are specified with three (x, y, z) coordinates in the

vVertices array.

GLfloat vVertices[] = { 0.0f, 0.5f, 0.0f,

-0.5f, -0.5f, 0.0f,

0.5f, -0.5f, 0.0f };

…

// Load the vertex data

glVertexAttribPointer(0, 3, GL_FLOAT, GL_FALSE, 0, vVertices);

glEnableVertexAttribArray(0);

glDrawArrays(GL_TRIANGLES, 0, 3);

The vertex positions need to be loaded to the GL and connected to the

vPosition attribute declared in the vertex shader. As you will remember,

earlier we bound the vPosition variable to the input attribute location

0. Each attribute in the vertex shader has a location that is uniquely

identied by an unsigned integer value. To load the data into vertex

attribute 0, we call the glVertexAttribPointer function. In Chapter 6,

“Vertex Attributes, Vertex Arrays, and Buffer Objects,” we cover how to

load vertex attributes and use vertex arrays in full.
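The arguments passed to glVertexAttribPointer in this example (three components, GL_FLOAT, stride 0) describe a tightly packed array: a stride of 0 tells OpenGL ES that consecutive vertices follow one another with no padding. The plain C sketch below mimics that addressing rule on the CPU. FetchVertex is a hypothetical helper written for illustration; it is not how a driver is implemented.

```c
#include <stddef.h>

// Return a pointer to the first component of vertex `index`.
// A stride of 0 means "tightly packed": the effective stride is
// the size of one vertex (size components of type float).
const float *FetchVertex ( const float *base, int size,
                           size_t strideBytes, int index )
{
   size_t effectiveStride = ( strideBytes != 0 )
                            ? strideBytes
                            : (size_t) size * sizeof ( float );

   return (const float *) ( (const unsigned char *) base
                            + effectiveStride * (size_t) index );
}
```

For the vVertices array, FetchVertex ( vVertices, 3, 0, 1 ) points at the second vertex, (-0.5, -0.5, 0.0). Passing an explicit stride of 3 * sizeof ( float ) gives the same result, which is why 0 is a convenient shorthand for packed data.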

The nal step in drawing the triangle is to actually tell OpenGL ES to draw

the primitive. In this example, we use the function glDrawArrays for

Displaying the Back Buffer 41

this purpose. This function draws a primitive such as a triangle, line, or

strip. We get into primitives in much more detail in Chapter 7, “Primitive

Assembly and Rasterization.”

Displaying the Back Buffer

We have nally gotten to the point where our triangle has been drawn

into the framebuffer. Now there is one nal detail we must address: how

to actually display the framebuffer on the screen. Before we get into that,

let’s back up a little bit and discuss the concept of double buffering.

The framebuffer that is visible on the screen is represented by a
two-dimensional array of pixel data. One possible way we could think about
displaying images on the screen is to simply update the pixel data in the
visible framebuffer as we draw. However, there is a significant issue with
updating pixels directly on the displayable buffer: in a typical display
system, the physical screen is updated from framebuffer memory at a fixed
rate. If we were to draw directly into the framebuffer, the user could see
artifacts as partial updates to the framebuffer are displayed.

To address this problem, we use a system known as double buffering. In

this scheme, there are two buffers: a front buffer and a back buffer. All

rendering occurs to the back buffer, which is located in an area of memory

that is not visible to the screen. When all rendering is complete, this

buffer is “swapped” with the front buffer (or visible buffer). The front

buffer then becomes the back buffer for the next frame.
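The scheme just described can be modeled in plain C with two pixel arrays and a pair of pointers whose roles are exchanged on each swap. This is a conceptual sketch only: SwapChain, DrawPixel, and SwapChainPresent are hypothetical names, the buffer is a tiny 4 × 4 array, and a real windowing system may copy pixels or use more than two buffers rather than simply exchanging pointers.

```c
#include <stdint.h>
#include <string.h>

#define FB_WIDTH  4
#define FB_HEIGHT 4

typedef struct
{
   uint32_t  storage[2][FB_WIDTH * FB_HEIGHT];
   uint32_t *front;   // buffer currently shown on the screen
   uint32_t *back;    // buffer that receives all rendering
} SwapChain;

void SwapChainInit ( SwapChain *sc )
{
   memset ( sc->storage, 0, sizeof ( sc->storage ) );
   sc->front = sc->storage[0];
   sc->back  = sc->storage[1];
}

// All rendering writes to the back buffer only
void DrawPixel ( SwapChain *sc, int x, int y, uint32_t color )
{
   sc->back[y * FB_WIDTH + x] = color;
}

// Exchange the roles of the two buffers; no pixel data is copied
void SwapChainPresent ( SwapChain *sc )
{
   uint32_t *tmp = sc->front;

   sc->front = sc->back;
   sc->back  = tmp;
}
```

A pixel written with DrawPixel is not visible through front until SwapChainPresent is called, which is the property that hides partially drawn frames from the user.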

Using this technique, we do not display a visible surface until all

rendering is complete for a frame. This activity is controlled in an

OpenGL ES application through EGL, by using an EGL function called

eglSwapBuffers (this function is called by our framework after calling

the Draw callback function):

eglSwapBuffers(esContext->eglDisplay, esContext->eglSurface);

This function informs EGL to swap the front and back buffers.

The parameters sent to eglSwapBuffers are the EGL display and surface.

These two parameters represent the physical display and the rendering

surface, respectively. In the next chapter, we explain eglSwapBuffers in

more detail and further clarify the concepts of surface, context, and buffer

management. For now, suffice it to say that after swapping buffers we

finally have our triangle on screen!

Summary

In this chapter, we introduced a simple OpenGL ES 3.0 program that

draws a single triangle to the screen. The purpose of this introduction

was to familiarize you with several of the key components that make

up an OpenGL ES 3.0 application: creating an on-screen render surface

with EGL, working with shaders and their associated objects, setting

the viewport, clearing the color buffer, and rendering a primitive. Now

that you understand the basics of what makes up an OpenGL ES 3.0

application, we will dive into these topics in more detail, starting in the

next chapter with more information on EGL.

Chapter 3

An Introduction to EGL

In Chapter 2, “Hello Triangle: An OpenGL ES 3.0 Example,” we drew a

triangle into a window using OpenGL ES 3.0, but we used some custom

functions of our own design to open and manage the window. Although

that technique simplifies our examples, it obscures how you might need

to work with OpenGL ES 3.0 on your own systems.

As part of the family of APIs provided by the Khronos Group for

developing content, a (mostly) platform-independent API, EGL, is

available for managing drawing surfaces (windows are just one type;

we will talk about others later). EGL provides the mechanisms for the

following:

Communicating with the native windowing system of your device

Querying the available types and configurations of drawing surfaces

Creating drawing surfaces

Synchronizing rendering between OpenGL ES 3.0 and other graphics-rendering APIs (such as desktop OpenGL and OpenVG, a cross-platform API for hardware-accelerated vector graphics, or the native drawing commands of your windowing system)

Managing rendering resources such as texture maps

This chapter introduces the fundamentals required to open a window. As

we describe other operations, such as creating texture maps, we discuss

the necessary EGL commands.

Communicating with the Windowing System

EGL provides a “glue” layer between OpenGL ES 3.0 (and other Khronos

graphics APIs) and the native windowing system running on your

computer, like the X Window System commonly found on GNU/Linux

systems, Microsoft Windows, or Mac OS X’s Quartz. Before EGL can

determine which types of drawing surfaces are available—or any other

characteristics of the underlying system, for that matter—it needs to open

a communications channel with the windowing system. Note that Apple

provides its own iOS implementation of the EGL API called EAGL.

Because every windowing system has different semantics, EGL provides a

basic opaque type—the EGLDisplay—that encapsulates all of the system

dependencies for interfacing with the native windowing system. The first

operation that any application using EGL will need to perform is to create

and initialize a connection with the local EGL display. This is done in a

two-call sequence, as shown in Example 3-1.

Example 3-1 Initializing EGL

EGLint majorVersion;

EGLint minorVersion;

EGLDisplay display = eglGetDisplay ( EGL_DEFAULT_DISPLAY );

if ( display == EGL_NO_DISPLAY )

{

// Unable to open connection to local windowing system

}

if ( !eglInitialize ( display, &majorVersion, &minorVersion ) )

{

// Unable to initialize EGL; handle and recover

}

To open a connection to the EGL display server, you call the following
function:

EGLDisplay eglGetDisplay(EGLNativeDisplayType displayId)

displayId specifies the display connection; use EGL_DEFAULT_DISPLAY for the default connection

EGLNativeDisplayType is defined to match the native window

system’s display type. On Microsoft Windows, for example, an

EGLNativeDisplayType would be defined to be an HDC, a handle to

the Microsoft Windows device context. However, to make it easy to move

your code to different operating systems and platforms, the token

EGL_DEFAULT_DISPLAY is accepted and will return a connection to the

default native display, as we did.

If a display connection isn’t available, eglGetDisplay will return

EGL_NO_DISPLAY. This error indicates that EGL isn’t available, and you

won’t be able to use OpenGL ES 3.0.

Before we continue by discussing more EGL operations, we need to briefly

describe how EGL processes and reports errors to your application.

Checking for Errors

Most functions in EGL return EGL_TRUE when successful and EGL_FALSE
otherwise. However, EGL will do more than just tell you if the call
failed: it will record an error to indicate the reason for failure. That
error code is not returned to you directly; you need to query EGL
explicitly for it, which you can do by calling the following function:

EGLint eglGetError()

This function returns the error code of the most recent EGL function
called in a specific thread. If EGL_SUCCESS is returned, then there is no
status to return.

You might wonder why this is a prudent approach, as compared to
directly returning the error code when the call completes. Although
we never encourage anyone to ignore function return codes, allowing
optional error code recovery reduces redundant code in applications
verified to work properly. You should certainly check for errors during
development and debugging, and on an ongoing basis in critical
applications, but once you are convinced your application is working as
expected, you can likely reduce your error checking.
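During development it is often convenient to turn the numeric code returned by eglGetError into readable text before logging it. The helper below is a small sketch of that idea; EglErrorString is our own name, not an EGL function, and it covers only the error codes mentioned in this chapter. So that the sketch compiles on its own, it supplies fallback definitions of the tokens; a real program would include EGL/egl.h instead, which defines them.

```c
// Fallback definitions so the sketch compiles without the EGL headers.
// A real program would include <EGL/egl.h>, which defines these tokens.
#ifndef EGL_SUCCESS
#define EGL_SUCCESS          0x3000
#define EGL_NOT_INITIALIZED  0x3001
#define EGL_BAD_ATTRIBUTE    0x3004
#define EGL_BAD_DISPLAY      0x3008
#define EGL_BAD_PARAMETER    0x300C
#endif

// Translate an error code, as returned by eglGetError(), into text.
// Only a subset of the EGL error codes is handled in this sketch.
const char *EglErrorString ( int error )
{
   switch ( error )
   {
      case EGL_SUCCESS:         return "EGL_SUCCESS";
      case EGL_NOT_INITIALIZED: return "EGL_NOT_INITIALIZED";
      case EGL_BAD_ATTRIBUTE:   return "EGL_BAD_ATTRIBUTE";
      case EGL_BAD_DISPLAY:     return "EGL_BAD_DISPLAY";
      case EGL_BAD_PARAMETER:   return "EGL_BAD_PARAMETER";
      default:                  return "unrecognized EGL error";
   }
}
```

A typical use would be to call eglGetError immediately after a function reports EGL_FALSE and pass the result to EglErrorString when writing the log message.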

Initializing EGL

Once you have successfully opened a connection, EGL needs to be
initialized, which is done by calling the following function:

EGLBoolean eglInitialize(EGLDisplay display,
                         EGLint *majorVersion,
                         EGLint *minorVersion)

display specifies the EGL display connection

majorVersion specifies the major version number returned by the EGL implementation; may be NULL

minorVersion specifies the minor version number returned by the EGL implementation; may be NULL

This function initializes EGL’s internal data structures and returns the
major and minor version numbers of the EGL implementation. If EGL
is unable to be initialized, this call will return EGL_FALSE, and set EGL’s
error code to

EGL_BAD_DISPLAY if display doesn’t specify a valid EGLDisplay.

EGL_NOT_INITIALIZED if EGL cannot be initialized.

Determining the Available Surface Configurations

Once we have initialized EGL, we are able to determine which types and
configurations of rendering surfaces are available to us. There are two ways
to go about this:

Query every surface configuration and find the best choice ourselves.

Specify a set of requirements and let EGL make a recommendation for the best match.

In many situations, the second option is simpler to implement, and
most likely yields what you would have found using the first option.
In either case, EGL will return an EGLConfig, which is an identifier
to an EGL-internal data structure that contains information about a
particular surface and its characteristics, such as the number of bits for
each color component, or the depth buffer (if any) associated with that
EGLConfig. You can query any of the attributes of an EGLConfig, using
the eglGetConfigAttrib function, which we describe later.

To query all EGL surface configurations supported by the underlying

windowing system, call this function:

EGLBoolean eglGetConfigs(EGLDisplay display,

EGLConfig *configs,

EGLint maxReturnConfigs,

EGLint *numConfigs)

display species the EGL display connection

configs species the list of configs

maxReturnConfigs species the size of configs

numConfigs species the size of configs returned

This function returns EGL_TRUE if the call succeeded. On failure, this call

will return EGL_FALSE and set EGL’s error code to

EGL_NOT_INITIALIZED if display is not initialized.

EGL_BAD_PARAMETER if numConfigs is NULL.

There are two ways to call eglGetConfigs. First, if you specify NULL

for the value of configs, the system will return EGL_TRUE and set

numConfigs to the number of available EGLConfigs. No additional

information about any of the EGLConfigs in the system is returned, but

knowing the number of available configurations allows you to allocate

enough memory to get the entire set of EGLConfigs, should you care to

do so.

Alternatively, and perhaps more usefully, you can allocate an array of

uninitialized EGLConfig values and pass them into eglGetConfigs as the

configs parameter. Set maxReturnConfigs to the size of the array you

allocated, which will also specify the maximum number of configurations

that will be returned. When the call completes, numConfigs will be

updated with the number of entries in configs that were modified.

You can then begin processing the list of returned values, querying the

characteristics of the various configurations to determine which one best

matches your needs.
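The two calling styles are often combined: ask for the count first, then allocate exactly that much room and fetch the list. The sketch below shows the pattern in plain C. Because it must compile without an EGL implementation, it uses MockGetConfigs, a hypothetical stand-in that follows the calling convention described above (without the display argument); Config is likewise a stand-in for EGLConfig. Neither is part of EGL.

```c
#include <stdlib.h>

typedef int Config;   // hypothetical stand-in for EGLConfig

// Hypothetical stand-in that mimics the eglGetConfigs convention:
// a NULL list returns only the count; otherwise up to
// maxReturnConfigs entries are written.
static int MockGetConfigs ( Config *configs, int maxReturnConfigs,
                            int *numConfigs )
{
   static const Config available[] = { 11, 22, 33, 44 };
   const int total = (int) ( sizeof ( available ) / sizeof ( available[0] ) );
   int i;

   if ( numConfigs == NULL )
      return 0;                  // failure, like EGL_BAD_PARAMETER

   if ( configs == NULL )
   {
      *numConfigs = total;       // count-only query
      return 1;
   }

   if ( maxReturnConfigs < 0 )
      maxReturnConfigs = 0;

   *numConfigs = ( maxReturnConfigs < total ) ? maxReturnConfigs : total;

   for ( i = 0; i < *numConfigs; i++ )
      configs[i] = available[i];

   return 1;
}

// Query the count first, then allocate enough room and fetch the list.
// Returns NULL on failure; the caller must free() the result.
Config *GetAllConfigs ( int *count )
{
   Config *list;

   if ( !MockGetConfigs ( NULL, 0, count ) || *count <= 0 )
      return NULL;

   list = malloc ( sizeof ( Config ) * (size_t) *count );

   if ( list != NULL && !MockGetConfigs ( list, *count, count ) )
   {
      free ( list );
      list = NULL;
   }

   return list;
}
```

With a real EGL implementation, the same structure applies with eglGetConfigs, an EGLDisplay as the first argument, and EGLConfig and EGLint in place of the stand-in types.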

Querying EGLConfig Attributes

We now describe the values that EGL associates with an EGLConfig and

explain how you can retrieve those values.

An EGLConfig contains all of the information about a surface made

available by EGL. This includes information about the number of

available colors, additional buffers associated with the configuration

(such as depth and stencil buffers, which we discuss later), the type of

surfaces, and numerous other characteristics. The following is a list of

the attributes that can be queried from an EGLConfig. We discuss only a

subset of these in this chapter, but the entire list appears in Table 3-1 as

a reference.

To query a particular attribute associated with an EGLConfig, use the
following function:

EGLBoolean eglGetConfigAttrib(EGLDisplay display,
                              EGLConfig config,
                              EGLint attribute,
                              EGLint *value)

display specifies the EGL display connection

config specifies the configuration to be queried

attribute specifies the particular attribute to be returned

value specifies the value returned

This function returns EGL_TRUE if the call succeeded. On failure,
EGL_FALSE is returned, and an EGL_BAD_ATTRIBUTE error is posted if
attribute is not a valid attribute.

This call will return the value for the specific attribute of the associated
EGLConfig. This allows you total control over which configuration you
choose for ultimately creating rendering surfaces. However, when looking
at Table 3-1, you might be somewhat intimidated by the large number of
options. EGL provides another routine, eglChooseConfig, that allows
you to specify what is important for your application, and will return the
best matching configuration given your requests.

Table 3-1 EGLConfig Attributes

Attribute | Description | Default Value
EGL_BUFFER_SIZE | Number of bits for all color components in the color buffer | 0
EGL_RED_SIZE | Number of red bits in the color buffer | 0
EGL_GREEN_SIZE | Number of green bits in the color buffer | 0
EGL_BLUE_SIZE | Number of blue bits in the color buffer | 0
EGL_LUMINANCE_SIZE | Number of luminance bits in the color buffer | 0
EGL_ALPHA_SIZE | Number of alpha bits in the color buffer | 0
EGL_ALPHA_MASK_SIZE | Number of alpha-mask bits in the mask buffer | 0
EGL_BIND_TO_TEXTURE_RGB | True if bindable to RGB textures | EGL_DONT_CARE
EGL_BIND_TO_TEXTURE_RGBA | True if bindable to RGBA textures | EGL_DONT_CARE
EGL_COLOR_BUFFER_TYPE | Type of the color buffer: either EGL_RGB_BUFFER or EGL_LUMINANCE_BUFFER | EGL_RGB_BUFFER
EGL_CONFIG_CAVEAT | Any caveats associated with the configuration | EGL_DONT_CARE
EGL_CONFIG_ID | The unique EGLConfig identifier value | EGL_DONT_CARE
EGL_CONFORMANT | True if contexts created with this EGLConfig are conformant | —
EGL_DEPTH_SIZE | Number of bits in the depth buffer | 0
EGL_LEVEL | Framebuffer level | 0
EGL_MAX_PBUFFER_WIDTH | Maximum width for a PBuffer created with this EGLConfig | —
EGL_MAX_PBUFFER_HEIGHT | Maximum height for a PBuffer created with this EGLConfig | —
EGL_MAX_PBUFFER_PIXELS | Maximum size of a PBuffer created with this EGLConfig | —
EGL_MAX_SWAP_INTERVAL | Maximum buffer swap interval | EGL_DONT_CARE
EGL_MIN_SWAP_INTERVAL | Minimum buffer swap interval | EGL_DONT_CARE
EGL_NATIVE_RENDERABLE | True if native rendering libraries can render into a surface created with EGLConfig | EGL_DONT_CARE
EGL_NATIVE_VISUAL_ID | Handle of corresponding native window system visual ID | EGL_DONT_CARE
EGL_NATIVE_VISUAL_TYPE | Type of corresponding native window system visual | EGL_DONT_CARE
EGL_RENDERABLE_TYPE | A bitmask composed of the tokens EGL_OPENGL_ES_BIT, EGL_OPENGL_ES2_BIT, EGL_OPENGL_ES3_BIT_KHR (requires EGL_KHR_create_context extension), EGL_OPENGL_BIT, or EGL_OPENVG_BIT, which represent the rendering interfaces supported with the configuration | EGL_OPENGL_ES_BIT
EGL_SAMPLE_BUFFERS | Number of available multisample buffers | 0
EGL_SAMPLES | Number of samples per pixel | 0
EGL_STENCIL_SIZE | Number of bits in the stencil buffer | 0
EGL_SURFACE_TYPE | Type of EGL surfaces supported; can be any of EGL_WINDOW_BIT, EGL_PIXMAP_BIT, EGL_PBUFFER_BIT, EGL_MULTISAMPLE_RESOLVE_BOX_BIT, EGL_SWAP_BEHAVIOR_PRESERVED_BIT, EGL_VG_COLORSPACE_LINEAR_BIT, or EGL_VG_ALPHA_FORMAT_PRE_BIT | EGL_WINDOW_BIT
EGL_TRANSPARENT_TYPE | Type of transparency supported | EGL_NONE
EGL_TRANSPARENT_RED_VALUE | Red color value interpreted as transparent | EGL_DONT_CARE
EGL_TRANSPARENT_GREEN_VALUE | Green color value interpreted as transparent | EGL_DONT_CARE
EGL_TRANSPARENT_BLUE_VALUE | Blue color value interpreted as transparent | EGL_DONT_CARE

Note: Various tokens do not have a default value mandated in the EGL specification, as indicated by the dash (—) for their default value.

Letting EGL Choose the Conguration 51

Letting EGL Choose the Conguration

To have EGL make the choice of matching EGLConfigs, use this function:

This function returns EGL_TRUE if the call succeeded. On failure,

EGL_FALSE is returned, and an EGL_BAD_ATTRIBUTE error is posted if

attribList contains an undened EGL attribute or an attribute value

that is unrecognized or out of range.

You need to provide a list of the attributes, with associated preferred

values for all the attributes that are important for the correct operation of

your application. For example, if you need an EGLConfig that supports

a rendering surface having ve bits red and blue, and six bits green (the

commonly used “RGB 565” format); a depth buffer; and OpenGL ES 3.0,

you might declare the array shown in Example 3-2.

For values that are not explicitly specied in the attribute list, EGL will use the

default values shown in Table 3-1. Additionally, when specifying a numeric

value for an attribute, EGL will guarantee the returned conguration has at

least that value at a minimum if there is a matching EGLConfig available.

EGLBoolean eglChooseConfig(EGLDisplay display,

const EGLint *attribList,

EGLConfig *configs,

EGLint maxReturnConfigs,

EGLint *numConfigs)

display species the EGL display connection

attribList species the list of attributes to match by configs

configs species the list of congurations

maxReturnConfigs species the size of congurations

numConfigs species the size of congurations returned

Example 3-2 Specifying EGL Attributes

EGLint attribList[] =

{

EGL_RENDERABLE_TYPE, EGL_OPENGL_ES3_BIT_KHR,

EGL_RED_SIZE, 5,

EGL_GREEN_SIZE, 6,

EGL_BLUE_SIZE, 5,

EGL_DEPTH_SIZE, 1,

EGL_NONE

};


Note: Using the EGL_OPENGL_ES3_BIT_KHR attribute requires the EGL_KHR_create_context extension. This attribute is defined in eglext.h (EGL v1.4). It is also worth noting that some implementations will always promote OpenGL ES 2.0 contexts to OpenGL ES 3.0 contexts, as OpenGL ES 3.0 is backward compatible with OpenGL ES 2.0.

To use this set of attributes as selection criteria, follow Example 3-3.

Example 3-3 Querying EGL Surface Configurations

const EGLint MaxConfigs = 10;

EGLConfig configs[MaxConfigs]; // We'll accept only 10 configs

EGLint numConfigs;

if ( !eglChooseConfig( display, attribList, configs, MaxConfigs,

&numConfigs ) )

{

// Something did not work ... handle error situation

}

else

{

// Everything is okay; continue to create a rendering surface

}

If eglChooseConfig returns successfully, a set of EGLConfigs matching your criteria will be returned. If more than one EGLConfig matches (with at most the maximum number of configurations you specify), eglChooseConfig will sort the configurations using the following ordering:

1. By the value of EGL_CONFIG_CAVEAT. Precedence is given to configurations where there are no configuration caveats (when the value of EGL_CONFIG_CAVEAT is EGL_NONE), then slow rendering configurations (EGL_SLOW_CONFIG), and finally nonconformant configurations (EGL_NON_CONFORMANT_CONFIG).

2. By the type of buffer as specified by EGL_COLOR_BUFFER_TYPE.

3. By the number of bits in the color buffer in descending sizes. The number of bits in a buffer depends on the EGL_COLOR_BUFFER_TYPE, and will be at least the value specified for a particular color channel. When the buffer type is EGL_RGB_BUFFER, the number of bits is computed as the total of EGL_RED_SIZE, EGL_GREEN_SIZE, and EGL_BLUE_SIZE. When the color buffer type is EGL_LUMINANCE_BUFFER, the number of bits is the sum of EGL_LUMINANCE_SIZE and EGL_ALPHA_SIZE.

4. By the value of EGL_BUFFER_SIZE in ascending order.


5. By the value of EGL_SAMPLE_BUFFERS in ascending order.

6. By the number of EGL_SAMPLES in ascending order.

7. By the value of EGL_DEPTH_SIZE in ascending order.

8. By the value of the EGL_STENCIL_SIZE in ascending order.

9. By the value of the EGL_ALPHA_MASK_SIZE (which is applicable only

to OpenVG surfaces).

10. By the EGL_NATIVE_VISUAL_TYPE in an implementation-dependent

manner.

11. By the value of the EGL_CONFIG_ID in ascending order.

Parameters not mentioned in this list are not used in the sorting process.

Note: Because of the third sorting rule, to get the best format that matches what you specified, you will need to add extra logic to go through the returned results. For example, if you ask for “565” RGB format, then the “888” format will appear in the returned results first.
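The extra logic that this note calls for can be quite small. The following is a minimal sketch rather than code from this guide: it assumes the application has already read each returned configuration's color sizes into an array (for instance, by calling eglGetConfigAttrib with EGL_RED_SIZE, EGL_GREEN_SIZE, and EGL_BLUE_SIZE), and it walks that array in the order eglChooseConfig returned it. ConfigColorSizes and findExactColorMatch are hypothetical names introduced only for illustration.

```c
/* Hypothetical record of the color sizes read back for one returned
   EGLConfig (for example, with eglGetConfigAttrib). */
typedef struct
{
   int redSize;
   int greenSize;
   int blueSize;
} ConfigColorSizes;

/* Returns the index of the first entry whose color sizes equal the
   requested ones exactly, or -1 if no entry matches. */
int findExactColorMatch ( const ConfigColorSizes *sizes, int count,
                          int red, int green, int blue )
{
   int i;

   for ( i = 0; i < count; i++ )
   {
      if ( sizes[i].redSize == red &&
           sizes[i].greenSize == green &&
           sizes[i].blueSize == blue )
      {
         return i;
      }
   }

   return -1;
}
```

A caller would use the returned index to select the corresponding entry of configs, and could fall back to the first returned configuration when the result is -1.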

As mentioned in Example 3-3, if eglChooseConfig returns successfully, we have enough information to continue to create something to draw into. By default, if you do not specify which type of rendering surface you would like (by specifying the EGL_SURFACE_TYPE attribute), EGL assumes you want an on-screen window.

Creating an On-Screen Rendering Area: The EGL Window

Once we have a suitable EGLConfig that meets our requirements for

rendering, we are ready to create our window. To create a window, call the

following function:

EGLSurface eglCreateWindowSurface(EGLDisplay display,
                                  EGLConfig config,
                                  EGLNativeWindowType window,
                                  const EGLint *attribList)

display     specifies the EGL display connection
config      specifies the configuration
window      specifies the native window
attribList  specifies the list of window attributes; may be NULL

This function takes as arguments our connection to the native display manager and the EGLConfig that we obtained in the previous step. Additionally, it requires a window from the native windowing system that was created previously. Because EGL is a software layer between many different windowing systems and OpenGL ES 3.0, demonstrating how to create a native window is outside the scope of this guide. Please refer to the documentation for your native windowing system to determine what is required to create a window in that environment.

Finally, this call takes a list of attributes; however, this list differs from the attributes shown in Table 3-1. Because EGL supports other rendering APIs (notably OpenVG), some attributes accepted by eglCreateWindowSurface do not apply when working with OpenGL ES 3.0 (see Table 3-2). For our purposes, eglCreateWindowSurface accepts a single attribute, which specifies whether we would like to render into the front buffer or the back buffer.

Table 3-2 Attributes for Window Creation Using eglCreateWindowSurface

Token | Description | Default Value
EGL_RENDER_BUFFER | Specifies which buffer should be used for rendering: the front buffer (using the EGL_SINGLE_BUFFER value) or the back buffer (EGL_BACK_BUFFER) | EGL_BACK_BUFFER

Note: For OpenGL ES 3.0 window rendering surfaces, only double-buffered windows are supported.

The attribute list might be empty (i.e., passing a NULL pointer as the value for attribList), or it might be a list populated with an EGL_NONE token as the first element. In such cases, all of the relevant attributes use their default values.
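To make the "missing attribute means default" rule concrete, here is a minimal sketch of how a list of (token, value) pairs might be walked. It is an illustration only, not how any particular EGL implementation is written. SKETCH_LIST_END is a stand-in for EGL_NONE so that the code compiles without the EGL headers, lookupAttrib is a hypothetical helper name, and the sketch assumes that no real token shares the terminator's value.

```c
#include <stddef.h>

/* Stand-in for the EGL_NONE terminator so that the sketch compiles
   without <EGL/egl.h>; real code would use EGL_NONE itself. */
#define SKETCH_LIST_END 0

/* Looks up key in a list of (token, value) pairs that ends with
   SKETCH_LIST_END. Returns fallback when the list is NULL, when it
   starts with the terminator, or when it does not mention key. */
int lookupAttrib ( const int *attribList, int key, int fallback )
{
   const int *pair;

   if ( attribList == NULL )
   {
      return fallback;
   }

   for ( pair = attribList; pair[0] != SKETCH_LIST_END; pair += 2 )
   {
      if ( pair[0] == key )
      {
         return pair[1];
      }
   }

   return fallback;
}
```

Both a NULL list and a list whose first element is the terminator take the fallback path, which mirrors the two "empty list" cases described above.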

There are a number of ways in which eglCreateWindowSurface could

fail, and if any of them occur, EGL_NO_SURFACE is returned from the call

and the particular error is set. If this situation occurs, we can determine

the reason for the failure by calling eglGetError, which will return one

of the reasons shown in Table 3-3.


Table 3-3 Possible Errors When eglCreateWindowSurface Fails

Error Code | Description
EGL_BAD_MATCH | This situation occurs when the attributes of the native window do not match those of the provided EGLConfig, or when the provided EGLConfig does not support rendering into a window (i.e., the EGL_SURFACE_TYPE attribute does not have the EGL_WINDOW_BIT set).
EGL_BAD_CONFIG | This error is flagged if the provided EGLConfig is not supported by the system.
EGL_BAD_NATIVE_WINDOW | This error is specified if the provided native window handle is not valid.
EGL_BAD_ALLOC | This error occurs if eglCreateWindowSurface is unable to allocate the resources for the new EGL window, or if there is already an EGLConfig associated with the provided native window.

Putting this all together, our code for creating a window is shown in Example 3-4.

Example 3-4 Creating an EGL Window Surface

EGLint attribList[] =
{
   EGL_RENDER_BUFFER, EGL_BACK_BUFFER,
   EGL_NONE
};

EGLSurface window = eglCreateWindowSurface ( display, config,
                                             nativeWindow,
                                             attribList );

if ( window == EGL_NO_SURFACE )
{
   switch ( eglGetError ( ) )
   {
      case EGL_BAD_MATCH:
         // Check window and EGLConfig attributes to determine
         // compatibility, or verify that the EGLConfig
         // supports rendering to a window
         break;

      case EGL_BAD_CONFIG:
         // Verify that provided EGLConfig is valid
         break;

      case EGL_BAD_NATIVE_WINDOW:
         // Verify that provided EGLNativeWindow is valid
         break;

      case EGL_BAD_ALLOC:
         // Not enough resources available; handle and recover
         break;
   }
}

This creates a place for us to draw into, but we still have two more steps that must be completed before we can successfully use OpenGL ES 3.0 with our window. Windows, however, are not the only rendering surfaces that you might find useful. We introduce another type of rendering surface next before completing our discussion.

Creating an Off-Screen Rendering Area: EGL Pbuffers

In addition to being able to render into an on-screen window using OpenGL ES 3.0, you can render into nonvisible off-screen surfaces called pbuffers (short for pixel buffers). Pbuffers can take full advantage of any hardware acceleration available to OpenGL ES 3.0, just as a window does. Pbuffers are most often used for generating texture maps. If all you want to do is render to a texture, we recommend using framebuffer objects (covered in Chapter 12, “Framebuffer Objects”) instead of pbuffers because they are more efficient. However, pbuffers can still be useful in some cases where framebuffer objects cannot be used, such as when rendering an off-screen surface with OpenGL ES and then using it as a texture in another API such as OpenVG.

Creating a pbuffer is very similar to creating an EGL window, with a few minor differences. To create a pbuffer, we need to find an EGLConfig just as we did for a window, with one modification: We need to augment the value of EGL_SURFACE_TYPE to include EGL_PBUFFER_BIT. Once we have a suitable EGLConfig, we can create a pbuffer using the function

EGLSurface eglCreatePbufferSurface(EGLDisplay display,
                                   EGLConfig config,
                                   const EGLint *attribList)

display     specifies the EGL display connection
config      specifies the configuration
attribList  specifies the list of pixel buffer attributes; may be NULL

As with window creation, this function takes our connection to the native display manager and the EGLConfig that we selected. This call also takes a list of attributes, as described in Table 3-4.

Table 3-4 EGL Pixel Buffer Attributes

Token | Description | Default Value
EGL_WIDTH | Specifies the desired width (in pixels) of the pbuffer. | 0
EGL_HEIGHT | Specifies the desired height (in pixels) of the pbuffer. | 0
EGL_LARGEST_PBUFFER | Selects the largest available pbuffer if one of the requested size is not available. Valid values are EGL_TRUE and EGL_FALSE. | EGL_FALSE
EGL_TEXTURE_FORMAT | Specifies the type of texture format (see Chapter 9, “Texturing”) if the pbuffer is bound to a texture map. Valid values are EGL_TEXTURE_RGB, EGL_TEXTURE_RGBA, and EGL_NO_TEXTURE (which indicates that the pbuffer will not be used directly as a texture). | EGL_NO_TEXTURE
EGL_TEXTURE_TARGET | Specifies the associated texture target that the pbuffer should be attached to if used as a texture map (see Chapter 9, “Texturing”). Valid values are EGL_TEXTURE_2D and EGL_NO_TEXTURE. | EGL_NO_TEXTURE

(continues)

There are a number of ways that eglCreatePbufferSurface could fail. Just

as with window creation, if any of these failures occur, EGL_NO_SURFACE

is returned from the call and the particular error is set. In this situation,

eglGetError will return one of the errors listed in Table 3-5.

Table 3-5 Possible Errors When eglCreatePbufferSurface Fails

Error Code | Description
EGL_BAD_ALLOC | This error occurs when the pbuffer cannot be allocated due to a lack of resources.
EGL_BAD_CONFIG | This error is flagged if the provided EGLConfig is not a valid EGLConfig supported by the system.
EGL_BAD_PARAMETER | This error is generated if either the EGL_WIDTH or EGL_HEIGHT provided in the attribute list is a negative value.
EGL_BAD_MATCH | This error is generated if any of the following situations occur: if the EGLConfig provided does not support pbuffer surfaces; if the pbuffer will be used as a texture map (EGL_TEXTURE_FORMAT is not EGL_NO_TEXTURE), and the specified EGL_WIDTH and EGL_HEIGHT specify an invalid texture size; or if one of EGL_TEXTURE_FORMAT and EGL_TEXTURE_TARGET is EGL_NO_TEXTURE, and the other attribute is not EGL_NO_TEXTURE.
EGL_BAD_ATTRIBUTE | This error occurs if any of EGL_TEXTURE_FORMAT, EGL_TEXTURE_TARGET, or EGL_MIPMAP_TEXTURE is specified, but the provided EGLConfig does not support OpenGL ES rendering (e.g., only OpenVG rendering is supported).
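Two of the rules in Table 3-5 depend only on the arguments, so an application can check them before calling eglCreatePbufferSurface. The sketch below is an illustration under stated assumptions and not part of EGL: RequestCheck and checkPbufferRequest are hypothetical names, and the two "uses" flags are assumed to be nonzero when the corresponding attribute is something other than EGL_NO_TEXTURE. It does not cover the texture-size, configuration, or allocation checks, which only the EGL implementation can perform.

```c
/* Hypothetical result codes mirroring two rows of Table 3-5. */
typedef enum
{
   REQUEST_OK,
   REQUEST_BAD_PARAMETER,
   REQUEST_BAD_MATCH
} RequestCheck;

/* Predicts the argument-related error a pbuffer request would
   trigger, based only on the rules stated in Table 3-5. */
RequestCheck checkPbufferRequest ( int width, int height,
                                   int usesTextureFormat,
                                   int usesTextureTarget )
{
   /* A negative EGL_WIDTH or EGL_HEIGHT is a bad parameter. */
   if ( width < 0 || height < 0 )
   {
      return REQUEST_BAD_PARAMETER;
   }

   /* Format and target must both be set or both be "no texture". */
   if ( ( usesTextureFormat != 0 ) != ( usesTextureTarget != 0 ) )
   {
      return REQUEST_BAD_MATCH;
   }

   return REQUEST_OK;
}
```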

Table 3-4 EGL Pixel Buffer Attributes (continued)

Token | Description | Default Value
EGL_MIPMAP_TEXTURE | Specifies whether storage for texture mipmap levels (see Chapter 9, “Texturing”) should be additionally allocated. Valid values are EGL_TRUE and EGL_FALSE. | EGL_FALSE

Putting this all together, we create a pbuffer, as shown in Example 3-5.

Example 3-5 Creating an EGL Pixel Buffer

EGLint configAttribs[] =
{
   EGL_SURFACE_TYPE, EGL_PBUFFER_BIT,
   EGL_RENDERABLE_TYPE, EGL_OPENGL_ES3_BIT_KHR,
   EGL_RED_SIZE, 5,
   EGL_GREEN_SIZE, 6,
   EGL_BLUE_SIZE, 5,
   EGL_DEPTH_SIZE, 1,
   EGL_NONE
};

const EGLint MaxConfigs = 10;
EGLConfig configs[MaxConfigs]; // We'll accept only 10 configs
EGLint numConfigs;

if ( !eglChooseConfig( display, configAttribs, configs, MaxConfigs,
                       &numConfigs ) )
{
   // Something did not work ... handle error situation
}
else
{
   // We have found a pbuffer-capable EGLConfig
}

// Proceed to create a 512 x 512 pbuffer
// (or the largest available) using the first returned EGLConfig
EGLSurface pbuffer;
EGLint pbufferAttribs[] =
{
   EGL_WIDTH, 512,
   EGL_HEIGHT, 512,
   EGL_LARGEST_PBUFFER, EGL_TRUE,
   EGL_NONE
};

pbuffer = eglCreatePbufferSurface( display, configs[0], pbufferAttribs );

if ( pbuffer == EGL_NO_SURFACE )
{
   switch ( eglGetError ( ) )
   {
      case EGL_BAD_ALLOC:
         // Not enough resources available; handle and recover
         break;

      case EGL_BAD_CONFIG:
         // Verify that provided EGLConfig is valid
         break;

      case EGL_BAD_PARAMETER:
         // Verify that EGL_WIDTH and EGL_HEIGHT are
         // non-negative values
         break;

      case EGL_BAD_MATCH:
         // Check EGLConfig attributes to determine
         // compatibility and pbuffer-texture parameters
         break;
   }
}

// Check the size of pbuffer that was allocated
EGLint width;
EGLint height;

if ( !eglQuerySurface ( display, pbuffer, EGL_WIDTH, &width ) ||
     !eglQuerySurface ( display, pbuffer, EGL_HEIGHT, &height ) )
{
   // Unable to query surface information
}

Pbuffers support all OpenGL ES 3.0 rendering facilities, just as windows do. The major difference (aside from the fact that you cannot display a pbuffer on the screen) is that instead of swapping buffers when you are finished rendering as you do with a window, you either copy the values from a pbuffer to your application or modify the binding of the pbuffer as a texture.

Creating a Rendering Context

A rendering context is a data structure internal to OpenGL ES 3.0 that contains all of the state information required for operation. For example, it contains references to the vertex and fragment shaders and the array of vertex data used in the example program in Chapter 2, “Hello Triangle: An OpenGL ES 3.0 Example.” Before OpenGL ES 3.0 can draw, it needs to have a context available for its use.

To create a context, use the following function:

EGLContext eglCreateContext(EGLDisplay display,
                            EGLConfig config,
                            EGLContext shareContext,
                            const EGLint *attribList)

display       specifies the EGL display connection
config        specifies the configuration
shareContext  allows multiple EGL contexts to share specific types of data, such as shader programs and texture maps; use EGL_NO_CONTEXT for no sharing
attribList    specifies the list of attributes for the context to be created; only a single attribute is accepted, EGL_CONTEXT_CLIENT_VERSION

Once again, you will need the display connection and the EGLConfig that best represents your application’s requirements. The third parameter, shareContext, allows multiple EGLContexts to share specific types of data, such as shader programs and texture maps. For the time being, we pass EGL_NO_CONTEXT in as the value for shareContext, indicating that we are not sharing resources with any other contexts.

Finally, as with many EGL calls, a list of attributes specific to eglCreateContext’s operation is specified. In this case, a single attribute is accepted, EGL_CONTEXT_CLIENT_VERSION, which is discussed in Table 3-6.

Table 3-6 Attributes for Context Creation Using eglCreateContext

Token | Description | Default Value
EGL_CONTEXT_CLIENT_VERSION | Specifies the type of context associated with the version of OpenGL ES that you are using | 1 (specifies an OpenGL ES 1.X context)

Because we want to use OpenGL ES 3.0, we will always have to specify this attribute to obtain the right type of context.

When eglCreateContext succeeds, it returns a handle to the newly created context. If a context cannot be created, then eglCreateContext returns EGL_NO_CONTEXT, and the reason for the failure is set and can be obtained by calling eglGetError. With our current knowledge, the only reason that eglCreateContext would fail is if the EGLConfig we provide is not valid, in which case the error returned by eglGetError is EGL_BAD_CONFIG.

Example 3-6 shows how to create a context after selecting an appropriate EGLConfig.

Example 3-6 Creating an EGL Context

const EGLint attribList[] =
{
   // EGL_KHR_create_context is required
   EGL_CONTEXT_CLIENT_VERSION, 3,
   EGL_NONE
};

EGLContext context = eglCreateContext ( display, config,
                                        EGL_NO_CONTEXT,
                                        attribList );

if ( context == EGL_NO_CONTEXT )
{
   EGLint error = eglGetError ( );

   if ( error == EGL_BAD_CONFIG )
   {
      // Handle error and recover
   }
}

Other errors may be generated by eglCreateContext, but for the moment we will check for only bad EGLConfig errors.

After successfully creating an EGLContext, we need to complete one final step before we can render.

Making an EGLContext Current

As an application might have created multiple EGLContexts for various purposes, we need a way to associate a particular EGLContext with our rendering surface, a process commonly called “make current.”

To associate a particular EGLContext with an EGLSurface, use the call

EGLBoolean eglMakeCurrent(EGLDisplay display,
                          EGLSurface draw,
                          EGLSurface read,
                          EGLContext context)

display  specifies the EGL display connection
draw     specifies the EGL draw surface
read     specifies the EGL read surface
context  specifies the EGL rendering context to be attached to the surfaces

This function returns EGL_TRUE if the call succeeded. On failure, it returns EGL_FALSE.

You probably noticed that this call takes two EGLSurfaces. Although this approach allows flexibility that we will exploit in our discussion of advanced EGL usage, we set both read and draw to the same value, the window that we created previously.

Note: Because the EGL specification requires eglMakeCurrent to perform a flush, this call is expensive on tile-based architectures.

Putting All Our EGL Knowledge Together

This chapter concludes with a complete example showing the entire process, starting with the initialization of EGL and ending with binding an EGLContext to an EGLSurface. We will assume that a native window has already been created, and that if any errors occur, the application will terminate.

In fact, Example 3-7 is similar to what is done in esCreateWindow, our homegrown function that wraps the required EGL window creation code, as shown in Chapter 2, “Hello Triangle: An OpenGL ES 3.0 Example,” except for those routines that separate the creation of the window and the context (for reasons that we discuss later).

Example 3-7 A Complete Routine for Creating an EGL Window

EGLBoolean initializeWindow ( EGLNativeWindowType nativeWindow )
{
   const EGLint configAttribs[] =
   {
      EGL_SURFACE_TYPE, EGL_WINDOW_BIT,
      EGL_RENDERABLE_TYPE, EGL_OPENGL_ES3_BIT_KHR,
      EGL_RED_SIZE, 8,
      EGL_GREEN_SIZE, 8,
      EGL_BLUE_SIZE, 8,
      EGL_DEPTH_SIZE, 24,
      EGL_NONE
   };

   const EGLint contextAttribs[] =
   {
      EGL_CONTEXT_CLIENT_VERSION, 3,
      EGL_NONE
   };

   EGLDisplay display = eglGetDisplay ( EGL_DEFAULT_DISPLAY );
   if ( display == EGL_NO_DISPLAY )
   {
      return EGL_FALSE;
   }

   EGLint major, minor;
   if ( !eglInitialize ( display, &major, &minor ) )
   {
      return EGL_FALSE;
   }

   EGLConfig config;
   EGLint numConfigs;
   if ( !eglChooseConfig ( display, configAttribs, &config, 1,
                           &numConfigs ) || numConfigs < 1 )
   {
      return EGL_FALSE;
   }

   EGLSurface window = eglCreateWindowSurface ( display, config,
                                                nativeWindow, NULL );
   if ( window == EGL_NO_SURFACE )
   {
      return EGL_FALSE;
   }

   EGLContext context = eglCreateContext ( display, config,
                                           EGL_NO_CONTEXT,
                                           contextAttribs );
   if ( context == EGL_NO_CONTEXT )
   {
      return EGL_FALSE;
   }

   if ( !eglMakeCurrent ( display, window, window, context ) )
   {
      return EGL_FALSE;
   }

   return EGL_TRUE;
}

An application would get a similar result by making the call in Example 3-8 to open a 512 × 512 window.

Example 3-8 Creating a Window Using the esUtil Library

ESContext esContext;
const char* title = "OpenGL ES Application Window Title";

if ( !esCreateWindow ( &esContext, title, 512, 512,
                       ES_WINDOW_RGB | ES_WINDOW_DEPTH ) )
{
   // Window creation failed
}

The last parameter to esCreateWindow specifies the characteristics we want in our window, and is specified as a bitmask of the following values:

ES_WINDOW_RGB: Specify an RGB-based color buffer.

ES_WINDOW_ALPHA: Allocate a destination alpha buffer.

ES_WINDOW_DEPTH: Allocate a depth buffer.

ES_WINDOW_STENCIL: Allocate a stencil buffer.

ES_WINDOW_MULTISAMPLE: Allocate a multisample buffer.

Specifying these values in the window configuration bitmask will add the appropriate tokens and values into the EGLConfig attributes list (i.e., configAttribs in the preceding example).
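The mapping from a window flag bitmask to an attribute list can be pictured with the sketch below. It is not the esUtil implementation: the flag and token names are stand-ins chosen so that the code compiles without esUtil.h or the EGL headers, the requested sizes (8 bits, one sample buffer) are assumptions, and buildConfigAttribs is a hypothetical helper. Real code would use the ES_WINDOW_* flags, the EGL_*_SIZE and EGL_SAMPLE_BUFFERS tokens, and EGL_NONE as the terminator.

```c
/* Stand-in flag values; real code would use the ES_WINDOW_* flags. */
enum
{
   FLAG_ALPHA = 1,
   FLAG_DEPTH = 2,
   FLAG_STENCIL = 4,
   FLAG_MULTISAMPLE = 8
};

/* Stand-in tokens; real code would use the EGL tokens and EGL_NONE. */
enum
{
   TOKEN_END = 0,
   TOKEN_ALPHA_SIZE,
   TOKEN_DEPTH_SIZE,
   TOKEN_STENCIL_SIZE,
   TOKEN_SAMPLE_BUFFERS
};

/* Writes one (token, value) pair per requested buffer, followed by
   the terminator. out must have room for at least 9 ints.
   Returns the number of ints written. */
int buildConfigAttribs ( unsigned int flags, int *out )
{
   int n = 0;

   if ( flags & FLAG_ALPHA )
   {
      out[n++] = TOKEN_ALPHA_SIZE;
      out[n++] = 8;                /* assumed size */
   }
   if ( flags & FLAG_DEPTH )
   {
      out[n++] = TOKEN_DEPTH_SIZE;
      out[n++] = 8;                /* assumed size */
   }
   if ( flags & FLAG_STENCIL )
   {
      out[n++] = TOKEN_STENCIL_SIZE;
      out[n++] = 8;                /* assumed size */
   }
   if ( flags & FLAG_MULTISAMPLE )
   {
      out[n++] = TOKEN_SAMPLE_BUFFERS;
      out[n++] = 1;
   }

   out[n++] = TOKEN_END;
   return n;
}
```

Building the list at run time keeps unrequested buffers out of the list entirely, so EGL falls back to its default for each of them.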

Synchronizing Rendering

You might encounter situations in which you need to coordinate the rendering of multiple graphics APIs into a single window. For example, you might find it easier to use OpenVG or find the native windowing system’s font rendering functions better suited for drawing characters into a window than OpenGL ES 3.0. In such cases, you will need to have your application allow the various libraries to render into the shared window. EGL has a few functions to help with your synchronization tasks.

If your application is rendering only with OpenGL ES 3.0, then you can guarantee that all rendering has occurred by simply calling glFinish (or by using the more efficient sync objects and fences, which are discussed in Chapter 13, “Sync Objects and Fences”).

However, if you are using more than one Khronos API for rendering (such as OpenVG) and you might not know which API is used before switching to the window system’s native rendering API, you can call this function:

EGLBoolean eglWaitClient()

Delays execution of the client until all rendering through a Khronos API (e.g., OpenGL ES 3.0, OpenGL, or OpenVG) is completed. On success, it returns EGL_TRUE. On failure, it returns EGL_FALSE and an EGL_BAD_CURRENT_SURFACE error is posted.

This function’s operation is similar to that of glFinish, but it works regardless of which Khronos API is currently in operation.

Likewise, if you need to guarantee that the native windowing system’s rendering is completed, call this function:

EGLBoolean eglWaitNative(EGLint engine)

engine  specifies the renderer to wait for rendering completion

EGL_CORE_NATIVE_ENGINE is always accepted, and represents the most common engine supported; other engines are implementation specific, and are specified through EGL extensions. EGL_TRUE is returned on success. On failure, EGL_FALSE is returned and an EGL_BAD_PARAMETER error is posted.

Summary

In this chapter, you learned about EGL, the API for creating surfaces and rendering contexts for OpenGL ES 3.0. Now, you know how to initialize EGL; query various EGL attributes; and create on-screen and off-screen rendering areas and a rendering context using EGL. You have learned enough EGL to do everything you will need for rendering with OpenGL ES 3.0. In the next chapter, we show you how to create OpenGL ES shaders and programs.


Chapter 4

Shaders and Programs

Chapter 2, “Hello Triangle: An OpenGL ES 3.0 Example,” introduced

you to a simple program that draws a single triangle. In that example,

we created two shader objects (one for the vertex shader and one for

the fragment shader) and a single program object to render the triangle.

Shader objects and program objects are fundamental concepts when

working with shaders in OpenGL ES 3.0. In this chapter, we provide

the full details on how to create shaders, compile them, and link them

together into a program object. The details of writing vertex and fragment

shaders come later in this book. For now, we focus on the following topics:

Shader and program object overview

Creating and compiling a shader

Creating and linking a program

Getting and setting uniforms

Getting and setting attributes

Shader compiler and program binaries

Shaders and Programs

There are two fundamental object types you need to create to render with shaders: shader objects and program objects. The best way to think of a shader object and a program object is by comparison to a C compiler and linker. A C compiler generates object code (e.g., .obj or .o files) for a piece of source code. After the object files have been created, the C linker then links the object files into a final program.

A similar paradigm is used in OpenGL ES for representing shaders. The shader object is an object that contains a single shader. The source code is given to the shader object, and then the shader object is compiled into object form (like an .obj file). After compilation, the shader object can then be attached to a program object. A program object gets multiple shader objects attached to it. In OpenGL ES, each program object will need to have one vertex shader object and one fragment shader object attached to it (no more and no less), unlike in desktop OpenGL. The program object is linked into a final “executable,” which can then be used to render.

The general process to get a linked program object involves six steps:

1. Create a vertex shader object and a fragment shader object.

2. Attach source code to each of the shader objects.

3. Compile the shader objects.

4. Create a program object.

5. Attach the compiled shader objects to the program object.

6. Link the program object.

If there are no errors, you can then tell the GL to use this program for

drawing any time you like. The next sections detail the API calls you use

to execute this process.

Creating and Compiling a Shader

The rst step in working with a shader object is to create it. This is done

using glCreateShader.

GLuint glCreateShader(GLenum type)

type the type of the shader to create, either GL_VERTEX_SHADER

or GL_FRAGMENT_SHADER

Calling glCreateShader causes a new vertex or fragment shader to be

created, depending on the type passed in. The return value is a handle to

the new shader object. When you are nished with a shader object, you

can delete it using glDeleteShader.

Shaders and Programs 71

Note that if a shader is attached to a program object (more on this later),

calling glDeleteShader will not immediately delete the shader. Rather,

the shader will be marked for deletion and its memory will be freed once

the shader is no longer attached to any program objects.

Once you have a shader object created, typically the next thing you will

do is provide the shader source code using glShaderSource.

void glDeleteShader(GLuint shader)

shader handle to the shader object to delete

void glShaderSource(GLuint shader, GLsizei count,
                    const GLchar* const *string,
                    const GLint *length)

shader  handle to the shader object.
count   the number of shader source strings. A shader can be composed of a number of source strings, although each shader can have only one main function.
string  pointer to an array of strings holding count number of shader source strings.
length  pointer to an array of count integers that holds the size of each respective shader string. If length is NULL, the shader strings are assumed to be null terminated. If length is not NULL, then each element of length holds the number of characters in the corresponding shader in the string array. If the value of length for any element is less than zero, then that string is assumed to be null terminated.
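The rules for the length parameter can be restated as a small function. This is a sketch of the documented behavior rather than driver code: plain int stands in for GLint so that the block compiles without the OpenGL ES headers, and effectiveSourceLength is a hypothetical name.

```c
#include <stddef.h>
#include <string.h>

/* Returns the number of characters that string i contributes,
   following the length rules described for glShaderSource. */
size_t effectiveSourceLength ( const char *const *string,
                               const int *length, int i )
{
   if ( length == NULL || length[i] < 0 )
   {
      /* No explicit length: the string is null terminated */
      return strlen ( string[i] );
   }

   /* Explicit character count supplied by the caller */
   return (size_t) length[i];
}
```

Passing an explicit length makes it possible to hand over part of a larger buffer, such as a file read into memory, without copying it or adding a terminator.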

Once the shader source has been specified, the next step is to compile the shader using glCompileShader.

void glCompileShader(GLuint shader)

shader  handle to the shader object to compile

Calling glCompileShader will cause the shader source code that has been stored in the shader object to be compiled. As with any normal language compiler, the first thing you want to know after compiling is whether there were any errors. You can use glGetShaderiv to query for this information, along with other information about the shader object.

void glGetShaderiv(GLuint shader, GLenum pname,
                   GLint *params)

shader  handle to the shader object to get information about
pname   the parameter to get information about; can be
        GL_COMPILE_STATUS
        GL_DELETE_STATUS
        GL_INFO_LOG_LENGTH
        GL_SHADER_SOURCE_LENGTH
        GL_SHADER_TYPE
params  pointer to integer storage location for the result of the query

To check whether a shader has compiled successfully, you can call glGetShaderiv on the shader object with the GL_COMPILE_STATUS argument for pname. If the shader compiled successfully, the result will be GL_TRUE. If the shader failed to compile, the result will be GL_FALSE, and the compile errors will be written into the info log. The info log is a log written by the compiler that contains any error messages or warnings. It can be written with information even if the compile operation is successful. To check the info log, its length can be queried using GL_INFO_LOG_LENGTH. The info log itself can be retrieved using glGetShaderInfoLog (described next). Querying for GL_SHADER_TYPE will return whether the shader is a GL_VERTEX_SHADER or GL_FRAGMENT_SHADER. Querying for GL_SHADER_SOURCE_LENGTH returns the length of the shader source code, including the null terminator. Finally, querying for GL_DELETE_STATUS returns whether the shader has been marked for deletion using glDeleteShader.

After compiling the shader and checking the info log length, you might want to retrieve the info log (especially if compilation failed, to find out why). To do so, you first need to query for the GL_INFO_LOG_LENGTH and allocate a string with sufficient storage to store the info log. The info log can then be retrieved using glGetShaderInfoLog.

void glGetShaderInfoLog(GLuint shader, GLsizei maxLength,
                        GLsizei *length, GLchar *infoLog)

shader     handle to the shader object for which to get the info log
maxLength  the size of the buffer in which to store the info log
length     the length of the info log written (minus the null terminator); if the length does not need to be known, this parameter can be NULL
infoLog    pointer to the character buffer in which to store the info log
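Once the log text is in hand, tooling sometimes wants to map a message back to a location in the shader source. Info logs have no mandated format, but compilers commonly prefix an error with a source string index and a line number, as in "ERROR: 0:10:". The following is a minimal sketch that assumes exactly that prefix, an assumption that will not hold on every implementation; parseLogLocation is a hypothetical helper name.

```c
#include <stdio.h>

/* Extracts the source string index and line number from one log
   line of the assumed form "ERROR: <string>:<line>: ...".
   Returns 1 on success and 0 if the line does not fit that form. */
int parseLogLocation ( const char *logLine,
                       int *stringIndex, int *lineNumber )
{
   return sscanf ( logLine, "ERROR: %d:%d:",
                   stringIndex, lineNumber ) == 2;
}
```

Lines that do not carry a location, such as a trailing summary line, are reported as not matching, so a caller can simply print them unchanged.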

The info log does not have any mandated format or required information. Nevertheless, most OpenGL ES 3.0 implementations will return error messages that contain the line number of the source code on which the compiler was working when it detected the error. Some implementations will also provide warnings or additional information in the log. For example, the following error message is produced by the compiler when the shader source code contains an undeclared variable:

ERROR: 0:10: 'i_position' : undeclared identifier
ERROR: 0:10: 'assign' : cannot convert from '4X4 matrix of float' to 'vertex out/varying 4-component vector of float'
ERROR: 2 compilation errors. No code generated.

At this point, we have shown you all of the functions you need to create a shader, compile it, find out the compile status, and query the info log. For review, Example 4-1 shows the code from Chapter 2, “Hello Triangle: An OpenGL ES 3.0 Example,” to load a shader that uses the functions just described.

Example 4-1 Loading a Shader

GLuint LoadShader ( GLenum type, const char *shaderSrc )
{
   GLuint shader;
   GLint compiled;

   // Create the shader object
   shader = glCreateShader ( type );

   if ( shader == 0 )
   {
      return 0;
   }

   // Load the shader source
   glShaderSource ( shader, 1, &shaderSrc, NULL );

   // Compile the shader
   glCompileShader ( shader );

   // Check the compile status
   glGetShaderiv ( shader, GL_COMPILE_STATUS, &compiled );

   if ( !compiled )
   {
      // Retrieve the compiler messages when compilation fails
      GLint infoLen = 0;

      glGetShaderiv ( shader, GL_INFO_LOG_LENGTH, &infoLen );

      if ( infoLen > 1 )
      {
         char* infoLog = malloc ( sizeof ( char ) * infoLen );

         glGetShaderInfoLog ( shader, infoLen, NULL, infoLog );
         esLogMessage ( "Error compiling shader:\n%s\n", infoLog );

         free ( infoLog );
      }

      glDeleteShader ( shader );
      return 0;
   }

   return shader;
}

Creating and Linking a Program

Now that we have shown you how to create shader objects, the next step is to create a program object. As previously described, a program object is a container object to which you attach shaders and link a final executable program. The function calls to manipulate program objects are similar to those for shader objects. You create a program object by using glCreateProgram.

GLuint glCreateProgram()

You might notice that glCreateProgram does not take any arguments; it
simply returns a handle to a new program object. You delete a program
object by using glDeleteProgram.

void glDeleteProgram(GLuint program)

program   handle to the program object to delete

Once you have a program object created, the next step is to attach shaders
to it. In OpenGL ES 3.0, each program object needs to have one vertex
shader and one fragment shader object attached to it. To attach shaders to
a program, you use glAttachShader.

void glAttachShader(GLuint program, GLuint shader)

program   handle to the program object
shader    handle to the shader object to attach to the program

This function attaches the shader to the given program. Note that a shader
can be attached at any point; it does not necessarily need to be compiled
or even have source code before being attached to a program. The only
requirement is that every program object must have one and only one
vertex shader and fragment shader object attached to it. In addition to
attaching shaders, you can detach shaders using glDetachShader.

void glDetachShader(GLuint program, GLuint shader)

program   handle to the program object
shader    handle to the shader object to detach from the program

Once the shaders have been attached (and the shaders have been
successfully compiled), we are finally ready to link the shaders together.
Linking a program object is accomplished using glLinkProgram.

void glLinkProgram(GLuint program)

program   handle to the program object to link

The link operation is responsible for generating the final executable
program. The linker will check for a number of things to ensure
successful linkage. We mention some of these conditions now, but until
we describe vertex and fragment shaders in detail, these conditions might
be a bit confusing to you. The linker will make sure that any vertex
shader output variables that are consumed by the fragment shader are
written by the vertex shader (and declared with the same type). The
linker will also make sure that any uniforms and uniform buffers declared
in both the vertex and fragment shaders have matching types. In
addition, the linker will make sure that the final program fits within the
limits of the implementation (e.g., the number of attributes, uniforms,
or input and output shader variables). Typically, the link phase is the
time at which the final hardware instructions are generated to run on the
hardware.

After linking a program, you need to check whether the link succeeded. To

check the link status, you use glGetProgramiv.

void glGetProgramiv(GLuint program, GLenum pname,

GLint *params)

program handle to the program object to get information about

pname the parameter to get information about; can be

GL_ACTIVE_ATTRIBUTES

GL_ACTIVE_ATTRIBUTE_MAX_LENGTH

GL_ACTIVE_UNIFORM_BLOCKS

GL_ACTIVE_UNIFORM_BLOCK_MAX_NAME_LENGTH

GL_ACTIVE_UNIFORMS

GL_ACTIVE_UNIFORM_MAX_LENGTH

GL_ATTACHED_SHADERS

GL_DELETE_STATUS

GL_INFO_LOG_LENGTH

GL_LINK_STATUS

GL_PROGRAM_BINARY_RETRIEVABLE_HINT

GL_TRANSFORM_FEEDBACK_BUFFER_MODE

GL_TRANSFORM_FEEDBACK_VARYINGS

GL_TRANSFORM_FEEDBACK_VARYING_MAX_LENGTH

GL_VALIDATE_STATUS

params pointer to integer storage location for the result of the

query


To check whether a link was successful, you can query for
GL_LINK_STATUS. A large number of other queries can also be executed on
program objects. Querying for GL_ACTIVE_ATTRIBUTES returns a count
of the number of active attributes in the vertex shader. Querying for
GL_ACTIVE_ATTRIBUTE_MAX_LENGTH returns the maximum length (in
characters) of the longest attribute name; this information can be used to
determine how much memory to allocate to store attribute name strings.
Likewise, GL_ACTIVE_UNIFORMS and GL_ACTIVE_UNIFORM_MAX_LENGTH
return the number of active uniforms and the maximum length of the
longest uniform name, respectively. The number of shaders attached to
the program object can be queried using GL_ATTACHED_SHADERS. The
GL_DELETE_STATUS query returns whether a program object has been
marked for deletion. As with shader objects, program objects store an info
log, the length of which can be queried for using GL_INFO_LOG_LENGTH.
Querying for GL_TRANSFORM_FEEDBACK_BUFFER_MODE returns either
GL_SEPARATE_ATTRIBS or GL_INTERLEAVED_ATTRIBS, which is the buffer
mode when transform feedback is active. Queries for
GL_TRANSFORM_FEEDBACK_VARYINGS and
GL_TRANSFORM_FEEDBACK_VARYING_MAX_LENGTH return the number of output
variables to capture in transform feedback mode for the program and the
maximum length of the output variable names, respectively. Transform
feedback is described in Chapter 8, “Vertex Shaders.” The number of
active uniform blocks in the program and the maximum length of the
uniform block names can be queried using GL_ACTIVE_UNIFORM_BLOCKS and
GL_ACTIVE_UNIFORM_BLOCK_MAX_NAME_LENGTH, respectively. Uniform blocks
are described in a later section. Querying for
GL_PROGRAM_BINARY_RETRIEVABLE_HINT returns a value indicating whether
the binary retrieval hint is currently enabled for the program. Finally,
the status of the last validation operation can be queried for using
GL_VALIDATE_STATUS. The validation of program objects is described later
in this section.

After linking the program, we next want to get information from the
program info log (particularly if a link failure occurred). Doing so is
similar to getting the info log for shader objects.

void glGetProgramInfoLog(GLuint program, GLsizei maxLength,
                         GLsizei *length,
                         GLchar *infoLog)

program     handle to the program object for which to get information
maxLength   the size of the buffer in which to store the info log
length      the length of the info log written (minus the null
            terminator); if the length does not need to be known,
            this parameter can be NULL
infoLog     pointer to the character buffer in which to store the info log

Once we have linked the program successfully, we are almost ready to
render with it. Before doing so, however, we might want to check whether
the program validates. That is, there are certain aspects of execution that
a successful link cannot guarantee. For example, perhaps the application
never binds valid texture units to samplers. This behavior will not be
known at link time, but instead will become apparent at draw time. To
check that your program will execute with the current state, you can call
glValidateProgram.

void glValidateProgram(GLuint program)

program   handle to the program object to validate

The result of the validation can be checked using GL_VALIDATE_STATUS
described earlier. The info log will also be updated.

Note: You really want to use glValidateProgram only for debugging
purposes. It is a slow operation and certainly not something you
want to check before every render. In fact, you can get away with
never using it if your application is successfully rendering. We want
to make you aware that this function does exist, though.

So far, we have shown you the functions needed for creating a program
object, attaching shaders to it, linking, and getting the info log. There is
one more thing you need to do with a program object before rendering,
and that is to set it as the active program using glUseProgram.

void glUseProgram(GLuint program)

program   handle to the program object to make active

Now that we have our program active, we are set to render. Once again,
Example 4-2 shows the code from our sample in Chapter 2, “Hello
Triangle: An OpenGL ES 3.0 Example,” that uses these functions.

Example 4-2 Create, Attach Shaders to, and Link a Program

// Create the program object
programObject = glCreateProgram ( );

if ( programObject == 0 )
{
   return 0;
}

glAttachShader ( programObject, vertexShader );
glAttachShader ( programObject, fragmentShader );

// Link the program
glLinkProgram ( programObject );

// Check the link status
glGetProgramiv ( programObject, GL_LINK_STATUS, &linked );

if ( !linked )
{
   // Retrieve compiler error messages when linking fails
   GLint infoLen = 0;

   glGetProgramiv ( programObject, GL_INFO_LOG_LENGTH, &infoLen );

   if ( infoLen > 1 )
   {
      char *infoLog = malloc ( sizeof ( char ) * infoLen );

      glGetProgramInfoLog ( programObject, infoLen, NULL,
                            infoLog );
      esLogMessage ( "Error linking program:\n%s\n", infoLog );

      free ( infoLog );
   }

   glDeleteProgram ( programObject );
   return FALSE;
}

// ...

// Use the program object
glUseProgram ( programObject );


Uniforms and Attributes

Once you have a linked program object, there are a number of queries that
you might want to do on it. First, you will likely need to find out about
the active uniforms in your program. Uniforms (as we detail more in the
next chapter on the shading language) are variables that store read-only
constant values that are passed in by the application through the OpenGL
ES 3.0 API to the shader.

Sets of uniforms are grouped into two categories of uniform blocks. The
first category is the named uniform block, where the uniform’s value is
backed by a buffer object called a uniform buffer object (more on that
next). The named uniform block is assigned a uniform block index.
The following example declares a named uniform block with the name
TransformBlock containing three uniforms (matViewProj, matNormal,
and matTexGen):

uniform TransformBlock
{
   mat4 matViewProj;
   mat3 matNormal;
   mat3 matTexGen;
};

The second category is the default uniform block for uniforms that are
declared outside of a named uniform block. Unlike with the named
uniform block, there is no name or uniform block index for default
uniform blocks. The following example declares the same three uniforms
outside of a named uniform block:

uniform mat4 matViewProj;
uniform mat3 matNormal;
uniform mat3 matTexGen;

We describe uniform blocks in more detail in the section Uniform Blocks in
Chapter 5.

If a uniform is declared in both a vertex shader and a fragment shader, it
must have the same type, and its value will be the same in both shaders.
During the link phase, the linker will assign uniform locations to each
of the active uniforms associated with the default uniform block in the
program. These locations are the identifiers the application will use to
load the uniform with a value. The linker will also assign offsets and
strides (for array and matrix type uniforms) for active uniforms associated
with the named uniform blocks.

Getting and Setting Uniforms

To query for the list of active uniforms in a program, you first call
glGetProgramiv with the GL_ACTIVE_UNIFORMS parameter (as described
in the previous section). This will tell you the number of active uniforms
in the program. The list includes uniforms in named uniform blocks,
default block uniforms declared in shader code, and built-in uniforms
used in shader code. A uniform is considered “active” if it was used by
the program. In other words, if you declare a uniform in one of your
shaders but never use it, the linker will likely optimize that away and not
return it in the active uniform list. You can also find out the number of
characters (including the null terminator) that the longest uniform name
has in the program; this can be done by calling glGetProgramiv with the
GL_ACTIVE_UNIFORM_MAX_LENGTH parameter.

Once we know the number of active uniforms and the number
of characters needed to store the uniform names, we can find
out the details on each uniform using glGetActiveUniform and
glGetActiveUniformsiv.

void glGetActiveUniform(GLuint program, GLuint index,
                        GLsizei bufSize, GLsizei *length,
                        GLint *size, GLenum *type,
                        GLchar *name)

program   handle to the program object
index     the uniform index to be queried
bufSize   the number of characters in the name array
length    if not NULL, will be written with the number of characters
          written into the name array (less the null terminator)
size      if the uniform variable being queried is an array, this
          variable will be written with the maximum array element
          used in the program (plus 1); if the uniform variable being
          queried is not an array, this value will be 1
type      will be written with the uniform type; can be
          GL_FLOAT, GL_FLOAT_VEC2, GL_FLOAT_VEC3,
          GL_FLOAT_VEC4, GL_INT, GL_INT_VEC2, GL_INT_VEC3,
          GL_INT_VEC4, GL_UNSIGNED_INT,
          GL_UNSIGNED_INT_VEC2, GL_UNSIGNED_INT_VEC3,
          GL_UNSIGNED_INT_VEC4, GL_BOOL, GL_BOOL_VEC2,
          GL_BOOL_VEC3, GL_BOOL_VEC4, GL_FLOAT_MAT2,
          GL_FLOAT_MAT3, GL_FLOAT_MAT4, GL_FLOAT_MAT2x3,
          GL_FLOAT_MAT2x4, GL_FLOAT_MAT3x2, GL_FLOAT_MAT3x4,
          GL_FLOAT_MAT4x2, GL_FLOAT_MAT4x3, GL_SAMPLER_2D,
          GL_SAMPLER_3D, GL_SAMPLER_CUBE,
          GL_SAMPLER_2D_SHADOW, GL_SAMPLER_2D_ARRAY,
          GL_SAMPLER_2D_ARRAY_SHADOW,
          GL_SAMPLER_CUBE_SHADOW, GL_INT_SAMPLER_2D,
          GL_INT_SAMPLER_3D, GL_INT_SAMPLER_CUBE,
          GL_INT_SAMPLER_2D_ARRAY,
          GL_UNSIGNED_INT_SAMPLER_2D,
          GL_UNSIGNED_INT_SAMPLER_3D,
          GL_UNSIGNED_INT_SAMPLER_CUBE,
          GL_UNSIGNED_INT_SAMPLER_2D_ARRAY
name      will be written with the name of the uniform up to bufSize
          number of characters; this will be a null-terminated string

void glGetActiveUniformsiv(GLuint program, GLsizei count,
                           const GLuint *indices,
                           GLenum pname, GLint *params)

program   handle to the program object
count     the number of elements in the array of indices
indices   a list of uniform indices
pname     property of each uniform in the uniform indices to be
          written into the elements of params; can be
          GL_UNIFORM_TYPE, GL_UNIFORM_SIZE,
          GL_UNIFORM_NAME_LENGTH, GL_UNIFORM_BLOCK_INDEX,
          GL_UNIFORM_OFFSET, GL_UNIFORM_ARRAY_STRIDE,
          GL_UNIFORM_MATRIX_STRIDE, GL_UNIFORM_IS_ROW_MAJOR
params    will be written with the result specified by pname
          corresponding to each uniform in the uniform indices

Using glGetActiveUniform, you can determine nearly all of the properties
of the uniform. You can determine the name of the uniform variable along
with its type. In addition, you can find out if the variable is an array, and
if so what the maximum element used in the array was. The name of the
uniform is necessary to find the uniform’s location, and the type and size
are needed to figure out how to load it with data. Once we have the name
of the uniform, we can find its location using glGetUniformLocation. The
uniform location is an integer value used to identify the location of the
uniform in the program (note that uniforms in the named uniform blocks
are not assigned a location). That location value is used by subsequent calls
for loading uniforms with values (e.g., glUniform1f).

GLint glGetUniformLocation(GLuint program,

const GLchar* name)

program handle to the program object

name the name of the uniform for which to get the location

This function will return the location of the uniform given by name. If
the uniform is not an active uniform in the program, then the return
value will be -1. Once we have the uniform location along with its type
and array size, we can then load the uniform with values. A number of
different functions for loading uniforms are available, with different
functions for each uniform type.

void glUniform1f(GLint location, GLfloat x)
void glUniform1fv(GLint location, GLsizei count,
                  const GLfloat* value)
void glUniform1i(GLint location, GLint x)
void glUniform1iv(GLint location, GLsizei count,
                  const GLint* value)
void glUniform1ui(GLint location, GLuint x)
void glUniform1uiv(GLint location, GLsizei count,
                   const GLuint* value)
void glUniform2f(GLint location, GLfloat x, GLfloat y)
void glUniform2fv(GLint location, GLsizei count,
                  const GLfloat* value)
void glUniform2i(GLint location, GLint x, GLint y)
void glUniform2iv(GLint location, GLsizei count,
                  const GLint* value)
void glUniform2ui(GLint location, GLuint x, GLuint y)
void glUniform2uiv(GLint location, GLsizei count,
                   const GLuint* value)


void glUniform3f( GLint location, GLfloat x, GLfloat y,

GLfloat z)

void glUniform3fv( GLint location, GLsizei count,

const GLfloat* value)

void glUniform3i( GLint location, GLint x, GLint y,

GLint z)

void glUniform3iv( GLint location, GLsizei count,

const GLint* value)

void glUniform3ui( GLint location, GLuint x, GLuint y,

GLuint z)

void glUniform3uiv( GLint location, GLsizei count,

const GLuint* value)

void glUniform4f( GLint location, GLfloat x, GLfloat y,

GLfloat z, GLfloat w);

void glUniform4fv( GLint location, GLsizei count,

const GLfloat* value)

void glUniform4i( GLint location, GLint x, GLint y,

GLint z, GLint w)

void glUniform4iv( GLint location, GLsizei count,

const GLint* value)

void glUniform4ui( GLint location, GLuint x, GLuint y,

GLuint z, GLuint w)

void glUniform4uiv( GLint location, GLsizei count,

const GLuint* value)

void glUniformMatrix2fv( GLint location, GLsizei count,

GLboolean transpose,

const GLfloat* value)

void glUniformMatrix3fv( GLint location, GLsizei count,

GLboolean transpose,

const GLfloat* value)

void glUniformMatrix4fv( GLint location, GLsizei count,

GLboolean transpose,

const GLfloat* value)

void glUniformMatrix2x3fv( GLint location, GLsizei count,

GLboolean transpose,

const GLfloat* value)

void glUniformMatrix3x2fv( GLint location, GLsizei count,

GLboolean transpose,

const GLfloat* value)


The functions for loading uniforms are mostly self-explanatory. The

determination of which function you need to use for loading the uniform

is based on the type returned by the glGetActiveUniform function.

For example, if the type is GL_FLOAT_VEC4, then either glUniform4f or

glUniform4fv can be used. If the size returned by glGetActiveUniform

is greater than 1, then glUniform4fv would be used to load the entire

array in one call. If the uniform is not an array, then either glUniform4f

or glUniform4fv could be used.
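The selection rule above can be captured in a small lookup table. The sketch below is not part of the book's sample code: the UniformSetterInfo struct and the find_uniform_setter and uniform_value_count helpers are hypothetical names, and the table covers only a handful of types. The enum values are copied from the Khronos GLES3/gl3.h header and are guarded so that the real header takes precedence when it is included. The table maps a type reported by glGetActiveUniform to the array-form call that accepts it, and the second helper computes how many scalar values an array of `size` elements needs.

```c
#include <assert.h>
#include <stddef.h>

/* Enum values copied from the Khronos GLES3/gl3.h header; the guard lets
 * the real header take precedence when it has already been included. */
#ifndef GL_FLOAT
#define GL_INT         0x1404
#define GL_FLOAT       0x1406
#define GL_FLOAT_VEC2  0x8B50
#define GL_FLOAT_VEC3  0x8B51
#define GL_FLOAT_VEC4  0x8B52
#define GL_FLOAT_MAT4  0x8B5C
#endif

/* Hypothetical table row: how to load one uniform type. */
typedef struct
{
   unsigned int type;        /* type reported by glGetActiveUniform    */
   int          components;  /* scalar values per array element        */
   const char  *setter;      /* array-form call that accepts this type */
} UniformSetterInfo;

static const UniformSetterInfo kUniformSetters[] =
{
   { GL_FLOAT,      1,  "glUniform1fv"       },
   { GL_FLOAT_VEC2, 2,  "glUniform2fv"       },
   { GL_FLOAT_VEC3, 3,  "glUniform3fv"       },
   { GL_FLOAT_VEC4, 4,  "glUniform4fv"       },
   { GL_INT,        1,  "glUniform1iv"       },
   { GL_FLOAT_MAT4, 16, "glUniformMatrix4fv" }
};

/* Return the table row for a type, or NULL if this sketch does not cover it. */
const UniformSetterInfo *find_uniform_setter ( unsigned int type )
{
   size_t i;
   size_t count = sizeof ( kUniformSetters ) / sizeof ( kUniformSetters[0] );

   for ( i = 0; i < count; i++ )
   {
      if ( kUniformSetters[i].type == type )
      {
         return &kUniformSetters[i];
      }
   }
   return NULL;
}

/* Number of scalar values the application must supply for a uniform of the
 * given type whose glGetActiveUniform size is `size`; -1 for unknown types. */
int uniform_value_count ( unsigned int type, int size )
{
   const UniformSetterInfo *info = find_uniform_setter ( type );

   assert ( size >= 1 );
   return info ? info->components * size : -1;
}
```

The array-form call is chosen for every entry because, as noted above, it works whether or not the uniform is an array; a non-array uniform is simply loaded with a count of 1.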

One point worth noting here is that the glUniform* calls do not

take a program object handle as a parameter. The reason is that the

glUniform* calls always act on the current program that is bound with

glUseProgram. The uniform values themselves are kept with the program

object. That is, once you set a uniform to a value in a program object,

that value will remain with it even if you make another program active.

In that sense, we can say that uniform values are local to a program object.

The block of code in Example 4-3 demonstrates how you would go

about querying for uniform information on a program object using the

functions we have described.

void glUniformMatrix2x4fv( GLint location, GLsizei count,

GLboolean transpose,

const GLfloat* value)

void glUniformMatrix4x2fv( GLint location, GLsizei count,

GLboolean transpose,

const GLfloat* value)

void glUniformMatrix3x4fv( GLint location, GLsizei count,

GLboolean transpose,

const GLfloat* value)

void glUniformMatrix4x3fv( GLint location, GLsizei count,

GLboolean transpose,

const GLfloat* value)

location    the location of the uniform to load with a value
count       specifies the number of array elements to be loaded
            (for vector commands) or the number of matrices to be
            modified (for matrix commands)
transpose   for matrix commands, specifies whether the matrix is in
            column major order (with GL_FALSE) or row major order
            (with GL_TRUE)
x, y, z, w  updated uniform values
value       a pointer to an array of count elements
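To illustrate what the transpose parameter distinguishes, the sketch below converts a 4 × 4 matrix stored row by row into column-major order on the CPU. The helper name row_major_to_column_major_4x4 is hypothetical and not part of OpenGL ES; an application that keeps row-major matrices could either pass GL_TRUE for transpose or, as assumed here, convert the data itself and pass GL_FALSE.

```c
#include <assert.h>

/* Convert a 4x4 matrix stored row by row (src) into column-major order
 * (dst). The two arrays must not be the same array. */
void row_major_to_column_major_4x4 ( float dst[16], const float src[16] )
{
   int row, col;

   assert ( dst != src );

   for ( row = 0; row < 4; row++ )
   {
      for ( col = 0; col < 4; col++ )
      {
         /* Element (row, col) sits at row * 4 + col in row-major storage
          * and at col * 4 + row in column-major storage. */
         dst[col * 4 + row] = src[row * 4 + col];
      }
   }
}
```

For a translation matrix written row by row, the translation terms sit at indices 3, 7, and 11; after conversion they occupy indices 12, 13, and 14, which is where a column-major consumer expects them.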


Example 4-3 Querying for Active Uniforms

GLint maxUniformLen;
GLint numUniforms;
char *uniformName;
GLint index;

glGetProgramiv ( progObj, GL_ACTIVE_UNIFORMS, &numUniforms );
glGetProgramiv ( progObj, GL_ACTIVE_UNIFORM_MAX_LENGTH,
                 &maxUniformLen );

uniformName = malloc ( sizeof ( char ) * maxUniformLen );

for ( index = 0; index < numUniforms; index++ )
{
   GLint size;
   GLenum type;
   GLint location;

   // Get the uniform info
   glGetActiveUniform ( progObj, index, maxUniformLen, NULL,
                        &size, &type, uniformName );

   // Get the uniform location
   location = glGetUniformLocation ( progObj, uniformName );

   switch ( type )
   {
      case GL_FLOAT:
         break;

      case GL_FLOAT_VEC2:
         break;

      case GL_FLOAT_VEC3:
         break;

      case GL_FLOAT_VEC4:
         break;

      case GL_INT:
         break;

      // ... Check for all the types ...

      default:
         // Unknown type
         break;
   }
}

Uniform Buffer Objects

You can share uniforms between shaders in a program or even between
programs by using a buffer object to store uniform data. Such buffer
objects are called uniform buffer objects. Using uniform buffer objects, you
can potentially reduce the API overhead when updating large blocks of
uniforms. In addition, this approach increases the potential storage available
for uniforms because you are not limited by the default uniform block size.

To update the uniform data in a uniform buffer object, you can modify
the contents of the buffer object using commands such as glBufferData,
glBufferSubData, glMapBufferRange, and glUnmapBuffer (these
commands are described in Chapter 6, “Vertex Attributes, Vertex Arrays,
and Buffer Objects”) rather than using the glUniform* commands
described in the previous section.

In the uniform buffer objects, uniforms are represented in memory as follows:

Members of type bool, int, uint, and float are stored in memory at
the specified offset as single uint-typed, int-typed, uint-typed, and
float-typed components, respectively.

Vectors with basic data types of bool, int, uint, or float are stored
in consecutive memory locations beginning at the specified offset,
with the first component at the lowest offset.

Column-major matrices with C columns and R rows are treated as
an array of C floating-point column vectors, each consisting of R
components. Similarly, row-major matrices with R rows and C columns
are treated as an array of R floating-point row vectors, each consisting
of C components. While the column or row vectors are stored
consecutively, they may be stored with gaps by the implementation. The
offset between two vectors in the matrix is referred to as the column or
row stride (GL_UNIFORM_MATRIX_STRIDE) and can be queried in a linked
program using glGetActiveUniformsiv.

Arrays of scalars, vectors, and matrices are stored in memory by

element order, with array member zero placed at the lowest offset.

The offset between each pair of elements in the array is constant and

referred to as the array stride (GL_UNIFORM_ARRAY_STRIDE) and can be

queried in a linked program using glGetActiveUniformsiv.
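As a rough illustration of the matrix stride described above, the sketch below copies a tightly packed column-major matrix into a buffer whose columns are spaced by a given stride. The function name pack_matrix_columns and its parameters are hypothetical. In a real application the stride would come from querying GL_UNIFORM_MATRIX_STRIDE with glGetActiveUniformsiv (or from the std140 rules); here it is simply passed in by the caller.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy a tightly packed column-major matrix (cols columns of rows floats)
 * into dst, placing column c at byte offset c * matrixStride. Any gap
 * bytes between columns are zeroed. Returns the number of bytes written. */
size_t pack_matrix_columns ( unsigned char *dst, const float *src,
                             int cols, int rows, size_t matrixStride )
{
   size_t columnBytes = (size_t) rows * sizeof ( float );
   int c;

   /* A stride smaller than one column would make columns overlap. */
   assert ( matrixStride >= columnBytes );

   memset ( dst, 0, (size_t) cols * matrixStride );

   for ( c = 0; c < cols; c++ )
   {
      memcpy ( dst + (size_t) c * matrixStride,
               src + (size_t) c * rows,
               columnBytes );
   }

   return (size_t) cols * matrixStride;
}
```

For example, a mat3 with a 16-byte matrix stride occupies 48 bytes in the buffer even though the source data is only nine floats, because each three-float column is followed by one unused float.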

Unless you use the std140 uniform block layout (the default layout is
shared), you will need to query the program object for the byte offsets and
strides to set uniform data in the uniform buffer object. The std140
layout guarantees a specific packing behavior with an explicit layout
specification defined by the OpenGL ES 3.0 specification. Thus using the
std140 layout allows you to share the uniform block between different
OpenGL ES 3.0 implementations. Other packing formats (see Table 5-4)
may allow some OpenGL ES 3.0 implementations to pack the data more
tightly together than the std140 layout.

The following is an example of a named uniform block LightBlock using
the std140 layout:

layout (std140) uniform LightBlock
{
   vec3 lightDirection;
   vec4 lightPosition;
};

The std140 layout is specified as follows (adapted from the OpenGL
ES 3.0 specification). When the uniform block contains the following
member:

1. A scalar variable—The base alignment is the size of the scalar. For

example, sizeof(GLint).

2. A two-component vector—The base alignment is twice the size of the

underlying component type size.

3. A three-component or four-component vector—The base alignment

is four times the size of the underlying component type size.

4. An array of scalars or vectors—The base alignment and array stride

are set to match the base alignment of a single array element, rounded up

to the base alignment of a vec4. The entire array is padded to a multiple

of the size of a vec4.

5. A column-major matrix with C columns and R rows—Stored as an

array of C vectors with R components according to rule 4.

6. An array of M column-major matrices with C columns and R rows—

Stored as M × C vectors with R components according to rule 4.


7. A row-major matrix with C columns and R rows—Stored as an array

of R vectors with C components according to rule 4.

8. An array of M row-major matrices with C columns and R rows—

Stored as M × R vectors with C components according to rule 4.

9. A single structure—The offset and size are calculated according to the

preceding rules. The structure’s size will be padded to a multiple of

the size of a vec4.

10. An array of S structures—The base alignment is calculated according

to the alignment of the element of the array. The element of the

array is calculated according to rule 9.
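To make the first three rules concrete, the following sketch computes std140 offsets for a block that contains only scalars and vectors. It is an illustration under stated assumptions rather than a complete std140 implementation: the names Std140Cursor, std140_base_alignment, and std140_place are hypothetical, every component is assumed to be 4 bytes wide, and arrays, matrices, and structures (rules 4 through 10) are not handled.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical cursor that tracks the next free byte in a uniform block. */
typedef struct
{
   size_t next;
} Std140Cursor;

/* Base alignment in bytes of a scalar (components == 1) or vector with
 * 4-byte components, following rules 1 through 3. */
size_t std140_base_alignment ( int components )
{
   assert ( components >= 1 && components <= 4 );

   if ( components == 1 ) return 4;    /* rule 1: size of the scalar      */
   if ( components == 2 ) return 8;    /* rule 2: twice the component     */
   return 16;                          /* rule 3: four times, vec3 & vec4 */
}

/* Place one member: round the cursor up to the member's base alignment,
 * return the resulting byte offset, and advance past the member. */
size_t std140_place ( Std140Cursor *cursor, int components )
{
   size_t align  = std140_base_alignment ( components );
   size_t offset = ( cursor->next + align - 1 ) / align * align;

   cursor->next = offset + (size_t) components * 4;
   return offset;
}
```

Applied to the LightBlock shown earlier, placing a three-component member and then a four-component member yields offsets 0 and 16, for 32 bytes of data in total. Note that a vec3 occupies only 12 bytes, so a float declared directly after it would land at offset 12 rather than 16.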

Similar to how a uniform location value is used to refer to a uniform, a

uniform block index is used to refer to a uniform block. You can retrieve

the uniform block index using glGetUniformBlockIndex.

GLuint glGetUniformBlockIndex(GLuint program,

const GLchar *blockName)

program handle to the program object

blockName the name of the uniform block for which to get the index

From the uniform block index, you can determine the details of the active

uniform block using glGetActiveUniformBlockName (to get the block

name) and glGetActiveUniformBlockiv (to get many properties of the

uniform block).

void glGetActiveUniformBlockName(GLuint program,
                                 GLuint index,
                                 GLsizei bufSize,
                                 GLsizei *length,
                                 GLchar *blockName)

program     handle to the program object
index       the uniform block index to be queried
bufSize     the number of characters in the name array
length      if not NULL, will be written with the number of characters
            written into the name array (less the null terminator)
blockName   will be written with the name of the uniform block up to
            bufSize number of characters; this will be a
            null-terminated string

void glGetActiveUniformBlockiv(GLuint program,
                               GLuint index,
                               GLenum pname,
                               GLint *params)

program   handle to the program object
index     the uniform block index to be queried
pname     property of the uniform block index to be written into
          params; can be
          GL_UNIFORM_BLOCK_BINDING
          GL_UNIFORM_BLOCK_DATA_SIZE
          GL_UNIFORM_BLOCK_NAME_LENGTH
          GL_UNIFORM_BLOCK_ACTIVE_UNIFORMS
          GL_UNIFORM_BLOCK_ACTIVE_UNIFORM_INDICES
          GL_UNIFORM_BLOCK_REFERENCED_BY_VERTEX_SHADER
          GL_UNIFORM_BLOCK_REFERENCED_BY_FRAGMENT_SHADER
params    will be written with the result specified by pname

Querying for GL_UNIFORM_BLOCK_BINDING returns the last buffer binding
point for the uniform block (zero, if this block does not exist). The
GL_UNIFORM_BLOCK_DATA_SIZE argument returns the minimum total
buffer object size to hold all the uniforms for the uniform block, while
querying for GL_UNIFORM_BLOCK_NAME_LENGTH returns the total length
(including the null terminator) of the name of the uniform block. The
number of active uniforms in the uniform block can be queried using
GL_UNIFORM_BLOCK_ACTIVE_UNIFORMS. The
GL_UNIFORM_BLOCK_ACTIVE_UNIFORM_INDICES query returns a list of the
active uniform indices in the uniform block. Finally, querying for
GL_UNIFORM_BLOCK_REFERENCED_BY_VERTEX_SHADER and
GL_UNIFORM_BLOCK_REFERENCED_BY_FRAGMENT_SHADER returns a boolean
value indicating whether the uniform block is referenced by the vertex or
fragment shader in the program, respectively.

Once you have the uniform block index, you can associate the
index with a uniform block binding point in the program by calling
glUniformBlockBinding.

void glUniformBlockBinding(GLuint program,
                           GLuint blockIndex,
                           GLuint blockBinding)

program        handle to the program object
blockIndex     index of the uniform block
blockBinding   uniform buffer object binding point

Finally, you can bind the uniform buffer object to the GL_UNIFORM_BUFFER
target and a uniform block binding point in the program using
glBindBufferRange or glBindBufferBase.

void glBindBufferRange(GLenum target, GLuint index,
                       GLuint buffer, GLintptr offset,
                       GLsizeiptr size)
void glBindBufferBase(GLenum target, GLuint index,
                      GLuint buffer)

target   must be GL_UNIFORM_BUFFER or
         GL_TRANSFORM_FEEDBACK_BUFFER
index    the binding index
buffer   the handle to the buffer object
offset   a starting offset in bytes into the buffer object
         (glBindBufferRange only)
size     the amount of data in bytes that can be read from or written
         to the buffer object (glBindBufferRange only)

When programming the uniform blocks, you should pay attention to the
following limitations:

The maximum number of active uniform blocks used by a vertex
or fragment shader can be queried using glGetIntegerv with
GL_MAX_VERTEX_UNIFORM_BLOCKS or GL_MAX_FRAGMENT_UNIFORM_BLOCKS,
respectively. The minimum supported number for any
implementation is 12.

The maximum number of combined active uniform blocks used by
all shaders in a program can be queried using glGetIntegerv with
GL_MAX_COMBINED_UNIFORM_BLOCKS. The minimum supported
number for any implementation is 24.

The maximum available storage per uniform buffer can be queried
using glGetInteger64v with GL_MAX_UNIFORM_BLOCK_SIZE, which
returns the size in bytes. The minimum supported number for any
implementation is 16 KB.

If you violate any of these limits, the program will fail to link.

The following example shows how to set up a uniform buffer object with the named uniform block LightBlock described earlier:

GLuint blockId, bufferId;
GLint blockSize;
GLuint bindingPoint = 1;
GLfloat lightData[] =
{
    // lightDirection (padded to vec4 based on std140 rule)
    1.0f, 0.0f, 0.0f, 0.0f,
    // lightPosition
    0.0f, 0.0f, 0.0f, 1.0f
};

// Retrieve the uniform block index
blockId = glGetUniformBlockIndex ( program, "LightBlock" );

// Associate the uniform block index with a binding point
glUniformBlockBinding ( program, blockId, bindingPoint );

// Get the size of the uniform block; alternatively,
// we can calculate it using sizeof(lightData) in this example
glGetActiveUniformBlockiv ( program, blockId,
                            GL_UNIFORM_BLOCK_DATA_SIZE,
                            &blockSize );

// Create and fill a buffer object
glGenBuffers ( 1, &bufferId );
glBindBuffer ( GL_UNIFORM_BUFFER, bufferId );
glBufferData ( GL_UNIFORM_BUFFER, blockSize, lightData,
               GL_DYNAMIC_DRAW );

// Bind the buffer object to the uniform block binding point
glBindBufferBase ( GL_UNIFORM_BUFFER, bindingPoint, bufferId );
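The padding in lightData follows from the std140 alignment rules. The C sketch below works through the offsets for a block holding a vec3 followed by a vec4, which is what the lightData comments imply. It covers only those two member types; the type and function names are hypothetical, and the sizes and alignments are the std140 values for vec3 and vec4 (both aligned to 16 bytes).

```c
#include <stddef.h>

/* Round offset up to the next multiple of alignment (a power of two). */
static size_t alignUp(size_t offset, size_t alignment)
{
    return (offset + alignment - 1) & ~(alignment - 1);
}

/* std140 size and base alignment, in bytes, for the two member
   types used in this example only. */
enum { VEC3_SIZE = 12, VEC3_ALIGN = 16, VEC4_SIZE = 16, VEC4_ALIGN = 16 };

typedef struct {
    size_t lightDirectionOffset;
    size_t lightPositionOffset;
    size_t blockSize;
} LightBlockLayout;

static LightBlockLayout computeLightBlockLayout(void)
{
    LightBlockLayout layout;
    size_t offset = 0;

    offset = alignUp(offset, VEC3_ALIGN);
    layout.lightDirectionOffset = offset;    /* vec3 starts at 0       */
    offset += VEC3_SIZE;                     /* occupies bytes 0..11   */

    offset = alignUp(offset, VEC4_ALIGN);    /* 12 is rounded up to 16 */
    layout.lightPositionOffset = offset;
    offset += VEC4_SIZE;                     /* occupies bytes 16..31  */

    layout.blockSize = alignUp(offset, 16);
    return layout;
}
```

Under these assumptions the vec4 lands at byte 16 and the block occupies 32 bytes, which is why the array carries one padding float after the three lightDirection components.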

Getting and Setting Attributes

In addition to querying for uniform information on the program object, you will need to use the program object to set up vertex attributes. The queries for vertex attributes are very similar to the uniform queries. You can find the list of active attributes using the GL_ACTIVE_ATTRIBUTES query. You can find the properties of an attribute using glGetActiveAttrib. A set of routines is then available for setting up vertex arrays to load the vertex attributes with values.

However, setting up vertex attributes really requires a bit more understanding of primitives and the vertex shader than we are ready to delve into right now. Instead, we dedicate an entire chapter (Chapter 6, “Vertex Attributes, Vertex Arrays, and Buffer Objects”) to vertex attributes and vertex arrays. If you want to find out how to query for vertex attribute info, jump to Chapter 6 and the section Declaring Vertex Attribute Variables in a Vertex Shader.

Shader Compiler

When you ask OpenGL ES to compile and link a shader, take a minute

to think about what the implementation has to do. The shader code is

typically parsed into some sort of intermediate representation, as most

compiled languages are (e.g., an Abstract Syntax Tree). A compiler must

then convert the abstract representation into machine instructions

for the hardware. Ideally, this compiler should also do a great deal

of optimization, such as dead-code removal, constant propagation,

and more. Performing all this work comes at a price—and this price is

primarily CPU time and memory.

OpenGL ES 3.0 implementations must support online shader compilation

(the value of GL_SHADER_COMPILER retrieved using glGetBooleanv

must be GL_TRUE). You can specify your shaders using glShaderSource,

as we have done so far in our examples. You can also try to mitigate

the resource impact of shader compilation. That is, once you have

finished compiling any shaders for your application, you can call

glReleaseShaderCompiler. This function provides a hint to the

implementation that you are done with the shader compiler, so it can

free its resources. Note that this function is only a hint; if you decide to

compile more shaders using glCompileShader, the implementation will

need to reallocate its resources for the compiler.

void glReleaseShaderCompiler (void)

Provides a hint to the implementation that it can release resources

used by the shader compiler. Because this function is only a hint, some

implementations may ignore a call to this function.

Program Binaries

Program binaries are the binary representation of a complete compiled and linked program. They are useful because they can be saved to the file system to be reused later, thereby avoiding the cost of online compilation. You may also use program binaries so that you do not have to distribute the shader source code with your application.

You can retrieve the program binary using glGetProgramBinary after you have compiled and linked the program successfully.

After you have retrieved the program binary, you can save it to the file system or load the program binary back into the implementation using glProgramBinary.

void glGetProgramBinary(GLuint program, GLsizei bufSize,
                        GLsizei *length, GLenum *binaryFormat,
                        GLvoid *binary)

program        handle to the program object
bufSize        the maximum number of bytes that may be written
               into the binary
length         the number of bytes in the binary data
binaryFormat   the vendor-specific binary format token
binary         pointer to the binary data generated by the shader
               compiler

void glProgramBinary(GLuint program, GLenum binaryFormat,
                     const GLvoid *binary, GLsizei length)

program        handle to the program object
binaryFormat   the vendor-specific binary format token
binary         pointer to the binary data generated by the shader
               compiler
length         the number of bytes in the binary data

The OpenGL ES specication does not mandate any particular binary

format; instead, the binary format is left completely up to the vendor. This

obviously means that programs have less portability, but it also means the

vendor can create a less burdensome implementation of OpenGL ES 3.0.

In fact, the binary format may change from one driver version to another

implemented by the same vendor. To ensure that the stored program

binary is still compatible, after calling glProgramBinary, you can

query the GL_LINK_STATUS through glGetProgramiv. If it is no longer

compatible, then you will need to recompile the shader source code.
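Because glProgramBinary needs the format token and the length as well as the bytes themselves, a cached binary has to be stored together with both values. The C sketch below shows one way to pack and unpack such a record in memory. The record layout (format, length, payload) and the function names are our own invention for illustration, uint32_t stands in for GLenum and GLsizei, and no OpenGL ES calls are made.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical cache record: [format:4][length:4][payload:length],
   stored in host byte order. */
#define BLOB_HEADER_SIZE 8u

/* Packs a program binary into out. Returns the number of bytes
   written, or 0 if out is too small. */
static size_t packProgramBlob(uint32_t binaryFormat,
                              const void *binary, uint32_t length,
                              unsigned char *out, size_t outCapacity)
{
    if (outCapacity < BLOB_HEADER_SIZE + (size_t)length)
        return 0;
    memcpy(out, &binaryFormat, 4);
    memcpy(out + 4, &length, 4);
    memcpy(out + BLOB_HEADER_SIZE, binary, length);
    return BLOB_HEADER_SIZE + (size_t)length;
}

/* Unpacks a record. Returns a pointer to the payload inside blob,
   or NULL if the record is truncated. */
static const unsigned char *unpackProgramBlob(const unsigned char *blob,
                                              size_t blobSize,
                                              uint32_t *binaryFormat,
                                              uint32_t *length)
{
    if (blobSize < BLOB_HEADER_SIZE)
        return NULL;
    memcpy(binaryFormat, blob, 4);
    memcpy(length, blob + 4, 4);
    if (blobSize - BLOB_HEADER_SIZE < *length)
        return NULL;
    return blob + BLOB_HEADER_SIZE;
}
```

In an application, the unpacked format, payload pointer, and length would be handed to glProgramBinary, followed by the GL_LINK_STATUS check described above. If either the unpack or the link check fails, the application falls back to compiling from source.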

Summary

In this chapter, you learned how to create, compile, and link shaders

into a program. Shader objects and program objects form the most

fundamental objects in OpenGL ES 3.0. We discussed how to query the

program object for information and how to load uniforms. In addition,

you learned how source shaders and program binaries differ and how to

use each. Next, you will learn how to write a shader using the OpenGL ES

Shading Language.

Chapter 5

OpenGL ES Shading Language

As you saw in earlier chapters, shaders are a fundamental concept that

lies at the heart of the OpenGL ES 3.0 API. Every OpenGL ES 3.0 program

requires both a vertex shader and a fragment shader to render a meaningful

picture. Given the centrality of the concept of shaders to the API, we want

to make sure you are grounded in the fundamentals of writing shaders

before diving into more details of the graphics API.

This chapter’s goal is to make sure you understand the following concepts

in the shading language:

• Variables and variable types

• Vector and matrix construction and selection

• Constants

• Structures and arrays

• Operators, control flow, and functions

• Input/output variables, uniforms, uniform blocks, and layout qualifiers

• Preprocessor and directives

• Uniform and interpolator packing

• Precision qualifiers and invariance

You were introduced to some of these concepts in a small amount of

detail with the example in Chapter 2, “Hello Triangle: An OpenGL ES 3.0

Example.” Now we will ll in the concepts with a lot more detail to make

sure you understand how to write and read shaders.

OpenGL ES Shading Language Basics

As you read through this book, you will look at a lot of shaders. If you ever start developing your own OpenGL ES 3.0 application, chances are that you will write a lot of shaders. By now, you should understand the fundamental concepts of what a shader does and how it fits in the pipeline. If not, please go back and review Chapter 1, “Introduction to OpenGL ES 3.0,” where we covered the pipeline and described where vertex and fragment shaders fit within it.

What we want to look at now is what exactly makes up a shader. As you have probably already observed, the syntax bears great similarity to that seen in the C programming language. If you can understand C code, you likely will not have much difficulty understanding the syntax of shaders. However, there are certainly some major differences between the two languages, beginning with the version specification and the native data types that are supported.

Shader Version Specification

The first line of your OpenGL ES 3.0 vertex and fragment shaders will always declare a shader version. Declaring the shader version informs the shader compiler which syntax and constructs it can expect to be present in the shader. The compiler checks the shader syntax against the declared version of the shading language used. To declare that your shader uses version 3.00 of the OpenGL ES Shading Language, use the following syntax:

#version 300 es

Shaders that do not declare a version number are assumed to use revision 1.00 of the OpenGL ES Shading Language. Revision 1.00 of the shading language is the version that was used in OpenGL ES 2.0. For OpenGL ES 3.0, the specification authors decided to match the version numbers for the API and Shading Language, which explains why the number jumped from 1.00 to 3.00 for OpenGL ES 3.0. As described in Chapter 1, “Introduction to OpenGL ES 3.0,” the OpenGL ES Shading Language 3.00 adds many new features, including non-square matrices, full integer support, interpolation qualifiers, uniform blocks, layout qualifiers, new built-in functions, full looping, full branching support, and unlimited shader instruction length.

Variables and Variable Types

In computer graphics, two fundamental data types form the basis of transformations: vectors and matrices. These two data types are central to the OpenGL ES Shading Language as well. Specifically, Table 5-1 describes the scalar-, vector-, and matrix-based data types that exist in the shading language.

Table 5-1  Data Types in the OpenGL ES Shading Language

Variable Class            Types                        Description
Scalars                   float, int, uint, bool       Scalar-based data types for floating-point,
                                                       integer, unsigned integer, and boolean values
Floating-point vectors    float, vec2, vec3, vec4      Floating-point-based vector types of one, two,
                                                       three, or four components
Integer vector            int, ivec2, ivec3, ivec4     Integer-based vector types of one, two, three,
                                                       or four components
Unsigned integer vector   uint, uvec2, uvec3, uvec4    Unsigned integer-based vector types of one,
                                                       two, three, or four components
Boolean vector            bool, bvec2, bvec3, bvec4    Boolean-based vector types of one, two, three,
                                                       or four components
Matrices                  mat2 (or mat2x2), mat2x3,    Floating-point-based matrices of size 2 × 2,
                          mat2x4, mat3x2,              2 × 3, 2 × 4, 3 × 2, 3 × 3, 3 × 4, 4 × 2,
                          mat3 (or mat3x3), mat3x4,    4 × 3, or 4 × 4
                          mat4x2, mat4x3,
                          mat4 (or mat4x4)

Variables in the shading language must be declared with a type. For example, the following declarations illustrate how to declare a scalar, a vector, and a matrix:

float specularAtten;    // A floating-point-based scalar
vec4 vPosition;         // A floating-point-based 4-tuple vector
mat4 mViewProjection;   // A 4 x 4 matrix variable declaration
ivec2 vOffset;          // An integer-based 2-tuple vector

Variables can be initialized either at declaration time or later. Initialization is done through the use of constructors, which are also used for doing type conversions.

Variable Constructors

The OpenGL ES Shading Language has very strict rules regarding type conversion. That is, a variable can only be assigned to, or combined in an operation with, other variables of the same type. The reasoning behind not allowing implicit type conversion in the language is that it keeps shader authors from encountering unintended conversions that can lead to difficult-to-track-down bugs. To cope with type conversions, a number of constructors are available in the language. You can use constructors for initializing variables and as a way of type-casting between variables of different types. Variables can be initialized at declaration (or later in the shader) through the use of constructors. Each of the built-in variable types has a set of associated constructors.

Let’s first look at how constructors can be used to initialize and type-cast between scalar values.

float myFloat = 1.0;

float myFloat2 = 1; // ERROR: invalid type conversion

bool myBool = true;

int myInt = 0;

int myInt2 = 0.0; // ERROR: invalid type conversion

myFloat = float(myBool); // Convert from bool -> float

myFloat = float(myInt); // Convert from int -> float

myBool = bool(myInt); // Convert from int -> bool

Similarly, constructors can be used to convert to and initialize vector data types. The arguments to a vector constructor will be converted to the same basic type as the vector being constructed (float, int, uint, or bool). There are two basic ways to pass arguments to vector constructors:

• If only one scalar argument is provided to a vector constructor, that value is used to set all values of the vector.

• If multiple scalar or vector arguments are provided, the values of the vector are set from left to right using those arguments. If multiple scalar arguments are provided, there must be at least as many components in the arguments as in the vector.

The following shows some examples of constructing vectors:

vec4 myVec4 = vec4(1.0);            // myVec4 = {1.0, 1.0, 1.0, 1.0}
vec3 myVec3 = vec3(1.0, 0.0, 0.5);  // myVec3 = {1.0, 0.0, 0.5}
vec3 temp = vec3(myVec3);           // temp = myVec3
vec2 myVec2 = vec2(myVec3);         // myVec2 = {myVec3.x, myVec3.y}
myVec4 = vec4(myVec2, temp);        // myVec4 = {myVec2.x, myVec2.y,
                                    //           temp.x, temp.y}

For matrix construction, the language is exible. These basic rules describe

how matrices can be constructed:

If only one scalar argument is provided to a matrix constructor, that

value is placed in the diagonal of the matrix. For example, mat4

(1.0) will create a 4 × 4 identity matrix.

A matrix can be constructed from multiple vector arguments. For

example, a mat2 can be constructed from two vec2s.

A matrix can be constructed from multiple scalar arguments—one for

each value in the matrix, consumed from left to right.

The matrix construction is even more exible than the basic rules just

stated, in that a matrix can basically be constructed from any combination

of scalars and vectors as long as enough components are provided to

initialize the matrix. Matrices in OpenGL ES are stored in column major

order. When using a matrix constructor, the arguments will be consumed

to ll the matrix by column. The comments in the following example

show how the matrix constructor arguments map into columns.

mat3 myMat3 = mat3(1.0, 0.0, 0.0, // First column

0.0, 1.0, 0.0, // Second column

0.0, 1.0, 1.0); // Third column
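To make the column-by-column consumption concrete, here is a small C sketch that mimics what the nine-scalar mat3 constructor does. It is an emulation for illustration only, not shading language code. The Mat3 type and mat3FromScalars function are hypothetical names, and the array is indexed as m[column][row] to match the myMat3[col][row] subscripting used in shaders.

```c
/* C stand-in for a GLSL mat3, indexed as m[column][row]. */
typedef struct { float m[3][3]; } Mat3;

/* Consumes nine scalars in the order a GLSL mat3 constructor does:
   the first three fill column 0, the next three fill column 1,
   and the last three fill column 2. */
static Mat3 mat3FromScalars(const float v[9])
{
    Mat3 result;
    for (int col = 0; col < 3; ++col)
        for (int row = 0; row < 3; ++row)
            result.m[col][row] = v[col * 3 + row];
    return result;
}
```

Fed the nine values from the example above, this gives m[2][1] equal to 1.0 (third column, second row), while m[1][2] stays 0.0. Reading the argument list as rows instead of columns would swap those two entries.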

Vector and Matrix Components

The individual components of a vector can be accessed in two ways: by using the “.” operator or through array subscripting. Depending on the number of components that make up a given vector, each of the components can be accessed through the use of the swizzles {x, y, z, w}, {r, g, b, a}, or {s, t, p, q}. The reason for the three different naming schemes is that vectors are used interchangeably to represent mathematical vectors, colors, and texture coordinates. The x, r, or s component will always refer to the first element of a vector. The different naming conventions are just provided as a convenience. That said, you cannot mix naming conventions when accessing a vector (in other words, you cannot do something like .xgr, as you can use only one naming convention at a time). When using the “.” operator, it is also possible to reorder components of a vector in an operation. The following examples show how this can be done.

vec3 myVec3 = vec3(0.0, 1.0, 2.0); // myVec3 = {0.0, 1.0, 2.0}

vec3 temp;

temp = myVec3.xyz; // temp = {0.0, 1.0, 2.0}

temp = myVec3.xxx; // temp = {0.0, 0.0, 0.0}

temp = myVec3.zyx; // temp = {2.0, 1.0, 0.0}

In addition to the “.” operator, vectors can be accessed using the array

subscript “[]” operator. In array subscripting, element [0] corresponds to

x, element [1] corresponds to y, and so forth. Matrices are treated as being

composed of a number of vectors. For example, a mat2 can be thought

of as two vec2s, a mat3 as three vec3s, and so forth. For matrices, the

individual column is selected using the array subscript operator “[]”, and

then each vector can be accessed using the vector access behavior. The

following shows some examples of accessing matrices:

mat4 myMat4 = mat4(1.0);    // Initialize diagonal to 1.0 (identity)
vec4 col0 = myMat4[0];      // Get col0 vector out of the matrix
float m1_1 = myMat4[1][1];  // Get element at [1][1] in matrix
float m2_2 = myMat4[2].z;   // Get element at [2][2] in matrix

Constants

It is possible to declare any of the basic types as being constant variables.

Constant variables are those whose values do not change within the shader.

To declare a constant, you add the const qualier to the declaration.

Constant variables must be initialized at declaration time. Some examples

of const declarations follow:

const float zero = 0.0;

const float pi = 3.14159;

const vec4 red = vec4(1.0, 0.0, 0.0, 1.0);

const mat4 identity = mat4(1.0);

Just as in C or C++, a variable that is declared as const is read-only and cannot be modified within the source.

Structures

In addition to using the basic types provided in the language, it is possible

to aggregate variables into structures much like in C. The declaration

syntax for a structure in the OpenGL ES Shading Language is shown in the

following example:

struct fogStruct

{

vec4 color;

float start;

float end;

} fogVar;

The preceding denition will result in a new user type named fogStruct

and a new variable named fogVar.

Structures can be initialized using constructors. After a new structure type

is dened, a new structure constructor is also dened with the same name

as the type. There must be a one-to-one correspondence between types

in the structure and those in the constructor. For example, the preceding

structure could be initialized using the following construction syntax:

struct fogStruct

{

vec4 color;

float start;

float end;

} fogVar;

fogVar = fogStruct(vec4(0.0, 1.0, 0.0, 0.0), // color

0.5, // start

2.0); // end

The constructor for the structure is based on the name of the type and

takes as arguments each of the components. Accessing the elements of a

structure is done just as you would with a structure in C, as shown in the

following example:

vec4 color = fogVar.color;

float start = fogVar.start;

float end = fogVar.end;

Arrays

In addition to structures, the OpenGL ES Shading Language supports

arrays. The syntax is very similar to C, with the arrays being based on a

0 index. The following block of code shows some examples of creating

arrays:

float floatArray[4];

vec4 vecArray[2];

Arrays can be initialized using the array initializer constructor, as shown

in the following code:

float a[4] = float[](1.0, 2.0, 3.0, 4.0);

float b[4] = float[4](1.0, 2.0, 3.0, 4.0);

vec2 c[2] = vec2[2](vec2(1.0), vec2(1.0));

Providing a size to the array constructor is optional. The number of

arguments in the array constructor must be equal to the size of the array.

Operators

Table 5-2 lists the operators that are offered in the OpenGL ES Shading Language.

Table 5-2  OpenGL ES Shading Language Operators

Operator Type            Description
*                        Multiply
/                        Divide
%                        Modulus
+                        Add
-                        Subtract
++                       Increment (prefix and postfix)
--                       Decrement (prefix and postfix)
=                        Assignment
+=, -=, *=, /=           Arithmetic assignment
==, !=, <, >, <=, >=     Comparison operators
&&                       Logical and
^^                       Logical exclusive or
||                       Logical inclusive or
<<, >>                   Bit-wise shift
&, ^, |                  Bit-wise and, xor, or
?:                       Selection
,                        Sequence

Most of these operators behave just as they do in C. As mentioned in the constructor section, the OpenGL ES Shading Language has strict type rules that govern operators. That is, the operators must occur between variables that have the same basic type. For the binary operators (*, /, +, -), the basic types of the variables must be floating point or integer. Furthermore, operators such as multiply can operate between combinations of floats, vectors, and matrices. Some examples are provided here:

float myFloat;
vec4 myVec4;
mat4 myMat4;

myVec4 = myVec4 * myFloat;  // Multiplies each component of myVec4
                            // by the scalar myFloat
myVec4 = myVec4 * myVec4;   // Multiplies myVec4 by itself component by
                            // component (i.e., squares each component)
myVec4 = myMat4 * myVec4;   // Does a matrix * vector multiply of
                            // myMat4 * myVec4
myMat4 = myMat4 * myMat4;   // Does a matrix * matrix multiply of
                            // myMat4 * myMat4
myMat4 = myMat4 * myFloat;  // Multiplies each matrix component
                            // by the scalar myFloat

The comparison operators other than == and != (that is, <, <=, >, and >=) can be used only with scalar values. To compare vectors, special built-in functions allow you to perform comparisons on a component-by-component basis (more on that later).

Functions

Functions are declared in much the same way as in C. If a function will be used prior to its definition, then a prototype declaration must be provided. The most significant difference between functions in the OpenGL ES Shading Language and C is the way in which parameters are passed to functions. The OpenGL ES Shading Language provides special qualifiers to define whether a variable argument can be modified by the function; these qualifiers are shown in Table 5-3.

An example function declaration is provided here. This example shows the use of parameter qualifiers.

vec4 myFunc(inout float myFloat, // inout parameter

out vec4 myVec4, // out parameter

mat4 myMat4); // in parameter (default)

An example function denition is given here for a simple function that

computes basic diffuse lighting:

vec4 diffuse(vec3 normal,

vec3 light,

vec4 baseColor)

{

return baseColor * dot(normal, light);

}
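For readers who want to check the arithmetic on the CPU, the following C sketch reproduces what the diffuse function computes. The Vec3 and Vec4 structs and the dot3 helper are hypothetical stand-ins for the built-in shading language types and the dot function. Like the shader version, the sketch does not clamp the result, so a surface facing away from the light yields negative values.

```c
typedef struct { float x, y, z; } Vec3;
typedef struct { float r, g, b, a; } Vec4;

/* Dot product of two 3-component vectors. */
static float dot3(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Same arithmetic as the shader's diffuse() function: every channel
   of baseColor, including alpha, is scaled by dot(normal, light). */
static Vec4 diffuse(Vec3 normal, Vec3 light, Vec4 baseColor)
{
    float nDotL = dot3(normal, light);
    Vec4 result = { baseColor.r * nDotL, baseColor.g * nDotL,
                    baseColor.b * nDotL, baseColor.a * nDotL };
    return result;
}
```

With a unit normal and a unit light vector pointing the same way, the base color comes back unchanged. With the light perpendicular to the normal, every channel becomes zero.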

One note about functions in the OpenGL ES Shading Language:

functions cannot be recursive. The reason for this limitation is that some

implementations will implement function calls by actually placing the

function code inline in the nal generated program for the GPU. The

shading language was purposely structured to allow this sort of inline

implementation to support GPUs that do not have a stack.

Table 5-3 OpenGL ES Shading Language Qualiers

Qualifier Description

in (Default if none specied) This qualier species that the parameter

is passed by value and will not be modied by the function.

inout This qualier species that the variable is passed by reference into

the function and if its value is modied, it will be changed after

function exit.

out This qualier says that the variable’s value is not passed into the

function, but it will be modied on return from the function.

Built-In Functions

The preceding section described how a shader author creates a function.

One of the most powerful features of the OpenGL ES Shading Language is

the built-in functions that are provided in the language. As an example,

here is some shader code for computing basic specular lighting in a

fragment shader:

float nDotL = dot(normal, light);

float rDotV = dot(viewDir, (2.0 * normal) * nDotL - light);

float specular = specularColor * pow(rDotV, specularPower);

As you can see, this block of shader code uses the dot built-in function

to compute the dot product of two vectors and the pow built-in function

to raise a scalar to a power. These are just two simple examples; a wide

array of built-in functions are available in the OpenGL ES Shading

Language to handle the various computational tasks that one typically

has to do in a shader. Appendix B of this text provides a complete

reference to the built-in functions provided in the OpenGL ES Shading

Language. For now, we just want to make you aware that there are a lot

of built-in functions in the language. To become procient in writing

shaders, you will need to familiarize yourself with the most

common ones.

Control Flow Statements

The syntax for control ow statements in the OpenGL ES Shading

Language is similar to that used in C. Simple if-then-else logical tests can

be done using the same syntax as C. For example:

if(color.a < 0.25)

{

color *= color.a;

}

else

{

color = vec4(0.0);

}

The expression that is being tested in the conditional statement must

evaluate to a boolean value. That is, the test must be based on either

a boolean value or some expression that evaluates to a boolean value

(e.g., a comparison operator). This is the basic concept underlying how

conditionals are expressed in the OpenGL ES Shading Language.

In addition to basic if-then-else statements, it is possible to write for, while, and do-while loops. In OpenGL ES 2.0, very strict rules governed the usage of loops. Essentially, only loops that could be unrolled by the compiler were supported. These restrictions no longer exist in OpenGL ES 3.0. The GPU hardware is expected to provide support for looping and flow control; thus loops are fully supported.

That is not to say that loops don’t come with some performance implications. On most GPU architectures, vertices or fragments are executed in parallel in batches. The GPU typically requires that all fragments or vertices in a batch evaluate all branches (or loop iterations) of flow control statements. If vertices or fragments in a batch execute different paths, then usually all of the other vertices/fragments in the batch will need to execute those paths as well. The size of a batch is GPU dependent, and profiling is often required to determine the performance implications of flow control on a particular architecture. However, a good rule of thumb is to try to limit the use of divergent flow control or loop iterations across vertices/fragments.
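The cost of divergence can be illustrated with a deliberately simplified model, written here as a C sketch. The assumption, which does not describe any particular GPU, is that a batch pays the full cost of a branch whenever at least one of its vertices or fragments takes that branch. The function name and the cost units are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified cost model (an assumption, not a description of real
   hardware): a batch pays for a branch if any of its items takes it. */
static int batchBranchCost(const bool *takesIf, size_t count,
                           int ifCost, int elseCost)
{
    bool anyIf = false;
    bool anyElse = false;

    for (size_t i = 0; i < count; ++i) {
        if (takesIf[i])
            anyIf = true;
        else
            anyElse = true;
    }
    return (anyIf ? ifCost : 0) + (anyElse ? elseCost : 0);
}
```

In this model, a batch whose items all take the if branch pays only ifCost, while a single item taking the else branch makes the whole batch pay ifCost plus elseCost. Real hardware behaves in more nuanced ways, so profiling remains the only reliable measure.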

Uniforms

One of the variable type modiers in the OpenGL ES Shading Language

is the uniform variable. Uniform variables store read-only values that

are passed in by the application through the OpenGL ES 3.0 API to the

shader. Uniforms are useful for storing all kinds of data that shaders need,

such as transformation matrices, light parameters, and colors. Basically,

any parameter to a shader that is constant across either all vertices or

fragments should be passed in as a uniform. Variables whose value is

known at compile-time should be constants rather than uniforms for

efciency.

Uniform variables are declared at the global scope and simply require the

uniform qualier. Some examples of uniform variables are shown here:

uniform mat4 viewProjMatrix;

uniform mat4 viewMatrix;

uniform vec3 lightPosition;

In Chapter 4, “Shaders and Programs,” we described how an application loads uniform variables to a shader. Note also that the namespace for uniform variables is shared across both a vertex shader and a fragment shader. That is, if vertex and fragment shaders are linked together into a program object, they share the same set of uniform variables. Therefore, if a uniform variable is declared in the vertex shader and also in the fragment shader, both of those declarations must match. When the application loads the uniform variable through the API, its value will be available in both the vertex and fragment shaders.

Uniform variables generally are stored in hardware in what is known as the “constant store.” This special space is allocated in the hardware for the storage of constant values. Because it is typically of a fixed size, the number of uniforms that can be used in a program is limited. This limitation can be determined by reading the value of the gl_MaxVertexUniformVectors and gl_MaxFragmentUniformVectors built-in variables (or by querying GL_MAX_VERTEX_UNIFORM_VECTORS or GL_MAX_FRAGMENT_UNIFORM_VECTORS using glGetIntegerv). An implementation of OpenGL ES 3.0 must provide at least 256 vertex uniform vectors and 224 fragment uniform vectors, although it is free to provide more. We cover the full set of limitations and queries available for the vertex and fragment shaders in Chapter 8, “Vertex Shaders,” and Chapter 10, “Fragment Shaders.”

Uniform Blocks

In Chapter 4, “Shaders and Programs,” we introduced the concept of uniform buffer objects. To review, uniform buffer objects allow the storage of uniform data to be backed by a buffer object. Uniform buffer objects offer several advantages over individual uniform variables in certain situations. For example, with uniform buffer objects, uniform buffer data can be shared across multiple programs but needs to be set only once. Further, uniform buffer objects typically allow for storage of larger amounts of uniform data. Finally, it can be more efficient to switch between uniform buffer objects than to individually load one uniform at a time.

Uniform buffer objects can be used in the OpenGL ES Shading Language

through application of uniform blocks. An example uniform block

follows:

uniform TransformBlock

{

mat4 matViewProj;

mat3 matNormal;

mat3 matTexGen;

};

This code declares a uniform block with the name TransformBlock

containing three matrices. The name TransformBlock will be used by the

application as the blockName parameter to glGetUniformBlockIndex

as described in Chapter 4, “Shaders and Programs,” for uniform buffer

objects. The variables in the uniform block declaration are then accessed

throughout the shader just as if they were declared as a regular uniform.

For example, the matViewProj matrix declared in TransformBlock would

be accessed as follows:

#version 300 es

uniform TransformBlock

{

mat4 matViewProj;

mat3 matNormal;

mat3 matTexGen;

};

layout(location = 0) in vec4 a_position;

void main()

{

gl_Position = matViewProj * a_position;

}

A number of optional layout qualiers can be used to specify how the

uniform buffer object that backs the uniform block will be laid out in

memory. Layout qualiers can be provided either for individual

uniform blocks or globally for all uniform blocks. At the global

scope, setting the default layout for all uniform blocks would look as

follows:

layout(shared, column_major) uniform; // default if not

// specified

layout(packed, row_major) uniform;

Individual uniform blocks can also set the layout by overriding the default

set at the global scope. In addition, individual uniforms within a uniform

block can specify a layout qualier as shown here:

layout(std140) uniform TransformBlock

{

mat4 matViewProj;

layout(row_major) mat3 matNormal;

mat3 matTexGen;

};
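One practical consequence of choosing std140 is that matrix members take more space than their component count suggests. The C sketch below computes the byte offsets for the three members of the std140 TransformBlock, assuming the std140 rule that each column of a column-major float matrix (or each row of a row_major one) is stored with a 16-byte stride. The type and function names are hypothetical.

```c
#include <stddef.h>

/* std140: each column (or row, for row_major) of a float matrix is
   stored with a 16-byte stride, so the size depends only on the
   number of vectors in the matrix. */
static size_t std140MatrixSize(int vectorCount)
{
    return (size_t)vectorCount * 16u;
}

typedef struct {
    size_t matViewProj;
    size_t matNormal;
    size_t matTexGen;
    size_t blockSize;
} TransformBlockLayout;

static TransformBlockLayout computeTransformBlockLayout(void)
{
    TransformBlockLayout layout;
    size_t offset = 0;             /* every member is 16-byte aligned */

    layout.matViewProj = offset;   /* mat4: 4 columns                 */
    offset += std140MatrixSize(4);
    layout.matNormal = offset;     /* mat3, row_major: 3 rows         */
    offset += std140MatrixSize(3);
    layout.matTexGen = offset;     /* mat3: 3 columns                 */
    offset += std140MatrixSize(3);
    layout.blockSize = offset;
    return layout;
}
```

Under these assumptions the members sit at byte offsets 0, 64, and 112, and the block is 160 bytes long. Each mat3 occupies 48 bytes rather than 36, so an application that uploads a tightly packed array of nine floats for a mat3 member would fill the block incorrectly.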

Table 5-4 lists all of the layout qualiers that can be provided for uniform

blocks.

Vertex and Fragment Shader Inputs/Outputs 111

Vertex and Fragment Shader Inputs/Outputs

Another special variable type in the OpenGL ES Shading Language is the vertex input (or attribute) variable. Vertex input variables are used to specify the per-vertex inputs to the vertex shader and are specified with the in keyword. They typically store data such as positions, normals, texture coordinates, and colors. The key point is that vertex inputs are data specified for each vertex being drawn. Example 5-1 is a sample vertex shader that has a position and color vertex input.

The two vertex inputs in this shader, a_position and a_color, will be loaded with data by the application. Essentially, the application will create a vertex array that contains a position and a color for each vertex. Notice that the vertex inputs in Example 5-1 are preceded by the layout qualifier. The layout qualifier in this case is used to specify the index of the vertex attribute. The layout qualifier is optional; if it is not specified, the linker will automatically assign locations for the vertex inputs. We explain this entire process in full detail in Chapter 6, “Vertex Attributes, Vertex Arrays, and Buffer Objects.”

Table 5-4 Uniform Block Layout Qualifiers

Qualifier       Description

shared          The shared qualifier specifies that the layout in memory of the uniform block across multiple shaders or multiple programs will be the same. To use this qualifier, the row_major/column_major values must be identical across definitions. Overrides std140 and packed. (default)

packed          The packed layout qualifier specifies that the compiler can optimize the memory layout of the uniform block. The location of the offsets must be queried when using this qualifier, and the uniform blocks cannot be shared across vertex/fragment shaders or programs. Overrides std140 and shared.

std140          The std140 layout qualifier specifies that the layout of the uniform block is based on a set of standard rules defined in the “Standard Uniform Block Layout” section of the OpenGL ES 3.0 Specification. We detail these layout rules in the Uniform Buffer Objects section of Chapter 4. Overrides shared and packed.

row_major       Matrices are laid out in row-major order in memory.

column_major    Matrices are laid out in column-major order in memory. (default)

As with uniform variables, the underlying hardware typically places limits on

the number of attribute variables that can be input to a vertex shader. The

maximum number of attributes that an implementation supports is given by

the gl_MaxVertexAttribs built-in variable (it can also be found by querying

for GL_MAX_VERTEX_ATTRIBS using glGetIntegerv). The minimum

number of attributes that an OpenGL ES 3.0 implementation can support

is 16. Implementations are free to support more, but if you want to write

shaders that are guaranteed to run on any OpenGL ES 3.0 implementation,

you should restrict yourself to using no more than 16 attributes. We cover

attribute limitations in more detail in Chapter 8, “Vertex Shaders.”

The output variables from the vertex shader are specified with the out keyword. In Example 5-1, the v_color variable is declared as an output and its contents are copied from the a_color input variable. Each vertex shader will output the data it needs to pass to the fragment shader into one or more output variables. These variables will then also be declared in the fragment shader as in variables (with matching types) and will be linearly interpolated across the primitive during rasterization (if you want more details on how this interpolation occurs during rasterization, jump to Chapter 7, “Primitive Assembly and Rasterization”).

Example 5-1 Sample Vertex Shader

    #version 300 es
    uniform mat4 u_matViewProjection;
    layout(location = 0) in vec4 a_position;
    layout(location = 1) in vec3 a_color;
    out vec3 v_color;
    void main(void)
    {
        gl_Position = u_matViewProjection * a_position;
        v_color = a_color;
    }

For example, the matching input declaration in the fragment shader for the v_color vertex output in Example 5-1 follows:

    in vec3 v_color;

Note that unlike the vertex shader input, the vertex shader output/fragment shader input variables cannot have layout qualifiers. The implementation automatically chooses locations. As with uniforms and vertex input attributes, the underlying hardware typically limits the number of vertex shader outputs/fragment shader inputs (on the hardware, these are usually referred to as interpolators). The number of vertex shader outputs supported by an implementation is given by the gl_MaxVertexOutputVectors built-in variable (querying for GL_MAX_VERTEX_OUTPUT_COMPONENTS using glGetIntegerv will provide the number of total component values rather than the number of vectors). The minimum number of vertex output vectors that an implementation of OpenGL ES 3.0 can support is 16. Likewise, the number of fragment shader inputs supported by an implementation is given by gl_MaxFragmentInputVectors (querying for GL_MAX_FRAGMENT_INPUT_COMPONENTS using glGetIntegerv will provide the number of total component values rather than the number of vectors). The minimum number of fragment input vectors that an implementation of OpenGL ES 3.0 can support is 15.
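Because glGetIntegerv reports these two limits in components while the shading language constants count four-component vectors, an application that compares the two has to convert. A minimal C sketch of that conversion follows. The helper names are hypothetical, and the queried values are passed in as plain integers so the code does not depend on a GL context.

```c
/* Convert a component limit, as returned for
   GL_MAX_VERTEX_OUTPUT_COMPONENTS or GL_MAX_FRAGMENT_INPUT_COMPONENTS,
   into a count of four-component vectors. */
static int componentsToVectors(int components)
{
    return components / 4;
}

/* Returns 1 if a shader pair that uses vectorsUsed interpolator
   vectors fits within both queried component limits, 0 otherwise. */
static int interpolatorsFit(int vectorsUsed,
                            int maxVertexOutputComponents,
                            int maxFragmentInputComponents)
{
    return vectorsUsed <= componentsToVectors(maxVertexOutputComponents) &&
           vectorsUsed <= componentsToVectors(maxFragmentInputComponents);
}
```

With component limits of 64 and 60, which correspond to the 16-vector and 15-vector minimums above, a shader pair using 15 vectors fits and one using 16 does not, because the fragment input limit is the tighter of the two.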

Example 5-2 is an example of a vertex shader and a fragment shader with

matching output/input declarations.

Example 5-2 Vertex and Fragment Shaders with Matching Output/Input Declarations

    // Vertex shader
    #version 300 es
    uniform mat4 u_matViewProjection;

    // Vertex shader inputs
    layout(location = 0) in vec4 a_position;
    layout(location = 1) in vec3 a_color;

    // Vertex shader output
    out vec3 v_color;

    void main(void)
    {
        gl_Position = u_matViewProjection * a_position;
        v_color = a_color;
    }

    // Fragment shader
    #version 300 es
    precision mediump float;

    // Input from vertex shader
    in vec3 v_color;

    // Output of fragment shader
    layout(location = 0) out vec4 o_fragColor;

    void main()
    {
        o_fragColor = vec4(v_color, 1.0);
    }

In Example 5-2, the fragment shader contains the definition for the output variable o_fragColor:

    layout(location = 0) out vec4 o_fragColor;

The fragment shader can output one or more colors. In the typical case, we will render just to a single color buffer, in which case the layout qualifier is optional (the output variable is assumed to go to location 0). However, when rendering to multiple render targets (MRTs), we can use the layout qualifier to specify which render target each output goes to. MRTs are covered in detail in Chapter 11, “Fragment Operations.” For the typical case, you will have one output variable in your fragment shader, and that value will be the output color that is passed to the per-fragment operations portions of the pipeline.

Interpolation Qualifiers

In Example 5-2, we declared our vertex shader output and fragment shader input without any qualifiers. The default behavior for interpolation when no qualifiers are present is to perform smooth shading. That is, the output variables from the vertex shader are linearly interpolated across the primitive, and the fragment shader receives that linearly interpolated value as its input. We could have explicitly requested smooth shading rather than relying on the default behavior in Example 5-2, in which case our output/inputs would be as follows:

    // ...Vertex shader...
    // Vertex shader output
    smooth out vec3 v_color;

    // ...Fragment shader...
    // Input from vertex shader
    smooth in vec3 v_color;

OpenGL ES 3.0 also introduces another type of interpolation known as flat shading. In flat shading, the value is not interpolated across the primitive. Rather, one of the vertices is considered the provoking vertex (dependent on the primitive type; we describe this in the Chapter 7 section, Provoking Vertex), and that vertex value is used for all fragments in the primitive. We can declare the output/inputs as flat shaded as follows:

// ...Vertex shader...

// Vertex shader output

flat out vec3 v_color;

// ...Fragment shader...

// Input from vertex shader

flat in vec3 v_color;
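The difference between the two modes can be sketched on the CPU. The C code below is an illustration only: it assumes barycentric weights that sum to 1, ignores perspective correction, and assumes that the last vertex of the triangle is the provoking vertex. The function names are hypothetical.

```c
/* Smooth shading: blend the three vertex values with barycentric
   weights b0 + b1 + b2 = 1 (perspective correction is omitted). */
static float interpolateSmooth(const float v[3], float b0, float b1, float b2)
{
    return v[0] * b0 + v[1] * b1 + v[2] * b2;
}

/* Flat shading: every fragment receives the provoking vertex value,
   assumed here to be the last vertex of the triangle. */
static float interpolateFlat(const float v[3])
{
    return v[2];
}
```

For a triangle whose three vertices carry the values 0.0, 0.5, and 1.0, the smooth result varies across the primitive with the weights, while the flat result is 1.0 for every fragment.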

Finally, the centroid keyword can be added to interpolators. When rendering with multisampling, centroid forces interpolation to occur inside the primitive being rendered; otherwise, artifacts can occur at the edges of primitives. See the section Multisampled Anti-Aliasing in Chapter 11, “Fragment Operations,” for a full definition of centroid sampling. For now, we simply show how you can declare an output/input variable with centroid sampling:

// ...Vertex shader...

// Vertex shader output

smooth centroid out vec3 v_color;

// ...Fragment shader...

// Input from vertex shader

smooth centroid in vec3 v_color;

Preprocessor and Directives

One feature of the OpenGL ES Shading Language we have not mentioned yet is the preprocessor. The OpenGL ES Shading Language features a preprocessor that follows many of the conventions of a standard C++ preprocessor. Macros can be defined and conditional tests can be performed using the following directives:

#define

#undef

#if

#ifdef

#ifndef

#else

#elif

#endif

Macros can be defined with or without parameters, as in C++. The #if and #elif directives can use the defined operator to test whether a macro is defined. The following macros are predefined:

    __LINE__     // Replaced with the current line number in a shader
    __FILE__     // Always 0 in OpenGL ES 3.0
    __VERSION__  // The OpenGL ES shading language version (e.g., 300)
    GL_ES        // Defined for ES shaders with a value of 1

The #error directive will cause a compilation error to occur during shader

compilation, with a corresponding message being placed in the info log.

The #pragma directive is used to specify implementation-specific directives

to the compiler.
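A common use of #define and the conditional directives is to build several shader variants from one source by prepending #define lines on the application side. The C sketch below assembles such a source string while keeping the #version directive first. The function name and its parameters are hypothetical; the resulting string is what an application would hand to glShaderSource.

```c
#include <stdio.h>
#include <stddef.h>

/* Writes "#version 300 es", then the caller's #define lines, then the
   shader body into out. The defines string is expected to end with a
   newline. Returns 1 on success, 0 if the buffer is too small. */
static int buildShaderSource(char *out, size_t outSize,
                             const char *defines, const char *body)
{
    int n = snprintf(out, outSize, "#version 300 es\n%s%s", defines, body);
    return n >= 0 && (size_t) n < outSize;
}
```

The body can then test the macro with #ifdef or #if defined(...), so one source file serves both the variant with the macro and the variant without it.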

Another important directive in the preprocessor is #extension, which is used to enable and set the behavior of extensions. When vendors (or groups of vendors) extend the OpenGL ES Shading Language, they will create a language extension specification (e.g., GL_NV_shadow_samplers_cube). The shader must instruct the compiler as to whether to allow extensions to be used, and if not, which behavior should occur. This is done using the #extension directive. The general format of #extension usage is shown in the following code:

// Set behavior for an extension

#extension extension_name : behavior

// Set behavior for ALL extensions

#extension all : behavior

The rst argument will be either the name of the extension (e.g.,

GL_NV_shadow_samplers_cube) or all, which means that the behavior

applies to all extensions. The behavior has four possible options, as shown in

Table 5-5.

Table 5-5 Extension Behaviors

Extension Behavior   Description

require              The extension is required, so the preprocessor will throw an error if the extension is not supported. If all is specified, this will always throw an error.

enable               The extension is enabled, so the preprocessor will warn if the extension is not supported. The language will be processed as if the extension is enabled. If all is specified, this will always throw an error.

warn                 Warn on any use of the extension, unless that use is required by another enabled extension. If all is specified, a warning will be thrown whenever the extension is used. Also, a warning will be thrown if the extension is not supported.

disable              The extension is disabled, so errors will be thrown if the extension is used. If all is specified (this is specified by default), no extensions are enabled.

As an example, suppose you want the preprocessor to produce a warning if the NVIDIA shadow samplers cube extension is not supported (and you want the shader to be processed as if it is supported). To do so, you would add the following at the top of your shader:

    #extension GL_NV_shadow_samplers_cube : enable

Uniform and Interpolator Packing

As noted in the preceding sections on uniforms and vertex shader outputs/fragment shader inputs, a fixed number of underlying hardware resources are available for the storage of each variable. Uniforms are typically stored in the so-called constant store, which can be thought of as a physical array of vectors. Vertex shader outputs/fragment shader inputs are typically stored in interpolators, which again are usually stored as an array of vectors. As you have probably noticed, shaders can declare uniforms and shader input/outputs of various types, including scalars, various vector components, and matrices. But how do these variable declarations map to the physical space that’s available on the hardware? In other words, if an OpenGL ES 3.0 implementation says it supports 16 vertex shader output vectors, how does the physical storage actually get used?

In OpenGL ES 3.0, this issue is handled through packing rules that define how the interpolators and uniforms will map to physical storage space. The rules for packing are based on the notion that the physical storage space is organized into a grid with four columns (one column for each vector component) and a row for each storage location. The packing rules seek to pack variables such that the complexity of the generated code remains constant. In other words, the packing rules will not do reordering that requires the compiler to generate extra instructions to merge unpacked data. Rather, the packing rules seek to optimize the use of the physical address space without negatively impacting runtime performance.

Let’s look at an example group of uniform declarations and see how these

would be packed:

uniform mat3 m;

uniform float f[6];

uniform vec3 v;

If no packing were done at all, you can see that a lot of constant storage

space would be wasted. The matrix m would take up three rows, the array

f would take up six rows, and the vector v would take up one row. This

would use a total of 10 rows to store the variables. Table 5-6 shows what

the results would be without any packing. With the packing rules, the

variables will get organized such that they pack into the grid as shown in

Table 5-7.

Table 5-6 Uniform Storage without Packing

Location   X        Y        Z        W
0          m[0].x   m[0].y   m[0].z   -
1          m[1].x   m[1].y   m[1].z   -
2          m[2].x   m[2].y   m[2].z   -
3          f[0]     -        -        -
4          f[1]     -        -        -
5          f[2]     -        -        -
6          f[3]     -        -        -
7          f[4]     -        -        -
8          f[5]     -        -        -
9          v.x      v.y      v.z      -

With the packing rules, only six physical constant locations need to be

used. You will notice that the array f needs to keep its elements spanning

across row boundaries. The reason for this is that typically GPUs index the

constant store by vector location index. The packing must keep the arrays

spanning across row boundaries so that indexing will still work.

All of the packing that is done is completely transparent to the user of the

OpenGL ES Shading Language, except for one detail: The packing impacts

the way in which uniforms and vertex shader outputs/fragment shader

inputs are counted. If you want to write shaders that are guaranteed to

run on all implementations of OpenGL ES 3.0, you should not use more

uniforms or interpolators than would exceed the minimum allowed

storage sizes after packing. For this reason, it’s important to be aware of

packing so that you can write portable shaders that will not exceed the

minimum allowed storage on any implementation of OpenGL ES 3.0.
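The row counts in this example can be reproduced with a few lines of arithmetic. The C sketch below is deliberately narrow: it handles only the mix of types used here (mat3 matrices, vec3 vectors, and one float array) and is not the full packing algorithm from the specification. All names are hypothetical.

```c
/* Rows used when every matrix column, vector, and array element
   occupies a whole four-component row. */
static int rowsWithoutPacking(int numMat3, int numVec3, int floatArrayLen)
{
    return numMat3 * 3 + numVec3 + floatArrayLen;
}

/* Rows used when three-component data fills the X, Y, and Z columns
   and the float array runs down the W column, one element per row. */
static int rowsWithPacking(int numMat3, int numVec3, int floatArrayLen)
{
    int xyzRows = numMat3 * 3 + numVec3;
    return xyzRows > floatArrayLen ? xyzRows : floatArrayLen;
}
```

For the declarations above (one mat3, one vec3, and float f[6]), the two functions give 10 rows and 6 rows, matching Table 5-6 and Table 5-7.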

Table 5-7 Uniform Storage with Packing

Location   X        Y        Z        W
0          m[0].x   m[0].y   m[0].z   f[0]
1          m[1].x   m[1].y   m[1].z   f[1]
2          m[2].x   m[2].y   m[2].z   f[2]
3          v.x      v.y      v.z      f[3]
4          -        -        -        f[4]
5          -        -        -        f[5]

Precision Qualifiers

Precision qualifiers enable the shader author to specify the precision with which computations for a shader variable are performed. Variables can be declared to have either low, medium, or high precision. These qualifiers are used as hints to the compiler to allow it to perform computations with variables at a potentially lower range and precision. It is possible that at lower precisions, some implementations of OpenGL ES might be able to run the shaders either faster or with better power efficiency. Of course, that efficiency savings comes at the cost of precision, which can result in artifacts if precision qualifiers are not used properly. Note that nothing in the OpenGL ES specification says that multiple precisions must be supported in the underlying hardware, so it is perfectly valid for an implementation of OpenGL ES to perform all calculations at the highest precision and simply ignore the qualifiers. However, on some implementations, using a lower precision might offer an advantage.

Precision qualifiers can be used to specify the precision of any floating-point or integer-based variable. The keywords for specifying the precision are lowp, mediump, and highp. Some examples of declarations with precision qualifiers are shown here:

highp vec4 position;

out lowp vec4 color;

mediump float specularExp;

In addition to precision qualifiers, the notion of default precision is available. That is, if a variable is declared without having a precision qualifier, it will have the default precision for that type. The default precision qualifier is specified at the top of a vertex or fragment shader using the following syntax:

precision highp float;

precision mediump int;

The precision specified for float will be used as the default precision for all variables based on a floating-point value. Likewise, the precision specified for int will be used as the default precision for all integer-based variables.

In the vertex shader, if no default precision is specified, then the default precision for both int and float is highp. That is, all variables declared without a precision qualifier in a vertex shader will have the highest precision. The rules for the fragment shader are different. In the fragment shader, there is no default precision given for floating-point values: Every shader must declare a default float precision or specify the precision for every float variable.

One nal note is that the precision specied by a precision qualier has an

implementation-dependent range and precision. There is an associated API

call for determining the range and precision for a givenimplementation,

which is covered in Chapter 15, “State Queries.” As an example, on the

PowerVR SGX GPU, a lowp float variable is represented in a 10-bit xed

point format, a mediump float variable is a 16-bit oating-point value,

and a highp float is a 32-bit oating-point value.
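To get a feel for what a low-precision format does to a value, the C sketch below quantizes a float to a signed fixed-point format with 8 fractional bits and a range of [-2, 2), which is 10 bits in total. That bit split is an assumption chosen for illustration; the text above states only that the format is 10-bit fixed point. The function name is hypothetical.

```c
/* Quantize x to a 10-bit two's complement fixed-point value with
   8 fractional bits (an assumed split): steps of 1/256 in [-2, 2).
   Assumes x is small enough that x * 256 fits in an int. */
static float quantizeLowp(float x)
{
    /* Round to the nearest step, away from zero on ties. */
    int q = (int) (x * 256.0f + (x >= 0.0f ? 0.5f : -0.5f));

    if (q > 511)  q = 511;   /* largest representable code:  511/256 */
    if (q < -512) q = -512;  /* smallest representable code: -2.0    */

    return (float) q / 256.0f;
}
```

Under this assumed format, 0.5 survives unchanged, 0.001 collapses to 0.0, and 3.0 clamps to just under 2.0. Color values usually tolerate this; positions and texture coordinates for large textures usually do not.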

Invariance

The keyword invariant that was introduced in the OpenGL ES Shading

Language can be applied to any varying output of a vertex shader. What

do we mean by invariance, and why is this necessary? The issue is that

shaders are compiled and the compiler might perform optimizations that

cause instructions to be reordered. This instruction reordering means

that equivalent calculations between two shaders are not guaranteed

to produce exactly identical results. This disparity can be an issue in

particular for multipass shader effects, where the same object is being

drawn on top of itself using alpha blending. If the precision of the values

used to compute the output position is not exactly identical, then artifacts

can arise due to the precision differences. This issue usually manifests

itself as “Z fighting,” when small Z precision differences per pixel cause

the different passes to shimmer against each other.
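The underlying cause is that floating-point arithmetic is not associative, so a reordered expression can round differently. The C sketch below shows the effect with plain float addition. It is an analogy for what instruction reordering in a shader compiler can do, not a reproduction of any particular GPU. The function names are hypothetical, and volatile is used only to force each intermediate result to be rounded to float.

```c
/* Computes (a + b) + c, rounding the intermediate sum to float. */
static float sumLeftToRight(float a, float b, float c)
{
    volatile float t = a + b;
    volatile float r = t + c;
    return r;
}

/* Computes a + (b + c): the same math in a different order. */
static float sumRightToLeft(float a, float b, float c)
{
    volatile float t = b + c;
    volatile float r = a + t;
    return r;
}
```

With a = 1.0e8, b = -1.0e8, and c = 1.0, the first order yields 1.0 and the second yields 0.0, because adding 1.0 to -1.0e8 is lost to rounding in single precision. Two shaders that compute "the same" position in different orders can disagree in the same way, only by much smaller amounts.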

The following example demonstrates visually why invariance is important to get right when doing multipass shading. A torus is drawn in two passes: The fragment shader computes specular lighting in the first pass and ambient and diffuse lighting in the second pass. The vertex shaders do not use invariance, so small precision differences cause the Z fighting shown in Figure 5-1.

Figure 5-1 Z Fighting Artifacts Due to Not Using Invariance

The same multipass vertex shaders using invariance for position produce the correct image in Figure 5-2.

Figure 5-2 Z Fighting Avoided Using Invariance

The introduction of invariance gives the shader writer a way to specify

that if the same computations are used to compute an output, its value

must be exactly the same (or invariant). The invariant keyword can be

used either on varying declarations or for varyings that have already been

declared. Some examples follow:

invariant gl_Position;

invariant texCoord;

Once invariance is declared for an output, the compiler guarantees

that the results will be the same given the same computations and

inputs into the shader. For example, given two vertex shaders that

compute output position by multiplying the view projection matrix

by the input position, you are guaranteed that those positions will be

invariant.

#version 300 es

uniform mat4 u_viewProjMatrix;

layout(location = 0) in vec4 a_vertex;

invariant gl_Position;

void main()

{

// Will be the same value in all shaders with the

// same viewProjMatrix and vertex

gl_Position = u_viewProjMatrix * a_vertex;

}

It is also possible to make all variables globally invariant using a #pragma

directive:

#pragma STDGL invariant(all)

A word of caution: Because the compiler needs to guarantee invariance, it

might have to limit the optimizations it does. Therefore, the invariant

qualifier should be used only when necessary; otherwise, it might result in

performance degradation. For this reason, the #pragma directive to globally

enable invariance should be used only when invariance is really required for

all variables. Note also that while invariance does imply that the calculation

will have the same results on a given GPU, it does not mean that the

computation will be invariant across any implementation of OpenGL ES.

Summary

This chapter introduced the following features of the OpenGL ES Shading

Language:

• Shader version specification with #version

• Scalar, vector, and matrix data types and constructors

• Declaration of constants using the const qualifier

• Creation and initialization of structures and arrays

• Operators, control flow, and functions

• Vertex shader inputs/outputs and fragment shader inputs/outputs using the in and out keywords and layout qualifier

• Smooth, flat, and centroid interpolation qualifiers

• Uniforms, uniform blocks, and uniform block layout qualifiers

• Preprocessor and directives

• Uniform and interpolator packing

• Precision qualifiers and invariance

In the next chapter, we focus on how to load vertex input variables with

data from vertex arrays and vertex buffer objects. We will expand your

knowledge of the OpenGL ES Shading Language throughout the book.

For example, in Chapter 8, “Vertex Shaders,” we describe how to perform

transformation, lighting, and skinning in a vertex shader. In Chapter 9,

“Texturing,” we explain how to load textures and how to use them in a

fragment shader. In Chapter 10, “Fragment Shaders,” we cover how to

compute fog, perform alpha testing, and evaluate user clip planes in a

fragment shader. In Chapter 14, “Advanced Programming with OpenGL

ES 3.0,” we go deep into writing shaders that perform advanced effects

such as environment mapping, projective texturing, and per-fragment

lighting. With the grounding in the OpenGL ES Shading Language from

this chapter, we can show you how to use shaders to achieve a variety of

rendering techniques.

Chapter 6

Vertex Attributes, Vertex Arrays, and Buffer Objects

This chapter describes how vertex attributes and data are specified in

OpenGL ES 3.0. We discuss what vertex attributes are, how to specify

them and their supported data formats, and how to bind vertex attributes

for use in a vertex shader. After reading this chapter, you will have a good

grasp of what vertex attributes are and how to draw primitives with vertex

attributes in OpenGL ES 3.0.

Vertex data, also referred to as vertex attributes, specify per-vertex data.

This per-vertex data can be specified for each vertex, or a constant value

can be used for all vertices. For example, if you want to draw a triangle

that has a solid color (for the sake of this example, suppose the color is

black, as shown in Figure 6-1), you would specify a constant value that

will be used by all three vertices of the triangle. However, the position of

the three vertices that make up the triangle will not be the same, so we

will need to specify a vertex array that stores three position values.

Figure 6-1 Triangle with a Constant Color Vertex and

Per-Vertex Position Attributes

Specifying Vertex Attribute Data

Vertex attribute data can be specified for each vertex using a vertex array, or a constant value can be used for all vertices of a primitive.

All OpenGL ES 3.0 implementations must support a minimum of 16 vertex attributes. An application can query the exact number of vertex attributes that are supported by a particular implementation. The following code shows how an application can query the number of vertex attributes an implementation actually supports:

    GLint maxVertexAttribs; // maxVertexAttribs will be >= 16
    glGetIntegerv(GL_MAX_VERTEX_ATTRIBS, &maxVertexAttribs);

Constant Vertex Attribute

A constant vertex attribute is the same for all vertices of a primitive, so only one value needs to be specified for all the vertices of a primitive. It is specified using any of the following functions:

    void glVertexAttrib1f(GLuint index, GLfloat x);
    void glVertexAttrib2f(GLuint index, GLfloat x, GLfloat y);
    void glVertexAttrib3f(GLuint index, GLfloat x, GLfloat y,
                          GLfloat z);
    void glVertexAttrib4f(GLuint index, GLfloat x, GLfloat y,
                          GLfloat z, GLfloat w);
    void glVertexAttrib1fv(GLuint index, const GLfloat *values);
    void glVertexAttrib2fv(GLuint index, const GLfloat *values);
    void glVertexAttrib3fv(GLuint index, const GLfloat *values);
    void glVertexAttrib4fv(GLuint index, const GLfloat *values);

The glVertexAttrib* commands are used to load the generic vertex attribute specified by index. The functions glVertexAttrib1f and glVertexAttrib1fv load (x, 0.0, 0.0, 1.0) into the generic vertex attribute. glVertexAttrib2f and glVertexAttrib2fv load (x, y, 0.0, 1.0) into the generic vertex attribute. glVertexAttrib3f and glVertexAttrib3fv load (x, y, z, 1.0) into the generic vertex attribute. glVertexAttrib4f and glVertexAttrib4fv load (x, y, z, w) into the generic vertex attribute. In practice, constant vertex attributes provide equivalent functionality to using a scalar/vector uniform, and using either is an acceptable choice.
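The default-filling rule described above can be written out as a small C sketch. It only models the values that end up in the generic attribute and makes no OpenGL ES calls; the function name is hypothetical.

```c
/* Expand 1 to 4 supplied components into the full (x, y, z, w) value
   held by a generic vertex attribute. Components that are not
   supplied take the defaults (0, 0, 0, 1). */
static void expandConstantAttrib(const float *values, int count, float out[4])
{
    int i;

    out[0] = 0.0f;
    out[1] = 0.0f;
    out[2] = 0.0f;
    out[3] = 1.0f;

    for (i = 0; i < count && i < 4; ++i)
    {
        out[i] = values[i];
    }
}
```

Passing two values, as glVertexAttrib2fv does, therefore gives (x, y, 0.0, 1.0), which is why a two-component texture coordinate or a three-component position can feed a vec4 input in the shader.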

Vertex Arrays

Vertex arrays specify attribute data per vertex and are buffers stored in the application’s address space (what OpenGL ES calls the client space). They serve as the basis for vertex buffer objects that provide an efficient and flexible way for specifying vertex attribute data. Vertex arrays are specified using the glVertexAttribPointer or glVertexAttribIPointer function.

void glVertexAttribPointer(GLuint index, GLint size,

GLenum type,

GLboolean normalized,

GLsizei stride,

const void *ptr)

void glVertexAttribIPointer(GLuint index, GLint size,

GLenum type,

GLsizei stride,

const void *ptr)

index        specifies the generic vertex attribute index. This value can range from 0 to the maximum vertex attributes supported minus 1.

size         number of components specified in the vertex array for the vertex attribute referenced by the index. Valid values are 1–4.

type         data format. Valid values for both functions are
                 GL_BYTE
                 GL_UNSIGNED_BYTE
                 GL_SHORT
                 GL_UNSIGNED_SHORT
                 GL_INT
                 GL_UNSIGNED_INT
             Valid values for glVertexAttribPointer also include
                 GL_HALF_FLOAT
                 GL_FLOAT
                 GL_FIXED
                 GL_INT_2_10_10_10_REV
                 GL_UNSIGNED_INT_2_10_10_10_REV

normalized   (glVertexAttribPointer only) is used to indicate whether the non-floating data format type should be normalized when converted to floating-point values. For glVertexAttribIPointer, the values are treated as integers.

stride       the components of vertex attribute specified by size are stored sequentially for each vertex. stride specifies the delta between data for vertex index I and vertex (I + 1). If stride is 0, attribute data for all vertices are stored sequentially. If stride is greater than 0, then we use the stride value as the pitch to get vertex data for the next index.

ptr          pointer to the buffer holding vertex attribute data if using a client-side vertex array. If using a vertex buffer object, specifies an offset into that buffer.

Next, we present a few examples that illustrate how to specify vertex attributes with glVertexAttribPointer. Two methods are commonly used for allocating and storing vertex attribute data:

• Store vertex attributes together in a single buffer, a method called an array of structures. The structure represents all attributes of a vertex, and we have an array of these attributes per vertex.

• Store each vertex attribute in a separate buffer, a method called a structure of arrays.

Suppose each vertex has four vertex attributes (position, normal, and two texture coordinates) and these attributes are stored together in one buffer that is allocated for all vertices. The vertex position attribute is specified as a vector of three floats (x, y, z), the vertex normal is also specified as a vector of three floats, and each texture coordinate is specified as a vector of two floats. Figure 6-2 gives the memory layout of this buffer. In this case, the stride of the buffer is the combined size of all attributes that make up the vertex (one vertex is equal to 10 floats or 40 bytes: 12 bytes for Position, 12 bytes for Normal, 8 bytes for Tex0, and 8 bytes for Tex1).

[Figure: per-vertex layout, repeated for each vertex: Position (x y z), Normal (x y z), Tex0 (s t), Tex1 (s t)]

Figure 6-2 Position, Normal, and Two Texture Coordinates Stored as an Array
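The same 40-byte layout can be expressed as a C structure, which lets the compiler compute the stride and the byte offsets. This is a sketch under the assumptions that float is 4 bytes and that the compiler inserts no padding between the float arrays; the type and function names are hypothetical.

```c
#include <stddef.h>

/* One vertex in the array-of-structures layout of Figure 6-2. */
typedef struct
{
    float position[3];
    float normal[3];
    float texcoord0[2];
    float texcoord1[2];
} Vertex;

/* Address of one attribute of one vertex: the base pointer, plus
   vertexIndex strides, plus the attribute's byte offset. */
static const void *attribAddress(const Vertex *base, size_t vertexIndex,
                                 size_t byteOffset)
{
    return (const char *) base + vertexIndex * sizeof(Vertex) + byteOffset;
}
```

With this structure, sizeof(Vertex) can serve as the stride argument, and offsetof(Vertex, normal), offsetof(Vertex, texcoord0), and offsetof(Vertex, texcoord1) give the byte offsets 12, 24, and 32 for the remaining attributes.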

Example 6-1 describes how these four vertex attributes are specified with glVertexAttribPointer. Note that we are illustrating how to use client-side vertex arrays here so that we can explain the concept of specifying per-vertex data. We recommend that applications use vertex buffer objects (described later in the chapter) and avoid client-side vertex arrays to achieve best performance. Client-side vertex arrays are supported in OpenGL ES 3.0 only for backward compatibility with OpenGL ES 2.0. In OpenGL ES 3.0, vertex buffer objects are always recommended.

Example 6-1 Array of Structures

    #define VERTEX_POS_SIZE          3 // x, y, and z
    #define VERTEX_NORMAL_SIZE       3 // x, y, and z
    #define VERTEX_TEXCOORD0_SIZE    2 // s and t
    #define VERTEX_TEXCOORD1_SIZE    2 // s and t

    #define VERTEX_POS_INDX          0
    #define VERTEX_NORMAL_INDX       1
    #define VERTEX_TEXCOORD0_INDX    2
    #define VERTEX_TEXCOORD1_INDX    3

    // the following 4 defines are used to determine the locations
    // of various attributes if vertex data are stored as an array
    // of structures
    #define VERTEX_POS_OFFSET        0
    #define VERTEX_NORMAL_OFFSET     3
    #define VERTEX_TEXCOORD0_OFFSET  6
    #define VERTEX_TEXCOORD1_OFFSET  8

    #define VERTEX_ATTRIB_SIZE (VERTEX_POS_SIZE + \
                                VERTEX_NORMAL_SIZE + \
                                VERTEX_TEXCOORD0_SIZE + \
                                VERTEX_TEXCOORD1_SIZE)

    float *p = (float*) malloc(numVertices * VERTEX_ATTRIB_SIZE
                               * sizeof(float));

    // position is vertex attribute 0
    glVertexAttribPointer(VERTEX_POS_INDX, VERTEX_POS_SIZE,
                          GL_FLOAT, GL_FALSE,
                          VERTEX_ATTRIB_SIZE * sizeof(float),
                          p);

    // normal is vertex attribute 1
    glVertexAttribPointer(VERTEX_NORMAL_INDX, VERTEX_NORMAL_SIZE,
                          GL_FLOAT, GL_FALSE,
                          VERTEX_ATTRIB_SIZE * sizeof(float),
                          (p + VERTEX_NORMAL_OFFSET));

    // texture coordinate 0 is vertex attribute 2
    glVertexAttribPointer(VERTEX_TEXCOORD0_INDX,
                          VERTEX_TEXCOORD0_SIZE,
                          GL_FLOAT, GL_FALSE,
                          VERTEX_ATTRIB_SIZE * sizeof(float),
                          (p + VERTEX_TEXCOORD0_OFFSET));

    // texture coordinate 1 is vertex attribute 3
    glVertexAttribPointer(VERTEX_TEXCOORD1_INDX,
                          VERTEX_TEXCOORD1_SIZE,
                          GL_FLOAT, GL_FALSE,
                          VERTEX_ATTRIB_SIZE * sizeof(float),
                          (p + VERTEX_TEXCOORD1_OFFSET));

In Example 6-2, position, normal, and texture coordinates 0 and 1 are stored in separate buffers.

Example 6-2 Structure of Arrays

float *position  = (float*) malloc(numVertices *
                                   VERTEX_POS_SIZE * sizeof(float));
float *normal    = (float*) malloc(numVertices *
                                   VERTEX_NORMAL_SIZE * sizeof(float));
float *texcoord0 = (float*) malloc(numVertices *
                                   VERTEX_TEXCOORD0_SIZE * sizeof(float));
float *texcoord1 = (float*) malloc(numVertices *
                                   VERTEX_TEXCOORD1_SIZE * sizeof(float));

// position is vertex attribute 0
glVertexAttribPointer(VERTEX_POS_INDX, VERTEX_POS_SIZE,
                      GL_FLOAT, GL_FALSE,
                      VERTEX_POS_SIZE * sizeof(float),
                      position);

// normal is vertex attribute 1
glVertexAttribPointer(VERTEX_NORMAL_INDX, VERTEX_NORMAL_SIZE,
                      GL_FLOAT, GL_FALSE,
                      VERTEX_NORMAL_SIZE * sizeof(float),
                      normal);

// texture coordinate 0 is vertex attribute 2
glVertexAttribPointer(VERTEX_TEXCOORD0_INDX,
                      VERTEX_TEXCOORD0_SIZE,
                      GL_FLOAT, GL_FALSE,
                      VERTEX_TEXCOORD0_SIZE * sizeof(float),
                      texcoord0);

// texture coordinate 1 is vertex attribute 3
glVertexAttribPointer(VERTEX_TEXCOORD1_INDX,
                      VERTEX_TEXCOORD1_SIZE,
                      GL_FLOAT, GL_FALSE,
                      VERTEX_TEXCOORD1_SIZE * sizeof(float),
                      texcoord1);


Performance Hints

How to Store Different Attributes of a Vertex

We described the two most common ways of storing vertex attributes:
an array of structures and a structure of arrays. The question to ask is
which allocation method would be the most efficient for OpenGL ES 3.0
hardware implementations. In most cases, the answer is an array of
structures. The reason is that the attribute data for each vertex can be
read in sequential fashion, which will most likely result in an efficient
memory access pattern. A disadvantage of using an array of structures
becomes apparent when an application wants to modify specific
attributes. If a subset of vertex attribute data needs to be modified (e.g.,
texture coordinates), this will result in strided updates to the vertex
buffer. When the vertex buffer is supplied as a buffer object, the entire
vertex attribute buffer will need to be reloaded. You can avoid this
inefficiency by storing vertex attributes that are dynamic in nature in a
separate buffer.
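The element offsets that Example 6-1 hard-codes as macros can also be derived from a C structure, which keeps the stride and offsets in step if the layout changes. The following is a minimal sketch, not code from the sample framework: the Vertex type and both helper functions are hypothetical names, and it assumes a structure of float arrays carries no padding (true on common platforms).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical interleaved vertex; mirrors the layout of Example 6-1. */
typedef struct {
    float position[3];   /* x, y, z */
    float normal[3];     /* x, y, z */
    float texcoord0[2];  /* s, t */
    float texcoord1[2];  /* s, t */
} Vertex;

/* Byte stride between consecutive vertices in an array of structures. */
size_t vertex_stride(void)
{
    return sizeof(Vertex);
}

/* Byte offset of attribute `index` (0 = position ... 3 = texcoord1). */
size_t vertex_attrib_offset(int index)
{
    static const size_t offsets[4] = {
        offsetof(Vertex, position),
        offsetof(Vertex, normal),
        offsetof(Vertex, texcoord0),
        offsetof(Vertex, texcoord1),
    };
    assert(index >= 0 && index < 4);
    return offsets[index];
}
```

Dividing each byte offset by sizeof(float) gives the element offsets 0, 3, 6, and 8 used in Example 6-1, and vertex_stride() corresponds to VERTEX_ATTRIB_SIZE * sizeof(float).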

Which Data Format to Use for Vertex Attributes

The vertex attribute data format specified by the type argument in
glVertexAttribPointer can affect not only the graphics memory
storage requirements for vertex attribute data, but also the overall
performance, which is a function of the memory bandwidth required to
render the frame(s). The smaller the data footprint, the lower the memory
bandwidth required. OpenGL ES 3.0 supports a 16-bit floating-point vertex
format named GL_HALF_FLOAT (described in detail in Appendix A). Our
recommendation is that applications use GL_HALF_FLOAT wherever possible.
Texture coordinates, normals, binormals, tangent vectors, and so on are
good candidates to be stored using GL_HALF_FLOAT for each component.
Color could be stored as GL_UNSIGNED_BYTE with four components per
vertex color. We also recommend GL_HALF_FLOAT for vertex position, but
recognize that this choice might not be feasible for quite a few cases. For
such cases, the vertex position could be stored as GL_FLOAT.
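To make the footprint argument concrete, the arithmetic can be sketched as a small helper. This is an illustration only: vertex_footprint and the BYTES_* constants are hypothetical names, the vertex is assumed to carry a position, a normal, one texture coordinate, and a color, and any alignment padding an implementation might prefer is ignored.

```c
#include <assert.h>
#include <stddef.h>

/* Component sizes in bytes for the formats discussed above. */
enum { BYTES_FLOAT = 4, BYTES_HALF_FLOAT = 2, BYTES_UNSIGNED_BYTE = 1 };

/* Bytes per vertex for position + normal + texcoord + color, given the
   per-component size chosen for each attribute. */
size_t vertex_footprint(size_t pos_bytes, size_t normal_bytes,
                        size_t texcoord_bytes, size_t color_bytes)
{
    assert(pos_bytes && normal_bytes && texcoord_bytes && color_bytes);
    return 3 * pos_bytes        /* x, y, z    */
         + 3 * normal_bytes     /* x, y, z    */
         + 2 * texcoord_bytes   /* s, t       */
         + 4 * color_bytes;     /* r, g, b, a */
}
```

With every component stored as GL_FLOAT the vertex occupies 48 bytes. Keeping the position as GL_FLOAT but using GL_HALF_FLOAT for the normal and texture coordinate and GL_UNSIGNED_BYTE for the color brings it down to 26 bytes.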

How the Normalized Flag in glVertexAttribPointer Works

Vertex attributes are internally stored as a single-precision floating-point
number before being used in a vertex shader. If the data type indicates
that the vertex attribute is not a float, then the vertex attribute will be
converted to a single-precision floating-point number before it is used
in a vertex shader. The normalized flag controls the conversion of the
non-float vertex attribute data to a single-precision floating-point value.
If the normalized flag is false, the vertex data are converted directly to a
floating-point value. This would be similar to casting the variable that is
not a float type to float. The following code gives an example:

GLfloat f;

GLbyte b;

f = (GLfloat)b; // f represents values in the range [-128.0,

// 127.0]

If the normalized flag is true, the vertex data are mapped to the [-1.0, 1.0]
range if the data type is GL_BYTE or GL_SHORT, or to the [0.0, 1.0] range
if the data type is GL_UNSIGNED_BYTE or GL_UNSIGNED_SHORT. GL_FIXED
data are 16.16 fixed-point values and are divided by 2^16 whether or not
the flag is set. Table 6-1 describes conversion of non-floating-point data
types with the normalized flag set. The value c in the second column of
Table 6-1 refers to a value of the format specified in the first column.

Table 6-1 Data Conversions

Vertex Data Format      Conversion to Floating Point
GL_BYTE                 max(c / (2^7 - 1), -1.0)
GL_UNSIGNED_BYTE        c / (2^8 - 1)
GL_SHORT                max(c / (2^15 - 1), -1.0)
GL_UNSIGNED_SHORT       c / (2^16 - 1)
GL_FIXED                c / 2^16
GL_FLOAT                c
GL_HALF_FLOAT           c
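The normalized conversions can be written out as ordinary C for checking values by hand. This sketch follows the OpenGL ES 3.0 rule that a signed b-bit value is divided by 2^(b-1) - 1 and clamped at -1.0, while an unsigned b-bit value is divided by 2^b - 1. The function names are hypothetical and are not part of the OpenGL ES API.

```c
#include <assert.h>
#include <stdint.h>

/* Signed normalized conversion for a `bits`-wide integer (8 for GL_BYTE,
   16 for GL_SHORT): divide by 2^(bits-1) - 1 and clamp at -1.0. */
float normalize_signed(int32_t c, int bits)
{
    assert(bits >= 2 && bits <= 16);
    float f = (float)c / (float)((1 << (bits - 1)) - 1);
    return f < -1.0f ? -1.0f : f;
}

/* Unsigned normalized conversion (8 for GL_UNSIGNED_BYTE, 16 for
   GL_UNSIGNED_SHORT): divide by 2^bits - 1. */
float normalize_unsigned(uint32_t c, int bits)
{
    assert(bits >= 1 && bits <= 16);
    return (float)c / (float)((1u << bits) - 1u);
}

/* GL_FIXED is a 16.16 fixed-point value: divide by 2^16. */
float fixed_to_float(int32_t c)
{
    return (float)c / 65536.0f;
}
```

The clamp matters only for the most negative input: a GL_BYTE value of -128 would otherwise map slightly below -1.0.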

It is also possible to access integer vertex attribute data as integers in the
vertex shader rather than having them be converted to floats. In this case,
the glVertexAttribIPointer function should be used and the vertex
attribute should be declared to be of an integer type in the vertex shader.

Selecting Between a Constant Vertex Attribute or a Vertex Array

The application can enable OpenGL ES to use either the constant data or
data from a vertex array. Figure 6-3 describes how this works in OpenGL ES 3.0.
The commands glEnableVertexAttribArray and glDisableVertexAttribArray
are used to enable and disable a generic vertex attribute
array, respectively. If the vertex attribute array is disabled for a generic
attribute index, the constant vertex attribute data specified for that index
will be used.

Figure 6-3 Selecting Constant or Vertex Array Vertex Attribute
(Diagram: an Enable/Disable switch selects the vertex data for vertex
attribute 0. Enable takes it from the vertex array; disable takes it from
the constant vertex attribute.)

void glEnableVertexAttribArray(GLuint index);

void glDisableVertexAttribArray(GLuint index);

index     specifies the generic vertex attribute index. This value ranges
          from 0 to the maximum vertex attributes supported minus 1.

Example 6-3 illustrates how to draw a triangle where one of the vertex
attributes is constant and the other is specified using a vertex array.

Example 6-3 Using Constant and Vertex Array Attributes

int Init ( ESContext *esContext )

{

UserData *userData = (UserData*) esContext->userData;

const char vShaderStr[] =

"#version 300 es \n"

"layout(location = 0) in vec4 a_color; \n"

"layout(location = 1) in vec4 a_position; \n"

"out vec4 v_color; \n"

"void main() \n"

"{ \n"

" v_color = a_color; \n"

" gl_Position = a_position; \n"

"}";

const char fShaderStr[] =

"#version 300 es \n"

"precision mediump float; \n"

"in vec4 v_color; \n"

"out vec4 o_fragColor; \n"


"void main() \n"

"{ \n"

" o_fragColor = v_color; \n"

"}" ;

GLuint programObject;

// Create the program object

programObject = esLoadProgram ( vShaderStr, fShaderStr );

if ( programObject == 0 )

return GL_FALSE;

// Store the program object

userData->programObject = programObject;

glClearColor ( 0.0f, 0.0f, 0.0f, 0.0f );

return GL_TRUE;

}

void Draw ( ESContext *esContext )

{

UserData *userData = (UserData*) esContext->userData;

GLfloat color[4] = { 1.0f, 0.0f, 0.0f, 1.0f };

// 3 vertices, with (x, y, z) per-vertex

GLfloat vertexPos[3 * 3] =

{

0.0f, 0.5f, 0.0f, // v0

-0.5f, -0.5f, 0.0f, // v1

0.5f, -0.5f, 0.0f // v2

};

glViewport ( 0, 0, esContext->width, esContext->height );

glClear ( GL_COLOR_BUFFER_BIT );

glUseProgram ( userData->programObject );

glVertexAttrib4fv ( 0, color );

glVertexAttribPointer ( 1, 3, GL_FLOAT, GL_FALSE, 0,

vertexPos );

glEnableVertexAttribArray ( 1 );

glDrawArrays ( GL_TRIANGLES, 0, 3 );

glDisableVertexAttribArray ( 1 );

}

Declaring Vertex Attribute Variables inaVertexShader 135

The vertex attribute color used in the code example is a constant
value specified with glVertexAttrib4fv without enabling the vertex
attribute array 0. The vertexPos attribute is specified by using a vertex
array with glVertexAttribPointer and enabling the array with
glEnableVertexAttribArray. The value of color will be the same for all
vertices of the triangle(s) drawn, whereas the vertexPos attribute could
vary for vertices of the triangle(s) drawn.
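The selection between the constant value and the vertex array can be pictured with a small software model. This is only a sketch of the behavior described above, not how any driver is implemented: AttribSlot and fetch_attrib are hypothetical names, and the array is assumed to be tightly packed floats.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical software model of one generic vertex attribute slot. */
typedef struct {
    int          array_enabled; /* models glEnable/DisableVertexAttribArray */
    float        constant[4];   /* models the value set by glVertexAttrib4fv */
    const float *array;         /* models the glVertexAttribPointer data */
    int          size;          /* components per vertex in the array, 1..4 */
} AttribSlot;

/* Writes the four-component value that vertex `v` would see into `out`. */
void fetch_attrib(const AttribSlot *slot, size_t v, float out[4])
{
    assert(slot != NULL && out != NULL);
    if (!slot->array_enabled) {
        /* Array disabled: every vertex receives the same constant. */
        for (int i = 0; i < 4; ++i)
            out[i] = slot->constant[i];
        return;
    }
    assert(slot->array != NULL && slot->size >= 1 && slot->size <= 4);
    /* Components missing from the array default to (0, 0, 0, 1). */
    out[0] = 0.0f; out[1] = 0.0f; out[2] = 0.0f; out[3] = 1.0f;
    for (int i = 0; i < slot->size; ++i)
        out[i] = slot->array[v * (size_t)slot->size + (size_t)i];
}
```

In terms of Example 6-3, attribute 0 (color) behaves like a slot with array_enabled set to 0, and attribute 1 (position) like a slot with array_enabled set to 1 and size set to 3.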

Declaring Vertex Attribute Variables in a Vertex Shader

We have looked at what a vertex attribute is, and considered how to

specify vertex attributes in OpenGL ES. We now discuss how to declare

vertex attribute variables in a vertex shader.

In a vertex shader, a variable is declared as a vertex attribute by using the
in qualifier. Optionally, the attribute variable can also include a layout
qualifier that provides the attribute index. A few example declarations of
vertex attributes are given here:

layout(location = 0) in vec4 a_position;

layout(location = 1) in vec2 a_texcoord;

layout(location = 2) in vec3 a_normal;

The in qualifier can be used only with the data types float, vec2,
vec3, vec4, int, ivec2, ivec3, ivec4, uint, uvec2, uvec3,
uvec4, mat2, mat2x2, mat2x3, mat2x4, mat3, mat3x2, mat3x3, mat3x4,
mat4, mat4x2, mat4x3, and mat4x4. Attribute variables cannot be declared
as arrays or structures. The following example declarations of vertex
attributes are invalid and should result in a compilation error:

in foo_t a_A; // foo_t is a structure

in vec4 a_B[10];

An OpenGL ES 3.0 implementation supports GL_MAX_VERTEX_ATTRIBS

four-component vector vertex attributes. A vertex attribute that is declared

as a scalar, two-component vector, or three-component vector will count as

a single four-component vector attribute. Vertex attributes declared as two-

dimensional, three-dimensional, or four-dimensional matrices will count

as two, three, or four 4-component vector attributes, respectively. Unlike

uniform and vertex shader output/fragment shader input variables, which

are packed automatically by the compiler, attributes do not get packed.

Please consider your choices carefully when declaring vertex attributes

with sizes less than a four-component vector, as the maximum number of

vertex attributes available is a limited resource. It might be better to pack


them together into a single four-component attribute instead of declaring

them as individual vertex attributes in the vertex shader.
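The counting rule is easy to check with a little arithmetic. The sketch below uses two hypothetical helpers: one totals the slots consumed when each attribute is declared on its own, and the other gives a lower bound on the slots needed if the application packs float components into vec4 attributes by hand. The lower bound is optimistic, because an actual packing also has to keep each attribute's components together in a way the shader can unpack.

```c
#include <assert.h>
#include <stddef.h>

/* Slots used when every attribute is declared separately: a scalar or
   vector has one column and costs one slot; a matrix costs one slot per
   column. `columns[i]` is the column count of attribute i. */
int slots_declared_separately(const int *columns, size_t count)
{
    int total = 0;
    assert(columns != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        assert(columns[i] >= 1 && columns[i] <= 4);
        total += columns[i];
    }
    return total;
}

/* Lower bound on slots if the float components are packed by hand into
   four-component attributes. `components[i]` is the number of floats in
   attribute i. */
int slots_if_packed(const int *components, size_t count)
{
    int total = 0;
    assert(components != NULL || count == 0);
    for (size_t i = 0; i < count; ++i) {
        assert(components[i] >= 1 && components[i] <= 16);
        total += components[i];
    }
    return (total + 3) / 4;   /* round up to whole vec4s */
}
```

For example, two vec2 texture coordinates and two float scalars use four slots when declared separately, but their six components fit in two vec4 attributes.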

Variables declared as vertex attributes in a vertex shader are read-only
variables and cannot be modified. The following code should cause a
compilation error:

in vec4 a_pos;

uniform vec4 u_v;

void main()

{

    a_pos = u_v; // error: cannot assign to a_pos because it is read-only

}

An attribute can be declared inside a vertex shader—but if it is not

used, then it is not considered active and does not count against the

limit. If the number of attributes used in a vertex shader is greater than

GL_MAX_VERTEX_ATTRIBS, the vertex shader will fail to link.

Once a program has been successfully linked, we may need to find out
the number of active vertex attributes used by the vertex shader attached
to this program. Note that this step is necessary only if you are not using
input layout qualifiers for attributes. In OpenGL ES 3.0, it is recommended
that you use layout qualifiers; thus you will not need to query this
information after the fact. However, for completeness, the following line
of code shows how to get the number of active vertex attributes:

glGetProgramiv(program, GL_ACTIVE_ATTRIBUTES, &numActiveAttribs);

A detailed description of glGetProgramiv is given in Chapter 4, “Shaders

and Programs.”

The list of active vertex attributes used by a program and their data types

can be queried using the glGetActiveAttrib command.

void glGetActiveAttrib(GLuint program, GLuint index,
                       GLsizei bufsize, GLsizei *length,
                       GLint *size, GLenum *type,
                       GLchar *name)

program   name of a program object that was successfully linked
          previously.
index     specifies the vertex attribute to query and will be a
          value between 0 and GL_ACTIVE_ATTRIBUTES - 1. The
          value of GL_ACTIVE_ATTRIBUTES is determined with
          glGetProgramiv.
bufsize   specifies the maximum number of characters that may be
          written into name, including the null terminator.
length    returns the number of characters written into name,
          excluding the null terminator, if length is not NULL.
size      returns the size of the attribute. This is specified in units of
          the type returned by type. If the variable is not an array,
          size will always be 1. If the variable is an array, then size
          returns the size of the array.
type      returns the type of the attribute. Valid values are
          GL_FLOAT, GL_FLOAT_VEC2, GL_FLOAT_VEC3,
          GL_FLOAT_VEC4, GL_FLOAT_MAT2, GL_FLOAT_MAT3,
          GL_FLOAT_MAT4, GL_FLOAT_MAT2x3, GL_FLOAT_MAT2x4,
          GL_FLOAT_MAT3x2, GL_FLOAT_MAT3x4, GL_FLOAT_MAT4x2,
          GL_FLOAT_MAT4x3, GL_INT, GL_INT_VEC2, GL_INT_VEC3,
          GL_INT_VEC4, GL_UNSIGNED_INT, GL_UNSIGNED_INT_VEC2,
          GL_UNSIGNED_INT_VEC3, GL_UNSIGNED_INT_VEC4
name      name of the attribute variable as declared in the vertex
          shader.

The glGetActiveAttrib call provides information about the attribute
selected by index. As detailed in the description of glGetActiveAttrib,
index must be a value between 0 and GL_ACTIVE_ATTRIBUTES - 1. The
value of GL_ACTIVE_ATTRIBUTES is queried using glGetProgramiv.
An index of 0 selects the first active attribute, and an index of
GL_ACTIVE_ATTRIBUTES - 1 selects the last vertex attribute.

Binding Vertex Attributes to Attribute Variables in a Vertex Shader

We discussed earlier that in a vertex shader, vertex attribute variables
are specified by the in qualifier, the number of active attributes can
be queried using glGetProgramiv, and the list of active attributes
in a program can be queried using glGetActiveAttrib. We also
described how generic attribute indices that range from 0 to
(GL_MAX_VERTEX_ATTRIBS - 1) are used to enable a generic vertex
attribute and specify a constant or per-vertex (i.e., vertex array) value
using the glVertexAttrib* and glVertexAttribPointer commands.
Now we consider how to map this generic attribute index to the
appropriate attribute variable declared in the vertex shader. This mapping
will allow appropriate vertex data to be read into the correct vertex
attribute variable in the vertex shader.

Figure 6-4 describes how generic vertex attributes are specified and bound
to attribute names in a vertex shader.

Figure 6-4 Specifying and Binding Vertex Attributes for Drawing One or More
Primitives
(Diagram: for each generic vertex attribute 0 through n - 1, an
Enable/Disable switch selects either the constant vertex attribute or the
vertex array. On glDrawArrays/glDrawElements, the selected values pass
through the attribute variable to vertex attribute index bindings and
reach attributes 0 through n - 1 in the vertex shader.)

Declaring Vertex Attribute Variables inaVertexShader 139

In OpenGL ES 3.0, three approaches may be used to map a generic vertex
attribute index to an attribute variable name in the vertex shader. These
approaches can be categorized as follows:

• The index can be specified in the vertex shader source code using the
  layout(location = N) qualifier (recommended).
• OpenGL ES 3.0 will bind the generic vertex attribute index to the
  attribute name.
• The application can bind the vertex attribute index to an attribute
  name.

The easiest way to bind attributes to a location is to simply use the
layout(location = N) qualifier; this approach requires the least amount
of code. However, in some cases, the other two options might be more
desirable. The glBindAttribLocation command can be used to bind a
generic vertex attribute index to an attribute variable in a vertex shader.
This binding takes effect the next time the program is linked; it
does not change the bindings used by the currently linked program.

void glBindAttribLocation(GLuint program, GLuint index,
                          const GLchar *name)

program   name of a program object
index     generic vertex attribute index
name      name of the attribute variable

If name was bound previously, its assigned binding is replaced with the
new index. glBindAttribLocation can be called even before a vertex shader
is attached to a program object. As a consequence, this call can be used
to bind any attribute name. Attribute names that do not exist or are not
active in a vertex shader attached to the program object are ignored.

Another option is to let OpenGL ES 3.0 bind the attribute variable name
to a generic vertex attribute index. This binding is performed when the
program is linked. In the linking phase, the OpenGL ES 3.0 implementation
performs the following operation for each attribute variable:

For each attribute variable, check whether a binding has been specified via
glBindAttribLocation. If a binding is specified, the appropriate attribute
index specified is used. If not, the implementation will assign a generic vertex
attribute index.


This assignment is implementation specific; that is, it can vary from one
OpenGL ES 3.0 implementation to another. An application can query the
assigned binding by using the glGetAttribLocation command.

GLint glGetAttribLocation ( GLuint program,

const GLchar *name)

program program object

name name of attribute variable

glGetAttribLocation returns the generic attribute index that was
bound to the attribute variable name when the program object defined by
program was last linked. If name is not an active attribute variable, or if
program is not a valid program object or was not linked successfully, then
-1 is returned, indicating an invalid attribute index.
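The link-time rule (honor any binding the application specified, then assign an index to every remaining attribute) can be modeled in a few lines of C. This is a sketch under stated assumptions rather than what a driver does: unbound attributes take the lowest free index, whereas the real choice is implementation specific, and MAX_ATTRIBS, AttribBinding, assign_locations, and get_location are all hypothetical names.

```c
#include <assert.h>
#include <string.h>

#define MAX_ATTRIBS 16  /* assumed limit; real code queries GL_MAX_VERTEX_ATTRIBS */

typedef struct {
    const char *name;
    int         location;  /* -1 until a binding is specified or assigned */
} AttribBinding;

/* Models link-time assignment: keep application-specified locations, then
   give each unbound attribute the lowest free index (an assumption).
   Returns 0 on success, -1 if the attributes conflict or do not fit. */
int assign_locations(AttribBinding *attribs, int count)
{
    int used[MAX_ATTRIBS] = { 0 };
    assert(attribs != NULL && count >= 0);
    for (int i = 0; i < count; ++i) {
        int loc = attribs[i].location;
        if (loc < 0)
            continue;
        if (loc >= MAX_ATTRIBS || used[loc])
            return -1;
        used[loc] = 1;
    }
    for (int i = 0; i < count; ++i) {
        int loc = 0;
        if (attribs[i].location >= 0)
            continue;
        while (loc < MAX_ATTRIBS && used[loc])
            ++loc;
        if (loc == MAX_ATTRIBS)
            return -1;
        used[loc] = 1;
        attribs[i].location = loc;
    }
    return 0;
}

/* Models glGetAttribLocation: returns -1 for names that are not active. */
int get_location(const AttribBinding *attribs, int count, const char *name)
{
    assert(attribs != NULL && name != NULL);
    for (int i = 0; i < count; ++i)
        if (strcmp(attribs[i].name, name) == 0)
            return attribs[i].location;
    return -1;
}
```

Because the assigned indices can differ between implementations, an application that does not specify locations itself should look them up after linking rather than assume the values this model produces.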

Vertex Buffer Objects

The vertex data specified using vertex arrays are stored in client memory.
This data must be copied from client memory to graphics memory when
a draw call such as glDrawArrays or glDrawElements is made. These
two commands are described in detail in Chapter 7, “Primitive Assembly
and Rasterization.” It would, however, be much better if we did not have
to copy the vertex data on every draw call, but instead could cache the
data in graphics memory. This approach can significantly improve the
rendering performance and also reduce the memory bandwidth and
power consumption requirements, both of which are quite important
for handheld devices. This is where vertex buffer objects can help. Vertex
buffer objects allow OpenGL ES 3.0 applications to allocate and cache
vertex data in high-performance graphics memory and render from this
memory, thereby avoiding resending data every time a primitive is drawn.
Not only the vertex data, but also the element indices that describe
the vertex indices of the primitive and are passed as an argument to
glDrawElements, can be cached.

OpenGL ES 3.0 supports two types of buffer objects that are used for
specifying vertex and primitive data: array buffer objects and element array
buffer objects. The array buffer objects specified by the GL_ARRAY_BUFFER
token are used to create buffer objects that will store vertex data. The
element array buffer objects specified by the GL_ELEMENT_ARRAY_BUFFER
token are used to create buffer objects that will store indices of a primitive.
Other buffer object types in OpenGL ES 3.0 are described elsewhere in this
book: uniform buffers (Chapter 4), transform feedback buffers (Chapter 8),
pixel unpack buffers (Chapter 9), pixel pack buffers (Chapter 11), and
copy buffers (the Copying Buffer Objects section later in this chapter).
For now, we will focus on the buffer objects used for specifying vertex
attributes and element arrays.

Note: To get best performance, we recommend that OpenGL ES 3.0

applications use vertex buffer objects for vertex attribute data and

element indices.

Before we can render using buffer objects, we need to allocate the
buffer objects and upload the vertex data and element indices into
appropriate buffer objects. This is demonstrated by the sample code in
Example 6-4.

Example 6-4 Creating and Binding Vertex Buffer Objects

void initVertexBufferObjects(vertex_t *vertexBuffer,

GLushort *indices,

GLuint numVertices,

GLuint numIndices,

GLuint *vboIds)

{

glGenBuffers(2, vboIds);

glBindBuffer(GL_ARRAY_BUFFER, vboIds[0]);

glBufferData(GL_ARRAY_BUFFER, numVertices *

sizeof(vertex_t), vertexBuffer,

GL_STATIC_DRAW);

// bind buffer object for element indices

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, vboIds[1]);

glBufferData(GL_ELEMENT_ARRAY_BUFFER,

numIndices * sizeof(GLushort),

indices, GL_STATIC_DRAW);

}

The code in Example 6-4 creates two buffer objects: a buffer object to store
the actual vertex attribute data, and a buffer object to store the element
indices that make up the primitive. In this example, the glGenBuffers
command is called to get two unused buffer object names in vboIds.
The unused buffer object names returned in vboIds are then used to
create an array buffer object and an element array buffer object. The array
buffer object is used to store vertex attribute data for vertices of one or
more primitives. The element array buffer object stores the indices of one
or more primitives. The actual array or element data are specified using
glBufferData. Note that GL_STATIC_DRAW is passed as an argument to
glBufferData. This value is used to describe how the buffer is accessed by
the application and will be described later in this section.

void glGenBuffers(GLsizei n, GLuint *buffers)

n         number of buffer object names to return

buffers pointer to an array of n entries, where allocated buffer

objects are returned

glGenBuffers assigns n buffer object names and returns them in

buffers. The buffer object names returned by glGenBuffers are

unsigned integer numbers other than 0. The value 0 is reserved by

OpenGL ES and does not refer to a buffer object. Attempts to modify or

query the buffer object state for buffer object 0 will generate an error.

The glBindBuffer command is used to make a buffer object current. The
first time a buffer object name is bound by calling glBindBuffer,
the buffer object is allocated with the default state; if the allocation is
successful, this allocated object is bound as the current buffer object for
the target.

void glBindBuffer(GLenum target, GLuint buffer)

target can be set to any of the following targets:

GL_ARRAY_BUFFER

GL_ELEMENT_ARRAY_BUFFER

GL_COPY_READ_BUFFER

GL_COPY_WRITE_BUFFER

GL_PIXEL_PACK_BUFFER

GL_PIXEL_UNPACK_BUFFER

GL_TRANSFORM_FEEDBACK_BUFFER

GL_UNIFORM_BUFFER

buffer buffer object to be assigned as the current object to target


Note that glGenBuffers is not required to assign a buffer object name

before it is bound using glBindBuffer. Alternatively, an application can

specify an unused buffer object name with glBindBuffer. However, we

recommend that OpenGL ES applications call glGenBuffers and use

buffer object names returned by glGenBuffers instead of specifying their

own buffer object names.

The state associated with a buffer object can be categorized as follows:

• GL_BUFFER_SIZE. This refers to the size of the buffer object data that
  is specified by glBufferData. The initial value when the buffer object
  is first bound using glBindBuffer is 0.
• GL_BUFFER_USAGE. This is a hint as to how the application will use the
  data stored in the buffer object. It is described in detail in Table 6-2.
  The initial value is GL_STATIC_DRAW.

Table 6-2 Buffer Usage

Buffer Usage Enum   Description
GL_STATIC_DRAW      The buffer object data will be modified once and used
                    many times to draw primitives or specify images.
GL_STATIC_READ      The buffer object data will be modified once and used
                    many times to read data back from OpenGL ES. The
                    data read back from OpenGL ES will be queried for
                    from the application.
GL_STATIC_COPY      The buffer object data will be modified once and used
                    many times to read data back from OpenGL ES. The
                    data read back from OpenGL ES will be used directly as
                    a source to draw primitives or specify images.
GL_DYNAMIC_DRAW     The buffer object data will be modified repeatedly and
                    used many times to draw primitives or specify images.
GL_DYNAMIC_READ     The buffer object data will be modified repeatedly and
                    used many times to read data back from OpenGL ES. The
                    data read back from OpenGL ES will be queried for
                    from the application.
GL_DYNAMIC_COPY     The buffer object data will be modified repeatedly and
                    used many times to read data back from OpenGL ES.
                    The data read back from OpenGL ES will be used
                    directly as a source to draw primitives or specify
                    images.
GL_STREAM_DRAW      The buffer object data will be modified once and used
                    only a few times to draw primitives or specify images.
GL_STREAM_READ      The buffer object data will be modified once and used
                    only a few times to read data back from OpenGL ES.
                    The data read back from OpenGL ES will be queried for
                    from the application.
GL_STREAM_COPY      The buffer object data will be modified once and used
                    only a few times to read data back from OpenGL ES.
                    The data read back from OpenGL ES will be used
                    directly as a source to draw primitives or specify
                    images.

As mentioned earlier, GL_BUFFER_USAGE is a hint to OpenGL ES, not a
guarantee. Therefore, an application could allocate a buffer object data
store with usage set to GL_STATIC_DRAW and frequently modify it.

The vertex array data or element array data storage is created and
initialized using the glBufferData command.

void glBufferData(GLenum target, GLsizeiptr size,

const void *data, GLenum usage)

target can be set to any of the following targets:

GL_ARRAY_BUFFER

GL_ELEMENT_ARRAY_BUFFER

GL_COPY_READ_BUFFER

GL_COPY_WRITE_BUFFER

GL_PIXEL_PACK_BUFFER

GL_PIXEL_UNPACK_BUFFER

GL_TRANSFORM_FEEDBACK_BUFFER

GL_UNIFORM_BUFFER

size size of buffer data store in bytes

data pointer to the buffer data supplied by the application

usage a hint on how the application will use the data stored in

the buffer object (refer to Table 6-2 for details)


glBufferData will reserve appropriate data storage based on the value of size.

The data argument can be a NULL value, indicating that the reserved data store

remains uninitialized. If data is a valid pointer, then the contents of data are

copied to the allocated data store. The contents of the buffer object data store

can be initialized or updated using the glBufferSubData command.

void glBufferSubData( GLenum target, GLintptr offset,

GLsizeiptr size, const void *data)

target can be set to any of the following targets:

GL_ARRAY_BUFFER

GL_ELEMENT_ARRAY_BUFFER

GL_COPY_READ_BUFFER

GL_COPY_WRITE_BUFFER

GL_PIXEL_PACK_BUFFER

GL_PIXEL_UNPACK_BUFFER

GL_TRANSFORM_FEEDBACK_BUFFER

GL_UNIFORM_BUFFER

offset    offset in bytes into the buffer data store at which the
          modification begins
size      number of bytes of the data store that are being modified
data      pointer to the client data that need to be copied into the
          buffer object data storage
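The relationship between glBufferData and glBufferSubData can be illustrated with a CPU-side model of a data store. The sketch below is not OpenGL ES code: ModelBuffer and both functions are hypothetical, and the return codes merely stand in for the errors an implementation would report (GL_OUT_OF_MEMORY for a failed allocation, GL_INVALID_VALUE for a range that falls outside the store).

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a buffer object's data store.
   Initialize to { NULL, 0 } before first use. */
typedef struct {
    unsigned char *store;
    size_t         size;
} ModelBuffer;

/* Models glBufferData: discards the old store, allocates `size` bytes,
   and copies `data` if it is non-NULL (otherwise leaves the store
   uninitialized). Returns 0 on success, -1 if allocation fails. */
int model_buffer_data(ModelBuffer *buf, size_t size, const void *data)
{
    assert(buf != NULL);
    free(buf->store);
    buf->store = NULL;
    buf->size = 0;
    if (size == 0)
        return 0;
    buf->store = malloc(size);
    if (buf->store == NULL)
        return -1;                       /* models GL_OUT_OF_MEMORY */
    buf->size = size;
    if (data != NULL)
        memcpy(buf->store, data, size);
    return 0;
}

/* Models glBufferSubData: overwrites `size` bytes starting at `offset`.
   The range must lie entirely inside the existing store. */
int model_buffer_sub_data(ModelBuffer *buf, size_t offset, size_t size,
                          const void *data)
{
    assert(buf != NULL && (data != NULL || size == 0));
    if (offset > buf->size || size > buf->size - offset)
        return -1;                       /* models GL_INVALID_VALUE */
    if (size > 0)
        memcpy(buf->store + offset, data, size);
    return 0;
}
```

The model shows the key difference: the first call replaces the whole store and sets its size, while the second only overwrites a range of an existing store and can never grow it.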

After the buffer object data store has been initialized or updated using

glBufferData or glBufferSubData, the client data store is no longer

needed and can be released. For static geometry, applications can free the

client data store and reduce the overall system memory consumed by the

application. This might not be possible for dynamic geometry.
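The distinction between static and dynamic geometry is also what the usage hints in Table 6-2 encode: each enum pairs how often the data is modified with how it is consumed. As an illustration, that pairing can be written as a lookup table. The enum types and the function below are hypothetical helpers, and the function returns the enum's name as a string so that the sketch does not depend on the OpenGL ES headers.

```c
#include <assert.h>

/* How often the application modifies and uses the data (Table 6-2). */
typedef enum {
    UPDATE_ONCE_USE_FEW,    /* STREAM  */
    UPDATE_ONCE_USE_MANY,   /* STATIC  */
    UPDATE_REPEATEDLY       /* DYNAMIC */
} UpdatePattern;

/* What the data is used for (Table 6-2). */
typedef enum {
    ACCESS_DRAW,   /* source for drawing or image specification */
    ACCESS_READ,   /* read back and queried by the application  */
    ACCESS_COPY    /* read back and reused directly as a source */
} AccessPattern;

/* Returns the name of the Table 6-2 usage enum for the given pattern. */
const char *usage_hint_name(UpdatePattern update, AccessPattern access)
{
    static const char *const names[3][3] = {
        { "GL_STREAM_DRAW",  "GL_STREAM_READ",  "GL_STREAM_COPY"  },
        { "GL_STATIC_DRAW",  "GL_STATIC_READ",  "GL_STATIC_COPY"  },
        { "GL_DYNAMIC_DRAW", "GL_DYNAMIC_READ", "GL_DYNAMIC_COPY" },
    };
    assert((int)update >= 0 && (int)update <= 2);
    assert((int)access >= 0 && (int)access <= 2);
    return names[update][access];
}
```

Static geometry that is uploaded once and drawn every frame maps to GL_STATIC_DRAW, whereas vertex data rewritten on most frames maps to GL_DYNAMIC_DRAW.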

We now look at drawing primitives with and without buffer objects.

Example 6-5 describes drawing primitives with and without vertex buffer

objects. Notice that the code to set up the vertex attributes is very similar.

In this example, we use the same buffer object for all attributes of a vertex.

When a GL_ARRAY_BUFFER buffer object is used, the pointer argument

in glVertexAttribPointer changes from being a pointer to the actual

data to being an offset in bytes into the vertex buffer store allocated using

glBufferData. Similarly, if a valid GL_ELEMENT_ARRAY_BUFFER object

is used, the indices argument in glDrawElements changes from being

a pointer to the actual element indices to being an offset in bytes to the

element index buffer store allocated using glBufferData.
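Because the last argument is still typed as a pointer, the byte offset has to be cast. A common idiom is to route the cast through uintptr_t so that the conversion is well behaved on platforms where pointers are wider than int. The sketch below is one way to do it: buffer_offset, index_offset, and the ColorVertex structure are hypothetical names, and the structure is assumed to match the interleaved position and color layout used in Example 6-5.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Converts a byte offset into the pointer-typed argument expected by
   glVertexAttribPointer and glDrawElements when a buffer object is bound. */
const void *buffer_offset(size_t bytes)
{
    return (const void *)(uintptr_t)bytes;
}

/* Hypothetical interleaved layout: position followed by color. */
typedef struct {
    float position[3];   /* x, y, z    */
    float color[4];      /* r, g, b, a */
} ColorVertex;

const void *position_offset(void)
{
    return buffer_offset(offsetof(ColorVertex, position));
}

const void *color_offset(void)
{
    return buffer_offset(offsetof(ColorVertex, color));
}

/* Byte offset of the element at `first_index` in a buffer of 16-bit
   indices (the GL_UNSIGNED_SHORT case). */
const void *index_offset(size_t first_index)
{
    assert(first_index <= SIZE_MAX / sizeof(uint16_t));
    return buffer_offset(first_index * sizeof(uint16_t));
}
```

With a buffer object bound, position_offset() and color_offset() would take the place of the client pointers passed to glVertexAttribPointer, and index_offset(0) would take the place of the indices pointer passed to glDrawElements.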


Example 6-5 Drawing with and without Vertex Buffer Objects

#define VERTEX_POS_SIZE 3 // x, y, and z

#define VERTEX_COLOR_SIZE 4 // r, g, b, and a

#define VERTEX_POS_INDX 0

#define VERTEX_COLOR_INDX 1

// vertices - pointer to a buffer that contains vertex

// attribute data

// vtxStride - stride of attribute data / vertex in bytes

// numIndices - number of indices that make up primitives

// drawn as triangles

// indices - pointer to element index buffer

void DrawPrimitiveWithoutVBOs(GLfloat *vertices,

GLint vtxStride,

GLint numIndices,

GLushort *indices)

{

GLfloat *vtxBuf = vertices;

glBindBuffer(GL_ARRAY_BUFFER, 0);

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);

glEnableVertexAttribArray(VERTEX_POS_INDX);

glEnableVertexAttribArray(VERTEX_COLOR_INDX);

glVertexAttribPointer(VERTEX_POS_INDX, VERTEX_POS_SIZE,

GL_FLOAT, GL_FALSE, vtxStride,

vtxBuf);

vtxBuf += VERTEX_POS_SIZE;

glVertexAttribPointer(VERTEX_COLOR_INDX,

VERTEX_COLOR_SIZE, GL_FLOAT,

GL_FALSE, vtxStride, vtxBuf);

glDrawElements(GL_TRIANGLES, numIndices, GL_UNSIGNED_SHORT,

indices);

glDisableVertexAttribArray(VERTEX_POS_INDX);

glDisableVertexAttribArray(VERTEX_COLOR_INDX);

}

void DrawPrimitiveWithVBOs(ESContext *esContext,

GLint numVertices, GLfloat *vtxBuf,

GLint vtxStride, GLint numIndices,

GLushort *indices)


{

UserData *userData = (UserData*) esContext->userData;

GLuint offset = 0;

// vboIds[0] - used to store vertex attribute data

// vboIds[1] - used to store element indices

if ( userData->vboIds[0] == 0 && userData->vboIds[1] == 0 )

{

// Only allocate on the first draw

glGenBuffers(2, userData->vboIds);

glBindBuffer(GL_ARRAY_BUFFER, userData->vboIds[0]);

glBufferData(GL_ARRAY_BUFFER, vtxStride * numVertices,

vtxBuf, GL_STATIC_DRAW);

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER,

userData->vboIds[1]);

glBufferData(GL_ELEMENT_ARRAY_BUFFER,

sizeof(GLushort) * numIndices,

indices, GL_STATIC_DRAW);

}

glBindBuffer(GL_ARRAY_BUFFER, userData->vboIds[0]);

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, userData->vboIds[1]);

glEnableVertexAttribArray(VERTEX_POS_INDX);

glEnableVertexAttribArray(VERTEX_COLOR_INDX);

glVertexAttribPointer(VERTEX_POS_INDX, VERTEX_POS_SIZE,

GL_FLOAT, GL_FALSE, vtxStride,

(const void*)offset);

offset += VERTEX_POS_SIZE * sizeof(GLfloat);

glVertexAttribPointer(VERTEX_COLOR_INDX,

VERTEX_COLOR_SIZE,

GL_FLOAT, GL_FALSE, vtxStride,

(const void*)offset);

glDrawElements(GL_TRIANGLES, numIndices, GL_UNSIGNED_SHORT,

0);

glDisableVertexAttribArray(VERTEX_POS_INDX);

glDisableVertexAttribArray(VERTEX_COLOR_INDX);

glBindBuffer(GL_ARRAY_BUFFER, 0);

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);

}


void Draw ( ESContext *esContext )

{

UserData *userData = (UserData*) esContext->userData;

// 3 vertices, with (x, y, z),(r, g, b, a) per-vertex

GLfloat vertices[3 * (VERTEX_POS_SIZE + VERTEX_COLOR_SIZE)] =

{

-0.5f, 0.5f, 0.0f, // v0

1.0f, 0.0f, 0.0f, 1.0f, // c0

-1.0f, -0.5f, 0.0f, // v1

0.0f, 1.0f, 0.0f, 1.0f, // c1

0.0f, -0.5f, 0.0f, // v2

0.0f, 0.0f, 1.0f, 1.0f, // c2

};

// index buffer data

GLushort indices[3] = { 0, 1, 2 };

glViewport ( 0, 0, esContext->width, esContext->height );

glClear ( GL_COLOR_BUFFER_BIT );

glUseProgram ( userData->programObject );

glUniform1f ( userData->offsetLoc, 0.0f );

DrawPrimitiveWithoutVBOs ( vertices,

sizeof(GLfloat) * (VERTEX_POS_SIZE + VERTEX_COLOR_SIZE),

3, indices );

// offset the vertex positions so both can be seen

glUniform1f ( userData->offsetLoc, 1.0f );

DrawPrimitiveWithVBOs ( esContext, 3, vertices,

sizeof(GLfloat) * (VERTEX_POS_SIZE + VERTEX_COLOR_SIZE),

3, indices );

}

In Example 6-5, we used one buffer object to store all the vertex data. This

demonstrates the array of structures method of storing vertex attributes

described in Example 6-1. It is also possible to have a buffer object for

each vertex attribute—that is, the structure of arrays method of storing

vertex attributes described in Example 6-2. Example 6-6 illustrates how

DrawPrimitiveWithVBOs would look with a separate buffer object for

each vertex attribute.
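To call a per-attribute version of DrawPrimitiveWithVBOs, the application needs one tightly packed array per attribute rather than a single interleaved array. The following is a minimal sketch of how an interleaved position/color array could be split into two separate arrays; SplitInterleaved is a hypothetical helper that is not part of the book's sample code, and plain float stands in for GLfloat so that the sketch compiles without the OpenGL ES headers.

```c
#include <assert.h>
#include <stddef.h>

#define VERTEX_POS_SIZE   3   /* x, y, and z */
#define VERTEX_COLOR_SIZE 4   /* r, g, b, and a */

/* Hypothetical helper (an assumption, not from the sample code):
   copies an interleaved (position, color) vertex array into two
   tightly packed arrays, one per attribute. */
void SplitInterleaved(const float *interleaved, int numVertices,
                      float *positions, float *colors)
{
    const int floatsPerVertex = VERTEX_POS_SIZE + VERTEX_COLOR_SIZE;

    assert(interleaved != NULL && positions != NULL && colors != NULL);
    assert(numVertices >= 0);

    for (int v = 0; v < numVertices; v++) {
        const float *src = interleaved + (size_t)v * floatsPerVertex;

        /* The first three floats of each vertex are the position. */
        for (int k = 0; k < VERTEX_POS_SIZE; k++)
            positions[v * VERTEX_POS_SIZE + k] = src[k];

        /* The next four floats are the color. */
        for (int k = 0; k < VERTEX_COLOR_SIZE; k++)
            colors[v * VERTEX_COLOR_SIZE + k] = src[VERTEX_POS_SIZE + k];
    }
}
```

With the data split this way, the caller would pass the two arrays as the vtxBuf entries and use strides of 3 * sizeof(GLfloat) and 4 * sizeof(GLfloat) for positions and colors, respectively.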

Example 6-6 Drawing with a Buffer Object per Attribute

#define VERTEX_POS_SIZE 3 // x, y, and z

#define VERTEX_COLOR_SIZE 4 // r, g, b, and a

#define VERTEX_POS_INDX 0

#define VERTEX_COLOR_INDX 1

void DrawPrimitiveWithVBOs(ESContext *esContext,

GLint numVertices, GLfloat **vtxBuf,

GLint *vtxStrides, GLint numIndices,

GLushort *indices)

{

UserData *userData = (UserData*) esContext->userData;

// vboIds[0] - used to store vertex position

// vboIds[1] - used to store vertex color

// vboIds[2] - used to store element indices

if ( userData->vboIds[0] == 0 && userData->vboIds[1] == 0 &&

userData->vboIds[2] == 0)

{

// allocate only on the first draw

glGenBuffers(3, userData->vboIds);

glBindBuffer(GL_ARRAY_BUFFER, userData->vboIds[0]);

glBufferData(GL_ARRAY_BUFFER, vtxStrides[0] * numVertices,

vtxBuf[0], GL_STATIC_DRAW);

glBindBuffer(GL_ARRAY_BUFFER, userData->vboIds[1]);

glBufferData(GL_ARRAY_BUFFER, vtxStrides[1] * numVertices,

vtxBuf[1], GL_STATIC_DRAW);

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER,

userData->vboIds[2]);

glBufferData(GL_ELEMENT_ARRAY_BUFFER,

sizeof(GLushort) * numIndices,

indices, GL_STATIC_DRAW);

}

glBindBuffer(GL_ARRAY_BUFFER, userData->vboIds[0]);

glEnableVertexAttribArray(VERTEX_POS_INDX);

glVertexAttribPointer(VERTEX_POS_INDX, VERTEX_POS_SIZE,

GL_FLOAT, GL_FALSE, vtxStrides[0], 0);

glBindBuffer(GL_ARRAY_BUFFER, userData->vboIds[1]);

glEnableVertexAttribArray(VERTEX_COLOR_INDX);

glVertexAttribPointer(VERTEX_COLOR_INDX,

VERTEX_COLOR_SIZE,

GL_FLOAT, GL_FALSE, vtxStrides[1], 0);

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, userData->vboIds[2]);

glDrawElements(GL_TRIANGLES, numIndices,
GL_UNSIGNED_SHORT, 0);

glDisableVertexAttribArray(VERTEX_POS_INDX);
glDisableVertexAttribArray(VERTEX_COLOR_INDX);

glBindBuffer(GL_ARRAY_BUFFER, 0);
glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, 0);
}

After the application has finished using the buffer objects, they can be
deleted using the glDeleteBuffers command.

void glDeleteBuffers(GLsizei n, const GLuint *buffers)

n        number of buffer objects to be deleted
buffers  array of n entries that contain the buffer objects to be deleted

glDeleteBuffers deletes the buffer objects specified in buffers. Once
a buffer object has been deleted, it can be reused as a new buffer object
that stores vertex attributes or element indices for a different primitive.

As you can see from these examples, using vertex buffer objects
is very easy and requires very little extra work to implement over
vertex arrays. The minimal extra work involved in supporting
vertex buffer objects is well worth it, considering the performance
gain this feature provides. In the next chapter, we discuss how
to draw primitives using commands such as glDrawArrays and
glDrawElements, and how the primitive assembly and rasterization
pipeline stages in OpenGL ES 3.0 work.

Vertex Array Objects

So far, we have covered how to load vertex attributes in two different
ways: using client vertex arrays and using vertex buffer objects. Vertex
buffer objects are preferred to client vertex arrays because they can reduce
the amount of data copied between the CPU and GPU and, therefore,
have better performance. In OpenGL ES 3.0, a new feature was introduced
to make using vertex arrays even more efficient: vertex array objects
(VAOs). As we have seen, setting up drawing using vertex buffer objects
can require many calls to glBindBuffer, glVertexAttribPointer, and
glEnableVertexAttribArray. To make it faster to switch between vertex
array configurations, OpenGL ES 3.0 introduced vertex array objects. VAOs
provide a single object that contains all of the state required to switch
between vertex array/vertex buffer object configurations.

In fact, there is always a vertex array object that is active in OpenGL

ES3.0. All of the examples so far in this chapter have operated on the

default vertex array object (the default VAO has the ID of 0). To create a

new vertex array object, you use the glGenVertexArrays function.

void glGenVertexArrays(GLsizei n, GLuint *arrays)

n       number of vertex array object names to return
arrays  pointer to an array of n entries, where allocated vertex
        array objects are returned

Once created, the vertex array object can be bound for use using
glBindVertexArray.

void glBindVertexArray(GLuint array)

array   object to be assigned as the current vertex array object

Each VAO contains a full state vector that describes all of the vertex

buffer bindings and vertex client state enables. When the VAO is

bound, its state vector provides the current settings of the vertex buffer

state. After binding the vertex array object using glBindVertexArray,

subsequent calls that change the vertex array state (glBindBuffer,

glVertexAttribPointer, glEnableVertexAttribArray, and

glDisableVertexAttribArray) will affect the new VAO.

In this way, an application can quickly switch between vertex array

configurations by binding a vertex array object that has been set with

state. Rather than having to make many calls to change the vertex array

state, all of the changes can be made in a single function call. Example 6-7

demonstrates the use of a vertex array object at initialization time to set

up the vertex array state. The vertex array state is then set in a single

function call at draw time using glBindVertexArray.

Example 6-7 Drawing with a Vertex Array Object

#define VERTEX_POS_SIZE 3 // x, y, and z

#define VERTEX_COLOR_SIZE 4 // r, g, b, and a

#define VERTEX_POS_INDX 0

#define VERTEX_COLOR_INDX 1

#define VERTEX_STRIDE ( sizeof(GLfloat) * \

( VERTEX_POS_SIZE + \

VERTEX_COLOR_SIZE ) )

int Init ( ESContext *esContext )

{

UserData *userData = (UserData*) esContext->userData;

const char vShaderStr[] =

"#version 300 es \n"

"layout(location = 0) in vec4 a_position; \n"

"layout(location = 1) in vec4 a_color; \n"

"out vec4 v_color; \n"

"void main() \n"

"{ \n"

" v_color = a_color; \n"

" gl_Position = a_position; \n"

"}";

const char fShaderStr[] =

"#version 300 es \n"

"precision mediump float; \n"

"in vec4 v_color; \n"

"out vec4 o_fragColor; \n"

"void main() \n"

"{ \n"

" o_fragColor = v_color; \n"

"}" ;

GLuint programObject;

// 3 vertices, with (x, y, z),(r, g, b, a) per-vertex

GLfloat vertices[3 * (VERTEX_POS_SIZE + VERTEX_COLOR_SIZE)] =

{

0.0f, 0.5f, 0.0f, // v0

1.0f, 0.0f, 0.0f, 1.0f, // c0

-0.5f, -0.5f, 0.0f, // v1

0.0f, 1.0f, 0.0f, 1.0f, // c1

0.5f, -0.5f, 0.0f, // v2

0.0f, 0.0f, 1.0f, 1.0f, // c2

};

// Index buffer data

GLushort indices[3] = { 0, 1, 2 };

// Create the program object

programObject = esLoadProgram ( vShaderStr, fShaderStr );

if ( programObject == 0 )

return GL_FALSE;

// Store the program object

userData->programObject = programObject;

// Generate VBO Ids and load the VBOs with data

glGenBuffers ( 2, userData->vboIds );

glBindBuffer ( GL_ARRAY_BUFFER, userData->vboIds[0] );

glBufferData ( GL_ARRAY_BUFFER, sizeof(vertices),

vertices, GL_STATIC_DRAW);

glBindBuffer ( GL_ELEMENT_ARRAY_BUFFER, userData->vboIds[1]);

glBufferData ( GL_ELEMENT_ARRAY_BUFFER, sizeof ( indices ),

indices, GL_STATIC_DRAW );

// Generate VAO ID

glGenVertexArrays ( 1, &userData->vaoId );

// Bind the VAO and then set up the vertex

// attributes

glBindVertexArray ( userData->vaoId );

glBindBuffer(GL_ARRAY_BUFFER, userData->vboIds[0]);

glBindBuffer(GL_ELEMENT_ARRAY_BUFFER, userData->vboIds[1]);

glEnableVertexAttribArray(VERTEX_POS_INDX);

glEnableVertexAttribArray(VERTEX_COLOR_INDX);

glVertexAttribPointer ( VERTEX_POS_INDX, VERTEX_POS_SIZE,

GL_FLOAT, GL_FALSE, VERTEX_STRIDE, (const void*) 0 );

glVertexAttribPointer ( VERTEX_COLOR_INDX, VERTEX_COLOR_SIZE,

GL_FLOAT, GL_FALSE, VERTEX_STRIDE,

(const void*) ( VERTEX_POS_SIZE * sizeof(GLfloat) ) );

// Reset to the default VAO

glBindVertexArray ( 0 );

glClearColor ( 0.0f, 0.0f, 0.0f, 0.0f );

return GL_TRUE;

}

void Draw ( ESContext *esContext )
{
    UserData *userData = (UserData*) esContext->userData;

    glViewport ( 0, 0, esContext->width, esContext->height );
    glClear ( GL_COLOR_BUFFER_BIT );
    glUseProgram ( userData->programObject );

    // Bind the VAO
    glBindVertexArray ( userData->vaoId );

    // Draw with the VAO settings
    glDrawElements ( GL_TRIANGLES, 3, GL_UNSIGNED_SHORT,
                     (const void*) 0 );

    // Return to the default VAO
    glBindVertexArray ( 0 );
}

When an application is finished with one or more vertex array objects,
they can be deleted using glDeleteVertexArrays.

void glDeleteVertexArrays(GLsizei n, const GLuint *arrays)

n       number of vertex array objects to be deleted
arrays  array of n entries that contain the vertex array objects to
        be deleted

Mapping Buffer Objects

So far, we have shown how to load data into buffer objects using

glBufferData or glBufferSubData. It is also possible for applications to

map and unmap a buffer object’s data storage into the application’s address

space. There are several reasons why an application might prefer to map a

buffer rather than load its data using glBufferData or glBufferSubData:

• Mapping the buffer can reduce the memory utilization of the application

because potentially only a single copy of the data needs to be stored.

• On architectures with shared memory, mapping the buffer returns a

direct pointer into the address space where the buffer will be stored for

the GPU. By mapping the buffer, the application can avoid the copy

step, thereby realizing better performance on updates.

The glMapBufferRange command returns a pointer to all of or a portion
(range) of the data storage for the buffer object. This pointer can be used
by the application to read or update the contents of the buffer object. The
glUnmapBuffer command is used to indicate that the updates have been
completed and to release the mapped pointer.

void *glMapBufferRange(GLenum target, GLintptr offset,
                       GLsizeiptr length, GLbitfield access)

target   can be set to any of the following targets:
             GL_ARRAY_BUFFER
             GL_ELEMENT_ARRAY_BUFFER
             GL_COPY_READ_BUFFER
             GL_COPY_WRITE_BUFFER
             GL_PIXEL_PACK_BUFFER
             GL_PIXEL_UNPACK_BUFFER
             GL_TRANSFORM_FEEDBACK_BUFFER
             GL_UNIFORM_BUFFER
offset   offset in bytes into the buffer data store
length   number of bytes of the buffer data to map
access   a bitfield combination of access flags. The application
         must specify at least one of the following flags:
             GL_MAP_READ_BIT
                 The application will read from the returned pointer.
             GL_MAP_WRITE_BIT
                 The application will write to the returned pointer.
         Additionally, the application may include the following
         optional access flags:
             GL_MAP_INVALIDATE_RANGE_BIT
                 Indicates that the contents of the buffer within the
                 specified range can be discarded by the driver before
                 returning the pointer. This flag cannot be used in
                 combination with GL_MAP_READ_BIT.
             GL_MAP_INVALIDATE_BUFFER_BIT
                 Indicates that the contents of the entire buffer can be
                 discarded by the driver before returning the pointer.
                 This flag cannot be used in combination with
                 GL_MAP_READ_BIT.
             GL_MAP_FLUSH_EXPLICIT_BIT
                 Indicates that the application will explicitly flush
                 operations to subranges of the mapped range using
                 glFlushMappedBufferRange. This flag can only be used
                 in combination with GL_MAP_WRITE_BIT.
             GL_MAP_UNSYNCHRONIZED_BIT
                 Indicates that the driver does not need to wait for
                 pending operations on the buffer object before returning
                 a pointer to the buffer range. If there are pending
                 operations, the results of outstanding operations and
                 any future operations on the buffer object become
                 undefined.

glMapBufferRange returns a pointer to the buffer data storage range
requested. If an error occurs or an invalid request is made, the function
will return NULL. The glUnmapBuffer command unmaps a previously
mapped buffer.

GLboolean glUnmapBuffer(GLenum target)

target   the target whose bound buffer was mapped, such as
         GL_ARRAY_BUFFER or GL_ELEMENT_ARRAY_BUFFER (both are
         used in Example 6-8)

glUnmapBuffer returns GL_TRUE if the unmap operation is successful.
The pointer returned by glMapBufferRange can no longer be used after
a successful unmap has been performed. glUnmapBuffer returns
GL_FALSE if the data in the vertex buffer object's data storage have become
corrupted after the buffer has been mapped. This can occur due to a change
in the screen resolution, multiple screens being used by the OpenGL ES
context, or an out-of-memory event that causes the mapped memory to be
discarded.[1]

[1] If the screen resolution changes to a larger width, height, and bits per
pixel at runtime, the mapped memory may have to be released. Note that this
is not a very common issue on handheld devices. A backing store is rarely
implemented on most handheld and embedded devices. Therefore, an
out-of-memory event will result in memory being freed and becoming
available for reuse for critical needs.

The code in Example 6-8 demonstrates the use of glMapBufferRange and
glUnmapBuffer to write the contents of vertex buffer objects.

Example 6-8 Mapping a Buffer Object for Writing

GLfloat *vtxMappedBuf;
GLushort *idxMappedBuf;

glGenBuffers ( 2, userData->vboIds );

glBindBuffer ( GL_ARRAY_BUFFER, userData->vboIds[0] );
glBufferData ( GL_ARRAY_BUFFER, vtxStride * numVertices,
               NULL, GL_STATIC_DRAW );

vtxMappedBuf = (GLfloat*)
    glMapBufferRange ( GL_ARRAY_BUFFER, 0,
                       vtxStride * numVertices,
                       GL_MAP_WRITE_BIT |
                       GL_MAP_INVALIDATE_BUFFER_BIT );
if ( vtxMappedBuf == NULL )
{
    esLogMessage( "Error mapping vertex buffer object." );
    return;
}

// Copy the data into the mapped buffer
memcpy ( vtxMappedBuf, vtxBuf, vtxStride * numVertices );

// Unmap the buffer
if ( glUnmapBuffer( GL_ARRAY_BUFFER ) == GL_FALSE )
{
    esLogMessage( "Error unmapping array buffer object." );
    return;
}

// Map the index buffer
glBindBuffer ( GL_ELEMENT_ARRAY_BUFFER,
               userData->vboIds[1] );
glBufferData ( GL_ELEMENT_ARRAY_BUFFER,
               sizeof(GLushort) * numIndices,
               NULL, GL_STATIC_DRAW );

idxMappedBuf = (GLushort*)
    glMapBufferRange ( GL_ELEMENT_ARRAY_BUFFER, 0,
                       sizeof(GLushort) * numIndices,
                       GL_MAP_WRITE_BIT |
                       GL_MAP_INVALIDATE_BUFFER_BIT );
if ( idxMappedBuf == NULL )
{
    esLogMessage( "Error mapping element buffer object." );
    return;
}

// Copy the data into the mapped buffer
memcpy ( idxMappedBuf, indices,
         sizeof(GLushort) * numIndices );

// Unmap the buffer
if ( glUnmapBuffer( GL_ELEMENT_ARRAY_BUFFER ) == GL_FALSE )
{
    esLogMessage( "Error unmapping element buffer object." );
    return;
}

Flushing a Mapped Buffer

An application may wish to map a range (or all) of a buffer object using
glMapBufferRange, but update only discrete subregions of the mapped
range. To avoid the potential performance penalty for flushing the entire
mapped range when calling glUnmapBuffer, the application can map
with the GL_MAP_FLUSH_EXPLICIT_BIT access flag (along with
GL_MAP_WRITE_BIT). When the application has finished updating a portion
of the mapped range, it can indicate this fact using
glFlushMappedBufferRange.
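The combination rules for the glMapBufferRange access flags can be checked in a few lines of code. The sketch below is only an illustration: IsValidMapAccess is a hypothetical helper, the MAP_* constants are local stand-ins for the corresponding GL_MAP_* bits so that the code compiles without the OpenGL ES headers, and only the rules stated for the read, write, invalidate, and explicit-flush flags are encoded (GL_MAP_UNSYNCHRONIZED_BIT is left out).

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for the GL_MAP_* access bits (an assumption made so
   the sketch is self-contained; real code would use the GL constants). */
enum {
    MAP_READ_BIT              = 0x0001,
    MAP_WRITE_BIT             = 0x0002,
    MAP_INVALIDATE_RANGE_BIT  = 0x0004,
    MAP_INVALIDATE_BUFFER_BIT = 0x0008,
    MAP_FLUSH_EXPLICIT_BIT    = 0x0010
};

/* Hypothetical helper: returns true if 'access' obeys the rules
   described for glMapBufferRange in this section. */
bool IsValidMapAccess(unsigned int access)
{
    bool reads  = (access & MAP_READ_BIT)  != 0;
    bool writes = (access & MAP_WRITE_BIT) != 0;

    /* At least one of read or write must be requested. */
    if (!reads && !writes)
        return false;

    /* The invalidate flags cannot be combined with reading. */
    if (reads && (access & (MAP_INVALIDATE_RANGE_BIT |
                            MAP_INVALIDATE_BUFFER_BIT)) != 0)
        return false;

    /* Explicit flushing only makes sense for a writable mapping. */
    if ((access & MAP_FLUSH_EXPLICIT_BIT) != 0 && !writes)
        return false;

    return true;
}

/* Small usage check: the flags used in Example 6-8 are accepted,
   while read + invalidate is rejected. */
void CheckExampleFlags(void)
{
    assert(IsValidMapAccess(MAP_WRITE_BIT | MAP_INVALIDATE_BUFFER_BIT));
    assert(!IsValidMapAccess(MAP_READ_BIT | MAP_INVALIDATE_BUFFER_BIT));
}
```

A check like this is mainly useful in debug builds; the driver performs its own validation and glMapBufferRange returns NULL for an invalid request.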

void glFlushMappedBufferRange(GLenum target, GLintptr offset,
                              GLsizeiptr length)

target   can be set to any of the following targets:
             GL_ARRAY_BUFFER
             GL_ELEMENT_ARRAY_BUFFER
             GL_COPY_READ_BUFFER
             GL_COPY_WRITE_BUFFER
             GL_PIXEL_PACK_BUFFER
             GL_PIXEL_UNPACK_BUFFER
             GL_TRANSFORM_FEEDBACK_BUFFER
             GL_UNIFORM_BUFFER
offset   offset in bytes from the beginning of the mapped buffer
length   number of bytes of the buffer from offset to flush

If an application maps with GL_MAP_FLUSH_EXPLICIT_BIT but does not
explicitly flush a modified region with glFlushMappedBufferRange,
its contents will be undefined.

Copying Buffer Objects

So far, we have shown how to load buffer objects with data using
glBufferData, glBufferSubData, and glMapBufferRange. All of these
techniques involve transferring data from the application to the device. It
is also possible with OpenGL ES 3.0 to copy data from one buffer object
to another entirely on the device. This can be done using the function
glCopyBufferSubData.

void glCopyBufferSubData(GLenum readtarget, GLenum writetarget,
                         GLintptr readoffset, GLintptr writeoffset,
                         GLsizeiptr size)

readtarget   the buffer object target to read from.
writetarget  the buffer object target to write to. Both readtarget
             and writetarget can be set to any of the following
             targets (although they must not be the same target):
                 GL_ARRAY_BUFFER
                 GL_ELEMENT_ARRAY_BUFFER
                 GL_COPY_READ_BUFFER
                 GL_COPY_WRITE_BUFFER
                 GL_PIXEL_PACK_BUFFER
                 GL_PIXEL_UNPACK_BUFFER
                 GL_TRANSFORM_FEEDBACK_BUFFER
                 GL_UNIFORM_BUFFER
readoffset   offset in bytes into the read buffer data to copy from.
writeoffset  offset in bytes into the write buffer data to copy to.
size         the number of bytes to copy from the read buffer
             data to the write buffer data.

Calling glCopyBufferSubData will copy the specified bytes from
the buffer bound to the readtarget to the writetarget. The buffer
binding is determined based on the last call to glBindBuffer for
each target. Any type of buffer object (array, element array, transform
feedback, and so on) can be bound to the GL_COPY_READ_BUFFER or
GL_COPY_WRITE_BUFFER target. These two targets are provided as a
convenience so that the application doesn't have to change any of the
true buffer bindings to perform a copy between buffers.

Summary

This chapter explored how vertex attributes and data are specified in
OpenGL ES 3.0. Specifically, it covered the following topics:

• How to specify constant vertex attributes using the glVertexAttrib*
  functions and vertex arrays using the glVertexAttrib[I]Pointer
  functions

• How to create and store vertex attribute and element data in vertex
  buffer objects

• How vertex array state is encapsulated in vertex array objects and how
  to use VAOs to improve performance

• The variety of methods for loading buffer objects with data:
  glBuffer[Sub]Data, glMapBufferRange, and glCopyBufferSubData

Now that we know how vertex data are specified, the next chapter covers
all of the primitives that can be drawn in OpenGL ES using vertex data.

Chapter 7

Primitive Assembly and Rasterization

This chapter describes the types of primitives and geometric objects that

are supported by OpenGL ES, and explains how to draw them. It then

describes the primitive assembly stage, which occurs after the vertices of

a primitive are processed by the vertex shader. In the primitive assembly

stage, clipping, perspective divide, and viewport transformation operations

are performed. These operations are discussed in detail. The chapter

concludes with a description of the rasterization stage. Rasterization is the

process that converts primitives into a set of two-dimensional fragments,

which are processed by the fragment shader. These two-dimensional

fragments represent pixels that may be drawn on the screen.

Refer to Chapter 8, “Vertex Shaders,” for a detailed description of vertex

shaders. Chapter 9, “Texturing,” and Chapter 10, “Fragment Shaders,”

describe processing that is applied to fragments generated by the

rasterization stage.

Primitives

A primitive is a geometric object that can be drawn using the

glDrawArrays, glDrawElements, glDrawRangeElements,

glDrawArraysInstanced, and glDrawElementsInstanced commands

in OpenGL ES. The primitive is described by a set of vertices that indicate

the vertex position. Other information, such as color, texture coordinates,

and geometric normal can also be associated with each vertex as generic

attributes.

The following primitives can be drawn in OpenGL ES 3.0:

• Triangles

• Lines

• Point sprites

Triangles

Triangles represent the most common method used to describe a

geometryobject rendered by a 3D application. The triangle primitives

supported byOpenGL ES are GL_TRIANGLES, GL_TRIANGLE_STRIP, and

GL_TRIANGLE_FAN. Figure 7-1 shows examples of supported triangle

primitive types.

GL_TRIANGLES draws a series of separate triangles. In Figure 7-1, two

triangles given by vertices (v0, v1, v2) and (v3, v4, v5) are drawn. A total

of n/3 triangles are drawn, where n is the number of indices specified as

count in glDraw*** APIs mentioned previously.

GL_TRIANGLE_STRIP draws a series of connected triangles. In the example

shown in Figure 7-1, three triangles are drawn given by (v0, v1, v2), (v2, v1, v3)

(note the order), and (v2, v3, v4). A total of (n – 2) triangles are drawn, where

n is the number of indices specified as count in glDraw*** APIs.

[Figure 7-1 Triangle Primitive Types: panels GL_TRIANGLES, GL_TRIANGLE_STRIP, and GL_TRIANGLE_FAN]

GL_TRIANGLE_FAN also draws a series of connected triangles. In the

example shown in Figure 7-1, the triangles drawn are (v0, v1, v2), (v0, v2, v3),

and (v0, v3, v4). A total of (n – 2) triangles are drawn, where n is the number

of indices specified as count in glDraw*** APIs.

Lines

The line primitives supported by OpenGL ES are GL_LINES, GL_LINE_STRIP,

and GL_LINE_LOOP. Figure 7-2 shows examples of supported line primitive

types.

[Figure 7-2 Line Primitive Types: panels GL_LINES, GL_LINE_STRIP, and GL_LINE_LOOP]

GL_LINES draws a series of unconnected line segments. In the example

shown in Figure 7-2, three individual lines are drawn given by (v0, v1),

(v2, v3), and (v4, v5). A total of n/2 segments are drawn, where n is the

number of indices specified as count in glDraw*** APIs.

GL_LINE_STRIP draws a series of connected line segments. In the example

shown in Figure 7-2, three line segments are drawn given by (v0, v1),

(v1, v2), and (v2, v3). A total of (n – 1) line segments are drawn, where n is

the number of indices specified as count in glDraw*** APIs.

GL_LINE_LOOP works similarly to GL_LINE_STRIP, except that a final line

segment is drawn from v(n-1) to v0. In the example shown in Figure 7-2,

the line segments drawn are (v0, v1), (v1, v2), (v2, v3), (v3, v4), and (v4, v0).

A total of n line segments are drawn, where n is the number of indices

specified as count in glDraw*** APIs.
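The primitive counts given above for the triangle and line modes can be collected into one small function. This is a sketch for illustration only: PrimitiveCount and the MODE_* names are hypothetical stand-ins (not the GL enums), and returning 0 when there are too few indices to form a single primitive is an assumption rather than something stated in the text.

```c
#include <assert.h>

/* Stand-in names for the GL primitive modes, so that the sketch
   compiles without the OpenGL ES headers. */
typedef enum {
    MODE_TRIANGLES,
    MODE_TRIANGLE_STRIP,
    MODE_TRIANGLE_FAN,
    MODE_LINES,
    MODE_LINE_STRIP,
    MODE_LINE_LOOP
} PrimitiveMode;

/* Hypothetical helper: number of primitives drawn when 'count'
   indices are passed to one of the glDraw*** calls. */
int PrimitiveCount(PrimitiveMode mode, int count)
{
    assert(count >= 0);

    switch (mode) {
    case MODE_TRIANGLES:      return count / 3;                    /* n/3   */
    case MODE_TRIANGLE_STRIP: /* fall through */
    case MODE_TRIANGLE_FAN:   return count >= 3 ? count - 2 : 0;   /* n - 2 */
    case MODE_LINES:          return count / 2;                    /* n/2   */
    case MODE_LINE_STRIP:     return count >= 2 ? count - 1 : 0;   /* n - 1 */
    case MODE_LINE_LOOP:      return count >= 2 ? count : 0;       /* n     */
    }
    return 0;
}
```

For the five-vertex strip and fan in Figure 7-1, the function returns 3 in both cases, matching the three triangles listed for each.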

The width of a line can be specified using the glLineWidth API call.

void glLineWidth(GLfloat width)

width   specifies the width of the line in pixels; the default
        width is 1.0

The width specified by glLineWidth will be clamped to the line width
range supported by the OpenGL ES 3.0 implementation. In addition, the
width specified will be remembered by OpenGL until updated by the
application. The supported line width range can be queried using the
following command. There is no requirement for lines with widths greater
than 1 to be supported.

GLfloat lineWidthRange[2];
glGetFloatv ( GL_ALIASED_LINE_WIDTH_RANGE, lineWidthRange );

Point Sprites

The point sprite primitive supported by OpenGL ES is GL_POINTS. A point
sprite is drawn for each vertex specified. Point sprites are typically used for
rendering particle effects efficiently by drawing them as points instead of
quads. A point sprite is a screen-aligned quad specified as a position and a
radius. The position describes the center of the square, and the radius is
then used to calculate the four coordinates of the quad that describes the
point sprite.

gl_PointSize is the built-in variable that can be used to output the point
radius (or point size) in the vertex shader. It is important that a vertex
shader associated with the point primitive output gl_PointSize; otherwise,
the value of the point size is considered undefined and will most likely result
in drawing errors. The gl_PointSize value output by a vertex shader will
be clamped to the aliased point size range supported by the OpenGL ES 3.0
implementation. This range can be queried using the following command:

GLfloat pointSizeRange[2];
glGetFloatv ( GL_ALIASED_POINT_SIZE_RANGE, pointSizeRange );

By default, OpenGL ES 3.0 describes the window origin (0, 0) to be the
(left, bottom) region. However, for point sprites, the point coordinate
origin is (left, top).

gl_PointCoord is a built-in variable available only inside a fragment
shader when the primitive being rendered is a point sprite. It is declared as
a vec2 variable using the mediump precision qualifier. The values assigned
to gl_PointCoord go from 0.0 to 1.0 as we move from left to right or
from top to bottom, as illustrated in Figure 7-3.

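[Figure 7-3 gl_PointCoord Values: (0, 0) at the top-left corner of the point sprite, (1, 0) at the top-right, (0, 1) at the bottom-left, and (1, 1) at the bottom-right]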

The following fragment shader code illustrates how gl_PointCoord can

be used as a texture coordinate to draw a textured point sprite:

#version 300 es

precision mediump float;

uniform sampler2D s_texSprite;

layout(location = 0) out vec4 outColor;

void main()

{

outColor = texture(s_texSprite, gl_PointCoord);

}

Drawing Primitives

There are five API calls in OpenGL ES to draw primitives: glDrawArrays,

glDrawElements, glDrawRangeElements, glDrawArraysInstanced, and

glDrawElementsInstanced. We will describe the first three regular

non-instanced draw call APIs in this section and the remaining two instanced

draw call APIs in the next section.

glDrawArrays draws primitives specified by mode using vertices given by

element index first to first + count - 1. A call to

glDrawArrays(GL_TRIANGLES, 0, 6) will draw two triangles: a triangle given by element

indices (0, 1, 2) and another triangle given by element indices (3, 4, 5).

Similarly, a call to glDrawArrays(GL_TRIANGLE_STRIP, 0, 5) will draw

three triangles: a triangle given by element indices (0, 1, 2), the second

triangle given by element indices (2, 1, 3), and the final triangle given by

element indices (2, 3, 4).
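The ordering of the strip triangles listed above follows a simple pattern: every second triangle swaps its first two vertices so that all triangles keep the same winding. The sketch below reproduces that listing; StripTriangle is a hypothetical helper, and extending the pattern beyond the three triangles shown in the text is an assumption.

```c
#include <assert.h>

/* Hypothetical helper: writes the three element indices of triangle 'i'
   (0-based) of a triangle strip whose first vertex is 'first'.
   Even triangles use (k, k+1, k+2); odd triangles use (k+1, k, k+2),
   where k = first + i. */
void StripTriangle(int first, int i, int tri[3])
{
    int k = first + i;

    assert(first >= 0 && i >= 0);

    if (i % 2 == 0) {
        tri[0] = k;
        tri[1] = k + 1;
    } else {
        tri[0] = k + 1;
        tri[1] = k;
    }
    tri[2] = k + 2;
}
```

Calling StripTriangle(0, i, tri) for i = 0, 1, 2 yields (0, 1, 2), (2, 1, 3), and (2, 3, 4), the same three triangles drawn by glDrawArrays(GL_TRIANGLE_STRIP, 0, 5).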

void glDrawArrays(GLenum mode, GLint first, GLsizei count)

mode    specifies the primitive to render; valid values are
            GL_POINTS
            GL_LINES
            GL_LINE_STRIP
            GL_LINE_LOOP
            GL_TRIANGLES
            GL_TRIANGLE_STRIP
            GL_TRIANGLE_FAN
first   specifies the starting vertex index in the enabled vertex arrays
count   specifies the number of vertices to be drawn

void glDrawElements(GLenum mode, GLsizei count,
                    GLenum type, const GLvoid *indices)

void glDrawRangeElements(GLenum mode, GLuint start,
                         GLuint end, GLsizei count,
                         GLenum type, const GLvoid *indices)

mode     specifies the primitive to render; valid values are
             GL_POINTS
             GL_LINES
             GL_LINE_STRIP
             GL_LINE_LOOP
             GL_TRIANGLES
             GL_TRIANGLE_STRIP
             GL_TRIANGLE_FAN
start    specifies the minimum array index in indices
         (glDrawRangeElements only)
end      specifies the maximum array index in indices
         (glDrawRangeElements only)
count    specifies the number of indices to be drawn
type     specifies the type of element indices stored in indices;
         valid values are
             GL_UNSIGNED_BYTE
             GL_UNSIGNED_SHORT
             GL_UNSIGNED_INT
indices  specifies a pointer to location where element indices are stored

glDrawArrays is great if you have a primitive described by a sequence

of element indices and if vertices of geometry are not shared. However,

typical objects used by games or other 3D applications are made up of

multiple triangle meshes where element indices may not necessarily be in

sequence and vertices will typically be shared between triangles of a mesh.

[Figure 7-4 Cube]

Consider the cube shown in Figure 7-4. If we were to draw this using

glDrawArrays, the code would be as follows:

#define VERTEX_POS_INDX 0

#define NUM_FACES 6

GLfloat vertices[] = { … }; // (x, y, z) per vertex

glEnableVertexAttribArray ( VERTEX_POS_INDX );

glVertexAttribPointer ( VERTEX_POS_INDX, 3, GL_FLOAT,

GL_FALSE, 0, vertices );

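// Either draw each face as a four-vertex triangle fan (24 vertices) ...

for (int i=0; i<NUM_FACES; i++)

{

glDrawArrays ( GL_TRIANGLE_FAN, i*4, 4 );

}

// ... or draw the whole cube as independent triangles (36 vertices)

glDrawArrays ( GL_TRIANGLES, 0, 36 );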

To draw this cube with glDrawArrays, we would call glDrawArrays for

each face of the cube. Vertices that are shared would need to be replicated,

which means that instead of having 8 vertices, we would now need

to allocate 24 vertices (if we draw each face as a GL_TRIANGLE_FAN) or

36 vertices (if we use GL_TRIANGLES). This is not an efficient approach.

This is how the same cube would be drawn using glDrawElements:

#define VERTEX_POS_INDX 0

GLfloat vertices[] = { … };// (x, y, z) per vertex

GLubyte indices[36] = {0, 1, 2, 0, 2, 3,

0, 3, 4, 0, 4, 5,

0, 5, 6, 0, 6, 1,

7, 1, 6, 7, 2, 1,

7, 5, 4, 7, 6, 5,

7, 3, 2, 7, 4, 3 };

glEnableVertexAttribArray ( VERTEX_POS_INDX );

glVertexAttribPointer ( VERTEX_POS_INDX, 3, GL_FLOAT,

GL_FALSE, 0, vertices );

glDrawElements ( GL_TRIANGLES,

sizeof(indices)/sizeof(GLubyte),

GL_UNSIGNED_BYTE, indices );

Even though we are drawing triangles with glDrawElements and a

triangle fan with glDrawArrays, our application will run faster with

glDrawElements than with glDrawArrays on a GPU for many reasons.

For example, the size of vertex attribute data will be smaller with

glDrawElements as vertices are reused (we will discuss the GPU post-

transform vertex cache in a later section). This also leads to a smaller

memory footprint and memory bandwidth requirement.
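The difference in footprint for the cube can be worked out directly. The following sketch is only an illustration: GeometryBytes is a hypothetical helper that assumes each vertex carries just an (x, y, z) position stored as three floats, as in the cube examples above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: total bytes of position data plus index data.
   Assumes three floats (x, y, z) per vertex and no other attributes. */
size_t GeometryBytes(size_t numVertices, size_t numIndices,
                     size_t bytesPerIndex)
{
    const size_t bytesPerVertex = 3 * sizeof(float);

    /* An index size must be given whenever indices are used. */
    assert(numIndices == 0 || bytesPerIndex > 0);

    return numVertices * bytesPerVertex + numIndices * bytesPerIndex;
}
```

With 4-byte floats, GeometryBytes(36, 0, 0) gives 432 bytes for the GL_TRIANGLES version drawn with glDrawArrays and GeometryBytes(24, 0, 0) gives 288 bytes for the triangle fan version, whereas GeometryBytes(8, 36, 1) gives 132 bytes for the glDrawElements version with GLubyte indices.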

Primitive Restart

Using primitive restart, you can render multiple disconnected primitives

(such as triangle fans or strips) using a single draw call. This is beneficial

to reduce the overhead of the draw API calls. A less elegant alternative

to using primitive restart is generating degenerate triangles (with some

caveats), which we will discuss in a later section.

Using primitive restart, you can restart a primitive for indexed draw

calls (such as glDrawElements, glDrawElementsInstanced, or

glDrawRangeElements) by inserting a special index into the indices list.

The special index is the largest possible index for the type of the indices

(such as 255 or 65535 when the index type is GL_UNSIGNED_BYTE or

GL_UNSIGNED_SHORT, respectively).

For example, suppose two triangle strips have element indices of (0, 1,

2, 3) and (8, 9, 10, 11), respectively. The combined element index list

if we were to draw both strips using one call to glDrawElements* with

primitive restart would be (0, 1, 2, 3, 255, 8, 9, 10, 11) if the index type is

GL_UNSIGNED_BYTE.
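The combined list described above can be built with a small helper. This is only a sketch: combine_with_restart and RESTART_INDEX_U8 are hypothetical names, and plain unsigned char stands in for GLubyte so the code compiles without the OpenGL ES headers.

```c
#include <stddef.h>

/* Restart index for 8-bit indices: the largest value of the type
 * (255 for GL_UNSIGNED_BYTE). */
#define RESTART_INDEX_U8 255u

/* Concatenates the index lists of two strips, separated by the restart
 * index. out must have room for na + nb + 1 entries.
 * Returns the number of entries written. */
size_t combine_with_restart(const unsigned char *a, size_t na,
                            const unsigned char *b, size_t nb,
                            unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < na; i++)
        out[n++] = a[i];
    out[n++] = (unsigned char) RESTART_INDEX_U8;
    for (size_t i = 0; i < nb; i++)
        out[n++] = b[i];
    return n;
}
```

For the strips (0, 1, 2, 3) and (8, 9, 10, 11), the helper writes the nine entries (0, 1, 2, 3, 255, 8, 9, 10, 11).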

You can enable and disable primitive restart as follows:

glEnable ( GL_PRIMITIVE_RESTART_FIXED_INDEX );

// Draw primitives

…

glDisable ( GL_PRIMITIVE_RESTART_FIXED_INDEX );

Provoking Vertex

Without qualifiers, output values of the vertex shader are linearly
interpolated across the primitive. However, with the use of flat shading
(described in the Interpolation Qualifiers section in Chapter 5), no
interpolation occurs, so only one of the vertex values can be used in the
fragment shader. For a given primitive instance, the provoking vertex
determines which vertex's outputs are used. Table 7-1 shows the rule for
the provoking vertex selection.

Table 7-1 Provoking Vertex Selection for the ith Primitive Instance Where
Vertices Are Numbered from 1 to n, and n Is the Number of Vertices Drawn

Type of Primitive i      Provoking Vertex
GL_POINTS                i
GL_LINES                 2i
GL_LINE_LOOP             i + 1, if i < n
                         1, if i = n
GL_LINE_STRIP            i + 1
GL_TRIANGLES             3i
GL_TRIANGLE_STRIP        i + 2
GL_TRIANGLE_FAN          i + 2
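The rules in Table 7-1 can be written as a small lookup function. This is an illustrative sketch: the PrimType enum and provoking_vertex are hypothetical stand-ins (not OpenGL ES names) so that the code compiles without the GL headers.

```c
/* Hypothetical stand-ins for the GL primitive type enums. */
typedef enum {
    PRIM_POINTS, PRIM_LINES, PRIM_LINE_LOOP, PRIM_LINE_STRIP,
    PRIM_TRIANGLES, PRIM_TRIANGLE_STRIP, PRIM_TRIANGLE_FAN
} PrimType;

/* Returns the 1-based number of the provoking vertex for the i-th
 * primitive (i starts at 1); n is the number of vertices drawn. */
unsigned provoking_vertex(PrimType type, unsigned i, unsigned n)
{
    switch (type) {
    case PRIM_POINTS:         return i;
    case PRIM_LINES:          return 2 * i;
    case PRIM_LINE_LOOP:      return (i < n) ? i + 1 : 1;
    case PRIM_LINE_STRIP:     return i + 1;
    case PRIM_TRIANGLES:      return 3 * i;
    case PRIM_TRIANGLE_STRIP: return i + 2;
    case PRIM_TRIANGLE_FAN:   return i + 2;
    }
    return 0; /* unknown primitive type */
}
```

For example, the second triangle of a GL_TRIANGLES draw takes its flat-shaded values from vertex 6, and the closing segment of a four-vertex line loop takes them from vertex 1.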

Geometry Instancing

Geometry instancing allows for efficiently rendering an object multiple

times with different attributes (such as a different transformation matrix,

color, or size) using a single API call. This feature is useful in rendering

large quantities of similar objects, such as in crowd rendering. Geometry

instancing reduces the overhead of CPU processing to send many API calls

to the OpenGL ES engine. To render using an instanced draw call, use the

following commands:

void glDrawArraysInstanced(GLenum mode, GLint first,
                           GLsizei count, GLsizei instanceCount)
void glDrawElementsInstanced(GLenum mode, GLsizei count,
                             GLenum type, const GLvoid *indices,
                             GLsizei instanceCount)

mode           specifies the primitive to render; valid values are
                   GL_POINTS
                   GL_LINES
                   GL_LINE_STRIP
                   GL_LINE_LOOP
                   GL_TRIANGLES
                   GL_TRIANGLE_STRIP
                   GL_TRIANGLE_FAN
first          specifies the starting vertex index in the enabled vertex
               arrays (glDrawArraysInstanced only)
count          specifies the number of vertices (glDrawArraysInstanced) or
               indices (glDrawElementsInstanced) to be drawn
type           specifies the type of element indices stored in indices
               (glDrawElementsInstanced only); valid values are
                   GL_UNSIGNED_BYTE
                   GL_UNSIGNED_SHORT
                   GL_UNSIGNED_INT
indices        specifies a pointer to the location where element indices
               are stored (glDrawElementsInstanced only)
instanceCount  specifies the number of instances of the primitive to be
               drawn

Two methods may be used to access per-instance data. The first method is
to instruct OpenGL ES to read vertex attributes once or multiple times per
instance using the following command:

void glVertexAttribDivisor(GLuint index, GLuint divisor)

index    specifies the index of the generic vertex attribute
divisor  specifies the number of instances that will pass between
         updates of the generic attribute at slot index

By default, if glVertexAttribDivisor is not specified or is specified with
divisor equal to 0 for the vertex attributes, then the vertex attributes will
be read once per vertex. If divisor equals 1, then the vertex attributes
will be read once per primitive instance.
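One way to picture the divisor is as a rule for choosing which array element feeds the attribute. The following sketch models that rule; attrib_element is a hypothetical helper for illustration, not an OpenGL ES function.

```c
/* Index of the array element fetched for a generic vertex attribute.
 * divisor == 0: the attribute advances once per vertex.
 * divisor  > 0: the attribute advances once every `divisor` instances. */
unsigned attrib_element(unsigned vertex, unsigned instance,
                        unsigned divisor)
{
    return (divisor == 0) ? vertex : instance / divisor;
}
```

With divisor 1, every vertex of instance 7 reads element 7. With divisor 2, instances 6 and 7 both read element 3.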

The second method is to use the built-in input variable gl_InstanceID as

an index to a buffer in the vertex shader to access the per-instance data.

gl_InstanceID will hold the index of the current primitive instance

when the previously mentioned geometry instancing API calls are used.

When a non-instanced draw call is used, gl_InstanceID will return 0.

The next two code fragments illustrate how to draw many copies of a
geometry (in this case, cubes) using a single instanced draw call where
each cube instance will be colored uniquely. Note that the complete source
code is available in the Chapter_7/Instancing example.

First, we create a color buffer to store the color data to be used later
for the instanced draw call (one color per instance).

// Random color for each instance
{
   GLubyte colors[NUM_INSTANCES][4];
   int instance;

   srandom ( 0 );

   for ( instance = 0; instance < NUM_INSTANCES; instance++ )
   {
      colors[instance][0] = random() % 255;
      colors[instance][1] = random() % 255;
      colors[instance][2] = random() % 255;
      colors[instance][3] = 0;
   }

   glGenBuffers ( 1, &userData->colorVBO );
   glBindBuffer ( GL_ARRAY_BUFFER, userData->colorVBO );
   glBufferData ( GL_ARRAY_BUFFER, NUM_INSTANCES * 4, colors,
                  GL_STATIC_DRAW );
}

After the color buffer has been created and filled, we can bind the color

buffer as one of the vertex attributes for the geometry. Then, we specify

the vertex attribute divisor as 1 so that the color will be read per primitive

instance. Finally, the cubes are drawn with a single instanced draw call.

// Load the instance color buffer
glBindBuffer ( GL_ARRAY_BUFFER, userData->colorVBO );
glVertexAttribPointer ( COLOR_LOC, 4, GL_UNSIGNED_BYTE,
                        GL_TRUE, 4 * sizeof ( GLubyte ),
                        ( const void * ) NULL );
glEnableVertexAttribArray ( COLOR_LOC );

// Set one color per instance
glVertexAttribDivisor ( COLOR_LOC, 1 );

// code skipped ...

// Bind the index buffer
glBindBuffer ( GL_ELEMENT_ARRAY_BUFFER, userData->indicesIBO );

// Draw the cubes
glDrawElementsInstanced ( GL_TRIANGLES, userData->numIndices,
                          GL_UNSIGNED_INT,
                          ( const void * ) NULL, NUM_INSTANCES );

Performance Tips

Applications should make sure that glDrawElements and

glDrawElementsInstanced are called with as large a primitive size

as possible. This is very easy to do if we are drawing GL_TRIANGLES.

However, if we have meshes of triangle strips or fans, instead of making

individual calls to glDrawElements* for each triangle strip mesh, these

meshes could be connected together by using primitive restart (see the

earlier section discussing this feature).

If you cannot use the primitive restart mechanism to connect meshes

together (to maintain compatibility with an older OpenGL ES version),

you can add element indices that result in degenerate triangles at the

expense of using more indices and some caveats that we will discuss

here. A degenerate triangle is a triangle where two or more vertices of the

triangle are coincident. GPUs can detect and reject degenerate triangles

very easily, so this is a good performance enhancement that allows us to

queue a big primitive to be rendered by the GPU.

The number of element indices (or degenerate triangles) we need to

add to connect distinct meshes will depend on whether each mesh is a

triangle fan or a triangle strip and the number of indices defined in each

strip. The number of indices in a mesh that is a triangle strip matters,

as we need to preserve the winding order as we go from one triangle to

the next triangle of the strip across the distinct meshes that are being

connected.

When connecting separate triangle strips, we need to check the order of
the last triangle and the first triangle of the two strips being connected.
As seen in Figure 7-5, the ordering of vertices that describe even-numbered
triangles of a triangle strip differs from the ordering of vertices that
describe odd-numbered triangles of the same strip.

Two cases need to be handled:

• The odd-numbered triangle of the first triangle strip is being
  connected to the first (and therefore even-numbered) triangle of the
  second triangle strip.

• The even-numbered triangle of the first triangle strip is being
  connected to the first (and therefore even-numbered) triangle of the
  second triangle strip.

Figure 7-5 shows two separate triangle strips that represent these two

cases, where the strips need to be connected to allow us to draw both of

them using a single call to glDrawElements*.

[Figure 7-5 Connecting Triangle Strips. Panels: Opposite Vertex Order; Same Vertex Order]

For the triangle strips in Figure 7-5 with opposite vertex order for the last
and first triangles of the two strips being connected, the element indices
for each triangle strip are (0, 1, 2, 3) and (8, 9, 10, 11), respectively. The
combined element index list if we were to draw both strips using one call to
glDrawElements* would be (0, 1, 2, 3, 3, 8, 8, 9, 10, 11). This new element
index list results in the following triangles drawn: (0, 1, 2), (2, 1, 3),
(2, 3, 3), (3, 3, 8), (3, 8, 8), (8, 8, 9), (8, 9, 10), (10, 9, 11). Of these,
(2, 3, 3), (3, 3, 8), (3, 8, 8), and (8, 8, 9) are the degenerate triangles.
The new indices added to the combined element index list are the second 3
and the first 8.

For the triangle strips in Figure 7-5 with the same vertex order for the last
and first triangles of the two strips being connected, the element indices for
each triangle strip are (0, 1, 2, 3, 4) and (8, 9, 10, 11), respectively. The
combined element index list if we were to draw both strips using one call
to glDrawElements* would be (0, 1, 2, 3, 4, 4, 4, 8, 8, 9, 10, 11). This new
element index list results in the following triangles drawn: (0, 1, 2),
(2, 1, 3), (2, 3, 4), (4, 3, 4), (4, 4, 4), (4, 4, 8), (4, 8, 8), (8, 8, 9),
(8, 9, 10), (10, 9, 11). Of these, (4, 3, 4), (4, 4, 4), (4, 4, 8), (4, 8, 8),
and (8, 8, 9) are the degenerate triangles. The new indices added to the
combined element index list are the second and third 4 and the first 8.

Note that the number of additional element indices required and the

number of degenerate triangles generated vary depending on the number

of vertices in the first strip. This is required to preserve the winding order of

the next strip being connected.
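The two cases can be handled by one small routine that checks whether the first strip has an even or odd number of indices. This is a minimal sketch under the assumption that both strips are non-empty; join_strips is a hypothetical helper name, and unsigned char stands in for GLubyte.

```c
#include <stddef.h>

/* Joins two triangle strips into one index list by inserting indices
 * that produce degenerate triangles. out needs room for
 * na + nb + 3 entries. Returns the number of entries written. */
size_t join_strips(const unsigned char *a, size_t na,
                   const unsigned char *b, size_t nb,
                   unsigned char *out)
{
    size_t n = 0;
    for (size_t i = 0; i < na; i++)
        out[n++] = a[i];
    out[n++] = a[na - 1];      /* repeat last index of the first strip */
    if (na % 2 != 0)
        out[n++] = a[na - 1];  /* odd count: repeat again to keep winding */
    out[n++] = b[0];           /* repeat first index of the second strip */
    for (size_t i = 0; i < nb; i++)
        out[n++] = b[i];
    return n;
}
```

Joining (0, 1, 2, 3) with (8, 9, 10, 11) adds two indices and yields (0, 1, 2, 3, 3, 8, 8, 9, 10, 11). Joining (0, 1, 2, 3, 4) with (8, 9, 10, 11) adds three and yields (0, 1, 2, 3, 4, 4, 4, 8, 8, 9, 10, 11), matching the lists above.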

It might also be worth investigating techniques that take the size of the

post-transform vertex cache into consideration in determining how to

arrange element indices of a primitive. Most GPUs implement a post-

transform vertex cache. Before a vertex (given by its element index) is

executed by the vertex shader, a check is performed to determine whether

the vertex already exists in the post-transform cache. If the vertex exists in

the post-transform cache, the vertex is not executed by the vertex shader.

If it is not in the cache, the vertex will need to be executed by the vertex

shader. Using the post-transform cache size to determine how element

indices are created should help overall performance, as it will reduce the

number of times a vertex that is reused gets executed by the vertex shader.
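To compare candidate index orderings, it can help to simulate the cache on the CPU. The sketch below assumes a simple FIFO cache of a given size; actual GPUs may use different sizes and replacement policies, so treat the result as a rough estimate. The function name and MAX_CACHE limit are illustrative.

```c
#include <stddef.h>

#define MAX_CACHE 64

/* Counts vertex shader executions for an index list, assuming a FIFO
 * post-transform cache holding cache_size entries. */
size_t count_vertex_shader_runs(const unsigned short *indices,
                                size_t count, size_t cache_size)
{
    unsigned short cache[MAX_CACHE];
    size_t used = 0, next = 0, runs = 0;

    if (cache_size == 0 || cache_size > MAX_CACHE)
        return count; /* modeled as no cache: every index runs the shader */

    for (size_t i = 0; i < count; i++) {
        int hit = 0;
        for (size_t j = 0; j < used; j++) {
            if (cache[j] == indices[i]) {
                hit = 1;
                break;
            }
        }
        if (!hit) {
            runs++;
            cache[next] = indices[i];        /* overwrite the oldest slot */
            next = (next + 1) % cache_size;
            if (used < cache_size)
                used++;
        }
    }
    return runs;
}
```

For the index list (0, 1, 2, 0, 2, 3, 0, 3, 4, 0, 4, 5), this model reports 6 shader runs with a 16-entry cache, 7 with a 4-entry cache, and 12 with no cache.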

Primitive Assembly

Figure 7-6 shows the primitive assembly stage. Vertices that are supplied

through glDraw*** are executed by the vertex shader. Each vertex

transformed by the vertex shader includes the vertex position that

describes the (x, y, z, w) value of the vertex. The primitive type and vertex

indices determine the individual primitives that will be rendered. For each

individual primitive (triangle, line, and point) and its corresponding vertices,

the primitive assembly stage performs the operations shown in Figure 7-6.

Before we discuss how primitives are rasterized in OpenGL ES, we need to

understand the various coordinate systems used within OpenGL ES 3.0. This

is needed to get a good understanding of what happens to vertex coordinates

as they go through the various stages of the OpenGL ES 3.0 pipeline.

[Figure 7-6 OpenGL ES Primitive Assembly Stage: Output of vertex shader → Clipping → Perspective Division → Viewport Transformation → To rasterization stage]

Coordinate Systems

Figure 7-7 shows the coordinate systems as a vertex goes through the vertex

shader and primitive assembly stages. Vertices are input to OpenGL ES in

the object or local coordinate space. This is the coordinate space in which

an object is most likely modeled and stored. After a vertex shader executes,

the vertex position is considered to be in the clip coordinate space. The

transformation of the vertex position from the local coordinate system (i.e.,

object coordinates) to clip coordinates is done by loading the appropriate

matrices that perform this conversion in appropriate uniforms defined in
the vertex shader. Chapter 8, “Vertex Shaders,” describes how to transform
the vertex position from object to clip coordinates and how to load
appropriate matrices in the vertex shader to perform this transformation.

[Figure 7-7 Coordinate Systems: Object Coordinates → Vertex Shader → Clip Coordinates → Perspective Division → Normalized Device Coordinates → Viewport Transformation → Window Coordinates]

Clipping

To avoid processing of primitives outside the viewable volume, primitives

are clipped to the clip space. The vertex position after the vertex shader

has been executed is in the clip coordinate space. The clip coordinate is

a homogeneous coordinate given by (xc, yc, zc, wc). The vertex coordinates

defined in clip space (xc, yc, zc, wc) get clipped against the viewing volume

(also known as the clip volume).

The clip volume, as shown in Figure 7-8, is defined by six clipping planes,

referred to as the near and far clip planes, the left and right clip planes,

and the top and bottom clip planes. In clip coordinates, the clip volume is

given as follows:

-wc <= xc <= wc

-wc <= yc <= wc

-wc <= zc <= wc

The preceding six checks help determine the list of planes against which

the primitive needs to be clipped.
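The six checks can be expressed as a bitmask per vertex, often called an outcode. The following is a sketch of that idea and not how any particular GPU implements clipping; the enum and function names are hypothetical.

```c
/* One bit per clip plane that a vertex lies outside of. */
enum {
    OUT_LEFT   = 1 << 0,   /* xc < -wc */
    OUT_RIGHT  = 1 << 1,   /* xc >  wc */
    OUT_BOTTOM = 1 << 2,   /* yc < -wc */
    OUT_TOP    = 1 << 3,   /* yc >  wc */
    OUT_NEAR   = 1 << 4,   /* zc < -wc */
    OUT_FAR    = 1 << 5    /* zc >  wc */
};

/* Tests a clip-space position against the six planes of the clip volume. */
unsigned clip_outcode(float xc, float yc, float zc, float wc)
{
    unsigned code = 0;
    if (xc < -wc) code |= OUT_LEFT;
    if (xc >  wc) code |= OUT_RIGHT;
    if (yc < -wc) code |= OUT_BOTTOM;
    if (yc >  wc) code |= OUT_TOP;
    if (zc < -wc) code |= OUT_NEAR;
    if (zc >  wc) code |= OUT_FAR;
    return code;
}

/* All three vertices are outside the same plane: discard the triangle. */
int triangle_trivially_rejected(unsigned a, unsigned b, unsigned c)
{
    return (a & b & c) != 0;
}

/* All three vertices are inside every plane: no clipping is needed. */
int triangle_trivially_accepted(unsigned a, unsigned b, unsigned c)
{
    return (a | b | c) == 0;
}
```

A triangle that is neither trivially accepted nor trivially rejected needs to be clipped against the planes whose bits are set in the combined outcodes of its vertices.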

[Figure 7-8 Viewing Volume, showing the near plane and the far plane]

The clipping stage will clip each primitive to the clip volume shown in

Figure 7-8. By “primitive,” here we mean each triangle of a list of separate

triangles drawn using GL_TRIANGLES, or a triangle of a triangle strip or a

fan, or a line from a list of separate lines drawn using GL_LINES, or a line

of a line strip or line loop, or a specific point in a list of point sprites. For

each primitive type, the following operations are performed:

• Clipping triangles: If the triangle is completely inside the viewing
  volume, no clipping is performed. If the triangle is completely
  outside the viewing volume, the triangle is discarded. If the triangle
  lies partly inside the viewing volume, then the triangle is clipped
  against the appropriate planes. The clipping operation generates new
  vertices on the clip planes, and the resulting vertices are arranged
  as a triangle fan.

• Clipping lines: If the line is completely inside the viewing volume,
  then no clipping is performed. If the line is completely outside the
  viewing volume, the line is discarded. If the line lies partly inside the
  viewing volume, then the line is clipped and appropriate new vertices
  are generated.

• Clipping point sprites: The clipping stage will discard the point
  sprite if the point position lies outside the near or far clip plane or if
  the quad that represents the point sprite is outside the clip volume.
  Otherwise, it is passed unchanged and the point sprite will be scissored
  as it moves from inside the clip volume to the outside, or vice versa.

After the primitives have been clipped against the six clipping planes,

the vertex coordinates undergo perspective division to become

normalized device coordinates. A normalized device coordinate is in the

range –1.0 to +1.0.

Note: The clipping operation (especially for lines and triangles) can

be quite expensive to perform in hardware. A primitive must be

clipped against six clip planes of the viewing volume, as shown

in Figure 7-8. Primitives that are partly outside the near and far

planes go through the clipping operations. However, primitives

that are partially outside the x and y planes do not necessarily

need to be clipped. By rendering into a viewport that is bigger

than the dimensions of the viewport specified with glViewport,

clipping in the x and y planes becomes a scissoring operation.

Scissoring is implemented very efficiently by GPUs. This larger

viewport region is called the guard-band region. Although OpenGL

ES does not allow an application to specify a guard-band region,

most—if not all—OpenGL ES implementations implement a

guard-band.

Perspective Division

Perspective division takes the point given by clip coordinate (xc, yc, zc, wc)

and projects it onto the screen or viewport. This projection is performed

by dividing the (xc, yc, zc) coordinates by wc. After performing (xc/wc),

(yc/wc), and (zc/wc), we get normalized device coordinates (xd, yd, zd).

These are called normalized device coordinates, as they will be in the

[–1.0 ... 1.0] range. These normalized (xd, yd) coordinates will then be

converted to actual screen (or window) coordinates depending on the

dimensions of the viewport. The normalized (zd) coordinate is converted

to the screen z value using the near and far depth values specified

by glDepthRangef. These conversions are performed in the viewport

transformation phase.
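As a minimal sketch, the division itself is only three operations. The Vec4 and Vec3 types and the function name are hypothetical, and the sketch assumes wc is nonzero.

```c
typedef struct { float x, y, z, w; } Vec4;  /* clip coordinates */
typedef struct { float x, y, z; } Vec3;     /* normalized device coordinates */

/* Divides (xc, yc, zc) by wc to obtain (xd, yd, zd).
 * Assumes clip.w is nonzero. */
Vec3 perspective_divide(Vec4 clip)
{
    Vec3 ndc;
    ndc.x = clip.x / clip.w;
    ndc.y = clip.y / clip.w;
    ndc.z = clip.z / clip.w;
    return ndc;
}
```

For example, the clip coordinate (2, -4, 1, 4) maps to the normalized device coordinate (0.5, -1.0, 0.25).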

Viewport Transformation

A viewport is a 2D rectangular window region in which all OpenGL

ES rendering operations will ultimately be displayed. The viewport

transformation can be set by using the following API call:

void glViewport(GLint x, GLint y, GLsizei w, GLsizei h)

x, y    specifies the window coordinates of the viewport’s lower-left
        corner in pixels
w, h    specifies the width and height of the viewport in pixels; these
        values must not be negative

The conversion from normalized device coordinates (xd, yd, zd) to window

coordinates (xw, yw, zw) is given by the following transformation:

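\[
\begin{bmatrix} x_w \\ y_w \\ z_w \end{bmatrix}
=
\begin{bmatrix}
(w/2)\,x_d + o_x \\
(h/2)\,y_d + o_y \\
((f - n)/2)\,z_d + (n + f)/2
\end{bmatrix}
\]

In this transformation, o_x = x + w/2 and o_y = y + h/2, and n and f represent the
desired depth range.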
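The transformation can be checked with a few lines of C. This is an illustrative sketch: Vec3 and ndc_to_window are hypothetical names, and the parameters x, y, w, h, n, and f play the same roles as in the equation.

```c
typedef struct { float x, y, z; } Vec3;

/* Converts normalized device coordinates (xd, yd, zd) to window
 * coordinates (xw, yw, zw) for a viewport (x, y, w, h) and a depth
 * range (n, f). */
Vec3 ndc_to_window(Vec3 ndc, int x, int y, int w, int h,
                   float n, float f)
{
    Vec3 win;
    float ox = (float) x + (float) w / 2.0f;
    float oy = (float) y + (float) h / 2.0f;

    win.x = ((float) w / 2.0f) * ndc.x + ox;
    win.y = ((float) h / 2.0f) * ndc.y + oy;
    win.z = ((f - n) / 2.0f) * ndc.z + (n + f) / 2.0f;
    return win;
}
```

With a 640 x 480 viewport at the origin and a depth range of 0.0 to 1.0, the center of the normalized cube (0, 0, 0) lands at window position (320, 240) with depth 0.5, and the corner (-1, -1, -1) lands at (0, 0) with depth 0.0.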

The depth range values n and f can be set using the following API call:

void glDepthRangef(GLclampf n, GLclampf f)

n, f    specify the desired depth range. Default values for n and f are 0.0
        and 1.0, respectively. The values are clamped to lie within
        [0.0, 1.0].

The values specified by glDepthRangef and glViewport are used to
transform the vertex position from normalized device coordinates into
window (screen) coordinates.

The initial (or default) viewport state is set to w = width and h = height
of the window created by the application in which OpenGL ES is to do
its rendering. This window is given by the EGLNativeWindowType win
argument specified in eglCreateWindowSurface.

Rasterization

Figure 7-9 shows the rasterization pipeline. After the vertices have
been transformed and primitives have been clipped, the rasterization
pipelines take an individual primitive such as a triangle, a line segment,
or a point sprite and generate appropriate fragments for this primitive.
Each fragment is identified by its integer location (x, y) in screen space.
A fragment represents a pixel location given by (x, y) in screen space and
additional fragment data that will be processed by the fragment shader
to produce a fragment color. These operations are described in detail in
Chapter 9, “Texturing,” and Chapter 10, “Fragment Shaders.”

[Figure 7-9 OpenGL ES Rasterization Stage: From Primitive Assembly → Point-Sprite Rasterization, Line Rasterization, Triangle Rasterization → Output for each fragment: screen (xw, yw) coordinate, attributes such as color, texture coordinates, etc. → To Fragment Shader Stage]

In this section, we discuss the various options that an application can use
to control rasterization of triangles, strips, and fans.

Culling

Before triangles are rasterized, we need to determine whether they are

front-facing (i.e., facing the viewer) or back-facing (i.e., facing away from

the viewer). The culling operation discards triangles that face away from

the viewer. To determine whether the triangle is front-facing or

back-facing, we first need to know the orientation of the triangle.

The orientation of a triangle specifies the winding order of a path that

begins at the first vertex, goes through the second and third vertex, and

ends back at the first vertex. Figure 7-10 shows two examples of triangles

with clockwise and counterclockwise winding orders.

[Figure 7-10 Clockwise and Counterclockwise Triangles. Panels: Clockwise (CW) Orientation; Counter-Clockwise (CCW) Orientation]

The orientation of a triangle is computed by calculating the signed area

of the triangle in window coordinates. We now need to translate the sign

of the computed triangle area into a clockwise (CW) or counterclockwise

(CCW) orientation. This mapping from the sign of triangle area to a CW

or CCW orientation is specified by the application using the following

API call:

void glFrontFace(GLenum dir)

dir    specifies the orientation of front-facing triangles. Valid values

are GL_CW or GL_CCW. The default value is GL_CCW.
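The signed-area test can be sketched as follows. The names are hypothetical, FrontFaceDir stands in for the GL_CCW and GL_CW enums, and the sketch assumes window coordinates with y pointing up, so a counterclockwise triangle has a positive area.

```c
/* Stand-ins for GL_CCW and GL_CW. */
typedef enum { FRONT_CCW, FRONT_CW } FrontFaceDir;

/* Signed area of a triangle in window coordinates:
 * positive for counterclockwise winding, negative for clockwise. */
float signed_area(float x0, float y0, float x1, float y1,
                  float x2, float y2)
{
    return 0.5f * ((x0 * y1 - x1 * y0)
                 + (x1 * y2 - x2 * y1)
                 + (x2 * y0 - x0 * y2));
}

/* Maps the sign of the area to front-facing (1) or back-facing (0)
 * according to the front-face direction chosen by the application. */
int is_front_facing(float area, FrontFaceDir dir)
{
    if (dir == FRONT_CW)
        area = -area;
    return area > 0.0f;
}
```

The triangle (0, 0), (1, 0), (0, 1) has a signed area of 0.5, so it is front-facing under the default GL_CCW setting and back-facing under GL_CW.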

We have discussed how to calculate the orientation of a triangle. To

determine whether the triangle needs to be culled, we need to know the

facing of triangles that are to be culled. This is specified by the application

using the following API call:

void glCullFace(GLenum mode)

mode   specifies the facing of triangles that are to be culled. Valid values

are GL_FRONT, GL_BACK, and GL_FRONT_AND_BACK. The default

value is GL_BACK.

Last but not least, we need to know whether the culling operation

shouldbe performed. The culling operation will be performed if the

GL_CULL_FACE state is enabled. The GL_CULL_FACE state can be enabled or

disabled by the application using the following API calls:

void glEnable(GLenum cap)

void glDisable(GLenum cap)

where cap is set to GL_CULL_FACE. Initially, culling is disabled.

To recap, to cull appropriate triangles, an OpenGL ES application must

first enable culling using glEnable (GL_CULL_FACE), set the appropriate

cull face using glCullFace, and set the orientation of front-facing

triangles using glFrontFace.
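Putting the three pieces of state together, the decision to discard a triangle can be modeled as below. This is a sketch of the logic only; CullMode and should_cull are hypothetical names standing in for the GL_FRONT, GL_BACK, and GL_FRONT_AND_BACK modes and the GL_CULL_FACE enable state.

```c
/* Stand-ins for GL_FRONT, GL_BACK, and GL_FRONT_AND_BACK. */
typedef enum { CULL_FRONT, CULL_BACK, CULL_FRONT_AND_BACK } CullMode;

/* Returns 1 if the triangle should be discarded before rasterization.
 * cull_enabled models the GL_CULL_FACE enable state; front_facing is
 * the result of the orientation test for this triangle. */
int should_cull(int cull_enabled, CullMode mode, int front_facing)
{
    if (!cull_enabled)
        return 0;
    switch (mode) {
    case CULL_FRONT:          return front_facing;
    case CULL_BACK:           return !front_facing;
    case CULL_FRONT_AND_BACK: return 1;
    }
    return 0;
}
```

With the defaults (culling disabled), nothing is discarded. After enabling culling with the default GL_BACK mode, only back-facing triangles are discarded.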

Note: Culling should always be enabled to avoid the GPU wasting time

rasterizing triangles that are not visible. Enabling culling should

improve the overall performance of the OpenGL ES application.

Polygon Offset

Consider the case where we are drawing two polygons that overlap each

other. You will most likely notice artifacts, as shown in Figure 7-11. These

artifacts, called Z-fighting artifacts, occur because of limited precision of

triangle rasterization, which can affect the precision of the depth values

generated per fragment, resulting in artifacts. The limited precision of

parameters used by triangle rasterization and generated depth values per

fragment will get better and better but will never be completely resolved.

Figure 7-11 shows two coplanar polygons being drawn. The code to draw

these two coplanar polygons without polygon offset is as follows:

glClear ( GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT );

// load vertex shader

// set the appropriate transformation matrices

// set the vertex attribute state

// draw the SMALLER quad

glDrawArrays ( GL_TRIANGLE_FAN, 0, 4 );

// set the depth func to <= as polygons are coplanar

glDepthFunc ( GL_LEQUAL );

// set the vertex attribute state

// draw the LARGER quad

glDrawArrays ( GL_TRIANGLE_FAN, 0, 4 );

To avoid the artifacts shown in Figure 7-11, we need to add a delta to the

computed depth value before the depth test is performed and before the

depth value is written to the depth buffer. If the depth test passes, the

offset depth value (the original depth value + delta), and not the original

depth value, will be stored in the depth buffer.

[Figure 7-11 Polygon Offset]

The polygon offset is set using the following API call:

void glPolygonOffset(GLfloat factor, GLfloat units)

The depth offset is computed as follows:

depth offset = m * factor + r * units

In this equation, m is the maximum depth slope of the triangle and is
calculated as

\[
m = \sqrt{\left(\frac{\partial z}{\partial x}\right)^{2} + \left(\frac{\partial z}{\partial y}\right)^{2}}
\]

m can also be calculated as max {|∂z/∂x|, |∂z/∂y|}.

The slope terms ∂z/∂x and ∂z/∂y are calculated by the OpenGL ES
implementation during the triangle rasterization stage.

r is an implementation-defined constant and represents the smallest value

that can produce a guaranteed difference in depth value.

Polygon offset can be enabled or disabled using

glEnable(GL_POLYGON_OFFSET_FILL) and

glDisable(GL_POLYGON_OFFSET_FILL), respectively.
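The depth offset equation can be evaluated on the CPU to get a feel for the magnitudes involved. This is a sketch with hypothetical helper names; it uses the max form of m, and the value of r passed in is an assumption, because the real constant is implementation-defined.

```c
/* Maximum depth slope using the max {|dz/dx|, |dz/dy|} form. */
float max_depth_slope(float dzdx, float dzdy)
{
    float ax = (dzdx < 0.0f) ? -dzdx : dzdx;
    float ay = (dzdy < 0.0f) ? -dzdy : dzdy;
    return (ax > ay) ? ax : ay;
}

/* depth offset = m * factor + r * units */
float depth_offset(float m, float r, float factor, float units)
{
    return m * factor + r * units;
}
```

For example, with m = 0.5, an assumed r of 0.125, factor = -1.0, and units = -2.0, the offset is -0.75. A negative offset moves the polygon toward the viewer in depth, and the offset grows with the slope of the polygon.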

With polygon offset enabled, the code for the triangles shown in

Figure 7-11 is as follows:

const float polygonOffsetFactor = -1.0f;

const float polygonOffsetUnits = -2.0f;

glClear ( GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT );

// load vertex shader

// set the appropriate transformation matrices

// set the vertex attribute state

// draw the SMALLER quad

glDrawArrays ( GL_TRIANGLE_FAN, 0, 4 );

// set the depth func to <= as polygons are coplanar

glDepthFunc ( GL_LEQUAL );

glEnable ( GL_POLYGON_OFFSET_FILL );

glPolygonOffset ( polygonOffsetFactor, polygonOffsetUnits );

// set the vertex attribute state

// draw the LARGER quad

glDrawArrays ( GL_TRIANGLE_FAN, 0, 4 );

Occlusion Queries

Occlusion queries use query objects to track any fragments or samples
that pass the depth test. This approach can be used for a variety of
techniques, such as visibility determination for a lens flare effect as well
as optimization to avoid performing geometry processing on objects whose
bounding volume is obscured.

Occlusion queries can be started and ended using glBeginQuery and

glEndQuery, respectively, with GL_ANY_SAMPLES_PASSED or

GL_ANY_SAMPLES_PASSED_CONSERVATIVE target.

void glBeginQuery(GLenum target, GLuint id)
void glEndQuery(GLenum target)

target  specifies the target type of query object; valid values are
            GL_ANY_SAMPLES_PASSED
            GL_ANY_SAMPLES_PASSED_CONSERVATIVE
            GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN
id      specifies the name of the query object (glBeginQuery only)

Using the GL_ANY_SAMPLES_PASSED target will return the precise boolean
state indicating whether any samples passed the depth test. The
GL_ANY_SAMPLES_PASSED_CONSERVATIVE target can offer better performance
but a less precise answer. Using GL_ANY_SAMPLES_PASSED_CONSERVATIVE,
some implementations may return GL_TRUE even if no sample passed the
depth test.

The id is created using glGenQueries and deleted using

glDeleteQueries.

void glGenQueries(GLsizei n, GLuint *ids)

n    specifies the number of query name objects to be generated
ids  specifies an array to store the list of query name objects

void glDeleteQueries(GLsizei n, const GLuint *ids)

n    specifies the number of query name objects to be deleted
ids  specifies an array of the list of query name objects to be deleted

After you have specified the boundary of the query object using
glBeginQuery and glEndQuery, you can use glGetQueryObjectuiv to
retrieve the result of the query object.

void glGetQueryObjectuiv(GLuint id, GLenum pname, GLuint *params)

id      specifies the name of a query object
pname   specifies the query object parameter to be retrieved,
        and can be GL_QUERY_RESULT or GL_QUERY_RESULT_AVAILABLE
params  specifies an array of the appropriate type for storing the
        returned parameter values

Note: For better performance, you should wait several frames before
      performing a glGetQueryObjectuiv call, so that the result is
      available from the GPU.

The following example shows how to set up an occlusion query object
and query the result:

glBeginQuery ( GL_ANY_SAMPLES_PASSED, queryObject );
// draw primitives here
…
glEndQuery ( GL_ANY_SAMPLES_PASSED );
…
// after several frames have elapsed, query whether any
// samples passed the depth test (GL_TRUE or GL_FALSE)
glGetQueryObjectuiv( queryObject, GL_QUERY_RESULT,
                     &numSamples );

Summary

In this chapter, you learned the types of primitives supported by OpenGL
ES, and saw how to draw them efficiently using regular non-instanced and
instanced draw calls. We also discussed how coordinate transformations
are performed on vertices. In addition, you learned about the rasterization
stage, in which primitives are converted into fragments representing
pixels that may be drawn on the screen. Now that you have learned how
to draw primitives using vertex data, in the next chapter we describe how
to write a vertex shader to process the vertices in a primitive.

Chapter 8

Vertex Shaders

This chapter describes the OpenGL ES 3.0 programmable vertex pipeline.

Figure 8-1 illustrates the entire OpenGL ES 3.0 programmable pipeline.

The shaded boxes indicate the programmable stages in OpenGL ES 3.0.

Inthis chapter, we discuss the vertex shader stage. Vertex shaders can

be used to do traditional vertex-based operations such as transforming the

position by a matrix, computing the lighting equation to generate a

per-vertex color, and generating or transforming texture coordinates.

The previous chapters (specifically, Chapter 5, “OpenGL ES Shading

Language,” and Chapter 6, “Vertex Attributes, Vertex Arrays, and Buffer

Objects”) discussed how to specify the vertex attribute and uniform

inputs and also gave a good description of the OpenGL ES 3.0 Shading

Language. Chapter 7, “Primitive Assembly and Rasterization,” discussed

how the output of the vertex shader, referred to as vertex shader output

variables, is used by the rasterization stage to generate per-fragment

values, which are then input to the fragment shader. In this chapter,

we begin with a high-level overview of a vertex shader, including its

inputs and outputs. We then describe how to write vertex shaders by

discussing a few examples. These examples describe common use cases

such as transforming a vertex position with a model view and projection

matrix, vertex lighting that generates per-vertex diffuse and specular

colors, texture coordinate generation, vertex skinning, and displacement

mapping. We hope that these examples help you get a good idea of how

to write vertex shaders. Last but not least, we describe a vertex shader that

implements the OpenGL ES 1.1 fixed-function vertex pipeline.

Vertex Shader Overview

The vertex shader provides a general-purpose programmable method for

operating on vertices. Figure 8-2 shows the inputs and outputs of a vertex

shader. The inputs to the vertex shader consist of the following:

• Attributes: Per-vertex data supplied using vertex arrays.

• Uniforms and uniform buffers: Constant data used by the vertex
  shader.

• Samplers: A specific type of uniform that represents textures used by
  the vertex shader.

• Shader program: Vertex shader program source code or executable
  that describes the operations that will be performed on the vertex.

The outputs of the vertex shader are called vertex shader output

variables. In the primitive rasterization stage, these variables are

computed for each generated fragment and are passed in as inputs to the

fragment shader.

[Figure 8-1 OpenGL ES 3.0 Programmable Pipeline. Stages shown: API, Vertex Buffer/Array Objects, Vertex Shader, Primitive Assembly, Rasterization, Fragment Shader, Per-Fragment Operations, Framebuffer, together with Textures and Transform Feedback]

Vertex Shader Built-In Variables

The built-in variables of a vertex shader can be categorized into special

variables that are input or output of the vertex shader, uniform state

such as depth range, and constants that specify maximum values such as

the number of attributes, number of vertex shader output variables, and

number of uniforms.

Built-In Special Variables

OpenGL ES 3.0 has built-in special variables that serve as inputs to the

vertex shader, or outputs by the vertex shader that then become inputs

to the fragment shader, or outputs by the fragment shader. The following

built-in special variables are available to the vertex shader:

gl_VertexID is an input variable that holds an integer index for the

vertex. This integer variable is declared using the highp precision qualifier.

gl_InstanceID is an input variable that holds the instance number of

a primitive in an instanced draw call. Its value is 0 for a regular draw

call. gl_InstanceID is an integer variable declared using the highp

precision qualifier.

Figure 8-2 OpenGL ES 3.0 Vertex Shader
[Diagram labels: Input (Attribute) 0 through N, Uniforms, Samplers, Vertex Shader, Output (Varying) 0 through N, gl_Position, gl_PointSize]


gl_Position is used to output the vertex position in clip coordinates.

Its values are used by the clipping and viewport stages to perform

appropriate clipping of primitives and to convert the vertex position

from clip coordinates to screen coordinates. The value of gl_Position

is undefined if the vertex shader does not write to gl_Position.
gl_Position is a floating-point variable declared using the highp
precision qualifier.

gl_PointSize is used to write the size of the point sprite in pixels.

It is used when point sprites are rendered. The gl_PointSize

value output by a vertex shader is then clamped to the aliased

point size range supported by the OpenGL ES 3.0 implementation.

gl_PointSize is a floating-point variable declared using the highp
precision qualifier.

gl_FrontFacing is a special variable that, although not directly

written by the vertex shader, is generated based on the position values

generated by the vertex shader and primitive type being rendered.

gl_FrontFacing is a boolean variable.

Built-In Uniform State

The only built-in uniform state available inside a vertex shader is the

depth range in window coordinates. This is given by the built-in uniform

name gl_DepthRange, which is declared as a uniform of type

gl_DepthRangeParameters.

struct gl_DepthRangeParameters
{
    highp float near; // near Z
    highp float far;  // far Z
    highp float diff; // far - near
};

uniform gl_DepthRangeParameters gl_DepthRange;

Built-In Constants

The following built-in constants are also available inside the vertex

shader:

const mediump int gl_MaxVertexAttribs = 16;

const mediump int gl_MaxVertexUniformVectors = 256;

const mediump int gl_MaxVertexOutputVectors = 16;

const mediump int gl_MaxVertexTextureImageUnits = 16;

const mediump int gl_MaxCombinedTextureImageUnits = 32;


The built-in constants describe the following maximum terms:

gl_MaxVertexAttribs is the maximum number of vertex attributes

that can be specified. The minimum value supported by all ES 3.0

implementations is 16.

gl_MaxVertexUniformVectors is the maximum number of
vec4 uniform entries that can be used inside a vertex shader.
The minimum value supported by all ES 3.0 implementations is
256 vec4 entries. The number of vec4 uniform entries that can
actually be used by a developer can vary from one implementation
to another and from one vertex shader to another. For example,
some implementations might count user-specified literal values
used in a vertex shader against the uniform limit. In other cases,
implementation-specific uniforms (or constants) might need to be
included depending on whether the vertex shader makes use of any
built-in transcendental functions. There currently is no mechanism
that an application can use to find the number of uniform entries
that it can use in a particular vertex shader. If the limit is
exceeded, the vertex shader fails to compile, and the compile log
might provide specific information with regard to the number of
uniform entries being used. However, the information returned by
the compile log is implementation specific. We provide some
guidelines in this chapter to help maximize the use of vertex
uniform entries available in a vertex shader.

gl_MaxVertexOutputVectors is the maximum number of output

vectors—that is, the number of vec4 entries that can be output

by a vertex shader. The minimum value supported by all ES 3.0

implementations is 16 vec4 entries.

gl_MaxVertexTextureImageUnits is the maximum number of

texture units available in a vertex shader. The minimum value is 16.

gl_MaxCombinedTextureImageUnits is the sum of the maximum

number of texture units available in the vertex + fragment shaders.

The minimum value is 32.

The values specified for each built-in constant are the minimum values

that must be supported by all OpenGL ES 3.0 implementations. It is

possible that implementations might support values greater than the

minimum values described. The actual supported values can be queried

using the following code:

GLint maxVertexAttribs, maxVertexUniforms, maxVaryings;
GLint maxVertexTextureUnits, maxCombinedTextureUnits;

glGetIntegerv ( GL_MAX_VERTEX_ATTRIBS, &maxVertexAttribs );
glGetIntegerv ( GL_MAX_VERTEX_UNIFORM_VECTORS,
                &maxVertexUniforms );
glGetIntegerv ( GL_MAX_VARYING_VECTORS,
                &maxVaryings );
glGetIntegerv ( GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS,
                &maxVertexTextureUnits );
glGetIntegerv ( GL_MAX_COMBINED_TEXTURE_IMAGE_UNITS,
                &maxCombinedTextureUnits );

Precision Qualifiers

This section briefly reviews precision qualifiers, which are covered in
depth in Chapter 5, “OpenGL ES Shading Language.” Precision qualifiers
can be used to specify the precision of any floating-point or integer-based
variable. The keywords for specifying the precision are lowp, mediump,
and highp. Some examples of declarations with precision qualifiers are
shown here:

highp vec4 position;

out lowp vec4 color;

mediump float specularExp;

highp int oneConstant;

In addition to precision qualifiers, default precision may be employed.
That is, if a variable is declared without having a precision qualifier, it will
have the default precision for that type. The default precision qualifier
is specified at the top of a vertex or fragment shader using the following
syntax:

precision highp float;

precision mediump int;

The precision specified for float will be used as the default precision
for all variables based on a floating-point value. Likewise, the precision
specified for int will be used as the default precision for all integer-based
variables. In the vertex shader, if no default precision is specified, the
default precision for both int and float is highp.

For operations typically performed in a vertex shader, the precision
qualifier that will most likely be needed is highp. For instance, operations
that transform a position with a matrix, transform normals and texture
coordinates, or generate texture coordinates will need to be done with
highp precision. Color computations and lighting equations can most
likely be done with mediump precision. Again, this decision will depend
on the kind of color computations being performed and the range and
precision required for the operations being performed. We believe that
highp will most likely be the default precision used for most operations in
a vertex shader; thus we use highp as the default precision qualifier in the
examples that follow.

Number of Uniforms Limitations in a Vertex Shader

gl_MaxVertexUniformVectors describes the maximum number of

uniforms that can be used in a vertex shader. The minimum value for

gl_MaxVertexUniformVectors that must be supported by any compliant

OpenGL ES 3.0 implementation is 256 vec4 entries. The uniform storage

is used to store the following variables:

Variables declared with the uniform qualifier

Constant variables

Literal values

Implementation-specific constants

The number of uniform variables used in a vertex shader along with the
variables declared with the const qualifier, literal values, and
implementation-specific constants must fit in gl_MaxVertexUniformVectors
as per the packing rules described in Chapter 5, “OpenGL ES Shading
Language.” If these do not fit, then the vertex shader will fail to compile.
A developer might potentially apply the packing rules and determine the
amount of uniform storage needed to store uniform variables, constant
variables, and literal values. It is not possible to determine the number of
implementation-specific constants, however, as this value will not only vary
from implementation to implementation but will also change depending on
which built-in shading language functions are being used by the vertex
shader. Typically, the implementation-specific constants are required when
built-in transcendental functions are used.

As far as literal values are concerned, the OpenGL ES 3.0 Shading
Language specification states that no constant propagation is assumed.
As a consequence, multiple instances of the same literal value(s) will
be counted multiple times. Understandably, it is easier to use literal
values such as 0.0 or 1.0 in a vertex shader, but our recommendation
is that this technique be avoided as much as possible. Instead of
using literal values, appropriate constant variables should be declared.
This approach avoids counting the same literal value multiple times,
which might cause the vertex shader to fail to compile if vertex uniform
storage requirements exceed what the implementation supports.

Consider the following example, which shows a snippet of vertex shader

code that transforms two texture coordinates per vertex:

#version 300 es
#define NUM_TEXTURES 2

uniform mat4 tex_matrix[NUM_TEXTURES];        // texture matrices
uniform bool enable_tex[NUM_TEXTURES];        // texture enables
uniform bool enable_tex_matrix[NUM_TEXTURES]; // texture matrix enables

in vec4 a_texcoord0; // available if enable_tex[0] is true
in vec4 a_texcoord1; // available if enable_tex[1] is true

out vec4 v_texcoord[NUM_TEXTURES];

void main()
{
    v_texcoord[0] = vec4 ( 0.0, 0.0, 0.0, 1.0 );
    // is texture 0 enabled
    if ( enable_tex[0] )
    {
        // is texture matrix 0 enabled
        if ( enable_tex_matrix[0] )
            v_texcoord[0] = tex_matrix[0] * a_texcoord0;
        else
            v_texcoord[0] = a_texcoord0;
    }

    v_texcoord[1] = vec4 ( 0.0, 0.0, 0.0, 1.0 );
    // is texture 1 enabled
    if ( enable_tex[1] )
    {
        // is texture matrix 1 enabled
        if ( enable_tex_matrix[1] )
            v_texcoord[1] = tex_matrix[1] * a_texcoord1;
        else
            v_texcoord[1] = a_texcoord1;
    }

    // set gl_Position to make this into a valid vertex shader
}


This code might result in each reference to the literal values 0, 1, 0.0, and

1.0 counting against the uniform storage. To guarantee that these literal

values count only once against the uniform storage, the vertex shader

code snippet should be written as follows:

#version 300 es
#define NUM_TEXTURES 2

const int c_zero = 0;
const int c_one = 1;

uniform mat4 tex_matrix[NUM_TEXTURES];        // texture matrices
uniform bool enable_tex[NUM_TEXTURES];        // texture enables
uniform bool enable_tex_matrix[NUM_TEXTURES]; // texture matrix enables

in vec4 a_texcoord0; // available if enable_tex[0] is true
in vec4 a_texcoord1; // available if enable_tex[1] is true

out vec4 v_texcoord[NUM_TEXTURES];

void main()
{
    v_texcoord[c_zero] = vec4 ( float(c_zero), float(c_zero),
                                float(c_zero), float(c_one) );
    // is texture 0 enabled
    if ( enable_tex[c_zero] )
    {
        // is texture matrix 0 enabled
        if ( enable_tex_matrix[c_zero] )
            v_texcoord[c_zero] = tex_matrix[c_zero] * a_texcoord0;
        else
            v_texcoord[c_zero] = a_texcoord0;
    }

    v_texcoord[c_one] = vec4 ( float(c_zero), float(c_zero),
                               float(c_zero), float(c_one) );
    // is texture 1 enabled
    if ( enable_tex[c_one] )
    {
        // is texture matrix 1 enabled
        if ( enable_tex_matrix[c_one] )
            v_texcoord[c_one] = tex_matrix[c_one] * a_texcoord1;
        else
            v_texcoord[c_one] = a_texcoord1;
    }

    // set gl_Position to make this into a valid vertex shader
}


This section should help you better understand the limitations of the

OpenGL ES 3.0 Shading Language and appreciate how to write vertex shaders

that should compile and run on most OpenGL ES 3.0 implementations.

Vertex Shader Examples

We now present a few examples that demonstrate how to implement the

following features in a vertex shader:

Transforming vertex position with a matrix

Lighting computations to generate per-vertex diffuse and specular color

Texture coordinate generation

Vertex skinning

Displacing vertex position with a texture lookup value

These features represent typical use cases that OpenGL ES 3.0 applications

will want to perform in a vertex shader.

Matrix Transformations

Example 8-1 describes a simple vertex shader written using the OpenGL
ES Shading Language. The vertex shader takes a position and its associated
color data as inputs or attributes, transforms the position by a 4 × 4
matrix, and outputs the transformed position and color.

Example 8-1 Vertex Shader with Matrix Transform for the Position

#version 300 es

// uniforms used by the vertex shader
uniform mat4 u_mvpMatrix; // matrix to convert position from
                          // model space to clip space

// attribute inputs to the vertex shader
layout(location = 0) in vec4 a_position; // input position value
layout(location = 1) in vec4 a_color;    // input color

// vertex shader output, input to the fragment shader
out vec4 v_color;

void main()
{
    v_color = a_color;
    gl_Position = u_mvpMatrix * a_position;
}

The transformed vertex positions and primitive type are then used by the

setup and rasterization stages to rasterize the primitive into fragments. For

each fragment, the interpolated v_color will be computed and passed as

input to the fragment shader.

Example 8-1 introduces the concept of the model–view–projection (MVP)

matrix in the uniform u_mvpMatrix. As described in the Coordinate Systems

section in Chapter 7, the positions input to the vertex shader are stored in

object coordinates and the output position of the vertex shader is stored

in clip coordinates. The MVP matrix is the product of three very important

transformation matrices in 3D graphics that perform this transformation:

the model matrix, the view matrix, and the projection matrix.

The transformations performed by each of the individual matrices that

make up the MVP matrix are as follows:

Model matrix—Transform object coordinates to world coordinates.

View matrix—Transform world coordinates to eye coordinates.

Projection matrix—Transform eye coordinates to clip coordinates.

Model–View Matrix

In traditional fixed-function OpenGL, the model and view matrices are
combined into a single matrix known as the model–view matrix. This
4 × 4 matrix transforms the vertex position from object coordinates
into eye coordinates. It is the combination of the transformation from
object to world coordinates and the transformation from world to eye
coordinates. In fixed-function OpenGL, the model–view matrix can
be created using functions such as glRotatef, glTranslatef, and
glScalef. Because these functions do not exist in OpenGL ES 2.0 or 3.0,
it is up to the application to handle creation of the model–view matrix.
To simplify this process, we have included in the sample code framework
esTransform.c, which contains functions that perform equivalently to
the fixed-function OpenGL routines for building a model–view matrix.
These transformation functions (esRotate, esTranslate, esScale,
esMatrixLoadIdentity, and esMatrixMultiply) are detailed in
Appendix C. In Example 8-1, the model–view matrix is computed as follows:

ESMatrix modelview;

// Generate a model-view matrix to rotate/translate the cube
esMatrixLoadIdentity ( &modelview );

// Translate away from the viewer
esTranslate ( &modelview, 0.0, 0.0, -2.0 );

// Rotate the cube
esRotate ( &modelview, userData->angle, 1.0, 0.0, 1.0 );

First, the identity matrix is loaded into the modelview matrix using

esMatrixLoadIdentity. Then the identity matrix is concatenated with a

translation that moves the object away from the viewer. Finally, a rotation

is concatenated to the modelview matrix that rotates the object around

the vector (1.0, 0.0, 1.0) with an angle in degrees that is updated based on

time to rotate the object continuously.

Projection Matrix

The projection matrix takes the eye coordinates (computed from
applying the model–view matrix) and produces clip coordinates
as described in the Clipping section in Chapter 7. In fixed-function
OpenGL, this transformation was specified using glFrustum or
the OpenGL utility function gluPerspective. In the OpenGL ES
Framework API, we have provided two equivalent functions: esFrustum
and esPerspective. These functions specify the clip volume detailed
in Chapter 7. The esFrustum function describes the clip volume by
specifying the coordinates of the clip volume. The esPerspective
function is a convenience function that computes the parameters to
esFrustum using a field-of-view and aspect ratio description of the
viewing volume. The projection matrix is computed for Example 8-1 as
follows:

ESMatrix projection;

// Compute the window aspect ratio

aspect = (GLfloat) esContext->width /

(GLfloat) esContext->height;

// Generate a perspective matrix with a 60-degree FOV

// and near and far clip planes at 1.0 and 20.0

esMatrixLoadIdentity ( &projection);

esPerspective ( &projection, 60.0f, aspect, 1.0f, 20.0f );

Finally, the MVP matrix is computed as the product of the model–view

and projection matrices:

// Compute the final MVP by multiplying the

// model-view and projection matrices together

esMatrixMultiply ( &userData->mvpMatrix, &modelview,

&projection );


The MVP matrix is loaded into the uniform for the shader using

glUniformMatrix4fv.

// Get the uniform locations

userData->mvpLoc =

glGetUniformLocation ( userData->programObject,

"u_mvpMatrix" );

…

// Load the MVP matrix

glUniformMatrix4fv( userData->mvpLoc, 1, GL_FALSE,

(GLfloat*) &userData->mvpMatrix.m[0][0] );

Lighting in a Vertex Shader

In this section, we look at examples that compute the lighting equation

for directional lights, point lights, and spotlights. The vertex shaders

described in this section use the OpenGL ES 1.1 lighting equation model

to compute the lighting equation for a directional or a spot (or point)

light. In the lighting examples described here, the viewer is assumed to be

at infinity.

A directional light is a light source that is at an infinite distance from
the objects in the scene being lit. An example of a directional light is
the sun. As the light is at infinite distance, the light rays from the light
source are parallel. The light direction vector is a constant and does not
need to be computed per vertex. Figure 8-3 describes the terms that are
needed in computing the lighting equation for a directional light. Peye is
the position of the viewer, Plight is the position of the light (Plight.w = 0), N
is the normal, and H is the half-plane vector. Because Plight.w = 0, the light
direction vector will be Plight.xyz. The half-plane vector H is computed
as ||VPlight + VPeye||, where || · || denotes normalization to unit length.
As both the light source and viewer are at infinity, the
half-plane vector H = ||Plight.xyz + (0, 0, 1)||.

Figure 8-3 Geometric Factors in Computing Lighting Equation for a
Directional Light

Example 8-2 provides the vertex shader code that computes the lighting

equation for a directional light. The directional light properties are

described by a directional_light struct that contains the following

elements:

direction—The normalized light direction in eye space.

halfplane—The normalized half-plane vector H. This can be

precomputed for a directional light, as it does not change.

ambient_color—The ambient color of the light.

diffuse_color—The diffuse color of the light.

specular_color—The specular color of the light.

The material properties needed to compute the vertex diffuse and specular

color are described by a material_properties struct that contains the

following elements:

ambient_color—The ambient color of the material.

diffuse_color—The diffuse color of the material.

specular_color—The specular color of the material.

specular_exponent—The specular exponent that describes the

shininess of the material and is used to control the shininess of the

specular highlight.

Example 8-2 Directional Light

#version 300 es

struct directional_light
{
    vec3 direction; // normalized light direction in eye space
    vec3 halfplane; // normalized half-plane vector
    vec4 ambient_color;
    vec4 diffuse_color;
    vec4 specular_color;
};

struct material_properties
{
    vec4 ambient_color;
    vec4 diffuse_color;
    vec4 specular_color;
    float specular_exponent;
};

const float c_zero = 0.0;
const float c_one = 1.0;

uniform material_properties material;
uniform directional_light light;

// normal has been transformed into eye space and is a
// normalized vector; this function returns the computed color
vec4 directional_light_color ( vec3 normal )
{
    vec4 computed_color = vec4 ( c_zero, c_zero, c_zero, c_zero );
    float ndotl; // dot product of normal & light direction
    float ndoth; // dot product of normal & half-plane vector

    ndotl = max ( c_zero, dot ( normal, light.direction ) );
    ndoth = max ( c_zero, dot ( normal, light.halfplane ) );

    computed_color += ( light.ambient_color
                        * material.ambient_color );
    computed_color += ( ndotl * light.diffuse_color
                        * material.diffuse_color );
    if ( ndoth > c_zero )
    {
        computed_color += ( pow ( ndoth,
                                  material.specular_exponent ) *
                            material.specular_color *
                            light.specular_color );
    }

    return computed_color;
}

// add a main function to make this into a valid vertex shader

The directional light vertex shader code described in Example 8-2
combines the per-vertex diffuse and specular color into a single color
(given by computed_color). Another option would be to compute the
per-vertex diffuse and specular colors and pass them as separate output
variables to the fragment shader.

Note: In Example 8-2, we multiply the material colors (ambient, diffuse,
and specular) with the light colors. This is fine if we are computing
the lighting equation for only one light. If we have to compute the
lighting equation for multiple lights, however, we should compute
the ambient, diffuse, and specular values for each light and then
compute the final vertex color by multiplying the material ambient,
diffuse, and specular colors with appropriate computed terms and
then summing them to generate a per-vertex color.
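The multiple-light approach in the note can be sketched on the CPU in C to make the order of operations concrete. This is only an illustration of the arithmetic: the LightTerms and MaterialColors types and the combine_lights function are hypothetical names, and each light's diffuse and specular terms are assumed to be already scaled by their geometric factors (N·L and the powered N·H term).

```c
/* Hypothetical per-light contributions (RGB). The diffuse term is assumed
   to be pre-scaled by max(0, N.L) and the specular term by
   pow(N.H, specular_exponent). */
typedef struct {
    float ambient[3];
    float diffuse[3];
    float specular[3];
} LightTerms;

/* Hypothetical material colors (RGB). */
typedef struct {
    float ambient[3];
    float diffuse[3];
    float specular[3];
} MaterialColors;

/* Sums each term over all lights first, then multiplies by the material
   colors once and adds the three products into the per-vertex color. */
void combine_lights(const LightTerms *lights, int count,
                    const MaterialColors *mat, float out_rgb[3])
{
    for (int ch = 0; ch < 3; ++ch) {
        float ambient_sum = 0.0f;
        float diffuse_sum = 0.0f;
        float specular_sum = 0.0f;

        for (int i = 0; i < count; ++i) {
            ambient_sum  += lights[i].ambient[ch];
            diffuse_sum  += lights[i].diffuse[ch];
            specular_sum += lights[i].specular[ch];
        }
        out_rgb[ch] = mat->ambient[ch]  * ambient_sum +
                      mat->diffuse[ch]  * diffuse_sum +
                      mat->specular[ch] * specular_sum;
    }
}
```

Applying the material colors once, after the per-light sums, saves a multiplication per term per light compared with multiplying inside the loop.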

A point light is a light source that emanates light in all directions from a

position in space. A point light is given by a position vector (x, y, z, w),

where w ≠ 0. The point light shines evenly in all directions but its intensity

falls off (i.e., becomes attenuated) based on the distance from the light to

the object. This attenuation is computed using the following equation:

$$\text{distance attenuation} = \frac{1}{K_0 + K_1 \times \lVert VP_{light} \rVert + K_2 \times \lVert VP_{light} \rVert^2}$$

where K0, K1, and K2 are the constant, linear, and quadratic attenuation

factors, respectively.

A spotlight is a light source with both a position and a direction that

simulates a cone of light emitted from a position (Plight) in a direction

(given by spotdirection). Figure 8-4 describes the terms that are needed in

computing the lighting equation for a spotlight.

Figure 8-4 Geometric Factors in Computing Lighting Equation for a Spotlight
[Diagram labels: Peye, Plight, spot direction, spot cutoff angle]

The intensity of the emitted light is attenuated by a spot cutoff factor
based on the angle from the center of the cone. The cosine of the angle
away from the center axis of the cone is computed as the dot product of
the normalized VPlight and spotdirection. The spot cutoff factor is 1.0 in
the spotlight direction given by spotdirection and falls off exponentially
to 0.0 at the spot cutoff angle away from that direction.

Example 8-3 describes the vertex shader code that computes the lighting

equation for a spot (and point) light. The spotlight properties are

described by a spot_light struct that contains the following elements:

position—The light position in eye space.

ambient_color—The ambient color of the light.

diffuse_color—The diffuse color of the light.

specular_color—The specular color of the light.

attenuation_factors—The distance attenuation factors K0, K1, and K2.

compute_distance_attenuation—A boolean term that determines

whether the distance attenuation must be computed.

spot_direction—The normalized spot direction vector.

spot_exponent—The spotlight exponent used to compute the spot

cutoff factor.

spot_cutoff_angle—The spotlight cutoff angle in degrees.

Example 8-3 Spotlight

#version 300 es

struct spot_light
{
    vec4 position;             // light position in eye space
    vec4 ambient_color;
    vec4 diffuse_color;
    vec4 specular_color;
    vec3 spot_direction;       // normalized spot direction
    vec3 attenuation_factors;  // attenuation factors K0, K1, K2
    bool compute_distance_attenuation;
    float spot_exponent;       // spotlight exponent term
    float spot_cutoff_angle;   // spot cutoff angle in degrees
};

struct material_properties
{
    vec4 ambient_color;
    vec4 diffuse_color;
    vec4 specular_color;
    float specular_exponent;
};

const float c_zero = 0.0;
const float c_one = 1.0;

uniform material_properties material;
uniform spot_light light;

// normal and position are normal and position values in
// eye space.
// normal is a normalized vector.
// This function returns the computed color.
vec4 spot_light_color ( vec3 normal, vec4 position )
{
    vec4 computed_color = vec4 ( c_zero, c_zero, c_zero, c_zero );
    vec3 lightdir;
    vec3 halfplane;
    float ndotl, ndoth;
    float att_factor;

    att_factor = c_one;

    // we assume "w" values for light position and
    // vertex position are the same
    lightdir = light.position.xyz - position.xyz;

    // compute distance attenuation
    if ( light.compute_distance_attenuation )
    {
        vec3 att_dist;

        att_dist.x = c_one;
        att_dist.z = dot ( lightdir, lightdir );
        att_dist.y = sqrt ( att_dist.z );
        att_factor = c_one / dot ( att_dist,
                                   light.attenuation_factors );
    }

    // normalize the light direction vector
    lightdir = normalize ( lightdir );

    // compute spot cutoff factor
    if ( light.spot_cutoff_angle < 180.0 )
    {
        float spot_factor = dot ( -lightdir,
                                  light.spot_direction );

        if ( spot_factor >= cos ( radians (
                                  light.spot_cutoff_angle ) ) )
            spot_factor = pow ( spot_factor, light.spot_exponent );
        else
            spot_factor = c_zero;

        // compute combined distance and spot attenuation factor
        att_factor *= spot_factor;
    }

    if ( att_factor > c_zero )
    {
        // process lighting equation --> compute the light color
        computed_color += ( light.ambient_color *
                            material.ambient_color );
        ndotl = max ( c_zero, dot ( normal, lightdir ) );
        computed_color += ( ndotl * light.diffuse_color *
                            material.diffuse_color );
        halfplane = normalize ( lightdir + vec3 ( c_zero, c_zero,
                                                  c_one ) );
        ndoth = dot ( normal, halfplane );
        if ( ndoth > c_zero )
        {
            computed_color += ( pow ( ndoth,
                                      material.specular_exponent ) *
                                material.specular_color *
                                light.specular_color );
        }

        // multiply color with computed attenuation
        computed_color *= att_factor;
    }

    return computed_color;
}

// add a main function to make this into a valid vertex shader

Generating Texture Coordinates

We look at two examples that generate texture coordinates in a vertex
shader. The two examples are used when rendering shiny (i.e., reflective)
objects in a scene by generating a reflection vector and then using this
vector to compute a texture coordinate that indexes into a
latitude–longitude map (also called a sphere map) or a cubemap (represents
six views or faces that capture the reflected environment, assuming a single
viewpoint in the middle of the shiny object). The fixed-function OpenGL
specification describes the texture coordinate generation modes as
GL_SPHERE_MAP and GL_REFLECTION_MAP, respectively. The GL_SPHERE_MAP
mode generates a texture coordinate that uses a reflection vector to
compute a 2D texture coordinate for lookup into a 2D texture map.
The GL_REFLECTION_MAP mode generates a texture coordinate that is a
reflection vector, which can then be used as a 3D texture coordinate
for lookup into a cubemap. Examples 8-4 and 8-5 show the vertex shader
code that generates the texture coordinates that will be used by the
appropriate fragment shader to calculate the reflected image on the shiny
object.

Example 8-4 Sphere Map Texture Coordinate Generation

// position is the normalized position coordinate in eye space.
// normal is the normalized normal coordinate in eye space.
// This function returns a vec2 texture coordinate.
vec2 sphere_map ( vec3 position, vec3 normal )
{
    vec3 reflection = reflect ( position, normal );
    float m = 2.0 * sqrt ( reflection.x * reflection.x +
                           reflection.y * reflection.y +
                           ( reflection.z + 1.0 ) * ( reflection.z + 1.0 ) );
    return vec2 ( ( reflection.x / m + 0.5 ),
                  ( reflection.y / m + 0.5 ) );
}

Example 8-5 Cubemap Texture Coordinate Generation

// position is the normalized position coordinate in eye space.
// normal is the normalized normal coordinate in eye space.
// This function returns the reflection vector as a vec3 texture
// coordinate.
vec3 cube_map ( vec3 position, vec3 normal )
{
    return reflect ( position, normal );
}

The reflection vector will then be used inside a fragment shader as the
texture coordinate to the appropriate cubemap.


Vertex Skinning

Vertex skinning is a commonly used technique whereby the joins

between polygons are smoothed. This is implemented by applying

additional transform matrices with appropriate weights to each vertex.

The multiple matrices used to skin vertices are stored in a matrix palette.

The matrices’ indices per vertex are used to refer to appropriate matrices

in the matrix palette that will be used to skin the vertex. Vertex skinning

is commonly used for character models in 3D games to ensure that they

appear smooth and realistic (as much as possible) without having to use

additional geometry. The number of matrices used to skin a vertex is

typically two to four.

The mathematics of vertex skinning is given by the following equations:

$$P' = \sum_{i=1}^{n} w_i \times M_i \times P$$

$$N' = \sum_{i=1}^{n} w_i \times M_i^{-1T} \times N$$

$$\sum_{i=1}^{n} w_i = 1$$

where

n is the number of matrices that will be used to transform the vertex

P is the vertex position

P' is the transformed (skinned) position

N is the vertex normal

N' is the transformed (skinned) normal

M_i is the matrix associated with the ith matrix per vertex and is
computed as

M_i = matrix_palette [ matrix_index[i] ]

with n matrix_index values specified per vertex

M_i^{-1T} is the inverse transpose of matrix M_i

w_i is the weight associated with the matrix
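The position equation can be checked with a small CPU-side sketch in C. This is not the shader presented later in this section: PaletteMatrix and skin_position_cpu are hypothetical names, and the sketch assumes each palette matrix is stored as three rows of four floats.

```c
/* Hypothetical palette entry: a 4 x 3 transform stored in row-major order
   as three rows of four floats (the layout of three vec4 values). */
typedef struct {
    float row[3][4];
} PaletteMatrix;

/* Computes P' as the sum over i of w_i * M_i * P, where
   M_i = palette[matrix_index[i]]. position is (x, y, z, w), and the
   weights are assumed to sum to 1. */
void skin_position_cpu(const PaletteMatrix *palette,
                       const int *matrix_index, const float *weight,
                       int count, const float position[4],
                       float skinned[3])
{
    skinned[0] = skinned[1] = skinned[2] = 0.0f;

    for (int i = 0; i < count; ++i) {
        const PaletteMatrix *m = &palette[matrix_index[i]];

        for (int r = 0; r < 3; ++r) {
            /* one row of M_i dotted with P gives one output component */
            float dot = 0.0f;
            for (int c = 0; c < 4; ++c)
                dot += m->row[r][c] * position[c];
            skinned[r] += weight[i] * dot;
        }
    }
}
```

Blending an identity matrix and a translation of two units along x with equal weights of 0.5 moves the point (1, 1, 1) to (2, 1, 1), halfway between the two transformed positions.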

We discuss how to implement vertex skinning with a matrix palette of 32 matrices and up to four matrices per vertex to generate a skinned vertex. A matrix palette size of 32 matrices is quite common. The matrices in the matrix palette typically are 4 × 3 column-major matrices (i.e., four vec3 entries per matrix). If the matrices were stored in column-major order, 128 uniform entries would be needed, each using only three of the four components of a vec4 row of uniform storage. The minimum value of gl_MaxVertexUniformVectors that is supported by all OpenGL ES 3.0 implementations is 256 vec4 entries. Under the uniform packing rules, the unused fourth component of those 128 rows can hold only uniforms declared to be of type float; it cannot hold a vec2, vec3, or vec4 uniform, so that storage is largely wasted. It is better to store the matrices in the palette in row-major order using three vec4 entries per matrix. Doing so uses 96 vec4 entries of uniform storage and leaves the remaining 160 vec4 entries for other uniforms. Note that we do not set aside uniform storage for the inverse transpose matrices needed to compute the skinned normal. This is typically not a problem, however: in most cases, the matrices used are orthonormal and, therefore, can be used to transform both the vertex position and the normal.
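The storage arithmetic in the preceding paragraph can be checked with a few lines of C. This is only a sketch: `palette_rows` and `rows_remaining` are hypothetical helper names, and the constant 256 is the guaranteed minimum quoted in the text, not a value queried from a real implementation.

```c
#include <assert.h>

/* Guaranteed minimum number of vec4 vertex uniform rows in OpenGL ES 3.0. */
enum { MIN_VERTEX_UNIFORM_VECTORS = 256 };

/* Rows of uniform storage consumed by a palette. Each vec3 or vec4 entry
 * occupies one vec4 row, whether or not it fills all four components. */
int palette_rows(int num_matrices, int entries_per_matrix)
{
    assert(num_matrices >= 0 && entries_per_matrix > 0);
    return num_matrices * entries_per_matrix;
}

/* Rows left over for other uniforms under the guaranteed minimum. */
int rows_remaining(int rows_used)
{
    return MIN_VERTEX_UNIFORM_VECTORS - rows_used;
}
```

With 32 matrices, four vec3 entries per matrix consume 128 rows, whereas three vec4 entries per matrix consume 96 rows and leave 160.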

Example 8-6 shows the vertex shader code that computes the skinned

normal and position. We assume that the matrix palette contains

32matrices, and that these matrices are stored in row-major order. The

matrices are also assumed to be orthonormal (i.e., the same matrix can be

used to transform position and normal) and up to four matrices are used

to transform each vertex.

Example 8-6 Vertex Skinning Shader with No Check of Whether

Matrix Weight = 0

#version 300 es

#define NUM_MATRICES 32 // 32 matrices in matrix palette

const int c_zero = 0;

const int c_one = 1;

const int c_two = 2;

const int c_three = 3;

// store 32 4 x 3 matrices as an array of floats representing

// each matrix in row-major order (i.e., 3 vec4s)

uniform vec4 matrix_palette[NUM_MATRICES * 3];

// vertex position and normal attributes

in vec4 a_position;

in vec3 a_normal;

// matrix weights - 4 entries / vertex

in vec4 a_matrixweights;

// matrix palette indices

in vec4 a_matrixindices;


void skin_position ( in vec4 position, float m_wt, int m_indx,

inout vec4 skinned_position )

{

vec4 tmp;

tmp.x = dot ( position, matrix_palette[m_indx] );

tmp.y = dot ( position, matrix_palette[m_indx + c_one] );

tmp.z = dot ( position, matrix_palette[m_indx + c_two] );

tmp.w = position.w;

skinned_position += m_wt * tmp;

}

void skin_normal ( in vec3 normal, float m_wt, int m_indx,

inout vec3 skinned_normal )

{

vec3 tmp;

tmp.x = dot ( normal, matrix_palette[m_indx].xyz );

tmp.y = dot ( normal, matrix_palette[m_indx + c_one].xyz );

tmp.z = dot ( normal, matrix_palette[m_indx + c_two].xyz );

skinned_normal += m_wt * tmp;

}

void do_skinning ( in vec4 position, in vec3 normal,

out vec4 skinned_position,

out vec3 skinned_normal )

{

skinned_position = vec4 ( float ( c_zero ) );

skinned_normal = vec3 ( float ( c_zero ) );

// transform position and normal to eye space using matrix

// palette with four matrices used to transform a vertex

float m_wt = a_matrixweights[0];

int m_indx = int ( a_matrixindices[0] ) * c_three;

skin_position ( position, m_wt, m_indx, skinned_position );

skin_normal ( normal, m_wt, m_indx, skinned_normal );

m_wt = a_matrixweights[1] ;

m_indx = int ( a_matrixindices[1] ) * c_three;

skin_position ( position, m_wt, m_indx, skinned_position );

skin_normal ( normal, m_wt, m_indx, skinned_normal );

m_wt = a_matrixweights[2];

m_indx = int ( a_matrixindices[2] ) * c_three;

skin_position ( position, m_wt, m_indx, skinned_position );

skin_normal ( normal, m_wt, m_indx, skinned_normal );

m_wt = a_matrixweights[3];
m_indx = int ( a_matrixindices[3] ) * c_three;
skin_position ( position, m_wt, m_indx, skinned_position );
skin_normal ( normal, m_wt, m_indx, skinned_normal );
}

// add a main function to make this into a valid vertex shader

In Example 8-6, the vertex skinning shader generates a skinned vertex by transforming a vertex with four matrices and appropriate matrix weights. It is possible and quite common that some of these matrix weights may be zero. In Example 8-6, the vertex is transformed using all four matrices, irrespective of their weights. It might be better, however, to use a conditional expression to check whether the matrix weight is zero before calling skin_position and skin_normal. In Example 8-7, the vertex skinning shader checks for a matrix weight of zero before applying the matrix transformation.

Example 8-7 Vertex Skinning Shader with Checks of Whether Matrix Weight = 0

void do_skinning ( in vec4 position, in vec3 normal,
                   out vec4 skinned_position,
                   out vec3 skinned_normal )
{
    skinned_position = vec4 ( float ( c_zero ) );
    skinned_normal = vec3 ( float ( c_zero ) );

    // transform position and normal to eye space using matrix
    // palette with four matrices used to transform a vertex

    int m_indx = 0;
    float m_wt = a_matrixweights[0];
    if ( m_wt > 0.0 )
    {
        m_indx = int ( a_matrixindices[0] ) * c_three;
        skin_position ( position, m_wt, m_indx, skinned_position );
        skin_normal ( normal, m_wt, m_indx, skinned_normal );
    }

    m_wt = a_matrixweights[1];
    if ( m_wt > 0.0 )
    {
        m_indx = int ( a_matrixindices[1] ) * c_three;
        skin_position ( position, m_wt, m_indx, skinned_position );
        skin_normal ( normal, m_wt, m_indx, skinned_normal );
    }

    m_wt = a_matrixweights[2];
    if ( m_wt > 0.0 )
    {
        m_indx = int ( a_matrixindices[2] ) * c_three;
        skin_position ( position, m_wt, m_indx, skinned_position );
        skin_normal ( normal, m_wt, m_indx, skinned_normal );
    }

    m_wt = a_matrixweights[3];
    if ( m_wt > 0.0 )
    {
        m_indx = int ( a_matrixindices[3] ) * c_three;
        skin_position ( position, m_wt, m_indx, skinned_position );
        skin_normal ( normal, m_wt, m_indx, skinned_normal );
    }
}

At first glance, we might conclude that the vertex skinning shader in Example 8-7 offers better performance than the vertex skinning shader in Example 8-6. This is not necessarily true; indeed, the answer can vary across GPUs. Such variations occur because in the conditional expression if (m_wt > 0.0), m_wt is a dynamic value and can be different for vertices being executed in parallel by the GPU. We now run into divergent flow control where vertices being executed in parallel may have different values for m_wt, which in turn can cause execution to serialize. If a GPU does not implement divergent flow control efficiently, the vertex shader in Example 8-7 might not be as efficient as the version in Example 8-6. Applications should, therefore, test performance of divergent flow control by executing a test shader on the GPU as part of the application initialization phase to determine which shaders to use.

Transform Feedback

The transform feedback mode allows for capturing the outputs of the vertex shader into buffer objects. The output buffers then can be used as sources of the vertex data in a subsequent draw call. This approach is useful for a wide range of techniques that perform animation on the GPU without any CPU intervention, such as particle animation or physics simulation using render-to-vertex-buffer.

To specify the set of vertex outputs to be captured during the transform feedback mode, use the following command:

void glTransformFeedbackVaryings(GLuint program,
                                 GLsizei count,
                                 const char** varyings,
                                 GLenum bufferMode)

program     specifies the handle to the program object.
count       specifies the number of vertex output variables used for transform feedback.
varyings    specifies an array of count zero-terminated strings specifying the names of the vertex output variables to use for transform feedback.
bufferMode  specifies the mode used to capture the vertex output variables when transform feedback is active. Valid values are GL_INTERLEAVED_ATTRIBS, to capture the vertex output variables into a single buffer, and GL_SEPARATE_ATTRIBS, to capture each vertex output variable into its own buffer.

After calling glTransformFeedbackVaryings, it is necessary to link the program object using glLinkProgram. For example, to specify two vertex outputs to be captured into one transform feedback buffer, the code will be as follows:

const char* varyings[] = { "v_position", "v_color" };
glTransformFeedbackVaryings ( programObject, 2, varyings,
                              GL_INTERLEAVED_ATTRIBS );
glLinkProgram ( programObject );

Then, we need to bind one or more buffer objects as the transform feedback buffers using glBindBuffer with GL_TRANSFORM_FEEDBACK_BUFFER. The buffer is allocated using glBufferData with GL_TRANSFORM_FEEDBACK_BUFFER and bound to the indexed binding points using glBindBufferBase or glBindBufferRange. These buffer APIs are described in more detail in Chapter 6, “Vertex Attributes, Vertex Arrays, and Buffer Objects.”

After the transform feedback buffers are bound, we can enter and exit the transform feedback mode using the following API calls:

void glBeginTransformFeedback(GLenum primitiveMode)
void glEndTransformFeedback()

primitiveMode  specifies the output type of the primitives that will be captured into the buffer objects that are bound for transform feedback. Transform feedback is limited to non-indexed GL_POINTS, GL_LINES, and GL_TRIANGLES.

All draw calls that occur between glBeginTransformFeedback and glEndTransformFeedback will have their vertex outputs captured into the transform feedback buffers. Table 8-1 indicates the allowed draw mode corresponding to the transform feedback primitive mode.

Table 8-1 Transform Feedback Primitive Mode and Allowed Draw Mode

Primitive Mode    Allowed Draw Mode
GL_POINTS         GL_POINTS
GL_LINES          GL_LINES, GL_LINE_LOOP, GL_LINE_STRIP
GL_TRIANGLES      GL_TRIANGLES, GL_TRIANGLE_STRIP, GL_TRIANGLE_FAN
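The pairing rules in Table 8-1 can be expressed as a small validity check. The sketch below is an illustration under stated assumptions: `DrawMode` and `draw_mode_allowed` are hypothetical names, and the enumerators are local stand-ins for the GL enums (given the same numeric values) so that the code compiles without the GL headers.

```c
#include <assert.h>
#include <stdbool.h>

/* Local stand-ins for GL_POINTS ... GL_TRIANGLE_FAN. */
enum DrawMode {
    MODE_POINTS         = 0x0000,
    MODE_LINES          = 0x0001,
    MODE_LINE_LOOP      = 0x0002,
    MODE_LINE_STRIP     = 0x0003,
    MODE_TRIANGLES      = 0x0004,
    MODE_TRIANGLE_STRIP = 0x0005,
    MODE_TRIANGLE_FAN   = 0x0006
};

/* Returns true if draw_mode may be used while transform feedback is
 * active with the given primitive_mode, following Table 8-1. */
bool draw_mode_allowed(enum DrawMode primitive_mode, enum DrawMode draw_mode)
{
    switch (primitive_mode) {
    case MODE_POINTS:
        return draw_mode == MODE_POINTS;
    case MODE_LINES:
        return draw_mode == MODE_LINES ||
               draw_mode == MODE_LINE_LOOP ||
               draw_mode == MODE_LINE_STRIP;
    case MODE_TRIANGLES:
        return draw_mode == MODE_TRIANGLES ||
               draw_mode == MODE_TRIANGLE_STRIP ||
               draw_mode == MODE_TRIANGLE_FAN;
    default:
        /* Not a valid transform feedback primitive mode. */
        return false;
    }
}
```

An application might run such a check in debug builds before issuing a draw call inside a transform feedback block, since a mismatched draw mode produces a GL error instead of captured data.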

We can retrieve the number of primitives that were successfully written into the transform feedback buffer objects using glGetQueryObjectuiv after setting up glBeginQuery and glEndQuery with GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN. For example, to begin and end the transform feedback mode for rendering a set of points and querying the number of points written, the code will be as follows:

glBeginTransformFeedback ( GL_POINTS );

glBeginQuery ( GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN,

queryObject );

glDrawArrays ( GL_POINTS, 0, 10 );

glEndQuery ( GL_TRANSFORM_FEEDBACK_PRIMITIVES_WRITTEN );

glEndTransformFeedback ( );

// query the number of primitives written

glGetQueryObjectuiv( queryObject, GL_QUERY_RESULT,

&numPoints );
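To size the transform feedback buffer for a draw like the one above, it helps to estimate how many primitives the draw produces and how many bytes they occupy. The following C sketch is hypothetical throughout: `Prim`, `primitives_generated`, `vertices_per_primitive`, and `interleaved_bytes` are illustrative names, the draw mode is assumed to equal the primitive mode, and the per-vertex layout of eight floats assumes that v_position and v_color are both vec4 outputs, which the text does not state.

```c
#include <assert.h>
#include <stddef.h>

enum Prim { PRIM_POINTS, PRIM_LINES, PRIM_TRIANGLES };

/* Vertices consumed by each primitive of the given type. */
size_t vertices_per_primitive(enum Prim mode)
{
    switch (mode) {
    case PRIM_POINTS:    return 1;
    case PRIM_LINES:     return 2;
    case PRIM_TRIANGLES: return 3;
    }
    return 0;
}

/* Primitives produced by a non-indexed draw of vertex_count vertices when
 * the draw mode equals the primitive mode; leftover vertices are dropped. */
size_t primitives_generated(enum Prim mode, size_t vertex_count)
{
    size_t per_prim = vertices_per_primitive(mode);
    return per_prim ? vertex_count / per_prim : 0;
}

/* Bytes needed in one interleaved buffer to hold every captured vertex. */
size_t interleaved_bytes(enum Prim mode, size_t vertex_count,
                         size_t floats_per_vertex)
{
    return primitives_generated(mode, vertex_count)
         * vertices_per_primitive(mode)
         * floats_per_vertex * sizeof(float);
}
```

For the ten-point draw shown above, this estimate gives ten primitives. If the buffer is at least as large as the byte estimate, the query result should match the primitive count; a smaller buffer would cause fewer primitives to be written.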


We can disable and enable rasterization while capturing in transform

feedback mode using glEnable and glDisable with

GL_RASTERIZER_DISCARD. While GL_RASTERIZER_DISCARD is enabled,

no fragment shader will run.

Note that we describe a full example of using transform feedback in

the Particle System Using Transform Feedback example in Chapter 14,

“Advanced Programming with OpenGL ES 3.0.”

Vertex Textures

OpenGL ES 3.0 supports texture lookup operations in a vertex shader. This

is useful to implement techniques such as displacement mapping, where

you can displace the vertex position along the vertex normal based on

the texture lookup value in the vertex shader. A typical application of the

displacement mapping technique is for rendering terrain or water surfaces.
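The displacement step described above amounts to moving the position along the normal by the value fetched from the texture. A minimal C sketch of that arithmetic follows; the type `Vec3` and the function `displace` are hypothetical names, and the texture fetch itself is replaced by a plain float argument.

```c
#include <assert.h>

typedef struct { float x, y, z; } Vec3;

/* Move position p along normal n by the scalar that the vertex shader
 * would fetch from the displacement map. */
Vec3 displace(Vec3 p, Vec3 n, float displacement)
{
    Vec3 out = { p.x + n.x * displacement,
                 p.y + n.y * displacement,
                 p.z + n.z * displacement };
    return out;
}
```

With a unit normal pointing up the y axis and a fetched value of 0.5, only the y component of the position changes, which is the behavior one expects for a height-mapped terrain or water surface.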

Performing texture lookup in a vertex shader has some notable

limitations:

The level of detail is not implicitly computed.

The bias parameter in the texture lookup function is not accepted.

The base texture is used for mipmapped texture.

The maximum number of texture image units available to the vertex shader can be queried using glGetIntegerv with GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS. Every OpenGL ES 3.0 implementation must support at least 16.

Example 8-8 is a sample vertex shader that performs displacement

mapping. The process of loading textures on various texture units is

described in more detail in Chapter 9, “Texturing.”

Example 8-8 Displacement Mapping Vertex Shader

#version 300 es

// uniforms used by the vertex shader
uniform mat4 u_mvpMatrix;   // matrix to convert P from
                            // model space to clip space
uniform sampler2D displacementMap;

// attribute inputs to the vertex shader
layout(location = 0) in vec4 a_position;   // input position value
layout(location = 1) in vec3 a_normal;     // input normal value
layout(location = 2) in vec2 a_texcoord;   // input texcoord value
layout(location = 3) in vec4 a_color;      // input color

// vertex shader output, input to the fragment shader
out vec4 v_color;

void main ( )
{
    v_color = a_color;

    float displacement = texture ( displacementMap,
                                   a_texcoord ).a;

    vec4 displaced_position = a_position +
                              vec4 ( a_normal * displacement, 0.0 );

    gl_Position = u_mvpMatrix * displaced_position;
}

We hope that the examples discussed so far have provided a good understanding of vertex shaders, including how to write them and how to use them for a wide-ranging array of effects.

OpenGL ES 1.1 Vertex Pipeline as an ES 3.0 Vertex Shader

We now discuss a vertex shader that implements the OpenGL ES 1.1 fixed-function vertex pipeline without vertex skinning. This is also meant to be an interesting exercise in figuring out how big a vertex shader can be and still run across all OpenGL ES 3.0 implementations.

This vertex shader implements the following fixed functions of the OpenGL ES 1.1 vertex pipeline:

Transform the normal and position to eye space, if required (typically required for lighting). Rescaling or normalization of the normal is also performed.

Compute the OpenGL ES 1.1 vertex lighting equation for up to eight directional lights, point lights, or spotlights with two-sided lighting and color material per vertex.

Transform the texture coordinates for up to two texture coordinates per vertex.

Compute the fog factor passed to the fragment shader. The fragment shader uses the fog factor to interpolate between fog color and vertex color.

Compute the per-vertex user clip plane factor. Only one user clip plane is supported.

Transform the position to clip space.

Example 8-9 is the vertex shader that implements the OpenGL ES 1.1

fixed-function vertex pipeline as already described.

Example 8-9 OpenGL ES 1.1 Fixed-Function Vertex Pipeline

#version 300 es

//**************************************************************

// OpenGL ES 3.0 vertex shader that implements the following

// OpenGL ES 1.1 fixed-function pipeline

// - compute lighting equation for up to eight

// directional/point/spotlights

// - transform position to clip coordinates

// - texture coordinate transforms for up to two texture

// coordinates

// - compute fog factor

// - compute user clip plane dot product (stored as

// v_ucp_factor)

//**************************************************************

#define NUM_TEXTURES 2

#define GLI_FOG_MODE_LINEAR 0

#define GLI_FOG_MODE_EXP 1

#define GLI_FOG_MODE_EXP2 2

struct light

{

vec4 position; // light position for a point/spotlight or

// normalized dir. for a directional light

vec4 ambient_color;

vec4 diffuse_color;

vec4 specular_color;

vec3 spot_direction;

vec3 attenuation_factors;

float spot_exponent;

float spot_cutoff_angle;

bool compute_distance_attenuation;

};


struct material

{

vec4 ambient_color;

vec4 diffuse_color;

vec4 specular_color;

vec4 emissive_color;

float specular_exponent;

};

const float c_zero = 0.0;

const float c_one = 1.0;

const int indx_zero = 0;

const int indx_one = 1;

uniform mat4 mvp_matrix; // combined model-view +

// projection matrix

uniform mat4 modelview_matrix; // model-view matrix

uniform mat3 inv_transpose_modelview_matrix; // inverse transpose

// of the model-view matrix,

// used to transform normal

uniform mat4 tex_matrix[NUM_TEXTURES]; // texture matrices

uniform bool enable_tex[NUM_TEXTURES]; // texture enables

uniform bool enable_tex_matrix[NUM_TEXTURES]; // texture

// matrix enables

uniform material material_state;

uniform vec4 ambient_scene_color;

uniform light light_state[8];

uniform bool light_enable_state[8]; // booleans to indicate

// which of eight

// lights are enabled

uniform int num_lights; // number of lights

// enabled = sum of

// light_enable_state bools

// set to TRUE

uniform bool enable_lighting; // is lighting enabled

uniform bool light_model_two_sided; // is two-sided

// lighting enabled

uniform bool enable_color_material; // is color material

// enabled

uniform bool enable_fog; // is fog enabled

uniform float fog_density;

uniform float fog_start, fog_end;

uniform int fog_mode; // fog mode: linear, exp, or exp2


uniform bool xform_eye_p; // xform_eye_p is set if we need

// Peye for user clip plane,

// lighting, or fog

uniform bool rescale_normal; // is rescale normal enabled

uniform bool normalize_normal; // is normalize normal

// enabled

uniform float rescale_normal_factor; // rescale normal

// factor if

// glEnable(GL_RESCALE_NORMAL)

uniform vec4 ucp_eqn; // user clip plane equation;

// one user clip plane specified

uniform bool enable_ucp; // is user clip plane enabled

//******************************************************

// vertex attributes: not all of them may be passed in

//******************************************************

in vec4 a_position; // this attribute is always specified

in vec4 a_texcoord0; // available if enable_tex[0] is true

in vec4 a_texcoordl; // available if enable_tex[1] is true

in vec4 a_color; // available if !enable_lighting or

// (enable_lighting && enable_color_material)

in vec3 a_normal; // available if xform_normal is set

// (required for lighting)

//************************************************

// output variables of the vertex shader

//************************************************

out vec4 v_texcoord[NUM_TEXTURES];

out vec4 v_front_color;

out vec4 v_back_color;

out float v_fog_factor;

out float v_ucp_factor;

//************************************************

// temporary variables used by the vertex shader

//************************************************

vec4 p_eye;

vec3 n;

vec4 mat_ambient_color;

vec4 mat_diffuse_color;


vec4 lighting_equation ( int i )

{

vec4 computed_color = vec4( c_zero, c_zero, c_zero,

c_zero );

vec3 h_vec;

float ndotl, ndoth;

float att_factor;

vec3 VPpli;

att_factor = c_one;

if ( light_state[i].position.w != c_zero )

{

float spot_factor;

vec3 att_dist;

// this is a point or spotlight

// we assume "w" values for PPli and V are the same

VPpli = light_state[i].position.xyz - p_eye.xyz;

if ( light_state[i].compute_distance_attenuation )

{

// compute distance attenuation

att_dist.x = c_one;

att_dist.z = dot ( VPpli, VPpli );

att_dist.y = sqrt ( att_dist.z ) ;

att_factor = c_one / dot ( att_dist,

light_state[i] .attenuation_factors );

}

VPpli = normalize ( VPpli );

if ( light_state[i].spot_cutoff_angle < 180.0 )

{

// compute spot factor

spot_factor = dot ( -VPpli,

light_state[i].spot_direction );

if( spot_factor >= cos ( radians (

light_state[i].spot_cutoff_angle ) ) )

spot_factor = pow ( spot_factor,

light_state[i].spot_exponent );

else

spot_factor = c_zero;

att_factor *= spot_factor;

}

}

else

{

// directional light

VPpli = light_state[i].position.xyz;

}

if( att_factor > c_zero )

{

// process lighting equation --> compute the light color

computed_color += ( light_state[i].ambient_color *

mat_ambient_color );

ndotl = max( c_zero, dot( n, VPpli ) );

computed_color += ( ndotl * light_state[i].diffuse_color *

mat_diffuse_color );

h_vec = normalize( VPpli + vec3(c_zero, c_zero, c_one ) );

ndoth = dot ( n, h_vec );

if ( ndoth > c_zero )

{

computed_color += ( pow ( ndoth,

material_state.specular_exponent ) *

material_state.specular_color *

light_state[i].specular_color );

}

computed_color *= att_factor; // multiply color with

// computed attenuation

// factor

// * computed spot factor

}

return computed_color;

}

float compute_fog( )

{

float f;

// use eye Z as approximation

if ( fog_mode == GLI_FOG_MODE_LINEAR )

{

f = ( fog_end - p_eye.z ) / ( fog_end - fog_start );

}

else if ( fog_mode == GLI_FOG_MODE_EXP )

{

f = exp( - ( p_eye.z * fog_density ) );

}


else

{

f = ( p_eye.z * fog_density );

f = exp( -( f * f ) );

}

f = clamp ( f, c_zero, c_one) ;

return f;

}

vec4 do_lighting( )

{

vec4 vtx_color;

int i, j ;

vtx_color = material_state.emissive_color +

( mat_ambient_color * ambient_scene_color );

j = int( c_zero );

for ( i=int( c_zero ); i<8; i++ )

{

if ( j >= num_lights )

break;

if ( light_enable_state[i] )

{

j++;

vtx_color += lighting_equation(i);

}

}

vtx_color.a = mat_diffuse_color.a;

return vtx_color;

}

void main( void )

{

int i, j;

// do we need to transform P

if ( xform_eye_p )

p_eye = modelview_matrix * a_position;

if ( enable_lighting )

{

n = inv_transpose_modelview_matrix * a_normal;

if ( rescale_normal )

n = rescale_normal_factor * n;


if ( normalize_normal )

n = normalize(n);

mat_ambient_color = enable_color_material ? a_color

: material_state.ambient_color;

mat_diffuse_color = enable_color_material ? a_color

: material_state.diffuse_color;

v_front_color = do_lighting( );

v_back_color = v_front_color;

// do two-sided lighting

if ( light_model_two_sided )

{

n = -n;

v_back_color = do_lighting( );

}

}

else

{

// set the default output color to be the per-vertex /

// per-primitive color

v_front_color = a_color;

v_back_color = a_color;

}

// do texture transforms

v_texcoord[indx_zero] = vec4( c_zero, c_zero, c_zero,

c_one );

if ( enable_tex[indx_zero] )

{

if ( enable_tex_matrix[indx_zero] )

v_texcoord[indx_zero] = tex_matrix[indx_zero] *

a_texcoord0;

else

v_texcoord[indx_zero] = a_texcoord0;

}

v_texcoord[indx_one] = vec4( c_zero, c_zero, c_zero, c_one );

if ( enable_tex[indx_one] )

{

if ( enable_tex_matrix[indx_one] )

v_texcoord[indx_one] = tex_matrix[indx_one] *

a_texcoordl;

else

v_texcoord[indx_one] = a_texcoordl;

}


v_ucp_factor = enable_ucp ? dot ( p_eye, ucp_eqn ) : c_zero;

v_fog_factor = enable_fog ? compute_fog( ) : c_one;

gl_Position = mvp_matrix * a_position;

}
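The distance attenuation computed in lighting_equation in Example 8-9 forms the vector (1, d, d²) and takes its dot product with attenuation_factors. The same arithmetic can be sketched in C for checking values on the CPU. The function name `distance_attenuation` is hypothetical, and the distance d is passed in directly; the shader derives it from the light vector.

```c
#include <assert.h>

/* k[0], k[1], k[2] are the constant, linear, and quadratic factors,
 * matching the x, y, z components of attenuation_factors in the shader.
 * Returns 1 / (k0 + k1 * d + k2 * d * d). */
float distance_attenuation(float distance, const float k[3])
{
    float denom = k[0] + k[1] * distance + k[2] * distance * distance;
    assert(denom > 0.0f);   /* factors must not cancel to zero */
    return 1.0f / denom;
}
```

With factors (1, 0, 0.25) and a light two units away, the denominator is 2 and the light contributes half of its unattenuated color. With factors (1, 0, 0) the attenuation is 1 at every distance.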

Summary

In this chapter, we provided a high-level overview of how vertex shaders fit into the pipeline and how to perform transformation, lighting, skinning, and displacement mapping in a vertex shader through some vertex shader examples. In addition, you learned how to use the transform feedback mode to capture the vertex outputs into buffer objects and how to implement the fixed-function pipeline using vertex shaders. Next, before we discuss fragment shaders, we cover the texturing functionality in OpenGL ES 3.0.


Chapter 9

Texturing

Now that we have covered vertex shaders in detail, you should be familiar

with all of the gritty details of transforming vertices and preparing

primitives for rendering. The next step in the pipeline is the fragment

shader, where much of the visual magic of OpenGL ES 3.0 occurs.

Acentral aspect of fragment shaders is the application of textures to

surfaces. This chapter covers all the details of creating, loading, and

applying textures:

Texturing basics

Loading textures and mipmapping

Texture ltering and wrapping

Texture level-of-detail, swizzles, and depth comparison

Texture formats

Using textures in the fragment shader

Texture subimage specification

Copying texture data from the framebuffer

Compressed textures

Sampler objects

Immutable textures

Pixel unpack buffer objects


Texturing Basics

One of the most fundamental operations used in rendering 3D graphics

is the application of textures to a surface. Textures allow for the

representation of additional detail not available just from the geometry of

a mesh. Textures in OpenGL ES 3.0 come in several forms: 2D textures, 2D

texture arrays, 3D textures, and cubemap textures.

Textures are typically applied to a surface by using texture coordinates,

which can be thought of as indices into texture array data. The following

sections introduce the different texture types in OpenGL ES and explain

how they are loaded and accessed.

2D Textures

A 2D texture is the most basic and common form of texture in OpenGL

ES. A 2D texture is—as you might guess—a two-dimensional array of

image data. The individual data elements of a texture are known as texels

(short for “texture pixels”). Texture image data in OpenGL ES can be

represented in many different basic formats. The basic formats available

for texture data are shown in Table 9-1.

Each texel in the image is specified according to both its basic format

and its data type. Later, we describe in more detail the various data types

that can represent a texel. For now, the important point to understand

is that a 2D texture is a two-dimensional array of image data. When

rendering with a 2D texture, a texture coordinate is used as an index into

the texture image. Generally, a mesh will be authored in a 3D content

authoring program, with each vertex having a texture coordinate.

Texture coordinates for 2D textures are given by a 2D pair of coordinates

(s, t), sometimes also called (u, v) coordinates. These coordinates

represent normalized coordinates used to look up a texture map, as

shown in Figure9-1.

The lower-left corner of the texture image is specied by the st-coordinates

(0.0, 0.0). The upper-right corner of the texture image is specied by

the st-coordinates (1.0, 1.0). Coordinates outside of the range [0.0, 1.0]

are allowed, and the behavior of texture fetches outside of that range is

dened by the texture wrapping mode (described in the section on texture

ltering and wrapping).
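As a rough illustration of how a normalized coordinate indexes the texel array, the C sketch below maps (s, t) to an offset into a tightly packed image. It is a simplification under stated assumptions: `texel_index` and `texel_offset` are hypothetical names, the lookup picks a single nearest texel, coordinates outside [0.0, 1.0] are clamped to the edge (only one of the possible wrapping behaviors), and row 0 is taken to be the bottom row of the image.

```c
#include <assert.h>

/* Map a normalized coordinate to a texel index in [0, size - 1],
 * clamping coordinates that fall outside [0.0, 1.0]. */
int texel_index(float coord, int size)
{
    int i;
    assert(size > 0);
    if (coord <= 0.0f)
        return 0;
    if (coord >= 1.0f)
        return size - 1;
    i = (int)(coord * (float)size);   /* truncation equals floor here */
    return i < size ? i : size - 1;   /* guard against rounding up */
}

/* Offset of the selected texel in a width x height image whose first
 * row is the bottom row, so (0.0, 0.0) is the lower-left corner. */
int texel_offset(float s, float t, int width, int height)
{
    return texel_index(t, height) * width + texel_index(s, width);
}
```

In a 4 × 4 image, the coordinate (0.5, 0.25) selects column 2 of row 1, which is offset 6 in the packed array.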


Table 9-1 Texture Base Formats

Base Format           Texel Data Description

GL_RED                (Red)
GL_RG                 (Red, Green)
GL_RGB                (Red, Green, Blue)
GL_RGBA               (Red, Green, Blue, Alpha)
GL_LUMINANCE          (Luminance)
GL_LUMINANCE_ALPHA    (Luminance, Alpha)
GL_ALPHA              (Alpha)
GL_DEPTH_COMPONENT    (Depth)
GL_DEPTH_STENCIL      (Depth, Stencil)
GL_RED_INTEGER        (iRed)
GL_RG_INTEGER         (iRed, iGreen)
GL_RGB_INTEGER        (iRed, iGreen, iBlue)
GL_RGBA_INTEGER       (iRed, iGreen, iBlue, iAlpha)

Figure 9-1 2D Texture Coordinates

[Figure: a texture image with its corners labeled (0.0, 0.0) at lower left, (1.0, 0.0) at lower right, (0.0, 1.0) at upper left, and (1.0, 1.0) at upper right]

Cubemap Textures

In addition to 2D textures, OpenGL ES 3.0 supports cubemap textures.

At its most basic, a cubemap is a texture made up of six individual 2D

texture faces. Each face of the cubemap represents one of the six sides

of a cube. Although cubemaps have a variety of advanced uses in 3D

rendering, the most common use is for an effect known as environment

mapping. For this effect, the reflection of the environment onto the object

is rendered by using a cubemap to represent the environment. Typically,

a cubemap is generated for environment mapping by placing a camera

in the center of the scene and capturing an image of the scene from each

of the six axis directions (+X, –X, +Y, –Y, +Z, –Z) and storing the result in

each cube face.

Texels are fetched out of a cubemap by using a 3D vector (s, t, r) as the

texture coordinate to look up into the cubemap. The texture coordinates

(s, t, r) represent the (x, y, z) components of the 3D vector. The 3D vector

is used to rst select a face of the cubemap to fetch from, and then the

coordinate is projected into a 2D (s, t) coordinate to fetch from the

cubemap face. The actual math for computing the 2D (s, t) coordinate

is outside our scope here, but suffice it to say that a 3D vector is used to

look up into a cubemap. You can visualize the way this process works by

picturing a 3D vector coming from the origin inside of a cube. The point

at which that vector intersects the cube is the texel that would be fetched

from the cubemap. This concept is illustrated in Figure 9-2, where a 3D

vector intersects the cube face.
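The face-selection step can be sketched in C: the face is the one whose axis carries the largest-magnitude component of the vector. This sketch covers only the choice of face, not the projection to a 2D (s, t) coordinate. `CubeFace`, `magnitude`, and `select_cube_face` are hypothetical names, and the tie-breaking order (X, then Y, then Z) is an assumption made for the example.

```c
#include <assert.h>

enum CubeFace {
    FACE_POSITIVE_X, FACE_NEGATIVE_X,
    FACE_POSITIVE_Y, FACE_NEGATIVE_Y,
    FACE_POSITIVE_Z, FACE_NEGATIVE_Z
};

/* Absolute value without pulling in the math library. */
float magnitude(float v)
{
    return v < 0.0f ? -v : v;
}

/* Choose the cube face hit by a vector from the center of the cube:
 * the axis with the largest magnitude wins, and its sign picks the side.
 * Ties are resolved in the order X, Y, Z (an assumption of this sketch). */
enum CubeFace select_cube_face(float x, float y, float z)
{
    float ax = magnitude(x), ay = magnitude(y), az = magnitude(z);

    if (ax >= ay && ax >= az)
        return x >= 0.0f ? FACE_POSITIVE_X : FACE_NEGATIVE_X;
    if (ay >= az)
        return y >= 0.0f ? FACE_POSITIVE_Y : FACE_NEGATIVE_Y;
    return z >= 0.0f ? FACE_POSITIVE_Z : FACE_NEGATIVE_Z;
}
```

A vector such as (0.1, -2.0, 0.5) points mostly down the negative Y axis, so it selects the –Y face.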

Figure 9-2 3D Texture Coordinate for Cubemap


Figure 9-3 3D Texture

The faces of a cubemap are each specified in the same manner as one

would specify a 2D texture. Each of the faces must be square (i.e., the

width and height must be equal), and each must have the same width

and height. The 3D vector that is used for the texture coordinate is not

normally stored directly on a per-vertex basis on the mesh as it is for

2D texturing. Instead, cubemaps are usually fetched from by using the

normal vector as a basis for computing the cubemap texture coordinate.

Typically, the normal vector is used along with a vector from the eye to

compute a reection vector that is then used to look up into a cubemap.

This computation is described in the environment mapping example in

Chapter 14, “Advanced Programming with OpenGL ES 3.0.”

3D Textures

Another type of texture in OpenGL ES 3.0 is the 3D texture (or volume

texture). 3D textures can be thought of as an array of multiple slices of

2D textures. A 3D texture is accessed with a three-tuple (s, t, r) coordinate,

much like a cubemap. For 3D textures, the r-coordinate selects which slice

of the 3D texture to sample from and the (s, t) coordinate is used to fetch

into the 2D map at each slice. Figure 9-3 shows a 3D texture where each

slice is made up of an individual 2D texture. Each mipmap level in a 3D

texture contains half the number of slices in the texture above it (more on

this later).


2D Texture Arrays

The nal type of texture in OpenGL ES 3.0 is a 2D texture array.

The 2D texture array is very similar to a 3D texture, but is used for a

different purpose. For example, 2D texture arrays are often used to

store an animation of a 2D image. Each slice of the array represents

one frame of the texture animation. The difference between 2D texture

arrays and 3D textures is subtle but important. For a 3D texture,

filtering occurs between slices, whereas fetching from a 2D texture array

will sample from only an individual slice. As such, mipmapping is

also different. Each mipmap level in a 2D texture array contains the

same number of slices as the level above it. Each 2D slice is entirely

mipmapped independently from any other slices (unlike the case with

a 3D texture, for which each mipmap level has half as many slices as

above it).
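The difference in mipmapping can be made concrete by counting slices per mipmap level. In this C sketch, `mip_extent`, `slices_3d`, and `layers_2d_array` are hypothetical helper names; the halving rule with a floor of one slice is the usual mipmap size rule and is assumed here.

```c
#include <assert.h>

/* Size of one dimension at a given mipmap level: halve per level,
 * rounding down, but never below 1. */
int mip_extent(int base, int level)
{
    int e;
    assert(base > 0 && level >= 0 && level < 31);
    e = base >> level;
    return e > 0 ? e : 1;
}

/* A 3D texture halves its slice count at each mipmap level. */
int slices_3d(int depth, int level)
{
    return mip_extent(depth, level);
}

/* A 2D texture array keeps every slice at every mipmap level. */
int layers_2d_array(int layers, int level)
{
    assert(layers > 0 && level >= 0);
    return layers;
}
```

An eight-slice 3D texture has four slices at level 1 and a single slice from level 3 onward, whereas an eight-slice 2D texture array still has eight slices at level 3.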

To address a 2D texture array, three texture coordinates (s, t, r) are used

just like with a 3D texture. The r-coordinate selects which slice in the 2D

texture array to use and the (s, t) coordinates are used on the selected slice

in exactly the same way as a 2D texture.

Texture Objects and Loading Textures

The rst step in the application of textures is to create a texture object.

Atexture object is a container object that holds the texture data needed

for rendering, such as image data, ltering modes, and wrap modes. In

OpenGL ES, a texture object is represented by an unsigned integer that

is a handle to the texture object. The function that is used for generating

texture objects is glGenTextures.

void glGenTextures(GLsizei n, GLuint *textures)

n         specifies the number of texture objects to generate
textures  an array of unsigned integers that will hold n texture object IDs

At the point of creation, the texture object(s) generated by
glGenTextures are empty containers that will be used for loading
texture data and parameters. Texture objects also need to be deleted when
an application no longer needs them. This step is typically done either at
application shutdown or, for example, when changing levels in a game.
It can be accomplished by using glDeleteTextures.

void glDeleteTextures(GLsizei n, const GLuint *textures)

n         specifies the number of texture objects to delete
textures  an array of unsigned integers that hold n texture object IDs
          to delete

Once texture object IDs have been generated with glGenTextures, the
application must bind the texture object to operate on it. Once texture
objects are bound, subsequent operations such as glTexImage2D and
glTexParameter affect the bound texture object. The function used to
bind texture objects is glBindTexture.

void glBindTexture(GLenum target, GLuint texture)

target    bind the texture object to target GL_TEXTURE_2D,
          GL_TEXTURE_3D, GL_TEXTURE_2D_ARRAY, or
          GL_TEXTURE_CUBE_MAP
texture   the handle to the texture object to bind

Once a texture is bound to a particular texture target, that texture object
will remain bound to its target until another texture object is bound to
that target or the bound object is deleted. After generating a
texture object and binding it, the next step in using a texture is to actually
load the image data. The basic function that is used for loading 2D and
cubemap textures is glTexImage2D. In addition, several alternative
methods may be used to specify 2D textures in OpenGL ES 3.0, including
using immutable textures (glTexStorage2D) in conjunction with
glTexSubImage2D. We start first with the most basic method,
glTexImage2D, and describe immutable textures later in the chapter. For
best performance, we recommend using immutable textures.

void glTexImage2D(GLenum target, GLint level,
                  GLenum internalFormat, GLsizei width,
                  GLsizei height, GLint border,
                  GLenum format, GLenum type,
                  const void* pixels)

target    specifies the texture target, either GL_TEXTURE_2D or
          one of the cubemap face targets
          (GL_TEXTURE_CUBE_MAP_POSITIVE_X,
          GL_TEXTURE_CUBE_MAP_NEGATIVE_X, and so on).


level     specifies which mip level to load. The first level is
          specified by 0, followed by an increasing level for each
          successive mipmap.

internalFormat  the internal format for the texture storage; can be
          either an unsized base internal format or a sized
          internal format. The full list of valid internalFormat,
          format, and type combinations is provided in
          Tables 9-4 through 9-10.

The unsized internal formats can be

GL_RGBA, GL_RGB, GL_LUMINANCE_ALPHA

GL_LUMINANCE, GL_ALPHA

The sized internal formats can be

GL_R8, GL_R8_SNORM, GL_R16F, GL_R32F

GL_R8UI, GL_R16UI, GL_R32UI, GL_R32I

GL_RG8, GL_RG8_SNORM, GL_RG16F, GL_RG32F

GL_RG8UI, GL_RG8I, GL_RG16UI, GL_RG32UI

GL_RG32I, GL_RGB8, GL_SRGB8, GL_RGB565

GL_RGB8_SNORM, GL_R11F_G11F_B10F

GL_RGB9_E5, GL_RGB16F, GL_RGB32F

GL_RGB8UI, GL_RGB16UI, GL_RGB16I, GL_RGB32UI

GL_RGB32I, GL_RGBA8, GL_SRGB8_ALPHA8

GL_RGBA8_SNORM, GL_RGB5_A1, GL_RGBA4

GL_RGB10_A2, GL_RGBA16F, GL_RGBA32F

GL_RGBA8UI, GL_RGBA8I, GL_RGB10_A2UI

GL_RGBA16UI, GL_RGBA16I, GL_RGBA32I

GL_RGBA32UI, GL_DEPTH_COMPONENT16

GL_DEPTH_COMPONENT24, GL_DEPTH_COMPONENT32F

GL_DEPTH24_STENCIL8, GL_DEPTH32F_STENCIL8

width the width of the image in pixels.

height the height of the image in pixels.

border this parameter is ignored in OpenGL ES, but was kept

for compatibility with the desktop OpenGL interface;

should be 0.

format    the format of the incoming texture data; can be
          GL_RED, GL_RED_INTEGER, GL_RG, GL_RG_INTEGER,
          GL_RGB, GL_RGB_INTEGER, GL_RGBA, GL_RGBA_INTEGER,
          GL_DEPTH_COMPONENT, GL_DEPTH_STENCIL,
          GL_LUMINANCE_ALPHA, GL_LUMINANCE, GL_ALPHA

type      the type of the incoming pixel data; can be
          GL_UNSIGNED_BYTE, GL_BYTE, GL_UNSIGNED_SHORT, GL_SHORT,
          GL_UNSIGNED_INT, GL_INT, GL_HALF_FLOAT, GL_FLOAT,
          GL_UNSIGNED_SHORT_5_6_5, GL_UNSIGNED_SHORT_4_4_4_4,
          GL_UNSIGNED_SHORT_5_5_5_1,
          GL_UNSIGNED_INT_2_10_10_10_REV,
          GL_UNSIGNED_INT_10F_11F_11F_REV,
          GL_UNSIGNED_INT_5_9_9_9_REV, GL_UNSIGNED_INT_24_8,
          GL_FLOAT_32_UNSIGNED_INT_24_8_REV

pixels    contains the actual pixel data for the image. The
          data must contain (width*height) number of
          pixels with the appropriate number of bytes per
          pixel based on the format and type specification.
          The pixel rows must be aligned to the
          GL_UNPACK_ALIGNMENT set with
          glPixelStorei (defined next).

Example 9-1, from the Simple_Texture2D example, demonstrates
generating a texture object, binding it, and then loading a 2 × 2 2D
texture with RGB image data made from unsigned bytes.


In the first part of the code, the pixels array is initialized with simple 2 × 2
texture data. The data is composed of unsigned byte RGB triplets that are in
the range [0, 255]. When data is fetched from an 8-bit unsigned byte texture
component in the shader, the values are mapped from the range [0, 255] to
the floating-point range [0.0, 1.0]. Typically, an application would not create
texture data in this simple manner, but rather would load the data from an
image file. This example is provided to demonstrate the use of the API.

Prior to calling glTexImage2D, the application makes a call to

glPixelStorei to set the unpack alignment. When texture data is

uploaded via glTexImage2D, the rows of pixels are assumed to be aligned

to the value set for GL_UNPACK_ALIGNMENT. By default, this value is 4,

meaning that rows of pixels are assumed to begin on 4-byte boundaries.
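To see why the alignment matters for the 2 × 2 RGB image in Example 9-1, it helps to compute how many bytes the implementation assumes each source row occupies. The sketch below is illustrative only: `unpack_row_stride` is a hypothetical helper (not an OpenGL ES function), and it assumes that GL_UNPACK_ROW_LENGTH is left at its default of 0, so each row is simply the image width rounded up to the alignment.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes from the start of one source row to the start of the next,
 * given the GL_UNPACK_ALIGNMENT value (1, 2, 4, or 8). */
size_t unpack_row_stride(size_t width, size_t bytes_per_pixel,
                         size_t alignment)
{
    assert(alignment == 1 || alignment == 2 ||
           alignment == 4 || alignment == 8);

    size_t row_bytes = width * bytes_per_pixel;

    /* Round row_bytes up to the next multiple of the alignment. */
    return (row_bytes + alignment - 1) / alignment * alignment;
}
```

A row of the 2 × 2 RGB image holds 6 bytes. With the default alignment of 4, the second row would be expected at byte offset 8, which does not match the tightly packed array in Example 9-1; with an alignment of 1, the stride is 6 and the rows are read where they actually are.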

Example 9-1 Generating a Texture Object, Binding It, and Loading Image Data

// Texture object handle

GLuint textureId;

// 2 x 2 Image, 3 bytes per pixel (R, G, B)

GLubyte pixels[4 * 3] =

{

255, 0, 0, // Red

0, 255, 0, // Green

0, 0, 255, // Blue

255, 255, 0 // Yellow

};

// Use tightly packed data

glPixelStorei(GL_UNPACK_ALIGNMENT, 1);

// Generate a texture object

glGenTextures(1, &textureId);

// Bind the texture object

glBindTexture(GL_TEXTURE_2D, textureId);

// Load the texture

glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, 2, 2, 0, GL_RGB,

GL_UNSIGNED_BYTE, pixels);

// Set the filtering mode

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER,

GL_NEAREST);

glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER,

GL_NEAREST);


This application sets the unpack alignment to 1, meaning that each row
of pixels begins on a byte boundary (in other words, the data is tightly
packed). The full definition for glPixelStorei is given next.

void glPixelStorei(GLenum pname, GLint param)

pname specifies the pixel storage type to set. The following options

impact how data is unpacked from memory when calling

glTexImage2D, glTexImage3D, glTexSubImage2D, and

glTexSubImage3D:

GL_UNPACK_ROW_LENGTH, GL_UNPACK_IMAGE_HEIGHT,

GL_UNPACK_SKIP_PIXELS, GL_UNPACK_SKIP_ROWS,

GL_UNPACK_SKIP_IMAGES, GL_UNPACK_ALIGNMENT

The following options impact how data is packed into memory

when calling glReadPixels:

GL_PACK_ROW_LENGTH, GL_PACK_IMAGE_HEIGHT,

GL_PACK_SKIP_PIXELS, GL_PACK_SKIP_ROWS,

GL_PACK_SKIP_IMAGES, GL_PACK_ALIGNMENT

All of these options are described in Table 9-2.

param specifies the integer value for the pack or unpack option.

The GL_PACK_xxxxx arguments to glPixelStorei do not have any impact

on texture image uploading. The pack options are used by glReadPixels,

which is described in Chapter 11, “Fragment Operations.” The pack

and unpack options set by glPixelStorei are global state and are not

stored or associated with a texture object. In practice, it is rare to use any

options other than GL_UNPACK_ALIGNMENT for specifying textures. For

completeness, the full list of pixel storage options is provided in Table 9-2.

Returning to the program in Example 9-1, after defining the image data,
a texture object is generated using glGenTextures and then that object
is bound to the GL_TEXTURE_2D target using glBindTexture. Finally, the
image data is loaded into the texture object using glTexImage2D. The
format is set as GL_RGB, which signifies that the image data is composed
of (R, G, B) triplets. The type is set as GL_UNSIGNED_BYTE, which signifies
that each channel of the data is stored in an 8-bit unsigned byte. There
are a number of other options for loading texture data, including the
different formats described in Table 9-1. All of the texture formats are
described later in this chapter in the Texture Formats section.


The last part of the code uses glTexParameteri to set the minification
and magnification filtering modes to GL_NEAREST. This code is required
because we have not loaded a complete mipmap chain for the texture and
the default minification filter, GL_NEAREST_MIPMAP_LINEAR, requires one;
thus we must select a non-mipmapped minification filter. The other
option would have been to use minification and magnification modes of
GL_LINEAR, which provides bilinear non-mipmapped filtering. The details
of texture filtering and mipmapping are explained in the next section.

Table 9-2 Pixel Storage Options

Pixel Storage Option     Initial Value  Description

GL_UNPACK_ALIGNMENT      4              Specifies the alignment of rows in an
GL_PACK_ALIGNMENT                       image. By default, each row begins at
                                        a 4-byte boundary. Setting the value
                                        to 1 means that the image is tightly
                                        packed and rows are aligned to a byte
                                        boundary.

GL_UNPACK_ROW_LENGTH     0              If the value is non-zero, gives the
GL_PACK_ROW_LENGTH                      number of pixels in a row of the
                                        image. If the value is zero, then the
                                        row length is the width of the image
                                        (i.e., it is tightly packed).

GL_UNPACK_IMAGE_HEIGHT   0              If the value is non-zero, gives the
GL_PACK_IMAGE_HEIGHT                    number of pixels in a column (that is,
                                        the number of rows) of an image that
                                        is part of a 3D texture. This option
                                        can be used to have padding rows in
                                        between slices of a 3D texture. If the
                                        value is zero, then the number of rows
                                        in the image is equal to the height
                                        (i.e., it is tightly packed).

GL_UNPACK_SKIP_PIXELS    0              If the value is non-zero, gives the
GL_PACK_SKIP_PIXELS                     number of pixels to skip at the
                                        beginning of a row.

GL_UNPACK_SKIP_ROWS      0              If the value is non-zero, gives the
GL_PACK_SKIP_ROWS                       number of rows to skip at the
                                        beginning of the image.

GL_UNPACK_SKIP_IMAGES    0              If the value is non-zero, gives the
GL_PACK_SKIP_IMAGES                     number of images in a 3D texture
                                        to skip.
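As an illustration of how the row length and skip options combine, the following sketch computes the byte offset of the first pixel that would be read when uploading a sub-rectangle out of a larger image held in client memory. The `UnpackState` struct and `unpack_start_offset` function are hypothetical names used only for this example; the arithmetic assumes a 2D upload (so the image-height and skip-images options are not involved) and a fixed number of bytes per pixel.

```c
#include <assert.h>
#include <stddef.h>

/* A local mirror of the 2D-relevant unpack options from Table 9-2. */
typedef struct {
    int alignment;    /* GL_UNPACK_ALIGNMENT: 1, 2, 4, or 8          */
    int row_length;   /* GL_UNPACK_ROW_LENGTH: 0 = use upload width  */
    int skip_pixels;  /* GL_UNPACK_SKIP_PIXELS                       */
    int skip_rows;    /* GL_UNPACK_SKIP_ROWS                         */
} UnpackState;

/* Byte offset, from the pointer passed as `pixels`, of the first pixel
 * that is actually read for an upload of `upload_width` pixels per row. */
size_t unpack_start_offset(const UnpackState *st, int upload_width,
                           int bytes_per_pixel)
{
    assert(st->alignment == 1 || st->alignment == 2 ||
           st->alignment == 4 || st->alignment == 8);
    assert(st->row_length >= 0 && st->skip_pixels >= 0 &&
           st->skip_rows >= 0);

    size_t pixels_per_row =
        (size_t)(st->row_length > 0 ? st->row_length : upload_width);
    size_t a = (size_t)st->alignment;
    size_t row_bytes = pixels_per_row * (size_t)bytes_per_pixel;

    /* Each source row starts on a multiple of the alignment. */
    size_t stride = (row_bytes + a - 1) / a * a;

    return (size_t)st->skip_rows * stride +
           (size_t)st->skip_pixels * (size_t)bytes_per_pixel;
}
```

For example, to upload a 4 × 4 region starting at pixel (5, 3) of a 16-pixel-wide RGBA image, an application could set the row length to 16, skip pixels to 5, and skip rows to 3; the first byte read is then 3 × 64 + 5 × 4 = 212 bytes past the start of the image.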


Texture Filtering and Mipmapping

So far, we have limited our explanation of 2D textures to single 2D images.
Although this allowed us to explain the concept of texturing, there is
actually a bit more to how textures are specified and used in OpenGL ES.
This complexity relates to the visual artifacts and performance issues that
occur due to using a single texture map. As we have described texturing so
far, the texture coordinate is used to generate a 2D index to fetch from the
texture map. When the minification and magnification filters are set to
GL_NEAREST, this is exactly what will happen: A single texel will be fetched at
the texture coordinate location provided. This is known as point or nearest
sampling.

However, nearest sampling might produce significant visual artifacts.
The artifacts occur because as a triangle becomes smaller in screen space,
the texture coordinates take large jumps when being interpolated from
pixel to pixel. As a result, a small number of samples are taken from a
large texture map, resulting in aliasing artifacts and a potentially large
performance penalty. The solution that is used to resolve this type
of artifact in OpenGL ES is known as mipmapping. The idea behind
mipmapping is to build a chain of images known as a mipmap chain.
The mipmap chain begins with the originally specified image and
then continues with each subsequent image being half as large in each
dimension as the one before it. This chain continues until we reach a
single 1 × 1 texture at the bottom of the chain. The mip levels can be
generated programmatically, typically by computing each pixel in a mip
level as an average of the four pixels at the same location in the mip level
above it (box filtering).
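The length of a complete mipmap chain follows directly from this halving rule. As a small sketch (the `mip_level_count` helper is a hypothetical name, and it assumes each dimension is halved with rounding down and clamped at 1 until both reach 1):

```c
#include <assert.h>

/* Number of levels in a complete mipmap chain, counting level 0. */
int mip_level_count(int width, int height)
{
    assert(width >= 1 && height >= 1);

    int levels = 1;
    while (width > 1 || height > 1) {
        if (width > 1)  width /= 2;
        if (height > 1) height /= 2;
        levels++;
    }
    return levels;
}
```

A 256 × 256 texture therefore has nine levels (256, 128, 64, 32, 16, 8, 4, 2, 1), and a non-square 8 × 2 texture has four (8 × 2, 4 × 1, 2 × 1, 1 × 1).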

In the Chapter_9/MipMap2D sample program, we provide an example
demonstrating how to generate a mipmap chain for a texture using a
box filtering technique. The code to generate the mipmap chain is given
by the GenMipMap2D function. This function takes an RGB8 image as
input and generates the next mipmap level by performing a box filter on
the preceding image. See the source code in the example for details on
how the box filtering is done. The mipmap chain is then loaded using
glTexImage2D, as shown in Example 9-2.
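For readers without the sample source at hand, a box filter of this kind can be sketched as follows. This is not the book's GenMipMap2D implementation: `box_downsample_rgb8` is a hypothetical function written for illustration, and it assumes a tightly packed RGB8 image, a rounded average, and clamping of the 2 × 2 footprint at the image edge so that 1-pixel-wide or 1-pixel-high inputs still work.

```c
#include <assert.h>
#include <stdlib.h>

/* Produce the next mipmap level of a tightly packed RGB8 image by
 * averaging each 2x2 block of source texels. Returns a malloc'd image
 * (the caller frees it) and writes its size to dst_w and dst_h, or
 * returns NULL if allocation fails. */
unsigned char *box_downsample_rgb8(const unsigned char *src,
                                   int src_w, int src_h,
                                   int *dst_w, int *dst_h)
{
    assert(src != NULL && src_w >= 1 && src_h >= 1);

    int w = src_w > 1 ? src_w / 2 : 1;
    int h = src_h > 1 ? src_h / 2 : 1;

    unsigned char *dst = malloc((size_t)w * (size_t)h * 3);
    if (dst == NULL)
        return NULL;

    for (int y = 0; y < h; y++) {
        for (int x = 0; x < w; x++) {
            /* Corners of the 2x2 source block, clamped to the image. */
            int x0 = 2 * x, y0 = 2 * y;
            int x1 = (x0 + 1 < src_w) ? x0 + 1 : x0;
            int y1 = (y0 + 1 < src_h) ? y0 + 1 : y0;

            for (int c = 0; c < 3; c++) {
                int sum = src[(y0 * src_w + x0) * 3 + c]
                        + src[(y0 * src_w + x1) * 3 + c]
                        + src[(y1 * src_w + x0) * 3 + c]
                        + src[(y1 * src_w + x1) * 3 + c];
                /* Average of the four texels, rounded to nearest. */
                dst[(y * w + x) * 3 + c] = (unsigned char)((sum + 2) / 4);
            }
        }
    }

    *dst_w = w;
    *dst_h = h;
    return dst;
}
```

Applied to the 2 × 2 red/green/blue/yellow image from Example 9-1, this yields a single 1 × 1 texel with the value (128, 128, 64).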

With a mipmap chain loaded, we can then set up the filtering mode to
use mipmaps. The result is that we achieve a better ratio between screen
pixels and texture pixels, thereby reducing aliasing artifacts. Aliasing is
also reduced because each image in the mipmap chain is successively
filtered so that high-frequency elements are attenuated more and more as
we move down the chain.


Two types of filtering occur when texturing: minification and magnification.
Minification is what happens when the size of the projected polygon on
the screen is smaller than the size of the texture. Magnification is what
happens when the size of the projected polygon on screen is larger than the
size of the texture. The determination of which filter type to use is handled
automatically by the hardware, but the API provides control over which type
of filtering to use in each case. For magnification, mipmapping is not relevant,
because we will always be sampling from the largest level available. For
minification, a variety of sampling modes can be used. The choice of which
mode to use is based on which level of visual quality you need to achieve and
how much performance you are willing to give up for texture filtering.

Example 9-2 Loading a 2D Mipmap Chain

// Load mipmap level 0
glTexImage2D(GL_TEXTURE_2D, 0, GL_RGB, width, height,
             0, GL_RGB, GL_UNSIGNED_BYTE, pixels);

level = 1;
prevImage = &pixels[0];

while(width > 1 && height > 1)
{
   int newWidth,
       newHeight;

   // Generate the next mipmap level
   GenMipMap2D(prevImage, &newImage, width, height, &newWidth,
               &newHeight);

   // Load the mipmap level
   glTexImage2D(GL_TEXTURE_2D, level, GL_RGB,
                newWidth, newHeight, 0, GL_RGB,
                GL_UNSIGNED_BYTE, newImage);

   // Free the previous image
   free(prevImage);

   // Set the previous image for the next iteration
   prevImage = newImage;
   level++;

   // Halve the width and height
   width = newWidth;
   height = newHeight;
}

// Free the last image generated by the loop
free(newImage);


The filtering modes are specified (along with many other texture
options) with glTexParameter[i|f][v]. The texture filtering modes are
described next, and the remaining options are described in subsequent
sections.

void glTexParameteri( GLenum target, GLenum pname,

GLint param)

void glTexParameteriv( GLenum target, GLenum pname,

const GLint *params)

void glTexParameterf( GLenum target, GLenum pname,

GLfloat param)

void glTexParameterfv( GLenum target, GLenum pname,

const GLfloat *params)

target the texture target can be GL_TEXTURE_2D, GL_TEXTURE_3D,

GL_TEXTURE_2D_ARRAY, or GL_TEXTURE_CUBE_MAP

pname the parameter to set; one of

GL_TEXTURE_BASE_LEVEL

GL_TEXTURE_COMPARE_FUNC

GL_TEXTURE_COMPARE_MODE

GL_TEXTURE_MIN_FILTER

GL_TEXTURE_MAG_FILTER

GL_TEXTURE_MIN_LOD

GL_TEXTURE_MAX_LOD

GL_TEXTURE_MAX_LEVEL

GL_TEXTURE_SWIZZLE_R

GL_TEXTURE_SWIZZLE_G

GL_TEXTURE_SWIZZLE_B

GL_TEXTURE_SWIZZLE_A

GL_TEXTURE_WRAP_S

GL_TEXTURE_WRAP_T

GL_TEXTURE_WRAP_R

params the value (or array of values for the “v” entrypoints) to set the

texture parameter to

If pname is GL_TEXTURE_MAG_FILTER, then param can be

GL_NEAREST or GL_LINEAR

If pname is GL_TEXTURE_MIN_FILTER, then param can be

GL_NEAREST, GL_LINEAR, GL_NEAREST_MIPMAP_NEAREST,

GL_NEAREST_MIPMAP_LINEAR, GL_LINEAR_MIPMAP_NEAREST,

or GL_LINEAR_MIPMAP_LINEAR

If pname is GL_TEXTURE_WRAP_S, GL_TEXTURE_WRAP_R, or
GL_TEXTURE_WRAP_T, then param can be
GL_REPEAT, GL_CLAMP_TO_EDGE, or GL_MIRRORED_REPEAT

If pname is GL_TEXTURE_COMPARE_FUNC, then param can be
GL_LEQUAL, GL_GEQUAL, GL_LESS, GL_GREATER, GL_EQUAL,
GL_NOTEQUAL, GL_ALWAYS, or GL_NEVER

If pname is GL_TEXTURE_COMPARE_MODE, then param can be
GL_COMPARE_REF_TO_TEXTURE or GL_NONE

If pname is GL_TEXTURE_SWIZZLE_R, GL_TEXTURE_SWIZZLE_G,
GL_TEXTURE_SWIZZLE_B, or GL_TEXTURE_SWIZZLE_A, then
param can be
GL_RED, GL_GREEN, GL_BLUE, GL_ALPHA, GL_ZERO, or GL_ONE

The magnification filter can be either GL_NEAREST or GL_LINEAR.
In GL_NEAREST magnification filtering, a single point sample will be
taken from the texture nearest to the texture coordinate. In GL_LINEAR
magnification filtering, a bilinear sample (a weighted average of four
texels) will be taken from the texture about the texture coordinate.

The minification filter can be set to any of the following values:

GL_NEAREST: Takes a single point sample from the texture nearest to
the texture coordinate.

GL_LINEAR: Takes a bilinear sample from the texture nearest to the
texture coordinate.

GL_NEAREST_MIPMAP_NEAREST: Takes a single point sample from the
closest mip level chosen.

GL_NEAREST_MIPMAP_LINEAR: Takes a sample from the two closest
mip levels and interpolates between those samples.

GL_LINEAR_MIPMAP_NEAREST: Takes a bilinear fetch from the closest
mip level chosen.

GL_LINEAR_MIPMAP_LINEAR: Takes a bilinear fetch from each of the
two closest mip levels and then interpolates between them. This last
mode, which is typically referred to as trilinear filtering, produces the
best quality of all modes.

Note: GL_NEAREST and GL_LINEAR are the only texture minification
modes that do not require a complete mipmap chain to be specified
for the texture. All of the other modes require that a complete
mipmap chain exists for the texture.
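The difference between a point sample and a bilinear sample can be made concrete with a small CPU-side model. The sketch below is only an approximation of what the hardware does: `sample_nearest` and `sample_linear` are hypothetical functions, the texture is assumed to be a single-channel float image, texel centers are assumed to sit at half-integer positions, and coordinates outside the texture are handled as in GL_CLAMP_TO_EDGE.

```c
#include <assert.h>

static int clamp_int(int v, int lo, int hi)
{
    return v < lo ? lo : (v > hi ? hi : v);
}

/* Floor of a float as an int (a plain cast truncates toward zero). */
static int floor_int(float v)
{
    int i = (int)v;
    return (v < (float)i) ? i - 1 : i;
}

/* GL_NEAREST-style fetch: the single texel containing (s, t). */
float sample_nearest(const float *tex, int w, int h, float s, float t)
{
    assert(w >= 1 && h >= 1);
    int x = clamp_int(floor_int(s * (float)w), 0, w - 1);
    int y = clamp_int(floor_int(t * (float)h), 0, h - 1);
    return tex[y * w + x];
}

/* GL_LINEAR-style fetch: weighted average of the four texels whose
 * centers surround (s, t). */
float sample_linear(const float *tex, int w, int h, float s, float t)
{
    assert(w >= 1 && h >= 1);

    /* Shift by half a texel so that integer values land on centers. */
    float u = s * (float)w - 0.5f;
    float v = t * (float)h - 0.5f;
    int x0 = floor_int(u), y0 = floor_int(v);
    float fx = u - (float)x0, fy = v - (float)y0;

    int x1 = clamp_int(x0 + 1, 0, w - 1);
    int y1 = clamp_int(y0 + 1, 0, h - 1);
    x0 = clamp_int(x0, 0, w - 1);
    y0 = clamp_int(y0, 0, h - 1);

    float top = tex[y0 * w + x0] * (1.0f - fx) + tex[y0 * w + x1] * fx;
    float bot = tex[y1 * w + x0] * (1.0f - fx) + tex[y1 * w + x1] * fx;
    return top * (1.0f - fy) + bot * fy;
}
```

On a 2 × 1 texture holding the values 0.0 and 1.0, sampling at s = 0.5 returns 0.5 with the linear filter (halfway between the two texel centers), while the nearest filter returns one texel or the other depending on which side of the boundary the coordinate falls.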

The MipMap2D example in Figure 9-4 shows the difference between a
polygon drawn with GL_NEAREST versus GL_LINEAR_MIPMAP_LINEAR
filtering.

Figure 9-4 MipMap2D: Nearest Versus Trilinear Filtering

It is worth mentioning some performance implications for the texture
filtering mode that you choose. If minification occurs and performance
is a concern, using a mipmap filtering mode is usually the best choice
on most hardware. You tend to get very poor texture cache utilization
without mipmaps because fetches happen at sparse locations throughout
a map. However, the higher the filtering mode you use, the greater the
performance cost in the hardware. For example, on most hardware,
doing bilinear filtering is less costly than doing trilinear filtering. You
should choose a mode that gives you the quality desired without unduly
negatively impacting performance. On some hardware, you might get
high-quality filtering virtually for free, particularly if the cost of the texture
filtering is not your bottleneck. This is something that needs to be tuned for
the application and hardware on which you plan to run your application.

Seamless Cubemap Filtering

One change with respect to filtering that is new to OpenGL ES 3.0 relates
to how cubemaps are filtered. In OpenGL ES 2.0, when a linear filter
kernel fell on the edge of a cubemap border, the filtering would happen
on only a single cubemap face. This would result in artifacts at the borders
between cubemap faces. In OpenGL ES 3.0, cubemap filtering is now
seamless: if the filter kernel spans more than one cubemap face, the
kernel will fetch samples from all of the faces it covers. Seamless filtering
results in smoother filtering along cubemap face borders. In OpenGL
ES 3.0, there is nothing you need to do to enable seamless cubemap
filtering; all linear filter kernels will use it automatically.


Automatic Mipmap Generation

In the MipMap2D example in the previous section, the application

created an image for level zero of the mipmap chain. It then generated

the rest of the mipmap chain by performing a box filter on each image

and successively halving the width and height. This is one way to

generate mipmaps, but OpenGL ES 3.0 also provides a mechanism for

automatically generating mipmaps using glGenerateMipmap.

void glGenerateMipmap(GLenum target)

target the texture target to generate mipmaps for; can be

GL_TEXTURE_2D, GL_TEXTURE_3D, GL_TEXTURE_2D_ARRAY, or

GL_TEXTURE_CUBE_MAP

When calling glGenerateMipmap on a bound texture object, this function
will generate the entire mipmap chain from the contents of the image
in level zero. For a 2D texture, the contents of texture level zero will be
successively filtered and used for each of the subsequent levels. For a
cubemap, each of the cube faces will be generated from the level zero in
each cube face. Of course, to use this function with cubemaps, you must
have specified level zero for each cube face and each face must have a
matching internal format, width, and height. For a 2D texture array, each
slice of the array will be filtered as it would be for a 2D texture. Finally,
for a 3D texture, the entire volume will be mipmapped by performing
filtering across slices.

OpenGL ES 3.0 does not mandate that a particular filtering algorithm be
used for generating mipmaps (although the specification recommends
box filtering, implementations have latitude in choosing which algorithm
they use). If you require a particular filtering method, then you will still
need to generate the mipmaps on your own.

Automatic mipmap generation becomes particularly important when

you start to use framebuffer objects for rendering to a texture. When

rendering to a texture, we don’t want to have to read back the contents of

the texture to the CPU to generate mipmaps. Instead, glGenerateMipmap

can be used and the graphics hardware can then potentially generate the

mipmaps without ever having to read the data back to the CPU. When

we cover framebuffer objects in more detail in Chapter 12, “Framebuffer

Objects,” this point should become clear.


Texture Coordinate Wrapping

Texture wrap modes are used to specify the behavior that occurs when

a texture coordinate is outside of the range [0.0, 1.0]. The texture wrap

modes are set using glTexParameter[i|f][v]. Such modes can be

set independently for the s-coordinate, t-coordinate, and r-coordinate.

The GL_TEXTURE_WRAP_S mode defines what the behavior is when the

s-coordinate is outside of the range [0.0, 1.0], GL_TEXTURE_WRAP_T sets the

behavior for the t-coordinate, and GL_TEXTURE_WRAP_R sets the behavior

for the r-coordinate (the r-coordinate wrapping is used only for 3D

textures and 2D texture arrays). In OpenGL ES, there are three wrap modes

to choose from, as described in Table 9-3.

Table 9-3 Texture Wrap Modes

Texture Wrap Mode Description

GL_REPEAT Repeat the texture

GL_CLAMP_TO_EDGE Clamp fetches to the edge of the texture

GL_MIRRORED_REPEAT Repeat the texture and mirror

Figure 9-5 GL_REPEAT, GL_CLAMP_TO_EDGE, and GL_MIRRORED_REPEAT Modes

Note that the texture wrap modes also affect the behavior of filtering. For
example, when a texture coordinate is at the edge of a texture, the bilinear
filter kernel might span beyond the edge of the texture. In this case, the
wrap mode will determine which texels are fetched for the portion of the
kernel that lies outside the texture edge. You should use GL_CLAMP_TO_EDGE
whenever you do not want any form of repeating.

In Chapter_9/TextureWrap, there is an example that draws a quad

with each of the three different texture wrap modes. The quads have

a checkerboard image applied to them and are rendered with texture

coordinates in the range from [–1.0, 2.0]. The results are shown in

Figure9-5.


The three quads are rendered using the following setup code for the

texture wrap modes:

// Draw left quad with repeat wrap mode
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S, GL_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T, GL_REPEAT);
glUniform1f(userData->offsetLoc, -0.7f);
glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, indices);

// Draw middle quad with clamp to edge wrap mode
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S,
                GL_CLAMP_TO_EDGE);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T,
                GL_CLAMP_TO_EDGE);
glUniform1f(userData->offsetLoc, 0.0f);
glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, indices);

// Draw right quad with mirrored repeat
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_S,
                GL_MIRRORED_REPEAT);
glTexParameteri(GL_TEXTURE_2D, GL_TEXTURE_WRAP_T,
                GL_MIRRORED_REPEAT);
glUniform1f(userData->offsetLoc, 0.7f);
glDrawElements(GL_TRIANGLES, 6, GL_UNSIGNED_SHORT, indices);

In Figure 9-5, the quad on the far left is rendered using GL_REPEAT

mode. In this mode, the texture simply repeats outside of the range

[0,1], resulting in a tiling pattern of the image. The quad in the center

is rendered with GL_CLAMP_TO_EDGE mode. As you can see, when the

texture coordinates go outside the range [0, 1], the texture coordinates are

clamped to sample from the edge of the texture. The quad on the right is

rendered with GL_MIRRORED_REPEAT, which mirrors and then repeats the

image when the texture coordinates are outside the range [0, 1].
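The effect of each wrap mode on a single coordinate can be modeled in a few lines. This is a simplified sketch rather than the exact rule an implementation applies: `WrapMode` and `wrap_coord` are hypothetical names, the function works on normalized coordinates for a texture that is `size` texels wide, and the clamp case is assumed to limit the coordinate to the centers of the first and last texels.

```c
#include <assert.h>

typedef enum {
    WRAP_REPEAT,
    WRAP_CLAMP_TO_EDGE,
    WRAP_MIRRORED_REPEAT
} WrapMode;

/* Floor of a float, returned as a float. */
static float floor_f(float v)
{
    float i = (float)(long)v;   /* the cast truncates toward zero */
    return (v < i) ? i - 1.0f : i;
}

/* Map an arbitrary coordinate into [0, 1] according to the wrap mode. */
float wrap_coord(float s, WrapMode mode, int size)
{
    assert(size >= 1);

    switch (mode) {
    case WRAP_REPEAT:
        /* Keep only the fractional part: the image tiles. */
        return s - floor_f(s);

    case WRAP_MIRRORED_REPEAT: {
        /* Every other tile is flipped. */
        float whole = floor_f(s);
        float frac = s - whole;
        int odd = ((long)whole % 2) != 0;
        return odd ? 1.0f - frac : frac;
    }

    case WRAP_CLAMP_TO_EDGE:
    default: {
        /* Stay within the centers of the edge texels. */
        float half_texel = 0.5f / (float)size;
        if (s < half_texel)        return half_texel;
        if (s > 1.0f - half_texel) return 1.0f - half_texel;
        return s;
    }
    }
}
```

With a coordinate of 1.25, repeat yields 0.25 and mirrored repeat yields 0.75, while clamp to edge on a 4-texel-wide texture holds any coordinate beyond the edge at 0.875, the center of the last texel.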

Texture Swizzles

Texture swizzles control how color components in the input R, RG,
RGB, or RGBA texture map to components when fetched in the
shader. For example, an application might want a GL_RED texture to
map to (0, 0, 0, R) or (R, R, R, 1) as opposed to the default mapping
of (R, 0, 0, 1). The texture component that each R, G, B, and A value
maps to can be independently controlled using texture swizzles set
using glTexParameter[i|f][v]. The component to control is set by
using GL_TEXTURE_SWIZZLE_R, GL_TEXTURE_SWIZZLE_G,
GL_TEXTURE_SWIZZLE_B, or GL_TEXTURE_SWIZZLE_A. The texture value that
will be the source for that component can be either GL_RED, GL_GREEN,
GL_BLUE, or GL_ALPHA to fetch from the R, G, B, or A component, respectively.
Additionally, the application can set the value to be the constant 0 or 1
using GL_ZERO or GL_ONE, respectively.
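A swizzle is simply a per-component selection applied to the fetched texel. The following sketch models that selection on the CPU; the `Swizzle` enum, `Color` struct, and `apply_swizzle` function are hypothetical stand-ins for the GL_RED ... GL_ONE values and the four GL_TEXTURE_SWIZZLE_* parameters, not part of the API.

```c
#include <assert.h>

typedef enum {
    SWZ_RED, SWZ_GREEN, SWZ_BLUE, SWZ_ALPHA, SWZ_ZERO, SWZ_ONE
} Swizzle;

typedef struct { float r, g, b, a; } Color;

/* Value selected by one swizzle setting. */
static float swizzle_source(Color texel, Swizzle source)
{
    switch (source) {
    case SWZ_RED:   return texel.r;
    case SWZ_GREEN: return texel.g;
    case SWZ_BLUE:  return texel.b;
    case SWZ_ALPHA: return texel.a;
    case SWZ_ZERO:  return 0.0f;
    case SWZ_ONE:   return 1.0f;
    }
    assert(0 && "invalid swizzle");
    return 0.0f;
}

/* Build the color the shader sees from the stored texel and the
 * swizzle chosen for each output component. */
Color apply_swizzle(Color texel, Swizzle r, Swizzle g, Swizzle b, Swizzle a)
{
    Color out = {
        swizzle_source(texel, r),
        swizzle_source(texel, g),
        swizzle_source(texel, b),
        swizzle_source(texel, a)
    };
    return out;
}
```

A single-channel texel that would be fetched by default as (R, 0, 0, 1) becomes (R, R, R, 1) with the swizzles (red, red, red, one), and (0, 0, 0, R) with (zero, zero, zero, red).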

Texture Level of Detail

In some applications, it is useful to be able to start displaying a scene

before all of the texture mipmap levels are available. For example, a GPS

application that is downloading texture images over a data connection

might start with the lowest-level mipmaps and display the higher levels

when they become available. In OpenGL ES 3.0, this can be accomplished

by using several of the arguments to glTexParameter[i|f][v]. The

GL_TEXTURE_BASE_LEVEL sets the largest mipmap level that will be used for

a texture. By default, this has a value of 0, but it can be set to a higher value

if mipmap levels are not yet available. Likewise, GL_TEXTURE_MAX_LEVEL

sets the smallest mipmap level that will be used. By default, it has a value of

1000 (beyond the largest level any texture could have), but it can be set to

a lower number to control the smallest mipmap level to use for a texture.

To select which mipmap level to use for rendering, OpenGL ES
automatically computes a level of detail (LOD) value. This floating-point
value determines which mipmap level to filter from (and in
trilinear filtering, controls how much of each mipmap is used). An
application can also control the minimum and maximum LOD values
with GL_TEXTURE_MIN_LOD and GL_TEXTURE_MAX_LOD. One reason it
is useful to be able to control the LOD clamp separately from the base
and maximum mipmap levels is to provide smooth transitioning when
new mipmap levels become available. Setting just the texture base and
maximum level might result in a popping artifact when new mipmap
levels are available, whereas interpolating the LOD can make this
transition look smoother.
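How the LOD clamp and the base and maximum levels interact can be sketched as follows. This is a deliberately simplified model, not the exact selection rule from the specification: `MipBlend` and `select_mip_levels` are hypothetical names, the computed LOD is assumed to be given, and only the trilinear case (blend between two adjacent levels) is shown.

```c
#include <assert.h>

/* The two mip levels to sample and the weight given to the smaller one. */
typedef struct {
    int   level_lo;   /* larger (more detailed) level            */
    int   level_hi;   /* next smaller level                      */
    float weight_hi;  /* 0.0 = only level_lo, 1.0 = only level_hi */
} MipBlend;

MipBlend select_mip_levels(float lod, float min_lod, float max_lod,
                           int base_level, int max_level)
{
    assert(min_lod <= max_lod && base_level <= max_level);

    /* GL_TEXTURE_MIN_LOD / GL_TEXTURE_MAX_LOD clamp the LOD value. */
    if (lod < min_lod) lod = min_lod;
    if (lod > max_lod) lod = max_lod;
    if (lod < 0.0f)    lod = 0.0f;   /* magnified: stay at the base level */

    int   whole = (int)lod;
    float frac  = lod - (float)whole;

    /* GL_TEXTURE_BASE_LEVEL offsets the level; GL_TEXTURE_MAX_LEVEL caps it. */
    int lo = base_level + whole;
    int hi = lo + 1;
    if (lo > max_level) lo = max_level;
    if (hi > max_level) hi = max_level;

    MipBlend blend = { lo, hi, frac };
    return blend;
}
```

With the default clamps of -1000 and 1000, an LOD of 2.25 blends levels 2 and 3 with a weight of 0.25 on level 3. Raising the minimum LOD to 3.0 moves the same fetch entirely onto level 3, and an application can lower that minimum gradually over several frames to fade in a newly loaded, more detailed level.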

Depth Texture Compare (Percentage Closest Filtering)

The last texture parameters to discuss are GL_TEXTURE_COMPARE_FUNC and
GL_TEXTURE_COMPARE_MODE. These texture parameters were introduced
to provide a feature known as percentage closest filtering (PCF). When
performing the shadowing technique known as shadow mapping, the
fragment shader needs to compare the current depth value of a fragment
to the depth value in a depth texture to determine whether a fragment
is within or outside of the shadow. To achieve smoother-looking shadow
edges, it is useful to be able to perform bilinear filtering on the depth
texture. However, when filtering a depth texture, we want the filtering to
occur after we sample the depth value and compare to the current depth
(or reference value). If filtering were to occur before comparison, then we
would be averaging values in the depth texture, which does not provide
the correct result. PCF provides the correct filtering, such that each depth
value sampled is compared to the reference depth and then the results of
those comparisons (0 or 1) are averaged together.
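The distinction between comparing first and filtering first is easy to see numerically. The sketch below contrasts the two orders for a set of depth samples and filter weights; `pcf_lequal` and `filter_then_compare` are hypothetical functions, and the comparison is assumed to be "reference less than or equal to stored depth", with 1 meaning lit.

```c
#include <assert.h>

/* PCF: compare each depth sample with the reference, then average
 * the 0/1 results using the filter weights. */
float pcf_lequal(const float *depth, const float *weight, int n, float ref)
{
    assert(n > 0);
    float lit = 0.0f;
    for (int i = 0; i < n; i++)
        lit += weight[i] * (ref <= depth[i] ? 1.0f : 0.0f);
    return lit;
}

/* The incorrect order: average the depths first, then compare once. */
float filter_then_compare(const float *depth, const float *weight, int n,
                          float ref)
{
    assert(n > 0);
    float avg = 0.0f;
    for (int i = 0; i < n; i++)
        avg += weight[i] * depth[i];
    return ref <= avg ? 1.0f : 0.0f;
}
```

With four equally weighted samples of 0.2, 0.2, 0.9, and 0.9 and a reference depth of 0.5, PCF reports the fragment as half lit (0.5), which gives a soft shadow edge. Filtering first averages the depths to 0.55 and reports the fragment as fully lit, so the edge stays hard and can land in the wrong place.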

The GL_TEXTURE_COMPARE_MODE defaults to GL_NONE, but when it is
set to GL_COMPARE_REF_TO_TEXTURE, the r-coordinate in the (s, t, r)
texture coordinate will be compared with the value of the depth texture.
The result of this comparison then becomes the result of the shadow
texture fetch (either a value of 0 or 1, or an averaging of these values
if texture filtering is enabled). The comparison function is set using
GL_TEXTURE_COMPARE_FUNC, which can set the comparison function to
GL_LEQUAL, GL_GEQUAL, GL_LESS, GL_GREATER, GL_EQUAL, GL_NOTEQUAL,
GL_ALWAYS, or GL_NEVER. More details on shadow mapping are covered
in Chapter 14, “Advanced Programming with OpenGL ES 3.0.”

Texture Formats

OpenGL ES 3.0 offers a wide range of data formats for textures. In fact, the

number of formats has greatly increased from OpenGL ES 2.0. This section

details the texture formats available in OpenGL ES 3.0.

As described in the previous section Texture Objects and Loading Textures,
a 2D texture can be uploaded with either an unsized or sized internal
format using glTexImage2D. If the texture is specified with an unsized
format, the OpenGL ES implementation is free to choose the actual
internal representation in which the texture data is stored. If the texture is
specified with a sized format, then OpenGL ES will choose a format with
at least as many bits as is specified.

Table 9-4 lists the valid combinations for specifying a texture with an

unsized internal format.

If the application wants more control over how the data is stored internally,
then it can use a sized internal format. The valid combinations for sized
internal formats with glTexImage2D are listed in Tables 9-5 to 9-10. In the last
two columns, “R” means renderable and “F” means filterable. OpenGL ES 3.0
mandates only that certain formats be available for rendering to or filtering
from. Further, some formats can be specified with input data containing more
bits than the internal format. In this case, the implementation may choose to
convert to fewer bits or use a format with more bits.


To explain the large variety of texture formats in OpenGL ES 3.0, we have

organized them into the following categories: normalized texture formats,

floating-point textures, integer textures, shared exponent textures, sRGB

textures, and depth textures.

Normalized Texture Formats

Table 9-5 lists the set of internal format combinations that can be used

to specify normalized texture formats. By “normalized,” we mean that

the results when fetched from the texture in the fragment shader will

be in the [0.0, 1.0] range (or [–1.0, 1.0] range in the case of *_SNORM

formats). For example, a GL_R8 image specified with GL_UNSIGNED_BYTE

data will take each 8-bit unsigned byte value in the range [0, 255]

and map it to [0.0, 1.0] when fetched in the fragment shader. A

GL_R8_SNORM image specified with GL_BYTE data will take each 8-bit

signed byte value in the range [–128, 127] and map it to [–1.0, 1.0]

when fetched.
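The mapping just described can be sketched on the CPU as follows. This is only an illustration of the arithmetic; the function names are ours. For the signed case we assume the rule from the OpenGL ES 3.0 specification, which divides by 127 and clamps the result to –1.0, so that both –128 and –127 map to –1.0.

```c
#include <stdint.h>

/* Unsigned normalized: [0, 255] maps to [0.0, 1.0]. */
float unorm8_to_float(uint8_t c)
{
    return (float)c / 255.0f;
}

/* Signed normalized: [-128, 127] maps to [-1.0, 1.0].
 * -128 / 127 falls just below -1.0, so the result is clamped. */
float snorm8_to_float(int8_t c)
{
    float f = (float)c / 127.0f;
    return f < -1.0f ? -1.0f : f;
}
```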

The normalized formats can be specified with between one and four

components per texel (R, RG, RGB, or RGBA). OpenGL ES 3.0 also

introduces GL_RGB10_A2, which allows the specification of texture

image data with 10 bits for each (R, G, B) value and 2 bits for each alpha

value.

Table 9-4 Valid Unsized Internal Format Combinations for glTexImage2D

internalFormat      format              type                       Input Data
GL_RGB              GL_RGB              GL_UNSIGNED_BYTE           8/8/8 RGB 24-bit
GL_RGB              GL_RGB              GL_UNSIGNED_SHORT_5_6_5    5/6/5 RGB 16-bit
GL_RGBA             GL_RGBA             GL_UNSIGNED_BYTE           8/8/8/8 RGBA 32-bit
GL_RGBA             GL_RGBA             GL_UNSIGNED_SHORT_4_4_4_4  4/4/4/4 RGBA 16-bit
GL_RGBA             GL_RGBA             GL_UNSIGNED_SHORT_5_5_5_1  5/5/5/1 RGBA 16-bit
GL_LUMINANCE_ALPHA  GL_LUMINANCE_ALPHA  GL_UNSIGNED_BYTE           8/8 LA 16-bit
GL_LUMINANCE        GL_LUMINANCE        GL_UNSIGNED_BYTE           8 L 8-bit
GL_ALPHA            GL_ALPHA            GL_UNSIGNED_BYTE           8 A 8-bit

Table 9-5 Normalized Sized Internal Format Combinations for glTexImage2D

internalFormat  format   type                            Input Data                 R  F
GL_R8           GL_RED   GL_UNSIGNED_BYTE                8-bit Red                  X  X
GL_R8_SNORM     GL_RED   GL_BYTE                         8-bit Red (signed)         -  X
GL_RG8          GL_RG    GL_UNSIGNED_BYTE                8/8 RG                     X  X
GL_RG8_SNORM    GL_RG    GL_BYTE                         8/8 RG (signed)            -  X
GL_RGB8         GL_RGB   GL_UNSIGNED_BYTE                8/8/8 RGB                  X  X
GL_RGB8_SNORM   GL_RGB   GL_BYTE                         8/8/8 RGB (signed)         -  X
GL_RGB565       GL_RGB   GL_UNSIGNED_BYTE                8/8/8 RGB                  X  X
GL_RGB565       GL_RGB   GL_UNSIGNED_SHORT_5_6_5         5/6/5 RGB                  X  X
GL_RGBA8        GL_RGBA  GL_UNSIGNED_BYTE                8/8/8/8 RGBA               X  X
GL_RGBA8_SNORM  GL_RGBA  GL_BYTE                         8/8/8/8 RGBA (signed)      -  X
GL_RGB5_A1      GL_RGBA  GL_UNSIGNED_BYTE                8/8/8/8 RGBA               X  X
GL_RGB5_A1      GL_RGBA  GL_UNSIGNED_SHORT_5_5_5_1       5/5/5/1 RGBA               X  X
GL_RGB5_A1      GL_RGBA  GL_UNSIGNED_INT_2_10_10_10_REV  10/10/10/2 RGBA            X  X
GL_RGBA4        GL_RGBA  GL_UNSIGNED_BYTE                8/8/8/8 RGBA               X  X
GL_RGBA4        GL_RGBA  GL_UNSIGNED_SHORT_4_4_4_4       4/4/4/4 RGBA               X  X
GL_RGB10_A2     GL_RGBA  GL_UNSIGNED_INT_2_10_10_10_REV  10/10/10/2 RGBA            X  X

R = format is renderable. F = format is filterable. A dash means the capability is not mandated.

Floating-Point Texture Formats

OpenGL ES 3.0 also introduces floating-point texture formats. The

majority of the floating-point formats are backed by either 16-bit

half-floating-point data (described in detail in Appendix A) or 32-bit

floating-point data. Floating-point texture formats can have one to four

components, just like normalized texture formats (R, RG, RGB, RGBA).

OpenGL ES 3.0 does not mandate that floating-point formats be used as

render targets, and only 16-bit half-floating-point data is mandated to be

filterable.

In addition to 16-bit and 32-bit floating-point data, OpenGL ES 3.0

introduces the 11/11/10 GL_R11F_G11F_B10F floating-point format. The

motivation for this format is to provide higher-precision, three-channel

textures while still keeping the storage of each texel at 32 bits. The use of

this format may lead to higher performance than a 16/16/16 GL_RGB16F

or 32/32/32 GL_RGB32F texture. This format has 11 bits for the Red and

Green channel and 10 bits for the Blue channel. For the 11-bit Red and

Green values, there are 6 bits of mantissa and 5 bits of exponent; the

10-bit Blue value has 5 bits of mantissa and 5 bits of exponent. The

11/11/10 format can be used only to represent positive values because

there is no sign bit for any of the components. The largest value that

can be represented in the 11-bit and 10-bit formats is 6.5 × 10^4 and the

smallest value is 6.1 × 10^-5. The 11-bit format has 2.5 decimal digits of

precision, and the 10-bit format has 2.32 decimal digits of precision.
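To make the bit layout concrete, here is a small sketch that decodes a single 11-bit component to a float. The function name decode_uf11 is ours. The exponent bias of 15 and the handling of the all-ones exponent and denormals follow the OpenGL ES 3.0 specification rather than the text above, so treat them as assumptions of this sketch. The powers of two are built with simple loops to keep the example free of math library calls.

```c
#include <math.h>    /* only for the INFINITY and NAN macros */
#include <stdint.h>

/* Decode one unsigned 11-bit float: 5 exponent bits (assumed bias 15)
 * above 6 mantissa bits, with no sign bit. */
float decode_uf11(uint16_t bits)
{
    int exponent = (bits >> 6) & 0x1F;
    int mantissa = bits & 0x3F;
    float scale = 1.0f;
    int i;

    if (exponent == 31)                      /* Inf and NaN encodings */
        return mantissa ? NAN : INFINITY;

    if (exponent == 0) {                     /* denormal: 2^-14 * (m / 64) */
        for (i = 0; i < 14; i++) scale *= 0.5f;
        return scale * ((float)mantissa / 64.0f);
    }

    /* normal: 2^(e - 15) * (1 + m / 64) */
    for (i = exponent; i > 15; i--) scale *= 2.0f;
    for (i = exponent; i < 15; i++) scale *= 0.5f;
    return scale * (1.0f + (float)mantissa / 64.0f);
}
```

Decoding the largest finite encoding (exponent 30, mantissa 63) gives 65024, and the smallest normalized encoding (exponent 1, mantissa 0) gives 2^-14, which match the 6.5 × 10^4 and 6.1 × 10^-5 figures quoted above.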

Table 9-6 Valid Sized Floating-Point Internal Format Combinations for glTexImage2D

internalFormat     format   type                             Input Data                     R  F
GL_R16F            GL_RED   GL_HALF_FLOAT                    16-bit Red (half-float)        -  X
GL_R16F            GL_RED   GL_FLOAT                         32-bit Red (float)             -  X
GL_R32F            GL_RED   GL_FLOAT                         32-bit Red (float)             -  -
GL_RG16F           GL_RG    GL_HALF_FLOAT                    16/16 RG (half-float)          -  X
GL_RG16F           GL_RG    GL_FLOAT                         32/32 RG (float)               -  X
GL_RG32F           GL_RG    GL_FLOAT                         32/32 RG (float)               -  -
GL_RGB16F          GL_RGB   GL_HALF_FLOAT                    16/16/16 RGB (half-float)      -  X
GL_RGB16F          GL_RGB   GL_FLOAT                         32/32/32 RGB (float)           -  X
GL_RGB32F          GL_RGB   GL_FLOAT                         32/32/32 RGB (float)           -  -
GL_R11F_G11F_B10F  GL_RGB   GL_UNSIGNED_INT_10F_11F_11F_REV  10/11/11 (float)               -  X
GL_R11F_G11F_B10F  GL_RGB   GL_HALF_FLOAT                    16/16/16 RGB (half-float)      -  X
GL_R11F_G11F_B10F  GL_RGB   GL_FLOAT                         32/32/32 RGB (float)           -  X
GL_RGBA16F         GL_RGBA  GL_HALF_FLOAT                    16/16/16/16 RGBA (half-float)  -  X
GL_RGBA16F         GL_RGBA  GL_FLOAT                         32/32/32/32 RGBA (float)       -  X
GL_RGBA32F         GL_RGBA  GL_FLOAT                         32/32/32/32 RGBA (float)       -  -

R = format is renderable. F = format is filterable. A dash means the capability is not mandated.

Integer Texture Formats

Integer texture formats allow the specification of textures that can

be fetched as integers in the fragment shader. That is, as opposed to

normalized texture formats where the data are converted from their

integer representation to a normalized floating-point value upon fetch

in the fragment shader, the values in integer textures remain as integers

when fetched in the fragment shader.

Integer texture formats are not filterable, but the R, RG, and RGBA

variants can be used as a color attachment to render to in a framebuffer

object. When using an integer texture as a color attachment, the alpha

blend state is ignored (no blending is possible with integer render targets).

The fragment shader used to fetch from integer textures and to output to

an integer render target should use the appropriate signed or unsigned

integer type that corresponds with the format.

Table 9-7 Valid Sized Internal Integer Texture Format Combinations for glTexImage2D

internalFormat  format           type                            Input Data                       R  F
GL_R8UI         GL_RED_INTEGER   GL_UNSIGNED_BYTE                8-bit Red (unsigned int)         X  -
GL_R8I          GL_RED_INTEGER   GL_BYTE                         8-bit Red (signed int)           X  -
GL_R16UI        GL_RED_INTEGER   GL_UNSIGNED_SHORT               16-bit Red (unsigned int)        X  -
GL_R16I         GL_RED_INTEGER   GL_SHORT                        16-bit Red (signed int)          X  -
GL_R32UI        GL_RED_INTEGER   GL_UNSIGNED_INT                 32-bit Red (unsigned int)        X  -
GL_R32I         GL_RED_INTEGER   GL_INT                          32-bit Red (signed int)          X  -
GL_RG8UI        GL_RG_INTEGER    GL_UNSIGNED_BYTE                8/8 RG (unsigned int)            X  -
GL_RG8I         GL_RG_INTEGER    GL_BYTE                         8/8 RG (signed int)              X  -
GL_RG16UI       GL_RG_INTEGER    GL_UNSIGNED_SHORT               16/16 RG (unsigned int)          X  -
GL_RG16I        GL_RG_INTEGER    GL_SHORT                        16/16 RG (signed int)            X  -
GL_RG32UI       GL_RG_INTEGER    GL_UNSIGNED_INT                 32/32 RG (unsigned int)          X  -
GL_RG32I        GL_RG_INTEGER    GL_INT                          32/32 RG (signed int)            X  -
GL_RGBA8UI      GL_RGBA_INTEGER  GL_UNSIGNED_BYTE                8/8/8/8 RGBA (unsigned int)      X  -
GL_RGBA8I       GL_RGBA_INTEGER  GL_BYTE                         8/8/8/8 RGBA (signed int)        X  -
GL_RGB8UI       GL_RGB_INTEGER   GL_UNSIGNED_BYTE                8/8/8 RGB (unsigned int)         -  -
GL_RGB8I        GL_RGB_INTEGER   GL_BYTE                         8/8/8 RGB (signed int)           -  -
GL_RGB16UI      GL_RGB_INTEGER   GL_UNSIGNED_SHORT               16/16/16 RGB (unsigned int)      -  -
GL_RGB16I       GL_RGB_INTEGER   GL_SHORT                        16/16/16 RGB (signed int)        -  -
GL_RGB32UI      GL_RGB_INTEGER   GL_UNSIGNED_INT                 32/32/32 RGB (unsigned int)      -  -
GL_RGB32I       GL_RGB_INTEGER   GL_INT                          32/32/32 RGB (signed int)        -  -
GL_RGB10_A2UI   GL_RGBA_INTEGER  GL_UNSIGNED_INT_2_10_10_10_REV  10/10/10/2 RGBA (unsigned int)   X  -
GL_RGBA16UI     GL_RGBA_INTEGER  GL_UNSIGNED_SHORT               16/16/16/16 RGBA (unsigned int)  X  -
GL_RGBA16I      GL_RGBA_INTEGER  GL_SHORT                        16/16/16/16 RGBA (signed int)    X  -
GL_RGBA32UI     GL_RGBA_INTEGER  GL_UNSIGNED_INT                 32/32/32/32 RGBA (unsigned int)  X  -
GL_RGBA32I      GL_RGBA_INTEGER  GL_INT                          32/32/32/32 RGBA (signed int)    X  -

R = format is renderable. F = format is filterable. A dash means the capability is not mandated.

Shared Exponent Texture Formats

Shared exponent textures provide a way to store RGB textures that have a

large range without requiring as much bit depth as used by floating-point

textures. Shared exponent textures are typically used for high dynamic

range (HDR) images where half- or full-floating-point data are not required.

The shared exponent texture format in OpenGL ES 3.0 is GL_RGB9_E5. In

this format, one 5-bit exponent is shared by all three RGB components.

The 5-bit exponent is implicitly biased by the value 15. Each of the 9-bit

values for RGB stores the mantissa without a sign bit (and thus must be

positive).

Upon fetch, the three RGB values are derived from the texture using the

following equations:

$$R_{out} = R_{in} \times 2^{(EXP - 15)}$$

$$G_{out} = G_{in} \times 2^{(EXP - 15)}$$

$$B_{out} = B_{in} \times 2^{(EXP - 15)}$$

If the input texture is specified in 16-bit half-float or 32-bit float, then

the OpenGL ES implementation will automatically convert to the shared

exponent format. The conversion is done by first determining the

maximum color value:

MAX=max(R,G,B)

The shared exponent is then computed using the following formula:

$$EXP = \max(-16, \lfloor \log_2(MAX) \rfloor) + 16$$

Finally, the 9-bit mantissa values for RGB are computed as follows:

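$$R_s = \lfloor R / 2^{(EXP - 15 - 9)} + 0.5 \rfloor$$

$$G_s = \lfloor G / 2^{(EXP - 15 - 9)} + 0.5 \rfloor$$

$$B_s = \lfloor B / 2^{(EXP - 15 - 9)} + 0.5 \rfloor$$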

An application could use these conversion formulas to derive the 5-bit EXP

and 9-bit RGB values from incoming data, or it can simply pass in the 16-bit

half-float or 32-bit float data to OpenGL ES and let it perform the conversion.

Table 9-8 Valid Shared Exponent Sized Internal Format Combinations for glTexImage2D

internalFormat  format  type                         Input Data                            R  F
GL_RGB9_E5      GL_RGB  GL_UNSIGNED_INT_5_9_9_9_REV  9/9/9 RGB with shared 5-bit exponent  -  X
GL_RGB9_E5      GL_RGB  GL_HALF_FLOAT                16/16/16 RGB (half-float)             -  X
GL_RGB9_E5      GL_RGB  GL_FLOAT                     32/32/32 RGB (float)                  -  X

R = format is renderable. F = format is filterable. A dash means the capability is not mandated.

sRGB Texture Formats

Another texture format introduced in OpenGL ES 3.0 is sRGB textures.

sRGB is a nonlinear colorspace that approximately follows a power

function. Most images are actually stored in the sRGB colorspace, as the

nonlinearity accounts for the fact that humans can differentiate color

better at different brightness levels.

If the images used for textures are authored in the sRGB colorspace but are

fetched without using sRGB textures, all of the lighting calculations that

occur in the shader happen in a nonlinear colorspace. That is, the textures

created in standard authoring packages are stored in sRGB and remain in

sRGB when fetched from in the shader. The lighting calculations then are

occurring in the nonlinear sRGB space. While many applications make

this mistake, it is not correct and actually results in a discernibly different

(and incorrect) output image.

To properly account for sRGB images, an application should use an sRGB

texture format that will be converted from sRGB into a linear colorspace

on fetch in the shader. Then, all calculations in the shader are done

in linear colorspace. Finally, by rendering to an sRGB render target, the

image will be correctly converted back to sRGB on write. It is possible

to approximate sRGB → linear conversion using a shader instruction

pow(value, 2.2) and then to approximate the linear → sRGB conversion

using pow(value, 1.0/2.2). However, it is preferable to use an sRGB texture

where possible because it reduces the shader instructions and provides a

more correct sRGB conversion.

Depth Texture Formats

The nal texture format type in OpenGL ES 3.0 is depth textures. Depth

textures allow the application to fetch the depth (and optionally, stencil)

value from the depth attachment of a framebuffer object. This is useful in

a variety of advanced rendering algorithms, including shadow mapping.

Table 9-10 lists the valid depth texture formats in OpenGL ES 3.0.

Table 9-9 Valid sRGB Sized Internal Format Combinations for glTexImage2D

internalFormat   format   type              Input Data    R  F
GL_SRGB8         GL_RGB   GL_UNSIGNED_BYTE  8/8/8 SRGB    -  X
GL_SRGB8_ALPHA8  GL_RGBA  GL_UNSIGNED_BYTE  8/8/8/8 RGBA  X  X

R = format is renderable. F = format is filterable. A dash means the capability is not mandated.

Table 9-10 Valid Depth Sized Internal Format Combinations for glTexImage2D

internalFormat         format              type
GL_DEPTH_COMPONENT16   GL_DEPTH_COMPONENT  GL_UNSIGNED_SHORT
GL_DEPTH_COMPONENT16   GL_DEPTH_COMPONENT  GL_UNSIGNED_INT
GL_DEPTH_COMPONENT24   GL_DEPTH_COMPONENT  GL_UNSIGNED_INT
GL_DEPTH_COMPONENT32F  GL_DEPTH_COMPONENT  GL_FLOAT
GL_DEPTH24_STENCIL8    GL_DEPTH_STENCIL    GL_UNSIGNED_INT_24_8
GL_DEPTH32F_STENCIL8   GL_DEPTH_STENCIL    GL_FLOAT_32_UNSIGNED_INT_24_8_REV

Using Textures in a Shader

Now that we have covered the basics of setting up texturing, let’s look at

some sample shader code. The vertex–fragment shader pair in Example 9-3

from the Simple_Texture2D sample demonstrates the basics of how 2D

texturing is done in a shader.

Example 9-3 Vertex and Fragment Shaders for Performing 2D Texturing

// Vertex shader
#version 300 es
layout(location = 0) in vec4 a_position;
layout(location = 1) in vec2 a_texCoord;
out vec2 v_texCoord;
void main()
{
    gl_Position = a_position;
    v_texCoord = a_texCoord;
}

// Fragment shader
#version 300 es
precision mediump float;
in vec2 v_texCoord;
layout(location = 0) out vec4 outColor;
uniform sampler2D s_texture;
void main()
{
    outColor = texture( s_texture, v_texCoord );
}

The vertex shader takes in a two-component texture coordinate as a vertex

input and passes it as an output to the fragment shader. The fragment

shader consumes that texture coordinate and uses it for the texture fetch.

The fragment shader declares a uniform variable of type sampler2D called

s_texture. A sampler is a special type of uniform variable that is used to

fetch from a texture map. The sampler uniform will be loaded with a value

specifying the texture unit to which the texture is bound; for example,

a sampler with a value of 0 fetches from unit

GL_TEXTURE0, a sampler with a value of 1 fetches from GL_TEXTURE1, and

so on. Textures are bound to texture units in the OpenGL ES 3.0 API by

using the glActiveTexture function.

void glActiveTexture(GLenum texture)

texture the texture unit to make active: GL_TEXTURE0, GL_TEXTURE1,

… , GL_TEXTURE31

The function glActiveTexture sets the current texture unit so

that subsequent calls to glBindTexture will bind the texture to the

currently active unit. The number of texture units available to the

fragment shader on an implementation of OpenGL ES can be queried

for by using glGetIntegerv with the parameter

GL_MAX_TEXTURE_IMAGE_UNITS. The number of texture units available to the vertex

shader can be queried for by using glGetIntegerv with the parameter

GL_MAX_VERTEX_TEXTURE_IMAGE_UNITS.

The following example code from the Simple_Texture2D example shows

how the sampler and texture are bound to the texture unit.

// Get the sampler locations
userData->samplerLoc = glGetUniformLocation(
                           userData->programObject,
                           "s_texture");

// ...

// Bind the texture
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, userData->textureId);

// Set the sampler texture unit to 0
glUniform1i(userData->samplerLoc, 0);

At this point, we have the texture loaded, the texture bound to texture

unit 0, and the sampler set to use texture unit 0. Going back to the

fragment shader in the Simple_Texture2D example, we see that the

shader code then uses the built-in function texture to fetch from the

texture map. The texture built-in function takes the form shown here:

vec4 texture(sampler2D sampler, vec2 coord[,

float bias])

sampler a sampler bound to a texture unit specifying the texture from

which to fetch.

coord a 2D texture coordinate used to fetch from the texture map.

bias an optional parameter that provides a mipmap bias used for

the texture fetch. This allows the shader to explicitly bias the

computed LOD value used for mipmap selection.

The texture function returns a vec4 representing the color fetched from

the texture map. The way the texture data is mapped into the channels

of this color depends on the base format of the texture. Table 9-11 shows

the way in which texture formats are mapped to vec4 colors. The texture

swizzle (described in the Texture Swizzles section earlier in this chapter)

determines how the values from each of these components map to

components in the shader.

Table 9-11 Mapping of Texture Formats to Colors

Base Format Texel Data Description

GL_RED (R, 0.0, 0.0, 1.0)

GL_RG (R, G, 0.0, 1.0)

GL_RGB (R, G, B, 1.0)

GL_RGBA (R, G, B, A)

GL_LUMINANCE (L, L, L, 1.0)

GL_LUMINANCE_ALPHA (L, L, L, A)

GL_ALPHA (0.0, 0.0, 0.0, A)

In the case of the Simple_Texture2D example, the texture was loaded

as GL_RGB and the texture swizzles were left at the default values, so the

result of the texture fetch will be a vec4 with values (R, G, B, 1.0).
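The expansion rules in Table 9-11 can be mirrored on the CPU, which is occasionally handy when checking shader results against reference data. The sketch below is only an illustration: the BaseFormat enum and Color4 struct are hypothetical stand-ins of our own for the GL constants and the GLSL vec4, and texture swizzles are assumed to be at their defaults.

```c
/* Hypothetical enum standing in for the GL base-format constants. */
typedef enum {
    BASE_RED, BASE_RG, BASE_RGB, BASE_RGBA,
    BASE_LUMINANCE, BASE_LUMINANCE_ALPHA, BASE_ALPHA
} BaseFormat;

/* Hypothetical stand-in for the vec4 returned by texture(). */
typedef struct { float r, g, b, a; } Color4;

/* c points at the components actually stored for one texel, in format order.
 * Missing color channels default to 0.0 and missing alpha to 1.0. */
Color4 expand_texel(BaseFormat fmt, const float *c)
{
    Color4 out = { 0.0f, 0.0f, 0.0f, 1.0f };
    switch (fmt) {
    case BASE_RED:  out.r = c[0]; break;
    case BASE_RG:   out.r = c[0]; out.g = c[1]; break;
    case BASE_RGB:  out.r = c[0]; out.g = c[1]; out.b = c[2]; break;
    case BASE_RGBA: out.r = c[0]; out.g = c[1]; out.b = c[2]; out.a = c[3]; break;
    case BASE_LUMINANCE:
        out.r = out.g = out.b = c[0];
        break;
    case BASE_LUMINANCE_ALPHA:
        out.r = out.g = out.b = c[0];
        out.a = c[1];
        break;
    case BASE_ALPHA: out.a = c[0]; break;
    }
    return out;
}
```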

Example of Using a Cubemap Texture

Using a cubemap texture is very similar to using a 2D texture. The example

Simple_TextureCubemap demonstrates drawing a sphere with a simple

cubemap. The cubemap contains six 1 × 1 faces, each with a different

color. The code in Example 9-4 is used to load the cubemap texture.

Example 9-4 Loading a Cubemap Texture

GLuint CreateSimpleTextureCubemap()
{
    GLuint textureId;

    // Six 1 x 1 RGB faces
    GLubyte cubePixels[6][3] =
    {
        // Face 0 - Red
        { 255, 0, 0 },
        // Face 1 - Green
        { 0, 255, 0 },
        // Face 2 - Blue
        { 0, 0, 255 },
        // Face 3 - Yellow
        { 255, 255, 0 },
        // Face 4 - Purple
        { 255, 0, 255 },
        // Face 5 - White
        { 255, 255, 255 }
    };

    // Generate a texture object
    glGenTextures(1, &textureId);

    // Bind the texture object
    glBindTexture(GL_TEXTURE_CUBE_MAP, textureId);

    // Load the cube face - Positive X
    glTexImage2D(GL_TEXTURE_CUBE_MAP_POSITIVE_X, 0, GL_RGB, 1, 1,
                 0, GL_RGB, GL_UNSIGNED_BYTE, &cubePixels[0]);

    // Load the cube face - Negative X
    glTexImage2D(GL_TEXTURE_CUBE_MAP_NEGATIVE_X, 0, GL_RGB, 1, 1,
                 0, GL_RGB, GL_UNSIGNED_BYTE, &cubePixels[1]);

    // Load the cube face - Positive Y
    glTexImage2D(GL_TEXTURE_CUBE_MAP_POSITIVE_Y, 0, GL_RGB, 1, 1,
                 0, GL_RGB, GL_UNSIGNED_BYTE, &cubePixels[2]);

    // Load the cube face - Negative Y
    glTexImage2D(GL_TEXTURE_CUBE_MAP_NEGATIVE_Y, 0, GL_RGB, 1, 1,
                 0, GL_RGB, GL_UNSIGNED_BYTE, &cubePixels[3]);

    // Load the cube face - Positive Z
    glTexImage2D(GL_TEXTURE_CUBE_MAP_POSITIVE_Z, 0, GL_RGB, 1, 1,
                 0, GL_RGB, GL_UNSIGNED_BYTE, &cubePixels[4]);

    // Load the cube face - Negative Z
    glTexImage2D(GL_TEXTURE_CUBE_MAP_NEGATIVE_Z, 0, GL_RGB, 1, 1,
                 0, GL_RGB, GL_UNSIGNED_BYTE, &cubePixels[5]);

    // Set the filtering mode
    glTexParameteri(GL_TEXTURE_CUBE_MAP, GL_TEXTURE_MIN_FILTER,
                    GL_NEAREST);
    glTexParameteri(GL_TEXTURE_CUBE_MAP, GL_TEXTURE_MAG_FILTER,
                    GL_NEAREST);

    return textureId;
}

This code loads each individual cubemap face with 1 × 1 RGB pixel data by

calling glTexImage2D for each cubemap face. The shader code to render

the sphere with a cubemap is provided in Example 9-5.

Example 9-5 Vertex and Fragment Shader Pair for Cubemap Texturing

// Vertex shader
#version 300 es
layout(location = 0) in vec4 a_position;
layout(location = 1) in vec3 a_normal;
out vec3 v_normal;
void main()
{
    gl_Position = a_position;
    v_normal = a_normal;
}

// Fragment shader
#version 300 es
precision mediump float;
in vec3 v_normal;
layout(location = 0) out vec4 outColor;
uniform samplerCube s_texture;
void main()
{
    outColor = texture( s_texture, v_normal );
}

The vertex shader takes in a position and a normal as vertex inputs.

Anormal is stored at each vertex of the sphere that will be used as a

texture coordinate. The normal is passed to the fragment shader. The

fragment shader then uses the built-in function texture to fetch from the

cubemap using the normal as a texture coordinate. The texture built-in

function for cubemaps takes the form shown here:

vec4 texture(samplerCube sampler, vec3 coord[,

float bias])

sampler the sampler is bound to a texture unit specifying the texture

from which to fetch.

coord a 3D texture coordinate used to fetch from the cubemap.

bias an optional parameter that provides a mipmap bias used for

the texture fetch. This allows the shader to explicitly bias the

computed LOD value used for mipmap selection.

The function for fetching a cubemap is very similar to that for a 2D texture. The

only difference is that the texture coordinate has three components

instead of two and the sampler type must be samplerCube. The same

method is used to bind the cubemap texture and load the sampler as is

used for the Simple_Texture2D example.

Loading 3D Textures and 2D Texture Arrays

As discussed earlier in the chapter, in addition to 2D textures and

cubemaps, OpenGL ES 3.0 includes 3D textures and 2D texture arrays.

The function to load 3D textures and 2D texture arrays is glTexImage3D,

which is very similar to glTexImage2D.

void glTexImage3D(GLenum target, GLint level,
                  GLenum internalFormat,
                  GLsizei width, GLsizei height,
                  GLsizei depth, GLint border,
                  GLenum format, GLenum type,
                  const void* pixels)

target specifies the texture target; should be

GL_TEXTURE_3D or GL_TEXTURE_2D_ARRAY.

level specifies which mip level to load. The base level is

specified by 0, followed by an increasing level for each

successive mipmap.

internalFormat the internal format for the texture storage; can be either

an unsized base internal format or a sized internal

format. The full valid internalFormat, format, and

type combinations are provided in Tables 9-4

through 9-10.

width the width of the image in pixels.

height the height of the image in pixels.

depth the number of slices of the 3D texture.

border this parameter is ignored in OpenGL ES. It was kept

for compatibility with the desktop OpenGL interface.

Should be 0.

format the format of the incoming texture data; can be

GL_RED, GL_RED_INTEGER, GL_RG, GL_RG_INTEGER,

GL_RGB, GL_RGB_INTEGER, GL_RGBA, GL_RGBA_INTEGER,

GL_DEPTH_COMPONENT, GL_DEPTH_STENCIL,

GL_LUMINANCE_ALPHA, or GL_ALPHA

type the type of the incoming pixel data; can be

GL_UNSIGNED_BYTE, GL_BYTE, GL_UNSIGNED_SHORT, GL_SHORT,

GL_UNSIGNED_INT, GL_INT, GL_HALF_FLOAT, or GL_FLOAT

pixels contains the actual pixel data for the image. The data must

contain (width * height * depth) number of pixels with

the appropriate number of bytes per pixel based on the

format and type specification. The image data should be

stored as a sequence of 2D texture slices.
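To illustrate the slice-by-slice layout expected for the pixels argument, the following sketch computes the total buffer size and the byte offset of an individual texel. The function names are ours, and the sketch assumes tightly packed rows, that is, that the row size in bytes already satisfies GL_UNPACK_ALIGNMENT so no padding is inserted between rows.

```c
#include <stddef.h>

/* Total number of bytes in a tightly packed width x height x depth volume. */
size_t volume_size_bytes(size_t width, size_t height, size_t depth,
                         size_t bytesPerPixel)
{
    return width * height * depth * bytesPerPixel;
}

/* Byte offset of texel (x, y, z): slices are stored one after another,
 * each slice is a sequence of rows, each row a sequence of texels. */
size_t texel_offset(size_t x, size_t y, size_t z,
                    size_t width, size_t height, size_t bytesPerPixel)
{
    return ((z * height + y) * width + x) * bytesPerPixel;
}
```

For a 4 × 4 × 2 GL_RGBA8 volume, for instance, the second slice starts 64 bytes into a 128-byte buffer.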

Once a 3D texture or 2D texture array has been loaded using

glTexImage3D, the texture can be fetched in the shader using the

texture built-in function.

vec4 texture(sampler3D sampler, vec3 coord[,

float bias])

vec4 texture(sampler2DArray sampler, vec3 coord[,

float bias])

sampler a sampler bound to a texture unit specifying the texture to

fetch from.

coord a 3D texture coordinate used to fetch from the texture map.

bias an optional parameter that provides a mipmap bias used for

the texture fetch. This allows the shader to explicitly bias the

computed LOD value used for mipmap selection.

Note that the r-coordinate is a floating-point value. For 3D textures,

depending on the filtering mode set, the texture fetch might span two

slices of the volume.

Compressed Textures

Thus far, we have been dealing with textures that were loaded with

uncompressed texture image data. OpenGL ES 3.0 also supports the

loading of compressed texture image data. There are several reasons why

compressing textures is desirable. The first and obvious reason to compress

textures is to reduce the memory footprint of the textures on the device.

A second, less obvious reason to compress textures is that a memory

bandwidth savings occurs when you fetch from compressed textures

in a shader. Finally, compressed textures might allow you to reduce the

download size of your application by reducing the amount of image data

that must be stored.

In OpenGL ES 2.0, the core specification did not define any compressed

texture image formats. That is, the OpenGL ES 2.0 core simply defined a

mechanism whereby compressed texture image data could be loaded, but

no compressed formats were defined. As a result, many vendors, including

Qualcomm, ARM, Imagination Technologies, and NVIDIA, provided

hardware-specific texture compression extensions. In turn, developers of

OpenGL ES 2.0 applications had to support different texture compression

formats on different platforms and hardware.

OpenGL ES 3.0 has improved this situation by introducing standard

texture compression formats that all vendors must support. Ericsson

Texture Compression (ETC2 and EAC) was offered as a royalty-free

standard to Khronos, and it was adopted as the standard texture

compression format for OpenGL ES 3.0. There are variants of EAC

for compressing one- and two-channel data as well as variants of

ETC2 for compressing three- and four-channel data. The function

used to load compressed image data for 2D textures and cubemaps

is glCompressedTexImage2D; the corresponding function for 2D

texture arrays is glCompressedTexImage3D. Note that ETC2/EAC is not

supported for 3D textures (only 2D textures and 2D texture arrays), but

glCompressedTexImage3D can be used to potentially load vendor-specific

3D texture compression formats.

void glCompressedTexImage2D(GLenum target, GLint level,
                            GLenum internalFormat,
                            GLsizei width,
                            GLsizei height,
                            GLint border,
                            GLsizei imageSize,
                            const void *data)

void glCompressedTexImage3D(GLenum target, GLint level,
                            GLenum internalFormat,
                            GLsizei width,
                            GLsizei height,
                            GLsizei depth,
                            GLint border,
                            GLsizei imageSize,
                            const void *data)

target specifies the texture target; should be GL_TEXTURE_2D

or one of the GL_TEXTURE_CUBE_MAP_* face targets

(for glCompressedTexImage2D) or GL_TEXTURE_3D

or GL_TEXTURE_2D_ARRAY (for glCompressedTexImage3D).

level specifies which mip level to load. The base level is

specified by 0, followed by an increasing level for

each successive mipmap.

internalFormat the internal format for the texture storage. The

standard compressed texture formats in OpenGL ES

3.0 are described in Table 9-12.

width the width of the image in pixels.

height the height of the image in pixels.

depth (glCompressedTexImage3D only) the depth of the

image in pixels (or number of slices for a 2D texture

array).

border this parameter is ignored in OpenGL ES; it was

kept for compatibility with the desktop OpenGL

interface. Should be 0.

imageSize the size of the image in bytes.

data contains the actual compressed pixel data for the

image; must hold imageSize number of bytes.

The standard ETC compressed texture formats supported by OpenGL ES

3.0 are listed in Table 9-12. All of the ETC formats store compressed image

data in 4 × 4 blocks. Table 9-12 lists the number of bits per pixel in each

of the ETC formats. The size of an individual ETC image can be computed

from the bits-per-pixel (bpp) ratio as follows:

sizeInBytes = max(width, 4) * max(height, 4) * bpp/8

This expression is exact when the width and height are multiples of 4 (or

smaller than 4). For other sizes, each dimension must first be rounded up to

the next multiple of 4, because the data is stored in whole 4 × 4 blocks.

Table 9-12 Standard Texture Compression Formats

internalFormat                 Size (bits per pixel)  Description
GL_COMPRESSED_R11_EAC          4                      Single-channel unsigned compressed GL_RED format
GL_COMPRESSED_SIGNED_R11_EAC   4                      Single-channel signed compressed GL_RED format
GL_COMPRESSED_RG11_EAC         8                      Two-channel unsigned compressed GL_RG format
GL_COMPRESSED_SIGNED_RG11_EAC  8                      Two-channel signed compressed GL_RG format
GL_COMPRESSED_RGB8_ETC2        4                      Three-channel unsigned compressed GL_RGB format
GL_COMPRESSED_SRGB8_ETC2       4                      Three-channel unsigned compressed GL_RGB format in sRGB colorspace
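The imageSize argument has to match the compressed size of the image exactly, so it can be useful to compute it from the block layout. The sketch below is one way to do that for the ETC2/EAC formats of Table 9-12; the function name is ours. It rounds each dimension up to a whole number of 4 × 4 blocks, which agrees with the bits-per-pixel formula whenever the dimensions are multiples of 4.

```c
#include <stddef.h>

/* Size in bytes of one ETC2/EAC mip level.
 * bitsPerPixel is 4 or 8, as listed in Table 9-12. */
size_t etc2_image_size(size_t width, size_t height, size_t bitsPerPixel)
{
    size_t blocksX = (width + 3) / 4;               /* round up to whole blocks */
    size_t blocksY = (height + 3) / 4;
    size_t bytesPerBlock = 16 * bitsPerPixel / 8;   /* 8 bytes at 4 bpp, 16 at 8 bpp */
    return blocksX * blocksY * bytesPerBlock;
}
```

A 256 × 256 GL_COMPRESSED_RGB8_ETC2 image, for example, occupies 32,768 bytes, compared with 196,608 bytes for the same image stored as uncompressed GL_RGB8.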

Once a texture has been loaded as a compressed texture, it can be used for

texturing in exactly the same way as an uncompressed texture. The details

of the ETC2/EAC formats are beyond our scope here, and most developers

will never write their own compressors. Freely available tools for generating

ETC images include the open-source libKTX library from Khronos

(http://khronos.org/opengles/sdk/tools/KTX/), the rg_etc project

(https://code.google.com/p/rg-etc1/), the ARM Mali Texture Compression Tool, Qualcomm

TexCompress (included in the Adreno SDK), and Imagination Technologies

PVRTexTool. We would encourage readers to evaluate the available tools and

choose the one that fits best with their development environment/platform.

Note that all implementations of OpenGL ES 3.0 will support the formats

listed in Table 9-12. In addition, some implementations may support vendor-specific

compressed formats not listed in Table 9-12. If you attempt to use a

texture compression format on an OpenGL ES 3.0 implementation that does

not support it, a GL_INVALID_ENUM error will be generated. It is important

that you check that the OpenGL ES 3.0 implementation exports the

extension string for any vendor-specific texture compression format you use.

If it does not, you must fall back to using an uncompressed texture format.

In addition to checking extension strings, there is another method

you can use to determine which texture compression formats

are supported by an implementation. That is, you can query for

GL_NUM_COMPRESSED_TEXTURE_FORMATS using glGetIntegerv to determine

the number of compressed image formats supported. You can then query

for GL_COMPRESSED_TEXTURE_FORMATS using glGetIntegerv, which will

return an array of GLenum values. Each GLenum value in the array will be a

compressed texture format that is supported by the implementation.

Table 9-12 Standard Texture Compression Formats (continued)

internalFormat                                Size (bits per pixel)  Description
GL_COMPRESSED_RGB8_PUNCHTHROUGH_ALPHA1_ETC2   4                      Four-channel unsigned compressed GL_RGBA format with 1-bit alpha
GL_COMPRESSED_SRGB8_PUNCHTHROUGH_ALPHA1_ETC2  4                      Four-channel unsigned compressed GL_RGBA format with 1-bit alpha in sRGB colorspace
GL_COMPRESSED_RGBA8_ETC2_EAC                  8                      Four-channel unsigned compressed GL_RGBA format
GL_COMPRESSED_SRGB8_ALPHA8_ETC2_EAC           8                      Four-channel unsigned compressed GL_RGBA format in sRGB colorspace

Texture Subimage Specication

After uploading a texture image using glTexImage2D, it is possible to

update portions of the image. This ability would be useful if you wanted

to update just a subregion of an image. The function to load a portion of a

2D texture image is glTexSubImage2D.

void glTexSubImage2D( GLenum target, GLint level,

GLint xoffset, GLint yoffset,

GLsizei width, GLsizei height,

GLenum format, GLenum type,

const void* pixels)

target species the texture target, either GL_TEXTURE_2D or one of

the cubemap face targets

(GL_TEXTURE_CUBE_MAP_POSITIVE_X,

GL_TEXTURE_CUBE_MAP_NEGATIVE_X, and soon)

level species which mip level to update

xoffset the x index of the texel to start updating from

yoffset the y index of the texel to start updating from

width the width of the subregion of the image to update

height the height of the subregion of the image to update

format the format of the incoming texture data; can be

GL_RED, GL_RED_INTEGER, GL_RG, GL_RG_INTEGER,

GL_RGB, GL_RGB_INTEGER, GL_RGBA,

GL_RGBA_INTEGER, GL_DEPTH_COMPONENT,

GL_DEPTH_STENCIL, GL_LUMINANCE_ALPHA,

GL_LUMINANCE, or GL_ALPHA

type the type of the incoming pixel data; can be

GL_UNSIGNED_BYTE, GL_BYTE, GL_UNSIGNED_SHORT,

GL_SHORT, GL_UNSIGNED_INT, GL_INT, GL_HALF_FLOAT,

GL_FLOAT, GL_UNSIGNED_SHORT_5_6_5,

GL_UNSIGNED_SHORT_4_4_4_4, GL_UNSIGNED_SHORT_5_5_5_1,

GL_UNSIGNED_INT_2_10_10_10_REV,

GL_UNSIGNED_INT_10F_11F_11F_REV,

GL_UNSIGNED_INT_5_9_9_9_REV,

GL_UNSIGNED_INT_24_8, or

GL_FLOAT_32_UNSIGNED_INT_24_8_REV

pixels contains the actual pixel data for the subregion of the image

Texture Subimage Specication 267

This function will update the region of texels in the range (xoffset, yoffset)

to (xoffset + width – 1, yoffset + height – 1). Note that to use this function,

the texture must already be fully specified. The range of the subimage

must be within the bounds of the previously specified texture image. The

data in the pixels array must be aligned to the alignment that is specified

by GL_UNPACK_ALIGNMENT with glPixelStorei.
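The two constraints above (the subregion must fit inside the existing image, and rows of client data are padded to the unpack alignment) can be checked on the CPU before calling glTexSubImage2D. The following C sketch is only an illustration: the helper names subimage_in_bounds and unpack_row_stride are ours and are not part of OpenGL ES, and the stride computation assumes tightly packed rows (GL_UNPACK_ROW_LENGTH left at 0). Recall that the default GL_UNPACK_ALIGNMENT is 4.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: returns true if the subregion
 * (xoffset, yoffset) .. (xoffset + width - 1, yoffset + height - 1)
 * lies inside a mip level of size texWidth x texHeight. */
bool subimage_in_bounds(int texWidth, int texHeight,
                        int xoffset, int yoffset,
                        int width, int height)
{
    return xoffset >= 0 && yoffset >= 0 &&
           width >= 0 && height >= 0 &&
           xoffset + width <= texWidth &&
           yoffset + height <= texHeight;
}

/* Hypothetical helper: number of bytes from the start of one row of
 * client pixel data to the start of the next, given the value set for
 * GL_UNPACK_ALIGNMENT (1, 2, 4, or 8). Assumes GL_UNPACK_ROW_LENGTH
 * is 0, so each row holds exactly `width` pixels. */
size_t unpack_row_stride(size_t width, size_t bytesPerPixel,
                         size_t alignment)
{
    assert(alignment == 1 || alignment == 2 ||
           alignment == 4 || alignment == 8);

    size_t rowBytes = width * bytesPerPixel;

    /* Round the row size up to the next multiple of the alignment. */
    return (rowBytes + alignment - 1) / alignment * alignment;
}
```

For example, a 3-texel-wide GL_RGB / GL_UNSIGNED_BYTE update has 9 bytes of data per row but a 12-byte stride under the default alignment of 4. Either pad each row to 12 bytes or call glPixelStorei(GL_UNPACK_ALIGNMENT, 1) before the upload.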

There is also a function for updating a subregion of a compressed 2D texture image: glCompressedTexSubImage2D. The definition of this function is more or less the same as that of glTexSubImage2D.

void glCompressedTexSubImage2D(GLenum target, GLint level,
                               GLint xoffset, GLint yoffset,
                               GLsizei width, GLsizei height,
                               GLenum format, GLsizei imageSize,
                               const void *pixels)

target      specifies the texture target, either GL_TEXTURE_2D or one
            of the cubemap face targets
            (GL_TEXTURE_CUBE_MAP_POSITIVE_X,
            GL_TEXTURE_CUBE_MAP_NEGATIVE_X, and so on)
level       specifies which mip level to update
xoffset     the x index of the texel to start updating from
yoffset     the y index of the texel to start updating from
width       the width of the subregion of the image to update
height      the height of the subregion of the image to update
format      the compressed texture format to use; must be the format
            with which the image was originally specified
imageSize   the size in bytes of the compressed data in pixels
pixels      contains the actual pixel data for the subregion of the
            image

In addition, as with 2D textures, it is possible to update just a subregion of an existing 3D texture or 2D texture array using glTexSubImage3D.

void glTexSubImage3D(GLenum target, GLint level,
                     GLint xoffset, GLint yoffset, GLint zoffset,
                     GLsizei width, GLsizei height, GLsizei depth,
                     GLenum format, GLenum type,
                     const void *pixels)

target    specifies the texture target, either GL_TEXTURE_3D or
          GL_TEXTURE_2D_ARRAY
level     specifies which mip level to update
xoffset   the x index of the texel to start updating from
yoffset   the y index of the texel to start updating from
zoffset   the z index of the texel to start updating from
width     the width of the subregion of the image to update
height    the height of the subregion of the image to update
depth     the depth of the subregion of the image to update
format    the format of the incoming texture data; can be GL_RED,
          GL_RED_INTEGER, GL_RG, GL_RG_INTEGER, GL_RGB, GL_RGB_INTEGER,
          GL_RGBA, GL_RGBA_INTEGER, GL_DEPTH_COMPONENT,
          GL_DEPTH_STENCIL, GL_LUMINANCE_ALPHA, GL_LUMINANCE, or
          GL_ALPHA
type      the type of the incoming pixel data; can be GL_UNSIGNED_BYTE,
          GL_BYTE, GL_UNSIGNED_SHORT, GL_SHORT, GL_UNSIGNED_INT, GL_INT,
          GL_HALF_FLOAT, GL_FLOAT, GL_UNSIGNED_SHORT_5_6_5,
          GL_UNSIGNED_SHORT_4_4_4_4, GL_UNSIGNED_SHORT_5_5_5_1,
          GL_UNSIGNED_INT_2_10_10_10_REV,
          GL_UNSIGNED_INT_10F_11F_11F_REV, GL_UNSIGNED_INT_5_9_9_9_REV,
          GL_UNSIGNED_INT_24_8, or GL_FLOAT_32_UNSIGNED_INT_24_8_REV
pixels    contains the actual pixel data for the subregion of the image

glTexSubImage3D behaves just like glTexSubImage2D, with the only difference being that the subregion contains a zoffset and a depth for specifying the subregion within the depth slices to update. For compressed 2D texture arrays, it is also possible to update a subregion of the texture using glCompressedTexSubImage3D. For 3D textures, this function can be used only with vendor-specific 3D compressed texture formats, because ETC2/EAC are supported only for 2D textures and 2D texture arrays.

void glCompressedTexSubImage3D(GLenum target, GLint level,
                               GLint xoffset, GLint yoffset,
                               GLint zoffset,
                               GLsizei width, GLsizei height,
                               GLsizei depth,
                               GLenum format, GLsizei imageSize,
                               const void *data)

target      specifies the texture target, either GL_TEXTURE_3D or
            GL_TEXTURE_2D_ARRAY
level       specifies which mip level to update
xoffset     the x index of the texel to start updating from
yoffset     the y index of the texel to start updating from
zoffset     the z index of the texel to start updating from
width       the width of the subregion of the image to update
height      the height of the subregion of the image to update
depth       the depth of the subregion of the image to update
format      the compressed texture format to use; must be the format
            with which the image was originally specified
imageSize   the size in bytes of the compressed data in data
data        contains the actual pixel data for the subregion of the
            image

Copying Texture Data from the Color Buffer

An additional texturing feature that is supported in OpenGL ES 3.0 is the ability to copy data from a color buffer to a texture. This can be useful if you want to use the results of rendering as an image in a texture. Framebuffer objects (Chapter 12) provide a faster way to render to a texture than copying image data. However, if performance is not a concern, the ability to copy image data out of the color buffer can be a useful feature.

The color buffer from which to copy image data is selected using the function glReadBuffer. If the application is rendering to a double-buffered EGL displayable surface, then glReadBuffer must be set to GL_BACK (the back buffer, which is the default state). Recall that OpenGL ES 3.0 supports only double-buffered EGL displayable surfaces. As a consequence, all OpenGL ES 3.0 applications that draw to the display will have a color buffer for both the front and back buffers. The buffer that is currently the front or back is determined by the most recent call to eglSwapBuffers (described in Chapter 3, “An Introduction to EGL”). When you copy image data out of the color buffer from a displayable EGL surface, you will always be copying the contents of the back buffer. If you are rendering to an EGL pbuffer, then copying will occur from the pbuffer surface. Finally, if you are rendering to a framebuffer object, then the framebuffer object color attachment to copy from is set by calling glReadBuffer with GL_COLOR_ATTACHMENTi.

void glReadBuffer(GLenum mode)

mode   specifies the color buffer to read from. This will set the
       source color buffer for future calls to glReadPixels,
       glCopyTexImage2D, glCopyTexSubImage2D, and glCopyTexSubImage3D.
       The value can be either GL_BACK, GL_COLOR_ATTACHMENTi, or
       GL_NONE.

The functions to copy data from the color buffer to a texture are glCopyTexImage2D, glCopyTexSubImage2D, and glCopyTexSubImage3D.

void glCopyTexImage2D(GLenum target, GLint level,
                      GLenum internalFormat,
                      GLint x, GLint y,
                      GLsizei width, GLsizei height,
                      GLint border)

target           specifies the texture target, either GL_TEXTURE_2D or
                 one of the cubemap face targets
                 (GL_TEXTURE_CUBE_MAP_POSITIVE_X,
                 GL_TEXTURE_CUBE_MAP_NEGATIVE_X, and so on)
level            specifies which mip level to load
internalFormat   the internal format of the image; can be GL_ALPHA,
                 GL_LUMINANCE, GL_LUMINANCE_ALPHA, GL_RGB, GL_RGBA,
                 GL_R8, GL_RG8, GL_RGB565, GL_RGB8, GL_RGBA4,
                 GL_RGB5_A1, GL_RGBA8, GL_RGB10_A2, GL_SRGB8,
                 GL_SRGB8_ALPHA8, GL_R8I, GL_R8UI, GL_R16I, GL_R16UI,
                 GL_R32I, GL_R32UI, GL_RG8I, GL_RG8UI, GL_RG16I,
                 GL_RG16UI, GL_RG32I, GL_RG32UI, GL_RGBA8I, GL_RGBA8UI,
                 GL_RGB10_A2UI, GL_RGBA16I, GL_RGBA16UI, GL_RGBA32I, or
                 GL_RGBA32UI
x                the x window coordinate of the lower-left corner of
                 the rectangle in the framebuffer to read from
y                the y window coordinate of the lower-left corner of
                 the rectangle in the framebuffer to read from
width            the width in pixels of the region to read
height           the height in pixels of the region to read
border           borders are not supported in OpenGL ES 3.0, so this
                 parameter must be 0

Calling glCopyTexImage2D loads the texture image with the pixels in the color buffer from region (x, y) to (x + width - 1, y + height - 1). The width and height of the texture image will be the size of the region copied from the color buffer. Use this function when you want to fill the entire contents of the texture.

In addition, you can update just the subregion of an already-specified image using glCopyTexSubImage2D.

void glCopyTexSubImage2D(GLenum target, GLint level,
                         GLint xoffset, GLint yoffset,
                         GLint x, GLint y,
                         GLsizei width, GLsizei height)

target    specifies the texture target, either GL_TEXTURE_2D or one of
          the cubemap face targets (GL_TEXTURE_CUBE_MAP_POSITIVE_X,
          GL_TEXTURE_CUBE_MAP_NEGATIVE_X, and so on)
level     specifies which mip level to update
xoffset   the x index of the texel to start updating from
yoffset   the y index of the texel to start updating from
x         the x window coordinate of the lower-left corner of the
          rectangle in the framebuffer to read from
y         the y window coordinate of the lower-left corner of the
          rectangle in the framebuffer to read from
width     the width in pixels of the region to read
height    the height in pixels of the region to read

glCopyTexSubImage2D updates the subregion of the image from (xoffset, yoffset) to (xoffset + width - 1, yoffset + height - 1) with the pixels in the color buffer from (x, y) to (x + width - 1, y + height - 1).
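To make the coordinate mapping concrete, here is a small CPU-side model of what glCopyTexSubImage2D does to texel addresses. It is only a sketch for reasoning about the offsets: the function name copy_sub_image_model and the flat uint32_t arrays standing in for the color buffer and the texture level are our own assumptions, and a real implementation performs this copy on the GPU with format conversion.

```c
#include <assert.h>
#include <stdint.h>

/* CPU model of the region mapping only (not how a driver works).
 * `fb` is a row-major color buffer of width fbWidth, with row 0 at the
 * bottom as in window coordinates; `tex` is a row-major texture level
 * of width texWidth. Each element is one packed pixel. */
void copy_sub_image_model(uint32_t *tex, int texWidth,
                          const uint32_t *fb, int fbWidth,
                          int xoffset, int yoffset,
                          int x, int y,
                          int width, int height)
{
    assert(tex != 0 && fb != 0 && width >= 0 && height >= 0);

    for (int row = 0; row < height; ++row) {
        for (int col = 0; col < width; ++col) {
            /* Texel (xoffset + col, yoffset + row) receives
             * framebuffer pixel (x + col, y + row). */
            tex[(yoffset + row) * texWidth + (xoffset + col)] =
                fb[(y + row) * fbWidth + (x + col)];
        }
    }
}
```

Note that (x, y) and (xoffset, yoffset) are independent: the first pair selects where to read in the framebuffer, and the second selects where to write in the texture.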

Finally, you can also copy the contents of the color buffer into a slice (or subregion of a slice) of a previously specified 3D texture or 2D texture array using glCopyTexSubImage3D.

void glCopyTexSubImage3D(GLenum target, GLint level,
                         GLint xoffset, GLint yoffset, GLint zoffset,
                         GLint x, GLint y,
                         GLsizei width, GLsizei height)

target    specifies the texture target, either GL_TEXTURE_3D or
          GL_TEXTURE_2D_ARRAY
level     specifies which mip level to update
xoffset   the x index of the texel to start updating from
yoffset   the y index of the texel to start updating from
zoffset   the z index of the texel to start updating from
x         the x window coordinate of the lower-left corner of the
          rectangle in the framebuffer to read from
y         the y window coordinate of the lower-left corner of the
          rectangle in the framebuffer to read from
width     the width in pixels of the region to read
height    the height in pixels of the region to read

One thing to keep in mind with glCopyTexImage2D, glCopyTexSubImage2D, and glCopyTexSubImage3D is that the texture image format cannot have more components than the color buffer. In other words, when copying data out of the color buffer, it is possible to convert to a format with fewer components, but not with more. Table 9-13 shows the valid format conversions when doing a texture copy. For example, you can copy an RGBA color buffer into a texture of any of the possible formats, but you cannot copy an RGB color buffer into an RGBA texture because no alpha component exists in the color buffer.
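The conversion rules in Table 9-13 are easy to encode as a lookup table if you want to validate a copy before issuing it. The sketch below transcribes the table directly; the enum and function names (SrcFormat, DstFormat, copy_conversion_allowed) are hypothetical and exist only for this illustration. OpenGL ES itself reports an invalid combination through its normal error mechanism.

```c
#include <assert.h>
#include <stdbool.h>

/* Color buffer (source) base formats, in the row order of Table 9-13. */
typedef enum { SRC_R, SRC_RG, SRC_RGB, SRC_RGBA, SRC_COUNT } SrcFormat;

/* Texture (destination) base formats, in the column order of Table 9-13. */
typedef enum {
    DST_A, DST_L, DST_LA, DST_R, DST_RG, DST_RGB, DST_RGBA, DST_COUNT
} DstFormat;

/* Transcription of Table 9-13: true means the copy is valid. */
static const bool kCopyAllowed[SRC_COUNT][DST_COUNT] = {
    /*            A      L      LA     R      RG     RGB    RGBA  */
    /* R    */ { false, true,  false, true,  false, false, false },
    /* RG   */ { false, true,  false, true,  true,  false, false },
    /* RGB  */ { false, true,  false, true,  true,  true,  false },
    /* RGBA */ { true,  true,  true,  true,  true,  true,  true  },
};

/* Hypothetical helper: can a color buffer with base format `src` be
 * copied into a texture with base format `dst`? */
bool copy_conversion_allowed(SrcFormat src, DstFormat dst)
{
    assert((int)src >= 0 && src < SRC_COUNT);
    assert((int)dst >= 0 && dst < DST_COUNT);
    return kCopyAllowed[src][dst];
}
```

As the table shows, only an RGBA color buffer can feed the alpha-bearing formats (A, LA, and RGBA), since no other source format carries an alpha component.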


Table 9-13 Valid Format Conversions for glCopyTex*Image*

Color Format   (To) Texture Format
(From)         A    L    LA   R    RG   RGB  RGBA
R              N    Y    N    Y    N    N    N
RG             N    Y    N    Y    Y    N    N
RGB            N    Y    N    Y    Y    Y    N
RGBA           Y    Y    Y    Y    Y    Y    Y

Sampler Objects

Previously in the chapter, we covered how to set texture parameters such as filter modes, texture coordinate wrap modes, and LOD settings using glTexParameter[i|f][v]. The issue with using glTexParameter[i|f][v] is that it can result in a significant amount of unnecessary API overhead. Very often, an application will use the same texture settings for a large number of textures. In such a case, having to set the sampler state with glTexParameter[i|f][v] for every texture object can result in a lot of extra overhead. To mitigate this problem, OpenGL ES 3.0 introduces sampler objects that separate sampler state from texture state. In short, the sampling settings that can be set with glTexParameter[i|f][v] can be set for a sampler object and can be bound for use with a texture unit in a single function call. Sampler objects can be used across many textures and, therefore, reduce API overhead.

The function used to generate sampler objects is glGenSamplers.

void glGenSamplers(GLsizei n, GLuint *samplers)

n          specifies the number of sampler objects to generate
samplers   an array of unsigned integers that will hold n sampler
           object IDs

Sampler objects also need to be deleted when an application no longer needs them. This can be done using glDeleteSamplers.

void glDeleteSamplers(GLsizei n, const GLuint *samplers)

n          specifies the number of sampler objects to delete
samplers   an array of unsigned integers that hold n sampler object
           IDs to delete

Once sampler object IDs have been generated with glGenSamplers, the application must bind the sampler object to use its state. Sampler objects are bound to texture units. Binding the sampler object to the texture unit supersedes any of the state set in the texture object using glTexParameter[i|f][v]. The function used to bind a sampler object is glBindSampler.

void glBindSampler(GLuint unit, GLuint sampler)

unit      specifies the texture unit to bind the sampler object to
          (a zero-based unit index, not a GL_TEXTUREi enum)
sampler   the handle to the sampler object to bind

If the sampler passed to glBindSampler is 0 (the default sampler), then the state set for the texture object will be used. The sampler object state can be set using glSamplerParameter[f|i][v]. The parameters that can be set by glSamplerParameter[f|i][v] are the sampling parameters (filter, wrap, LOD range, and comparison settings) that can otherwise be set by using glTexParameter[i|f][v]. The only difference is that the state is set to the sampler object rather than the texture object.

void glSamplerParameteri(GLuint sampler, GLenum pname,
                         GLint param)
void glSamplerParameteriv(GLuint sampler, GLenum pname,
                          const GLint *params)
void glSamplerParameterf(GLuint sampler, GLenum pname,
                         GLfloat param)
void glSamplerParameterfv(GLuint sampler, GLenum pname,
                          const GLfloat *params)

sampler   the sampler object to set
pname     the parameter to set; one of
          GL_TEXTURE_COMPARE_FUNC
          GL_TEXTURE_COMPARE_MODE
          GL_TEXTURE_MIN_FILTER
          GL_TEXTURE_MAG_FILTER
          GL_TEXTURE_MIN_LOD
          GL_TEXTURE_MAX_LOD
          GL_TEXTURE_WRAP_S
          GL_TEXTURE_WRAP_T
          GL_TEXTURE_WRAP_R
params    the value (or array of values for the “v” entrypoints) to
          set the sampler parameter to

          If pname is GL_TEXTURE_MAG_FILTER, then param can be
          GL_NEAREST or GL_LINEAR

          If pname is GL_TEXTURE_MIN_FILTER, then param can be
          GL_NEAREST, GL_LINEAR, GL_NEAREST_MIPMAP_NEAREST,
          GL_NEAREST_MIPMAP_LINEAR, GL_LINEAR_MIPMAP_NEAREST, or
          GL_LINEAR_MIPMAP_LINEAR

          If pname is GL_TEXTURE_WRAP_S, GL_TEXTURE_WRAP_R, or
          GL_TEXTURE_WRAP_T, then param can be GL_REPEAT,
          GL_CLAMP_TO_EDGE, or GL_MIRRORED_REPEAT

          If pname is GL_TEXTURE_COMPARE_FUNC, then param can be
          GL_LEQUAL, GL_GEQUAL, GL_LESS, GL_GREATER, GL_EQUAL,
          GL_NOTEQUAL, GL_ALWAYS, or GL_NEVER

          If pname is GL_TEXTURE_COMPARE_MODE, then param can be
          GL_COMPARE_REF_TO_TEXTURE or GL_NONE

Note that GL_TEXTURE_BASE_LEVEL, GL_TEXTURE_MAX_LEVEL, and the GL_TEXTURE_SWIZZLE_R/G/B/A parameters describe the texture itself rather than how it is sampled. They are not part of sampler object state and continue to be set on the texture object with glTexParameter[i|f][v].
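The precedence rule described for glBindSampler (a non-zero sampler bound to a unit supersedes the texture object's own sampling state, and sampler 0 falls back to it) can be pictured with a toy model. The C sketch below does not call OpenGL ES at all; the struct names, the enum values standing in for GL filter and wrap modes, and effective_sampler_state are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for GL filter and wrap enums (not the real GL values). */
enum { FILTER_NEAREST, FILTER_LINEAR };
enum { WRAP_REPEAT, WRAP_CLAMP_TO_EDGE };

/* The subset of sampling state modeled here. */
typedef struct {
    int minFilter;
    int magFilter;
    int wrapS;
    int wrapT;
} SamplerState;

/* A texture object carries its own copy of the sampling state
 * (what glTexParameter would set). */
typedef struct {
    SamplerState params;
} TextureObject;

/* A texture unit has a bound texture and, optionally, a bound sampler.
 * A NULL sampler models glBindSampler(unit, 0). */
typedef struct {
    const TextureObject *texture;
    const SamplerState  *sampler;
} TextureUnit;

/* Returns the state that would be used when sampling through `unit`:
 * the sampler object's state if one is bound, otherwise the state
 * stored in the texture object. */
const SamplerState *effective_sampler_state(const TextureUnit *unit)
{
    assert(unit != NULL && unit->texture != NULL);
    return unit->sampler != NULL ? unit->sampler
                                 : &unit->texture->params;
}
```

In this model, one SamplerState can be shared by any number of texture units, which mirrors how a single sampler object replaces per-texture glTexParameter calls. Binding the sampler does not modify the texture object's own state; it only overrides it while bound.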


Immutable Textures

Another feature introduced in OpenGL ES 3.0 to help improve application performance is immutable textures. As discussed earlier in this chapter, an application specifies each mipmap level of a texture independently using functions such as glTexImage2D and glTexImage3D. The problem this creates for the OpenGL ES driver is that it cannot determine until draw time whether a texture has been fully specified. That is, it has to check whether each mipmap level or subimage has matching formats, whether each level has the correct dimensions, and whether there is sufficient memory. This draw-time check can be costly and can be avoided by using immutable textures.

The idea behind immutable textures is simple: The application specifies the format and size of a texture before loading it with data. In doing so, the texture format becomes immutable and the OpenGL ES driver can perform all consistency and memory checks up-front. Once a texture has become immutable, its format and dimensions cannot change. However, the application can still load it with image data by using glTexSubImage2D, glTexSubImage3D, or glGenerateMipmap, or by rendering to the texture.

To create an immutable texture, an application would bind the texture using glBindTexture and then allocate its immutable storage using glTexStorage2D or glTexStorage3D.
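Both storage functions take a count of mipmap levels, so the application has to decide up-front how many levels to allocate. For a 2D texture, a complete chain down to 1 × 1 has floor(log2(max(width, height))) + 1 levels, and glTexStorage2D rejects a larger count. The following C sketch computes that count and the size of each level; the helper names full_mip_chain_levels and mip_level_size are ours, not part of the API.

```c
#include <assert.h>

/* Hypothetical helper: number of levels in a complete mipmap chain
 * for a 2D base image, i.e., floor(log2(max(width, height))) + 1. */
int full_mip_chain_levels(int width, int height)
{
    assert(width > 0 && height > 0);

    int largest = width > height ? width : height;
    int levels = 1;                 /* level 0 always exists */

    while (largest > 1) {
        largest /= 2;               /* each level halves, rounding down */
        ++levels;
    }
    return levels;
}

/* Hypothetical helper: size of one dimension at a given mip level.
 * Each level is half the previous one (rounded down), never below 1. */
int mip_level_size(int baseSize, int level)
{
    assert(baseSize > 0 && level >= 0);

    int size = baseSize;
    for (int i = 0; i < level && size > 1; ++i) {
        size /= 2;
    }
    return size;
}
```

For a 320 × 240 base image this gives 9 levels, and level 3 is 40 × 30. With the storage allocated for all levels, each one can then be filled with glTexSubImage2D or generated with glGenerateMipmap.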

void glTexStorage2D(GLenum target, GLsizei levels,
                    GLenum internalFormat,
                    GLsizei width, GLsizei height)
void glTexStorage3D(GLenum target, GLsizei levels,
                    GLenum internalFormat,
                    GLsizei width, GLsizei height, GLsizei depth)

target           specifies the texture target, either GL_TEXTURE_2D or
                 GL_TEXTURE_CUBE_MAP for glTexStorage2D, or
                 GL_TEXTURE_3D or GL_TEXTURE_2D_ARRAY for
                 glTexStorage3D
levels           specifies the number of mipmap levels
internalFormat   the sized internal format for the texture storage;
                 the full list of valid internalFormat values is the
                 same as the valid sized internalFormat values for
                 glTexImage2D provided in the Texture Objects and
                 Loading Textures section earlier in this chapter
width            the width of the base image in pixels
height           the height of the base image in pixels
depth            (glTexStorage3D only) the depth of the base image in
                 pixels

Once the immutable texture is created, it is invalid to call glTexImage*, glCompressedTexImage*, glCopyTexImage*, or glTexStorage* on the texture object. Doing so will result in a GL_INVALID_OPERATION error being generated. To fill the immutable texture with image data, the application needs to use glTexSubImage2D, glTexSubImage3D, or glGenerateMipmap, or else render to the image as a texture (by using it as an attachment to a framebuffer object).

Internally, when glTexStorage* is used, OpenGL ES marks the texture object as being immutable by setting GL_TEXTURE_IMMUTABLE_FORMAT to GL_TRUE and GL_TEXTURE_IMMUTABLE_LEVELS to the number of levels passed to glTexStorage*. The application can query for these values by using glGetTexParameter[i|f][v], although it cannot set them directly. The glTexStorage* function must be used to set up the immutable texture parameters.

Pixel Unpack Buffer Objects

In Chapter 6, “Vertex Attributes, Vertex Arrays, and Buffer Objects,” we introduced buffer objects, concentrating the discussion on vertex buffer objects (VBOs) and copy buffer objects. As you will recall, buffer objects allow the storage of data in server-side (or GPU) memory as opposed to client-side (or host) memory. The advantage of using buffer objects is that they reduce the transfer of data from CPU to GPU and, therefore, can improve performance (as well as reduce memory utilization). OpenGL ES 3.0 also introduces pixel unpack buffer objects that are bound and specified with the GL_PIXEL_UNPACK_BUFFER target. The functions that operate on pixel unpack buffer objects are described in Chapter 6. Pixel unpack buffer objects allow the specification of texture data that resides in server-side memory. As a consequence, the data for the pixel unpack operations glTexImage*, glTexSubImage*, glCompressedTexImage*, and glCompressedTexSubImage* can come directly from a buffer object. Much like VBOs with glVertexAttribPointer, if a pixel unpack buffer object is bound during one of those calls, the data pointer is an offset into the pixel unpack buffer rather than a pointer to client memory.
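When a buffer is bound to GL_PIXEL_UNPACK_BUFFER, the pointer argument of the texture upload functions is interpreted as a byte offset into that buffer, so the application has to pass an integer offset disguised as a pointer. The C sketch below shows one common way to do that conversion, along with a helper for locating a tile of texel data inside a larger buffer. Both function names and the idea of packing equal-sized tiles back to back are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: converts a byte offset into the "pointer" value
 * expected by glTexImage* / glTexSubImage* while a pixel unpack buffer
 * is bound. The result must never be dereferenced on the CPU. */
const void *unpack_buffer_offset(size_t byteOffset)
{
    return (const void *)(uintptr_t)byteOffset;
}

/* Hypothetical helper: byte offset of tile number `tileIndex`, assuming
 * tightly packed tiles of tileWidth x tileHeight texels stored back to
 * back in the buffer. */
size_t tile_byte_offset(size_t tileIndex, size_t tileWidth,
                        size_t tileHeight, size_t bytesPerPixel)
{
    assert(bytesPerPixel > 0);
    return tileIndex * tileWidth * tileHeight * bytesPerPixel;
}

/* Usage sketch (requires a current context and a buffer bound to
 * GL_PIXEL_UNPACK_BUFFER, so it is shown as a comment only):
 *
 *   size_t off = tile_byte_offset(2, 64, 64, 4);
 *   glTexSubImage2D(GL_TEXTURE_2D, 0, 0, 0, 64, 64,
 *                   GL_RGBA, GL_UNSIGNED_BYTE,
 *                   unpack_buffer_offset(off));
 */
```

Remember to bind 0 to GL_PIXEL_UNPACK_BUFFER before uploading from client memory again. Otherwise, a real pointer would be misread as a (very large) buffer offset.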


Pixel unpack buffer objects can be used to stream texture data to the GPU.

The application could allocate a pixel unpack buffer and then map regions

of the buffer for updates. When the calls to load the data to OpenGL are

made (e.g., glTexSubImage*), these functions can return immediately

because the data already resides in the GPU (or can be copied at a later

time, but an immediate copy does not need to be made as it does with

client-side data). We recommend using pixel unpack buffer objects in

situations where the performance/memory utilization of texture upload

operations is important for the application.

Summary

This chapter covered how to use textures in OpenGL ES 3.0. We introduced the various types of textures: 2D, 3D, cubemaps, and 2D texture arrays. For each texture type, we showed how the texture can be loaded with data either in full, in subimages, or by copying data from the framebuffer. We detailed the wide range of texture formats available in OpenGL ES 3.0, which include normalized texture formats, floating-point textures, integer textures, shared exponent textures, sRGB textures, and depth textures. We covered all of the texture parameters that can be set for texture objects, including filter modes, wrap modes, depth texture comparison, and level-of-detail settings. We explored how to set texture parameters using the more efficient sampler objects. Finally, we showed how to create immutable textures that can help reduce the draw-time overhead of using textures. We also saw how textures can be read in the fragment shader with several example programs. With all this information under your belt, you are well on your way toward using OpenGL ES 3.0 for many advanced rendering effects. Next, we cover more details of the fragment shader that will help you further understand how textures can be used to achieve a wide range of rendering techniques.


Chapter 10

Fragment Shaders

Chapter 9, “Texturing,” introduced you to the basics of creating and applying textures in the fragment shader. In this chapter, we provide more details on the fragment shader and describe some of its uses. In particular, we focus on how to implement fixed-function techniques using the fragment shader. The topics we cover in this chapter include the following:

Fixed-function fragment shaders
Programmable fragment shader overview
Multitexturing
Fog
Alpha test
User clip planes

Of the pipeline stages shown in Figure 10-1, we have previously covered the vertex shader, primitive assembly, and rasterization stages. We have also talked about using textures in the fragment shader. Now, we focus on the fragment shader portion of the pipeline and fill in the remaining details on writing fragment shaders.


Fixed-Function Fragment Shaders

Readers who are new to the programmable fragment pipeline but have worked with OpenGL ES 1.x (or earlier versions of desktop OpenGL) are probably familiar with the fixed-function fragment pipeline. Before diving into details of the fragment shader, we think it is worthwhile to briefly review the old fixed-function fragment pipeline. This will give you an understanding of how the old fixed-function pipeline maps into fragment shaders. It’s a good way to start before moving into more advanced fragment programming techniques.

In OpenGL ES 1.1 (and fixed-function desktop OpenGL), you had a limited set of equations that could be used to determine how to combine the various inputs to the fragment shader. In the fixed-function pipeline, you essentially had three inputs you could use: the interpolated vertex color, the texture color, and the constant color. The vertex color would typically hold either a precomputed color or the result of the vertex lighting computation. The texture color came from fetching from whichever texture was bound using the primitive’s texture coordinates, and the constant color could be set for each texture unit.

[Figure 10-1 OpenGL ES 3.0 Programmable Pipeline. Stages shown: API, Vertex Buffer/Array Objects, Vertex Shader, Transform Feedback, Primitive Assembly, Rasterization, Textures, Fragment Shader, Per-Fragment Operations, Framebuffer]


The set of equations you could use to combine these inputs together was

quite limited. For example, in OpenGL ES 1.1, the equations listed in

Table 10-1 were available. The inputs A, B, and C to these equations could

come from the vertex color, texture color, or constant color.

Table 10-1 OpenGL ES 1.1 RGB Combine Functions

RGB Combine Function       Equation
REPLACE                    A
MODULATE                   A × B
ADD                        A + B
ADD_SIGNED                 A + B - 0.5
INTERPOLATE                A × C + B × (1 - C)
SUBTRACT                   A - B
DOT3_RGB (and DOT3_RGBA)   4 × ((A.r - 0.5) × (B.r - 0.5) +
                           (A.g - 0.5) × (B.g - 0.5) +
                           (A.b - 0.5) × (B.b - 0.5))

A great number of interesting effects could actually be achieved even with this limited set of equations. However, this was far from programmable, as the fragment pipeline could be configured only in a very fixed set of ways.
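To see how small this fixed set of operations really is, the combine functions of Table 10-1 can be written out as a few lines of C. This is a CPU-side sketch for illustration only: the type and function names are ours, the DOT3 term uses the standard (B.b - 0.5) form, and we assume (as fixed-function hardware did) that results are clamped to the [0, 1] range.

```c
#include <assert.h>

typedef struct { float r, g, b; } Rgb;

typedef enum {
    COMBINE_REPLACE,
    COMBINE_MODULATE,
    COMBINE_ADD,
    COMBINE_ADD_SIGNED,
    COMBINE_INTERPOLATE,
    COMBINE_SUBTRACT
} CombineFunc;

/* Clamp a color component to the [0, 1] range. */
static float clamp01(float v)
{
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* Apply one per-component combine equation from Table 10-1. */
static float combine_channel(CombineFunc func, float a, float b, float c)
{
    switch (func) {
    case COMBINE_REPLACE:     return a;
    case COMBINE_MODULATE:    return a * b;
    case COMBINE_ADD:         return a + b;
    case COMBINE_ADD_SIGNED:  return a + b - 0.5f;
    case COMBINE_INTERPOLATE: return a * c + b * (1.0f - c);
    case COMBINE_SUBTRACT:    return a - b;
    }
    assert(!"unknown combine function");
    return 0.0f;
}

/* Combine inputs A, B, and C (vertex, texture, or constant color)
 * component by component, clamping the result. */
Rgb combine_rgb(CombineFunc func, Rgb a, Rgb b, Rgb c)
{
    Rgb out;
    out.r = clamp01(combine_channel(func, a.r, b.r, c.r));
    out.g = clamp01(combine_channel(func, a.g, b.g, c.g));
    out.b = clamp01(combine_channel(func, a.b, b.b, c.b));
    return out;
}

/* DOT3_RGB: a scaled, biased dot product, replicated to every channel
 * by the fixed-function pipeline. Returns the single scalar value. */
float combine_dot3(Rgb a, Rgb b)
{
    float d = (a.r - 0.5f) * (b.r - 0.5f) +
              (a.g - 0.5f) * (b.g - 0.5f) +
              (a.b - 0.5f) * (b.b - 0.5f);
    return clamp01(4.0f * d);
}
```

Every one of these equations is a one-line expression in a fragment shader, which is why the fixed-function combiners map so directly onto shader code.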

So why are we reviewing this history here? It helps give an understanding of how traditional fixed-function techniques can be achieved with shaders. For example, suppose we had configured the fixed-function pipeline with a single base texture map that we wanted to modulate (multiply) by the vertex color. In fixed-function OpenGL ES (or OpenGL), we would enable a single texture unit, choose a combine equation of MODULATE, and set up the inputs to the equation to come from the vertex color and texture color. The code to do this in OpenGL ES 1.1 is provided here for reference:

glTexEnvi(GL_TEXTURE_ENV, GL_TEXTURE_ENV_MODE, GL_COMBINE);
glTexEnvi(GL_TEXTURE_ENV, GL_COMBINE_RGB, GL_MODULATE);
glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE0_RGB, GL_PRIMARY_COLOR);
glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE1_RGB, GL_TEXTURE);
glTexEnvi(GL_TEXTURE_ENV, GL_COMBINE_ALPHA, GL_MODULATE);
glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE0_ALPHA, GL_PRIMARY_COLOR);
glTexEnvi(GL_TEXTURE_ENV, GL_SOURCE1_ALPHA, GL_TEXTURE);


This code configures the fixed-function pipeline to perform a modulate (A × B) between the primary color (the vertex color) and the texture color. If this code doesn’t make sense to you, don’t worry, as none of it exists in OpenGL ES 3.0. Rather, we are simply trying to show how this would map to a fragment shader. In a fragment shader, this same modulate computation could be accomplished as follows:

#version 300 es
precision mediump float;
uniform sampler2D s_tex0;
in vec2 v_texCoord;
in vec4 v_primaryColor;
layout(location = 0) out vec4 outColor;
void main()
{
    outColor = texture(s_tex0, v_texCoord) * v_primaryColor;
}

The fragment shader performs the exact same operations that would be performed by the fixed-function setup. The texture value is fetched from a sampler (that is bound to texture unit 0) and a 2D texture coordinate is used to look up that value. Then, the result of that texture fetch is multiplied by v_primaryColor, an input value that is passed in from the vertex shader. In this case, the vertex shader would have passed the color to the fragment shader.

It is possible to write a fragment shader that would perform the equivalent computation as any possible fixed-function texture combine setup. It is also possible, of course, to write shaders with much more complex and varied computations than just fixed functions would allow. However, the point of this section was just to drive home how we have transitioned from fixed-function to programmable shaders. Now, we begin to look at some specifics of fragment shaders.

Fragment Shader Overview

The fragment shader provides a general-purpose programmable method for operating on fragments. The inputs to the fragment shader consist of the following:

Inputs (or varyings): Interpolated data produced by the vertex shader. The outputs of the vertex shader are interpolated across the primitive and passed to the fragment shader as inputs.

Uniforms: State used by the fragment shader. These are constant values that do not vary per fragment.

Samplers: Used to access texture images in the shader.

Code: Fragment shader source or binary that describes the operations that will be performed on the fragment.

The output of the fragment shader is one or more fragment colors that get passed on to the per-fragment operations portion of the pipeline (the number of output colors depends on how many color attachments are being used). The inputs and outputs to the fragment shader are illustrated in Figure 10-2.

[Figure 10-2 OpenGL ES 3.0 Fragment Shader. Inputs: Input (Varying) 0 through N, gl_FragCoord, gl_FrontFacing, gl_PointCoord, uniforms, and samplers. Outputs: Output color 0 through N and gl_FragDepth]

Built-In Special Variables

OpenGL ES 3.0 has built-in special variables that are output by the fragment shader or are input to the fragment shader. The following built-in special variables are available to the fragment shader:

gl_FragCoord: A read-only variable that is available in the fragment shader. This variable holds the window relative coordinates (x, y, z, 1/w) of the fragment. There are a number of algorithms where it can be useful to know the window coordinates of the current fragment. For example, you can use the window coordinates as offsets into a texture fetch into a random noise map whose value is used to rotate a filter kernel on a shadow map. This technique is used to reduce shadow map aliasing.

gl_FrontFacing: A read-only variable that is available in the fragment shader. This boolean variable has a value of true if the fragment is part of a front-facing primitive and false otherwise.

gl_PointCoord: A read-only variable that can be used when rendering point sprites. It holds the texture coordinate for the point sprite that is automatically generated in the [0, 1] range during point rasterization. In Chapter 14, “Advanced Programming with OpenGL ES 3.0,” there is an example of rendering point sprites that uses this variable.

gl_FragDepth: A write-only output variable that, when written to in the fragment shader, overrides the fragment’s fixed-function depth value. This functionality should be used sparingly (and only when necessary) because it can disable depth optimization in many GPUs. For example, many GPUs have a feature called Early-Z where the depth test is performed ahead of executing the fragment shader. The benefit of using Early-Z is that fragments that fail the depth test are never shaded (thus saving performance). However, when gl_FragDepth is used, this feature must be disabled because the GPU does not know the depth value ahead of executing the fragment shader.

Built-In Constants

The following built-in constants are also relevant to the fragment shader:

const mediump int gl_MaxFragmentInputVectors = 15;

const mediump int gl_MaxTextureImageUnits = 16;

const mediump int gl_MaxFragmentUniformVectors = 224;

const mediump int gl_MaxDrawBuffers = 4;

const mediump int gl_MinProgramTexelOffset = -8;

const mediump int gl_MaxProgramTexelOffset = 7;

The built-in constants describe the following limits:

gl_MaxFragmentInputVectors: The maximum number of fragment shader inputs (or varyings). The minimum value supported by all ES 3.0 implementations is 15.

gl_MaxTextureImageUnits: The maximum number of texture image units that are available. The minimum value supported by all ES 3.0 implementations is 16.

gl_MaxFragmentUniformVectors: The maximum number of vec4 uniform entries that can be used inside a fragment shader. The minimum value supported by all ES 3.0 implementations is 224. The number of vec4 uniform entries that can actually be used by a developer can vary from implementation to implementation and from one fragment shader to another. This issue is described in Chapter 8, “Vertex Shaders,” and the same issue applies to fragment shaders.

gl_MaxDrawBuffers: The maximum number of multiple render targets (MRTs) supported. The minimum value supported by all ES 3.0 implementations is 4.

gl_MinProgramTexelOffset/gl_MaxProgramTexelOffset: The minimum and maximum offsets supported by the offset parameter to the texture*Offset() built-in ESSL functions.

The values specied for each built-in constant are the minimum values

that must be supported by all OpenGL ES 3.0 implementations. It is

possible that implementations may support values greater than the

minimum values described. The actual hardware-dependent values

for fragment shader built-in values can also be queried from API

code. The following code shows how you would query the values of

gl_MaxTextureImageUnits and gl_MaxFragmentUniformVectors:

GLint maxTextureImageUnits, maxFragmentUniformVectors;

glGetIntegerv(GL_MAX_TEXTURE_IMAGE_UNITS,

&maxTextureImageUnits);

glGetIntegerv(GL_MAX_FRAGMENT_UNIFORM_VECTORS,

&maxFragmentUniformVectors);

Precision Qualifiers

Precision qualifiers were briefly introduced in Chapter 5, “OpenGL ES
Shading Language” and were covered in detail in Chapter 8, “Vertex
Shaders.” Please review those sections for full details on precision
qualifiers. We remind you here that there is no default precision for
fragment shaders. As a consequence, every fragment shader must
declare a default precision (or provide precision qualifiers for all variable
declarations).


Implementing Fixed-Function Techniques Using Shaders

Now that we have given an overview of fragment shaders, we will
demonstrate how to implement several fixed-function techniques using
shaders. The fixed-function pipeline in OpenGL ES 1.x and desktop
OpenGL provided APIs to perform multitexturing, fog, alpha test, and
user clip planes. Although none of these techniques is provided explicitly
in OpenGL ES 3.0, all of them can still be implemented using shaders.
This section reviews each of these fixed-function processes and provides
example fragment shaders that demonstrate each technique.

Multitexturing

We start with multitexturing, which is a very common operation in

fragment shaders used for combining multiple texture maps. For example,

a technique that has been used in many games, such as Quake III, is to store

precomputed lighting from radiosity calculations in a texture map. That

map is then combined with the base texture map in the fragment shader to

represent static lighting. Many other examples of using multiple textures

exist, some of which we cover in Chapter 14, “Advanced Programming

with OpenGL ES 3.0.” For example, often a texture map is used to store

a specular exponent and mask to attenuate and mask specular lighting

contributions. Many games also use normal maps, which are textures that

store normal information at a higher level of detail than per-vertex normals

so that lighting can be computed in the fragment shader.

The point of mentioning this information here is to highlight that you have

now learned about all of the parts of the API that are needed to accomplish

multitexturing techniques. In Chapter 9, “Texturing,” you learned how to

load textures on various texture units and fetch from them in the fragment

shader. Combining the textures in various ways in the fragment shader is

simply a matter of employing the many operators and built-in functions

that exist in the shading language. Using these techniques, you can easily

achieve all of the effects that were made possible with the fixed-function

fragment pipeline in previous versions of OpenGL ES.

An example of using multiple textures is provided in the Chapter_10/

MultiTexture example, which renders the image in Figure 10-3.

This example loads a base texture map and light map texture and

combines them in the fragment shader on a single quad. The fragment

shader for the sample program is provided in Example 10-1.


Example 10-1 Multitexture Fragment Shader

#version 300 es

precision mediump float;

in vec2 v_texCoord;

layout(location = 0) out vec4 outColor;

uniform sampler2D s_baseMap;

uniform sampler2D s_lightMap;

void main()

{

vec4 baseColor;

vec4 lightColor;

baseColor = texture( s_baseMap, v_texCoord );

lightColor = texture( s_lightMap, v_texCoord );

// Add a 0.25 ambient light to the texture light color

outColor = baseColor * (lightColor + 0.25);

}

The fragment shader has two samplers, one for each of the textures. The

relevant code for setting up the texture units and samplers follows.

// Bind the base map
glActiveTexture(GL_TEXTURE0);
glBindTexture(GL_TEXTURE_2D, userData->baseMapTexId);

// Set the base map sampler to texture unit 0
glUniform1i(userData->baseMapLoc, 0);

// Bind the light map
glActiveTexture(GL_TEXTURE1);
glBindTexture(GL_TEXTURE_2D, userData->lightMapTexId);

// Set the light map sampler to texture unit 1
glUniform1i(userData->lightMapLoc, 1);

Figure 10-3 Multitextured Quad

As you can see, this code binds each of the individual texture objects
to texture units 0 and 1. Each sampler uniform is then set to the index of
the texture unit it should read from. In this example, a single texture
coordinate is used to fetch from both of the maps. In typical light
mapping, there would be a separate set of texture coordinates for the base
map and light map. The light maps are typically paged into a single large
texture and the texture coordinates can be generated using offline tools.
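The combine step in Example 10-1 can also be checked on the CPU. The following C sketch is not part of the sample program; the names clamp01 and combine_light_map are hypothetical, and the final clamp assumes that the result is written to a normalized fixed-point color buffer.

```c
/* Clamp a value to [0, 1], as happens when a color is stored in a
 * normalized fixed-point color buffer (an assumption of this sketch). */
float clamp01(float v)
{
    if (v < 0.0f) return 0.0f;
    if (v > 1.0f) return 1.0f;
    return v;
}

/* CPU reference for one color channel of Example 10-1:
 * outColor = baseColor * (lightColor + 0.25) */
float combine_light_map(float base, float light)
{
    const float ambient = 0.25f;  /* same ambient term as the shader */
    return clamp01(base * (light + ambient));
}
```

A reference function like this is mainly useful for sanity-checking expected pixel values when debugging a light-mapped scene.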

Fog

A common technique that is used in rendering 3D scenes is the

application of fog. In OpenGL ES 1.1, fog was provided as a fixed-function

operation. One of the reasons fog is such a prevalent technique is that it

can be used to reduce draw distances and remove “popping” of geometry

as it comes in closer to the viewer.

There are a number of possible ways to compute fog, and with programmable

fragment shaders you are not limited to any particular equation. Here we

show how you would go about computing linear fog with a fragment shader.

To compute any type of fog, we will need two inputs: the distance of the

pixel to the eye and the color of the fog. To compute linear fog, we also need

the minimum and maximum distance range that the fog should cover.

The equation for the linear fog factor,

$$F = \frac{\mathit{MaxDist} - \mathit{EyeDist}}{\mathit{MaxDist} - \mathit{MinDist}}$$

yields a factor that is clamped to the [0.0, 1.0] range and then used to
linearly interpolate between the fog color and the overall color of a
fragment to compute the final color. A factor of 1.0 leaves the fragment
color unchanged, and a factor of 0.0 replaces it with the fog color. The
distance to the eye is best computed in the vertex shader and interpolated
across the primitive using a varying variable.

A PVRShaman (.POD) workspace is provided as an example in the
Chapter_10/PVR_LinearFog folder that demonstrates the fog
computation. Figure 10-4 is a screenshot of the workspace. PVRShaman is
a shader development integrated development environment (IDE) that is
part of the Imagination Technologies PowerVR SDK downloadable from
http://powervrinsider.com/. Several subsequent examples in the book use
PVRShaman to demonstrate various shading techniques.


Figure 10-4 Linear Fog on Torus in PVRShaman

Example 10-2 Vertex Shader for Computing Distance to Eye

#version 300 es

uniform mat4 u_matViewProjection;

uniform mat4 u_matView;

uniform vec4 u_eyePos;

in vec4 a_vertex;

in vec2 a_texCoord0;

out vec2 v_texCoord;

out float v_eyeDist;

void main( void )

{

// Transform vertex to view space

vec4 vViewPos = u_matView * a_vertex;

// Compute the distance to eye

v_eyeDist = sqrt( (vViewPos.x - u_eyePos.x) *

(vViewPos.x - u_eyePos.x) +

(vViewPos.y - u_eyePos.y) *

(vViewPos.y - u_eyePos.y) +

(vViewPos.z - u_eyePos.z) *

(vViewPos.z - u_eyePos.z) );

gl_Position = u_matViewProjection * a_vertex;

v_texCoord = a_texCoord0.xy;

}

Example 10-2 provides the code for the vertex shader that computes the

distance to the eye.


The important part of this vertex shader is the computation of the

v_eyeDist vertex shader output variable. First, the input vertex is

transformed into view space using the view matrix and stored in

vViewPos. Then, the distance from this point to the u_eyePos uniform

variable is computed. This computation gives us the distance in eye space

from the viewer to the transformed vertex. We can use this value in the

fragment shader to compute the fog factor, as shown in Example 10-3.

Example 10-3 Fragment Shader for Rendering Linear Fog

#version 300 es

precision mediump float;

uniform vec4 u_fogColor;

uniform float u_fogMaxDist;

uniform float u_fogMinDist;

uniform sampler2D baseMap;

in vec2 v_texCoord;

in float v_eyeDist;

layout( location = 0 ) out vec4 outColor;

float computeLinearFogFactor()

{

float factor;

// Compute linear fog equation

factor = (u_fogMaxDist - v_eyeDist) /

(u_fogMaxDist - u_fogMinDist );

// Clamp in the [0, 1] range

factor = clamp( factor, 0.0, 1.0 );

return factor;

}

void main( void )

{

float fogFactor = computeLinearFogFactor();

vec4 baseColor = texture( baseMap, v_texCoord );

// Compute final color as a lerp with fog factor

outColor = baseColor * fogFactor +

u_fogColor * (1.0 - fogFactor);

}


In the fragment shader, the computeLinearFogFactor() function

performs the computation for the linear fog equation. The minimum

and maximum fog distances are stored in uniform variables, and the

interpolated eye distance that was computed in the vertex shader is

used to compute the fog factor. The fog factor is then used to perform

a linear interpolation (abbreviated as “lerp” in Example 10-3) between

the base texture color and the fog color. The result is that we now have

linear fog and can easily adjust the distances and colors by changing the

uniform values.
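To make the arithmetic concrete, here is a minimal C sketch of the same linear fog computation. It mirrors Example 10-3 but is not taken from the sample code; the function names are hypothetical, and the sketch assumes that the maximum fog distance is greater than the minimum.

```c
/* Clamp x to the range [lo, hi]. */
float clampf(float x, float lo, float hi)
{
    return x < lo ? lo : (x > hi ? hi : x);
}

/* Linear fog factor: 1.0 at or before minDist (no fog) and
 * 0.0 at or beyond maxDist (fully fogged).
 * Assumes maxDist > minDist. */
float linear_fog_factor(float eyeDist, float minDist, float maxDist)
{
    float factor = (maxDist - eyeDist) / (maxDist - minDist);
    return clampf(factor, 0.0f, 1.0f);
}

/* Blend one color channel, as in the last statement of Example 10-3:
 * base * factor + fog * (1 - factor). */
float apply_fog(float base, float fog, float factor)
{
    return base * factor + fog * (1.0f - factor);
}
```

For example, with a fog range of 2 to 10 units, a fragment 6 units from the eye gets a factor of 0.5, so its final color is an even mix of the base color and the fog color.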

Note that with the flexibility of programmable fragment shaders, it is very

easy to implement other methods to compute fog. For example, you could

easily compute exponential fog by simply changing the fog equation.

Alternatively, rather than compute fog based on distance to the eye, you

could compute fog based on distance to the ground. A number of possible

fog effects can be easily achieved with small modifications to the fog

computations provided here.

Alpha Test (Using Discard)

A common effect used in 3D applications is to draw primitives that are

fully transparent in certain fragments. This is very useful for rendering

something like a chain-link fence. Representing a fence using geometry

would require a significant number of primitives. However, an alternative

to using geometry is to store a mask value in a texture that specifies which

texels should be transparent. For example, you could store the chain-link

fence in a single RGBA texture, where the RGB values represent the color

of the fence and the A value represents the mask of whether the texture

is transparent. Then you could easily render a fence using just one or two

triangles and masking off pixels in the fragment shader.

In traditional fixed-function rendering, this effect was achieved using

the alpha test. The alpha test allowed you to specify a comparison test

whereby if comparison of an alpha value of a fragment with a reference

value failed, that fragment would be killed. That is, if a fragment failed the

alpha test, the fragment would not be rendered. In OpenGL ES 3.0, there

is no fixed-function alpha test, but the same effect can be achieved in the

fragment shader using the discard keyword.

The PVRShaman example in Chapter_10/PVR_AlphaTest gives a very

simple example of doing the alpha test in the fragment shader, as shown

in Figure 10-5.


Figure 10-5 Alpha Test Using Discard

Example 10-4 Fragment Shader for Alpha Test Using Discard

#version 300 es

precision mediump float;

uniform sampler2D baseMap;

in vec2 v_texCoord;

layout( location = 0 ) out vec4 outColor;

void main( void )

{

vec4 baseColor = texture( baseMap, v_texCoord );

// Discard all fragments with alpha value less than 0.25

if( baseColor.a < 0.25 )

{

discard;

}

else

{

outColor = baseColor;

}

}

Example 10-4 gives the fragment shader code for this example. In this
fragment shader, the texture is a four-channel RGBA texture. The
alpha channel is used for the alpha test. The alpha value is compared with
0.25; if it is less than that value, the fragment is killed using discard.
Otherwise, the fragment is drawn using the texture color. This technique
can be used for implementing the alpha test by simply changing the
comparison or alpha reference value.
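The idea of changing the comparison or the reference value can be sketched in C as follows. The AlphaFunc enumeration and alpha_test_passes are hypothetical names modeled on the fixed-function alpha test; they are not part of OpenGL ES 3.0. In a shader, the selected comparison would simply replace the condition that guards discard.

```c
/* Hypothetical comparison selector, modeled on the fixed-function
 * alpha functions. These names are not OpenGL ES 3.0 tokens. */
typedef enum {
    ALPHA_NEVER, ALPHA_LESS, ALPHA_EQUAL, ALPHA_LEQUAL,
    ALPHA_GREATER, ALPHA_NOTEQUAL, ALPHA_GEQUAL, ALPHA_ALWAYS
} AlphaFunc;

/* Returns 1 if the fragment passes the test (is kept) and
 * 0 if it would be discarded. */
int alpha_test_passes(AlphaFunc func, float alpha, float ref)
{
    switch (func) {
    case ALPHA_NEVER:    return 0;
    case ALPHA_LESS:     return alpha <  ref;
    case ALPHA_EQUAL:    return alpha == ref;
    case ALPHA_LEQUAL:   return alpha <= ref;
    case ALPHA_GREATER:  return alpha >  ref;
    case ALPHA_NOTEQUAL: return alpha != ref;
    case ALPHA_GEQUAL:   return alpha >= ref;
    case ALPHA_ALWAYS:   return 1;
    }
    return 1;
}
```

Example 10-4 corresponds to ALPHA_GEQUAL with a reference value of 0.25: fragments whose alpha is at least 0.25 are kept, and all others are discarded.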

User Clip Planes

As described in Chapter 7, “Primitive Assembly and Rasterization,”

all primitives are clipped against the six planes that make up the view

frustum. However, sometimes a user might want to clip against one or

more additional user clip planes. There are a number of reasons why you

might want to clip against user clip planes. For example, when rendering

reflections, you need to flip the geometry about the reflection plane and

then render it into an off-screen texture. When rendering into the texture,

you need to clip the geometry against the reflection plane, which requires

a user clip plane.

In OpenGL ES 1.1, user clip planes could be provided to the API via

a plane equation and the clipping would be handled automatically.

In OpenGL ES 3.0, you can still accomplish this same effect, but now

you need to do it yourself in the shader. The key to implementing user

clip planes is using the discard keyword, which was introduced in the

previous section.

Before showing you how to implement user clip planes, let’s review the

basics of the mathematics. A plane is specified by the equation

Ax + By + Cz + D = 0

The vector (A, B, C) represents the normal of the plane and the value D

is the distance of the plane along that vector from the origin. To figure

out whether a point should or should not be clipped against a plane,

we need to evaluate the distance from a point P to a plane with the

equation

Dist = (A × P.x) + (B × P.y) + (C × P.z) + D

If the distance is less than 0, we know the point is behind the plane

and should be clipped. If the distance is greater than or equal to 0, it

should not be clipped. Note that the plane equation and P must be in

the same coordinate space. A PVRShaman example is provided in the

Chapter_10/PVR_ClipPlane workspace and illustrated in Figure 10-6.

In the example, a teapot is rendered and clipped against a user

clip plane.


The first thing the shader needs to do is compute the distance to the

plane, as mentioned earlier. This could be done in either the vertex shader

(and passed into a varying) or the fragment shader. It is cheaper in terms

of performance to do this computation in the vertex shader rather than

having to compute the distance in every fragment. The vertex shader

listing in Example 10-5 shows the distance-to-plane computation.

Example 10-5 User Clip Plane Vertex Shader

#version 300 es

uniform vec4 u_clipPlane;

uniform mat4 u_matViewProjection;

in vec4 a_vertex;

out float v_clipDist;

void main( void )

{

// Compute the distance between the vertex and

// the clip plane

v_clipDist = dot( a_vertex.xyz, u_clipPlane.xyz ) +

u_clipPlane.w;

gl_Position = u_matViewProjection * a_vertex;

}

The u_clipPlane uniform variable holds the plane equation for the clip

plane and is passed into the shader using glUniform4f. The v_clipDist

varying variable then stores the computed clip distance. This value is passed

into the fragment shader, which uses the interpolated distance to determine

whether the fragment should be clipped, as shown in Example 10-6.

Figure 10-6 User Clip Plane Example

Example 10-6 User Clip Plane Fragment Shader

#version 300 es
precision mediump float;
in float v_clipDist;
layout( location = 0 ) out vec4 outColor;
void main( void )
{
    // Reject fragments behind the clip plane
    if( v_clipDist < 0.0 )
        discard;

    outColor = vec4( 0.5, 0.5, 1.0, 0.0 );
}

As you can see, if the v_clipDist varying variable is negative, this
means the fragment is behind the clip plane and must be discarded.
Otherwise, the fragment is processed as usual. This simple example just
demonstrates the computations needed to implement user clip planes.
You can easily implement multiple user clip planes by simply computing
multiple clip distances and having multiple discard tests.

Summary

This chapter introduced implementing several rendering techniques
using fragment shaders. We focused on implementing fragment shaders
that accomplish techniques that were part of fixed-function OpenGL
ES 1.1. Specifically, we showed you how to implement multitexturing,
linear fog, alpha test, and user clip planes. The number of shading
techniques that become possible when using programmable fragment
shaders is nearly limitless. This chapter gave you grounding in how to
develop some fragment shaders that you can build on to create more
sophisticated effects.

We are now almost ready to introduce a number of advanced rendering
techniques. Before getting there, we need to cover what happens after
the fragment shader: namely, per-fragment operations and
framebuffer objects. These topics are covered in the next two chapters.


Chapter 11

Fragment Operations

This chapter discusses the operations that can be applied either to the
entire framebuffer or to individual fragments after the execution of the
fragment shader in the OpenGL ES 3.0 fragment pipeline. As you’ll recall,
the output of the fragment shader is the fragment’s colors and depth
value. The following operations occur after fragment shader execution
and can affect the visibility and final color of a pixel:

Scissor box testing
Stencil buffer testing
Depth buffer testing
Multisampling
Blending
Dithering

The tests and operations that a fragment goes through on its way to the
framebuffer are shown in Figure 11-1.

Figure 11-1 The Post-Shader Fragment Pipeline (stages in order: Fragment Shader, Scissor Box, Stencil Test, Depth Test, Blending, Dithering)


As you might have noticed, there isn’t a stage named “multisampling.”

Multisampling is an anti-aliasing technique that duplicates operations

at a subfragment level. We describe how multisampling affects fragment

processing in more depth later in the chapter.

The chapter concludes with a discussion of methods for reading pixels

from and writing pixels to the framebuffer.

Buffers

OpenGL ES supports three types of buffers, each of which stores different

data for every pixel in the framebuffer:

Color buffer (composed of front and back color buffers)

Depth buffer

Stencil buffer

The size of a buffer—commonly referred to as the “depth of the buffer”

(but not to be confused with the depth buffer)—is measured by the

number of bits that are available for storing information for a single pixel.

The color buffer, for example, will have three components for storing the

red, green, and blue color components, and optional storage for the alpha

component. The depth of the color buffer is the sum of the number of bits

for all of its components. For the depth and stencil buffers, in contrast,

a single value represents the bit depth of a pixel in those buffers. For

example, a depth buffer might have 16 bits per pixel. The overall size of

the buffer is the sum of the bit depths of all of the components. Common

framebuffer depths include 16-bit RGB buffers, with 5 bits for red and

blue, and 6 bits for green (the human visual system is more sensitive to

green than to red or blue), and 32 bits divided equally for an RGBA buffer.
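As a small illustration of these bit depths, the following C sketch packs an 8-bit-per-channel color into a 16-bit RGB565 pixel and sums component sizes to obtain the depth of a color buffer. The function names are hypothetical, and dropping the low-order bits of each channel is just one simple conversion choice assumed here.

```c
#include <stdint.h>

/* Pack 8-bit red, green, and blue values into a 16-bit RGB565 pixel:
 * 5 bits of red, 6 bits of green, and 5 bits of blue. The low-order
 * bits of each channel are dropped. */
uint16_t pack_rgb565(uint8_t r, uint8_t g, uint8_t b)
{
    return (uint16_t)(((r >> 3) << 11) | ((g >> 2) << 5) | (b >> 3));
}

/* The depth of a color buffer is the sum of the bits of its components. */
int color_buffer_depth(int redBits, int greenBits, int blueBits, int alphaBits)
{
    return redBits + greenBits + blueBits + alphaBits;
}
```

Here color_buffer_depth(5, 6, 5, 0) gives 16 for the RGB565 layout, and color_buffer_depth(8, 8, 8, 8) gives 32 for an evenly divided RGBA buffer.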

Additionally, the color buffer may be double buffered, such that

it contains two buffers: one that is displayed on the output device

(usually a monitor or LCD display), named the “front” buffer; and

another buffer that is hidden from the viewer, but used for constructing

the next image to be displayed, and called the “back” buffer. In double-

buffered applications, animation is accomplished by drawing into the

back buffer, and then swapping the front and back buffers to display

the new image. This swapping of buffers is usually synchronized with

the refresh cycle of the display device, which will give the illusion of


a continuously smooth animation. Recall that double buffering was

discussed in Chapter 3, “An Introduction to EGL.”

Although every EGL configuration will have a color buffer, the depth and

stencil buffers are optional. However, every EGL implementation must

provide at least one configuration that contains all three of the buffers,

with the depth buffer being at least 16 bits deep, and at least 8 bits for

the stencil buffer.

Requesting Additional Buffers

To include a depth or stencil buffer along with your color buffer, you

need to request them when you specify the attributes for your EGL

configuration. As discussed in Chapter 3, you pass a set of attribute–

value pairs into the EGL that specify the type of rendering surface your

application needs. To include a depth buffer in addition to the color

buffer, you would specify EGL_DEPTH_SIZE in the list of attributes

along with the desired bit depth you need. Likewise, you would add

EGL_STENCIL_SIZE along with the number of required bits to obtain

astencil buffer.

Our convenience library, esUtil, simplifies those operations by merely
allowing you to say that you would like those buffers along with a color
buffer, and it takes care of the rest of the work (requesting a maximally
sized buffer). When using our library, you would add (by means of a
bitwise OR operation) ES_WINDOW_DEPTH and ES_WINDOW_STENCIL in your
call to esCreateWindow. For example,

esCreateWindow ( &esContext, "Application Name",

window_width, window_height,

ES_WINDOW_RGB | ES_WINDOW_DEPTH |

ES_WINDOW_STENCIL );

Clearing Buffers

OpenGL ES is an interactive rendering system, and it assumes that at
the start of each frame, you’ll want to initialize all of the buffers to their
default value. Buffers are cleared by calling the glClear function, which
takes a bitmask representing the various buffers that should be cleared to
their specified clear values.

void glClear(GLbitfield mask)

mask    specifies the buffers to be cleared, and is composed of the union
        of the following bitmasks representing the various OpenGL ES
        buffers: GL_COLOR_BUFFER_BIT, GL_DEPTH_BUFFER_BIT,
        GL_STENCIL_BUFFER_BIT

You’re required neither to clear every buffer nor to clear them all at the
same time, but you might obtain the best performance by calling glClear
only once per frame with all the buffers you want simultaneously cleared.

Each buffer has a default value that’s used when you request that buffer be
cleared. For each buffer, you can specify your desired clear value using the
functions shown here:

void glClearColor(GLfloat red, GLfloat green,
                  GLfloat blue, GLfloat alpha)

red, green, blue, alpha
        specify the color value (in the range [0, 1]) that all
        pixels in the color buffers should be initialized to when
        GL_COLOR_BUFFER_BIT is present in the bitmask passed
        to glClear

void glClearDepthf(GLfloat depth)

depth   specifies the depth value (in the range [0, 1]) that all pixels in
        the depth buffer should be initialized to when
        GL_DEPTH_BUFFER_BIT is present in the bitmask passed
        to glClear

void glClearStencil(GLint s)

s       specifies the stencil value (in the range [0, 2^n - 1], where n is
        the number of bits available in the stencil buffer) that all pixels
        in the stencil buffer should be initialized to when
        GL_STENCIL_BUFFER_BIT is present in the bitmask passed
        to glClear

If you have multiple draw buffers in a framebuffer object (see the Multiple
Render Targets section), you can clear a specific draw buffer with the
following calls:

void glClearBufferiv(GLenum buffer, GLint drawBuffer,
                     const GLint *value)
void glClearBufferuiv(GLenum buffer, GLint drawBuffer,
                      const GLuint *value)
void glClearBufferfv(GLenum buffer, GLint drawBuffer,
                     const GLfloat *value)

buffer      specifies the type of buffer to clear. Can be
            GL_COLOR, GL_FRONT, GL_BACK, GL_FRONT_AND_BACK,
            GL_LEFT, GL_RIGHT, GL_DEPTH (glClearBufferfv only) or
            GL_STENCIL (glClearBufferiv only).

drawBuffer  specifies the draw buffer name to clear. Must be zero
            for depth or stencil buffers. Otherwise, must be less than
            GL_MAX_DRAW_BUFFERS for color buffers.

value       specifies a pointer to a four-element vector (for color
            buffers) or to a single value (for depth or stencil buffers)
            to clear the buffer to.

To reduce the number of function calls, you can clear the depth and
stencil buffers at the same time using glClearBufferfi.

void glClearBufferfi(GLenum buffer, GLint drawBuffer,
                     GLfloat depth, GLint stencil)

buffer      specifies the type of buffer to clear; must be
            GL_DEPTH_STENCIL

drawBuffer  specifies the draw buffer name to clear; must be zero

depth       specifies the value to clear the depth buffer to

stencil     specifies the value to clear the stencil buffer to

Using Masks to Control Writing to Framebuffers

You can also control which buffers (or, in the case of the color buffer,
which components) are writable by specifying a buffer write mask. Before a
pixel’s value is written into a buffer, the buffer’s mask is used to verify
that the buffer is writable.

For the color buffer, the glColorMask routine specifies which
components in the color buffer will be updated if a pixel is written.
If the mask for a particular component is set to GL_FALSE, that
component will not be updated if written to. By default, all color
components are writable.

void glColorMask(GLboolean red, GLboolean green,
                 GLboolean blue, GLboolean alpha)

red, green, blue, alpha
        specify whether the particular color component
        in the color buffer is modifiable while rendering

Likewise, writing to the depth buffer is controlled by calling
glDepthMask with GL_TRUE or GL_FALSE to specify whether the depth
buffer is writable.

void glDepthMask(GLboolean depth)

depth   specifies whether the depth buffer is modifiable

Often, writing to the depth buffer is disabled when rendering translucent
objects. Initially, you would render all of the opaque objects in the scene
with writing to the depth buffer enabled (i.e., set to GL_TRUE). This would
ensure that all of the opaque objects are correctly depth sorted, and the
depth buffer contains the appropriate depth information for the scene.
Then, before rendering the translucent objects, you would disable writing to
the depth buffer by calling glDepthMask(GL_FALSE). While writing to the
depth buffer is disabled, values can still be read from it and used for depth
comparisons. This allows translucent objects that are obscured by opaque
objects to be correctly depth buffered, but does not modify the depth buffer
such that opaque objects would be obscured by translucent ones.

Finally, you can disable writing to the stencil buffer by calling
glStencilMask. Unlike with glColorMask or glDepthMask, you can
specify which bits of the stencil buffer are writable by providing a
mask.

void glStencilMask(GLuint mask)

mask    specifies a bitmask (in the range [0, 2^n - 1], where n is the
        number of bits in the stencil buffer) of which bits in a pixel in
        the stencil buffer are modifiable.

The glStencilMaskSeparate routine allows you to set the stencil mask
based on the face vertex order (sometimes called “facedness”) of the
primitive. This allows different stencil masks for front- and back-facing
primitives. glStencilMaskSeparate(GL_FRONT_AND_BACK, mask) is
identical to calling glStencilMask, which sets the same mask for the
front and back polygon faces.

void glStencilMaskSeparate(GLenum face, GLuint mask)

face    specifies the stencil mask to be applied based on the face vertex
        order of the rendered primitive. Valid values are GL_FRONT,
        GL_BACK, and GL_FRONT_AND_BACK.

mask    specifies a bitmask (in the range [0, 2^n - 1], where n is the
        number of bits in the stencil buffer) of which bits in a pixel in
        the stencil buffer are modifiable for the faces selected by face.

Fragment Tests and Operations

The following sections describe the various tests that can be applied to
a fragment in OpenGL ES. By default, all fragment tests and operations
are disabled, and fragments become pixels as they are written to the
framebuffer in the order in which they are received. By enabling the
various fragment tests and operations, you can choose which
fragments become pixels and affect the final image.

Each fragment test is individually enabled by calling glEnable with the
appropriate token listed in Table 11-1.

Table 11-1 Fragment Test Enable Tokens

glEnable Token                 Description
GL_DEPTH_TEST                  Control depth testing of fragments
GL_STENCIL_TEST                Control stencil testing of fragments
GL_BLEND                       Control blending of fragments with
                               colors stored in the color buffer
GL_DITHER                      Control dithering of fragment colors
                               before being written in the color buffer
GL_SAMPLE_COVERAGE             Control computation of sample
                               coverage values
GL_SAMPLE_ALPHA_TO_COVERAGE    Control use of a sample’s alpha in the
                               computation of a sample coverage value

Using the Scissor Test

The scissor test provides an additional level of clipping by specifying a
rectangular region that further limits which pixels in the framebuffer are
writable. Using the scissor box is a two-step process. First, you need to
specify the rectangular region using the glScissor function.

void glScissor(GLint x, GLint y, GLsizei width,
               GLsizei height)

x, y    specify the lower-left corner of the scissor rectangle in
        viewport coordinates

width   specifies the width of the scissor box (in pixels)

height  specifies the height of the scissor box (in pixels)

After specifying the scissor box, you need to enable it by calling
glEnable(GL_SCISSOR_TEST) to employ the additional clipping. All
rendering, including clearing the viewport, is restricted to the scissor box.

Generally, the scissor box is a subregion in the viewport, but the two
regions are not required to actually intersect. When the two regions do
not intersect, the scissoring operation will be performed on pixels that
are rendered outside of the viewport region. Note that the viewport
transformation happens before the fragment shader stage, while the
scissor test happens after the fragment shader stage.
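The containment rule behind the scissor test can be sketched in C. The ScissorBox type and scissor_test_passes are hypothetical names, not OpenGL ES API; the sketch assumes the usual convention that the box covers width by height pixels starting at its lower-left corner, so the right and top edges are exclusive.

```c
/* Scissor rectangle as passed to glScissor: lower-left corner plus size. */
typedef struct { int x, y, width, height; } ScissorBox;

/* Returns 1 if the pixel (px, py) lies inside the box, that is,
 * x <= px < x + width and y <= py < y + height. */
int scissor_test_passes(ScissorBox box, int px, int py)
{
    return px >= box.x && px < box.x + box.width &&
           py >= box.y && py < box.y + box.height;
}
```

For a box at (10, 20) that is 100 pixels wide and 50 pixels high, the pixel (109, 69) is the last one inside the box, while (110, 20) falls just outside it.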

Stencil Buffer Testing

The next operation that might be applied to a fragment is the stencil test.

The stencil buffer is a per-pixel mask that holds values that can be used to

determine whether a pixel should be updated. The stencil test is enabled

or disabled by the application.

Using the stencil buffer can be considered a two-step operation. The first

step is to initialize the stencil buffer with the per-pixel masks, which is

done by rendering geometry and specifying how the stencil buffer should

be updated. The second step is generally to use those values to control

subsequent rendering into the color buffer. In both cases, you specify how

the parameters are to be used in the stencil test.

The stencil test is essentially a bit test, as you might do in a C program

where you use a mask to determine if a bit is set, for example. The

stencil function, which controls the operator and values of the stencil

test, is controlled by the glStencilFunc or glStencilFuncSeparate

functions.

void glStencilFunc(GLenum func, GLint ref, GLuint mask)

void glStencilFuncSeparate(GLenum face, GLenum func,

GLint ref, GLuint mask)

face specifies the face associated with the provided stencil function.
Valid values are GL_FRONT, GL_BACK, and GL_FRONT_AND_BACK
(glStencilFuncSeparate only).

func specifies the comparison function for the stencil test. Valid values
are GL_EQUAL, GL_NOTEQUAL, GL_LESS, GL_GREATER,
GL_LEQUAL, GL_GEQUAL, GL_ALWAYS, and GL_NEVER.

ref specifies the comparison value for the stencil test.

mask specifies the mask that is bitwise ANDed with both the reference
value and the bits in the stencil buffer before they are compared.

To allow ner control of the stencil test, a masking parameter is used to

select which bits of the stencil values should be considered for the test.

After selecting those bits, their value is compared with a reference value

306 Chapter 11: Fragment Operations

using the operator provided. For example, to specify that the stencil test

passes where the lowest three bits of the stencil buffer are equal to 2, you

would call

glStencilFunc ( GL_EQUAL, 2, 0x7 );

and enable the stencil test. Note that in binary format, the last three bits

of 0x7 are 111.
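The masked comparison can be modeled in a few lines of plain C. The sketch below is not OpenGL ES code: the StencilFunc enum and the stencil_test_passes helper are hypothetical names introduced here for illustration only. It assumes the rule that the mask is ANDed with both the reference value and the stored stencil value before the comparison is made.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the GL comparison enums (not the real GL values). */
typedef enum {
    SF_NEVER, SF_LESS, SF_LEQUAL, SF_GREATER,
    SF_GEQUAL, SF_EQUAL, SF_NOTEQUAL, SF_ALWAYS
} StencilFunc;

/* Model of the stencil comparison: the mask is ANDed with both the
   reference value and the stored stencil value, and the test is
   evaluated as (ref & mask) func (stored & mask). */
static bool stencil_test_passes(StencilFunc func, uint32_t ref,
                                uint32_t mask, uint32_t stored)
{
    uint32_t r = ref & mask;
    uint32_t s = stored & mask;

    switch (func) {
    case SF_NEVER:    return false;
    case SF_LESS:     return r <  s;
    case SF_LEQUAL:   return r <= s;
    case SF_GREATER:  return r >  s;
    case SF_GEQUAL:   return r >= s;
    case SF_EQUAL:    return r == s;
    case SF_NOTEQUAL: return r != s;
    case SF_ALWAYS:   return true;
    }
    assert(!"unknown stencil function");
    return false;
}
```

With this model, stencil_test_passes(SF_EQUAL, 2, 0x7, stored) is true for any stored value whose lowest three bits are 010, which is the condition set up by the glStencilFunc call above.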

With the stencil test congured, you generally also need to let OpenGL

ES 3.0 know what to do with the values in the stencil buffer when the

stencil test passes. In fact, modifying the values in the stencil buffer

relies on more than just the stencil tests, but also incorporates the

results of the depth test (discussed in the next section). Three possible

outcomes can occur for a fragment with the combined stencil and

depth tests:

1. The fragment fails the stencil tests. If this occurs, no further testing

(i.e., the depth test) is applied to that fragment.

2. The fragment passes the stencil test, but fails the depth test.

3. The fragment passes both the stencil and depth tests.

Each of those possible outcomes can be used to affect the value

in the stencil buffer for that pixel location. The glStencilOp and

glStencilOpSeparate functions control the actions done on the stencil

buffer’s value for each of those test outcomes, and the possible operations

on the stencil values are shown in Table 11-2.

void glStencilOp(GLenum sfail, GLenum zfail, GLenum zpass)
void glStencilOpSeparate(GLenum face, GLenum sfail, GLenum zfail, GLenum zpass)

face specifies the face associated with the provided stencil operations. Valid values are GL_FRONT, GL_BACK, and GL_FRONT_AND_BACK (glStencilOpSeparate only).

sfail specifies the operation applied to the stencil bits if the fragment fails the stencil test. Valid values are GL_KEEP, GL_ZERO, GL_REPLACE, GL_INCR, GL_DECR, GL_INCR_WRAP, GL_DECR_WRAP, and GL_INVERT.

zfail specifies the operation applied when the fragment passes the stencil test but fails the depth test.

zpass specifies the operation applied when the fragment passes both the stencil and depth tests.

Table 11-2 Stencil Operations

Stencil Operation             Description
GL_ZERO                       Set the stencil value to zero
GL_REPLACE                    Replace the current stencil value with the reference value specified in glStencilFunc or glStencilFuncSeparate
GL_INCR, GL_DECR              Increment or decrement the stencil value; the stencil value is clamped to zero or 2^n - 1, where n is the number of bits in the stencil buffer
GL_INCR_WRAP, GL_DECR_WRAP    Increment or decrement the stencil value, but "wrap" the value if the stencil value overflows (incrementing the maximum value will result in a new stencil value of zero) or underflows (decrementing zero will result in the maximum stencil value)
GL_KEEP                       Keep the current stencil value, effectively not modifying the value for that pixel
GL_INVERT                     Bit-wise invert the value in the stencil buffer

The following example illustrates using glStencilFunc and glStencilOp to control rendering in various parts of the viewport:

GLfloat vVertices[] =
{
   -0.75f,  0.25f,  0.50f, // Quad #0
   -0.25f,  0.25f,  0.50f,
   -0.25f,  0.75f,  0.50f,
   -0.75f,  0.75f,  0.50f,
    0.25f,  0.25f,  0.90f, // Quad #1
    0.75f,  0.25f,  0.90f,
    0.75f,  0.75f,  0.90f,
    0.25f,  0.75f,  0.90f,
   -0.75f, -0.75f,  0.50f, // Quad #2
   -0.25f, -0.75f,  0.50f,
   -0.25f, -0.25f,  0.50f,
   -0.75f, -0.25f,  0.50f,
    0.25f, -0.75f,  0.50f, // Quad #3
    0.75f, -0.75f,  0.50f,
    0.75f, -0.25f,  0.50f,
    0.25f, -0.25f,  0.50f,
   -1.00f, -1.00f,  0.00f, // Big Quad
    1.00f, -1.00f,  0.00f,
    1.00f,  1.00f,  0.00f,
   -1.00f,  1.00f,  0.00f
};

GLubyte indices[][6] =

{

{ 0, 1, 2, 0, 2, 3 }, // Quad #0

{ 4, 5, 6, 4, 6, 7 }, // Quad #1

{ 8, 9, 10, 8, 10, 11 }, // Quad #2

{ 12, 13, 14, 12, 14, 15 }, // Quad #3

{ 16, 17, 18, 16, 18, 19 } // Big Quad

};

#define NumTests 4

GLfloat colors[NumTests][4] =

{

{ 1.0f, 0.0f, 0.0f, 1.0f },

{ 0.0f, 1.0f, 0.0f, 1.0f },

{ 0.0f, 0.0f, 1.0f, 1.0f },

{ 1.0f, 1.0f, 0.0f, 0.0f }

};

GLint numStencilBits;

GLuint stencilValues[NumTests] =

{

0x7, // Result of test 0

0x0, // Result of test 1

0x2, // Result of test 2

0xff // Result of test 3. We need to fill this
// value in at run time

};

// Set the viewport
glViewport ( 0, 0, esContext->width, esContext->height );

// Clear the color, depth, and stencil buffers. At this
// point, the stencil buffer will be 0x1 for all pixels
// (assuming the stencil clear value was set to 0x1).
glClear ( GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT |
          GL_STENCIL_BUFFER_BIT );

// Use the program object
glUseProgram ( userData->programObject );

// Load the vertex position
glVertexAttribPointer ( userData->positionLoc, 3, GL_FLOAT,
                        GL_FALSE, 0, vVertices );
glEnableVertexAttribArray ( userData->positionLoc );

// Test 0:
// Initialize upper-left region. In this case, the stencil-
// buffer values will be replaced because the rendered
// pixels fail the stencil test, which is
//
//      ref   mask   stencil  mask
//    ( 0x7 & 0x3 ) < ( 0x1 & 0x3 )
//
// The value in the stencil buffer for these pixels will
// be 0x7.
glStencilFunc ( GL_LESS, 0x7, 0x3 );
glStencilOp ( GL_REPLACE, GL_DECR, GL_DECR );
glDrawElements ( GL_TRIANGLES, 6, GL_UNSIGNED_BYTE,
                 indices[0] );

// Test 1:

// Initialize the upper-right region. Here, we’ll decrement

// the stencil-buffer values where the stencil test passes

// but the depth test fails. The stencil test is

// ref mask stencil mask

// ( 0x3 & 0x3 ) > ( 0x1 & 0x3 )

// but where the geometry fails the depth test. The

// stencil values for these pixels will be 0x0.

glStencilFunc ( GL_GREATER, 0x3, 0x3 );

glStencilOp ( GL_KEEP, GL_DECR, GL_KEEP );

glDrawElements ( GL_TRIANGLES, 6, GL_UNSIGNED_BYTE,

indices[1] );


// Test 2:

// Initialize the lower-left region. Here we’ll increment

// (with saturation) the stencil value where both the

// stencil and depth tests pass. The stencil test for

// these pixels will be

// ref mask stencil mask

// ( 0x1 & 0x3 ) == ( 0x1 & 0x3 )

// The stencil values for these pixels will be 0x2.

glStencilFunc ( GL_EQUAL, 0x1, 0x3 );

glStencilOp ( GL_KEEP, GL_INCR, GL_INCR );

glDrawElements ( GL_TRIANGLES, 6, GL_UNSIGNED_BYTE,

indices[2] );

// Test 3:

// Finally, initialize the lower-right region. We’ll invert

// the stencil value where the stencil tests fails. The

// stencil test for these pixels will be

// ref mask stencil mask

// ( 0x2 & 0x1 ) == ( 0x1 & 0x1 )

// The stencil value here will be set to ~((2^s−1) & 0x1),

// (with the 0x1 being from the stencil clear value),

// where 's' is the number of bits in the stencil buffer.

glStencilFunc ( GL_EQUAL, 0x2, 0x1 );

glStencilOp ( GL_INVERT, GL_KEEP, GL_KEEP );

glDrawElements ( GL_TRIANGLES, 6, GL_UNSIGNED_BYTE,indices[3]);

// As we don't know at compile-time how many stencil bits are
// present, we'll query, and update, the correct value in the
// stencilValues array for the fourth test. We'll use this
// value later in rendering.
glGetIntegerv ( GL_STENCIL_BITS, &numStencilBits );
stencilValues[3] = ~( ( (1 << numStencilBits) - 1 ) & 0x1 ) &
                   0xff;

// Use the stencil buffer for controlling where rendering

// will occur. We disable writing to the stencil buffer so we

// can test against them without modifying the values we

// generated.

glStencilMask ( 0x0 );


for ( i = 0; i < NumTests; ++i )

{

glStencilFunc ( GL_EQUAL, stencilValues[i], 0xff );

glUniform4fv ( userData->colorLoc, 1, colors[i] );

glDrawElements ( GL_TRIANGLES, 6, GL_UNSIGNED_BYTE,

indices[4] );

}
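To check the stencil values that the four tests are expected to leave behind, the update rule can be modeled in plain C. This is a sketch only, not OpenGL ES code: StencilOp, apply_stencil_op, and resolve_stencil are hypothetical names, and the pass/fail outcomes of the stencil and depth tests are supplied by the caller (taken here from the comments in the example) rather than computed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-ins for the GL stencil operation enums. */
typedef enum {
    OP_KEEP, OP_ZERO, OP_REPLACE, OP_INCR, OP_DECR,
    OP_INCR_WRAP, OP_DECR_WRAP, OP_INVERT
} StencilOp;

/* Apply one stencil operation to a stored value in a stencil buffer
   that is `bits` bits deep. */
static uint32_t apply_stencil_op(StencilOp op, uint32_t stored,
                                 uint32_t ref, unsigned bits)
{
    assert(bits >= 1 && bits <= 31);
    uint32_t max = (1u << bits) - 1u;   /* largest representable value */

    switch (op) {
    case OP_KEEP:      return stored;
    case OP_ZERO:      return 0;
    case OP_REPLACE:   return ref > max ? max : ref;
    case OP_INCR:      return stored < max ? stored + 1 : max;  /* clamps */
    case OP_DECR:      return stored > 0 ? stored - 1 : 0;      /* clamps */
    case OP_INCR_WRAP: return (stored + 1) & max;
    case OP_DECR_WRAP: return (stored - 1) & max;
    case OP_INVERT:    return ~stored & max;
    }
    return stored;
}

/* Select sfail, zfail, or zpass according to the three possible
   outcomes of the combined stencil and depth tests. */
static uint32_t resolve_stencil(bool stencil_pass, bool depth_pass,
                                StencilOp sfail, StencilOp zfail,
                                StencilOp zpass, uint32_t stored,
                                uint32_t ref, unsigned bits)
{
    StencilOp op = !stencil_pass ? sfail : (!depth_pass ? zfail : zpass);
    return apply_stencil_op(op, stored, ref, bits);
}
```

Starting from a stored value of 0x1 in an 8-bit stencil buffer, this model gives 0x7, 0x0, 0x2, and 0xFE for tests 0 through 3, matching the entries of stencilValues once the last one has been filled in at run time.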

Depth Buffer Testing

The depth buffer is typically used for hidden-surface removal. It traditionally

keeps the distance value of the closest object to the viewpoint for each pixel

in the rendering surface, and for every new incoming fragment, compares

its distance from the viewpoint with the stored value. By default, if the

incoming fragment’s depth value is less than the value stored in the depth

buffer (meaning it’s closer to the viewer), the incoming fragment’s depth

value replaces the values stored in the depth buffer, and then its color value

replaces the color value in the color buffer. This is the standard method for

depth buffering—and if that’s what you would like to do, you simply need

to request a depth buffer when you create a window, and then enable the

depth test by calling glEnable with GL_DEPTH_TEST. If no depth buffer is

associated with the color buffer, the depth test always passes.

Of course, that’s only one way to use the depth buffer. You can modify the

depth comparison operator by calling glDepthFunc.

void glDepthFunc(GLenum func)

func specifies the depth value comparison function, which can be one of GL_LESS, GL_GREATER, GL_LEQUAL, GL_GEQUAL, GL_EQUAL, GL_NOTEQUAL, GL_ALWAYS, or GL_NEVER
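As a rough illustration of the comparison described above, the following plain-C sketch models a single pixel of the depth and color buffers. It is not OpenGL ES code: DepthFunc, Pixel, and depth_test_and_write are hypothetical names, and the model assumes that depth writes are enabled and that no other per-fragment stage intervenes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the GL depth comparison enums. */
typedef enum {
    DF_NEVER, DF_LESS, DF_LEQUAL, DF_GREATER,
    DF_GEQUAL, DF_EQUAL, DF_NOTEQUAL, DF_ALWAYS
} DepthFunc;

/* One pixel of the rendering surface: stored depth plus a packed color. */
typedef struct {
    float    depth;
    unsigned color;
} Pixel;

/* Compare the incoming fragment depth against the stored depth. */
static bool depth_compare(DepthFunc func, float incoming, float stored)
{
    switch (func) {
    case DF_NEVER:    return false;
    case DF_LESS:     return incoming <  stored;
    case DF_LEQUAL:   return incoming <= stored;
    case DF_GREATER:  return incoming >  stored;
    case DF_GEQUAL:   return incoming >= stored;
    case DF_EQUAL:    return incoming == stored;
    case DF_NOTEQUAL: return incoming != stored;
    case DF_ALWAYS:   return true;
    }
    return false;
}

/* If the fragment passes the depth test, its depth and color replace the
   stored values; otherwise the pixel is left untouched. */
static bool depth_test_and_write(DepthFunc func, Pixel *px,
                                 float frag_depth, unsigned frag_color)
{
    assert(px != NULL);
    if (!depth_compare(func, frag_depth, px->depth)) {
        return false;
    }
    px->depth = frag_depth;
    px->color = frag_color;
    return true;
}
```

With DF_LESS, which corresponds to the default GL_LESS behavior, a fragment at depth 0.5 replaces a pixel cleared to 1.0, and a later fragment at 0.75 is rejected.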

Blending

This section discusses blending pixel colors. Once a fragment passes all of

the enabled fragment tests, its color can be combined with the color that’s

already present in the fragment’s pixel location. Before the two colors are

combined, they’re multiplied by a scaling factor and combined using the

specied blending operator. The blending equation is

$$C_{final} = f_{source}\,C_{source} \;\mathrm{op}\; f_{destination}\,C_{destination}$$

where $f_{source}$ and $C_{source}$ are the incoming fragment's scaling factor and color, respectively. Likewise, $f_{destination}$ and $C_{destination}$ are the pixel's scaling factor and color, and op is the mathematical operator for combining the scaled values.

The scaling factors are specified by calling either glBlendFunc or glBlendFuncSeparate.

void glBlendFunc(GLenum sfactor, GLenum dfactor)

sfactor specifies the blending coefficient for the incoming fragment

dfactor specifies the blending coefficient for the destination pixel

void glBlendFuncSeparate(GLenum srcRGB, GLenum dstRGB, GLenum srcAlpha, GLenum dstAlpha)

srcRGB specifies the blending coefficient for the incoming fragment's red, green, and blue components

dstRGB specifies the blending coefficient for the destination pixel's red, green, and blue components

srcAlpha specifies the blending coefficient for the incoming fragment's alpha value

dstAlpha specifies the blending coefficient for the destination pixel's alpha value

The possible values for the blending coefficients are shown in Table 11-3.

Table 11-3 Blending Functions

Blending Coefficient Enum      RGB Blending Factors        Alpha Blending Factor
GL_ZERO                        (0, 0, 0)                   0
GL_ONE                         (1, 1, 1)                   1
GL_SRC_COLOR                   (Rs, Gs, Bs)                As
GL_ONE_MINUS_SRC_COLOR         (1 - Rs, 1 - Gs, 1 - Bs)    1 - As
GL_SRC_ALPHA                   (As, As, As)                As
GL_ONE_MINUS_SRC_ALPHA         (1 - As, 1 - As, 1 - As)    1 - As
GL_DST_COLOR                   (Rd, Gd, Bd)                Ad
GL_ONE_MINUS_DST_COLOR         (1 - Rd, 1 - Gd, 1 - Bd)    1 - Ad
GL_DST_ALPHA                   (Ad, Ad, Ad)                Ad
GL_ONE_MINUS_DST_ALPHA         (1 - Ad, 1 - Ad, 1 - Ad)    1 - Ad
GL_CONSTANT_COLOR              (Rc, Gc, Bc)                Ac
GL_ONE_MINUS_CONSTANT_COLOR    (1 - Rc, 1 - Gc, 1 - Bc)    1 - Ac
GL_CONSTANT_ALPHA              (Ac, Ac, Ac)                Ac
GL_ONE_MINUS_CONSTANT_ALPHA    (1 - Ac, 1 - Ac, 1 - Ac)    1 - Ac
GL_SRC_ALPHA_SATURATE          min(As, 1 - Ad)             1

In Table 11-3, (Rs, Gs, Bs, As) are the color components associated with the incoming fragment color, (Rd, Gd, Bd, Ad) are the components associated with the pixel color already in the color buffer, and (Rc, Gc, Bc, Ac) represent a constant color that you set by calling glBlendColor. In the case of GL_SRC_ALPHA_SATURATE, the minimum value computed is applied to the source color only.

void glBlendColor(GLfloat red, GLfloat green, GLfloat blue, GLfloat alpha)

red, green, blue, alpha specify the component values for the constant blending color

Once the incoming fragment and pixel color have been multiplied by their respective scaling factors, they are combined using the operator specified by glBlendEquation or glBlendEquationSeparate. By default, blended colors are accumulated using the GL_FUNC_ADD operator. The GL_FUNC_SUBTRACT operator subtracts the scaled framebuffer color from the scaled incoming fragment color. Likewise, the GL_FUNC_REVERSE_SUBTRACT operator reverses the blending equation, such that the scaled incoming fragment color is subtracted from the scaled current pixel value.
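The effect of the three arithmetic operators can be worked through in plain C. The sketch below is a model, not OpenGL ES code: Color, BlendEq, and blend_src_alpha are hypothetical names. It assumes the common GL_SRC_ALPHA / GL_ONE_MINUS_SRC_ALPHA factor pair applied to all four components, and a normalized color buffer whose results are clamped to [0, 1].

```c
#include <assert.h>

typedef struct { float r, g, b, a; } Color;

/* Hypothetical stand-ins for GL_FUNC_ADD, GL_FUNC_SUBTRACT, and
   GL_FUNC_REVERSE_SUBTRACT. */
typedef enum { EQ_ADD, EQ_SUBTRACT, EQ_REVERSE_SUBTRACT } BlendEq;

static float clamp01(float v)
{
    return v < 0.0f ? 0.0f : (v > 1.0f ? 1.0f : v);
}

/* Combine the already-scaled source and destination components. */
static float combine(BlendEq op, float src, float dst)
{
    switch (op) {
    case EQ_ADD:              return src + dst;
    case EQ_SUBTRACT:         return src - dst;
    case EQ_REVERSE_SUBTRACT: return dst - src;
    }
    return src + dst;
}

/* Blend with source factor As and destination factor (1 - As). */
static Color blend_src_alpha(BlendEq op, Color src, Color dst)
{
    assert(src.a >= 0.0f && src.a <= 1.0f);
    float fs = src.a;          /* factor for the incoming fragment */
    float fd = 1.0f - src.a;   /* factor for the stored pixel      */
    Color out;
    out.r = clamp01(combine(op, fs * src.r, fd * dst.r));
    out.g = clamp01(combine(op, fs * src.g, fd * dst.g));
    out.b = clamp01(combine(op, fs * src.b, fd * dst.b));
    out.a = clamp01(combine(op, fs * src.a, fd * dst.a));
    return out;
}
```

For a half-transparent red fragment (1, 0, 0, 0.5) over an opaque blue pixel (0, 0, 1, 1), the additive case yields (0.5, 0, 0.5, 0.75). The subtract case keeps only the red contribution, and the reverse-subtract case keeps only the blue one, because negative results clamp to zero.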

void glBlendEquation(GLenum mode)

mode specifies the blending operator. Valid values are GL_FUNC_ADD, GL_FUNC_SUBTRACT, GL_FUNC_REVERSE_SUBTRACT, GL_MIN, or GL_MAX.

void glBlendEquationSeparate(GLenum modeRGB, GLenum modeAlpha)

modeRGB specifies the blending operator for the red, green, and blue components

modeAlpha specifies the alpha component blending operator

Dithering

On a system where the number of colors available in the framebuffer is limited due to the number of bits per component in the framebuffer, we can simulate greater color depth using dithering. Dithering algorithms arrange colors in such a way that the image appears to have more available colors than are really present. OpenGL ES 3.0 doesn't specify which dithering algorithm is to be used in supporting its dithering stage; the technique is very implementation dependent.

The only control your application has over dithering is whether it is applied to the final pixels. This decision is entirely controlled by calling glEnable or glDisable with GL_DITHER to specify dithering's use in the pipeline. Initially, dithering is enabled.

Multisampled Anti-Aliasing

Anti-aliasing is an important technique for improving the quality of generated images by trying to reduce the visual artifacts of rendering into discrete pixels. The geometric primitives that OpenGL ES 3.0 renders are rasterized onto a grid, and their edges may become deformed in that process. You have almost certainly seen the staircase effect that happens to lines drawn diagonally across a monitor.

Various techniques can be used to reduce those aliasing effects, and OpenGL ES 3.0 supports a variant called multisampling. Multisampling divides every pixel into a set of samples, each of which is treated like a “mini-pixel” during rasterization. That is, when a geometric primitive is rendered, it's like rendering into a framebuffer that has many more pixels than the real display surface. Each sample has its own color, depth, and stencil value, and those values are preserved until the image is ready for display. When it's time to compose the final image, the samples are resolved into the final pixel color. What makes this process special is that in addition to using every sample's color information, OpenGL ES 3.0 has even more information about how many samples for a particular pixel were occupied during rasterization. Each sample for a pixel is assigned a bit in the sample coverage mask. Using that coverage mask, we can control how the final pixels are resolved. Every rendering surface created for an OpenGL ES 3.0 application will be configured for multisampling, even if only a single sample per pixel is available. Unlike in supersampling, the fragment shader is executed per pixel rather than per sample.

Multisampling has multiple options that can be turned on and off (using glEnable and glDisable, respectively) to control the use of the sample coverage value.

First, you can specify that the sample's alpha value should be used to determine the coverage value by enabling GL_SAMPLE_ALPHA_TO_COVERAGE. In this mode, if the geometric primitive covers a sample, the alpha value of the incoming fragment is used to compute an additional sample coverage mask, which is bitwise ANDed with the coverage mask computed from the samples of the fragment. This newly computed coverage value replaces the original one generated directly from the sample coverage calculation. These sample computations are implementation dependent.

Additionally, you can specify GL_SAMPLE_COVERAGE or GL_SAMPLE_COVERAGE_INVERT, which uses the fragment's (potentially modified by previous operations) coverage value or its inverted bits, respectively, and computes the bitwise AND of that value with one specified using the glSampleCoverage function. The value specified with glSampleCoverage is used to generate an implementation-specific coverage mask, and includes an inversion flag, invert, that inverts the bits in the generated mask. Using this inversion flag, it becomes possible to create two transparency masks that don't use entirely distinct sets of samples.
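Because the conversion from the coverage value to a mask is implementation specific, the following plain-C sketch shows only one plausible mapping, chosen for illustration. The helper names coverage_mask_from_value and apply_sample_coverage are hypothetical, and the choice to set the lowest bits first is an assumption; a real implementation may distribute the bits differently.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* One possible mapping from a coverage value in [0, 1] to a sample mask:
   set a proportional number of the lowest bits, then optionally invert
   all bits within the sample count. */
static uint32_t coverage_mask_from_value(float value, unsigned num_samples,
                                         bool invert)
{
    assert(num_samples >= 1 && num_samples <= 16);
    if (value < 0.0f) value = 0.0f;
    if (value > 1.0f) value = 1.0f;

    uint32_t all   = (1u << num_samples) - 1u;
    unsigned count = (unsigned)(value * (float)num_samples + 0.5f);
    uint32_t mask  = (1u << count) - 1u;   /* lowest `count` bits set */

    return invert ? (~mask & all) : mask;
}

/* AND the fragment's coverage with the mask generated from the value. */
static uint32_t apply_sample_coverage(uint32_t fragment_coverage,
                                      float value, unsigned num_samples,
                                      bool invert)
{
    return fragment_coverage &
           coverage_mask_from_value(value, num_samples, invert);
}
```

Under this assumed mapping, a value of 0.5 with four samples produces the mask 0011, and the same value with the inversion flag set produces 1100, so two primitives drawn with the two settings would keep complementary samples.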

void glSampleCoverage(GLfloat value, GLboolean invert)

value specifies a value in the range [0, 1] that is converted into a sample mask; the resulting mask should have a proportional number of bits set corresponding to the value

invert specifies that after determining the mask's value, all of the bits in the mask should be inverted

Centroid Sampling

When rendering with multisampling, the fragment data is picked from a sample that is closest to a pixel center. This can lead to rendering artifacts near triangle edges, as the pixel center may sometimes fall outside of the triangle. In such a case, the fragment data can be extrapolated to a point outside of the triangle. Centroid sampling solves this problem by ensuring that the fragment data is picked from a sample that falls inside the triangle.

To enable centroid sampling, you can declare the output variables of the vertex shader (and input variables to the fragment shader) with the centroid qualifier as follows:

smooth centroid out vec3 v_color;

Note that using centroid sampling can lead to less accurate derivatives for pixels near the triangle edges.

Reading and Writing Pixels to the Framebuffer

If you want to preserve your rendered image for posterity's sake, you can read the pixel values back from the color buffer, but not from the depth or stencil buffers. When you call glReadPixels, the pixels in the color buffer are returned to your application in an array that has been previously allocated.

void glReadPixels(GLint x, GLint y, GLsizei width, GLsizei height, GLenum format, GLenum type, GLvoid *pixels)

x, y specify the viewport coordinates of the lower-left corner of the pixel rectangle read from the color buffer.

width, height specify the dimensions of the pixel rectangle read from the color buffer.

format specifies the pixel format that you would like returned. Three formats are available: GL_RGBA, GL_RGBA_INTEGER, and the value returned by querying GL_IMPLEMENTATION_COLOR_READ_FORMAT, which is an implementation-specific pixel format.

type specifies the data type of the pixels returned. Five types are available: GL_UNSIGNED_BYTE, GL_UNSIGNED_INT, GL_INT, GL_FLOAT, and the value returned from querying GL_IMPLEMENTATION_COLOR_READ_TYPE, which is an implementation-specific pixel type.

pixels a contiguous array of bytes that contains the values read from the color buffer after glReadPixels returns.

Aside from the fixed format (GL_RGBA and GL_RGBA_INTEGER) and type (GL_UNSIGNED_BYTE, GL_UNSIGNED_INT, GL_INT, and GL_FLOAT), notice that there are implementation-dependent values that should return the best format and type combination for the implementation you're using. The implementation-specific values can be queried as follows:

GLint readType, readFormat;
GLubyte *pixels;

glGetIntegerv ( GL_IMPLEMENTATION_COLOR_READ_TYPE, &readType );
glGetIntegerv ( GL_IMPLEMENTATION_COLOR_READ_FORMAT,
                &readFormat );

unsigned int bytesPerPixel = 0;

switch ( readType )
{
case GL_UNSIGNED_BYTE:
case GL_BYTE:
switch ( readFormat )
{
case GL_RGBA:
bytesPerPixel = 4;
break;
case GL_RGB:
case GL_RGB_INTEGER:
bytesPerPixel = 3;
break;
case GL_RG:
case GL_RG_INTEGER:
case GL_LUMINANCE_ALPHA:

bytesPerPixel = 2;

break;

case GL_RED:

case GL_RED_INTEGER:

case GL_ALPHA:

case GL_LUMINANCE:


bytesPerPixel = 1;

break;

default:

// Undetected format/error

break;

}

break;

case GL_FLOAT:

case GL_UNSIGNED_INT:

case GL_INT:

switch ( readFormat )

{

case GL_RGBA:

case GL_RGBA_INTEGER:

bytesPerPixel = 16;

break;

case GL_RGB:

case GL_RGB_INTEGER:

bytesPerPixel = 12;

break;

case GL_RG:

case GL_RG_INTEGER:

bytesPerPixel = 8;

break;

case GL_RED:

case GL_RED_INTEGER:

case GL_DEPTH_COMPONENT:

bytesPerPixel = 4;

break;

default:

// Undetected format/error

break;

}

break;

case GL_HALF_FLOAT:

case GL_UNSIGNED_SHORT:


case GL_SHORT:

switch ( readFormat )

{

case GL_RGBA:

case GL_RGBA_INTEGER:

bytesPerPixel = 8;

break;

case GL_RGB:

case GL_RGB_INTEGER:

bytesPerPixel = 6;

break;

case GL_RG:

case GL_RG_INTEGER:

bytesPerPixel = 4;

break;

case GL_RED:

case GL_RED_INTEGER:

bytesPerPixel = 2;

break;

default:

// Undetected format/error

break;

}

break;

case GL_FLOAT_32_UNSIGNED_INT_24_8_REV: // GL_DEPTH_STENCIL

bytesPerPixel = 8;

break;

// GL_RGBA, GL_RGBA_INTEGER format

case GL_UNSIGNED_INT_2_10_10_10_REV:

case GL_UNSIGNED_INT_10F_11F_11F_REV: // GL_RGB format

case GL_UNSIGNED_INT_5_9_9_9_REV: // GL_RGB format

case GL_UNSIGNED_INT_24_8: // GL_DEPTH_STENCIL format

bytesPerPixel = 4;

break;

case GL_UNSIGNED_SHORT_4_4_4_4: // GL_RGBA format

case GL_UNSIGNED_SHORT_5_5_5_1: // GL_RGBA format

case GL_UNSIGNED_SHORT_5_6_5: // GL_RGB format

bytesPerPixel = 2;

break;

default:
// Undetected type/error
break;
}

// Allocate a buffer that matches the rectangle being read
pixels = ( GLubyte* ) malloc( windowWidth * windowHeight *
                              bytesPerPixel );
glReadPixels ( 0, 0, windowWidth, windowHeight, readFormat,
               readType, pixels );

You can read pixels from any currently bound framebuffer, whether it’s

one allocated by the windowing system or from a framebuffer object.

Because each buffer can have a different layout, you’ll probably need to

query the type and format for each buffer you want to read.

OpenGL ES 3.0 provides an efficient mechanism to copy a rectangular block of pixels into the framebuffer, which will be described in Chapter 12, “Framebuffer Objects.”

Pixel Pack Buffer Objects

When a non-zero buffer object is bound to the GL_PIXEL_PACK_BUFFER

using glBindBuffer, the glReadPixels command can return immediately

and invoke DMA transfer to read pixels from the framebuffer and write the

data into the pixel buffer object (PBO).

To keep the CPU busy, you can schedule some CPU processing after the glReadPixels call to overlap CPU computations with the DMA transfer. Depending on the application, the data may not be available immediately; in such cases, you can use multiple PBOs so that while the CPU is waiting for the data transfer into one PBO, it can process the data from an earlier transfer into another PBO.
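The bookkeeping behind a multiple-PBO scheme can be sketched in plain C. The helpers below are hypothetical and contain no OpenGL ES calls; they only show which buffer slot a frame would start a read into and which slot holds the oldest pending read, assuming one read is issued per frame in round-robin order.

```c
#include <assert.h>
#include <stdbool.h>

/* Slot that frame `frame` starts a new asynchronous read into.
   In a real application this is where the PBO would be bound to
   GL_PIXEL_PACK_BUFFER before calling glReadPixels. */
static unsigned pbo_write_index(unsigned frame, unsigned num_pbos)
{
    assert(num_pbos >= 2);
    return frame % num_pbos;
}

/* Slot holding the oldest pending read, which the CPU would map and
   process on frame `frame` while the newer transfers are in flight. */
static unsigned pbo_read_index(unsigned frame, unsigned num_pbos)
{
    assert(num_pbos >= 2);
    return (frame + 1) % num_pbos;
}

/* The read slot only contains data once every slot has been written,
   that is, from frame (num_pbos - 1) onward. */
static bool pbo_read_slot_ready(unsigned frame, unsigned num_pbos)
{
    assert(num_pbos >= 2);
    return frame >= num_pbos - 1;
}
```

With two PBOs, frame 1 starts a read into slot 1 and processes slot 0, which was filled by the read issued on frame 0. Adding more slots gives each transfer more frames to complete, at the cost of extra latency and memory.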

Multiple Render Targets

Multiple render targets (MRTs) allow the application to render to several

color buffers at one time. With multiple render targets, the fragment

shader outputs several colors (which can be used to store RGBA colors,

normals, depths, or texture coordinates), one for each attached color

buffer. MRTs are used in many advanced rendering algorithms, such as

deferred shading and fast ambient occlusion approximation (SSAO).

In deferred shading, lighting calculations are performed only once per pixel. This is achieved by separating the geometry and lighting calculations into two separate rendering passes. The first geometry pass outputs multiple attributes (such as position, normal, material color, or texture coordinates) into multiple buffers (using MRTs). The second lighting pass performs the lighting calculations by sampling the attributes from each buffer created in the first pass. As the depth testing has been performed on the first pass, we will perform only one lighting calculation per pixel.

The following steps show how to set up MRTs:

1. Initialize framebuffer objects (FBOs) using glGenFramebuffers and

glBindFramebuffer commands (described in more detail in

Chapter 12, “Framebuffer Objects”) as shown here:

glGenFramebuffers ( 1, &fbo );

glBindFramebuffer ( GL_FRAMEBUFFER, fbo );

2. Initialize textures using glGenTextures and glBindTexture commands

(described in more detail in Chapter 9, “Texturing”) as shown here:

glBindTexture ( GL_TEXTURE_2D, textureId );

glTexImage2D ( GL_TEXTURE_2D, 0, GL_RGBA,

textureWidth, textureHeight,

0, GL_RGBA, GL_UNSIGNED_BYTE, NULL );

// Set the filtering mode

glTexParameteri ( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER,

GL_NEAREST );

glTexParameteri ( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER,

GL_NEAREST );

3. Bind relevant textures to the FBO using glFramebufferTexture2D or

glFramebufferTextureLayer command (described in more detail in

Chapter 12) as shown here:

glFramebufferTexture2D ( GL_DRAW_FRAMEBUFFER,

GL_COLOR_ATTACHMENT0,

GL_TEXTURE_2D,

textureId, 0 );

4. Specify color attachments for rendering using the following

glDrawBuffers command:

void glDrawBuffers(GLsizei n, const GLenum* bufs)

n specifies the number of buffers in bufs

bufs points to an array of symbolic constants specifying the buffers into which fragment colors or data values will be written

For example, you can set up a FBO with four color outputs

(attachments) as follows:

const GLenum attachments[4] = { GL_COLOR_ATTACHMENT0,

GL_COLOR_ATTACHMENT1,

GL_COLOR_ATTACHMENT2,

GL_COLOR_ATTACHMENT3 };

glDrawBuffers ( 4, attachments );

You can query the maximum number of color attachments by calling glGetIntegerv with the symbolic constant GL_MAX_COLOR_ATTACHMENTS. The minimum number of color attachments supported by all OpenGL ES 3.0 implementations is 4.

5. Declare and use multiple shader outputs in the fragment shader.

For example, the following declaration will copy fragment

shader outputs fragData0 to fragData3 to draw buffers 0–3,

respectively:

layout(location = 0) out vec4 fragData0;

layout(location = 1) out vec4 fragData1;

layout(location = 2) out vec4 fragData2;

layout(location = 3) out vec4 fragData3;

Putting everything together, Example 11-1 (as part of the Chapter_11/

MRTs example) illustrates how to set up four draw buffers for a single

framebuffer object.

Example 11-1 Setting up Multiple Render Targets

int InitFBO ( ESContext *esContext)
{
   UserData *userData = esContext->userData;
   int i;
   GLint defaultFramebuffer = 0;
   const GLenum attachments[4] =
   {
      GL_COLOR_ATTACHMENT0,
      GL_COLOR_ATTACHMENT1,
      GL_COLOR_ATTACHMENT2,
      GL_COLOR_ATTACHMENT3
   };

   glGetIntegerv ( GL_FRAMEBUFFER_BINDING, &defaultFramebuffer );

   // Set up fbo
   glGenFramebuffers ( 1, &userData->fbo );
   glBindFramebuffer ( GL_FRAMEBUFFER, userData->fbo );

   // Set up four output buffers and attach to fbo
   userData->textureHeight = userData->textureWidth = 400;
   glGenTextures ( 4, &userData->colorTexId[0] );
   for (i = 0; i < 4; ++i)
   {
      glBindTexture ( GL_TEXTURE_2D, userData->colorTexId[i] );
      glTexImage2D ( GL_TEXTURE_2D, 0, GL_RGBA,
                     userData->textureWidth,
                     userData->textureHeight,
                     0, GL_RGBA, GL_UNSIGNED_BYTE, NULL );

      // Set the filtering mode
      glTexParameteri ( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER,
                        GL_NEAREST );
      glTexParameteri ( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER,
                        GL_NEAREST );

      glFramebufferTexture2D ( GL_DRAW_FRAMEBUFFER,
                               attachments[i],
                               GL_TEXTURE_2D,
                               userData->colorTexId[i], 0 );
   }

   glDrawBuffers ( 4, attachments );

   if ( GL_FRAMEBUFFER_COMPLETE !=
        glCheckFramebufferStatus ( GL_FRAMEBUFFER ) )
   {
      return FALSE;
   }

   // Restore the original framebuffer
   glBindFramebuffer ( GL_FRAMEBUFFER, defaultFramebuffer );

   return TRUE;
}

Example 11-2 (as part of the Chapter_11/MRTs example) illustrates how to output four colors per fragment in a fragment shader.

Example 11-2 Fragment Shader with Multiple Render Targets

#version 300 es
precision mediump float;
layout(location = 0) out vec4 fragData0;
layout(location = 1) out vec4 fragData1;
layout(location = 2) out vec4 fragData2;
layout(location = 3) out vec4 fragData3;
void main()
{
   // first buffer will contain red color
   fragData0 = vec4 ( 1, 0, 0, 1 );

   // second buffer will contain green color
   fragData1 = vec4 ( 0, 1, 0, 1 );

   // third buffer will contain blue color
   fragData2 = vec4 ( 0, 0, 1, 1 );

   // fourth buffer will contain gray color
   fragData3 = vec4 ( 0.5, 0.5, 0.5, 1 );
}

Summary

In this chapter, you learned about tests and operations (scissor box testing, stencil buffer testing, depth buffer testing, multisampling, blending and dithering) that happen after the fragment shader. This is the final phase in the OpenGL ES 3.0 pipeline. In the next chapter, you will learn an efficient method for rendering to a texture or an off-screen surface using framebuffer objects.

Chapter 12

Framebuffer Objects

In this chapter, we describe what framebuffer objects are, how applications

can create them, and how applications can use them for rendering to an

off-screen buffer or rendering to a texture. We start by discussing why we

need framebuffer objects. We then introduce framebuffer objects and new

object types they add to OpenGL ES, and explain how they differ from the

EGL surfaces described in Chapter 3, “An Introduction to EGL.” We go on

to discuss how to create framebuffer objects; explore how to specify color,

depth, and stencil attachments to a framebuffer object; and then provide

examples that demonstrate rendering to a framebuffer object. Last but not

least, we discuss performance tips and tricks that can help ensure good

performance when using framebuffer objects.

Why Framebuffer Objects?

A rendering context and a drawing surface need to be first created and made current before any OpenGL ES commands can be called by an application. The rendering context and the drawing surface are usually provided by the native windowing system through an API such as EGL. Chapter 3 describes how to create an EGL context and surface and how to attach them to a rendering thread. The rendering context contains the appropriate state required for correct operation. The drawing surface provided by the native windowing system can be a surface that will be displayed on the screen, referred to as the window system–provided framebuffer, or it can be an off-screen surface, referred to as a pbuffer. The calls to create the EGL drawing surfaces let you specify the width and height of the surface in pixels; whether the surface uses color, depth, and stencil buffers; and the bit depths of these buffers.

By default, OpenGL ES uses the window system–provided framebuffer as the drawing surface. If the application is drawing only to an on-screen surface, the window system–provided framebuffer is usually sufficient. However, many applications need to render to a texture, and for this purpose using the window system–provided framebuffer as your drawing surface is usually not an ideal option. Examples of where the render-to-texture approach is useful are shadow mapping, dynamic reflections and environment mapping, multipass techniques for depth-of-field, motion blur effects, and postprocessing effects.

Applications can use either of two techniques to render to a texture:

Implement render to texture by drawing to the window system–

provided framebuffer and then copy the appropriate region of the

framebuffer to the texture. This can be implemented using the

glCopyTexImage2D and glCopyTexSubImage2D APIs. As their names

imply, these APIs perform a copy from the framebuffer to the texture

buffer, and this copy operation can often adversely impact performance.

In addition, this approach works only if the dimensions of the texture

are less than or equal to the dimensions of the framebuffer.

Implement render to texture by using a pbuffer that is attached

to a texture. We know that a window system–provided surface

must be attached to a rendering context. This can be inefficient

on some implementations that require separate contexts for each

pbuffer and window surface. Additionally, switching between

window system–provided drawables can sometimes require the

implementation to ush all previous rendering prior to the switch.

This can introduce expensive “bubbles” (idling the GPU) into the

rendering pipeline. On such systems, our recommendation is to

avoid using pbuffers to render to textures because of the overhead

associated with context- and window system–provided drawable

switching.

Neither of these two methods is ideal for rendering to a texture or

other off-screen surface. What is needed instead are APIs that allow

applications to directly render to a texture or the ability to create an

off-screen surface within the OpenGL ES API and use it as a rendering

target. Framebuffer objects and renderbuffer objects allow applications

to do exactly this, without requiring additional rendering contexts to

be created. As a consequence, we no longer have to worry about the

overhead of a context and drawable switch that can occur when using

window system–provided drawables. Framebuffer objects, therefore,

provide a better and more efficient method for rendering to a texture or

an off-screen surface.

The framebuffer objects API supports the following operations:

Creating framebuffer objects using OpenGL ES commands only

Creating and using multiple framebuffer objects within a single

EGL context—that is, without requiring a rendering context per

framebuffer

Creating off-screen color, depth, or stencil renderbuffers and textures,

and attaching these to a framebuffer object

Sharing color, depth, or stencil buffers across multiple framebuffers

Attaching textures directly to a framebuffer as color or depth, thereby

avoiding the need to do a copy operation

Copying between framebuffers and invalidating framebuffer contents

Framebuffer and Renderbuffer Objects

In this section, we describe what renderbuffer and framebuffer objects are,

explain how they differ from window system–provided drawables, and

consider when to use a renderbuffer instead of a texture.

A renderbuffer object is a 2D image buffer allocated by the application.

The renderbuffer can be used to allocate and store color, depth, or stencil

values and can be used as a color, depth, or stencil attachment in a

framebuffer object. A renderbuffer is similar to an off-screen window

system–provided drawable surface, such as a pbuffer. A renderbuffer,

however, cannot be directly used as a GL texture.

A framebuffer object (FBO) is a collection of color, depth, and stencil textures

or render targets. Various 2D images can be attached to the color attachment

point in the framebuffer object. These include a renderbuffer object that

stores color values, a mip level of a 2D texture or a cubemap face, a layer of a

2D array texture, or even a mip level of a 2D slice in a 3D texture. Similarly,

various 2D images containing depth values can be attached to the depth

attachment point of an FBO. These can include a renderbuffer, a mip level

of a 2D texture, or a cubemap face that stores depth values. The only 2D

image that can be attached to the stencil attachment point of an FBO is a

renderbuffer object that stores stencil values.

Figure 12-1 shows the relationships among framebuffer objects,

renderbuffer objects, and textures. Note that a framebuffer object can have

several color attachments (GL_COLOR_ATTACHMENTi) but only one depth

attachment and one stencil attachment.

[Figure 12-1 diagram: Framebuffer Objects (Color Attachments, Depth Attachment, Stencil Attachment); Renderbuffer Objects (Depth Buffer, Stencil Buffer); Texture mip Images]

Figure 12-1 Framebuffer Objects, Renderbuffer Objects, and Textures

Choosing a Renderbuffer Versus a Texture as a Framebuffer

Attachment

For render-to-texture use cases, you would attach a texture object to the

framebuffer object. Examples include rendering to a color buffer that will

be used as a color texture, and rendering into a depth buffer that will be

used as a depth texture for shadows.

There are several reasons to use renderbuffers instead of textures:

Renderbuffers support multisampling.

If the image will not be used as a texture, using a renderbuffer may

deliver a performance advantage. This advantage occurs because the

implementation might be able to store the renderbuffer in a much

more efcient format, better suited for rendering than for texturing.

The implementation can only do so, however, if it knows in advance

that the image will not be used as a texture.

Framebuffer Objects Versus EGL Surfaces

The differences between an FBO and the window system–provided

drawable surface are as follows:

Pixel ownership test determines whether the pixel at location (xw, yw)

in the framebuffer is currently owned by OpenGL ES. This test allows

the window system to control which pixels in the framebuffer belong

to the current OpenGL ES context—for example, when a window that

is being rendered into by OpenGL ES is obscured. For an application-

created framebuffer object, the pixel ownership test always succeeds,

as the framebuffer object owns all the pixels.

The window system might support only double-buffered surfaces.

Framebuffer objects, in contrast, support only single-buffered

attachments.

Sharing of stencil and depth buffers between framebuffers is

possible using framebuffer objects but usually not with the window

system–provided framebuffer. Stencil and depth buffers and their

corresponding state are usually allocated implicitly with the window

system–provided drawable surface and, therefore, cannot be shared

between drawable surfaces. With application-created framebuffer

objects, stencil and depth renderbuffers can be created independently

and then associated with a framebuffer object by attaching these

buffers to appropriate attachment points in multiple framebuffer

objects, if desired.

Creating Framebuffer and Renderbuffer Objects

Creating framebuffer and renderbuffer objects is similar to how texture or

vertex buffer objects are created in OpenGL ES 3.0.

The glGenRenderbuffers API call is used to allocate renderbuffer object

names. This API is described next.

void glGenRenderbuffers(GLsizei n, GLuint *renderbuffers)

n              number of renderbuffer object names to return
renderbuffers  pointer to an array of n entries, where the allocated
               renderbuffer object names are returned

glGenRenderbuffers allocates n renderbuffer object names and returns

them in renderbuffers. The renderbuffer object names returned by

glGenRenderbuffers are unsigned integer numbers other than 0. These

names returned are marked in use but do not have any state associated

with them. The value 0 is reserved by OpenGL ES and does not refer to

a renderbuffer object. Applications trying to modify or query the buffer

object state for renderbuffer object 0 will generate an appropriate error.

The glGenFramebuffers API call is used to allocate framebuffer object

names. This API is described here.

void glGenFramebuffers(GLsizei n, GLuint *ids)

n    number of framebuffer object names to return
ids  pointer to an array of n entries, where the allocated framebuffer
     object names are returned

glGenFramebuffers allocates n framebuffer object names and returns them

in ids. The framebuffer object names returned by glGenFramebuffers are

unsigned integer numbers other than 0. The framebuffer names returned are

marked in use but do not have any state associated with them. The value 0

is reserved by OpenGL ES and refers to the window system–provided

framebuffer. Applications trying to modify or query the buffer object state for

framebuffer object 0 will generate an appropriate error.

Using Renderbuffer Objects

In this section, we describe how to specify the data storage, format, and

dimensions of the renderbuffer image. To specify this information for

a specific renderbuffer object, we need to make this object the current

renderbuffer object. The glBindRenderbuffer command is used to set

the current renderbuffer object.

void glBindRenderbuffer(GLenum target, GLuint renderbuffer)

target must be set to GL_RENDERBUFFER

renderbuffer renderbuffer object name

Note that glGenRenderbuffers is not required to assign a renderbuffer

object name before it is bound using glBindRenderbuffer. Although

it is a good practice to call glGenRenderbuffers, many applications

specify compile-time constants for their buffers. An application can

specify an unused renderbuffer object name to glBindRenderbuffer.

However, we do recommend that OpenGL ES applications call

glGenRenderbuffers and use renderbuffer object names returned

by glGenRenderbuffers instead of specifying their own buffer

objectnames.

The rst time the renderbuffer object name is bound by calling

glBindRenderbuffer, the renderbuffer object is allocated with the

appropriate default state. If this allocation is successful, the allocated

object will become the newly bound renderbuffer object.

The following state and default values are associated with a renderbuffer

object:

Width and height in pixels—The default value is zero.

Internal format—This describes the format of the pixels stored in the

renderbuffer. It must be a color-, depth-, or stencil-renderable format.

Color bit-depth—This is valid only if the internal format is a color-

renderable format. The default value is zero.

Depth bit-depth—This is valid only if the internal format is a depth-

renderable format. The default value is zero.

Stencil bit-depth—This is valid only if the internal format is a stencil-

renderable format. The default value is zero.

glBindRenderbuffer can also be used to bind to an existing renderbuffer

object (i.e., an object that has been assigned and used before and,

therefore, has a valid state associated with it). No changes to the state of

the newly bound renderbuffer object are made by the bind command.

Once a renderbuffer object is bound, we can specify the

dimensions and format of the image stored in the renderbuffer. The

glRenderbufferStorage command can be used for this purpose.

glRenderbufferStorage looks very similar to glTexImage2D, except

that no image data is supplied. You can also create a multisample

renderbuffer by using the glRenderbufferStorageMultisample

command. glRenderbufferStorage is equivalent to

glRenderbufferStorageMultisample with samples set to zero. The width

and height of the renderbuffer are specified in pixels and must

not exceed the maximum renderbuffer size

supported by the implementation. The minimum size value that must

be supported by all OpenGL ES implementations is 1. The actual

maximum size supported by the implementation can be queried using

the following code:

GLint maxRenderbufferSize = 0;

glGetIntegerv(GL_MAX_RENDERBUFFER_SIZE, &maxRenderbufferSize);

The internalformat argument specifies the format that the application

would like to use to store pixels in the renderbuffer object. Table 12-1

lists the renderbuffer formats to store a color-renderable buffer, and

Table 12-2 lists the formats to store a depth-renderable or stencil-

renderable buffer.

void glRenderbufferStorage ( GLenumtarget,

GLenum internalformat,

GLsizei width, GLsizei height)

void glRenderbufferStorageM ultisample(GLenum target,

GLsizei samples,

GLenum internalformat,

GLsizei width, GLsizei height)

target must be set to GL_RENDERBUFFER.

samples number of samples to be used with

the renderbuffer object’s storage.

Must be less than GL_MAX_SAMPLES

(glRenderbufferStorageMultisample only)

internalformat must be a format that can be used as a color

buffer, depth buffer, or stencil buffer.

The supported formats are listed in Tables 12-1

and 12-2.

width width of the renderbuffer in pixels;

must be less than or equal to

GL_MAX_RENDERBUFFER_SIZE.

height height of the renderbuffer in pixels;

must be less than or equal to

GL_MAX_RENDERBUFFER_SIZE.

The renderbuffer object can be attached to the color, depth, or stencil

attachment of the framebuffer object without the renderbuffer’s storage

format and dimensions being specied. The renderbuffer’s storage format

and dimensions can be specied before or after the renderbuffer object

has been attached to the framebuffer object. This information will,

however, need to be correctly specied before the framebuffer object and

renderbuffer attachment can be used for rendering.
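The size and sample-count limits described above can be checked before storage is requested. The following is a minimal sketch rather than part of the OpenGL ES API: RenderbufferLimits and renderbuffer_request_ok are hypothetical names, and the limit values are assumed to have been queried already with glGetIntegerv (GL_MAX_RENDERBUFFER_SIZE and GL_MAX_SAMPLES).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper type: implementation limits that the application
 * has already queried with glGetIntegerv. */
typedef struct {
    int maxRenderbufferSize; /* value of GL_MAX_RENDERBUFFER_SIZE */
    int maxSamples;          /* value of GL_MAX_SAMPLES */
} RenderbufferLimits;

/* Returns true if the requested storage respects the limits:
 * width and height within [0, maxRenderbufferSize] and
 * samples within [0, maxSamples] (0 selects single-sample storage). */
static bool renderbuffer_request_ok(const RenderbufferLimits *limits,
                                    int samples, int width, int height)
{
    assert(limits != NULL);
    if (width < 0 || height < 0 || samples < 0)
        return false;
    if (width > limits->maxRenderbufferSize ||
        height > limits->maxRenderbufferSize)
        return false;
    return samples <= limits->maxSamples;
}
```

An application might call this helper just before glRenderbufferStorage or glRenderbufferStorageMultisample and fall back to a smaller size or a lower sample count when it returns false.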

Multisample Renderbuffers

Multisample renderbuffers enable the application to render to off-

screen framebuffers with multisample anti-aliasing. The multisample

renderbuffers cannot be directly bound to textures, but they can be

resolved to single-sample textures using the newly introduced framebuffer

blit (described later in this chapter).

As described in the previous section, to create a multisample renderbuffer,

you use the glRenderbufferStorageMultisample API.

Renderbuffer Formats

Table 12-1 lists the renderbuffer formats to store a color-renderable buffer,

and Table 12-2 lists the renderbuffer formats to store a depth-renderable or

stencil-renderable buffer.

Table 12-1 Renderbuffer Formats for Color-Renderable Buffer

Internal Format    Red Bits  Green Bits  Blue Bits  Alpha Bits
GL_R8              8         -           -          -
GL_R8UI            ui8       -           -          -
GL_R8I             i8        -           -          -
GL_R16UI           ui16      -           -          -
GL_R16I            i16       -           -          -
GL_R32UI           ui32      -           -          -
GL_R32I            i32       -           -          -
GL_RG8             8         8           -          -
GL_RG8UI           ui8       ui8         -          -
GL_RG8I            i8        i8          -          -
GL_RG16UI          ui16      ui16        -          -
GL_RG16I           i16       i16         -          -
GL_RG32UI          ui32      ui32        -          -
GL_RG32I           i32       i32         -          -
GL_RGB8            8         8           8          -
GL_RGB565          5         6           5          -
GL_RGBA8           8         8           8          8
GL_SRGB8_ALPHA8    8         8           8          8
GL_RGB5_A1         5         5           5          1
GL_RGBA4           4         4           4          4
GL_RGB10_A2        10        10          10         2
GL_RGBA8UI         ui8       ui8         ui8        ui8
GL_RGBA8I          i8        i8          i8         i8
GL_RGB10_A2UI      ui10      ui10        ui10       ui2
GL_RGBA16UI        ui16      ui16        ui16       ui16
GL_RGBA16I         i16       i16         i16        i16
GL_RGBA32UI        ui32      ui32        ui32       ui32
GL_RGBA32I         i32       i32         i32        i32

i denotes an integer; ui denotes an unsigned integer type.

Table 12-2 Renderbuffer Formats for Depth-Renderable and Stencil-Renderable Buffer

Internal Format          Depth Bits  Stencil Bits
GL_DEPTH_COMPONENT16     16          -
GL_DEPTH_COMPONENT24     24          -
GL_DEPTH_COMPONENT32F    f32         -
GL_DEPTH24_STENCIL8      24          8
GL_DEPTH32F_STENCIL8     f32         8
GL_STENCIL_INDEX8        -           8

f denotes a float type.

Using Framebuffer Objects

We describe how to use framebuffer objects to render to an off-screen

buffer (i.e., renderbuffer) or to render to a texture. Before we can use a

framebuffer object and specify its attachments, we need to make it the

current framebuffer object. The glBindFramebuffer command is used to

set the current framebuffer object.

void glBindFramebuffer(GLenum target, GLuint framebuffer)

target       must be set to GL_READ_FRAMEBUFFER,
             GL_DRAW_FRAMEBUFFER, or GL_FRAMEBUFFER
framebuffer  framebuffer object name

Note that glGenFramebuffers is not required to assign a framebuffer

object name before it is bound using glBindFramebuffer. An application

can specify an unused framebuffer object name to glBindFramebuffer.

However, we do recommend that OpenGL ES applications call

glGenFramebuffers and use framebuffer object names returned by

glGenFramebuffers instead of specifying their own buffer object names.

On some OpenGL ES 3.0 implementations, the first time a framebuffer

object name is bound by calling glBindFramebuffer, the framebuffer

object is allocated with the appropriate default state. If the allocation is

successful, this allocated object is bound as the current framebuffer object

for the rendering context.

The following state is associated with a framebuffer object:

Color attachment point—The attachment point for the color buffer.

Depth attachment point—The attachment point for the depth buffer.

Stencil attachment point—The attachment point for the stencil buffer.

Framebuffer completeness status—Whether the framebuffer is in a

complete state and can be rendered to.

For each attachment point, the following information is specified:

Object type—Specifies the type of object that is associated with the

attachment point. This can be GL_RENDERBUFFER if a renderbuffer

object is attached or GL_TEXTURE if a texture object is attached. The

default value is GL_NONE.

Object name—Specifies the name of the object attached. This can be

either the renderbuffer object name or the texture object name. The

default value is 0.

Texture level—If a texture object is attached, then this specifies the

mip level of the texture associated with the attachment point. The

default value is 0.

Texture cubemap face—If a texture object is attached and the texture

is a cubemap, then this specifies which one of the six cubemap faces

is to be used as the attachment point. The default value is

GL_TEXTURE_CUBE_MAP_POSITIVE_X.

Texture layer—Specifies the 2D slice of the 3D texture to be used as the

attachment point. The default value is 0.

glBindFramebuffer can also be used to bind to an existing framebuffer

object (i.e., an object that has been assigned and used before and,

therefore, has valid state associated with it). No changes are made to the

state of the newly bound framebuffer object.

Once a framebuffer object has been bound, the color, depth, and stencil

attachments of the currently bound framebuffer object can be set to

a renderbuffer object or a texture. As shown in Figure 12-1, the color

attachment can be set to a renderbuffer that stores color values, or

to a mip level of a 2D texture or a cubemap face, or to a layer of a

2D array texture, or to a mip level of a 2D slice in a 3D texture. The

depth attachment can be set to a renderbuffer that stores depth values

or packed depth and stencil values, to a mip level of a 2D depth texture,

or to a depth cubemap face. The stencil attachment must be set to

a renderbuffer that stores stencil values or packed depth and stencil

values.

Attaching a Renderbuffer as a Framebuffer Attachment

The glFramebufferRenderbuffer command is used to attach a

renderbuffer object to a framebuffer attachment point.

void glFramebufferRenderbuffer (GLenum target,

GLenum attachment,

GLenum renderbuffertarget,

GLuint renderbuffer)

target must be set to GL_READ_FRAMEBUFFER,

GL_DRAW_FRAMEBUFFER, or GL_FRAMEBUFFER

attachment must be one of the following enums:

GL_COLOR_ATTACHMENTi

GL_DEPTH_ATTACHMENT

GL_STENCIL_ATTACHMENT

GL_DEPTH_STENCIL_ATTACHMENT

renderbuffertarget must be set to GL_RENDERBUFFER

renderbuffer the renderbuffer object that should be used as

attachment; the renderbuffer must be either

zero or the name of an existing renderbuffer

object

If glFramebufferRenderbuffer is called with renderbuffer not equal

to zero, this renderbuffer object will be used as the new color, depth, or

stencil attachment point as specified by the value of the attachment

argument.

The attachment point’s state will be modified to

Object type = GL_RENDERBUFFER

Object name = renderbuffer

Texture level and texture layer = 0

Texture cubemap face = GL_NONE

The newly attached renderbuffer object’s state or contents of its buffer do

not change.

If glFramebufferRenderbuffer is called with renderbuffer equal to

zero, then the color, depth, or stencil buffer as specified by attachment is

detached and reset to zero.

Attaching a 2D Texture as a Framebuffer Attachment

The glFramebufferTexture2D command is used to attach a mip level of

a 2D texture or a cubemap face to a framebuffer attachment point. It can

be used to attach a texture as a color, depth, or stencil attachment.

void glFramebufferTexture2D(GLenum target,
                            GLenum attachment,
                            GLenum textarget,
                            GLuint texture,
                            GLint level)

target      must be set to GL_READ_FRAMEBUFFER,
            GL_DRAW_FRAMEBUFFER, or GL_FRAMEBUFFER
attachment  must be one of the following enums:
            GL_COLOR_ATTACHMENTi
            GL_DEPTH_ATTACHMENT
            GL_STENCIL_ATTACHMENT
            GL_DEPTH_STENCIL_ATTACHMENT
textarget   specifies the texture target; this is the value specified
            in the target argument in glTexImage2D
texture     specifies the texture object
level       specifies the mip level of the texture image

If glFramebufferTexture2D is called with texture not equal to zero,

then the color, depth, or stencil attachment will be set to texture. If

glFramebufferTexture2D generates an error, no change is made to the

state of the framebuffer.

The attachment point’s state will be modified to

Object type = GL_TEXTURE

Object name = texture

Texture level = level

Texture cubemap face = valid if the texture attachment is a cubemap

and is one of the following values:

GL_TEXTURE_CUBE_MAP_POSITIVE_X

GL_TEXTURE_CUBE_MAP_POSITIVE_Y

GL_TEXTURE_CUBE_MAP_POSITIVE_Z

GL_TEXTURE_CUBE_MAP_NEGATIVE_X

GL_TEXTURE_CUBE_MAP_NEGATIVE_Y

GL_TEXTURE_CUBE_MAP_NEGATIVE_Z

Texture layer = 0

The newly attached texture object’s state or contents of its image are

not modied by glFramebufferTexture2D. Note that the texture

object’s state and image can be modied after it has been attached to a

framebuffer object.

If glFramebufferTexture2D is called with texture equal to zero, then

the color, depth, or stencil attachment is detached and reset to zero.

Attaching an Image of a 3D Texture as a Framebuffer

Attachment

The glFramebufferTextureLayer command is used to attach a 2D slice

and a specific mip level of a 3D texture or a layer of a 2D array texture to

a framebuffer attachment point. Refer to Chapter 9, “Texturing,” for a

detailed description of how 3D textures work.

void glFramebufferTextureLayer(GLenum target,
                               GLenum attachment,
                               GLuint texture,
                               GLint level,
                               GLint layer)

target      must be set to GL_READ_FRAMEBUFFER,
            GL_DRAW_FRAMEBUFFER, or GL_FRAMEBUFFER.
attachment  must be one of the following enums:
            GL_COLOR_ATTACHMENTi
            GL_DEPTH_ATTACHMENT
            GL_STENCIL_ATTACHMENT
            GL_DEPTH_STENCIL_ATTACHMENT
texture     specifies the texture object.
level       specifies the mip level of the texture image. If texture is
            a 3D texture, then level must be greater than or equal to
            zero and less than or equal to log2 of the value of
            GL_MAX_3D_TEXTURE_SIZE. If texture is a 2D array texture,
            then level must be greater than or equal to zero and no
            larger than log2 of the value of GL_MAX_TEXTURE_SIZE.
layer       specifies the layer of the texture image.

The newly attached texture object’s state or contents of its image are

not modified by glFramebufferTextureLayer. Note that the texture

object’s state and image can be modified after it has been attached to a

framebuffer object.

The attachment point’s state will be modified to

Object type = GL_TEXTURE

Object name = texture

Texture level = level

Texture cubemap face = GL_NONE

Texture layer = layer

If glFramebufferTextureLayer is called with texture equal to zero,

then the attachment is detached and reset to zero.
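The upper bound on level given in the parameter description for glFramebufferTextureLayer is the base-2 logarithm of a size limit, rounded down. As a minimal sketch, the helpers below compute that bound with integer shifts. The names max_mip_level and level_in_range are hypothetical, and the size limit is assumed to have been queried with glGetIntegerv (GL_MAX_3D_TEXTURE_SIZE for 3D textures, GL_MAX_TEXTURE_SIZE for 2D array textures).

```c
#include <assert.h>

/* Returns floor(log2(maxTextureSize)) for maxTextureSize >= 1. */
static int max_mip_level(int maxTextureSize)
{
    int level = 0;

    assert(maxTextureSize >= 1);
    while (maxTextureSize > 1) {
        maxTextureSize >>= 1; /* halve until only the top bit is left */
        ++level;
    }
    return level;
}

/* Returns nonzero if level lies in [0, floor(log2(maxTextureSize))]. */
static int level_in_range(int level, int maxTextureSize)
{
    return level >= 0 && level <= max_mip_level(maxTextureSize);
}
```

With a size limit of 2048, for example, the highest level the range allows is 11.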

One interesting question arises: What happens if we are rendering into

a texture and at the same time use this texture object as a texture in

a fragment shader? Will the OpenGL ES implementation generate an

error when such a situation arises? In some cases, it is possible for the

OpenGL ES implementation to determine if a texture object is being

used as a texture input and a framebuffer attachment into which we

are currently drawing. glDrawArrays and glDrawElements could then

generate an error. To ensure that glDrawArrays and glDrawElements

can be executed as rapidly as possible, however, these checks are not

performed. Instead of generating an error, in this case rendering results

are undened. It is the application’s responsibility to make sure that this

situation does not occur.

Checking for Framebuffer Completeness

A framebuffer object needs to be dened as complete before it can be

used as a rendering target. If the currently bound framebuffer object is

not complete, OpenGL ES commands that draw primitives or read pixels

will fail and generate an appropriate error that indicates the reason the

framebuffer is incomplete.

The rules for a framebuffer object to be considered complete are as

follows:

Make sure that the color, depth, and stencil attachments are valid. A

color attachment is valid if it is zero (i.e., there is no attachment) or

if it is a color-renderable renderbuffer object or a texture object with

one of the formats listed in Table 12-1. A depth attachment is valid

if it is zero or is a depth-renderable renderbuffer object or a depth

texture with one of the formats listed in Table 12-2 with depth buffer

bits. A stencil attachment is valid if it is zero or is a stencil-renderable

renderbuffer object with one of the formats listed in Table 12-2 with

stencil buffer bits. There is a minimum of one valid attachment.

A framebuffer is not complete if it has no attachments, as there is

nothing to draw into or read from.

Valid attachments associated with a framebuffer object must have the

same width and height.

If depth and stencil attachments exist, they must be the same image.

The value of GL_RENDERBUFFER_SAMPLES is the same for all renderbuffer

attachments. If the attachments are a combination of renderbuffers and

textures, the value of GL_RENDERBUFFER_SAMPLES is zero.
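To make the rules above concrete, the following sketch models them on plain C structures. It is an illustration only: the types and the function check_framebuffer_model are hypothetical, the model has a single color attachment, it skips the format checks of the first rule, and it does not reproduce the order of checks or the status codes that an OpenGL ES implementation reports.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of one attached image. */
typedef struct {
    int attached;      /* 0 when nothing is attached at this point */
    int isTexture;     /* 1 for a texture image, 0 for a renderbuffer */
    unsigned int name; /* texture or renderbuffer object name */
    int width, height; /* dimensions of the attached image */
    int samples;       /* GL_RENDERBUFFER_SAMPLES; 0 for textures */
} AttachmentInfo;

typedef struct {
    AttachmentInfo color, depth, stencil;
} FramebufferModel;

typedef enum {
    FBO_MODEL_OK,
    FBO_MODEL_NO_ATTACHMENT,
    FBO_MODEL_SIZE_MISMATCH,
    FBO_MODEL_SAMPLES_MISMATCH,
    FBO_MODEL_DEPTH_STENCIL_DIFFER
} FboModelResult;

static FboModelResult check_framebuffer_model(const FramebufferModel *fb)
{
    assert(fb != NULL);
    const AttachmentInfo *points[3] = { &fb->color, &fb->depth, &fb->stencil };
    const AttachmentInfo *first = NULL;

    for (int i = 0; i < 3; ++i) {
        const AttachmentInfo *a = points[i];
        if (!a->attached)
            continue;
        if (first == NULL) {
            first = a; /* later attachments are compared with this one */
            continue;
        }
        if (a->width != first->width || a->height != first->height)
            return FBO_MODEL_SIZE_MISMATCH;
        if (a->samples != first->samples)
            return FBO_MODEL_SAMPLES_MISMATCH;
    }
    if (first == NULL)
        return FBO_MODEL_NO_ATTACHMENT;
    /* Depth and stencil, when both present, must be the same image. */
    if (fb->depth.attached && fb->stencil.attached &&
        (fb->depth.isTexture != fb->stencil.isTexture ||
         fb->depth.name != fb->stencil.name))
        return FBO_MODEL_DEPTH_STENCIL_DIFFER;
    return FBO_MODEL_OK;
}
```

Because textures are modeled with a sample count of zero, the single equality test on samples covers both parts of the last rule: renderbuffers must agree with each other, and a mix of renderbuffers and textures must be single-sample.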

The glCheckFramebufferStatus command can be used to verify that a

framebuffer object is complete.

glCheckFramebufferStatus returns zero if an error occurs (for example,

if target is not one of the accepted values). Otherwise, one of the

following enums is returned:

GL_FRAMEBUFFER_COMPLETE—Framebuffer is complete.

GL_FRAMEBUFFER_UNDEFINED—If target is the default framebuffer

but it does not exist.

GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT—The framebuffer

attachment points are not complete. This might be due to the fact

that the required attachment is zero or is not a valid texture or

renderbuffer object.

GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT—No valid

attachments in the framebuffer.

GL_FRAMEBUFFER_UNSUPPORTED—The combination of internal formats

used by attachments in the framebuffer results in a nonrenderable target.

GL_FRAMEBUFFER_INCOMPLETE_MULTISAMPLE—

GL_RENDERBUFFER_SAMPLES is not the same for all renderbuffer

attachments or GL_RENDERBUFFER_SAMPLES is non-zero when the

attachments are a combination of renderbuffers and textures.

If the currently bound framebuffer object is not complete, attempts to use

that object for reading and writing pixels will fail. In turn, calls to draw

primitives, such as glDrawArrays and glDrawElements, and commands

that read the framebuffer, such as glReadPixels, glCopyTexImage2D,

glCopyTexSubImage2D, and glCopyTexSubImage3D, will generate a

GL_INVALID_FRAMEBUFFER_OPERATION error.
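During development it helps to turn the value returned by glCheckFramebufferStatus into a readable message. The helper below is a minimal sketch with hypothetical function names. So that it compiles without OpenGL ES headers, it defines the status codes itself, using the numeric values from the Khronos GLES3/gl3.h header; a real application would include that header instead.

```c
#include <assert.h>

/* Fallback definitions; values as in the Khronos GLES3/gl3.h header. */
#ifndef GL_FRAMEBUFFER_COMPLETE
#define GL_FRAMEBUFFER_UNDEFINED                      0x8219
#define GL_FRAMEBUFFER_COMPLETE                       0x8CD5
#define GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT          0x8CD6
#define GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT  0x8CD7
#define GL_FRAMEBUFFER_UNSUPPORTED                    0x8CDD
#define GL_FRAMEBUFFER_INCOMPLETE_MULTISAMPLE         0x8D56
#endif

/* Maps a glCheckFramebufferStatus result to a printable name. */
static const char *framebuffer_status_name(unsigned int status)
{
    switch (status) {
    case GL_FRAMEBUFFER_COMPLETE:
        return "GL_FRAMEBUFFER_COMPLETE";
    case GL_FRAMEBUFFER_UNDEFINED:
        return "GL_FRAMEBUFFER_UNDEFINED";
    case GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT:
        return "GL_FRAMEBUFFER_INCOMPLETE_ATTACHMENT";
    case GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT:
        return "GL_FRAMEBUFFER_INCOMPLETE_MISSING_ATTACHMENT";
    case GL_FRAMEBUFFER_UNSUPPORTED:
        return "GL_FRAMEBUFFER_UNSUPPORTED";
    case GL_FRAMEBUFFER_INCOMPLETE_MULTISAMPLE:
        return "GL_FRAMEBUFFER_INCOMPLETE_MULTISAMPLE";
    case 0:
        return "0 (glCheckFramebufferStatus generated an error)";
    default:
        return "unknown framebuffer status";
    }
}

/* Debug-build guard: aborts if the status is not "complete". */
static void assert_framebuffer_complete(unsigned int status)
{
    assert(status == GL_FRAMEBUFFER_COMPLETE && "framebuffer is incomplete");
}
```

A typical use is to log framebuffer_status_name(glCheckFramebufferStatus(GL_FRAMEBUFFER)) once, right after all attachments have been set, rather than before every draw call.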

GLenum glCheckFramebufferStatus(GLenum target)

target  must be set to GL_READ_FRAMEBUFFER,
        GL_DRAW_FRAMEBUFFER, or GL_FRAMEBUFFER

Framebuffer Blits

Framebuffer blits allow for efficient copying of a rectangle of pixel values

from one framebuffer (i.e., read framebuffer) to another framebuffer (i.e.,

draw framebuffer). One key application of framebuffer blits is to resolve a

multisample renderbuffer to a texture (with a framebuffer object that has

a texture bound for the color attachment).

You can perform this operation using the following command:

void glBlitFramebuffer( GLint srcX0, GLint srcY0,

GLint srcX1, GLint srcY1,

GLint dstX0, GLint dstY0,

GLint dstX1, GLint dstY1,

GLbitfield mask, GLenum filter)

srcX0, srcY0, srcX1, srcY1  specify the bounds of the source
                            rectangle within the read buffer
dstX0, dstY0, dstX1, dstY1  specify the bounds of the destination
                            rectangle within the draw buffer
mask    specifies the bitwise OR of the flags indicating which
        buffers are to be copied; consists of
        GL_COLOR_BUFFER_BIT
        GL_DEPTH_BUFFER_BIT
        GL_STENCIL_BUFFER_BIT
filter  specifies the interpolation to be applied if the image is
        stretched; must be GL_NEAREST or GL_LINEAR
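Note that each rectangle passed to glBlitFramebuffer is given as two corner points, not as an origin plus a size. As a minimal sketch, the helper below computes the destination corners for one quadrant of a window, measured from the lower-left origin. BlitRect, the QUAD_ constants, and quadrant_rect are hypothetical names introduced only for this illustration.

```c
#include <assert.h>

/* Corner points of a rectangle: (x0, y0) lower left, (x1, y1) upper right. */
typedef struct {
    int x0, y0, x1, y1;
} BlitRect;

enum {
    QUAD_LOWER_LEFT,
    QUAD_LOWER_RIGHT,
    QUAD_UPPER_LEFT,
    QUAD_UPPER_RIGHT
};

/* Returns the corners of the requested quadrant of a width x height window. */
static BlitRect quadrant_rect(int width, int height, int quadrant)
{
    BlitRect r;
    int halfW = width / 2;
    int halfH = height / 2;
    int right, upper;

    assert(quadrant >= QUAD_LOWER_LEFT && quadrant <= QUAD_UPPER_RIGHT);
    right = quadrant & 1;  /* odd indices select the right column */
    upper = quadrant >> 1; /* indices 2 and 3 select the upper row */

    r.x0 = right ? halfW : 0;
    r.y0 = upper ? halfH : 0;
    r.x1 = right ? width : halfW;
    r.y1 = upper ? height : halfH;
    return r;
}
```

The four fields of the result can be passed as the dstX0, dstY0, dstX1, and dstY1 arguments of glBlitFramebuffer, with the full source image as the source rectangle.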

Example 12-1 (as part of the Chapter_11/MRTs example) illustrates

how to use framebuffer blits to copy four color buffers from a

framebuffer object into four quadrants of the window for the default

framebuffer.

Example 12-1 Copying Pixels Using Framebuffer Blits

void BlitTextures ( ESContext *esContext )
{
   UserData *userData = esContext->userData;

   // set the default framebuffer for writing
   // (defaultFramebuffer is not declared in this excerpt; it is assumed
   // to hold the name of the default framebuffer)
   glBindFramebuffer ( GL_DRAW_FRAMEBUFFER,
                       defaultFramebuffer );

   // set the fbo with four color attachments for reading
   glBindFramebuffer ( GL_READ_FRAMEBUFFER, userData->fbo );

   // Copy the output red buffer to lower-left quadrant
   glReadBuffer ( GL_COLOR_ATTACHMENT0 );
   glBlitFramebuffer ( 0, 0,
                       esContext->width, esContext->height,
                       0, 0,
                       esContext->width/2, esContext->height/2,
                       GL_COLOR_BUFFER_BIT, GL_LINEAR );

   // Copy the output green buffer to lower-right quadrant
   glReadBuffer ( GL_COLOR_ATTACHMENT1 );
   glBlitFramebuffer ( 0, 0,
                       esContext->width, esContext->height,
                       esContext->width/2, 0,
                       esContext->width, esContext->height/2,
                       GL_COLOR_BUFFER_BIT, GL_LINEAR );

   // Copy the output blue buffer to upper-left quadrant
   glReadBuffer ( GL_COLOR_ATTACHMENT2 );
   glBlitFramebuffer ( 0, 0,
                       esContext->width, esContext->height,
                       0, esContext->height/2,
                       esContext->width/2, esContext->height,
                       GL_COLOR_BUFFER_BIT, GL_LINEAR );

   // Copy the output gray buffer to upper-right quadrant
   glReadBuffer ( GL_COLOR_ATTACHMENT3 );
   glBlitFramebuffer ( 0, 0,
                       esContext->width, esContext->height,
                       esContext->width/2, esContext->height/2,
                       esContext->width, esContext->height,
                       GL_COLOR_BUFFER_BIT, GL_LINEAR );
}

Framebuffer Invalidation

Framebuffer invalidation gives the application a mechanism to inform the

driver that the contents of the framebuffer are no longer needed. This allows

the driver to take several optimization steps: (1) skip unnecessary restoration

of the contents of the tiles in tile-based rendering (TBR) architecture for

further rendering to a framebuffer, (2) skip unnecessary data copying

between GPUs in multi-GPU systems, or (3) skip flushing certain caches in

some implementations to improve performance. This functionality is very

important to achieve peak performance in many applications, especially

those that perform significant amounts of off-screen rendering.

Let us review the design of TBR GPUs to understand why framebuffer

invalidation is important for such GPUs. TBR GPUs are commonly

employed on mobile devices to minimize the amount of data transferred

between the GPU and system memory and thereby reduce one of the

biggest consumers of power, memory bandwidth. This is done by adding

a fast on-chip memory that can hold a small amount of pixel data. The

framebuffer is then divided into many tiles. For each tile, primitives are

rendered into the on-chip memory, and then the results are copied to

the system memory once completed. Because only a minimal amount of

data per pixel (the nal pixel result) will be copied to the system memory,

this approach saves memory bandwidth between the GPU and system

memory.
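As a rough illustration of the tiling described above, the number of tiles is the framebuffer size divided by the tile size, rounded up in each dimension. The sketch below is plain C with hypothetical function names; the 32 x 32 tile size used in the note that follows is an assumption made only for the example, as actual tile dimensions vary by GPU.

```c
#include <stdint.h>

/* Number of tiles needed to cover `pixels` pixels along one axis,
   rounding up so that a partial tile at the edge is counted. */
uint32_t tiles_along(uint32_t pixels, uint32_t tileSize)
{
    return (pixels + tileSize - 1) / tileSize;
}

/* Total number of tiles covering a width x height framebuffer. */
uint32_t tile_count(uint32_t width, uint32_t height,
                    uint32_t tileWidth, uint32_t tileHeight)
{
    return tiles_along(width, tileWidth) * tiles_along(height, tileHeight);
}
```

Under these assumptions, a 1920 x 1080 framebuffer with 32 x 32 tiles is covered by 60 x 34 = 2040 tiles, and the color data for one tile at 4 bytes per pixel occupies 4096 bytes of on-chip memory.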

With framebuffer invalidation, the GPU can discard framebuffer contents that

are no longer required, reducing the amount of data that must be held per

frame. In addition, the GPU can skip the transfer of tile data from the

on-chip memory to the system memory if that data is no longer valid.

Because the memory bandwidth requirement

between the GPU and system memory can be reduced significantly, this

leads to reduced power consumption and improved performance.
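To get a feel for the potential savings, consider a back-of-the-envelope estimate of how many bytes would be written back to system memory for an attachment that could instead be invalidated. The sketch below is plain C arithmetic, not a measurement: the function names are hypothetical, the pixels are assumed to be tightly packed and uncompressed, and the figures in the note that follows (1920 x 1080, 4 bytes per pixel, 60 frames per second) are assumptions. Real drivers may compress or otherwise reduce this traffic.

```c
#include <stdint.h>

/* Bytes needed to write one full attachment back to system memory,
   assuming tightly packed pixels and no compression. */
uint64_t attachment_writeback_bytes(uint32_t width, uint32_t height,
                                    uint32_t bytesPerPixel)
{
    return (uint64_t)width * height * bytesPerPixel;
}

/* Bytes per second of write-back traffic avoided if that attachment
   is invalidated every frame instead of being written back. */
uint64_t invalidation_savings_per_second(uint32_t width, uint32_t height,
                                         uint32_t bytesPerPixel,
                                         uint32_t framesPerSecond)
{
    return attachment_writeback_bytes(width, height, bytesPerPixel)
           * framesPerSecond;
}
```

Under these assumptions, invalidating a 1920 x 1080 packed depth/stencil attachment (4 bytes per pixel) avoids 8,294,400 bytes of write-back per frame, or 497,664,000 bytes per second at 60 frames per second.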

The glInvalidateFramebuffer and glInvalidateSubFramebuffer

commands are used to invalidate the entire framebuffer or a pixel

subregion of the framebuffer.

void glInvalidateFramebuffer(GLenum target,

GLsizei numAttachments,

const GLenum *attachments)

void glInvalidateSubFramebuffer(GLenum target,

GLsizei numAttachments,

const GLenum *attachments,

GLint x, GLint y,

GLsizei width, GLsizei height)

target  must be set to GL_READ_FRAMEBUFFER,

GL_DRAW_FRAMEBUFFER, or GL_FRAMEBUFFER

numAttachments  number of attachments in the attachments list

attachments  pointer to an array of numAttachments attachments

x, y  specify the lower-left origin of the pixel rectangle to

invalidate (lower-left corner is 0,0)

(glInvalidateSubFramebuffer only)

width  specifies the width of the pixel rectangle to invalidate

(glInvalidateSubFramebuffer only)

height  specifies the height of the pixel rectangle to invalidate

(glInvalidateSubFramebuffer only)

Deleting Framebuffer and Renderbuffer Objects

After the application has finished using renderbuffer objects, they can be

deleted. Deleting renderbuffer and framebuffer objects is very similar to

deleting texture objects.

Renderbuffer objects are deleted using the glDeleteRenderbuffers API.

void glDeleteRenderbuffers(GLsizei n,

GLuint *renderbuffers)

n  number of renderbuffer object names to delete

renderbuffers  pointer to an array of n renderbuffer object names

to be deleted

glDeleteRenderbuffers deletes the renderbuffer objects specified in

renderbuffers. Once a renderbuffer object is deleted, it has no state

associated with it and is marked as unused; it can then later be reused as a

new renderbuffer object. When deleting a renderbuffer object that is also

the currently bound renderbuffer object, the renderbuffer object is deleted

and the current renderbuffer binding is reset to zero. If the renderbuffer

object names specified in renderbuffers are invalid or zero, they are

ignored (i.e., no error will be generated). Further, if the renderbuffer is

attached to the currently bound framebuffer object, it is first detached

from the framebuffer and only then deleted.

Framebuffer objects are deleted using the glDeleteFramebuffers API.

void glDeleteFramebuffers(GLsizei n,

GLuint *framebuffers)

n  number of framebuffer object names to delete

framebuffers  pointer to an array of n framebuffer object names

to be deleted

glDeleteFramebuffers deletes the framebuffer objects specified in

framebuffers. Once a framebuffer object is deleted, it has no state

associated with it and is marked as unused; it can then later be reused as

a new framebuffer object. When deleting a framebuffer object that is also

the currently bound framebuffer object, the framebuffer object is deleted

and the current framebuffer binding is reset to zero. If the framebuffer

object names specified in framebuffers are invalid or zero, they are

ignored and no error will be generated.

Deleting Renderbuffer Objects That Are Used as Framebuffer Attachments

What happens if a renderbuffer object being deleted is used as an

attachment in a framebuffer object? If the renderbuffer object to be

deleted is used as an attachment in the currently bound framebuffer

object, glDeleteRenderbuffers will reset the attachment to zero. If the

renderbuffer object to be deleted is used as an attachment in framebuffer

objects that are not currently bound, then glDeleteRenderbuffers

will not reset these attachments to zero. It is the responsibility of the

application to detach these deleted renderbuffer objects from the

appropriate framebuffer objects.

Reading Pixels and Framebuffer Objects

The glReadPixels command reads pixels from the color buffer and returns

them in a user-allocated buffer. The color buffer that will be read from

is the color buffer allocated by the window system–provided framebuffer

or the color attachment of the currently bound framebuffer object. When

a non-zero buffer object is bound to GL_PIXEL_PACK_BUFFER using

glBindBuffer, the glReadPixels command can return immediately and

invoke DMA transfer to read pixels from the framebuffer and write the data

into the pixel buffer object.

Several combinations of format and type arguments in glReadPixels

are supported: a format of GL_RGBA, GL_RGBA_INTEGER, or

implementation-specific values returned by querying

GL_IMPLEMENTATION_COLOR_READ_FORMAT; and a type of

GL_UNSIGNED_BYTE, GL_UNSIGNED_INT, GL_INT, GL_FLOAT, or

implementation-specific values returned by querying

GL_IMPLEMENTATION_COLOR_READ_TYPE. The implementation-specific

format and type returned will depend on the format and type of the

currently attached color buffer. These values can change if the currently

bound framebuffer changes. They must be queried whenever the

currently bound framebuffer object changes to determine the correct

implementation-specic format and type values that must be passed to

glReadPixels.
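When allocating the destination for glReadPixels, the buffer must be large enough for every row, including any padding implied by GL_PACK_ALIGNMENT (4 by default). The helpers below are a sketch in plain C with hypothetical names. They assume that bytesPerPixel is known for the chosen format and type pair (for example, 4 for GL_RGBA with GL_UNSIGNED_BYTE), that packAlignment is one of 1, 2, 4, or 8, and that no other pack parameters such as GL_PACK_ROW_LENGTH are in effect.

```c
#include <stddef.h>

/* Row stride in bytes: the row size rounded up to the next
   multiple of packAlignment. */
size_t read_pixels_row_stride(size_t width, size_t bytesPerPixel,
                              size_t packAlignment)
{
    size_t rowBytes = width * bytesPerPixel;
    return (rowBytes + packAlignment - 1) / packAlignment * packAlignment;
}

/* A buffer size that is large enough to hold height padded rows. */
size_t read_pixels_buffer_size(size_t width, size_t height,
                               size_t bytesPerPixel, size_t packAlignment)
{
    return read_pixels_row_stride(width, bytesPerPixel, packAlignment)
           * height;
}
```

For example, a row of 3 pixels at 3 bytes per pixel occupies 9 bytes but is padded to a 12-byte stride under the default alignment of 4, so two such rows need a 24-byte buffer.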

Examples

Let’s now look at some examples that demonstrate how to use framebuffer

objects. Example 12-2 demonstrates how to render to texture using

framebuffer objects. In this example, we draw to a texture using a

framebuffer object. We then use this texture to draw a quad to the

window system–provided framebuffer (i.e., the screen). Figure 12-2 shows

the generated image.

Example 12-2 Render to Texture

GLuint framebuffer;

GLuint depthRenderbuffer;

GLuint texture;

GLint texWidth = 256, texHeight = 256;

GLint maxRenderbufferSize;

GLenum status;

glGetIntegerv ( GL_MAX_RENDERBUFFER_SIZE, &maxRenderbufferSize);

// check if GL_MAX_RENDERBUFFER_SIZE is >= texWidth and texHeight

if ( ( maxRenderbufferSize < texWidth ) ||

( maxRenderbufferSize < texHeight ) )

{

// cannot use framebuffer objects, as we need to create

// a depth buffer as a renderbuffer object

// return with appropriate error

}

// generate the framebuffer, renderbuffer, and texture object names

glGenFramebuffers ( 1, &framebuffer );

glGenRenderbuffers ( 1, &depthRenderbuffer );

glGenTextures ( 1, &texture );

// bind texture and load the texture mip level 0

// texels are RGB565

// no texels need to be specified as we are going to draw into

// the texture

glBindTexture ( GL_TEXTURE_2D, texture );

glTexImage2D ( GL_TEXTURE_2D, 0, GL_RGB, texWidth, texHeight, 0,

GL_RGB, GL_UNSIGNED_SHORT_5_6_5, NULL );

glTexParameteri ( GL_TEXTURE_2D, GL_TEXTURE_WRAP_S,

GL_CLAMP_TO_EDGE );

glTexParameteri ( GL_TEXTURE_2D, GL_TEXTURE_WRAP_T,

GL_CLAMP_TO_EDGE );

glTexParameteri ( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER,

GL_LINEAR );

glTexParameteri ( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER,

GL_LINEAR);

// bind renderbuffer and create a 16-bit depth buffer

// width and height of renderbuffer = width and height of

// the texture

glBindRenderbuffer ( GL_RENDERBUFFER, depthRenderbuffer );

glRenderbufferStorage ( GL_RENDERBUFFER, GL_DEPTH_COMPONENT16,

texWidth, texHeight );

// bind the framebuffer

glBindFramebuffer ( GL_FRAMEBUFFER, framebuffer );

// specify texture as color attachment

glFramebufferTexture2D ( GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,

GL_TEXTURE_2D, texture, 0 );

// specify depth_renderbuffer as depth attachment

glFramebufferRenderbuffer ( GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,

GL_RENDERBUFFER, depthRenderbuffer);

// check for framebuffer complete

status = glCheckFramebufferStatus ( GL_FRAMEBUFFER );

if ( status == GL_FRAMEBUFFER_COMPLETE )

{

// render to texture using FBO

// clear color and depth buffer

glClearColor ( 0.0f, 0.0f, 0.0f, 1.0f );

glClear ( GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT );

// Load uniforms for vertex and fragment shaders

// used to render to FBO. The vertex shader is the

// ES 1.1 vertex shader described in Example 8-8 in

// Chapter 8. The fragment shader outputs the color

// computed by the vertex shader as fragment color and

// is described in Example 1-2 in Chapter 1.

set_fbo_texture_shader_and_uniforms( );

// drawing commands to the framebuffer object

draw_teapot( );

// render to window system-provided framebuffer

glBindFramebuffer ( GL_FRAMEBUFFER, 0 );

// Use texture to draw to window system-provided framebuffer.

// We draw a quad that is the size of the viewport.

// The vertex shader outputs the vertex position and texture

// coordinates passed as inputs.

// The fragment shader uses the texture coordinate to sample

// the texture and uses this as the per-fragment color value.

set_screen_shader_and_uniforms ( );

draw_screen_quad ( );

}

// clean up

glDeleteRenderbuffers ( 1, &depthRenderbuffer );

glDeleteFramebuffers ( 1, &framebuffer);

glDeleteTextures ( 1, &texture );

Figure 12-2 Render to Color Texture

In Example 12-2, we create the framebuffer, texture, and

depthRenderbuffer objects using the appropriate glGen*** commands.

The framebuffer object uses a color attachment that is a texture

object (texture) and a depth attachment that is a renderbuffer object

(depthRenderbuffer).

Before we create these objects, we query the maximum renderbuffer size

(GL_MAX_RENDERBUFFER_SIZE) to verify that the maximum renderbuffer

size supported by the implementation is greater than or equal to the width

and height of the texture that will be used as a color attachment. This step

ensures that we can create a depth renderbuffer successfully and use it as

the depth attachment in framebuffer.
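The size check reduces to one comparison per dimension. As a sketch in plain C (the function name is hypothetical), a requested size is usable when neither dimension exceeds the value returned for GL_MAX_RENDERBUFFER_SIZE; a size exactly equal to the limit is still valid.

```c
#include <stdbool.h>

/* Returns true if a width x height renderbuffer fits within the
   limit reported by the implementation for GL_MAX_RENDERBUFFER_SIZE. */
bool renderbuffer_size_supported(int maxRenderbufferSize,
                                 int width, int height)
{
    return width <= maxRenderbufferSize && height <= maxRenderbufferSize;
}
```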

After the objects have been created, we call glBindTexture(GL_TEXTURE_2D, texture) to

make the texture the currently bound texture object. The texture mip level

is then specied using glTexImage2D. Note that the pixels argument is

NULL: We are rendering to the entire texture region, so there is no reason

to specify any input data (this data will be overwritten).

The depthRenderbuffer object is bound using glBindRenderbuffer, and

glRenderbufferStorage is called to allocate storage for a 16-bit depth buffer.

The framebuffer object is bound using glBindFramebuffer. texture is

attached as a color attachment to framebuffer, and depthRenderbuffer

is attached as a depth attachment to framebuffer.

We next check the framebuffer status to see if it is complete before we

begin drawing into framebuffer. Once framebuffer rendering is complete,

we reset the currently bound framebuffer to the window system–provided

framebuffer by calling glBindFramebuffer(GL_FRAMEBUFFER, 0). We

can now use texture, which was used as a render target in framebuffer,

to draw to the window system–provided framebuffer.

In Example 12-2, the depth buffer attachment to framebuffer was a

renderbuffer object. In Example 12-3, we consider how to use a depth

texture as a depth buffer attachment to framebuffer. Applications can

render to the depth texture used as a framebuffer attachment from the

light source. The rendered depth texture can then be used as a shadow

map to calculate the percentage in shadow for each fragment. Figure 12-3

shows the generated image.
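One common way to obtain a "percentage in shadow" from a depth map is percentage-closer filtering: the fragment's depth as seen from the light is compared against several neighboring depth-map texels, and the results are averaged. The text does not show its shadow computation at this point, so the following is only a CPU-side sketch in plain C under stated assumptions: the depth map is a row-major array of floats, a 2 x 2 neighborhood is sampled with clamping at the edges, and the function name and the bias parameter are hypothetical. In a real application this comparison would run in the fragment shader.

```c
/* Clamp an index to the range [0, size - 1]. */
static int clamp_index(int i, int size)
{
    if (i < 0) return 0;
    if (i > size - 1) return size - 1;
    return i;
}

/* Returns the fraction (0.0 to 1.0) of the 2x2 texel neighborhood
   starting at (x, y) in which the fragment is farther from the light
   than the stored depth, i.e., the fraction that is in shadow. */
float shadow_fraction_2x2(const float *depthMap, int width, int height,
                          int x, int y, float fragmentDepth, float bias)
{
    int inShadow = 0;

    for (int dy = 0; dy < 2; ++dy) {
        for (int dx = 0; dx < 2; ++dx) {
            int sx = clamp_index(x + dx, width);
            int sy = clamp_index(y + dy, height);

            /* The bias guards against self-shadowing artifacts. */
            if (fragmentDepth - bias > depthMap[sy * width + sx]) {
                ++inShadow;
            }
        }
    }
    return (float)inShadow / 4.0f;
}
```

A result of 0.0 means the fragment is fully lit, 1.0 means it is fully shadowed, and intermediate values produce a softened shadow edge.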

Example 12-3 Render to Depth Texture

#define COLOR_TEXTURE 0

#define DEPTH_TEXTURE 1

GLuint framebuffer;

GLuint textures[2];

GLint texWidth = 256, texHeight = 256;

GLenum status;

// generate the framebuffer and texture object names

glGenFramebuffers ( 1, &framebuffer );

glGenTextures ( 2, textures );

// bind color texture and load the texture mip level 0

// texels are RGB565

// no texels need to be specified as we are going to draw into

// the texture

glBindTexture ( GL_TEXTURE_2D, textures[COLOR_TEXTURE] );

glTexImage2D ( GL_TEXTURE_2D, 0, GL_RGB, texWidth, texHeight, 0,

GL_RGB, GL_UNSIGNED_SHORT_5_6_5, NULL );

glTexParameteri ( GL_TEXTURE_2D, GL_TEXTURE_WRAP_S,

GL_CLAMP_TO_EDGE );

glTexParameteri ( GL_TEXTURE_2D, GL_TEXTURE_WRAP_T,

GL_CLAMP_TO_EDGE );

glTexParameteri ( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER,

GL_LINEAR );

glTexParameteri ( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER,

GL_LINEAR );

// bind depth texture and load the texture mip level 0

// no texels need to be specified as we are going to draw into

// the texture

glBindTexture ( GL_TEXTURE_2D, textures[DEPTH_TEXTURE] );

glTexImage2D ( GL_TEXTURE_2D, 0, GL_DEPTH_COMPONENT, texWidth,

texHeight, 0, GL_DEPTH_COMPONENT,

GL_UNSIGNED_SHORT, NULL );

glTexParameteri ( GL_TEXTURE_2D, GL_TEXTURE_WRAP_S,

GL_CLAMP_TO_EDGE );

glTexParameteri ( GL_TEXTURE_2D, GL_TEXTURE_WRAP_T,

GL_CLAMP_TO_EDGE );

glTexParameteri ( GL_TEXTURE_2D, GL_TEXTURE_MAG_FILTER,

GL_NEAREST );

glTexParameteri ( GL_TEXTURE_2D, GL_TEXTURE_MIN_FILTER,

GL_NEAREST );

// bind the framebuffer

glBindFramebuffer ( GL_FRAMEBUFFER, framebuffer );

// specify texture as color attachment

glFramebufferTexture2D ( GL_FRAMEBUFFER, GL_COLOR_ATTACHMENT0,

GL_TEXTURE_2D, textures[COLOR_TEXTURE],

0 );

// specify texture as depth attachment

glFramebufferTexture2D ( GL_FRAMEBUFFER, GL_DEPTH_ATTACHMENT,

GL_TEXTURE_2D, textures[DEPTH_TEXTURE],

0 );

// check for framebuffer complete

status = glCheckFramebufferStatus ( GL_FRAMEBUFFER );

if ( status == GL_FRAMEBUFFER_COMPLETE )

{

// render to color and depth textures using FBO

// clear color and depth buffers

glClearColor ( 0.0f, 0.0f, 0.0f, 1.0f );

glClear ( GL_COLOR_BUFFER_BIT | GL_DEPTH_BUFFER_BIT );

// Load uniforms for vertex and fragment shaders

// used to render to FBO. The vertex shader is the

// ES 1.1 vertex shader described in Example 8-8 in

// Chapter 8. The fragment shader outputs the color

// computed by vertex shader as fragment color and

// is described in Example 1-2 in Chapter 1.

set_fbo_texture_shader_and_uniforms( );

// drawing commands to the framebuffer object

draw_teapot( );

// render to window system-provided framebuffer

glBindFramebuffer ( GL_FRAMEBUFFER, 0 );

// Use depth texture to draw to window system framebuffer.

// We draw a quad that is the size of the viewport.

// The vertex shader outputs the vertex position and texture

// coordinates passed as inputs.

// The fragment shader uses the texture coordinate to sample

// the texture and uses this as the per-fragment color value.

set_screen_shader_and_uniforms( );

draw_screen_quad( );

}

// clean up

glDeleteFramebuffers ( 1, &framebuffer );

glDeleteTextures ( 2, textures );

Figure 12-3 Render to Depth Texture

Note: The width and height of the off-screen renderbuffers do not have to

be a power of 2.

Performance Tips and Tricks

Here, we discuss some performance tips that developers should carefully

consider when using framebuffer objects:

Avoid frequent switching between rendering to the window system–

provided framebuffer and rendering to framebuffer objects. This is an

issue for handheld OpenGL ES 3.0 implementations, as many of these

implementations use a tile-based rendering architecture. With a tile-based

rendering architecture, dedicated internal memory is used to store the

color, depth, and stencil values for a tile (i.e., region) of the framebuffer.

The internal memory is used as it is much more efcient in terms of

power utilization, and it has better memory latency and bandwidth

compared with going to external memory. After rendering to a tile is

completed, the tile is written out to device (or system) memory. Every

time you switch from one rendering target to another, the appropriate

texture and renderbuffer attachments will need to be rendered, saved,

and restored. This can become quite expensive. The best method would

be to render to the appropriate framebuffers in the scene rst, and

then render to the window system–provided framebuffer, followed by

execution of the eglSwapBuffers command to swap the display buffer.

Don’t create and destroy framebuffer and renderbuffer objects (or any

other large data objects for that matter) per frame.

Try to avoid modifying textures (using glTexImage2D,

glTexSubImage2D, glCopyTexImage2D, and so on) that are

attachments to framebuffer objects used as rendering targets.

Set the pixels argument in glTexImage2D and glTexImage3D to

NULL if the entire texture image will be rendered, as the original data

will not be used anyway. If the texture already contains pixel values that

are no longer needed, call glInvalidateFramebuffer before drawing to the

texture so that the driver can discard them.

Share depth and stencil renderbuffers as attachments used by

framebuffer objects wherever possible to keep the memory footprint

requirement to a minimum. We recognize that this recommendation

has limited use, as the width and height of these buffers have to be

the same. In a future version of OpenGL ES, the rule that the width

and height of various attachments of a framebuffer object must be

equal might be relaxed, making sharing easier.

Summary

In this chapter, you learned about the use of framebuffer objects for

rendering to off-screen surfaces. There are several uses of framebuffer

objects, the most common of which is for rendering to a texture.

You learned how to specify color, depth, and stencil attachments to

a framebuffer object and how to copy and invalidate pixels in the

framebuffer, and then saw some examples that demonstrated rendering

to a framebuffer object. Understanding framebuffer objects is critical for

implementing many advanced effects, such as reflections, shadow maps,

and postprocessing. Next, you will learn about sync objects and fences—

the mechanisms to synchronize the application and GPU execution.

Chapter 13

Sync Objects and Fences

OpenGL ES 3.0 provides a mechanism for the application to wait until

a set of OpenGL ES operations have nished executing on the GPU. You

can synchronize GL operations among multiple graphics contexts and

threads, which can be important in many advanced graphics applications.

For example, you may want to wait for transform feedback results before

using those results in your applications.

In this chapter, we discuss the ush command, the nish command,

and sync objects and fences, including why they are useful and how to

use them to synchronize operations in the graphics pipeline. Finally, we

conclude with an example of using sync objects and fences.

Flush and Finish

The OpenGL ES 3.0 API inherits the OpenGL client–server model.

The application, or client, issues commands, and these commands are

processed by the OpenGL ES implementation or server. In OpenGL, the

client and the server can reside on different machines over a network.

OpenGL ES also allows the client and server to reside on different

machines but because OpenGL ES targets handheld and embedded

platforms, the client and server will typically be on the same device.

In the client–server model, the commands issued by the client do not

necessarily get sent to the server immediately. If the client and server are

operating over a network, it will be very inefcient to send individual

commands over the network. Instead, the commands can be buffered on the

client side and then issued to the server at a later point in time. To support

this approach, a mechanism is needed that lets the client know when

the server has completed execution of previously submitted commands.

Consider another example where multiple OpenGL ES contexts (each current

to a different thread) are sharing objects. To synchronize correctly between

these contexts, it is important that commands from context A be issued to

the server before commands from context B, which depends on OpenGL

ES state modied by context A. The glFlush command is used to ush any

pending commands in the current OpenGL ES context and issue them to the

server. Note that glFlush only issues the commands to the server; it does

not wait for them to complete. If the client requires that the commands be

completed, the glFinish command should be used. We do not recommend

using glFinish unless absolutely necessary. Because glFinish does not

return until all queued commands in the context have been completely

processed by the server, calling glFinish can adversely impact performance

by forcing the client and the server to synchronize their operations.

Why Use a Sync Object?

OpenGL ES 3.0 introduces a new feature, called a fence, that provides a

way for the application to inform the GPU to wait until a set of OpenGL

ES operations have nished executing before queuing up more for

execution. You can insert a fence command into the GL command stream

and associate it with a sync object to be waited on.

If we compare using sync objects to the glFinish command, sync objects

are more efcient, as you can wait on partial completions of the GL

command stream. By comparison, calling the glFinish command may

reduce the performance of your applications, as this command will empty

the graphics pipeline.

Creating and Deleting a Sync Object

To insert a fence command to the GL command stream and create a sync

object, you can call the following function:

GLsync glFenceSync(GLenum condition, GLbitfield flags)

condition

species the condition that must be met to signal the sync

object; must be GL_SYNC_GPU_COMMANDS_COMPLETE

flags

species a bit-wise combination of ags to control the

behavior of the sync object; must be zero presently

When a sync object is rst created, its status is unsignaled. After the

specied condition is satised by the fence command, then its status

becomes signaled. Because sync objects cannot be reused, you must create

one sync object for each synchronization operation.

To delete a sync object, you can call the following function:

GLvoid glDeleteSync(GLsync sync)

sync species the sync object to be deleted

GLenum glClientWaitSync(GLsync sync, GLbitfield flags,

GLuint64 timeout)

sync species the sync object to wait on for its status

flags species a biteld controlling the command ushing

behavior; may be GL_SYNC_FLUSH_COMMANDS_BIT

timeout

species the timeout in nanoseconds to wait for the sync

object to be signaled

The deletion operation does not occur immediately, as the sync object will

be deleted only when no other operation is waiting for it. Thus you can

call the glDeleteSync command right after waiting for the sync object,

which is described next.

Waiting for and Signaling a Sync Object

You can block the client and wait for a sync object to be signaled with the

following call:

If the sync object is already at a signaled state, the glClientWaitSync

command will return immediately. Otherwise, the call will block and wait

up to timeout nanoseconds for the sync object to be signaled.

The glClientWaitSync function can return the following values:

GL_ALREADY_SIGNALED: the sync object was already at the signaled

state when the function was called.

GL_TIMEOUT_EXPIRED: the sync object did not become signaled after

timeout nanoseconds passed.

GL_CONDITION_SATISFIED: the sync object was signaled before the

timeout expired.

GL_WAIT_FAILED: an error occurred.
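Two details are easy to get wrong when calling glClientWaitSync: the timeout is expressed in nanoseconds, and two different return values (GL_ALREADY_SIGNALED and GL_CONDITION_SATISFIED) both mean that the sync object is signaled. The helpers below are a sketch in plain C with hypothetical function names. So that the snippet compiles without the OpenGL ES headers, the four return codes are defined under an #ifndef guard using the values from the Khronos gl3.h header; in a real application you would include <GLES3/gl3.h> instead.

```c
#include <stdbool.h>
#include <stdint.h>

/* Defined here only so the sketch builds without GL headers. */
#ifndef GL_ALREADY_SIGNALED
#define GL_ALREADY_SIGNALED    0x911A
#define GL_TIMEOUT_EXPIRED     0x911B
#define GL_CONDITION_SATISFIED 0x911C
#define GL_WAIT_FAILED         0x911D
#endif

/* Converts a timeout in milliseconds to the nanosecond value
   expected by the timeout argument of glClientWaitSync. */
uint64_t milliseconds_to_nanoseconds(uint32_t milliseconds)
{
    return (uint64_t)milliseconds * 1000000u;
}

/* Returns true if the value returned by glClientWaitSync indicates
   that the sync object is signaled. */
bool wait_result_is_signaled(unsigned int result)
{
    return result == GL_ALREADY_SIGNALED ||
           result == GL_CONDITION_SATISFIED;
}
```

An application would typically treat GL_TIMEOUT_EXPIRED as a reason to wait again or do other work, and GL_WAIT_FAILED as an error to be reported.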

The glWaitSync function is similar to the glClientWaitSync function,

except that the function returns immediately and blocks the GPU until

the sync object is signaled.

void glWaitSync(GLsync sync, GLbitfield flags,

GLuint64 timeout)

sync species the sync object to wait on for its status.

flags species a biteld controlling the command ushing

behavior; must be zero.

timeout species the timeout in nanoseconds that the server should

wait before continuing; must be GL_TIMEOUT_IGNORED.

Example

Example 13-1 shows an example of inserting a fence command after

transform feedback buffers are created (see the EmitParticles function

implementation) and blocking the GPU to wait on the transform feedback

results before drawing them (see the Draw function implementation). The

EmitParticles function and Draw function are executed by two separate

CPU threads.

This code segment is a part of the particle system with transform feedback

example that will be described in more detail in Chapter 14, “Advanced

Programming with OpenGL ES 3.0.”

Example 13-1 Inserting a Fence Command and Waiting for Its Result in

Transform Feedback Example

void EmitParticles ( ESContext *esContext, float deltaTime )

{

// Code omitted . . .

// Emit particles using transform feedback

glBeginTransformFeedback ( GL_POINTS );

glDrawArrays ( GL_POINTS, 0, NUM_PARTICLES );

glEndTransformFeedback ( );

// Create a sync object to ensure transform feedback results

// are completed before the draw that uses them

userData->emitSync =

glFenceSync ( GL_SYNC_GPU_COMMANDS_COMPLETE, 0 );

// Code omitted . . .

}

void Draw ( ESContext *esContext )

{

UserData *userData = ( UserData* ) esContext->userData;

// Block the GL server until transform feedback results

// are completed

glWaitSync ( userData->emitSync, 0, GL_TIMEOUT_IGNORED );

glDeleteSync ( userData->emitSync );

// Code omitted . . .

glDrawArrays ( GL_POINTS, 0, NUM_PARTICLES );

}

Summary

In this chapter, you learned about efficient primitives for synchronizing

between the host application and GPU execution in OpenGL ES 3.0. We

discussed how to use the sync objects and fences. In the next chapter,

you will see many advanced rendering examples that tie together all the

concepts you have learned so far throughout the book.

Chapter 14

Advanced Programming

with OpenGL ES 3.0

In this chapter, we put together many of the techniques you have learned

throughout this book to discuss some advanced uses of OpenGL ES 3.0.

A large number of advanced rendering techniques can be accomplished

with the programmable exibility of OpenGL ES 3.0. In this chapter, we

cover the following techniques:

Per-fragment lighting

Environment mapping

Particle system with point sprites

Particle system with transform feedback

Image postprocessing

Projective texturing

Noise using a 3D texture

Procedural textures

Terrain rendering with vertex texture fetch

Shadows using a depth texture

Per-Fragment Lighting

In Chapter 8, “Vertex Shaders,” we covered the lighting equations that can

be used in the vertex shader to calculate per-vertex lighting. Commonly,

to achieve higher-quality lighting, we seek to evaluate the lighting

equations on a per-fragment basis. In this section, we provide an example

of evaluating ambient, diffuse, and specular lighting on a per-fragment

basis. This example is a PVRShaman workspace that can be found in

Chapter_14/PVR_PerFragmentLighting, as pictured in Figure 14-1.

Several of the examples in this chapter make use of PVRShaman, a shader

development integrated development environment (IDE) that is part

of the Imagination Technologies PowerVR SDK (downloadable from

http://powervrinsider.com/).

Figure 14-1 Per-Fragment Lighting Example

Lighting with a Normal Map

Before we get into the details of the shaders used in the PVRShaman

workspace, we need to discuss the general approach that is used in the

example. The simplest way to do lighting per-fragment would be to

use the interpolated vertex normal in the fragment shader and then

move the lighting computations into the fragment shader. However, for

the diffuse term, this would really not yield much better results than

doing the lighting on a per-vertex basis. There would be the advantage

that the normal vector could be renormalized, which would remove

artifacts due to linear interpolation, but the overall quality would be

only minimally better. To really take advantage of the ability to do

computations on a per-fragment basis, we need to use a normal map

to store per-texel normals, a technique that can provide significantly

more detail.
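The point about renormalization can be checked with a little arithmetic: linearly interpolating two unit normals generally yields a vector shorter than unit length. The sketch below, in plain C with hypothetical type and function names, interpolates between two normals and returns the squared length of the result (squared length is used only to avoid needing a square root in the example).

```c
typedef struct {
    float x, y, z;
} Vec3;

/* Linear interpolation between a and b, as the rasterizer would do
   for a per-vertex normal (ignoring perspective correction). */
Vec3 lerp3(Vec3 a, Vec3 b, float t)
{
    Vec3 r;
    r.x = a.x + (b.x - a.x) * t;
    r.y = a.y + (b.y - a.y) * t;
    r.z = a.z + (b.z - a.z) * t;
    return r;
}

/* Squared length of v; equals 1.0 for a unit vector. */
float length_squared(Vec3 v)
{
    return v.x * v.x + v.y * v.y + v.z * v.z;
}
```

Halfway between the perpendicular unit normals (1, 0, 0) and (0, 1, 0), the interpolated vector is (0.5, 0.5, 0), whose squared length is 0.5 rather than 1. Renormalizing in the fragment shader removes this shortening.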

A normal map is a 2D texture that stores a normal vector at each texel.

The red channel represents the x component, the green channel the y

component, and the blue channel the z component. For a normal map

stored as GL_RGB8 with GL_UNSIGNED_BYTE data, the values will all be

in the range [0, 1]. To represent a normal, these values need to be scaled

and biased in the shader to remap to [−1, 1]. The following block of

fragment shader code shows how you would go about fetching from a

normal map:

// Fetch the tangent space normal from normal map

vec3 normal = texture(s_bumpMap, v_texcoord).xyz;

// Scale and bias from [0, 1] to [-1, 1] and normalize

normal = normalize(normal * 2.0 - 1.0);

As you can see, this small bit of shader code will fetch the color value from

a texture map and then multiply the results by 2 and subtract 1. The result

is that the values are rescaled into the [−1, 1] range from the [0, 1] range.

We could actually avoid this scale and bias in the shader code by using

a signed texture format such as GL_RGB8_SNORM, but for the purposes of

demonstration we are showing how to use a normal map stored in an

unsigned format. In addition, if the data in your normal map are not

normalized, you will need to normalize the results in the fragment shader.

This step can be skipped if your normal map contains all unit vectors.
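The same scale and bias can be written out on the CPU to see exactly which values the shader works with. The following plain C sketch uses hypothetical function names. It assumes 8-bit channels, where an unsigned normalized byte c is sampled as c / 255 and, following the OpenGL ES 3.0 conversion rule for signed normalized formats such as GL_RGB8_SNORM, a signed byte c is sampled as max(c / 127, -1).

```c
/* Unsigned normalized byte: sampled as a value in [0, 1], then
   scaled and biased to [-1, 1] as in the shader code above. */
float decode_unorm8_normal_component(unsigned char c)
{
    float sampled = (float)c / 255.0f;
    return sampled * 2.0f - 1.0f;
}

/* Signed normalized byte: already sampled as a value in [-1, 1],
   so no scale and bias is needed in the shader. */
float decode_snorm8_normal_component(signed char c)
{
    float sampled = (float)c / 127.0f;
    return sampled < -1.0f ? -1.0f : sampled;
}
```

One consequence of the unsigned encoding is that no byte value decodes to exactly 0: the bytes 127 and 128 map to roughly -0.004 and +0.004, whereas the signed encoding represents 0 exactly.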

The other signicant issue to tackle with per-fragment lighting has to

do with the space in which the normals in the texture are stored. To

minimize computations in the fragment shader, we do not want to have

to transform the result of the normal fetched from the normal map. One

way to accomplish this would be to store world-space normals in your

normal map. That is, the normal vectors in the normal map would each

represent a world-space normal vector. Then, the light and direction

vectors could be transformed into world space in the vertex shader and

could be directly used with the value fetched from the normal map.

However, some signicant issues arise when storing normal maps in world

space. Most importantly, the object must be assumed to be static because

no transformation can happen on the object. In addition, the same

surface oriented in different directions in space would not be able to share

the same texels in the normal map, which can result in much larger maps.

A better solution than using world-space normal maps is to store normal

maps in tangent space. The idea behind tangent space is that we dene a

space for each vertex using three coordinate axes: the normal, binormal,

and tangent. The normals stored in the texture map are then all stored

in this tangent space. Then, when we want to compute any lighting

equations, we transform our incoming lighting vectors into the tangent

space and those light vectors can then be used directly with the values in

the normal map. The tangent space is typically computed as a preprocess

and the binormal and tangent are added to the vertex attribute data. This

work is done automatically by PVRShaman, which computes a tangent

space for any model that has a vertex normal and texture coordinates.
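To make the transformation into tangent space concrete, here is a small C sketch. It assumes the tangent, binormal, and normal form an orthonormal basis, in which case transforming a vector into tangent space amounts to projecting it onto each of the three axes. The Vec3 type and function names are hypothetical.

```c
typedef struct { float x, y, z; } Vec3;

static float dot3(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Express v in the space whose axes are tangent, binormal, and normal.
   Assumes the three axes are unit length and mutually perpendicular. */
Vec3 to_tangent_space(Vec3 v, Vec3 tangent, Vec3 binormal, Vec3 normal)
{
    Vec3 r = { dot3(v, tangent), dot3(v, binormal), dot3(v, normal) };
    return r;
}
```

This is the same result a GLSL shader obtains by multiplying a vector on the left of a mat3 whose columns are the tangent, binormal, and normal.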


Lighting Shaders

Once we have tangent space normal maps and tangent space vectors set

up, we can proceed with per-fragment lighting. First, let’s look at the

vertex shader in Example 14-1.

Example 14-1 Per-Fragment Lighting Vertex Shader

    #version 300 es
    uniform mat4 u_matViewInverse;
    uniform mat4 u_matViewProjection;
    uniform vec3 u_lightPosition;
    uniform vec3 u_eyePosition;

    in vec4 a_vertex;
    in vec2 a_texcoord0;
    in vec3 a_normal;
    in vec3 a_binormal;
    in vec3 a_tangent;

    out vec2 v_texcoord;
    out vec3 v_viewDirection;
    out vec3 v_lightDirection;

    void main( void )
    {
        // Transform eye vector into world space
        vec3 eyePositionWorld =
            (u_matViewInverse * vec4(u_eyePosition, 1.0)).xyz;

        // Compute world-space direction vector
        vec3 viewDirectionWorld = eyePositionWorld - a_vertex.xyz;

        // Transform light position into world space
        vec3 lightPositionWorld =
            (u_matViewInverse * vec4(u_lightPosition, 1.0)).xyz;

        // Compute world-space light direction vector
        vec3 lightDirectionWorld = lightPositionWorld - a_vertex.xyz;

        // Create the tangent matrix
        mat3 tangentMat = mat3( a_tangent,
                                a_binormal,
                                a_normal );

        // Transform the view and light vectors into tangent space
        v_viewDirection = viewDirectionWorld * tangentMat;
        v_lightDirection = lightDirectionWorld * tangentMat;

        // Transform output position
        gl_Position = u_matViewProjection * a_vertex;

        // Pass through texture coordinate
        v_texcoord = a_texcoord0.xy;
    }

Note that the vertex shader inputs and uniforms are set up automatically
by PVRShaman by setting semantics in the PerFragmentLighting.pfx
file. We have two uniform matrices that we need as input to the
vertex shader: u_matViewInverse and u_matViewProjection. The
u_matViewInverse matrix contains the inverse of the view matrix. This
matrix is used to transform the light vector and the eye vector (which
are in view space) into world space. The first four statements in main
perform this transformation and compute the light vector and view vector
in world space. The next step in the shader is to create a tangent matrix.
The tangent space for the vertex is stored in three vertex attributes:
a_normal, a_binormal, and a_tangent. These three vectors define the
three coordinate axes of the tangent space for each vertex. We construct a
3 × 3 matrix out of these vectors to form the tangent matrix tangentMat.

The next step is to transform the view and light direction vectors into
tangent space by multiplying them by the tangentMat matrix. Remember,
our purpose here is to get the view and light direction vectors into the
same space as the normals in the tangent-space normal map. By doing
this transformation in the vertex shader, we avoid performing any
transformations in the fragment shader. Finally, we compute the final
output position and place it in gl_Position and pass the texture
coordinate along to the fragment shader in v_texcoord.

Now we have the view and light direction vectors in tangent space and a
texture coordinate passed as out variables to the fragment shader. The next
step is to actually light the fragments using the fragment shader, as shown in
Example 14-2.

Example 14-2 Per-Fragment Lighting Fragment Shader

    #version 300 es
    precision mediump float;

    uniform vec4 u_ambient;
    uniform vec4 u_specular;
    uniform vec4 u_diffuse;
    uniform float u_specularPower;

    uniform sampler2D s_baseMap;
    uniform sampler2D s_bumpMap;

    in vec2 v_texcoord;
    in vec3 v_viewDirection;
    in vec3 v_lightDirection;

    layout(location = 0) out vec4 fragColor;

    void main( void )
    {
        // Fetch base map color
        vec4 baseColor = texture(s_baseMap, v_texcoord);

        // Fetch the tangent space normal from normal map
        vec3 normal = texture(s_bumpMap, v_texcoord).xyz;

        // Scale and bias from [0, 1] to [-1, 1] and
        // normalize
        normal = normalize(normal * 2.0 - 1.0);

        // Normalize the light direction and view
        // direction
        vec3 lightDirection = normalize(v_lightDirection);
        vec3 viewDirection = normalize(v_viewDirection);

        // Compute N.L
        float nDotL = dot(normal, lightDirection);

        // Compute reflection vector
        vec3 reflection = (2.0 * normal * nDotL) -
                          lightDirection;

        // Compute R.V
        float rDotV =
            max(0.0, dot(reflection, viewDirection));

        // Compute ambient term
        vec4 ambient = u_ambient * baseColor;

        // Compute diffuse term
        vec4 diffuse = u_diffuse * nDotL * baseColor;

        // Compute specular term
        vec4 specular = u_specular *
                        pow(rDotV, u_specularPower);

        // Output final color
        fragColor = ambient + diffuse + specular;
    }


The first part of the fragment shader consists of a series of uniform declarations

for the ambient, diffuse, and specular colors. These values are stored in the

uniform variables u_ambient, u_diffuse, and u_specular, respectively.

The shader is also configured with two samplers, s_baseMap and s_bumpMap,

which are bound to a base color map and the normal map, respectively.

The body of the fragment shader first fetches the base color from the base

map and the normal values from the normal map. As described earlier,

the normal vector fetched from the texture map is scaled and biased and

then normalized so that it is a unit vector with components in the [−1, 1]

range. Next, the light vector and view vector are normalized and stored

in lightDirection and viewDirection. Normalization is necessary

because of the way fragment shader input variables are interpolated across

a primitive. The fragment shader input variables are linearly interpolated

across the primitive. When linear interpolation is done between two vectors,

the results can become denormalized during interpolation. To compensate

for this artifact, the vectors must be normalized in the fragment shader.
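The effect of interpolation on vector length can be checked with a few lines of C. This is only an illustrative sketch with hypothetical names (Vec3, lerp3, length_squared); it interpolates halfway between two perpendicular unit vectors and measures the squared length of the result.

```c
typedef struct { float x, y, z; } Vec3;

/* Linearly interpolate between a and b; t = 0 gives a, t = 1 gives b. */
Vec3 lerp3(Vec3 a, Vec3 b, float t)
{
    Vec3 r = {
        a.x + (b.x - a.x) * t,
        a.y + (b.y - a.y) * t,
        a.z + (b.z - a.z) * t
    };
    return r;
}

/* Squared length; equals 1.0 for a unit vector. */
float length_squared(Vec3 v)
{
    return v.x * v.x + v.y * v.y + v.z * v.z;
}
```

Interpolating halfway between (1, 0, 0) and (0, 1, 0) yields (0.5, 0.5, 0), whose squared length is 0.5 rather than 1. Using such a shortened vector directly in a dot product would darken the lighting, which is why the fragment shader renormalizes.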

Lighting Equations

At this point in the fragment shader, we now have a normal, light vector,
and view direction vector, all normalized and in the same space. This gives
us the inputs needed to compute the lighting equations. The lighting
computations performed in this shader are as follows:

\[
\begin{aligned}
\text{Ambient} &= k_{\text{Ambient}} \times C_{\text{Base}} \\
\text{Diffuse} &= k_{\text{Diffuse}} \times (N \cdot L) \times C_{\text{Base}} \\
\text{Specular} &= k_{\text{Specular}} \times \operatorname{pow}(\max(R \cdot V,\ 0.0),\ k_{\text{SpecularPower}})
\end{aligned}
\]

The $k$ constants for ambient, diffuse, and specular colors come from the
u_ambient, u_diffuse, and u_specular uniform variables, and the specular
exponent comes from u_specularPower. $C_{\text{Base}}$ is
the base color fetched from the base texture map. The dot product of the
light vector and the normal vector, $N \cdot L$, is computed and stored in the
nDotL variable in the shader. This value is used to compute the diffuse
lighting term. Finally, the specular computation requires $R$, which is the
reflection vector computed from the equation

\[
R = 2 \times N \times (N \cdot L) - L
\]

Notice that the reflection vector also requires $N \cdot L$, so the computation
used for the diffuse lighting term can be reused in the reflection vector
computation. Finally, the lighting terms are stored in the ambient,
diffuse, and specular variables in the shader. These results are summed
and finally stored in the fragColor output variable. The result is a
per-fragment lit object with normal data coming from the normal map.
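The lighting equations can be tried out on the CPU with a single-channel C sketch such as the one below. It is an illustration under stated assumptions rather than the sample's code: the names are hypothetical, colors are reduced to one float, and the specular power is taken to be a non-negative integer so that the exponent can be evaluated with a simple loop.

```c
typedef struct { float x, y, z; } Vec3;

static float dot3(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* Reflection vector R = 2 * N * (N . L) - L. */
Vec3 reflect_light(Vec3 n, Vec3 l)
{
    float nDotL = dot3(n, l);
    Vec3 r = {
        2.0f * n.x * nDotL - l.x,
        2.0f * n.y * nDotL - l.y,
        2.0f * n.z * nDotL - l.z
    };
    return r;
}

/* Ambient + diffuse + specular for one color channel.
   n, l, and v are assumed to be unit vectors in the same space. */
float shade(Vec3 n, Vec3 l, Vec3 v,
            float kAmbient, float kDiffuse, float kSpecular,
            int specularPower, float cBase)
{
    float nDotL = dot3(n, l);
    float rDotV = dot3(reflect_light(n, l), v);
    float specular = 1.0f;
    int i;

    if (rDotV < 0.0f) {
        rDotV = 0.0f;               /* max(R . V, 0.0) */
    }
    for (i = 0; i < specularPower; i++) {
        specular *= rDotV;          /* integer power by repeated multiply */
    }
    return kAmbient * cBase + kDiffuse * nDotL * cBase + kSpecular * specular;
}
```

Note that, as in the shader, the diffuse term is not clamped, so a surface facing away from the light contributes a negative diffuse value.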

Many variations are possible on per-fragment lighting. One common

technique is to store the specular exponent in a texture along with a specular

mask value. This allows the specular lighting to vary across a surface.

The main purpose of this example is to give you an idea of the types of

computations that are typically done for per-fragment lighting. The use

of tangent space, along with the computation of the lighting equations in

the fragment shader, is typical of many modern games. Of course, it is also

possible to add more lights, more material information, and much more.

Environment Mapping

The next rendering technique we cover—related to the previous

technique—is performing environment mapping using a cubemap.

The example we cover is the PVRShaman workspace

Chapter_14/PVR_EnvironmentMapping. The results are shown

in Figure 14-2.

Figure 14-2 Environment Mapping Example

The concept behind environment mapping is to render the reflection of

the environment on an object. In Chapter 9, “Texturing,” we introduced

cubemaps, which are commonly used to store environment maps. In the

PVRShaman example workspace, the environment of a mountain scene

is stored in a cubemap. The way such cubemaps can be generated is by

positioning a camera at the center of a scene and rendering along each of the

positive and negative major axis directions using a 90-degree field of view. For

reflections that change dynamically, we can render such a cubemap using a

framebuffer object dynamically for each frame. For a static environment, this

process can be done as a preprocess and the results stored in a static cubemap.

The vertex shader for the environment mapping example is provided in

Example 14-3.


Example 14-3 Environment Mapping Vertex Shader

    #version 300 es
    uniform mat4 u_matViewInverse;
    uniform mat4 u_matViewProjection;
    uniform vec3 u_lightPosition;

    in vec4 a_vertex;
    in vec2 a_texcoord0;
    in vec3 a_normal;
    in vec3 a_binormal;
    in vec3 a_tangent;

    out vec2 v_texcoord;
    out vec3 v_lightDirection;
    out vec3 v_normal;
    out vec3 v_binormal;
    out vec3 v_tangent;

    void main( void )
    {
        // Transform light position into world space
        vec3 lightPositionWorld =
            (u_matViewInverse * vec4(u_lightPosition, 1.0)).xyz;

        // Compute world-space light direction vector
        vec3 lightDirectionWorld = lightPositionWorld - a_vertex.xyz;

        // Pass the world-space light vector to the fragment shader
        v_lightDirection = lightDirectionWorld;

        // Transform output position
        gl_Position = u_matViewProjection * a_vertex;

        // Pass through other attributes
        v_texcoord = a_texcoord0.xy;
        v_normal = a_normal;
        v_binormal = a_binormal;
        v_tangent = a_tangent;
    }

The vertex shader in this example is very similar to the previous per-

fragment lighting example. The primary difference is that rather than

transforming the light direction vector into tangent space, we keep the

light vector in world space. The reason we must do this is because we

ultimately want to fetch from the cubemap using a world-space reflection

vector. As such, rather than transforming the light vectors into tangent

space, we will transform the normal vector from tangent space into world


space. To do so, the vertex shader passes the normal, binormal, and

tangent as varyings into the fragment shader so that a tangent matrix can

be constructed.
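The tangent-to-world transform is the reverse of the projection used for per-fragment lighting: the fetched normal's components act as weights on the three basis vectors. The C sketch below illustrates this with hypothetical names and assumes the tangent, binormal, and normal are already expressed in world space.

```c
typedef struct { float x, y, z; } Vec3;

/* Combine the basis vectors, weighted by the tangent-space components of n.
   Equivalent to multiplying n by a matrix whose columns are
   tangent, binormal, and normal. */
Vec3 tangent_to_world(Vec3 n, Vec3 tangent, Vec3 binormal, Vec3 normal)
{
    Vec3 r = {
        tangent.x * n.x + binormal.x * n.y + normal.x * n.z,
        tangent.y * n.x + binormal.y * n.y + normal.y * n.z,
        tangent.z * n.x + binormal.z * n.y + normal.z * n.z
    };
    return r;
}
```

An unperturbed tangent-space normal of (0, 0, 1) maps to the vertex normal itself, so a flat normal map leaves the surface shading unchanged.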

The fragment shader listing for the environment mapping sample is

provided in Example 14-4.

Example 14-4 Environment Mapping Fragment Shader

    #version 300 es
    precision mediump float;

    uniform vec4 u_ambient;
    uniform vec4 u_specular;
    uniform vec4 u_diffuse;
    uniform float u_specularPower;

    uniform sampler2D s_baseMap;
    uniform sampler2D s_bumpMap;
    uniform samplerCube s_envMap;

    in vec2 v_texcoord;
    in vec3 v_lightDirection;
    in vec3 v_normal;
    in vec3 v_binormal;
    in vec3 v_tangent;

    layout(location = 0) out vec4 fragColor;

    void main( void )
    {
        // Fetch base map color
        vec4 baseColor = texture( s_baseMap, v_texcoord );

        // Fetch the tangent space normal from normal map
        vec3 normal = texture( s_bumpMap, v_texcoord ).xyz;

        // Scale and bias from [0, 1] to [-1, 1]
        normal = normal * 2.0 - 1.0;

        // Construct a matrix to transform from tangent to
        // world space
        mat3 tangentToWorldMat = mat3( v_tangent,
                                       v_binormal,
                                       v_normal );

        // Transform normal to world space and normalize
        normal = normalize( tangentToWorldMat * normal );

        // Normalize the light direction
        vec3 lightDirection = normalize( v_lightDirection );

        // Compute N.L
        float nDotL = dot( normal, lightDirection );

        // Compute reflection vector
        vec3 reflection = ( 2.0 * normal * nDotL ) - lightDirection;

        // Use the reflection vector to fetch from the environment
        // map
        vec4 envColor = texture( s_envMap, reflection );

        // Output final color
        fragColor = 0.25 * baseColor + envColor;
    }

In the fragment shader, you will notice that the normal vector is fetched
from the normal map in the same way as in the per-fragment lighting
example. The difference in this example is that rather than leaving
the normal vector in tangent space, the fragment shader transforms
the normal vector into world space. This is done by constructing the
tangentToWorldMat matrix out of the v_tangent, v_binormal, and
v_normal varying vectors and then multiplying the fetched normal
vector by this new matrix. The reflection vector is then calculated using
the light direction vector and normal, both in world space. The result
of the computation is a reflection vector that is in world space, exactly
what we need to fetch from the cubemap as an environment map. This
vector is used to fetch into the environment map using the texture
function with the reflection vector as a texture coordinate. Finally,
the resultant fragColor is written as a combination of the base map
color and the environment map color. The base color is attenuated by
0.25 for the purposes of this example so that the environment map is
clearly visible.

This example demonstrates the basics of environment mapping. The
same basic technique can be used to produce a large variety of effects. For
example, the reflection may be attenuated using a Fresnel term to more
accurately model the reflection of light on a given material. As mentioned
earlier, another common technique is to dynamically render a scene into
a cubemap so that the environment reflection varies as an object moves
through a scene and the scene itself changes. Using the basic technique
shown here, you can extend the technique to accomplish more advanced
reflection effects.
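To give a feel for how a direction vector such as the reflection vector addresses a cubemap, the following C sketch picks the cube face from the component with the largest magnitude. The enum and function names are hypothetical, and the handling of ties (two components with equal magnitude) is an arbitrary choice made for this sketch. The hardware performs this selection as part of the cubemap fetch, so a shader never needs code like this.

```c
enum CubeFace {
    FACE_POSITIVE_X, FACE_NEGATIVE_X,
    FACE_POSITIVE_Y, FACE_NEGATIVE_Y,
    FACE_POSITIVE_Z, FACE_NEGATIVE_Z
};

/* Select the cube face that a direction vector points at: the face
   lies along the axis whose component has the largest magnitude. */
enum CubeFace cube_face_for_direction(float x, float y, float z)
{
    float ax = x < 0.0f ? -x : x;
    float ay = y < 0.0f ? -y : y;
    float az = z < 0.0f ? -z : z;

    if (ax >= ay && ax >= az) {
        return x >= 0.0f ? FACE_POSITIVE_X : FACE_NEGATIVE_X;
    }
    if (ay >= az) {
        return y >= 0.0f ? FACE_POSITIVE_Y : FACE_NEGATIVE_Y;
    }
    return z >= 0.0f ? FACE_POSITIVE_Z : FACE_NEGATIVE_Z;
}
```

Because only the direction matters for face selection, the vector used as the texture coordinate does not have to be unit length.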


Particle System with Point Sprites

The next example we cover is rendering a particle explosion using point

sprites. This example demonstrates how to animate a particle in a vertex

shader and how to render particles using point sprites. The example we

cover is the sample program in Chapter_14/ParticleSystem, the results

of which are pictured in Figure 14-3.

Particle System Setup

Before diving into the code for this example, it’s helpful to cover at a high

level the approach this sample uses. One of the goals here is to show how

to render a particle explosion without having any dynamic vertex data

modified by the CPU. That is, with the exception of uniform variables,

there are no changes to any of the vertex data as the explosion animates.

To accomplish this goal, a number of inputs are fed into the shaders.

At initialization time, the program initializes the following values in a

vertex array, one for each particle, based on a random value:

Lifetime—The lifetime of a particle in seconds.

Start position—The start position of a particle in the explosion.

End position—The final position of a particle in the explosion (the

particles are animated by linearly interpolating between the start and

end position).
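A sketch of how such a per-particle array might be filled at initialization time is shown below. The value ranges, NUM_PARTICLES, and the helper random_unit are assumptions made for illustration. The seven-float layout (lifetime, then end position, then start position) follows the attribute pointers set up in this sample's Draw function, with PARTICLE_SIZE assumed to be 7.

```c
#include <stdlib.h>

#define NUM_PARTICLES 1000  /* hypothetical particle count */
#define PARTICLE_SIZE 7     /* floats per particle: lifetime, end xyz, start xyz */

/* Pseudo-random value in [0, 1). */
static float random_unit(void)
{
    return (float)(rand() % 10000) / 10000.0f;
}

/* Fill particleData (NUM_PARTICLES * PARTICLE_SIZE floats) once, up front. */
void init_particles(float *particleData)
{
    int i, j;

    for (i = 0; i < NUM_PARTICLES; i++) {
        float *p = &particleData[i * PARTICLE_SIZE];

        p[0] = random_unit();                           /* lifetime in seconds */
        for (j = 0; j < 3; j++) {
            p[1 + j] = random_unit() * 2.0f - 1.0f;     /* end position, [-1, 1) */
            p[4 + j] = random_unit() * 0.25f - 0.125f;  /* start position near the center */
        }
    }
}
```

After this one-time setup the array is never written again; only uniforms change from frame to frame.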

Figure 14-3 Particle System Sample


In addition, each explosion has several global settings that are passed in as

uniforms:

Center position—The center of the explosion (the per-vertex

positions are offset from this center).

Color—An overall color for the explosion.

Time—The current time in seconds.

Particle System Vertex Shader

With this information, the vertex and fragment shaders are completely

responsible for the motion, fading, and rendering of the particles. Let’s

begin by looking at the vertex shader code for the sample in Example 14-5.

Example 14-5 Particle System Vertex Shader

    #version 300 es
    uniform float u_time;
    uniform vec3 u_centerPosition;

    layout(location = 0) in float a_lifetime;
    layout(location = 1) in vec3 a_startPosition;
    layout(location = 2) in vec3 a_endPosition;

    out float v_lifetime;

    void main()
    {
        if ( u_time <= a_lifetime )
        {
            gl_Position.xyz = a_startPosition +
                              (u_time * a_endPosition);
            gl_Position.xyz += u_centerPosition;
            gl_Position.w = 1.0;
        }
        else
        {
            gl_Position = vec4( -1000, -1000, 0, 0 );
        }

        v_lifetime = 1.0 - ( u_time / a_lifetime );
        v_lifetime = clamp ( v_lifetime, 0.0, 1.0 );
        gl_PointSize = ( v_lifetime * v_lifetime ) * 40.0;
    }

The first input to the vertex shader is the uniform variable u_time. This

variable is set to the current elapsed time in seconds by the application.

The value is reset to 0.0 when the time exceeds the length of a single


explosion. The next input to the vertex shader is the uniform variable

u_centerPosition. This variable is set to the center location of the

explosion at the start of a new explosion. The setup code for u_time

and u_centerPosition appears in the Update function in the C code

of the example program, which is provided in Example 14-6.

Example 14-6 Update Function for Particle System Sample

    void Update ( ESContext *esContext, float deltaTime )
    {
        UserData *userData = esContext->userData;

        userData->time += deltaTime;

        glUseProgram ( userData->programObject );

        if ( userData->time >= 1.0f )
        {
            float centerPos[3];
            float color[4];

            userData->time = 0.0f;

            // Pick a new start location and color
            centerPos[0] = ((float)(rand() % 10000) / 10000.0f) - 0.5f;
            centerPos[1] = ((float)(rand() % 10000) / 10000.0f) - 0.5f;
            centerPos[2] = ((float)(rand() % 10000) / 10000.0f) - 0.5f;

            glUniform3fv ( userData->centerPositionLoc, 1,
                           &centerPos[0] );

            // Random color
            color[0] = ((float)(rand() % 10000) / 20000.0f) + 0.5f;
            color[1] = ((float)(rand() % 10000) / 20000.0f) + 0.5f;
            color[2] = ((float)(rand() % 10000) / 20000.0f) + 0.5f;
            color[3] = 0.5;

            glUniform4fv ( userData->colorLoc, 1, &color[0] );
        }

        // Load uniform time variable
        glUniform1f ( userData->timeLoc, userData->time );
    }

As you can see, the Update function resets the time after 1 second elapses

and then sets up a new center location and color for another explosion.

The function also keeps the u_time variable up-to-date in each frame.

The vertex inputs to the vertex shader are the particle lifetime, particle start

position, and end position. These variables are all initialized to randomly

seeded values in the Init function in the program. The body of the vertex
shader first checks whether a particle’s lifetime has expired. If so, the
gl_Position variable is set to the value (-1000, -1000), which is just a
quick way of forcing the point to be off the screen. Because the point will
be clipped, all of the subsequent processing for the expired point sprites
can be skipped. If the particle is still alive, its position is computed as the
start position plus the end position scaled by u_time, offset by the center
of the explosion, so the particle moves along a straight line. Next, the vertex
shader passes the remaining lifetime of the particle down into the fragment
shader in the varying variable v_lifetime. The lifetime will be used in the
fragment shader to fade the particle as it ends its life. The final piece of the
vertex shader causes the point size to be based on the remaining lifetime
of the particle by setting the gl_PointSize built-in variable. This has the
effect of scaling the particles down as they reach the end of their life.
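The arithmetic performed by the vertex shader can be reproduced on the CPU for experimentation. The C sketch below mirrors the shader's expressions; the Vec3 type and function names are hypothetical, and the lifetime is assumed to be greater than zero.

```c
typedef struct { float x, y, z; } Vec3;

/* Fraction of life remaining: 1 at birth, 0 once time reaches lifetime.
   Assumes lifetime > 0. */
float remaining_life(float time, float lifetime)
{
    float f = 1.0f - time / lifetime;
    if (f < 0.0f) f = 0.0f;
    if (f > 1.0f) f = 1.0f;
    return f;
}

/* Point size falls off with the square of the remaining life;
   40.0 is the scale factor used in the vertex shader. */
float point_size(float time, float lifetime)
{
    float f = remaining_life(time, lifetime);
    return f * f * 40.0f;
}

/* Position of a live particle: start + time * end, offset by the center. */
Vec3 particle_position(Vec3 start, Vec3 end, Vec3 center, float time)
{
    Vec3 p = {
        start.x + time * end.x + center.x,
        start.y + time * end.y + center.y,
        start.z + time * end.z + center.z
    };
    return p;
}
```

For a particle with a lifetime of 1 second, the point size is 40 at time 0, 10 at time 0.5, and 0 once the lifetime has elapsed.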

Particle System Fragment Shader

The fragment shader code for the example program is provided in

Example 14-7.

Example 14-7 Particle System Fragment Shader

    #version 300 es
    precision mediump float;

    uniform vec4 u_color;
    in float v_lifetime;
    layout(location = 0) out vec4 fragColor;
    uniform sampler2D s_texture;

    void main()
    {
        vec4 texColor;
        texColor = texture( s_texture, gl_PointCoord );
        fragColor = vec4( u_color ) * texColor;
        fragColor.a *= v_lifetime;
    }

The first input to the fragment shader is the u_color uniform variable,

which is set at the beginning of each explosion by the Update function.

Next, the v_lifetime input variable set by the vertex shader is declared

in the fragment shader. In addition, a sampler is declared to which a 2D

texture image of smoke is bound.

The fragment shader itself is relatively simple. The texture fetch uses the

gl_PointCoord variable as a texture coordinate. This special variable

for point sprites is set to fixed values for the corners of the point sprite

(this process was described in Chapter 7, “Primitive Assembly and


Rasterization,” in the discussion of drawing primitives). One could also

extend the fragment shader to rotate the point sprite coordinates if

rotation of the sprite was required. This requires extra fragment shader

instructions, but increases the flexibility of the point sprite.

The texture color is attenuated by the u_color variable, and the alpha

value is attenuated by the particle lifetime. The application also enables

alpha blending with the following blend function:

glEnable ( GL_BLEND );

glBlendFunc ( GL_SRC_ALPHA, GL_ONE );

As a consequence of this code, the alpha produced in the fragment shader is

modulated with the fragment color. This value is then added into whatever

values are stored in the destination of the fragment. The result is an additive

blend effect for the particle system. Note that various particle effects will use

different alpha blending modes to accomplish the desired effect.
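The effect of this blend function on a single color channel can be written out as a short C sketch. The function name is hypothetical, and the clamp assumes a fixed-point color buffer whose values are limited to the [0, 1] range.

```c
/* Result of blending with glBlendFunc(GL_SRC_ALPHA, GL_ONE) for one channel:
   the source is weighted by its alpha and added to the destination as is. */
float blend_additive(float src, float srcAlpha, float dst)
{
    float result = src * srcAlpha + dst * 1.0f;
    return result > 1.0f ? 1.0f : result;   /* fixed-point buffers clamp at 1 */
}
```

Because every particle only adds light, overlapping particles brighten toward white. As the particle's alpha fades with its lifetime, its contribution shrinks toward zero.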

The code to actually draw the particles is shown in Example 14-8.

Example 14-8 Draw Function for Particle System Sample

    void Draw ( ESContext *esContext )
    {
        UserData *userData = esContext->userData;

        // Set the viewport
        glViewport ( 0, 0, esContext->width, esContext->height );

        // Clear the color buffer
        glClear ( GL_COLOR_BUFFER_BIT );

        // Use the program object
        glUseProgram ( userData->programObject );

        // Load the vertex attributes
        glVertexAttribPointer ( ATTRIBUTE_LIFETIME_LOC, 1,
                                GL_FLOAT, GL_FALSE,
                                PARTICLE_SIZE * sizeof(GLfloat),
                                userData->particleData );

        glVertexAttribPointer ( ATTRIBUTE_ENDPOSITION_LOC, 3,
                                GL_FLOAT, GL_FALSE,
                                PARTICLE_SIZE * sizeof(GLfloat),
                                &userData->particleData[1] );

        glVertexAttribPointer ( ATTRIBUTE_STARTPOSITION_LOC, 3,
                                GL_FLOAT, GL_FALSE,
                                PARTICLE_SIZE * sizeof(GLfloat),
                                &userData->particleData[4] );

        glEnableVertexAttribArray ( ATTRIBUTE_LIFETIME_LOC );
        glEnableVertexAttribArray ( ATTRIBUTE_ENDPOSITION_LOC );
        glEnableVertexAttribArray ( ATTRIBUTE_STARTPOSITION_LOC );

        // Blend particles
        glEnable ( GL_BLEND );
        glBlendFunc ( GL_SRC_ALPHA, GL_ONE );

        // Bind the texture
        glActiveTexture ( GL_TEXTURE0 );
        glBindTexture ( GL_TEXTURE_2D, userData->textureId );

        // Set the sampler texture unit to 0
        glUniform1i ( userData->samplerLoc, 0 );

        glDrawArrays ( GL_POINTS, 0, NUM_PARTICLES );
    }

The Draw function begins by setting the viewport and clearing the screen.
It then selects the program object to use and loads the vertex data using
glVertexAttribPointer. Note that because the values of the vertex array
never change, this example could have used vertex buffer objects rather
than client-side vertex arrays. In general, vertex buffer objects are
recommended for any vertex data that does not change because they reduce
the vertex bandwidth used. Vertex buffer objects were not used in this example
merely to keep the code a bit simpler. After setting the vertex arrays, the
function enables the blend function, binds the smoke texture, and then
uses glDrawArrays to draw the particles.

Unlike with triangles, there is no connectivity for point sprites, so using
glDrawElements does not really provide any advantage for rendering
point sprites in this example. However, often particle systems need to
be sorted by depth from back to front to achieve proper alpha blending
results. In such cases, one potential approach is to sort the element array to
modify the draw order. This technique is very efficient, because it requires
minimal bandwidth across the bus per frame (only the index data need be
changed, and they are almost always smaller than the vertex data).

This example has demonstrated a number of techniques that can be useful
in rendering particle systems using point sprites. The particles were animated
entirely on the GPU using the vertex shader. The sizes of the particles were
attenuated based on particle lifetime using the gl_PointSize variable. In
addition, the point sprites were rendered with a texture using the
gl_PointCoord built-in texture coordinate variable. These are the fundamental
elements needed to implement a particle system using OpenGL ES 3.0.
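The idea of sorting an element array by depth, mentioned in the discussion of glDrawElements, can be sketched in C with the standard qsort function. This is not part of the sample: the function names are hypothetical, and the depths array is assumed to hold each particle's distance from the viewer, computed by the application.

```c
#include <stdlib.h>

/* Depth table consulted by the comparator (qsort offers no context argument). */
static const float *g_depths;

/* Order two indices so that the particle with the larger depth comes first. */
static int compare_back_to_front(const void *a, const void *b)
{
    float da = g_depths[*(const unsigned short *)a];
    float db = g_depths[*(const unsigned short *)b];
    return (da < db) - (da > db);
}

/* Reorder indices so that drawing them in sequence renders far particles first.
   Only the index array changes; the vertex data stays where it is. */
void sort_back_to_front(unsigned short *indices, size_t count,
                        const float *depths)
{
    g_depths = depths;
    qsort(indices, count, sizeof indices[0], compare_back_to_front);
}
```

The sorted indices could then be drawn with glDrawElements using GL_POINTS and GL_UNSIGNED_SHORT, so only the small index array has to be updated each frame.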


Particle System Using Transform Feedback

The previous example demonstrated one technique for animating a

particle system in the vertex shader. Although it included an efficient

method for animating particles, the result was severely limited compared

to a traditional particle system. In a typical CPU-based particle system,

particles are emitted with different initial parameters such as position,

velocity, and acceleration and the paths are animated over the particle’s

lifetime. In the previous example, all of the particles were emitted

simultaneously and the paths were limited to a linear interpolation

between the start and end positions.

We can build a much more general-purpose GPU-based particle system by

using the transform feedback feature of OpenGL ES 3.0. To review, transform

feedback allows the outputs of the vertex shader to be stored in a buffer

object. As a consequence, we can implement a particle emitter completely

in a vertex shader on the GPU, store its output into a buffer object, and

then use that buffer object with another shader to draw the particles. In

general, transform feedback allows you to implement render to vertex

buffer (sometimes referred to by the shorthand R2VB), which means that a

wide range of algorithms can be moved from the CPU to the GPU.

The example we cover in this section is found in Chapter_14/

ParticleSystemTransformFeedback. It demonstrates emitting particles

for a fountain using transform feedback, as shown in Figure 14-4.

Figure 14-4 Particle System with Transform Feedback


Particle System Rendering Algorithm

This section provides a high-level overview of how the transform

feedback-based particle system works. At initialization time, two buffer

objects are allocated to hold the particle data. The algorithm ping-pongs

(switches back and forth) between the two buffers, each time switching

which buffer is the input or output for particle emission. Each particle

contains the following information: position, velocity, size, current time,

and lifetime.

The particle system is updated with transform feedback and then rendered

in the following steps:

In each frame, one of the particle VBOs is selected as the input

and bound as a GL_ARRAY_BUFFER. The output is bound as a

GL_TRANSFORM_FEEDBACK_BUFFER.

GL_RASTERIZER_DISCARD is enabled so that no fragments are drawn.

The particle emission shader is executed using point primitives (each

particle is one point). The vertex shader outputs new particles to

the transform feedback buffer and copies existing particles to the

transform feedback buffer unchanged.

GL_RASTERIZER_DISCARD is disabled, so that the application can draw

the particles.

The buffer that was rendered to for transform feedback is now bound

as a GL_ARRAY_BUFFER. Another vertex/fragment shader is bound to

draw the particles.

The particles are rendered to the framebuffer.

In the next frame, the input/output buffer objects are swapped and

the same process continues.
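The buffer bookkeeping behind the steps above can be sketched in a few lines of C. The PingPong struct and function names are hypothetical; the field names particleVBOs and curSrcIndex follow those used in the sample's EmitParticles function, and the buffer names are plain integers here so that the sketch runs without a GL context.

```c
typedef struct {
    unsigned int particleVBOs[2];  /* names of the two particle buffer objects */
    int curSrcIndex;               /* index of the buffer used as input this frame */
} PingPong;

/* Buffer read as vertex input (GL_ARRAY_BUFFER) during emission. */
unsigned int source_buffer(const PingPong *pp)
{
    return pp->particleVBOs[pp->curSrcIndex];
}

/* Buffer written by transform feedback, then drawn from. */
unsigned int destination_buffer(const PingPong *pp)
{
    return pp->particleVBOs[(pp->curSrcIndex + 1) % 2];
}

/* Called once per frame after drawing: last frame's output becomes the input. */
void swap_buffers(PingPong *pp)
{
    pp->curSrcIndex = (pp->curSrcIndex + 1) % 2;
}
```

Two buffers are needed because a buffer cannot safely be read as vertex input while the same draw call writes it as transform feedback output.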

Particle Emission with Transform Feedback

Example 14-9 shows the vertex shader that is used for emitting particles.

All of the output variables in this shader are written to a transform

feedback buffer object. Whenever a particle’s lifetime has expired, the

shader will make it a potential candidate for emission as a new active

particle. If a new particle is generated, the shader uses a randomValue

function (shown in the vertex shader code in Example 14-9) that generates

a random value to initialize the new particle’s velocity and size. The

random number generation is based on using a 3D noise texture and using

the gl_VertexID built-in variable to select a unique texture coordinate


for each particle. The details of creating and using a 3D Noise texture are

described in the Noise Using a 3D Texture section later in this chapter.

Example 14-9 Particle Emission Vertex Shader

#version 300 es

#define NUM_PARTICLES 200

#define ATTRIBUTE_POSITION 0

#define ATTRIBUTE_VELOCITY 1

#define ATTRIBUTE_SIZE 2

#define ATTRIBUTE_CURTIME 3

#define ATTRIBUTE_LIFETIME 4

uniform float u_time;

uniform float u_emissionRate;

uniform sampler3D s_noiseTex;

layout(location = ATTRIBUTE_POSITION) in vec2 a_position;

layout(location = ATTRIBUTE_VELOCITY) in vec2 a_velocity;

layout(location = ATTRIBUTE_SIZE) in float a_size;

layout(location = ATTRIBUTE_CURTIME) in float a_curtime;

layout(location = ATTRIBUTE_LIFETIME) in float a_lifetime;

out vec2 v_position;

out vec2 v_velocity;

out float v_size;

out float v_curtime;

out float v_lifetime;

float randomValue( inout float seed )

{

float vertexId = float( gl_VertexID ) /

float( NUM_PARTICLES );

vec3 texCoord = vec3( u_time, vertexId, seed );

seed += 0.1;

return texture( s_noiseTex, texCoord ).r;

}

void main()
{
   float seed = u_time;
   float lifetime = a_curtime - u_time;
   if( lifetime <= 0.0 && randomValue(seed) < u_emissionRate )
   {
      // Generate a new particle seeded with random values for
      // velocity and size
      v_position = vec2( 0.0, -1.0 );
      v_velocity = vec2( randomValue(seed) * 2.0 - 1.00,
                         randomValue(seed) * 0.4 + 2.0 );
      v_size = randomValue(seed) * 20.0 + 60.0;
      v_curtime = u_time;
      v_lifetime = 2.0;
   }
   else
   {
      // This particle has not changed; just copy it to the
      // output
      v_position = a_position;
      v_velocity = a_velocity;
      v_size = a_size;
      v_curtime = a_curtime;
      v_lifetime = a_lifetime;
   }
}

To use the transform feedback feature with this vertex shader, the output
variables must be tagged as being used for transform feedback before
linking the program object. This is done in the InitEmitParticles
function in the example code; the following snippet shows how the
program object is set up for transform feedback:

const char *feedbackVaryings[5] =
{
   "v_position",
   "v_velocity",
   "v_size",
   "v_curtime",
   "v_lifetime"
};

// Set the vertex shader outputs as transform
// feedback varyings
glTransformFeedbackVaryings ( userData->emitProgramObject, 5,
                              feedbackVaryings,
                              GL_INTERLEAVED_ATTRIBS );

// Link program must occur after calling
// glTransformFeedbackVaryings
glLinkProgram( userData->emitProgramObject );

The call to glTransformFeedbackVaryings ensures that the passed-in
output variables are used for transform feedback. The
GL_INTERLEAVED_ATTRIBS parameter specifies that the output variables
will be interleaved in the output buffer object. The order and layout of
the variables must match the expected layout of the buffer object. In
this case, our vertex structure is defined as follows:

typedef struct

{

float position[2];

float velocity[2];

float size;

float curtime;

float lifetime;

} Particle;

This structure definition matches the order and type of the varyings that
are passed in to glTransformFeedbackVaryings.
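One way to double-check this correspondence is to compute the interleaved layout on the CPU and compare it with the structure. The following is only a sketch: the enum, the component table, and the helper names are assumptions introduced here (they are not part of the example code), and it assumes 4-byte floats with no structure padding.

```c
#include <assert.h>
#include <stddef.h>

/* Same layout as the Particle structure shown above. */
typedef struct
{
    float position[2];
    float velocity[2];
    float size;
    float curtime;
    float lifetime;
} Particle;

/* Hypothetical indices, in the order the varyings are listed for
 * glTransformFeedbackVaryings. */
enum
{
    VARYING_POSITION,
    VARYING_VELOCITY,
    VARYING_SIZE,
    VARYING_CURTIME,
    VARYING_LIFETIME,
    VARYING_COUNT
};

/* Float components written by each varying: vec2, vec2, float, float, float. */
static const int varyingComponents[VARYING_COUNT] = { 2, 2, 1, 1, 1 };

/* Byte offset of one varying inside an interleaved record: the sum of the
 * sizes of all varyings listed before it. */
static size_t InterleavedOffset(int varying)
{
    size_t offset = 0;
    int i;

    assert(varying >= 0 && varying <= VARYING_COUNT);
    for (i = 0; i < varying; i++)
    {
        offset += (size_t)varyingComponents[i] * sizeof(float);
    }
    return offset;
}

/* Bytes written per vertex, that is, the stride of the output buffer. */
static size_t InterleavedStride(void)
{
    return InterleavedOffset(VARYING_COUNT);
}
```

Under these assumptions each record holds seven floats, so the stride equals sizeof(Particle) and every offset agrees with offsetof on the corresponding member.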

The code used to emit the particles is provided in the EmitParticles

function shown in Example 14-10.

Example 14-10 Emit Particles with Transform Feedback

void EmitParticles ( ESContext *esContext, float deltaTime )
{
   UserData *userData = (UserData *) esContext->userData;
   GLuint srcVBO =
      userData->particleVBOs[ userData->curSrcIndex ];
   GLuint dstVBO =
      userData->particleVBOs[(userData->curSrcIndex+1) % 2];

   glUseProgram( userData->emitProgramObject );

   // glVertexAttribPointer and glEnableVertexAttribArray
   // setup
   SetupVertexAttributes(esContext, srcVBO);

   // Set transform feedback buffer
   glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, dstVBO);

   // Turn off rasterization; we are not drawing
   glEnable(GL_RASTERIZER_DISCARD);

   // Set uniforms
   glUniform1f(userData->emitTimeLoc, userData->time);
   glUniform1f(userData->emitEmissionRateLoc, EMISSION_RATE);

   // Bind the 3D noise texture
   glActiveTexture(GL_TEXTURE0);
   glBindTexture(GL_TEXTURE_3D, userData->noiseTextureId);
   glUniform1i(userData->emitNoiseSamplerLoc, 0);


The destination buffer object is bound to the GL_TRANSFORM_FEEDBACK_BUFFER
target using glBindBufferBase. Rasterization is disabled by enabling
GL_RASTERIZER_DISCARD because we will not actually draw any fragments;
instead, we simply want to execute the vertex shader and output to the
transform feedback buffer. Finally, before the glDrawArrays call, we
enable transform feedback rendering by calling
glBeginTransformFeedback(GL_POINTS). Subsequent calls to glDrawArrays
using GL_POINTS will then be recorded in the transform feedback buffer
until glEndTransformFeedback is called. To ensure transform feedback
results are completed before the draw call that uses them, we create a
sync object and insert a fence command immediately after
glEndTransformFeedback is called. Prior to the draw call execution, we
wait on the sync object using the glWaitSync call. After executing the
draw call and restoring state, we ping-pong between the buffers so that
the next time EmitParticles is called, it will use the previous frame's
transform feedback output as the input.
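The ping-pong bookkeeping can be isolated from the GL calls. The sketch below is illustrative only: the PingPong structure and its helper functions are hypothetical stand-ins for the particleVBOs array and curSrcIndex field kept in UserData, and the buffer names are treated as plain integers.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the two particle VBO names and the
 * curSrcIndex field kept in UserData. */
typedef struct
{
    unsigned int buffers[2];
    int          curSrcIndex;
} PingPong;

/* Buffer read as vertex input this frame. */
static unsigned int PingPongSource(const PingPong *pp)
{
    assert(pp != NULL);
    return pp->buffers[pp->curSrcIndex];
}

/* Buffer that receives the transform feedback output this frame. */
static unsigned int PingPongDestination(const PingPong *pp)
{
    assert(pp != NULL);
    return pp->buffers[(pp->curSrcIndex + 1) % 2];
}

/* Swap roles so this frame's output becomes next frame's input. */
static void PingPongSwap(PingPong *pp)
{
    assert(pp != NULL);
    pp->curSrcIndex = (pp->curSrcIndex + 1) % 2;
}
```

Because the source and destination are always different buffers, the vertex shader never reads from the buffer it is currently writing.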

Rendering the Particles

After emitting the transform feedback buffer, that buffer is bound as

a vertex buffer object from which to render the particles. The vertex

shader used for particle rendering with point sprites is provided in

Example 14-11.

   // Emit particles using transform feedback
   glBeginTransformFeedback(GL_POINTS);
   glDrawArrays(GL_POINTS, 0, NUM_PARTICLES);
   glEndTransformFeedback();

   // Create a sync object to ensure transform feedback
   // results are completed before the draw that uses them
   userData->emitSync = glFenceSync(
      GL_SYNC_GPU_COMMANDS_COMPLETE, 0 );

   // Restore state
   glDisable(GL_RASTERIZER_DISCARD);
   glUseProgram(0);
   glBindBufferBase(GL_TRANSFORM_FEEDBACK_BUFFER, 0, 0);
   glBindTexture(GL_TEXTURE_3D, 0);

   // Ping-pong the buffers
   userData->curSrcIndex = ( userData->curSrcIndex + 1 ) % 2;
}

Example 14-10 Emit Particles with Transform Feedback (continued)

Example 14-11 Particle Rendering Vertex Shader

#version 300 es
#define ATTRIBUTE_POSITION 0
#define ATTRIBUTE_VELOCITY 1
#define ATTRIBUTE_SIZE 2
#define ATTRIBUTE_CURTIME 3
#define ATTRIBUTE_LIFETIME 4

layout(location = ATTRIBUTE_POSITION) in vec2 a_position;
layout(location = ATTRIBUTE_VELOCITY) in vec2 a_velocity;
layout(location = ATTRIBUTE_SIZE) in float a_size;
layout(location = ATTRIBUTE_CURTIME) in float a_curtime;
layout(location = ATTRIBUTE_LIFETIME) in float a_lifetime;

uniform float u_time;
uniform vec2 u_acceleration;

void main()
{
   float deltaTime = u_time - a_curtime;
   if ( deltaTime <= a_lifetime )
   {
      vec2 velocity = a_velocity + deltaTime * u_acceleration;
      vec2 position = a_position + deltaTime * velocity;
      gl_Position = vec4( position, 0.0, 1.0 );
      gl_PointSize = a_size * ( 1.0 - deltaTime / a_lifetime );
   }
   else
   {
      gl_Position = vec4( -1000, -1000, 0, 0 );
      gl_PointSize = 0.0;
   }
}

This vertex shader uses the transform feedback outputs as input variables.
The current age of each particle is computed based on the timestamp that
was stored at particle creation for each particle in the a_curtime attribute.
The particle's velocity and position are updated based on this time.
Additionally, the size of the particle is attenuated over the particle's life.

This example has demonstrated how to generate and render a particle
system entirely on the GPU. While the particle emitter and rendering were
relatively simple here, the same basic model can be used to create more
complex particle systems with more involved physics and properties. The
primary takeaway message is that transform feedback allows us to generate
new vertex data on the GPU without the need for any CPU code. This
powerful feature can be used for many algorithms that require generating
vertex data on the GPU.
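The per-particle evaluation performed by the particle rendering vertex shader in Example 14-11 can be mirrored on the CPU, which is handy for checking expected positions and sizes. This is a sketch only: the Vec2 and ParticleDraw types and the EvaluateParticle function are hypothetical names introduced here, and the formulas simply copy the shader's arithmetic.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float x, y; } Vec2;

typedef struct
{
    Vec2  position;   /* value written to gl_Position.xy */
    float pointSize;  /* value written to gl_PointSize   */
} ParticleDraw;

/* Returns 1 while the particle is alive, 0 once it has expired. */
static int EvaluateParticle(Vec2 startPos, Vec2 startVel, Vec2 accel,
                            float startSize, float birthTime,
                            float lifetime, float now, ParticleDraw *out)
{
    float deltaTime = now - birthTime;

    assert(out != NULL);
    assert(lifetime > 0.0f);

    if (deltaTime <= lifetime)
    {
        Vec2 velocity;
        velocity.x = startVel.x + deltaTime * accel.x;
        velocity.y = startVel.y + deltaTime * accel.y;
        out->position.x = startPos.x + deltaTime * velocity.x;
        out->position.y = startPos.y + deltaTime * velocity.y;
        /* Size shrinks linearly to zero over the particle's life. */
        out->pointSize  = startSize * (1.0f - deltaTime / lifetime);
        return 1;
    }

    /* Expired: the shader moves the point far away with zero size. */
    out->position.x = -1000.0f;
    out->position.y = -1000.0f;
    out->pointSize  = 0.0f;
    return 0;
}
```

For a particle emitted at (0, -1) with velocity (1, 2), acceleration (0, -1), size 80, and a lifetime of 2, evaluating one time unit after birth gives position (1, 0) and a point size of 40.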


Image Postprocessing

The next example covered in this chapter involves image postprocessing.

Using a combination of framebuffer objects and shaders, it is possible to

perform a wide variety of image postprocessing techniques. The first example

presented here is the simple blur effect in the PVRShaman workspace in

Chapter_14/PVR_PostProcess, results of which are pictured in Figure 14-5.

Figure 14-5 Image Postprocessing Example

Render-to-Texture Setup

This example renders a textured knot into a framebuffer object and
then uses the color attachment as a texture in a subsequent pass. A
full-screen quad is drawn to the screen using the rendered texture as a
source. A fragment shader is run over the full-screen quad, which performs
a blur filter. In general, many types of postprocessing techniques can be
accomplished using this pattern:

1. Render the scene into an off-screen framebuffer object (FBO).

2. Bind the FBO texture as a source and render a full-screen quad to the

screen.

3. Execute a fragment shader that performs filtering across the quad.

Some algorithms require performing multiple passes over an image; others

require more complicated inputs. However, the general idea is to use a

fragment shader over a full-screen quad that performs a postprocessing

algorithm.


Blur Fragment Shader

The fragment shader used on the full-screen quad in the blurring example

is provided in Example 14-12.

Example 14-12 Blur Fragment Shader

#version 300 es
precision mediump float;
uniform sampler2D renderTexture;
uniform float u_blurStep;
in vec2 v_texCoord;
layout(location = 0) out vec4 outColor;

void main(void)
{
   vec4 sample0,
        sample1,
        sample2,
        sample3;

   float fStep = u_blurStep / 100.0;

   // texture() replaces texture2D(), which is not available
   // in #version 300 es shaders
   sample0 = texture ( renderTexture,
      vec2 ( v_texCoord.x - fStep, v_texCoord.y - fStep ) );
   sample1 = texture ( renderTexture,
      vec2 ( v_texCoord.x + fStep, v_texCoord.y + fStep ) );
   sample2 = texture ( renderTexture,
      vec2 ( v_texCoord.x + fStep, v_texCoord.y - fStep ) );
   sample3 = texture ( renderTexture,
      vec2 ( v_texCoord.x - fStep, v_texCoord.y + fStep) );

   outColor = (sample0 + sample1 + sample2 + sample3) / 4.0;
}

This shader begins by computing the fStep variable, which is based on
the u_blurStep uniform variable. The fStep variable is used to determine
how much to offset the texture coordinate when fetching samples from
the image. A total of four different samples are taken from the image and
then averaged together at the end of the shader. The fStep variable is used
to offset the texture coordinate so that one sample is taken in each of
the four diagonal directions from the center. The larger the value
of fStep, the more the image is blurred. One possible optimization to this
shader would be to compute the offset texture coordinates in the vertex
shader and pass them into varyings in the fragment shader. This approach
would reduce the amount of computation done per fragment.
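The same four-tap diagonal average can be tried out on the CPU. The following sketch is not the shader itself: it works on a single-channel float image with integer texel offsets and clamp-to-edge addressing, all of which are simplifying assumptions, and the function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Fetch one texel with clamp-to-edge addressing (an assumption of this
 * sketch; the shader's behavior depends on the texture's wrap mode). */
static float FetchClamped(const float *image, int width, int height,
                          int x, int y)
{
    if (x < 0) x = 0;
    if (x > width - 1) x = width - 1;
    if (y < 0) y = 0;
    if (y > height - 1) y = height - 1;
    return image[y * width + x];
}

/* Average the four diagonal neighbors that lie `step` texels away. */
static float BlurTexel(const float *image, int width, int height,
                       int x, int y, int step)
{
    float sample0, sample1, sample2, sample3;

    assert(image != NULL);
    assert(width > 0 && height > 0 && step >= 0);

    sample0 = FetchClamped(image, width, height, x - step, y - step);
    sample1 = FetchClamped(image, width, height, x + step, y + step);
    sample2 = FetchClamped(image, width, height, x + step, y - step);
    sample3 = FetchClamped(image, width, height, x - step, y + step);

    return (sample0 + sample1 + sample2 + sample3) / 4.0f;
}
```

Note that the center texel does not contribute; with a step of 0 all four taps land on the same texel and the image is returned unchanged, while larger steps pull in values from farther away.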


Light Bloom

Now that we have looked at a simple image postprocessing technique, let’s

consider a slightly more complicated one. Using the blurring technique

we introduced in the previous example, we can implement an effect

known as light bloom. Light bloom is what happens when the eye views a

bright light contrasted with a darker surface—that is, the light color bleeds

into the darker surface. As you can see from the screenshot in Figure 14-6,

the car model color bleeds over the background. The algorithm works as

follows:

1. Clear an off-screen render target (rt0) and draw the object in black.
2. Blur the off-screen render target (rt0) into another render target (rt1)
   using a blur step of 1.0.
3. Blur the off-screen render target (rt1) back into the original render
   target (rt0) using a blur step of 2.0.
   Note: For more blur, repeat steps 2 and 3 as many times as needed,
   increasing the blur step each time.
4. Render the object to the back buffer.
5. Blend the final render target with the back buffer.

Figure 14-6 Light Bloom Effect


The process this algorithm uses is illustrated in Figure 14-7, which shows
each of the steps that goes into producing the final image. As you can see
in this figure, the object is first rendered in black to the render target.
That render target is then blurred into a second render target in the next
pass. The blurred render target is then blurred again, with an expanded
blur kernel, going back into the original render target. At the end, that
blurred render target is blended with the original scene. The amount of
bloom can be increased by ping-ponging the blur targets over and over. The
shader code for the blur steps is the same as in the previous example; the
only difference is that the blur step is being increased for each pass.
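The sequence of blur passes can be planned up front as a small table. This is a hypothetical sketch rather than code from the example workspace: the BlurPass structure and PlanBlurPasses function are introduced here for illustration, render targets are identified by index (0 for rt0, 1 for rt1), and the blur step is assumed to grow by 1.0 per pass.

```c
#include <assert.h>
#include <stddef.h>

/* One blur pass: read render target `src`, write render target `dst`. */
typedef struct
{
    int   src;
    int   dst;
    float blurStep;
} BlurPass;

/* Fill `passes` with `count` ping-pong blur passes between rt0 and rt1.
 * Assumption: the blur step grows by 1.0 per pass (1.0, 2.0, 3.0, ...). */
static void PlanBlurPasses(BlurPass *passes, int count)
{
    int i;

    assert(passes != NULL);
    assert(count >= 0);

    for (i = 0; i < count; i++)
    {
        passes[i].src      = i % 2;
        passes[i].dst      = (i + 1) % 2;
        passes[i].blurStep = (float)(i + 1);
    }
}
```

With an even number of passes the last write lands in rt0, so the render target that is blended with the back buffer is the same one the object was first drawn into.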

Figure 14-7 Light Bloom Stages

A large variety of other image postprocessing algorithms can be performed

using a combination of FBOs and shaders. Some other common

techniques include tone mapping, selective blurring, distortion, screen

transitions, and depth of field. Using the techniques shown here, you can

start to implement other postprocessing algorithms using shaders.

Projective Texturing

A technique that is used to produce many effects, such as shadow mapping
and reflections, is projective texturing. To introduce the topic of
projective texturing, we provide an example of rendering a projective
spotlight. Most of the complexity in using projective texturing derives
from the mathematics that goes into calculating the projective texture
coordinates. The method shown here could also be used to produce texture
coordinates for shadow mapping or reflections. The example offered here is
found in the projective spotlight PVRShaman workspace in
Chapter_14/PVR_ProjectiveSpotlight, the results of which are pictured in
Figure 14-8.

Figure 14-8 Projective Spotlight Example

Projective Texturing Basics

The example uses the 2D texture image pictured in Figure 14-9 and applies
it to the surface of a teapot using projective texturing. Projective
spotlights were a very common technique used to emulate per-pixel spotlight
falloff before shaders were introduced to GPUs. Projective spotlights can
still provide an attractive solution because of their high level of
efficiency. Applying the projective texture takes just a single texture
fetch instruction in the fragment shader and some setup in the vertex
shader. In addition, the 2D texture image that is projected can contain
virtually any picture, so many different effects can be achieved.

What, exactly, do we mean by projective texturing? At its most basic,
projective texturing is the use of a 3D texture coordinate to look up into
a 2D texture image. The (s, t) coordinates are divided by the (r)
coordinate such that a texel is fetched using (s/r, t/r). The OpenGL ES
Shading Language provides a special built-in function to do projective
texturing called textureProj.

vec4 textureProj(sampler2D sampler, vec3 coord [, float bias])

sampler   a sampler bound to a texture unit specifying the texture to
          fetch from.
coord     a 3D texture coordinate used to fetch from the texture map.
          The (x, y) arguments are divided by (z) such that the fetch
          occurs at (x/z, y/z).
bias      an optional LOD bias to apply.
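The divide that textureProj performs on its coordinate can be written out explicitly. The sketch below is a CPU-side illustration only; the TexCoord2 type and ProjectTexCoord function are hypothetical names, and the explicit check for a zero divisor is an addition made here for safety rather than something the shading language provides.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { float s, t; } TexCoord2;

/* Writes (s/r, t/r) to `out` and returns 1 when r is nonzero.
 * Returns 0 and leaves `out` untouched when r is zero. */
static int ProjectTexCoord(float s, float t, float r, TexCoord2 *out)
{
    assert(out != NULL);

    if (r == 0.0f)
    {
        return 0;
    }
    out->s = s / r;
    out->t = t / r;
    return 1;
}
```

For example, the 3D coordinate (1, 2, 4) fetches the texel at (0.25, 0.5).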


The idea behind projective lighting is to transform the position of an

object into the projective view space of a light. The projective light

space position, after application of a scale and bias, can then be used as

a projective texture coordinate. The vertex shader in the PVRShaman

example workspace does the work of transforming the position into the

projective view space of a light.

Matrices for Projective Texturing

There are three matrices that we need to transform the position

into projective view space of the light and get a projective texture

coordinate:

• Light projection: The projection matrix of the light source, built from
  the field of view, aspect ratio, and near and far planes of the light.
• Light view: The view matrix of the light source. This would be
  constructed just as if the light were a camera.
• Bias matrix: A matrix that transforms the light-space projected
  position into a 3D projective texture coordinate.

Figure 14-9 2D Texture Projected onto Object


The light projection matrix would be constructed just like any other

projection matrix, using the light’s parameters for field of view (FOV),

aspect ratio (aspect), and near (zNear) and far plane (zFar) distances.

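\[
\begin{pmatrix}
\dfrac{\cot\left(\frac{FOV}{2}\right)}{aspect} & 0 & 0 & 0 \\
0 & \cot\left(\frac{FOV}{2}\right) & 0 & 0 \\
0 & 0 & \dfrac{zFar + zNear}{zNear - zFar} & \dfrac{2 \times zFar \times zNear}{zNear - zFar} \\
0 & 0 & -1 & 0
\end{pmatrix}
\]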

The light view matrix is constructed by using the three primary axis
directions that define the light's view axes and the light's position. We
refer to the axes as the right, up, and look vectors.

\[
\begin{pmatrix}
right_x & up_x & look_x & 0 \\
right_y & up_y & look_y & 0 \\
right_z & up_z & look_z & 0 \\
-dot(right, lightPos) & -dot(up, lightPos) & -dot(look, lightPos) & 1
\end{pmatrix}
\]

After transforming the object's position by the view and projection
matrices, we must then turn the coordinates into projective texture
coordinates. This is accomplished by using a 3 × 3 bias matrix on the
(x, y, z) components of the position in projective light space. The bias
matrix does a linear transformation to go from the [−1, 1] range to the
[0, 1] range. Having the coordinates in the [0, 1] range is necessary for
the values to be used as texture coordinates.

\[
\begin{pmatrix}
0.5 & 0.0 & 0.0 \\
0.0 & -0.5 & 0.0 \\
0.5 & 0.5 & 1.0
\end{pmatrix}
\]

The view and bias matrices are written for a position treated as a row
vector multiplied on the left, which matches the a_vertex * lightView and
objPosLight.xyz * biasMatrix products in Example 14-13.

Typically, the matrix to transform the position into a projective texture

coordinate would be computed on the CPU by concatenating the

projection, view, and bias matrices together (using a 4 × 4 version of the bias

matrix). The result would then be loaded into a single uniform matrix that

could transform the position in the vertex shader. However, in the example,

we perform this computation in the vertex shader for illustrative purposes.


Example 14-13 Projective Texturing Vertex Shader

#version 300 es
uniform float u_time_0_X;
uniform mat4 u_matProjection;
uniform mat4 u_matViewProjection;
in vec4 a_vertex;
in vec2 a_texCoord0;
in vec3 a_normal;

out vec2 v_texCoord;
out vec3 v_projTexCoord;
out vec3 v_normal;
out vec3 v_lightDir;

void main( void )
{
   gl_Position = u_matViewProjection * a_vertex;
   v_texCoord = a_texCoord0.xy;

   // Compute a light position based on time
   vec3 lightPos;
   lightPos.x = cos(u_time_0_X);
   lightPos.z = sin(u_time_0_X);
   lightPos.xz = 200.0 * normalize(lightPos.xz);
   lightPos.y = 200.0;

   // Compute the light coordinate axes
   vec3 look = -normalize( lightPos );
   vec3 right = cross( vec3( 0.0, 0.0, 1.0), look );
   vec3 up = cross( look, right );

   // Create a view matrix for the light
   mat4 lightView = mat4( right, dot( right, -lightPos ),
                          up, dot( up, -lightPos ),
                          look, dot( look, -lightPos),
                          0.0, 0.0, 0.0, 1.0 );

   // Transform position into light view space
   vec4 objPosLight = a_vertex * lightView;

   // Transform position into projective light view space
   objPosLight = u_matProjection * objPosLight;

Projective Spotlight Shaders

Now that we have covered the basic mathematics, we can examine the

vertex shader in Example 14-13.


The first operation this shader does is to transform the position by the

u_matViewProjection matrix and output the texture coordinate for the

base map to the v_texCoord output variable. Next, the shader computes a

position for the light based on time. This bit of the code can really be ignored,

but it was added to animate the light in the vertex shader. In a typical

application, this step would be done on the CPU and not in the shader.

Based on the position of the light, the vertex shader then computes the

three coordinate axis vectors for the light and places the results into

the look, right, and up variables. Those vectors are used to create a

view matrix for the light in the lightView variable using the equations

previously described. The input position for the object is then transformed

by the lightView matrix, which transforms the position into light space.

The next step is to use the perspective matrix to transform the light space

position into projected light space. Rather than creating a new perspective

matrix for the light, this example uses the u_matProjection matrix for

the camera. Typically, a real application would want to create its own

projection matrix for the light based on how big the cone angle and falloff

distance are.

Once the position is transformed into projective light space, a biasMatrix

is created to transform the position into a projective texture coordinate.

The final projective texture coordinate is stored in the vec3 output

variable v_projTexCoord. In addition, the vertex shader passes the light

direction and normal vectors into the fragment shader in the v_lightDir

and v_normal variables. These vectors will be used to determine whether a

fragment is facing the light source so as to mask off the projective texture

for fragments facing away from the light.

The fragment shader performs the actual projective texture fetch that

applies the projective spotlight texture to the surface (Example 14-14).

   // Create bias matrix
   mat3 biasMatrix = mat3( 0.5, 0.0, 0.5,
                           0.0, -0.5, 0.5,
                           0.0, 0.0, 1.0 );

   // Compute projective texture coordinates
   v_projTexCoord = objPosLight.xyz * biasMatrix;

   v_lightDir = normalize(a_vertex.xyz - lightPos);
   v_normal = a_normal;
}

Example 14-13 Projective Texturing Vertex Shader (continued)


The first operation that the fragment shader performs is the projective
texture fetch using textureProj. As you can see, the projective texture
coordinate that was computed in the vertex shader and passed in
the input variable v_projTexCoord is used to perform the projective
texture fetch. The wrap modes for the projective texture are set to
GL_CLAMP_TO_EDGE and the minification/magnification filters are both
set to GL_LINEAR. The fragment shader then fetches the color from the
base map using the v_texCoord variable. Next, the shader computes the
dot product of the light direction and the normal vector; this result is
used to attenuate the final color so that the projective spotlight is not
applied to fragments that are facing away from the light. Finally, all of
the components are multiplied together (and scaled by 2.0 to increase the
brightness). This gives us the final image of the teapot lit by the
projective spotlight (refer back to Figure 14-8).
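The masking step can be checked in isolation. The sketch below mirrors the arithmetic of the fragment shader in Example 14-14 for a single color channel; the Vec3 type and the function names are hypothetical, and the light direction is assumed to point from the light toward the surface, as v_lightDir does in the vertex shader.

```c
#include <assert.h>

typedef struct { float x, y, z; } Vec3;

static float Dot3(Vec3 a, Vec3 b)
{
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

/* max(0, -dot(N, L)), where L points from the light toward the surface.
 * The result is zero for surfaces that face away from the light. */
static float FacingTerm(Vec3 normal, Vec3 lightDir)
{
    float nDotL = -Dot3(normal, lightDir);
    return (nDotL > 0.0f) ? nDotL : 0.0f;
}

/* One channel of outColor = spotLightColor * baseColor * 2.0 * nDotL. */
static float ShadeChannel(float spotLight, float base, float nDotL)
{
    assert(nDotL >= 0.0f);
    return spotLight * base * 2.0f * nDotL;
}
```

A surface whose normal points straight back at the light receives the full projected color, while one facing the opposite way receives none.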

As mentioned at the beginning of this section, the key takeaway lesson

from this example is the set of computations that go into computing a

projective texture coordinate. The computation shown here is the exact

same computation that you would use to produce a coordinate to fetch

Example 14-14 Projective Texturing Fragment Shader

#version 300 es

precision mediump float;

uniform sampler2D baseMap;

uniform sampler2D spotLight;

in vec2 v_texCoord;

in vec3 v_projTexCoord;

in vec3 v_normal;

in vec3 v_lightDir;

out vec4 outColor;

void main( void )

{

// Projective fetch of spotlight

vec4 spotLightColor =

textureProj( spotLight, v_projTexCoord );

// Base map

vec4 baseColor = texture( baseMap, v_texCoord );

// Compute N.L

float nDotL = max( 0.0, -dot( v_normal, v_lightDir ) );

outColor = spotLightColor * baseColor * 2.0 * nDotL;

}

from a shadow map. Similarly, rendering reflections with projective
texturing requires that you transform the position into the projective
view space of the reflection camera. You would do the same thing we
have done here, but substitute the reflection camera matrices for the
light matrices. Projective texturing is a very powerful tool in creating
advanced effects, and you should now understand the basics of how to use it.

Noise Using a 3D Texture

The next rendering technique we cover is using a 3D texture for noise. In

Chapter 9, “Texturing,” we introduced the basics of 3D textures. As you will

recall, a 3D texture is essentially a stack of 2D texture slices representing

a 3D volume. 3D textures have many possible uses, one of which is the

representation of noise. In this section, we show an example of using a 3D

volume of noise to create a wispy fog effect. This example builds on the linear

fog example from Chapter 10, “Fragment Shaders.” The example is found in

Chapter_14/Noise3D, the results of which are shown in Figure 14-10.

Figure 14-10 Fog Distorted by 3D Noise Texture

Generating Noise

The application of noise is a very common technique that plays a

role in a large variety of 3D effects. The OpenGL Shading Language

(not OpenGL ES Shading Language) included functions for computing

noise in one, two, three, and four dimensions. These functions return a

pseudorandom continuous noise value that is repeatable based on the

input value. Unfortunately, the functions are expensive to implement.

Most programmable GPUs did not implement noise functions natively in

hardware, which meant the noise computations had to be implemented

using shader instructions (or worse, in software on the CPU). It takes

a lot of shader instructions to implement these noise functions, so the

performance was too slow to be used in most real-time fragment shaders.

Recognizing this problem, the OpenGL ES working group decided to drop

noise from the OpenGL ES Shading Language (although vendors are still

free to expose it through an extension).


Although computing noise in the fragment shader is prohibitively

expensive, we can work around the problem using a 3D texture. It is

possible to easily produce acceptable-quality noise by precomputing

the noise and placing the results in a 3D texture. A number of

algorithms can be used to generate noise. The list of references and

links described at the end of this chapter can be used to obtain more

information about the various noise algorithms. Here, we discuss a

specific algorithm that generates a lattice-based gradient noise. Ken

Perlin’s noise function (Perlin, 1985) is a lattice-based gradient noise and

a widely used method for generating noise. For example, a lattice-based

gradient noise is implemented by the noise function in the RenderMan

shading language.

The gradient noise algorithm takes a 3D coordinate as input and returns

a floating-point noise value. To generate this noise value given an input

(x, y, z), we map the x, y, and z values to appropriate integer locations

in a lattice. The number of cells in a lattice is programmable and for our

implementation is set to 256 cells. For each cell in the lattice, we need

to generate and store a pseudorandom gradient vector. Example 14-15

describes how these gradient vectors are generated.

Example 14-15 Generating Gradient Vectors

// permTable describes a random permutation of

// 8-bit values from 0 to 255

static unsigned char permTable[256] = {

0xE1, 0x9B, 0xD2, 0x6C, 0xAF, 0xC7, 0xDD, 0x90,

0xCB, 0x74, 0x46, 0xD5, 0x45, 0x9E, 0x21, 0xFC,

0x05, 0x52, 0xAD, 0x85, 0xDE, 0x8B, 0xAE, 0x1B,

0x09, 0x47, 0x5A, 0xF6, 0x4B, 0x82, 0x5B, 0xBF,

0xA9, 0x8A, 0x02, 0x97, 0xC2, 0xEB, 0x51, 0x07,

0x19, 0x71, 0xE4, 0x9F, 0xCD, 0xFD, 0x86, 0x8E,

0xF8, 0x41, 0xE0, 0xD9, 0x16, 0x79, 0xE5, 0x3F,

0x59, 0x67, 0x60, 0x68, 0x9C, 0x11, 0xC9, 0x81,

0x24, 0x08, 0xA5, 0x6E, 0xED, 0x75, 0xE7, 0x38,

0x84, 0xD3, 0x98, 0x14, 0xB5, 0x6F, 0xEF, 0xDA,

0xAA, 0xA3, 0x33, 0xAC, 0x9D, 0x2F, 0x50, 0xD4,

0xB0, 0xFA, 0x57, 0x31, 0x63, 0xF2, 0x88, 0xBD,

0xA2, 0x73, 0x2C, 0x2B, 0x7C, 0x5E, 0x96, 0x10,

0x8D, 0xF7, 0x20, 0x0A, 0xC6, 0xDF, 0xFF, 0x48,

0x35, 0x83, 0x54, 0x39, 0xDC, 0xC5, 0x3A, 0x32,

0xD0, 0x0B, 0xF1, 0x1C, 0x03, 0xC0, 0x3E, 0xCA,

0x12, 0xD7, 0x99, 0x18, 0x4C, 0x29, 0x0F, 0xB3,

0x27, 0x2E, 0x37, 0x06, 0x80, 0xA7, 0x17, 0xBC,

0x6A, 0x22, 0xBB, 0x8C, 0xA4, 0x49, 0x70, 0xB6,


0xF4, 0xC3, 0xE3, 0x0D, 0x23, 0x4D, 0xC4, 0xB9,

0x1A, 0xC8, 0xE2, 0x77, 0x1F, 0x7B, 0xA8, 0x7D,

0xF9, 0x44, 0xB7, 0xE6, 0xB1, 0x87, 0xA0, 0xB4,

0x0C, 0x01, 0xF3, 0x94, 0x66, 0xA6, 0x26, 0xEE,

0xFB, 0x25, 0xF0, 0x7E, 0x40, 0x4A, 0xA1, 0x28,

0xB8, 0x95, 0xAB, 0xB2, 0x65, 0x42, 0x1D, 0x3B,

0x92, 0x3D, 0xFE, 0x6B, 0x2A, 0x56, 0x9A, 0x04,

0xEC, 0xE8, 0x78, 0x15, 0xE9, 0xD1, 0x2D, 0x62,

0xC1, 0x72, 0x4E, 0x13, 0xCE, 0x0E, 0x76, 0x7F,

0x30, 0x4F, 0x93, 0x55, 0x1E, 0xCF, 0xDB, 0x36,

0x58, 0xEA, 0xBE, 0x7A, 0x5F, 0x43, 0x8F, 0x6D,

0x89, 0xD6, 0x91, 0x5D, 0x5C, 0x64, 0xF5, 0x00,

0xD8, 0xBA, 0x3C, 0x53, 0x69, 0x61, 0xCC, 0x34,

};

#define NOISE_TABLE_MASK 255

// lattice gradients 3D noise
static float gradientTable[256*3];

#define FLOOR(x) ((int)(x) - ((x) < 0 && (x) != (int)(x)))
#define smoothstep(t) ((t) * (t) * (3.0f - 2.0f * (t)))
#define lerp(t, a, b) ((a) + (t) * ((b) - (a)))

void initNoiseTable()
{
   int i;
   float a;
   float x, y, z, r, theta;
   float gradients[256*3];
   unsigned int *p, *psrc;

   srandom(0);

   // build gradient table for 3D noise
   for (i=0; i<256; i++)
   {
      // calculate 1 - 2 * random number
      a = (random() % 32768) / 32768.0f;
      z = (1.0f - 2.0f * a);
      r = sqrtf(1.0f - z * z); // r is radius of circle

      // pick a random angle around the circle of radius r
      a = (random() % 32768) / 32768.0f;
      theta = (2.0f * (float)M_PI * a);
      x = (r * cosf(theta));
      y = (r * sinf(theta));


Example 14-16 3D Noise

// generate the value of gradient noise for a given lattice

// point

// (ix, iy, iz) specifies the 3D lattice position

// (fx, fy, fz) specifies the fractional part

static float

glattice3D(int ix, int iy, int iz, float fx, float fy,

float fz)

{

float *g;

int indx, y, z;

z = permTable[iz & NOISE_TABLE_MASK];

y = permTable[(iy + z) & NOISE_TABLE_MASK];

indx = (ix + y) & NOISE_TABLE_MASK;

g = &gradientTable[indx*3];

return (g[0]*fx + g[1]*fy + g[2]*fz);

}

// generate the 3D noise value

Example 14-16 shows how the gradient noise is calculated using the

pseudorandom gradient vectors and an input 3D coordinate.

      gradients[i*3] = x;
      gradients[i*3+1] = y;
      gradients[i*3+2] = z;
   }

   // use the index in the permutation table to load the
   // gradient values from gradients to gradientTable
   p = (unsigned int *)gradientTable;
   psrc = (unsigned int *)gradients;
   for (i=0; i<256; i++)
   {
      int indx = permTable[i];
      p[i*3] = psrc[indx*3];
      p[i*3+1] = psrc[indx*3+1];
      p[i*3+2] = psrc[indx*3+2];
   }
}

Example 14-15 Generating Gradient Vectors (continued)


The noise3D function returns a value between −1.0 and 1.0. The value

of gradient noise is always 0 at the integer lattice points. For points in

between, trilinear interpolation of gradient values across the eight integer

lattice points that surround the point is used to generate the scalar noise

// f describes input (x, y, z) position for which the noise value
// needs to be computed. noise3D returns the scalar noise value
float
noise3D(float *f)
{
   int ix, iy, iz;
   float fx0, fx1, fy0, fy1, fz0, fz1;
   float wx, wy, wz;
   float vx0, vx1, vy0, vy1, vz0, vz1;

   ix = FLOOR(f[0]);
   fx0 = f[0] - ix;
   fx1 = fx0 - 1;
   wx = smoothstep(fx0);

   iy = FLOOR(f[1]);
   fy0 = f[1] - iy;
   fy1 = fy0 - 1;
   wy = smoothstep(fy0);

   iz = FLOOR(f[2]);
   fz0 = f[2] - iz;
   fz1 = fz0 - 1;
   wz = smoothstep(fz0);

   vx0 = glattice3D(ix, iy, iz, fx0, fy0, fz0);
   vx1 = glattice3D(ix+1, iy, iz, fx1, fy0, fz0);
   vy0 = lerp(wx, vx0, vx1);
   vx0 = glattice3D(ix, iy+1, iz, fx0, fy1, fz0);
   vx1 = glattice3D(ix+1, iy+1, iz, fx1, fy1, fz0);
   vy1 = lerp(wx, vx0, vx1);
   vz0 = lerp(wy, vy0, vy1);

   vx0 = glattice3D(ix, iy, iz+1, fx0, fy0, fz1);
   vx1 = glattice3D(ix+1, iy, iz+1, fx1, fy0, fz1);
   vy0 = lerp(wx, vx0, vx1);
   vx0 = glattice3D(ix, iy+1, iz+1, fx0, fy1, fz1);
   vx1 = glattice3D(ix+1, iy+1, iz+1, fx1, fy1, fz1);
   vy1 = lerp(wx, vx0, vx1);
   vz1 = lerp(wy, vy0, vy1);

   return lerp(wz, vz0, vz1);
}

Example 14-16 3D Noise (continued)


value. Figure 14-11 shows a 2D slice of the gradient noise using the

preceding algorithm.
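Precomputing the noise means evaluating the noise function once per texel and storing the result. The sketch below shows one possible way to fill an 8-bit single-channel volume; it is not taken from the sample code. The NoiseFunc type matches the signature of noise3D, RampNoise is a stand-in used only so the sketch is self-contained, and the frequency parameter and the [−1, 1] to [0, 255] mapping are assumptions made here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same shape as noise3D in Example 14-16: takes (x, y, z), returns a
 * value in [-1, 1]. */
typedef float (*NoiseFunc)(float *f);

/* Stand-in noise used only to exercise this sketch; a real application
 * would pass noise3D instead. */
static float RampNoise(float *f)
{
    return 2.0f * f[0] - 1.0f;
}

/* Map a noise value in [-1, 1] to an 8-bit texel in [0, 255]. */
static uint8_t NoiseToTexel(float n)
{
    float v = (n * 0.5f + 0.5f) * 255.0f;
    if (v < 0.0f)   v = 0.0f;
    if (v > 255.0f) v = 255.0f;
    return (uint8_t)(v + 0.5f);
}

/* Fill a size^3 single-channel volume. `frequency` is the number of
 * lattice cells spanned by the volume along each axis. */
static void BakeNoiseVolume(uint8_t *texels, int size, float frequency,
                            NoiseFunc noise)
{
    int x, y, z;

    assert(texels != NULL && noise != NULL);
    assert(size > 0);

    for (z = 0; z < size; z++)
    {
        for (y = 0; y < size; y++)
        {
            for (x = 0; x < size; x++)
            {
                float p[3];
                p[0] = (float)x * frequency / (float)size;
                p[1] = (float)y * frequency / (float)size;
                p[2] = (float)z * frequency / (float)size;
                texels[((size_t)z * size + y) * size + x] =
                    NoiseToTexel(noise(p));
            }
        }
    }
}
```

Scaling the texel index by frequency / size keeps most sample positions between lattice points, which matters because gradient noise is always 0 exactly at the integer lattice points. The resulting array could then be uploaded as a 3D texture.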

Example 14-17 Noise-Distorted Fog Fragment Shader

#version 300 es

precision mediump float;

uniform sampler3D s_noiseTex;

uniform float u_fogMaxDist;

uniform float u_fogMinDist;

uniform vec4 u_fogColor;

uniform float u_time;

in vec4 v_color;

in vec2 v_texCoord;

in vec4 v_eyePos;

layout(location = 0) out vec4 outColor;

float computeLinearFogFactor()

Figure 14-11 2D Slice of Gradient Noise

Using Noise

Once we have created a 3D noise volume, it is very easy to use it to

produce a variety of effects. In the case of the wispy fog effect, the idea is

simple: Scroll the 3D noise texture in all three dimensions based on time

and use the value from the texture to distort the fog factor. Let’s take a

look at the fragment shader in Example 14-17.


This shader is very similar to our linear fog example in Chapter 10,

“Fragment Shaders.” The primary difference is that the linear fog factor

is distorted by the 3D noise texture. The shader computes a 3D texture

coordinate based on time and places it in noiseCoord. The u_time uniform

variable is tied to the current time and is updated each frame. The 3D

texture is set up with s, t, and r wrap modes of GL_MIRRORED_REPEAT so that

the noise volume scrolls smoothly on the surface. The (s, t) coordinates are

based on the coordinates for the base texture and scroll in both directions.

The r-coordinate is based purely on time; thus it is continuously scrolled.

The 3D texture is a single-channel (GL_R8) texture, so only the red

component of the texture is used (when a GL_R8 texture is sampled, the green

and blue channels read as 0). The value fetched from the volume is subtracted

from the computed fogFactor and then used to linearly interpolate between

the fog color and base color. The result is a wispy fog that appears to roll in

from a distance. Its speed can be increased easily by applying a scale to the

u_time variable when scrolling the 3D texture coordinates.
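The fog arithmetic from Example 14-17 can be reproduced on the CPU for a single color channel. This is a sketch with hypothetical function names; the noise argument stands for the value fetched from the red channel of the 3D texture, and the 0.25 scale is the one used in the shader.

```c
#include <assert.h>

static float Clamp01(float v)
{
    if (v < 0.0f) return 0.0f;
    if (v > 1.0f) return 1.0f;
    return v;
}

/* Linear fog factor: 1 at fogMinDist (no fog), 0 at fogMaxDist (all fog). */
static float LinearFogFactor(float dist, float fogMinDist, float fogMaxDist)
{
    assert(fogMaxDist > fogMinDist);
    return Clamp01((fogMaxDist - dist) / (fogMaxDist - fogMinDist));
}

/* Subtract a scaled noise sample from the linear factor, then clamp. */
static float DistortedFogFactor(float dist, float fogMinDist,
                                float fogMaxDist, float noise)
{
    float factor = LinearFogFactor(dist, fogMinDist, fogMaxDist);
    return Clamp01(factor - noise * 0.25f);
}

/* Blend one channel: base * factor + fog * (1 - factor). */
static float BlendChannel(float base, float fog, float factor)
{
    return base * factor + fog * (1.0f - factor);
}
```

Halfway between the fog distances the undistorted factor is 0.5; a noise sample of 1.0 lowers it to 0.25, so that fragment shows more fog color than its neighbors with smaller noise values.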

You can achieve a number of different effects by using a 3D texture to

represent noise. For example, you can use noise to represent dust in a

{

float factor;

// Compute linear fog equation

float dist = distance( v_eyePos,

vec4( 0.0, 0.0, 0.0, 1.0 ) );

factor = (u_fogMaxDist - dist) /

(u_fogMaxDist - u_fogMinDist );

// Clamp in the [0, 1] range

factor = clamp( factor, 0.0, 1.0 );

return factor;

}

void main( void )

{

float fogFactor = computeLinearFogFactor();

vec3 noiseCoord =

vec3( v_texCoord.xy + u_time, u_time );

fogFactor -=

texture(s_noiseTex, noiseCoord).r * 0.25;

fogFactor = clamp(fogFactor, 0.0, 1.0);

vec4 baseColor = v_color;

outColor = baseColor * fogFactor +

u_fogColor * (1.0 - fogFactor);

}

Example 14-17 Noise-Distorted Fog Fragment Shader (continued)


light volume, add a more natural appearance to a procedural texture, and

simulate water waves. Applying a 3D texture is a great way to economize

on performance, yet still achieve high-quality visual effects. It is unlikely

that you can expect handheld devices to compute noise functions in the

fragment shader and have enough performance to run at a high frame

rate. As such, having a precomputed noise volume will be a very valuable

trick to have in your toolkit for creating effects.

Procedural Texturing

The next topic we cover is the generation of procedural textures. Textures

are typically described as a 2D image, a cubemap, or a 3D image. These

images store color or depth values. Built-in functions defined in the

OpenGL ES Shading Language take a texture coordinate, a texture object

referred to as a sampler, and return a color or depth value. Procedural

texturing refers to textures that are described as a procedure instead of

as an image. The procedure describes the algorithm that will generate a

texture color or depth value given a set of inputs.

The following are some of the benefits of procedural textures:

They provide a much more compact representation than a stored texture

image. All you need to store is the code that describes the procedural

texture, which will typically be much smaller in size than a stored image.

Procedural textures, unlike stored images, have no fixed resolution.

As a consequence, they can be applied to the surface without loss of

detail. Thus we will not see problematic issues such as reduced detail

as we zoom onto a surface that uses a procedural texture. We will,

however, encounter these issues when using a stored texture image

because of its fixed resolution.

The disadvantages of procedural textures are as follows:

Although the procedural texture might have a smaller footprint

than a stored texture, it might take a lot more cycles to execute the

procedural texture versus doing a lookup in the stored texture. With

procedural textures, you are dealing with instruction bandwidth,

versus memory bandwidth for stored textures. Both the instruction

and memory bandwidth are at a premium on handheld devices, and a

developer must carefully choose which approach to take.

Procedural textures can lead to serious aliasing artifacts. Although

most of these artifacts can be resolved, they result in additional

instructions to the procedural texture code, which can impact the

performance of a shader.

Procedural Texturing 405

The decision whether to use a procedural texture or a stored texture

should be based on careful analysis of the performance and memory

bandwidth requirements of each.

A Procedural Texture Example

We now look at a simple example that demonstrates procedural textures.

We are familiar with how to use a checkerboard texture image to draw a

checkerboard pattern on an object. We now look at a procedural texture

implementation that renders a checkerboard pattern on an object. The

example we cover is the Checker.pod PVRShaman workspace in Chapter_14/

PVR_ProceduralTextures. Examples 14-18 and 14-19 describe the vertex

and fragment shaders that implement the checkerboard texture procedurally.

Example 14-18 Checker Vertex Shader

#version 300 es

uniform mat4 mvp_matrix; // combined model-view

// + projection matrix

in vec4 a_position; // input vertex position

in vec2 a_st; // input texture coordinate

out vec2 v_st; // output texture coordinate

void main()

{

v_st = a_st;

gl_Position = mvp_matrix * a_position;

}

The vertex shader code in Example 14-18 is really straightforward. It

transforms the position using the combined model–view and projection

matrix and passes the texture coordinate (a_st) to the fragment shader as

a varying variable (v_st).

The fragment shader code in Example 14-19 uses the v_st texture coordinate

to draw the texture pattern. Although easy to understand, the fragment

shader might yield poor performance because of the multiple conditional

checks done on values that can differ over fragments being executed in

parallel. This can diminish performance, as the number of vertices or

fragments executed in parallel by the GPU is reduced. Example 14-20 is a

version of the fragment shader that omits any conditional checks.

Figure 14-12 shows the checkerboard image rendered using the fragment

shader in Example 14-19 with u_frequency = 10.


Example 14-20 Checker Fragment Shader without Conditional Checks

#version 300 es

precision mediump float;

// frequency of the checkerboard pattern

uniform int u_frequency;

in vec2 v_st;

layout(location = 0) out vec4 outColor;

void

main()

{

vec2 texcoord = mod(floor(v_st * float(u_frequency * 2)),2.0);

float delta = abs(texcoord.x - texcoord.y);

outColor = mix(vec4(1.0), vec4(0.0), delta);

}

Example 14-19 Checker Fragment Shader with Conditional Checks

#version 300 es

precision mediump float;

// frequency of the checkerboard pattern

uniform int u_frequency;

in vec2 v_st;

layout(location = 0) out vec4 outColor;

void main()

{

vec2 tcmod = mod(v_st * float(u_frequency), 1.0);

if(tcmod.s < 0.5)

{

if(tcmod.t < 0.5)

outColor = vec4(1.0);

else

outColor = vec4(0.0);

}

else

{

if(tcmod.t < 0.5)

outColor = vec4(0.0);

else

outColor = vec4(1.0);

}

}


Figure 14-12 Checkerboard Procedural Texture

As you can see, this was really easy to implement. We do see quite a

bit of aliasing, which is never acceptable. With a texture checkerboard

image, aliasing issues are overcome by using mipmapping and applying

preferably a trilinear or bilinear filter. We now look at how to render an

anti-aliased checkerboard pattern.

Anti-Aliasing of Procedural Textures

In Advanced RenderMan: Creating CGI for Motion Pictures, Anthony Apodaca

and Larry Gritz give a very thorough explanation of how to implement

analytic anti-aliasing of procedural textures. We use the techniques

described in this book to implement our anti-aliased checker fragment

shader. Example 14-21 describes the anti-aliased checker fragment shader

code from the CheckerAA.rfx PVRShaman workspace in Chapter_14/

PVR_ProceduralTextures.

Example 14-21 Anti-Aliased Checker Fragment Shader

#version 300 es

precision mediump float;

uniform int u_frequency;

in vec2 v_st;

layout(location = 0) out vec4 outColor;

(continues)


Figure 14-13 shows the checkerboard image rendered using the anti-aliased

fragment shader in Example 14-21 with u_frequency = 10.

To anti-alias the checkerboard procedural texture, we need to estimate

the average value of the texture over an area covered by the pixel. Given

a function g(v) that represents a procedural texture, we need to calculate

Example 14-21 Anti-Aliased Checker Fragment Shader (continued)

void main()

{

vec4 color;

vec4 color0 = vec4(0.0);

vec4 color1 = vec4(1.0);

vec2 st_width;

vec2 fuzz;

vec2 check_pos;

float fuzz_max;

// calculate the filter width

st_width = fwidth(v_st);

fuzz = st_width * float(u_frequency) * 2.0;

fuzz_max = max(fuzz.s, fuzz.t);

// get the place in the pattern where we are sampling

check_pos = fract(v_st * float(u_frequency));

if (fuzz_max <= 0.5)

{

// if the filter width is small enough, compute

// the pattern color by performing a smooth interpolation

// between the computed color and the average color

vec2 p = smoothstep(vec2(0.5), fuzz + vec2(0.5),

check_pos) + (1.0 - smoothstep(vec2(0.0), fuzz,

check_pos));

color = mix(color0, color1,

p.x * p.y + (1.0 - p.x) * (1.0 - p.y));

color = mix(color, (color0 + color1)/2.0,

smoothstep(0.125, 0.5, fuzz_max));

}

else

{

// filter is too wide; just use the average color

color = (color0 + color1)/2.0;

}

outColor = color;

}


the average value of g(v) over the region covered by this pixel. To determine

this region, we need to know the rate of change of g(v). The OpenGL ES

Shading Language 3.00 contains derivative functions we can use to compute

the rate of change of g(v) in x and y using the functions dFdx and dFdy.

The rate of change, called the gradient vector, is given by [dFdx(g(v)),

dFdy(g(v))]. The magnitude of the gradient vector is computed as

sqrt(dFdx(g(v))^2 + dFdy(g(v))^2). This value can also be approximated by

abs(dFdx(g(v))) + abs(dFdy(g(v))). The function fwidth returns this

approximation of the gradient magnitude. This approach works well if

g(v) is a scalar expression. If g(v) is a point, however, we need to compute

the cross-product of dFdx(g(v)) and dFdy(g(v)). In the case of the

checkerboard texture example, we need to compute the magnitude of the

v_st.x and v_st.y scalar expressions and, therefore, the function fwidth

can be used to compute the filter widths for v_st.x and v_st.y.

Let w be the filter width computed by fwidth. We need to know two

additional things about the procedural texture:

The smallest value of filter width k such that the procedural texture

g(v) will not show any aliasing artifacts for filter widths less than k/2.

The average value of the procedural texture g(v) over very large widths.

If w < k/2, we should not see any aliasing artifacts. If w > k/2 (i.e., the

filter width is too large), aliasing will occur. We use the average value

Figure 14-13 Anti-aliased Checkerboard Procedural Texture


of g(v) in this case. For other values of w, we use a smoothstep to fade

between the true function and average values. The full definition of the

smoothstep built-in function is provided in Appendix B.

This discussion should have provided you with good insight into how

touse procedural textures and how to resolve aliasing artifacts that become

apparent when you are using procedural textures. The generation of

procedural textures for many different applications is a very broad subject.

The following list of references is a good place to start if you are interested

in finding more information about procedural texture generation.
