# Dakota V6.7 User's Manual

User Manual: Pdf

Open the PDF directly: View PDF .

Page Count: 364 [warning: Documents this large are best viewed by clicking the View PDF Link!]

- Preface
- Introduction
- Dakota Tutorial
- Parameter Study Capabilities
- Design of Experiments Capabilities
- Uncertainty Quantification Capabilities
- Optimization Capabilities
- Nonlinear Least Squares Capabilities
- Models
- Overview
- Single Models
- Recast Models
- Surrogate Models
- Overview of Surrogate Types
- Correction Approaches
- Data Fit Surrogate Models
- Procedures for Surface Fitting
- Taylor Series
- Two Point Adaptive Nonlinearity Approximation
- Linear, Quadratic, and Cubic Polynomial Models
- Kriging/Gaussian-Process Spatial Interpolation Models
- Artificial Neural Network (ANN) Models
- Multivariate Adaptive Regression Spline (MARS) Models
- Radial Basis Functions
- Moving Least Squares
- Piecewise Decomposition Option for Global Surrogate Models
- Surrogate Diagnostic Metrics

- Multifidelity Surrogate Models
- Reduced Order Models
- Surrogate Model Selection

- Nested Models
- Random Field Models
- Active Subspace Models

- Variables
- Interfaces
- Responses
- Inputs to Dakota
- Output from Dakota
- Advanced Methods
- Advanced Model Recursions
- Advanced Simulation Code Interfaces
- Parallel Computing
- Overview
- Single-level parallelism
- Multilevel parallelism
- Capability Summary
- Running a Parallel Dakota Job
- Specifying Parallelism
- Application Parallelism Use Cases

- Restart Capabilities and Utilities
- Simulation Failure Capturing
- Additional Examples

SAND2014-4633

Unlimited Release

July 2014

Updated November 13, 2017

Dakota, A Multilevel Parallel Object-Oriented Framework for

Design Optimization, Parameter Estimation, Uncertainty

Quantiﬁcation, and Sensitivity Analysis:

Version 6.7 User’s Manual

Brian M. Adams, Mohamed S. Ebeida, Michael S. Eldred, Gianluca Geraci,

John D. Jakeman, Kathryn A. Maupin, Jason A. Monschke, J. Adam Stephens,

Laura P. Swiler, Dena M. Vigil, Timothy M. Wildey

Optimization and Uncertainty Quantiﬁcation Department

William J. Bohnhoff

Radiation Transport Department

Keith R. Dalbey

Software Simulation and Data Processing Department

John P. Eddy

System Readiness and Sustainment Technologies Department

Joseph R. Frye, Russell W. Hooper

Multiphysics Applications Department

Kenneth T. Hu

W76-0/W88 Stockpile Systems Engineering Department

Patricia D. Hough, Mohammad Khalil

Quantitative Modeling and Analysis Department

Elliott M. Ridgway

Data-driven and Neural Computing

Sandia National Laboratories

P.O. Box 5800

Albuquerque, New Mexico 87185

Ahmad Rushdi

Electrical and Computer Engineering

University of California

One Shields Avenue

Davis, CA 95616-5294

4

Abstract

The Dakota toolkit provides a ﬂexible and extensible interface between simulation codes and iterative analysis

methods. Dakota contains algorithms for optimization with gradient and nongradient-based methods; uncertainty

quantiﬁcation with sampling, reliability, and stochastic expansion methods; parameter estimation with nonlinear

least squares methods; and sensitivity/variance analysis with design of experiments and parameter study methods.

These capabilities may be used on their own or as components within advanced strategies such as surrogate-

based optimization, mixed integer nonlinear programming, or optimization under uncertainty. By employing

object-oriented design to implement abstractions of the key components required for iterative systems analyses,

the Dakota toolkit provides a ﬂexible and extensible problem-solving environment for design and performance

analysis of computational models on high performance computers.

This report serves as a user’s manual for the Dakota software and provides capability overviews and procedures

for software execution, as well as a variety of example studies.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

Contents

Preface 17

1 Introduction 19

1.1 Motivation for Dakota Development ................................. 19

1.2 Dakota Capabilities .......................................... 19

1.3 Coupling Dakota to a Simulation ................................... 21

1.4 User’s Manual Organization ..................................... 22

1.5 Files Referenced in this Manual ................................... 23

1.6 Summary ............................................... 23

2 Dakota Tutorial 25

2.1 Quickstart ............................................... 25

2.1.1 Acquiring and Installing Dakota ............................... 25

2.1.2 Running Dakota with a simple input ﬁle ........................... 26

2.1.3 Examples of Dakota output .................................. 27

2.2 Dakota Input File Format ....................................... 29

2.3 Examples ............................................... 31

2.3.1 Rosenbrock Test Problem .................................. 32

2.3.2 Two-Dimensional Grid Parameter Study ........................... 32

2.3.3 Gradient-based Unconstrained Optimization ......................... 33

2.3.4 Uncertainty Quantiﬁcation with Monte Carlo Sampling ................... 34

2.3.5 User Supplied Simulation Code Examples .......................... 37

2.3.5.1 Optimization with a User-Supplied Simulation Code - Case 1 .......... 37

2.3.5.2 Optimization with a User-Supplied Simulation Code - Case 2 .......... 41

2.4 Dakota Command-Line Options ................................... 42

2.5 Next Steps ............................................... 43

6 CONTENTS

2.5.1 Problem Exploration and Method Selection ......................... 43

2.5.2 Key Getting Started References ............................... 44

3 Parameter Study Capabilities 47

3.1 Overview ............................................... 47

3.1.1 Initial Values ......................................... 48

3.1.2 Bounds ............................................ 48

3.2 Vector Parameter Study ........................................ 49

3.3 List Parameter Study ......................................... 50

3.4 Centered Parameter Study ...................................... 51

3.5 Multidimensional Parameter Study .................................. 52

3.6 Parameter Study Usage Guidelines .................................. 54

3.7 Example: Vector Parameter Study with Rosenbrock ......................... 54

4 Design of Experiments Capabilities 57

4.1 Overview ............................................... 57

4.2 Design of Computer Experiments .................................. 58

4.3 DDACE ................................................ 59

4.3.1 Central Composite Design .................................. 59

4.3.2 Box-Behnken Design ..................................... 60

4.3.3 Orthogonal Array Designs .................................. 61

4.3.4 Grid Design .......................................... 62

4.3.5 Monte Carlo Design ..................................... 62

4.3.6 LHS Design .......................................... 62

4.3.7 OA-LHS Design ....................................... 62

4.4 FSUDace ............................................... 62

4.5 PSUADE MOAT ........................................... 63

4.6 Sensitivity Analysis .......................................... 64

4.6.1 Sensitivity Analysis Overview ................................ 64

4.6.2 Assessing Sensitivity with DACE .............................. 66

4.7 DOE Usage Guidelines ........................................ 67

5 Uncertainty Quantiﬁcation Capabilities 69

5.1 Overview ............................................... 69

5.1.1 Summary of Dakota UQ Methods .............................. 69

Dakota Version 6.7 User’s Manual generated on November 13, 2017

CONTENTS 7

5.1.2 Variables and Responses for UQ ............................... 72

5.2 Sampling Methods .......................................... 72

5.2.1 Uncertainty Quantiﬁcation Example using Sampling Methods ............... 74

5.2.2 Incremental Sampling .................................... 78

5.2.3 Principal Component Analysis ................................ 79

5.2.4 Wilks-based Sample Sizes .................................. 79

5.3 Reliability Methods .......................................... 79

5.3.1 Local Reliability Methods .................................. 81

5.3.1.1 Method mapping .................................. 81

5.3.2 Global Reliability Methods .................................. 82

5.3.3 Uncertainty Quantiﬁcation Examples using Reliability Analysis .............. 83

5.3.3.1 Mean-value Reliability with Textbook ...................... 83

5.3.3.2 FORM Reliability with Lognormal Ratio ..................... 84

5.4 Stochastic Expansion Methods .................................... 89

5.4.1 Uncertainty Quantiﬁcation Examples using Stochastic Expansions ............. 90

5.4.1.1 Polynomial Chaos Expansion for Rosenbrock .................. 90

5.4.1.2 Uncertainty Quantiﬁcation Example using Stochastic Collocation ........ 90

5.5 Importance Sampling Methods .................................... 95

5.5.1 Importance Sampling Method based on Reliability Approach ................ 96

5.5.2 Gaussian Process Adaptive Importance Sampling Method ................. 96

5.6 Adaptive Sampling Methods ..................................... 97

5.6.1 Adaptive sampling based on surrogates ........................... 97

5.6.2 Adaptive sampling based on dart throwing .......................... 98

5.7 Epistemic Nondeterministic Methods ................................. 99

5.7.1 Interval Methods for Epistemic Analysis ..........................100

5.7.2 Dempster-Shafer Theory of Evidence ............................100

5.8 Bayesian Calibration Methods ....................................105

5.8.1 QUESO ............................................106

5.8.2 DREAM ...........................................107

5.8.3 GPMSA ............................................108

5.8.4 WASABI ...........................................108

5.8.5 Feature Comparison .....................................108

5.8.6 Bayesian Calibration Example ................................109

5.8.7 Calibrating the Observation Error Model ..........................111

Dakota Version 6.7 User’s Manual generated on November 13, 2017

8 CONTENTS

5.8.8 Model Discrepancy ......................................111

5.8.9 Bayesian Experimental Design ................................114

5.8.9.1 One-at-a-time Implementation ..........................117

5.9 Uncertainty Quantiﬁcation Usage Guidelines ............................118

6 Optimization Capabilities 121

6.1 Optimization Formulations ......................................121

6.1.1 Constraint Considerations ..................................123

6.2 Optimizing with Dakota: Choosing a Method ............................123

6.2.1 Gradient-Based Local Methods ...............................124

6.2.1.1 Method Descriptions ...............................124

6.2.1.2 Example ......................................125

6.2.2 Derivative-Free Local Methods ...............................125

6.2.2.1 Method Descriptions ...............................125

6.2.2.2 Example ......................................126

6.2.3 Derivative-Free Global Methods ...............................128

6.2.3.1 Method Descriptions ...............................128

6.2.3.2 Examples .....................................129

6.3 Additional Optimization Capabilities .................................133

6.3.1 Multiobjective Optimization .................................133

6.3.1.1 Multiobjective Example 1 .............................134

6.3.1.2 Multiobjective Example 2 .............................137

6.3.2 Optimization with User-speciﬁed or Automatic Scaling ...................139

6.3.2.1 Scaling Example ..................................141

6.4 Optimization Usage Guidelines ....................................141

6.5 Optimization Third Party Libraries ..................................144

7 Nonlinear Least Squares Capabilities 147

7.1 Overview ...............................................147

7.2 Nonlinear Least Squares Fomulations ................................148

7.3 Nonlinear Least Squares with Dakota ................................148

7.4 Solution Techniques ..........................................149

7.4.1 Gauss-Newton ........................................149

7.4.2 NLSSOL ...........................................149

7.4.3 NL2SOL ...........................................150

Dakota Version 6.7 User’s Manual generated on November 13, 2017

CONTENTS 9

7.4.4 Additional Features ......................................150

7.5 Examples ...............................................150

7.6 Usage Guidelines ...........................................152

8 Models 153

8.1 Overview ...............................................153

8.2 Single Models .............................................154

8.3 Recast Models .............................................154

8.4 Surrogate Models ...........................................154

8.4.1 Overview of Surrogate Types .................................154

8.4.2 Correction Approaches ....................................156

8.4.3 Data Fit Surrogate Models ..................................157

8.4.3.1 Procedures for Surface Fitting ...........................157

8.4.3.2 Taylor Series ....................................158

8.4.3.3 Two Point Adaptive Nonlinearity Approximation .................158

8.4.3.4 Linear, Quadratic, and Cubic Polynomial Models ................159

8.4.3.5 Kriging/Gaussian-Process Spatial Interpolation Models .............160

8.4.3.6 Artiﬁcial Neural Network (ANN) Models ....................163

8.4.3.7 Multivariate Adaptive Regression Spline (MARS) Models ............163

8.4.3.8 Radial Basis Functions ..............................164

8.4.3.9 Moving Least Squares ...............................164

8.4.3.10 Piecewise Decomposition Option for Global Surrogate Models .........164

8.4.3.11 Surrogate Diagnostic Metrics ...........................165

8.4.4 Multiﬁdelity Surrogate Models ................................166

8.4.5 Reduced Order Models ....................................166

8.4.6 Surrogate Model Selection ..................................167

8.5 Nested Models ............................................167

8.6 Random Field Models .........................................168

8.7 Active Subspace Models .......................................169

9 Variables 171

9.1 Overview ...............................................171

9.2 Design Variables ...........................................171

9.2.1 Continuous Design Variables .................................171

9.2.2 Discrete Design Variables ..................................172

Dakota Version 6.7 User’s Manual generated on November 13, 2017

10 CONTENTS

9.2.2.1 Discrete Design Integer Variables .........................172

9.2.2.2 Discrete Design String Variables .........................173

9.2.2.3 Discrete Design Real Variables ..........................173

9.3 Uncertain Variables ..........................................173

9.3.1 Aleatory Uncertain Variables .................................173

9.3.1.1 Continuous Aleatory Uncertain Variables .....................174

9.3.1.2 Discrete Aleatory Uncertain Variables ......................174

9.3.2 Epistemic Uncertain Variables ................................175

9.3.2.1 Continuous Epistemic Uncertain Variables ....................175

9.3.2.2 Discrete Epistemic Uncertain Variables ......................175

9.4 State Variables ............................................175

9.5 Management of Mixed Variables by Iterator .............................176

9.5.1 View .............................................176

9.5.2 Domain ............................................177

9.5.3 Precedence ..........................................177

9.6 Dakota Parameters File Data Format .................................177

9.6.1 Parameters ﬁle format (standard) ...............................178

9.6.2 Parameters ﬁle format (APREPRO) .............................179

9.7 The Active Set Vector .........................................181

9.7.1 Active set vector control ...................................181

10 Interfaces 183

10.1 Overview ...............................................183

10.2 Simulation Interfaces .........................................183

10.2.1 Direct Simulation Interface ..................................183

10.2.2 System Call Simulation Interface ..............................184

10.2.3 Fork Simulation Interface ..................................184

10.2.4 Syntax for Filter and Driver Strings .............................185

10.2.5 Fork or System Call: Which to Use? .............................185

10.3 Building a Black-Box Interface to a Simulation Code ........................186

10.3.1 Generic Script Interface Files ................................186

10.3.2 Adapting These Scripts to Another Simulation .......................192

10.3.3 Additional Examples .....................................193

10.4 Simulation Interface Components ..................................193

Dakota Version 6.7 User’s Manual generated on November 13, 2017

CONTENTS 11

10.4.1 Single analysis driver without ﬁlters .............................194

10.4.2 Single analysis driver with ﬁlters ...............................195

10.4.3 Multiple analysis drivers without ﬁlters ...........................196

10.4.4 Multiple analysis drivers with ﬁlters .............................197

10.5 Simulation File Management .....................................198

10.5.1 File Saving ..........................................198

10.5.2 File Tagging for Evaluations .................................199

10.5.3 Temporary Files .......................................199

10.5.4 File Tagging for Analysis Drivers ..............................200

10.5.5 Work Directories .......................................201

10.6 Parameter to Response Mapping Examples .............................202

10.7 Parameters and Results Files with dakota.interfacing ........................206

10.7.1 Creating Parameters and Results objects ...........................206

10.7.2 Parameters objects ......................................207

10.7.3 Results objects ........................................207

10.7.4 Response object .......................................208

10.7.5 dakota.interfacing Examples .................................208

11 Responses 211

11.1 Overview ...............................................211

11.1.1 Response function types ...................................211

11.1.2 Gradient availability .....................................212

11.1.3 Hessian availability ......................................212

11.1.4 Field Data ...........................................212

11.2 Dakota Results File Data Format ...................................213

11.3 Active Variables for Derivatives ...................................214

12 Inputs to Dakota 217

12.1 Overview of Inputs ..........................................217

12.1.1 Tabular Data Formats .....................................217

12.2 Data Imports .............................................218

12.2.1 AMPL algebraic mappings: stub.nl, stub.row, and stub.col .................218

12.2.2 Genetic algorithm population import .............................219

12.2.3 Calibration data import ....................................219

12.2.4 PCE coefﬁcient import ....................................220

Dakota Version 6.7 User’s Manual generated on November 13, 2017

12 CONTENTS

12.2.5 Surrogate construction and evaluation data .........................220

12.2.6 Variables/responses import to post-run ............................220

13 Output from Dakota 223

13.1 Overview of Output Formats .....................................223

13.2 Standard Output ............................................223

13.3 Tabular Output Data ..........................................229

13.4 Graphics Output ............................................230

13.5 Error Messages Output ........................................230

13.6 Stochastic expansion exports .....................................232

13.7 Surrogate Model Exports .......................................232

13.8 Variables Output from Pre-run ....................................232

14 Advanced Methods 233

14.1 Overview ...............................................233

14.2 Hybrid Minimization .........................................233

14.3 Multistart Local Minimization ....................................234

14.4 Pareto Optimization ..........................................236

14.5 Mixed Integer Nonlinear Programming (MINLP) ..........................236

14.5.1 Example MINLP Problem ..................................239

14.6 Surrogate-Based Minimization ....................................241

14.6.1 Surrogate-Based Local Minimization ............................241

14.6.1.1 SBO with Data Fits ................................241

14.6.1.2 SBO with Multiﬁdelity Models ..........................244

14.6.1.3 SBO with Reduced Order Models .........................244

14.6.2 Surrogate-Based Global Minimization ............................245

15 Advanced Model Recursions 249

15.1 Mixed Aleatory-Epistemic UQ ....................................249

15.1.1 Interval-valued probability (IVP) ...............................250

15.1.2 Second-order probability (SOP) ...............................251

15.1.3 Dempster-Shafer Theory of Evidence ............................253

15.2 Optimization Under Uncertainty (OUU) ...............................253

15.2.1 Nested OUU .........................................254

15.2.2 Surrogate-Based OUU (SBOUU) ..............................256

Dakota Version 6.7 User’s Manual generated on November 13, 2017

CONTENTS 13

15.2.3 Trust-Region Surrogate-Based OUU (TR-SBOUU) .....................256

15.2.4 RBDO ............................................257

15.2.5 Stochastic Expansion-Based Design Optimization ......................257

15.2.6 Epistemic OUU ........................................258

15.3 Surrogate-Based Uncertainty Quantiﬁcation .............................258

16 Advanced Simulation Code Interfaces 261

16.1 Algebraic Mappings ..........................................261

16.2 Developing a Direct Simulation Interface ..............................264

16.2.1 Extension ...........................................264

16.2.2 Derivation ...........................................265

16.2.3 Sandwich ...........................................265

16.3 Existing Direct Interfaces to External Simulators ..........................266

16.3.1 Matlab ............................................266

16.3.1.1 Dakota/Matlab input ﬁle speciﬁcation .......................266

16.3.1.2 Matlab .m ﬁle speciﬁcation ............................267

16.3.2 Python ............................................268

16.4 Scilab Script and Direct Interfaces ..................................268

16.4.1 Scilab Script Interface ....................................270

16.4.2 Scilab Linked Interface ....................................271

16.4.3 Scilab Compiled Interface ..................................271

17 Parallel Computing 273

17.1 Overview ...............................................273

17.1.1 Categorization of parallelism .................................274

17.1.2 Parallel Dakota algorithms ..................................275

17.1.2.1 Parallel iterators ..................................275

17.1.2.2 Advanced methods ................................276

17.1.2.3 Parallel models ..................................277

17.2 Single-level parallelism ........................................277

17.2.1 Asynchronous Local Parallelism ...............................279

17.2.1.1 Direct function synchronization ..........................279

17.2.1.2 System call synchronization ............................279

17.2.1.3 Fork synchronization ...............................280

17.2.1.4 Asynchronous Local Example ...........................281

Dakota Version 6.7 User’s Manual generated on November 13, 2017

14 CONTENTS

17.2.1.5 Local evaluation scheduling options .......................283

17.2.2 Message Passing Parallelism .................................283

17.2.2.1 Partitioning ....................................283

17.2.2.2 Scheduling .....................................284

17.2.2.3 Message Passing Example .............................285

17.2.3 Hybrid Parallelism ......................................286

17.2.3.1 Hybrid Example ..................................287

17.3 Multilevel parallelism .........................................289

17.3.1 Asynchronous Local Parallelism ...............................290

17.3.2 Message Passing Parallelism .................................290

17.3.2.1 Partitioning of levels ................................290

17.3.2.2 Scheduling within levels ..............................291

17.3.3 Hybrid Parallelism ......................................291

17.4 Capability Summary .........................................292

17.5 Running a Parallel Dakota Job ....................................293

17.5.1 Single-processor execution ..................................293

17.5.2 Multiprocessor execution ...................................293

17.6 Specifying Parallelism ........................................294

17.6.1 The interface speciﬁcation ..................................295

17.6.2 The meta-iterator and nested model speciﬁcations ......................295

17.6.3 Single-processor Dakota speciﬁcation ............................296

17.6.3.1 Example 1 .....................................296

17.6.3.2 Example 2 .....................................297

17.6.4 Multiprocessor Dakota speciﬁcation .............................297

17.6.4.1 Example 3 .....................................297

17.6.4.2 Example 4 .....................................299

17.6.4.3 Example 5 .....................................301

17.7 Application Parallelism Use Cases ..................................302

17.7.1 Case 1: Massively Serial — Multiple serial analysis jobs ..................303

17.7.2 Case 2: Sequential Parallel — One parallel analysis job at a time ..............304

17.7.3 Case 3: Evaluation Tiling — Multiple simultaneous parallel analysis jobs .........304

17.7.3.1 Mpiexec server mode ...............................305

17.7.3.2 Relative node scheduling .............................305

17.7.3.3 Machineﬁle management .............................306

Dakota Version 6.7 User’s Manual generated on November 13, 2017

CONTENTS 15

17.7.4 Case 4: Evaluation Submission — Parallel analysis jobs submitted to a queue . . . . . . . 306

18 Restart Capabilities and Utilities 307

18.1 Restart Management .........................................307

18.2 The Dakota Restart Utility ......................................308

18.2.1 Print ..............................................309

18.2.2 To/From Neutral File Format .................................310

18.2.3 To Tabular Format ......................................311

18.2.4 Concatenation of Multiple Restart Files ...........................312

18.2.5 Removal of Corrupted Data .................................312

19 Simulation Failure Capturing 315

19.1 Failure detection ...........................................315

19.2 Failure communication ........................................316

19.3 Failure mitigation ...........................................316

19.3.1 Abort (default) ........................................316

19.3.2 Retry .............................................316

19.3.3 Recover ............................................316

19.3.4 Continuation .........................................317

19.4 Special values .............................................317

20 Additional Examples 319

20.1 Textbook ...............................................319

20.1.1 Gradient-based Constrained Optimization ..........................321

20.2 Rosenbrock ..............................................324

20.2.1 Least-Squares Optimization .................................324

20.3 Herbie, Smooth Herbie, and Shubert .................................327

20.3.1 Efﬁcient Global Optimization ................................328

20.4 Sobol and Ishigami Functions ....................................328

20.5 Cylinder Head .............................................329

20.5.1 Constrained Gradient Based Optimization ..........................333

20.6 Container ...............................................336

20.6.1 Constrained Gradient Based Optimization ..........................337

20.7 Cantilever ...............................................338

20.7.1 Constrained Gradient Based Optimization ..........................339

Dakota Version 6.7 User’s Manual generated on November 13, 2017

16 CONTENTS

20.7.2 Optimization Under Uncertainty ...............................340

20.8 Multiobjective Test Problems .....................................340

20.8.1 Multiobjective Test Problem 2 ................................341

20.8.2 Multiobjective Test Problem 3 ................................341

20.9 Morris .................................................347

20.9.1 Morris One-at-a-Time Sensitivity Study ...........................347

20.10Test Problems for Reliability Analyses ................................350

20.10.1 Log Ratio ...........................................350

20.10.2 Steel Section .........................................351

20.10.3 Portal Frame .........................................351

20.10.4 Short Column .........................................351

20.10.5 Steel Column .........................................352

20.11Test Problems for Forward Uncertainty Quantiﬁcation ........................353

20.11.1 Genz functions ........................................353

20.11.2 Elliptic Partial differential equation with uncertain coefﬁcients ...............354

20.11.3 Damped Oscillator ......................................355

20.11.4 Non-linear coupled system of ODEs .............................355

20.11.5 Experimental Design .....................................355

20.11.6 Bayes linear ..........................................356

Dakota Version 6.7 User’s Manual generated on November 13, 2017

Preface

The Dakota project started in 1994 as an internal research and development activity at Sandia National Labora-

tories in Albuquerque, New Mexico. The original goal was to provide a common set of optimization tools for

a group of engineers solving structural analysis and design problems. Prior to the Dakota project, there was no

focused effort to archive optimization methods for reuse on other projects. Thus, engineers found themselves

repeatedly building new custom interfaces between the engineering analysis software and optimization software.

This was especially burdensome when using parallel computing, as each project developed a unique master pro-

gram to coordinate concurrent simulations on a network of workstations or a parallel computer. The initial Dakota

toolkit provided the engineering and analysis community at Sandia access to a variety of optimization algorithms,

hiding the complexity of the optimization software interfaces from the users. Engineers could readily switch

between optimization software packages by simply changing a few lines in a Dakota input ﬁle. In addition to

structural analysis, Dakota has been applied to computational ﬂuid dynamics, nonlinear dynamics, shock physics,

heat transfer, electrical circuits, and many other science and engineering models.

Dakota has grown signiﬁcantly beyond an optimization toolkit. In addition to its state-of-the-art optimization

methods, Dakota includes methods for global sensitivity and variance analysis, parameter estimation, uncertainty

quantiﬁcation, and veriﬁcation, as well as meta-level strategies for surrogate-based optimization, hybrid optimiza-

tion, and optimization under uncertainty. Available to all these algorithms is parallel computation support; ranging

from desktop multiprocessor computers to massively parallel computers typically found at national laboratories

and supercomputer centers.

As of Version 5.0, Dakota is publicly released as open source under a GNU Lesser General Public License and

is available for free download world-wide. See http://www.gnu.org/licenses/lgpl.html for more

information on the LGPL software use agreement. Dakota Versions 3.0 through 4.2+ were licensed under the GNU

General Public License. Dakota public release facilitates research and software collaborations among Dakota

developers at Sandia National Laboratories and other institutions, including academic, government, and corporate

entities. See the Dakota FAQ at http://dakota.sandia.gov/faq-page for more information on the

public release rationale and ways to contribute.

Dakota leadership includes Brian Adams (project lead), Mike Eldred (founder and research lead), Adam Stephens

(support manager), Dena Vigil (product owner), and Jim Stewart (business manager). For a listing of current

and former contributors and third-party library developers, visit the Dakota webpage at http://dakota.

sandia.gov.

Contact Information:

Brian M. Adams, Dakota Project Lead

Sandia National Laboratories

P.O. Box 5800, Mail Stop 1318

Albuquerque, NM 87185-1318

Web (including support information): http://dakota.sandia.gov

18 CONTENTS

Dakota Version 6.7 User’s Manual generated on November 13, 2017

Chapter 1

Introduction

1.1 Motivation for Dakota Development

Computational models are commonly used in engineering design and scientiﬁc discovery activities for simulating

complex physical systems in disciplines such as ﬂuid mechanics, structural dynamics, heat transfer, nonlinear

structural mechanics, shock physics, and many others. These simulators can be an enormous aid to engineers who

want to develop an understanding and/or predictive capability for complex behaviors typically observed in the

corresponding physical systems. Simulators often serve as virtual prototypes, where a set of predeﬁned system

parameters, such as size or location dimensions and material properties, are adjusted to improve the performance

of a system, as deﬁned by one or more system performance objectives. Such optimization or tuning of the

virtual prototype requires executing the simulator, evaluating performance objective(s), and adjusting the system

parameters in an iterative, automated, and directed way. System performance objectives can be formulated, for

example, to minimize weight, cost, or defects; to limit a critical temperature, stress, or vibration response; or

to maximize performance, reliability, throughput, agility, or design robustness. In addition, one would often

like to design computer experiments, run parameter studies, or perform uncertainty quantiﬁcation (UQ). These

approaches reveal how system performance changes as a design or uncertain input variable changes. Sampling

methods are often used in uncertainty quantiﬁcation to calculate a distribution on system performance measures,

and to understand which uncertain inputs contribute most to the variance of the outputs.

A primary goal for Dakota development is to provide engineers and other disciplinary scientists with a systematic

and rapid means to obtain improved or optimal designs or understand sensitivity or uncertainty using simulation-

based models. These capabilities generally lead to improved designs and system performance in earlier design

stages, alleviating dependence on physical prototypes and testing, shortening design cycles, and reducing product

development costs. In addition to providing this practical environment for answering system performance ques-

tions, the Dakota toolkit provides an extensible platform for the research and rapid prototyping of customized

methods and meta-algorithms [34].

1.2 Dakota Capabilities

Dakota delivers a variety of iterative methods and meta-algorithms, and the ability to ﬂexibly interface them to

your simulation code. While Dakota was originally conceived to more readily interface simulation codes and

optimization algorithms, recent versions go beyond optimization to include other iterative analysis methods such

20 CHAPTER 1. INTRODUCTION

as uncertainty quantiﬁcation with nondeterministic propagation methods, parameter estimation with nonlinear

least squares solution methods, and sensitivity/variance analysis with general-purpose design of experiments and

parameter study capabilities. These capabilities may be used on their own or as building blocks within more

sophisticated meta-algorithms such as hybrid optimization, surrogate-based optimization, optimization under un-

certainty, or mixed aleatory/epistemic UQ.

The principal classes of Dakota algorithms, with brief descriptions, are summarized here. For details, formula-

tions, and usage guidelines, see the referenced chapters.

•Parameter Studies (Chapter 3): Parameter studies employ deterministic designs to explore the effect of

parametric changes within simulation models, yielding one form of sensitivity analysis. They can help

assess simulation characteristics such as smoothness, multi-modality, robustness, and nonlinearity, which

affect the choice of algorithms and controls in follow-on optimization and UQ studies. Typical examples

include centered, one-at-a-time variations or joint variation on a grid.

•Design of Experiments (Chapter 4): Design and analysis of computer experiments (DACE) techniques

are often used to explore the parameter space of an engineering design problem, for example to perform

global sensitivity analysis. DACE methods can help reach conclusions similar to parameter studies, but the

primary goal of these methods is to generate good coverage of the input parameter space. Representative

methods include Latin hypercube sampling, orthogonal arrays, and Box-Behnken designs.

•Uncertainty Quantiﬁcation (Chapter 5): Uncertainty quantiﬁcation methods (also referred to as nonde-

terministic analysis methods) compute probabilistic information about response functions based on simu-

lations performed according to speciﬁed input parameter probability distributions. Put another way, these

methods perform a forward uncertainty propagation in which probability information for input parameters

is mapped to probability information for output response functions. Common approaches include Monte

Carlo sampling, reliability methods, and polynomial chaos expansions.

•Optimization (Chapter 6): Optimization solvers seek to minimize cost or maximize system performance,

as predicted by the simulation model, subject to constraints on input variables or secondary simulation re-

sponses. Categories of algorithms include gradient-based, derivative-free, and global optimization. Dakota

also includes capabilities for multi-objective trade-off optimization and automatic scaling of problem for-

mulations. Advanced Dakota approaches include hybrid (multi-method), multi-start local, and Pareto-set

optimization.

•Calibration (Chapter 7): Calibration algorithms seek to maximize agreement between simulation outputs

and experimental data (or desired outputs). They are used to solve inverse problems (often referred to as

parameter estimation or least-squares problems). Dakota approaches include nonlinear least squares and

Bayesian calibration.

Dakota includes a number of related advanced capabilities. Surrogate models are inexpensive approximate mod-

els that are intended to capture the salient features of an expensive high-ﬁdelity model and include data ﬁts,

multiﬁdelity, and reduced-order model surrogates. They can be used to explore the variations in response quanti-

ties over regions of the parameter space, or they can serve as inexpensive stand-ins for optimization or uncertainty

quantiﬁcation studies. Section 8.4 summarizes surrogate model mechanics in Dakota, while optimization methods

tailored to particular surrogate approaches are surveyed in Section 14.6.

Nested models permit layering one Dakota method over another, enabling algorithms like mixed epistemic-

aleatory or second-order UQ, optimization under uncertainty, or surrogate-based UQ. Additional information

on these nested approaches is provided in Section 8.5 and Chapter 15.

The methods and algorithms in Dakota are designed to exploit parallel computing resources such as those found

in a desktop multiprocessor workstation, a network of workstations, or a massively parallel computing platform.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

1.3. COUPLING DAKOTA TO A SIMULATION 21

Figure 1.1: The loosely-coupled or “black-box” interface between Dakota and a user-supplied simulation code.

This parallel computing capability is a critical technology for rendering real-world engineering design problems

computationally tractable. See Chapter 17.

Dakota also has emerging capabilities in solution veriﬁcation and Bayesian calibration/UQ, which are documented

brieﬂy in the Dakota Reference Manual, and in later sections of this manual.

In addition to its iterative methods and algorithms, Dakota also provides a graphical user interface that allows

you to manage your Dakota studies visually. Among other features, the GUI allows you to intuitively link your

Dakota study to a simulation model (see Section 1.3 below) as well as graphically plot output data from your

Dakota study. For the full Dakota GUI tutorial, please visit this link: https://dakota.sandia.gov/

content/dakota-gui-user-manual.

1.3 Coupling Dakota to a Simulation

A key Dakota advantage is access to a broad range of iterative capabilities through a single, relatively simple,

interface between Dakota and your simulator. Trying a different iterative method or meta-algorithm typically

requires changing only a few commands in the Dakota text input ﬁle and starting the new analysis. It does not

require intimate knowledge of the underlying software package integrated in Dakota, with its unique command

syntax and interfacing requirements. In addition, Dakota will manage concurrent executions of your computa-

tional model in parallel, whether on a desktop or high-performance cluster computer.

Figure 1.1 depicts a typical loosely-coupled relationship between Dakota and the simulation code(s). Such cou-

pling is often referred to as “black-box,” as Dakota has no (or little) awareness of the internal details of the

computational model, obviating any need for its source code. Such loose coupling is the simplest and most com-

mon interfacing approach Dakota users employ. Dakota and the simulation code exchange data by reading and

writing short data ﬁles. Dakota is executed with commands that the user supplies in a text input ﬁle (not shown

in Figure 1.1) which specify the type of analysis to be performed (e.g., parameter study, optimization, uncer-

tainty quantiﬁcation, etc.), along with the ﬁle names associated with the user’s simulation code. During operation,

Dakota automatically executes the user’s simulation code by creating a separate process external to Dakota.

The solid lines in Figure 1.1 denote ﬁle input/output (I/O) operations inherent to Dakota or the user’s simulation

code. The dotted lines indicate passing or converting information that must be implemented by the user. As Dakota

Dakota Version 6.7 User’s Manual generated on November 13, 2017

22 CHAPTER 1. INTRODUCTION

runs, it writes out a parameters ﬁle containing the current variable values. Dakota then starts the user’s simulation

code (or, often, a short driver script wrapping it), and when the simulation completes, reads the response data from

a results ﬁle. This process is repeated until all of the simulations required by the iterative study are complete.

In some cases it is advantageous to have a close coupling between Dakota and the simulation code. This close

coupling is an advanced feature of Dakota and is accomplished through either a direct interface or a SAND (si-

multaneous analysis and design) interface. For the direct interface, the user’s simulation code is modiﬁed to

behave as a function or subroutine under Dakota. This interface can be considered to be “semi-intrusive” in that

it requires relatively minor modiﬁcations to the simulation code. Its major advantage is the elimination of the

overhead resulting from ﬁle I/O and process creation. It can also be a useful tool for parallel processing, by en-

capsulating all computation in a single executable. For details on direct interfacing, see Section 16.2. A SAND

interface approach is “fully intrusive” in that it requires further modiﬁcations to the simulation code so that an

optimizer has access to the internal residual vector and Jacobian matrices computed by the simulation code. In

a SAND approach, both the optimization method and a nonlinear simulation code are converged simultaneously.

While this approach can greatly reduce the computational expense of optimization, considerable software devel-

opment effort must be expended to achieve this intrusive coupling between SAND optimization methods and the

simulation code. SAND may be supported in future Dakota releases.

1.4 User’s Manual Organization

The Dakota User’s Manual is organized into the following major categories. New users should consult the Tutorial

to get started, then likely the Method Tour and Interfacing to select a Dakota method and build an interface to

your code.

•Tutorial (Chapter 2): How to obtain, install, and use Dakota, with a few introductory examples.

•Method Tour (Chapters 3through 7): Survey of the major classes of iterative methods included in Dakota,

with background, mathematical formulations, usage guidelines, and summary of supporting third-party

software.

•Models (Chapters 8through 11): Explanation of Dakota models, which manage the mapping from variables

through interfaces to responses, as well as details on parameter and response ﬁle formats for simulation code

interfacing.

•Input/Output (Chapters 12 and 13): Summary of input to Dakota, including tabular data, and outputs

generated by Dakota.

•Advanced Topics:

– Recursion with Components: Chapter 14 addresses component-based method recursions, and Chap-

ter 15 addresses component-based model recursions.

– Interfacing: Chapter 16 describes interfacing Dakota with engineering simulation codes in both

loose- and tightly-coupled modes.

– Parallelism: Chapter 17 describes Dakota’s parallel computing capabilities, with a summary of major

application parallel modes in Section 17.7.

– Fault Tolerance: Chapter 18 describes restart capabilities and utilities, and Chapter 19 explains ways

to detect and mitigate simulation failures.

•Additional Examples (Chapter 20): Supplemental example analysis problems and discussion.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

1.5. FILES REFERENCED IN THIS MANUAL 23

1.5 Files Referenced in this Manual

Dakota input ﬁles are shown in ﬁgures throughout the Manual. The ﬁlename is speciﬁed in the comments and un-

less speciﬁed otherwise, these ﬁles are available in the Dakota/examples/users directory, where Dakota

refers to the directory where Dakota was installed. Some of the input ﬁles have associated ﬁles, such as output or

tabular data, with the same base ﬁlename, and .sav appended to the names.

Additional ﬁles are referenced, and if the location differs then it will be speciﬁed in the text. A small number of

examples refer to ﬁles included only in the source directory, which is labeled Dakota Source. You will need a

copy of the source to view these ﬁles - see Section 2.1.1.

1.6 Summary

Dakota is both a production tool for engineering design and analysis activities and a research tool for the develop-

ment of new algorithms in optimization, uncertainty quantiﬁcation, and related areas. Because of the extensible,

object-oriented design of Dakota, it is relatively easy to add new iterative methods, meta-algorithms, simulation

interfacing approaches, surface ﬁtting methods, etc. In addition, Dakota can serve as a rapid prototyping tool for

algorithm development. That is, by having a broad range of building blocks available (i.e., parallel computing,

surrogate models, simulation interfaces, fundamental algorithms, etc.), new capabilities can be assembled rapidly

which leverage the previous software investments. For additional discussion on framework extensibility, refer to

the Dakota Developers Manual [2].

The capabilities of Dakota have been used to solve engineering design and optimization problems at Sandia Labs,

at other Department of Energy labs, and by our industrial and academic collaborators. Often, this real-world

experience has provided motivation for research into new areas of optimization. The Dakota development team

welcomes feedback on the capabilities of this software toolkit, as well as suggestions for new areas of research.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

24 CHAPTER 1. INTRODUCTION

Dakota Version 6.7 User’s Manual generated on November 13, 2017

Chapter 2

Dakota Tutorial

2.1 Quickstart

This section provides an overview of acquiring and installing Dakota, running a simple example, and looking

at the basic output available. More detailed information about downloads and installation can be found on the

Dakota website http://dakota.sandia.gov.

2.1.1 Acquiring and Installing Dakota

Dakota operates on most systems running Unix or Linux operating systems as well as on Windows, natively in

a Command Prompt window, and (optionally) with the help of a Cygwin emulation layer. Dakota is developed

and most extensively tested on Redhat Enterprise Linux with GNU compilers, but additional operating systems

/ compiler combinations are tested nightly as well. See the Dakota website for more information on supported

platforms for particular Dakota versions.

Department of Energy users: Dakota may already be available on your target system. Sandia users should visit

http://dakota.sandia.gov/ for information on supported Dakota installations on engineering networks

and cluster computers, as well as for Sandia-speciﬁc downloads. At other DOE institutions, contact your system

administrator about Dakota availability. If Dakota is not available for your target platform, you may still download

Dakota as described below.

New users should visit https://dakota.sandia.gov/quickstart.html to get started with Dakota.

This typically involves the following steps:

1. Download Dakota.

You may download binary executables for your preferred platforms or you can compile Dakota from source

code. Downloads are available from http://dakota.sandia.gov/download.html.

2. Install Dakota.

Instructions are available from http://dakota.sandia.gov/content/install-dakota. Guid-

ance is also included in the Dakota source ﬁles, including Dakota Source/INSTALL. Further platfor-

m/operating system-speciﬁc guidance can be found in Dakota Source/examples/platforms.

3. Verify that Dakota runs.

To perform a quick check that your Dakota executable runs, open a terminal window (in Windows, cmd.exe),

26 CHAPTER 2. DAKOTA TUTORIAL

and type:

dakota -v

Dakota version information should display in your terminal window. For a more detailed description of

Dakota command line options, see Section 2.4.

4. Participate in Dakota user communities.

Join Dakota mail lists to get the most up-to-date guidance for downloading, compiling, installing, or run-

ning. For information about mail lists, getting help, and other available help resources, see

http://dakota.sandia.gov/content/get-help.

2.1.2 Running Dakota with a simple input ﬁle

This section is intended for users who are new to Dakota, to demonstrate the basics of running a simple example.

First Steps

1. Make sure Dakota runs. You should see Dakota version information when you type: dakota -v

2. Create a working directory

3. Copy rosen multidim.in from the Dakota/examples/users/ directory to the working direc-

tory – see Section 1.5 for help.

4. From the working directory, run dakota -i rosen multidim.in -o rosen multidim.out

> rosen multidim.stdout

What should happen

Dakota outputs a large amount of information to help users track progress. Four ﬁles should have been created:

1. The screen output has been redirected to the ﬁle rosen multidim.stdout.

The contents are messages from Dakota and notes about the progress of the iterator (i.e. method/algorithm).

2. The output ﬁle rosen multidim.out contains information about the function evaluations.

3. rosen multidim.dat is created due to the speciﬁcation of tabular graphics data and

tabular graphics file. This summarizes the variables and responses for each function evaluation.

4. dakota.rst is a restart ﬁle. If a Dakota analysis is interrupted, it can be often be restarted without losing

all progress.

In addition to the ﬁles, some plots are created due to the speciﬁcation of graphics. These can be helpful when

processing the data or diagnosing unexpected results. If your particular installation or build of Dakota does not

support graphics, you will instead get a warning to this effect.

Dakota has some data processing capabilities for output analysis. The output ﬁle will contain the relevant results.

In this case, the output ﬁle has details about each of the 81 function evaluations. For more advanced or customized

data processing or visualization, the tabular data ﬁle can be imported into another analysis tool.

What now?

•Assuming Dakota ran successfully, skim the three text ﬁles (restart ﬁles are in a binary format). These are

described further in Section 2.1.3.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

2.1. QUICKSTART 27

•This example used a parameter study method, and the rosenbrock test problem. More details about the

example are in Section 2.3.2 and the test problem is described in Sections 2.3.1 and 20.2.

•Explore the many methods available in Dakota in Chapters 3–7.

•Try running the other examples in the same directory. These are mentioned throughout the manual and are

listed in Table 2.1 for convenience.

•Learn the syntax needed to use these methods. For help running Dakota, see Section 2.4 and for input ﬁle

information, see Section 2.2.

•Learn how to use your own analysis code with Dakota in Chapter 16.

2.1.3 Examples of Dakota output

Beyond numerical results, all output ﬁles provide information that allows the user to check that the actual analysis

was the intended analysis. More details on all outputs can be found in Chapter 13.

Screen output saved to a ﬁle

Whenever an output ﬁle is speciﬁed for a Dakota run, the screen output itself becomes quite minimal consisting

of version statements, environment statements and execution times.

Output ﬁle

The output ﬁle is much more extensive, because it contains information on every function evaluation (See Figure

2.1). It is organized into three basic parts:

1. Information on the problem

For this example, we see that a new restart ﬁle is being created and Dakota has carried out a

multidim parameter study with 8 partitions for each of two variables.

2. Information on each function evaluation

Each function evaluation is numbered. Details for function evaluation 1 show that at input vari-

able values x1 = −2.0and x2 = −2.0, the direct rosenbrock function is being evaluated. There

is one response with a value of 3.609e+03.

3. Summary statistics

The function evaluation summary is preceded by <<<<<. For this example 81 total evalua-

tions were assessed; all were new, none were read in from the restart ﬁle. Correlation matrices

complete the statistics and output for this problem. Successful runs will ﬁnish with <<<<<

Iterator study type completed.

Tabular output ﬁle

For this example, the default name for the tabular output ﬁle dakota tabular.dat was changed in the input

ﬁle to rosen multidim.dat. This tab-delimited text ﬁle (Figure 2.1.3) summarizes the inputs and outputs to

the function evaluator. The ﬁrst line contains the names of the variables and responses, as well as headers for the

evaluation id and interface columns.

%eval_id interface x1 x2 response_fn_1

Dakota Version 6.7 User’s Manual generated on November 13, 2017

28 CHAPTER 2. DAKOTA TUTORIAL

The number of function evaluations will match the number of evaluations listed in the summary part of the output

ﬁle for single method approaches; the names of inputs and outputs will match the descriptors speciﬁed in the input

ﬁle. The interface column is useful when a Dakota input ﬁle contains more than one simulation interface. In

this instance, there is only one, and it has no id interface speciﬁed, so Dakota has supplied a default value

of NO ID. This ﬁle is ideal for import into other data analysis packages.

{Writing new restart file dakota.rst

methodName = multidim_parameter_study

gradientType = none

hessianType = none

>>>>> Running multidim_parameter_study iterator.

Multidimensional parameter study for variable partitions of

8

8

------------------------------

Begin Function Evaluation 1

------------------------------

Parameters for function evaluation 1:

-2.0000000000e+00 x1

-2.0000000000e+00 x2

Direct function: invoking rosenbrock

Active response data for function evaluation 1:

Active set vector = { 1 }

3.6090000000e+03 response_fn_1

.

.

.

<<<<< Function evaluation summary: 81 total (81 new, 0 duplicate)

Simple Correlation Matrix among all inputs and outputs:

x1 x2 response_fn_1

x1 1.00000e+00

x2 1.73472e-17 1.00000e+00

response_fn_1 -3.00705e-03 -5.01176e-01 1.00000e+00

.

.

.

<<<<< Iterator multidim_parameter_study completed.}

Figure 2.1: Rosenbrock 2-D parameter study example: excerpt from output ﬁle

Dakota Version 6.7 User’s Manual generated on November 13, 2017

2.2. DAKOTA INPUT FILE FORMAT 29

%eval_id interface x1 x2 response_fn_1

1 NO_ID -2 -2 3609

2 NO_ID -1.5 -2 1812.5

3 NO_ID -1 -2 904

4 NO_ID -0.5 -2 508.5

Figure 2.2: Rosenbrock 2-D parameter study example: excerpt from tabular data ﬁle

2.2 Dakota Input File Format

See Section 1.5 for location of all ﬁles referenced in this manual.

There are six speciﬁcation blocks that may appear in Dakota input ﬁles. These are identiﬁed in the input ﬁle using

the following keywords: variables, interface, responses, model, method, and environment. While, these keyword

blocks can appear in any order in a Dakota input ﬁle, there is an inherent relationship that ties them together. The

simplest form of that relationship is shown in Figure 2.3 and can be summarized as follows: In each iteration of

its algorithm, a method block requests a variables-to-responses mapping from its model, which the model fulﬁlls

through an interface. While most Dakota analyses satisfy this relationship, where a single method runs a single

model, advanced cases are possible and are discussed in Chapter 14.

Figure 2.3: Relationship between the six blocks, for a simple study.

As a concrete example, a simple Dakota input ﬁle, rosen multidim.in, is shown in Figure 2.4 for a two-

dimensional parameter study on Rosenbrock’s function. This input ﬁle will be used to describe the basic format

and syntax used in all Dakota input ﬁles. The results are shown later, in Section 2.3.2.

The ﬁrst block of the input ﬁle shown in Figure 2.4 is the environment block. This keyword block is used

to specify the general Dakota settings such as Dakota’s graphical output (via the graphics ﬂag) and the

tabular data output (via the tabular graphics data keyword). In advanced cases, it also identiﬁes the

top method pointer that will control the Dakota study. The environment block is optional, and at most one

such block can appear in a Dakota input ﬁle.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

30 CHAPTER 2. DAKOTA TUTORIAL

# Dakota Input File: rosen_multidim.in

# Usage:

# dakota -i rosen_multidim.in -o rosen_multidim.out > rosen_multidim.stdout

environment

graphics

tabular_data

tabular_data_file = ’rosen_multidim.dat’

method

multidim_parameter_study

partitions = 8 8

model

single

variables

continuous_design = 2

lower_bounds -2.0 -2.0

upper_bounds 2.0 2.0

descriptors ’x1’ "x2"

interface

analysis_drivers = ’rosenbrock’

direct

responses

response_functions = 1

no_gradients

no_hessians

Figure 2.4: Rosenbrock 2-D parameter study example: the Dakota input ﬁle.

The method block of the input ﬁle speciﬁes which iterative method Dakota will employ and associated method

options. The keyword multidim parameter study in Figure 2.4 calls for a multidimensional parameter

study, while the keyword partitions speciﬁes the number of intervals per variable (a method option). In

this case, there will be eight intervals (nine data points) evaluated between the lower and upper bounds of both

variables (bounds provided subsequently in the variables section), for a total of 81 response function evaluations.

At least one method block is required, and multiple blocks may appear in Dakota input ﬁles for advanced studies.

The model block of the input ﬁle speciﬁes the model that Dakota will use. A model provides the logical unit

for determining how a set of variables is mapped through an interface into a set of responses when needed by

an iterative method. In the default case, the model allows one to specify a single set of variables, interface,

and responses. The model block is optional in this simple case. Alternatively, it can be explicitly deﬁned as in

Figure 2.4, where the keyword single speciﬁes the use of a single model in the parameter study. If one wishes to

perform more sophisticated studies such as surrogate-based analysis or optimization under uncertainty, the logical

organization speciﬁed in the model block becomes critical in informing Dakota on how to manage the different

components of such studies, and multiple model blocks are likely needed. See Chapter 8for relevant advanced

model speciﬁcation details.

The variables block of the input ﬁle speciﬁes the number, type, and characteristics of the parameters that will

Dakota Version 6.7 User’s Manual generated on November 13, 2017

2.3. EXAMPLES 31

be varied by Dakota. The variables can be classiﬁed as design variables, uncertain variables, or state variables.

Design variables are typically used in optimization and calibration, uncertain variables are used in UQ and sensi-

tivity studies, and state variables are usually ﬁxed. In all three cases, variables can be continuous or discrete, with

discrete having real, integer, and string subtypes. See Chapter 9for more information on the types of variables

supported by Dakota. The variables section shown in Figure 2.4 speciﬁes that there are two continuous design

variables. The sub-speciﬁcations for continuous design variables provide the descriptors “x1” and “x2” as well as

lower and upper bounds for these variables. The information about the variables is organized in column format

for readability. So, both variables x1and x2have a lower bound of -2.0 and an upper bound of 2.0. At least one

variables block is required, and multiple blocks may appear in Dakota input ﬁles for advanced studies.

The interface block of the input ﬁle speciﬁes the simulation code that will be used to map variables into responses

as well as details on how Dakota will pass data to and from that code. In this example, the keyword direct

is used to indicate the use of a function linked directly into Dakota, and data is passed directly between the two.

The name of the function is identiﬁed by the analysis driver keyword. Alternatively, fork or system

executions can be used to invoke instances of a simulation code that is external to Dakota as explained in Section

2.3.5.2 and Chapter 16. In this case, data is passed between Dakota and the simulation via text ﬁles. At least one

interface block is required, and multiple blocks may appear in Dakota input ﬁles for advanced studies.

The responses block of the input ﬁle speciﬁes the types of data that the interface will return to Dakota. They

are categorized primarily according to usage. Objective functions are used in optimization, calibration terms in

calibration, and response functions in sensitivity analysis and UQ. For the example shown in Figure 2.4, the as-

signment response functions = 1 indicates that there is only one response function. The responses block

can include additional information returned by the interface. That includes constraints and derivative information,

both discussed in Chapter 11. In this example, there are no constraints associated with Rosenbrock’s function,

so the keywords for constraint speciﬁcations are omitted. The keywords no gradients and no hessians

indicate that no derivatives will be provided to the method; none are needed for a parameter study. At least one

responses block is required, and multiple blocks may appear in Dakota input ﬁles for advanced studies.

We close this section with a list of rules regarding the formatting of the Dakota input ﬁle.

•“Flat” text only.

•Whitespace is ignored.

•Comments begin with # and continue to the end of the line.

•Keyword order is largely unimportant as long as major sections are respected and there is no ambiguity.

•Equal signs are optional.

•Strings can be surrounded by single or double quotes (but not “fancy” quotes).

•Scientiﬁc notation is ﬁne.

Please see the Dakota Reference Manual [3] for additional details on this input ﬁle syntax.

2.3 Examples

This section serves to familiarize users with how to perform parameter studies, optimization, and uncertainty

quantiﬁcation through their common Dakota interface. The initial examples utilize simple built in driver functions;

later we show how to utilize Dakota to drive the evaluation of user supplied black box code. The examples

presented in this chapter are intended to show the simplest use of Dakota for methods of each type. More advanced

examples of using Dakota for speciﬁc purposes are provided in subsequent, topic-based, chapters.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

32 CHAPTER 2. DAKOTA TUTORIAL

2.3.1 Rosenbrock Test Problem

The examples shown later in this chapter use the Rosenbrock function [115] (also described in [57], among other

places), which has the form:

f(x1, x2) = 100(x2−x2

1)2+ (1 −x1)2(2.1)

A three-dimensional plot of this function is shown in Figure 2.5(a), where both x1and x2range in value from −2

to 2. Figure 2.5(b) shows a contour plot for Rosenbrock’s function. An optimization problem using Rosenbrock’s

function is formulated as follows:

minimize f(x1, x2)

x∈ <2

subject to −2≤x1≤2(2.2)

−2≤x2≤2

(a) (b)

Figure 2.5: Rosenbrock’s function: (a) 3-D plot and (b) contours with x1on the bottom axis.

Note that there are no linear or nonlinear constraints in this formulation, so this is a bound constrained optimization

problem. The unique solution to this problem lies at the point (x1, x2) = (1,1), where the function value is zero.

Several other test problems are available. See Chapter 20 for a description of these test problems as well as further

discussion of the Rosenbrock test problem.

2.3.2 Two-Dimensional Grid Parameter Study

Parameter study methods in the Dakota toolkit involve the computation of response data sets at a selection of

points in the parameter space. These response data sets are not linked to any speciﬁc interpretation, so they may

consist of any allowable speciﬁcation from the responses keyword block, i.e., objective and constraint functions,

Dakota Version 6.7 User’s Manual generated on November 13, 2017

2.3. EXAMPLES 33

least squares terms and constraints, or generic response functions. This allows the use of parameter studies in

direct coordination with optimization, least squares, and uncertainty quantiﬁcation studies without signiﬁcant

modiﬁcation to the input ﬁle.

An example of a parameter study is the 2-D parameter study example problem listed in Figure 2.4. This is

executed by Dakota using the command noted in the comments:

dakota -i rosen_multidim.in -o rosen_multidim.out > rosen_multidim.stdout

The output of the Dakota run is written to the ﬁle named rosen multidim.out while the screen output, or

standard output, is redirect to rosen multidim.stdout. For comparison, ﬁles named rosen multidim.out.sav

and rosen multidim.stdout.sav are included in the Dakota/examples/users directory. As for

many of the examples, Dakota provides a report on the best design point located during the study at the end of

these output ﬁles.

This 2-D parameter study produces the grid of data samples shown in Figure 2.6. In general, a multidimensional

parameter study lets one generate a grid in multiple dimensions. The keyword multidim parameter study

indicates that a grid will be generated over all variables. The keyword partitions indicates the number of grid

partitions in each dimension. For this example, the number of the grid partitions are the same in each dimension (8

partitions) but it would be possible to specify (partitions = 8 2), and have only two partitions over the second input

variable. Note that the graphics ﬂag in the environment block of the input ﬁle could be commented out since,

for this example, the iteration history plots created by Dakota are not particularly instructive. More interesting

visualizations can be created by importing Dakota’s tabular data into an external graphics/plotting package. Com-

mon graphics and plotting packages include Mathematica, Matlab, Microsoft Excel, Origin, Tecplot, Gnuplot and

many others. (Sandia National Laboratories and the Dakota developers do not endorse any of these commercial

products.)

Figure 2.6: Rosenbrock 2-D parameter study example: location of the design points (dots) evaluated.

2.3.3 Gradient-based Unconstrained Optimization

Dakota’s optimization capabilities include a variety of gradient-based and nongradient-based optimization meth-

ods. This subsection demonstrates the use of one such method through the Dakota interface.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

34 CHAPTER 2. DAKOTA TUTORIAL

A Dakota input ﬁle for a gradient-based optimization of Rosenbrock’s function is listed in Figure 2.7. The

format of the input ﬁle is similar to that used for the parameter studies, but there are some new keywords in

the responses and method sections. First, in the responses block of the input ﬁle, the keyword block start-

ing with numerical gradients speciﬁes that a ﬁnite difference method will be used to compute gradi-

ents for the optimization algorithm. Note that the Rosenbrock function evaluation code inside Dakota has the

ability to give analytical gradient values. (To switch from ﬁnite difference gradient estimates to analytic gra-

dients, uncomment the analytic gradients keyword and comment out the four lines associated with the

numerical gradients speciﬁcation.) Next, in the method block of the input ﬁle, several new keywords have

been added. In this block, the keyword conmin frcg indicates the use of the Fletcher-Reeves conjugate gra-

dient algorithm in the CONMIN optimization software package [139] for bound-constrained optimization. The

keyword max iterations is used to indicate the computational budget for this optimization (in this case, a

single iteration includes multiple evaluations of Rosenbrock’s function for the gradient computation steps and the

line search steps). The keyword convergence tolerance is used to specify one of CONMIN’s convergence

criteria (under which CONMIN terminates if the objective function value differs by less than the absolute value

of the convergence tolerance for three successive iterations).

The Dakota command is noted in the ﬁle, and copies of the outputs are in the Dakota/examples/users

directory, with .sav appended to the name. When this example problem is executed using Dakota with graph-

ics support enabled, Dakota creates some iteration history graphics similar to the screen capture shown in Fig-

ure 2.8(a). These plots show how the objective function and design parameters change in value during the

optimization steps. The scaling of the horizontal and vertical axes can be changed by moving the scroll knobs on

each plot. Also, the “Options” button allows the user to plot the vertical axes using a logarithmic scale. Note that

log-scaling is only allowed if the values on the vertical axis are strictly greater than zero.

Figure 2.8(b) shows the iteration history of the optimization algorithm. The optimization starts at the point

(x1, x2)=(−1.2,1.0) as given in the Dakota input ﬁle. Subsequent iterations follow the banana-shaped val-

ley that curves around toward the minimum point at (x1, x2) = (1.0,1.0). Note that the function evaluations

associated with the line search phase of each CONMIN iteration are not shown on the plot. At the end of the

Dakota run, information is written to the output ﬁle to provide data on the optimal design point. These data in-

clude the optimum design point parameter values, the optimum objective and constraint function values (if any),

plus the number of function evaluations that occurred and the amount of time that elapsed during the optimization

study.

2.3.4 Uncertainty Quantiﬁcation with Monte Carlo Sampling

Uncertainty quantiﬁcation (UQ) is the process of determining the effect of input uncertainties on response metrics

of interest. These input uncertainties may be characterized as either aleatory uncertainties, which are irreducible

variabilities inherent in nature, or epistemic uncertainties, which are reducible uncertainties resulting from a lack

of knowledge. Since sufﬁcient data is generally available for aleatory uncertainties, probabilistic methods are

commonly used for computing response distribution statistics based on input probability distribution speciﬁca-

tions. Conversely, for epistemic uncertainties, data is generally sparse, making the use of probability theory

questionable and leading to nonprobabilistic methods based on interval speciﬁcations.

The subsection demonstrates the use of Monte Carlo random sampling for Uncertainty Quantiﬁcation.

Figure 2.9 shows the Dakota input ﬁle for an example problem that demonstrates some of the random sampling

capabilities available in Dakota. In this example, the design parameters, x1 and x2, will be treated as uncertain

parameters that have uniform distributions over the interval [-2, 2]. This is speciﬁed in the variables block of

the input ﬁle, beginning with the keyword uniform uncertain. Another difference from earlier input ﬁles

such as Figure 2.7 occurs in the responses block, where the keyword response functions is used in place

of objective functions. The ﬁnal changes to the input ﬁle occur in the method block, where the keyword

Dakota Version 6.7 User’s Manual generated on November 13, 2017

2.3. EXAMPLES 35

# Dakota Input File: rosen_grad_opt.in

# Usage:

# dakota -i rosen_grad_opt.in -o rosen_grad_opt.out > rosen_grad_opt.stdout

environment

graphics

tabular_data

tabular_data_file = ’rosen_grad_opt.dat’

method

conmin_frcg

convergence_tolerance = 1e-4

max_iterations = 100

model

single

variables

continuous_design = 2

initial_point -1.2 1.0

lower_bounds -2.0 -2.0

upper_bounds 2.0 2.0

descriptors ’x1’ "x2"

interface

analysis_drivers = ’rosenbrock’

direct

responses

objective_functions = 1

# analytic_gradients

numerical_gradients

method_source dakota

interval_type forward

fd_step_size = 1.e-5

no_hessians

Figure 2.7: Rosenbrock gradient-based unconstrained optimization example: the Dakota input ﬁle.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

36 CHAPTER 2. DAKOTA TUTORIAL

(a)

(b)

Figure 2.8: Rosenbrock gradient-based unconstrained optimization example: (a) screen capture of the Dakota

graphics and (b) sequence of design points (dots) evaluated (line search points omitted).

sampling is used.

The other keywords in the methods block of the input ﬁle specify the number of samples (200), the seed for the

random number generator (17), the sampling method (random), and the response threshold (100.0). The seed

speciﬁcation allows a user to obtain repeatable results from multiple runs. If a seed value is not speciﬁed, then

Dakota’s sampling methods are designed to generate nonrepeatable behavior (by initializing the seed using a

system clock). The keyword response levels allows the user to specify threshold values for which Dakota

will output statistics on the response function output. Note that a unique threshold value can be speciﬁed for each

response function.

In this example, Dakota will select 200 design points from within the parameter space, evaluate the value of

Rosenbrock’s function at all 200 points, and then perform some basic statistical calculations on the 200 response

values.

The Dakota command is noted in the ﬁle, and copies of the outputs are in the Dakota/examples/users

Dakota Version 6.7 User’s Manual generated on November 13, 2017

2.3. EXAMPLES 37

directory, with .sav appended to the name. Figure 2.10 shows example results from this sampling method. Note

that your results will differ from those in this ﬁle if your seed value differs or if no seed is speciﬁed.

As shown in Figure 2.10, the statistical data on the 200 Monte Carlo samples is printed at the end of the output

ﬁle in the section that starts with “Statistics based on 200 samples.” In this section summarizing moment-based

statistics, Dakota outputs the mean, standard deviation, skewness, and kurtosis estimates for each of the response

functions. For example, the mean of the Rosenbrock function given uniform input uncertainties on the input

variables is 455.4 and the standard deviation is 536.8. This is a very large standard deviation, due to the fact that

the Rosenbrock function varies by three orders of magnitude over the input domain. The skewness is positive,

meaning this is a right-tailed distribution, not a symmetric distribution. Finally, the kurtosis (a measure of the

“peakedness” of the distribution) indicates that this is a strongly peaked distribution (note that we use a central,

standardized kurtosis so that the kurtosis of a normal is zero). After the moment-related statistics, the 95% con-

ﬁdence intervals on the mean and standard deviations are printed. This is followed by the fractions (“Probability

Level”) of the response function values that are below the response threshold values speciﬁed in the input ﬁle. For

example, 34 percent of the sample inputs resulted in a Rosenbrock function value that was less than or equal to

100, as shown in the line listing the cumulative distribution function values. Finally, there are several correlation

matrices printed at the end, showing simple and partial raw and rank correlation matrices. Correlations provide

an indication of the strength of a monotonic relationship between input and outputs. More detail on correlation

coefﬁcients and their interpretation can be found in Section 5.2.1. More detail about sampling methods in general

can be found in Section 5.2. Finally, Figure 2.11 shows the locations of the 200 sample sites within the parameter

space of the Rosenbrock function for this example.

2.3.5 User Supplied Simulation Code Examples

This subsection provides examples of how to use Dakota to drive user supplied black box code.

2.3.5.1 Optimization with a User-Supplied Simulation Code - Case 1

Many of the previous examples made use of the direct interface to access the Rosenbrock and textbook test

functions that are compiled into Dakota. In engineering applications, it is much more common to use the fork

interface approach within Dakota to manage external simulation codes. In both of these cases, the communication

between Dakota and the external code is conducted through the reading and writing of short text ﬁles. For this

example, the C++ program rosenbrock.cpp in Dakota Source/test is used as the simulation code. This

ﬁle is compiled to create the stand-alone rosenbrock executable that is referenced as the analysis driver

in Figure 2.12. This stand-alone program performs the same function evaluations as Dakota’s internal Rosenbrock

test function.

Figure 2.12 shows the text of the Dakota input ﬁle named rosen syscall.in that is provided in the directory

Dakota/examples/users. The only differences between this input ﬁle and the one in Figure 2.7 occur in

the interface keyword section. The keyword fork indicates that Dakota will use fork calls to create separate Unix

processes for executions of the user-supplied simulation code. The name of the simulation code, and the names

for Dakota’s parameters and results ﬁle are speciﬁed using the analysis driver,parameters file, and

results file keywords, respectively.

The Dakota command is noted in the ﬁle, and copies of the outputs are in the Dakota/examples/users

directory, with .sav appended to the name.

This run of Dakota takes longer to complete than the previous gradient-based optimization example since the

fork interface method has additional process creation and ﬁle I/O overhead, as compared to the internal com-

munication that occurs when the direct interface method is used.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

38 CHAPTER 2. DAKOTA TUTORIAL

# Dakota Input File: rosen_sampling.in

# Usage:

# dakota -i rosen_sampling.in -o rosen_sampling.out > rosen_sampling.stdout

environment

graphics

tabular_data

tabular_data_file = ’rosen_sampling.dat’

method

sampling

sample_type random

samples = 200

seed = 17

response_levels = 100.0

model

single

variables

uniform_uncertain = 2

lower_bounds -2.0 -2.0

upper_bounds 2.0 2.0

descriptors ’x1’ ’x2’

interface

analysis_drivers = ’rosenbrock’

direct

responses

response_functions = 1

no_gradients

no_hessians

Figure 2.9: Monte Carlo sampling example: the Dakota input ﬁle.

To gain a better understanding of what exactly Dakota is doing with the fork interface approach, add the key-

words file tag and file save to the interface speciﬁcation and re-run Dakota. Check the listing of the

local directory and you will see many new ﬁles with names such as params.in.1,params.in.2, etc., and

results.out.1,results.out.2, etc. There is one params.in.X ﬁle and one results.out.X ﬁle

for each of the function evaluations performed by Dakota. This is the ﬁle listing for params.in.1:

2 variables

-1.200000000000000e+00 x1

1.000000000000000e+00 x2

1 functions

1 ASV_1:obj_fn

2 derivative_variables

1 DVV_1:x1

2 DVV_2:x2

0 analysis_components

Dakota Version 6.7 User’s Manual generated on November 13, 2017

2.3. EXAMPLES 39

Statistics based on 200 samples:

Moment-based statistics for each response function:

Mean Std Dev Skewness Kurtosis

response_fn_1 4.5540183516e+02 5.3682678089e+02 1.6661798252e+00 2.7925726822e+00

95% confidence intervals for each response function:

LowerCI_Mean UpperCI_Mean LowerCI_StdDev UpperCI_StdDev

response_fn_1 3.8054757609e+02 5.3025609422e+02 4.8886795789e+02 5.9530059589e+02

Level mappings for each response function:

Cumulative Distribution Function (CDF) for response_fn_1:

Response Level Probability Level Reliability Index General Rel Index

-------------- ----------------- ----------------- -----------------

1.0000000000e+02 3.4000000000e-01

Probability Density Function (PDF) histograms for each response function:

PDF for response_fn_1:

Bin Lower Bin Upper Density Value

--------- --------- -------------

1.1623549854e-01 1.0000000000e+02 3.4039566059e-03

1.0000000000e+02 2.7101710856e+03 2.5285698843e-04

Simple Correlation Matrix among all inputs and outputs:

x1 x2 response_fn_1

x1 1.00000e+00

x2 -5.85097e-03 1.00000e+00

response_fn_1 -9.57746e-02 -5.08193e-01 1.00000e+00

Partial Correlation Matrix between input and output:

response_fn_1

x1 -1.14659e-01

x2 -5.11111e-01

Simple Rank Correlation Matrix among all inputs and outputs:

x1 x2 response_fn_1

x1 1.00000e+00

x2 -6.03315e-03 1.00000e+00

response_fn_1 -1.15360e-01 -5.04661e-01 1.00000e+00

Partial Rank Correlation Matrix between input and output:

response_fn_1

x1 -1.37154e-01

x2 -5.08762e-01

Figure 2.10: Results of Monte Carlo Sampling on the Rosenbrock Function

Dakota Version 6.7 User’s Manual generated on November 13, 2017

40 CHAPTER 2. DAKOTA TUTORIAL

Figure 2.11: Monte Carlo sampling example: locations in the parameter space of the 200 Monte Carlo samples

using a uniform distribution for both x1and x2.

The basic pattern is that of array lengths and string identiﬁers followed by listings of the array entries, where the

arrays consist of the variables, the active set vector (ASV), the derivative values vector (DVV), and the analysis

components (AC). For the variables array, the ﬁrst line gives the total number of variables (2) and the “variables”

string identiﬁer, and the subsequent two lines provide the array listing for the two variable values (-1.2 and 1.0)

and descriptor tags (“x1” and “x2” from the Dakota input ﬁle). The next array conveys the ASV, which indicates

what simulator outputs are needed. The ﬁrst line of the array gives the total number of response functions (1)

and the “functions” string identiﬁer, followed by one ASV code and descriptor tag (“ASV 1”) for each function.

In this case, the ASV value of 1 indicates that Dakota is requesting that the simulation code return the response

function value in the ﬁle results.out.X. (Possible ASV values: 1 = value of response function value, 2

= response function gradient, 4 = response function Hessian, and any sum of these for combinations up to 7

= response function value, gradient, and Hessian; see 9.7 for more detail.) The next array provides the DVV,

which deﬁnes the variable identiﬁers used in computing derivatives. The ﬁrst line of the array gives the number

of derivative variables (2) and the “derivative variables” string identiﬁer, followed by the listing of the two DVV

variable identiﬁers (the ﬁrst and second variables) and descriptor tags (“DVV 1” and “DVV 2”). The ﬁnal array

provides the AC array used to provide additional strings for use by the simulator (e.g., to provide the name of

a particular mesh ﬁle). The ﬁrst line of the array gives the total number of analysis components (0) and the

“analysis components” string identiﬁer, followed by the listing of the array, which is empty in this case.

The executable program rosenbrock reads in the params.in.X ﬁle and evaluates the objective function at the

given values for x1and x2. Then, rosenbrock writes out the objective function data to the results.out.X ﬁle.

Here is the listing for the ﬁle results.out.1:

2.420000000000000e+01 f

The value shown above is the value of the objective function, and the descriptor ‘f’ is an optional tag returned by

the simulation code. When the fork call has completed, Dakota reads in the data from the results.in.X ﬁle

and processes the results. Dakota then continues with additional executions of the rosenbrock program until the

optimization process is complete.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

2.3. EXAMPLES 41

2.3.5.2 Optimization with a User-Supplied Simulation Code - Case 2

In many situations the user-supplied simulation code cannot be modiﬁed to read and write the params.in.X

file and the results.out.X ﬁle, as described above. Typically, this occurs when the simulation code is a

commercial or proprietary software product that has speciﬁc input ﬁle and output ﬁle formats. In such cases, it

is common to replace the executable program name in the Dakota input ﬁle with the name of a Unix shell script

containing a sequence of commands that read and write the necessary ﬁles and run the simulation code. For

example, the executable program named rosenbrock listed in Figure 2.12 could be replaced by a Unix Bourne

or C-shell script named simulator script, with the script containing a sequence of commands to perform

# Dakota Input File: rosen_syscall.in

# Usage:

# dakota -i rosen_syscall.in -o rosen_syscall.out > rosen_syscall.stdout

environment

graphics

tabular_data

tabular_data_file = ’rosen_syscall.dat’

method

conmin_frcg

convergence_tolerance = 1e-4

max_iterations = 100

model

single

variables

continuous_design = 2

initial_point -1.2 1.0

lower_bounds -2.0 -2.0

upper_bounds 2.0 2.0

descriptors ’x1’ "x2"

interface

analysis_drivers = ’rosenbrock’

fork

parameters_file = ’params.in’

results_file = ’results.out’

responses

objective_functions = 1

numerical_gradients

method_source dakota

interval_type forward

fd_step_size = 1.e-5

no_hessians

Figure 2.12: Dakota input ﬁle for gradient-based optimization using the fork call interface to an external rosen-

brock simulator.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

42 CHAPTER 2. DAKOTA TUTORIAL

the following steps: insert the data from the parameters.in.X ﬁle into the input ﬁle of the simulation code,

execute the simulation code, post-process the ﬁles generated by the simulation code to compute response data, and

return the response data to Dakota in the results.out.X ﬁle. The steps that are typically used in constructing

and using a Unix shell script are described in Section 10.3.

2.4 Dakota Command-Line Options

The Dakota executable ﬁle is named dakota (dakota.exe on Windows). If this command is entered at the

command prompt without any arguments, a usage message similar to the following appears:

usage: dakota [options and <args>]

-help (Print this summary)

-version (Print Dakota version number)

-input <$val> (REQUIRED Dakota input file $val)

-output <$val> (Redirect Dakota standard output to file $val)

-error <$val> (Redirect Dakota standard error to file $val)

-parser <$val> (Parsing technology: nidr[strict][:dumpfile])

-no_input_echo (Do not echo Dakota input file)

-check (Perform input checks)

-pre_run [$val] (Perform pre-run (variables generation) phase)

-run [$val] (Perform run (model evaluation) phase)

-post_run [$val] (Perform post-run (final results) phase)

-read_restart [$val] (Read an existing Dakota restart file $val)

-stop_restart <$val> (Stop restart file processing at evaluation $val)

-write_restart [$val] (Write a new Dakota restart file $val)

Of these available command line inputs, only the “-input” option is required, and “-input” can be omitted

if the input ﬁle name is the ﬁnal item on the command line; all other command-line inputs are optional. The

“-help” option prints the usage message above. The “-version” option prints the version number of the

executable. The “-check” option invokes a dry-run mode in which the input ﬁle is processed and checked for

errors, but the study is not performed. The “-input” option provides the name of the Dakota input ﬁle. The

“-output” and “-error” options provide ﬁle names for redirection of the Dakota standard output (stdout)

and standard error (stderr), respectively. By default, Dakota will echo the input ﬁle to the output stream, but

“-no input echo” can override this behavior.

The “-parser” input is for debugging and will not be further described here. The “-read restart” and

“-write restart” options provide the names of restart databases to read from and write to, respectively.

The “-stop restart” option limits the number of function evaluations read from the restart database (the

default is all the evaluations) for those cases in which some evaluations were erroneous or corrupted. Restart

management is an important technique for retaining data from expensive engineering applications. This advanced

topic is discussed in detail in Chapter 18. Note that these command line options can be abbreviated so long as

the abbreviation is unique. Accordingly, the following are valid, unambiguous speciﬁcations: “-h”, “-v”, “-c”,

“-i”, “-o”, “-e”, “-re”, “-s”, “-w”, “-pr”, “-ru”, and “-po” and can be used in place of the longer forms

of the command line options.

To run Dakota with a particular input ﬁle, the following syntax can be used:

dakota -i dakota.in

or more simply

Dakota Version 6.7 User’s Manual generated on November 13, 2017

2.5. NEXT STEPS 43

dakota dakota.in

This will echo the standard output (stdout) and standard error (stderr) messages to the terminal. To redirect stdout

and stderr to separate ﬁles, the -o and -e command line options may be used:

dakota -i dakota.in -o dakota.out -e dakota.err

or

dakota -o dakota.out -e dakota.err dakota.in

Alternatively, any of a variety of Unix redirection variants can be used. The simplest of these redirects stdout to

another ﬁle:

dakota dakota.in > dakota.out

To run the dakota process in the background, append an ampersand symbol (&) to the command with an embedded

space, e.g.,

dakota dakota.in > dakota.out &

Refer to [6] for more information on Unix redirection and background commands.

The “-pre run”, “-run”, and “-post run” options instruct Dakota to run one or more execution phases,

excluding others. For example pre-run might generate variable sets, run (core run) might invoke the simulation

to evaluate variables and produce responses, and post-run might accept variable/response sets and analyzes the

results (for example, calculate correlations from a set of samples). Currently only two modes are supported and

only for sampling, parameter study, and DACE methods: (1) pre-run only with optional tabular output of variables:

dakota -i dakota.in -pre_run [::myvariables.dat]

and (2) post-run only with required tabular input of variables/responses:

dakota -i dakota.in -post_run myvarsresponses.dat::

2.5 Next Steps

After reading this chapter, you should understand the mechanics of acquiring, installing, and executing Dakota

to perform simple studies. You should have a high-level appreciation for what inputs Dakota requires, how it

behaves during interfacing and operation for a few kinds of studies, and what representative output is obtained.

To effectively use Dakota, you will need to understand the character of your problem, select a Dakota method to

help you meet your analysis goals, and develop an interface to your computational model.

2.5.1 Problem Exploration and Method Selection

Section 1.2 provides a high-level overview of the analysis techniques available in Dakota, with references to more

details and usage guidelines in the following chapters. Selecting an appropriate method to meet your analysis

goals requires understanding problem characteristics. For example, in optimization, typical questions that should

Dakota Version 6.7 User’s Manual generated on November 13, 2017

44 CHAPTER 2. DAKOTA TUTORIAL

be addressed include: Are the design variables continuous, discrete, or mixed? Is the problem constrained or un-

constrained? How expensive are the response functions to evaluate? Will the response functions behave smoothly

as the design variables change or will there be nonsmoothness and/or discontinuities? Are the response functions

likely to be multimodal, such that global optimization may be warranted? Is analytic gradient data available,

and if not, can gradients be calculated accurately and cheaply? Questions pertinent for uncertainty quantiﬁcation

may include: Can I accurately model the probabilistic distributions of my uncertain variables? Are the response

functions relatively linear? Am I interested in a full random process characterization of the response functions, or

just statistical results?

If there is not sufﬁcient information from the problem description and prior knowledge to answer these questions,

then additional problem characterization activities may be warranted. Dakota parameter studies and design of

experiments methods can help answer these questions by systematically interrogating the model. The resulting

trends in the response functions can be evaluated to determine if these trends are noisy or smooth, unimodal or

multimodal, relatively linear or highly nonlinear, etc. In addition, the parameter studies may reveal that one or

more of the parameters do not signiﬁcantly affect the results and can be omitted from the problem formulation.

This can yield a potentially large savings in computational expense for the subsequent studies. Refer to Chapters 3

and 4for additional information on parameter studies and design of experiments methods.

For a list of all the example Dakota input ﬁles, see Table 2.1. All of these input ﬁles can be found in

Dakota/examples/users.

2.5.2 Key Getting Started References

The following references address many of the most common questions raised by new Dakota users:

•Dakota documentation and training materials are available from the Dakota website http://dakota.

sandia.gov.

•Dakota input ﬁle syntax (valid keywords and settings) is described in the Dakota Reference Manual [3].

•Example input ﬁles are included throughout this manual, and are included in Dakota distributions and

installations. See Section 1.5 for help ﬁnding these ﬁles.

•Detailed method descriptions appear in the Method Tour in Chapters 3through 7.

•Building an interface to a simulation code: Section 10.3, and related information on parameters ﬁle formats

(Section 9.6) and results ﬁle format (Section 11.2).

•Chapter 13 describes the different Dakota output ﬁle formats, including commonly encountered error mes-

sages.

•Chapter 18 describes the ﬁle restart and data re-use capabilities of Dakota.

•Documentation for getting started with the Dakota Graphical User Interface may be found here: https:

//dakota.sandia.gov/content/dakota-gui-user-manual.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

2.5. NEXT STEPS 45

Table 2.1: List of Example Input Files

Method Category Speciﬁc Method Input File Name Reference

parameter study multidimensional parameter study rosen multidim.in 2.4

parameter study vector parameter study rosen ps vector.in 3.3

design of comp exp moat (Morris) morris ps moat.in 20.22

uncertainty quantiﬁcation random sampling rosen sampling.in 2.9

uncertainty quantiﬁcation LHS sampling textbook uq sampling.in 5.2

uncertainty quantiﬁcation polynomial chaos expansion rosen uq pce.in 5.14

uncertainty quantiﬁcation stochastic collocation rosen uq sc.in 5.17

uncertainty quantiﬁcation reliability Mean Value textbook uq meanvalue.in 5.8

uncertainty quantiﬁcation reliability FORM logratio uq reliability.in 5.10

uncertainty quantiﬁcation global interval analysis cantilever uq global interval.in 5.20

uncertainty quantiﬁcation global evidence analysis textbook uq glob evidence.in 5.23

optimization gradient-based, unconstrained rosen grad opt.in 2.7

optimization gradient-based, unconstrained rosen syscall.in 2.12

optimization gradient-based, constrained textbook opt conmin.in 20.2

optimization gradient-based, constrained cantilever opt npsol.in 20.16

optimization gradient-based, constrained container opt npsol.in 13.1

optimization evolutionary algorithm rosen opt ea.in 6.3

optimization pattern search rosen opt patternsearch.in 6.1

optimization efﬁcient global optimization (EGO) rosen opt ego.in 6.5

optimization efﬁcient global optimization (EGO) herbie shubert opt ego.in 20.7

optimization multiobjective textbook opt multiobj1.in 6.6

optimization Pareto opt., moga mogatest1.in 6.8

optimization Pareto opt., moga mogatest2.in 20.17

optimization Pareto opt., moga mogatest3.in 20.19

optimization optimization with scaling rosen opt scaled.in 6.10

calibration nonlinear least squares rosen opt nls.in 20.4

calibration NLS with dataﬁle textbook nls dataﬁle.in

advanced methods hybrid minimization textbook hybrid strat.in 14.1

advanced methods Pareto minimization textbook pareto strat.in 14.4

advanced methods multistart minimization qsf multistart strat.in 14.2

advanced methods surrogate based global mogatest1 opt sbo.in 14.10

advanced methods surrogate based local rosen opt sbo.in 14.8

advanced models opt. under uncertainty (OUU) textbook opt ouu1.in 15.5

advanced models second order probability cantilever uq sop rel.in 15.2

advanced models sampling on a surrogate model textbook uq surrogate.in

Dakota Version 6.7 User’s Manual generated on November 13, 2017

46 CHAPTER 2. DAKOTA TUTORIAL

Dakota Version 6.7 User’s Manual generated on November 13, 2017

Chapter 3

Parameter Study Capabilities

3.1 Overview

Dakota parameter studies explore the effect of parametric changes within simulation models by computing re-

sponse data sets at a selection of points in the parameter space, yielding one type of sensitivity analysis. (For a

comparison with DACE-based sensitivity analysis, see Section 4.6.) The selection of points is deterministic and

structured, or user-speciﬁed, in each of the four available parameter study methods:

•Vector: Performs a parameter study along a line between any two points in an n-dimensional parameter

space, where the user speciﬁes the number of steps used in the study.

•List: The user supplies a list of points in an n-dimensional space where Dakota will evaluate response data

from the simulation code.

•Centered: Given a point in an n-dimensional parameter space, this method evaluates nearby points along

the coordinate axes of the parameter space. The user selects the number of steps and the step size.

•Multidimensional: Forms a regular lattice or hypergrid in an n-dimensional parameter space, where the

user speciﬁes the number of intervals used for each parameter.

More detail on these parameter studies is found in Sections 3.2 through 3.5 below.

When used in parameter studies, the response data sets are not linked to any speciﬁc interpretation, so they may

consist of any allowable speciﬁcation from the responses keyword block, i.e., objective and constraint functions,

least squares terms and constraints, or generic response functions. This allows the use of parameter studies in

alternation with optimization, least squares, and uncertainty quantiﬁcation studies with only minor modiﬁcation

to the input ﬁle. In addition, the response data sets may include gradients and Hessians of the response functions,

which will be catalogued by the parameter study. This allows for several different approaches to “sensitivity

analysis”: (1) the variation of function values over parameter ranges provides a global assessment as to the

sensitivity of the functions to the parameters, (2) derivative information can be computed numerically, provided

analytically by the simulator, or both (mixed gradients) in directly determining local sensitivity information at a

point in parameter space, and (3) the global and local assessments can be combined to investigate the variation of

derivative quantities through the parameter space by computing sensitivity information at multiple points.

In addition to sensitivity analysis applications, parameter studies can be used for investigating nonsmoothness in

simulation response variations (so that models can be reﬁned or ﬁnite difference step sizes can be selected for

48 CHAPTER 3. PARAMETER STUDY CAPABILITIES

computing numerical gradients), interrogating problem areas in the parameter space, or performing simulation

code veriﬁcation (verifying simulation robustness) through parameter ranges of interest. A parameter study can

also be used in coordination with minimization methods as either a pre-processor (to identify a good starting

point) or a post-processor (for post-optimality analysis).

Parameter study methods will iterate any combination of design, uncertain, and state variables deﬁned over con-

tinuous and discrete domains into any set of responses (any function, gradient, and Hessian deﬁnition). Parameter

studies draw no distinction among the different types of continuous variables (design, uncertain, or state) or

among the different types of response functions. They simply pass all of the variables deﬁned in the variables

speciﬁcation into the interface, from which they expect to retrieve all of the responses deﬁned in the responses

speciﬁcation. As described in Section 11.3, when gradient and/or Hessian information is being catalogued in the

parameter study, it is assumed that derivative components will be computed with respect to all of the continuous

variables (continuous design, continuous uncertain, and continuous state variables) speciﬁed, since derivatives

with respect to discrete variables are assumed to be undeﬁned. The speciﬁcation of initial values or bounds is

important for parameter studies.

3.1.1 Initial Values

The vector and centered parameter studies use the initial values of the variables from the variables keyword

block as the starting point and the central point of the parameter studies, respectively. In the case of design

variables, the initial point is used, and in the case of state variables, the initial state is used (see the

Dakota Reference Manual [3] for default values when initial point or initial state are unspeciﬁed).

In the case of uncertain variables, initial values are inferred from the distribution speciﬁcation: all uncertain

initial values are set to their means, where mean values for bounded normal and bounded lognormal are repaired

of needed to satisfy the speciﬁed distribution bounds, mean values for discrete integer range distributions are

rounded down to the nearest integer, and mean values for discrete set distributions are rounded to the nearest

set value. These parameter study starting values for design, uncertain, and state variables are referenced in the

following sections using the identiﬁer “Initial Values.”

3.1.2 Bounds

The multidimensional parameter study uses the bounds of the variables from the variables keyword block to

deﬁne the range of parameter values to study. In the case of design and state variables, the lower bounds

and upper bounds speciﬁcations are used (see the Dakota Reference Manual [3] for default values when

lower bounds or upper bounds are unspeciﬁed). In the case of uncertain variables, these values are either

drawn or inferred from the distribution speciﬁcation. Distribution lower and upper bounds can be drawn directly

from required bounds speciﬁcations for uniform, loguniform, triangular, and beta distributions, as well as from

optional bounds speciﬁcations for normal and lognormal. Distribution bounds are implicitly deﬁned for histogram

bin, histogram point, and interval variables (from the extreme values within the bin/point/interval speciﬁcations)

as well as for binomial (0 to num trials) and hypergeometric (0 to min(num drawn,num selected)) vari-

ables. Finally, distribution bounds are inferred for normal and lognormal if optional bounds are unspeciﬁed, as

well as for exponential, gamma, gumbel, frechet, weibull, poisson, negative binomial, and geometric (which have

no bounds speciﬁcations); these bounds are [0, µ+3σ] for exponential, gamma, frechet, weibull, poisson, negative

binomial, geometric, and unspeciﬁed lognormal, and [µ−3σ,µ+ 3σ] for gumbel and unspeciﬁed normal.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

3.2. VECTOR PARAMETER STUDY 49

3.2 Vector Parameter Study

The vector parameter study computes response data sets at selected intervals along an n-dimensional vector in

parameter space. This capability encompasses both single-coordinate parameter studies (to study the effect of

a single variable on a response set) as well as multiple coordinate vector studies (to investigate the response

variations along some arbitrary vector; e.g., to investigate a search direction failure).

Dakota’s vector parameter study includes two possible speciﬁcation formulations which are used in conjunction

with the Initial Values (see Section 3.1.1) to deﬁne the vector and steps of the parameter study:

final_point (vector of reals) and num_steps (integer)

step_vector (vector of reals) and num_steps (integer)

In both of these cases, the Initial Values are used as the parameter study starting point and the speciﬁcation

selection above deﬁnes the orientation of the vector and the increments to be evaluated along the vector. In

the former case, the vector from initial to ﬁnal point is partitioned by num steps, and in the latter case, the

step vector is added num steps times. In the case of discrete range variables, both final point and

step vector are speciﬁed in the actual values; and in the case of discrete sets (integer or real), final point

is speciﬁed in the actual values but step vector must instead specify index offsets for the (ordered, unique)

set. In all cases, the number of evaluations is num steps+1. Two examples are included below:

Three continuous parameters with initial values of (1.0, 1.0, 1.0), num steps = 4, and either final point =

(1.0, 2.0, 1.0) or step vector = (0, .25, 0):

Parameters for function evaluation 1:

1.0000000000e+00 c1

1.0000000000e+00 c2

1.0000000000e+00 c3

Parameters for function evaluation 2:

1.0000000000e+00 c1

1.2500000000e+00 c2

1.0000000000e+00 c3

Parameters for function evaluation 3:

1.0000000000e+00 c1

1.5000000000e+00 c2

1.0000000000e+00 c3

Parameters for function evaluation 4:

1.0000000000e+00 c1

1.7500000000e+00 c2

1.0000000000e+00 c3

Parameters for function evaluation 5:

1.0000000000e+00 c1

2.0000000000e+00 c2

1.0000000000e+00 c3

Two continuous parameters with initial values of (1.0, 1.0), one discrete range parameter with initial value of 5,

one discrete real set parameter with set values of (10., 12., 18., 30., 50.) and initial value of 10., num steps = 4,

and either final point = (2.0, 1.4, 13, 50.) or step vector = (.25, .1, 2, 1):

Parameters for function evaluation 1:

1.0000000000e+00 c1

1.0000000000e+00 c2

5 di1

Dakota Version 6.7 User’s Manual generated on November 13, 2017

50 CHAPTER 3. PARAMETER STUDY CAPABILITIES

1.0000000000e+01 dr1

Parameters for function evaluation 2:

1.2500000000e+00 c1

1.1000000000e+00 c2

7 di1

1.2000000000e+01 dr1

Parameters for function evaluation 3:

1.5000000000e+00 c1

1.2000000000e+00 c2

9 di1

1.8000000000e+01 dr1

Parameters for function evaluation 4:

1.7500000000e+00 c1

1.3000000000e+00 c2

11 di1

3.0000000000e+01 dr1

Parameters for function evaluation 5:

2.0000000000e+00 c1

1.4000000000e+00 c2

13 di1

5.0000000000e+01 dr1

An example using a vector parameter study is described in Section 3.7.

3.3 List Parameter Study

The list parameter study computes response data sets at selected points in parameter space. These points are

explicitly speciﬁed by the user and are not conﬁned to lie on any line or surface. Thus, this parameter study

provides a general facility that supports the case where the desired set of points to evaluate does not ﬁt the

prescribed structure of the vector, centered, or multidimensional parameter studies.

The user input consists of a list of points speciﬁcation which lists the requested parameter sets in succes-

sion. The list parameter study simply performs a simulation for the ﬁrst parameter set (the ﬁrst nentries in the

list), followed by a simulation for the next parameter set (the next nentries), and so on, until the list of points has

been exhausted. Since the Initial Values will not be used, they need not be speciﬁed. In the case of discrete range

or discrete set variables, list values are speciﬁed using the actual values (not set indices).

An example speciﬁcation that would result in the same parameter sets as in the second example in Section 3.2

would be:

list_of_points = 1.0 1.0 5 10.

1.25 1.1 7 12.

1.5 1.2 9 18.

1.75 1.3 11 30.

2.0 1.4 13 50.

For convenience, the points for evaluation in a list parameter study may instead be speciﬁed via the import points file

speciﬁcation, e.g., ’import points file listpstudy.dat’, where the ﬁle listpstudy.dat may

be in freeform or annotated format 12.1.1. The ordering of the points is in input speciﬁcation order, with both

active and inactive variables by default.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

3.4. CENTERED PARAMETER STUDY 51

3.4 Centered Parameter Study

The centered parameter study executes multiple coordinate-based parameter studies, one per parameter, centered

about the speciﬁed Initial Values. This is useful for investigation of function contours in the vicinity of a speciﬁc

point. For example, after computing an optimum design, this capability could be used for post-optimality analysis

in verifying that the computed solution is actually at a minimum or constraint boundary and in investigating the

shape of this minimum or constraint boundary.

This method requires step vector (list of reals) and steps per variable (list of integers) speciﬁcations,

where the former speciﬁes the size of the increments per variable (employed sequentially, not all at once as

for the vector study in Section 3.2) and the latter speciﬁes the number of increments per variable (employed

sequentially, not all at once) for each of the positive and negative step directions. As for the vector study described

in Section 3.2,step vector includes actual variable steps for continuous and discrete range variables, but

employs index offsets for discrete set variables (integer or real).

For example, with Initial Values of (1.0, 1.0), a step vector of (0.1, 0.1), and a steps per variable of

(2, 2), the center point is evaluated followed by four function evaluations (two negative deltas and two positive

deltas) per variable:

Parameters for function evaluation 1:

1.0000000000e+00 d1

1.0000000000e+00 d2

Parameters for function evaluation 2:

8.0000000000e-01 d1

1.0000000000e+00 d2

Parameters for function evaluation 3:

9.0000000000e-01 d1

1.0000000000e+00 d2

Parameters for function evaluation 4:

1.1000000000e+00 d1

1.0000000000e+00 d2

Parameters for function evaluation 5:

1.2000000000e+00 d1

1.0000000000e+00 d2

Parameters for function evaluation 6:

1.0000000000e+00 d1

8.0000000000e-01 d2

Parameters for function evaluation 7:

1.0000000000e+00 d1

9.0000000000e-01 d2

Parameters for function evaluation 8:

1.0000000000e+00 d1

1.1000000000e+00 d2

Parameters for function evaluation 9:

1.0000000000e+00 d1

1.2000000000e+00 d2

This set of points in parameter space is depicted in Figure 3.1.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

52 CHAPTER 3. PARAMETER STUDY CAPABILITIES

Figure 3.1: Example centered parameter study.

3.5 Multidimensional Parameter Study

The multidimensional parameter study computes response data sets for an n-dimensional hypergrid of points.

Each variable is partitioned into equally spaced intervals between its upper and lower bounds (see Section 3.1.2),

and each combination of the values deﬁned by these partitions is evaluated. As for the vector and centered studies

described in Sections 3.2 and 3.4, partitioning occurs using the actual variable values for continuous and discrete

range variables, but occurs within the space of valid indices for discrete set variables (integer or real). The number

of function evaluations performed in the study is:

n

Y

i=1

(partitionsi+ 1) (3.1)

The partitions information is speciﬁed using the partitions speciﬁcation, which provides an integer list of the

number of partitions for each variable (i.e., partitionsi). Since the Initial Values will not be used, they need

not be speciﬁed.

In a two variable example problem with d1 ∈[0,2] and d2 ∈[0,3] (as deﬁned by the upper and lower bounds

from the variables speciﬁcation) and with partitions = (2,3), the interval [0,2] is divided into two equal-sized

partitions and the interval [0,3] is divided into three equal-sized partitions. This two-dimensional grid, shown in

Figure 3.2, would result in the following twelve function evaluations:

Parameters for function evaluation 1:

0.0000000000e+00 d1

0.0000000000e+00 d2

Parameters for function evaluation 2:

1.0000000000e+00 d1

0.0000000000e+00 d2

Parameters for function evaluation 3:

2.0000000000e+00 d1

0.0000000000e+00 d2

Parameters for function evaluation 4:

0.0000000000e+00 d1

1.0000000000e+00 d2

Parameters for function evaluation 5:

Dakota Version 6.7 User’s Manual generated on November 13, 2017

3.5. MULTIDIMENSIONAL PARAMETER STUDY 53

Figure 3.2: Example multidimensional parameter study

1.0000000000e+00 d1

1.0000000000e+00 d2

Parameters for function evaluation 6:

2.0000000000e+00 d1

1.0000000000e+00 d2

Parameters for function evaluation 7:

0.0000000000e+00 d1

2.0000000000e+00 d2

Parameters for function evaluation 8:

1.0000000000e+00 d1

2.0000000000e+00 d2

Parameters for function evaluation 9:

2.0000000000e+00 d1

2.0000000000e+00 d2

Parameters for function evaluation 10:

0.0000000000e+00 d1

3.0000000000e+00 d2

Parameters for function evaluation 11:

1.0000000000e+00 d1

3.0000000000e+00 d2

Parameters for function evaluation 12:

2.0000000000e+00 d1

3.0000000000e+00 d2

The ﬁrst example shown in this User’s Manual is a multi-dimensional parameter study. See Section 2.3.2.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

54 CHAPTER 3. PARAMETER STUDY CAPABILITIES

3.6 Parameter Study Usage Guidelines

Parameter studies, classical design of experiments (DOE), design/analysis of computer experiments (DACE), and

sampling methods share the purpose of exploring the parameter space. Parameter Studies are recommended for

simple studies with deﬁned, repetitive structure. A local sensitivity analysis or an assessment of the smoothness

of a response function is best addressed with a vector or centered parameter study. A multi-dimensional parameter

study may be used to generate grid points for plotting response surfaces. For guidance on DACE and sampling

methods, in contrast to parameter studies, see Section 4.7 and especially Table 4.4, which clariﬁes the different

purposes of the method types.

3.7 Example: Vector Parameter Study with Rosenbrock

This section demonstrates a vector parameter study on the Rosenbrock test function described in Section 2.3.1.

An example of multidimensional parameter study is shown in Section 2.3.2.

A vector parameter study is a study between any two design points in an n-dimensional parameter space. An input

ﬁle for the vector parameter study is shown in Figure 3.3. The primary differences between this input ﬁle and the

input ﬁle for the multidimensional parameter study are found in the variables and method sections. In the variables

section, the keywords for the bounds are removed and replaced with the keyword initial point that speciﬁes

the starting point for the parameter study. In the method section, the vector parameter study keyword

is used. The final point keyword indicates the stopping point for the parameter study, and num steps

speciﬁes the number of steps taken between the initial and ﬁnal points in the parameter study.

Figure 3.4(a) shows the graphics output created by Dakota. For this study, the simple Dakota graphics are more

useful for visualizing the results. Figure 3.4(b) shows the locations of the 11 sample points generated in this study.

It is evident from these ﬁgures that the parameter study starts within the banana-shaped valley, marches up the

side of the hill, and then returns to the valley.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

3.7. EXAMPLE: VECTOR PARAMETER STUDY WITH ROSENBROCK 55

# Dakota Input File: rosen_ps_vector.in

environment

graphics

tabular_data

tabular_data_file = ’rosen_ps_vector.dat’

method

vector_parameter_study

final_point = 1.1 1.3

num_steps = 10

model

single

variables

continuous_design = 2

initial_point -0.3 0.2

descriptors ’x1’ "x2"

interface

analysis_drivers = ’rosenbrock’

direct

responses

objective_functions = 1

no_gradients

no_hessians

Figure 3.3: Rosenbrock vector parameter study example: the Dakota input ﬁle – see

Dakota/examples/users/rosen ps vector.in

Dakota Version 6.7 User’s Manual generated on November 13, 2017

56 CHAPTER 3. PARAMETER STUDY CAPABILITIES

(a)

(b)

Figure 3.4: Rosenbrock vector parameter study example: (a) screen capture of the Dakota graphics and (b)

location of the design points (dots) evaluated.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

Chapter 4

Design of Experiments Capabilities

4.1 Overview

Classical design of experiments (DoE) methods and the more modern design and analysis of computer experi-

ments (DACE) methods are both techniques which seek to extract as much trend data from a parameter space as

possible using a limited number of sample points. Classical DoE techniques arose from technical disciplines that

assumed some randomness and nonrepeatability in ﬁeld experiments (e.g., agricultural yield, experimental chem-

istry). DoE approaches such as central composite design, Box-Behnken design, and full and fractional factorial

design generally put sample points at the extremes of the parameter space, since these designs offer more reliable

trend extraction in the presence of nonrepeatability. DACE methods are distinguished from DoE methods in that

the nonrepeatability component can be omitted since computer simulations are involved. In these cases, space

ﬁlling designs such as orthogonal array sampling and Latin hypercube sampling are more commonly employed

in order to accurately extract trend information. Quasi-Monte Carlo sampling techniques which are constructed

to ﬁll the unit hypercube with good uniformity of coverage can also be used for DACE.

Dakota supports both DoE and DACE techniques. In common usage, only parameter bounds are used in selecting

the samples within the parameter space. Thus, DoE and DACE can be viewed as special cases of the more

general probabilistic sampling for uncertainty quantiﬁcation (see following section), in which the DoE/DACE

parameters are treated as having uniform probability distributions. The DoE/DACE techniques are commonly

used for investigation of global response trends, identiﬁcation of signiﬁcant parameters (e.g., main effects), and

as data generation methods for building response surface approximations.

Dakota includes several approaches sampling and design of experiments, all implemented in included third-party

software libraries. LHS (Latin hypercube sampling) [132] is a general-purpose sampling package developed at

Sandia that has been used by the DOE national labs for several decades. DDACE (distributed design and analysis

for computer experiments) is a more recent package for computer experiments developed at Sandia Labs [137].

DDACE provides the capability for generating orthogonal arrays, Box-Behnken designs, Central Composite de-

signs, and random designs. The FSUDace (Florida State University’s Design and Analysis of Computer Ex-

periments) package provides the following sampling techniques: quasi-Monte Carlo sampling based on Halton

or Hammersley sequences, and Centroidal Voronoi Tessellation. Lawrence Livermore National Lab’s PSUADE

(Problem Solving Environment for Uncertainty Analysis and Design Exploration) [136] includes several methods

for model exploration, but only the Morris screening method is exposed in Dakota.

This chapter describes DDACE, FSUDace, and PSUADE, with a focus on designing computer experiments. Latin

Hypercube Sampling, also used in uncertainty quantiﬁcation, is discussed in Section 5.2.

58 CHAPTER 4. DESIGN OF EXPERIMENTS CAPABILITIES

4.2 Design of Computer Experiments

What distinguishes design of computer experiments? Computer experiments are often different from physical

experiments, such as those performed in agriculture, manufacturing, or biology. In physical experiments, one

often applies the same treatment or factor level in an experiment several times to get an understanding of the

variability of the output when that treatment is applied. For example, in an agricultural experiment, several ﬁelds

(e.g., 8) may be subject to a low level of fertilizer and the same number of ﬁelds may be subject to a high level of

fertilizer to see if the amount of fertilizer has a signiﬁcant effect on crop output. In addition, one is often interested

in the variability of the output within a treatment group: is the variability of the crop yields in the low fertilizer

group much higher than that in the high fertilizer group, or not?

In physical experiments, the process we are trying to examine is stochastic: that is, the same treatment may result

in different outcomes. By contrast, in computer experiments, often we have a deterministic code. If we run the

code with a particular set of input parameters, the code will always produce the same output. There certainly

are stochastic codes, but the main focus of computer experimentation has been on deterministic codes. Thus, in

computer experiments we often do not have the need to do replicates (running the code with the exact same input

parameters several times to see differences in outputs). Instead, a major concern in computer experiments is to

create an experimental design which can sample a high-dimensional space in a representative way with a minimum

number of samples. The number of factors or parameters that we wish to explore in computer experiments is

usually much higher than physical experiments. In physical experiments, one may be interested in varying a few

parameters, usually ﬁve or less, while in computer experiments we often have dozens of parameters of interest.

Choosing the levels of these parameters so that the samples adequately explore the input space is a challenging

problem. There are many experimental designs and sampling methods which address the issue of adequate and

representative sample selection.

There are many goals of running a computer experiment: one may want to explore the input domain or the

design space and get a better understanding of the range in the outputs for a particular domain. Another objective

is to determine which inputs have the most inﬂuence on the output, or how changes in the inputs change the

output. This is usually called sensitivity analysis. Another goal is to use the sampled input points and their

corresponding output to create a response surface approximation for the computer code. The response surface

approximation (e.g., a polynomial regression model, a Gaussian-process/Kriging model, a neural net) can then

be used to emulate the computer code. Constructing a response surface approximation is particularly important

for applications where running a computational model is extremely expensive: the computer model may take 10

or 20 hours to run on a high performance machine, whereas the response surface model may only take a few

seconds. Thus, one often optimizes the response surface model or uses it within a framework such as surrogate-

based optimization. Response surface models are also valuable in cases where the gradient (ﬁrst derivative) and/or

Hessian (second derivative) information required by optimization techniques are either not available, expensive

to compute, or inaccurate because the derivatives are poorly approximated or the function evaluation is itself

noisy due to roundoff errors. Furthermore, many optimization methods require a good initial point to ensure

fast convergence or to converge to good solutions (e.g. for problems with multiple local minima). Under these

circumstances, a good design of computer experiment framework coupled with response surface approximations

can offer great advantages.

In addition to the sensitivity analysis and response surface modeling mentioned above, we also may want to do

uncertainty quantiﬁcation on a computer model. Uncertainty quantiﬁcation (UQ) refers to taking a particular set

of distributions on the inputs, and propagating them through the model to obtain a distribution on the outputs. For

example, if input parameter A follows a normal with mean 5 and variance 1, the computer produces a random

draw from that distribution. If input parameter B follows a weibull distribution with alpha = 0.5 and beta = 1, the

computer produces a random draw from that distribution. When all of the uncertain variables have samples drawn

from their input distributions, we run the model with the sampled values as inputs. We do this repeatedly to build

up a distribution of outputs. We can then use the cumulative distribution function of the output to ask questions

Dakota Version 6.7 User’s Manual generated on November 13, 2017

4.3. DDACE 59

such as: what is the probability that the output is greater than 10? What is the 99th percentile of the output?

Note that sampling-based uncertainty quantiﬁcation and design of computer experiments are very similar. There

is signiﬁcant overlap in the purpose and methods used for UQ and for DACE. We have attempted to delineate

the differences within Dakota as follows: we use the methods DDACE, FSUDACE, and PSUADE primarily for

design of experiments, where we are interested in understanding the main effects of parameters and where we

want to sample over an input domain to obtain values for constructing a response surface. We use the nondeter-

ministic sampling methods (sampling) for uncertainty quantiﬁcation, where we are propagating speciﬁc input

distributions and interested in obtaining (for example) a cumulative distribution function on the output. If one

has a problem with no distributional information, we recommend starting with a design of experiments approach.

Note that DDACE, FSUDACE, and PSUADE currently do not support distributional information: they take an

upper and lower bound for each uncertain input variable and sample within that. The uncertainty quantiﬁcation

methods in sampling (primarily Latin Hypercube sampling) offer the capability to sample from many distri-

butional types. The distinction between UQ and DACE is somewhat arbitrary: both approaches often can yield

insight about important parameters and both can determine sample points for response surface approximations.

Three software packages are available in Dakota for design of computer experiments, DDACE (developed at

Sandia Labs), FSUDACE (developed at Florida State University), and PSUADE (LLNL).

4.3 DDACE

The Distributed Design and Analysis of Computer Experiments (DDACE) package includes both classical design

of experiments methods [137] and stochastic sampling methods. The classical design of experiments methods

in DDACE are central composite design (CCD) and Box-Behnken (BB) sampling. A grid-based sampling (full-

factorial) method is also available. The stochastic methods are orthogonal array sampling [89] (which permits

main effects calculations), Monte Carlo (random) sampling, Latin hypercube sampling, and orthogonal array-

Latin hypercube sampling. While DDACE LHS supports variables with normal or uniform distributions, only

uniform are supported through Dakota. Also DDACE does not allow enforcement of user-speciﬁed correlation

structure among the variables.

The sampling methods in DDACE can be used alone or in conjunction with other methods. For example, DDACE

sampling can be used with both surrogate-based optimization and optimization under uncertainty advanced meth-

ods. See Figure 15.5 for an example of how the DDACE settings are used in Dakota.

The following sections provide more detail about the sampling methods available for design of experiments in

DDACE.

4.3.1 Central Composite Design

A Box-Wilson Central Composite Design, commonly called a central composite design (CCD), contains an em-

bedded factorial or fractional factorial design with center points that is augmented with a group of ’star points’

that allow estimation of curvature. If the distance from the center of the design space to a factorial point is ±1 unit

for each factor, the distance from the center of the design space to a star point is ±αwith |α|>1. The precise

value of αdepends on certain properties desired for the design and on the number of factors involved. The CCD

design is speciﬁed in Dakota with the method command dace central composite.

As an example, with two input variables or factors, each having two levels, the factorial design is shown in

Table 4.1 .

With a CCD, the design in Table 4.1 would be augmented with the following points shown in Table 4.2 if α= 1.3.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

60 CHAPTER 4. DESIGN OF EXPERIMENTS CAPABILITIES

Table 4.1: Simple Factorial Design

Input 1 Input 2

-1 -1

-1 +1

+1 -1

+1 +1

These points deﬁne a circle around the original factorial design.

Table 4.2: Additional Points to make the factorial design a CCD

Input 1 Input 2

0 +1.3

0 -1.3

1.3 0

-1.3 0

0 0

Note that the number of sample points speciﬁed in a CCD,samples, is a function of the number of variables in

the problem:

samples = 1 + 2 ∗N umV ar + 2NumV ar

4.3.2 Box-Behnken Design

The Box-Behnken design is similar to a Central Composite design, with some differences. The Box-Behnken

design is a quadratic design in that it does not contain an embedded factorial or fractional factorial design. In this

design the treatment combinations are at the midpoints of edges of the process space and at the center, as compared

with CCD designs where the extra points are placed at ’star points’ on a circle outside of the process space. Box-

Behken designs are rotatable (or near rotatable) and require 3 levels of each factor. The designs have limited

capability for orthogonal blocking compared to the central composite designs. Box-Behnken requires fewer runs

than CCD for 3 factors, but this advantage goes away as the number of factors increases. The Box-Behnken design

is speciﬁed in Dakota with the method command dace box behnken.

Note that the number of sample points speciﬁed in a Box-Behnken design, samples, is a function of the number

of variables in the problem:

samples = 1 + 4 ∗N umV ar + (NumV ar −1)/2

Dakota Version 6.7 User’s Manual generated on November 13, 2017

4.3. DDACE 61

4.3.3 Orthogonal Array Designs

Orthogonal array (OA) sampling is a widely used technique for running experiments and systematically testing

factor effects [75]. An orthogonal array sample can be described as a 4-tuple (m, n, s, r), where mis the number

of sample points, nis the number of input variables, sis the number of symbols, and ris the strength of the

orthogonal array. The number of sample points, m, must be a multiple of the number of symbols, s. The number

of symbols refers to the number of levels per input variable. The strength refers to the number of columns where

we are guaranteed to see all the possibilities an equal number of times.

For example, Table 4.3 shows an orthogonal array of strength 2 for m= 8, with 7 variables:

Table 4.3: Orthogonal Array for Seven Variables

Input 1 Input 2 Input 3 Input 4 Input 5 Input 6 Input 7

0000000

0001111

0110011

0111100

1010101

1011010

1100110

1101001

If one picks any two columns, say the ﬁrst and the third, note that each of the four possible rows we might see

there, 0 0, 0 1, 1 0, 1 1, appears exactly the same number of times, twice in this case.

DDACE creates orthogonal arrays of strength 2. Further, the OAs generated by DDACE do not treat the factor

levels as one ﬁxed value (0 or 1 in the above example). Instead, once a level for a variable is determined in the

array, DDACE samples a random variable from within that level. The orthogonal array design is speciﬁed in

Dakota with the method command dace oas.

The orthogonal array method in DDACE is the only method that allows for the calculation of main effects, speci-

ﬁed with the command main effects. Main effects is a sensitivity analysis method which identiﬁes the input

variables that have the most inﬂuence on the output. In main effects, the idea is to look at the mean of the response

function when variable A (for example) is at level 1 vs. when variable A is at level 2 or level 3. If these mean

responses of the output are statistically signiﬁcantly different at different levels of variable A, this is an indication

that variable A has a signiﬁcant effect on the response. The orthogonality of the columns is critical in performing

main effects analysis, since the column orthogonality means that the effects of the other variables ’cancel out’

when looking at the overall effect from one variable at its different levels. There are ways of developing orthog-

onal arrays to calculate higher order interactions, such as two-way interactions (what is the inﬂuence of Variable

A * Variable B on the output?), but this is not available in DDACE currently. At present, one way interactions

are supported in the calculation of orthogonal array main effects within DDACE. The main effects are presented

as a series of ANOVA tables. For each objective function and constraint, the decomposition of variance of that

objective or constraint is presented as a function of the input variables. The p-value in the ANOVA table is used

to indicate if the input factor is signiﬁcant. The p-value is the probability that you would have obtained samples

more extreme than you did if the input factor has no effect on the response. For example, if you set a level of

signiﬁcance at 0.05 for your p-value, and the actual p-value is 0.03, then the input factor has a signiﬁcant effect

on the response.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

62 CHAPTER 4. DESIGN OF EXPERIMENTS CAPABILITIES

4.3.4 Grid Design

In a grid design, a grid is placed over the input variable space. This is very similar to a multi-dimensional

parameter study where the samples are taken over a set of partitions on each variable (see Section 3.5). The

main difference is that in grid sampling, a small random perturbation is added to each sample value so that the

grid points are not on a perfect grid. This is done to help capture certain features in the output such as periodic

functions. A purely structured grid, with the samples exactly on the grid points, has the disadvantage of not being

able to capture important features such as periodic functions with relatively high frequency (due to aliasing).

Adding a random perturbation to the grid samples helps remedy this problem.

Another disadvantage with grid sampling is that the number of sample points required depends exponentially on

the input dimensions. In grid sampling, the number of samples is the number of symbols (grid partitions) raised

to the number of variables. For example, if there are 2 variables, each with 5 partitions, the number of samples

would be 52. In this case, doubling the number of variables squares the sample size. The grid design is speciﬁed

in Dakota with the method command dace grid.

4.3.5 Monte Carlo Design

Monte Carlo designs simply involve pure Monte-Carlo random sampling from uniform distributions between the

lower and upper bounds on each of the input variables. Monte Carlo designs, speciﬁed by dace random, are a

way to generate a set of random samples over an input domain.

4.3.6 LHS Design

DDACE offers the capability to generate Latin Hypercube designs. For more information on Latin Hypercube

sampling, see Section 5.2. Note that the version of LHS in DDACE generates uniform samples (uniform between

the variable bounds). The version of LHS offered with nondeterministic sampling can generate LHS samples

according to a number of distribution types, including normal, lognormal, weibull, beta, etc. To specify the

DDACE version of LHS, use the method command dace lhs.

4.3.7 OA-LHS Design

DDACE offers a hybrid design which is combination of an orthogonal array and a Latin Hypercube sample.

This design is speciﬁed with the method command dace oa lhs. This design has the advantages of both

orthogonality of the inputs as well as stratiﬁcation of the samples (see [111]).

4.4 FSUDace

The Florida State University Design and Analysis of Computer Experiments (FSUDace) package provides quasi-

Monte Carlo sampling (Halton and Hammersley) and Centroidal Voronoi Tessellation (CVT) methods. All three

methods natively generate sets of uniform random variables on the interval [0,1] (or in Dakota, on user-speciﬁed

uniform intervals).

The quasi-Monte Carlo and CVT methods are designed with the goal of low discrepancy. Discrepancy refers to the

nonuniformity of the sample points within the unit hypercube. Low discrepancy sequences tend to cover the unit

hypercube reasonably uniformly. Quasi-Monte Carlo methods produce low discrepancy sequences, especially if

Dakota Version 6.7 User’s Manual generated on November 13, 2017

4.5. PSUADE MOAT 63

one is interested in the uniformity of projections of the point sets onto lower dimensional faces of the hypercube

(usually 1-D: how well do the marginal distributions approximate a uniform?) CVT does very well volumetrically:

it spaces the points fairly equally throughout the space, so that the points cover the region and are isotropically

distributed with no directional bias in the point placement. There are various measures of volumetric uniformity

which take into account the distances between pairs of points, regularity measures, etc. Note that CVT does not

produce low-discrepancy sequences in lower dimensions, however: the lower-dimension (such as 1-D) projections

of CVT can have high discrepancy.

The quasi-Monte Carlo sequences of Halton and Hammersley are deterministic sequences determined by a set of

prime bases. A Halton design is speciﬁed in Dakota with the method command fsu quasi mc halton, and

the Hammersley design is speciﬁed with the command fsu quasi mc hammersley. For more details about

the input speciﬁcation, see the Reference Manual. CVT points tend to arrange themselves in a pattern of cells that

are roughly the same shape. To produce CVT points, an almost arbitrary set of initial points is chosen, and then

an internal set of iterations is carried out. These iterations repeatedly replace the current set of sample points by

an estimate of the centroids of the corresponding Voronoi subregions [30]. A CVT design is speciﬁed in Dakota

with the method command fsu cvt.

The methods in FSUDace are useful for design of experiments because they provide good coverage of the input

space, thus allowing global sensitivity analysis.

4.5 PSUADE MOAT

PSUADE (Problem Solving Environment for Uncertainty Analysis and Design Exploration) is a Lawrence Liv-

ermore National Laboratory tool for metamodeling, sensitivity analysis, uncertainty quantiﬁcation, and optimiza-

tion. Its features include non-intrusive and parallel function evaluations, sampling and analysis methods, an

integrated design and analysis framework, global optimization, numerical integration, response surfaces (MARS

and higher order regressions), graphical output with Pgplot or Matlab, and fault tolerance [136]. Dakota includes a

prototype interface to its Morris One-At-A-Time (MOAT) screening method, a valuable tool for global sensitivity

(including interaction) analysis.

The Morris One-At-A-Time method, originally proposed by M. D. Morris [103], is a screening method, designed

to explore a computational model to distinguish between input variables that have negligible, linear and additive,

or nonlinear or interaction effects on the output. The computer experiments performed consist of individually

randomized designs which vary one input factor at a time to create a sample of its elementary effects.

With MOAT, each dimension of a k−dimensional input space is uniformly partitioned into plevels, creating a grid

of pkpoints x∈Rkat which evaluations of the model y(x)might take place. An elementary effect corresponding

to input iis computed by a forward difference

di(x) = y(x+ ∆ei)−y(x)

∆,(4.1)

where eiis the ith coordinate vector, and the step ∆is typically taken to be large (this is not intended to be a

local derivative approximation). In the present implementation of MOAT, for an input variable scaled to [0,1],

∆ = p

2(p−1) , so the step used to ﬁnd elementary effects is slightly larger than half the input range.

The distribution of elementary effects diover the input space characterizes the effect of input ion the output of

interest. After generating rsamples from this distribution, their mean,

µi=1

r

r

X

j=1

d(j)

i,(4.2)

Dakota Version 6.7 User’s Manual generated on November 13, 2017

64 CHAPTER 4. DESIGN OF EXPERIMENTS CAPABILITIES

modiﬁed mean

µ∗

i=1

r

r

X

j=1 |d(j)

i|,(4.3)

(using absolute value) and standard deviation

σi=v

u

u

t

1

r

r

X

j=1 d(j)

i−µi2

(4.4)

are computed for each input i. The mean and modiﬁed mean give an indication of the overall effect of an input

on the output. Standard deviation indicates nonlinear effects or interactions, since it is an indicator of elementary

effects varying throughout the input space.

The MOAT method is selected with method keyword psuade moat as shown in the sample Dakota input deck in

Figure 4.1. The number of samples (samples) must be a positive integer multiple of (number of continuous de-

sign variables k+ 1) and will be automatically adjusted if misspeciﬁed. The number of partitions (partitions)

applies to each variable being studied and must be odd (the number of MOAT levels pper variable is partitions

+ 1, similar to Dakota multidimensional parameter studies). This will also be adjusted at runtime as necessary.

Finite user-speciﬁed lower and upper bounds are required and will be scaled as needed by the method. For more

information on use of MOAT sampling, see the Morris example in Section 20.9, or Saltelli, et al. [116].

4.6 Sensitivity Analysis

4.6.1 Sensitivity Analysis Overview

In many engineering design applications, sensitivity analysis techniques and parameter study methods are useful in

identifying which of the design parameters have the most inﬂuence on the response quantities. This information is

helpful prior to an optimization study as it can be used to remove design parameters that do not strongly inﬂuence

the responses. In addition, these techniques can provide assessments as to the behavior of the response functions

(smooth or nonsmooth, unimodal or multimodal) which can be invaluable in algorithm selection for optimization,

uncertainty quantiﬁcation, and related methods. In a post-optimization role, sensitivity information is useful is

determining whether or not the response functions are robust with respect to small changes in the optimum design

point.

In some instances, the term sensitivity analysis is used in a local sense to denote the computation of response

derivatives at a point. These derivatives are then used in a simple analysis to make design decisions. Dakota

supports this type of study through numerical ﬁnite-differencing or retrieval of analytic gradients computed within

the analysis code. The desired gradient data is speciﬁed in the responses section of the Dakota input ﬁle and

the collection of this data at a single point is accomplished through a parameter study method with no steps.

This approach to sensitivity analysis should be distinguished from the activity of augmenting analysis codes to

internally compute derivatives using techniques such as direct or adjoint differentiation, automatic differentiation

(e.g., ADIFOR), or complex step modiﬁcations. These sensitivity augmentation activities are completely separate

from Dakota and are outside the scope of this manual. However, once completed, Dakota can utilize these analytic

gradients to perform optimization, uncertainty quantiﬁcation, and related studies more reliably and efﬁciently.

In other instances, the term sensitivity analysis is used in a more global sense to denote the investigation of

variability in the response functions. Dakota supports this type of study through computation of response data

sets (typically function values only, but all data sets are supported) at a series of points in the parameter space.

The series of points is deﬁned using either a vector, list, centered, or multidimensional parameter study method.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

4.6. SENSITIVITY ANALYSIS 65

# Dakota Input File: morris_ps_moat.in

environment

tabular_data

tabular_data_file ’dakota_psuade.0.dat’

method

psuade_moat

samples = 84

partitions = 3

seed = 500

variables

continuous_design = 20

lower_bounds = 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0

upper_bounds = 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0

1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0 1.0

interface

analysis_drivers = ’morris’

fork

asynchronous evaluation_concurrency = 5

responses

objective_functions = 1

no_gradients

no_hessians

Figure 4.1: Dakota input ﬁle showing the Morris One-at-a-Time method – see

Dakota/examples/users/morris ps moat.in

Dakota Version 6.7 User’s Manual generated on November 13, 2017

66 CHAPTER 4. DESIGN OF EXPERIMENTS CAPABILITIES

For example, a set of closely-spaced points in a vector parameter study could be used to assess the smoothness of

the response functions in order to select a ﬁnite difference step size, and a set of more widely-spaced points in a

centered or multidimensional parameter study could be used to determine whether the response function variation

is likely to be unimodal or multimodal. See Chapter 3for additional information on these methods. These more

global approaches to sensitivity analysis can be used to obtain trend data even in situations when gradients are

unavailable or unreliable, and they are conceptually similar to the design of experiments methods and sampling

approaches to uncertainty quantiﬁcation described in the following sections.

4.6.2 Assessing Sensitivity with DACE

Like parameter studies (see Chapter 3), the DACE techniques are useful for characterizing the behavior of the

response functions of interest through the parameter ranges of interest. In addition to direct interrogation and

visualization of the sampling results, a number of techniques have been developed for assessing the parameters

which are most inﬂuential in the observed variability in the response functions. One example of this is the

well-known technique of scatter plots, in which the set of samples is projected down and plotted against one

parameter dimension, for each parameter in turn. Scatter plots with a uniformly distributed cloud of points indicate

parameters with little inﬂuence on the results, whereas scatter plots with a deﬁned shape to the cloud indicate

parameters which are more signiﬁcant. Related techniques include analysis of variance (ANOVA) [104] and

main effects analysis, in which the parameters which have the greatest inﬂuence on the results are identiﬁed

from sampling results. Scatter plots and ANOVA may be accessed through import of Dakota tabular results (see

Section 13.3) into external statistical analysis programs such as S-plus, Minitab, etc.

Running any of the design of experiments or sampling methods allows the user to save the results in a tabular

data ﬁle, which then can be read into a spreadsheet or statistical package for further analysis. In addition, we have

provided some functions to help determine the most important variables.

We take the deﬁnition of uncertainty analysis from [116]: “The study of how uncertainty in the output of a model

can be apportioned to different sources of uncertainty in the model input.”

As a default, Dakota provides correlation analyses when running LHS. Correlation tables are printed with the

simple, partial, and rank correlations between inputs and outputs. These can be useful to get a quick sense of how

correlated the inputs are to each other, and how correlated various outputs are to inputs. The correlation analyses

are explained further in Chapter 5.2.

We also have the capability to calculate sensitivity indices through Variance-based Decomposition (VBD). Variance-

based decomposition is a global sensitivity method that summarizes how the uncertainty in model output can be

apportioned to uncertainty in individual input variables. VBD uses two primary measures, the main effect sen-

sitivity index Siand the total effect index Ti. The main effect sensitivity index corresponds to the fraction of

the uncertainty in the output, Y, that can be attributed to input xialone. The total effects index corresponds to

the fraction of the uncertainty in the output, Y, that can be attributed to input xiand its interactions with other

variables. The main effect sensitivity index compares the variance of the conditional expectation V arxi[E(Y|xi)]

against the total variance V ar(Y). Formulas for the indices are:

Si=V arxi[E(Y|xi)]

V ar(Y)(4.5)

and

Ti=E(V ar(Y|x−i))

V ar(Y)=V ar(Y)−V ar(E[Y|x−i])

V ar(Y)(4.6)

where Y=f(x)and x−i= (x1, ..., xi−1, xi+1, ..., xm).

Dakota Version 6.7 User’s Manual generated on November 13, 2017

4.7. DOE USAGE GUIDELINES 67

The calculation of Siand Tirequires the evaluation of m-dimensional integrals which are typically approximated

by Monte-Carlo sampling. More details on the calculations and interpretation of the sensitivity indices can be

found in [116]. In Dakota version 5.1, we have improved calculations for the calculation of the Siand Tiindices

when using sampling. The implementation details of these calculatiosn are provided in [148]. VBD can be

speciﬁed for any of the sampling or DACE methods using the command variance based decomposition.

Note that VBD is extremely computationally intensive when using sampling since replicated sets of sample values

are evaluated. If the user speciﬁed a number of samples, N, and a number of nondeterministic variables, M,

variance-based decomposition requires the evaluation of N(M+ 2) samples. To obtain sensitivity indices that

are reasonably accurate, we recommend that N, the number of samples, be at least one hundred and preferably

several hundred or thousands. Because of the computational cost, variance-based decomposition is turned off

as a default for sampling or DACE. Another alternative, however, is to obtain these indices using one of the

stochastic expansion methods described in Section 5.4. The calculation of the indices using expansion methods is

much more efﬁcient since the VBD indices are analytic functions of the coefﬁcients in the stochastic expansion.

The paper by Weirs et al. [148] compares different methods for calculating the sensitivity indices for nonlinear

problems with signiﬁcant interaction effects.

In terms of interpretation of the sensitivity indices, a larger value of the sensitivity index, Si, means that the

uncertainty in the input variable ihas a larger effect on the variance of the output. Note that the sum of the main

effect indices will be less than or equal to one. If the sum of the main effect indices is much less than one, it

indicates that there are signiﬁcant two-way, three-way, or higher order interactions that contribute signiﬁcantly to

the variance. There is no requirement that the sum of the total effect indices is one: in most cases, the sum of the

total effect indices will be greater than one. An example of the Main and Total effects indices as calculated by

Dakota using sampling is shown in Figure 4.2

Global sensitivity indices for each response function:

response_fn_1 Sobol indices:

Main Total

4.7508913283e-01 5.3242162037e-01 uuv_1

3.8112392892e-01 4.9912486515e-01 uuv_2

Figure 4.2: Dakota output for Variance-based Decomposition

Finally, we have the capability to calculate a set of quality metrics for a particular input sample. These quality

metrics measure various aspects relating to the volumetric spacing of the samples: are the points equally spaced,

do they cover the region, are they isotropically distributed, do they have directional bias, etc.? The quality metrics

are explained in more detail in the Reference Manual.

4.7 DOE Usage Guidelines

Parameter studies, classical design of experiments (DOE), design/analysis of computer experiments (DACE), and

sampling methods share the purpose of exploring the parameter space. When a global space-ﬁlling set of samples

is desired, then the DOE, DACE, and sampling methods are recommended. These techniques are useful for scatter

plot and variance analysis as well as surrogate model construction.

The distinction between DOE and DACE methods is that the former are intended for physical experiments con-

taining an element of nonrepeatability (and therefore tend to place samples at the extreme parameter vertices),

whereas the latter are intended for repeatable computer experiments and are more space-ﬁlling in nature.

The distinction between DOE/DACE and sampling is drawn based on the distributions of the parameters. DOE/-

Dakota Version 6.7 User’s Manual generated on November 13, 2017

68 CHAPTER 4. DESIGN OF EXPERIMENTS CAPABILITIES

DACE methods typically assume uniform distributions, whereas the sampling approaches in Dakota support a

broad range of probability distributions.

To use sampling in design of experiments mode (as opposed to uncertainty quantiﬁcation mode), an active view

override (e.g., active all) can be included in the variables speciﬁcation (see Section 9.5.1) of the Dakota input

ﬁle.

Design of experiments method selection recommendations are summarized in Table 4.4.

Table 4.4: Guidelines for selection of parameter study, DOE, DACE, and sampling methods.

Method Applications Applicable Methods

Classiﬁcation

parameter study sensitivity analysis, centered parameter study,

directed parameter space investigations list parameter study,

multidim parameter study,

vector parameter study

classical design physical experiments dace (box behnken,

of experiments (parameters are uniformly distributed) central composite)

design of computer variance analysis, dace (grid, random, oas, lhs, oa lhs),

experiments space ﬁlling designs fsu quasi mc (halton, hammersley),

(parameters are uniformly distributed) fsu cvt, psuade moat

sampling space ﬁlling designs sampling (Monte Carlo or LHS)

(parameters have general probability distributions) with optional active view override

Dakota Version 6.7 User’s Manual generated on November 13, 2017

Chapter 5

Uncertainty Quantiﬁcation Capabilities

5.1 Overview

At a high level, uncertainty quantiﬁcation (UQ) or nondeterministic analysis is the process of characterizing input

uncertainties, forward propagating these uncertainties through a computational model, and performing statisti-

cal or interval assessments on the resulting responses. This process determines the effect of uncertainties and

assumptions on model outputs or results. In Dakota, uncertainty quantiﬁcation methods speciﬁcally focus on

the forward propagation part of the process, where probabilistic or interval information on parametric inputs are

mapped through the computational model to assess statistics or intervals on outputs. For an overview of these

approaches for engineering applications, consult [73].

UQ is related to sensitivity analysis in that the common goal is to gain an understanding of how variations in

the parameters affect the response functions of the engineering design problem. However, for UQ, some or all

of the components of the parameter vector, are considered to be uncertain as speciﬁed by particular probability

distributions (e.g., normal, exponential, extreme value), or other uncertainty structures. By assigning speciﬁc

distributional structure to the inputs, distributional structure for the outputs (i.e, response statistics) can be inferred.

This migrates from an analysis that is more qualitative in nature, in the case of sensitivity analysis, to an analysis

that is more rigorously quantitative.

UQ methods are often distinguished by their ability to propagate aleatory or epistemic input uncertainty charac-

terizations, where aleatory uncertainties are irreducible variabilities inherent in nature and epistemic uncertainties

are reducible uncertainties resulting from a lack of knowledge. Since sufﬁcient data is generally available for

aleatory uncertainties, probabilistic methods are commonly used for computing response distribution statistics

based on input probability distribution speciﬁcations. Conversely, for epistemic uncertainties, any use of proba-

bility distributions is based on subjective knowledge rather than objective data, and we may alternatively explore

nonprobabilistic methods based on interval speciﬁcations.

5.1.1 Summary of Dakota UQ Methods

Dakota contains capabilities for performing nondeterministic analysis with both types of input uncertainty. These

UQ methods have been developed by Sandia Labs, in conjunction with collaborators in academia [53,54,35,134].

The aleatory UQ methods in Dakota include various sampling-based approaches (e.g., Monte Carlo and Latin Hy-

percube sampling), local and global reliability methods, and stochastic expansion (polynomial chaos expansions

70 CHAPTER 5. UNCERTAINTY QUANTIFICATION CAPABILITIES

and stochastic collocation) approaches. The epistemic UQ methods include local and global interval analysis and

Dempster-Shafer evidence theory. These are summarized below and then described in more depth in subsequent

sections of this chapter. Dakota additionally supports mixed aleatory/epistemic UQ via interval-valued probabil-

ity, second-order probability, and Dempster-Shafer theory of evidence. These involve advanced model recursions

and are described in Section 15.1.

LHS (Latin Hypercube Sampling): This package provides both Monte Carlo (random) sampling and Latin

Hypercube sampling methods, which can be used with probabilistic variables in Dakota that have the following

distributions: normal, lognormal, uniform, loguniform, triangular, exponential, beta, gamma, gumbel, frechet,

weibull, poisson, binomial, negative binomial, geometric, hypergeometric, and user-supplied histograms. In addi-

tion, LHS accounts for correlations among the variables [82], which can be used to accommodate a user-supplied

correlation matrix or to minimize correlation when a correlation matrix is not supplied. In addition to a standard

sampling study, we support the capability to perform “incremental” LHS, where a user can specify an initial LHS

study of N samples, and then re-run an additional incremental study which will double the number of samples (to

2N, with the ﬁrst N being carried from the initial study). The full incremental sample of size 2N is also a Latin

Hypercube, with proper stratiﬁcation and correlation. Statistics for each increment are reported separately at the

end of the study.

Reliability Methods: This suite of methods includes both local and global reliability methods. Local methods

include ﬁrst- and second-order versions of the Mean Value method (MVFOSM and MVSOSM) and a variety of

most probable point (MPP) search methods, including the Advanced Mean Value method (AMV and AMV2),

the iterated Advanced Mean Value method (AMV+ and AMV2+), the Two-point Adaptive Nonlinearity Approx-

imation method (TANA-3), and the traditional First Order and Second Order Reliability Methods (FORM and

SORM) [73]. The MPP search methods may be used in forward (Reliability Index Approach (RIA)) or inverse

(Performance Measure Approach (PMA)) modes, as dictated by the type of level mappings. Each of the MPP

search techniques solve local optimization problems in order to locate the MPP, which is then used as the point

about which approximate probabilities are integrated (using ﬁrst- or second-order integrations in combination

with reﬁnements based on importance sampling). Global reliability methods are designed to handle nonsmooth

and multimodal failure surfaces, by creating global approximations based on Gaussian process models. They

accurately resolve a particular contour of a response function and then estimate probabilities using multimodal

adaptive importance sampling.

Stochastic Expansion Methods: The development of these techniques mirrors that of deterministic ﬁnite element

analysis utilizing the notions of projection, orthogonality, and weak convergence [53], [54]. Rather than estimat-

ing point probabilities, they form an approximation to the functional relationship between response functions and

their random inputs, which provides a more complete uncertainty representation for use in multi-code simula-

tions. Expansion methods include polynomial chaos expansions (PCE), which employ multivariate orthogonal

polynomials that are tailored to representing particular input probability distributions, and stochastic collocation

(SC), which employs multivariate interpolation polynomials. For PCE, expansion coefﬁcients may be evalu-

ated using a spectral projection approach (based on sampling, tensor-product quadrature, Smolyak sparse grid,

or cubature methods for numerical integration) or a regression approach (least squares or compressive sensing).

For SC, interpolants are formed over tensor-product or sparse grids and may be local or global, value-based or

gradient-enhanced, and nodal or hierarchical. In global value-based cases (Lagrange polynomials), the barycentric

formulation is used [11,88,79] to improve numerical efﬁciency and stability. Both sets of methods provide ana-

lytic response moments and variance-based metrics; however, CDF/CCDF probabilities are evaluated numerically

by sampling on the expansion.

Importance Sampling: Importance sampling is a method that allows one to estimate statistical quantities such

as failure probabilities in a way that is more efﬁcient than Monte Carlo sampling. The core idea in importance

sampling is that one generates samples that are preferentially placed in important regions of the space (e.g. in

or near the failure region or user-deﬁned region of interest), then appropriately weights the samples to obtain an

Dakota Version 6.7 User’s Manual generated on November 13, 2017

5.1. OVERVIEW 71

unbiased estimate of the failure probability.

Adaptive Sampling: The goal in performing adaptive sampling is to construct a surrogate model that can be used

as an accurate predictor of an expensive simulation. The aim is to build a surrogate that minimizes the error over

the entire domain of interest using as little data as possible from the expensive simulation. The adaptive sampling

methods start with an initial LHS sample, and then adaptively choose samples that optimize a particular criteria.

For example, if a set of additional possible sample points are generated, one criteria is to pick the next sample

point as the point which maximizes the minimum distance to the existing points (maximin). Another criteria is to

pick the sample point where the surrogate indicates the most uncertainty in its prediction.

Recently, Dakota added a new method to assess failure probabilities based on ideas from computational geometry.

Part of the idea underpinning this method is the idea of throwing “darts” which are higher dimensional objects

than sample points (e.g. lines, planes, etc.) The POF (Probability-of-Failure) darts method uses these objects to

estimate failure probabilities.

Interval Analysis: Interval analysis is often used to model epistemic uncertainty. In interval analysis, one assumes

that nothing is known about an epistemic uncertain variable except that its value lies somewhere within an interval.

In this situation, it is NOT assumed that the value has a uniform probability of occurring within the interval.

Instead, the interpretation is that any value within the interval is a possible value or a potential realization of that

variable. In interval analysis, the uncertainty quantiﬁcation problem is one of determining the resulting bounds

on the output (deﬁning the output interval) given interval bounds on the inputs. Again, any output response that

falls within the output interval is a possible output with no frequency information assigned to it.

We have the capability to perform interval analysis using either global or local methods. In the global approach,

one uses either a global optimization method (based on a Gaussian process surrogate model) or a sampling method

to assess the bounds. The local method uses gradient information in a derivative-based optimization approach,

using either SQP (sequential quadratic programming) or a NIP (nonlinear interior point) method to obtain bounds.

Dempster-Shafer Theory of Evidence: The objective of evidence theory is to model the effects of epistemic

uncertainties. Epistemic uncertainty refers to the situation where one does not know enough to specify a proba-

bility distribution on a variable. Sometimes epistemic uncertainty is referred to as subjective, reducible, or lack

of knowledge uncertainty. In contrast, aleatory uncertainty refers to the situation where one does have enough

information to specify a probability distribution. In Dempster-Shafer theory of evidence, the uncertain input vari-

ables are modeled as sets of intervals. The user assigns a basic probability assignment (BPA) to each interval,

indicating how likely it is that the uncertain input falls within the interval. The intervals may be overlapping,

contiguous, or have gaps. The intervals and their associated BPAs are then propagated through the simulation

to obtain cumulative distribution functions on belief and plausibility. Belief is the lower bound on a probability

estimate that is consistent with the evidence, and plausibility is the upper bound on a probability estimate that is

consistent with the evidence. In addition to the full evidence theory structure, we have a simpliﬁed capability for

users wanting to perform pure interval analysis (e.g. what is the interval on the output given intervals on the input)

using either global or local optimization methods. Interval analysis is often used to model epistemic variables in

nested analyses, where probability theory is used to model aleatory variables.

Bayesian Calibration: In Bayesian calibration, uncertain input parameters are described by a “prior” distribution.

The priors are updated with experimental data, in a Bayesian framework that involves the experimental data and

a likelihood function which describes how well each parameter value is supported by the data. After the process

of Bayesian calibration, the prior distribution becomes a posterior distribution. The posterior distribution of the

parameters represents the parameter values that have been updated with the data (e.g. the model, when run at

these parameters, should yield results that are consistent with the observational data).

Dakota Version 6.7 User’s Manual generated on November 13, 2017

72 CHAPTER 5. UNCERTAINTY QUANTIFICATION CAPABILITIES

5.1.2 Variables and Responses for UQ

All the UQ methods perform a forward uncertainty propagation in which probability or interval information for

input parameters is mapped to probability or interval information for output response functions. The mfunctions

in the Dakota response data set are interpreted as mgeneral response functions by the Dakota methods (with no

speciﬁc interpretation of the functions as for optimization and least squares).

Within the variables speciﬁcation, uncertain variable descriptions are employed to deﬁne the parameter probability

distributions (see Section 9.3). The continuous aleatory distribution types include: normal (Gaussian), lognormal,

uniform, loguniform, triangular, exponential, beta, gamma, gumbel, frechet, weibull, and histogram bin. The

discrete aleatory distribution types include: poisson, binomial, negative binomial, geometric, hypergeometric,

and histogram point. The epistemic distribution type is interval for continuous variables. For epistemic discrete

variables, there are three types: integer range, integer set, and real set. When gradient and/or Hessian information

is used in an uncertainty assessment, derivative components are normally computed with respect to the active

continuous variables, which could be aleatory uncertain, epistemic uncertain, aleatory and epistemic uncertain, or

all continuous variables, depending on the active view (see Section 9.5).

5.2 Sampling Methods

Sampling techniques are selected using the sampling method selection. This method generates sets of samples

according to the probability distributions of the uncertain variables and maps them into corresponding sets of

response functions, where the number of samples is speciﬁed by the samples integer speciﬁcation. Means,

standard deviations, coefﬁcients of variation (COVs), and 95% conﬁdence intervals are computed for the response

functions. Probabilities and reliabilities may be computed for response levels speciﬁcations, and response

levels may be computed for either probability levels or reliability levels speciﬁcations (refer

to the Method keywords section in the Dakota Reference Manual [3] for additional information).

Currently, traditional Monte Carlo (MC) and Latin hypercube sampling (LHS) are supported by Dakota and are

chosen by specifying sample type as random or lhs. In Monte Carlo sampling, the samples are selected

randomly according to the user-speciﬁed probability distributions. Latin hypercube sampling is a stratiﬁed sam-

pling technique for which the range of each uncertain variable is divided into Nssegments of equal probability,

where Nsis the number of samples requested. The relative lengths of the segments are determined by the nature

of the speciﬁed probability distribution (e.g., uniform has segments of equal width, normal has small segments

near the mean and larger segments in the tails). For each of the uncertain variables, a sample is selected randomly

from each of these equal probability segments. These Nsvalues for each of the individual parameters are then

combined in a shufﬂing operation to create a set of Nsparameter vectors with a speciﬁed correlation structure.

A feature of the resulting sample set is that every row and column in the hypercube of partitions has exactly one

sample. Since the total number of samples is exactly equal to the number of partitions used for each uncertain

variable, an arbitrary number of desired samples is easily accommodated (as compared to less ﬂexible approaches

in which the total number of samples is a product or exponential function of the number of intervals for each

variable, i.e., many classical design of experiments methods).

Advantages of sampling-based methods include their relatively simple implementation and their independence

from the scientiﬁc disciplines involved in the analysis. The main drawback of these techniques is the large number

of function evaluations needed to generate converged statistics, which can render such an analysis computationally

very expensive, if not intractable, for real-world engineering applications. LHS techniques, in general, require

fewer samples than traditional Monte Carlo for the same accuracy in statistics, but they still can be prohibitively

expensive. For further information on the method and its relationship to other sampling techniques, one is referred

to the works by McKay, et al. [101], Iman and Shortencarier [82], and Helton and Davis [76]. Note that under

Dakota Version 6.7 User’s Manual generated on November 13, 2017

5.2. SAMPLING METHODS 73

certain separability conditions associated with the function to be sampled, Latin hypercube sampling provides a

more accurate estimate of the mean value than does random sampling. That is, given an equal number of samples,

the LHS estimate of the mean will have less variance than the mean value obtained through random sampling.

Figure 5.1 demonstrates Latin hypercube sampling on a two-variable parameter space. Here, the range of both

parameters, x1and x2, is [0,1]. Also, for this example both x1and x2have uniform statistical distributions.

For Latin hypercube sampling, the range of each parameter is divided into p“bins” of equal probability. For

parameters with uniform distributions, this corresponds to partitions of equal size. For ndesign parameters,

this partitioning yields a total of pnbins in the parameter space. Next, psamples are randomly selected in the

parameter space, with the following restrictions: (a) each sample is randomly placed inside a bin, and (b) for all

one-dimensional projections of the psamples and bins, there will be one and only one sample in each bin. In a

two-dimensional example such as that shown in Figure 5.1, these LHS rules guarantee that only one bin can be

selected in each row and column. For p= 4, there are four partitions in both x1and x2. This gives a total of

16 bins, of which four will be chosen according to the criteria described above. Note that there is more than one

possible arrangement of bins that meet the LHS criteria. The dots in Figure 5.1 represent the four sample sites in

this example, where each sample is randomly located in its bin. There is no restriction on the number of bins in

the range of each parameter, however, all parameters must have the same number of bins.

Figure 5.1: An example of Latin hypercube sampling with four bins in design parameters x1and x2. The dots are

the sample sites.

The actual algorithm for generating Latin hypercube samples is more complex than indicated by the description

given above. For example, the Latin hypercube sampling method implemented in the LHS code [132] takes into

account a user-speciﬁed correlation structure when selecting the sample sites. For more details on the implemen-

tation of the LHS algorithm, see Reference [132].

In addition to Monte Carlo vs. LHS design choices, Dakota sampling methods support options for incrementally-

reﬁned designs, generation of approximately determinant-optimal (D-optimal) designs, and selection of sample

sizes to satisfy Wilks’ criteria.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

74 CHAPTER 5. UNCERTAINTY QUANTIFICATION CAPABILITIES

5.2.1 Uncertainty Quantiﬁcation Example using Sampling Methods

The input ﬁle in Figure 5.2 demonstrates the use of Latin hypercube Monte Carlo sampling for assessing prob-

ability of failure as measured by speciﬁed response levels. The two-variable Textbook example problem (see

Equation 20.1) will be used to demonstrate the application of sampling methods for uncertainty quantiﬁcation

where it is assumed that x1and x2are uniform uncertain variables on the interval [0,1].

The number of samples to perform is controlled with the samples speciﬁcation, the type of sampling algorithm

to use is controlled with the sample type speciﬁcation, the levels used for computing statistics on the response

functions is speciﬁed with the response levels input, and the seed speciﬁcation controls the sequence of

the pseudo-random numbers generated by the sampling algorithms. The input samples generated are shown in

Figure 5.3 for the case where samples = 5 and samples = 10 for both random (◦) and lhs (+) sample types.

# Dakota Input File: textbook_uq_sampling.in

environment

tabular_data

tabular_data_file = ’textbook_uq_sampling.dat’

top_method_pointer = ’UQ’

method

id_method = ’UQ’

sampling

sample_type lhs

samples = 10

seed = 98765

response_levels = 0.1 0.2 0.6

0.1 0.2 0.6

0.1 0.2 0.6

distribution cumulative

variables

uniform_uncertain = 2

lower_bounds = 0. 0.

upper_bounds = 1. 1.

descriptors = ’x1’ ’x2’

interface

id_interface = ’I1’

analysis_drivers = ’text_book’

fork

asynchronous evaluation_concurrency = 5

responses

response_functions = 3

no_gradients

no_hessians

Figure 5.2: Dakota input ﬁle for UQ example using LHS – see

Dakota/examples/users/textbook uq sampling.in

Latin hypercube sampling ensures full coverage of the range of the input variables, which is often a problem with

Monte Carlo sampling when the number of samples is small. In the case of samples = 5, poor stratiﬁcation

Dakota Version 6.7 User’s Manual generated on November 13, 2017

5.2. SAMPLING METHODS 75

Figure 5.3: Distribution of input sample points for random (4) and lhs (+) sampling for samples=5 and 10.

is evident in x1as four out of the ﬁve Monte Carlo samples are clustered in the range 0.35 < x1<0.55, and

the regions x1<0.3and 0.6< x1<0.9are completely missed. For the case where samples = 10, some

clustering in the Monte Carlo samples is again evident with 4samples in the range 0.5< x1<0.55. In both

cases, the stratiﬁcation with LHS is superior.

The response function statistics returned by Dakota are shown in Figure 5.4. The ﬁrst block of output speciﬁes

the response sample means, sample standard deviations, and skewness and kurtosis. The second block of output

displays conﬁdence intervals on the means and standard deviations of the responses. The third block deﬁnes

Probability Density Function (PDF) histograms of the samples: the histogram bins are deﬁned by the lower and

upper values of the bin and the corresponding density for that bin. Note that these bin endpoints correspond

to the response levels and/or probability levels deﬁned by the user in the Dakota input ﬁle. If

there are just a few levels, these histograms may be coarse. Dakota does not do anything to optimize the bin

size or spacing. Finally, the last section of the output deﬁnes the Cumulative Distribution Function (CDF) pairs.

In this case, distribution cumulative was speciﬁed for the response functions, and Dakota presents the

probability levels corresponding to the speciﬁed response levels (response levels) that were set. The default

compute probabilities was used. Alternatively, Dakota could have provided CCDF pairings, reliability

levels corresponding to prescribed response levels, or response levels corresponding to prescribed probability or

reliability levels.

In addition to obtaining statistical summary information of the type shown in Figure 5.4, the results of LHS

sampling also include correlations. Four types of correlations are returned in the output: simple and partial “raw”

correlations, and simple and partial “rank” correlations. The raw correlations refer to correlations performed

on the actual input and output data. Rank correlations refer to correlations performed on the ranks of the data.

Ranks are obtained by replacing the actual data by the ranked values, which are obtained by ordering the data

in ascending order. For example, the smallest value in a set of input samples would be given a rank 1, the next

smallest value a rank 2, etc. Rank correlations are useful when some of the inputs and outputs differ greatly in

magnitude: then it is easier to compare if the smallest ranked input sample is correlated with the smallest ranked

output, for example.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

76 CHAPTER 5. UNCERTAINTY QUANTIFICATION CAPABILITIES

Statistics based on 10 samples:

Sample moment statistics for each response function:

Mean Std Dev Skewness Kurtosis

response_fn_1 3.8383990322e-01 4.0281539886e-01 1.2404952971e+00 6.5529797327e-01

response_fn_2 7.4798705803e-02 3.4686110941e-01 4.5716015887e-01 -5.8418924529e-01

response_fn_3 7.0946176558e-02 3.4153246532e-01 5.2851897926e-01 -8.2527332042e-01

95% confidence intervals for each response function:

LowerCI_Mean UpperCI_Mean LowerCI_StdDev UpperCI_StdDev

response_fn_1 9.5683125821e-02 6.7199668063e-01 2.7707061315e-01 7.3538389383e-01

response_fn_2 -1.7333078422e-01 3.2292819583e-01 2.3858328290e-01 6.3323317325e-01

response_fn_3 -1.7337143113e-01 3.1526378424e-01 2.3491805390e-01 6.2350514636e-01

Probability Density Function (PDF) histograms for each response function:

PDF for response_fn_1:

Bin Lower Bin Upper Density Value

--------- --------- -------------

2.3066424677e-02 1.0000000000e-01 3.8994678038e+00

1.0000000000e-01 2.0000000000e-01 2.0000000000e+00

2.0000000000e-01 6.0000000000e-01 5.0000000000e-01

6.0000000000e-01 1.2250968624e+00 4.7992562123e-01

PDF for response_fn_2:

Bin Lower Bin Upper Density Value

--------- --------- -------------

-3.5261164651e-01 1.0000000000e-01 1.1046998102e+00

1.0000000000e-01 2.0000000000e-01 2.0000000000e+00

2.0000000000e-01 6.0000000000e-01 5.0000000000e-01

6.0000000000e-01 6.9844576220e-01 1.0157877573e+00

PDF for response_fn_3:

Bin Lower Bin Upper Density Value

--------- --------- -------------

-3.8118095128e-01 1.0000000000e-01 1.2469321539e+00

1.0000000000e-01 2.0000000000e-01 0.0000000000e+00

2.0000000000e-01 6.0000000000e-01 7.5000000000e-01

6.0000000000e-01 6.4526450977e-01 2.2092363423e+00

Level mappings for each response function:

Cumulative Distribution Function (CDF) for response_fn_1:

Response Level Probability Level Reliability Index General Rel Index

-------------- ----------------- ----------------- -----------------

1.0000000000e-01 3.0000000000e-01

2.0000000000e-01 5.0000000000e-01

6.0000000000e-01 7.0000000000e-01

Cumulative Distribution Function (CDF) for response_fn_2:

Response Level Probability Level Reliability Index General Rel Index

-------------- ----------------- ----------------- -----------------

1.0000000000e-01 5.0000000000e-01

2.0000000000e-01 7.0000000000e-01

6.0000000000e-01 9.0000000000e-01

Cumulative Distribution Function (CDF) for response_fn_3:

Response Level Probability Level Reliability Index General Rel Index

-------------- ----------------- ----------------- -----------------

1.0000000000e-01 6.0000000000e-01

2.0000000000e-01 6.0000000000e-01

6.0000000000e-01 9.0000000000e-01

Figure 5.4: Dakota response function statistics from UQ sampling example.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

5.2. SAMPLING METHODS 77

Correlations are always calculated between two sets of sample data. One can calculate correlation coefﬁcients

between two input variables, between an input and an output variable (probably the most useful), or between two

output variables. The simple correlation coefﬁcients presented in the output tables are Pearson’s correlation co-

efﬁcient, which is deﬁned for two variables xand yas: Corr(x, y) = Pi(xi−¯x)(yi−¯y)

√Pi(xi−¯x)2Pi(yi−¯y)2. Partial correlation

coefﬁcients are similar to simple correlations, but a partial correlation coefﬁcient between two variables measures

their correlation while adjusting for the effects of the other variables. For example, say one has a problem with

two inputs and one output; and the two inputs are highly correlated. Then the correlation of the second input and

the output may be very low after accounting for the effect of the ﬁrst input. The rank correlations in Dakota are

obtained using Spearman’s rank correlation. Spearman’s rank is the same as the Pearson correlation coefﬁcient

except that it is calculated on the rank data.

Figure 5.5 shows an example of the correlation output provided by Dakota for the input ﬁle in Figure 5.2. Note that

these correlations are presently only available when one speciﬁes lhs as the sampling method under sampling.

Also note that the simple and partial correlations should be similar in most cases (in terms of values of correlation

coefﬁcients). This is because we use a default “restricted pairing” method in the LHS routine which forces near-

zero correlation amongst uncorrelated inputs.

Simple Correlation Matrix between input and output:

x1 x2 response_fn_1 response_fn_2 response_fn_3

x1 1.00000e+00

x2 -7.22482e-02 1.00000e+00

response_fn_1 -7.04965e-01 -6.27351e-01 1.00000e+00

response_fn_2 8.61628e-01 -5.31298e-01 -2.60486e-01 1.00000e+00

response_fn_3 -5.83075e-01 8.33989e-01 -1.23374e-01 -8.92771e-01 1.00000e+00

Partial Correlation Matrix between input and output:

response_fn_1 response_fn_2 response_fn_3

x1 -9.65994e-01 9.74285e-01 -9.49997e-01

x2 -9.58854e-01 -9.26578e-01 9.77252e-01

Simple Rank Correlation Matrix between input and output:

x1 x2 response_fn_1 response_fn_2 response_fn_3

x1 1.00000e+00

x2 -6.66667e-02 1.00000e+00

response_fn_1 -6.60606e-01 -5.27273e-01 1.00000e+00

response_fn_2 8.18182e-01 -6.00000e-01 -2.36364e-01 1.00000e+00

response_fn_3 -6.24242e-01 7.93939e-01 -5.45455e-02 -9.27273e-01 1.00000e+00

Partial Rank Correlation Matrix between input and output:

response_fn_1 response_fn_2 response_fn_3

x1 -8.20657e-01 9.74896e-01 -9.41760e-01

x2 -7.62704e-01 -9.50799e-01 9.65145e-01

Figure 5.5: Correlation results using LHS Sampling.

Finally, note that the LHS package can be used for design of experiments over design and state variables by

including an active view override in the variables speciﬁcation section of the Dakota input ﬁle (see Section 9.5.1).

Then, instead of iterating on only the uncertain variables, the LHS package will sample over all of the active

variables. In the active all view, continuous design and continuous state variables are treated as having

uniform probability distributions within their upper and lower bounds, discrete design and state variables are

sampled uniformly from within their sets or ranges, and any uncertain variables are sampled within their speciﬁed

Dakota Version 6.7 User’s Manual generated on November 13, 2017

78 CHAPTER 5. UNCERTAINTY QUANTIFICATION CAPABILITIES

probability distributions.

5.2.2 Incremental Sampling

In many situations, one may run an initial sample set and then need to perform further sampling to get better

estimates of the mean, variance, and percentiles, and to obtain more comprehensive sample coverage. We call

this capability incremental sampling. Typically, a Dakota restart ﬁle (dakota.rst) would be available from

the original sample, so only the newly generated samples would need to be evaluated. Incremental sampling sup-

ports continuous uncertain variables and discrete uncertain variables such as discrete distributions (e.g. binomial,

Poisson, etc.) as well as histogram variables and uncertain set types.

There are two cases, incremental random and incremental Latin hypercube sampling, with incremental LHS being

the most common. One major advantage of LHS incremental sampling is that it maintains the stratiﬁcation and

correlation structure of the original LHS sample. That is, if one generated two independent LHS samples and

simply merged them, the calculation of the accuracy of statistical measures such as the mean and the variance

would be slightly incorrect. However, in the incremental case, the full sample (double the original size) is a Latin

Hypercube sample itself and statistical measures and their accuracy can be properly calculated. The incremental

sampling capability is most useful when one is starting off with very small samples. Once the sample size is more

than a few hundred, the beneﬁt of incremental sampling diminishes.

1. Incremental random sampling: With incremental random sampling, the original sample set with N1sam-

ples must be generated using sample type = random and samples = N1. Then, the user can du-

plicate the Dakota input ﬁle and add refinement samples = N2 with the number of new samples

N2to be added. Random incremental sampling does not require a doubling of samples each time. Thus,

the user can specify any number of refinement samples (from an additional one sample to a large

integer).

For example, if the ﬁrst sample has 50 samples, and 10 more samples are desired, the second Dakota

run should specify samples = 50,refinement samples = 10. In this situation, only 10 new

samples will be generated, and the ﬁnal statistics will be reported at the end of the study both for the

initial 50 samples and for the full sample of 60. The command line syntax for running the second sample

is dakota -i input60.in -r dakota.50.rst where input60.in is the input ﬁle with the

reﬁnement samples speciﬁcation and dakota.50.rst is the restart ﬁle containing the initial 50 samples.

Note that if the restart ﬁle has a different name, that is ﬁne; the correct restart ﬁle name should be used.

This process can be repeated if desired,arbitrarily extending the total sample size each time, e.g, samples

= 50,refinement samples = 10 3 73 102.

2. Incremental Latin hypercube sampling: With incremental LHS sampling, the original sample set with N1

samples must be generated using sample type = lhs and samples = N1. Then, the user can du-

plicate the Dakota input ﬁle and add refinement samples = N1. The sample size must double each

time, so the ﬁrst set of reﬁnement samples must be the same size as the initial set. That is, if one starts with

a very small sample size of 10, then one can use the incremental sampling capability to generate sample

sizes of 20, 40, 80, etc.

For example, if the ﬁrst sample has 50 samples, in the second Dakota run, the number of reﬁnement samples

should be set to 50 for a total of 100. In this situation, only 50 new samples will be generated, and at the

end of the study ﬁnal statistics will be reported both for the initial 50 samples and for the full sample

of 100. The command line syntax for running the second sample is dakota -i input100.in -r

dakota.50.rst, where input100.in is the input ﬁle with the incremental sampling speciﬁcation

and dakota.50.rst is the restart ﬁle containing the initial 50 samples. Note that if the restart ﬁle has a

different name, that is ﬁne; the correct restart ﬁle name should be used.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

5.3. RELIABILITY METHODS 79

This process can be repeated if desired, doubling the total sample size each time, e.g, samples = 50,

refinement samples = 50 100 200 400.

5.2.3 Principal Component Analysis

As of Dakota 6.3, we added a capability to perform Principal Component Analysis on ﬁeld response data when

using LHS sampling. Principal components analysis (PCA) is a data reduction method and allows one to express

an ensemble of ﬁeld data with a set of principal components responsible for the spread of that data.

Dakota can calculate the principal components of the response matrix of N samples * L responses (the ﬁeld

response of length L) using the keyword principal components. The Dakota implementation is under

active development: the PCA capability may ultimately be speciﬁed elsewhere or used in different ways. For

now, it is performed as a post-processing analysis based on a set of Latin Hypercube samples.

If the user speciﬁes LHS sampling with ﬁeld data responses and also speciﬁes principal components,

Dakota will calculate the principal components by calculating the eigenvalues and eigenvectors of a centered data

matrix. Further, if the user speciﬁes percent variance explained = 0.99, the number of components that

accounts for at least 99 percent of the variance in the responses will be retained. The default for this percentage

is 0.95. In many applications, only a few principal components explain the majority of the variance, resulting in

signiﬁcant data reduction. The principal components are written to a ﬁle, princ comp.txt. Dakota also uses

the principal components to create a surrogate model by representing the overall response as weighted sum of M

principal components, where the weights will be determined by Gaussian processes which are a function of the

input uncertain variables. This reduced form then can be used for sensitivity analysis, calibration, etc.

5.2.4 Wilks-based Sample Sizes

Most of the sampling methods require the user to specify the number of samples in advance. However, if one

speciﬁes random sampling, one can use an approach developed by Wilks[149] to determine the number of

samples that ensures a particular conﬁdence level in a percentile of interest. The Wilks method of computing the

number of samples to execute for a random sampling study is based on order statistics, eg considering the outputs

ordered from smallest to largest [149,108]. Given a probability level,α, and confidence level,

β, the Wilks calculation determines the minimum number of samples required such that there is (β∗100)%

conﬁdence that the (α∗100)%-ile of the uncertain distribution on model output will fall below the actual (α∗

100)%-ile given by the sample. To be more speciﬁc, if we wish to calculate the 95% conﬁdence limit on the

95th percentile, Wilks indicates that 59 samples are needed. If we order the responses and take the largest

one, that value deﬁnes a tolerance limit on the 95th percentile: we have a situation where 95% of the time,

the 95th percentile will fall at or below that sampled value. This represents a one sided upper treatment

applicable to the largest output value. This treatment can be reversed to apply to the lowest output value by

using the one sided lower option, and further expansion to include an interval containing both the smallest

and the largest output values in the statistical statement can be speciﬁed via the two sided option. Additional

generalization to higher order statistics, eg a statement applied to the N largest outputs (one sided upper) or

the N smallest and N largest outputs (two sided), can be speciﬁed using the order option along with value N.

5.3 Reliability Methods

Reliability methods provide an alternative approach to uncertainty quantiﬁcation which can be less computa-

tionally demanding than sampling techniques. Reliability methods for uncertainty quantiﬁcation are based on

Dakota Version 6.7 User’s Manual generated on November 13, 2017

80 CHAPTER 5. UNCERTAINTY QUANTIFICATION CAPABILITIES

probabilistic approaches that compute approximate response function distribution statistics based on speciﬁed un-

certain variable distributions. These response statistics include response mean, response standard deviation, and

cumulative or complementary cumulative distribution functions (CDF/CCDF). These methods are often more ef-

ﬁcient at computing statistics in the tails of the response distributions (events with low probability) than sampling

based approaches since the number of samples required to resolve a low probability can be prohibitive.

The methods all answer the fundamental question: “Given a set of uncertain input variables, X, and a scalar

response function, g, what is the probability that the response function is below or above a certain level, ¯z?” The

former can be written as P[g(X)≤¯z] = Fg(¯z)where Fg(¯z)is the cumulative distribution function (CDF) of the

uncertain response g(X)over a set of response levels. The latter can be written as P[g(X)>¯z]and deﬁnes the

complementary cumulative distribution function (CCDF).

This probability calculation involves a multi-dimensional integral over an irregularly shaped domain of interest,

D, where g(X)< z as displayed in Figure 5.6 for the case of two variables. The reliability methods all involve the

transformation of the user-speciﬁed uncertain variables, X, with probability density function, p(x1, x2), which

can be non-normal and correlated, to a space of independent Gaussian random variables, u, possessing a mean

value of zero and unit variance (i.e., standard normal variables). The region of interest, D, is also mapped to

the transformed space to yield, Du, where g(U)< z as shown in Figure 5.7. The Nataf transformation [28],

which is identical to the Rosenblatt transformation [114] in the case of independent random variables, is used in

Dakota to accomplish this mapping. This transformation is performed to make the probability calculation more

tractable. In the transformed space, probability contours are circular in nature as shown in Figure 5.7 unlike in

the original uncertain variable space, Figure 5.6. Also, the multi-dimensional integrals can be approximated by

simple functions of a single parameter, β, called the reliability index. βis the minimum Euclidean distance from

the origin in the transformed space to the response surface. This point is also known as the most probable point

(MPP) of failure. Note, however, the methodology is equally applicable for generic functions, not simply those

corresponding to failure criteria; this nomenclature is due to the origin of these methods within the disciplines

of structural safety and reliability. Note that there are local and global reliability methods. The majority of the

methods available are local, meaning that a local optimization formulation is used to locate one MPP. In contrast,

global methods can ﬁnd multiple MPPs if they exist.

Figure 5.6: Graphical depiction of calculation of cumulative distribution function in the original uncertain variable

space.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

5.3. RELIABILITY METHODS 81

Table 5.1: Mapping from Dakota options to standard reliability methods.

Order of approximation and integration

MPP search First order Second order

none MVFOSM MVSOSM

x taylor mean AMV AMV2

u taylor mean u-space AMV u-space AMV2

x taylor mpp AMV+ AMV2+

u taylor mpp u-space AMV+ u-space AMV2+

x two point TANA

u two point u-space TANA

no approx FORM SORM

Figure 5.7: Graphical depiction of integration for the calculation of cumulative distribution function in the trans-

formed uncertain variable space.

5.3.1 Local Reliability Methods

The Dakota Theory Manual [4] provides the algorithmic details for the local reliability methods, including the

Mean Value method and the family of most probable point (MPP) search methods.

5.3.1.1 Method mapping

Given settings for limit state approximation, approximation order, integration approach, and other details pre-

sented to this point, it is evident that the number of algorithmic combinations is high. Table 5.1 provides a

succinct mapping for some of these combinations to common method names from the reliability literature, where

blue indicates the most well-known combinations and gray indicates other supported combinations.

Within the Dakota speciﬁcation (refer to local reliability in the keywords section of the Reference Man-

ual [3]) within the Reference Manual), the MPP search and integration order selections are explicit in the method

speciﬁcation, but the order of the approximation is inferred from the associated response speciﬁcation (as is done

Dakota Version 6.7 User’s Manual generated on November 13, 2017

82 CHAPTER 5. UNCERTAINTY QUANTIFICATION CAPABILITIES

with local taylor series approximations described in Section 8.4.3.2). Thus, reliability methods do not have to be

synchronized in approximation and integration order as shown in the table; however, it is often desirable to do so.

5.3.2 Global Reliability Methods

Global reliability methods are designed to handle nonsmooth and multimodal failure surfaces, by creating global

approximations based on Gaussian process models. They accurately resolve a particular contour of a response

function and then estimate probabilities using multimodal adaptive importance sampling.

The global reliability method in Dakota is called Efﬁcient Global Reliability Analysis (EGRA) [12]. The name

is due to its roots in efﬁcient global optimization (EGO) [84,81]. The main idea in EGO-type optimization

methods is that a global approximation is made of the underlying function. This approximation, which is a

Gaussian process model, is used to guide the search by ﬁnding points which maximize the expected improvement

function (EIF). The EIF is used to select the location at which a new training point should be added to the Gaussian

process model by maximizing the amount of improvement in the objective function that can be expected by adding

that point. A point could be expected to produce an improvement in the objective function if its predicted value

is better than the current best solution, or if the uncertainty in its prediction is such that the probability of it

producing a better solution is high. Because the uncertainty is higher in regions of the design space with fewer

observations, this provides a balance between exploiting areas of the design space that predict good solutions, and

exploring areas where more information is needed.

The general procedure of these EGO-type methods is:

1. Build an initial Gaussian process model of the objective function.

2. Find the point that maximizes the EIF. If the EIF value at this point is sufﬁciently small, stop.

3. Evaluate the objective function at the point where the EIF is maximized. Update the Gaussian process

model using this new point. Go to Step 2.

Gaussian process (GP) models are used because they provide not just a predicted value at an unsampled point, but

also an estimate of the prediction variance. This variance gives an indication of the uncertainty in the GP model,

which results from the construction of the covariance function. This function is based on the idea that when input

points are near one another, the correlation between their corresponding outputs will be high. As a result, the

uncertainty associated with the model’s predictions will be small for input points which are near the points used

to train the model, and will increase as one moves further from the training points.

The expected improvement function is used in EGO algorithms to select the location at which a new training point

should be added. The EIF is deﬁned as the expectation that any point in the search space will provide a better

solution than the current best solution based on the expected values and variances predicted by the GP model. It

is important to understand how the use of this EIF leads to optimal solutions. The EIF indicates how much the

objective function value at a new potential location is expected to be less than the predicted value at the current

best solution. Because the GP model provides a Gaussian distribution at each predicted point, expectations can

be calculated. Points with good expected values and even a small variance will have a signiﬁcant expectation of

producing a better solution (exploitation), but so will points that have relatively poor expected values and greater

variance (exploration).

The application of EGO to reliability analysis, however, is made more complicated due to the inclusion of equality

constraints. In forward reliability analysis, the response function appears as a constraint rather than the objective.

That is, we want to satisfy the constraint that the response equals a threshold value and is on the limit state:

G(u) = ¯z. Therefore, the EIF function was modiﬁed to focus on feasibility, and instead of using an expected

Dakota Version 6.7 User’s Manual generated on November 13, 2017

5.3. RELIABILITY METHODS 83

improvement function, we use an expected feasibility function (EFF) [12]. The EFF provides an indication of

how well the response is expected to satisfy the equality constraint. Points where the expected value is close to

the threshold (µG≈¯z) and points with a large uncertainty in the prediction will have large expected feasibility

values.

The general outline of the EGRA algorithm is as follows: LHS sampling is used to generate a small number of

samples from the true response function. Then, an initial Gaussian process model is constructed. Based on the

EFF, the point with maximum EFF is found using the global optimizer DIRECT. The true response function is

then evaluated at this new point, and this point is added to the sample set and the process of building a new GP

model and maximizing the EFF is repeated until the maximum EFF is small. At this stage, the GP model is

accurate in the vicinity of the limit state. The GP model is then used to calculate the probability of failure using

multimodal importance sampling, which is explained below.

One method to calculate the probability of failure is to directly perform the probability integration numerically

by sampling the response function. Sampling methods can be prohibitively expensive because they generally

require a large number of response function evaluations. Importance sampling methods reduce this expense by

focusing the samples in the important regions of the uncertain space. They do this by centering the sampling

density function at the MPP rather than at the mean. This ensures the samples will lie the region of interest,

thus increasing the efﬁciency of the sampling method. Adaptive importance sampling (AIS) further improves the

efﬁciency by adaptively updating the sampling density function. Multimodal adaptive importance sampling [29]

is a variation of AIS that allows for the use of multiple sampling densities making it better suited for cases where

multiple sections of the limit state are highly probable.

Note that importance sampling methods require that the location of at least one MPP be known because it is

used to center the initial sampling density. However, current gradient-based, local search methods used in MPP

search may fail to converge or may converge to poor solutions for highly nonlinear problems, possibly making

these methods inapplicable. The EGRA algorithm described above does not depend on the availability of accurate

gradient information, making convergence more reliable for nonsmooth response functions. Moreover, EGRA has

the ability to locate multiple failure points, which can provide multiple starting points and thus a good multimodal

sampling density for the initial steps of multimodal AIS. The probability assessment using multimodal AIS thus

incorporates probability of failure at multiple points.

5.3.3 Uncertainty Quantiﬁcation Examples using Reliability Analysis

In summary, the user can choose to perform either forward (RIA) or inverse (PMA) mappings when performing

a reliability analysis. With either approach, there are a variety of methods from which to choose in terms of limit

state approximations (MVFOSM, MVSOSM, x-/u-space AMV, x-/u-space AMV2, x-/u-space AMV+, x-/u-space

AMV2+, x-/u-space TANA, and FORM/SORM), probability integrations (ﬁrst-order or second-order), limit state

Hessian selection (analytic, ﬁnite difference, BFGS, or SR1), and MPP optimization algorithm (SQP or NIP)

selections.

All reliability methods output approximate values of the CDF/CCDF response-probability-reliability levels for

prescribed response levels (RIA) or prescribed probability or reliability levels (PMA). In addition, mean value

methods output estimates of the response means and standard deviations as well as importance factors that attribute

variance among the set of uncertain variables (provided a nonzero response variance estimate).

5.3.3.1 Mean-value Reliability with Textbook

Figure 5.8 shows the Dakota input ﬁle for an example problem that demonstrates the simplest reliability method,

called the mean value method (also referred to as the Mean Value First Order Second Moment method). It is

Dakota Version 6.7 User’s Manual generated on November 13, 2017

84 CHAPTER 5. UNCERTAINTY QUANTIFICATION CAPABILITIES

speciﬁed with method keyword local reliability. This method calculates the mean and variance of the

response function based on information about the mean and variance of the inputs and gradient information at the

mean of the inputs. The mean value method is extremely cheap computationally (only ﬁve runs were required

for the textbook function), but can be quite inaccurate, especially for nonlinear problems and/or problems with

uncertain inputs that are signiﬁcantly non-normal. More detail on the mean value method can be found in the

Local Reliability Methods section of the Dakota Theory Manual [4], and more detail on reliability methods in

general (including the more advanced methods) is found in Section 5.3.

Example output from the mean value method is displayed in Figure 5.9. Note that since the mean of both inputs

is 1, the mean value of the output for response 1 is zero. However, the mean values of the constraints are both

0.5. The mean value results indicate that variable x1 is more important in constraint 1 while x2 is more important

in constraint 2, which is the case based on Equation 20.1. The importance factors are not available for the ﬁrst

response as the standard deviation is zero.

# Dakota Input File: textbook_uq_meanvalue.in

environment

#graphics

method

local_reliability

interface

analysis_drivers = ’text_book’

fork asynchronous

variables

lognormal_uncertain = 2

means = 1. 1.

std_deviations = 0.5 0.5

descriptors = ’TF1ln’ ’TF2ln’

responses

response_functions = 3

numerical_gradients

method_source dakota

interval_type central

fd_gradient_step_size = 1.e-4

no_hessians

Figure 5.8: Mean Value Reliability Method: the Dakota input ﬁle – see

Dakota/examples/users/textbook uq meanvalue.in

5.3.3.2 FORM Reliability with Lognormal Ratio

This example quantiﬁes the uncertainty in the “log ratio” response function:

g(x1, x2) = x1

x2

(5.1)

Dakota Version 6.7 User’s Manual generated on November 13, 2017

5.3. RELIABILITY METHODS 85

MV Statistics for response_fn_1:

Approximate Mean Response = 0.0000000000e+00

Approximate Standard Deviation of Response = 0.0000000000e+00

Importance Factors not available.

MV Statistics for response_fn_2:

Approximate Mean Response = 5.0000000000e-01

Approximate Standard Deviation of Response = 1.0307764064e+00

Importance Factor for TF1ln = 9.4117647059e-01

Importance Factor for TF2ln = 5.8823529412e-02

MV Statistics for response_fn_3:

Approximate Mean Response = 5.0000000000e-01

Approximate Standard Deviation of Response = 1.0307764064e+00

Importance Factor for TF1ln = 5.8823529412e-02

Importance Factor for TF2ln = 9.4117647059e-01

Figure 5.9: Results of the Mean Value Method on the Textbook Function

by computing approximate response statistics using reliability analysis to determine the response cumulative

distribution function:

P[g(x1, x2)<¯z](5.2)

where X1and X2are identically distributed lognormal random variables with means of 1, standard deviations of

0.5, and correlation coefﬁcient of 0.3.

A Dakota input ﬁle showing RIA using FORM (option 7 in limit state approximations combined with ﬁrst-order

integration) is listed in Figure 5.10. The user ﬁrst speciﬁes the local reliability method, followed by

the MPP search approach and integration order. In this example, we specify mpp search no approx and

utilize the default ﬁrst-order integration to select FORM. Finally, the user speciﬁes response levels or probabili-

ty/reliability levels to determine if the problem will be solved using an RIA approach or a PMA approach. In the

example ﬁgure of 5.10, we use RIA by specifying a range of response levels for the problem. The resulting

output for this input is shown in Figure 5.11, with probability and reliability levels listed for each response level.

Figure 5.12 shows that FORM compares favorably to an exact analytic solution for this problem. Also note that

FORM does have some error in the calculation of CDF values for this problem, but it is a very small error (on the

order of e-11), much smaller than the error obtained when using a Mean Value method, which will be discussed

next.

If the user speciﬁes local reliability as a method with no additional speciﬁcation on how to do the MPP

search (for example, by commenting out mpp search no approx in Figure 5.10), then no MPP search is

done: the Mean Value method is used. The mean value results are shown in Figure 5.13 and consist of approximate

mean and standard deviation of the response, the importance factors for each uncertain variable, and approximate

probability/reliability levels for the prescribed response levels that have been inferred from the approximate mean

and standard deviation (see Mean Value section in Reliability Methods Chapter of Dakota Theory Manual [4]).

It is evident that the statistics are considerably different from the fully converged FORM results; however, these

rough approximations are also much less expensive to calculate. The importance factors are a measure of the

sensitivity of the response function(s) to the uncertain input variables. A comparison of the mean value results

with the FORM results is shown in Figure 5.12. The mean value results are not accurate near the tail values of the

CDF, and can differ from the exact solution by as much as 0.11 in CDF estimates. A comprehensive comparison

of various reliability methods applied to the logratio problem is provided in [36].

Additional reliability analysis and design results are provided in Sections 20.10.1-20.10.5.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

86 CHAPTER 5. UNCERTAINTY QUANTIFICATION CAPABILITIES

# Dakota Input File: logratio_uq_reliability.in

environment

graphics

method

local_reliability

mpp_search no_approx

response_levels = .4 .5 .55 .6 .65 .7

.75 .8 .85 .9 1. 1.05 1.15 1.2 1.25 1.3

1.35 1.4 1.5 1.55 1.6 1.65 1.7 1.75

variables

lognormal_uncertain = 2

means = 1. 1

std_deviations = 0.5 0.5

initial_point = 0.6 1.4

descriptors = ’TF1ln’ ’TF2ln’

uncertain_correlation_matrix = 1 0.3

0.3 1

interface

analysis_drivers = ’log_ratio’

direct

# fork asynch

responses

response_functions = 1

numerical_gradients

method_source dakota

interval_type central

fd_step_size = 1.e-4

no_hessians

Figure 5.10: Dakota input ﬁle for Reliability UQ example using FORM – see

Dakota/examples/users/logratio uq reliability.in

Dakota Version 6.7 User’s Manual generated on November 13, 2017

5.3. RELIABILITY METHODS 87

Cumulative Distribution Function (CDF) for response_fn_1:

Response Level Probability Level Reliability Index

-------------- ----------------- -----------------

4.0000000000e-01 4.7624085962e-02 1.6683404020e+00

5.0000000000e-01 1.0346525475e-01 1.2620507942e+00

5.5000000000e-01 1.3818404972e-01 1.0885143628e+00

6.0000000000e-01 1.7616275822e-01 9.3008801339e-01

6.5000000000e-01 2.1641741368e-01 7.8434989943e-01

7.0000000000e-01 2.5803428381e-01 6.4941748143e-01

7.5000000000e-01 3.0020938124e-01 5.2379840558e-01

8.0000000000e-01 3.4226491013e-01 4.0628960782e-01

8.5000000000e-01 3.8365052982e-01 2.9590705956e-01

9.0000000000e-01 4.2393548232e-01 1.9183562480e-01

1.0000000000e+00 5.0000000000e-01 6.8682233460e-12

1.0500000000e+00 5.3539344228e-01 -8.8834907167e-02

1.1500000000e+00 6.0043460094e-01 -2.5447217462e-01

1.2000000000e+00 6.3004131827e-01 -3.3196278078e-01

1.2500000000e+00 6.5773508987e-01 -4.0628960782e-01

1.3000000000e+00 6.8356844630e-01 -4.7770089473e-01

1.3500000000e+00 7.0761025532e-01 -5.4641676380e-01

1.4000000000e+00 7.2994058691e-01 -6.1263331274e-01

1.5000000000e+00 7.6981945355e-01 -7.3825238860e-01

1.5500000000e+00 7.8755158269e-01 -7.9795460350e-01

1.6000000000e+00 8.0393505584e-01 -8.5576118635e-01

1.6500000000e+00 8.1906005158e-01 -9.1178881995e-01

1.7000000000e+00 8.3301386860e-01 -9.6614373461e-01

1.7500000000e+00 8.4588021938e-01 -1.0189229206e+00

Figure 5.11: Output from Reliability UQ example using FORM.

Figure 5.12: Comparison of the cumulative distribution function (CDF) computed by FORM, the Mean Value

method, and the exact CDF for g(x1, x2) = x1

x2

Dakota Version 6.7 User’s Manual generated on November 13, 2017

88 CHAPTER 5. UNCERTAINTY QUANTIFICATION CAPABILITIES

MV Statistics for response_fn_1:

Approximate Mean Response = 1.0000000000e+00

Approximate Standard Deviation of Response = 5.9160798127e-01

Importance Factor for TF1ln = 7.1428570714e-01

Importance Factor for TF2ln = 7.1428572143e-01

Importance Factor for TF1ln TF2ln = -4.2857142857e-01

Cumulative Distribution Function (CDF) for response_fn_1:

Response Level Probability Level Reliability Index General Rel Index

-------------- ----------------- ----------------- -----------------

4.0000000000e-01 1.5524721837e-01 1.0141851006e+00 1.0141851006e+00

5.0000000000e-01 1.9901236093e-01 8.4515425050e-01 8.4515425050e-01

5.5000000000e-01 2.2343641149e-01 7.6063882545e-01 7.6063882545e-01

6.0000000000e-01 2.4948115037e-01 6.7612340040e-01 6.7612340040e-01

6.5000000000e-01 2.7705656603e-01 5.9160797535e-01 5.9160797535e-01

7.0000000000e-01 3.0604494093e-01 5.0709255030e-01 5.0709255030e-01

7.5000000000e-01 3.3630190949e-01 4.2257712525e-01 4.2257712525e-01

8.0000000000e-01 3.6765834596e-01 3.3806170020e-01 3.3806170020e-01

8.5000000000e-01 3.9992305332e-01 2.5354627515e-01 2.5354627515e-01

9.0000000000e-01 4.3288618783e-01 1.6903085010e-01 1.6903085010e-01

1.0000000000e+00 5.0000000000e-01 0.0000000000e+00 0.0000000000e+00

1.0500000000e+00 5.3367668035e-01 -8.4515425050e-02 -8.4515425050e-02

1.1500000000e+00 6.0007694668e-01 -2.5354627515e-01 -2.5354627515e-01

1.2000000000e+00 6.3234165404e-01 -3.3806170020e-01 -3.3806170020e-01

1.2500000000e+00 6.6369809051e-01 -4.2257712525e-01 -4.2257712525e-01

1.3000000000e+00 6.9395505907e-01 -5.0709255030e-01 -5.0709255030e-01

1.3500000000e+00 7.2294343397e-01 -5.9160797535e-01 -5.9160797535e-01

1.4000000000e+00 7.5051884963e-01 -6.7612340040e-01 -6.7612340040e-01

1.5000000000e+00 8.0098763907e-01 -8.4515425050e-01 -8.4515425050e-01

1.5500000000e+00 8.2372893005e-01 -9.2966967555e-01 -9.2966967555e-01

1.6000000000e+00 8.4475278163e-01 -1.0141851006e+00 -1.0141851006e+00

1.6500000000e+00 8.6405064339e-01 -1.0987005257e+00 -1.0987005257e+00

1.7000000000e+00 8.8163821351e-01 -1.1832159507e+00 -1.1832159507e+00

1.7500000000e+00 8.9755305196e-01 -1.2677313758e+00 -1.2677313758e+00

Figure 5.13: Output from Reliability UQ example using mean value.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

5.4. STOCHASTIC EXPANSION METHODS 89

5.4 Stochastic Expansion Methods

The development of these techniques mirrors that of deterministic ﬁnite element analysis through the utilization

of the concepts of projection, orthogonality, and weak convergence. The polynomial chaos expansion is based

on a multidimensional orthogonal polynomial approximation and the stochastic collocation approach is based

on a multidimensional interpolation polynomial approximation, both formed in terms of standardized random

variables. A distinguishing feature of these two methodologies is that the ﬁnal solution is expressed as a func-

tional mapping, and not merely as a set of statistics as is the case for many nondeterministic methodologies. This

makes these techniques particularly attractive for use in multi-physics applications which link different analysis

packages. The ﬁrst stochastic expansion method is the polynomial chaos expansion (PCE) [53,54]. For smooth

functions (i.e., analytic, inﬁnitely-differentiable) in L2(i.e., possessing ﬁnite variance), exponential convergence

rates can be obtained under order reﬁnement for integrated statistical quantities of interest such as mean, vari-

ance, and probability. Dakota implements the generalized PCE approach using the Wiener-Askey scheme [152],

in which Hermite, Legendre, Laguerre, Jacobi, and generalized Laguerre orthogonal polynomials are used for

modeling the effect of continuous random variables described by normal, uniform, exponential, beta, and gamma

probability distributions, respectively1. These orthogonal polynomial selections are optimal for these distribution

types since the inner product weighting function corresponds2to the probability density functions for these con-

tinuous distributions. Orthogonal polynomials can be computed for any positive weight function, so these ﬁve

classical orthogonal polynomials may be augmented with numerically-generated polynomials for other probabil-

ity distributions (e.g., for lognormal, extreme value, and histogram distributions). When independent standard

random variables are used (or computed through transformation), the variable expansions are uncoupled, allow-

ing the polynomial orthogonality properties to be applied on a per-dimension basis. This allows one to mix and

match the polynomial basis used for each variable without interference with the spectral projection scheme for

the response.

In non-intrusive PCE, simulations are used as black boxes and the calculation of chaos expansion coefﬁcients for

response metrics of interest is based on a set of simulation response evaluations. To calculate these response PCE

coefﬁcients, two primary classes of approaches have been proposed: spectral projection and regression. The spec-

tral projection approach projects the response against each basis function using inner products and employs the

polynomial orthogonality properties to extract each coefﬁcient. Each inner product involves a multidimensional

integral over the support range of the weighting function, which can be evaluated numerically using sampling,

tensor-product quadrature, Smolyak sparse grid [124], or cubature [128] approaches. The regression approach

ﬁnds a set of PCE coefﬁcients which best match a set of response values obtained from either a design of computer

experiments (“point collocation” [145]) or from a randomly selected subset of tensor Gauss points (“probabilistic

collocation” [135]). Various methods can be used to solve the resulting linear system, including least squares

methods for over-determined systems and compressed sensing methods for under-determined systems. Details of

these methods are documented in the Linear regression section of the Dakota Theory Manual [4] and the neces-

sary speciﬁcations needed to activate these techniques are listed in the keyword section of the Dakota Reference

Manual [3].

Stochastic collocation (SC) is another stochastic expansion technique for UQ that is closely related to PCE. As

for PCE, exponential convergence rates can be obtained under order reﬁnement for integrated statistical quantities

of interest, provided that the response functions are smooth with ﬁnite variance. The primary distinction is that,

whereas PCE estimates coefﬁcients for known multivariate orthogonal polynomial basis functions, SC forms

multivariate interpolation polynomial bases for known coefﬁcients. The interpolation polynomials may be either

local or global and either value-based or gradient-enhanced (four combinations: Lagrange, Hermite, piecewise

linear spline, and piecewise cubic spline), and may be used within nodal or hierarchical interpolation formulations.

Interpolation is performed on structured grids such as tensor-product or sparse grids. Starting from a tensor-

1Orthogonal polynomial selections also exist for discrete probability distributions, but are not yet supported in Dakota.

2Identical support range; weight differs by at most a constant factor.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

90 CHAPTER 5. UNCERTAINTY QUANTIFICATION CAPABILITIES

product multidimensional interpolation polynomial in the value-based case (Lagrange or piecewise linear spline),

we have the feature that the ith interpolation polynomial has a value of 1 at collocation point iand a value of 0

for all other collocation points, leading to the use of expansion coefﬁcients that are just the response values at

each of the collocation points. In the gradient-enhanced case (Hermite or piecewise cubic spline), SC includes

both type 1 and type 2 interpolation polynomials, where the former interpolate the values while producing zero

gradients and the latter interpolate the gradients while producing zero values (refer to [4] for additional details).

Sparse interpolants are weighted sums of these tensor interpolants; however, they are only interpolatory for sparse

grids based on fully nested rules and will exhibit some interpolation error at the collocation points for sparse

grids based on non-nested rules. A key to maximizing performance with SC is performing collocation using the

Gauss points and weights from the same optimal orthogonal polynomials used in PCE. For use of standard Gauss

integration rules (not nested variants such as Gauss-Patterson or Genz-Keister) within tensor-product quadrature,

tensor PCE expansions and tensor SC interpolants are equivalent in that identical polynomial approximations

are generated [22]. Moreover, this equivalence can be extended to sparse grids based on standard Gauss rules,

provided that a sparse PCE is formed based on a weighted sum of tensor expansions [20].

The Dakota Theory Manual [4] provides full algorithmic details for the PCE and SC methods.

5.4.1 Uncertainty Quantiﬁcation Examples using Stochastic Expansions

5.4.1.1 Polynomial Chaos Expansion for Rosenbrock

A typical Dakota input ﬁle for performing an uncertainty quantiﬁcation using PCE is shown in Figure 5.14. In this

example, we compute CDF probabilities for six response levels of Rosenbrock’s function. Since Rosenbrock is a

fourth order polynomial and we employ a fourth-order expansion using an optimal basis (Legendre for uniform

random variables), we can readily obtain a polynomial expansion which exactly matches the Rosenbrock function.

In this example, we select Gaussian quadratures using an anisotropic approach (ﬁfth-order quadrature in x1and

third-order quadrature in x2), resulting in a total of 15 function evaluations to compute the PCE coefﬁcients.

The tensor product quadature points upon which the expansion is calculated are shown in Figure 5.15. The tensor

product generates all combinations of values from each individual dimension: it is an all-way pairing of points.

Once the expansion coefﬁcients have been calculated, some statistics are available analytically and others must be

evaluated numerically. For the numerical portion, the input ﬁle speciﬁes the use of 10000 samples, which will be

evaluated on the expansion to compute the CDF probabilities. In Figure 5.16, excerpts from the results summary

are presented, where we ﬁrst see a summary of the PCE coefﬁcients which exactly reproduce Rosenbrock for a

Legendre polynomial basis. The analytic statistics for mean, standard deviation, and COV are then presented.

For example, the mean is 455.66 and the standard deviation is 606.56. The moments are followed by global

sensitivity indices (Sobol’ indices).This example shows that variable x1 has the largest main effect (0.497) as

compared with variable x2 (0.296) or the interaction between x1 and x2 (0.206). After the global sensitivity

indices, the local sensitivities are presented, evaluated at the mean values. Finally, we see the numerical results

for the CDF probabilities based on 10000 samples performed on the expansion. For example, the probability

that the Rosenbrock function is less than 100 over these two uncertain variables is 0.342. Note that this is a very

similar estimate to what was obtained using 200 Monte Carlo samples, with fewer true function evaluations.

5.4.1.2 Uncertainty Quantiﬁcation Example using Stochastic Collocation

Compared to the previous PCE example, this section presents a more sophisticated example, where we use

stochastic collocation built on an anisotropic sparse grid deﬁned from numerically-generated orthogonal poly-

nomials. The uncertain variables are lognormal in this example and the orthogonal polynomials are generated

Dakota Version 6.7 User’s Manual generated on November 13, 2017

5.4. STOCHASTIC EXPANSION METHODS 91

# Dakota Input File: rosen_uq_pce.in

environment

#graphics

method

polynomial_chaos

quadrature_order = 5

dimension_preference = 5 3

samples_on_emulator = 10000

seed = 12347

response_levels = .1 1. 50. 100. 500. 1000.

variance_based_decomp #interaction_order = 1

variables

uniform_uncertain = 2

lower_bounds = -2. -2.

upper_bounds = 2. 2.

descriptors = ’x1’ ’x2’

interface

analysis_drivers = ’rosenbrock’

direct

responses

response_functions = 1

no_gradients

no_hessians

Figure 5.14: Dakota input ﬁle for performing UQ using polynomial chaos expansions – see

Dakota/examples/users/rosen uq pce.in

Figure 5.15: Rosenbrock polynomial chaos example: tensor product quadrature points.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

92 CHAPTER 5. UNCERTAINTY QUANTIFICATION CAPABILITIES

Polynomial Chaos coefficients for response_fn_1:

coefficient u1 u2

----------- ---- ----

4.5566666667e+02 P0 P0

-4.0000000000e+00 P1 P0

9.1695238095e+02 P2 P0

-9.9475983006e-14 P3 P0

3.6571428571e+02 P4 P0

-5.3333333333e+02 P0 P1

-3.9968028887e-14 P1 P1

-1.0666666667e+03 P2 P1

-3.3573144265e-13 P3 P1

1.2829737273e-12 P4 P1

2.6666666667e+02 P0 P2

2.2648549702e-13 P1 P2

4.8849813084e-13 P2 P2

2.8754776338e-13 P3 P2

-2.8477220582e-13 P4 P2

-------------------------------------------------------------------

Statistics derived analytically from polynomial expansion:

Moment-based statistics for each response function:

Mean Std Dev Skewness Kurtosis

response_fn_1

expansion: 4.5566666667e+02 6.0656024184e+02

numerical: 4.5566666667e+02 6.0656024184e+02 1.9633285271e+00 3.3633861456e+00

Covariance among response functions:

[[ 3.6791532698e+05 ]]

Local sensitivities for each response function evaluated at uncertain variable means:

response_fn_1:

[ -2.0000000000e+00 2.4055757386e-13 ]

Global sensitivity indices for each response function:

response_fn_1 Sobol indices:

Main Total

4.9746891383e-01 7.0363551328e-01 x1

2.9636448672e-01 5.0253108617e-01 x2

Interaction

2.0616659946e-01 x1 x2

Statistics based on 10000 samples performed on polynomial expansion:

Probability Density Function (PDF) histograms for each response function:

PDF for response_fn_1:

Bin Lower Bin Upper Density Value

--------- --------- -------------

6.8311107124e-03 1.0000000000e-01 2.0393073423e-02

1.0000000000e-01 1.0000000000e+00 1.3000000000e-02

1.0000000000e+00 5.0000000000e+01 4.7000000000e-03

5.0000000000e+01 1.0000000000e+02 1.9680000000e-03

1.0000000000e+02 5.0000000000e+02 9.2150000000e-04

5.0000000000e+02 1.0000000000e+03 2.8300000000e-04

1.0000000000e+03 3.5755437782e+03 5.7308286215e-05

Level mappings for each response function:

Cumulative Distribution Function (CDF) for response_fn_1:

Response Level Probability Level Reliability Index General Rel Index

-------------- ----------------- ----------------- -----------------

1.0000000000e-01 1.9000000000e-03

1.0000000000e+00 1.3600000000e-02

5.0000000000e+01 2.4390000000e-01

1.0000000000e+02 3.4230000000e-01

5.0000000000e+02 7.1090000000e-01

1.0000000000e+03 8.5240000000e-01

-------------------------------------------------------------------

Figure 5.16: Excerpt of UQ output for polynomial chaos example.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

5.4. STOCHASTIC EXPANSION METHODS 93

from Gauss-Wigert recursion coefﬁcients [121] in combination with the Golub-Welsch procedure [65]. The input

ﬁle is shown in Figure 5.17. Note that the dimension preference of (2,1) is inverted to deﬁne a γweighting vector

of (0.5,1) (and γof 0.5) for use in the anisotropic Smolyak index set constraint (see Smolyak sparse grids sec-

tion in Stochastic Expansion Methods chapter in Dakota Theory Manual [4]). In this example, we compute CDF

probabilities for six response levels of Rosenbrock’s function. This example requires 19 function evaluations to

calculate the interpolating polynomials in stochastic collocation and the resulting expansion exactly reproduces

Rosenbrock’s function. The placement of the points generated by the sparse grid is shown in Figure 5.18.

# Dakota Input File: rosen_uq_sc.in

environment

#graphics

method

stoch_collocation

sparse_grid_level = 3

dimension_preference = 2 1

samples_on_emulator = 10000 seed = 12347

response_levels = .1 1. 50. 100. 500. 1000.

variance_based_decomp #interaction_order = 1

output silent

variables

lognormal_uncertain = 2

means = 1. 1.

std_deviations = 0.5 0.5

descriptors = ’x1’ ’x2’

interface

analysis_drivers = ’rosenbrock’

direct

responses

response_functions = 1

no_gradients

no_hessians

Figure 5.17: Dakota input ﬁle for performing UQ using stochastic collocation – see

Dakota/examples/users/rosen uq sc.in

Once the expansion coefﬁcients have been calculated, some statistics are available analytically and others must be

evaluated numerically. For the numerical portion, the input ﬁle speciﬁes the use of 10000 samples, which will be

evaluated on the expansion to compute the CDF probabilities. In Figure 5.19, excerpts from the results summary

are presented. We ﬁrst see the moment statistics for mean, standard deviation, skewness, and kurtosis computed

by numerical integration (see Analytic moments section in Stochastic Expansion Methods chapter in Dakota

Theory Manual [4]), where the numerical row corresponds to integration using the original response values and the

expansion row corresponds to integration using values from the interpolant. The response covariance (collapsing

to a single variance value for one response function) and global sensitivity indices (Sobol’ indices) are presented

next. This example shows that variable x1 has the largest main effect (0.99) as compared with variable x2 (0.0007)

or the interaction between x1 and x2 (0.005). Finally, we see the numerical results for the CDF probabilities based

on 10000 samples performed on the expansion. For example, the probability that the Rosenbrock function is less

Dakota Version 6.7 User’s Manual generated on November 13, 2017

94 CHAPTER 5. UNCERTAINTY QUANTIFICATION CAPABILITIES

Figure 5.18: Rosenbrock stochastic collocation example: sparse grid points.

than 100 is 0.7233. Note that these results are signiﬁcantly different than the ones presented in Section 5.4.1.1

because of the different assumptions about the inputs: uniform[-2,2] versus lognormals with means of 1.0 and

standard deviations of 0.5.

5.5 Importance Sampling Methods

Importance sampling is a method that allows one to estimate statistical quantities such as failure probabilities (e.g.

the probability that a response quantity will exceed a threshold or fall below a threshold value) in a way that is

more efﬁcient than Monte Carlo sampling. The core idea in importance sampling is that one generates samples

that preferentially samples important regions in the space (e.g. in or near the failure region or user-deﬁned region

of interest), and then appropriately weights the samples to obtain an unbiased estimate of the failure probability

[126]. In importance sampling, the samples are generated from a density which is called the importance density:

it is not the original probability density of the input distributions. The importance density should be centered

near the failure region of interest. For black-box simulations such as those commonly interfaced with Dakota, it

is difﬁcult to specify the importance density a priori: the user often does not know where the failure region lies,

especially in a high-dimensional space. [131]

More formally, we deﬁne the objective of importance sampling as calculating the probability, P, that the output

will exceed a threshold level. This is a failure probability, where the failure probability is deﬁned as some scalar

function, y(X), exceeding a threshold, T, where the inputs, X, are randomly distributed with density, ρ(X).

When evaluating y(X)is sufﬁciently expensive or Pis sufﬁciently small, Monte Carlo (MC) sampling methods

to estimate Pwill be infeasible due to the large number of function evaluations required for a speciﬁed accuracy.

The probability of failure can be thought of as the mean rate of occurrence of failure. The Monte Carlo (MC)

estimate of Pis therefore the sample mean of the indicator function, I(X),

PMC =1

N

N

X

i=1

I(Xi)X∼ρ(X),(5.3)

where Nsamples, Xi, are drawn from ρ(X), and the indicator function I(X)is 1 if failure occurs and zero

otherwise.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

5.5. IMPORTANCE SAMPLING METHODS 95

Statistics derived analytically from polynomial expansion:

Moment-based statistics for each response function:

Mean Std Dev Skewness Kurtosis

response_fn_1

expansion: 2.5671972656e+02 2.0484189184e+03 2.7419241630e+02 1.9594567379e+06

numerical: 2.5671972656e+02 2.0484189184e+03 2.7419241630e+02 1.9594567379e+06

Covariance among response functions:

[[ 4.1960200651e+06 ]]

Global sensitivity indices for each response function:

response_fn_1 Sobol indices:

Main Total

9.9391978710e-01 9.9928724777e-01 x1

7.1275222945e-04 6.0802128961e-03 x2

Interaction

5.3674606667e-03 x1 x2

Statistics based on 10000 samples performed on polynomial expansion:

Level mappings for each response function:

Cumulative Distribution Function (CDF) for response_fn_1:

Response Level Probability Level Reliability Index General Rel Index

-------------- ----------------- ----------------- -----------------

1.0000000000e-01 1.8100000000e-02

1.0000000000e+00 8.7800000000e-02

5.0000000000e+01 5.8410000000e-01

1.0000000000e+02 7.2330000000e-01

5.0000000000e+02 9.2010000000e-01

1.0000000000e+03 9.5660000000e-01

Figure 5.19: Excerpt of UQ output for stochastic collocation example.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

96 CHAPTER 5. UNCERTAINTY QUANTIFICATION CAPABILITIES

Importance sampling draws samples from the importance density ρ0(X)and scales the sample mean by the im-

portance density:

PIS =1

N

N

X

i=1 I(Xi)ρ(Xi)

ρ0(Xi)X∼ρ0(X).(5.4)

This reduces the asymptotic error variance from:

σ2

errM C =

Eh(I(X)−P)2i

N(5.5)

to

σ2

errI S =

EI(X)ρ(X)

ρ0(X)−P2

N.(5.6)

Inspection of Eq. 5.6 reveals σ2

errI S = 0 if ρ0(X)equals the ideal importance density ρ∗(X),

ρ∗(X) = I(X)ρ(X)

P.(5.7)

However, ρ∗(X)is unknown a priori because I(X)is only known where it has been evaluated. Therefore, the

required Pin the denominator is also unknown: this is what we are trying to estimate.

If importance sampling is to be effective, the practitioner must be able to choose a good ρ0(X)without already

knowing I(X)everywhere. There is a danger: a poor choice for ρ0(X)can put most of the samples in unimportant

regions and make σ2

errI S much greater than σ2

errM C . In particular, importance sampling can be challenging for

very low probability events in high-dimensional spaces where the output yis calculated by a simulation. In these

cases, usually one does not know anything a priori about where the failure region exists in input space. We have

developed two importance sampling approaches which do not rely on the user explicitly specifying an importance

density.

5.5.1 Importance Sampling Method based on Reliability Approach

The ﬁrst method is based on ideas in reliability modeling 5.3.1. An initial Latin Hypercube sampling is performed

to generate an initial set of samples. These initial samples are augmented with samples from an importance density

as follows: The variables are transformed to standard normal space. In the transformed space, the importance

density is a set of normal densities centered around points which are in the failure region. Note that this is similar

in spirit to the reliability methods, in which importance sampling is centered around a Most Probable Point (MPP).

In the case of the LHS samples, the importance sampling density will simply by a mixture of normal distributions

centered around points in the failure region.

This method is speciﬁed by the keyword importance sampling. The options for importance sampling are as

follows: import centers a sampling density at one of the initial LHS samples identiﬁed in the failure region. It

then generates the importance samples, weights them by their probability of occurence given the original density,

and calculates the required probability (CDF or CCDF level). adapt import is the same as import but is

performed iteratively until the failure probability estimate converges. mm adapt import starts with all of the

samples located in the failure region to build a multimodal sampling density. First, it uses a small number of

samples around each of the initial samples in the failure region. Note that these samples are allocated to the

different points based on their relative probabilities of occurrence: more probable points get more samples. This

early part of the approach is done to search for “representative” points. Once these are located, the multimodal

sampling density is set and then the multi-modal adaptive method proceeds similarly to the adaptive method

(sample until convergence).

Dakota Version 6.7 User’s Manual generated on November 13, 2017

5.6. ADAPTIVE SAMPLING METHODS 97

5.5.2 Gaussian Process Adaptive Importance Sampling Method

The second importance sampling method in Dakota is the one we recommend, at least for problems that have a

relatively small number of input variables (e.g. less than 10). This method, Gaussian Process Adaptive Importance

Sampling, is outlined in the paper [24]. This method starts with an initial set of LHS samples and adds samples

one at a time, with the goal of adaptively improving the estimate of the ideal importance density during the process.

The approach uses a mixture of component densities. An iterative process is used to construct the sequence of

improving component densities. At each iteration, a Gaussian process (GP) surrogate is used to help identify areas

in the space where failure is likely to occur. The GPs are not used to directly calculate the failure probability; they

are only used to approximate the importance density. Thus, the Gaussian process adaptive importance sampling

algorithm overcomes limitations involving using a potentially inaccurate surrogate model directly in importance

sampling calculations.

This method is speciﬁed with the keyword gpais. There are three main controls which govern the behavior of

the algorithm. samples speciﬁes the initial number of Latin Hypercube samples which are used to create the

initial Gaussian process surrogate. emulator samples speciﬁes the number of samples taken on the latest

Gaussian process model each iteration of the algorithm. These samples are used in the construction of the next

importance sampling density. The default is 10,000 samples. The third control is max iterations, which

controls the number of iterations of the algorithm. Each iteration, one additional sample of the “true” simulation

is taken. Thus, if samples were set at 100 and max iterations were set to 200, there would be a total of

300 function evaluations of the simulator model taken.

5.6 Adaptive Sampling Methods

The goal in performing adaptive sampling is to construct a surrogate model that can be used as an accurate

predictor to some expensive simulation, thus it is to one’s advantage to build a surrogate that minimizes the error

over the entire domain of interest using as little data as possible from the expensive simulation. The adaptive part

alludes to the fact that the surrogate will be reﬁned by focusing samples of the expensive simulation on particular

areas of interest rather than rely on random selection or standard space-ﬁlling techniques.

5.6.1 Adaptive sampling based on surrogates

At a high-level, the adaptive sampling pipeline is a four-step process:

1. Evaluate the expensive simulation (referred to as the true model) at initial sample points

2. Fit/reﬁt a surrogate model

3. Create a candidate set and score based on information from surrogate

4. Select a candidate point to evaluate the true model and Repeat 2-4

In terms of the Dakota implementation, the adaptive sampling method currently uses Latin Hypercube sampling

(LHS) to generate the initial points in Step 1 above. For Step 2, we use a Gaussian process model. The user

can specify the scoring metric used to select the next point (or points) to evaluate and add to the set. We have

investigated several scoring metrics with which to evaluate candidate points for Step 3. There are some classical

ones such as distance (e.g. add a point which maximizes the minimum distance to all of the existing points). This

distance metric tends to generate points that are space-ﬁlling. We have investigated several methods that involve

Dakota Version 6.7 User’s Manual generated on November 13, 2017

98 CHAPTER 5. UNCERTAINTY QUANTIFICATION CAPABILITIES

interesting topological features of the space (e.g. points that are near saddle points). These are an area of active

investigation but are not currently included in Dakota. The ﬁtness metrics for scoring candidate points currently

include:

Predicted Variance First introduced in [97] and later used in [120], this method uses the predicted variance of

the Gaussian process surrogate as the score of a candidate point. Thus, the adaptively chosen points will be

in areas of highest uncertainty according to the Gaussian process model.

Distance A candidate’s score is the Euclidean distance in domain space between the candidate and its nearest

neighbor in the set of points already evaluated on the true model. Therefore, the most undersampled area of

the domain will always be selected. The adaptivity of this method could be brought to question as it would

chose the exact same points regardless of the surrogate model used. However, it is useful to use to compare

other adaptive metrics to one that relies purely on space-ﬁlling in an equivalent context.

Gradient Similar to the above metric, a candidate’s nearest neighbor is determined as in the distance metric,

only now the score is the absolute value of the difference in range space of the two points. The range space

values used are predicted from the surrogate model. Though this method is called the gradient metric, it

actually does not take into account how close the candidate and its neighbor are in domain space. This

method attempts to evenly ﬁll the range space of the surrogate.

Note that in our approach, a Latin Hypercube sample is generated (a new one, different from the initial sample)

and the surrogate model is evaluated at this points. These are the “candidate points” that are then evaluated

according to the ﬁtness metric outlined above. The number of candidates used in practice should be high enough

to ﬁll most of the input domain: we recommend at least hundreds of points for a low- dimensional problem. All

of the candidates (samples on the emulator) are given a score and then the highest-scoring candidate is selected to

be evaluated on the true model.

The adaptive sampling method also can generate batches of points to add at a time. With batch or multi-point

selection, the true model can be evaluated in parallel and thus increase throughput before reﬁtting our surrogate

model. This proposes a new challenge as the problem of choosing a single point and choosing multiple points off

a surrogate are fundamentally different. Selecting the nbest scoring candidates is more than likely to generate a

set of points clustered in one area which will not be conducive to adapting the surrogate. We have implemented

several strategies for batch selection of points:

Naive Selection This strategy will select the nhighest scoring candidates regardless of their position. This tends

to group an entire round of points in the same area.

Distance Penalized Re-weighted Scoring In this strategy, the highest scoring candidate is selected and then all

remaining candidates are re-scored with a distance penalization factor added in to the score. Only points

selected within a round are used for the distance penalization. The factor is the same as used in the distance

penalization scoring metrics from [98]. First, compute all of the minimum distances from each remaining

candidate to the selected candidates. Then, determine the median value of these distances. If the smallest

distance, d, between a point and the selected set is less than the computed median distance its score is

unaltered, otherwise the score is multiplied by a value ρdetermined by the following equation:

ρ= 1.5∗d−0.5∗d3(5.8)

Topological Maxima of Scoring Function In this strategy we look at the topology of the scoring function and

select the nhighest maxima in the topology. To determine local maxima, we construct the approximate

Morse-Smale complex. If the number of local maxima is less than n, we revert to the distance strategy

above. As a further extension, one may want to ﬁlter low-persistence maxima, but to keep the framework

Dakota Version 6.7 User’s Manual generated on November 13, 2017

5.6. ADAPTIVE SAMPLING METHODS 99

general, we chose to omit this feature as deﬁning a threshold for what deems a critical point as ”low

persistence” can vary drastically from problem to problem.

Constant Liar We adapt the constant liar strategy presented in [59] with the scoring metrics. The strategy ﬁrst

selects the highest scoring candidate, and then reﬁts the surrogate using a “lie” value at the point selected

and repeating until npoints have been selected whereupon the lie values are removed from the surrogate

and the selected points are evaluated on the true model and the surrogate is reﬁt with these values.

The adaptive sampling method is speciﬁed by the method keyword adaptive sampling. There are many

controls, including the number of candidate samples to investigate each iteration (emulator samples), the

ﬁtness metric used in scoring candidates (fitness metric), and the number of iterations to perform the adap-

tive sampling (max iterations). For batch selection of points, one speciﬁes a batch selection strategy

and a batch size. The details of the speciﬁcation are provided in the Dakota reference manual.

5.6.2 Adaptive sampling based on dart throwing

pof darts is a novel method for estimating the tail probability (Probability of Failure) based on random sphere-

packing in the uncertain parameter space. Random points are sequentially sampled from the domain and conse-

quently surrounded by protecting spheres, with the constraint that each new sphere center has to be outside all

prior spheres [31]. The radius of each sphere is chosen such that the entire sphere lies either in the failure or

the non-failure region. This radius depends of the function evaluation at the disk center, the failure threshold

and an estimate of the function gradient at the disk center. After exhausting the sampling budget speciﬁed by

build samples, which is the number of spheres per failure threshold, the domain is decomposed into two

regions. These regions correspond to failure and non-failure categories, each represented by the union of the

spheres of each type. The volume of the union of failure spheres gives a lower bound on the required estimate of

the probability of failure, while the volume of the union of the non-failure spheres subtracted from the volume of

the domain gives an upper estimate. After all the spheres are constructed, we construct a surrogate model, spec-

iﬁed via a model pointer, and sample the surrogate model extensively to estimate the probability of failure

for each threshold.

pof darts handles multiple response functions and allows each to have multiple failure thresholds. For each

failure threshold pof darts will insert a number of spheres speciﬁed by the user-input parameter ”samples”.

However, estimating the probability of failure for each failure threshold would utilize the total number of disks

sampled for all failure thresholds. For each failure threshold, the sphere radii changes to generate the right spatial

decomposition. The POF-Darts method is speciﬁed by the method keyword pof darts. The sample budget

is speciﬁed by build samples. By default, the method employs a local approach to estimate the Lipschitz

constant per sphere.

The surrogate model used by the pof darts method for extensive sampling is speciﬁed using a model pointer,

and its parameters are therefore deﬁned in that model. It can typically be any global surrogate in Dakota (e.g.,

Gaussian process, polynomial chaos expansion, polynomial regression, etc). POF-Darts can also use piecewise-

decomposed surrogates which build local pieces of the surrogate over different domain patches. The piecewise

decomposition option is a new capability added to Dakota to help construct surrogates in high-dimensional spaces,

using known function evaluations as well as gradient and Hessian information, if available. The piecewise decom-

position option is declared using the keyword domain decomp and currently supports polynomial, Gaussian

Process (GP), and Radial Basis Functions (RBF) surroagte models only. For example: a polynomial regres-

sion global surrogate is speciﬁed with model polynomial, its order is selected using surrogate order,

and the piecewise decomposition option is speciﬁed with domain decomp. The domain decomp option is

parametrized by a cell type set by default to Voronoi cells, an optional number of support layers, and

an optional discontinuity detection capability. See 8.4.3.10 for more details.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

100 CHAPTER 5. UNCERTAINTY QUANTIFICATION CAPABILITIES

5.7 Epistemic Nondeterministic Methods

Uncertainty quantiﬁcation is often used as part of the risk assessment of performance, reliability, and safety of

engineered systems. Increasingly, uncertainty is separated into two categories for analysis purposes: aleatory

and epistemic uncertainty [109,77]. Aleatory uncertainty is also referred to as variability, irreducible or inherent

uncertainty, or uncertainty due to chance. Examples of aleatory uncertainty include the height of individuals

in a population, or the temperature in a processing environment. Aleatory uncertainty is usually modeled with

probability distributions, and sampling methods such as Latin Hypercube sampling in Dakota can be used to model

aleatory uncertainty. In contrast, epistemic uncertainty refers to lack of knowledge or lack of information about

a particular aspect of the simulation model, including the system and environment being modeled. An increase

in knowledge or information relating to epistemic uncertainty will lead to a reduction in the predicted uncertainty

of the system response or performance. For epistemic uncertain variables, typically one does not know enough

to specify a probability distribution on a variable. Epistemic uncertainty is referred to as subjective, reducible,

or lack of knowledge uncertainty. Examples of epistemic uncertainty include little or no experimental data for

a ﬁxed but unknown physical parameter, incomplete understanding of complex physical phenomena, uncertainty

about the correct model form to use, etc.

There are many approaches which have been developed to model epistemic uncertainty, including fuzzy set theory,

possibility theory, and evidence theory. It is also possible to use simple interval analysis in an epistemic context.

Interval analysis and evidence theory are described in more detail below.

5.7.1 Interval Methods for Epistemic Analysis

In interval analysis, one assumes that nothing is known about an epistemic uncertain variable except that its value

lies somewhere within an interval. In this situation, it is NOT assumed that the value has a uniform probability

of occuring within the interval. Instead, the interpretation is that any value within the interval is a possible value

or a potential realization of that variable. In interval analysis, the uncertainty quantiﬁcation problem is one of

determining the resulting bounds on the output (deﬁning the output interval) given interval bounds on the inputs.

Again, any output response that falls within the output interval is a possible output with no frequency information

assigned to it.

We have the capability to perform interval analysis using either global interval est or local interval est.

In the global approach, one uses either a global optimization method or a sampling method to assess the bounds.

global interval est allows the user to specify either lhs, which performs Latin Hypercube Sampling

and takes the minimum and maximum of the samples as the bounds (no optimization is performed) or ego. In

the case of ego, the efﬁcient global optimization method is used to calculate bounds. The ego method is de-

scribed in Section 6.2.3. If the problem is amenable to local optimization methods (e.g. can provide derivatives

or use ﬁnite difference method to calculate derivatives), then one can use local methods to calculate these bounds.

local interval est allows the user to specify either sqp which is sequential quadratic programming, or

nip which is a nonlinear interior point method.

Note that when performing interval analysis, it is necessary to deﬁne interval uncertain variables as described in

Section 9.3. For interval analysis, one must deﬁne only one interval per input variable, in contrast with Dempster-

Shafer evidence theory, where an input can have several possible intervals. Interval analysis can be considered

a special case of Dempster-Shafer evidence theory where each input is deﬁned by one input interval with a

basic probability assignment of one. In Dakota, however, the methods are separate and semantic differences

exist in the output presentation. If you are performing a pure interval analysis, we recommend using either

global interval est or local interval est instead of global evidence or local evidence,

for reasons of simplicity.

Dakota Version 6.7 User’s Manual generated on November 13, 2017

5.7. EPISTEMIC NONDETERMINISTIC METHODS 101

These interval methods can also be used as the outer loop within an interval-valued probability analysis for prop-

agating mixed aleatory and epistemic uncertainty – refer to Section 15.1.1 for additional details.

An example of interval estimation is shown in Figure 5.20, with example results in Figure 5.21. This example is

a demonstration of calculating interval bounds for three outputs of the cantilever beam problem. The cantilever

beam problem is described in detail in Section 20.7. Given input intervals of [1,10] on beam width and beam

thickness, we can see that the interval estimate of beam weight is approximately [1,100].

# Dakota Input File: cantilever_uq_global_interval.in

environment

tabular_data

# tabular_data_file = ’cantilever_uq_global_interval.dat’

method

global_interval_est ego

seed = 1234567

output verbose

variables

continuous_interval_uncertain = 2

num_intervals = 1 1

interval_probabilities = 1.0 1.0

lower_bounds = 1.0 1.0

upper_bounds = 10.0 10.0

descriptors ’w’ ’t’

continuous_state = 4

initial_state<