UQTk V3.0.4 Manual
SANDIA REPORT SAND2017-11051
Unlimited Release
Printed October 2017

UQTk Version 3.0.4 User Manual

Khachik Sargsyan, Cosmin Safta, Kenny Chowdhary, Sarah Castorena, Sarah de Bord, Bert Debusschere

Prepared by Sandia National Laboratories, Albuquerque, New Mexico 87185 and Livermore, California 94550. Sandia National Laboratories is a multimission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy's National Nuclear Security Administration under contract DE-NA0003525. Approved for public release; further dissemination unlimited.

Issued by Sandia National Laboratories, operated for the United States Department of Energy by National Technology and Engineering Solutions of Sandia, LLC.

NOTICE: This report was prepared as an account of work sponsored by an agency of the United States Government. Neither the United States Government, nor any agency thereof, nor any of their employees, nor any of their contractors, subcontractors, or their employees, make any warranty, express or implied, or assume any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represent that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise, does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government, any agency thereof, or any of their contractors or subcontractors. The views and opinions expressed herein do not necessarily state or reflect those of the United States Government, any agency thereof, or any of their contractors.

Printed in the United States of America. This report has been reproduced directly from the best available copy.

Available to DOE and DOE contractors from U.S.
Department of Energy, Office of Scientific and Technical Information, P.O. Box 62, Oak Ridge, TN 37831. Telephone: (865) 576-8401; Facsimile: (865) 576-5728; E-Mail: reports@adonis.osti.gov; Online ordering: http://www.osti.gov/bridge

Available to the public from U.S. Department of Commerce, National Technical Information Service, 5285 Port Royal Rd, Springfield, VA 22161. Telephone: (800) 553-6847; Facsimile: (703) 605-6900; E-Mail: orders@ntis.fedworld.gov; Online ordering: http://www.ntis.gov/help/ordermethods.asp?loc=7-4-0#online

Abstract

The UQ Toolkit (UQTk) is a collection of libraries and tools for the quantification of uncertainty in numerical model predictions. Version 3.0.4 offers intrusive and non-intrusive methods for propagating input uncertainties through computational models, tools for sensitivity analysis, methods for sparse surrogate construction, and Bayesian inference tools for inferring parameters from experimental data. This manual discusses the download and installation process for UQTk, provides pointers to the UQ methods used in the toolkit, and describes some of the examples provided with the toolkit.

Contents

1 Overview
2 Download and Installation
   Requirements
   Download
   Directory Structure
   External Software and Libraries
   PyUQTk
   Installation
      Configuration flags
      Installation example
3 Theory and Conventions
   Polynomial Chaos Expansions
4 Source Code Description
   Applications
      generate_quad
      gen_mi
      gp_regr
      model_inf
      pce_eval
      pce_quad
      pce_resp
      pce_rv
      pce_sens
      pdf_cl
      regression
      sens
5 Examples
   Elementary Operations
   Forward Propagation of Uncertainty
   Numerical Integration
   Forward Propagation of Uncertainty with PyUQTk
   Expanded Forward Propagation of Uncertainty - PyUQTk
      Theory
         Heat Transfer - Dual Pane Window
   Bayesian Inference of a Line
   Surrogate Construction and Sensitivity Analysis
   Global Sensitivity Analysis via Sampling
   Karhunen-Loève Expansion of a Stochastic Process
6 Support
References

Chapter 1 Overview

The UQ Toolkit (UQTk) is a collection of libraries and tools for the quantification of uncertainty in numerical model predictions. In general, uncertainty quantification (UQ) pertains to all aspects that affect the predictive fidelity of a numerical simulation, from the uncertainty in the experimental data used to inform the parameters of a chosen model, and the propagation of uncertain parameters and boundary conditions through that model, to the choice of the model itself.
In particular, UQTk provides implementations of many probabilistic approaches for UQ in this general context. Version 3.0.4 offers intrusive and non-intrusive methods for propagating input uncertainties through computational models, tools for sensitivity analysis, methods for sparse surrogate construction, and Bayesian inference tools for inferring parameters from experimental data. The main objective of UQTk is to make these methods available to the broader scientific community for the purposes of algorithmic development in UQ or educational use. The most direct way to use the libraries is to link to them directly from C++ programs. Alternatively, the examples section provides many scripts for common UQ operations, which can be modified to fit the user's purposes, using existing numerical simulation codes as a black box. The next chapter in this manual discusses the download and installation process for UQTk, followed by some pointers to the UQ methods used in the toolkit, and a description of some of the examples provided with the toolkit.

Chapter 2 Download and Installation

Requirements

The core UQTk libraries are written in C++, with some dependencies on Fortran numerical libraries. As such, to use UQTk, compatible C++ and Fortran compilers are needed. UQTk is installed and built most naturally on a Unix-like platform, and has been tested on Mac OS X and Linux. Installation and use on Windows machines under Cygwin is possible, but has not been tested extensively. Many of the examples rely on Python, including the NumPy, SciPy, and matplotlib packages, for postprocessing and graphing. As such, Python version 2.7.x with compatible NumPy, SciPy, and matplotlib is recommended. Further, the use of XML for input files requires the Expat XML parser library to be installed on your system.
Note that if you will be linking the core UQTk libraries directly to your own codes, and do not plan on using the UQTk examples, then those additional dependencies are not required.

Download

The most recent version of UQTk, currently 3.0.4, can be downloaded from the following location: http://www.sandia.gov/UQToolkit

After download, extract the tar file into the directory where you want to keep the UQTk source directory:

% tar -xzvf UQTk_v3.0.tgz

Make sure to replace the name of the tar file in this command with the name of the tar file you just downloaded, as the file name depends on the version of the toolkit.

Directory Structure

After extraction, there will be a new directory UQTk_v3.0 (the version number may be different). Inside this top-level directory are the following directories:

config               Configuration files
cpp                  C++ source code
  app                C++ apps
  lib                C++ libraries
  tests              Tests for C++ libraries
dep                  External dependencies
  ann                Approximate Nearest Neighbors library
  blas               Netlib's BLAS library (linear algebra)
  cvode-2.7.0        SUNDIALS' CVODE library (ODE solvers)
  dsfmt              dsfmt library (random number generators)
  figtree            Fast Improved Gauss Transform library
  lapack             Netlib's LAPACK library (linear algebra)
  lbfgs              lbfgs library (optimization)
  slatec             Netlib's SLATEC library (general purpose math)
doc                  Documentation
examples             Examples with C++ libraries and apps
  fwd_prop           forward propagation with a heat transfer example
  kle_ex1            Karhunen-Loève expansion example
  line_infer         calibrate parameters of a linear model
  muq                interface between MUQ and UQTk
  num_integ          quadrature and Monte Carlo integrations
  ops                operations with Polynomial Chaos expansions
  pce_bcs            construct sparse Polynomial Chaos expansions
  sensMC             Monte-Carlo based sensitivity index computation
  surf_rxn           surface reaction example for forward and inverse UQ
  uqpc               construct Polynomial Chaos surrogates for multiple outputs/functions
PyUQTk               Python scripts and interface to C++ libraries
  array              interface to array class
  bcs                interface to Bayesian compressive sensing library
  dfi                interface to data-free inference class
  inference          Python Markov Chain Monte Carlo (MCMC) scripts
  kle                interface to Karhunen-Loève expansion class
  mcmc               interface to MCMC class
  pce                interface to Polynomial Chaos expansion class
  plotting           Python plotting scripts
  pytests            Python tests
  quad               interface to Quad class
  sens               Python global sensitivity analysis scripts
  tools              interface to UQTk tools
  utils              interface to UQTk utils

External Software and Libraries

The following software and libraries are required to compile UQTk.

1. C++/Fortran compilers. Please note that the C++ and Fortran compilers need to be compatible with each other. Please see the table below for the platforms and compilers we have tested so far. For OS X, these compilers were installed either using MacPorts, or Homebrew, or built directly from source code.

   Platform                  Compiler
   OS X 10.9                 GNU 4.8, 4.9, 5.2; Intel 14.0.3
   OS X 10.10                GNU 5.2, 5.3
   OS X 10.11                GNU 5.2-4, 6.1, 6.2; Intel 14.0.3; OpenMPI 1.10
   Linux x86_64 (Red Hat)    GNU 4.8
   Linux ia-64 (Suse)        Intel 10.1.021

2. CMake. We switched to a CMake-based build/install configuration in version 3.0. The configuration files require CMake version 2.6.x or higher.

3. Expat library. The Expat XML Parser is installed together with other XCode tools on OS X. It is also fairly common on Linux systems, with installation scripts available for several platforms. Alternatively, this library can be downloaded from http://expat.sourceforge.net

PyUQTk

The following additional software and libraries are not required to compile UQTk, but are necessary for the full Python interface to UQTk, called PyUQTk.

1. Python, NumPy, SciPy, and Matplotlib. We have successfully compiled PyUQTk with Python 2.6.x and 2.7.x, NumPy 1.8.1, SciPy 0.14.0, and Matplotlib 1.4.2. Note that we have not tried using Python 3. Note that it is important that all of these packages be compatible with each other.
Sometimes, your OS may come with a default version of Python but not SciPy or NumPy. When adding those packages afterwards, it can be hard to get them all to be compatible with each other. To avoid issues, it is recommended to install Python, NumPy, and SciPy all from the same package manager (e.g. get them all through MacPorts or Homebrew on OS X).

2. SWIG. PyUQTk has been tested with SWIG 3.0.2.

Installation

We define the following keywords to simplify the build and install descriptions in this section:

• sourcedir - directory containing the UQTk source files, i.e. the top-level directory mentioned in Section 2.
• builddir - directory where the UQTk library and its dependencies will be built. This directory should not be the same as sourcedir.
• installdir - directory where the UQTk libraries are installed and header and script files are copied.

To generate the build structure, compile, test, and install UQTk:

(1) > mkdir builddir ; cd builddir
(2) > cmake sourcedir
(3) > make
(4) > ctest
(5) > make install

Configuration flags

A (partial) list of configuration flags that can be set at step (2) above is provided below:

• CMAKE_INSTALL_PREFIX : set installdir
• CMAKE_C_COMPILER : C compiler
• CMAKE_CXX_COMPILER : C++ compiler
• CMAKE_Fortran_COMPILER : Fortran compiler
• IntelLibPath : for Intel compilers, the path to libraries if different from the default system paths
• PyUQTk : if ON, then build PyUQTk's Python to C++ interface. Default: OFF

Several pre-set config files are available in the "sourcedir/config" directory. Some of these shell scripts also accept arguments, e.g. config-teton.sh, to switch between several configurations. Type, for example, "config-teton.sh --help" to obtain a list of options. For a basic setup using default system settings for GNU compilers, see "config-gcc-base.sh". The user is encouraged to copy one of these script files and edit it to match the desired configuration. Then, step (2) above (cmake sourcedir) should be replaced by a command executing a particular shell script from the command line, e.g.

(2) > ../UQTk_v3.0.1/config/config-gcc-base.sh

In this example, the configuration script is executed from the build directory, while it is assumed that the configuration script still sits in the configuration directory, in this case of the version 3.0.1 UQTk source code tree. If all goes well, there should be no errors. Two log files in the "config" directory contain the output of steps (2) and (3) above, for compilation and installation on OS X 10.9.5 using GNU 4.8.3 compilers:

(2) > ../UQTk/config/config-teton.sh -c gnu -p ON >& cmake-mac-gnu.log
(3) > make >& make-gnu.log ; make install >>& make-gnu.log

After compilation ends, the installdir will contain the following sub-directories:

PyUQTk       Python scripts and, if PyUQTk=ON, interface to C++ classes
bin          apps' binaries
cpp          tests for C++ libraries
examples     examples on using UQTk
include      UQTk header files
lib          UQTk libraries, including those for external dependencies

To use the UQTk libraries, your program should link in the libraries in installdir/lib and add the installdir/include/uqtk and installdir/include/dep directories to the compiler include path. The apps are standalone programs that perform UQ operations, such as response surface construction, or sampling from random variables. For more details, see the Examples section.

Installation example

In this section, we will take the user through the installation of UQTk and PyUQTk on a Mac OS X 10.11 system with the GNU compilers. The following example uses GNU 6.1 installed under /opt/local/gcc61. For the compilation of PyUQTk, we are using Python version 2.7.10 with SciPy 0.14.0, Matplotlib 1.4.2, NumPy 1.8.1, and SWIG 3.0.2. Note that you will need to install both the swig and swig-python packages if you install SWIG via MacPorts. If you install SWIG from source, you do not need to install a separate swig-python library.
It will be cleaner to keep the source directory separate from the build and install directories. For simplicity, we will create a UQTk-build directory in the same parent folder as the source directory, UQTk. While in the source directory, create the build directory and cd into it:

$ mkdir ../UQTk-build
$ cd ../UQTk-build

It is important to note that CMake uses the cc and c++ defined compilers by default. These may not be the compilers you want when installing UQTk. Luckily, CMake allows you to specify which compilers you want, similar to autoconf. Thus, we type

$ cmake -DCMAKE_INSTALL_PREFIX:PATH=$PWD/../UQTk-install \
  -DCMAKE_Fortran_COMPILER=/opt/local/gcc61/bin/gfortran-6.1.0 \
  -DCMAKE_C_COMPILER=/opt/local/gcc61/bin/gcc-6.1.0 \
  -DCMAKE_CXX_COMPILER=/opt/local/gcc61/bin/g++-6.1.0 ../UQTk

Note that this configures CMake to compile UQTk without the Python interface. Also, we specified the installation directory to be UQTk-install, in the same parent directory as UQTk and UQTk-build. Figure 2.1 shows what CMake prints to the screen.

[Figure 2.1: CMake configuration without the Python interface.]

To turn on the Python interface, just set the CMake flag PyUQTk to ON, i.e.,

$ cmake -DPyUQTk=ON \
  -DCMAKE_INSTALL_PREFIX:PATH=$PWD/../UQTk-install \
  -DCMAKE_Fortran_COMPILER=/opt/local/gcc61/bin/gfortran-6.1.0 \
  -DCMAKE_C_COMPILER=/opt/local/gcc61/bin/gcc-6.1.0 \
  -DCMAKE_CXX_COMPILER=/opt/local/gcc61/bin/g++-6.1.0 ../UQTk

Figure 2.2 shows the additional output to the screen after the Python interface flag is turned on.

[Figure 2.2: CMake configuration with the Python interface.]

If the CMake command has executed without error, you are now ready to build UQTk. While in the build directory, type

$ make

or, for a faster compilation using N parallel threads,

$ make -j N

where one replaces N with the number of virtual cores on your machine, e.g. 8. This will build in the UQTk-build/ directory.
The screen should look similar to Figure 2.3, with or without the Python interface, when building.

[Figure 2.3: Start and end of build without Python interface.]

To verify that the build was successful, run the ctest command from the UQTk-build/ directory to execute the C++ and Python (only if building PyUQTk) test scripts:

$ ctest

[Figure 2.4: Result of ctest after successful build and install.]

Note that if you do not build PyUQTk, the Python tests will not be run. The output should look similar to Figure 2.4. If the Python tests fail even though the compilation went well, a common issue is that the configure script found a different version of the Python libraries than the one used when you invoke Python from the command line. To avoid this, specify the path to your Python executable and libraries during the configuration process. For example (on OS X):

$ cmake -DCMAKE_INSTALL_PREFIX:PATH=$PWD/../UQTk-install \
  -DCMAKE_Fortran_COMPILER=/opt/local/gcc61/bin/gfortran-6.1.0 \
  -DCMAKE_C_COMPILER=/opt/local/gcc61/bin/gcc-6.1.0 \
  -DCMAKE_CXX_COMPILER=/opt/local/gcc61/bin/g++-6.1.0 \
  -DPYTHON_EXECUTABLE:FILEPATH=/opt/local/bin/python \
  -DPYTHON_LIBRARY:FILEPATH=/opt/local/Library/Frameworks/Python.framework/Versions/2.7/lib/libpython2.7.dylib \
  -DPyUQTk=ON \
  ../UQTk

If all looks good, you are now ready to install UQTk. While in the build directory, type

$ make install

which installs the libraries, headers, apps, examples, and such in the specified installation directory. Additionally, if you are building the Python interface, the install command will copy the Python scripts and SWIG modules (*.so) to PyUQTk/. As a reminder, commonly used configure options are illustrated in the scripts provided in the "sourcedir/config" folder.

Chapter 3 Theory and Conventions

UQTk implements many probabilistic methods found in the literature.
For more details on the methods, please refer to the following papers and books on Polynomial Chaos methods for uncertainty propagation [3, 17], Karhunen-Loève (KL) expansions [7], numerical quadrature (including sparse quadrature) [13, 2, 9, 31, 6], Bayesian inference [30, 4, 18], Markov Chain Monte Carlo [5, 8, 10, 11], Bayesian compressive sensing [1], and the Rosenblatt transformation [23]. Below, some key aspects and conventions of UQTk Polynomial Chaos expansions are outlined in order to connect the tools in UQTk to the broader theory.

Polynomial Chaos Expansions

• The default ordering of PCE terms in the multi-index in UQTk is the canonical ordering for total-order truncation.
• The PC basis functions in UQTk are not normalized.
• The Legendre-Uniform PC basis type is defined on the interval [-1, 1], with weight function 1/2.

Chapter 4 Source Code Description

For more details on the actual source code in UQTk, HTML documentation is also available in the doc_cpp/html folder.

Applications

The following command-line applications are available (source code is in cpp/app):

} generate_quad : Quadrature point/weight generation
} gen_mi : Polynomial multiindex generation
} gp_regr : Gaussian process regression
} model_inf : Model parameter inference
} pce_eval : PC evaluation
} pce_quad : PC generation from samples
} pce_resp : PC projection via quadrature integration
} pce_rv : PC-related random variable generation
} pce_sens : PC sensitivity extraction
} pdf_cl : Kernel Density Estimation
} regression : Linear parametric regression
} sens : Sobol sensitivity indices via Monte-Carlo sampling

Below we detail the theory behind all the applications. For specific help in running an app, type app_name -h.

} generate_quad:

This utility generates isotropic quadrature points (either full tensor product or sparse) of given dimensionality and type.
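To make the full tensor-product construction concrete, the following is a minimal NumPy sketch (not the generate_quad implementation) that tensorizes a one-dimensional Gauss-Legendre rule under the Legendre-Uniform convention of Chapter 3, i.e. with weight function 1/2 on [-1, 1]:

```python
import numpy as np

# 1-D Gauss-Legendre rule; rescale the weights by 1/2 so they integrate
# against the Legendre-Uniform (LU) weight function 1/2 on [-1, 1]
n = 5
x1d, w1d = np.polynomial.legendre.leggauss(n)
w1d = 0.5 * w1d                       # the weights now sum to 1

# Full tensor product in 2 dimensions: all pairings of the 1-D nodes,
# with weights given by products of the 1-D weights
X = np.array([(xi, xj) for xi in x1d for xj in x1d])
W = np.array([wi * wj for wi in w1d for wj in w1d])

print(W.sum())                        # ~1, the integral of the constant 1
print(np.dot(W, X[:, 0] ** 2))        # ~1/3, i.e. E[x^2] under the uniform weight
```

A sparse (Smolyak) construction instead combines tensor products of several one-dimensional levels, so that the number of points grows much more slowly with dimension than the n^d of the full grid.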
The keyword options are:

Quadrature types: -g
• LU : Legendre-Uniform
• HG : Gauss-Hermite
• LG : Gamma-Laguerre
• SW : Stieltjes-Wigert
• JB : Beta-Jacobi
• CC : Clenshaw-Curtis
• CCO : Clenshaw-Curtis Open (endpoints not included)
• NC : Newton-Cotes (equidistant)
• NCO : Newton-Cotes Open (endpoints not included)
• GP3 : Gauss-Patterson
• pdf : Custom PDF

Sparsity types: -x
• full : full tensor product
• sparse : Smolyak sparse grid construction

Note that one can create an equidistant multidimensional grid by using the 'NC' quadrature type and 'full' sparsity type.

} gen_mi:

This utility generates a multiindex set of a given type and dimensionality. The keyword options are:

Multiindex types: -x
• TO : Total-order truncation, i.e. α = (α_1, ..., α_d), where α_1 + ... + α_d = ||α||_1 ≤ p, for given order p and dimensionality d. The number of multiindices is N_{p,d}^{TO} = (p + d)!/(p! d!).
• TP : Tensor-product truncation, i.e. α = (α_1, ..., α_d), where α_i ≤ p_i for i = 1, ..., d. The dimension-specific orders are given in a file with a name specified as a command-line argument (-f). The number of multiindices is N_{p_1,...,p_d}^{TP} = ∏_{i=1}^d (p_i + 1).
• HDMR : High-Dimensional Model Representation, where, for each k, the k-variate multiindices are truncated up to a given order. That is, if ||α||_0 = k (i.e. the number of non-zero elements is equal to k), then ||α||_1 ≤ p_k, for k = 1, ..., k_max. The variate-specific orders p_k are given in a file with a name specified as a command-line argument (-f). The number of multiindices constructed in this way is N_{p_0,...,p_{k_max}}^{HDMR} = Σ_{k=0}^{k_max} (p_k + k)!/(p_k! k!).

} gp_regr:

This utility performs Gaussian process regression [22], in particular using the Bayesian perspective of constructing GP emulators, see e.g. [12, 20]. The data is given as pairs D = {(x^{(i)}, y^{(i)})}_{i=1}^N, where x ∈ R^d. The function to be found, f(x), is endowed with a Gaussian prior with mean h(x)^T c and a predefined covariance C(x, x') = σ^2 c(x, x').
Currently, only a squared-exponential covariance is implemented, i.e. c(x, x') = e^{-(x - x')^T B (x - x')}. The mean trend basis vector h(x) = (L_0(x), ..., L_{K-1}(x)) consists of Legendre polynomials, while c and σ^2 are hyperparameters with a normal inverse gamma (conjugate) prior

p(c, σ^2) = p(c | σ^2) p(σ^2) ∝ (e^{-(c - c_0)^T V^{-1} (c - c_0)/(2σ^2)} / σ^K) (e^{-β/σ^2} / σ^{2(α+1)}).

The parameters c_0, V^{-1} and B are fixed for the duration of the regression. Conditioned on y_i = f(x_i), the posterior is a Student-t process

f(x) | D, c_0, V^{-1}, B, α, β ∼ St-t(μ^*(x), σ̂^2 c^*(x, x')),

with mean and covariance defined as

μ^*(x) = h(x)^T ĉ + t(x)^T A^{-1} (y - H ĉ),
c^*(x, x') = c(x, x') - t(x)^T A^{-1} t(x') + [h(x)^T - t(x)^T A^{-1} H] V^* [h(x')^T - t(x')^T A^{-1} H]^T,

where y^T = (y^{(1)}, ..., y^{(N)}) and

ĉ = V^* (V^{-1} c_0 + H^T A^{-1} y),
V^* = (V^{-1} + H^T A^{-1} H)^{-1},
σ̂^2 = (2β + c_0^T V^{-1} c_0 + y^T A^{-1} y - ĉ^T (V^*)^{-1} ĉ) / (N + 2α - K - 2),      (4.1)
A_{mn} = c(x^{(m)}, x^{(n)}),
t(x)^T = (c(x, x^{(1)}), ..., c(x, x^{(N)})),
H = (h(x^{(1)})^T, ..., h(x^{(N)})^T).

Note that currently the commonly used prior p(c, σ^2) ∝ σ^{-2} is implemented, which is a special case with α = β = 0, c_0 = 0, and V^{-1} = 0_{K×K}. Also, a small nugget of size 10^{-6} is added to the diagonal of the matrix A for numerical purposes, playing the role of 'data noise'. Finally, the covariance matrix B is taken to be diagonal, with the entries either fixed or found before the regression by maximizing the marginal posterior [20]. More flexibility in trend basis and covariance structure selection is a matter of current work. The app builds the Student-t process according to the computations detailed above, and evaluates its mean and covariance at a user-defined grid of points x.

} model_inf:

This utility performs Bayesian inference for several generic types of models. Consider a dataset D = {(x^{(i)}, y^{(i)})}_{i=1}^N of pairs of x-y measured values from some unknown 'truth' function g(·), i.e. y^{(i)} = g(x^{(i)}) + measurement errors.
For example, y^{(i)} can be measurements at spatial locations x^{(i)}, or at time instances x^{(i)}, or x^{(i)} = i, simply enumerating several observables. We call the elements of x ∈ IR^S design or controllable parameters. Assume, generally, that g is not deterministic, i.e. the vector of measurements y^{(i)} at each i contains R instances/replicas/measurements of the true output g(x). Furthermore, consider a model of interest f(λ; x) as a function of model parameters λ ∈ IR^D producing a single output. We are interested in calibrating the model f(λ; x) with respect to the model parameters λ, seeking an approximate match of the model to the truth:

f(λ; x) ≈ g(x).      (4.2)

The full error budget takes the following form:

y^{(i)} = f(λ; x^{(i)}) + δ(x^{(i)}) + ε_i,      (4.3)

where δ(x) is the model discrepancy term, and ε_i is the measurement error for the i-th data point. The most common assumption for the latter is an i.i.d. Gaussian assumption with vanishing mean,

ε_i ∼ N(0, σ^2), for all i = 1, ..., N.      (4.4)

Concerning the model error δ(x), we envision three scenarios:

• When the model discrepancy term δ(x) is ignored, one arrives at the classical construction y^{(i)} - f(λ; x^{(i)}) ∼ N(0, σ^2), with the likelihood described below in Eq. (4.12).
• When the model discrepancy δ(x) is modeled explicitly as a Gaussian process with a predefined, typically squared-exponential, covariance term with parameters either fixed a priori or inferred as hyperparameters together with λ. This approach has been established in [15], and is referred to as the "Kennedy-O'Hagan" (koh) approach.
• The embedded model error approach is a novel strategy in which the model error is embedded into the model itself. For a detailed discussion of the advantages and challenges of the approach, see [26]. This method leads to several likelihood options (keywords abc, abcm, gausmarg, mvn, full, marg), many of which are topics of current research and are under development.
In this approach, one augments some of the parameters in λ with a probabilistic representation, such as a multivariate normal, and infers the parameters of this representation instead. Without loss of generality, and for clarity of illustration, we assume that the first M components of λ are augmented with a random variable. To this end, λ is augmented by a multivariate normal random variable as

λ → Λ = λ + A(α)ξ⃗,      (4.5)

where

A(α) = ( α_11  0     0     ...  0
         α_21  α_22  0     ...  0
         α_31  α_32  α_33  ...  0
         ...   ...   ...   ...  ...
         α_M1  α_M2  α_M3  ...  α_MM
         0     0     0     ...  0
         ...   ...   ...   ...  ...
         0     0     0     ...  0 )_{D×M},   and   ξ⃗ = (ξ_1, ξ_2, ..., ξ_M)^T.      (4.6)

Here ξ⃗ is a vector of independent identically distributed standard normal variables, and α = (α_11, ..., α_MM) is the vector of size M(M+1)/2 of all non-zero entries of the matrix A. The set of parameters describing the random vector Λ is λ̂ = (λ, α). The full data model is then written as

y^{(i)} = f(λ + A(α)ξ⃗; x^{(i)}) + ε_i,      (4.7)

or

y^{(i)} = f_λ̂(x^{(i)}; ξ⃗) + σ ξ_{M+i},      (4.8)

where f_λ̂(x; ξ⃗) is a random process induced by this model error embedding. The mean and variance of this process are defined as μ_λ̂(x) and σ_λ̂^2(x), respectively. To represent this random process and allow easy access to its first two moments, we employ a non-intrusive spectral projection (NISP) approach to propagate the uncertainties in f via a Gauss-Hermite PC expansion,

y^{(i)} = Σ_{k=0}^{K-1} f_ik(λ, α) Ψ_k(ξ⃗) + σ ξ_{M+i},      (4.9)

for a fixed order-p expansion, leading to K = (p + M)!/(p! M!) terms. The parameter estimation problem for λ is now reformulated as a parameter estimation for λ̂ = (λ, α). This inverse problem is solved via Bayesian machinery. Bayes' formula reads

p(λ̂|D) ∝ p(D|λ̂) p(λ̂),      (4.10)
(posterior ∝ likelihood × prior)

where the key function is the likelihood function

L_D(λ̂) = p(D|λ̂)      (4.11)

that connects the prior distribution of the parameters of interest to the posterior one.
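As an illustration of the Bayesian machinery described above (a minimal sketch, not the model_inf implementation), the following self-contained NumPy example calibrates the linear model f(λ; x) = λ_1 + λ_2 x under the classical Gaussian likelihood of Eq. (4.12), using a bare-bones random-walk Metropolis sampler with a flat prior; the synthetic data, step size, and chain length are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(42)

# Synthetic data from a "truth" line y = 1 + 2x, with noise sigma = 0.1
x = np.linspace(0.0, 1.0, 20)
sigma = 0.1
y = 1.0 + 2.0 * x + sigma * rng.normal(size=x.size)

def log_like(lam):
    # Classical Gaussian log-likelihood, cf. Eq. (4.12)
    resid = y - (lam[0] + lam[1] * x)
    return (-np.sum(resid ** 2) / (2.0 * sigma ** 2)
            - 0.5 * y.size * np.log(2.0 * np.pi * sigma ** 2))

# Bare-bones random-walk Metropolis over lambda = (lam_1, lam_2), flat prior
lam = np.zeros(2)
ll = log_like(lam)
chain = []
for _ in range(20000):
    prop = lam + 0.05 * rng.normal(size=2)   # symmetric proposal
    llp = log_like(prop)
    if np.log(rng.uniform()) < llp - ll:     # Metropolis accept/reject
        lam, ll = prop, llp
    chain.append(lam)

post = np.array(chain[5000:])                # discard burn-in
print(post.mean(axis=0))                     # posterior means, close to (1, 2)
```

With the embedded model error options, the same sampling machinery operates on the augmented parameters λ̂ = (λ, α) and one of the likelihoods listed below in place of Eq. (4.12).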
The options for the likelihood are given further in this section. For details on the likelihood construction, see [26]. To alleviate the invariance with respect to sign-flips, we use a prior that enforces α_Mi > 0 for i = 1, ..., M. Also, one can either fix σ^2 or infer it together with λ̂. Exact computation of the potentially high-dimensional posterior (4.10) is usually problematic; therefore we employ a Markov chain Monte Carlo (MCMC) algorithm for sampling from the posterior. The model f and the exact form of the likelihood are determined using command-line arguments. Below we detail the currently implemented model types.

Model types: -f
• prop : for x ∈ IR^1 and λ ∈ IR^1, the function is defined as f(λ; x) = λx.
• prop_quad : for x ∈ IR^1 and λ ∈ IR^2, the function is defined as f(λ; x) = λ_1 x + λ_2 x^2.
• exp : for x ∈ IR^1 and λ ∈ IR^2, the function is defined as f(λ; x) = e^{λ_1 + λ_2 x}.
• exp_quad : for x ∈ IR^1 and λ ∈ IR^3, the function is defined as f(λ; x) = e^{λ_1 + λ_2 x + λ_3 x^2}.
• const : for any x ∈ IR^n and λ ∈ IR^1, the function is defined as f(λ; x) = λ.
• linear : for x ∈ IR^1 and λ ∈ IR^2, the function is defined as f(λ; x) = λ_1 + λ_2 x.
• bb : the model is a 'black box' executed via a system call of a script named bb.x that takes files p.dat (an R × D matrix for λ) and x.dat (an N × S matrix for x) and returns the output y.dat (an R × N matrix for f). This effectively simulates f(λ; x) at R values of λ and N values of x.
• heat_transfer1 : a custom model designed for a tutorial case of a heat conduction problem: for x ∈ IR^1 and λ ∈ IR^1, the model is defined as f(λ; x) = x d_w/(A_w λ) + T_0, where d_w = 0.1, A_w = 0.04 and T_0 = 273.
• heat_transfer2 : a custom model designed for a tutorial case of a heat conduction problem: for x ∈ IR^1 and λ ∈ IR^2, the model is defined as f(λ; x) = x Q/(A_w λ_1) + λ_2, where A_w = 0.04 and Q = 20.0.
• frac_power : a custom function for testing. For x ∈ IR^1 and λ ∈ IR^4, the function is defined as f(λ; x) = λ_0 + λ_1 x + λ_2 x^2 + λ_3 (x + 1)^{3.5}.
• exp_sketch : an exponential function to enable the sketch illustrations of the model error embedding approach: for x ∈ R¹ and λ ∈ R², the model is defined as f(λ; x) = λ_2 e^{λ_1 x} − 2.
• inp : a function that produces the input components as output, i.e. f(λ; x^(i)) = λ_i, for x ∈ R¹ and λ ∈ R^d, assuming exactly d values for the design variables x (these are usually simply indices x_i = i for i = 1, ..., d).
• pcl : the model is a Legendre PC expansion that is linear with respect to the coefficients λ, i.e. f(λ; x) = Σ_{α∈S} λ_α Ψ_α(x).
• pcx : the model is a Legendre PC expansion in both x and λ, i.e. z = (λ, x) and f(λ; x) = Σ_{α∈S} c_α Ψ_α(z).
• pc : the model is a set of Legendre polynomial expansions, one for each value of x, i.e. f(λ; x^(i)) = Σ_{α∈S} c_{α,i} Ψ_α(λ).
• pcs : same as pc, only the multi-index set S can be different for each x^(i), i.e. f(λ; x^(i)) = Σ_{α∈S_i} c_{α,i} Ψ_α(λ).

Likelihood construction is the key step and the biggest challenge in model parameter inference.

Likelihood types: -l

• classical : no α, i.e. M = 0. This is the classical least-squares likelihood

log L_D(λ) = − Σ_{i=1}^N (y^(i) − f(λ; x^(i)))² / (2σ²) − (N/2) log(2πσ²).    (4.12)

• koh : Kennedy-O'Hagan likelihood with explicit additive representation of model discrepancy [15].
• full : the exact likelihood

L_D(λ̂) = π_{h_λ̂}(y^(1), ..., y^(N)),    (4.13)

where h_λ̂ is the random vector with entries f_λ̂(x^(i); ξ) + σ² ξ_{M+i}. When there is no data noise, i.e. σ = 0, this likelihood is degenerate [26]. Typically, computation of this likelihood requires a KDE step for each λ̂ to evaluate the high-dimensional PDF π_{h_λ̂}(·).
• marg : marginal approximation of the exact likelihood

L_D(λ̂) = Π_{i=1}^N π_{h_λ̂,i}(y^(i)),    (4.14)

where h_λ̂,i is the i-th component of h_λ̂. This requires one-dimensional KDE estimates performed for all N dimensions.
• mvn : multivariate normal approximation of the full likelihood

log L_D(λ̂) = −(1/2)(y − µ_λ̂)^T Σ_λ̂^{−1} (y − µ_λ̂) − (N/2) log(2π) − (1/2) log(det Σ_λ̂),    (4.15)

where the mean vector µ_λ̂ and covariance matrix Σ_λ̂ are defined as µ_λ̂,i = µ_λ̂(x^(i)) and Σ_λ̂,ij = E[(h_λ̂,i − µ_λ̂(x^(i)))(h_λ̂,j − µ_λ̂(x^(j)))], respectively.
• gausmarg : this likelihood further assumes independence in the gaussian approximation, leading to

log L_D(λ̂) = − Σ_{i=1}^N (y^(i) − µ_λ̂(x^(i)))² / (2(σ²_λ̂(x^(i)) + σ²)) − (1/2) Σ_{i=1}^N log(2π (σ²_λ̂(x^(i)) + σ²)).    (4.16)

• abcm : this likelihood enforces the mean of f_λ̂ to match the mean of the data,

log L_D(λ̂) = − Σ_{i=1}^N [ (y^(i) − µ_λ̂(x^(i)))² / (2ε²) + (1/2) log(2πε²) ].    (4.17)

• abc : this likelihood enforces the mean of f_λ̂ to match the mean of the data and the standard deviation to match the average spread of the data around the mean within some factor γ,

log L_D(λ̂) = − Σ_{i=1}^N [ ( (y^(i) − µ_λ̂(x^(i)))² + ( γ|y^(i) − µ_λ̂(x^(i))| − √(σ²_λ̂(x^(i)) + σ²) )² ) / (2ε²) + (1/2) log(2πε²) ].    (4.18)

} pce_eval: This utility evaluates PC-related functions given an input file xdata.dat and returns the evaluations in an output file ydata.dat. The keyword options are:

Function types: -f

• PC : evaluates the function f(ξ) = Σ_{k=0}^K c_k Ψ_k(ξ) given a set of ξ, the PC type, dimensionality, order and coefficients.
• PC_mi : evaluates the function f(ξ) = Σ_{k=0}^K c_k Ψ_k(ξ) given a set of ξ, the PC type, multiindex and coefficients.
• PCmap : evaluates 'map' functions from a germ of one PC type to another. That is, PC1→PC2 is a function η = f(ξ) = C_2^{−1}(C_1(ξ_1)), where C_1 and C_2 are the cumulative distribution functions (CDFs) associated with the PDFs of PC1 and PC2, respectively. For example, HG→LU is a map from a standard normal random variable to a uniform random variable on [−1, 1].

} pce_quad: This utility constructs a PC expansion from a given set of samples.
Given a set of N samples {x^(i)}_{i=1}^N of a random d-variate vector X, the goal is to build a PC expansion

X ≈ Σ_{k=0}^K c_k Ψ_k(ξ),    (4.19)

where d is the stochastic dimensionality, i.e. ξ = (ξ_1, ..., ξ_d). We use the orthogonal projection method, i.e.

c_k = ⟨X Ψ_k(ξ)⟩ / ⟨Ψ_k²(ξ)⟩ = ⟨G(ξ) Ψ_k(ξ)⟩ / ⟨Ψ_k²(ξ)⟩.    (4.20)

The denominator can be precomputed analytically or numerically with high precision. The key map G(ξ) in the numerator is constructed as follows. We employ the Rosenblatt transformation, constructed by shifted and scaled successive conditional cumulative distribution functions (CDFs),

η_1 = 2F_1(X_1) − 1
η_2 = 2F_{2|1}(X_2|X_1) − 1
η_3 = 2F_{3|2,1}(X_3|X_2, X_1) − 1    (4.21)
...
η_d = 2F_{d|d−1,...,1}(X_d|X_{d−1}, ..., X_1) − 1,

which maps any joint random vector to a set of independent standard Uniform[−1,1] random variables. The Rosenblatt transformation is the multivariate generalization of the well-known CDF transformation, which states that F(X) is uniformly distributed if F(·) is the CDF of the random variable X. The shorthand notation is η = R(X). Now denote the shifted and scaled univariate CDF of the 'germ' ξ_i by H(·), so that the CDF transformation reads H(ξ) = η. For example, for Legendre-Uniform PC the germ itself is uniform and H(·) is the identity, while for Gauss-Hermite PC the function H(·) is a shifted and scaled version of the normal CDF. Now we can write the connection between X and ξ by

R(X) = H(ξ),  or  X = (R^{−1} ∘ H)(ξ) ≡ G(ξ).    (4.22)

While the computation of H is done analytically or numerically with high precision, the main challenge is to estimate R^{−1}. In practice the exact joint cumulative distribution F(x_1, ..., x_d) is generally not available and is estimated with a standard Kernel Density Estimator (KDE) using the samples available.
Given N samples {x^(i)}_{i=1}^N, the KDE estimate of the joint probability density function is a sum of N multivariate gaussian functions centered at each data point x^(i):

p_X(x) = 1/(N σ^d (2π)^{d/2}) Σ_{i=1}^N exp( −(x − x^(i))^T (x − x^(i)) / (2σ²) )    (4.23)

or

p_{X_1,...,X_d}(x_1, ..., x_d) = 1/(N σ^d (2π)^{d/2}) Σ_{i=1}^N exp( −[(x_1 − x_1^(i))² + ... + (x_d − x_d^(i))²] / (2σ²) ),    (4.24)

where the bandwidth σ should be chosen to balance smoothness and accuracy; see [28, 29] for discussions of the choice of σ. Note that ideally σ should be chosen to be dimension-dependent; however, the current implementation uses the same bandwidth for all dimensions. Now the conditional CDF is KDE-estimated by

F_{k|k−1,...,1}(x_k|x_{k−1}, ..., x_1)
= ∫_{−∞}^{x_k} p_{k|k−1,...,1}(x'_k|x_{k−1}, ..., x_1) dx'_k
= ∫_{−∞}^{x_k} [ p_{k,...,1}(x'_k, x_{k−1}, ..., x_1) / p_{k−1,...,1}(x_{k−1}, ..., x_1) ] dx'_k
≈ [ Σ_{i=1}^N exp( −[(x_1 − x_1^(i))² + ... + (x_{k−1} − x_{k−1}^(i))²]/(2σ²) ) Φ( (x_k − x_k^(i))/σ ) ]
  / [ Σ_{i=1}^N exp( −[(x_1 − x_1^(i))² + ... + (x_{k−1} − x_{k−1}^(i))²]/(2σ²) ) ],    (4.25)

where Φ(z) is the CDF of a standard normal random variable. Note that the numerator in (4.25) differs from the denominator only by an extra factor Φ((x_k − x_k^(i))/σ) in each summand, allowing an efficient computation scheme.

The above Rosenblatt transformation maps the random vector x to a set of i.i.d. uniform random variables η = (η_1, ..., η_d). However, formula (4.22) requires the inverse of the Rosenblatt transformation. Nevertheless, the approximate conditional distributions are monotonic, hence they are guaranteed to have an inverse function, and it can be evaluated rapidly with a bisection method.
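The efficient scheme for (4.25) and its bisection-based inversion can be sketched as follows. This is a minimal NumPy illustration, not the UQTk implementation; the function names are hypothetical:

```python
import numpy as np
from math import erf

def phi(z):
    # standard normal CDF, applied elementwise
    z = np.asarray(z, dtype=float)
    return 0.5 * (1.0 + np.vectorize(erf)(z / np.sqrt(2.0)))

def cond_cdf(xk, x_prev, data, k, sigma):
    """KDE estimate of F_{k|k-1,...,1}(xk | x_prev), per Eq. (4.25).

    data   : N x d array of samples
    x_prev : conditioning values (x_1, ..., x_{k-1}); empty array when k = 1
    """
    # Gaussian weights from the first k-1 coordinates, shared by numerator and denominator
    w = np.exp(-np.sum((x_prev - data[:, :k-1])**2, axis=1) / (2.0 * sigma**2))
    num = np.sum(w * phi((xk - data[:, k-1]) / sigma))
    return num / np.sum(w)

def invert_cdf(u, x_prev, data, k, sigma, lo, hi, tol=1e-8):
    # bisection on the monotone conditional CDF: find x with F(x) = u
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if cond_cdf(mid, x_prev, data, k, sigma) < u:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For k = 1 the conditioning set is empty and cond_cdf reduces to the marginal KDE CDF; since the estimate is a weighted sum of Φ terms it is strictly increasing in xk, so the bisection always converges.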
With the numerical estimation of the map (4.22) available, we can proceed to evaluate the numerator of the orthogonal projection (4.20),

⟨G(ξ)Ψ_k(ξ)⟩ = ∫ G(ξ) Ψ_k(ξ) π_ξ(ξ) dξ,    (4.26)

where π_ξ(ξ) is the PDF of ξ. The projection integral (4.26) is computed via quadrature integration,

∫ G(ξ) Ψ_k(ξ) π_ξ(ξ) dξ ≈ Σ_{q=1}^Q G(ξ_q) Ψ_k(ξ_q) w_q = Σ_{q=1}^Q R^{−1}(H(ξ_q)) Ψ_k(ξ_q) w_q,    (4.27)

where (ξ_q, w_q) are Gaussian quadrature point-weight pairs for the weight function π_ξ(ξ).

} pce_resp: This utility performs orthogonal projection given function evaluations at quadrature points, in order to arrive at polynomial chaos coefficients for a total-order PC expansion

f(ξ) ≈ Σ_{||α||_1 ≤ p} c_α Ψ_α(ξ) ≡ g(ξ).    (4.28)

The orthogonal projection computed by this utility is

c_α = (1/⟨Ψ_α²⟩) ∫ f(ξ) Ψ_α(ξ) π_ξ(ξ) dξ ≈ (1/⟨Ψ_α²⟩) Σ_{q=1}^Q w_q f(ξ^(q)) Ψ_α(ξ^(q)).    (4.29)

Given the function evaluations f(ξ^(q)) and the precomputed quadrature (ξ^(q), w_q), this utility outputs the PC coefficients c_α, the PC evaluations at the quadrature points g(ξ^(q)), as well as, if requested by a command line flag, a quadrature estimate of the relative L_2 error

||f − g||_2 / ||f||_2 ≈ √( Σ_{q=1}^Q w_q (f(ξ^(q)) − g(ξ^(q)))² / Σ_{q=1}^Q w_q f(ξ^(q))² ).    (4.30)

Note that the selected quadrature may not compute the error accurately, since the integrated functions are squared and can be of higher order than the quadrature is expected to integrate accurately. In such cases, one can use the pce_eval app to evaluate the PC expansion separately and compare to the function evaluations with an ℓ_2 norm instead.

} pce_rv: This utility generates PC-related random variables (RVs). The keyword options are:

RV types: -w

• PC : generates samples of the univariate random variable Σ_{k=0}^K c_k Ψ_k(ξ) given the PC type, dimensionality, order and coefficients.
• PCmi : generates samples of the univariate random variable Σ_{k=0}^K c_k Ψ_k(ξ) given the PC type, multiindex and coefficients.
• PCvar : generates samples of the multivariate random variable ξ that is the germ of a given PC type and dimensionality.

} pce_sens: This utility evaluates Sobol sensitivity indices of a PC expansion with a given multiindex and a coefficient vector. It computes main, total and joint sensitivities, as well as the variance fraction of each PC term individually. Given a PC expansion Σ_{α∈I} c_α Ψ_α(ξ), the computed moments and sensitivity indices are:

• mean: m = c_0
• total variance: V = Σ_{α≠0} c_α² ⟨Ψ_α²⟩
• variance fraction for the basis term α: V_α = c_α² ⟨Ψ_α²⟩ / V
• main Sobol sensitivity index for dimension i: S_i = (1/V) Σ_{α∈I_{S_i}} c_α² ⟨Ψ_α²⟩, where I_{S_i} is the set of multiindices that include only dimension i.
• total Sobol sensitivity index for dimension i: S_i^T = (1/V) Σ_{α∈I_{T_i}} c_α² ⟨Ψ_α²⟩, where I_{T_i} is the set of multiindices that include dimension i, among others.
• joint-total Sobol sensitivity index for a dimension pair (i, j): S_ij^T = (1/V) Σ_{α∈I_{T_ij}} c_α² ⟨Ψ_α²⟩, where I_{T_ij} is the set of multiindices that include dimensions i and j, among others. Note that this is somewhat different from the conventional definition of joint sensitivity indices, which presumes terms that include only dimensions i and j.

} pdf_cl: Kernel density estimation (KDE) with Gaussian kernels given a set of samples, to evaluate the probability distribution function (PDF). The procedure relies on an approximate nearest neighbors algorithm with a fast improved Gaussian transform to accelerate KDE by only computing Gaussians of relevant neighbors. Our tests have shown a 10-20x speedup compared to Python's default KDE package. Also, the app allows a clustering enhancement of the data set to enable cluster-specific bandwidth selection, which is particularly useful for multimodal data.
The user provides the samples' file, and either a) a number of grid points per dimension for density evaluation, or b) a file with target points where the density is evaluated, or c) a file with hypercube limits within which the density is evaluated.

} regression: This utility performs regression with respect to linear parametric expansions such as PCs or RBFs. Consider a dataset (x^(i), y^(i))_{i=1}^N that one tries to fit with a basis expansion:

y^(i) ≈ Σ_{k=1}^K c_k P_k(x^(i)),    (4.31)

for a set of basis functions P_k(x). This is a linear regression problem, since the object of interest is the vector of coefficients c = (c_1, ..., c_K), and the summation above is linear in c. This app provides various methods of obtaining the expansion coefficients, using different kinds of bases. The key implemented command line options are

Basis types: -f

• PC : Polynomial Chaos bases of total-order truncation
• PC_MI : Polynomial Chaos bases of custom multiindex truncation
• POL : monomial bases of total-order truncation
• POL_MI : monomial bases of custom multiindex truncation
• RBF : Radial Basis Functions, see e.g. [21]

Regression methods: -f

• lsq : Bayesian least-squares, see [25] and more details below.
• wbcs : weighted Bayesian compressive sensing, see [27].

Although standard least squares is commonly used and well-documented elsewhere, we detail here the specific implementation in this app, including the Bayesian interpretation. Define the data vector y = (y^(1), ..., y^(N)), and the measurement matrix P of size N × K with entries P_ik = P_k(x^(i)). The regularized least-squares problem is formulated as

arg min_c  ||y − Pc||² + ||Λc||²  ≡ arg min_c R(c),    (4.32)

with the closed-form solution

ĉ = (P^T P + Λ)^{−1} P^T y,   Σ = (P^T P + Λ)^{−1},    (4.33)

where Λ = diag(√λ_1, ..., √λ_K) is a diagonal matrix of non-negative regularization weights λ_i ≥ 0.
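The closed-form solution (4.33) is a few lines of linear algebra. The sketch below is illustrative only (a direct inverse is used for clarity; a solver based on `np.linalg.solve` or a Cholesky factorization would be preferable in practice):

```python
import numpy as np

def regularized_lsq(P, y, lam):
    """Regularized least squares per Eq. (4.33):
    c_hat = (P^T P + Lambda)^{-1} P^T y, with Lambda = diag(lam)."""
    Lam = np.diag(lam)
    Sigma = np.linalg.inv(P.T @ P + Lam)
    c_hat = Sigma @ P.T @ y
    # residual R(c_hat) = ||y - P c_hat||^2 + ||Lambda c_hat||^2-type penalty term
    residual = np.sum((y - P @ c_hat)**2) + c_hat @ Lam @ c_hat
    return c_hat, Sigma, residual

# Fit y = 1 + 2x with a monomial basis {1, x} and a tiny regularization
x = np.linspace(0.0, 1.0, 20)
P = np.column_stack([np.ones_like(x), x])
y = 1.0 + 2.0 * x
c_hat, Sigma, R = regularized_lsq(P, y, np.array([1e-10, 1e-10]))
```

With noise-free data and negligible regularization, the recovered coefficients match the generating model and the residual is essentially zero.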
The Bayesian analog of this, detailed in [25], infers the coefficient vector c and the data noise variance σ², given data y, employing Bayes' formula

p(c, σ²|y) ∝ p(y|c, σ²) p(c, σ²).    (4.34)
(posterior ∝ likelihood × prior)

The likelihood function is associated with an i.i.d. Gaussian noise model y − Pc ∼ N(0, σ² I_N), and is written as

p(y|c, σ²) ≡ L_{c,σ²}(y) = (2πσ²)^{−N/2} exp( −||y − Pc||² / (2σ²) ).    (4.35)

Further, the prior p(c, σ²) is written as a product of a zero-mean Gaussian prior on c and an inverse-gamma prior on σ²:

p(c, σ²) = [ Π_{k=1}^K (λ_k/(2π))^{1/2} ] exp( −||Λc||²/2 ) · (σ²)^{−α−1} exp( −β/σ² ).    (4.36)

The posterior distribution then takes the form of a normal-scaled inverse gamma distribution which, after some rearranging, is best described as

p(c|σ², y) ∼ MVN(ĉ, σ²Σ),    (4.37)
p(σ²|y) ∼ IG( α + (N−K)/2, β + R(ĉ)/2 ) ≡ IG(α*, β*),    (4.38)

where ĉ and Σ, as well as the residual R(·), are defined via the classical least-squares problem (4.32) and (4.33). Thus, the mean posterior value of the data variance is σ̂² = (β + R(ĉ)/2) / (α + (N−K)/2 − 1). Also, note that the residual can be written as R(ĉ) = y^T ( I_N − P (P^T P + Λ)^{−1} P^T ) y. One can integrate out σ² from the posterior to arrive at a multivariate t-distribution

p(c|y) ∼ MVT( ĉ, (β*/α*) Σ, 2α* ),    (4.39)

with mean ĉ and covariance (α*/(α* − 2)) Σ. Now, the pushed-forward process at new values x, defining P(x) = (P_1(x), ..., P_K(x)), would be a Student-t process with mean µ(x) = P(x)ĉ, scale C(x, x') = (β*/α*) P(x) Σ P(x')^T and 2α* degrees of freedom. Note that, currently, Jeffreys' prior p(σ²) = 1/σ² is implemented, which corresponds to the case α = β = 0. We are currently implementing more flexible user-defined input for α and β. In particular, in the limit β = σ_0² α → ∞, one recovers the case with a fixed, predefined data noise variance σ_0².

} sens: This utility performs a series of tasks for the computation of Sobol indices.
Some theoretical background on the statistical estimators employed here is given in Chapter 5. This utility can be used in conjunction with the utility trdSpls, which generates truncated normal or log-normal random samples. It can also be used to generate uniform random samples by selecting a truncated normal distribution and a suitably large standard deviation. In addition to the -h flag, it has the following command line options:

• -a : action to be performed by this utility
  – splFO: assemble samples for first order Sobol indices
  – idxFO: compute first order Sobol indices
  – splTO: assemble samples for total order Sobol indices
  – idxTO: compute total order Sobol indices
  – splJnt: assemble samples for joint Sobol indices
  – idxJnt: compute joint Sobol indices
• -d : number of dimensions
• -n : number of samples
• -u : name of the file holding the first set of samples, nspl×ndim
• -v : name of the file holding the second set of samples, nspl×ndim
• -x : name of the file holding the model evaluations
• -p : name of a file possibly holding a custom list of parameters for Sobol indices

Chapter 5

Examples

The primary intended use for UQTk is as a library that provides UQ functionality to numerical simulations. To aid the development of UQ-enabled simulation codes, some examples of programs that perform common UQ operations with UQTk are provided with the distribution. These examples can serve as templates to be modified for the user's purposes. In some cases, e.g. in sampling-based approaches where the simulation code is used as a black-box entity, the examples may provide enough functionality to be used directly, with only minor adjustments. Below is a brief description of the main examples that are currently in the UQTk distribution. For all of these, make sure the environment variable UQTK_INS is set and points to the upper level directory of the UQTk install, e.g. the keyword installdir described in the installation section.
This path also needs to be added to the environment variable PYTHONPATH.

Elementary Operations

Overview

This set of examples is located under examples/ops. It illustrates the use of UQTk for elementary operations on random variables that are represented with Polynomial Chaos (PC) expansions.

Description

This example can be run from the command line:

./Ops.x

followed by

./plot_pdf.py samples.a.dat
./plot_pdf.py samples.loga.dat

to plot select probability distributions based on samples from the Polynomial Chaos Expansions (PCEs) utilized in this example.

Ops.x step-by-step

• Wherever relevant, the PCSet class implements functions that take either "double *" arguments or array container arguments. The array containers, named "Array1D", "Array2D", and "Array3D", respectively, are provided with the UQTk library to streamline the management of data structures.

1. Instantiate a PCSet class for a 2nd order 1D PCE using Hermite-Gauss chaos:

int ord = 2;
int dim = 1;
PCSet myPCSet("ISP",ord,dim,"HG");

2. Initialize the coefficients for the HG PCE â given its mean and standard deviation:

double ma = 2.0; // Mean
double sa = 0.1; // Std Dev
myPCSet.InitMeanStDv(ma,sa,a);

â = Σ_{k=0}^P a_k Ψ_k(ξ),  a_0 = µ,  a_1 = σ/√⟨ψ_1²⟩,  a_2 = a_3 = ... = 0

3. Initialize b̂ = 2.0ψ_0(ξ) + 0.2ψ_1(ξ) + 0.01ψ_2(ξ) and subtract b̂ from â:

b[0] = 2.0;
b[1] = 0.2;
b[2] = 0.01;
myPCSet.Subtract(a,b,c);

The subtraction is a term-by-term operation: c_k = a_k − b_k.

4. Product of PCEs, ĉ = â · b̂:

myPCSet.Prod(a,b,c);

ĉ = Σ_{k=0}^P c_k Ψ_k(ξ) = ( Σ_{k=0}^P a_k Ψ_k(ξ) )( Σ_{k=0}^P b_k Ψ_k(ξ) ),
c_k = Σ_{i=0}^P Σ_{j=0}^P C_ijk a_i b_j,   C_ijk = ⟨ψ_i ψ_j ψ_k⟩ / ⟨ψ_k²⟩

The triple product C_ijk is computed and stored when the PCSet class is instantiated.

5. Exponential of a PCE, ĉ = exp(â), computed using a Taylor series approach:

myPCSet.Exp(a,c);

ĉ = exp(â) = exp(a_0) Σ_{n=0}^{N_T} d̂ⁿ/n!,    (5.1)
where  d̂ = â − a_0 = Σ_{k=1}^P a_k Ψ_k(ξ).    (5.2)
The number of terms N_T in the Taylor series expansion is incremented adaptively until an error criterion is met (relative magnitude of coefficients compared to the mean) or the maximum number of terms is reached. Currently, the default relative tolerance and maximum number of Taylor terms are 10⁻⁶ and 500, respectively. These values can be changed by the user via the public PCSet methods SetTaylorTolerance and SetTaylorTermsMax.

6. Division, ĉ = â/b̂:

myPCSet.Div(a,b,c);

Internally the division operation is cast as a linear system (see item 4), â = b̂ · ĉ, with unknown coefficients c_k and known coefficients a_k and b_k. The linear system is sparse and is solved with a GMRES iterative solver provided by NETLIB.

7. Natural logarithm, ĉ = log(â):

myPCSet.Log(a,c);

Currently, two methodologies are implemented to compute the logarithm of a PCE: a Taylor series expansion and an integration approach. For more details see Debusschere et al. [3].

8. Draw samples from the random variable â represented as a PCE:

myPCSet.DrawSampleSet(aa,aa_samp);

Currently "Ops.x" draws samples from both â and log(â) and saves the results to files "samples.a.dat" and "samples.loga.dat", respectively.

9. The directory contains a python script that computes probability distributions from samples via Kernel Density Estimation (KDE, also see Lecture #1) and generates two plots, "samples.a.dat.pdf" and "samples.loga.dat.pdf", also shown in Fig. 5.1.

Figure 5.1: Probability densities for â and log(â) computed via KDE. Results generated using several KDE bandwidths (optimal, optimal/2, optimal*2).
This feature is available in Python's SciPy package starting with version 0.11.

Forward Propagation of Uncertainty

Overview

• Located in examples/surf_rxn
• Several examples of propagating uncertainty in input parameters through a model for surface reactions, consisting of three Ordinary Differential Equations (ODEs). Two approaches are illustrated:
  – Direct linking to the C++ UQTk libraries from a C++ simulation code:
    ∗ Propagation of input uncertainties with Intrusive Spectral Projection (ISP), Non-Intrusive Spectral Projection (NISP) via quadrature, and NISP via Monte Carlo (MC) sampling.
    ∗ For more documentation, see the detailed description below.
    ∗ An example can be run with ./forUQ_sr.py
  – Using the simulation code as a black-box forward model:
    ∗ Propagation of uncertainty in one input parameter with the NISP quadrature approach.
    ∗ For more documentation, see the detailed description below.
    ∗ An example can be run with ./forUQ_BB_sr.py

Simulation Code Linked to UQTk Libraries

The example script forUQ_sr.py, provided with this example, can perform parametric uncertainty propagation using three methods:

• NISP: non-intrusive spectral projection using quadrature integration
• ISP: intrusive spectral projection
• NISP MC: non-intrusive spectral projection using Monte-Carlo integration

The command-line usage for this example is

./forUQ_sr.py [ ] [ ]

The script requires the xml input template file forUQ_surf_rxn.in.xml.templ. In this template, the default setting for param b is an uncertain normal random variable with a standard deviation set to 10% of the mean.
The following parameters are defined at the beginning of the file:

• pctype: the type of PC; supports 'HG', 'LU', 'GLG', 'JB'
• pcord: the order of the output PC expansion
• methodX: NISP, ISP or NISP MC
• nsam: number of samples requested for NISP Monte-Carlo (currently hardwired in the script)

Description of the Non-Intrusive Spectral Projection utilities (SurfRxnNISP.cpp and SurfRxnNISP_MC.cpp)

f(ξ) = Σ_k c_k Ψ_k(ξ),   c_k = ⟨f(ξ)Ψ_k(ξ)⟩ / ⟨Ψ_k²(ξ)⟩,
⟨f(ξ)Ψ_k(ξ)⟩ = ∫ f(ξ)Ψ_k(ξ)π(ξ)dξ ≈ Σ_q f(ξ_q)Ψ_k(ξ_q)w_q   (NISP)
             or ≈ (1/N) Σ_s f(ξ_s)Ψ_k(ξ_s)   (NISP MC)

These codes implement the following workflow:

1. Read the XML file
2. Create a PC object with or without quadrature
   • NISP: PCSet myPCSet("NISP",order,dim,pcType,0.0,1.0)
   • NISP MC: PCSet myPCSet("NISPnoq",order,dim,pcType,0.0,1.0)
3. Get the quadrature points or generate Monte-Carlo samples
   • NISP: myPCSet.GetQuadPoints(qdpts)
   • NISP MC: myPCSet.DrawSampleVar(samPts)
4. Create input PC objects and evaluate the input parameters corresponding to the quadrature points
5. Step forward in time: collect values for all input parameter samples, perform Galerkin projection or Monte-Carlo integration, and write the PC modes and derived first two moments to files

Description of the Intrusive Spectral Projection utility (SurfRxnISP.cpp)

This code implements the following workflow:

1. Read the XML file
2. Create a PC object for intrusive propagation: PCSet myPCSet("ISP",order,dim,pcType,0.0,1.0)
3. Represent the state variables and all parameters with their PC coefficients
   • u → {u_k}, v → {v_k}, w → {w_k}, z → {z_k},
   • a → {a_k}, b → {b_k}, c → {c_k}, d → {d_k}, e → {e_k}, f → {f_k}.
4. Step forward in time according to PC arithmetic, e.g. a·u → {(a·u)_k} with

a·u = ( Σ_i a_i Ψ_i(ξ) )( Σ_j u_j Ψ_j(ξ) ) = Σ_k [ Σ_{i,j} (⟨Ψ_i Ψ_j Ψ_k⟩/⟨Ψ_k²⟩) a_i u_j ] Ψ_k(ξ)
Postprocessing Utilities - time series

./plSurfRxnMstd.py NISP
./plSurfRxnMstd.py ISP
./plSurfRxnMstd.py NISP_MC

These commands plot the time series of the mean and standard deviations of all three species with all three methods. Sample results are shown in Fig. 5.2.

Figure 5.2: Time series of mean and standard deviations for u, v, and w with NISP, ISP, and NISP MC, respectively.

Postprocessing Utilities - PDFs

./plPDF_method.py [ ] [ ]

e.g.

./plPDF_method.py u ave HG 3 NISP ISP

This script samples the PC representations, then computes the PDFs of the time-average (ave) or the final time value (tf) for all three species. Sample results are shown in Fig. 5.3.

Figure 5.3: PDFs for u, v, and w. The top row shows results for average u, v, and w; the bottom row shows results corresponding to values at the last integration step (final time).
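The NISP quadrature projection used by these utilities can be reproduced in a few lines for a 1-D Hermite-Gauss chaos. The sketch below uses NumPy only (not the UQTk PCSet class) and the probabilists' Hermite convention, where the basis norms are ⟨He_k²⟩ = k!:

```python
import numpy as np
from math import factorial

def nisp_1d(f, order, nquad):
    """Project f(ξ), ξ ~ N(0,1), onto probabilists' Hermite polynomials He_k
    via Gauss-Hermite quadrature: c_k = <f He_k> / <He_k^2>."""
    # physicists' Gauss-Hermite rule, rescaled to the standard normal weight
    x, w = np.polynomial.hermite.hermgauss(nquad)
    xi = np.sqrt(2.0) * x
    w = w / np.sqrt(np.pi)
    # probabilists' Hermite polynomials via the recurrence He_{k+1} = ξ He_k - k He_{k-1}
    He = np.zeros((order + 1, nquad))
    He[0] = 1.0
    if order >= 1:
        He[1] = xi
    for k in range(1, order):
        He[k + 1] = xi * He[k] - k * He[k - 1]
    norms = np.array([factorial(k) for k in range(order + 1)], dtype=float)
    return (He * f(xi) * w).sum(axis=1) / norms

# exp(ξ) has the known expansion coefficients c_k = e^{1/2}/k!
c = nisp_1d(np.exp, order=4, nquad=20)
```

The first coefficient c[0] is the mean of exp(ξ), i.e. e^{1/2}, which matches the lognormal moment; the remaining coefficients decay as 1/k!.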
Simulation Code Employed as a Black Box

The command-line usage for the script implementing this example is given as

./forUQ_BB_sr.py --nom nomvals -s stdfac -d dim -l lev -o ord -q sp --npdf npdf --npces npces

The following parameters can be controlled by the user:

• nomvals: list of nominal parameter values, separated by commas if more than one value, with no spaces. Default: one value, 20.75
• stdfac: ratio of standard deviation to nominal parameter value. Default value: 0.1
• dim: number of uncertain input parameters. Currently this example can only handle dim = 1
• lev: number of quadrature points per dimension (for full quadrature) or sparsity level (for sparse quadrature). Default value: 21
• ord: PCE order. Default value: 20
• sp: quadrature type, "full" or "sparse". Default value: "full"
• npdf: number of grid points for the Kernel Density Estimate evaluations of the output model PDFs. Default value: 100
• npces: number of PCE evaluations used to estimate the output densities. Default value: 10⁵

Note: this example assumes Hermite-Gauss chaos for the model input parameters. The script uses the following utilities, located in the bin directory under the UQTk installation path:

• generate_quad: generates quadrature points for full/sparse quadrature and several types of rules
• pce_rv: generates samples from a random variable defined by a Polynomial Chaos expansion (PCE)
• pce_eval: evaluates a PCE for germ samples saved in the input file "xdata.dat"
• pce_resp: constructs a PCE by Galerkin projection

Sequence of computations:

1. forUQ_BB_sr.py saves the input parameters' nominal values and standard deviations in a diagonal matrix format in the file "pcfile". First it saves the matrix of nominal values, then the matrix of standard deviations. This information is sufficient to define a PCE for a normal random variable in terms of a standard normal germ. For a one-parameter problem, this file has two lines.
2. generate_quad: generates quadrature points for full/sparse quadrature and several types of rules. The usage with the default script arguments is

generate_quad -d1 -g'HG' -xfull -p21 > logQuad.dat

This generates Hermite-Gauss quadrature points for a 21-point rule in one dimension. Quadrature point locations are saved in "qdpts.dat", weights in "wghts.dat", and the indices of the points in the 1D space in "indices.dat". At the end of the "generate_quad" execution, file "qdpts.dat" is copied over "xdata.dat".

3. pce_eval: evaluates the PCE of the input parameters at the quadrature points, saved previously in "xdata.dat". The evaluation is dimension by dimension, and for each dimension the corresponding column from "pcfile" is saved in "pccf.dat". See the command-line arguments below.

pce_eval -x'PC' -p1 -q1 -f'pccf.dat' -sHG >> logEvalInPC.dat

At the end of this computation, file "input.dat" contains a matrix of PCE evaluations. The number of lines is equal to the number of quadrature points and the number of columns to the dimensionality of the input parameter space.

4. Model evaluations:

funcBB("input.dat","output.dat",xmltpl="surf_rxn.in.xml.tp3", xmlin="surf_rxn.in.xml")

The Python function "funcBB" is defined in file "prob3_utils.py". It evaluates the forward model at the sets of input parameters in file "input.dat" and saves the model output in "output.dat". For each model evaluation, the specific parameters are inserted in the xml file "surf_rxn.in.xml", which is a copy of the template "surf_rxn.in.xml.tp3". At the end, "output.dat" is copied over "ydata.dat".

5. pce_resp:

pce_resp -xHG -o20 -d1 -e > logPCEresp.dat

Computes a Hermite-Gauss PCE of the model output via Galerkin projection. The model evaluations are taken from "ydata.dat", and the quadrature point locations from "xdata.dat".
PCE coefficients are saved in "PCcoeff_quad.dat", the multi-index list in "mindex.dat", and these files are pasted together in "mipc.dat". (The plot produced at this step shows average u as a function of the parameter b values, with the locations of the quadrature points shown with circles.)

6. pce_rv:

pce_rv -w'PCvar' -xHG -d1 -n100 -p1 -q0 > logPCrv.dat

Draws 100 samples from the germ of the HG PCE. Samples are saved in file "rvar.dat" and also copied to file "xdata.dat".

7. pce_eval:

pce_eval -x'PC' -p1 -q1 -f'pccf.dat' -sHG >> logEvalInPCrnd.dat

See item 3 for details. Results are saved in "input_val.dat".

8. Evaluate both the forward model (through the black-box script "funcBB", see item 4) and its PCE surrogate (see item 3) and save the results to files "output_val.dat" and "output_val_pc.dat". Compute the L2 error between the two sets of values using the function "compute_err" defined in "utils.py".

9. Sample the output PCE and plot the PDF of these samples, computed using either a Kernel Density Estimate approach with several kernel bandwidths or by binning. (The resulting plot compares the PC surrogate PDFs for the optimal, half-optimal, and double-optimal bandwidths, and for binning.)

Numerical Integration

Overview

This example is located in examples/num_integ. It contains a collection of Python scripts that can be used to perform numerical integration on six Genz functions: oscillatory, exponential, continuous, Gaussian, corner-peak, and product-peak. Quadrature and Monte Carlo integration methods are both employed in this example.
Quadrature Integration The general quadrature rule for integrating a function u(ξ) is given by: Z Nq X u(ξ)dξ ≈ q i u(ξ i ) (5.3) i=1 i where the Nq ξ are quadrature points with corresponsing weights q i . The accuracy of quadrature integration relies heavily on the choice of the quadrature points. There are countless quadrature rules that can be used to generate quadrature points, such as Gauss-Hermite, Gauss-Legendre, and Clenshaw-Curtis. When performing quadrature integration, one can use either full tensor product or sparse quadrature methods. While full tensor product quadrature methods are effective for functions of low dimension, they suffer from the curse of dimensionality. Full tensor product quadrature integration methods require N d quadrature points to integrate a function of dimension d with N quadrature points per dimension. Thus, for functions of high dimension the number of quadrature points required quickly becomes too large for these methods to be practical. Therefore, in higher dimensions sparse quadrature approaches, which require far fewer points, are utilized. When performing sparse quadrature integration, rather than determining the number of quadrature points per dimension, a level is selected. Once a level is selected, the total number of quadrature points can be determined from the dimension of the function. For more information on quadrature integration see reference here. Monte Carlo Integration One random sampling approach that can be used to evaluate integrals numerically is Monte Carlo integration. To use Monte Carlo integration methods to evaluate the integral of a general function u(ξ) on the d-dimensional [0, 1]d the following equation can be used: Z Ns 1 X u(ξ i ) (5.4) u(ξ)dξ ≈ Ns i=1 The Ns ξ i are random sampling points chosen from the region of integration according to the distribution of the inputs. In this example, we are assuming the inputs have uniform 45 distribution. 
One advantage of Monte Carlo integration is that any number of sampling points can be used, while quadrature integration methods require a certain number of sampling points. One disadvantage of Monte Carlo integration is its slow convergence. However, this O(1/√N_s) convergence rate is independent of the dimension of the integral.

Genz Functions

The functions being integrated in this example are six Genz functions, integrated over the d-dimensional cube [0,1]^d. These functions, along with their exact integrals, are defined below. The Genz parameters w_i are weight parameters and the u_i are shift parameters. In the current example, the parameters w_i and u_i are set to 1, with one exception: they are set to 0.1 for the corner-peak function in the sparse_quad.py file.

Oscillatory
  Formula: f(λ) = cos(2πu_1 + Σ_{i=1}^d w_i λ_i)
  Exact integral: cos(2πu_1 + (1/2) Σ_{i=1}^d w_i) Π_{i=1}^d [2 sin(w_i/2) / w_i]

Exponential
  Formula: f(λ) = exp(Σ_{i=1}^d w_i (λ_i − u_i))
  Exact integral: Π_{i=1}^d (1/w_i) [exp(w_i(1 − u_i)) − exp(−w_i u_i)]

Continuous
  Formula: f(λ) = exp(−Σ_{i=1}^d w_i |λ_i − u_i|)
  Exact integral: Π_{i=1}^d (1/w_i) [2 − exp(−w_i u_i) − exp(w_i(u_i − 1))]

Gaussian
  Formula: f(λ) = exp(−Σ_{i=1}^d w_i² (λ_i − u_i)²)
  Exact integral: Π_{i=1}^d [√π / (2w_i)] [erf(w_i(1 − u_i)) + erf(w_i u_i)]

Corner-peak
  Formula: f(λ) = (1 + Σ_{i=1}^d w_i λ_i)^{−(d+1)}
  Exact integral: [1 / (d! Π_{i=1}^d w_i)] Σ_{r∈{0,1}^d} (−1)^{||r||₁} / (1 + Σ_{i=1}^d w_i r_i)

Product-peak
  Formula: f(λ) = Π_{i=1}^d w_i² / (1 + w_i²(λ_i − u_i)²)
  Exact integral: Π_{i=1}^d w_i [arctan(w_i(1 − u_i)) + arctan(w_i u_i)]

Implementation

The script set consists of three files:

• full_quad.py: a script to compare full quadrature and Monte Carlo integration methods.
• sparse_quad.py: a script to compare sparse quadrature and Monte Carlo integration methods.
• quad_tools.py: a script containing functions called by full_quad.py and sparse_quad.py.

full_quad.py

This script will produce a graph comparing full quadrature and Monte Carlo integration methods. Use the command ./full_quad.py to run this file.
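The exact-integral formulas above can be checked numerically. Below is a hedged sketch (illustrative code, not part of the example scripts) verifying the oscillatory model's formula against a brute-force Monte Carlo estimate.

```python
import numpy as np

# Hedged sketch: the Genz "oscillatory" function and its exact integral
# over [0,1]^d, following the formulas tabulated above.
def genz_osc(lam, w, u1):
    # f(lambda) = cos(2*pi*u1 + sum_i w_i * lambda_i)
    return np.cos(2.0 * np.pi * u1 + lam @ w)

def genz_osc_exact(w, u1):
    # cos(2*pi*u1 + 0.5*sum_i w_i) * prod_i [2*sin(w_i/2) / w_i]
    return np.cos(2.0 * np.pi * u1 + 0.5 * w.sum()) * np.prod(2.0 * np.sin(w / 2.0) / w)

d = 3
w = np.ones(d)       # weight parameters w_i = 1, as in this example
u1 = 1.0             # shift parameter u_1 = 1
rng = np.random.default_rng(1)
lam = rng.uniform(0.0, 1.0, (200000, d))   # brute-force Monte Carlo check
mc = genz_osc(lam, w, u1).mean()
print(abs(mc - genz_osc_exact(w, u1)) < 1e-2)   # True
```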
Upon running the file, the user will be prompted to select a model from the Genz functions listed.

Please enter desired model from choices: genz_osc genz_exp genz_cont genz_gaus genz_cpeak genz_ppeak

The six functions listed correspond to the Genz functions defined above. After the user selects the desired model, he/she will be prompted to enter the desired dimension.

Please enter desired dimension:

The dimension should be entered as an integer without any decimal points. As full quadrature integration is being implemented, this script should not be used for functions of high dimension. If you wish to integrate a function of high dimension, use sparse_quad.py instead. After the user enters the desired dimension, she/he will be prompted to enter the desired maximum number of quadrature points per dimension.

Enter the desired maximum number of quadrature points per dimension:

Again, this number should be entered as an integer without any decimal points. Several quadrature integrations will be performed, with the first beginning with 1 quadrature point per dimension. For subsequent quadrature integrations, the number of quadrature points per dimension is incremented by one until the maximum specified by the user is reached. For example, if the user has requested a maximum of 4 quadrature points per dimension, 4 quadrature integrations will be performed: one with 1 quadrature point per dimension, another with 2, a third with 3, and a fourth with 4.

Next, the script will call the function generate_qw from the quad_tools.py script to generate quadrature points as well as the corresponding weights. Then, the exact integral for the chosen function is computed by calling the integ_exact function in quad_tools.py. This function calculates the exact integral according to the formulas found in the above Theory section.
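For intuition about what a full tensor-product rule such as the one generate_qw returns looks like, here is a NumPy-only sketch (not the UQTk implementation) of a Legendre-Uniform ('LU') full tensor rule: N^d points in [-1, 1]^d with product weights.

```python
import numpy as np

# Hedged sketch of a full tensor-product Gauss-Legendre rule in d dimensions.
def full_tensor_legendre(ndim, n_per_dim):
    x1, w1 = np.polynomial.legendre.leggauss(n_per_dim)   # 1D rule on [-1, 1]
    grids = np.meshgrid(*([x1] * ndim), indexing="ij")
    pts = np.stack([g.ravel() for g in grids], axis=1)    # shape (N^d, d)
    wgrids = np.meshgrid(*([w1] * ndim), indexing="ij")
    wts = np.prod(np.stack(wgrids), axis=0).ravel()       # product weights
    return pts, wts

pts, wts = full_tensor_legendre(ndim=3, n_per_dim=4)
# 4^3 = 64 points; the weights sum to the volume of [-1,1]^3, i.e. 2^3 = 8
print(pts.shape, round(wts.sum(), 12))   # (64, 3) 8.0
```

The N^d growth visible in pts.shape is exactly the curse of dimensionality that motivates the sparse rules used in sparse_quad.py.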
The error between the exact integral and the quadrature approximation is then calculated and stored in a list of errors. Now, for each quadrature integration performed, a Monte Carlo integration is also performed with the same number of sampling points as the total number of quadrature points. To account for the random nature of the Monte Carlo sampling approach, ten Monte Carlo integrations are performed and their errors from the exact integral are averaged. To perform these Monte Carlo integrations and calculate the error in these approximations, the function find_error found in quad_tools.py is called. Although we are integrating over [0,1]^d, the sampling points will be uniformly random points in [−1,1]^d. We do this so the same function func can be used to evaluate the model at these points and at the quadrature points, which are generated in [−1,1]^d. The function func takes points in [−1,1]^d as input and maps them to points in [0,1]^d before the function is evaluated at these new points.

Finally, the data from both the quadrature and Monte Carlo integrations are plotted. A log-log graph is created that displays the total number of sampling points versus the absolute error in the integral approximation. The graph will be displayed and will also be saved as quad_vs_mc.pdf.

sparse_quad.py

This script is similar to the full_quad.py file and will produce a graph comparing sparse quadrature and Monte Carlo integration methods. Sparse quadrature rules should be used for functions of high dimension, as they are not constructed as full tensor products and therefore require far fewer points. Use the command ./sparse_quad.py to run this script.

Upon running the file, the user will be prompted to select a model from the Genz functions listed.

Please enter desired model from choices: genz_osc genz_exp genz_cont genz_gaus genz_cpeak genz_ppeak

After the user selects the desired model, he/she will be prompted to enter the desired dimension.
Please enter desired dimension:

The dimension should be entered as an integer without any decimal points. After the user enters the desired dimension, she/he will be prompted to enter the maximum desired level.

Enter the maximum desired level:

Again, this number should be entered as an integer without any decimal points. Multiple quadrature integrations will be performed, with the first beginning at level 1. For subsequent quadrature integrations, the level will increase by one until the maximum desired level, as specified by the user, is reached.

Next, the script will call the function generate_qw from the quad_tools.py script to generate quadrature points as well as the corresponding weights. Then, the exact integral for the chosen function is computed by calling the integ_exact function in quad_tools.py. The error between the exact integral and the quadrature approximation is then calculated and stored in a list of errors. Now, for each quadrature integration performed, a Monte Carlo integration is also performed with the same number of sampling points as the total number of quadrature points. This is done in the same manner as in the full_quad.py script.

Lastly, the data from both the sparse quadrature and Monte Carlo integrations are plotted. A log-log graph is created that displays the total number of sampling points versus the absolute error in the integral approximation. The graph will be displayed and will also be saved as sparse_quad.pdf.

quad_tools.py

This script contains four functions called by the full_quad.py and sparse_quad.py files.

• generate_qw(ndim,param,sp='full',type='LU'): This function generates quadrature points and corresponding weights. The quadrature points are generated in the d-dimensional cube [−1,1]^d.
  – ndim: The number of dimensions as specified by the user.
  – param: Equal to the number of quadrature points per dimension when full quadrature integration is being performed.
When sparse quadrature integration is being performed, param represents the level.
  – sp: The sparsity, which can be set to either full or sparse. The default is sp='full'; to change to sparse quadrature, pass sp='sparse' to the function.
  – type: The quadrature type. The default rule is Legendre-Uniform ('LU'). To change the quadrature type, pass a different type to the function. For example, to change to a Gauss-Hermite quadrature rule, pass type='HG'. For a complete list of the available quadrature types, see the generate_quad subsection in the Applications section of Chapter 3 of the manual.

• func(xdata,model,func_params): This function evaluates the Genz functions at the selected sampling points.
  – xdata: These will either be the quadrature points generated by generate_qw or the uniform random points generated in the find_error function. The points passed as xdata are in [−1,1]^d and thus are first mapped to points in [0,1]^d before the function is evaluated at these new points.
  – model: The Genz function specified by the user.
  – func_params: The parameters, w_i and u_i, of the Genz function selected. In the full_quad.py file, all Genz parameters are set to 1. In the sparse_quad.py file, all Genz parameters are set to 1 for all models except genz_cpeak, for which they are set to 0.1.

• integ_exact(model,func_params): This function computes the exact integral ∫_{[0,1]^d} f(λ) dλ of the selected Genz function, f(λ).
  – model: The Genz function selected by the user.
  – func_params: The parameters, w_i and u_i, of the Genz function selected. In the full_quad.py file, all Genz parameters are set to 1. In the sparse_quad.py file, all Genz parameters are set to 1 for all models except genz_cpeak, for which they are set to 0.1.
• find_error: This function performs 10 Monte Carlo integrations and returns their average error from the exact integral. The function takes the inputs pts, ndim, model, integ_ex, and func_params.
  – pts: The number of uniform random points that will be generated. Equal to the total number of quadrature points used.
  – ndim: The number of dimensions as specified by the user.
  – model: The Genz function selected by the user.
  – integ_ex: The exact integral ∫_{[0,1]^d} f(λ) dλ of the selected Genz function, returned by the integ_exact function.
  – func_params: The parameters, w_i and u_i, of the Genz function selected. In the full_quad.py file, all Genz parameters are set to 1. In the sparse_quad.py file, all Genz parameters are set to 1 for all models except genz_cpeak, for which they are set to 0.1.

Sample Results

Try running the full_quad.py file with the following input:

Please enter desired model from choices: genz_osc genz_exp genz_cont genz_gaus genz_cpeak genz_ppeak
genz_exp
Please enter desired dimension:
5
Enter the desired maximum number of quadrature points per dimension:
10

Your graph should look similar to the one in the figure below. Although the Monte Carlo integration curve may vary due to the random nature of the sampling, your quadrature curve should be identical to the one pictured.
[Figure 5.4: Sample results of full_quad.py. Log-log plot of the absolute error in the integral approximation versus the total number of sampling points, for full quadrature and Monte Carlo integration of the Genz exponential model with dimension 5.]

Now try running the sparse_quad.py file with the following input:

Please enter desired model from choices: genz_osc genz_exp genz_cont genz_gaus genz_cpeak genz_ppeak
genz_cont
Please enter desired dimension:
14
Enter the maximum desired level:
4

While the quadrature integrations are being performed, the current level will be printed to your screen. Your graph should look similar to the figure below. Again, the Monte Carlo curve may differ, but the quadrature curve should be the same as the one pictured.

[Figure 5.5: Sample results of sparse_quad.py. Log-log plot of the absolute error versus the total number of sampling points, for sparse quadrature and Monte Carlo integration of the Genz continuous model with dimension 14.]

Next, try running full_quad.py with a quadrature rule other than the default Legendre-Uniform. Locate the line in full_quad.py that calls the function generate_qw. It should read:

xpts,wghts=generate_qw(ndim,quad_param)

Now, change this line to read:

xpts,wghts=generate_qw(ndim,quad_param,type='CC')

This will change the quadrature rule to Clenshaw-Curtis. Then run the file with the input: genz_gaus, 5, 10. Sample results can be found in the figure below.

[Figure 5.6: Sample results of full_quad.py with the Clenshaw-Curtis quadrature rule, for the Genz Gaussian model with dimension 5.]
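Two quad_tools.py behaviors described in this section, the map func applies from [-1,1]^d to [0,1]^d and the ten-run error averaging done by find_error, can be sketched as follows (illustrative code with hypothetical names, not the actual implementation):

```python
import numpy as np

# Hedged sketch of the mapping and averaging steps described in this section.
def to_unit_cube(xdata):
    # affine map x -> (x + 1) / 2, componentwise, from [-1,1]^d to [0,1]^d
    return 0.5 * (np.asarray(xdata) + 1.0)

def mc_error_avg(f, ndim, pts, integ_ex, n_runs=10, seed=0):
    # average the absolute errors of n_runs independent Monte Carlo integrations
    rng = np.random.default_rng(seed)
    errs = []
    for _ in range(n_runs):
        x = rng.uniform(-1.0, 1.0, (pts, ndim))   # sampled in [-1,1]^d, as the text notes
        errs.append(abs(f(to_unit_cube(x)).mean() - integ_ex))
    return np.mean(errs)

# check with a function whose integral over [0,1]^d is known:
# prod_i lambda_i integrates to 2^-d
ndim = 5
err = mc_error_avg(lambda lam: np.prod(lam, axis=1), ndim, 4000, 2.0**-ndim)
print(err < 0.01)   # True
```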
Forward Propagation of Uncertainty with PyUQTk

Overview

This example is located in examples/fwd_prop. It contains a pair of Python scripts that propagate uncertainty in input parameters through a heat transfer model using both a Monte Carlo sampling approach and non-intrusive spectral projection (NISP) via quadrature methods.

Theory

Heat Transfer

[Figure: schematic of heat transfer through a glass window, showing the room temperature Ti, the outside temperature To, the atmospheric temperature TA, the window surface temperatures T1 and T2, and the heat flux Q.]

In this example, the heat transfer through a window is calculated using samples of seven uncertain Gaussian parameters. These parameters, along with their means and standard deviations, are defined below.

Parameter | Mean | Standard deviation (%)
Ti: Room temperature | 293 K | 0.5
To: Outside temperature | 273 K | 0.5
dw: Window thickness | 0.01 m | 1
kw: Window conductivity | 1 W/mK | 5
hi: Inner convective heat transfer coefficient | 2 W/m²K | 15
ho: Outer convective heat transfer coefficient | 6 W/m²K | 15
TA: Atmospheric temperature | 150 K | 10

Once we have samples of the 7 parameters, the following forward model is used to calculate the heat flux Q:

Q = h_i(T_i − T_1) = (k_w/d_w)(T_1 − T_2) = h_o(T_2 − T_o) + εσ(T_2⁴ − T_A⁴)

T_1 represents the inner window temperature and T_2 represents the outer window temperature. ε is the emissivity of uncoated glass, which we take to be 0.95, and σ is the Stefan-Boltzmann constant.

Polynomial Chaos Expansion

In this example, the forward propagation requires the representation of the heat flux Q with a multidimensional Polynomial Chaos Expansion (PCE):

Q = Σ_{k=0}^{P} Q_k Ψ_k(ξ_1, ..., ξ_n)

• Q: Random variable represented with the multi-D PCE
• Q_k: PC coefficients
• Ψ_k: Multi-D orthogonal polynomials up to order p
• ξ_i: Gaussian random variables known as the germ
• n: Dimensionality = number of uncertain model parameters
• P + 1: Number of PC terms = (n+p)!/(n!p!)

Non-Intrusive Spectral Projection (NISP)

Having specified a PCE form for our heat flux Q, we must determine the PC coefficients. We do so through a non-intrusive Galerkin projection.
The coefficients are determined using the following formula:

Q_k = ⟨QΨ_k⟩ / ⟨Ψ_k²⟩ = (1/⟨Ψ_k²⟩) ∫ Q Ψ_k(ξ) w(ξ) dξ

In this example we use quadrature methods to evaluate this projection integral and thereby determine the PC coefficients.

Kernel Density Estimation

Once we have a large number of heat flux samples, a kernel density estimation (KDE) is performed to obtain a probability density function (PDF) curve. When performing KDE, the following formula is used to evaluate the PDF at a point Q:

PDF(Q) = (1/(N_s h)) Σ_{i=1}^{N_s} K((Q − Q_i)/h)

The Q_i are samples of the heat flux, N_s is the number of sample points, K is the kernel, a non-negative function, and h > 0 is the bandwidth. In this example we use the Gaussian kernel, K(x) = (1/√(2π)) e^{−x²/2}. The results rely heavily on the choice of bandwidth, and there are many rules that can be used to calculate it. In our example, the built-in SciPy function employed determines the bandwidth automatically using Scott's rule of thumb.

Implementation

The script set consists of two files:

• rad_heat_transfer_atm_pce.py: the main script
• heat_transfer_pce_tools.py: functions called by rad_heat_transfer_atm_pce.py

rad_heat_transfer_atm_pce.py

This script will produce a graph comparing PDFs of the heat flux generated using NISP full and sparse quadrature methods and a Monte Carlo sampling method. Use the command ./rad_heat_transfer_atm_pce.py to run this file.

The top section of the script has some flags that are useful to consider:

• main_verbose: set this parameter to 1 for intermediate print statements, otherwise set it to 0.
• compute_rad: set to True to include radiation; set to False to exclude it. Note that if radiation is used, the nonlinear solver in Python sometimes has a hard time converging, in which case you will see a warning message pop up.
• nord: the order of the PCE.
• ndim: the number of dimensions of the PCE.
• pc_type: indicates the polynomial type and weighting function. The default is "HG", Hermite-Gauss.
• pc_alpha and pc_beta: Free parameters greater than −1. Used with Gamma-Laguerre and Beta-Jacobi PC types.
• param: The parameter used for quadrature point generation. Equal to the number of quadrature points per dimension for full quadrature, or the level for sparse quadrature methods. This parameter is generally set to nord + 1 in order to have the right polynomial exactness.

Monte Carlo Sampling Methods

The script begins by assigning the input parameters and their means and standard deviations. Using this information, a large number of random parameter samples (default is 100,000) is generated. With these samples and our forward model, the function compute_heat_flux then calculates the heat flux assuming that no radiative heat transfer occurs. This simply requires solving a system of three linear equations. Then, using the values of Q, T1, and T2 obtained from compute_heat_flux as initial guesses, the function r_heat_flux calculates the total heat flux (including radiative heat transfer) using the SciPy nonlinear solver optimize.fsolve. Using the heat flux samples, a kernel density estimation is then performed using the function KDE.

NISP Quadrature Methods

After the Monte Carlo sampling process is complete, forward propagation using projection methods takes place, driven by the variables described in the flags section above. While running the file, a statement similar to the following will print, indicating that PC objects are being instantiated.

Instantiating PC Objects
Generating 4^7 = 16384 quadrature points.
Used 4 quadrature points per dimension for initialization.
Generating 4^7 = 16384 quadrature points.
Used 4 quadrature points per dimension for initialization.
Level 0 / 4
Level 1 / 4
Level 2 / 4
Level 3 / 4
Level 4 / 4
Instantiation complete

These PC objects contain all of the information needed about the polynomial chaos expansion, such as the number of dimensions, the order of the expansion, and the sparsity (full or sparse) of the quadrature methods to be implemented.

Next, a NumPy array of quadrature points is generated using the function get_quadpts. Then, the quadrature points in ξ are converted to equivalent samples of the input parameters, taking advantage of the fact that the inputs are assumed to be Gaussian. If we let µ represent the mean of an input parameter, σ represent its standard deviation, and qdpts be a vector of quadrature points, the following equation converts the quadrature points in ξ to equivalent samples of the input parameter:

parameter_samples = µ + σ(qdpts)

Now that we have samples of all the input parameters, these samples are run through the forward model to obtain values of Q, using the function fwd_model. Then the actual Galerkin projection is performed on these function evaluations to obtain the PC coefficients, using the function GalerkinProjection. Next, to create a PDF of the output Q from its PCE, germ samples are generated and the PCE is evaluated at these sample points using the function evaluate_pce. Lastly, using our PCE evaluations, a kernel density estimation is performed using the function KDE. This entire process is done twice, once with full tensor product quadrature points and again with sparse quadrature points.

Printing and Graphing

Next, statements indicating the total number of sampling points used for each forward propagation method will be printed.

Monte Carlo sampling used 100000 points
Full quadrature method used 16384 points
Sparse quadrature method used 6245 points

Finally, a graph is created which displays the three different heat flux PDFs on the same figure. It will be saved under the file name heat_flux_pce.pdf.
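For intuition, the Monte Carlo branch described above can be sketched in plain NumPy (a simplified stand-in for the scripts, not UQTk code): draw Gaussian parameter samples as µ + σξ, solve the no-radiation model, whose three linear equations collapse to a series thermal resistance, and build the PDF with a Gaussian-kernel KDE using Scott's rule.

```python
import numpy as np

rng = np.random.default_rng(0)
nsamples = 20000

def gauss_samples(mu, rel_std):
    # parameter samples = mu + sigma * xi, with xi ~ N(0, 1)
    return mu + (rel_std * mu) * rng.standard_normal(nsamples)

# means and relative standard deviations from the table above
Ti = gauss_samples(293.0, 0.005)   # room temperature, 0.5%
To = gauss_samples(273.0, 0.005)   # outside temperature, 0.5%
dw = gauss_samples(0.01, 0.01)     # window thickness, 1%
kw = gauss_samples(1.0, 0.05)      # window conductivity, 5%
hi = gauss_samples(2.0, 0.15)      # inner convective coefficient, 15%
ho = gauss_samples(6.0, 0.15)      # outer convective coefficient, 15%

# no-radiation case: Q = hi(Ti-T1) = (kw/dw)(T1-T2) = ho(T2-To) is equivalent
# to a series thermal resistance, Q = (Ti - To) / (1/hi + dw/kw + 1/ho)
Q = (Ti - To) / (1.0 / hi + dw / kw + 1.0 / ho)

def kde(grid, samples):
    # PDF(Q) = 1/(Ns*h) * sum_i K((Q - Q_i)/h), Gaussian K, Scott's rule bandwidth
    h = samples.std(ddof=1) * samples.size ** (-1.0 / 5.0)
    u = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * u**2).sum(axis=1) / (samples.size * h * np.sqrt(2.0 * np.pi))

grid = np.linspace(Q.min(), Q.max(), 201)
pdf = kde(grid, Q)
mass = pdf.sum() * (grid[1] - grid[0])   # the estimated PDF integrates to ~1
print(round(mass, 2))                    # 1.0
```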
heat_transfer_pce_tools.py

This script contains several functions called by the rad_heat_transfer_atm_pce.py file.

• compute_heat_flux(Ti,To,dw,kw,hi,ho): This function calculates the heat flux Q, assuming no radiative heat transfer occurs.
  – Ti, To, dw, kw, hi, ho: Sample values (scalars) of the input parameters.

• r_heat_flux(Ti,To,dw,kw,hi,ho,TA,estimates): This function calculates the heat flux Q, assuming radiative heat transfer occurs. The SciPy nonlinear solver optimize.fsolve is employed.
  – Ti, To, dw, kw, hi, ho, TA: Sample values (scalars) of the input parameters.
  – estimates: Estimates of Q, T1, and T2 required to solve the nonlinear system. For these estimates, we use the output of the function compute_heat_flux, which solves the system assuming no radiative heat transfer occurs.

• get_quadpts(pc_model,ndim): This function generates a NumPy array of either full tensor product or sparse quadrature points. It also returns the number of quadrature points, totquat.
  – pc_model: PC object with information about the PCE, including the desired sparsity for quadrature point generation.
  – ndim: Number of dimensions of the PCE.

• fwd_model(Ti_samples,To_samples,dw_samples,kw_samples,hi_samples,ho_samples,verbose): This function returns a NumPy array of evaluations of the forward model. It calls the functions compute_heat_flux and r_heat_flux.
  – Ti_samples, To_samples, dw_samples, kw_samples, hi_samples, ho_samples: 1D NumPy arrays (vectors) of parameter sample values.

• fwd_model_rad(Ti_samples,To_samples,dw_samples,kw_samples,hi_samples,ho_samples,TA_samples,verbose): Same as fwd_model but with radiation enabled, and an extra argument for the samples of TA.

• GalerkinProjection(pc_model,f_evaluations): This function returns a 1D NumPy array with the PC coefficients.
  – pc_model: PC object with information about the PCE.
  – f_evaluations: 1D NumPy array (vector) with the function to be projected, evaluated at the quadrature points.
• evaluate_pce(pc_model,pc_coeffs,germ_samples): This function evaluates the PCE at a set of samples of the germ.
  – pc_model: PC object with information about the PCE.
  – pc_coeffs: 1D NumPy array with the PC coefficients. This array is the output of the function GalerkinProjection.
  – germ_samples: NumPy array with samples of the PCE germ at which the random variable is to be evaluated.

• KDE(fcn_evals): This function performs a kernel density estimation. It returns a NumPy array of points at which the PDF was estimated, as well as a NumPy array of the corresponding estimated PDF values.
  – fcn_evals: A NumPy array of evaluations of the forward model (values of the heat flux Q).

Sample Results

Run the file rad_heat_transfer_atm_pce.py with the default settings. You should expect to see the following print statement, and your graph should look similar to the one found in the figure below.

Monte Carlo sampling used 100000 points
Full quadrature method used 16384 points
Sparse quadrature method used 6245 points

[Figure 5.7: Sample results of rad_heat_transfer_atm_pce.py. PDFs of the total heat flux through a window obtained with the NISP full quadrature method, the NISP sparse quadrature method, and Monte Carlo sampling.]

Now try changing one of the default settings. Find the line that reads nord = 3 and change it to read nord = 2. This will change the order of the PCE to 2, rather than 3. You should expect to see the following print statement, and a sample graph is found in the figure below.
Monte Carlo sampling used 100000 points
Full quadrature method used 2187 points
Sparse quadrature method used 1023 points

[Figure 5.8: Sample results of rad_heat_transfer_atm_pce.py with nord = 2. PDFs of the total heat flux through a window for the three forward propagation methods.]

The user can also change the default pc_type = "HG". For example, to use the Legendre-Uniform PC type, change this line to read pc_type = "LU".

Expanded Forward Propagation of Uncertainty - PyUQTk

Overview

This example contains two pairs of Python files; each pair consists of a main script and a tools file that holds functions called from the main script. They are located in UQTk/examples/window. This example is an expanded version of the previous forward propagation problem; it utilizes a more complex model that provides additional parameters. One pair of scripts produces a probability density function of the heat flux for a twelve-dimensional model, using fourth-order PCEs and sparse quadrature. The other pair produces a probability density function of the heat flux for a five-dimensional model, using fourth-order PCEs; this script compares the results of using sparse and full quadrature, as well as Monte Carlo sampling. It also produces a separate plot showing the spectral decay of the PC coefficients.

Theory

Heat Transfer - Dual Pane Window

The heat flux calculations are implemented using random samples from the normal distribution of each of the uncertain Gaussian parameters. The standard deviations assigned are estimates. These parameters are defined in the table below. The forward model relies on the same set of steady-state heat transfer equations as the first example, with the addition of a combined equation for conduction and convection across the air gap.
This combined term is defined as a conductance, and was implemented in order to provide an alternative to determining the convection coefficient for this region, which can be challenging [2]. The forward model is:

Q = h_i(T_i − T_1) = (k_w/d_w)(T_1 − T_2) = 0.41 (k_a/d_a) (gβρ²d_a³/µ²)^{0.16} |T_2 − T_3|^{0.16} (T_2 − T_3) = (k_w/d_w)(T_3 − T_4) = h_o(T_4 − T_o) + εσ(T_4⁴ − T_s⁴)

Parameter | Definition | Mean Value | Standard Deviation (%)
Ti | Inside temperature | 293 K | 0.5
To | Outside temperature | 273 K | 0.5
Ts | Sky temperature [1] | 263 K | 10
kw | Conduction constant for glass | 1 W/mK | 5
ka | Conduction constant for air | 0.024 W/mK | 5
hi | Inside convection coefficient | 2 W/m²K | 15
ho | Outside convection coefficient | 6 W/m²K | 15
dw | Width of the glass | 5 mm | 1
da | Width of the air gap | 1 cm | 1
µ | Viscosity of air | 1.86x10⁻⁵ kg/m s | 5
ρ | Density of air | 1.29 kg/m³ | 5
β | Thermal expansion coefficient | 3.67x10⁻³ 1/K | 5

T_1 represents the outer-facing surface temperature of the first glass layer, T_2 the inner surface temperature of the first glass layer, T_3 the inner surface temperature of the second glass layer, and T_4 the outer-facing surface temperature of the second glass layer. ε is the emissivity of uncoated glass, which we take to be 0.95, and σ is the Stefan-Boltzmann constant.

[1] Lienhard, John H. IV and Lienhard, John H. V. A Heat Transfer Textbook, 4th edition, pg. 579. Phlogiston Press, 2011.
[2] Rubin, Michael. Calculating Heat Transfer Through Windows. Energy Research, Vol. 6, pg. 341-349, 1982.

Polynomial Chaos Expansion*
Non-Intrusive Spectral Projection (NISP)*
Kernel Density Estimation*

*Please see the previous example.

Implementation

The first set of scripts:

• 5D_fwd_prop.py: the main script
• fwd_prop_tools.py: functions called by 5D_fwd_prop.py

[Figure 5.9: Increase in the number of samples with change in order.]

5D_fwd_prop.py

This script will produce a plot comparing the probability density function of the heat flux using three methods:
Non-intrusive spectral projection with full and sparse quadrature methods, and Monte Carlo sampling. It also produces a plot showing the spectral decay of the PC coefficient magnitudes in relation to the PC order. This plot illustrates how the projection converges to give a precise representation of the model. Changing the order (nord) in this script may alter the results of the second plot, since it has elements that are not dynamically linked. Use the command python 5D_fwd_prop.py to run this file from the window directory.

The second set of scripts:

• highd_sparse.py: the main script
• tools_conductance_dp_pce_wrad.py: functions called by highd_sparse.py

highd_sparse.py

This script will produce the probability density function of the heat flux in twelve dimensions, using only non-intrusive spectral projection with sparse quadrature. Use the command python highd_sparse.py to run this file from the window directory. Adjustments may be made to nord if desired, though the number of points produced increases dramatically with an increase in order, as illustrated by Figure 5.9.

Monte Carlo Sampling Methods

This is very similar to the previous example, with the exception of compute_heat_flux producing T1, T2, T3, T4, and Q. This function consists of a solved set of five linear equations (our forward model neglecting convection in the air gap and radiative heat transfer), which are evaluated for the parameter samples. The values obtained are used as initial inputs for r_heat_flux, which calculates the total heat flux for all heat transfer modes, using the SciPy nonlinear solver optimize.fsolve. Using the heat flux samples, a kernel density estimation is then performed using the function KDE.

NISP Quadrature Methods

Please see the previous example.

Printing and Graphing

Next, statements indicating the total number of sampling points used for each forward propagation method will be printed.
The number of Monte Carlo points is fixed, but the number of points produced by the quadrature methods will vary with the number of uncertain parameters and the order of the PCE.

Monte Carlo sampling used 100000 points
Full quadrature method used xxxxxxx points
Sparse quadrature method used xxxxxxx points

Finally, a graph is created which displays the three different heat flux PDFs on the same figure. It will be saved under the file name heat_flux_pce.pdf.

fwd_prop_tools.py and tools_conductance_dp_pce_wrad.py

These scripts contain several functions called by the 5D_fwd_prop.py and highd_sparse.py files. The five-dimensional example uses the five parameters with the highest uncertainty: Ts, hi, ho, kw, ka.

• compute_heat_flux(Ti,To,dw,da,kw,ka,hi,ho): This function calculates the heat flux Q, assuming no radiative heat transfer occurs and no convective heat transfer occurs in the air gap.
  – Ti, To, dw, da, kw, ka, hi, ho: Sample values (scalars) of the input parameters.

• r_heat_flux(Ti,To,Ts,dw,da,kw,ka,hi,ho,beta,rho,mu,estimates): This function calculates the heat flux Q, assuming radiative heat transfer and convective heat transfer in the air gap occur. The SciPy nonlinear solver optimize.fsolve is employed.
  – Ti, To, Ts, dw, da, kw, ka, hi, ho, beta, rho, mu: Sample values (scalars) of the input parameters.
  – estimates: Estimates of Q, T1, T2, T3, and T4 required to solve the nonlinear system. For these estimates, we use the output of the function compute_heat_flux, which solves the system assuming no radiative heat transfer or air-gap convection occurs.

• get_quadpts(pc_model,ndim)*

• fwd_model(Ti_samples,To_samples,Ts_samples,dw_samples,da_samples,kw_samples,ka_samples,hi_samples,ho_samples,beta_samples,rho_samples,mu_samples): This function returns a NumPy array of evaluations of the forward model. It calls the functions compute_heat_flux and r_heat_flux.
  – Ti_samples, To_samples, Ts_samples, dw_samples, da_samples, kw_samples, ka_samples, hi_samples, ho_samples, beta_samples, rho_samples, mu_samples: 1D NumPy arrays (vectors) of parameter sample values.

• GalerkinProjection(pc_model,f_evaluations)*

• evaluate_pce(pc_model,pc_coeffs,germ_samples)*

• KDE(fcn_evals)*

• get_multi_index(pc_model,ndim): This function computes the multi-indices to be used in the inverse relationship plot.
  – pc_model: Contains information about the PC set.
  – ndim: Number of model dimensions.

• plot_mi_dims(pc_model,c_k,ndim): This function produces the inverse relationship plot using the multi-indices produced by the previous function and the values of the PC coefficients.
  – pc_model: Contains information about the PC set.
  – ndim: Number of model dimensions.
  – c_k: NumPy array of PC coefficients.

*Please see the previous example.

Sample Results

Run the file highd_sparse.py with the default settings. You should expect to see the following print statement, and your graph should look similar to the one found in the figure below. This sample is for a problem with twelve dimensions; the PCE is fourth order. The following will print to the terminal:

Sparse quadrature method used 258681 points

[Figure 5.10: Sample results of highd_sparse.py.]

The next two figures show the two results of the five-dimensional example with a fourth-order PCE. The following will print to the terminal:

Monte Carlo sampling used 100000 points
Full quadrature method used 3125 points
Sparse quadrature method used 10363 points

[Figure 5.11: Sample results of 5D_fwd_prop.py.]

[Figure 5.12: Sample results of 5D_fwd_prop.py.]

Bayesian Inference of a Line

Overview

This example is located in examples/line_infer. It infers the slope and intercept of a line from noisy data using Bayes' rule. The C++ libraries are called directly from the driver program. By changing the likelihood function and the input data, this program can be tailored to other inference problems.
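For intuition about the inference setup, the Gaussian likelihood underlying a line model like this one can be sketched in Python (illustrative, not the C++ implementation): for y = a + bx with Gaussian noise of standard deviation sigma, Bayes' rule combines this likelihood with a prior to give the posterior over (a, b).

```python
import numpy as np

# Hedged sketch: log-likelihood of data {(x_i, y_i)} given line parameters (a, b),
# assuming independent Gaussian noise with known standard deviation sigma.
def log_likelihood(params, x, y, sigma):
    a, b = params
    resid = y - (a + b * x)
    return (-0.5 * np.sum((resid / sigma) ** 2)
            - x.size * np.log(sigma * np.sqrt(2.0 * np.pi)))

# synthetic data, as in the example: a line with added noise
rng = np.random.default_rng(4)
a_true, b_true, sigma = 5.0, -1.0, 0.1
x = np.linspace(-1.0, 1.0, 5)
y = a_true + b_true * x + sigma * rng.standard_normal(x.size)

# the likelihood prefers the true parameters over a strongly perturbed guess
print(log_likelihood((a_true, b_true), x, y, sigma) >
      log_likelihood((a_true + 0.5, b_true), x, y, sigma))   # True
```

Swapping in a different log_likelihood is exactly how the example can be tailored to other inference problems.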
To run an example, type ./line_infer.py directly. This file contains quite a bit of inline documentation about the various settings and methods used. To get a listing of all command-line options, type ./line_infer.py -h. A typical run, with parameters changed from the command line, is as follows:

./line_infer.py --nd 5 --stats

This will run the inference problem with 5 data points, generate plots of the posterior distributions, and generate statistics of the MCMC samples. If no plots are desired, also give the --noplots argument.

More details

After setting a number of default values for the example problem overall, the line_infer.py script sets the proper inference inputs in the file line_infer.xml, starting from a set of defaults in line_infer.xml.templ. The file line_infer.xml is read by the C++ code line_infer.x, which does the actual Bayesian inference. After that, synthetic data is generated, either from a linear or a cosine model, with added noise. Then, the script calls the C++ line inference code line_infer.x to infer the two parameters (slope and intercept) of the line that best fits the artificial data. (Note that one can also run the inference code directly by manually editing the file line_infer.xml and typing the command ./line_infer.x.) The script then reads in the MCMC posterior samples file and performs some postprocessing.
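The postprocessing step can be sketched as follows. For self-containedness the posterior samples below are synthetic stand-ins (in practice they would be read from the MCMC samples file, e.g. with np.loadtxt); the noise level sigma_noise is likewise illustrative.

```python
import numpy as np

# Stand-ins for post-burn-in MCMC samples of the intercept a and slope b;
# in practice these columns come from the samples file written by line_infer.x.
rng = np.random.default_rng(0)
a = 5.0 + 0.05 * rng.normal(size=5000)    # intercept samples (illustrative)
b = -1.0 + 0.06 * rng.normal(size=5000)   # slope samples (illustrative)

# Pushed-forward distribution: evaluate y = a + b*x for every posterior sample
xg = np.linspace(-1.0, 1.0, 201)          # dense grid of x-values
preds = a[:, None] + b[:, None] * xg      # one row per posterior sample
pf_mean = preds.mean(axis=0)
pf_std = preds.std(axis=0)

# Posterior predictive: combine the pushed-forward spread with the noise model
sigma_noise = 0.1                         # assumed known noise magnitude
pp_std = np.sqrt(pf_std**2 + sigma_noise**2)
```

The pushed-forward band reflects only parameter uncertainty; the posterior predictive band additionally folds in the measurement noise, which is why it is always the wider of the two.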
Unless the flag --noplots is specified, the script computes and plots the following:

• The pushed-forward and posterior predictive error bars
  – Generate a dense grid of x-values
  – Evaluate the linear model y = a + bx for all posterior samples (a, b) after the burn-in
  – Pushed-forward distribution: compute the sample mean and standard deviation of y using the sampled models
  – Posterior predictive distribution: combine the pushed-forward distribution with the noise model
• The MCMC chain for each variable, as well as a scatter plot for each pair of variables
• The marginal posterior distribution for each variable, as well as the marginal joint distribution for each pair of variables

If the flag --stats is specified, the following statistics are also computed:

• The mean, MAP (Maximum A Posteriori), and standard deviations of all parameters
• The covariance matrix
• The average acceptance probability of the chain
• The effective sample sizes for each variable in the chain

Sample Results

Figure 5.13: The pushed-forward posterior distribution (dark grey) and posterior predictive distribution (light grey).

Figure 5.14: MCMC chains for parameters a and b, as well as a scatter plot for a and b.

Figure 5.15: Marginal posterior distributions for all variables, as well as marginal joint posteriors.

Surrogate Construction and Sensitivity Analysis

Overview

• Located in examples/uqpc
• A collection of scripts that construct a PC surrogate for a computational model which is specified as a black-box simulation code. Also provides tools for sensitivity analysis of the outputs of this black-box model with respect to its input parameters.
Theory

Consider a function f(λ; x), where λ = (λ_1, ..., λ_d) are the model input parameters of interest, while x ∈ R^r are design parameters with controllable values. For example, x can denote a spatial coordinate or a time snapshot, or it can simply enumerate multiple quantities of interest. Furthermore, we assume known domains for each input parameter, i.e.

    λ_i ∈ [a_i, b_i]    for i = 1, ..., d.    (5.5)

The goal is to build a Polynomial Chaos (PC) surrogate function for each value of the design parameter x, i.e. for l = 1, ..., L,

    f(λ; x_l) ≈ g(λ; x_l) = Σ_{k=0}^{K−1} c_{kl} Ψ_k(ξ)    (5.6)

with respect to the scaled inputs

    ξ_i = (λ_i − (a_i + b_i)/2) / ((b_i − a_i)/2) ∈ [−1, 1],    i = 1, ..., d.    (5.7)

Here Ψ_k(ξ) = Ψ_k(ξ_1, ..., ξ_d) are multivariate normalized Legendre polynomials, defined as products of univariate normalized Legendre polynomials ψ_{k_i}(ξ_i) as follows:

    Ψ_k(ξ) = ψ_{k_1}(ξ_1) ··· ψ_{k_d}(ξ_d).    (5.8)

A typical truncation rule in (5.6) is defined according to the total order of the basis terms, i.e. only polynomials with total order ≤ p are retained for some positive integer order p, implying |k_1| + ··· + |k_d| ≤ p and K = (d + p)!/(d! p!). The scalar index k simply counts the multi-indices (k_1, ..., k_d).

After computing the PC coefficients c_{kl}, one can extract the global sensitivity information, also called Sobol indices or variance-based decomposition. For example, the main sensitivity index with respect to dimension i (or variable ξ_i) is

    S_i(x_l) = Σ_{k ∈ I_i} c_{kl}² / Σ_{k=1}^{K−1} c_{kl}²,    (5.9)

where I_i is the set of indices of basis terms that involve only the variable ξ_i, i.e. the one-dimensional monomials ψ_1(ξ_i), ψ_2(ξ_i), .... In other words, these are the basis terms corresponding to multi-indices whose only non-zero entry is at the i-th location. The two generic methods for finding the PC coefficients c_{kl} are detailed below.
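For a single design-parameter value, Eq. (5.9) maps PC coefficients directly to main-effect sensitivities. A minimal NumPy sketch, with a hypothetical helper main_sobol and a toy multi-index set (not the pce_sens app):

```python
import numpy as np

def main_sobol(c, mindex):
    """Main-effect Sobol indices from normalized-PC coefficients, Eq. (5.9)."""
    c, mindex = np.asarray(c), np.asarray(mindex)
    total_var = np.sum(c[1:] ** 2)     # PCE variance: sum of squares over k >= 1
    S = np.empty(mindex.shape[1])
    for i in range(mindex.shape[1]):
        # keep basis terms whose only nonzero multi-index entry is in dimension i
        only_i = (mindex[:, i] > 0) & (mindex.sum(axis=1) == mindex[:, i])
        S[i] = np.sum(c[only_i] ** 2) / total_var
    return S

# Toy example: d = 2, total order p = 2, so K = (2+2)!/(2!2!) = 6 basis terms
mindex = np.array([[0, 0], [1, 0], [0, 1], [2, 0], [1, 1], [0, 2]])
c = np.array([1.0, 0.5, 0.3, 0.1, 0.2, 0.0])
S = main_sobol(c, mindex)
```

The indices need not sum to one; the remainder here is the joint contribution carried by the mixed term with multi-index [1, 1].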
Projection

The basis orthonormality enables the projection formula

    c_{kl} = ∫_Ω f(λ(ξ); x_l) Ψ_k(ξ) (1/2^d) dξ,    (5.10)

where Ω = [−1, 1]^d and λ(ξ) simply denotes the linear scaling relation in (5.7). The projection integral can be taken by

• Quadrature integration

      c_{kl} ≈ Σ_{q=1}^{Q} w^{(q)} f(λ(ξ^{(q)}); x_l) Ψ_k(ξ^{(q)}),    (5.11)

  where ξ^{(q)} and w^{(q)} are Gauss-Legendre quadrature points and weights in the d-dimensional Ω = [−1, 1]^d.

• Monte Carlo integration [to be implemented]

      c_{kl} ≈ (1/M) Σ_{m=1}^{M} f(λ(ξ^{(m)}); x_l) Ψ_k(ξ^{(m)}),    (5.12)

  where ξ^{(m)} are uniformly random points in the d-dimensional Ω = [−1, 1]^d.

Inference [to be implemented]

Implementation

The script set consists of three files:

• uq_pc.py : the main script
• model.py : black-box example model
• plot.py : plotting

The apps employed are

• generate_quad : Quadrature point/weight generation
• gen_mi : PC multi-index generation
• pce_resp : Projection via quadrature integration
• pce_sens : Sensitivity extraction
• pce_eval : PC evaluation

Main script: The syntax of the main script is

% uq_pc.py -r <regime> -p <domain_file> -m <method> -o <order> -s <sam_method> -n <nsam> -v <nval>

One can also run uq_pc.py -h for help in the terminal and to check the defaults. The list of arguments:

• -r <regime> : The regime in which the workflow is employed. The options are
  online : The black-box model, which is in model.py, is run directly as the parameter ensemble becomes available. Users can provide their own model.py or simply use the current one as an example.
  offline_prep : Prepare the input parameter ensemble and store it in ptrain.dat and, if validation is requested, pval.dat. The user then should run the model (model.py ptrain.dat ytrain.dat and perhaps model.py pval.dat yval.dat) in order to provide the ensemble output for the offline_post stage.
  offline_post : Postprocess the output ensemble, assuming the model has been run offline with the input ensemble provided in the offline_prep stage, producing ytrain.dat and, if validation is requested, yval.dat. The rest of the arguments should remain the same as in the offline_prep stage.
• -p <domain_file> : A file with d rows and 2 columns, where d is the number of parameters and each row contains the lower and upper bounds of the corresponding parameter.
• -m <method> : The method for finding the PC surrogate coefficients. The options are
  proj : Projection method outlined in (5.10) and (5.11).
  lsq : Least squares.
  bcs : Bayesian compressive sensing.
• -o <order> : Total order p of the requested PC surrogates.
• -s <sam_method> : The input parameter sampling method. The options are
  Q : Quadrature points. This sampling scheme works with the projection method only, described in (5.11).
  U : Uniformly random points. To be implemented.
• -n <nsam> : Number of samples requested if sam_method=U, or the number of quadrature points per dimension if sam_method=Q.
• -v <nval> : Number of uniformly random samples generated for PC surrogate validation; can be equal to 0 to skip validation.

The generated output files are:

• allsens.dat and allsens_sc.dat : The main-effect sensitivity indices in an L × d format, where each row corresponds to a single value of the design parameter and each column corresponds to the sensitivity index of a parameter. The second file stores the same results, except that the sensitivity indices are scaled to sum to one for each design parameter value.
• results.pk : Python pickle file that includes surrogate, sensitivity, and error information. It is loaded by the plotting utilities.
• *.log : Log files of the apps that have been employed, for reference.
• mi.dat : PC multi-index, for reference, in a matrix form of size K × d; see (5.8).
• designPar.dat : A file containing values of the design parameters, for reference.

Black-box model: The syntax of the model script is

% model.py <input_file> <output_file>

One can also run model.py -h for help in the terminal and to check the defaults. The list of arguments:

• <input_file> : A file with N rows and d columns that stores the input parameter ensemble of N samples.
• <output_file> : The file where the output is stored, with N rows (number of input parameter samples) and L columns (number of outputs, or number of design parameter values).

model.py is the file that needs to be modified/provided by the user according to the model under study. Currently, a simple example function

    f(λ; x) = Σ_{i=1}^{d} (λ_i + λ_i²)/(ix + 1) Σ_{i=1}^{d} λ_i

is implemented, which also produces the file designPar.dat for design parameters x_l = l for l = 0, ..., 6. The default dimensionality is set to d = 3. A user-created black-box model.py should accept an input parameter file and should produce an output file with the formats described above, in order to be consistent with the expected I/O.

Visualization: The syntax of the plotting script is

% plot.py <plot_type>

The user is encouraged to enhance or change the visualization scripts. The current defaults, explained below, are simply for illustration purposes. The options for the only argument are

sens : Plots the sensitivity information.
dm : Plots model-vs-data for one of the values of the design parameter.
idm : Plots model and data values on the same axis, for one of the values of the design parameter.

For the visualization script above, make sure the PYTHONPATH environment variable contains the directory UQTK_INS.

Sample run: In order to run a simple example, use the prepared input domain file pdomain_3d.dat, which is the default domain file, and run

• Online mode: uq_pc.py -r online -v 111
• Offline mode: uq_pc.py -r offline_prep -v 111, followed by the model evaluations model.py ptrain.dat ytrain.dat and model.py pval.dat yval.dat, and the postprocessing stage uq_pc.py -r offline_post -v 111

After finishing, run plot.py sens, plot.py dm, or plot.py idm to visualize some results.
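For intuition, the projection rule (5.10)-(5.11) can be sketched in one dimension with NumPy's Gauss-Legendre routines; the helper names and the toy model below are illustrative stand-ins, not the UQTk apps. The factor 1/2 is the uniform measure on [−1, 1], i.e. the one-dimensional case of 1/2^d.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval

def psi(k, xi):
    """Normalized Legendre polynomial of degree k: E[psi_j psi_k] = delta_jk
    under the uniform density 1/2 on [-1, 1]."""
    coef = np.zeros(k + 1)
    coef[k] = 1.0
    return np.sqrt(2 * k + 1) * legval(xi, coef)

def project(f, p, nquad):
    """c_k = (1/2) sum_q w_q f(xi_q) psi_k(xi_q) for k = 0..p, cf. (5.11)."""
    xi, w = leggauss(nquad)          # Gauss-Legendre points/weights on [-1, 1]
    fx = f(xi)
    return np.array([0.5 * np.sum(w * fx * psi(k, xi)) for k in range(p + 1)])

# Toy model of the scaled input xi; it is a degree-2 polynomial, so the
# projected coefficients beyond k = 2 should vanish
f = lambda xi: 1.0 + 2.0 * xi + 0.5 * xi**2
c = project(f, p=4, nquad=8)
```

Because the quadrature is exact for polynomials of this degree, c reproduces the exact projection coefficients, and c[0] equals the mean of f over [−1, 1].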
Global Sensitivity Analysis via Sampling

Overview

• Located in PyUQTk/sens
• A collection of Python functions that generate input samples for black-box models, followed by functions that post-process the model outputs to generate total, first-order, and joint effect Sobol indices

Theory

Let X = (X_1, ..., X_n) : Ω → X ⊂ IR^n be an n-dimensional random variable in L²(Ω, S, P) with probability density X ∼ p_X(x). Let x = (x_1, ..., x_n) ∈ X be a sample drawn from this density, with X = X_1 ⊗ X_2 ⊗ ··· ⊗ X_n, where X_i ⊂ IR is the range of X_i. Let X_{−i} = (X_1, ..., X_{i−1}, X_{i+1}, ..., X_n) : Ω → X_{−i} ⊂ IR^{n−1}, where X_{−i} ∼ p_{X_{−i}|X_i}(x_{−i}|x_i) = p_X(x)/p_{X_i}(x_i), p_{X_i}(x_i) is the marginal density of X_i, x_{−i} = (x_1, ..., x_{i−1}, x_{i+1}, ..., x_n), and X_{−i} = X_1 ⊗ ··· ⊗ X_{i−1} ⊗ X_{i+1} ⊗ ··· ⊗ X_n.

Consider a function Y = f(X) : Ω → IR, with Y ∈ L²(Ω, S, P). Further, let Y ∼ p_Y(y), with y = f(x). Given that the variance of f is finite, one can employ the law of total variance¹,² to decompose the variance of f as

    V[f] = V_{x_i}[E[f|x_i]] + E_{x_i}[V[f|x_i]].    (5.13)

The conditional mean, E[f|x_i] ≡ E[f(X)|X_i = x_i], and conditional variance, V[f|x_i] = V[f(X)|X_i = x_i], are defined as

    ⟨f⟩_{−i} ≡ E[f|x_i] = ∫_{X_{−i}} f(x) p_{X_{−i}|X_i}(x_{−i}|x_i) dx_{−i}    (5.14)

    V[f|x_i] = E[(f − ⟨f⟩_{−i})² | x_i]
             = E[(f² − 2f⟨f⟩_{−i} + ⟨f⟩_{−i}²) | x_i]
             = E[f²|x_i] − 2⟨f⟩_{−i}⟨f⟩_{−i} + ⟨f⟩_{−i}²
             = ∫_{X_{−i}} f(x)² p_{X_{−i}|X_i}(x_{−i}|x_i) dx_{−i} − ⟨f⟩_{−i}²    (5.15)

The terms on the rhs of Eq. (5.13) can be written as

    V_{x_i}[E[f|x_i]] = E_{x_i}[(E[f|x_i] − E_{x_i}[E[f|x_i]])²]
                      = E_{x_i}[(E[f|x_i] − f_0)²]
                      = E_{x_i}[(E[f|x_i])²] − f_0²
                      = ∫_{X_i} E[f|x_i]² p_{X_i}(x_i) dx_i − f_0²    (5.16)

where f_0 = E[f] = E_{x_i}[E[f|x_i]] is the expectation of f, and

    E_{x_i}[V[f|x_i]] = ∫_{X_i} V[f|x_i] p_{X_i}(x_i) dx_i.    (5.17)

The ratio

    S_i = V_{x_i}[E[f|x_i]] / V[f]    (5.18)

is called the first-order Sobol index [32], and

    S_{−i}^T = E_{x_i}[V[f|x_i]] / V[f]    (5.19)

is the total effect Sobol index for x_{−i}. Using Eq.
(5.13), the sum of the two indices defined above is

    S_i + S_{−i}^T = S_{−i} + S_i^T = 1.    (5.20)

Joint Sobol indices S_{ij} are defined as

    S_{ij} = V_{x_i,x_j}[E[f|x_i, x_j]] / V[f] − S_i − S_j    (5.21)

for i, j = 1, 2, ..., n and i ≠ j. S_i can be interpreted as the fraction of the variance in the model f that can be attributed to the i-th input parameter only, while S_{ij} is the variance fraction that is due to the joint contribution of the i-th and j-th input parameters. S_i^T measures the fractional contribution to the total variance due to parameter x_i and its interactions with all other model parameters.

¹ en.wikipedia.org/wiki/Law_of_total_variance
² en.wikipedia.org/wiki/Law_of_total_expectation

The Sobol indices are numerically estimated using Monte Carlo (MC) algorithms proposed by Saltelli [24] and Kucherenko et al. [16]. Let x^k = (x_1, ..., x_n)^k be a sample of X drawn from p_X. Let x'^k_{−i} be a sample from the conditional distribution p_{X_{−i}|X_i}(x'_{−i}|x^k_i), and x''^k_i a sample from the conditional distribution p_{X_i|X_{−i}}(x''_i|x^k_{−i}). The expectation f_0 = E[f] and variance V = V[f] are estimated using the x^k samples as

    f_0 ≈ (1/N) Σ_{k=1}^{N} f(x^k),    V ≈ (1/N) Σ_{k=1}^{N} f(x^k)² − f_0²,    (5.22)

where N is the total number of samples. The first-order Sobol indices S_i are estimated as

    S_i ≈ (1/V) [ (1/N) Σ_{k=1}^{N} f(x^k) f(x'^k_{−i} ∪ x^k_i) − f_0² ].    (5.23)

The joint Sobol indices are estimated as

    S_{ij} ≈ (1/V) [ (1/N) Σ_{k=1}^{N} f(x^k) f(x'^k_{−(i,j)} ∪ x^k_{i,j}) − f_0² ] − S_i − S_j.    (5.24)

For S_i^T, UQTk offers two alternative MC estimators. In the first approach, S_i^T is estimated as

    S_i^T = 1 − S_{−i} ≈ 1 − (1/V) [ (1/N) Σ_{k=1}^{N} f(x^k) f(x''^k_i ∪ x^k_{−i}) − f_0² ].    (5.25)

In the second approach, S_i^T is estimated as

    S_i^T ≈ (1/(2V)) (1/N) Σ_{k=1}^{N} [ f(x^k) − f(x^k_{−i} ∪ x''^k_i) ]²
    (5.26)

Implementation

The directory pyUQTk/sensitivity contains two Python files:

• gsalib.py : a set of Python functions implementing the MC sampling and estimators for Sobol indices
• gsatest.py : a workflow illustrating the computation of Sobol indices for a toy problem

gsalib.py implements the following functions:

• genSpl_Si(nspl,ndim,abrng,**kwargs) : generates samples for Eq. (5.23). The input parameters are as follows:
  nspl : number of samples N
  ndim : dimensionality n of the input parameter space
  abrng : a 2-dimensional array, n × 2, containing the range for each component x_i
  The following optional parameters can also be specified:
  splout : name of the ASCII output file for MC samples
  matfile : name of the binary output file for select MC samples. These samples are used in subsequent calculations of joint Sobol indices.
  verb : verbosity level
  nd : number of significant digits for the ASCII output
  The default values for the optional parameters are listed in gsalib.py.
• genSens_Si(modeval,ndim,**kwargs) : computes first-order Sobol indices using Eq. (5.23). The input parameters are as follows:
  modeval : name of the ASCII file with model evaluations
  ndim : dimensionality n of the input parameter space
  The following optional parameter can also be specified:
  verb : verbosity level
  The default value for the optional parameter is listed in gsalib.py.
• genSpl_SiT(nspl,ndim,abrng,**kwargs) : generates samples for Eqs. (5.25)-(5.26). The input parameters are as follows:
  nspl : number of samples N
  ndim : dimensionality n of the input parameter space
  abrng : a 2-dimensional array, n × 2, containing the range for each component x_i
  The following optional parameters can also be specified:
  splout : name of the ASCII output file for MC samples
  matfile : name of the binary output file for select MC samples.
These samples are used in subsequent calculations of Sobol indices.
  verb : verbosity level
  nd : number of significant digits for the ASCII output
  The default values for the optional parameters are listed in gsalib.py.
• genSens_SiT(modeval,ndim,**kwargs) : computes total Sobol indices using either Eq. (5.25) or Eq. (5.26). The input parameters are as follows:
  modeval : name of the ASCII file with model evaluations
  ndim : dimensionality n of the input parameter space
  The following optional parameters can also be specified:
  type : specifies whether to use Eq. (5.25), for type = "type1", or Eq. (5.26), for type ≠ "type1"
  verb : verbosity level
  The default values for the optional parameters are listed in gsalib.py.
• genSpl_Sij(ndim,**kwargs) : generates samples for Eq. (5.24). The input parameter is as follows:
  ndim : dimensionality n of the input parameter space
  The following optional parameters can also be specified:
  splout : name of the ASCII output file for MC samples
  matfile : name of the binary output file for select MC samples saved by genSpl_Si
  verb : verbosity level
  nd : number of significant digits for the ASCII output
  The default values for the optional parameters are listed in gsalib.py.
• genSens_Sij(sobolSi,modeval,**kwargs) : computes joint Sobol indices using Eq. (5.24). The input parameters are as follows:
  sobolSi : array with values of the first-order Sobol indices S_i
  modeval : name of the ASCII file with model evaluations
  The following optional parameter can also be specified:
  verb : verbosity level
  The default value for the optional parameter is listed in gsalib.py.

gsatest.py provides the workflow for the estimation of Sobol indices for a simple model given by

    f(x_1, x_2, ..., x_n) = Σ_{i=1}^{n} x_i + Σ_{i=1}^{n−1} i² x_i x_{i+1}.    (5.27)

In the example provided in this file, n (ndim in the file) is set equal to 4, and the number of samples N (nspl in the file) to 10^4. Figures 5.16 and 5.17 show results based on an ensemble of 10 runs.
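For illustration, the first-order estimator (5.22)-(5.23) can be sketched directly in NumPy for the model (5.27) with independent uniform inputs on [0, 1]; this is a stand-alone sketch, not the gsalib.py implementation, and the sample count is illustrative.

```python
import numpy as np

def f(x):
    """Toy model of Eq. (5.27), vectorized over the rows of x."""
    i = np.arange(1, x.shape[1])
    return x.sum(axis=1) + (i**2 * x[:, :-1] * x[:, 1:]).sum(axis=1)

rng = np.random.default_rng(2017)
n, N = 4, 100000
A = rng.uniform(size=(N, n))             # base samples x^k
B = rng.uniform(size=(N, n))             # independent resamples x'^k

fA = f(A)
f0 = fA.mean()                           # Eq. (5.22)
V = (fA**2).mean() - f0**2

# Eq. (5.23): resample every column except the i-th
S = np.empty(n)
for i in range(n):
    C = B.copy()
    C[:, i] = A[:, i]                    # x'^k_{-i} joined with x^k_i
    S[i] = ((fA * f(C)).mean() - f0**2) / V
```

For this model x_3 carries the largest main effect, since it enters the two strongest interaction terms (4 x_2 x_3 and 9 x_3 x_4).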
To generate these results, run the example workflow:

python gsatest.py

Figure 5.16: First-order (left frame) and joint (right frame) Sobol indices for the model given in Eq. (5.27). The black circles show the theoretical values, computed analytically, and the error bars correspond to ±σ computed based on an ensemble of 10 runs.

Figure 5.17: Total-order Sobol indices for the model given in Eq. (5.27). The red bars show results based on Eq. (5.25) while the yellow bars are based on Eq. (5.26). The black circles show the theoretical values, computed analytically, and the error bars correspond to ±σ computed based on an ensemble of 10 runs. For this model, Eq. (5.26) provides more accurate estimates for S_i^T compared to results based on Eq. (5.25).

Karhunen-Loève Expansion of a Stochastic Process

• Located in examples/kle_ex1
• Some examples of the construction of 1D and 2D Karhunen-Loève (KL) expansions of a Gaussian stochastic process, based on sample realizations of this stochastic process.

Theory

Assume the stochastic process F(x, ω) : D × Ω → R is an L² random field on D, with covariance function C(x, y). Then F can be written as

    F(x, ω) = ⟨F(x, ω)⟩_ω + Σ_{k=1}^{∞} √λ_k f_k(x) ξ_k,    (5.28)

where f_k(x) are eigenfunctions of C(x, y) and λ_k are the corresponding eigenvalues (all positive). The random variables ξ_k are uncorrelated, with unit variance. Projecting realizations of F onto f_k leads to samples of ξ_k. These samples are generally not independent. In the special case when F is a Gaussian random process, the ξ_k are i.i.d. normal random variables.

The KL expansion is optimal, i.e. of all possible orthonormal bases for L²(D × Ω), the set {f_k(x) | k = 1, 2, ...} above minimizes the mean-square error in a finite linear representation of F(·). If known, the covariance matrix can be specified analytically, e.g. the square-exponential form
    C(x, y) = exp(−|x − y|² / c_l²),    (5.29)

where |x − y| is the distance between x and y and c_l is the correlation length. The covariance matrix can also be estimated from realizations, e.g.

    C(x, y) = (1/N_ω) Σ_ω (F(x, ω) − ⟨F(x, ω)⟩_ω)(F(y, ω) − ⟨F(y, ω)⟩_ω),    (5.30)

where N_ω is the number of samples and ⟨F(x, ω)⟩_ω is the mean over the random field realizations at x. The eigenvalues and eigenvectors in Eq. (5.28) are solutions of the Fredholm equation of the second kind:

    ∫ C(x, y) f(y) dy = λ f(x).    (5.31)

One can employ the Nystrom algorithm [19] to discretize the integral on the left-hand side of Eq. (5.31):

    Σ_{i=1}^{Np} w_i C(x, y_i) f(y_i) = λ f(x).    (5.32)

Here the w_i are the weights for the quadrature rule that uses the Np points y_i where realizations are provided. In a 1D configuration, one can employ the weights corresponding to the trapezoidal rule:

    w_i = (y_2 − y_1)/2            if i = 1,
    w_i = (y_{i+1} − y_{i−1})/2    if 2 ≤ i < Np,
    w_i = (y_{Np} − y_{Np−1})/2    if i = Np.    (5.33)

After further manipulation, Eq. (5.32) is written as

    A g = λ g,

where A = W K W and g = W f, with W being the diagonal matrix W_ii = √w_i and K_ij = C(x_i, y_j). Since the matrix A is symmetric, one can employ efficient algorithms to compute its eigenvalues λ_k and eigenvectors g_k. Currently, UQTk relies on the dsyevx function provided by the LAPACK library. The KL eigenvectors are computed as f_k = W^{−1} g_k, and samples of the random variables ξ_k are obtained by projecting realizations of the random process F on the eigenmodes f_k:

    ξ_k|ω_l = ⟨F(x, ω_l) − ⟨F(x, ω)⟩_ω , f_k(x)⟩_x / √λ_k.

Numerically, these projections can be estimated via quadrature:

    ξ_k|ω_l = Σ_{i=1}^{Np} w_i (F(x_i, ω_l) − ⟨F(x_i, ω)⟩_ω) f_k(x_i) / √λ_k.    (5.34)

If F is a Gaussian process, the ξ_k are i.i.d. normal RVs, i.e. they automatically have first-order Wiener-Hermite Polynomial Chaos Expansions (PCE). In general, however, the KL RVs can be converted to PCEs (not shown in the current example).

1D Examples

In this section we present 1D RFs generated with kl_1D.x.
The RFs are generated on a non-uniform 1D grid, with smaller grid spacing near x = 0 and larger grid spacing towards x = 1. This grid is computed using an algebraic expression [14]:

    x_i = L (β + 1 − (β − 1) r_i) / (r_i + 1),    r_i = ((β + 1)/(β − 1))^{1−η_i},    η_i = (i − 1)/(Np − 1),    i = 1, 2, ..., Np.    (5.35)

The factor β > 1 in the above expression controls the compression near x = 0: it results in higher compression as β gets closer to 1. The examples shown in this section are based on the default values of the parameters that control the grid definition in kl_1D.x: β = 1.1, L = 1, Np = 129.

Figure 5.18 shows sample realizations of 1D random fields (RF) generated with a square-exponential covariance matrix employing several correlation lengths c_l. These figures were generated with

./mkplots.py samples 0.05
./mkplots.py samples 0.10
./mkplots.py samples 0.20

Figure 5.18: Sample 1D random field realizations for several correlation lengths c_l.

Once the RF realizations are generated, the covariance matrix is discarded and a "numerical" covariance matrix is estimated based on the available realizations. Figure 5.19 shows a shaded illustration of covariance matrices computed using several sets of 1D RF samples. These figures were generated with

./mkplots.py numcov 0.05 512
./mkplots.py numcov 0.20 512
./mkplots.py numcov 0.05 8192
./mkplots.py numcov 0.20 8192
./mkplots.py numcov 0.05 131072
./mkplots.py numcov 0.20 131072

These matrices employ RF samples generated on a non-uniform grid with a higher density of points near the left boundary. Hence, the matrix entries near the diagonal in the upper right corner show larger values. The grid points grow further apart away from the left boundary, hence the region near the diagonal grows thinner for these grid points.
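The pipeline of Eqs. (5.30)-(5.33) — empirical covariance from realizations, trapezoidal weights, and the symmetric eigenproblem Ag = λg with A = WKW — can be sketched in NumPy as follows. This is an illustration only (UQTk itself uses LAPACK's dsyevx); the grid, correlation length, and sample count below are illustrative, and the realizations are generated synthetically from the analytical covariance.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic realizations of a Gaussian RF with square-exponential covariance,
# Eq. (5.29); grid size, correlation length, and sample count are illustrative
Np, Nw, cl = 129, 8192, 0.2
x = np.linspace(0.0, 1.0, Np)
C_true = np.exp(-((x[:, None] - x[None, :]) ** 2) / cl**2)
Lc = np.linalg.cholesky(C_true + 1e-8 * np.eye(Np))  # jitter for stability
F = Lc @ rng.normal(size=(Np, Nw))       # one realization per column

# "Numerical" covariance matrix, Eq. (5.30)
Fc = F - F.mean(axis=1, keepdims=True)
K = Fc @ Fc.T / Nw

# Trapezoidal quadrature weights, Eq. (5.33)
w = np.empty(Np)
w[0] = (x[1] - x[0]) / 2
w[1:-1] = (x[2:] - x[:-2]) / 2
w[-1] = (x[-1] - x[-2]) / 2

# Symmetric eigenproblem A g = lambda g with A = W K W, W_ii = sqrt(w_i)
sqw = np.sqrt(w)
Amat = sqw[:, None] * K * sqw[None, :]
lam, g = np.linalg.eigh(Amat)
lam, g = lam[::-1], g[:, ::-1]           # sort eigenpairs in descending order
f_modes = g / sqw[:, None]               # KL eigenvectors f_k = W^{-1} g_k
```

The resulting modes satisfy the discrete orthonormality Σ_i w_i f_j(x_i) f_k(x_i) = δ_jk, and samples of ξ_k then follow from the projection (5.34).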
Figure 5.20 shows the eigenvalue solution of the Fredholm equation (5.31), in its discretized form given by Eq. (5.32). This figure was generated with

./mkplots.py pltKLeig1D 512 131072

For this 1D example problem, 2^9 = 512 RF realizations are sufficient to estimate the KLE eigenvalue spectrum. As the correlation length decreases, the eigenvalues decrease more slowly, suggesting that more terms are needed to represent the RF fluctuations.

Figure 5.19: Illustration of covariance matrices computed from 1D RF realizations. Red corresponds to large values, close to 1, while blue corresponds to small values, close to 0.

Figure 5.21 shows the first four KL eigenvectors corresponding to c_l = 0.05, scaled by the square root of the corresponding eigenvalue. These plots were generated with

./mkplots.py numKLevec 0.05 512 on
./mkplots.py numKLevec 0.05 8192 off
./mkplots.py numKLevec 0.05 131072 off

Unlike the eigenvalue spectrum, the eigenvectors are very sensitive to the covariance matrix entries. For c_l = 0.05, a large number of RF realizations, e.g. N_ω = 2^17 in Fig. 5.21c, is required for computing a covariance matrix with KL modes that are close to the ones based on the analytical covariance matrix (analytical modes not shown).

Figure 5.22 shows the first four KL eigenvectors corresponding to c_l = 0.20, scaled by the square root of the corresponding eigenvalue. These plots were generated with

./mkplots.py numKLevec 0.20 512 on
./mkplots.py numKLevec 0.20 8192 off
./mkplots.py numKLevec 0.20 131072 off

For larger correlation lengths, a smaller number of samples is sufficient to estimate the covariance matrix and subsequently the KL modes. The results based on N_ω = 2^13 = 8192 RF realizations, in Fig. 5.22b, are close to the ones based on a much larger number of realizations, N_ω = 2^17 = 131072, in Fig. 5.22c.
One can explore the orthogonality of the KLE modes to compute samples of the germ ξ_k introduced in Eq. (5.28). These samples are computed via Eq. (5.34) and are saved in files xidata* in the corresponding run directories. Using the ξ samples, one can estimate their density via a Kernel Density Estimate (KDE); the results are shown in Figures 5.23 and 5.24. These figures were generated with

./mkplots.py xidata 0.05 512
./mkplots.py xidata 0.05 131072
./mkplots.py xidata 0.20 512
./mkplots.py xidata 0.20 131072

Figure 5.20: KL eigenvalues estimated with two sets of RF realizations: 2^9 = 512 (dashed lines) and 2^17 = 131072 (solid lines).

Figure 5.21: Illustration of the first 4 KL modes, computed based on numerical covariance matrices using three sets of RF realizations with c_l = 0.05.

Independent of the correlation length, a relatively large number of samples is required for "converged" estimates of the density of ξ. Figures 5.25 and 5.26 show reconstructions of select RF realizations. As observed in the figure showing the decay in the magnitude of the KL eigenvalues, more terms are needed to represent the small-scale features occurring for smaller correlation lengths, in Fig. 5.25, compared to RFs with larger correlation lengths, e.g. the example shown in Fig. 5.26. The plots shown in Figs.
5.25 and 5.26 were generated with

./mkplots.py pltKLrecon1D 0.05 21 51 10
./mkplots.py pltKLrecon1D 0.10 63 21 4

Figure 5.22: Illustration of the first 4 KL modes, computed based on numerical covariance matrices using three sets of RF realizations with c_l = 0.20.

Figure 5.23: Probability densities for ξ_k obtained via KDE using samples obtained by projecting RF realizations onto the KL modes. Results correspond to c_l = 0.05.

2D Examples on Structured Grids

In this section we present 2D RFs generated with kl_2D.x. The RFs are generated on a non-uniform structured 2D grid [0, L_x] × [0, L_y], with smaller grid spacing near the boundaries and larger grid spacing towards the center of the domain. This grid is computed using an algebraic expression [14]. The first coordinate is computed via

    x_i = L_x ((2α + β) r_i + 2α − β) / ((2α + 1)(1 + r_i)),    r_i = ((β + 1)/(β − 1))^{(η_i − α)/(1 − α)},    η_i = (i − 1)/(N_x − 1),    i = 1, 2, ..., N_x.    (5.36)

Figure 5.24: Probability densities for ξ_k obtained via KDE using samples obtained by projecting RF realizations onto the KL modes. Results correspond to c_l = 0.20.
Figure 5.25: Reconstructing realizations with an increasing number of KL expansion terms for c_l = 0.05.

The factor β > 1 in the above expression controls the compression near x = 0 and x = L_x, while α ∈ [0, 1] determines where the clustering occurs. The examples shown in this section are based on the default values of the parameters that control the grid definition in kl_2D.x: α = 1/2, β = 1.1, L_{x_1} = L_{x_2} = L = 1, N_{x_1} = N_{x_2} = 65. Figure 5.27 shows the 2D computational grid created with these parameters. This figure was generated with the Python script pl2Dsgrid.py:

./pl2Dsgrid.py cvspl2D_0.1_4096

Figure 5.28 shows 2D RF realizations with correlation lengths c_l = 0.1 and c_l = 0.2. As the correlation length increases, the realizations become smoother. These figures were generated with

./mkplots.py samples2D 0.1 4096 2 (Figs. 5.28a, 5.28b)
./mkplots.py samples2D 0.2 4096 2 (Figs. 5.28c, 5.28d)

Figure 5.26: Reconstructing realizations with an increasing number of KL expansion terms for c_l = 0.10.

In a 2D configuration, the rhs of Eq.
(5.31) is discretized using a 2D finite volume approach:

    ∫ Cov(x, y) f(y) dy ≈ Σ_{i=1}^{N_{x_1}−1} Σ_{j=1}^{N_{x_2}−1} (Cov(x, y) f(y))|_{ij} A_{ij}.    (5.37)

Here, A_{ij} is the area of the rectangle (ij) with lower left corner (i, j) and upper right corner (i + 1, j + 1), and (Cov(x, y) f(y))|_{ij} is the average over the rectangle (ij), computed as the arithmetic average of the values at its four vertices. Eq. (5.37) can be further cast as

    ∫ Cov(x, y) f(y) dy ≈ Σ_{i=1}^{N_{x_1}} Σ_{j=1}^{N_{x_2}} (Cov(x, y) f(y))_{i,j} w_{i,j},    (5.38)

where w_{i,j} is a quarter of the area of all rectangles that surround vertex (i, j).

Figures 5.29 and 5.30 show the first 8 KL modes computed based on covariance matrices that were estimated from 2^12 = 4096 and 2^16 = 65536 RF samples, respectively, with correlation length c_l = 0.1 for both sets. The results in Fig. 5.30 are close to the KL modes corresponding to the analytical covariance matrix (results not shown), while the results in Fig. 5.29 indicate that 2^12 RF realizations are not sufficient to generate converged KL modes. These figures were generated with

./mkplots.py numKLevec2D 0.1 4096 (Fig. 5.29)
./mkplots.py numKLevec2D 0.1 65536 (Fig. 5.30)

Figure 5.27: Structured grid employed for the 2D RF examples.

Figure 5.28: Sample 2D random field realizations for c_l = 0.1 and c_l = 0.2.

Figure 5.31 shows the first 8 KL modes computed based on a covariance matrix that was estimated from 2^12 = 4096 RF samples. For these results, with correlation length c_l = 0.5, 2^12 samples are sufficient to estimate the covariance matrix, and subsequently KL modes, that are close to the analytical results (results not shown). The plots in Fig.
5.31 were generated with

./mkplots.py numKLevec2D 0.5 4096

Figures 5.32 and 5.33 show reconstructions of select 2D RF realizations. As observed in the previous section for 1D RFs, more terms are needed to represent the small-scale features occurring for smaller correlation lengths, e.g. in Fig. 5.32, compared to RFs with larger correlation lengths, e.g. the example shown in Fig. 5.33. The plots shown in Figs. 5.32 and 5.33 were generated with

./mkplots.py pltKLrecon2D 0.2 3 85 12 (Fig. 5.32)
./mkplots.py pltKLrecon2D 0.5 37 36 5 (Fig. 5.33)

Figure 5.29: Illustration of the first 8 KL modes, computed based on a numerical covariance matrix estimated using 2¹² 2D RF realizations with cl = 0.1.

2D Examples on Unstructured Grids

For this example we choose a computational domain that resembles the shape of California. A total of 2¹² = 4096 points were randomly distributed inside this computational domain, and a triangular grid with 8063 triangles was generated via Delaunay triangulation. The 2D grid point locations are provided in "data/cali_grid.dat" and the grid point connectivities are provided in "data/cali_tria.dat". Figure 5.34 shows the placement of these grid points, including an inset plot with the triangular grid connectivities. This figure shows the grid on a uniform scale in terms of latitude and longitude degrees and was generated with

./pl2Dugrid.py

Figure 5.35 shows 2D RF realizations with correlation lengths cl = 0.5° and cl = 2°.
These figures were generated with

./mkplots.py samples2Du 0.5 4096 2 (Figs. 5.35a, 5.35b)
./mkplots.py samples2Du 2.0 4096 2 (Figs. 5.35c, 5.35d)

Figure 5.30: Illustration of the first 8 KL modes, computed based on a numerical covariance matrix estimated using 2¹⁶ 2D RF realizations with cl = 0.1.

Figure 5.36 shows the first 16 KL modes computed based on a covariance matrix that was estimated from 2¹⁶ = 65536 RF samples, with correlation length cl = 0.5°. The KL modes corresponding to an analytically estimated covariance matrix with the same correlation length are shown in Fig. 5.37. For this example, 2¹⁶ samples appear sufficient to estimate the first 12 to 13 modes accurately. Please note that some of the modes can differ by a multiplicative factor of −1, in which case the color scheme is reversed. Higher-order modes start diverging from the analytical estimates, e.g. modes 14 through 16 in this example. Figure 5.38 shows KL modes corresponding to a covariance matrix estimated from RF realizations with cl = 2°. For this correlation length, 2¹⁶ samples are sufficient to generate KL modes that are very close to the analytical results (not shown). These figures were generated with

./mkplots.py numKLevec2Du 0.5 65536 (Fig. 5.36)
./mkplots.py anlKLevec2Du SqExp 0.5 (Fig. 5.37)
./mkplots.py numKLevec2Du 2.0 65536 (Fig.
5.38)

Figure 5.31: Illustration of the first 8 KL modes, computed based on a numerical covariance matrix estimated using 2¹² 2D RF realizations with cl = 0.5.

Figure 5.32: Reconstructing 2D realizations with an increasing number of KL expansion terms for cl = 0.2. Panels (a)-(h) show the mean plus 0, 12, 24, 36, 48, 60, 72, and 84 terms.

Figure 5.33: Reconstructing 2D realizations with an increasing number of KL
expansion terms for cl = 0.5. Panels (a)-(h) show the mean plus 0, 5, 10, 15, 20, 25, 30, and 35 terms.

Figure 5.34: Unstructured grid generated via Delaunay triangulation overlaid over California (latitude approximately 34°-40°, longitude −124° to −116°).

Figure 5.35: Sample 2D random field realizations on an unstructured grid for cl = 0.5° (panels a, b) and cl = 2° (panels c, d).

Figure 5.36: Illustration of select KL modes (modes 1, 2, 3, 6, 10, and 16), computed based on a numerical covariance matrix estimated using 2¹⁶ 2D RF realizations on an unstructured grid with cl = 0.5°.

Figure 5.37: Illustration of select KL modes (modes 1, 2, 3, 6, 10, and 16), computed based on an analytical covariance matrix for 2D RF realizations on an unstructured grid with cl = 0.5° and a square-exponential form.

Figure 5.38: Illustration of select KL modes (modes 1, 2, 3, 6, 10, and 16), computed based on a numerical covariance matrix estimated using 2¹⁶ 2D RF realizations on an unstructured grid with cl = 2°.

Chapter 6

Support

UQTk is the subject of continual development and improvement. If you have questions about or suggestions for UQTk, feel free to e-mail the UQTk Users list at uqtk-users@software.sandia.gov. You can sign up for this mailing list at https://software.sandia.gov/mailman/listinfo/uqtk-users.