User Documentation for idas v3.0.0 (sundials v4.0.0)

Radu Serban, Cosmin Petra, and Alan C. Hindmarsh
Center for Applied Scientific Computing
Lawrence Livermore National Laboratory

December 7, 2018
UCRL-SM-208112

DISCLAIMER

This document was prepared as an account of work sponsored by an agency of the United States government. Neither the United States government nor Lawrence Livermore National Security, LLC, nor any of their employees makes any warranty, expressed or implied, or assumes any legal liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately owned rights. Reference herein to any specific commercial product, process, or service by trade name, trademark, manufacturer, or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States government or Lawrence Livermore National Security, LLC. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States government or Lawrence Livermore National Security, LLC, and shall not be used for advertising or product endorsement purposes.

This work was performed under the auspices of the U.S. Department of Energy by Lawrence Livermore National Laboratory under Contract DE-AC52-07NA27344.

Approved for public release; further dissemination unlimited.

Contents

1  Introduction
   1.1  Changes from previous versions
   1.2  Reading this User Guide
   1.3  SUNDIALS Release License
        1.3.1  Copyright Notices (SUNDIALS Copyright; ARKode Copyright)
        1.3.2  BSD License

2  Mathematical Considerations
   2.1  IVP solution
   2.2  Preconditioning
   2.3  Rootfinding
   2.4  Pure quadrature integration
   2.5  Forward sensitivity analysis
        2.5.1  Forward sensitivity methods
        2.5.2  Selection of the absolute tolerances for sensitivity variables
        2.5.3  Evaluation of the sensitivity right-hand side
        2.5.4  Quadratures depending on forward sensitivities
   2.6  Adjoint sensitivity analysis
        2.6.1  Sensitivity of G(p)
        2.6.2  Sensitivity of g(T, p)
        2.6.3  Checkpointing scheme
   2.7  Second-order sensitivity analysis

3  Code Organization
   3.1  SUNDIALS organization
   3.2  IDAS organization

4  Using IDAS for IVP Solution
   4.1  Access to library and header files
   4.2  Data types (floating point types; integer types used for vector and matrix indices)
   4.3  Header files
   4.4  A skeleton of the user's main program
   4.5  User-callable functions (initialization and deallocation; tolerance specification; linear and nonlinear solver interfaces; initial condition calculation; rootfinding initialization; the IDAS solver function; optional input, interpolated output, and optional output functions; reinitialization)
   4.6  User-supplied functions (residual; error message handler; error weight; rootfinding; Jacobian construction; Jacobian-vector product and setup; preconditioner solve and setup)
   4.7  Integration of pure quadrature equations
   4.8  A parallel band-block-diagonal preconditioner module

5  Using IDAS for Forward Sensitivity Analysis
   5.1  A skeleton of the user's main program
   5.2  User-callable routines for forward sensitivity analysis (initialization and deallocation; tolerance specification; nonlinear solver interface; initial condition calculation; the IDAS solver function; sensitivity extraction; optional inputs and outputs)
   5.3  User-supplied routines for forward sensitivity analysis
   5.4  Integration of quadrature equations depending on forward sensitivities
   5.5  Note on using partial error control

6  Using IDAS for Adjoint Sensitivity Analysis
   6.1  A skeleton of the user's main program
   6.2  User-callable functions for adjoint sensitivity analysis (allocation and deallocation; optional input; forward integration; backward problem initialization, tolerances, linear solvers, and initial condition calculation; backward integration; optional inputs and outputs; backward integration of quadrature equations)
   6.3  User-supplied functions for adjoint sensitivity analysis (backward DAE residuals, with and without forward sensitivity dependence; backward quadrature right-hand sides; backward Jacobian routines; backward preconditioner solve and setup)
   6.4  Using the band-block-diagonal preconditioner for backward problems (usage of IDABBDPRE; user-supplied functions for IDABBDPRE)

7  Description of the NVECTOR module
   7.1  NVECTOR functions used by IDAS
   7.2-7.10  The NVECTOR SERIAL, PARALLEL, OPENMP, PTHREADS, PARHYP, PETSC, CUDA, RAJA, and OPENMPDEV implementations (accessor macros, functions, and Fortran interfaces, where applicable)
   7.11 NVECTOR Examples

8  Description of the SUNMatrix module
   8.1  SUNMatrix functions used by IDAS
   8.2-8.4  The SUNMatrix Dense, Band, and Sparse implementations (accessor macros; functions; Fortran interfaces)

9  Description of the SUNLinearSolver module
   9.1  The SUNLinearSolver API (core, set, and get functions; functions provided by sundials packages; return codes; the generic module)
   9.2  Compatibility of SUNLinearSolver modules
   9.3  Implementing a custom SUNLinearSolver module (intended use cases)
   9.4  IDAS SUNLinearSolver interface (lagged matrix information; iterative linear solver tolerance)
   9.5-9.15  The SUNLinearSolver Dense, Band, LapackDense, LapackBand, KLU, SuperLUMT, SPGMR, SPFGMR, SPBCGS, SPTFQMR, and PCG implementations (description; functions; Fortran interfaces; content)
   9.16 SUNLinearSolver Examples

10 Description of the SUNNonlinearSolver module
   10.1 The SUNNonlinearSolver API (core, set, and get functions; functions provided by SUNDIALS integrators; return codes; the generic module; usage with sensitivity-enabled integrators; implementing a custom module)
   10.2 The SUNNonlinearSolver Newton implementation
   10.3 The SUNNonlinearSolver FixedPoint implementation

A  SUNDIALS Package Installation Procedure
   A.1  CMake-based installation (configuring, building, and installing on Unix-like systems; configuration options and examples; working with external libraries; testing the build and installation)
   A.2  Building and Running Examples
   A.3  Configuring, building, and installing on Windows
   A.4  Installed libraries and exported header files

B  IDAS Constants
   B.1  IDAS input constants
   B.2  IDAS output constants

Bibliography

Index

List of Tables

4.1  sundials linear solver interfaces and vector implementations that can be used for each
4.2  Optional inputs for idas and idals
4.3  Optional outputs from idas and idals
5.1  Forward sensitivity optional inputs
5.2  Forward sensitivity optional outputs
7.1  Vector identifications associated with vector kernels supplied with sundials
7.2  Description of the NVECTOR operations
7.3  Description of the NVECTOR fused operations
7.4  Description of the NVECTOR vector array operations
7.5  List of vector functions usage by idas code modules
8.1  Identifiers associated with matrix kernels supplied with sundials
8.2  Description of the SUNMatrix operations
8.3  sundials matrix interfaces and vector implementations that can be used for each
8.4  List of matrix functions usage by idas code modules
9.1  Description of the SUNLinearSolver error codes
9.2  sundials matrix-based linear solvers and matrix implementations that can be used for each
9.3  List of linear solver function usage in the idals interface
10.1 Description of the SUNNonlinearSolver return codes
A.1  sundials libraries and header files

List of Figures

2.1  Illustration of the checkpointing algorithm for generation of the forward solution during the integration of the adjoint system
3.1  High-level diagram of the sundials suite
3.2  Organization of the sundials suite
3.3  Overall structure diagram of the ida package
8.1  Diagram of the storage for a sunmatrix band object
8.2  Diagram of the storage for a compressed-sparse-column matrix
A.1  Initial ccmake configuration screen
A.2  Changing the instdir

Chapter 1

Introduction

idas is part of a software family called sundials: SUite of Nonlinear and DIfferential/ALgebraic equation Solvers [26]. This suite consists of cvode, arkode, kinsol, and ida, and variants of these with sensitivity analysis capabilities, cvodes and idas.
idas is a general-purpose solver for the initial value problem (IVP) for systems of differential-algebraic equations (DAEs). The name IDAS stands for Implicit Differential-Algebraic solver with Sensitivity capabilities. idas is an extension of the ida solver within sundials, itself based on daspk [7, 8]; however, like all sundials solvers, idas is written in ANSI-standard C rather than Fortran 77. Its most notable features are that (1) in the solution of the underlying nonlinear system at each time step, it offers a choice of Newton/direct methods and a choice of inexact Newton/Krylov (iterative) methods; (2) it is written in a data-independent manner, in that it acts on generic vectors and matrices without any assumptions about the underlying organization of the data; and (3) it provides a flexible, extensible framework for sensitivity analysis, using either forward or adjoint methods.

Thus idas shares significant modules previously written within CASC at LLNL to support the ordinary differential equation (ODE) solvers cvode [27, 15] and pvode [11, 12], the DAE solver ida [30] on which idas is based, the sensitivity-enabled ODE solver cvodes [28, 42], and also the nonlinear system solver kinsol [16].

At present, idas may utilize a variety of Krylov methods provided in sundials that can be used in conjunction with Newton iteration: these include the GMRES (Generalized Minimal RESidual) [41], FGMRES (Flexible Generalized Minimum RESidual) [40], Bi-CGStab (Bi-Conjugate Gradient Stabilized) [44], TFQMR (Transpose-Free Quasi-Minimal Residual) [23], and PCG (Preconditioned Conjugate Gradient) [24] linear iterative methods. As Krylov methods, these require little matrix storage for solving the Newton equations compared to direct methods. However, the algorithms allow for a user-supplied preconditioner matrix, and, for most problems, preconditioning is essential for an efficient solution.
For very large DAE systems, the Krylov methods are preferable to direct linear solver methods, and are often the only feasible choice. Among the Krylov methods in sundials, we recommend GMRES as the best overall choice. However, users are encouraged to compare all options, especially if encountering convergence failures with GMRES. Bi-CGStab and TFQMR have an advantage in storage requirements, in that the number of workspace vectors they require is fixed, while that number for GMRES depends on the desired Krylov subspace size. FGMRES has an advantage in that it is designed to support preconditioners that vary between iterations (e.g., iterative methods). PCG exhibits rapid convergence and requires few workspace vectors, but it only works for symmetric linear systems.

idas is written with a functionality that is a superset of that of ida. Sensitivity analysis capabilities, both forward and adjoint, have been added to the main integrator. Enabling forward sensitivity computations in idas will result in the code integrating the so-called sensitivity equations simultaneously with the original IVP, yielding both the solution and its sensitivity with respect to parameters in the model. Adjoint sensitivity analysis, most useful when the gradients of relatively few functionals of the solution with respect to many parameters are sought, involves integration of the original IVP forward in time, followed by the integration of the so-called adjoint equations backward in time. idas provides the infrastructure needed to integrate any final-condition ODE dependent on the solution of the original IVP (in particular the adjoint system).

There are several motivations for choosing the C language for idas. First, a general movement away from Fortran and toward C in scientific computing was apparent. Second, the pointer, structure, and dynamic memory allocation features in C are extremely useful in software of this complexity, with the great variety of method options offered.
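To make the Krylov discussion above concrete, the sketch below shows one way a user might attach the recommended GMRES solver through the sundials v4.0.0 API. This is an illustrative fragment, not an example from this manual: the problem size NEQ, the residual function resfn, the initial time, and the tolerance values are all placeholder assumptions.

```c
/* Sketch only (hypothetical problem data): creating an idas instance and
 * attaching a matrix-free SPGMR (GMRES) Krylov linear solver. */
#include <idas/idas.h>
#include <nvector/nvector_serial.h>
#include <sunlinsol/sunlinsol_spgmr.h>

#define NEQ 10  /* placeholder problem size */

/* user-supplied DAE residual F(t, y, y') = 0 (assumed defined elsewhere) */
extern int resfn(realtype t, N_Vector yy, N_Vector yp,
                 N_Vector rr, void *user_data);

int setup_spgmr_example(void)
{
  N_Vector yy = N_VNew_Serial(NEQ);     /* solution vector   */
  N_Vector yp = N_VNew_Serial(NEQ);     /* derivative vector */
  void *ida_mem = IDACreate();

  IDAInit(ida_mem, resfn, 0.0, yy, yp);
  IDASStolerances(ida_mem, 1e-6, 1e-8); /* placeholder tolerances */

  /* GMRES with left preconditioning and Krylov subspace size 5;
     passing a NULL SUNMatrix marks the solver as matrix-free */
  SUNLinearSolver LS = SUNLinSol_SPGMR(yy, PREC_LEFT, 5);
  return IDASetLinearSolver(ida_mem, LS, NULL);
}
```

Swapping SUNLinSol_SPGMR for SUNLinSol_SPBCGS or SUNLinSol_SPTFQMR changes only the constructor call, which is what makes the storage comparison among the Krylov options easy to test in practice.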
Finally, we prefer C over C++ for idas because of the wider availability of C compilers, the potentially greater efficiency of C, and the greater ease of interfacing the solver to applications written in extended Fortran.

1.1 Changes from previous versions

Changes in v3.0.0

idas' previous direct and iterative linear solver interfaces, idadls and idaspils, have been merged into a single unified linear solver interface, idals, to support any valid sunlinsol module. This includes the "DIRECT" and "ITERATIVE" types as well as the new "MATRIX_ITERATIVE" type. Details regarding how idals utilizes linear solvers of each type as well as discussion regarding intended use cases for user-supplied sunlinsol implementations are included in Chapter 9. All idas example programs and the standalone linear solver examples have been updated to use the unified linear solver interface. The unified interface for the new idals module is very similar to the previous idadls and idaspils interfaces. To minimize challenges in user migration to the new names, the previous C routine names may still be used; these will be deprecated in future releases, so we recommend that users migrate to the new names soon. The names of all constructor routines for sundials-provided sunlinsol implementations have been updated to follow the naming convention SUNLinSol_* where * is the name of the linear solver. The new names are SUNLinSol_Band, SUNLinSol_Dense, SUNLinSol_KLU, SUNLinSol_LapackBand, SUNLinSol_LapackDense, SUNLinSol_PCG, SUNLinSol_SPBCGS, SUNLinSol_SPFGMR, SUNLinSol_SPGMR, SUNLinSol_SPTFQMR, and SUNLinSol_SuperLUMT. Solver-specific "set" routine names have been similarly standardized. To minimize challenges in user migration to the new names, the previous routine names may still be used; these will be deprecated in future releases, so we recommend that users migrate to the new names soon.
All idas example programs and the standalone linear solver examples have been updated to use the new naming convention. The SUNBandMatrix constructor has been simplified to remove the storage upper bandwidth argument. sundials integrators have been updated to utilize generic nonlinear solver modules defined through the sunnonlinsol API. This API will ease the addition of new nonlinear solver options and allow for external or user-supplied nonlinear solvers. The sunnonlinsol API and sundials-provided modules are described in Chapter 10 and follow the same object-oriented design and implementation used by the nvector, sunmatrix, and sunlinsol modules. Currently two sunnonlinsol implementations are provided, sunnonlinsol_newton and sunnonlinsol_fixedpoint. These replicate the previous integrator-specific implementations of a Newton iteration and a fixed-point iteration (previously referred to as a functional iteration), respectively. Note the sunnonlinsol_fixedpoint module can optionally utilize Anderson's method to accelerate convergence. Example programs using each of these nonlinear solver modules in a standalone manner have been added and all idas example programs have been updated to use generic sunnonlinsol modules. By default idas uses the sunnonlinsol_newton module. Since idas previously only used an internal implementation of a Newton iteration, no changes are required to user programs, and functions for setting the nonlinear solver options (e.g., IDASetMaxNonlinIters) or getting nonlinear solver statistics (e.g., IDAGetNumNonlinSolvIters) remain unchanged and internally call generic sunnonlinsol functions as needed. While sundials includes a fixed-point nonlinear solver module, it is not currently supported in idas. For details on attaching a user-supplied nonlinear solver to idas see Chapters 4, 5, and 6. Three fused vector operations and seven vector array operations have been added to the nvector API.
These optional operations are disabled by default and may be activated by calling vector-specific routines after creating an nvector (see Chapter 7 for more details). The new operations are intended to increase data reuse in vector operations, reduce parallel communication on distributed memory systems, and lower the number of kernel launches on systems with accelerators. The fused operations are N_VLinearCombination, N_VScaleAddMulti, and N_VDotProdMulti and the vector array operations are N_VLinearSumVectorArray, N_VScaleVectorArray, N_VConstVectorArray, N_VWrmsNormVectorArray, N_VWrmsNormMaskVectorArray, N_VScaleAddMultiVectorArray, and N_VLinearCombinationVectorArray. If an nvector implementation defines any of these operations as NULL, then standard nvector operations will automatically be called as necessary to complete the computation. Multiple updates to nvector_cuda were made:

• Changed N_VGetLength_Cuda to return the global vector length instead of the local vector length.
• Added N_VGetLocalLength_Cuda to return the local vector length.
• Added N_VGetMPIComm_Cuda to return the MPI communicator used.
• Removed the accessor functions in the namespace suncudavec.
• Changed the N_VMake_Cuda function to take a host data pointer and a device data pointer instead of an N_VectorContent_Cuda object.
• Added the ability to set the cudaStream_t used for execution of the nvector_cuda kernels. See the function N_VSetCudaStreams_Cuda.
• Added N_VNewManaged_Cuda, N_VMakeManaged_Cuda, and N_VIsManagedMemory_Cuda functions to accommodate using managed memory with nvector_cuda.

Multiple changes to nvector_raja were made:

• Changed N_VGetLength_Raja to return the global vector length instead of the local vector length.
• Added N_VGetLocalLength_Raja to return the local vector length.
• Added N_VGetMPIComm_Raja to return the MPI communicator used.
• Removed the accessor functions in the namespace sunrajavec.
A new nvector implementation for leveraging OpenMP 4.5+ device offloading has been added, nvector_openmpdev. See §7.10 for more details.

Changes in v2.2.1

The changes in this minor release include the following:

• Fixed a bug in the cuda nvector where the N_VInvTest operation could write beyond the allocated vector data.
• Fixed the library installation path for multiarch systems. This fix changes the default library installation path to CMAKE_INSTALL_PREFIX/CMAKE_INSTALL_LIBDIR from CMAKE_INSTALL_PREFIX/lib. CMAKE_INSTALL_LIBDIR is automatically set, but is available as a CMake option that can be modified.

Changes in v2.2.0

Fixed a bug in idas where the saved residual value used in the nonlinear solve for consistent initial conditions was passed as temporary workspace and could be overwritten. Fixed a thread-safety issue when using adjoint sensitivity analysis. Fixed a problem with setting sunindextype which would occur with some compilers (e.g. armclang) that did not define __STDC_VERSION__. Added hybrid MPI/CUDA and MPI/RAJA vectors to allow use of more than one MPI rank when using a GPU system. The vectors assume one GPU device per MPI rank. Changed the name of the raja nvector library to libsundials_nveccudaraja.lib from libsundials_nvecraja.lib to better reflect that we currently only support cuda as a backend for raja. Several changes were made to the build system:

• CMake 3.1.3 is now the minimum required CMake version.
• Deprecated the behavior of the SUNDIALS_INDEX_TYPE CMake option and added the SUNDIALS_INDEX_SIZE CMake option to select the sunindextype integer size.
• The native CMake FindMPI module is now used to locate an MPI installation.
• If MPI is enabled and MPI compiler wrappers are not set, the build system will check if CMAKE_<language>_COMPILER can compile MPI programs before trying to locate and use an MPI installation.
• The previous options for setting MPI compiler wrappers and the executable for running MPI programs have been deprecated. The new options that align with those used in the native CMake FindMPI module are MPI_C_COMPILER, MPI_CXX_COMPILER, MPI_Fortran_COMPILER, and MPIEXEC_EXECUTABLE.
• When a Fortran name-mangling scheme is needed (e.g., LAPACK_ENABLE is ON) the build system will infer the scheme from the Fortran compiler. If a Fortran compiler is not available or the inferred or default scheme needs to be overridden, the advanced options SUNDIALS_F77_FUNC_CASE and SUNDIALS_F77_FUNC_UNDERSCORES can be used to manually set the name-mangling scheme and bypass trying to infer the scheme.
• Parts of the main CMakeLists.txt file were moved to new files in the src and example directories to make the CMake configuration file structure more modular.

Changes in v2.1.2

The changes in this minor release include the following:

• Updated the minimum required version of CMake to 2.8.12 and enabled using rpath by default to locate shared libraries on OSX.
• Fixed a Windows-specific problem where sunindextype was not correctly defined when using 64-bit integers for the sundials index type. On Windows sunindextype is now defined as the MSVC basic type __int64.
• Added a sparse SUNMatrix "Reallocate" routine to allow specification of the nonzero storage.
• Updated the KLU sunlinsol module to set constants for the two reinitialization types, and fixed a bug in the full reinitialization approach where the sparse SUNMatrix pointer would go out of scope on some architectures.
• Updated the "ScaleAdd" and "ScaleAddI" implementations in the sparse SUNMatrix module to more optimally handle the case where the target matrix contained sufficient storage for the sum, but had the wrong sparsity pattern. The sum now occurs in-place, by performing the sum backwards in the existing storage.
However, it is still more efficient if the user-supplied Jacobian routine allocates storage for the sum I + γJ manually (with zero entries if needed).

• Changed the LICENSE install path to instdir/include/sundials.

Changes in v2.1.1

The changes in this minor release include the following:

• Fixed a potential memory leak in the spgmr and spfgmr linear solvers: if "Initialize" was called multiple times then the solver memory was reallocated (without being freed).
• Updated the KLU SUNLinearSolver module to use a typedef for the precision-specific solve function to be used (to avoid compiler warnings).
• Added missing typecasts for some (void*) pointers (again, to avoid compiler warnings).
• Bugfix in sunmatrix_sparse.c where we had used int instead of sunindextype in one location.
• Added a missing #include in nvector and sunmatrix header files.
• Added a missing prototype for IDASpilsGetNumJTSetupEvals.
• Fixed an indexing bug in the cuda nvector implementation of N_VWrmsNormMask and revised the raja nvector implementation of N_VWrmsNormMask to work with mask arrays using values other than zero or one. Replaced double with realtype in the raja vector test functions.

In addition to the changes above, minor corrections were also made to the example programs, build system, and user documentation.

Changes in v2.1.0

Added nvector print functions that write vector data to a specified file (e.g., N_VPrintFile_Serial). Added make test and make test_install options to the build system for testing sundials after building with make and installing with make install respectively.

Changes in v2.0.0

All interfaces to matrix structures and linear solvers have been reworked, and all example programs have been updated. The goal of the redesign of these interfaces was to provide more encapsulation and to ease interfacing of custom linear solvers and interoperability with linear solver libraries.
Specific changes include:

• Added a generic sunmatrix module with three provided implementations: dense, banded, and sparse. These replicate previous SUNDIALS Dls and Sls matrix structures in a single object-oriented API.
• Added example problems demonstrating use of generic sunmatrix modules.
• Added a generic SUNLinearSolver module with eleven provided implementations: sundials native dense, sundials native banded, LAPACK dense, LAPACK band, KLU, SuperLU MT, SPGMR, SPBCGS, SPTFQMR, SPFGMR, and PCG. These replicate previous SUNDIALS generic linear solvers in a single object-oriented API.
• Added example problems demonstrating use of generic SUNLinearSolver modules.
• Expanded package-provided direct linear solver (Dls) interfaces and scaled, preconditioned, iterative linear solver (Spils) interfaces to utilize generic sunmatrix and SUNLinearSolver objects.
• Removed package-specific, linear solver-specific, solver modules (e.g. CVDENSE, KINBAND, IDAKLU, ARKSPGMR) since their functionality is entirely replicated by the generic Dls/Spils interfaces and SUNLinearSolver/SUNMATRIX modules. The exception is CVDIAG, a diagonal approximate Jacobian solver available to cvode and cvodes.
• Converted all sundials example problems and files to utilize the new generic sunmatrix and SUNLinearSolver objects, along with updated Dls and Spils linear solver interfaces.
• Added Spils interface routines to arkode, cvode, cvodes, ida, and idas to allow specification of a user-provided "JTSetup" routine. This change supports users who wish to set up data structures for the user-provided Jacobian-times-vector ("JTimes") routine, and where the cost of one JTSetup call per Newton iteration can be amortized between multiple JTimes calls.

Two additional nvector implementations were added – one for cuda and one for raja vectors. These vectors are supplied to provide very basic support for running on GPU architectures.
Users are advised that these vectors both move all data to the GPU device upon construction, and speedup will only be realized if the user also conducts the right-hand-side function evaluation on the device. In addition, these vectors assume the problem fits on one GPU. For further information about raja, users are referred to the web site, https://software.llnl.gov/RAJA/. These additions are accompanied by additions to various interface functions and to user documentation. All indices for data structures were updated to a new sunindextype that can be configured to be a 32- or 64-bit integer data index type. sunindextype is defined to be int32_t or int64_t when portable types are supported, otherwise it is defined as int or long int. The Fortran interfaces continue to use long int for indices, except for their sparse matrix interface that now uses the new sunindextype. This new flexible capability for index types includes interfaces to PETSc, hypre, SuperLU MT, and KLU with either 32-bit or 64-bit capabilities depending on how the user configures sundials. To avoid potential namespace conflicts, the macros defining booleantype values TRUE and FALSE have been changed to SUNTRUE and SUNFALSE respectively. Temporary vectors were removed from preconditioner setup and solve routines for all packages. It is assumed that all necessary data for user-provided preconditioner operations will be allocated and stored in user-provided data structures. The file include/sundials_fconfig.h was added. This file contains sundials type information for use in Fortran programs. The build system was expanded to support many of the xSDK-compliant keys. The xSDK is a movement in scientific software to provide a foundation for the rapid and efficient production of high-quality, sustainable extreme-scale scientific applications. More information can be found at https://xsdk.info. Added functions SUNDIALSGetVersion and SUNDIALSGetVersionNumber to get sundials release version information at runtime.
In addition, numerous changes were made to the build system. These include the addition of separate BLAS_ENABLE and BLAS_LIBRARIES CMake variables, additional error checking during CMake configuration, minor bug fixes, and renaming CMake options to enable/disable examples for greater clarity and an added option to enable/disable Fortran 77 examples. These changes included changing EXAMPLES_ENABLE to EXAMPLES_ENABLE_C, changing CXX_ENABLE to EXAMPLES_ENABLE_CXX, changing F90_ENABLE to EXAMPLES_ENABLE_F90, and adding an EXAMPLES_ENABLE_F77 option. A bug fix was done to add a missing prototype for IDASetMaxBacksIC in ida.h. Corrections and additions were made to the examples, to installation-related files, and to the user documentation.

Changes in v1.3.0

Two additional nvector implementations were added – one for hypre (parallel) ParVector vectors, and one for PETSc vectors. These additions are accompanied by additions to various interface functions and to user documentation. Each nvector module now includes a function, N_VGetVectorID, that returns the nvector module name. An optional input function was added to set a maximum number of linesearch backtracks in the initial condition calculation, and four user-callable functions were added to support the use of LAPACK linear solvers in solving backward problems for adjoint sensitivity analysis. For each linear solver, the various solver performance counters are now initialized to 0 in both the solver specification function and in the solver linit function. This ensures that these solver counters are initialized upon linear solver instantiation as well as at the beginning of the problem solution. A bug in for-loop indices was fixed in IDAAckpntAllocVectors. A bug was fixed in the interpolation functions used in solving backward problems. A memory leak was fixed in the banded preconditioner interface.
In addition, updates were done to return integers from linear solver and preconditioner 'free' functions. In interpolation routines for backward problems, added logic to bypass sensitivity interpolation if the input sensitivity argument is NULL. The Krylov linear solver Bi-CGStab was enhanced by removing a redundant dot product. Various additions and corrections were made to the interfaces to the sparse solvers KLU and SuperLU MT, including support for CSR format when using KLU. New examples were added for use of the OpenMP vector and for use of sparse direct solvers within sensitivity integrations. Minor corrections and additions were made to the idas solver, to the examples, to installation-related files, and to the user documentation.

Changes in v1.2.0

Two major additions were made to the linear system solvers that are available for use with the idas solver. First, in the serial case, an interface to the sparse direct solver KLU was added. Second, an interface to SuperLU MT, the multi-threaded version of SuperLU, was added as a thread-parallel sparse direct solver option, to be used with the serial version of the NVECTOR module. As part of these additions, a sparse matrix (CSC format) structure was added to idas. Otherwise, only relatively minor modifications were made to idas: In IDARootfind, a minor bug was corrected, where the input array rootdir was ignored, and a line was added to break out of the root-search loop if the initial interval size is below the tolerance ttol. In IDALapackBand, the line smu = MIN(N-1,mu+ml) was changed to smu = mu + ml to correct an illegal input error for DGBTRF/DGBTRS. An option was added in the case of Adjoint Sensitivity Analysis with dense or banded Jacobian: With a call to IDADlsSetDenseJacFnBS or IDADlsSetBandJacFnBS, the user can specify a user-supplied Jacobian function of type IDADls***JacFnBS, for the case where the backward problem depends on the forward sensitivities.
A minor bug was fixed regarding the testing of the input tstop on the first call to IDASolve. For the Adjoint Sensitivity Analysis case in which the backward problem depends on the forward sensitivities, options have been added to allow for user-supplied pset, psolve, and jtimes functions. In order to avoid possible name conflicts, the mathematical macro and function names MIN, MAX, SQR, RAbs, RSqrt, RExp, RPowerI, and RPowerR were changed to SUNMIN, SUNMAX, SUNSQR, SUNRabs, SUNRsqrt, SUNRexp, SUNRpowerI, and SUNRpowerR, respectively. These names occur in both the solver and in various example programs. In the User Guide, a paragraph was added in Section 6.2.1 on IDAAdjReInit, and a paragraph was added in Section 6.2.9 on IDAGetAdjY. Two new nvector modules have been added for thread-parallel computing environments — one for OpenMP, denoted NVECTOR_OPENMP, and one for Pthreads, denoted NVECTOR_PTHREADS. With this version of sundials, support and documentation of the Autotools mode of installation is being dropped, in favor of the CMake mode, which is considered more widely portable.

Changes in v1.1.0

One significant design change was made with this release: The problem size and its relatives, bandwidth parameters, related internal indices, pivot arrays, and the optional output lsflag have all been changed from type int to type long int, except for the problem size and bandwidths in user calls to routines specifying BLAS/LAPACK routines for the dense/band linear solvers. The function NewIntArray is replaced by a pair NewIntArray/NewLintArray, for int and long int arrays, respectively. In a minor change to the user interface, the type of the index which in IDAS was changed from long int to int. Errors in the logic for the integration of backward problems were identified and fixed. A large number of minor errors have been fixed. Among these are the following: A missing vector pointer setting was added in IDASensLineSrch.
In IDACompleteStep, conditionals around lines loading a new column of three auxiliary divided difference arrays, for a possible order increase, were fixed. After the solver memory is created, it is set to zero before being filled. In each linear solver interface function, the linear solver memory is freed on an error return, and the **Free function now includes a line setting to NULL the main memory pointer to the linear solver memory. A memory leak was fixed in two of the IDASp***Free functions. In the rootfinding functions IDARcheck1/IDARcheck2, when an exact zero is found, the array glo of g values at the left endpoint is adjusted, instead of shifting the t location tlo slightly. In the installation files, we modified the treatment of the macro SUNDIALS_USE_GENERIC_MATH, so that the parameter GENERIC_MATH_LIB is either defined (with no value) or not defined.

1.2 Reading this User Guide

The structure of this document is as follows:

• In Chapter 2, we give short descriptions of the numerical methods implemented by idas for the solution of initial value problems for systems of DAEs, continue with short descriptions of preconditioning (§2.2) and rootfinding (§2.3), and then give an overview of the mathematical aspects of sensitivity analysis, both forward (§2.5) and adjoint (§2.6).
• The following chapter describes the structure of the sundials suite of solvers (§3.1) and the software organization of the idas solver (§3.2).
• Chapter 4 is the main usage document for idas for simulation applications. It includes a complete description of the user interface for the integration of DAE initial value problems. Readers who are not interested in using idas for sensitivity analysis can then skip the next two chapters.
• Chapter 5 describes the usage of idas for forward sensitivity analysis as an extension of its IVP integration capabilities.
We begin with a skeleton of the user main program, with emphasis on the steps that are required in addition to those already described in Chapter 4. Following that we provide detailed descriptions of the user-callable interface routines specific to forward sensitivity analysis and of the additional optional user-defined routines.

• Chapter 6 describes the usage of idas for adjoint sensitivity analysis. We begin by describing the idas checkpointing implementation for interpolation of the original IVP solution during integration of the adjoint system backward in time, and with an overview of a user's main program. Following that we provide complete descriptions of the user-callable interface routines for adjoint sensitivity analysis as well as descriptions of the required additional user-defined routines.
• Chapter 7 gives a brief overview of the generic nvector module shared amongst the various components of sundials, as well as details on the nvector implementations provided with sundials.
• Chapter 8 gives a brief overview of the generic sunmatrix module shared among the various components of sundials, and details on the sunmatrix implementations provided with sundials: a dense implementation (§8.2), a banded implementation (§8.3), and a sparse implementation (§8.4).
• Chapter 9 gives a brief overview of the generic sunlinsol module shared among the various components of sundials. This chapter contains details on the sunlinsol implementations provided with sundials, including those that interface with external linear solver libraries.
• Chapter 10 describes the sunnonlinsol API and nonlinear solver implementations shared among the various components of sundials.
• Finally, in the appendices, we provide detailed instructions for the installation of idas, within the structure of sundials (Appendix A), as well as a list of all the constants used for input to and output from idas functions (Appendix B).

Finally, the reader should be aware of the following notational conventions in this user guide: program listings and identifiers (such as IDAInit) within textual explanations appear in typewriter type style; fields in C structures (such as content) appear in italics; and packages or modules, such as idals, are written in all capitals. Usage and installation instructions that constitute important warnings are marked with a triangular symbol in the margin.

1.3 SUNDIALS Release License

The SUNDIALS packages are released open source, under a BSD license. The only requirements of the BSD license are preservation of copyright and a standard disclaimer of liability. Our copyright notice is below along with the license. **PLEASE NOTE** If you are using SUNDIALS with any third party libraries linked in (e.g., LAPACK, KLU, SuperLU MT, petsc, or hypre), be sure to review the respective license of the package as that license may have more restrictive terms than the SUNDIALS license. For example, if someone builds SUNDIALS with a statically linked KLU, the build is subject to the terms of the LGPL license (which is what KLU is released with) and *not* the SUNDIALS BSD license anymore.

1.3.1 Copyright Notices

All SUNDIALS packages except ARKode are subject to the following Copyright notice.

1.3.1.1 SUNDIALS Copyright

Copyright (c) 2002-2016, Lawrence Livermore National Security. Produced at the Lawrence Livermore National Laboratory. Written by A.C. Hindmarsh, D.R. Reynolds, R. Serban, C.S. Woodward, S.D. Cohen, A.G. Taylor, S. Peles, L.E. Banks, and D. Shumaker. UCRL-CODE-155951 (CVODE) UCRL-CODE-155950 (CVODES) UCRL-CODE-155952 (IDA) UCRL-CODE-237203 (IDAS) LLNL-CODE-665877 (KINSOL) All rights reserved.

1.3.1.2
ARKode Copyright

ARKode is subject to the following joint Copyright notice. Copyright (c) 2015-2016, Southern Methodist University and Lawrence Livermore National Security. Written by D.R. Reynolds, D.J. Gardner, A.C. Hindmarsh, C.S. Woodward, and J.M. Sexton. LLNL-CODE-667205 (ARKODE) All rights reserved.

1.3.2 BSD License

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this list of conditions and the disclaimer below.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the disclaimer (as noted below) in the documentation and/or other materials provided with the distribution.
3. Neither the name of the LLNS/LLNL nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL LAWRENCE LIVERMORE NATIONAL SECURITY, LLC, THE U.S. DEPARTMENT OF ENERGY OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

Additional BSD Notice

1. This notice is required to be provided under our contract with the U.S. Department of Energy (DOE).
This work was produced at Lawrence Livermore National Laboratory under Contract No. DE-AC52-07NA27344 with the DOE.
2. Neither the United States Government nor Lawrence Livermore National Security, LLC nor any of their employees, makes any warranty, express or implied, or assumes any liability or responsibility for the accuracy, completeness, or usefulness of any information, apparatus, product, or process disclosed, or represents that its use would not infringe privately-owned rights.
3. Also, reference herein to any specific commercial products, process, or services by trade name, trademark, manufacturer or otherwise does not necessarily constitute or imply its endorsement, recommendation, or favoring by the United States Government or Lawrence Livermore National Security, LLC. The views and opinions of authors expressed herein do not necessarily state or reflect those of the United States Government or Lawrence Livermore National Security, LLC, and shall not be used for advertising or product endorsement purposes.

Chapter 2 Mathematical Considerations

idas solves the initial-value problem (IVP) for a DAE system of the general form

F(t, y, ẏ) = 0 ,  y(t0) = y0 ,  ẏ(t0) = ẏ0 ,   (2.1)

where y, ẏ, and F are vectors in R^N, t is the independent variable, ẏ = dy/dt, and initial values y0, ẏ0 are given. (Often t is time, but it certainly need not be.) Additionally, if (2.1) depends on some parameters p ∈ R^Np, i.e.

F(t, y, ẏ, p) = 0 ,  y(t0) = y0(p) ,  ẏ(t0) = ẏ0(p) ,   (2.2)

idas can also compute first-order derivative information, performing either forward sensitivity analysis or adjoint sensitivity analysis. In the first case, idas computes the sensitivities of the solution with respect to the parameters p, while in the second case, idas computes the gradient of a derived function with respect to the parameters p.
2.1 IVP solution

Prior to integrating a DAE initial-value problem, an important requirement is that the pair of vectors y0 and ẏ0 are both initialized to satisfy the DAE residual F(t0, y0, ẏ0) = 0. For a class of problems that includes so-called semi-explicit index-one systems, idas provides a routine that computes consistent initial conditions from a user's initial guess [8]. For this, the user must identify sub-vectors of y (not necessarily contiguous), denoted yd and ya, which are its differential and algebraic parts, respectively, such that F depends on ẏd but not on any components of ẏa. The assumption that the system is "index one" means that for a given t and yd, the system F(t, y, ẏ) = 0 defines ya uniquely. In this case, a solver within idas computes ya and ẏd at t = t0, given yd and an initial guess for ya. A second available option with this solver also computes all of y(t0) given ẏ(t0); this is intended mainly for quasi-steady-state problems, where ẏ(t0) = 0 is given. In both cases, idas solves the system F(t0, y0, ẏ0) = 0 for the unknown components of y0 and ẏ0, using Newton iteration augmented with a line search global strategy. In doing this, it makes use of the existing machinery that is to be used for solving the linear systems during the integration, in combination with certain tricks involving the step size (which is set artificially for this calculation). For problems that do not fall into either of these categories, the user is responsible for passing consistent values, or risks failure in the numerical integration. The integration method used in idas is the variable-order, variable-coefficient BDF (Backward Differentiation Formula), in fixed-leading-coefficient form [4].
The method order ranges from 1 to 5, with the BDF of order q given by the multistep formula

  sum_{i=0}^{q} α_{n,i} y_{n−i} = h_n ẏ_n ,   (2.3)

where y_n and ẏ_n are the computed approximations to y(t_n) and ẏ(t_n), respectively, and the step size is h_n = t_n − t_{n−1}. The coefficients α_{n,i} are uniquely determined by the order q and the history of the step sizes. The application of the BDF (2.3) to the DAE system (2.1) results in a nonlinear algebraic system to be solved at each step:

  G(y_n) ≡ F( t_n , y_n , h_n^{−1} sum_{i=0}^{q} α_{n,i} y_{n−i} ) = 0 .   (2.4)

By default idas solves (2.4) with a Newton iteration, but idas also allows for user-defined nonlinear solvers (see Chapter 10). Each Newton iteration requires the solution of a linear system of the form

  J [y_n^{(m+1)} − y_n^{(m)}] = −G(y_n^{(m)}) ,   (2.5)

where y_n^{(m)} is the m-th approximation to y_n. Here J is some approximation to the system Jacobian

  J = ∂G/∂y = ∂F/∂y + α ∂F/∂ẏ ,   (2.6)

where α = α_{n,0}/h_n. The scalar α changes whenever the step size or method order changes.

For the solution of the linear systems within the Newton iteration, idas provides several choices, including the option of a user-supplied linear solver module (see Chapter 9). The linear solver modules distributed with sundials are organized in two families: a direct family comprising direct linear solvers for dense, banded, or sparse matrices, and a spils family comprising scaled preconditioned iterative (Krylov) linear solvers.
The methods offered through these modules are as follows:

• dense direct solvers, using either an internal implementation or a BLAS/LAPACK implementation (serial or threaded vector modules only),
• band direct solvers, using either an internal implementation or a BLAS/LAPACK implementation (serial or threaded vector modules only),
• sparse direct solver interfaces, using either the KLU sparse solver library [17, 1] or the thread-enabled SuperLU MT sparse solver library [35, 19, 2] (serial or threaded vector modules only; note that users will need to download and install the klu or superlumt packages independently of idas),
• spgmr, a scaled preconditioned GMRES (Generalized Minimal Residual method) solver without restarts,
• spfgmr, a scaled preconditioned FGMRES (Flexible Generalized Minimal Residual method) solver,
• spbcgs, a scaled preconditioned Bi-CGStab (Bi-Conjugate Gradient Stable method) solver,
• sptfqmr, a scaled preconditioned TFQMR (Transpose-Free Quasi-Minimal Residual method) solver, or
• pcg, a scaled preconditioned CG (Conjugate Gradient method) solver.

For large stiff systems, where direct methods are not feasible, the combination of a BDF integrator and a preconditioned Krylov method yields a powerful tool because it combines established methods for stiff integration, nonlinear iteration, and Krylov (linear) iteration with a problem-specific treatment of the dominant source of stiffness, in the form of the user-supplied preconditioner matrix [6]. For the spils linear solvers with idas, preconditioning is allowed only on the left (see §2.2). Note that the dense, band, and sparse direct linear solvers can be used only with serial and threaded vector representations.

In the process of controlling errors at various levels, idas uses a weighted root-mean-square norm, denoted ‖·‖_WRMS, for all error-like quantities.
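As an illustrative sketch (ours, not code from idas), the WRMS norm with the tolerance-based weights W_i = 1/(rtol · |y_i| + atol_i) can be written as:

```python
import math

def wrms_norm(v, y, rtol, atol):
    """Weighted root-mean-square norm of v, with weights
    W_i = 1/(rtol*|y_i| + atol_i) built from the current solution y
    and the user tolerances; a vector of norm 1 is 'small'."""
    n = len(v)
    s = 0.0
    for vi, yi, ai in zip(v, y, atol):
        w = 1.0 / (rtol * abs(yi) + ai)
        s += (w * vi) ** 2
    return math.sqrt(s / n)
```

By construction, an error vector whose components exactly equal their per-component tolerances has norm 1, which is why "norm below 1" is the acceptance criterion throughout.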
The multiplicative weights used are based on the current solution and on the relative and absolute tolerances input by the user, namely

  W_i = 1/[rtol · |y_i| + atol_i] .   (2.7)

Because 1/W_i represents a tolerance in the component y_i, a vector whose norm is 1 is regarded as "small." For brevity, we will usually drop the subscript WRMS on norms in what follows.

In the case of a matrix-based linear solver, the default Newton iteration is a modified Newton iteration, in that the Jacobian J is fixed (and usually out of date) throughout the nonlinear iterations, with a coefficient ᾱ in place of α in J. However, in the case that a matrix-free iterative linear solver is used, the default Newton iteration is an inexact Newton iteration, in which J is applied in a matrix-free manner, with matrix-vector products Jv obtained by either difference quotients or a user-supplied routine. In this case, the linear residual J∆y + G is nonzero but controlled.

With the default Newton iteration, the matrix J and preconditioner matrix P are updated as infrequently as possible to balance the high costs of matrix operations against other costs. Specifically, this matrix update occurs when:

• starting the problem,
• the value ᾱ at the last update is such that α/ᾱ < 3/5 or α/ᾱ > 5/3, or
• a non-fatal convergence failure occurred with an out-of-date J or P.

The above strategy balances the high cost of frequent matrix evaluations and preprocessing with the slow convergence due to infrequent updates. To reduce storage costs on an update, Jacobian information is always reevaluated from scratch.

The default stopping test for nonlinear solver iterations in idas ensures that the iteration error y_n − y_n^{(m)} is small relative to y itself. For this, we estimate the linear convergence rate at all iterations m > 1 as

  R = ( ‖δ_m‖ / ‖δ_1‖ )^{1/(m−1)} ,

where δ_m = y_n^{(m)} − y_n^{(m−1)} is the correction at iteration m = 1, 2, . . .. The nonlinear solver iteration is halted if R > 0.9.
The convergence test at the m-th iteration is then

  S ‖δ_m‖ < 0.33 ,   (2.8)

where S = R/(R − 1) whenever m > 1 and R ≤ 0.9. The user has the option of changing the constant in the convergence test from its default value of 0.33. The quantity S is set to S = 20 initially and whenever J or P is updated, and it is reset to S = 100 on a step with α ≠ ᾱ. Note that at m = 1, the convergence test (2.8) uses an old value for S. Therefore, at the first nonlinear solver iteration, we make an additional test and stop the iteration if ‖δ_1‖ < 0.33 · 10^{−4} (since such a δ_1 is probably just noise and therefore not appropriate for use in evaluating R).

We allow only a small number (default value 4) of nonlinear iterations. If convergence fails with J or P current, we are forced to reduce the step size h_n, and we replace h_n by h_n/4. The integration is halted after a preset number (default value 10) of convergence failures. Both the maximum number of allowable nonlinear iterations and the maximum number of nonlinear convergence failures can be changed by the user from their default values.

When an iterative method is used to solve the linear system, to minimize the effect of linear iteration errors on the nonlinear and local integration error controls, we require the preconditioned linear residual to be small relative to the allowed error in the nonlinear iteration, i.e., ‖P^{−1}(Jx + G)‖ < 0.05 · 0.33. The safety factor 0.05 can be changed by the user.

When the Jacobian is stored using either dense or band sunmatrix objects, the Jacobian J defined in (2.6) can be either supplied by the user or computed internally by idas using difference quotients. In the latter case, we use the approximation

  J_{ij} = [F_i(t, y + σ_j e_j, ẏ + ασ_j e_j) − F_i(t, y, ẏ)] / σ_j ,  with
  σ_j = √U max{ |y_j|, |h ẏ_j|, 1/W_j } sign(h ẏ_j) ,

where U is the unit roundoff, h is the current step size, and W_j is the error weight for the component y_j defined by (2.7).
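A column-by-column sketch of this difference-quotient Jacobian approximation follows. This is a simplified dense version for illustration (names are ours; production code reuses work vectors and guards the zero-derivative case more carefully):

```python
import math

def dq_jacobian(F, t, y, yp, alpha, h, weights, uround=2.2e-16):
    """Difference-quotient approximation to J = dF/dy + alpha*dF/dyp,
    one column at a time, with the increment
    sigma_j = sqrt(U)*max(|y_j|, |h*yp_j|, 1/W_j)*sign(h*yp_j).
    F(t, y, yp) returns the residual vector (a list of floats)."""
    n = len(y)
    f0 = F(t, y, yp)
    J = [[0.0] * n for _ in range(n)]
    srur = math.sqrt(uround)
    for j in range(n):
        sign = -1.0 if h * yp[j] < 0.0 else 1.0
        sigma = srur * max(abs(y[j]), abs(h * yp[j]), 1.0 / weights[j]) * sign
        ys, yps = list(y), list(yp)
        ys[j] += sigma              # perturb y_j ...
        yps[j] += alpha * sigma     # ... and yp_j together, per Eq. (2.6)
        fs = F(t, ys, yps)
        for i in range(n):
            J[i][j] = (fs[i] - f0[i]) / sigma
    return J
```

Perturbing y_j and ẏ_j simultaneously (by σ_j and ασ_j) is what yields the combined matrix ∂F/∂y + α ∂F/∂ẏ in a single difference per column.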
We note that with sparse and user-supplied sunmatrix objects, the Jacobian must be supplied by a user routine.

In the case of an iterative linear solver, if a routine for Jv is not supplied, such products are approximated by

  Jv = [F(t, y + σv, ẏ + ασv) − F(t, y, ẏ)] / σ ,

where the increment σ = √N. As an option, the user can specify a constant factor that is inserted into this expression for σ.

During the course of integrating the system, idas computes an estimate of the local truncation error, LTE, at the n-th time step, and requires this to satisfy the inequality

  ‖LTE‖_WRMS ≤ 1 .

Asymptotically, LTE varies as h^{q+1} at step size h and order q, as does the predictor-corrector difference ∆_n ≡ y_n − y_n^{(0)}. Thus there is a constant C such that

  LTE = C ∆_n + O(h^{q+2}) ,

and so the norm of LTE is estimated as |C| · ‖∆_n‖. In addition, idas requires that the error in the associated polynomial interpolant over the current step be bounded by 1 in norm. The leading term of the norm of this error is bounded by C̄ ‖∆_n‖ for another constant C̄. Thus the local error test in idas is

  max{|C|, C̄} ‖∆_n‖ ≤ 1 .   (2.9)

A user option is available by which the algebraic components of the error vector are omitted from the test (2.9), if these have been so identified.

In idas, the local error test is tightly coupled with the logic for selecting the step size and order. First, there is an initial phase that is treated specially; for the first few steps, the step size is doubled and the order raised (from its initial value of 1) on every step, until (a) the local error test (2.9) fails, (b) the order is reduced (by the rules given below), or (c) the order reaches 5 (the maximum). For step and order selection on the general step, idas uses a different set of local error estimates, based on the asymptotic behavior of the local error in the case of fixed step sizes.
At each of the orders q′ equal to q, q−1 (if q > 1), q−2 (if q > 2), or q+1 (if q < 5), there are constants C(q′) such that the norm of the local truncation error at order q′ satisfies

  LTE(q′) = C(q′) ‖φ(q′ + 1)‖ + O(h^{q′+2}) ,

where φ(k) is a modified divided difference of order k that is retained by idas (and behaves asymptotically as h^k). Thus the local truncation errors are estimated as ELTE(q′) = C(q′) ‖φ(q′ + 1)‖ to select step sizes. But the choice of order in idas is based on the requirement that the scaled derivative norms, ‖h^k y^{(k)}‖, are monotonically decreasing with k, for k near q. These norms are again estimated using the φ(k), and in fact

  ‖h^{q′+1} y^{(q′+1)}‖ ≈ T(q′) ≡ (q′ + 1) ELTE(q′) .

The step/order selection begins with a test for monotonicity that is made even before the local error test is performed. Namely, the order is reset to q′ = q − 1 if (a) q = 2 and T(1) ≤ T(2)/2, or (b) q > 2 and max{T(q−1), T(q−2)} ≤ T(q); otherwise q′ = q. Next the local error test (2.9) is performed, and if it fails, the step is redone at order q ← q′ and a new step size h′. The latter is based on the h^{q+1} asymptotic behavior of ELTE(q), and, with safety factors, is given by

  η = h′/h = 0.9/[2 ELTE(q)]^{1/(q+1)} .

The value of η is adjusted so that 0.25 ≤ η ≤ 0.9 before setting h ← h′ = ηh. If the local error test fails a second time, idas uses η = 0.25, and on the third and subsequent failures it uses q = 1 and η = 0.25. After 10 failures, idas returns with a give-up message.

As soon as the local error test has passed, the step and order for the next step may be adjusted. No such change is made if q′ = q − 1 from the prior test, if q = 5, or if q was increased on the previous step. Otherwise, if the last q + 1 steps were taken at a constant order q < 5 and a constant step size, idas considers raising the order to q + 1. The logic is as follows:

(a) If q = 1, then reset q = 2 if T(2) < T(1)/2.
(b) If q > 1, then
  • reset q ← q − 1 if T(q−1) ≤ min{T(q), T(q+1)};
  • else reset q ← q + 1 if T(q+1) < T(q);
  • leave q unchanged otherwise [then T(q−1) > T(q) ≤ T(q+1)].

In any case, the new step size h′ is set much as before:

  η = h′/h = 1/[2 ELTE(q)]^{1/(q+1)} .

The value of η is adjusted such that (a) if η > 2, η is reset to 2; (b) if η ≤ 1, η is restricted to 0.5 ≤ η ≤ 0.9; and (c) if 1 < η < 2, we use η = 1. Finally, h is reset to h′ = ηh. Thus we do not increase the step size unless it can be doubled. See [4] for details.

idas permits the user to impose optional inequality constraints on individual components of the solution vector y. Any of the following four constraints can be imposed: y_i > 0, y_i < 0, y_i ≥ 0, or y_i ≤ 0. Constraint satisfaction is tested after a successful nonlinear system solution. If any constraint fails, we declare a convergence failure of the nonlinear iteration and reduce the step size. Rather than cutting the step size by some arbitrary factor, idas estimates a new step size h′ using a linear approximation of the components in y that failed the constraint test (including a safety factor of 0.9 to cover the strict inequality case). These additional constraints are also imposed during the calculation of consistent initial conditions.

Normally, idas takes steps until a user-defined output value t = t_out is overtaken, and then computes y(t_out) by interpolation. However, a "one step" mode option is available, where control returns to the calling program after each step. There are also options to force idas not to integrate past a given stopping point t = t_stop.

2.2 Preconditioning

When using a nonlinear solver that requires the solution of a linear system of the form J∆y = −G (e.g., the default Newton iteration), idas makes repeated use of a linear solver.
If this linear system solve is done with one of the scaled preconditioned iterative linear solvers supplied with sundials, these solvers are rarely successful if used without preconditioning; it is generally necessary to precondition the system in order to obtain acceptable efficiency. A system Ax = b can be preconditioned on the left, on the right, or on both sides. The Krylov method is then applied to a system with the matrix P^{−1}A, or AP^{−1}, or P_L^{−1} A P_R^{−1}, instead of A. However, within idas, preconditioning is allowed only on the left, so that the iterative method is applied to systems (P^{−1}J)∆y = −P^{−1}G. Left preconditioning is required to make the norm of the linear residual in the nonlinear iteration meaningful; in general, ‖J∆y + G‖ is meaningless, since the weights used in the WRMS-norm correspond to y.

In order to improve the convergence of the Krylov iteration, the preconditioner matrix P should in some sense approximate the system matrix A. Yet at the same time, in order to be cost-effective, the matrix P should be reasonably efficient to evaluate and solve. Finding a good point in this trade-off between rapid convergence and low cost can be very difficult. Good choices are often problem-dependent (for example, see [6] for an extensive study of preconditioners for reaction-transport systems).

Typical preconditioners used with idas are based on approximations to the iteration matrix of the systems involved; in other words, P ≈ ∂F/∂y + α ∂F/∂ẏ, where α is a scalar inversely proportional to the integration step size h. Because the Krylov iteration occurs within a nonlinear solver iteration and further also within a time integration, and since each of these iterations has its own test for convergence, the preconditioner may use a very crude approximation, as long as it captures the dominant numerical feature(s) of the system.
We have found that the combination of a preconditioner with the Newton-Krylov iteration, using even a fairly poor approximation to the Jacobian, can be surprisingly superior to using the same matrix without Krylov acceleration (i.e., a modified Newton iteration), as well as to using the Newton-Krylov method with no preconditioning.

2.3 Rootfinding

The idas solver has been augmented to include a rootfinding feature. This means that, while integrating the initial value problem (2.1), idas can also find the roots of a set of user-defined functions g_i(t, y, ẏ) that depend on t, the solution vector y = y(t), and its t-derivative ẏ(t). The number of these root functions is arbitrary, and if more than one g_i is found to have a root in any given interval, the various root locations are found and reported in the order that they occur on the t axis, in the direction of integration.

Generally, this rootfinding feature finds only roots of odd multiplicity, corresponding to changes in sign of g_i(t, y(t), ẏ(t)), denoted g_i(t) for short. If a user root function has a root of even multiplicity (no sign change), it will probably be missed by idas. If such a root is desired, the user should reformulate the root function so that it changes sign at the desired root.

The basic scheme used is to check for sign changes of any g_i(t) over each time step taken, and then (when a sign change is found) to home in on the root (or roots) with a modified secant method [25]. In addition, each time g is computed, idas checks to see if g_i(t) = 0 exactly, and if so it reports this as a root. However, if an exact zero of any g_i is found at a point t, idas computes g at t + δ for a small increment δ, slightly further in the direction of integration, and if any g_i(t + δ) = 0 also, idas stops and reports an error.
This way, each time idas takes a time step, it is guaranteed that the values of all g_i are nonzero at some past value of t, beyond which a search for roots is to be done.

At any given time in the course of the time-stepping, after suitable checking and adjusting has been done, idas has an interval (t_lo, t_hi] in which roots of the g_i(t) are to be sought, such that t_hi is further ahead in the direction of integration, and all g_i(t_lo) ≠ 0. The endpoint t_hi is either t_n, the end of the time step last taken, or the next requested output time t_out if this comes sooner. The endpoint t_lo is either t_{n−1}, or the last output time t_out (if this occurred within the last step), or the last root location (if a root was just located within this step), possibly adjusted slightly toward t_n if an exact zero was found. The algorithm checks g at t_hi for zeros and for sign changes in (t_lo, t_hi). If no sign changes are found, then either a root is reported (if some g_i(t_hi) = 0) or we proceed to the next time interval (starting at t_hi). If one or more sign changes were found, then a loop is entered to locate the root to within a rather tight tolerance, given by

  τ = 100 · U · (|t_n| + |h|)   (U = unit roundoff) .

Whenever sign changes are seen in two or more root functions, the one deemed most likely to have its root occur first is the one with the largest value of |g_i(t_hi)| / |g_i(t_hi) − g_i(t_lo)|, corresponding to the closest to t_lo of the secant method values. At each pass through the loop, a new value t_mid is set, strictly within the search interval, and the values of g_i(t_mid) are checked. Then either t_lo or t_hi is reset to t_mid according to which subinterval is found to have the sign change. If there is none in (t_lo, t_mid) but some g_i(t_mid) = 0, then that root is reported. The loop continues until |t_hi − t_lo| < τ, and then the reported root location is t_hi.
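The bracketing loop just described can be sketched for a single root function as follows. This is our own simplified rendering: it uses a weighted secant update with a side-dependent weight `alpha` (as idas does) and a crude safeguard keeping the trial point inside the interval, and it omits the multiple-function selection logic:

```python
def locate_root(g, tlo, thi, tau):
    """Locate a sign change of g on (tlo, thi] to within tau, using a
    weighted secant formula; alpha is halved/doubled when the sign change
    falls on the same side twice in a row, which prevents one-sided
    stagnation of the plain secant method.  Returns thi (the reported root)."""
    glo, ghi = g(tlo), g(thi)
    alpha, last_side, prev_side = 1.0, 0, 0
    while abs(thi - tlo) >= tau:
        tmid = thi - (thi - tlo) * ghi / (ghi - alpha * glo)
        # safeguard: keep tmid at least tau/2 inside the interval
        tmid = min(max(tmid, tlo + 0.5 * tau), thi - 0.5 * tau)
        gmid = g(tmid)
        if glo * gmid <= 0.0:
            thi, ghi, side = tmid, gmid, -1   # sign change on the low side
        else:
            tlo, glo, side = tmid, gmid, +1   # sign change on the high side
        prev_side, last_side = last_side, side
        if prev_side == last_side:            # same side on two passes
            alpha = 0.5 * alpha if side < 0 else 2.0 * alpha
        else:
            alpha = 1.0
    return thi
```

The sign-change bracket (g(t_lo)·g(t_hi) ≤ 0) is preserved on every pass, so the loop closes in on the root while the α-adjustment keeps both endpoints moving.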
In the loop to locate the root of g_i(t), the formula for t_mid is

  t_mid = t_hi − (t_hi − t_lo) g_i(t_hi) / [g_i(t_hi) − α g_i(t_lo)] ,

where α is a weight parameter. On the first two passes through the loop, α is set to 1, making t_mid the secant method value. Thereafter, α is reset according to the side of the subinterval (low vs. high, i.e., toward t_lo vs. toward t_hi) in which the sign change was found in the previous two passes. If the two sides were opposite, α is set to 1. If the two sides were the same, α is halved (if on the low side) or doubled (if on the high side). The value of t_mid is closer to t_lo when α < 1 and closer to t_hi when α > 1. If the above value of t_mid is within τ/2 of t_lo or t_hi, it is adjusted inward, such that its fractional distance from the endpoint (relative to the interval size) is between 0.1 and 0.5 (0.5 being the midpoint), and the actual distance from the endpoint is at least τ/2.

2.4 Pure quadrature integration

In many applications, and most notably during the backward integration phase of an adjoint sensitivity analysis run (see §2.6), it is of interest to compute integral quantities of the form

  z(t) = ∫_{t_0}^{t} q(τ, y(τ), ẏ(τ), p) dτ .   (2.10)

The most effective approach to compute z(t) is to extend the original problem with the additional ODEs (obtained by applying Leibniz's differentiation rule):

  ż = q(t, y, ẏ, p) ,  z(t_0) = 0 .   (2.11)

Note that this is equivalent to using a quadrature method based on the underlying linear multistep polynomial representation for y(t). This can be done at the "user level" by simply exposing to idas the extended DAE system (2.2)+(2.11). However, in the context of an implicit integration solver, this approach is not desirable since the nonlinear solver module will require the Jacobian (or Jacobian-vector product) of this extended DAE.
Moreover, since the additional states z do not enter the right-hand side of the ODE (2.11), and therefore the residual of the extended DAE system does not depend on z, it is much more efficient to treat the ODE system (2.11) separately from the original DAE system (2.2) by "taking out" the additional states z from the nonlinear system (2.4) that must be solved in the correction step of the LMM. Instead, "corrected" values z_n are computed explicitly as

  z_n = (1/α_{n,0}) ( h_n q(t_n, y_n, ẏ_n, p) − sum_{i=1}^{q} α_{n,i} z_{n−i} ) ,

once the new approximation y_n is available. The quadrature variables z can be optionally included in the error test, in which case corresponding relative and absolute tolerances must be provided.

2.5 Forward sensitivity analysis

Typically, the governing equations of complex, large-scale models depend on various parameters, through the right-hand side vector and/or through the vector of initial conditions, as in (2.2). In addition to numerically solving the DAEs, it may be desirable to determine the sensitivity of the results with respect to the model parameters. Such sensitivity information can be used to estimate which parameters are most influential in affecting the behavior of the simulation, or to evaluate optimization gradients (in the setting of dynamic optimization, parameter estimation, optimal control, etc.).

The solution sensitivity with respect to the model parameter p_i is defined as the vector s_i(t) = ∂y(t)/∂p_i and satisfies the following forward sensitivity equations (or sensitivity equations for short):

  (∂F/∂y) s_i + (∂F/∂ẏ) ṡ_i + ∂F/∂p_i = 0 ,
  s_i(t_0) = ∂y_0(p)/∂p_i ,  ṡ_i(t_0) = ∂ẏ_0(p)/∂p_i ,   (2.12)

obtained by applying the chain rule of differentiation to the original DAEs (2.2).
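A scalar example can make (2.12) concrete. For the toy problem F(t, y, ẏ, p) = ẏ + p·y (our illustration, with exact solution y = y₀ e^{−pt}), the sensitivity s = ∂y/∂p = −t·y satisfies the sensitivity residual F_y s + F_ẏ ṡ + F_p = 0, which we can check numerically:

```python
import math

def sens_residual(t, y, yp, s, sp, p):
    """Residual of the forward sensitivity equation (2.12) for the scalar
    problem F(t, y, yp, p) = yp + p*y, for which F_y = p, F_yp = 1, F_p = y."""
    return p * s + sp + y

# Exact solution y(t) = y0*exp(-p*t); its sensitivity is s = dy/dp = -t*y.
y0, p, t = 2.0, 0.5, 1.3
y = y0 * math.exp(-p * t)
yp = -p * y          # from the ODE y' = -p*y
s = -t * y           # analytic sensitivity
sp = -y - t * yp     # time derivative of s
```

Plugging the analytic (y, ẏ, s, ṡ) into the residual gives zero, confirming that s solves the linearized system obtained by the chain rule.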
When performing forward sensitivity analysis, idas carries out the time integration of the combined system, (2.2) and (2.12), by viewing it as a DAE system of size N(N_s + 1), where N_s is the number of model parameters p_i with respect to which sensitivities are desired (N_s ≤ N_p). However, major improvements in efficiency can be made by taking advantage of the special form of the sensitivity equations as linearizations of the original DAEs. In particular, the original DAE system and all sensitivity systems share the same Jacobian matrix J in (2.6).

The sensitivity equations are solved with the same linear multistep formula that was selected for the original DAEs, and the same linear solver is used in the correction phase for both state and sensitivity variables. In addition, idas offers the option of including (full error control) or excluding (partial error control) the sensitivity variables from the local error test.

2.5.1 Forward sensitivity methods

In what follows we briefly describe three methods that have been proposed for the solution of the combined DAE and sensitivity system for the vector ŷ = [y, s_1, . . ., s_{N_s}].

• Staggered Direct. In this approach [14], the nonlinear system (2.4) is first solved and, once an acceptable numerical solution is obtained, the sensitivity variables at the new step are found by directly solving (2.12) after the BDF discretization is used to eliminate ṡ_i. Although the system matrix of the above linear system is based on exactly the same information as the matrix J in (2.6), it must be updated and factored at every step of the integration, in contrast to an evaluation of J, which is updated only occasionally. For problems with many parameters (relative to the problem size), the staggered direct method can outperform the methods described below [34].
However, the computational cost associated with matrix updates and factorizations makes this method unattractive for problems with many more states than parameters (such as those arising from semi-discretization of PDEs), and it is therefore not implemented in idas.

• Simultaneous Corrector. In this method [37], the discretization is applied simultaneously to both the original equations (2.2) and the sensitivity systems (2.12), resulting in an "extended" nonlinear system Ĝ(ŷ_n) = 0, where ŷ_n = [y_n, . . ., s_i, . . .]. This combined nonlinear system can be solved using a modified Newton method as in (2.5) by solving the corrector equation

  Ĵ [ŷ_n^{(m+1)} − ŷ_n^{(m)}] = −Ĝ(ŷ_n^{(m)})   (2.13)

at each iteration, where Ĵ is block lower-triangular, with J on the main diagonal, the blocks J_1, J_2, . . ., J_{N_s} in the first block column, and zeros elsewhere. Here J is defined as in (2.6), and J_i = (∂/∂y)[F_y s_i + F_ẏ ṡ_i + F_{p_i}]. It can be shown that two-step quadratic convergence can be retained by using only the block-diagonal portion of Ĵ in the corrector equation (2.13). This results in a decoupling that allows the reuse of J without additional matrix factorizations. However, the sum F_y s_i + F_ẏ ṡ_i + F_{p_i} must still be reevaluated at each step of the iterative process (2.13) to update the sensitivity portions of the residual Ĝ.

• Staggered Corrector. In this approach [22], as in the staggered direct method, the nonlinear system (2.4) is solved first using the Newton iteration (2.5). Then, for each sensitivity vector ξ ≡ s_i, a separate Newton iteration is used to solve the sensitivity system (2.12):

  J [ξ_n^{(m+1)} − ξ_n^{(m)}] =
    − [ F_y(t_n, y_n, ẏ_n) ξ_n^{(m)}
      + F_ẏ(t_n, y_n, ẏ_n) · h_n^{−1} ( α_{n,0} ξ_n^{(m)} + sum_{i=1}^{q} α_{n,i} ξ_{n−i} )
      + F_{p_i}(t_n, y_n, ẏ_n) ] .   (2.14)

In other words, a modified Newton iteration is used to solve a linear system. In this approach, the matrices ∂F/∂y, ∂F/∂ẏ and the vectors ∂F/∂p_i need be updated only once per integration step, after the state correction phase (2.5) has converged.

idas implements both the simultaneous corrector method and the staggered corrector method.
An important observation is that the staggered corrector method, combined with a Krylov linear solver, effectively results in a staggered direct method. Indeed, the Krylov solver requires only the action of the matrix J on a vector, and this can be provided with the current Jacobian information. Therefore, the modified Newton procedure (2.14) will theoretically converge after one iteration.

2.5.2 Selection of the absolute tolerances for sensitivity variables

If the sensitivities are included in the error test, idas provides an automated estimation of absolute tolerances for the sensitivity variables based on the absolute tolerance for the corresponding state variable. The relative tolerance for sensitivity variables is set to be the same as for the state variables. The selection of absolute tolerances for the sensitivity variables is based on the observation that the sensitivity vector s_i will have units of [y]/[p_i]. With this, the absolute tolerance for the j-th component of the sensitivity vector s_i is set to atol_j/|p̄_i|, where atol_j are the absolute tolerances for the state variables and p̄ is a vector of scaling factors that are dimensionally consistent with the model parameters p and give an indication of their order of magnitude. This choice of relative and absolute tolerances is equivalent to requiring that the weighted root-mean-square norm of the sensitivity vector s_i with weights based on s_i be the same as the weighted root-mean-square norm of the vector of scaled sensitivities s̄_i = |p̄_i| s_i with weights based on the state variables (the scaled sensitivities s̄_i being dimensionally consistent with the state variables). However, this choice of tolerances for the s_i may be a poor one, and the user of idas can provide different values as an option.
2.5.3 Evaluation of the sensitivity right-hand side

There are several methods for evaluating the residual functions in the sensitivity systems (2.12): analytic evaluation, automatic differentiation, complex-step approximation, and finite differences (or directional derivatives). idas provides all the software hooks for implementing interfaces to automatic differentiation (AD) or complex-step approximation; future versions will include a generic interface to AD-generated functions. At the present time, besides the option for analytical sensitivity right-hand sides (user-provided), idas can evaluate these quantities using various finite-difference-based approximations to evaluate the terms (∂F/∂y)s_i + (∂F/∂ẏ)ṡ_i and (∂F/∂p_i), or using directional derivatives to evaluate [(∂F/∂y)s_i + (∂F/∂ẏ)ṡ_i + (∂F/∂p_i)].

As is typical for finite differences, the proper choice of perturbations is a delicate matter. idas takes into account several problem-related features: the relative DAE error tolerance rtol, the machine unit roundoff U, the scale factor p̄_i, and the weighted root-mean-square norm of the sensitivity vector s_i. Using central finite differences as an example, the two terms (∂F/∂y)s_i + (∂F/∂ẏ)ṡ_i and ∂F/∂p_i in (2.12) can be evaluated either separately:

  (∂F/∂y)s_i + (∂F/∂ẏ)ṡ_i ≈ [F(t, y + σ_y s_i, ẏ + σ_y ṡ_i, p) − F(t, y − σ_y s_i, ẏ − σ_y ṡ_i, p)] / (2σ_y) ,   (2.15)

  ∂F/∂p_i ≈ [F(t, y, ẏ, p + σ_i e_i) − F(t, y, ẏ, p − σ_i e_i)] / (2σ_i) ,   (2.15')

with

  σ_i = |p̄_i| √(max(rtol, U)) ,  σ_y = 1 / max( 1/σ_i , ‖s_i‖_WRMS / |p̄_i| ) ,

or simultaneously:

  (∂F/∂y)s_i + (∂F/∂ẏ)ṡ_i + ∂F/∂p_i ≈ [F(t, y + σs_i, ẏ + σṡ_i, p + σe_i) − F(t, y − σs_i, ẏ − σṡ_i, p − σe_i)] / (2σ) ,   (2.16)

with σ = min(σ_i, σ_y), or by adaptively switching between (2.15)+(2.15') and (2.16), depending on the relative size of the two finite-difference increments σ_i and σ_y. In the adaptive scheme, if ρ = max(σ_i/σ_y, σ_y/σ_i), we use separate evaluations if ρ > ρ_max (an input value), and simultaneous evaluations otherwise.
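The perturbation-size rule above is short enough to sketch directly (our own rendering of the formulas for σ_i and σ_y; the default value of U is an assumption for double precision):

```python
import math

def sens_increments(pbar_i, rtol, si_wrms, uround=2.2e-16):
    """Perturbation sizes for the finite-difference sensitivity residual:
        sigma_i = |pbar_i| * sqrt(max(rtol, U))
        sigma_y = 1 / max(1/sigma_i, ||s_i||_WRMS / |pbar_i|)
    so that sigma_y shrinks when the (scaled) sensitivity vector is large."""
    sigma_i = abs(pbar_i) * math.sqrt(max(rtol, uround))
    sigma_y = 1.0 / max(1.0 / sigma_i, si_wrms / abs(pbar_i))
    return sigma_i, sigma_y
```

When the sensitivity norm is small, σ_y simply equals σ_i; a large ‖s_i‖ makes σ_y smaller, so the perturbed argument y + σ_y s_i stays a modest distance from y.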
These procedures for choosing the perturbations (σ_i, σ_y, σ) and switching between derivative formulas have also been implemented for one-sided difference formulas. Forward finite differences can be applied to (∂F/∂y)s_i + (∂F/∂ẏ)ṡ_i and ∂F/∂p_i separately, or the single directional derivative formula

  (∂F/∂y)s_i + (∂F/∂ẏ)ṡ_i + ∂F/∂p_i ≈ [F(t, y + σs_i, ẏ + σṡ_i, p + σe_i) − F(t, y, ẏ, p)] / σ

can be used. In idas, the default value ρ_max = 0 indicates the use of the second-order centered directional derivative formula (2.16) exclusively. Otherwise, the magnitude of ρ_max and its sign (positive or negative) indicate whether this switching is done with regard to (centered or forward) finite differences, respectively.

2.5.4 Quadratures depending on forward sensitivities

If pure quadrature variables are also included in the problem definition (see §2.4), idas does not carry their sensitivities automatically. Instead, we provide a more general feature through which integrals depending on both the states y of (2.2) and the state sensitivities s_i of (2.12) can be evaluated. In other words, idas provides support for computing integrals of the form

  z̄(t) = ∫_{t_0}^{t} q̄(τ, y(τ), ẏ(τ), s_1(τ), . . ., s_{N_p}(τ), p) dτ .

If the sensitivities of the quadrature variables z of (2.10) are desired, these can then be computed by using

  q̄_i = q_y s_i + q_ẏ ṡ_i + q_{p_i} ,  i = 1, . . ., N_p ,

as integrands for z̄, where q_y, q_ẏ, and q_p are the partial derivatives of the integrand function q of (2.10). As with the quadrature variables z, the new variables z̄ are also excluded from any nonlinear solver phase, and "corrected" values z̄_n are obtained through explicit formulas.

2.6 Adjoint sensitivity analysis

In the forward sensitivity approach described in the previous section, obtaining sensitivities with respect to N_s parameters is roughly equivalent to solving a DAE system of size (1 + N_s)N.
This can become prohibitively expensive, especially for large-scale problems, if sensitivities with respect to many parameters are desired. In this situation, the adjoint sensitivity method is a very attractive alternative, provided that we do not need the solution sensitivities s_i, but rather the gradients with respect to model parameters of relatively few derived functionals of the solution. In other words, if y(t) is the solution of (2.2), we wish to evaluate the gradient dG/dp of

  G(p) = ∫_{t_0}^{T} g(t, y, p) dt ,   (2.17)

or, alternatively, the gradient dg/dp of the function g(t, y, p) at the final time t = T. The function g must be smooth enough that ∂g/∂y and ∂g/∂p exist and are bounded. In what follows, we only sketch the analysis for the sensitivity problem for both G and g. For details on the derivation see [13].

2.6.1 Sensitivity of G(p)

We focus first on solving the sensitivity problem for G(p) defined by (2.17). Introducing a Lagrange multiplier λ, we form the augmented objective function

  I(p) = G(p) − ∫_{t_0}^{T} λ* F(t, y, ẏ, p) dt .

Since F(t, y, ẏ, p) = 0, the sensitivity of G with respect to p is

  dG/dp = dI/dp = ∫_{t_0}^{T} (g_p + g_y y_p) dt − ∫_{t_0}^{T} λ* (F_p + F_y y_p + F_ẏ ẏ_p) dt ,   (2.18)

where subscripts on functions such as F or g are used to denote partial derivatives. By integration by parts, we have

  ∫_{t_0}^{T} λ* F_ẏ ẏ_p dt = (λ* F_ẏ y_p)|_{t_0}^{T} − ∫_{t_0}^{T} (λ* F_ẏ)′ y_p dt ,

where (· · ·)′ denotes the t-derivative. Thus equation (2.18) becomes

  dG/dp = ∫_{t_0}^{T} (g_p − λ* F_p) dt − ∫_{t_0}^{T} [−g_y + λ* F_y − (λ* F_ẏ)′] y_p dt − (λ* F_ẏ y_p)|_{t_0}^{T} .   (2.19)

Now, by requiring λ to satisfy

  (λ* F_ẏ)′ − λ* F_y = −g_y ,   (2.20)

we obtain

  dG/dp = ∫_{t_0}^{T} (g_p − λ* F_p) dt − (λ* F_ẏ y_p)|_{t_0}^{T} .   (2.21)

Note that y_p at t = t_0 is the sensitivity of the initial conditions with respect to p, which is easily obtained. To find the initial conditions (at t = T) for the adjoint system, we must take into consideration the structure of the DAE system.
For index-0 and index-1 DAE systems, we can simply take
\[
\lambda^* F_{\dot y} \big|_{t=T} = 0 , \tag{2.22}
\]
yielding the sensitivity equation for $dG/dp$
\[
\frac{dG}{dp} = \int_{t_0}^{T} (g_p - \lambda^* F_p) \, dt + \left(\lambda^* F_{\dot y} y_p\right)\big|_{t=t_0} . \tag{2.23}
\]
This choice will not suffice for a Hessenberg index-2 DAE system. For a derivation of proper final conditions in such cases, see [13].

The first thing to notice about the adjoint system (2.20) is that there is no explicit specification of the parameters $p$; this implies that, once the solution $\lambda$ is found, the formula (2.21) can then be used to find the gradient of $G$ with respect to any of the parameters $p$. The second important remark is that the adjoint system (2.20) is a terminal value problem which depends on the solution $y(t)$ of the original IVP (2.2). Therefore, a procedure is needed for providing the states $y$ obtained during a forward integration phase of (2.2) to idas during the backward integration phase of (2.20). The approach adopted in idas, based on checkpointing, is described in §2.6.3 below.

2.6.2 Sensitivity of g(T, p)

Now let us consider the computation of $dg/dp(T)$. From $dg/dp(T) = (d/dT)(dG/dp)$ and equation (2.21), we have
\[
\frac{dg}{dp} = (g_p - \lambda^* F_p)(T)
- \int_{t_0}^{T} \lambda_T^* F_p \, dt
+ \left(\lambda_T^* F_{\dot y} y_p\right)\big|_{t=t_0}
- \frac{d\left(\lambda^* F_{\dot y} y_p\right)}{dT}\bigg|_{t=T} , \tag{2.24}
\]
where $\lambda_T$ denotes $\partial \lambda/\partial T$. For index-0 and index-1 DAEs, we obtain
\[
\frac{d\left(\lambda^* F_{\dot y} y_p\right)}{dT}\bigg|_{t=T} = 0 ,
\]
while for a Hessenberg index-2 DAE system we have
\[
\frac{d\left(\lambda^* F_{\dot y} y_p\right)}{dT}\bigg|_{t=T}
= - \frac{d\left( g_{y^a} (CB)^{-1} f_p^2 \right)}{dt}\bigg|_{t=T} .
\]
The corresponding adjoint equations are
\[
(\lambda_T^* F_{\dot y})' - \lambda_T^* F_y = 0 . \tag{2.25}
\]
For index-0 and index-1 DAEs (as shown above, the index-2 case is different), to find the boundary condition for this equation we write $\lambda$ as $\lambda(t, T)$ because it depends on both $t$ and $T$. Then
\[
\lambda^*(T, T) F_{\dot y} \big|_{t=T} = 0 .
\]
Taking the total derivative, we obtain
\[
(\lambda_t + \lambda_T)^*(T, T) \, F_{\dot y} \big|_{t=T}
+ \lambda^*(T, T) \, \frac{dF_{\dot y}}{dt}\bigg|_{t=T} = 0 .
\]
Since $\lambda_t$ is just $\dot\lambda$, we have the boundary condition
\[
\left(\lambda_T^* F_{\dot y}\right)\big|_{t=T}
= - \left[ \lambda^*(T, T) \frac{dF_{\dot y}}{dt} + \dot\lambda^* F_{\dot y} \right]\bigg|_{t=T} .
\]
For the index-one DAE case, the above relation and (2.20) yield
\[
\left(\lambda_T^* F_{\dot y}\right)\big|_{t=T} = \left[ g_y - \lambda^* F_y \right]\big|_{t=T} . \tag{2.26}
\]
For the regular implicit ODE case, $F_{\dot y}$ is invertible; thus we have $\lambda(T, T) = 0$, which leads to $\lambda_T(T) = -\dot\lambda(T)$. As with the final conditions for $\lambda(T)$ in (2.20), the above selection for $\lambda_T(T)$ is not sufficient for index-two Hessenberg DAEs (see [13] for details).

2.6.3 Checkpointing scheme

During the backward integration, the evaluation of the right-hand side of the adjoint system requires, at the current time, the states $y$ which were computed during the forward integration phase. Since idas implements variable-step integration formulas, it is unlikely that the states will be available at the desired time, and so some form of interpolation is needed. The idas implementation being also variable-order, it is possible that during the forward integration phase the order may be reduced as low as first order, which means that there may be points in time where only $y$ and $\dot y$ are available. These requirements therefore limit the choices for possible interpolation schemes. idas implements two interpolation methods: a cubic Hermite interpolation algorithm and a variable-degree polynomial interpolation method which attempts to mimic the BDF interpolant for the forward integration.

However, especially for large-scale problems and long integration intervals, the number and size of the vectors $y$ and $\dot y$ that would need to be stored make this approach computationally intractable. Thus, idas settles for a compromise between storage space and execution time by implementing a so-called checkpointing scheme. At the cost of at most one additional forward integration, this approach offers the best possible estimate of memory requirements for adjoint sensitivity analysis.
To begin with, based on the problem size $N$ and the available memory, the user decides on the number $N_d$ of data pairs $(y, \dot y)$, if cubic Hermite interpolation is selected, or on the number $N_d$ of $y$ vectors, in the case of variable-degree polynomial interpolation, that can be kept in memory for the purpose of interpolation. Then, during the first forward integration stage, after every $N_d$ integration steps a checkpoint is formed by saving enough information (either in memory or on disk) to allow for a hot restart, that is, a restart which will exactly reproduce the forward integration. In order to avoid storing Jacobian-related data at each checkpoint, a reevaluation of the iteration matrix is forced before each checkpoint. At the end of this stage, we are left with $N_c$ checkpoints, including one at $t_0$. During the backward integration stage, the adjoint variables are integrated backwards from $T$ to $t_0$, going from one checkpoint to the previous one. The backward integration from checkpoint $i+1$ to checkpoint $i$ is preceded by a forward integration from $i$ to $i+1$ during which the $N_d$ vectors $y$ (and, if necessary, $\dot y$) are generated and stored in memory for interpolation.¹

[Figure 2.1: Illustration of the checkpointing algorithm for generation of the forward solution during the integration of the adjoint system. A forward pass over $[t_0, t_f]$ lays down checkpoints $k_0, k_1, k_2, k_3, \ldots$ at times $t_0, t_1, t_2, t_3, \ldots$; the backward pass then proceeds segment by segment from $t_f$ back to $t_0$.]

This approach transfers the uncertainty in the number of integration steps in the forward integration phase to uncertainty in the final number of checkpoints. However, $N_c$ is much smaller than the number of steps taken during the forward integration, and there is no major penalty for writing/reading the checkpoint data to/from a temporary file.

Note that, at the end of the first forward integration stage, interpolation data are available from the last checkpoint to the end of the interval of integration. If no checkpoints are necessary ($N_d$ is larger than the number of integration steps taken in the solution of (2.2)), the total cost of an adjoint sensitivity computation can be as low as one forward plus one backward integration. In addition, idas provides the capability of reusing a set of checkpoints for multiple backward integrations, thus allowing for efficient computation of gradients of several functionals (2.17).

Finally, we note that the adjoint sensitivity module in idas provides the necessary infrastructure to integrate backwards in time any DAE terminal value problem dependent on the solution of the IVP (2.2), including adjoint systems (2.20) or (2.25), as well as any other quadrature ODEs that may be needed in evaluating the integrals in (2.21). In particular, for DAE systems arising from semidiscretization of time-dependent PDEs, this feature allows for integration of either the discretized adjoint PDE system or the adjoint of the discretized PDE.

¹The degree of the interpolation polynomial is always that of the current BDF order for the forward integration at the first point to the right of the time at which the interpolated value is sought (unless too close to the $i$-th checkpoint, in which case it uses the BDF order at the right-most relevant point). However, because of the FLC BDF implementation (see §2.1), the resulting interpolation polynomial is only an approximation to the underlying BDF interpolant. The Hermite cubic interpolation option is present because it was implemented chronologically first and it is also used by other adjoint solvers (e.g., daspkadjoint). The variable-degree polynomial is more memory-efficient (it requires only half of the memory storage of the cubic Hermite interpolation) and is more accurate.

2.7 Second-order sensitivity analysis

In some applications (e.g., dynamically-constrained optimization) it may be desirable to compute second-order derivative information. Considering the DAE problem (2.2) and some model output functional² $g(y)$, the Hessian $d^2 g/dp^2$ can be obtained in a forward sensitivity analysis setting as
\[
\frac{d^2 g}{dp^2} = \left( g_y \otimes I_{N_p} \right) y_{pp} + y_p^T g_{yy} y_p ,
\]
where $\otimes$ is the Kronecker product. The second-order sensitivities are solution of the matrix DAE system
\[
\left( F_{\dot y} \otimes I_{N_p} \right) \dot y_{pp}
+ \left( F_y \otimes I_{N_p} \right) y_{pp}
+ \left( I_N \otimes \dot y_p^T \right) \left( F_{\dot y \dot y} \dot y_p + F_{y \dot y} y_p \right)
+ \left( I_N \otimes y_p^T \right) \left( F_{y \dot y} \dot y_p + F_{yy} y_p \right) = 0 ,
\]
\[
y_{pp}(t_0) = \frac{\partial^2 y_0}{\partial p^2} , \qquad
\dot y_{pp}(t_0) = \frac{\partial^2 \dot y_0}{\partial p^2} ,
\]
where $y_p$ denotes the first-order sensitivity matrix, the solution of $N_p$ systems (2.12), and $y_{pp}$ is a third-order tensor. It is easy to see that, except for situations in which the number of parameters $N_p$ is very small, the computational cost of this so-called forward-over-forward approach is exorbitant, as it requires the solution of $N_p + N_p^2$ additional DAE systems of the same dimension as (2.2).

²For the sake of simplicity in presentation, we do not include explicit dependencies of $g$ on time $t$ or parameters $p$. Moreover, we only consider the case in which the dependency of the original DAE (2.2) on the parameters $p$ is through its initial conditions only. For details on the derivation in the general case, see [38].

A much more efficient alternative is to compute Hessian-vector products using a so-called forward-over-adjoint approach. This method is based on using the same "trick" as the one used in computing gradients of pointwise functionals with the adjoint method, namely applying a formal directional forward derivation to the gradient of (2.21) (or the equivalent one for a pointwise functional $g(T, y(T))$). With that approach, the cost of computing a full Hessian is roughly equivalent to the cost of computing the gradient with forward sensitivity analysis. However, Hessian-vector products can be cheaply computed with one additional adjoint solve.
As an illustration³, consider the ODE problem
\[
\dot y = f(t, y) , \qquad y(t_0) = y_0(p) ,
\]
depending on some parameters $p$ through the initial conditions only, and consider the model functional output $G(p) = \int_{t_0}^{t_f} g(t, y) \, dt$. It can be shown that the product between the Hessian of $G$ (with respect to the parameters $p$) and some vector $u$ can be computed as
\[
\frac{\partial^2 G}{\partial p^2} u
= \left[ \left( \lambda^T \otimes I_{N_p} \right) y_{pp} u + y_p^T \mu \right]_{t=t_0} ,
\]
where $\lambda$ and $\mu$ are solutions of
\[
\begin{aligned}
- \dot\mu &= f_y^T \mu + \left( \lambda^T \otimes I_n \right) f_{yy} s ; & \mu(t_f) &= 0 \\
- \dot\lambda &= f_y^T \lambda + g_y^T ; & \lambda(t_f) &= 0 \\
\dot s &= f_y s ; & s(t_0) &= y_{0p} u .
\end{aligned} \tag{2.27}
\]
In the above equation, $s = y_p u$ is a linear combination of the columns of the sensitivity matrix $y_p$. The forward-over-adjoint approach hinges crucially on the fact that $s$ can be computed at the cost of a forward sensitivity analysis with respect to a single parameter (the last ODE problem above), which is possible due to the linearity of the forward sensitivity equations (2.12). Therefore (and this is also valid for the DAE case), the cost of computing the Hessian-vector product is roughly that of two forward and two backward integrations of a system of DAEs of size $N$. For more details, including the corresponding formulas for a pointwise model functional output, see the work by Ozyurt and Barton [38], who discuss this problem for ODE initial value problems. As far as we know, there is no published equivalent work on DAE problems. However, the derivations given in [38] for ODE problems can be extended to DAEs with some careful consideration given to the derivation of proper final conditions on the adjoint systems, following the ideas presented in [13].

To allow the forward-over-adjoint approach described above, idas provides support for:

• the integration of multiple backward problems depending on the same underlying forward problem (2.2), and

• the integration of backward problems and computation of backward quadratures depending on both the states $y$ and forward sensitivities (for this particular application, $s$) of the original problem (2.2).
³The derivation for the general DAE case is too involved for the purposes of this discussion.

Chapter 3 Code Organization

3.1 SUNDIALS organization

The family of solvers referred to as sundials consists of the solvers cvode and arkode (for ODE systems), kinsol (for nonlinear algebraic systems), and ida (for differential-algebraic systems). In addition, sundials also includes variants of cvode and ida with sensitivity analysis capabilities (using either forward or adjoint methods), called cvodes and idas, respectively. The various solvers of this family share many subordinate modules. For this reason, the suite is organized as a family, with a directory structure that exploits that sharing (see Figs. 3.1 and 3.2). The following is a list of the solver packages presently available, and the basic functionality of each:

• cvode, a solver for stiff and nonstiff ODE systems $dy/dt = f(t, y)$ based on Adams and BDF methods;
• cvodes, a solver for stiff and nonstiff ODE systems with sensitivity analysis capabilities;
• arkode, a solver for ODE systems $M \, dy/dt = f_E(t, y) + f_I(t, y)$ based on additive Runge-Kutta methods;
• ida, a solver for differential-algebraic systems $F(t, y, \dot y) = 0$ based on BDF methods;
• idas, a solver for differential-algebraic systems with sensitivity analysis capabilities;
• kinsol, a solver for nonlinear algebraic systems $F(u) = 0$.

3.2 IDAS organization

The idas package is written in the ANSI C language. The following summarizes the basic structure of the package, although knowledge of this structure is not necessary for its use. The overall organization of the idas package is shown in Figure 3.3. The central integration module, implemented in the files idas.h, idas_impl.h, and idas.c, deals with the evaluation of integration coefficients, estimation of local error, selection of stepsize and order, and interpolation to user output points, among other issues.
idas utilizes generic linear and nonlinear solver modules defined by the sunlinsol API (see Chapter 9) and the sunnonlinsol API (see Chapter 10), respectively. As such, idas has no knowledge of the method being used to solve the linear and nonlinear systems that arise in each time step. For any given user problem, there exists a single nonlinear solver interface and, if necessary, one of the linear system solver interfaces is specified, and invoked as needed during the integration. While sundials includes a fixed-point nonlinear solver module, it is not currently supported in idas (note that the fixed-point module is listed in Figure 3.1 but not in Figure 3.3). In addition, if forward sensitivity analysis is turned on, the main module will integrate the forward sensitivity equations simultaneously with the original IVP. The sensitivity variables may be included in the local error control mechanism of the main integrator.

[Figure 3.1: High-level diagram of the sundials suite, showing the solver packages (cvode, cvodes, arkode, ida, idas, kinsol); the nvector, sunmatrix, sunlinearsolver, and sunnonlinearsolver APIs; and the corresponding vector, matrix, linear solver, and nonlinear solver modules.]

idas provides two different strategies for dealing with the correction stage for the sensitivity variables, IDA_SIMULTANEOUS and IDA_STAGGERED (see §2.5). The idas package includes an algorithm for the approximation of the sensitivity equation residuals by difference quotients, but the user has the option of supplying these residual functions directly.
The adjoint sensitivity module (file idaa.c) provides the infrastructure needed for the backward integration of any system of DAEs which depends on the solution of the original IVP, in particular the adjoint system and any quadratures required in evaluating the gradient of the objective functional. This module deals with the setup of the checkpoints, the interpolation of the forward solution during the backward integration, and the backward integration of the adjoint equations. idas now has a single unified linear solver interface, idals, supporting both direct and iterative linear solvers built using the generic sunlinsol API (see Chapter 9). These solvers may utilize a sunmatrix object (see Chapter 8) for storing Jacobian information, or they may be matrix-free. Since idas can operate on any valid sunlinsol implementation, the set of linear solver modules available to idas will expand as new sunlinsol modules are developed. For users employing dense or banded Jacobian matrices, idals includes algorithms for their approximation through difference quotients, but the user also has the option of supplying the Jacobian (or an approximation to it) directly. This user-supplied routine is required when using sparse or user-supplied Jacobian matrices. For users employing matrix-free iterative linear solvers, idals includes an algorithm for the approximation by difference quotients of the product between the Jacobian matrix and a vector, Jv. Again, the user has the option of providing routines for this operation, in two phases: setup (preprocessing of Jacobian data) and multiplication. For preconditioned iterative methods, the preconditioning must be supplied by the user, again in two phases: setup and solve. 
While there is no default choice of preconditioner analogous to the difference-quotient approximation in the direct case, the references [6, 10], together with the example and demonstration programs included with idas, offer considerable assistance in building preconditioners.

[Figure 3.2: Organization of the sundials suite: (a) directory structure of the sundials source tree (include, src, examples, doc, config, test, with subdirectories for each solver package and for the nvector, sunmatrix, sunlinsol, and sunnonlinsol modules); (b) directory structure of the sundials examples.]

[Figure 3.3: Overall structure diagram of the idas package. Modules specific to idas begin with "IDA" (idals, idabbdpre, and idanls); all other items correspond to generic solver and auxiliary modules. Note also that the LAPACK, klu, and superlumt support is through interfaces to external packages; users will need to download and compile those packages independently.]

idas' linear solver interface consists of four primary routines, devoted to (1) memory allocation and initialization, (2) setup of the matrix data involved, (3) solution of the system, and (4) freeing of memory. The setup and solution phases are separate because the evaluation of Jacobians and preconditioners is done only periodically during the integration, as required to achieve convergence. The call list within the central idas module to each of the four associated functions is fixed, thus allowing the central module to be completely independent of the linear system method.

idas also provides a preconditioner module, idabbdpre, for use with any of the Krylov iterative linear solvers. It works in conjunction with nvector parallel and generates a preconditioner that is a block-diagonal matrix with each block being a banded matrix.

All state information used by idas to solve a given problem is saved in a structure, and a pointer to that structure is returned to the user. There is no global data in the idas package, and so, in this respect, it is reentrant. State information specific to the linear solver is saved in a separate structure, a pointer to which resides in the idas memory structure. The reentrancy of idas was motivated by the situation where two or more problems are solved by intermixed calls to the package from one user program.

Chapter 4 Using IDAS for IVP Solution

This chapter is concerned with the use of idas for the integration of DAEs in a C language setting.
The following sections treat the header files, the layout of the user's main program, description of the idas user-callable functions, and description of user-supplied functions. This usage is essentially equivalent to using ida [30]. The sample programs described in the companion document [43] may also be helpful. Those codes may be used as templates (with the removal of some lines involved in testing) and are included in the idas package.

The user should be aware that not all sunlinsol and sunmatrix modules are compatible with all nvector implementations. Details on compatibility are given in the documentation for each sunmatrix module (Chapter 8) and each sunlinsol module (Chapter 9). For example, nvector parallel is not compatible with the dense, banded, or sparse sunmatrix types, or with the corresponding dense, banded, or sparse sunlinsol modules. Please check Chapters 8 and 9 to verify compatibility between these modules. In addition to that documentation, we note that the preconditioner module idabbdpre can only be used with nvector parallel. It is not recommended to use a threaded vector module with SuperLU MT unless it is the nvector openmp module, and SuperLU MT is also compiled with OpenMP.

idas uses various constants for both input and output. These are defined as needed in this chapter, but for convenience are also listed separately in Appendix B.

4.1 Access to library and header files

At this point, it is assumed that the installation of idas, following the procedure described in Appendix A, has been completed successfully. Regardless of where the user's application program resides, its associated compilation and load commands must make reference to the appropriate locations for the library and header files required by idas. The relevant library files are

• libdir/libsundials_idas.lib,
• libdir/libsundials_nvec*.lib,

where the file extension .lib is typically .so for shared libraries and .a for static libraries.
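For example, assuming a default installation under a prefix instdir and use of the serial nvector module, the compile and link commands might look as follows (the prefix shown is illustrative; substitute the actual install location and vector library for your configuration):

```
INSTDIR=/usr/local/sundials   # hypothetical install prefix
gcc -c myprog.c -I${INSTDIR}/include
gcc -o myprog myprog.o -L${INSTDIR}/lib \
    -lsundials_idas -lsundials_nvecserial -lm
```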
The relevant header files are located in the subdirectories

• incdir/include/idas
• incdir/include/sundials
• incdir/include/nvector
• incdir/include/sunmatrix
• incdir/include/sunlinsol
• incdir/include/sunnonlinsol

The directories libdir and incdir are the install library and include directories, respectively. For a default installation, these are instdir/lib and instdir/include, respectively, where instdir is the directory where sundials was installed (see Appendix A).

Note that an application cannot link to both the ida and idas libraries because both contain user-callable functions with the same names (to ensure that idas is backward compatible with ida). Therefore, applications that contain both DAE problems and DAEs with sensitivity analysis should use idas.

4.2 Data types

The sundials_types.h file contains the definition of the type realtype, which is used by the sundials solvers for all floating-point data, the definition of the integer type sunindextype, which is used for vector and matrix indices, and booleantype, which is used for certain logic operations within sundials.

4.2.1 Floating point types

The type realtype can be float, double, or long double, with the default being double. The user can change the precision of the sundials solvers' arithmetic at the configuration stage (see §A.1.2). Additionally, based on the current precision, sundials_types.h defines BIG_REAL to be the largest value representable as a realtype, SMALL_REAL to be the smallest value representable as a realtype, and UNIT_ROUNDOFF to be the difference between 1.0 and the minimum realtype greater than 1.0.

Within sundials, real constants are set by way of a macro called RCONST. It is this macro that needs the ability to branch on the definition of realtype. In ANSI C, a floating-point constant with no suffix is stored as a double.
Placing the suffix "F" at the end of a floating-point constant makes it a float, whereas using the suffix "L" makes it a long double. For example,

#define A 1.0
#define B 1.0F
#define C 1.0L

defines A to be a double constant equal to 1.0, B to be a float constant equal to 1.0, and C to be a long double constant equal to 1.0. The macro call RCONST(1.0) automatically expands to 1.0 if realtype is double, to 1.0F if realtype is float, or to 1.0L if realtype is long double. sundials uses the RCONST macro internally to declare all of its floating-point constants.

A user program which uses the type realtype and the RCONST macro to handle floating-point constants is precision-independent, except for any calls to precision-specific standard math library functions. (Our example programs use both realtype and RCONST.) Users can, however, use the type double, float, or long double in their code (assuming that this usage is consistent with the typedef for realtype). Thus, a previously existing piece of ANSI C code can use sundials without modifying the code to use realtype, so long as the sundials libraries use the correct precision (for details see §A.1.2).

4.2.2 Integer types used for vector and matrix indices

The type sunindextype can be either a 32- or 64-bit signed integer. The default is the portable int64_t type, and the user can change it to int32_t at the configuration stage. The configuration system will detect if the compiler does not support portable types, and will replace int32_t and int64_t with int and long int, respectively, to ensure use of the desired sizes on Linux, Mac OS X, and Windows platforms. sundials currently does not support unsigned integer types for vector and matrix indices, although these could be added in the future if there is sufficient demand.

A user program which uses sunindextype to handle vector and matrix indices will work with both index storage types, except for any calls to index storage-specific external libraries.
(Our C and C++ example programs use sunindextype.) Users can, however, use any one of int, long int, int32_t, int64_t, or long long int in their code, assuming that this usage is consistent with the typedef for sunindextype on their architecture. Thus, a previously existing piece of ANSI C code can use sundials without modifying the code to use sunindextype, so long as the sundials libraries use the appropriate index storage type (for details see §A.1.2).

4.3 Header files

The calling program must include several header files so that various macros and data types can be used. The header file that is always required is:

• idas/idas.h, the header file for idas, which defines several types and various constants, and includes function prototypes. This includes the header file for idals, idas/idas_ls.h.

Note that idas.h includes sundials_types.h, which defines the types realtype, sunindextype, and booleantype and the constants SUNFALSE and SUNTRUE.

The calling program must also include an nvector implementation header file, of the form nvector/nvector_***.h. See Chapter 7 for the appropriate name. This file in turn includes the header file sundials_nvector.h, which defines the abstract N_Vector data type.

If using a non-default nonlinear solver module, or when interacting with a sunnonlinsol module directly, the calling program must also include a sunnonlinsol implementation header file, of the form sunnonlinsol/sunnonlinsol_***.h, where *** is the name of the nonlinear solver module (see Chapter 10 for more information). This file in turn includes the header file sundials_nonlinearsolver.h, which defines the abstract SUNNonlinearSolver data type.

If using a nonlinear solver that requires the solution of a linear system of the form (2.5) (e.g., the default Newton iteration), a linear solver module header file is also required.
The header files corresponding to the various sundials-provided linear solver modules available for use with idas are:

• Direct linear solvers:
  – sunlinsol/sunlinsol_dense.h, which is used with the dense linear solver module, sunlinsol dense;
  – sunlinsol/sunlinsol_band.h, which is used with the banded linear solver module, sunlinsol band;
  – sunlinsol/sunlinsol_lapackdense.h, which is used with the LAPACK dense linear solver module, sunlinsol lapackdense;
  – sunlinsol/sunlinsol_lapackband.h, which is used with the LAPACK banded linear solver module, sunlinsol lapackband;
  – sunlinsol/sunlinsol_klu.h, which is used with the klu sparse linear solver module, sunlinsol klu;
  – sunlinsol/sunlinsol_superlumt.h, which is used with the superlumt sparse linear solver module, sunlinsol superlumt;

• Iterative linear solvers:
  – sunlinsol/sunlinsol_spgmr.h, which is used with the scaled, preconditioned GMRES Krylov linear solver module, sunlinsol spgmr;
  – sunlinsol/sunlinsol_spfgmr.h, which is used with the scaled, preconditioned FGMRES Krylov linear solver module, sunlinsol spfgmr;
  – sunlinsol/sunlinsol_spbcgs.h, which is used with the scaled, preconditioned Bi-CGStab Krylov linear solver module, sunlinsol spbcgs;
  – sunlinsol/sunlinsol_sptfqmr.h, which is used with the scaled, preconditioned TFQMR Krylov linear solver module, sunlinsol sptfqmr;
  – sunlinsol/sunlinsol_pcg.h, which is used with the scaled, preconditioned CG Krylov linear solver module, sunlinsol pcg.

The header files for the sunlinsol dense and sunlinsol lapackdense linear solver modules include the file sunmatrix/sunmatrix_dense.h, which defines the sunmatrix dense matrix module, as well as various functions and macros acting on such matrices.
The header files for the sunlinsol band and sunlinsol lapackband linear solver modules include the file sunmatrix/sunmatrix_band.h, which defines the sunmatrix band matrix module, as well as various functions and macros acting on such matrices.

The header files for the sunlinsol klu and sunlinsol superlumt sparse linear solvers include the file sunmatrix/sunmatrix_sparse.h, which defines the sunmatrix sparse matrix module, as well as various functions and macros acting on such matrices.

The header files for the Krylov iterative solvers include the file sundials/sundials_iterative.h, which enumerates the kind of preconditioning and (for the spgmr and spfgmr solvers) the choices for the Gram-Schmidt process.

Other headers may be needed, according to the choice of preconditioner, etc. For example, in the idasFoodWeb_kry_p example (see [43]), preconditioning is done with a block-diagonal matrix. For this, even though the sunlinsol spgmr linear solver is used, the header sundials/sundials_dense.h is included for access to the underlying generic dense matrix arithmetic routines.

4.4 A skeleton of the user's main program

The following is a skeleton of the user's main program (or calling program) for the integration of a DAE IVP. Most of the steps are independent of the nvector, sunmatrix, sunlinsol, and sunnonlinsol implementations used. For the steps that are not, refer to Chapters 7, 8, 9, and 10 for the specific name of the function to be called or macro to be referenced.

1. Initialize parallel or multi-threaded environment, if appropriate

For example, call MPI_Init to initialize MPI if used, or set num_threads, the number of threads to use within the threaded vector functions, if used.

2. Set problem dimensions etc.

This generally includes the problem size N, and may include the local vector length Nlocal. Note: the variables N and Nlocal should be of type sunindextype.

3.
Set vectors of initial values
To set the vectors y0 and yp0 to initial values for y and ẏ, use the appropriate functions defined by the particular nvector implementation. For native sundials vector implementations (except the cuda- and raja-based ones), use a call of the form y0 = N_VMake_***(..., ydata) if the realtype array ydata containing the initial values of y already exists. Otherwise, create a new vector by making a call of the form y0 = N_VNew_***(...), and then set its elements by accessing the underlying data with a call of the form ydata = N_VGetArrayPointer(y0). See §7.2-7.5 for details.
For the hypre and petsc vector wrappers, first create and initialize the underlying vector, and then create an nvector wrapper with a call of the form y0 = N_VMake_***(yvec), where yvec is a hypre or petsc vector. Note that calls like N_VNew_***(...) and N_VGetArrayPointer(...) are not available for these vector wrappers. See §7.6 and §7.7 for details.
If using either the cuda- or raja-based vector implementations, use a call of the form y0 = N_VMake_***(..., c), where c is a pointer to a suncudavec or sunrajavec vector class, if this class already exists. Otherwise, create a new vector by making a call of the form y0 = N_VNew_***(...), and then set its elements by accessing the underlying data where it is located with a call of the form N_VGetDeviceArrayPointer_*** or N_VGetHostArrayPointer_***. Note that the vector class will allocate memory on both the host and device when instantiated. See §7.8-7.9 for details.
Set the vector yp0 of initial conditions for ẏ similarly.

4. Create idas object
Call ida_mem = IDACreate() to create the idas memory block. IDACreate returns a pointer to the idas memory structure. See §4.5.1 for details. This void * pointer must then be passed as the first argument to all subsequent idas function calls.

5. Initialize idas solver
Call IDAInit(...)
to provide required problem specifications (residual function, initial time, and initial conditions), allocate internal memory for idas, and initialize idas. IDAInit returns an error flag to indicate success or an illegal argument value. See §4.5.1 for details.

6. Specify integration tolerances
Call IDASStolerances(...) or IDASVtolerances(...) to specify, respectively, a scalar relative tolerance and scalar absolute tolerance, or a scalar relative tolerance and a vector of absolute tolerances. Alternatively, call IDAWFtolerances to specify a function which sets directly the weights used in evaluating WRMS vector norms. See §4.5.2 for details.

7. Create matrix object
If a nonlinear solver requiring a linear solver will be used (e.g., the default Newton iteration) and the linear solver will be a matrix-based linear solver, then a template Jacobian matrix must be created by using the appropriate constructor function defined by the particular sunmatrix implementation. For the sundials-supplied sunmatrix implementations, the matrix object may be created using a call of the form
SUNMatrix J = SUNBandMatrix(...);
or
SUNMatrix J = SUNDenseMatrix(...);
or
SUNMatrix J = SUNSparseMatrix(...);
NOTE: The dense, banded, and sparse matrix objects are usable only in a serial or threaded environment.

8. Create linear solver object
If a nonlinear solver requiring a linear solver is chosen (e.g., the default Newton iteration), then the desired linear solver object must be created by calling the appropriate constructor function defined by the particular sunlinsol implementation. For any of the sundials-supplied sunlinsol implementations, the linear solver object may be created using a call of the form
SUNLinearSolver LS = SUNLinSol_*(...);
where * can be replaced with "Dense", "SPGMR", or other options, as discussed in §4.5.3 and Chapter 9.

9.
Set linear solver optional inputs
Call *Set* functions from the selected linear solver module to change optional inputs specific to that linear solver. See the documentation for each sunlinsol module in Chapter 9 for details.

10. Attach linear solver module
If a nonlinear solver requiring a linear solver is chosen (e.g., the default Newton iteration), then initialize the idals linear solver interface by attaching the linear solver object (and matrix object, if applicable) with the following call (for details see §4.5.3):
ier = IDASetLinearSolver(...);

11. Set optional inputs
Optionally, call IDASet* functions to change from their default values any optional inputs that control the behavior of idas. See §4.5.8.1 and §4.5.8 for details.

12. Create nonlinear solver object (optional)
If using a non-default nonlinear solver (see §4.5.4), then create the desired nonlinear solver object by calling the appropriate constructor function defined by the particular sunnonlinsol implementation, e.g.,
NLS = SUNNonlinSol_***(...);
where *** is the name of the nonlinear solver (see Chapter 10 for details).

13. Attach nonlinear solver module (optional)
If using a non-default nonlinear solver, then initialize the nonlinear solver interface by attaching the nonlinear solver object by calling
ier = IDASetNonlinearSolver(ida_mem, NLS);
(see §4.5.4 for details).

14. Set nonlinear solver optional inputs (optional)
Call the appropriate set functions for the selected nonlinear solver module to change optional inputs specific to that nonlinear solver. These must be called after IDAInit if using the default nonlinear solver, or after attaching a new nonlinear solver to idas; otherwise the optional inputs will be overridden by idas defaults. See Chapter 10 for more information on optional inputs.

15. Correct initial values
Optionally, call IDACalcIC to correct the initial values y0 and yp0 passed to IDAInit. See §4.5.5.
Also see §4.5.8.3 for relevant optional input calls.

16. Specify rootfinding problem
Optionally, call IDARootInit to initialize a rootfinding problem to be solved during the integration of the DAE system. See §4.5.6 for details, and see §4.5.8.4 for relevant optional input calls.

17. Advance solution in time
For each point at which output is desired, call
flag = IDASolve(ida_mem, tout, &tret, yret, ypret, itask);
Here itask specifies the return mode. The vector yret (which can be the same as the vector y0 above) will contain y(t), while the vector ypret (which can be the same as the vector yp0 above) will contain ẏ(t). See §4.5.7 for details.

18. Get optional outputs
Call IDA*Get* functions to obtain optional output. See §4.5.10 for details.

19. Deallocate memory for solution vectors
Upon completion of the integration, deallocate memory for the vectors yret and ypret (or y and yp) by calling the appropriate destructor function defined by the nvector implementation:
N_VDestroy(yret);
and similarly for ypret.

20. Free solver memory
Call IDAFree(&ida_mem) to free the memory allocated for idas.

21. Free nonlinear solver memory (optional)
If a non-default nonlinear solver was used, then call SUNNonlinSolFree(NLS) to free any memory allocated for the sunnonlinsol object.

22. Free linear solver and matrix memory
Call SUNLinSolFree and SUNMatDestroy to free any memory allocated for the linear solver and matrix objects created above.

23. Finalize MPI, if used
Call MPI_Finalize() to terminate MPI.

sundials provides some linear solvers only as a means for users to get problems running and not as highly efficient solvers. For example, if solving a dense system, we suggest using the LAPACK solvers if the size of the linear system is > 50,000. (Thanks to A. Nicolai for his testing and recommendation.) Table 4.1 shows the linear solver interfaces available as sunlinsol modules and the vector implementations required for use.
As an example, one cannot use the dense direct solver interfaces with the MPI-based vector implementation. However, as discussed in Chapter 9, the sundials packages operate on generic sunlinsol objects, allowing a user to develop their own solvers should they so desire.

Table 4.1: sundials linear solver interfaces and vector implementations that can be used for each.

Linear Solver | Serial | Parallel (MPI) | OpenMP | pThreads | hypre | petsc | cuda | raja | User Supp.
Dense         |   X    |                |   X    |    X     |       |       |      |      |     X
Band          |   X    |                |   X    |    X     |       |       |      |      |     X
LapackDense   |   X    |                |   X    |    X     |       |       |      |      |     X
LapackBand    |   X    |                |   X    |    X     |       |       |      |      |     X
klu           |   X    |                |   X    |    X     |       |       |      |      |     X
superlumt     |   X    |                |   X    |    X     |       |       |      |      |     X
spgmr         |   X    |       X        |   X    |    X     |   X   |   X   |  X   |  X   |     X
spfgmr        |   X    |       X        |   X    |    X     |   X   |   X   |  X   |  X   |     X
spbcgs        |   X    |       X        |   X    |    X     |   X   |   X   |  X   |  X   |     X
sptfqmr       |   X    |       X        |   X    |    X     |   X   |   X   |  X   |  X   |     X
pcg           |   X    |       X        |   X    |    X     |   X   |   X   |  X   |  X   |     X
User Supp.    |   X    |       X        |   X    |    X     |   X   |   X   |  X   |  X   |     X

4.5 User-callable functions

This section describes the idas functions that are called by the user to set up and solve a DAE. Some of these are required. However, starting with §4.5.8, the functions listed involve optional inputs/outputs or restarting, and those paragraphs can be skipped for a casual use of idas. In any case, refer to §4.4 for the correct order of these calls.

On an error, each user-callable function returns a negative value and sends an error message to the error handler routine, which prints the message on stderr by default. However, the user can set a file as error output or can provide his own error handler function (see §4.5.8.1).

4.5.1 IDAS initialization and deallocation functions

The following three functions must be called in the order listed. The last one is to be called only after the DAE solution is complete, as it frees the idas memory block created and allocated by the first two calls.

IDACreate

Call: ida_mem = IDACreate();
Description: The function IDACreate instantiates an idas solver object.
Arguments: IDACreate has no arguments.
Return value: If successful, IDACreate returns a pointer to the newly created idas memory block (of type void *). Otherwise it returns NULL.
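The main-program skeleton of §4.4, instantiated for a simple serial problem with the dense linear solver and the default Newton nonlinear solver, might look like the following sketch. This is a hedged outline only, not a complete program: the residual function resfn, the values of NEQ, NOUT, the tolerances, and the output times are placeholders, and all error checking is omitted.

```c
#include <idas/idas.h>                  /* IDACreate, IDAInit, IDASolve        */
#include <nvector/nvector_serial.h>     /* serial N_Vector and N_VNew_Serial   */
#include <sunmatrix/sunmatrix_dense.h>  /* SUNDenseMatrix                      */
#include <sunlinsol/sunlinsol_dense.h>  /* SUNLinSol_Dense                     */

#define NEQ  3    /* placeholder problem size      */
#define NOUT 10   /* placeholder number of outputs */

/* user-supplied residual (step 5); body is problem-specific */
static int resfn(realtype t, N_Vector yy, N_Vector yp,
                 N_Vector rr, void *user_data);

int main(void)
{
  realtype t0 = RCONST(0.0), tout = RCONST(0.4), tret;

  /* steps 2-3: problem size and initial values */
  N_Vector yy = N_VNew_Serial(NEQ);
  N_Vector yp = N_VNew_Serial(NEQ);
  /* ... fill N_VGetArrayPointer(yy) and N_VGetArrayPointer(yp) ... */

  /* steps 4-6: create and initialize the solver, set tolerances */
  void *ida_mem = IDACreate();
  IDAInit(ida_mem, resfn, t0, yy, yp);
  IDASStolerances(ida_mem, RCONST(1.0e-4), RCONST(1.0e-8));

  /* steps 7, 8, 10: template Jacobian, dense solver, attach both */
  SUNMatrix A = SUNDenseMatrix(NEQ, NEQ);
  SUNLinearSolver LS = SUNLinSol_Dense(yy, A);
  IDASetLinearSolver(ida_mem, LS, A);

  /* step 17: advance the solution to each output time */
  for (int iout = 0; iout < NOUT; iout++, tout *= RCONST(2.0))
    IDASolve(ida_mem, tout, &tret, yy, yp, IDA_NORMAL);

  /* steps 19-22: free vectors, solver memory, linear solver, matrix */
  N_VDestroy(yy);
  N_VDestroy(yp);
  IDAFree(&ida_mem);
  SUNLinSolFree(LS);
  SUNMatDestroy(A);
  return 0;
}
```

Building such a program requires linking against the sundials libraries; the idas example programs in [43] show complete, checked versions of this pattern.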
IDAInit

Call: flag = IDAInit(ida_mem, res, t0, y0, yp0);
Description: The function IDAInit provides required problem and solution specifications, allocates internal memory, and initializes idas.
Arguments:
ida_mem (void *) pointer to the idas memory block returned by IDACreate.
res (IDAResFn) is the C function which computes the residual function F in the DAE. This function has the form res(t, yy, yp, resval, user_data). For full details see §4.6.1.
t0 (realtype) is the initial value of t.
y0 (N_Vector) is the initial value of y.
yp0 (N_Vector) is the initial value of ẏ.
Return value: The return value flag (of type int) will be one of the following:
IDA_SUCCESS    The call to IDAInit was successful.
IDA_MEM_NULL   The idas memory block was not initialized through a previous call to IDACreate.
IDA_MEM_FAIL   A memory allocation request has failed.
IDA_ILL_INPUT  An input argument to IDAInit has an illegal value.
Notes: If an error occurred, IDAInit also sends an error message to the error handler function.

IDAFree

Call: IDAFree(&ida_mem);
Description: The function IDAFree frees the pointer allocated by a previous call to IDACreate.
Arguments: The argument is the pointer to the idas memory block (of type void *).
Return value: The function IDAFree has no return value.

4.5.2 IDAS tolerance specification functions

One of the following three functions must be called to specify the integration tolerances (or directly specify the weights used in evaluating WRMS vector norms). Note that this call must be made after the call to IDAInit.

IDASStolerances

Call: flag = IDASStolerances(ida_mem, reltol, abstol);
Description: The function IDASStolerances specifies scalar relative and absolute tolerances.
Arguments:
ida_mem (void *) pointer to the idas memory block returned by IDACreate.
reltol (realtype) is the scalar relative error tolerance.
abstol (realtype) is the scalar absolute error tolerance.
Return value: The return value flag (of type int) will be one of the following:
IDA_SUCCESS    The call to IDASStolerances was successful.
IDA_MEM_NULL   The idas memory block was not initialized through a previous call to IDACreate.
IDA_NO_MALLOC  The allocation function IDAInit has not been called.
IDA_ILL_INPUT  One of the input tolerances was negative.

IDASVtolerances

Call: flag = IDASVtolerances(ida_mem, reltol, abstol);
Description: The function IDASVtolerances specifies a scalar relative tolerance and vector absolute tolerances.
Arguments:
ida_mem (void *) pointer to the idas memory block returned by IDACreate.
reltol (realtype) is the scalar relative error tolerance.
abstol (N_Vector) is the vector of absolute error tolerances.
Return value: The return value flag (of type int) will be one of the following:
IDA_SUCCESS    The call to IDASVtolerances was successful.
IDA_MEM_NULL   The idas memory block was not initialized through a previous call to IDACreate.
IDA_NO_MALLOC  The allocation function IDAInit has not been called.
IDA_ILL_INPUT  The relative error tolerance was negative, or the absolute tolerance had a negative component.
Notes: This choice of tolerances is important when the absolute error tolerance needs to be different for each component of the state vector y.

IDAWFtolerances

Call: flag = IDAWFtolerances(ida_mem, efun);
Description: The function IDAWFtolerances specifies a user-supplied function efun that sets the multiplicative error weights Wi for use in the weighted RMS norm, which are normally defined by Eq. (2.7).
Arguments:
ida_mem (void *) pointer to the idas memory block returned by IDACreate.
efun (IDAEwtFn) is the C function which defines the ewt vector (see §4.6.3).
Return value: The return value flag (of type int) will be one of the following:
IDA_SUCCESS    The call to IDAWFtolerances was successful.
IDA_MEM_NULL   The idas memory block was not initialized through a previous call to IDACreate.
IDA_NO_MALLOC  The allocation function IDAInit has not been called.
General advice on choice of tolerances. For many users, the appropriate choices for tolerance values in reltol and abstol are a concern. The following pieces of advice are relevant.

(1) The scalar relative tolerance reltol is to be set to control relative errors. So reltol = 10^-4 means that errors are controlled to .01%. We do not recommend using reltol larger than 10^-3. On the other hand, reltol should not be so small that it is comparable to the unit roundoff of the machine arithmetic (generally around 10^-15).

(2) The absolute tolerances abstol (whether scalar or vector) need to be set to control absolute errors when any components of the solution vector y may be so small that pure relative error control is meaningless. For example, if y[i] starts at some nonzero value, but in time decays to zero, then pure relative error control on y[i] makes no sense (and is overly costly) after y[i] is below some noise level. Then abstol (if scalar) or abstol[i] (if a vector) needs to be set to that noise level. If the different components have different noise levels, then abstol should be a vector. See the example idasRoberts_dns in the idas package, and the discussion of it in the idas Examples document [43]. In that problem, the three components vary between 0 and 1, and have different noise levels; hence the abstol vector. It is impossible to give any general advice on abstol values, because the appropriate noise levels are completely problem-dependent. The user or modeler hopefully has some idea as to what those noise levels are.

(3) Finally, it is important to pick all the tolerance values conservatively, because they control the error committed on each individual time step. The final (global) errors are a sort of accumulation of those per-step errors. A good rule of thumb is to reduce the tolerances by a factor of .01 from the actual desired limits on errors.
So if you want .01% accuracy (globally), a good choice is reltol = 10^-6. But in any case, it is a good idea to do a few experiments with the tolerances to see how the computed solution values vary as tolerances are reduced.

Advice on controlling unphysical negative values. In many applications, some components in the true solution are always positive or non-negative, though at times very small. In the numerical solution, however, small negative (hence unphysical) values can then occur. In most cases, these values are harmless, and simply need to be controlled, not eliminated. The following pieces of advice are relevant.

(1) The way to control the size of unwanted negative computed values is with tighter absolute tolerances. Again this requires some knowledge of the noise level of these components, which may or may not be different for different components. Some experimentation may be needed.

(2) If output plots or tables are being generated, and it is important to avoid having negative numbers appear there (for the sake of avoiding a long explanation of them, if nothing else), then eliminate them, but only in the context of the output medium. Then the internal values carried by the solver are unaffected. Remember that a small negative value in yret returned by idas, with magnitude comparable to abstol or less, is equivalent to zero as far as the computation is concerned.

(3) The user's residual routine res should never change a negative value in the solution vector yy to a non-negative value as a "solution" to this problem. This can cause instability. If the res routine cannot tolerate a zero or negative value (e.g., because there is a square root or log of it), then the offending value should be changed to zero or a tiny positive number in a temporary variable (not in the input yy vector) for the purposes of computing F(t, y, ẏ).

(4) idas provides the option of enforcing positivity or non-negativity on components.
Also, such constraints can be enforced by use of the recoverable error return feature in the user-supplied residual function. However, because these options involve some extra overhead cost, they should only be exercised if the use of absolute tolerances to control the computed values is unsuccessful.

4.5.3 Linear solver interface functions

As previously explained, if the nonlinear solver requires the solution of linear systems of the form (2.5) (e.g., the default Newton iteration), then the solution of these linear systems is handled by the idals linear solver interface. This interface supports all valid sunlinsol modules. Here, matrix-based sunlinsol modules utilize sunmatrix objects to store the Jacobian matrix J = ∂F/∂y + α∂F/∂ẏ and factorizations used throughout the solution process. Conversely, matrix-free sunlinsol modules instead use iterative methods to solve the linear systems of equations, and only require the action of the Jacobian on a vector, Jv.

With most iterative linear solvers, preconditioning can be done on the left only, on the right only, on both the left and the right, or not at all. The exceptions to this rule are spfgmr, which supports right preconditioning only, and pcg, which performs symmetric preconditioning. However, in idas only left preconditioning is supported. For the specification of a preconditioner, see the iterative linear solver sections in §4.5.8 and §4.6. A preconditioner matrix P must approximate the Jacobian J, at least crudely.

To specify a generic linear solver to idas, after the call to IDACreate but before any calls to IDASolve, the user's program must create the appropriate sunlinsol object and call the function IDASetLinearSolver, as documented below.
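For orientation, the create-and-attach sequence documented in this section might be sketched as follows. This is a hedged fragment (not a complete program): yy is assumed to be an existing template solution vector, NEQ the problem size, and ida_mem the idas memory pointer.

```c
/* Matrix-based option: dense template Jacobian plus dense solver */
SUNMatrix A = SUNDenseMatrix(NEQ, NEQ);
SUNLinearSolver LS = SUNLinSol_Dense(yy, A);
flag = IDASetLinearSolver(ida_mem, LS, A);

/* Matrix-free alternative: SPGMR with left preconditioning and the
   default Krylov subspace dimension (argument 0); no SUNMatrix is
   needed, so the third argument is NULL */
SUNLinearSolver LSK = SUNLinSol_SPGMR(yy, PREC_LEFT, 0);
flag = IDASetLinearSolver(ida_mem, LSK, NULL);
```

A program would use one of the two options, not both; the constant PREC_LEFT comes from sundials/sundials_iterative.h.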
To create the SUNLinearSolver object, the user may call one of the sundials-packaged sunlinsol module constructor routines via a call of the form
SUNLinearSolver LS = SUNLinSol_*(...);
The current list of such constructor routines includes SUNLinSol_Dense, SUNLinSol_Band, SUNLinSol_LapackDense, SUNLinSol_LapackBand, SUNLinSol_KLU, SUNLinSol_SuperLUMT, SUNLinSol_SPGMR, SUNLinSol_SPFGMR, SUNLinSol_SPBCGS, SUNLinSol_SPTFQMR, and SUNLinSol_PCG.

Alternately, a user-supplied SUNLinearSolver module may be created and used instead. The use of each of the generic linear solvers involves certain constants, functions, and possibly some macros that are likely to be needed in the user code. These are available in the corresponding header file associated with the specific sunmatrix or sunlinsol module in question, as described in Chapters 8 and 9.

Once this solver object has been constructed, the user should attach it to idas via a call to IDASetLinearSolver. The first argument passed to this function is the idas memory pointer returned by IDACreate; the second argument is the desired sunlinsol object to use for solving systems. The third argument is an optional sunmatrix object to accompany matrix-based sunlinsol inputs (for matrix-free linear solvers, the third argument should be NULL). A call to this function initializes the idals linear solver interface, linking it to the main idas integrator, and allows the user to specify additional parameters and routines pertinent to their choice of linear solver.

IDASetLinearSolver

Call: flag = IDASetLinearSolver(ida_mem, LS, J);
Description: The function IDASetLinearSolver attaches a generic sunlinsol object LS and corresponding template Jacobian sunmatrix object J (if applicable) to idas, initializing the idals linear solver interface.
Arguments:
ida_mem (void *) pointer to the idas memory block.
LS (SUNLinearSolver) sunlinsol object to use for solving linear systems of the form (2.5).
J (SUNMatrix) sunmatrix object used as a template for the Jacobian (or NULL if not applicable).
Return value: The return value flag (of type int) is one of
IDALS_SUCCESS     The idals initialization was successful.
IDALS_MEM_NULL    The ida_mem pointer is NULL.
IDALS_ILL_INPUT   The idals interface is not compatible with the LS or J input objects, or is incompatible with the current nvector module.
IDALS_SUNLS_FAIL  A call to the LS object failed.
IDALS_MEM_FAIL    A memory allocation request failed.
Notes: If LS is a matrix-based linear solver, then the template Jacobian matrix J will be used in the solve process, so if additional storage is required within the sunmatrix object (e.g., for factorization of a banded matrix), ensure that the input object is allocated with sufficient size (see the documentation of the particular sunmatrix type in Chapter 8 for further information).
The previous routines IDADlsSetLinearSolver and IDASpilsSetLinearSolver are now wrappers for this routine and may still be used for backward compatibility. However, these will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

4.5.4 Nonlinear solver interface function

By default idas uses the sunnonlinsol implementation of Newton's method defined by the sunnonlinsol_newton module (see §10.2). To specify a different nonlinear solver in idas, the user's program must create a sunnonlinsol object by calling the appropriate constructor routine. The user must then attach the sunnonlinsol object to idas by calling IDASetNonlinearSolver, as documented below.

When changing the nonlinear solver in idas, IDASetNonlinearSolver must be called after IDAInit. If any calls to IDASolve have been made, then idas will need to be reinitialized by calling IDAReInit to ensure that the nonlinear solver is initialized correctly before any subsequent calls to IDASolve.
The first argument passed to the routine IDASetNonlinearSolver is the idas memory pointer returned by IDACreate, and the second argument is the sunnonlinsol object to use for solving the nonlinear system (2.4). A call to this function attaches the nonlinear solver to the main idas integrator. We note that at present, the sunnonlinsol object must be of type SUNNONLINEARSOLVER_ROOTFIND.

IDASetNonlinearSolver

Call: flag = IDASetNonlinearSolver(ida_mem, NLS);
Description: The function IDASetNonlinearSolver attaches a sunnonlinsol object (NLS) to idas.
Arguments:
ida_mem (void *) pointer to the idas memory block.
NLS (SUNNonlinearSolver) sunnonlinsol object to use for solving nonlinear systems.
Return value: The return value flag (of type int) is one of
IDA_SUCCESS    The nonlinear solver was successfully attached.
IDA_MEM_NULL   The ida_mem pointer is NULL.
IDA_ILL_INPUT  The sunnonlinsol object is NULL, does not implement the required nonlinear solver operations, is not of the correct type, or the residual function, convergence test function, or maximum number of nonlinear iterations could not be set.
Notes: When forward sensitivity analysis capabilities are enabled and the IDA_STAGGERED corrector method is used, this function sets the nonlinear solver method for correcting state variables (see §5.2.3 for more details).

4.5.5 Initial condition calculation function

IDACalcIC calculates corrected initial conditions for the DAE system for certain index-one problems, including a class of systems of semi-implicit form. (See §2.1 and Ref. [8].) It uses Newton iteration combined with a linesearch algorithm. Calling IDACalcIC is optional. It is only necessary when the initial conditions do not satisfy the given system. Thus if y0 and yp0 are known to satisfy F(t0, y0, ẏ0) = 0, then a call to IDACalcIC is generally not necessary.
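When correction is needed, the call sequence documented in this section might be sketched as follows. This is a hedged fragment (error checking omitted): ida_mem, yy, yp, flag, and tout1 are assumed to exist already.

```c
/* Mark each component as differential (1.0) or algebraic (0.0),
   then ask idas to compute consistent algebraic parts of y and
   differential parts of y' at t0. */
N_Vector id = N_VClone(yy);
/* ... set components of id: 1.0 for differential, 0.0 for algebraic ... */
IDASetId(ida_mem, id);
flag = IDACalcIC(ida_mem, IDA_YA_YDP_INIT, tout1);
if (flag == IDA_SUCCESS)
  IDAGetConsistentIC(ida_mem, yy, yp);  /* retrieve corrected y(t0), y'(t0) */
N_VDestroy(id);
```

With the IDA_Y_INIT option, the IDASetId call would not be required.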
A call to the function IDACalcIC must be preceded by successful calls to IDACreate and IDAInit (or IDAReInit), and by a successful call to the linear system solver specification function. The call to IDACalcIC should precede the call(s) to IDASolve for the given problem.

IDACalcIC

Call: flag = IDACalcIC(ida_mem, icopt, tout1);
Description: The function IDACalcIC corrects the initial values y0 and yp0 at time t0.
Arguments:
ida_mem (void *) pointer to the idas memory block.
icopt (int) is one of the following two options for the initial condition calculation.
  icopt=IDA_YA_YDP_INIT directs IDACalcIC to compute the algebraic components of y and differential components of ẏ, given the differential components of y. This option requires that the N_Vector id was set through IDASetId, specifying the differential and algebraic components.
  icopt=IDA_Y_INIT directs IDACalcIC to compute all components of y, given ẏ. In this case, id is not required.
tout1 (realtype) is the first value of t at which a solution will be requested (from IDASolve). This value is needed here only to determine the direction of integration and rough scale in the independent variable t.
Return value: The return value flag (of type int) will be one of the following:
IDA_SUCCESS         IDACalcIC succeeded.
IDA_MEM_NULL        The argument ida_mem was NULL.
IDA_NO_MALLOC       The allocation function IDAInit has not been called.
IDA_ILL_INPUT       One of the input arguments was illegal.
IDA_LSETUP_FAIL     The linear solver's setup function failed in an unrecoverable manner.
IDA_LINIT_FAIL      The linear solver's initialization function failed.
IDA_LSOLVE_FAIL     The linear solver's solve function failed in an unrecoverable manner.
IDA_BAD_EWT         Some component of the error weight vector is zero (illegal), either for the input value of y0 or a corrected value.
IDA_FIRST_RES_FAIL  The user's residual function returned a recoverable error flag on the first call, but IDACalcIC was unable to recover.
IDA_RES_FAIL         The user's residual function returned a nonrecoverable error flag.
IDA_NO_RECOVERY      The user's residual function, or the linear solver's setup or solve function, had a recoverable error, but IDACalcIC was unable to recover.
IDA_CONSTR_FAIL      IDACalcIC was unable to find a solution satisfying the inequality constraints.
IDA_LINESEARCH_FAIL  The linesearch algorithm failed to find a solution with a step larger than steptol in weighted RMS norm, and within the allowed number of backtracks.
IDA_CONV_FAIL        IDACalcIC failed to get convergence of the Newton iterations.
Notes: All failure return values are negative, and therefore a test flag < 0 will trap all IDACalcIC failures. Note that IDACalcIC will correct the values of y(t0) and ẏ(t0) which were specified in the previous call to IDAInit or IDAReInit. To obtain the corrected values, call IDAGetConsistentIC (see §4.5.10.3).

4.5.6 Rootfinding initialization function

While integrating the IVP, idas has the capability of finding the roots of a set of user-defined functions. To activate the rootfinding algorithm, call the following function. This is normally called only once, prior to the first call to IDASolve, but if the rootfinding problem is to be changed during the solution, IDARootInit can also be called prior to a continuation call to IDASolve.

IDARootInit

Call: flag = IDARootInit(ida_mem, nrtfn, g);
Description: The function IDARootInit specifies that the roots of a set of functions gi(t, y, ẏ) are to be found while the IVP is being solved.
Arguments:
ida_mem (void *) pointer to the idas memory block returned by IDACreate.
nrtfn (int) is the number of root functions gi.
g (IDARootFn) is the C function which defines the nrtfn functions gi(t, y, ẏ) whose roots are sought. See §4.6.4 for details.
Return value: The return value flag (of type int) is one of
IDA_SUCCESS    The call to IDARootInit was successful.
IDA_MEM_NULL   The ida_mem argument was NULL.
IDA_MEM_FAIL   A memory allocation failed.
IDA_ILL_INPUT  The function g is NULL, but nrtfn > 0.
Notes: If a new IVP is to be solved with a call to IDAReInit, where the new IVP has no rootfinding problem but the prior one did, then call IDARootInit with nrtfn = 0.

4.5.7 IDAS solver function

This is the central step in the solution process: the call to perform the integration of the DAE. One of the input arguments (itask) specifies one of two modes as to where idas is to return a solution. These modes are modified if the user has set a stop time (with IDASetStopTime) or requested rootfinding.

IDASolve

Call: flag = IDASolve(ida_mem, tout, &tret, yret, ypret, itask);
Description: The function IDASolve integrates the DAE over an interval in t.
Arguments:
ida_mem (void *) pointer to the idas memory block.
tout (realtype) the next time at which a computed solution is desired.
tret (realtype) the time reached by the solver (output).
yret (N_Vector) the computed solution vector y.
ypret (N_Vector) the computed solution vector ẏ.
itask (int) a flag indicating the job of the solver for the next user step. The IDA_NORMAL task is to have the solver take internal steps until it has reached or just passed the user-specified tout parameter. The solver then interpolates in order to return approximate values of y(tout) and ẏ(tout). The IDA_ONE_STEP option tells the solver to just take one internal step and return the solution at the point reached by that step.
Return value: IDASolve returns vectors yret and ypret and a corresponding independent variable value t = tret, such that (yret, ypret) are the computed values of (y(t), ẏ(t)). In IDA_NORMAL mode with no errors, tret will be equal to tout and yret = y(tout), ypret = ẏ(tout). The return value flag (of type int) will be one of the following:
IDA_SUCCESS       IDASolve succeeded.
IDA_TSTOP_RETURN  IDASolve succeeded by reaching the stop point specified through the optional input function IDASetStopTime.
IDA_ROOT_RETURN    IDASolve succeeded and found one or more roots. In this case, tret is the location of the root. If nrtfn > 1, call IDAGetRootInfo to see which gi were found to have a root. See §4.5.10.4 for more information.
IDA_MEM_NULL       The ida_mem argument was NULL.
IDA_ILL_INPUT      One of the inputs to IDASolve was illegal, or some other input to the solver was either illegal or missing. The latter category includes the following situations: (a) The tolerances have not been set. (b) A component of the error weight vector became zero during internal time-stepping. (c) The linear solver initialization function (called by the user after calling IDACreate) failed to set the linear solver-specific lsolve field in ida_mem. (d) A root of one of the root functions was found both at a point t and also very near t. In any case, the user should see the printed error message for details.
IDA_TOO_MUCH_WORK  The solver took mxstep internal steps but could not reach tout. The default value for mxstep is MXSTEP_DEFAULT = 500.
IDA_TOO_MUCH_ACC   The solver could not satisfy the accuracy demanded by the user for some internal step.
IDA_ERR_FAIL       Error test failures occurred too many times (MXNEF = 10) during one internal time step, or occurred with |h| = hmin.
IDA_CONV_FAIL      Convergence test failures occurred too many times (MXNCF = 10) during one internal time step, or occurred with |h| = hmin.
IDA_LINIT_FAIL     The linear solver's initialization function failed.
IDA_LSETUP_FAIL    The linear solver's setup function failed in an unrecoverable manner.
IDA_LSOLVE_FAIL    The linear solver's solve function failed in an unrecoverable manner.
IDA_CONSTR_FAIL    The inequality constraints were violated and the solver was unable to recover.
IDA_REP_RES_ERR    The user's residual function repeatedly returned a recoverable error flag, but the solver was unable to recover.
IDA_RES_FAIL       The user's residual function returned a nonrecoverable error flag.
              IDA_RTFUNC_FAIL   The rootfinding function failed.

Notes         The vector yret can occupy the same space as the vector y0 of initial conditions that was passed to IDAInit, and the vector ypret can occupy the same space as yp0.
              In the IDA_ONE_STEP mode, tout is used on the first call only, and only to get the direction and rough scale of the independent variable.
              All failure return values are negative, so a test flag < 0 will trap all IDASolve failures.
              On any error return in which one or more internal steps were taken by IDASolve, the returned values of tret, yret, and ypret correspond to the farthest point reached in the integration. On all other error returns, these values are left unchanged from the previous IDASolve return.

4.5.8 Optional input functions

There are numerous optional input parameters that control the behavior of the idas solver. idas provides functions that can be used to change these optional input parameters from their default values. Table 4.2 lists all optional input functions in idas, which are then described in detail in the remainder of this section. For the most casual use of idas, the reader can skip to §4.6.

We note that, on an error return, all these functions also send an error message to the error handler function. We also note that all error return values are negative, so a test flag < 0 will catch any error.

4.5.8.1 Main solver optional input functions

The calls listed here can be executed in any order. However, if the user's program calls either IDASetErrFile or IDASetErrHandlerFn, then that call should appear first, in order to take effect for any later error message.

Table 4.2: Optional inputs for idas and idals

Optional input                                  Function name             Default
IDAS main solver
  Pointer to an error file                      IDASetErrFile             stderr
  Error handler function                        IDASetErrHandlerFn        internal fn.
  User data                                     IDASetUserData            NULL
  Maximum order for BDF method                  IDASetMaxOrd              5
  Maximum no. of internal steps before tout     IDASetMaxNumSteps         500
  Initial step size                             IDASetInitStep            estimated
  Maximum absolute step size                    IDASetMaxStep             ∞
  Value of tstop                                IDASetStopTime            ∞
  Maximum no. of error test failures            IDASetMaxErrTestFails     10
  Maximum no. of nonlinear iterations           IDASetMaxNonlinIters      4
  Maximum no. of convergence failures           IDASetMaxConvFails        10
  Maximum no. of error test failures            IDASetMaxErrTestFails     7
  Coeff. in the nonlinear convergence test      IDASetNonlinConvCoef      0.33
  Suppress alg. vars. from error test           IDASetSuppressAlg         SUNFALSE
  Variable types (differential/algebraic)       IDASetId                  NULL
  Inequality constraints on solution            IDASetConstraints         NULL
  Direction of zero-crossing                    IDASetRootDirection       both
  Disable rootfinding warnings                  IDASetNoInactiveRootWarn  none
IDAS initial conditions calculation
  Coeff. in the nonlinear convergence test      IDASetNonlinConvCoefIC    0.0033
  Maximum no. of steps                          IDASetMaxNumStepsIC       5
  Maximum no. of Jacobian/precond. evals.       IDASetMaxNumJacsIC        4
  Maximum no. of Newton iterations              IDASetMaxNumItersIC       10
  Max. linesearch backtracks per Newton iter.   IDASetMaxBacksIC          100
  Turn off linesearch                           IDASetLineSearchOffIC     SUNFALSE
  Lower bound on Newton step                    IDASetStepToleranceIC     uround^(2/3)
IDALS linear solver interface
  Jacobian function                             IDASetJacFn               DQ
  Jacobian-times-vector function                IDASetJacTimes            NULL, DQ
  Preconditioner functions                      IDASetPreconditioner      NULL, NULL
  Ratio between linear and nonlinear tolerances IDASetEpsLin              0.05
  Increment factor used in DQ Jv approx.        IDASetIncrementFactor     1.0

IDASetErrFile

Call          flag = IDASetErrFile(ida_mem, errfp);

Description   The function IDASetErrFile specifies the pointer to the file where all idas messages should be directed when the default idas error handler function is used.

Arguments     ida_mem (void *) pointer to the idas memory block.
              errfp (FILE *) pointer to output file.
Return value  The return value flag (of type int) is one of
              IDA_SUCCESS   The optional value has been successfully set.
              IDA_MEM_NULL  The ida_mem pointer is NULL.

Notes         The default value for errfp is stderr. Passing NULL disables all future error message output (except for the case in which the idas memory pointer is NULL); this use of IDASetErrFile is strongly discouraged.
              If IDASetErrFile is to be called, it should be called before any other optional input functions, in order to take effect for any later error message.

IDASetErrHandlerFn

Call          flag = IDASetErrHandlerFn(ida_mem, ehfun, eh_data);

Description   The function IDASetErrHandlerFn specifies the optional user-defined function to be used in handling error messages.

Arguments     ida_mem (void *) pointer to the idas memory block.
              ehfun (IDAErrHandlerFn) is the user's C error handler function (see §4.6.2).
              eh_data (void *) pointer to user data passed to ehfun every time it is called.

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS   The function ehfun and data pointer eh_data have been successfully set.
              IDA_MEM_NULL  The ida_mem pointer is NULL.

Notes         Error messages indicating that the idas solver memory is NULL will always be directed to stderr.

IDASetUserData

Call          flag = IDASetUserData(ida_mem, user_data);

Description   The function IDASetUserData specifies the user data block user_data and attaches it to the main idas memory block.

Arguments     ida_mem (void *) pointer to the idas memory block.
              user_data (void *) pointer to the user data.

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS   The optional value has been successfully set.
              IDA_MEM_NULL  The ida_mem pointer is NULL.

Notes         If specified, the pointer to user_data is passed to all user-supplied functions that have it as an argument. Otherwise, a NULL pointer is passed.
              If user_data is needed in user linear solver or preconditioner functions, the call to IDASetUserData must be made before the call to specify the linear solver.

IDASetMaxOrd

Call          flag = IDASetMaxOrd(ida_mem, maxord);

Description   The function IDASetMaxOrd specifies the maximum order of the linear multistep method.

Arguments     ida_mem (void *) pointer to the idas memory block.
              maxord (int) value of the maximum method order. This must be positive.

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS    The optional value has been successfully set.
              IDA_MEM_NULL   The ida_mem pointer is NULL.
              IDA_ILL_INPUT  The input value maxord is ≤ 0, or larger than its previous value.

Notes         The default value is 5. If the input value exceeds 5, the value 5 will be used. Since maxord affects the memory requirements for the internal idas memory block, its value cannot be increased past its previous value.

IDASetMaxNumSteps

Call          flag = IDASetMaxNumSteps(ida_mem, mxsteps);

Description   The function IDASetMaxNumSteps specifies the maximum number of steps to be taken by the solver in its attempt to reach the next output time.

Arguments     ida_mem (void *) pointer to the idas memory block.
              mxsteps (long int) maximum allowed number of steps.

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS   The optional value has been successfully set.
              IDA_MEM_NULL  The ida_mem pointer is NULL.

Notes         Passing mxsteps = 0 results in idas using the default value (500). Passing mxsteps < 0 disables the test (not recommended).

IDASetInitStep

Call          flag = IDASetInitStep(ida_mem, hin);

Description   The function IDASetInitStep specifies the initial step size.

Arguments     ida_mem (void *) pointer to the idas memory block.
              hin (realtype) value of the initial step size to be attempted. Pass 0.0 to have idas use the default value.

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS   The optional value has been successfully set.
              IDA_MEM_NULL  The ida_mem pointer is NULL.

Notes         By default, idas estimates the initial step as the solution of ‖hẏ‖_WRMS = 1/2, with the added restriction that |h| ≤ 0.001·|tout − t0|.

IDASetMaxStep

Call          flag = IDASetMaxStep(ida_mem, hmax);

Description   The function IDASetMaxStep specifies the maximum absolute value of the step size.

Arguments     ida_mem (void *) pointer to the idas memory block.
              hmax (realtype) maximum absolute value of the step size.

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS    The optional value has been successfully set.
              IDA_MEM_NULL   The ida_mem pointer is NULL.
              IDA_ILL_INPUT  Either hmax is not positive or it is smaller than the minimum allowable step.

Notes         Pass hmax = 0 to obtain the default value ∞.

IDASetStopTime

Call          flag = IDASetStopTime(ida_mem, tstop);

Description   The function IDASetStopTime specifies the value of the independent variable t past which the solution is not to proceed.

Arguments     ida_mem (void *) pointer to the idas memory block.
              tstop (realtype) value of the independent variable past which the solution should not proceed.

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS    The optional value has been successfully set.
              IDA_MEM_NULL   The ida_mem pointer is NULL.
              IDA_ILL_INPUT  The value of tstop is not beyond the current t value, tn.

Notes         The default, if this routine is not called, is that no stop time is imposed.

IDASetMaxErrTestFails

Call          flag = IDASetMaxErrTestFails(ida_mem, maxnef);

Description   The function IDASetMaxErrTestFails specifies the maximum number of error test failures in attempting one step.

Arguments     ida_mem (void *) pointer to the idas memory block.
              maxnef (int) maximum number of error test failures allowed on one step (> 0).

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS   The optional value has been successfully set.
              IDA_MEM_NULL  The ida_mem pointer is NULL.

Notes         The default value is 7.
IDASetMaxNonlinIters

Call          flag = IDASetMaxNonlinIters(ida_mem, maxcor);

Description   The function IDASetMaxNonlinIters specifies the maximum number of nonlinear solver iterations at one step.

Arguments     ida_mem (void *) pointer to the idas memory block.
              maxcor (int) maximum number of nonlinear solver iterations allowed on one step (> 0).

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS   The optional value has been successfully set.
              IDA_MEM_NULL  The ida_mem pointer is NULL.
              IDA_MEM_FAIL  The sunnonlinsol module is NULL.

Notes         The default value is 3.

IDASetMaxConvFails

Call          flag = IDASetMaxConvFails(ida_mem, maxncf);

Description   The function IDASetMaxConvFails specifies the maximum number of nonlinear solver convergence failures at one step.

Arguments     ida_mem (void *) pointer to the idas memory block.
              maxncf (int) maximum number of allowable nonlinear solver convergence failures on one step (> 0).

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS   The optional value has been successfully set.
              IDA_MEM_NULL  The ida_mem pointer is NULL.

Notes         The default value is 10.

IDASetNonlinConvCoef

Call          flag = IDASetNonlinConvCoef(ida_mem, nlscoef);

Description   The function IDASetNonlinConvCoef specifies the safety factor in the nonlinear convergence test; see Chapter 2, Eq. (2.8).

Arguments     ida_mem (void *) pointer to the idas memory block.
              nlscoef (realtype) coefficient in nonlinear convergence test (> 0.0).

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS    The optional value has been successfully set.
              IDA_MEM_NULL   The ida_mem pointer is NULL.
              IDA_ILL_INPUT  The value of nlscoef is ≤ 0.0.

Notes         The default value is 0.33.

IDASetSuppressAlg

Call          flag = IDASetSuppressAlg(ida_mem, suppressalg);

Description   The function IDASetSuppressAlg indicates whether or not to suppress algebraic variables in the local error test.

Arguments     ida_mem (void *) pointer to the idas memory block.
              suppressalg (booleantype) indicates whether to suppress (SUNTRUE) or not (SUNFALSE) the algebraic variables in the local error test.

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS   The optional value has been successfully set.
              IDA_MEM_NULL  The ida_mem pointer is NULL.

Notes         The default value is SUNFALSE. If suppressalg = SUNTRUE is selected, then the id vector must be set (through IDASetId) to specify the algebraic components. In general, the use of this option (with suppressalg = SUNTRUE) is discouraged when solving DAE systems of index 1, whereas it is generally encouraged for systems of index 2 or more. See pp. 146-147 of Ref. [4] for more on this issue.

IDASetId

Call          flag = IDASetId(ida_mem, id);

Description   The function IDASetId specifies algebraic/differential components in the y vector.

Arguments     ida_mem (void *) pointer to the idas memory block.
              id (N_Vector) state vector. A value of 1.0 indicates a differential variable, while 0.0 indicates an algebraic variable.

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS   The optional value has been successfully set.
              IDA_MEM_NULL  The ida_mem pointer is NULL.

Notes         The vector id is required if the algebraic variables are to be suppressed from the local error test (see IDASetSuppressAlg) or if IDACalcIC is to be called with icopt = IDA_YA_YDP_INIT (see §4.5.5).

IDASetConstraints

Call          flag = IDASetConstraints(ida_mem, constraints);

Description   The function IDASetConstraints specifies a vector defining inequality constraints for each component of the solution vector y.

Arguments     ida_mem (void *) pointer to the idas memory block.
              constraints (N_Vector) vector of constraint flags. If constraints[i] is
                  0.0  then no constraint is imposed on yi.
                  1.0  then yi will be constrained to be yi ≥ 0.0.
                 −1.0  then yi will be constrained to be yi ≤ 0.0.
                  2.0  then yi will be constrained to be yi > 0.0.
                 −2.0  then yi will be constrained to be yi < 0.0.
Return value  The return value flag (of type int) is one of
              IDA_SUCCESS    The optional value has been successfully set.
              IDA_MEM_NULL   The ida_mem pointer is NULL.
              IDA_ILL_INPUT  The constraints vector contains illegal values, or the simultaneous corrector option has been selected when doing forward sensitivity analysis.

Notes         The presence of a non-NULL constraints vector that is not 0.0 in all components will cause constraint checking to be performed. However, a call with 0.0 in all components of constraints will result in an illegal input return. Constraint checking when doing forward sensitivity analysis with the simultaneous corrector option is currently disallowed and will result in an illegal input return.

4.5.8.2 Linear solver interface optional input functions

The mathematical explanation of the linear solver methods available to idas is provided in §2.1. We group the user-callable routines into four categories: general routines concerning the overall idals linear solver interface, optional inputs for matrix-based linear solvers, optional inputs for matrix-free linear solvers, and optional inputs for iterative linear solvers. We note that the matrix-based and matrix-free groups are mutually exclusive, whereas the "iterative" tag can apply to either case.

When using matrix-based linear solver modules, the idals solver interface needs a function to compute an approximation to the Jacobian matrix J(t, y, ẏ). This function must be of type IDALsJacFn. The user can supply a Jacobian function, or, if using a dense or banded matrix J, can use the default internal difference quotient approximation that comes with the idals interface. To specify a user-supplied Jacobian function jac, idals provides the function IDASetJacFn. The idals interface passes the pointer user_data to the Jacobian function.
This allows the user to create an arbitrary structure with relevant problem data and access it during the execution of the user-supplied Jacobian function, without using global data in the program. The pointer user_data may be specified through IDASetUserData.

IDASetJacFn

Call          flag = IDASetJacFn(ida_mem, jac);

Description   The function IDASetJacFn specifies the Jacobian approximation function to be used for a matrix-based solver within the idals interface.

Arguments     ida_mem (void *) pointer to the idas memory block.
              jac (IDALsJacFn) user-defined Jacobian approximation function.

Return value  The return value flag (of type int) is one of
              IDALS_SUCCESS    The optional value has been successfully set.
              IDALS_MEM_NULL   The ida_mem pointer is NULL.
              IDALS_LMEM_NULL  The idals linear solver interface has not been initialized.

Notes         This function must be called after the idals linear solver interface has been initialized through a call to IDASetLinearSolver. By default, idals uses an internal difference quotient function for dense and band matrices. If NULL is passed to jac, this default function is used. An error will occur if no jac is supplied when using other matrix types. The function type IDALsJacFn is described in §4.6.5.
              The previous routine IDADlsSetJacFn is now a wrapper for this routine, and may still be used for backward compatibility. However, it will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

When using matrix-free linear solver modules, the idals solver interface requires a function to compute an approximation to the product between the Jacobian matrix J(t, y) and a vector v. The user can supply a Jacobian-times-vector approximation function, or use the default internal difference quotient function that comes with the idals solver interface.
A user-defined Jacobian-vector function must be of type IDALsJacTimesVecFn and can be specified through a call to IDASetJacTimes (see §4.6.6 for specification details). The evaluation and processing of any Jacobian-related data needed by the user's Jacobian-times-vector function may be done in the optional user-supplied function jtsetup (see §4.6.7 for specification details). The pointer user_data received through IDASetUserData (or a pointer to NULL if user_data was not specified) is passed to the Jacobian-times-vector setup and product functions, jtsetup and jtimes, each time they are called. This allows the user to create an arbitrary structure with relevant problem data and access it during the execution of these user-supplied functions without using global data in the program.

IDASetJacTimes

Call          flag = IDASetJacTimes(ida_mem, jtsetup, jtimes);

Description   The function IDASetJacTimes specifies the Jacobian-vector setup and product functions.

Arguments     ida_mem (void *) pointer to the idas memory block.
              jtsetup (IDALsJacTimesSetupFn) user-defined function to set up the Jacobian-vector product. Pass NULL if no setup is necessary.
              jtimes (IDALsJacTimesVecFn) user-defined Jacobian-vector product function.

Return value  The return value flag (of type int) is one of
              IDALS_SUCCESS     The optional value has been successfully set.
              IDALS_MEM_NULL    The ida_mem pointer is NULL.
              IDALS_LMEM_NULL   The idals linear solver has not been initialized.
              IDALS_SUNLS_FAIL  An error occurred when setting up the system matrix-times-vector routines in the sunlinsol object used by the idals interface.

Notes         The default is to use an internal finite difference quotient for jtimes and to omit jtsetup. If NULL is passed to jtimes, these defaults are used. A user may specify non-NULL jtimes and NULL jtsetup inputs. This function must be called after the idals linear solver interface has been initialized through a call to IDASetLinearSolver.
              The function type IDALsJacTimesSetupFn is described in §4.6.7. The function type IDALsJacTimesVecFn is described in §4.6.6.
              The previous routine IDASpilsSetJacTimes is now a wrapper for this routine, and may still be used for backward compatibility. However, it will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

Alternately, when using the default difference-quotient approximation to the Jacobian-vector product, the user may specify the factor to use in setting increments for the finite-difference approximation, via a call to IDASetIncrementFactor:

IDASetIncrementFactor

Call          flag = IDASetIncrementFactor(ida_mem, dqincfac);

Description   The function IDASetIncrementFactor specifies the increment factor to be used in the difference-quotient approximation to the product Jv. Specifically, Jv is approximated via the formula

                  Jv = (1/σ) [ F(t, ỹ, ỹ′) − F(t, y, y′) ],

              where ỹ = y + σv, ỹ′ = y′ + c_j σv, c_j is a BDF parameter proportional to the step size, σ = √N · dqincfac, and N is the number of equations in the DAE system.

Arguments     ida_mem (void *) pointer to the idas memory block.
              dqincfac (realtype) user-specified increment factor (positive).

Return value  The return value flag (of type int) is one of
              IDALS_SUCCESS    The optional value has been successfully set.
              IDALS_MEM_NULL   The ida_mem pointer is NULL.
              IDALS_LMEM_NULL  The idals linear solver has not been initialized.
              IDALS_ILL_INPUT  The specified value of dqincfac is ≤ 0.

Notes         The default value is 1.0. This function must be called after the idals linear solver interface has been initialized through a call to IDASetLinearSolver.
              The previous routine IDASpilsSetIncrementFactor is now a wrapper for this routine, and may still be used for backward compatibility. However, it will be deprecated in future releases, so we recommend that users transition to the new routine name soon.
When using an iterative linear solver, the user may supply a preconditioning operator to aid in the solution of the system. This operator consists of two user-supplied functions, psetup and psolve, that are supplied to idas using the function IDASetPreconditioner. The psetup function supplied to this routine should handle evaluation and preprocessing of any Jacobian data needed by the user's preconditioner solve function, psolve. Both of these functions are fully specified in §4.6. The user_data pointer received through IDASetUserData (or a pointer to NULL if user_data was not specified) is passed to the psetup and psolve functions. This allows the user to create an arbitrary structure with relevant problem data and access it during the execution of the user-supplied preconditioner functions without using global data in the program.

Also, as described in §2.1, the idals interface requires that iterative linear solvers stop when the norm of the preconditioned residual satisfies

    ‖r‖ ≤ (ε_L ε) / 10,

where ε is the nonlinear solver tolerance, and the default ε_L = 0.05; this value may be modified by the user through the IDASetEpsLin function.

IDASetPreconditioner

Call          flag = IDASetPreconditioner(ida_mem, psetup, psolve);

Description   The function IDASetPreconditioner specifies the preconditioner setup and solve functions.

Arguments     ida_mem (void *) pointer to the idas memory block.
              psetup (IDALsPrecSetupFn) user-defined function to set up the preconditioner. Pass NULL if no setup is necessary.
              psolve (IDALsPrecSolveFn) user-defined preconditioner solve function.

Return value  The return value flag (of type int) is one of
              IDALS_SUCCESS     The optional values have been successfully set.
              IDALS_MEM_NULL    The ida_mem pointer is NULL.
              IDALS_LMEM_NULL   The idals linear solver has not been initialized.
              IDALS_SUNLS_FAIL  An error occurred when setting up preconditioning in the sunlinsol object used by the idals interface.
Notes         The default is NULL for both arguments (i.e., no preconditioning). This function must be called after the idals linear solver interface has been initialized through a call to IDASetLinearSolver. The function type IDALsPrecSolveFn is described in §4.6.8. The function type IDALsPrecSetupFn is described in §4.6.9.
              The previous routine IDASpilsSetPreconditioner is now a wrapper for this routine, and may still be used for backward compatibility. However, it will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

IDASetEpsLin

Call          flag = IDASetEpsLin(ida_mem, eplifac);

Description   The function IDASetEpsLin specifies the factor by which the Krylov linear solver's convergence test constant is reduced from the nonlinear iteration test constant.

Arguments     ida_mem (void *) pointer to the idas memory block.
              eplifac (realtype) linear convergence safety factor (≥ 0.0).

Return value  The return value flag (of type int) is one of
              IDALS_SUCCESS    The optional value has been successfully set.
              IDALS_MEM_NULL   The ida_mem pointer is NULL.
              IDALS_LMEM_NULL  The idals linear solver has not been initialized.
              IDALS_ILL_INPUT  The factor eplifac is negative.

Notes         The default value is 0.05. This function must be called after the idals linear solver interface has been initialized through a call to IDASetLinearSolver. If eplifac = 0.0 is passed, the default value is used.
              The previous routine IDASpilsSetEpsLin is now a wrapper for this routine, and may still be used for backward compatibility. However, it will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

4.5.8.3 Initial condition calculation optional input functions

The following functions can be called just prior to calling IDACalcIC to set optional inputs controlling the initial condition calculation.
IDASetNonlinConvCoefIC

Call          flag = IDASetNonlinConvCoefIC(ida_mem, epiccon);

Description   The function IDASetNonlinConvCoefIC specifies the positive constant in the Newton iteration convergence test within the initial condition calculation.

Arguments     ida_mem (void *) pointer to the idas memory block.
              epiccon (realtype) coefficient in the Newton convergence test (> 0).

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS    The optional value has been successfully set.
              IDA_MEM_NULL   The ida_mem pointer is NULL.
              IDA_ILL_INPUT  The epiccon factor is ≤ 0.0.

Notes         The default value is 0.01 · 0.33. This test uses a weighted RMS norm (with weights defined by the tolerances). For new initial value vectors y and ẏ to be accepted, the norm of J⁻¹F(t₀, y, ẏ) must be ≤ epiccon, where J is the system Jacobian.

IDASetMaxNumStepsIC

Call          flag = IDASetMaxNumStepsIC(ida_mem, maxnh);

Description   The function IDASetMaxNumStepsIC specifies the maximum number of steps allowed when icopt = IDA_YA_YDP_INIT in IDACalcIC, where h appears in the system Jacobian, J = ∂F/∂y + (1/h) ∂F/∂ẏ.

Arguments     ida_mem (void *) pointer to the idas memory block.
              maxnh (int) maximum allowed number of values for h.

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS    The optional value has been successfully set.
              IDA_MEM_NULL   The ida_mem pointer is NULL.
              IDA_ILL_INPUT  maxnh is non-positive.

Notes         The default value is 5.

IDASetMaxNumJacsIC

Call          flag = IDASetMaxNumJacsIC(ida_mem, maxnj);

Description   The function IDASetMaxNumJacsIC specifies the maximum number of approximate Jacobian or preconditioner evaluations allowed when the Newton iteration appears to be slowly converging.

Arguments     ida_mem (void *) pointer to the idas memory block.
              maxnj (int) maximum allowed number of Jacobian or preconditioner evaluations.

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS    The optional value has been successfully set.
              IDA_MEM_NULL   The ida_mem pointer is NULL.
              IDA_ILL_INPUT  maxnj is non-positive.

Notes         The default value is 4.

IDASetMaxNumItersIC

Call          flag = IDASetMaxNumItersIC(ida_mem, maxnit);

Description   The function IDASetMaxNumItersIC specifies the maximum number of Newton iterations allowed in any one attempt to solve the initial conditions calculation problem.

Arguments     ida_mem (void *) pointer to the idas memory block.
              maxnit (int) maximum number of Newton iterations.

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS    The optional value has been successfully set.
              IDA_MEM_NULL   The ida_mem pointer is NULL.
              IDA_ILL_INPUT  maxnit is non-positive.

Notes         The default value is 10.

IDASetMaxBacksIC

Call          flag = IDASetMaxBacksIC(ida_mem, maxbacks);

Description   The function IDASetMaxBacksIC specifies the maximum number of linesearch backtracks allowed in any Newton iteration when solving the initial conditions calculation problem.

Arguments     ida_mem (void *) pointer to the idas memory block.
              maxbacks (int) maximum number of linesearch backtracks per Newton step.

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS    The optional value has been successfully set.
              IDA_MEM_NULL   The ida_mem pointer is NULL.
              IDA_ILL_INPUT  maxbacks is non-positive.

Notes         The default value is 100. If IDASetMaxBacksIC is called in a forward sensitivity analysis, the limit maxbacks applies in the calculation of both the initial state values and the initial sensitivities.

IDASetLineSearchOffIC

Call          flag = IDASetLineSearchOffIC(ida_mem, lsoff);

Description   The function IDASetLineSearchOffIC specifies whether to turn on or off the linesearch algorithm.

Arguments     ida_mem (void *) pointer to the idas memory block.
              lsoff (booleantype) a flag to turn off (SUNTRUE) or keep (SUNFALSE) the linesearch algorithm.

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS    The optional value has been successfully set.
              IDA_MEM_NULL  The ida_mem pointer is NULL.

Notes         The default value is SUNFALSE.

IDASetStepToleranceIC

Call          flag = IDASetStepToleranceIC(ida_mem, steptol);

Description   The function IDASetStepToleranceIC specifies a positive lower bound on the Newton step.

Arguments     ida_mem (void *) pointer to the idas memory block.
              steptol (realtype) minimum allowed WRMS-norm of the Newton step (> 0.0).

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS    The optional value has been successfully set.
              IDA_MEM_NULL   The ida_mem pointer is NULL.
              IDA_ILL_INPUT  The steptol tolerance is ≤ 0.0.

Notes         The default value is (unit roundoff)^(2/3).

4.5.8.4 Rootfinding optional input functions

The following functions can be called to set optional inputs to control the rootfinding algorithm.

IDASetRootDirection

Call          flag = IDASetRootDirection(ida_mem, rootdir);

Description   The function IDASetRootDirection specifies the direction of zero-crossings to be located and returned to the user.

Arguments     ida_mem (void *) pointer to the idas memory block.
              rootdir (int *) state array of length nrtfn, the number of root functions gi, as specified in the call to the function IDARootInit. A value of 0 for rootdir[i] indicates that crossings in either direction should be reported for gi. A value of +1 or −1 indicates that the solver should report only zero-crossings where gi is increasing or decreasing, respectively.

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS    The optional value has been successfully set.
              IDA_MEM_NULL   The ida_mem pointer is NULL.
              IDA_ILL_INPUT  Rootfinding has not been activated through a call to IDARootInit.

Notes         The default behavior is to locate both zero-crossing directions.

IDASetNoInactiveRootWarn

Call          flag = IDASetNoInactiveRootWarn(ida_mem);

Description   The function IDASetNoInactiveRootWarn disables issuing a warning if some root function appears to be identically zero at the beginning of the integration.
Arguments     ida_mem (void *) pointer to the idas memory block.

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS   The optional value has been successfully set.
              IDA_MEM_NULL  The ida_mem pointer is NULL.

Notes         idas will not report the initial conditions as a possible zero-crossing (assuming that one or more components gi are zero at the initial time). However, if it appears that some gi is identically zero at the initial time (i.e., gi is zero at the initial time and after the first step), idas will issue a warning, which can be disabled with this optional input function.

4.5.9 Interpolated output function

An optional function IDAGetDky is available to obtain additional output values. This function must be called after a successful return from IDASolve and provides interpolated values of y or its derivatives of order up to the last internal order used, for any value of t in the last internal step taken by idas. The call to the IDAGetDky function has the following form:

IDAGetDky

Call          flag = IDAGetDky(ida_mem, t, k, dky);

Description   The function IDAGetDky computes the interpolated values of the kth derivative of y for any value of t in the last internal step taken by idas. The value of k must be nonnegative and smaller than the last internal order used. A value of 0 for k means that y is interpolated. The value of t must satisfy tn − hu ≤ t ≤ tn, where tn denotes the current internal time reached and hu is the last internal step size used successfully.

Arguments     ida_mem (void *) pointer to the idas memory block.
              t (realtype) time at which to interpolate.
              k (int) integer specifying the order of the derivative of y wanted.
              dky (N_Vector) vector containing the interpolated kth derivative of y(t).

Return value  The return value flag (of type int) is one of
              IDA_SUCCESS   IDAGetDky succeeded.
              IDA_MEM_NULL  The ida_mem argument was NULL.
              IDA_BAD_T     t is not in the interval [tn − hu, tn].
              IDA_BAD_K     k is not one of {0, 1, ..., klast}.
              IDA_BAD_DKY   dky is NULL.

Notes         It is only legal to call the function IDAGetDky after a successful return from IDASolve. The functions IDAGetCurrentTime, IDAGetLastStep, and IDAGetLastOrder (see §4.5.10.2) can be used to access tn, hu, and klast.

4.5.10 Optional output functions

idas provides an extensive list of functions that can be used to obtain solver performance information. Table 4.3 lists all optional output functions in idas, which are then described in detail in the remainder of this section.

Some of the optional outputs, especially the various counters, can be very useful in determining how successful the idas solver is in doing its job. For example, the counters nsteps and nrevals provide a rough measure of the overall cost of a given run, and can be compared among runs with differing input options to suggest which set of options is most efficient. The ratio nniters/nsteps measures the performance of the nonlinear solver in solving the nonlinear systems at each time step; typical values for this range from 1.1 to 1.8. The ratio njevals/nniters (in the case of a matrix-based linear solver) and the ratio npevals/nniters (in the case of an iterative linear solver) measure the overall degree of nonlinearity in these systems, and also the quality of the approximate Jacobian or preconditioner being used. Thus, for example, njevals/nniters can indicate whether a user-supplied Jacobian is inaccurate: this is suggested if the ratio is larger than for the case of the corresponding internal Jacobian. The ratio nliters/nniters measures the performance of the Krylov iterative linear solver, and thus (indirectly) the quality of the preconditioner.

4.5.10.1 SUNDIALS version information

The following functions provide a way to get sundials version information at runtime.
4.5 User-callable functions

Table 4.3: Optional outputs from idas and idals

IDAS main solver
Size of idas real and integer workspace: IDAGetWorkSpace
Cumulative number of internal steps: IDAGetNumSteps
No. of calls to residual function: IDAGetNumResEvals
No. of calls to linear solver setup function: IDAGetNumLinSolvSetups
No. of local error test failures that have occurred: IDAGetNumErrTestFails
Order used during the last step: IDAGetLastOrder
Order to be attempted on the next step: IDAGetCurrentOrder
Order reductions due to stability limit detection: IDAGetNumStabLimOrderReds
Actual initial step size used: IDAGetActualInitStep
Step size used for the last step: IDAGetLastStep
Step size to be attempted on the next step: IDAGetCurrentStep
Current internal time reached by the solver: IDAGetCurrentTime
Suggested factor for tolerance scaling: IDAGetTolScaleFactor
Error weight vector for state variables: IDAGetErrWeights
Estimated local errors: IDAGetEstLocalErrors
No. of nonlinear solver iterations: IDAGetNumNonlinSolvIters
No. of nonlinear convergence failures: IDAGetNumNonlinSolvConvFails
Array showing roots found: IDAGetRootInfo
No. of calls to user root function: IDAGetNumGEvals
Name of constant associated with a return flag: IDAGetReturnFlagName

IDAS initial conditions calculation
Number of backtrack operations: IDAGetNumBacktrackOps
Corrected initial conditions: IDAGetConsistentIC

IDALS linear solver interface
Size of real and integer workspace: IDAGetLinWorkSpace
No. of Jacobian evaluations: IDAGetNumJacEvals
No. of residual calls for finite diff. Jacobian[-vector] evals.: IDAGetNumLinResEvals
No. of linear iterations: IDAGetNumLinIters
No. of linear convergence failures: IDAGetNumLinConvFails
No. of preconditioner evaluations: IDAGetNumPrecEvals
No. of preconditioner solves: IDAGetNumPrecSolves
No. of Jacobian-vector setup evaluations: IDAGetNumJTSetupEvals
No. of Jacobian-vector product evaluations: IDAGetNumJtimesEvals
Last return from a linear solver function: IDAGetLastLinFlag
Name of constant associated with a return flag: IDAGetLinReturnFlagName

SUNDIALSGetVersion
Call flag = SUNDIALSGetVersion(version, len);
Description The function SUNDIALSGetVersion fills a character array with sundials version information.
Arguments version (char *) character array to hold the sundials version information.
len (int) allocated length of the version character array.
Return value If successful, SUNDIALSGetVersion returns 0 and version contains the sundials version information. Otherwise, it returns −1 and version is not set (the input character array is too short).
Notes A string of 25 characters should be sufficient to hold the version information. Any trailing characters in the version array are removed.

SUNDIALSGetVersionNumber
Call flag = SUNDIALSGetVersionNumber(&major, &minor, &patch, label, len);
Description The function SUNDIALSGetVersionNumber sets integers for the sundials major, minor, and patch release numbers and fills a character array with the release label, if applicable.
Arguments major (int) sundials release major version number.
minor (int) sundials release minor version number.
patch (int) sundials release patch version number.
label (char *) character array to hold the sundials release label.
len (int) allocated length of the label character array.
Return value If successful, SUNDIALSGetVersionNumber returns 0 and the major, minor, patch, and label values are set. Otherwise, it returns −1 and the values are not set (the input character array is too short).
Notes A string of 10 characters should be sufficient to hold the label information. If a label is not used in the release version, no information is copied to label. Any trailing characters in the label array are removed.
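A minimal usage sketch for the two version-query functions above. This assumes the sundials headers and library are available at build and link time, so it is shown as an outline rather than a standalone, testable program:

```c
#include <stdio.h>
#include <sundials/sundials_version.h>  /* SUNDIALSGetVersion(Number) */

int main(void)
{
  char version[25];  /* 25 characters suggested in the text */
  char label[10];    /* 10 characters suggested for the label */
  int major, minor, patch;

  /* Full version string, e.g. "4.0.0". */
  if (SUNDIALSGetVersion(version, sizeof(version)) == 0)
    printf("sundials version: %s\n", version);

  /* Individual version components plus an optional release label. */
  if (SUNDIALSGetVersionNumber(&major, &minor, &patch,
                               label, sizeof(label)) == 0)
    printf("major %d, minor %d, patch %d, label \"%s\"\n",
           major, minor, patch, label);
  return 0;
}
```

Both functions return −1 when the supplied character array is too short, so the return value should always be checked before using the output buffers.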
4.5.10.2 Main solver optional output functions

idas provides several user-callable functions that can be used to obtain different quantities that may be of interest to the user, such as solver workspace requirements and solver performance statistics, as well as additional data from the idas memory block (a suggested tolerance scaling factor, the error weight vector, and the vector of estimated local errors). Also provided are functions to extract statistics related to the performance of the sunnonlinsol nonlinear solver being used. As a convenience, additional extraction functions provide the optional outputs in groups. These optional output functions are described next.

IDAGetWorkSpace
Call flag = IDAGetWorkSpace(ida mem, &lenrw, &leniw);
Description The function IDAGetWorkSpace returns the idas real and integer workspace sizes.
Arguments ida mem (void *) pointer to the idas memory block.
lenrw (long int) number of real values in the idas workspace.
leniw (long int) number of integer values in the idas workspace.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.
Notes In terms of the problem size N, the maximum method order maxord, and the number nrtfn of root functions (see §4.5.6), the actual size of the real workspace, in realtype words, is given by the following:
• base value: lenrw = 55 + (m + 6) * Nr + 3 * nrtfn;
• with IDASVtolerances: lenrw = lenrw + Nr;
• with constraint checking (see IDASetConstraints): lenrw = lenrw + Nr;
• with id specified (see IDASetId): lenrw = lenrw + Nr;
where m = max(maxord, 3), and Nr is the number of real words in one N Vector (≈ N).
The size of the integer workspace (without distinction between int and long int words) is given by:
• base value: leniw = 38 + (m + 6) * Ni + nrtfn;
• with IDASVtolerances: leniw = leniw + Ni;
• with constraint checking: leniw = leniw + Ni;
• with id specified: leniw = leniw + Ni;
where Ni is the number of integer words in one N Vector (= 1 for nvector serial and 2*npes for nvector parallel on npes processors). For the default value of maxord, with no rootfinding, no id, no constraints, and with no call to IDASVtolerances, these lengths are given roughly by: lenrw = 55 + 11N, leniw = 49. Note that additional memory is allocated if quadratures and/or forward sensitivity integration is enabled. See §4.7.1 and §5.2.1 for more details.

IDAGetNumSteps
Call flag = IDAGetNumSteps(ida mem, &nsteps);
Description The function IDAGetNumSteps returns the cumulative number of internal steps taken by the solver (total so far).
Arguments ida mem (void *) pointer to the idas memory block.
nsteps (long int) number of steps taken by idas.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.

IDAGetNumResEvals
Call flag = IDAGetNumResEvals(ida mem, &nrevals);
Description The function IDAGetNumResEvals returns the number of calls to the user's residual evaluation function.
Arguments ida mem (void *) pointer to the idas memory block.
nrevals (long int) number of calls to the user's res function.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.
Notes The nrevals value returned by IDAGetNumResEvals does not account for calls made to res from a linear solver or preconditioner module.
IDAGetNumLinSolvSetups
Call flag = IDAGetNumLinSolvSetups(ida mem, &nlinsetups);
Description The function IDAGetNumLinSolvSetups returns the cumulative number of calls made to the linear solver's setup function (total so far).
Arguments ida mem (void *) pointer to the idas memory block.
nlinsetups (long int) number of calls made to the linear solver setup function.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.

IDAGetNumErrTestFails
Call flag = IDAGetNumErrTestFails(ida mem, &netfails);
Description The function IDAGetNumErrTestFails returns the cumulative number of local error test failures that have occurred (total so far).
Arguments ida mem (void *) pointer to the idas memory block.
netfails (long int) number of error test failures.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.

IDAGetLastOrder
Call flag = IDAGetLastOrder(ida mem, &klast);
Description The function IDAGetLastOrder returns the integration method order used during the last internal step.
Arguments ida mem (void *) pointer to the idas memory block.
klast (int) method order used on the last internal step.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.

IDAGetCurrentOrder
Call flag = IDAGetCurrentOrder(ida mem, &kcur);
Description The function IDAGetCurrentOrder returns the integration method order to be used on the next internal step.
Arguments ida mem (void *) pointer to the idas memory block.
kcur (int) method order to be used on the next internal step.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.
IDAGetLastStep
Call flag = IDAGetLastStep(ida mem, &hlast);
Description The function IDAGetLastStep returns the integration step size taken on the last internal step.
Arguments ida mem (void *) pointer to the idas memory block.
hlast (realtype) step size taken on the last internal step by idas, or last artificial step size used in IDACalcIC, whichever was called last.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.

IDAGetCurrentStep
Call flag = IDAGetCurrentStep(ida mem, &hcur);
Description The function IDAGetCurrentStep returns the integration step size to be attempted on the next internal step.
Arguments ida mem (void *) pointer to the idas memory block.
hcur (realtype) step size to be attempted on the next internal step.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.

IDAGetActualInitStep
Call flag = IDAGetActualInitStep(ida mem, &hinused);
Description The function IDAGetActualInitStep returns the value of the integration step size used on the first step.
Arguments ida mem (void *) pointer to the idas memory block.
hinused (realtype) actual value of initial step size.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.
Notes Even if the value of the initial integration step size was specified by the user through a call to IDASetInitStep, this value might have been changed by idas to ensure that the step size is within the prescribed bounds (hmin ≤ h0 ≤ hmax), or to meet the local error test.

IDAGetCurrentTime
Call flag = IDAGetCurrentTime(ida mem, &tcur);
Description The function IDAGetCurrentTime returns the current internal time reached by the solver.
Arguments ida mem (void *) pointer to the idas memory block.
tcur (realtype) current internal time reached.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.

IDAGetTolScaleFactor
Call flag = IDAGetTolScaleFactor(ida mem, &tolsfac);
Description The function IDAGetTolScaleFactor returns a suggested factor by which the user's tolerances should be scaled when too much accuracy has been requested for some internal step.
Arguments ida mem (void *) pointer to the idas memory block.
tolsfac (realtype) suggested scaling factor for user tolerances.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.

IDAGetErrWeights
Call flag = IDAGetErrWeights(ida mem, eweight);
Description The function IDAGetErrWeights returns the solution error weights at the current time. These are the Wi given by Eq. (2.7) (or by the user's IDAEwtFn).
Arguments ida mem (void *) pointer to the idas memory block.
eweight (N Vector) solution error weights at the current time.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.
Notes The user must allocate space for eweight.

IDAGetEstLocalErrors
Call flag = IDAGetEstLocalErrors(ida mem, ele);
Description The function IDAGetEstLocalErrors returns the estimated local errors.
Arguments ida mem (void *) pointer to the idas memory block.
ele (N Vector) estimated local errors at the current time.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.
Notes The user must allocate space for ele. The values returned in ele are only valid if IDASolve returned a non-negative value.
The ele vector, together with the eweight vector from IDAGetErrWeights, can be used to determine how the various components of the system contributed to the estimated local error test. Specifically, that error test uses the RMS norm of a vector whose components are the products of the components of these two vectors. Thus, for example, if there were recent error test failures, the components causing the failures are those with the largest values for the products, denoted loosely as eweight[i]*ele[i].

IDAGetIntegratorStats
Call flag = IDAGetIntegratorStats(ida mem, &nsteps, &nrevals, &nlinsetups, &netfails, &klast, &kcur, &hinused, &hlast, &hcur, &tcur);
Description The function IDAGetIntegratorStats returns the idas integrator statistics as a group.
Arguments ida mem (void *) pointer to the idas memory block.
nsteps (long int) cumulative number of steps taken by idas.
nrevals (long int) cumulative number of calls to the user's res function.
nlinsetups (long int) cumulative number of calls made to the linear solver setup function.
netfails (long int) cumulative number of error test failures.
klast (int) method order used on the last internal step.
kcur (int) method order to be used on the next internal step.
hinused (realtype) actual value of initial step size.
hlast (realtype) step size taken on the last internal step.
hcur (realtype) step size to be attempted on the next internal step.
tcur (realtype) current internal time reached.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output values have been successfully set.
IDA MEM NULL The ida mem pointer is NULL.

IDAGetNumNonlinSolvIters
Call flag = IDAGetNumNonlinSolvIters(ida mem, &nniters);
Description The function IDAGetNumNonlinSolvIters returns the cumulative number of nonlinear iterations performed.
Arguments ida mem (void *) pointer to the idas memory block.
nniters (long int) number of nonlinear iterations performed.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.
IDA MEM FAIL The sunnonlinsol module is NULL.

IDAGetNumNonlinSolvConvFails
Call flag = IDAGetNumNonlinSolvConvFails(ida mem, &nncfails);
Description The function IDAGetNumNonlinSolvConvFails returns the cumulative number of nonlinear convergence failures that have occurred.
Arguments ida mem (void *) pointer to the idas memory block.
nncfails (long int) number of nonlinear convergence failures.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.

IDAGetNonlinSolvStats
Call flag = IDAGetNonlinSolvStats(ida mem, &nniters, &nncfails);
Description The function IDAGetNonlinSolvStats returns the idas nonlinear solver statistics as a group.
Arguments ida mem (void *) pointer to the idas memory block.
nniters (long int) cumulative number of nonlinear iterations performed.
nncfails (long int) cumulative number of nonlinear convergence failures.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.
IDA MEM FAIL The sunnonlinsol module is NULL.

IDAGetReturnFlagName
Call name = IDAGetReturnFlagName(flag);
Description The function IDAGetReturnFlagName returns the name of the idas constant corresponding to flag.
Arguments The only argument, of type int, is a return flag from an idas function.
Return value The return value is a string containing the name of the corresponding constant.

4.5.10.3 Initial condition calculation optional output functions

IDAGetNumBacktrackOps
Call flag = IDAGetNumBacktrackOps(ida mem, &nbacktr);
Description The function IDAGetNumBacktrackOps returns the number of backtrack operations done in the linesearch algorithm in IDACalcIC.
Arguments ida mem (void *) pointer to the idas memory block.
nbacktr (long int) the cumulative number of backtrack operations.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.

IDAGetConsistentIC
Call flag = IDAGetConsistentIC(ida mem, yy0 mod, yp0 mod);
Description The function IDAGetConsistentIC returns the corrected initial conditions calculated by IDACalcIC.
Arguments ida mem (void *) pointer to the idas memory block.
yy0 mod (N Vector) consistent solution vector.
yp0 mod (N Vector) consistent derivative vector.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA ILL INPUT The function was not called before the first call to IDASolve.
IDA MEM NULL The ida mem pointer is NULL.
Notes If the consistent solution vector or consistent derivative vector is not desired, pass NULL for the corresponding argument. The user must allocate space for yy0 mod and yp0 mod (if not NULL).

4.5.10.4 Rootfinding optional output functions

There are two optional output functions associated with rootfinding.

IDAGetRootInfo
Call flag = IDAGetRootInfo(ida mem, rootsfound);
Description The function IDAGetRootInfo returns an array showing which functions were found to have a root.
Arguments ida mem (void *) pointer to the idas memory block.
rootsfound (int *) array of length nrtfn with the indices of the user functions gi found to have a root. For i = 0, . . . , nrtfn − 1, rootsfound[i] ≠ 0 if gi has a root, and = 0 if not.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output values have been successfully set.
IDA MEM NULL The ida mem pointer is NULL.
Notes Note that, for the components gi for which a root was found, the sign of rootsfound[i] indicates the direction of zero-crossing.
A value of +1 indicates that gi is increasing, while a value of −1 indicates a decreasing gi. The user must allocate memory for the vector rootsfound.

IDAGetNumGEvals
Call flag = IDAGetNumGEvals(ida mem, &ngevals);
Description The function IDAGetNumGEvals returns the cumulative number of calls to the user root function g.
Arguments ida mem (void *) pointer to the idas memory block.
ngevals (long int) number of calls to the user's function g so far.
Return value The return value flag (of type int) is one of
IDA SUCCESS The optional output value has been successfully set.
IDA MEM NULL The ida mem pointer is NULL.

4.5.10.5 idals linear solver interface optional output functions

The following optional outputs are available from the idals modules: workspace requirements, number of calls to the Jacobian routine, number of calls to the residual routine for finite-difference Jacobian or Jacobian-vector product approximation, number of linear iterations, number of linear convergence failures, number of calls to the preconditioner setup and solve routines, number of calls to the Jacobian-vector setup and product routines, and last return value from an idals function. Note that, where the name of an output would otherwise conflict with the name of an optional output from the main solver, a suffix LS (for Linear Solver) has been added (e.g., lenrwLS).

IDAGetLinWorkSpace
Call flag = IDAGetLinWorkSpace(ida mem, &lenrwLS, &leniwLS);
Description The function IDAGetLinWorkSpace returns the sizes of the real and integer workspaces used by the idals linear solver interface.
Arguments ida mem (void *) pointer to the idas memory block.
lenrwLS (long int) the number of real values in the idals workspace.
leniwLS (long int) the number of integer values in the idals workspace.
Return value The return value flag (of type int) is one of
IDALS SUCCESS The optional output value has been successfully set.
IDALS MEM NULL The ida mem pointer is NULL.
IDALS LMEM NULL The idals linear solver has not been initialized.
Notes The workspace requirements reported by this routine correspond only to memory allocated within this interface and to memory allocated by the sunlinsol object attached to it. The template Jacobian matrix allocated by the user outside of idals is not included in this report.
The previous routines IDADlsGetWorkspace and IDASpilsGetWorkspace are now wrappers for this routine, and may still be used for backward-compatibility. However, these will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

IDAGetNumJacEvals
Call flag = IDAGetNumJacEvals(ida mem, &njevals);
Description The function IDAGetNumJacEvals returns the cumulative number of calls to the idals Jacobian approximation function.
Arguments ida mem (void *) pointer to the idas memory block.
njevals (long int) the cumulative number of calls to the Jacobian function (total so far).
Return value The return value flag (of type int) is one of
IDALS SUCCESS The optional output value has been successfully set.
IDALS MEM NULL The ida mem pointer is NULL.
IDALS LMEM NULL The idals linear solver has not been initialized.
Notes The previous routine IDADlsGetNumJacEvals is now a wrapper for this routine, and may still be used for backward-compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

IDAGetNumLinResEvals
Call flag = IDAGetNumLinResEvals(ida mem, &nrevalsLS);
Description The function IDAGetNumLinResEvals returns the cumulative number of calls to the user residual function due to the finite difference Jacobian approximation or finite difference Jacobian-vector product approximation.
Arguments ida mem (void *) pointer to the idas memory block.
nrevalsLS (long int) the cumulative number of calls to the user residual function.
Return value The return value flag (of type int) is one of
IDALS SUCCESS The optional output value has been successfully set.
IDALS MEM NULL The ida mem pointer is NULL.
IDALS LMEM NULL The idals linear solver has not been initialized.
Notes The value nrevalsLS is incremented only if one of the default internal difference quotient functions is used.
The previous routines IDADlsGetNumResEvals and IDASpilsGetNumResEvals are now wrappers for this routine, and may still be used for backward-compatibility. However, these will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

IDAGetNumLinIters
Call flag = IDAGetNumLinIters(ida mem, &nliters);
Description The function IDAGetNumLinIters returns the cumulative number of linear iterations.
Arguments ida mem (void *) pointer to the idas memory block.
nliters (long int) the current number of linear iterations.
Return value The return value flag (of type int) is one of
IDALS SUCCESS The optional output value has been successfully set.
IDALS MEM NULL The ida mem pointer is NULL.
IDALS LMEM NULL The idals linear solver has not been initialized.
Notes The previous routine IDASpilsGetNumLinIters is now a wrapper for this routine, and may still be used for backward-compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

IDAGetNumLinConvFails
Call flag = IDAGetNumLinConvFails(ida mem, &nlcfails);
Description The function IDAGetNumLinConvFails returns the cumulative number of linear convergence failures.
Arguments ida mem (void *) pointer to the idas memory block.
nlcfails (long int) the current number of linear convergence failures.
Return value The return value flag (of type int) is one of
IDALS SUCCESS The optional output value has been successfully set.
IDALS MEM NULL The ida mem pointer is NULL.
IDALS LMEM NULL The idals linear solver has not been initialized.
Notes The previous routine IDASpilsGetNumConvFails is now a wrapper for this routine, and may still be used for backward-compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

IDAGetNumPrecEvals
Call flag = IDAGetNumPrecEvals(ida mem, &npevals);
Description The function IDAGetNumPrecEvals returns the cumulative number of preconditioner evaluations, i.e., the number of calls made to psetup.
Arguments ida mem (void *) pointer to the idas memory block.
npevals (long int) the cumulative number of calls to psetup.
Return value The return value flag (of type int) is one of
IDALS SUCCESS The optional output value has been successfully set.
IDALS MEM NULL The ida mem pointer is NULL.
IDALS LMEM NULL The idals linear solver has not been initialized.
Notes The previous routine IDASpilsGetNumPrecEvals is now a wrapper for this routine, and may still be used for backward-compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

IDAGetNumPrecSolves
Call flag = IDAGetNumPrecSolves(ida mem, &npsolves);
Description The function IDAGetNumPrecSolves returns the cumulative number of calls made to the preconditioner solve function, psolve.
Arguments ida mem (void *) pointer to the idas memory block.
npsolves (long int) the cumulative number of calls to psolve.
Return value The return value flag (of type int) is one of
IDALS SUCCESS The optional output value has been successfully set.
IDALS MEM NULL The ida mem pointer is NULL.
IDALS LMEM NULL The idals linear solver has not been initialized.
Notes The previous routine IDASpilsGetNumPrecSolves is now a wrapper for this routine, and may still be used for backward-compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new routine name soon.
IDAGetNumJTSetupEvals
Call flag = IDAGetNumJTSetupEvals(ida mem, &njtsetup);
Description The function IDAGetNumJTSetupEvals returns the cumulative number of calls made to the Jacobian-vector setup function jtsetup.
Arguments ida mem (void *) pointer to the idas memory block.
njtsetup (long int) the current number of calls to jtsetup.
Return value The return value flag (of type int) is one of
IDALS SUCCESS The optional output value has been successfully set.
IDALS MEM NULL The ida mem pointer is NULL.
IDALS LMEM NULL The idals linear solver has not been initialized.
Notes The previous routine IDASpilsGetNumJTSetupEvals is now a wrapper for this routine, and may still be used for backward-compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

IDAGetNumJtimesEvals
Call flag = IDAGetNumJtimesEvals(ida mem, &njvevals);
Description The function IDAGetNumJtimesEvals returns the cumulative number of calls made to the Jacobian-vector function, jtimes.
Arguments ida mem (void *) pointer to the idas memory block.
njvevals (long int) the cumulative number of calls to jtimes.
Return value The return value flag (of type int) is one of
IDALS SUCCESS The optional output value has been successfully set.
IDALS MEM NULL The ida mem pointer is NULL.
IDALS LMEM NULL The idals linear solver has not been initialized.
Notes The previous routine IDASpilsGetNumJtimesEvals is now a wrapper for this routine, and may still be used for backward-compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

IDAGetLastLinFlag
Call flag = IDAGetLastLinFlag(ida mem, &lsflag);
Description The function IDAGetLastLinFlag returns the last return value from an idals routine.
Arguments ida mem (void *) pointer to the idas memory block.
lsflag (long int) the value of the last return flag from an idals function.
Return value The return value flag (of type int) is one of
IDALS SUCCESS The optional output value has been successfully set.
IDALS MEM NULL The ida mem pointer is NULL.
IDALS LMEM NULL The idals linear solver has not been initialized.
Notes If the idals setup function failed (i.e., IDASolve returned IDA LSETUP FAIL) when using the sunlinsol dense or sunlinsol band modules, then the value of lsflag is equal to the column index (numbered from one) at which a zero diagonal element was encountered during the LU factorization of the (dense or banded) Jacobian matrix. If the idals setup function failed when using another sunlinsol module, then lsflag will be SUNLS PSET FAIL UNREC, SUNLS ASET FAIL UNREC, or SUNLS PACKAGE FAIL UNREC.
If the idals solve function failed (IDASolve returned IDA LSOLVE FAIL), lsflag contains the error return flag from the sunlinsol object, which will be one of: SUNLS MEM NULL, indicating that the sunlinsol memory is NULL; SUNLS ATIMES FAIL UNREC, indicating an unrecoverable failure in the J*v function; SUNLS PSOLVE FAIL UNREC, indicating that the preconditioner solve function psolve failed unrecoverably; SUNLS GS FAIL, indicating a failure in the Gram-Schmidt procedure (generated only in spgmr or spfgmr); SUNLS QRSOL FAIL, indicating that the matrix R was found to be singular during the QR solve phase (spgmr and spfgmr only); or SUNLS PACKAGE FAIL UNREC, indicating an unrecoverable failure in an external iterative linear solver package.
The previous routines IDADlsGetLastFlag and IDASpilsGetLastFlag are now wrappers for this routine, and may still be used for backward-compatibility. However, these will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

IDAGetLinReturnFlagName
Call name = IDAGetLinReturnFlagName(lsflag);
Description The function IDAGetLinReturnFlagName returns the name of the idals constant corresponding to lsflag.
Arguments The only argument, of type long int, is a return flag from an idals function.
Return value The return value is a string containing the name of the corresponding constant. If 1 ≤ lsflag ≤ N (LU factorization failed), this function returns “NONE”.
Notes The previous routines IDADlsGetReturnFlagName and IDASpilsGetReturnFlagName are now wrappers for this routine, and may still be used for backward compatibility. However, these will be deprecated in future releases, so we recommend that users transition to the new routine name soon.
4.5.11 IDAS reinitialization function
The function IDAReInit reinitializes the main idas solver for the solution of a new problem, where a prior call to IDAInit has been made. The new problem must have the same size as the previous one. IDAReInit performs the same input checking and initializations that IDAInit does, but does no memory allocation, since it assumes that the existing internal memory is sufficient for the new problem. A call to IDAReInit deletes the solution history that was stored internally during the previous integration. Following a successful call to IDAReInit, call IDASolve again for the solution of the new problem.
The use of IDAReInit requires that the maximum method order, maxord, be no larger for the new problem than for the problem specified in the last call to IDAInit. In addition, the same nvector module set for the previous problem will be reused for the new problem. If there are changes to the linear solver specifications, make the appropriate calls to either the linear solver objects themselves, or to the idals interface routines, as described in §4.5.3. If there are changes to any optional inputs, make the appropriate IDASet*** calls, as described in §4.5.8. Otherwise, all solver inputs set previously remain in effect.
One important use of the IDAReInit function is in the treatment of jump discontinuities in the residual function.
Except in cases of fairly small jumps, it is usually more efficient to stop at each point of discontinuity and restart the integrator with a readjusted DAE model, using a call to IDAReInit. To stop when the location of the discontinuity is known, simply make that location a value of tout. To stop when the location of the discontinuity is determined by the solution, use the rootfinding feature. In either case, it is critical that the residual function not incorporate the discontinuity, but rather have a smooth extension over the discontinuity, so that the step across it (and subsequent rootfinding, if used) can be done efficiently. The residual function should instead contain a switch (communicated through user data) that is flipped between the stop and the restart, so that the restarted problem uses the new (jumped) values. Similar comments apply if there is to be a jump in the dependent variable vector.
IDAReInit
Call flag = IDAReInit(ida mem, t0, y0, yp0);
Description The function IDAReInit provides required problem specifications and reinitializes idas.
Arguments ida mem (void *) pointer to the idas memory block.
t0 (realtype) is the initial value of t.
y0 (N Vector) is the initial value of y.
yp0 (N Vector) is the initial value of ẏ.
Return value The return value flag (of type int) will be one of the following:
IDA SUCCESS The call to IDAReInit was successful.
IDA MEM NULL The idas memory block was not initialized through a previous call to IDACreate.
IDA NO MALLOC Memory space for the idas memory block was not allocated through a previous call to IDAInit.
IDA ILL INPUT An input argument to IDAReInit has an illegal value.
Notes If an error occurred, IDAReInit also sends an error message to the error handler function.
4.6 User-supplied functions
The user-supplied functions consist of one function defining the DAE residual, (optionally) a function that handles error and warning messages, (optionally) a function that provides the error weight vector, (optionally) one or two functions that provide Jacobian-related information for the linear solver, and (optionally) one or two functions that define the preconditioner for use in any of the Krylov iteration algorithms.
4.6.1 Residual function
The user must provide a function of type IDAResFn defined as follows:
IDAResFn
Definition typedef int (*IDAResFn)(realtype tt, N Vector yy, N Vector yp, N Vector rr, void *user data);
Purpose This function computes the problem residual for given values of the independent variable t, state vector y, and derivative ẏ.
Arguments tt is the current value of the independent variable.
yy is the current value of the dependent variable vector, y(t).
yp is the current value of ẏ(t).
rr is the output residual vector F (t, y, ẏ).
user data is a pointer to user data, the same as the user data parameter passed to IDASetUserData.
Return value An IDAResFn function type should return a value of 0 if successful, a positive value if a recoverable error occurred (e.g., yy has an illegal value), or a negative value if a nonrecoverable error occurred. In the last case, the integrator halts. If a recoverable error occurred, the integrator will attempt to correct and retry.
Notes A recoverable failure error return from the IDAResFn is typically used to flag a value of the dependent variable y that is “illegal” in some way (e.g., negative where only a non-negative value is physically meaningful). If such a return is made, idas will attempt to recover (possibly repeating the nonlinear solve, or reducing the step size) in order to avoid this recoverable error return. For efficiency reasons, the DAE residual function is not evaluated at the converged solution of the nonlinear solver.
Therefore, in general, a recoverable error in that converged value cannot be corrected. (It may be detected when the right-hand side function is called the first time during the following integration step, but a successful step cannot be undone.) However, if the user program also includes quadrature integration, the state variables can be checked for legality in the call to IDAQuadRhsFn, which is called at the converged solution of the nonlinear system, and therefore idas can be flagged to attempt to recover from such a situation. Also, if sensitivity analysis is performed with the staggered method, the DAE residual function is called at the converged solution of the nonlinear system, and a recoverable error at that point can be flagged, and idas will then try to correct it.
Allocation of memory for yp is handled within idas.
4.6.2 Error message handler function
As an alternative to the default behavior of directing error and warning messages to the file pointed to by errfp (see IDASetErrFile), the user may provide a function of type IDAErrHandlerFn to process any such messages. The function type IDAErrHandlerFn is defined as follows:
IDAErrHandlerFn
Definition typedef void (*IDAErrHandlerFn)(int error code, const char *module, const char *function, char *msg, void *eh data);
Purpose This function processes error and warning messages from idas and its sub-modules.
Arguments error code is the error code.
module is the name of the idas module reporting the error.
function is the name of the function in which the error occurred.
msg is the error message.
eh data is a pointer to user data, the same as the eh data parameter passed to IDASetErrHandlerFn.
Return value An IDAErrHandlerFn function has no return value.
Notes error code is negative for errors and positive (IDA WARNING) for warnings. If a function that returns a pointer to memory encounters an error, it sets error code to 0.
4.6.3 Error weight function
As an alternative to providing the relative and absolute tolerances, the user may provide a function of type IDAEwtFn to compute a vector ewt containing the multiplicative weights Wi used in the WRMS norm ‖v‖WRMS = [ (1/N) Σ_{i=1}^{N} (Wi · vi)² ]^{1/2}. These weights will be used in place of those defined by Eq. (2.7). The function type IDAEwtFn is defined as follows:
IDAEwtFn
Definition typedef int (*IDAEwtFn)(N Vector y, N Vector ewt, void *user data);
Purpose This function computes the WRMS error weights for the vector y.
Arguments y is the value of the dependent variable vector at which the weight vector is to be computed.
ewt is the output vector containing the error weights.
user data is a pointer to user data, the same as the user data parameter passed to IDASetUserData.
Return value An IDAEwtFn function type must return 0 if it successfully set the error weights and −1 otherwise.
Notes Allocation of memory for ewt is handled within idas.
! The error weight vector must have all components positive. It is the user’s responsibility to perform this test and return −1 if it is not satisfied.
4.6.4 Rootfinding function
If a rootfinding problem is to be solved during the integration of the DAE system, the user must supply a C function of type IDARootFn, defined as follows:
IDARootFn
Definition typedef int (*IDARootFn)(realtype t, N Vector y, N Vector yp, realtype *gout, void *user data);
Purpose This function computes a vector-valued function g(t, y, ẏ) such that the roots of the nrtfn components gi (t, y, ẏ) are to be found during the integration.
Arguments t is the current value of the independent variable.
y is the current value of the dependent variable vector, y(t).
yp is the current value of ẏ(t), the t-derivative of y.
gout is the output array, of length nrtfn, with components gi (t, y, ẏ).
user data is a pointer to user data, the same as the user data parameter passed to IDASetUserData.
Return value An IDARootFn should return 0 if successful or a non-zero value if an error occurred (in which case the integration is halted and IDASolve returns IDA RTFUNC FAIL).
Notes Allocation of memory for gout is handled within idas.
4.6.5 Jacobian construction (matrix-based linear solvers)
If a matrix-based linear solver module is used (i.e., a non-NULL sunmatrix object was supplied to IDASetLinearSolver), the user may provide a function of type IDALsJacFn defined as follows:
IDALsJacFn
Definition typedef int (*IDALsJacFn)(realtype tt, realtype cj, N Vector yy, N Vector yp, N Vector rr, SUNMatrix Jac, void *user data, N Vector tmp1, N Vector tmp2, N Vector tmp3);
Purpose This function computes the Jacobian matrix J of the DAE system (or an approximation to it), defined by Eq. (2.6).
Arguments tt is the current value of the independent variable t.
cj is the scalar in the system Jacobian, proportional to the inverse of the step size (α in Eq. (2.6)).
yy is the current value of the dependent variable vector, y(t).
yp is the current value of ẏ(t).
rr is the current value of the residual vector F (t, y, ẏ).
Jac is the output (approximate) Jacobian matrix (of type SUNMatrix), J = ∂F/∂y + cj ∂F/∂ ẏ.
user data is a pointer to user data, the same as the user data parameter passed to IDASetUserData.
tmp1, tmp2, tmp3 are pointers to memory allocated for variables of type N Vector which can be used by the IDALsJacFn function as temporary storage or work space.
Return value An IDALsJacFn should return 0 if successful, a positive value if a recoverable error occurred, or a negative value if a nonrecoverable error occurred. In the case of a recoverable error return, the integrator will attempt to recover by reducing the stepsize, and hence changing α in (2.6).
Notes Information regarding the structure of the specific sunmatrix structure (e.g., number of rows, upper/lower bandwidth, sparsity type) may be obtained through using the implementation-specific sunmatrix interface functions (see Chapter 8 for details). Prior to calling the user-supplied Jacobian function, the Jacobian matrix J(t, y) is zeroed out, so only nonzero elements need to be loaded into Jac. If the user’s IDALsJacFn function uses difference quotient approximations, it may need to access quantities not in the call list. These quantities may include the current stepsize, the error weights, etc. To obtain these, the user will need to add a pointer to ida mem to user data and then use the IDAGet* functions described in §4.5.10.2. The unit roundoff can be accessed as UNIT ROUNDOFF defined in sundials types.h. dense: A user-supplied dense Jacobian function must load the Neq × Neq dense matrix Jac with an approximation to the Jacobian matrix J(t, y, ẏ) at the point (tt, yy, yp). The accessor macros SM ELEMENT D and SM COLUMN D allow the user to read and write dense matrix elements without making explicit references to the underlying representation of the sunmatrix dense type. SM ELEMENT D(J, i, j) references the (i, j)-th element of the dense matrix Jac (with i, j = 0 . . . N − 1). This macro is meant for small problems for which efficiency of access is not a major concern. Thus, in terms of the indices m and n ranging from 1 to N , the Jacobian element Jm,n can be set using the statement SM ELEMENT D(J, m-1, n-1) = Jm,n . Alternatively, SM COLUMN D(J, j) returns a pointer to the first element of the j-th column of Jac (with j = 0 . . . N − 1), and the elements of the j-th column can then be accessed using ordinary array indexing. Consequently, Jm,n can be loaded using the statements col n = SM COLUMN D(J, n-1); col n[m-1] = Jm,n . For large problems, it is more efficient to use SM COLUMN D than to use SM ELEMENT D. 
Note that both of these macros number rows and columns starting from 0. The sunmatrix dense type and accessor macros are documented in §8.2. banded: A user-supplied banded Jacobian function must load the Neq × Neq banded matrix Jac with an approximation to the Jacobian matrix J(t, y, ẏ) at the point (tt, yy, yp). The accessor macros SM ELEMENT B, SM COLUMN B, and SM COLUMN ELEMENT B allow the 76 Using IDAS for IVP Solution user to read and write banded matrix elements without making specific references to the underlying representation of the sunmatrix band type. SM ELEMENT B(J, i, j) references the (i, j)-th element of the banded matrix Jac, counting from 0. This macro is meant for use in small problems for which efficiency of access is not a major concern. Thus, in terms of the indices m and n ranging from 1 to N with (m, n) within the band defined by mupper and mlower, the Jacobian element Jm,n can be loaded using the statement SM ELEMENT B(J, m-1, n-1) = Jm,n . The elements within the band are those with -mupper ≤ m-n ≤ mlower. Alternatively, SM COLUMN B(J, j) returns a pointer to the diagonal element of the j-th column of Jac, and if we assign this address to realtype *col j, then the i-th element of the j-th column is given by SM COLUMN ELEMENT B(col j, i, j), counting from 0. Thus, for (m, n) within the band, Jm,n can be loaded by setting col n = SM COLUMN B(J, n-1); and SM COLUMN ELEMENT B(col n, m-1, n-1) = Jm,n . The elements of the j-th column can also be accessed via ordinary array indexing, but this approach requires knowledge of the underlying storage for a band matrix of type sunmatrix band. The array col n can be indexed from −mupper to mlower. For large problems, it is more efficient to use SM COLUMN B and SM COLUMN ELEMENT B than to use the SM ELEMENT B macro. As in the dense case, these macros all number rows and columns starting from 0. The sunmatrix band type and accessor macros are documented in §8.3. 
sparse: A user-supplied sparse Jacobian function must load the Neq × Neq compressed-sparse-column or compressed-sparse-row matrix Jac with an approximation to the Jacobian matrix J(t, y, ẏ) at the point (tt, yy, yp). Storage for Jac already exists on entry to this function, although the user should ensure that sufficient space is allocated in Jac to hold the nonzero values to be set; if the existing space is insufficient the user may reallocate the data and index arrays as needed. The amount of allocated space in a sunmatrix sparse object may be accessed using the macro SM NNZ S or the routine SUNSparseMatrix NNZ. The sunmatrix sparse type and accessor macros are documented in §8.4.
The previous function type IDADlsJacFn is identical to IDALsJacFn, and may still be used for backward compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new function type name soon.
4.6.6 Jacobian-vector product (matrix-free linear solvers)
If a matrix-free linear solver is to be used (i.e., a NULL-valued sunmatrix was supplied to IDASetLinearSolver), the user may provide a function of type IDALsJacTimesVecFn in the following form, to compute matrix-vector products Jv. If such a function is not supplied, the default is a difference quotient approximation to these products.
IDALsJacTimesVecFn
Definition typedef int (*IDALsJacTimesVecFn)(realtype tt, N Vector yy, N Vector yp, N Vector rr, N Vector v, N Vector Jv, realtype cj, void *user data, N Vector tmp1, N Vector tmp2);
Purpose This function computes the product Jv of the DAE system Jacobian J (or an approximation to it) and a given vector v, where J is defined by Eq. (2.6).
Arguments tt is the current value of the independent variable.
yy is the current value of the dependent variable vector, y(t).
yp is the current value of ẏ(t).
rr is the current value of the residual vector F (t, y, ẏ).
v is the vector by which the Jacobian must be multiplied on the right.
Jv is the computed output vector.
cj is the scalar in the system Jacobian, proportional to the inverse of the step size (α in Eq. (2.6)).
user data is a pointer to user data, the same as the user data parameter passed to IDASetUserData.
tmp1, tmp2 are pointers to memory allocated for variables of type N Vector which can be used by IDALsJacTimesVecFn as temporary storage or work space.
Return value The value returned by the Jacobian-times-vector function should be 0 if successful. A nonzero value indicates that a nonrecoverable error occurred. This function must return a value of J ∗ v that uses the current value of J, i.e., as evaluated at the current (t, y, ẏ).
Notes If the user’s IDALsJacTimesVecFn function uses difference quotient approximations, it may need to access quantities not in the call list. These include the current stepsize, the error weights, etc. To obtain these, the user will need to add a pointer to ida mem to user data and then use the IDAGet* functions described in §4.5.10.2. The unit roundoff can be accessed as UNIT ROUNDOFF defined in sundials types.h.
The previous function type IDASpilsJacTimesVecFn is identical to IDALsJacTimesVecFn, and may still be used for backward compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new function type name soon.
4.6.7 Jacobian-vector product setup (matrix-free linear solvers)
If the user’s Jacobian-times-vector routine requires that any Jacobian-related data be preprocessed or evaluated, then this needs to be done in a user-supplied function of type IDALsJacTimesSetupFn, defined as follows:
IDALsJacTimesSetupFn
Definition typedef int (*IDALsJacTimesSetupFn)(realtype tt, N Vector yy, N Vector yp, N Vector rr, realtype cj, void *user data);
Purpose This function preprocesses and/or evaluates Jacobian data needed by the Jacobian-times-vector routine.
Arguments tt is the current value of the independent variable.
yy is the current value of the dependent variable vector, y(t).
yp is the current value of ẏ(t).
rr is the current value of the residual vector F (t, y, ẏ).
cj is the scalar in the system Jacobian, proportional to the inverse of the step size (α in Eq. (2.6)).
user data is a pointer to user data, the same as the user data parameter passed to IDASetUserData.
Return value The value returned by the Jacobian-vector setup function should be 0 if successful, positive for a recoverable error (in which case the step will be retried), or negative for an unrecoverable error (in which case the integration is halted).
Notes Each call to the Jacobian-vector setup function is preceded by a call to the IDAResFn user function with the same (tt, yy, yp) arguments. Thus, the setup function can use any auxiliary data that is computed and saved during the evaluation of the DAE residual.
If the user’s IDALsJacTimesVecFn function uses difference quotient approximations, it may need to access quantities not in the call list. These include the current stepsize, the error weights, etc. To obtain these, the user will need to add a pointer to ida mem to user data and then use the IDAGet* functions described in §4.5.10.2. The unit roundoff can be accessed as UNIT ROUNDOFF defined in sundials types.h.
The previous function type IDASpilsJacTimesSetupFn is identical to IDALsJacTimesSetupFn, and may still be used for backward compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new function type name soon.
4.6.8 Preconditioner solve (iterative linear solvers)
If a user-supplied preconditioner is to be used with a sunlinsol solver module, then the user must provide a function to solve the linear system P z = r, where P is a left preconditioner matrix which approximates (at least crudely) the Jacobian matrix J = ∂F/∂y + cj ∂F/∂ ẏ.
This function must be of type IDALsPrecSolveFn, defined as follows:
IDALsPrecSolveFn
Definition typedef int (*IDALsPrecSolveFn)(realtype tt, N Vector yy, N Vector yp, N Vector rr, N Vector rvec, N Vector zvec, realtype cj, realtype delta, void *user data);
Purpose This function solves the preconditioning system P z = r.
Arguments tt is the current value of the independent variable.
yy is the current value of the dependent variable vector, y(t).
yp is the current value of ẏ(t).
rr is the current value of the residual vector F (t, y, ẏ).
rvec is the right-hand side vector r of the linear system to be solved.
zvec is the computed output vector.
cj is the scalar in the system Jacobian, proportional to the inverse of the step size (α in Eq. (2.6)).
delta is an input tolerance to be used if an iterative method is employed in the solution. In that case, the residual vector Res = r − P z of the system should be made less than delta in weighted l2 norm, i.e., [ Σ_i (Res_i · ewt_i)² ]^{1/2} < delta. To obtain the N Vector ewt, call IDAGetErrWeights (see §4.5.10.2).
user data is a pointer to user data, the same as the user data parameter passed to the function IDASetUserData.
Return value The value to be returned by the preconditioner solve function is a flag indicating whether it was successful. This value should be 0 if successful, positive for a recoverable error (in which case the step will be retried), or negative for an unrecoverable error (in which case the integration is halted).
Notes The previous function type IDASpilsPrecSolveFn is identical to IDALsPrecSolveFn, and may still be used for backward compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new function type name soon.
4.6.9 Preconditioner setup (iterative linear solvers)
If the user’s preconditioner requires that any Jacobian-related data be evaluated or preprocessed, then this needs to be done in a user-supplied function of type IDALsPrecSetupFn, defined as follows:
IDALsPrecSetupFn
Definition typedef int (*IDALsPrecSetupFn)(realtype tt, N Vector yy, N Vector yp, N Vector rr, realtype cj, void *user data);
Purpose This function evaluates and/or preprocesses Jacobian-related data needed by the preconditioner.
Arguments tt is the current value of the independent variable.
yy is the current value of the dependent variable vector, y(t).
yp is the current value of ẏ(t).
rr is the current value of the residual vector F (t, y, ẏ).
cj is the scalar in the system Jacobian, proportional to the inverse of the step size (α in Eq. (2.6)).
user data is a pointer to user data, the same as the user data parameter passed to the function IDASetUserData.
Return value The value returned by the preconditioner setup function is a flag indicating whether it was successful. This value should be 0 if successful, positive for a recoverable error (in which case the step will be retried), or negative for an unrecoverable error (in which case the integration is halted).
Notes The operations performed by this function might include forming a crude approximate Jacobian, and performing an LU factorization of the resulting approximation.
Each call to the preconditioner setup function is preceded by a call to the IDAResFn user function with the same (tt, yy, yp) arguments. Thus the preconditioner setup function can use any auxiliary data that is computed and saved during the evaluation of the DAE residual.
This function is not called in advance of every call to the preconditioner solve function, but rather is called only as often as needed to achieve convergence in the nonlinear solver.
If the user’s IDALsPrecSetupFn function uses difference quotient approximations, it may need to access quantities not in the call list. These include the current stepsize, the error weights, etc. To obtain these, the user will need to add a pointer to ida mem to user data and then use the IDAGet* functions described in §4.5.10.2. The unit roundoff can be accessed as UNIT ROUNDOFF defined in sundials types.h.
The previous function type IDASpilsPrecSetupFn is identical to IDALsPrecSetupFn, and may still be used for backward compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new function type name soon.
4.7 Integration of pure quadrature equations
idas allows the DAE system to include pure quadratures. In this case, it is more efficient to treat the quadratures separately by excluding them from the nonlinear solution stage. To do this, begin by excluding the quadrature variables from the vectors yy and yp and the quadrature equations from within res. Thus a separate vector yQ of quadrature variables is to satisfy (d/dt)yQ = fQ (t, y, ẏ).
The following is an overview of the sequence of calls in a user’s main program in this situation. Steps that are unchanged from the skeleton program presented in §4.4 are grayed out.
1. Initialize parallel or multi-threaded environment, if appropriate
2. Set problem dimensions, etc.
This generally includes N, the problem size N (excluding quadrature variables), Nq, the number of quadrature variables, and may include the local vector length Nlocal (excluding quadrature variables), and the local number of quadrature variables Nqlocal.
3. Set vectors of initial values
4. Create idas object
5. Initialize idas solver
6. Specify integration tolerances
7. Create matrix object
8. Create linear solver object
9. Set linear solver optional inputs
10. Attach linear solver module
11. Set optional inputs
12. Create nonlinear solver object
13. Attach nonlinear solver module
14. Set nonlinear solver optional inputs
15. Correct initial values
16. Set vector of initial values for quadrature variables
Typically, the quadrature variables should be initialized to 0.
17. Initialize quadrature integration
Call IDAQuadInit to specify the quadrature equation right-hand side function and to allocate internal memory related to quadrature integration. See §4.7.1 for details.
18. Set optional inputs for quadrature integration
Call IDASetQuadErrCon to indicate whether or not quadrature variables should be used in the step size control mechanism. If so, one of the IDAQuad*tolerances functions must be called to specify the integration tolerances for quadrature variables. See §4.7.4 for details.
19. Advance solution in time
20. Extract quadrature variables
Call IDAGetQuad or IDAGetQuadDky to obtain the values of the quadrature variables or their derivatives at the current time. See §4.7.3 for details.
21. Get optional outputs
22. Get quadrature optional outputs
Call IDAGetQuad* functions to obtain optional output related to the integration of quadratures. See §4.7.5 for details.
23. Deallocate memory for solution vectors and for the vector of quadrature variables
24. Free solver memory
25. Free nonlinear solver memory
26. Free linear solver and matrix memory
27. Finalize MPI, if used
IDAQuadInit can be called, and quadrature-related optional inputs (step 18 above) can be set, anywhere between steps 4 and 19.
4.7.1 Quadrature initialization and deallocation functions
The function IDAQuadInit activates integration of quadrature equations and allocates internal memory related to these calculations. The form of the call to this function is as follows:
IDAQuadInit
Call flag = IDAQuadInit(ida mem, rhsQ, yQ0);
Description The function IDAQuadInit provides required problem specifications, allocates internal memory, and initializes quadrature integration.
Arguments ida mem (void *) pointer to the idas memory block returned by IDACreate.
rhsQ (IDAQuadRhsFn) is the C function which computes fQ , the right-hand side of the quadrature equations. This function has the form fQ(t, yy, yp, rhsQ, user data) (for full details see §4.7.6).
yQ0 (N Vector) is the initial value of yQ .
Return value The return value flag (of type int) will be one of the following:
IDA SUCCESS The call to IDAQuadInit was successful.
IDA MEM NULL The idas memory was not initialized by a prior call to IDACreate.
IDA MEM FAIL A memory allocation request failed.
Notes If an error occurred, IDAQuadInit also sends an error message to the error handler function.
In terms of the number of quadrature variables Nq and maximum method order maxord, the size of the real workspace is increased as follows:
• Base value: lenrw = lenrw + (maxord+5)Nq
• If IDAQuadSVtolerances is called: lenrw = lenrw + Nq
and the size of the integer workspace is increased as follows:
• Base value: leniw = leniw + (maxord+5)Nq
• If IDAQuadSVtolerances is called: leniw = leniw + Nq
The function IDAQuadReInit, useful during the solution of a sequence of problems of the same size, reinitializes the quadrature-related internal memory and must follow a call to IDAQuadInit (and possibly a call to IDAReInit). The number Nq of quadratures is assumed to be unchanged from the prior call to IDAQuadInit. The call to the IDAQuadReInit function has the following form:
IDAQuadReInit
Call flag = IDAQuadReInit(ida mem, yQ0);
Description The function IDAQuadReInit provides required problem specifications and reinitializes the quadrature integration.
Arguments ida mem (void *) pointer to the idas memory block.
yQ0 (N Vector) is the initial value of yQ .
Return value The return value flag (of type int) will be one of the following:
IDA SUCCESS The call to IDAQuadReInit was successful.
IDA MEM NULL The idas memory was not initialized by a prior call to IDACreate.
IDA NO QUAD Memory space for the quadrature integration was not allocated by a prior call to IDAQuadInit.
Notes If an error occurred, IDAQuadReInit also sends an error message to the error handler function.
IDAQuadFree
Call IDAQuadFree(ida mem);
Description The function IDAQuadFree frees the memory allocated for quadrature integration.
Arguments The argument is the pointer to the idas memory block (of type void *).
Return value The function IDAQuadFree has no return value.
Notes In general, IDAQuadFree need not be called by the user, as it is invoked automatically by IDAFree.
4.7.2 IDAS solver function
Even if quadrature integration was enabled, the call to the main solver function IDASolve is exactly the same as in §4.5.7. However, in this case the return value flag can also be one of the following:
IDA QRHS FAIL The quadrature right-hand side function failed in an unrecoverable manner.
IDA FIRST QRHS ERR The quadrature right-hand side function failed at the first call.
IDA REP QRHS ERR Convergence test failures occurred too many times due to repeated recoverable errors in the quadrature right-hand side function. This value will also be returned if the quadrature right-hand side function had repeated recoverable errors during the estimation of an initial step size (assuming the quadrature variables are included in the error tests).
4.7.3 Quadrature extraction functions
If quadrature integration has been initialized by a call to IDAQuadInit, or reinitialized by a call to IDAQuadReInit, then idas computes both a solution and quadratures at time t. However, IDASolve will still return only the solution y in y. Solution quadratures can be obtained using the following function:
IDAGetQuad
Call flag = IDAGetQuad(ida mem, &tret, yQ);
Description The function IDAGetQuad returns the quadrature solution vector after a successful return from IDASolve.
Arguments ida mem (void *) pointer to the memory previously allocated by IDAInit.
tret (realtype) the time reached by the solver (output).
yQ (N_Vector) the computed quadrature vector.
Return value The return value flag of IDAGetQuad is one of:
IDA_SUCCESS IDAGetQuad was successful.
IDA_MEM_NULL ida_mem was NULL.
IDA_NO_QUAD Quadrature integration was not initialized.
IDA_BAD_DKY yQ is NULL.
4.7 Integration of pure quadrature equations 83
The function IDAGetQuadDky computes the k-th derivatives of the interpolating polynomials for the quadrature variables at time t. This function is called by IDAGetQuad with k = 0 and with the current time at which IDASolve has returned, but may also be called directly by the user.
IDAGetQuadDky
Call flag = IDAGetQuadDky(ida_mem, t, k, dkyQ);
Description The function IDAGetQuadDky returns derivatives of the quadrature solution vector after a successful return from IDASolve.
Arguments ida_mem (void *) pointer to the memory previously allocated by IDAInit.
t (realtype) the time at which quadrature information is requested. The time t must fall within the interval defined by the last successful step taken by idas.
k (int) order of the requested derivative. This must be in the range 0, 1, ..., klast.
dkyQ (N_Vector) the vector containing the derivative. This vector must be allocated by the user.
Return value The return value flag of IDAGetQuadDky is one of:
IDA_SUCCESS IDAGetQuadDky succeeded.
IDA_MEM_NULL The pointer to ida_mem was NULL.
IDA_NO_QUAD Quadrature integration was not initialized.
IDA_BAD_DKY The vector dkyQ is NULL.
IDA_BAD_K k is not in the range 0, 1, ..., klast.
IDA_BAD_T The time t is not in the allowed range.
4.7.4 Optional inputs for quadrature integration
idas provides the following optional input functions to control the integration of quadrature equations.
IDASetQuadErrCon
Call flag = IDASetQuadErrCon(ida_mem, errconQ);
Description The function IDASetQuadErrCon specifies whether or not the quadrature variables are to be used in the step size control mechanism within idas.
If they are, the user must call either IDAQuadSStolerances or IDAQuadSVtolerances to specify the integration tolerances for the quadrature variables.
Arguments ida_mem (void *) pointer to the idas memory block.
errconQ (booleantype) specifies whether quadrature variables are included (SUNTRUE) or not (SUNFALSE) in the error control mechanism.
Return value The return value flag (of type int) is one of:
IDA_SUCCESS The optional value has been successfully set.
IDA_MEM_NULL The ida_mem pointer is NULL.
IDA_NO_QUAD Quadrature integration has not been initialized.
Notes By default, errconQ is set to SUNFALSE. It is illegal to call IDASetQuadErrCon before a call to IDAQuadInit.
If the quadrature variables are part of the step size control mechanism, one of the following functions must be called to specify the integration tolerances for quadrature variables.
IDAQuadSStolerances
Call flag = IDAQuadSStolerances(ida_mem, reltolQ, abstolQ);
Description The function IDAQuadSStolerances specifies scalar relative and absolute tolerances.
Arguments ida_mem (void *) pointer to the idas memory block.
reltolQ (realtype) is the scalar relative error tolerance.
abstolQ (realtype) is the scalar absolute error tolerance.
Return value The return value flag (of type int) is one of:
IDA_SUCCESS The optional value has been successfully set.
IDA_NO_QUAD Quadrature integration was not initialized.
IDA_MEM_NULL The ida_mem pointer is NULL.
IDA_ILL_INPUT One of the input tolerances was negative.
IDAQuadSVtolerances
Call flag = IDAQuadSVtolerances(ida_mem, reltolQ, abstolQ);
Description The function IDAQuadSVtolerances specifies scalar relative and vector absolute tolerances.
Arguments ida_mem (void *) pointer to the idas memory block.
reltolQ (realtype) is the scalar relative error tolerance.
abstolQ (N_Vector) is the vector absolute error tolerance.
Return value The return value flag (of type int) is one of:
IDA_SUCCESS The optional value has been successfully set.
IDA_NO_QUAD Quadrature integration was not initialized.
IDA_MEM_NULL The ida_mem pointer is NULL.
IDA_ILL_INPUT One of the input tolerances was negative.
4.7.5 Optional outputs for quadrature integration
idas provides the following functions that can be used to obtain solver performance information related to quadrature integration.
IDAGetQuadNumRhsEvals
Call flag = IDAGetQuadNumRhsEvals(ida_mem, &nrhsQevals);
Description The function IDAGetQuadNumRhsEvals returns the number of calls made to the user's quadrature right-hand side function.
Arguments ida_mem (void *) pointer to the idas memory block.
nrhsQevals (long int) number of calls made to the user's rhsQ function.
Return value The return value flag (of type int) is one of:
IDA_SUCCESS The optional output value has been successfully set.
IDA_MEM_NULL The ida_mem pointer is NULL.
IDA_NO_QUAD Quadrature integration has not been initialized.
IDAGetQuadNumErrTestFails
Call flag = IDAGetQuadNumErrTestFails(ida_mem, &nQetfails);
Description The function IDAGetQuadNumErrTestFails returns the number of local error test failures due to quadrature variables.
Arguments ida_mem (void *) pointer to the idas memory block.
nQetfails (long int) number of error test failures due to quadrature variables.
Return value The return value flag (of type int) is one of:
IDA_SUCCESS The optional output value has been successfully set.
IDA_MEM_NULL The ida_mem pointer is NULL.
IDA_NO_QUAD Quadrature integration has not been initialized.
IDAGetQuadErrWeights
Call flag = IDAGetQuadErrWeights(ida_mem, eQweight);
Description The function IDAGetQuadErrWeights returns the quadrature error weights at the current time.
Arguments ida_mem (void *) pointer to the idas memory block.
eQweight (N_Vector) quadrature error weights at the current time.
Return value The return value flag (of type int) is one of:
IDA_SUCCESS The optional output value has been successfully set.
IDA_MEM_NULL The ida_mem pointer is NULL.
IDA_NO_QUAD Quadrature integration has not been initialized.
Notes The user must allocate memory for eQweight. If quadratures were not included in the error control mechanism (i.e., IDASetQuadErrCon was not called with errconQ = SUNTRUE), IDAGetQuadErrWeights does not set the eQweight vector.
IDAGetQuadStats
Call flag = IDAGetQuadStats(ida_mem, &nrhsQevals, &nQetfails);
Description The function IDAGetQuadStats returns the idas integrator statistics as a group.
Arguments ida_mem (void *) pointer to the idas memory block.
nrhsQevals (long int) number of calls to the user's rhsQ function.
nQetfails (long int) number of error test failures due to quadrature variables.
Return value The return value flag (of type int) is one of:
IDA_SUCCESS The optional output values have been successfully set.
IDA_MEM_NULL The ida_mem pointer is NULL.
IDA_NO_QUAD Quadrature integration has not been initialized.
4.7.6 User-supplied function for quadrature integration
For integration of quadrature equations, the user must provide a function that defines the right-hand side of the quadrature equations (in other words, the integrand function of the integral that must be evaluated). This function must be of type IDAQuadRhsFn defined as follows:
IDAQuadRhsFn
Definition typedef int (*IDAQuadRhsFn)(realtype t, N_Vector yy, N_Vector yp, N_Vector rhsQ, void *user_data);
Purpose This function computes the quadrature equation right-hand side for a given value of the independent variable t and state vectors y and ẏ.
Arguments t is the current value of the independent variable.
yy is the current value of the dependent variable vector, y(t).
yp is the current value of the dependent variable derivative vector, ẏ(t).
rhsQ is the output vector fQ(t, y, ẏ).
user_data is the
user_data pointer passed to IDASetUserData.
Return value An IDAQuadRhsFn should return 0 if successful, a positive value if a recoverable error occurred (in which case idas will attempt to correct), or a negative value if it failed unrecoverably (in which case the integration is halted and IDA_QRHS_FAIL is returned).
Notes Allocation of memory for rhsQ is automatically handled within idas.
Both y and rhsQ are of type N_Vector, but they typically have different internal representations. It is the user's responsibility to access the vector data consistently (including the use of the correct accessor macros from each nvector implementation). For the sake of computational efficiency, the vector functions in the two nvector implementations provided with idas do not perform any consistency checks with respect to their N_Vector arguments (see §7.2 and §7.3).
There is one situation in which recovery is not possible even if the IDAQuadRhsFn returns a recoverable error flag: when the recoverable error occurs at the very first call to the IDAQuadRhsFn (in which case idas returns IDA_FIRST_QRHS_ERR).
4.8 A parallel band-block-diagonal preconditioner module
A principal reason for using a parallel DAE solver such as idas lies in the solution of partial differential equations (PDEs). Moreover, the use of a Krylov iterative method for the solution of many such problems is motivated by the nature of the underlying linear system of equations (2.5) that must be solved at each time step. The linear algebraic system is large, sparse, and structured. However, if a Krylov iterative method is to be effective in this setting, then a nontrivial preconditioner needs to be used. Otherwise, the rate of convergence of the Krylov iterative method is usually unacceptably slow. Unfortunately, an effective preconditioner tends to be problem-specific. However, we have developed one type of preconditioner that treats a rather broad class of PDE-based problems.
It has been successfully used for several realistic, large-scale problems [32] and is included in a software module within the idas package. This module works with the parallel vector module nvector_parallel and generates a preconditioner that is a block-diagonal matrix with each block being a band matrix. The blocks need not have the same number of super- and sub-diagonals, and these numbers may vary from block to block. This Band-Block-Diagonal Preconditioner module is called idabbdpre.
One way to envision these preconditioners is to think of the domain of the computational PDE problem as being subdivided into M non-overlapping sub-domains. Each of these sub-domains is then assigned to one of the M processors to be used to solve the DAE system. The basic idea is to isolate the preconditioning so that it is local to each processor, and also to use a (possibly cheaper) approximate residual function. This requires the definition of a new function G(t, y, ẏ) which approximates the function F(t, y, ẏ) in the definition of the DAE system (2.1). However, the user may set G = F. Corresponding to the domain decomposition, there is a decomposition of the solution vectors y and ẏ into M disjoint blocks ym and ẏm, and a decomposition of G into blocks Gm. The block Gm depends on ym and ẏm, and also on components of the blocks ym′ and ẏm′ associated with neighboring sub-domains (so-called ghost-cell data). Let ȳm and ẏ̄m denote ym and ẏm (respectively) augmented with those other components on which Gm depends. Then we have
G(t, y, ẏ) = [G1(t, ȳ1, ẏ̄1), G2(t, ȳ2, ẏ̄2), . . . , GM(t, ȳM, ẏ̄M)]T ,   (4.1)
and each of the blocks Gm(t, ȳm, ẏ̄m) is uncoupled from the others. The preconditioner associated with this decomposition has the form P = diag[P1 , P2 , . . .
, PM ]   (4.2)
where
Pm ≈ ∂Gm/∂ym + α ∂Gm/∂ẏm .   (4.3)
This matrix is taken to be banded, with upper and lower half-bandwidths mudq and mldq defined as the number of non-zero diagonals above and below the main diagonal, respectively. The difference quotient approximation is computed using mudq + mldq + 2 evaluations of Gm, but only a matrix of bandwidth mukeep + mlkeep + 1 is retained. Neither pair of parameters need be the true half-bandwidths of the Jacobian of the local block of G, if smaller values provide a more efficient preconditioner. Such an efficiency gain may occur if the couplings in the DAE system outside a certain bandwidth are considerably weaker than those within the band. Reducing mukeep and mlkeep while keeping mudq and mldq at their true values discards the elements outside the narrower band. Reducing both pairs has the additional effect of lumping the outer Jacobian elements into the computed elements within the band, and requires more caution and experimentation.
The solution of the complete linear system
P x = b   (4.4)
reduces to solving each of the equations
Pm xm = bm ,   (4.5)
and this is done by banded LU factorization of Pm followed by a banded backsolve.
Similar block-diagonal preconditioners could be considered with different treatment of the blocks Pm. For example, incomplete LU factorization or an iterative method could be used instead of banded LU factorization.
The idabbdpre module calls two user-provided functions to construct P: a required function Gres (of type IDABBDLocalFn) which approximates the residual function G(t, y, ẏ) ≈ F(t, y, ẏ) and which is computed locally, and an optional function Gcomm (of type IDABBDCommFn) which performs all inter-process communication necessary to evaluate the approximate residual G. These are in addition to the user-supplied residual function res. Both functions take as input the same pointer user_data as passed by the user to IDASetUserData and passed to the user's function res.
The user is responsible for providing space (presumably within user_data) for components of yy and yp that are communicated by Gcomm from the other processors, and that are then used by Gres, which should not do any communication.
IDABBDLocalFn
Definition typedef int (*IDABBDLocalFn)(sunindextype Nlocal, realtype tt, N_Vector yy, N_Vector yp, N_Vector gval, void *user_data);
Purpose This Gres function computes G(t, y, ẏ). It loads the vector gval as a function of tt, yy, and yp.
Arguments Nlocal is the local vector length.
tt is the value of the independent variable.
yy is the dependent variable.
yp is the derivative of the dependent variable.
gval is the output vector.
user_data is a pointer to user data, the same as the user_data parameter passed to IDASetUserData.
Return value An IDABBDLocalFn function type should return 0 to indicate success, 1 for a recoverable error, or -1 for a non-recoverable error.
Notes This function must assume that all inter-processor communication of data needed to calculate gval has already been done, and this data is accessible within user_data.
The case where G is mathematically identical to F is allowed.
IDABBDCommFn
Definition typedef int (*IDABBDCommFn)(sunindextype Nlocal, realtype tt, N_Vector yy, N_Vector yp, void *user_data);
Purpose This Gcomm function performs all inter-processor communications necessary for the execution of the Gres function above, using the input vectors yy and yp.
Arguments Nlocal is the local vector length.
tt is the value of the independent variable.
yy is the dependent variable.
yp is the derivative of the dependent variable.
user_data is a pointer to user data, the same as the user_data parameter passed to IDASetUserData.
Return value An IDABBDCommFn function type should return 0 to indicate success, 1 for a recoverable error, or -1 for a non-recoverable error.
Notes The Gcomm function is expected to save communicated data in space defined within the structure user_data.
Each call to the Gcomm function is preceded by a call to the residual function res with the same (tt, yy, yp) arguments. Thus Gcomm can omit any communications done by res if relevant to the evaluation of Gres. If all necessary communication was done in res, then Gcomm = NULL can be passed in the call to IDABBDPrecInit (see below).
Besides the header files required for the integration of the DAE problem (see §4.3), to use the idabbdpre module, the main program must include the header file idas_bbdpre.h which declares the needed function prototypes.
The following is a summary of the usage of this module and describes the sequence of calls in the user main program. Steps that are unchanged from the user main program presented in §4.4 are grayed out.
1. Initialize MPI
2. Set problem dimensions etc.
3. Set vectors of initial values
4. Create idas object
5. Initialize idas solver
6. Specify integration tolerances
7. Create linear solver object
When creating the iterative linear solver object, specify the use of left preconditioning (PREC_LEFT) as idas only supports left preconditioning.
8. Set linear solver optional inputs
9. Attach linear solver module
10. Set optional inputs
Note that the user should not overwrite the preconditioner setup function or solve function through calls to the IDASetPreconditioner optional input function.
11. Initialize the idabbdpre preconditioner module
Specify the upper and lower bandwidths mudq, mldq and mukeep, mlkeep and call
flag = IDABBDPrecInit(ida_mem, Nlocal, mudq, mldq, mukeep, mlkeep, dq_rel_yy, Gres, Gcomm);
to allocate memory and initialize the internal preconditioner data. The last two arguments of IDABBDPrecInit are the two user-supplied functions described above.
12. Create nonlinear solver object
13. Attach nonlinear solver module
14.
Set nonlinear solver optional inputs
15. Correct initial values
16. Specify rootfinding problem
17. Advance solution in time
18. Get optional outputs
Additional optional outputs associated with idabbdpre are available by way of two routines described below, IDABBDPrecGetWorkSpace and IDABBDPrecGetNumGfnEvals.
19. Deallocate memory for solution vectors
20. Free solver memory
21. Free nonlinear solver memory
22. Free linear solver memory
23. Finalize MPI
The user-callable functions that initialize (step 11 above) or re-initialize the idabbdpre preconditioner module are described next.
IDABBDPrecInit
Call flag = IDABBDPrecInit(ida_mem, Nlocal, mudq, mldq, mukeep, mlkeep, dq_rel_yy, Gres, Gcomm);
Description The function IDABBDPrecInit initializes and allocates (internal) memory for the idabbdpre preconditioner.
Arguments ida_mem (void *) pointer to the idas memory block.
Nlocal (sunindextype) local vector dimension.
mudq (sunindextype) upper half-bandwidth to be used in the difference-quotient Jacobian approximation.
mldq (sunindextype) lower half-bandwidth to be used in the difference-quotient Jacobian approximation.
mukeep (sunindextype) upper half-bandwidth of the retained banded approximate Jacobian block.
mlkeep (sunindextype) lower half-bandwidth of the retained banded approximate Jacobian block.
dq_rel_yy (realtype) the relative increment in components of y used in the difference quotient approximations. The default is dq_rel_yy = √(unit roundoff), which can be specified by passing dq_rel_yy = 0.0.
Gres (IDABBDLocalFn) the C function which computes the local residual approximation G(t, y, ẏ).
Gcomm (IDABBDCommFn) the optional C function which performs all inter-process communication required for the computation of G(t, y, ẏ).
Return value The return value flag (of type int) is one of:
IDALS_SUCCESS The call to IDABBDPrecInit was successful.
IDALS_MEM_NULL The ida_mem pointer was NULL.
IDALS_MEM_FAIL A memory allocation request has failed.
IDALS_LMEM_NULL An idals linear solver memory was not attached.
IDALS_ILL_INPUT The supplied vector implementation was not compatible with the block band preconditioner.
Notes If one of the half-bandwidths mudq or mldq to be used in the difference-quotient calculation of the approximate Jacobian is negative or exceeds the value Nlocal−1, it is replaced by 0 or Nlocal−1 accordingly.
The half-bandwidths mudq and mldq need not be the true half-bandwidths of the Jacobian of the local block of G, if smaller values provide greater efficiency. Also, the half-bandwidths mukeep and mlkeep of the retained banded approximate Jacobian block may be even smaller, to reduce storage and computation costs further.
For all four half-bandwidths, the values need not be the same on every processor.
The idabbdpre module also provides a reinitialization function to allow for a sequence of problems of the same size, with the same linear solver choice, provided there is no change in Nlocal, mukeep, or mlkeep. After solving one problem, and after calling IDAReInit to re-initialize idas for a subsequent problem, a call to IDABBDPrecReInit can be made to change any of the following: the half-bandwidths mudq and mldq used in the difference-quotient Jacobian approximations, the relative increment dq_rel_yy, or one of the user-supplied functions Gres and Gcomm. If there is a change in any of the linear solver inputs, an additional call to the "Set" routines provided by the sunlinsol module, and/or one or more of the corresponding IDASet*** functions, must also be made (in the proper order).
IDABBDPrecReInit
Call flag = IDABBDPrecReInit(ida_mem, mudq, mldq, dq_rel_yy);
Description The function IDABBDPrecReInit reinitializes the idabbdpre preconditioner.
Arguments ida_mem (void *) pointer to the idas memory block.
mudq (sunindextype) upper half-bandwidth to be used in the difference-quotient Jacobian approximation.
mldq (sunindextype) lower half-bandwidth to be used in the difference-quotient Jacobian approximation.
dq_rel_yy (realtype) the relative increment in components of y used in the difference quotient approximations. The default is dq_rel_yy = √(unit roundoff), which can be specified by passing dq_rel_yy = 0.0.
Return value The return value flag (of type int) is one of:
IDALS_SUCCESS The call to IDABBDPrecReInit was successful.
IDALS_MEM_NULL The ida_mem pointer was NULL.
IDALS_LMEM_NULL An idals linear solver memory was not attached.
IDALS_PMEM_NULL The function IDABBDPrecInit was not previously called.
Notes If one of the half-bandwidths mudq or mldq is negative or exceeds the value Nlocal−1, it is replaced by 0 or Nlocal−1, accordingly.
The following two optional output functions are available for use with the idabbdpre module:
IDABBDPrecGetWorkSpace
Call flag = IDABBDPrecGetWorkSpace(ida_mem, &lenrwBBDP, &leniwBBDP);
Description The function IDABBDPrecGetWorkSpace returns the local sizes of the idabbdpre real and integer workspaces.
Arguments ida_mem (void *) pointer to the idas memory block.
lenrwBBDP (long int) local number of real values in the idabbdpre workspace.
leniwBBDP (long int) local number of integer values in the idabbdpre workspace.
Return value The return value flag (of type int) is one of:
IDALS_SUCCESS The optional output value has been successfully set.
IDALS_MEM_NULL The ida_mem pointer was NULL.
IDALS_PMEM_NULL The idabbdpre preconditioner has not been initialized.
Notes The workspace requirements reported by this routine correspond only to memory allocated within the idabbdpre module (the banded matrix approximation, banded sunlinsol object, temporary vectors). These values are local to each process.
The workspaces referred to here exist in addition to those given by the corresponding function IDAGetLinWorkSpace.
IDABBDPrecGetNumGfnEvals
Call flag = IDABBDPrecGetNumGfnEvals(ida_mem, &ngevalsBBDP);
Description The function IDABBDPrecGetNumGfnEvals returns the cumulative number of calls to the user Gres function due to the finite difference approximation of the Jacobian blocks used within idabbdpre's preconditioner setup function.
Arguments ida_mem (void *) pointer to the idas memory block.
ngevalsBBDP (long int) the cumulative number of calls to the user Gres function.
Return value The return value flag (of type int) is one of:
IDALS_SUCCESS The optional output value has been successfully set.
IDALS_MEM_NULL The ida_mem pointer was NULL.
IDALS_PMEM_NULL The idabbdpre preconditioner has not been initialized.
In addition to the ngevalsBBDP Gres evaluations, the costs associated with idabbdpre also include nlinsetups LU factorizations, nlinsetups calls to Gcomm, npsolves banded backsolve calls, and nrevalsLS residual function evaluations, where nlinsetups is an optional idas output (see §4.5.10.2), and npsolves and nrevalsLS are linear solver optional outputs (see §4.5.10.5).
Chapter 5
Using IDAS for Forward Sensitivity Analysis
This chapter describes the use of idas to compute solution sensitivities using forward sensitivity analysis. One of our main guiding principles was to design the idas user interface for forward sensitivity analysis as an extension of that for IVP integration. Assuming a user main program and user-defined support routines for IVP integration have already been defined, in order to perform forward sensitivity analysis the user only has to insert a few more calls into the main program and (optionally) define an additional routine which computes the residuals for the sensitivity systems (2.12). The only departure from this philosophy is due to the IDAResFn type definition (§4.6.1).
Without changing the definition of this type, the only way to pass values of the problem parameters to the DAE residual function is to require the user data structure user_data to contain a pointer to the array of real parameters p.
idas uses various constants for both input and output. These are defined as needed in this chapter, but for convenience are also listed separately in Appendix B.
We begin with a brief overview, in the form of a skeleton user program. Following that are detailed descriptions of the interface to the various user-callable routines and of the user-supplied routines that were not already described in Chapter 4.
5.1 A skeleton of the user's main program
The following is a skeleton of the user's main program (or calling program) as an application of idas. The user program is to have these steps in the order indicated, unless otherwise noted. For the sake of brevity, we defer many of the details to the later sections. As in §4.4, most steps are independent of the nvector, sunmatrix, sunlinsol, and sunnonlinsol implementations used. For the steps that are not, refer to Chapters 7, 8, 9, and 10 for the specific name of the function to be called or macro to be referenced.
Differences between the user main program in §4.4 and the one below start only at step (16). Steps that are unchanged from the skeleton program presented in §4.4 are grayed out. First, note that no additional header files need be included for forward sensitivity analysis beyond those for IVP solution (§4.4).
1. Initialize parallel or multi-threaded environment, if appropriate
2. Set problem dimensions etc.
3. Set vectors of initial values
4. Create idas object
5. Initialize idas solver
6. Specify integration tolerances
7. Create matrix object
8. Create linear solver object
9. Set linear solver optional inputs
10. Attach linear solver module
11. Set optional inputs
12. Create nonlinear solver object
13.
Attach nonlinear solver module
14. Set nonlinear solver optional inputs
15. Initialize quadrature problem, if not sensitivity-dependent
16. Define the sensitivity problem
• Number of sensitivities (required) Set Ns, the number of parameters with respect to which sensitivities are to be computed.
• Problem parameters (optional) If idas is to evaluate the residuals of the sensitivity systems, set p, an array of Np real parameters upon which the IVP depends. Only parameters with respect to which sensitivities are (potentially) desired need to be included. Attach p to the user data structure user_data. For example, user_data->p = p; If the user provides a function to evaluate the sensitivity residuals, p need not be specified.
• Parameter list (optional) If idas is to evaluate the sensitivity residuals, set plist, an array of Ns integers to specify the parameters p with respect to which solution sensitivities are to be computed. If sensitivities with respect to the j-th parameter p[j] (0 ≤ j < Np) are desired, set plist[i] = j, for some i = 0, . . . , Ns−1. If plist is not specified, idas will compute sensitivities with respect to the first Ns parameters; i.e., plist[i] = i (i = 0, . . . , Ns−1). If the user provides a function to evaluate the sensitivity residuals, plist need not be specified.
• Parameter scaling factors (optional) If idas is to estimate tolerances for the sensitivity solution vectors (based on tolerances for the state solution vector) or if idas is to evaluate the residuals of the sensitivity systems using the internal difference-quotient function, the results will be more accurate if order of magnitude information is provided. Set pbar, an array of Ns positive scaling factors. Typically, if p[plist[i]] ≠ 0, the value pbar[i] = |p[plist[i]]| can be used. If pbar is not specified, idas will use pbar[i] = 1.0. If the user provides a function to evaluate the sensitivity residual and specifies tolerances for the sensitivity variables, pbar need not be specified.
Note that the names for p, pbar, plist, as well as the field p of user_data are arbitrary, but they must agree with the arguments passed to IDASetSensParams below.
17. Set sensitivity initial conditions
Set the Ns vectors yS0[i] and ypS0[i] of initial values for sensitivities (for i = 0, . . . , Ns−1), using the appropriate functions defined by the particular nvector implementation chosen. First, create an array of Ns vectors by making the appropriate call
yS0 = N_VCloneVectorArray_***(Ns, y0);
or
yS0 = N_VCloneVectorArrayEmpty_***(Ns, y0);
Here the argument y0 serves only to provide the N_Vector type for cloning. Then, for each i = 0, . . . , Ns−1, load initial values for the i-th sensitivity vector yS0[i]. Set the initial conditions for the Ns sensitivity derivative vectors ypS0 of ẏ similarly.
18. Activate sensitivity calculations
Call flag = IDASensInit(...); to activate forward sensitivity computations and allocate internal memory for idas related to sensitivity calculations (see §5.2.1).
19. Set sensitivity tolerances
Call IDASensSStolerances, IDASensSVtolerances, or IDASensEEtolerances. See §5.2.2.
20. Set sensitivity analysis optional inputs
Call IDASetSens* routines to change from their default values any optional inputs that control the behavior of idas in computing forward sensitivities. See §5.2.7.
21. Create sensitivity nonlinear solver object (optional)
If using a non-default nonlinear solver (see §5.2.3), then create the desired nonlinear solver object by calling the appropriate constructor function defined by the particular sunnonlinsol implementation, e.g.,
NLSSens = SUNNonlinSol_***Sens(...);
where *** is the name of the nonlinear solver and ... are constructor specific arguments (see Chapter 10 for details).
22.
Attach the sensitivity nonlinear solver module (optional)
If using a non-default nonlinear solver, then initialize the nonlinear solver interface by attaching the nonlinear solver object by calling
ier = IDASetNonlinearSolverSensSim(ida_mem, NLSSens);
when using the IDA_SIMULTANEOUS corrector method or
ier = IDASetNonlinearSolverSensStg(ida_mem, NLSSens);
when using the IDA_STAGGERED corrector method (see §5.2.3 for details).
23. Set sensitivity nonlinear solver optional inputs (optional)
Call the appropriate set functions for the selected nonlinear solver module to change optional inputs specific to that nonlinear solver. These must be called after IDASensInit if using the default nonlinear solver or after attaching a new nonlinear solver to idas, otherwise the optional inputs will be overridden by idas defaults. See Chapter 10 for more information on optional inputs.
24. Correct initial values
25. Specify rootfinding problem
26. Advance solution in time
27. Extract sensitivity solution
After each successful return from IDASolve, the solution of the original IVP is available in the y argument of IDASolve, while the sensitivity solution can be extracted into yS and ypS (which can be the same as yS0 and ypS0, respectively) by calling one of the following routines: IDAGetSens, IDAGetSens1, IDAGetSensDky, or IDAGetSensDky1 (see §5.2.6).
28. Get optional outputs
29. Deallocate memory for solution vector
30. Deallocate memory for sensitivity vectors
Upon completion of the integration, deallocate memory for the vectors contained in yS0 and ypS0:
N_VDestroyVectorArray_***(yS0, Ns);
If yS was created from realtype arrays yS_i, it is the user's responsibility to also free the space for the arrays yS_i, and likewise for ypS.
31. Free user data structure
32. Free solver memory
33. Free nonlinear solver memory
34. Free vector specification memory
35. Free linear solver and matrix memory
36.
Finalize MPI, if used

5.2 User-callable routines for forward sensitivity analysis

This section describes the idas functions, in addition to those presented in §4.5, that are called by the user to set up and solve a forward sensitivity problem.

5.2.1 Forward sensitivity initialization and deallocation functions

Activation of forward sensitivity computation is done by calling IDASensInit. The form of the call to this routine is as follows:

IDASensInit

Call: flag = IDASensInit(ida_mem, Ns, ism, resS, yS0, ypS0);

Description: The routine IDASensInit activates forward sensitivity computations and allocates internal memory related to sensitivity calculations.

Arguments:
ida_mem (void *): pointer to the idas memory block returned by IDACreate.
Ns (int): the number of sensitivities to be computed.
ism (int): a flag used to select the sensitivity solution method. Its value can be either IDA_SIMULTANEOUS or IDA_STAGGERED:
• In the IDA_SIMULTANEOUS approach, the state and sensitivity variables are corrected at the same time. If the default Newton nonlinear solver is used, this amounts to performing a modified Newton iteration on the combined nonlinear system;
• In the IDA_STAGGERED approach, the correction step for the sensitivity variables takes place at the same time for all sensitivity equations, but only after the correction of the state variables has converged and the state variables have passed the local error test.
resS (IDASensResFn): the C function which computes the residual of the sensitivity DAE. For full details see §5.3.
yS0 (N_Vector *): a pointer to an array of Ns vectors containing the initial values of the sensitivities of y.
ypS0 (N_Vector *): a pointer to an array of Ns vectors containing the initial values of the sensitivities of ẏ.

Return value: The return value flag (of type int) will be one of the following:
IDA_SUCCESS: The call to IDASensInit was successful.
IDA_MEM_NULL: The idas memory block was not initialized through a previous call to IDACreate.
IDA_MEM_FAIL: A memory allocation request has failed.
IDA_ILL_INPUT: An input argument to IDASensInit has an illegal value.

Notes: Passing resS = NULL indicates using the default internal difference quotient sensitivity residual routine. If an error occurred, IDASensInit also prints an error message to the file specified by the optional input errfp.

In terms of the problem size N, number of sensitivity vectors Ns, and maximum method order maxord, the size of the real workspace is increased as follows:
• Base value: lenrw = lenrw + (maxord+5)Ns N
• With IDASensSVtolerances: lenrw = lenrw + Ns N

The size of the integer workspace is increased as follows:
• Base value: leniw = leniw + (maxord+5)Ns Ni
• With IDASensSVtolerances: leniw = leniw + Ns Ni,

where Ni is the number of integer words in one N_Vector.

The routine IDASensReInit, useful during the solution of a sequence of problems of the same size, reinitializes the sensitivity-related internal memory and must follow a call to IDASensInit (and possibly a call to IDAReInit). The number Ns of sensitivities is assumed to be unchanged since the call to IDASensInit. The call to the IDASensReInit function has the form:

IDASensReInit

Call: flag = IDASensReInit(ida_mem, ism, yS0, ypS0);

Description: The routine IDASensReInit reinitializes forward sensitivity computations.

Arguments:
ida_mem (void *): pointer to the idas memory block returned by IDACreate.
ism (int): a flag used to select the sensitivity solution method. Its value can be either IDA_SIMULTANEOUS or IDA_STAGGERED.
yS0 (N_Vector *): a pointer to an array of Ns variables of type N_Vector containing the initial values of the sensitivities of y.
ypS0 (N_Vector *): a pointer to an array of Ns variables of type N_Vector containing the initial values of the sensitivities of ẏ.
Return value: The return value flag (of type int) will be one of the following:
IDA_SUCCESS: The call to IDASensReInit was successful.
IDA_MEM_NULL: The idas memory block was not initialized through a previous call to IDACreate.
IDA_NO_SENS: Memory space for sensitivity integration was not allocated through a previous call to IDASensInit.
IDA_ILL_INPUT: An input argument to IDASensReInit has an illegal value.
IDA_MEM_FAIL: A memory allocation request has failed.

Notes: All arguments of IDASensReInit are the same as those of IDASensInit. If an error occurred, IDASensReInit also prints an error message to the file specified by the optional input errfp.

To deallocate all forward sensitivity-related memory (allocated in a prior call to IDASensInit), the user must call

IDASensFree

Call: IDASensFree(ida_mem);

Description: The function IDASensFree frees the memory allocated for forward sensitivity computations by a previous call to IDASensInit.

Arguments: The argument is the pointer to the idas memory block (of type void *).

Return value: The function IDASensFree has no return value.

Notes: In general, IDASensFree need not be called by the user, as it is invoked automatically by IDAFree. After a call to IDASensFree, forward sensitivity computations can be reactivated only by calling IDASensInit again.

To activate and deactivate forward sensitivity calculations for successive idas runs, without having to allocate and deallocate memory, the following function is provided:

IDASensToggleOff

Call: IDASensToggleOff(ida_mem);

Description: The function IDASensToggleOff deactivates forward sensitivity calculations. It does not deallocate sensitivity-related memory.

Arguments:
ida_mem (void *): pointer to the memory previously allocated by IDAInit.

Return value: The return value flag of IDASensToggleOff is one of:
IDA_SUCCESS: IDASensToggleOff was successful.
IDA_MEM_NULL: ida_mem was NULL.
Notes: Since sensitivity-related memory is not deallocated, sensitivities can be reactivated at a later time (using IDASensReInit).

5.2.2 Forward sensitivity tolerance specification functions

One of the following three functions must be called to specify the integration tolerances for sensitivities. Note that this call must be made after the call to IDASensInit.

IDASensSStolerances

Call: flag = IDASensSStolerances(ida_mem, reltolS, abstolS);

Description: The function IDASensSStolerances specifies scalar relative and absolute tolerances.

Arguments:
ida_mem (void *): pointer to the idas memory block returned by IDACreate.
reltolS (realtype): the scalar relative error tolerance.
abstolS (realtype *): a pointer to an array of length Ns containing the scalar absolute error tolerances.

Return value: The return flag flag (of type int) will be one of the following:
IDA_SUCCESS: The call to IDASensSStolerances was successful.
IDA_MEM_NULL: The idas memory block was not initialized through a previous call to IDACreate.
IDA_NO_SENS: The sensitivity allocation function IDASensInit has not been called.
IDA_ILL_INPUT: One of the input tolerances was negative.

IDASensSVtolerances

Call: flag = IDASensSVtolerances(ida_mem, reltolS, abstolS);

Description: The function IDASensSVtolerances specifies scalar relative tolerance and vector absolute tolerances.

Arguments:
ida_mem (void *): pointer to the idas memory block returned by IDACreate.
reltolS (realtype): the scalar relative error tolerance.
abstolS (N_Vector *): an array of Ns variables of type N_Vector. The N_Vector abstolS[is] specifies the vector tolerances for the is-th sensitivity.

Return value: The return flag flag (of type int) will be one of the following:
IDA_SUCCESS: The call to IDASensSVtolerances was successful.
IDA_MEM_NULL: The idas memory block was not initialized through a previous call to IDACreate.
IDA_NO_SENS: The sensitivity allocation function IDASensInit has not been called.
IDA_ILL_INPUT: The relative error tolerance was negative, or one of the absolute tolerance vectors had a negative component.

Notes: This choice of tolerances is important when the absolute error tolerance needs to be different for each component of any vector yS[i].

IDASensEEtolerances

Call: flag = IDASensEEtolerances(ida_mem);

Description: When IDASensEEtolerances is called, idas will estimate tolerances for sensitivity variables based on the tolerances supplied for state variables and the scaling factors p̄.

Arguments:
ida_mem (void *): pointer to the idas memory block returned by IDACreate.

Return value: The return flag flag (of type int) will be one of the following:
IDA_SUCCESS: The call to IDASensEEtolerances was successful.
IDA_MEM_NULL: The idas memory block was not initialized through a previous call to IDACreate.
IDA_NO_SENS: The sensitivity allocation function IDASensInit has not been called.

5.2.3 Forward sensitivity nonlinear solver interface functions

As in the pure DAE case, when computing solution sensitivities using forward sensitivity analysis idas uses, by default, the sunnonlinsol implementation of Newton's method defined by the sunnonlinsol_newton module (see §10.2). To specify a different nonlinear solver in idas, the user's program must create a sunnonlinsol object by calling the appropriate constructor routine. The user must then attach the sunnonlinsol object to idas by calling either IDASetNonlinearSolverSensSim when using the IDA_SIMULTANEOUS corrector option, or IDASetNonlinearSolver (see §4.5.4) and IDASetNonlinearSolverSensStg when using the IDA_STAGGERED corrector option, as documented below.

When changing the nonlinear solver in idas, IDASetNonlinearSolver must be called after IDAInit; similarly, IDASetNonlinearSolverSensSim and IDASetNonlinearSolverSensStg must be called after IDASensInit.
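As a concrete illustration of this ordering, the following fragment creates a sensitivity-capable nonlinear solver and attaches it for the staggered corrector. This is a sketch only: the constructor name SUNNonlinSol_FixedPointSens and its argument list are assumptions following the SUNNonlinSol_***Sens pattern; consult Chapter 10 for the actual constructors available.

```c
/* Sketch (not a definitive recipe): create and attach a non-default
 * sensitivity nonlinear solver for the IDA_STAGGERED corrector.
 * SUNNonlinSol_FixedPointSens and its arguments are ASSUMED here,
 * following the SUNNonlinSol_***Sens constructor pattern. */
SUNNonlinearSolver NLSSens;
int flag;

/* constructor call; must come after IDASensInit */
NLSSens = SUNNonlinSol_FixedPointSens(Ns, y0, 0);  /* hypothetical arguments */
if (NLSSens == NULL) { /* handle allocation failure */ }

flag = IDASetNonlinearSolverSensStg(ida_mem, NLSSens);
if (flag != IDA_SUCCESS) { /* handle error */ }
```

For the IDA_SIMULTANEOUS corrector, the attach call would instead be IDASetNonlinearSolverSensSim, as described above.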
If any calls to IDASolve have been made, then idas will need to be reinitialized by calling IDAReInit to ensure that the nonlinear solver is initialized correctly before any subsequent calls to IDASolve.

The first argument passed to the routines IDASetNonlinearSolverSensSim and IDASetNonlinearSolverSensStg is the idas memory pointer returned by IDACreate, and the second argument is the sunnonlinsol object to use for solving the nonlinear system (2.4). A call to this function attaches the nonlinear solver to the main idas integrator. We note that at present, the sunnonlinsol object must be of type SUNNONLINEARSOLVER_ROOTFIND.

IDASetNonlinearSolverSensSim

Call: flag = IDASetNonlinearSolverSensSim(ida_mem, NLS);

Description: The function IDASetNonlinearSolverSensSim attaches a sunnonlinsol object (NLS) to idas when using the IDA_SIMULTANEOUS approach to correct the state and sensitivity variables at the same time.

Arguments:
ida_mem (void *): pointer to the idas memory block.
NLS (SUNNonlinearSolver): sunnonlinsol object to use for solving nonlinear systems.

Return value: The return value flag (of type int) is one of:
IDA_SUCCESS: The nonlinear solver was successfully attached.
IDA_MEM_NULL: The ida_mem pointer is NULL.
IDA_ILL_INPUT: The sunnonlinsol object is NULL, does not implement the required nonlinear solver operations, is not of the correct type, or the residual function, convergence test function, or maximum number of nonlinear iterations could not be set.

IDASetNonlinearSolverSensStg

Call: flag = IDASetNonlinearSolverSensStg(ida_mem, NLS);

Description: The function IDASetNonlinearSolverSensStg attaches a sunnonlinsol object (NLS) to idas when using the IDA_STAGGERED approach to correct the sensitivity variables after the correction of the state variables.

Arguments:
ida_mem (void *): pointer to the idas memory block.
NLS (SUNNonlinearSolver): sunnonlinsol object to use for solving nonlinear systems.
Return value: The return value flag (of type int) is one of:
IDA_SUCCESS: The nonlinear solver was successfully attached.
IDA_MEM_NULL: The ida_mem pointer is NULL.
IDA_ILL_INPUT: The sunnonlinsol object is NULL, does not implement the required nonlinear solver operations, is not of the correct type, or the residual function, convergence test function, or maximum number of nonlinear iterations could not be set.

Notes: This function only attaches the sunnonlinsol object for correcting the sensitivity variables. To attach a sunnonlinsol object for the state variable correction, use IDASetNonlinearSolver (see §4.5.4).

5.2.4 Forward sensitivity initial condition calculation function

IDACalcIC also calculates corrected initial conditions for sensitivity variables of a DAE system. When used for initial conditions calculation of the forward sensitivities, IDACalcIC must be preceded by successful calls to IDASensInit (or IDASensReInit) and should precede the call(s) to IDASolve. For restrictions that apply to initial conditions calculation of the state variables, see §4.5.5.

Calling IDACalcIC is optional. It is only necessary when the initial conditions do not satisfy the sensitivity systems. Even if forward sensitivity analysis was enabled, the call to the initial conditions calculation function IDACalcIC is exactly the same as for state variables:

flag = IDACalcIC(ida_mem, icopt, tout1);

See §4.5.5 for a list of possible return values.

5.2.5 IDAS solver function

Even if forward sensitivity analysis was enabled, the call to the main solver function IDASolve is exactly the same as in §4.5.7. However, in this case the return value flag can also be one of the following:
IDA_SRES_FAIL: The sensitivity residual function failed in an unrecoverable manner.
IDA_REP_SRES_ERR: The user's residual function repeatedly returned a recoverable error flag, but the solver was unable to recover.
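Putting the above into practice, a minimal call that distinguishes the sensitivity-specific failure returns might look as follows; tout, y, yp, and ida_mem are assumed to have been set up as in the skeleton of §5.1, and the error handling is only sketched:

```c
/* Sketch: advance the solution and check for the sensitivity-specific
 * return flags of IDASolve described above. */
realtype tret;
int flag = IDASolve(ida_mem, tout, &tret, y, yp, IDA_NORMAL);

if (flag == IDA_SRES_FAIL) {
  /* the sensitivity residual function failed unrecoverably */
} else if (flag == IDA_REP_SRES_ERR) {
  /* repeated recoverable errors in the sensitivity residual */
} else if (flag < 0) {
  /* some other integration failure; see Section 4.5.7 */
}
```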
5.2.6 Forward sensitivity extraction functions

If forward sensitivity computations have been initialized by a call to IDASensInit, or reinitialized by a call to IDASensReInit, then idas computes both a solution and sensitivities at time t. However, IDASolve will still return only the solutions y and ẏ in yret and ypret, respectively. Solution sensitivities can be obtained through one of the following functions:

IDAGetSens

Call: flag = IDAGetSens(ida_mem, &tret, yS);

Description: The function IDAGetSens returns the sensitivity solution vectors after a successful return from IDASolve.

Arguments:
ida_mem (void *): pointer to the memory previously allocated by IDAInit.
tret (realtype *): the time reached by the solver (output).
yS (N_Vector *): the array of Ns computed forward sensitivity vectors.

Return value: The return value flag of IDAGetSens is one of:
IDA_SUCCESS: IDAGetSens was successful.
IDA_MEM_NULL: ida_mem was NULL.
IDA_NO_SENS: Forward sensitivity analysis was not initialized.
IDA_BAD_DKY: yS is NULL.

Notes: Note that the argument tret is an output for this function. Its value will be the same as that returned at the last IDASolve call.

The function IDAGetSensDky computes the k-th derivatives of the interpolating polynomials for the sensitivity variables at time t. This function is called by IDAGetSens with k = 0, but may also be called directly by the user.

IDAGetSensDky

Call: flag = IDAGetSensDky(ida_mem, t, k, dkyS);

Description: The function IDAGetSensDky returns derivatives of the sensitivity solution vectors after a successful return from IDASolve.

Arguments:
ida_mem (void *): pointer to the memory previously allocated by IDAInit.
t (realtype): specifies the time at which sensitivity information is requested. The time t must fall within the interval defined by the last successful step taken by idas.
k (int): order of derivatives.
dkyS (N_Vector *): array of Ns vectors containing the derivatives on output.
The space for dkyS must be allocated by the user.

Return value: The return value flag of IDAGetSensDky is one of:
IDA_SUCCESS: IDAGetSensDky succeeded.
IDA_MEM_NULL: ida_mem was NULL.
IDA_NO_SENS: Forward sensitivity analysis was not initialized.
IDA_BAD_DKY: dkyS or one of the vectors dkyS[i] is NULL.
IDA_BAD_K: k is not in the range 0, 1, ..., klast.
IDA_BAD_T: The time t is not in the allowed range.

Forward sensitivity solution vectors can also be extracted separately for each parameter in turn through the functions IDAGetSens1 and IDAGetSensDky1, defined as follows:

IDAGetSens1

Call: flag = IDAGetSens1(ida_mem, &tret, is, yS);

Description: The function IDAGetSens1 returns the is-th sensitivity solution vector after a successful return from IDASolve.

Arguments:
ida_mem (void *): pointer to the memory previously allocated by IDAInit.
tret (realtype *): the time reached by the solver (output).
is (int): specifies which sensitivity vector is to be returned (0 ≤ is < Ns).
yS (N_Vector): the computed forward sensitivity vector.

Return value: The return value flag of IDAGetSens1 is one of:
IDA_SUCCESS: IDAGetSens1 was successful.
IDA_MEM_NULL: ida_mem was NULL.
IDA_NO_SENS: Forward sensitivity analysis was not initialized.
IDA_BAD_IS: The index is is not in the allowed range.
IDA_BAD_DKY: yS is NULL.
IDA_BAD_T: The time t is not in the allowed range.

Notes: Note that the argument tret is an output for this function. Its value will be the same as that returned at the last IDASolve call.

IDAGetSensDky1

Call: flag = IDAGetSensDky1(ida_mem, t, k, is, dkyS);

Description: The function IDAGetSensDky1 returns the k-th derivative of the is-th sensitivity solution vector after a successful return from IDASolve.

Arguments:
ida_mem (void *): pointer to the memory previously allocated by IDAInit.
t (realtype): specifies the time at which sensitivity information is requested.
The time t must fall within the interval defined by the last successful step taken by idas.
k (int): order of derivative.
is (int): specifies the sensitivity derivative vector to be returned (0 ≤ is < Ns).
dkyS (N_Vector): the vector containing the derivative on output. The space for dkyS must be allocated by the user.

Return value: The return value flag of IDAGetSensDky1 is one of:
IDA_SUCCESS: IDAGetSensDky1 succeeded.
IDA_MEM_NULL: ida_mem was NULL.
IDA_NO_SENS: Forward sensitivity analysis was not initialized.
IDA_BAD_DKY: dkyS is NULL.
IDA_BAD_IS: The index is is not in the allowed range.
IDA_BAD_K: k is not in the range 0, 1, ..., klast.
IDA_BAD_T: The time t is not in the allowed range.

5.2.7 Optional inputs for forward sensitivity analysis

Optional input variables that control the computation of sensitivities can be changed from their default values through calls to IDASetSens* functions. Table 5.1 lists all forward sensitivity optional input functions in idas, which are described in detail in the remainder of this section.

Table 5.1: Forward sensitivity optional inputs

Optional input                        Routine name               Default
Sensitivity scaling factors           IDASetSensParams           NULL
DQ approximation method               IDASetSensDQMethod         centered, 0.0
Error control strategy                IDASetSensErrCon           SUNFALSE
Maximum no. of nonlinear iterations   IDASetSensMaxNonlinIters   3

IDASetSensParams

Call: flag = IDASetSensParams(ida_mem, p, pbar, plist);

Description: The function IDASetSensParams specifies problem parameter information for sensitivity calculations.

Arguments:
ida_mem (void *): pointer to the idas memory block.
p (realtype *): a pointer to the array of real problem parameters used to evaluate F(t, y, ẏ, p). If non-NULL, p must point to a field in the user's data structure user_data passed to the user's residual function. (See §5.1.)
pbar (realtype *): an array of Ns positive scaling factors. If non-NULL, pbar must have all its components > 0.0. (See §5.1.)
plist (int *): an array of Ns non-negative indices to specify which components of p to use in estimating the sensitivity equations. If non-NULL, plist must have all components ≥ 0. (See §5.1.)

Return value: The return value flag (of type int) is one of:
IDA_SUCCESS: The optional value has been successfully set.
IDA_MEM_NULL: The ida_mem pointer is NULL.
IDA_NO_SENS: Forward sensitivity analysis was not initialized.
IDA_ILL_INPUT: An argument has an illegal value.

Notes: This function must be preceded by a call to IDASensInit.

IDASetSensDQMethod

Call: flag = IDASetSensDQMethod(ida_mem, DQtype, DQrhomax);

Description: The function IDASetSensDQMethod specifies the difference quotient strategy in the case in which the residuals of the sensitivity equations are to be computed by idas.

Arguments:
ida_mem (void *): pointer to the idas memory block.
DQtype (int): specifies the difference quotient type and can be either IDA_CENTERED or IDA_FORWARD.
DQrhomax (realtype): positive value of the selection parameter used in deciding switching between a simultaneous or separate approximation of the two terms in the sensitivity residual.

Return value: The return value flag (of type int) is one of:
IDA_SUCCESS: The optional value has been successfully set.
IDA_MEM_NULL: The ida_mem pointer is NULL.
IDA_ILL_INPUT: An argument has an illegal value.

Notes: If DQrhomax = 0.0, then no switching is performed. The approximation is done simultaneously using either centered or forward finite differences, depending on the value of DQtype. For values of DQrhomax ≥ 1.0, the simultaneous approximation is used whenever the estimated finite difference perturbations for states and parameters are within a factor of DQrhomax, and the separate approximation is used otherwise. Note that a value DQrhomax < 1.0 will effectively disable switching. See §2.5 for more details. The default values are DQtype = IDA_CENTERED and DQrhomax = 0.0.
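The optional inputs above are typically set together, immediately after IDASensInit. A hedged sketch follows; the parameter field data->p and the value Ns = 3 are illustrative assumptions, not part of the API:

```c
/* Sketch: typical forward-sensitivity optional-input calls.
 * data->p is ASSUMED to be the parameter array inside the user's
 * data structure (see Section 5.1); pbar and plist are examples. */
realtype pbar[3]  = {1.0, 1.0, 1.0};  /* positive scaling factors */
int      plist[3] = {0, 1, 2};        /* differentiate w.r.t. p[0..2] */
int flag;

flag = IDASetSensParams(ida_mem, data->p, pbar, plist);
flag = IDASetSensDQMethod(ida_mem, IDA_CENTERED, 0.0); /* the defaults, made explicit */
flag = IDASetSensErrCon(ida_mem, SUNTRUE);   /* include sensitivities in error test */
flag = IDASetSensMaxNonlinIters(ida_mem, 4); /* default is 3 */
```

Each call returns one of the flags listed in the corresponding entry of this section, so the return values should be checked in a real program.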
IDASetSensErrCon

Call: flag = IDASetSensErrCon(ida_mem, errconS);

Description: The function IDASetSensErrCon specifies the error control strategy for sensitivity variables.

Arguments:
ida_mem (void *): pointer to the idas memory block.
errconS (booleantype): specifies whether sensitivity variables are included (SUNTRUE) or not (SUNFALSE) in the error control mechanism.

Return value: The return value flag (of type int) is one of:
IDA_SUCCESS: The optional value has been successfully set.
IDA_MEM_NULL: The ida_mem pointer is NULL.

Notes: By default, errconS is set to SUNFALSE. If errconS = SUNTRUE, then both state variables and sensitivity variables are included in the error tests. If errconS = SUNFALSE, then the sensitivity variables are excluded from the error tests. Note that, in any event, all variables are considered in the convergence tests.

IDASetSensMaxNonlinIters

Call: flag = IDASetSensMaxNonlinIters(ida_mem, maxcorS);

Description: The function IDASetSensMaxNonlinIters specifies the maximum number of nonlinear solver iterations for sensitivity variables per step.

Arguments:
ida_mem (void *): pointer to the idas memory block.
maxcorS (int): maximum number of nonlinear solver iterations allowed per step (> 0).

Return value: The return value flag (of type int) is one of:
IDA_SUCCESS: The optional value has been successfully set.
IDA_MEM_NULL: The ida_mem pointer is NULL.
IDA_MEM_FAIL: The ida_mem sunnonlinsol module is NULL.

Notes: The default value is 3.

5.2.8 Optional outputs for forward sensitivity analysis

5.2.8.1 Main solver optional output functions

Optional output functions that return statistics and solver performance information related to forward sensitivity computations are listed in Table 5.2 and described in detail in the remainder of this section.
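As a usage sketch, the grouped statistics functions documented in this section can be queried after a completed integration as follows; ida_mem is assumed to hold a solver that has already run with sensitivities enabled:

```c
/* Sketch: retrieve sensitivity-related statistics as a group after a run. */
long int nfSevals, nfevalsS, nSetfails, nlinsetupsS;
long int nSniters, nSncfails;
int flag;

flag = IDAGetSensStats(ida_mem, &nfSevals, &nfevalsS,
                       &nSetfails, &nlinsetupsS);
flag = IDAGetSensNonlinSolvStats(ida_mem, &nSniters, &nSncfails);

printf("sens. residual evals = %ld  DQ state-residual evals = %ld\n",
       nfSevals, nfevalsS);
printf("sens. err test fails = %ld  lin. solver setups  = %ld\n",
       nSetfails, nlinsetupsS);
printf("sens. nonlin. iters  = %ld  conv. failures      = %ld\n",
       nSniters, nSncfails);
```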
IDAGetSensNumResEvals

Call: flag = IDAGetSensNumResEvals(ida_mem, &nfSevals);

Description: The function IDAGetSensNumResEvals returns the number of calls to the sensitivity residual function.

Arguments:
ida_mem (void *): pointer to the idas memory block.
nfSevals (long int): number of calls to the sensitivity residual function.

Return value: The return value flag (of type int) is one of:
IDA_SUCCESS: The optional output value has been successfully set.
IDA_MEM_NULL: The ida_mem pointer is NULL.
IDA_NO_SENS: Forward sensitivity analysis was not initialized.

IDAGetNumResEvalsSens

Call: flag = IDAGetNumResEvalsSens(ida_mem, &nfevalsS);

Description: The function IDAGetNumResEvalsSens returns the number of calls to the user's residual function due to the internal finite difference approximation of the sensitivity residuals.

Table 5.2: Forward sensitivity optional outputs

Optional output                                     Routine name
No. of calls to sensitivity residual function       IDAGetSensNumResEvals
No. of calls to residual function for sensitivity   IDAGetNumResEvalsSens
No. of sensitivity local error test failures        IDAGetSensNumErrTestFails
No. of calls to lin. solv. setup routine for sens.  IDAGetSensNumLinSolvSetups
Sensitivity-related statistics as a group           IDAGetSensStats
Error weight vector for sensitivity variables       IDAGetSensErrWeights
No. of sens. nonlinear solver iterations            IDAGetSensNumNonlinSolvIters
No. of sens. convergence failures                   IDAGetSensNumNonlinSolvConvFails
Sens. nonlinear solver statistics as a group        IDAGetSensNonlinSolvStats

Arguments:
ida_mem (void *): pointer to the idas memory block.
nfevalsS (long int): number of calls to the user residual function for sensitivity residuals.

Return value: The return value flag (of type int) is one of:
IDA_SUCCESS: The optional output value has been successfully set.
IDA_MEM_NULL: The ida_mem pointer is NULL.
IDA_NO_SENS: Forward sensitivity analysis was not initialized.
Notes: This counter is incremented only if the internal finite difference approximation routines are used for the evaluation of the sensitivity residuals.

IDAGetSensNumErrTestFails

Call: flag = IDAGetSensNumErrTestFails(ida_mem, &nSetfails);

Description: The function IDAGetSensNumErrTestFails returns the number of local error test failures for the sensitivity variables that have occurred.

Arguments:
ida_mem (void *): pointer to the idas memory block.
nSetfails (long int): number of error test failures.

Return value: The return value flag (of type int) is one of:
IDA_SUCCESS: The optional output value has been successfully set.
IDA_MEM_NULL: The ida_mem pointer is NULL.
IDA_NO_SENS: Forward sensitivity analysis was not initialized.

Notes: This counter is incremented only if the sensitivity variables have been included in the error test (see IDASetSensErrCon in §5.2.7). Even in that case, this counter is not incremented if the ism = IDA_SIMULTANEOUS sensitivity solution method has been used.

IDAGetSensNumLinSolvSetups

Call: flag = IDAGetSensNumLinSolvSetups(ida_mem, &nlinsetupsS);

Description: The function IDAGetSensNumLinSolvSetups returns the number of calls to the linear solver setup function due to forward sensitivity calculations.

Arguments:
ida_mem (void *): pointer to the idas memory block.
nlinsetupsS (long int): number of calls to the linear solver setup function.

Return value: The return value flag (of type int) is one of:
IDA_SUCCESS: The optional output value has been successfully set.
IDA_MEM_NULL: The ida_mem pointer is NULL.
IDA_NO_SENS: Forward sensitivity analysis was not initialized.

Notes: This counter is incremented only if a nonlinear solver requiring linear solves has been used and the staggered sensitivity solution method (ism = IDA_STAGGERED) was specified in the call to IDASensInit (see §5.2.1).
IDAGetSensStats

Call: flag = IDAGetSensStats(ida_mem, &nfSevals, &nfevalsS, &nSetfails, &nlinsetupsS);

Description: The function IDAGetSensStats returns all of the above sensitivity-related solver statistics as a group.

Arguments:
ida_mem (void *): pointer to the idas memory block.
nfSevals (long int): number of calls to the sensitivity residual function.
nfevalsS (long int): number of calls to the user-supplied residual function.
nSetfails (long int): number of error test failures.
nlinsetupsS (long int): number of calls to the linear solver setup function.

Return value: The return value flag (of type int) is one of:
IDA_SUCCESS: The optional output values have been successfully set.
IDA_MEM_NULL: The ida_mem pointer is NULL.
IDA_NO_SENS: Forward sensitivity analysis was not initialized.

IDAGetSensErrWeights

Call: flag = IDAGetSensErrWeights(ida_mem, eSweight);

Description: The function IDAGetSensErrWeights returns the sensitivity error weight vectors at the current time. These are the reciprocals of the Wi of (2.7) for the sensitivity variables.

Arguments:
ida_mem (void *): pointer to the idas memory block.
eSweight (N_Vector *): pointer to the array of error weight vectors.

Return value: The return value flag (of type int) is one of:
IDA_SUCCESS: The optional output value has been successfully set.
IDA_MEM_NULL: The ida_mem pointer is NULL.
IDA_NO_SENS: Forward sensitivity analysis was not initialized.

Notes: The user must allocate memory for eSweight.

IDAGetSensNumNonlinSolvIters

Call: flag = IDAGetSensNumNonlinSolvIters(ida_mem, &nSniters);

Description: The function IDAGetSensNumNonlinSolvIters returns the number of nonlinear iterations performed for sensitivity calculations.

Arguments:
ida_mem (void *): pointer to the idas memory block.
nSniters (long int): number of nonlinear iterations performed.
Return value: The return value flag (of type int) is one of:
IDA_SUCCESS: The optional output value has been successfully set.
IDA_MEM_NULL: The ida_mem pointer is NULL.
IDA_NO_SENS: Forward sensitivity analysis was not initialized.
IDA_MEM_FAIL: The sunnonlinsol module is NULL.

Notes: This counter is incremented only if ism was IDA_STAGGERED in the call to IDASensInit (see §5.2.1).

IDAGetSensNumNonlinSolvConvFails

Call: flag = IDAGetSensNumNonlinSolvConvFails(ida_mem, &nSncfails);

Description: The function IDAGetSensNumNonlinSolvConvFails returns the number of nonlinear convergence failures that have occurred for sensitivity calculations.

Arguments:
ida_mem (void *): pointer to the idas memory block.
nSncfails (long int): number of nonlinear convergence failures.

Return value: The return value flag (of type int) is one of:
IDA_SUCCESS: The optional output value has been successfully set.
IDA_MEM_NULL: The ida_mem pointer is NULL.
IDA_NO_SENS: Forward sensitivity analysis was not initialized.

Notes: This counter is incremented only if ism was IDA_STAGGERED in the call to IDASensInit (see §5.2.1).

IDAGetSensNonlinSolvStats

Call: flag = IDAGetSensNonlinSolvStats(ida_mem, &nSniters, &nSncfails);

Description: The function IDAGetSensNonlinSolvStats returns the sensitivity-related nonlinear solver statistics as a group.

Arguments:
ida_mem (void *): pointer to the idas memory block.
nSniters (long int): number of nonlinear iterations performed.
nSncfails (long int): number of nonlinear convergence failures.

Return value: The return value flag (of type int) is one of:
IDA_SUCCESS: The optional output values have been successfully set.
IDA_MEM_NULL: The ida_mem pointer is NULL.
IDA_NO_SENS: Forward sensitivity analysis was not initialized.
IDA_MEM_FAIL: The sunnonlinsol module is NULL.

5.2.8.2
Initial condition calculation optional output functions

The sensitivity consistent initial conditions found by idas (after a successful call to IDACalcIC) can be obtained by calling the following function:

IDAGetSensConsistentIC

Call: flag = IDAGetSensConsistentIC(ida_mem, yyS0_mod, ypS0_mod);

Description: The function IDAGetSensConsistentIC returns the corrected initial conditions calculated by IDACalcIC for sensitivity variables.

Arguments:
ida_mem (void *): pointer to the idas memory block.
yyS0_mod (N_Vector *): a pointer to an array of Ns vectors containing consistent sensitivity vectors.
ypS0_mod (N_Vector *): a pointer to an array of Ns vectors containing consistent sensitivity derivative vectors.

Return value: The return value flag (of type int) is one of:
IDA_SUCCESS: IDAGetSensConsistentIC succeeded.
IDA_MEM_NULL: The ida_mem pointer is NULL.
IDA_NO_SENS: The function IDASensInit has not been previously called.
IDA_ILL_INPUT: IDASolve has already been called.

Notes: If the consistent sensitivity vectors or consistent derivative vectors are not desired, pass NULL for the corresponding argument. The user must allocate space for yyS0_mod and ypS0_mod (if not NULL).

5.3 User-supplied routines for forward sensitivity analysis

In addition to the required and optional user-supplied routines described in §4.6, when using idas for forward sensitivity analysis the user has the option of providing a routine that calculates the residual of the sensitivity equations (2.12). By default, idas uses difference quotient approximation routines for the residual of the sensitivity equations. However, idas allows the option of user-defined sensitivity residual routines (which also provides a mechanism for interfacing idas to routines generated by automatic differentiation).
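Ahead of the formal definition that follows, here is a concrete example of such a routine for a deliberately simple, invented problem. The scalar DAE F(t, y, ẏ, p) = ẏ − p·y with a single parameter is purely illustrative, as is the UserData structure; the serial nvector accessor NV_Ith_S is assumed.

```c
/* Sketch: a user-supplied sensitivity residual (type IDASensResFn) for
 * the INVENTED scalar DAE  F(t,y,ydot,p) = ydot - p*y  with one parameter.
 * Here dF/dy = -p, dF/dydot = 1, dF/dp = -y, so the i-th sensitivity
 * residual  (dF/dy)s_i + (dF/dydot)sdot_i + dF/dp_i  is  sdot_i - p*s_i - y. */
typedef struct { realtype p[1]; } UserData;   /* hypothetical user data */

static int sensRes(int Ns, realtype t, N_Vector yy, N_Vector yp,
                   N_Vector resval, N_Vector *yS, N_Vector *ypS,
                   N_Vector *resvalS, void *user_data,
                   N_Vector tmp1, N_Vector tmp2, N_Vector tmp3)
{
  UserData *data = (UserData *) user_data;
  realtype p = data->p[0];
  realtype y = NV_Ith_S(yy, 0);
  int i;

  for (i = 0; i < Ns; i++) {
    realtype s    = NV_Ith_S(yS[i], 0);
    realtype sdot = NV_Ith_S(ypS[i], 0);
    NV_Ith_S(resvalS[i], 0) = sdot - p*s - y;
  }
  return 0;  /* 0 = success; positive = recoverable, negative = fatal */
}
```

A pointer to such a function would be passed as the resS argument of IDASensInit (§5.2.1).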
The user may provide the residuals of the sensitivity equations (2.12), for all sensitivity parameters at once, through a function of type IDASensResFn, defined as follows:

IDASensResFn

Definition   typedef int (*IDASensResFn)(int Ns, realtype t, N_Vector yy, N_Vector yp,
                                         N_Vector resval, N_Vector *yS, N_Vector *ypS,
                                         N_Vector *resvalS, void *user_data,
                                         N_Vector tmp1, N_Vector tmp2, N_Vector tmp3);
Purpose      This function computes the sensitivity residual for all sensitivity equations. It must compute the vectors (∂F/∂y)s_i(t) + (∂F/∂ẏ)ṡ_i(t) + ∂F/∂p_i and store them in resvalS[i].
Arguments    Ns        is the number of sensitivities.
             t         is the current value of the independent variable.
             yy        is the current value of the state vector, y(t).
             yp        is the current value of ẏ(t).
             resval    contains the current value F of the original DAE residual.
             yS        contains the current values of the sensitivities s_i.
             ypS       contains the current values of the sensitivity derivatives ṡ_i.
             resvalS   contains the output sensitivity residual vectors.
             user_data is a pointer to user data.
             tmp1, tmp2, tmp3 are N_Vectors of length N which can be used as temporary storage.
Return value An IDASensResFn should return 0 if successful, a positive value if a recoverable error occurred (in which case idas will attempt to correct), or a negative value if it failed unrecoverably (in which case the integration is halted and IDA_SRES_FAIL is returned).
Notes        There is one situation in which recovery is not possible even if the IDASensResFn returns a recoverable error flag: when the failure occurs at the very first call to the IDASensResFn, in which case idas returns IDA_FIRST_RES_FAIL.

5.4 Integration of quadrature equations depending on forward sensitivities

idas provides support for integration of quadrature equations that depend not only on the state variables but also on forward sensitivities.
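As an illustration of the IDASensResFn type described in §5.3 above, the following sketch implements the sensitivity residual for a hypothetical scalar DAE F(t, y, ẏ) = ẏ − p·y = 0 with a single sensitivity parameter p (Ns = 1). The problem, the UserData type, and the use of the serial nvector are assumptions for this sketch, not part of the manual:

```c
#include <idas/idas.h>
#include <nvector/nvector_serial.h>  /* NV_Ith_S accessor macro */

/* Hypothetical user data holding the single problem parameter p. */
typedef struct { realtype p; } UserData;

/* Sensitivity residual for F = yp - p*y:
 *   dF/dy = -p, dF/dyp = 1, dF/dp = -y,
 * so resvalS[0] = -p*s + sdot - y. */
static int sens_res(int Ns, realtype t, N_Vector yy, N_Vector yp,
                    N_Vector resval, N_Vector *yS, N_Vector *ypS,
                    N_Vector *resvalS, void *user_data,
                    N_Vector tmp1, N_Vector tmp2, N_Vector tmp3)
{
  UserData *data = (UserData *) user_data;
  realtype  y    = NV_Ith_S(yy, 0);
  realtype  s    = NV_Ith_S(yS[0], 0);
  realtype  sdot = NV_Ith_S(ypS[0], 0);

  NV_Ith_S(resvalS[0], 0) = -data->p * s + sdot - y;

  return 0;  /* 0 = success; >0 recoverable; <0 unrecoverable */
}
```

Such a function would be passed to idas through the resS argument of IDASensInit in place of the internal difference quotient routine.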
The following is an overview of the sequence of calls in a user's main program in this situation. Steps that are unchanged from the skeleton program presented in §5.1 are grayed out. See also §4.7.

1. Initialize parallel or multi-threaded environment
2. Set problem dimensions, etc.
3. Set vectors of initial values
4. Create idas object
5. Initialize idas solver
6. Specify integration tolerances
7. Create matrix object
8. Create linear solver object
9. Set linear solver optional inputs
10. Attach linear solver module
11. Set optional inputs
12. Create nonlinear solver object
13. Attach nonlinear solver module
14. Set nonlinear solver optional inputs
15. Initialize sensitivity-independent quadrature problem
16. Define the sensitivity problem
17. Set sensitivity initial conditions
18. Activate sensitivity calculations
19. Set sensitivity tolerances
20. Set sensitivity analysis optional inputs
21. Create sensitivity nonlinear solver object
22. Attach the sensitivity nonlinear solver module
23. Set sensitivity nonlinear solver optional inputs
24. Set vector of initial values for quadrature variables
    Typically, the quadrature variables should be initialized to 0.
25. Initialize sensitivity-dependent quadrature integration
    Call IDAQuadSensInit to specify the quadrature equation right-hand side function and to allocate internal memory related to quadrature integration. See §5.4.1 for details.
26. Set optional inputs for sensitivity-dependent quadrature integration
    Call IDASetQuadSensErrCon to indicate whether or not quadrature variables should be used in the step size control mechanism. If so, one of the IDAQuadSens*tolerances functions must be called to specify the integration tolerances for the quadrature variables. See §5.4.4 for details.
27. Advance solution in time
28. Extract sensitivity-dependent quadrature variables
    Call IDAGetQuadSens, IDAGetQuadSens1, IDAGetQuadSensDky, or IDAGetQuadSensDky1 to obtain the values of the quadrature variables or their derivatives at the current time. See §5.4.3 for details.
29. Get optional outputs
30. Extract sensitivity solution
31. Get sensitivity-dependent quadrature optional outputs
    Call the IDAGetQuadSens* functions to obtain optional output related to the integration of sensitivity-dependent quadratures. See §5.4.5 for details.
32. Deallocate memory for solution vectors
33. Deallocate memory for sensitivity vectors
34. Deallocate memory for sensitivity-dependent quadrature variables
35. Free solver memory
36. Free nonlinear solver memory
37. Free vector specification memory
38. Free linear solver and matrix memory
39. Finalize MPI, if used

Note: IDAQuadSensInit (step 25 above) can be called, and quadrature-related optional inputs (step 26 above) can be set, anywhere between steps 16 and 27.

5.4.1 Sensitivity-dependent quadrature initialization and deallocation

The function IDAQuadSensInit activates integration of quadrature equations depending on sensitivities and allocates internal memory related to these calculations. If rhsQS is input as NULL, then idas uses an internal function that computes difference quotient approximations to the functions q̄_i = (∂q/∂y)s_i + (∂q/∂ẏ)ṡ_i + ∂q/∂p_i, in the notation of (2.10). The form of the call to this function is as follows:

IDAQuadSensInit

Call         flag = IDAQuadSensInit(ida_mem, rhsQS, yQS0);
Description  The function IDAQuadSensInit provides required problem specifications, allocates internal memory, and initializes quadrature integration.
Arguments    ida_mem (void *) pointer to the idas memory block returned by IDACreate.
             rhsQS (IDAQuadSensRhsFn) is the C function which computes fQS, the right-hand side of the sensitivity-dependent quadrature equations (for full details see §5.4.6).
             yQS0  (N_Vector *) contains the initial values of the sensitivity-dependent quadratures.
Return value The return value flag (of type int) will be one of the following:
             IDA_SUCCESS   The call to IDAQuadSensInit was successful.
             IDA_MEM_NULL  The idas memory was not initialized by a prior call to IDACreate.
             IDA_MEM_FAIL  A memory allocation request failed.
             IDA_NO_SENS   The sensitivities were not initialized by a prior call to IDASensInit.
             IDA_ILL_INPUT The parameter yQS0 is NULL.
Notes        Before calling IDAQuadSensInit, the user must enable the sensitivities by calling IDASensInit. If an error occurred, IDAQuadSensInit also sends an error message to the error handler function.

In terms of the number of quadrature variables Nq and the maximum method order maxord, the size of the real workspace is increased as follows:
• Base value: lenrw = lenrw + (maxord+5)Nq
• If IDAQuadSensSVtolerances is called: lenrw = lenrw + Nq*Ns
and the size of the integer workspace is increased as follows:
• Base value: leniw = leniw + (maxord+5)Nq
• If IDAQuadSensSVtolerances is called: leniw = leniw + Nq*Ns

The function IDAQuadSensReInit, useful during the solution of a sequence of problems of the same size, reinitializes the quadrature-related internal memory and must follow a call to IDAQuadSensInit. The number Nq of quadratures as well as the number Ns of sensitivities are assumed to be unchanged from the prior call to IDAQuadSensInit. The call to the IDAQuadSensReInit function has the form:

IDAQuadSensReInit

Call         flag = IDAQuadSensReInit(ida_mem, yQS0);
Description  The function IDAQuadSensReInit provides required problem specifications and reinitializes the sensitivity-dependent quadrature integration.
Arguments    ida_mem (void *) pointer to the idas memory block.
             yQS0 (N_Vector *) contains the initial values of the sensitivity-dependent quadratures.
Return value The return value flag (of type int) will be one of the following:
             IDA_SUCCESS     The call to IDAQuadSensReInit was successful.
             IDA_MEM_NULL    The idas memory was not initialized by a prior call to IDACreate.
             IDA_NO_SENS     Memory space for the sensitivity calculation was not allocated by a prior call to IDASensInit.
             IDA_NO_QUADSENS Memory space for the sensitivity quadrature integration was not allocated by a prior call to IDAQuadSensInit.
             IDA_ILL_INPUT   The parameter yQS0 is NULL.
Notes        If an error occurred, IDAQuadSensReInit also sends an error message to the error handler function.

IDAQuadSensFree

Call         IDAQuadSensFree(ida_mem);
Description  The function IDAQuadSensFree frees the memory allocated for sensitivity quadrature integration.
Arguments    The argument is the pointer to the idas memory block (of type void *).
Return value The function IDAQuadSensFree has no return value.
Notes        In general, IDAQuadSensFree need not be called by the user, as it is called automatically by IDAFree.

5.4.2 IDAS solver function

Even if quadrature integration was enabled, the call to the main solver function IDASolve is exactly the same as in §4.5.7. However, in this case the return value flag can also be one of the following:
             IDA_QSRHS_FAIL      The sensitivity quadrature right-hand side function failed in an unrecoverable manner.
             IDA_FIRST_QSRHS_ERR The sensitivity quadrature right-hand side function failed at the first call.
             IDA_REP_QSRHS_ERR   Convergence test failures occurred too many times due to repeated recoverable errors in the quadrature right-hand side function. This flag will also be returned if the quadrature right-hand side function had repeated recoverable errors during the estimation of an initial step size (assuming the sensitivity quadrature variables are included in the error tests).
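Putting the initialization and deallocation calls of §5.4.1 together, a minimal life-cycle sketch might look as follows. It assumes ida_mem was created and IDASensInit already called with Ns sensitivities, and that qtmpl is a hypothetical template vector of quadrature length; neither name comes from the manual:

```c
#include <idas/idas.h>
#include <nvector/nvector_serial.h>

/* Activate sensitivity-dependent quadrature integration (sketch). */
static int setup_quadsens(void *ida_mem, int Ns, N_Vector qtmpl)
{
  int flag, is;

  /* One quadrature vector per sensitivity, initialized to zero
   * (step 24 of the skeleton program). */
  N_Vector *yQS0 = N_VCloneVectorArray(Ns, qtmpl);
  for (is = 0; is < Ns; is++) N_VConst(0.0, yQS0[is]);

  /* NULL rhsQS selects the internal difference-quotient routine. */
  flag = IDAQuadSensInit(ida_mem, NULL, yQS0);

  /* idas copies the initial values internally, so the user array
   * can be released here (sketch assumption kept deliberately simple;
   * an application would typically keep yQS0 for later extraction). */
  N_VDestroyVectorArray(yQS0, Ns);
  return flag;
}

/* After the integration, IDAQuadSensFree(ida_mem) may be called,
 * although IDAFree releases this memory automatically. */
```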
5.4.3 Sensitivity-dependent quadrature extraction functions

If sensitivity-dependent quadratures have been initialized by a call to IDAQuadSensInit, or reinitialized by a call to IDAQuadSensReInit, then idas computes a solution, sensitivities, and quadratures depending on sensitivities at time t. However, IDASolve will still return only the solutions y and ẏ. Sensitivity-dependent quadratures can be obtained using one of the following functions:

IDAGetQuadSens

Call         flag = IDAGetQuadSens(ida_mem, &tret, yQS);
Description  The function IDAGetQuadSens returns the quadrature sensitivity solution vectors after a successful return from IDASolve.
Arguments    ida_mem (void *) pointer to the memory previously allocated by IDAInit.
             tret    (realtype) the time reached by the solver (output).
             yQS     (N_Vector *) array of Ns computed sensitivity-dependent quadrature vectors.
Return value The return value flag of IDAGetQuadSens is one of:
             IDA_SUCCESS     IDAGetQuadSens was successful.
             IDA_MEM_NULL    ida_mem was NULL.
             IDA_NO_SENS     Sensitivities were not activated.
             IDA_NO_QUADSENS Quadratures depending on the sensitivities were not activated.
             IDA_BAD_DKY     yQS or one of the yQS[i] is NULL.

The function IDAGetQuadSensDky computes the k-th derivatives of the interpolating polynomials for the sensitivity-dependent quadrature variables at time t. This function is called by IDAGetQuadSens with k = 0, but may also be called directly by the user.

IDAGetQuadSensDky

Call         flag = IDAGetQuadSensDky(ida_mem, t, k, dkyQS);
Description  The function IDAGetQuadSensDky returns derivatives of the quadrature sensitivity solution vectors after a successful return from IDASolve.
Arguments    ida_mem (void *) pointer to the memory previously allocated by IDAInit.
             t       (realtype) the time at which information is requested. The time t must fall within the interval defined by the last successful step taken by idas.
             k       (int) order of the requested derivative.
             dkyQS   (N_Vector *) array of Ns vectors containing the derivatives. This vector array must be allocated by the user.
Return value The return value flag of IDAGetQuadSensDky is one of:
             IDA_SUCCESS     IDAGetQuadSensDky succeeded.
             IDA_MEM_NULL    ida_mem was NULL.
             IDA_NO_SENS     Sensitivities were not activated.
             IDA_NO_QUADSENS Quadratures depending on the sensitivities were not activated.
             IDA_BAD_DKY     dkyQS or one of the vectors dkyQS[i] is NULL.
             IDA_BAD_K       k is not in the range 0, 1, ..., klast.
             IDA_BAD_T       The time t is not in the allowed range.

Quadrature sensitivity solution vectors can also be extracted separately for each parameter in turn through the functions IDAGetQuadSens1 and IDAGetQuadSensDky1, defined as follows:

IDAGetQuadSens1

Call         flag = IDAGetQuadSens1(ida_mem, &tret, is, yQS);
Description  The function IDAGetQuadSens1 returns the is-th sensitivity of the quadratures after a successful return from IDASolve.
Arguments    ida_mem (void *) pointer to the memory previously allocated by IDAInit.
             tret    (realtype) the time reached by the solver (output).
             is      (int) specifies which sensitivity vector is to be returned (0 ≤ is < Ns).
             yQS     (N_Vector) the computed sensitivity-dependent quadrature vector.
Return value The return value flag of IDAGetQuadSens1 is one of:
             IDA_SUCCESS     IDAGetQuadSens1 was successful.
             IDA_MEM_NULL    ida_mem was NULL.
             IDA_NO_SENS     Forward sensitivity analysis was not initialized.
             IDA_NO_QUADSENS Quadratures depending on the sensitivities were not activated.
             IDA_BAD_IS      The index is is not in the allowed range.
             IDA_BAD_DKY     yQS is NULL.

IDAGetQuadSensDky1

Call         flag = IDAGetQuadSensDky1(ida_mem, t, k, is, dkyQS);
Description  The function IDAGetQuadSensDky1 returns the k-th derivative of the is-th sensitivity solution vector after a successful return from IDASolve.
Arguments    ida_mem (void *) pointer to the memory previously allocated by IDAInit.
             t     (realtype) specifies the time at which sensitivity information is requested. The time t must fall within the interval defined by the last successful step taken by idas.
             k     (int) order of the requested derivative.
             is    (int) specifies the sensitivity derivative vector to be returned (0 ≤ is < Ns).
             dkyQS (N_Vector) the vector containing the derivative. The space for dkyQS must be allocated by the user.
Return value The return value flag of IDAGetQuadSensDky1 is one of:
             IDA_SUCCESS     IDAGetQuadSensDky1 succeeded.
             IDA_MEM_NULL    ida_mem was NULL.
             IDA_NO_SENS     Forward sensitivity analysis was not initialized.
             IDA_NO_QUADSENS Quadratures depending on the sensitivities were not activated.
             IDA_BAD_DKY     dkyQS is NULL.
             IDA_BAD_IS      The index is is not in the allowed range.
             IDA_BAD_K       k is not in the range 0, 1, ..., klast.
             IDA_BAD_T       The time t is not in the allowed range.

5.4.4 Optional inputs for sensitivity-dependent quadrature integration

idas provides the following optional input functions to control the integration of sensitivity-dependent quadrature equations.

IDASetQuadSensErrCon

Call         flag = IDASetQuadSensErrCon(ida_mem, errconQS);
Description  The function IDASetQuadSensErrCon specifies whether or not the quadrature variables are to be used in the local error control mechanism. If they are, the user must specify the error tolerances for the quadrature variables by calling IDAQuadSensSStolerances, IDAQuadSensSVtolerances, or IDAQuadSensEEtolerances.
Arguments    ida_mem  (void *) pointer to the idas memory block.
             errconQS (booleantype) specifies whether sensitivity quadrature variables are included (SUNTRUE) or not (SUNFALSE) in the error control mechanism.
Return value The return value flag (of type int) is one of:
             IDA_SUCCESS  The optional value has been successfully set.
             IDA_MEM_NULL The ida_mem pointer is NULL.
             IDA_NO_SENS  Sensitivities were not activated.
             IDA_NO_QUADSENS Quadratures depending on the sensitivities were not activated.
Notes        By default, errconQS is set to SUNFALSE. It is illegal to call IDASetQuadSensErrCon before a call to IDAQuadSensInit.

If the quadrature variables are part of the step size control mechanism, one of the following functions must be called to specify the integration tolerances for the quadrature variables.

IDAQuadSensSStolerances

Call         flag = IDAQuadSensSStolerances(ida_mem, reltolQS, abstolQS);
Description  The function IDAQuadSensSStolerances specifies scalar relative and absolute tolerances.
Arguments    ida_mem  (void *) pointer to the idas memory block.
             reltolQS (realtype) is the scalar relative error tolerance.
             abstolQS (realtype*) is a pointer to an array containing the Ns scalar absolute error tolerances.
Return value The return value flag (of type int) is one of:
             IDA_SUCCESS     The optional value has been successfully set.
             IDA_MEM_NULL    The ida_mem pointer is NULL.
             IDA_NO_SENS     Sensitivities were not activated.
             IDA_NO_QUADSENS Quadratures depending on the sensitivities were not activated.
             IDA_ILL_INPUT   One of the input tolerances was negative.

IDAQuadSensSVtolerances

Call         flag = IDAQuadSensSVtolerances(ida_mem, reltolQS, abstolQS);
Description  The function IDAQuadSensSVtolerances specifies scalar relative and vector absolute tolerances.
Arguments    ida_mem  (void *) pointer to the idas memory block.
             reltolQS (realtype) is the scalar relative error tolerance.
             abstolQS (N_Vector*) is an array of Ns variables of type N_Vector. The N_Vector abstolQS[is] specifies the vector tolerances for the is-th quadrature sensitivity.
Return value The return value flag (of type int) is one of:
             IDA_SUCCESS     The optional value has been successfully set.
             IDA_NO_QUAD     Quadrature integration was not initialized.
             IDA_MEM_NULL    The ida_mem pointer is NULL.
             IDA_NO_SENS     Sensitivities were not activated.
             IDA_NO_QUADSENS Quadratures depending on the sensitivities were not activated.
             IDA_ILL_INPUT   One of the input tolerances was negative.

IDAQuadSensEEtolerances

Call         flag = IDAQuadSensEEtolerances(ida_mem);
Description  The function IDAQuadSensEEtolerances specifies that the tolerances for the sensitivity-dependent quadratures should be estimated from those provided for the pure quadrature variables.
Arguments    ida_mem (void *) pointer to the idas memory block.
Return value The return value flag (of type int) is one of:
             IDA_SUCCESS     The optional value has been successfully set.
             IDA_MEM_NULL    The ida_mem pointer is NULL.
             IDA_NO_SENS     Sensitivities were not activated.
             IDA_NO_QUADSENS Quadratures depending on the sensitivities were not activated.
Notes        When IDAQuadSensEEtolerances is used, before calling IDASolve, integration of the pure quadratures must be initialized (see §4.7.1) and tolerances for the pure quadratures must also be specified (see §4.7.4).

5.4.5 Optional outputs for sensitivity-dependent quadrature integration

idas provides the following functions that can be used to obtain solver performance information related to quadrature integration.

IDAGetQuadSensNumRhsEvals

Call         flag = IDAGetQuadSensNumRhsEvals(ida_mem, &nrhsQSevals);
Description  The function IDAGetQuadSensNumRhsEvals returns the number of calls made to the user's quadrature right-hand side function.
Arguments    ida_mem     (void *) pointer to the idas memory block.
             nrhsQSevals (long int) number of calls made to the user's rhsQS function.
Return value The return value flag (of type int) is one of:
             IDA_SUCCESS     The optional output value has been successfully set.
             IDA_MEM_NULL    The ida_mem pointer is NULL.
             IDA_NO_QUADSENS Sensitivity-dependent quadrature integration has not been initialized.

IDAGetQuadSensNumErrTestFails

Call         flag = IDAGetQuadSensNumErrTestFails(ida_mem, &nQSetfails);
Description  The function IDAGetQuadSensNumErrTestFails returns the number of local error test failures due to quadrature variables.
Arguments    ida_mem    (void *) pointer to the idas memory block.
             nQSetfails (long int) number of error test failures due to quadrature variables.
Return value The return value flag (of type int) is one of:
             IDA_SUCCESS     The optional output value has been successfully set.
             IDA_MEM_NULL    The ida_mem pointer is NULL.
             IDA_NO_QUADSENS Sensitivity-dependent quadrature integration has not been initialized.

IDAGetQuadSensErrWeights

Call         flag = IDAGetQuadSensErrWeights(ida_mem, eQSweight);
Description  The function IDAGetQuadSensErrWeights returns the quadrature error weights at the current time.
Arguments    ida_mem   (void *) pointer to the idas memory block.
             eQSweight (N_Vector *) array of quadrature error weight vectors at the current time.
Return value The return value flag (of type int) is one of:
             IDA_SUCCESS     The optional output value has been successfully set.
             IDA_MEM_NULL    The ida_mem pointer is NULL.
             IDA_NO_QUADSENS Sensitivity-dependent quadrature integration has not been initialized.
Notes        The user must allocate memory for eQSweight. If quadratures were not included in the error control mechanism (through a call to IDASetQuadSensErrCon with errconQS=SUNTRUE), IDAGetQuadSensErrWeights does not set the eQSweight vectors.

IDAGetQuadSensStats

Call         flag = IDAGetQuadSensStats(ida_mem, &nrhsQSevals, &nQSetfails);
Description  The function IDAGetQuadSensStats returns the quadrature sensitivity integrator statistics as a group.
Arguments    ida_mem     (void *) pointer to the idas memory block.
             nrhsQSevals (long int) number of calls to the user's rhsQS function.
             nQSetfails  (long int) number of error test failures due to quadrature variables.
Return value The return value flag (of type int) is one of:
             IDA_SUCCESS     The optional output values have been successfully set.
             IDA_MEM_NULL    The ida_mem pointer is NULL.
             IDA_NO_QUADSENS Sensitivity-dependent quadrature integration has not been initialized.
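The optional-input and optional-output functions of §5.4.4 and §5.4.5 might be combined as in the following sketch, which includes the sensitivity quadratures in the error test with scalar tolerances and reports statistics after the integration. The tolerance values and helper names are illustrative assumptions; ida_mem and Ns are assumed set up as in §5.4.1:

```c
#include <stdio.h>
#include <stdlib.h>
#include <idas/idas.h>

/* Enable quad-sens error control with scalar tolerances (sketch). */
static int quadsens_err_control(void *ida_mem, int Ns)
{
  int flag, is;
  realtype *abstolQS = malloc(Ns * sizeof(realtype));

  for (is = 0; is < Ns; is++) abstolQS[is] = 1.0e-6;  /* hypothetical */

  flag = IDASetQuadSensErrCon(ida_mem, SUNTRUE);
  if (flag == IDA_SUCCESS)
    flag = IDAQuadSensSStolerances(ida_mem, 1.0e-4, abstolQS);

  free(abstolQS);  /* idas copies the tolerances internally */
  return flag;
}

/* Report quad-sens statistics after IDASolve (sketch). */
static void quadsens_report(void *ida_mem)
{
  long int nrhsQSevals, nQSetfails;
  if (IDAGetQuadSensStats(ida_mem, &nrhsQSevals, &nQSetfails) == IDA_SUCCESS)
    printf("quad-sens: %ld rhsQS evals, %ld error test failures\n",
           nrhsQSevals, nQSetfails);
}
```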
5.4.6 User-supplied function for sensitivity-dependent quadrature integration

For the integration of sensitivity-dependent quadrature equations, the user must provide a function that defines the right-hand side of the sensitivity quadrature equations. For sensitivities of quadratures (2.10) with integrands q, the appropriate right-hand side functions are given by q̄_i = (∂q/∂y)s_i + (∂q/∂ẏ)ṡ_i + ∂q/∂p_i. This user function must be of type IDAQuadSensRhsFn, defined as follows:

IDAQuadSensRhsFn

Definition   typedef int (*IDAQuadSensRhsFn)(int Ns, realtype t, N_Vector yy, N_Vector yp,
                                             N_Vector *yyS, N_Vector *ypS, N_Vector rrQ,
                                             N_Vector *rhsvalQS, void *user_data,
                                             N_Vector tmp1, N_Vector tmp2, N_Vector tmp3);
Purpose      This function computes the sensitivity quadrature equation right-hand side for a given value of the independent variable t and state vector y.
Arguments    Ns        is the number of sensitivity vectors.
             t         is the current value of the independent variable.
             yy        is the current value of the dependent variable vector, y(t).
             yp        is the current value of the dependent variable derivative vector, ẏ(t).
             yyS       is an array of Ns variables of type N_Vector containing the dependent sensitivity vectors s_i.
             ypS       is an array of Ns variables of type N_Vector containing the dependent sensitivity derivatives ṡ_i.
             rrQ       is the current value of the quadrature right-hand side q.
             rhsvalQS  contains the Ns output vectors.
             user_data is the user data pointer passed to IDASetUserData.
             tmp1, tmp2, tmp3 are N_Vectors which can be used as temporary storage.
Return value An IDAQuadSensRhsFn should return 0 if successful, a positive value if a recoverable error occurred (in which case idas will attempt to correct), or a negative value if it failed unrecoverably (in which case the integration is halted and IDA_QRHS_FAIL is returned).
Notes        Allocation of memory for rhsvalQS is automatically handled within idas.
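A user-supplied function of this type might look like the following sketch, written for the hypothetical scalar integrand q(t, y) = y² with no direct parameter dependence, so that q̄_i = (∂q/∂y)s_i = 2·y·s_i for each sensitivity. The integrand and the serial nvector accessors are assumptions of this sketch:

```c
#include <idas/idas.h>
#include <nvector/nvector_serial.h>  /* NV_Ith_S accessor macro */

/* IDAQuadSensRhsFn sketch for q(t,y) = y^2: qbar_i = 2*y*s_i. */
static int quadsens_rhs(int Ns, realtype t, N_Vector yy, N_Vector yp,
                        N_Vector *yyS, N_Vector *ypS, N_Vector rrQ,
                        N_Vector *rhsvalQS, void *user_data,
                        N_Vector tmp1, N_Vector tmp2, N_Vector tmp3)
{
  realtype y = NV_Ith_S(yy, 0);
  int i;

  for (i = 0; i < Ns; i++)
    NV_Ith_S(rhsvalQS[i], 0) = 2.0 * y * NV_Ith_S(yyS[i], 0);

  return 0;  /* 0 = success; >0 recoverable; <0 unrecoverable */
}
```

Such a function would be passed as the rhsQS argument of IDAQuadSensInit (§5.4.1) in place of the internal difference-quotient approximation.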
             Both yy and yp are of type N_Vector, and both yyS and ypS are pointers to arrays containing Ns vectors of type N_Vector. It is the user's responsibility to access the vector data consistently (including the use of the correct accessor macros from each nvector implementation). For the sake of computational efficiency, the vector functions in the two nvector implementations provided with idas do not perform any consistency checks with respect to their N_Vector arguments (see §7.2 and §7.3).

             There is one situation in which recovery is not possible even if the IDAQuadSensRhsFn returns a recoverable error flag: when the failure occurs at the very first call to the IDAQuadSensRhsFn, in which case idas returns IDA_FIRST_QSRHS_ERR.

5.5 Note on using partial error control

For some problems, when sensitivities are excluded from the error control test, the behavior of idas may appear at first glance to be erroneous. One would expect that, in such cases, the sensitivity variables would not influence the step size selection in any way.

The short explanation of this behavior is that the step size selection implemented by the error control mechanism in idas is based on the magnitude of the correction calculated by the nonlinear solver. As mentioned in §5.2.1, even with partial error control selected in the call to IDASensInit, the sensitivity variables are included in the convergence tests of the nonlinear solver.

When using the simultaneous corrector method (§2.5), the nonlinear system that is solved at each step involves both the state and sensitivity equations. In this case, it is easy to see how the sensitivity variables may affect the convergence rate of the nonlinear solver and therefore the step size selection.

The case of the staggered corrector approach is more subtle. The sensitivity variables at a given step are computed only once the solver for the nonlinear state equations has converged.
However, if the nonlinear system corresponding to the sensitivity equations has convergence problems, idas will attempt to improve the initial guess by reducing the step size in order to provide a better prediction of the sensitivity variables. Moreover, even if there are no convergence failures in the solution of the sensitivity system, idas may trigger a call to the linear solver’s setup routine which typically involves reevaluation of Jacobian information (Jacobian approximation in the case of idadense and idaband, or preconditioner data in the case of the Krylov solvers). The new Jacobian information will be used by subsequent calls to the nonlinear solver for the state equations and, in this way, potentially affect the step size selection. When using the simultaneous corrector method it is not possible to decide whether nonlinear solver convergence failures or calls to the linear solver setup routine have been triggered by convergence problems due to the state or the sensitivity equations. When using one of the staggered corrector methods, however, these situations can be identified by carefully monitoring the diagnostic information provided through optional outputs. If there are no convergence failures in the sensitivity nonlinear solver, and none of the calls to the linear solver setup routine were made by the sensitivity nonlinear solver, then the step size selection is not affected by the sensitivity variables. Finally, the user must be warned that the effect of appending sensitivity equations to a given system of DAEs on the step size selection (through the mechanisms described above) is problem-dependent and can therefore lead to either an increase or decrease of the total number of steps that idas takes to complete the simulation. At first glance, one would expect that the impact of the sensitivity variables, if any, would be in the direction of increasing the step size and therefore reducing the total number of steps. 
The argument for this is that the presence of the sensitivity variables in the convergence test of the nonlinear solver can only lead to additional iterations (and therefore a smaller iteration error), or to additional calls to the linear solver setup routine (and therefore more up-to-date Jacobian information), both of which will lead to larger steps being taken by idas. However, this is true only locally. Overall, a larger integration step taken at a given time may lead to step size reductions at later times, due to either nonlinear solver convergence failures or error test failures.

Chapter 6: Using IDAS for Adjoint Sensitivity Analysis

This chapter describes the use of idas to compute sensitivities of derived functions using adjoint sensitivity analysis. As mentioned before, the adjoint sensitivity module of idas provides the infrastructure for integrating backward in time any system of DAEs that depends on the solution of the original IVP, by providing various interfaces to the main idas integrator, as well as several supporting user-callable functions. For this reason, in the following sections we refer to the backward problem, and not to the adjoint problem, when discussing details relevant to the DAEs that are integrated backward in time. The backward problem can be the adjoint problem (2.20) or (2.25), and can be augmented with some quadrature differential equations.

idas uses various constants for both input and output. These are defined as needed in this chapter, but for convenience are also listed separately in Appendix B.

We begin with a brief overview, in the form of a skeleton user program. Following that are detailed descriptions of the interface to the various user-callable functions and of the user-supplied functions that were not already described in Chapter 4.

6.1 A skeleton of the user's main program

The following is a skeleton of the user's main program as an application of idas.
The user program is to have these steps in the order indicated, unless otherwise noted. For the sake of brevity, we defer many of the details to the later sections. As in §4.4, most steps are independent of the nvector, sunmatrix, sunlinsol, and sunnonlinsol implementations used. For the steps that are not, refer to Chapters 7, 8, 9, and 10 for the specific name of the function to be called or macro to be referenced. Steps that are unchanged from the skeleton programs presented in §4.4, §5.1, and §5.4 are grayed out.

1. Include necessary header files
   The idas.h header file also defines additional types, constants, and function prototypes for the adjoint sensitivity module user-callable functions. In addition, the main program should include an nvector implementation header file (for the particular implementation used) and, if a nonlinear solver requiring a linear solver (e.g., the default Newton iteration) will be used, the header file of the desired linear solver module.
2. Initialize parallel or multi-threaded environment

Forward problem

3. Set problem dimensions, etc. for the forward problem
4. Set initial conditions for the forward problem
5. Create idas object for the forward problem
6. Initialize idas solver for the forward problem
7. Specify integration tolerances for the forward problem
8. Set optional inputs for the forward problem
9. Create matrix object for the forward problem
10. Create linear solver object for the forward problem
11. Set linear solver optional inputs for the forward problem
12. Attach linear solver module for the forward problem
13. Create nonlinear solver module for the forward problem
14. Attach nonlinear solver module for the forward problem
15. Set nonlinear solver optional inputs for the forward problem
16. Initialize quadrature problem or problems for the forward problem, using IDAQuadInit and/or IDAQuadSensInit
17. Initialize forward sensitivity problem
18. Specify rootfinding
19. Allocate space for the adjoint computation
    Call IDAAdjInit() to allocate memory for the combined forward-backward problem (see §6.2.1 for details). This call requires Nd, the number of steps between two consecutive checkpoints. IDAAdjInit also specifies the type of interpolation used (see §2.6.3).
20. Integrate forward problem
    Call IDASolveF, a wrapper for the idas main integration function IDASolve, either in IDA_NORMAL mode to the time tout or in IDA_ONE_STEP mode inside a loop (if intermediate solutions of the forward problem are desired; see §6.2.3). The final value of tret is then the maximum allowable value for the endpoint T of the backward problem.

Backward problem(s)

21. Set problem dimensions, etc. for the backward problem
    This generally includes NB, the number of variables in the backward problem, and possibly the local vector length NBlocal.
22. Set initial values for the backward problem
    Set the endpoint time tB0 = T, and set the corresponding vectors yB0 and ypB0 at which the backward problem starts.
23. Create the backward problem
    Call IDACreateB, a wrapper for IDACreate, to create the idas memory block for the new backward problem. Unlike IDACreate, the function IDACreateB does not return a pointer to the newly created memory block (see §6.2.4). Instead, this pointer is attached to the internal adjoint memory block (created by IDAAdjInit), and IDACreateB returns an identifier called which that the user must later specify in any actions on the newly created backward problem.
24. Allocate memory for the backward problem
    Call IDAInitB (or IDAInitBS, when the backward problem depends on the forward sensitivities). The two functions are actually wrappers for IDAInit and allocate internal memory, specify problem data, and initialize idas at tB0 for the backward problem (see §6.2.4).
25. Specify integration tolerances for backward problem
    Call IDASStolerancesB(...) or IDASVtolerancesB(...)
to specify a scalar relative tolerance and scalar absolute tolerance, or a scalar relative tolerance and a vector of absolute tolerances, respectively. These functions are wrappers for IDASStolerances(...) and IDASVtolerances(...), but they require an extra argument which, the identifier of the backward problem returned by IDACreateB. See §6.2.5 for more information.

26. Set optional inputs for the backward problem
Call IDASet*B functions to change from their default values any optional inputs that control the behavior of idas. Unlike their counterparts for the forward problem, these functions take an extra argument which, the identifier of the backward problem returned by IDACreateB (see §6.2.9).

27. Create matrix object for the backward problem
If a nonlinear solver requiring a linear solver will be used (e.g., the default Newton iteration) and the linear solver will be a direct linear solver, then a template Jacobian matrix must be created by calling the appropriate constructor function defined by the particular sunmatrix implementation.
NOTE: The dense, banded, and sparse matrix objects are usable only in a serial or threaded environment. Note also that it is not required to use the same matrix type for both the forward and the backward problems.

28. Create linear solver object for the backward problem
If a nonlinear solver requiring a linear solver is chosen (e.g., the default Newton iteration), then the desired linear solver object for the backward problem must be created by calling the appropriate constructor function defined by the particular sunlinsol implementation. Note that it is not required to use the same linear solver module for both the forward and the backward problems; for example, the forward problem could be solved with the sunlinsol dense linear solver module and the backward problem with the sunlinsol spgmr linear solver module.

29.
Set linear solver interface optional inputs for the backward problem
Call IDASet*B functions to change optional inputs specific to the linear solver interface. See §6.2.9 for details.

30. Attach linear solver module for the backward problem
If a nonlinear solver requiring a linear solver is chosen for the backward problem (e.g., the default Newton iteration), then initialize the idals linear solver interface by attaching the linear solver object (and matrix object, if applicable) with the following call (for details see §4.5.3):

ier = IDASetLinearSolverB(...);

31. Create nonlinear solver object for the backward problem (optional)
If using a non-default nonlinear solver for the backward problem, then create the desired nonlinear solver object by calling the appropriate constructor function defined by the particular sunnonlinsol implementation, e.g.,

NLSB = SUNNonlinSol_***(...);

where *** is the name of the nonlinear solver (see Chapter 10 for details).

32. Attach nonlinear solver module for the backward problem (optional)
If using a non-default nonlinear solver for the backward problem, then initialize the nonlinear solver interface by attaching the nonlinear solver object by calling

ier = IDASetNonlinearSolverB(ida_mem, which, NLSB);

(see §4.5.4 for details).

33. Initialize quadrature calculation
If additional quadrature equations must be evaluated, call IDAQuadInitB or IDAQuadInitBS (if the quadrature depends also on the forward sensitivities) as shown in §6.2.11.1. These functions are wrappers around IDAQuadInit and can be used to initialize and allocate memory for quadrature integration. Optionally, call IDASetQuad*B functions to change from their default values optional inputs that control the integration of quadratures during the backward phase.

34. Integrate backward problem
Call IDASolveB, a second wrapper around the idas main integration function IDASolve, to integrate the backward problem from tB0 (see §6.2.8).
This function can be called either in IDA_NORMAL or IDA_ONE_STEP mode. Typically, IDASolveB will be called in IDA_NORMAL mode with an end time equal to the initial time t0 of the forward problem.

35. Extract quadrature variables
If applicable, call IDAGetQuadB, a wrapper around IDAGetQuad, to extract the values of the quadrature variables at the time returned by the last call to IDASolveB. See §6.2.11.2.

36. Deallocate memory
Upon completion of the backward integration, call all necessary deallocation functions. These include appropriate destructors for the vectors y and yB and a call to IDAFree to free the idas memory block for the forward problem. If one or more additional adjoint sensitivity analyses are to be done for this problem, a call to IDAAdjFree (see §6.2.1) may be made to free and deallocate the memory allocated for the backward problems, followed by a call to IDAAdjInit.

37. Free the nonlinear solver memory for the forward and backward problems

38. Free linear solver and matrix memory for the forward and backward problems

39. Finalize MPI, if used

The above user interface to the adjoint sensitivity module in idas was motivated by the desire to keep it as close as possible in look and feel to the one for DAE IVP integration. Note that if steps (21)-(35) are not present, a program with the above structure will have the same functionality as one described in §4.4 for integration of DAEs, albeit with some overhead due to the checkpointing scheme. If there are multiple backward problems associated with the same forward problem, repeat steps (21)-(35) above for each successive backward problem. In the process, each call to IDACreateB creates a new value of the identifier which.
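The skeleton above can be sketched in C roughly as follows. This is an illustrative outline only, not a complete program: the residual functions res and resB, the vectors yy, yp, yB, ypB, and the values T0, TOUT, TB0, Nd, and the tolerances are placeholders, and all error checking and solver attachment is elided.

```c
/* Sketch of an adjoint sensitivity main program (illustrative only;
   res, resB, yy, yp, yB, ypB, and all numeric values are placeholders). */
void *ida_mem;
int which, ncheck, flag;
realtype tret;

/* Forward problem setup (steps 3-18) */
ida_mem = IDACreate();
flag = IDAInit(ida_mem, res, T0, yy, yp);
flag = IDASStolerances(ida_mem, reltol, abstol);
/* ... attach linear/nonlinear solvers for the forward problem ... */

/* Step 19: allocate adjoint memory (Nd steps per checkpoint) */
flag = IDAAdjInit(ida_mem, Nd, IDA_POLYNOMIAL);

/* Step 20: forward integration with checkpointing */
flag = IDASolveF(ida_mem, TOUT, &tret, yy, yp, IDA_NORMAL, &ncheck);

/* Steps 23-25: create and initialize the backward problem */
flag = IDACreateB(ida_mem, &which);
flag = IDAInitB(ida_mem, which, resB, TB0, yB, ypB);
flag = IDASStolerancesB(ida_mem, which, reltolB, abstolB);
/* ... attach linear/nonlinear solvers for the backward problem ... */

/* Step 34: backward integration to the forward initial time */
flag = IDASolveB(ida_mem, T0, IDA_NORMAL);
flag = IDAGetB(ida_mem, which, &tret, yB, ypB);

/* Step 36: deallocate (IDAFree also frees the adjoint memory) */
IDAFree(&ida_mem);
```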
6.2 User-callable functions for adjoint sensitivity analysis

6.2.1 Adjoint sensitivity allocation and deallocation functions

After the setup phase for the forward problem, but before the call to IDASolveF, memory for the combined forward-backward problem must be allocated by a call to the function IDAAdjInit. The form of the call to this function is

IDAAdjInit

Call          flag = IDAAdjInit(ida_mem, Nd, interpType);
Description   The function IDAAdjInit updates the idas memory block by allocating the internal memory needed for backward integration. Space is allocated for the Nd interpolation data points, and a linked list of checkpoints is initialized.
Arguments     ida_mem (void *) is the pointer to the idas memory block returned by a previous call to IDACreate.
              Nd (long int) is the number of integration steps between two consecutive checkpoints.
              interpType (int) specifies the type of interpolation used and can be IDA_POLYNOMIAL or IDA_HERMITE, indicating variable-degree polynomial and cubic Hermite interpolation, respectively (see §2.6.3).
Return value  The return value flag (of type int) is one of:
              IDA_SUCCESS    IDAAdjInit was successful.
              IDA_MEM_FAIL   A memory allocation request has failed.
              IDA_MEM_NULL   ida_mem was NULL.
              IDA_ILL_INPUT  One of the parameters was invalid: Nd was not positive or interpType is not one of IDA_POLYNOMIAL or IDA_HERMITE.
Notes         The user must set Nd so that all data needed for interpolation of the forward problem solution between two checkpoints fits in memory. IDAAdjInit attempts to allocate space for (2Nd+3) variables of type N_Vector. If an error occurred, IDAAdjInit also sends a message to the error handler function.

IDAAdjReInit

Call          flag = IDAAdjReInit(ida_mem);
Description   The function IDAAdjReInit reinitializes the idas memory block for ASA, assuming that the number of steps between checkpoints and the type of interpolation remain unchanged.
Arguments     ida_mem (void *) is the pointer to the idas memory block returned by a previous call to IDACreate.
Return value  The return value flag (of type int) is one of:
              IDA_SUCCESS   IDAAdjReInit was successful.
              IDA_MEM_NULL  ida_mem was NULL.
              IDA_NO_ADJ    The function IDAAdjInit was not previously called.
Notes         The list of checkpoints (and associated memory) is deleted. The list of backward problems is kept. However, new backward problems can be added to this list by calling IDACreateB. If a new list of backward problems is also needed, then free the adjoint memory (by calling IDAAdjFree) and reinitialize ASA with IDAAdjInit. The idas memory for the forward and backward problems can be reinitialized separately by calling IDAReInit and IDAReInitB, respectively.

IDAAdjFree

Call          IDAAdjFree(ida_mem);
Description   The function IDAAdjFree frees the memory related to backward integration allocated by a previous call to IDAAdjInit.
Arguments     The only argument is the idas memory block pointer returned by a previous call to IDACreate.
Return value  The function IDAAdjFree has no return value.
Notes         This function frees all memory allocated by IDAAdjInit. This includes workspace memory, the linked list of checkpoints, memory for the interpolation data, as well as the idas memory for the backward integration phase. Unless one or more further calls to IDAAdjInit are to be made, IDAAdjFree should not be called by the user, as it is invoked automatically by IDAFree.

6.2.2 Adjoint sensitivity optional input

At any time during the integration of the forward problem, the user can disable the checkpointing of the forward sensitivities by calling the following function:

IDAAdjSetNoSensi

Call          flag = IDAAdjSetNoSensi(ida_mem);
Description   The function IDAAdjSetNoSensi instructs IDASolveF not to save checkpointing data for forward sensitivities any more.
Arguments     ida_mem (void *) pointer to the idas memory block.
Return value  The return flag (of type int) is one of:
              IDA_SUCCESS   The call to IDAAdjSetNoSensi was successful.
              IDA_MEM_NULL  ida_mem was NULL.
              IDA_NO_ADJ    The function IDAAdjInit has not been previously called.

6.2.3 Forward integration function

The function IDASolveF is very similar to the idas function IDASolve (see §4.5.7) in that it integrates the solution of the forward problem and returns the solution (y, ẏ). At the same time, however, IDASolveF stores checkpoint data every Nd integration steps. IDASolveF can be called repeatedly by the user. Note that IDASolveF is used only for the forward integration pass within an adjoint sensitivity analysis. It is not for use in forward sensitivity analysis; for that, see Chapter 5. The call to this function has the form

IDASolveF

Call          flag = IDASolveF(ida_mem, tout, &tret, yret, ypret, itask, &ncheck);
Description   The function IDASolveF integrates the forward problem over an interval in t and saves checkpointing data.
Arguments     ida_mem (void *) pointer to the idas memory block.
              tout (realtype) the next time at which a computed solution is desired.
              tret (realtype) the time reached by the solver (output).
              yret (N_Vector) the computed solution vector y.
              ypret (N_Vector) the computed solution vector ẏ.
              itask (int) a flag indicating the job of the solver for the next step. The IDA_NORMAL task is to have the solver take internal steps until it has reached or just passed the user-specified tout parameter. The solver then interpolates in order to return an approximate value of y(tout) and ẏ(tout). The IDA_ONE_STEP option tells the solver to take just one internal step and return the solution at the point reached by that step.
              ncheck (int) the number of (internal) checkpoints stored so far (output).
Return value  On return, IDASolveF returns vectors yret, ypret and a corresponding independent variable value t = tret, such that yret is the computed value of y(t) and ypret the value of ẏ(t).
Additionally, it returns in ncheck the number of internal checkpoints saved; the total number of checkpoint intervals is ncheck+1. The return value flag (of type int) will be one of the following. For more details see §4.5.7.
              IDA_SUCCESS        IDASolveF succeeded.
              IDA_TSTOP_RETURN   IDASolveF succeeded by reaching the optional stopping point.
              IDA_ROOT_RETURN    IDASolveF succeeded and found one or more roots. In this case, tret is the location of the root. If nrtfn > 1, call IDAGetRootInfo to see which gi were found to have a root.
              IDA_NO_MALLOC      The function IDAInit has not been previously called.
              IDA_ILL_INPUT      One of the inputs to IDASolveF is illegal.
              IDA_TOO_MUCH_WORK  The solver took mxstep internal steps but could not reach tout.
              IDA_TOO_MUCH_ACC   The solver could not satisfy the accuracy demanded by the user for some internal step.
              IDA_ERR_FAILURE    Error test failures occurred too many times during one internal time step or occurred with |h| = hmin.
              IDA_CONV_FAILURE   Convergence test failures occurred too many times during one internal time step or occurred with |h| = hmin.
              IDA_LSETUP_FAIL    The linear solver's setup function failed in an unrecoverable manner.
              IDA_LSOLVE_FAIL    The linear solver's solve function failed in an unrecoverable manner.
              IDA_NO_ADJ         The function IDAAdjInit has not been previously called.
              IDA_MEM_FAIL       A memory allocation request has failed (in an attempt to allocate space for a new checkpoint).
Notes         All failure return values are negative, and therefore a test flag < 0 will trap all IDASolveF failures. At this time, IDASolveF stores checkpoint information in memory only. Future versions will provide for a safeguard option of dumping checkpoint data into a temporary file as needed.
The data stored at each checkpoint is basically a snapshot of the idas internal memory block and contains enough information to restart the integration from that time and to proceed with the same step size and method order sequence as during the forward integration. In addition, IDASolveF also stores interpolation data between consecutive checkpoints so that, at the end of this first forward integration phase, interpolation information is already available from the last checkpoint forward. In particular, if no checkpoints were necessary, there is no need for the second forward integration phase. It is illegal to change the integration tolerances between consecutive calls to IDASolveF, as this information is not captured in the checkpoint data.

6.2.4 Backward problem initialization functions

The functions IDACreateB and IDAInitB (or IDAInitBS) must be called in the order listed. They instantiate an idas solver object, provide problem and solution specifications, and allocate internal memory for the backward problem.

IDACreateB

Call          flag = IDACreateB(ida_mem, &which);
Description   The function IDACreateB instantiates an idas solver object for the backward problem.
Arguments     ida_mem (void *) pointer to the idas memory block returned by IDACreate.
              which (int) contains the identifier assigned by idas for the newly created backward problem (output). Any call to IDA*B functions requires such an identifier.
Return value  The return flag (of type int) is one of:
              IDA_SUCCESS   The call to IDACreateB was successful.
              IDA_MEM_NULL  ida_mem was NULL.
              IDA_NO_ADJ    The function IDAAdjInit has not been previously called.
              IDA_MEM_FAIL  A memory allocation request has failed.

There are two initialization functions for the backward problem – one for the case when the backward problem does not depend on the forward sensitivities, and one for the case when it does. These two functions are described next.
The function IDAInitB initializes the backward problem when it does not depend on the forward sensitivities. It is essentially a wrapper for IDAInit with some particularization for backward integration, as described below.

IDAInitB

Call          flag = IDAInitB(ida_mem, which, resB, tB0, yB0, ypB0);
Description   The function IDAInitB provides problem specification, allocates internal memory, and initializes the backward problem.
Arguments     ida_mem (void *) pointer to the idas memory block returned by IDACreate.
              which (int) represents the identifier of the backward problem.
              resB (IDAResFnB) is the C function which computes fB, the residual of the backward DAE problem. This function has the form resB(t, y, yp, yB, ypB, resvalB, user_dataB) (for full details see §6.3.1).
              tB0 (realtype) specifies the endpoint T where final conditions are provided for the backward problem, normally equal to the endpoint of the forward integration.
              yB0 (N_Vector) is the initial value (at t = tB0) of the backward solution.
              ypB0 (N_Vector) is the initial derivative value (at t = tB0) of the backward solution.
Return value  The return flag (of type int) will be one of the following:
              IDA_SUCCESS    The call to IDAInitB was successful.
              IDA_NO_MALLOC  The function IDAInit has not been previously called.
              IDA_MEM_NULL   The ida_mem was NULL.
              IDA_NO_ADJ     The function IDAAdjInit has not been previously called.
              IDA_BAD_TB0    The final time tB0 was outside the interval over which the forward problem was solved.
              IDA_ILL_INPUT  The parameter which represented an invalid identifier, or one of yB0, ypB0, resB was NULL.
Notes         The memory allocated by IDAInitB is deallocated by the function IDAAdjFree.

For the case when the backward problem also depends on the forward sensitivities, the user must call IDAInitBS instead of IDAInitB. Only the third argument of each function differs between these functions.
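Creation and initialization of a backward problem can be sketched as follows. This is an illustrative fragment, assuming an existing forward setup: the residual function resB, the final time TB0, the vectors yB and ypB, and the tolerance values are placeholders.

```c
/* Sketch: create and initialize a backward problem (illustrative;
   resB, TB0, yB, ypB, and the tolerances are placeholders). */
int which, flag;

flag = IDACreateB(ida_mem, &which);   /* obtain the identifier `which` */
flag = IDAInitB(ida_mem, which, resB, TB0, yB, ypB);
flag = IDASStolerancesB(ida_mem, which, RCONST(1.0e-6), RCONST(1.0e-8));
/* ... attach a linear solver with IDASetLinearSolverB, etc. ... */
```

All subsequent IDA*B calls for this backward problem take the identifier which returned here.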
IDAInitBS

Call          flag = IDAInitBS(ida_mem, which, resBS, tB0, yB0, ypB0);
Description   The function IDAInitBS provides problem specification, allocates internal memory, and initializes the backward problem.
Arguments     ida_mem (void *) pointer to the idas memory block returned by IDACreate.
              which (int) represents the identifier of the backward problem.
              resBS (IDAResFnBS) is the C function which computes fB, the residual of the backward DAE problem. This function has the form resBS(t, y, yp, yS, ypS, yB, ypB, resvalB, user_dataB) (for full details see §6.3.2).
              tB0 (realtype) specifies the endpoint T where final conditions are provided for the backward problem.
              yB0 (N_Vector) is the initial value (at t = tB0) of the backward solution.
              ypB0 (N_Vector) is the initial derivative value (at t = tB0) of the backward solution.
Return value  The return flag (of type int) will be one of the following:
              IDA_SUCCESS    The call to IDAInitBS was successful.
              IDA_NO_MALLOC  The function IDAInit has not been previously called.
              IDA_MEM_NULL   The ida_mem was NULL.
              IDA_NO_ADJ     The function IDAAdjInit has not been previously called.
              IDA_BAD_TB0    The final time tB0 was outside the interval over which the forward problem was solved.
              IDA_ILL_INPUT  The parameter which represented an invalid identifier, or one of yB0, ypB0, resBS was NULL, or sensitivities were not active during the forward integration.
Notes         The memory allocated by IDAInitBS is deallocated by the function IDAAdjFree.

The function IDAReInitB reinitializes idas for the solution of a series of backward problems, each identified by a value of the parameter which. IDAReInitB is essentially a wrapper for IDAReInit, and so all details given for IDAReInit in §4.5.11 apply here. Also, IDAReInitB can be called to reinitialize a backward problem even if it has been initialized with the sensitivity-dependent version IDAInitBS.
Before calling IDAReInitB for a new backward problem, call any desired solution extraction functions IDAGet** associated with the previous backward problem. The call to the IDAReInitB function has the form

IDAReInitB

Call          flag = IDAReInitB(ida_mem, which, tB0, yB0, ypB0);
Description   The function IDAReInitB reinitializes an idas backward problem.
Arguments     ida_mem (void *) pointer to the idas memory block returned by IDACreate.
              which (int) represents the identifier of the backward problem.
              tB0 (realtype) specifies the endpoint T where final conditions are provided for the backward problem.
              yB0 (N_Vector) is the initial value (at t = tB0) of the backward solution.
              ypB0 (N_Vector) is the initial derivative value (at t = tB0) of the backward solution.
Return value  The return value flag (of type int) will be one of the following:
              IDA_SUCCESS    The call to IDAReInitB was successful.
              IDA_NO_MALLOC  The function IDAInit has not been previously called.
              IDA_MEM_NULL   The ida_mem memory block pointer was NULL.
              IDA_NO_ADJ     The function IDAAdjInit has not been previously called.
              IDA_BAD_TB0    The final time tB0 is outside the interval over which the forward problem was solved.
              IDA_ILL_INPUT  The parameter which represented an invalid identifier, or one of yB0, ypB0 was NULL.

6.2.5 Tolerance specification functions for backward problem

One of the following two functions must be called to specify the integration tolerances for the backward problem. Note that this call must be made after the call to IDAInitB or IDAInitBS.

IDASStolerancesB

Call          flag = IDASStolerancesB(ida_mem, which, reltolB, abstolB);
Description   The function IDASStolerancesB specifies scalar relative and absolute tolerances.
Arguments     ida_mem (void *) pointer to the idas memory block returned by IDACreate.
              which (int) represents the identifier of the backward problem.
              reltolB (realtype) is the scalar relative error tolerance.
              abstolB (realtype) is the scalar absolute error tolerance.
Return value  The return flag (of type int) will be one of the following:
              IDA_SUCCESS    The call to IDASStolerancesB was successful.
              IDA_MEM_NULL   The idas memory block was not initialized through a previous call to IDACreate.
              IDA_NO_MALLOC  The allocation function IDAInit has not been called.
              IDA_NO_ADJ     The function IDAAdjInit has not been previously called.
              IDA_ILL_INPUT  One of the input tolerances was negative.

IDASVtolerancesB

Call          flag = IDASVtolerancesB(ida_mem, which, reltolB, abstolB);
Description   The function IDASVtolerancesB specifies a scalar relative tolerance and vector absolute tolerances.
Arguments     ida_mem (void *) pointer to the idas memory block returned by IDACreate.
              which (int) represents the identifier of the backward problem.
              reltolB (realtype) is the scalar relative error tolerance.
              abstolB (N_Vector) is the vector of absolute error tolerances.
Return value  The return flag (of type int) will be one of the following:
              IDA_SUCCESS    The call to IDASVtolerancesB was successful.
              IDA_MEM_NULL   The idas memory block was not initialized through a previous call to IDACreate.
              IDA_NO_MALLOC  The allocation function IDAInit has not been called.
              IDA_NO_ADJ     The function IDAAdjInit has not been previously called.
              IDA_ILL_INPUT  The relative error tolerance was negative or the absolute tolerance had a negative component.
Notes         This choice of tolerances is important when the absolute error tolerance needs to be different for each component of the DAE state vector y.

6.2.6 Linear solver initialization functions for backward problem

All idas linear solver modules available for forward problems are available for the backward problem. They should be created as for the forward problem and then attached to the memory structure for the backward problem using the following function.
IDASetLinearSolverB

Call          flag = IDASetLinearSolverB(ida_mem, which, LS, A);
Description   The function IDASetLinearSolverB attaches a generic sunlinsol object LS and corresponding template Jacobian sunmatrix object A (if applicable) to idas, initializing the idals linear solver interface for solution of the backward problem.
Arguments     ida_mem (void *) pointer to the idas memory block.
              which (int) represents the identifier of the backward problem returned by IDACreateB.
              LS (SUNLinearSolver) sunlinsol object to use for solving linear systems for the backward problem.
              A (SUNMatrix) sunmatrix object used as a template for the Jacobian for the backward problem (or NULL if not applicable).
Return value  The return value flag (of type int) is one of:
              IDALS_SUCCESS    The idals initialization was successful.
              IDALS_MEM_NULL   The ida_mem pointer is NULL.
              IDALS_ILL_INPUT  The idals interface is not compatible with the LS or A input objects or is incompatible with the current nvector module, or the parameter which represented an invalid identifier.
              IDALS_MEM_FAIL   A memory allocation request failed.
              IDALS_NO_ADJ     The function IDAAdjInit has not been previously called.
Notes         If LS is a matrix-based linear solver, then the template Jacobian matrix A will be used in the solve process, so if additional storage is required within the sunmatrix object (e.g., for factorization of a banded matrix), ensure that the input object is allocated with sufficient size (see the documentation of the particular sunmatrix type in Chapter 8 for further information).
              The previous routines IDADlsSetLinearSolverB and IDASpilsSetLinearSolverB are now wrappers for this routine, and may still be used for backward compatibility. However, these will be deprecated in future releases, so we recommend that users transition to the new routine name soon.
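As an illustration, a dense direct solver could be attached to a backward problem as sketched below. This assumes a serial environment; the backward problem size NB, the vector yB, and the identifier which are placeholders from an existing setup.

```c
/* Sketch: attach a dense matrix and linear solver to a backward
   problem (illustrative; NB, yB, and `which` are assumed to exist). */
SUNMatrix AB = SUNDenseMatrix(NB, NB);          /* template Jacobian */
SUNLinearSolver LSB = SUNLinSol_Dense(yB, AB);  /* dense solver object */
flag = IDASetLinearSolverB(ida_mem, which, LSB, AB);
if (flag != IDALS_SUCCESS) { /* handle error */ }
```

Note that, as stated above, the backward problem need not use the same linear solver module as the forward problem.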
6.2.7 Initial condition calculation functions for backward problem

idas provides support for calculation of consistent initial conditions for certain backward index-one problems of semi-implicit form through the functions IDACalcICB and IDACalcICBS. Calling them is optional; it is only necessary when the initial conditions do not satisfy the adjoint system.

These functions provide the same functionality for backward problems as IDACalcIC with parameter icopt = IDA_YA_YDP_INIT provides for forward problems (see §4.5.5): compute the algebraic components of yB and differential components of ẏB, given the differential components of yB. They require that IDASetIdB was previously called to specify the differential and algebraic components. Both functions require forward solutions at the final time tB0. IDACalcICBS also needs forward sensitivities at the final time tB0.

IDACalcICB

Call          flag = IDACalcICB(ida_mem, which, tBout1, yfin, ypfin);
Description   The function IDACalcICB corrects the initial values yB0 and ypB0 at time tB0 for the backward problem.
Arguments     ida_mem (void *) pointer to the idas memory block.
              which (int) is the identifier of the backward problem.
              tBout1 (realtype) is the first value of t at which a solution will be requested (from IDASolveB). This value is needed here only to determine the direction of integration and rough scale in the independent variable t.
              yfin (N_Vector) the forward solution at the final time tB0.
              ypfin (N_Vector) the forward solution derivative at the final time tB0.
Return value  The return value flag (of type int) can be any that is returned by IDACalcIC (see §4.5.5). However, IDACalcICB can also return one of the following:
              IDA_NO_ADJ     IDAAdjInit has not been previously called.
              IDA_ILL_INPUT  Parameter which represented an invalid identifier.
Notes         All failure return values are negative, and therefore a test flag < 0 will trap all IDACalcICB failures.
Note that IDACalcICB will correct the values of yB(tB0) and ẏB(tB0) which were specified in the previous call to IDAInitB or IDAReInitB. To obtain the corrected values, call IDAGetConsistentICB (see §6.2.10.2).

In the case where the backward problem also depends on the forward sensitivities, the user must call the following function to correct the initial conditions:

IDACalcICBS

Call          flag = IDACalcICBS(ida_mem, which, tBout1, yfin, ypfin, ySfin, ypSfin);
Description   The function IDACalcICBS corrects the initial values yB0 and ypB0 at time tB0 for the backward problem.
Arguments     ida_mem (void *) pointer to the idas memory block.
              which (int) is the identifier of the backward problem.
              tBout1 (realtype) is the first value of t at which a solution will be requested (from IDASolveB). This value is needed here only to determine the direction of integration and rough scale in the independent variable t.
              yfin (N_Vector) the forward solution at the final time tB0.
              ypfin (N_Vector) the forward solution derivative at the final time tB0.
              ySfin (N_Vector *) a pointer to an array of Ns vectors containing the sensitivities of the forward solution at the final time tB0.
              ypSfin (N_Vector *) a pointer to an array of Ns vectors containing the derivatives of the forward solution sensitivities at the final time tB0.
Return value  The return value flag (of type int) can be any that is returned by IDACalcIC (see §4.5.5). However, IDACalcICBS can also return one of the following:
              IDA_NO_ADJ     IDAAdjInit has not been previously called.
              IDA_ILL_INPUT  Parameter which represented an invalid identifier, sensitivities were not active during forward integration, or IDAInitBS (or IDAReInitBS) has not been previously called.
Notes         All failure return values are negative, and therefore a test flag < 0 will trap all IDACalcICBS failures.
Note that IDACalcICBS will correct the values of yB(tB0) and ẏB(tB0) which were specified in the previous call to IDAInitBS or IDAReInitBS. To obtain the corrected values, call IDAGetConsistentICB (see §6.2.10.2).

6.2.8 Backward integration function

The function IDASolveB performs the integration of the backward problem. It is essentially a wrapper for the idas main integration function IDASolve and, in the case in which checkpoints were needed, it evolves the solution of the backward problem through a sequence of forward-backward integration pairs between consecutive checkpoints. In each pair, the first run integrates the original IVP forward in time and stores interpolation data; the second run integrates the backward problem backward in time and performs the required interpolation to provide the solution of the IVP to the backward problem.

The function IDASolveB does not return the solution yB itself. To obtain that, call the function IDAGetB, which is also described below.

The IDASolveB function does not support rootfinding, unlike IDASolveF, which supports the finding of roots of functions of (t, y, ẏ). If rootfinding was performed by IDASolveF, then for the sake of efficiency, it should be disabled for IDASolveB by first calling IDARootInit with nrtfn = 0.

The call to IDASolveB has the form

IDASolveB

Call          flag = IDASolveB(ida_mem, tBout, itaskB);
Description   The function IDASolveB integrates the backward DAE problem.
Arguments     ida_mem (void *) pointer to the idas memory returned by IDACreate.
              tBout (realtype) the next time at which a computed solution is desired.
              itaskB (int) a flag indicating the job of the solver for the next step. The IDA_NORMAL task is to have the solver take internal steps until it has reached or just passed the user-specified value tBout. The solver then interpolates in order to return an approximate value of yB(tBout).
The IDA_ONE_STEP option tells the solver to take just one internal step in the direction of tBout and return.
Return value  The return value flag (of type int) will be one of the following. For more details see §4.5.7.
              IDA_SUCCESS        IDASolveB succeeded.
              IDA_MEM_NULL       The ida_mem was NULL.
              IDA_NO_ADJ         The function IDAAdjInit has not been previously called.
              IDA_NO_BCK         No backward problem has been added to the list of backward problems by a call to IDACreateB.
              IDA_NO_FWD         The function IDASolveF has not been previously called.
              IDA_ILL_INPUT      One of the inputs to IDASolveB is illegal.
              IDA_BAD_ITASK      The itaskB argument has an illegal value.
              IDA_TOO_MUCH_WORK  The solver took mxstep internal steps but could not reach tBout.
              IDA_TOO_MUCH_ACC   The solver could not satisfy the accuracy demanded by the user for some internal step.
              IDA_ERR_FAILURE    Error test failures occurred too many times during one internal time step.
              IDA_CONV_FAILURE   Convergence test failures occurred too many times during one internal time step.
              IDA_LSETUP_FAIL    The linear solver's setup function failed in an unrecoverable manner.
              IDA_SOLVE_FAIL     The linear solver's solve function failed in an unrecoverable manner.
              IDA_BCKMEM_NULL    The idas memory for the backward problem was not created with a call to IDACreateB.
              IDA_BAD_TBOUT      The desired output time tBout is outside the interval over which the forward problem was solved.
              IDA_REIFWD_FAIL    Reinitialization of the forward problem failed at the first checkpoint (corresponding to the initial time of the forward problem).
              IDA_FWD_FAIL       An error occurred during the integration of the forward problem.
Notes         All failure return values are negative, and therefore a test flag < 0 will trap all IDASolveB failures. In the case of multiple checkpoints and multiple backward problems, a given call to IDASolveB in IDA_ONE_STEP mode may not advance every problem one step, depending on the relative locations of the current times reached.
             But repeated calls will eventually advance all problems to tBout.

To obtain the solution yB to the backward problem, call the function IDAGetB as follows:

IDAGetB

Call         flag = IDAGetB(ida_mem, which, &tret, yB, ypB);
Description  The function IDAGetB provides the solution yB of the backward DAE problem.
Arguments    ida_mem (void *) pointer to the idas memory returned by IDACreate.
             which   (int) the identifier of the backward problem.
             tret    (realtype) the time reached by the solver (output).
             yB      (N_Vector) the backward solution at time tret.
             ypB     (N_Vector) the backward solution derivative at time tret.
Return value The return value flag (of type int) will be one of the following.
             IDA_SUCCESS   IDAGetB was successful.
             IDA_MEM_NULL  ida_mem is NULL.
             IDA_NO_ADJ    The function IDAAdjInit has not been previously called.
             IDA_ILL_INPUT The parameter which is an invalid identifier.
Notes        The user must allocate space for yB and ypB.

6.2.9 Optional input functions for the backward problem

6.2.9.1 Main solver optional input functions

The adjoint module in idas provides wrappers for most of the optional input functions defined in §4.5.8.1. The only difference is that the user must specify the identifier which of the backward problem within the list managed by idas.
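idas keeps the backward problems in an internal list, and the which identifier selects the entry that a given setter call modifies. The following self-contained sketch (hypothetical types and names, not the actual idas internals) illustrates this dispatch pattern, including the IDA_ILL_INPUT-style rejection of an invalid identifier:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical per-problem option record; a sketch of the dispatch
   pattern only, not the actual idas data structures. */
typedef struct {
    long  max_steps;   /* stands in for the IDASetMaxNumStepsB option */
    void *user_data;   /* stands in for the IDASetUserDataB option    */
} BckOptions;

#define MAX_BCK 4
static BckOptions bck_list[MAX_BCK];
static int n_bck = 0;

/* Mimics IDACreateB: append a record and return its identifier. */
static int create_backward(void)
{
    if (n_bck >= MAX_BCK) return -1;
    bck_list[n_bck].max_steps = 500;   /* default */
    bck_list[n_bck].user_data = NULL;
    return n_bck++;
}

/* Mimics an IDASet*B wrapper: validate `which`, then set the option
   on that backward problem only. */
static int set_max_num_steps_b(int which, long mxstepsB)
{
    if (which < 0 || which >= n_bck) return -1;  /* like IDA_ILL_INPUT */
    bck_list[which].max_steps = mxstepsB;
    return 0;                                    /* like IDA_SUCCESS   */
}
```

Each IDASet*B call therefore affects only the backward problem named by which, leaving all other backward problems at their defaults.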
The optional input functions defined for the backward problem are:

flag = IDASetNonlinearSolverB(ida_mem, which, NLSB);
flag = IDASetUserDataB(ida_mem, which, user_dataB);
flag = IDASetMaxOrdB(ida_mem, which, maxordB);
flag = IDASetMaxNumStepsB(ida_mem, which, mxstepsB);
flag = IDASetInitStepB(ida_mem, which, hinB);
flag = IDASetMaxStepB(ida_mem, which, hmaxB);
flag = IDASetSuppressAlgB(ida_mem, which, suppressalgB);
flag = IDASetIdB(ida_mem, which, idB);
flag = IDASetConstraintsB(ida_mem, which, constraintsB);

Their return value flag (of type int) can have any of the return values of their counterparts, but it can also be IDA_NO_ADJ if IDAAdjInit has not been called, or IDA_ILL_INPUT if which was an invalid identifier.

6.2.9.2 Linear solver interface optional input functions

When using matrix-based linear solver modules for the backward problem, i.e., when a non-NULL sunmatrix object A was passed to IDASetLinearSolverB, the idals linear solver interface needs a function to compute an approximation to the Jacobian matrix. This can be attached through a call to either IDASetJacFnB or IDASetJacFnBS, with the second used when the backward problem depends on the forward sensitivities.

IDASetJacFnB

Call         flag = IDASetJacFnB(ida_mem, which, jacB);
Description  The function IDASetJacFnB specifies the Jacobian approximation function to be used for the backward problem.
Arguments    ida_mem (void *) pointer to the idas memory block.
             which   (int) represents the identifier of the backward problem.
             jacB    (IDALsJacFnB) user-defined Jacobian approximation function.
Return value The return value flag (of type int) is one of
             IDALS_SUCCESS   IDASetJacFnB succeeded.
             IDALS_MEM_NULL  The ida_mem was NULL.
             IDALS_NO_ADJ    The function IDAAdjInit has not been previously called.
             IDALS_LMEM_NULL The linear solver has not been initialized with a call to IDASetLinearSolverB.
             IDALS_ILL_INPUT The parameter which represented an invalid identifier.
Notes        The function type IDALsJacFnB is described in §6.3.5.
             The previous routine IDADlsSetJacFnB is now a wrapper for this routine, and may still be used for backward-compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

IDASetJacFnBS

Call         flag = IDASetJacFnBS(ida_mem, which, jacBS);
Description  The function IDASetJacFnBS specifies the Jacobian approximation function to be used for the backward problem in the case where the backward problem depends on the forward sensitivities.
Arguments    ida_mem (void *) pointer to the idas memory block.
             which   (int) represents the identifier of the backward problem.
             jacBS   (IDALsJacFnBS) user-defined Jacobian approximation function.
Return value The return value flag (of type int) is one of
             IDALS_SUCCESS   IDASetJacFnBS succeeded.
             IDALS_MEM_NULL  The ida_mem was NULL.
             IDALS_NO_ADJ    The function IDAAdjInit has not been previously called.
             IDALS_LMEM_NULL The linear solver has not been initialized with a call to IDASetLinearSolverBS.
             IDALS_ILL_INPUT The parameter which represented an invalid identifier.
Notes        The function type IDALsJacFnBS is described in §6.3.5.
             The previous routine IDADlsSetJacFnBS is now a wrapper for this routine, and may still be used for backward-compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

When using a matrix-free linear solver module for the backward problem, the idals linear solver interface requires a function to compute an approximation to the product between the Jacobian matrix J(t, y) and a vector v.
This may be performed internally using a difference-quotient approximation, or it may be supplied by the user by calling one of the following two functions:

IDASetJacTimesB

Call         flag = IDASetJacTimesB(ida_mem, which, jtsetupB, jtimesB);
Description  The function IDASetJacTimesB specifies the Jacobian-vector setup and product functions to be used.
Arguments    ida_mem  (void *) pointer to the idas memory block.
             which    (int) the identifier of the backward problem.
             jtsetupB (IDALsJacTimesSetupFnB) user-defined function to set up the Jacobian-vector product. Pass NULL if no setup is necessary.
             jtimesB  (IDALsJacTimesVecFnB) user-defined Jacobian-vector product function.
Return value The return value flag (of type int) is one of:
             IDALS_SUCCESS   The optional value has been successfully set.
             IDALS_MEM_NULL  The ida_mem memory block pointer was NULL.
             IDALS_LMEM_NULL The idals linear solver has not been initialized.
             IDALS_NO_ADJ    The function IDAAdjInit has not been previously called.
             IDALS_ILL_INPUT The parameter which represented an invalid identifier.
Notes        The function types IDALsJacTimesVecFnB and IDALsJacTimesSetupFnB are described in §6.3.6.
             The previous routine IDASpilsSetJacTimesB is now a wrapper for this routine, and may still be used for backward-compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

IDASetJacTimesBS

Call         flag = IDASetJacTimesBS(ida_mem, which, jtsetupBS, jtimesBS);
Description  The function IDASetJacTimesBS specifies the Jacobian-vector product setup and evaluation functions to be used, in the case where the backward problem depends on the forward sensitivities.
Arguments    ida_mem   (void *) pointer to the idas memory block.
             which     (int) the identifier of the backward problem.
             jtsetupBS (IDALsJacTimesSetupFnBS) user-defined function to set up the Jacobian-vector product. Pass NULL if no setup is necessary.
             jtimesBS  (IDALsJacTimesVecFnBS) user-defined Jacobian-vector product function.
Return value The return value flag (of type int) is one of:
             IDALS_SUCCESS   The optional value has been successfully set.
             IDALS_MEM_NULL  The ida_mem memory block pointer was NULL.
             IDALS_LMEM_NULL The idals linear solver has not been initialized.
             IDALS_NO_ADJ    The function IDAAdjInit has not been previously called.
             IDALS_ILL_INPUT The parameter which represented an invalid identifier.
Notes        The function types IDALsJacTimesVecFnBS and IDALsJacTimesSetupFnBS are described in §6.3.6.
             The previous routine IDASpilsSetJacTimesBS is now a wrapper for this routine, and may still be used for backward-compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

Alternately, when using the default difference-quotient approximation to the Jacobian-vector product for the backward problem, the user may specify the factor to use in setting increments for the finite-difference approximation, via a call to IDASetIncrementFactorB:

IDASetIncrementFactorB

Call         flag = IDASetIncrementFactorB(ida_mem, which, dqincfacB);
Description  The function IDASetIncrementFactorB specifies the factor in the increments used in the difference-quotient approximations to matrix-vector products for the backward problem. This routine can be used whether or not the backward problem depends on the forward sensitivities.
Arguments    ida_mem   (void *) pointer to the idas memory block.
             which     (int) the identifier of the backward problem.
             dqincfacB (realtype) difference quotient approximation factor.
Return value The return value flag (of type int) is one of
             IDALS_SUCCESS   The optional value has been successfully set.
             IDALS_MEM_NULL  The ida_mem pointer is NULL.
             IDALS_LMEM_NULL The idals linear solver has not been initialized.
             IDALS_NO_ADJ    The function IDAAdjInit has not been previously called.
             IDALS_ILL_INPUT The value of dqincfacB is negative, or the parameter which represented an invalid identifier.
Notes        The default value is 1.0.
             The previous routine IDASpilsSetIncrementFactorB is now a wrapper for this routine, and may still be used for backward-compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

When using an iterative linear solver for the backward problem, the user may supply a preconditioning operator to aid in solution of the system, or may adjust the convergence tolerance factor for the iterative linear solver. These tasks may be accomplished by calling the following functions:

IDASetPreconditionerB

Call         flag = IDASetPreconditionerB(ida_mem, which, psetupB, psolveB);
Description  The function IDASetPreconditionerB specifies the preconditioner setup and solve functions for the backward integration.
Arguments    ida_mem (void *) pointer to the idas memory block.
             which   (int) the identifier of the backward problem.
             psetupB (IDALsPrecSetupFnB) user-defined preconditioner setup function.
             psolveB (IDALsPrecSolveFnB) user-defined preconditioner solve function.
Return value The return value flag (of type int) is one of:
             IDALS_SUCCESS   The optional value has been successfully set.
             IDALS_MEM_NULL  The ida_mem memory block pointer was NULL.
             IDALS_LMEM_NULL The idals linear solver has not been initialized.
             IDALS_NO_ADJ    The function IDAAdjInit has not been previously called.
             IDALS_ILL_INPUT The parameter which represented an invalid identifier.
Notes        The function types IDALsPrecSolveFnB and IDALsPrecSetupFnB are described in §6.3.8 and §6.3.9, respectively. The psetupB argument may be NULL if no setup operation is involved in the preconditioner.
             The previous routine IDASpilsSetPreconditionerB is now a wrapper for this routine, and may still be used for backward-compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

IDASetPreconditionerBS

Call         flag = IDASetPreconditionerBS(ida_mem, which, psetupBS, psolveBS);
Description  The function IDASetPreconditionerBS specifies the preconditioner setup and solve functions for the backward integration, in the case where the backward problem depends on the forward sensitivities.
Arguments    ida_mem  (void *) pointer to the idas memory block.
             which    (int) the identifier of the backward problem.
             psetupBS (IDALsPrecSetupFnBS) user-defined preconditioner setup function.
             psolveBS (IDALsPrecSolveFnBS) user-defined preconditioner solve function.
Return value The return value flag (of type int) is one of:
             IDALS_SUCCESS   The optional value has been successfully set.
             IDALS_MEM_NULL  The ida_mem memory block pointer was NULL.
             IDALS_LMEM_NULL The idals linear solver has not been initialized.
             IDALS_NO_ADJ    The function IDAAdjInit has not been previously called.
             IDALS_ILL_INPUT The parameter which represented an invalid identifier.
Notes        The function types IDALsPrecSolveFnBS and IDALsPrecSetupFnBS are described in §6.3.8 and §6.3.9, respectively. The psetupBS argument may be NULL if no setup operation is involved in the preconditioner.
             The previous routine IDASpilsSetPreconditionerBS is now a wrapper for this routine, and may still be used for backward-compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

IDASetEpsLinB

Call         flag = IDASetEpsLinB(ida_mem, which, eplifacB);
Description  The function IDASetEpsLinB specifies the factor by which the Krylov linear solver's convergence test constant is reduced from the nonlinear iteration test constant (see §2.1).
             This routine can be used whether or not the backward problem depends on the forward sensitivities.
Arguments    ida_mem  (void *) pointer to the idas memory block.
             which    (int) the identifier of the backward problem.
             eplifacB (realtype) linear convergence safety factor (>= 0.0).
Return value The return value flag (of type int) is one of
             IDALS_SUCCESS   The optional value has been successfully set.
             IDALS_MEM_NULL  The ida_mem pointer is NULL.
             IDALS_LMEM_NULL The idals linear solver has not been initialized.
             IDALS_NO_ADJ    The function IDAAdjInit has not been previously called.
             IDALS_ILL_INPUT The value of eplifacB is negative, or the parameter which represented an invalid identifier.
Notes        The default value is 0.05. Passing the value eplifacB = 0.0 also indicates using the default value.
             The previous routine IDASpilsSetEpsLinB is now a wrapper for this routine, and may still be used for backward-compatibility. However, this will be deprecated in future releases, so we recommend that users transition to the new routine name soon.

6.2.10 Optional output functions for the backward problem

6.2.10.1 Main solver optional output functions

The user of the adjoint module in idas has access to any of the optional output functions described in §4.5.10, both for the main solver and for the linear solver modules. The first argument of these IDAGet* and IDA*Get* functions is the pointer to the idas memory block for the backward problem. In order to call any of these functions, the user must first call the following function to obtain this pointer:

IDAGetAdjIDABmem

Call         ida_memB = IDAGetAdjIDABmem(ida_mem, which);
Description  The function IDAGetAdjIDABmem returns a pointer to the idas memory block for the backward problem.
Arguments    ida_mem (void *) pointer to the idas memory block created by IDACreate.
             which   (int) the identifier of the backward problem.
Return value The return value, ida_memB (of type void *), is a pointer to the idas memory for the backward problem.
Notes        The user should not modify ida_memB in any way.
             Optional output calls should pass ida_memB as the first argument; thus, for example, to get the number of integration steps: flag = IDAGetNumSteps(ida_memB, &nsteps).

To get values of the forward solution during a backward integration, use the following function. The input value of t would typically be equal to that at which the backward solution has just been obtained with IDAGetB. In any case, it must be within the last checkpoint interval used by IDASolveB.

IDAGetAdjY

Call         flag = IDAGetAdjY(ida_mem, t, y, yp);
Description  The function IDAGetAdjY returns the interpolated value of the forward solution y and its derivative during a backward integration.
Arguments    ida_mem (void *) pointer to the idas memory block created by IDACreate.
             t       (realtype) value of the independent variable at which y is desired (input).
             y       (N_Vector) forward solution y(t).
             yp      (N_Vector) forward solution derivative ẏ(t).
Return value The return value flag (of type int) is one of:
             IDA_SUCCESS   IDAGetAdjY was successful.
             IDA_MEM_NULL  ida_mem was NULL.
             IDA_GETY_BADT The value of t was outside the current checkpoint interval.
Notes        The user must allocate space for y and yp.

IDAGetAdjCheckPointsInfo

Call         flag = IDAGetAdjCheckPointsInfo(ida_mem, IDAadjCheckPointRec *ckpnt);
Description  The function IDAGetAdjCheckPointsInfo loads an array of ncheck+1 records of type IDAadjCheckPointRec. The user must allocate space for the array ckpnt.
Arguments    ida_mem (void *) pointer to the idas memory block created by IDACreate.
             ckpnt   (IDAadjCheckPointRec *) array of ncheck+1 checkpoint records, each of type IDAadjCheckPointRec.
Return value The return value is IDA_SUCCESS if successful, IDA_MEM_NULL if ida_mem is NULL, or IDA_NO_ADJ if ASA was not initialized.
Notes        The members of each record ckpnt[i] are:
             • ckpnt[i].my_addr   (void *) address of current checkpoint in ida_mem->ida_adj_mem
             • ckpnt[i].next_addr (void *) address of next checkpoint
             • ckpnt[i].t0        (realtype) start of checkpoint interval
             • ckpnt[i].t1        (realtype) end of checkpoint interval
             • ckpnt[i].nstep     (long int) step counter at checkpoint t0
             • ckpnt[i].order     (int) method order at checkpoint t0
             • ckpnt[i].step      (realtype) step size at checkpoint t0

6.2.10.2 Initial condition calculation optional output function

IDAGetConsistentICB

Call         flag = IDAGetConsistentICB(ida_mem, which, yB0_mod, ypB0_mod);
Description  The function IDAGetConsistentICB returns the corrected initial conditions for the backward problem calculated by IDACalcICB.
Arguments    ida_mem  (void *) pointer to the idas memory block.
             which    (int) the identifier of the backward problem.
             yB0_mod  (N_Vector) consistent initial vector.
             ypB0_mod (N_Vector) consistent initial derivative vector.
Return value The return value flag (of type int) is one of
             IDA_SUCCESS   The optional output value has been successfully set.
             IDA_MEM_NULL  The ida_mem pointer is NULL.
             IDA_NO_ADJ    IDAAdjInit has not been previously called.
             IDA_ILL_INPUT The parameter which did not refer to a valid backward problem identifier.
Notes        If the consistent solution vector or consistent derivative vector is not desired, pass NULL for the corresponding argument.
             The user must allocate space for yB0_mod and ypB0_mod (if not NULL).

6.2.11 Backward integration of quadrature equations

Not only the backward problem but also the backward quadrature equations may or may not depend on the forward sensitivities. Accordingly, either IDAQuadInitB or IDAQuadInitBS should be used to allocate internal memory and to initialize backward quadratures.
For any other operation (extraction, optional input/output, reinitialization, deallocation), the same function is called regardless of whether or not the quadratures are sensitivity-dependent.

6.2.11.1 Backward quadrature initialization functions

The function IDAQuadInitB initializes and allocates memory for the backward integration of quadrature equations that do not depend on the forward sensitivities. It has the following form:

IDAQuadInitB

Call         flag = IDAQuadInitB(ida_mem, which, rhsQB, yQB0);
Description  The function IDAQuadInitB provides required problem specifications, allocates internal memory, and initializes backward quadrature integration.
Arguments    ida_mem (void *) pointer to the idas memory block.
             which   (int) the identifier of the backward problem.
             rhsQB   (IDAQuadRhsFnB) is the C function which computes fQB, the residual of the backward quadrature equations. This function has the form rhsQB(t, y, yp, yB, ypB, rhsvalBQ, user_dataB) (see §6.3.3).
             yQB0    (N_Vector) is the value of the quadrature variables at tB0.
Return value The return value flag (of type int) will be one of the following:
             IDA_SUCCESS   The call to IDAQuadInitB was successful.
             IDA_MEM_NULL  ida_mem was NULL.
             IDA_NO_ADJ    The function IDAAdjInit has not been previously called.
             IDA_MEM_FAIL  A memory allocation request has failed.
             IDA_ILL_INPUT The parameter which is an invalid identifier.

The function IDAQuadInitBS initializes and allocates memory for the backward integration of quadrature equations that depend on the forward sensitivities.

IDAQuadInitBS

Call         flag = IDAQuadInitBS(ida_mem, which, rhsQBS, yQBS0);
Description  The function IDAQuadInitBS provides required problem specifications, allocates internal memory, and initializes backward quadrature integration.
Arguments    ida_mem (void *) pointer to the idas memory block.
             which   (int) the identifier of the backward problem.
             rhsQBS  (IDAQuadRhsFnBS) is the C function which computes fQBS, the residual of the backward quadrature equations. This function has the form rhsQBS(t, y, yp, yS, ypS, yB, ypB, rhsvalBQS, user_dataB) (see §6.3.4).
             yQBS0   (N_Vector) is the value of the sensitivity-dependent quadrature variables at tB0.
Return value The return value flag (of type int) will be one of the following:
             IDA_SUCCESS   The call to IDAQuadInitBS was successful.
             IDA_MEM_NULL  ida_mem was NULL.
             IDA_NO_ADJ    The function IDAAdjInit has not been previously called.
             IDA_MEM_FAIL  A memory allocation request has failed.
             IDA_ILL_INPUT The parameter which is an invalid identifier.

The integration of quadrature equations during the backward phase can be re-initialized by calling the following function. Before calling IDAQuadReInitB for a new backward problem, call any desired solution extraction functions IDAGet** associated with the previous backward problem.

IDAQuadReInitB

Call         flag = IDAQuadReInitB(ida_mem, which, yQB0);
Description  The function IDAQuadReInitB re-initializes the backward quadrature integration.
Arguments    ida_mem (void *) pointer to the idas memory block.
             which   (int) the identifier of the backward problem.
             yQB0    (N_Vector) is the value of the quadrature variables at tB0.
Return value The return value flag (of type int) will be one of the following:
             IDA_SUCCESS   The call to IDAQuadReInitB was successful.
             IDA_MEM_NULL  ida_mem was NULL.
             IDA_NO_ADJ    The function IDAAdjInit has not been previously called.
             IDA_MEM_FAIL  A memory allocation request has failed.
             IDA_NO_QUAD   Quadrature integration was not activated through a previous call to IDAQuadInitB.
             IDA_ILL_INPUT The parameter which is an invalid identifier.
Notes        IDAQuadReInitB can be used after a call to either IDAQuadInitB or IDAQuadInitBS.

6.2.11.2 Backward quadrature extraction function

To extract the values of the quadrature variables at the last return time of IDASolveB, idas provides a wrapper for the function IDAGetQuad (see §4.7.3).
The call to this function has the form

IDAGetQuadB

Call         flag = IDAGetQuadB(ida_mem, which, &tret, yQB);
Description  The function IDAGetQuadB returns the quadrature solution vector after a successful return from IDASolveB.
Arguments    ida_mem (void *) pointer to the idas memory.
             which   (int) the identifier of the backward problem.
             tret    (realtype) the time reached by the solver (output).
             yQB     (N_Vector) the computed quadrature vector.
Return value The return value flag of IDAGetQuadB is one of:
             IDA_SUCCESS   IDAGetQuadB was successful.
             IDA_MEM_NULL  ida_mem is NULL.
             IDA_NO_ADJ    The function IDAAdjInit has not been previously called.
             IDA_NO_QUAD   Quadrature integration was not initialized.
             IDA_BAD_DKY   yQB was NULL.
             IDA_ILL_INPUT The parameter which is an invalid identifier.
Notes        The user must allocate space for yQB.

6.2.11.3 Optional input/output functions for backward quadrature integration

Optional values controlling the backward integration of quadrature equations can be changed from their default values through calls to one of the following functions, which are wrappers for the corresponding optional input functions defined in §4.7.4. The user must specify the identifier which of the backward problem for which the optional values are specified.

flag = IDASetQuadErrConB(ida_mem, which, errconQ);
flag = IDAQuadSStolerancesB(ida_mem, which, reltolQ, abstolQ);
flag = IDAQuadSVtolerancesB(ida_mem, which, reltolQ, abstolQ);

Their return value flag (of type int) can have any of the return values of its counterparts, but it can also be IDA_NO_ADJ if the function IDAAdjInit has not been previously called, or IDA_ILL_INPUT if the parameter which was an invalid identifier.

Access to optional outputs related to backward quadrature integration can be obtained by calling the corresponding IDAGetQuad* functions (see §4.7.5).
A pointer ida_memB to the idas memory block for the backward problem, required as the first argument of these functions, can be obtained through a call to the function IDAGetAdjIDABmem (see §6.2.10).

6.3 User-supplied functions for adjoint sensitivity analysis

In addition to the required DAE residual function and any optional functions for the forward problem, when using the adjoint sensitivity module in idas, the user must supply one function defining the backward problem DAE and, optionally, functions to supply Jacobian-related information and one or two functions that define the preconditioner (if applicable for the choice of sunlinsol object) for the backward problem. Type definitions for all these user-supplied functions are given below.

6.3.1 DAE residual for the backward problem

The user must provide a resB function of type IDAResFnB defined as follows:

IDAResFnB

Definition   typedef int (*IDAResFnB)(realtype t, N_Vector y, N_Vector yp,
                                      N_Vector yB, N_Vector ypB,
                                      N_Vector resvalB, void *user_dataB);
Purpose      This function evaluates the residual of the backward problem DAE system. This could be (2.20) or (2.25).
Arguments    t          is the current value of the independent variable.
             y          is the current value of the forward solution vector.
             yp         is the current value of the forward solution derivative vector.
             yB         is the current value of the backward dependent variable vector.
             ypB        is the current value of the backward dependent derivative vector.
             resvalB    is the output vector containing the residual for the backward DAE problem.
             user_dataB is a pointer to user data, same as passed to IDASetUserDataB.
Return value An IDAResFnB should return 0 if successful, a positive value if a recoverable error occurred (in which case idas will attempt to correct), or a negative value if an unrecoverable failure occurred (in which case the integration stops and IDASolveB returns IDA_RESFUNC_FAIL).
Notes        Allocation of memory for resvalB is handled within idas.
             The y, yp, yB, ypB, and resvalB arguments are all of type N_Vector, but yB, ypB, and resvalB typically have different internal representations from y and yp. It is the user's responsibility to access the vector data consistently (including the use of the correct accessor macros from each nvector implementation). For the sake of computational efficiency, the vector functions in the two nvector implementations provided with idas do not perform any consistency checks with respect to their N_Vector arguments (see §7.2 and §7.3).
             The user_dataB pointer is passed to the user's resB function every time it is called and can be the same as the user_data pointer used for the forward problem.
             Before calling the user's resB function, idas needs to evaluate (through interpolation) the values of the states from the forward integration. If an error occurs in the interpolation, idas triggers an unrecoverable failure in the residual function which will halt the integration, and IDASolveB will return IDA_RESFUNC_FAIL.

6.3.2 DAE residual for the backward problem depending on the forward sensitivities

The user must provide a resBS function of type IDAResFnBS defined as follows:

IDAResFnBS

Definition   typedef int (*IDAResFnBS)(realtype t, N_Vector y, N_Vector yp,
                                       N_Vector *yS, N_Vector *ypS,
                                       N_Vector yB, N_Vector ypB,
                                       N_Vector resvalB, void *user_dataB);
Purpose      This function evaluates the residual of the backward problem DAE system. This could be (2.20) or (2.25).
Arguments    t          is the current value of the independent variable.
             y          is the current value of the forward solution vector.
             yp         is the current value of the forward solution derivative vector.
             yS         is a pointer to an array of Ns vectors containing the sensitivities of the forward solution.
             ypS        is a pointer to an array of Ns vectors containing the derivatives of the forward sensitivities.
             yB         is the current value of the backward dependent variable vector.
             ypB        is the current value of the backward dependent derivative vector.
             resvalB    is the output vector containing the residual for the backward DAE problem.
             user_dataB is a pointer to user data, same as passed to IDASetUserDataB.
Return value An IDAResFnBS should return 0 if successful, a positive value if a recoverable error occurred (in which case idas will attempt to correct), or a negative value if an unrecoverable error occurred (in which case the integration stops and IDASolveB returns IDA_RESFUNC_FAIL).
Notes        Allocation of memory for resvalB is handled within idas.
             The y, yp, yB, ypB, and resvalB arguments are all of type N_Vector, but yB, ypB, and resvalB typically have different internal representations from y and yp. Likewise for each yS[i] and ypS[i]. It is the user's responsibility to access the vector data consistently (including the use of the correct accessor macros from each nvector implementation). For the sake of computational efficiency, the vector functions in the two nvector implementations provided with idas do not perform any consistency checks with respect to their N_Vector arguments (see §7.2 and §7.3).
             The user_dataB pointer is passed to the user's resBS function every time it is called and can be the same as the user_data pointer used for the forward problem.
             Before calling the user's resBS function, idas needs to evaluate (through interpolation) the values of the states from the forward integration. If an error occurs in the interpolation, idas triggers an unrecoverable failure in the residual function which will halt the integration, and IDASolveB will return IDA_RESFUNC_FAIL.
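As a concrete toy illustration of the residual interface above (not taken from the manual): for a hypothetical forward model ẏ = A y with constant A, one consistent backward problem is λ' = -Aᵀλ, whose residual is resvalB = ypB + Aᵀ yB. The self-contained sketch below follows the shape of an IDAResFnB but uses plain double arrays in place of N_Vector data:

```c
#include <stddef.h>

#define NEQ 2

/* Constant system matrix of the assumed forward model  ydot = A y. */
static const double A[NEQ][NEQ] = { { -1.0,  2.0 },
                                    {  0.5, -3.0 } };

/* Toy backward residual, resvalB = ypB + A^T yB, in the shape of an
   IDAResFnB.  The y/yp arguments are unused because this toy adjoint
   system is linear with constant A; a real resB typically uses them.
   Returns 0 for success, as an IDAResFnB should. */
static int resB(double t, const double *y, const double *yp,
                const double *yB, const double *ypB,
                double *resvalB, void *user_dataB)
{
    (void)t; (void)y; (void)yp; (void)user_dataB;
    for (int i = 0; i < NEQ; i++) {
        resvalB[i] = ypB[i];
        for (int j = 0; j < NEQ; j++)
            resvalB[i] += A[j][i] * yB[j];   /* (A^T yB)_i */
    }
    return 0;
}
```

A real IDAResFnB would instead unpack its N_Vector arguments with the accessor macros of the chosen nvector implementation, as the Notes above require.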
6.3.3 Quadrature right-hand side for the backward problem

The user must provide an fQB function of type IDAQuadRhsFnB defined by

IDAQuadRhsFnB

Definition   typedef int (*IDAQuadRhsFnB)(realtype t, N_Vector y, N_Vector yp,
                                          N_Vector yB, N_Vector ypB,
                                          N_Vector rhsvalBQ, void *user_dataB);
Purpose      This function computes the quadrature equation right-hand side for the backward problem.
Arguments    t          is the current value of the independent variable.
             y          is the current value of the forward solution vector.
             yp         is the current value of the forward solution derivative vector.
             yB         is the current value of the backward dependent variable vector.
             ypB        is the current value of the backward dependent derivative vector.
             rhsvalBQ   is the output vector containing the residual for the backward quadrature equations.
             user_dataB is a pointer to user data, same as passed to IDASetUserDataB.
Return value An IDAQuadRhsFnB should return 0 if successful, a positive value if a recoverable error occurred (in which case idas will attempt to correct), or a negative value if it failed unrecoverably (in which case the integration is halted and IDASolveB returns IDA_QRHSFUNC_FAIL).
Notes        Allocation of memory for rhsvalBQ is handled within idas.
             The y, yp, yB, ypB, and rhsvalBQ arguments are all of type N_Vector, but they typically all have different internal representations. It is the user's responsibility to access the vector data consistently (including the use of the correct accessor macros from each nvector implementation). For the sake of computational efficiency, the vector functions in the two nvector implementations provided with idas do not perform any consistency checks with respect to their N_Vector arguments (see §7.2 and §7.3).
             The user_dataB pointer is passed to the user's fQB function every time it is called and can be the same as the user_data pointer used for the forward problem.
             Before calling the user's fQB function, idas needs to evaluate (through interpolation) the values of the states from the forward integration. If an error occurs in the interpolation, idas triggers an unrecoverable failure in the quadrature right-hand side function which will halt the integration, and IDASolveB will return IDA_QRHSFUNC_FAIL.

6.3.4 Sensitivity-dependent quadrature right-hand side for the backward problem

The user must provide an fQBS function of type IDAQuadRhsFnBS defined by

IDAQuadRhsFnBS

Definition   typedef int (*IDAQuadRhsFnBS)(realtype t, N_Vector y, N_Vector yp,
                                           N_Vector *yS, N_Vector *ypS,
                                           N_Vector yB, N_Vector ypB,
                                           N_Vector rhsvalBQS, void *user_dataB);
Purpose      This function computes the quadrature equation residual for the backward problem.
Arguments    t          is the current value of the independent variable.
             y          is the current value of the forward solution vector.
             yp         is the current value of the forward solution derivative vector.
             yS         is a pointer to an array of Ns vectors containing the sensitivities of the forward solution.
             ypS        is a pointer to an array of Ns vectors containing the derivatives of the forward sensitivities.
             yB         is the current value of the backward dependent variable vector.
             ypB        is the current value of the backward dependent derivative vector.
             rhsvalBQS  is the output vector containing the residual for the backward quadrature equations.
             user_dataB is a pointer to user data, same as passed to IDASetUserDataB.
Return value An IDAQuadRhsFnBS should return 0 if successful, a positive value if a recoverable error occurred (in which case idas will attempt to correct), or a negative value if it failed unrecoverably (in which case the integration is halted and IDASolveB returns IDA_QRHSFUNC_FAIL).
Notes        Allocation of memory for rhsvalBQS is handled within idas.
The y, yp, yB, ypB, and rhsvalBQS arguments are all of type N Vector, but they typically do not all have the same internal representations. Likewise for each yS[i] and ypS[i]. It is the user’s responsibility to access the vector data consistently (including the use of the correct accessor macros from each nvector implementation). For the sake of computational efficiency, the vector functions in the two nvector implementations provided with idas do not perform any consistency checks with repsect to their N Vector arguments (see §7.2 and §7.3). The user dataB pointer is passed to the user’s fQBS function every time it is called and can be the same as the user data pointer used for the forward problem. Before calling the user’s fQBS function, idas needs to evaluate (through interpolation) the values of the states from the forward integration. If an error occurs in the interpolation, idas triggers an unrecoverable failure in the quadrature right-hand side function which will halt the integration and IDASolveB will return IDA QRHSFUNC FAIL. ! 146 6.3.5 Using IDAS for Adjoint Sensitivity Analysis Jacobian construction for the backward problem (matrix-based linear solvers) If a matrix-based linear solver module is is used for the backward problem (i.e., IDASetLinearSolverB is called with non-NULL sunmatrix argument in the step described in §6.1), the user may provide a function of type IDALsJacFnB or IDALsJacFnBS (see §6.2.9), defined as follows: IDALsJacFnB Definition typedef int (*IDALsJacFnB)(realtype tt, realtype cjB, N Vector yy, N Vector yp, N Vector yB, N Vector ypB, N Vector resvalB, SUNMatrix JacB, void *user dataB, N Vector tmp1B, N Vector tmp2B, N Vector tmp3B); Purpose This function computes the Jacobian of the backward problem (or an approximation to it). Arguments tt cjB is the current value of the independent variable. is the scalar in the system Jacobian, proportional to the inverse of the step size (α in Eq. (2.6) ). 
             yy         is the current value of the forward solution vector.
             yp         is the current value of the forward solution derivative vector.
             yB         is the current value of the backward dependent variable vector.
             ypB        is the current value of the backward dependent derivative vector.
             resvalB    is the current value of the residual for the backward problem.
             JacB       is the output approximate Jacobian matrix.
             user_dataB is a pointer to user data, the parameter passed to IDASetUserDataB.
             tmp1B, tmp2B, tmp3B are pointers to memory allocated for variables of type N_Vector which can be used by the IDALsJacFnB function as temporary storage or work space.

Return value An IDALsJacFnB should return 0 if successful, a positive value if a recoverable error occurred (in which case idas will attempt to correct, while idals sets last_flag to IDALS_JACFUNC_RECVR), or a negative value if it failed unrecoverably (in which case the integration is halted, IDASolveB returns IDA_LSETUP_FAIL, and idals sets last_flag to IDALS_JACFUNC_UNRECVR).

Notes        A user-supplied Jacobian function must load the matrix JacB with an approximation to the Jacobian matrix at the point (tt, yy, yB), where yy is the solution of the original IVP at time tt and yB is the solution of the backward problem at the same time. Information regarding the structure of the specific sunmatrix structure (e.g., number of rows, upper/lower bandwidth, sparsity type) may be obtained through the implementation-specific sunmatrix interface functions (see Chapter 8 for details). Only nonzero elements need to be loaded into JacB, as this matrix is set to zero before the call to the Jacobian function.

             Before calling the user's IDALsJacFnB, idas needs to evaluate (through interpolation) the values of the states from the forward integration. If an error occurs in the interpolation, idas triggers an unrecoverable failure in the Jacobian function which will halt the integration (IDASolveB returns IDA_LSETUP_FAIL and idals sets last_flag to IDALS_JACFUNC_UNRECVR).
             The previous function type IDADlsJacFnB is identical to IDALsJacFnB and may still be used for backward compatibility. However, it will be deprecated in future releases, so we recommend that users transition to the new function type name soon.

IDALsJacFnBS

Definition   typedef int (*IDALsJacFnBS)(realtype tt, realtype cjB,
                                         N_Vector yy, N_Vector yp,
                                         N_Vector *yS, N_Vector *ypS,
                                         N_Vector yB, N_Vector ypB,
                                         N_Vector resvalB, SUNMatrix JacB,
                                         void *user_dataB,
                                         N_Vector tmp1B, N_Vector tmp2B,
                                         N_Vector tmp3B);

Purpose      This function computes the Jacobian of the backward problem (or an approximation to it), in the case where the backward problem depends on the forward sensitivities.

Arguments    tt         is the current value of the independent variable.
             cjB        is the scalar in the system Jacobian, proportional to the inverse of the step size (α in Eq. (2.6)).
             yy         is the current value of the forward solution vector.
             yp         is the current value of the forward solution derivative vector.
             yS         is a pointer to an array of Ns vectors containing the sensitivities of the forward solution.
             ypS        is a pointer to an array of Ns vectors containing the derivatives of the forward solution sensitivities.
             yB         is the current value of the backward dependent variable vector.
             ypB        is the current value of the backward dependent derivative vector.
             resvalB    is the current value of the residual for the backward problem.
             JacB       is the output approximate Jacobian matrix.
             user_dataB is a pointer to user data, the parameter passed to IDASetUserDataB.
             tmp1B, tmp2B, tmp3B are pointers to memory allocated for variables of type N_Vector which can be used by IDALsJacFnBS as temporary storage or work space.

Return value An IDALsJacFnBS should return 0 if successful, a positive value if a recoverable error occurred (in which case idas will attempt to correct, while idals sets last_flag to IDALS_JACFUNC_RECVR), or a negative value if it failed unrecoverably (in which case the integration is halted, IDASolveB returns IDA_LSETUP_FAIL, and idals sets last_flag to IDALS_JACFUNC_UNRECVR).

Notes        A user-supplied dense Jacobian function must load the matrix JacB with an approximation to the Jacobian matrix at the point (tt, yy, yS, yB), where yy is the solution of the original IVP at time tt, yS is the array of forward sensitivities at time tt, and yB is the solution of the backward problem at the same time. Information regarding the structure of the specific sunmatrix structure (e.g., number of rows, upper/lower bandwidth, sparsity type) may be obtained through the implementation-specific sunmatrix interface functions (see Chapter 8 for details). Only nonzero elements need to be loaded into JacB, as this matrix is set to zero before the call to the Jacobian function.

             Before calling the user's IDALsJacFnBS, idas needs to evaluate (through interpolation) the values of the states from the forward integration. If an error occurs in the interpolation, idas triggers an unrecoverable failure in the Jacobian function which will halt the integration (IDASolveB returns IDA_LSETUP_FAIL and idals sets last_flag to IDALS_JACFUNC_UNRECVR).

             The previous function type IDADlsJacFnBS is identical to IDALsJacFnBS and may still be used for backward compatibility. However, it will be deprecated in future releases, so we recommend that users transition to the new function type name soon.
6.3.6 Jacobian-vector product for the backward problem (matrix-free linear solvers)

If a matrix-free linear solver is selected for the backward problem (i.e., IDASetLinearSolverB is called with a NULL-valued sunmatrix argument in the steps described in §6.1), the user may provide a function of type IDALsJacTimesVecFnB or IDALsJacTimesVecFnBS in the following form, to compute matrix-vector products Jv. If such a function is not supplied, the default is a difference quotient approximation to these products.

IDALsJacTimesVecFnB

Definition   typedef int (*IDALsJacTimesVecFnB)(realtype t,
                                                N_Vector yy, N_Vector yp,
                                                N_Vector yB, N_Vector ypB,
                                                N_Vector resvalB,
                                                N_Vector vB, N_Vector JvB,
                                                realtype cjB, void *user_dataB,
                                                N_Vector tmp1B, N_Vector tmp2B);

Purpose      This function computes the action of the backward problem Jacobian JB on a given vector vB.

Arguments    t          is the current value of the independent variable.
             yy         is the current value of the forward solution vector.
             yp         is the current value of the forward solution derivative vector.
             yB         is the current value of the backward dependent variable vector.
             ypB        is the current value of the backward dependent derivative vector.
             resvalB    is the current value of the residual for the backward problem.
             vB         is the vector by which the Jacobian must be multiplied.
             JvB        is the computed output vector, JB*vB.
             cjB        is the scalar in the system Jacobian, proportional to the inverse of the step size (α in Eq. (2.6)).
             user_dataB is a pointer to user data, the same user_dataB parameter passed to IDASetUserDataB.
             tmp1B, tmp2B are pointers to memory allocated for variables of type N_Vector which can be used by IDALsJacTimesVecFnB as temporary storage or work space.

Return value The return value of a function of type IDALsJacTimesVecFnB should be 0 if successful or nonzero if an error was encountered, in which case the integration is halted.

Notes        A user-supplied Jacobian-vector product function must load the vector JvB with the product of the Jacobian of the backward problem at the point (t, y, yB) and the vector vB. Here, y is the solution of the original IVP at time t and yB is the solution of the backward problem at the same time. The rest of the arguments are equivalent to those passed to a function of type IDALsJacTimesVecFn (see §4.6.6). If the backward problem is the adjoint of ẏ = f(t, y), then this function is to compute −(∂f/∂y)ᵀ vB.

             The previous function type IDASpilsJacTimesVecFnB is identical to IDALsJacTimesVecFnB and may still be used for backward compatibility. However, it will be deprecated in future releases, so we recommend that users transition to the new function type name soon.

IDALsJacTimesVecFnBS

Definition   typedef int (*IDALsJacTimesVecFnBS)(realtype t,
                                                 N_Vector yy, N_Vector yp,
                                                 N_Vector *yyS, N_Vector *ypS,
                                                 N_Vector yB, N_Vector ypB,
                                                 N_Vector resvalB,
                                                 N_Vector vB, N_Vector JvB,
                                                 realtype cjB, void *user_dataB,
                                                 N_Vector tmp1B, N_Vector tmp2B);

Purpose      This function computes the action of the backward problem Jacobian JB on a given vector vB, in the case where the backward problem depends on the forward sensitivities.

Arguments    t          is the current value of the independent variable.
             yy         is the current value of the forward solution vector.
             yp         is the current value of the forward solution derivative vector.
             yyS        is a pointer to an array of Ns vectors containing the sensitivities of the forward solution.
             ypS        is a pointer to an array of Ns vectors containing the derivatives of the forward sensitivities.
             yB         is the current value of the backward dependent variable vector.
             ypB        is the current value of the backward dependent derivative vector.
             resvalB    is the current value of the residual for the backward problem.
             vB         is the vector by which the Jacobian must be multiplied.
             JvB        is the computed output vector, JB*vB.
             cjB        is the scalar in the system Jacobian, proportional to the inverse of the step size (α in Eq. (2.6)).
             user_dataB is a pointer to user data, the same user_dataB parameter passed to IDASetUserDataB.
             tmp1B, tmp2B are pointers to memory allocated for variables of type N_Vector which can be used by IDALsJacTimesVecFnBS as temporary storage or work space.

Return value The return value of a function of type IDALsJacTimesVecFnBS should be 0 if successful or nonzero if an error was encountered, in which case the integration is halted.

Notes        A user-supplied Jacobian-vector product function must load the vector JvB with the product of the Jacobian of the backward problem at the point (t, y, yB) and the vector vB. Here, y is the solution of the original IVP at time t and yB is the solution of the backward problem at the same time. The rest of the arguments are equivalent to those passed to a function of type IDALsJacTimesVecFn (see §4.6.6).

             The previous function type IDASpilsJacTimesVecFnBS is identical to IDALsJacTimesVecFnBS and may still be used for backward compatibility. However, it will be deprecated in future releases, so we recommend that users transition to the new function type name soon.

6.3.7 Jacobian-vector product setup for the backward problem (matrix-free linear solvers)

If the user's Jacobian-times-vector routine requires that any Jacobian-related data be preprocessed or evaluated, then this needs to be done in a user-supplied function of type IDALsJacTimesSetupFnB or IDALsJacTimesSetupFnBS, defined as follows:

IDALsJacTimesSetupFnB

Definition   typedef int (*IDALsJacTimesSetupFnB)(realtype tt,
                                                  N_Vector yy, N_Vector yp,
                                                  N_Vector yB, N_Vector ypB,
                                                  N_Vector resvalB, realtype cjB,
                                                  void *user_dataB);

Purpose      This function preprocesses and/or evaluates Jacobian data needed by the Jacobian-times-vector routine for the backward problem.

Arguments    tt         is the current value of the independent variable.
             yy         is the current value of the dependent variable vector, y(t).
             yp         is the current value of ẏ(t).
             yB         is the current value of the backward dependent variable vector.
             ypB        is the current value of the backward dependent derivative vector.
             resvalB    is the current value of the residual for the backward problem.
             cjB        is the scalar in the system Jacobian, proportional to the inverse of the step size (α in Eq. (2.6)).
             user_dataB is a pointer to user data, the same user_dataB parameter passed to IDASetUserDataB.

Return value The value returned by the Jacobian-vector setup function should be 0 if successful, positive for a recoverable error (in which case the step will be retried), or negative for an unrecoverable error (in which case the integration is halted).

Notes        Each call to the Jacobian-vector setup function is preceded by a call to the backward problem residual user function with the same (tt, yy, yp, yB, ypB) arguments. Thus, the setup function can use any auxiliary data that is computed and saved during the evaluation of the DAE residual.

             If the user's IDALsJacTimesVecFnB function uses difference quotient approximations, it may need to access quantities not in the call list. These include the current step size, the error weights, etc. To obtain these, the user will need to add a pointer to ida_mem to user_dataB and then use the IDAGet* functions described in §4.5.10.2. The unit roundoff can be accessed as UNIT_ROUNDOFF, defined in sundials_types.h.

             The previous function type IDASpilsJacTimesSetupFnB is identical to IDALsJacTimesSetupFnB and may still be used for backward compatibility. However, it will be deprecated in future releases, so we recommend that users transition to the new function type name soon.
IDALsJacTimesSetupFnBS

Definition   typedef int (*IDALsJacTimesSetupFnBS)(realtype tt,
                                                   N_Vector yy, N_Vector yp,
                                                   N_Vector *yyS, N_Vector *ypS,
                                                   N_Vector yB, N_Vector ypB,
                                                   N_Vector resvalB, realtype cjB,
                                                   void *user_dataB);

Purpose      This function preprocesses and/or evaluates Jacobian data needed by the Jacobian-times-vector routine for the backward problem, in the case that the backward problem depends on the forward sensitivities.

Arguments    tt         is the current value of the independent variable.
             yy         is the current value of the dependent variable vector, y(t).
             yp         is the current value of ẏ(t).
             yyS        is a pointer to an array of Ns vectors containing the sensitivities of the forward solution.
             ypS        is a pointer to an array of Ns vectors containing the derivatives of the forward sensitivities.
             yB         is the current value of the backward dependent variable vector.
             ypB        is the current value of the backward dependent derivative vector.
             resvalB    is the current value of the residual for the backward problem.
             cjB        is the scalar in the system Jacobian, proportional to the inverse of the step size (α in Eq. (2.6)).
             user_dataB is a pointer to user data, the same user_dataB parameter passed to IDASetUserDataB.

Return value The value returned by the Jacobian-vector setup function should be 0 if successful, positive for a recoverable error (in which case the step will be retried), or negative for an unrecoverable error (in which case the integration is halted).

Notes        Each call to the Jacobian-vector setup function is preceded by a call to the backward problem residual user function with the same (tt, yy, yp, yyS, ypS, yB, ypB) arguments. Thus, the setup function can use any auxiliary data that is computed and saved during the evaluation of the DAE residual.

             If the user's IDALsJacTimesVecFnB function uses difference quotient approximations, it may need to access quantities not in the call list. These include the current step size, the error weights, etc. To obtain these, the user will need to add a pointer to ida_mem to user_dataB and then use the IDAGet* functions described in §4.5.10.2. The unit roundoff can be accessed as UNIT_ROUNDOFF, defined in sundials_types.h.

             The previous function type IDASpilsJacTimesSetupFnBS is identical to IDALsJacTimesSetupFnBS and may still be used for backward compatibility. However, it will be deprecated in future releases, so we recommend that users transition to the new function type name soon.

6.3.8 Preconditioner solve for the backward problem (iterative linear solvers)

If preconditioning is used during integration of the backward problem, then the user must provide a function to solve the linear system P z = r, where P is a left preconditioner matrix. This function must have one of the following two forms:

IDALsPrecSolveFnB

Definition   typedef int (*IDALsPrecSolveFnB)(realtype t, N_Vector yy, N_Vector yp,
                                              N_Vector yB, N_Vector ypB,
                                              N_Vector resvalB,
                                              N_Vector rvecB, N_Vector zvecB,
                                              realtype cjB, realtype deltaB,
                                              void *user_dataB);

Purpose      This function solves the preconditioning system P z = r for the backward problem.

Arguments    t          is the current value of the independent variable.
             yy         is the current value of the forward solution vector.
             yp         is the current value of the forward solution derivative vector.
             yB         is the current value of the backward dependent variable vector.
             ypB        is the current value of the backward dependent derivative vector.
             resvalB    is the current value of the residual for the backward problem.
             rvecB      is the right-hand side vector r of the linear system to be solved.
             zvecB      is the computed output vector.
             cjB        is the scalar in the system Jacobian, proportional to the inverse of the step size (α in Eq. (2.6)).
             deltaB     is an input tolerance to be used if an iterative method is employed in the solution.
             user_dataB is a pointer to user data, the same user_dataB parameter passed to the function IDASetUserDataB.

Return value The return value of a preconditioner solve function for the backward problem should be 0 if successful, positive for a recoverable error (in which case the step will be retried), or negative for an unrecoverable error (in which case the integration is halted).

Notes        The previous function type IDASpilsPrecSolveFnB is identical to IDALsPrecSolveFnB and may still be used for backward compatibility. However, it will be deprecated in future releases, so we recommend that users transition to the new function type name soon.

IDALsPrecSolveFnBS

Definition   typedef int (*IDALsPrecSolveFnBS)(realtype t, N_Vector yy, N_Vector yp,
                                               N_Vector *yyS, N_Vector *ypS,
                                               N_Vector yB, N_Vector ypB,
                                               N_Vector resvalB,
                                               N_Vector rvecB, N_Vector zvecB,
                                               realtype cjB, realtype deltaB,
                                               void *user_dataB);

Purpose      This function solves the preconditioning system P z = r for the backward problem, for the case in which the backward problem depends on the forward sensitivities.

Arguments    t          is the current value of the independent variable.
             yy         is the current value of the forward solution vector.
             yp         is the current value of the forward solution derivative vector.
             yyS        is a pointer to an array of Ns vectors containing the sensitivities of the forward solution.
             ypS        is a pointer to an array of Ns vectors containing the derivatives of the forward sensitivities.
             yB         is the current value of the backward dependent variable vector.
             ypB        is the current value of the backward dependent derivative vector.
             resvalB    is the current value of the residual for the backward problem.
             rvecB      is the right-hand side vector r of the linear system to be solved.
             zvecB      is the computed output vector.
             cjB        is the scalar in the system Jacobian, proportional to the inverse of the step size (α in Eq. (2.6)).
             deltaB     is an input tolerance to be used if an iterative method is employed in the solution.
             user_dataB is a pointer to user data, the same user_dataB parameter passed to the function IDASetUserDataB.

Return value The return value of a preconditioner solve function for the backward problem should be 0 if successful, positive for a recoverable error (in which case the step will be retried), or negative for an unrecoverable error (in which case the integration is halted).

Notes        The previous function type IDASpilsPrecSolveFnBS is identical to IDALsPrecSolveFnBS and may still be used for backward compatibility. However, it will be deprecated in future releases, so we recommend that users transition to the new function type name soon.

6.3.9 Preconditioner setup for the backward problem (iterative linear solvers)

If the user's preconditioner requires that any Jacobian-related data be preprocessed or evaluated, then this needs to be done in a user-supplied function of one of the following two types:

IDALsPrecSetupFnB

Definition   typedef int (*IDALsPrecSetupFnB)(realtype t, N_Vector yy, N_Vector yp,
                                              N_Vector yB, N_Vector ypB,
                                              N_Vector resvalB, realtype cjB,
                                              void *user_dataB);

Purpose      This function preprocesses and/or evaluates Jacobian-related data needed by the preconditioner for the backward problem.

Arguments    The arguments of an IDALsPrecSetupFnB are as follows:
             t          is the current value of the independent variable.
             yy         is the current value of the forward solution vector.
             yp         is the current value of the forward solution derivative vector.
             yB         is the current value of the backward dependent variable vector.
             ypB        is the current value of the backward dependent derivative vector.
             resvalB    is the current value of the residual for the backward problem.
             cjB        is the scalar in the system Jacobian, proportional to the inverse of the step size (α in Eq. (2.6)).
             user_dataB is a pointer to user data, the same user_dataB parameter passed to the function IDASetUserDataB.

Return value The return value of a preconditioner setup function for the backward problem should be 0 if successful, positive for a recoverable error (in which case the step will be retried), or negative for an unrecoverable error (in which case the integration is halted).

Notes        The previous function type IDASpilsPrecSetupFnB is identical to IDALsPrecSetupFnB and may still be used for backward compatibility. However, it will be deprecated in future releases, so we recommend that users transition to the new function type name soon.

IDALsPrecSetupFnBS

Definition   typedef int (*IDALsPrecSetupFnBS)(realtype t, N_Vector yy, N_Vector yp,
                                               N_Vector *yyS, N_Vector *ypS,
                                               N_Vector yB, N_Vector ypB,
                                               N_Vector resvalB, realtype cjB,
                                               void *user_dataB);

Purpose      This function preprocesses and/or evaluates Jacobian-related data needed by the preconditioner for the backward problem, in the case where the backward problem depends on the forward sensitivities.

Arguments    The arguments of an IDALsPrecSetupFnBS are as follows:
             t          is the current value of the independent variable.
             yy         is the current value of the forward solution vector.
             yp         is the current value of the forward solution derivative vector.
             yyS        is a pointer to an array of Ns vectors containing the sensitivities of the forward solution.
             ypS        is a pointer to an array of Ns vectors containing the derivatives of the forward sensitivities.
             yB         is the current value of the backward dependent variable vector.
             ypB        is the current value of the backward dependent derivative vector.
             resvalB    is the current value of the residual for the backward problem.
             cjB        is the scalar in the system Jacobian, proportional to the inverse of the step size (α in Eq. (2.6)).
             user_dataB is a pointer to user data, the same user_dataB parameter passed to the function IDASetUserDataB.

Return value The return value of a preconditioner setup function for the backward problem should be 0 if successful, positive for a recoverable error (in which case the step will be retried), or negative for an unrecoverable error (in which case the integration is halted).

Notes        The previous function type IDASpilsPrecSetupFnBS is identical to IDALsPrecSetupFnBS and may still be used for backward compatibility. However, it will be deprecated in future releases, so we recommend that users transition to the new function type name soon.

6.4 Using the band-block-diagonal preconditioner for backward problems

As on the forward integration phase, the efficiency of Krylov iterative methods for the solution of linear systems can be greatly enhanced through preconditioning. The band-block-diagonal preconditioner module idabbdpre provides interface functions through which it can be used on the backward integration phase.

The adjoint module in idas offers an interface to the band-block-diagonal preconditioner module idabbdpre described in §4.8. This generates a preconditioner that is a block-diagonal matrix with each block being a band matrix, and it can be used with one of the Krylov linear solvers and with the MPI-parallel vector module nvector parallel.

In order to use the idabbdpre module in the solution of the backward problem, the user must define one or two additional functions, described at the end of this section.

6.4.1 Usage of IDABBDPRE for the backward problem

The idabbdpre module is initialized by calling the following function, after an iterative linear solver for the backward problem has been attached to idas by calling IDASetLinearSolverB (see §6.2.6).

IDABBDPrecInitB

Call         flag = IDABBDPrecInitB(ida_mem, which, NlocalB, mudqB, mldqB,
                                    mukeepB, mlkeepB, dqrelyB, GresB, GcommB);

Description  The function IDABBDPrecInitB initializes and allocates memory for the idabbdpre preconditioner for the backward problem.
Arguments    ida_mem  (void *) pointer to the idas memory block.
             which    (int) the identifier of the backward problem.
             NlocalB  (sunindextype) local vector dimension for the backward problem.
             mudqB    (sunindextype) upper half-bandwidth to be used in the difference-quotient Jacobian approximation.
             mldqB    (sunindextype) lower half-bandwidth to be used in the difference-quotient Jacobian approximation.
             mukeepB  (sunindextype) upper half-bandwidth of the retained banded approximate Jacobian block.
             mlkeepB  (sunindextype) lower half-bandwidth of the retained banded approximate Jacobian block.
             dqrelyB  (realtype) the relative increment in components of yB used in the difference quotient approximations. The default is dqrelyB = √(unit roundoff), which can be specified by passing dqrelyB = 0.0.
             GresB    (IDABBDLocalFnB) the C function which computes GB(t, y, ẏ, yB, ẏB), the function approximating the residual of the backward problem.
             GcommB   (IDABBDCommFnB) the optional C function which performs all interprocess communication required for the computation of GB.

Return value If successful, IDABBDPrecInitB creates, allocates, and stores (internally in the idas solver block) a pointer to the newly created idabbdpre memory block. The return value flag (of type int) is one of:

             IDALS_SUCCESS    The call to IDABBDPrecInitB was successful.
             IDALS_MEM_FAIL   A memory allocation request has failed.
             IDALS_MEM_NULL   The ida_mem argument was NULL.
             IDALS_LMEM_NULL  No linear solver has been attached.
             IDALS_ILL_INPUT  An invalid parameter has been passed.

To reinitialize the idabbdpre preconditioner module for the backward problem, possibly with a change in mudqB, mldqB, or dqrelyB, call the following function:

IDABBDPrecReInitB

Call         flag = IDABBDPrecReInitB(ida_mem, which, mudqB, mldqB, dqrelyB);

Description  The function IDABBDPrecReInitB reinitializes the idabbdpre preconditioner for the backward problem.
Arguments    ida_mem  (void *) pointer to the idas memory block returned by IDACreate.
             which    (int) the identifier of the backward problem.
             mudqB    (sunindextype) upper half-bandwidth to be used in the difference-quotient Jacobian approximation.
             mldqB    (sunindextype) lower half-bandwidth to be used in the difference-quotient Jacobian approximation.
             dqrelyB  (realtype) the relative increment in components of yB used in the difference quotient approximations.

Return value The return value flag (of type int) is one of:

             IDALS_SUCCESS    The call to IDABBDPrecReInitB was successful.
             IDALS_MEM_FAIL   A memory allocation request has failed.
             IDALS_MEM_NULL   The ida_mem argument was NULL.
             IDALS_PMEM_NULL  IDABBDPrecInitB has not been previously called.
             IDALS_LMEM_NULL  No linear solver has been attached.
             IDALS_ILL_INPUT  An invalid parameter has been passed.

For more details on idabbdpre see §4.8.

6.4.2 User-supplied functions for IDABBDPRE

To use the idabbdpre module, the user must supply one or two functions which the module calls to construct the preconditioner: a required function GresB (of type IDABBDLocalFnB) which approximates the residual of the backward problem and which is computed locally, and an optional function GcommB (of type IDABBDCommFnB) which performs all interprocess communication necessary to evaluate this approximate residual (see §4.8). The prototypes for these two functions are described below.

IDABBDLocalFnB

Definition   typedef int (*IDABBDLocalFnB)(sunindextype NlocalB, realtype t,
                                           N_Vector y, N_Vector yp,
                                           N_Vector yB, N_Vector ypB,
                                           N_Vector gB, void *user_dataB);

Purpose      This GresB function loads the vector gB, an approximation to the residual of the backward problem, as a function of t, y, yp, yB, and ypB.

Arguments    NlocalB    is the local vector length for the backward problem.
             t          is the value of the independent variable.
             y          is the current value of the forward solution vector.
             yp         is the current value of the forward solution derivative vector.
             yB         is the current value of the backward dependent variable vector.
             ypB        is the current value of the backward dependent derivative vector.
             gB         is the output vector, GB(t, y, ẏ, yB, ẏB).
             user_dataB is a pointer to user data, the same user_dataB parameter passed to IDASetUserDataB.

Return value An IDABBDLocalFnB should return 0 if successful, a positive value if a recoverable error occurred (in which case idas will attempt to correct), or a negative value if it failed unrecoverably (in which case the integration is halted and IDASolveB returns IDA_LSETUP_FAIL).

Notes        This routine must assume that all interprocess communication of data needed to calculate gB has already been done, and this data is accessible within user_dataB.

             Before calling the user's IDABBDLocalFnB, idas needs to evaluate (through interpolation) the values of the states from the forward integration. If an error occurs in the interpolation, idas triggers an unrecoverable failure in the preconditioner setup function which will halt the integration (IDASolveB returns IDA_LSETUP_FAIL).

IDABBDCommFnB

Definition   typedef int (*IDABBDCommFnB)(sunindextype NlocalB, realtype t,
                                          N_Vector y, N_Vector yp,
                                          N_Vector yB, N_Vector ypB,
                                          void *user_dataB);

Purpose      This GcommB function performs all interprocess communications necessary for the execution of the GresB function above, using the input vectors y, yp, yB, and ypB.

Arguments    NlocalB    is the local vector length.
             t          is the value of the independent variable.
             y          is the current value of the forward solution vector.
             yp         is the current value of the forward solution derivative vector.
             yB         is the current value of the backward dependent variable vector.
             ypB        is the current value of the backward dependent derivative vector.
             user_dataB is a pointer to user data, the same user_dataB parameter passed to IDASetUserDataB.
Return value
An IDABBDCommFnB should return 0 if successful, a positive value if a recoverable error occurred (in which case idas will attempt to correct), or a negative value if it failed unrecoverably (in which case the integration is halted and IDASolveB returns IDA_LSETUP_FAIL).

Notes
The GcommB function is expected to save communicated data in space defined within the structure user_dataB. Each call to the GcommB function is preceded by a call to the function that evaluates the residual of the backward problem with the same t, y, yp, yB, and ypB arguments. If no additional communication is needed, then pass GcommB = NULL to IDABBDPrecInitB.

Chapter 7  Description of the NVECTOR module

The sundials solvers are written in a data-independent manner. They all operate on generic vectors (of type N_Vector) through a set of operations defined by the particular nvector implementation. Users can provide their own specific implementation of the nvector module, or use one of the implementations provided with sundials. The generic operations are described below, and the implementations provided with sundials are described in the following sections.

The generic N_Vector type is a pointer to a structure that has an implementation-dependent content field containing the description and actual data of the vector, and an ops field pointing to a structure with generic vector operations.
The type N_Vector is defined as

typedef struct _generic_N_Vector *N_Vector;

struct _generic_N_Vector {
  void *content;
  struct _generic_N_Vector_Ops *ops;
};

The generic N_Vector_Ops structure is essentially a list of pointers to the various actual vector operations, and is defined as

struct _generic_N_Vector_Ops {
  N_Vector_ID  (*nvgetvectorid)(N_Vector);
  N_Vector     (*nvclone)(N_Vector);
  N_Vector     (*nvcloneempty)(N_Vector);
  void         (*nvdestroy)(N_Vector);
  void         (*nvspace)(N_Vector, sunindextype *, sunindextype *);
  realtype*    (*nvgetarraypointer)(N_Vector);
  void         (*nvsetarraypointer)(realtype *, N_Vector);
  void         (*nvlinearsum)(realtype, N_Vector, realtype, N_Vector, N_Vector);
  void         (*nvconst)(realtype, N_Vector);
  void         (*nvprod)(N_Vector, N_Vector, N_Vector);
  void         (*nvdiv)(N_Vector, N_Vector, N_Vector);
  void         (*nvscale)(realtype, N_Vector, N_Vector);
  void         (*nvabs)(N_Vector, N_Vector);
  void         (*nvinv)(N_Vector, N_Vector);
  void         (*nvaddconst)(N_Vector, realtype, N_Vector);
  realtype     (*nvdotprod)(N_Vector, N_Vector);
  realtype     (*nvmaxnorm)(N_Vector);
  realtype     (*nvwrmsnorm)(N_Vector, N_Vector);
  realtype     (*nvwrmsnormmask)(N_Vector, N_Vector, N_Vector);
  realtype     (*nvmin)(N_Vector);
  realtype     (*nvwl2norm)(N_Vector, N_Vector);
  realtype     (*nvl1norm)(N_Vector);
  void         (*nvcompare)(realtype, N_Vector, N_Vector);
  booleantype  (*nvinvtest)(N_Vector, N_Vector);
  booleantype  (*nvconstrmask)(N_Vector, N_Vector, N_Vector);
  realtype     (*nvminquotient)(N_Vector, N_Vector);
  int          (*nvlinearcombination)(int, realtype*, N_Vector*, N_Vector);
  int          (*nvscaleaddmulti)(int, realtype*, N_Vector, N_Vector*, N_Vector*);
  int          (*nvdotprodmulti)(int, N_Vector, N_Vector*, realtype*);
  int          (*nvlinearsumvectorarray)(int, realtype, N_Vector*, realtype, N_Vector*, N_Vector*);
  int          (*nvscalevectorarray)(int, realtype*, N_Vector*, N_Vector*);
  int          (*nvconstvectorarray)(int, realtype, N_Vector*);
  int          (*nvwrmsnormvectorarray)(int, N_Vector*, N_Vector*, realtype*);
  int          (*nvwrmsnormmaskvectorarray)(int, N_Vector*, N_Vector*, N_Vector, realtype*);
  int          (*nvscaleaddmultivectorarray)(int, int, realtype*, N_Vector*, N_Vector**, N_Vector**);
  int          (*nvlinearcombinationvectorarray)(int, int, realtype*, N_Vector**, N_Vector*);
};

The generic nvector module defines and implements the vector operations acting on an N_Vector. These routines are nothing but wrappers for the vector operations defined by a particular nvector implementation, which are accessed through the ops field of the N_Vector structure. To illustrate this point, we show below the implementation of a typical vector operation from the generic nvector module, namely N_VScale, which performs the scaling of a vector x by a scalar c:

void N_VScale(realtype c, N_Vector x, N_Vector z)
{
  z->ops->nvscale(c, x, z);
}

Table 7.2 contains a complete list of all standard vector operations defined by the generic nvector module. Tables 7.3 and 7.4 list optional fused and vector array operations, respectively. Fused and vector array operations are intended to increase data reuse, reduce parallel communication on distributed memory systems, and lower the number of kernel launches on systems with accelerators. If a particular nvector implementation defines a fused or vector array operation as NULL, the generic nvector module will automatically call standard vector operations as necessary to complete the desired operation. Currently, all fused and vector array operations are disabled by default; however, the nvector implementations provided with sundials define additional user-callable functions to enable or disable any or all of the fused and vector array operations. See the following sections for the implementation-specific functions to enable/disable operations.

Finally, note that the generic nvector module defines the functions N_VCloneVectorArray and N_VCloneVectorArrayEmpty. Both functions create (by cloning) an array of count variables of type N_Vector, each of the same type as an existing N_Vector.
Their prototypes are

N_Vector *N_VCloneVectorArray(int count, N_Vector w);
N_Vector *N_VCloneVectorArrayEmpty(int count, N_Vector w);

and their definitions are based on the implementation-specific N_VClone and N_VCloneEmpty operations, respectively. An array of variables of type N_Vector can be destroyed by calling N_VDestroyVectorArray, whose prototype is

void N_VDestroyVectorArray(N_Vector *vs, int count);

and whose definition is based on the implementation-specific N_VDestroy operation.

Table 7.1: Vector Identifications associated with vector kernels supplied with sundials.

  Vector ID                  Vector type                                            ID Value
  SUNDIALS_NVEC_SERIAL       Serial                                                 0
  SUNDIALS_NVEC_PARALLEL     Distributed memory parallel (MPI)                      1
  SUNDIALS_NVEC_OPENMP       OpenMP shared memory parallel                          2
  SUNDIALS_NVEC_PTHREADS     PThreads shared memory parallel                        3
  SUNDIALS_NVEC_PARHYP       hypre ParHyp parallel vector                           4
  SUNDIALS_NVEC_PETSC        petsc parallel vector                                  5
  SUNDIALS_NVEC_OPENMPDEV    OpenMP shared memory parallel with device offloading   6
  SUNDIALS_NVEC_CUSTOM       User-provided custom vector                            7

A particular implementation of the nvector module must:

• Specify the content field of N_Vector.

• Define and implement the vector operations. Note that the names of these routines should be unique to that implementation in order to permit using more than one nvector module (each with different N_Vector internal data representations) in the same code.

• Define and implement user-callable constructor and destructor routines to create and free an N_Vector with the new content field and with ops pointing to the new vector operations.

• Optionally, define and implement additional user-callable routines acting on the newly defined N_Vector (e.g., a routine to print the content for debugging purposes).

• Optionally, provide accessor macros as needed for that particular implementation to be used to access different parts of the content field of the newly defined N_Vector.
Each nvector implementation included in sundials has a unique identifier, specified in an enumeration and shown in Table 7.1. It is recommended that a user-supplied nvector implementation use the SUNDIALS_NVEC_CUSTOM identifier.

Table 7.2: Description of the NVECTOR operations

N_VGetVectorID   id = N_VGetVectorID(w);
Returns the vector type identifier for the vector w. It is used to determine the vector implementation type (e.g. serial, parallel, ...) from the abstract N_Vector interface. Returned values are given in Table 7.1.

N_VClone   v = N_VClone(w);
Creates a new N_Vector of the same type as an existing vector w and sets the ops field. It does not copy the vector, but rather allocates storage for the new vector.

N_VCloneEmpty   v = N_VCloneEmpty(w);
Creates a new N_Vector of the same type as an existing vector w and sets the ops field. It does not allocate storage for data.

N_VDestroy   N_VDestroy(v);
Destroys the N_Vector v and frees memory allocated for its internal data.

N_VSpace   N_VSpace(nvSpec, &lrw, &liw);
Returns storage requirements for one N_Vector: lrw contains the number of realtype words and liw contains the number of integer words. This function is advisory only, for use in determining a user's total space requirements; it could be a dummy function in a user-supplied nvector module if that information is not of interest.

N_VGetArrayPointer   vdata = N_VGetArrayPointer(v);
Returns a pointer to a realtype array from the N_Vector v. Note that this assumes that the internal data in the N_Vector is a contiguous array of realtype. This routine is only used in the solver-specific interfaces to the dense and banded (serial) linear solvers, the sparse linear solvers (serial and threaded), and in the interfaces to the banded (serial) and band-block-diagonal (parallel) preconditioner modules provided with sundials.
N_VSetArrayPointer   N_VSetArrayPointer(vdata, v);
Overwrites the data in an N_Vector with a given array of realtype. Note that this assumes that the internal data in the N_Vector is a contiguous array of realtype. This routine is only used in the interfaces to the dense (serial) linear solver, and hence need not exist in a user-supplied nvector module for a parallel environment.

N_VLinearSum   N_VLinearSum(a, x, b, y, z);
Performs the operation z = ax + by, where a and b are realtype scalars and x and y are of type N_Vector: zi = a·xi + b·yi, i = 0, ..., n−1.

N_VConst   N_VConst(c, z);
Sets all components of the N_Vector z to the realtype value c: zi = c, i = 0, ..., n−1.

N_VProd   N_VProd(x, y, z);
Sets the N_Vector z to be the component-wise product of the N_Vector inputs x and y: zi = xi·yi, i = 0, ..., n−1.

N_VDiv   N_VDiv(x, y, z);
Sets the N_Vector z to be the component-wise ratio of the N_Vector inputs x and y: zi = xi/yi, i = 0, ..., n−1. The yi are not tested for zero values; this routine should be called only with a y that is guaranteed to have all nonzero components.

N_VScale   N_VScale(c, x, z);
Scales the N_Vector x by the realtype scalar c and returns the result in z: zi = c·xi, i = 0, ..., n−1.

N_VAbs   N_VAbs(x, z);
Sets the components of the N_Vector z to be the absolute values of the components of the N_Vector x: zi = |xi|, i = 0, ..., n−1.

N_VInv   N_VInv(x, z);
Sets the components of the N_Vector z to be the inverses of the components of the N_Vector x: zi = 1.0/xi, i = 0, ..., n−1. This routine does not check for division by zero; it should be called only with an x that is guaranteed to have all nonzero components.

N_VAddConst   N_VAddConst(x, b, z);
Adds the realtype scalar b to all components of x and returns the result in the N_Vector z: zi = xi + b, i = 0, ..., n−1.
N_VDotProd   d = N_VDotProd(x, y);
Returns the value of the ordinary dot product of x and y: d = Σ_{i=0..n−1} xi·yi.

N_VMaxNorm   m = N_VMaxNorm(x);
Returns the maximum norm of the N_Vector x: m = max_i |xi|.

N_VWrmsNorm   m = N_VWrmsNorm(x, w);
Returns the weighted root-mean-square norm of the N_Vector x with realtype weight vector w: m = sqrt( (Σ_{i=0..n−1} (xi·wi)^2) / n ).

N_VWrmsNormMask   m = N_VWrmsNormMask(x, w, id);
Returns the weighted root-mean-square norm of the N_Vector x with realtype weight vector w, built using only the elements of x corresponding to positive elements of the N_Vector id: m = sqrt( (Σ_{i=0..n−1} (xi·wi·H(id_i))^2) / n ), where H(α) = 1 for α > 0 and H(α) = 0 for α ≤ 0.

N_VMin   m = N_VMin(x);
Returns the smallest element of the N_Vector x: m = min_i xi.

N_VWL2Norm   m = N_VWL2Norm(x, w);
Returns the weighted Euclidean ℓ2 norm of the N_Vector x with realtype weight vector w: m = sqrt( Σ_{i=0..n−1} (xi·wi)^2 ).

N_VL1Norm   m = N_VL1Norm(x);
Returns the ℓ1 norm of the N_Vector x: m = Σ_{i=0..n−1} |xi|.

N_VCompare   N_VCompare(c, x, z);
Compares the components of the N_Vector x to the realtype scalar c and returns an N_Vector z such that zi = 1.0 if |xi| ≥ c and zi = 0.0 otherwise.

N_VInvTest   t = N_VInvTest(x, z);
Sets the components of the N_Vector z to be the inverses of the components of the N_Vector x, with prior testing for zero values: zi = 1.0/xi, i = 0, ..., n−1. This routine returns a boolean assigned to SUNTRUE if all components of x are nonzero (successful inversion) and SUNFALSE otherwise.

N_VConstrMask   t = N_VConstrMask(c, x, m);
Performs the following constraint tests: xi > 0 if ci = 2, xi ≥ 0 if ci = 1, xi ≤ 0 if ci = −1, xi < 0 if ci = −2. There is no constraint on xi if ci = 0. This routine returns a boolean assigned to SUNFALSE if any element failed the constraint test and to SUNTRUE if all passed.
It also sets a mask vector m, with elements equal to 1.0 where the constraint test failed, and 0.0 where the test passed. This routine is used only for constraint checking.

N_VMinQuotient   minq = N_VMinQuotient(num, denom);
This routine returns the minimum of the quotients obtained by term-wise dividing num_i by denom_i. A zero element in denom will be skipped. If no such quotients are found, then the large value BIG_REAL (defined in the header file sundials_types.h) is returned.

Table 7.3: Description of the NVECTOR fused operations

N_VLinearCombination   ier = N_VLinearCombination(nv, c, X, z);
This routine computes the linear combination of nv vectors with n elements: zi = Σ_{j=0..nv−1} cj·xj,i, i = 0, ..., n−1, where c is an array of nv scalars (type realtype*), X is an array of nv vectors (type N_Vector*), and z is the output vector (type N_Vector). If the output vector z is one of the vectors in X, then it must be the first vector in the vector array. The operation returns 0 for success and a non-zero value otherwise.

N_VScaleAddMulti   ier = N_VScaleAddMulti(nv, c, x, Y, Z);
This routine scales and adds one vector to nv vectors with n elements: zj,i = cj·xi + yj,i, j = 0, ..., nv−1, i = 0, ..., n−1, where c is an array of nv scalars (type realtype*), x is the vector (type N_Vector) to be scaled and added to each vector in the vector array of nv vectors Y (type N_Vector*), and Z (type N_Vector*) is a vector array of nv output vectors. The operation returns 0 for success and a non-zero value otherwise.

N_VDotProdMulti   ier = N_VDotProdMulti(nv, x, Y, d);
This routine computes the dot product of a vector with nv other vectors: dj = Σ_{i=0..n−1} xi·yj,i, j = 0, ..., nv−1, where d (type realtype*) is an array of nv scalars containing the dot products of the vector x (type N_Vector) with each of the nv vectors in the vector array Y (type N_Vector*). The operation returns 0 for success and a non-zero value otherwise.

Table 7.4: Description of the NVECTOR vector array operations

N_VLinearSumVectorArray   ier = N_VLinearSumVectorArray(nv, a, X, b, Y, Z);
This routine computes the linear sum of two vector arrays containing nv vectors of n elements: zj,i = a·xj,i + b·yj,i, i = 0, ..., n−1, j = 0, ..., nv−1, where a and b are realtype scalars and X, Y, and Z are arrays of nv vectors (type N_Vector*). The operation returns 0 for success and a non-zero value otherwise.

N_VScaleVectorArray   ier = N_VScaleVectorArray(nv, c, X, Z);
This routine scales each vector of n elements in a vector array of nv vectors by a potentially different constant: zj,i = cj·xj,i, i = 0, ..., n−1, j = 0, ..., nv−1, where c is an array of nv scalars (type realtype*) and X and Z are arrays of nv vectors (type N_Vector*). The operation returns 0 for success and a non-zero value otherwise.

N_VConstVectorArray   ier = N_VConstVectorArray(nv, c, Z);
This routine sets each element of each of the nv vectors of n elements in a vector array to the same value: zj,i = c, i = 0, ..., n−1, j = 0, ..., nv−1, where c is a realtype scalar and Z is an array of nv vectors (type N_Vector*). The operation returns 0 for success and a non-zero value otherwise.

N_VWrmsNormVectorArray   ier = N_VWrmsNormVectorArray(nv, X, W, m);
This routine computes the weighted root-mean-square norm of nv vectors with n elements: mj = sqrt( (Σ_{i=0..n−1} (xj,i·wj,i)^2) / n ), j = 0, ..., nv−1, where m (type realtype*) contains the nv norms of the vectors in the vector array X (type N_Vector*) with corresponding weight vectors W (type N_Vector*). The operation returns 0 for success and a non-zero value otherwise.

N_VWrmsNormMaskVectorArray   ier = N_VWrmsNormMaskVectorArray(nv, X, W, id, m);
This routine computes the masked weighted root-mean-square norm of nv vectors with n elements: mj = sqrt( (Σ_{i=0..n−1} (xj,i·wj,i·H(id_i))^2) / n ), j = 0, ..., nv−1, where H(id_i) = 1 for id_i > 0 and is zero otherwise, and m (type realtype*) contains the nv norms of the vectors in the vector array X (type N_Vector*) with corresponding weight vectors W (type N_Vector*) and mask vector id (type N_Vector). The operation returns 0 for success and a non-zero value otherwise.

N_VScaleAddMultiVectorArray   ier = N_VScaleAddMultiVectorArray(nv, ns, c, X, YY, ZZ);
This routine scales and adds each vector in a vector array of nv vectors to the corresponding vectors in ns vector arrays: zk,j,i = ck·xj,i + yk,j,i, i = 0, ..., n−1, j = 0, ..., nv−1, k = 0, ..., ns−1, where c is an array of ns scalars (type realtype*) and X is a vector array of nv vectors (type N_Vector*) to be scaled and added to the corresponding vector in each of the ns vector arrays in the array of vector arrays YY (type N_Vector**), with the results stored in the output array of vector arrays ZZ (type N_Vector**). The operation returns 0 for success and a non-zero value otherwise.

N_VLinearCombinationVectorArray   ier = N_VLinearCombinationVectorArray(nv, ns, c, XX, Z);
This routine computes the linear combination of ns vector arrays containing nv vectors with n elements: zj,i = Σ_{k=0..ns−1} ck·xk,j,i, i = 0, ..., n−1, j = 0, ..., nv−1, where c is an array of ns scalars (type realtype*) and XX (type N_Vector**) is an array of ns vector arrays, each containing nv vectors, to be summed into the output vector array of nv vectors Z (type N_Vector*). If the output vector array Z is one of the vector arrays in XX, then it must be the first vector array in XX. The operation returns 0 for success and a non-zero value otherwise.

7.1 NVECTOR functions used by IDAS

In Table 7.5 below, we list the functions of the nvector module that are used by the idas package. The table also shows, for each function, which of the code modules uses the function. The idas column shows function usage within the main integrator module, while the remaining columns show function usage within the idas linear solver interfaces, the idabbdpre preconditioner module, and the idaa module.

At this point, we should emphasize that the idas user does not need to know anything about the usage of vector functions by the idas code modules in order to use idas. The information is presented as an implementation detail for the interested reader.

Special cases (numbers match markings in table):

1. These routines are only required if an internal difference-quotient routine for constructing dense or band Jacobian matrices is used.

2. This routine is optional, and is only used in estimating space requirements for idas modules for user feedback.

3. The optional function N_VDotProdMulti is only used when Classical Gram-Schmidt is enabled with spgmr or spfgmr.

The remaining operations from Tables 7.3 and 7.4 not listed above are unused, and a user-supplied nvector module for idas could omit these operations. Of the functions listed in Table 7.2, N_VWL2Norm, N_VL1Norm, and N_VInvTest are not used by idas. Therefore a user-supplied nvector module for idas could omit these functions.

7.2 The NVECTOR_SERIAL implementation

The serial implementation of the nvector module provided with sundials, nvector serial, defines the content field of N_Vector to be a structure containing the length of the vector, a pointer to the beginning of a contiguous data array, and a boolean flag own_data which specifies the ownership of data.
struct _N_VectorContent_Serial {
  sunindextype length;
  booleantype own_data;
  realtype *data;
};

The header file to include when using this module is nvector_serial.h. The installed module library to link to is libsundials_nvecserial.lib, where .lib is typically .so for shared libraries and .a for static libraries.

7.2.1 NVECTOR_SERIAL accessor macros

The following macros are provided to access the content of an nvector serial vector. The suffix _S in the names denotes the serial version.

• NV_CONTENT_S

This macro gives access to the contents of the serial vector N_Vector. The assignment v_cont = NV_CONTENT_S(v) sets v_cont to be a pointer to the serial N_Vector content structure.

Implementation:

#define NV_CONTENT_S(v) ( (N_VectorContent_Serial)(v->content) )

• NV_OWN_DATA_S, NV_DATA_S, NV_LENGTH_S

These macros give individual access to the parts of the content of a serial N_Vector. The assignment v_data = NV_DATA_S(v) sets v_data to be a pointer to the first component of the data for the N_Vector v. The assignment NV_DATA_S(v) = v_data sets the component array of v to be v_data by storing the pointer v_data. The assignment v_len = NV_LENGTH_S(v) sets v_len to be the length of v. On the other hand, the call NV_LENGTH_S(v) = len_v sets the length of v to be len_v.

Implementation:

#define NV_OWN_DATA_S(v) ( NV_CONTENT_S(v)->own_data )
#define NV_DATA_S(v)     ( NV_CONTENT_S(v)->data )
#define NV_LENGTH_S(v)   ( NV_CONTENT_S(v)->length )

• NV_Ith_S

This macro gives access to the individual components of the data array of an N_Vector. The assignment r = NV_Ith_S(v,i) sets r to be the value of the i-th component of v. The assignment NV_Ith_S(v,i) = r sets the value of the i-th component of v to be r. Here i ranges from 0 to n − 1 for a vector of length n.
Implementation:

#define NV_Ith_S(v,i) ( NV_DATA_S(v)[i] )

7.2.2 NVECTOR_SERIAL functions

The nvector serial module defines serial implementations of all vector operations listed in Tables 7.2, 7.3, and 7.4. Their names are obtained from those in Tables 7.2, 7.3, and 7.4 by appending the suffix _Serial (e.g. N_VDestroy_Serial). All the standard vector operations listed in Table 7.2 with the suffix _Serial appended are callable via the Fortran 2003 interface by prepending an 'F' (e.g. FN_VDestroy_Serial).

The module nvector serial provides the following additional user-callable routines:

N_VNew_Serial
Prototype     N_Vector N_VNew_Serial(sunindextype vec_length);
Description   This function creates and allocates memory for a serial N_Vector. Its only argument is the vector length.
F2003 Name    This function is callable as FN_VNew_Serial when using the Fortran 2003 interface module.

N_VNewEmpty_Serial
Prototype     N_Vector N_VNewEmpty_Serial(sunindextype vec_length);
Description   This function creates a new serial N_Vector with an empty (NULL) data array.
F2003 Name    This function is callable as FN_VNewEmpty_Serial when using the Fortran 2003 interface module.

N_VMake_Serial
Prototype     N_Vector N_VMake_Serial(sunindextype vec_length, realtype *v_data);
Description   This function creates and allocates memory for a serial vector with a user-provided data array. (This function does not allocate memory for v_data itself.)
F2003 Name    This function is callable as FN_VMake_Serial when using the Fortran 2003 interface module.

N_VCloneVectorArray_Serial
Prototype     N_Vector *N_VCloneVectorArray_Serial(int count, N_Vector w);
Description   This function creates (by cloning) an array of count serial vectors.

N_VCloneVectorArrayEmpty_Serial
Prototype     N_Vector *N_VCloneVectorArrayEmpty_Serial(int count, N_Vector w);
Description   This function creates (by cloning) an array of count serial vectors, each with an empty (NULL) data array.
N_VDestroyVectorArray_Serial
Prototype     void N_VDestroyVectorArray_Serial(N_Vector *vs, int count);
Description   This function frees memory allocated for the array of count variables of type N_Vector created with N_VCloneVectorArray_Serial or with N_VCloneVectorArrayEmpty_Serial.

N_VGetLength_Serial
Prototype     sunindextype N_VGetLength_Serial(N_Vector v);
Description   This function returns the number of vector elements.
F2003 Name    This function is callable as FN_VGetLength_Serial when using the Fortran 2003 interface module.

N_VPrint_Serial
Prototype     void N_VPrint_Serial(N_Vector v);
Description   This function prints the content of a serial vector to stdout.
F2003 Name    This function is callable as FN_VPrint_Serial when using the Fortran 2003 interface module.

N_VPrintFile_Serial
Prototype     void N_VPrintFile_Serial(N_Vector v, FILE *outfile);
Description   This function prints the content of a serial vector to outfile.

By default all fused and vector array operations are disabled in the nvector serial module. The following additional user-callable routines are provided to enable or disable fused and vector array operations for a specific vector. To ensure consistency across vectors, it is recommended to first create a vector with N_VNew_Serial, enable or disable the desired operations for that vector with the functions below, and then create any additional vectors from that vector using N_VClone. This guarantees that the new vectors will have the same operations enabled or disabled, since cloned vectors inherit the enable/disable settings of the vector they are cloned from, while vectors created with N_VNew_Serial have the default settings for the nvector serial module.

N_VEnableFusedOps_Serial
Prototype     int N_VEnableFusedOps_Serial(N_Vector v, booleantype tf);
Description   This function enables (SUNTRUE) or disables (SUNFALSE) all fused and vector array operations in the serial vector.
The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombination_Serial
Prototype     int N_VEnableLinearCombination_Serial(N_Vector v, booleantype tf);
Description   This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination fused operation in the serial vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleAddMulti_Serial
Prototype     int N_VEnableScaleAddMulti_Serial(N_Vector v, booleantype tf);
Description   This function enables (SUNTRUE) or disables (SUNFALSE) the scale-and-add-a-vector-to-multiple-vectors fused operation in the serial vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableDotProdMulti_Serial
Prototype     int N_VEnableDotProdMulti_Serial(N_Vector v, booleantype tf);
Description   This function enables (SUNTRUE) or disables (SUNFALSE) the multiple dot products fused operation in the serial vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearSumVectorArray_Serial
Prototype     int N_VEnableLinearSumVectorArray_Serial(N_Vector v, booleantype tf);
Description   This function enables (SUNTRUE) or disables (SUNFALSE) the linear sum operation for vector arrays in the serial vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleVectorArray_Serial
Prototype     int N_VEnableScaleVectorArray_Serial(N_Vector v, booleantype tf);
Description   This function enables (SUNTRUE) or disables (SUNFALSE) the scale operation for vector arrays in the serial vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableConstVectorArray_Serial
Prototype     int N_VEnableConstVectorArray_Serial(N_Vector v, booleantype tf);
Description   This function enables (SUNTRUE) or disables (SUNFALSE) the const operation for vector arrays in the serial vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableWrmsNormVectorArray_Serial
Prototype     int N_VEnableWrmsNormVectorArray_Serial(N_Vector v, booleantype tf);
Description   This function enables (SUNTRUE) or disables (SUNFALSE) the WRMS norm operation for vector arrays in the serial vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableWrmsNormMaskVectorArray_Serial
Prototype     int N_VEnableWrmsNormMaskVectorArray_Serial(N_Vector v, booleantype tf);
Description   This function enables (SUNTRUE) or disables (SUNFALSE) the masked WRMS norm operation for vector arrays in the serial vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleAddMultiVectorArray_Serial
Prototype     int N_VEnableScaleAddMultiVectorArray_Serial(N_Vector v, booleantype tf);
Description   This function enables (SUNTRUE) or disables (SUNFALSE) the scale-and-add-a-vector-array-to-multiple-vector-arrays operation in the serial vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombinationVectorArray_Serial
Prototype     int N_VEnableLinearCombinationVectorArray_Serial(N_Vector v, booleantype tf);
Description   This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination operation for vector arrays in the serial vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.
Notes

• When looping over the components of an N_Vector v, it is more efficient to first obtain the component array via v_data = NV_DATA_S(v) and then access v_data[i] within the loop than it is to use NV_Ith_S(v,i) within the loop.

• N_VNewEmpty_Serial, N_VMake_Serial, and N_VCloneVectorArrayEmpty_Serial set the field own_data = SUNFALSE. N_VDestroy_Serial and N_VDestroyVectorArray_Serial will not attempt to free the pointer data for any N_Vector with own_data set to SUNFALSE. In such a case, it is the user's responsibility to deallocate the data pointer.

• To maximize efficiency, vector operations in the nvector serial implementation that have more than one N_Vector argument do not check for consistent internal representations of these vectors. It is the user's responsibility to ensure that such routines are called with N_Vector arguments that were all created with the same internal representations.

7.2.3 NVECTOR_SERIAL Fortran interfaces

The nvector serial module provides a Fortran 2003 module as well as Fortran 77 style interface functions for use from Fortran applications.

FORTRAN 2003 interface module

The fnvector serial mod Fortran module defines interfaces to all nvector serial C functions using the intrinsic iso_c_binding module, which provides a standardized mechanism for interoperating with C. As noted in the C function descriptions above, the interface functions are named after the corresponding C function, but with a leading 'F'. For example, the function N_VNew_Serial is interfaced as FN_VNew_Serial. The Fortran 2003 nvector serial interface module can be accessed with the use statement, i.e. use fnvector_serial_mod, and by linking to the library libsundials_fnvectorserial_mod.lib in addition to the C library. For details on where the library and module file fnvector_serial_mod.mod are installed, see Appendix A.
We note that the module is accessible from the Fortran 2003 sundials integrators without separately linking to the libsundials_fnvectorserial_mod library.

FORTRAN 77 interface functions

For solvers that include a Fortran 77 interface module, the nvector serial module also includes a Fortran-callable function FNVINITS(code, NEQ, IER) to initialize this nvector serial module. Here code is an input solver id (1 for cvode, 2 for ida, 3 for kinsol, 4 for arkode); NEQ is the problem size (declared so as to match the C type long int); and IER is an error return flag equal to 0 for success and -1 for failure.

7.3 The NVECTOR_PARALLEL implementation

The nvector parallel implementation of the nvector module provided with sundials is based on MPI. It defines the content field of N_Vector to be a structure containing the global and local lengths of the vector, a pointer to the beginning of a contiguous local data array, an MPI communicator, and a boolean flag own_data indicating ownership of the data array data.

struct _N_VectorContent_Parallel {
  sunindextype local_length;
  sunindextype global_length;
  booleantype own_data;
  realtype *data;
  MPI_Comm comm;
};

The header file to include when using this module is nvector_parallel.h. The installed module library to link to is libsundials_nvecparallel.lib, where .lib is typically .so for shared libraries and .a for static libraries.

7.3.1 NVECTOR_PARALLEL accessor macros

The following macros are provided to access the content of an nvector parallel vector. The suffix _P in the names denotes the distributed memory parallel version.

• NV_CONTENT_P

This macro gives access to the contents of the parallel vector N_Vector. The assignment v_cont = NV_CONTENT_P(v) sets v_cont to be a pointer to the N_Vector content structure of type struct _N_VectorContent_Parallel.
Implementation:

#define NV_CONTENT_P(v) ( (N_VectorContent_Parallel)(v->content) )

• NV_OWN_DATA_P, NV_DATA_P, NV_LOCLENGTH_P, NV_GLOBLENGTH_P

These macros give individual access to the parts of the content of a parallel N_Vector. The assignment v_data = NV_DATA_P(v) sets v_data to be a pointer to the first component of the local data for the N_Vector v. The assignment NV_DATA_P(v) = v_data sets the component array of v to be v_data by storing the pointer v_data. The assignment v_llen = NV_LOCLENGTH_P(v) sets v_llen to be the length of the local part of v. The call NV_LOCLENGTH_P(v) = llen_v sets the local length of v to be llen_v. The assignment v_glen = NV_GLOBLENGTH_P(v) sets v_glen to be the global length of the vector v. The call NV_GLOBLENGTH_P(v) = glen_v sets the global length of v to be glen_v.

Implementation:

#define NV_OWN_DATA_P(v)   ( NV_CONTENT_P(v)->own_data )
#define NV_DATA_P(v)       ( NV_CONTENT_P(v)->data )
#define NV_LOCLENGTH_P(v)  ( NV_CONTENT_P(v)->local_length )
#define NV_GLOBLENGTH_P(v) ( NV_CONTENT_P(v)->global_length )

• NV_COMM_P

This macro provides access to the MPI communicator used by the nvector_parallel vectors.

Implementation:

#define NV_COMM_P(v) ( NV_CONTENT_P(v)->comm )

• NV_Ith_P

This macro gives access to the individual components of the local data array of an N_Vector. The assignment r = NV_Ith_P(v,i) sets r to be the value of the i-th component of the local part of v. The assignment NV_Ith_P(v,i) = r sets the value of the i-th component of the local part of v to be r. Here i ranges from 0 to n − 1, where n is the local length.

Implementation:

#define NV_Ith_P(v,i) ( NV_DATA_P(v)[i] )

7.3.2 NVECTOR_PARALLEL functions

The nvector_parallel module defines parallel implementations of all vector operations listed in Tables 7.2, 7.3, and 7.4. Their names are obtained from those in Tables 7.2, 7.3, and 7.4 by appending the suffix _Parallel (e.g. N_VDestroy_Parallel).
The module nvector_parallel provides the following additional user-callable routines:

N_VNew_Parallel
Prototype    N_Vector N_VNew_Parallel(MPI_Comm comm, sunindextype local_length, sunindextype global_length);
Description  This function creates and allocates memory for a parallel vector.

N_VNewEmpty_Parallel
Prototype    N_Vector N_VNewEmpty_Parallel(MPI_Comm comm, sunindextype local_length, sunindextype global_length);
Description  This function creates a new parallel N_Vector with an empty (NULL) data array.

N_VMake_Parallel
Prototype    N_Vector N_VMake_Parallel(MPI_Comm comm, sunindextype local_length, sunindextype global_length, realtype *v_data);
Description  This function creates and allocates memory for a parallel vector with a user-provided data array. This function does not allocate memory for v_data itself.

N_VCloneVectorArray_Parallel
Prototype    N_Vector *N_VCloneVectorArray_Parallel(int count, N_Vector w);
Description  This function creates (by cloning) an array of count parallel vectors.

N_VCloneVectorArrayEmpty_Parallel
Prototype    N_Vector *N_VCloneVectorArrayEmpty_Parallel(int count, N_Vector w);
Description  This function creates (by cloning) an array of count parallel vectors, each with an empty (NULL) data array.

N_VDestroyVectorArray_Parallel
Prototype    void N_VDestroyVectorArray_Parallel(N_Vector *vs, int count);
Description  This function frees memory allocated for the array of count variables of type N_Vector created with N_VCloneVectorArray_Parallel or with N_VCloneVectorArrayEmpty_Parallel.

N_VGetLength_Parallel
Prototype    sunindextype N_VGetLength_Parallel(N_Vector v);
Description  This function returns the number of vector elements (global vector length).

N_VGetLocalLength_Parallel
Prototype    sunindextype N_VGetLocalLength_Parallel(N_Vector v);
Description  This function returns the local vector length.
N_VPrint_Parallel
Prototype    void N_VPrint_Parallel(N_Vector v);
Description  This function prints the local content of a parallel vector to stdout.

N_VPrintFile_Parallel
Prototype    void N_VPrintFile_Parallel(N_Vector v, FILE *outfile);
Description  This function prints the local content of a parallel vector to outfile.

By default all fused and vector array operations are disabled in the nvector_parallel module. The following additional user-callable routines are provided to enable or disable fused and vector array operations for a specific vector. To ensure consistency across vectors, it is recommended to first create a vector with N_VNew_Parallel, enable or disable the desired operations for that vector with the functions below, and then create any additional vectors from that vector using N_VClone. This guarantees that the new vectors will have the same operations enabled or disabled, since cloned vectors inherit the enable/disable options of the vector they are cloned from, while vectors created with N_VNew_Parallel have the default settings for the nvector_parallel module.

N_VEnableFusedOps_Parallel
Prototype    int N_VEnableFusedOps_Parallel(N_Vector v, booleantype tf);
Description  This function enables (SUNTRUE) or disables (SUNFALSE) all fused and vector array operations in the parallel vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombination_Parallel
Prototype    int N_VEnableLinearCombination_Parallel(N_Vector v, booleantype tf);
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination fused operation in the parallel vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.
N_VEnableScaleAddMulti_Parallel
Prototype    int N_VEnableScaleAddMulti_Parallel(N_Vector v, booleantype tf);
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale and add a vector to multiple vectors fused operation in the parallel vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableDotProdMulti_Parallel
Prototype    int N_VEnableDotProdMulti_Parallel(N_Vector v, booleantype tf);
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the multiple dot products fused operation in the parallel vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearSumVectorArray_Parallel
Prototype    int N_VEnableLinearSumVectorArray_Parallel(N_Vector v, booleantype tf);
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear sum operation for vector arrays in the parallel vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleVectorArray_Parallel
Prototype    int N_VEnableScaleVectorArray_Parallel(N_Vector v, booleantype tf);
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale operation for vector arrays in the parallel vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableConstVectorArray_Parallel
Prototype    int N_VEnableConstVectorArray_Parallel(N_Vector v, booleantype tf);
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the const operation for vector arrays in the parallel vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableWrmsNormVectorArray_Parallel
Prototype    int N_VEnableWrmsNormVectorArray_Parallel(N_Vector v, booleantype tf);
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the WRMS norm operation for vector arrays in the parallel vector.
The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableWrmsNormMaskVectorArray_Parallel
Prototype    int N_VEnableWrmsNormMaskVectorArray_Parallel(N_Vector v, booleantype tf);
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the masked WRMS norm operation for vector arrays in the parallel vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleAddMultiVectorArray_Parallel
Prototype    int N_VEnableScaleAddMultiVectorArray_Parallel(N_Vector v, booleantype tf);
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale and add a vector array to multiple vector arrays operation in the parallel vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombinationVectorArray_Parallel
Prototype    int N_VEnableLinearCombinationVectorArray_Parallel(N_Vector v, booleantype tf);
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination operation for vector arrays in the parallel vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

Notes

• When looping over the components of an N_Vector v, it is more efficient to first obtain the local component array via v_data = NV_DATA_P(v) and then access v_data[i] within the loop than it is to use NV_Ith_P(v,i) within the loop.

• N_VNewEmpty_Parallel, N_VMake_Parallel, and N_VCloneVectorArrayEmpty_Parallel set the field own_data = SUNFALSE. N_VDestroy_Parallel and N_VDestroyVectorArray_Parallel will not attempt to free the pointer data for any N_Vector with own_data set to SUNFALSE. In such a case, it is the user's responsibility to deallocate the data pointer.
• To maximize efficiency, vector operations in the nvector_parallel implementation that have more than one N_Vector argument do not check for consistent internal representation of these vectors. It is the user's responsibility to ensure that such routines are called with N_Vector arguments that were all created with the same internal representation.

7.3.3 NVECTOR_PARALLEL Fortran interfaces

For solvers that include a Fortran 77 interface module, the nvector_parallel module also includes a Fortran-callable function FNVINITP(COMM, code, NLOCAL, NGLOBAL, IER) to initialize this nvector_parallel module. Here COMM is the MPI communicator; code is an input solver id (1 for cvode, 2 for ida, 3 for kinsol, 4 for arkode); NLOCAL and NGLOBAL are the local and global vector sizes, respectively (declared so as to match C type long int); and IER is an error return flag, equal to 0 for success and -1 for failure.

NOTE: If the header file sundials_config.h defines SUNDIALS_MPI_COMM_F2C to be 1 (meaning the MPI implementation used to build sundials includes the MPI_Comm_f2c function), then COMM can be any valid MPI communicator. Otherwise, MPI_COMM_WORLD will be used, so just pass an integer value as a placeholder.

7.4 The NVECTOR_OPENMP implementation

In situations where a user has a multi-core processing unit capable of running multiple parallel threads with shared memory, sundials provides an implementation of nvector using OpenMP, called nvector_openmp, and an implementation using Pthreads, called nvector_pthreads. Testing has shown that vectors should be of length at least 100,000 before the overhead associated with creating and using the threads is made up by the parallelism in the vector calculations.
The OpenMP nvector implementation provided with sundials, nvector_openmp, defines the content field of N_Vector to be a structure containing the length of the vector, a pointer to the beginning of a contiguous data array, a boolean flag own_data which specifies the ownership of data, and the number of threads. Operations on the vector are threaded using OpenMP.

struct _N_VectorContent_OpenMP {
  sunindextype length;
  booleantype own_data;
  realtype *data;
  int num_threads;
};

The header file to include when using this module is nvector_openmp.h. The installed module library to link to is libsundials_nvecopenmp.lib, where .lib is typically .so for shared libraries and .a for static libraries. The Fortran module file to use when using the Fortran 2003 interface to this module is fnvector_openmp_mod.mod.

7.4.1 NVECTOR_OPENMP accessor macros

The following macros are provided to access the content of an nvector_openmp vector. The suffix _OMP in the names denotes the OpenMP version.

• NV_CONTENT_OMP

This macro gives access to the contents of the OpenMP N_Vector. The assignment v_cont = NV_CONTENT_OMP(v) sets v_cont to be a pointer to the OpenMP N_Vector content structure.

Implementation:

#define NV_CONTENT_OMP(v) ( (N_VectorContent_OpenMP)(v->content) )

• NV_OWN_DATA_OMP, NV_DATA_OMP, NV_LENGTH_OMP, NV_NUM_THREADS_OMP

These macros give individual access to the parts of the content of an OpenMP N_Vector. The assignment v_data = NV_DATA_OMP(v) sets v_data to be a pointer to the first component of the data for the N_Vector v. The assignment NV_DATA_OMP(v) = v_data sets the component array of v to be v_data by storing the pointer v_data. The assignment v_len = NV_LENGTH_OMP(v) sets v_len to be the length of v. On the other hand, the call NV_LENGTH_OMP(v) = len_v sets the length of v to be len_v. The assignment v_num_threads = NV_NUM_THREADS_OMP(v) sets v_num_threads to be the number of threads from v.
On the other hand, the call NV_NUM_THREADS_OMP(v) = num_threads_v sets the number of threads for v to be num_threads_v.

Implementation:

#define NV_OWN_DATA_OMP(v)    ( NV_CONTENT_OMP(v)->own_data )
#define NV_DATA_OMP(v)        ( NV_CONTENT_OMP(v)->data )
#define NV_LENGTH_OMP(v)      ( NV_CONTENT_OMP(v)->length )
#define NV_NUM_THREADS_OMP(v) ( NV_CONTENT_OMP(v)->num_threads )

• NV_Ith_OMP

This macro gives access to the individual components of the data array of an N_Vector. The assignment r = NV_Ith_OMP(v,i) sets r to be the value of the i-th component of v. The assignment NV_Ith_OMP(v,i) = r sets the value of the i-th component of v to be r. Here i ranges from 0 to n − 1 for a vector of length n.

Implementation:

#define NV_Ith_OMP(v,i) ( NV_DATA_OMP(v)[i] )

7.4.2 NVECTOR_OPENMP functions

The nvector_openmp module defines OpenMP implementations of all vector operations listed in Tables 7.2, 7.3, and 7.4. Their names are obtained from those in Tables 7.2, 7.3, and 7.4 by appending the suffix _OpenMP (e.g. N_VDestroy_OpenMP). All the standard vector operations listed in Table 7.2, with the suffix _OpenMP appended, are callable via the Fortran 2003 interface by prepending an 'F' (e.g. FN_VDestroy_OpenMP).

The module nvector_openmp provides the following additional user-callable routines:

N_VNew_OpenMP
Prototype    N_Vector N_VNew_OpenMP(sunindextype vec_length, int num_threads)
Description  This function creates and allocates memory for an OpenMP N_Vector. Arguments are the vector length and number of threads.
F2003 Name   This function is callable as FN_VNew_OpenMP when using the Fortran 2003 interface module.

N_VNewEmpty_OpenMP
Prototype    N_Vector N_VNewEmpty_OpenMP(sunindextype vec_length, int num_threads)
Description  This function creates a new OpenMP N_Vector with an empty (NULL) data array.
F2003 Name   This function is callable as FN_VNewEmpty_OpenMP when using the Fortran 2003 interface module.
N_VMake_OpenMP
Prototype    N_Vector N_VMake_OpenMP(sunindextype vec_length, realtype *v_data, int num_threads);
Description  This function creates and allocates memory for an OpenMP vector with a user-provided data array. This function does not allocate memory for v_data itself.
F2003 Name   This function is callable as FN_VMake_OpenMP when using the Fortran 2003 interface module.

N_VCloneVectorArray_OpenMP
Prototype    N_Vector *N_VCloneVectorArray_OpenMP(int count, N_Vector w)
Description  This function creates (by cloning) an array of count OpenMP vectors.

N_VCloneVectorArrayEmpty_OpenMP
Prototype    N_Vector *N_VCloneVectorArrayEmpty_OpenMP(int count, N_Vector w)
Description  This function creates (by cloning) an array of count OpenMP vectors, each with an empty (NULL) data array.

N_VDestroyVectorArray_OpenMP
Prototype    void N_VDestroyVectorArray_OpenMP(N_Vector *vs, int count)
Description  This function frees memory allocated for the array of count variables of type N_Vector created with N_VCloneVectorArray_OpenMP or with N_VCloneVectorArrayEmpty_OpenMP.

N_VGetLength_OpenMP
Prototype    sunindextype N_VGetLength_OpenMP(N_Vector v)
Description  This function returns the number of vector elements.
F2003 Name   This function is callable as FN_VGetLength_OpenMP when using the Fortran 2003 interface module.

N_VPrint_OpenMP
Prototype    void N_VPrint_OpenMP(N_Vector v)
Description  This function prints the content of an OpenMP vector to stdout.
F2003 Name   This function is callable as FN_VPrint_OpenMP when using the Fortran 2003 interface module.

N_VPrintFile_OpenMP
Prototype    void N_VPrintFile_OpenMP(N_Vector v, FILE *outfile)
Description  This function prints the content of an OpenMP vector to outfile.

By default all fused and vector array operations are disabled in the nvector_openmp module. The following additional user-callable routines are provided to enable or disable fused and vector array operations for a specific vector.
To ensure consistency across vectors, it is recommended to first create a vector with N_VNew_OpenMP, enable or disable the desired operations for that vector with the functions below, and then create any additional vectors from that vector using N_VClone. This guarantees that the new vectors will have the same operations enabled or disabled, since cloned vectors inherit the enable/disable options of the vector they are cloned from, while vectors created with N_VNew_OpenMP have the default settings for the nvector_openmp module.

N_VEnableFusedOps_OpenMP
Prototype    int N_VEnableFusedOps_OpenMP(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) all fused and vector array operations in the OpenMP vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombination_OpenMP
Prototype    int N_VEnableLinearCombination_OpenMP(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination fused operation in the OpenMP vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleAddMulti_OpenMP
Prototype    int N_VEnableScaleAddMulti_OpenMP(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale and add a vector to multiple vectors fused operation in the OpenMP vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableDotProdMulti_OpenMP
Prototype    int N_VEnableDotProdMulti_OpenMP(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the multiple dot products fused operation in the OpenMP vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.
N_VEnableLinearSumVectorArray_OpenMP
Prototype    int N_VEnableLinearSumVectorArray_OpenMP(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear sum operation for vector arrays in the OpenMP vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleVectorArray_OpenMP
Prototype    int N_VEnableScaleVectorArray_OpenMP(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale operation for vector arrays in the OpenMP vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableConstVectorArray_OpenMP
Prototype    int N_VEnableConstVectorArray_OpenMP(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the const operation for vector arrays in the OpenMP vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableWrmsNormVectorArray_OpenMP
Prototype    int N_VEnableWrmsNormVectorArray_OpenMP(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the WRMS norm operation for vector arrays in the OpenMP vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableWrmsNormMaskVectorArray_OpenMP
Prototype    int N_VEnableWrmsNormMaskVectorArray_OpenMP(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the masked WRMS norm operation for vector arrays in the OpenMP vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.
N_VEnableScaleAddMultiVectorArray_OpenMP
Prototype    int N_VEnableScaleAddMultiVectorArray_OpenMP(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale and add a vector array to multiple vector arrays operation in the OpenMP vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombinationVectorArray_OpenMP
Prototype    int N_VEnableLinearCombinationVectorArray_OpenMP(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination operation for vector arrays in the OpenMP vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

Notes

• When looping over the components of an N_Vector v, it is more efficient to first obtain the component array via v_data = NV_DATA_OMP(v) and then access v_data[i] within the loop than it is to use NV_Ith_OMP(v,i) within the loop.

• N_VNewEmpty_OpenMP, N_VMake_OpenMP, and N_VCloneVectorArrayEmpty_OpenMP set the field own_data = SUNFALSE. N_VDestroy_OpenMP and N_VDestroyVectorArray_OpenMP will not attempt to free the pointer data for any N_Vector with own_data set to SUNFALSE. In such a case, it is the user's responsibility to deallocate the data pointer.

• To maximize efficiency, vector operations in the nvector_openmp implementation that have more than one N_Vector argument do not check for consistent internal representation of these vectors. It is the user's responsibility to ensure that such routines are called with N_Vector arguments that were all created with the same internal representation.

7.4.3 NVECTOR_OPENMP Fortran interfaces

The nvector_openmp module provides a Fortran 2003 module as well as Fortran 77 style interface functions for use from Fortran applications.
FORTRAN 2003 interface module

The fnvector_openmp_mod Fortran module defines interfaces to most nvector_openmp C functions using the intrinsic iso_c_binding module, which provides a standardized mechanism for interoperating with C. As noted in the C function descriptions above, the interface functions are named after the corresponding C function, but with a leading 'F'. For example, the function N_VNew_OpenMP is interfaced as FN_VNew_OpenMP. The Fortran 2003 nvector_openmp interface module can be accessed with the use statement, i.e. use fnvector_openmp_mod, and by linking to the library libsundials_fnvectoropenmp_mod.lib in addition to the C library. For details on where the library and module file fnvector_openmp_mod.mod are installed, see Appendix A.

FORTRAN 77 interface functions

For solvers that include a Fortran 77 interface module, the nvector_openmp module also includes a Fortran-callable function FNVINITOMP(code, NEQ, NUMTHREADS, IER) to initialize this module. Here code is an input solver id (1 for cvode, 2 for ida, 3 for kinsol, 4 for arkode); NEQ is the problem size (declared so as to match C type long int); NUMTHREADS is the number of threads; and IER is an error return flag, equal to 0 for success and -1 for failure.

7.5 The NVECTOR_PTHREADS implementation

In situations where a user has a multi-core processing unit capable of running multiple parallel threads with shared memory, sundials provides an implementation of nvector using OpenMP, called nvector_openmp, and an implementation using Pthreads, called nvector_pthreads. Testing has shown that vectors should be of length at least 100,000 before the overhead associated with creating and using the threads is made up by the parallelism in the vector calculations.
The Pthreads nvector implementation provided with sundials, denoted nvector_pthreads, defines the content field of N_Vector to be a structure containing the length of the vector, a pointer to the beginning of a contiguous data array, a boolean flag own_data which specifies the ownership of data, and the number of threads. Operations on the vector are threaded using POSIX threads (Pthreads).

struct _N_VectorContent_Pthreads {
  sunindextype length;
  booleantype own_data;
  realtype *data;
  int num_threads;
};

The header file to include when using this module is nvector_pthreads.h. The installed module library to link to is libsundials_nvecpthreads.lib, where .lib is typically .so for shared libraries and .a for static libraries.

7.5.1 NVECTOR_PTHREADS accessor macros

The following macros are provided to access the content of an nvector_pthreads vector. The suffix _PT in the names denotes the Pthreads version.

• NV_CONTENT_PT

This macro gives access to the contents of the Pthreads N_Vector. The assignment v_cont = NV_CONTENT_PT(v) sets v_cont to be a pointer to the Pthreads N_Vector content structure.

Implementation:

#define NV_CONTENT_PT(v) ( (N_VectorContent_Pthreads)(v->content) )

• NV_OWN_DATA_PT, NV_DATA_PT, NV_LENGTH_PT, NV_NUM_THREADS_PT

These macros give individual access to the parts of the content of a Pthreads N_Vector. The assignment v_data = NV_DATA_PT(v) sets v_data to be a pointer to the first component of the data for the N_Vector v. The assignment NV_DATA_PT(v) = v_data sets the component array of v to be v_data by storing the pointer v_data. The assignment v_len = NV_LENGTH_PT(v) sets v_len to be the length of v. On the other hand, the call NV_LENGTH_PT(v) = len_v sets the length of v to be len_v. The assignment v_num_threads = NV_NUM_THREADS_PT(v) sets v_num_threads to be the number of threads from v.
On the other hand, the call NV_NUM_THREADS_PT(v) = num_threads_v sets the number of threads for v to be num_threads_v.

Implementation:

#define NV_OWN_DATA_PT(v)    ( NV_CONTENT_PT(v)->own_data )
#define NV_DATA_PT(v)        ( NV_CONTENT_PT(v)->data )
#define NV_LENGTH_PT(v)      ( NV_CONTENT_PT(v)->length )
#define NV_NUM_THREADS_PT(v) ( NV_CONTENT_PT(v)->num_threads )

• NV_Ith_PT

This macro gives access to the individual components of the data array of an N_Vector. The assignment r = NV_Ith_PT(v,i) sets r to be the value of the i-th component of v. The assignment NV_Ith_PT(v,i) = r sets the value of the i-th component of v to be r. Here i ranges from 0 to n − 1 for a vector of length n.

Implementation:

#define NV_Ith_PT(v,i) ( NV_DATA_PT(v)[i] )

7.5.2 NVECTOR_PTHREADS functions

The nvector_pthreads module defines Pthreads implementations of all vector operations listed in Tables 7.2, 7.3, and 7.4. Their names are obtained from those in Tables 7.2, 7.3, and 7.4 by appending the suffix _Pthreads (e.g. N_VDestroy_Pthreads). All the standard vector operations listed in Table 7.2 are callable via the Fortran 2003 interface by prepending an 'F' (e.g. FN_VDestroy_Pthreads).

The module nvector_pthreads provides the following additional user-callable routines:

N_VNew_Pthreads
Prototype    N_Vector N_VNew_Pthreads(sunindextype vec_length, int num_threads)
Description  This function creates and allocates memory for a Pthreads N_Vector. Arguments are the vector length and number of threads.
F2003 Name   This function is callable as FN_VNew_Pthreads when using the Fortran 2003 interface module.

N_VNewEmpty_Pthreads
Prototype    N_Vector N_VNewEmpty_Pthreads(sunindextype vec_length, int num_threads)
Description  This function creates a new Pthreads N_Vector with an empty (NULL) data array.
F2003 Name   This function is callable as FN_VNewEmpty_Pthreads when using the Fortran 2003 interface module.
N_VMake_Pthreads
Prototype    N_Vector N_VMake_Pthreads(sunindextype vec_length, realtype *v_data, int num_threads);
Description  This function creates and allocates memory for a Pthreads vector with a user-provided data array. This function does not allocate memory for v_data itself.
F2003 Name   This function is callable as FN_VMake_Pthreads when using the Fortran 2003 interface module.

N_VCloneVectorArray_Pthreads
Prototype    N_Vector *N_VCloneVectorArray_Pthreads(int count, N_Vector w)
Description  This function creates (by cloning) an array of count Pthreads vectors.

N_VCloneVectorArrayEmpty_Pthreads
Prototype    N_Vector *N_VCloneVectorArrayEmpty_Pthreads(int count, N_Vector w)
Description  This function creates (by cloning) an array of count Pthreads vectors, each with an empty (NULL) data array.

N_VDestroyVectorArray_Pthreads
Prototype    void N_VDestroyVectorArray_Pthreads(N_Vector *vs, int count)
Description  This function frees memory allocated for the array of count variables of type N_Vector created with N_VCloneVectorArray_Pthreads or with N_VCloneVectorArrayEmpty_Pthreads.

N_VGetLength_Pthreads
Prototype    sunindextype N_VGetLength_Pthreads(N_Vector v)
Description  This function returns the number of vector elements.
F2003 Name   This function is callable as FN_VGetLength_Pthreads when using the Fortran 2003 interface module.

N_VPrint_Pthreads
Prototype    void N_VPrint_Pthreads(N_Vector v)
Description  This function prints the content of a Pthreads vector to stdout.
F2003 Name   This function is callable as FN_VPrint_Pthreads when using the Fortran 2003 interface module.

N_VPrintFile_Pthreads
Prototype    void N_VPrintFile_Pthreads(N_Vector v, FILE *outfile)
Description  This function prints the content of a Pthreads vector to outfile.

By default all fused and vector array operations are disabled in the nvector_pthreads module.
The following additional user-callable routines are provided to enable or disable fused and vector array operations for a specific vector. To ensure consistency across vectors, it is recommended to first create a vector with N_VNew_Pthreads, enable or disable the desired operations for that vector with the functions below, and then create any additional vectors from that vector using N_VClone. This guarantees that the new vectors will have the same operations enabled or disabled, since cloned vectors inherit the enable/disable options of the vector they are cloned from, while vectors created with N_VNew_Pthreads have the default settings for the nvector_pthreads module.

N_VEnableFusedOps_Pthreads
Prototype    int N_VEnableFusedOps_Pthreads(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) all fused and vector array operations in the Pthreads vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombination_Pthreads
Prototype    int N_VEnableLinearCombination_Pthreads(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination fused operation in the Pthreads vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleAddMulti_Pthreads
Prototype    int N_VEnableScaleAddMulti_Pthreads(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale and add a vector to multiple vectors fused operation in the Pthreads vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableDotProdMulti_Pthreads
Prototype    int N_VEnableDotProdMulti_Pthreads(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the multiple dot products fused operation in the Pthreads vector.
The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearSumVectorArray_Pthreads
Prototype    int N_VEnableLinearSumVectorArray_Pthreads(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear sum operation for vector arrays in the Pthreads vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleVectorArray_Pthreads
Prototype    int N_VEnableScaleVectorArray_Pthreads(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale operation for vector arrays in the Pthreads vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableConstVectorArray_Pthreads
Prototype    int N_VEnableConstVectorArray_Pthreads(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the const operation for vector arrays in the Pthreads vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableWrmsNormVectorArray_Pthreads
Prototype    int N_VEnableWrmsNormVectorArray_Pthreads(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the WRMS norm operation for vector arrays in the Pthreads vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableWrmsNormMaskVectorArray_Pthreads
Prototype    int N_VEnableWrmsNormMaskVectorArray_Pthreads(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the masked WRMS norm operation for vector arrays in the Pthreads vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.
N_VEnableScaleAddMultiVectorArray_Pthreads
Prototype    int N_VEnableScaleAddMultiVectorArray_Pthreads(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale and add a vector array to multiple vector arrays operation in the Pthreads vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombinationVectorArray_Pthreads
Prototype    int N_VEnableLinearCombinationVectorArray_Pthreads(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination operation for vector arrays in the Pthreads vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

Notes
• When looping over the components of an N_Vector v, it is more efficient to first obtain the component array via v_data = NV_DATA_PT(v) and then access v_data[i] within the loop than it is to use NV_Ith_PT(v,i) within the loop.
• N_VNewEmpty_Pthreads, N_VMake_Pthreads, and N_VCloneVectorArrayEmpty_Pthreads set the field own_data = SUNFALSE. N_VDestroy_Pthreads and N_VDestroyVectorArray_Pthreads will not attempt to free the pointer data for any N_Vector with own_data set to SUNFALSE. In such a case, it is the user’s responsibility to deallocate the data pointer.
• To maximize efficiency, vector operations in the nvector pthreads implementation that have more than one N_Vector argument do not check for consistent internal representation of these vectors. It is the user’s responsibility to ensure that such routines are called with N_Vector arguments that were all created with the same internal representations.

7.5.3 NVECTOR PTHREADS Fortran interfaces
The nvector pthreads module provides a Fortran 2003 module as well as Fortran 77 style interface functions for use from Fortran applications.
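Returning to the C interface, the create/enable/clone recommendation and the data-access tip above can be combined in a short sketch. This is a minimal illustration only (the problem size and thread count are hypothetical, and error checking is omitted):

```c
#include <nvector/nvector_pthreads.h>

int main(void)
{
  sunindextype N = 1000;   /* hypothetical problem size */
  int nthreads   = 4;      /* hypothetical thread count */

  /* Create a template vector and enable a fused operation on it */
  N_Vector u = N_VNew_Pthreads(N, nthreads);
  N_VEnableLinearCombination_Pthreads(u, SUNTRUE);

  /* Clones inherit the enable/disable settings of u */
  N_Vector v = N_VClone(u);

  /* Efficient component access: fetch the data array once,
     rather than using NV_Ith_PT(u,i) inside the loop */
  realtype *u_data = NV_DATA_PT(u);
  for (sunindextype i = 0; i < N; i++) u_data[i] = 1.0;

  N_VDestroy(v);
  N_VDestroy(u);
  return 0;
}
```

Vectors created later with N_VNew_Pthreads (rather than by cloning u) would again have the module's default settings.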
FORTRAN 2003 interface module
The fnvector_pthreads_mod Fortran module defines interfaces to most nvector pthreads C functions using the intrinsic iso_c_binding module, which provides a standardized mechanism for interoperating with C. As noted in the C function descriptions above, the interface functions are named after the corresponding C function, but with a leading ‘F’. For example, the function N_VNew_Pthreads is interfaced as FN_VNew_Pthreads. The Fortran 2003 nvector pthreads interface module can be accessed with the use statement, i.e. use fnvector_pthreads_mod, and by linking to the library libsundials_fnvectorpthreads_mod.lib in addition to the C library. For details on where the library and module file fnvector_pthreads_mod.mod are installed, see Appendix A.

FORTRAN 77 interface functions
For solvers that include a Fortran interface module, the nvector pthreads module also includes a Fortran-callable function FNVINITPTS(code, NEQ, NUMTHREADS, IER) to initialize this module. Here code is an input solver id (1 for cvode, 2 for ida, 3 for kinsol, 4 for arkode); NEQ is the problem size (declared so as to match C type long int); NUMTHREADS is the number of threads; and IER is an error return flag equal to 0 for success and -1 for failure.

7.6 The NVECTOR PARHYP implementation
The nvector parhyp implementation of the nvector module provided with sundials is a wrapper around hypre’s ParVector class. Most of the vector kernels simply call hypre vector operations. The implementation defines the content field of N_Vector to be a structure containing the global and local lengths of the vector, a pointer to an object of type HYPRE_ParVector, an MPI communicator, and a boolean flag own_parvector indicating ownership of the hypre parallel vector object x.
struct _N_VectorContent_ParHyp {
  sunindextype local_length;
  sunindextype global_length;
  booleantype own_parvector;
  MPI_Comm comm;
  HYPRE_ParVector x;
};

The header file to include when using this module is nvector_parhyp.h. The installed module library to link to is libsundials_nvecparhyp.lib where .lib is typically .so for shared libraries and .a for static libraries. Unlike native sundials vector types, nvector parhyp does not provide macros to access its member variables. Note that nvector parhyp requires sundials to be built with MPI support.

7.6.1 NVECTOR PARHYP functions
The nvector parhyp module defines implementations of all vector operations listed in Tables 7.2, 7.3, and 7.4, except for N_VSetArrayPointer and N_VGetArrayPointer, because accessing raw vector data is handled by low-level hypre functions. As such, this vector is not available for use with sundials Fortran interfaces. When access to raw vector data is needed, one should extract the hypre vector first, and then use hypre methods to access the data. Usage examples of nvector parhyp are provided in the cvAdvDiff_non_ph.c example program for cvode [31] and the ark_diurnal_kry_ph.c example program for arkode [39]. The names of parhyp methods are obtained from those in Tables 7.2, 7.3, and 7.4 by appending the suffix _ParHyp (e.g. N_VDestroy_ParHyp). The module nvector parhyp provides the following additional user-callable routines:

N_VNewEmpty_ParHyp
Prototype    N_Vector N_VNewEmpty_ParHyp(MPI_Comm comm, sunindextype local_length, sunindextype global_length)
Description  This function creates a new parhyp N_Vector with the pointer to the hypre vector set to NULL.

N_VMake_ParHyp
Prototype    N_Vector N_VMake_ParHyp(HYPRE_ParVector x)
Description  This function creates an N_Vector wrapper around an existing hypre parallel vector. It does not allocate memory for x itself.
N_VGetVector_ParHyp
Prototype    HYPRE_ParVector N_VGetVector_ParHyp(N_Vector v)
Description  This function returns the underlying hypre vector.

N_VCloneVectorArray_ParHyp
Prototype    N_Vector *N_VCloneVectorArray_ParHyp(int count, N_Vector w)
Description  This function creates (by cloning) an array of count parallel vectors.

N_VCloneVectorArrayEmpty_ParHyp
Prototype    N_Vector *N_VCloneVectorArrayEmpty_ParHyp(int count, N_Vector w)
Description  This function creates (by cloning) an array of count parallel vectors, each with an empty (NULL) data array.

N_VDestroyVectorArray_ParHyp
Prototype    void N_VDestroyVectorArray_ParHyp(N_Vector *vs, int count)
Description  This function frees memory allocated for the array of count variables of type N_Vector created with N_VCloneVectorArray_ParHyp or with N_VCloneVectorArrayEmpty_ParHyp.

N_VPrint_ParHyp
Prototype    void N_VPrint_ParHyp(N_Vector v)
Description  This function prints the local content of a parhyp vector to stdout.

N_VPrintFile_ParHyp
Prototype    void N_VPrintFile_ParHyp(N_Vector v, FILE *outfile)
Description  This function prints the local content of a parhyp vector to outfile.

By default all fused and vector array operations are disabled in the nvector parhyp module. The following additional user-callable routines are provided to enable or disable fused and vector array operations for a specific vector. To ensure consistency across vectors it is recommended to first create a vector with N_VMake_ParHyp, enable/disable the desired operations for that vector with the functions below, and create any additional vectors from that vector using N_VClone. This guarantees that the new vectors will have the same operations enabled or disabled, since cloned vectors inherit the enable/disable settings of the vector they are cloned from, while vectors created with N_VMake_ParHyp will have the default settings for the nvector parhyp module.
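The wrapper routines above can be combined as in the following sketch. It assumes a HYPRE_ParVector x created elsewhere with hypre (hypothetical caller-supplied argument; error checking omitted):

```c
#include <nvector/nvector_parhyp.h>

/* Wrap an existing hypre parallel vector without copying
   or allocating its data. */
void use_parhyp(HYPRE_ParVector x)
{
  N_Vector v = N_VMake_ParHyp(x);

  /* Enable fused operations before cloning so clones inherit them */
  N_VEnableFusedOps_ParHyp(v, SUNTRUE);
  N_Vector w = N_VClone(v);

  /* For raw data access, extract the hypre vector and use
     hypre methods on it */
  HYPRE_ParVector xv = N_VGetVector_ParHyp(v);
  (void) xv;

  N_VDestroy(w);
  N_VDestroy(v);  /* does not delete x: own_parvector is SUNFALSE */
}
```

Because N_VMake_ParHyp sets own_parvector to SUNFALSE, destroying the wrapper leaves the underlying hypre vector intact; deleting x remains the caller's responsibility.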
N_VEnableFusedOps_ParHyp
Prototype    int N_VEnableFusedOps_ParHyp(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) all fused and vector array operations in the parhyp vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombination_ParHyp
Prototype    int N_VEnableLinearCombination_ParHyp(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination fused operation in the parhyp vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleAddMulti_ParHyp
Prototype    int N_VEnableScaleAddMulti_ParHyp(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale and add a vector to multiple vectors fused operation in the parhyp vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableDotProdMulti_ParHyp
Prototype    int N_VEnableDotProdMulti_ParHyp(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the multiple dot products fused operation in the parhyp vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearSumVectorArray_ParHyp
Prototype    int N_VEnableLinearSumVectorArray_ParHyp(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear sum operation for vector arrays in the parhyp vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleVectorArray_ParHyp
Prototype    int N_VEnableScaleVectorArray_ParHyp(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale operation for vector arrays in the parhyp vector.
The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableConstVectorArray_ParHyp
Prototype    int N_VEnableConstVectorArray_ParHyp(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the const operation for vector arrays in the parhyp vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableWrmsNormVectorArray_ParHyp
Prototype    int N_VEnableWrmsNormVectorArray_ParHyp(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the WRMS norm operation for vector arrays in the parhyp vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableWrmsNormMaskVectorArray_ParHyp
Prototype    int N_VEnableWrmsNormMaskVectorArray_ParHyp(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the masked WRMS norm operation for vector arrays in the parhyp vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleAddMultiVectorArray_ParHyp
Prototype    int N_VEnableScaleAddMultiVectorArray_ParHyp(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale and add a vector array to multiple vector arrays operation in the parhyp vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombinationVectorArray_ParHyp
Prototype    int N_VEnableLinearCombinationVectorArray_ParHyp(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination operation for vector arrays in the parhyp vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.
Notes
• When there is a need to access components of an N_Vector_ParHyp, v, it is recommended to extract the hypre vector via x_vec = N_VGetVector_ParHyp(v) and then access components using appropriate hypre functions.
• N_VNewEmpty_ParHyp, N_VMake_ParHyp, and N_VCloneVectorArrayEmpty_ParHyp set the field own_parvector to SUNFALSE. N_VDestroy_ParHyp and N_VDestroyVectorArray_ParHyp will not attempt to delete an underlying hypre vector for any N_Vector with own_parvector set to SUNFALSE. In such a case, it is the user’s responsibility to delete the underlying vector.
• To maximize efficiency, vector operations in the nvector parhyp implementation that have more than one N_Vector argument do not check for consistent internal representations of these vectors. It is the user’s responsibility to ensure that such routines are called with N_Vector arguments that were all created with the same internal representations.

7.7 The NVECTOR PETSC implementation
The nvector petsc module is an nvector wrapper around the petsc vector. It defines the content field of an N_Vector to be a structure containing the global and local lengths of the vector, a pointer to the petsc vector, an MPI communicator, and a boolean flag own_data indicating ownership of the wrapped petsc vector.

struct _N_VectorContent_Petsc {
  sunindextype local_length;
  sunindextype global_length;
  booleantype own_data;
  Vec *pvec;
  MPI_Comm comm;
};

The header file to include when using this module is nvector_petsc.h. The installed module library to link to is libsundials_nvecpetsc.lib where .lib is typically .so for shared libraries and .a for static libraries. Unlike native sundials vector types, nvector petsc does not provide macros to access its member variables. Note that nvector petsc requires sundials to be built with MPI support.
7.7.1 NVECTOR PETSC functions
The nvector petsc module defines implementations of all vector operations listed in Tables 7.2, 7.3, and 7.4, except for N_VGetArrayPointer and N_VSetArrayPointer. As such, this vector cannot be used with sundials Fortran interfaces. When access to raw vector data is needed, it is recommended to extract the petsc vector first, and then use petsc methods to access the data. Usage examples of nvector petsc are provided in example programs for ida [29]. The names of vector operations are obtained from those in Tables 7.2, 7.3, and 7.4 by appending the suffix _Petsc (e.g. N_VDestroy_Petsc). The module nvector petsc provides the following additional user-callable routines:

N_VNewEmpty_Petsc
Prototype    N_Vector N_VNewEmpty_Petsc(MPI_Comm comm, sunindextype local_length, sunindextype global_length)
Description  This function creates a new nvector wrapper with the pointer to the wrapped petsc vector set to NULL. It is used by the N_VMake_Petsc and N_VClone_Petsc implementations.

N_VMake_Petsc
Prototype    N_Vector N_VMake_Petsc(Vec *pvec)
Description  This function creates and allocates memory for an nvector petsc wrapper around a user-provided petsc vector. It does not allocate memory for the vector pvec itself.

N_VGetVector_Petsc
Prototype    Vec *N_VGetVector_Petsc(N_Vector v)
Description  This function returns a pointer to the underlying petsc vector.

N_VCloneVectorArray_Petsc
Prototype    N_Vector *N_VCloneVectorArray_Petsc(int count, N_Vector w)
Description  This function creates (by cloning) an array of count nvector petsc vectors.

N_VCloneVectorArrayEmpty_Petsc
Prototype    N_Vector *N_VCloneVectorArrayEmpty_Petsc(int count, N_Vector w)
Description  This function creates (by cloning) an array of count nvector petsc vectors, each with pointers to petsc vectors set to NULL.
N_VDestroyVectorArray_Petsc
Prototype    void N_VDestroyVectorArray_Petsc(N_Vector *vs, int count)
Description  This function frees memory allocated for the array of count variables of type N_Vector created with N_VCloneVectorArray_Petsc or with N_VCloneVectorArrayEmpty_Petsc.

N_VPrint_Petsc
Prototype    void N_VPrint_Petsc(N_Vector v)
Description  This function prints the global content of a wrapped petsc vector to stdout.

N_VPrintFile_Petsc
Prototype    void N_VPrintFile_Petsc(N_Vector v, const char fname[])
Description  This function prints the global content of a wrapped petsc vector to fname.

By default all fused and vector array operations are disabled in the nvector petsc module. The following additional user-callable routines are provided to enable or disable fused and vector array operations for a specific vector. To ensure consistency across vectors it is recommended to first create a vector with N_VMake_Petsc, enable/disable the desired operations for that vector with the functions below, and create any additional vectors from that vector using N_VClone. This guarantees that the new vectors will have the same operations enabled or disabled, since cloned vectors inherit the enable/disable settings of the vector they are cloned from, while vectors created with N_VMake_Petsc will have the default settings for the nvector petsc module.

N_VEnableFusedOps_Petsc
Prototype    int N_VEnableFusedOps_Petsc(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) all fused and vector array operations in the petsc vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombination_Petsc
Prototype    int N_VEnableLinearCombination_Petsc(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination fused operation in the petsc vector.
The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleAddMulti_Petsc
Prototype    int N_VEnableScaleAddMulti_Petsc(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale and add a vector to multiple vectors fused operation in the petsc vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableDotProdMulti_Petsc
Prototype    int N_VEnableDotProdMulti_Petsc(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the multiple dot products fused operation in the petsc vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearSumVectorArray_Petsc
Prototype    int N_VEnableLinearSumVectorArray_Petsc(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear sum operation for vector arrays in the petsc vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleVectorArray_Petsc
Prototype    int N_VEnableScaleVectorArray_Petsc(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale operation for vector arrays in the petsc vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableConstVectorArray_Petsc
Prototype    int N_VEnableConstVectorArray_Petsc(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the const operation for vector arrays in the petsc vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.
N_VEnableWrmsNormVectorArray_Petsc
Prototype    int N_VEnableWrmsNormVectorArray_Petsc(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the WRMS norm operation for vector arrays in the petsc vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableWrmsNormMaskVectorArray_Petsc
Prototype    int N_VEnableWrmsNormMaskVectorArray_Petsc(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the masked WRMS norm operation for vector arrays in the petsc vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleAddMultiVectorArray_Petsc
Prototype    int N_VEnableScaleAddMultiVectorArray_Petsc(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale and add a vector array to multiple vector arrays operation in the petsc vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombinationVectorArray_Petsc
Prototype    int N_VEnableLinearCombinationVectorArray_Petsc(N_Vector v, booleantype tf)
Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination operation for vector arrays in the petsc vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

Notes
• When there is a need to access components of an N_Vector_Petsc, v, it is recommended to extract the petsc vector via x_vec = N_VGetVector_Petsc(v) and then access components using appropriate petsc functions.
• The functions N_VNewEmpty_Petsc, N_VMake_Petsc, and N_VCloneVectorArrayEmpty_Petsc set the field own_data to SUNFALSE. N_VDestroy_Petsc and N_VDestroyVectorArray_Petsc will not attempt to free the pointer pvec for any N_Vector with own_data set to SUNFALSE.
In such a case, it is the user’s responsibility to deallocate the pvec pointer.
• To maximize efficiency, vector operations in the nvector petsc implementation that have more than one N_Vector argument do not check for consistent internal representations of these vectors. It is the user’s responsibility to ensure that such routines are called with N_Vector arguments that were all created with the same internal representations.

7.8 The NVECTOR CUDA implementation
The nvector cuda module is an experimental nvector implementation in the cuda language. The module allows sundials vector kernels to run on GPU devices. It is intended for users who are already familiar with cuda and GPU programming. Building this vector module requires a CUDA compiler and, by extension, a C++ compiler. The class Vector in the namespace suncudavec manages the vector data layout:

template <typename T, typename I>
class Vector {
  I size_;
  I mem_size_;
  I global_size_;
  T* h_vec_;
  T* d_vec_;
  ThreadPartitioning* partStream_;
  ThreadPartitioning* partReduce_;
  bool ownPartitioning_;
  bool ownData_;
  bool managed_mem_;
  SUNMPI_Comm comm_;
  ...
};

The class members are the vector size (length), the size of the vector data memory block, pointers to the vector data on the host and the device, pointers to ThreadPartitioning implementations that handle thread partitioning for streaming and reduction vector kernels, a boolean flag that signals if the vector owns the thread partitioning, a boolean flag that signals if the vector owns the data, a boolean flag that signals if managed memory is used for the data arrays, and the MPI communicator. The class Vector inherits from the empty structure

struct _N_VectorContent_Cuda {};

to interface the C++ class with the nvector C code. Due to the rapid progress of cuda development, we expect that the suncudavec::Vector class will change frequently in future sundials releases.
The code is structured so that it can tolerate significant changes in the suncudavec::Vector class without requiring changes to the user API. When instantiated with N_VNew_Cuda, the class Vector will allocate memory on both the host and the device. Alternatively, a user can provide host and device data arrays by using the N_VMake_Cuda constructor. To use cuda managed memory, the constructors N_VNewManaged_Cuda and N_VMakeManaged_Cuda are provided. Details on each of these constructors are provided below.
The nvector cuda module can be used for single-node parallelism or in a distributed context with MPI. In the single-node case, the header file to include is nvector_cuda.h and the library to link to is libsundials_nveccuda.lib. In a distributed setting, the header file to include is nvector_mpicuda.h and the library to link to is libsundials_nvecmpicuda.lib. The extension .lib is typically .so for shared libraries and .a for static libraries. Only one of these libraries may be linked to when creating an executable or library. sundials must be built with MPI support if the distributed library is desired.

7.8.1 NVECTOR CUDA functions
Unlike other native sundials vector types, nvector cuda does not provide macros to access its member variables. Instead, users should use the accessor functions:

N_VGetLength_Cuda
Prototype    sunindextype N_VGetLength_Cuda(N_Vector v)
Description  This function returns the global length of the vector.

N_VGetLocalLength_Cuda
Prototype    sunindextype N_VGetLocalLength_Cuda(N_Vector v)
Description  This function returns the local length of the vector. Note: This function is for use in a distributed context; it is defined in the header nvector_mpicuda.h and the library to link to is libsundials_nvecmpicuda.lib.

N_VGetHostArrayPointer_Cuda
Prototype    realtype *N_VGetHostArrayPointer_Cuda(N_Vector v)
Description  This function returns a pointer to the vector data on the host.
N_VGetDeviceArrayPointer_Cuda
Prototype    realtype *N_VGetDeviceArrayPointer_Cuda(N_Vector v)
Description  This function returns a pointer to the vector data on the device.

N_VGetMPIComm_Cuda
Prototype    MPI_Comm N_VGetMPIComm_Cuda(N_Vector v)
Description  This function returns the MPI communicator for the vector. Note: This function is for use in a distributed context; it is defined in the header nvector_mpicuda.h and the library to link to is libsundials_nvecmpicuda.lib.

N_VIsManagedMemory_Cuda
Prototype    booleantype N_VIsManagedMemory_Cuda(N_Vector v)
Description  This function returns a boolean flag indicating whether the vector data is allocated in managed memory or not.

The nvector cuda module defines implementations of all vector operations listed in Tables 7.2, 7.3, and 7.4, except for N_VGetArrayPointer and N_VSetArrayPointer. As such, this vector cannot be used with the sundials Fortran interfaces, nor with the sundials direct solvers and preconditioners. Instead, the nvector cuda module provides separate functions to access data on the host and on the device. It also provides methods for copying from the host to the device and vice versa. Usage examples of nvector cuda are provided in some example programs for cvode [31]. The names of vector operations are obtained from those in Tables 7.2, 7.3, and 7.4 by appending the suffix _Cuda (e.g. N_VDestroy_Cuda). The module nvector cuda provides the following functions:

N_VNew_Cuda

Single-node usage
Prototype    N_Vector N_VNew_Cuda(sunindextype length)
Description  This function creates and allocates memory for a cuda N_Vector. The vector data array is allocated on both the host and device. In the single-node setting, the only input is the vector length. This constructor is defined in the header nvector_cuda.h and the library to link to is libsundials_nveccuda.lib.
Distributed-memory parallel usage
Prototype    N_Vector N_VNew_Cuda(MPI_Comm comm, sunindextype local_length, sunindextype global_length)
Description  This function creates and allocates memory for a cuda N_Vector. The vector data array is allocated on both the host and device. When used in a distributed context with MPI, the arguments are the MPI communicator, the local vector length, and the global vector length. This constructor is defined in the header nvector_mpicuda.h and the library to link to is libsundials_nvecmpicuda.lib.

N_VNewManaged_Cuda

Single-node usage
Prototype    N_Vector N_VNewManaged_Cuda(sunindextype length)
Description  This function creates and allocates memory for a cuda N_Vector on a single node. The vector data array is allocated in managed memory. In the single-node setting, the only input is the vector length. This constructor is defined in the header nvector_cuda.h and the library to link to is libsundials_nveccuda.lib.

Distributed-memory parallel usage
Prototype    N_Vector N_VNewManaged_Cuda(MPI_Comm comm, sunindextype local_length, sunindextype global_length)
Description  This function creates and allocates memory for a cuda N_Vector. The vector data array is allocated in managed memory. When used in a distributed context with MPI, the arguments are the MPI communicator, the local vector length, and the global vector length. This constructor is defined in the header nvector_mpicuda.h and the library to link to is libsundials_nvecmpicuda.lib.

N_VNewEmpty_Cuda
Prototype    N_Vector N_VNewEmpty_Cuda()
Description  This function creates a new nvector wrapper with the pointer to the wrapped cuda vector set to NULL. It is used by the N_VNew_Cuda, N_VMake_Cuda, and N_VClone_Cuda implementations.
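To illustrate the single-node constructors, the following sketch (hypothetical length; requires a CUDA-capable device; error checking omitted) creates a vector, fills it on the host, and pushes the data to the device with the copy routines described later in this section:

```c
#include <nvector/nvector_cuda.h>

int main(void)
{
  sunindextype N = 1000;   /* hypothetical vector length */

  /* Allocates data on both the host and the device */
  N_Vector v = N_VNew_Cuda(N);

  /* Fill on the host, then copy to the device */
  realtype *h = N_VGetHostArrayPointer_Cuda(v);
  for (sunindextype i = 0; i < N; i++) h[i] = 1.0;
  N_VCopyToDevice_Cuda(v);

  /* ... vector kernels now operate on the device data ... */

  N_VDestroy(v);
  return 0;
}
```

With N_VNewManaged_Cuda instead, the host and device views share one managed allocation and the explicit copy step is unnecessary.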
N_VMake_Cuda

Single-node usage
Prototype    N_Vector N_VMake_Cuda(sunindextype length, realtype *h_vdata, realtype *d_vdata)
Description  This function creates an nvector cuda with user-supplied vector data arrays h_vdata and d_vdata. This function does not allocate memory for data itself. In the single-node setting, the inputs are the vector length, the host data array, and the device data array. This constructor is defined in the header nvector_cuda.h and the library to link to is libsundials_nveccuda.lib.

Distributed-memory parallel usage
Prototype    N_Vector N_VMake_Cuda(MPI_Comm comm, sunindextype local_length, sunindextype global_length, realtype *h_vdata, realtype *d_vdata)
Description  This function creates an nvector cuda with user-supplied vector data arrays h_vdata and d_vdata. This function does not allocate memory for data itself. When used in a distributed context with MPI, the arguments are the MPI communicator, the local vector length, the global vector length, the host data array, and the device data array. This constructor is defined in the header nvector_mpicuda.h and the library to link to is libsundials_nvecmpicuda.lib.

N_VMakeManaged_Cuda

Single-node usage
Prototype    N_Vector N_VMakeManaged_Cuda(sunindextype length, realtype *vdata)
Description  This function creates an nvector cuda with a user-supplied managed memory data array. This function does not allocate memory for data itself. In the single-node setting, the inputs are the vector length and the managed data array. This constructor is defined in the header nvector_cuda.h and the library to link to is libsundials_nveccuda.lib.

Distributed-memory parallel usage
Prototype    N_Vector N_VMakeManaged_Cuda(MPI_Comm comm, sunindextype local_length, sunindextype global_length, realtype *vdata)
Description  This function creates an nvector cuda with a user-supplied managed memory data array. This function does not allocate memory for data itself.
When used in a distributed context with MPI, the arguments are the MPI communicator, the local vector length, the global vector length, and the managed data array. This constructor is defined in the header nvector_mpicuda.h and the library to link to is libsundials_nvecmpicuda.lib.

The module nvector cuda also provides the following user-callable routines:

N_VSetCudaStream_Cuda
Prototype    void N_VSetCudaStream_Cuda(N_Vector v, cudaStream_t *stream)
Description  This function sets the cuda stream that all vector kernels will be launched on. By default an nvector cuda uses the default cuda stream. Note: All vectors used in a single instance of a sundials solver must use the same cuda stream, and the cuda stream must be set prior to solver initialization. Additionally, if manually instantiating the stream and reduce ThreadPartitioning of a suncudavec::Vector, ensure that they use the same cuda stream.

N_VCopyToDevice_Cuda
Prototype    realtype *N_VCopyToDevice_Cuda(N_Vector v)
Description  This function copies host vector data to the device.

N_VCopyFromDevice_Cuda
Prototype    realtype *N_VCopyFromDevice_Cuda(N_Vector v)
Description  This function copies vector data from the device to the host.

N_VPrint_Cuda
Prototype    void N_VPrint_Cuda(N_Vector v)
Description  This function prints the content of a cuda vector to stdout.

N_VPrintFile_Cuda
Prototype    void N_VPrintFile_Cuda(N_Vector v, FILE *outfile)
Description  This function prints the content of a cuda vector to outfile.

By default all fused and vector array operations are disabled in the nvector cuda module. The following additional user-callable routines are provided to enable or disable fused and vector array operations for a specific vector. To ensure consistency across vectors it is recommended to first create a vector with N_VNew_Cuda, enable/disable the desired operations for that vector with the functions below, and create any additional vectors from that vector using N_VClone.
This guarantees that the new vectors will have the same operations enabled or disabled, since cloned vectors inherit the enable/disable options of the vector they are cloned from, while vectors created with N_VNew_Cuda have the default settings for the nvector_cuda module.

N_VEnableFusedOps_Cuda

Prototype    int N_VEnableFusedOps_Cuda(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) all fused and vector array operations in the cuda vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombination_Cuda

Prototype    int N_VEnableLinearCombination_Cuda(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination fused operation in the cuda vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleAddMulti_Cuda

Prototype    int N_VEnableScaleAddMulti_Cuda(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale-and-add-a-vector-to-multiple-vectors fused operation in the cuda vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableDotProdMulti_Cuda

Prototype    int N_VEnableDotProdMulti_Cuda(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the multiple dot products fused operation in the cuda vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearSumVectorArray_Cuda

Prototype    int N_VEnableLinearSumVectorArray_Cuda(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear sum operation for vector arrays in the cuda vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.
N_VEnableScaleVectorArray_Cuda

Prototype    int N_VEnableScaleVectorArray_Cuda(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale operation for vector arrays in the cuda vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableConstVectorArray_Cuda

Prototype    int N_VEnableConstVectorArray_Cuda(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the const operation for vector arrays in the cuda vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableWrmsNormVectorArray_Cuda

Prototype    int N_VEnableWrmsNormVectorArray_Cuda(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the WRMS norm operation for vector arrays in the cuda vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableWrmsNormMaskVectorArray_Cuda

Prototype    int N_VEnableWrmsNormMaskVectorArray_Cuda(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the masked WRMS norm operation for vector arrays in the cuda vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleAddMultiVectorArray_Cuda

Prototype    int N_VEnableScaleAddMultiVectorArray_Cuda(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale-and-add-a-vector-array-to-multiple-vector-arrays operation in the cuda vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombinationVectorArray_Cuda

Prototype    int N_VEnableLinearCombinationVectorArray_Cuda(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination operation for vector arrays in the cuda vector.
The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

Notes

• When there is a need to access components of an N_Vector_Cuda, v, it is recommended to use the functions N_VGetDeviceArrayPointer_Cuda or N_VGetHostArrayPointer_Cuda.

• To maximize efficiency, vector operations in the nvector_cuda implementation that have more than one N_Vector argument do not check for consistent internal representations of these vectors. It is the user's responsibility to ensure that such routines are called with N_Vector arguments that were all created with the same internal representations.

7.9 The NVECTOR_RAJA implementation

The nvector_raja module is an experimental nvector implementation using the raja hardware abstraction layer. In this implementation, raja allows sundials vector kernels to run on GPU devices. The module is intended for users who are already familiar with raja and GPU programming. Building this vector module requires a C++11 compliant compiler and a CUDA software development toolkit. Besides the cuda backend, raja has other backends such as serial, OpenMP, and OpenACC; these backends are not used in this sundials release.

The class Vector in the namespace sunrajavec manages the vector data layout:

template <typename T, typename I>
class Vector {
  I size_;
  I mem_size_;
  I global_size_;
  T* h_vec_;
  T* d_vec_;
  SUNMPI_Comm comm_;
  ...
};

The class members are: the vector size (length), the size of the vector data memory block, the global vector size (length), pointers to the vector data on the host and on the device, and the MPI communicator. The class Vector inherits from an empty structure

struct _N_VectorContent_Raja {
};

to interface the C++ class with the nvector C code. When instantiated, the class Vector will allocate memory on both the host and the device. Due to the rapid progress of raja development, we expect that the sunrajavec::Vector class will change frequently in future sundials releases.
The code is structured so that it can tolerate significant changes in the sunrajavec::Vector class without requiring changes to the user API.

The nvector_raja module can be utilized for single-node parallelism or in a distributed context with MPI. The header file to include when using this module for single-node parallelism is nvector_raja.h. The header file to include when using this module in the distributed case is nvector_mpiraja.h. The installed module libraries to link to are libsundials_nvecraja.lib in the single-node case, or libsundials_nvecmpicudaraja.lib in the distributed case. Only one of these libraries may be linked to when creating an executable or library. sundials must be built with MPI support if the distributed library is desired.

7.9.1 NVECTOR_RAJA functions

Unlike other native sundials vector types, nvector_raja does not provide macros to access its member variables. Instead, users should use the accessor functions:

N_VGetLength_Raja

Prototype    sunindextype N_VGetLength_Raja(N_Vector v)

Description  This function returns the global length of the vector.

N_VGetLocalLength_Raja

Prototype    sunindextype N_VGetLocalLength_Raja(N_Vector v)

Description  This function returns the local length of the vector.

Note: This function is for use in a distributed context. It is defined in the header nvector_mpiraja.h and the library to link to is libsundials_nvecmpicudaraja.lib.

N_VGetHostArrayPointer_Raja

Prototype    realtype *N_VGetHostArrayPointer_Raja(N_Vector v)

Description  This function returns a pointer to the vector data on the host.

N_VGetDeviceArrayPointer_Raja

Prototype    realtype *N_VGetDeviceArrayPointer_Raja(N_Vector v)

Description  This function returns a pointer to the vector data on the device.

N_VGetMPIComm_Raja

Prototype    MPI_Comm N_VGetMPIComm_Raja(N_Vector v)

Description  This function returns the MPI communicator for the vector.
Note: This function is for use in a distributed context. It is defined in the header nvector_mpiraja.h and the library to link to is libsundials_nvecmpicudaraja.lib.

The nvector_raja module defines the implementations of all vector operations listed in Tables 7.2, 7.3, and 7.4, except for N_VDotProdMulti, N_VWrmsNormVectorArray, and N_VWrmsNormMaskVectorArray, because raja does not yet support arrays of reduction vectors. These functions will be added to the nvector_raja implementation in the future. Additionally, the vector operations N_VGetArrayPointer and N_VSetArrayPointer are not implemented by the raja vector. As such, this vector cannot be used with the sundials Fortran interfaces, nor with the sundials direct solvers and preconditioners. The nvector_raja module provides separate functions to access data on the host and on the device. It also provides methods for copying data from the host to the device and vice versa. Usage examples of nvector_raja are provided in some example programs for cvode [31].

The names of vector operations are obtained from those in Tables 7.2, 7.3, and 7.4 by appending the suffix _Raja (e.g. N_VDestroy_Raja). The module nvector_raja provides the following additional user-callable routines:

N_VNew_Raja

Single-node usage

Prototype    N_Vector N_VNew_Raja(sunindextype length)

Description  This function creates and allocates memory for a raja N_Vector. The vector data array is allocated on both the host and the device. In the single-node setting, the only input is the vector length. This constructor is defined in the header nvector_raja.h and the library to link to is libsundials_nveccudaraja.lib.

Distributed-memory parallel usage

Prototype    N_Vector N_VNew_Raja(MPI_Comm comm, sunindextype local_length, sunindextype global_length)

Description  This function creates and allocates memory for a raja N_Vector. The vector data array is allocated on both the host and the device.
When used in a distributed context with MPI, the arguments are the MPI communicator, the local vector length, and the global vector length. This constructor is defined in the header nvector_mpiraja.h and the library to link to is libsundials_nvecmpicudaraja.lib.

N_VNewEmpty_Raja

Prototype    N_Vector N_VNewEmpty_Raja()

Description  This function creates a new nvector wrapper with the pointer to the wrapped raja vector set to NULL. It is used by the N_VNew_Raja, N_VMake_Raja, and N_VClone_Raja implementations.

N_VMake_Raja

Prototype    N_Vector N_VMake_Raja(N_VectorContent_Raja c)

Description  This function creates and allocates memory for an nvector_raja wrapper around a user-provided sunrajavec::Vector class. Its only argument is of type N_VectorContent_Raja, which is the pointer to the class.

N_VCopyToDevice_Raja

Prototype    realtype *N_VCopyToDevice_Raja(N_Vector v)

Description  This function copies host vector data to the device.

N_VCopyFromDevice_Raja

Prototype    realtype *N_VCopyFromDevice_Raja(N_Vector v)

Description  This function copies vector data from the device to the host.

N_VPrint_Raja

Prototype    void N_VPrint_Raja(N_Vector v)

Description  This function prints the content of a raja vector to stdout.

N_VPrintFile_Raja

Prototype    void N_VPrintFile_Raja(N_Vector v, FILE *outfile)

Description  This function prints the content of a raja vector to outfile.

By default all fused and vector array operations are disabled in the nvector_raja module. The following additional user-callable routines are provided to enable or disable fused and vector array operations for a specific vector. To ensure consistency across vectors it is recommended to first create a vector with N_VNew_Raja, enable/disable the desired operations for that vector with the functions below, and create any additional vectors from that vector using N_VClone.
This guarantees that the new vectors will have the same operations enabled or disabled, since cloned vectors inherit the enable/disable options of the vector they are cloned from, while vectors created with N_VNew_Raja have the default settings for the nvector_raja module.

N_VEnableFusedOps_Raja

Prototype    int N_VEnableFusedOps_Raja(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) all fused and vector array operations in the raja vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombination_Raja

Prototype    int N_VEnableLinearCombination_Raja(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination fused operation in the raja vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleAddMulti_Raja

Prototype    int N_VEnableScaleAddMulti_Raja(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale-and-add-a-vector-to-multiple-vectors fused operation in the raja vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearSumVectorArray_Raja

Prototype    int N_VEnableLinearSumVectorArray_Raja(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear sum operation for vector arrays in the raja vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleVectorArray_Raja

Prototype    int N_VEnableScaleVectorArray_Raja(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale operation for vector arrays in the raja vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.
N_VEnableConstVectorArray_Raja

Prototype    int N_VEnableConstVectorArray_Raja(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the const operation for vector arrays in the raja vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleAddMultiVectorArray_Raja

Prototype    int N_VEnableScaleAddMultiVectorArray_Raja(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale-and-add-a-vector-array-to-multiple-vector-arrays operation in the raja vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombinationVectorArray_Raja

Prototype    int N_VEnableLinearCombinationVectorArray_Raja(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination operation for vector arrays in the raja vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

Notes

• When there is a need to access components of an N_Vector_Raja, v, it is recommended to use the functions N_VGetDeviceArrayPointer_Raja or N_VGetHostArrayPointer_Raja.

• To maximize efficiency, vector operations in the nvector_raja implementation that have more than one N_Vector argument do not check for consistent internal representations of these vectors. It is the user's responsibility to ensure that such routines are called with N_Vector arguments that were all created with the same internal representations.

7.10 The NVECTOR_OPENMPDEV implementation

In situations where a user has access to a device such as a GPU for offloading computation, sundials provides an nvector implementation using OpenMP device offloading, called nvector_openmpdev.
The nvector_openmpdev implementation defines the content field of the N_Vector to be a structure containing the length of the vector, a pointer to the beginning of a contiguous data array on the host, a pointer to the beginning of a contiguous data array on the device, and a boolean flag own_data which specifies the ownership of the host and device data arrays.

struct _N_VectorContent_OpenMPDEV {
  sunindextype length;
  booleantype own_data;
  realtype *host_data;
  realtype *dev_data;
};

The header file to include when using this module is nvector_openmpdev.h. The installed module library to link to is libsundials_nvecopenmpdev.lib, where .lib is typically .so for shared libraries and .a for static libraries.

7.10.1 NVECTOR_OPENMPDEV accessor macros

The following macros are provided to access the content of an nvector_openmpdev vector.

• NV_CONTENT_OMPDEV

This macro gives access to the contents of an nvector_openmpdev N_Vector. The assignment v_cont = NV_CONTENT_OMPDEV(v) sets v_cont to be a pointer to the nvector_openmpdev N_Vector content structure.

Implementation:

#define NV_CONTENT_OMPDEV(v) ( (N_VectorContent_OpenMPDEV)(v->content) )

• NV_OWN_DATA_OMPDEV, NV_DATA_HOST_OMPDEV, NV_DATA_DEV_OMPDEV, NV_LENGTH_OMPDEV

These macros give individual access to the parts of the content of an nvector_openmpdev N_Vector. The assignment v_data = NV_DATA_HOST_OMPDEV(v) sets v_data to be a pointer to the first component of the data on the host for the N_Vector v. The assignment NV_DATA_HOST_OMPDEV(v) = v_data sets the host component array of v to be v_data by storing the pointer v_data. The assignment v_dev_data = NV_DATA_DEV_OMPDEV(v) sets v_dev_data to be a pointer to the first component of the data on the device for the N_Vector v. The assignment NV_DATA_DEV_OMPDEV(v) = v_dev_data sets the device component array of v to be v_dev_data by storing the pointer v_dev_data.
The assignment v_len = NV_LENGTH_OMPDEV(v) sets v_len to be the length of v. On the other hand, the call NV_LENGTH_OMPDEV(v) = len_v sets the length of v to be len_v.

Implementation:

#define NV_OWN_DATA_OMPDEV(v)  ( NV_CONTENT_OMPDEV(v)->own_data )
#define NV_DATA_HOST_OMPDEV(v) ( NV_CONTENT_OMPDEV(v)->host_data )
#define NV_DATA_DEV_OMPDEV(v)  ( NV_CONTENT_OMPDEV(v)->dev_data )
#define NV_LENGTH_OMPDEV(v)    ( NV_CONTENT_OMPDEV(v)->length )

7.10.2 NVECTOR_OPENMPDEV functions

The nvector_openmpdev module defines OpenMP device offloading implementations of all vector operations listed in Tables 7.2, 7.3, and 7.4, except for N_VGetArrayPointer and N_VSetArrayPointer. As such, this vector cannot be used with the sundials Fortran interfaces, nor with the sundials direct solvers and preconditioners. It also provides methods for copying from the host to the device and vice versa. The names of vector operations are obtained from those in Tables 7.2, 7.3, and 7.4 by appending the suffix _OpenMPDEV (e.g. N_VDestroy_OpenMPDEV). The module nvector_openmpdev provides the following additional user-callable routines:

N_VNew_OpenMPDEV

Prototype    N_Vector N_VNew_OpenMPDEV(sunindextype vec_length)

Description  This function creates and allocates memory for an nvector_openmpdev N_Vector.

N_VNewEmpty_OpenMPDEV

Prototype    N_Vector N_VNewEmpty_OpenMPDEV(sunindextype vec_length)

Description  This function creates a new nvector_openmpdev N_Vector with empty (NULL) host and device data arrays.

N_VMake_OpenMPDEV

Prototype    N_Vector N_VMake_OpenMPDEV(sunindextype vec_length, realtype *h_vdata, realtype *d_vdata)

Description  This function creates an nvector_openmpdev vector with user-supplied vector data arrays h_vdata and d_vdata. This function does not allocate memory for data itself.
N_VCloneVectorArray_OpenMPDEV

Prototype    N_Vector *N_VCloneVectorArray_OpenMPDEV(int count, N_Vector w)

Description  This function creates (by cloning) an array of count nvector_openmpdev vectors.

N_VCloneVectorArrayEmpty_OpenMPDEV

Prototype    N_Vector *N_VCloneVectorArrayEmpty_OpenMPDEV(int count, N_Vector w)

Description  This function creates (by cloning) an array of count nvector_openmpdev vectors, each with an empty (NULL) data array.

N_VDestroyVectorArray_OpenMPDEV

Prototype    void N_VDestroyVectorArray_OpenMPDEV(N_Vector *vs, int count)

Description  This function frees memory allocated for the array of count variables of type N_Vector created with N_VCloneVectorArray_OpenMPDEV or with N_VCloneVectorArrayEmpty_OpenMPDEV.

N_VGetLength_OpenMPDEV

Prototype    sunindextype N_VGetLength_OpenMPDEV(N_Vector v)

Description  This function returns the number of vector elements.

N_VGetHostArrayPointer_OpenMPDEV

Prototype    realtype *N_VGetHostArrayPointer_OpenMPDEV(N_Vector v)

Description  This function returns a pointer to the host data array.

N_VGetDeviceArrayPointer_OpenMPDEV

Prototype    realtype *N_VGetDeviceArrayPointer_OpenMPDEV(N_Vector v)

Description  This function returns a pointer to the device data array.

N_VPrint_OpenMPDEV

Prototype    void N_VPrint_OpenMPDEV(N_Vector v)

Description  This function prints the content of an nvector_openmpdev vector to stdout.

N_VPrintFile_OpenMPDEV

Prototype    void N_VPrintFile_OpenMPDEV(N_Vector v, FILE *outfile)

Description  This function prints the content of an nvector_openmpdev vector to outfile.

N_VCopyToDevice_OpenMPDEV

Prototype    void N_VCopyToDevice_OpenMPDEV(N_Vector v)

Description  This function copies the content of an nvector_openmpdev vector's host data array to the device data array.

N_VCopyFromDevice_OpenMPDEV

Prototype    void N_VCopyFromDevice_OpenMPDEV(N_Vector v)

Description  This function copies the content of an nvector_openmpdev vector's device data array to the host data array.
By default all fused and vector array operations are disabled in the nvector_openmpdev module. The following additional user-callable routines are provided to enable or disable fused and vector array operations for a specific vector. To ensure consistency across vectors it is recommended to first create a vector with N_VNew_OpenMPDEV, enable/disable the desired operations for that vector with the functions below, and create any additional vectors from that vector using N_VClone. This guarantees that the new vectors will have the same operations enabled or disabled, since cloned vectors inherit the enable/disable options of the vector they are cloned from, while vectors created with N_VNew_OpenMPDEV have the default settings for the nvector_openmpdev module.

N_VEnableFusedOps_OpenMPDEV

Prototype    int N_VEnableFusedOps_OpenMPDEV(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) all fused and vector array operations in the nvector_openmpdev vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombination_OpenMPDEV

Prototype    int N_VEnableLinearCombination_OpenMPDEV(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination fused operation in the nvector_openmpdev vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleAddMulti_OpenMPDEV

Prototype    int N_VEnableScaleAddMulti_OpenMPDEV(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale-and-add-a-vector-to-multiple-vectors fused operation in the nvector_openmpdev vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.
N_VEnableDotProdMulti_OpenMPDEV

Prototype    int N_VEnableDotProdMulti_OpenMPDEV(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the multiple dot products fused operation in the nvector_openmpdev vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearSumVectorArray_OpenMPDEV

Prototype    int N_VEnableLinearSumVectorArray_OpenMPDEV(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear sum operation for vector arrays in the nvector_openmpdev vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleVectorArray_OpenMPDEV

Prototype    int N_VEnableScaleVectorArray_OpenMPDEV(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale operation for vector arrays in the nvector_openmpdev vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableConstVectorArray_OpenMPDEV

Prototype    int N_VEnableConstVectorArray_OpenMPDEV(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the const operation for vector arrays in the nvector_openmpdev vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableWrmsNormVectorArray_OpenMPDEV

Prototype    int N_VEnableWrmsNormVectorArray_OpenMPDEV(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the WRMS norm operation for vector arrays in the nvector_openmpdev vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.
N_VEnableWrmsNormMaskVectorArray_OpenMPDEV

Prototype    int N_VEnableWrmsNormMaskVectorArray_OpenMPDEV(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the masked WRMS norm operation for vector arrays in the nvector_openmpdev vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableScaleAddMultiVectorArray_OpenMPDEV

Prototype    int N_VEnableScaleAddMultiVectorArray_OpenMPDEV(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the scale-and-add-a-vector-array-to-multiple-vector-arrays operation in the nvector_openmpdev vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

N_VEnableLinearCombinationVectorArray_OpenMPDEV

Prototype    int N_VEnableLinearCombinationVectorArray_OpenMPDEV(N_Vector v, booleantype tf)

Description  This function enables (SUNTRUE) or disables (SUNFALSE) the linear combination operation for vector arrays in the nvector_openmpdev vector. The return value is 0 for success and -1 if the input vector or its ops structure are NULL.

Notes

• When looping over the components of an N_Vector v, it is most efficient to first obtain the component array via h_data = NV_DATA_HOST_OMPDEV(v) for the host array or d_data = NV_DATA_DEV_OMPDEV(v) for the device array, and then access h_data[i] or d_data[i] within the loop.

• When accessing individual components of an N_Vector v on the host, remember to first copy the array back from the device with N_VCopyFromDevice_OpenMPDEV(v) to ensure the array is up to date.

• N_VNewEmpty_OpenMPDEV, N_VMake_OpenMPDEV, and N_VCloneVectorArrayEmpty_OpenMPDEV set the field own_data to SUNFALSE. N_VDestroy_OpenMPDEV and N_VDestroyVectorArray_OpenMPDEV will not attempt to free the pointer data for any N_Vector with own_data set to SUNFALSE.
In such a case, it is the user's responsibility to deallocate the data pointer.

• To maximize efficiency, vector operations in the nvector_openmpdev implementation that have more than one N_Vector argument do not check for consistent internal representation of these vectors. It is the user's responsibility to ensure that such routines are called with N_Vector arguments that were all created with the same internal representations.

7.11 NVECTOR Examples

There are NVector examples that may be installed for the implementations provided with sundials. Each implementation makes use of the functions in test_nvector.c. These example functions show simple usage of the NVector family of functions. The inputs to the examples are the vector length, the number of threads (if using a threaded implementation), and a print timing flag.

The following is a list of the example functions in test_nvector.c:

• Test_N_VClone: Creates a clone of a vector and checks the validity of the clone.
• Test_N_VCloneEmpty: Creates a clone of an empty vector and checks the validity of the clone.
• Test_N_VCloneVectorArray: Creates a clone of a vector array and checks the validity of the cloned array.
• Test_N_VCloneEmptyVectorArray: Creates a clone of an empty vector array and checks the validity of the cloned array.
• Test_N_VGetArrayPointer: Gets the array pointer.
• Test_N_VSetArrayPointer: Allocates a new vector, sets the pointer to the new vector array, and checks values.
• Test_N_VLinearSum Case 1a: Test y = x + y
• Test_N_VLinearSum Case 1b: Test y = -x + y
• Test_N_VLinearSum Case 1c: Test y = ax + y
• Test_N_VLinearSum Case 2a: Test x = x + y
• Test_N_VLinearSum Case 2b: Test x = x - y
• Test_N_VLinearSum Case 2c: Test x = x + by
• Test_N_VLinearSum Case 3: Test z = x + y
• Test_N_VLinearSum Case 4a: Test z = x - y
• Test_N_VLinearSum Case 4b: Test z = -x + y
• Test_N_VLinearSum Case 5a: Test z = x + by
• Test_N_VLinearSum Case 5b: Test z = ax + y
• Test_N_VLinearSum Case 6a: Test z = -x + by
• Test_N_VLinearSum Case 6b: Test z = ax - y
• Test_N_VLinearSum Case 7: Test z = a(x + y)
• Test_N_VLinearSum Case 8: Test z = a(x - y)
• Test_N_VLinearSum Case 9: Test z = ax + by
• Test_N_VConst: Fill vector with constant and check result.
• Test_N_VProd: Test vector multiply: z = x * y
• Test_N_VDiv: Test vector division: z = x / y
• Test_N_VScale Case 1: scale: x = cx
• Test_N_VScale Case 2: copy: z = x
• Test_N_VScale Case 3: negate: z = -x
• Test_N_VScale Case 4: combination: z = cx
• Test_N_VAbs: Create absolute value of vector.
• Test_N_VAddConst: Add constant to vector: z = c + x
• Test_N_VDotProd: Calculate dot product of two vectors.
• Test_N_VMaxNorm: Create vector with known values; find and validate the max norm.
• Test_N_VWrmsNorm: Create vector of known values; find and validate the weighted root mean square norm.
• Test_N_VWrmsNormMask: Create vector of known values; find and validate the weighted root mean square norm using all elements except one.
• Test_N_VMin: Create vector; find and validate the min.
• Test_N_VWL2Norm: Create vector; find and validate the weighted Euclidean L2 norm.
• Test_N_VL1Norm: Create vector; find and validate the L1 norm.
• Test_N_VCompare: Compare vector with constant, returning and validating the comparison vector.
• Test_N_VInvTest: Test z[i] = 1 / x[i]
• Test_N_VConstrMask: Test mask of vector x with vector c.
• Test_N_VMinQuotient: Fill two vectors with known values.
Calculate and validate the minimum quotient.

• Test_N_VLinearCombination Case 1a: Test x = a x
• Test_N_VLinearCombination Case 1b: Test z = a x
• Test_N_VLinearCombination Case 2a: Test x = a x + b y
• Test_N_VLinearCombination Case 2b: Test z = a x + b y
• Test_N_VLinearCombination Case 3a: Test x = x + a y + b z
• Test_N_VLinearCombination Case 3b: Test x = a x + b y + c z
• Test_N_VLinearCombination Case 3c: Test w = a x + b y + c z
• Test_N_VScaleAddMulti Case 1a: y = a x + y
• Test_N_VScaleAddMulti Case 1b: z = a x + y
• Test_N_VScaleAddMulti Case 2a: Y[i] = c[i] x + Y[i], i = 1,2,3
• Test_N_VScaleAddMulti Case 2b: Z[i] = c[i] x + Y[i], i = 1,2,3
• Test_N_VDotProdMulti Case 1: Calculate the dot product of two vectors.
• Test_N_VDotProdMulti Case 2: Calculate the dot product of one vector with three other vectors in a vector array.
• Test_N_VLinearSumVectorArray Case 1: z = a x + b y
• Test_N_VLinearSumVectorArray Case 2a: Z[i] = a X[i] + b Y[i]
• Test_N_VLinearSumVectorArray Case 2b: X[i] = a X[i] + b Y[i]
• Test_N_VLinearSumVectorArray Case 2c: Y[i] = a X[i] + b Y[i]
• Test_N_VScaleVectorArray Case 1a: y = c y
• Test_N_VScaleVectorArray Case 1b: z = c y
• Test_N_VScaleVectorArray Case 2a: Y[i] = c[i] Y[i]
• Test_N_VScaleVectorArray Case 2b: Z[i] = c[i] Y[i]
• Test_N_VConstVectorArray Case 1a: z = c
• Test_N_VConstVectorArray Case 1b: Z[i] = c
• Test_N_VWrmsNormVectorArray Case 1a: Create a vector of known values; find and validate the weighted root mean square norm.
• Test_N_VWrmsNormVectorArray Case 1b: Create a vector array of three vectors of known values; find and validate the weighted root mean square norm of each.
• Test_N_VWrmsNormMaskVectorArray Case 1a: Create a vector of known values; find and validate the weighted root mean square norm using all elements except one.
• Test_N_VWrmsNormMaskVectorArray Case 1b: Create a vector array of three vectors of known values; find and validate the weighted root mean square norm of each using all elements except one.
• Test_N_VScaleAddMultiVectorArray Case 1a: y = a x + y
• Test_N_VScaleAddMultiVectorArray Case 1b: z = a x + y
• Test_N_VScaleAddMultiVectorArray Case 2a: Y[j][0] = a[j] X[0] + Y[j][0]
• Test_N_VScaleAddMultiVectorArray Case 2b: Z[j][0] = a[j] X[0] + Y[j][0]
• Test_N_VScaleAddMultiVectorArray Case 3a: Y[0][i] = a[0] X[i] + Y[0][i]
• Test_N_VScaleAddMultiVectorArray Case 3b: Z[0][i] = a[0] X[i] + Y[0][i]
• Test_N_VScaleAddMultiVectorArray Case 4a: Y[j][i] = a[j] X[i] + Y[j][i]
• Test_N_VScaleAddMultiVectorArray Case 4b: Z[j][i] = a[j] X[i] + Y[j][i]
• Test_N_VLinearCombinationVectorArray Case 1a: x = a x
• Test_N_VLinearCombinationVectorArray Case 1b: z = a x
• Test_N_VLinearCombinationVectorArray Case 2a: x = a x + b y
• Test_N_VLinearCombinationVectorArray Case 2b: z = a x + b y
• Test_N_VLinearCombinationVectorArray Case 3a: x = a x + b y + c z
• Test_N_VLinearCombinationVectorArray Case 3b: w = a x + b y + c z
• Test_N_VLinearCombinationVectorArray Case 4a: X[0][i] = c[0] X[0][i]
• Test_N_VLinearCombinationVectorArray Case 4b: Z[i] = c[0] X[0][i]
• Test_N_VLinearCombinationVectorArray Case 5a: X[0][i] = c[0] X[0][i] + c[1] X[1][i]
• Test_N_VLinearCombinationVectorArray Case 5b: Z[i] = c[0] X[0][i] + c[1] X[1][i]
• Test_N_VLinearCombinationVectorArray Case 6a: X[0][i] = X[0][i] + c[1] X[1][i] + c[2] X[2][i]
• Test_N_VLinearCombinationVectorArray Case 6b: X[0][i] = c[0] X[0][i] + c[1] X[1][i] + c[2] X[2][i]
• Test_N_VLinearCombinationVectorArray Case 6c: Z[i] = c[0] X[0][i] + c[1] X[1][i] + c[2] X[2][i]

Table 7.5: List of vector functions usage by idas code modules (idas, idals, idaa, idabbdpre).

Chapter 8  Description of the SUNMatrix module

For problems that involve direct methods for solving linear systems, the sundials solvers not only operate on generic vectors, but also on generic matrices (of type SUNMatrix), through a set of operations defined by the particular sunmatrix implementation. Users can provide their own specific implementation of the sunmatrix module, particularly in cases where they provide their own nvector and/or linear solver modules and require matrices that are compatible with those implementations. Alternatively, we provide three sunmatrix implementations: dense, banded, and sparse. The generic operations are described below, and descriptions of the implementations provided with sundials follow.

The generic SUNMatrix type has been modeled after the object-oriented style of the generic N_Vector type. Specifically, a generic SUNMatrix is a pointer to a structure that has an implementation-dependent content field containing the description and actual data of the matrix, and an ops field pointing to a structure with generic matrix operations.
The type SUNMatrix is defined as

  typedef struct _generic_SUNMatrix *SUNMatrix;

  struct _generic_SUNMatrix {
    void *content;
    struct _generic_SUNMatrix_Ops *ops;
  };

The generic SUNMatrix_Ops structure is essentially a list of pointers to the various actual matrix operations, and is defined as

  struct _generic_SUNMatrix_Ops {
    SUNMatrix_ID (*getid)(SUNMatrix);
    SUNMatrix    (*clone)(SUNMatrix);
    void         (*destroy)(SUNMatrix);
    int          (*zero)(SUNMatrix);
    int          (*copy)(SUNMatrix, SUNMatrix);
    int          (*scaleadd)(realtype, SUNMatrix, SUNMatrix);
    int          (*scaleaddi)(realtype, SUNMatrix);
    int          (*matvec)(SUNMatrix, N_Vector, N_Vector);
    int          (*space)(SUNMatrix, long int*, long int*);
  };

The generic sunmatrix module defines and implements the matrix operations acting on SUNMatrix objects. These routines are nothing but wrappers for the matrix operations defined by a particular sunmatrix implementation, which are accessed through the ops field of the SUNMatrix structure. To illustrate this point we show below the implementation of a typical matrix operation from the generic sunmatrix module, namely SUNMatZero, which sets all values of a matrix A to zero, returning a flag denoting a successful/failed operation:

  int SUNMatZero(SUNMatrix A)
  {
    return((int) A->ops->zero(A));
  }

Table 8.1: Identifiers associated with matrix kernels supplied with sundials.

  Matrix ID         Matrix type                        ID Value
  SUNMATRIX_DENSE   Dense M × N matrix                 0
  SUNMATRIX_BAND    Band M × M matrix                  1
  SUNMATRIX_SPARSE  Sparse (CSR or CSC) M × N matrix   2
  SUNMATRIX_CUSTOM  User-provided custom matrix        3

Table 8.2 contains a complete list of all matrix operations defined by the generic sunmatrix module. A particular implementation of the sunmatrix module must:
• Specify the content field of the SUNMatrix object.
• Define and implement a minimal subset of the matrix operations. See the documentation for each sundials solver to determine which sunmatrix operations they require.
Note that the names of these routines should be unique to that implementation in order to permit using more than one sunmatrix module (each with different SUNMatrix internal data representations) in the same code.
• Define and implement user-callable constructor and destructor routines to create and free a SUNMatrix with the new content field and with ops pointing to the new matrix operations.
• Optionally, define and implement additional user-callable routines acting on the newly defined SUNMatrix (e.g., a routine to print the content for debugging purposes).
• Optionally, provide accessor macros or functions as needed for that particular implementation to access different parts of the content field of the newly defined SUNMatrix.

Each sunmatrix implementation included in sundials has a unique identifier, specified in an enumeration and shown in Table 8.1. It is recommended that a user-supplied sunmatrix implementation use the SUNMATRIX_CUSTOM identifier.

Table 8.2: Description of the SUNMatrix operations

SUNMatGetID
  Usage: id = SUNMatGetID(A);
  Returns the type identifier for the matrix A. It is used to determine the matrix implementation type (e.g. dense, banded, sparse, . . . ) from the abstract SUNMatrix interface. This is used to assess compatibility with sundials-provided linear solver implementations. Returned values are given in Table 8.1.

SUNMatClone
  Usage: B = SUNMatClone(A);
  Creates a new SUNMatrix of the same type as an existing matrix A and sets the ops field. It does not copy the matrix, but rather allocates storage for the new matrix.

SUNMatDestroy
  Usage: SUNMatDestroy(A);
  Destroys the SUNMatrix A and frees memory allocated for its internal data.

SUNMatSpace
  Usage: ier = SUNMatSpace(A, &lrw, &liw);
  Returns the storage requirements for the matrix A. lrw is a long int containing the number of realtype words and liw is a long int containing the number of integer words.
The return value is an integer flag denoting success/failure of the operation. This function is advisory only, for use in determining a user's total space requirements; it could be a dummy function in a user-supplied sunmatrix module if that information is not of interest.

SUNMatZero
  Usage: ier = SUNMatZero(A);
  Performs the operation A_ij = 0 for all entries of the matrix A. The return value is an integer flag denoting success/failure of the operation.

SUNMatCopy
  Usage: ier = SUNMatCopy(A, B);
  Performs the operation B_ij = A_ij for all entries of the matrices A and B. The return value is an integer flag denoting success/failure of the operation.

SUNMatScaleAdd
  Usage: ier = SUNMatScaleAdd(c, A, B);
  Performs the operation A = cA + B. The return value is an integer flag denoting success/failure of the operation.

SUNMatScaleAddI
  Usage: ier = SUNMatScaleAddI(c, A);
  Performs the operation A = cA + I. The return value is an integer flag denoting success/failure of the operation.

SUNMatMatvec
  Usage: ier = SUNMatMatvec(A, x, y);
  Performs the matrix-vector product operation y = Ax. It should only be called with vectors x and y that are compatible with the matrix A, both in storage type and dimensions. The return value is an integer flag denoting success/failure of the operation.

We note that not all sunmatrix types are compatible with all nvector types provided with sundials. This is primarily due to the need for compatibility within the SUNMatMatvec routine; however, compatibility between sunmatrix and nvector implementations is more crucial when considering their interaction within sunlinsol objects, as will be described in more detail in Chapter 9. More specifically, Table 8.3 shows the matrix interfaces available as sunmatrix modules and the vector implementations compatible with each.

Table 8.3: sundials matrix interfaces and vector implementations that can be used for each.

  Matrix Interface  Serial  Parallel (MPI)  OpenMP  pThreads  hypre Vec.  petsc Vec.  cuda  raja  User Suppl.
  Dense             X                       X       X                                             X
  Band              X                       X       X                                             X
  Sparse            X                       X       X                                             X
  User supplied     X       X               X       X         X           X           X     X     X

8.1 SUNMatrix functions used by IDAS

In Table 8.4, we list the matrix functions in the sunmatrix module used within the idas package. The table also shows, for each function, which of the code modules uses the function. The main idas integrator does not call any sunmatrix functions directly, so the table columns are specific to the idals interface and the idabbdpre preconditioner module. We further note that the idals interface only utilizes these routines when supplied with a matrix-based linear solver, i.e., the sunmatrix object passed to IDASetLinearSolver was not NULL.

At this point, we should emphasize that the idas user does not need to know anything about the usage of matrix functions by the idas code modules in order to use idas. The information is presented as an implementation detail for the interested reader.

Table 8.4: List of matrix function usage by idas code modules (columns: idals, idabbdpre). The functions involved are SUNMatGetID, SUNMatDestroy, SUNMatZero, and SUNMatSpace†.

† The matrix functions listed in Table 8.4 with a † symbol are optionally used, in that these are only called if they are implemented in the sunmatrix module that is being used (i.e. their function pointers are non-NULL). The matrix functions listed in Table 8.2 that are not used by idas are: SUNMatCopy, SUNMatClone, SUNMatScaleAdd, SUNMatScaleAddI, and SUNMatMatvec. Therefore a user-supplied sunmatrix module for idas could omit these functions.
8.2 The SUNMatrix Dense implementation

The dense implementation of the sunmatrix module provided with sundials, sunmatrix_dense, defines the content field of SUNMatrix to be the following structure:

  struct _SUNMatrixContent_Dense {
    sunindextype M;
    sunindextype N;
    realtype *data;
    sunindextype ldata;
    realtype **cols;
  };

These entries of the content field contain the following information:

M     - number of rows
N     - number of columns
data  - pointer to a contiguous block of realtype variables. The elements of the dense matrix are stored columnwise, i.e. the (i,j)-th element of a dense sunmatrix A (with 0 ≤ i < M and 0 ≤ j < N) may be accessed via data[j*M+i].
ldata - length of the data array (= M·N).
cols  - array of pointers. cols[j] points to the first element of the j-th column of the matrix in the array data. The (i,j)-th element of a dense sunmatrix A (with 0 ≤ i < M and 0 ≤ j < N) may be accessed via cols[j][i].

The header file to include when using this module is sunmatrix/sunmatrix_dense.h. The sunmatrix_dense module is accessible from all sundials solvers without linking to the libsundials_sunmatrixdense module library.

8.2.1 SUNMatrix Dense accessor macros

The following macros are provided to access the content of a sunmatrix_dense matrix. The prefix SM_ in the names denotes that these macros are for SUNMatrix implementations, and the suffix _D denotes that these are specific to the dense version.

• SM_CONTENT_D
  This macro gives access to the contents of the dense SUNMatrix. The assignment A_cont = SM_CONTENT_D(A) sets A_cont to be a pointer to the dense SUNMatrix content structure.
  Implementation:
    #define SM_CONTENT_D(A) ( (SUNMatrixContent_Dense)(A->content) )

• SM_ROWS_D, SM_COLUMNS_D, and SM_LDATA_D
  These macros give individual access to various lengths relevant to the content of a dense SUNMatrix. These may be used either to retrieve or to set these values.
For example, the assignment A_rows = SM_ROWS_D(A) sets A_rows to be the number of rows in the matrix A. Similarly, the assignment SM_COLUMNS_D(A) = A_cols sets the number of columns in A to equal A_cols.
  Implementation:
    #define SM_ROWS_D(A)    ( SM_CONTENT_D(A)->M )
    #define SM_COLUMNS_D(A) ( SM_CONTENT_D(A)->N )
    #define SM_LDATA_D(A)   ( SM_CONTENT_D(A)->ldata )

• SM_DATA_D and SM_COLS_D
  These macros give access to the data and cols pointers for the matrix entries. The assignment A_data = SM_DATA_D(A) sets A_data to be a pointer to the first component of the data array for the dense SUNMatrix A. The assignment SM_DATA_D(A) = A_data sets the data array of A to be A_data by storing the pointer A_data. Similarly, the assignment A_cols = SM_COLS_D(A) sets A_cols to be a pointer to the array of column pointers for the dense SUNMatrix A. The assignment SM_COLS_D(A) = A_cols sets the column pointer array of A to be A_cols by storing the pointer A_cols.
  Implementation:
    #define SM_DATA_D(A) ( SM_CONTENT_D(A)->data )
    #define SM_COLS_D(A) ( SM_CONTENT_D(A)->cols )

• SM_COLUMN_D and SM_ELEMENT_D
  These macros give access to the individual columns and entries of the data array of a dense SUNMatrix. The assignment col_j = SM_COLUMN_D(A,j) sets col_j to be a pointer to the first entry of the j-th column of the M × N dense matrix A (with 0 ≤ j < N). The type of the expression SM_COLUMN_D(A,j) is realtype *. The pointer returned by the call SM_COLUMN_D(A,j) can be treated as an array which is indexed from 0 to M − 1. The assignments SM_ELEMENT_D(A,i,j) = a_ij and a_ij = SM_ELEMENT_D(A,i,j) reference the (i,j)-th element of the M × N dense matrix A (with 0 ≤ i < M and 0 ≤ j < N).
  Implementation:
    #define SM_COLUMN_D(A,j)    ( (SM_CONTENT_D(A)->cols)[j] )
    #define SM_ELEMENT_D(A,i,j) ( (SM_CONTENT_D(A)->cols)[j][i] )

8.2.2 SUNMatrix Dense functions

The sunmatrix_dense module defines dense implementations of all matrix operations listed in Table 8.2.
Their names are obtained from those in Table 8.2 by appending the suffix _Dense (e.g. SUNMatCopy_Dense). All the standard matrix operations listed in Table 8.2 with the suffix _Dense appended are callable via the Fortran 2003 interface by prepending an 'F' (e.g. FSUNMatCopy_Dense). The module sunmatrix_dense provides the following additional user-callable routines:

SUNDenseMatrix
  Prototype:   SUNMatrix SUNDenseMatrix(sunindextype M, sunindextype N)
  Description: This constructor function creates and allocates memory for a dense SUNMatrix. Its arguments are the number of rows, M, and columns, N, of the dense matrix.
  F2003 Name:  This function is callable as FSUNDenseMatrix when using the Fortran 2003 interface module.

SUNDenseMatrix_Print
  Prototype:   void SUNDenseMatrix_Print(SUNMatrix A, FILE* outfile)
  Description: This function prints the content of a dense SUNMatrix to the output stream specified by outfile. Note: stdout or stderr may be used as arguments for outfile to print directly to standard output or standard error, respectively.

SUNDenseMatrix_Rows
  Prototype:   sunindextype SUNDenseMatrix_Rows(SUNMatrix A)
  Description: This function returns the number of rows in the dense SUNMatrix.
  F2003 Name:  This function is callable as FSUNDenseMatrix_Rows when using the Fortran 2003 interface module.

SUNDenseMatrix_Columns
  Prototype:   sunindextype SUNDenseMatrix_Columns(SUNMatrix A)
  Description: This function returns the number of columns in the dense SUNMatrix.
  F2003 Name:  This function is callable as FSUNDenseMatrix_Columns when using the Fortran 2003 interface module.

SUNDenseMatrix_LData
  Prototype:   sunindextype SUNDenseMatrix_LData(SUNMatrix A)
  Description: This function returns the length of the data array for the dense SUNMatrix.
  F2003 Name:  This function is callable as FSUNDenseMatrix_LData when using the Fortran 2003 interface module.
SUNDenseMatrix_Data
  Prototype:   realtype* SUNDenseMatrix_Data(SUNMatrix A)
  Description: This function returns a pointer to the data array for the dense SUNMatrix.
  F2003 Name:  This function is callable as FSUNDenseMatrix_Data when using the Fortran 2003 interface module.

SUNDenseMatrix_Cols
  Prototype:   realtype** SUNDenseMatrix_Cols(SUNMatrix A)
  Description: This function returns a pointer to the cols array for the dense SUNMatrix.

SUNDenseMatrix_Column
  Prototype:   realtype* SUNDenseMatrix_Column(SUNMatrix A, sunindextype j)
  Description: This function returns a pointer to the first entry of the j-th column of the dense SUNMatrix. The resulting pointer should be indexed over the range 0 to M − 1.
  F2003 Name:  This function is callable as FSUNDenseMatrix_Column when using the Fortran 2003 interface module.

Notes
• When looping over the components of a dense SUNMatrix A, the most efficient approaches are to:
  – First obtain the component array via A_data = SM_DATA_D(A) or A_data = SUNDenseMatrix_Data(A), and then access A_data[i] within the loop.
  – First obtain the array of column pointers via A_cols = SM_COLS_D(A) or A_cols = SUNDenseMatrix_Cols(A), and then access A_cols[j][i] within the loop.
  – Within a loop over the columns, access the column pointer via A_colj = SUNDenseMatrix_Column(A,j), and then access the entries within that column via A_colj[i] within the loop.
  All three of these are more efficient than using SM_ELEMENT_D(A,i,j) within a double loop.
• Within the SUNMatMatvec_Dense routine, internal consistency checks are performed to ensure that the matrix is called with consistent nvector implementations. These are currently limited to: nvector_serial, nvector_openmp, and nvector_pthreads. As additional compatible vector implementations are added to sundials, these will be included within this compatibility check.
8.2.3 SUNMatrix Dense Fortran interfaces

The sunmatrix_dense module provides a Fortran 2003 module as well as Fortran 77 style interface functions for use from Fortran applications.

FORTRAN 2003 interface module

The fsunmatrix_dense_mod Fortran module defines interfaces to most sunmatrix_dense C functions using the intrinsic iso_c_binding module, which provides a standardized mechanism for interoperating with C. As noted in the C function descriptions above, the interface functions are named after the corresponding C function, but with a leading 'F'. For example, the function SUNDenseMatrix is interfaced as FSUNDenseMatrix. The Fortran 2003 sunmatrix_dense interface module can be accessed with the use statement, i.e. use fsunmatrix_dense_mod, and linking to the library libsundials_fsunmatrixdense_mod.lib in addition to the C library. For details on where the library and module file fsunmatrix_dense_mod.mod are installed see Appendix A. We note that the module is accessible from the Fortran 2003 sundials integrators without separately linking to the libsundials_fsunmatrixdense_mod library.

FORTRAN 77 interface functions

For solvers that include a Fortran interface module, the sunmatrix_dense module also includes the Fortran-callable function FSUNDenseMatInit(code, M, N, ier) to initialize this sunmatrix_dense module for a given sundials solver. Here code is an integer input solver id (1 for cvode, 2 for ida, 3 for kinsol, 4 for arkode); M and N are the corresponding dense matrix construction arguments (declared to match C type long int); and ier is an error return flag equal to 0 for success and -1 for failure. Both code and ier are declared to match C type int. Additionally, when using arkode with a non-identity mass matrix, the Fortran-callable function FSUNDenseMassMatInit(M, N, ier) initializes this sunmatrix_dense module for storing the mass matrix.
8.3 The SUNMatrix Band implementation

The banded implementation of the sunmatrix module provided with sundials, sunmatrix_band, defines the content field of SUNMatrix to be the following structure:

  struct _SUNMatrixContent_Band {
    sunindextype M;
    sunindextype N;
    sunindextype mu;
    sunindextype ml;
    sunindextype s_mu;
    sunindextype ldim;
    realtype *data;
    sunindextype ldata;
    realtype **cols;
  };

A diagram of the underlying data representation in a banded matrix is shown in Figure 8.1. A more complete description of the parts of this content field is given below:

M    - number of rows
N    - number of columns (N = M)
mu   - upper half-bandwidth, 0 ≤ mu < N
ml   - lower half-bandwidth, 0 ≤ ml < N
s_mu - storage upper bandwidth, mu ≤ s_mu < N. The LU decomposition routines in the associated sunlinsol_band and sunlinsol_lapackband modules write the LU factors into the storage for A. The upper triangular factor U, however, may have an upper bandwidth as big as min(N−1, mu+ml) because of partial pivoting. The s_mu field holds the upper half-bandwidth allocated for A.
ldim - leading dimension (ldim ≥ s_mu+ml+1)

Figure 8.1: Diagram of the storage for the sunmatrix_band module. Here A is an N × N band matrix with upper and lower half-bandwidths mu and ml, respectively. The rows and columns of A are numbered from 0 to N − 1 and the (i,j)-th element of A is denoted A(i,j). The greyed-out areas of the underlying component storage are used by the associated sunlinsol_band linear solver.

data - pointer to a contiguous block of realtype variables. The elements of the banded matrix are stored columnwise (i.e. columns are stored one on top of the other in memory). Only elements within the specified half-bandwidths are stored.
data is a pointer to ldata contiguous locations which hold the elements within the band of A.

ldata - length of the data array (= ldim·N)
cols  - array of pointers. cols[j] is a pointer to the uppermost element within the band in the j-th column. This pointer may be treated as an array indexed from s_mu−mu (to access the uppermost element within the band in the j-th column) to s_mu+ml (to access the lowest element within the band in the j-th column). Indices from 0 to s_mu−mu−1 give access to extra storage elements required by the LU decomposition function. Finally, cols[j][i−j+s_mu] is the (i,j)-th element with j−mu ≤ i ≤ j+ml.

The header file to include when using this module is sunmatrix/sunmatrix_band.h. The sunmatrix_band module is accessible from all sundials solvers without linking to the libsundials_sunmatrixband module library.

8.3.1 SUNMatrix Band accessor macros

The following macros are provided to access the content of a sunmatrix_band matrix. The prefix SM_ in the names denotes that these macros are for SUNMatrix implementations, and the suffix _B denotes that these are specific to the banded version.

• SM_CONTENT_B
  This macro gives access to the contents of the banded SUNMatrix. The assignment A_cont = SM_CONTENT_B(A) sets A_cont to be a pointer to the banded SUNMatrix content structure.
  Implementation:
    #define SM_CONTENT_B(A) ( (SUNMatrixContent_Band)(A->content) )

• SM_ROWS_B, SM_COLUMNS_B, SM_UBAND_B, SM_LBAND_B, SM_SUBAND_B, SM_LDIM_B, and SM_LDATA_B
  These macros give individual access to various lengths relevant to the content of a banded SUNMatrix. These may be used either to retrieve or to set these values. For example, the assignment A_rows = SM_ROWS_B(A) sets A_rows to be the number of rows in the matrix A. Similarly, the assignment SM_COLUMNS_B(A) = A_cols sets the number of columns in A to equal A_cols.
  Implementation:
    #define SM_ROWS_B(A)    ( SM_CONTENT_B(A)->M )
    #define SM_COLUMNS_B(A) ( SM_CONTENT_B(A)->N )
    #define SM_UBAND_B(A)   ( SM_CONTENT_B(A)->mu )
    #define SM_LBAND_B(A)   ( SM_CONTENT_B(A)->ml )
    #define SM_SUBAND_B(A)  ( SM_CONTENT_B(A)->s_mu )
    #define SM_LDIM_B(A)    ( SM_CONTENT_B(A)->ldim )
    #define SM_LDATA_B(A)   ( SM_CONTENT_B(A)->ldata )

• SM_DATA_B and SM_COLS_B
  These macros give access to the data and cols pointers for the matrix entries. The assignment A_data = SM_DATA_B(A) sets A_data to be a pointer to the first component of the data array for the banded SUNMatrix A. The assignment SM_DATA_B(A) = A_data sets the data array of A to be A_data by storing the pointer A_data. Similarly, the assignment A_cols = SM_COLS_B(A) sets A_cols to be a pointer to the array of column pointers for the banded SUNMatrix A. The assignment SM_COLS_B(A) = A_cols sets the column pointer array of A to be A_cols by storing the pointer A_cols.
  Implementation:
    #define SM_DATA_B(A) ( SM_CONTENT_B(A)->data )
    #define SM_COLS_B(A) ( SM_CONTENT_B(A)->cols )

• SM_COLUMN_B, SM_COLUMN_ELEMENT_B, and SM_ELEMENT_B
  These macros give access to the individual columns and entries of the data array of a banded SUNMatrix. The assignments SM_ELEMENT_B(A,i,j) = a_ij and a_ij = SM_ELEMENT_B(A,i,j) reference the (i,j)-th element of the N × N band matrix A, where 0 ≤ i, j ≤ N − 1. The location (i,j) should further satisfy j−mu ≤ i ≤ j+ml. The assignment col_j = SM_COLUMN_B(A,j) sets col_j to be a pointer to the diagonal element of the j-th column of the N × N band matrix A, 0 ≤ j ≤ N − 1. The type of the expression SM_COLUMN_B(A,j) is realtype *. The pointer returned by the call SM_COLUMN_B(A,j) can be treated as an array which is indexed from −mu to ml.
The assignments SM_COLUMN_ELEMENT_B(col_j,i,j) = a_ij and a_ij = SM_COLUMN_ELEMENT_B(col_j,i,j) reference the (i,j)-th entry of the band matrix A when used in conjunction with SM_COLUMN_B to reference the j-th column through col_j. The index (i,j) should satisfy j−mu ≤ i ≤ j+ml.
  Implementation:
    #define SM_COLUMN_B(A,j)               ( ((SM_CONTENT_B(A)->cols)[j])+SM_SUBAND_B(A) )
    #define SM_COLUMN_ELEMENT_B(col_j,i,j) (col_j[(i)-(j)])
    #define SM_ELEMENT_B(A,i,j)            ( (SM_CONTENT_B(A)->cols)[j][(i)-(j)+SM_SUBAND_B(A)] )

8.3.2 SUNMatrix Band functions

The sunmatrix_band module defines banded implementations of all matrix operations listed in Table 8.2. Their names are obtained from those in Table 8.2 by appending the suffix _Band (e.g. SUNMatCopy_Band). All the standard matrix operations listed in Table 8.2 with the suffix _Band appended are callable via the Fortran 2003 interface by prepending an 'F' (e.g. FSUNMatCopy_Band). The module sunmatrix_band provides the following additional user-callable routines:

SUNBandMatrix
  Prototype:   SUNMatrix SUNBandMatrix(sunindextype N, sunindextype mu, sunindextype ml)
  Description: This constructor function creates and allocates memory for a banded SUNMatrix. Its arguments are the matrix size, N, and the upper and lower half-bandwidths of the matrix, mu and ml. The stored upper bandwidth is set to mu+ml to accommodate subsequent factorization in the sunlinsol_band and sunlinsol_lapackband modules.
  F2003 Name:  This function is callable as FSUNBandMatrix when using the Fortran 2003 interface module.

SUNBandMatrixStorage
  Prototype:   SUNMatrix SUNBandMatrixStorage(sunindextype N, sunindextype mu, sunindextype ml, sunindextype smu)
  Description: This constructor function creates and allocates memory for a banded SUNMatrix. Its arguments are the matrix size, N, the upper and lower half-bandwidths of the matrix, mu and ml, and the stored upper bandwidth, smu.
When creating a band SUNMatrix, this value should be
• at least min(N−1, mu+ml) if the matrix will be used by the sunlinsol_band module;
• exactly equal to mu+ml if the matrix will be used by the sunlinsol_lapackband module;
• at least mu if used in some other manner.
Note: it is strongly recommended that users call the default constructor, SUNBandMatrix, in all standard use cases. This advanced constructor is used internally within sundials solvers, and is provided for users who require banded matrices for non-default purposes.

SUNBandMatrix_Print
  Prototype:   void SUNBandMatrix_Print(SUNMatrix A, FILE* outfile)
  Description: This function prints the content of a banded SUNMatrix to the output stream specified by outfile. Note: stdout or stderr may be used as arguments for outfile to print directly to standard output or standard error, respectively.

SUNBandMatrix_Rows
  Prototype:   sunindextype SUNBandMatrix_Rows(SUNMatrix A)
  Description: This function returns the number of rows in the banded SUNMatrix.
  F2003 Name:  This function is callable as FSUNBandMatrix_Rows when using the Fortran 2003 interface module.

SUNBandMatrix_Columns
  Prototype:   sunindextype SUNBandMatrix_Columns(SUNMatrix A)
  Description: This function returns the number of columns in the banded SUNMatrix.
  F2003 Name:  This function is callable as FSUNBandMatrix_Columns when using the Fortran 2003 interface module.

SUNBandMatrix_LowerBandwidth
  Prototype:   sunindextype SUNBandMatrix_LowerBandwidth(SUNMatrix A)
  Description: This function returns the lower half-bandwidth of the banded SUNMatrix.
  F2003 Name:  This function is callable as FSUNBandMatrix_LowerBandwidth when using the Fortran 2003 interface module.

SUNBandMatrix_UpperBandwidth
  Prototype:   sunindextype SUNBandMatrix_UpperBandwidth(SUNMatrix A)
  Description: This function returns the upper half-bandwidth of the banded SUNMatrix.
  F2003 Name:  This function is callable as FSUNBandMatrix_UpperBandwidth when using the Fortran 2003 interface module.

SUNBandMatrix_StoredUpperBandwidth
  Prototype:   sunindextype SUNBandMatrix_StoredUpperBandwidth(SUNMatrix A)
  Description: This function returns the stored upper half-bandwidth of the banded SUNMatrix.
  F2003 Name:  This function is callable as FSUNBandMatrix_StoredUpperBandwidth when using the Fortran 2003 interface module.

SUNBandMatrix_LDim
  Prototype:   sunindextype SUNBandMatrix_LDim(SUNMatrix A)
  Description: This function returns the length of the leading dimension of the banded SUNMatrix.
  F2003 Name:  This function is callable as FSUNBandMatrix_LDim when using the Fortran 2003 interface module.

SUNBandMatrix_Data
  Prototype:   realtype* SUNBandMatrix_Data(SUNMatrix A)
  Description: This function returns a pointer to the data array for the banded SUNMatrix.
  F2003 Name:  This function is callable as FSUNBandMatrix_Data when using the Fortran 2003 interface module.

SUNBandMatrix_Cols
  Prototype:   realtype** SUNBandMatrix_Cols(SUNMatrix A)
  Description: This function returns a pointer to the cols array for the banded SUNMatrix.

SUNBandMatrix_Column
  Prototype:   realtype* SUNBandMatrix_Column(SUNMatrix A, sunindextype j)
  Description: This function returns a pointer to the diagonal entry of the j-th column of the banded SUNMatrix. The resulting pointer should be indexed over the range −mu to ml.
  F2003 Name:  This function is callable as FSUNBandMatrix_Column when using the Fortran 2003 interface module.

Notes
• When looping over the components of a banded SUNMatrix A, the most efficient approaches are to:
  – First obtain the component array via A_data = SM_DATA_B(A) or A_data = SUNBandMatrix_Data(A), and then access A_data[i] within the loop.
  – First obtain the array of column pointers via A_cols = SM_COLS_B(A) or A_cols = SUNBandMatrix_Cols(A), and then access A_cols[j][i] within the loop.
  – Within a loop over the columns, access the column pointer via A_colj = SUNBandMatrix_Column(A,j), and then access the entries within that column using SM_COLUMN_ELEMENT_B(A_colj,i,j).
  All three of these are more efficient than using SM_ELEMENT_B(A,i,j) within a double loop.
• Within the SUNMatMatvec_Band routine, internal consistency checks are performed to ensure that the matrix is called with consistent nvector implementations. These are currently limited to: nvector_serial, nvector_openmp, and nvector_pthreads. As additional compatible vector implementations are added to sundials, these will be included within this compatibility check.

8.3.3 SUNMatrix Band Fortran interfaces

The sunmatrix_band module provides a Fortran 2003 module as well as Fortran 77 style interface functions for use from Fortran applications.

FORTRAN 2003 interface module

The fsunmatrix_band_mod Fortran module defines interfaces to most sunmatrix_band C functions using the intrinsic iso_c_binding module, which provides a standardized mechanism for interoperating with C. As noted in the C function descriptions above, the interface functions are named after the corresponding C function, but with a leading 'F'. For example, the function SUNBandMatrix is interfaced as FSUNBandMatrix. The Fortran 2003 sunmatrix_band interface module can be accessed with the use statement, i.e. use fsunmatrix_band_mod, and linking to the library libsundials_fsunmatrixband_mod.lib in addition to the C library. For details on where the library and module file fsunmatrix_band_mod.mod are installed see Appendix A. We note that the module is accessible from the Fortran 2003 sundials integrators without separately linking to the libsundials_fsunmatrixband_mod library.
FORTRAN 77 interface functions

For solvers that include a Fortran interface module, the sunmatrix_band module also includes the Fortran-callable function FSUNBandMatInit(code, N, mu, ml, ier) to initialize this sunmatrix_band module for a given sundials solver. Here code is an integer input solver id (1 for cvode, 2 for ida, 3 for kinsol, 4 for arkode); N, mu, and ml are the corresponding band matrix construction arguments (declared to match C type long int); and ier is an error return flag equal to 0 for success and -1 for failure. Both code and ier are declared to match C type int. Additionally, when using arkode with a non-identity mass matrix, the Fortran-callable function FSUNBandMassMatInit(N, mu, ml, ier) initializes this sunmatrix_band module for storing the mass matrix.

8.4 The SUNMatrix Sparse implementation

The sparse implementation of the sunmatrix module provided with sundials, sunmatrix_sparse, is designed to work with either compressed-sparse-column (CSC) or compressed-sparse-row (CSR) sparse matrix formats. To this end, it defines the content field of SUNMatrix to be the following structure:

  struct _SUNMatrixContent_Sparse {
    sunindextype M;
    sunindextype N;
    sunindextype NNZ;
    sunindextype NP;
    realtype *data;
    int sparsetype;
    sunindextype *indexvals;
    sunindextype *indexptrs;
    /* CSC indices */
    sunindextype **rowvals;
    sunindextype **colptrs;
    /* CSR indices */
    sunindextype **colvals;
    sunindextype **rowptrs;
  };

A diagram of the underlying data representation for a CSC matrix is shown in Figure 8.2 (the CSR format is similar). A more complete description of the parts of this content field is given below:

M   - number of rows
N   - number of columns
NNZ - maximum number of nonzero entries in the matrix (allocated length of the data and indexvals arrays)
NP  - number of index pointers (e.g. number of column pointers for a CSC matrix). For CSC matrices NP = N, and for CSR matrices NP = M. This value is set automatically based on the input for sparsetype.
data - pointer to a contiguous block of realtype variables (of length NNZ), containing the values of the nonzero entries in the matrix

sparsetype - type of the sparse matrix (CSC MAT or CSR MAT)

indexvals - pointer to a contiguous block of sunindextype variables (of length NNZ), containing the row indices (if CSC) or column indices (if CSR) of each nonzero matrix entry held in data

indexptrs - pointer to a contiguous block of sunindextype variables (of length NP+1). For CSC matrices each entry provides the index of the first column entry into the data and indexvals arrays, e.g. if indexptrs[3]=7, then the first nonzero entry in the fourth column of the matrix is located in data[7], and is located in row indexvals[7] of the matrix. The last entry contains the total number of nonzero values in the matrix and hence points one past the end of the active data in the data and indexvals arrays. For CSR matrices, each entry provides the index of the first row entry into the data and indexvals arrays.

The following pointers are provided for user convenience, to give a more intuitive interface to the CSC and CSR sparse matrix data structures. They are set automatically when creating a sparse sunmatrix, based on the sparse matrix storage type.

rowvals - pointer to indexvals when sparsetype is CSC MAT, otherwise set to NULL.

colptrs - pointer to indexptrs when sparsetype is CSC MAT, otherwise set to NULL.

colvals - pointer to indexvals when sparsetype is CSR MAT, otherwise set to NULL.

rowptrs - pointer to indexptrs when sparsetype is CSR MAT, otherwise set to NULL.

For example, the 5 × 4 CSC matrix

    0  3  1  0
    3  0  0  2
    0  7  0  0
    1  0  0  9
    0  0  0  5

could be stored in this structure as either

    M = 5;
    N = 4;
    NNZ = 8;
    NP = N;
    data = {3.0, 1.0, 3.0, 7.0, 1.0, 2.0, 9.0, 5.0};
    sparsetype = CSC_MAT;
    indexvals = {1, 3, 0, 2, 0, 1, 3, 4};
    indexptrs = {0, 2, 4, 5, 8};

or

    M = 5;
    N = 4;
    NNZ = 10;
    NP = N;
    data = {3.0, 1.0, 3.0, 7.0, 1.0, 2.0, 9.0, 5.0, *, *};
    sparsetype = CSC_MAT;
    indexvals = {1, 3, 0, 2, 0, 1, 3, 4, *, *};
    indexptrs = {0, 2, 4, 5, 8};

where the first has no unused space, and the second has additional storage (the entries marked with * may contain any values). Note in both cases that the final value in indexptrs is 8, indicating the total number of nonzero entries in the matrix. Similarly, in CSR format, the same matrix could be stored as

    M = 5;
    N = 4;
    NNZ = 8;
    NP = M;
    data = {3.0, 1.0, 3.0, 2.0, 7.0, 1.0, 9.0, 5.0};
    sparsetype = CSR_MAT;
    indexvals = {1, 2, 0, 3, 1, 0, 3, 3};
    indexptrs = {0, 2, 4, 5, 7, 8};

Figure 8.2: Diagram of the storage for a compressed-sparse-column matrix. Here A is an M × N sparse matrix with storage for up to NNZ nonzero entries (the allocated length of both data and indexvals). The entries in indexvals may assume values from 0 to M − 1, corresponding to the row index (zero-based) of each nonzero value. The entries in data contain the values of the nonzero entries, with the row i, column j entry of A (again, zero-based) denoted as A(i,j). The indexptrs array contains N + 1 entries; the first N denote the starting index of each column within the indexvals and data arrays, while the final entry points one past the final nonzero entry. Here, although NNZ values are allocated, only nz are actually filled in; the greyed-out portions of data and indexvals indicate extra allocated space.
The header file to include when using this module is sunmatrix/sunmatrix sparse.h. The sunmatrix sparse module is accessible from all sundials solvers without linking to the libsundials sunmatrixsparse module library.

8.4.1 SUNMatrix Sparse accessor macros

The following macros are provided to access the content of a sunmatrix sparse matrix. The prefix SM in the names denotes that these macros are for SUNMatrix implementations, and the suffix S denotes that these are specific to the sparse version.

• SM CONTENT S

This macro gives access to the contents of the sparse SUNMatrix. The assignment A cont = SM CONTENT S(A) sets A cont to be a pointer to the sparse SUNMatrix content structure.

Implementation:

#define SM_CONTENT_S(A) ( (SUNMatrixContent_Sparse)(A->content) )

• SM ROWS S, SM COLUMNS S, SM NNZ S, SM NP S, and SM SPARSETYPE S

These macros give individual access to various lengths relevant to the content of a sparse SUNMatrix. They may be used either to retrieve or to set these values. For example, the assignment A rows = SM ROWS S(A) sets A rows to be the number of rows in the matrix A. Similarly, the assignment SM COLUMNS S(A) = A cols sets the number of columns in A to equal A cols.

Implementation:

#define SM_ROWS_S(A) ( SM_CONTENT_S(A)->M )
#define SM_COLUMNS_S(A) ( SM_CONTENT_S(A)->N )
#define SM_NNZ_S(A) ( SM_CONTENT_S(A)->NNZ )
#define SM_NP_S(A) ( SM_CONTENT_S(A)->NP )
#define SM_SPARSETYPE_S(A) ( SM_CONTENT_S(A)->sparsetype )

• SM DATA S, SM INDEXVALS S, and SM INDEXPTRS S

These macros give access to the data and index arrays for the matrix entries. The assignment A data = SM DATA S(A) sets A data to be a pointer to the first component of the data array for the sparse SUNMatrix A. The assignment SM DATA S(A) = A data sets the data array of A by storing the pointer A data.
Similarly, the assignment A indexvals = SM INDEXVALS S(A) sets A indexvals to be a pointer to the array of index values (i.e. row indices for a CSC matrix, or column indices for a CSR matrix) for the sparse SUNMatrix A. The assignment A indexptrs = SM INDEXPTRS S(A) sets A indexptrs to be a pointer to the array of index pointers (i.e. the starting indices in the data/indexvals arrays for each row or column in CSR or CSC formats, respectively).

Implementation:

#define SM_DATA_S(A) ( SM_CONTENT_S(A)->data )
#define SM_INDEXVALS_S(A) ( SM_CONTENT_S(A)->indexvals )
#define SM_INDEXPTRS_S(A) ( SM_CONTENT_S(A)->indexptrs )

8.4.2 SUNMatrix Sparse functions

The sunmatrix sparse module defines sparse implementations of all matrix operations listed in Table 8.2. Their names are obtained from those in Table 8.2 by appending the suffix Sparse (e.g. SUNMatCopy Sparse). All the standard matrix operations listed in Table 8.2 with the suffix Sparse appended are callable via the Fortran 2003 interface by prepending an ‘F’ (e.g. FSUNMatCopy Sparse). The module sunmatrix sparse provides the following additional user-callable routines:

SUNSparseMatrix

Prototype SUNMatrix SUNSparseMatrix(sunindextype M, sunindextype N, sunindextype NNZ, int sparsetype)

Description This function creates and allocates memory for a sparse SUNMatrix. Its arguments are the number of rows and columns of the matrix, M and N, the maximum number of nonzeros to be stored in the matrix, NNZ, and a flag sparsetype indicating whether to use CSR or CSC format (valid arguments are CSR MAT or CSC MAT).

F2003 Name This function is callable as FSUNSparseMatrix when using the Fortran 2003 interface module.
SUNSparseFromDenseMatrix

Prototype SUNMatrix SUNSparseFromDenseMatrix(SUNMatrix A, realtype droptol, int sparsetype);

Description This function creates a new sparse matrix from an existing dense matrix by copying all values with magnitude larger than droptol into the sparse matrix structure. Requirements:
• A must have type SUNMATRIX DENSE;
• droptol must be non-negative;
• sparsetype must be either CSC MAT or CSR MAT.
The function returns NULL if any requirements are violated, or if the matrix storage request cannot be satisfied.

F2003 Name This function is callable as FSUNSparseFromDenseMatrix when using the Fortran 2003 interface module.

SUNSparseFromBandMatrix

Prototype SUNMatrix SUNSparseFromBandMatrix(SUNMatrix A, realtype droptol, int sparsetype);

Description This function creates a new sparse matrix from an existing band matrix by copying all values with magnitude larger than droptol into the sparse matrix structure. Requirements:
• A must have type SUNMATRIX BAND;
• droptol must be non-negative;
• sparsetype must be either CSC MAT or CSR MAT.
The function returns NULL if any requirements are violated, or if the matrix storage request cannot be satisfied.

F2003 Name This function is callable as FSUNSparseFromBandMatrix when using the Fortran 2003 interface module.

SUNSparseMatrix Realloc

Prototype int SUNSparseMatrix Realloc(SUNMatrix A)

Description This function reallocates internal storage arrays in a sparse matrix so that the resulting sparse matrix has no wasted space (i.e. the space allocated for nonzero entries equals the actual number of nonzeros, indexptrs[NP]). Returns 0 on success and 1 on failure (e.g. if the input matrix is not sparse).

F2003 Name This function is callable as FSUNSparseMatrix Realloc when using the Fortran 2003 interface module.
SUNSparseMatrix Reallocate

Prototype int SUNSparseMatrix Reallocate(SUNMatrix A, sunindextype NNZ)

Description This function reallocates internal storage arrays in a sparse matrix so that the resulting sparse matrix has storage for a specified number of nonzeros. Returns 0 on success and 1 on failure (e.g. if the input matrix is not sparse or if NNZ is negative).

F2003 Name This function is callable as FSUNSparseMatrix Reallocate when using the Fortran 2003 interface module.

SUNSparseMatrix Print

Prototype void SUNSparseMatrix Print(SUNMatrix A, FILE* outfile)

Description This function prints the content of a sparse SUNMatrix to the output stream specified by outfile. Note: stdout or stderr may be used as arguments for outfile to print directly to standard output or standard error, respectively.

SUNSparseMatrix Rows

Prototype sunindextype SUNSparseMatrix Rows(SUNMatrix A)

Description This function returns the number of rows in the sparse SUNMatrix.

F2003 Name This function is callable as FSUNSparseMatrix Rows when using the Fortran 2003 interface module.

SUNSparseMatrix Columns

Prototype sunindextype SUNSparseMatrix Columns(SUNMatrix A)

Description This function returns the number of columns in the sparse SUNMatrix.

F2003 Name This function is callable as FSUNSparseMatrix Columns when using the Fortran 2003 interface module.

SUNSparseMatrix NNZ

Prototype sunindextype SUNSparseMatrix NNZ(SUNMatrix A)

Description This function returns the number of entries allocated for nonzero storage for the sparse SUNMatrix.

F2003 Name This function is callable as FSUNSparseMatrix NNZ when using the Fortran 2003 interface module.

SUNSparseMatrix NP

Prototype sunindextype SUNSparseMatrix NP(SUNMatrix A)

Description This function returns the number of columns/rows for the sparse SUNMatrix, depending on whether the matrix uses CSC/CSR format, respectively. The indexptrs array has NP+1 entries.
F2003 Name This function is callable as FSUNSparseMatrix NP when using the Fortran 2003 interface module.

SUNSparseMatrix SparseType

Prototype int SUNSparseMatrix SparseType(SUNMatrix A)

Description This function returns the storage type (CSR MAT or CSC MAT) for the sparse SUNMatrix.

F2003 Name This function is callable as FSUNSparseMatrix SparseType when using the Fortran 2003 interface module.

SUNSparseMatrix Data

Prototype realtype* SUNSparseMatrix Data(SUNMatrix A)

Description This function returns a pointer to the data array for the sparse SUNMatrix.

F2003 Name This function is callable as FSUNSparseMatrix Data when using the Fortran 2003 interface module.

SUNSparseMatrix IndexValues

Prototype sunindextype* SUNSparseMatrix IndexValues(SUNMatrix A)

Description This function returns a pointer to the index value array for the sparse SUNMatrix: for CSR format this is the column index for each nonzero entry, for CSC format this is the row index for each nonzero entry.

F2003 Name This function is callable as FSUNSparseMatrix IndexValues when using the Fortran 2003 interface module.

SUNSparseMatrix IndexPointers

Prototype sunindextype* SUNSparseMatrix IndexPointers(SUNMatrix A)

Description This function returns a pointer to the index pointer array for the sparse SUNMatrix: for CSR format this is the location of the first entry of each row in the data and indexvals arrays, for CSC format this is the location of the first entry of each column.

F2003 Name This function is callable as FSUNSparseMatrix IndexPointers when using the Fortran 2003 interface module.

Within the SUNMatMatvec Sparse routine, internal consistency checks are performed to ensure that the matrix is called with consistent nvector implementations. These are currently limited to: nvector serial, nvector openmp, and nvector pthreads. As additional compatible vector implementations are added to sundials, these will be included within this compatibility check.
8.4.3 SUNMatrix Sparse Fortran interfaces

The sunmatrix sparse module provides a Fortran 2003 module as well as Fortran 77 style interface functions for use from Fortran applications.

FORTRAN 2003 interface module

The fsunmatrix sparse mod Fortran module defines interfaces to most sunmatrix sparse C functions using the intrinsic iso c binding module, which provides a standardized mechanism for interoperating with C. As noted in the C function descriptions above, the interface functions are named after the corresponding C function, but with a leading ‘F’. For example, the function SUNSparseMatrix is interfaced as FSUNSparseMatrix. The Fortran 2003 sunmatrix sparse interface module can be accessed with the use statement, i.e. use fsunmatrix sparse mod, and linking to the library libsundials fsunmatrixsparse mod.lib in addition to the C library. For details on where the library and module file fsunmatrix sparse mod.mod are installed see Appendix A. We note that the module is accessible from the Fortran 2003 sundials integrators without separately linking to the libsundials fsunmatrixsparse mod library.

FORTRAN 77 interface functions

For solvers that include a Fortran interface module, the sunmatrix sparse module also includes the Fortran-callable function FSUNSparseMatInit(code, M, N, NNZ, sparsetype, ier) to initialize this sunmatrix sparse module for a given sundials solver. Here code is an integer input for the solver id (1 for cvode, 2 for ida, 3 for kinsol, 4 for arkode); M, N and NNZ are the corresponding sparse matrix construction arguments (declared to match C type long int); sparsetype is an integer flag indicating the sparse storage type (0 for CSC, 1 for CSR); and ier is an error return flag equal to 0 for success and -1 for failure. Each of code, sparsetype and ier is declared so as to match C type int.
Additionally, when using arkode with a non-identity mass matrix, the Fortran-callable function FSUNSparseMassMatInit(M, N, NNZ, sparsetype, ier) initializes this sunmatrix sparse module for storing the mass matrix.

Chapter 9

Description of the SUNLinearSolver module

For problems that involve the solution of linear systems of equations, the sundials packages operate using generic linear solver modules defined through the sunlinsol API. This allows sundials packages to utilize any valid sunlinsol implementation that provides a set of required functions. These functions can be divided into three categories. The first are the core linear solver functions. The second group consists of “set” routines to supply the linear solver object with functions provided by the sundials package, or for modification of solver parameters. The last group consists of “get” routines for retrieving artifacts (statistics, residual vectors, etc.) from the linear solver. All of these functions are defined in the header file sundials/sundials linearsolver.h.

The implementations provided with sundials work in coordination with the sundials generic nvector and sunmatrix modules to provide a set of compatible data structures and solvers for the solution of linear systems using direct or iterative (matrix-based or matrix-free) methods. Moreover, advanced users can provide a customized SUNLinearSolver implementation to any sundials package, particularly in cases where they provide their own nvector and/or sunmatrix modules.

Historically, the sundials packages have been designed to specifically leverage the use of either direct linear solvers or matrix-free, scaled, preconditioned, iterative linear solvers. However, matrix-based iterative linear solvers are also supported. The iterative linear solvers packaged with sundials leverage scaling and preconditioning, as applicable, to balance error between solution components and to accelerate convergence of the linear solver.
To this end, instead of solving the linear system Ax = b directly, these apply the underlying iterative algorithm to the transformed system

    Ã x̃ = b̃                                                     (9.1)

where

    Ã = S1 P1^{-1} A P2^{-1} S2^{-1},   b̃ = S1 P1^{-1} b,   x̃ = S2 P2 x,    (9.2)

and where

• P1 is the left preconditioner,
• P2 is the right preconditioner,
• S1 is a diagonal matrix of scale factors for P1^{-1} b,
• S2 is a diagonal matrix of scale factors for P2 x.

sundials packages request that iterative linear solvers stop based on the 2-norm of the scaled preconditioned residual meeting a prescribed tolerance, i.e.,

    ‖b̃ − Ã x̃‖_2 < tol.

When provided an iterative sunlinsol implementation that does not support the scaling matrices S1 and S2, sundials packages will adjust the value of tol accordingly (see §9.4.2 for more details). In this case, they instead request that iterative linear solvers stop based on the criterion

    ‖P1^{-1} b − P1^{-1} A x‖_2 < tol.

We note that the corresponding adjustments to tol in this case are non-optimal, in that they cannot balance error between specific entries of the solution x, only the aggregate error in the overall solution vector. We further note that not all of the sundials-provided iterative linear solvers support the full range of the above options (e.g., separate left/right preconditioning), and that some of the sundials packages only utilize a subset of these options. Further details on these exceptions are described in the documentation for each sunlinsol implementation, or for each sundials package.

For users interested in providing their own sunlinsol module, the following section presents the sunlinsol API and its implementation, beginning with the definition of sunlinsol functions in sections 9.1.1 – 9.1.3. This is followed by the definition of functions supplied to a linear solver implementation in section 9.1.4. A table of linear solver return codes is given in section 9.1.5.
The SUNLinearSolver type and the generic sunlinsol module are defined in section 9.1.6. Section 9.2 discusses compatibility between the sundials-provided sunlinsol modules and sunmatrix modules. Section 9.3 lists the requirements for supplying a custom sunlinsol module and discusses some intended use cases. Users wishing to supply their own sunlinsol module are encouraged to use the sunlinsol implementations provided with sundials as a template for supplying custom linear solver modules. The sunlinsol functions required by this sundials package, as well as other package-specific details, are given in section 9.4. The remaining sections of this chapter present the sunlinsol modules provided with sundials.

9.1 The SUNLinearSolver API

The sunlinsol API defines several linear solver operations that enable sundials packages to utilize any sunlinsol implementation that provides the required functions. These functions can be divided into three categories. The first are the core linear solver functions. The second group of functions consists of set routines to supply the linear solver with functions provided by the sundials time integrators and to modify solver parameters. The final group consists of get routines for retrieving linear solver statistics. All of these functions are defined in the header file sundials/sundials linearsolver.h.

9.1.1 SUNLinearSolver core functions

The core linear solver functions consist of four required routines to get the linear solver type (SUNLinSolGetType), initialize the linear solver object once all solver-specific options have been set (SUNLinSolInitialize), set up the linear solver object to utilize an updated matrix A (SUNLinSolSetup), and solve the linear system Ax = b (SUNLinSolSolve). The remaining routine for destruction of the linear solver object (SUNLinSolFree) is optional.
SUNLinSolGetType

Call type = SUNLinSolGetType(LS);

Description The required function SUNLinSolGetType returns the type identifier for the linear solver LS. It is used to determine the solver type (direct, iterative, or matrix-iterative) from the abstract SUNLinearSolver interface.

Arguments LS (SUNLinearSolver) a sunlinsol object.

Return value The return value type (of type int) will be one of the following:

• SUNLINEARSOLVER DIRECT – 0, the sunlinsol module requires a matrix, and computes an ‘exact’ solution to the linear system defined by that matrix.

• SUNLINEARSOLVER ITERATIVE – 1, the sunlinsol module does not require a matrix (though one may be provided), and computes an inexact solution to the linear system using a matrix-free iterative algorithm. That is, it solves the linear system defined by the package-supplied ATimes routine (see SUNLinSolSetATimes below), even if that linear system differs from the one encoded in the matrix object (if one is provided). As the solver computes the solution only inexactly (or may diverge), the linear solver should check for solution convergence/accuracy as appropriate.

• SUNLINEARSOLVER MATRIX ITERATIVE – 2, the sunlinsol module requires a matrix, and computes an inexact solution to the linear system defined by that matrix using an iterative algorithm. That is, it solves the linear system defined by the matrix object even if that linear system differs from that encoded by the package-supplied ATimes routine. As the solver computes the solution only inexactly (or may diverge), the linear solver should check for solution convergence/accuracy as appropriate.

Notes See section 9.3.1 for more information on intended use cases corresponding to the linear solver type.

SUNLinSolInitialize

Call retval = SUNLinSolInitialize(LS);

Description The required function SUNLinSolInitialize performs linear solver initialization (assuming that all solver-specific options have been set).
Arguments LS (SUNLinearSolver) a sunlinsol object.

Return value This should return zero for a successful call, and a negative value for a failure, ideally returning one of the generic error codes listed in Table 9.1.

SUNLinSolSetup

Call retval = SUNLinSolSetup(LS, A);

Description The required function SUNLinSolSetup performs any linear solver setup needed, based on an updated system sunmatrix A. This may be called frequently (e.g., with a full Newton method) or infrequently (for a modified Newton method), based on the type of integrator and/or nonlinear solver requesting the solves.

Arguments LS (SUNLinearSolver) a sunlinsol object. A (SUNMatrix) a sunmatrix object.

Return value This should return zero for a successful call, a positive value for a recoverable failure and a negative value for an unrecoverable failure, ideally returning one of the generic error codes listed in Table 9.1.

SUNLinSolSolve

Call retval = SUNLinSolSolve(LS, A, x, b, tol);

Description The required function SUNLinSolSolve solves a linear system Ax = b.

Arguments LS (SUNLinearSolver) a sunlinsol object. A (SUNMatrix) a sunmatrix object. x (N Vector) a nvector object containing the initial guess for the solution of the linear system, and the solution to the linear system upon return. b (N Vector) a nvector object containing the linear system right-hand side. tol (realtype) the desired linear solver tolerance.

Return value This should return zero for a successful call, a positive value for a recoverable failure and a negative value for an unrecoverable failure, ideally returning one of the generic error codes listed in Table 9.1.

Notes Direct solvers: can ignore the tol argument. Matrix-free solvers: (those that identify as SUNLINEARSOLVER ITERATIVE) can ignore the sunmatrix input A, and should instead rely on the matrix-vector product function supplied through the routine SUNLinSolSetATimes.
Iterative solvers: (those that identify as SUNLINEARSOLVER ITERATIVE or SUNLINEARSOLVER MATRIX ITERATIVE) should attempt to solve to the specified tolerance tol in a weighted 2-norm. If the solver does not support scaling then it should just use a 2-norm. SUNLinSolFree Call retval = SUNLinSolFree(LS); Description The optional function SUNLinSolFree frees memory allocated by the linear solver. Arguments LS (SUNLinearSolver) a sunlinsol object. Return value This should return zero for a successful call and a negative value for a failure. 9.1.2 SUNLinearSolver set functions The following set functions are used to supply linear solver modules with functions defined by the sundials packages and to modify solver parameters. Only the routine for setting the matrix-vector product routine is required, and that is only for matrix-free linear solver modules. Otherwise, all other set functions are optional. sunlinsol implementations that do not provide the functionality for any optional routine should leave the corresponding function pointer NULL instead of supplying a dummy routine. SUNLinSolSetATimes Call retval = SUNLinSolSetATimes(LS, A data, ATimes); Description The function SUNLinSolSetATimes is required for matrix-free linear solvers; otherwise it is optional. This routine provides an ATimesFn function pointer, as well as a void* pointer to a data structure used by this routine, to a linear solver object. sundials packages will call this function to set the matrix-vector product function to either a solver-provided difference-quotient via vector operations or a user-supplied solver-specific routine. Arguments LS (SUNLinearSolver) a sunlinsol object. A data (void*) data structure passed to ATimes. ATimes (ATimesFn) function pointer implementing the matrix-vector product routine. Return value This routine should return zero for a successful call, and a negative value for a failure, ideally returning one of the generic error codes listed in Table 9.1. 
SUNLinSolSetPreconditioner

Call retval = SUNLinSolSetPreconditioner(LS, Pdata, Pset, Psol);

Description The optional function SUNLinSolSetPreconditioner provides PSetupFn and PSolveFn function pointers that implement the preconditioner solves with P1^{-1} and P2^{-1} from equations (9.1)-(9.2). This routine will be called by a sundials package, which will provide translation between the generic Pset and Psol calls and the package- or user-supplied routines.

Arguments LS (SUNLinearSolver) a sunlinsol object. Pdata (void*) data structure passed to both Pset and Psol. Pset (PSetupFn) function pointer implementing the preconditioner setup. Psol (PSolveFn) function pointer implementing the preconditioner solve.

Return value This routine should return zero for a successful call, and a negative value for a failure, ideally returning one of the generic error codes listed in Table 9.1.

SUNLinSolSetScalingVectors

Call retval = SUNLinSolSetScalingVectors(LS, s1, s2);

Description The optional function SUNLinSolSetScalingVectors provides left/right scaling vectors for the linear system solve. Here, s1 and s2 are nvectors of positive scale factors containing the diagonals of the matrices S1 and S2 from equations (9.1)-(9.2), respectively. Neither of these vectors needs to be tested for positivity, and a NULL argument for either indicates that the corresponding scaling matrix is the identity.

Arguments LS (SUNLinearSolver) a sunlinsol object. s1 (N Vector) diagonal of the matrix S1. s2 (N Vector) diagonal of the matrix S2.

Return value This routine should return zero for a successful call, and a negative value for a failure, ideally returning one of the generic error codes listed in Table 9.1.

9.1.3 SUNLinearSolver get functions

The following get functions allow sundials packages to retrieve results from a linear solve. All routines are optional.
SUNLinSolNumIters

Call its = SUNLinSolNumIters(LS);

Description The optional function SUNLinSolNumIters should return the number of linear iterations performed in the last ‘solve’ call.

Arguments LS (SUNLinearSolver) a sunlinsol object.

Return value int containing the number of iterations

SUNLinSolResNorm

Call rnorm = SUNLinSolResNorm(LS);

Description The optional function SUNLinSolResNorm should return the final residual norm from the last ‘solve’ call.

Arguments LS (SUNLinearSolver) a sunlinsol object.

Return value realtype containing the final residual norm

SUNLinSolResid

Call rvec = SUNLinSolResid(LS);

Description If an iterative method computes the preconditioned initial residual and returns with a successful solve without performing any iterations (i.e., either the initial guess or the preconditioner is sufficiently accurate), then this optional routine may be called by the sundials package. This routine should return the nvector containing the preconditioned initial residual vector.

Arguments LS (SUNLinearSolver) a sunlinsol object.

Return value N Vector containing the final residual vector

Notes Since N Vector is actually a pointer, and the results are not modified, this routine should not require additional memory allocation. If the sunlinsol object does not retain a vector for this purpose, then this function pointer should be set to NULL in the implementation.

SUNLinSolLastFlag

Call lflag = SUNLinSolLastFlag(LS);

Description The optional function SUNLinSolLastFlag should return the last error flag encountered within the linear solver. This is not called by the sundials packages directly; it allows the user to investigate linear solver issues after a failed solve.

Arguments LS (SUNLinearSolver) a sunlinsol object.
Return value long int containing the most recent error flag

SUNLinSolSpace

Call retval = SUNLinSolSpace(LS, &lrw, &liw);

Description The optional function SUNLinSolSpace should return the storage requirements for the linear solver LS.

Arguments LS (SUNLinearSolver) a sunlinsol object. lrw (long int*) the number of realtype words stored by the linear solver. liw (long int*) the number of integer words stored by the linear solver.

Return value This should return zero for a successful call, and a negative value for a failure, ideally returning one of the generic error codes listed in Table 9.1.

Notes This function is advisory only, for use in determining a user’s total space requirements.

9.1.4 Functions provided by sundials packages

To interface with the sunlinsol modules, the sundials packages supply a variety of routines for evaluating the matrix-vector product, and setting up and applying the preconditioner. These package-provided routines translate between the user-supplied ODE, DAE, or nonlinear systems and the generic interfaces to the linear systems of equations that result in their solution. The types for functions provided to a sunlinsol module are defined in the header file sundials/sundials iterative.h, and are described below.

ATimesFn

Definition typedef int (*ATimesFn)(void *A data, N Vector v, N Vector z);

Purpose These functions compute the action of a matrix on a vector, performing the operation z = Av. Memory for z should already be allocated prior to calling this function. The vector v should be left unchanged.

Arguments A data is a pointer to client data, the same as that supplied to SUNLinSolSetATimes. v is the input vector to multiply. z is the output vector computed.

Return value This routine should return 0 if successful and a non-zero value if unsuccessful.
PSetupFn

Definition   typedef int (*PSetupFn)(void *P_data);
Purpose      These functions set up any requisite problem data in preparation for calls to the corresponding PSolveFn.
Arguments    P_data is a pointer to client data, the same pointer as that supplied to the routine SUNLinSolSetPreconditioner.
Return value This routine should return 0 if successful and a non-zero value if unsuccessful.

PSolveFn

Definition   typedef int (*PSolveFn)(void *P_data, N_Vector r, N_Vector z, realtype tol, int lr);
Purpose      These functions solve the preconditioner equation Pz = r for the vector z. Memory for z should already be allocated prior to calling this function. The parameter P_data is a pointer to any information about P which the function needs in order to do its job (set up by the corresponding PSetupFn). The parameter lr is input, and indicates whether P is to be taken as the left preconditioner or the right preconditioner: lr = 1 for left and lr = 2 for right. If preconditioning is on one side only, lr can be ignored. If the preconditioner is iterative, then it should strive to solve the preconditioner equation so that

    ‖Pz − r‖_wrms < tol,

where the weight vector for the WRMS norm may be accessed from the main package memory structure. The vector r should not be modified by the PSolveFn.
Arguments    P_data is a pointer to client data, the same pointer as that supplied to the routine SUNLinSolSetPreconditioner.
             r is the right-hand side vector for the preconditioner system.
             z is the solution vector for the preconditioner system.
             tol is the desired tolerance for an iterative preconditioner.
             lr is a flag indicating whether the routine should perform left (1) or right (2) preconditioning.
Return value This routine should return 0 if successful and a non-zero value if unsuccessful.
On a failure, a negative return value indicates an unrecoverable condition, while a positive value indicates a recoverable one, in which case the calling routine may reattempt the solution after updating preconditioner data.

9.1.5 SUNLinearSolver return codes

The functions provided to sunlinsol modules by each sundials package, and the functions within the sundials-provided sunlinsol implementations, utilize a common set of return codes, shown in Table 9.1. These adhere to a common pattern: 0 indicates success, a positive value corresponds to a recoverable failure, and a negative value indicates a non-recoverable failure. Aside from this pattern, the actual values of each error code exist primarily to provide additional information to the user in case of a linear solver failure.

Table 9.1: Description of the SUNLinearSolver error codes

Name                       Value  Description
SUNLS_SUCCESS                  0  successful call or converged solve
SUNLS_MEM_NULL                -1  the memory argument to the function is NULL
SUNLS_ILL_INPUT               -2  an illegal input has been provided to the function
SUNLS_MEM_FAIL                -3  failed memory access or allocation
SUNLS_ATIMES_FAIL_UNREC       -4  an unrecoverable failure occurred in the ATimes routine
SUNLS_PSET_FAIL_UNREC         -5  an unrecoverable failure occurred in the Pset routine
SUNLS_PSOLVE_FAIL_UNREC       -6  an unrecoverable failure occurred in the Psolve routine
SUNLS_PACKAGE_FAIL_UNREC      -7  an unrecoverable failure occurred in an external linear solver package
SUNLS_GS_FAIL                 -8  a failure occurred during Gram-Schmidt orthogonalization (sunlinsol_spgmr/sunlinsol_spfgmr)
SUNLS_QRSOL_FAIL              -9  a singular R matrix was encountered in a QR factorization (sunlinsol_spgmr/sunlinsol_spfgmr)
SUNLS_RES_REDUCED              1  an iterative solver reduced the residual, but did not converge to the desired tolerance
SUNLS_CONV_FAIL                2  an iterative solver did not converge (and the residual was not reduced)
SUNLS_ATIMES_FAIL_REC          3  a recoverable failure occurred in the ATimes routine
SUNLS_PSET_FAIL_REC            4  a recoverable failure occurred in the Pset routine
SUNLS_PSOLVE_FAIL_REC          5  a recoverable failure occurred in the Psolve routine
SUNLS_PACKAGE_FAIL_REC         6  a recoverable failure occurred in an external linear solver package
SUNLS_QRFACT_FAIL              7  a singular matrix was encountered during a QR factorization (sunlinsol_spgmr/sunlinsol_spfgmr)
SUNLS_LUFACT_FAIL              8  a singular matrix was encountered during an LU factorization (sunlinsol_dense/sunlinsol_band)

9.1.6 The generic SUNLinearSolver module

sundials packages interact with specific sunlinsol implementations through the generic sunlinsol module, on which all other sunlinsol implementations are built. The SUNLinearSolver type is a pointer to a structure containing an implementation-dependent content field and an ops field. The type SUNLinearSolver is defined as

typedef struct _generic_SUNLinearSolver *SUNLinearSolver;

struct _generic_SUNLinearSolver {
  void *content;
  struct _generic_SUNLinearSolver_Ops *ops;
};

where the _generic_SUNLinearSolver_Ops structure is a list of pointers to the various actual linear solver operations provided by a specific implementation.
The _generic_SUNLinearSolver_Ops structure is defined as

struct _generic_SUNLinearSolver_Ops {
  SUNLinearSolver_Type (*gettype)(SUNLinearSolver);
  int                  (*setatimes)(SUNLinearSolver, void*, ATimesFn);
  int                  (*setpreconditioner)(SUNLinearSolver, void*, PSetupFn, PSolveFn);
  int                  (*setscalingvectors)(SUNLinearSolver, N_Vector, N_Vector);
  int                  (*initialize)(SUNLinearSolver);
  int                  (*setup)(SUNLinearSolver, SUNMatrix);
  int                  (*solve)(SUNLinearSolver, SUNMatrix, N_Vector, N_Vector, realtype);
  int                  (*numiters)(SUNLinearSolver);
  realtype             (*resnorm)(SUNLinearSolver);
  long int             (*lastflag)(SUNLinearSolver);
  int                  (*space)(SUNLinearSolver, long int*, long int*);
  N_Vector             (*resid)(SUNLinearSolver);
  int                  (*free)(SUNLinearSolver);
};

The generic sunlinsol module defines and implements the linear solver operations defined in Sections 9.1.1-9.1.3. These routines are in fact only wrappers to the linear solver operations defined by a particular sunlinsol implementation, which are accessed through the ops field of the SUNLinearSolver structure. To illustrate this point we show below the implementation of a typical linear solver operation from the generic sunlinsol module, namely SUNLinSolInitialize, which initializes a sunlinsol object for use after it has been created and configured, and returns a flag denoting a successful/failed operation:

int SUNLinSolInitialize(SUNLinearSolver S)
{
  return ((int) S->ops->initialize(S));
}

9.2 Compatibility of SUNLinearSolver modules

We note that not all sunlinsol types are compatible with all sunmatrix and nvector types provided with sundials. In Table 9.2 we show the matrix-based linear solvers available as sunlinsol modules, and the compatible matrix implementations. Recall that Table 4.1 shows the compatibility between all sunlinsol modules and vector implementations.

Table 9.2: sundials matrix-based linear solvers and matrix implementations that can be used for each.
Linear Solver      Dense   Banded  Sparse  User
Interface          Matrix  Matrix  Matrix  Supplied
Dense                X                        X
Band                         X                X
LapackDense          X                        X
LapackBand                   X                X
klu                                  X        X
superlumt                            X        X
User supplied        X       X       X        X

9.3 Implementing a custom SUNLinearSolver module

A particular implementation of the sunlinsol module must:

• Specify the content field of the SUNLinearSolver object.

• Define and implement a minimal subset of the linear solver operations. See Section 9.4 to determine which sunlinsol operations are required for this sundials package. Note that the names of these routines should be unique to that implementation in order to permit using more than one sunlinsol module (each with different SUNLinearSolver internal data representations) in the same code.

• Define and implement user-callable constructor and destructor routines to create and free a SUNLinearSolver with the new content field and with ops pointing to the new linear solver operations.

We note that the function pointers for all unsupported optional routines should be set to NULL in the ops structure. This allows the sundials package that is using the sunlinsol object to know that the associated functionality is not supported.

Additionally, a sunlinsol implementation may do the following:

• Define and implement additional user-callable "set" routines acting on the SUNLinearSolver, e.g., for setting various configuration options to tune the linear solver to a particular problem.

• Provide additional user-callable "get" routines acting on the SUNLinearSolver object, e.g., for returning various solve statistics.

9.3.1 Intended use cases

The sunlinsol (and sunmatrix) APIs are designed to require a minimal set of routines to ease interfacing with custom or third-party linear solver libraries. External solvers provide similar routines with the necessary functionality and thus will require minimal effort to wrap within custom sunmatrix and sunlinsol implementations.
Sections 8.1 and 9.4 include a list of the required set of routines that compatible sunmatrix and sunlinsol implementations must provide. As sundials packages utilize generic sunlinsol modules allowing for user-supplied SUNLinearSolver implementations, there exists a wide range of possible linear solver combinations. Some intended use cases for both the sundials-provided and user-supplied sunlinsol modules are discussed in the following sections.

Direct linear solvers

Direct linear solver modules require a matrix and compute an 'exact' solution to the linear system defined by the matrix. Multiple matrix formats and associated direct linear solvers are supplied with sundials through different sunmatrix and sunlinsol implementations. sundials packages strive to amortize the high cost of matrix construction by reusing matrix information for multiple nonlinear iterations. As a result, each package's linear solver interface recomputes Jacobian information as infrequently as possible. Alternative matrix storage formats and compatible linear solvers that are not currently provided by, or interfaced with, sundials can leverage this infrastructure with minimal effort. To do so, a user must implement custom sunmatrix and sunlinsol wrappers for the desired matrix format and/or linear solver following the APIs described in Chapters 8 and 9. This user-supplied sunlinsol module must then self-identify as having SUNLINEARSOLVER_DIRECT type.

Matrix-free iterative linear solvers

Matrix-free iterative linear solver modules do not require a matrix and compute an inexact solution to the linear system defined by the package-supplied ATimes routine. sundials supplies multiple scaled, preconditioned iterative linear solver (spils) sunlinsol modules that support scaling, allowing users to handle non-dimensionalization (as best as possible) within each sundials package and to retain variables and define equations as desired in their applications.
For linear solvers that do not support left/right scaling, the tolerance supplied to the linear solver is adjusted to compensate (see section 9.4.2 for more details); however, this use case may be non-optimal and cannot handle situations where the magnitudes of different solution components or equations vary dramatically within a single problem. To utilize alternative linear solvers that are not currently provided by, or interfaced with, sundials, a user must implement a custom sunlinsol wrapper for the linear solver following the API described in Chapter 9. This user-supplied sunlinsol module must then self-identify as having SUNLINEARSOLVER_ITERATIVE type.

Matrix-based iterative linear solvers (reusing A)

Matrix-based iterative linear solver modules require a matrix and compute an inexact solution to the linear system defined by the matrix. This matrix will be updated infrequently and reused across multiple solves to amortize the cost of matrix construction. As in the direct linear solver case, only wrappers for the matrix and linear solver in sunmatrix and sunlinsol implementations need to be created to utilize a new linear solver. This user-supplied sunlinsol module must then self-identify as having SUNLINEARSOLVER_MATRIX_ITERATIVE type.

At present, sundials has one example problem that uses this approach for wrapping a structured-grid matrix, linear solver, and preconditioner from the hypre library; it may be used as a template for other customized implementations (see examples/arkode/CXX_parhyp/ark_heat2D_hypre.cpp).

Matrix-based iterative linear solvers (current A)

For users who wish to utilize a matrix-based iterative linear solver module where the matrix is purely for preconditioning and the linear system is defined by the package-supplied ATimes routine, we envision two current possibilities.
The preferred approach is for users to employ one of the sundials spils sunlinsol implementations (sunlinsol_spgmr, sunlinsol_spfgmr, sunlinsol_spbcgs, sunlinsol_sptfqmr, or sunlinsol_pcg) as the outer solver. The creation and storage of the preconditioner matrix, and interfacing with the corresponding linear solver, can be handled through a package's preconditioner 'setup' and 'solve' functionality (see §4.5.8.2) without creating sunmatrix and sunlinsol implementations. This usage mode is recommended primarily because the sundials-provided spils modules support the scaling described above.

A second approach supported by the linear solver APIs is as follows. If the sunlinsol implementation is matrix-based, self-identifies as having SUNLINEARSOLVER_ITERATIVE type, and also provides a non-NULL SUNLinSolSetATimes routine, then each sundials package will call that routine to attach its package-specific matrix-vector product routine to the sunlinsol object. The sundials package will then call the sunlinsol-provided SUNLinSolSetup routine (infrequently) to update matrix information, but will provide current matrix-vector products to the sunlinsol implementation through the package-supplied ATimesFn routine.

9.4 IDAS SUNLinearSolver interface

Table 9.3 below lists the sunlinsol module linear solver functions used within the idals interface. As with the sunmatrix module, we emphasize that the idas user does not need to know detailed usage of linear solver functions by the idas code modules in order to use idas. The information is presented as an implementation detail for the interested reader.

The linear solver functions listed below are marked with X to indicate that they are required, or with † to indicate that they are only called if they are non-NULL in the sunlinsol implementation that is being used. Notes:

1. Although idals does not call SUNLinSolLastFlag directly, this routine is available for users to query linear solver issues directly.

2.
Although idals does not call SUNLinSolFree directly, this routine should be available for users to call when cleaning up from a simulation.

Table 9.3: List of linear solver function usage in the idals interface

                              DIRECT  ITERATIVE  MATRIX_ITERATIVE
SUNLinSolGetType                 X        X             X
SUNLinSolSetATimes               †        X             †
SUNLinSolSetPreconditioner       †        †             †
SUNLinSolSetScalingVectors       †        †             †
SUNLinSolInitialize              X        X             X
SUNLinSolSetup                   X        X             X
SUNLinSolSolve                   X        X             X
SUNLinSolNumIters                         †             †
SUNLinSolResid                            †             †
SUNLinSolLastFlag¹               †        †             †
SUNLinSolFree²                   X        X             X
SUNLinSolSpace                   †        †             †

Since there are a wide range of potential sunlinsol use cases, the following subsections describe some details of the idals interface, in the case that interested users wish to develop custom sunlinsol modules.

9.4.1 Lagged matrix information

If the sunlinsol object self-identifies as having type SUNLINEARSOLVER_DIRECT or SUNLINEARSOLVER_MATRIX_ITERATIVE, then the sunlinsol object solves a linear system defined by a sunmatrix object. idals will update the matrix information infrequently according to the strategies outlined in §2.1. When solving a linear system Jx̄ = b, it is likely that the value ᾱ used to construct J differs from the current value of α in the BDF method, since J is updated infrequently. Therefore, after calling the sunlinsol-provided SUNLinSolSolve routine, we test whether α/ᾱ ≠ 1, and if this is the case we scale the solution x̄ to obtain the desired linear system solution x via

    x = (2 / (1 + α/ᾱ)) x̄.                                          (9.3)

For values of α/ᾱ that are "close" to 1, this rescaling approximately solves the original linear system.

9.4.2 Iterative linear solver tolerance

If the sunlinsol object self-identifies as having type SUNLINEARSOLVER_ITERATIVE or SUNLINEARSOLVER_MATRIX_ITERATIVE, then idals will set the input tolerance delta as described in §2.1.
However, if the iterative linear solver does not support scaling matrices (i.e., the SUNLinSolSetScalingVectors routine is NULL), then idals will attempt to adjust the linear solver tolerance to account for this lack of functionality. To this end, the following assumptions are made:

1. All solution components have similar magnitude; hence the error weight vector W used in the WRMS norm (see §2.1) should satisfy the assumption

       W_i ≈ W_mean,  for i = 0, . . . , n − 1.

2. The sunlinsol object uses a standard 2-norm to measure convergence.

Since idas uses identical left and right scaling matrices, S₁ = S₂ = S = diag(W), the linear solver convergence requirement is converted as follows (using the notation from equations (9.1)-(9.2)):

    ‖b̃ − Ãx̃‖₂ < tol
    ⇔  ‖S P₁⁻¹ b − S P₁⁻¹ A x‖₂ < tol
    ⇔  Σ_{i=0}^{n−1} [ W_i (P₁⁻¹ (b − Ax))_i ]² < tol²
    ⇔  W_mean² Σ_{i=0}^{n−1} [ (P₁⁻¹ (b − Ax))_i ]² < tol²
    ⇔  Σ_{i=0}^{n−1} [ (P₁⁻¹ (b − Ax))_i ]² < (tol / W_mean)²
    ⇔  ‖P₁⁻¹ (b − Ax)‖₂ < tol / W_mean

Therefore the tolerance scaling factor

    W_mean = ‖W‖₂ / √n

is computed and the scaled tolerance delta = tol / W_mean is supplied to the sunlinsol object.

9.5 The SUNLinearSolver Dense implementation

This section describes the sunlinsol implementation for solving dense linear systems. The sunlinsol dense module is designed to be used with the corresponding sunmatrix dense matrix type, and one of the serial or shared-memory nvector implementations (nvector serial, nvector openmp, or nvector pthreads).

To access the sunlinsol dense module, include the header file sunlinsol/sunlinsol_dense.h. We note that the sunlinsol dense module is accessible from sundials packages without separately linking to the libsundials_sunlinsoldense module library.
9.5.1 SUNLinearSolver Dense description

This solver is constructed to perform the following operations:

• The "setup" call performs an LU factorization with partial (row) pivoting (O(N³) cost), PA = LU, where P is a permutation matrix, L is a lower triangular matrix with 1's on the diagonal, and U is an upper triangular matrix. This factorization is stored in-place on the input sunmatrix dense object A, with pivoting information encoding P stored in the pivots array.

• The "solve" call performs pivoting and forward and backward substitution using the stored pivots array and the LU factors held in the sunmatrix dense object (O(N²) cost).

9.5.2 SUNLinearSolver Dense functions

The sunlinsol dense module provides the following user-callable constructor for creating a SUNLinearSolver object.

SUNLinSol_Dense

Call         LS = SUNLinSol_Dense(y, A);
Description  The function SUNLinSol_Dense creates and allocates memory for a dense SUNLinearSolver object.
Arguments    y (N_Vector) a template for cloning vectors needed within the solver
             A (SUNMatrix) a sunmatrix dense matrix template for cloning matrices needed within the solver
Return value This returns a SUNLinearSolver object. If either A or y is incompatible then this routine will return NULL.
Notes        This routine will perform consistency checks to ensure that it is called with consistent nvector and sunmatrix implementations. These are currently limited to the sunmatrix dense matrix type and the nvector serial, nvector openmp, and nvector pthreads vector types. As additional compatible matrix and vector implementations are added to sundials, these will be included within this compatibility check.
Deprecated Name  For backward compatibility, the wrapper function SUNDenseLinearSolver with identical input and output arguments is also provided.
F2003 Name   This function is callable as FSUNLinSol_Dense when using the Fortran 2003 interface module.
The sunlinsol dense module defines implementations of all "direct" linear solver operations listed in Sections 9.1.1 – 9.1.3:

• SUNLinSolGetType_Dense
• SUNLinSolInitialize_Dense – this does nothing, since all consistency checks are performed at solver creation.
• SUNLinSolSetup_Dense – this performs the LU factorization.
• SUNLinSolSolve_Dense – this uses the LU factors and pivots array to perform the solve.
• SUNLinSolLastFlag_Dense
• SUNLinSolSpace_Dense – this only returns information for the storage within the solver object, i.e., storage for N, last_flag, and pivots.
• SUNLinSolFree_Dense

All of the listed operations are callable via the Fortran 2003 interface module by prepending an 'F' to the function name.

9.5.3 SUNLinearSolver Dense Fortran interfaces

The sunlinsol dense module provides a Fortran 2003 module as well as Fortran 77 style interface functions for use from Fortran applications.

FORTRAN 2003 interface module

The fsunlinsol_dense_mod Fortran module defines interfaces to all sunlinsol dense C functions using the intrinsic iso_c_binding module, which provides a standardized mechanism for interoperating with C. As noted in the C function descriptions above, the interface functions are named after the corresponding C function, but with a leading 'F'. For example, the function SUNLinSol_Dense is interfaced as FSUNLinSol_Dense.

The Fortran 2003 sunlinsol dense interface module can be accessed with the use statement, i.e. use fsunlinsol_dense_mod, and linking to the library libsundials_fsunlinsoldense_mod.lib in addition to the C library. For details on where the library and module file fsunlinsol_dense_mod.mod are installed see Appendix A. We note that the module is accessible from the Fortran 2003 sundials integrators without separately linking to the libsundials_fsunlinsoldense_mod library.
FORTRAN 77 interface functions

For solvers that include a Fortran 77 interface module, the sunlinsol dense module also includes a Fortran-callable function for creating a SUNLinearSolver object.

FSUNDENSELINSOLINIT

Call         FSUNDENSELINSOLINIT(code, ier)
Description  The function FSUNDENSELINSOLINIT can be called from Fortran programs to create a dense SUNLinearSolver object.
Arguments    code (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
Return value ier is a return completion flag equal to 0 for a successful return and -1 otherwise. See printed message for details in case of failure.
Notes        This routine must be called after both the nvector and sunmatrix objects have been initialized.

Additionally, when using arkode with a non-identity mass matrix, the sunlinsol dense module includes a Fortran-callable function for creating a SUNLinearSolver mass matrix solver object.

FSUNMASSDENSELINSOLINIT

Call         FSUNMASSDENSELINSOLINIT(ier)
Description  The function FSUNMASSDENSELINSOLINIT can be called from Fortran programs to create a dense SUNLinearSolver object for mass matrix linear systems.
Arguments    None
Return value ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See printed message for details in case of failure.
Notes        This routine must be called after both the nvector and sunmatrix mass-matrix objects have been initialized.

9.5.4 SUNLinearSolver Dense content

The sunlinsol dense module defines the content field of a SUNLinearSolver as the following structure:

struct _SUNLinearSolverContent_Dense {
  sunindextype N;
  sunindextype *pivots;
  long int last_flag;
};

These entries of the content field contain the following information:
N         - size of the linear system,
pivots    - index array for partial pivoting in LU factorization,
last_flag - last error return flag from internal function evaluations.
9.6 The SUNLinearSolver Band implementation

This section describes the sunlinsol implementation for solving banded linear systems. The sunlinsol band module is designed to be used with the corresponding sunmatrix band matrix type, and one of the serial or shared-memory nvector implementations (nvector serial, nvector openmp, or nvector pthreads).

To access the sunlinsol band module, include the header file sunlinsol/sunlinsol_band.h. We note that the sunlinsol band module is accessible from sundials packages without separately linking to the libsundials_sunlinsolband module library.

9.6.1 SUNLinearSolver Band description

This solver is constructed to perform the following operations:

• The "setup" call performs an LU factorization with partial (row) pivoting, PA = LU, where P is a permutation matrix, L is a lower triangular matrix with 1's on the diagonal, and U is an upper triangular matrix. This factorization is stored in-place on the input sunmatrix band object A, with pivoting information encoding P stored in the pivots array.

• The "solve" call performs pivoting and forward and backward substitution using the stored pivots array and the LU factors held in the sunmatrix band object.

• A must be allocated to accommodate the increase in upper bandwidth that occurs during factorization. More precisely, if A is a band matrix with upper bandwidth mu and lower bandwidth ml, then the upper triangular factor U can have upper bandwidth as big as smu = MIN(N-1, mu+ml). The lower triangular factor L has lower bandwidth ml.

9.6.2 SUNLinearSolver Band functions

The sunlinsol band module provides the following user-callable constructor for creating a SUNLinearSolver object.

SUNLinSol_Band

Call         LS = SUNLinSol_Band(y, A);
Description  The function SUNLinSol_Band creates and allocates memory for a band SUNLinearSolver object.
Arguments    y (N_Vector) a template for cloning vectors needed within the solver
             A (SUNMatrix) a sunmatrix band matrix template for cloning matrices needed within the solver
Return value This returns a SUNLinearSolver object. If either A or y is incompatible then this routine will return NULL.
Notes        This routine will perform consistency checks to ensure that it is called with consistent nvector and sunmatrix implementations. These are currently limited to the sunmatrix band matrix type and the nvector serial, nvector openmp, and nvector pthreads vector types. As additional compatible matrix and vector implementations are added to sundials, these will be included within this compatibility check. Additionally, this routine will verify that the input matrix A is allocated with appropriate upper bandwidth storage for the LU factorization.
Deprecated Name  For backward compatibility, the wrapper function SUNBandLinearSolver with identical input and output arguments is also provided.
F2003 Name   This function is callable as FSUNLinSol_Band when using the Fortran 2003 interface module.

The sunlinsol band module defines band implementations of all "direct" linear solver operations listed in Sections 9.1.1 – 9.1.3:

• SUNLinSolGetType_Band
• SUNLinSolInitialize_Band – this does nothing, since all consistency checks are performed at solver creation.
• SUNLinSolSetup_Band – this performs the LU factorization.
• SUNLinSolSolve_Band – this uses the LU factors and pivots array to perform the solve.
• SUNLinSolLastFlag_Band
• SUNLinSolSpace_Band – this only returns information for the storage within the solver object, i.e., storage for N, last_flag, and pivots.
• SUNLinSolFree_Band

All of the listed operations are callable via the Fortran 2003 interface module by prepending an 'F' to the function name.
9.6.3 SUNLinearSolver Band Fortran interfaces

The sunlinsol band module provides a Fortran 2003 module as well as Fortran 77 style interface functions for use from Fortran applications.

FORTRAN 2003 interface module

The fsunlinsol_band_mod Fortran module defines interfaces to all sunlinsol band C functions using the intrinsic iso_c_binding module, which provides a standardized mechanism for interoperating with C. As noted in the C function descriptions above, the interface functions are named after the corresponding C function, but with a leading 'F'. For example, the function SUNLinSol_Band is interfaced as FSUNLinSol_Band.

The Fortran 2003 sunlinsol band interface module can be accessed with the use statement, i.e. use fsunlinsol_band_mod, and linking to the library libsundials_fsunlinsolband_mod.lib in addition to the C library. For details on where the library and module file fsunlinsol_band_mod.mod are installed see Appendix A. We note that the module is accessible from the Fortran 2003 sundials integrators without separately linking to the libsundials_fsunlinsolband_mod library.

FORTRAN 77 interface functions

For solvers that include a Fortran 77 interface module, the sunlinsol band module also includes a Fortran-callable function for creating a SUNLinearSolver object.

FSUNBANDLINSOLINIT

Call         FSUNBANDLINSOLINIT(code, ier)
Description  The function FSUNBANDLINSOLINIT can be called from Fortran programs to create a band SUNLinearSolver object.
Arguments    code (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
Return value ier is a return completion flag equal to 0 for a successful return and -1 otherwise. See printed message for details in case of failure.
Notes        This routine must be called after both the nvector and sunmatrix objects have been initialized.
Additionally, when using arkode with a non-identity mass matrix, the sunlinsol band module includes a Fortran-callable function for creating a SUNLinearSolver mass matrix solver object.

FSUNMASSBANDLINSOLINIT

Call         FSUNMASSBANDLINSOLINIT(ier)
Description  The function FSUNMASSBANDLINSOLINIT can be called from Fortran programs to create a band SUNLinearSolver object for mass matrix linear systems.
Arguments    None
Return value ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See printed message for details in case of failure.
Notes        This routine must be called after both the nvector and sunmatrix mass-matrix objects have been initialized.

9.6.4 SUNLinearSolver Band content

The sunlinsol band module defines the content field of a SUNLinearSolver as the following structure:

struct _SUNLinearSolverContent_Band {
  sunindextype N;
  sunindextype *pivots;
  long int last_flag;
};

These entries of the content field contain the following information:
N         - size of the linear system,
pivots    - index array for partial pivoting in LU factorization,
last_flag - last error return flag from internal function evaluations.

9.7 The SUNLinearSolver LapackDense implementation

This section describes the sunlinsol implementation for solving dense linear systems with LAPACK. The sunlinsol lapackdense module is designed to be used with the corresponding sunmatrix dense matrix type, and one of the serial or shared-memory nvector implementations (nvector serial, nvector openmp, or nvector pthreads). To access the sunlinsol lapackdense module, include the header file sunlinsol/sunlinsol_lapackdense.h. The installed module library to link to is libsundials_sunlinsollapackdense.lib, where .lib is typically .so for shared libraries and .a for static libraries.
The sunlinsol lapackdense module is a sunlinsol wrapper for the LAPACK dense matrix factorization and solve routines, *GETRF and *GETRS, where * is either D or S, depending on whether sundials was configured to have realtype set to double or single, respectively (see Section 4.2). In order to use the sunlinsol lapackdense module it is assumed that LAPACK has been installed on the system prior to installation of sundials, and that sundials has been configured appropriately to link with LAPACK (see Appendix A for details). We note that since there do not exist 128-bit floating-point factorization and solve routines in LAPACK, this interface cannot be compiled when using extended precision for realtype. Similarly, since there do not exist 64-bit integer LAPACK routines, the sunlinsol lapackdense module also cannot be compiled when using 64-bit integers for the sunindextype.

9.7.1 SUNLinearSolver LapackDense description

This solver is constructed to perform the following operations:

• The "setup" call performs an LU factorization with partial (row) pivoting (O(N³) cost), PA = LU, where P is a permutation matrix, L is a lower triangular matrix with 1's on the diagonal, and U is an upper triangular matrix. This factorization is stored in-place on the input sunmatrix dense object A, with pivoting information encoding P stored in the pivots array.

• The "solve" call performs pivoting and forward and backward substitution using the stored pivots array and the LU factors held in the sunmatrix dense object (O(N²) cost).

9.7.2 SUNLinearSolver LapackDense functions

The sunlinsol lapackdense module provides the following user-callable constructor for creating a SUNLinearSolver object.

SUNLinSol_LapackDense

Call         LS = SUNLinSol_LapackDense(y, A);
Description  The function SUNLinSol_LapackDense creates and allocates memory for a LAPACK-based, dense SUNLinearSolver object.
Arguments    y (N_Vector) a template for cloning vectors needed within the solver
             A (SUNMatrix) a sunmatrix dense matrix template for cloning matrices needed within the solver
Return value This returns a SUNLinearSolver object. If either A or y is incompatible, then this routine will return NULL.
Notes        This routine will perform consistency checks to ensure that it is called with consistent nvector and sunmatrix implementations. These are currently limited to the sunmatrix dense matrix type and the nvector serial, nvector openmp, and nvector pthreads vector types. As additional compatible matrix and vector implementations are added to sundials, these will be included within this compatibility check.
Deprecated Name  For backward compatibility, the wrapper function SUNLapackDense with identical input and output arguments is also provided.

The sunlinsol lapackdense module defines dense implementations of all “direct” linear solver operations listed in Sections 9.1.1 – 9.1.3:

• SUNLinSolGetType_LapackDense
• SUNLinSolInitialize_LapackDense – this does nothing, since all consistency checks are performed at solver creation.
• SUNLinSolSetup_LapackDense – this calls either DGETRF or SGETRF to perform the LU factorization.
• SUNLinSolSolve_LapackDense – this calls either DGETRS or SGETRS to use the LU factors and pivots array to perform the solve.
• SUNLinSolLastFlag_LapackDense
• SUNLinSolSpace_LapackDense – this only returns information for the storage within the solver object, i.e., storage for N, last_flag, and pivots.
• SUNLinSolFree_LapackDense

9.7.3 SUNLinearSolver LapackDense Fortran interfaces

For solvers that include a Fortran 77 interface module, the sunlinsol lapackdense module also includes a Fortran-callable function for creating a SUNLinearSolver object.
FSUNLAPACKDENSEINIT

Call         FSUNLAPACKDENSEINIT(code, ier)
Description  The function FSUNLAPACKDENSEINIT can be called from Fortran programs to create a LAPACK-based dense SUNLinearSolver object.
Arguments    code (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
Return value ier is a return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.
Notes        This routine must be called after both the nvector and sunmatrix objects have been initialized.

Additionally, when using arkode with a non-identity mass matrix, the sunlinsol lapackdense module includes a Fortran-callable function for creating a SUNLinearSolver mass matrix solver object.

FSUNMASSLAPACKDENSEINIT

Call         FSUNMASSLAPACKDENSEINIT(ier)
Description  The function FSUNMASSLAPACKDENSEINIT can be called from Fortran programs to create a LAPACK-based, dense SUNLinearSolver object for mass matrix linear systems.
Arguments    None
Return value ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.
Notes        This routine must be called after both the nvector and sunmatrix mass-matrix objects have been initialized.

9.7.4 SUNLinearSolver LapackDense content

The sunlinsol lapackdense module defines the content field of a SUNLinearSolver as the following structure:

struct _SUNLinearSolverContent_Dense {
  sunindextype N;
  sunindextype *pivots;
  long int last_flag;
};

These entries of the content field contain the following information:
N         - size of the linear system,
pivots    - index array for partial pivoting in LU factorization,
last_flag - last error return flag from internal function evaluations.

9.8 The SUNLinearSolver LapackBand implementation

This section describes the sunlinsol implementation for solving banded linear systems with LAPACK.
The sunlinsol lapackband module is designed to be used with the corresponding sunmatrix band matrix type, and one of the serial or shared-memory nvector implementations (nvector serial, nvector openmp, or nvector pthreads).

To access the sunlinsol lapackband module, include the header file sunlinsol/sunlinsol_lapackband.h. The installed module library to link to is libsundials_sunlinsollapackband.lib, where .lib is typically .so for shared libraries and .a for static libraries.

The sunlinsol lapackband module is a sunlinsol wrapper for the LAPACK band matrix factorization and solve routines, *GBTRF and *GBTRS, where * is either D or S, depending on whether sundials was configured to have realtype set to double or single, respectively (see Section 4.2). In order to use the sunlinsol lapackband module it is assumed that LAPACK has been installed on the system prior to installation of sundials, and that sundials has been configured appropriately to link with LAPACK (see Appendix A for details). We note that since LAPACK does not provide 128-bit floating-point factorization and solve routines, this interface cannot be compiled when using extended precision for realtype. Similarly, since LAPACK does not provide 64-bit integer routines, the sunlinsol lapackband module also cannot be compiled when using 64-bit integers for sunindextype.

9.8.1 SUNLinearSolver LapackBand description

This solver is constructed to perform the following operations:

• The “setup” call performs an LU factorization with partial (row) pivoting, PA = LU, where P is a permutation matrix, L is a lower triangular matrix with 1’s on the diagonal, and U is an upper triangular matrix. This factorization is stored in place in the input sunmatrix band object A, with the pivoting information encoding P stored in the pivots array.
• The “solve” call performs pivoting and forward and backward substitution using the stored pivots array and the LU factors held in the sunmatrix band object.

• A must be allocated to accommodate the increase in upper bandwidth that occurs during factorization. More precisely, if A is a band matrix with upper bandwidth mu and lower bandwidth ml, then the upper triangular factor U can have upper bandwidth as large as smu = MIN(N-1, mu+ml). The lower triangular factor L has lower bandwidth ml.

9.8.2 SUNLinearSolver LapackBand functions

The sunlinsol lapackband module provides the following user-callable constructor for creating a SUNLinearSolver object.

SUNLinSol_LapackBand

Call         LS = SUNLinSol_LapackBand(y, A);
Description  The function SUNLinSol_LapackBand creates and allocates memory for a LAPACK-based, band SUNLinearSolver object.
Arguments    y (N_Vector) a template for cloning vectors needed within the solver
             A (SUNMatrix) a sunmatrix band matrix template for cloning matrices needed within the solver
Return value This returns a SUNLinearSolver object. If either A or y is incompatible, then this routine will return NULL.
Notes        This routine will perform consistency checks to ensure that it is called with consistent nvector and sunmatrix implementations. These are currently limited to the sunmatrix band matrix type and the nvector serial, nvector openmp, and nvector pthreads vector types. As additional compatible matrix and vector implementations are added to sundials, these will be included within this compatibility check. Additionally, this routine will verify that the input matrix A is allocated with appropriate upper bandwidth storage for the LU factorization.
Deprecated Name  For backward compatibility, the wrapper function SUNLapackBand with identical input and output arguments is also provided.
The sunlinsol lapackband module defines band implementations of all “direct” linear solver operations listed in Sections 9.1.1 – 9.1.3:

• SUNLinSolGetType_LapackBand
• SUNLinSolInitialize_LapackBand – this does nothing, since all consistency checks are performed at solver creation.
• SUNLinSolSetup_LapackBand – this calls either DGBTRF or SGBTRF to perform the LU factorization.
• SUNLinSolSolve_LapackBand – this calls either DGBTRS or SGBTRS to use the LU factors and pivots array to perform the solve.
• SUNLinSolLastFlag_LapackBand
• SUNLinSolSpace_LapackBand – this only returns information for the storage within the solver object, i.e., storage for N, last_flag, and pivots.
• SUNLinSolFree_LapackBand

9.8.3 SUNLinearSolver LapackBand Fortran interfaces

For solvers that include a Fortran 77 interface module, the sunlinsol lapackband module also includes a Fortran-callable function for creating a SUNLinearSolver object.

FSUNLAPACKBANDINIT

Call         FSUNLAPACKBANDINIT(code, ier)
Description  The function FSUNLAPACKBANDINIT can be called from Fortran programs to create a LAPACK-based band SUNLinearSolver object.
Arguments    code (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
Return value ier is a return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.
Notes        This routine must be called after both the nvector and sunmatrix objects have been initialized.

Additionally, when using arkode with a non-identity mass matrix, the sunlinsol lapackband module includes a Fortran-callable function for creating a SUNLinearSolver mass matrix solver object.

FSUNMASSLAPACKBANDINIT

Call         FSUNMASSLAPACKBANDINIT(ier)
Description  The function FSUNMASSLAPACKBANDINIT can be called from Fortran programs to create a LAPACK-based, band SUNLinearSolver object for mass matrix linear systems.
Arguments    None
Return value ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.
Notes        This routine must be called after both the nvector and sunmatrix mass-matrix objects have been initialized.

9.8.4 SUNLinearSolver LapackBand content

The sunlinsol lapackband module defines the content field of a SUNLinearSolver as the following structure:

struct _SUNLinearSolverContent_Band {
  sunindextype N;
  sunindextype *pivots;
  long int last_flag;
};

These entries of the content field contain the following information:
N         - size of the linear system,
pivots    - index array for partial pivoting in LU factorization,
last_flag - last error return flag from internal function evaluations.

9.9 The SUNLinearSolver KLU implementation

This section describes the sunlinsol implementation for solving sparse linear systems with KLU. The sunlinsol klu module is designed to be used with the corresponding sunmatrix sparse matrix type, and one of the serial or shared-memory nvector implementations (nvector serial, nvector openmp, or nvector pthreads).

The header file to include when using this module is sunlinsol/sunlinsol_klu.h. The installed module library to link to is libsundials_sunlinsolklu.lib, where .lib is typically .so for shared libraries and .a for static libraries.

The sunlinsol klu module is a sunlinsol wrapper for the klu sparse matrix factorization and solver library written by Tim Davis [1, 17]. In order to use the sunlinsol klu interface to klu, it is assumed that klu has been installed on the system prior to installation of sundials, and that sundials has been configured appropriately to link with klu (see Appendix A for details). Additionally, this wrapper only supports double-precision calculations, and therefore cannot be compiled if sundials is configured to have realtype set to either extended or single (see Section 4.2).
Since the klu library supports both 32-bit and 64-bit integers, this interface will be compiled for either of the available sunindextype options.

9.9.1 SUNLinearSolver KLU description

The klu library has a symbolic factorization routine that computes the permutation of the linear system matrix to block triangular form and the permutations that will pre-order the diagonal blocks (the only ones that need to be factored) to reduce fill-in (using AMD, COLAMD, CHOLAMD, natural, or an ordering given by the user). Of these ordering choices, the default value in the sunlinsol klu module is the COLAMD ordering.

klu breaks the factorization into two separate parts. The first is a symbolic factorization and the second is a numeric factorization that returns the factored matrix along with final pivot information. klu also has a refactor routine that can be called instead of the numeric factorization. This routine will reuse the pivot information. This routine also returns diagnostic information that a user can examine to determine if numerical stability is being lost and a full numerical factorization should be done instead of the refactor.

Since the linear systems that arise within the context of sundials calculations will typically have identical sparsity patterns, the sunlinsol klu module is constructed to perform the following operations:

• The first time that the “setup” routine is called, it performs the symbolic factorization, followed by an initial numerical factorization.

• On subsequent calls to the “setup” routine, it calls the appropriate klu “refactor” routine, followed by estimates of the numerical conditioning using the relevant “rcond”, and if necessary “condest”, routine(s). If these estimates of the condition number are larger than ε^(-2/3) (where ε is the double-precision unit roundoff), then a new factorization is performed.
• The module includes the routine SUNKLUReInit, which can be called by the user to force a full or partial refactorization at the next “setup” call.

• The “solve” call performs pivoting and forward and backward substitution using the stored klu data structures. We note that in this solve klu operates on the native data arrays for the right-hand side and solution vectors, without requiring costly data copies.

9.9.2 SUNLinearSolver KLU functions

The sunlinsol klu module provides the following user-callable constructor for creating a SUNLinearSolver object.

SUNLinSol_KLU

Call         LS = SUNLinSol_KLU(y, A);
Description  The function SUNLinSol_KLU creates and allocates memory for a KLU-based SUNLinearSolver object.
Arguments    y (N_Vector) a template for cloning vectors needed within the solver
             A (SUNMatrix) a sunmatrix sparse matrix template for cloning matrices needed within the solver
Return value This returns a SUNLinearSolver object. If either A or y is incompatible, then this routine will return NULL.
Notes        This routine will perform consistency checks to ensure that it is called with consistent nvector and sunmatrix implementations. These are currently limited to the sunmatrix sparse matrix type (using either CSR or CSC storage formats) and the nvector serial, nvector openmp, and nvector pthreads vector types. As additional compatible matrix and vector implementations are added to sundials, these will be included within this compatibility check.
Deprecated Name  For backward compatibility, the wrapper function SUNKLU with identical input and output arguments is also provided.
F2003 Name   This function is callable as FSUNLinSol_KLU when using the Fortran 2003 interface module.

The sunlinsol klu module defines implementations of all “direct” linear solver operations listed in Sections 9.1.1 – 9.1.3:

• SUNLinSolGetType_KLU
• SUNLinSolInitialize_KLU – this sets the first_factorize flag to 1, forcing both symbolic and numerical factorizations on the subsequent “setup” call.
• SUNLinSolSetup_KLU – this performs either an LU factorization or refactorization of the input matrix.
• SUNLinSolSolve_KLU – this calls the appropriate klu solve routine to utilize the LU factors to solve the linear system.
• SUNLinSolLastFlag_KLU
• SUNLinSolSpace_KLU – this only returns information for the storage within the solver interface, i.e., storage for the integers last_flag and first_factorize. For additional space requirements, see the klu documentation.
• SUNLinSolFree_KLU

All of the listed operations are callable via the Fortran 2003 interface module by prepending an ‘F’ to the function name.

The sunlinsol klu module also defines the following additional user-callable functions.

SUNLinSol_KLUReInit

Call         retval = SUNLinSol_KLUReInit(LS, A, nnz, reinit_type);
Description  The function SUNLinSol_KLUReInit reinitializes memory and flags for a new factorization (symbolic and numeric) to be conducted at the next solver setup call. This routine is useful in cases where the number of nonzeros has changed, or where the structure of the linear system has changed, which would require a new symbolic (and numeric) factorization.
Arguments    LS (SUNLinearSolver) the sunlinsol klu object to reinitialize
             A (SUNMatrix) a sunmatrix sparse matrix template for cloning matrices needed within the solver
             nnz (sunindextype) the new number of nonzeros in the matrix
             reinit_type (int) flag governing the level of reinitialization. The allowed values are:
             • SUNKLU_REINIT_FULL – The Jacobian matrix will be destroyed and a new one will be allocated based on the nnz value passed to this call. New symbolic and numeric factorizations will be completed at the next solver setup.
             • SUNKLU_REINIT_PARTIAL – Only symbolic and numeric factorizations will be completed.
               It is assumed that the Jacobian size has not exceeded the size of nnz given in the sparse matrix provided to the original constructor routine (or the previous SUNLinSol_KLUReInit call).
Return value The return values from this function are SUNLS_MEM_NULL (either LS or A is NULL), SUNLS_ILL_INPUT (A does not have type SUNMATRIX_SPARSE or reinit_type is invalid), SUNLS_MEM_FAIL (reallocation of the sparse matrix failed), or SUNLS_SUCCESS.
Notes        This routine will perform consistency checks to ensure that it is called with consistent nvector and sunmatrix implementations. These are currently limited to the sunmatrix sparse matrix type (using either CSR or CSC storage formats) and the nvector serial, nvector openmp, and nvector pthreads vector types. As additional compatible matrix and vector implementations are added to sundials, these will be included within this compatibility check. This routine assumes no other changes to solver use are necessary.
Deprecated Name  For backward compatibility, the wrapper function SUNKLUReInit with identical input and output arguments is also provided.
F2003 Name   This function is callable as FSUNLinSol_KLUReInit when using the Fortran 2003 interface module.

SUNLinSol_KLUSetOrdering

Call         retval = SUNLinSol_KLUSetOrdering(LS, ordering);
Description  This function sets the ordering used by klu for reducing fill in the linear solve.
Arguments    LS (SUNLinearSolver) the sunlinsol klu object
             ordering (int) flag indicating the reordering algorithm to use; the options are:
             0  AMD,
             1  COLAMD, and
             2  the natural ordering.
             The default is 1 for COLAMD.
Return value The return values from this function are SUNLS_MEM_NULL (LS is NULL), SUNLS_ILL_INPUT (invalid ordering choice), or SUNLS_SUCCESS.
Deprecated Name  For backward compatibility, the wrapper function SUNKLUSetOrdering with identical input and output arguments is also provided.
F2003 Name   This function is callable as FSUNLinSol_KLUSetOrdering when using the Fortran 2003 interface module.

9.9.3 SUNLinearSolver KLU Fortran interfaces

The sunlinsol klu module provides a Fortran 2003 module as well as Fortran 77 style interface functions for use from Fortran applications.

FORTRAN 2003 interface module

The fsunlinsol_klu_mod Fortran module defines interfaces to all sunlinsol klu C functions using the intrinsic iso_c_binding module, which provides a standardized mechanism for interoperating with C. As noted in the C function descriptions above, the interface functions are named after the corresponding C function, but with a leading ‘F’. For example, the function SUNLinSol_KLU is interfaced as FSUNLinSol_KLU.

The Fortran 2003 sunlinsol klu interface module can be accessed with the use statement, i.e. use fsunlinsol_klu_mod, and by linking to the library libsundials_fsunlinsolklu_mod.lib in addition to the C library. For details on where the library and module file fsunlinsol_klu_mod.mod are installed, see Appendix A.

FORTRAN 77 interface functions

For solvers that include a Fortran 77 interface module, the sunlinsol klu module also includes a Fortran-callable function for creating a SUNLinearSolver object.

FSUNKLUINIT

Call         FSUNKLUINIT(code, ier)
Description  The function FSUNKLUINIT can be called from Fortran programs to create a sunlinsol klu object.
Arguments    code (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
Return value ier is a return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.
Notes        This routine must be called after both the nvector and sunmatrix objects have been initialized.

Additionally, when using arkode with a non-identity mass matrix, the sunlinsol klu module includes a Fortran-callable function for creating a SUNLinearSolver mass matrix solver object.
FSUNMASSKLUINIT

Call         FSUNMASSKLUINIT(ier)
Description  The function FSUNMASSKLUINIT can be called from Fortran programs to create a KLU-based SUNLinearSolver object for mass matrix linear systems.
Arguments    None
Return value ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.
Notes        This routine must be called after both the nvector and sunmatrix mass-matrix objects have been initialized.

The SUNLinSol_KLUReInit and SUNLinSol_KLUSetOrdering routines also support Fortran interfaces for the system and mass matrix solvers:

FSUNKLUREINIT

Call         FSUNKLUREINIT(code, nnz, reinit_type, ier)
Description  The function FSUNKLUREINIT can be called from Fortran programs to re-initialize a sunlinsol klu object.
Arguments    code (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
             nnz (sunindextype*) the new number of nonzeros in the matrix
             reinit_type (int*) flag governing the level of reinitialization. The allowed values are:
             1 – The Jacobian matrix will be destroyed and a new one will be allocated based on the nnz value passed to this call. New symbolic and numeric factorizations will be completed at the next solver setup.
             2 – Only symbolic and numeric factorizations will be completed. It is assumed that the Jacobian size has not exceeded the size of nnz given in the sparse matrix provided to the original constructor routine (or the previous SUNLinSol_KLUReInit call).
Return value ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.
Notes        See SUNLinSol_KLUReInit for complete further documentation of this routine.
FSUNMASSKLUREINIT

Call         FSUNMASSKLUREINIT(nnz, reinit_type, ier)
Description  The function FSUNMASSKLUREINIT can be called from Fortran programs to re-initialize a sunlinsol klu object for mass matrix linear systems.
Arguments    The arguments are identical to FSUNKLUREINIT above, except that code is not needed since mass matrix linear systems only arise in arkode.
Return value ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.
Notes        See SUNLinSol_KLUReInit for complete further documentation of this routine.

FSUNKLUSETORDERING

Call         FSUNKLUSETORDERING(code, ordering, ier)
Description  The function FSUNKLUSETORDERING can be called from Fortran programs to change the reordering algorithm used by klu.
Arguments    code (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
             ordering (int*) flag indicating the reordering algorithm to use. Options include:
             0  AMD,
             1  COLAMD, and
             2  the natural ordering.
             The default is 1 for COLAMD.
Return value ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.
Notes        See SUNLinSol_KLUSetOrdering for complete further documentation of this routine.

FSUNMASSKLUSETORDERING

Call         FSUNMASSKLUSETORDERING(ordering, ier)
Description  The function FSUNMASSKLUSETORDERING can be called from Fortran programs to change the reordering algorithm used by klu for mass matrix linear systems.
Arguments    The arguments are identical to FSUNKLUSETORDERING above, except that code is not needed since mass matrix linear systems only arise in arkode.
Return value ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.
Notes        See SUNLinSol_KLUSetOrdering for complete further documentation of this routine.
9.9.4 SUNLinearSolver KLU content

The sunlinsol klu module defines the content field of a SUNLinearSolver as the following structure:

struct _SUNLinearSolverContent_KLU {
  long int last_flag;
  int first_factorize;
  sun_klu_symbolic *symbolic;
  sun_klu_numeric *numeric;
  sun_klu_common common;
  sunindextype (*klu_solver)(sun_klu_symbolic*, sun_klu_numeric*,
                             sunindextype, sunindextype,
                             double*, sun_klu_common*);
};

These entries of the content field contain the following information:
last_flag       - last error return flag from internal function evaluations,
first_factorize - flag indicating whether the factorization has ever been performed,
symbolic        - klu storage structure for symbolic factorization components,
numeric         - klu storage structure for numeric factorization components,
common          - storage structure for common klu solver components,
klu_solver      - pointer to the appropriate klu solver function (depending on whether it is using a CSR or CSC sparse matrix).

9.10 The SUNLinearSolver SuperLUMT implementation

This section describes the sunlinsol implementation for solving sparse linear systems with SuperLU_MT. The sunlinsol superlumt module is designed to be used with the corresponding sunmatrix sparse matrix type, and one of the serial or shared-memory nvector implementations (nvector serial, nvector openmp, or nvector pthreads). While these are compatible, it is not recommended to use a threaded vector module with sunlinsol superlumt unless it is the nvector openmp module and the superlumt library has also been compiled with OpenMP.

The header file to include when using this module is sunlinsol/sunlinsol_superlumt.h. The installed module library to link to is libsundials_sunlinsolsuperlumt.lib, where .lib is typically .so for shared libraries and .a for static libraries.

The sunlinsol superlumt module is a sunlinsol wrapper for the superlumt sparse matrix factorization and solver library written by X. Sherry Li [2, 35, 19].
The package performs matrix factorization using threads to enhance efficiency in shared-memory parallel environments. It should be noted that threads are only used in the factorization step. In order to use the sunlinsol superlumt interface to superlumt, it is assumed that superlumt has been installed on the system prior to installation of sundials, and that sundials has been configured appropriately to link with superlumt (see Appendix A for details). Additionally, this wrapper only supports single- and double-precision calculations, and therefore cannot be compiled if sundials is configured to have realtype set to extended (see Section 4.2). Moreover, since the superlumt library may be installed to support either 32-bit or 64-bit integers, it is assumed that the superlumt library is installed using the same integer precision as the sundials sunindextype option.

9.10.1 SUNLinearSolver SuperLUMT description

The superlumt library has a symbolic factorization routine that computes the permutation of the linear system matrix to reduce fill-in on subsequent LU factorizations (using COLAMD, minimal degree ordering on A^T A, minimal degree ordering on A^T + A, or natural ordering). Of these ordering choices, the default value in the sunlinsol superlumt module is the COLAMD ordering.

Since the linear systems that arise within the context of sundials calculations will typically have identical sparsity patterns, the sunlinsol superlumt module is constructed to perform the following operations:

• The first time that the “setup” routine is called, it performs the symbolic factorization, followed by an initial numerical factorization.

• On subsequent calls to the “setup” routine, it skips the symbolic factorization, and only refactors the input matrix.

• The “solve” call performs pivoting and forward and backward substitution using the stored superlumt data structures.
We note that in this solve superlumt operates on the native data arrays for the right-hand side and solution vectors, without requiring costly data copies.

9.10.2 SUNLinearSolver SuperLUMT functions

The module sunlinsol superlumt provides the following user-callable constructor for creating a SUNLinearSolver object.

SUNLinSol_SuperLUMT

Call         LS = SUNLinSol_SuperLUMT(y, A, num_threads);
Description  The function SUNLinSol_SuperLUMT creates and allocates memory for a SuperLU_MT-based SUNLinearSolver object.
Arguments    y (N_Vector) a template for cloning vectors needed within the solver
             A (SUNMatrix) a sunmatrix sparse matrix template for cloning matrices needed within the solver
             num_threads (int) desired number of threads (OpenMP or Pthreads, depending on how superlumt was installed) to use during the factorization steps
Return value This returns a SUNLinearSolver object. If either A or y is incompatible, then this routine will return NULL.
Notes        This routine analyzes the input matrix and vector to determine the linear system size and to assess compatibility with the superlumt library. This routine will perform consistency checks to ensure that it is called with consistent nvector and sunmatrix implementations. These are currently limited to the sunmatrix sparse matrix type (using either CSR or CSC storage formats) and the nvector serial, nvector openmp, and nvector pthreads vector types. As additional compatible matrix and vector implementations are added to sundials, these will be included within this compatibility check. The num_threads argument is not checked and is passed directly to superlumt routines.
Deprecated Name  For backward compatibility, the wrapper function SUNSuperLUMT with identical input and output arguments is also provided.
The sunlinsol superlumt module defines implementations of all “direct” linear solver operations listed in Sections 9.1.1 – 9.1.3:

• SUNLinSolGetType_SuperLUMT
• SUNLinSolInitialize_SuperLUMT – this sets the first_factorize flag to 1 and resets the internal superlumt statistics variables.
• SUNLinSolSetup_SuperLUMT – this performs either an LU factorization or refactorization of the input matrix.
• SUNLinSolSolve_SuperLUMT – this calls the appropriate superlumt solve routine to utilize the LU factors to solve the linear system.
• SUNLinSolLastFlag_SuperLUMT
• SUNLinSolSpace_SuperLUMT – this only returns information for the storage within the solver interface, i.e., storage for the integers last_flag and first_factorize. For additional space requirements, see the superlumt documentation.
• SUNLinSolFree_SuperLUMT

The sunlinsol superlumt module also defines the following additional user-callable function.

SUNLinSol_SuperLUMTSetOrdering

Call         retval = SUNLinSol_SuperLUMTSetOrdering(LS, ordering);
Description  This function sets the ordering used by superlumt for reducing fill in the linear solve.
Arguments    LS (SUNLinearSolver) the sunlinsol superlumt object
             ordering (int) a flag indicating the ordering algorithm to use; the options are:
             0  natural ordering
             1  minimal degree ordering on A^T A
             2  minimal degree ordering on A^T + A
             3  COLAMD ordering for unsymmetric matrices
             The default is 3 for COLAMD.
Return value The return values from this function are SUNLS_MEM_NULL (LS is NULL), SUNLS_ILL_INPUT (invalid ordering choice), or SUNLS_SUCCESS.
Deprecated Name  For backward compatibility, the wrapper function SUNSuperLUMTSetOrdering with identical input and output arguments is also provided.

9.10.3 SUNLinearSolver SuperLUMT Fortran interfaces

For solvers that include a Fortran interface module, the sunlinsol superlumt module also includes a Fortran-callable function for creating a SUNLinearSolver object.
FSUNSUPERLUMTINIT

Call: FSUNSUPERLUMTINIT(code, num_threads, ier)

Description: The function FSUNSUPERLUMTINIT can be called from Fortran programs to create a sunlinsol_superlumt object.

Arguments:
  code        (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
  num_threads (int*) desired number of threads (OpenMP or Pthreads, depending on how superlumt was installed) to use during the factorization steps

Return value: ier is a return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.

Notes: This routine must be called after both the nvector and sunmatrix objects have been initialized.

Additionally, when using arkode with a non-identity mass matrix, the sunlinsol_superlumt module includes a Fortran-callable function for creating a SUNLinearSolver mass matrix solver object.

FSUNMASSSUPERLUMTINIT

Call: FSUNMASSSUPERLUMTINIT(num_threads, ier)

Description: The function FSUNMASSSUPERLUMTINIT can be called from Fortran programs to create a SuperLU_MT-based SUNLinearSolver object for mass matrix linear systems.

Arguments:
  num_threads (int*) desired number of threads (OpenMP or Pthreads, depending on how superlumt was installed) to use during the factorization steps.

Return value: ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.

Notes: This routine must be called after both the nvector and sunmatrix mass-matrix objects have been initialized.

The SUNLinSol_SuperLUMTSetOrdering routine also supports Fortran interfaces for the system and mass matrix solvers:

FSUNSUPERLUMTSETORDERING

Call: FSUNSUPERLUMTSETORDERING(code, ordering, ier)

Description: The function FSUNSUPERLUMTSETORDERING can be called from Fortran programs to update the ordering algorithm in a sunlinsol_superlumt object.
Arguments:
  code     (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
  ordering (int*) a flag indicating the ordering algorithm; options are:
             0  natural ordering
             1  minimal degree ordering on A^T A
             2  minimal degree ordering on A^T + A
             3  COLAMD ordering for unsymmetric matrices
           The default is 3 for COLAMD.

Return value: ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.

Notes: See SUNLinSol_SuperLUMTSetOrdering for complete further documentation of this routine.

FSUNMASSSUPERLUMTSETORDERING

Call: FSUNMASSSUPERLUMTSETORDERING(ordering, ier)

Description: The function FSUNMASSSUPERLUMTSETORDERING can be called from Fortran programs to update the ordering algorithm in a sunlinsol_superlumt object for mass matrix linear systems.

Arguments:
  ordering (int*) a flag indicating the ordering algorithm; options are:
             0  natural ordering
             1  minimal degree ordering on A^T A
             2  minimal degree ordering on A^T + A
             3  COLAMD ordering for unsymmetric matrices
           The default is 3 for COLAMD.

Return value: ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.

Notes: See SUNLinSol_SuperLUMTSetOrdering for complete further documentation of this routine.
9.10.4 SUNLinearSolver_SuperLUMT content

The sunlinsol_superlumt module defines the content field of a SUNLinearSolver as the following structure:

struct _SUNLinearSolverContent_SuperLUMT {
  long int last_flag;
  int first_factorize;
  SuperMatrix *A, *AC, *L, *U, *B;
  Gstat_t *Gstat;
  sunindextype *perm_r, *perm_c;
  sunindextype N;
  int num_threads;
  realtype diag_pivot_thresh;
  int ordering;
  superlumt_options_t *options;
};

These entries of the content field contain the following information:
  last_flag - last error return flag from internal function evaluations,
  first_factorize - flag indicating whether the factorization has ever been performed,
  A, AC, L, U, B - SuperMatrix pointers used in solve,
  Gstat - Gstat_t object used in solve,
  perm_r, perm_c - permutation arrays used in solve,
  N - size of the linear system,
  num_threads - number of OpenMP/Pthreads threads to use,
  diag_pivot_thresh - threshold on diagonal pivoting,
  ordering - flag for which reordering algorithm to use,
  options - pointer to superlumt options structure.

9.11 The SUNLinearSolver_SPGMR implementation

This section describes the sunlinsol implementation of the spgmr (Scaled, Preconditioned, Generalized Minimum Residual [41]) iterative linear solver. The sunlinsol_spgmr module is designed to be compatible with any nvector implementation that supports a minimal subset of operations (N_VClone, N_VDotProd, N_VScale, N_VLinearSum, N_VProd, N_VConst, N_VDiv, and N_VDestroy). When using Classical Gram-Schmidt, the optional function N_VDotProdMulti may be supplied for increased efficiency.

To access the sunlinsol_spgmr module, include the header file sunlinsol/sunlinsol_spgmr.h. We note that the sunlinsol_spgmr module is accessible from sundials packages without separately linking to the libsundials_sunlinsolspgmr module library.
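The remark about N_VDotProdMulti and Classical Gram-Schmidt can be made concrete with a small self-contained sketch (plain C arrays stand in for nvectors here; this is not sundials code). In a classical Gram-Schmidt step, every projection coefficient is computed from the same input vector, so the dot products against the whole Krylov basis can be fused into one batched reduction, which is exactly what N_VDotProdMulti provides. In a modified Gram-Schmidt step, each dot product depends on the previous update, so the reductions cannot be batched.

```c
#include <math.h>

#define DIM 3
#define NBASIS 2

static double dot(const double *a, const double *b)
{
  double s = 0.0;
  for (int i = 0; i < DIM; i++) s += a[i] * b[i];
  return s;
}

/* Classical Gram-Schmidt: all coefficients come from the SAME vector w,
   so the NBASIS dot products below are batchable (cf. N_VDotProdMulti). */
static void classical_gs(double *w, double q[NBASIS][DIM])
{
  double h[NBASIS];
  for (int j = 0; j < NBASIS; j++) h[j] = dot(q[j], w); /* batchable pass */
  for (int j = 0; j < NBASIS; j++)
    for (int i = 0; i < DIM; i++) w[i] -= h[j] * q[j][i];
}

/* Modified Gram-Schmidt: each dot product uses the partially
   orthogonalized vector, so the reductions are inherently sequential. */
static void modified_gs(double *w, double q[NBASIS][DIM])
{
  for (int j = 0; j < NBASIS; j++) {
    double h = dot(q[j], w); /* depends on the previous update */
    for (int i = 0; i < DIM; i++) w[i] -= h * q[j][i];
  }
}
```

For an orthonormal basis both variants produce the same result in exact arithmetic; they differ in floating-point robustness and, as above, in how their reductions can be scheduled.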
9.11.1 SUNLinearSolver_SPGMR description

This solver is constructed to perform the following operations:

• During construction, the xcor and vtemp arrays are cloned from a template nvector that is input, and default solver parameters are set.
• User-facing "set" routines may be called to modify default solver parameters.
• Additional "set" routines are called by the sundials solver that interfaces with sunlinsol_spgmr to supply the ATimes, PSetup, and Psolve function pointers and the s1 and s2 scaling vectors.
• In the "initialize" call, the remaining solver data is allocated (V, Hes, givens, and yg).
• In the "setup" call, any non-NULL PSetup function is called. Typically, this is provided by the sundials solver itself, which translates between the generic PSetup function and the solver-specific routine (solver-supplied or user-supplied).
• In the "solve" call, the GMRES iteration is performed. This will include scaling, preconditioning, and restarts if those options have been supplied.

9.11.2 SUNLinearSolver_SPGMR functions

The sunlinsol_spgmr module provides the following user-callable constructor for creating a SUNLinearSolver object.

SUNLinSol_SPGMR

Call: LS = SUNLinSol_SPGMR(y, pretype, maxl);

Description: The function SUNLinSol_SPGMR creates and allocates memory for a spgmr SUNLinearSolver object.

Arguments:
  y       (N_Vector) a template for cloning vectors needed within the solver
  pretype (int) flag indicating the desired type of preconditioning; allowed values are:
            • PREC_NONE  (0)
            • PREC_LEFT  (1)
            • PREC_RIGHT (2)
            • PREC_BOTH  (3)
          Any other integer input will result in the default (no preconditioning).
  maxl    (int) the number of Krylov basis vectors to use. Values ≤ 0 will result in the default value (5).

Return value: This returns a SUNLinearSolver object. If y is incompatible, then this routine will return NULL.
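The documented argument fallbacks (an unrecognized pretype becomes no preconditioning; a non-positive maxl becomes the default of 5) can be sketched as two small helpers. These are illustrative stand-ins, not the constructor's actual source:

```c
/* PREC_* values as documented for SUNLinSol_SPGMR. */
enum { PREC_NONE = 0, PREC_LEFT = 1, PREC_RIGHT = 2, PREC_BOTH = 3 };

#define SPGMR_MAXL_DEFAULT 5 /* documented default Krylov dimension */

/* Any unrecognized preconditioning flag falls back to PREC_NONE. */
static int normalize_pretype(int pretype)
{
  return (pretype == PREC_LEFT || pretype == PREC_RIGHT || pretype == PREC_BOTH)
             ? pretype
             : PREC_NONE;
}

/* A non-positive Krylov dimension falls back to the default of 5. */
static int normalize_maxl(int maxl)
{
  return (maxl <= 0) ? SPGMR_MAXL_DEFAULT : maxl;
}
```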
Notes: This routine will perform consistency checks to ensure that it is called with a consistent nvector implementation (i.e. that it supplies the requisite vector operations). If y is incompatible, then this routine will return NULL. We note that some sundials solvers are designed to only work with left preconditioning (ida and idas) and others with only right preconditioning (kinsol). While it is possible to configure a sunlinsol_spgmr object to use any of the preconditioning options with these solvers, this use mode is not supported and may result in inferior performance.

Deprecated Name: For backward compatibility, the wrapper function SUNSPGMR with identical input and output arguments is also provided.

F2003 Name: This function is callable as FSUNLinSol_SPGMR when using the Fortran 2003 interface module.

The sunlinsol_spgmr module defines implementations of all "iterative" linear solver operations listed in Sections 9.1.1 – 9.1.3:

• SUNLinSolGetType_SPGMR
• SUNLinSolInitialize_SPGMR
• SUNLinSolSetATimes_SPGMR
• SUNLinSolSetPreconditioner_SPGMR
• SUNLinSolSetScalingVectors_SPGMR
• SUNLinSolSetup_SPGMR
• SUNLinSolSolve_SPGMR
• SUNLinSolNumIters_SPGMR
• SUNLinSolResNorm_SPGMR
• SUNLinSolResid_SPGMR
• SUNLinSolLastFlag_SPGMR
• SUNLinSolSpace_SPGMR
• SUNLinSolFree_SPGMR

All of the listed operations are callable via the Fortran 2003 interface module by prepending an 'F' to the function name.

The sunlinsol_spgmr module also defines the following additional user-callable functions.

SUNLinSol_SPGMRSetPrecType

Call: retval = SUNLinSol_SPGMRSetPrecType(LS, pretype);

Description: The function SUNLinSol_SPGMRSetPrecType updates the type of preconditioning to use in the sunlinsol_spgmr object.

Arguments:
  LS      (SUNLinearSolver) the sunlinsol_spgmr object to update
  pretype (int) flag indicating the desired type of preconditioning; allowed values match those discussed in SUNLinSol_SPGMR.
Return value: This routine will return with one of the error codes SUNLS_ILL_INPUT (illegal pretype), SUNLS_MEM_NULL (LS is NULL), or SUNLS_SUCCESS.

Deprecated Name: For backward compatibility, the wrapper function SUNSPGMRSetPrecType with identical input and output arguments is also provided.

F2003 Name: This function is callable as FSUNLinSol_SPGMRSetPrecType when using the Fortran 2003 interface module.

SUNLinSol_SPGMRSetGSType

Call: retval = SUNLinSol_SPGMRSetGSType(LS, gstype);

Description: The function SUNLinSol_SPGMRSetGSType sets the type of Gram-Schmidt orthogonalization to use in the sunlinsol_spgmr object.

Arguments:
  LS     (SUNLinearSolver) the sunlinsol_spgmr object to update
  gstype (int) flag indicating the desired orthogonalization algorithm; allowed values are:
           • MODIFIED_GS  (1)
           • CLASSICAL_GS (2)
         Any other integer input will result in a failure, returning error code SUNLS_ILL_INPUT.

Return value: This routine will return with one of the error codes SUNLS_ILL_INPUT (illegal gstype), SUNLS_MEM_NULL (LS is NULL), or SUNLS_SUCCESS.

Deprecated Name: For backward compatibility, the wrapper function SUNSPGMRSetGSType with identical input and output arguments is also provided.

F2003 Name: This function is callable as FSUNLinSol_SPGMRSetGSType when using the Fortran 2003 interface module.

SUNLinSol_SPGMRSetMaxRestarts

Call: retval = SUNLinSol_SPGMRSetMaxRestarts(LS, maxrs);

Description: The function SUNLinSol_SPGMRSetMaxRestarts sets the number of GMRES restarts to allow in the sunlinsol_spgmr object.

Arguments:
  LS    (SUNLinearSolver) the sunlinsol_spgmr object to update
  maxrs (int) integer indicating the number of restarts to allow. A negative input will result in the default of 0.

Return value: This routine will return with one of the error codes SUNLS_MEM_NULL (LS is NULL) or SUNLS_SUCCESS.
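Restarting caps memory, not work: the Krylov basis is limited to maxl vectors, but each restarted cycle may perform up to maxl further iterations, so the solver may take as many as maxl * (max_restarts + 1) iterations in total. The helper below is a hypothetical illustration of that arithmetic, not a sundials function:

```c
/* Hypothetical helper: upper bound on the number of GMRES iterations
   implied by the documented maxl (Krylov dimension) and max_restarts
   parameters -- the initial cycle plus up to max_restarts restarted
   cycles, each of at most maxl iterations. */
static int spgmr_max_total_iters(int maxl, int max_restarts)
{
  return maxl * (max_restarts + 1);
}
```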
Deprecated Name: For backward compatibility, the wrapper function SUNSPGMRSetMaxRestarts with identical input and output arguments is also provided.

F2003 Name: This function is callable as FSUNLinSol_SPGMRSetMaxRestarts when using the Fortran 2003 interface module.

9.11.3 SUNLinearSolver_SPGMR Fortran interfaces

The sunlinsol_spgmr module provides a Fortran 2003 module as well as Fortran 77 style interface functions for use from Fortran applications.

FORTRAN 2003 interface module

The fsunlinsol_spgmr_mod Fortran module defines interfaces to all sunlinsol_spgmr C functions using the intrinsic iso_c_binding module, which provides a standardized mechanism for interoperating with C. As noted in the C function descriptions above, the interface functions are named after the corresponding C function, but with a leading 'F'. For example, the function SUNLinSol_SPGMR is interfaced as FSUNLinSol_SPGMR.

The Fortran 2003 sunlinsol_spgmr interface module can be accessed with the use statement, i.e. use fsunlinsol_spgmr_mod, and by linking to the library libsundials_fsunlinsolspgmr_mod.lib in addition to the C library. For details on where the library and module file fsunlinsol_spgmr_mod.mod are installed, see Appendix A. We note that the module is accessible from the Fortran 2003 sundials integrators without separately linking to the libsundials_fsunlinsolspgmr_mod library.

FORTRAN 77 interface functions

For solvers that include a Fortran 77 interface module, the sunlinsol_spgmr module also includes a Fortran-callable function for creating a SUNLinearSolver object.

FSUNSPGMRINIT

Call: FSUNSPGMRINIT(code, pretype, maxl, ier)

Description: The function FSUNSPGMRINIT can be called from Fortran programs to create a sunlinsol_spgmr object.

Arguments:
  code (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
  pretype (int*) flag indicating the desired preconditioning type
  maxl    (int*) flag indicating the Krylov subspace size

Return value: ier is a return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.

Notes: This routine must be called after the nvector object has been initialized. Allowable values for pretype and maxl are the same as for the C function SUNLinSol_SPGMR.

Additionally, when using arkode with a non-identity mass matrix, the sunlinsol_spgmr module includes a Fortran-callable function for creating a SUNLinearSolver mass matrix solver object.

FSUNMASSSPGMRINIT

Call: FSUNMASSSPGMRINIT(pretype, maxl, ier)

Description: The function FSUNMASSSPGMRINIT can be called from Fortran programs to create a sunlinsol_spgmr object for mass matrix linear systems.

Arguments:
  pretype (int*) flag indicating the desired preconditioning type
  maxl    (int*) flag indicating the Krylov subspace size

Return value: ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.

Notes: This routine must be called after the nvector object has been initialized. Allowable values for pretype and maxl are the same as for the C function SUNLinSol_SPGMR.

The SUNLinSol_SPGMRSetPrecType, SUNLinSol_SPGMRSetGSType, and SUNLinSol_SPGMRSetMaxRestarts routines also support Fortran interfaces for the system and mass matrix solvers.

FSUNSPGMRSETGSTYPE

Call: FSUNSPGMRSETGSTYPE(code, gstype, ier)

Description: The function FSUNSPGMRSETGSTYPE can be called from Fortran programs to change the Gram-Schmidt orthogonalization algorithm.

Arguments:
  code   (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
  gstype (int*) flag indicating the desired orthogonalization algorithm.

Return value: ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.
Notes: See SUNLinSol_SPGMRSetGSType for complete further documentation of this routine.

FSUNMASSSPGMRSETGSTYPE

Call: FSUNMASSSPGMRSETGSTYPE(gstype, ier)

Description: The function FSUNMASSSPGMRSETGSTYPE can be called from Fortran programs to change the Gram-Schmidt orthogonalization algorithm for mass matrix linear systems.

Arguments: The arguments are identical to FSUNSPGMRSETGSTYPE above, except that code is not needed since mass matrix linear systems only arise in arkode.

Return value: ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.

Notes: See SUNLinSol_SPGMRSetGSType for complete further documentation of this routine.

FSUNSPGMRSETPRECTYPE

Call: FSUNSPGMRSETPRECTYPE(code, pretype, ier)

Description: The function FSUNSPGMRSETPRECTYPE can be called from Fortran programs to change the type of preconditioning to use.

Arguments:
  code    (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
  pretype (int*) flag indicating the type of preconditioning to use.

Return value: ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.

Notes: See SUNLinSol_SPGMRSetPrecType for complete further documentation of this routine.

FSUNMASSSPGMRSETPRECTYPE

Call: FSUNMASSSPGMRSETPRECTYPE(pretype, ier)

Description: The function FSUNMASSSPGMRSETPRECTYPE can be called from Fortran programs to change the type of preconditioning for mass matrix linear systems.

Arguments: The arguments are identical to FSUNSPGMRSETPRECTYPE above, except that code is not needed since mass matrix linear systems only arise in arkode.

Return value: ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.
Notes: See SUNLinSol_SPGMRSetPrecType for complete further documentation of this routine.

FSUNSPGMRSETMAXRS

Call: FSUNSPGMRSETMAXRS(code, maxrs, ier)

Description: The function FSUNSPGMRSETMAXRS can be called from Fortran programs to change the maximum number of restarts allowed for spgmr.

Arguments:
  code  (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
  maxrs (int*) maximum allowed number of restarts.

Return value: ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.

Notes: See SUNLinSol_SPGMRSetMaxRestarts for complete further documentation of this routine.

FSUNMASSSPGMRSETMAXRS

Call: FSUNMASSSPGMRSETMAXRS(maxrs, ier)

Description: The function FSUNMASSSPGMRSETMAXRS can be called from Fortran programs to change the maximum number of restarts allowed for spgmr for mass matrix linear systems.

Arguments: The arguments are identical to FSUNSPGMRSETMAXRS above, except that code is not needed since mass matrix linear systems only arise in arkode.

Return value: ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.

Notes: See SUNLinSol_SPGMRSetMaxRestarts for complete further documentation of this routine.
9.11.4 SUNLinearSolver_SPGMR content

The sunlinsol_spgmr module defines the content field of a SUNLinearSolver as the following structure:

struct _SUNLinearSolverContent_SPGMR {
  int maxl;
  int pretype;
  int gstype;
  int max_restarts;
  int numiters;
  realtype resnorm;
  long int last_flag;
  ATimesFn ATimes;
  void* ATData;
  PSetupFn Psetup;
  PSolveFn Psolve;
  void* PData;
  N_Vector s1;
  N_Vector s2;
  N_Vector *V;
  realtype **Hes;
  realtype *givens;
  N_Vector xcor;
  realtype *yg;
  N_Vector vtemp;
};

These entries of the content field contain the following information:
  maxl - number of GMRES basis vectors to use (default is 5),
  pretype - flag for type of preconditioning to employ (default is none),
  gstype - flag for type of Gram-Schmidt orthogonalization (default is modified Gram-Schmidt),
  max_restarts - number of GMRES restarts to allow (default is 0),
  numiters - number of iterations from the most-recent solve,
  resnorm - final linear residual norm from the most-recent solve,
  last_flag - last error return flag from an internal function,
  ATimes - function pointer to perform Av product,
  ATData - pointer to structure for ATimes,
  Psetup - function pointer to preconditioner setup routine,
  Psolve - function pointer to preconditioner solve routine,
  PData - pointer to structure for Psetup and Psolve,
  s1, s2 - vector pointers for supplied scaling matrices (default is NULL),
  V - the array of Krylov basis vectors v_1, ..., v_{maxl+1}, stored in V[0], ..., V[maxl]. Each v_i is a vector of type nvector,
  Hes - the (maxl+1) × maxl Hessenberg matrix. It is stored row-wise so that the (i,j)th element is given by Hes[i][j],
  givens - a length 2*maxl array which represents the Givens rotation matrices that arise in the GMRES algorithm. These matrices are F_0, F_1, ..., F_j, where

    F_i = \begin{pmatrix} 1 & & & & & \\ & \ddots & & & & \\ & & c_i & s_i & & \\ & & -s_i & c_i & & \\ & & & & \ddots & \\ & & & & & 1 \end{pmatrix},

  and are represented in the givens vector as givens[0] = c_0, givens[1] = s_0, givens[2] = c_1, givens[3] = s_1, ...,
givens[2j] = c_j, givens[2j+1] = s_j,
  xcor - a vector which holds the scaled, preconditioned correction to the initial guess,
  yg - a length (maxl+1) array of realtype values used to hold "short" vectors (e.g. y and g),
  vtemp - temporary vector storage.

9.12 The SUNLinearSolver_SPFGMR implementation

This section describes the sunlinsol implementation of the spfgmr (Scaled, Preconditioned, Flexible, Generalized Minimum Residual [40]) iterative linear solver. The sunlinsol_spfgmr module is designed to be compatible with any nvector implementation that supports a minimal subset of operations (N_VClone, N_VDotProd, N_VScale, N_VLinearSum, N_VProd, N_VConst, N_VDiv, and N_VDestroy). When using Classical Gram-Schmidt, the optional function N_VDotProdMulti may be supplied for increased efficiency. Unlike the other Krylov iterative linear solvers supplied with sundials, spfgmr is specifically designed to work with a changing preconditioner (e.g. from an iterative method).

To access the sunlinsol_spfgmr module, include the header file sunlinsol/sunlinsol_spfgmr.h. We note that the sunlinsol_spfgmr module is accessible from sundials packages without separately linking to the libsundials_sunlinsolspfgmr module library.

9.12.1 SUNLinearSolver_SPFGMR description

This solver is constructed to perform the following operations:

• During construction, the xcor and vtemp arrays are cloned from a template nvector that is input, and default solver parameters are set.
• User-facing "set" routines may be called to modify default solver parameters.
• Additional "set" routines are called by the sundials solver that interfaces with sunlinsol_spfgmr to supply the ATimes, PSetup, and Psolve function pointers and the s1 and s2 scaling vectors.
• In the "initialize" call, the remaining solver data is allocated (V, Hes, givens, and yg).
• In the "setup" call, any non-NULL PSetup function is called.
Typically, this is provided by the sundials solver itself, which translates between the generic PSetup function and the solver-specific routine (solver-supplied or user-supplied).
• In the "solve" call, the FGMRES iteration is performed. This will include scaling, preconditioning, and restarts if those options have been supplied.

9.12.2 SUNLinearSolver_SPFGMR functions

The sunlinsol_spfgmr module provides the following user-callable constructor for creating a SUNLinearSolver object.

SUNLinSol_SPFGMR

Call: LS = SUNLinSol_SPFGMR(y, pretype, maxl);

Description: The function SUNLinSol_SPFGMR creates and allocates memory for a spfgmr SUNLinearSolver object.

Arguments:
  y       (N_Vector) a template for cloning vectors needed within the solver
  pretype (int) flag indicating the desired type of preconditioning; allowed values are:
            • PREC_NONE  (0)
            • PREC_LEFT  (1)
            • PREC_RIGHT (2)
            • PREC_BOTH  (3)
          Any other integer input will result in the default (no preconditioning).
  maxl    (int) the number of Krylov basis vectors to use. Values ≤ 0 will result in the default value (5).

Return value: This returns a SUNLinearSolver object. If y is incompatible, then this routine will return NULL.

Notes: This routine will perform consistency checks to ensure that it is called with a consistent nvector implementation (i.e. that it supplies the requisite vector operations). If y is incompatible, then this routine will return NULL. We note that some sundials solvers are designed to only work with left preconditioning (ida and idas) and others with only right preconditioning (kinsol). While it is possible to configure a sunlinsol_spfgmr object to use any of the preconditioning options with these solvers, this use mode is not supported and may result in inferior performance.

F2003 Name: This function is callable as FSUNLinSol_SPFGMR when using the Fortran 2003 interface module.
Deprecated Name: For backward compatibility, the wrapper function SUNSPFGMR with identical input and output arguments is also provided.

The sunlinsol_spfgmr module defines implementations of all "iterative" linear solver operations listed in Sections 9.1.1 – 9.1.3:

• SUNLinSolGetType_SPFGMR
• SUNLinSolInitialize_SPFGMR
• SUNLinSolSetATimes_SPFGMR
• SUNLinSolSetPreconditioner_SPFGMR
• SUNLinSolSetScalingVectors_SPFGMR
• SUNLinSolSetup_SPFGMR
• SUNLinSolSolve_SPFGMR
• SUNLinSolNumIters_SPFGMR
• SUNLinSolResNorm_SPFGMR
• SUNLinSolResid_SPFGMR
• SUNLinSolLastFlag_SPFGMR
• SUNLinSolSpace_SPFGMR
• SUNLinSolFree_SPFGMR

All of the listed operations are callable via the Fortran 2003 interface module by prepending an 'F' to the function name.

The sunlinsol_spfgmr module also defines the following additional user-callable functions.

SUNLinSol_SPFGMRSetPrecType

Call: retval = SUNLinSol_SPFGMRSetPrecType(LS, pretype);

Description: The function SUNLinSol_SPFGMRSetPrecType updates the type of preconditioning to use in the sunlinsol_spfgmr object.

Arguments:
  LS      (SUNLinearSolver) the sunlinsol_spfgmr object to update
  pretype (int) flag indicating the desired type of preconditioning; allowed values match those discussed in SUNLinSol_SPFGMR.

Return value: This routine will return with one of the error codes SUNLS_ILL_INPUT (illegal pretype), SUNLS_MEM_NULL (LS is NULL), or SUNLS_SUCCESS.

Deprecated Name: For backward compatibility, the wrapper function SUNSPFGMRSetPrecType with identical input and output arguments is also provided.

F2003 Name: This function is callable as FSUNLinSol_SPFGMRSetPrecType when using the Fortran 2003 interface module.

SUNLinSol_SPFGMRSetGSType

Call: retval = SUNLinSol_SPFGMRSetGSType(LS, gstype);

Description: The function SUNLinSol_SPFGMRSetGSType sets the type of Gram-Schmidt orthogonalization to use in the sunlinsol_spfgmr object.
Arguments:
  LS     (SUNLinearSolver) the sunlinsol_spfgmr object to update
  gstype (int) flag indicating the desired orthogonalization algorithm; allowed values are:
           • MODIFIED_GS  (1)
           • CLASSICAL_GS (2)
         Any other integer input will result in a failure, returning error code SUNLS_ILL_INPUT.

Return value: This routine will return with one of the error codes SUNLS_ILL_INPUT (illegal gstype), SUNLS_MEM_NULL (LS is NULL), or SUNLS_SUCCESS.

Deprecated Name: For backward compatibility, the wrapper function SUNSPFGMRSetGSType with identical input and output arguments is also provided.

F2003 Name: This function is callable as FSUNLinSol_SPFGMRSetGSType when using the Fortran 2003 interface module.

SUNLinSol_SPFGMRSetMaxRestarts

Call: retval = SUNLinSol_SPFGMRSetMaxRestarts(LS, maxrs);

Description: The function SUNLinSol_SPFGMRSetMaxRestarts sets the number of FGMRES restarts to allow in the sunlinsol_spfgmr object.

Arguments:
  LS    (SUNLinearSolver) the sunlinsol_spfgmr object to update
  maxrs (int) integer indicating the number of restarts to allow. A negative input will result in the default of 0.

Return value: This routine will return with one of the error codes SUNLS_MEM_NULL (LS is NULL) or SUNLS_SUCCESS.

Deprecated Name: For backward compatibility, the wrapper function SUNSPFGMRSetMaxRestarts with identical input and output arguments is also provided.

F2003 Name: This function is callable as FSUNLinSol_SPFGMRSetMaxRestarts when using the Fortran 2003 interface module.

9.12.3 SUNLinearSolver_SPFGMR Fortran interfaces

The sunlinsol_spfgmr module provides a Fortran 2003 module as well as Fortran 77 style interface functions for use from Fortran applications.

FORTRAN 2003 interface module

The fsunlinsol_spfgmr_mod Fortran module defines interfaces to all sunlinsol_spfgmr C functions using the intrinsic iso_c_binding module, which provides a standardized mechanism for interoperating with C.
As noted in the C function descriptions above, the interface functions are named after the corresponding C function, but with a leading 'F'. For example, the function SUNLinSol_SPFGMR is interfaced as FSUNLinSol_SPFGMR.

The Fortran 2003 sunlinsol_spfgmr interface module can be accessed with the use statement, i.e. use fsunlinsol_spfgmr_mod, and by linking to the library libsundials_fsunlinsolspfgmr_mod.lib in addition to the C library. For details on where the library and module file fsunlinsol_spfgmr_mod.mod are installed, see Appendix A. We note that the module is accessible from the Fortran 2003 sundials integrators without separately linking to the libsundials_fsunlinsolspfgmr_mod library.

FORTRAN 77 interface functions

For solvers that include a Fortran 77 interface module, the sunlinsol_spfgmr module also includes a Fortran-callable function for creating a SUNLinearSolver object.

FSUNSPFGMRINIT

Call: FSUNSPFGMRINIT(code, pretype, maxl, ier)

Description: The function FSUNSPFGMRINIT can be called from Fortran programs to create a sunlinsol_spfgmr object.

Arguments:
  code    (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
  pretype (int*) flag indicating the desired preconditioning type
  maxl    (int*) flag indicating the Krylov subspace size

Return value: ier is a return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.

Notes: This routine must be called after the nvector object has been initialized. Allowable values for pretype and maxl are the same as for the C function SUNLinSol_SPFGMR.

Additionally, when using arkode with a non-identity mass matrix, the sunlinsol_spfgmr module includes a Fortran-callable function for creating a SUNLinearSolver mass matrix solver object.
FSUNMASSSPFGMRINIT

Call: FSUNMASSSPFGMRINIT(pretype, maxl, ier)

Description: The function FSUNMASSSPFGMRINIT can be called from Fortran programs to create a sunlinsol_spfgmr object for mass matrix linear systems.

Arguments:
  pretype (int*) flag indicating the desired preconditioning type
  maxl    (int*) flag indicating the Krylov subspace size

Return value: ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.

Notes: This routine must be called after the nvector object has been initialized. Allowable values for pretype and maxl are the same as for the C function SUNLinSol_SPFGMR.

The SUNLinSol_SPFGMRSetPrecType, SUNLinSol_SPFGMRSetGSType, and SUNLinSol_SPFGMRSetMaxRestarts routines also support Fortran interfaces for the system and mass matrix solvers.

FSUNSPFGMRSETGSTYPE

Call: FSUNSPFGMRSETGSTYPE(code, gstype, ier)

Description: The function FSUNSPFGMRSETGSTYPE can be called from Fortran programs to change the Gram-Schmidt orthogonalization algorithm.

Arguments:
  code   (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
  gstype (int*) flag indicating the desired orthogonalization algorithm.

Return value: ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.

Notes: See SUNLinSol_SPFGMRSetGSType for complete further documentation of this routine.

FSUNMASSSPFGMRSETGSTYPE

Call: FSUNMASSSPFGMRSETGSTYPE(gstype, ier)

Description: The function FSUNMASSSPFGMRSETGSTYPE can be called from Fortran programs to change the Gram-Schmidt orthogonalization algorithm for mass matrix linear systems.

Arguments: The arguments are identical to FSUNSPFGMRSETGSTYPE above, except that code is not needed since mass matrix linear systems only arise in arkode.

Return value: ier is an int return completion flag equal to 0 for a successful return and -1 otherwise.
See the printed message for details in case of failure.

Notes: See SUNLinSol_SPFGMRSetGSType for complete further documentation of this routine.

FSUNSPFGMRSETPRECTYPE

Call: FSUNSPFGMRSETPRECTYPE(code, pretype, ier)

Description: The function FSUNSPFGMRSETPRECTYPE can be called from Fortran programs to change the type of preconditioning to use.

Arguments:
  code    (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
  pretype (int*) flag indicating the type of preconditioning to use.

Return value: ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.

Notes: See SUNLinSol_SPFGMRSetPrecType for complete further documentation of this routine.

FSUNMASSSPFGMRSETPRECTYPE

Call: FSUNMASSSPFGMRSETPRECTYPE(pretype, ier)

Description: The function FSUNMASSSPFGMRSETPRECTYPE can be called from Fortran programs to change the type of preconditioning for mass matrix linear systems.

Arguments: The arguments are identical to FSUNSPFGMRSETPRECTYPE above, except that code is not needed since mass matrix linear systems only arise in arkode.

Return value: ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.

Notes: See SUNLinSol_SPFGMRSetPrecType for complete further documentation of this routine.

FSUNSPFGMRSETMAXRS

Call: FSUNSPFGMRSETMAXRS(code, maxrs, ier)

Description: The function FSUNSPFGMRSETMAXRS can be called from Fortran programs to change the maximum number of restarts allowed for spfgmr.

Arguments:
  code  (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
  maxrs (int*) maximum allowed number of restarts.

Return value: ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See the printed message for details in case of failure.
Notes See SUNLinSol_SPFGMRSetMaxRestarts for complete further documentation of this routine.

FSUNMASSSPFGMRSETMAXRS

Call FSUNMASSSPFGMRSETMAXRS(maxrs, ier)

Description The function FSUNMASSSPFGMRSETMAXRS can be called from Fortran programs to change the maximum number of restarts allowed for spfgmr for mass matrix linear systems.

Arguments The arguments are identical to FSUNSPFGMRSETMAXRS above, except that code is not needed since mass matrix linear systems only arise in arkode.

Return value ier is an int completion flag: 0 for a successful return, -1 otherwise. See the printed message for details in case of failure.

Notes See SUNLinSol_SPFGMRSetMaxRestarts for complete further documentation of this routine.

9.12.4 SUNLinearSolver SPFGMR content

The sunlinsol spfgmr module defines the content field of a SUNLinearSolver as the following structure:

struct _SUNLinearSolverContent_SPFGMR {
  int maxl;
  int pretype;
  int gstype;
  int max_restarts;
  int numiters;
  realtype resnorm;
  long int last_flag;
  ATimesFn ATimes;
  void* ATData;
  PSetupFn Psetup;
  PSolveFn Psolve;
  void* PData;
  N_Vector s1;
  N_Vector s2;
  N_Vector *V;
  N_Vector *Z;
  realtype **Hes;
  realtype *givens;
  N_Vector xcor;
  realtype *yg;
  N_Vector vtemp;
};

These entries of the content field contain the following information:

maxl - number of FGMRES basis vectors to use (default is 5),
pretype - flag for type of preconditioning to employ (default is none),
gstype - flag for type of Gram-Schmidt orthogonalization (default is modified Gram-Schmidt),
max_restarts - number of FGMRES restarts to allow (default is 0),
numiters - number of iterations from the most-recent solve,
resnorm - final linear residual norm from the most-recent solve,
last_flag - last error return flag from an internal function,
ATimes - function pointer to perform Av product,
ATData - pointer to structure for ATimes,
Psetup - function pointer to preconditioner setup routine,
Psolve - function
pointer to preconditioner solve routine,
PData - pointer to structure for Psetup and Psolve,
s1, s2 - vector pointers for supplied scaling matrices (default is NULL),
V - the array of Krylov basis vectors v_1, ..., v_{maxl+1}, stored in V[0], ..., V[maxl]. Each v_i is a vector of type nvector,
Z - the array of preconditioned Krylov basis vectors z_1, ..., z_{maxl+1}, stored in Z[0], ..., Z[maxl]. Each z_i is a vector of type nvector,
Hes - the (maxl+1) x maxl Hessenberg matrix. It is stored row-wise, so that the (i,j)th element is given by Hes[i][j],
givens - a length 2*maxl array which represents the Givens rotation matrices that arise in the FGMRES algorithm. These matrices are F_0, F_1, ..., F_j, where

    F_i = \begin{bmatrix}
    1 &        &     &      &        &   \\
      & \ddots &     &      &        &   \\
      &        & c_i & -s_i &        &   \\
      &        & s_i &  c_i &        &   \\
      &        &     &      & \ddots &   \\
      &        &     &      &        & 1
    \end{bmatrix},

and are represented in the givens vector as givens[0] = c_0, givens[1] = s_0, givens[2] = c_1, givens[3] = s_1, ..., givens[2j] = c_j, givens[2j+1] = s_j,
xcor - a vector which holds the scaled, preconditioned correction to the initial guess,
yg - a length (maxl+1) array of realtype values used to hold "short" vectors (e.g. y and g),
vtemp - temporary vector storage.

9.13 The SUNLinearSolver SPBCGS implementation

This section describes the sunlinsol implementation of the spbcgs (Scaled, Preconditioned, Bi-Conjugate Gradient, Stabilized [44]) iterative linear solver. The sunlinsol spbcgs module is designed to be compatible with any nvector implementation that supports a minimal subset of operations (N_VClone, N_VDotProd, N_VScale, N_VLinearSum, N_VProd, N_VDiv, and N_VDestroy). Unlike the spgmr and spfgmr algorithms, spbcgs requires a fixed amount of memory that does not increase with the number of allowed iterations.

To access the sunlinsol spbcgs module, include the header file sunlinsol/sunlinsol_spbcgs.h. We note that the sunlinsol spbcgs module is accessible from sundials packages without separately linking to the libsundials_sunlinsolspbcgs module library.
9.13.1 SUNLinearSolver SPBCGS description

This solver is constructed to perform the following operations:

• During construction all nvector solver data is allocated, with vectors cloned from a template nvector that is input, and default solver parameters are set.
• User-facing "set" routines may be called to modify default solver parameters.
• Additional "set" routines are called by the sundials solver that interfaces with sunlinsol spbcgs to supply the ATimes, PSetup, and Psolve function pointers and the s1 and s2 scaling vectors.
• In the "initialize" call, the solver parameters are checked for validity.
• In the "setup" call, any non-NULL PSetup function is called. Typically, this is provided by the sundials solver itself, which translates between the generic PSetup function and the solver-specific routine (solver-supplied or user-supplied).
• In the "solve" call the spbcgs iteration is performed. This will include scaling and preconditioning if those options have been supplied.

9.13.2 SUNLinearSolver SPBCGS functions

The sunlinsol spbcgs module provides the following user-callable constructor for creating a SUNLinearSolver object.

SUNLinSol_SPBCGS

Call LS = SUNLinSol_SPBCGS(y, pretype, maxl);

Description The function SUNLinSol_SPBCGS creates and allocates memory for a spbcgs SUNLinearSolver object.

Arguments y (N_Vector) a template for cloning vectors needed within the solver
          pretype (int) flag indicating the desired type of preconditioning; allowed values are:
            • PREC_NONE (0)
            • PREC_LEFT (1)
            • PREC_RIGHT (2)
            • PREC_BOTH (3)
            Any other integer input will result in the default (no preconditioning).
          maxl (int) the number of linear iterations to allow. Values ≤ 0 will result in the default value (5).

Return value This returns a SUNLinearSolver object. If y is incompatible, this routine will return NULL.
Notes This routine will perform consistency checks to ensure that it is called with a consistent nvector implementation (i.e. that it supplies the requisite vector operations). If y is incompatible, then this routine will return NULL.

We note that some sundials solvers are designed to only work with left preconditioning (ida and idas) and others with only right preconditioning (kinsol). While it is possible to configure a sunlinsol spbcgs object to use any of the preconditioning options with these solvers, this use mode is not supported and may result in inferior performance.

Deprecated Name For backward compatibility, the wrapper function SUNSPBCGS with identical input and output arguments is also provided.

F2003 Name This function is callable as FSUNLinSol_SPBCGS when using the Fortran 2003 interface module.

The sunlinsol spbcgs module defines implementations of all "iterative" linear solver operations listed in Sections 9.1.1 – 9.1.3:

• SUNLinSolGetType_SPBCGS
• SUNLinSolInitialize_SPBCGS
• SUNLinSolSetATimes_SPBCGS
• SUNLinSolSetPreconditioner_SPBCGS
• SUNLinSolSetScalingVectors_SPBCGS
• SUNLinSolSetup_SPBCGS
• SUNLinSolSolve_SPBCGS
• SUNLinSolNumIters_SPBCGS
• SUNLinSolResNorm_SPBCGS
• SUNLinSolResid_SPBCGS
• SUNLinSolLastFlag_SPBCGS
• SUNLinSolSpace_SPBCGS
• SUNLinSolFree_SPBCGS

All of the listed operations are callable via the Fortran 2003 interface module by prepending an 'F' to the function name. The sunlinsol spbcgs module also defines the following additional user-callable functions.

SUNLinSol_SPBCGSSetPrecType

Call retval = SUNLinSol_SPBCGSSetPrecType(LS, pretype);

Description The function SUNLinSol_SPBCGSSetPrecType updates the type of preconditioning to use in the sunlinsol spbcgs object.

Arguments LS (SUNLinearSolver) the sunlinsol spbcgs object to update
          pretype (int) flag indicating the desired type of preconditioning; allowed values match those discussed in SUNLinSol_SPBCGS.
Return value This routine will return with one of the error codes SUNLS_ILL_INPUT (illegal pretype), SUNLS_MEM_NULL (S is NULL), or SUNLS_SUCCESS.

Deprecated Name For backward compatibility, the wrapper function SUNSPBCGSSetPrecType with identical input and output arguments is also provided.

F2003 Name This function is callable as FSUNLinSol_SPBCGSSetPrecType when using the Fortran 2003 interface module.

SUNLinSol_SPBCGSSetMaxl

Call retval = SUNLinSol_SPBCGSSetMaxl(LS, maxl);

Description The function SUNLinSol_SPBCGSSetMaxl updates the number of linear solver iterations to allow.

Arguments LS (SUNLinearSolver) the sunlinsol spbcgs object to update
          maxl (int) flag indicating the number of iterations to allow. Values ≤ 0 will result in the default value (5).

Return value This routine will return with one of the error codes SUNLS_MEM_NULL (S is NULL) or SUNLS_SUCCESS.

Deprecated Name For backward compatibility, the wrapper function SUNSPBCGSSetMaxl with identical input and output arguments is also provided.

F2003 Name This function is callable as FSUNLinSol_SPBCGSSetMaxl when using the Fortran 2003 interface module.

9.13.3 SUNLinearSolver SPBCGS Fortran interfaces

The sunlinsol spbcgs module provides a Fortran 2003 module as well as Fortran 77 style interface functions for use from Fortran applications.

FORTRAN 2003 interface module

The fsunlinsol_spbcgs_mod Fortran module defines interfaces to all sunlinsol spbcgs C functions using the intrinsic iso_c_binding module, which provides a standardized mechanism for interoperating with C. As noted in the C function descriptions above, the interface functions are named after the corresponding C function, but with a leading 'F'. For example, the function SUNLinSol_SPBCGS is interfaced as FSUNLinSol_SPBCGS.

The Fortran 2003 sunlinsol spbcgs interface module can be accessed with the use statement, i.e. use fsunlinsol_spbcgs_mod, and linking to the library libsundials_fsunlinsolspbcgs_mod.lib in addition to the C library.
For details on where the library and module file fsunlinsol_spbcgs_mod.mod are installed, see Appendix A. We note that the module is accessible from the Fortran 2003 sundials integrators without separately linking to the libsundials_fsunlinsolspbcgs_mod library.

FORTRAN 77 interface functions

For solvers that include a Fortran 77 interface module, the sunlinsol spbcgs module also includes a Fortran-callable function for creating a SUNLinearSolver object.

FSUNSPBCGSINIT

Call FSUNSPBCGSINIT(code, pretype, maxl, ier)

Description The function FSUNSPBCGSINIT can be called from Fortran programs to create a sunlinsol spbcgs object.

Arguments code (int*) integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode)
          pretype (int*) flag indicating the desired preconditioning type
          maxl (int*) flag indicating the number of iterations to allow

Return value ier is a completion flag: 0 for a successful return, -1 otherwise. See the printed message for details in case of failure.

Notes This routine must be called after the nvector object has been initialized. Allowable values for pretype and maxl are the same as for the C function SUNLinSol_SPBCGS.

Additionally, when using arkode with a non-identity mass matrix, the sunlinsol spbcgs module includes a Fortran-callable function for creating a SUNLinearSolver mass matrix solver object.

FSUNMASSSPBCGSINIT

Call FSUNMASSSPBCGSINIT(pretype, maxl, ier)

Description The function FSUNMASSSPBCGSINIT can be called from Fortran programs to create a sunlinsol spbcgs object for mass matrix linear systems.

Arguments pretype (int*) flag indicating the desired preconditioning type
          maxl (int*) flag indicating the number of iterations to allow

Return value ier is an int completion flag: 0 for a successful return, -1 otherwise. See the printed message for details in case of failure.
Notes This routine must be called after the nvector object has been initialized. Allowable values for pretype and maxl are the same as for the C function SUNLinSol_SPBCGS. The SUNLinSol_SPBCGSSetPrecType and SUNLinSol_SPBCGSSetMaxl routines also support Fortran interfaces for the system and mass matrix solvers.

FSUNSPBCGSSETPRECTYPE

Call FSUNSPBCGSSETPRECTYPE(code, pretype, ier)

Description The function FSUNSPBCGSSETPRECTYPE can be called from Fortran programs to change the type of preconditioning to use.

Arguments code (int*) integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode)
          pretype (int*) flag indicating the type of preconditioning to use

Return value ier is an int completion flag: 0 for a successful return, -1 otherwise. See the printed message for details in case of failure.

Notes See SUNLinSol_SPBCGSSetPrecType for complete further documentation of this routine.

FSUNMASSSPBCGSSETPRECTYPE

Call FSUNMASSSPBCGSSETPRECTYPE(pretype, ier)

Description The function FSUNMASSSPBCGSSETPRECTYPE can be called from Fortran programs to change the type of preconditioning for mass matrix linear systems.

Arguments The arguments are identical to FSUNSPBCGSSETPRECTYPE above, except that code is not needed since mass matrix linear systems only arise in arkode.

Return value ier is an int completion flag: 0 for a successful return, -1 otherwise. See the printed message for details in case of failure.

Notes See SUNLinSol_SPBCGSSetPrecType for complete further documentation of this routine.

FSUNSPBCGSSETMAXL

Call FSUNSPBCGSSETMAXL(code, maxl, ier)

Description The function FSUNSPBCGSSETMAXL can be called from Fortran programs to change the maximum number of iterations to allow.

Arguments code (int*) integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode)
          maxl (int*) the number of iterations to allow
Return value ier is an int completion flag: 0 for a successful return, -1 otherwise. See the printed message for details in case of failure.

Notes See SUNLinSol_SPBCGSSetMaxl for complete further documentation of this routine.

FSUNMASSSPBCGSSETMAXL

Call FSUNMASSSPBCGSSETMAXL(maxl, ier)

Description The function FSUNMASSSPBCGSSETMAXL can be called from Fortran programs to change the maximum number of iterations allowed for mass matrix linear systems.

Arguments The arguments are identical to FSUNSPBCGSSETMAXL above, except that code is not needed since mass matrix linear systems only arise in arkode.

Return value ier is an int completion flag: 0 for a successful return, -1 otherwise. See the printed message for details in case of failure.

Notes See SUNLinSol_SPBCGSSetMaxl for complete further documentation of this routine.

9.13.4 SUNLinearSolver SPBCGS content

The sunlinsol spbcgs module defines the content field of a SUNLinearSolver as the following structure:

struct _SUNLinearSolverContent_SPBCGS {
  int maxl;
  int pretype;
  int numiters;
  realtype resnorm;
  long int last_flag;
  ATimesFn ATimes;
  void* ATData;
  PSetupFn Psetup;
  PSolveFn Psolve;
  void* PData;
  N_Vector s1;
  N_Vector s2;
  N_Vector r;
  N_Vector r_star;
  N_Vector p;
  N_Vector q;
  N_Vector u;
  N_Vector Ap;
  N_Vector vtemp;
};

These entries of the content field contain the following information:

maxl - number of spbcgs iterations to allow (default is 5),
pretype - flag for type of preconditioning to employ (default is none),
numiters - number of iterations from the most-recent solve,
resnorm - final linear residual norm from the most-recent solve,
last_flag - last error return flag from an internal function,
ATimes - function pointer to perform Av product,
ATData - pointer to structure for ATimes,
Psetup - function pointer to preconditioner setup routine,
Psolve - function pointer to preconditioner solve routine,
PData - pointer to structure for Psetup and Psolve,
s1, s2 - vector pointers for supplied scaling matrices (default is NULL),
r - an nvector which holds the current scaled, preconditioned linear system residual,
r_star - an nvector which holds the initial scaled, preconditioned linear system residual,
p, q, u, Ap, vtemp - nvectors used for workspace by the spbcgs algorithm.

9.14 The SUNLinearSolver SPTFQMR implementation

This section describes the sunlinsol implementation of the sptfqmr (Scaled, Preconditioned, Transpose-Free Quasi-Minimum Residual [23]) iterative linear solver. The sunlinsol sptfqmr module is designed to be compatible with any nvector implementation that supports a minimal subset of operations (N_VClone, N_VDotProd, N_VScale, N_VLinearSum, N_VProd, N_VConst, N_VDiv, and N_VDestroy). Unlike the spgmr and spfgmr algorithms, sptfqmr requires a fixed amount of memory that does not increase with the number of allowed iterations.

To access the sunlinsol sptfqmr module, include the header file sunlinsol/sunlinsol_sptfqmr.h. We note that the sunlinsol sptfqmr module is accessible from sundials packages without separately linking to the libsundials_sunlinsolsptfqmr module library.

9.14.1 SUNLinearSolver SPTFQMR description

This solver is constructed to perform the following operations:

• During construction all nvector solver data is allocated, with vectors cloned from a template nvector that is input, and default solver parameters are set.
• User-facing "set" routines may be called to modify default solver parameters.
• Additional "set" routines are called by the sundials solver that interfaces with sunlinsol sptfqmr to supply the ATimes, PSetup, and Psolve function pointers and the s1 and s2 scaling vectors.
• In the "initialize" call, the solver parameters are checked for validity.
• In the "setup" call, any non-NULL PSetup function is called.
Typically, this is provided by the sundials solver itself, which translates between the generic PSetup function and the solver-specific routine (solver-supplied or user-supplied).
• In the "solve" call the TFQMR iteration is performed. This will include scaling and preconditioning if those options have been supplied.

9.14.2 SUNLinearSolver SPTFQMR functions

The sunlinsol sptfqmr module provides the following user-callable constructor for creating a SUNLinearSolver object.

SUNLinSol_SPTFQMR

Call LS = SUNLinSol_SPTFQMR(y, pretype, maxl);

Description The function SUNLinSol_SPTFQMR creates and allocates memory for a sptfqmr SUNLinearSolver object.

Arguments y (N_Vector) a template for cloning vectors needed within the solver
          pretype (int) flag indicating the desired type of preconditioning; allowed values are:
            • PREC_NONE (0)
            • PREC_LEFT (1)
            • PREC_RIGHT (2)
            • PREC_BOTH (3)
            Any other integer input will result in the default (no preconditioning).
          maxl (int) the number of linear iterations to allow. Values ≤ 0 will result in the default value (5).

Return value This returns a SUNLinearSolver object. If y is incompatible, this routine will return NULL.

Notes This routine will perform consistency checks to ensure that it is called with a consistent nvector implementation (i.e. that it supplies the requisite vector operations). If y is incompatible, then this routine will return NULL.

We note that some sundials solvers are designed to only work with left preconditioning (ida and idas) and others with only right preconditioning (kinsol). While it is possible to configure a sunlinsol sptfqmr object to use any of the preconditioning options with these solvers, this use mode is not supported and may result in inferior performance.

Deprecated Name For backward compatibility, the wrapper function SUNSPTFQMR with identical input and output arguments is also provided.
F2003 Name This function is callable as FSUNLinSol_SPTFQMR when using the Fortran 2003 interface module.

The sunlinsol sptfqmr module defines implementations of all "iterative" linear solver operations listed in Sections 9.1.1 – 9.1.3:

• SUNLinSolGetType_SPTFQMR
• SUNLinSolInitialize_SPTFQMR
• SUNLinSolSetATimes_SPTFQMR
• SUNLinSolSetPreconditioner_SPTFQMR
• SUNLinSolSetScalingVectors_SPTFQMR
• SUNLinSolSetup_SPTFQMR
• SUNLinSolSolve_SPTFQMR
• SUNLinSolNumIters_SPTFQMR
• SUNLinSolResNorm_SPTFQMR
• SUNLinSolResid_SPTFQMR
• SUNLinSolLastFlag_SPTFQMR
• SUNLinSolSpace_SPTFQMR
• SUNLinSolFree_SPTFQMR

All of the listed operations are callable via the Fortran 2003 interface module by prepending an 'F' to the function name. The sunlinsol sptfqmr module also defines the following additional user-callable functions.

SUNLinSol_SPTFQMRSetPrecType

Call retval = SUNLinSol_SPTFQMRSetPrecType(LS, pretype);

Description The function SUNLinSol_SPTFQMRSetPrecType updates the type of preconditioning to use in the sunlinsol sptfqmr object.

Arguments LS (SUNLinearSolver) the sunlinsol sptfqmr object to update
          pretype (int) flag indicating the desired type of preconditioning; allowed values match those discussed in SUNLinSol_SPTFQMR.

Return value This routine will return with one of the error codes SUNLS_ILL_INPUT (illegal pretype), SUNLS_MEM_NULL (S is NULL), or SUNLS_SUCCESS.

Deprecated Name For backward compatibility, the wrapper function SUNSPTFQMRSetPrecType with identical input and output arguments is also provided.

F2003 Name This function is callable as FSUNLinSol_SPTFQMRSetPrecType when using the Fortran 2003 interface module.

SUNLinSol_SPTFQMRSetMaxl

Call retval = SUNLinSol_SPTFQMRSetMaxl(LS, maxl);

Description The function SUNLinSol_SPTFQMRSetMaxl updates the number of linear solver iterations to allow.
Arguments LS (SUNLinearSolver) the sunlinsol sptfqmr object to update
          maxl (int) flag indicating the number of iterations to allow; values ≤ 0 will result in the default value (5)

Return value This routine will return with one of the error codes SUNLS_MEM_NULL (S is NULL) or SUNLS_SUCCESS.

Deprecated Name For backward compatibility, the wrapper function SUNSPTFQMRSetMaxl with identical input and output arguments is also provided.

F2003 Name This function is callable as FSUNLinSol_SPTFQMRSetMaxl when using the Fortran 2003 interface module.

9.14.3 SUNLinearSolver SPTFQMR Fortran interfaces

The sunlinsol sptfqmr module provides a Fortran 2003 module as well as Fortran 77 style interface functions for use from Fortran applications.

FORTRAN 2003 interface module

The fsunlinsol_sptfqmr_mod Fortran module defines interfaces to all sunlinsol sptfqmr C functions using the intrinsic iso_c_binding module, which provides a standardized mechanism for interoperating with C. As noted in the C function descriptions above, the interface functions are named after the corresponding C function, but with a leading 'F'. For example, the function SUNLinSol_SPTFQMR is interfaced as FSUNLinSol_SPTFQMR.

The Fortran 2003 sunlinsol sptfqmr interface module can be accessed with the use statement, i.e. use fsunlinsol_sptfqmr_mod, and linking to the library libsundials_fsunlinsolsptfqmr_mod.lib in addition to the C library. For details on where the library and module file fsunlinsol_sptfqmr_mod.mod are installed, see Appendix A. We note that the module is accessible from the Fortran 2003 sundials integrators without separately linking to the libsundials_fsunlinsolsptfqmr_mod library.

FORTRAN 77 interface functions

For solvers that include a Fortran 77 interface module, the sunlinsol sptfqmr module also includes a Fortran-callable function for creating a SUNLinearSolver object.

FSUNSPTFQMRINIT

Call FSUNSPTFQMRINIT(code, pretype, maxl, ier)

Description The function FSUNSPTFQMRINIT can be called from Fortran programs to create a sunlinsol sptfqmr object.
Arguments code (int*) integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode)
          pretype (int*) flag indicating the desired preconditioning type
          maxl (int*) flag indicating the number of iterations to allow

Return value ier is a completion flag: 0 for a successful return, -1 otherwise. See the printed message for details in case of failure.

Notes This routine must be called after the nvector object has been initialized. Allowable values for pretype and maxl are the same as for the C function SUNLinSol_SPTFQMR.

Additionally, when using arkode with a non-identity mass matrix, the sunlinsol sptfqmr module includes a Fortran-callable function for creating a SUNLinearSolver mass matrix solver object.

FSUNMASSSPTFQMRINIT

Call FSUNMASSSPTFQMRINIT(pretype, maxl, ier)

Description The function FSUNMASSSPTFQMRINIT can be called from Fortran programs to create a sunlinsol sptfqmr object for mass matrix linear systems.

Arguments pretype (int*) flag indicating the desired preconditioning type
          maxl (int*) flag indicating the number of iterations to allow

Return value ier is an int completion flag: 0 for a successful return, -1 otherwise. See the printed message for details in case of failure.

Notes This routine must be called after the nvector object has been initialized. Allowable values for pretype and maxl are the same as for the C function SUNLinSol_SPTFQMR. The SUNLinSol_SPTFQMRSetPrecType and SUNLinSol_SPTFQMRSetMaxl routines also support Fortran interfaces for the system and mass matrix solvers.

FSUNSPTFQMRSETPRECTYPE

Call FSUNSPTFQMRSETPRECTYPE(code, pretype, ier)

Description The function FSUNSPTFQMRSETPRECTYPE can be called from Fortran programs to change the type of preconditioning to use.

Arguments code (int*) integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode)
          pretype (int*) flag indicating the type of preconditioning to use

Return value ier is an int completion flag: 0 for a successful return, -1 otherwise. See the printed message for details in case of failure.

Notes See SUNLinSol_SPTFQMRSetPrecType for complete further documentation of this routine.

FSUNMASSSPTFQMRSETPRECTYPE

Call FSUNMASSSPTFQMRSETPRECTYPE(pretype, ier)

Description The function FSUNMASSSPTFQMRSETPRECTYPE can be called from Fortran programs to change the type of preconditioning for mass matrix linear systems.

Arguments The arguments are identical to FSUNSPTFQMRSETPRECTYPE above, except that code is not needed since mass matrix linear systems only arise in arkode.

Return value ier is an int completion flag: 0 for a successful return, -1 otherwise. See the printed message for details in case of failure.

Notes See SUNLinSol_SPTFQMRSetPrecType for complete further documentation of this routine.

FSUNSPTFQMRSETMAXL

Call FSUNSPTFQMRSETMAXL(code, maxl, ier)

Description The function FSUNSPTFQMRSETMAXL can be called from Fortran programs to change the maximum number of iterations to allow.

Arguments code (int*) integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode)
          maxl (int*) the number of iterations to allow

Return value ier is an int completion flag: 0 for a successful return, -1 otherwise. See the printed message for details in case of failure.

Notes See SUNLinSol_SPTFQMRSetMaxl for complete further documentation of this routine.

FSUNMASSSPTFQMRSETMAXL

Call FSUNMASSSPTFQMRSETMAXL(maxl, ier)

Description The function FSUNMASSSPTFQMRSETMAXL can be called from Fortran programs to change the maximum number of iterations allowed for mass matrix linear systems.

Arguments The arguments are identical to FSUNSPTFQMRSETMAXL above, except that code is not needed since mass matrix linear systems only arise in arkode.
Return value ier is an int completion flag: 0 for a successful return, -1 otherwise. See the printed message for details in case of failure.

Notes See SUNLinSol_SPTFQMRSetMaxl for complete further documentation of this routine.

9.14.4 SUNLinearSolver SPTFQMR content

The sunlinsol sptfqmr module defines the content field of a SUNLinearSolver as the following structure:

struct _SUNLinearSolverContent_SPTFQMR {
  int maxl;
  int pretype;
  int numiters;
  realtype resnorm;
  long int last_flag;
  ATimesFn ATimes;
  void* ATData;
  PSetupFn Psetup;
  PSolveFn Psolve;
  void* PData;
  N_Vector s1;
  N_Vector s2;
  N_Vector r_star;
  N_Vector q;
  N_Vector d;
  N_Vector v;
  N_Vector p;
  N_Vector *r;
  N_Vector u;
  N_Vector vtemp1;
  N_Vector vtemp2;
  N_Vector vtemp3;
};

These entries of the content field contain the following information:

maxl - number of TFQMR iterations to allow (default is 5),
pretype - flag for type of preconditioning to employ (default is none),
numiters - number of iterations from the most-recent solve,
resnorm - final linear residual norm from the most-recent solve,
last_flag - last error return flag from an internal function,
ATimes - function pointer to perform Av product,
ATData - pointer to structure for ATimes,
Psetup - function pointer to preconditioner setup routine,
Psolve - function pointer to preconditioner solve routine,
PData - pointer to structure for Psetup and Psolve,
s1, s2 - vector pointers for supplied scaling matrices (default is NULL),
r_star - an nvector which holds the initial scaled, preconditioned linear system residual,
q, d, v, p, u - nvectors used for workspace by the SPTFQMR algorithm,
r - array of two nvectors used for workspace within the SPTFQMR algorithm,
vtemp1, vtemp2, vtemp3 - temporary vector storage.

9.15 The SUNLinearSolver PCG implementation

This section describes the sunlinsol implementation of the pcg (Preconditioned Conjugate Gradient [24]) iterative linear solver.
The sunlinsol pcg module is designed to be compatible with any nvector implementation that supports a minimal subset of operations (N_VClone, N_VDotProd, N_VScale, N_VLinearSum, N_VProd, and N_VDestroy). Unlike the spgmr and spfgmr algorithms, pcg requires a fixed amount of memory that does not increase with the number of allowed iterations.

To access the sunlinsol pcg module, include the header file sunlinsol/sunlinsol_pcg.h. We note that the sunlinsol pcg module is accessible from sundials packages without separately linking to the libsundials_sunlinsolpcg module library.

9.15.1 SUNLinearSolver PCG description

Unlike all of the other iterative linear solvers supplied with sundials, pcg should only be used on symmetric linear systems (e.g. mass matrix linear systems encountered in arkode). As a result, the general explanation of the role of the scaling and preconditioning matrices must be modified in this scenario. The pcg algorithm solves a linear system Ax = b where A is a symmetric (A^T = A), real-valued matrix. Preconditioning is allowed, and is applied in a symmetric fashion on both the right and left. Scaling is also allowed and is applied symmetrically. We denote the preconditioner and scaling matrices as follows:

• P is the preconditioner (assumed symmetric),
• S is a diagonal matrix of scale factors.

The matrices A and P are not required explicitly; only routines that provide A and P^{-1} as operators are required. The diagonal of the matrix S is held in a single nvector, supplied by the user.

In this notation, pcg applies the underlying CG algorithm to the equivalent transformed system

    \tilde{A} \tilde{x} = \tilde{b}   (9.4)

where

    \tilde{A} = S P^{-1} A P^{-1} S, \qquad
    \tilde{b} = S P^{-1} b, \qquad
    \tilde{x} = S^{-1} P x.   (9.5)

The scaling matrix must be chosen so that the vectors S P^{-1} b and S^{-1} P x have dimensionless components.
The stopping test for the PCG iterations is on the L2 norm of the scaled preconditioned residual:

    \| \tilde{b} - \tilde{A}\tilde{x} \|_2 < \delta
    \;\Leftrightarrow\;
    \| S P^{-1} b - S P^{-1} A x \|_2 < \delta
    \;\Leftrightarrow\;
    \| P^{-1} b - P^{-1} A x \|_S < \delta,

where \|v\|_S = \sqrt{v^T S^T S v}, with an input tolerance \delta.

This solver is constructed to perform the following operations:

• During construction all nvector solver data is allocated, with vectors cloned from a template nvector that is input, and default solver parameters are set.
• User-facing "set" routines may be called to modify default solver parameters.
• Additional "set" routines are called by the sundials solver that interfaces with sunlinsol pcg to supply the ATimes, PSetup, and Psolve function pointers and the s scaling vector.
• In the "initialize" call, the solver parameters are checked for validity.
• In the "setup" call, any non-NULL PSetup function is called. Typically, this is provided by the sundials solver itself, which translates between the generic PSetup function and the solver-specific routine (solver-supplied or user-supplied).
• In the "solve" call the pcg iteration is performed. This will include scaling and preconditioning if those options have been supplied.

9.15.2 SUNLinearSolver PCG functions

The sunlinsol pcg module provides the following user-callable constructor for creating a SUNLinearSolver object.

SUNLinSol_PCG

Call LS = SUNLinSol_PCG(y, pretype, maxl);

Description The function SUNLinSol_PCG creates and allocates memory for a pcg SUNLinearSolver object.

Arguments y (N_Vector) a template for cloning vectors needed within the solver
          pretype (int) flag indicating whether to use preconditioning. Since the pcg algorithm is designed to only support symmetric preconditioning, any of the pretype inputs PREC_LEFT (1), PREC_RIGHT (2), or PREC_BOTH (3) will result in use of the symmetric preconditioner; any other integer input will result in the default (no preconditioning).
maxl (int) the number of linear iterations to allow; values ≤ 0 will result in the default value (5).

Return value This returns a SUNLinearSolver object. If y is incompatible, this routine will return NULL.

Notes This routine will perform consistency checks to ensure that it is called with a consistent nvector implementation (i.e. that it supplies the requisite vector operations). Although some sundials solvers are designed to only work with left preconditioning (ida and idas) and others with only right preconditioning (kinsol), pcg should only be used with these packages when the linear systems are known to be symmetric. Since the scaling of matrix rows and columns must be identical in a symmetric matrix, symmetric preconditioning should work appropriately even for packages designed with one-sided preconditioning in mind.

Deprecated Name For backward compatibility, the wrapper function SUNPCG with identical input and output arguments is also provided.

F2003 Name This function is callable as FSUNLinSol PCG when using the Fortran 2003 interface module.

The sunlinsol pcg module defines implementations of all “iterative” linear solver operations listed in Sections 9.1.1 – 9.1.3:

• SUNLinSolGetType PCG
• SUNLinSolInitialize PCG
• SUNLinSolSetATimes PCG
• SUNLinSolSetPreconditioner PCG
• SUNLinSolSetScalingVectors PCG – since pcg only supports symmetric scaling, the second nvector argument to this function is ignored
• SUNLinSolSetup PCG
• SUNLinSolSolve PCG
• SUNLinSolNumIters PCG
• SUNLinSolResNorm PCG
• SUNLinSolResid PCG
• SUNLinSolLastFlag PCG
• SUNLinSolSpace PCG
• SUNLinSolFree PCG

All of the listed operations are callable via the Fortran 2003 interface module by prepending an ‘F’ to the function name. The sunlinsol pcg module also defines the following additional user-callable functions.
SUNLinSol PCGSetPrecType

Call retval = SUNLinSol_PCGSetPrecType(LS, pretype);

Description The function SUNLinSol PCGSetPrecType updates the flag indicating use of preconditioning in the sunlinsol pcg object.

Arguments
LS (SUNLinearSolver) the sunlinsol pcg object to update
pretype (int) flag indicating use of preconditioning; allowed values match those discussed in SUNLinSol PCG.

Return value This routine will return with one of the error codes SUNLS ILL INPUT (illegal pretype), SUNLS MEM NULL (LS is NULL), or SUNLS SUCCESS.

Deprecated Name For backward compatibility, the wrapper function SUNPCGSetPrecType with identical input and output arguments is also provided.

F2003 Name This function is callable as FSUNLinSol PCGSetPrecType when using the Fortran 2003 interface module.

SUNLinSol PCGSetMaxl

Call retval = SUNLinSol_PCGSetMaxl(LS, maxl);

Description The function SUNLinSol PCGSetMaxl updates the number of linear solver iterations to allow.

Arguments
LS (SUNLinearSolver) the sunlinsol pcg object to update
maxl (int) the number of iterations to allow; values ≤ 0 will result in the default value (5)

Return value This routine will return with one of the error codes SUNLS MEM NULL (LS is NULL) or SUNLS SUCCESS.

Deprecated Name For backward compatibility, the wrapper function SUNPCGSetMaxl with identical input and output arguments is also provided.

F2003 Name This function is callable as FSUNLinSol PCGSetMaxl when using the Fortran 2003 interface module.

9.15.3 SUNLinearSolver PCG Fortran interfaces

The sunlinsol pcg module provides a Fortran 2003 module as well as Fortran 77 style interface functions for use from Fortran applications.

FORTRAN 2003 interface module

The fsunlinsol_pcg_mod Fortran module defines interfaces to all sunlinsol pcg C functions using the intrinsic iso_c_binding module, which provides a standardized mechanism for interoperating with C.
As noted in the C function descriptions above, the interface functions are named after the corresponding C function, but with a leading ‘F’. For example, the function SUNLinSol PCG is interfaced as FSUNLinSol PCG.

The Fortran 2003 sunlinsol pcg interface module can be accessed with the use statement, i.e. use fsunlinsol_pcg_mod, and linking to the library libsundials_fsunlinsolpcg_mod.lib in addition to the C library. For details on where the library and module file fsunlinsol_pcg_mod.mod are installed, see Appendix A. We note that the module is accessible from the Fortran 2003 sundials integrators without separately linking to the libsundials_fsunlinsolpcg_mod library.

FORTRAN 77 interface functions

For solvers that include a Fortran 77 interface module, the sunlinsol pcg module also includes a Fortran-callable function for creating a SUNLinearSolver object.

FSUNPCGINIT

Call FSUNPCGINIT(code, pretype, maxl, ier)

Description The function FSUNPCGINIT can be called from Fortran programs to create a sunlinsol pcg object.

Arguments
code (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
pretype (int*) flag indicating desired preconditioning type
maxl (int*) flag indicating number of iterations to allow

Return value ier is a return completion flag equal to 0 for a successful return and -1 otherwise. See printed message for details in case of failure.

Notes This routine must be called after the nvector object has been initialized. Allowable values for pretype and maxl are the same as for the C function SUNLinSol PCG.

Additionally, when using arkode with a non-identity mass matrix, the sunlinsol pcg module includes a Fortran-callable function for creating a SUNLinearSolver mass matrix solver object.
FSUNMASSPCGINIT

Call FSUNMASSPCGINIT(pretype, maxl, ier)

Description The function FSUNMASSPCGINIT can be called from Fortran programs to create a sunlinsol pcg object for mass matrix linear systems.

Arguments
pretype (int*) flag indicating desired preconditioning type
maxl (int*) flag indicating number of iterations to allow

Return value ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See printed message for details in case of failure.

Notes This routine must be called after the nvector object has been initialized. Allowable values for pretype and maxl are the same as for the C function SUNLinSol PCG.

The SUNLinSol PCGSetPrecType and SUNLinSol PCGSetMaxl routines also support Fortran interfaces for the system and mass matrix solvers.

FSUNPCGSETPRECTYPE

Call FSUNPCGSETPRECTYPE(code, pretype, ier)

Description The function FSUNPCGSETPRECTYPE can be called from Fortran programs to change the type of preconditioning to use.

Arguments
code (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
pretype (int*) flag indicating the type of preconditioning to use.

Return value ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See printed message for details in case of failure.

Notes See SUNLinSol PCGSetPrecType for complete documentation of this routine.

FSUNMASSPCGSETPRECTYPE

Call FSUNMASSPCGSETPRECTYPE(pretype, ier)

Description The function FSUNMASSPCGSETPRECTYPE can be called from Fortran programs to change the type of preconditioning for mass matrix linear systems.

Arguments The arguments are identical to FSUNPCGSETPRECTYPE above, except that code is not needed since mass matrix linear systems only arise in arkode.

Return value ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See printed message for details in case of failure.
Notes See SUNLinSol PCGSetPrecType for complete documentation of this routine.

FSUNPCGSETMAXL

Call FSUNPCGSETMAXL(code, maxl, ier)

Description The function FSUNPCGSETMAXL can be called from Fortran programs to change the maximum number of iterations to allow.

Arguments
code (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, 3 for kinsol, and 4 for arkode).
maxl (int*) the number of iterations to allow.

Return value ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See printed message for details in case of failure.

Notes See SUNLinSol PCGSetMaxl for complete documentation of this routine.

FSUNMASSPCGSETMAXL

Call FSUNMASSPCGSETMAXL(maxl, ier)

Description The function FSUNMASSPCGSETMAXL can be called from Fortran programs to change the maximum number of iterations for mass matrix linear systems.

Arguments The arguments are identical to FSUNPCGSETMAXL above, except that code is not needed since mass matrix linear systems only arise in arkode.

Return value ier is an int return completion flag equal to 0 for a successful return and -1 otherwise. See printed message for details in case of failure.

Notes See SUNLinSol PCGSetMaxl for complete documentation of this routine.
9.15.4 SUNLinearSolver PCG content

The sunlinsol pcg module defines the content field of a SUNLinearSolver as the following structure:

struct _SUNLinearSolverContent_PCG {
  int maxl;
  int pretype;
  int numiters;
  realtype resnorm;
  long int last_flag;
  ATimesFn ATimes;
  void* ATData;
  PSetupFn Psetup;
  PSolveFn Psolve;
  void* PData;
  N_Vector s;
  N_Vector r;
  N_Vector p;
  N_Vector z;
  N_Vector Ap;
};

These entries of the content field contain the following information:

maxl - number of pcg iterations to allow (default is 5),
pretype - flag for use of preconditioning (default is none),
numiters - number of iterations from the most-recent solve,
resnorm - final linear residual norm from the most-recent solve,
last_flag - last error return flag from an internal function,
ATimes - function pointer to perform the Av product,
ATData - pointer to structure for ATimes,
Psetup - function pointer to preconditioner setup routine,
Psolve - function pointer to preconditioner solve routine,
PData - pointer to structure for Psetup and Psolve,
s - vector pointer for supplied scaling matrix (default is NULL),
r - an nvector which holds the preconditioned linear system residual,
p, z, Ap - nvectors used as workspace by the pcg algorithm.

9.16 SUNLinearSolver Examples

There are SUNLinearSolver examples that may be installed for each implementation; these make use of the functions in test_sunlinsol.c. These example functions show simple usage of the SUNLinearSolver family of functions. The inputs to the examples depend on the linear solver type, and are output to stdout if the example is run without the appropriate number of command-line arguments. The following is a list of the example functions in test_sunlinsol.c:

• Test_SUNLinSolGetType: Verifies the returned solver type against the value that should be returned.
• Test_SUNLinSolInitialize: Verifies that SUNLinSolInitialize can be called and returns successfully.
• Test_SUNLinSolSetup: Verifies that SUNLinSolSetup can be called and returns successfully.
• Test_SUNLinSolSolve: Given a sunmatrix object A, nvector objects x and b (where Ax = b) and a desired solution tolerance tol, this routine clones x into a new vector y, calls SUNLinSolSolve to fill y as the solution to Ay = b (to the input tolerance), verifies that each entry in x and y match to within 10*tol, and overwrites x with y prior to returning (in case the calling routine would like to investigate further).
• Test_SUNLinSolSetATimes (iterative solvers only): Verifies that SUNLinSolSetATimes can be called and returns successfully.
• Test_SUNLinSolSetPreconditioner (iterative solvers only): Verifies that SUNLinSolSetPreconditioner can be called and returns successfully.
• Test_SUNLinSolSetScalingVectors (iterative solvers only): Verifies that SUNLinSolSetScalingVectors can be called and returns successfully.
• Test_SUNLinSolLastFlag: Verifies that SUNLinSolLastFlag can be called, and outputs the result to stdout.
• Test_SUNLinSolNumIters (iterative solvers only): Verifies that SUNLinSolNumIters can be called, and outputs the result to stdout.
• Test_SUNLinSolResNorm (iterative solvers only): Verifies that SUNLinSolResNorm can be called, and that the result is non-negative.
• Test_SUNLinSolResid (iterative solvers only): Verifies that SUNLinSolResid can be called.
• Test_SUNLinSolSpace: Verifies that SUNLinSolSpace can be called, and outputs the results to stdout.

We note that these tests should be performed in a particular order. For either direct or iterative linear solvers, Test_SUNLinSolInitialize must be called before Test_SUNLinSolSetup, which must be called before Test_SUNLinSolSolve.
Additionally, for iterative linear solvers, Test_SUNLinSolSetATimes, Test_SUNLinSolSetPreconditioner and Test_SUNLinSolSetScalingVectors should be called before Test_SUNLinSolInitialize; similarly Test_SUNLinSolNumIters, Test_SUNLinSolResNorm and Test_SUNLinSolResid should be called after Test_SUNLinSolSolve. These are called in the appropriate order in all of the example problems.

Chapter 10 Description of the SUNNonlinearSolver module

sundials time integration packages are written in terms of generic nonlinear solver operations defined by the sunnonlinsol API and implemented by a particular sunnonlinsol module of type SUNNonlinearSolver. Users can supply their own sunnonlinsol module, or use one of the modules provided with sundials. The time integrators in sundials specify a default nonlinear solver module, and as such this chapter is intended for users that wish to use a non-default nonlinear solver module or would like to provide their own nonlinear solver implementation. Users interested in using a non-default solver module may skip the description of the sunnonlinsol API in section 10.1 and proceed to the subsequent sections in this chapter that describe the sunnonlinsol modules provided with sundials.

For users interested in providing their own sunnonlinsol module, the following section presents the sunnonlinsol API and its implementation, beginning with the definition of sunnonlinsol functions in sections 10.1.1 – 10.1.3. This is followed by the definition of functions supplied to a nonlinear solver implementation in section 10.1.4. A table of nonlinear solver return codes is given in section 10.1.5. The SUNNonlinearSolver type and the generic sunnonlinsol module are defined in section 10.1.6. Section 10.1.7 describes how sunnonlinsol modules interface with sundials integrators providing sensitivity analysis capabilities (cvodes and idas). Finally, section 10.1.8 lists the requirements for supplying a custom sunnonlinsol module.
Users wishing to supply their own sunnonlinsol module are encouraged to use the sunnonlinsol implementations provided with sundials as a template for supplying custom nonlinear solver modules.

10.1 The SUNNonlinearSolver API

The sunnonlinsol API defines several nonlinear solver operations that enable sundials integrators to utilize any sunnonlinsol implementation that provides the required functions. These functions can be divided into three categories. The first are the core nonlinear solver functions. The second group of functions consists of set routines to supply the nonlinear solver with functions provided by the sundials time integrators and to modify solver parameters. The final group consists of get routines for retrieving nonlinear solver statistics. All of these functions are defined in the header file sundials/sundials_nonlinearsolver.h.

10.1.1 SUNNonlinearSolver core functions

The core nonlinear solver functions consist of two required functions to get the nonlinear solver type (SUNNonlinSolGetType) and solve the nonlinear system (SUNNonlinSolSolve). The remaining three functions for nonlinear solver initialization (SUNNonlinSolInitialize), setup (SUNNonlinSolSetup), and destruction (SUNNonlinSolFree) are optional.

SUNNonlinSolGetType

Call type = SUNNonlinSolGetType(NLS);

Description The required function SUNNonlinSolGetType returns the nonlinear solver type.

Arguments NLS (SUNNonlinearSolver) a sunnonlinsol object.

Return value The return value type (of type int) will be one of the following:
SUNNONLINEARSOLVER_ROOTFIND 0, the sunnonlinsol module solves F (y) = 0.
SUNNONLINEARSOLVER_FIXEDPOINT 1, the sunnonlinsol module solves G(y) = y.

SUNNonlinSolInitialize

Call retval = SUNNonlinSolInitialize(NLS);

Description The optional function SUNNonlinSolInitialize performs nonlinear solver initialization and may perform any necessary memory allocations.

Arguments NLS (SUNNonlinearSolver) a sunnonlinsol object.
Return value The return value retval (of type int) is zero for a successful call and a negative value for a failure.

Notes It is assumed all solver-specific options have been set prior to calling SUNNonlinSolInitialize. sunnonlinsol implementations that do not require initialization may set this operation to NULL.

SUNNonlinSolSetup

Call retval = SUNNonlinSolSetup(NLS, y, mem);

Description The optional function SUNNonlinSolSetup performs any solver setup needed for a nonlinear solve.

Arguments
NLS (SUNNonlinearSolver) a sunnonlinsol object.
y (N Vector) the initial iterate passed to the nonlinear solver.
mem (void *) the sundials integrator memory structure.

Return value The return value retval (of type int) is zero for a successful call and a negative value for a failure.

Notes sundials integrators call SUNNonlinSolSetup before each step attempt. sunnonlinsol implementations that do not require setup may set this operation to NULL.

SUNNonlinSolSolve

Call retval = SUNNonlinSolSolve(NLS, y0, y, w, tol, callLSetup, mem);

Description The required function SUNNonlinSolSolve solves the nonlinear system F (y) = 0 or G(y) = y.

Arguments
NLS (SUNNonlinearSolver) a sunnonlinsol object.
y0 (N Vector) the initial iterate for the nonlinear solve. This must remain unchanged throughout the solution process.
y (N Vector) the solution to the nonlinear system.
w (N Vector) the solution error weight vector used for computing weighted error norms.
tol (realtype) the requested solution tolerance in the weighted root-mean-squared norm.
callLSetup (booleantype) a flag indicating that the integrator recommends for the linear solver setup function to be called.
mem (void *) the sundials integrator memory structure.

Return value The return value retval (of type int) is zero for a successful solve, a positive value for a recoverable error, and a negative value for an unrecoverable error.
SUNNonlinSolFree

Call retval = SUNNonlinSolFree(NLS);

Description The optional function SUNNonlinSolFree frees any memory allocated by the nonlinear solver.

Arguments NLS (SUNNonlinearSolver) a sunnonlinsol object.

Return value The return value retval (of type int) should be zero for a successful call, and a negative value for a failure.

Notes sunnonlinsol implementations that do not allocate data may set this operation to NULL.

10.1.2 SUNNonlinearSolver set functions

The following set functions are used to supply nonlinear solver modules with functions defined by the sundials integrators and to modify solver parameters. Only the routine for setting the nonlinear system defining function (SUNNonlinSolSetSysFn) is required. All other set functions are optional.

SUNNonlinSolSetSysFn

Call retval = SUNNonlinSolSetSysFn(NLS, SysFn);

Description The required function SUNNonlinSolSetSysFn is used to provide the nonlinear solver with the function defining the nonlinear system. This is the function F (y) in F (y) = 0 for SUNNONLINEARSOLVER_ROOTFIND modules or G(y) in G(y) = y for SUNNONLINEARSOLVER_FIXEDPOINT modules.

Arguments
NLS (SUNNonlinearSolver) a sunnonlinsol object.
SysFn (SUNNonlinSolSysFn) the function defining the nonlinear system. See section 10.1.4 for the definition of SUNNonlinSolSysFn.

Return value The return value retval (of type int) should be zero for a successful call, and a negative value for a failure.

SUNNonlinSolSetLSetupFn

Call retval = SUNNonlinSolSetLSetupFn(NLS, LSetupFn);

Description The optional function SUNNonlinSolSetLSetupFn is called by sundials integrators to provide the nonlinear solver with access to its linear solver setup function.

Arguments
NLS (SUNNonlinearSolver) a sunnonlinsol object.
LSetupFn (SUNNonlinSolLSetupFn) a wrapper function to the sundials integrator’s linear solver setup function. See section 10.1.4 for the definition of SUNNonlinSolLSetupFn.
Return value The return value retval (of type int) should be zero for a successful call, and a negative value for a failure.

Notes The SUNNonlinSolLSetupFn function sets up the linear system Ax = b, where A = ∂F/∂y is the linearization of the nonlinear residual function F (y) = 0 (when using sunlinsol direct linear solvers), or calls the user-defined preconditioner setup function (when using sunlinsol iterative linear solvers). sunnonlinsol implementations that do not require solving this system, do not utilize sunlinsol linear solvers, or use sunlinsol linear solvers that do not require setup may set this operation to NULL.

SUNNonlinSolSetLSolveFn

Call retval = SUNNonlinSolSetLSolveFn(NLS, LSolveFn);

Description The optional function SUNNonlinSolSetLSolveFn is called by sundials integrators to provide the nonlinear solver with access to its linear solver solve function.

Arguments
NLS (SUNNonlinearSolver) a sunnonlinsol object
LSolveFn (SUNNonlinSolLSolveFn) a wrapper function to the sundials integrator’s linear solver solve function. See section 10.1.4 for the definition of SUNNonlinSolLSolveFn.

Return value The return value retval (of type int) should be zero for a successful call, and a negative value for a failure.

Notes The SUNNonlinSolLSolveFn function solves the linear system Ax = b, where A = ∂F/∂y is the linearization of the nonlinear residual function F (y) = 0. sunnonlinsol implementations that do not require solving this system or do not use sunlinsol linear solvers may set this operation to NULL.

SUNNonlinSolSetConvTestFn

Call retval = SUNNonlinSolSetConvTestFn(NLS, CTestFn);

Description The optional function SUNNonlinSolSetConvTestFn is used to provide the nonlinear solver with a function for determining if the nonlinear solver iteration has converged. This is typically called by sundials integrators to define their nonlinear convergence criteria, but may be replaced by the user.
Arguments
NLS (SUNNonlinearSolver) a sunnonlinsol object.
CTestFn (SUNNonlinSolConvTestFn) a sundials integrator’s nonlinear solver convergence test function. See section 10.1.4 for the definition of SUNNonlinSolConvTestFn.

Return value The return value retval (of type int) should be zero for a successful call, and a negative value for a failure.

Notes sunnonlinsol implementations utilizing their own convergence test criteria may set this function to NULL.

SUNNonlinSolSetMaxIters

Call retval = SUNNonlinSolSetMaxIters(NLS, maxiters);

Description The optional function SUNNonlinSolSetMaxIters sets the maximum number of nonlinear solver iterations. This is typically called by sundials integrators to define their default iteration limit, but may be adjusted by the user.

Arguments
NLS (SUNNonlinearSolver) a sunnonlinsol object.
maxiters (int) the maximum number of nonlinear iterations.

Return value The return value retval (of type int) should be zero for a successful call, and a negative value for a failure (e.g., maxiters < 1).

10.1.3 SUNNonlinearSolver get functions

The following get functions allow sundials integrators to retrieve nonlinear solver statistics. The routines to get the current total number of iterations (SUNNonlinSolGetNumIters) and number of convergence failures (SUNNonlinSolGetNumConvFails) are optional. The routine to get the current nonlinear solver iteration (SUNNonlinSolGetCurIter) is required when using the convergence test provided by the sundials integrator or by the arkode and cvode linear solver interfaces. Otherwise, SUNNonlinSolGetCurIter is optional.

SUNNonlinSolGetNumIters

Call retval = SUNNonlinSolGetNumIters(NLS, numiters);

Description The optional function SUNNonlinSolGetNumIters returns the total number of nonlinear solver iterations. This is typically called by the sundials integrator to store the nonlinear solver statistics, but may also be called by the user.
Arguments
NLS (SUNNonlinearSolver) a sunnonlinsol object
numiters (long int*) the total number of nonlinear solver iterations.

Return value The return value retval (of type int) should be zero for a successful call, and a negative value for a failure.

SUNNonlinSolGetCurIter

Call retval = SUNNonlinSolGetCurIter(NLS, iter);

Description The function SUNNonlinSolGetCurIter returns the iteration index of the current nonlinear solve. This function is required when using sundials integrator-provided convergence tests or when using a sunlinsol spils linear solver; otherwise it is optional.

Arguments
NLS (SUNNonlinearSolver) a sunnonlinsol object
iter (int*) the nonlinear solver iteration in the current solve, starting from zero.

Return value The return value retval (of type int) should be zero for a successful call, and a negative value for a failure.

SUNNonlinSolGetNumConvFails

Call retval = SUNNonlinSolGetNumConvFails(NLS, nconvfails);

Description The optional function SUNNonlinSolGetNumConvFails returns the total number of nonlinear solver convergence failures. This may be called by the sundials integrator to store the nonlinear solver statistics, but may also be called by the user.

Arguments
NLS (SUNNonlinearSolver) a sunnonlinsol object
nconvfails (long int*) the total number of nonlinear solver convergence failures.

Return value The return value retval (of type int) should be zero for a successful call, and a negative value for a failure.

10.1.4 Functions provided by SUNDIALS integrators

To interface with sunnonlinsol modules, the sundials integrators supply a variety of routines for evaluating the nonlinear system, calling the sunlinsol setup and solve functions, and testing the nonlinear iteration for convergence. These integrator-provided routines translate between the user-supplied ODE or DAE systems and the generic interfaces to the nonlinear or linear systems of equations that result in their solution.
The types for functions provided to a sunnonlinsol module are defined in the header file sundials/sundials_nonlinearsolver.h, and are described below.

SUNNonlinSolSysFn

Definition typedef int (*SUNNonlinSolSysFn)(N_Vector y, N_Vector F, void* mem);

Purpose These functions evaluate the nonlinear system F (y) for SUNNONLINEARSOLVER_ROOTFIND type modules or G(y) for SUNNONLINEARSOLVER_FIXEDPOINT type modules. Memory for F must be allocated prior to calling this function. The vector y must be left unchanged.

Arguments
y is the state vector at which the nonlinear system should be evaluated.
F is the output vector containing F (y) or G(y), depending on the solver type.
mem is the sundials integrator memory structure.

Return value The return value retval (of type int) is zero for a successful evaluation, a positive value for a recoverable error, and a negative value for an unrecoverable error.

SUNNonlinSolLSetupFn

Definition typedef int (*SUNNonlinSolLSetupFn)(N_Vector y, N_Vector F, booleantype jbad, booleantype* jcur, void* mem);

Purpose These functions are wrappers to the sundials integrator’s function for setting up linear solves with sunlinsol modules.

Arguments
y is the state vector at which the linear system should be set up.
F is the value of the nonlinear system function at y.
jbad is an input indicating whether the nonlinear solver believes that A has gone stale (SUNTRUE) or not (SUNFALSE).
jcur is an output indicating whether the routine has updated the Jacobian A (SUNTRUE) or not (SUNFALSE).
mem is the sundials integrator memory structure.

Return value The return value retval (of type int) is zero for a successful call, a positive value for a recoverable error, and a negative value for an unrecoverable error.
Notes The SUNNonlinSolLSetupFn function sets up the linear system Ax = b, where A = ∂F/∂y is the linearization of the nonlinear residual function F (y) = 0 (when using sunlinsol direct linear solvers), or calls the user-defined preconditioner setup function (when using sunlinsol iterative linear solvers). sunnonlinsol implementations that do not require solving this system, do not utilize sunlinsol linear solvers, or use sunlinsol linear solvers that do not require setup may ignore these functions.

SUNNonlinSolLSolveFn

Definition typedef int (*SUNNonlinSolLSolveFn)(N_Vector y, N_Vector b, void* mem);

Purpose These functions are wrappers to the sundials integrator’s function for solving linear systems with sunlinsol modules.

Arguments
y is the input vector containing the current nonlinear iterate.
b contains the right-hand side vector for the linear solve on input and the solution to the linear system on output.
mem is the sundials integrator memory structure.

Return value The return value retval (of type int) is zero for a successful solve, a positive value for a recoverable error, and a negative value for an unrecoverable error.

Notes The SUNNonlinSolLSolveFn function solves the linear system Ax = b, where A = ∂F/∂y is the linearization of the nonlinear residual function F (y) = 0. sunnonlinsol implementations that do not require solving this system or do not use sunlinsol linear solvers may ignore these functions.

SUNNonlinSolConvTestFn

Definition typedef int (*SUNNonlinSolConvTestFn)(SUNNonlinearSolver NLS, N_Vector y, N_Vector del, realtype tol, N_Vector ewt, void* mem);

Purpose These functions are sundials integrator-specific convergence tests for nonlinear solvers and are typically supplied by each sundials integrator, but users may supply custom problem-specific versions as desired.

Arguments
NLS is the sunnonlinsol object.
y is the current nonlinear iterate.
del is the difference between the current and prior nonlinear iterates.
tol is the nonlinear solver tolerance.
ewt is the weight vector used in computing weighted norms.
mem is the sundials integrator memory structure.

Return value The return value of this routine will be a negative value if an unrecoverable error occurred, or one of the following:
SUN_NLS_SUCCESS the iteration has converged.
SUN_NLS_CONTINUE the iteration has not converged, keep iterating.
SUN_NLS_CONV_RECVR the iteration appears to be diverging, try to recover.

Notes The tolerance passed to this routine by sundials integrators is the tolerance in a weighted root-mean-squared norm with error weight vector ewt. sunnonlinsol modules utilizing their own convergence criteria may ignore these functions.

10.1.5 SUNNonlinearSolver return codes

The functions provided to sunnonlinsol modules by each sundials integrator, and functions within the sundials-provided sunnonlinsol implementations, utilize a common set of return codes, shown below in Table 10.1. Here, negative values correspond to non-recoverable failures, positive values to recoverable failures, and zero to a successful call.

Table 10.1: Description of the SUNNonlinearSolver return codes

Name                 Value  Description
SUN_NLS_SUCCESS        0    successful call or converged solve
SUN_NLS_CONTINUE       1    the nonlinear solver is not converged, keep iterating
SUN_NLS_CONV_RECVR     2    the nonlinear solver appears to be diverging, try to recover
SUN_NLS_MEM_NULL      -1    a memory argument is NULL
SUN_NLS_MEM_FAIL      -2    a memory access or allocation failed
SUN_NLS_ILL_INPUT     -3    an illegal input option was provided

10.1.6 The generic SUNNonlinearSolver module

sundials integrators interact with specific sunnonlinsol implementations through the generic sunnonlinsol module on which all other sunnonlinsol implementations are built. The SUNNonlinearSolver type is a pointer to a structure containing an implementation-dependent content field and an ops field.
The type SUNNonlinearSolver is defined as follows:

typedef struct _generic_SUNNonlinearSolver *SUNNonlinearSolver;

struct _generic_SUNNonlinearSolver {
  void *content;
  struct _generic_SUNNonlinearSolver_Ops *ops;
};

where the generic SUNNonlinearSolver Ops structure is a list of pointers to the various actual nonlinear solver operations provided by a specific implementation. The generic SUNNonlinearSolver Ops structure is defined as

struct _generic_SUNNonlinearSolver_Ops {
  SUNNonlinearSolver_Type (*gettype)(SUNNonlinearSolver);
  int (*initialize)(SUNNonlinearSolver);
  int (*setup)(SUNNonlinearSolver, N_Vector, void*);
  int (*solve)(SUNNonlinearSolver, N_Vector, N_Vector, N_Vector,
               realtype, booleantype, void*);
  int (*free)(SUNNonlinearSolver);
  int (*setsysfn)(SUNNonlinearSolver, SUNNonlinSolSysFn);
  int (*setlsetupfn)(SUNNonlinearSolver, SUNNonlinSolLSetupFn);
  int (*setlsolvefn)(SUNNonlinearSolver, SUNNonlinSolLSolveFn);
  int (*setctestfn)(SUNNonlinearSolver, SUNNonlinSolConvTestFn);
  int (*setmaxiters)(SUNNonlinearSolver, int);
  int (*getnumiters)(SUNNonlinearSolver, long int*);
  int (*getcuriter)(SUNNonlinearSolver, int*);
  int (*getnumconvfails)(SUNNonlinearSolver, long int*);
};

The generic sunnonlinsol module defines and implements the nonlinear solver operations defined in Sections 10.1.1 – 10.1.3. These routines are in fact only wrappers to the nonlinear solver operations provided by a particular sunnonlinsol implementation, which are accessed through the ops field of the SUNNonlinearSolver structure.
To illustrate this point, we show below the implementation of a typical nonlinear solver operation from the generic sunnonlinsol module, namely SUNNonlinSolSolve, which solves the nonlinear system and returns a flag denoting a successful or failed solve:

    int SUNNonlinSolSolve(SUNNonlinearSolver NLS,
                          N_Vector y0, N_Vector y, N_Vector w,
                          realtype tol, booleantype callLSetup, void* mem)
    {
      return((int) NLS->ops->solve(NLS, y0, y, w, tol, callLSetup, mem));
    }

10.1.7  Usage with sensitivity enabled integrators

When used with sundials packages that support sensitivity analysis capabilities (e.g., cvodes and idas), a special nvector module is used to interface with sunnonlinsol modules for solves involving sensitivity vectors stored in an nvector array. As described below, the nvector_senswrapper module is an nvector implementation where the vector content is an nvector array. This wrapper vector allows sunnonlinsol modules to operate on data stored as a collection of vectors.

For all sundials-provided sunnonlinsol modules, a special constructor wrapper is provided so that users do not need to interact directly with the nvector_senswrapper module. These constructors follow the naming convention SUNNonlinSol_***Sens(count,...), where *** is the name of the sunnonlinsol module, count is the size of the vector wrapper, and ... are the module-specific constructor arguments.

The NVECTOR_SENSWRAPPER module

This section describes the nvector_senswrapper implementation of an nvector. To access the nvector_senswrapper module, include the header file sundials/sundials_nvector_senswrapper.h. The nvector_senswrapper module defines an N_Vector implementing all of the standard vector operations defined in Table 7.2, but with some changes to how operations are computed in order to accommodate operating on a collection of vectors.

1. Element-wise vector operations are computed on a vector-by-vector basis.
   For example, the linear sum of two wrappers containing nv vectors of length n, N_VLinearSum(a,x,b,y,z), is computed as

       z_{j,i} = a x_{j,i} + b y_{j,i},   i = 0, ..., n-1,   j = 0, ..., nv-1.

2. The dot product of two wrappers containing nv vectors of length n is computed as if it were the dot product of two vectors of length n*nv. Thus d = N_VDotProd(x,y) is

       d = Σ_{j=0}^{nv-1} Σ_{i=0}^{n-1} x_{j,i} y_{j,i}.

3. All norms are computed as the maximum of the individual norms of the nv vectors in the wrapper. For example, the weighted root-mean-square norm m = N_VWrmsNorm(x,w) is

       m = max_j sqrt( (1/n) Σ_{i=0}^{n-1} (x_{j,i} w_{j,i})^2 ).

To enable usage alongside other nvector modules, the nvector_senswrapper functions implementing vector operations have _SensWrapper appended to the generic vector operation name. The nvector_senswrapper module provides the following constructors for creating an nvector_senswrapper:

N_VNewEmpty_SensWrapper

Call          w = N_VNewEmpty_SensWrapper(count);

Description   The function N_VNewEmpty_SensWrapper creates an empty nvector_senswrapper with space for count vectors.

Arguments     count (int) the number of vectors the wrapper will contain.

Return value  The return value w (of type N_Vector) will be an nvector object if the constructor exits successfully; otherwise w will be NULL.

N_VNew_SensWrapper

Call          w = N_VNew_SensWrapper(count, y);

Description   The function N_VNew_SensWrapper creates an nvector_senswrapper containing count vectors cloned from y.

Arguments     count (int) the number of vectors the wrapper will contain.
              y (N_Vector) the template vector to use in creating the vector wrapper.

Return value  The return value w (of type N_Vector) will be an nvector object if the constructor exits successfully; otherwise w will be NULL.
The nvector_senswrapper implementation of the nvector module defines the content field of the N_Vector to be a structure containing an N_Vector array, the number of vectors in the vector array, and a boolean flag indicating ownership of the vectors in the vector array.

    struct _N_VectorContent_SensWrapper {
      N_Vector* vecs;
      int nvecs;
      booleantype own_vecs;
    };

The following macros are provided to access the content of an nvector_senswrapper vector.

• NV_CONTENT_SW(v) - provides access to the content structure
• NV_VECS_SW(v) - provides access to the vector array
• NV_NVECS_SW(v) - provides access to the number of vectors
• NV_OWN_VECS_SW(v) - provides access to the ownership flag
• NV_VEC_SW(v,i) - provides access to the i-th vector in the vector array

10.1.8  Implementing a Custom SUNNonlinearSolver Module

A sunnonlinsol implementation must do the following:

1. Specify the content of the sunnonlinsol module.
2. Define and implement the required nonlinear solver operations defined in Sections 10.1.1 – 10.1.3. Note that the names of the module routines should be unique to that implementation in order to permit using more than one sunnonlinsol module (each with different SUNNonlinearSolver internal data representations) in the same code.
3. Define and implement a user-callable constructor to create a SUNNonlinearSolver object.

Additionally, a SUNNonlinearSolver implementation may do the following:

1. Define and implement additional user-callable "set" routines acting on the SUNNonlinearSolver object, e.g., for setting various configuration options to tune the performance of the nonlinear solve algorithm.
2. Provide additional user-callable "get" routines acting on the SUNNonlinearSolver object, e.g., for returning various solve statistics.

10.2  The SUNNonlinearSolver_Newton implementation

This section describes the sunnonlinsol implementation of Newton's method.
To access the sunnonlinsol_newton module, include the header file sunnonlinsol/sunnonlinsol_newton.h. We note that the sunnonlinsol_newton module is accessible from sundials integrators without separately linking to the libsundials_sunnonlinsolnewton module library.

10.2.1  SUNNonlinearSolver_Newton description

To find the solution to

    F(y) = 0                                                    (10.1)

given an initial guess y^(0), Newton's method computes a series of approximate solutions

    y^(m+1) = y^(m) + δ^(m+1)                                   (10.2)

where m is the Newton iteration index, and the Newton update δ^(m+1) is the solution of the linear system

    A(y^(m)) δ^(m+1) = -F(y^(m)),                               (10.3)

in which A is the Jacobian matrix

    A ≡ ∂F/∂y.                                                  (10.4)

Depending on the linear solver used, the sunnonlinsol_newton module will employ either a Modified Newton method or an Inexact Newton method [5, 9, 18, 20, 33]. When used with a direct linear solver, the Jacobian matrix A is held constant during the Newton iteration, resulting in a Modified Newton method. With a matrix-free iterative linear solver, the iteration is an Inexact Newton method.

In both cases, calls to the integrator-supplied SUNNonlinSolLSetupFn function are made infrequently to amortize the increased cost of matrix operations (updating A and its factorization within direct linear solvers, or updating the preconditioner within iterative linear solvers). Specifically, sunnonlinsol_newton will call the SUNNonlinSolLSetupFn function in two instances: (a) when requested by the integrator (the input callLSetup is SUNTRUE), before attempting the Newton iteration, or (b) when reattempting the nonlinear solve after a recoverable failure occurs in the Newton iteration with stale Jacobian information (jcur is SUNFALSE). In this case, sunnonlinsol_newton will set jbad to SUNTRUE before calling the SUNNonlinSolLSetupFn function.
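The Modified Newton strategy described above can be sketched in standalone C. This is an illustrative scalar analogue, not the sundials implementation: the example system F(y) = y^2 - 2 = 0 and the policy of evaluating the Jacobian only once, at the setup stage, are assumptions made for demonstration.

```c
/* Standalone sketch of a Modified Newton iteration on the scalar system
 * F(y) = y^2 - 2 = 0.  As described above, the "Jacobian" A = F'(y) is
 * evaluated once at setup and then held constant over the iteration. */
#include <math.h>

static double F(double y) { return y * y - 2.0; }

/* returns the approximate root; *iters reports the iteration count */
double modified_newton(double y0, double tol, int maxiters, int *iters)
{
  double A = 2.0 * y0;              /* lsetup: Jacobian at the initial guess */
  double y = y0;
  for (*iters = 0; *iters < maxiters; (*iters)++) {
    double delta = -F(y) / A;       /* lsolve: solve A * delta = -F(y)      */
    y += delta;                     /* y^(m+1) = y^(m) + delta^(m+1)        */
    if (fabs(delta) <= tol) break;  /* ctest: update small enough           */
  }
  return y;
}
```

Because A is frozen, the iteration converges only linearly rather than quadratically, which is the trade-off the infrequent lsetup calls accept in exchange for fewer matrix factorizations.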
Whether the Jacobian matrix A is fully or partially updated depends on logic unique to each integrator-supplied SUNNonlinSolLSetupFn routine. We refer to the discussion of nonlinear solver strategies provided in Chapter 2 for details on this decision.

The default maximum number of iterations and the stopping criteria for the Newton iteration are supplied by the sundials integrator when sunnonlinsol_newton is attached to it. Both the maximum number of iterations and the convergence test function may be modified by the user by calling the SUNNonlinSolSetMaxIters and/or SUNNonlinSolSetConvTestFn functions after attaching the sunnonlinsol_newton object to the integrator.

10.2.2  SUNNonlinearSolver_Newton functions

The sunnonlinsol_newton module provides the following constructors for creating a SUNNonlinearSolver object.

SUNNonlinSol_Newton

Call          NLS = SUNNonlinSol_Newton(y);

Description   The function SUNNonlinSol_Newton creates a SUNNonlinearSolver object for use with sundials integrators to solve nonlinear systems of the form F(y) = 0 using Newton's method.

Arguments     y (N_Vector) a template for cloning vectors needed within the solver.

Return value  The return value NLS (of type SUNNonlinearSolver) will be a sunnonlinsol object if the constructor exits successfully; otherwise NLS will be NULL.

F2003 Name    This function is callable as FSUNNonlinSol_Newton when using the Fortran 2003 interface module.

SUNNonlinSol_NewtonSens

Call          NLS = SUNNonlinSol_NewtonSens(count, y);

Description   The function SUNNonlinSol_NewtonSens creates a SUNNonlinearSolver object for use with sundials sensitivity enabled integrators (cvodes and idas) to solve nonlinear systems of the form F(y) = 0 using Newton's method.

Arguments     count (int) the number of vectors in the nonlinear solve. When integrating a system containing Ns sensitivities, the value of count is:
              • Ns+1 if using a simultaneous corrector approach.
              • Ns if using a staggered corrector approach.
              y (N_Vector) a template for cloning vectors needed within the solver.

Return value  The return value NLS (of type SUNNonlinearSolver) will be a sunnonlinsol object if the constructor exits successfully; otherwise NLS will be NULL.

F2003 Name    This function is callable as FSUNNonlinSol_NewtonSens when using the Fortran 2003 interface module.

The sunnonlinsol_newton module implements all of the functions defined in Sections 10.1.1 – 10.1.3 except for the SUNNonlinSolSetup function. The sunnonlinsol_newton functions have the same names as those defined by the generic sunnonlinsol API, with _Newton appended to the function name. Unless using the sunnonlinsol_newton module as a standalone nonlinear solver, the generic functions defined in Sections 10.1.1 – 10.1.3 should be called in favor of the sunnonlinsol_newton-specific implementations.

The sunnonlinsol_newton module also defines the following additional user-callable function.

SUNNonlinSolGetSysFn_Newton

Call          retval = SUNNonlinSolGetSysFn_Newton(NLS, SysFn);

Description   The function SUNNonlinSolGetSysFn_Newton returns the residual function that defines the nonlinear system.

Arguments     NLS (SUNNonlinearSolver) a sunnonlinsol object.
              SysFn (SUNNonlinSolSysFn*) the function defining the nonlinear system.

Return value  The return value retval (of type int) should be zero for a successful call, and a negative value for a failure.

Notes         This function is intended for users that wish to evaluate the nonlinear residual in a custom convergence test function for the sunnonlinsol_newton module. We note that sunnonlinsol_newton will not leverage the results from any user calls to SysFn.

F2003 Name    This function is callable as FSUNNonlinSolGetSysFn_Newton when using the Fortran 2003 interface module.
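The pattern this getter enables — retrieving the residual function and evaluating it inside a custom convergence test — can be sketched in standalone C. All types and names below (Solver, SysFn, get_sysfn, my_res, residual_conv_test) are hypothetical stand-ins chosen for illustration; they are not the sundials API:

```c
/* Standalone sketch of the "get system function" pattern: a solver
 * object stores the residual function; a custom convergence test
 * retrieves and evaluates it.  All names here are illustrative. */
#include <math.h>
#include <stddef.h>

typedef int (*SysFn)(double y, double *F);   /* scalar residual F(y)  */
typedef struct { SysFn sys; } Solver;        /* stand-in solver object */

static int get_sysfn(Solver *s, SysFn *out)
{
  if (s == NULL || out == NULL) return -1;   /* failure: NULL argument */
  *out = s->sys;
  return 0;                                  /* success */
}

/* example residual F(y) = y*y - 2 */
static int my_res(double y, double *F) { *F = y * y - 2.0; return 0; }

/* custom test: converged when |F(y)| <= ftol (a residual-based
 * criterion, in contrast to an update-based test) */
int residual_conv_test(Solver *s, double y, double ftol)
{
  SysFn sys;
  double F;
  if (get_sysfn(s, &sys) != 0) return -1;    /* could not get residual fn */
  sys(y, &F);
  return (fabs(F) <= ftol) ? 0 : 1;          /* 0 = converged, 1 = continue */
}
```

As the Notes above caution for the real module, a solver would not reuse residual evaluations made by such a custom test; they are extra function calls made purely for the convergence decision.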
10.2.3  SUNNonlinearSolver_Newton Fortran interfaces

The sunnonlinsol_newton module provides a Fortran 2003 module, as well as Fortran 77 style interface functions, for use from Fortran applications.

FORTRAN 2003 interface module

The fsunnonlinsol_newton_mod Fortran module defines interfaces to all sunnonlinsol_newton C functions using the intrinsic iso_c_binding module, which provides a standardized mechanism for interoperating with C. As noted in the C function descriptions above, the interface functions are named after the corresponding C function, but with a leading 'F'. For example, the function SUNNonlinSol_Newton is interfaced as FSUNNonlinSol_Newton.

The Fortran 2003 sunnonlinsol_newton interface module can be accessed with the use statement, i.e. use fsunnonlinsol_newton_mod, and linking to the library libsundials_fsunnonlinsolnewton_mod.lib in addition to the C library. For details on where the library and module file fsunnonlinsol_newton_mod.mod are installed, see Appendix A. We note that the module is accessible from the Fortran 2003 sundials integrators without separately linking to the libsundials_fsunnonlinsolnewton_mod library.

FORTRAN 77 interface functions

For sundials integrators that include a Fortran 77 interface, the sunnonlinsol_newton module also includes a Fortran-callable function for creating a SUNNonlinearSolver object.

FSUNNEWTONINIT

Call          FSUNNEWTONINIT(code, ier);

Description   The function FSUNNEWTONINIT can be called from Fortran programs to create a SUNNonlinearSolver object for use with sundials integrators to solve nonlinear systems of the form F(y) = 0 with Newton's method.

Arguments     code (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, and 4 for arkode).

Return value  ier is a return completion flag equal to 0 for a successful return and -1 otherwise. See printed message for details in case of failure.
10.2.4  SUNNonlinearSolver_Newton content

The sunnonlinsol_newton module defines the content field of a SUNNonlinearSolver as the following structure:

    struct _SUNNonlinearSolverContent_Newton {
      SUNNonlinSolSysFn      Sys;
      SUNNonlinSolLSetupFn   LSetup;
      SUNNonlinSolLSolveFn   LSolve;
      SUNNonlinSolConvTestFn CTest;
      N_Vector    delta;
      booleantype jcur;
      int         curiter;
      int         maxiters;
      long int    niters;
      long int    nconvfails;
    };

These entries of the content field contain the following information:

Sys - the function for evaluating the nonlinear system,
LSetup - the package-supplied function for setting up the linear solver,
LSolve - the package-supplied function for performing a linear solve,
CTest - the function for checking convergence of the Newton iteration,
delta - the Newton iteration update vector,
jcur - the Jacobian status (SUNTRUE = current, SUNFALSE = stale),
curiter - the current number of iterations in the solve attempt,
maxiters - the maximum number of Newton iterations allowed in a solve,
niters - the total number of nonlinear iterations across all solves, and
nconvfails - the total number of nonlinear convergence failures across all solves.

10.3  The SUNNonlinearSolver_FixedPoint implementation

This section describes the sunnonlinsol implementation of a fixed point (functional) iteration with optional Anderson acceleration. To access the sunnonlinsol_fixedpoint module, include the header file sunnonlinsol/sunnonlinsol_fixedpoint.h. We note that the sunnonlinsol_fixedpoint module is accessible from sundials integrators without separately linking to the libsundials_sunnonlinsolfixedpoint module library.

10.3.1  SUNNonlinearSolver_FixedPoint description

To find the solution to

    G(y) = y                                                    (10.5)

given an initial guess y^(0), the fixed point iteration computes a series of approximate solutions

    y^(n+1) = G(y^(n))                                          (10.6)

where n is the iteration index.
The convergence of this iteration may be accelerated using Anderson's method [3, 45, 21, 36]. With Anderson acceleration using subspace size m, the series of approximate solutions can be formulated as the linear combination

    y^(n+1) = Σ_{i=0}^{m_n} α_i^(n) G(y^(n-m_n+i))              (10.7)

where m_n = min{m, n} and the factors α^(n) = (α_0^(n), ..., α_{m_n}^(n)) solve the minimization problem

    min_α ||F_n α^T||_2   subject to   Σ_{i=0}^{m_n} α_i = 1,   (10.8)

where

    F_n = (f_{n-m_n}, ..., f_n)                                 (10.9)

with f_i = G(y^(i)) - y^(i). Due to this constraint, in the limit of m = 0 the accelerated fixed point iteration formula (10.7) simplifies to the standard fixed point iteration (10.6).

Following the recommendations made in [45], the sunnonlinsol_fixedpoint implementation computes the series of approximate solutions as

    y^(n+1) = G(y^(n)) - Σ_{i=0}^{m_n-1} γ_i^(n) Δg_{n-m_n+i}   (10.10)

with

    Δg_i = G(y^(i+1)) - G(y^(i))                                (10.11)

and where the factors γ^(n) = (γ_0^(n), ..., γ_{m_n-1}^(n)) solve the unconstrained minimization problem

    min_γ ||f_n - ΔF_n γ^T||_2,                                 (10.12)

where ΔF_n = (Δf_{n-m_n}, ..., Δf_{n-1}) with Δf_i = f_{i+1} - f_i. The least-squares problem is solved by applying a QR factorization ΔF_n = Q_n R_n and solving R_n γ = Q_n^T f_n.

The acceleration subspace size m is required when constructing the sunnonlinsol_fixedpoint object. The default maximum number of iterations and the stopping criteria for the fixed point iteration are supplied by the sundials integrator when sunnonlinsol_fixedpoint is attached to it. Both the maximum number of iterations and the convergence test function may be modified by the user by calling the SUNNonlinSolSetMaxIters and/or SUNNonlinSolSetConvTestFn functions after attaching the sunnonlinsol_fixedpoint object to the integrator.

10.3.2  SUNNonlinearSolver_FixedPoint functions

The sunnonlinsol_fixedpoint module provides the following constructors for creating a SUNNonlinearSolver object.
SUNNonlinSol_FixedPoint

Call          NLS = SUNNonlinSol_FixedPoint(y, m);

Description   The function SUNNonlinSol_FixedPoint creates a SUNNonlinearSolver object for use with sundials integrators to solve nonlinear systems of the form G(y) = y.

Arguments     y (N_Vector) a template for cloning vectors needed within the solver.
              m (int) the number of acceleration vectors to use.

Return value  The return value NLS (of type SUNNonlinearSolver) will be a sunnonlinsol object if the constructor exits successfully; otherwise NLS will be NULL.

F2003 Name    This function is callable as FSUNNonlinSol_FixedPoint when using the Fortran 2003 interface module.

SUNNonlinSol_FixedPointSens

Call          NLS = SUNNonlinSol_FixedPointSens(count, y, m);

Description   The function SUNNonlinSol_FixedPointSens creates a SUNNonlinearSolver object for use with sundials sensitivity enabled integrators (cvodes and idas) to solve nonlinear systems of the form G(y) = y.

Arguments     count (int) the number of vectors in the nonlinear solve. When integrating a system containing Ns sensitivities, the value of count is:
              • Ns+1 if using a simultaneous corrector approach.
              • Ns if using a staggered corrector approach.
              y (N_Vector) a template for cloning vectors needed within the solver.
              m (int) the number of acceleration vectors to use.

Return value  The return value NLS (of type SUNNonlinearSolver) will be a sunnonlinsol object if the constructor exits successfully; otherwise NLS will be NULL.

F2003 Name    This function is callable as FSUNNonlinSol_FixedPointSens when using the Fortran 2003 interface module.

Since the accelerated fixed point iteration (10.7) does not require the setup or solution of any linear systems, the sunnonlinsol_fixedpoint module implements all of the functions defined in Sections 10.1.1 – 10.1.3 except for the SUNNonlinSolSetup, SUNNonlinSolSetLSetupFn, and SUNNonlinSolSetLSolveFn functions, which are set to NULL.
The sunnonlinsol_fixedpoint functions have the same names as those defined by the generic sunnonlinsol API, with _FixedPoint appended to the function name. Unless using the sunnonlinsol_fixedpoint module as a standalone nonlinear solver, the generic functions defined in Sections 10.1.1 – 10.1.3 should be called in favor of the sunnonlinsol_fixedpoint-specific implementations.

The sunnonlinsol_fixedpoint module also defines the following additional user-callable function.

SUNNonlinSolGetSysFn_FixedPoint

Call          retval = SUNNonlinSolGetSysFn_FixedPoint(NLS, SysFn);

Description   The function SUNNonlinSolGetSysFn_FixedPoint returns the fixed-point function that defines the nonlinear system.

Arguments     NLS (SUNNonlinearSolver) a sunnonlinsol object.
              SysFn (SUNNonlinSolSysFn*) the function defining the nonlinear system.

Return value  The return value retval (of type int) should be zero for a successful call, and a negative value for a failure.

Notes         This function is intended for users that wish to evaluate the fixed-point function in a custom convergence test function for the sunnonlinsol_fixedpoint module. We note that sunnonlinsol_fixedpoint will not leverage the results from any user calls to SysFn.

F2003 Name    This function is callable as FSUNNonlinSolGetSysFn_FixedPoint when using the Fortran 2003 interface module.

10.3.3  SUNNonlinearSolver_FixedPoint Fortran interfaces

The sunnonlinsol_fixedpoint module provides a Fortran 2003 module, as well as Fortran 77 style interface functions, for use from Fortran applications.

FORTRAN 2003 interface module

The fsunnonlinsol_fixedpoint_mod Fortran module defines interfaces to all sunnonlinsol_fixedpoint C functions using the intrinsic iso_c_binding module, which provides a standardized mechanism for interoperating with C. As noted in the C function descriptions above, the interface functions are named after the corresponding C function, but with a leading 'F'.
For example, the function SUNNonlinSol_FixedPoint is interfaced as FSUNNonlinSol_FixedPoint. The Fortran 2003 sunnonlinsol_fixedpoint interface module can be accessed with the use statement, i.e. use fsunnonlinsol_fixedpoint_mod, and linking to the library libsundials_fsunnonlinsolfixedpoint_mod.lib in addition to the C library. For details on where the library and module file fsunnonlinsol_fixedpoint_mod.mod are installed, see Appendix A. We note that the module is accessible from the Fortran 2003 sundials integrators without separately linking to the libsundials_fsunnonlinsolfixedpoint_mod library.

FORTRAN 77 interface functions

For sundials integrators that include a Fortran 77 interface, the sunnonlinsol_fixedpoint module also includes a Fortran-callable function for creating a SUNNonlinearSolver object.

FSUNFIXEDPOINTINIT

Call          FSUNFIXEDPOINTINIT(code, m, ier);

Description   The function FSUNFIXEDPOINTINIT can be called from Fortran programs to create a SUNNonlinearSolver object for use with sundials integrators to solve nonlinear systems of the form G(y) = y.

Arguments     code (int*) is an integer input specifying the solver id (1 for cvode, 2 for ida, and 4 for arkode).
              m (int*) is an integer input specifying the number of acceleration vectors.

Return value  ier is a return completion flag equal to 0 for a successful return and -1 otherwise. See printed message for details in case of failure.
10.3.4  SUNNonlinearSolver_FixedPoint content

The sunnonlinsol_fixedpoint module defines the content field of a SUNNonlinearSolver as the following structure:

    struct _SUNNonlinearSolverContent_FixedPoint {
      SUNNonlinSolSysFn      Sys;
      SUNNonlinSolConvTestFn CTest;
      int       m;
      int      *imap;
      realtype *R;
      realtype *gamma;
      realtype *cvals;
      N_Vector *df;
      N_Vector *dg;
      N_Vector *q;
      N_Vector *Xvecs;
      N_Vector  yprev;
      N_Vector  gy;
      N_Vector  fold;
      N_Vector  gold;
      N_Vector  delta;
      int       curiter;
      int       maxiters;
      long int  niters;
      long int  nconvfails;
    };

The following entries of the content field are always allocated:

Sys - function for evaluating the nonlinear system,
CTest - function for checking convergence of the fixed point iteration,
yprev - N_Vector used to store the previous fixed-point iterate,
gy - N_Vector used to store G(y) in the fixed-point algorithm,
delta - N_Vector used to store the difference between successive fixed-point iterates,
curiter - the current number of iterations in the solve attempt,
maxiters - the maximum number of fixed-point iterations allowed in a solve,
niters - the total number of nonlinear iterations across all solves, and
nconvfails - the total number of nonlinear convergence failures across all solves.
m - number of acceleration vectors.

If Anderson acceleration is requested (i.e., m > 0 in the call to SUNNonlinSol_FixedPoint), then the following items are also allocated within the content field:

imap - index array used in the acceleration algorithm (length m)
R - small matrix used in the acceleration algorithm (length m*m)
gamma - small vector used in the acceleration algorithm (length m)
cvals - small vector used in the acceleration algorithm (length m+1)
df - array of N_Vectors used in the acceleration algorithm (length m)
dg - array of N_Vectors used in the acceleration algorithm (length m)
q - array of N_Vectors used in the acceleration algorithm (length m)
Xvecs - N_Vector pointer array used in the acceleration algorithm (length m+1)
fold - N_Vector used in the acceleration algorithm
gold - N_Vector used in the acceleration algorithm

Appendix A

SUNDIALS Package Installation Procedure

The installation of any sundials package is accomplished by installing the sundials suite as a whole, according to the instructions that follow. The same procedure applies whether or not the downloaded file contains one or all solvers in sundials.

The sundials suite (or an individual solver) is distributed as a compressed archive (.tar.gz). The name of the distribution archive is of the form solver-x.y.z.tar.gz, where solver is one of: sundials, cvode, cvodes, arkode, ida, idas, or kinsol, and x.y.z represents the version number (of the sundials suite or of the individual solver). To begin the installation, first uncompress and expand the sources by issuing

    % tar xzf solver-x.y.z.tar.gz

This will extract source files under a directory solver-x.y.z.

Starting with version 2.6.0 of sundials, CMake is the only supported method of installation. The explanation of the installation procedure begins with a few common observations:

• The remainder of this chapter will follow these conventions:

  solverdir is the directory solver-x.y.z created above; i.e., the directory containing the sundials sources.
  builddir is the (temporary) directory under which sundials is built.

  instdir is the directory under which the sundials exported header files and libraries will be installed. Typically, header files are exported under a directory instdir/include, while libraries are installed under instdir/CMAKE_INSTALL_LIBDIR, with instdir and CMAKE_INSTALL_LIBDIR specified at configuration time.

• For the sundials CMake-based installation, in-source builds are prohibited; in other words, the build directory builddir cannot be the same as solverdir, and such an attempt will lead to an error. This prevents "polluting" the source tree and allows efficient builds for different configurations and/or options.

• The installation directory instdir cannot be the same as the source directory solverdir.

• By default, only the libraries and header files are exported to the installation directory instdir. If enabled by the user (with the appropriate toggle for CMake), the examples distributed with sundials will be built together with the solver libraries, but the installation step will result in exporting (by default in a subdirectory of the installation directory) the example sources and sample outputs together with automatically generated configuration files that reference the installed sundials headers and libraries. As such, these configuration files for the sundials examples can be used as "templates" for your own problems. CMake installs CMakeLists.txt files and also (as an option available only under Unix/Linux) Makefile files. Note that this installation approach also allows the option of building the sundials examples without having to install them. (This can be used as a sanity check for the freshly built libraries.)

• Even if generation of shared libraries is enabled, only static libraries are created for the FCMIX modules.
  (Because of the use of fixed names for the Fortran user-provided subroutines, FCMIX shared libraries would result in "undefined symbol" errors at link time.)

A.1  CMake-based installation

CMake-based installation provides a platform-independent build system. CMake can generate Unix and Linux Makefiles, as well as KDevelop, Visual Studio, and (Apple) XCode project files from the same configuration file. In addition, CMake provides a GUI front end which allows an interactive build and installation process.

The sundials build process requires CMake version 3.1.3 or higher and a working C compiler. On Unix-like operating systems, it also requires Make (and curses, including its development libraries, for the GUI front end to CMake, ccmake), while on Windows it requires Visual Studio. CMake is continually adding new features, and the latest version can be downloaded from http://www.cmake.org. Build instructions for CMake (only necessary for Unix-like systems) can be found on the CMake website. Once CMake is installed, Linux/Unix users will be able to use ccmake, while Windows users will be able to use CMakeSetup.

As previously noted, when using CMake to configure, build, and install sundials, it is always required to use a separate build directory. While in-source builds are possible, they are explicitly prohibited by the sundials CMake scripts (one of the reasons being that, unlike autotools, CMake does not provide a make distclean procedure, and it is therefore difficult to clean up the source tree after an in-source build). By ensuring a separate build directory, it is an easy task for the user to clean up all traces of the build by simply removing the build directory. CMake does generate a make clean target which will remove files generated by the compiler and linker.

A.1.1  Configuring, building, and installing on Unix-like systems

The default CMake configuration will build all included solvers and associated examples and will build static and shared libraries.
The instdir defaults to /usr/local and can be changed by setting the CMAKE_INSTALL_PREFIX variable. Support for FORTRAN and all other options are disabled.

CMake can be used from the command line with the cmake command, or from a curses-based GUI by using the ccmake command. Examples for using both methods will be presented. For the examples shown, it is assumed that there is a top level sundials directory with appropriate source, build, and install directories:

    % mkdir (...)sundials/instdir
    % mkdir (...)sundials/builddir
    % cd (...)sundials/builddir

Building with the GUI

Using CMake with the GUI follows this general process:

• Select and modify values, run configure (c key)
• New values are denoted with an asterisk
• To set a variable, move the cursor to the variable and press enter
  - If it is a boolean (ON/OFF) it will toggle the value
  - If it is a string or file, it will allow editing of the string
  - For files and directories, the <tab> key can be used to complete
• Repeat until all values are set as desired and the generate option is available (g key)
• Some variables (advanced variables) are not visible right away
• To see advanced variables, toggle to advanced mode (t key)
• To search for a variable, press the / key; to repeat the search, press the n key

To build the default configuration using the GUI, from the builddir enter the ccmake command and point to the solverdir:

    % ccmake ../solverdir

The default configuration screen is shown in Figure A.1.

Figure A.1: Default configuration screen. Note: Initial screen is empty. To get this default configuration, press 'c' repeatedly (accepting default values denoted with asterisk) until the 'g' option is available.

The default instdir for both sundials and the corresponding examples can be changed by setting the CMAKE_INSTALL_PREFIX and the EXAMPLES_INSTALL_PATH, as shown in Figure A.2.
Pressing the g key will generate makefiles, including all dependencies and all rules to build sundials on this system. Back at the command prompt, you can now run:

    % make

To install sundials in the installation directory specified in the configuration, simply run:

    % make install

Figure A.2: Changing the instdir for sundials and corresponding examples

Building from the command line

Using CMake from the command line is simply a matter of specifying CMake variable settings with the cmake command. The following will build the default configuration:

    % cmake -DCMAKE_INSTALL_PREFIX=/home/myname/sundials/instdir \
    >       -DEXAMPLES_INSTALL_PATH=/home/myname/sundials/instdir/examples \
    >       ../solverdir
    % make
    % make install

A.1.2  Configuration options (Unix/Linux)

A complete list of all available options for a CMake-based sundials configuration is provided below. Note that the default values shown are for a typical configuration on a Linux system and are provided as illustration only.

BLAS_ENABLE - Enable BLAS support
    Default: OFF
    Note: Setting this option to ON will trigger additional CMake options. See additional information on building with BLAS enabled in A.1.4.

BLAS_LIBRARIES - BLAS library
    Default: /usr/lib/libblas.so
    Note: CMake will search for libraries in your LD_LIBRARY_PATH prior to searching default system paths.
BUILD_ARKODE - Build the ARKODE library
Default: ON

BUILD_CVODE - Build the CVODE library
Default: ON

BUILD_CVODES - Build the CVODES library
Default: ON

BUILD_IDA - Build the IDA library
Default: ON

BUILD_IDAS - Build the IDAS library
Default: ON

BUILD_KINSOL - Build the KINSOL library
Default: ON

BUILD_SHARED_LIBS - Build shared libraries
Default: ON

BUILD_STATIC_LIBS - Build static libraries
Default: ON

CMAKE_BUILD_TYPE - Choose the type of build; options are: None (CMAKE_C_FLAGS used), Debug, Release, RelWithDebInfo, and MinSizeRel
Default:
Note: Specifying a build type will trigger the corresponding build-type-specific compiler flag options below, which will be appended to the flags set by CMAKE_<language>_FLAGS.

CMAKE_C_COMPILER - C compiler
Default: /usr/bin/cc

CMAKE_C_FLAGS - Flags for C compiler
Default:

CMAKE_C_FLAGS_DEBUG - Flags used by the C compiler during debug builds
Default: -g

CMAKE_C_FLAGS_MINSIZEREL - Flags used by the C compiler during release minsize builds
Default: -Os -DNDEBUG

CMAKE_C_FLAGS_RELEASE - Flags used by the C compiler during release builds
Default: -O3 -DNDEBUG

CMAKE_CXX_COMPILER - C++ compiler
Default: /usr/bin/c++
Note: A C++ compiler (and all related options) is only triggered if C++ examples are enabled (EXAMPLES_ENABLE_CXX is ON). All sundials solvers can be used from C++ applications by default without setting any additional configuration options.
CMAKE_CXX_FLAGS - Flags for C++ compiler
Default:

CMAKE_CXX_FLAGS_DEBUG - Flags used by the C++ compiler during debug builds
Default: -g

CMAKE_CXX_FLAGS_MINSIZEREL - Flags used by the C++ compiler during release minsize builds
Default: -Os -DNDEBUG

CMAKE_CXX_FLAGS_RELEASE - Flags used by the C++ compiler during release builds
Default: -O3 -DNDEBUG

CMAKE_Fortran_COMPILER - Fortran compiler
Default: /usr/bin/gfortran
Note: Fortran support (and all related options) is triggered only if either Fortran-C support is enabled (FCMIX_ENABLE is ON) or BLAS/LAPACK support is enabled (BLAS_ENABLE or LAPACK_ENABLE is ON).

CMAKE_Fortran_FLAGS - Flags for Fortran compiler
Default:

CMAKE_Fortran_FLAGS_DEBUG - Flags used by the Fortran compiler during debug builds
Default: -g

CMAKE_Fortran_FLAGS_MINSIZEREL - Flags used by the Fortran compiler during release minsize builds
Default: -Os

CMAKE_Fortran_FLAGS_RELEASE - Flags used by the Fortran compiler during release builds
Default: -O3

CMAKE_INSTALL_PREFIX - Install path prefix, prepended onto install directories
Default: /usr/local
Note: The user must have write access to the location specified through this option. Exported sundials header files and libraries will be installed under the subdirectories include and CMAKE_INSTALL_LIBDIR of CMAKE_INSTALL_PREFIX, respectively.

CMAKE_INSTALL_LIBDIR - Library installation directory
Default:
Note: This is the directory within CMAKE_INSTALL_PREFIX that the sundials libraries will be installed under. The default is set automatically based on the operating system using the GNUInstallDirs CMake module.

Fortran_INSTALL_MODDIR - Fortran module installation directory
Default: fortran

CUDA_ENABLE - Build the sundials CUDA vector module
Default: OFF

EXAMPLES_ENABLE_C - Build the sundials C examples
Default: ON

EXAMPLES_ENABLE_CUDA - Build the sundials CUDA examples
Default: OFF
Note: You need to enable CUDA support to build these examples.
EXAMPLES_ENABLE_CXX - Build the sundials C++ examples
Default: OFF

EXAMPLES_ENABLE_RAJA - Build the sundials RAJA examples
Default: OFF
Note: You need to enable CUDA and RAJA support to build these examples.

EXAMPLES_ENABLE_F77 - Build the sundials Fortran77 examples
Default: ON (if F77_INTERFACE_ENABLE is ON)

EXAMPLES_ENABLE_F90 - Build the sundials Fortran90/Fortran2003 examples
Default: ON (if F77_INTERFACE_ENABLE or F2003_INTERFACE_ENABLE is ON)

EXAMPLES_INSTALL - Install example files
Default: ON
Note: This option is triggered when any of the sundials example programs are enabled (an EXAMPLES_ENABLE option is ON). If the user requires installation of example programs, then the sources and sample output files for all sundials modules that are currently enabled will be exported to the directory specified by EXAMPLES_INSTALL_PATH. A CMake configuration script will also be automatically generated and exported to the same directory. Additionally, if the configuration is done under a Unix-like system, makefiles for the compilation of the example programs (using the installed sundials libraries) will be automatically generated and exported to the directory specified by EXAMPLES_INSTALL_PATH.

EXAMPLES_INSTALL_PATH - Output directory for installing example files
Default: /usr/local/examples
Note: The actual default value for this option will be an examples subdirectory created under CMAKE_INSTALL_PREFIX.

F77_INTERFACE_ENABLE - Enable Fortran-C support via the Fortran 77 interfaces
Default: OFF

F2003_INTERFACE_ENABLE - Enable Fortran-C support via the Fortran 2003 interfaces
Default: OFF

HYPRE_ENABLE - Enable hypre support
Default: OFF
Note: See additional information on building with hypre enabled in A.1.4.

HYPRE_INCLUDE_DIR - Path to hypre header files

HYPRE_LIBRARY_DIR - Path to hypre installed library files

KLU_ENABLE - Enable KLU support
Default: OFF
Note: See additional information on building with KLU enabled in A.1.4.
KLU_INCLUDE_DIR - Path to SuiteSparse header files

KLU_LIBRARY_DIR - Path to SuiteSparse installed library files

LAPACK_ENABLE - Enable LAPACK support
Default: OFF
Note: Setting this option to ON will trigger additional CMake options. See additional information on building with LAPACK enabled in A.1.4.

LAPACK_LIBRARIES - LAPACK (and BLAS) libraries
Default: /usr/lib/liblapack.so;/usr/lib/libblas.so
Note: CMake will search for libraries in your LD_LIBRARY_PATH prior to searching default system paths.

MPI_ENABLE - Enable MPI support (build the parallel nvector)
Default: OFF
Note: Setting this option to ON will trigger several additional options related to MPI.

MPI_C_COMPILER - mpicc program
Default:

MPI_CXX_COMPILER - mpicxx program
Default:
Note: This option is triggered only if MPI is enabled (MPI_ENABLE is ON) and C++ examples are enabled (EXAMPLES_ENABLE_CXX is ON). All sundials solvers can be used from C++ MPI applications by default without setting any additional configuration options other than MPI_ENABLE.

MPI_Fortran_COMPILER - mpif77 or mpif90 program
Default:
Note: This option is triggered only if MPI is enabled (MPI_ENABLE is ON) and Fortran-C support is enabled (F77_INTERFACE_ENABLE or F2003_INTERFACE_ENABLE is ON).

MPIEXEC_EXECUTABLE - Specify the executable for running MPI programs
Default: mpirun
Note: This option is triggered only if MPI is enabled (MPI_ENABLE is ON).

OPENMP_ENABLE - Enable OpenMP support (build the OpenMP nvector)
Default: OFF

OPENMP_DEVICE_ENABLE - Enable OpenMP device offloading (build the OpenMPDEV nvector) if supported by the provided compiler
Default: OFF

SKIP_OPENMP_DEVICE_CHECK - advanced option - Skip the check done to see if the OpenMP provided by the compiler supports OpenMP device offloading
Default: OFF

PETSC_ENABLE - Enable PETSc support
Default: OFF
Note: See additional information on building with PETSc enabled in A.1.4.
PETSC_INCLUDE_DIR - Path to PETSc header files

PETSC_LIBRARY_DIR - Path to PETSc installed library files

PTHREAD_ENABLE - Enable Pthreads support (build the Pthreads nvector)
Default: OFF

RAJA_ENABLE - Enable RAJA support (build the RAJA nvector)
Default: OFF
Note: You need to enable CUDA in order to build the RAJA vector module.

SUNDIALS_F77_FUNC_CASE - advanced option - Specify the case to use in the Fortran name-mangling scheme; options are: lower or upper
Default:
Note: The build system will attempt to infer the Fortran name-mangling scheme using the Fortran compiler. This option should only be used if a Fortran compiler is not available, or to override the inferred or default (lower) scheme if one cannot be determined. If used, SUNDIALS_F77_FUNC_UNDERSCORES must also be set.

SUNDIALS_F77_FUNC_UNDERSCORES - advanced option - Specify the number of underscores to append in the Fortran name-mangling scheme; options are: none, one, or two
Default:
Note: The build system will attempt to infer the Fortran name-mangling scheme using the Fortran compiler. This option should only be used if a Fortran compiler is not available, or to override the inferred or default (one) scheme if one cannot be determined. If used, SUNDIALS_F77_FUNC_CASE must also be set.

SUNDIALS_INDEX_TYPE - advanced option - Integer type used for sundials indices. The size must match the size provided for the SUNDIALS_INDEX_SIZE option.
Default:
Note: In past SUNDIALS versions, a user could set this option to INT64_T to use 64-bit integers, or INT32_T to use 32-bit integers. Starting in SUNDIALS 3.2.0, these special values are deprecated. For SUNDIALS 3.2.0 and up, a user will only need to use the SUNDIALS_INDEX_SIZE option in most cases.

SUNDIALS_INDEX_SIZE - Integer size (in bits) used for indices in sundials; options are: 32 or 64
Default: 64
Note: The build system tries to find an integer type of appropriate size.
Candidate 64-bit integer types are (in order of preference): int64_t, __int64, long long, and long. Candidate 32-bit integers are (in order of preference): int32_t, int, and long. The advanced option SUNDIALS_INDEX_TYPE can be used to provide a type not listed here.

SUNDIALS_PRECISION - Precision used in sundials; options are: double, single, or extended
Default: double

SUPERLUMT_ENABLE - Enable SuperLU_MT support
Default: OFF
Note: See additional information on building with SuperLU_MT enabled in A.1.4.

SUPERLUMT_INCLUDE_DIR - Path to SuperLU_MT header files (typically the SRC directory)

SUPERLUMT_LIBRARY_DIR - Path to SuperLU_MT installed library files

SUPERLUMT_THREAD_TYPE - Must be set to Pthread or OpenMP
Default: Pthread

USE_GENERIC_MATH - Use generic (stdc) math libraries
Default: ON

xSDK Configuration Options

sundials supports CMake configuration options defined by the Extreme-scale Scientific Software Development Kit (xSDK) community policies (see https://xsdk.info for more information). xSDK CMake options are unused by default but may be activated by setting USE_XSDK_DEFAULTS to ON. When xSDK options are active, they will overwrite the corresponding sundials option and may have different default values (see details below). As such, the equivalent sundials options should not be used when configuring with xSDK options. In the GUI front end to CMake (ccmake), setting USE_XSDK_DEFAULTS to ON will hide the corresponding sundials options as advanced CMake variables. During configuration, messages are output detailing which xSDK flags are active and the equivalent sundials options that are replaced. Below is a complete list of xSDK options and the corresponding sundials options where applicable.

TPL_BLAS_LIBRARIES - BLAS library
Default: /usr/lib/libblas.so
sundials equivalent: BLAS_LIBRARIES
Note: CMake will search for libraries in your LD_LIBRARY_PATH prior to searching default system paths.
TPL_ENABLE_BLAS - Enable BLAS support
Default: OFF
sundials equivalent: BLAS_ENABLE

TPL_ENABLE_HYPRE - Enable hypre support
Default: OFF
sundials equivalent: HYPRE_ENABLE

TPL_ENABLE_KLU - Enable KLU support
Default: OFF
sundials equivalent: KLU_ENABLE

TPL_ENABLE_PETSC - Enable PETSc support
Default: OFF
sundials equivalent: PETSC_ENABLE

TPL_ENABLE_LAPACK - Enable LAPACK support
Default: OFF
sundials equivalent: LAPACK_ENABLE

TPL_ENABLE_SUPERLUMT - Enable SuperLU_MT support
Default: OFF
sundials equivalent: SUPERLUMT_ENABLE

TPL_HYPRE_INCLUDE_DIRS - Path to hypre header files
sundials equivalent: HYPRE_INCLUDE_DIR

TPL_HYPRE_LIBRARIES - hypre library
sundials equivalent: N/A

TPL_KLU_INCLUDE_DIRS - Path to KLU header files
sundials equivalent: KLU_INCLUDE_DIR

TPL_KLU_LIBRARIES - KLU library
sundials equivalent: N/A

TPL_LAPACK_LIBRARIES - LAPACK (and BLAS) libraries
Default: /usr/lib/liblapack.so;/usr/lib/libblas.so
sundials equivalent: LAPACK_LIBRARIES
Note: CMake will search for libraries in your LD_LIBRARY_PATH prior to searching default system paths.
TPL_PETSC_INCLUDE_DIRS - Path to PETSc header files
sundials equivalent: PETSC_INCLUDE_DIR

TPL_PETSC_LIBRARIES - PETSc library
sundials equivalent: N/A

TPL_SUPERLUMT_INCLUDE_DIRS - Path to SuperLU_MT header files
sundials equivalent: SUPERLUMT_INCLUDE_DIR

TPL_SUPERLUMT_LIBRARIES - SuperLU_MT library
sundials equivalent: N/A

TPL_SUPERLUMT_THREAD_TYPE - SuperLU_MT library thread type
sundials equivalent: SUPERLUMT_THREAD_TYPE

USE_XSDK_DEFAULTS - Enable xSDK default configuration settings
Default: OFF
sundials equivalent: N/A
Note: Enabling xSDK defaults also sets CMAKE_BUILD_TYPE to Debug.

XSDK_ENABLE_FORTRAN - Enable sundials Fortran interfaces
Default: OFF
sundials equivalent: F77_INTERFACE_ENABLE/F2003_INTERFACE_ENABLE

XSDK_INDEX_SIZE - Integer size (bits) used for indices in sundials; options are: 32 or 64
Default: 32
sundials equivalent: SUNDIALS_INDEX_SIZE

XSDK_PRECISION - Precision used in sundials; options are: double, single, or quad
Default: double
sundials equivalent: SUNDIALS_PRECISION

A.1.3 Configuration examples

The following examples will help demonstrate usage of the CMake configure options.
To configure sundials using the default C and Fortran compilers and the default mpicc and mpif77 parallel compilers, enable compilation of examples, and install libraries, headers, and example sources under subdirectories of /home/myname/sundials/, use:

% cmake \
> -DCMAKE_INSTALL_PREFIX=/home/myname/sundials/instdir \
> -DEXAMPLES_INSTALL_PATH=/home/myname/sundials/instdir/examples \
> -DMPI_ENABLE=ON \
> -DFCMIX_ENABLE=ON \
> /home/myname/sundials/solverdir
% make
% make install

To disable installation of the examples, use:

% cmake \
> -DCMAKE_INSTALL_PREFIX=/home/myname/sundials/instdir \
> -DEXAMPLES_INSTALL_PATH=/home/myname/sundials/instdir/examples \
> -DMPI_ENABLE=ON \
> -DFCMIX_ENABLE=ON \
> -DEXAMPLES_INSTALL=OFF \
> /home/myname/sundials/solverdir
% make
% make install

A.1.4 Working with external Libraries

The sundials suite contains many options to enable implementation flexibility when developing solutions. The following are some notes addressing specific configurations when using the supported third party libraries. When building sundials as a shared library, any external libraries used with sundials must also be built as a shared library, or as a static library compiled with the -fPIC flag.

Building with BLAS

sundials does not utilize BLAS directly, but it may be needed by other external libraries that sundials can be built with (e.g., LAPACK, PETSc, SuperLU_MT). To enable BLAS, set the BLAS_ENABLE option to ON. If the directory containing the BLAS library is in the LD_LIBRARY_PATH environment variable, CMake will set the BLAS_LIBRARIES variable accordingly; otherwise CMake will attempt to find the BLAS library in standard system locations. To explicitly tell CMake what libraries to use, the BLAS_LIBRARIES variable can be set to the desired library. Example:

% cmake \
> -DCMAKE_INSTALL_PREFIX=/home/myname/sundials/instdir \
> -DEXAMPLES_INSTALL_PATH=/home/myname/sundials/instdir/examples \
> -DBLAS_ENABLE=ON \
> -DBLAS_LIBRARIES=/myblaspath/lib/libblas.so \
> -DSUPERLUMT_ENABLE=ON \
> -DSUPERLUMT_INCLUDE_DIR=/mysuperlumtpath/SRC \
> -DSUPERLUMT_LIBRARY_DIR=/mysuperlumtpath/lib \
> /home/myname/sundials/solverdir
% make
% make install

When allowing CMake to automatically locate the LAPACK library, CMake may also locate the corresponding BLAS library. If a working Fortran compiler is not available to infer the Fortran name-mangling scheme, the options SUNDIALS_F77_FUNC_CASE and SUNDIALS_F77_FUNC_UNDERSCORES must be set in order to bypass the check for a Fortran compiler and define the name-mangling scheme. The defaults for these options in earlier versions of sundials were lower and one, respectively.

Building with LAPACK

To enable LAPACK, set the LAPACK_ENABLE option to ON. If the directory containing the LAPACK library is in the LD_LIBRARY_PATH environment variable, CMake will set the LAPACK_LIBRARIES variable accordingly; otherwise CMake will attempt to find the LAPACK library in standard system locations. To explicitly tell CMake what library to use, the LAPACK_LIBRARIES variable can be set to the desired libraries. When setting the LAPACK location explicitly, the location of the corresponding BLAS library will also need to be set. Example:

% cmake \
> -DCMAKE_INSTALL_PREFIX=/home/myname/sundials/instdir \
> -DEXAMPLES_INSTALL_PATH=/home/myname/sundials/instdir/examples \
> -DBLAS_ENABLE=ON \
> -DBLAS_LIBRARIES=/mylapackpath/lib/libblas.so \
> -DLAPACK_ENABLE=ON \
> -DLAPACK_LIBRARIES=/mylapackpath/lib/liblapack.so \
> /home/myname/sundials/solverdir
% make
% make install

When allowing CMake to automatically locate the LAPACK library, CMake may also locate the corresponding BLAS library.
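If no Fortran compiler is installed at all, the two name-mangling options can be combined with an explicit LAPACK configuration. The following sketch assumes a hypothetical /mylapackpath prefix and the common lower-case, single-underscore scheme; both are illustrative assumptions, so adjust them to match how your LAPACK library was actually compiled.

```shell
# Sketch only: /mylapackpath is a placeholder path, and lower/one is the
# common GNU Fortran scheme -- verify against your LAPACK build before use.
cmake \
  -DCMAKE_INSTALL_PREFIX=/home/myname/sundials/instdir \
  -DLAPACK_ENABLE=ON \
  -DLAPACK_LIBRARIES="/mylapackpath/lib/liblapack.so;/mylapackpath/lib/libblas.so" \
  -DSUNDIALS_F77_FUNC_CASE=lower \
  -DSUNDIALS_F77_FUNC_UNDERSCORES=one \
  /home/myname/sundials/solverdir
```

Setting both options together bypasses the Fortran compiler check entirely, which is the intended use described above.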
If a working Fortran compiler is not available to infer the Fortran name-mangling scheme, the options SUNDIALS_F77_FUNC_CASE and SUNDIALS_F77_FUNC_UNDERSCORES must be set in order to bypass the check for a Fortran compiler and define the name-mangling scheme. The defaults for these options in earlier versions of sundials were lower and one, respectively.

Building with KLU

The KLU libraries are part of SuiteSparse, a suite of sparse matrix software, available from the Texas A&M University website: http://faculty.cse.tamu.edu/davis/suitesparse.html. sundials has been tested with SuiteSparse version 4.5.3. To enable KLU, set KLU_ENABLE to ON, set KLU_INCLUDE_DIR to the include path of the KLU installation, and set KLU_LIBRARY_DIR to the lib path of the KLU installation. The CMake configure will result in populating the following variables: AMD_LIBRARY, AMD_LIBRARY_DIR, BTF_LIBRARY, BTF_LIBRARY_DIR, COLAMD_LIBRARY, COLAMD_LIBRARY_DIR, and KLU_LIBRARY.

Building with SuperLU_MT

The SuperLU_MT libraries are available for download from the Lawrence Berkeley National Laboratory website: http://crd-legacy.lbl.gov/~xiaoye/SuperLU/#superlu_mt. sundials has been tested with SuperLU_MT version 3.1. To enable SuperLU_MT, set SUPERLUMT_ENABLE to ON, set SUPERLUMT_INCLUDE_DIR to the SRC path of the SuperLU_MT installation, and set the variable SUPERLUMT_LIBRARY_DIR to the lib path of the SuperLU_MT installation. At the same time, the variable SUPERLUMT_THREAD_TYPE must be set to either Pthread or OpenMP. Do not mix thread types when building sundials solvers. If threading is enabled for sundials by setting either OPENMP_ENABLE or PTHREAD_ENABLE to ON, then SuperLU_MT should be set to use the same threading type.

Building with PETSc

The PETSc libraries are available for download from the Argonne National Laboratory website: http://www.mcs.anl.gov/petsc. sundials has been tested with PETSc version 3.7.2.
To enable PETSc, set PETSC_ENABLE to ON, set PETSC_INCLUDE_DIR to the include path of the PETSc installation, and set the variable PETSC_LIBRARY_DIR to the lib path of the PETSc installation.

Building with hypre

The hypre libraries are available for download from the Lawrence Livermore National Laboratory website: http://computation.llnl.gov/projects/hypre. sundials has been tested with hypre version 2.11.1. To enable hypre, set HYPRE_ENABLE to ON, set HYPRE_INCLUDE_DIR to the include path of the hypre installation, and set the variable HYPRE_LIBRARY_DIR to the lib path of the hypre installation.

Building with CUDA

sundials CUDA modules and examples have been tested with version 8.0 of the CUDA Toolkit. To build them, you need to install the Toolkit and compatible NVIDIA drivers. Both are available for download from the NVIDIA website: https://developer.nvidia.com/cuda-downloads. To enable CUDA, set CUDA_ENABLE to ON. If CUDA is installed in a nonstandard location, you may be prompted to set the variable CUDA_TOOLKIT_ROOT_DIR with your CUDA Toolkit installation path. To enable the CUDA examples, set EXAMPLES_ENABLE_CUDA to ON.

Building with RAJA

RAJA is a performance portability layer developed by Lawrence Livermore National Laboratory and can be obtained from https://github.com/LLNL/RAJA. sundials RAJA modules and examples have been tested with RAJA version 0.3. Building the sundials RAJA modules requires a CUDA-enabled RAJA installation. To enable RAJA, set CUDA_ENABLE and RAJA_ENABLE to ON. If RAJA is installed in a nonstandard location, you will be prompted to set the variable RAJA_DIR with the path to the RAJA CMake configuration file. To enable building the RAJA examples, set EXAMPLES_ENABLE_RAJA to ON.
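Putting the third-party sections above together, a configuration enabling both KLU and hypre might look like the following sketch. The /mysuitesparsepath and /myhyprepath prefixes are hypothetical placeholders, not paths the manual prescribes; substitute the prefixes of your own installations.

```shell
# Sketch: the third-party install prefixes below are placeholders.
cmake \
  -DCMAKE_INSTALL_PREFIX=/home/myname/sundials/instdir \
  -DKLU_ENABLE=ON \
  -DKLU_INCLUDE_DIR=/mysuitesparsepath/include \
  -DKLU_LIBRARY_DIR=/mysuitesparsepath/lib \
  -DHYPRE_ENABLE=ON \
  -DHYPRE_INCLUDE_DIR=/myhyprepath/include \
  -DHYPRE_LIBRARY_DIR=/myhyprepath/lib \
  /home/myname/sundials/solverdir
```

Each library follows the same pattern: an *_ENABLE switch plus *_INCLUDE_DIR and *_LIBRARY_DIR paths, so additional libraries can be added to the same invocation.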
A.1.5 Testing the build and installation

If sundials was configured with any of the EXAMPLES_ENABLE options set to ON, then a set of regression tests can be run after building with the make command by running:

% make test

Additionally, if EXAMPLES_INSTALL was also set to ON, then a set of smoke tests can be run after installing with the make install command by running:

% make test_install

A.2 Building and Running Examples

Each of the sundials solvers is distributed with a set of examples demonstrating basic usage. To build and install the examples, set at least one of the EXAMPLES_ENABLE options to ON, and set EXAMPLES_INSTALL to ON. Specify the installation path for the examples with the variable EXAMPLES_INSTALL_PATH. CMake will generate CMakeLists.txt configuration files (and Makefile files if on Linux/Unix) that reference the installed sundials headers and libraries. Either the CMakeLists.txt file or the traditional Makefile may be used to build the examples, and may also serve as a template for creating user-developed solutions. To use the supplied Makefile, simply run make to compile and generate the executables. To use CMake from within the installed example directory, run cmake (or ccmake to use the GUI) followed by make to compile the example code. Note that if CMake is used, it will overwrite the traditional Makefile with a new CMake-generated Makefile. The resulting output from running the examples can be compared with the example output bundled in the sundials distribution. Note: There will potentially be differences in the output due to machine architecture, compiler versions, use of third party libraries, etc.

A.3 Configuring, building, and installing on Windows

CMake can also be used to build sundials on Windows. To build sundials for use with Visual Studio, the following steps should be performed:

1. Unzip the downloaded tar file(s) into a directory. This will be the solverdir
2. Create a separate builddir
3.
Open a Visual Studio Command Prompt and cd to builddir
4. Run cmake-gui ../solverdir
   (a) Hit Configure
   (b) Check/Uncheck solvers to be built
   (c) Change CMAKE_INSTALL_PREFIX to instdir
   (d) Set other options as desired
   (e) Hit Generate
5. Back in the VS Command Window:
   (a) Run msbuild ALL_BUILD.vcxproj
   (b) Run msbuild INSTALL.vcxproj

The resulting libraries will be in the instdir. The sundials project can also now be opened in Visual Studio: double click on the ALL_BUILD.vcxproj file to open the project, then build the whole solution to create the sundials libraries. To use the sundials libraries in your own projects, you must set the include directories for your project, add the sundials libraries to your project solution, and set the sundials libraries as dependencies for your project.

A.4 Installed libraries and exported header files

Using the CMake sundials build system, the command

% make install

will install the libraries under libdir and the public header files under includedir. The values for these directories are instdir/CMAKE_INSTALL_LIBDIR and instdir/include, respectively. The location can be changed by setting the CMake variable CMAKE_INSTALL_PREFIX. Although all installed libraries reside under libdir/CMAKE_INSTALL_LIBDIR, the public header files are further organized into subdirectories under includedir/include. The installed libraries and exported header files are listed for reference in Table A.1. The file extension .lib is typically .so for shared libraries and .a for static libraries. Note that, in the table, names are relative to libdir for libraries and to includedir for header files. A typical user program need not explicitly include any of the shared sundials header files from under the includedir/include/sundials directory, since they are explicitly included by the appropriate solver header files (e.g., cvode_dense.h includes sundials_dense.h).
However, it is both legal and safe to do so, and would be useful, for example, if the functions declared in sundials_dense.h are to be used in building a preconditioner.

Table A.1: sundials libraries and header files

shared
  Libraries: n/a
  Header files: sundials/sundials_config.h, sundials/sundials_fconfig.h, sundials/sundials_types.h, sundials/sundials_math.h, sundials/sundials_nvector.h, sundials/sundials_fnvector.h, sundials/sundials_matrix.h, sundials/sundials_linearsolver.h, sundials/sundials_iterative.h, sundials/sundials_direct.h, sundials/sundials_dense.h, sundials/sundials_band.h, sundials/sundials_nonlinearsolver.h, sundials/sundials_version.h, sundials/sundials_mpi_types.h

nvector_serial
  Libraries: libsundials_nvecserial.lib, libsundials_fnvecserial_mod.lib, libsundials_fnvecserial.a
  Header files: nvector/nvector_serial.h
  Module files: fnvector_serial_mod.mod

nvector_parallel
  Libraries: libsundials_nvecparallel.lib, libsundials_fnvecparallel.a
  Header files: nvector/nvector_parallel.h

nvector_openmp
  Libraries: libsundials_nvecopenmp.lib, libsundials_fnvecopenmp_mod.lib, libsundials_fnvecopenmp.a
  Header files: nvector/nvector_openmp.h
  Module files: fnvector_openmp_mod.mod

nvector_openmpdev
  Libraries: libsundials_nvecopenmpdev.lib
  Header files: nvector/nvector_openmpdev.h

nvector_pthreads
  Libraries: libsundials_nvecpthreads.lib, libsundials_fnvecpthreads_mod.lib, libsundials_fnvecpthreads.a
  Header files: nvector/nvector_pthreads.h
  Module files: fnvector_pthreads_mod.mod

nvector_parhyp
  Libraries: libsundials_nvecparhyp.lib
  Header files: nvector/nvector_parhyp.h

nvector_petsc
  Libraries: libsundials_nvecpetsc.lib
  Header files: nvector/nvector_petsc.h

nvector_cuda
  Libraries: libsundials_nveccuda.lib, libsundials_nvecmpicuda.lib
  Header files: nvector/nvector_cuda.h, nvector/nvector_mpicuda.h, nvector/cuda/ThreadPartitioning.hpp, nvector/cuda/Vector.hpp, nvector/cuda/VectorKernels.cuh

nvector_raja
  Libraries: libsundials_nveccudaraja.lib, libsundials_nveccudampiraja.lib
  Header files: nvector/nvector_raja.h, nvector/nvector_mpiraja.h, nvector/raja/Vector.hpp

sunmatrix_band
  Libraries: libsundials_sunmatrixband.lib, libsundials_fsunmatrixband_mod.lib, libsundials_fsunmatrixband.a
  Header files: sunmatrix/sunmatrix_band.h
  Module files: fsunmatrix_band_mod.mod

sunmatrix_dense
  Libraries: libsundials_sunmatrixdense.lib, libsundials_fsunmatrixdense_mod.lib, libsundials_fsunmatrixdense.a
  Header files: sunmatrix/sunmatrix_dense.h
  Module files: fsunmatrix_dense_mod.mod

sunmatrix_sparse
  Libraries: libsundials_sunmatrixsparse.lib, libsundials_fsunmatrixsparse_mod.lib, libsundials_fsunmatrixsparse.a
  Header files: sunmatrix/sunmatrix_sparse.h
  Module files: fsunmatrix_sparse_mod.mod

sunlinsol_band
  Libraries: libsundials_sunlinsolband.lib, libsundials_fsunlinsolband_mod.lib, libsundials_fsunlinsolband.a
  Header files: sunlinsol/sunlinsol_band.h
  Module files: fsunlinsol_band_mod.mod

sunlinsol_dense
  Libraries: libsundials_sunlinsoldense.lib, libsundials_fsunlinsoldense_mod.lib, libsundials_fsunlinsoldense.a
  Header files: sunlinsol/sunlinsol_dense.h
  Module files: fsunlinsol_dense_mod.mod

sunlinsol_klu
  Libraries: libsundials_sunlinsolklu.lib, libsundials_fsunlinsolklu_mod.lib, libsundials_fsunlinsolklu.a
  Header files: sunlinsol/sunlinsol_klu.h
  Module files: fsunlinsol_klu_mod.mod

sunlinsol_lapackband
  Libraries: libsundials_sunlinsollapackband.lib, libsundials_fsunlinsollapackband.a
  Header files: sunlinsol/sunlinsol_lapackband.h

sunlinsol_lapackdense
  Libraries: libsundials_sunlinsollapackdense.lib, libsundials_fsunlinsollapackdense.a
  Header files: sunlinsol/sunlinsol_lapackdense.h

sunlinsol_pcg
  Libraries: libsundials_sunlinsolpcg.lib, libsundials_fsunlinsolpcg_mod.lib, libsundials_fsunlinsolpcg.a
  Header files: sunlinsol/sunlinsol_pcg.h
  Module files: fsunlinsol_pcg_mod.mod

sunlinsol_spbcgs
  Libraries: libsundials_sunlinsolspbcgs.lib, libsundials_fsunlinsolspbcgs_mod.lib, libsundials_fsunlinsolspbcgs.a
  Header files: sunlinsol/sunlinsol_spbcgs.h
  Module files: fsunlinsol_spbcgs_mod.mod

sunlinsol_spfgmr
  Libraries: libsundials_sunlinsolspfgmr.lib, libsundials_fsunlinsolspfgmr_mod.lib, libsundials_fsunlinsolspfgmr.a
  Header files: sunlinsol/sunlinsol_spfgmr.h
  Module files: fsunlinsol_spfgmr_mod.mod

sunlinsol_spgmr
  Libraries: libsundials_sunlinsolspgmr.lib, libsundials_fsunlinsolspgmr_mod.lib, libsundials_fsunlinsolspgmr.a
  Header files: sunlinsol/sunlinsol_spgmr.h
  Module files: fsunlinsol_spgmr_mod.mod

sunlinsol_sptfqmr
  Libraries: libsundials_sunlinsolsptfqmr.lib, libsundials_fsunlinsolsptfqmr_mod.lib, libsundials_fsunlinsolsptfqmr.a
  Header files: sunlinsol/sunlinsol_sptfqmr.h
  Module files: fsunlinsol_sptfqmr_mod.mod

sunlinsol_superlumt
  Libraries: libsundials_sunlinsolsuperlumt.lib, libsundials_fsunlinsolsuperlumt.a
  Header files: sunlinsol/sunlinsol_superlumt.h

sunnonlinsol_newton
  Libraries: libsundials_sunnonlinsolnewton.lib, libsundials_fsunnonlinsolnewton_mod.lib, libsundials_fsunnonlinsolnewton.a
  Header files: sunnonlinsol/sunnonlinsol_newton.h
  Module files: fsunnonlinsol_newton_mod.mod

sunnonlinsol_fixedpoint
  Libraries: libsundials_sunnonlinsolfixedpoint.lib, libsundials_fsunnonlinsolfixedpoint_mod.lib, libsundials_fsunnonlinsolfixedpoint.a
  Header files: sunnonlinsol/sunnonlinsol_fixedpoint.h
  Module files: fsunnonlinsol_fixedpoint_mod.mod

cvode
  Libraries: libsundials_cvode.lib, libsundials_fcvode.a
  Header files: cvode/cvode.h, cvode/cvode_impl.h, cvode/cvode_direct.h, cvode/cvode_ls.h, cvode/cvode_spils.h, cvode/cvode_bandpre.h, cvode/cvode_bbdpre.h
  Module files: fcvode_mod.mod

cvodes
  Libraries: libsundials_cvodes.lib
  Header files: cvodes/cvodes.h, cvodes/cvodes_impl.h, cvodes/cvodes_direct.h, cvodes/cvodes_ls.h, cvodes/cvodes_spils.h, cvodes/cvodes_bandpre.h, cvodes/cvodes_bbdpre.h

arkode
  Libraries: libsundials_arkode.lib, libsundials_farkode.a
  Header files: arkode/arkode.h, arkode/arkode_impl.h, arkode/arkode_ls.h, arkode/arkode_bandpre.h, arkode/arkode_bbdpre.h

ida
  Libraries: libsundials_ida.lib, libsundials_fida.a
  Header files: ida/ida.h, ida/ida_impl.h, ida/ida_direct.h, ida/ida_ls.h, ida/ida_spils.h, ida/ida_bbdpre.h

idas
  Libraries: libsundials_idas.lib
  Header files: idas/idas.h, idas/idas_impl.h, idas/idas_direct.h, idas/idas_ls.h, idas/idas_spils.h, idas/idas_bbdpre.h

kinsol
  Libraries: libsundials_kinsol.lib, libsundials_fkinsol.a
  Header files: kinsol/kinsol.h, kinsol/kinsol_impl.h, kinsol/kinsol_direct.h, kinsol/kinsol_ls.h, kinsol/kinsol_spils.h, kinsol/kinsol_bbdpre.h

Appendix B

IDAS Constants

Below we list all input and output constants used by the main solver and linear solver modules, together with their numerical values and a short description of their meaning.

B.1 IDAS input constants

idas main solver module

IDA_NORMAL        1   Solver returns at specified output time.
IDA_ONE_STEP      2   Solver returns after each successful step.
IDA_SIMULTANEOUS  1   Simultaneous corrector forward sensitivity method.
IDA_STAGGERED     2   Staggered corrector forward sensitivity method.
IDA_CENTERED      1   Central difference quotient approximation (2nd order) of the sensitivity RHS.
IDA_FORWARD       2   Forward difference quotient approximation (1st order) of the sensitivity RHS.
IDA_YA_YDP_INIT   1   Compute ya and ẏd, given yd.
IDA_Y_INIT        2   Compute y, given ẏ.

idas adjoint solver module

IDA_HERMITE       1   Use Hermite interpolation.
IDA_POLYNOMIAL    2   Use variable-degree polynomial interpolation.

Iterative linear solver module

PREC_NONE         0   No preconditioning.
PREC_LEFT         1   Preconditioning on the left.
MODIFIED_GS       1   Use modified Gram-Schmidt procedure.
CLASSICAL_GS      2   Use classical Gram-Schmidt procedure.

B.2
IDAS output constants idas main solver module 340 IDA IDA IDA IDA IDA IDAS Constants SUCCESS TSTOP RETURN ROOT RETURN WARNING TOO MUCH WORK 0 1 2 99 -1 IDA TOO MUCH ACC -2 IDA ERR FAIL -3 IDA CONV FAIL -4 IDA LINIT FAIL IDA LSETUP FAIL -5 -6 IDA LSOLVE FAIL -7 IDA RES FAIL -8 IDA REP RES FAIL -9 IDA RTFUNC FAIL IDA CONSTR FAIL -10 -11 IDA FIRST RES FAIL -12 IDA LINESEARCH FAIL IDA NO RECOVERY -13 -14 IDA IDA IDA IDA IDA IDA IDA IDA IDA IDA -15 -16 -20 -21 -22 -23 -24 -25 -26 -27 NLS INIT FAIL NLS SETUP FAIL MEM NULL MEM FAIL ILL INPUT NO MALLOC BAD EWT BAD K BAD T BAD DKY IDA NO QUAD IDA QRHS FAIL -30 -31 IDA FIRST QRHS ERR -32 Successful function return. IDASolve succeeded by reaching the specified stopping point. IDASolve succeeded and found one or more roots. IDASolve succeeded but an unusual situation occurred. The solver took mxstep internal steps but could not reach tout. The solver could not satisfy the accuracy demanded by the user for some internal step. Error test failures occurred too many times during one internal time step or minimum step size was reached. Convergence test failures occurred too many times during one internal time step or minimum step size was reached. The linear solver’s initialization function failed. The linear solver’s setup function failed in an unrecoverable manner. The linear solver’s solve function failed in an unrecoverable manner. The user-provided residual function failed in an unrecoverable manner. The user-provided residual function repeatedly returned a recoverable error flag, but the solver was unable to recover. The rootfinding function failed in an unrecoverable manner. The inequality constraints were violated and the solver was unable to recover. The user-provided residual function failed recoverably on the first call. The line search failed. The residual function, linear solver setup function, or linear solver solve function had a recoverable failure, but IDACalcIC could not recover. 
The nonlinear solver’s init routine failed. The nonlinear solver’s setup routine failed. The ida mem argument was NULL. A memory allocation failed. One of the function inputs is illegal. The idas memory was not allocated by a call to IDAInit. Zero value of some error weight component. The k-th derivative is not available. The time t is outside the last step taken. The vector argument where derivative should be stored is NULL. Quadratures were not initialized. The user-provided right-hand side function for quadratures failed in an unrecoverable manner. The user-provided right-hand side function for quadratures failed in an unrecoverable manner on the first call. B.2 IDAS output constants 341 IDA REP QRHS ERR -33 IDA NO SENS IDA SRES FAIL -40 -41 IDA REP SRES ERR -42 IDA BAD IS IDA NO QUADSENS IDA QSRHS FAIL -43 -50 -51 IDA FIRST QSRHS ERR -52 IDA REP QSRHS ERR -53 The user-provided right-hand side repeatedly returned a recoverable error flag, but the solver was unable to recover. Sensitivities were not initialized. The user-provided sensitivity residual function failed in an unrecoverable manner. The user-provided sensitivity residual function repeatedly returned a recoverable error flag, but the solver was unable to recover. The sensitivity identifier is not valid. Sensitivity-dependent quadratures were not initialized. The user-provided sensitivity-dependent quadrature righthand side function failed in an unrecoverable manner. The user-provided sensitivity-dependent quadrature righthand side function failed in an unrecoverable manner on the first call. The user-provided sensitivity-dependent quadrature righthand side repeatedly returned a recoverable error flag, but the solver was unable to recover. idas adjoint solver module IDA NO ADJ -101 IDA NO FWD IDA NO BCK IDA BAD TB0 -102 -103 -104 IDA REIFWD FAIL IDA FWD FAIL -105 -106 IDA GETY BADT -107 The combined forward-backward problem has not been initialized. IDASolveF has not been previously called. 
No backward problem was specified. The desired output for backward problem is outside the interval over which the forward problem was solved. No checkpoint is available for this hot start. IDASolveB failed because IDASolve was unable to store data between two consecutive checkpoints. Wrong time in interpolation function. idals linear solver interface IDALS IDALS IDALS IDALS SUCCESS MEM NULL LMEM NULL ILL INPUT 0 -1 -2 -3 IDALS IDALS IDALS IDALS IDALS IDALS MEM FAIL PMEM NULL JACFUNC UNRECVR JACFUNC RECVR SUNMAT FAIL SUNLS FAIL -4 -5 -6 -7 -8 -9 Successful function return. The ida mem argument was NULL. The idals linear solver has not been initialized. The idals solver is not compatible with the current nvector module, or an input value was illegal. A memory allocation request failed. The preconditioner module has not been initialized. The Jacobian function failed in an unrecoverable manner. The Jacobian function had a recoverable error. An error occurred with the current sunmatrix module. An error occurred with the current sunlinsol module. 342 IDAS Constants IDALS NO ADJ -101 IDALS LMEMB NULL -102 The combined forward-backward problem has not been initialized. The linear solver was not initialized for the backward phase. Bibliography [1] KLU Sparse Matrix Factorization Library. http://faculty.cse.tamu.edu/davis/suitesparse.html. [2] SuperLU MT Threaded Sparse Matrix Factorization Library. http://crd-legacy.lbl.gov/ xiaoye/SuperLU/. [3] D. G. Anderson. Iterative procedures for nonlinear integral equations. J. Assoc. Comput. Machinery, 12:547–560, 1965. [4] K. E. Brenan, S. L. Campbell, and L. R. Petzold. Numerical Solution of Initial-Value Problems in Differential-Algebraic Equations. SIAM, Philadelphia, Pa, 1996. [5] P. N. Brown. A local convergence theory for combined inexact-Newton/finite difference projection methods. SIAM J. Numer. Anal., 24(2):407–434, 1987. [6] P. N. Brown and A. C. Hindmarsh. Reduced Storage Matrix Methods in Stiff ODE Systems. J. Appl. 
Math. & Comp., 31:49–91, 1989.
[7] P. N. Brown, A. C. Hindmarsh, and L. R. Petzold. Using Krylov Methods in the Solution of Large-Scale Differential-Algebraic Systems. SIAM J. Sci. Comput., 15:1467–1488, 1994.
[8] P. N. Brown, A. C. Hindmarsh, and L. R. Petzold. Consistent Initial Condition Calculation for Differential-Algebraic Systems. SIAM J. Sci. Comput., 19:1495–1512, 1998.
[9] P. N. Brown and Y. Saad. Hybrid Krylov Methods for Nonlinear Systems of Equations. SIAM J. Sci. Stat. Comput., 11:450–481, 1990.
[10] G. D. Byrne. Pragmatic Experiments with Krylov Methods in the Stiff ODE Setting. In J. R. Cash and I. Gladwell, editors, Computational Ordinary Differential Equations, pages 323–356, Oxford, 1992. Oxford University Press.
[11] G. D. Byrne and A. C. Hindmarsh. User Documentation for PVODE, An ODE Solver for Parallel Computers. Technical Report UCRL-ID-130884, LLNL, May 1998.
[12] G. D. Byrne and A. C. Hindmarsh. PVODE, An ODE Solver for Parallel Computers. Intl. J. High Perf. Comput. Apps., 13(4):254–365, 1999.
[13] Y. Cao, S. Li, L. R. Petzold, and R. Serban. Adjoint Sensitivity Analysis for Differential-Algebraic Equations: The Adjoint DAE System and its Numerical Solution. SIAM J. Sci. Comput., 24(3):1076–1089, 2003.
[14] M. Caracotsios and W. E. Stewart. Sensitivity Analysis of Initial Value Problems with Mixed ODEs and Algebraic Equations. Computers and Chemical Engineering, 9:359–365, 1985.
[15] S. D. Cohen and A. C. Hindmarsh. CVODE, a Stiff/Nonstiff ODE Solver in C. Computers in Physics, 10(2):138–143, 1996.
[16] A. M. Collier, A. C. Hindmarsh, R. Serban, and C. S. Woodward. User Documentation for KINSOL v4.0.0. Technical Report UCRL-SM-208116, LLNL, 2018.
[17] T. A. Davis and P. N. Ekanathan. Algorithm 907: KLU, a direct sparse solver for circuit simulation problems. ACM Trans. Math. Softw., 37(3), 2010.
[18] R. S. Dembo, S. C. Eisenstat, and T. Steihaug. Inexact Newton Methods. SIAM J. Numer. Anal., 19:400–408, 1982.
[19] J. W. Demmel, J. R. Gilbert, and X. S. Li. An asynchronous parallel supernodal algorithm for sparse Gaussian elimination. SIAM J. Matrix Analysis and Applications, 20(4):915–952, 1999.
[20] J. E. Dennis and R. B. Schnabel. Numerical Methods for Unconstrained Optimization and Nonlinear Equations. SIAM, Philadelphia, 1996.
[21] H. Fang and Y. Saad. Two classes of secant methods for nonlinear acceleration. Numer. Linear Algebra Appl., 16:197–221, 2009.
[22] W. F. Feehery, J. E. Tolsma, and P. I. Barton. Efficient Sensitivity Analysis of Large-Scale Differential-Algebraic Systems. Applied Numer. Math., 25(1):41–54, 1997.
[23] R. W. Freund. A Transpose-Free Quasi-Minimal Residual Algorithm for Non-Hermitian Linear Systems. SIAM J. Sci. Comp., 14:470–482, 1993.
[24] M. R. Hestenes and E. Stiefel. Methods of Conjugate Gradients for Solving Linear Systems. J. Research of the National Bureau of Standards, 49(6):409–436, 1952.
[25] K. L. Hiebert and L. F. Shampine. Implicitly Defined Output Points for Solutions of ODEs. Technical Report SAND80-0180, Sandia National Laboratories, February 1980.
[26] A. C. Hindmarsh, P. N. Brown, K. E. Grant, S. L. Lee, R. Serban, D. E. Shumaker, and C. S. Woodward. SUNDIALS, suite of nonlinear and differential/algebraic equation solvers. ACM Trans. Math. Softw., 31(3):363–396, 2005.
[27] A. C. Hindmarsh and R. Serban. User Documentation for CVODE v4.0.0. Technical Report UCRL-SM-208108, LLNL, 2018.
[28] A. C. Hindmarsh and R. Serban. User Documentation for CVODES v4.0.0. Technical Report UCRL-SM-208111, LLNL, 2018.
[29] A. C. Hindmarsh, R. Serban, and A. Collier. Example Programs for IDA v4.0.0. Technical Report UCRL-SM-208113, LLNL, 2018.
[30] A. C. Hindmarsh, R. Serban, and A. Collier. User Documentation for IDA v4.0.0. Technical Report UCRL-SM-208112, LLNL, 2018.
[31] A. C. Hindmarsh, R. Serban, and D. R. Reynolds. Example Programs for CVODE v4.0.0. Technical Report UCRL-SM-208110, LLNL, 2018.
[32] A. C. Hindmarsh and A. G. Taylor. PVODE and KINSOL: Parallel Software for Differential and Nonlinear Systems. Technical Report UCRL-ID-129739, LLNL, February 1998.
[33] C. T. Kelley. Iterative Methods for Solving Linear and Nonlinear Equations. SIAM, Philadelphia, 1995.
[34] S. Li, L. R. Petzold, and W. Zhu. Sensitivity Analysis of Differential-Algebraic Equations: A Comparison of Methods on a Special Problem. Applied Num. Math., 32:161–174, 2000.
[35] X. S. Li. An overview of SuperLU: Algorithms, implementation, and user interface. ACM Trans. Math. Softw., 31(3):302–325, September 2005.
[36] P. A. Lott, H. F. Walker, C. S. Woodward, and U. M. Yang. An accelerated Picard method for nonlinear systems related to variably saturated flow. Adv. Wat. Resour., 38:92–101, 2012.
[37] T. Maly and L. R. Petzold. Numerical Methods and Software for Sensitivity Analysis of Differential-Algebraic Systems. Applied Numerical Mathematics, 20:57–79, 1997.
[38] D. B. Ozyurt and P. I. Barton. Cheap second order directional derivatives of stiff ODE embedded functionals. SIAM J. of Sci. Comp., 26(5):1725–1743, 2005.
[39] D. R. Reynolds. Example Programs for ARKODE v3.0.0. Technical report, Southern Methodist University, 2018.
[40] Y. Saad. A flexible inner-outer preconditioned GMRES algorithm. SIAM J. Sci. Comput., 14(2):461–469, 1993.
[41] Y. Saad and M. H. Schultz. GMRES: A Generalized Minimal Residual Algorithm for Solving Nonsymmetric Linear Systems. SIAM J. Sci. Stat. Comp., 7:856–869, 1986.
[42] R. Serban and A. C. Hindmarsh. CVODES, the sensitivity-enabled ODE solver in SUNDIALS. In Proceedings of the 5th International Conference on Multibody Systems, Nonlinear Dynamics and Control, Long Beach, CA, 2005. ASME.
[43] R. Serban and A. C. Hindmarsh. Example Programs for IDAS v3.0.0. Technical Report LLNL-TR-437091, LLNL, 2018.
[44] H. A. Van Der Vorst. Bi-CGSTAB: A Fast and Smoothly Converging Variant of Bi-CG for the Solution of Nonsymmetric Linear Systems. SIAM J. Sci. Stat. Comp., 13:631–644, 1992.
[45] H. F. Walker and P. Ni. Anderson acceleration for fixed-point iterations. SIAM Jour. Num. Anal., 49(4):1715–1735, 2011.

Index

adjoint sensitivity analysis
  checkpointing, 22
  implementation in idas, 23, 26
  mathematical background, 20–23
  quadrature evaluation, 144
  residual evaluation, 142, 143
  sensitivity-dependent quadrature evaluation, 145
BIG_REAL, 32, 165
booleantype, 32
eh_data, 73
error control
  sensitivity variables, 19
error messages, 45
  redirecting, 45
  user-defined handler, 47, 73
fnvector_serial_mod, 174
forward sensitivity analysis
  absolute tolerance selection, 19
  correction strategies, 18, 26, 96, 97
  mathematical background, 17–20
  residual evaluation, 108
  right hand side evaluation, 20
  right-hand side evaluation, 19
FSUNBANDLINSOLINIT, 253
FSUNDENSELINSOLINIT, 251
FSUNFIXEDPOINTINIT, 316
FSUNKLUINIT, 262
FSUNKLUREINIT, 263
FSUNKLUSETORDERING, 264
FSUNLAPACKBANDINIT, 258
FSUNLAPACKDENSEINIT, 256
fsunlinsol_band_mod, 253
fsunlinsol_dense_mod, 251
fsunlinsol_klu_mod, 262
fsunlinsol_pcg_mod, 296
fsunlinsol_spbcgs_mod, 284
fsunlinsol_spfgmr_mod, 278
fsunlinsol_spgmr_mod, 272
fsunlinsol_sptfqmr_mod, 290
FSUNMASSBANDLINSOLINIT, 254
FSUNMASSDENSELINSOLINIT, 251
FSUNMASSKLUINIT, 263
FSUNMASSKLUREINIT, 263
FSUNMASSKLUSETORDERING, 264
FSUNMASSLAPACKBANDINIT, 259
FSUNMASSLAPACKDENSEINIT, 256
FSUNMASSPCGINIT, 296
FSUNMASSPCGSETMAXL, 297
FSUNMASSPCGSETPRECTYPE, 297
FSUNMASSSPBCGSINIT, 285
FSUNMASSSPBCGSSETMAXL, 286
FSUNMASSSPBCGSSETPRECTYPE, 286
FSUNMASSSPFGMRINIT, 279
FSUNMASSSPFGMRSETGSTYPE, 279
FSUNMASSSPFGMRSETMAXRS, 280
FSUNMASSSPFGMRSETPRECTYPE, 280
FSUNMASSSPGMRINIT, 272
FSUNMASSSPGMRSETGSTYPE, 273
FSUNMASSSPGMRSETMAXRS, 274
FSUNMASSSPGMRSETPRECTYPE, 273
FSUNMASSSPTFQMRINIT, 290
FSUNMASSSPTFQMRSETMAXL, 291
FSUNMASSSPTFQMRSETPRECTYPE, 291
FSUNMASSSUPERLUMTINIT, 267
FSUNMASSUPERLUMTSETORDERING, 268
fsunmatrix_band_mod, 227
fsunmatrix_dense_mod, 222
fsunmatrix_sparse_mod, 235
FSUNNEWTONINIT, 312
fsunnonlinsol_fixedpoint_mod, 316
fsunnonlinsol_newton_mod, 312
FSUNPCGINIT, 296
FSUNPCGSETMAXL, 297
FSUNPCGSETPRECTYPE, 297
FSUNSPBCGSINIT, 285
FSUNSPBCGSSETMAXL, 286
FSUNSPBCGSSETPRECTYPE, 285
FSUNSPFGMRINIT, 278
FSUNSPFGMRSETGSTYPE, 279
FSUNSPFGMRSETMAXRS, 280
FSUNSPFGMRSETPRECTYPE, 280
FSUNSPGMRINIT, 272
FSUNSPGMRSETGSTYPE, 273
FSUNSPGMRSETMAXRS, 274
FSUNSPGMRSETPRECTYPE, 273
FSUNSPTFQMRINIT, 290
FSUNSPTFQMRSETMAXL, 291
FSUNSPTFQMRSETPRECTYPE, 291
FSUNSUPERLUMTINIT, 267
FSUNSUPERLUMTSETORDERING, 267
half-bandwidths, 89
header files, 33, 88
ida/ida_ls.h, 33
IDA_BAD_DKY, 58, 83, 101–103, 113, 114
IDA_BAD_EWT, 43
IDA_BAD_IS, 102, 103, 114
IDA_BAD_ITASK, 133
IDA_BAD_K, 58, 83, 102, 103, 113, 114
IDA_BAD_T, 58, 83, 102, 103, 113, 114
IDA_BAD_TB0, 128, 129
IDA_BAD_TBOUT, 133
IDA_BCKMEM_NULL, 133
IDA_CENTERED, 104
IDA_CONSTR_FAIL, 43, 45
IDA_CONV_FAIL, 43, 45
IDA_CONV_FAILURE, 127, 133
IDA_ERR_FAIL, 45
IDA_ERR_FAILURE, 127, 133
IDA_FIRST_QRHS_ERR, 82, 86
IDA_FIRST_QSRHS_ERR, 112, 118
IDA_FIRST_RES_FAIL, 43, 109
IDA_FORWARD, 104
IDA_FWD_FAIL, 133
IDA_GETY_BADT, 139
IDA_HERMITE, 125
IDA_ILL_INPUT, 38, 39, 42–44, 48–51, 55–57, 66, 72, 84, 97–100, 103, 104, 108, 111, 112, 115, 125, 127–130, 133, 134, 140–142
IDA_LINESEARCH_FAIL, 43
IDA_LINIT_FAIL, 43, 45
IDA_LMEM_NULL, 70
IDA_LSETUP_FAIL, 43, 45, 127, 133, 146, 147, 156, 157
IDA_LSOLVE_FAIL, 43, 45, 127
IDA_MEM_FAIL, 38, 49, 65, 66, 81, 97, 98, 105, 107, 108, 111, 125, 127, 128, 140, 141
IDA_MEM_NULL, 38, 39, 42–44, 47–51, 55–58, 60–67, 70, 72, 81–85, 97–108, 111–117, 126, 128–130, 133, 134, 139–142
IDA_NO_ADJ, 125–134, 140–142
IDA_NO_BCK, 133
IDA_NO_FWD, 133
IDA_NO_MALLOC, 39, 43, 72, 127–130
IDA_NO_QUAD, 82–85, 115, 141
IDA_NO_QUADSENS, 112–117
IDA_NO_RECOVERY, 43
IDA_NO_SENS, 98, 99, 101–103, 105–108, 111, 112, 114
IDA_NORMAL, 44, 122, 126, 133
IDA_ONE_STEP, 44, 122, 126, 133
IDA_POLYNOMIAL, 125
IDA_QRHS_FAIL, 82, 86, 118
IDA_QRHSFUNC_FAIL, 144, 145
IDA_QSRHS_FAIL, 112
IDA_REIFWD_FAIL, 133
IDA_REP_QRHS_ERR, 82
IDA_REP_QSRHS_ERR, 112
IDA_REP_RES_ERR, 45
IDA_REP_SRES_ERR, 101
IDA_RES_FAIL, 43, 45
IDA_RESFUNC_FAIL, 143, 144
IDA_ROOT_RETURN, 44, 127
IDA_RTFUNC_FAIL, 45, 74
IDA_SIMULTANEOUS, 26, 97
IDA_SOLVE_FAIL, 133
IDA_SRES_FAIL, 101, 109
IDA_STAGGERED, 26, 97
IDA_SUCCESS, 38, 39, 42–44, 47–51, 55–58, 67, 70, 72, 81–85, 97–108, 111–117, 125–130, 133, 134, 139–142
IDA_TOO_MUCH_ACC, 45, 127, 133
IDA_TOO_MUCH_WORK, 45, 127, 133
IDA_TSTOP_RETURN, 44, 127
IDA_WARNING, 73
IDA_Y_INIT, 43
IDA_YA_YDP_INIT, 42
IDAAdjFree, 125
IDAAdjInit, 122, 125
IDAAdjReInit, 125
IDAAdjSetNoSensi, 126
idabbdpre preconditioner
  description, 86–87
  optional output, 91
  usage, 88–89
  usage with adjoint module, 154–157
  user-callable functions, 89–91, 154–155
  user-supplied functions, 87–88, 156–157
IDABBDPrecGetNumGfnEvals, 91
IDABBDPrecGetWorkSpace, 91
IDABBDPrecInit, 89
IDABBDPrecInitB, 154
IDABBDPrecReInit, 90
IDABBDPrecReInitB, 155
IDACalcIC, 42
IDACalcICB, 131
IDACalcICBS, 131, 132
IDACreate, 38
IDACreateB, 122, 127
IDADlsGetLastFlag, 71
IDADlsGetNumJacEvals, 68
IDADlsGetNumRhsEvals, 68
IDADlsGetReturnFlagName, 71
IDADlsGetWorkspace, 68
IDADlsJacFn, 76
IDADlsJacFnB, 147
IDADlsJacFnBS, 148
IDADlsSetJacFn, 52
IDADlsSetJacFnB, 135
IDADlsSetJacFnBS, 135
IDADlsSetLinearSolver, 41
IDADlsSetLinearSolverB, 131
IDAErrHandlerFn, 73
IDAEwtFn, 73
IDAFree, 37, 38
IDAGetActualInitStep, 63
IDAGetAdjCheckPointsInfo, 139
IDAGetAdjIDABmem, 138
IDAGetAdjY, 139
IDAGetB, 133
IDAGetConsistentIC, 66
IDAGetConsistentICB, 140
IDAGetCurrentOrder, 62
IDAGetCurrentStep, 63
IDAGetCurrentTime, 63
IDAGetDky, 58
IDAGetErrWeights, 64
IDAGetEstLocalErrors, 64
IDAGetIntegratorStats, 64
IDAGetLastLinFlag, 70
IDAGetLastOrder, 62
IDAGetLastStep, 63
IDAGetLinReturnFlagName, 71
IDAGetLinWorkSpace, 67
IDAGetNonlinSolvStats, 65
IDAGetNumBacktrackOps, 66
IDAGetNumErrTestFails, 62
IDAGetNumGEvals, 67
IDAGetNumJacEvals, 68
IDAGetNumJtimesEvals, 70
IDAGetNumJTSetupEvals, 70
IDAGetNumLinConvFails, 69
IDAGetNumLinIters, 68
IDAGetNumLinResEvals, 68
IDAGetNumLinSolvSetups, 62
IDAGetNumNonlinSolvConvFails, 65
IDAGetNumNonlinSolvIters, 65
IDAGetNumPrecEvals, 69
IDAGetNumPrecSolves, 69
IDAGetNumResEvals, 61
IDAGetNumResEvalsSens, 105
IDAGetNumSteps, 61
IDAGetQuad, 82, 141
IDAGetQuadB, 124, 141
IDAGetQuadDky, 83
IDAGetQuadErrWeights, 85
IDAGetQuadNumErrTestFails, 85
IDAGetQuadNumRhsEvals, 84
IDAGetQuadSens, 113
IDAGetQuadSens1, 113
IDAGetQuadSensDky, 113
IDAGetQuadSensDky1, 114
IDAGetQuadSensErrWeights, 116
IDAGetQuadSensNumErrTestFails, 116
IDAGetQuadSensNumRhsEvals, 116
IDAGetQuadSensStats, 117
IDAGetQuadStats, 85
IDAGetReturnFlagName, 66
IDAGetRootInfo, 67
IDAGetSens, 96, 101
IDAGetSens1, 96, 102
IDAGetSensConsistentIC, 108
IDAGetSensDky, 96, 101, 102
IDAGetSensDky1, 96, 102
IDAGetSensErrWeights, 107
IDAGetSensNonlinSolvStats, 107
IDAGetSensNumErrTestFails, 106
IDAGetSensNumLinSolvSetups, 106
IDAGetSensNumNonlinSolvConvFails, 107
IDAGetSensNumNonlinSolvIters, 107
IDAGetSensNumResEvals, 105
IDAGetSensStats, 106
IDAGetTolScaleFactor, 64
IDAGetWorkSpace, 60
IDAInit, 38, 71
IDAInitB, 123, 128
IDAInitBS, 123, 128
idals linear solver interface
  convergence test, 54
  Jacobian approximation used by, 51, 52
  memory requirements, 67
  optional input, 51–55, 134–138
  optional output, 67–71
  preconditioner setup function, 54, 78
  preconditioner setup function (backward), 153
  preconditioner solve function, 54, 78
  preconditioner solve function (backward), 151
IDALS_ILL_INPUT, 41, 53, 54, 90, 131, 135–138, 155
IDALS_JACFUNC_RECVR, 146, 147
IDALS_JACFUNC_UNRECVR, 146–148
IDALS_LMEM_NULL, 52–54, 67–70, 90, 91, 135–138, 155
IDALS_MEM_FAIL, 41, 90, 131, 155
IDALS_MEM_NULL, 41, 52–54, 67–70, 90, 91, 131, 135–138, 155
IDALS_NO_ADJ, 131, 135–138
IDALS_PMEM_NULL, 91, 155
IDALS_SUCCESS, 41, 52–54, 67–70, 90, 91, 131, 134–138, 155
IDALS_SUNLS_FAIL, 41, 53, 54
IDALsJacFn, 74
IDALsJacFnB, 146
IDALsJacFnBS, 146
IDALsJacTimesSetupFn, 77
IDASensFree, 98
IDASensInit, 95–97
IDASensReInit, 97
IDASensResFn, 97, 108
IDASensSStolerances, 99
IDASensSVtolerances, 99
IDALsJacTimesSetupFnB, 150
IDALsJacTimesSetupFnBS, 150
IDALsJacTimesVecFn, 76
IDALsJacTimesVecFnB, 148
IDALsJacTimesVecFnBS, 148
IDALsPrecSetupFn, 78
IDALsPrecSolveFn, 78
IDAQuadFree, 82
IDAQuadInit, 81
IDAQuadInitB, 140
IDAQuadInitBS, 141
IDAQuadReInit, 81
IDAQuadReInitB, 141
IDAQuadRhsFn, 81, 85
IDAQuadRhsFnB, 140, 144
IDAQuadRhsFnBS, 141, 145
IDAQuadSensEEtolerances, 115, 116
IDAQuadSensFree, 112
IDAQuadSensInit, 111
IDAQuadSensReInit, 112
IDAQuadSensRhsFn, 111, 117
IDAQuadSensSStolerances, 115
IDAQuadSensSVtolerances, 115
IDAQuadSStolerances, 84
IDAQuadSVtolerances, 84
IDAReInit, 71, 72
IDAReInitB, 129
IDAResFn, 38, 72
IDAResFnB, 128, 142
IDAResFnBS, 128, 143
IDARootFn, 74
IDARootInit, 43
idas
  motivation for writing in C, 2
  package structure, 25
  relationship to ida, 1–2
idas linear solver interface
  idals, 41, 130
idas linear solvers
  header files, 33
  implementation details, 29
  nvector compatibility, 31
  selecting one, 41
IDASensToggleOff, 98
IDASetConstraints, 51
IDASetEpsLin, 54
IDASetEpsLinB, 138
IDASetErrFile, 47
IDASetErrHandlerFn, 47
IDASetId, 51
IDASetIncrementFactor, 53
IDASetIncrementFactorB, 136
IDASetInitStep, 48
IDASetJacFn, 52
IDASetJacFnB, 134
IDASetJacFnBS, 135
IDASetJacTimes, 52
IDASetJacTimesB, 135
IDASetJacTimesBS, 136
IDASetLinearSolver, 36, 41, 74, 218
IDASetLinearSolverB, 123, 130, 146, 154
IDASetLineSearchOffIC, 56
IDASetMaxBacksIC, 56
IDASetMaxConvFails, 50
IDASetMaxErrTestFails, 49
IDASetMaxNonlinIters, 49
IDASetMaxNumItersIC, 56
IDASetMaxNumJacsIC, 55
IDASetMaxNumSteps, 48
IDASetMaxNumStepsIC, 55
IDASetMaxOrd, 48
IDASetMaxStep, 48
IDASetNoInactiveRootWarn, 57
IDASetNonlinConvCoef, 50
IDASetNonlinConvCoefIC, 55
IDASetNonLinearSolver, 42
IDASetNonlinearSolver, 36, 42, 100, 101
IDASetNonlinearSolverB, 124
IDASetNonLinearSolverSensSim, 100
IDASetNonlinearSolverSensSim, 100
IDASetNonLinearSolverSensStg, 100
IDASetNonlinearSolverSensStg, 100
IDASetPreconditioner, 53, 54
IDASetPrecSolveFnB, 137
IDASetPrecSolveFnBS, 137
IDASetQuadErrCon, 83
ida linear solver interfaces, 26
idas linear solvers
  usage with adjoint module, 130
idas/idas.h, 33
IDASensEEtolerances, 99
IDASetQuadSensErrCon, 114
IDASetRootDirection, 57
IDASetSensDQMethod, 104
IDASetSensErrCon, 104
IDASetSensMaxNonlinIters, 104
IDASetSensParams, 103
IDASetStepToleranceIC, 57
IDASetStopTime, 49
IDASetSuppressAlg, 50
IDASetUserData, 47
IDASolve, 36, 44, 116
IDASolveB, 124, 132, 133
IDASolveF, 122, 126
IDASpilsGetLastFlag, 71
IDASpilsGetNumConvFails, 69
IDASpilsGetNumJtimesEvals, 70
IDASpilsGetNumJTSetupEvals, 70
IDASpilsGetNumLinIters, 69
IDASpilsGetNumPrecEvals, 69
IDASpilsGetNumPrecSolves, 70
IDASpilsGetNumRhsEvals, 68
IDASpilsGetReturnFlagName, 71
IDASpilsGetWorkspace, 68
IDASpilsJacTimesSetupFn, 78
IDASpilsJacTimesSetupFnB, 150
IDASpilsJacTimesSetupFnBS, 151
IDASpilsJacTimesVecFn, 77
IDASpilsJacTimesVecFnB, 149
IDASpilsJacTimesVecFnBS, 149
IDASpilsPrecSetupFn, 79
IDASpilsPrecSetupFnB, 153
IDASpilsPrecSetupFnBS, 154
IDASpilsPrecSolveFn, 78
IDASpilsPrecSolveFnB, 152
IDASpilsPrecSolveFnBS, 153
IDASpilsSetEpsLin, 55
IDASpilsSetEpsLinB, 138
IDASpilsSetIncrementFactor, 53
IDASpilsSetIncrementFactorB, 137
IDASpilsSetJacTimes, 53
IDASpilsSetJacTimesB, 136
IDASpilsSetJacTimesBS, 136
IDASpilsSetLinearSolver, 41
IDASpilsSetLinearSolverB, 131
IDASpilsSetPreconditioner, 54
IDASpilsSetPreconditionerB, 137
IDASpilsSetPreconditionerBS, 138
IDASStolerances, 39
IDASStolerancesB, 130
IDASVtolerances, 39
IDASVtolerancesB, 130
IDAWFtolerances, 39
itask, 44, 126
Jacobian approximation function
  difference quotient, 51
  user-supplied, 52, 74–76
  user-supplied (backward), 134, 146
Jacobian times vector
  difference quotient, 52
  user-supplied, 52, 76–77
Jacobian-vector product
  user-supplied (backward), 135, 148
Jacobian-vector setup
  user-supplied, 77–78
  user-supplied (backward), 150
maxord, 71
memory requirements
  idabbdpre preconditioner, 91
  idals linear solver interface, 67
  idas solver, 61, 81, 97, 111
N_VCloneVectorArray, 160
N_VCloneVectorArray_OpenMP, 180
N_VCloneVectorArray_OpenMPDEV, 207
N_VCloneVectorArray_Parallel, 176
N_VCloneVectorArray_ParHyp, 190
N_VCloneVectorArray_Petsc, 193
N_VCloneVectorArray_Pthreads, 185
N_VCloneVectorArray_Serial, 171
N_VCloneVectorArrayEmpty, 160
N_VCloneVectorArrayEmpty_OpenMP, 181
N_VCloneVectorArrayEmpty_OpenMPDEV, 207
N_VCloneVectorArrayEmpty_Parallel, 176
N_VCloneVectorArrayEmpty_ParHyp, 190
N_VCloneVectorArrayEmpty_Petsc, 193
N_VCloneVectorArrayEmpty_Pthreads, 185
N_VCloneVectorArrayEmpty_Serial, 171
N_VCopyFromDevice_Cuda, 199
N_VCopyFromDevice_OpenMPDEV, 208
N_VCopyFromDevice_Raja, 204
N_VCopyToDevice_Cuda, 199
N_VCopyToDevice_OpenMPDEV, 208
N_VCopyToDevice_Raja, 204
N_VDestroyVectorArray, 160
N_VDestroyVectorArray_OpenMP, 181
N_VDestroyVectorArray_OpenMPDEV, 207
N_VDestroyVectorArray_Parallel, 176
N_VDestroyVectorArray_ParHyp, 190
N_VDestroyVectorArray_Petsc, 193
N_VDestroyVectorArray_Pthreads, 186
N_VDestroyVectorArray_Serial, 171
N_Vector, 33, 159
N_VEnableConstVectorArray_Cuda, 201
N_VEnableConstVectorArray_OpenMP, 182
N_VEnableConstVectorArray_OpenMPDEV, 209
N_VEnableConstVectorArray_Parallel, 177
N_VEnableConstVectorArray_ParHyp, 191
N_VEnableConstVectorArray_Petsc, 195
N_VEnableConstVectorArray_Pthreads, 187
N_VEnableConstVectorArray_Raja, 205
Cuda, 201 VEnableFusedOps Cuda, 200 N VEnableScaleAddMultiVectorArray OpenMP, 183 VEnableFusedOps OpenMP, 181 N VEnableScaleAddMultiVectorArray OpenMPDEV, VEnableFusedOps OpenMPDEV, 208 210 VEnableFusedOps Parallel, 177 N VEnableScaleAddMultiVectorArray Parallel, VEnableFusedOps ParHyp, 190 178 VEnableFusedOps Petsc, 194 N VEnableScaleAddMultiVectorArray ParHyp, 192 VEnableFusedOps Pthreads, 186 N VEnableScaleAddMultiVectorArray Petsc, 195 VEnableFusedOps Raja, 204 N VEnableScaleAddMultiVectorArray Pthreads, VEnableFusedOps Serial, 172 188 VEnableLinearCombination Cuda, 200 N VEnableScaleAddMultiVectorArray Raja, 205 VEnableLinearCombination OpenMP, 181 N VEnableScaleAddMultiVectorArray Serial, 173 VEnableLinearCombination OpenMPDEV, 208 N VEnableScaleVectorArray Cuda, 200 VEnableLinearCombination Parallel, 177 N VEnableScaleVectorArray OpenMP, 182 VEnableLinearCombination ParHyp, 190 N VEnableScaleVectorArray OpenMPDEV, 209 VEnableLinearCombination Petsc, 194 N VEnableScaleVectorArray Parallel, 177 VEnableLinearCombination Pthreads, 186 N VEnableScaleVectorArray ParHyp, 191 VEnableLinearCombination Raja, 204 N VEnableScaleVectorArray Petsc, 194 VEnableLinearCombination Serial, 172 VEnableLinearCombinationVectorArray Cuda, N VEnableScaleVectorArray Pthreads, 187 N VEnableScaleVectorArray Raja, 205 201 N VEnableScaleVectorArray Serial, 172 VEnableLinearCombinationVectorArray OpenMP, N VEnableWrmsNormMaskVectorArray Cuda, 201 183 N VEnableWrmsNormMaskVectorArray OpenMP, 182 VEnableLinearCombinationVectorArray OpenMPDEV, N VEnableWrmsNormMaskVectorArray OpenMPDEV, 210 209 VEnableLinearCombinationVectorArray Parallel, N VEnableWrmsNormMaskVectorArray Parallel, 178 178 N VEnableWrmsNormMaskVectorArray ParHyp, 191 VEnableLinearCombinationVectorArray ParHyp, N VEnableWrmsNormMaskVectorArray Petsc, 195 192 VEnableLinearCombinationVectorArray Petsc,N VEnableWrmsNormMaskVectorArray Pthreads, 187 N VEnableWrmsNormMaskVectorArray Serial, 173 195 N VEnableWrmsNormVectorArray 
Cuda, 201 VEnableLinearCombinationVectorArray Pthreads, N VEnableWrmsNormVectorArray OpenMP, 182 188 VEnableLinearCombinationVectorArray Raja, N VEnableWrmsNormVectorArray OpenMPDEV, 209 205 N VEnableWrmsNormVectorArray Parallel, 178 VEnableLinearCombinationVectorArray Serial, N VEnableWrmsNormVectorArray ParHyp, 191 173 N VEnableWrmsNormVectorArray Petsc, 195 VEnableLinearSumVectorArray Cuda, 200 N VEnableWrmsNormVectorArray Pthreads, 187 VEnableLinearSumVectorArray OpenMP, 182 N VEnableWrmsNormVectorArray Serial, 173 VEnableLinearSumVectorArray OpenMPDEV, 209 N VGetDeviceArrayPointer Cuda, 197 VEnableLinearSumVectorArray Parallel, 177 N VGetDeviceArrayPointer OpenMPDEV, 208 VEnableLinearSumVectorArray ParHyp, 191 N VGetDeviceArrayPointer Raja, 203 VEnableLinearSumVectorArray Petsc, 194 N VGetHostArrayPointer Cuda, 197 VEnableLinearSumVectorArray Pthreads, 187 N VGetHostArrayPointer OpenMPDEV, 208 VEnableLinearSumVectorArray Raja, 205 N VGetHostArrayPointer Raja, 202 VEnableLinearSumVectorArray Serial, 172 N VGetLength Cuda, 197 INDEX N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N N VGetLength OpenMP, 181 VGetLength OpenMPDEV, 207 VGetLength Parallel, 176 VGetLength Pthreads, 186 VGetLength Raja, 202 VGetLength Serial, 171 VGetLocalLength Cuda, 197 VGetLocalLength Parallel, 176 VGetLocalLength Raja, 202 VGetMPIComm Cuda, 197 VGetMPIComm Raja, 203 VGetVector ParHyp, 189 VGetVector Petsc, 193 VIsManagedMemory Cuda, 197 VMake Cuda, 198 VMake OpenMP, 180 VMake OpenMPDEV, 207 VMake Parallel, 176 VMake ParHyp, 189 VMake Petsc, 193 VMake Pthreads, 185 VMake Raja, 204 VMake Serial, 171 VMakeManaged Cuda, 199 VNew Cuda, 197 VNew OpenMP, 180 VNew OpenMPDEV, 207 VNew Parallel, 175 VNew Pthreads, 185 VNew Raja, 203 VNew SensWrapper, 309 VNew Serial, 170 VNewEmpty Cuda, 198 VNewEmpty OpenMP, 180 VNewEmpty OpenMPDEV, 207 VNewEmpty Parallel, 175 VNewEmpty ParHyp, 189 VNewEmpty Petsc, 193 VNewEmpty Pthreads, 185 VNewEmpty 
Raja, 203 VNewEmpty SensWrapper, 309 VNewEmpty Serial, 170 VNewManaged Cuda, 198 VPrint Cuda, 199 VPrint OpenMP, 181 VPrint OpenMPDEV, 208 VPrint Parallel, 176 VPrint ParHyp, 190 VPrint Petsc, 193 VPrint Pthreads, 186 VPrint Raja, 204 VPrint Serial, 171 VPrintFile Cuda, 200 VPrintFile OpenMP, 181 353 N VPrintFile OpenMPDEV, 208 N VPrintFile Parallel, 176 N VPrintFile ParHyp, 190 N VPrintFile Petsc, 194 N VPrintFile Pthreads, 186 N VPrintFile Raja, 204 N VPrintFile Serial, 171 N VSetCudaStream Cuda, 199 NV COMM P, 175 NV CONTENT OMP, 179 NV CONTENT OMPDEV, 206 NV CONTENT P, 174 NV CONTENT PT, 184 NV CONTENT S, 169 NV DATA DEV OMPDEV, 206 NV DATA HOST OMPDEV, 206 NV DATA OMP, 179 NV DATA P, 174 NV DATA PT, 184 NV DATA S, 170 NV GLOBLENGTH P, 174 NV Ith OMP, 180 NV Ith P, 175 NV Ith PT, 185 NV Ith S, 170 NV LENGTH OMP, 179 NV LENGTH OMPDEV, 206 NV LENGTH PT, 184 NV LENGTH S, 170 NV LOCLENGTH P, 174 NV NUM THREADS OMP, 179 NV NUM THREADS PT, 184 NV OWN DATA OMP, 179 NV OWN DATA OMPDEV, 206 NV OWN DATA P, 174 NV OWN DATA PT, 184 NV OWN DATA S, 170 NVECTOR module, 159 nvector openmp mod, 183 nvector pthreads mod, 188 optional input backward solver, 134 forward sensitivity, 103–105 generic linear solver interface, 51–55, 134–138 initial condition calculation, 55–57 iterative linear solver, 55, 137–138 iterative-free linear solver, 53 matrix-based linear solver, 51–52, 134–135 matrix-free linear solver, 52–53, 135–137 quadrature integration, 83–84, 142 rootfinding, 57 sensitivity-dependent quadrature integration, 114–116 354 solver, 45–51 optional output backward initial condition calculation, 139– 140 backward solver, 138–139 band-block-diagonal preconditioner, 91 forward sensitivity, 105–108 generic linear solver interface, 67–71 initial condition calculation, 66, 108 interpolated quadratures, 83 interpolated sensitivities, 101 interpolated sensitivity-dep. 
quadratures, 113 interpolated solution, 58 quadrature integration, 84–85, 142 sensitivity-dependent quadrature integration, 116–117 solver, 60–66 version, 58–60 output mode, 126, 133 INDEX SM COLUMNS D, 219 SM COLUMNS S, 231 SM CONTENT B, 224 SM CONTENT D, 219 SM CONTENT S, 231 SM DATA B, 224 SM DATA D, 219 SM DATA S, 231 SM ELEMENT B, 75, 224 SM ELEMENT D, 75, 220 SM INDEXPTRS S, 231 SM INDEXVALS S, 231 SM LBAND B, 224 SM LDATA B, 224 SM LDATA D, 219 SM LDIM B, 224 SM NNZ S, 76, 231 SM NP S, 231 SM ROWS B, 224 SM ROWS D, 219 partial error control SM ROWS S, 231 explanation of idas behavior, 118 SM SPARSETYPE S, 231 portability, 32 SM SUBAND B, 224 preconditioning SM UBAND B, 224 advice on, 15–16, 26 SMALL REAL, 32 band-block diagonal, 86 step size bounds, 48–49 setup and solve phases, 26 SUNBandMatrix, 35, 225 user-supplied, 53–54, 78, 137–138, 151, 153 SUNBandMatrix Cols, 227 SUNBandMatrix Column, 227 quadrature integration, 17 SUNBandMatrix Columns, 226 forward sensitivity analysis, 20 SUNBandMatrix Data, 227 SUNBandMatrix LDim, 226 RCONST, 32 SUNBandMatrix LowerBandwidth, 226 realtype, 32 SUNBandMatrix Print, 226 reinitialization, 71, 129 Rows, 226 SUNBandMatrix residual function, 72 SUNBandMatrix StoredUpperBandwidth, 226 backward problem, 142, 143 SUNBandMatrix UpperBandwidth, 226 forward sensitivity, 108 SUNBandMatrixStorage, 225 quadrature backward problem, 144 SUNDenseMatrix, 35, 220 sensitivity-dep. 
quadrature backward probSUNDenseMatrix Cols, 221 lem, 145 SUNDenseMatrix Column, 221 right-hand side function SUNDenseMatrix Columns, 220 quadrature equations, 85 sensitivity-dependent quadrature equations, SUNDenseMatrix Data, 221 SUNDenseMatrix LData, 221 117 SUNDenseMatrix Print, 220 Rootfinding, 16, 36, 43 SUNDenseMatrix Rows, 220 sundials/sundials linearsolver.h, 237 second-order sensitivity analysis, 23 sundials nonlinearsolver.h, 33 support in idas, 24 sundials nvector.h, 33 SM COLS B, 224 sundials types.h, 32, 33 SM COLS D, 219 SUNDIALSGetVersion, 60 SM COLUMN B, 75, 224 SUNDIALSGetVersionNumber, 60 SM COLUMN D, 75, 220 sunindextype, 32 SM COLUMN ELEMENT B, 75, 224 SM COLUMNS B, 224 SUNLinearSolver, 237, 244 INDEX SUNLinearSolver module, 237 SUNLINEARSOLVER DIRECT, 239, 246 SUNLINEARSOLVER ITERATIVE, 239, 247 SUNLINEARSOLVER MATRIX ITERATIVE, 239, 247 sunlinsol/sunlinsol band.h, 33 sunlinsol/sunlinsol dense.h, 33 sunlinsol/sunlinsol klu.h, 33 sunlinsol/sunlinsol lapackband.h, 33 sunlinsol/sunlinsol lapackdense.h, 33 sunlinsol/sunlinsol pcg.h, 34 sunlinsol/sunlinsol spbcgs.h, 33 sunlinsol/sunlinsol spfgmr.h, 33 sunlinsol/sunlinsol spgmr.h, 33 sunlinsol/sunlinsol sptfqmr.h, 34 sunlinsol/sunlinsol superlumt.h, 33 SUNLinSol Band, 41, 252 SUNLinSol Dense, 41, 250 SUNLinSol KLU, 41, 260 SUNLinSol KLUReInit, 261 SUNLinSol KLUSetOrdering, 263 SUNLinSol LapackBand, 41, 257 SUNLinSol LapackDense, 41, 255 SUNLinSol PCG, 41, 294, 296 SUNLinSol PCGSetMaxl, 295 SUNLinSol PCGSetPrecType, 295 SUNLinSol SPBCGS, 41, 283, 285 SUNLinSol SPBCGSSetMaxl, 284 SUNLinSol SPBCGSSetPrecType, 284 SUNLinSol SPFGMR, 41, 276, 279 SUNLinSol SPFGMRSetMaxRestarts, 278 SUNLinSol SPFGMRSetPrecType, 277 SUNLinSol SPGMR, 41, 269, 272 SUNLinSol SPGMRSetMaxRestarts, 271 SUNLinSol SPGMRSetPrecType, 271 SUNLinSol SPTFQMR, 41, 288, 290 SUNLinSol SPTFQMRSetMaxl, 289 SUNLinSol SPTFQMRSetPrecType, 289 SUNLinSol SuperLUMT, 41, 265 SUNLinSol SuperLUMTSetOrdering, 267 SUNLinSolFree, 37, 238, 240 
SUNLinSolGetType, 238
SUNLinSolInitialize, 238, 239
SUNLinSolLastFlag, 242
SUNLinSolNumIters, 241
SUNLinSolResNorm, 241
SUNLinSolSetATimes, 239, 240, 247
SUNLinSolSetPreconditioner, 241
SUNLinSolSetScalingVectors, 241
SUNLinSolSetup, 238, 239, 247
SUNLinSolSolve, 238, 239
SUNLinSolSpace, 242
SUNMatDestroy, 37
SUNMatrix, 215
SUNMatrix module, 215
SUNNonlinearSolver, 33, 301
SUNNonlinearSolver module, 301
SUNNONLINEARSOLVER_FIXEDPOINT, 302
SUNNONLINEARSOLVER_ROOTFIND, 302
SUNNonlinSol_FixedPoint, 314, 317
SUNNonlinSol_FixedPointSens, 315
SUNNonlinSol_Newton, 311
SUNNonlinSol_NewtonSens, 311
SUNNonlinSolFree, 37, 303
SUNNonlinSolGetCurIter, 305
SUNNonlinSolGetNumConvFails, 305
SUNNonlinSolGetNumIters, 305
SUNNonlinSolGetSysFn_FixedPoint, 315
SUNNonlinSolGetSysFn_Newton, 312
SUNNonlinSolGetType, 302
SUNNonlinSolInitialize, 302
SUNNonlinSolLSetupFn, 303
SUNNonlinSolSetConvTestFn, 304
SUNNonlinSolSetLSolveFn, 304
SUNNonlinSolSetMaxIters, 304
SUNNonlinSolSetSysFn, 303
SUNNonlinSolSetup, 302
SUNNonlinSolSolve, 302
SUNSparseFromBandMatrix, 232
SUNSparseFromDenseMatrix, 232
SUNSparseMatrix, 35, 232
SUNSparseMatrix_Columns, 233
SUNSparseMatrix_Data, 234
SUNSparseMatrix_IndexPointers, 234
SUNSparseMatrix_IndexValues, 234
SUNSparseMatrix_NNZ, 76, 233
SUNSparseMatrix_NP, 234
SUNSparseMatrix_Print, 233
SUNSparseMatrix_Realloc, 233
SUNSparseMatrix_Reallocate, 233
SUNSparseMatrix_Rows, 233
SUNSparseMatrix_SparseType, 234
tolerances, 13, 39, 40, 73, 84, 115
UNIT_ROUNDOFF, 32
User main program
    Adjoint sensitivity analysis, 121
    forward sensitivity analysis, 93
    idabbdpre usage, 88
    idas usage, 34
    integration of quadratures, 79
    integration of sensitivity-dependent quadratures, 109
user_data, 47, 73, 74, 86, 88, 117
user_dataB, 156, 157
weighted root-mean-square norm, 12–13