Prototype Manual
User Manual:
Open the PDF directly: View PDF .
Page Count: 199
Download | |
Open PDF In Browser | View PDF |
The Matrix Template Library User Manual Jeremy G. Siek, Andrew Lumsdaine and Lie-Quan Lee February 2, 2004 2 Contents I Introduction to Generic Programming 1 Traits Classes 1.1 Typedefs Nested in Classes 1.2 Template Specialization . . 1.3 Definition of a Traits Class 1.4 Partial Specialization . . . . 1.5 External Polymorphism and 1 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 3 5 6 6 7 2 Concepts and Models 2.1 Requirements . . . . . . . . . . . . . 2.2 Example: InputIterator . . . . . . . 2.3 Concepts vs. Abstract Base Classes . 2.4 Multi-type Concepts and Modules . 2.5 Concept Checking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9 9 10 10 10 11 II III . . . . . . . . . . . . Tags . . . . . Introduction to Numerical Linear Algebra Tutorials 15 17 3 Gaussian Elimination 19 4 Pointwise LU Factorization 21 5 Blocked LU Factorization 25 6 Preconditioned GMRES(m) 29 IV 33 Reference Manual 7 Concepts 35 7.1 Algebraic Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . 36 7.1.1 AbelianGroup . . . . . . . . . . . . . . . . . . . . . . . . . 37 3 4 CONTENTS 7.2 7.3 7.4 7.5 7.1.2 Ring . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.1.3 Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.1.4 R-Module . . . . . . . . . . . . . . . . . . . . . . . . . . 7.1.5 VectorSpace . . . . . . . . . . . . . . . . . . . . . . . . . 7.1.6 FiniteVectorSpace, FiniteBanachSpace, FiniteHilbertSpace 7.1.7 BanachSpace . . . . . . . . . . . . . . . . . . . . . . . . 7.1.8 HilbertSpace . . . . . . . . . . . . . . . . . . . . . . . . . 7.1.9 LinearOperator . . . . . . . . . . . . . . . . . . . . . . . 7.1.10 FiniteLinearOperator . . . . . . . . . . . . . . . . . . . . 7.1.11 TransposableLinearOperator . . . . . . . . . . . . . . . . 7.1.12 LinearAlgebra . . . . . . . . . . . . . . . . . . . . . . . . Collection Concepts . . . . . . . . . . . . . . . . . . . . . . . . 7.2.1 Collection . . . . . . . . . . . . . . . . . . . . . . . . . . 7.2.2 ForwardCollection . . . . . . . . . . . . . . . . . . . . . . 7.2.3 ReversibleCollection . . . . . . . . . . . . . . . . . . . . . 7.2.4 SequentialCollection . . . . . . . . . . . . . . . . . . . . . 7.2.5 RandomAccessCollection . . . . . . . . . . . . . . . . . . Iterator Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . 7.3.1 IndexedIterator . . . . . . . . . . . . . . . . . . . . . . . 7.3.2 MatrixIterator . . . . . . . . . . . . . . . . . . . . . . . . 7.3.3 IndexValuePairIterator . . . . . . . . . . . . . . . . . . . . Vector Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . 7.4.1 BasicVector . . . . . . . . . . . . . . . . . . . . . . . . . 7.4.2 Vector . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.4.3 SparseVector . . . . . . . . . . . . . . . . . . . . . . . . . 7.4.4 SubdividableVector . . . . . . . . . . . . . . . . . . . . . 7.4.5 ResizableVector . . . . . . . . . . . . . . . . . . . . . . . Matrix Concepts . . . . . . . . . . . . . . . . . . . . . . . . . . 7.5.1 BasicMatrix . . . . . . . . . . . . . . . . . . . . . . . . . 7.5.2 VectorMatrix . . . . . . . . . . . . . . . . . . . . . . . . 7.5.3 Matrix1D . . . . . . . . . . . . . . . . . . . . . . . . . . 7.5.4 Matrix . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7.5.5 BandedMatrix . . . . . . . . . . . . . . . . . . . . . . . . 7.5.6 TriangularMatrix . . . . . . . . . . . . . . . . . . . . . . . 7.5.7 StrideableMatrix . . . . . . . . . . . . . . . . . . . . . . . 7.5.8 SubdividableMatrix . . . . . . . . . . . . . . . . . . . . . 7.5.9 ResizeableMatrix . . . . . . . . . . . . . . . . . . . . . . 7.5.10 FastDiagMatrix . . . . . . . . . . . . . . . . . . . . . . . 8 Arithmetic Types and Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39 41 42 43 45 45 47 49 50 51 53 57 57 60 60 61 61 62 62 62 63 64 64 66 67 69 70 71 71 73 74 74 78 79 80 81 83 84 85 9 Object Model and Memory Management 87 9.1 Object Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87 9.2 Memory Management . . . . . . . . . . . . . . . . . . . . . . . . 88 CONTENTS 10 Vector Classes: vector::type 10.1 Dense Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.1.1 vector , Orien>::type . . . . . 10.1.2 vector , Orien>::type . . . . . 10.1.3 vector , Orien>::type . . . . . . . 10.2 Sparse Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10.2.1 vector , Orien>::type . . . . . . . . . . . . . . . . . . . . 10.2.2 vector , Orien>::type . . . . . . . . . . . . . . . . . . . . 10.2.3 vector , Orien>::type . . 10.2.4 vector , Orien>::type 10.2.5 vector , Orien>::type . . . . . . . . . 5 89 92 92 93 95 96 97 99 100 101 102 11 Matrix Classes: matrix ::type 105 11.0.6 Shape Selectors . . . . . . . . . . . . . . . . . . . . . . . . 106 11.0.7 Storage Selectors . . . . . . . . . . . . . . . . . . . . . . . 107 11.0.8 Shape and Storage Combinations . . . . . . . . . . . . . . 111 11.1 Dense Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113 11.1.1 matrix , dense , Orien>::type . . . . . . . . . . . . . . . . . . . . 113 11.1.2 matrix , dense , Orien>::type . . . . . . . . . . . . . . . . . . . . 114 11.1.3 matrix , static size , Orien>::type . . . . . . . 115 11.1.4 matrix , array< dense >, Orien>::type . . . . 116 11.2 Sparse Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118 11.2.1 matrix , Orien>::type . . . . . . . . . . . . . . . . . . . . 118 11.2.2 matrix , Orien>::type . . . . . . . . . . . . . . . . . . . . 118 11.2.3 matrix , Orien>::type . . . . . . . . . . . . . . . . . . . . 119 11.2.4 matrix , Orien>::type . . . . . . . . . . . . . . . . . . . . 119 11.3 Banded Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . 120 11.3.1 matrix ,banded ,Orien>::type 120 11.3.2 matrix ,banded ,Orien>::type . 120 11.3.3 matrix ,dense ,Orien>::type . 120 11.3.4 matrix ,dense ,Orien>::type . 120 11.4 Triangular Matrices . . . . . . . . . . . . . . . . . . . . . . . . . 121 11.4.1 matrix , Storage, Orien>::type . . . . . . . . . . . . . 121 11.4.2 matrix , dense ,Orien>::type . . . . . . . . 121 6 CONTENTS 11.4.3 matrix , dense , Orien>::type . . . . . . . 11.4.4 matrix , banded ,Orien>::type . . . . . . 11.4.5 matrix , banded , Orien>::type . . . . . . 11.4.6 matrix , packed , Orien>::type . . . . . . 11.4.7 matrix , packed , Orien>::type . . . . . . 11.5 Symmetric Matrices . . . . . . . . . . . . . . . . . . . . . . . . 11.5.1 matrix , Storage, Orien>::type . . . . . . . . . . . . 11.5.2 matrix , dense ,Orien>::type . . . . . . . 11.5.3 matrix , dense , Orien>::type . . . . . . . 11.5.4 matrix , banded ,Orien>::type . . . . . . 11.5.5 matrix , banded , Orien>::type . . . . . . 11.5.6 matrix , packed , Orien>::type . . . . . . 11.5.7 matrix , packed , Orien>::type . . . . . . 11.6 Hermitian Matrices . . . . . . . . . . . . . . . . . . . . . . . . . 11.6.1 matrix , Storage, Orien>::type . . . . . . . . . . . . 11.6.2 matrix , dense ,Orien>::type . . . . . . . 11.6.3 matrix , dense , Orien>::type . . . . . . . 11.6.4 matrix , banded ,Orien>::type . . . . . . 11.6.5 matrix , banded , Orien>::type . . . . . . 11.6.6 matrix , packed , Orien>::type . . . . . . 11.6.7 matrix , packed , Orien>::type . . . . . . 11.7 Diagonal Matrices . . . . . . . . . . . . . . . . . . . . . . . . . 11.7.1 matrix , dense , diagonal major>::type 11.7.2 matrix , dense , diagonal major>::type . . 121 . 121 . 121 . 121 . 121 . 122 . 122 . 122 . 122 . 122 . 122 . 122 . 122 . 123 . 123 . 123 . 123 . 123 . 123 . 123 . 123 . 124 . 124 . 124 CONTENTS 7 11.7.3 matrix , array , diagonal major>::type . . . . . 124 12 Adaptors and Helper Functions 125 13 Overloaded Operators 127 14 Vector Operations 14.1 Vector Data Movement Operations 14.1.1 set . . . . . . . . . . . . . 14.1.2 copy . . . . . . . . . . . . . 14.1.3 swap . . . . . . . . . . . . . 14.1.4 gather . . . . . . . . . . . 14.1.5 scatter . . . . . . . . . . . 14.2 Vector Reduction Operations . . . 14.2.1 dot (inner product) . . . . 14.2.2 one norm . . . . . . . . . . 14.2.3 two norm (euclidean norm) 14.2.4 infinity norm . . . . . . . 14.2.5 sum . . . . . . . . . . . . . 14.2.6 sum squares . . . . . . . . 14.2.7 max . . . . . . . . . . . . . 14.2.8 min . . . . . . . . . . . . . 14.2.9 max index . . . . . . . . . . 14.2.10 max with index . . . . . . 14.2.11 min index . . . . . . . . . . 14.3 Vector Arithmetic Operations . . . 14.3.1 scale . . . . . . . . . . . . 14.3.2 add . . . . . . . . . . . . . 14.3.3 iadd . . . . . . . . . . . . . 14.3.4 ele mult . . . . . . . . . . 14.3.5 ele div . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129 130 130 130 131 132 133 135 135 136 137 137 138 139 140 140 141 141 142 144 144 144 146 147 148 15 Matrix Operations 15.1 Matrix Data Movement Operations . 15.1.1 set . . . . . . . . . . . . . . 15.1.2 copy . . . . . . . . . . . . . . 15.1.3 swap . . . . . . . . . . . . . . 15.1.4 transpose . . . . . . . . . . 15.2 Matrix Norms . . . . . . . . . . . . . 15.2.1 one norm . . . . . . . . . . . 15.2.2 frobenius norm . . . . . . . 15.2.3 infinity norm . . . . . . . . 15.3 Element-wise Arithmetic Operations 15.3.1 scale . . . . . . . . . . . . . 15.3.2 add . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151 152 152 152 154 155 157 157 157 158 159 159 160 8 CONTENTS 15.4 15.5 15.6 15.7 15.8 15.3.3 iadd . . . . . . . . . . . . . . . . . . . . . . . . 15.3.4 ele mult . . . . . . . . . . . . . . . . . . . . . 15.3.5 ele div . . . . . . . . . . . . . . . . . . . . . . Rank Updates (Outer Products) . . . . . . . . . . . . 15.4.1 rank one update . . . . . . . . . . . . . . . . . 15.4.2 rank one conj . . . . . . . . . . . . . . . . . . 15.4.3 rank two update . . . . . . . . . . . . . . . . . 15.4.4 rank two conj . . . . . . . . . . . . . . . . . . Trianglular Solves (Forward & Backward Substitution) 15.5.1 tri solve . . . . . . . . . . . . . . . . . . . . . Matrix-Vector Multiplication . . . . . . . . . . . . . . 15.6.1 mult . . . . . . . . . . . . . . . . . . . . . . . . 15.6.2 mult add . . . . . . . . . . . . . . . . . . . . . Matrix-Matrix Multiplication . . . . . . . . . . . . . . 15.7.1 mult . . . . . . . . . . . . . . . . . . . . . . . . 15.7.2 mult add . . . . . . . . . . . . . . . . . . . . . 15.7.3 lrdiag mult . . . . . . . . . . . . . . . . . . . 15.7.4 lrdiag mult add . . . . . . . . . . . . . . . . . 15.7.5 babt mult . . . . . . . . . . . . . . . . . . . . . Miscellaneous Matrix Operations . . . . . . . . . . . . 15.8.1 trace . . . . . . . . . . . . . . . . . . . . . . . 16 Miscellaneous Operations 16.1 Basic Transformations . . . 16.1.1 givens . . . . . . . 16.1.2 givens apply . . . . 16.1.3 house . . . . . . . . 16.1.4 house apply left . 16.1.5 house apply right . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161 162 162 164 164 164 164 164 166 166 168 168 168 170 170 170 170 170 170 171 171 . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 173 174 174 175 175 177 177 A Concept Checks for STL A.1 STL Basic Concept Checks . A.1.1 Assignable . . . . . . . A.1.2 DefaultConstructible . A.1.3 CopyConstructible . . . A.1.4 EqualityComparable . . A.1.5 LessThanComparable . A.1.6 Generator . . . . . . . A.1.7 UnaryFunction . . . . . A.1.8 BinaryFunction . . . . A.1.9 UnaryPredicate . . . . A.1.10 BinaryPredicate . . . . A.2 STL Iterator Concept Checks A.2.1 TrivialIterator . . . . . A.2.2 Mutable-TrivialIterator A.2.3 InputIterator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 179 179 179 179 179 180 180 180 181 181 181 181 182 182 182 182 CONTENTS A.2.4 A.2.5 A.2.6 A.2.7 A.2.8 A.2.9 A.2.10 9 OutputIterator . . . . . . . . . ForwardIterator . . . . . . . . Mutable-ForwardIterator . . . BidirectionalIterator . . . . . . Mutable-BidirectionalIterator . RandomAccessIterator . . . . . Mutable-RandomAccessIterator BIBLIOGRAPHY . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182 183 183 183 183 184 184 185 10 CONTENTS List of Tables 11.1 Valid Shape and Storage Combinations. . . . . . . . . . . . . . . 111 11 12 LIST OF TABLES List of Figures 4.1 4.2 4.3 LU factorization pseudo-code. . . . . . . . . . . . . . . . . . . . . Diagram for LU factorization. . . . . . . . . . . . . . . . . . . . . Complete MTL version of pointwise LU factorization. . . . . . . 21 22 24 5.1 5.2 5.3 Pointwise step in block LU factorization. . . . . . . . . . . . . . . Update steps in block LU factorization. . . . . . . . . . . . . . . MTL version of block LU factorization. . . . . . . . . . . . . . . 26 27 28 6.1 The Iterative Template Library (ITL) implementation of the preconditioned GMRES(m) algorithm. This algorithm computes an approximate solution to Ax = b preconditioned with M . The restart value is specified by the parameter m. . . . . . . . . . . . 30 Refinement of the algebraic concepts. . . . . . . . . . . . . . . . . Refinement of the matrix concepts. . . . . . . . . . . . . . . . . . 37 71 10.1 The Compressed Sparse Vector Format. . . . . . . . . . . . . . . 10.2 The Sparse Pair Vector Format. . . . . . . . . . . . . . . . . . . . 96 96 7.1 7.2 11.1 11.2 11.3 11.4 11.5 11.6 11.7 Example of a banded matrix with bandwidth (1,2). . . . . . . . . Example of a symmetric matrix with bandwidth (2,2). . . . . . . Example of the dense matrix storage format. . . . . . . . . . . . Example of the banded matrix storage format. . . . . . . . . . . Example of the packed matrix storage format. . . . . . . . . . . . Example of the compressed column matrix storage format. . . . . Example of the array matrix storage format with dense and with sparse pair OneD storage types. . . . . . . . . . . . . . . . . . . . 11.8 Example of the envelope matrix storage format. . . . . . . . . . . 13 107 108 108 109 109 110 111 112 14 LIST OF FIGURES Part I Introduction to Generic Programming 1 Chapter 1 Traits Classes One of the most important techniques used in generic programming is the traits class, which was introduced by Nathan Meyers in XX. The traits class technique may seem strange and somewhat daunting when first encountered (granted the syntax looks a bit strange) but the essense of the idea is simple, and it is essential to learn how to use traits classes, for they appear over and over again in generic libraries such as the STL, and are also used heavily here in the MTL. Here we give a short motivation and tutorial for traits classes via an extended example. A traits class is basically a way of finding out information about a type that you otherwise would not know anything about. For instance, suppose I want to write a generic sum() function: template X sum(const Vector& v, int n) { X total; for (int i = 0; i < n; ++i) total += v[i]; return total; } From the point of view of this template function, not much is know about the template type Vector. For instance, I don’t know what kind of elements are inside the vector. But I need to know this in order to declare the local variable total, which should be the same type as the elements of Vector (the X there right now is just a bogus placeholder that needs to be replaced by something else!). 1.1 Typedefs Nested in Classes One way to access information out of a type is to use the scope operator ”::” to get at typedefs that are nested inside the class. Imagine I’ve created a vector class that looks like this: 3 4 CHAPTER 1. TRAITS CLASSES class my_vector { public: typedef double value_type; // the type for elements in the vector double& operator[](int i) { ... }; ... }; Now I can access the type of the elements by writing my vector::value type. Here’s the generic sum() function again, with the X’s replaced: template typename Vector::value_type sum(const Vector& v, int n) { typename Vector::value_type total = 0; for (int i = 0; i < n; ++i) total += v[i]; return total; } The use of the keyword typename deserves some explaining. Due to some quirks in C++, nested typedefs and nested members can cause ambiguity from the compiler’s point of view: when parsing a template function the compiler may not know whether the thing on the right hand side of the scope operator is a type or an object. The typename keyword is used to clear up this ambiguity. The rule of thumb is, whenever you use the scope operator to access a nested typedef, and when the type on the left hand side of the scope operator somehow depends on a template argument, then use the typename keyword. If the type on the left hand side does not depend on a template argument, then do not use typename. In the above sum() function we use typename since the left hand side, Vector, is a template argument. In contrast, typename is not used below since the type std::vector does not depend on any template arguments (it is not even in a template function!). int main(int, char*[]) { std::vector ::value_type x; return 0; } Getting back to the sum() function, the technique of using a nested typedef works as long as Vector is a class type that has such a nested typedef. But what if I want to use the generic sum() function with a builtin type such as double* that couldn’t possibly have any typedefs? Or what if we want to use sum() with a vector class from a third party who didn’t provide the necessary typedef? The operator[] works with double* and our imaginary third party vector, so it would be a shame to miss out on a chance for re-use just because of the issue of accessing the value type. Below shows the situation we want to make possible, where sum() can be reused with types such as double*. double* x = ...; int n = ...; sum(x, n); 1.2. TEMPLATE SPECIALIZATION 1.2 5 Template Specialization The solution to this problem is the traits class technique. To understand how traits classes work, we first need to review some facts about template classes and something called specialization. We start with a simple (though perhaps gratuitous) example of a template class: template class hello_world { public: void say_hi() { cout << "Hello world!" << endl; } }; One interesting thing about C++ templates is that they can be specialized. That is, a special version of the class can be explicitly created for a particular case of the template arguments (in this case T). So we can write this: template <> class hello_world { public: void say_hi() { cout << "I’m special!" << endl; } }; template <> class hello_world { public: void say_hi() { cout << "I’m special too!" << endl; } }; Now can you guess what happens when I do the following? hello_world h1; hello_world h2; hello_world h3; hello_world h4; h1.say_hi(); h2.say_hi(); h3.say_hi(); h4.say_hi(); Here’s what the output will look like: Hello world! Hello world! I’m special! I’m special too! In this example, template specialization allowed us to pick different versions of the say hi() member function based on some type (char, int, long, double*). The specializations of hello world created a mapping between a type and a version of say hi(). The original templated version of hello world acted like a default if none of the specializations covered the type. 6 CHAPTER 1. TRAITS CLASSES 1.3 Definition of a Traits Class In general, we can use specialization to create mappings from types to other nested types, member functions, or constants. It might be helpful to think of the template class as a function that executes at compile-time, whose input is the template parameters and whose output is the things nested in the class. A traits class is a class who’s sole purpose is to define such a mapping. Getting back to the generic sum() function, here’s an example traits class, templated on the Vector type, that allows us to get the element, or value type of the vector. For the default case, we’ll assume the vector is a class with a nested typedef like my vector: template struct vector_traits { typedef typename Vec::value_type value_type; }; But now we can also handle the case when the Vector template argument is something else like a double*: template <> struct vector_traits { typedef double value_type; }; or even some third party class, say johns int vector: template <> struct vector_traits { typedef int value_type; }; 1.4 Partial Specialization Now one might get bored of writing a traits class for every pointer type, or perhaps the third party class is templated. The solution to this is the longwinded term partial specialization. Here’s what you can write: template struct vector_traits { typedef T value_type; }; template struct vector_traits< johns_vector > { typedef T value_type; }; Your C++ compiler will attempt a pattern match between the template argument provided at the “call” to the traits class, and all the specializations defined, picking the specialization that is the closest match. The above partial 1.5. EXTERNAL POLYMORPHISM AND TAGS 7 specialization for T* will match whenever the type is a pointer, though the previous complete specializations for double* would match first for that particular pointer type. The most well known use of a traits class is the iterator traits class used in the Standard Template Library, which provides access to things like the value type and iterator category that is associated with an Iterator. MTL also uses traits classes, such as matrix traits. Typically a traits class is used with with a particular concept or family of concepts. The iterator traits class is used with the family of Iterator concepts. The matrix traits class is used with the familty of MTL Matrix concepts. The traits class is what provides access to the associated types of a concept. We will explain concepts, associated types, etc. in detail in Chapter 2. 1.5 External Polymorphism and Tags A technique that often goes hand in hand with traits classes is external polymorphism, which is a way of using function overloading to dispatch based on properties of a type. A good example of this is the implementation of the std::advance() function in the STL, which increments an iterator n times. Depending on the kind of iterator, there are different optimizations that can be applied in the implementation. If the iterator is random access (can jump forward and backward arbitrary distances), then the advance() function can simply be implemented with i += n, and is very efficient: constant time. If the iterator is bidirectional, then it makes sense for n to be negative, we can decrement the iterator n times. The relation between external polymorphism and traits classes is that the property to be exploited (in this case the iterator category) is accessed through a traits class. The main advance() function uses the iterator traits class to get the iterator category. It then makes a call the the overloaded advance() function. The appropriate advance() is selected by the compiler based on whatever type the iterator category resolves to, either input iterator tag, bidirectional iterator tag, or random access iterator tag. A tag is simply a class whose only purpose is to convey some property for use in external polymorphism. struct input_iterator_tag { }; struct bidirectional_iterator_tag { }; struct random_access_iterator_tag { }; template void __advance(InputIterator& i, Distance n, input_iterator_tag) { while (n--) ++i; } template void __advance(BidirectionalIterator& i, Distance n, bidirectional_iterator_tag) { 8 CHAPTER 1. TRAITS CLASSES if (n >= 0) while (n--) ++i; else while (n++) --i; } template void __advance(RandomAccessIterator& i, Distance n, random_access_iterator_tag) { i += n; } template void advance(InputIterator& i, Distance n) { typedef typename iterator_traits ::iterator_category Cat; __advance(i, n, Cat()); } Chapter 2 Concepts and Models Here we define the basic terminology of generic programming, much of which was introduced in [1]. In the context of generic programming, the term concept is used to describe the collection of requirements that a template argument must meet for the template function or templated class to compile and operate properly. The requirements are described as a set of valid expressions, associated types, invariants, and complexity guarantees. A type that meets the set of requirements is said to model the concept. 2.1 Requirements Valid Expressions are C++ expressions which must compile successfully for the objects involved in the expression to be considered models of the concept. Associated Types are types that are related to the modelling type in one of two ways. They are either provided by typedefs nested within class definition for the type, or they are accessed through a traits class, such as iterator traits. Invariants are typically run-time characteristics of the objects that must always be true, that is, the functions involving the objects must preserve these characteristics. Complexity Guarantees are maximum limits on how long the execution of one of the valid expressions will take, or how much of various resources its computation will use. 9 10 CHAPTER 2. CONCEPTS AND MODELS 2.2 Example: InputIterator Examples of concept definitions can be found in the C++ Standard, many of which deal with the requirements for iterators. The InputIterator1 concept is one of these. The following expressions are valid if the object i is an instance of some type that models InputIterator. ++i i++ *i The std::iterator traits class provides access to the associated types of an iterator type. In the following example we find out what type of object is pointed to by some iterator type (call it X) via the value type of the traits class. typename iterator_traits ::value_type t; t = *i; As for complexity guarantees, all of the InputIterator operations are required to be constant time. Some examples of types that satisfy the requirements for InputIterator are double*, std::list ::iterator, and std::istream iterator . 2.3 Concepts vs. Abstract Base Classes In many respects a concept is similar to an abstract base class (or interface): it defines a set of types that are interchangeable from the point of view of some algorithm (or collection of algorithms). Also, much in the way abstract base classes can inherit from (extend) other base classes, a concept can refine or add requirements to another concept. However a concept is much looser than an abstract base class: there is no inheritance requirement and the valid expressions offer more freedom than the requirement for member functions with exactly matching signatures. Also dispatch based on type is not required to be via a virtual function (though it still can be). Also, for expressions that involve multiple types, the function overload resolution can depend on both types, which avoids the binary method [3] problem associated with inheritance-based polymorphism. 2.4 Multi-type Concepts and Modules Most concepts describe expressions that involve interaction between two or more types. Often one of the types is the “main” type and the other types can be derived from the “main” type via typdefs or traits class (they are associated types). We talk of the “main” type as the one that models the concept. This is the case with the iterators and containers of the STL. In our example above, the dereference expression *i required by InputIterator returns a second type, 1 We always use the bold sans serif font for concept names. 2.5. CONCEPT CHECKING 11 namely the value type associated with the iterator type (the “main” type in this case). However, for some concepts there is not a “main” type. There are multiple types involved, none of which can be derived from the others. We call a set of types that together model a concept a module. For example, later in this chapter we will be defining the concept of a VectorSpace which consists of some vector type, a scalar type, and a multiplication operator between the two. Since one can multiply a complex number by a float, the module {complex ,float} is a model of VectorSpace. It is also true that the module {complex ,double} is a model of VectorSpace, so we see that there is not necessarily a one-to-one relationship between the vector type and the scalar type, as there was between the iterator type and value type discussed above. In the descriptions of multitype concepts we will list the participating types instead of associated types. No traits classes or nested typedefs are specified, as there is not a one-to-one mapping between the participating types. The participating types are typically readily available by other means. 2.5 Concept Checking The C++ language does not provide direct support for ensuring that template arguments meet the requirements demanded of them by the generic algorithm or template class. This means that if the user makes an error, the resulting compiler error will point to somewhere deep inside the implementation of the generic algorithm, giving an error that may not be easy to match with the cause. Together with the SGI STL team we have developed a method for forcing the compiler to give better error messages. The idea is to exercise all the requirements placed on the template arguments at the very beginning of the generic algorithm. We have created some macros and a methodology for how to do this. Suppose we wish to add concept checks to the STL copy() algorithm, which has the following prototype. template OutIter copy(InIter first, InIter last, OutIter result); We will need to make sure the InIter is a model of InputIterator and that OutIter is a model of OutputIterator. The first step is to create the code that will excercise the expressions associated with each of these concepts. The following is the concept checking class for OutputIterator. template struct OutputIterator { CLASS_REQUIRES(T, Assignable); void constraints() { (void)*i; // require ++i; // require i++; // require *i++ = *j; // require dereference operator preincrement operator postincrement operator postincrement and assignment 12 CHAPTER 2. CONCEPTS AND MODELS } T i, j; }; Once the concept checking classes are complete, one simple needs to invoke them at the beginning of the generic algorithm using our REQUIRE macro. Here’s what the STL copy() algorithm looks like with the concept checks inserted. template OutIter copy(InIter first, InIter last, OutIter result) { REQUIRE(OutIter, OutputIterator); REQUIRE(InIter, InputIterator); return copy_aux(first, last, result, VALUE_TYPE(first)); } Looking back at the OutputIterator concept checking class you might wonder why we used the CLASS REQUIRES macro instead of just REQUIRE. The reason for this is that different tricks are needed to force compilation of the checking code when invoking the macro from inside a class definition instead of a function. Sometimes there is more than one type involved in a concept. This means that the corresponding concept checking class will have more than one template argument. The VectorSpace concept checker is an example of one of this, it involves an AbelianGroup and a Field. template struct VectorSpace { CLASS_REQUIRES2(G, F, R_Module); CLASS_REQUIRES(F, Field); void constraints() { y = x / a; y /= a; } G x, y, z; F a; }; When invoking a concept checker with more than one type it is necessary to append the number of type arguments to the macro name. Here is an example of using a multi-type concept checker. template void foo(Vector& x, Real a) { REQUIRE2(Vector, Real, VectorSpace); ... } For the most part the user of MTL does not need to know how to create concept checks and insert them in generic algorithms, however it is good to 2.5. CONCEPT CHECKING 13 know what they are and how to use them. This will make the error messages easier to understand. Also if you are unsure about whether you can use a certain MTL class with a particular algorithm, or whether some custom-made class of your own is compatible, a good first check is to invoke the appropriate concept checker. Here’s a quick example program that one could write to see if the types std::complex and float together satisfy the requirements for VectorSpace. #include #include int main(int,char*[]) { using mtl::VectorSpace; REQUIRE2(std::complex , float, VectorSpace); return 0; } In addition, if you create generic algorithms of your own then we highly encourage you to create, publish and use concept checks. 14 CHAPTER 2. CONCEPTS AND MODELS Part II Introduction to Numerical Linear Algebra 15 Part III Tutorials 17 Chapter 3 Gaussian Elimination template void gaussian_elimination(Matrix& A) { typename matrix_traits ::size_type m = A.nrows(), n = A.ncols(), i, k; typename matrix_traits ::value_type s; for (k = 0; k < std::min(m-1,n-1); ++k) { if (A(k,k) != zero(s)) for (i = k + 1; i < m; ++i) { s = A(i,k) / A(k,k); A(i,all) -= s * A(k,all); } } } template void gauss_elim_with_partial_pivoting(Matrix& A, Vector& pivots) { typename matrix_traits ::size_type m = A.nrows(), n = A.ncols(), pivot, i, k; typename matrix_traits ::value_type s; for (k = 0; k < std::min(m-1,n-1); ++k) { pivot = k + max_abs_index( A(range(k,m),k) ); if (pivot != k) swap(A(pivot,all), A(k,all)); pivots[k] = pivot; if (A(k,k) != zero(s)) for (i = k + 1; i < m; ++i) { s = A(i,k) / A(k,k); A(i,all) -= s * A(k,all); } } 19 20 } CHAPTER 3. GAUSSIAN ELIMINATION Chapter 4 Pointwise LU Factorization This section shows how one could implement the usual pointwise LU factorization. The next section will describe an implementation of the blocked LU factorization. First, a quick review of LU factorization. It is basically gaussian elimination, where a general matrix is transformed into a lower triangular matrix and an upper triangular matrix. The purpose of this transformation is to solve a system of equations, and it is easy to solve a system once it is in triangular form (using forward or backward substitution). So if we start with the equation Ax = b, using LU factorization this becomes LU x = b. We can then solve the system in two simple steps: first we solve Ly = b (where y has replaced U x), and then we solve U x = y. For more background on LU factorization see [6] or [10]. The algorithm for LU factorization is given in Figure 4.1 and the graphical representation of the algorithm is given in Figure 4.2, as it would look part way through the computation. The black square represents the current pivot element. The horizontal shaded rectangle is a row from the upper triangle of the matrix. The vertical shaded rectangle is a column from the lower triangle of the matrix. The L and U labeled regions are the portions of the matrix that have already been updated by the algorithm. The algorithm has not yet reached the region labeled A0 . for i = 1 . . . min(M − 1, N − 1) find maximum element in the column section A(i + 1 : M, i) swap the row of maximum element with row A(i, :) scale column section A(i + 1 : M, i) by 1/A(i, i) let A0 = A(i + 1 : M, i + 1 : N ) A0 ← A0 + L(:, i)U (i, :) (rank one update) Figure 4.1: LU factorization pseudo-code. We will implement the LU factorization as a template function that takes a matrix input-output argument and a vector output-argument to record the 21 22 CHAPTER 4. POINTWISE LU FACTORIZATION A’ L U Figure 4.2: Diagram for LU factorization. pivots. The return type in an integer that will be zero if the algorithm is successful (the matrix is non-singular), otherwise the matrix is singular and U (i, i) is zero with i given as the return value. The return type is the size type which is associated with the matrix using the matrix traits class. For most cases one could get away with using int instead, but when writing a generic algorithm it is best to use this more portable method1 . Inside the algorithm we will use many of the matrix operations specified in the SubdividableMatrix concept, so we insert a REQUIRE clause to ensure that the user does not do something like try to call the function with a sparse matrix (which is not a model of SubdividableMatrix). In addition, we require the PVector type to really be a Vector2 . template typename matrix_traits ::size_type lu_factor(Matrix& A, PVector& pivots) { REQUIRE(Matrix, SubdividableMatrix); REQUIRE(PVector, Vector); ... } To make the indexing as simple as possible, we can create matrix objects to make explicit the upper and lower triangular views of the original matrix A. First the types of the views must be obtained. This is done using the triangle view traits class. We also create a typedef for the sub-matrix type, which will be used later. The U and L matrix objects are then created and the matrix A is passed to their constructors. The L and U objects are just handles, so their creation is inexpensive (constant time complexity). typedef typename triangle_view ::type Unit_Upper; typedef typename triangle_view ::type Unit_Lower; typedef typename submatrix ::type SubMatrix; Unit_Upper U(A); 1 Often times the biggest headache in porting large codes is changing integer types. This problem can be mitigated by the correct use of traits classes and the associated size type 2 The require clauses are not really necessary and can be left out. Their purpose is to make error messages more understandable when the user incorrectly applies a generic algorithm. 23 Unit_Lower L(A); Next we need to declare some index variables, again using the matrix traits class to get the correct type. typedef typename matrix_traits ::size_type SizeT; typedef typename matrix_traits ::value_type T; SizeT i, ip, M = A.nrows(), N = A.ncols(), info = 0; The first operation in the LU factorization is to find the maximum element in the column, which will tell us how to pivot. The expression A(i,range(i,M)) returns a subsection of the ith column, from the ith row to the bottom. The range describes a half-open interval which does not include the element A(i,M) (which would be out-of-bounds). The sub-column object is a full-fledge MTL Vector, and can be used with any of the MTL vector operations. The same is true for sub-rows and sub-matrices. We then apply the abs() function to create an expression object, which will apply abs() to each element of the sub-column during the evaluation of the max index() function. ip = max_index(abs(A(i,range(i,M)))); The next operation in the LU factorization is to swap the current row with the row that has the maximum element. if (ip != i) swap((A(i,all), A(ip,all)); The third operation in the LU factorization is to scale the column under the pivot by 1/A(i, i). The use of the identity() function needs some explaining. Since we are writing a generic algorithm, we do not know the element type for the matrix, and therefore can not be sure that an integer constant 1 is convertible to the element type. The solution is to use the generic identity() function provided by MTL which returns a 1 of the appropriate type. L(all,i) *= identity(T()) / A(i,i); The final operation in the LU factorization is to update the trailing submatrix according to A0 ← A0 + L(:, i)U (i, :). SubMatrix Aprime = A(range(i+1, M), range(i+1, N)); Aprime -= L(all,i) * U(i,all); The complete LU factorization implementation is given in Figure 4.3. 24 CHAPTER 4. POINTWISE LU FACTORIZATION template typename matrix_traits ::size_type lu_factor(Matrix& A, PivotVector& pivots) { typedef typename triangle_view ::type Unit_Upper; typedef typename triangle_view ::type Unit_Lower; typedef typename submatrix ::type SubMatrix; typedef typename matrix_traits ::size_type SizeT; typedef typename matrix_traits ::value_type T; REQUIRE(Matrix, SubdividableMatrix); REQUIRE(Unit_Lower, SubdividableMatrix); REQUIRE2(Unit_Lower, T, VectorSpace); REQUIRE2(SubMatrix, Ring); REQUIRE(PivotVector, Vector); Unit_Upper U(A); Unit_Lower L(A); int info = 0; SizeT i, ip, M = A.nrows(), N = A.ncols(); for (i = 0; i < min(M - 1, N - 1); ++i) { ip = max_index(abs(A(i,range(i,M)))); /* find pivot */ pivots[i] = ip + 1; if ( A(ip, i) != zero(T()) ) { /* make sure pivot isn’t zero */ if (ip != i) swap((A(i,all), A(ip,all)); /* swap the rows i and ip */ L(all,i) *= identity(T()) / A(i,i); /* update column under the pivot */ } else { info = i + 1; break; } SubMatrix Aprime = A(range(i+1, M), range(i+1, N)); Aprime -= L(all,i) * U(i,all); /* update the submatrix */ } pivots[i] = i + 1; return info; } Figure 4.3: Complete MTL version of pointwise LU factorization. Chapter 5 Blocked LU Factorization The execution time of many linear algebra operations on modern computer architectures can be decreased dramatically through blocking to increase cache utilization [4, 5]. In algorithms where repeated matrix-vector operations are done (such as the rank-one-update of the LU factorization), it is beneficial to convert the algorithm to use matrix-matrix operations to introduce more opportunities for blocking. Here we give an example of how to MTL to reformulate the LU factorization algorithm to use more matrix-matrix operations [?]. First the matrix A is split into four submatrices, using a blocking factor of r. A11 is r × r and A12 is r × n − r where dim(A) = n × n. A= A11 A21 A12 A22 Then A = LU is formulated in terms of the blocks. A11 A21 A12 A22 = L11 L21 0 L22 U11 0 U12 U22 From this the following equations are derived for the submatrices of A. The matrix on the right shows the values that should occupy A after one step of the blocked LU factorization. A11 A12 A21 A22 = = = = L11 U11 L11 U12 L21 U11 L21 U12 + L22 U22 L11 \ U11 L21 U12 Ã22 L11 , U11 , and L21 are updated by applying the pointwise LU factorization to the combined region of A11 and A21 . U12 is then updated with a triangular 25 26 CHAPTER 5. BLOCKED LU FACTORIZATION solve applied to A12 . Finally Ã22 is calculated with a matrix product of L21 and U12 . The algorithm is then applied recursively to Ã22 . In the implementation of block LU factorization, MTL can be used to create a partitioning of the matrix into submatrices. The use of the submatrix objects throughout the algorithm removes the redundant indexing that a programmer would typically have to do to specify the regions for each submatrix for each operation. The code below shows how the partitioning is performed that corresponds to Figure 5.1 with the matrix objects A 0, A 1, and A 2. The partitioning for Figure 5.2 is created with matrix objects A 11, A 12, A 21, and A 22. SubMatrix A_0 = A(range(j,M), range(0,j)), A_1 = A(range(j,M), range(j,j+jb)), A_2 = A(range(j,M), range(j+jb,N)), A_11 = A(range(j,j+jb), range(j,j+jb)), A_12 = A(range(j,j+jb), range(j+jb,N)), A_21 = A(range(j+jb,M), range(j,j+jb)), A_22 = A(range(j+jb,M), range(j+jb,N)); triangle_view ::type L_11(A_11); Figure 5.1 depicts the block factorization part way through the computation. The matrix is divided up for the pointwise factorization step. The region including A11 and A21 is labeled A1 . Since there is pivoting involved, the rows in the regions labeled A0 and A2 must be swapped according to the pivots used in A1 . U L A0 A1 A2 Figure 5.1: Pointwise step in block LU factorization. The implementation of this step in MTL is very concise. The A 1 submatrix object is passed to the lu factorize() algorithm. The multi swap() function is then used on A 0 and A 2 to pivot their rows to match A 1. PivotVector sub_pivots(jb); S ret = lu_factor(A_1, sub_pivots); if (ret != 0) return ret + j; for (S i = j; i < min(M, j + jb); ++i) pivots[i] = sub_pivots[i-j] + j; 27 if (j > 0) permute(A_0, sub_pivots, left_size()); if (j + jb < M) { permute(A_2, sub_pivots, left_side()); Once A1 has been factored, the A12 and A22 regions must be updated. The submatrices involved are depicted in Figure 5.2. The A12 region needs to be updated with a triangular solve. U U11 L L 11 L 21 A 12 A 22 Figure 5.2: Update steps in block LU factorization. To apply the tri solve() algorithm to L11 and A12 , we merely call the MTL routine and pass in the L 11 and A 12 matrix objects. tri_solve(L_11, A_12, left_side()); The last step is to calculate Ã22 with a matrix-matrix multiply according to Ã22 ← A22 − L21 U12 . The A 12 and A 21 matrix objects are used to implement then operation. They have been overwritten in the previous steps with U12 and L21 . A_22 -= A_21 * A_12; The complete version of the MTL block LU factorization algorithm is given in Figure 5.3. 28 CHAPTER 5. BLOCKED LU FACTORIZATION template typename Matrix::size_type block_lu(Matrix& A, PivotVector& pivots) { typedef typename Matrix::value_type T; typedef typename Matrix::size_type S; typedef typename submatrix_view ::type SubMatrix; const S BF = LU_BF; // blocking factor const S M = A.nrows(); const S N = A.ncols(); if (min(M, N) <= BF || BF == 1) return lu_factor(A, pivots); for (S j = 0; j < min(M, N); j += BF) { S jb = min(min(M, N) - j, BF); SubMatrix A_0 = A(range(j,M), range(0,j)), A_1 = A(range(j,M), range(j,j+jb)), A_2 = A(range(j,M), range(j+jb,N)), A_11 = A(range(j,j+jb), range(j,j+jb)), A_12 = A(range(j,j+jb), range(j+jb,N)), A_21 = A(range(j+jb,M), range(j,j+jb)), A_22 = A(range(j+jb,M), range(j+jb,N)); triangle_view ::type L_11(A_11); PivotVector sub_pivots(jb); S ret = lu_factor(A_1, sub_pivots); if (ret != 0) return ret + j; for (S i = j; i < min(M, j + jb); ++i) pivots[i] = sub_pivots[i-j] + j; if (j > 0) permute(A_0, sub_pivots, left_size()); if (j + jb < M) { permute(A_2, sub_pivots, left_side()); tri_solve(L_11, A_12, left_side()); A_22 -= A_21 * A_12; } } } Figure 5.3: MTL version of block LU factorization. Chapter 6 Preconditioned GMRES(m) One important use for MTL is for the rapid construction of numerical libraries. Although not necessary, a generic approach can be used when developing these libraries as well, resulting in reusable scientific software at a higher level. To demonstrate how one might use MTL for a non-trivial high-level library, we show the complete implementation of the preconditioned GMRES(m) algorithm [9] in Figure 6.1 (taken from our Iterative Template Library). The basic algorithmic steps (corresponding to the GMRES algorithm as given in [9]) are given in the comments and the calls to MTL in the body of the algorithm should be fairly clear. Some of the other code may seem somewhat impenetrable at first glance, so we’ll take a quick walk-through the more difficult statements. The algorithm parameterizes GMRES in some important ways, as shown in the template statement on lines 1 and 2. The matrix and vector types are parameterized, so that any matrix type can be used. In particular, matrices having any element type can be used — e.g., real or complex. In fact, matrices without explicit elements at all (matrix-free operators [2]) can be used. All that is required is for the mult() algorithm be suitably defined. For MTL matrices, the generic MTL mult() will generally suffice. For non-MTL matrices, or matrix-free operators, a suitably overloaded mult() must be provided. There are also two other type parameterizations of interest, the Preconditioner and the Iteration. Similar to the parameterization of the matrix, the preconditioner type is parameterized so that arbitrary preconditioners can be used. It is only required that the preconditioner be callable with the solve() algorithm. The Iteration type parameter allows the user to control the stopping criterion for the algorithm. (Pre-defined stopping criteria are included as part of ITL.) The using namespace mtl statement on line 6 allows us to access MTL functions (which are all declared within the mtl namespace) without explicitly using the mtl:: scope. The use of a namespace helps to prevent name clashes with other libraries and user code. On line 7, we use the internally defined typedef for the value type to determine the type of the individual elements of the matrix. All MTL matrix and 29 30 CHAPTER 6. PRECONDITIONED GMRES(M) template int gmres(const Matrix &A, Vector &x, const VectorB &b, const Preconditioner &Minv, int m, Iteration& outer, Norm norm, InnerProduct dot) { typedef typename Matrix::value_type T; typedef mtl::matrix , dense<>, column_major>::type InternalMatrix; typedef itl_traits ::internal_vector InternalVec; REQUIRE4(InternalVec, T, Norm, InnerProduct, HilbertSpace); REQUIRE4(Matrix, Vector, VectorB, T, LinearOperator); REQUIRE4(Preconditioner, InternalVec, InternalVec, T, LinearOperator); InternalMatrix H(m+1, m), V(size(x), m+1); InternalVec s(m+1), w, r, u; std::vector< givens_rotation > rotations(m+1); w = b - A * x; r = Minv * w; typename Iteration::real beta = std::abs(norm(r)); while (! outer.finished(beta)) { // Outer iteration V[0] = r /beta; s = zero(s[0]); s[0] = beta; int j = 0; Iteration inner(outer.normb(), m, outer.tol()); do { u = A * V[j]; w = Minv * u; for (int i = 0; i <= j; i++) { H(i,j) = dot(w, V[i]); w -= V[i] * H(i,j); } H(j+1,j) = norm(w); V[j+1] = w / H(j+1,j); // Inner iteration // QR triangularization of H for (int i = 0; i < j; i++) rotations[i].apply(H(i,j), H(i+1,j)); rotations[j] = givens_rotation (H(j,j), H(j+1,j)); rotations[j].apply(H(j,j), H(j+1,j)); rotations[j].apply(s[i], s[i+1]); ++inner, ++outer, ++j; } while (! inner.finished(std::abs(s[j]))); // Form the approximate solution tri_solve(tri_view ()(H(range(0, j), range(0, j))), s); x += V(range(0,x.size()), range(0,j)) * s; // Restart w = b - A * x; r = Minv * w; beta = std::abs(norm(r)); } return outer.error_code(); } Figure 6.1: The Iterative Template Library (ITL) implementation of the preconditioned GMRES(m) algorithm. This algorithm computes an approximate solution to Ax = b preconditioned with M . The restart value is specified by the parameter m. 31 vector classes have an accessible type member called value type that specifies the type of the element data. By using this internal type, rather than a fixed type, we can make the algorithm generic with respect to element type. Finally, although the entire algorithm fits on a single page, it is not a toy implementation — it is both high-quality and high-performance. This is where the power of reusable software components can be appreciated. Now that there exists a generic GMRES(m) that has been implemented, tested, and debugged, programmers wanting to use GMRES are forever spared the work of implementation, testing, and debugging GMRES themselves. Both time and reliability are gained. 32 CHAPTER 6. PRECONDITIONED GMRES(M) Part IV Reference Manual 33 Chapter 7 Concepts 35 36 7.1 CHAPTER 7. CONCEPTS Algebraic Concepts One of the beautiful aspects of linear algebra is that in many respects matrices and vectors act like numbers, for instance you can add and multiply them. Of course, matrices and vectors don’t act exactly like numbers, there are some important differences and restrictions on the operations. And since many algorithms operate on matrices at this abstraction level (without looking “inside” the matrices) it is important to formulate a precise description of this abstraction level. Fortunately, mathematicians [7, 8, 11] have already developed a very precise definition for a linear algebra which we will use, merely adapting it to C++ syntax. The section following this one will describe the interface for looking “inside” the matrix and vector data structures. The definition of a linear algebra is somewhat complex and relies on a family of algebraic concepts which we also define here. If you are not particularly interested in this mathematical structure, you can skip forward to the description of the LinearAlgebra concept in Section 7.1.12, which summarizes all of the operations on matrices and vectors. Later, when you encounter algorithms that use the other concepts defined here, you can return to this section for reference. With the following concept definitions we attempt to map the purely mathematical algebraic concepts into the realm of C++ components. Due to many practical considerations the mapping is not perfect, and the mathematical concepts will be stretched and bent here and there. The overall purpose here is to define useful interfaces and terminology for use in the precise documentation of algorithms in C++. This is not a theoretical exercise for determining how closely we can model the mathematical concepts in C++. In our formulation of these C++ concepts we have left out the mathematical concepts that do not aid in defining interfaces for real C++ components. However, much of the fine granularity present in the mathematical concepts has been retained. This granularity is useful when documenting algorithms, as it makes it easier to choose a concept that closely matches the minimal requirements for each parameter to an algorithm. For instance, most of the algorithms in ITL only use the subset of matrix operations contained in the LinearOperator concept. Figure 7.1 gives an overview of the algebraic concepts, with the arrows representing the refinement relationship. Equality Stating whether two objects are “equal” is somewhat of a sticky subject. To begin with, we will want to talk about whether objects are equal even if there is not an operator== defined for that type (it is not EqualityComparable). Also, when dealing with floating point numbers we want to talk about an equality that is much looser than the bit-level equality that is given by operator==. To this end we use the symbol = to mean a = b iff |a − b| < where is some appropriate small number for the situation (like machine epsilon). If the number is not LessThanComparable or it does not make sense to take the absolute value, then = means that during computation, if the value on the left-hand-side was 7.1. ALGEBRAIC CONCEPTS 37 VectorSpace BanachSpace HilbertSpace R-Module Field LinearAlgebra Ring AbelianGroup LinearOperator TransposableLinearOperator Figure 7.1: Refinement of the algebraic concepts. substituted with the value on the right-hand-side, the difference in the resulting behaviour of the program would not be large enough to care about. 7.1.1 AbelianGroup A group is a set of elements and an operator over the set that obeys the associative law and has an identity element. If the operator is commutative it is called an Abelian group, and if the notation used for the operator is + then it is an additive group. The concept AbelianGroup we define here is an additive Abelian group . Refinement of Assignable Notation X a,b,c is a type that is a model of AbelianGroup. are objects of type X. Valid Expressions • Addition a + b Return Type: X or a type convertible to X that is also a model of Semantics: AbelianGroup. See below for the invariants. 38 CHAPTER 7. CONCEPTS • Addition Assignment a += b Return Type: Semantics: X& Equivalent to a = a + b. • Additive Inverse -a Return Type: X or a type convertible to X that is also a model of Semantics: AbelianGroup. See below. • Subtraction a - b Return Type: X or a type convertible to X that is also a model of Semantics: AbelianGroup. Equivalent to a + -b. • Subtraction Assignment a -= b Return Type: Semantics: X& Equivalent to a = a + -b. • Zero Element (Additive Identity) zero(a) Return Type: Semantics: X This function returns a zero element of the same type as the argument a. a is not changed, the purpose of the argument is merely to carry type information and also size information in the case of vectors and matrices. See below for the algebraic properties of the zero element. Invariants • Associativity (a + b) + c = a + (b + c) • Definition of the Identity Element a + zero(a) = a • Definition of Additive Inverse a + -a = zero(a) • Commutativity a + b = b + a Models • int • float 7.1. ALGEBRAIC CONCEPTS 39 • complex • vector ::type • matrix ::type Constraints Checking Class template struct AbelianGroup { void constraints() { c = a + b; b += a; b = -a; c = a - b; b -= a; b = zero(a); } X a, b, c; }; 7.1.2 Ring A Ring adds the notion of a second operation, namely multiplication, to the concept of a AbelianGroup. The multiplication obeys the law of associativity and distributes with addition. Adding a unity element (the multiplicative identity) to a ring give a ring-with-unity. For simplicity we will include the requirement for a unity element in the Ring concept. Another variant of the Ring concept adds the requirement that multiplication be commutative. We will refer to this concept as a CommutativeRing Refinement of AbelianGroup Notation X a,b is a type that is a model of Ring. are objects of type X. Requirements • Multiplication a * b Return Type: convertible to type X. X or a type convertible to X. 40 CHAPTER 7. CONCEPTS • Multiplicative Identity Element identity(a) Return Type: Semantics: X Returns the appropriate identity element for the type of a. The argument a is not changed, its purpose is to carry type information and also size information in the case when a is a matrix. The algebraic properties of the identity element are listed below. Invariants • Associativity of Multiplication a * (b * c) = (a * b) * c • Distributivity a*(b + c) = a*b + a*c (b + c)*a = b*a + c*a • Definition of Multiplicative Identity a * identity(a) = a • Commutativity (for a CommutativeRing) a * b = b * a Models • int • float • complex • matrix ::type Constraints Checking Class template struct Ring { CLASS_REQUIRES(X, AbelianGroup); void constraints() { c = a * b; b = identity(a); } X a, b, c; }; 7.1. ALGEBRAIC CONCEPTS 7.1.3 41 Field A Field adds the notion that there is always a solution for the equations ax = b ya = b ∀a 6= 0. This means that the set is closed under division, so we can add the division operator to the requirements. Refinement of Ring, EqualityComparable, and LessThanComparable Notation is a type that is a model of Field. are objects of type X. X a,b Valid Expressions • Division a / b Return Type: X or a type convertible to X. • Division Assignment a /= b Return Type: X& Invariants • Definition of Multplicative Inverse if a * x = b then x = b/a. Models • float • double • complex Constraints Checking Class template struct Field { CLASS_REQUIRES(X, Ring); CLASS_REQUIRES(X, EqualityComparable); CLASS_REQUIRES(X, LessThanComparable); 42 CHAPTER 7. CONCEPTS void constraints() { c = a / b; b /= a; } X a, b, c; }; 7.1.4 R-Module The R-Module concept defines multiplication (or “scaling”) between an AbelianGroup and a Ring, where the result of the multiplication is an object of the group type. Also the multiplication must be associative and it must distribute with the addition operator of the AbelianGroup. Refinement of AbelianGroup Notation G R a, b x is a type that is a model of AbelianGroup. is a type that is a model of Ring. are objects of type R is an object of type X Participating Types • Vector Type The type that plays the role of the vector (type G) and that is a model of AbelianGroup. • Scalar Type The type that plays the role of the scalar (type R) and that is a model of Ring. Valid Expressions • Right Scalar Multiplication x * a Return Type: G or a type convertible to G. • Left Scalar Multiplication a * x Return Type: G or a type convertible to G which also with the scalar type satisfies R-Module. • Scalar Multiplication Assignment x *= a Return Type: G& 7.1. ALGEBRAIC CONCEPTS 43 Invariants • Distributive (a + b)*x = a*x + b*x a*(x + y) = a*x + a*y • Associative a * (b * x) = (a * b) * x • Identity identity(a) * x = x Models • { vector ::type, int } • { matrix ::type, int } Constraints Checking Class template struct R_Module { CLASS_REQUIRES(G, AbelianGroup); CLASS_REQUIRES(R, Ring); void constraints() { y *= a; y = x * a; y = a * x; } G x, y; R a; }; 7.1.5 VectorSpace The VectorSpace concept is a refinement of the R-Module concept, adding the addition requirement that the scalar type be a model of Field instead of Ring. With this we are able to define vector division by a scalar. A VectorSpace is also called a F-module. Refinement of R-Module 44 CHAPTER 7. CONCEPTS Notation is is is is G F x a a type that is a model of AbelianGroup. a type that is a model of Field. an object of type G an object of type F Participating Types • Vector Type The type that plays the role of the vector (type G) and that is a model of AbelianGroup. • Scalar Type The type that plays the role of the scalar (type F) and that is a model of Field. Valid Expressions • Scalar Division x / a Return Type: G or a type convertible to G. • Scalar Division Assignment x /= a Return Type: G& Models • { vector ::type, float } • { matrix ::type, double } • { std::valarray , float } • { std::complex , float } Constraints Checking Class template struct VectorSpace { CLASS_REQUIRES2(G, F, R_Module); CLASS_REQUIRES(F, Field); void constraints() { y = x / a; y /= a; } G x, y; F a; }; 7.1. ALGEBRAIC CONCEPTS 7.1.6 45 FiniteVectorSpace, FiniteBanachSpace, FiniteHilbertSpace A finite-dimensional vector space has a basis consisting of finite number of vectors x1 , . . . , xn where n is the dimension of the vector space. Any vector in such a space can be expressed in terms of n coordinates with respect to the basis. With this in mind, the concept FiniteVectorSpace requires a method of access for the coordinates of a vector and access to the dimension, namely the operator[] and a size() function. The behaviour of these operations and the associated traits information is described in BasicVector. The next two vector space concepts, BanachSpace and HilbertSpace, also have finite-dimensional variants which we will call FiniteBanachSpace and FiniteHilbertSpace. Refinement of VectorSpace and BasicVector Constraints Checking Class template struct FiniteVectorSpace { CLASS_REQUIRES2(G, F, VectorSpace); CLASS_REQUIRES(G, BasicVector); }; template struct FiniteBanachSpace { CLASS_REQUIRES3(G, F, Norm, BanachSpace); CLASS_REQUIRES(G, BasicVector); }; template struct FiniteHilbertSpace { CLASS_REQUIRES4(G, F, Norm, InnerProduct, HilbertSpace); CLASS_REQUIRES(G, BasicVector); }; 7.1.7 BanachSpace A BanachSpace is basically a VectorSpace composed with a definition for a norm function. More technically, a BanachSpace is a complete normed vector space [8]. Since one may want to use the same generic algorithm on different spaces with different norms, we specify access to the norm function through a function object which is passed to algorithms or classes that operate on a BanachSpaces1 . 1 The MTL and ITL algorithms provide the 2-norm as a default. 46 CHAPTER 7. CONCEPTS Refinement of VectorSpace Notation is is is is is is {G,F} Norm x,y a,b r norm a module that models VectorSpace. a functor type as defined below. an object of type X. an object of type F. an object of type magnitude ::type. an object of type Norm. Participating Types • Norm Functor Type The Norm type is a function object that can be applied to vector type X and whose return type is magnitude ::type. In addition, the norm function satisfies the invariants listed below. Associated Types • Magnitude Type magnitude ::type The return type of abs() applied to the scalar type F. Typically this is some real number type. Valid Expressions • Norm Functor Application norm(x) Return Type: Semantics: magnitude ::type See below. • Absolute Value abs(a) Return Type: Semantics: magnitude ::type The distance between a and zero. Invariants The norm functor must obey the following invariants. • norm(x) >= zero(r) • norm(x) == zero(r) iff x == zero(x) • Homogeneity norm(a*x) = abs(a) * norm(x) 7.1. ALGEBRAIC CONCEPTS 47 • Triangle Inequality norm(x + y) <= norm(x) + norm(y) The abs() function must obey a similar set of invariants. • abs(a) >= zero(r) • abs(a) == zero(r) iff a == zero(a) • Homogeneity abs(a*b) = abs(a) * abs(b) • Triangle Inequality abs(a + b) <= abs(a) + abs(b) Constraints Checking Class template struct BanachSpace { CLASS_REQUIRES2(G, F, VectorSpace); void constraints() { r = norm(x); r = abs(a); } G x; F a; Norm norm; typename magnitude ::type r; }; 7.1.8 HilbertSpace A HilbertSpace is basically a VectorSpace composed with an inner product function. The inner product is also known as dot product or scalar product. Similarly to the norm of the BanachSpace, the inner product must be provided as a function object2 . Refinement of BanachSpace Notation {G,F} InnerProduct x,y,z a,b dot 2 MTL is a module that models BanachSpace. is a functor as defined below. are objects of type X. are objects of type F. is an object of type InnerProduct. and ITL algorithms use mtl::dot functor as a default for the function object. 48 CHAPTER 7. CONCEPTS Participating Types • Inner Product Type The InnerProduct type is a function object that can be applied to two vectors of type X and whose return type is the scalar type F. In addition, the inner product satisfies the invariants listed below. Valid Expression • Inner Product Functor Application dot(x,y) Return Type: Semantics: Preconditions: F See below. size(x) == size(y) if they are finite • Scalar Conjugate conj(a) Return Type: Semantics: F • Vector Conjugate conj(x) Return Type: Semantics: convertible to X Invariants The dot functor obeys the following invariants. • dot(x,x) > zero(a) for all x != zero(x) • dot(x,x) == zero(a) if x == zero(x) • dot(x,y) == dot(y,conj(x)) • dot(a*x + b*y, z) = a*dot(x,z) + b*dot(y,z) • dot(x, a*y + b*z) = conj(a)*dot(x,y) + conj(b)*dot(x,z) • norm(x) = sqrt(dot(x,x)) The scalar conj() function must obey the following laws. • conj(a * b) = conj(a) * conj(b) • conj(a + b) = conj(a) + conj(b) The vector conj() function must obey similar laws. • conj(x * y) = conj(x) * conj(y) • conj(x + y) = conj(x) + conj(y) 7.1. ALGEBRAIC CONCEPTS 49 Constraints Checking Class template struct HilbertSpace { CLASS_REQUIRES3(G, F, Norm, BanachSpace); void constraints() { a = dot(x, y); a = conj(a); y = conj(x); } G x, y; F a; InnerProduct dot; }; 7.1.9 LinearOperator A linear operator is a function from one vector space to another. Also known as a linear transformation. The concept LinearOperator consists of an operator and two vector types, a domain and range vector type, and a scalar type. Notation Op F {VX,F} {VY,F} A x,w a is the operator type. is the scalar type which must model Field. is a module that models VectorSpace. is a module that models VectorSpace. is an object of type Op are objects of type VX is an object of type F Participating Types • Operator Type The type of the operator function Op which can be applied to the domain vector space. • Domain Vector Space A type that models VectorSpace which is the argument to the LinearOperator. An access method (via a traits class) is not provided since a given linear operator type may be applicable to many different vector types. • Range Vector Space A type that models VectorSpace (or at least is convertible to such a type). 50 CHAPTER 7. CONCEPTS Associated Types • Size Type matrix traits ::size type The Integral type that is the return type for the nrows() and ncols() functions. Valid Expressions • Operator Application A * x Return Type: a type that models the range vector space (or is convertible to it). Invariates • Linearity A * (x + w) = A * x + A * y. A * (x * a) = (A * x) * a Constraints Checking Class template struct LinearOperator { CLASS_REQUIRES2(VX, F, VectorSpace); CLASS_REQUIRES2(VY, F, VectorSpace); void constraints() { y = A * x; } Op A; VX x; VY y; }; 7.1.10 FiniteLinearOperator A FiniteLinearOperator is a LinearOperator the operates over finite vector spaces. Notation Op F {VX,F} {VY,F} A x is is is is is is the operator type. the scalar type which must model Field. a module that models VectorSpace. a module that models VectorSpace. an object of type Op an object of type VX 7.1. ALGEBRAIC CONCEPTS 51 Associated Types • Size Type matrix traits ::size type The Integral type that is the return type for the nrows() and ncols() functions. Valid Expressions • Operator Application A * x Return Type: Preconditions: Semantics: a type that models the range vector space (or is convertible to it). ncols(A) == size(x) The returned vector will have size equal to nrows(A). • Number of Rows nrows(A) Return Type: Semantics: matrix traits ::size type The dimension of the resulting vector space. • Number of Columns ncols(A) Return Type: Semantics: matrix traits ::size type The dimension of the input vector space. Constraints Checking Class template struct FiniteLinearOperator { CLASS_REQUIRES2(Op, VX, VY, F, LinearOperator); void constraints() { m = nrows(A); n = ncols(A); } Op A; typename matrix_traits ::size_type m, n; }; 7.1.11 TransposableLinearOperator A TransposableLinearOperator is simply a LinearOperator for which the transpose of the linear operator can also be applied. Refinement of LinearOperator 52 CHAPTER 7. CONCEPTS Notation X V A y is is is is a type models TransposableLinearOperator. a type that models VectorSpace. an object of type X an object of type V Associated Types In addition to the associated types inherited from LinearOperator we have: • Transpose Linear Operator Type transpose view ::type The type returned by trans(A), which is a transposed view into the matrix. Valid Expressions • Transposed Operator Application (Transposed Matrix-Vector Multiplication) trans(A) * y Return Type: some type that models VectorSpace or is convertible to one. Preconditions: Semantics: nrows(A) == size(y) The resulting vector (if it is finite) will have size equal to ncols(A). • Transposed Operator Application (Left Matrix-Vector Multiplication) y * A Return Type: The return type will be a model of VectorSpace (or at least convertible to one) Preconditions: Semantics: nrows(A) == size(y) The resulting vector (if it is finite) will have size equal to ncols(A). Constraints Checking Class template struct TransposableLinearOperator { CLASS_REQUIRES4(X, VX, VY, F, LinearOperator); void constraints() { typedef typename transpose_view ::type Transpose; Transpose AT = trans(A); y = AT * x; y = x * A; } X A; VX x; VY y; }; 7.1. ALGEBRAIC CONCEPTS 7.1.12 53 LinearAlgebra The LinearAlgebra adds several operations to the LinearOperator concept. With the scalar multiplication and addition operators the LinearOperator forms a VectorSpace, and with multiplication the LinearOperator forms a Ring. The multiplication operator associated with this Ring is the composition of linear operators, which in the context of MTL is matrix multiplication. One of the requirements for a Ring is that it be closed under multiplication. Matrix multiplication is only closed for square matrices. In the general case of rectangular matrices none of the matrices involved in the multiplication are over the same vector space (for C = AB, A maps Rm → Rk , B maps Rk → Rn , and C maps Rm → Rn ). We gloss over this distinction and include rectangular matrices in the LinearAlgebra concept. The LinearAlgebra concept does not introduce any new requirements, it is just a composition of the algebraic concepts discussed in this chapter. However, to help clarify the definition of this rather large concept, here we list the complete set of requirements. The usual definition for the computation of these operations is listed with each operator merely as a reminder to the reader and does not require that the operation be implemented in that manner. We use t to denote the temporary vector or matrix resulting from some of the operations 3 . Refinement LinearOperator, VectorSpace, and Ring Notation Matrix Vector A,B r, c x,y a is a type that models TransposableLinearOperator. is a type that models VectorSpace. is an object of type Matrix. are objects of type Matrix, which in this case is a VectorMatrix (a row or column vector). are objects of type Vector. is an object of type Scalar. Participating Types • The Matrix Type • The Vector Type • The Scalar Type 3 Due to the expression template optimization that MTL performs, in many cases the temporary is not actually formed 54 CHAPTER 7. CONCEPTS Valid Expressions • Vector Addition (ti = xi + yi ) x + y Preconditions: size(x) == size(x) if x is finite-dimensional. • Vector Addition Assignment (xi = xi + yi ) x += y Preconditions: size(x) == size(y) • Vector Additive Inverse (ti = −xi ) -x • Vector Subtraction (ti = xi − yi ) x - y Preconditions: size(x) == size(y) • Vector Subtraction Assignment (xi = xi − yi ) x -= y Preconditions: size(x) == size(x) • Zero Vector (ti = 0) zero(x) Semantics: This function returns a zero vector of the same type as the argument x. x is not changed, the purpose of the argument is merely to carry type and size information. • Vector-Scalar Multiplication (ti = xi α) x * a Return Type: Vector or a type convertible to Vector that is a model of VectorSpace. • Scalar-Vector Multiplication (ti = αxi ) a * x Return Type: Vector or a type convertible to Vector. that is a model of VectorSpace. • Vector-Scalar Multiplication Assignment (xi = xi α) x *= a Return Type: Vector& • Matrix Addition (tij = aij + bij ) A + B Preconditions: nrows(A) == nrows(B) && ncols(A) == ncols(B) • Matrix Addition Assignment (bij = bij + aij ) B += A Preconditions: nrows(A) == nrows(B) && ncols(A) == ncols(B) • Matrix Additive Inverse (tij = −aij ) -A 7.1. ALGEBRAIC CONCEPTS 55 • Matrix Subtraction (tij = aij − bij ) A - B Preconditions: nrows(A) == nrows(B) && ncols(A) == ncols(B) • Matrix Subtraction Assignment (bij = bij − aij ) B -= A Return Type: Preconditions: Matrix& nrows(A) == nrows(B) && ncols(A) == ncols(B) • Zero Matrix (tij = 0) zero(A) Return Type: Semantics: Matrix This function returns a zero matrix of the same type as the argument A. A is not changed, the purpose of the argument is merely to carry type and size information. • Matrix-Scalar Multiplication (tij = aij α) A * a • Scalar-Matrix Multiplication (tij = αaij ) a * A • Matrix Scalar Multiplication Assignment (aij = aij α) A *= a Return Type: Matrix& • Matrix-Vector Multiplication (ti = P i aij xj ) A * x Preconditions: Semantics: ncols(A) == size(x) The return type convertible to a model of Vector, and the size is equal to nrows(A). P • Transposed Matrix-Vector Multiplication (tj = j aij yi ) trans(A) * y Preconditions: Semantics: nrows(A) == size(y) The return type convertible to a model of Vector, and the size is equal to ncols(A). P • Left Matrix-Vector Multiplication (tj = j yi aij ) y * A Preconditions: Semantics: • Inner Product ( nrows(A) == size(y) The return type convertible to a model of Vector, and the size is equal to ncols(A). P i ri ci ) r * c Return Type: Preconditions: Scalar size(r) == size(c) 56 CHAPTER 7. CONCEPTS • Outer Product (tij = ci rj ) c * r Semantics: The return type is a model of Matrix with dimensions The matrix defaults to be rowmajor, and is always dense. P • Matrix Multiplication (tij = k aik bkj ) (size(c),size(r)). A * B Preconditions: Semantics: ncols(A) == nrows(B) The return type is a model of Matrix with dimensions (nrows(A),ncols(B)). • Identity Matrix (tii = 1) identity(A) Return Type: Semantics: Matrix Returns the identity matrix for the same type as A. The argument A is not changed, its purpose is to carry type and size information for the creation of the identity matrix. • Matrix Transpose (tij = aji ) trans(A) Return Type: Semantics: transpose view ::type Returns a transposed view of the matrix, where A(i,j) appears to be A(j,i). • Row and Column Vector Transpose trans(r) and trans(c) Return Type: Semantics: transpose view ::type If Matrix is a row it becomes a column and vice-versa. Constraints Checking Class template struct LinearAlgebra { CLASS_REQUIRES4(Op, VX, VY, F, LinearOperator); CLASS_REQUIRES2(Op, F, VectorSpace); CLASS_REQUIRES(Op, Ring); }; 7.2. COLLECTION CONCEPTS 7.2 57 Collection Concepts 7.2.1 Collection A Collection is a concept similar to the STL Container concept. A Collection provides iterators for accessing a range of elements and provides information about the number of elements in the Collection. However, a Collection has fewer requirements than a Container. The motivation for the Collection concept is that there are many useful Container-like types that do not meet the full requirements of Container, and many algorithms that can be written with this reduced set of requirements. To summarize the reduction in requirements: • It is not required to “own” its elements: the lifetime of an element in a Collection does not have to match the lifetime of the Collection object, though the lifetime of the element should cover the lifetime of the Collection object. • The semantics of copying a Collection object is not defined (it could be a deep or shallow copy or not even support copying). • The associated reference type of a Collection does not have to be a real C++ reference. Because of the reduced requirements, some care must be taken when writing code that is meant to be generic for all Collection types. In particular, a Collection object should be passed by-reference since assumptions can not be made about the behaviour of the copy constructor. Associated types • Value type X::value type The type of the object stored in a Collection. If the Collection is mutable then the value type must be Assignable. Otherwise the value type must be CopyConstructible. • Iterator type X::iterator The type of iterator used to iterate through a Collection’s elements. The iterator’s value type is expected to be the Collection’s value type. A conversion from the iterator type to the const iterator type must exist. The iterator type must be an InputIterator. • Const iterator type X::const iterator A type of iterator that may be used to examine, but not to modify, a Collection’s elements. 58 CHAPTER 7. CONCEPTS • Reference type X::reference A type that behaves like a reference to the Collection’s value type. 4 • Const reference type X::const reference A type that behaves like a const reference to the Collection’s value type. • Pointer type X::pointer A type that behaves as a pointer to the Collection’s value type. • Distance type X::difference type A signed integral type used to represent the distance between two of the Collection’s iterators. This type must be the same as the iterator’s distance type. • Size type X::size type An unsigned integral type that can represent any nonnegative value of the Collection’s distance type. Notation X a, b T A type that is a model of Collection. Object of type X. The value type of X. Valid expressions • Beginning of range a.begin() Return Type: Semantics: Postcondition: iterator if a is mutable, const iterator otherwise Returns an iterator pointing to the first element in the Collection. a.begin() is either dereferenceable or past-the-end. It is past-the-end if and only if a.size() == 0. 4 The reference type does not have to be a real C++ reference. The requirements of the reference type depend on the context within which the Collection is being used. Specifically it depends on the requirements the context places on the value type of the Collection. The reference type of the Collection must meet the same requirements as the value type. In addition, the reference objects must be equivalent to the value type objects in the Collection (which is trivially true if they are the same object). Also, in a mutable Collection, an assignment to the reference object must result in an assignment to the object in the Collection (again, which is trivially true if they are the same object, but non-trivial if the reference type is a proxy class). 7.2. COLLECTION CONCEPTS 59 • End of range a.end() Return Type: Semantics: Postcondition: iterator if a is mutable, const iterator otherwise Returns an iterator pointing one past the last element in the Collection. a.end() is past-the-end. • Size a.size() Return Type: Semantics Poscondition: size type Returns the size of the Collection, i.e., the number of elements. a.size() >= 0 • Maximum size a.max size() Return Type: Semantics: Postcondition: size type Returns the largest size that this Collection can ever have. a.max size() >= 0 && a.max size() >= a.size() • Empty Collection a.empty() Return Type: Semantics: Convertible to bool Equivalent to a.size() == 0. (But possibly faster.) • Swap a.swap(b) Return Type: Semantics: void Equivalent to swap(a,b) Complexity guarantees begin() and end() are amortized constant time. size() is at most linear in the Collection’s size. empty() is amortized constant time. swap() is at most linear in the size of the two collections. Invariants • Valid range For any Collection a, [a.begin(), a.end()) is a valid range. • Range size a.size() is equal to the distance from a.begin() to a.end(). • Completeness An algorithm that iterates through the range [a.begin(), a.end()) will pass through every element of a. 60 CHAPTER 7. CONCEPTS Models • boost::array • boost::array ptr • std::vector • mtl::vector ::type • mtl::matrix traits ::OneD where Matrix is some type that models the MTL Matrix concept. 7.2.2 ForwardCollection The elements are arranged in some order that does not change spontaneously from one iteration to the next. As a result, a ForwardCollection is EqualityComparable and LessThanComparable. In addition, the iterator type of a ForwardCollection is a MultiPassInputIterator which is just an InputIterator with the added requirements that the iterator can be used to make multiple passes through a range, and that if it1 == it2 and it1 is dereferenceable then ++it1 == ++it2. The ForwardCollection also has a front() method. Refinement of Collection Valid Expressions • Front a.front() Return Type: Semantics: 7.2.3 reference if a is mutable, const reference otherwise. Equivalent to *(a.first()). ReversibleCollection The container provides access to iterators that traverse in both directions (forward and reverse). The iterator type must meet all of the requirements of BidirectionalIterator except that the reference type does not have to be a real C++ reference. The ReversibleCollection adds the following requirements to those of ForwardCollection. Refinement of ForwardCollection 7.2. COLLECTION CONCEPTS 61 Valid Expressions • Beginning of range a.rbegin() Return Type: Semantics: reverse iterator if a is mutable, const reverseiterator otherwise. Equivalent to reverse iterator(a.end()). • End of range a.rend() Return Type: Semantics: reverse iterator if a is mutable, const reverseiterator otherwise. Equivalent to X::reverse iterator(a.begin()). • Back a.back() 7.2.4 Return Type: reference if a is mutable, ¡br¿ const reference other- Semantics: wise. Equivalent to *(--a.end()). SequentialCollection Refinement of ReversibleCollection. The elements are arranged in a strict linear order. No extra methods are required. 7.2.5 RandomAccessCollection The iterators of a RandomAccessCollection satisfy all of the requirements of RandomAccessIterator except that the reference type does not have to be a real C++ reference. In addition, a RandomAccessCollection provides an element access operator. Refinement of SequentialCollection Valid Expressions • Element Access a[n] Return Type: Semantics: Precondition: reference if a is mutable, const reference otherwise. Returns the nth element of the collection. n must be convertible to size type and 0 <= n && n < a.size(). 62 CHAPTER 7. CONCEPTS 7.3 Iterator Concepts 7.3.1 IndexedIterator Refinement of ForwardIterator Notation is a type that is a model of IndexedIterator. is an object of type X. X i Associated Types • Size Type X::size type The return type of the index() member function. Valid Expressions • Element Index i.index() Return Type: Semantics: 7.3.2 X::size type Returns the index associated with the element currently pointed to by the iterator i. MatrixIterator Refinement of IndexedIterator Notation X i is a type that is a model of MatrixIterator. is an object of type X. Valid Expressions • Row Index i.row() Return Type: Semantics: difference type Returns the row index associated with the element currently pointed to by the iterator i. • Column Index i.column() Return Type: Semantics: difference type Returns the column index associated with the element currently pointed to by the iterator i. 7.3. ITERATOR CONCEPTS 7.3.3 63 IndexValuePairIterator Valid Expressions • Index index(*i) Return Type: Semantics: difference type Returns the index associated with the element currently pointed to by the iterator i. • Value value(*i) Return Type: Semantics: value type Returns the value associated with the element currently pointed to by the iterator i. 64 CHAPTER 7. CONCEPTS 7.4 Vector Concepts 7.4.1 BasicVector A BasicVector provides a mapping from a set of indices to the associated elements. The indices do not have to form a contiguous range though the indices must fall between zero and size(x). An access to x[i] where i >= size(x) is considered out-of-bounds. We add a few useful traits such as whether the vector is sparse or dense and if the vector has static size, what that size is. Associated Types • Linear Algebra Category linalg category ::type For vectors this is vector tag unless the vector is also a matrix (such as a row or column) in which case this is matrix tag. • Value Type vector traits ::value type • Reference Type vector traits ::reference • Const Reference Type vector traits ::const reference • Pointer Type vector traits ::pointer • Const Pointer Type vector traits ::const pointer • Size Type vector traits ::size type • Sparsity vector traits ::sparsity A Vector can be either sparse (sparse tag or dense (dense tag). • Static Size vector traits ::static size If the vector has static size (size determined at compile-time) then static size gives the length of the vector. Otherwise static size has the value dynamic sized. Notation X x i is a type that is a model of BasicVector. is an object of type X. is an object of type vector traits ::size type. 7.4. VECTOR CONCEPTS 65 Valid expressions • Element Access x[i] Return Type: Semantics: vector traits ::reference if x is mutable, vector traits ::const reference otherwise returns the element with index i. • Size size(x) Return Type: Semantics: vector traits ::size type Returns the extent of the index set for the vector. For sparse vectors size(x) will typically be much larger the number of stored elements. Complexity Guarantees Unlike RandomAccessContainer, the BasicVector concept does not guarantee amortized constant time for element access (operator[]) since that would rule out sparse vectors. Element access is only guaranteed to be linear time in the number of non-zeroes in the vector. The most efficient method and also the preferred method for accessing elements of sparse vectors is to use an iterator which is introduced in the Vector concept. Models • std::valarray • std::vector • mtl::vector ::type Constraints Checking Class template struct BasicVector { typedef typename linalg_category ::type category; typedef typename vector_traits ::sparsity sparsity; enum { CATEGORY = linalg_category ::RET, SPARSITY = vector_traits ::SPARSITY, SIZE = vector_traits ::SIZE }; typedef typename vector_traits ::size_type size_type; typedef typename vector_traits ::value_type value_type; typedef typename vector_traits ::reference reference; typedef typename vector_traits ::const_reference const_reference; typedef typename vector_traits ::pointer pointer; typedef typename vector_traits ::const_pointer const_pointer; void constraints() { 66 CHAPTER 7. CONCEPTS reference r = x[n]; const_constraints(x); } void const_constraints(const X& x) { const_reference r = x[n]; n = size(x); } X x; size_type n; }; 7.4.2 Vector This describes the MTL Vector concept, which is not to be confused with the std::vector class or the family of mtl::vector ::type classes. All of the vector classes in MTL model this Vector concept. That is, they fulfill the requirements (member functions and associated types) described here. The MTL Vector concept is a refinement of ForwardCollection (not Container), and BasicVector which adds the property that each element of a Vector has a unique corresponding index. The index can be used to access the corresponding element through the vector’s bracket operator (i.e., x[i]). The elements do not have to be sorted by their index, and the indices do not necessarily have to start at zero (though they often are sorted and start at zero). Refinement of ForwardCollection and BasicVector Invariants The invariant x[i] == *(x.begin() + i) that applies to RandomAccessContainer does not apply to Vector, since the x[i] is defined for Vector to return the element with the ith index with is not required to be the element at position i of the container. Models • mtl::vector ::type Constraints Checking Class template struct Vector { CLASS_REQUIRES(X, ForwardCollection); CLASS_REQUIRES(X, BasicVector); }; 7.4. VECTOR CONCEPTS 7.4.3 67 SparseVector The SparseVector concept adds several operations that are typically needed for sparse vectors. The number of non-zeroes (number of stored elements) in the vector can be accessed with the nnz(x) expression. The index of an element in the sparse vector can be obtained from the iterator, using the index() member function. For example, to print out the elements of the vector as index-value pairs one can write this: template void print_sparse_vector(const SparseVector& x) { typename SparseVector::const_iterator i; for (i = x.begin(); i != x.end(); ++i) cout << "(" << i.index() << "," << *i << ") "; } If one wishes to examine only the indices of the sparse vector, the nz struct(x) function can be used to obtain a view of the element indices. template void print_vector_indices(const SparseVector& x) { typedef typename nonzero_structure ::type NzStruct; typename NzStruct::const_iterator i; NzStruct x_nz = nz_struct(x); for (i = x_nz.begin(); i != x_nz.end(); ++i) cout << *i << " "; } Constraints Checking Class template struct SparseVector { CLASS_REQUIRES(X, Vector); typedef typename X::iterator iterator; typedef typename X::const_iterator const_iterator; CLASS_REQUIRES(iterator, IndexedIterator); CLASS_REQUIRES(const_iterator, IndexedIterator); typedef typename nonzero_structure ::type NonZeroStruct; CLASS_REQUIRES(NonZeroStruct, Collection); typedef typename X::size_type size_type; void constraints() { n = nnz(x); z = nonzero_structure(x); } X x; size_type n; NonZeroStruct z; }; 68 CHAPTER 7. CONCEPTS Refinement of Vector Associated Types • Iterator X::iterator The iterator must be a model of IndexedIterator. • Non-Zero Structure Type nonzero structure ::type Valid expressions • Number of Non-Zeroes nnz(x) Return Type: Semantics: X::size type returns the number of stored elements in the vector. Note that if an element stored in the vector happens to be zero it is still counted towards the nnz. For dense vectors, nnz(x) == size(x). For sparse vectors nnz(x) is typically much smaller than size(x). • Non-Zero Structure Access nz struct(x) Return Type: Semantics: nz struct ::type Returns a Collection that consists of the indices of the elements in the vector x. The size() of the collection is nnz(x). Invariants x[i] == *iter if and only if iter.index() == i. Complexity Guarantees The nnz() an nz struct() functions are amortized constant time. Models • mtl::vector >::type • mtl::vector >::type • mtl::vector >::type 7.4. VECTOR CONCEPTS 7.4.4 69 SubdividableVector Constraints Checking Class template struct SubdividableVector { CLASS_REQUIRES(X, Vector); void constraints() { sub_x = x[range(s,f)]; } typedef typename X::size_type size_type; size_type s, f; X x; typename subvector ::type sub_x; }; Refinement of Vector Associated Types • Sub-Vector Type subvector ::type The type used to represent sub-vector “views” of the vector. Notation is a type that is a model of Vector. is an object of type X. is an object of type std::pair . X x r Valid expressions • Sub-Vector Access x[r] Return Type: Semantics: subvector ::type returns a sub-vector “view” of a. The region is defined by the half-open interval [r.first,r.second). That is, the region includes x[r.first] but not x[r.second]. The range() funciton can be used for conveniently creating the r argument from a pair of indices. Complexity Guarantees • The sub-range access is only guaranteed to be linear time (but is typically constant time for dense vectors). 70 CHAPTER 7. CONCEPTS 7.4.5 ResizableVector Constraints Checking Class template struct ResizableVector { CLASS_REQUIRES(X, Vector); typedef typename X::size_type size_type; void constraints() { x.resize(n); } X x; size_type n; }; Refinement of Vector Valid expressions • Resize Vector x.resize(n) Return Type: Semantics: void Changes the size of the vector to n. New elements will be initialized to zero. Complexity Guarantees • The resize function is guaranteed to be linear in the size of the vector. 7.5. MATRIX CONCEPTS 7.5 71 Matrix Concepts SubdivideableMatrix TriangularMatrix StrideableMatrix BandedMatrix Matrix BasicMatrix VectorMatrix Matrix1D FastDiagMatrix ResizeableMatrix Figure 7.2: Refinement of the matrix concepts. 7.5.1 BasicMatrix The BasicMatrix concept defines the associated types and operations that are common to all MTL matrix types. A matrix consists of elements, each of which has a row and column index. The expression A(i,j) returns the element with row index i and column index j. For most matrix algorithms, one needs to traverse through all the of elements of a matrix. One could use A(i,j) to do this, but for some matrix types this is inefficient (e.g., sparse matrices) or can be confusing (e.g., banded matrices). The iterators provided by the Matrix and Matrix1D concepts provide a better method for efficiently traversing a matrix, and the SubdividableMatrix concept defines a nice interface for accessing slices and sub-matrices. There are several traits classes used with the BasicMatrix concept: linalg traits, and matrix traits. The linalg traits class is shared with the MTL vector and scalar concepts. For each of the tags defined in these traits classes there is also a numerical constant defined, for example matrix traits ::shape and matrix traits ::SHAPE. Both the tag and the constant are provided because the type tag is needed to implement external polymorphism (dispatching based to tag categories), while the constant is more convenient to use in compile-time (template meta-programming) logic. Associated Types The matrix traits class provides access to the associated types of a BasicMatrix, though the linalg traits class can also be used for some of the associated types. • Category linalg category ::type and linalg category ::RET For matrices this is always matrix tag and MATRIX. 72 CHAPTER 7. CONCEPTS • Element Type matrix traits ::element type The type of the elements contained in the matrix. The choice of naming this element type over value type is to help differentiate the element type from the 1-D type which for most MTL matrices is given by X::value type (as they are a 2-D Collection). • Reference matrix traits ::reference The mutable reference type associated with the element type. • Const Reference matrix traits ::const reference The constant (immutable) reference type associated with the element type. • Size Type matrix traits ::size type The type used for expressing matrix indices and dimensions. • Sparsity matrix traits ::sparsity Access whether the matrix is dense or sparse (dense tag orsparse tag ). • Dimension matrix traits ::dimension Access the dimension of the matrix (oned tag or twod tag). • Static Number of Rows matrix traits ::static nrows If the matrix is static (size determined at compile-time) then static nrows gives the number of rows in the matrix. Otherwise static nrows is dynamic sized. • Static Number of Columns matrix traits ::static ncols If the matrix is static (size determined at compile-time) then static ncols gives the number of rows in the matrix. Otherwise static ncols is dynamic sized. • Shape matrix traits ::shape The general layout of where the non-zeroes appear in the matrix. The possible types are rectangle tag, banded tag, triangle tag, symmetric tag, hermitian tag, or diagonal tag. • Orientation matrix traits ::orientation The natural order of traversal of the matrix, either row tag, column tag, or diagonal tag, or no orientation tag. 7.5. MATRIX CONCEPTS 73 Notation X A i,j is a type that is a model of BasicMatrix. is an object of type X. are objects of type matrix traits ::size type. Valid Expressions • Element Access A(i,j) Return Type: Semantics: reference if A is mutable, const reference otherwise Returns the element at row i and column j. • Number of Rows nrows(A) Return Type: size type • Number of Columns ncols(A) Return Type: size type • Number of Non-Zeroes nnz(A) Return Type: Semantics: size type Returns the number of stored elements in the matrix. Note that if an element that is stored in the matrix happens to be zero it is still counted towards the nnz. For dense matrices nnz(A) == nrows(A) * ncols(A). For sparse and banded matrices nnz(A) is typically much small than nrows(A) * ncols(A). Complexity Guarantees • Element access is only guaranteed to be O(mn). The time complexity for this operation varies widely from matrix type to matrix type. For dense matrices it is constant, while for coordinate scheme sparse matrices it is on averange mn/2. See the documentation for each matrix type for the specific time complexity. • The nrows(), ncols(), and nnz() members are all constant time. 7.5.2 VectorMatrix A VectorMatrix is basically a vector that is also a matrix. Examples of this include a row or column section of a matrix or a free-standing vector which is being used to represent a row matrix or column matrix (a matrix consisting of a single row or column). MTL vectors are by default considered to be a column matrix and have column tag for their orientation. For the most part, the properties of the VectorMatrix derive from being a Vector, though the VectorMatrix 74 CHAPTER 7. CONCEPTS also fulfills the requirements for BasicMatrix. The iterator and const iterator types of the VectorMatrix must model MatrixIterator. Refinement of Vector and BasicMatrix Associated Types VectorMatrix inherits its associated types from Vector and BasicMatrix. Valid Expressions • Transposed View of the Vector trans(x) Return Type: Semantics: Complexity: transpose view ::type Creates a transposed view of the vector. If the vector was a row it is now a column and vice-versa. Constant time. Models • matrix ::type::value type (a row of a dense matrix) • vector