The Matrix Template Library
User Manual
Jeremy G. Siek, Andrew Lumsdaine and Lie-Quan Lee
February 2, 2004


Contents

I   Introduction to Generic Programming

1 Traits Classes
  1.1 Typedefs Nested in Classes
  1.2 Template Specialization
  1.3 Definition of a Traits Class
  1.4 Partial Specialization
  1.5 External Polymorphism and Tags

2 Concepts and Models
  2.1 Requirements
  2.2 Example: InputIterator
  2.3 Concepts vs. Abstract Base Classes
  2.4 Multi-type Concepts and Modules
  2.5 Concept Checking

II  Introduction to Numerical Linear Algebra

III Tutorials

3 Gaussian Elimination
4 Pointwise LU Factorization
5 Blocked LU Factorization
6 Preconditioned GMRES(m)

IV  Reference Manual

7 Concepts
  7.1 Algebraic Concepts
    7.1.1 AbelianGroup
    7.1.2 Ring
    7.1.3 Field
    7.1.4 R-Module
    7.1.5 VectorSpace
    7.1.6 FiniteVectorSpace, FiniteBanachSpace, FiniteHilbertSpace
    7.1.7 BanachSpace
    7.1.8 HilbertSpace
    7.1.9 LinearOperator
    7.1.10 FiniteLinearOperator
    7.1.11 TransposableLinearOperator
    7.1.12 LinearAlgebra
  7.2 Collection Concepts
    7.2.1 Collection
    7.2.2 ForwardCollection
    7.2.3 ReversibleCollection
    7.2.4 SequentialCollection
    7.2.5 RandomAccessCollection
  7.3 Iterator Concepts
    7.3.1 IndexedIterator
    7.3.2 MatrixIterator
    7.3.3 IndexValuePairIterator
  7.4 Vector Concepts
    7.4.1 BasicVector
    7.4.2 Vector
    7.4.3 SparseVector
    7.4.4 SubdividableVector
    7.4.5 ResizableVector
  7.5 Matrix Concepts
    7.5.1 BasicMatrix
    7.5.2 VectorMatrix
    7.5.3 Matrix1D
    7.5.4 Matrix
    7.5.5 BandedMatrix
    7.5.6 TriangularMatrix
    7.5.7 StrideableMatrix
    7.5.8 SubdividableMatrix
    7.5.9 ResizeableMatrix
    7.5.10 FastDiagMatrix

8 Arithmetic Types and Classes

9 Object Model and Memory Management
  9.1 Object Model
  9.2 Memory Management

10 Vector Classes: vector::type
  10.1 Dense Vectors
  10.2 Sparse Vectors

11 Matrix Classes: matrix::type
  11.0.6 Shape Selectors
  11.0.7 Storage Selectors
  11.0.8 Shape and Storage Combinations
  11.1 Dense Matrices
  11.2 Sparse Matrices
  11.3 Banded Matrices
  11.4 Triangular Matrices
  11.5 Symmetric Matrices
  11.6 Hermitian Matrices
  11.7 Diagonal Matrices

12 Adaptors and Helper Functions

13 Overloaded Operators

14 Vector Operations
  14.1 Vector Data Movement Operations
    14.1.1 set
    14.1.2 copy
    14.1.3 swap
    14.1.4 gather
    14.1.5 scatter
  14.2 Vector Reduction Operations
    14.2.1 dot (inner product)
    14.2.2 one_norm
    14.2.3 two_norm (Euclidean norm)
    14.2.4 infinity_norm
    14.2.5 sum
    14.2.6 sum_squares
    14.2.7 max
    14.2.8 min
    14.2.9 max_index
    14.2.10 max_with_index
    14.2.11 min_index
  14.3 Vector Arithmetic Operations
    14.3.1 scale
    14.3.2 add
    14.3.3 iadd
    14.3.4 ele_mult
    14.3.5 ele_div

15 Matrix Operations
  15.1 Matrix Data Movement Operations
    15.1.1 set
    15.1.2 copy
    15.1.3 swap
    15.1.4 transpose
  15.2 Matrix Norms
    15.2.1 one_norm
    15.2.2 frobenius_norm
    15.2.3 infinity_norm
  15.3 Element-wise Arithmetic Operations
    15.3.1 scale
    15.3.2 add
    15.3.3 iadd
    15.3.4 ele_mult
    15.3.5 ele_div
  15.4 Rank Updates (Outer Products)
    15.4.1 rank_one_update
    15.4.2 rank_one_conj
    15.4.3 rank_two_update
    15.4.4 rank_two_conj
  15.5 Triangular Solves (Forward & Backward Substitution)
    15.5.1 tri_solve
  15.6 Matrix-Vector Multiplication
    15.6.1 mult
    15.6.2 mult_add
  15.7 Matrix-Matrix Multiplication
    15.7.1 mult
    15.7.2 mult_add
    15.7.3 lrdiag_mult
    15.7.4 lrdiag_mult_add
    15.7.5 babt_mult
  15.8 Miscellaneous Matrix Operations
    15.8.1 trace

16 Miscellaneous Operations
  16.1 Basic Transformations
    16.1.1 givens
    16.1.2 givens_apply
    16.1.3 house
    16.1.4 house_apply_left
    16.1.5 house_apply_right

A Concept Checks for STL
  A.1 STL Basic Concept Checks
    A.1.1 Assignable
    A.1.2 DefaultConstructible
    A.1.3 CopyConstructible
    A.1.4 EqualityComparable
    A.1.5 LessThanComparable
    A.1.6 Generator
    A.1.7 UnaryFunction
    A.1.8 BinaryFunction
    A.1.9 UnaryPredicate
    A.1.10 BinaryPredicate
  A.2 STL Iterator Concept Checks
    A.2.1 TrivialIterator
    A.2.2 Mutable-TrivialIterator
    A.2.3 InputIterator
    A.2.4 OutputIterator
    A.2.5 ForwardIterator
    A.2.6 Mutable-ForwardIterator
    A.2.7 BidirectionalIterator
    A.2.8 Mutable-BidirectionalIterator
    A.2.9 RandomAccessIterator
    A.2.10 Mutable-RandomAccessIterator

Bibliography

List of Tables

11.1 Valid Shape and Storage Combinations.

List of Figures

4.1  LU factorization pseudo-code.
4.2  Diagram for LU factorization.
4.3  Complete MTL version of pointwise LU factorization.
5.1  Pointwise step in block LU factorization.
5.2  Update steps in block LU factorization.
5.3  MTL version of block LU factorization.
6.1  The Iterative Template Library (ITL) implementation of the preconditioned GMRES(m) algorithm. This algorithm computes an approximate solution to Ax = b preconditioned with M. The restart value is specified by the parameter m.
7.1  Refinement of the algebraic concepts.
7.2  Refinement of the matrix concepts.
10.1 The Compressed Sparse Vector Format.
10.2 The Sparse Pair Vector Format.
11.1 Example of a banded matrix with bandwidth (1,2).
11.2 Example of a symmetric matrix with bandwidth (2,2).
11.3 Example of the dense matrix storage format.
11.4 Example of the banded matrix storage format.
11.5 Example of the packed matrix storage format.
11.6 Example of the compressed column matrix storage format.
11.7 Example of the array matrix storage format with dense and with sparse pair OneD storage types.
11.8 Example of the envelope matrix storage format.

Part I

Introduction to Generic Programming

Chapter 1

Traits Classes
One of the most important techniques used in generic programming is the traits class, which was introduced by Nathan Myers in XX. The traits class technique may seem strange and somewhat daunting when first encountered (granted, the syntax looks a bit strange), but the essence of the idea is simple, and it is essential to learn how to use traits classes, for they appear over and over again in generic libraries such as the STL and are also used heavily here in the MTL. Here we give a short motivation and tutorial for traits classes via an extended example.

A traits class is basically a way of finding out information about a type that you otherwise would not know anything about. For instance, suppose I want to write a generic sum() function:

template <class Vector>
X sum(const Vector& v, int n)
{
  X total;
  for (int i = 0; i < n; ++i)
    total += v[i];
  return total;
}

From the point of view of this template function, not much is known about the template type Vector. For instance, I don't know what kind of elements are inside the vector. But I need to know this in order to declare the local variable total, which should be the same type as the elements of Vector (the X there right now is just a bogus placeholder that needs to be replaced by something else!).

1.1 Typedefs Nested in Classes

One way to access information out of a type is to use the scope operator "::" to get at typedefs that are nested inside the class. Imagine I've created a vector class that looks like this:
class my_vector {
public:
typedef double value_type; // the type for elements in the vector
double& operator[](int i) { ... };
...
};

Now I can access the type of the elements by writing my_vector::value_type. Here's the generic sum() function again, with the X's replaced:

template <class Vector>
typename Vector::value_type sum(const Vector& v, int n)
{
typename Vector::value_type total = 0;
for (int i = 0; i < n; ++i)
total += v[i];
return total;
}

The use of the keyword typename deserves some explaining. Due to some quirks in C++, nested typedefs and nested members can cause ambiguity from the compiler's point of view: when parsing a template function, the compiler may not know whether the thing on the right-hand side of the scope operator is a type or an object. The typename keyword is used to clear up this ambiguity. The rule of thumb is: whenever you use the scope operator to access a nested typedef, and the type on the left-hand side of the scope operator somehow depends on a template argument, use the typename keyword. If the type on the left-hand side does not depend on a template argument, then do not use typename. In the above sum() function we use typename since the left-hand side, Vector, is a template argument. In contrast, typename is not used below, since the type std::vector<double> does not depend on any template arguments (it is not even in a template function!).

#include <vector>

int main(int, char*[]) {
  std::vector<double>::value_type x;
  return 0;
}

Getting back to the sum() function, the technique of using a nested typedef works as long as Vector is a class type that has such a nested typedef. But what if I want to use the generic sum() function with a builtin type such as double*, which couldn't possibly have any typedefs? Or what if we want to use sum() with a vector class from a third party who didn't provide the necessary typedef? The operator[] works with double* and our imaginary third-party vector, so it would be a shame to miss out on a chance for reuse just because of the issue of accessing the value type. The code below shows the situation we want to make possible, where sum() can be reused with types such as double*.
double* x = ...;
int n = ...;
sum(x, n);

1.2 Template Specialization

The solution to this problem is the traits class technique. To understand how traits classes work, we first need to review some facts about template classes and something called specialization. We start with a simple (though perhaps gratuitous) example of a template class:
template <class T>
class hello_world {
public:
  void say_hi() { cout << "Hello world!" << endl; }
};

One interesting thing about C++ templates is that they can be specialized. That is, a special version of the class can be explicitly created for a particular case of the template arguments (in this case T). So we can write this:

template <>
class hello_world<long> {
public:
  void say_hi() { cout << "I'm special!" << endl; }
};

template <>
class hello_world<double*> {
public:
  void say_hi() { cout << "I'm special too!" << endl; }
};

Now can you guess what happens when I do the following?

hello_world<char> h1;
hello_world<int> h2;
hello_world<long> h3;
hello_world<double*> h4;
h1.say_hi();
h2.say_hi();
h3.say_hi();
h4.say_hi();

Here’s what the output will look like:
Hello world!
Hello world!
I’m special!
I’m special too!

In this example, template specialization allowed us to pick different versions of the say_hi() member function based on some type (char, int, long, double*). The specializations of hello_world created a mapping between a type and a version of say_hi(). The original templated version of hello_world acted as a default when none of the specializations covered the type.


1.3 Definition of a Traits Class

In general, we can use specialization to create mappings from types to other nested types, member functions, or constants. It might be helpful to think of the template class as a function that executes at compile time, whose input is the template parameters and whose output is the things nested in the class. A traits class is a class whose sole purpose is to define such a mapping.

Getting back to the generic sum() function, here's an example traits class, templated on the Vector type, that allows us to get the element type, or value_type, of the vector. For the default case, we'll assume the vector is a class with a nested typedef like my_vector:

template <class Vec>
struct vector_traits {
  typedef typename Vec::value_type value_type;
};

But now we can also handle the case when the Vector template argument is something else, like a double*:

template <>
struct vector_traits<double*> {
  typedef double value_type;
};

or even some third-party class, say johns_int_vector:

template <>
struct vector_traits<johns_int_vector> {
  typedef int value_type;
};

1.4 Partial Specialization

Now one might get bored of writing a traits class for every pointer type, or perhaps the third-party class is templated. The solution to this is the long-winded term partial specialization. Here's what you can write:

template <class T>
struct vector_traits<T*> {
  typedef T value_type;
};

template <class T>
struct vector_traits< johns_vector<T> > {
  typedef T value_type;
};

Your C++ compiler will attempt a pattern match between the template argument provided at the "call" to the traits class and all the specializations defined, picking the specialization that is the closest match. The above partial specialization for T* will match whenever the type is a pointer, though the previous complete specialization for double* would match first for that particular pointer type.
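For instance, assuming the primary template and the specializations defined above are in scope, the following sketch shows which version the compiler selects for each argument type:

int main()
{
  vector_traits<my_vector>::value_type a = 0;  // double, via the nested typedef
  vector_traits<double*>::value_type   b = 0;  // double, via the complete specialization
  vector_traits<int*>::value_type      c = 0;  // int, via the partial specialization for T*
  a = b + c;  // just to use the variables
  return 0;
}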
The most well-known use of a traits class is the iterator_traits class used in the Standard Template Library, which provides access to things like the value_type and iterator_category that are associated with an Iterator. MTL also uses traits classes, such as matrix_traits. Typically a traits class is used with a particular concept or family of concepts: the iterator_traits class is used with the family of Iterator concepts, and the matrix_traits class is used with the family of MTL Matrix concepts. The traits class is what provides access to the associated types of a concept. We will explain concepts, associated types, etc. in detail in Chapter 2.

1.5 External Polymorphism and Tags

A technique that often goes hand in hand with traits classes is external polymorphism, which is a way of using function overloading to dispatch based on properties of a type. A good example of this is the implementation of the std::advance() function in the STL, which increments an iterator n times. Depending on the kind of iterator, different optimizations can be applied in the implementation. If the iterator is random access (it can jump forward and backward arbitrary distances), then advance() can simply be implemented with i += n and is very efficient: constant time. If the iterator is bidirectional, then it makes sense for n to be negative, and we can decrement the iterator n times. The relation between external polymorphism and traits classes is that the property to be exploited (in this case the iterator category) is accessed through a traits class. The main advance() function uses the iterator_traits class to get the iterator category. It then makes a call to the overloaded __advance() function. The appropriate __advance() is selected by the compiler based on whatever type the iterator category resolves to: either input_iterator_tag, bidirectional_iterator_tag, or random_access_iterator_tag. A tag is simply a class whose only purpose is to convey some property for use in external polymorphism.
struct input_iterator_tag { };
struct bidirectional_iterator_tag { };
struct random_access_iterator_tag { };

template <class InputIterator, class Distance>
void __advance(InputIterator& i, Distance n, input_iterator_tag) {
  while (n--) ++i;
}

template <class BidirectionalIterator, class Distance>
void __advance(BidirectionalIterator& i, Distance n,
               bidirectional_iterator_tag) {
  if (n >= 0)
    while (n--) ++i;
  else
    while (n++) --i;
}

template <class RandomAccessIterator, class Distance>
void __advance(RandomAccessIterator& i, Distance n,
               random_access_iterator_tag) {
  i += n;
}

template <class InputIterator, class Distance>
void advance(InputIterator& i, Distance n) {
  typedef typename iterator_traits<InputIterator>::iterator_category Cat;
  __advance(i, n, Cat());
}

Chapter 2

Concepts and Models
Here we define the basic terminology of generic programming, much of which
was introduced in [1]. In the context of generic programming, the term concept
is used to describe the collection of requirements that a template argument
must meet for the template function or templated class to compile and operate
properly. The requirements are described as a set of valid expressions, associated
types, invariants, and complexity guarantees. A type that meets the set of
requirements is said to model the concept.

2.1 Requirements

Valid Expressions are C++ expressions which must compile successfully for the objects involved in the expression to be considered models of the concept.

Associated Types are types that are related to the modelling type in one of two ways. They are either provided by typedefs nested within the class definition for the type, or they are accessed through a traits class, such as iterator_traits.

Invariants are typically run-time characteristics of the objects that must always be true; that is, the functions involving the objects must preserve these characteristics.

Complexity Guarantees are maximum limits on how long the execution of one of the valid expressions will take, or how much of various resources its computation will use.

2.2 Example: InputIterator

Examples of concept definitions can be found in the C++ Standard, many of which deal with the requirements for iterators. The InputIterator concept is one of these. (We always use a bold sans serif font for concept names.) The following expressions are valid if the object i is an instance of some type that models InputIterator.

++i
i++
*i

The std::iterator_traits class provides access to the associated types of an iterator type. In the following example we find out what type of object is pointed to by some iterator type (call it X) via the value_type of the traits class.

typename iterator_traits<X>::value_type t;
t = *i;

As for complexity guarantees, all of the InputIterator operations are required to be constant time. Some examples of types that satisfy the requirements for InputIterator are double*, the iterator type of std::list, and std::istream_iterator.

2.3 Concepts vs. Abstract Base Classes

In many respects a concept is similar to an abstract base class (or interface): it defines a set of types that are interchangeable from the point of view of some algorithm (or collection of algorithms). Also, much in the way abstract base classes can inherit from (extend) other base classes, a concept can refine, or add requirements to, another concept.

However, a concept is much looser than an abstract base class: there is no inheritance requirement, and the valid expressions offer more freedom than the requirement for member functions with exactly matching signatures. Dispatch based on type is not required to be via a virtual function (though it still can be). Finally, for expressions that involve multiple types, the function overload resolution can depend on both types, which avoids the binary method [3] problem associated with inheritance-based polymorphism.
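As a rough illustration of the difference, the following sketch (not MTL code; the names AbstractVector, sum_abc, and sum_concept are hypothetical) contrasts an abstract-base-class interface with a concept-style function template that uses the vector_traits class from Chapter 1:

// Abstract-base-class style: every usable type must inherit from the base
// and implement exactly these virtual signatures.
struct AbstractVector {
  virtual double get(int i) const = 0;
  virtual int size() const = 0;
  virtual ~AbstractVector() {}
};

double sum_abc(const AbstractVector& v) {
  double total = 0;
  for (int i = 0; i < v.size(); ++i)
    total += v.get(i);
  return total;
}

// Concept style: any type for which the valid expressions compile is accepted,
// including double* (via vector_traits); no inheritance, no virtual dispatch.
template <class Vector>
typename vector_traits<Vector>::value_type sum_concept(const Vector& v, int n) {
  typename vector_traits<Vector>::value_type total = 0;
  for (int i = 0; i < n; ++i)
    total += v[i];
  return total;
}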

2.4 Multi-type Concepts and Modules

Most concepts describe expressions that involve interaction between two or more types. Often one of the types is the "main" type, and the other types can be derived from the "main" type via typedefs or a traits class (they are associated types). We talk of the "main" type as the one that models the concept. This is the case with the iterators and containers of the STL. In our example above, the dereference expression *i required by InputIterator returns a second type, namely the value_type associated with the iterator type (the "main" type in this case).

However, for some concepts there is not a "main" type. There are multiple types involved, none of which can be derived from the others. We call a set of types that together model a concept a module. For example, later in this chapter we will be defining the concept of a VectorSpace, which consists of some vector type, a scalar type, and a multiplication operator between the two. Since one can multiply a complex number by a float, the module {complex, float} is a model of VectorSpace. It is also true that the module {complex, double} is a model of VectorSpace, so we see that there is not necessarily a one-to-one relationship between the vector type and the scalar type, as there was between the iterator type and value_type discussed above. In the descriptions of multi-type concepts we will list the participating types instead of associated types. No traits classes or nested typedefs are specified, as there is not a one-to-one mapping between the participating types. The participating types are typically readily available by other means.
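A small sketch may help. The generic function below is naturally specified in terms of a module {Vector, Scalar} rather than a single "main" type, since neither type determines the other (the function name and signature are hypothetical, not part of MTL):

// The requirements fall on the pair of types together: an element of Vector
// must be multipliable in place by a Scalar, so {Vector, Scalar} should model
// a VectorSpace-style concept.
template <class Vector, class Scalar>
void scale_in_place(Vector& x, int n, Scalar alpha) {
  for (int i = 0; i < n; ++i)
    x[i] *= alpha;  // the element-times-scalar expression of the module
}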

2.5 Concept Checking

The C++ language does not provide direct support for ensuring that template arguments meet the requirements demanded of them by the generic algorithm or template class. This means that if the user makes an error, the resulting compiler error will point to somewhere deep inside the implementation of the generic algorithm, giving an error that may not be easy to match with the cause. Together with the SGI STL team we have developed a method for forcing the compiler to give better error messages. The idea is to exercise all the requirements placed on the template arguments at the very beginning of the generic algorithm. We have created some macros and a methodology for how to do this.

Suppose we wish to add concept checks to the STL copy() algorithm, which has the following prototype.

template <class InIter, class OutIter>
OutIter copy(InIter first, InIter last, OutIter result);

We will need to make sure that InIter is a model of InputIterator and that OutIter is a model of OutputIterator. The first step is to create the code that will exercise the expressions associated with each of these concepts. The following is the concept checking class for OutputIterator.
template <class T>
struct OutputIterator {
  CLASS_REQUIRES(T, Assignable);
  void constraints() {
    (void)*i;     // require dereference operator
    ++i;          // require preincrement operator
    i++;          // require postincrement operator
    *i++ = *j;    // require postincrement and assignment
  }
  T i, j;
};

Once the concept checking classes are complete, one simply needs to invoke them at the beginning of the generic algorithm using our REQUIRE macro. Here's what the STL copy() algorithm looks like with the concept checks inserted.

template <class InIter, class OutIter>
OutIter copy(InIter first, InIter last, OutIter result)
{
  REQUIRE(OutIter, OutputIterator);
  REQUIRE(InIter, InputIterator);
  return copy_aux(first, last, result, VALUE_TYPE(first));
}

Looking back at the OutputIterator concept checking class, you might wonder why we used the CLASS_REQUIRES macro instead of just REQUIRE. The reason for this is that different tricks are needed to force compilation of the checking code when invoking the macro from inside a class definition instead of a function.

Sometimes there is more than one type involved in a concept. This means that the corresponding concept checking class will have more than one template argument. The VectorSpace concept checker is an example of this; it involves an AbelianGroup and a Field.

template <class G, class F>
struct VectorSpace {
  CLASS_REQUIRES2(G, F, R_Module);
  CLASS_REQUIRES(F, Field);
  void constraints() {
    y = x / a;
    y /= a;
  }
  G x, y, z;
  F a;
};

When invoking a concept checker with more than one type it is necessary to append the number of type arguments to the macro name. Here is an example of using a multi-type concept checker.

template <class Vector, class Real>
void foo(Vector& x, Real a) {
  REQUIRE2(Vector, Real, VectorSpace);
  ...
}

For the most part the user of MTL does not need to know how to create concept checks and insert them in generic algorithms; however, it is good to know what they are and how to use them. This will make the error messages easier to understand. Also, if you are unsure about whether you can use a certain MTL class with a particular algorithm, or whether some custom-made class of your own is compatible, a good first check is to invoke the appropriate concept checker. Here's a quick example program that one could write to see if the types std::complex<float> and float together satisfy the requirements for VectorSpace.

#include <complex>
#include

int main(int, char*[])
{
  using mtl::VectorSpace;
  REQUIRE2(std::complex<float>, float, VectorSpace);
  return 0;
}

In addition, if you create generic algorithms of your own then we highly
encourage you to create, publish and use concept checks.


Part II

Introduction to Numerical Linear Algebra

Part III

Tutorials

Chapter 3

Gaussian Elimination

template <class Matrix>
void gaussian_elimination(Matrix& A)
{
  typename matrix_traits<Matrix>::size_type
    m = A.nrows(), n = A.ncols(), i, k;
  typename matrix_traits<Matrix>::value_type s;
  for (k = 0; k < std::min(m-1,n-1); ++k) {
    if (A(k,k) != zero(s))
      for (i = k + 1; i < m; ++i) {
        s = A(i,k) / A(k,k);        // elimination multiplier
        A(i,all) -= s * A(k,all);   // subtract s times row k from row i
      }
  }
}

template <class Matrix, class Vector>
void gauss_elim_with_partial_pivoting(Matrix& A, Vector& pivots)
{
  typename matrix_traits<Matrix>::size_type
    m = A.nrows(), n = A.ncols(), pivot, i, k;
  typename matrix_traits<Matrix>::value_type s;
  for (k = 0; k < std::min(m-1,n-1); ++k) {
    pivot = k + max_abs_index( A(range(k,m),k) );   // row with the largest |A(i,k)|
    if (pivot != k)
      swap(A(pivot,all), A(k,all));
    pivots[k] = pivot;
    if (A(k,k) != zero(s))
      for (i = k + 1; i < m; ++i) {
        s = A(i,k) / A(k,k);
        A(i,all) -= s * A(k,all);
      }
  }
}

Chapter 4

Pointwise LU Factorization

This section shows how one could implement the usual pointwise LU factorization. The next section will describe an implementation of the blocked LU factorization. First, a quick review of LU factorization. It is basically Gaussian elimination, where a general matrix is transformed into a lower triangular matrix and an upper triangular matrix. The purpose of this transformation is to solve a system of equations, and it is easy to solve a system once it is in triangular form (using forward or backward substitution). So if we start with the equation Ax = b, using LU factorization this becomes LUx = b. We can then solve the system in two simple steps: first we solve Ly = b (where y has replaced Ux), and then we solve Ux = y. For more background on LU factorization see [6] or [10].

The algorithm for LU factorization is given in Figure 4.1, and a graphical representation of the algorithm, as it would look part way through the computation, is given in Figure 4.2. The black square represents the current pivot element. The horizontal shaded rectangle is a row from the upper triangle of the matrix. The vertical shaded rectangle is a column from the lower triangle of the matrix. The L and U labeled regions are the portions of the matrix that have already been updated by the algorithm. The algorithm has not yet reached the region labeled A′.

for i = 1 ... min(M−1, N−1)
    find the maximum element in the column section A(i+1:M, i)
    swap the row of the maximum element with row A(i, :)
    scale the column section A(i+1:M, i) by 1/A(i, i)
    let A′ = A(i+1:M, i+1:N)
    A′ ← A′ − L(:, i) U(i, :)    (rank-one update)

Figure 4.1: LU factorization pseudo-code.
Figure 4.2: Diagram for LU factorization.

We will implement the LU factorization as a template function that takes a matrix input-output argument and a vector output argument to record the pivots.
The return type is an integer that will be zero if the algorithm is successful (the matrix is non-singular); otherwise the matrix is singular, U(i, i) is zero, and i is given as the return value. The return type is the size_type which is associated with the matrix using the matrix_traits class. For most cases one could get away with using int instead, but when writing a generic algorithm it is best to use this more portable method. (Oftentimes the biggest headache in porting large codes is changing integer types; this problem can be mitigated by the correct use of traits classes and the associated size_type.)

Inside the algorithm we will use many of the matrix operations specified in the SubdividableMatrix concept, so we insert a REQUIRE clause to ensure that the user does not do something like try to call the function with a sparse matrix (which is not a model of SubdividableMatrix). In addition, we require the PVector type to really be a Vector. (The REQUIRE clauses are not strictly necessary and can be left out; their purpose is to make error messages more understandable when the user incorrectly applies a generic algorithm.)
template <class Matrix, class PVector>
typename matrix_traits<Matrix>::size_type
lu_factor(Matrix& A, PVector& pivots)
{
  REQUIRE(Matrix, SubdividableMatrix);
  REQUIRE(PVector, Vector);
  ...
}

To make the indexing as simple as possible, we can create matrix objects to make explicit the upper and lower triangular views of the original matrix A. First the types of the views must be obtained; this is done using the triangle_view traits class. We also create a typedef for the sub-matrix type, which will be used later. The U and L matrix objects are then created and the matrix A is passed to their constructors. The L and U objects are just handles, so their creation is inexpensive (constant time complexity).

typedef typename triangle_view<Matrix, unit_upper>::type Unit_Upper;
typedef typename triangle_view<Matrix, unit_lower>::type Unit_Lower;
typedef typename submatrix<Matrix>::type SubMatrix;
Unit_Upper U(A);
Unit_Lower L(A);

Next we need to declare some index variables, again using the matrix_traits class to get the correct types.

typedef typename matrix_traits<Matrix>::size_type SizeT;
typedef typename matrix_traits<Matrix>::value_type T;
SizeT i, ip, M = A.nrows(), N = A.ncols(), info = 0;

The first operation in the LU factorization is to find the maximum element in the column, which will tell us how to pivot. The expression A(range(i,M), i) returns a subsection of the ith column, from the ith row to the bottom. The range describes a half-open interval, so it does not include the element A(M,i) (which would be out of bounds). The sub-column object is a full-fledged MTL Vector and can be used with any of the MTL vector operations. The same is true for sub-rows and sub-matrices. We then apply the abs() function to create an expression object, which will apply abs() to each element of the sub-column during the evaluation of the max_index() function.

ip = max_index(abs(A(range(i,M), i)));

The next operation in the LU factorization is to swap the current row with the row that has the maximum element.

if (ip != i)
  swap(A(i,all), A(ip,all));

The third operation in the LU factorization is to scale the column under the pivot by 1/A(i, i). The use of the identity() function needs some explaining. Since we are writing a generic algorithm, we do not know the element type of the matrix, and therefore cannot be sure that an integer constant 1 is convertible to the element type. The solution is to use the generic identity() function provided by MTL, which returns a 1 of the appropriate type.

L(all,i) *= identity(T()) / A(i,i);

The final operation in the LU factorization is to update the trailing submatrix according to A′ ← A′ − L(:, i)U(i, :).

SubMatrix Aprime = A(range(i+1, M), range(i+1, N));
Aprime -= L(all,i) * U(i,all);
The complete LU factorization implementation is given in Figure 4.3.


template <class Matrix, class PivotVector>
typename matrix_traits<Matrix>::size_type
lu_factor(Matrix& A, PivotVector& pivots)
{
  typedef typename triangle_view<Matrix, unit_upper>::type Unit_Upper;
  typedef typename triangle_view<Matrix, unit_lower>::type Unit_Lower;
  typedef typename submatrix<Matrix>::type SubMatrix;
  typedef typename matrix_traits<Matrix>::size_type SizeT;
  typedef typename matrix_traits<Matrix>::value_type T;
  REQUIRE(Matrix, SubdividableMatrix);
  REQUIRE(Unit_Lower, SubdividableMatrix);
  REQUIRE2(Unit_Lower, T, VectorSpace);
  REQUIRE(SubMatrix, Ring);
  REQUIRE(PivotVector, Vector);
  Unit_Upper U(A);
  Unit_Lower L(A);
  int info = 0;
  SizeT i, ip, M = A.nrows(), N = A.ncols();
  for (i = 0; i < min(M - 1, N - 1); ++i) {
    ip = max_index(abs(A(range(i,M), i)));  /* find the pivot */
    pivots[i] = ip + 1;
    if ( A(ip, i) != zero(T()) ) {          /* make sure the pivot isn't zero */
      if (ip != i)
        swap(A(i,all), A(ip,all));          /* swap rows i and ip */
      L(all,i) *= identity(T()) / A(i,i);   /* update the column under the pivot */
    } else {
      info = i + 1;
      break;
    }
    SubMatrix Aprime = A(range(i+1, M), range(i+1, N));
    Aprime -= L(all,i) * U(i,all);          /* update the trailing submatrix */
  }
  pivots[i] = i + 1;
  return info;
}

Figure 4.3: Complete MTL version of pointwise LU factorization.

Chapter 5

Blocked LU Factorization
The execution time of many linear algebra operations on modern computer
architectures can be decreased dramatically through blocking to increase cache
utilization [4, 5]. In algorithms where repeated matrix-vector operations are
done (such as the rank-one-update of the LU factorization), it is beneficial
to convert the algorithm to use matrix-matrix operations to introduce more
opportunities for blocking.
Here we give an example of how to use MTL to reformulate the LU factorization algorithm to use more matrix-matrix operations [?]. First the matrix A is split into four submatrices, using a blocking factor of r: A11 is r × r and A12 is r × (n − r), where dim(A) = n × n.

    A = [ A11  A12 ]
        [ A21  A22 ]

Then A = LU is formulated in terms of the blocks.

    [ A11  A12 ]   [ L11   0  ] [ U11  U12 ]
    [ A21  A22 ] = [ L21  L22 ] [  0   U22 ]

From this the following equations are derived for the submatrices of A. The matrix on the right shows the values that should occupy A after one step of the blocked LU factorization.

    A11 = L11 U11                 [ L11\U11   U12 ]
    A12 = L11 U12                 [   L21     Ã22 ]
    A21 = L21 U11
    A22 = L21 U12 + L22 U22

L11, U11, and L21 are updated by applying the pointwise LU factorization to the combined region of A11 and A21. U12 is then updated with a triangular solve applied to A12. Finally Ã22 is calculated with a matrix product of L21 and U12. The algorithm is then applied recursively to Ã22.
In the implementation of block LU factorization, MTL can be used to create a partitioning of the matrix into submatrices. The use of the submatrix objects throughout the algorithm removes the redundant indexing that a programmer would typically have to do to specify the regions of each submatrix for each operation. The code below shows how the partitioning corresponding to Figure 5.1 is performed with the matrix objects A_0, A_1, and A_2. The partitioning for Figure 5.2 is created with the matrix objects A_11, A_12, A_21, and A_22.

SubMatrix
  A_0  = A(range(j,M),    range(0,j)),
  A_1  = A(range(j,M),    range(j,j+jb)),
  A_2  = A(range(j,M),    range(j+jb,N)),
  A_11 = A(range(j,j+jb), range(j,j+jb)),
  A_12 = A(range(j,j+jb), range(j+jb,N)),
  A_21 = A(range(j+jb,M), range(j,j+jb)),
  A_22 = A(range(j+jb,M), range(j+jb,N));
typename triangle_view<SubMatrix, unit_lower>::type L_11(A_11);

Figure 5.1 depicts the block factorization part way through the computation. The matrix is divided up for the pointwise factorization step. The region including A11 and A21 is labeled A1. Since there is pivoting involved, the rows in the regions labeled A0 and A2 must be swapped according to the pivots used in A1.

Figure 5.1: Pointwise step in block LU factorization.
The implementation of this step in MTL is very concise. The A_1 submatrix object is passed to the lu_factor() algorithm. The permute() function is then used on A_0 and A_2 to pivot their rows to match A_1.

PivotVector sub_pivots(jb);
S ret = lu_factor(A_1, sub_pivots);
if (ret != 0)
  return ret + j;
for (S i = j; i < min(M, j + jb); ++i)
  pivots[i] = sub_pivots[i-j] + j;
if (j > 0)
  permute(A_0, sub_pivots, left_side());
if (j + jb < M)
  permute(A_2, sub_pivots, left_side());

Once A1 has been factored, the A12 and A22 regions must be updated. The submatrices involved are depicted in Figure 5.2. The A12 region needs to be updated with a triangular solve.

Figure 5.2: Update steps in block LU factorization.
To apply the tri_solve() algorithm to L11 and A12, we merely call the MTL routine and pass in the L_11 and A_12 matrix objects.

tri_solve(L_11, A_12, left_side());

The last step is to calculate Ã22 with a matrix-matrix multiply according to Ã22 ← A22 − L21U12. The A_12 and A_21 matrix objects are used to implement the operation; they have been overwritten in the previous steps with U12 and L21.
A_22 -= A_21 * A_12;

The complete version of the MTL block LU factorization algorithm is given
in Figure 5.3.


template <class Matrix, class PivotVector>
typename Matrix::size_type
block_lu(Matrix& A, PivotVector& pivots)
{
  typedef typename Matrix::value_type T;
  typedef typename Matrix::size_type S;
  typedef typename submatrix_view<Matrix>::type SubMatrix;
  const S BF = LU_BF;                // blocking factor
  const S M = A.nrows();
  const S N = A.ncols();
  if (min(M, N) <= BF || BF == 1)
    return lu_factor(A, pivots);
  for (S j = 0; j < min(M, N); j += BF) {
    S jb = min(min(M, N) - j, BF);
    SubMatrix
      A_0  = A(range(j,M),    range(0,j)),
      A_1  = A(range(j,M),    range(j,j+jb)),
      A_2  = A(range(j,M),    range(j+jb,N)),
      A_11 = A(range(j,j+jb), range(j,j+jb)),
      A_12 = A(range(j,j+jb), range(j+jb,N)),
      A_21 = A(range(j+jb,M), range(j,j+jb)),
      A_22 = A(range(j+jb,M), range(j+jb,N));
    typename triangle_view<SubMatrix, unit_lower>::type L_11(A_11);
    PivotVector sub_pivots(jb);
    S ret = lu_factor(A_1, sub_pivots);    // pointwise step on [A_11; A_21]
    if (ret != 0)
      return ret + j;
    for (S i = j; i < min(M, j + jb); ++i)
      pivots[i] = sub_pivots[i-j] + j;
    if (j > 0)
      permute(A_0, sub_pivots, left_side());
    if (j + jb < M) {
      permute(A_2, sub_pivots, left_side());
      tri_solve(L_11, A_12, left_side());  // update A_12
      A_22 -= A_21 * A_12;                 // update the trailing block
    }
  }
  return 0;                                // success
}

Figure 5.3: MTL version of block LU factorization.

Chapter 6

Preconditioned GMRES(m)
One important use for MTL is for the rapid construction of numerical libraries.
Although not necessary, a generic approach can be used when developing these
libraries as well, resulting in reusable scientific software at a higher level. To
demonstrate how one might use MTL for a non-trivial high-level library, we show
the complete implementation of the preconditioned GMRES(m) algorithm [9]
in Figure 6.1 (taken from our Iterative Template Library).
The basic algorithmic steps (corresponding to the GMRES algorithm as given in [9]) are given in the comments, and the calls to MTL in the body of the algorithm should be fairly clear. Some of the other code may seem somewhat impenetrable at first glance, so we will take a quick walk through the more difficult statements.
The algorithm parameterizes GMRES in some important ways, as shown in the template statement at the top of Figure 6.1. The matrix and vector types are parameterized, so that any matrix type can be used. In particular, matrices having any element type can be used — e.g., real or complex. In fact, matrices without explicit elements at all (matrix-free operators [2]) can be used. All that is required is for the mult() algorithm to be suitably defined. For MTL matrices, the generic MTL mult() will generally suffice. For non-MTL matrices, or matrix-free operators, a suitably overloaded mult() must be provided.
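As a hedged sketch of what such a matrix-free operator might look like (not taken from MTL or ITL; the three-argument mult(A, x, y) signature and the stencil are assumptions made for illustration):

// A "matrix-free" linear operator: y = A*x is computed from a stencil and no
// matrix entries are ever stored. An overload of mult() makes it usable by
// generic code that only needs the matrix-vector product.
struct laplace_1d {
  typedef double value_type;  // generic algorithms read the element type from here
  int n;                      // problem size
  explicit laplace_1d(int n_) : n(n_) {}
};

template <class VecX, class VecY>
void mult(const laplace_1d& A, const VecX& x, VecY& y) {
  for (int i = 0; i < A.n; ++i) {
    double v = 2.0 * x[i];
    if (i > 0)       v -= x[i - 1];
    if (i + 1 < A.n) v -= x[i + 1];
    y[i] = v;                 // y = A*x for the tridiagonal [-1 2 -1] operator
  }
}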
There are also two other type parameterizations of interest, the Preconditioner
and the Iteration. Similar to the parameterization of the matrix, the preconditioner type is parameterized so that arbitrary preconditioners can be used. It
is only required that the preconditioner be callable with the solve() algorithm.
The Iteration type parameter allows the user to control the stopping criterion
for the algorithm. (Pre-defined stopping criteria are included as part of ITL.)
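For concreteness, here is a sketch of the small interface that the gmres() in Figure 6.1 actually exercises on its Iteration parameter (finished(), normb(), tol(), operator++, error_code(), and the nested real type). It is not the ITL class, just a minimal stand-in written against that usage:

// Minimal Iteration policy: stop when the residual drops below tol * ||b||,
// or report an error when the iteration budget is exhausted.
struct simple_iteration {
  typedef double real;                 // gmres() declares beta as Iteration::real
  double normb_, tol_;
  int iter_, max_iter_, error_;
  simple_iteration(double normb, int max_iter, double tol)
    : normb_(normb), tol_(tol), iter_(0), max_iter_(max_iter), error_(0) {}
  bool finished(double resid) {
    if (resid <= tol_ * normb_) return true;              // converged
    if (iter_ >= max_iter_) { error_ = 1; return true; }  // out of iterations
    return false;
  }
  double normb() const { return normb_; }
  double tol() const { return tol_; }
  simple_iteration& operator++() { ++iter_; return *this; }
  int error_code() const { return error_; }
};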
The using namespace mtl statement allows us to access MTL functions (which are all declared within the mtl namespace) without explicitly using the mtl:: scope. The use of a namespace helps to prevent name clashes with other libraries and user code.

Near the top of the function we use the internally defined typedef for the value_type to determine the type of the individual elements of the matrix.

template <class Matrix, class Vector, class VectorB, class Preconditioner,
          class Iteration, class Norm, class InnerProduct>
int gmres(const Matrix &A, Vector &x, const VectorB &b,
          const Preconditioner &Minv, int m, Iteration& outer,
          Norm norm, InnerProduct dot)
{
  typedef typename Matrix::value_type T;
  typedef typename mtl::matrix<T, rectangle<>,
                               dense<>, column_major>::type InternalMatrix;
  typedef typename itl_traits<Vector>::internal_vector InternalVec;
  REQUIRE4(InternalVec, T, Norm, InnerProduct, HilbertSpace);
  REQUIRE4(Matrix, Vector, VectorB, T, LinearOperator);
  REQUIRE4(Preconditioner, InternalVec, InternalVec, T, LinearOperator);
  InternalMatrix H(m+1, m), V(size(x), m+1);
  InternalVec s(m+1), w, r, u;
  std::vector< givens_rotation<T> > rotations(m+1);
  w = b - A * x;
  r = Minv * w;
  typename Iteration::real beta = std::abs(norm(r));
  while (! outer.finished(beta)) {                  // Outer iteration
    V[0] = r / beta;
    s = zero(s[0]);
    s[0] = beta;
    int j = 0;
    Iteration inner(outer.normb(), m, outer.tol());
    do {                                            // Inner iteration
      u = A * V[j];
      w = Minv * u;
      for (int i = 0; i <= j; i++) {
        H(i,j) = dot(w, V[i]);
        w -= V[i] * H(i,j);
      }
      H(j+1,j) = norm(w);
      V[j+1] = w / H(j+1,j);
      // QR triangularization of H
      for (int i = 0; i < j; i++)
        rotations[i].apply(H(i,j), H(i+1,j));
      rotations[j] = givens_rotation<T>(H(j,j), H(j+1,j));
      rotations[j].apply(H(j,j), H(j+1,j));
      rotations[j].apply(s[j], s[j+1]);
      ++inner, ++outer, ++j;
    } while (! inner.finished(std::abs(s[j])));
    // Form the approximate solution
    tri_solve(tri_view()(H(range(0, j), range(0, j))), s);
    x += V(range(0,x.size()), range(0,j)) * s;
    // Restart
    w = b - A * x;
    r = Minv * w;
    beta = std::abs(norm(r));
  }
  return outer.error_code();
}

Figure 6.1: The Iterative Template Library (ITL) implementation of the preconditioned GMRES(m) algorithm. This algorithm computes an approximate solution to Ax = b preconditioned with M. The restart value is specified by the parameter m.

All MTL matrix and vector classes have an accessible type member called value_type that specifies the type of the element data. By using this internal type, rather than a fixed type, we can make the algorithm generic with respect to the element type.
Finally, although the entire algorithm fits on a single page, it is not a toy
implementation — it is both high-quality and high-performance. This is where
the power of reusable software components can be appreciated. Now that there
exists a generic GMRES(m) that has been implemented, tested, and debugged,
programmers wanting to use GMRES are forever spared the work of implementation, testing, and debugging GMRES themselves. Both time and reliability
are gained.


Part IV

Reference Manual

Chapter 7

Concepts

7.1 Algebraic Concepts

One of the beautiful aspects of linear algebra is that in many respects matrices and vectors act like numbers; for instance, you can add and multiply them. Of course, matrices and vectors don't act exactly like numbers; there are some important differences and restrictions on the operations. Since many algorithms operate on matrices at this abstraction level (without looking "inside" the matrices), it is important to formulate a precise description of this abstraction level. Fortunately, mathematicians [7, 8, 11] have already developed a very precise definition for a linear algebra, which we will use, merely adapting it to C++ syntax. The section following this one will describe the interface for looking "inside" the matrix and vector data structures.

The definition of a linear algebra is somewhat complex and relies on a family of algebraic concepts which we also define here. If you are not particularly interested in this mathematical structure, you can skip forward to the description of the LinearAlgebra concept in Section 7.1.12, which summarizes all of the operations on matrices and vectors. Later, when you encounter algorithms that use the other concepts defined here, you can return to this section for reference.

With the following concept definitions we attempt to map the purely mathematical algebraic concepts into the realm of C++ components. Due to many practical considerations the mapping is not perfect, and the mathematical concepts will be stretched and bent here and there. The overall purpose is to define useful interfaces and terminology for use in the precise documentation of algorithms in C++. This is not a theoretical exercise for determining how closely we can model the mathematical concepts in C++.

In our formulation of these C++ concepts we have left out the mathematical concepts that do not aid in defining interfaces for real C++ components. However, much of the fine granularity present in the mathematical concepts has been retained. This granularity is useful when documenting algorithms, as it makes it easier to choose a concept that closely matches the minimal requirements for each parameter to an algorithm. For instance, most of the algorithms in ITL only use the subset of matrix operations contained in the LinearOperator concept. Figure 7.1 gives an overview of the algebraic concepts, with the arrows representing the refinement relationship.
Figure 7.1: Refinement of the algebraic concepts. (The concepts shown are AbelianGroup, Ring, Field, R-Module, VectorSpace, BanachSpace, HilbertSpace, LinearOperator, TransposableLinearOperator, and LinearAlgebra.)

Equality

Stating whether two objects are "equal" is somewhat of a sticky subject. To begin with, we will want to talk about whether objects are equal even if there is not an operator== defined for that type (it is not EqualityComparable). Also, when dealing with floating point numbers we want to talk about an equality that is much looser than the bit-level equality that is given by operator==. To this end we use the symbol = to mean a = b iff |a − b| < ε, where ε is some appropriate small number for the situation (like machine epsilon). If the number is not LessThanComparable or it does not make sense to take the absolute value, then = means that during computation, if the value on the left-hand side were substituted with the value on the right-hand side, the difference in the resulting behaviour of the program would not be large enough to care about.
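In code, this loose equality amounts to nothing more than the following sketch (the helper name and the choice of ε are ours, not part of MTL):

#include <cmath>

// a and b are considered "equal" when they differ by less than eps, where eps
// is chosen to suit the situation (for example, a few machine epsilons).
template <class T>
bool approx_equal(T a, T b, T eps) {
  return std::abs(a - b) < eps;
}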

7.1.1 AbelianGroup

A group is a set of elements together with an operator over the set that obeys the associative law, has an identity element, and has an inverse for every element. If the operator is commutative it is called an Abelian group, and if the notation used for the operator is + then it is an additive group. The concept AbelianGroup we define here is an additive Abelian group.

Refinement of

Assignable

Notation

X        is a type that is a model of AbelianGroup.
a, b, c  are objects of type X.

Valid Expressions
• Addition
a + b
Return Type:  X or a type convertible to X that is also a model of AbelianGroup.
Semantics:    See below for the invariants.

• Addition Assignment
a += b
Return Type:  X&
Semantics:    Equivalent to a = a + b.

• Additive Inverse
-a
Return Type:  X or a type convertible to X that is also a model of AbelianGroup.
Semantics:    See below.

• Subtraction
a - b
Return Type:  X or a type convertible to X that is also a model of AbelianGroup.
Semantics:    Equivalent to a + -b.

• Subtraction Assignment
a -= b
Return Type:  X&
Semantics:    Equivalent to a = a + -b.

• Zero Element (Additive Identity)
zero(a)
Return Type:  X
Semantics:    This function returns a zero element of the same type as the argument a. a is not changed; the purpose of the argument is merely to carry type information, and also size information in the case of vectors and matrices. See below for the algebraic properties of the zero element.

Invariants
• Associativity
(a + b) + c = a + (b + c)

• Definition of the Identity Element
a + zero(a) = a
• Definition of Additive Inverse
a + -a = zero(a)
• Commutativity
a + b = b + a
Models
• int
• float

• complex
• vector::type
• matrix::type
Constraints Checking Class
template <class X>
struct AbelianGroup {
  void constraints() {
    c = a + b;
    b += a;
    b = -a;
    c = a - b;
    b -= a;
    b = zero(a);
  }
  X a, b, c;
};
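To illustrate how generic code can be written against these requirements, the following sketch sums a range using only the AbelianGroup expressions (addition assignment and zero()). The zero() overload for double is an illustrative stand-in supplied here so the example is self-contained; it is not a standard library function.

#include <iostream>
#include <iterator>
#include <vector>

double zero(double) { return 0.0; }   // illustrative stand-in for built-in types

template <class Iter>
typename std::iterator_traits<Iter>::value_type
group_sum(Iter first, Iter last)
{
  // assumes a non-empty range; zero(*first) only carries type information
  typename std::iterator_traits<Iter>::value_type s = zero(*first);
  for (; first != last; ++first)
    s += *first;
  return s;
}

int main()
{
  std::vector<double> v(4, 2.5);
  std::cout << group_sum(v.begin(), v.end()) << std::endl;   // prints 10
}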

7.1.2  Ring

A Ring adds the notion of a second operation, namely multiplication, to the concept of an AbelianGroup. The multiplication obeys the law of associativity and distributes over addition. Adding a unity element (the multiplicative identity) to a ring gives a ring-with-unity. For simplicity we will include the requirement for a unity element in the Ring concept. Another variant of the Ring concept adds the requirement that multiplication be commutative. We will refer to this concept as a CommutativeRing.

Refinement of
AbelianGroup

Notation
X    is a type that is a model of Ring.
a,b  are objects of type X.

Requirements
• Multiplication
a * b
Return Type:  X or a type convertible to X.

• Multiplicative Identity Element
identity(a)
Return Type:  X
Semantics:    Returns the appropriate identity element for the type of a. The argument a is not changed; its purpose is to carry type information, and also size information in the case when a is a matrix. The algebraic properties of the identity element are listed below.

Invariants
• Associativity of Multiplication
a * (b * c) = (a * b) * c
• Distributivity
a*(b + c) = a*b + a*c
(b + c)*a = b*a + c*a
• Definition of Multiplicative Identity
a * identity(a) = a
• Commutativity (for a CommutativeRing)
a * b = b * a
Models
• int
• float
• complex
• matrix::type
Constraints Checking Class
template <class X>
struct Ring {
  CLASS_REQUIRES(X, AbelianGroup);
  void constraints() {
    c = a * b;
    b = identity(a);
  }
  X a, b, c;
};
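Similarly, the Ring requirements are already enough to write a generic repeated-multiplication routine. The identity() overload for double below is an illustrative stand-in so the sketch is self-contained; it is not a standard library function.

double identity(double) { return 1.0; }   // illustrative stand-in

// Raise a Ring element to a non-negative integer power using only
// multiplication and the multiplicative identity.
template <class X>
X ring_pow(X a, unsigned n)
{
  X result = identity(a);
  while (n-- > 0)
    result = result * a;
  return result;
}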

7.1.3  Field

A Field adds the notion that there is always a solution for the equations

    ax = b
    ya = b        for all a ≠ 0.

This means that the set is closed under division, so we can add the division operator to the requirements.
Refinement of
Ring, EqualityComparable, and LessThanComparable
Notation
X    is a type that is a model of Field.
a,b  are objects of type X.

Valid Expressions
• Division
a / b
Return Type:  X or a type convertible to X.

• Division Assignment
a /= b
Return Type:  X&

Invariants
• Definition of Multiplicative Inverse
if a * x = b then x = b/a.
Models
• float
• double
• complex
Constraints Checking Class
template <class X>
struct Field {
  CLASS_REQUIRES(X, Ring);
  CLASS_REQUIRES(X, EqualityComparable);
  CLASS_REQUIRES(X, LessThanComparable);
  void constraints() {
    c = a / b;
    b /= a;
  }
  X a, b, c;
};

7.1.4  R-Module

The R-Module concept defines multiplication (or "scaling") between an AbelianGroup and a Ring, where the result of the multiplication is an object of the group type. The multiplication must also be associative, and it must distribute over the addition operator of the AbelianGroup.

Refinement of
AbelianGroup

Notation
G     is a type that is a model of AbelianGroup.
R     is a type that is a model of Ring.
a, b  are objects of type R.
x     is an object of type G.

Participating Types
• Vector Type
The type that plays the role of the vector (type G) and that is a model of
AbelianGroup.
• Scalar Type
The type that plays the role of the scalar (type R) and that is a model of
Ring.
Valid Expressions
• Right Scalar Multiplication
x * a
Return Type:  G or a type convertible to G.

• Left Scalar Multiplication
a * x
Return Type:  G or a type convertible to G which, together with the scalar type, also satisfies R-Module.

• Scalar Multiplication Assignment
x *= a
Return Type:  G&

Invariants
• Distributive
(a + b)*x = a*x + b*x
a*(x + y) = a*x + a*y

• Associative
a * (b * x) = (a * b) * x

• Identity
identity(a) * x = x

Models
• { vector::type, int }
• { matrix::type, int }

Constraints Checking Class
template <class G, class R>
struct R_Module {
  CLASS_REQUIRES(G, AbelianGroup);
  CLASS_REQUIRES(R, Ring);
  void constraints() {
    y *= a;
    y = x * a;
    y = a * x;
  }
  G x, y;
  R a;
};

7.1.5  VectorSpace

The VectorSpace concept is a refinement of the R-Module concept, adding the requirement that the scalar type be a model of Field instead of Ring. With this we are able to define vector division by a scalar. A VectorSpace is also called an F-module.

Refinement of
R-Module

Notation
G  is a type that is a model of AbelianGroup.
F  is a type that is a model of Field.
x  is an object of type G.
a  is an object of type F.

Participating Types
• Vector Type
The type that plays the role of the vector (type G) and that is a model of
AbelianGroup.
• Scalar Type
The type that plays the role of the scalar (type F) and that is a model of
Field.
Valid Expressions
• Scalar Division
x / a
Return Type:  G or a type convertible to G.

• Scalar Division Assignment
x /= a
Return Type:  G&

Models
• { vector::type, float }
• { matrix::type, double }
• { std::valarray, float }
• { std::complex, float }
Constraints Checking Class
template <class G, class F>
struct VectorSpace {
  CLASS_REQUIRES2(G, F, R_Module);
  CLASS_REQUIRES(F, Field);
  void constraints() {
    y = x / a;
    y /= a;
  }
  G x, y;
  F a;
};

7.1.6  FiniteVectorSpace, FiniteBanachSpace, FiniteHilbertSpace

A finite-dimensional vector space has a basis consisting of a finite number of vectors x1, ..., xn, where n is the dimension of the vector space. Any vector in such a space can be expressed in terms of n coordinates with respect to the basis. With this in mind, the concept FiniteVectorSpace requires a method of access for the coordinates of a vector and access to the dimension, namely the operator[] and a size() function. The behaviour of these operations and the associated traits information is described in BasicVector.
The next two vector space concepts, BanachSpace and HilbertSpace, also
have finite-dimensional variants which we will call FiniteBanachSpace and FiniteHilbertSpace.
Refinement of
VectorSpace and BasicVector
Constraints Checking Class
template <class G, class F>
struct FiniteVectorSpace {
  CLASS_REQUIRES2(G, F, VectorSpace);
  CLASS_REQUIRES(G, BasicVector);
};
template <class G, class F, class Norm>
struct FiniteBanachSpace {
  CLASS_REQUIRES3(G, F, Norm, BanachSpace);
  CLASS_REQUIRES(G, BasicVector);
};
template <class G, class F, class Norm, class InnerProduct>
struct FiniteHilbertSpace {
  CLASS_REQUIRES4(G, F, Norm, InnerProduct, HilbertSpace);
  CLASS_REQUIRES(G, BasicVector);
};

7.1.7  BanachSpace

A BanachSpace is basically a VectorSpace together with a definition for a norm function. More technically, a BanachSpace is a complete normed vector space [8]. Since one may want to use the same generic algorithm on different spaces with different norms, we specify access to the norm function through a function object which is passed to algorithms or classes that operate on a BanachSpace. (The MTL and ITL algorithms provide the 2-norm as a default.)

Refinement of
VectorSpace

Notation
{G,F}  is a module that models VectorSpace.
Norm   is a functor type as defined below.
x,y    are objects of type G.
a,b    are objects of type F.
r      is an object of type magnitude<F>::type.
norm   is an object of type Norm.

Participating Types
• Norm Functor Type
The Norm type is a function object that can be applied to the vector type G and whose return type is magnitude<F>::type. In addition, the norm function satisfies the invariants listed below.

Associated Types
• Magnitude Type
magnitude<F>::type
The return type of abs() applied to the scalar type F. Typically this is some real number type.
Valid Expressions
• Norm Functor Application
norm(x)
Return Type:  magnitude<F>::type
Semantics:    See below.

• Absolute Value
abs(a)
Return Type:  magnitude<F>::type
Semantics:    The distance between a and zero.

Invariants
The norm functor must obey the following invariants.
• norm(x) >= zero(r)
• norm(x) == zero(r) iff x == zero(x)
• Homogeneity
norm(a*x) = abs(a) * norm(x)

7.1. ALGEBRAIC CONCEPTS

47

• Triangle Inequality
norm(x + y) <= norm(x) + norm(y)

The abs() function must obey a similar set of invariants.
• abs(a) >= zero(r)
• abs(a) == zero(r) iff a == zero(a)
• Homogeneity
abs(a*b) = abs(a) * abs(b)
• Triangle Inequality
abs(a + b) <= abs(a) + abs(b)

Constraints Checking Class
template <class G, class F, class Norm>
struct BanachSpace
{
  CLASS_REQUIRES2(G, F, VectorSpace);
  void constraints() {
    r = norm(x);
    r = abs(a);
  }
  G x;
  F a;
  Norm norm;
  typename magnitude<F>::type r;
};
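For example, a user-defined norm functor in the style required here might look like the following sketch; the names max_norm and converged are ours and are not MTL or ITL components.

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <valarray>

// A simple infinity-norm functor for real-valued valarrays.
struct max_norm {
  double operator()(const std::valarray<double>& x) const {
    double m = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i)
      m = std::max(m, std::abs(x[i]));
    return m;
  }
};

// A convergence test parameterized on the norm, so the algorithm never
// commits to one particular norm.
template <class Vec, class Norm>
bool converged(const Vec& residual, const Norm& norm, double tol)
{
  return norm(residual) < tol;
}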

7.1.8  HilbertSpace

A HilbertSpace is basically a VectorSpace together with an inner product function. The inner product is also known as the dot product or scalar product. As with the norm of a BanachSpace, the inner product must be provided as a function object. (MTL and ITL algorithms use the mtl::dot functor as the default function object.)
Refinement of
BanachSpace

Notation
{G,F}         is a module that models BanachSpace.
InnerProduct  is a functor as defined below.
x,y,z         are objects of type G.
a,b           are objects of type F.
dot           is an object of type InnerProduct.

Participating Types
• Inner Product Type
The InnerProduct type is a function object that can be applied to two vectors of type G and whose return type is the scalar type F. In addition, the inner product satisfies the invariants listed below.
Valid Expressions
• Inner Product Functor Application
dot(x,y)
Return Type:    F
Semantics:      See below.
Preconditions:  size(x) == size(y) if they are finite.

• Scalar Conjugate
conj(a)
Return Type:  F

• Vector Conjugate
conj(x)
Return Type:  convertible to G

Invariants
The dot functor obeys the following invariants.
• dot(x,x) > zero(a) for all x != zero(x)
• dot(x,x) == zero(a) if x == zero(x)
• dot(x,y) == dot(y,conj(x))
• dot(a*x + b*y, z) = a*dot(x,z) + b*dot(y,z)
• dot(x, a*y + b*z) = conj(a)*dot(x,y) + conj(b)*dot(x,z)
• norm(x) = sqrt(dot(x,x))
The scalar conj() function must obey the following laws.
• conj(a * b) = conj(a) * conj(b)
• conj(a + b) = conj(a) + conj(b)
The vector conj() function must obey similar laws.
• conj(x * y) = conj(x) * conj(y)
• conj(x + y) = conj(x) + conj(y)

7.1. ALGEBRAIC CONCEPTS

49

Constraints Checking Class
template <class G, class F, class Norm, class InnerProduct>
struct HilbertSpace
{
  CLASS_REQUIRES3(G, F, Norm, BanachSpace);
  void constraints() {
    a = dot(x, y);
    a = conj(a);
    y = conj(x);
  }
  G x, y;
  F a;
  InnerProduct dot;
};
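To illustrate the function-object style of inner product, the following sketch defines a plain real dot-product functor and a routine that uses only the dot(x,y) expression; the names real_dot and projection_coefficient are ours.

#include <cstddef>
#include <valarray>

struct real_dot {
  double operator()(const std::valarray<double>& x,
                    const std::valarray<double>& y) const {
    double s = 0.0;                              // assumes size(x) == size(y)
    for (std::size_t i = 0; i < x.size(); ++i)
      s += x[i] * y[i];
    return s;
  }
};

// The coefficient dot(x,y)/dot(y,y) of the orthogonal projection of x onto y.
template <class Vec, class InnerProduct>
double projection_coefficient(const Vec& x, const Vec& y,
                              const InnerProduct& dot)
{
  return dot(x, y) / dot(y, y);
}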

7.1.9  LinearOperator

A linear operator is a function from one vector space to another; it is also known as a linear transformation. The concept LinearOperator consists of an operator type, two vector types (a domain and a range vector type), and a scalar type.

Notation
Op      is the operator type.
F       is the scalar type, which must model Field.
{VX,F}  is a module that models VectorSpace.
{VY,F}  is a module that models VectorSpace.
A       is an object of type Op.
x,w     are objects of type VX.
a       is an object of type F.

Participating Types
• Operator Type
The type of the operator function Op which can be applied to the domain
vector space.
• Domain Vector Space
A type that models VectorSpace which is the argument to the LinearOperator. An access method (via a traits class) is not provided since a given
linear operator type may be applicable to many different vector types.
• Range Vector Space
A type that models VectorSpace (or at least is convertible to such a type).

Associated Types
• Size Type
matrix traits::size type
The integral type that is the return type of the nrows() and ncols() functions.

Valid Expressions
• Operator Application
A * x
Return Type:  a type that models the range vector space (or is convertible to it).

Invariants
• Linearity
A * (x + w) = A * x + A * w
A * (x * a) = (A * x) * a

Constraints Checking Class
template <class Op, class VX, class VY, class F>
struct LinearOperator {
  CLASS_REQUIRES2(VX, F, VectorSpace);
  CLASS_REQUIRES2(VY, F, VectorSpace);
  void constraints() {
    y = A * x;
  }
  Op A;
  VX x;
  VY y;
};
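For instance, a residual computation can be written against nothing more than operator application plus the VectorSpace arithmetic of the vectors involved; this sketch assumes that b - A*x is available for the chosen model types.

// Only operator application and AbelianGroup subtraction are used here.
template <class Op, class VX, class VY>
VY residual(const Op& A, const VX& x, const VY& b)
{
  return b - A * x;
}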

7.1.10  FiniteLinearOperator

A FiniteLinearOperator is a LinearOperator that operates over finite vector spaces.
Notation
Op      is the operator type.
F       is the scalar type, which must model Field.
{VX,F}  is a module that models VectorSpace.
{VY,F}  is a module that models VectorSpace.
A       is an object of type Op.
x       is an object of type VX.

Associated Types
• Size Type
matrix traits::size type
The integral type that is the return type of the nrows() and ncols() functions.

Valid Expressions
• Operator Application
A * x
Return Type:    a type that models the range vector space (or is convertible to it).
Preconditions:  ncols(A) == size(x)
Semantics:      The returned vector will have size equal to nrows(A).

• Number of Rows
nrows(A)
Return Type:  matrix traits::size type
Semantics:    The dimension of the resulting vector space.

• Number of Columns
ncols(A)
Return Type:  matrix traits::size type
Semantics:    The dimension of the input vector space.

Constraints Checking Class
template <class Op, class VX, class VY, class F>
struct FiniteLinearOperator {
  CLASS_REQUIRES4(Op, VX, VY, F, LinearOperator);
  void constraints() {
    m = nrows(A);
    n = ncols(A);
  }
  Op A;
  typename matrix_traits<Op>::size_type m, n;
};

7.1.11  TransposableLinearOperator

A TransposableLinearOperator is simply a LinearOperator for which the transpose of the linear operator can also be applied.
Refinement of
LinearOperator

Notation
X  is a type that models TransposableLinearOperator.
V  is a type that models VectorSpace.
A  is an object of type X.
y  is an object of type V.

Associated Types
In addition to the associated types inherited from LinearOperator we have:
• Transpose Linear Operator Type
transpose view::type
The type returned by trans(A), which is a transposed view into the matrix.

Valid Expressions
• Transposed Operator Application (Transposed Matrix-Vector Multiplication)
trans(A) * y
Return Type:    some type that models VectorSpace or is convertible to one.
Preconditions:  nrows(A) == size(y)
Semantics:      The resulting vector (if it is finite) will have size equal to ncols(A).

• Transposed Operator Application (Left Matrix-Vector Multiplication)
y * A
Return Type:    a type that models VectorSpace (or at least is convertible to one).
Preconditions:  nrows(A) == size(y)
Semantics:      The resulting vector (if it is finite) will have size equal to ncols(A).

Constraints Checking Class
template <class X, class VX, class VY, class F>
struct TransposableLinearOperator {
  CLASS_REQUIRES4(X, VX, VY, F, LinearOperator);
  void constraints() {
    typedef typename transpose_view<X>::type Transpose;
    Transpose AT = trans(A);
    y = AT * x;
    y = x * A;
  }
  X A;
  VX x;
  VY y;
};

7.1.12  LinearAlgebra

The LinearAlgebra concept adds several operations to the LinearOperator concept. With the scalar multiplication and addition operators the LinearOperator forms a VectorSpace, and with multiplication the LinearOperator forms a Ring. The multiplication operator associated with this Ring is the composition of linear operators, which in the context of MTL is matrix multiplication. One of the requirements for a Ring is that it be closed under multiplication. Matrix multiplication is only closed for square matrices. In the general case of rectangular matrices, none of the matrices involved in the multiplication are over the same vector space (for C = AB, A maps R^m → R^k, B maps R^k → R^n, and C maps R^m → R^n). We gloss over this distinction and include rectangular matrices in the LinearAlgebra concept.
The LinearAlgebra concept does not introduce any new requirements; it is just a composition of the algebraic concepts discussed in this chapter. However, to help clarify the definition of this rather large concept, here we list the complete set of requirements. The usual definition for the computation of each operation is listed with it merely as a reminder to the reader and does not require that the operation be implemented in that manner. We use t to denote the temporary vector or matrix resulting from some of the operations. (Due to the expression-template optimization that MTL performs, in many cases the temporary is not actually formed.)
Refinement
LinearOperator, VectorSpace, and Ring
Notation
Matrix  is a type that models TransposableLinearOperator.
Vector  is a type that models VectorSpace.
A,B     are objects of type Matrix.
r, c    are objects of type Matrix which in this case is a VectorMatrix (a row or column vector).
x,y     are objects of type Vector.
a       is an object of type Scalar.

Participating Types
• The Matrix Type
• The Vector Type
• The Scalar Type

Valid Expressions
• Vector Addition (ti = xi + yi )
x + y

Preconditions:  size(x) == size(y) if x is finite-dimensional.

• Vector Addition Assignment (xi = xi + yi )
x += y

Preconditions:

size(x) == size(y)

• Vector Additive Inverse (ti = −xi )
-x

• Vector Subtraction (ti = xi − yi )
x - y

Preconditions:

size(x) == size(y)

• Vector Subtraction Assignment (xi = xi − yi )
x -= y

Preconditions:  size(x) == size(y)

• Zero Vector (ti = 0)
zero(x)

Semantics:

This function returns a zero vector of the same type as
the argument x. x is not changed, the purpose of the
argument is merely to carry type and size information.

• Vector-Scalar Multiplication (ti = xi α)
x * a

Return Type:

Vector or a type convertible to Vector that is a model
of VectorSpace.

• Scalar-Vector Multiplication (ti = αxi )
a * x

Return Type:  Vector or a type convertible to Vector that is a model of VectorSpace.

• Vector-Scalar Multiplication Assignment (xi = xi α)
x *= a

Return Type:

Vector&

• Matrix Addition (tij = aij + bij )
A + B

Preconditions:

nrows(A) == nrows(B) && ncols(A) == ncols(B)

• Matrix Addition Assignment (bij = bij + aij )
B += A

Preconditions:

nrows(A) == nrows(B) && ncols(A) == ncols(B)

• Matrix Additive Inverse (tij = −aij )
-A

• Matrix Subtraction (tij = aij − bij )
A - B

Preconditions:

nrows(A) == nrows(B) && ncols(A) == ncols(B)

• Matrix Subtraction Assignment (bij = bij − aij )
B -= A

Return Type:
Preconditions:

Matrix&
nrows(A) == nrows(B) && ncols(A) == ncols(B)

• Zero Matrix (tij = 0)
zero(A)

Return Type:
Semantics:

Matrix

This function returns a zero matrix of the same type
as the argument A. A is not changed, the purpose of the
argument is merely to carry type and size information.

• Matrix-Scalar Multiplication (tij = aij α)
A * a

• Scalar-Matrix Multiplication (tij = αaij )
a * A

• Matrix Scalar Multiplication Assignment (aij = aij α)
A *= a

Return Type:

Matrix&

• Matrix-Vector Multiplication (ti = Σj aij xj)

A * x
Preconditions:  ncols(A) == size(x)
Semantics:      The return type is convertible to a model of Vector, and the size is equal to nrows(A).

• Transposed Matrix-Vector Multiplication (tj = Σi aij yi)
trans(A) * y
Preconditions:  nrows(A) == size(y)
Semantics:      The return type is convertible to a model of Vector, and the size is equal to ncols(A).

• Left Matrix-Vector Multiplication (tj = Σi yi aij)
y * A
Preconditions:  nrows(A) == size(y)
Semantics:      The return type is convertible to a model of Vector, and the size is equal to ncols(A).

• Inner Product (Σi ri ci)
r * c
Return Type:    Scalar
Preconditions:  size(r) == size(c)

• Outer Product (tij = ci rj)
c * r
Semantics:  The return type is a model of Matrix with dimensions (size(c),size(r)). The matrix defaults to row-major, and is always dense.

• Matrix Multiplication (tij = Σk aik bkj)
A * B
Preconditions:  ncols(A) == nrows(B)
Semantics:      The return type is a model of Matrix with dimensions (nrows(A),ncols(B)).

• Identity Matrix (tii = 1)
identity(A)

Return Type:
Semantics:

Matrix

Returns the identity matrix for the same type as A. The
argument A is not changed, its purpose is to carry type
and size information for the creation of the identity
matrix.

• Matrix Transpose (tij = aji )
trans(A)

Return Type:
Semantics:

transpose view::type

Returns a transposed view of the matrix, where A(i,j)
appears to be A(j,i).

• Row and Column Vector Transpose
trans(r) and
trans(c)

Return Type:
Semantics:

transpose view::type
If Matrix is a row it becomes a column and vice-versa.

Constraints Checking Class
template <class Op, class VX, class VY, class F>
struct LinearAlgebra
{
  CLASS_REQUIRES4(Op, VX, VY, F, LinearOperator);
  CLASS_REQUIRES2(Op, F, VectorSpace);
  CLASS_REQUIRES(Op, Ring);
};
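As a small illustration, the following sketch expresses one step of a damped Richardson iteration using only expressions from the table above (matrix-vector product, vector subtraction, scalar-vector multiplication, and addition assignment); it is a sketch of ours, not an MTL or ITL routine.

template <class Matrix, class Vector, class Scalar>
void richardson_step(const Matrix& A, Vector& x, const Vector& b, Scalar omega)
{
  x += omega * (b - A * x);
}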

7.2  Collection Concepts

7.2.1  Collection

A Collection is a concept similar to the STL Container concept. A Collection
provides iterators for accessing a range of elements and provides information
about the number of elements in the Collection. However, a Collection has fewer
requirements than a Container. The motivation for the Collection concept is that
there are many useful Container-like types that do not meet the full requirements
of Container, and many algorithms that can be written with this reduced set of
requirements. To summarize the reduction in requirements:
• It is not required to “own” its elements: the lifetime of an element in
a Collection does not have to match the lifetime of the Collection object,
though the lifetime of the element should cover the lifetime of the Collection
object.
• The semantics of copying a Collection object is not defined (it could be a
deep or shallow copy or not even support copying).
• The associated reference type of a Collection does not have to be a real
C++ reference.
Because of the reduced requirements, some care must be taken when writing code that is meant to be generic for all Collection types. In particular, a Collection object should be passed by reference, since assumptions cannot be made about the behaviour of the copy constructor.
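For example, a generic routine over any Collection takes its argument by reference and relies only on begin(), end(), size type, and the iterator requirements. The helper below is a sketch of ours; it additionally assumes the value type is EqualityComparable.

template <class Collection, class T>
typename Collection::size_type
count_value(const Collection& c, const T& value)
{
  typename Collection::size_type n = 0;
  for (typename Collection::const_iterator i = c.begin(); i != c.end(); ++i)
    if (*i == value)
      ++n;
  return n;
}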
Associated types
• Value type
X::value type

The type of the object stored in a Collection. If the Collection is mutable
then the value type must be Assignable. Otherwise the value type must
be CopyConstructible.
• Iterator type
X::iterator

The type of iterator used to iterate through a Collection’s elements. The
iterator’s value type is expected to be the Collection’s value type. A conversion from the iterator type to the const iterator type must exist. The
iterator type must be an InputIterator.
• Const iterator type
X::const iterator

A type of iterator that may be used to examine, but not to modify, a
Collection’s elements.

• Reference type
X::reference

A type that behaves like a reference to the Collection’s value type.


• Const reference type
X::const reference

A type that behaves like a const reference to the Collection’s value type.
• Pointer type
X::pointer

A type that behaves as a pointer to the Collection’s value type.
• Distance type
X::difference type

A signed integral type used to represent the distance between two of the
Collection’s iterators. This type must be the same as the iterator’s distance
type.
• Size type
X::size type

An unsigned integral type that can represent any nonnegative value of the
Collection’s distance type.
Notation
X     is a type that is a model of Collection.
a, b  are objects of type X.
T     is the value type of X.

Valid expressions
• Beginning of range
a.begin()

Return Type:
Semantics:
Postcondition:

iterator if a is mutable, const iterator otherwise

Returns an iterator pointing to the first element in the
Collection.
a.begin() is either dereferenceable or past-the-end. It
is past-the-end if and only if a.size() == 0.

4 The reference type does not have to be a real C++ reference. The requirements of the
reference type depend on the context within which the Collection is being used. Specifically
it depends on the requirements the context places on the value type of the Collection. The
reference type of the Collection must meet the same requirements as the value type. In addition,
the reference objects must be equivalent to the value type objects in the Collection (which is
trivially true if they are the same object). Also, in a mutable Collection, an assignment to the
reference object must result in an assignment to the object in the Collection (again, which is
trivially true if they are the same object, but non-trivial if the reference type is a proxy class).

• End of range
a.end()

Return Type:
Semantics:
Postcondition:

iterator if a is mutable, const iterator otherwise
Returns an iterator pointing one past the last element
in the Collection.
a.end() is past-the-end.

• Size
a.size()

Return Type:    size type
Semantics:      Returns the size of the Collection, i.e., the number of elements.
Postcondition:  a.size() >= 0

• Maximum size
a.max size()

Return Type:
Semantics:
Postcondition:

size type

Returns the largest size that this Collection can ever
have.
a.max size() >= 0 && a.max size() >= a.size()

• Empty Collection
a.empty()

Return Type:
Semantics:

Convertible to bool
Equivalent to a.size() == 0. (But possibly faster.)

• Swap
a.swap(b)

Return Type:
Semantics:

void

Equivalent to swap(a,b)

Complexity guarantees
begin() and end() are amortized constant time. size() is at most linear in the
Collection’s size. empty() is amortized constant time. swap() is at most linear in

the size of the two collections.
Invariants
• Valid range
For any Collection a, [a.begin(), a.end()) is a valid range.
• Range size
a.size() is equal to the distance from a.begin() to a.end().
• Completeness
An algorithm that iterates through the range [a.begin(), a.end()) will
pass through every element of a.

Models
• boost::array
• boost::array ptr
• std::vector
• mtl::vector::type
• mtl::matrix traits::OneD where Matrix is some type that models
the MTL Matrix concept.

7.2.2  ForwardCollection

The elements are arranged in some order that does not change spontaneously
from one iteration to the next. As a result, a ForwardCollection is EqualityComparable and LessThanComparable. In addition, the iterator type of a ForwardCollection is a MultiPassInputIterator which is just an InputIterator with the added
requirements that the iterator can be used to make multiple passes through a
range, and that if it1 == it2 and it1 is dereferenceable then ++it1 == ++it2.
The ForwardCollection also has a front() method.
Refinement of
Collection
Valid Expressions
• Front
a.front()
Return Type:  reference if a is mutable, const reference otherwise.
Semantics:    Equivalent to *(a.first()).

7.2.3  ReversibleCollection

The container provides access to iterators that traverse in both directions (forward and reverse). The iterator type must meet all of the requirements of BidirectionalIterator except that the reference type does not have to be a real C++
reference. The ReversibleCollection adds the following requirements to those of
ForwardCollection.
Refinement of
ForwardCollection

Valid Expressions
• Beginning of range
a.rbegin()
Return Type:  reverse iterator if a is mutable, const reverse iterator otherwise.
Semantics:    Equivalent to reverse iterator(a.end()).

• End of range
a.rend()
Return Type:  reverse iterator if a is mutable, const reverse iterator otherwise.
Semantics:    Equivalent to X::reverse iterator(a.begin()).

• Back
a.back()
Return Type:  reference if a is mutable, const reference otherwise.
Semantics:    Equivalent to *(--a.end()).

7.2.4  SequentialCollection

A SequentialCollection is a refinement of ReversibleCollection in which the elements are arranged in a strict linear order. No extra methods are required.

7.2.5  RandomAccessCollection

The iterators of a RandomAccessCollection satisfy all of the requirements of
RandomAccessIterator except that the reference type does not have to be a real
C++ reference. In addition, a RandomAccessCollection provides an element
access operator.
Refinement of
SequentialCollection
Valid Expressions
• Element Access
a[n]
Return Type:   reference if a is mutable, const reference otherwise.
Semantics:     Returns the nth element of the collection.
Precondition:  n must be convertible to size type and 0 <= n && n < a.size().

7.3  Iterator Concepts

7.3.1  IndexedIterator

Refinement of
ForwardIterator
Notation
X  is a type that is a model of IndexedIterator.
i  is an object of type X.

Associated Types
• Size Type
X::size type

The return type of the index() member function.
Valid Expressions
• Element Index
i.index()
Return Type:  X::size type
Semantics:    Returns the index associated with the element currently pointed to by the iterator i.

7.3.2  MatrixIterator

Refinement of
IndexedIterator
Notation
X  is a type that is a model of MatrixIterator.
i  is an object of type X.

Valid Expressions
• Row Index
i.row()
Return Type:  difference type
Semantics:    Returns the row index associated with the element currently pointed to by the iterator i.

• Column Index
i.column()
Return Type:  difference type
Semantics:    Returns the column index associated with the element currently pointed to by the iterator i.

7.3.3  IndexValuePairIterator

Valid Expressions
• Index
index(*i)
Return Type:  difference type
Semantics:    Returns the index associated with the element currently pointed to by the iterator i.

• Value
value(*i)
Return Type:  value type
Semantics:    Returns the value associated with the element currently pointed to by the iterator i.

7.4  Vector Concepts

7.4.1  BasicVector

A BasicVector provides a mapping from a set of indices to the associated elements. The indices do not have to form a contiguous range though the indices
must fall between zero and size(x). An access to x[i] where i >= size(x) is
considered out-of-bounds. We add a few useful traits such as whether the vector
is sparse or dense and if the vector has static size, what that size is.
Associated Types
• Linear Algebra Category
linalg category::type
For vectors this is vector tag unless the vector is also a matrix (such as a
row or column) in which case this is matrix tag.

• Value Type
vector traits::value type

• Reference Type
vector traits::reference

• Const Reference Type
vector traits::const reference

• Pointer Type
vector traits::pointer

• Const Pointer Type
vector traits::const pointer

• Size Type
vector traits::size type

• Sparsity
vector traits::sparsity
A Vector can be either sparse (sparse tag) or dense (dense tag).
• Static Size
vector traits::static size

If the vector has static size (size determined at compile-time) then static size
gives the length of the vector. Otherwise static size has the value dynamic sized.
Notation
X  is a type that is a model of BasicVector.
x  is an object of type X.
i  is an object of type vector traits::size type.

Valid expressions
• Element Access
x[i]
Return Type:  vector traits::reference if x is mutable, vector traits::const reference otherwise.
Semantics:    Returns the element with index i.

• Size
size(x)
Return Type:  vector traits::size type
Semantics:    Returns the extent of the index set for the vector. For sparse vectors size(x) will typically be much larger than the number of stored elements.

Complexity Guarantees
Unlike RandomAccessContainer, the BasicVector concept does not guarantee amortized constant time for element access (operator[]) since that would rule out
sparse vectors. Element access is only guaranteed to be linear time in the number of non-zeroes in the vector. The most efficient method and also the preferred
method for accessing elements of sparse vectors is to use an iterator which is
introduced in the Vector concept.
Models
• std::valarray
• std::vector
• mtl::vector::type
Constraints Checking Class
template <class X>
struct BasicVector
{
  typedef typename linalg_category<X>::type category;
  typedef typename vector_traits<X>::sparsity sparsity;
  enum { CATEGORY = linalg_category<X>::RET,
         SPARSITY = vector_traits<X>::SPARSITY,
         SIZE = vector_traits<X>::SIZE };
  typedef typename vector_traits<X>::size_type size_type;
  typedef typename vector_traits<X>::value_type value_type;
  typedef typename vector_traits<X>::reference reference;
  typedef typename vector_traits<X>::const_reference const_reference;
  typedef typename vector_traits<X>::pointer pointer;
  typedef typename vector_traits<X>::const_pointer const_pointer;
  void constraints() {
    reference r = x[n];
    const_constraints(x);
  }
  void const_constraints(const X& x) {
    const_reference r = x[n];
    n = size(x);
  }
  X x;
  size_type n;
};
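As an illustration, the following sketch sums the elements of a dense vector using only the BasicVector expressions x[i] and size(x). The free size() overload for std::valarray is an illustrative stand-in so the example is self-contained; MTL vector types already support the size(x) expression as documented above.

#include <cstddef>
#include <valarray>

template <class T>
std::size_t size(const std::valarray<T>& x) { return x.size(); }

// Suitable for dense vectors; sparse vectors are better traversed with the
// iterators introduced by the Vector concept.
template <class X>
double dense_sum(const X& x)
{
  double s = 0.0;
  for (std::size_t i = 0; i < size(x); ++i)
    s += x[i];
  return s;
}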

7.4.2  Vector

This section describes the MTL Vector concept, which is not to be confused with the std::vector class or the family of mtl::vector::type classes. All of the vector classes in MTL model this Vector concept. That is, they fulfill the requirements (member functions and associated types) described here. The MTL Vector concept is a refinement of ForwardCollection (not Container) and BasicVector, which adds the property that each element of a Vector has a unique corresponding index. The index can be used to access the corresponding element through the vector's bracket operator (i.e., x[i]). The elements do not have to be sorted by their index, and the indices do not necessarily have to start at zero (though they often are sorted and do start at zero).
Refinement of
ForwardCollection and BasicVector
Invariants
The invariant x[i] == *(x.begin() + i) that applies to RandomAccessContainer does not apply to Vector, since x[i] is defined for Vector to return the element with the ith index, which is not required to be the element at position i of the container.
Models
• mtl::vector::type
Constraints Checking Class
template <class X>
struct Vector
{
  CLASS_REQUIRES(X, ForwardCollection);
  CLASS_REQUIRES(X, BasicVector);
};

7.4.3  SparseVector

The SparseVector concept adds several operations that are typically needed for
sparse vectors. The number of non-zeroes (number of stored elements) in the
vector can be accessed with the nnz(x) expression. The index of an element in
the sparse vector can be obtained from the iterator, using the index() member
function. For example, to print out the elements of the vector as index-value
pairs one can write this:
template <class SparseVector>
void print_sparse_vector(const SparseVector& x) {
  typename SparseVector::const_iterator i;
  for (i = x.begin(); i != x.end(); ++i)
    cout << "(" << i.index() << "," << *i << ") ";
}

If one wishes to examine only the indices of the sparse vector, the nz struct(x)
function can be used to obtain a view of the element indices.
template <class SparseVector>
void print_vector_indices(const SparseVector& x) {
  typedef typename nonzero_structure<SparseVector>::type NzStruct;
  typename NzStruct::const_iterator i;
  NzStruct x_nz = nz_struct(x);
  for (i = x_nz.begin(); i != x_nz.end(); ++i)
    cout << *i << " ";
}

Constraints Checking Class
template <class X>
struct SparseVector
{
  CLASS_REQUIRES(X, Vector);
  typedef typename X::iterator iterator;
  typedef typename X::const_iterator const_iterator;
  CLASS_REQUIRES(iterator, IndexedIterator);
  CLASS_REQUIRES(const_iterator, IndexedIterator);
  typedef typename nonzero_structure<X>::type NonZeroStruct;
  CLASS_REQUIRES(NonZeroStruct, Collection);
  typedef typename X::size_type size_type;
  void constraints() {
    n = nnz(x);
    z = nonzero_structure(x);
  }
  X x;
  size_type n;
  NonZeroStruct z;
};

Refinement of
Vector
Associated Types
• Iterator
X::iterator

The iterator must be a model of IndexedIterator.
• Non-Zero Structure Type
nonzero structure::type

Valid expressions
• Number of Non-Zeroes
nnz(x)

Return Type:
Semantics:

X::size type

returns the number of stored elements in the vector.
Note that if an element stored in the vector happens to
be zero it is still counted towards the nnz. For dense
vectors, nnz(x) == size(x). For sparse vectors nnz(x)
is typically much smaller than size(x).

• Non-Zero Structure Access
nz struct(x)

Return Type:
Semantics:

nz struct::type

Returns a Collection that consists of the indices of the
elements in the vector x. The size() of the collection
is nnz(x).

Invariants
x[i] == *iter if and only if iter.index() == i.

Complexity Guarantees
The nnz() and nz struct() functions are amortized constant time.
Models
• mtl::vector >::type
• mtl::vector >::type
• mtl::vector >::type

7.4.4  SubdividableVector

Constraints Checking Class
template <class X>
struct SubdividableVector
{
  CLASS_REQUIRES(X, Vector);
  void constraints() {
    sub_x = x[range(s,f)];
  }
  typedef typename X::size_type size_type;
  size_type s, f;
  X x;
  typename subvector<X>::type sub_x;
};

Refinement of
Vector
Associated Types
• Sub-Vector Type
subvector::type

The type used to represent sub-vector “views” of the vector.
Notation
X  is a type that is a model of Vector.
x  is an object of type X.
r  is an object of type std::pair.

Valid expressions
• Sub-Vector Access
x[r]
Return Type:  subvector::type
Semantics:    Returns a sub-vector "view" of x. The region is defined by the half-open interval [r.first,r.second). That is, the region includes x[r.first] but not x[r.second]. The range() function can be used for conveniently creating the r argument from a pair of indices.

Complexity Guarantees
• The sub-range access is only guaranteed to be linear time (but is typically
constant time for dense vectors).
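For example, a routine that scales part of a vector can be written with nothing but the x[range(s,f)] expression and element access on the resulting view. This is a sketch that assumes an MTL vector type modeling SubdividableVector and a mutable sub-vector view.

template <class X>
void scale_first_half(X& x, double a)
{
  // take a view of the first half of x and scale it element-wise
  typename subvector<X>::type first_half = x[range(0, size(x) / 2)];
  for (typename X::size_type i = 0; i < size(first_half); ++i)
    first_half[i] *= a;
}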

7.4.5  ResizableVector

Constraints Checking Class
template <class X>
struct ResizableVector
{
  CLASS_REQUIRES(X, Vector);
  typedef typename X::size_type size_type;
  void constraints() {
    x.resize(n);
  }
  X x;
  size_type n;
};

Refinement of
Vector
Valid expressions
• Resize Vector
x.resize(n)
Return Type:  void
Semantics:    Changes the size of the vector to n. New elements will be initialized to zero.

Complexity Guarantees
• The resize function is guaranteed to be linear in the size of the vector.

7.5  Matrix Concepts

Figure 7.2: Refinement of the matrix concepts (BasicMatrix, Matrix, Matrix1D, VectorMatrix, SubdivideableMatrix, StrideableMatrix, BandedMatrix, TriangularMatrix, FastDiagMatrix, ResizeableMatrix).

7.5.1  BasicMatrix

The BasicMatrix concept defines the associated types and operations that are common to all MTL matrix types. A matrix consists of elements, each of which has a row and column index. The expression A(i,j) returns the element with row index i and column index j. For most matrix algorithms, one needs to traverse through all of the elements of a matrix. One could use A(i,j) to do this, but for some matrix types this is inefficient (e.g., sparse matrices) or can be confusing (e.g., banded matrices). The iterators provided by the Matrix and Matrix1D concepts provide a better method for efficiently traversing a matrix, and the SubdividableMatrix concept defines a nice interface for accessing slices and sub-matrices.

There are several traits classes used with the BasicMatrix concept: linalg traits and matrix traits. The linalg traits class is shared with the MTL vector and scalar concepts. For each of the tags defined in these traits classes there is also a numerical constant defined, for example matrix traits::shape and matrix traits::SHAPE. Both the tag and the constant are provided because the type tag is needed to implement external polymorphism (dispatching based on tag categories), while the constant is more convenient to use in compile-time (template meta-programming) logic.
Associated Types
The matrix traits class provides access to the associated types of a BasicMatrix,
though the linalg traits class can also be used for some of the associated types.
• Category
linalg category::type and linalg category::RET
For matrices this is always matrix tag and MATRIX.

72

CHAPTER 7. CONCEPTS
• Element Type
matrix traits::element type

The type of the elements contained in the matrix. The choice of naming
this element type over value type is to help differentiate the element type
from the 1-D type which for most MTL matrices is given by X::value type
(as they are a 2-D Collection).
• Reference
matrix traits::reference The mutable reference type associated with

the element type.
• Const Reference
matrix traits::const reference The constant (immutable) reference

type associated with the element type.
• Size Type
matrix traits::size type

The type used for expressing matrix indices and dimensions.
• Sparsity
matrix traits::sparsity
Access whether the matrix is dense or sparse (dense tag or sparse tag).
• Dimension
matrix traits::dimension

Access the dimension of the matrix (oned tag or twod tag).
• Static Number of Rows
matrix traits::static nrows

If the matrix is static (size determined at compile-time) then static nrows
gives the number of rows in the matrix. Otherwise static nrows is dynamic sized.
• Static Number of Columns
matrix traits::static ncols
If the matrix is static (size determined at compile-time) then static ncols gives the number of columns in the matrix. Otherwise static ncols is dynamic sized.
• Shape
matrix traits::shape

The general layout of where the non-zeroes appear in the matrix. The
possible types are rectangle tag, banded tag, triangle tag, symmetric tag,
hermitian tag, or diagonal tag.
• Orientation
matrix traits::orientation

The natural order of traversal of the matrix, either row tag, column tag,
or diagonal tag, or no orientation tag.

Notation
X
A
i,j

is a type that is a model of BasicMatrix.
is an object of type X.
are objects of type matrix traits::size type.

Valid Expressions
• Element Access
A(i,j)

Return Type:
Semantics:

reference if A is mutable, const reference otherwise
Returns the element at row i and column j.

• Number of Rows
nrows(A)

Return Type:

size type

• Number of Columns
ncols(A)

Return Type:

size type

• Number of Non-Zeroes
nnz(A)

Return Type:
Semantics:

size type

Returns the number of stored elements in the matrix. Note that if an element that is stored in the matrix happens to be zero it is still counted towards the nnz. For dense matrices nnz(A) == nrows(A) * ncols(A). For sparse and banded matrices nnz(A) is typically much smaller than nrows(A) * ncols(A).

Complexity Guarantees
• Element access is only guaranteed to be O(mn). The time complexity for this operation varies widely from matrix type to matrix type. For dense matrices it is constant, while for coordinate-scheme sparse matrices it is on average mn/2. See the documentation for each matrix type for the specific time complexity.
• The nrows(), ncols(), and nnz() members are all constant time.
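As an illustration, the following sketch adds up all elements of a matrix using only the BasicMatrix expressions A(i,j), nrows(A), and ncols(A). As noted above, this style of traversal is only appropriate for dense matrices; the sketch also assumes the element type is initializable from 0.

template <class Matrix>
typename matrix_traits<Matrix>::element_type
element_sum(const Matrix& A)
{
  typename matrix_traits<Matrix>::element_type s = 0;
  for (typename matrix_traits<Matrix>::size_type i = 0; i < nrows(A); ++i)
    for (typename matrix_traits<Matrix>::size_type j = 0; j < ncols(A); ++j)
      s += A(i, j);
  return s;
}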

7.5.2  VectorMatrix

A VectorMatrix is basically a vector that is also a matrix. Examples of this include a row or column section of a matrix, or a free-standing vector which is being used to represent a row matrix or column matrix (a matrix consisting of a single row or column). MTL vectors are by default considered to be a column matrix and have column tag for their orientation. For the most part, the properties of the VectorMatrix derive from being a Vector, though the VectorMatrix also fulfills the requirements for BasicMatrix. The iterator and const iterator types of the VectorMatrix must model MatrixIterator.
Refinement of
Vector and BasicMatrix
Associated Types
VectorMatrix inherits its associated types from Vector and BasicMatrix.
Valid Expressions
• Transposed View of the Vector
trans(x)

Return Type:
Semantics:
Complexity:

transpose view::type

Creates a transposed view of the vector. If the vector
was a row it is now a column and vice-versa.
Constant time.

Models
• matrix::type::value type (a row of a dense matrix)
• vector::type

7.5.3  Matrix1D

A Matrix1D is a matrix that provides an iterator type that traverses all of the
elements of the matrix in one pass. Dereferencing the iterator gives an element
of the matrix, so std::iterator traits::value type is the same
type as matrix traits::value type. The iterator type must be a model of
MatrixIterator.
Refinement of
BasicMatrix and Collection

7.5.4  Matrix

The main interface to the Matrix concept is basically that of a two-dimensional Collection (which is very similar to the STL Container concept). It has begin() and end() methods which provide access to 2-D iterators. The 2-D iterators dereference to give 1-D sections of the matrix, which also have begin() and end() methods for access to the 1-D iterators. The 1-D iterators dereference to give matrix elements. The 1-D iterators (which model MatrixIterator) provide access to the row and column index of each element, through the row() and column() methods. These iterators provide an efficient traversal method for any MTL matrix type, and usually correspond to the "natural" traversal order given by how the matrix is stored in memory. For matrices with special structure, such as banded or sparse matrices, only the stored elements are traversed.

If the 1-D iterators traverse down each row, then we call the matrix row-oriented. If the traversal goes down each column, we call the matrix column-oriented. Some MTL matrices are even diagonally-oriented. The 1-D sections of a matrix can also be accessed via the bracket operator A[i] (which gives the ith 1-D section). The type of the 1-D section is given by X::value type (where X is some matrix type), which is a model of VectorMatrix. The element type of the matrix is accessed through matrix traits::value type as described in BasicMatrix.
Example
Print out the elements of a matrix.
template <class Matrix>
void print_matrix(const Matrix& A)
{
  typedef typename matrix_traits<Matrix>::const_iterator Iter2D;
  typedef typename Matrix::OneD::const_iterator Iter1D;
  for (Iter2D i = A.begin(); i != A.end(); ++i)
    for (Iter1D j = (*i).begin(); j != (*i).end(); ++j)
      cout << "(" << j.row() << "," << j.column() << ") = " << *j << endl;
}

Refinement of
BasicMatrix and RandomAccessCollection
Notation
X
A
i,j

is a type that is a model of Matrix.
is an object of type X.
are objects of type matrix traits::size type.

Associated Types
Matrix inherits the associated types from BasicMatrix and RandomAccessCollection.
Valid Expressions
Most of the valid expressions for Matrix are inherited from the concepts it refines, but we restate them here for convenience of reference. The three new
expressions, trans(A), abs(A), and conj(A) are described first.

76

CHAPTER 7. CONCEPTS
• Transposed View of a Matrix
trans(A)

Return Type:
Semantics:
Complexity:

transpose view::type

Creates a transposed view of the matrix. Access to an
element at A(i,j) will retrieve the element at A(j,i).
Constant time.

• Absolute Value
abs(A)

Return Type:
Semantics:

a Matrix with element type magnitude::type (where
T is matrix traits::value type).
applies abs() to each element of the matrix.

• Conjugate
conj(A)

Return Type:
Semantics:

convertible to X
applies conj() to each element of the matrix.

• Element Access
A(i,j)

Return Type:
Semantics:

matrix traits::reference
if A is mutable,
matrix traits::const reference otherwise
Returns the element at row i and column j.

• Number of Rows
nrows(A)

Return Type:

X::size type

• Number of Columns
ncols(A)

Return Type:

X::size type

• Number of Non-Zeroes
nnz(A)

Return Type:
Semantics:

X::size type

Returns the number of stored elements in the matrix. Note that if an element that is stored in the matrix happens to be zero it is still counted towards the nnz. For dense matrices nnz(A) == nrows(A) * ncols(A). For sparse and banded matrices nnz(A) is typically much smaller than nrows(A) * ncols(A).

• 2-D iterator access to beginning of range
A.begin()

Return Type:
Semantics:

X::iterator if A is mutable, X::const iterator otherwise.
returns an iterator pointing to the first 1D section of
the matrix.

7.5. MATRIX CONCEPTS

77

• 2-D iterator access to end of range
A.end()

Return Type:

X::iterator if A is mutable, X::const iterator other-

Semantics:

wise.
returns an iterator pointing after the last 1D section of
the matrix.

• OneD Access
A[i]

Return Type:
Semantics:

X::reference if A is mutable, X::const reference oth-

erwise
returns the ith one-dimensional section of the matrix
(e.g., for a row-oriented matrix this returns the ith
row).

• Size
A.size(),
size(A)

Return Type:
Semantics
Invariants:
Postcondition:

X::size type

Returns the number of 1D sections (typically rows or
columns) in the matrix.
A.size() == A.end() - A.begin()
A.size() >= 0

• Maximum size
A.max size()

Return Type:
Semantics:
Postcondition:

X::size type

Returns the largest size that this Matrix can ever have.
a.max size() >= 0 && a.max size() >= a.size()

• Empty Matrix
A.empty()

Return Type:
Semantics:

Convertible to bool
Equivalent to a.size() == 0.

• Swap
A.swap(B)

Return Type:
Semantics:

void

Equivalent to swap(A,B)

Complexity Guarantees
• Element access time via A(i,j) is guaranteed to be O(m + n).
• All iterator access methods and iterator operations are constant time.
• The 1-D access though operator bracket is constant time.

78

CHAPTER 7. CONCEPTS

Examples
A transposed view of a small matrix. The transposed view is column-oriented,
which means that trans(A)[0] is a column (whereas A[0] is a row).
matrix::type A(2,2);
A(0,0) = 1; A(0,1) = 2;
A(1,0) = 3; A(1,1) = 4;
cout << "A =" << endl << A << endl;
cout << "A[0] = " << A[0] << endl;
cout << "trans(A) =" << endl << trans(A) << endl;
cout << "trans(A)[0] = " << trans(A)[0] << endl;

The output is:
A =
[1 2;
3 4]
A[0] = [1 2]
trans(A) =
[1 3;
2 4]
trans(A)[0] = [1 2]

7.5.5  BandedMatrix

A BandedMatrix provides an interface that restricts access to a region or band of a matrix, which is all elements aij where i − j < l and j − i < u, l being the number of sub-diagonals (below the main diagonal) and u the number of super-diagonals (above the main diagonal). So the shape of the band is described by the pair (l, u) and the bandwidth is l + u + 1. A BandedMatrix is typically used when a matrix contains mostly zero elements, and all of the non-zeroes fall within the band.

Elements outside of the band are considered out-of-bounds, and attempts to access those elements (via A(i,j) or A[i][j]) will raise an exception. The begin() and end() methods of the matrix's 1-D sections are modified so that the iterators traverse only the band of the matrix. If the BandedMatrix is also a SubdividableMatrix, then it is only valid to ask for sub-matrix, sub-row, and sub-column regions that are entirely within the band. However, if all is specified, then this is taken to mean all of the region within the band.

Refinement of
Matrix

7.5. MATRIX CONCEPTS

79

Requirements
• Number of Sub-Diagonals
nsub(A)
Return Type:  size type
Semantics:    Returns the number of sub-diagonals in the bandwidth.

• Number of Super-Diagonals
nsuper(A)
Return Type:  size type
Semantics:    Returns the number of super-diagonals in the bandwidth.

7.5.6  TriangularMatrix

A TriangularMatrix provides an interface that restricts access to either the upper or lower triangle of the matrix. A TriangularMatrix is a kind of BandedMatrix, where the bandwidth is (m − 1, 0) for a lower triangular matrix, and (0, n − 1) for an upper triangular matrix.
One common case is for a triangular matrix to have all ones on the main diagonal, in which case it is called a unit diagonal matrix. A TriangularMatrix that is declared unit diagonal does not allow access to the main diagonal (there is no need to access those elements since they are always one). Some MTL algorithms are more efficient when handling triangular matrices that are declared unit diagonal.
Refinement of
BandedMatrix
Requirements
• Is Upper Triangular?
is upper(A)

Return Type:
Semantics:

bool

Returns true if the upper triangular region of the matrix contains the non-zeroes. In the case of a symmetric
matrix, this returns true if the elements of the upper
triangle are the ones that are actually stored and accessed.

• Is Lower Triangular?
is lower(A)

Return Type:
Semantics:

bool

Returns true if the lower triangular region of the matrix contains the non-zeroes. In the case of a symmetric
matrix, this returns true if the elements of the lower triangle are the ones that are actually stored and accessed.

• Is Unit Diagonal?
is unit diag(A)
Return Type:  bool
Semantics:    Returns true if the diagonal of the matrix is not stored, and can be assumed to be all ones. This allows some algorithms to be more efficient.
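These queries let generic code adapt to the matrix at hand. For example, the following sketch (the function describe_triangle is ours, and it assumes the underscored spellings is_upper and is_unit_diag) reports which triangle is stored and whether the diagonal is implicit.

#include <string>

template <class TriMatrix>
std::string describe_triangle(const TriMatrix& A)
{
  std::string s = is_upper(A) ? "upper" : "lower";
  if (is_unit_diag(A))
    s += " unit-diagonal";
  return s + " triangular";
}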

7.5.7  StrideableMatrix

A dense matrix is typically stored in a row-major or column-major fashion in memory. By default MTL provides a row-oriented or column-oriented interface, respectively, for accessing these matrices. Sometimes, however, it is necessary to access the columns of a row-major matrix, or the rows of a column-major matrix. The StrideableMatrix concept defines the interface for creating a new matrix object that provides this kind of strided view of the original matrix.
Requirements
• Row Oriented View Type
row view::type
The type of the object returned by rows(A). This type must be a model of Matrix.
• Column Oriented View Type
column view::type
The type of the object returned by columns(A). This type must be a model of Matrix.
• Row Oriented View Access Function
rows(A)

Return Type:
Semantics:
Complexity:

row view::type

Create a row-oriented “view” of a matrix.
Constant time.

• Column Oriented View Access Function
columns(A)

Return Type:
Semantics:
Complexity:

column view::type

Create a column-oriented “view” of a matrix.
Constant time.

Example
Inspecting a row and column oriented view of the same matrix.
typedef matrix::type Matrix;
Matrix A(2,2);
A(0,0) = 1; A(0,1) = 2;
A(1,0) = 3; A(1,1) = 4;
row_view<Matrix>::type Ar = rows(A);
column_view<Matrix>::type Ac = columns(A);
cout << "A =" << endl << A << endl;
cout << "rows(A)[0] = " << Ar[0] << endl;
cout << "columns(A)[0] = " << Ac[0] << endl;

The output is:
A =
[1 2;
3 4]
rows(A)[0] = [1 2]
columns(A)[0] = [1 3]

7.5.8  SubdividableMatrix

A SubdividableMatrix provides methods for accessing sub-matrices, sub-rows, and sub-columns of the matrix. The syntax is similar to that of MATLAB, though the ":" which is used in MATLAB to specify all of a row or all of a column has been replaced by all (since ":" is not a legal C++ identifier). Unlike MATLAB (and most array libraries) the ranges are specified with half-open intervals instead of closed intervals (e.g., A(range(0,2),range(0,3)) is a 2 × 3 matrix, not 3 × 4). Specifying ranges with half-open intervals is consistent with C++ Standard iterators and with C-style loop indexing conventions.
Note that the origin for the row and column indices for sub-sections of a matrix is always reset to (0, 0).
Refinement of
StrideableMatrix
Requirements
X
A
i,j
r1,r2
all

is a type that is a model of Matrix.
is an object of type X.
are objects of type size type.
are objects of type std::pair.
is the only value of the enumerated type all index e.

• Sub-Matrix Type
submatrix::type

The type for sub-matrix views into the matrix.
• Sub-Row Type
subrow::type

The type for sub-row views into the matrix.
• Sub-Column Type
subcolumn::type

The type for sub-column views into the matrix.
• Column Access
A(all,j)

Return Type:  subcolumn::type
Semantics:    Return the jth column.

• Sub-Column Access
A(r1,j)

Return Type:  subcolumn::type
Semantics:    Return a sub-section of the jth column, with row indices
              in the range [r1.first, r1.second).

• Row Access
A(i,all)

Return Type:  subrow::type
Semantics:    Return the ith row.

• Sub-Row Access
A(i,r2)

Return Type:  subrow::type
Semantics:    Return a sub-section of the ith row, with column indices
              in the range [r2.first, r2.second).

• Sub-Matrix Access
A(r1,all)

Return Type:  submatrix::type
Semantics:    Return the sub-matrix whose top-left element is
              (r1.first,0) and bottom-right element is
              (r1.second-1,N-1).

• Sub-Matrix Access
A(all,r2)

Return Type:  submatrix::type
Semantics:    Return the sub-matrix whose top-left element is
              (0,r2.first) and bottom-right element is
              (M-1,r2.second-1).

• Sub-Matrix Access
A(r1,r2)

Return Type:  submatrix::type
Semantics:    Return the sub-matrix whose top-left element is
              (r1.first,r2.first) and bottom-right element is
              (r1.second-1,r2.second-1).

• Range Creation
range(b,e)

Return Type:  std::pair
Semantics:    A helper function for specifying ranges.



Example
A textbook implementation of Gaussian elimination with partial pivoting.
template <class Matrix, class Vector>
void gaussian_elimination(Matrix& A, Vector& pivots)
{
  CLASS_REQUIRES(Matrix, SubdividableMatrix);
  typename matrix_traits<Matrix>::size_type
    m = A.nrows(), n = A.ncols(), pivot, i, k;
  typename matrix_traits<Matrix>::value_type s;
  for (k = 0; k < std::min(m-1,n-1); ++k) {
    pivot = k + max_abs_index( A(range(k,m),k) );
    if (pivot != k)
      swap(A(pivot,all), A(k,all));
    pivots[k] = pivot;
    if (A(k,k) != zero(s))
      for (i = k + 1; i < m; ++i) {
        s = A(i,k) / A(k,k);
        A(i,all) -= s * A(k,all);
      }
  }
}

7.5.9  ResizeableMatrix

A ResizeableMatrix can grow or shrink using the resize() method. When growing, the new elements are initialized to zero (or to the default value for the
element type) and the old elements of the matrix remain unchanged. The time
complexity of this operation can vary quite a bit from matrix type to matrix
type. The array<>-based matrices can grow and shrink quickly along the major dimension.
Refinement of
BasicMatrix
Requirements
X    is a type that is a model of Matrix.
A    is an object of type X.
m,n  are objects of type matrix_traits<X>::size_type.


• Resize Matrix Dimensions
A.resize(m, n)

Return Type:  void
Semantics:    Changes the dimensions of the matrix to m × n.
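
For illustration, here is a minimal sketch of growing a matrix with resize(). The mtl::matrix<double> generator invocation is an assumption (see Chapter 11) and is not itself part of this concept.

typedef mtl::matrix<double>::type Matrix;   // generator arguments assumed
Matrix A(2, 2);
A(0,0) = 1.0;
A.resize(4, 4);          // old elements unchanged, new elements zero
assert(A(0,0) == 1.0);
assert(A(3,3) == 0.0);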


7.5.10  FastDiagMatrix

A FastDiagMatrix provides an interface for fast traversal of the main diagonal
of the matrix. The diag(A) function returns a Collection object which provides
a view to the diagonal elements (a_ii, for i = 0 ... min(m, n)) of the matrix.
Refinement of
BasicMatrix
Requirements
X    is a type that is a model of FastDiagMatrix.
A    is an object of type X.


• The Main Diagonal View Type
diagonal_view::type

This calculates the type of the object returned by diag(A), which is required to be a model of Collection.
• Main Diagonal Access Function
diag(A)

Return Type:  diagonal_view::type
Complexity:   Constant time.


Example
Calculate the trace of a matrix, which is the sum of the elements on the main
diagonal.
namespace mtl {
  template <class FastDiagMatrix>
  typename matrix_traits<FastDiagMatrix>::value_type
  trace(const FastDiagMatrix& A)
  {
    return sum(diag(A));
  }
}

Chapter 8

Arithmetic Types and
Classes


Chapter 9

Object Model and Memory
Management
9.1  Object Model

The MTL object-model defines what happens when you construct, copy, and
assign one vector or matrix object to another. Most MTL vector and matrix
classes behave like “handles” to the underlying data, and use shallow-copy semantics. For example:
mtl::vector<double>::type x(5, 1.0), y, z(5);
y = x;
mtl::copy(x, z);
y[2] = 3.0;
cout << x << endl;
cout << z << endl;
The output is:
[1,1,3,1,1]
[1,1,1,1,1]
The shallow-copy semantics means that for most MTL objects copy and
assignment are fast constant-time operations.
The shallow-copy semantics does not apply to the stack-allocated, static-sized vector and matrix objects. The reason for this is that it is impossible to
implement stack-allocated objects with shallow-copy semantics in C++. When
passing MTL objects to functions, if you plan to use both stack-allocated and
heap-allocated vectors it is best to pass by reference.
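
As a brief illustration of this advice, the following sketch (the function name is hypothetical) accepts any MTL vector by reference, so it behaves the same for deep-copy stack-allocated vectors and handle-style heap-allocated vectors:

template <class Vector>
void scale_by_two(Vector& x)   // pass by reference: no copy semantics involved
{
  for (typename Vector::iterator i = x.begin(); i != x.end(); ++i)
    *i *= 2;
}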

9.2  Memory Management

In MTL there are three memory management categories for vector and matrix
objects:
• stack-allocated
• external memory (the MTL object is just a view to pre-existing memory,
created via a pointer to that memory)
• heap-allocated (the normal case)
For the first two categories, MTL does no memory management, as none is
needed. For heap-allocated objects, MTL automatically keeps a reference-count
of the underlying data objects, and frees the memory when the reference count
reaches zero. A known limitation of reference counting is that if there are cycles
in the graph of reference-counted pointers then the cycles will not be deallocated
properly. This is not a concern for vector and matrix objects, which typically
do not contain other vectors and matrices, and when they do, it is in a tree
structure (for sub-matrices) and does not form cycles.
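
A practical consequence is that heap-allocated MTL objects can be returned by value cheaply: the handles share one reference-counted data object, which is freed when the last handle is destroyed. A minimal sketch (the generator arguments are assumed):

mtl::vector<double>::type make_ones(int n)
{
  mtl::vector<double>::type x(n, 1.0);
  return x;           // constant-time handle copy; the element data is not duplicated
}
// ...
mtl::vector<double>::type y = make_ones(5);
// the element storage is released when the last handle to it goes out of scope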

Chapter 10

Vector Classes:
vector::type


In MTL there are a fair number of vector types, so to make the selection
process easier this vector type generator is provided. The mtl::vector class 1
is for selecting the type of vector that you want to use. This section describes
the choices that are possible for the template parameters of the mtl::vector
type generator. For most common uses the vector type given by the default
arguments is the correct one, so you can just declare a vector with the type
mtl::vector<double>::type. For example,

// create a vector of double precision numbers with size 10
mtl::vector<double>::type x(10);
// assign a value into the vector
x[4] = 2.14159;
The sections following this one describe the actual vector classes in more detail.
Template Parameters
T        The value type, the type of object stored in the vector.
Storage  Selects the way in which the elements are stored in memory. Possible
         choices are dense, static_size, sparse_pair, compressed, and tree.
         See below for a description of the storage types.
         Default: dense<>
Orien    The orientation of the vector, either row_major or column_major.
         This argument is important only when vectors are used inside
         operator expressions.
         Default: column_major

Storage Type Selection
The first decision to make when choosing a storage type is whether you want a
sparse or dense vector. Second you must decide whether you want new memory
allocated for the vector elements (internal memory), or if you are just creating
a vector “view” to pre-existing memory (external memory). If memory will be
allocated you can go with the default allocator, or provide a custom allocator.
In addition, for dense vectors you can choose stack-allocated memory, in which
case the vector will have static (constant) size. For external memory vectors,
use external for the Allocator template argument and then supply a pointer to
the memory in vector object’s constructor.
For static-sized and external memory vectors, MTL performs no memory
management, as none is needed. For heap-allocated vectors MTL automatically
keeps a reference count of the element data. See Section 9.1 for a discussion of
how MTL objects use shallow-copy semantics and keep reference counts. The
following is a list of the valid choices for the Storage template parameter; a
short sketch of selecting each one follows the list. The storage types are
described in more detail in the following sections.

1 It is a shame that the name vector was chosen for the dynamic array class in the C++
standard. If you are using both the std::vector class and MTL at the same time, it is not
advisable to do using mtl::vector; or using namespace mtl;.
• dense
A general purpose vector, typically with heap-allocated memory.
• static_size
A vector whose size is constant and memory is stack-allocated.
• sparse_pair
This is a sparse vector in which index-value pairs are stored in an array.
• compressed
This is a sparse vector in which the indices and element values are stored
in parallel arrays.
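
A minimal sketch of selecting each of these with the generator; the exact selector arguments (such as the static size N) are assumptions based on the sections below:

typedef mtl::vector< double, dense<> >::type          DenseVec;       // heap-allocated
typedef mtl::vector< double, static_size<3> >::type   SmallVec;       // stack-allocated, size 3
typedef mtl::vector< double, sparse_pair<> >::type    SparsePairVec;  // index-value pair array
typedef mtl::vector< double, compressed<> >::type     CompressedVec;  // parallel index/value arrays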
Model of
Vector
Members
The MTL vector classes meet the requirements for the Vector concept, and
therefore have the member functions and associated types defined in Vector and
in the concepts that Vector refines, which include Linalg and ReversibleCollection.
The constructors for each vector type are described in the following sections.


10.1  Dense Vectors

10.1.1  vector<T, dense<Allocator>, Orien>::type

This is the main vector class. The vector elements are stored contiguously on
the heap, and the iterators are random-access. Element access with operator[]
is constant time.
Example
// This is a dumb example, replace it! -JGS
typedef std::complex<double> C;
typedef mtl::vector< C, dense<> >::type Vec;
Vec x(100);
int cnt = 0;
for (Vec::iterator i = x.begin(); i != x.end(); ++i) {
  *i = C(cnt, cnt + 1);
  ++cnt;
}
cout << x[40] << endl;

The output is:
(40,41)

Template Parameters
T          is the vector’s value type, the type of object stored in the vector.
Allocator  is the allocator used to obtain memory for storing the elements of
           the vector. In addition to C++ standard conforming allocators
           (models of the Allocator concept), you can choose external, which
           creates an MTL vector “view” to pre-existing memory.
           Default: std::allocator
Orien      is the orientation of the vector, either row_major or column_major.
           This argument is important only when vectors are used inside
           operator expressions.
           Default: column_major

Model of
ResizableVector, SubdividableVector, RandomAccessCollection, and VectorSpace.
Members
In addition to the member functions required by ResizableVector, SubdividableVector and RandomAccessCollection, the following members are defined. We use self
as a placeholder for the actual class name.
self(const Allocator& a = Allocator())


This is the default constructor, which creates a vector of size zero.
self(size_type n, const T& init = T(),
const Allocator& a = Allocator())

Creates a vector of size n and initializes all the elements to the value init.

self(const self& x)

The copy constructor. This performs a shallow copy, which means that
the underlying data will be shared between this vector and vector x. The
reference count of the data object is incremented.
~self()

The destructor. The reference count of the underlying vector data is
decremented.
self& operator=(const self& x)

The assignment operator. Like the copy constructor, this makes a shallow
copy and increments the reference count of the data object.

10.1.2  vector<T, dense<external, N>, Orien>::type

Use this class to create a vector “view” to pre-existing memory. This is useful
when interfacing with legacy code or to other programming languages. This
MTL vector claims no responsibility for the lifetime of the underlying memory,
and some care must be taken to ensure that the memory lasts longer than any
use of this vector object.
Example
extern "C" float sum(float* xp, int n)
{
typedef mtl::vector >::type Vec;
Vec x(xp, n);
return mtl::sum(x);
}

(JGS comment: Should there be a template argument to specify if it is a
const pointer and therefore must be a const vector?)


Template Parameters
T      is the value type, the type of object stored in the vector.
N      specifies the static size of the vector. A value of dynamic_size
       denotes a vector with size determined at run-time.
       Default: dynamic_size
Orien  is the orientation of the vector, either row_major or column_major.
       This argument is important only when vectors are used inside
       operator expressions.
       Default: column_major

Model of
SubdividableVector, RandomAccessCollection, and VectorSpace.
Members
In addition to the member functions required by SubdividableVector and RandomAccessCollection, the following members are defined. We use self as a placeholder for the actual class name.
self()

This is the default constructor, which creates an empty vector handle.
You will need to assign another vector to this handle before it will be
useful.
self(T* data, size_type n)

Construct a view to the existing memory given by the data pointer.
self(const self& x)

The copy constructor. This performs a shallow copy, which means that
the underlying data will be shared between this vector and vector x.
self& operator=(const self& x)

The assignment operator. Like the copy constructor, this makes a shallow
copy.
~self()

The destructor. This function does nothing.

10.1.3  vector<T, static_size<N>, Orien>::type

This is a dense vector that is stack-allocated. It is especially advisable to use
this vector type when the vector size is small (less than 20), since it avoids the
overhead of heap allocation. The vector size must be constant, and specified in
the template argument. Unlike most MTL vectors, this vector is not initialized
to some value (zero by default) but left uninitialized. Also, this vector type
has deep copy semantics instead of the shallow copy semantics that is usual for MTL
objects (see Section 9.1 for details).
Example
const int N = 3;
typedef mtl::vector< double, static_size<N> >::type Vec;
Vec x; // there is only a default constructor
// but you can also use initializer list syntax
Vec y = { 4.0, 1.2, 5.6 }; // not true anymore! -JGS
x = y; // copy and assignment is deep for stack-allocated vectors
x[0] = 3.0;
assert(x[0] != y[0]);

Template Parameters
T      is the value type, the type of object stored in the vector.
N      specifies the static size of the vector.
Orien  is the orientation of the vector, either row_major or column_major.
       This argument is important only when vectors are used inside
       operator expressions.
       Default: column_major

Model of
SubdividableVector, RandomAccessCollection and VectorSpace.
Members
This vector implements the member functions and associated types defined in
the SubdividableVector, RandomAccessCollection, and VectorSpace concepts (and
the concepts they refine).
self()

Default constructor.


10.2  Sparse Vectors

A sparse vector provides an efficient method for storing vectors when most of
the elements are zero. The sparse vectors only store the non-zero elements and
the corresponding index (location) of each element. The MTL sparse vectors
have the same Vector interface as the dense vectors. They provide the usual
iterators for traversal and the operator[] for element access. The MTL sparse
vectors also provide the additional functionality specified in the SparseVector
concept described in Section 7.4.3.
MTL provides three sparse vector storage formats: compressed, sparse_pair,
and tree. The compressed format uses a Fortran-style parallel array, one array
for the indices and one array for the values, as shown in Figure 10.1. The
sparse_pair format uses a single array, where each element of the array is an
index-value pair, as shown in Figure 10.2. The tree format stores the elements
as index-value pairs in a std::set, which is a red-black tree that provides the
advantage of fast random insertion. All three formats keep the elements in
sorted order according to their index.
The time complexity for element access using operator[] is not constant
for sparse vectors, but is logarithmic in the number of non-zeroes. In addition, assignment to elements through operator[] may cause allocation and data
movement, since if the accessed element was not yet explicitly stored (previously
zero) then space must be made in the appropriate place in the sparse vector.
Due to this extra overhead, it is not advisable to use a sparse vector in situations
where the elements are dense, or as the output argument for operations that
result in dense vectors, such as matrix-vector multiplication.

Indices:  1    3    8    11   22   24
Values:   1.3  5    3.1  4    0    2.1

Figure 10.1: The Compressed Sparse Vector Format.

(1,1.3) (3,5) (8,3.1) (11,4) (22,0) (24,2.1)

Figure 10.2: The Sparse Pair Vector Format.
The stored indices are by default zero-based, but this can be changed to
one-based with the IdxStart template parameter. Note that this only changes
the way in which the indices are stored; the vector will still appear to be
zero-based. The purpose of allowing one-based internal storage is to accommodate
data sharing with Fortran codes.

10.2.1  vector<T, compressed<Idx, Alloc, IdxStart>, Orien>::type

This class implements a sparse vector using two parallel arrays, one for the
element values and one for the indices. The interface provided by this class
is similar to all the other MTL vectors, and is described in the ResizableVector
and SequentialCollection concepts. The memory for the elements is obtained via
Alloc which must be an Allocator.
Example
Calculate the infinity norm of a sparse vector.
typedef mtl::vector< float, compressed<> >::type Vec;
const int n = 10;
Vec x(n), y(n);
x[3] = 2.5;
x[8] = 0.1;
x[5] = 3.1;
float inf_norm = mtl::infinity_norm(x);
cout << inf_norm << endl;

The output is:
3.1

Template Parameters
T         is the sparse vector’s value type, which is the element type of the
          value array.
Idx       is the index type (and size_type), which is the element type of the
          index array.
          Default: int
Alloc     is the allocator type, which must model Allocator.
          Default: std::allocator
IdxStart  The counting convention for the indices, either C style
          index_from_zero or Fortran style index_from_one.
          Default: index_from_zero
Orien     is the orientation of the vector, either row_major or column_major.
          This argument is important only when vectors are used inside
          operator expressions.
          Default: column_major

Model of
SparseVector, ResizableVector, SequentialCollection, and VectorSpace.

98

CHAPTER 10. VECTOR CLASSES: VECTOR::TYPE

Complexity
Element assignment through operator[] is linear in the number of non-zeroes,
since data movement may occur. For better performance insert the elements all
at once using the constructor for IndexValuePairIterator, which is O(nnz log nnz)
for the insertion of all the elements. Element access via operator[] is O(log nnz),
and iterator traversal (operator++) is constant time as usual.
Members
In addition to the member functions required by ResizableVector and SequentialCollection, the following members are defined. We use self as a placeholder
for the actual class name.
self(const Alloc& a = Alloc())

This is the default constructor, which creates a vector of size zero.
self(size_type n, const Alloc& a = Alloc())

Creates a vector of size n but with nnz() == 0. The vector will allow
assignment to element indices in the range [0, n).
template <class IndexValuePairIterator>
self(IndexValuePairIterator first, IndexValuePairIterator last,
     size_type n)

Construct a sparse vector from any iterators that dereference to give
index-value pairs, where the functions value() and index() provide access to the value and index. value(*first) must be convertible to T and
index(*first) must be convertible to Idx. The iterators must model InputIterator, though the construction is more efficient when the iterators
model RandomAccessIterator. The indices should fall in the range [0, n).

self(const self& x)

The copy constructor. This performs a shallow copy, which means that
the underlying data will be shared between this vector and vector x. The
reference count of the data object is incremented.
~self()

The destructor. The reference count of the underlying vector data is
decremented.
self& operator=(const self& x)

The assignment operator. Like the copy constructor, this makes a shallow
copy and increments the reference count of the data object.

10.2.2  vector<T, compressed<Idx, external>, Orien>::type

This class can be used to create an MTL “view” to a pre-existing sparse vector in
memory. This is useful when interfacing with legacy code or other programming
languages. This MTL vector claims no responsibility for the lifetime of the
underlying memory, and some care must be taken to ensure that the memory
lasts longer than any use of this vector object. Make sure to choose the T and
Idx template arguments to match the pointer types for the values and indices
array. Note that for this class, element assignment through operator[] is only
valid for non-zero elements, and will result in a “no-op” (nothing done) for zero
elements. This is because this vector type does not control the memory.
Example
extern "C"
float sparse_one_norm(float* values, int* indices, int n, int nnz)
{
typedef mtl::vector >::type Vec;
Vec x(values, indices, n, nnz);
return mtl::one_norm(x);
}

Complexity
Element access via operator[] is O(log nnz), and iterator traversal (operator++)
is constant time as usual.
Template Parameters
See the description for vector<T, compressed<Idx, Alloc, IdxStart>, Orien>::type.
Members
self()

This is the default constructor, which creates an empty vector handle.
You will need to assign another vector to this handle before it will be
useful.
self(T* values, Idx* indices, size_type n, size_type nnz)

Construct a view to the existing memory given by the values and indices
pointers, with nnz non-zeroes (which should match the length of the two
arrays) and with indices in the range [0, n).
self(const self& x)

The copy constructor. This performs a shallow copy, which means that
the underlying data will be shared between this vector and vector x.

self& operator=(const self& x)}

The assignment operator. Like the copy constructor, this make a shallow
copy.
~self()

The destructor. This function does nothing.

10.2.3  vector<T, sparse_pair<Idx, Alloc>, Orien>::type

This class implements a sparse vector as a sorted array of index-value pairs.
The interface provided by this class is similar to all the other MTL vectors,
and is described in the ResizableVector and SequentialCollection concepts. The
memory for the elements is obtained via Alloc which must be an Allocator.
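
A short example along the lines of the compressed vector example above; the generator arguments here are assumed:

typedef mtl::vector< double, sparse_pair<> >::type Vec;
Vec x(100);          // size 100, nnz() == 0
x[10] = 3.0;         // inserting through operator[] is linear in nnz
x[42] = -1.5;
double r = mtl::one_norm(x);   // 4.5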
Model of
SequentialCollection, ResizableVector and MatrixExpression.
Complexity
Element assignment through operator[] is linear in the number of non-zeroes,
since data movement may occur. For better performance insert the elements all
at once using the constructor for IndexValuePairIterator, which is O(nnz log nnz)
for the insertion of all the elements. Element access via operator[] is O(log nnz),
and iterator traversal (operator++) is constant time as usual.
Template Parameters
T      is the sparse vector’s value type.
Idx    is the index type (and size_type).
       Default: int
Alloc  is the allocator type, which must model Allocator.
       Default: std::allocator
Orien  is the orientation of the vector, either row_major or column_major.
       This argument is important only when vectors are used inside
       operator expressions.
       Default: column_major

Members
In addition to the member functions required by ResizableVector and SequentialCollection, the following members are defined. We use self as a placeholder
for the actual class name.
self(const Alloc& a = Alloc())

This is the default constructor, which creates a vector of size zero.


self(size_type n, const Alloc& a = Alloc())

Creates a vector of size n but with nnz() == 0. The vector will allow
assignment to element indices in the range [0, n).
template <class IndexValuePairIterator>
self(IndexValuePairIterator first, IndexValuePairIterator last,
     size_type n)

Construct a sparse vector from any iterators that dereference to give
index-value pairs, where the functions value() and index() provide access to the value and index. value(*first) must be convertible to T and
index(*first) must be convertible to size_type. The iterators must model
InputIterator, though the construction is more efficient when the iterators
model RandomAccessIterator. The indices should fall in the range [0, n).

self(const self& x)

The copy constructor. This performs a shallow copy, which means that
the underlying data will be shared between this vector and vector x. The
reference count of the data object is incremented.
~self()

The destructor. The reference count of the underlying vector data is
decremented.
self& operator=(const self& x)

The assignment operator. Like the copy constructor, this makes a shallow
copy and increments the reference count of the data object.

10.2.4  vector<T, sparse_pair<Idx, external>, Orien>::type

This class can be used to create an MTL “view” to a pre-existing sparse vector
in memory. This MTL vector claims no responsibility for the lifetime of the
underlying memory, and some care must be taken to ensure that the memory
lasts longer than any use of this vector object. The pointer to the underlying
array must be of type std::pair*. Note that for this class, element
assignment through operator[] is only valid for non-zero elements, and will
result in a “no-op” (nothing done) for zero elements. This is because this vector
type does not control the memory.
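
A sketch of wrapping pre-existing pair data; the (index, value) layout of the pairs and the generator arguments are assumptions (compare Figure 10.2):

std::pair<int, double> data[] = { std::make_pair(2, 1.5),    // (index, value) layout assumed
                                  std::make_pair(7, 4.0) };
typedef mtl::vector< double, sparse_pair<int, external> >::type Vec;  // generator arguments assumed
Vec x(data, 10, 2);            // indices in [0, 10), 2 non-zeroes
double r = mtl::one_norm(x);   // 5.5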
Complexity
Element access via operator[] is O(log nnz), and iterator traversal (operator++)
is constant time as usual.

Template Parameters
T      is the sparse vector’s value type.
Idx    is the index type (and size_type).
       Default: int
Orien  is the orientation of the vector, either row_major or column_major.
       This argument is important only when vectors are used inside
       operator expressions.
       Default: column_major

Members
self()

This is the default constructor, which creates an empty vector handle.
You will need to assign another vector to this handle before it will be
useful.
self(std::pair* data, size_type n, size_type nnz)

Construct a view to the existing memory given by the data pointer, with
nnz non-zeroes (which should match the length of the data array) and
with indices in the range [0, n).
self(const self& x)

The copy constructor. This performs a shallow copy, which means that
the underlying data will be shared between this vector and vector x.
self& operator=(const self& x)

The assignment operator. Like the copy constructor, this makes a shallow
copy.
~self()

The destructor. This function does nothing.

10.2.5  vector<T, tree<Alloc>, Orien>::type

This class implements a sparse vector using std::set with index-value pairs
as the value type for the std::set. The interface provided by this class is
similar to all the other MTL vectors, and is described in the ResizableVector
and SequentialCollection concepts. The memory for the elements is obtained via
Alloc which must be an Allocator.
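
A short sketch of the main use case, random insertion; the generator arguments are assumed:

typedef mtl::vector< double, tree<> >::type Vec;
Vec x(1000);          // size 1000, nnz() == 0
x[537] = 2.0;         // insertion is O(log nnz) thanks to the std::set
x[12]  = 7.0;
x[880] = 1.0;
// the elements are kept sorted by index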
Model of
SequentialCollection, ResizableVector and MatrixExpression.


Complexity
Element assignment through operator[] is logarithmic in the number of nonzeroes (O(log nnz)). If you are inserting the elements all at once, it is typically
more efficient to use one of the array-based sparse vectors and their constructor
for IndexValuePairIterator (though this class also provides such a constructor).
Element access via operator[] is O(log nnz), and iterator traversal (operator++)
is constant time as usual.
Template Parameters
T      is the sparse vector’s value type.
Alloc  is the allocator type, which must model Allocator.
       Default: std::allocator
Orien  The orientation of the vector, either row_major or column_major.
       This argument is important only when vectors are used inside
       operator expressions.
       Default: column_major

Members
In addition to the member functions required by ResizableVector and SequentialCollection, the following members are defined. We use self as a placeholder
for the actual class name.
self(const Alloc& a = Alloc())

This is the default constructor, which creates a vector of size zero.
self(size_type n, const Alloc& a = Alloc())

Creates a vector of size n but with nnz() == 0. The vector will allow
assignment to element indices in the range [0, n).
template <class IndexValuePairIterator>
self(IndexValuePairIterator first, IndexValuePairIterator last,
size_type n)

Construct a sparse vector from any iterators that dereference to give
index-value pairs, where the functions value() and index() must be defined for the pair type. value(*first) must be convertible to T and
index(*first) must be convertible to Idx. The iterators must model InputIterator, though the construction is more efficient when the iterators
model RandomAccessIterator. The indices should fall in the range [0, n).
self(const self& x)}

The copy constructor. This performs a shallow copy, which means that
the underlying data will be shared between this vector and vector x. The
reference count of the data object is incremented.

~self()

The destructor. The reference count of the underlying vector data is
decremented.
self& operator=(const self& x)

The assignment operator. Like the copy constructor, this makes a shallow
copy and increments the reference count of the data object.

Chapter 11

Matrix Classes:
matrix::type

The MTL contains a large number of matrix types, ranging from your basic
dense matrix, to banded, symmetric, and various sparse matrices. MTL also
provides several memory allocation alternatives for the matrix data, including static-sized stack allocation (good for small matrix computations), heap
allocation based on generalized STL-style Allocators (good for arbitrarily sized
matrices), and simple pointer-wrappers for matrix data that originated from
other sources (which we refer to as external data).
Due to the heavy parameterization of the MTL matrix classes, it is somewhat complicated to deal with them directly. In an effort to simplify the interface we provide this mtl::matrix class, which is a type generator. The user provides several template parameter choices (called selectors) and the mtl::matrix
provides an inner typedef for the correct matrix type. Many of the template
arguments have default values, so creating an MTL matrix can be as simple as
in the first line of the following example.
Example
typedef mtl::matrix<double>::type Matrix;   // select the matrix type
Matrix A(num_rows, num_columns);            // create a matrix object
// Fill in the matrix with random values, using
// STL-style iterators to access the matrix elements
for (Matrix::iterator i = A.begin(); i != A.end(); ++i)
  for (Matrix::OneD::iterator j = (*i).begin(); j != (*i).end(); ++j)
    *j = rand();

Template Parameters
T        The value type, the type of the elements in the matrix.
Shape    The shape of the matrix, described below.
Storage  The storage format for the placement of elements in memory.
Orien    The orientation of the matrix, either row_major, column_major,
         or diagonal_major.

11.0.6  Shape Selectors

The Shape template argument of the matrix generator specifies the layout of
the non-zero elements of the matrix and whether the matrix is symmetric or
Hermitian. Here we give an overview of the choices for matrix shape, and details
are given in the following sections.
• rectangle<>
A rectangular matrix is a general purpose matrix in which elements can
appear in any position in the matrix, i.e., there can be any element a_ij
where 0 ≤ i < M and 0 ≤ j < N. Both dense and sparse matrices can fit
into this category.

• banded<>
A band shaped matrix is one in which non-zero matrix elements only appear within a certain distance to the main diagonal of the matrix. The
bandwidth of a matrix is described by the number of diagonals in the band
that are below the main diagonal, referred to as the sub diagonals, and
the number of diagonals in the band above the main diagonal, referred
to as the super diagonals. Figure 11.1 is an example of a matrix with a
bandwidth of (1,2). There are several storage types that can be used to
efficiently represent banded matrices, including the banded format (equivalent to the LAPACK banded matrix) and dense matrices with diagonal
orientation. Accesses made to elements outside of the band are considered
to be out-of-bounds.

[1  2  3  0  0;
 4  5  6  7  0;
 0  8  9 10 11;
 0  0 12 13 14;
 0  0  0 15 16]

Figure 11.1: Example of a banded matrix with bandwidth (1,2).
• triangle
The triangle shape is a special case of the banded shape. A triangular
matrix of shape (M, N ) has a bandwidth (M − 1, 0) for a lower triangular
matrix or (0, N − 1) for an upper triangular matrix. There is also the
special case of when the main diagonal of the matrix is all ones, in which
case the matrix is called unit diagonal.
• symmetric
The symmetric shape is also a special case of the banded shape, with a
twist. Since the matrix is symmetric (aij = aji ) it is only necessary to
store half of the matrix, either the elements above or below the diagonal.
The sub- and super-bandwidths of a symmetric matrix must be equal, as
must the number of rows and columns.
• hermitian
This is similar to a symmetric matrix, except that the elements must be
complex numbers, and the elements above the diagonal are the complex
conjugates of the elements below the diagonal (aij = conj(aji)).
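
The shape selectors above are combined with a storage selector and an orientation in the mtl::matrix generator. A minimal sketch follows; whether the shape selectors take arguments such as Uplo and Diag directly is an assumption here, based on Sections 11.4–11.6:

typedef mtl::matrix< double, rectangle<>, dense<>, row_major >::type                   GeneralMatrix;
typedef mtl::matrix< double, banded<>, banded<>, row_major >::type                     BandedMatrix;
typedef mtl::matrix< double, triangle<upper, unit_diag>, packed<>, row_major >::type   UnitUpperMatrix;
typedef mtl::matrix< double, symmetric<lower>, packed<>, row_major >::type             SymMatrix;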

11.0.7  Storage Selectors

The matrix storage types are introduced here, and are described in more detail
in the subsequent sections.


[1 2  3  0  0;
 2 6  7  8  0;
 3 7 10 11 12;
 0 8 11 13 14;
 0 0 12 14 15]

Figure 11.2: Example of a symmetric matrix with bandwidth (2,2).
• dense
This is the most common way of storing matrices, and consists of one
contiguous piece of memory that is divided up into rows or columns of
equal length. The example in Figure 11.3 shows how a matrix can be
mapped to linear memory in either a row-major or column-major fashion.

matrix:
[1 2 3;
 4 5 6;
 7 8 9]

row major storage:    1 2 3 4 5 6 7 8 9
column major storage: 1 4 7 2 5 8 3 6 9

Figure 11.3: Example of the dense matrix storage format.
• static_size
This is similar to the dense<> matrix storage, except that the matrix
is stack-allocated and the matrix dimensions must be constant (fixed at
compile-time). For operations on small matrices, it is often more efficient
to use this matrix type than the dense<> matrix.
• banded
This storage format is equivalent to the banded storage used in the BLAS
and LAPACK. The banded storage format maps the bands of the matrix
to an array of dimension (M, sub + super + 1) for row major and (sub +
super + 1, N ) for column major matrices. For a row major matrix, the
diagonals of the matrix fall into the columns of the array. For a column
major matrix the diagonals fall into the rows of the array. The two-dimensional array is mapped to the linear memory space of a single chunk
of memory. Figure 11.4 is an example banded matrix with the mapping to

the row-major and column-major 2D arrays. The X’s represent memory
locations that are not used.

original matrix:
[1  2  3  0  0;
 4  5  6  7  0;
 0  8  9 10 11;
 0  0 12 13 14;
 0  0  0 15 16]

banded storage (row major):
[ X  1  2  3;
  4  5  6  7;
  8  9 10 11;
 12 13 14  X;
 15 16  X  X]

banded storage (column major):
[X  X  3  7 11;
 X  2  6 10 14;
 1  5  9 13 16;
 4  8 12 15  X]

Figure 11.4: Example of the banded matrix storage format.

• packed
This storage format is equivalent to the BLAS/LAPACK packed storage
format. This format provides an efficient way to represent triangular,
symmetric and Hermitian matrices. Either the elements in the upper or
lower triangle of the matrix are stored. Each row (or column) is packed
into a contiguous block of memory, each row starting immediately after the
previous one. Figure 11.5 depicts an example of a matrix stored in this way.

matrix:
[1 2  3  4  5;
 0 6  7  8  9;
 0 0 10 11 12;
 0 0  0 13 14;
 0 0  0  0 15]

packed storage:  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15

Figure 11.5: Example of the packed matrix storage format.

• compressed
This storage format is the traditional compressed row or compressed column format. The storage consists of three arrays, one array for all of the
elements, one array consisting of the row or column index (row for column-major and column for row-major matrices), and one array consisting of
pointers to the start of each row/column. Figure 11.6 is an example sparse
matrix in compressed row format, with the stored indices starting from one
(Fortran style). They can also be indexed from zero (C style).

matrix:
[1 0  2  3  0;
 0 4  0  0  5;
 6 0  7  8  0;
 9 0  0 10  0;
 0 0 11  0 12]

row pointer array:           1 4 6 9 11 13
element value array:         1 2 3 4 5 6 7 8 9 10 11 12
element column index array:  1 3 4 2 5 1 3 4 1 4 3 5

Figure 11.6: Example of the compressed row matrix storage format.
• array
This storage format gives an “array of pointers” style implementation of
a matrix. Each row or column of the matrix is allocated separately. The
type of vector used for the rows or columns is flexible, and one can choose
from any of the 1-D storage types, which include dense, compressed,
sparse_pair, tree, linked_list, and set. Figure 11.7 gives two examples of array storage types.
• envelope
This storage scheme is for sparse symmetric matrices, where most of the
non-zero elements fall near the main diagonal. The storage format is
useful for certain factorizations since the fill-ins fall into areas already
allocated. This scheme is different than most sparse matrices since the
row containers are actually dense, similar to a banded matrix. Figure 11.8
gives an example of a matrix in the envelope storage scheme.
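
As with the shape selectors, a brief sketch of selecting a few of these storage formats; the exact selector arguments are assumptions (see Sections 11.1–11.2 for the forms that are documented):

typedef mtl::matrix< double, rectangle<>, dense<>, row_major >::type               DenseMat;
typedef mtl::matrix< double, rectangle<>, array< dense<> >, row_major >::type      ArrayMat;       // array of pointers
typedef mtl::matrix< double, rectangle<>, compressed<>, row_major >::type          SparseMat;      // compressed row
typedef mtl::matrix< double, rectangle<>, array< sparse_pair<> >, row_major >::type SparseArrayMat; // one sparse_pair per row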

matrix:
[1 0  2  3  0;
 0 4  0  0  5;
 6 0  7  8  0;
 9 0  0 10  0;
 0 0 11  0 12]

array< sparse_pair > storage (one pair list per row):
(1,1) (2,3) (3,4)
(4,2) (5,5)
(6,1) (7,3) (8,4)
(9,1) (10,4)
(11,3) (12,5)

Figure 11.7: Example of the array matrix storage format with dense and with
sparse_pair OneD storage types.

11.0.8  Shape and Storage Combinations

Table 11.1 shows the valid combinations of Shape and Storage template parameters for the mtl::matrix. The X’s mark the valid combinations.
(JGS comment: how should we handle the interface to blocked matrices?
Implementation-wise there are two varieties of blocked matrices. A blocking
imposed on a normal (dense) matrix, and a matrix which literally contains submatrices.)
Shape selectors:   rectangle, banded, triangle, symmetric, hermitian
Storage selectors: dense, static_size, banded, packed, array, sparse array,
                   compressed, coordinate, envelope

Table 11.1: Valid Shape and Storage Combinations.


[Figure: a sparse symmetric matrix with its diagonals pointer array and its element values array.]

Figure 11.8: Example of the envelope matrix storage format.

11.1  Dense Matrices

The MTL dense matrices come in three flavors: the general purpose heap-allocated single block of data (in either row-major or column-major format),
the “array of pointers” style of matrix, and the stack-allocated, constant size
matrix. Also there is the “external” variant of the heap-allocated matrix, which
has a constructor that takes a pointer to some pre-existing data, and which does
no memory management.

11.1.1  matrix<T, rectangle<>, dense<Alloc>, Orien>::type

Example
typedef matrix< double, rectangle<>,
                dense<>, row_major >::type Matrix;
Matrix A(m, n); // construct a matrix object
// fill matrix ...
A.submatrix( ) = x * trans(y);

Template Parameters
T      is the matrix’s value type, the type of object stored in the matrix.
Alloc  is the allocator used to obtain memory, which must be a C++
       standard compliant allocator.
       Default: std::allocator
Orien  The orientation of the matrix, either row_major or column_major.

Model of
StrideableMatrix, ResizableMatrix, SubdividableMatrix, FastDiagMatrix and MatrixExpression.
Members
In addition to the member functions required by StrideableMatrix, ResizableMatrix, SubdividableMatrix, FastDiagMatrix and LinearAlgebra, the following members are defined. We use self as a placeholder for the actual class name.
self(const Alloc& a = Alloc())

This is the default constructor, which creates a matrix of size (0, 0).

self(size_type m, size_type n,
const T& init = T(),
size_type m_origin = 0, size_type n_origin = 0,
const Alloc& a = Alloc())

Creates a matrix of size (m,n) and initializes all of the elements to the
value of init. The element indices will be ([m_origin, m_origin + m),
[n_origin, n_origin + n)).
self(const self& x)

The copy constructor. This performs a shallow copy, which means that
the underlying data will be shared between this matrix and matrix x. The
reference count of the data object is incremented.
~self()

The destructor. The reference count of the underlying data object is
decremented.
self& operator=(const self& x)

The assignment operator. Like the copy constructor, this makes a shallow
copy and increments the reference count of the data object.

11.1.2  matrix<T, rectangle<M, N>, dense<external>, Orien>::type

This matrix class is used to create a matrix “view” to pre-existing memory. This
is useful when interfacing with legacy code or to other programming languages.
This matrix type claims no responsibility for the lifetime of the underlying
memory, and some care must be taken to ensure that the memory lasts longer
than any use of the matrix object.
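
A minimal sketch of wrapping an existing column-major array; the generator arguments follow the reconstruction above and are otherwise assumptions:

double data[6] = { 1, 2, 3,    // column 0
                   4, 5, 6 };  // column 1
typedef mtl::matrix< double, rectangle<>, dense<external>, column_major >::type Matrix;
Matrix A(data, 3, 2, 3);       // 3 rows, 2 columns, leading dimension 3
// A(1,0) == 2 and A(1,1) == 5; the memory is not managed by A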
Template Parameters
T      is the value type, the type of object stored in the matrix.
M      specifies the number of rows for a static sized matrix. A value of
       dynamic_size denotes a matrix with size determined at run-time.
       Default: dynamic_size
N      specifies the number of columns for a static sized matrix. A value of
       dynamic_size denotes a matrix with size determined at run-time.
       Default: dynamic_size
Orien  The orientation of the matrix, either row_major or column_major.


Model of
StrideableMatrix, ResizableMatrix, SubdividableMatrix, FastDiagMatrix and LinearAlgebra.
Members
In addition to the member functions required by StrideableMatrix, ResizableMatrix, SubdividableMatrix, FastDiagMatrix and LinearAlgebra, the following members are defined. We use self as a placeholder for the actual class name.
self()

This is the default constructor, which creates an empty matrix. This
matrix object is unusable until assigned to some matrix with data.
self(T* data, size_type m, size_type n, size_type ld,
size_type m_origin = 0, size_type n_origin = 0)

Creates a matrix view to the existing memory given by the data pointer.
The matrix will be of size (m,n), with a leading dimension of ld. The
element indices will be ([m_origin, m_origin + m), [n_origin, n_origin + n)).
self(const self& x)

The copy constructor. This performs a shallow copy, which means that
the underlying data will be shared between this matrix and matrix x.
~self()

The destructor, which does nothing.
self& operator=(const self& x)

The assignment operator. Like the copy constructor, this makes a shallow
copy so that this matrix and matrix x share the same data.

11.1.3  matrix<T, rectangle<M, N>, static_size, Orien>::type

This is a stack-allocated matrix. For operations on small matrices, it is more
efficient to use this matrix type than the general dense matrix since it avoids the
overhead of heap-allocation. The size of the matrix is constant, and specified
as a template argument. Unlike most MTL matrices, the elements are not
initialized to some value (zero by default) but left uninitialized (unless explicitly
initialized via initializer list syntax as in the following example). This matrix
type has deep copy semantics instead of the shallow copy semantics that is usual
for MTL objects (see Section 9.1 for details).

Example
const int m = 2, n = 2;
typedef mtl::matrix< double, rectangle<m, n>,
                     static_size >::type Matrix;
Matrix A = { 1, 1,
             1, 1 };
Matrix B = { 3, 3,
             3, 3 };
Matrix C;
mtl::add(A, B, C);
for (int i = 0; i < m; ++i)
  for (int j = 0; j < n; ++j)
    assert(C(i,j) == 4);

Template Parameters
T      is the value type, the type of object stored in the matrix.
M      specifies the number of rows.
N      specifies the number of columns.
Orien  is the orientation of the matrix, either row_major or column_major.

Model of
StrideableMatrix, SubdividableMatrix, FastDiagMatrix and LinearAlgebra.
Members
This matrix implements the member functions required by StrideableMatrix, SubdividableMatrix, FastDiagMatrix and MatrixExpression.

11.1.4  matrix<T, rectangle<>, array< dense<Alloc> >, Orien>::type

This matrix type provides an “array of pointers” style of matrix implementation. Each row (or column) of the matrix is allocated separately. This gives the
advantage that rows can be swapped in constant time. In addition, growing the
matrix along the major dimension is more efficient than for the normal dense
matrix. Unlike the other dense matrix types, this matrix is not a SubdividableMatrix, StrideableMatrix, or a FastDiagMatrix.
Example
Swapping the rows of a matrix in constant time.


typedef matrix< double,
rectangle<>,
array< dense<> >,
row_major>::type Matrix;
Matrix B(m, n);
Matrix::Row tmp = B[2];
B[2] = B[3];
B[3] = tmp;

Template Parameters
T      is the matrix’s value type, the type of object stored in the matrix.
Alloc  is the allocator used to obtain memory, which must be a C++
       standard compliant allocator.
       Default: std::allocator
Orien  The orientation of the matrix, either row_major or column_major.

Members
This matrix implements the member functions required by Matrix and LinearAlgebra. In addition this class defines the following constructors and destructor.
self(const Alloc& a = Alloc())

This is the default constructor, which creates a matrix of size (0, 0).
self(size_type m, size_type n,
const T& init = T(),
size_type m_origin = 0, size_type n_origin = 0,
const Alloc& a = Alloc())

Creates a matrix of size (m,n) and initializes all of the elements to the
value of init. The element indices will be ([m_origin, m_origin + m),
[n_origin, n_origin + n)).
self(const self& x)

The copy constructor. This performs a shallow copy, which means that
the underlying data will be shared between this matrix and matrix x. The
reference count of the data object is incremented.
~self()

The destructor. The reference count of the underlying data object is
decremented.
self& operator=(const self& x)

The assignment operator. Like the copy constructor, this makes a shallow
copy and increments the reference count of the data object.


11.2  Sparse Matrices

11.2.1  matrix<T, Shape, compressed<Idx, Alloc, IdxStart>, Orien>::type

Template Parameters
T         is the value type, the type of object stored in the matrix.
Shape     is the shape of the matrix, either rectangle<>, symmetric,
          hermitian, or triangle.
Idx       is the index type (also the size_type) of the matrix, which is the
          element type of the index array.
          Default: int
Alloc     is the allocator type, which must be a model of Allocator or external.
          Default: std::allocator
IdxStart  specifies the counting convention for the indices, either C style
          index_from_zero or Fortran style index_from_one.
          Default: index_from_zero
Orien     is the orientation of the matrix, either row_major or column_major.

Members
self(Idx m, Idx n, Idx nnz = max(m,n) * 10)

Create a sparse matrix with m rows and n columns. The nnz argument is
a hint for how many nonzeroes will be in the matrix.
template <class ElementIterator>
self(ElementIterator first, ElementIterator last,
Idx m, Idx n, Idx nnz)

11.2.2  matrix<T, Shape, compressed<Idx, external>, Orien>::type

Members
self(Idx m, Idx n, Idx nnz, T* val, Idx* indx, Idx* ptrs)


This creates a sparse matrix “view” to pre-existing memory, which must
be in the traditional compressed row/column matrix format given by the
three arrays val, indx, and ptrs, the number of rows and columns, m and
n, and the number of nonzeros nnz. The val and indx arrays should be
of length nnz, and the ptrs array should be either of length m + 1 for a row
major matrix or n + 1 for a column oriented matrix. The m_origin and
n_origin parameters specify the coordinate system for the indices of the
matrix.
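
A brief sketch of wrapping an existing compressed-row array set (zero-based indices assumed; the generator arguments follow the reconstruction above and are otherwise assumptions):

double val[]  = { 1, 2, 3, 4 };
int    indx[] = { 0, 2, 1, 2 };      // column index of each stored value
int    ptrs[] = { 0, 2, 3, 4 };      // row start offsets, length m + 1
typedef mtl::matrix< double, rectangle<>, compressed<int, external>, row_major >::type Matrix;
Matrix A(3, 3, 4, val, indx, ptrs);  // a 3 x 3 matrix with 4 non-zeroes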

11.2.3  matrix<T, Shape, array<SparseOneD>, Orien>::type

Template Parameters
T           is the value type, the type of object stored in the matrix.
Shape       is the shape of the matrix, either rectangle<>, symmetric,
            hermitian, or triangle.
SparseOneD  is the type used for the one-dimensional segments (rows or
            columns) of the matrix. The choices are compressed,
            sparse_pair, and tree.
Orien       is the orientation of the matrix, either row_major or column_major.

11.2.4

matrix,
Orien>::type


11.3  Banded Matrices

11.3.1

matrix,banded,Orien>::type

11.3.2

matrix,banded,Orien>::type

11.3.3

matrix,dense,Orien>::type

11.3.4

matrix,dense,Orien>::type

11.4  Triangular Matrices

11.4.1  matrix<T, triangle<Uplo, Diag>, Storage, Orien>::type

Template Parameters
Uplo  specifies whether the non-zeroes fall in the upper or lower triangle
      of the matrix. The valid arguments for this parameter are upper,
      lower and dynamic_uplo, which specifies that the choice between
      upper and lower is postponed until run-time.
Diag  specifies whether the matrix is guaranteed to be unit diagonal.
      Some algorithms are specialized to be more efficient for unit
      diagonal matrices. The valid choices for this parameter are
      unit_diag, non_unit_diag and dynamic_diag.

11.4.2

matrix,
dense,Orien>::type

11.4.3

matrix,
dense, Orien>::type

11.4.4

matrix,
banded,Orien>::type

11.4.5

matrix,
banded, Orien>::type

11.4.6

matrix,
packed, Orien>::type

11.4.7

matrix,
packed, Orien>::type


11.5  Symmetric Matrices

11.5.1  matrix<T, symmetric<Uplo>, Storage, Orien>::type

Template Parameters
Uplo  specifies whether the matrix elements of the upper or lower triangle
      are the ones stored. The valid arguments for this parameter are
      upper, lower and dynamic_uplo, which specifies that the choice
      between upper and lower is postponed until run-time.

11.5.2

matrix,
dense,Orien>::type

11.5.3

matrix,
dense, Orien>::type

11.5.4

matrix,
banded,Orien>::type

11.5.5

matrix,
banded, Orien>::type

11.5.6

matrix,
packed, Orien>::type

11.5.7

matrix,
packed, Orien>::type

11.6  Hermitian Matrices

11.6.1  matrix<T, hermitian<Uplo>, Storage, Orien>::type

Template Parameters
Uplo  specifies whether the matrix elements of the upper or lower triangle
      are the ones stored. The valid arguments for this parameter are
      upper, lower and dynamic_uplo, which specifies that the choice
      between upper and lower is postponed until run-time.

11.6.2

matrix,
dense,Orien>::type

11.6.3

matrix,
dense, Orien>::type

11.6.4

matrix,
banded,Orien>::type

11.6.5

matrix,
banded, Orien>::type

11.6.6

matrix,
packed, Orien>::type

11.6.7

matrix,
packed, Orien>::type


11.7  Diagonal Matrices

11.7.1

matrix,
dense, diagonal major>::type

11.7.2

matrix,
dense, diagonal major>::type

11.7.3

matrix,
array, diagonal major>::type

Chapter 12

Adaptors and Helper
Functions


Chapter 13

Overloaded Operators


Chapter 14

Vector Operations
The time complexity of the MTL vector operations is linear in the size of the
vector, or in the number of non-zeroes for sparse vectors.


14.1  Vector Data Movement Operations

14.1.1  set

x_i ← α

template <class Vector, class T>
void set(Vector& x, const T& alpha)
This operation assigns the value of alpha to each element of vector x. For a
sparse vector alpha is assigned to only the non-zero, explicitly stored elements
of the vector.
Equivalent MXTL Expression
x = alpha
Requirements on Types
• Vector must be a model of the Vector concept.
• T must be convertible to the value type of Vector.
Complexity
Linear. n stores are performed into vector x.
Example
mtl::vector<double>::type x(10);
mtl::set(x, 2);
for (int i = 0; i < 10; ++i)
  assert(x[i] == 2);

14.1.2  copy

y ← x

template <class VectorX, class VectorY>
void copy(const VectorX& x, VectorY& y)
This operation copies all of the elements of the vector x into the vector y.
The two vectors must be the same size. After the copy is performed the vectors
will be element-wise equal, that is x[i] == y[i] for i=0...n-1. When y is a
sparse vector, the old sparse structure of y is thrown away along with the old
values, and replaced by the new values and indices of vector x. When x is a
sparse vector and y is a dense vector (copying from sparse to dense), the non-zero
elements of x are copied into the appropriate positions in y and the other
elements of y are set to zero. If you do not want the other elements of y set
to zero, use the scatter() function instead. It is not recommended to copy a
dense vector into a sparse vector, since the sparse vector formats are inefficient
for storing and manipulating a full vector.
Requirements on Types
• VectorX and VectorY must be models of the Vector concept.
• The value type of VectorX must be convertible to the value type of VectorY.
Preconditions
• x.size() == y.size()
Postconditions
• x[i] == y[i] for i=0...n-1.
Complexity
Linear. If either vector is dense, then there are N assignments. If both vectors
are sparse then there are nnz assignments. If the target vector is sparse then
some memory allocation and or deallocation may occur.
Example
Copy a compressed sparse vector into another sparse vector.
typedef mtl::vector< double, compressed<> >::type CompVec;
CompVec x(15), y(15);
for (int i = 0; i < x.size(); i += 3)
  x[i] = i;
mtl::copy(x, y);
for (int i = 0; i < x.size(); ++i)
  assert(x[i] == y[i]);

14.1.3  swap

x ↔ y

template <class VectorX, class VectorY>
void swap(VectorX& x, VectorY& y)
This function swaps the elements of vector x with the elements of vector y.


Requirements on Types
• VectorX and VectorY must be models of the Vector concept.
• The value type of VectorX must be convertible to the value type of VectorY,
and vice versa.
Preconditions
• x.size() == y.size()
Complexity
Linear. 2n loads and stores are performed.
Example
typedef mtl::vector<double>::type Vec;
Vec x(10), y(5);
mtl::set(x, 1);
mtl::set(y, 2);
swap(strided(x, 2), y);
cout << x << endl;
cout << y << endl;
The output is:
[2,1,2,1,2,1,2,1,2,1]
[1,1,1,1,1]

14.1.4  gather

y ← x|y

template <class DenseVector, class SparseVector>
void gather(const DenseVector& x, SparseVector& y)
This function copies the elements of dense vector x into sparse vector y
according to the non-zero structure of vector y.
Equivalent MXTL Expression
y << x
Preconditions
• x.size() == y.size().


Requirements on Types
• DenseVector and SparseVector must be models of the Vector concept.
• The value type of DenseVector must be convertible to the value type of
SparseVector.
Complexity
Linear. nnz loads and stores are performed.
Example
typedef mtl::vector< int, compressed<int, external> >::type SparseVec;
typedef mtl::vector<int>::type DenseVec;
const int n = 9, nnz = 3;
int y_values[] = { 0, 0, 0 };
int y_indices[] = { 2, 5, 7 };
SparseVec y(y_values, y_indices, n, nnz);
DenseVec x(n);
for (int i = 0; i < n; ++i)
  x[i] = 2*i;
mtl::gather(x, y);
cout << y << endl;
The output is:
[(4,2),(10,5),(14,7)]

14.1.5  scatter

y|x ← x

template <class SparseVector, class DenseVector>
void scatter(const SparseVector& x, DenseVector& y)
This function copies the non-zero elements of the sparse vector x into the
dense vector y. The elements of y that correspond to zeroes in x are left unchanged.
Equivalent MXTL Expression
x >> y


Preconditions
• x.size() == y.size().
Requirements on Types
• DenseVector and SparseVector must be models of the Vector concept.
• The value type of SparseVector must be convertible to the value type of
DenseVector.
Complexity
Linear. nnz loads and stores are performed.
Example
typedef mtl::vector< int, compressed<int, external> >::type SparseVec;
typedef mtl::vector<int>::type DenseVec;
const int n = 9, nnz = 3;
int x_values[] = { 4, 4, 4 };
int x_indices[] = { 2, 5, 7 };
SparseVec x(x_values, x_indices, n, nnz);
DenseVec y(n);
y = 1;
mtl::scatter(x, y);
cout << y << endl;
The output is:
[1,1,4,1,1,4,1,4,1]

14.2  Vector Reduction Operations

14.2.1  dot (inner product)

r ← r + x · y

template <class VectorX, class VectorY, class T>
T dot(const VectorX& x, VectorY& y, T r)

r ← x · y

template <class VectorX, class VectorY>
T dot(const VectorX& x, VectorY& y)

In the second version T is defined by
XT = typename VectorX::value_type
YT = typename VectorY::value_type
T  = typename binary_op_trait::result_type

This operation computes the inner product of two vectors, that is, it returns
the value of r which is calculated by r = r + x[i] * y[i] for i=0...n-1. In
version 2 of the algorithm the initial value for r is obtained via binary_op_trait::identity().
Requirements on Types
• VectorX and VectorY must be models of the Vector concept.
• The value type of VectorX and the value type of VectorY must be Multipliable.
• The result type and type T of the multiply must be Addable.
Preconditions
• x.size() == y.size()
Complexity
Linear. The algorithm performs n multiplications and additions and 2n loads
if both vectors are dense. The algorithm performs nnz multiplications and
additions and 2nnz loads if at least one vector is sparse.
Example
Calculate the dot product of two orthogonal vectors.


typedef mtl::vector< int, static_size<3> >::type Vec;
Vec x = { 1, 2, 3 };
Vec y = { 3, 0, -1 };
int r = mtl::dot(x, y);
assert(r == 0);

14.2.2  one_norm

r ← ||x||_1, or equivalently r ← Σ_i |x_i|

template <class Vector>
T one_norm(const Vector& x)

VT = typename Vector::value_type
T  = typename magnitude<VT>::type
This algorithm calculates the one norm of a vector, which is the sum of the
absolute values of the elements.
Requirements on Types
• Vector must be a model of the Vector concept.
• The value type of Vector must be Addable.
• The abs() function must be defined for the value type of Vector.
Complexity
Linear. n loads, additions, and abs() are performed (nnz for sparse vectors).
Example
Calculate the one norm of a complex vector.
typedef std::complex Z;
typedef mtl::vector::type Vec;
Vec x(10); // create a vector size 10
x = Z(2.0, 1.0); // fill the vector with the complex number (2,1)
double r = mtl::one_norm(x);
cout << r << endl;
The output is:
22.3607

14.2.3 two norm (euclidean norm)

r ← ‖x‖_2 or equivalently r ← √(Σ_i x_i^2)

template <class Vector>
T two_norm(const Vector& x)
T = typename Vector::value_type
This operation calculates the two norm (or euclidean norm) of a vector,
which is calculated by taking the square root of the sum of the squares of all
the elements.
Requirements on Types
• Vector must be a model of the Vector concept.
• The value type of Vector must be Addable and Multipliable.
• The sqrt() function must be defined for the value type of Vector.
Complexity
Linear. n loads, multiplies, and adds are performed (nnz for sparse vectors). The
sqrt() function is invoked once.
Example
Calculate the two norm of a vector
typedef mtl::vector::type Vec;
Vec x(10);
x = 2.0;
double r = mtl::two_norm(x);
cout << r << endl;
The output is:
6.32456

14.2.4 infinity norm

r ← ‖x‖_∞ or equivalently r ← max_i |x_i|
template <class Vector>
T infinity_norm(const Vector& x)
VT = typename Vector::value_type
T = typename magnitude<VT>::type
This algorithm computes the infinity norm of a vector, which is equal to the
maximum absolute value of any element in the vector.

Requirements on Types
• Vector must be a model of the Vector concept.
• The abs() function must be defined for the value type of Vector.
• The result type of the abs() operation must be LessThanComparable.
Complexity
Linear. n loads, abs(), and comparisons are performed (nnz for sparse vectors).
Example
Find the infinity norm of a complex vector.
typedef std::complex<double> Z;
typedef mtl::vector::type Vec;
Vec x(10); // create a vector size 10
for (int i = 0; i < x.size(); ++i)
x[i] = Z(2*(i+1), i+1);
double r = mtl::infinity_norm(x);
cout << r << endl;
The output is:
22.3607

14.2.5 sum

r ← Σ_i x_i

template <class Vector>
T sum(const Vector& x)
T = typename Vector::value_type
This operation calculates the sum of the elements in the vector. The initial
value of r is obtained via binary_op_trait::identity().
Requirements on Types
• Vector must be a model of the Vector concept.
• The value type of Vector must be a model of Addable.

Complexity
Linear. The algorithm performs N loads and additions.
Example
The sum of an arithmetic series, Σ_{i=1}^{N} i = N(N+1)/2.

typedef mtl::vector::type Vec;
const int N = 10;
Vec x(N);
for (int i = 0; i < N; ++i)
x[i] = i + 1;
int r = mtl::sum(x);
assert(r == N * (N + 1) / 2);

14.2.6 sum squares

r ← Σ_i x_i^2

template <class Vector>
T sum_squares(const Vector& x)
T = typename Vector::value_type
This operation calculates the sum of the squares of the elements in the vector.
The initial value of r is obtained via binary_op_trait::identity().
Requirements on Types
• Vector must be a model of the Vector concept.
• The value type of Vector must be a model of Addable and Multipliable.
Complexity
Linear. The algorithm performs N loads, additions, and multiplications.
Example
UNDER CONSTRUCTION
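Until the official example is written, here is a minimal sketch of a typical call; the Vec typedef is assumed to be the same dense integer vector type used in the sum example of Section 14.2.5.
Vec x(3);
x[0] = 1; x[1] = 2; x[2] = 3;
int r = mtl::sum_squares(x);
assert(r == 14);   // 1 + 4 + 9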

14.2.7 max

r ← max_i (x_i)
template <class Vector>
T max(const Vector& x)
T = typename Vector::value_type
This function returns the maximum value of any of the elements of vector x.
Requirements on Types
• Vector must be a model of the Vector concept.
• The value type of Vector must be LessThanComparable. For example,
complex numbers are not LessThanComparable.
Complexity
Linear.
Example
typedef mtl::vector >::type Vec;
Vec x = { 3, 1, 7, 4 };
int r = mtl::max(x);
assert(r == 7);

14.2.8 min

r ← min_i (x_i)
template <class Vector>
T min(const Vector& x)
T = typename Vector::value_type
This function returns the minimum value of any of the elements of vector x.
Requirements on Types
• Vector must be a model of the Vector concept.
• The value type of Vector must be LessThanComparable. For example,
complex numbers are not LessThanComparable.
Complexity
Linear.

Example
typedef mtl::vector >::type Vec;
Vec x = { 3, 1, 7, 4 };
int r = mtl::min(x);
assert(r == 1);

14.2.9 max index

k ← arg max_i (x_i)
template <class Vector>
S max_index(const Vector& x)
S = typename Vector::size_type
This function finds the index (location) of the maximum element in vector
x.

Requirements on Types
• Vector must be a model of the Vector concept.
• The value type of Vector must be LessThanComparable.
Complexity
Linear.
Example
using boost::tie;
typedef mtl::vector >::type Vec;
Vec x = { 5.0, -7.0, -4.0, 6.0 , 0.0 };
int k;
k = mtl::max_index(x);
assert(k == 3);
k = mtl::max_index(abs(x));
assert(k == 1);

14.2.10 max with index

(k, x_k) ← arg max_i (x_i)

template <class Vector>
std::pair<S, T> max_with_index(const Vector& x)
S = typename Vector::size_type
T = typename Vector::value_type
This function finds the location (index) and value of the maximum element
in vector x.
Requirements on Types
• Vector must be a model of the Vector concept.
• The value type of Vector must be LessThanComparable.
Complexity
Linear.
Example
using boost::tie;
typedef mtl::vector >::type Vec;
Vec x = { 5.0, -7.0, -4.0, 6.0 , 0.0 };
int k;
double xk;
tie(k,xk) = mtl::max_with_index(x);
assert(k == 3 && xk == 6.0);
tie(k,xk) = mtl::max_with_index(abs(x));
assert(k == 1 && xk == 7.0);

14.2.11 min index

(k, x_k) ← arg min_i (x_i)
template <class Vector>
std::pair<S, T> min_index(const Vector& x)
S = typename Vector::size_type
T = typename Vector::value_type
This function finds the location (index) and value of the minimum element
in vector x.

Requirements on Types
• Vector must be a model of the Vector concept.
• The value type of Vector must be LessThanComparable.
Complexity
Linear.
Example
using boost::tie;
typedef mtl::vector >::type Vec;
Vec x = { 5.0, -7.0, -4.0, 6.0 , 0.0 };
int k;
double xk;
tie(k,xk) = mtl::min_index(x);
assert(k == 1 && xk == -7.0);
tie(k,xk) = mtl::min_index(abs(x));
assert(k == 4 && xk == 0.0);

14.3 Vector Arithmetic Operations

14.3.1 scale

x ← αx
template <class Vector, class T>
void scale(Vector& x, const T& alpha)
This operation multiplies each element of vector x by the scalar alpha.
Equivalent MXTL Expression
x *= alpha
Requirements on Types
• The type Vector must be a model of the Vector concept.
• The value type of Vector and type T must be Multipliable.
Complexity
Linear. The algorithm performs n loads, stores and multiplications if the vector
is dense, and nnz if the vector is sparse.
Example
Perform one of the steps in a Gaussian Elimination by scaling a row of the
matrix.
typedef mtl::matrix,
static_size<3,3>, row_major>::type Mat;
Mat A = { 5.0, 5.5, 6.0,
2.5, 3.0, 3.5,
1.0, 1.5, 2.0 };
double scal = A(0,0) / A(1,0);
mtl::scale(A[1], scal);
assert(A(0,0) == A(1,0));

14.3.2 add

y ← x + y
template <class VectorX, class VectorY>
void add(const VectorX& x, VectorY& y)

w ← x + y
template <class VectorX, class VectorY, class VectorW>
void add(const VectorX& x, const VectorY& y, VectorW& w)
w ← x + y + z
template <class VectorX, class VectorY, class VectorZ, class VectorW>
void add(const VectorX& x, const VectorY& y, const VectorZ& z, VectorW& w)
These algorithms perform element-wise addition of vectors, and place the
results either in vector y or in the output vector w.
Equivalent MXTL Expressions
add(x, y)          y += x
add(x, y, w)       w = x + y
add(x, y, z, w)    w = x + y + z

Preconditions
• For all versions: x.size() == y.size().
• For versions 2 and 3, x.size() == w.size().
• For version 3, x.size() == z.size().
Requirements on Types
• VectorX, VectorY, VectorZ and VectorW must be models of the Vector concept.
• The value types of VectorX, VectorY, and VectorZ must be Addable.
• The result type of the addition must be convertible to the value type of
the output vector, either VectorY or VectorW.
Complexity
Linear. The algorithm performs 2n loads and n additions and stores. If both
vectors are sparse, only O(nnz) operations are performed. In addition, if the
output vector is sparse, some allocation and/or deallocation may occur.
Examples
Add two vectors into a third.

typedef mtl::vector >::type Vec;
Vec x = { 1, 1, 1 };
Vec y = { 3, 3, 3 };
Vec w;
mtl::add(x, y, w);
cout << w << endl;
The output is:
[4,4,4]
Perform one of the steps in a Gaussian Elimination by subtracting one row of
the matrix from another.
typedef mtl::matrix,
static_size<3,3>, row_major>::type Mat;
Mat A = { 5.0, 5.5, 6.0,
5.0, 6.0, 7.0,
1.0, 1.5, 2.0 };
double scal = A(0,0) / A(1,0);
mtl::add(mtl::scaled(A[0],-1), A[1]);
assert(A(1,0) == 0.0);

14.3.3 iadd

y ← y + x|y
template <class VectorX, class VectorY>
void iadd(const VectorX& x, VectorY& y)
This algorithm performs incomplete addition for sparse vectors. This means
that only the elements of x that correspond to non-zero elements in y are added.
The non-zero structure of vector y is left unchanged.
Preconditions
• x.size() == y.size().
Requirements on Types
• VectorX and VectorY must be models of the Vector concept.
• The value types of VectorX and VectorY must be Addable.
Complexity
Linear. The algorithm performs 2nnz loads and nnz additions and stores.

Example
Add two sparse vectors, with respect to the sparsity structure of the destination
vector.
typedef std::pair<int, double> P;
const int n = 6, x_nnz = 3, y_nnz = 4;
P xp[] = { P(1,11.0), P(3,1.0), P(5,4.0) };
P yp[] = { P(0,4.0), P(3,2.0), P(4,6.0), P(5,3.0) };
typedef mtl::vector >::type Vec;
Vec x(n, x_nnz, xp), y(n, y_nnz, yp);
mtl::iadd(x, y);
cout << y << endl;
The output is:
[(0,4),(3,3),(4,6),(5,7)]

14.3.4 ele mult

w ← x ⊗ y
template <class VectorX, class VectorY, class VectorW>
void ele_mult(const VectorX& x, const VectorY& y, VectorW& w)
Element-wise multiply vectors x and y, placing the result in vector w.
Preconditions
• x.size() == y.size() && x.size() == w.size().
Requirements on Types
• VectorX, VectorY, and VectorW must be models of the Vector concept.
• The value types of VectorX and VectorY must be Multipliable.
• The result type of the multiply must be convertible to the value type of
VectorW.
Complexity
Linear. The algorithm performs 2n loads and n multiplies and stores if both
vectors are dense. If either input vector is sparse then the algorithm performs
2nnz loads and nnz multiplies, and either n or nnz stores depending on whether
the output vector is dense or sparse. Also, if the output vector is sparse some
allocation and/or deallocation may occur.

Example
typedef mtl::vector >::type Vec;
Vec x = { 2, 2, 2 };
Vec y = { 3, 3, 3 };
Vec w;
mtl::ele_mult(x, y, w);
cout << w << endl;
The output is:
[6,6,6]

14.3.5 ele div

w ← x ⊘ y
template <class VectorX, class VectorY, class VectorW>
void ele_div(const VectorX& x, const VectorY& y, VectorW& w)
Element-wise divide vectors x and y, placing the result in vector w. Note that
if vector y is sparse, a divide-by-zero error will surely occur. If vector x is sparse,
then INF will typically be assigned to many of the elements of w.
Preconditions
• x.size() == y.size() && x.size() == w.size().
Requirements on Types
• VectorX, VectorY, and VectorW must be models of the Vector concept.
• The value types of VectorX and VectorY must be Dividable.
• The result type of the divide must be convertible to the value type of
VectorW.
Complexity
Linear. The algorithm performs 2n loads and n divides and stores if both input
vectors are dense. If vector x is sparse, then the algorithm performs 2nnz loads,
nnz divides, and n stores.

Example
typedef mtl::vector >::type Vec;
Vec x = { 6, 6, 6 };
Vec y = { 2, 2, 2 };
Vec w;
mtl::ele_div(x, y, w);
cout << w << endl;
The output is:
[3,3,3]

Chapter 15

Matrix Operations


15.1 Matrix Data Movement Operations

15.1.1 set

aij ← α
template <class Matrix, class T>
void set(Matrix& A, const T& alpha)
This operation assigns the value of alpha to each stored element of the matrix
A. For some matrices (sparse, banded, unit diagonal, etc.), the post-condition
A(i,j) == alpha for all i=0...m-1,j=0...n-1 will not hold, as the non-stored
elements will still have their assumed value (typically zero or one).
Equivalent MXTL Expression
A = alpha
Requirements on Types
• Matrix must be a model of the Matrix concept.
• The type T must be convertible to the value type of Matrix.
Complexity
The algorithm will perform mn stores for a dense matrix, and nnz stores for a
sparse matrix.
Example
mtl::matrix::type A(2, 2);
mtl::set(A, 7.3);
cout << A << endl;
The output is:
[[7.3,7.3],
[7.3,7.3]]

15.1.2 copy

B←A
template <class MatrixA, class MatrixB>
void copy(const MatrixA& A, MatrixB& B)

This operation copies the elements of matrix A into matrix B. After the copy,
the following post-condition holds: A(i,j) == B(i,j) for all i=0...m-1,j=0...n-1.
The copy function is useful for converting from one matrix format to another.
For instance, during the initialization phase of some algorithm it may be more
efficient to use a sparse matrix that supports fast insertion, and then in the
computation phase change to another sparse matrix format that supports fast
traversal. When using this routine on banded or diagonal matrices some care
must be taken to avoid the situation where elements of matrix A map to positions
in matrix B that are out-of-bounds.
Preconditions
• A.nrows() == B.nrows() && A.ncols() == B.ncols()
Postconditions
• A(i,j) == B(i,j) for all i=0...m-1,j=0...n-1 (that is, if the value types of
the matrices are EqualityComparable).
Requirements on Types
• MatrixA and MatrixB must be models of the Matrix concept.
• typeid(MatrixA::orientation) == typeid(MatrixB::orientation)
• The value type of MatrixA must be convertible to the value type of MatrixB.
Complexity
The algorithm will perform mn stores for a dense matrix, and nnz stores for a
sparse matrix.
Example
typedef mtl::matrix,
array, row_major>::type TreeMat;
typedef mtl::matrix,
compressed<>, row_major>::type CompMat;
const int m = 5, n = 5;
TreeMat A(m,n);
CompMat B(m,n);
// insert some values at random locations
for (int i = 0; i < m*2; ++i)
A(rand() % 5, rand() % 5) = i;
mtl::copy(A, B);

assert(A == B);

15.1.3 swap

A↔B
template <class MatrixA, class MatrixB>
void swap(MatrixA& A, MatrixB& B)
This function exchanges the elements of matrix A with the elements of
matrix B. The routine assumes the two matrices have the same orientation (e.g.,
both row major) and that the non-zero structure of the two matrices is identical.
In other words, for sparse matrices only the values are swapped, not the
indices.
Preconditions
• A.nrows() == B.nrows() && A.ncols() == B.ncols()
• The non-zero structure of A and B must match.
Requirements on Types
• MatrixA and MatrixB must be models of the Matrix concept.
• typeid(MatrixA::orientation) == typeid(MatrixB::orientation)
• typeid(MatrixA::sparsity) == typeid(MatrixB::sparsity)
• The value type of MatrixA must be convertible to the value type of MatrixB
and vice versa.
Complexity
The algorithm will perform 2mn loads and stores for dense matrices, and 2nnz
loads and stores for sparse matrices.
Example
typedef mtl::matrix,
banded<>, row_major>::type BandedMatrix;
typedef mtl::matrix,
packed<>, row_major>::type PackedMatrix;
const int m = 4, n = 3, kl = 2, ku = 1;
BandedMatrix A(m, n, kl, ku);
PackedMatrix B(m, n, kl, ku);

A = 2.5;
B = 1.4;
mtl::swap(A, B);
// The elements in the non-zero band of A should be == 1.4
BandedMatrix::iterator Ai = A.begin();
for (; Ai != A.end(); ++Ai) {
BandedMatrix::Row::iterator Aij = (*Ai).begin();
for (; Aij != (*Ai).end(); ++Aij)
assert(*Aij == 1.4);
}

15.1.4 transpose

A ← A^T or equivalently aij ↔ aji
template <class Matrix>
void transpose(Matrix& A)
B ← A^T or equivalently bji ← aij
template <class MatrixA, class MatrixB>
void transpose(const MatrixA& A, MatrixB& B)
This function moves each element of the matrix to the mirror-image of its
location, across the main diagonal of the matrix. The element at (i,j) will
be moved to (j,i). Version 1 of the algorithm transposes the matrix in place,
while version 2 places the result in matrix B.
Preconditions
• For version 2, A.nrows() == B.ncols() && A.ncols() == B.nrows()
• Allow what kind of sparse/dense/banded/orientation combinations? JGS
Requirements on Types
• MatrixA and MatrixB must be models of the Matrix concept.
• The value type of MatrixA must be convertible to the value type of MatrixB
and vice versa.
• For version 1, if the matrix is static sized, it must also be square for the
transpose operation to work correctly.
Complexity
The algorithm performs O(mn) loads and stores.

Examples
Transpose a matrix in place.
typedef mtl::matrix,
static_size<3,3>, row_major>::type Matrix;
Matrix A = { 1, 2, 3,
4, 5, 6,
7, 8, 9 };
mtl::transpose(A);
cout << A << endl;
The output is:
[[1,4,7],
[2,5,8],
[3,6,9]]
Assign the transpose to a second matrix.
mtl::matrix::type A(3,2), B(2,3);
for (int i = 0; i < 3; ++i)
for (int j = 0; j < 2; ++j)
A(i,j) = i * 2 + j + 1;
mtl::transpose(A, B);
cout << A << endl << endl;
cout << B << endl;
The output is:
[[1,2,3],
[4,5,6]]
[[1,4],
[2,5],
[3,6]]

15.2 Matrix Norms

15.2.1 one norm

r ← ‖A‖_1 or equivalently r ← max_j Σ_i |aij|

template <class Matrix>
T one_norm(const Matrix& A)
VT = typename Matrix::value_type
T = typename unary_op_trait::result_type
This function returns the one norm of a matrix, which is the maximum of
the column sums.
Requirements on Types
• Matrix must be a model of the Matrix concept.
• The abs() function must be defined for the value type of Matrix.
• The result type of the abs() function must be Addable and LessThanComparable.
Complexity
O(mn) for dense matrices and O(nnz) for sparse.
Example
UNDER CONSTRUCTION
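Until the official example is filled in, the following sketch shows a typical call; Matrix is assumed to be a dense, row-major 2x2 matrix type built with the mtl::matrix type generator, as in the other examples of this chapter.
Matrix A(2, 2);
A(0,0) = 1.0;  A(0,1) = -2.0;
A(1,0) = 3.0;  A(1,1) = -4.0;
double r = mtl::one_norm(A);   // maximum column sum: max(1+3, 2+4)
cout << r << endl;             // should print 6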

15.2.2 frobenius norm

r ← ‖A‖_F or equivalently r ← √(Σ_ij |aij|^2)

template <class Matrix>
T frobenius_norm(const Matrix& A)
VT = typename Matrix::value_type
T = typename unary_op_trait::result_type
The Frobenius norm is calculated by taking the square root of the sum of the
squares of all the elements in the matrix.

Requirements on Types
• Matrix must be a model of the Matrix concept.
• The value type of Matrix must be Addable.
• The abs() function must be defined for the value type of Matrix.
• The sqrt() function must be defined for the value type of Matrix.
Complexity
O(mn) for dense matrices and O(nnz) for sparse.
Example
UNDER CONSTRUCTION
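Until the official example is filled in, here is a minimal sketch; Matrix is assumed to be the same dense 2x2 type as in the one norm sketch above.
Matrix A(2, 2);
A(0,0) = 1.0;  A(0,1) = 2.0;
A(1,0) = 2.0;  A(1,1) = 4.0;
double r = mtl::frobenius_norm(A);   // sqrt(1 + 4 + 4 + 16)
cout << r << endl;                   // should print 5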

15.2.3 infinity norm

r ← ‖A‖_∞ or equivalently r ← max_i Σ_j |aij|

template <class Matrix>
T infinity_norm(const Matrix& A)
VT = typename Matrix::value_type
T = typename unary_op_trait::result_type
The infinity norm of a matrix is the maximum row sum.
Requirements on Types
• Matrix must be a model of the Matrix concept.
• The abs() function must be defined for the value type of Matrix.
• The result type of the abs() function must be Addable and LessThanComparable.
Complexity
O(mn) for dense matrices and O(nnz) for sparse.
Example
UNDER CONSTRUCTION
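Until the official example is filled in, here is a minimal sketch; Matrix is assumed to be the same dense 2x2 type as in the one norm sketch above.
Matrix A(2, 2);
A(0,0) = 1.0;  A(0,1) = -2.0;
A(1,0) = 3.0;  A(1,1) = -4.0;
double r = mtl::infinity_norm(A);   // maximum row sum: max(1+2, 3+4)
cout << r << endl;                  // should print 7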

15.3 Element-wise Arithmetic Operations

15.3.1 scale

A ← αA
template <class Matrix, class T>
void scale(Matrix& A, const T& alpha)
This function multiplies each element in matrix A by alpha. For matrices
with special structure (sparse, banded, etc.) only the stored elements are scaled.
Requirements on Types
• Matrix must be a model of the Matrix concept.
• The value type of Matrix and type T must be Multipliable.
Complexity
The algorithm performs O(mn) loads, stores, and multiplies for dense matrices
and O(nnz) for sparse.
Example
Multiply the elements of a sparse matrix by 2.
typedef mtl::matrix,
compressed<>, row_major>::type Matrix;
Matrix A(3,3);
A(0,1) = 2.5;
A(1,0) = 0.2;
A(1,2) = 1.4;
A(2,0) = 4.1;

mtl::scale(A, 2.0);
cout << A << endl;

The output is:
[[(1,5.0)],
[(0,0.4),(2,2.8)],
[(0,8.2)]]

15.3.2 add

B ← A + B
template <class MatrixA, class MatrixB>
void add(const MatrixA& A, MatrixB& B)
C ← A + B
template <class MatrixA, class MatrixB, class MatrixC>
void add(const MatrixA& A, const MatrixB& B, MatrixC& C)
This function adds the elements of the two matrices, and assigns the result
to matrix B in version 1 or to matrix C in version 2. If the destination matrix
is banded, the other matrices must have the same shape (bandwidth) so as to
avoid assignment to out-of-bounds elements. If the destination matrix is sparse,
then some allocation and/or deallocation may occur.
Equivalent MXTL Expressions
add(A, B)       B += A
add(A, B, C)    C = A + B

Preconditions
• For version 1, A.nrows() == B.nrows() && A.ncols() == B.ncols().
• For version 2, A.nrows() == B.nrows() && A.ncols() == B.ncols() && A.nrows()
== C.nrows() && A.ncols() == C.ncols().
Requirements on Types
• MatrixA, MatrixB, and MatrixC must be models of the Matrix concept.
• typeid(MatrixA::orientation) == typeid(MatrixB::orientation).
• For version 2, typeid(MatrixA::orientation) == typeid(MatrixC::orientation).
• The value type of MatrixA and MatrixB must be Addable.
• The result type of the addition must be convertible to the value type of
the output matrix.
Complexity
The algorithm performs O(mn) loads, stores, and additions for dense matrices,
and O(nnz) for sparse.

Example
const int m = 2, n = 2;
typedef mtl::matrix,
static_size >::type Matrix;
Matrix A = { 1, 1,
1, 1 };
Matrix B = { 3, 3,
3, 3 };
Matrix C;
mtl::add(A, B, C);
for (int i = 0; i < m; ++i)
for (int j = 0; j < n; ++j)
assert(C(i,j) == 4);

15.3.3 iadd

B ← B + A|B
template <class SparseMatrixA, class SparseMatrixB>
void iadd(const SparseMatrixA& A, SparseMatrixB& B)
This function performs an incomplete addition of the sparse matrix A to
sparse matrix B according to the non-zero structure of B. Elements of A are
added to B only if they match the indices of existing non-zeroes in B. The non-zero structure of B is not changed.
Preconditions
• A.nrows() == B.nrows() && A.ncols() == B.ncols().
Requirements on Types
• SparseMatrixA and SparseMatrixB must be models of the Matrix concept.
• The value type of SparseMatrixA and SparseMatrixB must be Addable.
• typeid(SparseMatrixA::orientation) == typeid(SparseMatrixB::orientation).
Complexity
The algorithm performs O(nnz) loads, stores, and additions.
Example
UNDER CONSTRUCTION
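Until the official example is written, the following sketch illustrates the semantics; Matrix is assumed to be the compressed, row-major sparse matrix type used in the scale example of Section 15.3.1.
Matrix A(2, 2), B(2, 2);
A(0,0) = 1.0;  A(0,1) = 2.0;    // A has non-zeroes at (0,0) and (0,1)
B(0,1) = 5.0;  B(1,1) = 7.0;    // B has non-zeroes at (0,1) and (1,1)
mtl::iadd(A, B);
// Only A(0,1) matches an existing non-zero of B, so B(0,1) should become 7.0,
// B(1,1) stays 7.0, and the non-zero structure of B is unchanged.
cout << B << endl;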

15.3.4 ele mult

B ← B ⊗ A or equivalently bij ← bij aij
template <class MatrixA, class MatrixB>
void ele_mult(const MatrixA& A, MatrixB& B)
This function performs element-wise multiplication of two matrices.
Preconditions
• A.nrows() == B.nrows() && A.ncols() == B.ncols().
Requirements on Types
• MatrixA and MatrixB must be models of the Matrix concept.
• The value type of MatrixA and MatrixB must be Multipliable.
• typeid(MatrixA::orientation) == typeid(MatrixB::orientation).
Complexity
The algorithm performs O(mn) loads, stores, and multiplications if both matrices
are dense, and O(nnz) if at least one matrix is sparse.
Example
UNDER CONSTRUCTION
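Until the official example is written, here is a minimal sketch; Matrix is assumed to be a dense 2x2 type as in the norm sketches of Section 15.2.
Matrix A(2, 2), B(2, 2);
mtl::set(A, 2.0);       // every element of A becomes 2
mtl::set(B, 3.0);       // every element of B becomes 3
mtl::ele_mult(A, B);    // bij <- bij * aij
cout << B << endl;      // every element of B should now be 6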

15.3.5 ele div

B ← B ⊘ A or equivalently bij ← bij /aij

template <class MatrixA, class MatrixB>
void ele_div(const MatrixA& A, MatrixB& B)
This function performs element-wise division of two matrices. It cannot be
used with matrices of special structure (sparse, banded, etc.).
Preconditions
• A.nrows() == B.nrows() && A.ncols() == B.ncols().
Requirements on Types
• MatrixA and MatrixB must be models of the Matrix concept.
• The value type of MatrixA and MatrixB must be Dividable.
• typeid(MatrixA::orientation) == typeid(MatrixB::orientation).

Complexity
The algorithm performs O(mn) loads, stores, and divisions if both matrices
are dense, and O(nnz) if at least one matrix is sparse.
Example
UNDER CONSTRUCTION
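Until the official example is written, here is a minimal sketch; Matrix is assumed to be a dense 2x2 type as in the norm sketches of Section 15.2 (recall that ele_div() is not meant for matrices of special structure).
Matrix A(2, 2), B(2, 2);
mtl::set(A, 2.0);       // every element of A becomes 2
mtl::set(B, 6.0);       // every element of B becomes 6
mtl::ele_div(A, B);     // bij <- bij / aij
cout << B << endl;      // every element of B should now be 3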

15.4 Rank Updates (Outer Products)

The result matrix must be a full matrix (not sparse or banded). If the matrix
is symmetric or Hermitian, then the vectors x and y must be equal to maintain
the symmetry of the resulting matrix.

15.4.1 rank one update

A ← A + xy^T or equivalently aij ← aij + xi yj
template <class Matrix, class VectorX, class VectorY>
void rank_one_update(Matrix& A, const VectorX& x, const VectorY& y)
The rank one update function takes the outer product of two vectors and
adds the result to matrix A.
Equivalent MXTL Expression
A += x * trans(y)
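No example is given for the rank updates; the following sketch shows a typical call to rank_one_update(), assuming Matrix is a dense 2x2 type and Vec a dense vector of doubles, as used elsewhere in this manual.
Matrix A(2, 2);
Vec x(2), y(2);
mtl::set(A, 0.0);
x[0] = 1.0;  x[1] = 2.0;
y[0] = 3.0;  y[1] = 4.0;
mtl::rank_one_update(A, x, y);
// aij should now equal xi * yj, i.e. A = [[3,4],[6,8]]
cout << A << endl;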

15.4.2 rank one conj

A ← A + xy^H or equivalently aij ← aij + xi conj(yj)
template <class Matrix, class VectorX, class VectorY>
void rank_one_conj(Matrix& A, const VectorX& x, const VectorY& y)
The rank one update function takes the outer product of vector x and the
conjugate of vector y and adds the result to matrix A.
Equivalent MXTL Expression
A += x * conj(trans(y))

15.4.3 rank two update

A ← A + xy^T + yx^T or equivalently aij ← aij + xi yj + yi xj
template <class Matrix, class VectorX, class VectorY>
void rank_two_update(Matrix& A, const VectorX& x, const VectorY& y)
Equivalent MXTL Expression
A += x * trans(y) + y * trans(x)

15.4.4 rank two conj

A ← A + xy^H + yx^H or equivalently aij ← aij + xi conj(yj) + yi conj(xj)
template <class Matrix, class VectorX, class VectorY>
void rank_two_conj(Matrix& A, const VectorX& x, const VectorY& y)

Equivalent MXTL Expression
A += x * conj(trans(y)) + y * conj(trans(x))


15.5 Triangular Solves (Forward & Backward Substitution)

15.5.1 tri solve

x ← T^{-1} x
template <class Matrix, class Vector>
void tri_solve(const Matrix& T, Vector& x)
B ← T^{-1} B
template <class MatrixT, class MatrixB>
void tri_solve(const MatrixT& T, MatrixB& B, left_side)
B ← B T^{-1}
template <class MatrixT, class MatrixB>
void tri_solve(const MatrixT& T, MatrixB& B, right_side)
The triangular solve routines perform forward (for lower triangular matrices)
or backward (for upper triangular matrices) substitution. The routines assume
that the elements on the main diagonal are non-zero and do not check for this. This
is because tri_solve() is typically used after the matrix has been factored, and
the factoring routine will ensure that the main diagonal is non-zero. In addition,
checking for zeroes would cause a small performance degradation.
Examples
Perform forward substitution on a lower-triangular matrix.
typedef mtl::matrix,
packed<>, row_major>::type Matrix;
typedef mtl::vector > Vector;
const int N = 3;
Matrix A(N);
A(0,0) = 1;
A(1,0) = 2; A(1,1) = 4;
A(2,0) = 3; A(2,1) = 5; A(2,2) = 7;

Vector b = { 7, 46, 124};
mtl::tri_solve(A, b);
// The solution should be { 7, 8, 9 }
Vector x = { 7, 8, 9 };
assert(b == x);

Solve the triangular system for multiple right-hand-sides.
typedef mtl::matrix::type Matrix;
typedef mtl::matrix,
packed<> >::type TriMatrix;
const int m = 3, n = 2;
TriMatrix U(m,m);
Matrix B(m,n);
U(0,0) = 1;  U(0,1) = 2;  U(0,2) = 4;
             U(1,1) = 3;  U(1,2) = 5;
                          U(2,2) = 6;
B(0,0) = 7;  B(0,1) = 34;
B(1,0) = 8;  B(1,1) = 42;
B(2,0) = 6;  B(2,1) = 36;

cout << U << endl << endl;
cout << B << endl << endl;
mtl::tri_solve(U, B, left_side());
cout << B << endl;
The output is:
[[1,2,4],
[0,3,5],
[0,0,6]]
[[7,34],
[8,42],
[6,36]]
[[1,2],
[1,4],
[1,6]]

15.6 Matrix-Vector Multiplication

15.6.1 mult

y ← Ax
template <class Matrix, class VectorX, class Vector>
void mult(const Matrix& A, const VectorX& x, Vector& y)
Equivalent MXTL Expression
y = A * x
Example
UNDER CONSTRUCTION
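Until the official example is filled in, here is a minimal sketch; Matrix is assumed to be a dense 2x2 row-major type and Vector a dense vector of doubles, as used elsewhere in this manual.
Matrix A(2, 2);
Vector x(2), y(2);
A(0,0) = 1.0;  A(0,1) = 2.0;
A(1,0) = 3.0;  A(1,1) = 4.0;
x[0] = 1.0;  x[1] = 1.0;
mtl::mult(A, x, y);
// y should now hold the row sums of A: y[0] == 3, y[1] == 7
cout << y << endl;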

15.6.2 mult add

y ← Ax + y
template <class Matrix, class VectorX, class Vector>
void mult_add(const Matrix& A, const VectorX& x, Vector& y)
z ← Ax + y
template <class Matrix, class VectorX, class VectorY, class VectorZ>
void mult_add(const Matrix& A, const VectorX& x,
const VectorY& y, VectorZ& z)
Equivalent MXTL Expression
mult_add(A, x, y)       y += A * x
mult_add(A, x, y, z)    z = A * x + y

Example
typedef mtl::matrix,
banded<>, row_major>::type Matrix;
typedef mtl::vector::type Vector;
const int m = 5, n = 5, kl = 0, ku = 2;
Matrix A(m, n, kl, ku);
Vector y(m);
Vector x(n);
int val = 1;
Matrix::iterator ri = A.begin();
while (ri != A.end()) {

Matrix::Row::iterator i = (*ri).begin();
while (i != (*ri).end())
*i++ = val;
++val;
++ri;
}
for (int i = 0; i < n; ++i)
x[i] = i + 1;
mtl::set(y, 1);
cout << A << endl << endl;
cout << x << endl << endl;
cout << y << endl << endl;
mtl::mult_add(A, scaled(x, 2), scaled(y, 3), y);
cout << y << endl;
The output is:
[[1,1,1,],
[2,2,2,],
[3,3,3,],
[4,4,],
[5,]]
[1,2,3,4,5,]
[1,1,1,1,1,]
[15,39,75,75,53,]


15.7 Matrix-Matrix Multiplication

15.7.1 mult

C ← AB
template <class MatrixA, class MatrixB, class MatrixC>
void mult(const MatrixA& A, const MatrixB& B, MatrixC& C)
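No examples accompany the routines in this section yet; the sketch below shows a typical call to the first form, assuming MatrixA, MatrixB, and MatrixC are the same dense 2x2 row-major type used in the earlier sketches of this chapter.
Matrix A(2, 2), B(2, 2), C(2, 2);
A(0,0) = 1.0;  A(0,1) = 2.0;
A(1,0) = 3.0;  A(1,1) = 4.0;
mtl::set(B, 1.0);        // B is the all-ones matrix
mtl::mult(A, B, C);      // C <- A B
// each column of C should equal the row sums of A: C = [[3,3],[7,7]]
cout << C << endl;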

15.7.2 mult add

C ← C + AB
template <class MatrixA, class MatrixB, class MatrixC>
void mult_add(const MatrixA& A, const MatrixB& B, MatrixC& C)

15.7.3 lrdiag mult

C ← D_L A D_R
template <class MatrixA, class MatrixDr, class MatrixDl, class MatrixC>
void lrdiag_mult(const MatrixA& A, const MatrixDr& Dr,
const MatrixDl& Dl, MatrixC& C)

15.7.4 lrdiag mult add

C ← C + D_L A D_R
template <class MatrixA, class MatrixDr, class MatrixDl, class MatrixC>
void lrdiag_mult_add(const MatrixA& A, const MatrixDr& Dr,
const MatrixDl& Dl, MatrixC& C)

15.7.5 babt mult

C ← B A B^T
template <class MatrixA, class MatrixB, class MatrixC>
void babt_mult(const MatrixA& A, const MatrixB& B, MatrixC& C)


15.8 Miscellaneous Matrix Operations

15.8.1 trace

r ← Σ_i aii

template <class FastDiagMatrix>
T trace(const FastDiagMatrix& A)
T = typename FastDiagMatrix::value_type
This function returns the sum of the elements in the main diagonal of the
matrix.
Requirements on Types
• The matrix must be a model of the FastDiagMatrix concept, which means
there must be a diag(A) function available to provide access to the main
diagonal.
Complexity
Linear. It will perform min(m, n) additions and loads.
Example
The implementation of this function is trivial:
namespace mtl {
template <class FastDiagMatrix>
typename FastDiagMatrix::value_type
trace(const FastDiagMatrix& A)
{
return sum(diag(A));
}
}


Chapter 16

Miscellaneous Operations


16.1 Basic Transformations

16.1.1 givens

template <class T, class Real>
void givens(T a, T b, Real& c, T& s, T& r);
This function generates a Givens rotation which can be used to selectively
zero elements in a vector. More specifically, this function computes c, s, and r
such that


[ c  s ; −s  c ]^T  [ a ; b ]  =  [ r ; 0 ] .

Requirements on Types
T      is the element type, which can be real or complex.
Real   is a real number type.

Complexity
Constant time.
Example
const int N = 5;
double dx[] = { 1, 2, 3, 4, 5 };
double dy[] = { 2, 4, 8, 16, 32};
vector >::type x(dx, N), y(dy, N);
cout << x << endl << y << endl;
double a = x[N-1];
double b = y[N-1];
double c, s, r;
givens(a, b, c, s, r);
givens_apply(c, s, x, y);
cout << x << endl << y << endl;

The output is:
[1 2 3 4 5 ]
[2 4 8 16 32 ]
[2.1304 4.2608 8.36723 16.4257 32.3883 ]
[-0.679258 -1.35852 -1.72902 -1.48202 0 ]

16.1.2 givens apply

template <class Real, class T, class VectorX, class VectorY>
void givens_apply(Real c, T s, VectorX& x, VectorY& y);
This function applies the givens rotation generated by givens() and stored
in c and s to the vectors x and y.
Requirements on Types
T         is the element type, which can be real or complex.
VectorX   is a type that models Vector. VectorX::value_type should be the same type as T.
VectorY   is a type that models Vector. VectorY::value_type should be the same type as T.
Real      is a real number type.

Preconditions
• size(x) == size(y)
Complexity
Linear.

16.1.3 house

template <class T, class Vector, class Real>
void house(T alpha, Vector& x, T& tau, Real& beta);
The Householder transformation (reflection) is used to zero out (annihilate)
components of a vector (typically in rows or columns of a matrix). This is
useful in algorithms like QR factorization. The Householder transformation is
described by the Householder matrix H such that Hx yields a vector with all
zeroes except x[0].

 

H [ α ; x[1 . . . n−1] ]  =  [ β ; 0 ],        H H^T = I

The Householder matrix is stored in the Householder vector v and then generated
implicitly during application, since

H  = I − τ v v^T,                      τ = 2 / (v^T v)
HA = (I − τ v v^T) A = A − v w^T,      w = τ A^T v
AH = A (I − τ v v^T) = A − w v^T,      w = τ A v


This routine takes as input the vector x in two pieces: alpha, which is from
x[0], and x, which is x[1 . . . n − 1]. The output is tau, beta, and x, which has
been overwritten by the vector v[1 . . . n − 1]. Before applying the transform
(with either house_apply_left() or house_apply_right()) you need to set x[0]
to 1. The algorithm used to implement this function is the same as that of
LAPACK’s xLARFG routine.
Requirements on Types
T        is the element type, which can be real or complex.
Vector   is a type that models Vector.
Real     is a real number type.

Complexity
Linear.
Examples
Example of generating a Householder vector and applying it to the original
vector, annihilating all the elements but the first.
double x_[] = { 3, 1, 5, 1 };
vector >::type x(x_,4);
vector::type v(4);
double tau, beta;
v = x;
house(x[0], v[range(1,4)], tau, beta); // generate Householder transform
v[0] = 1.0;
house_apply_left(v, tau, x);    // x = H * x

cout << x << endl;

The output is:
[-6 0 0 0]

The Householder bidiagonalization, Algorithm 5.4.2 from [6]. The somewhat
cumbersome interface for house is necessitated by the desire to make this algorithm work in-place. The matrix A is overwritten by the bidiagonal result and
is also overwritten by the Householder vectors used to get there.
template <class Matrix>
void bidiagonalize(Matrix& A)
{
typedef typename matrix_traits<Matrix>::element_type T;
typename Matrix::size_type M = A.nrows(), N = A.ncols(), j;
T tau, alpha;
typename magnitude<T>::type beta;
for (j = 0; j < N; ++j) {

alpha = A(j,j);
house( alpha, A(range(j+1,M),j), tau, beta );
A(j,j) = one(alpha);
house_apply_left( A(range(j,M),j), tau,
A(range(j,M),range(j,N)) );
A(j,j) = beta;
if (j <= N - 2) {
alpha = A(j,j+1);
house( alpha, A(j,range(j+2,N)), tau, beta );
A(j,j+1) = one(alpha);
house_apply_right( trans(A(j,range(j+1,N))), tau,
A(range(j,M),range(j+1,N)) );
A(j,j+1) = beta;
}
}
}

16.1.4 house apply left

A ← HA
template <class Vector, class T, class Matrix>
void house_apply_left(const Vector& v, T tau, Matrix& A);
This function applies the Householder transformation stored in v (which can
be created with the house() function) to the Matrix A. The implementation of
the application consists of a matrix-vector multiplication and a rank-1 update.
HA = (I − τ v v^T) A = A − v w^T,      w = τ A^T v

See the documentation for house() for more details and examples.
Preconditions
• size(v) == nrows(A)
Complexity
O(mn)

16.1.5 house apply right

A ← AH
template <class Vector, class T, class Matrix>
void house_apply_right(const Vector& v, T tau, Matrix& A);


This function applies the Householder transformation stored in v (which can
be created with the house() function) to the Matrix A. The implementation of
the application consists of a matrix-vector multiplication and a rank-1 update.
AH = A (I − τ v v^T) = A − w v^T,      w = τ A v

See the documentation for house() for more details and examples.
Complexity
O(mn)
Preconditions
• size(v) == ncols(A)

Appendix A

Concept Checks for STL

A.1 STL Basic Concept Checks

A.1.1 Assignable

template <class T>
struct Assignable {
void constraints() {
a = a;
// require assignment operator
T c(a);
// require copy constructor
const_constraints(a);
ignore_unused_variable_warning(c);
}
void const_constraints(const T& b) {
a = b;
// const required for argument to assignment
T c(b);
// const required for argument to copy constructor
ignore_unused_variable_warning(c);
}
T a;
};

A.1.2 DefaultConstructible

template <class T>
struct DefaultConstructible {
void constraints() {
T a;
// require default constructor
}
};

A.1.3 CopyConstructible

template <class T>
struct CopyConstructible {
void constraints() {
T a(b);               // require copy constructor
T* ptr = &a;          // require address of operator
const_constraints(a);
}
void const_constraints(const T& a) {
T c(a);               // require const copy constructor
const T* ptr = &a;    // require const address of operator
}
T b;
};

A.1.4 EqualityComparable

template <class T>
struct EqualityComparable {
void constraints() {
r = a == b;
// require equality operator
r = a != b;
// require inequality operator
}
T a, b;
bool r;
};

A.1.5 LessThanComparable

template <class T>
struct LessThanComparable {
void constraints() {
// require comparison operators
r = a < b || a > b || a <= b || a >= b;
}
T a, b;
bool r;
};

A.1.6 Generator

template 
struct Generator {
void constraints() {
r = f();
// require operator() member function
}
Func f;
Ret r;
};

A.1.7 UnaryFunction

template 
struct UnaryFunction {
void constraints() {
r = f(arg);
// require operator()
}
Func f;
Arg arg;
Ret r;
};

A.1.8 BinaryFunction

template 
struct BinaryFunction {
void constraints() {
r = f(first, second); // require operator()
}
Func f;
First first;
Second second;
Ret r;
};

A.1.9 UnaryPredicate

template 
struct UnaryPredicate {
void constraints() {
r = f(arg);
// require operator() returning bool
}
Func f;
Arg arg;
bool r;
};

A.1.10 BinaryPredicate

template 
struct BinaryPredicate {
void constraints() {
r = f(a, b);
// require operator() returning bool
}
Func f;
First a;
Second b;
bool r;
};


A.2 STL Iterator Concept Checks

A.2.1 TrivialIterator

template <class T>
struct TrivialIterator {
CLASS_REQUIRES(T, Assignable);
CLASS_REQUIRES(T, DefaultConstructible);
CLASS_REQUIRES(T, EqualityComparable);
void constraints() {
typedef typename std::iterator_traits<T>::value_type V;
(void)*i;
// require dereference operator
}
T i;
};

A.2.2 Mutable-TrivialIterator

template <class T>
struct Mutable_TrivialIterator {
CLASS_REQUIRES(T, TrivialIterator);
void constraints() {
*i = *j;
// require dereference and assignment
}
T i, j;
};

A.2.3 InputIterator

template <class T>
struct InputIterator {
CLASS_REQUIRES(T, TrivialIterator);
void constraints() {
// require iterator_traits typedef’s
typedef typename std::iterator_traits<T>::difference_type D;
typedef typename std::iterator_traits<T>::reference R;
typedef typename std::iterator_traits<T>::pointer P;
typedef typename std::iterator_traits<T>::iterator_category C;
REQUIRE2(typename std::iterator_traits<T>::iterator_category,
std::input_iterator_tag, Convertible);
++i;
// require preincrement operator
i++;
// require postincrement operator
}
T i;
};

A.2.4 OutputIterator

template <class T>
struct OutputIterator {
CLASS_REQUIRES(T, Assignable);
void constraints() {
(void)*i;             // require dereference operator
++i;                  // require preincrement operator
i++;                  // require postincrement operator
*i++ = *j;            // require postincrement and assignment
}
T i, j;
};

A.2.5 ForwardIterator

template <class T>
struct ForwardIterator {
CLASS_REQUIRES(T, InputIterator);
void constraints() {
REQUIRE2(typename std::iterator_traits<T>::iterator_category,
std::forward_iterator_tag, Convertible);
}
};

A.2.6 Mutable-ForwardIterator

template <class T>
struct Mutable_ForwardIterator {
CLASS_REQUIRES(T, ForwardIterator);
CLASS_REQUIRES(T, OutputIterator);
void constraints() { }
};

A.2.7 BidirectionalIterator

template <class T>
struct BidirectionalIterator {
CLASS_REQUIRES(T, ForwardIterator);
void constraints() {
REQUIRE2(typename std::iterator_traits<T>::iterator_category,
std::bidirectional_iterator_tag, Convertible);
--i;
// require predecrement operator
i--;
// require postdecrement operator
}
T i;
};

A.2.8 Mutable-BidirectionalIterator

template <class T>
struct Mutable_BidirectionalIterator {

CLASS_REQUIRES(T, BidirectionalIterator);
CLASS_REQUIRES(T, Mutable_ForwardIterator);
void constraints() {
*i-- = *i;
// require postdecrement and assignment
}
T i;

};

A.2.9 RandomAccessIterator

template <class T>
struct RandomAccessIterator {
CLASS_REQUIRES(T, BidirectionalIterator);
CLASS_REQUIRES(T, LessThanComparable);
void constraints() {
REQUIRE2(typename std::iterator_traits<T>::iterator_category,
std::random_access_iterator_tag, Convertible);
typedef typename std::iterator_traits<T>::reference R;
i += n;
// require assignment addition operator
i = i + n; i = n + i; // require addition with difference type
i -= n;
// require assignment subtraction operator
i = i - n;
// require subtraction with difference type
n = i - j;
// require difference operator
(void)i[n];
// require element access operator
}
T a, b;
T i, j;
typename std::iterator_traits<T>::difference_type n;
};

A.2.10 Mutable-RandomAccessIterator

template <class T>
struct Mutable_RandomAccessIterator {
CLASS_REQUIRES(T, RandomAccessIterator);
CLASS_REQUIRES(T, Mutable_BidirectionalIterator);
void constraints() {
i[n] = *i;
// require element access and assignment
}
T i;
typename std::iterator_traits<T>::difference_type n;
};

Bibliography
[1] M. H. Austern. Generic Programming and the STL. Professional computing
series. Addison-Wesley, 1999.
[2] P. N. Brown and A. C. Hindmarsh. Matrix-free methods for stiff systems
of ODE’s. SIAM J. Numer. Anal., 23(3):610–638, June 1986.
[3] K. B. Bruce, L. Cardelli, G. Castagna, J. Eifrig, S. F. Smith, V. Trifonov,
G. T. Leavens, and B. C. Pierce. On binary methods. Theory and Practice
of Object Systems, 1(3):221–242, 1995.
[4] S. Carr and K. Kennedy. Blocking linear algebra codes for memory hierarchies. In D. C. Sorensen, J. Dongarra, P. Messina, and R. G. Voigt,
editors, Proceedings of the 4th Conference on Parallel Processing for Scientific Computing, pages 400–405, Philadelphia, PA, USA, Dec. 1989. SIAM
Publishers.
[5] L. Carter, J. Ferrante, and S. F. Hummel. Hierarchical tiling for improved
superscalar performance. In Proceedings of the 9th International Symposium on Parallel Processing (IPPS’95), pages 239–245, Los Alamitos, CA,
USA, Apr. 1995. IEEE Computer Society Press.
[6] G. H. Golub and C. F. Van Loan. Matrix Computations. The Johns Hopkins
University Press, Baltimore, Maryland, 1983.
[7] P. R. Halmos. Finite-Dimensional Vector Spaces. The University Series in
Undergraduate Mathematics. van Nostrand Company, 2 edition, 1958.
[8] A. N. Michel and C. J. Herget. Applied Algebra and Functional Analysis.
Dover, 1981.
[9] Y. Saad and M. Schultz. GMRES: A generalized minimum residual algorithm for solving nonsymmetric linear systems. SIAM J. Sci. Statist.
Comput., 7(3):856–869, July 1986.
[10] G. Strang. Linear Algebra and Its Applications. Harcourt, Brace, Jovanovich, San Diego, third edition, 1988.
[11] B. L. van der Waerden. Algebra. Frederick Ungar Publishing, 1970.
