
The ECJ Owner’s Manual

A User Manual for the ECJ Evolutionary Computation Library

Sean Luke

Department of Computer Science

George Mason University

Manual Version 26

July 5, 2018

Where to Obtain ECJ

http://cs.gmu.edu/~eclab/projects/ecj/

Thanks to Carlotta Domeniconi.

Get the latest version of this document or suggest improvements here:

http://cs.gmu.edu/~eclab/projects/ecj/

This document is licensed under the Creative Commons Attribution-No Derivative Works 3.0 United States License, except for those portions of the work licensed differently as described in the next section. To view a copy of this license, visit http://creativecommons.org/licenses/by-nd/3.0/us/ or send a letter to Creative Commons, 171 Second Street, Suite 300, San Francisco, California, 94105, USA. A quick license summary:

• You are free to redistribute this document.

• You may not modify, transform, translate, or build upon the document except for personal use.

• You must maintain the author’s attribution with the document at all times.

• You may not use the attribution to imply that the author endorses you or your document use.

This summary is just informational: if there is any conflict in interpretation between the summary and the actual license, the actual license always takes precedence.

This document was produced in part through funding from grants 0916870 and 1317813 from the National Science Foundation.

Contents

1 Introduction
  1.1 About ECJ
  1.2 Overview
  1.3 Unpacking ECJ and Using the Tutorials
    1.3.1 The ec Directory, the CLASSPATH, and jar files
      1.3.1.1 The ec/display Directory: ECJ’s GUI
      1.3.1.2 The ec/app Directory: Demo Applications
    1.3.2 The docs Directory
      1.3.2.1 Tutorials

2 ec.Evolve and Utility Classes
  2.1 The Parameter Database
    2.1.1 Inheritance
    2.1.2 Kinds of Parameters
    2.1.3 Namespace Hierarchies and Parameter Bases
    2.1.4 Parameter Files in Jar Files
    2.1.5 Accessing Parameters
    2.1.6 Parameter Macros
      2.1.6.1 The Alias Macro
    2.1.7 Debugging Your Parameters
    2.1.8 Building a Parameter Database from Scratch
  2.2 Output
    2.2.1 Creating and Writing to Logs
    2.2.2 Quieting the Program
    2.2.3 The ec.util.Code Class
      2.2.3.1 Decoding the Hard Way
      2.2.3.2 Decoding the Easy Way
  2.3 Checkpointing
    2.3.1 Implementing Checkpointable Code
  2.4 Threads and Random Number Generation
    2.4.1 Random Numbers
    2.4.2 Selecting Randomly from Distributions
    2.4.3 Thread-Local Storage
    2.4.4 Multithreading Support
  2.5 Jobs
  2.6 The ec.Evolve Top-level
  2.7 Integrating ECJ with other Applications or Libraries
    2.7.1 Control by ECJ
    2.7.2 Control by another Application or Library

3 ec.EvolutionState and the ECJ Evolutionary Process
  3.1 Common Patterns
    3.1.1 Setup
    3.1.2 Singletons and Cliques
    3.1.3 Prototypes
    3.1.4 The Flyweight Pattern
    3.1.5 Groups
  3.2 Populations, Subpopulations, Species, Individuals, and Fitnesses
    3.2.1 Making Large Numbers of Subpopulations
    3.2.2 How Species Make Individuals
    3.2.3 Reading and Writing Populations and Subpopulations
    3.2.4 About Individuals
      3.2.4.1 Implementing an Individual
    3.2.5 About Fitnesses
  3.3 Initializers and Finishers
    3.3.1 Population Files and Subpopulation Files
  3.4 Evaluators and Problems
    3.4.1 Problems
    3.4.2 Implementing a Problem
  3.5 Breeders
    3.5.1 Breeding Pipelines and BreedingSources
      3.5.1.1 Auxiliary Data
    3.5.2 Selection Methods
      3.5.2.1 Implementing a Simple SelectionMethod
      3.5.2.2 Standard Classes
    3.5.3 Breeding Pipelines
      3.5.3.1 Implementing a Simple BreedingPipeline
      3.5.3.2 Standard Utility Pipelines
    3.5.4 Setting up a Pipeline
      3.5.4.1 A Genetic Algorithm Pipeline
      3.5.4.2 A Genetic Programming Pipeline
  3.6 Exchangers
  3.7 Statistics
    3.7.1 Creating a Statistics Chain
    3.7.2 Tabular Statistics
    3.7.3 Quieting the Statistics
    3.7.4 Implementing a Statistics Object
  3.8 Debugging an Evolutionary Process

4 Basic Evolutionary Processes
  4.1 Generational Evolution
    4.1.1 The Genetic Algorithm (The ec.simple Package)
    4.1.2 Evolution Strategies (The ec.es Package)
  4.2 Steady-State Evolution (The ec.steadystate Package)
    4.2.1 Steady State Statistics
    4.2.2 Producing More than One Individual at a Time
  4.3 Single-State Methods (The ec.singlestate Package)
    4.3.1 Simple Hill-Climbing and (1+1)
    4.3.2 Steepest Ascent Hill-Climbing and (1+λ)
    4.3.3 Steepest Ascent Hill-Climbing With Replacement and (1, λ)
    4.3.4 Simulated Annealing

5 Representations
  5.1 Vector and List Representations (The ec.vector Package)
    5.1.1 Vectors
      5.1.1.1 Initialization
      5.1.1.2 Crossover
      5.1.1.3 Multi-Vector Crossover
      5.1.1.4 Mutation
      5.1.1.5 Heterogeneous Vector Individuals
    5.1.2 Lists
      5.1.2.1 Utility Methods
      5.1.2.2 Initialization
      5.1.2.3 Crossover
      5.1.2.4 Mutation
    5.1.3 Arbitrary Genes: ec.vector.Gene
  5.2 Genetic Programming (The ec.gp Package)
    5.2.1 GPNodes, GPTrees, and GPIndividuals
      5.2.1.1 GPNodes
      5.2.1.2 GPTrees
      5.2.1.3 GPIndividual
      5.2.1.4 GPNodeConstraints
      5.2.1.5 GPTreeConstraints
      5.2.1.6 GPFunctionSet
    5.2.2 Basic Setup
      5.2.2.1 Defining GPNodes
    5.2.3 Defining the Representation, Problem, and Statistics
      5.2.3.1 GPData
      5.2.3.2 KozaFitness
      5.2.3.3 GPProblem
      5.2.3.4 GPNode Subclasses
      5.2.3.5 Statistics
    5.2.4 Initialization
    5.2.5 Breeding
    5.2.6 A Complete Example
    5.2.7 GPNodes in Depth
    5.2.8 GPTrees and GPIndividuals in Depth
      5.2.8.1 Pretty-Printing Trees
      5.2.8.2 GPIndividuals
    5.2.9 Ephemeral Random Constants
    5.2.10 Automatically Defined Functions and Macros
      5.2.10.1 About ADF Stacks
    5.2.11 Strongly Typed Genetic Programming
      5.2.11.1 Inside GPTypes
    5.2.12 Parsimony Pressure (The ec.parsimony Package)
  5.3 Grammatical Evolution (The ec.gp.ge Package)
    5.3.1 GEIndividuals, GESpecies, and Grammars
      5.3.1.1 Strong Typing
      5.3.1.2 ADFs and ERCs
    5.3.2 Translation and Evaluation
    5.3.3 Printing
    5.3.4 Initialization and Breeding
    5.3.5 Dealing with GP
    5.3.6 A Complete Example
      5.3.6.1 Grammar Files
    5.3.7 How Parsing is Done
  5.4 Push (The ec.gp.push Package)
    5.4.1 Push and GP
    5.4.2 Defining the Push Instruction Set
    5.4.3 Creating a Push Problem
    5.4.4 Building a Custom Instruction
  5.5 NEAT (The ec.neat Package)
    5.5.1 Building a NEAT Application
      5.5.1.1 Breeding
      5.5.1.2 Evaluation
  5.6 Rulesets and Collections (The ec.rule Package)
    5.6.1 RuleIndividuals and RuleSpecies
    5.6.2 RuleSets and RuleSetConstraints
    5.6.3 Rules and RuleConstraints
    5.6.4 Initialization
    5.6.5 Mutation
    5.6.6 Crossover

6 Parallel Processes
  6.1 Distributed Evaluation (The ec.eval Package)
    6.1.1 The Master
    6.1.2 Slaves
    6.1.3 Opportunistic Evolution
    6.1.4 Asynchronous Evolution
    6.1.5 The MasterProblem
    6.1.6 Noisy Distributed Problems
  6.2 Island Models (The ec.exchange Package)
    6.2.1 Islands
    6.2.2 The Server
      6.2.2.1 Synchronicity
    6.2.3 Internal Island Models
    6.2.4 The Exchanger

7 Additional Evolutionary Algorithms
  7.1 Coevolution (The ec.coevolve Package)
    7.1.1 Coevolutionary Fitness
    7.1.2 Grouped Problems
    7.1.3 One-Population Competitive Coevolution
    7.1.4 Multi-Population Coevolution
      7.1.4.1 Parallel and Sequential Coevolution
      7.1.4.2 Maintaining Context
    7.1.5 Performing Distributed Evaluation with Coevolution
  7.2 Spatially Embedded Evolutionary Algorithms (The ec.spatial Package)
    7.2.1 Implementing a Space
    7.2.2 Spatial Breeding
    7.2.3 Coevolutionary Spatial Evaluation
  7.3 Particle Swarm Optimization (The ec.pso Package)
  7.4 Differential Evolution (The ec.de Package)
    7.4.1 Evaluation
    7.4.2 Breeding
      7.4.2.1 The DE/rand/1/bin Operator
      7.4.2.2 The DE/best/1/bin Operator
      7.4.2.3 The DE/rand/1/either-or Operator
  7.5 Multiobjective Optimization (The ec.multiobjective Package)
      7.5.0.1 The MultiObjectiveFitness class
      7.5.0.2 The MultiObjectiveStatistics class
      7.5.0.3 The HypervolumeStatistics class
    7.5.1 Selecting with Multiple Objectives
      7.5.1.1 Pareto Ranking
      7.5.1.2 Archives
    7.5.2 NSGA-II/III (ec.multiobjective.nsga2 and ec.multiobjective.nsga3 Packages)
    7.5.3 SPEA2 (The ec.multiobjective.spea2 Package)
  7.6 Estimation of Distribution Algorithms
    7.6.1 PBIL
    7.6.2 CMA-ES
      7.6.2.1 Parameters
    7.6.3 iAMaLGaM IDEA
    7.6.4 DOvS
  7.7 Meta-Evolutionary Algorithms
    7.7.1 The Two Parameter Files
    7.7.2 Defining the Parameters
    7.7.3 Statistics and Messages
    7.7.4 Populations Versus Generations
    7.7.5 Using Meta-Evolution with Distributed Evaluation
    7.7.6 Customization
  7.8 Resets (The ec.evolve Package)

Chapter 1

Introduction

The purpose of this manual is to describe practically every feature of ECJ, an evolutionary computation

toolkit. It’s not a good choice of reading material if your goal is to learn the system from scratch. It’s very

terse, boring, and long, and not organized as a tutorial but rather as an encyclopedia. Instead, I refer you to

ECJ’s four tutorials and various other documentation that comes with the system. But when you need to

know about some particular gizmo that ECJ has available, this manual is where to look.

1.1 About ECJ

ECJ is an evolutionary computation framework written in Java. The system was designed for large, heavyweight experimental needs and provides tools for many popular EC algorithms and their conventions, with a particular emphasis on genetic programming. ECJ is free and open source, with a BSD-style academic license (AFL 3.0).

ECJ is now well over fifteen years old and is a mature, stable framework which has (fortunately) exhibited relatively few serious bugs over the years. Its design has readily accommodated many later additions, including multiobjective optimization algorithms, island models, master/slave evaluation facilities, coevolution, steady-state and evolution strategies methods, parsimony pressure techniques, and various new individual representations (for example, rule-sets). The system is widely used in the genetic programming community and is reasonably popular in the EC community at large. I myself have used it in over thirty or forty publications.

A toolkit such as this is not for everyone. ECJ was designed for big projects and to provide many facilities,

and this comes with a relatively steep learning curve. We provide tutorials and many example applications,

but this only partly mitigates ECJ’s imposing nature. Further, while ECJ is extremely “hackable”, the initial

development overhead for starting a new project is relatively large. As a result, while I feel ECJ is an excellent

tool for many projects, other tools might be more apropos for quick-and-dirty experimental work.

Why ECJ was Made

ECJ’s primary inspiration comes from lil-gp [ ], to which it owes much. Homage to lil-gp may be found in ECJ’s command-line facility, how it prints out messages, and how it stores statistics. Work on ECJ commenced in Fall 1998 after experiences with lil-gp in evolving simulated soccer robot teams [6]. This project involved heavily modifying lil-gp to perform parallel evaluations, a simple coevolutionary procedure, multiple threading, and strong typing. Such modifications made it clear that lil-gp could not be further extended without considerable effort, and that it would be worthwhile developing an “industrial-grade” evolutionary computation framework in which GP was one of a number of orthogonal features. I intended ECJ to provide at least ten years of useful life, and I believe it has performed well so far.

[Figure 1.1: flowchart. Boxes: Initializer (also initializes the Exchanger and Evaluator), Evaluator, Pre-Breeding Exchange, Breeding, Post-Breeding Exchange, Increment Generation, Optionally Checkpoint, Recover from Checkpoint (reinitializes the Exchanger and Evaluator), and Finisher (shuts down the Exchanger and Evaluator). Each box is annotated with its pre- and post-statistics hooks, and “Out of time or want to quit?” and “Want to quit?” tests lead to the Finisher.]

Figure 1.1 Top-Level Loop of ECJ’s SimpleEvolutionState class, used for basic generational EC algorithms. Various sub-operations are shown occurring before or after the primary operations. The full population is revised each iteration.

1.2 Overview

ECJ is a general-purpose evolutionary computation framework which attempts to permit as many valid combinations as possible of individual representation and breeding method, fitness and selection procedure, evolutionary algorithm, and parallelism.

Top-level Loop

ECJ hangs the entire state of the evolutionary run off of a single instance of a subclass of EvolutionState. This enables ECJ to serialize out the entire state of the system to a checkpoint file and to recover it from the same. The EvolutionState subclass chosen defines the kind of top-level evolutionary loop used in the ECJ process. We provide two such loops: a simple generational loop with optional elitism, and a steady-state loop.

Figure 1.1 shows the top-level loop of the simple generational EvolutionState. The loop iterates between breeding and evaluation, with an optional “exchange” period after each. Statistics hooks are called before and after each period of breeding, evaluation, and exchanging, as well as before and after initialization of the population and “finishing” (cleaning up prior to quitting the program).
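The ordering of phases and statistics hooks described here can be sketched in a few lines of Java. This is only an illustration under stated assumptions: the class GenerationalLoopSketch, its phase and run methods, and the event strings are hypothetical and are not part of ECJ. The sketch does no evolutionary work at all; it records the order in which each phase and its surrounding hooks would fire.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a generational top-level loop; none of these names are ECJ's API. */
class GenerationalLoopSketch {

    /** Appends one phase to the trace, bracketed by its pre- and post-statistics hooks. */
    static void phase(List<String> trace, String name, Runnable work) {
        trace.add("pre-" + name + " statistics");
        work.run();
        trace.add(name);
        trace.add("post-" + name + " statistics");
    }

    /** Runs the loop for the given number of generations and returns the order of events. */
    static List<String> run(int generations) {
        List<String> trace = new ArrayList<>();
        Runnable noOp = () -> { };                       // stand-in for the real work of a phase
        phase(trace, "initialization", noOp);
        int generation = 0;
        while (true) {
            phase(trace, "evaluation", noOp);
            if (generation >= generations - 1) break;    // out of time: stop after evaluating
            phase(trace, "post-evaluation exchange", noOp);
            phase(trace, "breeding", noOp);
            phase(trace, "post-breeding exchange", noOp);
            generation++;
        }
        phase(trace, "finishing", noOp);
        return trace;
    }

    public static void main(String[] args) {
        run(2).forEach(System.out::println);
    }
}
```

Checkpointing and early termination on finding an ideal individual are left out to keep the sketch short.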

Breeding and evaluation are handled by singleton objects known as the Breeder and Evaluator respectively. Likewise, population initialization is handled by an Initializer singleton, and finishing is done by a Finisher. Exchanges after breeding and after evaluation are handled by an Exchanger. The particular versions of these singleton objects are determined by the experimenter, though we provide versions which perform common tasks. For example, we provide a traditional-EA SimpleEvaluator, a steady-state EA SteadyStateEvaluator, a “single-population coevolution” CompetitiveEvaluator, and a multi-population coevolution MultiPopCoevolutionaryEvaluator, among others. There are likewise custom breeders and initializers for different functions. The Exchanger provides an opportunity for other hooks, notably internal and external island models. For example, post-breeding exchange might allow external immigrants to enter the population, while emigrants might leave the population during post-evaluation exchange. These singleton operators make up most of the high-level “verbs” in the ECJ system, as shown in Figure 1.2.

Parameterized Construction

ECJ is unusually heavily parameterized: practically every feature of the system is determined at runtime from a parameter. Parameters define the classes of objects, the specific subobjects they hold, and all of their initial runtime values. ECJ does this through a bootstrap class called Evolve, which loads a ParameterDatabase from runtime parameter files at startup. Using this database, Evolve constructs the top-level EvolutionState and tells it to “setup” itself. EvolutionState in turn calls subsidiary classes (such as Evaluator) and tells them to “setup” themselves from the database. This procedure continues down the chain until the entire system is constructed.
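The general idea of building an object graph from parameters can be illustrated with plain Java reflection. The sketch below is an assumption-laden stand-in, not ECJ's mechanism: it uses java.util.Properties in place of the ParameterDatabase, and the Setup interface, the State and RepeatingEvaluator classes, and the parameter names state, eval, and eval.repeats are all hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

/** Hypothetical sketch of parameter-driven construction; this is not ECJ's ParameterDatabase API. */
class ParameterSetupSketch {

    /** Anything that can configure itself from the parameters. */
    interface Setup {
        void setup(Properties params);
    }

    /** A subsidiary object whose concrete class is named by a parameter. */
    static class RepeatingEvaluator implements Setup {
        int repeats;

        @Override public void setup(Properties params) {
            repeats = Integer.parseInt(params.getProperty("eval.repeats", "1"));
        }
    }

    /** The top-level object: it constructs its evaluator from the parameters, then sets it up. */
    static class State implements Setup {
        Setup evaluator;

        @Override public void setup(Properties params) {
            evaluator = instantiate(params, "eval");
            evaluator.setup(params);              // setup continues down the chain
        }
    }

    /** Reads a class name stored under the key and creates an instance by reflection. */
    static Setup instantiate(Properties params, String key) {
        String className = params.getProperty(key);
        if (className == null) throw new IllegalArgumentException("missing parameter: " + key);
        try {
            return (Setup) Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot construct " + className, e);
        }
    }

    /** Loads the parameters, then builds and sets up the whole object graph. */
    static State bootstrap(String parameterText) {
        Properties params = new Properties();
        try {
            params.load(new StringReader(parameterText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        State state = (State) instantiate(params, "state");
        state.setup(params);
        return state;
    }

    public static void main(String[] args) {
        String text = "state = " + State.class.getName() + "\n"
                    + "eval = " + RepeatingEvaluator.class.getName() + "\n"
                    + "eval.repeats = 3\n";
        State state = bootstrap(text);
        System.out.println(((RepeatingEvaluator) state.evaluator).repeats);
    }
}
```

Because the concrete classes are named only in the parameter text, swapping in a different evaluator would require no change to the code that builds the system.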

State Objects

In addition to “verbs”, EvolutionState also holds “nouns”: the state objects representing the things being evolved. Specifically, EvolutionState holds exactly one Population, which contains one or more (typically one) Subpopulations. Multiple Subpopulations permit experiments in coevolution, internal island models, etc. Each Subpopulation holds some number of Individuals and the Species to which the Individuals belong. Species is a flyweight object for Individual: it provides a central repository for things common to many Individuals so they don’t have to each contain them in their own instances.

While running, numerous state objects must be created, destroyed, and recreated. As ECJ only learns the specific classes of these objects from the user-defined parameter file at runtime, it cannot simply construct them using Java’s new operator. Instead such objects are created by constructing a prototype object at startup time, and then using this object to stamp out copies of itself as often as necessary. For example, Species contains a prototypical Individual. When new Individuals must be created for a given Subpopulation, they are copied from the Subpopulation’s Species and then customized. This allows different Subpopulations to use different Individual representations.

In keeping with its philosophy of orthogonality, ECJ defines Fitnesses separate from Individuals (representations), and provides both single-objective and multi-objective Fitness subclasses. In addition to holding a prototypical Individual, a Species also holds the prototypical Fitness to be used with that kind of Individual.
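A minimal sketch of the prototype idea, assuming plain Java cloning, may make this concrete. The classes below borrow the names Species, Individual, and Fitness from the text but are hypothetical simplifications, not ECJ's classes; TaggedIndividual and the newIndividual method are invented for illustration.

```java
import java.util.Random;

/** Hypothetical sketch of prototype-based creation; the names echo the text but are not ECJ's classes. */
class PrototypeSketch {

    static class Fitness implements Cloneable {
        double value;

        @Override public Fitness clone() {
            try {
                return (Fitness) super.clone();
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e);      // cannot happen: this class is Cloneable
            }
        }
    }

    static class Individual implements Cloneable {
        Species species;
        Fitness fitness;
        boolean[] genome = new boolean[0];

        @Override public Individual clone() {
            try {
                Individual copy = (Individual) super.clone();
                copy.genome = genome.clone();     // deep-copy the mutable state
                copy.fitness = (fitness == null) ? null : fitness.clone();
                return copy;
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e);
            }
        }
    }

    /** A subclass that might be chosen at runtime; Species never names it in code. */
    static class TaggedIndividual extends Individual {
        String tag = "tagged";
    }

    /** Flyweight: holds what its individuals share, including both prototypes. */
    static class Species {
        final Individual prototypeIndividual;
        final Fitness prototypeFitness;
        final int genomeLength;

        Species(Individual prototypeIndividual, Fitness prototypeFitness, int genomeLength) {
            this.prototypeIndividual = prototypeIndividual;
            this.prototypeFitness = prototypeFitness;
            this.genomeLength = genomeLength;
        }

        /** Stamps out a copy of the prototype, then customizes it. */
        Individual newIndividual(Random rng) {
            Individual ind = prototypeIndividual.clone();   // the runtime class is preserved
            ind.species = this;
            ind.fitness = prototypeFitness.clone();
            ind.genome = new boolean[genomeLength];
            for (int i = 0; i < genomeLength; i++) ind.genome[i] = rng.nextBoolean();
            return ind;
        }
    }

    public static void main(String[] args) {
        Species species = new Species(new TaggedIndividual(), new Fitness(), 8);
        Individual ind = species.newIndividual(new Random(1));
        System.out.println(ind.getClass().getSimpleName() + " with " + ind.genome.length + " genes");
    }
}
```

Cloning keeps the prototype's runtime class, so Species can produce TaggedIndividual objects without ever calling that class's constructor by name.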

[Figure 1.2: diagram. Boxes: Evolve, EvolutionState, Initializer, Breeder, Evaluator, Finisher, Statistics, Exchanger, Problem, Breeding Pipeline, Population, Individual, Fitness, Mersenne Twister RNG, Output, Log, and Parameter Database. Edge labels: makes, applies, evaluates, updates, prototype, 0..n.]

Figure 1.2 Top-Level operators and utility facilities in EvolutionState, and their relationship to certain state objects.

Breeding

A Species holds a prototypical breeding pipeline which is cloned by the Breeder and used per-thread to breed individuals and form the next-generation population. Breeding pipelines are tree structures where a node in the tree filters incoming Individuals from its child nodes and hands them to its parents. The leaf nodes in the tree are SelectionMethods which simply choose Individuals from the old subpopulation and hand them off. There exist SelectionMethods which perform tournament selection, fitness proportional selection, truncation selection, etc. Nonleaf nodes in the tree are BreedingPipelines, many of which copy and modify their received Individuals before handing them to their parent nodes. Some BreedingPipelines are representation-independent: for example, MultiBreedingPipeline asks for Individuals from one of its children at random according to some probability distribution. But most BreedingPipelines act to mutate or cross over Individuals in a representation-dependent way. For example, the GP CrossoverPipeline asks for one Individual from each of its two children, which must be genetic programming Individuals, performs subtree crossover on those Individuals, then hands them to its parent.

A tree-structured breeding pipeline allows for a rich assortment of experimenter-defined selection and breeding processes. Further, ECJ’s pipeline is copy-forward: BreedingPipelines must ensure that they copy Individuals before modifying them or handing them forward, if they have not been already copied. This guarantees that new Individuals are copies of old ones in the population, and furthermore that multiple pipelines may operate on the same Subpopulation in different threads without the need for locking. ECJ may apply multiple threads to parallelize the breeding process without the use of Java synchronization at all.
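The tree structure and the copy-forward rule can be sketched with a two-node pipeline: a tournament-selection leaf feeding a bit-flip mutation node. Everything here is a hypothetical simplification rather than ECJ code: the Source interface, the Tournament and BitFlip classes, and the breed method are assumed names, and this BitFlip node always copies instead of checking whether a copy was already made further down the tree.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

/** Hypothetical sketch of a tree-structured, copy-forward breeding pipeline; not ECJ's classes. */
class PipelineSketch {

    /** A bit-string individual whose fitness is its number of true bits. */
    static final class Individual {
        final boolean[] genome;

        Individual(boolean[] genome) { this.genome = genome; }

        int fitness() {
            int ones = 0;
            for (boolean bit : genome) if (bit) ones++;
            return ones;
        }

        Individual copy() { return new Individual(genome.clone()); }
    }

    /** Any node of the pipeline tree: its parent asks it for one individual at a time. */
    interface Source {
        Individual produce(List<Individual> oldPopulation, Random rng);
    }

    /** Leaf node: tournament selection. It hands up an existing individual without copying it. */
    static final class Tournament implements Source {
        final int size;

        Tournament(int size) { this.size = size; }

        @Override public Individual produce(List<Individual> old, Random rng) {
            Individual best = old.get(rng.nextInt(old.size()));
            for (int i = 1; i < size; i++) {
                Individual rival = old.get(rng.nextInt(old.size()));
                if (rival.fitness() > best.fitness()) best = rival;
            }
            return best;
        }
    }

    /** Nonleaf node: copies whatever its child hands up, then flips bits in the copy only. */
    static final class BitFlip implements Source {
        final Source child;
        final double rate;

        BitFlip(Source child, double rate) { this.child = child; this.rate = rate; }

        @Override public Individual produce(List<Individual> old, Random rng) {
            Individual copy = child.produce(old, rng).copy();   // copy-forward: the original is never modified
            for (int i = 0; i < copy.genome.length; i++)
                if (rng.nextDouble() < rate) copy.genome[i] = !copy.genome[i];
            return copy;
        }
    }

    /** Fills a new population by repeatedly asking the root of the tree for an individual. */
    static List<Individual> breed(Source root, List<Individual> old, Random rng) {
        List<Individual> next = new ArrayList<>(old.size());
        while (next.size() < old.size()) next.add(root.produce(old, rng));
        return next;
    }

    public static void main(String[] args) {
        List<Individual> old = new ArrayList<>();
        for (int i = 0; i < 4; i++) old.add(new Individual(new boolean[6]));
        List<Individual> next = breed(new BitFlip(new Tournament(2), 0.5), old, new Random(1));
        for (Individual ind : next) System.out.println(ind.fitness());
    }
}
```

Since the old population is only ever read, several threads could each run their own pipeline and random number generator over it without synchronization, which is the property the copy-forward rule is meant to secure.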

[Figure 1.3: diagram. Boxes: EvolutionState, Population, Subpopulation, Individual, Species, Fitness, Breeding Pipeline, and Selection Method. Edge labels: prototype, flyweight, child of, uses, with multiplicities 1, 0..n, and 1..n.]

Figure 1.3 Top-Level data objects used in evolution.

Evaluation

The Evaluator performs evaluation of a population by passing one or (for coevolutionary evaluation) several Individuals to a Problem subclass which the Evaluator has cloned off of its prototype. Evaluation may also be done in multithreaded fashion with no locking, using one Problem per thread. Individuals may also undergo repeated evaluation in coevolutionary Evaluators of different sorts.

In most projects using ECJ, the primary task is to construct an appropriate Problem subclass. The task of the Problem is to assess the fitness of the Individual(s) and set its Fitness accordingly. Problem classes also report if the ideal Individual has been discovered.
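The one-Problem-per-thread arrangement might look like the following sketch. It is written under assumptions and is not ECJ's implementation: the Problem interface with its evaluate and copy methods, the MaxOnes example problem, and the strided division of the population among threads are all hypothetical choices made for illustration.

```java
/** Hypothetical sketch of lock-free multithreaded evaluation; these are not ECJ's classes. */
class EvaluationSketch {

    static final class Individual {
        final boolean[] genome;
        double fitness;
        boolean ideal;

        Individual(boolean... genome) { this.genome = genome; }
    }

    /** Assesses one individual and records the result on it. */
    interface Problem {
        void evaluate(Individual ind);
        Problem copy();                 // each thread works with its own copy of the prototype
    }

    /** Example problem: fitness is the number of true bits; an all-true genome is ideal. */
    static final class MaxOnes implements Problem {
        @Override public void evaluate(Individual ind) {
            int ones = 0;
            for (boolean bit : ind.genome) if (bit) ones++;
            ind.fitness = ones;
            ind.ideal = (ones == ind.genome.length);
        }

        @Override public Problem copy() { return new MaxOnes(); }
    }

    /** Evaluates the population on the given number of threads; returns true if an ideal individual exists. */
    static boolean evaluate(Individual[] population, Problem prototype, int threads) {
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            final int first = t;
            final Problem problem = prototype.copy();
            workers[t] = new Thread(() -> {
                // Each thread touches a disjoint set of individuals, so no locking is needed.
                for (int i = first; i < population.length; i += threads) problem.evaluate(population[i]);
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();          // join makes the workers' writes visible to this thread
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while evaluating", e);
            }
        }
        for (Individual ind : population) if (ind.ideal) return true;
        return false;
    }

    public static void main(String[] args) {
        Individual[] population = {
            new Individual(true, true), new Individual(true, false), new Individual(false, false)
        };
        System.out.println(evaluate(population, new MaxOnes(), 2));
    }
}
```

In this arrangement the only problem-specific code is the body of MaxOnes.evaluate, which mirrors the observation that writing the Problem is usually the main task.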

Utilities

In addition to its ParameterDatabase, ECJ also uses a checkpointable Output convenience facility

which maintains various streams, repairing them after checkpoint. Output also provides for message logging,

retaining in memory all messages during the run, so that on checkpoint recovery the messages are printed

out again as before. Other utilities include population distribution selectors, searching and sorting tools, etc.

The quality of a random number generator is important for a stochastic optimization system. As such, ECJ’s random number generator was the very first class written in the system: it is a Java implementation of the highly respected Mersenne Twister algorithm [ ] and is the fastest such implementation available. Since ECJ’s release, the ECJ MersenneTwister and MersenneTwisterFast classes have found their way into a number of unrelated public-domain systems, including the popular NetLogo multiagent simulator [ ]. MersenneTwisterFast is also shared with ECJ’s sister software, the MASON multiagent simulation toolkit [8].

Representations and Genetic Programming

ECJ allows you to specify any genome representation you

like. Standard representation packages in ECJ provide functionality for vectors of all Java data types;

arbitrary-length lists; trees; and collections of objects (such as rulesets).

ECJ is perhaps best known for its support of “Koza”-style tree-structured genetic programming representations. ECJ represents these individuals as forests of parse-trees, each tree equivalent to a single Lisp s-expression.

[Figure 1.4: A typed genetic programming parse tree.]

Figure 1.4 shows a parse-tree for a simple robot program, equivalent to the Lisp s-expression

(if (and on-wall (tick> 20)) (* (ir 3) 6) 2.3).

In C this might look like

(onWall && tick > 20) ? ir(3) * 6 : 2.3.

This notionally says “If I’m on the wall and my tick-count is greater than 20, then return the value of my

third infrared sensor times six, else return 2.3”. Such parse-trees are typically evaluated by executing their

programs in a test environment, and modiﬁed via subtree crossover (swapping subtrees among individuals)

or various kinds of mutation (replacing a subtree with a randomly-generated one, perhaps).
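To make the evaluation step concrete, here is a minimal Java sketch that builds the example program as nested nodes and runs it against a made-up robot state. Everything in it (ParseTreeSketch, Robot, BoolNode, NumNode) is hypothetical and exists only for illustration: none of it is ECJ's own GP machinery, and treating (ir 3) as index 3 of an array is an assumption.

```java
/** Hypothetical sketch of evaluating the Figure 1.4 program; none of these types are ECJ classes. */
final class ParseTreeSketch {
    /** Made-up test environment: the sensor state a program is evaluated against. */
    static final class Robot {
        final boolean onWall;
        final int tick;
        final double[] ir; // assumption: ir[3] plays the role of (ir 3)

        Robot(boolean onWall, int tick, double[] ir) {
            this.onWall = onWall;
            this.tick = tick;
            this.ir = ir;
        }
    }

    /** A node which returns a boolean when evaluated. */
    interface BoolNode { boolean eval(Robot robot); }

    /** A node which returns a float when evaluated. */
    interface NumNode { double eval(Robot robot); }

    /** Builds (if (and on-wall (tick> 20)) (* (ir 3) 6) 2.3) out of nested nodes. */
    static NumNode exampleTree() {
        BoolNode onWall = robot -> robot.onWall;
        BoolNode tickOver20 = robot -> robot.tick > 20;
        BoolNode and = robot -> onWall.eval(robot) && tickOver20.eval(robot);
        NumNode ir3 = robot -> robot.ir[3];
        NumNode six = robot -> 6.0;
        NumNode times = robot -> ir3.eval(robot) * six.eval(robot);
        NumNode otherwise = robot -> 2.3;
        // The root "if" node picks one of its two float-returning children.
        return robot -> and.eval(robot) ? times.eval(robot) : otherwise.eval(robot);
    }
}
```

Subtree crossover or mutation would then amount to swapping or replacing one of these nested node objects, which the sketch does not attempt.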

ECJ allows multiple subtrees for various experimental needs: Automatically Deﬁned Functions (ADFs —

a mechanism for evolving subroutine calls [

]), or parallel program execution, or evolving teams of programs.

Along with ADFs, ECJ provides built-in support for Automatically Deﬁned Macros (ADMs) [

] and

Ephemeral Random Constants (ERCs [3], such as the numbers 20, 3, 6, and 2.3 in Figure 1.4).

Genetic programming trees are constructed out of a “primordial soup” of function templates (such as on-wall or 2.3). Early forms of genetic programming were typeless: though such templates had a predefined arity (number of arguments), any node could be connected to any other. Many genetic programming needs require more constraints than this. For example, the node if might expect a boolean value in its first argument, and integers or floats in the second and third arguments, and return a float when evaluated. Similarly, and might take two booleans as arguments and return a boolean, while ∗ would take ints or floats as arguments and return a float.

Such types are often associated with the kinds of data passed from node to node, but they do not have to be. Typing might be used to constrain certain nodes to be evaluated in groups or in a certain order: for example, a function type-block might insist that its first argument be of type foo and its second argument be of type bar to make certain that a foo node be executed before a bar node.

ECJ permits a simple static typing mechanism called set-based typing, which is suitable for many such tasks. In set-based typing, the return type and argument types of each node are each defined to be sets of type symbols (for example, {bool}, {foo, bar, baz}, or {int, float}). The desired return type for the tree’s root is similarly defined. A child node is permitted to fit into the argument slot of a parent node if the child node’s return type and the type of that argument slot in the parent are compatible. We define types to be compatible if their set intersection is nonempty (that is, they share at least one type symbol).
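The compatibility rule is small enough to sketch directly. The following Java fragment is only an illustration of the rule as stated above, using plain sets of strings as types; SetTypingSketch and its methods are hypothetical names, not ECJ's own typing classes.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper illustrating set-based type compatibility; not an ECJ class. */
final class SetTypingSketch {
    /** Builds a type (a set of type symbols) from the given symbol names. */
    static Set<String> typeOf(String... symbols) {
        return new HashSet<>(Arrays.asList(symbols));
    }

    /** Two types are compatible if they share at least one type symbol. */
    static boolean compatible(Set<String> childReturnType, Set<String> parentArgumentType) {
        for (String symbol : childReturnType) {
            if (parentArgumentType.contains(symbol)) {
                return true;
            }
        }
        return false;
    }
}
```

Under this rule a child returning {int} fits an argument slot typed {int, float}, while a child returning {bool} does not.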

Set-based typing is sufficient for the typing requirements found in many programming languages, including ones with type hierarchies. It allows, among other things, for nodes such as ∗ to accept either integers or floats. However there are considerable restrictions on the power of set-based typing. It’s often useful for the return type of a node to change based on the particular nodes which have plugged into it as arguments. For example, ∗ might be defined as returning a float if at least one of its arguments returns a float, but returning an integer if both of its arguments return integers. if might be similarly defined not to return a particular type, but to simply require that its return type and the second and third argument types must all match. Such “polymorphic” typing is particularly useful in situations such as matrix multiplication, where the operator must place constraints on the width and height of its arguments and the final returned matrix. In this example, it’s also useful to have an infinite number of types (perhaps to represent matrices of varying widths or heights).

ECJ does not support polymorphic typing out of the box simply because it is difﬁcult to implement

many if not most common tree modiﬁcation and generation algorithms using polymorphic typing: instead,

set-based typing is offered to handle as many common needs as can be easily done.

Out of the Box Capabilities ECJ provides support out-of-the-box for a bunch of algorithm options:

• Generational algorithms: (µ, λ) and (µ+λ) Evolution Strategies, the Genetic Algorithm, Genetic Programming variants, Grammatical Evolution, PushGP, and Differential Evolution

• Steady-State evolution

• Parsimony pressure algorithms

• Spatially-embedded evolutionary algorithms

• Random restarts

• Multiobjective optimization, including the NSGA-II and SPEA2 algorithms.

• Cooperative, 1-Population Competitive, and 2-Population Competitive coevolution.

• Multithreaded evaluation and breeding.

• Parallel synchronous and asynchronous Island Models spread over a grid of computers.

• Synchronous Island Models run internally in a single ECJ process.

• Massive parallel generational ﬁtness evaluation of individuals on remote slave machines.

• Asynchronous Evolution, a version of steady-state evolution with massive parallel fitness evaluation on remote slave machines.

• Opportunistic Evolution, where remote slave machines run their own mini-evolutionary processes for a while before sending individuals back to the master process.


• Meta-Evolution

• A large number of selection and breeding operators

ECJ also has a GUI, though in truth I nearly universally use the command-line.

Idiosyncrasies

ECJ was developed near the introduction of Java and so has a lot of historical idiosyncrasies.¹ Some of them exist to this day because of conservatism: refactoring is disruptive. If you code with ECJ, you’ll definitely have to get used to one or more of the following:

¹ It used to have a lot more: I’ve been weeding out ones that I think are unnecessary nowadays!

• No generics at all, few iterators or enumerators, no Java features introduced after 1.4 (such as annotations), and little use of the Java Collections library. This is partly historical, and partly my own dislike of Java’s byzantine generics implementation, but it’s mostly efficiency. Generics are very slow when used with basic data types, as they require boxing and unboxing. The Java Collections library is unusually badly written in many places internally: and anyway, for speed we tend to work directly with arrays.

• Hand-rolled socket code. With one exception (optional compression), ECJ’s parallel facility doesn’t rely on other libraries.

• ECJ loads nearly every object from its parameter database. This means that you’ll rarely see the new keyword in ECJ, nor any constructors. Instead ECJ’s usual “constructor” method is a method called setup(...), which sets up an object from the database.

• A proprietary logging facility. ECJ was developed before the existence of java.util.logging. Partly out of conservatism, I am hesitant to rip up all the pervasive logging just to use Sun’s implementation (which isn’t very good anyway).

• A parameter database derived from Java’s old java.util.Properties list rather than XML. This is historical of course. But seriously, do I need a justification to avoid XML?

• Mersenne Twister random number generator. java.util.Random is grotesquely bad, and systems which use it should be shunned.

• A Makefile. ECJ was developed before Ant and I’ve personally never needed it.

1.3 Unpacking ECJ and Using the Tutorials

ECJ is designed to be built either with maven or with make. If you build ECJ with maven, then it will package all of ECJ into a single jar file. If you build with make, you have the option of packaging into a single jar file or into a directory of class files and resources. Building to a directory is how ECJ classically was constructed, but nowadays a jar file is more useful. However even if you build to a jar file, it’s still useful to have the resources and java files on-hand so you know what various ECJ applications can do and how to run them.

After unpacking ECJ, you’re left with one directory called ecj where you will ﬁnd several items:

• A top-level README.md ﬁle, which should be self-explanatory in its importance.

• ECJ’s LICENSE ﬁle, which describes the primary license (AFL 3.0, a BSD-style academic license).

• A CHANGES log, which lists all past changes to all versions (including the latest).

• A Makeﬁle for building via make.

• A pom.xml for building via Maven.

• The docs directory. This contains most of the ECJ documentation.

• The start directory. This contains various scripts for starting up ECJ: though in truth we rarely use them.

• The lib directory. This contains any additional libraries that maven (not make) will need to build ECJ (to build with make, you’ll use a different mechanism).

• The classes directory. This contains the class files, and possibly other resources (see next bullet), for ECJ after it has been built.

• The src directory, which contains the main Java code and the internal library test code. Of particular interest to you will be the directories src/main/java/ec/ and src/main/java/resources/ec/. The src/main/java/ec/ directory holds the top-level package for ECJ, ec. The src/main/java/resources/ec directory is an identical-structured directory containing various resources (notably parameter files). If you build with make, then these resources will get merged into the classes directory.

1.3.1 The ec Directory, the CLASSPATH, and jar ﬁles

The ec (ecj/src/main/java/ec/) directory is ECJ’s top-level package. Every subdirectory is a subpackage,

and most of them are headed by helpful README ﬁles which describe the contents of the directory. Most

packages contain not only Java ﬁles and class ﬁles but also parameter ﬁles and occasional data ﬁles: ECJ

was designed originally for the class ﬁles to be compiled and stored right alongside the Java ﬁles in these

directories, though it can be used with the separate-build-area approach taken by IDEs like Eclipse.

Because ec is the top-level package, you can compile ECJ, more or less, by just sticking its parent directory (the ecj directory) in your CLASSPATH. You will also need to add certain jar files in order to compile ECJ’s distributed evaluation and island model facilities, and its GUI. You can get these jar files from the ECJ main website (http://cs.gmu.edu/~eclab/projects/ecj/). Note that none of these libraries is required. For example, if the libraries for the distributed evaluator and island model are missing, ECJ will compile but will complain if you try to run those packages with compression turned on (a feature of the packages). The GUI library is optional to ECJ, so if you don’t install its libraries, you can still compile ECJ by just deleting the ec/display directory.

1.3.1.1 The ec/display Directory: ECJ’s GUI

This directory contains ECJ’s GUI. It’s in a state of disrepair and I suggest you do not use it. ECJ is really best

as a command line program. In fact, as mentioned above, you can simply delete the directory and ECJ will

compile just ﬁne.

1.3.1.2 The ec/app Directory: Demo Applications

This directory contains all the demo applications. We have quite a number of demo applications, many

sharing the same subdirectories. Read the provided README ﬁle for some guidance.

1.3.2 The docs Directory

This directory contains all top-level documentation of ECJ except for the various README ﬁles scattered

throughout the package. The index.html ﬁle provides the top-level entry point to the documentation.

The documentation includes:

• Introduction to parameters in ECJ

• Class documentation

• ECJ’s four tutorials and post-tutorial discussion. The actual tutorial code is located in the ec/app directory.

• An (old) overview of ECJ

• An (old) discussion of ECJ’s warts

• Some (old) graph diagrams of ECJ’s structure

• This manual

1.3.2.1 Tutorials

ECJ has four tutorials which introduce you to the basics of coding on the system. I strongly suggest you go through them before continuing through the rest of this manual. They are roughly:

1. A simple GA to solve the MaxOnes problem with a boolean representation.

2. A GA to solve an integer problem, with a custom mutation pipeline.

3. An evolution strategy to solve a floating-point problem, with a custom statistics object and reading and writing populations.

4. A genetic programming problem, plus some elitism.

As should be obvious from the rest of this manual, this barely scratches the surface of ECJ. No mention is given of parallelism, differential evolution, coevolution, multiobjective optimization, list and ruleset representations, grammatical encoding, spatial embedding, etc. But it’ll get you up to speed.

Chapter 2

ec.Evolve and Utility Classes

ECJ is big. Let us begin.

ECJ’s entry point is the class ec.Evolve. This class is little more than bootstrapping code to set up the ECJ

system, construct basic datatypes, and get things going.

To run an ECJ process, you ﬁre up ec.Evolve with certain runtime arguments.

java ec.Evolve -file myParameterFile.params -p param=value -p param=value (etc.)

ECJ sets itself up entirely using a parameter file. To this you can add additional command-line parameters which override those found in the parameter file. More on the parameter file will be discussed starting in Section 2.1.

For example, if you were presently in the ecj directory, you could do this:

java ec.Evolve -file ec/app/ecsuite/ecsuite.params

This all assumes that the parameter ﬁle is a free-standing ﬁle in your ﬁlesystem. But it might not be:

you might want to start up from a parameter ﬁle stored within a Jar ﬁle (for example if your ECJ library is

bundled up into a Jar ﬁle like ecj.jar). To do this you can specify the parameter ﬁle as a ﬁle resource relative

to the .class ﬁle of a class (a-la Java’s Class.getResource(...) method):

java ec.Evolve -from myParameterFile.params -at relative.to.Classname -p param=value (etc.)

... for example:

java ec.Evolve -from ecsuite.params -at ec.app.ecsuite.ECSuite

You can also say:

java ec.Evolve -from myParameterFile.params -p param=value (etc.)

In which case ECJ will assume that the class is ec.Evolve. In this situation, you’d probably need to specify

the parameter ﬁle as a path away from ec.Evolve (which is in the ec directory), for example:

java ec.Evolve -from app/ecsuite/ecsuite.params

(Note the missing ec/...). See Section 2.1 for more discussion about all this.

ECJ can also restart from a checkpoint ﬁle it created in a previous run:

java ec.Evolve -checkpoint myCheckpointFile.gz

Checkpointing will be discussed in Section 2.3.

Last but not least, if you forget this stuff, you can always type this to get some reminders:

java ec.Evolve -help

The purpose of ec.Evolve is to construct an ec.EvolutionState instance, or load one from a checkpoint ﬁle;

then get it running; and ﬁnally clean up. The ec.EvolutionState class actually performs the evolutionary

process. Most of the stuff ec.EvolutionState holds is associated with evolutionary algorithms or other

stochastic optimization procedures. However there are certain important utility objects or data which are

created by ec.Evolve prior to creating the ec.EvolutionState, and are then stored into ec.EvolutionState after it

has been constructed. These objects are:

• The Parameter Database, which holds all the parameters ec.EvolutionState uses to build and run the process.

• The Output, which handles logging and writing to ﬁles.

• The Checkpointing Facility to create checkpoint ﬁles as the process continues.

• The Number of Threads to use, and the Random Number Generators, one per thread.

• A simple declaration of the Number of Jobs to run in the process.

The remainder of Section 2 discusses each of these items. It’s not the most exciting of topics: but it’s

important in order to understand the rest of the ECJ process.

2.1 The Parameter Database

To build and run an experiment in ECJ, you typically write three things:

• (In Java) A problem which evaluates individuals and assigns ﬁtness values to them.

• (In Java) Depending on the kind of experiment, various components from which individuals can be constructed: for example, for a genetic programming experiment, you’ll need to define the kinds of nodes which can be used to make up the individual’s tree.

• (In one or more Parameter Files) Various parameters which define the kind of algorithm you are using, the nature of the experiment, and the makeup of your populations and processes.

Let’s begin with the third item. Parameters are the lifeblood of ECJ: practically everything in the system

is deﬁned by them. This makes ECJ highly ﬂexible; but it also adds complexity to the system.

ECJ loads parameter files and stores them into the ec.util.ParameterDatabase object, which is available to nearly everything. Parameter files are an extension of the files used by Java’s old java.util.Properties object. Parameter files usually end in ".params", and contain parameters one to a line. Parameter files may also contain blank (all whitespace) lines, which are ignored, and also lines which start with "#", which are considered comments and also ignored. An example comment:

# This is a comment

The parameter lines in a parameter ﬁle typically look like this:

parameter.name = parameter value

The parameter name is a string of non-whitespace characters except for "=". After this comes some optional whitespace, then an "=",¹ then some more optional whitespace. The parameter value is a string of characters, including whitespace, except that all whitespace is trimmed from the front and end of the string. Notice the use of a period in the parameter name. It’s quite a common convention to use periods in various parameter names in ECJ. We’ll get to why in a second.

Here are some legal parameter lines:

generations = 400

pop.subpop.0.size =1000

pop.subpop= ec.Subpopulation

Here are some illegal parameter lines:

generations

= 1000

pop subpop = ec.Subpopulation
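The line format described above can be sketched as a small Java parser. This is a hypothetical illustration of the rules as this section states them, not ECJ's own loading code: ParameterLineSketch is a made-up class, and the sketch assumes the "=" is always present and that a line with whitespace in the name is simply rejected.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical parser for one parameter-file line; not ECJ's own code. */
final class ParameterLineSketch {
    // Name: one or more characters that are neither whitespace nor '='.
    // Then optional whitespace, '=', optional whitespace, and the value.
    private static final Pattern LINE = Pattern.compile("^([^\\s=]+)\\s*=\\s*(.*)$");

    /** Returns {name, value}, or null for blank lines, comments, and malformed lines. */
    static String[] parse(String line) {
        if (line.trim().isEmpty() || line.startsWith("#")) {
            return null; // blank lines and comments are ignored
        }
        Matcher matcher = LINE.matcher(line);
        if (!matcher.matches()) {
            return null; // assumption: anything else is treated as illegal
        }
        // Whitespace is trimmed from the front and end of the value.
        return new String[] { matcher.group(1), matcher.group(2).trim() };
    }
}
```

With this sketch, the three legal lines above each yield a name and a value, while the illegal lines yield null.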

2.1.1 Inheritance

Parameter files may be set up to derive from one or more other parameter files. Let’s say you have two parameter files, a.params and b.params. Both are located in the same directory. You can set up a.params to derive from b.params by adding the following line as the very first line in the a.params file:

parent.0 = b.params

This says, in effect: “include in me all the parameters found in the b.params ﬁle, but any parameters I

myself declare will override any parameters of the same name in the b.params ﬁle.” Note that b.params may

itself derive from some other ﬁle (say, c.params). In this case, a.params receives parameters from both (and

parameters in b.params will likewise override ones of the same name in c.params).

Let’s say that b.params is located inside a subdirectory called foo. Then the line will look like this:

parent.0 = foo/b.params

Notice the forward slash: ECJ was designed on UNIX systems. Likewise, imagine if b.params was stored

in a sibling directory called bar: then we might say:

parent.0 = ../bar/b.params

You can also deﬁne absolute paths, UNIX-style:

parent.0 = /tmp/myproject/foo.params

Long story short: parameter ﬁles are declared using traditional UNIX path syntax.

A parameter ﬁle can also derive from multiple parent parameter ﬁles, by including each at the beginning

of the ﬁle, with consecutive numbers, like this:

parent.0 = b.params

parent.1 = yo/d.params

parent.2 = ../z.params

This says in effect: “first look in a.params for the parameter. If you can’t find it there, look in b.params and, ultimately, all the files b.params derives from. If you can’t find it in any of them, look in d.params and all the files it derives from. If you can’t find it in any of them, look in z.params and all the files it derives from. If you’ve still not found the parameter, give up.”

¹ Actually, you can omit the "=" in a parameter line, but it’s considered bad style.

This is essentially a depth-ﬁrst search through a tree or DAG, with parents overriding their children

(the ﬁles they derive from) and earlier siblings overriding later siblings. Note that this multiple inheritance

scheme is not the same as C++ or Lisp/CLOS, which use a distance measure!
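A minimal Java sketch of this search order follows. It models each parameter file as an in-memory map plus an ordered list of parents; InheritanceSketch is a hypothetical class invented for illustration, and the sketch assumes the parent graph has no cycles.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of a parameter file and its parents; not ECJ's own code. */
final class InheritanceSketch {
    /** Parameters declared directly in this file. */
    final Map<String, String> own = new HashMap<>();

    /** Parents in declaration order: parent.0, parent.1, and so on. */
    final List<InheritanceSketch> parents = new ArrayList<>();

    /** Depth-first lookup: this file first, then each parent (and its ancestors) in order. */
    String lookup(String name) {
        String value = own.get(name);
        if (value != null) {
            return value; // a file overrides everything it derives from
        }
        for (InheritanceSketch parent : parents) {
            String inherited = parent.lookup(name);
            if (inherited != null) {
                return inherited; // earlier siblings override later siblings
            }
        }
        return null; // not found anywhere: give up
    }
}
```

Note that an ancestor of parent.0 wins over parent.1 itself, however distant that ancestor is, which is exactly where this scheme differs from a distance-based one.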

Parent parameter files can be explicit files on your file system (as shown above) or they can be files located in JAR files etc. But how do you refer to a file inside a JAR file? It’s easy: refer to it using a class relative path (see the next Section, 2.1.2), which defines the path relative to the class file of some class. For example, suppose you’re creating a parameter file whose parent is ec/app/ant/ant.params. But you’re not using ECJ in its unpacked form, but rather bundled up into a JAR file. Thus ec/app/ant/ant.params is archived in that JAR file. Since this file is right next to ec/app/ant/Ant.class (the class file for the ec.app.ant.Ant class), you can refer to it as:

parent.0 = @ec.app.ant.Ant ant.params

If your parameter ﬁle is already in a JAR ﬁle, and it uses ordinary relative path names to refer to its

parents (like ../z.params), these will be interpreted as other ﬁles in the archived ﬁle system inside that JAR

ﬁle. To escape the JAR ﬁle you have to use an absolute path name, such as

parent.0 = /tmp/foo.params

It’s pretty rare to need that though, and hardly good style. The whole point of JAR ﬁles is to encapsulate

functionality into one package.

Overriding the Parameter File

When you ﬁre up ECJ, you point it at a single parameter ﬁle, and you can

provide additional parameters at the command-line, like this:

java ec.Evolve -file parameterFile.params -p command-line-parameter=value \

-p command-line-parameter=value ...

Furthermore, your program itself can submit parameters to the parameter database, though it’s very

unusual to do so. When a parameter is requested from the parameter database, here’s how it’s looked up:

1. If the parameter was declared by the program itself, this value is returned.

2. Else if the parameter was provided on the command line, this value is returned.

3. Else the parameter is looked up in the provided parameter file and all derived files using the inheritance ordering described earlier.

4. Else the database signals failure.
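The four steps above can be sketched as three layers consulted in order. This is a hypothetical Java illustration only: LookupOrderSketch is a made-up class, the fromFiles map stands in for the parameter file and everything it derives from, and throwing an exception is merely one assumed way to "signal failure".

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical three-layer lookup mirroring the ordering above; not ECJ's own code. */
final class LookupOrderSketch {
    final Map<String, String> programDeclared = new HashMap<>();
    final Map<String, String> commandLine = new HashMap<>();
    final Map<String, String> fromFiles = new HashMap<>(); // stands in for the file and its parents

    String get(String name) {
        if (programDeclared.containsKey(name)) {
            return programDeclared.get(name); // 1. declared by the program itself
        }
        if (commandLine.containsKey(name)) {
            return commandLine.get(name);     // 2. provided with -p on the command line
        }
        if (fromFiles.containsKey(name)) {
            return fromFiles.get(name);       // 3. found in the parameter file hierarchy
        }
        // 4. Assumption: failure is reported here as an exception.
        throw new IllegalArgumentException("Parameter not found: " + name);
    }
}
```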

2.1.2 Kinds of Parameters

ECJ supports the following kinds of parameters:

• Numbers. Either long integers or double floating-point values. Examples:

generations = 500

tournament.size = 3.25

minimum-fitness = -23.45e15

• Arbitrary Strings trimmed of whitespace. Example:

crossover-type = two-point

• Booleans. Any value except for "false" (case-insensitive) is considered to be true. It’s best style to use lower-case "true" and "false". The first two of these examples are false and the second two are true:

print-params = false

die-a-painful-death = fAlSe

pop.subpop.0.perform-injections = true

quit-on-run-complete = whatever

• Class Names. Class names are defined as the full class name of the class, including the package. Example:

pop.subpop.0.species = ec.gp.GPSpecies

• File or Resource Path Names. Paths can be of four types.

– Absolute paths, which (in UNIX) begin with a "/", stipulate a precise location in the file system.

– Relative paths, which do not begin with a "/", are defined relative to the parameter file in which the parameter was located. If the parameter file was an actual file in the filesystem, the relative path will also be considered to point to a file. If the parameter file was in a jar file, then the relative path will be considered to point to a resource inside the same jar file relative to the parameter file location. You’ve seen relative paths already used for derived parameter files.

– Execution relative paths are defined relative to the directory in which the ECJ process was launched. Execution relative paths look exactly like relative paths except that they begin with the special character "$".

– Class relative paths define a path relative to the class file of a class. They have two parts: the class in question, and then the path to the resource relative to it. If the class is stored in a Jar file, then the path to the resource will also be within that Jar file. Otherwise the path will point to an actual file. Class relative paths begin with "@", followed by the full class name, then spaces or tabs, then the relative path.

Examples of all four kinds of paths:

stat.file = $out.stat

eval.prob.map-file = ../dungeon.map

temporary-output-file = /tmp/output.txt

image = @ec.app.myapp.MyClass images/picture.png

If the parameter is for a file meant to be opened read-only, any of the four approaches above will work fine. But if the parameter is for a writable file, then you have an issue. As discussed in Section 2.1.4, the parameter in question could refer to a file in your operating system, or it could refer to a file bundled inside a Java jar file. In this second case, the file cannot be written to.

The only kinds of paths which can refer to things inside jar files are relative paths and class-relative paths. Thus if you stick with absolute paths or execution-relative paths, you know you’ll be referring to a writable file. For this reason we recommend:

– Read-only Files should use class-relative paths or relative paths, and should only be accessed using getResource(...) (see Section 2.1.5).

– Read-Write Files should use absolute paths or (in most cases) execution-relative paths, and should only be accessed using getFile(...) (see Section 2.1.5).

• Arrays. ECJ supports loading arrays of doubles,² but does not have direct support for loading arrays of other types. However it has a convention you should be made aware of. It’s common for arrays to be loaded by first stipulating the number of elements in the array, then stipulating each array element in turn, starting with 0. The parameter used for the number of elements differs from case to case. Note the use of periods prior to each number in the following example:

² See the various “getDoubles” methods in Section 2.1.5.

gp.fs.0.size = 6

gp.fs.0.func.0 = ec.app.ant.func.Left

gp.fs.0.func.1 = ec.app.ant.func.Right

gp.fs.0.func.2 = ec.app.ant.func.Move

gp.fs.0.func.3 = ec.app.ant.func.IfFoodAhead

gp.fs.0.func.4 = ec.app.ant.func.Progn2

gp.fs.0.func.5 = ec.app.ant.func.Progn3

The particulars vary. Here’s another, slightly different, example:

exch.num-islands = 8

exch.island.0.id = SurvivorIsland

exch.island.1.id = GilligansIsland

exch.island.2.id = FantasyIsland

exch.island.3.id = TemptationIsland

exch.island.4.id = RhodeIsland

exch.island.5.id = EllisIsland

exch.island.6.id = ConeyIsland

exch.island.7.id = TreasureIsland

Anyway, you get the idea.
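Three of the conventions in this list are mechanical enough to sketch in Java: the boolean rule, telling the four path kinds apart by their first character, and the count-then-elements array convention. The class ParameterKindsSketch and all of its members are hypothetical names used for illustration; this is not how ECJ's parameter database is implemented, and the sketch only classifies paths without resolving them.

```java
import java.util.Map;

/** Hypothetical helpers illustrating the parameter conventions above; not ECJ's own code. */
final class ParameterKindsSketch {
    enum PathKind { ABSOLUTE, RELATIVE, EXECUTION_RELATIVE, CLASS_RELATIVE }

    /** Any value other than "false" (ignoring case) counts as true. */
    static boolean parseBoolean(String value) {
        return !value.trim().equalsIgnoreCase("false");
    }

    /** Classifies a path by its leading character, following the four kinds listed above. */
    static PathKind classifyPath(String value) {
        String path = value.trim();
        if (path.startsWith("/")) return PathKind.ABSOLUTE;
        if (path.startsWith("$")) return PathKind.EXECUTION_RELATIVE;
        if (path.startsWith("@")) return PathKind.CLASS_RELATIVE;
        return PathKind.RELATIVE;
    }

    /**
     * Reads an array stored as a count parameter plus numbered element parameters.
     * Element i is looked up under prefix + i + suffix, since the particulars vary.
     */
    static String[] loadArray(Map<String, String> params, String sizeName,
                              String prefix, String suffix) {
        int size = Integer.parseInt(params.get(sizeName).trim());
        String[] elements = new String[size];
        for (int i = 0; i < size; i++) {
            elements[i] = params.get(prefix + i + suffix);
        }
        return elements;
    }
}
```

For the two examples above, the array would be read with the size name "gp.fs.0.size", prefix "gp.fs.0.func." and an empty suffix in the first case, and with "exch.num-islands", prefix "exch.island." and suffix ".id" in the second.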

2.1.3 Namespace Hierarchies and Parameter Bases

ECJ has lots of parameters, and by convention organizes them in a namespace hierarchy to maintain some

sense of order. The delimiter for paths in this hierarchy is — you guessed it — the period.

The vast majority of parameters are used by one Java object or another to set itself up immediately after it has been instantiated for the first time. ECJ has an important convention which uses the namespace hierarchy to do just this: the parameter base. A parameter base is essentially a path (or namespace, what have you) in which an object expects to find all of its parameters. The prefix for this path is typically the parameter name by which the object itself was loaded.

For example, let us consider the process of deﬁning the class to be used for the global population. This

class is found in the following parameter:

pop = ec.Population

ECJ looks for this parameter, expects a class (in this case, ec.Population), loads the class, and creates one

instance. It then calls a special method (setup(...), we’ll discuss it later) on this class so it can set itself up

from various parameters. In this case, ec.Population needs to know how many subpopulations it will have.

This is deﬁned by the following parameter:

pop.subpops = 2

ec.Population didn’t know that it was supposed to look in pop.subpops for this value. Instead, it only knew that it needed to look in a parameter called subpops. The rest (in this case, pop) was provided to ec.Population as its parameter base: the text to be prepended (plus a period) to all parameters that ec.Population needed to set itself up. It’s not a coincidence that the parameter base also happened to be the very parameter which defined ec.Population in the first place. This is by convention.

Armed with the fact that it needs to create an array of two subpopulations, ec.Population is ready to load

the classes for those two subpopulations. Let’s say that for our experiment we want them to be of different

classes. Here they are:

pop.subpop.0 = ec.Subpopulation

pop.subpop.1 = ec.app.myapp.MySpecialSubpopulation

The two classes are loaded and one instance is created of each of them. Then setup(...) is called on each of them. Each subpopulation looks for a parameter called size to tell it how many individuals will be in that subpopulation. Since each of them is provided with a different parameter base, they can have different sizes:

pop.subpop.0.size = 100

pop.subpop.1.size = 512

Likewise, each of these subpopulations needs a “species”. Presuming that the species are different classes,

we might have:

pop.subpop.0.species = ec.vector.VectorSpecies

pop.subpop.1.species = ec.gp.GPSpecies

These species objects themselves need to be set up, and when they do, their parameter bases will be

pop.subpop.0.species and pop.subpop.1.species respectively. And so on.

Now imagine that we have ten subpopulations, all of the same class (ec.Subpopulation), and all but the ﬁrst

one has the exact same size. We’d wind up having to write silly stuff like this:

pop.subpop.0.size = 1000

pop.subpop.1.size = 500

pop.subpop.2.size = 500

pop.subpop.3.size = 500

pop.subpop.4.size = 500

pop.subpop.5.size = 500

pop.subpop.6.size = 500

pop.subpop.7.size = 500

pop.subpop.8.size = 500

pop.subpop.9.size = 500

That’s a lot of typing. Though I am saddened to report that ECJ’s parameter files do require a lot of typing, at least the parameter database facility offers an option to save our fingers somewhat in this case. Specifically, when the ec.Subpopulation class sets itself up each time, it actually looks in not one but two path locations for the size parameter: first it tacks on its current base (as above), and if there’s no parameter at that location, then it tries tacking on a default base defined for its class. In this case, the default base for ec.Subpopulation is the prefix ec.subpop. Armed with this we could simply write:

ec.subpop.size = 500

pop.subpop.0.size = 1000

When ECJ looks for subpopulation 0’s size, it’ll find it as normal (1000). But when it looks for subpopulation 1 (etc.), it won’t find a size parameter in the normal location, so it’ll look in the default location, and use what it finds there (500). Only if there’s no parameter to be found in either location will ECJ signal an error.
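The two-location lookup can be sketched in a few lines of Java. This is a hypothetical illustration over a plain map, with DefaultBaseSketch invented for the purpose; it is not ECJ's parameter database API, and the exception is just one assumed way of signalling the error.

```java
import java.util.Map;

/** Hypothetical base-then-default-base lookup; not ECJ's own code. */
final class DefaultBaseSketch {
    /** Looks for base.name first, then defaultBase.name; fails only if both are missing. */
    static String get(Map<String, String> params, String base, String defaultBase, String name) {
        String value = params.get(base + "." + name);
        if (value == null) {
            value = params.get(defaultBase + "." + name); // fall back to the class's default base
        }
        if (value == null) {
            // Assumption: the error is reported here as an exception.
            throw new IllegalStateException(
                "No parameter at " + base + "." + name + " or " + defaultBase + "." + name);
        }
        return value;
    }
}
```

With the two parameter lines above loaded into the map, asking for size under base pop.subpop.0 gives 1000, while asking under pop.subpop.1 falls back to ec.subpop.size and gives 500.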

It’s important to note that if a class is loaded from a default parameter, this doesn’t mean that the default

parameter will become its parameter base: rather, the original expected location will continue to be the base.

For example, imagine if both of our Species objects were the same class, and we had deﬁned them using the

default base. That is, instead of

pop.subpop.0.species = ec.vector.VectorSpecies

pop.subpop.1.species = ec.vector.VectorSpecies

...we simply said

ec.subpop.species = ec.vector.VectorSpecies

When the species for subpopulation 0 is loaded, its parameter base is not going to be ec.subpop.species. Instead, it will still be pop.subpop.0.species. Likewise, the parameter base for the species of subpopulation 1 will still be pop.subpop.1.species.

Keep in mind that all of this is just a convention. You can use periods for whatever you like ultimately.

And there exist a few global parameters without any base at all. For example, the number of generations is

deﬁned as

generations = 200

...and the seed for the random number generator for the fourth thread is

seed.3 = 12303421

...even though there is no object set up with the seed parameter, and hence no object has seed as its parameter base. Random number generators are one of the few objects in ECJ which are not specified from the parameter file.

2.1.4 Parameter Files in Jar Files

Parameter files don’t have to be just in your file system: they can be bundled up in jar files. If a parameter file is being read from a jar file, its parents will generally be assumed to be in the same jar file as well if they’re given as relative paths (in UNIX, paths which don’t start with '/').

So how do you point to a parameter ﬁle in a jar ﬁle to get things rolling? You can run ECJ like this:

java ec.Evolve -from parameterFile.params -at relative.class.Name ...

This instructs ECJ to look for the .class ﬁle of the class relative.class.Name, be it in the ﬁle system or in a Jar

ﬁle. Once ECJ has found it, it looks for the path parameterFile.params relative to this ﬁle. You can omit the

classname, which causes ECJ to assume that the class in question is ec.Evolve. For example, to run the Ant

demo from ECJ (in a Jar ﬁle or unpacked into the ﬁle system), you could say:

java ec.Evolve -from app/ant/ant.params

Notice it does not say ec/app/ant/ant.params, which is probably what you’d expect if you used -file rather than -from. This is because ECJ goes to the ec/Evolve.class file, then from there it searches for the parameter file. The path of the parameter file relative to the ec/Evolve.class file is app/ant/ant.params.
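The path arithmetic behind -from can be illustrated as follows. This is a sketch under stated assumptions: RelativeToClass and resolve are hypothetical names, and the code only manipulates path strings with java.nio.file; it does not look inside jar files or consult the class loader the way ECJ must.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical sketch of "a path relative to a class file".
class RelativeToClass {
    // Resolves relativePath against the directory that holds classFilePath.
    static Path resolve(String classFilePath, String relativePath) {
        Path classFile = Paths.get(classFilePath);
        Path directory = classFile.getParent();
        Path resolved = (directory == null)
            ? Paths.get(relativePath)
            : directory.resolve(relativePath);
        return resolved.normalize(); // collapses any "../" elements
    }

    public static void main(String[] args) {
        // The Ant demo: ec/Evolve.class plus app/ant/ant.params
        System.out.println(resolve("ec/Evolve.class", "app/ant/ant.params"));
    }
}
```

On a UNIX-style file system the demo resolves to ec/app/ant/ant.params, which is the path you would have typed with -file.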

There are similar rules regarding ﬁle references (such as parent references) within a parameter ﬁle. Let’s

say that your parameter ﬁle is inside a jar ﬁle. If you say something like:

parent.0 = ../path/to/the/parent.params

... then ECJ will look around inside the same Jar ﬁle for this ﬁle, rather than externally in the operating

system’s ﬁle system or in some other Jar ﬁle.

You can escape this, however. For example, once your parameter file is inside a Jar file, you can still define a parent in another Jar file, or in the file system, if you know another class file it’s located relative to. You just need to specify another class for ECJ to start at, and a path relative to it, like this:

parent.0 = @ec.foo.AnotherClass relative/path/to/the/parent.params

See the next section for more explanation of that text format.

Last but not least, once your parameter file is in a Jar file, you can refer to a parent in the file system if you use an absolute path (that is, one which, in UNIX anyway, starts with '/'). For example:

parent.0 = /Users/sean/myexperiment/other.params

Absolute path names aren’t very portable and aren’t recommended.

2.1.5 Accessing Parameters

Parameters are looked up in the ec.util.ParameterDatabase class, and parameter names are specified using the ec.util.Parameter class. The latter is little more than a cover for Java strings. To create the parameter pop.subpop.0.size, we say:

Parameter param = new Parameter("pop.subpop.0.size");

Of course, usually we don’t want to just make a direct parameter, but rather want to construct one from a parameter base and the remainder. Let’s say our base (pop.subpop.0) is stored in the variable base, and we want to look for size. We do this as:

Parameter param = base.push("size");
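To make the idea of "little more than a cover for Java strings" concrete, here is a minimal stand-in. ParamName is a hypothetical class written for this illustration, not ECJ's Parameter; it assumes that push returns a new object and leaves the base untouched, as the assignment above suggests.

```java
// Hypothetical stand-in for a parameter name; not ECJ's Parameter class.
final class ParamName {
    private final String path;

    ParamName(String path) {
        this.path = path;
    }

    // Returns a new name with one more period-separated element appended.
    ParamName push(String element) {
        return new ParamName(path + "." + element);
    }

    @Override
    public String toString() {
        return path;
    }

    public static void main(String[] args) {
        ParamName base = new ParamName("pop.subpop.0");
        ParamName size = base.push("size");
        System.out.println(size); // prints pop.subpop.0.size
        System.out.println(base); // prints pop.subpop.0 (the base is unchanged)
    }
}
```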

Here are some common ec.util.ParameterDatabase methods. Note that all of them look in two places to find a parameter value. This is what we use to handle “standard” and “default” bases. Typically you’d pass in the parameter in its standard location, and also (as the “default parameter”) the same parameter built on its default base. You can pass in null for either, and it’ll get ignored.

ec.util.ParameterDatabase Methods

public boolean exists(Parameter parameter, Parameter default)

If either parameter exists in the database, return true. Either parameter may be null.

public String getString(Parameter parameter, Parameter default)

Look ﬁrst in parameter, then failing that, in default parameter, and return the result as a String, else null if not found.

Either parameter may be null.

public File getFile(Parameter parameter, Parameter default)

Look ﬁrst in parameter, then failing that, in default parameter, and return the result as a File, else null if not found.

Either parameter may be null.

Important Note. You should generally only use this method if you are writing to a file. Otherwise it’s best to use getResource(...).

public InputStream getResource(Parameter parameter, Parameter default)

Look ﬁrst in parameter, then failing that, in default parameter, and open an InputStream to the result, else null if not

found. Either parameter may be null.

Important Note. This is distinguished from getFile(...) in that the object doesn’t have to be a file in the file system: it can for example be a location in a jar file. If the parameter specifies an absolute path or an execution relative path, then a file in the file system will be opened. If the parameter specifies a relative path, and the parameter database was itself loaded as a file rather than a resource (in a jar file say), then a file will be opened, else a resource will be opened in the same jar file as the parameter file. You can also specify a resource path directly.

public Object getInstanceForParameterEq(Parameter parameter, Parameter default, Class superclass)

Look ﬁrst in parameter, then failing that, in default parameter, to ﬁnd a class. The class must have superclass as a

superclass, or can be the superclass itself. Instantiate one instance of the class using the default (no-argument)

constructor, and return the instance. Throws an ec.util.ParamClassLoadException if no class is found.

public Object getInstanceForParameter(Parameter parameter, Parameter default, Class superclass)

Look ﬁrst in parameter, then failing that, in default parameter, to ﬁnd a class. The class must have superclass as a

superclass, but may not be superclass itself. Instantiate one instance of the class using the default (no-argument)

constructor, and return the instance. Throws an ec.util.ParamClassLoadException if no class is found.

public boolean getBoolean(Parameter parameter, Parameter default, boolean defaultValue)

Look ﬁrst in parameter, then failing that, in default parameter, and return the result as a boolean, else defaultValue if

not found or not a boolean. Either parameter may be null.

public int getIntWithDefault(Parameter parameter, Parameter default, int defaultValue)

Look ﬁrst in parameter, then failing that, in default parameter, and return the result as an int, else defaultValue if not

found or not an int. Either parameter may be null.

public int getInt(Parameter parameter, Parameter default, int minValue)

Look first in parameter, then failing that, in default parameter, and return the result as an int, else minValue - 1 if not found, not an int, or <minValue. Either parameter may be null.

public int getIntWithMax(Parameter parameter, Parameter default, int minValue, int maxValue)

Look first in parameter, then failing that, in default parameter, and return the result as an int, else minValue - 1 if not found, not an int, <minValue, or >maxValue. Either parameter may be null.

public long getLongWithDefault(Parameter parameter, Parameter default, long defaultValue)

Look ﬁrst in parameter, then failing that, in default parameter, and return the result as a long, else defaultValue if not

found or not a long. Either parameter may be null.

public long getLong(Parameter parameter, Parameter default, long minValue)

Look first in parameter, then failing that, in default parameter, and return the result as a long, else minValue - 1 if not found, not a long, or <minValue. Either parameter may be null.

public long getLongWithMax(Parameter parameter, Parameter default, long minValue, long maxValue)

Look first in parameter, then failing that, in default parameter, and return the result as a long, else minValue - 1 if not found, not a long, <minValue, or >maxValue. Either parameter may be null.

public float getFloatWithDefault(Parameter parameter, Parameter default, float defaultValue)

Look first in parameter, then failing that, in default parameter, and return the result as a float, else defaultValue if not found or not a float. Either parameter may be null.

public float getFloat(Parameter parameter, Parameter default, float minValue)

Look first in parameter, then failing that, in default parameter, and return the result as a float, else minValue - 1 if not found, not a float, or <minValue. Either parameter may be null.

public float getFloatWithMax(Parameter parameter, Parameter default, float minValue, float maxValue)

Look first in parameter, then failing that, in default parameter, and return the result as a float, else minValue - 1 if not found, not a float, <minValue, or >maxValue. Either parameter may be null.

public double getDoubleWithDefault(Parameter parameter, Parameter default, double defaultValue)

Look ﬁrst in parameter, then failing that, in default parameter, and return the result as a double, else defaultValue if

not found or not a double. Either parameter may be null.

public double getDouble(Parameter parameter, Parameter default, double minValue)

Look first in parameter, then failing that, in default parameter, and return the result as a double, else minValue - 1 if not found, not a double, or <minValue. Either parameter may be null.

public double getDoubleWithMax(Parameter parameter, Parameter default, double minValue, double maxValue)

Look first in parameter, then failing that, in default parameter, and return the result as a double, else minValue - 1 if not found, not a double, <minValue, or >maxValue. Either parameter may be null.

public double[] getDoubles(Parameter parameter, Parameter default, double minValue)

Look first in parameter, then failing that, in default parameter, for a space- or tab-delimited list of double values, and return the result as an array of doubles, else null if they are improperly formatted or if any of them is <minValue, or if the list is zero in length, or if it has garbage at the end of it. Either parameter may be null.

public double[] getDoubles(Parameter parameter, Parameter default, double minValue, int expectedLength)

Look first in parameter, then failing that, in default parameter, for a space- or tab-delimited list of double values, and return the result as an array of doubles, else null if they are improperly formatted or if any of them is <minValue, or if the list is not expectedLength (>0) long, or if it has garbage at the end of it. Either parameter may be null.

public double[] getDoublesWithMax(Parameter parameter, Parameter default, double minValue, double maxValue)

Look first in parameter, then failing that, in default parameter, for a space- or tab-delimited list of double values, and return the result as an array of doubles, else null if they are improperly formatted or if any of them is <minValue or >maxValue, or if the list is zero in length, or if it has garbage at the end of it. Either parameter may be null.

public double[] getDoublesWithMax(Parameter parameter, Parameter default, double minValue, double maxValue, int expectedLength)

Look first in parameter, then failing that, in default parameter, for a space- or tab-delimited list of double values, and return the result as an array of doubles, else null if they are improperly formatted or if any of them is <minValue or >maxValue, or if the list is not expectedLength (>0) long, or if it has garbage at the end of it. Either parameter may be null.

public double[] getDoublesUnconstrained(Parameter parameter, Parameter default)

Look first in parameter, then failing that, in default parameter, for a space- or tab-delimited list of double values, and return the result as an array of doubles, else null if they are improperly formatted, or if the list is zero in length, or if it has garbage at the end of it. Either parameter may be null.

public double[] getDoublesUnconstrained(Parameter parameter, Parameter default, int expectedLength)

Look first in parameter, then failing that, in default parameter, for a space- or tab-delimited list of double values, and return the result as an array of doubles, else null if they are improperly formatted, or if the list is not expectedLength (>0) long, or if it has garbage at the end of it. Either parameter may be null.
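The "minValue - 1" convention used by getInt(...) and its relatives means the caller detects failure by checking whether the result is below the minimum. The sketch below mimics that documented contract in plain Java. It is an assumption-laden illustration: MinValueSentinel is a hypothetical class, a Map stands in for the database, and plain Strings stand in for Parameter objects.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of the documented getInt(parameter, default, minValue) contract.
class MinValueSentinel {
    // Returns the parsed value, or minValue - 1 if the value is missing,
    // is not an int, or is less than minValue. Either name may be null.
    static int getInt(Map<String, String> db, String parameter, String defaultParameter, int minValue) {
        String text = (parameter == null) ? null : db.get(parameter);
        if (text == null && defaultParameter != null) {
            text = db.get(defaultParameter);
        }
        if (text == null) {
            return minValue - 1;
        }
        try {
            int value = Integer.parseInt(text.trim());
            return (value < minValue) ? minValue - 1 : value;
        } catch (NumberFormatException e) {
            return minValue - 1;
        }
    }

    public static void main(String[] args) {
        Map<String, String> db = new HashMap<>();
        db.put("ec.subpop.size", "500");
        db.put("pop.subpop.2.size", "lots");

        int size = getInt(db, "pop.subpop.1.size", "ec.subpop.size", 1);
        System.out.println(size); // prints 500

        int bad = getInt(db, "pop.subpop.2.size", null, 1);
        if (bad < 1) {
            System.out.println("Subpopulation size must be an integer >= 1");
        }
    }
}
```

The design trades an exception for a single comparison at the call site, which is why every method of this family needs a minimum value to be supplied.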

2.1.6 Parameter Macros

The get() method can also handle special macro parameters. Macro parameter names will end in alias, which means you cannot have any parameter names which end with this word. The idea behind a macro parameter is that it can substitute one substring for another among your parameter names, making your parameters potentially much simpler. Macros work along period boundaries.

2.1.6.1 The Alias Macro

The alias parameter macro works as follows. Let’s say that you have:

hello.there.alias = foo

This means to replace hello.there with foo and continue looking up the parameter. For example, when you query for the parameter

hello.there.mom.how.are.you

it is converted to the parameter

foo.mom.how.are.you

which is then looked up in the database. Furthermore

hello.there.partner

is converted to the parameter

foo.partner

and

hello.there

is converted to just the parameter

foo

Each of these is then looked up in the database. Conflicting rules are handled in order of specificity. For example, if you had

hello.there.alias = foo

hello.there.mom.alias = bar

hello.there.mom.how.are.you = whoa

hello.there.brother = hey

Now let’s say we queried

hello.there.dad

This would get converted to

foo.dad

which would then get looked up. But if we queried

hello.there.mom.42

Then our second rule would take precedence, because it’s more speciﬁc, and instead of converting to

foo.mom.42 we would instead convert to

bar.42

which would then get looked up. Finally, if we queried

hello.there.mom.how.are.you

Then the exact parameter would take precedence and the system would immediately return whoa. Furthermore, if we queried

hello.there.brother

This would immediately return hey, because exact parameters take precedence over macros.

Additionally note that the substitution is delimited by the periods. So querying

hello.therewhoa

will not get translated. And furthermore the substitution must start at the start of the parameter name, so

querying

my.hello.there.mom

will also not get translated.

Beware of cycles! For example, if you have

a.b.alias = foo

foo.alias = a.b

... and you query

a.b.yo

... you should get an error reported and null returned.
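The alias rules above (exact parameters first, then the most specific alias, substitution only on period boundaries at the start of the name, and cycles yielding null) can be sketched as follows. This is a hypothetical re-implementation for illustration, not ECJ's code: AliasResolver and its methods are invented names, a Map stands in for the database, and the entries foo.dad and bar.42 are made-up values added so the demo has something to find.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch of alias macro resolution over a Map of parameters.
class AliasResolver {
    // Looks up name, applying alias macros until a value is found.
    // Returns null if nothing matches or if the aliases form a cycle.
    static String resolve(Map<String, String> db, String name) {
        Set<String> seen = new HashSet<>();
        String current = name;
        while (seen.add(current)) {          // add() returns false on a repeat, i.e. a cycle
            String exact = db.get(current);
            if (exact != null) {
                return exact;                // exact parameters take precedence over macros
            }
            current = rewrite(db, current);
            if (current == null) {
                return null;                 // no alias applies
            }
        }
        return null;                         // cycle detected
    }

    // Applies the most specific alias whose prefix ends on a period boundary, or returns null.
    static String rewrite(Map<String, String> db, String name) {
        String prefix = name;
        while (true) {
            String target = db.get(prefix + ".alias");
            if (target != null) {
                return target + name.substring(prefix.length());
            }
            int dot = prefix.lastIndexOf('.');
            if (dot < 0) {
                return null;
            }
            prefix = prefix.substring(0, dot); // try a shorter, less specific prefix
        }
    }

    public static void main(String[] args) {
        Map<String, String> db = new HashMap<>();
        db.put("hello.there.alias", "foo");
        db.put("hello.there.mom.alias", "bar");
        db.put("hello.there.mom.how.are.you", "whoa");
        db.put("hello.there.brother", "hey");
        db.put("foo.dad", "1");   // made-up value for the demo
        db.put("bar.42", "2");    // made-up value for the demo

        System.out.println(resolve(db, "hello.there.dad"));             // prints 1
        System.out.println(resolve(db, "hello.there.mom.42"));          // prints 2
        System.out.println(resolve(db, "hello.there.mom.how.are.you")); // prints whoa
        System.out.println(resolve(db, "hello.therewhoa"));             // prints null
    }
}
```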

2.1.7 Debugging Your Parameters

Your ECJ experiment is loading and running, but how do you know you didn’t make a mistake in your

parameters? How do you know ECJ is using the parameters you stated rather than some default values? If

you include the following parameter in your collection:

print-params = true

...then ECJ will print out all the parameters which were used or tested for existence. For example, you might

get things like this printed out:

!P: pop.subpop.0.file

P: pop.subpop.0.species = ec.gp.GPSpecies

<P: ec.subpop.species

P: pop.subpop.0.species.pipe = ec.breed.MultiBreedingPipeline

<P: gp.species.pipe

!E: pop.subpop.0.species.pipe.prob

P: pop.subpop.0.species.pipe.num-sources = 2

<P: breed.multibreed.num-sources

P: pop.subpop.0.species.pipe.source.0 = ec.gp.koza.CrossoverPipeline

<P: breed.multibreed.source.0

E: pop.subpop.0.species.pipe.source.0.prob = 0.9

<E: gp.koza.xover.prob

A P means that a parameter value was accessed (or an attempt was made to access it). An E means that a parameter was tested for existence. A ! means that the parameter did not exist. A < means that the parameter existed in the default base as well as the primary base, but the value of the primary base was the one used. In this last case, the primary base is printed out on the line immediately prior.

There are a few other debugging parameters of less value. At the end of a run, ECJ can dump all the

parameters in the database; all the parameters accessed (retrieved or tested for existence); all the parameters

used (retrieved); all the parameters not accessed; and all the parameters not used. Pick your poison. Here are

the relevant parameters:

print-all-params = true

print-accessed-params = true

print-used-params = true

print-unaccessed-params = true

print-unused-params = true

Typically you’d only want to set one of these to true. The most useful one is print-unaccessed-params, since by examining the results you can see if a parameter you set was used or not: if not, it’s probably because it wasn’t typed right. It also tells you about old, disused parameters. In fact, as I was writing this manual and needed print-unaccessed-params examples, I ran the Lawnmower problem (in ec/app/lawnmower) and got the following:

Unaccessed Parameters

===================== (Ignore parent.x references)

gp.fs.2.info = ec.gp.GPFuncInfo

gp.koza.grow.min-depth = 5

gp.tc.0.init.max = 6

gp.koza.mutate.build.0 = ec.gp.koza.GrowBuilder

gp.tc.1.init.max = 6

parent.0 = ../../gp/koza/koza.params

gp.koza.grow.max-depth = 5

gp.tc.2.init.max = 6

gp.koza.mutate.ns.0 = ec.gp.koza.KozaNodeSelector

gp.fs.0.info = ec.gp.GPFuncInfo

gp.koza.half.growp = 0.5

gp.tc.0.init.min = 2

gp.koza.mutate.source.0 = ec.select.TournamentSelection

gp.koza.mutate.tries = 1

gp.tc.1.init.min = 2

gp.fs.1.info = ec.gp.GPFuncInfo

gp.tc.2.init.min = 2

gp.koza.mutate.maxdepth = 17

Most of these unaccessed parameters are perfectly fine; standard boilerplate stuff for genetic programming that didn’t happen to be used by this application. But then there’s the first parameter: gp.fs.2.info = ec.gp.GPFuncInfo, and two others like it later. I had deleted the GPFuncInfo class from the ECJ distribution well over a year ago. But apparently I forgot to remove a vestigial parameter which referred to it. Oops!

By the way, note the request to ignore “parent.x references” — this means to ignore the stuff like

parent.0 = ../../gp/koza/koza.params that gets printed out with everything else.

For more on debugging ECJ, see Section 3.8.

2.1.8 Building a Parameter Database from Scratch

This is starting to get inside-baseball, so you may wish to skip it for now. Normally ParameterDatabase is

constructed from the Evolve class on your behalf. But in some unusual situations you may need to build

one yourself. Most notably, if you’re attaching ECJ as a sub-module under some application or toolkit (see

Section 2.7), you may need to make a custom ParameterDatabase with which to run ECJ.

A typical ParameterDatabase is constructed from a file or a resource relative to a class. When this is done, what you receive is an empty ParameterDatabase which points to another ParameterDatabase. The empty ParameterDatabase is free for you to modify. For example, if you call...

File file = ...

String[] commandLineArguments = ...

ParameterDatabase db = new ParameterDatabase(file, commandLineArguments);

... you will get back an empty ParameterDatabase whose parent is a ParameterDatabase holding the command-line arguments, whose parent is a ParameterDatabase constructed from the file. That final ParameterDatabase may have further parents as specified in the file itself.

Other ParameterDatabase constructors, particularly ones which do not concern themselves with speciﬁc

ﬁles, will not have this feature.

The particular chain, if there is one at all, varies depending on the constructor you call. The one in the

example above is the most common constructor.

ec.util.ParameterDatabase Constructor Methods

public ParameterDatabase()

Creates a simple empty parameter database with no parents. The database hierarchy is simply: empty

public ParameterDatabase(Dictionary map)

Creates a parameter database with the given Dictionary. Both the keys and values will be run through toString() before adding to the database. Keys are parameters. Values are the values of the parameters. If parents are defined in the map’s parameters, an attempt will be made to load them: only parents which are absolute path names are permitted. Beware that a ParameterDatabase is itself a Dictionary; but if you pass one in here you will only get the lowest-level elements. The database hierarchy is: map → · · · Note that unlike other methods, this method does not create an empty base parameter database. If you want a hierarchy like empty → map → · · · you can achieve this with new ParameterDatabase().addParent(new ParameterDatabase(myMap))

public ParameterDatabase(InputStream stream)

Creates a parameter database by reading parameters from the provided stream. If parents are defined among the parameters, an attempt will be made to load them: only parents which are absolute path names are permitted. Beware that a ParameterDatabase is itself a Dictionary; but if you pass one in here you will only get the lowest-level elements. The database hierarchy is: stream → · · · Note that unlike other methods, this method does not create an empty base parameter database. If you want a hierarchy like empty → stream → · · · you can achieve this with new ParameterDatabase().addParent(new ParameterDatabase(myStream))

public ParameterDatabase(File ﬁle)

Creates a new parameter database from the given ﬁle (and possibly parent ﬁles). The database hierarchy is:

empty →ﬁle → · · ·

public ParameterDatabase(File ﬁle, String[] args)

Creates a new parameter database from the given ﬁle (and possibly parent ﬁles). The database hierarchy is:

empty →args →ﬁle → · · ·

public ParameterDatabase(String pathNameRelativeToClassFile, Class cls)

Creates a new parameter database from the given file (and possibly parent files). The file is located at a path name relative to the object (.class) file of the provided class. For example, if the class were Foo, and its object file was located at /a/b/Foo.class, and the path name relative to the class file was ../c/bar.params, then the file would be expected to be located at /a/c/bar.params. This also works inside jar files. The database hierarchy is:

empty →pathNameRelativeToClassFile → · · ·

public ParameterDatabase(String pathNameRelativeToClassFile, Class cls, String[] args)

Creates a new parameter database from the given file (and possibly parent files). The file is located at a path name relative to the object (.class) file of the provided class. For example, if the class were Foo, and its object file was located at /a/b/Foo.class, and the path name relative to the class file was ../c/bar.params, then the file would be expected to be located at /a/c/bar.params. This also works inside jar files. The database hierarchy is:

empty →args →pathNameRelativeToClassFile → · · ·

Once you have created a ParameterDatabase, you can attach parent ParameterDatabases to it. You can

also set values in the ParameterDatabase, and remove values (though you cannot remove values from its

parents without accessing the parents themselves), among other operations.

ec.util.ParameterDatabase Methods

public void addParent(ParameterDatabase database)

Adds a parent database to the parameter database.

public void set(Parameter parameter, String value)

Sets a parameter in the immediate parameter database. This overrides settings in parents. The value is ﬁrst

trimmed of whitespace.

public void remove(Parameter parameter)

Removes a parameter from an immediate parameter database (but not from its parents in the hierarchy).

public ParameterDatabase getLocation(Parameter parameter)

Returns the parameter database in the parent hierarchy which deﬁned the currently-used value for the given

parameter.

public String getLabel()

Returns a string describing the location of the ParameterDatabase (such as the file name from which it was

loaded), or the empty string if there is nothing appropriate.
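The empty → args → file hierarchy, and the way set(...) and remove(...) act only on the immediate database, can be modelled with a small layered map. This is a sketch under stated assumptions rather than ECJ's implementation: LayeredParams is a hypothetical class, parameter names are plain Strings, and parents are assumed to be searched depth-first in the order they were added.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a parameter database with parent databases.
class LayeredParams {
    private final Map<String, String> local = new HashMap<>();
    private final List<LayeredParams> parents = new ArrayList<>();
    private final String label;

    LayeredParams(String label) { this.label = label; }

    void addParent(LayeredParams parent) { parents.add(parent); }

    // Sets a value in this database only; it shadows any value held by parents.
    void set(String name, String value) { local.put(name, value.trim()); }

    // Removes a value from this database only; parents are untouched.
    void remove(String name) { local.remove(name); }

    String getLabel() { return label; }

    // Returns the database that supplies the value currently in effect, or null.
    LayeredParams getLocation(String name) {
        if (local.containsKey(name)) return this;
        for (LayeredParams parent : parents) {
            LayeredParams found = parent.getLocation(name);
            if (found != null) return found;
        }
        return null;
    }

    String get(String name) {
        LayeredParams where = getLocation(name);
        return (where == null) ? null : where.local.get(name);
    }

    public static void main(String[] args) {
        LayeredParams file = new LayeredParams("file");
        LayeredParams arguments = new LayeredParams("args");
        LayeredParams empty = new LayeredParams("empty");
        arguments.addParent(file);
        empty.addParent(arguments);

        file.set("generations", "200");
        arguments.set("generations", "500");
        System.out.println(empty.get("generations"));                     // prints 500
        System.out.println(empty.getLocation("generations").getLabel());  // prints args

        empty.set("generations", " 50 ");
        System.out.println(empty.get("generations"));                     // prints 50
        empty.remove("generations");
        System.out.println(empty.get("generations"));                     // prints 500
    }
}
```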

2.2 Output

ECJ has its own idiosyncratic logging and output facility called ec.util.Output. This is largely historical: ECJ predates any standard logging facilities available in Java. The facility is in part inspired by a similar facility that existed in the lil-gp C-based genetic programming system. The system has generally worked out well so we’ve not seen fit to replace it.

The primary reason for the central logging and output facility is to survive checkpointing and restarting

from checkpoints (see Section 2.3). Except for the occasional debugging statement which we’ve forgotten to

remove, all output in ECJ goes through ec.util.Output.

The output facility has four basic features:

• Logs, attached to Files or to Writers, which output text of all kinds. Logs can be restarted, meaning that they can be reopened when ECJ is restarted from a checkpoint.

• Two dedicated Logs, the Message Logs, which write text out to stdout and stderr respectively.

• The ability to print arbitrary text to any Log.

• Short Announcements of different kinds. Announcements are different from arbitrary text in that they are not only written out to Logs (usually the stderr message Log) but are also stored in memory. This allows them to be checkpointed and automatically reposted after ECJ has started up again from a checkpoint.

The least important announcements are simple messages. One special kind of message is the system message generated by ECJ itself. Next in importance are warnings. One special kind of warning, the once-only-warning, will be written only once to a Log even if it’s posted multiple times. Next are basic errors. Things can be configured such that after a bunch of errors, ECJ will quit. Finally, fatal errors will cause ECJ to quit immediately rather than wait for more errors to accumulate.

2.2.1 Creating and Writing to Logs

There are many methods in ec.util.Output for creating or accessing Logs. Here are some common ones:

ec.util.Output Methods

public int addLog(File ﬁle, boolean appendOnRestart)

Add a log on a given ﬁle. If ECJ is restarted from a checkpoint, and appendOnRestart is true, then the log will be

appended to the current ﬁle contents. Else they will be replaced. The Log is registered with ec.util.Output and its

log number is returned.

public int addLog(File ﬁle, boolean appendOnRestart, boolean gzip)

Add a log on a given ﬁle. If ECJ is restarted from a checkpoint, and appendOnRestart is true, then the log will be

appended to the current ﬁle contents. Else they will be replaced. If gzip is true, then the log will be gzipped. You

cannot have both appendOnRestart and gzip true at the same time. The Log is registered with ec.util.Output and its

log number is returned.

public Log getLog(int index)

Returns the log indexed at the given location.

public int numLogs()

Returns the total number of logs.

Two logs are always made for you automatically: a log to stdout (log index 0); and another log to stderr (log index 1). The stderr log prints all announcements, but the stdout log does not.

Logs have various instance variables, but few are important, except for this one:

public boolean silent = false;

If you set this ﬂag to true, the log will not print anything at all. See section 2.2.2 for more information on

how to do this.

To write arbitrary text to a log, here are the most common methods:

ec.util.Output Methods

public void print(String text, int log number)

Prints a string to a log.

public void println(String text, int log number)

Prints a string to a log, plus a newline.

Besides stdout (0) and stderr (1), there are two other special log numbers you should be aware of:

public int ALL_MESSAGE_LOGS;

public int NO_LOGS;

NO_LOGS is a special log value meaning “don’t bother printing this”. It’s sometimes used to turn off printing to certain logs temporarily. ALL_MESSAGE_LOGS will cause printing to be sent to all logs for which message logging is turned on. By default that’s just stderr (1). This is not commonly used.

To post a message or generate a warning or error (all of which ordinarily go to the stderr log, and are also stored in memory):

ec.util.Output Methods

public void message(String text)

Posts a message.

public void warning(String text)

Posts a warning.

public void warning(String text, Parameter parameter, Parameter default)

Posts a warning, and indicates the parameters which caused the warning. Typically used for cautioning the user

about the parameters he chose.

public void warnOnce(String text)

Posts a warning which will not appear a second time.

public void warnOnce(String text, Parameter parameter, Parameter default)

Posts a warning which will not appear a second time, and indicates the parameters which caused the warning.

Typically used for cautioning the user about the parameters he chose.

public void error(String text)

Posts an error message. The contract implied in using this method is that at some point in the near future you will

call exitIfErrors().

public void error(String text, Parameter parameter, Parameter default)

Posts an error message, and indicates the parameters which caused the error. Typically used for cautioning the user about the parameters he chose. The contract implied in using this method is that at some point in the near future you will call exitIfErrors().

public void exitIfErrors()

Exits immediately if an error has been posted.

public void fatal(String text)

Posts an error message and exits immediately.

public void fatal(String text, Parameter parameter, Parameter default)

Posts an error message, indicates the parameters which caused the error, and exits immediately. Typically used for cautioning the user about the parameters he chose.
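The contract between error(...) and exitIfErrors() is easiest to see in a toy version. The following is a hypothetical sketch, not ec.util.Output: MiniOutput is an invented class, the text prefixes are made up, and where ECJ would quit, this sketch throws an IllegalStateException so that the behaviour can be exercised without ending the JVM.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of announcements; not ECJ's ec.util.Output.
class MiniOutput {
    // Announcements are kept in memory as well as printed.
    final List<String> announcements = new ArrayList<>();
    private final Set<String> onceOnly = new HashSet<>();
    private boolean errorsPosted = false;

    void message(String text) { post(text); }

    void warning(String text) { post("WARNING: " + text); }

    // Posts the warning only the first time this exact text is seen.
    void warnOnce(String text) {
        if (onceOnly.add(text)) post("ONCE-ONLY WARNING: " + text);
    }

    // Records the error but keeps going, so several errors can accumulate.
    void error(String text) {
        errorsPosted = true;
        post("ERROR: " + text);
    }

    // Stand-in for quitting: throws if any error has been posted.
    void exitIfErrors() {
        if (errorsPosted) throw new IllegalStateException("errors were posted");
    }

    // Stand-in for quitting immediately.
    void fatal(String text) {
        post("FATAL ERROR: " + text);
        throw new IllegalStateException(text);
    }

    private void post(String text) {
        announcements.add(text);
        System.err.println(text);
    }

    public static void main(String[] args) {
        MiniOutput output = new MiniOutput();
        output.warnOnce("size is unusually small");
        output.warnOnce("size is unusually small"); // not posted a second time
        output.error("pop.subpop.0.size is missing");
        output.error("pop.subpop.1.size is missing");
        try {
            output.exitIfErrors(); // both errors have been reported before we stop
        } catch (IllegalStateException e) {
            System.err.println("would exit here: " + e.getMessage());
        }
    }
}
```

Posting several errors before calling exitIfErrors() lets a user see every bad parameter in one run instead of fixing them one at a time.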

2.2.2 Quieting the Program

ECJ prints a lot of stuff to the screen (both stdout and stderr) when doing its work. Perhaps you’d like to

shut ECJ up. It’s easy. If you set the following parameter:

silent = true

... then ECJ will eliminate both of its stdout and stderr logs, so nothing will be printed to the screen.

This parameter doesn’t prevent ECJ statistics objects from writing to various ﬁle logs. However many

statistics objects have similar options to quiet them. See Sections 3.7.3 and 7.5.0.2.

2.2.3 The ec.util.Code Class

ECJ Individuals, Fitnesses, and various other components sometimes need to write themselves to a file in a way which can both be read by humans and be read back into Java resulting in perfect copies of the original. This means that neither printing text nor writing raw binary data is adequate.

ECJ provides a utility facility to make doing this task a little simpler. The ec.util.Code class encodes and decodes basic Java data types (booleans, bytes, shorts, ints, longs, floats, doubles, chars, Strings) into Strings which can be emitted as text. They all have the same pattern:

ec.util.Code Methods

public static String encode(boolean val)

Encodes val into a String and returns it.

public static String encode(byte val )

Encodes val into a String and returns it.

public static String encode(short val )

Encodes val into a String and returns it.

public static String encode(int val)

Encodes val into a String and returns it.

public static String encode(long val)

Encodes val into a String and returns it.

public static String encode(ﬂoat val)

Encodes val into a String and returns it.

public static String encode(double val)

Encodes val into a String and returns it.

public static String encode(char val )

Encodes val into a String and returns it.

public static String encode(String val)

Encodes val into a String and returns it. Obviously encoding a String into a String sounds goofy, but go with us

here.

These methods encode their data in an idiosyncratic way. Here’s a table describing it:

Data Type   Encoding                                          Example
boolean     T or F                                            T
byte        bvalueAsDecimalNumber|                            b59|
short       svalueAsDecimalNumber|                            s-321|
int         ivalueAsDecimalNumber|                            i42391|
long        lvalueAsDecimalNumber|                            l-342341232|
float       fvalueEncodedAsInteger|valuePrintedForHumans|     f-665866527|-9.1340002E14|
double      dvalueEncodedAsLong|valuePrintedForHumans|        d4614256656552045848|3.141592653589793|
char        'characterWithEscapes'                            'w' or ' ' or '\n' or '\'' or '\u2FD3'
String      "stringWithEscapes"                               "Dragon in Chinese is:\n\u2FD3"

These are of course idiosyncratic,

but lacking a Java standard for doing the same task, they do an

adequate job. You’re more than welcome to go your own way.
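The float and double rows of the table can be reproduced with Java's standard bit-conversion methods. What follows is a minimal standalone sketch, not ECJ's source: the class EncodeSketch and its methods are hypothetical, and it assumes the integer field is the raw IEEE 754 bit pattern of the value, which agrees with the double example in the table.

```java
// Standalone sketch (hypothetical class, not part of ECJ).
final class EncodeSketch {
    // Assumed layout: 'd', the raw IEEE 754 bits as a long, '|',
    // the human-readable value, '|'.
    static String encodeDouble(double val) {
        return "d" + Double.doubleToLongBits(val) + "|" + val + "|";
    }

    // Assumed layout: 'f', the raw IEEE 754 bits as an int, '|',
    // the human-readable value, '|'.
    static String encodeFloat(float val) {
        return "f" + Float.floatToIntBits(val) + "|" + val + "|";
    }

    // Recovers the exact double from the bit field alone; the
    // human-readable copy is there only for people to read.
    static double decodeDouble(String encoded) {
        int bar = encoded.indexOf('|');
        return Double.longBitsToDouble(Long.parseLong(encoded.substring(1, bar)));
    }

    public static void main(String[] args) {
        // prints d4614256656552045848|3.141592653589793|
        System.out.println(encodeDouble(Math.PI));
        System.out.println(encodeFloat(-9.1340002E14f));
    }
}
```

Storing the bit pattern alongside the printed value is what lets the text be both human-readable and read back as a perfect copy: the printed decimal is for people, the bits are for the machine.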

2.2.3.1 Decoding the Hard Way

To decode a sequence of values from a String, you begin by creating an ec.util.DecodeReturn object wrapped

around the String:

DecodeReturn decodeReturn = new DecodeReturn(string);

To decode the next item out of the string, you call:

Code.decode(decodeReturn);

The type of the decoded data is stored here:

int type = decodeReturn.type;

... and is one of the following ec.util.DecodeReturn constants:

public static final byte DecodeReturn.T_ERROR = -1;

public static final byte DecodeReturn.T_BOOLEAN = 0;

public static final byte DecodeReturn.T_BYTE = 1;

public static final byte DecodeReturn.T_CHAR = 2;

public static final byte DecodeReturn.T_SHORT = 3;

public static final byte DecodeReturn.T_INT = 4;

public static final byte DecodeReturn.T_LONG = 5;

public static final byte DecodeReturn.T_FLOAT = 6;

public static final byte DecodeReturn.T_DOUBLE = 7;

public static final byte DecodeReturn.T_STRING = 8;

The eccentricities in this class stem from it being developed well before Java had any standard way to do such things itself; indeed, Java still doesn’t have a standard way to do most of this. I might improve it in the future, at the very least, by not requiring type symbols (like b) in front of integer types, and by including methods named things like DecodeReturn.getFloat() which throw exceptions rather than requiring one to look up type information.

If the type is a boolean (false = 0, true = 1), byte, char, short, int, or long, the result is stored here:

long result = decodeReturn.l;

If the type is a double or ﬂoat, the result is stored here:

double result = decodeReturn.d;

If the type is a String, the result is stored here:

String result = decodeReturn.s;

To decode the next element out of the String, just call Code.decode(decodeReturn) again. Continue doing this until you’re satisfied or reach a type of T_ERROR.
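The decode-until-error loop described above can be sketched without ECJ on the classpath. The classes MiniDecodeReturn and MiniCode below are hypothetical stand-ins, not ECJ's DecodeReturn and Code: they handle only booleans and ints, and they assume tokens may be separated by spaces. The shape of the loop (decode, check the type, read the result, repeat) is the part that carries over.

```java
// Standalone sketch (hypothetical classes, not ECJ's DecodeReturn or Code).
final class MiniDecodeReturn {
    static final byte T_ERROR = -1, T_BOOLEAN = 0, T_INT = 4;

    final String data; // the text being decoded
    int pos = 0;       // index of the next undecoded character
    byte type;         // type of the most recently decoded item
    long l;            // value of the most recently decoded item

    MiniDecodeReturn(String data) { this.data = data; }
}

final class MiniCode {
    // Decodes the next boolean (T or F) or int (i<digits>|) token,
    // skipping leading spaces. Sets type to T_ERROR when nothing is left
    // or the token is not recognized.
    static void decode(MiniDecodeReturn d) {
        while (d.pos < d.data.length() && d.data.charAt(d.pos) == ' ') d.pos++;
        if (d.pos >= d.data.length()) { d.type = MiniDecodeReturn.T_ERROR; return; }
        char c = d.data.charAt(d.pos);
        if (c == 'T' || c == 'F') {
            d.type = MiniDecodeReturn.T_BOOLEAN;
            d.l = (c == 'T') ? 1 : 0;
            d.pos++;
        } else if (c == 'i') {
            int bar = d.data.indexOf('|', d.pos);
            if (bar < 0) { d.type = MiniDecodeReturn.T_ERROR; return; }
            try {
                d.l = Integer.parseInt(d.data.substring(d.pos + 1, bar));
            } catch (NumberFormatException e) {
                d.type = MiniDecodeReturn.T_ERROR;
                return;
            }
            d.type = MiniDecodeReturn.T_INT;
            d.pos = bar + 1;
        } else {
            d.type = MiniDecodeReturn.T_ERROR;
        }
    }

    public static void main(String[] args) {
        MiniDecodeReturn d = new MiniDecodeReturn("i13| T i-7|");
        // Keep decoding until the type comes back as T_ERROR.
        for (decode(d); d.type != MiniDecodeReturn.T_ERROR; decode(d)) {
            System.out.println("type " + d.type + " value " + d.l);
        }
    }
}
```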

2.2.3.2 Decoding the Easy Way

One of the most common decoding tasks is reading an encoded number or boolean from a single line, often

preceded with a preamble, such as:

Evaluated: T

... or ...

Size of Genome: i13|

The Code class has some convenience methods for decoding these lines without having to muck about

with a DecodeReturn:

ec.util.Code Methods

public static float readFloatWithPreamble(String preamble, EvolutionState state, LineNumberReader reader)

Decodes and returns an encoded single ﬂoating-point value from the reader, ﬁrst skipping past an expected

preamble. If the preamble does not exist, or the value does not exist, an error is issued.

public static double readDoubleWithPreamble(String preamble, EvolutionState state, LineNumberReader reader)

Decodes and returns an encoded double ﬂoating-point value from the reader, ﬁrst skipping past an expected

preamble. If the preamble does not exist, or the value does not exist, an error is issued.

public static boolean readBooleanWithPreamble(String preamble, EvolutionState state, LineNumberReader reader)

Decodes and returns an encoded boolean value from the reader, ﬁrst skipping past an expected preamble. If the

preamble does not exist, or the value does not exist, an error is issued.

public static byte readByteWithPreamble(String preamble, EvolutionState state, LineNumberReader reader )

Decodes and returns an encoded byte from the reader, ﬁrst skipping past an expected preamble. If the preamble

does not exist, or the value does not exist, an error is issued.

public static short readShortWithPreamble(String preamble, EvolutionState state, LineNumberReader reader)

Decodes and returns an encoded short from the reader, ﬁrst skipping past an expected preamble. If the preamble

does not exist, or the value does not exist, an error is issued.

public static int readIntegerWithPreamble(String preamble, EvolutionState state, LineNumberReader reader)

Decodes and returns an encoded integer from the reader, ﬁrst skipping past an expected preamble. If the preamble

does not exist, or the value does not exist, an error is issued.

public static long readLongWithPreamble(String preamble, EvolutionState state, LineNumberReader reader)

Decodes and returns an encoded long from the reader, ﬁrst skipping past an expected preamble. If the preamble

does not exist, or the value does not exist, an error is issued.

public static char readCharacterWithPreamble(String preamble, EvolutionState state, LineNumberReader reader)

Decodes and returns an encoded character from the reader, ﬁrst skipping past an expected preamble. If the

preamble does not exist, or the value does not exist, an error is issued.

public static String readStringWithPreamble(String preamble, EvolutionState state, LineNumberReader reader)

Decodes and returns an encoded string from the reader, ﬁrst skipping past an expected preamble. If the preamble

does not exist, or the value does not exist, an error is issued.
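To show what such a convenience method has to do, here is a minimal standalone sketch of reading an int behind a preamble. The class PreambleSketch and its method are hypothetical and are not ECJ's implementation. ECJ's versions take an EvolutionState so they can issue errors through it; this sketch throws an IOException instead.

```java
import java.io.IOException;
import java.io.LineNumberReader;
import java.io.StringReader;

// Standalone sketch (hypothetical helper, not ECJ's Code class).
final class PreambleSketch {
    // Reads one line, requires it to begin with the preamble, then decodes
    // the int token (i<digits>|) which follows it.
    static int readIntWithPreamble(String preamble, LineNumberReader reader)
            throws IOException {
        String line = reader.readLine();
        if (line == null || !line.startsWith(preamble))
            throw new IOException("Line " + reader.getLineNumber()
                + " lacks the preamble \"" + preamble + "\"");
        String rest = line.substring(preamble.length()).trim();
        if (!rest.startsWith("i") || !rest.endsWith("|"))
            throw new IOException("Line " + reader.getLineNumber()
                + " has no encoded int after the preamble");
        try {
            return Integer.parseInt(rest.substring(1, rest.length() - 1));
        } catch (NumberFormatException e) {
            throw new IOException("Line " + reader.getLineNumber()
                + " has a malformed int: " + rest);
        }
    }

    public static void main(String[] args) throws IOException {
        LineNumberReader reader =
            new LineNumberReader(new StringReader("Size of Genome: i13|\n"));
        System.out.println(readIntWithPreamble("Size of Genome: ", reader)); // prints 13
    }
}
```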

2.3 Checkpointing

ECJ supports checkpointing, meaning the ability to save the state of the stochastic optimization process to a

ﬁle at any point in time, and later start a new ECJ process resuming at that exact state. Checkpointing is

particularly useful when doing long processes on shared servers or other environments where the process

may be killed at any time. ECJ’s checkpointing procedure largely consists of applying Java’s serialization

mechanism to the ec.EvolutionState object, which in turn serializes the entire object graph of the current

system.

Turn on checkpointing like this:

checkpoint = true

ECJ typically writes out checkpoint files every n generations (or, in the steady-state evolution situation, every n generations’ worth of evaluations of individuals). To set n = 4, you’d say:

checkpoint-modulo = 4

ECJ writes to checkpoint ﬁles named ec.generation.gz, where generation is the current generation number.

If you don’t like the ec preﬁx for some reason, change it to, say, curmudgeon like this:

checkpoint-prefix = curmudgeon

By default ECJ writes checkpoints to the directory from which you ran Java. But you can set a parameter

to specify a directory to which checkpoints should be written, such as /tmp/:

checkpoint-directory = /tmp/

This directory can be an absolute, relative, execution relative, or class relative path (see Section 2.1.2 for a

refresher). But it must be a directory, not a ﬁle.
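The way the modulo and prefix parameters combine into a file name can be sketched as follows. This is an illustration under assumptions: the class and method names are hypothetical, the checkpoint directory is left out, and the rule that a checkpoint is due whenever the generation number is a multiple of checkpoint-modulo is inferred from the sample output shown in this section rather than taken from ECJ's source.

```java
// Standalone sketch (hypothetical class, not part of ECJ).
final class CheckpointNameSketch {
    // Returns the checkpoint file name for this generation, or null if
    // no checkpoint is due. Assumes a checkpoint is due whenever the
    // generation number is a multiple of the modulo.
    static String checkpointFile(String prefix, int generation, int modulo) {
        if (generation % modulo != 0) return null;
        return prefix + "." + generation + ".gz";
    }

    public static void main(String[] args) {
        // With checkpoint-modulo = 2 and the default prefix "ec":
        for (int generation = 1; generation <= 6; generation++) {
            String file = checkpointFile("ec", generation, 2);
            if (file != null) System.out.println("Wrote out checkpoint file " + file);
        }
    }
}
```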

Whenever a checkpoint is written, this fact is also added as an announcement. Here’s the output of a

typical run with checkpointing every two generations.

| ECJ

| An evolutionary computation system (version 19)

| By Sean Luke

| Contributors: L. Panait, G. Balan, S. Paus, Z. Skolicki, R. Kicinger, E. Popovici,

| K. Sullivan, J. Harrison, J. Bassett, R. Hubley, A. Desai, A. Chircop,

| J. Compton, W. Haddon, S. Donnelly, B. Jamil, and J. O’Beirne

| URL: http://cs.gmu.edu/~eclab/projects/ecj/

| Mail: ecj-help@cs.gmu.edu

| (better: join ECJ-INTEREST at URL above)

| Date: July 10, 2009

| Current Java: 1.5.0_20 / Java HotSpot(TM) Client VM-1.5.0_20-141

| Required Minimum Java: 1.4

Threads: breed/1 eval/1

Seed: -530434079

Job: 0

Setting up

Initializing Generation 0

Subpop 0 best fitness of generation: Fitness: -1542.1932

Generation 1

Subpop 0 best fitness of generation: Fitness: -1499.354

Checkpointing

Wrote out checkpoint file ec.2.gz

Generation 2

Subpop 0 best fitness of generation: Fitness: -1497.0482

Generation 3

Subpop 0 best fitness of generation: Fitness: -1481.9377

Checkpointing

Wrote out checkpoint file ec.4.gz

Generation 4

Subpop 0 best fitness of generation: Fitness: -1426.816

...

Imagine that at this point the power failed and we lost the process. We’d like to start again from the

checkpoint ﬁle ec.4.gz. We can do that by typing:

java ec.Evolve -checkpoint ec.4.gz

Notice that we don’t provide a parameter ﬁle or optional command-line parameters. That’s because

the parameter database has already been built and stored inside the checkpoint ﬁle. When ECJ starts up

from a checkpoint ﬁle, it starts right where it left off, but ﬁrst spits out all the announcements that had been

produced up to that point, with one exception. See if you can catch it:

Restoring from Checkpoint ec.4.gz

| ECJ

| An evolutionary computation system (version 19)

| By Sean Luke

| Contributors: L. Panait, G. Balan, S. Paus, Z. Skolicki, R. Kicinger, E. Popovici,

| K. Sullivan, J. Harrison, J. Bassett, R. Hubley, A. Desai, A. Chircop,

| J. Compton, W. Haddon, S. Donnelly, B. Jamil, and J. O’Beirne

| URL: http://cs.gmu.edu/~eclab/projects/ecj/

| Mail: ecj-help@cs.gmu.edu

| (better: join ECJ-INTEREST at URL above)

| Date: July 10, 2009

| Current Java: 1.5.0_20 / Java HotSpot(TM) Client VM-1.5.0_20-141

| Required Minimum Java: 1.4

Threads: breed/1 eval/1

Seed: -530434079

Job: 0

Setting up

Initializing Generation 0

Subpop 0 best fitness of generation: Fitness: -1542.1932

Generation 1

Subpop 0 best fitness of generation: Fitness: -1499.354

Checkpointing

Wrote out checkpoint file ec.2.gz

Generation 2

Subpop 0 best fitness of generation: Fitness: -1497.0482

Generation 3

Subpop 0 best fitness of generation: Fitness: -1481.9377

Checkpointing

Generation 4

Subpop 0 best fitness of generation: Fitness: -1426.816

Generation 5

Subpop 0 best fitness of generation: Fitness: -1336.0835

Checkpointing

Wrote out checkpoint file ec.6.gz

Generation 6

Subpop 0 best fitness of generation: Fitness: -1302.0063

...

2.3.1 Implementing Checkpointable Code

ECJ’s checkpoint facility relies on Java’s serialization package. When ECJ checkpoints, it serializes the

EvolutionState. Since everything in an ECJ run is hanging off of the EvolutionState somewhere, the entire ECJ

run is serialized out to disk.
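The mechanism itself is ordinary Java serialization through a gzip stream. The following is a minimal standalone sketch of that round trip; ToyState and CheckpointSketch are hypothetical stand-ins for EvolutionState and ECJ's checkpointing code, not the real classes.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical stand-in for the root of the object graph.
final class ToyState implements Serializable {
    private static final long serialVersionUID = 1;
    int generation;
    double[] population; // everything reachable from the root is written too
}

// Standalone sketch (not ECJ's Checkpoint class).
final class CheckpointSketch {
    // Serializes the whole object graph hanging off state to a gzipped file.
    static void write(ToyState state, File file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(
                new GZIPOutputStream(new FileOutputStream(file)))) {
            out.writeObject(state);
        }
    }

    // Rebuilds the object graph from a file written by write(...).
    static ToyState read(File file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(
                new GZIPInputStream(new FileInputStream(file)))) {
            return (ToyState) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        ToyState state = new ToyState();
        state.generation = 4;
        state.population = new double[] { -1426.816, -1481.9377 };
        File file = File.createTempFile("ec.4.", ".gz");
        write(state, file);
        ToyState restored = read(file);
        System.out.println(restored.generation); // prints 4
        file.delete();
    }
}
```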

Checkpointing is fragile. When you write your code, here are some good practices you should follow:

• Add to each of your classes the following static variable:

private static final long serialVersionUID = 1;

• Try to avoid non-static inner classes. But if you must have one, it should also declare the aforementioned serialVersionUID variable.

• All static variables should be final and should be simple types, such as Strings, ints, floats, etc. If you need to store global information, it should be stored as an instance variable in your subclass of EvolutionState.

• Don’t allocate your own threads or locks.

• If you make a special object, it must be java.io.Serializable. Most ECJ classes are already serializable, so you inherit this by just subclassing from them.

Checkpointing is normally done after breeding has occurred, and the generation number has been

incremented. Three things typically happen:

1. The preCheckpointStatistics(...) method is called on the statistics object.

2. The setCheckpoint(...) method is called on the checkpoint object. This causes the checkpoint object to serialize out the current EvolutionState to a gzipped checkpoint file.

3. The postCheckpointStatistics(...) method is called on the statistics object.

When the system is restored from a checkpoint, the following happens:

1. The restoreFromCheckpoint(...) method is called on the Checkpoint class. This method does the following:

(a) It first unserializes the EvolutionState from the checkpoint file.

(b) It then calls resetFromCheckpoint(...) on the EvolutionState. The resetFromCheckpoint method normally does two things:

i. restart(...) is called on the Output. This allows it to set up output logs again.

ii. reinitializeContacts(...) is called on the Exchanger, then on the Evaluator, to allow them to reestablish network connections for distributed evaluation and for island models.

2. Evolution is resumed by calling the run(...) method on the unserialized EvolutionState.

3. startFromCheckpoint(...) is then called on the EvolutionState. This is a simple hook method you can use to set up things before evolution starts again.

At this point, evolution continues.

In general you have two hooks available to you to set up after resuming from a checkpoint. First, you can override the method EvolutionState.resetFromCheckpoint(). This method is called before EvolutionState.run(...) is called to resume running. You would override this method to reopen files or sockets (it optionally throws an IOException).

Second, you could override the method EvolutionState.startFromCheckpoint(). This method is called

during EvolutionState.run(...), typically immediately before the run resumes. You would typically override

this method to do internal setup that doesn’t involve external communication.

In either case, be sure to call the supermethod ﬁrst.

2.4 Threads and Random Number Generation

In many cases ECJ supports multiple threads at two stages of the evolutionary process: during breeding and

during evaluation. You can specify the number of threads for each of these processes like this:

breedthreads = 4

evalthreads = 4

Typically, but not always, you’d want to set these numbers to match the number of cores or processors on

your computer. And usually these two numbers should be the same. If you don’t know the number of cores,

you can let ECJ try to ﬁgure it out for you by saying:

breedthreads = auto

evalthreads = auto

ECJ is still capable of producing replicable results even when threading is turned on: you’ll get the

same results if you use the same number of evaluation and breeding threads and the same random number

generator seeds. Which brings us to...

2.4.1 Random Numbers

As beﬁtting its name, stochastic optimization is stochastic, meaning involving randomness. This means

that a random number generator is central to the algorithms in ECJ, and it’s crucial to have a fairly good

generator. Unfortunately, Java’s default random number generator, java.util.Random, is notoriously bad. It

creates highly nonrandom sequences, so much so that websites have been developed to show off how awful

it is.⁴ Never, ever, use java.util.Random in your ECJ code.

ECJ comes with a high quality random number generator ready for you to use: ec.util.MersenneTwisterFast. This is a fast implementation of a famous random number generator, the Mersenne Twister.⁵ The Mersenne Twister has a very high period and good statistical randomness qualities.

If you’re comfortable with java.util.Random, you’ll be ﬁne. ec.util.MersenneTwisterFast has all the methods

that java.util.Random has, plus one or two more.

In ECJ, Mersenne Twister is seeded with a single 32-bit integer other than zero (actually, it’s a long, but only the first 32 bits are used).⁶ You specify this seed with the following parameter:

seed.0 = -492341

Setting the seed this way gives you control over ECJ’s results: if you set the seed to the same value, ECJ

will produce the exact same results again. But if you like you can also let ECJ set the seed to the current wall

clock time in milliseconds, which is almost always different for different runs:

seed.0 = time

One reason ECJ’s Mersenne Twister implementation is fairly fast is that it’s not threadsafe. Thus ECJ

maintains one random number generator for each thread used by the program. This means that if you have

more than one thread, you’ll have more than one random number generator, and each one of them will need

a seed. Let’s say you’ve settled on two threads. You can set both random number generator seeds like this:

evalthreads = 2

breedthreads = 2

seed.0 = -492341

seed.1 = 93123

You can also use wall clock time. Speciﬁcally, if you instead do the following:

evalthreads = 2

breedthreads = 2

seed.0 = time

seed.1 = time

...ECJ will guarantee that the two seeds differ. Last, if you set your threads automatically:

evalthreads = auto

breedthreads = auto

...then ECJ will automatically set all the seeds using wall clock time, except the ones you specify by hand.

After all, you don’t know how many seeds you’ll get!

The Mersenne Twister random number generators are stored in an array, located in a variable called

random in the ec.EvolutionState object. The size of the array is the maximum of the number of breed and

evaluation threads being used. How do you know which random number generator you should use? Many

methods in ECJ are passed a thread number. This number is the index into the random number generator

array for the thread in which this method is being called. For example, to get a random double, you typically

see things along these lines:

double d = state.random[threadnum].nextDouble();

If you’re in a single-threaded portion of the program, you can just use generator number 0.
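The one-generator-per-thread arrangement, and the replicability it buys, can be sketched in plain Java. Everything named here is an assumption made for illustration: the class is hypothetical, and java.util.SplittableRandom is only a stand-in so the sketch runs without ECJ on the classpath. In ECJ code you would draw from state.random[threadnum] instead.

```java
import java.util.Arrays;
import java.util.SplittableRandom;

// Standalone sketch (hypothetical class; SplittableRandom stands in for
// ec.util.MersenneTwisterFast).
final class PerThreadRandomSketch {
    // Starts one thread per seed. Each thread draws only from its own
    // generator, random[threadnum], so no locking is required and the
    // outcome does not depend on how the threads are scheduled.
    static double[] run(long[] seeds, int drawsPerThread) throws InterruptedException {
        int n = seeds.length;
        SplittableRandom[] random = new SplittableRandom[n];
        for (int i = 0; i < n; i++) random[i] = new SplittableRandom(seeds[i]);

        double[] sums = new double[n];
        Thread[] threads = new Thread[n];
        for (int t = 0; t < n; t++) {
            final int threadnum = t;
            threads[t] = new Thread(() -> {
                double sum = 0;
                for (int i = 0; i < drawsPerThread; i++)
                    sum += random[threadnum].nextDouble();
                sums[threadnum] = sum;
            });
            threads[t].start();
        }
        for (Thread thread : threads) thread.join();
        return sums;
    }

    public static void main(String[] args) throws InterruptedException {
        long[] seeds = { -492341L, 93123L };
        // Same seeds and same number of threads: same results.
        System.out.println(Arrays.equals(run(seeds, 1000), run(seeds, 1000))); // prints true
    }
}
```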

⁴ See for example http://alife.co.uk/nonrandom/

⁵ http://www.math.sci.hiroshima-u.ac.jp/~m-mat/MT/emt.html

⁶ Actually, Mersenne Twister can be seeded with its full internal state: an array of over 600 integers. But it’s pretty rare to need this, and you’d have to do it programmatically in the random number generator rather than as an ECJ parameter.

Any gotchas?

Yes. The standard MT19937 seeding algorithm uses one of Donald Knuth’s plain-jane linear

congruential generators to ﬁll the Mersenne Twister’s arrays. This means that for a short while the algorithm

will initially be outputting a (very slightly) lower quality random number stream until it warms up. After

about 625 calls to the generator, it’ll be warmed up sufﬁciently. You probably will never notice or care, but if

you wanted to be extra extra paranoid, you could call nextInt() 1300 times or so when your model is initially

started. Perhaps in the future we’ll do that for you.

MersenneTwisterFast (which ECJ uses) and its sibling MersenneTwister have identical methods to

java.util.Random, plus one or two more for good measure. They should look familiar to you:

ec.util.MersenneTwisterFast Constructor Methods

public MersenneTwisterFast(long seed)

Seeds the random number generator. Note that only the ﬁrst 32 bits of the seed are used.

public MersenneTwisterFast()

Seeds the random number generator using the current time in milliseconds.

public MersenneTwisterFast(int[] vals)

Seeds the random number generator using the given array. Only the ﬁrst 624 integers in the array are used. If the

array is shorter than 624, then the integers are repeatedly used in a wrap-around fashion (not recommended). The

integers can be anything, but you should avoid too many zeros. MASON does not call this method.

ec.util.MersenneTwisterFast Methods

public void setSeed(long seed)

Seeds the random number generator. Note that only the ﬁrst 32 bits of the seed are used.

public void setSeed(int[] vals)

Seeds the random number generator using the given array. Only the ﬁrst 624 integers in the array are used. If the

array is shorter than 624, then the integers are repeatedly used in a wrap-around fashion (not recommended). The

integers can be anything, but you should avoid too many zeros.

public double nextDouble()

Returns a random double drawn in the half-open interval from [0.0, 1.0). That is, 0.0 may be drawn but 1.0 will

never be drawn.

public double nextDouble(boolean includeZero, boolean includeOne)

Returns a random double drawn in the interval from 0.0 to 1.0, possibly including 0.0 or 1.0 or both, as specified in the arguments.

public float nextFloat()

Returns a random float drawn in the half-open interval [0.0f, 1.0f). That is, 0.0f may be drawn but 1.0f will never be drawn.

public float nextFloat(boolean includeZero, boolean includeOne)

Returns a random float drawn in the interval from 0.0f to 1.0f, possibly including 0.0f or 1.0f or both, as specified in the arguments.

public double nextGaussian()

Returns a random double drawn from the standard normal Gaussian distribution (that is, a Gaussian distribution

with a mean of 0 and a standard deviation of 1).

public long nextLong()

Returns a random long.

public long nextLong(long n)

Returns a random long drawn from between 0 and n−1 inclusive.

public int nextInt()

Returns a random integer.

public int nextInt(int n)

Returns a random integer drawn from between 0 and n−1 inclusive.

public short nextShort()

Returns a random short.

public char nextChar()

Returns a random character.

public byte nextByte()

Returns a random byte.

public void nextBytes(byte[] bytes)

Fills the given array with random bytes.

public boolean nextBoolean()

Returns a random boolean.

public boolean nextBoolean(float probability)

Returns a random boolean which is true with the given probability, else false. Note that you must carefully pass in a float here, else it’ll use the double version below (which is twice as slow).

public boolean nextBoolean(double probability)

Returns a random boolean which is true with the given probability, else false.

public Object clone()

Clones the generator.

public boolean stateEquals(Object o)

Returns true if the given Object is a MersenneTwisterFast and if its internal state is identical to this one.

public void writeState(DataOutputStream stream)

Writes the state to a stream.

public void readState(DataInputStream stream)

Reads the state from a stream as written by writeState(...).

public static void main(String[] args)

Performs a test of the code.

2.4.2 Selecting Randomly from Distributions

Selecting from distributions is a common task in stochastic optimization.

ECJ has a utility class,

ec.util.RandomChoice, which makes it easy to set up and select from histogram-style (arbitrary) distributions,

such as selecting randomly from a Population by Fitness.

The distributions in question come in the form of arrays of ﬂoats, doubles, or special objects which can

provide their own ﬂoat or double values. The values in these arrays are expected to form a probability

density function (PDF). The objective is to select indexes in this array proportional to their value. To begin,

you call one of the following methods on your array to have RandomChoice convert it into a Cumulative Distribution Function (CDF) to make selection easier:

ec.util.RandomChoice Methods

These are just histogram distributions. If what you need is to pick random numbers under some mathematical distribution (Poisson,

say), ECJ doesn’t have support for that. However ECJ’s sister package, MASON, has support for it in its utilities. See MASON’s

sim.util.distribution package. You can remove this package from MASON and just use it with ECJ with no problems: it comes with

MASON but is independent of it. See http://cs.gmu.edu/∼eclab/projects/mason/

public static void organizeDistribution(float[] probabilities, boolean allowAllZeros)

If the array is all zeros, then if allowAllZeros is false, an ArithmeticException is thrown, else the array is converted to all ones. Then the array is converted to a CDF. If the array has negative numbers or is of zero length, an ArithmeticException is thrown.

public static void organizeDistribution(float[] probabilities)

If the array is all zeros, an ArithmeticException is thrown. Otherwise the array is converted to a CDF. If the array has negative numbers or is of zero length, an ArithmeticException is thrown.

public static void organizeDistribution(double[] probabilities, boolean allowAllZeros)

If the array is all zeros, then if allowAllZeros is false, an ArithmeticException is thrown, else the array is converted to all ones. Then the array is converted to a CDF. If the array has negative numbers or is of zero length, an ArithmeticException is thrown.

public static void organizeDistribution(double[] probabilities)

If the array is all zeros, an ArithmeticException is thrown. Otherwise the array is converted to a CDF. If the array has negative numbers or is of zero length, an ArithmeticException is thrown.

public static void organizeDistribution(Object[] objs, RandomChoiceChooser chooser, boolean allowAllZeros)

The objects in objs are passed to chooser to provide their floating-point values (and to set them if needed). If the array is all zeros, then if allowAllZeros is false, an ArithmeticException is thrown, else the array is converted to all ones. Then the array is converted to a CDF. If the array has negative numbers or is of zero length, an ArithmeticException is thrown.

public static void organizeDistribution(Object[] objs, RandomChoiceChooser chooser)

The objects in objs are passed to chooser to provide their floating-point values (and to set them if needed). If the array is all zeros, an ArithmeticException is thrown. Otherwise the array is converted to a CDF. If the array has negative numbers or is of zero length, an ArithmeticException is thrown.

public static void organizeDistribution(Object[] objs, RandomChoiceChooserD chooser, boolean allowAllZeros)

The objects in objs are passed to chooser to provide their double-floating-point values (and to set them if needed). If the array is all zeros, then if allowAllZeros is false, an ArithmeticException is thrown, else the array is converted to all ones. Then the array is converted to a CDF. If the array has negative numbers or is of zero length, an ArithmeticException is thrown.

public static void organizeDistribution(Object[] objs, RandomChoiceChooserD chooser)

The objects in objs are passed to chooser to provide their double-floating-point values (and to set them if needed). If the array is all zeros, an ArithmeticException is thrown. Otherwise the array is converted to a CDF. If the array has negative numbers or is of zero length, an ArithmeticException is thrown.

These methods rely on two special interfaces, RandomChoiceChooser (ec.util.RandomChoiceChooser) and RandomChoiceChooserD (ec.util.RandomChoiceChooserD). RandomChoiceChooser requires two methods which map between Objects and floating-point values:

public float getProbability(Object obj);

public void setProbability(Object obj, float probability);

RandomChoiceChooserD is the same, except that it’s used for double values:

public double getProbability(Object obj);

public void setProbability(Object obj, double probability);

Once the array has been modiﬁed, you can then select random indexes from it. This is done by ﬁrst

generating a random ﬂoating-point number from 0...1, then passing that number into one of the following

methods.

ec.util.RandomChoice Methods

public static int pickFromDistribution(float[] probabilities, float probability)

Selects and returns an index in the given array which contains the given probability.

public static int pickFromDistribution(double[ ] probabilities, double probability)

Selects and returns an index in the given array which contains the given probability.

public static int pickFromDistribution(Object[] objs, RandomChoiceChooser chooser, float probability)

Selects and returns an index in the given array which contains the given probability. The chooser will provide the

ﬂoating-point values of each element in the array.

public static int pickFromDistribution(Object[ ] objs, RandomChoiceChooserD chooser, double probability)

Selects and returns an index in the given array which contains the given probability. The chooser will provide the

ﬂoating-point values of each element in the array.
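The two-step procedure (convert the array to a CDF once, then pick indexes with random numbers from 0...1) can be sketched as follows. This is a standalone illustration, not RandomChoice's source: the class and method names are hypothetical, it handles only double arrays, it always rejects all-zero arrays, and its treatment of a probability that falls exactly on a boundary is one possible choice among several.

```java
// Standalone sketch (hypothetical class, not ec.util.RandomChoice).
final class DistributionSketch {
    // Converts non-negative weights, in place, into a CDF whose last
    // entry is 1.0.
    static void organize(double[] probabilities) {
        if (probabilities.length == 0)
            throw new ArithmeticException("Distribution has no elements");
        double sum = 0;
        for (double p : probabilities) {
            if (p < 0) throw new ArithmeticException("Negative value in distribution");
            sum += p;
        }
        if (sum == 0) throw new ArithmeticException("Distribution is all zeros");
        double running = 0;
        for (int i = 0; i < probabilities.length; i++) {
            running += probabilities[i] / sum;
            probabilities[i] = running;
        }
        probabilities[probabilities.length - 1] = 1.0; // guard against rounding
    }

    // Binary search for the smallest index whose cumulative value exceeds
    // the given probability, which should lie in [0.0, 1.0).
    static int pick(double[] cdf, double probability) {
        int low = 0, high = cdf.length - 1;
        while (low < high) {
            int mid = (low + high) >>> 1;
            if (probability < cdf[mid]) high = mid;
            else low = mid + 1;
        }
        return low;
    }

    public static void main(String[] args) {
        double[] weights = { 1, 1, 2 };   // becomes { 0.25, 0.5, 1.0 }
        organize(weights);
        System.out.println(pick(weights, 0.1));  // prints 0
        System.out.println(pick(weights, 0.3));  // prints 1
        System.out.println(pick(weights, 0.75)); // prints 2
    }
}
```

Because the CDF is sorted, each pick costs a binary search rather than a linear scan, which is why the array is converted once up front.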

2.4.3 Thread-Local Storage

In certain rare cases you might need your threads to be able to stash away temporary information on a

per-thread basis. To do this, ECJ has an array of hash tables, one per thread number (indexed just like the

random number generators are). You can use these as you like, though we suggest you use them to store

information you need hashed under a unique string name special to your task. The hash tables are located in

EvolutionState and are called:

public HashMap[] data; // one hash table per thread number

These serve roughly the same purpose as Java’s ThreadLocal variables, but we can use an array here

instead of a ThreadLocal because we know beforehand how many threads we have.
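A minimal sketch of the idea follows, assuming nothing beyond an array of hash tables indexed by thread number. The class, the helper method, and the key name are hypothetical; in ECJ you would use the data array in EvolutionState rather than building your own.

```java
import java.util.HashMap;

// Standalone sketch (hypothetical class, not part of ECJ).
final class ThreadDataSketch {
    // Builds one hash table per thread number.
    @SuppressWarnings("unchecked")
    static HashMap<String, Object>[] makeData(int numThreads) {
        HashMap<String, Object>[] data =
            (HashMap<String, Object>[]) new HashMap[numThreads];
        for (int i = 0; i < numThreads; i++) data[i] = new HashMap<>();
        return data;
    }

    // Increments a per-thread counter stored under a task-specific key.
    // Only the thread with this thread number touches data[threadnum],
    // so no synchronization is needed.
    static int bump(HashMap<String, Object>[] data, int threadnum, String key) {
        Integer old = (Integer) data[threadnum].get(key);
        int next = (old == null) ? 1 : old + 1;
        data[threadnum].put(key, next);
        return next;
    }

    public static void main(String[] args) {
        HashMap<String, Object>[] data = makeData(2);
        bump(data, 0, "myexample.evaluations");
        bump(data, 0, "myexample.evaluations");
        bump(data, 1, "myexample.evaluations");
        System.out.println(data[0].get("myexample.evaluations")); // prints 2
        System.out.println(data[1].get("myexample.evaluations")); // prints 1
    }
}
```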

2.4.4 Multithreading Support

ECJ not only can evaluate and breed individuals in a multithreaded fashion, but it also uses multiple threads

in other areas, such as handling island models or distributed parallel evaluation on remote machines. Many

of these ECJ modules take advantage of a thread pool, essentially a cache of hot threads, so they don’t have to

create new ones on-the-ﬂy all the time. This could have been done with Java’s concurrency package; but ECJ

has its own lightweight thread pool called ec.util.ThreadPool.

You can hand the ThreadPool a java.lang.Runnable, and it will ﬁre off this Runnable in a separate thread,

either drawn from the ThreadPool or (if none are available) created new. ThreadPool returns to you a

ec.util.ThreadPool.Worker which represents that thread. When your Runnable has completed its task, the

Worker will return to the ThreadPool to be used later. You can also wait for a speciﬁc Worker to complete its

Runnable (a task known as joining); or wait for all currently running Workers to complete their Runnables.

You can also kill Workers not presently running a Runnable, or join all Workers and then kill everyone.

Overall it’s a very simple, lightweight API.

ec.util.ThreadPool Methods

public Worker start(Runnable run)

Starts a Worker’s thread on a given Runnable and returns that Worker. This Worker and its thread are drawn from

the pool, or if there is no available Worker, it is created new (with a new thread).

public Worker start(Runnable run, String name)

Starts a Worker’s thread on a given Runnable and returns that Worker. The thread will be named with the given

name so it is easily recognized in a debugger. This Worker and its thread are drawn from the pool, or if there is no

available Worker, it is created new (with a new thread).

public int getTotalWorkers()

Returns the total number of workers either waiting in the pool or presently working on some Runnable.

public int getPooledWorkers()

Returns the total number of workers waiting in the pool (not working on some Runnable).

public boolean join(Worker thread, Runnable run)

Blocks until the given Worker has completed running the given Runnable, then returns true. If the Worker is not

running this Runnable, returns false immediately.

public boolean join(Worker thread)

Blocks until the given Worker has completed running any current Runnable, then returns true. If the Worker is not running any Runnable, returns false immediately.

public void joinAll()

Blocks until all Workers have completed running any current Runnables.

public void killPooled()

Destroys all currently pooled Workers (that is, ones which are not running a Runnable).

public void killAll()

Blocks until all Workers have completed running any current Runnables. Then destroys all Workers.

A Worker encapsulates a running thread, and is intentionally very opaque. However you can send an

interrupt to the underlying thread if you need to (a rare need):

ec.util.ThreadPool.Worker Methods

public void interrupt()

Calls interrupt() on the Worker’s underlying thread.
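The pattern ThreadPool supports (start a Runnable, join it, then shut everything down) is the same one offered by Java's concurrency package, which, as noted above, could have been used instead. The sketch below illustrates that pattern with java.util.concurrent only. It is an analogy, not ECJ's ThreadPool API, and the class name is hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Standalone sketch (hypothetical class; uses java.util.concurrent,
// not ec.util.ThreadPool).
final class PoolAnalogySketch {
    static int runTwoTasks() throws Exception {
        // A cached pool reuses an idle thread if one is available and
        // otherwise creates a new one.
        ExecutorService pool = Executors.newCachedThreadPool();
        AtomicInteger counter = new AtomicInteger();
        Runnable task = counter::incrementAndGet;

        Future<?> first = pool.submit(task);  // roughly: start(run)
        Future<?> second = pool.submit(task);
        first.get();                          // roughly: join(worker)
        second.get();

        pool.shutdown();                      // roughly: killAll()
        pool.awaitTermination(10, TimeUnit.SECONDS);
        return counter.get();
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runTwoTasks()); // prints 2
    }
}
```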

2.5 Jobs

Perhaps you need to run ECJ 50 times and collect statistics from all 50 runs. ECJ’s ec.Evolve class provides a

rudimentary but extensible jobs facility to do this. You specify the number of jobs as follows:

jobs = 50

Each job will automatically use a different set of random number generator seeds. Additionally, if there

is more than one job, ECJ will prefix each statistics file name with job.jobnumber. For example, if we ran with

just a single job (the default) we’d probably create an output statistics ﬁle called out.stat. But if we ran with

multiple jobs, during the fourth job we’d create the output statistics ﬁle as job.3.out.stat (jobs start with 0).
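As a small illustration of the naming rule just described, the following self-contained sketch computes the statistics file name for a given job. The class and method names (JobFileNames, statFileName) are hypothetical and are not part of ECJ; the sketch only mirrors the rule stated above (no prefix for a single job, a job.N. prefix otherwise, counting from 0).

```java
/** Hypothetical helper, not part of ECJ: mirrors the file-naming rule described above. */
final class JobFileNames {
    private JobFileNames() { }

    /**
     * Returns the statistics file name for a job.
     * With a single job the base name is used unchanged; with several jobs
     * the name is prefixed with "job.<jobNumber>." (job numbers start at 0).
     */
    static String statFileName(String baseName, int jobNumber, int totalJobs) {
        if (totalJobs <= 1) return baseName;
        return "job." + jobNumber + "." + baseName;
    }

    public static void main(String[] args) {
        System.out.println(statFileName("out.stat", 0, 1));   // prints out.stat
        System.out.println(statFileName("out.stat", 3, 50));  // prints job.3.out.stat
    }
}
```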

Jobs are restarted properly from checkpoints: when you resume from a checkpoint, you’ll start up right

in that job and continue from there. This is accomplished by storing the job parameter, and the runtime

arguments, in the ec.EvolutionState object. See extended comments in the ec.Evolve source code for more

information.

But what if you need more job complexity? For example, what if you want to run ECJ with 10 different

parameter settings and 50 runs per setting? You’ll need to do some coding.

For example, let’s say we want to run 50 jobs, and each job changes the generation length. The ﬁrst job has

20 generations, the second job has 21 generations, etc. Here’s the trick. After ECJ creates the ec.EvolutionState

object, it calls the startFresh() method on this object. The default implementation calls setup(...) on this

object, then starts running the evolutionary loop. There’s your chance. Let’s say you’re using the common

EvolutionState ec.simple.SimpleEvolutionState subclass as your EvolutionState. Override the startFresh(...)

method in a custom subclass of SimpleEvolutionState along these lines:

package ec.app.myexample;

public class MySimpleEvolutionState extends ec.simple.SimpleEvolutionState
    {
    public void startFresh()
        {
        // setup() hasn't been called yet, so very few instance variables are valid at this point.
        // Here's what you can access: parameters, random, output, evalthreads, breedthreads,
        // randomSeedOffset, job, runtimeArguments, checkpointPrefix,
        // checkpointDirectory

        // Let's modify the 'generations' parameter based on the job number
        int jobNum = ((Integer)(job[0])).intValue();
        parameters.set(new ec.util.Parameter("generations"), "" + (jobNum + 20));

        // call super.startFresh() here at the end. It'll call setup() from the parameters
        super.startFresh();
        }
    }

Now we need to stipulate that this is our EvolutionState, by changing the state parameter:

state = ec.app.myexample.MySimpleEvolutionState

jobs = 50

In truth though, in all my experiments, I have personally always handled different parameter settings in a

completely different way: on the command line using a UNIX script. It’s much simpler than mucking with

Java code. For example, to run ten runs each of ﬁve different population sizes, perhaps you could do this (in

the tcsh shell language):

@ seed = 92341
foreach size (2 4 8 16 32)
    foreach job (1 2 3 4 5 6 7 8 9 10)
        @ seed = ${seed} + 17
        java ec.Evolve -file ant.params \
            -p seed.0=${seed} \
            -p pop.subpop.0.size=${size} \
            -p stat.file=out.${size}.${job}.stat
    end
end

2.6 The ec.Evolve Top-level

ECJ’s Evolve.java class looks complex but that’s just because it has various gizmos to do jobs, checkpoint

handling, etc. But in fact, the top-level loop for ECJ can be quite small. Here's all you need to do:

1. Try to load an ec.EvolutionState from a checkpoint file.
2. If it loads, call run(...) on it.
3. Otherwise...
(a) Load a parameter database.
(b) Create a new ec.EvolutionState from the parameter database.
(c) Call run(...) on it.
4. Call cleanup(...) and exit.

Here’s the basic code. As you can see, it’s not very complex.

public static void main(String[] args)

{

EvolutionState state = Evolve.possiblyRestoreFromCheckpoint(args);

if (state!=null) // loaded from checkpoint

state.run(EvolutionState.C_STARTED_FROM_CHECKPOINT);

else

{

ParameterDatabase parameters = Evolve.loadParameterDatabase(args);

state = Evolve.initialize(parameters, 0);

state.run(EvolutionState.C_STARTED_FRESH);

}

Evolve.cleanup(state);

System.exit(0);

}

The source code for ec.Evolve is very heavily commented with examples and ideas for customization.

Check it out!

The ec.Evolve class is the most common way to start up ECJ but it’s just a bootstrapping mechanism and

can be completely replaced with code of your own. However there are a number of useful utility methods in

the class which you might want to take advantage of even if you decide to roll your own bootstrapper.

ec.Evolve Methods

public static void main(String[ ] args)

The top-level. Starts up ECJ either from a checkpoint ﬁle (by calling possiblyRestoreFromCheckpoint(...)) or from

scratch (by calling initialize(...)), runs the EC process, then finally calls cleanup(...).

public static EvolutionState possiblyRestoreFromCheckpoint(String[ ] args)

If the command-line arguments indicate that ECJ should load an EvolutionState from a checkpoint ﬁle, this method

does so and returns it. Else it returns null.

public static ParameterDatabase loadParameterDatabase(String[ ] args)

Loads a ParameterDatabase from the ﬁle or resource indicated on the command line.

public static EvolutionState initialize(ParameterDatabase parameters, int randomSeedOffset)

Builds a new EvolutionState and initializes it, using the provided random seed offset. This method simply calls

buildOutput(), then calls initialize(parameters, randomSeedOffset, output).

public static Output buildOutput()

Builds a new Output and returns it.

public static EvolutionState initialize(ParameterDatabase parameters, int randomSeedOffset, Output output)

Builds a new EvolutionState and initializes it, using the provided Output and random seed offset. The random

seed is determined by ﬁrst drawing it from the command line, then adding the random seed offset multiplied by

the number of random number generators (e.g., threads). Thus the randomSeedOffset can be used to indicate a

job number. For example, if there are two random number generators and the base seed was 1234, and there are

three jobs, then the ﬁrst (zeroth) job will have generators 1234 and 1235, the second job will have generators 1236

and 1237, and the third job will have 1238 and 1239. If you’re not doing jobs, just pass in 0 for the offset.
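The arithmetic in this description can be checked with a short sketch. This is not ECJ's code: the class JobSeeds and its method are hypothetical, and the assumption that generator i within a job receives the job's first seed plus i is inferred from the worked example above (1234 and 1235, then 1236 and 1237, and so on).

```java
import java.util.Arrays;

/** Hypothetical illustration of the seed arithmetic described above; not ECJ code. */
final class JobSeeds {
    private JobSeeds() { }

    /**
     * Seeds for every generator of one job: the base seed, plus the offset (job number)
     * multiplied by the number of generators, plus the generator's index.
     */
    static int[] seedsForJob(int baseSeed, int numGenerators, int randomSeedOffset) {
        int[] seeds = new int[numGenerators];
        for (int i = 0; i < numGenerators; i++)
            seeds[i] = baseSeed + randomSeedOffset * numGenerators + i;
        return seeds;
    }

    public static void main(String[] args) {
        // Two generators, base seed 1234, three jobs (offsets 0, 1, 2)
        for (int job = 0; job < 3; job++)
            System.out.println("job " + job + ": " + Arrays.toString(seedsForJob(1234, 2, job)));
        // prints:
        // job 0: [1234, 1235]
        // job 1: [1236, 1237]
        // job 2: [1238, 1239]
    }
}
```

Because consecutive jobs are spaced by the number of generators, no two generators in any job share a seed.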

public static int determineThreads(Output output, ParameterDatabase parameters, Parameter threadParameter)

Looks up the given thread parameter and, using it, determines the number of threads to use.

public static int determineSeed(Output output, ParameterDatabase parameters, Parameter seedParameter,

long currentTime, int offset, boolean auto)

Computes a random number generator seed. The seed is computed by ﬁrst looking up seedParameter and using

this as the base seed. If the seed parameter is “time”, or auto is true, then the provided current time is used as a

base seed (you might wish to change the time each time you call this method). Otherwise, the base seed is the

number value stored in the seed parameter. Then the base seed plus the offset is returned as the ﬁnal seed.
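The decision logic described here can be sketched as follows. This is a stand-alone illustration, not the ECJ implementation: the class SeedSketch is hypothetical, the seed parameter's value is passed in as a plain String rather than looked up in a ParameterDatabase, and narrowing the time to an int by a simple cast is an assumption of the sketch.

```java
/** Hypothetical stand-alone sketch of the seed rule described above; not ECJ code. */
final class SeedSketch {
    private SeedSketch() { }

    /**
     * @param seedParameterValue the text stored in the seed parameter (e.g. "1234" or "time")
     * @param currentTime        a caller-supplied time, used when the base seed comes from the clock
     * @param offset             added to the base seed to give the final seed
     * @param auto               if true, the time is used regardless of the parameter's value
     */
    static int determineSeed(String seedParameterValue, long currentTime, int offset, boolean auto) {
        int base;
        if (auto || "time".equals(seedParameterValue))
            base = (int) currentTime;                            // assumption: plain narrowing cast
        else
            base = Integer.parseInt(seedParameterValue.trim());  // numeric seed from the parameter
        return base + offset;
    }

    public static void main(String[] args) {
        System.out.println(determineSeed("1234", 0L, 4, false));  // prints 1238
        System.out.println(determineSeed("time", 99L, 1, false)); // prints 100
    }
}
```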

public static MersenneTwisterFast primeGenerator(MersenneTwisterFast generator )

Mersenne Twister’s ﬁrst 624 or so random numbers are not as good as its later numbers because they were

constructed using a Knuth LCG (linear congruential generator) to initialize Mersenne Twister from a seed. They're acceptable to use but

just to be careful, this method “primes the pump” by calling nextInt() 1048 times, then returns the same generator.

public static void cleanup(EvolutionState state)

Flushes and closes output buffers and writes out used, accessed, unused, and unaccessed parameters as requested.

2.7 Integrating ECJ with other Applications or Libraries

When integrating ECJ with other applications or libraries, you have to decide who’s going to be in the

driver’s seat — that is, in control of main(...). There are two common situations:

• ECJ is in control. The most common scenario here is ECJ using an external library, such as a simulation toolkit, to assess the fitness of an Individual.

• The other application or library is in control. This might arise if you have some external application which wishes to use ECJ as a sub-procedure to do some optimization.

You can of course do both: have Application A control ECJ, which in turn controls Simulation Library B.

2.7.1 Control by ECJ

When ECJ is in control, usually the subordinate library is being used to assess the quality of an Individual,

perhaps as a simulation library. To do this, you’ll probably need a place to set up your library, prepare a

simulation to test a string of Individuals, reset the simulation for each Individual, and eventually destroy the

library. Note that if you use checkpointing, your library must be serializable.

Let’s do the simplest case: it may well be the case that all you need to do is create a simulation, run it,

and destroy it each time an individual is evaluated. In this case, you just do this inside the evaluate() method

in your Problem, along these lines:

public void evaluate(EvolutionState state, Individual ind, int subpopulation, int threadnum)
    {
    if (!ind.evaluated) // don't bother reevaluating
        {
        // Build the simulation
        Simulation mySimulation = new Simulation(state.random[threadnum], threadnum, ...);
        // use the simulation, then get rid of it
        // then set the fitness of the individual
        }
    }

Notice that in this example, to construct our simulation, we’re passing in the random number generator

for the thread, which would be the best practice. We’re also passing in the thread number, which is a unique

integer from 0 to state.evalthreads - 1, which might be useful for the simulation to distinguish itself from others

(for example, so it can write out some kind of statistics file as “foo.threadnumber.out” or whatever so none of

the simulations overwrite each other).

You need to be careful about random number generators. The best thing to do would be to use ECJ’s

generator. But here’s the order of options, from best to least good.

The best would be for simulator #thread to use state.random[thread] as its only random number generator.

If this is not possible, next best would be for each simulator to use a unique random number generator

object. This may not result in replicable results.

If this is not possible, next best would be for each simulator to use a shared random number generator

object which is synchronized for multithreading. This will deﬁnitely not result in replicable results, and

may also not be properly serializable.

If your simulator is using java.util.Random or (heaven forbid) Math.random(...), then it’s time for a

rewrite of your simulator.

Note also that if you use distributed evaluation, this code will be run on remote machines, so you’d want

to do this stuff on the slave, not on the master.

Finally, remember that if you run multithreaded, you’ll have multiple simulations running in parallel,

and you’ll want to make sure that these simulations don’t share any global (static) data in common that

would create race conditions.

This approach is plausible but it’s a bit inﬂexible and inefﬁcient. You are given no opportunity to globally

set up the simulation system, and furthermore you’re constructing and destroying an entire simulation each

time — perhaps it would be better to:

1. Load and set up the simulation library once
2. Use this to construct some N simulations, one per thread, once, or perhaps once per generation
3. Reuse these simulations multiple times by resetting them (rather than destroying and recreating them) when new individuals need to be assessed
4. Clean up before quitting

Here are some plausible locations for these tasks:

Set up the library

I would create a custom EvolutionState, and override setup() to create a blank array of

simulations, one per evaluation thread. This would also be the spot to set up the library as appropriate. Note

that setup() is called even on the master when you’re doing distributed evaluation, but in that case you’d not

want to set up since you’re not doing any simulations on the master. So the code below checks for that.

public class MyEvolutionState extends ec.simple.SimpleEvolutionState // or whatever...
    {
    public Simulation[] simulations;

    public void setup(EvolutionState state, Parameter base)
        {
        super.setup(state, base); // state is obviously the MyEvolutionState itself

        // there are two cases where we'd want to set this up:
        // 1. I am running ECJ in a single process, and am NOT running with
        //    distributed (master/slave) evaluation
        // 2. I am running with distributed evaluation, and I am a slave (not the master).
        // So we check for those cases here:

        // the code below verifies that I'm running on a slave, not the master
        ec.util.Parameter param = new ec.util.Parameter("eval.i-am-slave");
        boolean amASlave = state.parameters.getBoolean(param, null, false);

        // the code below verifies that I'm doing distributed evaluation:
        // a master problem exists only when distributed evaluation is in use
        boolean doingDistributedEvaluation = (Evaluator.masterProblem != null);

        // okay here we go
        if (!doingDistributedEvaluation || amASlave)
            {
            simulations = new Simulation[state.evalthreads];
            // do your library setup here
            // now create the simulation array
            }
        }
    }

You’d then set this parameter:

state = MyEvolutionState

You could of course override the setup(...) method of certain other classes, such as ec.Initializer (see

Section 3.3).

It’s possible that you might need to differentiate between setting up from a fresh run and setting up

due to recovering from checkpoint. In this case, you might instead override the methods startFresh() and

startFromCheckpoint(). See Section 3.

Prepare the simulation and evaluate an individual

Whenever evaluate(...) is called, you can now grab

your simulation, reset it as appropriate, and test your individual. Alternatively you could just delete your

existing simulation and create a new fresh one:

public void evaluate(EvolutionState state, Individual ind, int subpopulation, int thread)

{

if (simulations[thread] == null) // simulator doesn’t exist

simulations[thread] = new Simulation(state.random[thread], thread, ...);

else simulations[thread].reset(); // reset your simulator somehow

// run the simulation with your individual and assess the individual

// now set its fitness here

}

Alternatively of course you could always just make a new simulation if that’s not terribly inefﬁcient.

Now you’d set up the simulation with your ECJ individual, pulse it multiple times until you’re ﬁnished,

and then determine the Individual’s Fitness and return.

Alternatively if your Problem implements the coevolutionary ec.coevolve.GroupedProblemForm (see

Section 7.1.2), you might write this:

public void evaluate(EvolutionState state, Individual[] ind, boolean[] updateFitness,

boolean countVictoriesOnly, int[] subpops, int thread)

{

if (simulations[thread] == null) // simulator doesn’t exist

simulations[thread] = new Simulation(state.random[thread], thread, ...);

else simulations[thread].reset(); // reset your simulator somehow

// run the simulation with all individuals in ind[] and assess them

// now set their fitnesses here

}

At the end of a run, ECJ may call the describe() method on your Problem (see Section 3.4.1). I’d do things

similarly:

public void describe(EvolutionState state, Individual ind, int subpopulation, int thread, int log)

{

if (simulations[thread] == null) // simulator doesn’t exist

simulations[thread] = new Simulation(state.random[thread], thread, ...);

else simulations[thread].reset(); // reset your simulator somehow

// run the simulation with your individual and write out to the log interesting facts

}

At present there is no describe(...) method for GroupedProblemForm.

Optionally Delete the Simulation each Generation

It's possible that you might want to delete your

simulation or clean it up somehow after a generation has transpired. This is an unusual need: I wouldn’t do

this. But if you must, you could do this in your Problem class:

public void finishEvaluating(EvolutionState state, int thread)

{

super.finishEvaluating(state, thread);

// do your cleanup here of simulations[thread], then...

simulations[thread] = null;

}

See again Section 3.4.1.

Clean up the Library

Finally at the end of an ECJ run you can clean up the library if you need to: perhaps

to ﬂush out logs or close sockets. A reasonable place to do this is to create a custom ec.Finisher subclass (see

Section 3.3):

public class MyFinisher extends SimpleFinisher // or whatever
    {
    public void finishPopulation(EvolutionState state, int result)
        {
        super.finishPopulation(state, result);

        // the simulations array lives in our custom EvolutionState
        Simulation[] simulations = ((MyEvolutionState) state).simulations;

        // clean up the simulations (the array is never created on a master)
        if (simulations != null)
            for(int simID = 0; simID < state.evalthreads; simID++)
                {
                if (simulations[simID] != null)
                    {
                    // clean up simulations[simID] as you wish, then...
                    simulations[simID] = null;
                    }
                }

        // finally clean up the whole library here
        }
    }

2.7.2 Control by another Application or Library

To set up and run ECJ from an external application or library, you need to get a parameter database, initialize

ECJ on it, and start it running. Most likely you’ll be running ECJ multiple times, so it makes sense to

construct a single ParameterDatabase, then clone it repeatedly for each time ECJ does an optimization run.

You could create your initial ParameterDatabase by pointing it at a ﬁle:

File parameterFile = ...

ParameterDatabase dbase = new ParameterDatabase(parameterFile,

new String[] { "-file", parameterFile.getCanonicalPath() });

There are other options besides loading from ﬁles of course: see Section 2.1.8. Once you have created

your ParameterDatabase, you’ll likely want to make copies of it over and over again so you can customize

some of its parameters differently each time you run ECJ from the application or library. You could do this

in two ways. You could just make a ParameterDatabase which uses the original database as its parent:

ParameterDatabase child = new ParameterDatabase();

child.addParent(dbase);

I instead prefer to ﬁrst copy the original so as to keep a completely separate version in case there

are multithreading issues. Here we use ec.util.DataPipe to copy the ParameterDatabase, because it’s not

Cloneable (see footnote 8):

ParameterDatabase copy = (ParameterDatabase)(DataPipe.copy(dbase));

ParameterDatabase child = new ParameterDatabase();

child.addParent(copy);
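The copy-then-override idea can be tried out without ECJ on the classpath. The sketch below is only an analogy: java.util.Properties stands in for ParameterDatabase (its "defaults" play the role of a parent database), the class ParamCopySketch and its methods are hypothetical, and the serialization round trip is simply one common way to deep-copy a Serializable object, not a statement about how DataPipe.copy is implemented.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.Properties;

/** Analogy only: java.util.Properties stands in for ec.util.ParameterDatabase. */
final class ParamCopySketch {
    private ParamCopySketch() { }

    /** Deep-copies a Properties object by serializing it to memory and reading it back. */
    static Properties deepCopy(Properties original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Properties) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("could not copy parameters", e);
        }
    }

    /** Creates an empty child whose lookups fall back to the given parent. */
    static Properties childOf(Properties parent) {
        return new Properties(parent);
    }

    public static void main(String[] args) {
        Properties base = new Properties();
        base.setProperty("generations", "50");

        Properties child = childOf(deepCopy(base));
        child.setProperty("seed.0", "42");        // per-run override lives only in the child
        base.setProperty("generations", "99");    // later changes to the original are not seen

        System.out.println(child.getProperty("generations")); // prints 50
        System.out.println(base.getProperty("seed.0"));       // prints null
    }
}
```

Copying first means that later changes to the shared original, perhaps from another thread, cannot leak into a run that is already configured.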

Once you have your ParameterDatabase ready, it’s time to add some custom parameters. Perhaps each

time you’re setting up ECJ from your application you want it to run in a slightly different way. Notably you

may want to customize the random number seed. You can do it like this:

child.set(...);

// ... etc...

(and so on). Now you set up the Output class. At this point you may wish to quiet stdout and stderr if you

don’t want ECJ polluting them:

Output out = Evolve.buildOutput();

// this stuff is optional

out.getLog(0).silent = true; // stdout

out.getLog(1).silent = true; // stderr

You can shut up the Statistics log with this parameter:

stat.file = /dev/null

Now you initialize ECJ and start it running:

EvolutionState evaluatedState = Evolve.initialize(child, 0, out);

evaluatedState.run(EvolutionState.C_STARTED_FRESH);

Footnote 8: This isn't ECJ's fault. It's because ParameterDatabase is a subclass of java.util.Properties, which is Cloneable only in the most recent versions of Java.

This runs the whole thing from start to end, then returns. Alternatively to pulse ECJ every generation

(maybe so you can test stuff per-generation) you could say:

EvolutionState evaluatedState = Evolve.initialize(child, 0, out);
evaluatedState.startFresh();
int result = EvolutionState.R_NOTDONE;
while( result == EvolutionState.R_NOTDONE )
    result = evaluatedState.evolve();
evaluatedState.finish(result); // finish up once the loop ends, as run(...) does

At this point you might wish to check the Statistics to see what the results were. Let’s say it’s a

SimpleStatistics. You could then say:

// inds is an array, one per subpopulation

Individual[] inds = ((SimpleStatistics)(evaluatedState.statistics)).getBestSoFar();

// ... grab the Fitness from these Individuals, etc. ...

Finally, you clean up.

Evolve.cleanup(evaluatedState);

And you can forget about evaluatedState at this point.

Chapter 3

ec.EvolutionState and the ECJ

Evolutionary Process

As discussed in Section 2, the purpose of the ec.Evolve class is simply to set up an ec.EvolutionState and get it

going. ec.EvolutionState is the central object in all of ECJ.

An ECJ process has only one ec.EvolutionState instance. Practically everything in ECJ, except for ec.Evolve

itself, is pointed to somehow from ec.EvolutionState, so if you checkpoint ec.EvolutionState, the entire ECJ

process is written to disk. Various subclasses of ec.EvolutionState deﬁne the stochastic optimization process.

And a great many methods are handed the ec.EvolutionState instance, and so have essentially global access

to the system.

If you peek inside ec.EvolutionState, you will ﬁnd a number of objects, as shown in Figure 3.1:

• Some familiar objects, placed there by ec.Evolve after it created the ec.EvolutionState: the Parameter Database, Output, and array of Random Number Generators. Additionally, the number of breeding threads and evaluation threads (and various checkpoint and job stuff, not shown below):

public ec.util.ParameterDatabase parameters;

public ec.util.MersenneTwisterFast[] random;

public ec.util.Output output;

public int breedthreads;

public int evalthreads;

• Population, which holds the individuals in the evolutionary process; plus the current generation (iteration) of the evolutionary process and the total number of generations to run, or alternatively the total number of evaluations to run. How or whether these last three variables are used depends on the evolutionary process in question.

public ec.Population population;

public int generation;

public int numGenerations = UNDEFINED;

public int numEvaluations = UNDEFINED;

Some notes. The default setting for numEvaluations and numGenerations is EvolutionState.UNDEFINED

(0). One of these two variables will be set at parameter-loading time. The other will stay, initially, at

UNDEFINED. If numEvaluations was set, and generational evolution is being used, then numGenerations

will eventually be set to a real value after the initial population has been created but before it has been

evaluated. We’ll get to how these are set later, in the Generational and Steady-State sections (Sections

4.1 and 4.2).

[Figure 3.1: Top-Level operators and utility facilities in EvolutionState, and their relationship to certain state objects. A repeat of Figure 1.2. Compare to Figure 4.2.]

• Initializer, whose job is to create the initial ec.Population at the beginning of the run, and a Finisher, whose job is to clean up at the very end of the run.

public ec.Initializer initializer;

public ec.Finisher finisher;

• Evaluator, whose job is to assign quality assessments (fitnesses) to each member of the Population, and a Breeder, whose job is to produce a new Population from the previous generation's Population through some collection of selection and modification operators.

public ec.Evaluator evaluator;

public ec.Breeder breeder;

• Exchanger, which optionally exports Population members to other ECJ processes, or imports ones to add to the Population. And finally, a Statistics object, whose methods are called at many points in the run to output statistics on the current run performance.

public ec.Exchanger exchanger;

public ec.Statistics statistics;

3.1 Common Patterns

Most of ECJ’s classes follow certain patterns which you’ll see many times, so it’s useful to review them here.

3.1.1 Setup

Nearly all classes adhere to the ec.Setup interface. This interface is java.io.Serializable (which is why ECJ can

serialize all its objects) and deﬁnes a single method:

ec.Setup Methods

public void setup(EvolutionState state, Parameter base)

Constructs the Setup object from the Parameter Database using base as the primary parameter base. Nearly all ECJ

classes implement this method.

ECJ objects are born from ECJ’s Parameter Database, which constructs them with the default (no-

argument) constructor. Then they have setup(...) called on them, and are expected to construct themselves

by loading parameters as necessary from the Parameter Database (state.parameters), using the provided

Parameter base.

Thus the setup(...) method is, for all intents and purposes, the constructor for nearly all

ECJ objects.

When implementing setup(...) always call super.setup(...) if a superclass exists.
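The life cycle described above (build with the no-argument constructor, then configure through setup(...)) can be sketched without ECJ. Everything in this sketch is a stand-in: SetupSketch plays the role of ec.Setup, java.util.Properties plays the role of the Parameter Database, a plain String plays the role of the Parameter base, and TournamentSketch and its "size" parameter are hypothetical.

```java
import java.util.Properties;

/** Stand-in for ec.Setup; the real interface takes an EvolutionState and a Parameter. */
interface SetupSketch {
    void setup(Properties params, String base);
}

/** Hypothetical component that is configured entirely through setup(...). */
final class TournamentSketch implements SetupSketch {
    int size;

    public TournamentSketch() { }   // no-argument constructor, as the pattern requires

    @Override
    public void setup(Properties params, String base) {
        // Read "<base>.size", falling back to 2 if the parameter is absent.
        size = Integer.parseInt(params.getProperty(base + ".size", "2"));
    }
}

final class SetupDemo {
    private SetupDemo() { }

    /** Instantiates the class named by parameter <base>, then calls setup(...) on it. */
    static SetupSketch build(Properties params, String base) {
        try {
            Class<?> cls = Class.forName(params.getProperty(base));
            SetupSketch obj = (SetupSketch) cls.getDeclaredConstructor().newInstance();
            obj.setup(params, base);
            return obj;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("could not build " + base, e);
        }
    }

    public static void main(String[] args) {
        Properties params = new Properties();
        params.setProperty("select", TournamentSketch.class.getName());
        params.setProperty("select.size", "7");
        TournamentSketch t = (TournamentSketch) build(params, "select");
        System.out.println(t.size);   // prints 7
    }
}
```

Because construction carries no arguments, all configuration flows through the parameters, which is what lets a parameter file decide which classes are used.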

3.1.2 Singletons and Cliques

Singletons (ec.Singleton) are Setups which create a single instance per evolutionary run, and that’s it. For

example, ec.EvolutionState is a Singleton, as are ec.Initializer, ec.Finisher, ec.Evaluator, ec.Breeder, ec.Exchanger,

and ec.Statistics. Singletons are generally meant to be globally accessible.

Whereas Singletons are single objects, Cliques (ec.Clique) are objects for which only a small number (but

usually more than 1) are created. Cliques are also generally meant to be globally accessible. Most Cliques

have a globally accessible registry of some sort in which all Clique members can be found.

Because they are global, Cliques and Singletons usually are set up from a single parameter base (the

one provided by setup(...)).

3.1.3 Prototypes

Prototypes (ec.Prototype) are by far the most common objects in ECJ. Prototypes are Setups which follow the

following design pattern: only one instance is loaded from the Parameter Database and set up; this object

is the prototype. Then many objects are created by deep cloning the prototype. One example of a Prototype

is an Individual (ec.Individual): a single prototypical Individual is created when ECJ starts up; and further

Individuals are deep cloned from this prototype to ﬁll Populations.

Because they can be deep cloned, Prototypes implement the java.lang.Cloneable interface, so you must

implement the method:

ec.Prototype Methods

public Object clone()

Deep-clones the object. Implemented by all Prototypes. Must call super.clone(), possibly catching a thrown

CloneNotSupportedException.
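A minimal sketch of such a clone() method follows, assuming a hypothetical class GenomeSketch with a single mutable array field. It is not an ECJ class; it only shows the shape of a deep clone built on super.clone().

```java
/** Hypothetical prototype-style class; not part of ECJ. */
final class GenomeSketch implements Cloneable {
    int[] genome = new int[0];

    @Override
    public Object clone() {
        try {
            GenomeSketch copy = (GenomeSketch) super.clone(); // field-by-field copy first
            copy.genome = genome.clone();                     // then copy the mutable array
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new IllegalStateException(e);               // cannot happen: we are Cloneable
        }
    }

    public static void main(String[] args) {
        GenomeSketch prototype = new GenomeSketch();
        prototype.genome = new int[] { 1, 2, 3 };

        // Fill a "population" by cloning the prototype.
        GenomeSketch[] population = new GenomeSketch[4];
        for (int i = 0; i < population.length; i++)
            population[i] = (GenomeSketch) prototype.clone();

        population[0].genome[0] = 99;              // changing a clone...
        System.out.println(prototype.genome[0]);   // ...leaves the prototype alone: prints 1
    }
}
```

Without the array copy, every clone would share the prototype's array, and modifying one individual would silently modify them all.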

Unlike Singletons and Cliques, Prototypes also usually have two parameter bases: the primary base

provided by setup(...), and a default base. As a result, Prototypes must implement a method which can be

called to provide this default base:

ec.Prototype Methods

public ec.util.Parameter defaultBase()

Returns the default base for the Prototype.

The standard way to implement this method is to consult a special defaults class in the Prototype's Java package. For example, in the ec.simple package the defaults class is ec.simple.SimpleDefaults. Here's the entirety of this class:

public final class SimpleDefaults implements ec.DefaultsForm

{

public static final String P_SIMPLE = "simple";

public static final Parameter base() { return new Parameter(P_SIMPLE); }

}

The Parameter returned by base() here provides the package default base for the ec.simple package. Now

consider ec.simple.SimpleFitness, a Prototype in this package. This class implements the defaultBase() method

like this:

public static final String P_FITNESS = "fitness";

public Parameter defaultBase()

{

return SimpleDefaults.base().push(P_FITNESS);

}

Thus, as a result the default parameter base for ec.simple.SimpleFitness is simple.fitness.
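To show how two bases might be used together, here is a stand-alone sketch in which a parameter is looked up under the primary base first and under the default base only if it is missing there. The fallback order is my reading of how the two bases cooperate; the class DefaultBaseSketch, the use of java.util.Properties in place of the Parameter Database, and the parameter name "label" are all hypothetical.

```java
import java.util.Properties;

/** Hypothetical illustration of primary-base / default-base lookup; not ECJ code. */
final class DefaultBaseSketch {
    private DefaultBaseSketch() { }

    /** Looks up <primaryBase>.<name>, falling back to <defaultBase>.<name>. */
    static String lookup(Properties params, String primaryBase, String defaultBase, String name) {
        String value = params.getProperty(primaryBase + "." + name);
        if (value == null)
            value = params.getProperty(defaultBase + "." + name);
        return value;
    }

    public static void main(String[] args) {
        Properties params = new Properties();
        params.setProperty("simple.fitness.label", "shared");                  // default base
        params.setProperty("pop.subpop.1.species.fitness.label", "special");   // primary base

        // Subpopulation 0 has no specific setting, so the default applies.
        System.out.println(lookup(params, "pop.subpop.0.species.fitness",
                                  "simple.fitness", "label"));   // prints shared
        // Subpopulation 1 overrides the default.
        System.out.println(lookup(params, "pop.subpop.1.species.fitness",
                                  "simple.fitness", "label"));   // prints special
    }
}
```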

3.1.4 The Flyweight Pattern

Many Prototypes follow what is commonly known as the flyweight pattern. Prototypes are often great in

number and Java is a memory hog: so it’s helpful for groups of Prototypes to place shared information

common to them in a single central location rather than keep copies of their own. For various reasons

(particularly because it’s hard to do serialization) ECJ doesn’t use static variables to store this common

information. Instead groups of Prototypes often all point to a special object which contains information

common to all of them. For example, instances of ec.Individual, in groups, typically share a common ec.Species

which contains information common to them. At any particular time there may be several such groups of

Individuals, each with a different Species.
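The arrangement can be sketched in a few lines. The classes below are hypothetical stand-ins (they are not ec.Species or ec.Individual), and the two shared fields are invented for illustration; the point is only that each individual holds a reference to one shared object rather than its own copy of the common data.

```java
/** Hypothetical shared object: data common to a whole group of individuals. */
final class SpeciesSketch {
    final int genomeSize;
    final double mutationProbability;

    SpeciesSketch(int genomeSize, double mutationProbability) {
        this.genomeSize = genomeSize;
        this.mutationProbability = mutationProbability;
    }
}

/** Hypothetical individual: keeps only its own genome plus a pointer to the shared data. */
final class IndividualSketch {
    final SpeciesSketch species;   // shared, not copied
    final int[] genome;            // per-individual state

    IndividualSketch(SpeciesSketch species) {
        this.species = species;
        this.genome = new int[species.genomeSize];
    }
}

final class FlyweightDemo {
    private FlyweightDemo() { }

    /** Creates count individuals which all point at the same species object. */
    static IndividualSketch[] fill(SpeciesSketch species, int count) {
        IndividualSketch[] inds = new IndividualSketch[count];
        for (int i = 0; i < count; i++)
            inds[i] = new IndividualSketch(species);
        return inds;
    }

    public static void main(String[] args) {
        SpeciesSketch species = new SpeciesSketch(10, 0.05);
        IndividualSketch[] inds = fill(species, 1000);
        // One species object serves all 1000 individuals.
        System.out.println(inds[0].species == inds[999].species);   // prints true
    }
}
```

Since the shared object is reached through an ordinary instance field rather than a static variable, several groups with different shared objects can coexist.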

3.1.5 Groups

Groups are similar to Prototypes in that a single object is loaded from the Parameter Database and further

objects are created by a cloning procedure. Groups are likewise java.lang.Cloneable. However, Groups are

different in that there is no prototype per se: the object loaded from the Parameter Database isn’t held in

reserve but is actively used. It must not just clone another object, but actually create a new, fresh, clean object

ready to be used. This is done by implementing the method:

ec.Group Methods

public ec.Group emptyClone()

Returns a pristine, new clone of the Group which has been emptied of members.

This method is normally implemented by cloning the object, cleaning out the clone, and returning it in the same pristine state that it would be if it had been created directly from the Parameter Database. At present there are only a few ECJ objects which implement Group: namely, ec.Population, ec.Subpopulation, and certain specialized subclasses of ec.Subpopulation.

[Figure 3.2: Top-Level data objects used in evolution. A repeat of Figure 1.3.]
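The emptyClone() recipe described above (clone, then clean out the clone) might look like this in miniature. SubpopSketch is a hypothetical stand-in, not ec.Subpopulation, and its two fields are invented for the example: one configuration value that should survive, and one member list that should not.

```java
import java.util.ArrayList;

/** Hypothetical group-style container; not part of ECJ. */
final class SubpopSketch implements Cloneable {
    int initialSize;                                  // configuration: kept by emptyClone()
    ArrayList<int[]> individuals = new ArrayList<>(); // members: dropped by emptyClone()

    /** Returns a fresh container with the same configuration but no members. */
    SubpopSketch emptyClone() {
        try {
            SubpopSketch copy = (SubpopSketch) super.clone();
            copy.individuals = new ArrayList<>();     // do not share or copy the members
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new IllegalStateException(e);       // cannot happen: we are Cloneable
        }
    }

    public static void main(String[] args) {
        SubpopSketch current = new SubpopSketch();
        current.initialSize = 100;
        current.individuals.add(new int[] { 1, 2, 3 });

        SubpopSketch next = current.emptyClone();
        System.out.println(next.initialSize);          // prints 100
        System.out.println(next.individuals.size());   // prints 0
        System.out.println(current.individuals.size()); // prints 1
    }
}
```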

3.2 Populations, Subpopulations, Species, Individuals, and Fitnesses

Populations, Subpopulations, and Individuals are the “nouns” of an evolutionary system, and Fitnesses

are the “adjectives”. They’re pretty central to the operation of any evolutionary or sample-based stochastic

search algorithm.

In ECJ, an individual is a candidate solution to a problem. Some individuals are grouped together into a sample of solutions known as a subpopulation. Some subpopulations are grouped together into the system's population. There's only one population per evolutionary process. The most common scenario is for ECJ to have its individuals grouped into a single subpopulation, which then is the sole member of ECJ's population. However, coevolutionary algorithms (Section 7.1) typically have more than one subpopulation, as does a special and little-used internal island model scheme (see Section 6.2 and footnote 1).

Usually ECJ’s population is an instance of the class ec.Population and its subpopulations are instances of

the class ec.Subpopulation. Both of these are Groups. Let’s say that there’s a single subpopulation, which

must contain 100 individuals. We can express this as follows:

pop = ec.Population

pop.subpops = 1

pop.subpop.0 = ec.Subpopulation

pop.subpop.0.size = 100

Footnote 1: Because these two techniques use the subpopulations in different ways, they cannot be used together (a rare situation in ECJ).

Obviously further subpopulations would be pop.subpop.1, pop.subpop.2, etc. The population is found in an instance variable in the EvolutionState:

public Population population;

The Population is little more than an array of Subpopulations. To get Subpopulation 0, with the EvolutionState being state, you'd say:

Subpopulation theSubpop = state.population.subpops.get(0);

Subpopulations themselves contain arrays of individuals. To get Individual 15 of Subpopulation 0, you’d

say:

Individual theIndividual = state.population.subpops.get(0).individuals.get(15);

In addition to an array of individuals, each subpopulation contains a species which defines the individuals used to fill the subpopulation, as well as their fitness and the means by which they are modified. Subpopulations also contain some basic parameters for creating initial individuals, though the procedure is largely handled by Species (see footnote 2). We'll get to creation and modification later.

Species have an odd relationship to Individuals and to Subpopulations. First recall the Flyweight pattern

in Section 3.1.4. Individuals are related to a common Species using the Flyweight pattern: they use Species

to store a lot of common information (how to modify themselves, for example). Ordinarily you’d think that

the Subpopulation would be a good place for this storage. However different Subpopulations can share the

same Species. This allows you to, for example, have one Species guide an entire evolutionary run that might have twenty Subpopulations in it.3 The species of Subpopulation 0 may be found here:

Species theSpecies = state.population.subpops.get(0).species;

A Species contains three major elements: first, the prototypical Individual for Subpopulations which use that Species. Recall that Individuals are Prototypes and new ones are formed by cloning from a prototypical individual held in reserve. This “queen bee” individual, so to speak, is found here:

Individual theProto = state.population.subpops.get(0).species.i_prototype;

A Species also contains a prototypical Fitness object. In ECJ fitnesses are separate from individuals.

Individuals deﬁne the candidate solution, and Fitnesses deﬁne how well it has performed. Like Individuals,

Fitnesses are also Prototypes. The prototypical Fitness for Subpopulation 0 may be found here:

Fitness theProtoFitness = state.population.subpops.get(0).species.f_prototype;

The Species class you pick is usually determined by the kind of Individual you pick, that is, by the kind of

representation of your solution. You deﬁne the class of the Species for Subpopulation 0, and its prototypical

Fitness and prototypical Individual, as follows. For example, let’s make Individuals which are arrays of

integers, and a simple Fitness common to many evolutionary algorithms:

pop.subpop.0.species = ec.vector.IntegerVectorSpecies

pop.subpop.0.species.ind = ec.vector.IntegerVectorIndividual

pop.subpop.0.species.fitness = ec.simple.SimpleFitness

By way of explanation, IntegerVectorIndividual, along with various other “integer” vector individuals like LongVectorIndividual, ShortVectorIndividual, and ByteVectorIndividual, requires an IntegerVectorSpecies. And ec.simple.SimpleFitness is widely used with algorithms such as Genetic Algorithms or Evolution Strategies. The prototypical Individual is never assigned a Fitness (it’s null). But once assembled in a Subpopulation, each Individual has its very own Fitness. To get the Fitness of individual 15 in Subpopulation 0, you’d say:

Fitness theFitness = state.population.subpops.get(0).individuals.get(15).fitness;

2You might be asking: if Species are responsible for making individuals, why are Subpopulations involved at all? A very good question indeed.

3Granted, this isn’t very common.

Last, a Species contains a prototypical Breeding Pipeline to modify individuals. We’ll get to that in Section 3.5.

Since they’re Prototypes, Individuals, Fitnesses, and Species all have default bases. We’ll talk about the

different kinds of Individuals, Fitnesses, and Species later, plus various default bases for them.

3.2.1 Making Large Numbers of Subpopulations

Let’s say you’re doing an evolutionary experiment (perhaps coevolution, see Section 7.1) which involves 100

Subpopulations. It’s going to get very tiresome to repeat...

pop = ec.Population

pop.subpops = 100

pop.subpop.0 = ec.Subpopulation

pop.subpop.0.size = 100

pop.subpop.0.species = ec.vector.IntegerVectorSpecies

pop.subpop.0.species.ind = ec.vector.IntegerVectorIndividual

pop.subpop.0.species.fitness = ec.simple.SimpleFitness

....

pop.subpop.1 = ec.Subpopulation

pop.subpop.1.size = 100

pop.subpop.1.species = ec.vector.IntegerVectorSpecies

pop.subpop.1.species.ind = ec.vector.IntegerVectorIndividual

pop.subpop.1.species.fitness = ec.simple.SimpleFitness

....

pop.subpop.2 = ec.Subpopulation

pop.subpop.2.size = 100

pop.subpop.2.species = ec.vector.IntegerVectorSpecies

pop.subpop.2.species.ind = ec.vector.IntegerVectorIndividual

pop.subpop.2.species.fitness = ec.simple.SimpleFitness

....

... and so on some 100 times. Even with the help of ECJ’s default parameters, you’ll still be typing an awful

lot. Population has a simple mechanism to make this easier on you: the parameter...

pop.default-subpop = 0

This says that if you do not specify a Subpopulation in parameters, ECJ will assume its parameters are identical to those of Subpopulation 0. Thus you could simply say:

pop = ec.Population

pop.subpops = 100

pop.default-subpop = 0

pop.subpop.0 = ec.Subpopulation

pop.subpop.0.size = 100

pop.subpop.0.species = ec.vector.IntegerVectorSpecies

pop.subpop.0.species.ind = ec.vector.IntegerVectorIndividual

pop.subpop.0.species.fitness = ec.simple.SimpleFitness

...

... and be done with it. Note that you can always specify a Subpopulation specially. For example, suppose all

of your Subpopulations were exactly like Subpopulation 0 except for Subpopulation 19. You can say:

pop = ec.Population

pop.subpops = 100

pop.default-subpop = 0

pop.subpop.0 = ec.Subpopulation

pop.subpop.0.size = 100

pop.subpop.0.species = ec.vector.IntegerVectorSpecies

pop.subpop.0.species.ind = ec.vector.IntegerVectorIndividual

pop.subpop.0.species.fitness = ec.simple.SimpleFitness

...

pop.subpop.19 = ec.Subpopulation

pop.subpop.19.size = 25

pop.subpop.19.species = ec.vector.FloatVectorSpecies

pop.subpop.19.species.ind = ec.vector.DoubleVectorIndividual

pop.subpop.19.species.fitness = ec.simple.SimpleFitness

...

Note that even though Subpopulation 19 shared the same ﬁtness type as the others, we still had to specify

it. It’s an all-or-nothing proposition: either you say nothing about that particular Subpopulation, or you say

everything.
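If it helps to picture the fallback rule, here is a minimal sketch. It is not ECJ's parameter-database code: the class SubpopParams and its methods are hypothetical, java.util.Properties merely stands in for ECJ's parameter database, and the sketch assumes that a Subpopulation counts as "specified" when its class parameter (pop.subpop.N) is present.

```java
import java.util.Properties;

/** Hypothetical illustration of the pop.default-subpop fallback; not ECJ code. */
final class SubpopParams {
    /**
     * Looks up "pop.subpop.<index>.<suffix>". If subpopulation <index> is not
     * declared at all, the lookup is redirected to the default subpopulation.
     */
    static String lookup(Properties params, int index, String suffix) {
        String base = "pop.subpop." + index;
        if (!params.containsKey(base)) {
            // Assumption: "specified" means the class parameter itself exists.
            String def = params.getProperty("pop.default-subpop");
            if (def == null) return null;          // nothing to fall back on
            base = "pop.subpop." + def.trim();
        }
        // All-or-nothing: once a base is chosen, no other base is consulted.
        return params.getProperty(base + "." + suffix);
    }

    /** A cut-down version of the parameters shown above. */
    static Properties example() {
        Properties p = new Properties();
        p.setProperty("pop.default-subpop", "0");
        p.setProperty("pop.subpop.0", "ec.Subpopulation");
        p.setProperty("pop.subpop.0.size", "100");
        p.setProperty("pop.subpop.0.species.fitness", "ec.simple.SimpleFitness");
        p.setProperty("pop.subpop.19", "ec.Subpopulation");
        p.setProperty("pop.subpop.19.size", "25");
        return p;
    }

    public static void main(String[] args) {
        Properties p = example();
        System.out.println(lookup(p, 7, "size"));             // borrowed from Subpopulation 0
        System.out.println(lookup(p, 19, "size"));            // Subpopulation 19's own value
        System.out.println(lookup(p, 19, "species.fitness")); // not inherited
    }
}
```

In this sketch Subpopulation 19 declares itself but omits its fitness, so the last lookup finds nothing rather than borrowing from Subpopulation 0. That is the all-or-nothing behavior described above.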

3.2.2 How Species Make Individuals

Species have two ways to create new individuals: from scratch, or reading from a stream. To generate an

individual from scratch, you can call (in ec.Species):

ec.Species Methods

public Individual newIndividual(EvolutionState state, int thread)

Returns a brand new, randomized Individual.

The default implementation of this method simply clones an Individual from the prototype and returns it.

Subclasses of Species override this to randomize the Individual in a fashion appropriate to its representation.

Another way to create an individual is to read it from a binary or text stream. ec.Species provides two

methods for this:

ec.Species Methods

public Individual newIndividual(EvolutionState state, LineNumberReader reader) throws IOException

Produces a new individual read from the stream.

public Individual newIndividual(EvolutionState state, DataInput input) throws IOException

Produces a new individual read from the given DataInput.

These methods create Individuals by cloning the prototype, then calling the equivalent readIndividual(...)

method in ec.Individual. See Section 3.2.4 for more information on those methods.
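The clone-then-fill idea behind newIndividual(...) can be sketched in a few lines. The classes below (PrototypeSketch, Ind, Species) are hypothetical stand-ins written for this illustration, not ECJ's ec.Individual or ec.Species, and the gene range of 0 to 9 is an arbitrary assumption.

```java
import java.util.Random;

/** Hypothetical stand-ins for illustration; these are not ECJ's classes. */
final class PrototypeSketch {
    /** A tiny individual: an int genome that is deep-cloned. */
    static final class Ind implements Cloneable {
        int[] genome;
        Ind(int length) { genome = new int[length]; }

        @Override public Ind clone() {
            try {
                Ind copy = (Ind) super.clone();
                copy.genome = genome.clone();    // deep-copy the representation
                return copy;
            } catch (CloneNotSupportedException e) {
                throw new AssertionError(e);     // cannot happen: Ind is Cloneable
            }
        }
    }

    /** A tiny species holding the "queen bee" prototype. */
    static final class Species {
        final Ind prototype;
        Species(Ind prototype) { this.prototype = prototype; }

        /** Clone the prototype, then randomize the clone's genes in [0, 10). */
        Ind newIndividual(Random rng) {
            Ind ind = prototype.clone();
            for (int i = 0; i < ind.genome.length; i++) ind.genome[i] = rng.nextInt(10);
            return ind;
        }
    }

    public static void main(String[] args) {
        Species species = new Species(new Ind(5));
        Ind a = species.newIndividual(new Random(1));
        System.out.println(a.genome.length);                      // 5
        System.out.println(a.genome != species.prototype.genome); // true: no shared array
    }
}
```

The point of the deep clone is that randomizing the new individual never touches the prototype held in reserve.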

3.2.3 Reading and Writing Populations and Subpopulations

Populations and Subpopulations have certain predeﬁned methods for reading and writing, which you

should know how to use. If you subclass Population or Subpopulation (relatively rare) you may need to

reimplement these methods. Population’s methods are:

public void printPopulationForHumans(EvolutionState state, int log);

public void printPopulation(EvolutionState state, int log);

public void printPopulation(EvolutionState state, PrintWriter writer);

public void readPopulation(EvolutionState state, LineNumberReader reader)

throws IOException;

public void writePopulation(EvolutionState state, DataOutput output)

throws IOException;

public void readPopulation(EvolutionState state, DataInput input)

throws IOException;

Subpopulation’s methods are nearly identical:

In Subpopulation:

public void printSubpopulationForHumans(EvolutionState state, int log);

public void printSubpopulation(EvolutionState state, int log);

public void printSubpopulation(EvolutionState state, PrintWriter writer);

public void readSubpopulation(EvolutionState state, LineNumberReader reader)

throws IOException;

public void writeSubpopulation(EvolutionState state, DataOutput output)

throws IOException;

public void readSubpopulation(EvolutionState state, DataInput input)

throws IOException;

These methods employ similar methods in ec.Individual to print out, or read, Individuals. Those methods

are discussed next in Section 3.2.4.

The first Population method, printPopulationForHumans(...), prints an entire population to a log in a form pleasing to the human eye. It begins by printing out the number of subpopulations, then prints each Subpopulation index and calls printSubpopulationForHumans(...) on each Subpopulation in turn. printSubpopulationForHumans(...) then prints out the number of individuals, then for each Individual it prints the Individual index, then calls printIndividualForHumans(...) to print the Individual. Overall, it looks along these lines:

Number of Subpopulations: 1

Subpopulation Number: 0

Number of Individuals: 1000

Individual Number: 0

Evaluated: T

Fitness: 0.234

-4.97551104730313 -1.7220830524609632 1.7908415218297096

2.3277606156190496 3.5616099573877404 -3.8002895023118617

Individual Number: 1

Evaluated: T

Fitness: 4.91235

3.1033182498148575 -3.613847679151146 -0.562978505270439

-2.860926011046968 1.9007479097991151 -3.051348823625001

...

The next two Population methods, both named printPopulation(...), print an entire population to a log or a PrintWriter in a form that can be (barely) read by humans but can also be read back in perfectly by ECJ, resulting in identical Populations. These operate similarly to printPopulationForHumans(...), except that various data types are emitted using ec.util.Code (Section 2.2.3).

Number of Subpopulations: i1|

Subpopulation Number: i0|

Number of Individuals: i1000|

Individual Number: i0|

Evaluated: F

Fitness: f0|0.0|

i6|d4600627607395240880|0.3861348728170766|d4616510324226321041|4.284844300646584|

d4614576621171274054|3.2836854885228233|d4616394543356495435|4.182010230653371|

Individual Number: i1|

Evaluated: F

Fitness: f0|0.0|

i6|d4603775819114015296|0.6217914592919627|d4612464338011645914|2.345643329183969|

d-4606767824441912859|-4.368233761797886|d4616007477858046134|3.919113503960115|

...

The Population method readPopulation(..., LineNumberReader) can read in this mess to produce a Population. It in turn does its magic by calling the equivalent method in Subpopulation.

The last two methods, writePopulation(...) and readPopulation(..., DataInput), write Populations (or Subpopulations) to, and read them from, binary streams.

3.2.4 About Individuals

Individuals have four basic parts:

• The Individual’s ﬁtness.

public Fitness fitness;

• The Individual’s species.

public Species species;

• Whether the individual has been evaluated and had its Fitness set to a legal value yet.4

public boolean evaluated;

• The representation of the Individual. This could be anything from an array to a tree structure: representations of course vary and are defined by subclasses. We’ll talk about them later.

3.2.4.1 Implementing an Individual

For many purposes you can just use one of the standard “off-the-rack” individuals — vector individuals,

genetic programming tree individuals, ruleset individuals — but if you need to implement one yourself,

here are some methods you need to be aware of. First off, Individuals are Prototypes and must override

the clone() method to deep-clone themselves, including deep-cloning their representation and their Fitness,

but not their Species (which is just pointer-copied). Individuals must also implement the setup(...) and defaultBase() methods. Additionally, Individuals have a number of methods which either should or must be

overridden. Let’s start with the “must override” ones:

public abstract int hashCode();

public abstract boolean equals(Object individual);

4Why isn’t this in the Fitness object? Another excellent question.

These two standard Java methods enable hashing by value, which allows Subpopulations to remove duplicate Individuals. hashCode() must return a hashcode for an individual based on the value of its representation. equals(...) must return true if the Individual is identical to the other object (which in ECJ will always be another Individual).
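As a rough sketch of what "hashing by value" means, here is a standalone representation class with both methods written against its contents. BitStringGenome is a hypothetical name invented for this example; a real implementation would extend ec.Individual rather than stand alone.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical custom representation; a real one would extend ec.Individual. */
final class BitStringGenome {
    final boolean[] bits;
    BitStringGenome(boolean... bits) { this.bits = bits.clone(); }

    /** Hash on the value of the representation, not on object identity. */
    @Override public int hashCode() { return Arrays.hashCode(bits); }

    /** Two genomes are equal when they hold the same bits in the same order. */
    @Override public boolean equals(Object other) {
        return other instanceof BitStringGenome
            && Arrays.equals(bits, ((BitStringGenome) other).bits);
    }

    /** Counts distinct genomes, the way a duplicate check would see them. */
    static int countDistinct(BitStringGenome... genomes) {
        Set<BitStringGenome> seen = new HashSet<>(Arrays.asList(genomes));
        return seen.size();
    }

    public static void main(String[] args) {
        BitStringGenome a = new BitStringGenome(true, false, true);
        BitStringGenome b = new BitStringGenome(true, false, true);
        BitStringGenome c = new BitStringGenome(false, false, true);
        System.out.println(a.equals(b));            // true
        System.out.println(countDistinct(a, b, c)); // 2
    }
}
```

Because a and b hash and compare by content, a hash-based duplicate check treats them as one individual even though they are separate objects.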

The next two methods are optional and may not be appropriate depending on your representation:

public long size();

public double distanceTo(Individual other);

size() returns an estimate of the size of the individual. The only hard-and-fast rule is that 0 is the smallest

possible size (and the default returned by the method). Size information is largely used by the ec.parsimony

package (Section 5.2.12) to apply one of several parsimony pressure techniques.

distanceTo(...) returns an estimate of the distance, in some metric space, of the Individual to some other

Individual of the same type. In the future this method may be used for various crowding or niching methods.

At present no package uses it, though all vector individuals implement it. The default implementation returns 0 if the other Individual is identical, else Double.POSITIVE_INFINITY.

Last come a host of functions whose purpose is to read and write individuals. You’ve seen this pattern

before in Section 3.2.3. Some of these are important to implement; others can wait if you’re in a hurry to get

your custom Individual up and running.

public void printIndividualForHumans(EvolutionState state, int log);

public void printIndividual(EvolutionState state, int log);

public void printIndividual(EvolutionState state, PrintWriter writer);

public void readIndividual(EvolutionState state, LineNumberReader reader)

throws IOException;

public void writeIndividual(EvolutionState state, DataOutput output)

throws IOException;

public void readIndividual(EvolutionState state, DataInput input)

throws IOException;

These six methods only need to be overridden in certain situations, and in each case there’s another

method which is typically overridden instead. Here’s what they do:

• printIndividualForHumans(...) prints an individual, whether it’s been evaluated, and its fitness, out to a log in a form meant for people to read. Rather than override this method, you probably should instead override this method:

public String genotypeToStringForHumans();

... which should return the representation of the individual in a human-pleasing fashion. Or, since

genotypeToStringForHumans() by default just calls toString(), you can just override:

public String toString();

Overriding one or both of these methods is pretty important: otherwise Statistics objects will largely be

printing your individuals as gibberish. Here’s a typical output of these methods:

Evaluated: T

Fitness: 0.234

-4.97551104730313 -1.7220830524609632 1.7908415218297096

2.3277606156190496 3.5616099573877404 -3.8002895023118617

• Both printIndividual(...) methods print an individual, and its fitness, out in a way that can be perfectly read back in again with readIndividual(...), but which can also be parsed by humans with some effort. Rather than override these methods, you probably should instead override this method:

public String genotypeToString();

This method is important to implement only if you intend to write individuals out to ﬁles in such a way

that you can load them back in later. If you don’t implement it, toString() will be used, which probably

won’t be as helpful. This returns a String which can be parsed in again in the next method. Note

that you need to write an individual out so that it can perfectly be read back in again as an identical

individual. How do you do this? ECJ’s classes by default all use the aging and idiosyncratic ec.util.Code package, developed long ago for this purpose but still working well. See Section 2.2.3.

Here’s a typical output of these methods (note the use of ec.util.Code):

Evaluated: F

Fitness: f0|0.0|

i6|d4600627607395240880|0.3861348728170766|d4616510324226321041|4.284844300646584|

d4614576621171274054|3.2836854885228233|d4616394543356495435|4.182010230653371|

• readIndividual(..., LineNumberReader) reads an individual, and its fitness, in from a LineNumberReader. The stream of text being read is assumed to have been generated by printIndividual(...). Rather than override this method, you probably should instead override this method:

protected void parseGenotype(EvolutionState state, LineNumberReader reader)

throws IOException;

This modiﬁes the existing Individual’s genotype to match the genotype read in from the reader. The

genotype will have been written out using printIndividual(...). You only need to override this method if

you plan on reading individuals in from ﬁles (by default the method just throws an error).

• The last two methods (writeIndividual(...) and readIndividual(..., DataInput)) write and read an individual, including its representation, fitness and evaluated flag, in a purely binary fashion to and from a stream. Don’t write the Species. It’s probably best instead to override the following methods to just write and read the genotype:

public void writeGenotype(EvolutionState state, DataOutput output)

throws IOException;

public void readGenotype(EvolutionState state, DataInput input)

throws IOException;

These methods are probably only important to implement if you plan on using ECJ’s distributed

facilities (distributed evaluator, island models). The default implementations of these methods simply

throw exceptions.

3.2.5 About Fitnesses

Fitnesses are separate from Individuals, and various Fitnesses can be used depending on the demands of

the evolutionary algorithm. The most common Fitness is ec.simple.SimpleFitness, which represents ﬁtness

as a single number from negative inﬁnity to positive inﬁnity, where larger values are “ﬁtter”. Certain

selection methods (notably fitness proportionate selection) require that the fitness be non-negative, and ideally between 0 and 1 inclusive.

There are other Fitness objects. For example, there are various multiobjective fitnesses (see Section 7.5), in which the fitness value is not one number but several, and either higher or lower may be better depending on the algorithm. Other Fitnesses, like the one used in genetic programming (Section 5.2), maintain a primary Fitness statistic and certain auxiliary ones.

You probably won’t need to implement a Fitness object. But you may need to use some of the methods below. Fitnesses are Prototypes and so must implement the clone() (as a deep-clone), setup(...), and defaultBase() methods. Fitness has four additional required methods:

public abstract double fitness();

public abstract boolean isIdealFitness();

public abstract boolean equivalentTo(Fitness other);

public abstract boolean betterThan(Fitness other);

The ﬁrst method, ﬁtness(), should return the ﬁtness cast into a value from negative inﬁnity to positive

inﬁnity, where higher values are better. This is used largely for ﬁtness-proportionate and similar selection

methods. If there is no appropriate mechanism for this, you’ll need to fake it. For example, multiobjective

ﬁtnesses might return the maximum or sum over their various objectives.

The second method, isIdealFitness(), returns true if the ﬁtness in question is the best possible. This is

largely used to determine if it’s okay to quit. It’s ﬁne for this method to always return false if you so desire.

The third and fourth methods compare against another fitness object of the same type. equivalentTo(...) returns true if the two Fitnesses are in the same equivalence class: that is, neither is fitter than the other. For simple fitnesses, this is just equality. For multiobjective fitnesses this is Pareto-nondomination of one another. betterThan(...) returns true if the Fitness is superior to the one provided in the method. For simple fitnesses, this just means fitter. For multiobjective fitnesses this implies Pareto domination.
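To make the two comparison rules concrete, here is a small sketch over plain numbers. The class FitnessComparisonSketch and its static helpers are hypothetical and are not how ECJ's Fitness subclasses are written. The multiobjective case also assumes that every objective is to be maximized, which ECJ does not require.

```java
/** Hypothetical helpers illustrating the two comparison methods; not ECJ code. */
final class FitnessComparisonSketch {
    /** Single-valued fitness: better means strictly larger. */
    static boolean betterThan(double mine, double other) { return mine > other; }

    /** Single-valued fitness: equivalent means equal. */
    static boolean equivalentTo(double mine, double other) { return mine == other; }

    /** Pareto domination, assuming every objective is to be maximized. */
    static boolean dominates(double[] mine, double[] other) {
        boolean strictlyBetterSomewhere = false;
        for (int i = 0; i < mine.length; i++) {
            if (mine[i] < other[i]) return false;   // worse in at least one objective
            if (mine[i] > other[i]) strictlyBetterSomewhere = true;
        }
        return strictlyBetterSomewhere;
    }

    /** Multiobjective equivalence: neither fitness dominates the other. */
    static boolean nondominated(double[] mine, double[] other) {
        return !dominates(mine, other) && !dominates(other, mine);
    }

    public static void main(String[] args) {
        System.out.println(betterThan(0.7, 0.2));                                  // true
        System.out.println(dominates(new double[] {2, 3}, new double[] {1, 3}));    // true
        System.out.println(nondominated(new double[] {2, 1}, new double[] {1, 3})); // true
    }
}
```

Note that for multiobjective fitnesses "equivalent" is weaker than "equal": (2, 1) and (1, 3) are in the same equivalence class here even though they differ in both objectives.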

Fitnesses also have similar printing facilities to Individuals:5

public void printFitnessForHumans(EvolutionState state, int log);

public void printFitness(EvolutionState state, int log);

public void printFitness(EvolutionState state, PrintWriter writer);

public void readFitness(EvolutionState state, LineNumberReader reader)

throws IOException;

public void writeFitness(EvolutionState state, DataOutput output)

throws IOException;

public void readFitness(EvolutionState state, DataInput input)

throws IOException;

As usual: the ﬁrst method, printFitnessForHumans(...), prints a Fitness in a way pleasing for humans to

read. It simply prints out the result of the following method (which you should override instead if you ever

need to):

public String fitnessToStringForHumans();

The default implementation of fitnessToStringForHumans() simply calls:

public String toString();

The next two methods, both named printFitness(...), print a Fitness in a way that can be (barely) read by humans, and can be read by ECJ to produce an identical Fitness to the original. These methods just print out the result of the following method (which you should override instead if you ever need to):

public String fitnessToString();

5Starting to get redundant? Sorry about that.

The default implementation of this method calls toString(), which is almost certainly wrong. But all the

standard Fitness subclasses implement it appropriately using the ec.util.Code tools (Section 2.2.3).

The method readFitness(..., LineNumberReader) reads into ECJ a Fitness written by these last two printers.

Finally, the last two methods, writeFitness(...) and readFitness(..., DataInput), read and write the Fitness in a

binary fashion. The default implementation of these methods throws an error, but all standard subclasses of

Fitness implement them properly.

Fitnesses have two auxiliary variables:

public ArrayList trials = null;

public Individual[] context = null;

These variables are used by coevolutionary processes (see Section 7.1) to keep track of the number of trials (in the form of java.lang.Double) used to compute the Fitness value, and to maintain the context (other collaborating Individuals) which produced the best result represented by the Fitness. Outside of coevolution they’re presently unused: leave them null and they won’t be printed.

Fitnesses have three hooks which can be used to merge multiple Fitness values into one, where appropriate (this doesn’t make much sense for multiobjective fitnesses, for example). Though these hooks could be used to assemble a Fitness over multiple trials, coevolution instead uses the trials mechanism described above, which preserves contextual information (see Section 7.1). One method, setToMeanOf(...), is unimplemented in the Fitness class proper, though it’s been implemented in common subclasses in ECJ. If you make your own Fitness object you might ultimately want to implement this method if appropriate, but it’s not necessary in most cases. The other two methods call setToMeanOf(...) internally.

ec.Fitness Methods

public void setToMeanOf(EvolutionState state, Fitness[] ﬁtnesses)

Sets the fitness to the mean of the provided fitness values. By default this method is unimplemented and generates an error. Common subclasses (like SimpleFitness and KozaFitness) override this method and implement it. Other classes, such as MultiobjectiveFitness and its subclasses, do not, since there is no notion of a “mean” in that context. You do not have to implement this utility method in most situations.

public void setToMedianOf(EvolutionState state, Fitness[] ﬁtnesses)

Sets the fitness to the median of the provided fitness values. This method calls setToMeanOf(...) in its implementation.

public void setToBestOf(EvolutionState state, Fitness[] ﬁtnesses)

Sets the ﬁtness to the best of the provided ﬁtness values. This method calls setToMeanOf(...) in its implementation.

3.3 Initializers and Finishers

The Initializer is called at the beginning of an evolutionary run to create the initial population. The Finisher is called at the end of a run to clean up. In fact, it’s very rare to use any Finisher other than ec.simple.SimpleFinisher, which does nothing at all. So nearly always you’ll have this:

finish = ec.simple.SimpleFinisher

Initializers vary largely based on representation, but not for the reason you think. Initializers generally

don’t need to know anything about the representation of an individual in order to construct it. Instead,

certain representations require a lot of pieces which need to be in a central repository (they’re Cliques). For

example, the genetic programming facility (Section 5.2) has various types, function sets, tree constraints, node constraints, etc. It’s not in ECJ’s style to store these things as static variables because of the difficulty it presents

for serialization. Instead ECJ needed a global object to hold them, and Initializers were chosen for that task.

It’s probably not been the smartest of decisions: Finishers (which have historically had little purpose) could

have been recruited to the job, or some generic type repository perhaps. As it stands, Initializers aren’t an

optimal location, but there it is.6

Unless you’re doing genetic programming (ec.gp) or using the ec.rule package, you’ll probably use an ec.simple.SimpleInitializer:

init = ec.simple.SimpleInitializer

ECJ’s generational7 initialization procedure goes like this:

1. The EvolutionState asks the Initializer to build a Population by calling:

population = state.initializer.initialPopulation(state, 0);

The 0 is thread index 0: this portion of the code is single-threaded.

The Initializer then creates and sets up a Population by calling the following on itself. It then tells the

Population to populate itself with individuals:

Population pop = setupPopulation(state, 0);

pop.populate(state, 0);

Why break this out? Because there are a few EvolutionState subclasses which don’t want to populate

the population immediately or at all — they just want to set it up. For example, steady state evolution

sets up a Population but may only gradually ﬁll it with initial population members. In this case, the

steady state system will just call setupPopulation(...) directly, bypassing initialPopulation(...).

The Population’s default populate(...) method is usually straightforward: it calls populate(...) in turn on each Subpopulation in the Population’s subpopulation array.

Alternatively, the Population can read an entire population from a ﬁle. This is determined by (as usual)

a parameter! If the Population should be read in from the ﬁle /tmp/population.in, the parameter setting

would be:

pop.file = /tmp/population.in

The Population will read Subpopulations, and ultimately Individuals, from this ﬁle by calling its

readPopulation(..., LineNumberReader) method.

If the Population is not reading from a ﬁle, it will call populate(...) on each of its Subpopulations. A

Subpopulation’s populate(...) method usually works like this. First, it determines if it should create

new individuals from scratch or if it should ﬁll its array by reading Individuals from a ﬁle. If the

individuals are to be generated from scratch (the most common case by far), Subpopulation generates

new individuals using the standard newIndividual(...) method in ec.Species (see Section 3.2.2). ECJ

can also check to make sure that the Subpopulation does not produce duplicate individuals while

generating from scratch, if you set the following parameter (in this case, in Subpopulation 0):

pop.subpop.0.duplicate-retries = 100

The default value is no retries. This says that if the Subpopulation creates a duplicate individual, it will

try up to 100 times to replace it with a new, original individual. After that it will give up and use the

duplicate individual.
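The retry rule is easy to sketch. What follows is an illustration only, not Subpopulation's actual populate(...) code: DuplicateRetrySketch and its methods are hypothetical, a Supplier stands in for the Species' newIndividual(...) call, and a fixed sequence replaces the random number generator so the behavior is repeatable.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical illustration of duplicate-retries; not ECJ code. */
final class DuplicateRetrySketch {
    /** Fills `size` slots, replacing a duplicate up to `retries` times per slot. */
    static <T> List<T> populate(int size, int retries, Supplier<T> maker) {
        Set<T> seen = new HashSet<>();   // relies on hashCode() and equals() by value
        List<T> result = new ArrayList<>(size);
        for (int slot = 0; slot < size; slot++) {
            T candidate = maker.get();
            // Try again while the candidate is a duplicate and retries remain.
            for (int t = 0; t < retries && seen.contains(candidate); t++) {
                candidate = maker.get();
            }
            seen.add(candidate);         // after giving up, the duplicate is kept
            result.add(candidate);
        }
        return result;
    }

    /** Deterministic stand-in for a random generator, to make the demo repeatable. */
    static Supplier<Integer> fixedSequence(Integer... values) {
        Iterator<Integer> it = Arrays.asList(values).iterator();
        return it::next;
    }

    public static void main(String[] args) {
        System.out.println(populate(3, 100, fixedSequence(1, 1, 2, 2, 2, 3))); // [1, 2, 3]
        System.out.println(populate(3, 1, fixedSequence(1, 1, 2, 2, 2, 3)));   // [1, 2, 2]
        System.out.println(populate(3, 0, fixedSequence(1, 1, 2, 2, 2, 3)));   // [1, 1, 2]
    }
}
```

With only one retry, the third slot draws a duplicate twice in a row and the duplicate is accepted. This is also why the hashCode() and equals(...) methods matter: the check is only as good as their notion of "identical".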

6This makes it problematic to have both a “rule” representation and a genetic programming representation in the same run without a little hacking, since both require their own Initializer. Perhaps this might be remedied in the future.

7ECJ’s Steady State evolution mechanism has a different initialization procedure. See Section 4.2 for more information.

You can also read Subpopulations directly from ﬁles, in a procedure similar to how it’s done for

Population. If Subpopulation 0 should be read in from the ﬁle /tmp/subpopulation.in, the parameter

setting would be:

pop.subpop.0.file = /tmp/subpopulation.in

Subpopulations will try to read individuals from ﬁles using readSubpopulation(..., LineNumberReader).

If the number of individuals in the file is greater than the size of the Subpopulation, then the Subpopulation will be resized to match the file. If the number of individuals in the file is less than the size of the Subpopulation, then the Subpopulation will try to do one of three things:

• Truncate the Subpopulation to the size of the file. This is the default when reading from a file, but if you want to be explicit, it’s specified like so:

pop.subpop.0.extra-behavior = truncate

• Wrap copies of the file’s individuals repeatedly into the Subpopulation. For example, if the file had individuals A, B, and C, and the Subpopulation was of size 8, then it’d be filled with A, B, C, A, B, C, A, B. This is particularly useful if you want to fill a Subpopulation with copies of a single individual. This is specified like so:

pop.subpop.0.extra-behavior = wrap

• Fill the remainder of the Subpopulation with random individuals (see below). This is specified like so:

pop.subpop.0.extra-behavior = fill

These options aren’t available if you’re reading the whole Population from a ﬁle: it always truncates

its Subpopulations appropriately. Note that if you’re reading the Population from a ﬁle, you can’t

simultaneously read one of its Subpopulations from a ﬁle — that wouldn’t make any sense.
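The three extra-behavior options can be summarized in a short sketch. The class ExtraBehaviorSketch and its load(...) method are hypothetical names used only for illustration, strings stand in for individuals, and a Supplier stands in for generating a random individual. None of this is ECJ's own reading code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Supplier;

/** Hypothetical illustration of the extra-behavior options; not ECJ code. */
final class ExtraBehaviorSketch {
    /** Builds a subpopulation's contents from a file's individuals. */
    static <T> List<T> load(List<T> fromFile, int size, String extraBehavior,
                            Supplier<T> randomMaker) {
        List<T> result = new ArrayList<>(fromFile);
        if (fromFile.size() >= size) return result;   // a larger file resizes the subpopulation
        switch (extraBehavior) {
            case "truncate":                          // shrink to the file's size
                break;
            case "wrap":                              // repeat the file's individuals in order
                if (fromFile.isEmpty()) throw new IllegalArgumentException("nothing to wrap");
                for (int i = fromFile.size(); i < size; i++) {
                    result.add(fromFile.get(i % fromFile.size()));
                }
                break;
            case "fill":                              // top up with fresh random individuals
                while (result.size() < size) result.add(randomMaker.get());
                break;
            default:
                throw new IllegalArgumentException("unknown behavior: " + extraBehavior);
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> file = Arrays.asList("A", "B", "C");
        System.out.println(load(file, 8, "truncate", () -> "R")); // [A, B, C]
        System.out.println(load(file, 8, "wrap", () -> "R"));     // [A, B, C, A, B, C, A, B]
        System.out.println(load(file, 8, "fill", () -> "R"));     // [A, B, C, R, R, R, R, R]
    }
}
```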

3.3.1 Population Files and Subpopulation Files

If you write out a population using printPopulation(...), the resulting file or print-out typically starts with a declaration of the number of subpopulations, followed by a declaration of a subpopulation number, then the number of individuals in that subpopulation, then the individuals one by one. After this comes the declaration of the next subpopulation number, and the number of individuals in that subpopulation, then those individuals. And so on. It looks like this:

Number of Subpopulations: i3|

Subpopulation Number: i0|

Number of Individuals: i1024|

... [the individuals] ...

Subpopulation Number: i1|

Number of Individuals: i512|

... [the individuals] ...

Subpopulation Number: i2|

Number of Individuals: i2048|

... [the individuals] ...

But ECJ doesn’t read in entire populations on initialization. Instead if you want to initialize your

population from a ﬁle, you do so on a per-subpopulation basis, as in the parameters:

pop.subpop.0.file = myfile.in

A subpopulation file like this usually just has the number of individuals for the subpopulation, followed by the individuals:

Number of Individuals: i512|

... [the individuals] ...

You can typically edit a subpopulation file out of a population file with some judicious typing: the relevant text is between the relevant “Subpopulation Number:” lines.

In the example above, there are three subpopulations, because of the line

Number of Subpopulations: i3|

This “i3|” oddity is due to use of ECJ’s Code package (Section 2.2.3). The “i” means “integer”, the “3” is the value, and the “|” is a separator. Likewise subpopulation 0 starts with “i0|”.
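If you ever need to pick these numbers out of a file yourself, the integer tokens are simple to decode. The sketch below handles only the integer form described here ("i", digits, "|"). CodeTokenSketch is a hypothetical helper, not part of ec.util.Code, and it makes no attempt at the other encoded types.

```java
/** Hypothetical decoder for integer tokens such as "i3|"; not part of ec.util.Code. */
final class CodeTokenSketch {
    /** Extracts the integer from a line such as "Number of Subpopulations: i3|". */
    static int decodeInt(String line) {
        int colon = line.indexOf(':');
        // Everything after the preamble's colon is taken to be the token.
        String token = (colon < 0 ? line : line.substring(colon + 1)).trim();
        if (token.length() < 3 || token.charAt(0) != 'i' || !token.endsWith("|")) {
            throw new IllegalArgumentException("not an integer token: " + token);
        }
        // Strip the leading type letter and the trailing separator.
        return Integer.parseInt(token.substring(1, token.length() - 1));
    }

    public static void main(String[] args) {
        System.out.println(decodeInt("Number of Subpopulations: i3|")); // 3
        System.out.println(decodeInt("Subpopulation Number: i0|"));     // 0
        System.out.println(decodeInt("Number of Individuals: i1024|")); // 1024
    }
}
```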

3.4 Evaluators and Problems

ECJ evaluates (assesses the ﬁtness of) Individuals in a Population by passing it to an ec.Evaluator. Various

evolutionary algorithms and other stochastic search algorithms have their own special kinds of Evaluators.

Evaluators perform this fitness assessment by cloning one or more Problems, discussed in the next Section, and asking these Problems to evaluate the individuals on their behalf. Evaluators hold the prototypical Problem here:

public Problem p problem;

This problem is loaded from parameters. For example, to specify that we will use the Artiﬁcial Ant

Problem to test our genetic programming Individuals, we’d say:

eval.problem = ec.app.ant.Ant

The basic Evaluator is ec.simple.SimpleEvaluator. This class evaluates a Population ﬁrst by determining

how many threads to use. To use four threads (for example), we say:

evalthreads = 4

The default value is a single thread.

Recall from Section 2.4 that this will require at least four random number generator seeds, for example:

seed.0 = 1234

seed.1 = -503812

seed.2 = 992341

seed.3 = -16723

When evaluating a Population, ec.simple.SimpleEvaluator will construct Problems cloned from the Problem prototype, and assign one to each thread. Then, for each Subpopulation, the Evaluator will use these threads to evaluate the individuals in the Subpopulation. By default SimpleEvaluator simply breaks each Subpopulation into even chunks, one per thread, and assigns each chunk to a different thread and its Problem. This enables the Population to be evaluated in parallel.
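To make the default chunking concrete, here is a minimal sketch of splitting a Subpopulation into one near-even chunk per thread. How ECJ distributes any remainder is not stated in the text, so giving the extra individuals to the first chunks is an assumption, and EvenChunkSketch is a hypothetical name.

```java
final class EvenChunkSketch {
    /**
     * Returns {start, end} (end exclusive) of the chunk handled by `thread`
     * when `size` individuals are split across `threads` threads.
     */
    static int[] chunk(int size, int threads, int thread) {
        int base = size / threads;
        int extra = size % threads;   // assumed: the first `extra` chunks get one more
        int start = thread * base + Math.min(thread, extra);
        int length = base + (thread < extra ? 1 : 0);
        return new int[] { start, start + length };
    }

    public static void main(String[] args) {
        for (int t = 0; t < 4; t++) {
            int[] c = chunk(1024, 4, t);
            System.out.println("thread " + t + ": [" + c[0] + ", " + c[1] + ")");
        }
    }
}
```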

The problem with this approach to parallelism is that it’s not fine-grained: if some individuals take much longer to evaluate than others, then some threads will sit around waiting for another thread to finish its chunk. You can fix this by specifying the chunk size, all the way down to chunks of a single individual each. When a thread has finished its chunk, it will request another chunk to work on, and if it has exhausted all the chunks in a Subpopulation, it’ll grab chunks from the next Subpopulation. For example, the extreme of fine-grained parallelism would be:

eval.chunk-size=1

The disadvantage of a small chunk size is that it involves a lot of locking to get each chunk. This is a small

but signiﬁcant overhead: so we suggest using the default (large automatic chunks) unless your evaluations

are costly and of high variance in evaluation time.
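The chunk-requesting behaviour described above can be sketched with worker threads that repeatedly claim the next chunk from a shared counter. This is only an illustration: an AtomicInteger stands in for whatever locking ECJ actually uses, the "evaluation" is a counter increment, and ChunkQueueSketch is a hypothetical name.

```java
import java.util.concurrent.atomic.AtomicInteger;

final class ChunkQueueSketch {
    /** Evaluates items 0..size-1 in chunks of `chunkSize`; returns how often each item was evaluated. */
    static int[] run(int size, int chunkSize, int threads) {
        final int[] evaluated = new int[size];
        final AtomicInteger next = new AtomicInteger(0);   // index of the next unclaimed item
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                while (true) {
                    int start = next.getAndAdd(chunkSize);   // claim a chunk
                    if (start >= size) return;               // nothing left to do
                    int end = Math.min(start + chunkSize, size);
                    for (int i = start; i < end; i++) evaluated[i]++;   // stand-in for evaluation
                }
            });
            workers[t].start();
        }
        for (Thread w : workers) {
            try {
                w.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return evaluated;
    }
}
```

A smaller chunkSize means more trips to the shared counter, which is the overhead the text warns about.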

Another disadvantage of a nonstandard chunk size is that threads run at different speeds, so which thread (and thus which random number generator) evaluates a given individual is no longer fixed from run to run: as a result, different runs with the same seeds could produce different results if evaluation is stochastic.

Of course, you probably most often don’t do parallelism at all: you’ll just have a single thread (that is, evalthreads = 1). In this case you have one further option: to avoid cloning the Problem each time, by setting the following parameter to false:

eval.clone-problem = false

If false, then the same Problem instance (the Prototype, in fact) will be used again and again. Obviously, this is only allowed if there’s a single evaluation thread. And steady-state evolution (via ec.steadystate.SteadyStateEvaluator) does not support it.

The idea of not cloning the population and pipeline is due to Brian Olsen, a GMU PhD Student.

Certain Evaluator methods are required. The primary method an Evaluator must implement is

public abstract void evaluatePopulation(EvolutionState state);

This method must take the Population (that is, state.population) and evaluate all the individuals in it in

the fashion expected by the stochastic search algorithm being employed. Additionally, an Evaluator must

implement the method

public abstract String runComplete(EvolutionState state);

... which returns a non-null String if the Evaluator believes the process has reached a terminating state.

Typically this is done by scanning through the Population and determining if any of the Individuals have

ideal ﬁtnesses. The String provides a message for the user as to why the run terminated early (typically

because the ideal individual was discovered). If you don’t want to be bothered, it’s ﬁne to have this method

always return null.

If you would like to terminate the run early for some other reason — perhaps you have run out of

resources or got stuck somehow — you can call the following method to request that ECJ terminate its run

at its earliest convenience. You provide a String which tells the user why you’ve decided to quit.

public void setRunComplete(String message);
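The runComplete(...) contract described above (return a message to stop, or null to keep going) can be sketched as follows. The sketch works on a bare array of fitness values rather than on ECJ's Population, and RunCompleteSketch and its parameters are hypothetical.

```java
final class RunCompleteSketch {
    /** Returns a message if any fitness reaches `ideal`, else null (meaning: keep running). */
    static String runComplete(double[] fitnesses, double ideal) {
        for (int i = 0; i < fitnesses.length; i++) {
            if (fitnesses[i] >= ideal) {
                return "Found ideal individual at index " + i;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        System.out.println(runComplete(new double[] { 0.2, 1.0 }, 1.0));
    }
}
```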

3.4.1 Problems

Evaluators assess the ﬁtness of individuals typically by creating one or more Problems and handing them

chunks of Subpopulations to evaluate. There are two ways that an Evaluator can ask a Problem to perform

evaluation:

•

For each Individual, the Evaluator can call the Problem’s evaluation method. This method varies

depending on the kind of Problem. Problems which adhere to ec.simple.SimpleProblemForm — by far

the most common situation — use the following method:

public void evaluate(EvolutionState state, Individual ind,

int subpopulation, int threadnum);

When this approach is taken, the Problem must assign a ﬁtness immediately during the evaluate(...)

method. In practice, ECJ doesn’t do this all that much.

•

The more common approach allows a Problem to perform ﬁtness evaluation in bulk. In this approach,

the Evaluator will ﬁrst call the following method once:

public void prepareToEvaluate(EvolutionState state, int thread);

This signals to the Problem that it must prepare itself to begin evaluating a series of Individuals, and

then afterwards assign ﬁtness to all of them. Next the Evaluator calls the Problem’s evaluation method

for each Individual, typically using the method evaluate(...) as before. Finally, the Evaluator calls this

method:

public void finishEvaluating(EvolutionState state, int thread);

Using this approach, the Problem is permitted to delay assigning fitness to Individuals until finishEvaluating(...) is called.
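The bulk protocol above can be illustrated with a toy problem whose fitness is relative to the best raw score in the batch, so it cannot be assigned until every individual has been seen. The classes below are hypothetical stand-ins, not ECJ's ec.Problem or ec.Individual, and the relative-fitness rule is just an example of why one might defer assignment.

```java
import java.util.ArrayList;
import java.util.List;

final class BulkEvalSketch {
    /** Hypothetical stand-in for an individual. */
    static final class Ind {
        final double raw;      // raw score, known up front in this toy example
        double fitness;
        boolean evaluated;
        Ind(double raw) { this.raw = raw; }
    }

    private final List<Ind> pending = new ArrayList<>();

    /** Called once before a series of evaluate(...) calls. */
    void prepareToEvaluate() { pending.clear(); }

    /** Only records the individual; no fitness is assigned yet. */
    void evaluate(Ind ind) { pending.add(ind); }

    /** Called once afterwards: assigns fitness relative to the best raw score seen. */
    void finishEvaluating() {
        double best = 0.0;
        for (Ind ind : pending) best = Math.max(best, ind.raw);
        for (Ind ind : pending) {
            ind.fitness = best > 0.0 ? ind.raw / best : 0.0;
            ind.evaluated = true;
        }
        pending.clear();
    }
}
```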

When ECJ is preparing to exit, various Statistics objects sometimes construct a Problem in order to

re-evaluate the ﬁttest Individual of the run, solely to have such evaluation print out useful information to

tell the user how the Individual operates. This special version of evaluation is done with the following

ec.simple.SimpleProblemForm method:

public void describe(EvolutionState state, Individual ind, int subpopulation,

int threadnum, int log);

Note that ECJ will not call prepareToEvaluate(...) before describe(...), nor call ﬁnishEvaluating(...)

after it.

When this method is called, the expectation is that the individual will be evaluated for the purpose

of writing out interesting descriptive information to the log. For example, a ﬁt Artiﬁcial Ant agent might

show the map of the trail it produces as it wanders about eating pellets of food. If you prefer you don’t

have to implement this method: and in fact many Problems don’t. The default version (in ec.Problem) does

nothing at all.

Problem is a Prototype, and so it must implement the clone() (as a deep-clone), setup(...), and defaultBase() methods. Problem’s “default” default base is

problem

which in truth is very rarely used.

3.4.2 Implementing a Problem

Commonly the only method a Problem needs to implement is the evaluate(...) method. For example, let’s

imagine that our Individuals are of the class ec.vector.IntegerVectorIndividual, discussed in Section 5.1. The

genotype for IntegerVectorIndividual is little more than an array of integers. Let us presume that the ﬁtness

of these individuals is deﬁned as the product of their integers.

The example below does five basic things:

1. If the individual has already been evaluated, we don’t bother evaluating it again. It’s possible you might want to evaluate it anyway (perhaps if you had a dynamically changing fitness function, for example).

2. We do a sanity check: if the individual is of the wrong type, we issue an error.

3. We compute the product of the values in the genome.

4. We set the fitness to that product, and test to see if the fitness is optimal (in this case, if it’s equal to Double.POSITIVE_INFINITY).

5. We set the individual’s evaluated flag.

The implementation is pretty straightforward:

package ec.app.myapp;
import ec.*;
import ec.simple.*;
import ec.vector.*;

public class MyProblem extends Problem implements SimpleProblemForm
{
    public void evaluate(EvolutionState state, Individual ind,
        int subpopulation, int thread)
    {
        // 1. Skip individuals that have already been evaluated
        if (ind.evaluated) return;

        // 2. Sanity check on the individual's type
        if (!(ind instanceof IntegerVectorIndividual))
            state.output.fatal("Whoa! It's not an IntegerVectorIndividual!!!");

        // 3. Compute the product of the genes
        int[] genome = ((IntegerVectorIndividual)ind).genome;
        double product = 1.0;
        for(int x=0; x<genome.length; x++)
            product = product * genome[x];

        // 4. Set the fitness, and whether it is ideal
        ((SimpleFitness)ind.fitness).setFitness(state, product,
            product == Double.POSITIVE_INFINITY);

        // 5. Mark the individual as evaluated
        ind.evaluated = true;
    }
}

If you’re implementing ec.simple.SimpleProblemForm, you may wish to also implement the describe(...) method. This

method will be called a single time at the end of the run, and will be passed the ﬁttest individual discovered.

It gives you a chance to write out interesting statistics about the individual, typically to the statistics log. For

example, the ec/app/ant/Ant.java ﬁle implements the describe(...) method to print out the winning path that

the ant took to achieve its high ﬁtness. Here is a silly example of implementing describe(...) to give you the

general idea.

public void describe(EvolutionState state, Individual ind,
    int subpopulation, int thread, int log)
{
    if (!(ind instanceof IntegerVectorIndividual))
        state.output.fatal("Whoa! It's not an IntegerVectorIndividual!!!");

    int[] genome = ((IntegerVectorIndividual)ind).genome;
    double product = 1.0;
    double min = Double.POSITIVE_INFINITY;
    for(int x=0; x<genome.length; x++)
    {
        product = product * genome[x];
        min = Math.min(genome[x], min);
    }

    state.output.println("Best Individual's total: " + product, log);
    state.output.println("Best Individual's minimum genome: " + min, log);
    if (product == Double.POSITIVE_INFINITY)
        state.output.println("Best Individual is OPTIMAL!", log);
    else
        state.output.println("Best Individual is not optimal.", log);
}

Notice that describe(...) basically looks the same as evaluate(...), except that you always do evaluation

(why wouldn’t you?), and don’t set the ﬁtness (why would you?).

3.5 Breeders

Individuals are selected and bred to create new Individuals using a subclass of ec.Breeder. Because this is so

central to the differences among various evolutionary algorithms, many such algorithms implement their

own Breeder subclasses. A Breeder consists of a single method:

public abstract Population breedPopulation(EvolutionState state);

This method is required to take the current Population, found here...

state.population

... and return a Population to be used for the next generation, consisting of individuals selected and bred

from the previous Population in a manner appropriate for the algorithm being used. The Population returned

can be the original Population, or it can be an entirely new Population cloned from the original (Population

is a Group, recall — see Section 3.1.5).

The most common Breeder is ec.simple.SimpleBreeder, which implements a basic form of generational

breeding common to the Genetic Algorithm and to Genetic Programming, among others. SimpleBreeder has

facilities for multithreaded breeding and a simple form of elitism, and works as follows:

1. For each Subpopulation in the Population,

    (a) Determine the N fittest Individuals in the Subpopulation.

    (b) Create a new Subpopulation.

    (c) Load these N individuals (the “elites”) into the first slots of the new Subpopulation’s individuals array.

    (d) Break the remaining unfilled region of this individuals array into M chunks, one chunk per thread.

    (e) For each of the M threads (in parallel),

        i. Construct a new Breeding Pipeline.

        ii. Use this Breeding Pipeline to populate the thread’s chunk with newly-bred Individuals.

2. Assemble all the new Subpopulations into a new Population and return it.
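Steps (a), (c), and (d) above can be sketched for a single Subpopulation as follows. The sketch works on a bare array of fitness values (higher is better), follows the text in placing the elites in the first slots, and splits the remaining slots by simple proportion; ElitismSketch and its methods are hypothetical and are not SimpleBreeder's actual code.

```java
import java.util.Arrays;
import java.util.Comparator;

final class ElitismSketch {
    /** Indices of the n fittest individuals, fittest first. */
    static int[] elites(double[] fitness, int n) {
        Integer[] order = new Integer[fitness.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        // sort indices by descending fitness
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -fitness[i]));
        int[] out = new int[n];
        for (int i = 0; i < n; i++) out[i] = order[i];
        return out;
    }

    /** {start, end} slots (end exclusive) filled by thread t of m, after the first n elite slots. */
    static int[] breedChunk(int size, int n, int m, int t) {
        int remaining = size - n;
        int start = n + (int) ((long) remaining * t / m);
        int end = n + (int) ((long) remaining * (t + 1) / m);
        return new int[] { start, end };
    }
}
```

With a Subpopulation of 10, 2 elites, and 4 threads, the threads fill slots 2-3, 4-5, 6-7, and 8-9.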

The number of elites (N) in each Subpopulation is a parameter. To set 10 elites for Subpopulation 0 (for example), you’d say:

breed.elite.0 = 10

Alternatively you can define the number of elites as a proportion of the Subpopulation:

breed.elite-fraction.0 = 0.25

You can’t do both, but you can do neither: the default value is no elites.

You don’t have to specify the elitism on a per-Subpopulation basis. If you specified a default subpopulation using the parameter pop.default-subpop (see Section 3.2.1), SimpleBreeder will try this instead when it can’t find an elitism parameter, and will issue a warning letting you know it’s doing so. For example, if you declared:

pop.default-subpop = 0

breed.elite.0 = 10

... then when SimpleBreeder must determine elitism for (say) Subpopulation 4, and that parameter doesn’t

exist, it’ll use 10 instead (the value for Subpopulation 0).
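The fallback just described can be sketched with java.util.Properties standing in for ECJ's parameter database (an assumption made purely for illustration; ECJ does not read its parameters this way). The sketch handles only breed.elite, ignores breed.elite-fraction, and omits the warning; EliteLookupSketch is a hypothetical name.

```java
import java.util.Properties;

final class EliteLookupSketch {
    /** Number of elites for `subpop`, falling back to the default subpopulation's value, else 0. */
    static int elitesFor(Properties p, int subpop) {
        String v = p.getProperty("breed.elite." + subpop);
        if (v == null) {
            String def = p.getProperty("pop.default-subpop");
            if (def != null) {
                v = p.getProperty("breed.elite." + def.trim());   // try the default subpopulation
            }
        }
        return v == null ? 0 : Integer.parseInt(v.trim());        // no elites by default
    }

    public static void main(String[] args) {
        Properties p = new Properties();
        p.setProperty("pop.default-subpop", "0");
        p.setProperty("breed.elite.0", "10");
        System.out.println(elitesFor(p, 4));
    }
}
```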

Ordinarily elites never have their ﬁtness reevaluated. But if you have a dynamic ﬁtness function, you

may wish to reevaluate their ﬁtness each generation to see if it’s still the same. To do this for Subpopulation

0, you say:

breed.reevaluate-elites.0 = true

The default value is false.

Again, you don’t have to specify this on a per-Subpopulation basis if you specified a default subpopulation using the parameter pop.default-subpop. SimpleBreeder will use the default subpopulation’s value and issue a warning.

The number of threads (M) is also a parameter. To set it to 4, you’d say:

breedthreads = 4

The default value is a single thread.

As was the case for the evalthreads parameter (for Evaluator), recall from Section 2.4 that this will require at least four random number generator seeds, one per thread. For example:

seed.0 = 1234

seed.1 = -503812

seed.2 = 992341

seed.3 = -16723

Certain Breeders allow you to change the subpopulation breeding order to a sequential one. Let’s say you

have 3 subpopulations. If you turn on the following parameter:

breed.sequential = true

... then SimpleBreeder will only breed subpopulation 0 the first generation, then only subpopulation 1 the second generation, then only subpopulation 2 the third generation, then only subpopulation 0 the fourth generation, and so on. Obviously if you only have one subpopulation, this parameter has no effect. Why would you want this? ECJ only uses it for a common kind of coevolution, discussed later in Section 7.1.4.1. Otherwise you should keep this set to false. The only reason it’s mentioned here is so that if you make a Breeder subclass, you can check for this parameter and signal an error if it is true, indicating that your Breeder subclass doesn’t support it.

Figure 3.3 Two example Breeding Pipelines. [Node labels in the figure: Old Subpopulation, Tournament Selection, Fitness Proportionate Selection, Sigma Scaling Selection, Vector Crossover Pipeline, Vector Mutation Pipeline, Reproduction Pipeline, Multi-Breeding Pipeline, (GP) Crossover Pipeline, New Subpopulation.]
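The rotation used by sequential breeding amounts to a modulo over the generation number. A minimal sketch, assuming generations are counted from 0 (so "the first generation" is generation 0); SequentialBreedSketch is a hypothetical name.

```java
final class SequentialBreedSketch {
    /** The only subpopulation bred in `generation` (0-based) under sequential breeding. */
    static int subpopToBreed(int generation, int numSubpops) {
        return generation % numSubpops;
    }

    public static void main(String[] args) {
        for (int gen = 0; gen < 5; gen++) {
            System.out.println("generation " + gen + " breeds subpopulation " + subpopToBreed(gen, 3));
        }
    }
}
```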

All that remains is the breeding procedure itself, for which SimpleBreeder (and many Breeders) constructs a Breeding Pipeline.

Last but not least, if you breed with only one thread (M = 1), you have an additional option: to avoid cloning the Population and Pipeline each time. Instead, the same Pipeline will be used each generation; and the breeder will swap back and forth between two Populations each time rather than create new ones. This exists mainly to be a little more efficient in some situations, primarily with small populations. This is done by setting this parameter to false:

breed.clone-pipeline-and-population = false

Obviously, this is only allowed if there’s a single breeding thread. And this option is not supported by a number of subclasses, namely:

•ec.spatial.SpatialBreeder

•ec.multiobjective.nsga2.NSGA2Breeder

•ec.multiobjective.spea2.SPEA2Breeder

•ec.steadystate.SteadyStateBreeder

The idea of not cloning the population and pipeline is due to Brian Olsen, a GMU PhD Student.

3.5.1 Breeding Pipelines and BreedingSources

A Breeding Pipeline is a chain of selection and breeding operators whose function is to draw from Individuals

in an old Subpopulation to produce individuals in a new Subpopulation.

Breeding Pipelines consist of two kinds of objects. First there are Selection Methods, which select Individuals from the old Subpopulation and return copies of them. Then there are Breeding Pipelines (what would have better been called Breeding Operators), which take Individuals from Selection Methods or from other Breeding Pipelines, modify them in some way, and return them.

The Breeding Pipeline structure isn’t actually a pipeline: it’s really a tree (or in some situations, a directed acyclic graph). The leaf nodes in the tree are the Selection Methods (subclasses of ec.SelectionMethod), and the nonleaf nodes are the Breeding Pipeline objects (subclasses of ec.BreedingPipeline).

Each BreedingPipeline object can have some N sources (children) from which it draws Individuals. Both ec.SelectionMethod and ec.BreedingPipeline are subclasses of the abstract superclass ec.BreedingSource, and so can function as sources for BreedingPipelines. SelectionMethods do not have sources: rather, they draw Individuals directly from the old Subpopulation.

Breeding Pipelines implement a copy-forward protocol: the leaf nodes (SelectionMethods) build clones of individuals selected from the original population, and the non-leaf BreedingPipeline objects modify those clones. This way every Breeding Pipeline has its own copies of individuals from which it generates children, and so no locks are required on the original population when doing multithreaded breeding.

BreedingSources (both BreedingPipelines and SelectionMethods) are Prototypes, and so must implement the clone(), defaultBase(), and setup(...) methods. BreedingSources also implement three special methods which perform the actual selection and breeding, which we describe here. When a Breeder wishes to produce a series of new Individuals from an old Subpopulation, it begins by calling the method

public abstract void prepareToProduce(EvolutionState state, int subpopulation,

int thread);

This instructs the BreedingSource to prepare for a number of requests for Individuals drawn from

subpopulation number subpopulation. During this method the BreedingSource will, at a minimum, call

prepareToProduce(...) on each of its sources.

Next, the Breeder calls the following zero or more times to actually produce the Individuals:

public abstract int produce(int min, int max, int subpopulation,

java.util.ArrayList<Individual> inds, EvolutionState state,

int thread, java.util.HashMap<String, Object> misc);

This instructs the BreedingSource to produce between min and max Individuals drawn from subpopulation number subpopulation. The Individuals are to be added to the end of the ArrayList inds. Auxiliary information may be found in (or placed in) the HashMap misc: see Section 3.5.1.1. Finally the method returns the actual number of Individuals produced.

Last, the Breeder calls the following method to give the BreedingSource an opportunity to clean up. The

BreedingSource in turn, at a minimum, will call the same method on each of its sources.

public abstract void finishProducing(EvolutionState state, int subpopulation,

int thread);
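The prepare / produce / finish sequence and the copy-forward idea can be sketched with a two-node tree: a leaf that clones genomes out of the old subpopulation, and a non-leaf node that modifies those clones. Everything here is a simplified, hypothetical stand-in (the Source interface, int[] genomes, the toy "mutation"); none of it is ECJ's actual ec.BreedingSource API, whose signatures are the ones shown in the text.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Random;

final class PipelineSketch {
    /** Hypothetical stand-in for a breeding source, reduced to the three calls discussed. */
    interface Source {
        void prepareToProduce();
        int produce(int min, int max, List<int[]> inds, Random rng);
        void finishProducing();
    }

    /** Leaf node: picks genomes at random from the old subpopulation and appends clones. */
    static final class RandomSelect implements Source {
        final List<int[]> old;
        RandomSelect(List<int[]> old) { this.old = old; }
        public void prepareToProduce() { }
        public int produce(int min, int max, List<int[]> inds, Random rng) {
            int n = Math.max(min, Math.min(max, 1));   // one individual unless min demands more
            for (int i = 0; i < n; i++) inds.add(old.get(rng.nextInt(old.size())).clone());
            return n;
        }
        public void finishProducing() { }
    }

    /** Non-leaf node: draws from its source, then modifies the clones in place. */
    static final class AddOne implements Source {
        final Source source;
        int prepared = 0, finished = 0;
        AddOne(Source source) { this.source = source; }
        public void prepareToProduce() { prepared++; source.prepareToProduce(); }
        public int produce(int min, int max, List<int[]> inds, Random rng) {
            int start = inds.size();
            int n = source.produce(min, max, inds, rng);
            for (int i = start; i < start + n; i++) {
                int[] g = inds.get(i);
                g[rng.nextInt(g.length)]++;            // toy "mutation" of one gene
            }
            return n;
        }
        public void finishProducing() { finished++; source.finishProducing(); }
    }

    /** Breeder side: prepare once, produce until `count` individuals exist, finish once. */
    static List<int[]> breed(Source pipeline, int count, Random rng) {
        List<int[]> next = new ArrayList<>();
        pipeline.prepareToProduce();
        while (next.size() < count) pipeline.produce(1, count - next.size(), next, rng);
        pipeline.finishProducing();
        return next;
    }
}
```

Because the leaf hands out clones, the genomes in the old subpopulation are never modified, which is why no locking on the old population is needed.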

Additionally, BreedingSources implement three other methods. The ﬁrst:

public abstract int typicalIndsProduced();

This method returns the number of individuals a BreedingSource would produce by default if not

constrained by min and max. The method can return any number >0.

The next method:

public abstract boolean produces(EvolutionState state, Population newpop,

int subpopulation, int thread);

... returns true if the BreedingSource believes it can validly produce Individuals of the type described by the

given Species, that is, by newpop.subpops.get(subpopulation).species. This is basically a sanity check. At the

minimum, the BreedingSource should call this method on each of its sources and return false if any of them

return false.

Last, we have the hook...

public void preparePipeline(Object hook);

You don’t have to implement this at all. ECJ does not call this method nor implement it in any of its

BreedingSources beyond the default implementation (which in BreedingPipeline calls the method in turn on

each of its sources). This method simply exists in the case that you need a way to communicate with all the

methods of a BreedingPipeline at some unusual time.

3.5.1.1 Auxiliary Data

In addition to Individuals, Breeding Sources can also return various auxiliary data associated with them. This is an elaborate hook which lets you do much more sophisticated things with Breeding Sources as your needs may require. This data is stored in the misc variable passed into the produce(...) method of a Breeding Source.

A Breeder may set misc to null, in which case Breeding Sources should simply ignore it. Otherwise misc holds a HashMap which maps keyword strings to custom objects, and is created immediately before any call to produce(...) by calling the following method in ec.Species:

public HashMap<String, Object> buildMisc(EvolutionState state, int subpopIndex,

int thread);

By default this method just builds an empty HashMap.

You can put whatever you want in this HashMap. But ECJ currently reserves exactly one keyword string,

deﬁned in SelectionMethod.java:

public static final String KEY_PARENTS = "parents";

Associated with this string in the HashMap is an array of ec.util.IntBag objects. An IntBag is very much like an ArrayList but it holds ordinary integers, not objects. For a given individual number i returned in the ArrayList, the IntBag number i in the parents array holds indexes to all the parents of that Individual in the old Subpopulation.
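Reading the parents back out of misc might look like the sketch below. To stay self-contained it uses an int[][] in place of the array of ec.util.IntBag objects, which is an assumption for illustration only; MiscParentsSketch and parentsOf are hypothetical names. Note the null checks, since misc itself may be null.

```java
import java.util.HashMap;

final class MiscParentsSketch {
    static final String KEY_PARENTS = "parents";   // the keyword string quoted in the text

    /** Parent indexes recorded for produced individual number i, or an empty array if none. */
    static int[] parentsOf(HashMap<String, Object> misc, int i) {
        if (misc == null) return new int[0];               // the Breeder supplied no misc map
        Object value = misc.get(KEY_PARENTS);
        if (!(value instanceof int[][])) return new int[0];
        int[][] parents = (int[][]) value;                 // stand-in for an IntBag array
        boolean present = i >= 0 && i < parents.length && parents[i] != null;
        return present ? parents[i] : new int[0];
    }
}
```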

3.5.2 SelectionMethods

Selection Methods by default implement the typicalIndsProduced() method to return SelectionMethod.INDS_PRODUCED (that is, 1).

Furthermore, the default implementation of the produces method,

public boolean produces(EvolutionState state, Population newpop,
    int subpopulation, int thread);

...just returns true. But you may wish to use this method to check to make sure that your SelectionMethod knows how to work with the kind of Fitnesses found in the given subpopulation, that is, state.population.subpops.get(subpopulation).species.f_prototype.

8 IntBag comes from ECJ’s sister project, MASON: https://cs.gmu.edu/∼eclab/projects/mason

The default implementations of prepareToProduce(...) and ﬁnishProducing(...) do nothing at all; though

some kinds of SelectionMethods, such as Fitness Proportionate Selection (ec.select.FitProportionateSelection),

use prepareToProduce(...) to prepare probability distributions based on the Subpopulation in order to select

properly.

SelectionMethods are sometimes called upon not to produce a cloned Individual but to provide an index

into a subpopulation where the Individual is located — perhaps to kill that Individual and place another

Individual in its stead. To this end, SelectionMethods have an alternative form of the produce(...) method:

public abstract int produce(int subpopulation, EvolutionState state, int thread);

This method must return the index of the selected individual in the Subpopulation given by:

state.population.subpops.get(subpopulation).individuals;

Furthermore, SelectionMethods have a special version of produce(...) called produceWithoutCloning(...),

which selects individuals (using the alternative form of produce(...) as described earlier) and sticks them

directly into the ArrayList without cloning them.

public int produceWithoutCloning(int min, int max, int subpopulation,

java.util.ArrayList<Individual> inds, EvolutionState state,

int thread, java.util.HashMap<String, Object> misc);

At present, produceWithoutCloning() does exactly one thing with the misc variable: it creates a parents bag for each produced child, and places in it the index of the original individual in the old subpopulation.

The default implementation of the full produce(...) method just calls produceWithoutCloning(...), then replaces with clones each of the individuals placed in the ArrayList. All this means that you don’t have to implement the full produce(...) method. You typically just implement the alternative form (which you have to implement anyway).

3.5.2.1 Implementing a Simple SelectionMethod

Implementing a SelectionMethod can be as simple as overriding the “alternative” form of the produce(...)

method. You don’t have to implement the “standard” form of produce(...) because its default implementation

calls the alternative form and handles the rest of the work for you.

To select an individual at random from a Subpopulation, you could simply implement the “alternative”

form to return a random number between 0 and the size of the subpopulation in question:

public int produce(int subpopulation, EvolutionState state, int thread)
{
    // pick an index uniformly at random from 0 to (subpopulation size - 1)
    return state.random[thread].nextInt(
        state.population.subpops.get(subpopulation).individuals.size());
}

You’ll always want to implement the alternative form of produce(...). But in some cases you may wish to also reimplement the “standard” form of produce(...) for some reason: perhaps you want to play with the misc variable, for example. In this case it’s best to implement the produceWithoutCloning(...) version. We start by determining how many individuals we’ll produce, defaulting to 1:

public int produceWithoutCloning(int min, int max, int subpopulation,
    java.util.ArrayList<Individual> inds, EvolutionState state, int thread,
    java.util.HashMap<String, Object> misc)
{
    int start = inds.size();
    int n = 1;
    if (n>max) n = max;
    if (n<min) n = min;

Next we select n individuals from the old subpopulation and place them into slots inds[start] ... inds[start+n-1]. Here we do it randomly:

    for(int q = 0; q < n; q++)
    {
        int index = produce(subpopulation,state,thread);
        inds.add(state.population.subpops.get(subpopulation).individuals.get(index));

        // here we will create a parents IntBag for each selected individual, and place
        // the index of that selected individual directly in this IntBag. An array of
        // IntBags is stored associated with KEY_PARENTS.
        // Note that misc could be null, that's perfectly legitimate.
        if (misc!=null && misc.get(KEY_PARENTS)!=null)
        {
            ec.util.IntBag bag = new ec.util.IntBag(1);
            bag.add(index);
            ((ec.util.IntBag[])misc.get(KEY_PARENTS))[start+q] = bag;
        }

        // Here is where you could play around some more with the misc variable,
        // for example...
    }
    return n;
}

3.5.2.2 Standard Classes

There are a number of standard SelectionMethods available in ECJ, all found in the ec.select package.

•

ec.select.FirstSelection always returns the ﬁrst individual in the Subpopulation. This is largely used for

testing purposes.

•

ec.select.RandomSelection returns an Individual chosen uniformly at random. Basically it does the same

thing as the example given above.

•

ec.select.TopSelection always returns the very best individual in the Subpopulation, breaking ties

randomly. You can optionally cache this individual so TopSelection doesn’t hunt for him every time it’s

called. To do this, you set:

base.cache = true

The default value is false. Note that if multiple individuals are tied for best, only one will be selected for the cache, and the others will never be returned. The cache is reset when the method prepareToProduce(...) is called. Warning: in steady-state evolution, this method is only called once, at the beginning of the run. In generational evolution, the method is called at the start of every generation.

• ec.select.AnnealedSelection is used for doing Simulated Annealing. It works as follows. If there is only one individual in the population, it is selected. Otherwise we choose a random individual that is not the first individual. If that random individual is fitter than the first individual, it is selected. Otherwise if that random individual is as fit as (has a fitness equivalent to) the first individual, one of the two is selected at random. Otherwise if the random individual is not as fit as the first individual, it is selected with a probability $P = e^{(\text{Fitness(the random individual)} - \text{Fitness(the first individual)})/t}$, where $t$ is a temperature. Otherwise the first individual is selected.

The temperature starts at a high value $t > 0$, and is slowly cut down by multiplying it by a cutdown value every generation. When the temperature reaches 0, then the first individual is always selected. The initial temperature and cutdown are set like this:

base.temperature = 10000

base.cutdown = 0.95

(You don’t have to provide a cutdown, in which case its default is 0.95.)

Like TopSelection, AnnealedSelection also has a caching option. The selected individual can be cached

so the same individual is returned repeatedly without being recomputed. This cache is cleared after a

call to prepareToProduce(...). Again, note that this option is not appropriate for Steady State Evolution,

which only calls prepareToProduce(...) once. To do this, you set:

base.cache = true

The default value is false.

• ec.select.FitProportionateSelection uses Fitness-Proportionate Selection, sometimes called Roulette Selection, to pick individuals. Thus ec.select.FitProportionateSelection requires that all fitnesses be non-negative.

• ec.select.SUSSelection selects individuals using Stochastic Universal Sampling, a low-variance version of Fitness-Proportionate selection in which highly fit individuals are unlikely to never be chosen. Every new generation, and every N selection events thereafter, it shuffles the Subpopulation, then computes the individuals to be selected in the future. ECJ assumes that N is the size of the Subpopulation. Fitnesses must be non-negative. You have the option of whether or not to shuffle the Subpopulation first:

base.shuffle = true

The default value is false.

• ec.select.SigmaScalingSelection (written by Jack Compton, a former undergraduate at GMU) is another low-variance version of Fitness-Proportionate Selection, in which modified versions of the Individuals’ fitnesses are used to reduce the variance among them. This is done by first computing the mean $\mu$ and standard deviation $\sigma$ among the fitnesses. If $\sigma = 0$ no change is made. Otherwise each modified fitness $g$ is then treated as $g \leftarrow 1 + \frac{f - \mu}{2\sigma}$, where $f$ is the original fitness. This can result in negative modified fitnesses, so we introduce a fitness floor: modified fitnesses are bounded to be no less than the floor. Original fitnesses must be non-negative. To set this floor to 0.1 (a common value), you’d say:

base.scaled-fitness-floor = 0.1

0.1 is the default value already, so this is redundant.

SigmaScalingSelection’s default base is select.sigma-scaling.

• ec.select.BoltzmanSelection (also written by Jack Compton) works like Fitness-Proportionate Selection, but uses modified fitness values according to a Boltzman (Simulated-Annealing-style) cooling schedule. Initially BoltzmanSelection has a high temperature $T$, and for each successive generation it decreases $T$ by a cooling rate $R$: $T \leftarrow T * R$. Each modified fitness $g$ is computed as $g \leftarrow e^{f/T}$, where $f$ is the original fitness. Fitnesses must be non-negative. When the temperature reaches 1.0, BoltzmanSelection reverts to FitProportionateSelection. To set the initial temperature to 1000 and the cooling rate to 0.99, you’d say:

base.starting-temperature = 1000

base.cooling-rate = 0.99

The default temperature is 1.0; and the default cooling rate is 0.0, which causes BoltzmanSelection to behave exactly like FitProportionateSelection.

BoltzmanSelection’s default base is select.boltzman.

(It’s called FitProportionateSelection rather than FitnessProportionateSelection for a historical reason: MacOS 9 didn’t allow filenames longer than 32 characters, and FitnessProportionateSelection.class is 35 characters long.)

•

ec.select.GreedyOverselection is a variation of Fitness-Proportionate Selection which was common in

the early genetic programming community (see Section 5.2), but no longer. The Individuals are sorted

and divided into the “ﬁtter” and “less ﬁt” groups. With a certain probability the “ﬁtter” individuals

will be selected (using Fitness-Proportionate Selection), else the “less ﬁt” individuals will be selected

(also using Fitness-Proportionate Selection). Fitnesses must be non-negative. To specify that the “ﬁtter”

group is 25% of the Subpopulation, and that individuals are chosen from it 40% of the time, you’d say:

base.top = 0.25

base.gets = 0.40

GreedyOverselection’s default base is select.greedy.
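As a rough picture of the two-group procedure, here is a standalone Java sketch with hypothetical names. It is not ECJ’s implementation. In particular, how the size of the “fitter” group is rounded, and what happens when a group’s fitnesses are all zero, are assumptions made here. It also assumes at least one individual.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.Random;

// Hypothetical illustration only; this class is not part of ECJ.
class GreedyOverselectionSketch {
    // Returns the index of the selected individual.
    static int select(double[] fitnesses, double top, double gets, Random rng) {
        Integer[] order = new Integer[fitnesses.length];
        for (int i = 0; i < order.length; i++) order[i] = i;
        // Sort indices from fittest to least fit.
        Arrays.sort(order, Comparator.comparingDouble((Integer i) -> -fitnesses[i]));

        // Size of the "fitter" group; rounding and the minimum of 1 are assumptions.
        int fitter = Math.min(order.length, Math.max(1, (int) Math.round(top * order.length)));
        boolean useFitter = fitter == order.length || rng.nextDouble() < gets;
        int from = useFitter ? 0 : fitter;
        int to = useFitter ? fitter : order.length;
        return order[roulette(fitnesses, order, from, to, rng)];
    }

    // Fitness-proportionate pick among order[from..to-1]; returns a position in order.
    static int roulette(double[] fitnesses, Integer[] order, int from, int to, Random rng) {
        double total = 0.0;
        for (int k = from; k < to; k++) total += fitnesses[order[k]];
        if (total == 0.0) return from + rng.nextInt(to - from);  // all zero: pick uniformly
        double threshold = rng.nextDouble() * total;
        double running = 0.0;
        for (int k = from; k < to; k++) {
            running += fitnesses[order[k]];
            if (threshold < running) return k;
        }
        return to - 1;  // guard against floating-point round-off
    }

    public static void main(String[] args) {
        double[] fitnesses = {1, 2, 3, 4, 50, 60, 70, 80};
        Random rng = new Random(42);
        int fromFitterGroup = 0;
        // top = 0.25 makes indices 6 and 7 the "fitter" group; gets = 0.40.
        for (int i = 0; i < 10000; i++)
            if (select(fitnesses, 0.25, 0.40, rng) >= 6) fromFitterGroup++;
        System.out.println("fraction from the fitter group: " + fromFitterGroup / 10000.0);
    }
}
```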

•

ec.select.TournamentSelection first chooses $T$ individuals entirely at random with replacement (thus the same Individual may be chosen more than once). These individuals are known as the tournament, and $T$ is the tournament size. Then from among those $T$ it returns the fittest (or least fit, a parameter setting) Individual, breaking ties randomly. $T$ is often an integer but in fact it doesn’t have to be: it can be any real-valued number $> 0$. If $T$ isn’t an integer, it’s interpreted as follows: with probability $T - \lfloor T \rfloor$ we choose $\lceil T \rceil$ individuals at random, else we choose $\lfloor T \rfloor$ individuals at random. Fitnesses must be non-negative. The most common setting for $T$ is 2. To use 2, and return the fittest individual rather than the least-fit one, say:

base.size = 2

base.pick-worst = false

By default, pick-worst is false, so the second parameter is redundant here.

TournamentSelection’s default base is select.tournament.
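The real-valued tournament size is the only subtle part of this procedure, so here is a standalone Java sketch of it (hypothetical names, not ECJ code). Two simplifications are assumed: a tie keeps the earlier entrant rather than being broken by a separate random draw, and at least one entrant is always drawn.

```java
import java.util.Random;

// Hypothetical illustration only; this class is not part of ECJ.
class TournamentSizeSketch {
    // Number of entrants for one tournament of (possibly fractional) size T.
    static int drawSize(double size, Random rng) {
        int floor = (int) Math.floor(size);
        double fraction = size - floor;  // T - floor(T); zero when T is an integer
        return (rng.nextDouble() < fraction) ? floor + 1 : floor;
    }

    // Returns the index of the tournament winner.
    static int select(double[] fitnesses, double size, boolean pickWorst, Random rng) {
        int entrants = drawSize(size, rng);
        int winner = rng.nextInt(fitnesses.length);  // first entrant (always drawn)
        for (int i = 1; i < entrants; i++) {
            int challenger = rng.nextInt(fitnesses.length);  // with replacement
            boolean better = pickWorst
                ? fitnesses[challenger] < fitnesses[winner]
                : fitnesses[challenger] > fitnesses[winner];
            // Simplification: on a tie the earlier entrant is kept.
            if (better) winner = challenger;
        }
        return winner;
    }

    public static void main(String[] args) {
        Random rng = new Random(1);
        int threes = 0;
        for (int i = 0; i < 10000; i++)
            if (drawSize(2.25, rng) == 3) threes++;
        System.out.println("size 2.25 drew 3 entrants in " + threes + " of 10000 tournaments");
    }
}
```

With a size of 2.25, roughly a quarter of the tournaments should have three entrants and the rest two.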

•

ec.select.BestSelection gathers the best or worst $N$ individuals in the population. It then uses a tournament selection of size $T$ to select, restricted to just those $N$. The tournament selection procedure works just like ec.select.TournamentSelection. If the $N$ worst individuals were gathered, then the tournament will pick the worst in the tournament.

This could be used in various ways. Continuing the example above, to use a value of $T = 2$, selecting among the best 15 individuals in the population (say), we could say:

base.n = 15

base.size = 2

base.pick-worst = false

We could also use this to always pick the single worst individual in the population:

base.n = 1

base.size = 1

base.pick-worst = true

Or we could also use this to pick randomly among the best 100 individuals in the population, in a kind of poor-man’s $(\mu, \lambda)$ Evolution Strategy (see Section 4.1.2):

base.n = 100

base.size = 1

base.pick-worst = false

Speaking of Evolution Strategies, you could also do a kind of poor-man’s $(\mu + \lambda)$ as well by including those top 100 individuals as elites:

base.n = 100

base.size = 1

base.pick-worst = false

breed.elite.0 = 100

If you don’t like specifying $N$ as a fixed value, you also have the option of specifying it as a fraction of the population:

base.n-fraction = 0.1

You can still use this to do a poor-man’s $(\mu + \lambda)$ because elitism can likewise be defined this way:

base.n-fraction = 0.1

base.size = 1

base.pick-worst = false

breed.elite-fraction.0 = 0.1

•

Finally, ec.select.MultiSelection is a special version of a SelectionMethod with $N$ other SelectionMethods as sources. Each time it must produce an individual, it picks one of these SelectionMethods at random (using certain probabilities) and has it produce the Individual instead. To set up MultiSelection with two sources, TournamentSelection (chosen 60% of the time) and FitProportionateSelection (chosen 40% of the time), you’d say:

base.num-selects = 2

base.select.0 = ec.select.TournamentSelection

base.select.0.prob = 0.60

base.select.1 = ec.select.FitProportionateSelection

base.select.1.prob = 0.40

MultiSelection’s default base is select.multiselect.
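Picking a source “at random using certain probabilities” amounts to a weighted choice. Here is a standalone Java sketch of one common way to do it (hypothetical names, not ECJ code). Dividing by the total, so that probabilities which don’t sum to 1 still work, is an assumption of this sketch.

```java
import java.util.Random;

// Hypothetical illustration only; this class is not part of ECJ.
class WeightedSourceSketch {
    // Maps a uniform draw u in [0, 1) to a source index, weighted by probs.
    static int pick(double[] probs, double u) {
        double total = 0.0;
        for (double p : probs) total += p;
        double threshold = u * total;
        double running = 0.0;
        for (int i = 0; i < probs.length; i++) {
            running += probs[i];
            if (threshold < running) return i;
        }
        return probs.length - 1;  // guard against floating-point round-off
    }

    public static void main(String[] args) {
        double[] probs = {0.60, 0.40};  // select.0.prob and select.1.prob from the example
        Random rng = new Random(1);
        int[] counts = new int[probs.length];
        for (int i = 0; i < 10000; i++) counts[pick(probs, rng.nextDouble())]++;
        System.out.println("select.0 chosen " + counts[0] + " times, select.1 chosen " + counts[1]);
    }
}
```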

3.5.3 BreedingPipelines

BreedingPipelines (ec.BreedingPipeline) take Individuals from sources, typically modify them in some way,

and hand them off. Some BreedingPipelines are mutation or crossover operators; others are more mundane

utility pipelines. BreedingPipelines specify the required number of sources they use with the following

method:

public abstract int numSources();

This method must return a value $\geq 0$, or it can return the value BreedingPipeline.DYNAMIC_SOURCES, which indicates that the BreedingPipeline can vary its number of sources, and that the user must specify the number of sources with a parameter like this:

base.num-sources = 3

Note: if you use BreedingPipeline.DYNAMIC_SOURCES, then in the BreedingPipeline’s setup(...) method you will probably want to check that the number of sources specified by the user is acceptable. You can do this by checking sources.length. For example, the user can specify 0 as the number of sources, which for most pipelines will make little sense.

At any rate, the user speciﬁes each source with a parameter. For example, to stipulate sources 0, 1, and 2,

you might say:

base.source.0 = ec.select.TournamentSelection

base.source.1 = ec.select.TournamentSelection

base.source.2 = ec.select.GreedyOverselection

A diversion. One trick available to you is to state that a source is the same source as a previous one using a special value called same. For example, in the example above two TournamentSelection operators are created. But if you said the following instead:

base.source.0 = ec.select.TournamentSelection

base.source.1 = same

base.source.2 = ec.select.GreedyOverselection

...then sources 0 and 1 will be the exact same object. This allows you to create a (very simple) Directed

Acyclic Graph, or DAG, rather than a tree. There is an even more sophisticated facility for creating DAGs

called StubPipeline in Section 3.5.3.2. We will not talk about it further here except to say that there is an

additional option available to say:

base.source.0 = ec.select.TournamentSelection

base.source.1 = stub

base.source.2 = ec.select.GreedyOverselection

... and StubPipeline will fill the stub in with a pointer to a different pipeline. Note that if you see stub, it must be the case that StubPipeline is somewhere in the ancestors of that parameter’s BreedingPipeline.

Getting back on track now. At any rate, the sources are then stored in the following instance variable:

public BreedingSource[] sources;

Some BreedingPipelines, like crossover pipelines, have a very speciﬁc number of children they produce

by default (the value returned by typicalIndsProduced()). However many others (mutation operators, etc.)

simply return whatever Individuals they receive from their sources. For these, BreedingPipeline has a

default implementation of typicalIndsProduced() which should work ﬁne: it simply calls typicalIndsProduced()

on all of its sources, and returns the minimum. This computation is done via a simple utility function,

minChildProduction(), one of two such methods which might be useful to you:

public int minChildProduction();

public int maxChildProduction();

BreedingPipeline has default implementations of the produces(...), prepareToProduce(...), finishProducing(...), and preparePipeline(...) methods, all of which call the same methods on the BreedingPipeline’s children.

One ﬁnal option common to most BreedingPipelines which make modiﬁcations (mutation, crossover):

you can specify the probability that the pipeline will operate at all, or if Individuals will simply be passed

through. For example, let’s say you’re using a crossover pipeline of some sort, which creates two children

from its sources, then crosses them over and returns them. If you state:

base.likelihood = 0.8

...then with a 0.8 probability crossover will occur as normal. But with a 0.2 probability two Individuals from the sources will be simply copied and returned, with no crossover occurring.
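The effect of likelihood can be sketched without any ECJ machinery. The standalone Java example below uses hypothetical names, with integer arrays standing in for Individuals and a one-point crossover standing in for whatever the pipeline actually does. It either operates on the two copies or passes them through untouched. It assumes both parents have the same, non-zero length.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical illustration only; this class is not part of ECJ.
class LikelihoodSketch {
    // Returns two children of equal-length parents a and b.
    static int[][] breed(int[] a, int[] b, double likelihood, Random rng) {
        int[] childA = a.clone();  // children always start as copies
        int[] childB = b.clone();
        if (rng.nextDouble() < likelihood) {
            // Operate: one-point crossover, swapping everything from the cut point on.
            int cut = rng.nextInt(a.length);
            for (int i = cut; i < a.length; i++) {
                int tmp = childA[i];
                childA[i] = childB[i];
                childB[i] = tmp;
            }
        }
        return new int[][] {childA, childB};  // otherwise the copies pass through
    }

    public static void main(String[] args) {
        Random rng = new Random(5);
        int[] a = {0, 0, 0, 0, 0, 0};
        int[] b = {1, 1, 1, 1, 1, 1};
        for (int i = 0; i < 5; i++) {
            int[][] kids = breed(a, b, 0.8, rng);
            System.out.println(Arrays.toString(kids[0]) + " " + Arrays.toString(kids[1]));
        }
    }
}
```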

3.5.3.1 Implementing a Simple BreedingPipeline

To implement a BreedingPipeline, at a minimum, you’ll need to override two methods: numSources() and produce(...). numSources() is easy. Just return the number of sources your BreedingPipeline requires, or BreedingPipeline.DYNAMIC_SOURCES if the number can be any size specified by the user (0 and greater). For example, to make a mutation pipeline, we probably want a single source, which we’ll extract Individuals from and mutate:

public int numSources() { return 1;}

If you have chosen to return BreedingPipeline.DYNAMIC_SOURCES, you probably want to double-check that the number of sources the user has specified in the parameter file is valid for your pipeline. You can do this in setup(...). For example, the code below verifies that the value is not zero:

public void setup(final EvolutionState state, final Parameter base)

{

super.setup(state,base);

Parameter def = defaultBase();

if (sources.length == 0) // uh oh

state.output.fatal("num-sources must be > 0 for MyPipeline",

base.push(P_NUMSOURCES), def.push(P_NUMSOURCES));

...

}

Similarly if you have certain unusual constraints on the nature of your sources (that they are certain

classes, for example), you can double-check that in setup(...) too.

Now we need to implement the produce(...) method. Most mutation procedures ask their sources to produce some number of Individuals for them (however many the source prefers), and then mutate those Individuals and return them. We can ask our source to produce Individuals like this:

public int produce(int min, int max, int subpopulation, ArrayList<Individual> inds,

EvolutionState state, int thread, HashMap<String, Object> misc)

{

int start = inds.size(); // this tells us where we started adding individuals

int n = sources[0].produce(min,max,subpopulation,inds,state,thread,misc);

The source has taken the liberty of adding n individuals to the inds ArrayList. Next we need to decide whether to bother mutating at all, based on the likelihood parameter. We do a coin-flip to determine this:

// should we bother mutating at all?

if (!state.random[thread].nextBoolean(likelihood))

return n;

At this point we’re committed to mutating the n Individuals. How mutation occurs depends on the representation of the Individual of course, so we’ll fake it with a comment:

for(int q=start;q<n+start;q++)

{

// modify Individual inds.get(q) somehow

// we probably want to turn off its evaluated flag now, since it's no longer like its parent

inds.get(q).evaluated = false;

}

return n;

}

3.5.3.2 Standard Utility Pipelines

Most BreedingPipelines are custom for your representation: vectors and trees etc. all have their own special

ways of being crossed over or mutated. However there are some utility BreedingPipelines you should be

aware of, all stored in the ec.breed package:

•

ec.breed.ReproductionPipeline is by far the most common utility BreedingPipeline. In response to a request for $N$ individuals, ReproductionPipeline requests the same number from its single source, then simply returns them (copying if necessary). ReproductionPipeline has one rarely-used parameter, which indicates if it must copy the individuals even if it’s not required to maintain the copy-forward protocol:

base.must-clone = true

By default, must-clone is false.

•

Also common is ec.breed.MultiBreedingPipeline, which takes some $N$ sources (determined by the user) and when asked to produce Individuals, chooses randomly among its sources to produce the Individuals for it. It then returns those Individuals. This is a common BreedingPipeline used in genetic programming (Section 5.2). Recall that to stipulate the number of sources, you say:

base.num-sources = 2

Each source can be accompanied with a probability that this source will be chosen. For example, to state that the first source is a VectorCrossoverPipeline, chosen 90% of the time, and that the second is a ReproductionPipeline, chosen 10% of the time, we’d say something like:

base.source.0 = ec.vector.breed.VectorCrossoverPipeline

base.source.0.prob = 0.90

base.source.1 = ec.breed.ReproductionPipeline

base.source.1.prob = 0.10

You can also state that the number of Individuals returned by any source must be exactly the same —

speciﬁcally, the maximum that any one of them would return in response to a given request. For

example, if you had a Crossover pipeline (which normally returns 2 Individuals) and a Reproduction

pipeline (which normally returns 1 Individual), you could force both of them to return 2 Individuals if

called on. This is done by saying:

base.generate-max = true

By default, generate-max is true, so this is redundant. Note that in Genetic Programming it is common to have a single Crossover pipeline returning 2 Individuals and a Reproduction pipeline returning a single individual, so in the Genetic Programming parameter files, generate-max is set to false.

•

ec.breed.InitializationPipeline takes no sources at all: instead it simply generates new random individuals

and returns them. It always generates the maximum number requested by its parent. This pipeline is

useful for doing random search, for example.

•

ec.breed.BufferedBreedingPipeline buffers up Individual requests and then hands them out one by one. When you first call produce() on a BufferedBreedingPipeline, regardless of the number of Individuals requested, it will in turn demand some $N$ children from its single source. It then stores them in a buffer and hands them to this and later produce() requests until they are depleted, at which time it requests more, and so on. This value of $N$ is set like this:

base.num-inds = 10

Why would you want to do this? Primarily tricks like the following. Let’s say you want to create a

crossover operator which produces two children, which are then fed into another different crossover

operator and thus are crossed over again. Ordinarily you’d think you could do it along these lines:

pop.subpop.0.pipe.0 = ec.app.myCrossover

pop.subpop.0.pipe.0.source.0 = ec.app.myOtherCrossover

pop.subpop.0.pipe.0.source.1 = same

Looks good, right? Not so fast. The myCrossover class will request exactly one individual from each of

its sources. First it’ll request from source 0, which will cross over two children, return one, and throw

away the other. Then it’ll request from source 1, which will again produce two children, return one, and

throw away the other. As a result you’re not crossing over two individuals twice. You’re crossing over

different individuals which are the result of separate earlier crossovers.

But if you did it instead like this:

pop.subpop.0.pipe.0 = ec.app.myCrossover

pop.subpop.0.pipe.0.source.0 = ec.breed.BufferedBreedingPipeline

pop.subpop.0.pipe.0.source.1 = same

pop.subpop.0.pipe.0.source.0.num-inds = 2

pop.subpop.0.pipe.0.source.0.source.0 = ec.app.myOtherCrossover

pop.subpop.0.pipe.0.source.0.source.1 = same

Now myCrossover requests one child from BufferedBreedingPipeline, which in turn demands two children from myOtherCrossover, which crosses over two Individuals and returns them. BufferedBreedingPipeline returns one of the Individuals. Then myCrossover requests the second child, and BufferedBreedingPipeline returns the other Individual out of its buffer. Problem solved.
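The buffering behavior can be mimicked in a few lines of standalone Java. This is a sketch with hypothetical names, not ECJ’s BufferedBreedingPipeline: the source is modeled as a function which, asked for n children, returns exactly n (and at least one).

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical illustration only; this class is not part of ECJ.
class BufferSketch<T> {
    private final IntFunction<List<T>> source;  // returns exactly the number of children asked for
    private final int numInds;                  // plays the role of num-inds
    private final ArrayDeque<T> buffer = new ArrayDeque<>();

    BufferSketch(IntFunction<List<T>> source, int numInds) {
        this.source = source;
        this.numInds = numInds;
    }

    // Hands out one buffered child, refilling from the source only when empty.
    T produceOne() {
        if (buffer.isEmpty()) buffer.addAll(source.apply(numInds));
        return buffer.removeFirst();
    }

    public static void main(String[] args) {
        int[] crossovers = {0};
        BufferSketch<String> buffered = new BufferSketch<>(n -> {
            crossovers[0]++;
            List<String> children = new ArrayList<>();
            for (int i = 0; i < n; i++) children.add("crossover" + crossovers[0] + ".child" + i);
            return children;
        }, 2);
        // Two single-child requests are served by the same underlying crossover.
        System.out.println(buffered.produceOne());
        System.out.println(buffered.produceOne());
        System.out.println("crossovers performed: " + crossovers[0]);
    }
}
```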

•

ec.breed.RepeatPipeline takes a single source. After the method prepareToProduce(...) has been called, it

will request a single individual from its source. It will then provide only copies of that individual in

response to produce(...) requests. This is useful for selecting a single individual and then (for example)

mutating him many times to form children, as is often done in hill-climbing techniques. Warning:

in steady-state evolution, prepareToProduce(...) is only called once, at the beginning of the run. In

generational evolution, the method is called at the start of every generation.

•

ec.breed.ForceBreedingPipeline takes a single source and a parameter num-inds. In response to a request for at least $N$ individuals (minimum), ForceBreedingPipeline will return $N$ individuals, or num-inds individuals, whichever is larger. If num-inds is equal to or larger than $N$, ForceBreedingPipeline will demand exactly num-inds children from its breeding source, then copy (if necessary) and return them. If $N$ is larger, ForceBreedingPipeline will repeatedly request num-inds children, or (for the final request) fewer than num-inds, in order to satisfy the request for $N$. For example, if $N = 15$ and num-inds $= 4$, then ForceBreedingPipeline will request 4, then 4, then 4, then 3 Individuals, then return them all. The parameter of interest is set to some value like:

base.num-inds = 5

This pipeline gives you a way of forcing the child pipeline to provide an exact number of individuals at a time. This trick is sort of the counterpart to BufferedBreedingPipeline: it gives you a way of demanding a certain number of individuals from a pipeline (like a CrossoverPipeline) which doesn’t normally return that number.
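The sequence of requests ForceBreedingPipeline makes follows directly from the description above. Here is a standalone Java sketch of just that bookkeeping (hypothetical names, not ECJ code). It assumes num-inds is at least 1.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this class is not part of ECJ.
class ForceRequestSketch {
    // Sizes of the requests sent to the source for a minimum request of n individuals.
    static List<Integer> requestSizes(int n, int numInds) {
        if (numInds < 1) throw new IllegalArgumentException("num-inds must be >= 1");
        List<Integer> sizes = new ArrayList<>();
        if (numInds >= n) {
            sizes.add(numInds);  // one request for exactly num-inds children
            return sizes;
        }
        int remaining = n;
        while (remaining > 0) {
            int ask = Math.min(numInds, remaining);  // the final request may be smaller
            sizes.add(ask);
            remaining -= ask;
        }
        return sizes;
    }

    public static void main(String[] args) {
        // The example from the text: 15 individuals wanted, num-inds = 4.
        System.out.println(requestSizes(15, 4));  // 4, then 4, then 4, then 3
    }
}
```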

•

ec.breed.CheckingPipeline takes two sources, and tries some $N$ times to produce “valid” individuals from source 0. To verify validity, it calls the method allValid(...), which you override. If it fails after $N$ attempts, it instead produces individuals from source 1 and returns them. The value of $N$ is determined with the parameter:

base.num-times = 20

The allValid(...) method is deﬁned as:

ec.breed.CheckingPipeline Methods

public boolean allValid(Individual[] inds, int numInds, int subpopulation, EvolutionState state, int thread)

Returns whether all of the individuals in inds[0] ... inds[numInds - 1] are valid. Override this to customize how CheckingPipeline checks individuals. By default, this method simply returns true.

It may be the case that you wish to produce some $N$ individuals which are checked for validity independently of one another. Probably the easiest way to do this is to set the CheckingPipeline as a child of a ForceBreedingPipeline whose num-inds parameter has been set to 1. For example, you might do something like:

pop.subpop.0.pipe.0 = ec.app.myMutation

pop.subpop.0.pipe.0.source.0 = ec.breed.ForceBreedingPipeline

pop.subpop.0.pipe.0.source.0.num-inds = 1

pop.subpop.0.pipe.0.source.0.source.0 = ec.app.myCheckingBreedingPipeline

pop.subpop.0.pipe.0.source.0.source.0.num-times = 20

pop.subpop.0.pipe.0.source.0.source.0.source.0 = ...
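The retry-then-fall-back logic of CheckingPipeline can be sketched in standalone Java as follows. The names are hypothetical and nothing here is ECJ code: the two sources are modeled as suppliers of batches, and the validity check as a predicate standing in for allValid(...).

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;
import java.util.function.Predicate;
import java.util.function.Supplier;

// Hypothetical illustration only; this class is not part of ECJ.
class CheckingSketch {
    // Tries source0 up to numTimes; if no batch is valid, falls back to source1.
    static <T> List<T> produce(Supplier<List<T>> source0, Supplier<List<T>> source1,
                               Predicate<List<T>> allValid, int numTimes) {
        for (int attempt = 0; attempt < numTimes; attempt++) {
            List<T> candidates = source0.get();
            if (allValid.test(candidates)) return candidates;
        }
        return source1.get();  // fallback individuals are returned without checking
    }

    public static void main(String[] args) {
        Random rng = new Random(11);
        Supplier<List<Integer>> risky = () -> Arrays.asList(rng.nextInt(10), rng.nextInt(10));
        Supplier<List<Integer>> safe = () -> Arrays.asList(0, 0);
        Predicate<List<Integer>> allEven = batch -> batch.stream().allMatch(x -> x % 2 == 0);
        System.out.println(produce(risky, safe, allEven, 20));  // num-times = 20
    }
}
```

Note that the whole batch is accepted or rejected together, which is why the text suggests requesting one individual at a time when you want independent checks.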

•

ec.breed.UniquePipeline tries hard to guarantee that all individuals it produces are different from every individual in the past generation’s subpopulation. It does this each generation by building a set of the individuals in the previous generation’s subpopulation, then checks newly generated individuals against this set. If they match, it discards them and tries again.

UniquePipeline will try $N$ times to fill a certain requested number of individuals. If it fails after $N$ attempts, it will fill the remainder with non-unique generated individuals. The value of $N$ is determined with the parameter:

base.duplicate-retries = 20

How many individuals must be generated for UniquePipeline to be satisfied? If the following parameter is true, then UniquePipeline will always fill to the maximum number permitted; else it will fill to at least the minimum number, or whatever its subsidiary pipeline provides it. The default is false:

base.generate-max = true
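Here is a standalone Java sketch of the duplicate-rejection loop (hypothetical names, not ECJ code). It assumes one “attempt” means asking the source for however many individuals are still missing, and, following the description above, it checks candidates only against the previous generation’s set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Random;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical illustration only; this class is not part of ECJ.
class UniqueSketch {
    static <T> List<T> produce(int wanted, Set<T> previousGeneration,
                               Supplier<T> source, int duplicateRetries) {
        List<T> out = new ArrayList<>();
        for (int attempt = 0; attempt < duplicateRetries && out.size() < wanted; attempt++) {
            int missing = wanted - out.size();
            for (int i = 0; i < missing; i++) {
                T candidate = source.get();
                // Keep only candidates not seen in the previous generation.
                if (!previousGeneration.contains(candidate)) out.add(candidate);
            }
        }
        // Out of retries: fill the remainder without checking for duplicates.
        while (out.size() < wanted) out.add(source.get());
        return out;
    }

    public static void main(String[] args) {
        Set<Integer> previous = new HashSet<>(Arrays.asList(1, 2, 3));
        Random rng = new Random(2);
        System.out.println(produce(4, previous, () -> rng.nextInt(6), 20));  // duplicate-retries = 20
    }
}
```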

•

ec.breed.GenerationSwitchPipeline takes two sources. In response to a request for individuals, for generations 1 through $N - 1$ GenerationSwitchPipeline will request Individuals from source 0. For generations $N$ and on, GenerationSwitchPipeline will request Individuals from source 1. You specify the switch-generation $N$ as:

base.switch-at = 15

Like MultiBreedingPipeline, GenerationSwitchPipeline can guarantee that both of its sources always return the same number of individuals (the maximum of the two) with:

base.generate-max = true

By default, generate-max is true, so this example is redundant.

Figure 3.4 Stub Pipeline before (as specified in the parameter file) and after (as constructed). See text.

•

ec.breed.FirstCopyPipeline is a version of ReproductionPipeline but has two sources. Whenever prepareToProduce(...) is called, FirstCopyPipeline is reset. The immediate next time produce(...) is called, FirstCopyPipeline will use source #0 to fulfill the very first child produced. Thereafter it uses source #1 for all future children (and all future produce(...) calls). Calling prepareToProduce(...) resets it again. This is useful for performing a simple elitism in a pipeline rather than doing it at the Breeder level.

•

Last, ec.breed.StubPipeline is a version of ReproductionPipeline but includes a special subsidiary pipeline called a stub pipeline. If some BreedingPipeline which feeds into StubPipeline’s source #0 itself contains a source simply called stub, then that source location is filled with that StubPipeline’s stub pipeline. Each breeding thread gets its own copy of the stub pipeline, but within a thread stub pipelines are not copied. Using StubPipeline you can construct elaborate Directed Acyclic Graphs in your Pipeline.

For example, if we had:

pop.subpop.0.pipe.0 = ec.breed.StubPipeline

# Here’s the stub:

pop.subpop.0.pipe.0.stub-source = ec.breed.ReproductionPipeline

pop.subpop.0.pipe.0.stub-source.source.0 = ec.select.TournamentSelection

pop.subpop.0.pipe.0.source.0 = ec.vector.breed.VectorCrossoverPipeline

pop.subpop.0.pipe.0.source.0.source.0 = stub

pop.subpop.0.pipe.0.source.0.source.1 = ec.vector.breed.VectorMutationPipeline

pop.subpop.0.pipe.0.source.0.source.1.source.0 = stub

... then ECJ would fill the two stubs such that the stub lines would be more or less like this:

# First stub

pop.subpop.0.pipe.0.source.0.source.0 = ec.breed.ReproductionPipeline

pop.subpop.0.pipe.0.source.0.source.0.source.0 = ec.select.TournamentSelection

# Second stub -- note that these are the EXACT SAME OBJECTS as in the first stub!

pop.subpop.0.pipe.0.source.0.source.1.source.0 = ec.breed.ReproductionPipeline

pop.subpop.0.pipe.0.source.0.source.1.source.0.source.0 = ec.select.TournamentSelection

Figure 3.4 shows the process of building this pipeline from the stub declarations. The critical feature is

that the two stubs are replaced with the same object, not copies, and thus it’s a Directed Acyclic Graph.

The procedure is as follows. During setup(...) sources declared as a “stub” are not filled: they’re set to null. Then after a pipeline has been cloned and handed to a thread to begin breeding, but before produce(...) commences, the method BreedingSource.fillStubs(...) is called. This recursively traverses through the tree: whenever a StubPipeline receives this method, it fills all the null references in its children with its stub pipeline.

StubPipelines replace all the stubs down to the next StubPipeline (which gets to replace stubs of its own subtree). Additionally it’s possible that a stub-pipeline itself contains stubs. In this case, if a StubPipeline A contains a child or descendant StubPipeline B whose stub-pipeline contains stubs, they are filled with StubPipeline A’s stub-pipeline before control is handed off to StubPipeline B to start filling.

Figure 3.5 Two example Breeding Pipelines (Repeat of Figure 3.3).

3.5.4 Setting up a Pipeline

Setting up a pipeline using parameters sometimes isn’t entirely obvious. Let’s do the two examples shown

in Figure 3.5, in both cases setting up a pipeline for Subpopulation 0.

3.5.4.1 A Genetic Algorithm Pipeline

The left ﬁgure is a Breeding Pipeline common to the Genetic Algorithm: two Individuals are selected from

the old Subpopulation, then copied and crossed over, and the two children are then mutated and added to

the new Subpopulation. To build the pipeline, we work our way backwards: ﬁrst deﬁning the mutator as the

top element, then the crossover as its source, then the selector as the crossover’s two sources.

First the mutator. We’ll use ec.vector.breed.VectorMutationPipeline, a common mutator for vector individuals. Subpopulation 0’s species holds the prototypical pipeline:

pop.subpop.0.species.pipe = ec.vector.breed.VectorMutationPipeline

Next we define its sole source: the crossover operator (ec.vector.breed.VectorCrossoverPipeline):

pop.subpop.0.species.pipe.source.0 = ec.vector.breed.VectorCrossoverPipeline

Crossover has two sources. We’d like them to both be the Tournament Selector (they could be different Tournament Selectors; it doesn’t really matter):

pop.subpop.0.species.pipe.source.0.source.0 = ec.select.TournamentSelection

pop.subpop.0.species.pipe.source.0.source.1 = same

Tournament Selection has a tournament size parameter. Since we’re using the same selector for both sources, we only need to set it once:

pop.subpop.0.species.pipe.source.0.source.0.size = 2

We could also just set the default parameter for all tournament selectors:

select.tournament.size = 2

Perhaps we’d like the second source to use a tournament size of 4. To do this we’d need to use a separate

selector, so we could do this:

pop.subpop.0.species.pipe.source.0.source.1 = ec.select.TournamentSelection

pop.subpop.0.species.pipe.source.0.source.1.size = 4

This would also override the default we just set, so it’d work whether or not we used the default-setting

approach for source 0.

VectorCrossoverPipeline and VectorMutationPipeline, discussed later, have various parameters; to simplify the pipeline-building procedure, the ec.vector package (Section 5.1) puts these parameters in the Species, not the pipeline. For completeness’ sake, let’s include some of them here:

pop.subpop.0.species = ec.vector.FloatVectorSpecies

pop.subpop.0.species.crossover-type = one

pop.subpop.0.species.mutation-prob = 1.0

pop.subpop.0.species.mutation-type = gauss

pop.subpop.0.species.mutation-stdev = 0.01

3.5.4.2 A Genetic Programming Pipeline

The right ﬁgure is a typical Genetic Programming pipeline (see Section 5.2). We begin with the root:

pop.subpop.0.species.pipe = ec.breed.MultiBreedingPipeline

As discussed earlier, MultiBreedingPipeline can take any number of sources, so we have to specify it (to 2

here). We also need to state the sources and the probabilities for each source. We’ll do 10% Reproduction for

the ﬁrst source and 90% Genetic Programming Crossover for the second. We won’t require the two sources

to produce the same number of individuals:

pop.subpop.0.species.pipe.num-sources = 2

pop.subpop.0.species.pipe.generate-max = false

pop.subpop.0.species.pipe.source.0 = ec.breed.ReproductionPipeline

pop.subpop.0.species.pipe.source.0.prob = 0.10

pop.subpop.0.species.pipe.source.1 = ec.gp.koza.CrossoverPipeline

pop.subpop.0.species.pipe.source.1.prob = 0.90

In Genetic Programming we’d nearly always use Tournament Selection for all selectors. But we’ll do various selectors as shown in the Figure. First the Fitness Proportionate Selection source for the ReproductionPipeline:

pop.subpop.0.species.pipe.source.0.source.0 = ec.select.FitProportionateSelection

Next TournamentSelection (tournament size 7) as Crossover’s ﬁrst source:

pop.subpop.0.species.pipe.source.1.source.0 = ec.select.TournamentSelection

pop.subpop.0.species.pipe.source.1.source.0.size = 7

Last, Sigma Scaling Selection as Crossover’s second source:

pop.subpop.0.species.pipe.source.1.source.1 = ec.select.SigmaScalingSelection

pop.subpop.0.species.pipe.source.1.source.1.scaled-fitness-floor = 0.1

(The default setting for pop.subpop.0.species.pipe.source.1.source.1.scaled-fitness-floor is already 0.1, so it doesn’t really need to be set.)

3.6 Exchangers

An Exchanger is a subclass of ec.Exchanger, and is called both before and after breeding. Exchangers form

the basis of Island Models in ECJ and will be discussed in-depth in Section 6.2.

Besides setup(...), Exchangers have three basic functions, called at different times in the evolutionary

cycle:

public abstract Population preBreedingExchangePopulation(EvolutionState state);

public abstract Population postBreedingExchangePopulation(EvolutionState state);

public abstract String runComplete(EvolutionState state);

The ﬁrst method is called prior to breeding a population. It’s largely available for Island Models to ship

off members of the Population to remote ECJ processes. The second method is called immediately after

breeding a Population, and enables Island Models to import members from remote ECJ processes, possibly

displacing newly-bred individuals. The third method is called after preBreedingExchangePopulation(...) to determine whether or not the Exchanger thinks ECJ should shut down its process because some other process has found the optimal individual. To cause ECJ to shut down, return a String with a shutdown message of some sort (which will get printed out); otherwise return null.

Unless you’re doing Island Models, almost certainly you’ll use a default Exchanger called

ec.simple.SimpleExchanger which does nothing to the Population at all. To wit:

exch = ec.simple.SimpleExchanger

3.7 Statistics

ECJ provides a large number of statistics hooks, places where ECJ will call arbitrary methods on a Statistics object throughout the process. These hooks give you a chance to examine the current process at all sorts of stages and output useful logging information as you see fit. Statistics objects are subclasses of ec.Statistics and often follow one or more statistics forms which provide different alternative Statistics hooks. For example, ec.steadystate.SteadyStateStatisticsForm, discussed later in Section 4.2, stipulates hooks for Statistics in steady-state evolution. The Statistics class ec.simple.SimpleStatistics has hooks for both regular generational evolution (defined in ec.Statistics) and SteadyStateStatisticsForm. Another basic Statistics class, ec.simple.SimpleShortStatistics, has hooks only for generational evolution. Other Statistics classes exist, such as those found in Genetic Programming (Section 5.2).

You can have as many Statistics objects as you want, but one Statistics object (usually arbitrarily chosen)

must be the statistics root. To deﬁne an ec.simple.SimpleStatistics as the root, you say:

stat = ec.simple.SimpleStatistics

Statistics objects usually, but not always, have a file to which they log their statistics results. It’s common to stipulate that file as:

stat.file = $out.stat

This tells SimpleStatistics to write to a file called out.stat, located right where the user launched ECJ (for a reminder on the meaning of the “$”, see Section 2.1.2). If you are running with multiple jobs (Section 2.5), ECJ will automatically prepend the prefix “jobs.n.” to this filename, where n is the job number. Thus the statistics file for job number 5 will be “jobs.5.out.stat”. If no file is provided, SimpleStatistics will simply print out to the screen.

If ECJ is restarted from a checkpoint, SimpleStatistics will append to existing files rather than overwriting them. SimpleStatistics also has an option to compress the file using GZIP (and thus add a “.gz” suffix at the very end, as in “jobs.5.out.stat.gz”). Note that if this option is used, SimpleStatistics will simply overwrite the file if restarted from a checkpoint. The parameter is:

stat.gzip = true

For each generation, the SimpleStatistics object prints out the best Individual of the generation using the

printIndividualForHumans(...) method. For example, generation 0 might have:

Generation: 0

Best Individual:

Subpopulation 0:

Evaluated: T

Fitness: -1503.8322

2.4502202187677815 0.9879236448665667 0.7631586426217085 0.6854305126240172

If you would like to prevent these per-generation statistics from being written out to the log, you can say:

stat.do-generation = false

At the end of the run, the SimpleStatistics object prints out the best Individual of the run:

Best Individual of Run:

Subpopulation 0:

Evaluated: T

Fitness: -185.78166

-1.0393115193403102 -2.006026366200021 -0.03642166362331428 -1.1196984643947918

If you would like to prevent this information from being written to the log, say:

stat.do-final = false

At the end of the run, the SimpleStatistics object ﬁnally calls the describe(...) method on the Evaluator to

get it to write out a phenotypical description of the performance of the best individual of the run (see Section

3.4.1). Only some problems implement the describe(...) method: for example the Artiﬁcial Ant problem will

use it to print out a map of the food trail by the best discovered ant algorithm, such as:

Best Individual’s Map

=====================

acdgf.......f............m......

...h........g............l......

...i........h...........+k##++..

...j........i.yzabqdefghij...#..

...k........j.x...pon...#....#..

...lmnopqrstkvw....tmrqp+....+..

............l......ul..o.....#..

............mbazyxwvk..n.....+..

............n.......j..m.....+..

............o.......i..l.....#..

............p.......h..k.....+..

............q.......g..j.....+..

............r.......f..i.....#..

............s.......e..h.....+..

............t.......d..g++###+..

............u...+zabc..f........

............v...vy.....e........

............w...ux.....d........

............x...t......c#+++....

............y...s......b...#....

............z...r......a...+....

............a...q......z...+....

............b...p......y++#+....

...........tc...o......x........

.wvuts+####ud...n......w........

.x...r.....ve...m......v........

.y...qpo...wf...l......u........

.z.....n###xghijk......t........

.a.....m...y...........s........

.b.....l...z...........r........

.cdehijk..bcd..........q........

...fg.......e..........pon......

If you don’t want to see stuff like this at the end of your statistics ﬁle, say:

stat.do-description = false

SimpleStatistics can also call describe(...) to print out stuff like this every generation, for the best individual of that generation. For backwards-compatibility reasons, it’s turned off by default. To turn it on, say:

stat.do-per-generation-description = true

Finally, SimpleStatistics will also write out a message (see Section 2.2) to stdout, which prints to the

screen some short statistical information about each generation. For example:

Initializing Generation 0

Subpop 0 best fitness of generation Fitness: Standardized=63.0 Adjusted=0.015625 Hits=26

Generation 1

Subpop 0 best fitness of generation Fitness: Standardized=57.0 Adjusted=0.01724138 Hits=32

Generation 2

Subpop 0 best fitness of generation Fitness: Standardized=48.0 Adjusted=0.020408163 Hits=41

Generation 3

Subpop 0 best fitness of generation Fitness: Standardized=48.0 Adjusted=0.020408163 Hits=41

Generation 4

Subpop 0 best fitness of generation Fitness: Standardized=40.0 Adjusted=0.024390243 Hits=49

Generation 5

Subpop 0 best fitness of generation Fitness: Standardized=38.0 Adjusted=0.025641026 Hits=51

...

If you would prefer not to see this information, just say:

stat.do-message = false

... and what you’ll get on the screen will be:

Initializing Generation 0

Generation 1

Generation 2

Generation 3

Generation 4

Generation 5

...

3.7.1 Creating a Statistics Chain

So how do you add additional Statistics objects? As children of the root, or of one another. Any Statistics object can have any number of children. Children are called the same hooks as their parents are. To add another Statistics object (say, ec.simple.SimpleShortStatistics), we might add a child to the root:

stat.num-children = 1

stat.child.0 = ec.simple.SimpleShortStatistics

stat.child.0.file = $out2.stat

Notice that the ﬁle has changed: you don’t want both Statistics objects writing to the same ﬁle! If we

wanted to add a third Statistics object (say, another ec.simple.SimpleShortStatistics), we could do it this way:

stat.num-children = 2

stat.child.0 = ec.simple.SimpleShortStatistics

stat.child.0.file = $out2.stat

stat.child.1 = ec.simple.SimpleShortStatistics

stat.child.1.file = $out3.stat

...or we could do it this way:

stat.num-children = 1

stat.child.0 = ec.simple.SimpleShortStatistics

stat.child.0.file = $out2.stat

stat.child.0.num-children = 1

stat.child.0.child.0 = ec.simple.SimpleShortStatistics

stat.child.0.child.0.file = $out3.stat

The point is, you can hang a Statistics object as a child of any other Statistics object. Pick your poison.
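
To make the parent-child relationship concrete, here is a toy model of a statistics chain in plain Java. StatNode is a hypothetical stand-in for a Statistics object, not an ECJ class, and the order in which a node does its own work versus forwarding to its children is an arbitrary choice made for this sketch.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of a statistics chain. StatNode is a hypothetical stand-in for
// ec.Statistics, used only to show how one hook call reaches every child.
class StatNode {
    final String name;
    final List<StatNode> children = new ArrayList<>();

    StatNode(String name) { this.name = name; }

    // Adds a child and returns this node so that calls can be chained.
    StatNode addChild(StatNode child) { children.add(child); return this; }

    // A hook: record this node, then forward the same call to each child.
    void postEvaluation(List<String> trace) {
        trace.add(name);
        for (StatNode child : children) child.postEvaluation(trace);
    }

    public static void main(String[] args) {
        // Mirrors the nested layout: stat -> stat.child.0 -> stat.child.0.child.0
        StatNode root = new StatNode("stat")
            .addChild(new StatNode("stat.child.0")
                .addChild(new StatNode("stat.child.0.child.0")));
        List<String> trace = new ArrayList<>();
        root.postEvaluation(trace);
        System.out.println(trace);
    }
}
```

Whether the children hang in a flat list or a nested chain, a single hook call on the root visits every node exactly once.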

3.7.2 Tabular Statistics

SimpleShortStatistics writes out a different kind of statistics from SimpleStatistics. In its basic form, for each generation it writes out a line of the following values, each separated by a space.

1. The generation number

2. The mean ﬁtness of the entire population for this generation

3. The best ﬁtness of the entire population for this generation

4. The best ﬁtness of the entire population so far in the run

For example, we might have values like this...

0 -1851.9916400146485 -1559.68 -1559.68

1 -1801.2400487060547 -1557.7627 -1557.7627

2 -1758.2322434082032 -1513.4955 -1513.4955

3 -1715.5276463623047 -1420.0074 -1420.0074

4 -1675.379030883789 -1459.842 -1420.0074

5 -1637.332774291992 -1426.798 -1420.0074

...
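
Because each row is just space-separated numbers, the file is easy to post-process. The following is a minimal sketch of parsing one row of the basic four-column form; TabularRow is a hypothetical helper, not part of ECJ, and it assumes none of the optional columns have been turned on.

```java
// Minimal sketch: parse one line of the basic four-column output.
// TabularRow is a hypothetical helper, not part of ECJ.
class TabularRow {
    final int generation;
    final double meanFitness;
    final double bestOfGeneration;
    final double bestSoFar;

    TabularRow(String line) {
        // Split on runs of whitespace; the basic form has exactly four fields.
        String[] f = line.trim().split("\\s+");
        if (f.length != 4)
            throw new IllegalArgumentException("expected 4 columns, got " + f.length);
        generation = Integer.parseInt(f[0]);
        meanFitness = Double.parseDouble(f[1]);
        bestOfGeneration = Double.parseDouble(f[2]);
        bestSoFar = Double.parseDouble(f[3]);
    }

    public static void main(String[] args) {
        TabularRow row = new TabularRow("4 -1675.379030883789 -1459.842 -1420.0074");
        System.out.println("generation " + row.generation
            + ", best this generation " + row.bestOfGeneration
            + ", best so far " + row.bestSoFar);
    }
}
```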

You can add sizing information as well. If you set:

stat.child.0.do-size = true

... you’ll get

1. The generation number

2. (If do-size is true) The average size of an individual this generation

3. (If do-size is true) The average size of an individual so far in the run

4. (If do-size is true) The size of the best individual this generation

5. (If do-size is true) The size of the best individual so far in the run

6. The mean ﬁtness of the entire population for this generation

7. The best ﬁtness of the entire population for this generation

8. The best ﬁtness of the entire population so far in the run

You can also turn on timing information. If you set:

stat.child.0.do-size = true

stat.child.0.do-time = true

You’ll additionally get:

1. The generation number

2. (If do-time is true) How long initialization took in milliseconds, or how long the previous generation took to breed to form this generation

3. (If do-time is true) How long evaluation took in milliseconds this generation

4. (If do-size is true) The average size of an individual this generation

5. (If do-size is true) The average size of an individual so far in the run

6. (If do-size is true) The size of the best individual this generation

7. (If do-size is true) The size of the best individual so far in the run

8. The mean ﬁtness of the entire population for this generation

9. The best ﬁtness of the entire population for this generation

10. The best ﬁtness of the entire population so far in the run

Finally, you can add in per-subpopulation information. If you set:

stat.child.0.do-size = true

stat.child.0.do-time = true

stat.child.0.do-subpops = true

you get the whole shebang:

1. The generation number

2. (If do-time is true) How long initialization took in milliseconds, or how long the previous generation took to breed to form this generation

3. (If do-time is true) How long evaluation took in milliseconds this generation

4. Once for each subpopulation...

(a) (If do-size is true) The average size of an individual this generation for this subpopulation

(b) (If do-size is true) The average size of an individual so far in the run for this subpopulation

(c) (If do-size is true) The size of the best individual this generation for this subpopulation

(d) (If do-size is true) The size of the best individual so far in the run for this subpopulation

(e) The mean ﬁtness of the subpopulation for this generation

(f) The best ﬁtness of the subpopulation for this generation

(g) The best ﬁtness of the subpopulation so far in the run

5. (If do-size is true) The average size of an individual this generation

6. (If do-size is true) The average size of an individual so far in the run

7. (If do-size is true) The size of the best individual this generation

8. (If do-size is true) The size of the best individual so far in the run

9. The mean ﬁtness of the entire population for this generation

10. The best ﬁtness of the entire population for this generation

11. The best ﬁtness of the entire population so far in the run
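
If you are writing a script to read these files, it helps to know how many columns to expect. The following sketch derives the count from the lists above, under the assumption that those lists are exhaustive. StatColumns is a hypothetical helper, not part of ECJ.

```java
// Sketch of the column count implied by the lists in this section, assuming
// those lists are exhaustive. StatColumns is hypothetical, not part of ECJ.
class StatColumns {
    static int count(boolean doSize, boolean doTime, boolean doSubpops, int numSubpops) {
        int sizeCols = doSize ? 4 : 0;   // average, average so far, best, best so far
        int fitnessCols = 3;             // mean, best this generation, best so far
        int cols = 1;                    // the generation number
        if (doTime) cols += 2;           // initialization/breeding time, evaluation time
        if (doSubpops) cols += numSubpops * (sizeCols + fitnessCols);
        cols += sizeCols + fitnessCols;  // whole-population columns
        return cols;
    }

    public static void main(String[] args) {
        System.out.println(count(false, false, false, 1)); // basic form
        System.out.println(count(true, true, true, 1));    // everything on, one subpopulation
    }
}
```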

Restricting Rows with a Modulus

If you’ve got a lot of subpopulations, or very long runs, there’s another way you can reduce the size of your file: only output results every n generations. For example, if we had set

stat.child.0.modulus = 2

... then the aforementioned statistics would look like this:

0 -1851.9916400146485 -1559.68 -1559.68

2 -1758.2322434082032 -1513.4955 -1513.4955

4 -1675.379030883789 -1459.842 -1420.0074

...

An important thing to keep in mind: let’s say you run for 1000 generations and want statistics for the last one. Your generations are numbered 0 through 999. If you did a modulus of 100 (say), your reported results would be for generations 0, 100, 200, 300, 400, 500, 600, 700, 800, and 900. That is, you’d have lost the last 99 generations, including the final generation. How might you fix this? The easiest way is to simply run for 1001 generations. Then your last reported generation will be 1000.
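
The arithmetic can be checked with a few lines of Java. This sketch assumes a row is written exactly when the generation number is divisible by the modulus, which is consistent with the example output above. ModulusDemo is a hypothetical name, not part of ECJ.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of which generations are reported under a modulus, assuming a row
// is written exactly when generation % modulus == 0. Not part of ECJ.
class ModulusDemo {
    // Generations are numbered 0 through generations - 1.
    static List<Integer> reported(int generations, int modulus) {
        List<Integer> out = new ArrayList<>();
        for (int g = 0; g < generations; g++)
            if (g % modulus == 0) out.add(g);
        return out;
    }

    static int lastReported(int generations, int modulus) {
        List<Integer> r = reported(generations, modulus);
        return r.get(r.size() - 1);
    }

    public static void main(String[] args) {
        System.out.println(lastReported(1000, 100)); // run of 1000 generations
        System.out.println(lastReported(1001, 100)); // run of 1001 generations
    }
}
```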

3.7.3 Quieting the Statistics

Many statistics objects have options to prevent them from either writing to the screen, or creating statistics

logs, or both. Because statistics classes vary in the kinds of ﬁles they write, the options they have for quieting

them vary from class to class.

Basic subclasses of Statistics (such as ec.simple.SimpleStatistics, ec.simple.SimpleShortStatistics, and ec.gp.koza.KozaShortStatistics) allow you to do any of the following:

• Not print to the screen

• Not create statistics logs or print to them

• Both of the above

To do any of these with the ﬁrst Statistics object in the Statistics chain, you’d say:

# Pick one of these:

# Turn off printing to the screen

stat.silent.print = true

# Do not create statistics logs (don’t even open them)

stat.silent.file = true

# Do both of the above

stat.silent = true

Obviously for other Statistics objects in your chain, it’s slightly different. For example, to do the same to

the ﬁrst child object in the chain, you might say:

# Pick one of these:

# Turn off printing to the screen

stat.child.0.silent.print = true

# Do not create statistics logs (don’t even open them)

stat.child.0.silent.file = true

# Do both of the above

stat.child.0.silent = true

You should be aware of the important difference between these parameters and the “do-...” parameters discussed in Sections 5.2.3.5 and 3.7.2. The “do-...” parameters control what kinds of messages will be written out to logs or to the screen. The “silent” parameters are much cruder: they control whether anything will be written at all. Note that the “do-...” parameters don’t prevent statistics files from being opened or created, even if nothing is written to them. The “silent” parameters, however, will stop the files from even being opened or created in the first place.

Also note the relationship between these “silent” parameters and the silent parameter from Section 2.2.2. That second parameter cuts off the stderr and stdout streams entirely, preventing anything from being written to the screen; it takes precedence over the settings of the “silent” parameters in this section.

3.7.4 Implementing a Statistics Object

A basic Statistics object implements one or more of the following hooks:

public void preInitializationStatistics(EvolutionState state);

public void postInitializationStatistics(EvolutionState state); // Generational

public void preCheckpointStatistics(EvolutionState state);

public void postCheckpointStatistics(EvolutionState state);

public void preEvaluationStatistics(EvolutionState state); // Generational

public void postEvaluationStatistics(EvolutionState state); // Generational

public void prePreBreedingExchangeStatistics(EvolutionState state);

public void postPreBreedingExchangeStatistics(EvolutionState state);

public void preBreedingStatistics(EvolutionState state); // Generational

public void postBreedingStatistics(EvolutionState state); // Generational

public void prePostBreedingExchangeStatistics(EvolutionState state);

public void postPostBreedingExchangeStatistics(EvolutionState state);

public void finalStatistics(EvolutionState state, int result);

When these statistics hooks are called should be self-explanatory from the method name. Note that

the methods marked Generational are only called by generational EvolutionState objects — notably the

ec.simple.SimpleEvolutionState object. There are also some additional hooks called by the ec.steadystate

package for steady-state evolution (see Section 4.2.1).

The finalStatistics(...) method, called at the end of an evolutionary run, contains one additional argument, result. This argument will be either ec.EvolutionState.R_SUCCESS or ec.EvolutionState.R_FAILURE. Success simply means that the optimal individual was discovered, and nothing more.

Whenever you override one of these methods, make certain to call the superclass version of the method first, since that is what passes the hook on to any children. Let’s say that we’d like to know the size of the very first individual created after initialization. We might create a Statistics subclass which overrides postInitializationStatistics(...) to print this size out to the screen:

public void postInitializationStatistics(EvolutionState state)
    {
    super.postInitializationStatistics(state); // always call this
    // println(...) expects a String, so convert the size first; log 0 is stdout
    state.output.println("" + state.population.subpops.get(0).individuals.get(0).size(), 0);
    }

We could also write to a ﬁle, but to do so we’d need to determine the ﬁle name. We could do it in a

manner similar to SimpleStatistics (ignoring the compression):

// Requires java.io.File and java.io.IOException
public static final String P_STATISTICS_FILE = "file";
public int log = 0; // 0 by default means stdout

public void setup(final EvolutionState state, final Parameter base)
    {
    super.setup(state, base);
    File statisticsFile = state.parameters.getFile(base.push(P_STATISTICS_FILE), null);
    if (statisticsFile != null) try
        {
        log = state.output.addLog(statisticsFile, true, false, null, false);
        }
    catch (IOException i)
        {
        state.output.fatal("An IOException occurred trying to create the log "
            + statisticsFile + ":\n" + i);
        }
    // else we’ll just keep the log at 0, which is stdout
    }

Now we can write out to the log:

public void postInitializationStatistics(EvolutionState state)
    {
    super.postInitializationStatistics(state); // always call this
    // println(...) expects a String, so convert the size first
    state.output.println("" + state.population.subpops.get(0).individuals.get(0).size(), log);
    }

3.8 Debugging an Evolutionary Process

A hint. One helpful way to debug ECJ is via BeanShell (http://www.beanshell.org). This tool is essentially a

command-line for Java. Using it, we can create an ECJ process and get it going quite easily. Let’s say that the

parameter ﬁle is called ”foo.params”. We can type the following into BeanShell:

show();

import ec.*;

args = new String[] { "-file", "foo.params" };

database = Evolve.loadParameterDatabase(args);

state = Evolve.initialize(database, 0);

state.run(EvolutionState.C_STARTED_FRESH);

Evolve.cleanup(state);

This is basically a variation of the main code found in Section 2.5. When state.run(...) is executed, ECJ will go through the entire evolutionary run. It turns out that state.run(...) is really just a cover for three methods (at least, if we’re starting from scratch rather than from a checkpoint). They are: first calling state.startFresh(), then repeatedly calling state.evolve(), and when it returns something other than state.R_NOTDONE, finally calling state.finish(...), passing in what was returned. In other words:

show();

import ec.*;

args = new String[] { "-file", "foo.params" };

database = Evolve.loadParameterDatabase(args);

state = Evolve.initialize(database, 0);

state.startFresh();

result = EvolutionState.R_NOTDONE;

while( result == EvolutionState.R_NOTDONE )

{

result = state.evolve();

}

state.finish(result);

Evolve.cleanup(state);

So how can we use this for debugging? Well, BeanShell gives us full access to Java. Let’s try it on the

ecsuite.params ﬁle. We begin with the preliminaries:

show();

import ec.*;

args = new String[] { "-file", "ecsuite.params" };

BeanShell responds with:

<[Ljava.lang.String;@39617189>

This is the String array. We continue with:

database = Evolve.loadParameterDatabase(args);

BeanShell responds with the ParameterDatabase:

<{} : ({} : ({eval=ec.simple.SimpleEvaluator, pop.subpop.0=ec.Subpopulation,

quit-on-run-complete=true, generations=1000,

pop.subpop.0.species.pipe.source.0=ec.vector.breed.VectorCrossoverPipeline,

pop.subpop.0.species.min-gene=-5.12, eval.problem=ec.app.ecsuite.ECSuite,

state=ec.simple.SimpleEvolutionState, pop.subpop.0.species.mutation-type=gauss,

pop=ec.Population, pop.subpop.0.duplicate-retries=2, select.tournament.size=2,

pop.subpops=1, pop.subpop.0.species.mutation-stdev=0.01,

pop.subpop.0.species.pipe=ec.vector.breed.VectorMutationPipeline,

pop.subpop.0.species.max-gene=5.12, pop.subpop.0.species.pipe.source.0.source.1=same,

pop.subpop.0.species.pipe.source.0.source.0=ec.select.TournamentSelection,

pop.subpop.0.species=ec.vector.FloatVectorSpecies, breed=ec.simple.SimpleBreeder,

pop.subpop.0.species.mutation-prob=1.0, pop.subpop.0.species.genome-size=100,

pop.subpop.0.species.crossover-type=one, finish=ec.simple.SimpleFinisher,

parent.0=../../ec.params, init=ec.simple.SimpleInitializer,

pop.subpop.0.species.ind=ec.vector.DoubleVectorIndividual,

pop.subpop.0.species.fitness=ec.simple.SimpleFitness, pop.subpop.0.size=1000,

eval.problem.type=rastrigin, stat=ec.simple.SimpleStatistics,

exch=ec.simple.SimpleExchanger, stat.file=$out.stat} : ({checkpoint-modulo=1,

evalthreads=1, checkpoint=false, breedthreads=1, checkpoint-prefix=ec, seed.0=time})))>

Now we initialize the EvolutionState from the database:

state = Evolve.initialize(database, 0);

This causes ECJ to start printing to the screen something along these lines:

| ECJ

| An evolutionary computation system (version 19)

| By Sean Luke

| Contributors: L. Panait, G. Balan, S. Paus, Z. Skolicki, R. Kicinger, E. Popovici,

| K. Sullivan, J. Harrison, J. Bassett, R. Hubley, A. Desai, A. Chircop,

| J. Compton, W. Haddon, S. Donnelly, B. Jamil, and J. O’Beirne

| URL: http://cs.gmu.edu/~eclab/projects/ecj/

| Mail: ecj-help@cs.gmu.edu

| (better: join ECJ-INTEREST at URL above)

| Date: July 10, 2009

| Current Java: 1.6.0_20 / Java HotSpot(TM) 64-Bit Server VM-16.3-b01-279

| Required Minimum Java: 1.4

Threads: breed/1 eval/1

Seed: 1853290822

<ec.simple.SimpleEvolutionState@1f5b0afd>

Notice the last line — BeanShell is returning the EvolutionState. Now we ﬁre up the EvolutionState to

initialize the ﬁrst population:

state.startFresh();

... and get back:

Setting up

Initializing Generation 0

The ﬁrst population has been created. Let’s look at the ﬁrst Individual:

state.population.subpops.get(0).individuals.get(0);

This produces:

<ec.vector.DoubleVectorIndividual@87740549{699618236}>

Not very helpful. Instead, let’s have it printed:

state.population.subpops.get(0).individuals.get(0).printIndividualForHumans(state, 0);

This produces something more useful:

Evaluated: F

Fitness: 0.0

-4.3846934361930945 4.051323475292111 2.750742781209575 -2.1599970035296088

3.5139838195638236 -4.326431483145531 -1.5799722524229094 -4.64489169381555

-4.809825694271426 -0.6969239813124668 -4.322411553562226 4.8723307904232565

2.8978088843319947 -4.311437772193992 -1.556903048013028 2.876699531303326

-1.5461627480422133 -3.406470106152458 0.3510231690045371 -1.26870148662141

-2.9943682283832675 -1.1321325429409796 -4.780798908878881 -2.789054768098288

2.7957975471728034 -2.4529277934521363 0.06864524959557006 -2.807030901927618

-3.817734647565329 3.0018199187738803 3.893346256074625 -4.1700250768556355

-3.3035366716916714 -1.5300889532287534 -1.2924365390313826 2.6878356877535623

4.344108056131552 1.0732802812225044 1.804809997034555 0.6627493849916508

1.6556742582736854 -3.8324177646471913 0.2901815515514814 -0.5045301890375606

-2.755111883054377 -4.057309896490254 -2.097059222061862 -2.062611078568839

3.676980437590175 3.4010063830636517 5.001876654997903 2.3637174851440808

-2.3242430228722846 -0.2027490501614988 4.948796285958214 3.645393286308912

-0.9981883696957627 -2.4911201811073296 2.281601570422807 -3.0028177298996583

-0.6949487749058276 2.4115725052273005 2.2705630820859133 3.8198793397976756

3.927188087275849 3.5439728479577974 4.195897069928313 4.064291914283307

-1.6071055662352376 -0.45138576561254506 0.5382601925283925 2.2824947546503687

-0.0837300863613164 -2.4997930740673895 0.06696037058102089 1.782243737261787

-3.390249634178219 4.669336185081783 2.371290190775591 1.8743739255868377

0.13349732700681827 2.808175830805574 -4.2297879656940705 -0.5781599273148448

3.4174595199606577 2.5509508748123793 0.9574470878471297 1.181916131827328

3.3128918249657184 3.5085201808925843 -0.8921840350705308 -4.016933626993176

2.5591127486976983 1.580181276449899 -0.6102226049991097 1.0644092417475743

0.5897983455130262 2.5504671849586904 -2.230897886457403 1.8133759722806326

Now we’re getting somewhere. You could print out all the individuals like this:

for(int i = 0; i < state.population.subpops.get(0).individuals.size(); i++ )
    state.population.subpops.get(0).individuals.get(i).printIndividualForHumans(state, 0);

... but I won’t torture you with what comes out. Instead, notice that the Individual has not been evaluated

yet — just initialized with the Population. We go through one evolutionary loop — evaluating the individuals

and breeding a new Population — by calling

state.evolve();

This produces:

Subpop 0 best fitness of generation: Fitness: -1493.534

<2>

The first line was printed to the screen. The second line indicates what state.evolve() returned. 2 is the value of EvolutionState.R_NOTDONE, indicating that the process needs to be evolved more.

Let’s look at that ﬁrst individual again:

state.population.subpops.get(0).individuals.get(0).printIndividualForHumans(state, 0);

...yielding...

Evaluated: F

Fitness: -1806.1432

-1.5595820609909528 -2.9941135630034292 -3.188550961391961 0.8673223056511647

2.4308132097811472 -3.6298006589453533 -4.62193495641744 1.7381186900517611

-3.2707539202577953 -2.8517369832386144 -4.701099579700639 1.1683479248633841

0.10118833856168477 2.7982137159130787 -1.3673458253800685 -4.548719487000453

-1.7852252742508177 -1.662999422245311 -4.891889992368657 2.0689413066938824

4.64815452362056 4.03620579726471 -2.6065781548997413 2.8384398494616585

-1.6231723965539844 -0.19641152832494305 -0.8025430015631594 -4.337733534634894

-4.259188069209607 0.0974585410674078 4.878006291864429 4.187577755641656

3.9507153065207605 -3.3456633008586922 3.7163666200189596 -0.7581028665673978

4.28299933455259 1.8522464455693997 4.4324032846812935 -1.3209545115697914

4.239911043319335 -2.7741200087506352 -3.181419981396656 -0.4574562816089688

3.9209870275697982 0.31049605413333237 -0.46868091240064014 -4.570530964131764

-0.9126484738704782 3.6348709305820153 -1.800821491837854 -0.8548399118205554

-1.6874962921883667 2.628667604603462 0.060377157894385663 3.194354857448187

1.2106237734207714 -3.477534436566739 1.919326547065771 -3.74880517912247

4.076653684533312 -2.9153006121227034 -2.4460232838375973 0.6128610868842217

0.7785108819209824 -1.213371979065718 3.2441049504290587 -1.352037820951835

1.151316091162472 0.3915293759690397 -0.15229424767569708 1.8192706794904545

-3.057866603248519 -3.2217378304635926 3.7963181147558447 1.9609441782591566

2.1365399986514815 -0.7608502832241196 -1.2202190662246202 -3.2592371482282956

-2.612971504172355 3.1496849987738167 -5.083084415090031 -4.243405086300351

3.8516939433487387 -4.87008846122508 1.012854792831603 3.77728764346906

2.843506550933032 4.705462097924235 1.4291349248648448 3.8398215224809875

1.1776568359195472 -4.784524531392207 2.765230136436807 -2.6521295800350555

-2.271480494878218 -2.018481022639772 -2.2536397207045686 -1.5048357519436404

It’s a different Individual: the next generation one to be exact. Notice that although it has a “ﬁtness”, in

fact it’s not been evaluated: the “ﬁtness” is just nonsense cloned from a previous Individual.

So how do you see a population with ﬁtness-evaluated Individuals? Generational ECJ EvolutionState

processes assess the ﬁtness of a Population, then create a new Population and throw away the old one. You

can hold onto the old Population pretty easily. Just do this:

p = state.population;

state.evolve();

Now p holds the old Population, filled with now-evaluated Individuals, and state.population holds the next-generation Population, which hasn’t been evaluated yet. For example, if you say:

p.subpops.get(0).individuals.get(0).printIndividualForHumans(state, 0);

... you will get back something like this:

Evaluated: T

Fitness: -1779.4391