SLi M Manual

User Manual:
Open the PDF directly: View PDF .
Page Count: 517
Download
Open PDF In Browser	View PDF
SLiM: An Evolutionary Simulation Framework
Benjamin C. Haller and Philipp W. Messer
Dept. of Biological Statistics and Computational Biology
Cornell University, Ithaca, NY 14853
Correspondence: bhaller@benhaller.com

Last revised 29 January 2019, for SLiM version 3.2.1.

Author Contributions:
SLiM 2 and later were conceived and designed by BCH and PWM, based upon the previous design of SLiM
by PWM. BCH designed and implemented the Eidos scripting language, wrote almost all of the code for
SLiM 2 and later (but see the acknowledgements below), and wrote this manual. PWM provided feedback
and edited this manual.

Acknowledgements:
The authors want to thank Andrew Sackman, Aaron Sams, Nathan Oakes, Madeline Kwicklis, Jamal
Elkhader, and all members of the Messer lab for helpful feedback and bug reports. Thanks to Kevin Thornton
and Ryan Hernandez for many discussions and for their general help in promoting forward population
genetic simulations. Thanks to Jared Galloway, Jerome Kelleher, and Peter Ralph for their considerable work
in implementing tree-sequence recording in SLiM 3. Thanks to Simon Aeschbacher, Jorge Amaya, Bill Amos,
Chenling Antelope, Jaime Ashander, Hannes Becher, Emma Berdan, Jeremy Berg, Tom Booker, Gideon
Bradburd, Yoann Buoro, Deborah Charlesworth, Jeremy Van Cleve, Jean Cury, Michael DeGiorgio, A.P. Jason
de Koning, Emily Dennis, Jordan Rohmeyer Dherby, Jared Galloway, Jesse Garcia, Kimberley Gilbert,
Alexandre Harris, Kelley Harris, Rebecca Harris, Matthew Hartfield, Ding He, Kathryn Hodgins, Christian
Huber, Melissa Jane Hubisz, Emilia Huerta-Sanchez, Jacob Malte Jensen, Peter Keightley, Jerome Kelleher,
Andy Kern, Bhavin Khatri, Bernard Kim, Athanasios Kousathanas, Chris Kyriazis, Benjamin Laenen, Áki
Láruson, Stefan Laurent, Eugenio Lopez, Kathleen Lotterhos, Andrew Marderstein, Sebastian Matuszewski,
Mikhail Matz, Rupert Mazzucco, Maéva Mollion, Miguel Navascués, Dominic Nelson, Bruno Nevado,
Etsuko Nonaka, Greg Owens, Harvinder Pawar, Martin Petr, Denis Pierron, Fernando Racimo, Peter Ralph,
David Rinker, Murillo Fernando Rodrigues, Andrew Sackman, Kieran Samuk, Derek Setter, Onuralp
Söylemez, Stefan Strütt, Rob Unckless, Christos Vlachos, Silu Wang, Aaron Wolf, Yan Wong, and Justin Yeh
for comments and feedback that has led to improvements in SLiM. Thanks also to everyone on
stackoverflow, an invaluable resource and a great community. Finally, we want to thank Dmitri Petrov,
whose support was instrumental in the initial conception of SLiM.

Citation:
To cite SLiM 3 in a publication, please cite:
Haller, B.C., and Messer, P.W. (2017). SLiM 2: Flexible, interactive forward genetic simulations.
Molecular Biology and Evolution (early access). DOI: https://doi.org/10.1093/molbev/msy228
Papers using tree-sequence recording should perhaps also cite that paper:
Haller, B.C., Galloway, J., Kelleher, J., Messer, P.W., & Ralph, P.L. (2018). Tree-sequence recording in
SLiM opens new horizons for forward-time simulation of whole genomes. Molecular Ecology
Resources (early access). DOI: https://doi.org/10.1111/1755-0998.12968
Papers which only use SLiM 2 can still cite that paper:
Haller, B.C., and Messer, P.W. (2017). SLiM 2: Flexible, interactive forward genetic simulations.
Molecular Biology and Evolution 34(1), 230–240. DOI: http://dx.doi.org/10.1093/molbev/msw211
Papers which wish to cite this manual (perhaps because they make reference to a recipe) should cite:
Haller, B.C., and Messer, P.W. (2016). SLiM: An Evolutionary Simulation Framework.
URL: http://benhaller.com/slim/SLiM_Manual.pdf
And if you wish to cite a publication about Eidos, please cite the Eidos manual:
Haller, B.C. (2016). Eidos: A Simple Scripting Language.
URL: http://benhaller.com/slim/Eidos_Manual.pdf

2

URL:
http://messerlab.org/slim

License:
Copyright © 2016–2019 Philipp Messer. All rights reserved.
SLiM is a free software: you can redistribute it and/or modify it under the terms of the GNU General Public
License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any
later version.

Disclaimer:
The program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even
the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License (http://www.gnu.org/licenses/) for more details.

3

Contents
PART I: THE SLIM COOKBOOK
1. SLiM overview .................................................................................................................................
12
1.1 Introduction .......................................................................................................................... 12
1.2 Why SLiM? ............................................................................................................................
13
1.3 A quick summary of SLiM .....................................................................................................
14
1.4 The typical SLiM usage pattern .............................................................................................. 17
1.5 Conceptual overview ............................................................................................................ 18
1.5.1 Individuals and genomes ........................................................................................... 18
1.5.2 Mutations and substitutions .......................................................................................
20
1.5.3 Mutation stacking ......................................................................................................
22
1.5.4 Genomic elements, genomic element types, mutation types, and the chromosome ........26
1.5.5 Subpopulations and migration ...................................................................................
27
1.5.6 Other concepts ..........................................................................................................
28
1.6 Wright-Fisher (WF) versus non-Wright-Fisher (nonWF) models .............................................. .....29
1.7 Tree-sequence recording ....................................................................................................... 31
1.8 Online resources for SLiM users ............................................................................................ .....34
2. Installation ....................................................................................................................................... 36
2.1 Installation on Mac OS X ....................................................................................................... 36
2.1.1 Installing the prebuilt SLiM package on Mac OS X ..........................................................36
2.1.2 Building SLiM from sources on Mac OS X .................................................................
36
2.2 Installation on Linux and other Un*x platforms .....................................................................
40
2.3 Installation on non-Un*x platforms........................................................................................
42
2.4 Testing the SLiM installation .................................................................................................. 42
3. Running simulations in SLiMgui ......................................................................................................
43
3.1 The SLiMgui simulation window ........................................................................................... 43
3.2 The script help window ......................................................................................................... 44
3.3 The Eidos console ................................................................................................................. 45
3.4 The Eidos variable browser .................................................................................................... 46
3.5 Automatic code completion and command syntax lookup ....................................................
46
3.6 Automated script generation .................................................................................................. 48
3.7 Script prettyprinting...............................................................................................................
51
3.8 Further SLiMgui features ........................................................................................................ 52
4. Getting started: Neutral evolution in a panmictic population ........................................................... 53
4.1 A basic neutral simulation .....................................................................................................
53
4.1.1 initialize() callbacks ................................................................................................... 54
4.1.2 Mutation rate ............................................................................................................. 55
4.1.3 Mutation types ........................................................................................................... 55
4.1.4 Genomic element types .............................................................................................
57
4.1.5 Genomic elements .................................................................................................... 58
4.1.6 Recombination rate ................................................................................................... 59
4.1.7 Eidos events...............................................................................................................
60
4.1.8 Subpopulations..........................................................................................................
60
4.1.9 Executing the simulation............................................................................................
61

4

4.2

5.

6.

7.

8.

Basic output .......................................................................................................................... 62
4.2.1 Entire population ....................................................................................................... 62
4.2.2 Random population sample .......................................................................................
64
4.2.3 Sampling individuals rather than genomes .................................................................
66
4.2.4 Substitutions ..............................................................................................................
67
4.2.5 Custom output with Eidos .......................................................................................... 68
4.2.6 The simulation endpoint ............................................................................................
70
Demography and population structure.............................................................................................
72
5.1 Subpopulation size ................................................................................................................ 72
5.1.1 Instantaneous changes ....................................................................................................72
5.1.2 Exponential growth .................................................................................................... 72
5.1.3 The population visualization graph ............................................................................ 76
5.1.4 Cyclical changes........................................................................................................ 77
5.1.5 Context-dependent changes: Muller’s Ratchet ............................................................
77
5.2 Population structure ..............................................................................................................
78
5.2.1 Adding subpopulations .............................................................................................. 78
5.2.2 Removing subpopulations.......................................................................................... 80
5.2.3 Splitting subpopulations ............................................................................................
81
5.3 Migration and admixture ....................................................................................................... 81
5.3.1 A linear island model ................................................................................................
81
5.3.2 A non-spatial metapopulation....................................................................................
83
5.3.3 A two-dimensional subpopulation matrix .................................................................. 84
5.3.4 A random, sparse spatial metapopulation .................................................................. 85
5.3.5 Reading a migration matrix from a file ....................................................................... 88
5.4 The Gravel et al. (2011) model of human evolution ............................................................... 90
5.5 Rescaling population sizes to improve simulation performance ..................................................92
Sexual reproduction ........................................................................................................................
96
6.1 Recombination ...................................................................................................................... 96
6.1.1 Crossing over: Making a random recombination map .....................................................96
6.1.2 Crossing over: Reading a recombination map from a file ........................................... .....97
6.1.3 Gene conversion ....................................................................................................... 99
6.2 Separate sexes .......................................................................................................................
99
6.2.1 Enabling separate sexes .............................................................................................
99
6.2.2 Sex ratios ................................................................................................................... 100
6.2.3 Modeling sex-chromosome evolution ........................................................................ 101
6.3 Selfing and cloning ................................................................................................................ 103
6.3.1 Selfing in hermaphroditic populations ....................................................................... 103
6.3.2 Cloning ..................................................................................................................... 103
Mutation types, genomic elements, and chromosome structure ....................................................... 105
7.1 Mutation types and fitness effects .......................................................................................... 105
7.2 Genomic element types......................................................................................................... 107
7.3 Chromosome organization .................................................................................................... 108
7.4 Custom display colors in SLiMgui .......................................................................................... 110
SLiMgui visualizations for polymorphism patterns ........................................................................... 114
8.1 Mutation frequency spectra ................................................................................................... 114
8.2 Mutation frequency trajectories ............................................................................................. 115
8.3 Times to fixation and loss ...................................................................................................... ...116

5

9.

10.

11.

12.

13.

8.4 Population fitness over time ................................................................................................... 117
Context-dependent selection using fitness() callbacks ...................................................................... 118
9.1 Temporally varying selection ................................................................................................. 118
9.2 Spatially varying selection ..................................................................................................... 119
9.3 Fitness as a function of genomic background ........................................................................ 121
9.3.1 Epistasis ..................................................................................................................... 121
9.3.2 Polygenic selection .................................................................................................... 123
9.4 Fitness as a function of population composition .................................................................... 125
9.4.1 Frequency-dependent selection ................................................................................. 125
9.4.2 Kin selection and inclusive fitness.............................................................................. 127
9.4.3 Cultural effects on fitness ........................................................................................... 128
9.4.4 The green-beard effect ............................................................................................... 130
9.5 Changing selection coefficients with setSelectionCoeff() ........................................................ ...134
Selective sweeps .............................................................................................................................. 136
10.1 Introducing adaptive mutations ............................................................................................. 136
10.2 Making sweeps conditional on fixation ................................................................................. 138
10.3 Making sweeps conditional on establishment ........................................................................ 140
10.4 Partial sweeps........................................................................................................................ 142
10.5 Soft sweeps from de novo mutations ..................................................................................... 143
10.5.1 A soft sweep from recurrent de novo mutations in a large population ........................ ...143
10.5.2 A soft sweep with a fixed mutation schedule ............................................................. ...145
10.5.3 A soft sweep with a random mutation schedule ......................................................... 147
10.6 Sweeps from standing genetic variation ................................................................................. 149
10.6.1 A sweep from standing variation at a random locus ................................................... ...149
10.6.2 A sweep from standing variation at a predetermined locus ...........................................150
10.7 Adaptive introgression ........................................................................................................... 152
10.8 Fixation probabilities under Hill-Robertson interference ........................................................ 153
Complex mating schemes using mateChoice() callbacks .................................................................. 156
11.1 Assortative mating ................................................................................................................. 156
11.2 Sequential mate search.......................................................................................................... 162
11.3 Gametophytic self-incompatibility ......................................................................................... 165
Direct child modifications using modifyChild() callbacks................................................................. 170
12.1 Social learning of cultural traits ............................................................................................. 170
12.2 Lethal epistasis ...................................................................................................................... 172
12.3 Simulating gene drive ............................................................................................................ 173
12.4 Suppressing hermaphroditic selfing ....................................................................................... 178
Advanced models ............................................................................................................................ 180
13.1 Quantitative genetics and phenotypically-based fitness ............................................................180
13.2 Relatedness, inbreeding, and heterozygosity ......................................................................... 186
13.3 Mortality-based fitness ..............................................................................................................188
13.4 Reading initial simulation state from an MS output file .............................................................192
13.5 Modeling chromosomal inversions with a recombination() callback ...................................... ...195
13.6 Modeling both X and Y Chromosomes with a Pseudo-Autosomal Region (PAR) ..................... 200
13.7 Forcing a specific pedigree through arranged matings ........................................................... 204

6

13.8 Estimating model parameters with ABC ................................................................................. 207
13.9 Tracking true local ancestry along the chromosome .............................................................. 210
13.10 A quantitative genetics model with heritability ...................................................................... 214
13.11 Live plotting with R using system() ......................................................................................... 219
13.12 Modeling nucleotides at a locus ............................................................................................ 222
13.13 Modeling haploid organisms ................................................................................................. 227
13.14 Using mutation rate variation to model varying functional density......................................... ...228
13.15 Modeling microsatellites ....................................................................................................... 230
13.16 Modeling transposable elements ........................................................................................... 235
13.17 A QTL-based model with two quantitative phenotypic traits and pleiotropy .............................241
13.18 Modeling opposite ends of a chromosome ............................................................................ 248
13.19 Biased gene conversion ......................................................................................................... 251
14. Continuous-space models and interactions ...................................................................................... 258
14.1 A simple 2D continuous-space model ................................................................................... 258
14.2 Spatial competition ............................................................................................................... 260
14.3 Boundaries and boundary conditions .................................................................................... 262
14.4 Mate choice with a spatial kernel .......................................................................................... 264
14.5 Mate choice with a nearest-neighbor search.......................................................................... 266
14.6 Divergence due to phenotypic competition with an interaction() callback ............................. ...267
14.7 Modeling phenotype as a spatial dimension .......................................................................... 272
14.8 Sympatric speciation facilitated by assortative mating............................................................ ...275
14.9 Speciation due to spatial variation in selection ...................................................................... 278
14.10 A simple biogeographic landscape model ............................................................................. 282
14.11 Local adaptation on a heterogeneous landscape map ............................................................ 288
14.12 Periodic spatial boundaries ................................................................................................... 293
15. Going beyond Wright-Fisher models: nonWF model recipes ..............................................................298
15.1 A minimal nonWF model ...................................................................................................... 298
15.2 Age structure (a life table model) ........................................................................................... 301
15.3 Monogamous mating and variation in litter size .................................................................... 303
15.4 Beneficial mutations and absolute fitness .............................................................................. 306
15.5 A metapopulation extinction-colonization model .................................................................. 308
15.6 Habitat choice....................................................................................................................... 312
15.7 Evolutionary rescue after environmental change .................................................................... 315
15.8 Pollen flow ............................................................................................................................ 320
15.9 Litter size and parental investment ........................................................................................ 322
15.10 Spatial competition and spatial mate choice in a nonWF model ...............................................325
15.11 A spatial model with carrying-capacity density...................................................................... 330
15.12 Forcing a specific pedigree in a nonWF model ...................................................................... 333
15.13 Modeling clonal haploids in a nonWF model with addRecombinant() ......................................338
15.14 Modeling clonal haploid bacteria with horizontal gene transfer ...............................................340
15.15 Implementing a Wright–Fisher model with a nonWF model .....................................................343
15.16 Alternation of generations ..................................................................................................... 345

7

16. Tree-sequence recording: tracking population history and true local ancestry .................................. ...349
16.1 A minimal tree-seq model ..................................................................................................... 349
16.2 Overlaying neutral mutations ................................................................................................ 350
16.3 Simulation conditional upon fixation of a sweep, preserving ancestry ......................................351
16.4 Detecting the “dip in diversity”: analyzing tree heights in Python .......................................... ...353
16.5 Mapping admixture: analyzing ancestry in Python ................................................................ 356
16.6 Measuring the coalescence time of a model .......................................................................... 359
16.7 Analyzing selection coefficients in Python with pyslim .............................................................361
16.8 Starting a hermaphroditic WF model with a coalescent history .............................................. ...362
16.9 Starting a sexual nonWF model with a coalescent history.........................................................364
16.10 Adding a neutral burn-in after simulation with recapitation ................................................... ...366
17. Runtime control ............................................................................................................................... 371
17.1 The random number generator .............................................................................................. 371
17.2 Defining constants on the command line .............................................................................. 372
17.3 Other command-line options ................................................................................................ 374
17.4 File input and output ............................................................................................................. 376
17.5 Lambda execution ................................................................................................................. 377
17.6 Debugging ............................................................................................................................ 379
18. Implementation and performance .................................................................................................... 380
18.1 Writing fast SLiM simulations ................................................................................................ 380
18.2 Performance evaluation ......................................................................................................... 382
18.3 Memory usage considerations ............................................................................................... 384
18.4 Mutation runs and runtime optimization ............................................................................... 385
18.5 Profiling simulations in SLiMgui ............................................................................................ 388
18.6 Profiling memory usage in SLiMgui, or with outputUsage().................................................... ...394

PART II: THE SLIM REFERENCE
19. SLiM architecture (WF models) ........................................................................................................ 399
19.1 Step 1: Execution of early() Eidos events ................................................................................ 399
19.2 Step 2: Generation of offspring .............................................................................................. 399
19.2.1 The order of offspring generation ............................................................................... 399
19.2.2 Mate choice .............................................................................................................. 400
19.2.3 Mutation and recombination ..................................................................................... 401
19.2.4 Child modification .................................................................................................... 401
19.2.5 Child generation........................................................................................................ 402
19.3 Step 3: Removal of fixed mutations ....................................................................................... 402
19.4 Step 4: Offspring become parents .......................................................................................... 403
19.5 Step 5: Execution of late() Eidos events .................................................................................. 403
19.6 Step 6: Fitness value recalculation ......................................................................................... 403
19.7 Step 7: Generation count increment ...................................................................................... 404
20. SLiM architecture (nonWF models) .................................................................................................. 405
20.1 Step 1: Generation of offspring .............................................................................................. 405
20.1.1 The order of offspring generation ............................................................................... 405
20.1.2 Individual-based reproduction with reproduction() callbacks .......................................406

8

20.1.3 Mutation and recombination .....................................................................................
20.1.4 Child modification ....................................................................................................
20.1.5 Child generation........................................................................................................
20.2 Step 2: Execution of early() Eidos events ................................................................................
20.3 Step 3: Fitness value recalculation .........................................................................................
20.4 Step 4: Viability/survival selection..........................................................................................
20.5 Step 5: Removal of fixed mutations .......................................................................................
20.6 Step 6: Execution of late() Eidos events ..................................................................................
20.7 Step 7: Generation count increment ......................................................................................
21. SLiM classes ....................................................................................................................................
21.1 Simulation initialization: initialize() callbacks ........................................................................
21.2 Class Chromosome ................................................................................................................
21.2.1 Chromosome properties ............................................................................................
21.2.2 Chromosome methods ..............................................................................................
21.3 Class Genome .......................................................................................................................
21.3.1 Genome properties....................................................................................................
21.3.2 Genome methods......................................................................................................
21.4 Class GenomicElement..........................................................................................................
21.4.1 GenomicElement properties ......................................................................................
21.4.2 GenomicElement methods ........................................................................................
21.5 Class GenomicElementType ..................................................................................................
21.5.1 GenomicElementType properties ...............................................................................
21.5.2 GenomicElementType methods .................................................................................
21.6 Class Individual .....................................................................................................................
21.6.1 Individual properties .................................................................................................
21.6.2 Individual methods....................................................................................................
21.7 Class InteractionType ............................................................................................................
21.7.1 InteractionType properties .........................................................................................
21.7.2 InteractionType methods ...........................................................................................
21.8 Class Mutation ......................................................................................................................
21.8.1 Mutation properties ...................................................................................................
21.8.2 Mutation methods .....................................................................................................
21.9 Class MutationType ...............................................................................................................
21.9.1 MutationType properties............................................................................................
21.9.2 MutationType methods ..............................................................................................
21.10 Class SLiMEidosBlock ............................................................................................................
21.10.1 SLiMEidosBlock properties ......................................................................................
21.10.2 SLiMEidosBlock methods ........................................................................................
21.11 Class SLiMgui........................................................................................................................
21.11.1 SLiMgui properties ..................................................................................................
21.11.2 SLiMgui methods.....................................................................................................
21.12 Class SLiMSim .......................................................................................................................
21.12.1 SLiMSim properties .................................................................................................
21.12.2 SLiMSim methods....................................................................................................

406
406
406
407
407
408
408
409
409
410
410
417
417
419
420
420
421
425
425
426
426
426
426
427
427
430
432
434
435
439
439
440
441
442
444
444
444
445
445
445
445
446
446
447

9

22.

23.

24.

25.
26.
27.
28.

21.13 Class Subpopulation .............................................................................................................. 455
21.13.1 Subpopulation properties ........................................................................................ 456
21.13.2 Subpopulation methods........................................................................................... 457
21.14 Class Substitution .................................................................................................................. 467
21.14.1 Substitution properties............................................................................................. 467
21.14.2 Substitution methods ............................................................................................... 468
Writing Eidos events and callbacks .................................................................................................. 469
22.1 Defining Eidos events ...............................................................................................................469
22.2 Defining mutation fitness with a fitness() callback ................................................................. 470
22.3 Defining mate choice with a mateChoice() callback .............................................................. 473
22.4 Defining child generation with a modifyChild() callback ....................................................... 475
22.5 Defining recombination behavior with a recombination() callback ...........................................477
22.6 Defining interaction behavior with a interaction() callback .................................................... ...478
22.7 Defining reproduction behavior with a reproduction() callback ................................................480
22.8 Further details on Eidos events and callbacks ........................................................................ 481
SLiM output formats ........................................................................................................................ 483
23.1 SLiMSim output methods ....................................................................................................... 483
23.1.1 outputFull() ................................................................................................................ 484
23.1.2 outputFixedMutations() .............................................................................................. 486
23.1.3 outputMutations() ...................................................................................................... 486
23.2 Subpopulation output methods .............................................................................................. 487
23.2.1 outputSample() .......................................................................................................... 487
23.2.2 outputMSSample() ..................................................................................................... 488
23.2.3 outputVCFSample() ................................................................................................... 488
23.3 Genome output methods ....................................................................................................... 490
23.3.1 output() ..................................................................................................................... 491
23.3.2 outputMS() ................................................................................................................ 491
23.3.3 outputVCF()............................................................................................................... 491
23.4 SLiM additions to the .trees file format................................................................................... 492
SLiM extensions to the Eidos language............................................................................................. 498
24.1 Extensions to the Eidos grammar ........................................................................................... 498
24.2 SLiM scoping rules ................................................................................................................ 499
SLiM reference sheet ....................................................................................................................... 501
Revision history ............................................................................................................................... 506
Credits and licenses for incorporated software ................................................................................. 513
References ....................................................................................................................................... 516

10

PART I: THE SLIM COOKBOOK

1. SLiM overview
1.1 Introduction
SLiM is an evolutionary simulation package that provides facilities for very easily and quickly
constructing genetically explicit individual-based evolutionary models. By default, SLiM is based
upon a Wright-Fisher or “WF” model of evolution; in particular, (1) generations are nonoverlapping and discrete, (2) the probability of an individual being chosen as a parent for a child
in the next generation is proportional to the individual’s fitness, (3) individuals are diploid, and
(4) offspring are generated by recombination of parental chromosomes with the addition of new
mutations. Some of these assumptions can be relaxed in WF models using techniques described in
this manual, and an alternative non-Wright-Fisher or “nonWF” type of model can even be used
instead; nevertheless, the default WF model is the conceptual foundation of SLiM, and it should be
understood thoroughly before venturing into more advanced models.
The original version of SLiM (through version 1.8; Messer 2013) was written by Philipp Messer,
now of Cornell University; its name stands for Selection on Linked Mutations. SLiM 2 and later –
the subject of this manual, hereafter simply referred to as SLiM – is a ground-up redesign of SLiM
(by Benjamin C. Haller, now of the Messer Lab at Cornell) that provides much greater power,
flexibility, and speed on top of the same foundational architecture as the original.
SLiM is based upon two main components: a simple scripting language called Eidos that was
invented for use with SLiM, and a set of Eidos classes that implement entities such as
subpopulations, mutations, and chromosomes. A minimal SLiM simulation, such as you will see in
section 4.1, comprises just a few lines of Eidos code; virtually all of the simulation details are
handled by SLiM, so the Eidos script needs only to set up basic parameters such as the population
size and mutation rate. Because of the extensibility provided by Eidos, however, it is
straightforward to extend such a simulation to model almost any scenario.
Regardless of the specific problem studied, evolutionary simulations will often entail common
design elements: multiple subpopulations connected by migration, for example, or selective
sweeps, or spatial and temporal variation in selection. Such design elements will come up over
and over for users of SLiM, and it might not be obvious – particularly to biologists with little
programming experience – how to model them using the toolbox provided by Eidos and SLiM.
The first part of this manual has thus been structured as a “cookbook”, an assemblage of recipes
showing how to build different sorts of models in SLiM. The design of each recipe will be
explained, so that users of SLiM feel comfortable modifying the recipes to build their own models.
Under the hood, SLiM is a complex piece of software, with dozens of source files devoted to
implementing both the Eidos language and the Eidos classes provided by SLiM. However, users of
SLiM should not need to confront that complexity; SLiM users should literally never need to delve
into the C++ and Objective-C code that drives SLiM. All that you need to understand, as an end
user, is the Eidos scripting interface that SLiM presents for your use. Understanding the Eidos
language itself is an important part of using SLiM effectively; this manual will briefly introduce
Eidos concepts as they arise, but for a more thorough and complete introduction to Eidos it is
recommended that you refer to its separate manual (Haller 2016). Eidos is similar to the popular R
language (R Core Team 2015); if you have used R, Eidos should feel natural. (The Eidos manual
discusses why we invented a new language for SLiM, rather than using an existing language.)
This manual does not need to be read from cover to cover; each recipe is designed to stand
alone, so if you are interested in a specific problem – how to model epistasis in SLiM, say – it may
be possible to turn directly to that recipe. However, concepts do build on each other, and a
familiarity with Eidos also builds through the course of this manual. We recommend that all SLiM
users read at least this introductory chapter, which lays out the conceptual foundations of SLiM.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

12

1.2 Why SLiM?
Many evolutionary simulation packages already exist, from custom-built one-off models that
address only a single problem, to simulation toolkits designed to fit a wide variety of tasks. It
would be reasonable to ask why we have made yet another toolkit, and what sets SLiM apart. This
section will explain what SLiM is designed to do and why it adds something important and unique
to existing modeling tools. Note that here we are discussing SLiM 2 and later, which is quite
different in its design philosophy and approach than earlier versions of SLiM.
The primary reason for SLiM is flexibility. Most evolutionary modeling toolkits, including SLiM
1.8, are quite limited in their abilities. They are not easily extensible or modifiable; typically, if
you wish to make such a toolkit do something new, you need to modify the actual source code
(often C or C++) in which the toolkit is written – a non-trivial task. Some toolkits embrace this
shortcoming as a feature, and are thus designed as a set of C++ templates or other reusable
objects; but again, although this approach is flexible, a great deal of programming experience is
needed. Because existing toolkits are so difficult to modify and extend, it is often simpler for
researchers to write their own purpose-built model from scratch; however, this also entails
substantial drawbacks. First, it involves reinventing the wheel over and over. Second, it limits the
level of complexity attainable, since each research project begins again more or less from scratch.
Third, it is a very bug-prone approach, since the code underlying such simulations is often
complex, and all of the debugging and testing that went into previous models is lost with each
new model. SLiM is designed to radically simplify the process of making an evolutionary model,
because of the way that the inner mechanisms of the SLiM simulation are exposed in the Eidos
scripting language. Modifying a simulation to add a new and complex behavior such as epistasis
or sequential mate choice can often be expressed in just a couple of lines of simple Eidos code – a
much simpler proposition than trying to do the same thing in the underlying C++ in which the
toolkit is written. The underlying SLiM engine is quite complex – it contains a full interpreter for
the Eidos language – but it can be treated as a black box, and never needs to be modified or
understood by the end user at all. The script that drives a particular simulation, on the other hand,
can usually be trivially short, easily understood, and quickly modified. The end result is immense
power and flexibility coupled with immense simplicity and reusability.
A second reason for SLiM’s existence is performance. When writing a one-off model, it is often
prohibitive – in terms of both time and effort – to optimize the model for fast execution, and once
code is optimized it becomes much harder to maintain and modify, hindering reusability. Because
SLiM will be used for many different models, however, we deemed it worthwhile to spend the
effort on optimization. Months of hard work have gone into making SLiM run fast, including a
number of complex and non-obvious algorithmic optimizations. By using SLiM, you get all of the
speed benefits of those optimizations for free. Individual-based simulations are often speedlimited; one is often forced to explore only a limited range of parameter space, or use smaller
population sizes than desired, or make other such compromises because of limited computing
resources. SLiM helps to lift that constraint.
A third reason for SLiM’s existence is to provide interactive execution and graphical debugging.
With SLiMgui on OS X, you can visualize your simulation as it runs, with graphical depictions of
mutations, genomic elements, subpopulations, migration patterns, and simulation metrics. You
can single-step through your simulations, examine the values of all of the underlying objects, and
even execute arbitrary Eidos code to modify your simulation as it runs. This allows much more
rapid and bug-free simulation development. We highly recommend that you use SLiMgui on OS X
for development even if your production runs will be on Linux or elsewhere; the value of graphical
model development and debugging is immense (Grimm 2002).
We believe that flexibility, performance, and graphical interactivity make SLiM a worthy
addition to the ecosystem of evolutionary modeling toolkits. We hope that you will agree.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

13

1.3 A quick summary of SLiM
This section is a quick summary of SLiM’s design, as a brief introduction to provide context for
the recipes in this cookbook. See part II of this manual, the SLiM Reference, for further details.
Glossary. We begin with a glossary of terms that have a special meaning in SLiM:
simulation: The simulation in SLiM is the top-level Eidos object, of class SLiMSim,
corresponding to the running simulation model. All other objects defined by SLiM
are contained within the simulation object.
population: The population in SLiM comprises all of the individuals being simulated by SLiM.
There is no Eidos object corresponding to the population; the simulation object
handles everything related to the population.
subpopulation: The population is divided into subpopulations, discrete groups of individuals
which may or may not be connected to each other by migration. The Eidos class
Subpopulation is used to represent the subpopulations in a simulation.
individual: An individual, in SLiM, is a diploid organism composed of two haploid genomes
(see below). Individuals are represented by the Eidos class Individual. The
individual is the conceptual level at which fitness is computed, mate choice is
conducted, and so forth.
genome: A genome is the haploid set of all mutations occurring in the genetic material
represented by the genome object. Each individual contains two genomes,
representing the two homologous chromosomes of the individual. The Eidos class
Genome is used to represent genomes.
mutation: A mutation is a change in the genetic information of an individual, represented by
the Eidos class Mutation. Mutations have a defined position in the genome, a
selection coefficient, and information about when and where they arose. Each
mutation references a mutation type (see below) that governs some additional
properties of the mutation, such as its dominance coefficient.
mutation type: Mutations are drawn from a particular mutation type, representing
simulation-dependent categories of mutations (neutral, beneficial, lethal,
synonymous, etc.). In general, the mutation type determines the distribution of
fitness effects (DFE) from which mutations of that mutation type are drawn. The
mutation type also determines the dominance coefficient of all mutations of that
type. The Eidos class MutationType is used to represent mutation types in SLiM.
chromosome: In SLiM’s terminology, the chromosome is the positional map of regions, such
as genes, being modeled by SLiM; the term does not refer to a single chromosome
carried by a particular individual (the term genome is used for that purpose). The
chromosome, represented by the Eidos class Chromosome, defines regions according
to both their recombination rate and their mutational profile.
genomic element: The chromosome is spanned by non-overlapping genomic elements, of
Eidos class GenomicElement, each referencing a genomic element type (see below).
genomic element type: A genomic element type defines the particular mutation types that
can occur in genomic elements of the given type. It is represented by Eidos class
GenomicElementType. Biological examples of genomic element types could be
introns, exons, or non-coding regions. All of the genomic elements referencing a
type use that type’s mutational profile.
substitution: When mutations reach fixation in the entire population, they are generally
replaced by substitution objects, of Eidos class Substitution, for efficiency reasons.
The substitution provides a permanent record of the fixed mutation’s characteristics.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

14

Population structure. SLiM allows arbitrary population structure; any number of
subpopulations may exist, connected by any pattern of migration, and subpopulations may come
into existence, change size, and be removed at any time. Mate choice occurs within
subpopulations; adult organisms do not migrate or mate between subpopulations. Migration
occurs at the juvenile stage. Individuals in SLiM are always diploid, and gametes are always
haploid; at present, SLIM does not support other ploidy levels (but haploids can be modeled with
scripting; see the recipe in section 13.13). These topics are discussed further in chapter 5.
Sexual reproduction. SLiM can model either hermaphroditic individuals (no distinction
between sexes) or sexual individuals (distinct males and females). In either case, individuals
normally undergo biparental mating to produce offspring
The sequence of events within one through sexual recombination (including both crossing over
and, optionally, gene conversion). Clonal reproduction is
generation in WF models.
also supported instead of or in addition to biparental mating,
and in the hermaphroditic case, SLiM also supports self1. Execution of early() events
fertilization. When modeling sexual individuals, the sex ratio
is controllable, and sex chromosomes may be modeled.
2. Generation of offspring;
These topics are discussed further in chapter 6.
for each offspring generated:
Genetics. SLiM is genetically explicit in the sense that it
2.1. Choose source subpop
models mutations at specific base positions in genomes with
for parental individuals,
an explicit chromosome structure; SLiM does not, however,
based on migration rates
model nucleotide sequences (but see section 13.12). The
chromosome modeled by SLiM is composed of genomic
2.2. Choose parent 1, based
elements (e.g., sections of a gene), each of a particular
on cached fitness values
genomic element type (e.g., intron versus exon). The
genomic element type defines the mutational profile of
2.3. Choose parent 2, based
elements of that type, using a set of mutation types and
on fitness and any defined
associated probabilities. These topics are discussed further in
mateChoice() callbacks
chapter 7.
Fitness. By default, SLiM calculates fitness multiplicatively,
2.4. Generate the candidate
based upon all of the mutations possessed by each individual.
offspring, with mutation
The selection coefficient s of a given mutation defines the
and recombination (incl.
recombination() callbacks)
mutation’s fitness effect when homozygous (1+s); when
heterozygous, the fitness effect is modified by a dominance
2.5. Suppress/modify the
coefficient h (1+hs). The fitness effects of mutations may be
candidate, using defined
altered by fitness() callbacks that provide full control over
modifyChild() callbacks
how the fitness of an individual is calculated given the
particular set of mutations present in its genome, and possibly
other properties of the population. These topics are discussed
3. Removal of fixed mutations
further in chapter 9.
unless convertToSubstitution==F
Life cycle. SLiM is based, by default, on an extended
Wright-Fisher
or “WF” model with non-overlapping, discrete
4. Offspring become parents
generations (a non-Wright-Fisher or “nonWF” model can also
be used, as discussed in section 1.6 and chapters 15 and 20,
5. Execution of late() events
but that is an advanced topic that we will pass over here).
Within each generation, events occur in a fixed order (see
6. Fitness value recalculation
left). Each generation begins with the execution of userusing fitness() callbacks
defined Eidos scripts called early() events. Examples of
early() events might be demographic events, such as
7. Generation count increment
changes in population size, population splits, changes in
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

15

migration rates, etc. Offspring are then generated by drawing gametes from the parent population
according to fitness; this default mating scheme may be modified to implement non-standard
mating scenarios via user-defined mateChoice() callbacks (see chapter 11). Gametes are
generated from the genomes of the candidate parents, modified by mutation and recombination;
the standard user-defined recombination map can be modified arbitrarily for each gamete
generated using a recombination() callback (see sections 13.5 and 21.5). After offspring have
been created, their genomes can be modified according to user-defined rules using modifyChild()
callbacks (see chapter 12). After (optional) removal of fixed mutations from the model, the
offspring become the parents. Next is another opportunity for Eidos events – in this case, late()
events – to run. This is where output events, such as drawing a random sample of individuals from
the populations, would typically be specified. Fitness values are then calculated, modified by
fitness() callbacks (see chapter 9). Finally, the simulation advances to the next generation.
Tags. User-defined “tag” values can be attached to almost all of the objects defined by SLiM in
order to associate your own information with SLiM’s objects, whether short-term flags or long-term
state. Tags are used in the recipes in sections 9.4.3, 9.4.4, 10.5.2, 12.1, 13.1, and 13.3, which
provide a variety of examples of their utility. In SLiM 2.2, a dictionary-like getValue() /
setValue() mechanism was added to the Individual, SLiMSim, and Subpopulation classes (and in
SLiM 2.4, now MutationType, GenomicElementType and InteractionType too; and in SLiM 2.5,
Mutation also; and in SLiM 3.0, Substitution also). This facility provides an even broader and
more flexible way to attach model state to those objects; see the class references in chapter 21 for
details on these functions, and see the recipe in section 11.1 for an example of their use.
Continuous space. Beginning in SLiM 2.3, SLiM adds support for continuous space. If this
optional feature is enabled, individuals in SLiM maintain a spatial position – either (x), (x, y), or
(x, y, z) – within their subpopulation. These spatial positions can be changed at any time
(simulating phenomena such as foraging and migration, for example), and are used to create a
spatial visualization of the subpopulation in SLiMgui. Positions can used in script in any way,
allowing models to incorporate a concept of continuous space in any way desired. In particular,
spatial positions may be used as the basis for spatial interactions between individuals (see below),
and may be used in conjunction with spatial maps that define variation in environmental variables
across continuous space. This advanced feature is first introduced in recipes in chapter 14.
Interactions. Beginning in SLiM 2.3, SLiM adds a new class, InteractionType, which can
govern interactions between individuals. Interactions can still be handled with pure Eidos code,
but the use of InteractionType automates and accelerates many common tasks, such as finding
the total interaction strength felt by an individual (as a result of competition, for example).
InteractionType can also manage spatial interactions, providing features such as interaction
strengths that vary according to distance, and handling spatial queries such as nearest-neighbor
searches. This advanced feature is first introduced in recipes in chapter 14.
Section 1.4 sketches out some practical details of how SLiM is typically used. Section 1.5
provides a more detailed overview of some of the concepts above. Section 1.6 then introduces
non-Wright-Fisher or “nonWF” models, and section 1.7 introduces tree-sequence recording; these
are both advanced topics, but it is good to be aware of the existence of these features and the
reasons why you might wish to use them. Chapter 2 provides instructions on building and
installing SLiM on various platforms. Chapter 3 gives an introduction to SLiMgui, the graphical
modeling environment provided for use on Mac OS X. The remainder of Part I of this manual, the
SLiM Cookbook, then provides “recipes” demonstrating the core concepts of SLiM. Part II of this
manual, the SLiM Reference, provides technical reference documentation for SLiM, including such
aspects as the generation cycle, the Eidos classes provided by SLiM, the various types of events and
callbacks, and the output formats supported by SLiM, beginning in chapter 19.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

16

1.4 The typical SLiM usage pattern
Before delving more deeply into the concepts introduced in the previous section, it might be
helpful to clarify how a typical user would use SLiM, in practical terms.
First of all, many or most users will use the SLiMgui modeling environment on Mac OS X for
their model development and testing, and perhaps for exploratory, non-production runs as well.
SLiMgui provides many tools for model development, such as code completion, syntax coloring,
online documentation, interactive model execution, and graphical debugging. It also makes
model development logistically simpler; there is no need to write a dispatch script, no need to
execute Unix commands in a terminal, etc. When learning how to use SLiM, it is therefore
strongly recommended that you find a Mac and use SLiMgui. Note that all of the recipes in this
manual are directly accessible in SLiMgui through the Open Recipe submenu of the File menu.
Second, many or most SLiM users will do their “production” model runs on a computing
cluster. One reason is that individual-based models often take a long time to execute; SLiM is
highly optimized, but simulating the genomic details of a large number of individuals over many
generations is inevitably slow. Another reason is that many replicate runs are typically needed;
one run of an individual-based model provides just one data point, one solitary example of what
could happen, so one usually needs to perform many runs and then use statistical methods to draw
inferences (just as one often would in field-based or lab-based research). Finally, most studies
involving individual-based modeling explore a “parameter space”, examining how the model’s
outcome depends upon the parameters of the model; each set of parameter values explored
implies another full set of replicated runs. Together, these facts mean that a single study using
SLiM might entail many years of processor time; a computing cluster is thus often needed.
For this reason, SLiM is designed to fully utilize a single processor; it is not designed to take
advantage of multiple processors using multithreading or MPI. A single run of the slim command
runs a given model a single time; to conduct the many runs that are typically needed, slim will be
run many times. This single-threaded design makes it straightforward for the user to run a separate
instance of a model on each processor on a multicore machine or a computing cluster. This can
be done manually in some cases, but is more typically done using a batch-queueing system such
as Open Grid Scheduler. Either way, some sort of a dispatch script is generally needed to schedule
each of the individual runs of slim. Because there are so many different possible ways that the
user might want to run SLiM, and so many different computing environments in which it might be
run, such a dispatch script is not provided as a part of the SLiM package; you will need to write
your own. However, this is usually extremely straightforward. It can usually be done in whatever
scripting language you prefer, from R or Python to a Bash shell script, and often just consists of a
loop over all of the parameter values and replicates desired, with a call to launch or schedule a run
of slim inside that loop. In Python, sublaunching a Unix process can be done with the
subprocess package; in R, with system() or system2(); in a Bash shell script, by just invoking
slim directly. If you are working on an institutional computing cluster, the cluster administrator
may be able to provide you with examples of dispatch scripts appropriate for that environment.
Finally, SLiM users will typically want to collect results from model runs and perform statistics
and other analyses on them. This can sometimes be done directly in the dispatch script; that script
might collect the model output and tabulate simple results as runs complete. In other cases, each
invocation of slim will be set up to produce its own output files, and then a separate analysis script
– typically written in a language like R or Python that has support for statistics and plotting – will
read in those output files, parse the relevant information out of them, and conduct the desired
analyses. In this undertaking, you are on your own. However, it is worth noting that SLiM can
generate output in some standard file formats, such as VCF and MS, and that many tools already
exist to read in and analyze such standard-format files, so in some cases you might be able to use
pre-existing software for at least some of your analysis. If your model uses tree-sequence recording
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

17

(see section 1.7), you can also output a .trees file that can be read and processed in Python using
the msprime package, making many types of post-run analysis much easier (see examples in
chapter 16); indeed, this could in itself be a compelling reason to use tree-sequence recording,
since parsing output files is otherwise such an annoyance. Finally, in some cases you might want
to do some of the needed analysis inside the SLiM model, in Eidos code, to simplify the postprocessing needed. For example, if your analysis needs the number of mutations fixed at the end
of each model run, it might be simpler to count the fixed mutations inside the SLiM model, and
just output that count, than to parsing full genomic output files from each model run just to extract
the count of fixed mutations. It will be beneficial to think, up front, about how to design your
model and your analysis code so that they communicate as cleanly and simply as possible.
1.5 Conceptual overview
This section will delve into further detail on some of the concepts set out in section 1.3 (which
should be read before this section), to present a more complete picture of how SLiM works at a
conceptual level. We will not show any Eidos code here; that will be left for the recipes in the
“cookbook” that begins in chapter 4. We will, however, make reference to the Eidos classes used
by SLiM to represent various concepts, and to some of the properties and methods of those classes.
We will gloss over some minor details here in order to present the big picture as clearly as
possible; for more comprehensive information, see the SLiM Reference that begins in chapter 19.
1.5.1 Individuals and genomes
SLiM is a framework for running individual-based models; this means that every individual
organism in the model is simulated explicitly. Each individual is represented in SLiM as an
instance of the Individual class in the Eidos scripting language (see section 21.6). At the most
minimal level individuals are born and die, and in between they find mates and produce offspring
(or they reproduce by selfing or cloning); these actions are built into SLiM. If optional extensions
to SLiM are enabled (using the initializeSLiMOptions() function; see section 21.1), SLiM can
also keep some pedigree information regarding individuals (up to the grandparental level), and can
keep track of the spatial positions of individuals on a landscape. In more complex models
individuals may also do things like gather resources, learn things, interact with other individuals,
be subject to events that alter their state, and exhibit behavior; these actions are not built into
SLiM, but may easily be implemented in Eidos script.
Perhaps most importantly, since SLiM models genetically explicit simulations, individuals
contain genetic information. Individuals in SLiM are diploid; each individual thus possesses two
homologous chromosomes (or one X and one Y chromosome, if sex chromosomes are being
simulated), referred to as the genomes of that individual. (It is possible to model more than one
chromosome in SLiM, conceptually, but this is done by using a recombination map that specifies
free recombination at particular positions, effectively subdividing the chromosome into unlinked
sub-chromosomes; see, e.g., section 13.1.) Each of the two genomes of an individual is
represented using an instance of the Genome class in Eidos (see section 21.3).
A genome is essentially a container that holds a set of mutations. If both of an individual’s
genomes contain exactly the same mutation (a surprisingly subtle concept, which will be defined
rigorously in the next subsection), the individual is homozygous for that mutation; if a given
mutation is contained in only one of the two genomes, the individual is heterozygous for that
mutation. Note that SLiM does not model nucleotides explicitly (although it is possible to layer a
concept of nucleotides on top of SLiM, in script; see section 13.12), but it does model explicit,
discrete base positions along the genome.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

18

The overall picture, then, looks like this:

Each yellow square represents one individual. Each individual contains two genomes; and each
genome contains the state of all of the base positions along the chromosome, from beginning
(position 0) to end (position L−1, so that the chromosome is of length L). Each base position is
represented here as an empty box, and all of the boxes from 0 to L−1 together represent a genome.
A key concept in SLiM is that genomes begin, by default, as empty: they contain no mutations
and no genetic information. This can be thought of as the “wild type”, in a sense, and we will
refer it to in that way sometimes in what follows, but it does not have to represent the wild-type
state of the organism you are modeling. Instead, it simply represents the base, un-mutated state of
individuals, whatever you want to consider that to be. All mutations in SLiM can be thought of as
modifications that are layered on top of this empty base state. The base state can also be thought
of as “neutral”, in terms of fitness, but it does not have to actually be neutral (i.e., 1.0) in absolute
fitness. Instead, the base state can have any absolute fitness value you like – but in general that is
unimportant, since SLiM’s core engine is only concerned with relative fitness (in WF models, the
default mode of operation, which we will limit ourselves to in this discussion). When the
simulation begins, and all genomes are empty, it does not matter to SLiM what the absolute fitness
of those empty genomes is; since they all have the same absolute fitness, they all have a relative
fitness of 1.0, and that is what matters to SLiM. The fitness effects of mutations then modify those
relative fitnesses, multiplicatively.
You can, of course, set up your simulation to begin with whatever mutational state you want;
even more commonly, a simulation will begin with a “burn-in” period that establishes an
equilibrium level of genetic diversity through mutation–selection–migration balance before the
more interesting part of the simulation begins. It is important to understand that such genetic
diversity is always built on top of empty chromosomes in SLiM, however. In general, if two
mutations are segregating at a given base position, there are effectively three alleles in the
population at that base position: the first mutation, the second mutation, and what you could think
of as the “wild-type allele” represented by the absence of either of those mutations. In script, you
could force a mutation to exist at every base position in every genome, so that there are no empty
positions in any of the genomes in your simulation; if you do so, however, you will find that your
simulation then runs quite slowly, since SLiM is having to track and manage all of those mutations,
so such a strategy is usually undesirable.
Learning to let go of the idea of chromosomes filled with genetic information at every position
(as would be the case in a nucleotide-based model) and think instead in terms of mutations layered
on top of the empty “wild-type” state is an essential conceptual leap to make in using SLiM. To
move from one conceptual model to the other, imagine the “wild-type” nucleotide sequence for
your study organism: some specified sequence of A, T, G, and C. Whatever that sequence might
be, it is represented in SLiM by the absence of any genetic information at all: empty chromosomes.
SLiM tracks only mutations on top of that sequence: SNPs, in the nucleotide-based paradigm. But
any individual that does not possess a SNP at a given location instead possesses the “wild-type”
nucleotide, conceptually – represented in SLiM by an empty base position.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

19

1.5.2 Mutations and substitutions
With the foundation of individuals and genomes laid by the previous section, let’s now explore
the idea of mutations in SLiM in more detail. In SLiM, a mutation is an instance of class Mutation
in Eidos (see section 21.8). A mutation has various properties – its base position and its selection
coefficient, most importantly. In SLiM, the multiplicative fitness effect of a mutation with selection
coefficient s is 1+s for a homozygote; in a heterozygote it is 1+hs, where h is the dominance
coefficient (kept by the mutation type; see section 1.5.4). Let’s update our conceptual schematic to
include some mutations:

This simulation has three mutations, represented here with blue, red, and green. At position 3,
the first individual is homozygous for the blue mutation, the second individual is heterozygous for
red and heterozygous for blue, and the third individual is heterozygous for blue (the other allele in
that individual being the empty “wild-type allele” as discussed above).
An important concept to absorb is that the genomes that contain the blue mutation do not just
contain their own particular copies of blue-mutation-type information; they actually contain
references to the very same shared blue-mutation object. A more accurate conceptual diagram
might therefore look like this:

Since that is quite difficult to interpret visually, we will stick with showing mutations as residing
inside genomes; but you should always keep in mind that mutations are really shared objects. A
new mutation object is created either (1) when a random mutation event occurs in SLiM (as
governed by the overall mutation rate set for the simulation), or (2) when requested by the
simulation script with the addNewMutation() or addNewDrawnMutation() methods of Genome (see
section 21.3.2). In both cases, these events always create a new mutation object, even if a
mutation with exactly the same properties – position, selection coefficient, etc. – already exists in
SLiM. In our conceptual diagram above, the blue and red mutations might be identical in every
detail; they are nevertheless considered to be different mutations by SLiM, and will be tracked
separately and never merged into a single identity. You can think of this as representing a
mutational lineage, a sort of identity by descent; the red and blue mutations might represent the
very same SNP, but they arose due to separate mutational events, and thus they seeded separate
mutational lineages.
This distinction becomes particularly important when you ask whether two genomes contain
“the same mutation” – if you ask, for example, whether a given individual is heterozygous or
homozygous for a given mutation. The second individual in the diagram is heterozygous for blue
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

20

and heterozygous for red; even if the blue and red mutations are identical, the individual is not
considered by SLiM to be homozygous. If the fitness effects of these mutations entail a dominance
effect, this could be important – the fitness effect of being red/blue would then be different from
the fitness effect of being red/red or blue/blue. Often this can be ignored; if mutations are neutral
then dominance doesn’t matter, and if mutational effects are drawn from a distribution of fitness
effects then in general no two new mutations will be the same anyway. However, if some
mutations in a simulation use a fixed, non-neutral fitness effect with dominance then this might
become important; this could be important in some soft-sweep models, for example. In such
cases, you may wish to ensure that all references to identical mutations are transmuted into
references to the same object, maintaining a single mutation for all lineages (see, e.g., section
13.12). When new mutations are being introduced in script, rather than by SLiM, you can ensure
that an existing mutation object is used by using the addMutations() method of Genome (see
section 21.3.2), which adds already existing mutations to a genome instead of creating new
mutation objects (and thereby new mutational lineages).
Another key concept involving mutations is that by default, mutations are removed from the
simulation when they become fixed. Suppose that, after mate choice and biparental mating, the
next generation of our conceptual diagram looked like this:

This state will never be visible to the simulation script, because at the end of offspring
generation it will be replaced by this state instead:

The fixed mutation has been removed from the simulation and stored as a “substitution” object
(represented here by the blue square to the right). This substitution object will be kept by SLiM
forever, to remember the fixed mutation, and this substitution object is available to the script; but
the mutation is no longer contained by the genomes of each individual, and it no longer influences
SLiM’s fitness computations. This is usually a good idea, because it allows SLiM to run much faster
than it otherwise would; without removal of fixed mutations, simulations would slowly bog down
under the weight of more and more accumulated fixed mutations. It is also usually safe, since a
mutation that is possessed by every individual will usually have an effect on absolute fitness but
not on relative fitness – since its multiplicative fitness effect is the same in every individual, it can
be neglected. However, simulations that involve epistasis, or that otherwise depend upon
mutations in ways that go beyond their direct effect on relative fitness, may wish to disable this
automatic conversion for the mutations involved in such effects; this can be done easily in SLiM
using the convertToSubstitution property (see section 21.9.1).
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

21

The above ideas, about the distinct identity of each mutational lineage and the way that fixation
is defined in SLiM, combine in a way that is worth mentioning since it might be unexpected.
Suppose that there are two mutations, blue and red, segregating at base position 3, as above, and
suppose further that these mutations are identical in every detail but arose separately, as described
above. Finally, suppose that the population reaches this state:

It might be natural to suppose that the mutations at position 3 would be considered to have
fixed, and would be removed, as above, given that they are all identical and merely represent
independent mutational lineages of what you might think to be the same mutation. As above,
however, this is not how SLiM thinks! Since the blue and red mutations are different mutation
objects, neither is considered to have reached fixation; fixation in SLiM occurs when a specific
mutation reaches a frequency of 1.0, without any consideration for other identical mutations
segregating in the population. In a scenario like this, such as a soft-sweep model, if you want
fixation of independent mutational lineages to be detected you will need to either merge the
independent lineages yourself in script, as described above, or simply detect fixation yourself
directly. You could detect fixation by checking that every genome in the model contains an
appropriate mutation (either red or blue, in this case), or by summing the counts of all of the
appropriate mutations in the population to confirm that the total count is equal to 2N (where N is
the population size and the constant factor of 2 accounts for diploidy). (You might be inclined to
sum the mutation frequencies instead and compare to 1.0, but this strategy is vulnerable to
floating-point roundoff error, so it is not advisable.) This is all simpler than it sounds; the softsweep recipes in sections 10.5 provide some examples of this.
1.5.3 Mutation stacking
There is one more key concept about mutations in SLiM to consider, and it is this: by default, a
given base position in a given genome may actually contain more than one mutation – indeed, it
may contain an arbitrarily large number of mutations. This is referred to as “mutation stacking”;
the multiple mutations at a single base position in a given genome are referred to as being
“stacked”. For example, imagine that this individual exists in a SLiM simulation:

And then imagine that a new mutation, which we will show as red, occurs at the same position,
in the same genome, where the blue mutation already exists. By default, here is what will happen:

The red mutation has stacked on top of the blue mutation; both mutations now exist at that
position in that genome. This does not fit terribly well into the concept of “genotype” – SLiM does
not really think in terms of genotypes. You could perhaps say that the genotype of this individual is
something like “wild/red-blue”, if you wished, where “red-blue” is the allele created by the
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

22

stacking of a red and a blue mutation together at the same locus. SLiM, however, just thinks in
terms of the mutations actually possessed by each genome, so in SLiM’s terms, rather than talking
about genotype, we would simply say that the individual is wild-type (i.e., empty) at the given
position in one genome and has a red mutation and a blue mutation at that position in the other
genome. We could also say that the individual is heterozygous for red and heterozygous for blue;
that is true, but is a bit ambiguous since the same would be said of a red/blue heterozygote that
had a red allele in one genome and a blue allele in the other; unless the red and blue mutations
are epistatic, however, that distinction is probably unimportant. Note that further mutations at the
same position could occur as well, and would stack on top of those already present, making all
this even more complex. Probably the simplest thing is to learn to think in the same terms in
which SLiM thinks: which mutations are present in which genomes.
This behavior of SLiM is perhaps quirky, but usually harmless. It is not common for mutations
to end up stacked in practice, since normal mutation–selection balance in finite populations
usually clears out genetic diversity quickly enough that it is unusual for a new mutation to occur
right on top of another mutation that is still segregating in the population. Furthermore, when
mutations do occasionally stack, it is usually not important to the dynamics of the model; in most
cases the resulting behavior is identical, for practical purposes, to if the new mutation had
occurred at an immediately adjacent base position instead, which would result in the two
mutations being extremely tightly physically linked rather than actually being stacked:

In principle these mutations could be separated by recombination, whereas the stacked
mutations cannot be, but in practice that is unlikely enough that it will probably not make a
difference to the model’s dynamics. It is therefore important to understand that SLiM works in this
way, but in most cases models do not need to concern themselves with stacking.
However, there are cases where it does present problems. If you want to simulate actual
nucleotides (see section 13.12), for example, then each base position must be unambiguously
either A, T, G, or C; it makes no biological sense for an A and a G to be “stacked” at a single
position. Similarly, if you want to make a quantitative-genetics model with particular discrete
quantitative effects for the alleles at each QTL (see section 13.1), such as a −1/0/+1 allelic system,
you would not want mutation stacking to occur since stacking of more than one mutation at a
given QTL would violate your design. The default mutation stacking behavior may therefore be
modified, and indeed, the two example recipes cited above do so.
The stacking behavior of mutations is governed by their mutation type, a concept we haven’t yet
discussed. All mutations in SLiM belong to a mutation type, represented by the Eidos class
MutationType (section 21.9). Mutation types are important primarily because they dictate the
distribution of fitness effects from which mutations are drawn; all of the mutations of a given
mutation type might be neutral, for example, or they might be deleterious and drawn from a
gamma distribution, for example. When a new mutation is generated by SLiM, the selection
coefficient for the mutation is drawn from the distribution specified by the relevant mutation type,
as will be discussed in detail in section 1.5.4. Simulations may define as many mutation types as
desired, but most simulations contain just one or a few mutation types.
Besides defining the distribution of fitness effects from which mutations are drawn, mutation
types define a few other behaviors for mutations too. For example, the convertToSubstitution
property mentioned in section 1.5.2, which determines whether a mutation will be removed when
it fixes, is actually a property of MutationType, not of each individual mutation, since it makes

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

23

sense for this behavior to be uniform across each class of mutations. The dominance coefficient of
mutations is also a property of the mutation type, not individual mutations, for the same reason.
For our discussion of mutation stacking here, however, mutation types are important because
stacking behavior is controlled by the mutationStackGroup and mutationStackPolicy properties of
MutationType (see section 21.9.1); all of the mutations of a given mutation type follow the same
stacking policy. In fact, more than one mutation type can be joined together into a single
“mutation stacking group” using mutationStackGroup, and then all of the mutations in that
stacking group follow the same policy. This is a power-user feature that we will mostly gloss over
for the rest of the discussion here, however; we will just assume that each mutation type
constitutes a separate stacking group (which is the default behavior).
Given this, in the example above the red mutation needs to be of the same mutation type (or in
the same stacking group) as the blue mutation before stacking can be prevented. If they are not of
the same mutation type (or more precisely, not in the same stacking group), then they will stack,
regardless of what stacking policy has been set for each mutation type. The idea is that you
generally don’t want different kinds of mutations to interfere with each other in a model. If a QTL
mutation exists at a given position, for example, then you might want a new QTL mutation at the
same locus to replace the old one, rather than stacking – but if a neutral mutation occurred at the
same locus, you would certainly not want that neutral mutation to replace the existing QTL
mutation! Whatever you might think of that argument, you can always modify it by placing all of
your mutation types into a single stacking group.
So let’s now assume that the red and blue mutations are in the same stacking group – of the
same mutation type, most trivially. Now the stacking policy can be modified. The default policy is
referred to as type "s" (for “stack”), and yields the stacked result we saw above. Instead, you can
set the policy to type "l" (for “last”), which produces this result after the red mutation occurs:

The red mutation has replaced the mutation in this genome, because this stacking policy
dictates that the last mutation at a given position should be kept. This is the common alternative to
the "s" policy, but you can also set a policy of "f" (for “first”) which produces this result:

Here the red mutation has simply been thrown away, because the blue mutation was there first.
The post-mutation state is therefore identical to the pre-mutation state (note that this means the
effective mutation rate will be lower than the requested mutation rate, since some mutations will
be suppressed). In general, this policy retains the first mutation at a given position in a given
genome; new mutations are only allowed in if the position is presently empty. This is less
commonly used, since its biological motivation is unclear, but it is provided as an option in SLiM
just in case it is called for in some unusual situation.
Note that all three stacking policies – "s", "l", and "f" – depend only upon the existing
mutation(s) at the specific base position in the specific genome where a new mutation is occurring.
The state at other positions, or in other genomes, is completely irrelevant. Changing the mutation
stacking policy is not a way to prevent more than one allele from existing in a population at a
given locus; it is only a way to prevent more than one mutation from existing in one genome at a
given locus. Suppose, in our ongoing example, that the new red mutation had occurred in the
other genome of the individual instead; regardless of the stacking policy, the result would then be:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

24

That base position in that genome was empty; the stacking policy is therefore irrelevant, since
there is no pre-existing mutation to stack on top of, so the red mutation is always added. More
generally, suppose we have a population of three individuals, with a few mutations:

Different new mutations could occur at each of the empty sites at base position 3, regardless of
the stacking policy. So whether the stacking policy is "s", "l", or "f", new mutations could easily
lead to this state:

If the stacking policy were "f", further new mutations at position 3 beyond this could not be
introduced; now that every genome has a mutation at position 3, those pre-existing mutations
would prevent the new mutations from being added. Under stacking policy "l", new mutations
would still be possible at position 3 even in this state; the new mutations would just replace the
old mutations. Under policy "s", the new mutations would stack, the default behavior.
Sometimes it is desirable to allow for the possibility of back-mutation in a model. This can be
accomplished in several ways in SLiM. One way is to use a mutational distribution of fitness
effects that provides only a limited set of possible values (perhaps four values, to represent the four
possible nucleotide at a base position), and use type "l" stacking so that new mutations replace
existing mutations; new mutations are then automatically sometimes back-mutations. See section
13.12 for an example of this sort of strategy. Another possibility, if you only need back-mutation to
the “wild type” empty-genome state, is to remove existing mutations yourself, in script; if done
with the appropriate probability, this could simulate back-mutation to the wild type.
Finally, note that the stacking policy is applied to new mutations introduced in script, as well as
to new mutations added by SLiM as a result of the overall mutation rate. New mutations that you
add will be allowed to stack unless you change the stacking policy; and conversely, if you change
the stacking policy then new mutations that you add might result in the removal of pre-existing
mutations, or might not be added at all, in accordance with the policy you have chosen. However,
when the stacking policy is changed in mid-run it is not retroactively enforced upon existing
mutations, and when a saved population is loaded the current stacking policy is not enforced upon
the individuals loaded.
The issue of mutation stacking is a bit complicated and confusing; if the discussion above has
left you with more questions than answers, rest assured that this is an advanced topic that generally
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

25

does not impact simple models at all. Play around with SLiM for a while, and then when you
revisit this section it will probably make much more sense.
1.5.4 Genomic elements, genomic element types, mutation types, and the chromosome
SLiM allows you to model complex chromosomes with genomic regions that have different
mutational effects. For example, exons would be expected to sustain beneficial, deleterious, and
neutral mutations, whereas introns might sustain only deleterious and neutral mutations, and noncoding regions might only allow neutral mutations. You set up this structure in SLiM using a
hierarchical configuration: the chromosome (class Chromosome, section 21.2) contains genomic
elements (class GenomicElement, section 21.4), which are each of a given genomic element type
(class GenomicElementType, section 21.5), and each genomic element type draws new mutations
from a weighted set of mutation types (class MutationType, section 21.9), each of which specifies a
distribution of fitness effects for that type of mutation. In an exon/intron/non-coding model, for
example, each type of genomic region – “exon”, “intron”, and “non-coding” – would be a
genomic element type, and the chromosome would be a mosaic of genomic elements using these
three genomic element types. Each of these genomic element types would draw its mutations from
a different set of mutation types – perhaps “beneficial”, “deleterious”, and “neutral”, in this case.
In practice, that might look something like this:

Here the chromosome contains two genes, each of which is composed of an alternation of
exons and introns, and non-coding regions are interspersed around the genes. Those regions along
the chromosome are defined by genomic elements, which reference the three genomic element
types defined here (non-coding, exon, and intron). Those three genomic element types each
utilize some subset of the mutation types that have been defined here (neutral, beneficial, and
deleterious). Although this diagram simply shows arrows from genomic element types to mutation
types, there are in fact weights associated with the mutation types of a given genomic element
type; in this case, you might specify that 90% of mutations within introns are neutral and 10% are
deleterious, by giving those weights to SLiM when the intron genomic element type is configured.
In this way, new mutations that are automatically generated by SLiM will have appropriate fitness
effects depending upon the genomic region in which they occur.
Note that this example is entirely arbitrary; you can define whatever mutation types, genomic
element types, and genomic elements you wish. You could make your chromosome entirely
neutral, or you could model the specific mutational effects of each region in an empirical genome
map, right down to the full level of detail known for your study organism. There is no practical
limit to the number of mutation types and genomic element types you may define. In the opposite
direction, you can also make a much simpler model than the one above. Not all of the
chromosome needs to be assigned to a genomic element; much of the chromosome can simply be
empty. For example, you could make a model of epistatic interactions between two loci with a
chromosome like this:
Here most of the chromosome is empty, containing no genomic elements. No mutations will
be generated in these regions, since mutation generation in SLiM is governed by genomic
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

26

elements. Only the purple loci will be active; we have one genomic element type and one
mutation type in this model, and quite possibly only two mutations (one at each locus). A simple
genetic structure like this will often run much faster in SLiM than simulating a whole chromosome;
if you are not interested in what is going on in some regions of the chromosome, just don’t
simulate it. The only limitation in setting up your chromosome structure in SLiM is that genomic
elements must be non-overlapping.
Note also that in general, this chromosomal structure only influences the way that SLiM
generates new mutations. When offspring are generated, the overall mutation rate is used to draw
the number of mutations that have occurred in a given gamete. The position of each mutation is
then drawn, and SLiM determines which genomic element the mutation falls within. Given that,
SLiM can find the genomic element type, and it then chooses a mutation type from the weighted
set of mutation types used by that genomic element type. Finally, knowing the mutation type for
the mutation, it draws a selection coefficient from the distribution of fitness effects for that
mutation type. That yields a new mutation object, which is placed in the gamete. That is the only
time that SLiM uses the genomic elements and genomic element types you have defined; none of
the other machinery inside SLiM’s core cares about those constructs at all. You may consult the
chromosomal structure in your script and use it in whatever way you wish, but doing so would be
quite unusual. By and large, the behavior of SLiM simulations depends upon the mutations
contained by the genomes of individuals, without reference to the chromosomal structure after the
point when new mutations are created.
Mutation types, however, do continue to be used, in a minimal fashion, as we saw in section
1.5.3; each mutation knows its mutation type, and some properties of mutations, such as their
dominance coefficient, their stacking behavior, and their fixation behavior, are specified by their
mutation type.
1.5.5 Subpopulations and migration
The previous section discussed hierarchy levels from the individual to the genome to the
mutation, and the dependence of mutations upon mutation types, genomic element types,
genomic elements, and ultimately the chromosome itself. However, there are also higher
hierarchy levels: individuals live within subpopulations, and all of the subpopulations together
exist within the whole modeled population.
Subpopulations are represented by the class Subpopulation in Eidos. A subpopulation is a set
of individuals, and is characterized primarily by the fact that random mating occurs (weighted by
individual fitness) within each subpopulation. In other words, subpopulations primarily influence
reproductive isolation; each subpopulation is internally panmictic (again, weighted by individual
fitness), but externally isolated. Migration rates can be configured between subpopulations in
order to allow gene flow, but by default the migration rate among subpopulations is zero.
For example, one might have a population structure like this:
p1
p10

p2

p9

p3

p8

p4

p7

p5
p6

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

27

Here we have ten subpopulations linked into a “stepping-stone” model that might represent a
river system; subpopulation p1 is upstream, and relatively high levels of migration produce gene
flow in the downstream direction, while much lower levels of migration exist in the upstream
direction (as shown by the relative widths of the arrows). This is just an example; you can have as
many subpopulations as you wish, linked by any pattern of migration. The effect of the population
structure on SLiM will manifest in the pattern of reproductive isolation among individuals.
The details of this conceptual model can be important. Offspring are always generated by
parents within a single subpopulation; parentals do not migrate between subpopulations for the
purposes of mating (in WF models, to which we are still limiting this discussion; see section 1.6).
Instead, migration occurs at the juvenile stage in SLiM; generated offspring are placed into
particular subpopulations in accordance with the specified migration rates. SLiM allows you to
define spatial simulations involving one-, two-, or three-dimensional landscapes for each
subpopulation; again, even in such spatial models, panmixis (weighted by individual fitness) is the
default, although effects like spatial competition and spatial mate choice preference may be added
in script. The subpopulation is the fundamental unit of reproductive isolation in SLiM.
An important conceptual point is that when setting the size of a subpopulation in SLiM, this
should be thought of as a request for a future change, not as a present-time change. If a
subpopulation contains 800 individuals and the script requests that it be 900 or 700 instead, that
change does not happen immediately (which individuals would be removed? how exactly would
the new individuals be created – with what genetic state?); instead, the subpopulation size change
is a request that will take effect in the next generation. That is, when offspring are generated, 900
or 700 offspring will be generated from the current subpopulation of 800, and the requested
subpopulation size will thus take effect in the child generation. This is always the case in SLiM;
there is actually no straightforward way to create new individuals or kill existing individuals within
a single generation (although the fitness of an individual can be set to zero, which has much the
same effect as killing it, all else being equal).
Another key concept is that zero-size subpopulations do not exist in SLiM. If a subpopulation is
set to size zero, it will (in the child generation) cease to exist entirely. For this reason, a new
subpopulation cannot be created with a size of zero, since that has no meaning in SLiM. Instead,
you should create the new subpopulation with a non-zero size at the moment that its size grows
above zero. This can be inconvenient in metapopulation models that involve dynamic local
extinction and re-colonization; such models are better written as nonWF models, which follow
very different rules (see section 1.6, and section 15.5 for an example of a nonWF extinction/
recolonization model).
1.5.6 Other concepts
That completes our overview of the foundational concepts underlying SLiM. There is a lot more
to SLiM that will be covered in the following chapters, but if you understand these fundamental
ideas the rest should be fairly straightforward.
Other sections of this manual also provide important conceptual information. Section 1.3
contains a brief introduction to some other key SLiM concepts that did not merit an in-depth
discussion here, such as fitness, the life cycle, continuous space, interactions, and tags; it would be
good to read now, if you haven’t already. Part II of this manual, the SLiM Reference, also has some
conceptual sections; chapter 19 contains detailed information on the stages of the generational life
cycle in SLiM for WF models (and chapter 20, for nonWF models), including details on how mate
choice, migration, offspring generation, fitness calculation, and other stages are implemented, and
chapter 22 describes how to modify those default life cycle stages through the use of several
different types of scripted callbacks, which provides much of the power and flexibility of SLiM.
Reading those chapters might be deferred until you have become more familiar with SLiM,
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

28

however, as they are more technical; if you are a beginning SLiM user, it is recommended that you
finish reading this chapter and then proceed with installing SLiM and begin experimenting with the
recipes presented in the chapters that follow.
1.6 Wright-Fisher (WF) versus non-Wright-Fisher (nonWF) models
SLiM 1.x and 2.x supported only one type of model, Wright-Fisher models (henceforth referred
to as WF models). In SLiM 3.0, support has been added for a second type of model, non-WrightFisher models (henceforth, nonWF models). The choice between these types of models is made
with an optional initialization call, initializeSLiMModelType(); if no call to that function is made,
the WF model type is used by default, providing complete backward compatibility with SLiM 2.x.
Use of the nonWF model type is an advanced topic, recommended only for users experienced
with SLiM. Most of this manual will be about the default WF model type. Only three areas of this
manual will discuss nonWF models: section 1.6 (e.g., this section, which summarizes the
differences between WF and nonWF models), chapter 15 (which introduces nonWF models in
more detail, and then presents nonWF model recipes), and the SLiM Reference (Part II of this
manual).
The differences between WF and nonWF models are pervasive, but may be regarded as falling
into a few major categories:
• Age structure. In WF models, generations are discrete and non-overlapping. Each “tick”
of SLiM’s generation counter is associated with the creation of a new offspring generation
and the demise of the previous parental generation. There is thus no concept of the age of
an individual, since all individuals live for a single “tick”. In nonWF models, generations
may instead be overlapping. Each “tick” of SLiM’s generation counter is associated with
the opportunity for creation of new offspring and the opportunity for mortality among
existing individuals. SLiM keeps track of the age of individuals, which may live for many
“ticks”. This makes it simple to construct models of overlapping generations with any type
of age structure and any age-related behaviors desired.
• Offspring generation. In WF models, offspring are generated automatically by SLiM each
generation. This process may be modified by various callbacks, but the process itself –
how many offspring to generate, from which parental individuals, into which
subpopulations – is managed by SLiM in such a way as to “fill out” each subpopulation
with a fresh batch of offspring while satisfying the constraints imposed by parameters such
as subpopulation size, sex ratio, cloning rate, and selfing rate. In nonWF models,
offspring are instead generated in response to a request from the model script, made in a
reproduction() callback, and the script itself is in charge of managing the process. In
nonWF models, SLiM no longer attempts to enforce any particular subpopulation size, sex
ratio, cloning rate, or selfing rate; typically, these are instead emergent properties of the
individual-based dynamics of the model. This approach is somewhat more complex, but
allows the genetics and state of each individual to influence the way that individual
reproduces – its expected litter size, its reproductive behavior (cloning, selfing, biparental
mating), the sex of its offspring, and so forth.
• Population regulation. In WF models, population regulation (keeping population size
within bounds) is managed automatically by SLiM, which keeps each subpopulation at the
initial size it is given, or at whatever new size is set by a setSubpopulationSize() call.
Subpopulation size is therefore a parameter of the model, in effect. In nonWF models,
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

29

population regulation is instead an emergent property, a side effect of how many offspring
are created versus how many individuals die due to selection in each generation. This
makes it much more natural to construct models with realistic population dynamics such
as density-dependence, fitness-dependence, resource limitation (i.e., carrying capacity),
and so forth. In a nonWF model, the parameters of the model that result in population
regulation would more likely be things like carrying capacity, the strength of densitydependent selection, or the size of a limited resource pool.
• Fitness. In WF models, all offspring survive to maturity, and fitness specifies the
probability that an individual will be chosen as a parent in the next generation. Higherfitness individuals thus have a larger expected litter size, but fitness in WF models is
relative fitness because the population size is held to whatever size is set by the model as
described above. In nonWF models, fitness influences survival instead (mating success
and fecundity can be modified also, but in script, not through SLiM’s automatic fitness
evaluation mechanism). Fitness differences are thus expressed through the likelihood that
a given individual will survive to maturity. This allows a more realistic modeling of the
variance in reproductive output, as well as simplifying model dynamics that are
influenced by the survival of individuals, such as competition. In nonWF models, fitness
is absolute fitness, which is more realistic but can be more challenging to model since it
forces you to think explicitly about population regulation in concrete, individual-based,
mechanistic terms.
• Migration. In WF models, migration is governed by model parameters set on each
subpopulation, specifying what rate of migration SLiM should enforce with each other
subpopulation. Migration is simulated in these models as the movement of an offspring
individual, immediately after it is generated in the parental subpopulation; thus, only
juvenile migration can be modeled. In nonWF models, migration is instead implemented
in script, by explicitly moving individuals from subpopulation to subpopulation. This can
be done at any time; migration of adults as well as juveniles can be modeled, or multiple
migrations over the course of an individual’s life. This design also makes it much simpler
to implement migration that depends upon the circumstances of each individual: habitat
choice, condition-dependent migration, genetic variation in dispersal, sex differences in
migration behavior, and so forth.
• Subpopulation splits. In WF models, the splitting of a subpopulation is modeled as a new
subpopulation being founded by a wave of such migrant offspring. In nonWF models,
subpopulation splits are modeled as the migration of a set of individuals (of any age) to
form a new subpopulation. Particularly in models with small population sizes, this can
produce more realistic splits, particularly when migration of parental individuals, rather
than juveniles, is desired to found new populations.
The general trend across all of the above points is that nonWF models are more individualbased, more script-controlled, and potentially more biologically realistic – but also more complex
in some respects, because the SLiM core is managing fewer details of the model’s dynamics
automatically. In particular, all nonWF models must implement at least one reproduction()
callback in order to generate new offspring. Each type of model has its appropriate uses; nonWF
models are not “better”, although they are more flexible in some respects. WF models may be
simpler to design, and may run considerably faster; and of course staying within the WF
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

30

conceptual framework may make it easier to compare simulation results with theoretical
expectations from analytical models that are based in the Wright-Fisher paradigm.
As mentioned above, most of the remainder of this manual will discuss WF models, since they
are SLiM’s original mode of operation and remain the default model type. Whenever a given
section does not explicitly state otherwise, it should be assumed that the focus in on WF models.
All of the recipes in Part I of the manual are WF recipes up until chapter 15, which specifically
presents nonWF models.
The reference section of this manual, Part II, will provide reference information covering both
WF and nonWF models. A color-coding convention will thus be used in Part II: items which apply
only to WF models will be highlighted in green, and items which apply only to nonWF models (or
primarily so) will be highlighted in blue. For example:
– (void)a_WF_only_method(...)

and:
– (void)a_nonWF_only_method(...)

Again: nonWF models are an advanced topic. As a beginning SLiM user, it’s good to know that
nonWF models exist, but don’t worry if all of the above information is not clear. You can forget
about nonWF models for now, and focus on familiarizing yourself with the default, WF models.
1.7 Tree-sequence recording
SLiM 3 introduces a major feature called tree-sequence recording. This is essentially a method
of tracking the true local ancestry of every chromosome position in every individual as a SLiM
model runs. Such ancestry information can be saved out to files called .trees files, and can be
loaded in to SLiM from .trees files as long as they are in the correct SLiM-compliant format.
These .trees files can also be loaded into Python, where their ancestry information can be
browsed, analyzed, and even modified using the msprime coalescent simulation package. When
moving data between SLiM and msprime, the pyslim Python package is also essential, since it
knows how to translate some types of SLiM-specific information in .trees files from SLiM into a
form that msprime can work with, and vice versa. Models that use tree-sequence recording will
sometimes be referred to as tree-seq models for brevity.
Use of tree-sequence recording is an advanced topic, recommended only for users experienced
with SLiM. Most of this manual will be about models that do not use tree-sequence recording.
Only three areas of this manual will discuss tree-seq models: section 1.7 (e.g., this section, which
summarizes the concept of tree-sequence recording), chapter 16 (which presents tree-seq model
recipes), and the SLiM Reference (Part II of this manual).
Tree-sequence recording does not record the full pedigree of a model. Instead, it records only
the specific ancestral information needed to reconstruct the mutational and recombinational
history of each extant individual’s genomes. Over time, some information that originally needed to
be recorded in the tree sequence may become unnecessary to keep; perhaps a whole branch of
the evolutionary tree went extinct and so the information recorded about it is no longer relevant,
for example. Through the process of simplification, which is performed periodically upon the
recorded tree sequence, all information that is not relevant to any extant genomes gets pruned
away, which keeps the memory requirements of tree-sequence recording manageable.
This recorded information is referred to as a “tree sequence” because it is literally a sequence of
trees. Conceptually, each position along the chromosome has its own ancestry tree, which is the
result of recombination at that position; as one walks along the chromosome, one encounters a
sequence of such ancestry trees, from which the pattern of inheritance can be traced from every
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

31

extant genome back to the most recent common ancestor for the population at that chromosome
position. The ancestry tree at a given position may have multiple roots, if there is no common
ancestor for all of the individuals in the population at that position; forward simulations begin with
every individual unrelated to every other, and common ancestry is only built over time through the
process of coalescence. The ancestry trees at adjacent chromosome positions are generally highly
correlated, and indeed are often identical (since those adjacent positions may never have been
split by recombination, or traces of such recombination might not have survived in any living
individual). Tree-sequence recording accounts for this correlation, tracking each distinct ancestry
tree along successive chromosome intervals rather than at every position. This is one reason why
the tree sequence is sometimes called a succinct tree sequence; it is much more compact than the
full set of trees for every position, because it omits the redundant information shared between
positions with identical ancestry. This multiplicity of trees is not easy to depict, but here is a sketch
of the concept:
12

11
10
9

9
8

5
1

0

0

7
4

2

5

3

0

3

8
7

4

1

2

5

3

0

5

7
4

1

2

6

3

3

1

5
4

8

0

2

10

Genome coordinates

The intervals between the ticks on the x axis are intervals on the chromosome that have distinct
ancestry trees (this example chromosome is only ten base positions long, and has four intervals
with distinct trees). Each interval’s ancestry tree reflects its particular pattern of inheritance along
the chromosome; however, the trees at adjacent sites tend to be highly correlated, with redundant
information that is represented concisely by the tree sequence data structure. Within each tree, the
leaf nodes at the bottom (labeled 0 through 4) might be extant genomes with no descendants,
whereas the internal nodes might be ancestral genomes that are no longer extant; however, with
overlapping generations things might be less clear-cut, since ancestors might still be extant. As
mentioned above, the tree for a given interval might have multiple roots, if coalescence is not yet
complete; that situation is found in the ancestry tree for the third chromosome interval pictured
above.
This is just a brief summary of tree-sequence recording, and may or may not be
comprehensible; for a full overview of tree-sequence recording, including details of how
simplification works, please refer to the definitive paper on the topic, Kelleher et al. (2018).
Tree-sequence recording enables some very powerful techniques, such as:
• Overlaying neutral mutations. You can run a model without any neutral mutations (which
is generally much faster), and then overlay neutral mutations afterwards. This gains
tremendous efficiency from the fact that neutral mutations only need to be overlaid on
branches of the ancestry tree that lead to extant individuals; neutral mutations on all
branches that went extinct before the end of the simulation do not need to be considered
at all. For models that contain many neutral mutations, this can result in a speedup of an
order of magnitude or more.
• Analyzing ancestry directly. You can sometimes avoid simulating neutral mutations
altogether, when it is actually the pattern of ancestry you are interested in. Often the
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

32

pattern of neutral mutations from a forward simulation is used to infer things about a
population’s history – selection, bottlenecks, immigration, and so forth. Using neutral
mutations for this purpose is a blunt instrument, however, since they are sparse and occur
stochastically; inference from them is difficult, time-consuming, and inexact. Instead, it is
often possible to draw such inferences from the recorded tree sequence itself, which in a
sense embodies every possible mutational history that the population could have had
given its actual history of inheritance and recombination. The inferential power can
therefore be much higher. This means that inferences from the tree sequence can often be
exact, and can also often be computed much more quickly and easily than from neutral
mutations.
• Detecting coalescence during forward simulation. Often you want to “burn in” a model
until it has reached an equilibrium state, which can often be defined as occurring when
full coalescence across the whole chromosome has occurred. The time at which this
happens, however, is often hard to know. A heuristic of “10N” (running for a number of
generations equal to ten times the population size) is often used, but even for simple
models it can be an underestimate. For models with a variable population size, multiple
subpopulations, or non-neutral dynamics of any kind, the proper burn-in duration is often
just a guess, and it is often necessary to make a large overestimation “just to be safe”.
With tree-sequence recording enabled, SLiM can tell you whether your model has
coalesced fully or not; it has all the information needed. You can then use that
information to decide when to end the burn-in period of the model.
• Moving between coalescent and forward simulation methods. Tree-sequence recording
and the .trees file format build a sort of bridge between coalescent and forward
simulation methods, allowing both methods to be used in a single simulation. For
example, a neutral “burn-in” period for a model could be simulated using the coalescent,
and the results saved to a .trees file in the proper SLiM-compatible format using pyslim.
That .trees file could then be loaded into SLiM as the starting state for forward
simulation; perhaps the neutral mutations from the coalescent become non-neutral at that
point in time, for example, due to a change in the environment, and so now one wishes to
simulate the resulting non-neutral dynamics. Since the coalescent is so fast, this can result
in much quicker burn-in compared to simulating the burn-in with SLiM. Overlaying
neutral mutations after a run, mentioned above, is another strategy that moves between
coalescent and forward simulation methods. Moving between methods in this manner
allows the strengths of each method to be leveraged, while avoiding their weaknesses. As
mentioned above, the pyslim package is essential to this sort of interoperability, as we will
see in chapter 16. The pyslim package is not yet able to annotate mutations that were
overlaid with msprime in a way that renders them SLiM-compatible, unfortunately; this
planned feature is still being implemented. Several other ways of moving .trees data
between SLiM and msprime using pyslim have been implemented, and will be shown in
chapter 16.
• “Recapitating” to construct the ancestral history of a simulation. With this technique,
you could run a model forward with no neutral burn-in period at all, from a set of empty
genomes, and then reconstruct the ancestry of the initial genomes using the coalescent
after the forward simulation has completed. This technique, called recapitation, is much
more efficient even than generating a burned-in state using the coalescent directly,
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

33

because only the ancestry trees present at the end of forward simulation need to be
coalesced back; any part of any genome that was present in the initial state of the model
but was later lost does not need to be coalesced, so most of the work of burn-in can be
avoided. As before, neutral mutations can be overlaid at the end of the model run, after
recapitation has constructed the ancestral history, rather than having to be simulated. This
technique can allow very large models to be simulated relatively efficiently, since most of
the work of burn-in can be eliminated. Support for this technique is presently being
added to the msprime Python package; this feature will be rolled out in a later release of
SLiM.
Tree-sequence recording has a large impact on the performance of models, in terms of both
runtime and memory usage. It is therefore disabled by default, and needs to be enabled with a
call to an initialization function, initializeTreeSeq(). When it is enabled, you can control the
frequency of simplification (more frequent simplification means a lower memory footprint but a
longer runtime), and you can turn on checking for coalescence if your model needs to know when
full coalescence has been attained (for a small additional performance penalty on top of each
simplification).
Again, tree-seq models are an advanced topic; it is good to know that tree-sequence recording
is an option, but beginning SLiM users should postpone trying to use it until they have become
fairly familiar with the basics of SLiM. However, if you’re finding that performance is an issue, and
you’re bogged down simulating huge numbers of neutral mutations or running endless burn-in
periods, tree-sequence recording may be able to help. Similarly, if you want to detect
coalescence, or obtain a record of the ancestry at every chromosome position, tree-sequence
recording can provide a solution. Tree-sequence recording also provides an output format for
saving simulation state that can be much more compact than either VCF or SLiM’s native output
file format (even though it includes information about ancestry that those file formats do not), and
that can be analyzed and manipulated in Python with pyslim; these advantages can also be a
reason to use tree-sequence recording.
1.8 Online resources for SLiM users
Users of SLiM should be aware of various online resources that are available to support their
work. This section summarizes them and provides links.
• First of all, there is the SLiM home page at the Messer lab website. This is the primary
place from which to download SLiM, and provides a history of SLiM releases as well as a
list of publications that have used SLiM. https://messerlab.org/slim/
• The slim-announce mailing list is only for announcements from our group, such as new
versions of SLiM, new SLiM publications related to SLiM, and plans for conferences where
you could connect with us. https://groups.google.com/forum/#!forum/slim-announce
• The slim-discuss mailing list is for questions from SLiM users that might be of general
interest. Please feel free to post your own questions – and even to answer other people’s
questions, if you can. https://groups.google.com/forum/#!forum/slim-discuss
• The SLiM-Extras GitHub repository is a place for the SLiM community to share useful
resources related to SLiM. This could include reuseable Eidos functions (see, e.g., section
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

34

11.1), full SLiM models that might be of general interest, code for reading or analyzing
SLiM output in languages like R or Python, code for sublaunching multiple SLiM runs on
computing clusters (see section 1.4), and so forth. Please feel free to email us your own
submissions for additions to SLiM-Extras. https://github.com/MesserLab/SLiM-Extras
• The SLiM GitHub repository is where SLiM’s source code lives. SLiM is open-source, but
unless you have some specific reason to want to access the source, you probably don’t
need to. As described in chapter 2, users on Mac OS X can install SLiM with the doubleclick installer we provide, and users on Linux will usually want to build from the official
release source archive provided on the SLiM home page. The GitHub repository, then, is
mostly useful for people who want to be on the cutting edge, running the latest version of
SLiM before it has been publicly released. Be aware doing this will mean you are more
exposed to bugs, and that sometimes the sources on GitHub may not even build (although
we try to avoid that). https://github.com/MesserLab/SLiM
That’s what we’ve got for now. As always, please contact us if you have suggestions for
additions that would be useful, or reports of problems with any of the above.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

35

2. Installation
The command-line tool for SLiM is designed to be portable; it is written in pure C++, with no
external dependencies. Code from various third-party libraries such the GNU Scientific Library
(Galassi et al. 2009), Boost (Boost 2015), and msprime (Kelleher et al. 2016) is used in SLiM (see
chapter 27), but that code has been integrated directly into SLiM, so those libraries are not
required to build or run SLiM. SLiM should be buildable on Mac OS X, Linux, other Un*x
platforms, and – possibly with minor modifications – on other platforms as well. The SLiMgui
application, on the other hand, is written for Mac OS X only, in Objective-C++ using Apple’s
Cocoa library, and is available only on that platform. The steps to install SLiM depend upon the
platform you are on; please refer to the appropriate subsection below.
2.1 Installation on Mac OS X
On Mac OS X you have a choice between (1) installing a prebuilt package containing the slim
command-line tool and the SLiMgui application (as well as EidosScribe, the eidos command-line
tool, and manual PDFs), or (2) cloning the SLiM GitHub repository and building the projects
yourself using Apple’s Xcode development environment. Unless you are an experienced
developer, the first option is recommended. Building SLiM on OS X at the command line is not
recommended, although it is possible; use of the Xcode project will provide the proper compiler
settings, etc., for a standard OS X build.
2.1.1 Installing the prebuilt SLiM package on Mac OS X
This is quite straightforward. First, download the installer package from SLiM’s home page at
http://messerlab.org/slim/:

If multiple versions are available, be sure to download the appropriate version for your system.
Once it is downloaded, double-click the package in the Finder to run the installer. Click through
all steps in the installer, and SLiM will now be installed on your system. The applications (SLiMgui
and EidosScribe) will be installed in your /Applications folder; the command-line tools (slim and
eidos) will be installed in /usr/local/bin/. Pre-existing Terminal windows may not find slim and
eidos; open a new Terminal window to get the updated paths.
2.1.2 Building SLiM from sources on Mac OS X
This option is relatively complex. If you are not an experienced developer it is recommended
that you install SLiM using the prebuilt package instead (see the previous section). There is no
advantage to building SLiM from sources unless you wish to run it under the debugger, modify its
code, or other similarly advanced tasks, or wish to run the current development version of SLiM.
First of all, you need to have Xcode and Apple’s other developer tools installed on your
machine. SLiM’s project is generally kept synchronized with the current version of Xcode; older
versions of Xcode may be unable to open the SLiM project. To run the current version of Xcode,
you generally need to be on the current version of OS X as well, so you may need to upgrade your
OS X installation before installing Xcode. Once you have done that, you will need to obtain and
install Xcode itself. This will probably involve registering as a developer with Apple (which is free,
for the basic level) at their developer website, developer.apple.com, and then downloading and
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

36

installing the developer tools package including the latest version of Xcode (which may be possible
through the App Store, otherwise through the developer website’s Member Center). It is quite a
large download, so do it through a fast connection. Once you have completed this step, you
should see Xcode.app in your /Applications folder, and you should be able to launch it by
double-clicking it.
Second, you need to obtain the SLiM source code. If you are following these instructions, you
probably want to obtain the current development version of SLiM from its GitHub repository at
https://github.com/MesserLab/SLiM. Go to that URL, click the big green “Clone or download”
button, and then click “Download ZIP”:

Locate the downloaded file and double-click it to decompress the zip file if that has not already
been done for you automatically by your browser. This distribution will include the sources to
allow you build both the slim command-line tool and the SLiMgui interactive graphical modeling
environment. Be aware that the current development version on GitHub may not be thoroughly
tested – indeed, it may not even compile.
Third, open the SLiM project in Xcode. To do so, locate the Xcode project file within the
archive you have just downloaded; it is at the root level, with the name SLiM.xcodeproj. Doubleclick it to open the project in Xcode. You should see a big project window.
Fourth, choose a build target. A target is basically a product that an Xcode project knows how
to build. The SLiM project has four targets that may be selected: SLiM (the slim command-line
tool), SLiMgui, EidosScribe (an interactive Eidos script development environment), and eidos (the
Eidos interpreter command-line tool, eidos). For now, select SLiMgui. You do this in the pop-up
near the upper left of the project window:

Fifth, build and run the selected target by pressing command-R (which is the Run command in
the Product menu). It should build fairly cleanly (perhaps apart from some nib warnings that are
difficult to eliminate). Once it finishes building (which may take several minutes, depending on
your machine), SLiMgui should launch automatically. Assuming that worked, quit SLiMgui for
now, as we are not quite done setting things up in Xcode.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

37

Sixth, there is an additional twist: the build configuration and other build scheme options.
These options are accessed by choosing the “Edit Scheme...” menu item that can be seen in the
pop-up menu in the above screenshot. Choose this, and you will see a sheet:

In this screenshot, the “Run” action has been selected on the left, so we are configuring what
Running this target (the SLiMgui target) will do. At the top, the Info tab is selected, which provides
the most basic configuration options. Finally, the pop-up menu next to the label “Build
Configuration” has been clicked in order to choose the Release build configuration. In general,
you will want to build and run SLiM and SLiMgui using the Release build configuration unless you
have a specific reason to wish to do otherwise; Release builds are much faster than Debug builds,
partly because of optimization, and partly because additional runtime checks are turned on in
SLiM’s code when building in the Debug configuration. After choosing Release, you can click the
Close button, and command-R (run) will now build and run a Release build of SLiMgui.
If your goal is to run SLiMgui, you can simply run it from within Xcode; there is no particular
disadvantage to doing so. If your goal is to run the slim command-line tool, doing so from within
Xcode is a little bit inconvenient (since it is a little bit complicated to supply command-line
arguments to it, for example), so the simplest course is to build slim in Xcode and put the built
executable wherever you want it to be so that you can run it in Terminal as usual.
To do this, first select the SLiM target from the target pop-up in the project window, as described
above (since the SLiMgui target is presently selected, if you have been following along). Second,
choose Edit Scheme... again and make sure that the SLiM target is using the Release build
configuration for its Run action, as described above (since we set that for the SLiMgui target above,
not for the SLiM target – each target has its own settings), and close the Edit Scheme... sheet.
Third, press command-B to build the target (this is the Build command in the Product menu).

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

38

Once the build completes, you will want to locate the built product in the Finder. To do this,
first select the Project Navigator in the project window by clicking the folder icon at the left edge
of the strip of icons at the top left of the project window:

Then click disclosure triangles to open the SLiM top-level item and the Products subfolder in the
outline view shown there:

The slim product may be shown in red even if it has been successfully built; this appears to be
a bug in Xcode. Right-click (i.e., click with the right-hand mouse button) or control-click (i.e.,
hold down the control key and click) on the slim product, and choose “Show in Finder” from the
context menu displayed, as shown in the screenshot above. Your computer should switch to the
Finder and open a new window in the Finder, showing the contents of a folder that is probably
named “Release” (because the Release build configuration has been chosen). A file named slim
should be selected in this window. This is the built executable for the slim command-line tool.
(There are other, more standard ways to get to this point, but they are a bit more complicated.) You
should copy this file to whatever location you wish, and then run slim using that copy. The
standard install location for slim is at /usr/local/bin/, but since this is a custom build it might be
wise not to put it at that location to avoid confusion. Instead, a location like ~/bin/ inside your
home directory might be appropriate (you might need to create this folder first, in the Finder). In
any case, once you have installed slim at the desired location, you can open a Terminal window
and cd to that location (your Terminal shell prompts may look different from mine, of course):
darwin:~ bhaller $ cd ~/bin
darwin:~/bin bhaller $

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

39

Then you can run the local copy of slim that is at that location:
./slim 

The syntax ./slim tells Terminal to run the copy of SLiM in the current directory, rather than
running the standard version that might be installed at /usr/local/bin/ on your machine. You
can verify that the correct version of SLiM is being run using the -v (version) option:
darwin:~/bin bhaller $ ./slim -v
SLiM version 2.3, built Apr 18 2017 17:20:34

The build date and time should correspond with the build you just did in Xcode; if not, you
have done something wrong and should re-check the steps above.
You can in fact follow the same steps to locate the built SLiMgui application in the Finder and
install it somewhere on your machine for later use; the ~/Applications/ folder might be a good
choice of location. However, it is probably simpler to run it from within Xcode.
Obviously Xcode is a very complicated application, and we can’t possibly provide a thorough
introduction to using it, but the steps above should allow you to build and run both slim and
SLiMgui using the current development head sources from GitHub. If you run into any problems
with these instructions, please let us know.
2.2 Installation on Linux and other Un*x platforms
Details of building SLiM may vary depending on the platform, but the basic gist should be the
same. First, you need to obtain the SLiM source code. It is recommended that you obtain the
code for the latest supported release, from SLiM’s home page at http://messerlab.org/slim/:

However, you may also obtain the current development version from SLiM’s GitHub repository
at https://github.com/MesserLab/SLiM. Be aware that the version on GitHub may not be
thoroughly tested – indeed, it may not even compile.
The following steps have changed considerably with the release of SLiM 3, since we have
switched from using the make build tool to using a tool called cmake that is considerably more
modern (but also a bit more complicated to use). The cmake tool is often preinstalled on Linux
systems, but if it is not installed on your machine, you will need to install it before proceeding.
You can tell whether cmake is installed on your system by executing this in a terminal window:
which cmake

If a path is printed in response, cmake is installed; if not, it needs to be installed. On Mac OS X,
cmake can be installed using MacPorts (https://www.macports.org) or Homebrew (https://brew.sh);
which installation system you use seems to be largely a matter of taste, although you can read
strong opinions in both directions online. On Linux and other Un*x systems, the cmake home page
at https://cmake.org/download/ provides downloadable source packages with installation
instructions. Apologies for this additional complication; with the integration of more third-party
code into SLiM, using make directly just became too unwieldy. The cmake tool builds a makefile for
us, which can then be used to build with make, as we will see next.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

40

With cmake installed, go to a new Un*x shell window and type:
cd SLiM
cd ..
mkdir build
cd build
cmake ../SLiM
make slim

The first cd command changes the current directory to your SLiM source directory (you will
need to supply the appropriate path there instead of just SLiM), and the next cd command changes
to the enclosing directory (of course you can just cd there directly). We then use mkdir to create a
build directory adjacent to the SLiM source directory in the filesystem; this keeps all of the cruft
associated with the build process separate from SLiM’s sources. We cd into that build directory,
and then tell cmake to update its information regarding SLiM. The last command actually builds
the slim command-line tool. Note that SLiM uses C++11 extensions; the g++ compiler installed
on your machine must be recent enough to support C++11 or your build will fail. The slim tool
will appear as an executable file in the build directory once the build is finished (not, as in SLiM
2.x, in a bin directory inside the source directory). You can copy the built executable to a different
location, such as /usr/bin/, if you wish; the standard location for user-built executables depends
upon your Un*x flavor, so consult your documentation if you wish to install it in a standard
location.
The eidos command-line tool can also be built on Un*x platforms following the same
procedure, with the final command make eidos; a simple make command will build both. The
SLiMgui and EidosScribe targets are buildable only on Mac OS X using Xcode (see section 2.1.2).
Once cmake’s information has been set up, rebuilding after minor source changes can generally
be done just by re-executing the make slim command in the build directory. If you make major
changes to the SLiM sources – and in particular, if you add or remove source files, or do a
git pull of new changes from GitHub – you might need to tell cmake to update its information
before you run make to rebuild the project. This is necessary because SLiM’s CMakeLists.txt
configuration file uses a cmake feature called GLOB to collect a list of source files to be used for
building, and that means that cmake cannot automatically re-update in all cases. If you make such
changes to the sources, you should do the following:
cd SLiM
touch ./CMakeLists.txt

Then cd to the build directory and execute make slim to rebuild; cmake’s cached information
will be rebuilt, since the configuration file was touched, so you do not need to re-run cmake.
The build instructions above use a build directory named build, and use a default “build type”
with cmake (a Release build), and this will suffice for almost all SLiM users; you need not read
further unless you really want to complicate your life. In fact, however, you can name your build
directory whatever you wish, and you can have more than one build directory if you wish; and you
can specify a build type of either Release or Debug with cmake. For example, you could do
something like this to make a build directory named Release, using the Release build type:
cd SLiM
cd ..
mkdir Release
cd Release
cmake -D CMAKE_BUILD_TYPE=Release ../SLiM
make slim

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

41

A Release build has optimization turned on and debugging symbols turned off, giving you a
lean, fast executable suitable for most purposes. This is cmake’s default when building SLiM; the
original build instructions above produced a Release build since no build type was specified.
Similarly, you could make a Debug build in a directory named Debug:
cd SLiM
cd ..
mkdir Debug
cd Debug
cmake -D CMAKE_BUILD_TYPE=Debug ../SLiM
make slim

A Debug build has optimization turned off and debugging symbols turned on, giving a large
symbolicated build suitable for runtime debugging. Debug builds also have some additional
runtime error checking turned on in SLiM, designed to catch a variety of internal error states that
might cause a Release build to crash or produce incorrect output. Most users, however, will be
fine with just the standard build directory named build and the default cmake build type.
2.3 Installation on non-Un*x platforms
Building and installing SLiM on non-Un*x platforms should be reasonably straightforward since
the code is standard C and C++. However, there may be minor differences that lead to compile
and/or link problems, particularly on non-POSIX-compliant operating systems such as Windows.
We will probably not be able to give you any useful help in solving these problems; we know
nothing about programming on Windows, for example. If you work through such problems and
have useful patches for us to allow the project to build on a new platform (using #ifdef or similar
for conditional compilation), feel free to send us a pull request on GitHub.
Overall, the build steps should be similar to the sections above: you will download the sources
or clone the GitHub repository (see above), and then you will execute compile commands with (1)
a flag indicating the C++11 standard, which is required for SLiM, (2) a nice high optimization level
such as -O3, (3) all of the source files necessary, and (4) the appropriate header search paths. You
can look at the project’s CMakeLists.txt file for further guidance.
2.4 Testing the SLiM installation
Regardless of your platform or method of installation, it is a good idea to run some self-tests
after installing SLiM. In your Terminal window, Un*x shell window, or platform equivalent, run the
following two commands (while in the build directory where the slim executable resides):
./slim -testEidos
./slim -testSLiM

Each command should print a result line beginning with SUCCESS and then a count of successful
self-tests. If any other lines print, indicating failure of a test, you should probably contact us to ask
how to proceed.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

42

3 Running simulations in SLiMgui
This chapter introduces the SLiMgui graphical development environment for SLiM. This app is
available only on Mac OS X; on other platforms you will need to run SLiM directly from the
command line, in which case you can skip ahead to chapter 4.
On OS X, SLiMgui is typically located in /Applications/ if you installed from the prebuilt
package; if you built SLiM from sources, you may run SLiMgui directly from Xcode.
3.1 The SLiMgui simulation window
When you first launch SLiMgui, you should see a window similar to this:

The main sections of the window are indicated with red numbered circles; they are:
1. The scripting pane. This is where input commands for SLiM – in the form of an Eidos script –
are entered. A simple script is given by SLiMgui by default, as a starting point. At the top of this
pane you can see the pane’s title, “Input Commands”, and to the left of that title are buttons for
four commands: Check Script, Script Help, Show Eidos Console, and Show Eidos Variable Browser
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

43

(all of the buttons in the window have tooltips, so you can just hover the mouse over them to be
reminded of what they do). The use of these buttons will become clear in later sections.
2. The output pane. This is where run output from SLiM – diagnostic output as well as output
explicitly generated by your simulation script – is shown. At the top of this pane you can see the
pane’s title, “Run Output”, surrounded by buttons for four commands: Clear Output, Dump
Population State, Add Output Command (a pop-up menu button), and Show Graph (also a pop-up
menu button). Again, these buttons will be covered in later sections.
3. The population view. This is a table view that displays all of the subpopulations in the current
simulation population. Since the simulation has not yet been started, there are presently no
subpopulations, and thus the table is empty. This will be covered in more detail once we have a
running simulation. Directly below the table view are eight buttons: Change Subpopulation Size,
Remove Subpopulation, Add Subpopulation, Split Subpopulation, Change Migration Rates,
Change Selfing Rate, Change Cloning Rate, and Change Sex Ratio. Again, these buttons will be
covered in later sections.
4. The individual view. This is where individual “organisms” in the simulation are displayed
from the subpopulation(s) selected in the population view. At present this is empty since there are
no individuals to display.
5. The generation controls. The three large buttons are, from left to right, for the commands
Step (to run forward one generation), Play (to run forward continuously), and Recycle (to reset back
to the initialization stage of the simulation). The slider below the Play button controls the speed at
which the simulation runs; usually you will want this to be the maximum speed, but occasionally
it is useful to slow the simulation down to better see what is happening on short timescales.
Finally, the Generation textfield shows the generation that is about to execute; if you press the Step
button, the generation shown will execute. At present the simulation is paused prior to the
execution of initialize() callbacks, a special initialization state that happens first before the
simulation begins to run.
6. The color controls. The two color stripes show the colors that will be used by SLiMgui to
indicate different fitness values (for individuals) and different selection coefficients (for mutations).
In both cases, yellow is always neutral, but the scale of the color ramps around yellow can be
adjusted using the vertical sliders to the right, in order to make the color scale better reveal fine or
coarse differences in fitness and selection coefficients in your simulation.
7. The chromosome views. The two wide stripes show views onto the simulated chromosome
(empty at present since the simulation has not yet started); these will be discussed later. To the
right of those views are buttons for six commands. In the top row are buttons for Add Genetic
Command (a pop-up menu button) and Show Details (a toggle button that shows and hides a
drawer). Below those is a cluster of four buttons labeled R, G, M, and F; these toggle visibility on
and off in the chromosome views of, respectively, rate maps (for recombination and mutation),
genomic elements, mutations, and fixed mutations (i.e., substitutions). By default only mutations
are shown.
This is a lot of information, and to avoid drowning you in details we will not cover all these
elements in detail now; you will see them again later as they come up in context.
3.2 The script help window
Before moving on to writing your first Eidos script for SLiM, there are a few other components of
SLiMgui that it might be useful to briefly mention. One such is the script help window. If you
click the Script Help button ? , the script help window will open:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

44

This window can be used to browse information about both Eidos and SLiM. A list of top-level
topic headings appears on the left; you can click on these headings to expand them to show the
sub-headings they contain, which may in turn be expanded to show sub-sub-headings and so
forth. When you click on a “leaf” – an actual topic – the information about that topic will be
shown in the pane on the right. You can also use the search field at the top of the window to
search for information about a given topic; note that there is a pop-up menu under the magnifying
glass that allows you to choose whether the search is done on topic titles only (for fewer and
probably more relevant results), or is done on the full help text of each topic (for more results).
A useful shortcut: option-click on anything in the Eidos script of the simulation window’s input
area to pop up the script help window with search results for the item you option-clicked. This is
very useful for quickly looking up functions, language keywords, and other such topics.
3.3 The Eidos console
Clicking the Show Eidos Console button

will show the console (or hide it, if already open):

We won’t go into detail about this now, since you are not yet familiar with Eidos at all, but in
essence, this console is a window where you can execute arbitrary Eidos commands at any point
in your simulation. Commands are entered, and their output is shown, in the console pane on the
right-hand side of the window. The left-hand side is a scratch area where you can work on longer
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

45

blocks of script. If you want to experiment with Eidos and try out ideas interactively, this is the
place to do it; so as new Eidos concepts are introduced in this manual, you may wish to return to
the Eidos console to play around with them.
3.4 The Eidos variable browser
Finally, clicking the Show Eidos Variable Browser button
variable browser that looks like this:

will show (or hide, as appropriate) a

Only a few variables are presently defined; these are all constants – unmodifiable values
defined for your use by Eidos (or by SLiM, but the constants shown are all Eidos constants). These
constants include the values T and F representing true and false, the values PI and E representing
those mathematical constants (3.1415... and 2.7182...), and some others; these will be
discussed as they come up.
Constants are shown in gray in the variable browser since they are unmodifiable. The variable
browser will also show variables that you define yourself (in black rather than gray). For example,
you might try now entering x=10 at the console prompt in the Eidos console window. After you
press return to execute that command, you will see the variable x appear in the variable browser,
defined with a type of integer, a size of 1 (because 10 is a single value, as opposed to a vector of
values), and a value of 10.
We will return to the variable browser in later sections to examine some of the objects defined
by SLiM. In general, however, just be aware that the variable browser exists, and that you can use
it to examine all of the Eidos variables that you will be working with in later sections.
3.5 Automatic code completion and command syntax lookup
In the next chapter, we will start exploring a simple neutral model in SLiM. As you start writing
scripts like that, you will sometimes find it difficult to remember the name of something you need
to use in your script – a function, a property, or a method. We are getting a bit ahead of ourselves,
but there is a final feature of SLiMgui to discuss here. Suppose you were writing a script, but you
couldn’t remember the name of the initializeGeneConversion() function. You could look it up
in the documentation, but there is an easier way. First click in your script at the appropriate spot
inside an initialize() callback, and then start typing with init since you know you’re looking
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

46

for an initialize...() function. Now press the  key. This requests code completion from
SLiMgui. In response, you should see something like this:

This is a pop-up list of all of the things Eidos knows about that start with init. Conveniently,
is even selected for you, since it happens to be alphabetically first in
the list. You can simply press return or click outside the box to accept the default choice; you
could also double-click a different choice, or use the arrow keys to move to a different choice.
Having accepted initializeGeneConversion() as your choice, you might now also be unable
to remember what its parameters are. Simply click between the parentheses of the function call,
and then look at the status bar at the bottom of the window:
initializeGeneConversion()

The function signature shown there reminds you of the parameters needed, including their types
and their names. Section 4.1.3 has further discussion about this feature, getting into details of the
structure and symbolism of the function signature, which would have little meaning before we
have started scripting.
As described in section 3.2, there is a script help facility provided in SLiMgui, which can be
called up with an option-click on any part of your script. If the function signature shown above is
not sufficient to jog your memory, and you want to see the full documentation for the
initializeGeneConversion() function, just option-click on the function name,
initializeGeneConversion, and the script help window will come up showing the results of a
search on that term:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

47

The full documentation for initializeGeneConversion() is shown, which should clarify any
outstanding questions.
The code completion (with ) and context help (with option-click) features work for
properties and methods as well. This may not mean much to you yet, since functions, properties,
and methods have not yet been formally introduced, but keep all this in the back of your mind as
you start exploring scripting in chapter 4.
3.6 Automated script generation
In section 3.1, a few controls were skimmed over with a promise that they would be covered in
later sections. Here we will cover some of those controls: various buttons and pop-up menus that
add automatically generated script blocks to the simulation.
SLiMgui provides quite a few different facilities for automated script generation. Through the
buttons under the subpopulation table, you can change the size of a subpopulation, remove or add
subpopulations, split subpopulations, change migration rates, change selfing and cloning rates, or
change a subpopulation’s sex ratio. Through the pop-up menu to the right of the chromosome
view, you can define a new mutation type or genomic element type, add a genomic element or a
recombination interval to the chromosome, or make your simulation sexual rather than
hermaphroditic (and even simulate a sex chromosome rather than an autosome). Finally, through
the pop-up menu at the top of the Run Output area, you can add output of the full population
state, output of a sample from a subpopulation, or output of a list of fixed mutations. You can see
these three access points for automated script generation in the screenshot below:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

48

All of these automated script generation commands will not be covered individually here, as
that would be too repetitive and tedious. The buttons have tooltips (visible if you hover the cursor
over them), and the menu items are named in a way that indicates their function, so you can easily
find the command you need. All of these commands work in essentially the same way, so we will
take just one of them as a representative example. The leftmost button under the subpopulation
table is the Change Subpopulation Size button. If you recycle and click it, you will likely see
something like this:

This illustrates the first point about SLiMgui’s automated script generation: it depends upon the
current state of the simulation. Immediately after doing a Recycle, no subpopulations have been
defined in the current simulation, and so SLiMgui has no information about which subpopulations
exist (and thus might be resized). The first step in using these facilities is therefore to get yourself
into a state in which SLiMgui has enough information to make modifications to the objects you are
interested in. In this example (using SLiMgui’s default model), stepping forward twice (once to
execute the initialize() callback, and another to execute the generation 1 event that defines
subpopulation p1) provides that information. Now if you click the Change Subpopulation Size
button, you should see a more useful panel:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

49

Other automated-script-generation panels will differ in their specifics, but the idea is always the
same: you supply the information needed for the script generation, and then you either do an
Insert Only operation (generating the script block but leaving the current state of the simulation
unchanged) or you do an Insert & Execute operation (generating the script block and putting it into
effect in the current simulation). In this case, suppose we enter a value of 5 for “Generation for
resize”, and then click “Insert & Execute”. The first thing you will note is that a new script block
has been inserted into your script by SLiMgui:
5 {
p1.setSubpopulationSize(1000);
}

This simply implements the requested change using an Eidos event. The other thing you will
notice is that this event is active in the current simulation, with no need to recycle. If you step
forward through generation 5, subpopulation p1 will resize to 1000 individuals as requested (see
section 5.1.1 for a proper introduction to the setSubpopulationSize() method). This allows you
to develop the dynamics of a simulation as you go, adding events one by one without needing to
recycle and step forward to see the effects of each event added.
If you choose a generation for the event that is prior to the next generation to be executed,
SLiMgui will display a warning that the event cannot be executed in the current simulation
(because its target generation has already come and gone, and it is not possible to modify the past
simulation state retroactively). In this case, the Insert & Execute button will change to Insert &
Recycle, so that you easily restart the simulation with the new event in effect.
Finally, it is worth noting that the automated script generation always provides you with a new
event or callback, even if the command could be merged into an existing event or callback. For
example, if we use the Add Mutation Type command to make a new m2 mutation type (see section
4.1.3), we will get a new initialize() callback like this:
initialize() {
initializeMutationType(2, 0.5, "f", 0.5);
}

Almost certainly, you will want to copy the call to initializeMutationType() into your main
callback and delete the extra callback provided by SLiMgui; it is quite unusual to
actually want to have more than one initialize() callback in a script, although it is legal.
SLiMgui can’t guess exactly where the call ought to be inserted, however, so it leaves that to you.
initialize()

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

50

3.7 Script prettyprinting
As you start writing your own scripts in SLiMgui, you may find that they look a bit raggedy
because their line indentation doesn’t follow standard coding conventions, like this:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
subpopCount = 5;
for (i in 1:subpopCount)
sim.addSubpop(i, 500);
for (i in 1:subpopCount)
for (j in 1:subpopCount)
if (i != j)
sim.subpopulations[i-1].setMigrationRates(j, 0.05);
}
10000 late() {
sim.outputFull();
}

This is a contrived example (based on the recipe in section 5.3.2), but you really will find that
your script indentation starts to suffer as you change the structure of your code, moving code from
block to block, wrapping some code in loops and unwrapping other code, and so forth.
Indentation issues can make your code hard to read and maintain, but manually fixing the
indentation of each line is a hassle. Instead, you can just use SLiMgui’s automatic script
prettyprinting facility. Select “Prettyprint Script” from the Script menu, or click the
button just
above the scripting pane, and your code will be reformatted:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
subpopCount = 5;
for (i in 1:subpopCount)
sim.addSubpop(i, 500);
for (i in 1:subpopCount)
for (j in 1:subpopCount)
if (i != j)
sim.subpopulations[i-1].setMigrationRates(j, 0.05);
}
10000 late() {
sim.outputFull();
}

Only the line indentation is changed; all other aspects of your coding style are preserved,
including blank lines, brace style, spacing around operators, and so forth.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

51

3.8 Further SLiMgui features
SLiMgui has many more features, but they won’t make much sense until we have gotten into
writing scripts. For future reference, here is a cross-reference to some sections that present more
features of SLiMgui:
Running a simulation: section 4.1.9.
Viewing script block registration/deregistration: section 5.1.2.
Executing a simulation up to a specified generation: section 5.1.2.
Visualizing population structure and migration: sections 5.1.3 and 5.3.3.
Using the Play speed slider: section 5.2.2.
Viewing recombination regions: section 6.1.1.
Viewing the sex ratio, cloning rate, and selfing rate: sections 6.2.2, 6.3.1, and 6.3.2.
Viewing the registered mutation types and their distributions of fitness effects: section 7.1.
Viewing the registered genomic element types: section 7.2.
Viewing genomic elements and selecting a subrange of the chromosome: section 7.3.
Customizing display colors for mutations, genomic elements, and individuals: section 7.4.
Graphing of mutation and population dynamics in SLiMgui: chapter 8.
Alternative population display modes: section 12.3.
Haplotype display in the chromosome view, and haplotype plots: section 13.5.
Finally, note that SLiMgui knows the recipes presented in this cookbook, and can open them for
you directly. Just choose the recipe you want from the “Open Recipe” menu under SLiMgui’s File
menu, and the recipe will open in a new SLiMgui window.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

52

4. Getting started: Neutral evolution in a panmictic population
This chapter will introduce the basic concepts involved in making, configuring, running, and
obtaining output from a simple neutral simulation.
4.1 A basic neutral simulation
There is a tradition in computer programming of introducing a new language by writing a
“hello, world” program. In Eidos, this minimal “hello, world” program is quite simple:
print("hello, world");

This single-line program calls a built-in function named print(), which prints whatever you tell it
to. Here, print is called with a single argument, the string "hello, world", enclosed in quotes
that indicate it is a string. The semicolon at the end indicates to Eidos that the statement – the line
of code – ends at that point. If you are using SLiMgui on OS X, you can open the Eidos console
window now and enter this command at the prompt; after you press return, "hello, world" will
be shown as output. In SLiM, we use Eidos as a tool for building and controlling SLiM simulations;
for a complete introduction to Eidos, see the manual Eidos: A Simple Scripting Language.
That was a minimal Eidos program; now let’s look at a minimal SLiM simulation script, which is
a bit more involved. We’ll start with a basic neutral simulation that models a genomic region of
length 100 kb in a population of 500 diploid individuals, evolving over 10000 generations.
Neutral mutations occur uniformly in this region at a rate of 10-7 per bp per generation.
Recombination also occurs uniformly at a rate of 10-8 per bp per generation (corresponding to 1
cM/Mbp). The Eidos commands for specifying this simulation are as follows:
// set up a simple neutral simulation
initialize()
{
// set the overall mutation rate
initializeMutationRate(1e-7);
// m1 mutation type: neutral
initializeMutationType("m1", 0.5, "f", 0.0);
// g1 genomic element type: uses m1 for all mutations
initializeGenomicElementType("g1", m1, 1.0);
// uniform chromosome of length 100 kb
initializeGenomicElement(g1, 0, 99999);
// uniform recombination along the chromosome
initializeRecombinationRate(1e-8);
}
// create a population of 500 individuals
1
{
sim.addSubpop("p1", 500);
}
// run to generation 10000
10000
{
sim.simulationFinished();
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

53

Running this script at the command line on a Un*x system (including Mac OS X) is very simple.
If it is saved in the current directory as a file named test.txt, and if the slim command-line tool is
in the shell’s executable path, then the command:
slim test.txt

should suffice to run it. Otherwise, paths will be needed, which depend upon where the slim
command-line tool and the test.txt file are located. On OS X, the SLiM installer will install slim
in /usr/local/bin; if test.txt is in your home directory, then the command would be:
/usr/local/bin/slim ~/test.txt

Running this script in the SLiMgui application on OS X is also straightforward. Just launch
SLiMgui (installed in /Applications by the SLiM installer), copy/paste the script into the script area
of SLiMgui’s window (or, even easier: select the section 4.1 recipe from the Open Recipe submenu
of SLiMgui’s File menu), press the big Recycle button at the upper right of the window to reset the
simulation with the new script, and then press the Play button just to the left of the Recycle button.
Chapter 3 provided a brief introduction to SLiMgui and the parts of its window, and the sections
below will walk through running this particular simulation in SLiMgui in more detail.
There are a couple of general things to say first about the whole script. First of all, comments in
Eidos begin with two slashes, //, and continue to the end of their line; the script above has a
comment for most of the lines of executable code.
Second, this code snippet utilizes syntax coloring to make the meaning of the script clearer, just
as shown in SLiM’s input area. Numeric constants are shown in blue, string constants in red, SLiM
object constants such as m1, g1, and sim in sage, and comments in green. Syntax coloring will
generally be used in this manual for Eidos scripts.
Third, braces {} are used in Eidos to enclose whole blocks of code. The code above has three
such blocks: an initialize() callback that sets up the simulation, an Eidos event that is set to
execute in generation 1 – the beginning of simulation execution – to add a subpopulation, and an
Eidos event that runs in generation 10000 and stops the simulation.
Fourth, whitespace such as spaces, tabs, and newlines is not generally significant in Eidos;
comments, also, are considered whitespace and do not matter to the execution of your code. The
above script could thus be written more compactly as:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
10000 { sim.simulationFinished(); }

With that prelude, the following subsections will explore each of the commands in this script in
detail. If you wish to following along in SLiMgui, you should copy the example script out of this
document and paste it into your SLiMgui simulation window, then press the Recycle button (which
abandons the previous simulation run, based on the old script, and begins a new simulation based
on the new script you have just pasted in).
4.1.1 initialize() callbacks
Before a simulation can really begin running, some initialization tasks need to be done. SLiM
needs to know basic simulation parameters like the mutation rate, the length of the chromosome,
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

54

and so forth. Setting up this foundational state is done before the execution of the first generation,
at what is called “initialization time”. At initialization time, SLiM calls any blocks in your script
that are designated as initialize() callbacks (simply by having initialize() before their starting
brace). Our sample script defines one initialize() callback; you are allowed to have more than
one, in which case they are called sequentially in the order in which they are defined in the script.
This would be useful mostly for conceptual division of your code into discrete sections.
An initialize() callback may contain arbitrary Eidos code; you can define variables, execute
loops, and call functions (all topics we will explore in future sections). However, the simulation is
not yet set up, so you do not have access to SLiM’s sim constant, and this cuts you off from most of
SLiM’s functionality. Mostly what you will do in your initialize() callbacks, in practice, is call
initialization functions that are built into SLiM to help set up your simulation. These functions
have names that begin with the word initialize; the initialize() callback here calls five such
functions, discussed in the following sections. (A full reference of all of the initialize...()
functions provided by SLiM is given in Part II of this manual, the reference section; in general all of
the topics discussed in Part I are summarized in a more reference-oriented format there).
4.1.2 Mutation rate
The first line of our initialize() callback is:
initializeMutationRate(1e-7);

This calls the SLiM function initializeMutationRate(), passing a single parameter, the
numeric constant 1e-7. This is written in a sort of scientific notation commonly used in
programming; 1e-7 means 1.0×10−7, and could also be written in Eidos as 0.0000001. Numeric
values in Eidos may be of type integer, like 6 or -17, or of type float, like 1e-7 or 0.0000001.
The effect of this statement is to tell SLiM that the simulation will use a uniform mutation rate of
1e-7 (per base position per generation) across the whole chromosome. SLiM uses this rate to
determine how many mutations arise in each offspring that it generates in each generation in the
genomic elements being simulated (see section 21.1 for a more precise definition of the mutation
rate in SLiM). It is also possible to set a mutation rate map that varies the mutation rate along the
chromosome, but we will defer that until later. Precisely what mutations can arise in a given
element is governed by other aspects of the simulation configuration, discussed next.
4.1.3 Mutation types
The next line in the initialize() callback is:
initializeMutationType("m1", 0.5, "f", 0.0);

This calls the SLiM function initializeMutationType() to set up a new mutation type. You
may define as many mutation types in SLiM as you wish. Each is given a unique symbolic name;
the mutation type defined here is given the name m1, as requested by the first parameter to the
function call. A mutation type encapsulates a few key pieces of information about a particular
type of mutations: the dominance coefficient for mutations of this type (0.5 here), the distribution
of fitness effects (DFE) to be used (a fixed fitness effect here, as represented by "f"), and any
parameters that configure the distribution of fitness effects (0.0 here, giving the fixed selection
coefficient that will be used by all mutations of this type). This call creates a new Eidos variable
named m1, and so henceforth we can use the symbol m1 to refer to this mutation type.
This mutation type, m1, thus represents neutral mutations – always with a selection coefficient of
0.0. Mutation types might represent things like neutral mutations, beneficial mutations,
deleterious mutations, nearly neutral mutations, etc., with different distributions of fitness effects.
Each time that a mutation is created, its selection coefficient is drawn from the distribution of
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

55

fitness effects specified by the mutation type to which the new mutation belongs. It can be useful
to use different mutation types to represent mutations that are conceptually different in your
simulation even if they share the same DFE. For example, you might use mutation type to
represent neutral mutations created by SLiM randomly as a result of normal mutational processes,
and a different mutation type with exactly the same DFE to represent a particular neutral mutation
that you deliberately create in your script and want to track separately during the simulation.
You might notice that it seems hard to remember what all four parameters are and which order
they are supposed to go in. If you are using SLiMgui, a helpful feature is provided to address this
problem (introduced in section 3.5, with a screenshot). If you click anywhere inside the
parentheses of the initializeMutationType() call, a summary of the syntax of the call is shown in
the status bar at the bottom of the window:
(object$)initializeMutationType(is$ id,
numeric$ dominanceCoeff, string$ distributionType, ...)

The Eidos manual has a complete description of meaning of these summaries, called function
signatures. Briefly, this function signature first gives the type of value returned by the function –
here, an object value of class MutationType, representing the new mutation type created by the
function call. This returned value is a singleton, meaning that there will be only a single value
present, rather than a vector containing multiple values; this singleton property is represented by
the trailing $. You could therefore assign the result of initializeMutationType() into a variable:
x = initializeMutationType("m1", 0.5, "f", 0.0);

This would define x to refer to the new mutation type. Since the variable m1 is defined
automatically with the same value, this is usually not necessary; you would use m1 to refer to the
new mutation type. In some cases, however, it can be useful, particularly if you are defining more
than one mutation type using a loop.
Now we get to the parameters listed in the function signature. The first parameter may be either
an integer or a string (thus the designation is, from the leading character of each permitted type
name). This parameter should be a singleton, and is named id; it gives the identifier to be used for
the new mutation type, either as an integer like 1 or as a string like "m1" (both of which would lead
to a variable named m1 being defined). The second parameter must be numeric (meaning either an
integer or a float), is a singleton, and is named dominanceCoeff; this is self-explanatory. The
third parameter must be a singleton string, and is named distributionType. Then there is ..., a
signifier that zero or more additional parameters might be supplied of unspecified type and name.
In the case of initializeMutationType(), these additional parameters configure the distribution of
fitness effects (DFE), and their number and type depend upon the distribution specified by
distributionType. Exponential, gamma, and normal DFEs are also supported by SLiM, for
example, and require different parameters for their specification; consult the reference manual for
details about all of the DFE types currently supported in SLiM.
For now, the overall point is that the function signature is always available in the status bar
whenever you click inside a function, and can be used as a quick reference to remind you of the
meaning and type of each parameter. Remember that in SLiMgui you can also option-click in
Eidos code to bring up the script help window showing a search on the clicked term (see section
3.5); an option-click on initializeMutationType would bring up not only the function signature
for the function, but the full text from the reference section regarding the function. You might try
this now, as an experiment.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

56

4.1.4 Genomic element types
Let’s now turn to the third line of the initialize() callback:
initializeGenomicElementType("g1", m1, 1.0);

This creates a new genomic element type named g1. A genomic element type represents a
particular type of chromosomal region – introns, exons, UTRs, etc. As with mutation types, you
might wish to use a special genomic element type for a particular chromosomal region that you
want to track separately in your simulation, even if it has the same characteristics as other similar
regions – an exon of particular interest, for example.
Each genomic element type has a particular mutational profile. Mutations occur in all genomic
elements at the same uniform rate, as set by the overall mutation rate; but the types of mutations
that can occur in a particular genomic element type are determined by the mutational profile of
that genomic element type. Here, genomic element type g1 is defined as using mutation type m1
for all of its mutations (as specified by the proportion 1.0 supplied as the third parameter).
Suppose we wanted to define g1 as using a mix of three mutation types, m1, m2, and m3? Let’s
look at the function signature for initializeGenomicElementType() to see how we might do this:
(object$)initializeGenomicElementType(is$ id,
io mutationTypes, numeric proportions)

In some ways this is quite similar to the function signature for initializeMutationType() that
we examined above; it returns an object (this time of class GenomicElementType), and it takes an
integer or string named id to specify the identifier for the new object. The second parameter is
named mutationTypes, and can be either of type integer or object (with class MutationType); so
we could specify the mutation type for the genomic element type either using an object like m1, as
we did in the example script, or using an integer like 1 (identifying mutation type m1), which might
be more convenient. The third parameter is of type numeric (integer or float, remember) and
specifies the proportion of all mutations that will be drawn from the given mutation type.
The second and third parameters are not designated as singletons (they do not have a $ in their
type specifier). This means that they can be vectors of values, which allows us to specify multiple
mutation types for a genomic element type:
initializeGenomicElementType("g1", c(m1,m2,m3), c(1,2,10));

The c() built-in function returns all of its parameters pasted together into a single vector; we use
it here to make a vector containing the three mutation types, and another vector containing
proportions. Notice the proportions don’t have to sum to 1; they are just relative proportions.
In Eidos, all values are in fact vectors; singletons are just vectors containing exactly one value.
Even when you write a numeric constant like 10, that is actually an Eidos vector that happens to be
a singleton. Many Eidos operators and functions are built to work with whole vectors; this
simplifies your code by removing the need for many of the loops that would be necessary in other
languages in order to loop over the elements in an array. It also makes Eidos much faster, since a
whole vector can be processed in a single statement. For example, take this Eidos statement:
sum(1:10);

This adds the numbers from 1 to 10 using the built-in sum() function. A vector containing the
numbers from 1 to 10 is generated using the sequence operator of Eidos, :, which counts upwards
(or downward) from its first operand to its second operand. Once you get used to the way vectors
work in Eidos, you will find that they often make complicated tasks very easy.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

57

4.1.5 Genomic elements
Having set up genomic element type g1 in the third line, the fourth line of the initialize()
callback now uses g1 to set up a genomic element:
initializeGenomicElement(g1, 0, 99999);

A genomic element is simply a region of the chromosome that uses a particular genomic
element type. For example, you might have one genomic element type that represents introns, and
you might then have dozens (or thousands) of genomic elements in your chromosome that use that
genomic element type to represent a specific intron at a particular position. The call here sets up a
single genomic element that stretches from base position 0 to base position 99999, and is thus
100000 bases long, using genomic element type g1.
A chromosome can consist of many genomic elements, but two genomic elements cannot
overlap. Genomic elements also do not have to cover the entire chromosome. For example, you
can run a simulation with only two genomic elements of length 1 kb each, separated by 50 kb of
“empty space”. Mutations are only simulated in those regions of the chromosome where a
genomic element has been specified. Recombination events, however, are still simulated across
the whole chromosome. The end position of the last genomic element determines the length of the
chromosome being simulated. You can see the genomic elements you have defined by turning on
the display of genomic elements in SLiMgui’s chromosome view (see section 3.1).
The following example shows how we can set up a chromosome consisting of ten genomic
elements of type g1, with gaps between them at regular intervals:
for (index in 1:10)
initializeGenomicElement(g1, index*1000, index*1000 + 499);

This introduces a few new Eidos concepts. First of all, 1:10 makes a vector containing the
sequence from 1 to 10, as we saw above; it is equivalent to c(1,2,3,4,5,6,7,8,9,10).
Second, the for...in construct loops over that vector; for every value in 1:10, the value will be
assigned to the variable index and then the body of the loop – the statement on the next line – will
be executed. In this way, initializeGenomicElement() will be called ten times, first with index
equal to 1, then with index equal to 2, and on up to index equal to 10.
Third, you can see that Eidos, like most programming languages, allows you to write
mathematical expressions that are evaluated when the script executes. The * operator indicates
multiplication and the + operator indicates addition, so the expressions here calculate start and
end positions based upon the current value of index. All in all, this for loop is essentially
equivalent to:
initializeGenomicElement(g1, 1000, 1499);
initializeGenomicElement(g1, 2000, 2499);
...
initializeGenomicElement(g1, 10000, 10499);

It should now be fairly obvious how one might extend the for loop above to create a much
more complex chromosome involving multiple genes, each with UTRs and introns and exons,
interspersed with non-coding regions, and so forth. You would likely want to use additional
features of Eidos that are described in the Eidos language manual, such as nested for loops and the
modulo operator %. You could also potentially read in a chromosome map from a file on disk,
parse that map in whatever format it is written in, and execute the corresponding commands to
build the map in Eidos; Eidos has all of the tools you would need to do this, including file input/
output and string processing.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

58

4.1.6 Recombination rate
The final line of the initialize() callback in our script is:
initializeRecombinationRate(1e-8);

This specifies the recombination rate for the whole chromosome to be 1e-8, which means that a
crossing-over event will occur between any two adjacent bases with a probability of 1e-8 per
genome per generation (see section 21.1 for a more precise definition of the recombination rate in
SLiM). In this case, a single recombination rate is used along the whole chromosome. The
function signature for initializeRecombinationRate(), however, is:
(void)initializeRecombinationRate(numeric rates, [Ni ends = NULL],
[string$ sex = "*"])

Parameters named ends and sex are listed; however, they are in brackets []. This indicates that
these parameters are optional and may be omitted (in which case they are assigned the default
values shown in the signature; see the Eidos manual for further discussion of optional arguments).
That is what we did in the example script, so ends was assigned its default value of NULL (and sex
was assigned its default value of "*", which we won’t discuss here). If we included ends, which
must be either NULL or an integer value according to its Ni type-specifier in the signature, then it
should be the last base position of the chromosome, like this:
initializeRecombinationRate(1e-8, 99999);

Notice that the rates and ends parameters are not singletons (no $). In fact, we may supply a
vector of end positions and a matching vector of rates, defining a series of recombination regions,
each starting at the base position after the previous range ends (or at the beginning of the
chromosome, for the first range). For a simple example, we could make the first half of the
chromosome experience a much higher recombination rate than the second half:
initializeRecombinationRate(c(1e-7,1e-8), c(49999,99999));

You may supply as many recombination regions as you wish, specifying recombination
hotspots, etc. These recombination regions are not related to genomic elements; the boundaries of
recombination regions and genomic elements do not need to match up at all. You can see the
recombination regions you have defined by turning on the display of recombination regions in
SLiMgui’s chromosome view (see section 3.1). Note also that recombination can be tailored on an
individual-level basis using a recombination() callback to model things such as chromosomal
inversions; see sections 13.5 and 21.5.
As mentioned in section 4.1.2, it is also possible to define a mutation rate map that varies the
mutation rate along the chromosome. In fact, that is done with exactly the same syntax that we
have just seen here to configure a recombination rate map; a vector of rates and end positions is
supplied to initializeMutationRate() instead a singleton rate.
With this, we’re done discussing the initialize() callback section of the script. After the
initialize() callback finishes, the generation counter will be set to 1 and the first generation will
be ready to execute, as discussed next. If you’re following along in SLiMgui (which you can do by
pressing the Step button in the simulation window once), the generation counter will change from
initialize() to 1, indicating that the initialization phase is done and generation 1 is next up to be
executed (but has not executed yet). If you have the variables browser open (you can open it now
if you wish), you will see that the variables m1 and g1, defined by the initialize() callback, are
now listed, along with another variable named sim that will be discussed below.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

59

4.1.7 Eidos events
The next section of our script is:
1
{
sim.addSubpop("p1", 500);
}

This defines an Eidos event that is scheduled to run in generation 1 – at the very beginning of
the simulation, but after the initialize() callbacks have completed. This is the usual time at
which new subpopulations are constructed, and that is what this event does, as will be discussed
in the next section. For now, however, we’re interested in this idea of defining Eidos events.
Eidos events are run at the beginning of each generation, by default, as shown in the generation
schedule in section 1.3. Each event is scheduled to run in a specific generation or range of
generations. A single generation is specified with a single number, as here. A range of generations
is specified with the Eidos sequence operator; we could make an event run in generations 10
through 19, for example, by writing:
10:19 { ... }

The ... here is the script for the event, of course; it can be whatever Eidos code you wish. We
will see a great many Eidos events in later examples.
4.1.8 Subpopulations
The Eidos event scheduled to run in generation 1 has a single line in its body:
sim.addSubpop("p1", 500);

This looks a bit like a function named sim.addSubpop() is being called – and that is almost
right, but not quite. After initialize() callbacks complete, SLiM defines a new Eidos constant
named sim that represents the simulation itself. The sim object is used to access all sorts of
simulation properties, as we will see later. It is also a sort of gateway for a specialized sort of
function calls referred to as methods. A method is a call that can be made to a particular object,
to request that that object perform some operation. All values of type object in Eidos support a
handful of basic methods (as you can read about in the Eidos manual), and the object classes
defined by SLiM often support several more specialized methods as well.
So here we call the addSubpop() method of the sim object. This adds a new subpopulation to
the simulation; it will be represented by the new variable p1, as specified by the first parameter,
and it will have an initial size of 500 diploid individuals (the second parameter). The individuals in
the new subpopulation are blank slates; they contain no mutations as of yet. The method call is
done using the member operator of Eidos, a period (“.”); this operator selects one member, such as
a method, from an object. Apart from this syntactic difference, methods are in many ways quite
similar to functions, but they encapsulate an “object-oriented” perspective; instead of a globally
defined function performing an operation, a specific object is asked to perform the operation.
Subpopulations “belong” to the simulation, and therefore the simulation – the sim object – is asked
to add new subpopulations.
If you have been following in SLiMgui, you can now press Step again, and you will see that a
new variable named p1 appears in the variable browser to represent the new subpopulation. You
might now open the Eidos console and enter sim.methodSignature(). This is a method that
returns the signatures for all of the methods supported by an object. Among other methods in this
list (all documented in the reference section of this manual), you will see the signature for
addSubpop():
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

60

– (object$)addSubpop(is$ subpopID, integer$ size,
[float$ sexRatio = 0.5])

Notice there is a third parameter, which is optional, named sexRatio. This will be discussed
when sexual simulations are covered in chapter 6. The dash at the very beginning indicates that
the signature is for a method, as opposed to a function signature, which has no leading dash.
Methods are sometimes referred to by name with this leading dash; the addSubpop() method is
thus sometimes called -addSubpop(). The meanings are identical.
4.1.9 Executing the simulation
We have already executed the initialization phase and the first generation of the simulation (if
you are following along in SLiMgui). Now try pressing the Play button. The simulation will
suddenly run at full speed, which is probably pretty fast in this case. You will see mutations flicker
in and out of existence, and rise and fall in frequency, along the length of the chromosome, as
displayed in the chromosome views. If you let the simulation run, it will go until it reaches
generation 10000, at which point it will stop because of the final snippet of code in our example
script:
10000
{
sim.simulationFinished();
}

This defines an Eidos event that executes in generation 10000. The event calls a method named
on the sim object, and that method declares that the simulation is finished
(although it continues to execute until the end of the current generation). This is actually not the
typical way that a simulation ends, as we will see in the next section; but it will serve for now.
When the simulation stops, the generation counter will read 10001 because generation 10000
finished executing and the next generation to execute (were the simulation not finished) would be
10001.
Having reached the end of the simulation, let’s look at a few parts of the simulation window.
The population table view now looks something like this:
simulationFinished()

This shows the subpopulation that the generation 1 Eidos event created, named p1 as sown in
the ID column, along with its size (500) and its selfing and cloning rates (all 0.00). The last column
shows the sex ratio (M:M+F); simulations in SLiM are hermaphroditic by default, so the sex ratio is
undefined.
The individual view now looks something like this:

Each yellow square represents one individual; there are 500 squares since the subpopulation
size is 500. Each individual is colored according to the calculated fitness of that individual;
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

61

however, in this simulation all mutations are neutral, so all individuals have the same relative
fitness, 1.0, and so they are all yellow (as expressed by the color stripe for fitness values, discussed
in section 3.1). In a simulation with beneficial and deleterious mutations you would see a range of
colors, giving you an immediate visual sense of the fitness distribution across the subpopulation.
Finally, the chromosome view area looks something like this:

The top band shows the whole chromosome; the numbers and ticks below indicate base
positions. You can click and drag here to select a subrange of the chromosome for display in the
bottom view; a simple click in the top band will return you to viewing the full chromosome. The
bottom band displays the selected range of the chromosome (here, the full range). Each yellow bar
is a mutation that exists at that position in the chromosome. The height of the bar indicates the
frequency of the mutation; if the bar reaches the full height of the band, it has reached fixation and
is removed from the display (but you can turn on display of fixed mutations with the F button, in
which case they will display as full-height blue bars, by default). Here, all the bars are yellow
because all the mutations in this simulation are neutral, and neutral mutations are colored yellow
as shown by the color stripe at upper right (see section 3.1).
You might have noticed that whole sets of yellow bars tend to rise and fall in synchrony. These
are haplotypes that have become associated with each other through genetic drift. The dynamics
of mutations, haplotypes, and genetic drift is immediately visually apparent here, underlining the
value of having an interactive graphical interface in which you can develop, debug, test, and
experiment with your simulations. This visual, interactive workflow also makes SLiMgui a
potentially valuable tool for classroom instruction in population genetics and evolutionary biology.
4.2 Basic output
So far, our basic neutral simulation simply stops in generation 10000. Typically, it is desirable
for a simulation to produce some output. We will look at a few output options in this section.
4.2.1 Entire population
One option is to output the full state of the population (i.e., all individuals in all
subpopulations). To do this, we might change our script just a bit:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
10000 late() { sim.outputFull(); }

The only change here is that the final event, including the sim.simulationFinished() call, has
been replaced by a different event to generate output using the sim.outputFull() method:
10000 late() { sim.outputFull(); }
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

62

Note the keyword late() here. SLiM can run user-defined Eidos events at two different points
in the generational life cycle, as shown in section 1.3. These two points are referred to as early()
and late() events. Events are early() by default, so the original line:
10000 { sim.simulationFinished(); }

defined an early() event, and could just as well have been written as:
10000 early() { sim.simulationFinished(); }

Most of the time, running events early in the generation, prior to the generation of offspring, is
desirable (which is why that is the default). In some cases, however, it is better for an event to run
toward the end of the generation, after the generation of offspring – and events that produce output
are usually such a case. If the late() specifier were omitted, the output would be generated at the
beginning of generation 10000, instead of at the end, and would thus reflect the results of running
the model for only 9999 generations. After the output was generated, the model would then run
for the final generation, with no effect on the model’s output.
That would be quite a subtle bug, and easy to miss. In fact, generating output at the beginning
of a generation, in an early() event, is so likely to be a bug that SLiM will output a warning if you
try to do it using one of SLiM’s standard output generation methods. For example, if we delete the
late() designation on the output event above, SLiM will generate a warning when the model
executes:
#WARNING (SLiMSim::ExecuteInstanceMethod): outputFull() should probably
not be called from an early() event; the output will reflect state at the
beginning of the generation, not the end.

There are a few other common situations in which you want to use a late() event instead of
the default early() events, most notably when you introduce new mutations into the simulation in
script; these situations will be discussed as they arise.
Once you have copied and pasted this new script into your simulation window, press Recycle
and then Play, and let the simulation run to the end. Notice that the full state of the population has
been printed, as requested in an output section that looks something like this (with large chunks of
output skipped over with ellipses):
#OUT: 10000 A
Populations:
p1 500 H
Mutations:
29 68306 m1 34640 0 0.5 p1 6848 450
2 68503 m1 7251 0 0.5 p1 6867 561
...
Individuals:
p1:i0 H p1:0 p1:1
p1:i1 H p1:2 p1:3
...
Genomes:
p1:0 A 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15
p1:1 A 16 17 18 7 8 19 9 20 21 22 23 24 25 26 27 28
...

The format here is fairly simple and easily parsable by something like an R script; it can also be
read by SLiM in order to read in a population from a saved state, as we will see later. It begins
with an output prefix, #OUT:, followed by the generation (10000) and the type of output (A for all).
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

63

The next section describes the subpopulations; p1 is of size 500 and is hermaphroditic (H).
Next comes a list of all mutations present; each line shows one mutation, including (1) a
temporary unique identifier used only in this output section, (2) a permanent unique identifier kept
by SLiM throughout a run, (3) the mutation type, (4) the base position, (5) the selection coefficient
(here always 0 since this is a neutral model), (6) the dominance coefficient (here always 0.5), (7)
the identifier of the subpopulation in which the mutation first arose, (8) the generation in which it
arose, and (9) the prevalence of the mutation (the number of genomes that contain the mutation,
where – in the way that SLiM uses the term “genome” – there are two genomes per individual).
Next comes a list of individuals. The first line of this section as shown above, for example,
shows that individual 0 in subpopulation 1 (p1:i0) is a hermaphrodite (H) and is comprised of two
genomes, p1:0 and p1:1, that will be listed in the following section. This section makes it easier to
figure out which genomes correspond to which individuals, but is largely redundant.
The final section lists all of the genomes in the simulation. The first line in this section as shown
above, for example, shows that genome p1:0 (which the Individuals section told us belonged to
individual p1:i0) is an autosome (A) and contains the mutations with the identifiers 0, 1, 2, ... 19.
From all of this information, the complete state of the population can be reconstructed – every
individual, and every mutation contained by every individual – thus the method name
outputFull().
4.2.2 Random population sample
Often the outputFull() method is overkill; you might just want a sample of genomes that is
randomly drawn from a particular subpopulation. For example, to output a 10-genome sample at
the halfway point of the simulation, here is a modified script:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
5000 late() { p1.outputSample(10); }
10000 late() { sim.outputFull(); }

The only change relative to the previous script is the added line:
5000 late() { p1.outputSample(10); }

This calls the method outputSample() on the subpopulation from which the sample should be
taken, p1, with a requested sample size of 10 genomes (not 10 individuals, note; outputSample()
outputs randomly selected haploid genomes). If you copy and paste this script into SLiMgui,
Recycle and Play, you will see that at generation 5000 some output appears:
#OUT: 5000 SS p1 10
Mutations:
21 203188 m1 92838 0 0.5 p1 4489 4
4 203247 m1 73991 0 0.5 p1 4495 6
...
Genomes:
p1:0 A 0 1 2 3 4 5 6
p1:1 A 0 7 1 3 4 5 8 6 9
...

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

64

The format here is very similar to that of outputFull(), but the header #OUT: 5000 SS p1 10
indicates that in generation 5000 a sample (S) was taken from p1 of size 10 genomes and was
output in SLiM (S) format. The list of mutations and genomes is identical in format to
outputFull(), except that only those mutations are shown that are actually present in the sample,
and that prevalence now refers to the sample rather than the entire population. The list of
populations and the list of individuals are also omitted in this output style.
SLiM also supports output of samples in MS format. The previous recipe can be modified to use
the outputMSSample() method instead of the outputSample() method; running the recipe in
SLiMgui would then produce MS-style output at generation 5000:
segsites: 73
positions: 0.0262003 0.0351204 0.0445104 0.0597206 0.0647306 0.0702307
0.0871209 0.1039710 0.1046410 0.1162212 0.1222912 0.1407114 0.1608916
0.1728717 0.1775018 0.2036920 0.2366524 0.2462625 0.2468625 0.2475225
0.2531725 0.2858329 0.2880929 0.2913729 0.2998330 0.3045530 0.3048630
0.3132831 0.3279533 0.3400434 0.3484835 0.3544035 0.3547335 0.3586336
0.3594236 0.3765438 0.3845438 0.4385444 0.4441644 0.4460845 0.4483745
0.4692747 0.4801248 0.5282953 0.5387154 0.5493255 0.5563656 0.5660457
0.5810758 0.5871859 0.5880459 0.6183762 0.6271763 0.6368964 0.6389064
0.6678067 0.6880269 0.6961170 0.6961970 0.7082871 0.7340373 0.7362074
0.8214782 0.8450485 0.8694387 0.8696087 0.9022290 0.9114191 0.9187892
0.9264793 0.9364694 0.9835098 0.9969800
1011010000000100001100110001001010100010000010010000000011001011001010001
0011010000010100001111110101001001100000000010010000000000101011011000011
0001010000000100000000001000010000000101100101100100111000010000000001000
1011010000100100001100110001001010100010000010010000000011001111001010101
1011010000000100001100110011001010101010000010010000000011001011001010001
0011010000010100001110110101001001100000000010010000000000101011011000011
0000101111001011111100010001101100110000011000011011000100001010101100000
0101010000000100000000001000010000000101100101100100111000010000000001000
1011010000000100001100110011001010101010000010010000000011001011001010001
1011010000100100001100110001001010100010000010010000000011001011001010001

Finally, SLiM supports output of samples in VCF format. If the outputVCFSample() method is
used instead of outputSample() in the above recipe, VCF-style output would be produced:
##fileformat=VCFv4.2
##fileDate=20160609
##source=SLiM
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##FORMAT=
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT p1:i166 p1:i100
p1:i462 p1:i8 p1:i0 p1:i314 p1:i318 p1:i488 p1:i81 p1:i287
1 547 . A T 1000 PASS S=0;DOM=0.5;PO=1;GO=4822;MT=1;AC=4;DP=1000
GT 1|0 0|1 1|0 0|0 0|0 0|0 0|0 0|0 0|0 0|1
1 826 . A T 1000 PASS S=0;DOM=0.5;PO=1;GO=4342;MT=1;AC=5;DP=1000
GT 0|0 0|0 0|0 1|1 0|1 0|0 1|0 0|0 0|0 1|0
...

See sections 20.12.2 and 22.2 for details on these different output methods.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

65

4.2.3 Sampling individuals rather than genomes
The previous section showed how to use methods of the Subpopulation class to output a
sample of genomes from a subpopulation, in either SLiM’s own format, or in MS or VCF format.
However, in many situations you want to output information about a sample of individuals, rather
than a sample of genomes – you want to ensure that pairs of genomes, each belonging to an
individual, are the level of granularity at which the sampling for output occurs. Also, the
Subpopulation output methods only support sampling from a single subpopulation at a time, but
sometimes you want to output a sample drawn from multiple subpopulations, or from the full
population. Finally, sometimes you want to output a custom sample that your script chooses itself,
rather than a random, equally-weighted sample of the sort that the Subpopulation methods allow.
These tasks are all quite straightforward using lower-level methods of the Genome class. As an
illustration of this approach, we will look at a recipe that outputs the genomes of a weighted
sample of individuals drawn from the whole population:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", 0.01);
initializeGenomicElementType("g1", c(m1,m2), c(1.0,0.01));
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 500);
sim.addSubpop("p2", 500);
p1.setMigrationRates(p2, 0.01);
p2.setMigrationRates(p1, 0.01);
}
10000 late() {
allIndividuals = sim.subpopulations.individuals;
w = asFloat(allIndividuals.countOfMutationsOfType(m2) + 1);
sampledIndividuals = sample(allIndividuals, 10, weights=w);
sampledIndividuals.genomes.output();
}

This model involves a few things that we haven’t looked at yet, such as multiple subpopulations
connected by migration (see section 5.2.1) and multiple mutation types (see section 7.1).
However, those details are not our focus here. For our current purposes, it suffices to recognize
that there are two subpopulations, p1 and p2, connected by migration, and there are two mutation
types, m1 and m2, with m1 representing common neutral mutations and m2 representing less
common beneficial mutations. Our focus is on the late() event in generation 10000, which
produces the output for the model. We will follow through its logic one line at a time.
The first line defines the set of individuals from which our sample will be drawn, here called
allIndividuals. The sim.subpopulations property gives us a vector of all of the subpopulations
in the model (i.e., p1 and p2). The individuals property of that vector is then all of the individuals
(that is, objects of class Individual; see section 21.6) gathered from across those subpopulations.
The second line determines the weights that we will use for sampling, here called w. Note that
the use of weights here is optional; if you want an unweighted sample, you can skip this line and
omit the weights vector in the call to sample() below. In this recipe, however, we want the
likelihood that an individual is chosen for sampling to be related to the number of m2 mutations the
individual possesses, so we use countOfMutationsOfType(m2). We add 1 to that count, so that
individuals with no m2 mutations have a weight of 1, individuals with one m2 mutation have a
weight of 2, and so forth. This is somewhat arbitrary, but it does have the nice property that we are
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

66

guaranteed that the weights vector is not all zero (which would cause a runtime error, since it
would not be possible to draw a sample). If you draw your own samples, you probably want to
ensure that the sample can always be drawn, to avoid runtime errors.
The third line draws a sample of 10 individuals from allIndividuals, using the sample()
function and passing it the weights vector w (using the named arguments syntax of Eidos, for
readability). The sample is put into the temporary variable sampledIndividuals.
Finally, the fourth line outputs the genomes of the sampled individuals using the output()
method of Genome. This outputs the sample in SLiM’s own format, similarly to what we saw in the
previous sections; there are also outputMS() and outputVCF() methods on Genome that can be used
similarly to produce MS and VCF output. Since there are two genomes per individual, twenty
genomes will be output. Each pair of genomes in the output will come from one sampled
individual; in the SLiM and MS output formats this is not explicit, but in the VCF format, since it is
a diploid format (as used by SLiM), this pairing into individuals is explicit in the output. See
section 21.3.2 for more information on these output methods, and section 23.3 for details on the
format of output they generate.
While this recipe implemented one particular sampling scheme, the Genome methods used here
can be used to output any vector of genomes, whether a random sample or a specifically selected
set. The higher-level output methods of SLiMSim and Subpopulation are a little easier to use, but
these low-level methods provide greater power and generality.
4.2.4 Substitutions
The outputFull() method of SLiMSim does output the full state of the population, but there is
some historical state that it does not output. Most notably, it does not output substitutions –
mutations which have been fixed and have thus been removed from the population by SLiM for
efficiency. If fixed mutations are of interest, SLiM does keep a record of them and can output that
record on request. For example, we could output substitutions, in addition to other state, with this
script:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
5000 late() { p1.outputSample(10); }
10000 late() { sim.outputFull(); }
10000 late() { sim.outputFixedMutations(); }

The only change relative to the previous script is the added line:
10000 late() { sim.outputFixedMutations(); }

If you copy and paste this script into SLiMgui, Recycle and Play, you will see that at simulation
end some additional output appears:
#OUT: 10000 F
Mutations:
0 220169 m1 98564 0 0.5 p1 107 1053
1 221802 m1 1217 0 0.5 p1 268 1152
...

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

67

The header #OUT: 10000 F indicates simply that in generation 10000 fixed mutations (F) were
output. The rest is a list of mutations in almost the same format as before. However, the final
number in each line is no longer a prevalence; since the mutation fixed, we know that it was
present in every genome in the simulation. Instead, the final number shows the generation
number in which the mutation reached fixation. Note that the default behavior of SLiM – that
substitutions are removed from all individual genomes – can be deactivated if necessary (see
section 21.9.1).
Display of fixed mutations in SLiMgui can also be enabled by clicking the F button to the right
of the chromosome view; when this display is enabled, fixed mutations will be displayed as fullheight bars (blue, by default) in the chromosome view.
4.2.5 Custom output with Eidos
The outputFull(), outputSample(), outputMSSample(), outputVCFSample(), and
outputFixedMutations() methods described in the preceding sections generate output in very
fixed formats that may or may not be useful to you. There is one more built-in output method that
is described in the reference section – outputMutations() – but it, too, may not be what you
need. What to do?
This is an area where the power of Eidos really shines, since you can write whatever Eidos script
you like to generate whatever kind of output you want. Some of the more advanced recipes in this
cookbook will generate custom output of interesting kinds; section 11.1’s recipe will include code
to calculate and print the FST between two subpopulations to assess their genetic divergence, for
example, and section 13.2’s recipe calculates and prints the nucleotide heterozygosity, π, of a
population to assess the effects of inbreeding. In this section, we will look at a much simpler
example, as an introduction to the topic of custom output using Eidos.
Suppose you are interested only in the base positions of all of the mutations in the population;
you can achieve this with the following output event:
100 late() { cat(paste(sim.mutations.position, "\n")); }

With our basic neutral simulation script, this generates output like:
31113
57761
...

This is precisely what we want. How does it work? Let’s dissect it, step by step, since it is more
complex than the Eidos code we’ve seen so far.
First of all, we have seen the sim object before, representing the simulation. This object, which
is an Eidos object of class SLiMSim, has a property named mutations that yields a vector containing
all of the mutations in the simulation (a vector of type object and class Mutation, to be precise).
This is the first use we have seen of a property; a property is somewhat like a method, in that it is a
member of an object, but whereas a method performs an operation (perhaps involving a lot of
computation, and perhaps altering the state of the simulation), a property simply corresponds to a
value that exists inside the object. The sim object knows all of the mutations it contains; it doesn’t
have to calculate them or create them, it just has to return the value it already has. It is therefore a
property – the mutations property. The member operator is used to access properties, just as it is
used to access methods, so the value of sim.mutations is the vector of mutations contained by the
simulation.
That value is itself an object. As mentioned before, all values in Eidos are actually vectors; the
vector of mutations might contain many elements (each of which is called an object-element), but
it is nevertheless a single Eidos object. The Mutation class in SLiM defines a property, position,
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

68

that is the base position of the mutation; we can access this property on our vector of mutations to
get a vector of positions. This involves a certain amount of work behind the scenes; each mutation
has its own position, and Eidos must loop over all of the mutations in the vector of mutations,
getting the position of each one and pasting all the positions together to make a new vector. Eidos
does this for you, though, since Eidos is a vector-based language; so sim.mutations.position is all
the Eidos code that is needed to get a vector of the positions of all mutations in the simulation.
Peeling the onion back a layer, this expression is contained in a function call:
paste(sim.mutations.position, "\n")

This function takes a vector argument (sim.mutations.position) and pastes all of the elements
together to form a single value of type string. We haven’t really talked about string yet; a string
is simply a sequence of characters, like "hello, world". Eidos can paste strings together, split
them apart, and print them out; it can also convert most other types into string for the purpose of
output. The paste() function always produces a string as its result; it converts the elements of the
vector it is given into string elements in order to do so. The second parameter of paste(), "\n",
is a string containing a single character, the newline character, represented with the escape
sequence \n (you can read more about strings and escape sequences in the Eidos manual; it is not
worth getting into here). The final result is that each of the integer positions in
sim.mutations.position is converted to a string representation and then pasted together with the
others, with newlines in between, to produce the desired output as seen above.
Note that this output event is designated as a late() event, for the reasons outlined in section
4.2.1; in this case, we wish to see the positions of mutations at the end of generation 100, not the
beginning. With custom output code of this sort, SLiM is not able to infer that it is likely to be an
error if it runs in an early() event instead, however; so if the late() designation is removed here,
SLiM will not produce a warning. When writing custom output script events, you need to think
carefully about whether they should be early() or late() events (but a late() event is usually
what you want).
The simplicity of this example – just a single line of Eidos code! – should make it clear that
generating output in whatever format you desire is likely to be straightforward once you get the
hang of how Eidos works.
Since the power of this may not be entirely apparent yet, let’s consider a problem that might
arise in using SLiM. You might wish to produce MS-style output for a sample of genomes spanning
the whole population, but the built-in outputMSSample() method of Subpopulation (sections 4.2.2,
20.12.2, and 22.2.3) only supports generation of MS-style output from a sample of a single
subpopulation. As of SLiM 2.1, you can solve this problem with the outputMS() method of Genome
(sections 20.3.2 and 22.3.2), but let’s pretend that method doesn’t exist; the same general problem
will arise whenever you want to output data in a format that SLiM does not, in fact, intrinsically
support. Generating MS-style output using Eidos is trivial:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 500);
sim.addSubpop("p2", 500);
}

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

69

// custom MS-style output from a multi-subpop sample
2000 late() {
// obtain a random sample of genomes from the whole population
g = sample(sim.subpopulations.genomes, 10, T);
// get the unique mutations in the sample, sorted by position
m = sortBy(unique(g.mutations), "position");
// print the number of segregating sites
cat("\n\nsegsites: " + size(m) + "\n");
// print the positions
positions = format("%.6f", m.position / sim.chromosome.lastPosition);
cat("positions: " + paste(positions, " ") + "\n");
// print the sampled genomes
for (genome in g)
{
hasMuts = (match(m, genome.mutations) >= 0);
cat(paste(asInteger(hasMuts), "") + "\n");
}
}

This model is quite straightforward; the only complication is that there are two subpopulations,
and we wish to produce MS-style output for a sample across both of them. The generation 2000
event does precisely that. First it obtains the genomes upon which the output will be based; here
this is done using sample() on the vector of all genomes in the simulation, but any other sampling
scheme could be used instead. To sample specifically from subpopulations p1 and p2, for
example, without sampling from any other subpopulations that might exist, you could do:
g = sample(c(p1.genomes, p2.genomes), 10, T);

The sample size is 10, and sampling is done with replacement (the T parameter to sample()),
but that can obviously be customized. Next unique() and sortBy() are used to get a vector of the
mutations present in the sample, sorted by position. The size() of that vector is the number of
segregating sites, so that is trivial to output. Printing the positions is also simple; format() is used
to guarantee that every position is formatted with six digits to the right of the decimal, in decimal
rather than scientific notation even for very small values. Finally, the sampled genomes are printed
in MS format, using asInteger() to convert logical values from match() into 0s and 1s.
4.2.6 The simulation endpoint
The astute reader will have noticed that in the previous sections we not only added output
events, we also removed one line from the original script:
10000 { sim.simulationFinished(); }

We can do this because SLiM will generally stop after the last generation in which an event is
scheduled. In our basic neutral simulation, however, we had no output commands; if we had
omitted the line above with the simulationFinished() call the simulation would have ended at
the end of generation 1. Alternatively, we could have just written:
10000 { }

This “empty event” would have caused SLiM to run out to the end of generation 10000, because
there was still a future event scheduled. At the end of generation 10000 SLiM would then stop,
since there was no future event after generation 10000 scheduled. In fact, simulationFinished()
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

70

is really useful only when you wish to end a simulation before it would naturally end on its own.
You might check for an equilibrium condition, for example, and end the simulation if it has been at
equilibrium for the past 1000 generations, even though it would otherwise have run further.
There is one further caveat to mention regarding how simulations end. It is possible to define
an Eidos event that runs in every generation. For example, we could write the following script:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
late() { p1.outputSample(10); }

Notice that there is no generation number before the Eidos event defined in the last line; this
indicates that the event should be run in every generation, from here to eternity. If you copy and
paste this into SLiMgui and Recycle and Play, it will stop at the end of generation 1, however; there
is always a scheduled future event (the output event), but SLiM does not use it in determining
when the simulation should end, precisely because it has no expiration date. SLiM assumes that it
is not your wish to run a simulation that runs forever – commonly known as an “infinite loop” in
programming – so the simulation end is determined without reference to that event.
To define a final generation for the simulation in this situation, you would use precisely the trick
described above: just add an “empty event”, like:
100 { }

The simulation will now run to the end of generation 100. Note that there is no need to
designate the event as a late() event; SLiM always runs the last generation to completion, even if
simulationFinished() is called. The only way to halt a simulation immediately, within a
generation, is to call the Eidos error function stop().

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

71

5. Demography and population structure
The previous chapter discussed in detail the structure of a basic neutral simulation in SLiM,
including initialize() callbacks, Eidos events, the SLiMSim class, and many of the conceptual
underpinnings of SLiM such as mutation types, genomic element types, and chromosome
organization. It also covered some basic features of the Eidos language, such as vectors and vector
operations, function calls, objects, and method calls. From here, we assume a working knowledge
of these topics; you can consult the Eidos manual regarding features of the Eidos language, and the
reference section of this manual for details regarding SLiM.
In this chapter, we will focus on building on this foundation to make some simple “recipes” for
simulations that do interesting things with demography and population structure.
5.1 Subpopulation size
5.1.1 Instantaneous changes
As we saw in section 4.1.8, we can create a new subpopulation with sim.addSubpop(), a
method call which takes the initial size of the subpopulation as a parameter. What if we want the
population size to change later? This is very straightforward; for example, here is a simple script
that models a population that goes through a bottleneck:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 1000); }
1000 { p1.setSubpopulationSize(100); }
2000 { p1.setSubpopulationSize(1000); }
10000 late() { sim.outputFull(); }

The initialize() callback is unchanged from previous models. In generation 1 a
subpopulation named p1 is set up with an initial size of 1000. In generation 1000, a method named
setSubpopulationSize() is called on p1 to change its size to 100, the beginning of the bottleneck.
In generation 2000 the size is set back to 1000, ending the bottleneck. The simulation then
executes until it ends in generation 10000 with full output.
If you run this in SLiMgui (don’t forget to Recycle), the population size changes as expected.
5.1.2 Exponential growth
Sometimes you may want the size of a subpopulation to vary in a continuous fashion, rather
than experiencing discrete changes as in the previous recipe. This is straightforward in SLiM
because of the power of Eidos to express mathematical relationships. We will look at several
different versions of this recipe in order to explore different aspects of the problem.
As a first example, to make a population experience a period of exponential growth:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

72

1 { sim.addSubpop("p1", 100); }
1000:1099 {
newSize = asInteger(p1.individualCount * 1.03);
p1.setSubpopulationSize(newSize);
}
10000 late() { sim.outputFull(); }

Most of the script is unchanged; the work is done in the second Eidos event, which is scheduled
to run from generation 1000 to 1099, producing exponential growth for 100 generations. This event
calculates a new size, assigning it into newSize, and then sets the subpopulation size using that
variable. The use of the variable is for readability; it would be essentially equivalent to write:
p1.setSubpopulationSize(asInteger(p1.individualCount * 1.03));

Let’s look more closely at that statement. First, the current size of the subpopulation is accessed
through the individualCount property of p1. That size is then multiplied by 1.03 to produce a
new size based on an exponential growth rate of 1.03. Finally, the size is converted to an integer
by the asInteger() function (since setSubpopulationSize() does not allow you to set a noninteger size), and the integer size is passed to setSubpopulationSize(). There is a suite of as...
() functions in Eidos that allow you to convert values to various new types, but converting float
values to integer with asInteger() is probably the most common conversion.
An important point here is that the setSubpopulationSize() call does not change the current
size of the subpopulation; after all, what would the genetics of the new individuals be? Instead, it
sets a new target size that will be used the next time that an offspring generation is created; a few
additional offspring will be made to reach the new target size. If you access the individualCount
property immediately after calling setSubpopulationSize(), you therefore observe that the
individual count has not changed. This is the reason why setSubpopulationSize() is not called
setIndividualCount(); the difference in name is intended to emphasize how SLiM works.
Since the population size gets rounded down to an integer in each generation, this code does
not actually achieve the precise exponential growth rate of 1.03 that we wanted. Let’s fix that:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 100); }
1000:1099 {
newSize = asInteger(round(1.03^(sim.generation - 999) * 100));
p1.setSubpopulationSize(newSize);
}
10000 late() { sim.outputFull(); }

Running this in SLiMgui shows that the population reaches a size of 1922, whereas the previous
version reached a size of 1630 – a significant difference! Beware accumulated roundoff error.
How does the new version work? The expression sim.generation-999 computes the number of
generations of exponential growth that have occurred; in generation 1000 this is 1, in generation
1099 it is 100. Next, 1.03^(sim.generation-999) raises 1.03 to that power; ^ is the exponentiation
operator in Eidos, as in many programming languages. That calculates the result of the exponential
growth over the requisite number of generations. Finally, that result is rounded to the nearest
whole number by round(), which produces a float result, and then that whole number is
converted to an integer by asInteger().
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

73

Sometimes we may want the population size to increase exponentially until a specific target
size is reached, without knowing how many generations that might take. This is a bit trickier. One
possible implementation does this by brute force (leaving out the initialize() code):
1 { sim.addSubpop("p1", 100); }
1000:2000 {
if (p1.individualCount < 2000)
{
newSize = asInteger(round(1.03^(sim.generation - 999) * 100));
p1.setSubpopulationSize(newSize);
}
}
10000 late() { sim.outputFull(); }

This uses an if statement to increase the size of the subpopulation only if it is still less than
(this is the first time we’ve seen the if statement in Eidos, but it should be pretty obvious what
it does here; see the Eidos manual for clarification). The generation range for the event has been
expanded to 1000:2000, because we don’t know what generation the target size will be reached in;
it will be well before generation 2000, so this is good enough, but is a bit sloppy. Alternatively, we
could calculate the exact generation in which the target size is reached (1101, as it happens), and
use that instead of 2000; if we did that, we wouldn’t even need the if statement, since the
exponential growth would reach its target in the event’s last invocation.
To ensure that the last generation of growth produces exactly 2000 individuals, we can use the
following solution:
2000

initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 100); }
1000: {
newSize = round(1.03^(sim.generation - 999) * 100);
if (newSize > 2000)
newSize = 2000;
p1.setSubpopulationSize(asInteger(newSize));
}
10000 late() { sim.outputFull(); }

First of all, notice that the generation range for the exponential growth event is now written as
omitting the end generation. Just as supplying no generation range at all for an event means
“run in every generation”, omitting the end generation means “run in every generation after the
specified start generation”. One may similarly omit the start generation with the syntax :end,
meaning “run in every generation from the beginning until the specified end generation”.
The logic inside the event has also changed. Now the code always calculates a new size. If
that size is greater than 2000 it gets clamped to 2000, which enforces the maximum population size
we wanted. The clamped size is then set on the subpopulation.
The drawback to this solution is that the event runs in every generation from 1000 onward,
calling setSubpopulationSize() to set the size to 2000 over and over in generations after the target
size has been reached. That is inefficient; more importantly, it might interfere with other changes
we might want to make to the population size later in the simulation. We’d really like our event to
run for exactly the needed duration to reach the target size, and then not run in subsequent
1000:,

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

74

generations, without having to hard-code a final generation for it. There is a simple solution to this
problem – and this, you will be relieved to read, is the final, polished version of our exponential
growth recipe (so the initialize() callback is provided to make the recipe complete):
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 100); }
1000: {
newSize = asInteger(round(1.03^(sim.generation - 999) * 100));
if (newSize >= 2000)
{
newSize = 2000;
sim.deregisterScriptBlock(self);
}
p1.setSubpopulationSize(newSize);
}
10000 late() { sim.outputFull(); }

Braces have been added around the if statement’s consequent to make a group of statements;
all of the statements in the group will be executed if the condition of the if statement is true. This
use of braces makes what is called a compound statement; compound statements are legal
anywhere in Eidos that a single statement is legal.
The interesting change here, though, is the addition of the line:
sim.deregisterScriptBlock(self);

This line prevents the event from continuing to run after the target size has been reached. It is a
common pattern in SLiM scripting, so it is worth a bit of discussion. SLiM keeps track of script
blocks that are registered for execution in the simulation. All of the events and callbacks that you
write in your input script are automatically registered; it is also possible to construct and register
new script blocks dynamically, as we will see in a later chapter. Script blocks that are registered
can be deregistered, using the deregisterScriptBlock() method of SLiMSim, as we are doing
here. Once a block is deregistered, SLiM forgets about it and no longer executes it.
The other thing to be explained here is self. This is a variable that is defined whenever SLiM is
executing an event or callback; it refers to the currently executing script block. We use it here so
that our exponential growth event can deregister itself. Once deregistered, the event is no longer
executed, and will remain deregistered until the Recycle button is pressed.
You can see the dynamics of script registration and deregistration graphically in SLiMgui, by the
way. Start by pressing Recycle, with the recipe above already pasted into the window. Now if you
open the drawer for the simulation window by pressing the drawer button
to the right of the
chromosome view area, you will see a list of registered scripts:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

75

The exponential growth event is listed as the third entry there; if you hover the mouse over its
entry for a second or two, you will even see the code for it, shown in a tooltip. Now click in the
generation field, enter 1101, and press return:

This causes SLiMgui to execute the simulation up to the beginning of the requested generation,
which is just before the target population size is attained. If you now click Step, you will see the
exponential growth event disappear from the list of registered events.
If your scripting gets complicated, with script blocks registering and deregistering frequently,
this facility for viewing all of the registered script blocks can prove quite useful.
5.1.3 The population visualization graph
In the previous section, we saw several different ways to make a subpopulation grow
exponentially for a period of time. In this section, we show how to visualize the dynamics of
demography and population structure graphically in SLiMgui on Mac OS X.
Start by copying and pasting the last recipe into a new SLiMgui simulation window, and then
Recycle to parse the script and get ready to execute it. Now click on the Show Graph button, ,
and from its pop-up menu select “Graph Population Visualization”. This will bring up a small
window, presently empty.
If you click Step twice, in order to execute the initialization phase and then generation 1, you
should see a small yellow circle labeled “p1” appear. This circle represents subpopulation p1; in
particular, its radius represents the size of the subpopulation, and its color represents the mean
fitness value of the subpopulation.
Now press Play, and keep your eyes on the subpopulation circle. When the simulation reaches
generation 1000 the circle for p1 will begin to grow, and it will continue growing until the
subpopulation size reaches its target, 2000:

In this instance the visualization is fairly trivial; still, it is useful for verifying that the population
expansion happens in the way that it was planned. In future sections we will see that the
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

76

population visualization graph can also help us to visualize migration dynamics, fitness dynamics,
and other such things, making it a very useful tool for simulation debugging.
5.1.4 Cyclical changes
Another common demographic dynamic is a cyclically varying subpopulation size, perhaps
representing seasonality, or a decadal oscillation in resource availability. Implementing this in
SLiM is quite trivial; for example:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 1500); }
{
newSize = cos((sim.generation - 1) / 100) * 500 + 1000;
p1.setSubpopulationSize(asInteger(newSize));
}
10000 late() { sim.outputFull(); }

The initialize() callback is unchanged. The interesting work is done by the second event –
the one with no generation specifier, which you might recall means that it will be executed in
every generation. This event first calculates a new size, and then sets the new size in the
subpopulation (converting to integer as usual). The new size is calculated based upon the current
generation, sim.generation, passed through the cos() function, which calculates a cosine. The
generation is scaled and translated in order to arrange that the first offspring generation has a size
of 1500, matching the size of the parental generation set up by addSubpop(). Of course that is not
required; this particular arrangement is just for demonstration purposes.
The cos() function calculates a cosine based on a given angle in radians, as is standard. The
periodicity of the cycling is therefore based upon the fact that there are 2π radians in a circle;
given the scaling factor of 100, the population will complete a full cycle in 100*2π generations,
which is about 628. To make a subpopulation cycle with whatever periodicity you wish, just adjust
the scaling factor accordingly. Similarly, since cosine varies between -1 and 1, the scaling factor
of 500 on that, plus the translation of 1000, means that the population size cycles between 500 and
1500. You might wish to view these dynamics in SLiMgui’s population visualization graph, as
described in the previous section.
This recipe uses cos() to generate cyclical dynamics, but you can plug in whatever formula you
wish. The population size does not have to depend only upon the generation, either; the flexibility
of Eidos allows you to implement whatever demographic model you wish, including demography
that depends upon the model dynamics themselves, as we will see in the next section.
5.1.5 Context-dependent changes: Muller’s Ratchet
Sometimes you might want subpopulation size to depend upon something other than time. An
example would be a situation in which population size depends upon mean fitness (e.g., a model
of so-called “hard” selection); if the mean fitness is low then the subpopulation size would be
small (perhaps because the subpopulation is getting outcompeted by individuals of some other
species, or perhaps because the subpopulation is just so unfit that individuals are having trouble
surviving and feeding themselves even without competition). Here we will examine such
dynamics in a model of Muller’s Ratchet:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

77

initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "e", -0.01);
m2.convertToSubstitution = F;
initializeGenomicElementType("g1", c(m1,m2), c(1,1));
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 100); }
{
meanFitness = mean(p1.cachedFitness(NULL));
newSize = asInteger(100 * meanFitness);
p1.setSubpopulationSize(newSize);
}
10000 late() { sim.outputFull(); }

The initialize() callback here has been modified so that there are two mutation types, m1 and
m2, produced with equal probability in genomic element type g1. The m1 type represents neutral
mutations as usual (a fixed DFE with selection coefficient 0.0); the m2 type represents deleterious
mutations (an exponential DFE with mean selection coefficient -0.1). In our model we want
deleterious mutations to continue to affect fitness even after fixation, progressively reducing the
population size. For this reason, the convertToSubstitution property of m2 is set to F to prevent
fixed mutations of that type from being replaced by Substitution objects (see sections 18.3 and
20.9.1 for further information on this property).
The cachedFitness() method of class Subpopulation, called on p1, provides a vector of the
fitness values of all individuals in the subpopulation (a vector of indices could be passed, to
request fitness values only for specified individuals; the NULL value passed simply requests fitness
values for all individuals). The mean() function calculated the arithmetic mean of the vector it is
passed, resulting in the mean subpopulation fitness. A new size for the subpopulation is then
calculated using a base size of 100, representing the size if the population were of mean fitness
1.0. In our scenario, the base size is simply multiplied by the mean fitness, but one can of course
use any other function for this.
This recipe is a very primitive toy model of fitness-based population dynamics; nevertheless, it is
interesting to look at it in the population visualization graph, where the population size changes
visibly and is clearly correlated with mean fitness, shown as the color of the subpopulation. As
deleterious mutations accumulate in the subpopulation, its circle both reddens and shrinks,
showing visually that demography is being driven by mean fitness. When the subpopulation
reaches extinction due to the accumulation of deleterious mutations by Muller’s Ratchet, this
model terminates with an error (“undefined identifier p1”), because after subpopulation p1 is set to
a size of 0 the symbol p1 ceases to exist; setting a size of 0 tells SLiM to remove the subpopulation
entirely. One could end the simulation more gracefully by testing for (newSize == 0) and calling
sim.simulationTerminated(), perhaps after producing some sort of output regarding the
generation and the number of deleterious mutations fixed.
5.2 Population structure
5.2.1 Adding subpopulations
We have already seen in previous recipes how to add a single subpopulation by calling the
addSubpop() method of sim (which is of class SLiMSim). Creating population structure by adding
multiple subpopulations – of different sizes, linked by migration – is a very simple extension of
that:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

78

initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 500);
sim.addSubpop("p2", 100);
sim.addSubpop("p3", 1000);
p1.setMigrationRates(c(p2,p3), c(0.2,0.1));
p2.setMigrationRates(c(p1,p3), c(0.8,0.01));
}
10000 late() { sim.outputFull(); }

For this recipe we have returned to the simple initialize() callback we started with, with only
one mutation type. The new action happens in the generation 1 event, where we now add three
subpopulations of different sizes. Those populations are then connected by migration using calls
to the setMigrationRates() method of Subpopulation. Let’s examine the first such call:
p1.setMigrationRates(c(p2,p3), c(0.2,0.1));

Since this method call is made on p1, it is setting the immigration into p1. Both p2 and p3 are
given as sources of immigration, with rates of 0.2 and 0.1 respectively. In SLiM’s design, this does
not mean that individuals from p2 and p3 will move from those populations into p1. Rather, it
means that when p1 generates an offspring generation, and parents are chosen for a new offspring
individual, a proportion of 0.2 of those parents will be chosen from p2, a proportion of 0.1 from
p3, and the remaining proportion of 0.7 will be chosen from p1 itself, since the default behavior is
that parents are chosen from the subpopulation into which the new offspring will be placed. The
next line sets up migration into p2, in the same way. Since migration into p3 is not configured, p3
acts as a source only. Note that, as discussed further in section 19.2.1, migration rates specify the
probability that any given offspring individual will come from a particular source subpopulation;
the actual number of migrants in a given generation is thus stochastic, not deterministic.
It can be hard to visualize complex population structure just from code like this; happily,
SLiMgui’s population visualization graph provides a nice way to see what’s going on. If you copy
and paste the above recipe into a simulation window, Recycle, press Step twice so that generation
1 has been executed, and then open the population visualization graph, you will see a nice
graphical representation of the population structure:
p1

p2

p3

As before, the size of the circle for a subpopulation represents the subpopulation’s size, and the
color of the circle indicates the mean fitness of that subpopulation. The arrows show migration
links, with the thicknesses of the arrows indicating the strength of each link. Here it is immediately
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

79

obvious that p3 is a source, and that p2 is somewhat close to being a sink but does contribute some
immigrants to p1.
Of course there is no need for the population structure to all be set up in generation 1; you can
add subpopulations, remove subpopulations, and change migration rates in each generation if you
wish, as we will explore in the next section. The population visualization graph will update to
show the current population structure as your simulation runs.
5.2.2 Removing subpopulations
The population structure set up in the previous recipe was quite static, established in generation
1 and unchanging subsequently. Let’s explore more dynamic population structure by both adding
and removing subpopulations over time and dynamically changing migration. The recipe here is
longer because it does more things, but it is only a small extension of previous concepts:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
100 { sim.addSubpop("p2", 100); }
100:150 {
migrationProgress = (sim.generation - 100) / 50;
p1.setMigrationRates(p2, 0.2 * migrationProgress);
p2.setMigrationRates(p1, 0.8 * migrationProgress);
}
1000 { sim.addSubpop("p3", 10); }
1000:1100 {
p3Progress = (sim.generation - 1000) / 100;
p3.setSubpopulationSize(asInteger(990 * p3Progress + 10));
p1.setMigrationRates(p3, 0.1 * p3Progress);
p2.setMigrationRates(p3, 0.01 * p3Progress);
}
2000 { p2.setSubpopulationSize(0); }
10000 late() { sim.outputFull(); }

The initialize() callback is as before. Subpopulations p1, p2, and p3 are now set up in
generations 1, 100, and 1000 respectively. Subpopulation p2 is removed in generation 2000 by
setting its size to 0; that is all that is needed to remove a subpopulation.
The rest of the code – the 100:150 event and the 1000:1100 event – introduces some continuous
change into the population structure. In the 100:150 event the migration rates between p1 and the
newly established p2 grow over time until they reach a target rate. In the 1000:1100 event the new
subpopulation p3 grows in size over time, from a very small founder population, and its
migrational contribution to p1 and p2 grows over time as p3 grows in size.
Watching these dynamics in SLiMgui’s population visualization graph is useful for confirming
that they are functioning correctly. However, the simulation may go too fast for the dynamics to be
seen clearly. You can use the speed slider, directly below the Play button, to slow it down:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

80

If you set a slower speed, as above, and then Recycle and Play, it should be easier to follow the
action.
5.2.3 Splitting subpopulations
By default, when new subpopulations are added in SLiM they are composed of “brand-new”
individuals with no mutations. To produce more realistic dynamics, we might like the new
subpopulations to be split off from an existing subpopulation. Modifying the recipe above, this is
quite a simple change:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
100 { sim.addSubpopSplit("p2", 100, p1); }
100:150 {
migrationProgress = (sim.generation - 100) / 50;
p1.setMigrationRates(p2, 0.2 * migrationProgress);
p2.setMigrationRates(p1, 0.8 * migrationProgress);
}
1000 { sim.addSubpopSplit("p3", 10, p2); }
1000:1100 {
p3Progress = (sim.generation - 1000) / 100;
p3.setSubpopulationSize(asInteger(990 * p3Progress + 10));
p1.setMigrationRates(p3, 0.1 * p3Progress);
p2.setMigrationRates(p3, 0.01 * p3Progress);
}
2000 { p2.setSubpopulationSize(0); }
10000 late() { sim.outputFull(); }

The only change is that the addSubpop() calls that established p2 and p3 have been changed to
This call is very similar to addSubpop(), but takes an extra parameter: the
existing subpopulation from which the new subpopulation should be split. The split is
accomplished by copying the individuals for the new subpopulation from the source
subpopulation. In other words, each new individual is an exact genetic clone of an existing
individual in the source population, mimicking a founding event in which a subset of the
individuals in the source subpopulation find themselves split off into founders of a new
subpopulation. (From this perspective it is a bit odd that the founders also remain in the source
subpopulation, admittedly, but this is unlikely to be important to simulation dynamics in realistic
scenarios since source subpopulations are typically large and founding subpopulations are
typically small.)
addSubpopSplit().

5.3 Migration and admixture
The previous subsections already showed how to set up patterns of migration among multiple
subpopulations. Here, we will focus on recipes for a few standard types of population structure to
show how the flexibility of Eidos makes setting up complex population structure easy.
5.3.1 A linear island model
This model describes a linear chain of subpopulations, each of which is connected by migration
only to its nearest neighbors. This is quite easy to set up:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

81

initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
subpopCount = 10;
for (i in 1:subpopCount)
sim.addSubpop(i, 500);
for (i in 2:subpopCount)
sim.subpopulations[i-1].setMigrationRates(i-1, 0.2);
for (i in 1:(subpopCount-1))
sim.subpopulations[i-1].setMigrationRates(i+1, 0.05);
}
10000 late() { sim.outputFull(); }

If you copy and paste this into SLiMgui, Recycle, do Step twice to step through generation 1,
and open the population visualization graph, you will see the resulting structure:
p1
p10

p2

p9

p3

p8

p4

p7

p5
p6

As you can see, migration is stronger in one direction than in the other; this might represent
gene flow in a river system, for example, where gene flow is more likely to go downstream than
upstream. Indeed, it would be trivial to interrupt the upstream gene flow completely in certain
spots to represent the effect of waterfalls that prevent upstream migration. Using the techniques
shown in previous sections, it would also be trivial to make the migration pattern change over
time; a major flooding event in one generation could allow gene flow to pass upstream past a
small waterfall, for example.
This population structure is achieved using three for loops. We saw for loops briefly in section
4.1.5; let’s revisit the concept now. The first loop in this recipe is:
for (i in 1:subpopCount)
sim.addSubpop(i, 500);

This causes the statement sim.addSubpop(i, 500); to be executed repeatedly as the loop index
variable i varies from 1 up to subpopCount, following the sequence defined by 1:subpopCount.
The result is that new subpopulations are created in order: p1, p2, p3, and on up to subpopCount.
The second loop is almost as straightforward:
for (i in 2:subpopCount)
sim.subpopulations[i-1].setMigrationRates(i-1, 0.2);

p1,

This sets up migration into each subpopulation from the previous subpopulation: into p2 from
into p3 from p2, and so forth. Since p1 receives no such migration (there being no p0

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

82

subpopulation), the loop starts at 2, not 1. For each value of i, the corresponding subpopulation is
looked up in the simulations subpopulation array, which is a property of SLiMSim, using
sim.subpopulations[i-1]; the use of i-1 here instead of i is because p1 will be at index 0 in the
subpopulation array (since vectors in Eidos are numbered starting at 0, not 1). Given the proper
subpopulation, the setMigrationRates() call then sets up migration from the previous
subpopulation (which is the purpose of i-1 in that call).
Given that explanation, the operation of the third loop should be fairly obvious. The power of
this algorithmic approach to setting up population structure should be clear; if we want 1000
subpopulations arranged in this manner, all we have to do is change subpopCount = 10; to
subpopCount = 1000;. The population visualization graph in SLiMgui will find this a bit
challenging to display, but it should be possible to test and debug such a model with just 10
subpopulations, and then scale up to 1000 subpopulations for your production runs.
5.3.2 A non-spatial metapopulation
Another possible scenario is a non-spatial metapopulation in which migration between each
pair of subpopulations occurs at some constant rate. This can be set up as follows:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
subpopCount = 5;
for (i in 1:subpopCount)
sim.addSubpop(i, 500);
for (i in 1:subpopCount)
for (j in 1:subpopCount)
if (i != j)
sim.subpopulations[i-1].setMigrationRates(j, 0.05);
}
10000 late() { sim.outputFull(); }

Here a nested pair of for loops using index variables i and j sets up the migration from j into i
for each pair of subpopulations. The test if (i!=j) prevents the code from attempting to set the
migration rate from a subpopulation into itself, which is illegal.
SLiMgui’s visualization of this population structure looks like what we wanted:
p1

p5

p2

p4

p3

This recipe uses only five subpopulations (subpopCount = 5), but again the code is general and
may be scaled up to however many subpopulations you want in your metapopulation (although
SLiMgui may not be very good at visualizing these scenarios once they become too complex).
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

83

5.3.3 A two-dimensional subpopulation matrix
A third common population structure is a two-dimensional grid or matrix of subpopulations,
representing a spatial metapopulation in which each subpopulation exchanges migrants only with
its direct neighbors. SLiM has no intrinsic concept of geographic space in its simulations, but with
a population structure like this a pseudo-geographic regime can be imposed upon a SLiM
simulation such that new beneficial mutations, for example, will spread from subpopulation to
subpopulation in a similar manner to how they might spread across a real landscape.
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
metapopSide = 3;
// number of subpops along one side of the grid
metapopSize = metapopSide * metapopSide;
for (i in 1:metapopSize)
sim.addSubpop(i, 500);
subpops = sim.subpopulations;
for (x in 1:metapopSide)
for (y in 1:metapopSide)
{
destID = (x - 1) + (y - 1) * metapopSide + 1;
destSubpop = subpops[destID - 1];
if (x > 1)
// left to right
destSubpop.setMigrationRates(destID - 1, 0.05);
if (x < metapopSide)
// right to left
destSubpop.setMigrationRates(destID + 1, 0.05);
if (y > 1)
// top to bottom
destSubpop.setMigrationRates(destID - metapopSide, 0.05);
if (y < metapopSide)
// bottom to top
destSubpop.setMigrationRates(destID + metapopSide, 0.05);
}
}
10000 late() { sim.outputFull(); }

This code is a bit more complex than previous recipes. In the generation 1 event, we first
decide how large of a metapopulation we want; metapopSide=3 means that we will make a 3x3
metapopulation (trivially small, but easier to check for correctness). This can be scaled up
arbitrarily; it works well in SLiMgui for values as large as a 1000x1000 metapopulation with 10
individuals per subpopulation. We calculate the number of subpopulations we will need, and
make them all with the first for loop.
Then comes the trickier part. We loop over the grid of subpopulations using x and y, which
each range from 1 to metapopSide. For each subpopulation in the grid, as determined by x and y,
we calculate the ID of the subpopulation in destID, and then fetch the subpopulation itself into
destSubpop, getting it from the simulation’s subpopulation array as before. We then set the
migration into destSubpop from each of its four sides; if it is at the edge of the matrix, it receives no
migrants from that side (although it would be trivial to modify this code to make a wrap-around
pattern of migration simulating a toroidal world). Simple arithmetic is used to determine the
identifier of neighboring subpopulations based upon destID.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

84

SLiMgui’s visualization of this setup is complicated; it is not immediately obvious that it
represents a two-dimensional metapopulation, but it does:
p1
p9

p2

p8

p3

p7

p4
p6

p5

However, SLiMgui has an alternate method of display for these population visualizations. If you
control-click or right-click on the graph, you will get a pop-up menu. Select “Optimized
Positions” from that menu, and you should see something more like this:
p2

p3

p4

p5

p6

p7

p8

p1

p9

Here it is much more obvious that the migration pattern is as desired. This positioning
optimization algorithm is based on a concept called force-directed layout. It is a somewhat
experimental feature in SLiMgui, and may not always give layouts that look good; it is also quite
slow for layouts involving more than a dozen or so subpopulations. However, it is worth a try if
you want to get a publication-worthy picture of your population structure.
Speaking of “publication-worthy”, note that when you control-click or right-click on the
visualization graph, the pop-up menu also has an item entitled “Copy Graph”. That copies the
graph to the clipboard as a PDF – very handy for pasting into documents or slides.
5.3.4 A random, sparse spatial metapopulation
The recipe in section 5.3.3 showed how to make a small spatial metapopulation in which
subpopulations are arranged spatially in a grid that is connected by orthogonal migration. Here
we will extend that recipe to a larger size and a more random metapopulation configuration, and
we’ll look at a very simple sweep of a beneficial mutation across the metapopulation.
The initialize() callback for this model is largely unchanged from section 5.3.3:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", 0.3);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

85

The only modification here is the addition of a new mutation type, m2, which has been defined
to represent beneficial mutations (with a selection coefficient of 0.3). New mutations are still
always neutral (mutation type m1); we will introduce the sweep mutation ourselves in script.
The final output event is also unchanged:
10000 late() { sim.outputFull(); }

The interesting changes are to the 1 late() event that configures the initial state of the
metapopulation:
1 late() {
mSide = 10;
// number of subpops along one side of the grid
for (i in 1:(mSide * mSide))
sim.addSubpop(i, 500);
subpops = sim.subpopulations;
for (x in 1:mSide)
for (y in 1:mSide)
{
destID = (x - 1) + (y - 1) * mSide + 1;
ds = subpops[destID - 1];
if (x > 1)
// left to right
ds.setMigrationRates(destID - 1, runif(1, 0.0, 0.05));
if (x < mSide)
// right to left
ds.setMigrationRates(destID + 1, runif(1, 0.0, 0.05));
if (y > 1)
// top to bottom
ds.setMigrationRates(destID - mSide, runif(1, 0.0, 0.05));
if (y < mSide)
// bottom to top
ds.setMigrationRates(destID + mSide, runif(1, 0.0, 0.05));
// set up SLiMgui's population
xd = ((x - 1) / (mSide - 1)) *
yd = ((y - 1) / (mSide - 1)) *
ds.configureDisplay(c(xd, yd),

visualization nicely
0.9 + 0.05;
0.9 + 0.05;
0.4);

}
// remove 25% of the subpopulations
subpops[sample(0:99, 25)].setSubpopulationSize(0);
// introduce a beneficial mutation
target_subpop = sample(sim.subpopulations, 1);
sample(target_subpop.genomes, 10).addNewDrawnMutation(m2, 20000);
}

This creates a 10×10 spatial metapopulation in much the same way as section 5.3.3 created a
3×3 metapopulation: first by creating the subpopulations themselves in a simple loop, and then
setting up the migration among them with a pair of nested loops over x and y (see section 5.3.3 for
further comments on that basic design).
For each subpopulation, however, it also does something new, under the // set up SLiMgui
comment: it calculates a visual position for the subpopulation, as xd and yd, and then calls
configureDisplay() on the subpopulation to tell SLiMgui to use that position in its display. The
coordinate system used by SLiMgui for subpopulation display spans [0,1] in x and y, so the xd and
yd values calculated here are within that range. We pass a value of 0.4 for the second parameter
to configureDisplay(); this is a scaling factor for the circle used to represent the subpopulation,
so here we are telling SLiMgui to use unusually small circles (so that all 100 subpopulations fit into
the display without overlapping).
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

86

If we ran the model without this display configuration (just comment out the call to
we would get the left-hand plot; the subpopulations are all overlapping and
nothing can be discerned at all. (The “Optimized Positions” technique shown in section 5.3.3 is
not up to the task either; optimizing such a large network is a difficult problem. Perhaps that
experimental feature will eventually be improved to the point that it produces acceptable results
for this model, but at present it does not.) If we run it with the display configuration, we get the
right-hand plot, which is obviously much better:
configureDisplay(),

p91

p1p2p3p6p8
p100
p99
p98
p97
p96
p9
p94
p11
p91
p12
p90
p13
p89
p15
p88
p16
p87
p17
p84
p18
p83
p19
p81
p22
p79
p23
p78
p24
p77
p25
p72
p27
p69
p28
p68
p29
p65
p30
p64
p31
p63
p32
p61
p33
p60
p34
p59
p35
p58
p36
p57
p37
p56
p38
p55
p39
p54
p40
p53
p41
p52
p42
p51
p43
p49
p44
p48
p45
p47
p46

p93

p94

p95

p81

p83

p84

p71

p73

p63

p61

p92

p62

p51

p41

p42

p43

p32

p21

p22

p23

p97

p98

p85

p87

p88

p89

p74

p75

p77

p78

p79

p64

p65

p54

p55

p44

p45

p34

p35

p24

p25

p12

p1

p2

p96

p66

p57

p36

p16

p4

p5

p6

p80

p68

p70

p58

p60

p48

p49

p50

p37

p38

p39

p40

p27

p28

p29

p30

p17

p18

p19

p20

p8

p9

Looking at the right-hand visualization, it is now immediately apparent that this is not a
complete metapopulation, but rather a sparse metapopulation in which some subpopulations are
missing. That brings us to the next lines in the event:
// remove 25% of the subpopulations
subpops[sample(0:99, 25)].setSubpopulationSize(0);

This takes a sample, of size 25, from the sequence 0:99, selects those subpopulations (using the
subset operator on subpops), and sets their size to zero. As we saw in section 5.2.2, this removes
those subpopulations from the simulation; any connections they have to other subpopulations by
migration are broken. That one line, then, very easily produces a random sparse metapopulation.
The only caveat is that, depending upon which subpopulations get removed, the remaining
metapopulation might not be connected; there might be two or more isolated networks with no
connection between them via migration at all. If that is undesirable, further steps would have to
be taken to avoid the possibility.
You might also notice that the arrows in the population visualization now vary in their
thickness; this is because we now draw random migration rates using runif(), rather than using a
fixed migration rate as we did in section 5.3.3. This code draws migration rates from a uniform
distribution, and makes no attempt to ensure that migration rates between a given pair of
subpopulations are symmetric; since this is just scripting, one could implement any pattern of
random migration one wished.
That brings us to the final lines of the event:
// introduce a beneficial mutation
target_subpop = sample(sim.subpopulations, 1);
sample(target_subpop.genomes, 10).addNewDrawnMutation(m2, 20000);

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

87

This manual will cover introduced mutations and selective sweeps in great detail in chapter 10,
so this is just a tiny bit of foreshadowing of that complex topic. The idea here is simple: we choose
one subpopulation from the subpopulations that remain in the model using sample(), and then we
choose ten genomes at random from the chosen subpopulation again using sample() (starting with
ten rather than just one to make it more likely that the mutation will not be lost due to drift early
on, for pedagogical purposes), and finally we call addNewDrawnMutation() to add a new m2
mutation to those ten genomes at position 20000. This code, then, introduces a beneficial mutation
that will, we hope, sweep through our sparse metapopulation.
And if it doesn’t get lost, and if the metapopulation is connected, sweep it does. Here is a
screenshot from midway through the sweep:
p92

p71

p93

p94

p83

p84

p72

p95

p75

p76

p66

p63

p64

p65

p51

p53

p54

p55

p42

p31

p32

p43

p23

p24

p11

p12

p13

p14

p1

p2

p3

p4

p97

p86

p61

p41

p96

p100

p88

p89

p90

p79

p80

p68

p69

p70

p58

p59

p60

p49

p50

p77

p57

p45

p46

p47

p35

p36

p37

p38

p39

p26

p27

p28

p29

p30

p19

p20

p9

p10

p15

p17

p6

p7

p8

As per the default behavior in SLiM, subpopulations are colored according to their mean fitness;
populations where the sweep mutation is approaching fixation are thus a dark green, while
populations that are still neutral are still yellow. Note that the configureDisplay() method has a
third (optional) argument that would allow us to customize the color of each subpopulation as
well, coloring them according to whatever model variable we wish; we have not used that here,
since the default fitness-based coloring is in fact exactly what we want.
5.3.5 Reading a migration matrix from a file
Sometimes, when modeling an empirical population, migration rates are specified in a file, and
you would like to read that file in and create a corresponding SLiM model. This is very easy to
accomplish using Eidos in SLiM. Let’s start by looking at a very simple migration matrix file:
// For the recipe of section 5.3.4.
// Format: , , .
1,1,0.78
1,2,0.10
1,3,0.12
2,1,0.01
2,2,0.96
2,3,0.03
3,1,0.33
3,2,0.17
3,3,0.50

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

88

This file expresses the model’s migration rates as a series of lines, each of which is a series of
comma-separated values; this format is often called a CSV file. Let us suppose that it exists on disk
at the path ~/Desktop/migration.csv. The first two lines are comments, describing the file. The
rest of the lines each have three values: the identifier of the source subpopulation, the identifier of
the destination subpopulation, and the migration rate (sometimes the destination is listed before
the source in these sorts of files, so be careful). Note that there are lines expressing the fraction of
each subpopulation that does not migrate – the remainder after all migrants have left. This is not
optimal for SLiM’s purposes, but we want to work with the file as it is.
We can read this file in and create a model based on it with a simple script:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
for (i in 1:3)
sim.addSubpop(i, 1000);
subpops = sim.subpopulations;
lines = readFile("~/Desktop/migration.csv");
lines = lines[substr(lines, 0, 1) != "//"];
for (line in lines)
{
fields = strsplit(line, ",");
i = asInteger(fields[0]);
j = asInteger(fields[1]);
m = asFloat(fields[2]);
if (i != j)
{
p_i = subpops[subpops.id == i];
p_j = subpops[subpops.id == j];
p_j.setMigrationRates(p_i, m);
}
}
}
10000 late() { sim.outputFull(); }

The generation 1 event is where the action is. It first creates the three subpopulations of the
model with a simple for loop. It would be simple to extend this model to determine the number
of subpopulations from the contents of the migration.csv file, and to read subpopulation sizes in
from that file, and so forth. Here, however, we hard-code those values for simplicity. Once the
subpopulations have been created, we cache the vector of subpopulations in subpops for brevity in
the script that follows.
Next, the script reads in the contents of the migration.csv file using readFile(). This creates a
string vector, with each line of the file being a separate element in the vector. The following line
uses the substr() function to find lines that begin with "//", and removes those lines by
subsetting the lines vector, stripping out the comment lines from the file.
Now we can loop through the lines with a for loop and handle each line in turn. The
strsplit() function is used to split line into its components, separated by commas; this is often a
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

89

very convenient way to parse CSV files. The three values of the line are then extracted from the
fields vector, which is of type string, and are converted to their appropriate types.
The last block actually sets the migration rate. First it checks that i and j are not the same; this
filters away the lines that express the non-migrating fractions, which SLiM does not need to know
about. Then it looks up the subpopulations referenced by i and j, using the id property, which
corresponds to the numeric part of the subpopulation’s symbol (i.e., subpopulation p3 has an id of
3). Finally, it sets the migration rate from p_j to p_i to be the rate m that was read from the file.
When executed, SLiMgui’s visualization of this population structure looks like this:
p1

p3

p2

That looks like what we would expect from looking at the file; p2 is a sink, p3 is a source, and
is close to balanced. This is a very simple population model, but this script could just as easily
read in a migration matrix file for hundreds or even thousands of subpopulations; its code is in
quite general. As mentioned above, it could easily be extended to read subpopulation sizes from
the CSV file as well; indeed, other subpopulation properties, such as selfing rates and sex ratios,
could also easily be added to the file format and set up by this script, and the script could easily be
adapted to work with whatever empirical data files already exist.
p1

5.4 The Gravel et al. (2011) model of human evolution
In this section we will look at a recipe that brings together all of the elements of demography
and population structure that have been discussed in this chapter. This is a SLiM implementation
of a model of human evolution presented by Gravel et al. (2011); in particular, we here model the
“Low-coverage + exons” model described in their Table 2. The recipe:
initialize() {
initializeMutationRate(2.36e-8);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 10000);
initializeRecombinationRate(1e-8);
}
// Create the ancestral African population
1 { sim.addSubpop("p1", 7310); }
// Expand the African population to 14474
// This occurs 148000 years (5920) generations ago
52080 { p1.setSubpopulationSize(14474); }
// Split non-Africans from Africans and set up migration between them
// This occurs 51000 years (2040 generations) ago
55960 {
sim.addSubpopSplit("p2", 1861, p1);
p1.setMigrationRates(c(p2), c(15e-5));
p2.setMigrationRates(c(p1), c(15e-5));
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

90

// Split p2 into European and East Asian subpopulations
// This occurs 23000 years (920 generations) ago
57080 {
sim.addSubpopSplit("p3", 554, p2);
p2.setSubpopulationSize(1032); // reduce European size
// Set migration rates for
p1.setMigrationRates(c(p2,
p2.setMigrationRates(c(p1,
p3.setMigrationRates(c(p1,

the rest of the simulation
p3), c(2.5e-5, 0.78e-5));
p3), c(2.5e-5, 3.11e-5));
p2), c(0.78e-5, 3.11e-5));

}
// Set up exponential growth in Europe and East Asia
// Where N(0) is the base subpopulation size and t = gen - 57080:
//
N(Europe) should be int(round(N(0) * e^(0.0038*t)))
//
N(East Asia) should be int(round(N(0) * e^(0.0048*t)))
57080:58000 {
t = sim.generation - 57080;
p2_size = round(1032 * exp(0.0038 * t));
p3_size = round(554 * exp(0.0048 * t));
p2.setSubpopulationSize(asInteger(p2_size));
p3.setSubpopulationSize(asInteger(p3_size));
}
// Generation 58000 is the present. Output and
58000 late() {
p1.outputSample(216); // YRI phase 3 sample
p2.outputSample(198); // CEU phase 3 sample
p3.outputSample(206); // CHB phase 3 sample
}

terminate.
of size 108
of size 99
of size 103

Three subpopulations are modeled in this recipe, as shown in the diagram below (solid arrows
showing subpopulation splitting events, dotted arrows showing migration rates between
subpopulations, and red italic numbers showing subpopulation effective sizes).

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

91

The first subpopulation, present from the beginning of the simulation, is p1; it represents
Africans (YRI). The second, p2, initially represents the Ancestral Eurasian Bottleneck, and then
becomes the Europeans (CEU). The third, p3, represents East Asians (CHB). As you can see in the
diagram and in the recipe above, p2 splits from p1 at generation 55960, and then p3 splits off from
p2 at generation 57080. Beginning with the split of p3 from p2, both p2 and p3 undergo
exponential growth until the end of the model.
The model begins 58000 generations ago (1.45 million years if we assume 25 years per
generation). It spends quite a long time doing neutral burn-in before the action really starts in
generation 52080 with the expansion of the African subpopulation. Running the full model only
takes a couple of minutes, but if you’re impatient, you can decrease the generation ranges for all
events by 52000, providing an 80-generation burn-in that should suffice for illustrative purposes
(but will not produce the correct pattern of neutral diversity at the end of the run).
This model is a neutral model; the only mutation type modeled is m1, which represents neutral
mutations. At the end of the model, random samples are output from each of the three
subpopulations to provide a view on the neutral diversity present in each subpopulation. The
empirical samples that this output is intended to match were taken from (diploid) humans; the
outputSample() method of Subpopulation, on the other hand, takes as its argument the number of
(haploid) genomes to sample and output. The sample sizes in SLiM are therefore double the
number of humans sampled empirically.
It is worth noting that the population sizes used in this model are effective population sizes.
The actual population sizes in human history were likely much larger, but geography and other
factors greatly reduced the effective population size. This is also the reason that the sizes of p2 and
p3 post-split are not modeled as adding up to the same size as the pre-split p2 subpopulation.
The model for this recipe was written by Aaron Sams of the Messer Lab at Cornell.
5.5 Rescaling population sizes to improve simulation performance
The limiting factor in most forward population genetic simulations tends to be the actual
number of individuals simulated. In SLiM, every individual in the population is modeled
explicitly. Thus, in larger populations more time is consumed per generation creating the children
that make up the population, and more memory is needed for storing their genetic information.
Perhaps we can approximate the evolution of a large population using simulation of a smaller
population? In some ways, we can: analytical theory predicts that under certain assumptions
many important population genetic parameters, such as the expected levels of diversity,
polymorphism frequency spectra, levels of linkage disequilibrium, etc., should primarily depend
on products of the form Nµ, Nr, and Ns, where N is the effective population size, µ is the mutation
rate, r the recombination rate, and s the selection coefficient of a given mutation.
Thus, for many analyses we do not have to simulate a population of the true size, but can
obtain similar results by simulating a much smaller population while rescaling µ, r, and s such that
the products Nµ, Nr, and Ns remain the same. Importantly, when rescaling population sizes, time
has to be rescaled as well, because drift will be faster in a smaller population. The amount of drift
in a Wright-Fisher population of 1000 individuals, observed over 100 generations, will be similar
to that in a population of 100 individuals observed over just 10 generations. In this way, rescaling
a simulation to a smaller population size not only helps us by having to model fewer individuals
per generation, but also by reducing the number of generations we need to simulate.
This rescaling “trick” can be tremendously helpful in practice, allowing the simulation of
scenarios featuring large populations that would not be feasible to simulate when using their actual
population sizes. However, there are clear limitations to rescaling as well. Most obviously, there
will be limits to how far downscaling of population sizes can be pushed in the light of an
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

92

increasing impact of discretization effects: as simulated population sizes become smaller, there
will be fewer possible population frequencies at which mutations can segregate, with the lowest
possible frequency being 1/(2N). Thus, the rescaling approach will break down if the goal is to
study the characteristics of very low-frequency polymorphisms.
The discreteness of generations can increasingly become a problem as time is rescaled
downwards. For example, if a selective sweep that takes 100 generations in a population of 10000
individuals takes the same amount of “time” in a downscaled population of 500 individuals, it
would complete in only 5 generations (assuming selection coefficients were rescaled upwards
accordingly). In that case, the frequency changes of the selected allele will no longer be small
over the timescale of a single generation, violating a key assumption in many analytical models.
Furthermore, since the time for a beneficial allele to sweep scales with log(N), we actually expect
the sweep in the smaller population to complete even faster than we would expect after
accounting for rescaling.
This discretization can also have unexpected and undesirable side effects on processes such as
adaptation. Rescaling by a factor Q preserves the influx of mutations per generation (since N
µ = (N / Q) µ Q). One rescaled generation is Q original generations, so this implies a lower influx
of mutations per unit of time, but since mean TMRCA scales with N, genetic diversity at neutral
sites is preserved. Since the probability of fixation of a beneficial mutation scales with s, the influx
of successfully established beneficial mutations per unit time is preserved (since N µ s dt = (N / Q) µ
Q s Q (dt / Q)), implying a smaller number of fixed selected mutations since the mean TMRCA in
the population. Since rescaled values of s are larger, this has the net effect of substituting many
mutations of small effect with a single one of large effect (with Q = 100, replacing 100 mutations
with s = 0.001 by a single one of s = 0.1), a very different model of adaptation.
While it may be tempting to always rescale any given scenario to a very small population size,
then, one must be careful that finite-size effects do not distort results too much. It is therefore
generally advisable to test any rescaled scenario at larger and smaller sizes in order to make sure
that the results are consistent. When tree-sequence recording can be used to speed up a
simulation, by allowing simulation of neutral mutations to be deferred (see section16.2), or even
by allowing the coalescent to be used for neutral burn-in with recapitation (see section 16.10), that
will usually be strongly preferable, since tree-sequence recording does not introduce such artifacts.
Nevertheless, since tree-sequence recording may not be applicable to a given model, or may
provide insufficient performance enhancement, rescaling is still sometimes needed. Rescaling may
be used in conjunction with tree-sequence recording, with the appropriate adjustment of
parameters such as µ and r for mutation overlay and recapitation.
As an example of the use of rescaling, consider the following recipe, in which we model neutral
and deleterious mutations occurring over a 10 kb locus in a population of initial size 5000. In
generation 50000, the population experiences a bottleneck that lasts for 5000 generations. A
random population sample is then taken in generation 60000:
initialize() {
initializeMutationRate(1e-8);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", -0.01);
initializeGenomicElementType("g1", c(m1,m2), c(0.8,0.2));
initializeGenomicElement(g1, 0, 9999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 5000); }
50000 { p1.setSubpopulationSize(1000); }
55000 { p1.setSubpopulationSize(5000); }
60000 late() { p1.outputSample(10); }
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

93

The patterns of diversity observed in the population sample retrieved at the end of the
simulation should be very similar to those obtained from the following recipe, in which we
downscaled the population size by a factor of ten:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", -0.1);
initializeGenomicElementType("g1", c(m1,m2), c(0.8,0.2));
initializeGenomicElement(g1, 0, 9999);
initializeRecombinationRate(1e-7);
}
1 { sim.addSubpop("p1", 500); }
5000 { p1.setSubpopulationSize(100); }
5500 { p1.setSubpopulationSize(500); }
6000 late() { p1.outputSample(10); }

Note how the population sizes, times, mutation rates, recombination rates, and selection
coefficients were all rescaled in this scenario. On a 2.26 GHz Intel Xeon Mac Pro running the
models in SLiMgui, the first recipe runs in 218 seconds, whereas the second runs in 12 seconds –
quite a significant difference.
One caveat is that if the original model uses very high recombination rates, those rates should
not be scaled in the simple multiplicative fashion described above. In particular, consider that a
recombination rate of 0.5 represents completely independent assortment from one base to the
next, due to a probability of crossover between bases of 0.5 (see section 21.1’s documentation of
initializeRecombinationRate()). If a model uses a rate of 0.5 between sites, a rescaled version
of that model would still use a rate of 0.5 between those sites, because completely independent
assortment can’t get more independent; indeed, if the rescaling factor were, e.g., 10, then
multiplying the original recombination rate by the rescaling factor would result in a nonsensical
scaled rate of 5.0 that would not even be interpretable as a probability. In point of fact, the simple
multiplication of rates when rescaling a model is always just an approximation, but for rates less
than 0.001 and rescaling factors of 10.0 or less it will be such a close approximation that the
difference shouldn’t matter. The correct formula for rescaling of recombination rates in SLiM, that
gives the rescaled, per-locus recombination rate rscaled corresponding to an original per-locus
recombination rate of r with a rescaling factor of n, is (thanks to Peter Ralph):

rscaled =

1
(1 − (1 − 2r)n )
2

For small values of r this produces an essentially linear scaling by n, but as r approaches 0.5 the
scaling saturates in the desired manner. For an original rate of 0.001 and a rescaling factor n of 10,
this suggests a rescaled recombination rate of 0.00991, which is only very slightly lower than the
rate of 0.01 produced by naive multiplication. If the original rate were 0.01, however, the rescaled
rate would be 0.0915, almost 10% off from the rate of 0.1 provided by naive multiplication.
This formula is based upon the probability that a binomial draw will be odd; as a region of
length n squeezes down to a single site, the important question for rescaling the recombination
rate is the probability that the original region of length n would have an even number of
recombination events (cancelling out to produce no effect, once rescaled) or an odd number of
recombination events (producing the same effect as a single crossover, once rescaled). In practice,
however, if you are rescaling a model in a parameter regime where the effects of this formula
matter to your results (besides the fact that a rate of 0.5 should stay 0.5 to produce independence),
you may be pushing the limits of safe rescaling, and should proceed with extreme caution. See
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

94

section 13.18 for a somewhat unusual application of this formula to scaling one particular region
of the chromosome in a model.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

95

6. Sexual reproduction
The standard model of reproduction in SLiM assumes (1) hermaphroditic diploid individuals, (2)
reproduction by biparental sexual mating, (3) a uniform rate of crossing over without gene
conversion during recombination, and (4) modeling of an autosome rather than a sex
chromosome. In this chapter, we will explore how each of these assumptions can be modified to
study various other scenarios of reproduction. The only limitations are that SLiM remains restricted
to diploid organisms (but haploids can be simulated with some creative scripting; see the recipe in
section 13.13), hermaphroditism or two-sex systems, and X–Y sex chromosome systems.
6.1 Recombination
6.1.1 Crossing over: Making a random recombination map
Section 4.1.6 introduced the initializeRecombinationRate() method in our basic neutral
simulation. That section also discussed the possibility of supplying different recombination rates
for different stretches of the chromosome, since initializeRecombinationRate() takes a vector of
rates and a vector of end positions.
Here, then, let’s examine a recipe for setting up random variation in the recombination rate
along a chromosome, to simulate the presence of “hot spots” and “cold spots”:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
// 1000 random recombination regions
ends = c(sort(sample(0:99998, 999)), 99999);
rates = runif(1000, 1e-9, 1e-7);
initializeRecombinationRate(rates, ends);
}
1 { sim.addSubpop("p1", 500); }
10000 { sim.simulationFinished(); }

First of all, please note that this recipe is not empirically based; the choice of 1000
recombination regions, and the choice of uniform distribution from which the recombination rates
are drawn, are both arbitrary. However, it would be straightforward to tailor this recipe to match
whatever empirical information about recombination rates and regions one might have.
Let’s look at how this code works. Everything new is in the initialize() callback, under the
comment “// 1000 random recombination regions”. The initializeRecombinationRates() call
sets all of the rates in one fell swoop, using a vector named rates that contains the recombination
rates (in recombination events per base pair per generation), and a vector named ends that
contains the ends of the recombination regions (in base pairs along the chromosome). Each of
these vectors has 1000 elements.
The previous script line creates the rates vector by calling the runif() function of Eidos. This
function generates random draws from a specified uniform distribution. The first parameter
requests 1000 samples; the next two parameters give the minimum and maximum values for the
uniform distribution to be used. Note that Eidos has several other functions for drawing from other
distributions, such as rnorm() for a normal distribution, rpois() for a Poisson distribution,
rbinom() for a binomial distribution, and rexp() for an exponential distribution. Using these
facilities, it is quite easy to make simulations based upon some sort of random configuration or
behavior, as in this recipe.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

96

Continuing to work backwards, the preceding script line sets up the vector of recombination
region endpoints:
ends = c(sort(sample(0:99998, 999)), 99999);

Let’s analyze this from the inside out. The innermost call is sample(0:99998, 999). The Eidos
function sample() returns a random sample from a given vector. The given vector here is the
sequence 0:99998, containing every base pair position along the chromosome except for the very
last position (for reasons we will see momentarily). The second parameter requests 999 samples.
The sample() function draws its samples without replacement by default, which is what we want;
each recombination region should end at a different base pair. Next, the result from sample() is
passed to the sort() function, which sorts the vector (because initializeRecombinationRates()
requires the vector of end positions to be in sorted order). Finally, the c() function is used to
concatenate the value 99999 onto the end of the vector, providing the final entry for the vector; this
is the reason that only 999 samples were drawn, from positions up to only 99998. The last end
position is required by initializeRecombinationRates() to be the last position in the
chromosome.
If we paste this recipe into SLiMgui and do a Recycle and a Step, the random recombination
map is loaded into the simulation. To see that it has worked, click the R button to turn on display
of rate maps (for recombination and mutation – in this case, just recombination, since that is the
rate map that has been set). The chromosome view should then show you something like this:

The regions shown in the darkest blues are cold spots, with low rates closer to 10−9, whereas
the regions shown in shades close to white are hot spots, with high rates closer to 10−7.
Remember that you can drag out a display range in the upper chromosome view, which changes
what you see in the lower chromosome view – including the recombination map.
Note that SLiM also allows the recombination map to be tailored at an individual level using a
recombination() callback; see sections 13.5 and 21.5. Also, note that a mutation rate map may
be configured in exactly the same way as this recipe’s recombination rate map, using
initializeMutationRate().
6.1.2 Crossing over: Reading a recombination map from a file
Rather than generating a random recombination map, you might want to read in and use an
empirically determined recombination map such as the map from Drosophila melanogaster
presented by Comeron et al. (2012) (dataset available in Fiston-Lavier & Petrov 2013). Let’s work
with chromosome arm 2L, which looks like this:
1
0
100001 0
200001 0
300001 0.234550139
400001 0.234550139
500001 1.993676178
...
22800001 0.234550139
22900001 0
23000001 0

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

97

The length of the 2L arm is given as 23011544 bases; positions in SLiM will thus range from 0 to
(always beware of off-by-one errors!). The file gives start positions in bases, beginning
with 1, with rates in the second column in cM/Mbp (centimorgans per megabasepair). We have a
little format conversion to do, but reading this file in and using it in SLiM is quite straightforward:
23011543

initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 23011543);
// read Drosophila 2L map from Comeron et al. 2012
lines = readFile("/Users/bhaller/Desktop/Comeron_100kb_chr2L.txt");
rates = NULL;
ends = NULL;
for (line in lines)
{
components = strsplit(line, "\t");
ends = c(ends, asInteger(components[0]));
rates = c(rates, asFloat(components[1]));
}
ends = c(ends[1:(size(ends)-1)] - 2, 23011543);
rates = rates * 1e-8;
initializeRecombinationRate(rates, ends);
}
1 { sim.addSubpop("p1", 500); }
10000 { sim.simulationFinished(); }

The action is in the initialize() callback after the comment, as before. The first line calls the
function of Eidos to read in a text file at the given filesystem path; you will need to
change this path to the correct path to the file on your computer. The result of readFile() is a
string vector of lines; each line in the input file becomes a separate string-element.
Next, we set rates and ends both to NULL. These will be the rates and endpoints vectors that we
will give to SLiM; we will add entries to them one at a time as we process each line in the input
file. NULL is a special type indicating “no value”; it often provides a good initial state.
Now comes a loop over the lines read from the file; each line is placed into the loop index
variable line, and that line is then processed by the loop body. The strsplit() call splits the line
into substrings separated by tab ("\t") characters; since each line has two values separated by a
tab, components end up as a string vector of length 2. The next two lines handle those two
components by converting them to the correct type and then concatenating them on to the tail of
ends and rates using the c() function. This method of building a vector by successive
concatenation is a common quick-and-dirty approach, although it is not terribly fast. When the
loop finishes, we have rates and positions as specified by the input file.
Finally, we need to convert that data into the format expected by SLiM. The input file specifies
start positions, but we want end positions. The expression ends[1:(size(ends)-1)] thus strips off
the first value, the start position 1, which is not needed. The -2 correction accomplishes two
things: (1) it shifts from starts to ends (each region ends at the base position one less than the base
position at which the next region starts), accounting for -1, and (2) it shifts from the 1-based system
of the input file, in which the first base is at position 1, to the 0-based system of SLiM, in which the
first base is at position 0, accounting for another -1. Finally, the c() call adds the last base
position, 23011543, to the tail of the vector.
readFile()

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

98

The second conversion line just converts the recombination rates from cM/Mbp to a rate per
base pair per generation, as used by SLiM, by multiplying by 10−8, a conversion ratio that comes
from the units involved. 1 cM means that there is a probability of 0.01 of crossover during
meiosis. Therefore, 1 cM/Mbp = 10−6 cM/bp = 10−8 probability of crossover per base pair.
If you paste this recipe into SLiMgui, Recycle, Step, and turn on display of rate maps in the
chromosome view with the R button, you should see something like this:

This is the Drosophila 2L recombination map according to Comeron et al. (2012). Note that a
mutation rate map could be read in and set up in SLiM in much the same manner.
6.1.3 Gene conversion
In addition to crossing over, recombination can also lead to gene conversion, the copying of a
stretch of the genetic sequence from one chromosome to its homologous chromosome (Chen at al.
2007). By default gene conversion is not enabled in SLiM, but it can be turned on easily:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
initializeGeneConversion(0.2, 25);
}
1 { sim.addSubpop("p1", 500); }
10000 { sim.simulationFinished(); }

The initializeGeneConversion() call takes two parameters. The first is the fraction of
recombination events that will result in gene conversion; here we decide that 0.2 or 20% will.
The second is the mean length of the gene conversion tract, in base pairs; here we specify 25.
When a gene conversion event occurs, the length of the converted tract is drawn by SLiM from a
geometric distribution with the specified mean. Note that SLiM does not presently support
variation in the gene conversion rate along the chromosome; however, gene conversion can be
tailored on a per-individual basis using a recombination() callback, which would allow much the
same thing (see sections 13.5 and 21.5).
6.2 Separate sexes
6.2.1 Enabling separate sexes
Simulations in SLiM involve hermaphroditic individuals by default. If you wish to simulate
separate sexes, however, doing so is essentially just the flip of a switch:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
initializeSex("A");
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

99

1 { sim.addSubpop("p1", 500); }
10000 { sim.simulationFinished(); }

This recipe is identical to the basic neutral recipe from section 4.1, apart from the addition of a
call to the function initializeSex():
initializeSex("A");

The parameter here specifies the type of chromosome that is to be modeled; "A" specifies
modeling of an autosome, but it is also possible to model the X or Y chromosome in SLiM (see
section 6.2.3).
Once sex is turned on in a simulation, there is no way to turn it off; it is not possible to make a
SLiM simulation in which individuals switch between being sexual and being hermaphroditic. A
similar effect could probably be produced by modifying the mate choice algorithm, however (see
chapter 11).
Having turned sex on, all subpopulations will keep track of male and female individuals
separately, and biparental matings will always involve a male parent and a female parent. A few
things work a bit differently when sex is enabled; each subpopulation has a sex ratio that can be
specified and modified (see section 6.2.2), for example, and selfing (see section 6.3.1) is not
allowed when sex is enabled. Usually you will know whether your code is running with sex
enabled or disabled, but if you wish to write general-purpose code that works in either
environment, the sexEnabled property of SLiMSim will tell you whether sex is presently enabled.
Section 4.2 discussed output of basic simulation state using the outputFull() and
outputSample() methods of SLiMSim. When sex is enabled, the output generated by these
methods changes slightly. In particular, the H symbol that was used to designate individuals as
hermaphrodites will change to indicate the sex of each individual with M or F. In addition, if sex
chromosomes are modeled (see section 6.2.3), the A that designated genomes in the output as
autosomes will change to an X or Y as appropriate.
6.2.2 Sex ratios
When individuals are one of two sexes, rather than being hermaphroditic, the question of the
sex ratio immediately arises. A sex ratio of 0.5 is maintained by default in SLiM; if that is what you
want, no further action is required. To set a sex ratio (i.e., male fraction; see below) of 0.6, on the
other hand, you simply supply an extra parameter to addSubpop() (or to addSubpopSplit(), which
works in the same way):
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
initializeSex("A");
}
1 { sim.addSubpop("p1", 500, 0.6); }
10000 { sim.simulationFinished(); }

If you paste this recipe into SLiMgui, Recycle, and Step twice to get through generation 1, you
can see in the population table view that this sex ratio has been used by SLiM:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

100

The column labeled
shows the current sex ratio of each subpopulation; the subpopulations
in the simulation do not all need to have the same sex ratio. The sex ratio of a subpopulation can
be accessed in script through the sexRatio property of Subpopulation; in the recipe above, for
example, p1.sexRatio would be equal to 0.6.
The “sex ratio” in SLiM refers to the male fraction, the fraction of the total subpopulation that is
male. Symbolically, it is therefore M/(M+F), where M and F are the number of males and females
respectively. A sex ratio of 0 would imply all females; a sex ratio of 1 would imply all males; the
default sex ratio of 0.5 is half males and half females.
You are free to set almost any sex ratio you wish; SLiM will raise an error, however, if the
simulation is forced into a situation in which a subpopulation would become unisexual (whether
due to the sex ratio, cloning rate, or other factors). All subpopulations must be viable, which
means that at least one parent of each sex must be available to produce offspring.
It is possible for the sex ratio of a subpopulation to change over time. For example, here is a
recipe for a simulation in which the sex ratio fluctuates randomly around 0.5:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
initializeSex("A");
}
1 { sim.addSubpop("p1", 500); }
1: { p1.setSexRatio(runif(1, 0.3, 0.7)); }
10000 { sim.simulationFinished(); }

The setSexRatio() method of Subpopulation simply takes a float representing the new sex
ratio. As with setSubpopulationSize(), the change is not effected immediately; instead, the call
sets the target sex ratio that will be used when offspring are generated. The recipe above draws a
new random sex ratio between 0.3 and 0.7 in each generation using runif().
6.2.3 Modeling sex-chromosome evolution
So far we have always seen simulations of autosomal evolution, but SLiM also supports
simulation of sex chromosome evolution. The choice of chromosome type is made in the
initializeSex() call, so to simulate X chromosome evolution one would simply do:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
initializeSex("X");
}
1 { sim.addSubpop("p1", 500); }
10000 { sim.simulationFinished(); }

The "X" passed to initializeSex() sets SLiM to model the X chromosome. SLiM will handle
all of the necessary details; females in the simulation will be XX, whereas males will be XY. Since
the Y chromosome is not being modeled in this scenario, it will be a “null chromosome”, a
placeholder kept by SLiM simply to make the diploid bookkeeping balance. Null chromosomes
have no structure, receive no mutations, and raise an error if you attempt to do much of anything
with them. It is also possible to supply "Y" to initializeSex(), to model the Y chromosome; in
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

101

this case it is the X chromosome that will be a “null chromosome”. If you need to work directly
with Genome objects in your script (we will see some examples of this later), you can find out
whether a given genome is null or not with the isNullGenome property of Genome.
When modeling sex chromosomes, recombination works as you would expect it to. To be
specific, if you are modeling the X chromosome, recombination will occur between the two X
chromosomes of female parents when generating their gametes, whereas no recombination will
occur between the X and Y chromosomes of a male. If you are modeling the Y chromosome,
recombination does not occur, since the X chromosome in this case is a null chromosome, and
individuals never possess two Y chromosomes.
When modeling the X chromosome, there are two different ways in which an individual can be
heterozygous for a given mutation: the individual might be XX (i.e., female) and possess the
mutation on just one X chromosome, or the individual might be XY (i.e., male) and possess the
mutation on the one X chromosome present. These cases are handled differently by SLiM when it
calculates the fitness effect of a mutation. The first case is handled in the same way as when
modeling an autosome: the dominance coefficient for the mutation type of the mutation, as
supplied to initializeMutationType(), specifies a multiplicative modifier for the selection
coefficient of the mutation (see section 4.1.3). The second case is unique to the case of modeling
the X chromosome, and is handled using a special X-dominance coefficient. This coefficient is 1.0
by default, meaning that an XY individual possessing an X-linked mutation will experience the
same fitness effect from that mutation as an XX individual that is homozygous for that mutation (a
reasonable assumption because of X inactivation). The X-dominance coefficient may be changed
by supplying an optional second parameter to the initializeSex() function:
initializeSex("X", 0.8);

The dominanceCoeffX property of SLiMSim can also be used to get and set the X-dominance
coefficient; for example, one could vary its value over time. However, this mechanism is presently
much less flexible than the standard dominance coefficient, since a single X-dominance coefficient
value is used for all mutations in all subpopulations at present. Of course it would be possible to
introduce a more complex fitness calculation scheme using a fitness() callback, as discussed in
chapter 9.
When modeling the Y chromosome, individuals can only possess zero or one copy of a given
mutation, since males possess one Y and females possess none. For this reason, SLiM does not use
a dominance coefficient at all in this case. The selection coefficient of the mutation determines its
fitness effect, and the dominance coefficient supplied in the mutation type is simply ignored.
It is worth noting that the way in which the Y chromosome is modeled in SLiM is in some ways
similar to how one might model mitochondrial genomes. If SLiM is used in this way, the XY
“males” are really the females, with a single genome possessed by all of the mitochondria in a
given female. The XX “females” are really the males; they possess mitochondria too, of course, but
they do not pass them down to their offspring, so their mitochondria are an evolutionary dead end
that does not need to be modeled. This equivalence should work unless you need to model the
fitness effects of mitochondrial mutations in males; that would probably also be possible in SLiM,
using an autosomal model with a modifyChild() callback, but it is beyond the scope of this
manual.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

102

6.3 Selfing and cloning
6.3.1 Selfing in hermaphroditic populations
Selfing, or self-fertilization, is the mating of an individual with itself: male and female gametes
from one individual combine to form a zygote. Selfing is different from cloning in that gametes are
produced and fertilize, so offspring are not clones of their parent; notably, recombination occurs in
selfing. It is an essentially hermaphroditic phenomenon, since hermaphroditic individuals can
produce both eggs and sperm, can produce both X and Y gametes, and are fertile. There are
probably counterexamples somewhere in biology, where it sometimes seems that anything that is
possible exists; but in SLiM, at least, selfing is limited to the hermaphroditic case.
Returning to a hermaphroditic model, then, selfing can be turned on with a single call:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 500);
p1.setSelfingRate(0.8);
}
10000 { sim.simulationFinished(); }

The setSelfingRate() method call here on p1 tells SLiM that selfing should be used to generate
80% of the offspring in subpopulation p1. The selfing rate may be anything from 0.0 (the default)
to 1.0. The setSelfingRate() method may be called at any time, so the selfing rate can be varied
over time or in response to other simulation conditions. The current selfing rate for a
subpopulation is shown in the population table view under the
column.
Usually it is not important, but it should be noted that in non-sexual simulations SLiM does not
prevent a parent from being chosen twice, as both parents in a biparental mating event. Even
when the selfing rate is set to 0, therefore, a low background rate of selfing may occur. This can
easily be prevented if necessary; see the recipe in section 12.4.
6.3.2 Cloning
SLiM also supports clonal reproduction, in which offspring are an exact genetic copy of a single
parent (except for new mutations introduced during the copying of the DNA). Indeed,
hermaphroditic simulations may have a combination of biparental mating, self-fertilization, and
cloning. In a hermaphroditic model, setting up clonal reproduction is a single call:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 500);
p1.setCloningRate(0.1);
}
10000 { sim.simulationFinished(); }

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

103

The setCloningRate() call here tells SLiM to generate 10% of the offspring in p1 clonally (the
remaining 90% will be generated biparentally as usual). The cloning rate may be anything from
0.0 (the default) to 1.0. The setCloningRate() method may be called at any time, so the cloning
rate can be varied over time or in response to other simulation conditions. The current cloning
rate for each subpopulation is shown in the population table view under the
and
columns; in the hermaphroditic case the same number will be shown in both columns (the icon
depicts a little Athena budding parthenogenically from the head of Zeus, if that is not immediately
obvious).
Clonal reproduction is also supported in the case of separate sexes. It works in essentially the
same way:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
initializeSex("A");
}
1 {
sim.addSubpop("p1", 500);
p1.setCloningRate(c(0.5,0.0));
}
10000 { sim.simulationFinished(); }

The only difference is that when separate sexes are enabled by initializeSex(), the
setCloningRate() method can take a vector of two rates, as above. Here the cloning rate is set to
0.5 for females, but 0.0 for males, reflecting a somewhat common situation in some taxa, in which
females can reproduce parthenogenically but males can reproduce only sexually. You may still
pass a singleton value to setCloningRate(); when separate sexes are enabled, that value is then
taken as the cloning rate for both males and females. The current cloning rates for the females and
males in each subpopulation is shown in the population table view under the
and
columns.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

104

7. Mutation types, genomic elements, and chromosome structure
A number of concepts such as mutation types, genomic element types, genomic elements, and
chromosome structure were already introduced in chapter 4 in our basic neutral simulation. Here
we return to those foundational concepts and explore them in more detail, showing how
simulations might use multiple mutation types, multiple genomic element types, and many
genomic elements to simulate more realistic chromosome structures with mutations of varying
selective effects.
The structure of this chapter will be a bit different from that of previous chapters. In this chapter
each subsection will build upon the recipe introduced by the previous subsection, working
towards making a relatively large final recipe for a full-chromosome simulation.
7.1 Mutation types and fitness effects
In section 4.1.3 the concept of a mutation type was introduced: a category of mutations that
represents some particular subset of the mutations in a simulation. Examples might include
“neutral mutations”, “beneficial mutations introduced by the simulation script in generation 10”,
or “mutations that will be forced to sweep to fixation”. Mutation types are represented in SLiM
with the MutationType class, and each defined mutation type has a unique symbolic identifier of
the form mX, like m1 or m27. Whenever you want to be able to generate or refer to a particular type
of mutations separately from other types, you will want to define a new mutation type.
Section 4.1.3 also introduced the function used to create new mutation types at initialization
time, initializeMutationType(). This function can only be called inside an initialize()
callback; in fact, it is not even defined at other times. Mutation types can therefore be set up only
at initialization time; you must set up all of the types that your simulation will need for the entire
run. After initialization, mutation types are typically static entities; however, there is a method,
setDistribution(), that may be used to change a mutation type’s distribution of fitness effects,
affecting new mutations generated from that point onward.
A mutation type is basically defined by two things: its dominance coefficient, and its
distribution of fitness effects (DFE). Both of these properties are important only because they affect
new mutations that are generated from a given mutation type: (1) each mutation of a given
mutation type receives a selection coefficient drawn from the mutation type’s DFE, and (2)
individuals that are heterozygous for a mutation will use the dominance coefficient, obtained from
the mutation type of the mutation, to modify the selection coefficient of the mutation.
For the purposes of this recipe, let’s define four mutation types: two for neutral mutations (noncoding and synonymous), one for mildly deleterious mutations, and one for strongly beneficial
mutations:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", 0.0);
initializeMutationType("m3", 0.1, "g", -0.03, 0.2);
initializeMutationType("m4", 0.8, "e", 0.1);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 5000); }
10000 { sim.simulationFinished(); }

//
//
//
//

non-coding
synonymous
deleterious
beneficial

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

105

The neutral mutation types, m1 and m2, are defined with a dominance coefficient of 0.5 (which
does not matter for neutral mutations), and mutations drawn from them receive a fixed selection
coefficient (DFE type "f") of 0.0 as specified by the DFE parameter. Note that they are exactly
identical in their parameters; however, they will be used to represent different conceptual types of
neutral mutations, with m1 representing neutral mutations in non-coding regions and m2
representing synonymous mutations in coding regions. Drawing this distinction will allow us to
distinguish these two classes of mutations later on, when we are observing the simulation running.
The deleterious mutation type, m3, has a dominance coefficient of only 0.1, meaning that
heterozygous individuals feel very little effect from these mutations; they are almost recessive.
These mutations are drawn from a gamma distribution (see section 21.9) with a mean of −0.03,
with a shape parameter (alpha) of 0.2.
Finally, the beneficial mutation type, m4, has a dominance coefficient of 0.8, representing
incomplete dominance. Mutations of this type are drawn from an exponential DFE with a mean of
0.1.
All defined mutation types can be viewed in the drawer on the simulation window. If you paste
in the recipe so far, Recycle, and Step, then open the drawer by pressing the
button, you can
see the mutation types listed, which can be quite useful if mutation types are manufactured en
masse with a for loop or similar automated construction method:

We are not using the new mutation types yet, just defining them; we’ll make more progress with
this recipe in the next section. Before we move on to that, however, there is one hidden feature
worth mentioning. The mutation type table that is depicted in the screenshot above has a nice
hidden addition: tooltips that show the mutation type’s DFE graphically. If you place the mouse
cursor over the m1 line without moving it for a second or a bit more (often called a “hover” of the
cursor), a “tooltip” – a little informational tab – will appear:

(Ignore the somewhat different appearance of the two screenshots above; they were taken on
different versions of OS X, and Apple changed the standard system font and other aspects of table
appearance from one OS X release to the next.) The tooltip shown above contains a plot of the
mutation type’s distribution of fitness effects; the x-axis is the selection coefficient, and the y-axis is
the distribution’s relative density. In this case, since the mutation type has a fixed selection
coefficient of 0.0, the distribution consists of a single peak at 0.0. Hover over m3, and a more
interesting tooltip appears:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

106

This plot shows the specific gamma distribution specified by the parameters for m3; note that the
x-axis now spans the range −1 to 0 only, since this DFE does not include any positive selection
coefficients. The distribution shown has most of its density very close to zero, but a tail is visible
extending downward perhaps as far as −0.25.
Hovering over m4 shows its DFE:

This is a non-negative DFE, so the x-axis spans 0 to 1, and this distribution is much broader.
This sort of visualization can prove quite helpful in configuring and debugging DFEs.
7.2 Genomic element types
Now that we have a couple of mutation types defined, we can make some genomic element
types that use these mutation types. Genomic element types were introduced in section 4.1.4;
they represent a type of region in the genome with a particular mutational profile. For this simple
toy model, let’s focus on exons, introns, and non-coding regions:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
// non-coding
initializeMutationType("m2", 0.5, "f", 0.0);
// synonymous
initializeMutationType("m3", 0.1, "g", -0.03, 0.2); // deleterious
initializeMutationType("m4", 0.8, "e", 0.1);
// beneficial
initializeGenomicElementType("g1", c(m2,m3,m4), c(2,8,0.1)); // exon
initializeGenomicElementType("g2", c(m1,m3), c(9,1));
// intron
initializeGenomicElementType("g3", c(m1), 1);
// non-coding
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 5000); }
10000 { sim.simulationFinished(); }
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

107

Genomic element type g1 defines exons, which often suffer deleterious mutations, sometimes
get neutral (synonymous) mutations, and very rarely get beneficial mutations. Type g2 defines
introns, which often get neutral (non-coding) mutations, and occasionally get deleterious
mutations. Type g3 defines non-coding regions, which get neutral (non-coding) mutations only.
Each genomic element type is defined with a call to initializeGenomicElementType(), as
previously discussed in section 4.1.5; its parameters supply the identifier for the new type, a vector
of mutation types, and a vector of relative proportions for those mutation types among the new
mutations that occur in the genomic elements of this type.
Notice that there is not a one-to-one correspondence between mutation types and genomic
element types; this is typical, since the two classes represent quite different objects. A genomic
element often draws from a variety of different mutation types, with various probabilities. If this
model were expanded to be more biologically realistic, it might have quite a few more genomic
element types (3′ and 5′ UTRs, promoters and enhancers and silencers, various different types of
non-coding regions...) but might have only a couple more mutation types (a strongly deleterious
mutation type, to represent things like premature stop codons and broken promoters, for example).
Again, this recipe is not yet complete; now that we have defined genomic element types we
need to make genomic elements that use those types. However, at this point we can Recycle,
Step, and observe the defined genomic element types in the simulation window’s drawer:

Note that each genomic element type is automatically assigned a color by SLiMgui. We will
soon see how these colors are used.
7.3 Chromosome organization
As previously discussed in section 4.1.5, genomic elements are regions of a chromosome that
use a particular genomic element type. The genomic element types define what possibilities exist
in the chromosome; the genomic elements determine what actually does exist. Having defined
genomic element types for exons, introns, and non-coding regions, we now need to create
genomic elements to express how the genomic element types are distributed in the chromosome.
As has been the case all along with this recipe, the goal here is not biological realism;
nevertheless, it is interesting to try to come up with a recipe that broadly approximates the
structure of a real chromosome, just to show how the problem might be approached. In this
recipe, then, the formulas used for the lengths of exons and introns are very loosely based on
empirical length distributions.
This is by far the longest recipe we’ve seen so far, so a bit of preamble to prepare the way for it
is helpful. We have previously used for loops to iterate over a specified vector. This example uses
two new looping constructs, a do loop and a do–while loop. These both iterate as long as a given
condition remains true; when the conditions tests false, the loop terminates. The only difference
between them is that do–while tests its condition at the end of the loop body, and thus the loop
body always executes at least once. The other new thing in this recipe is use of the rlnorm()
function, which draws samples from a lognormal distribution. In addition to the number of
samples to draw, it takes the mean and standard deviation of the distribution as parameters, on the
log scale. With that preface, here is the recipe:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

108

initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1",
initializeMutationType("m2",
initializeMutationType("m3",
initializeMutationType("m4",

0.5,
0.5,
0.1,
0.8,

"f",
"f",
"g",
"e",

0.0);
0.0);
-0.03, 0.2);
0.1);

//
//
//
//

non-coding
synonymous
deleterious
beneficial

initializeGenomicElementType("g1", c(m2,m3,m4), c(2,8,0.1)); // exon
initializeGenomicElementType("g2", c(m1,m3), c(9,1));
// intron
initializeGenomicElementType("g3", c(m1), 1);
// non-coding
// Generate random genes along an approximately 100000-base chromosome
base = 0;
while (base < 100000) {
// make a non-coding region
nc_length = asInteger(runif(1, 100, 5000));
initializeGenomicElement(g3, base, base + nc_length - 1);
base = base + nc_length;
// make first exon
ex_length = asInteger(rlnorm(1, log(50), log(2))) + 1;
initializeGenomicElement(g1, base, base + ex_length - 1);
base = base + ex_length;
// make additional intron-exon pairs
do
{
in_length = asInteger(rlnorm(1, log(100), log(1.5))) + 10;
initializeGenomicElement(g2, base, base + in_length - 1);
base = base + in_length;
ex_length = asInteger(rlnorm(1, log(50), log(2))) + 1;
initializeGenomicElement(g1, base, base + ex_length - 1);
base = base + ex_length;
}
while (runif(1) < 0.8);

// 20% probability of stopping

}
// final non-coding region
nc_length = asInteger(runif(1, 100, 5000));
initializeGenomicElement(g3, base, base + nc_length - 1);
// single recombination rate
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 5000); }
10000 { sim.simulationFinished(); }

This is too much code to parse through line by line, but it should be fairly clear what is going
on. The outer loop adds one non-coding region and then one gene each time that it iterates; the
inner loop adds one intron-exon pair with each iteration. The overall algorithm is not targeted to
end at a fixed chromosome length (doing that without biasing the metrics of the final gene
generated would be a bit tricky); instead, genes are added until the chromosome length exceeds
100000 and then the process is ended with a final non-coding region. Eidos does not have any
built-in facility for graphing, so you can’t directly see the distributions used to generate the lengths
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

109

of the exons and introns; however, since the syntax of Eidos is so similar to that of R, it is quite
easy to use R to plot these distributions. The only hitch is that conversion to integer is asInteger()
in Eidos but as.integer() in R, so that has to be tweaked. If you are so inclined, try running this
code in R:
x = as.integer(rlnorm(100000, log(50), log(2))) + 1
hist(x[x < 250], breaks=100)

That will produce a plot of the distribution of exon lengths, which may be compared to the
distribution shown in Deutsch & Long (1999), which was loosely used as a reference for this
model. The distribution of intron lengths may be compared similarly.
If you paste in the recipe above, Recycle, and Step, you will see the random chromosome
structure that it generated, which might look something like this (after turning on display of
genomic elements with the G button, and selecting a subrange in the upper chromosome view to
show some detail):

The structure of non-coding regions (purple) interspersed with genes made up on exons (blue)
alternating with introns (green) appears to be as intended. It should now be clear, of course, how
SLiMgui uses the colors that it assigns to genomic element types, as seen in the previous section.
Rather than generating a random chromosome organization, you might wish to read in an
actual chromosome map and generate genomic regions based upon that information, to simulate
the evolution of a particular organism. That is left as an exercise for the reader, but with the recipe
in section 6.1.2 as guidance it should not be too difficult.
7.4 Custom display colors in SLiMgui
The model that we built in the previous three sections uses various default colors schemes
supplied by SLiMgui: mutations are colored according to their selection coefficients, genomic
element types are automatically assigned colors from a standard palette, and individuals are
colored according to their fitness. For simple models these default colors generally suffice, but
when constructing complex models it may be helpful to customize the color scheme used in
SLiMgui to improve the clarity of the models’ visual representation. In this section we will briefly
explore SLiM’s facilities for doing so.
First of all, the colors used for genomic regions, as shown in the figure above in section 7.3, is
perhaps less than ideal. It would be nice if non-coding regions were shown in a distinctive color
suggestive of the unimportance of those regions to the organism, such as black. Similarly, perhaps
exons could be in the brightest color, connoting their importance, whereas introns could be shown
in a more muted fashion. This can be accomplished by simply adding these lines after the calls to
initializeGenomicElementType() have set up the genomic element types:
g1.color = "cornflowerblue";
g2.color = "#00009F";
g3.color = "black";

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

110

This produces a chromosome view like this:

It works by setting values on the color property of the genomic element types. By default, these
properties are the empty string, "", which tells SLiM to use a color from its default color
palette. Here we have instead told SLiM to use a light blue named "cornflowerblue" for the
exons represented by g1, and "black" for non-coding regions represented by g3. These names are
two of the 657 named colors supported by SLiM. The complete list of named colors is identical to
the named color list provided by the R language, for simplicity, and so there are various web
resources available that show all of the names and their corresponding colors (one such resource:
http://research.stowers-institute.org/efg/R/Color/Chart/ColorChart.pdf). You can browse such a list,
and then simply use the name for the color you want in SLiM.
The color used for g2 is not a named color, however; instead, it is the rather cryptic string value
"#00009F". This specifies the color in hexadecimal, or base 16, as two-digit values for the red,
green, and blue components of the color. With values of zero for red and green, and a value of 9F
for blue, this string represents a dark blue that works well for introns here.
The Eidos manual has further discussion of how colors are specified in SLiM (this is actually part
of the Eidos language). Note that setting colors on SLiM objects in this way only has an effect in
SLiMgui, but it is entirely legal when running SLiM models at the command line; the properties
still exist, but are unused by SLiM outside of SLiMgui.
Now that the genomic elements are colored nicely, let’s set up new colors for the mutations. By
default, SLiMgui displays neutral mutations in yellow, beneficial mutations in shades of green and
blue, and deleterious mutations in shades of orange and red. Let’s suppose we want to deemphasize neutral mutations in this model; we want them to display, but in a less visible color
than the default bright yellow color. We also want all beneficial mutations to be a single shade of
green (the blue doesn’t show up well against the blues we’ve chosen for our genomic elements),
and we want all deleterious mutations to be bright red so we can see them very clearly. This can
be achieved with a few lines placed after the mutation types have been defined:
color

m1.color
m2.color
m3.color
m4.color

=
=
=
=

"gray40";
"gray40";
"red";
"green";

That produces a display like this (with display of the genomic elements turned off now, for
greater clarity):

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

111

That looks like what we had in mind. If we turn on display of fixed mutations instead of
segregating mutations, however, we see that those are still being displayed in SLiM’s default color
scheme:

This is obviously not what we want. Instead, let’s use darker shades of the same colors used for
the mutations when they are still segregating. This can be done by setting the colorSubstitution
property of the mutation types:
m1.colorSubstitution
m2.colorSubstitution
m3.colorSubstitution
m4.colorSubstitution

=
=
=
=

"gray20";
"gray20";
"#550000";
"#005500";

That produces the desired result:

There’s one thing left for us to tweak. If we turn on display of both segregating and fixed
mutations, SLiMgui displays the fixed mutations using a shade of blue, rather than the colors we
just set up above:

This is deliberate, to prevent too much clutter and chaos in the chromosome view. However,
we deliberately chose dark colors above for our fixed mutations with the intention of having them
coexist aesthetically, so we’d like to tell SLiMgui to use our color scheme in all cases. The default
dark blue color SLiMgui uses for fixed mutations when both are being displayed is a property on
the Chromosome object named colorSubstitution. By default, it is set to "#3333FF", the dark blue
color being used. We can set it to the empty string, "", instead; this eliminates the use of this
default color. Since the Chromosome object is not available until the simulation has been fully
initialized, we can do this at the beginning of generation 1, by adding a line to the generation 1
event:
sim.chromosome.colorSubstitution = "";

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

112

Now our model displays as intended:

The fixed mutations are now in dark, dimmed colors, whereas the currently segregating
mutations are in bright, clear colors that are easily visible against that background.
We won’t give an example of it here, but the Individual class in SLiM also has a color
property that can be set to override the default fitness-based color scheme for display of
individuals. Setting this property should be done in a late() event, since SLiMgui updates its
display at the very beginning of each generation; early() events are executed after SLiMgui has
already displayed, and so setting display colors for individuals at that time will have no visible
effect. Generally SLiMgui’s fitness-based coloring works well, but in some special cases it might
be useful to give a specific color to individuals that possess some particular property – especially
in models in with the tag values of individuals are being used to keep track of extra state
information.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

113

8. SLiMgui visualizations for polymorphism patterns
The previous chapter developed a model of evolutionary dynamics in a scenario involving
neutral, deleterious, and beneficial mutations. Given the full recipe from section 7.3 (not 7.4) of
the previous chapter, we will look briefly in this chapter at the interplay of selection, drift, and
hitchhiking when it executes, since it’s the most complex model we’ve made thus far.
First of all, if you Recycle and Play, you should see a fairly complicated dance of mutations,
some fixing quickly, others struggling, but most flickering in and out of existence quickly:

Notice that display of rate maps has been turned on with the R button; otherwise, the genomic
elements display spans the full view height and makes it hard to see mutations that are at high
frequency. Notice also that the mutation lines are distinct colors; neutral mutations are yellow,
beneficial are green, and deleterious are orange or sometimes even red. You might need to play
with the selection coefficients color scale slider (see section 3.1) to optimize these colors.
A cloud of neutral and deleterious mutations occupies the bottom row of pixels; most mutations
in the model are neutral or deleterious. A few beneficial mutations are visible in green, near 19000
and 48000. A neutral mutation near 83000 and a deleterious mutation near 29000 have been
dragged up to high frequency by the two beneficial mutations they are linked to, both of which are
about to fix. A few dozen generations later, this is the situation:

The beneficial mutation near 48000 has fixed, and is thus no longer displayed. The one near
has not yet fixed, but a new beneficial mutation has arisen near 17000 on top of both the
beneficial mutation near 48000 and the deleterious mutation near 29000. With that added selective
pressure, the deleterious mutation has risen near fixation. After another couple of dozen
generations, however, recombination produced a new variant containing the beneficial mutation
without the deleterious mutation. That variant rose very quickly, fixing the beneficial mutation
while leaving the deleterious mutation at a frequency of only ~0.5. It then dropped and was lost
fairly quickly, no longer having the benefit of its linked companions. In this model it is fairly rare
for a deleterious mutation to make it all the way to fixation, although it does occasionally happen.
As the simulation is running, you can watch the colors of the individuals, shown in the top
center view, to see the mix of different fitnesses present. As a beneficial mutation sweeps, more
and more individuals will turn green (if the fitness color scale slider is set appropriately); once the
mutation fixes, they will suddenly revert toward yellow, since the fitness benefit of that mutation is
no longer part of SLiM’s fitness calculations.
19000

8.1 Mutation frequency spectra
SLiMgui has a number of graphs it can display to help analyze and understand the simulation.
While running this recipe, for example, if you click the
button and select “Graph Mutation
Frequency Spectrum” you should see something like this:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

114

Remember that m1 and m2 are neutral, m3 is deleterious, and m4 is beneficial. What this plot tells
us is that out of all mutations that are of type m1, for example, the vast majority are at very low
frequency; the same is true of m2 and m3. Out of all mutations that are of type m4, however – the
beneficial mutations – a fairly large proportion are at middle or high frequency, because they are
on their way to fixation.
8.2 Mutation frequency trajectories
The next graph in SLiMgui’s graph menu is “Graph Mutation Frequency Trajectories”. Whereas
the previous graph showed us only an analysis of the frequencies of different mutation types at the
present moment in time (try opening it while the simulation is actually playing), this graph shows
us similar information across the whole run of the model so far:

Using the popups at the lower left of the window, you can select a particular subpopulation and
mutation type. The data collected for this graph can be quite large, and quite slow to collect. For
this reason, the data are not normally collected when you run a model; doing so would slow SLiM
down too much. Instead, the data are collected only when the graph window is open, and only
for the subpopulation and mutation type chosen. If you want a plot for an entire simulation run, as
shown above, you should therefore (1) Recycle, (2) Step once to advance to generation 1, (3) open
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

115

the graph window, (4) select the mutation type you want from the popup, and (5) press Play to run
your simulation with data collection.
The result at the end of a simulation run might look something like the graph shown above. The
complete trajectory of every mutation of the selected mutation type in the selected subpopulation
is shown with an individual curve in this plot; here we’re looking at beneficial mutations (type m4),
and there is only one subpopulation. The color of each curve indicates whether the mutation was
lost (red), fixed (blue), or is still segregating (black).
SLiMgui provides various options to configure the visual appearance of plots. If you controlclick or right-click on the graph window here, for example, you will get a context menu with
menu items that allow you to show/hide grid lines, show/hide the legend, and so forth. If you’re
following along in SLiMgui, try experimenting with these options; you can’t do any harm. SLiMgui
graph windows also make their data available to you, if you want to regenerate them or perform
further analysis; from the context menu, just select “Copy Data” to copy the underlying data for the
plot to the clipboard, or “Export Data...” to export the data to a text file readable by other programs
such as R. The format of the data should be pretty self-explanatory.
8.3 Times to fixation and loss
Next let’s look at the plots for “Graph Mutation Loss Time Histogram” and “Graph Mutation
Fixation Time Histogram”, side by side:

These plots show an analysis of metrics gathered over the course of the entire run. The loss time
plot, on the left, shows the distribution of loss times, in generations, for each of the four mutation
types. The loss profile for all four types is quite similar. Remember that this is not saying that
beneficial mutations are just as likely to be lost as deleterious mutations, but only that when a
beneficial mutation is lost, the amount of time it takes to be lost is similar to the amount of time it
takes a deleterious mutation to be lost. In other words, this plot is conditional upon mutations
being lost; it implies nothing about the likelihood of loss.
The fixation time plot, on the right, similarly shows the distribution of fixation times, in
generations, for each of the four mutation types (conditional, similarly, upon the mutations having
fixed).

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

116

8.4 Population fitness over time
The next graph we will examine is “Graph Fitness ~ Time”. At the end of a run of this recipe, it
might look something like this:

This plot shows the fitness of the population over time. More specifically it shows mean
absolute fitness, and (following the SLiM engine) it rescales whenever a mutation fixes, dropping
the fixed mutation from the calculation. Without that rescaling, the plot would be nearly a
monotonically rising curve; with the rescaling, a drop in the curve occurs each time that a
beneficial mutation fixes (shown in blue). In fact neutral and deleterious mutations that fix are also
shown with a blue line in this plot, but since, in this model, they almost always fix at the same
time as a beneficial mutation, there are only a few fixation events in the plot that do not
correspond to a drop in mean fitness due to a rescaling.
A few things can be observed in this plot. First of all, a bunch of mutations fixed over 10000
generations; the probability of beneficial mutations is probably much higher than the typical
empirical rate. For the same reason, the strong-selection-weak-mutation assumption emphatically
does not hold here; beneficial mutations are stacking up and competing with each other, as can be
seen from both the complex wiggling of the curve in between fixation events, and from the fact
that fixation events usually do not drop the mean fitness back down to 1.0 (indicating that at least
one other beneficial mutation exists at intermediate frequency in the population at the same time).
We have seen the “Graph Population Visualization” plot before, and since there is no
population structure in this recipe it is not interesting here; so this concludes our tour of SLiMgui’s
graphing facilities. If you want to graph other things, you can of course generate an output file
with the data you need, read the data into R, and generate your plots there. Also, don’t forget that
you can copy SLiMgui’s graphs to the clipboard, or save them out as PDF files, using the “Copy
Graph” and “Export Graph...” context menu commands.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

117

9. Context-dependent selection using fitness() callbacks
In this and following chapters, we will show how Eidos callbacks can be used to modify SLiM’s
standard behavior. You have seen one Eidos callback already, initialize() callbacks that are
called by SLiM at initialization time to modify the default initialization behavior (which sets up
only the SLiMSim object). At least one initialize() callback in required, since SLiM’s default
initialization is insufficient to produce a working simulation. Other Eidos callbacks are optional,
because SLiM’s default behavior is sufficient. In this chapter we will look at one particular Eidos
callback, the fitness() callback, used to modify how SLiM calculates the fitness of an individual
as a function of the mutations present in this individual and potentially other simulation state.
There are just a few conceptual points to discuss before the first recipe. First of all, fitness()
callbacks return a relative fitness value for a focal mutation in a focal individual. Neutrality is
indicated by a relative fitness of 1.0; fitness uses a different scale than selection coefficients, for
which 0.0 indicates neutrality. A fitness() callback is called once for each mutation possessed
by each individual; the callback can therefore assign a different fitness value to the same mutation
depending upon the focal individual possessing the mutation. If a given individual is homozygous
for a mutation, the fitness() callback is still called only once; a flag provided to the callback
indicates whether the focal mutation is homozygous or heterozygous in the focal individual. This
is because fitness() callbacks return a relative fitness rather than a selection coefficient: they take
all of the information regarding the focal mutation in the focal individual – selection coefficient,
dominance coefficient, homozygosity versus heterozygosity, genetic background, subpopulation,
sex, etc. – and condense all of it into a determination of the fitness effect of the focal mutation.
Each mutation in the focal individual is evaluated separately – by one or more fitness() callbacks
if any apply, otherwise by SLiM’s standard fitness equation – to produce a set of relative fitness
values, one for each mutation. SLiM then multiplies these relative fitness values to determine the
fitness of the individual. This process is repeated for each individual in the simulation. This is just
a quick summary; see sections 19.6, 20.3, and 22.2 for a fuller explanation.
9.1 Temporally varying selection
One way to model temporally varying selection in SLiM is to use the setSelectionCoeff()
method of Mutation; you can find the mutation(s) whose selection coefficient you want to change,
and then use that method to make the change at a specific point in time. However, there are
several disadvantages to this approach in general. First of all, the change is permanent; it would be
difficult to later restore the original selection coefficient (if the modified selection regime ends, for
example). Second, each mutation that you wish to modify must be changed individually. (To be
fair, there are also advantages to this method; the change is done once and then is finished with,
and subsequently you pay no speed penalty whatsoever for the change.)
Here we will examine another possible solution to this problem, using a fitness() callback:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0); // neutral
initializeMutationType("m2", 0.5, "f", 0.1); // beneficial
initializeGenomicElementType("g1", c(m1,m2), c(0.995,0.005));
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
2000:3999 fitness(m2) { return 1.0; }
10000 { sim.simulationFinished(); }

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

118

The initialize() callback sets up two mutation types: a very common neutral mutation type
(m1), and an uncommon beneficial mutation type (m2). This produces distinctive simulation
dynamics in which occasional beneficial mutations arise and sweep quickly.
However, from generation 2000 to 3999, the simulation switches to pure neutral dynamics,
because mutation type m2 reverts to neutrality for that period of time due to the fitness() callback
that is defined. This change in dynamics can be seen quite clearly in SLiMgui if you Recycle and
then Play. Let’s look at the callback in more detail:
2000:3999 fitness(m2) { return 1.0; }

The generation range used, 2000:3999, is expressed in the usual syntax. Following that comes
the declaration that this block is a fitness() callback, rather than an ordinary Eidos event:
fitness(m2). A fitness() callback must be declared as applying to one specific mutation type –
in this case, m2; the callback modifies the fitness effect only of mutations belonging to that
mutation type. (This is both for conceptual clarity and for efficiency.) The rest of the callback
definition is a compound statement that returns a float value, used by SLiM as the fitness effect of
the mutation. Here the value 1.0 is returned, which represents neutrality. (Remember that a
neutral mutation has a selection coefficient of 0.0 but a multiplicative fitness effect of 1.0). This is
the first time we have seen the Eidos keyword return; it simply causes the executing script block
to return immediately, passing its (optional) value out to the caller of the block, which in this case
is the SLiM engine itself. It can be used in Eidos events too, although any value returned will not
be used for anything.
This recipe is trivially simple, but of course the code in a fitness() callback can do anything,
so the potential power of this mechanism should be apparent. In the context of temporally varying
selection, you could make the fitness effect vary sinusoidally through time using an expression
based on sin(sim.generation), or make it be random in each generation with rnorm(), or any
other effect you wish.
9.2 Spatially varying selection
The previous example showed a simple recipe implementing temporally varying selection. In
this section we will see how to make selection vary spatially between subpopulations. More
specifically, we will examine a model in which mutations have beneficial effects in one
subpopulation, but deleterious effects in another subpopulation. Whether such mutations fix or
not depends on the strength of the fitness effects, the migration rates between the subpopulations,
and various other factors. The recipe:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
// neutral
initializeMutationType("m2", 0.5, "e", 0.1);
// deleterious in p2
initializeGenomicElementType("g1", c(m1,m2), c(0.99,0.01));
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 500);
sim.addSubpop("p2", 500);
p1.setMigrationRates(p2, 0.1);
// weak migration p2 -> p1
p2.setMigrationRates(p1, 0.5);
// strong migration p1 -> p2
}
fitness(m2, p2) { return 1/relFitness; }
10000 { sim.simulationFinished(); }
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

119

Here we set up two subpopulations of equal size, with weak migration from p2 to p1, but with
strong migration from p1 to p2 (remember that you can use the population visualization graph in
SLiMgui to see a graphical depiction of the population structure, after you Recycle and then Step
through the setup). Mutations are mostly neutral, but occasionally beneficial mutations are drawn
from an exponential distribution with mean 0.1. This is all review; the interesting part is the
fitness() callback:
fitness(m2, p2) { return 1/relFitness; }

Here no generation range is specified; this fitness() callback is therefore active in every
generation. Subpopulation p2 is given as a parameter to the callback definition; this restricts the
operation of the callback to only the specified subpopulation, producing spatial variation in
selection. The body of the callback returns 1/relFitness as the modified fitness effect of each m2
mutation.
We haven’t seen relFitness before; it is a variable defined by SLiM when a fitness() callback
is called, and it contains the fitness value that SLiM would normally use for the mutation. By
returning 1/relFitness, we are telling SLiM to invert the normal fitness effect of all m2 mutations
in p2; if they are of high fitness in p1 (2.0, say) they become low fitness in p2 (1/2.0 == 0.5), and
vice versa.
There is a point worth clarifying here. Mutation type m2 draws selection coefficients from an
exponential DFE; each mutation of type m2 is thus unique. Because of this, the fitness() callback
here is not called just once per generation, to calculate a modified fitness effect for all mutations of
type m2; it is called once per individual per m2 mutation, and it calculates a modified fitness effect
for that mutation in that individual. These calculations can therefore depend upon both the
specific mutation and the specific individual. In addition to the relFitness variable that contains
the default relative fitness effect for the mutation, SLiM also defines mut, the mutation being
assessed; subpop, the subpopulation containing the individual possessing the mutation;
homozygous, a flag which is true if the individual is homozygous for the mutation, false otherwise;
and a few others as well that we will see in the next sections. The code in a fitness() callback
can use all of these variables to calculate its modified fitness effect.
Because fitness() callbacks might be called thousands or even millions of times every
generation, SLiM and Eidos are highly optimized to make those calls as fast as possible. You
should do your part by making your Eidos code as tight as possible; writing an inefficient
fitness() callback is an excellent way to make your simulation slow to a crawl. See section 18.1
for some tips on how to write fast Eidos code.
If you paste this recipe into SLiMgui, Recycle, and Play, you will see the spatial dynamics very
clearly; mutations will spread that are beneficial in p1, turning that subpopulation’s individuals
green, but deleterious in p2, turning that subpopulation’s individuals red. With the recipe as
written, these mutations will often fix; the strong migration from p1 versus the weak migration from
p2 gives p1 an advantage, allowing it to force p2 to fix mutations that are deleterious in that
context. If you reverse the migration bias, that will no longer happen; instead p2 will be able to
force p1 to lose mutations that are beneficial in that context. One could easily use this model to
explore this scenario in detail, with questions such as the effect of subpopulation size, migration
rate, selection strength, and so forth. Since neutral mutations are also in the model, one could also
ask questions about how these sorts of dynamics affect neutral diversity and divergence. Not bad
for sixteen lines of code!

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

120

9.3 Fitness as a function of genomic background
9.3.1 Epistasis
The fitness() callback mechanism can also make the fitness effect of a mutation depend upon
the genetic background of the individual that possesses the mutation. One example of this is
epistasis, in which two loci interact to produce a non-additive fitness effect (Philipps 2008). The
recipe here is a bit contrived – a more realistic model would probably use introduced epistatic
mutations rather than random mutations – but it serves to illustrate the concept:
initialize() {
initializeMutationRate(1e-8);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", 0.1);
initializeMutationType("m3", 0.5, "f", 0.1);
initializeGenomicElementType("g1", m1, 1);
initializeGenomicElementType("g2", m2, 1);
initializeGenomicElementType("g3", m3, 1);
initializeGenomicElement(g1, 0, 10000);
initializeGenomicElement(g2, 10001, 13000);
initializeGenomicElement(g1, 13001, 70000);
initializeGenomicElement(g3, 70001, 73000);
initializeGenomicElement(g1, 73001, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
10000 { sim.simulationFinished(); }
fitness(m3) {
if (genome1.countOfMutationsOfType(m2))
return 0.5;
else if (genome2.countOfMutationsOfType(m2))
return 0.5;
else
return relFitness;
}

// neutral
// epistatic mut 1
// epistatic mut 2
// epistatic locus 1
// epistatic locus 2

The initialize() callback sets the stage by making three mutation types, three genomic
element types, and five genomic elements. Over the majority of the chromosome, genomic
element type g1 is used; it generates only neutral mutations. In a locus spanning 10001:13000,
genomic element type g2 is used, which generates mutations of type m2, which are beneficial with
a selection coefficient of 0.1. In a second locus spanning 70001:73000, genomic element type g3
is used, which generates mutations of type m3, which are also beneficial. If you turn on display of
genomic elements in SLiMgui, the resulting setup looks like this:

The twist, however, comes in the fitness() callback, which makes mutations of types m2 and
interact epistatically. Any mutation of type m3 has a fitness effect of 0.5 – strongly deleterious –
if it is found in the same genome as any mutation of type m2. This is achieved in several steps.
First of all, genome1 and genome2 are defined by SLiM for fitness() callbacks; they are the two
m3

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

121

genomes (the two homologous chromosomes) of the individual carrying the mutation that is being
evaluated. The call genome1.countOfMutationsOfType() method counts the mutations of type m2
in the first genome; if that count is non-zero, the if statement evaluates it as true (T), and 0.5 is
returned, making the focal mutation deleterious. The next two lines perform the same check for
genome2 (these two if conditions could be combined into a single test using the “logical or”
operator, |, but the line of code would then be too long to fit without wrapping here). If neither
Genome object contains a mutation of type m2, the final else clause will execute and relFitness
will be returned, keeping the normal fitness effect of the focal mutation without modification.
There are some new concepts here, and this manual is not going to explain them all in detail; for
full discussions of the logical type, the behavior of true and false values in Eidos, and so forth,
please refer to the Eidos manual, which explains these ideas thoroughly.
Note that mutations of type m2 suffer no fitness penalty from sharing an individual with an m3
mutation; the fitness() callback affects only m3 mutations. Since the fitness of the carrying
individual will be severely compromised, however, the net effect is that m2 and m3 mutations are
rarely found together; collocation effectively harms m2 mutations just as much as m3 mutations.
This produces some interesting dynamics. Type m2 and m3 mutations arise fairly often, and
quickly sweep to fixation; however, they never sweep together, so if a mutation at one of the two
epistatic loci is sweeping, the other locus will have no active mutations. The two loci thus “take
turns” sweeping new mutations to fixation. Since new beneficial loci sweep so often, neutral loci
generally fix only if they are linked to a beneficial mutation. Because of recombination, that is
most likely to happen close to one of the two epistatic loci, so if we run the full simulation and
then turn on display of fixed mutations, we see a pattern of clustering near the epistatic loci:

All of the fixed mutations within the loci are beneficial epistatic mutations, since the two loci
generate only those mutation types. All of the bands for fixed mutations outside of those loci,
however, are for neutral mutations that hitchhiked.
The astute reader will have noticed a problem with all this. Biologically, this dynamic of
“taking turns” sweeping at one or the other locus makes no sense. If a mutation of type m2 sweeps
to fixation at the g2 locus, that mutation still exists in the genome, and the epistatic interaction
between m2 and m3 mutations should prevent an m3 mutation from sweeping, forever after. That
objection is correct, and points out a very important issue.
The reason that this recipe behaves in this manner has to do with the way that SLiM handles
fixed mutations (see section 19.3). When a mutation fixes, SLiM normally removes it from the
simulation, replacing it with a Substitution object that provides a permanent record of the fixed
mutation. The assumption is that since every individual possesses the fixed mutation, it has the
same fitness effect in every individual, and therefore can be ignored. However, epistasis violates
that assumption, since the fixed mutation causes a differential fitness effect among individuals
based upon the genetic background in which it is found. SLiM’s replacement of fixed m2 and m3
mutations in this model thus produces incorrect behavior that does not accurately model epistasis.
What to do? Happily, SLiM provides a way to handle this situation. The MutationType class has
a logical property named convertToSubstitution; by default it is T, indicating to SLiM that fixed
mutations of that type should be replaced by Substitution objects. We can set it to F instead,
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

122

telling SLiM to keep all mutations of that type active in the simulation even once fixed. (See
sections 18.3 and 20.9.1 for further discussion of this property). The new recipe:
initialize() {
initializeMutationRate(1e-8);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", 0.1);
m2.convertToSubstitution = F;
initializeMutationType("m3", 0.5, "f", 0.1);
m3.convertToSubstitution = F;
initializeGenomicElementType("g1", m1, 1);
initializeGenomicElementType("g2", m2, 1);
initializeGenomicElementType("g3", m3, 1);
initializeGenomicElement(g1, 0, 10000);
initializeGenomicElement(g2, 10001, 13000);
initializeGenomicElement(g1, 13001, 70000);
initializeGenomicElement(g3, 70001, 73000);
initializeGenomicElement(g1, 73001, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
10000 { sim.simulationFinished(); }
fitness(m3) {
if (genome1.countOfMutationsOfType(m2))
return 0.5;
else if (genome2.countOfMutationsOfType(m2))
return 0.5;
else
return relFitness;
}

// neutral
// epistatic mut 1
// epistatic mut 2

// epistatic locus 1
// epistatic locus 2

This will now produce the correct epistatic dynamics. If you run this model in SLiMgui, you
will see that one or the other epistatic locus wins an initial contest by fixing a mutation. Once that
happens, the other locus will never manage to establish.
The convertToSubstitution property should be used to prevent fixed mutation substitution
whenever the assumption of equal effects in all individuals would be violated, whether by a
fitness() callback introducing a mechanism like epistasis, or by differential effects of the
mutation type on mate choice or other dynamics. SLiM is not able to guess when substitution
should be turned off, so you must keep this caveat in mind. However, also keep in mind that
turning off substitution will make your models run much more slowly.
9.3.2 Polygenic selection
Another way in which the fitness effect of one mutation can depend upon the genetic
background is polygenic selection. As in epistasis, multiple mutations collocated in a single
individual produce a non-additive effect; but whereas epistasis produces an effect on mutations at
locus A based upon the simple presence or absence of mutations at locus B, polygenic selection
produces an overall fitness effect that depends upon how many mutations of a given type exist.
That description contrasts the concept of polygenic selection with epistasis, but technically, it
really is a type of epistasis; the genetic background against which a given mutation is found
influences the fitness effect of that mutation. Given this, we will need to use the
convertToSubstitution property of MutationType to suppress the substitution of the epistatic
mutations, as in the previous example, so that even after they fix they continue to be taken into
account by the fitness calculations of the model.
Here is a recipe for a simple polygenic selection scenario:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

123

initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
// neutral
initializeMutationType("m2", 0.5, "f", -0.04); // polygenic
m2.convertToSubstitution = F;
initializeGenomicElementType("g1", m1, 1);
initializeGenomicElementType("g2", m2, 1);
initializeGenomicElement(g1, 0, 20000);
initializeGenomicElement(g2, 20001, 30000);
initializeGenomicElement(g1, 30001, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
1:10000 {
if (sim.mutationsOfType(m2).size() > 100)
sim.simulationFinished();
}
fitness(m2) {
count = sum(c(genome1,genome2).countOfMutationsOfType(m2));
if ((count > 2) & homozygous)
return 1.0 + count * 0.1;
else
return relFitness;
}

The initialize() callback here sets up a chromosome with a genomic element of type g2 in a
central locus, generating mutations of mutation type m2 that are under polygenic selection; the rest
of the chromosome uses g1 and m1, generating neutral mutations.
The fitness() callback, defined on mutation type m2, sets up the conditions for polygenic
selection. First it counts the total number of mutations of type m2 on both chromosomes of the
focal individual, using the countOfMutationsOfType() method we saw in the previous section, and
assigns that value to count. It does it in a slightly odd way; it first creates an object vector
containing both genomes, with c(genome1,genome2), then calls countOfMutationsOfType() on that
vector to produce a result vector containing two integer values (the counts for genome1 and
genome2, respectively), and then uses sum() to add the two values together to get an overall count.
(You can read more about the mechanics of how this works in the Eidos manual, as usual.) Then, if
count is greater than 2 and the focal mutation is homozygous in the focal individual (using the flag
homozygous that is set by SLiM for fitness() callbacks), a beneficial fitness effect is returned by
1.0 + count * 0.1 (where the 1.0 provides a baseline of neutrality). Otherwise, relFitness is
returned so that the normal (deleterious) fitness effect of the mutation is unchanged.
Running this recipe in SLiMgui shows the expected dynamics: within the locus where polygenic
selection is active, pairs (or more) of mutations drive to fixation simultaneously, because single
mutations on their own are deleterious. Many deleterious mutations flicker in and out of existence
in that locus; when, by chance, one manages to become both homozygous and collocated with
another mutation within the locus, the combination is favorable and they fix together:

Once one pair of these mutations has fixed, all individuals possess a homozygous mutation
(two, in fact) in their genetic background, and so the requirements for further mutations to be
beneficial are greatly relaxed. Individual mutations are then immediately beneficial, through their
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

124

epistatic effect on the fixed mutations, even though their own fitness effect will remain deleterious
until they become homozygous.
Incidentally, the use of homozygous in the fitness() callback here may seem strange; a
mutation only gets the polygenic selection bump-up if it is homozygous in the focal individual.
This is a quick hack to prevent a particular type of dynamics from taking over, in which mutations
of type m2 are prevented from going to fixation because recombination is suppressed.
Recombination gets suppressed because it would break apart linked collections of mutations that
are presently benefiting from the polygenic selection. Without the homozygous requirement, the
highest possible fitness for an individual comes from having completely different sets of mutations
in its two genomes; fixation is therefore blocked. If you delete the “& homozygous” check from the
recipe, you can see the effect of this in SLiMgui:

As in the previous section, models involving polygenic selection in SLiM are likely to involve
introduced mutations, and might thus avoid this issue.
See sections 13.1 and 13.10 for more sophisticated recipes that construct a quantitative trait
based upon the additive effects of multiple loci.
9.4 Fitness as a function of population composition
9.4.1 Frequency-dependent selection
In previous sections we have seen how we can use a fitness() callback to modify the fitness
effects of mutations in order to model scenarios such as epistasis, polygenic selection, and
spatiotemporal variation in selection. Now we will shift to making the fitness of mutations be a
function of population composition. First we’ll look at frequency-dependent selection, in which
the fitness effect of a mutation depends upon the frequency of the mutation in the population
(Ayala & Campbell 1974). Let’s start with a simple recipe for negative frequency-dependence:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
// neutral
initializeMutationType("m2", 0.5, "f", 0.1);
// balanced
initializeGenomicElementType("g1", c(m1,m2), c(999,1));
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
10000 { sim.simulationFinished(); }
fitness(m2) {
return 1.5 - sim.mutationFrequencies(p1, mut);
}

The idea here should be pretty clear; if the mutationFrequencies() method tells us that mut is
rare in p1, then the mutation will be highly beneficial, but if it tells us that mut is common, the
mutation will be deleterious. These dynamics prevent the mutation from either fixing or being lost;
instead we have what is called “balancing selection”, in which selection favors keeping mutations
of that type at an intermediate frequency. (See section 9.5 for an adaptation of this recipe using
setSelectionCoeff() instead; and see section 11.3 for a very different model of balancing
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

125

selection, using a modifyChild() callback instead of a fitness() callback.) Running this for a
while in SLiMgui gives us a picture something like this (where the yellow lines are neutral
mutations and the green lines are the mutations under balancing selection):

There are five m2 mutations under balancing selection, all at frequency ~0.5 since that is the
point at which the fitness effect changes from beneficial to deleterious according to the callback.
Next let’s look at a model of positive frequency-dependence:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
// neutral
initializeMutationType("m2", 0.5, "f", 0.1);
// positive freq. dep.
initializeGenomicElementType("g1", c(m1,m2), c(999,1));
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
10000 { sim.simulationFinished(); }
fitness(m2) {
return 1.0 + sim.mutationFrequencies(p1, mut);
}

The way this works is probably pretty obvious at this point; at low frequencies (as calculated by
mutations of type m2 are close to neutral, but at higher frequencies they
become strongly beneficial. Note that these mutations still do not sweep to fixation as quickly as
one might expect. This is because fitness() callbacks get called only once per mutation per
individual. If an individual carries one copy of a mutation (i.e., is a heterozygote), the fitness()
callback is called once. If an individual carries two copies (i.e., is a homozygote), the fitness()
callback is still called only once. Thus, if a fitness() callback does not explicitly consult the
homozygous flag, as described in sections 9.3.2 and 21.2, it will produce an effect of complete
dominance, regardless of the original dominance coefficient set in the mutation type, thereby
hindering fixation. If you wish to model codominance in the above scenario, just include the
homozygous flag in your calculations. In this case, we could alter our recipe in the following way:
mutationFrequencies())

initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
// neutral
initializeMutationType("m2", 0.5, "f", 0.1);
// positive freq. dep.
initializeGenomicElementType("g1", c(m1,m2), c(999,1));
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
10000 { sim.simulationFinished(); }
fitness(m2) {
dominance = asInteger(homozygous) * 0.5 + 0.5;
return 1.0 + sim.mutationFrequencies(p1, mut) * dominance;
}

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

126

9.4.2 Kin selection and inclusive fitness
According to the theory of kin selection and inclusive fitness (Hamilton 1964ab; Dawkins
1976), a mutation that promotes altruism towards kin can spread even if its direct fitness effect on
individuals is negative (because they are devoting energy to behaving altruistically towards related
individuals), if the mutation has indirect benefits (in the form of related individuals behaving
altruistically towards carriers of the mutation). In other words, one cannot look at things solely
from the perspective of the individual; one must look from the perspective of the gene, and see
whether the indirect benefits that accrue to carriers of that gene outweigh the direct costs of
possessing the gene.
This is a little tricky to model directly in SLiM, since all fitness calculations are done from the
perspective of the individual, and since there is no phase of the life cycle in which individuals
interact socially to affect survival probabilities, etc. This kin selection model will therefore look a
bit tautological, because the fitness calculation for an individual just tots up to a net benefit: the
indirect benefit outweighs the direct cost. (But see section 9.4.4 for a much more satisfying
approach.) Nevertheless, we can cook up a recipe that shows the concept by using mutations at
two different loci to serve as the two halves of the inclusive fitness equation:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
// neutral
initializeMutationType("m2", 0.5, "f", 0.1);
// kin recognition
m2.convertToSubstitution = F;
initializeMutationType("m3", 0.5, "f", 0.1);
// kin benefit
m3.convertToSubstitution = F;
initializeGenomicElementType("g1", m1, 1);
initializeGenomicElementType("g2", m2, 1);
initializeGenomicElementType("g3", m3, 1);
initializeGenomicElement(g1, 0, 10000);
initializeGenomicElement(g2, 10001, 13000);
initializeGenomicElement(g1, 13001, 70000);
initializeGenomicElement(g3, 70001, 73000);
initializeGenomicElement(g1, 73001, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
3000 { sim.simulationFinished(); }
fitness(m2) {
// count our kin and suffer a fitness cost for altruism
dominance = asInteger(homozygous) * 0.5 + 0.5;
return 1.0 - sim.mutationFrequencies(p1, mut) * dominance * 0.1;
}
fitness(m3) {
// count our kin and gain a fitness benefit from them
m2Muts = individual.uniqueMutationsOfType(m2);
kinCount = sum(sim.mutationFrequencies(p1, m2Muts));
return 1.0 + sim.mutationFrequencies(p1, mut) * kinCount;
}

Here we have a locus spanning 10001:13000 that contains mutations governing kin recognition
(using mutation type m2 in genomic element type g2), and a locus spanning 70001:73000 with
mutations which reap a benefit from having kin (using mutation type m3 in genomic element type
g3). Mutations at the m2 locus are always deleterious; their fitness effect decreases from 1.0,
growing smaller as they become more common (negative frequency-dependence, but without a

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

127

positive fitness effect even at low frequency). This locus represents a facility for kin recognition
and costly altruistic behavior towards kin. By itself, m2 mutations would almost invariably die out.
However, mutations at the m3 locus represent the target of altruism from kin. The fitness()
callback here computes a metric based upon how many other individuals share the focal
individual’s mutations at the kin-recognition locus. Carriers of mutations at the m3 locus (a locus
that attracts altruistic behavior from kin) benefit from these interactions based upon the magnitude
of shared kinship.
If you run this model in SLiMgui, you can see that mutations at the m2 locus sometimes fix,
despite the fact that the direct fitness benefit of those mutations is negative. This does not happen
in isolation; instead, a mutation at the kin recognition locus will rise to fixation at the same time
that an mutation at the benefit locus rises, and the mutation at the benefit locus needs to initiate
the process. Together, however, the mutations at the two loci can rise to fixation.
This model is somewhat unsatisfying from an individual-based perspective – one would like to
see individual interactions in which the altruistic individual suffers without benefit, while the kin
individual gains without cost (see section 9.4.4 for such a recipe) – but the fitness() callback
equations in this recipe essentially encapsulate such interactions on an aggregated level. If you
consider the effects of kin recognition and altruism to be composed of many small acts that can be
averaged over the lifetime of an individual – as is likely to be the case in most empirical systems
involving kin selection – then this approach is actually quite reasonable.
9.4.3 Cultural effects on fitness
Sometimes fitness might be influenced by environmental factors as well as genetic factors. In
some cases, these environmental factors would be associated with the subpopulation in which an
individual resides; in that case, they can be modeled as spatial variation in selection (see section
9.2). In other cases, however, the environmental factors might be a matter of individual variation
that is not correlated with the subpopulation; that possibility is what we will explore in this recipe.
In particular, we will make a simple model of cultural differences between individuals. Each
individual will be assigned to one of two cultural groups at birth. If an individual belongs to one
cultural group, a particular type of mutation will be beneficial; if the individual belongs to the
other cultural group, those mutations will be neutral. An analogue to this in human history, which
we will loosely follow here, would be the allele conferring the ability to digest lactose as an adult;
if an individual belongs to a cultural group that drinks milk in adulthood, mutations promoting the
retention of the lactase enzyme into adulthood are beneficial, whereas if the individual belongs to
a cultural group that does not drink milk in adulthood, such mutations are nearly neutral (or
perhaps slightly deleterious, due to the energetic investment of producing an unneeded enzyme).
For the initial version of this model, the cultural group into which an individual is assigned will
be entirely random. The cultural group of an individual will be tracked using the tag property of
Individual, an integer property that is not used at all by SLiM, and is thus free for you to use as
you wish. We will use a tag value of 1 to indicate a milk-drinker, and a value of 0 to indicate a
non-milk-drinker. The tag value will be set up in a late() event that runs after offspring are
generated but before fitness values are calculated. The recipe:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
// neutral
initializeMutationType("m2", 0.5, "f", 0.1);
// lactase-promoting
m2.convertToSubstitution = F;
initializeGenomicElementType("g1", c(m1,m2), c(0.99,0.01));
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

128

1 { sim.addSubpop("p1", 1000); }
10000 { sim.simulationFinished(); }
late() {
// Assign a cultural group: milk-drinker == 1, non-milk-drinker == 0
p1.individuals.tag = rbinom(1000, 1, 0.5);
}
fitness(m2) {
if (individual.tag == 0)
return 1.0;
// neutral for non-milk-drinkers
else
return relFitness; // beneficial for milk-drinkers
}

When run, the occasional mutations of type m2 sweep to fixation, as you might expect, but even
when they are close to fixation only about half of the population – the milk-drinkers – are
experiencing a fitness benefit from the mutation. This can be seen easily in the pattern of
individual fitness values displayed in SLiMgui when an m2 mutation was at a frequency of 0.95:

The m2 mutations are prevented from being removed at fixation, by setting the
convertToSubstitution property of m2 to F. As the model runs, m2 mutations accumulate,

and the
fitness benefits of being a milk-drinker become larger and larger. Since that status is assigned
randomly, however, this does not affect the frequency of milk-drinking; a randomly chosen half of
the population is less likely to pass on its genes to the next generation, but milk-drinking remains a
random choice.
The late() event assigns cultural tag values to all of the individuals in the new offspring
generation. We first generate 1000 random values, either 0 or 1, using rbinom() to draw from a
simple binomial distribution. The resulting vector is assigned directly into p1.individuals.tag
using the multiplexed assignment semantics of Eidos to put each value in the vector into the tag
property of the corresponding individual in subpopulation p1 (see the Eidos manual for further
discussion of multiplexed assignment).
Given those tag values, the fitness() callback is then trivial: for the lactase-promoting
mutations of type m2, the fitness is neutral if the tag value of a given individual carrying the
mutation is 0, indicating a non-milk-drinker, whereas for milk-drinkers, with a tag value of 1,
relFitness is returned to accept SLiM’s default calculated fitness for the mutation (which uses
both the selection coefficient of 0.1 and the dominance coefficient of 0.5 defined for m2). This
could be modified to make the mutations slightly deleterious in non-milk-drinkers, if one wished,
of course.
In this model, the cultural group of an individual is assigned randomly. In many cases one
would wish this trait to have some degree of heritability, even though it is not genetically based; in
humans and some other species, individuals inherit their culture from their parents through social
learning. To implement that, a modifyChild() callback is needed; we will therefore take this
model up again in section 12.1.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

129

9.4.4 The green-beard effect
The recipe in section 9.4.2 modeled kin-selection effects by averaging the beneficial and
deleterious effects of altruistic acts across the whole population. That was unsatisfying, because it
looked somewhat tautological; each individual received an average benefit from the existence of
kin in the population (directing altruistic acts toward the individual), and each individual received
an average harm from the existence of kin (by committing costly altruistic acts), and if the average
benefit outweighed the average harm, kin-directed altruism spread because the net effect on every
individual was beneficial. In a large population, with many small altruistic acts occurring between
all kin, that model is actually quite reasonable; but it leaves open the question of whether inclusive
fitness theory is really correct that kin-directed altruism can evolve even when the individuals
committing the altruistic acts (at a cost to themselves) do not necessarily receive a benefit from
altruistic acts by others. In a more individual-oriented model that doesn’t average the benefits and
costs across the whole population, does kin-directed altruism still evolve? Using the tag property
of the Individual class, as shown in the previous section, we can develop a model to answer that
question.
Here we will explore this question in a closely related evolutionary problem, the modeling of
green-beard alleles (Hamilton 1964ab; Dawkins 1976). A green-beard allele causes three
pleiotropic effects: (1) a phenotypic trait of some kind (the “green beard”), (2) the ability to
recognize other individuals possessing this phenotypic trait, and (3) a tendency to direct altruistic
acts toward other individuals possessing this phenotypic trait (at some cost to the altruistic
individual, and some benefit to the receiving individual).
The idea for this recipe is to, in effect, add a new stage to the generation life cycle. In this new
life cycle stage, a finite number of one-on-one interactions between randomly chosen individuals
in the population are modeled. If the individuals both have the green-beard allele, an altruistic act
occurs and one receives a benefit and the other receives a cost; if either individual does not have
the green-beard allele, no altruistic act occurs and the interaction is neutral. The benefits and costs
incurred by each individual are tallied up separately, and are taken into account in a fitness()
callback that models the fitness effect of the green-beard allele for each individual based upon its
particular interaction tally. Some individuals with the green-beard allele will, by chance, incur
nothing but costs from their interactions, harming themselves to the point where they are unlikely
to reproduce. Other individuals with the green-beard allele will, by chance, incur a mixture of
costs and benefits, or if they’re lucky, nothing but benefits; their outcome will thus be neutral or
positive.
Let’s get to the recipe. We’ll build it one step at a time; the first step is just to set up a single
subpopulation with two mutation types:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", -0.01);
m2.convertToSubstitution = F;
initializeGenomicElementType("g1", m1, 1);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
fitness(m2) { return 1.0; } // neutral
10000 { sim.simulationFinished(); }

// neutral
// green-beard

Mutation type m2 will be used for green-beard alleles, but is not used yet. It is set to not convert
into a Substitution object upon fixation, using the convertToSubstitution property of
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

130

MutationType,

so that we can easily see whether or not it has fixed at the end of a model run. Its
selection coefficient of -0.01 is just to give it a distinctive color in SLiMgui; the green-beard allele
is neutral here, because the fitness() callback gives a relative fitness of 1.0 for the allele,
overriding selection coefficient of the mutation type. And in any case m2 is not used yet; as it
stands, then, this is just a model of neutral drift using only m1.
Now let’s add a system for tagging individuals with costs and benefits, replacing the fitness()
callback in the code above:
1: late() {
p1.individuals.tag = 0;
}
fitness(m2) { return 1.0 + individual.tag / 10; }

Every individual in SLiM has a tag property that can be set to any value. As explained in the
previous section, the tag field is not used by SLiM; it’s just scratch space for models to use as they
please. In this recipe, we use it to hold the cost/benefit tallies for individuals. The fitness()
callback then adds the tag value for the individual it is evaluating to a neutral relative fitness of
1.0. The tag value is an integer, so we divide it by 10 in the callback; each increment or
decrement of an individual’s tag will represent a change to its relative fitness of 0.1 or -0.1.
Let’s introduce a green-beard allele into the population. It won’t be very interesting if it just gets
lost due to drift in the first generation or two, at green-beard alleles at low frequency generally drift
freely since interactions between the rare green-beards are unlikely – so we’ll introduce several
copies of it to give it an initial boost. The concept of introducing a mutation is covered in depth in
section 10.1, but it should be pretty clear what’s going on here:
1 late() {
target = sample(p1.genomes, 100);
target.addNewDrawnMutation(m2, 10000);
}

This code draws 100 target genomes from the population using the sample() function (which
samples without replacement, by default). It then adds a new mutation of type m2 to those target
genomes with a single call to addNewDrawnMutation(). This call adds the same new mutation to all
of the target genomes, rather than a different new mutation to each genome, because the
addNewDrawnMutation() method is a class method of Genome, not an instance method, and it is
therefore called just once, rather than being multicast out to each instance. This is conceptually
similar to writing a static member function in C++ (which is like a class method in Eidos) to
perform an operation across a vector of objects. That static member function would take a vector
of objects as one of its parameters; in Eidos, you instead call the class method on the target vector,
as seen here, just like calling an instance method. If this is not clear, don’t worry; it is not an
essential point, since the syntax for calling class methods and instance methods is identical in
Eidos. You can consult the Eidos manual for more details on class versus instance methods.
Some individuals might receive two copies of the green-beard mutation (one in each of their
genomes), and end up being homozygous, but there is no harm in that, and most of the individuals
targeted by this code will end up heterozygous for the new green-beard allele. (You could
guarantee heterozygosity by sampling individuals instead of genomes, and then getting just one
genome from each sampled individual; conversely, you could guarantee homozygosity by
sampling individuals and then adding to both of the genomes of each sampled individual.) Again,
it should be emphasized that adding 100 copies is in no way “cheating”; this is just a model of
whether a green-beard allele can rise from low frequency to fixation, rather than a model of

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

131

whether a single copy of a green-beard allele can rise all the way to fixation. It would be easy to
construct the latter model in SLiM using the tools introduced in section 10.2.
Now we need to make our green-beards interact! We’ll do this with an extension to our
previous 1: event:
1: late() {
p1.individuals.tag = 0;
for (rep in 1:50) {
individuals = sample(p1.individuals, 2);
i0 = individuals[0];
i1 = individuals[1];
i0greenbeards = i0.countOfMutationsOfType(m2);
i1greenbeards = i1.countOfMutationsOfType(m2);
if (i0greenbeards & i1greenbeards) {
alleleSum = i0greenbeards + i1greenbeards;
i0.tag = i0.tag - alleleSum;
// cost to i0
i1.tag = i1.tag + alleleSum * 2;
// benefit to i1
}
}
}

This bears some explaining. First, we zero out all of the individual tag values, as before. Then
we run a loop 50 times, to produce 50 interactions between pairs of individuals (which might or
might not possess green beards). Each time through the loop, we draw two individuals from the
population using sample(), giving us a vector individuals containing two objects of class
Individual which we extract into i0 and i1.
The next two lines count the number of m2 mutations contained in the two genomes belonging
to each individual. The value of i0greenbeards and i1greenbeards, then, is 0 if the corresponding
individual does not have the green-beard mutation at all, 1 if it is heterozygous, or 2 if it is
homozygous. This is equivalent to, but more concise than, writing:
i0greenbeards = i0.genomes[0].countOfMutationsOfType(m2)
+ i0.genomes[1].countOfMutationsOfType(m2);

(and the same for i1greenbeards).
Next comes an if statement that will execute if both i0greenbeards and i1greenbeards are
non-zero (since in Eidos, as in many languages, 0 is considered false and any non-0 value is
considered true). So this executes if and only if both of the interacting individuals are at least
heterozygous for the green-beard allele. In that case, alleleSum is computed as the total number
of green-beard alleles between the two individuals; this makes the green-beard effect stronger
when homozygotes are involved, weaker when heterozygotes are involved. Finally, the tag values
of the two interacting individuals are modified to reflect the interaction; individual i0 acts
altruistically and incurs a cost, whereas individual i1 receives the altruistic act and gets a benefit.
The costs and benefits are added to the tag value of each individual. Note that the benefit is larger
than the cost; this makes the average fitness effect of the green-beard allele positive, and thus
means that the green-beard allele is selected for, on average, in each generation. But as promised,
the benefit and the cost are incurred by different individuals in this model. Indeed, the altruistic
individual feels a nearly lethal fitness effect, if both interacting individuals are homozygous. From
the perspective of the individual, it is a truly selfless altruistic sacrifice. From the perspective of the
green-beard allele, it is an entirely selfish act, however – which is why the “selfish gene”
perspective has so much explanatory power.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

132

For the record, here is the full recipe for our green-beard model:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", -0.01);
m2.convertToSubstitution = F;
initializeGenomicElementType("g1", m1, 1);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
1 late() {
target = sample(p1.genomes, 100);
target.addNewDrawnMutation(m2, 10000);
}
1: late() {
p1.individuals.tag = 0;

// neutral
// green-beard

for (rep in 1:50) {
individuals = sample(p1.individuals, 2);
i0 = individuals[0];
i1 = individuals[1];
i0greenbeards = i0.countOfMutationsOfType(m2);
i1greenbeards = i1.countOfMutationsOfType(m2);
if (i0greenbeards & i1greenbeards) {
alleleSum = i0greenbeards + i1greenbeards;
i0.tag = i0.tag - alleleSum;
// cost to i0
i1.tag = i1.tag + alleleSum * 2;
// benefit to i1
}
}
}
fitness(m2) { return 1.0 + individual.tag / 10; }
10000 { sim.simulationFinished(); }

If you Recycle and run this recipe in SLiMgui, it may take several tries before the green-beard
mutation “catches” and rises to fixation. Again, this is because its fitness effect is close to neutral
when it is at low frequency; early on, it is quite liable to be lost simply due to drift. Once the
mutation “catches”, though, the population view will show something like this:

This display, of the model when the green-beard allele is at a frequency of around 0.3, shows
the result of the individual interactions between green-beards. Some benefit from those
interactions, and are colored in a shade of green or blue (depending on the magnitude of the
benefit); some are harmed, and are colored in a shade of orange or red, shading down to black
(depending on the magnitude of the harm).
Even once the green-beard mutation has fixed, the majority of individuals will be yellow
(indicating neutrality – a relative fitness of exactly 1.0), because only 50 pairs of individuals are
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

133

chosen to interact in each generation. The larger the number of interactions, relative to the
population size, the stronger the selection will be in favor of the green-beard allele. If you
increase the number of interactions from 50 to 5000, for example, there will be so many
interactions relative to the population size that the green-beard allele will rise to fixation almost
deterministically. In this case, this recipe has approached almost to the realm of the section 9.4.2’s
recipe, where computing averaged effects across the whole population was used to model the
behavior of the system. With only 50 interactions, however, the model is quite stochastic in its
behavior, and the green-beard allele will sometimes be lost even when it has risen as high as a
frequency of 0.5.
This recipe is, from a rather sidelong angle, basically a model of a selective sweep, and as such,
it showed how to introduce a new mutation into a population and watch it sweep to fixation. In
the next chapter, that subject will be treated comprehensively.
9.5 Changing selection coefficients with setSelectionCoeff()
The preceding recipes have demonstrated the use of fitness() callbacks to implement a wide
variety of different types of selection in SLiM. It is worth noting, however, that it is also possible to
simply change the selection coefficient of a mutation using the setSelectionCoeff() method of
Mutation (see section 21.8.2). Indeed, this can have large performance advantages, since
executing Eidos callbacks is relatively slow. However, this strategy can only be applied in very
limited cases.
First of all, if you want the fitness effect of a mutation to vary from individual to individual –
whether because of the genetic background of the individual, or the subpopulation the individual
is in, or any other individual-specific model state – then a fitness() callback is necessary. This is
because changing the selection coefficient of a mutation using setSelectionCoeff() changes that
mutation’s selection coefficient for all individuals, in all subpopulations. The recipes in sections
9.2 (spatially varying selection) and 9.3 (genomic background effects) could therefore not be
implemented using setSelectionCoeff(), nor could those of sections 9.4.3 (cultural effects on
fitness) or 9.4.4 (green-beard alleles).
Second, if you want a fitness effect to be temporary then it is often best to use a fitness()
callback, because when the callback is no longer active mutations will automatically revert to their
original effect. To take the recipe in section 9.1 (temporally varying selection) as an example,
when the callback expires the m2 mutations revert back to their original selection coefficient of 0.1
without needing to be re-set to 0.1; indeed, their selection coefficients are 0.1 the entire time, the
fitness() callback just overrides that with its own effect. If this recipe used setSelectionCoeff()
instead, the original selection coefficients would need to be restored when the altered fitness
regime expired. In this case that would be straightforward – just set them all to 0.1 with another
call to setSelectionCoeff() – but if the m2 mutations were drawn from a distribution of fitness
effects, and thus all had different selection coefficients, this would be more complicated.
The recipes in section 9.4.1, on the other hand, can in fact be improved by rewriting them to
use setSelectionCoeff(). Here is a modified version of the first recipe from that section, which
used a fitness() callback to model a dominant frequency-dependent mutation type:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
// neutral
initializeMutationType("m2", 1.0, "f", 0.1);
// balanced
initializeGenomicElementType("g1", c(m1,m2), c(999,1));
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

134

1 { sim.addSubpop("p1", 500); }
late() {
m2muts = sim.mutationsOfType(m2);
freqs = sim.mutationFrequencies(NULL, m2muts);
for (index in seqAlong(m2muts))
m2muts[index].setSelectionCoeff(0.5 - freqs[index]);
}
10000 { sim.simulationFinished(); }

Note that this recipe uses a dominance coefficient of 1.0, to match the behavior caused by the
fitness() callback in the section 9.4.1’s first recipe; any dominance coefficient could be used
here, however. The late() event here calculates the frequency of each m2 mutation, and then uses
a loop to set the selection coefficient of each mutation accordingly. The recipe in section 9.4.1
used a fitness formula of 1.5-frequency in its callback, but here we use 0.5-frequency; this is
because fitness() callbacks return relative fitness values, where 1.0 is neutral, whereas with
selection coefficients – the currency in which the present recipe trades – 0.0 is neutral instead. As
always, this distinction should be treated with care since it is a common cause of errors.
This recipe runs about two orders of magnitude faster than the original recipe – not a small
difference. It might not produce exactly identical results, because of the aggregated effects of tiny
differences in floating-point roundoff error due to the different way in which fitness values are
juggled in the two models, but it is effectively identical for all practical purposes.
However, this recipe’s strategy using setSelectionCoeff() only works because this is a singlesubpopulation model; with multiple subpopulations, one would presumably want the frequencydependent effect to be different in each subpopulation, and to depend upon the frequency within
that subpopulation. Such a model could no longer be achieved using setSelectionCoeff().
Section 9.4.1 thus presented a more generally applicable recipe, albeit a slower one.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

135

10. Selective sweeps
This chapter shows how SLiM can be used to model selective sweeps and various associated
phenomena such as hitchhiking, background selection, and genetic draft. SLiM’s ability to
accurately model such scenarios is the origin of its name, Selection on Linked Mutations. This
chapter will use very simple one-population models; of course these recipes can be combined
with the recipes of previous chapters to model sweeps with complex demography and population
structure, complex genetic architecture, spatiotemporal variation in selection, and so forth.
10.1 Introducing adaptive mutations
The simplest selective sweep scenario is of an explicitly introduced adaptive mutation sweeping
against a background of neutral mutations. We can simulate such a sweep in just a few lines:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 1.0, "f", 0.5);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
1000 late() {
target = sample(p1.genomes, 1);
target.addNewDrawnMutation(m2, 10000);
}
10000 { sim.simulationFinished(); }

// introduced mutation

The call to initializeMutationType() sets up m2, a new mutation type used to represent the
introduced beneficial mutation. Note that a dominance coefficient of 1.0 (fully dominant) is used
with a selection coefficient of 0.5; these values provide strong selection for the introduced
mutation, making it less likely to be lost due to genetic drift while still at low frequency. This is
useful for illustrative purposes here, but of course in real simulations less favorable values would
likely be needed, making it more probable that the mutation would be lost rather than sweeping.
There are two ways to handle this. One is to simply accept that some runs of the simulation will
fail to sweep – do many runs of the model, and collect statistics only across those runs where the
sweep succeeds. The second way to handle the problem is to write additional Eidos code to make
the simulation run conditional on fixation (see the next section).
The other interesting code here is the Eidos event at generation 1000. The first line of this event
selects a target genome into which the added mutation will be placed. It does this using the
sample() function, by drawing a single sample from subpopulation p1’s vector of genomes. This is
important, because SLiM does not – for reasons of speed – guarantee a random order to the
individuals in a subpopulation. In this simple model all individuals are produced identically, so
this is not an issue, but if features of SLiM are used such as migration, separate sexes, selfing,
cloning, or mateChoice()/modifyChild()/recombination() callbacks, the individuals in the
subpopulation will be produced in a specific and non-random order, and thus whenever a random
individual is desired sample() must be used (rather than just using the genome at index 0 in the
subpopulation, for example).
The second line then adds a new mutation to the target genome, drawn from mutation type m2;
the remaining parameter specifies the position of the mutation in the chromosome as 10000 (this
could be drawn randomly using sample() as well, if desired).
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

136

Note that this event is a late() event, specified in its declaration. These were introduced back
in section 4.2.1 as a way to make scripted events happen late in the generation, after offspring
generation has occurred (see the generation life cycle overview in section 1.3). It is important that
introducing new mutations occurs in a late() event, because it makes script-introduced mutations
act in exactly the same way as mutations added by SLiM due to the normal process of mutation
during offspring generation. If this event were an early() event instead (as is the default if no
explicit designation is given), the mutation would be introduced immediately before offspring
generation, and it would have no effect on fitness values since fitnesses would have been
calculated before it was added (at the end of the preceding generation). The mutation would
therefore be subject to one generation of drift before its correct fitness effect would be accounted
for. This would likely not be the desired behavior, so SLiM will issue a warning if you add a
mutation during an early() event – you can delete the late() designation here to see that.
The introduced mutation in this model typically sweeps quite quickly (or is lost almost
immediately due to drift), so perhaps the simplest way to see what is going on in SLiMgui is to
recycle and then enter 1000 in the generation textfield and press return (as illustrated in section
5.1.2). The simulation will advance to the beginning of generation 1000, and you can then press
Step to single-step forward and watch the progress of the introduced mutation. Alternatively, you
could slow down the simulation speed using the speed slider (as illustrated in section 5.2.2) to see
the simulation play more slowly. In either case, you should observe that at generation 1000, just
before the beneficial mutation is introduced, there is lots of neutral diversity across the
chromosome:

But just before the mutation finishes sweeping (assuming it is not lost – you may need to recycle
and repeat this procedure several times to get a run in which the sweep establishes), most of that
diversity has been lost, and just a few neutral sites have hitchhiked along with the introduced
mutation, producing the characteristic signature of a selective sweep (Smith & Haigh 1974):

Note the green line at position 10000, representing the beneficial mutation itself.
We can automatically halt the model immediately after the introduced mutation has fixed (or
has been lost). This requires just a simple modification of the final line of the model, with the call
to simulationFinished():
1000:10000 late() {
if (sim.countOfMutationsOfType(m2) == 0)
sim.simulationFinished();
}

Now simulationFinished() is called immediately when the number of mutations of mutation
type m2 falls to zero, which happens either when the introduced mutation fixes (because SLiM then
converts the mutation into a substitution object), or when it is lost. Note the generation range of
1000:10000 on this event; this means that this termination condition will be checked each
generation from 1000 (when the introduced mutation originates) until 10000 (when the model ends,
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

137

regardless). Also note that a late() designation has been added, so that the simulation ends in the
same generation that the m2 mutation is lost or fixed.
It might be nice to have some indication, in the output stream of the simulation, as to whether
the introduced mutation fixed or was lost. One simple way to do this is to check for a
Substitution object of the right mutation type, in a small extension to the event above:
1000:100000 late() {
if (sim.countOfMutationsOfType(m2) == 0)
{
fixed = (sum(sim.substitutions.mutationType == m2) == 1);
cat(ifelse(fixed, "FIXED\n", "LOST\n"));
sim.simulationFinished();
}
}

The variable fixed is assigned a logical value that comes from checking the list of substitutions
for elements with mutation type m2. Each match will produce a value of T in the logical vector
that results from the == operator; and since T is equal to 1 in Eidos, whereas F is equal to 0, the
sum() function then adds up the total number of matches. If the sum is 1, the mutation fixed;
otherwise (when it is 0) the mutation was lost.
The next line calls cat() to concatenate a string to the output stream. The string comes from
the ifelse() function, which looks at the logical value of its first parameter (fixed) and returns its
second parameter ("FIXED\n") if that value is T; otherwise, it returns its third parameter ("LOST\n").
This function is based on the ifelse() function of R; it is similar to the “trinary conditional”
operator, ?:, of C and other languages, but ifelse() is a vectorized function and can thus perform
this operation across whole vectors at once – a useful tool. Even when using it with singletons, as
here, it is a compact way to express things that would otherwise require an if–else construct and
other complications, as in the alternative way of coding this output directive:
if (fixed)
cat("FIXED\n");
else
cat("LOST\n");

10.2 Making sweeps conditional on fixation
The recipe in the previous section did not guarantee that the introduced mutation would sweep
to fixation. That is not necessarily a flaw; sometimes one is interested in the outcome of a model
both when a sweep completes and when it does not. Sometimes, however, one wishes to make a
model that guarantees that a sweep completes:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 1.0, "f", 0.5); // introduced mutation
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
// save this run's identifier, used to save and restore
defineConstant("simID", getSeed());
sim.addSubpop("p1", 500);
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

138

1000 late() {
// save the state of the simulation
sim.outputFull("/tmp/slim_" + simID + ".txt");
// introduce the sweep mutation
target = sample(p1.genomes, 1);
target.addNewDrawnMutation(m2, 10000);
}
1000:100000 late() {
if (sim.countOfMutationsOfType(m2) == 0)
{
fixed = (sum(sim.substitutions.mutationType == m2) == 1);
if (fixed)
{
cat(simID + ": FIXED\n");
sim.simulationFinished();
}
else
{
cat(simID + ": LOST – RESTARTING\n");
// go back to generation 1000
sim.readFromPopulationFile("/tmp/slim_" + simID + ".txt");
// start a newly seeded run
setSeed(rdunif(1, 0, asInteger(2^32) - 1));
// re-introduce the sweep mutation
target = sample(p1.genomes, 1);
target.addNewDrawnMutation(m2, 10000);
}
}
}

To make a model that is conditional on fixation, there are some simple “solutions” that don’t
work. You can’t force the introduced mutation to increase in frequency each generation; that
would produce abnormal evolutionary trajectories that would distort the model dynamics.
Similarly, you can’t just reintroduce a new mutation if the previous one is lost; in the process of
getting lost, the previous mutation may have affected the genetic background. What you really
want is that if the introduced mutation has been lost, the simulation gets reset to precisely the
same state as it was in prior to the previous introduction, but with a different random number seed
to ensure that events take a different course. If you do this repeatedly until the mutation fixes, you
have a proper simulation of a selective sweep conditional upon fixation. SLiM provides a
convenient solution for this by allowing the user to save the relevant state of a simulation to disk.
Doing this is actually pretty simple; there are just a few key steps that need to be taken to save the
state and then restore it when we want to try a different trajectory.
First of all, the generation 1 event now stores the initial random number seed, obtained from
getSeed(), as a constant named simID, using the Eidos function defineConstant(). Such
constants persist until the simulation terminates or is recycled; unlike variables, defined constants
do not disappear at the end of the currently executing event in SLiM, but instead go into the global
scope. This value is thus a unique identifier for the model run; if many model runs were being
done, each would presumably have a different seed, and thus the initial seed identifies the run.
The event in generation 1000 saves the current population state to a file in the /tmp directory, a
standard place used in all Un*x systems (including Mac OS X) to store temporary files. The
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

139

filename used inside the /tmp directory is generated using string concatenation with the + operator
(described in detail in the Eidos manual), based on the value stored earlier in simID; the temporary
file therefore gets named according to the initial random number seed of the run, with a name like
"slim_92631.txt". This event also adds the new mutation that we want to sweep. Note that this
event is designated as a late() event; this is logical, both because it produces output (by saving
the simulation state to disk), and because it adds a mutation to the simulation.
The generation 1000:100000 event checks, each generation, whether the introduced mutation is
still present. If it has fixed, it prints a message and terminates the simulation. If it has been lost, it
prints a message and reads the saved population state back in (which sets the generation counter
back to 1000). It then changes the random number seed, to start a new evolutionary trajectory
(reproducible by starting again at the new seed – see section 17.1 for discussion), and then reintroduces the sweep mutation as before. This event is also a late() event; this way if the
simulation has to be restored to the saved state, it picks up exactly where it was saved out (it
should also be a late() event because it adds a new mutation). Finally, note that when we set the
generation back to 1000, the generation 1000 event that did our initial setup does not execute
again; the events that will run for a given generation are determined at the beginning of that
generation and do not change even if sim.generation is changed. If we set sim.generation to 999
instead, then the generation 1000 event would execute again, in the next generation.
If you recycle and run this model, sometimes the introduced mutation will fix on the first
attempt, but sometimes it will take repeated attempts and you will see output like this:
1452090492983:
1452090492983:
1452090492983:
1452090492983:

LOST – RESTARTING
LOST – RESTARTING
LOST – RESTARTING
FIXED

This demonstrates that the model is working as intended; three introduced mutations were lost
before the fourth attempt resulted in fixation. You could change the dominance coefficient and
selection coefficient of m2 to be less favorable, which would result in more restarts but should work
fine. You could even make m2 neutral or deleterious, while still making runs conditional on
fixation; we’ll see an example of that in section 10.6.2. Since the probability of a new neutral
mutation drifting to fixation is small, many restarts will be needed, but no harm is done by that.
One caveat is that outputFull() writes out only a specific set of information: the
subpopulations that exist, the individuals they contain, and the mutations contained by the
genomes of those individuals. When readFromPopulationFile() is called, any other state
associated with the subpopulation, individuals, genomes, and mutations is wiped away. If the
model had set up other properties – setting a cloning rate on a subpopulation, say, or setting up
values with tag or setValue() – that state will be lost, and will need to be set up again.
Given the relative complexity of this recipe, most of the other selective sweep recipes in this
chapter will assume that conditionality on fixation is not desired, so that their different topics are
not obscured by the conditionality machinery. However, the key elements of this recipe can be
combined with any selective sweep strategy – or indeed, can be used to make a simulation that is
conditional upon any outcome you wish. In the next section, we will see how to make a
simulation conditional upon establishment of a mutation, rather than upon fixation.
10.3 Making sweeps conditional on establishment
In the previous section, we saw how to make a model conditional on fixation of the introduced
mutation. Sometimes one may want to relax this condition and require only that the mutation
reaches a certain threshold frequency at which selection outweighs drift, rendering subsequent loss
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

140

unlikely. This threshold frequency is commonly referred to as the establishment frequency, and is a
function of the selection coefficient of the mutation. Making our model conditional on
establishment, rather than fixation, requires just a simple modification to the previous recipe:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 1.0, "f", 0.5); // introduced mutation
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
// save this run's identifier, used to save and restore
defineConstant("simID", getSeed());
sim.addSubpop("p1", 500);
}
1000 late() {
// save the state of the simulation
sim.outputFull("/tmp/slim_" + simID + ".txt");
// introduce the sweep mutation
target = sample(p1.genomes, 1);
target.addNewDrawnMutation(m2, 10000);
}
1000: late() {
mut = sim.mutationsOfType(m2);
if (size(mut) == 1)
{
if (sim.mutationFrequencies(NULL, mut) > 0.1)
{
cat(simID + ": ESTABLISHED\n");
sim.deregisterScriptBlock(self);
}
}
else
{
cat(simID + ": LOST – RESTARTING\n");
// go back to generation 1000
sim.readFromPopulationFile("/tmp/slim_" + simID + ".txt");
// start a newly seeded run
setSeed(rdunif(1, 0, asInteger(2^32) - 1));
// re-introduce the sweep mutation
target = sample(p1.genomes, 1);
target.addNewDrawnMutation(m2, 10000);
}
}
10000 {
sim.simulationFinished();
}

Much of the machinery here is carried over from the previous recipe; see section 10.2 for
discussion of the way the simulation state is saved and restored. Here, however, the 1000: event
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

141

checks for the existence of the introduced mutation, using size(). If it exists, its frequency is
checked against the threshold frequency 0.1 (an arbitrary choice that can be adjusted to whatever
threshold frequency you wish). If the mutation’s frequency is above that threshold, a message is
printed and the script block deregisters itself so that further checks are not done – once the
mutation has established, the simulation runs freely to its end, whether the mutation is
subsequently fixed or lost. (See section 5.1.2 for discussion of deregisterScriptBlock(), as well
as sections 10.5.3 and 20.11.2). On the other hand, if the introduced mutation no longer exists,
the simulation is reset back to the point of introduction, just as in section 10.2.
When this model is run, you will typically see output like:
1459603349880:
1459603349880:
1459603349880:
1459603349880:

LOST – RESTARTING
LOST – RESTARTING
LOST – RESTARTING
ESTABLISHED

10.4 Partial sweeps
Sometimes it is desirable to make a selective sweep stop or change when the sweep mutation
reaches a specific frequency. This recipe will initiate a selective sweep with a beneficial mutation,
which will convert into a neutral mutation when it reaches a frequency of 0.5.
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 1.0, "f", 0.5); // introduced mutation
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
1000 late() {
target = sample(p1.genomes, 1);
target.addNewDrawnMutation(m2, 10000);
}
1000:10000 late() {
mut = sim.mutationsOfType(m2);
if (size(mut) == 0)
sim.simulationFinished();
else if (mut.selectionCoeff != 0.0)
if (sim.mutationFrequencies(NULL, mut) >= 0.5)
mut.setSelectionCoeff(0.0);
}

Most of this is similar to previous recipes in this chapter; the difference lies in the 1000:10000
event. This event now uses mutationsOfType() to recover the introduced mutation object (which,
as usual, can’t be stored anywhere persistent). If the mutation is no longer present – if size(mut) is
zero – the simulation terminates; this is equivalent to the countOfMutationsOfType(m2) check
done in previous recipes. Otherwise, it uses mutationFrequences() to check the frequency of the
introduced mutation; if it has cleared the 0.5 threshold, it is converted to neutral with
setSelectionCoeff(). That last bit of code is protected by a test that the selection coefficient has
not already been changed; that check is purely for speed, as mutationFrequencies() can be slow.
This recipe could incorporate parts of the previous recipes, in order to determine whether the
introduced mutation was lost or had fixed, or in order to make the simulation conditional on the
mutation reaching the threshold frequency; the basic idea is easily adapted.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

142

10.5 Soft sweeps from de novo mutations
In the previous recipes in this chapter, we’ve seen “hard” selective sweeps originating from a
single new mutation. If, on the other hand, a sweep proceeds from multiple copies of the same
mutation, present in different individuals, it is termed a “soft” sweep, because it preserves a more
diverse genetic background (Messer & Petrov 2013). Soft sweeps can arise from standing genetic
variation (seen in section 10.6), or can occur when multiple copies of the same de novo mutation
arise during a sweep. This section will model the latter scenario, in three different ways.
10.5.1 A soft sweep from recurrent de novo mutations in a large population
In this recipe, we will model a soft sweep by de novo mutations generated by SLiM. In order to
get a soft sweep to occur from recurrent de novo mutations at the same locus, the population
needs to be very large or the mutation rate needs to be very high, so that new copies of the
mutation tend to arise before the previous copy has fixed. In this recipe, unlike most recipes in
this book, we will not model a background of neutral mutations; instead, this is a model of
competing beneficial mutations at a single site. We are, in effect, modeling the sweep of multiple,
separate instances of a particular single-nucleotide change, such as the transition of a particular
nucleotide from G to A (conceptually – SLiM does not model actual nucleotides). The model
terminates when the sweep completes: when every individual genome possesses an instance of the
mutation, regardless of where and when this instance arose. With no further ado, the recipe:
initialize() {
initializeMutationRate(1e-5);
initializeMutationType("m1", 0.45, "f", 0.5); // sweep mutation
m1.convertToSubstitution = F;
m1.mutationStackPolicy = "f";
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 0);
initializeRecombinationRate(0);
}
1 {
sim.addSubpop("p1", 100000);
}
1:10000 {
counts = p1.genomes.countOfMutationsOfType(m1);
freq = mean(asInteger(counts > 0));
if (freq == 1.0)
{
cat("\nTotal mutations: " + size(sim.mutations) + "\n\n");
for (mut in sortBy(sim.mutations, "originGeneration"))
{
mutFreq = mean(asInteger(p1.genomes.containsMutations(mut)));
cat("Origin " + mut.originGeneration+ ": " + mutFreq + "\n");
}
sim.simulationFinished();
}
}

The chromosome is defined as having a single genomic element that spans from position 0 to
position 0: a single base position. This recipe uses a relatively high mutation rate (1e-5) with a
large population size (100000) so that multiple mutations arise at that single site before the sweep

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

143

completes. An even larger population size could be used with a lower mutation rate (N=1e6 and
u=1e-6, say, or even N=1e7 and u=1e-7), but the model would take somewhat longer to complete.
There is one unusual complication in this model: SLiM, by default, allows multiple mutations of
a given mutation type to occur at the same site in the same individual (“stacked” mutations, as we
will call them). In other words, an individual that has already undergone the G-to-A transition
might nevertheless be given another mutation at the same site, representing the same G-to-A
transition. While this behavior is desirable in many instances, it is inconvenient in this particular
model, and so this recipe tells SLiM to use a different behavior: “stacked” mutations will not be
allowed, and new mutations that collide with an existing mutation at the same site will be
suppressed. This is achieved by setting the mutationStackPolicy property of m1 to "f", which tells
SLiM to prevent stacking by keeping only the first (thus "f") mutation of type m1 at a given site (see
section 21.9.1 for details). This simplifies the model’s code considerably; without this change in
policy, the model would have to carefully take account of the possibility of stacking and
compensate for it. Note that if this model contained other mutation types as well, those would still
be allowed to coexist, both with each other and with an m1 mutation at the same site; only stacking
of m1 mutations with other m1 mutations is prevented (by default; see section 21.9.1).
Second, the dominance coefficient of 0.45 for m1 is actually a special and important value.
SLiM tracks each independent origin of the sweep mutation as a separate Mutation object; those
mutations are all of type m1, but they are distinct mutations as far as SLiM is concerned. If an
individual has different versions of the sweep mutation in its two genomes, SLiM will evaluate the
fitness of that individual according to the fact that it contains two different mutations, each of
which is heterozygous, and each of which therefore uses the mutation type’s dominance
coefficient. The value 0.45 makes this work out, because the relative fitness for one heterozygous
mutation is 1.0+0.45*0.5, which is 1.225, and then when the relative fitness values for the two
mutations are multiplied together, 1.225*1.225=1.5 (almost exactly), and 1.5 is the relative fitness
of the sweep mutation when homozygous. In other words, h=(sqrt(1+s)−1)/s, and therefore
(1+hs)2=1+s. This model could be written to use a dominance coefficient that did not satisfy this
relationship, but a fitness() callback would then be needed to produce the desired fitness values.
The 1:10000 event tallies up the overall frequency of the sweep mutation, regardless of which
particular Mutation objects are possessed by each individual. To do this, it first uses the
countOfMutationsOfType() method to produce a vector of the number of mutations in each
genome in the population. Then it tests counts > 0, producing a logical vector that is T if a
genome contains at least one mutation, F otherwise. The asInteger() function converts that
vector to integer (where T is 1 and F is 0, as defined by Eidos), and mean() then computes the
frequency of the sweep as the average of those values. If the frequency is equal to 1.0, the sweep
has completed and the simulation is stopped.
This model executes and stops, but it is hard to tell what’s going on; there is only a single base
position on the chromosome, so all of the mutations are displayed in a single column in SLiMgui’s
chromosome view. We can sort of see the sweep progressing, but we can’t tell how many
mutations are involved, or what the distribution of their frequencies might be. The model therefore
has some custom output code that processes the final state of the simulation and prints a summary.
Because stacking is prevented by mutationStackPolicy, as described above, this output code is
quite simple; it just loops through all of the mutations in the simulation (sorted by origin
generation, using sortBy()), and calculates the frequency of each mutation as the average of the
values returned by p1.genomes.containsMutations() for that mutation – a very similar strategy to
the calculation of the overall frequency of the sweep, as described above. Typical output might
indicate that 25 mutations existed in the population at the end of the run (and thus participated in
the soft sweep), and would then list the final frequencies of those mutations, sorted in ascending
order by origin generation:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

144

Total mutations: 25
Origin
Origin
Origin
Origin
Origin
Origin
...

4: 0.22628
6: 0.04301
7: 0.18139
7: 0.215995
10: 0.10995
12: 0.0709

10.5.2 A soft sweep with a fixed mutation schedule
The previous recipe relied upon SLiM’s standard mutation-generation machinery to generate
new instances of a mutation that executed a soft sweep. Sometimes one might wish to have more
control over the process than that; our next recipe therefore adds the multiple copies of the sweep
mutation at predetermined times during the run, according to a scripted schedule:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 1.0, "f", 0.5);// sweep mutation
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 500);
p1.tag = 0; // indicate that a mutation has not yet been seen
}
1000:1100 late() {
if (sim.generation % 10 == 0) {
target = sample(p1.genomes, 1);
if (target.countOfMutationsOfType(m2) == 0)
target.addNewDrawnMutation(m2, 10000);
}
}
1:10000 late() {
if (p1.tag != sim.countOfMutationsOfType(m2)) {
if (any(sim.substitutions.mutationType == m2)) {
cat("Hard sweep ended in generation " + sim.generation + "\n");
sim.simulationFinished();
} else {
p1.tag = sim.countOfMutationsOfType(m2);
cat("Gen. " + sim.generation + ": " + p1.tag + " lineage(s)\n");
if ((p1.tag == 0) & (sim.generation > 1100)) {
cat("Sweep failed to establish.\n");
sim.simulationFinished();
}
}
}
if (all(p1.genomes.countOfMutationsOfType(m2) > 0)) {
cat("Soft sweep ended in generation " + sim.generation + "\n");
cat("Frequencies:\n");
print(sim.mutationFrequencies(p1, sim.mutationsOfType(m2)));
sim.simulationFinished();
}
}

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

145

There’s a lot going on here; let’s unpack it. First of all, there are a bunch of cat() calls to print
out diagnostic information about the sweep. A run of this model might produce output like:
Gen. 1000: 1 lineage(s)
Gen. 1010: 2 lineage(s)
Gen. 1020: 3 lineage(s)
Gen. 1030: 4 lineage(s)
Gen. 1032: 3 lineage(s)
Gen. 1040: 4 lineage(s)
Gen. 1041: 3 lineage(s)
Sweep ended in generation 1043
Frequencies:
0.55 0.006 0.444

Each time a new copy of the sweep mutation is introduced, it produces a new lineage that is
conceptually distinct from other lineages containing the same mutation; although each copy of the
mutation is identical (same position, same selection coefficient, etc.), SLiM tracks each introduced
copy separately, since each is generated with a separate call to addNewDrawnMutation(). (If you
wanted SLiM to simulate just a single mutation object, without tracking lineages in this way, you
could look up the existing mutation object and add it to new individuals with the addMutation()
method instead.) As the output shows, the number of lineages increases as new copies get added
during the soft sweep; sometimes the lineage count goes down, though, as particular lineages go
extinct due to genetic drift. The model uses a value stored in p1.tag to track the number of
lineages; each time that the current lineage count differs from that tag value, a new output line is
generated. That’s the function of the first half of the 1:10000 event: to track the number of lineages,
produce output when it changes, and halt the simulation if a hard sweep completes from a single
introduced mutation (indicated by the existence of a substitution of type m2) or if the sweep fails to
establish (indicated by a lack of active sweep mutations, if we’re past the end of the mutation
introduction period).
The second half of the 1:10000 event detects when the soft sweep has completed, and prints a
message giving the generation at which it completed, along with the frequencies of each of the
mutational lineages that ended up being part of the completed sweep. These frequencies sum to 1,
indicating that every individual in the population possesses mutations from one or another of those
lineages; given that all of the mutational lineages are genetically identical, this means that the
mutation has really fixed, even though it is divided into distinct lineages by the design of the
simulation. The way this event detects completion might need a bit of explanation. First,
p1.genomes gives a vector containing the genomes for all individuals in the subpopulation. The
method countOfMutationsOfType(m2) performs a count on each genome in that vector, and
returns a new vector containing the corresponding counts. The > 0 test then generates a logical
vector containing T for each genome that had a copy of the sweep mutation; if any genome is still
missing the sweep mutation, this vector will have an F in that position. Finally, all() returns T if
all of the elements of that logical vector are T; if any genome failed the test, all() returns F. You
can compare this to the strategy for detecting soft sweep fixation used in the previous section,
which actually computes the frequency of the sweep.
The 1000:1100 event is the engine that generates the new mutational lineages of the soft sweep.
For simplicity, it starts a new lineage every 10 generations, using the modulo operator, %, to test for
generations which are evenly divisible by 10; it would of course be trivial to modify this to
generate the new lineages at whatever fixed generations you wished, or to generate a new lineage
randomly with a given probability. In any case, when it is decided that a new lineage should be
created this event adds a new mutation, identical to the previous mutations, to a randomly chosen
genome, much as we have seen before.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

146

The only tricky bit here is that we want to prevent more than one sweep mutation from existing
in the same individual, so before we add the new mutation, we check whether one is already
there. Normally SLiM allows two mutations to occupy exactly the same position in the
chromosome; the check here avoids that possibility, so as to guarantee that the final lineage
frequencies will sum to exactly 1. In the previous section, we used the mutationStackPolicy
property of MutationType to prevent such “stacked” mutations; that solution would work equally
well here, since the mutation type’s stacking policy applies to introduced mutations as well as to
mutations generated by SLiM. A different strategy is employed here simply to show a different
approach to the same issue.
10.5.3 A soft sweep with a random mutation schedule
As mentioned above, the previous recipe could be modified to follow a random schedule quite
easily; for example, the (sim.generation % 10 == 0) test could be changed to (runif(1) < 0.1)
to provide a new mutational lineage at random times averaging every ten generations. But perhaps
your model would like to know the mutation schedule ahead of time, for some reason; you would
like to have, at the outset, a list of the generations in which a new lineage will be created. This
would allow you to prune out schedules that didn’t satisfy some criterion, or run statistics on the
schedule in advance, or other such tasks. Here is an alternative recipe, then, that generates a
schedule ahead of time and then uses it to generate lineages:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 1.0, "f", 0.5);// sweep mutation
m2.mutationStackPolicy = "f";
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 500);
gens = cumSum(rpois(10, 10));
gens = gens + (1000 - min(gens));

// make a vector of start gens
// align to start at 1000

for (gen in gens)
sim.registerLateEvent(NULL, s1.source, gen, gen);
sim.deregisterScriptBlock(s1);
}
s1 1000 late() {
sample(p1.genomes, 1).addNewDrawnMutation(m2, 10000);
}
1:10000 late() {
if (all(p1.genomes.countOfMutationsOfType(m2) > 0))
{
cat("Frequencies at completion:\n");
print(sim.mutationFrequencies(p1, sim.mutationsOfType(m2)));
sim.simulationFinished();
}
if ((sim.countOfMutationsOfType(m2) == 0) & (sim.generation > 1100))
{
cat("Soft sweep failed to establish.\n");
sim.simulationFinished();
}
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

147

The 1:10000 event checks for loss or completion of the sweep and reacts accordingly, much as
we have seen previously. This recipe has been simplified somewhat from the previous one, for
brevity, however; some of the output has been removed, the lineage-counting mechanism using
p1.tag has been removed, and if the final result is a hard sweep this recipe considers it as a
“failure to establish”, unlike the previous recipe – after all, the goal is to produce a soft sweep. Of
course those pieces could be added back in to this recipe if desired. “Stacked” mutations are
prevented here using mutationStackPolicy, as in section 10.5.1 but unlike section 10.5.2, to show
the alternative approach in a model using introduced mutations (see those sections for further
discussion of the issue of stacked mutations).
More interesting is how the lineage addition works now. There is an event declared using a
syntax that has not been seen up until now:
s1 1000 late() {

This event is named; an Eidos constant, s1, will be defined as referring to the script block for the
event. To some extent this is just a shorthand for convenience; all defined script blocks in a
simulation are available through sim.scriptBlocks, as objects of type SLiMEidosBlock, and this
block could be looked up from that vector using the properties of that class (by looking for a block
running only in generation 1000, for example). Being able to just use the name s1 is obviously
more convenient, and mirrors the syntax of being able to refer to a subpopulation as p1, a genomic
element type as g1, or a mutation type as m1.
Putting that aside, this event schedules the addition of a new lineage in much the same way as
in the previous recipe. Notably, however, the generation 1000 event now schedules a new lineage
just once. The trick behind this recipe is that this block gets re-scheduled by the model to run in
an assortment of other generations. The code to do this is in the generation 1 event. First, a vector
of all the generations in which the event will run is constructed by drawing from a Poisson
distribution to get waiting times, using rpois(), and then calculating cumulative sums from that
vector of waiting times, using cumSum(); the Eidos documentation gives details on these functions,
of course. This vector is then realigned so that its smallest element is equal to 1000, giving us the
final schedule of lineage additions that will be followed. The event then loops through that vector,
and for each generation in it, it calls sim.registerLateEvent() to schedule a lineage addition in
that generation. It does this by scheduling the source code from the s1 event over again, obtained
simply with s1.source. Finally, the s1 event itself is deregistered; the loop has already scheduled a
lineage addition in generation 1000, so s1 is not needed. In this way, s1 itself never runs at all, but
its source code is used as a template for ten other events that do run.
The fact that Eidos is an interpreted language gives you the freedom to play all sorts of games
like this with the code of your simulation, even as it runs; you can register new events and
callbacks, deregister existing ones, and even generate code on the fly and then execute it with the
executeLambda() command. These sorts of techniques should be used in moderation, as they can
make the code of a model difficult to understand and maintain; but in some situations, as here,
they can prove exceedingly useful for implementing highly dynamic model behavior.
Incidentally, SLiMgui provides graphical tools for inspecting the event list, which can be useful
for understanding the behavior of models like this. This feature was previously described in
section 5.1.2, in a very different context; you might try opening the window drawer as described in
that section, and then recycling this model and stepping through generation 1. You will see that
initially, the event list contains s1; after generation 1 has executed, s1 has disappeared from the
list, because it has been deregistered, and ten new events have been added programmatically.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

148

10.6 Sweeps from standing genetic variation
The previous section showed recipes for simulating soft selective sweeps with de novo
mutations – either introduced explicitly, or resulting from the normal mutational processes of SLiM.
Soft sweeps can also originate from standing genetic variation, however. In this case, a mutation
that was previously neutral becomes beneficial (presumably due to a change in the environment),
suddenly exposing the mutation to selection. Because the mutation was initially neutral, it will
generally occur in the population against a variety of genetic backgrounds; a soft selective sweep
will therefore generally occur, although a hard sweep is possible if just one of the mutationcontaining lineages happens to eliminate all of the others.
There are several ways that a sweep from standing genetic variation might be modeled. We will
examine two recipes in this section: a sweep from standing variation at a randomly chosen locus,
and a sweep from standing variation at a predetermined locus that is triggered when a mutation at
that locus reaches a threshold frequency.
10.6.1 A sweep from standing variation at a random locus
In this first recipe, a randomly chosen mutation will be picked out of the standing neutral
genetic diversity that has accumulated in the model; the only criterion for the selection of the
mutation will be that it must be above a threshold frequency. The chosen mutation will be
changed to be beneficial (reflecting an environmental change), and the model will terminate when
the mutation has either been lost or has completed a sweep:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 1.0, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 500);
}
1000 late() {
muts = sim.mutations;
muts = muts[sim.mutationFrequencies(p1, muts) > 0.1];
if (size(muts))
{
mut = sample(muts, 1);
mut.setSelectionCoeff(0.5);
}
else
{
cat("No contender of sufficient frequency found.\n");
}
}
1000:10000 late() {
if (sum(sim.mutations.selectionCoeff) == 0.0)
{
if (sum(sim.substitutions.selectionCoeff) == 0.0)
cat("Sweep mutation lost in gen. " + sim.generation + "\n");
else
cat("Sweep mutation reached fixation.\n");
sim.simulationFinished();
}
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

149

The generation 1000 event chooses the target mutation and transmogrifies it. The muts variable
is set up initially as the full set of mutations in the simulation, and is then whittled down to only
those mutations above a minimum threshold frequency of 0.1. Note that it would be
straightforward to modify this further and also require that the mutation is below a given maximum
threshold frequency; in that case, we would be studying a scenario of a selective sweep from a
standing genetic variant with starting frequency within the interval (min, max). The sample()
function is then used to choose one mutation from that set, at random, and the chosen mutation
has its selection coefficient altered to be beneficial. Note that this event is a late() event; this is
for essentially the same reasons as explicated in section 4.2.1. We are changing the selection
coefficient of an existing mutation; for the fitness effect of that change to be realized immediately,
it must happen in a late() event so that the change occurs after offspring generation but before
fitness recalculation.
The 1000:10000 event just checks for termination conditions: either the chosen mutation has
been lost, or it has fixed. Since a separate mutation type is not used for the sweep mutation in this
model, unlike the other sweep models we have seen, these termination conditions are detected
using the property that the chosen mutation has a non-zero selection coefficient.
This could be tailored in all sorts of ways; for example, the transition from neutral to beneficial
could be made gradual, or the new selection coefficient could be drawn from a distribution rather
than being a fixed value.
10.6.2 A sweep from standing variation at a predetermined locus
In this recipe we want to model a soft sweep from standing genetic variation. Specifically, we
want to assume that a previously neutral mutation suddenly becomes beneficial as a consequence
of an environmental change, and then subsequently sweeps in the population. In this scenario, we
can choose the “starting frequency” of the mutation at the time the environment changes. Our
recipe for this scenario is based on the conditional-on-establishment machinery of section 10.3,
which restarts the simulation if the mutation is lost prior to reaching the chosen starting frequency
(here defined as a frequency of 0.1). If the mutation does manage to drift to this frequency, it is
then converted to be beneficial. After that point, the model runs without conditionality, and
detects whether the mutation fixes or is lost prior to generation 10000.
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 1.0, "f", 0.0); // introduced mutation
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
// save this run's identifier, used to save and restore
defineConstant("simID", getSeed());
sim.addSubpop("p1", 500);
}
1000 late() {
// save the state of the simulation
sim.outputFull("/tmp/slim_" + simID + ".txt");
// introduce the sweep mutation
target = sample(p1.genomes, 1);
target.addNewDrawnMutation(m2, 10000);
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

150

1000: late() {
mut = sim.mutationsOfType(m2);
if (size(mut) == 1)
{
if (sim.mutationFrequencies(NULL, mut) > 0.1)
{
cat(simID + ": ESTABLISHED – CONVERTING TO BENEFICIAL\n");
mut.setSelectionCoeff(0.5);
sim.deregisterScriptBlock(self);
}
}
else
{
cat(simID + ": LOST BEFORE ESTABLISHMENT – RESTARTING\n");
// go back to generation 1000
sim.readFromPopulationFile("/tmp/slim_" + simID + ".txt");
// start a newly seeded run
setSeed(rdunif(1, 0, asInteger(2^32) - 1));
// re-introduce the sweep mutation
target = sample(p1.genomes, 1);
target.addNewDrawnMutation(m2, 10000);
}
}
1000:10000 late() {
if (sim.countOfMutationsOfType(m2) == 0)
{
fixed = (sum(sim.substitutions.mutationType == m2) == 1);
cat(simID + ifelse(fixed, ": FIXED\n", ": LOST\n"));
sim.simulationFinished();
}
}

Most of the code here is identical to the recipe in section 10.3, so please refer to that section for
further discussion. Compared to that recipe, mutation type m2 now has an initial selection
coefficient of 0.0, making it neutral. If the mutation reaches the threshold frequency of 0.1, an
extra line of code calls setSelectionCoeff() to convert the mutation to beneficial. Finally, a
1000:10000 event has been added that checks for fixation or loss. Note that this event runs after
the 1000: event, because it is declared later in the source code; the 1000: event catches cases of
loss prior to establishment and restarts the model before the 1000:10000 event notices the loss.
If you run this model, it will probably emit a long string of restart messages before it fixes, due
to loss of the introduced mutation while it is still neutral. Once it reaches the threshold frequency
and is converted to beneficial, it will usually fix. Typical output is therefore something like:
1460585630465:
1460585630465:
...
1460585630465:
1460585630465:
1460585630465:
1460585630465:

LOST BEFORE ESTABLISHMENT – RESTARTING
LOST BEFORE ESTABLISHMENT – RESTARTING
LOST BEFORE ESTABLISHMENT – RESTARTING
LOST BEFORE ESTABLISHMENT – RESTARTING
ESTABLISHED – CONVERTING TO BENEFICIAL
FIXED

It would be simple to convert the introduced mutation to be initially deleterious; the model
would still work. In that case, however, it would usually take a very long time to complete a run,
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

151

because the introduced mutation would frequently be lost unless we chose a very low threshold
frequency.
10.7 Adaptive introgression
All of the selective sweep recipes thus far have involved just a single subpopulation. Here we
want to extend this to model adaptive introgression, the progress of an adaptive allele from its
subpopulation of origin into another subpopulation via migration and gene flow. Section 5.3
introduced techniques for setting up structured populations with migration; all we need to do is
add a selective sweep to such a model. Here’s a simple model combining the population structure
of section 5.3.1’s recipe with the hard sweep dynamics of section 10.1 (both slightly modified):
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 1.0, "f", 0.5);// introduced mutation
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
subpopCount = 10;
for (i in 1:subpopCount)
sim.addSubpop(i, 500);
for (i in 2:subpopCount)
sim.subpopulations[i-1].setMigrationRates(i-1, 0.01);
for (i in 1:(subpopCount-1))
sim.subpopulations[i-1].setMigrationRates(i+1, 0.2);
}
100 late() {
target = sample(p1.genomes, 1);
target.addNewDrawnMutation(m2, 10000);
}
100:100000 late() {
if (sim.countOfMutationsOfType(m2) == 0)
{
fixed = (sum(sim.substitutions.mutationType == m2) == 1);
cat(ifelse(fixed, "FIXED\n", "LOST\n"));
sim.simulationFinished();
}
}

If you run this model and stop the run in the middle, while the adaptive introgression is in
progress (assuming it is not lost by chance early on – you may have to recycle and restart several
times before it catches), the population shown in SLiMgui may look something like this:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

152

Individuals possessing the introgressing allele are colored green; note that there are no
intermediate-colored individuals because m2 is fully dominant. The mutation was introduced into
subpopulation p1, and so it is closest to fixation there, whereas it has only just begun to sweep in
p10. Of course it is not possible to see the connectivity between the subpopulations, and the
migration rates between them, in this view. As described in section 5.1.3, you can open the
population visualization window to see what the population structure looks like:

Here we see that we have a stepping-stone model, with gene flow mostly from p10 toward p1,
opposing the spread of the introduced mutation. Nevertheless, the mutation introgresses quite
rapidly given the parameters for this model; you could play with migration rates, selection
coefficients, etc. to test how they affect the time until completion of the sweep. It would be easy
to introduce spatial variation in selection into this model, too, using the techniques of section 9.2,
in order to allow the introgression to proceed only so far along the chain of subpopulations before
further introgression is blocked by local selection against it (see section 12.3).
10.8 Fixation probabilities under Hill-Robertson interference
In this final section, we’ll see how a simple sweep model can test the predictions of population
genetic theory. The recipe here will model Hill-Robertson interference, the interference of
beneficial mutations that have arisen in different lineages and thus compete against each other
(Hill & Robertson 1966). The recipe’s code then compares the predicted mean fixation time for a
beneficial allele without such interference to the actual mean fixation time observed by the
simulation.
The design of this recipe is that the first 5000 generations of the run are a burn-in period during
which a dynamic equilibrium is reached, and then the final 1000 generations are measured. The
formulas for the probability of fixation of a beneficial mutation without interference and the
expected number of fixed mutations are calculated according to standard population genetic
theory (Kimura 1962). The actual number of fixed mutations is obtained from the information
stored in SLiM’s substitution objects, which record the generation of fixation of all mutations; the
number of fixed mutations with a generation of fixation after the end of the burn-in can then be
counted. As we shall see below, the actual count is much lower than the expected count; fixation
is being highly suppressed in this model compared to the expectations without Hill-Robertson
interference.
Without further ado, the recipe:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

153

initialize() {
initializeMutationRate(1e-6);
initializeMutationType("m1", 0.5, "f", 0.05);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 1000);
}
6000 late() {
// Calculate the fixation probability for a beneficial mutation
s = 0.05;
N = 1000;
p_fix = (1 - exp(-2 * s)) / (1 - exp(-4 * N * s));
// Calculate the expected number of fixed mutations
n_gens = 1000; // first 5000 generations were burn-in
mu = 1e-6;
locus_size = 100000;
expected = mu * locus_size * n_gens * 2 * N * p_fix;
// Figure out the actual number of fixations after burn-in
subs = sim.substitutions;
actual = sum(subs.fixationGeneration >= 5000);
// Print a summary of our findings
cat("P(fix) = " + p_fix + "\n");
cat("Expected fixations: " + expected + "\n");
cat("Actual fixations: " + actual + "\n");
cat("Ratio, actual/expected: " + (actual/expected) + "\n");
}

Running this produces output like this (a typical result):
P(fix) = 0.0951626
Expected fixations: 19032.5
Actual fixations: 302
Ratio, actual/expected: 0.0158676

This is obviously a very simple model; the point is just to demonstrate that it is straightforward
to make models that test theoretical predictions like this – and that Hill-Robertson interference can
make a large difference to dynamics! If you run the model in SLiMgui, the way that the
competition of different lineages slows progress towards fixation is quite apparent. Of course if the
mutational input is constant, and the time to fixation increases, that has to mean (assuming
equilibrium) that fewer mutations are fixing. That can also be observed with this model in
SLiMgui; you can see many substantial bars, representing beneficial mutations at reasonably high
frequencies, getting pushed down to zero by the rise of a competing haplotype. Far fewer
beneficial mutations would be lost if they were not competing with each other. Recombination
occasionally resolves these conflicts by bringing competing mutations together onto the same
chromosome.
A very simple way to check this model is to decrease the mutation rate. The lower the mutation
rate, the less competition there should be between different mutations existing simultaneously in
the population (to the limiting case of a single extant mutation at any given time, which would
experience no Hill-Robertson interference at all). This is very easy to test; just change the mutation
rate to 1e-7, both where it is set with initializeMutationRate() and where it is assigned to the
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

154

variable mu for the calculations at the end. With both of those set to 1e-7, typical output is
something like:
P(fix) = 0.0951626
Expected fixations: 1903.25
Actual fixations: 88
Ratio, actual/expected: 0.0462367

And if we change the rate to 1e-8, again in both places, we get:
P(fix) = 0.0951626
Expected fixations: 190.325
Actual fixations: 16
Ratio, actual/expected: 0.0840667

With 1e-9, we get something like this (with a lot of stochasticity in the result, at this point –
generations post-burn-in is not nearly long enough to get an accurate estimate of the model’s
behavior when the mutation rate is so low):
1000

P(fix) = 0.0951626
Expected fixations: 19.0325
Actual fixations: 6
Ratio, actual/expected: 0.31525

So as predicted, the lower the mutation rate, the less Hill-Robertson interference influences the
dynamics, and the more closely the model approximates the theoretical ideal of independent
mutations without interference. And indeed, if you run the model with the mutation rate of 1e-9 in
SLiMgui, you will see that sometimes multiple mutations still interfere, but fairly often, too, single
mutations arise and fix individually. Of course a rigorous analysis would want to use a longer
burn-in, a longer post-burn-in runtime, multiple runs averaged together for each mutation rate,
calculation of a standard error of the mean across each set of multiple runs, statistical tests for
significance of differences, and so forth; but even this very simple analysis is sufficient to make the
trend quite clear.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

155

11. Complex mating schemes
Since SLiM is based on the Wright-Fisher model, biparental mating in SLiM is normally random;
for each offspring individual to be generated in a subpopulation, two parents are chosen at random
from the appropriate subpopulation (which may not be the same subpopulation as the offspring’s
subpopulation, if migration is involved). It is possible to modify this default behavior, however, by
writing a special type of Eidos script block called a mateChoice() callback. These callbacks are
documented comprehensively in the SLiM reference, in section 22.3. Here we will explore
recipes that illustrate the use of mateChoice() callbacks to implement two different types of nonrandom mating: assortative mating, and sequential mate search.
Some mating schemes are actually better modeled using a different type of callback, a
modifyChild() callback. Mostly, modifyChild() callbacks are used for purposes other than mating
schemes; recipes using them are mostly in chapter 12 (and they are documented in the SLiM
reference in section 22.4). However, the last recipe in this chapter will implement an interesting
type of non-random mating, gametophytic self-incompatibility based on an S-locus, using a
modifyChild() callback. See section 13.7 for another type of non-random mating, forcing SLiM to
execute a specified pedigree, also done with a modifyChild() callback.
11.1 Assortative mating
Assortative mating is the preference of an individual for mates that resemble the individual itself
in some way. A species could exhibit assortative mating by size, for example, which would mean
that smaller individuals tend to prefer other smaller individuals as mates, whereas larger
individuals tend to prefer other larger individuals. Assortative mating is an important topic in
evolutionary biology because it is thought to be important to the process of speciation: a
population can diverge into genetically distinct lineages if assortative mating becomes strong
enough to reproductively isolate phenotypically distinct subsets of the population from each other.
Indeed, this mechanism can be so effective that if a single gene provides both adaptive phenotypic
divergence and assortative mating based upon the diverging trait, the trait governed by the gene is
called a “magic trait” because of its power to facilitate speciation (Servedio et al. 2011).
In this section we’ll look at a simple magic-trait model in SLiM (see sections 13.1 and 14.8 for
further investigations of this topic). Since this is a relatively complex model, let’s put it together
one piece at a time. The first piece is to set up the genetic and population structure:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 1.0, "f", 0.5);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 500);
sim.addSubpop("p2", 500);
p1.setMigrationRates(p2, 0.1);
p2.setMigrationRates(p1, 0.1);
}
1000 late() {
target = sample(p1.genomes, 1);
target.addNewDrawnMutation(m2, 10000);
}
3499 { sim.simulationFinished(); }

// introduced mutation

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

156

This is very similar to the “introduced adaptive mutation” recipe of section 10.1: we have two
mutation types, one neutral (m1) and the other adaptive (m2), and after running a burn-in period of
1000 generations using only type m1, we introduce a single mutation of type m2 into a randomly
chosen genome. The main difference between section 10.1’s recipe and this model is that here we
have two subpopulations connected by migration; the gene flow introduced by the migration rate
of 0.1 in both directions between the subpopulations is substantial.
When run, this model provides sweep dynamics similar to those seen in some recipes in the
previous chapter. In the snapshot below, most of the neutral diversity has been swept away by the
beneficial mutation introduced at position 10000, which is about to fix and is carrying a few
neutral mutations along with it:

We can also examine the behavior of this model in SLiMgui using the Mutation Frequency
Trajectories graph. First recycle the simulation, then click the Show Graph popup button
and
open the Mutation Frequency Trajectories graph window from that menu. At this point, the
simulation is not yet initialized, and so the graph is empty. Step forward one generation, over the
initialize() callback; now the mutation types have been defined, and so you can choose
mutation type m2 from the popup in the graph window. Play the simulation forward, and you
should end up with a plot something like this, if the introduced mutation is not lost (if it is lost, you
can just recycle and try again):

Frequency

1.0

0.5

0.0
0

1000

2000

Generation

This plot illustrates that from the point at which the mutation was introduced, in generation
1000, it swept very rapidly to fixation.
For our magic-trait model we need divergent ecological selection between the subpopulations,
which we introduce by adding a fitness() callback that flips the fitness effect of the introduced
mutation in subpopulation p2:
fitness(m2, p2) { return -0.2; }

That’s all it takes. For simplicity, we are modeling a dominant mutation here, but we could
easily use a scheme such as that in section 9.4.1 if the mutation were not dominant.
A little while after the introduced mutation arose, this looks like:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

157

The introduced mutation is not fixing in this version of the model; indeed, it is stuck fluctuating
around a frequency of 0.5, unable to increase further because it is deleterious in p2. Note that
there is lots of neutral variation in this run, in contrast to the previous version of the model.
If we look at the Mutation Frequency Trajectories graph now, we see a very different pattern,
illustrating how the spatial variation in selection is preventing the introduced mutation from fixing:

Frequency

1.0

0.5

0.0
0

1000

2000

Generation

Finally, for a magic-trait model we need assortative mating based upon the same locus that is
under divergent selection. In this model, that assortative mating will be provided by a
mateChoice() callback, which we can add now:
2000: mateChoice() {
parent1HasMut = (individual.countOfMutationsOfType(m2) > 0);
parent2HasMut = (sourceSubpop.individuals.countOfMutationsOfType(m2)
> 0);
if (parent1HasMut)
return weights * ifelse(parent2HasMut, 2.0, 1.0);
else
return weights * ifelse(parent2HasMut, 0.5, 1.0);
}

This callback is a bit complicated, so let’s walk through it. The first line determines whether the
parent that is choosing a mate possesses the magic-trait mutation. The choosing parent is provided
to the callback as an object named individual, of class Individual; this class is essentially a bag
containing the two Genome objects belonging to the individual. The countOfMutationsOfType()
method of Individual counts all occurrences of mutations of the given type in both of the
individual’s genomes; in other words, if the individual is homozygous for a given m2 mutation, that
mutation is represented twice in the count. Comparing the total count to 0 yields a single logical
truth value indicating whether any mutation of that type is present in the individual. Note that the
use of > 0 means that we are ignoring dominance; for purposes of mate choice, we are treating the
mutation as dominant. If we wanted heterozygotes to prefer heterozygotes and homozygotes
prefer homozygotes, or some such mating scheme, we could use the actual count instead; with a
little change to the following logic that would work fine too.
Similarly, we can assess whether the other individuals in the subpopulation – the potential
mates – possess the mutation using countOfMutationsOfType(m2) for all the individuals in the
subpopulation. As previously, we use a > 0 comparison to get a vector of logical values,
parent2HasMut, indicating whether each individual in the subpopulation possess the magic-trait
mutation in either of its genomes.
Finally, we have a little logic at the end to determine the mating preferences we want to return.
The standard fitness-based mating weights are supplied to the mateChoice() callback in the
weights variable, and we don’t want to ignore that; if we didn’t use that vector at all, we would be
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

158

overriding SLiM’s built-in fitness calculations, based on the selection coefficients of mutations,
entirely (which you may do if you wish, but which is usually not what is wanted). Here, we start
with weights, and combine it multiplicatively with an assortative-mating term computed with
ifelse(). The ifelse() function produces a result by looking at each element of its first
parameter (here, parent2HasMut) and choosing a corresponding element for the result; if the
element from the first parameter is T, an element from the second parameter is chosen, whereas if
the element from the first parameter is F, an element from the third parameter is chosen. The
second and third parameters may be non-singleton vectors, too; ifelse() is a very powerful
vectorized comparison function, which you can read more about in the Eidos documentation.
Here, its use is really quite simple. If the parent choosing a mate possesses the magic-trait
mutation, then candidate mates that also possess it get a multiplier of 2.0, expressing the
preference of carriers for other carriers. If the parent choosing a mate does not possess the magictrait mutation, then candidate mates that possess it get a multiplier of 0.5, expressing the dislike of
non-carriers for carriers.
If we recycle and run, and then look at the Mutation Frequency Trajectories graph, we can see
the effects of this callback kick in at generation 2000, when the callback becomes active:

Frequency

1.0

0.5

0.0
0

1000

2000

3000

Generation

At generation 2000, when the mateChoice() callback becomes active, the frequency of the
magic-trait allele jumps upward. If the two subpopulations were able to reach complete
reproductive isolation through this mechanism, we would expect the frequency plotted here to
equilibrate at 1.0 (because this graph shows the frequencies in subpopulation p1 only). In
practice, full divergence is not possible, because SLiM is a model of juvenile dispersal, and fitness
acts during mating (as opposed to causing mortality earlier in the generation life cycle). Given the
migration rates declared in the model, approximately 10% of the individuals in each
subpopulation will come from matings in the other population, every generation. Those migrants
are also mating assortatively – they might be maladapted in their new home, but they will
nevertheless preferentially mate with each other to produce offspring that are also maladapted.
Given the very high migration rate, adaptive divergence is helped only a little by the addition of
assortative mating (but see below).
The rest of the chromosome, outside the magic-trait locus, is subject only to neutral mutations
in this simple model. These neutral mutations can provide a means of monitoring the degree of
divergence between the subpopulations in the model; if the subpopulations become fully
reproductively isolated from each other, divergence at neutral sites should be observed, whereas
without reproductive isolation divergence at neutral sites should be small or absent. This is often
measured with a metric called FST: higher FST indicates greater genetic divergence among
subpopulations. We can add code to start calculating the mean FST between p1 and p2 at
generation 3000:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

159

// Calculate the FST between two subpopulations
function (f$)calcFST(o$ subpop1, o$ subpop2)
{
p1_p = sim.mutationFrequencies(subpop1);
p2_p = sim.mutationFrequencies(subpop2);
mean_p = (p1_p + p2_p) / 2.0;
H_t = 2.0 * mean_p * (1.0 - mean_p);
H_s = p1_p * (1.0 - p1_p) + p2_p * (1.0 - p2_p);
fst = 1.0 - H_s/H_t;
fst = fst[!isNAN(fst)]; // exclude muts where mean_p is 0.0 or 1.0
return mean(fst);
}
3000: late() {
sim.setValue("FST", sim.getValue("FST") + calcFST(p1, p2));
}
3499 late() {
cat("Mean FST at equilibrium: " + (sim.getValue("FST") / 500));
sim.simulationFinished();
}

This recipe introduces a new feature in Eidos that was introduced in SLiM 2.5: user-defined
functions (see the Eidos manual for details on the syntax, etc.). Most of the code above defines a
new function, named calcFST(), that calculates the FST between two subpopulations. The rest of
the code just calls that function, records the results, and prints out a summary at the end of the run
(discussed below). Writing a new function like this is a good way to encapsulate general-purpose
code that could be reused in other models. In fact, user-defined functions like this have so much
potential to be useful that we have started a Github repository just for sharing them with other
users of SLiM: https://github.com/MesserLab/SLiM-Extras. Some of the functions shared there have
been written by us; we’re hoping that SLiM users will contribute more. You can send us a
contribution via a git pull request, or you can just email us your code and we’ll put it into the
repository for you if you prefer. If you’re a regular user of SLiM, it might be a good idea for you to
check that repository from time to time, to see what new goodies might have been added!
Besides the quirk of the user-defined function, this code is just an Eidos implementation of
Wright’s definition of FST:

FST = 1 −

HS

HT

where HS is the average heterozygosity in the two subpopulations, and HT is the total
heterozygosity when both subpopulations are combined. Note that all of the calculations here are
vectorized; the FST for each mutation in the simulation is calculated simultaneously in the code
above, leveraging the vectorized syntax of Eidos, and only at the end (with mean(F_ST)) are those
values combined into a mean FST value across the chromosome.
The mean FST for the generation is added to a value kept by the simulation using its dictionarylike getValue() / setValue() mechanism (see section 21.12.2), to keep a running total; that total is
then used in generation 3499 to calculate the average FST over the generations 3000:3499. The
getValue() / setValue() facility simply provides a way to attach named state to the simulation; it
is similar to the tag property we have used in various other recipes, but is more flexible (allowing
us to keep track of state of type float here, for example, whereas tag is limited to integer values).
We also need to add a line to the generation 1 event, to zero out the running FST total:
sim.setValue("FST", 0.0);
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

160

If we run the model with the mateChoice() callback commented out, we get this output at the
end of the run (for a run in which the focal mutation is not lost):
Mean FST at equilibrium: 0.0208799

Running it with the mateChoice() callback active, on the other hand, produces this output at
the end of the run:
Mean FST at equilibrium: 0.0322724

The magic trait therefore appears to have increased the level of divergence between the
subpopulations at neutral loci, relative to the divergence for the same trait under natural selection
but without the “magic” effect on mate choice, due to a decrease in gene flow. Single runs are of
course not sufficient to show this convincingly; in practice, you would want to do many
replications of both models, and do some statistics to show that the difference between the
outcomes of the two models is significant. You might also wish to run for more than 1000
generations before starting to gather the FST statistics, to assure that equilibrium had been reached.
You might also wish to verify that the magic trait was not lost from the simulation; that happens
occasionally, which of course defeats the purpose of the model. Finally, you might look separately
at the FST at different positions along the chromosome, since it might be much higher near the
magic trait locus than at more distant locations, due to linkage.
For the record, the complete model with all of the above components is:
// Calculate the FST between two subpopulations
function (f$)calcFST(o$ subpop1, o$ subpop2)
{
p1_p = sim.mutationFrequencies(subpop1);
p2_p = sim.mutationFrequencies(subpop2);
mean_p = (p1_p + p2_p) / 2.0;
H_t = 2.0 * mean_p * (1.0 - mean_p);
H_s = p1_p * (1.0 - p1_p) + p2_p * (1.0 - p2_p);
fst = 1.0 - H_s/H_t;
fst = fst[!isNAN(fst)]; // exclude muts where mean_p is 0.0 or 1.0
return mean(fst);
}
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 1.0, "f", 0.5);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.setValue("FST", 0.0);
sim.addSubpop("p1", 500);
sim.addSubpop("p2", 500);
p1.setMigrationRates(p2, 0.1);
p2.setMigrationRates(p1, 0.1);
}
1000 late() {
target = sample(p1.genomes, 1);
target.addNewDrawnMutation(m2, 10000);
}
fitness(m2, p2) { return -0.2; }

// introduced mutation

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

161

2000: mateChoice() {
parent1HasMut = (individual.countOfMutationsOfType(m2) > 0);
parent2HasMut = (sourceSubpop.individuals.countOfMutationsOfType(m2)
> 0);
if (parent1HasMut)
return weights * ifelse(parent2HasMut, 2.0, 1.0);
else
return weights * ifelse(parent2HasMut, 0.5, 1.0);
}
3000: late() {
sim.setValue("FST", sim.getValue("FST") + calcFST(p1, p2));
}
3499 late() {
cat("Mean FST at equilibrium: " + (sim.getValue("FST") / 500));
sim.simulationFinished();
}

This model runs very slowly after generation 2000. The mateChoice() callback computes the
same values over and over; using the Eidos functions defineConstant() and rm() to cache those
values yields about a 10× speedup. This is left as an advanced exercise for the reader.
11.2 Sequential mate search
In the previous section we saw how to use a mateChoice() callback to implement assortative
mating with a mateChoice() that modified the standard mating weights vector, weights, by
multiplying it with an additional term derived from the genetic match between the focal individual
and the other individuals in the subpopulation. In other words, the ultimate choice of mate was
left to SLiM’s machinery; the mateChoice() callback just modified the mating weights.
In this section we will explore a completely different kind of mateChoice() callback, one which
makes the mate choice determination itself and tells SLiM’s machinery which mate was chosen (if
any). The standard weights vector will be used internally, within the mateChoice() callback; but
the callback will return a weights vector with 1 for the chosen mate and 0 for all other candidates.
In certain circumstances, it will instead return a zero-length vector as a signal to SLiM that no
suitable mate could be found, initiating a new mate search with a new choosing parent.
The biological scenario here has to do with sequential mate search, the search by the choosy
parent for a suitable mate by examining candidates sequentially until a suitable candidate is found
or the breeding season runs out. Conceptually, we will model a species like peacocks, in which
the choosy parent is looking for a mate with a large ornament that is costly in fitness but that
makes the mate attractive; however, to keep the model relatively simple we will do a
hermaphroditic model in which all individuals can play both the chooser and the chosen.
Let’s build the model one step at a time, first constructing the genetic and population structure:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
// neutral
initializeMutationType("m2", 0.5, "f", -0.01); // ornamental
initializeGenomicElementType("g1", c(m1, m2), c(1.0, 0.01));
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
2001 early() {
sim.simulationFinished();
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

162

This is straightforward: one population of 500 individuals, and a genetic structure that allows
both neutral mutations (m1) and mutations that influence the ornament size of the individuals (m2).
The ornamental mutations occur only about 1% as often as the neutral mutations, but may occur
anywhere in the genome; this could be thought of as a “many genes of small effect”, quantitativegenetics sort of scenario. These ornamental mutations are all slightly deleterious, as genes that
increase the size of a peacock’s tail presumably would be (apart from their positive effect on
mating). If run, this model behaves as you might expect: neutral mutations accumulate, but no
ornamental mutations fix, or even reach appreciable frequency, because of their deleterious effect.
Now we want to introduce the effect of the ornamental mutations on mating; to do so, we add a
mateChoice() callback. This callback should implement sequential choosy mate choice by
searching for a mate, preferring potential mates with more ornamental mutations:
mateChoice() {
fixedMuts = sum(sim.substitutions.mutationType == m2);
for (attempt in 1:5)
{
mate = sample(0:499, 1, T, weights);
osize = 1.0 + (fixedMuts * 0.01) - p1.cachedFitness(mate);
if (runif(1) < osize * 10 + 0.1)
return p1.individuals[mate];
}
return float(0);
}

The first line of this callback just totals up the number of ornamental mutations that have fixed.
Once a mutation fixes, SLiM removes it from the active simulation, and from fitness calculations;
since all individuals possess the fixed mutation, it has no differential effect on fitness or dynamics,
in general. In this model, however, we want such fixed mutations to continue to influence mate
choice, as described below. This could also be done (perhaps better) by setting the
convertToSubstitution property of m2 to F, as we have seen in some previous recipes; we’re using
a different strategy here just to show a different angle on this common issue.
Next, the callback uses a for loop to make up to five attempts at finding a mate. Each attempt
is a little less picky than the previous attempt, reflecting declining standards as breeding season
proceeds. If all five attempts fail, float(0) is returned to indicate that the individual failed to find
a mate. Within each attempt, a candidate mate is chosen using the sample() function, with the
standard fitness-based weights vector as the weights for sampling; the deleterious effect of the
ornamental mutations is thus still taken into account, reducing the likelihood that highly
ornamented individuals will be chosen as mates for survival- or growth-based reasons. The
ornament size of the candidate mate is calculated, using both fixed and unfixed ornamental
mutations. Finally, runif() generates a random uniform draw (between 0 and 1, by default), and
that drawn value is compared to a threshold value determined by the candidate mate’s ornament
size; if the candidate gets sufficiently lucky, it ends up as the chosen mate. (The addition of 0.1
here ensures that the mate choice algorithm is guaranteed to terminate eventually; without it, a
hang would be possible if no individual can ever find a suitable mate because no individuals with
any ornamental mutations exist.) When a mate is chosen, it is simply returned; it just needs to be
looked up from p1.individuals since mate is just the index of the chosen mate. It would be
possible instead to construct a new weights vector, with 1 for the chosen mate and 0 for all other
entries, and return that to force SLiM to choose that individual; simply returning the individual is a
shorthand that can be handled by SLiM far more efficiently than can a returned weights vector.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

163

If you run this model, you can now see the ornamental mutations increasing in frequency and
fixing despite their deleterious fitness effect, because they are favored by sexual selection.
We might be interested in the final ornament size attained, so we can flesh out the final event:
2001 early() {
fixedMuts = sum(sim.substitutions.mutationType == m2);
osize = 1.0 + (fixedMuts * 0.01) - mean(p1.cachedFitness(NULL));
cat("Mean ornament size: " + osize);
sim.simulationFinished();
}

By writing this as a 2001 early() event, we are effectively running at the very end of generation
2000, after fitness values have been evaluated (see the generation cycle diagram in section 1.3, or
the more in-depth discussion in chapter 19). This distinction is important, since this event calls
cachedFitness() to get the fitness values of individuals; in a late() event those values would not
yet be calculated, and in fact calling cachedFitness() at that time would result in an error.
If you run this model until completion, you will likely see an output line like:
Mean ornament size: 0.0905

Individuals have evolved to possess, on average, about nine ornamental mutations. If you
modify the model to run until generation 10001, the final state is not much different:
Mean ornament size: 0.09

The fact that the mate choice algorithm takes fixed mutations into account, comparing the true
ornament sizes of candidate mates, means that having an ornament above a certain threshold size
provides no marginal benefit; indeed, because of the deleterious fitness effect of the ornamental
mutations, additional ornamental mutations above that threshold are selected against. The model
therefore reaches an equilibrium ornament size, just as happens in reality with ornamented species
subject to both natural selection against the ornament and sexual selection for the ornament, such
as peacocks. In this toy model, the equilibrium value is easily predicted from the structure of the
model itself (the key predictor is the mathematics of the runif() comparison in the callback).
For the record, here is the full model with all of the above components:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
// neutral
initializeMutationType("m2", 0.5, "f", -0.01); // ornamental
initializeGenomicElementType("g1", c(m1, m2), c(1.0, 0.01));
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
mateChoice() {
fixedMuts = sum(sim.substitutions.mutationType == m2);
for (attempt in 1:5)
{
mate = sample(0:499, 1, T, weights);
osize = 1.0 + (fixedMuts * 0.01) - p1.cachedFitness(mate);
if (runif(1) < osize * 10 + 0.1)
return p1.individuals[mate];
}
return float(0);
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

164

2001 early() {
fixedMuts = sum(sim.substitutions.mutationType == m2);
osize = 1.0 + (fixedMuts * 0.01) - mean(p1.cachedFitness(NULL));
cat("Mean ornament size: " + osize);
sim.simulationFinished();
}

Any kind of preference, bias, search, benefit, or cost influencing mate choice or mating
eligibility can be modeled with mateChoice() callbacks; this recipe only scratches the surface.
11.3 Gametophytic self-incompatibility
In the previous sections we have seen the use of mateChoice() callbacks to implement
assortative mating and sequential mate choice, two types of non-random mating. In this section,
we will see the use of a different type of callback, a modifyChild() callback, to model an
interesting type of non-random mating, gametophytic self-incompatibility based on S-locus alleles.
Gametophytic self-incompatibility is quite a common mating scheme in plants. In a nutshell,
the pollen grain (the male gamete, which is haploid) expresses the particular allele that it possesses
at a special locus called the S-locus. When a pollen grain lands on the stigma of a female flower
and begins to grow a pollen tube down the style towards the ovaries of the flower, the pollen tube
expresses the S-locus allele of the pollen grain as it grows. Female flowers, in their styles, do some
sort of check of the S-locus allele being expressed by the pollen tube, and if it exactly matches an
S-locus allele possessed by the female plant (on either of its two copies of that locus, since the
female flower is part of a diploid plant), it halts the growth of the pollen tube, preventing
fertilization. The reasons why this mechanism might be beneficial are thought to be related to
promotion of outcrossing and prevention of selfing.
Why use a modifyChild() callback instead of a mateChoice() callback to model a mate choice
scheme such as gametophytic self-incompatibility? There are several considerations.
Conceptually, a mateChoice() callback allows control over how likely every possible mating pair
in a population is to form, whereas a modifyChild() callback is about controlling the outcome of a
specific mating – governing whether that mating is fertile and what the genetic outcome of the
mating is. Gametophytic self-incompatibility is not really about the choice of mates across the
population (pollen lands indiscriminately on all female flowers, at least without complications
such as heterostyly or enantiostyly); instead, it is about whether the combination of a specific
pollen grain and a specific flower is fertile or infertile, making modifyChild() a natural choice.
Another consideration is that mateChoice() callbacks are about pre-mating reproductive
isolation, whereas modifyChild() callbacks are about post-mating reproductive isolation (among
other uses). Gametophytic self-incompatibility depends upon the S-locus allele expressed by the
haploid pollen grain; that information is simply not available in a mateChoice() callback, since
gametes have not yet been produced at that stage. From this perspective, then, modifyChild() is
actually a forced choice for this recipe.
A third consideration is that if a mateChoice() callback rejects a first parent completely, SLiM’s
mating algorithm goes back to try a different first parent within the same source subpopulation,
whereas if a modifyChild() callback suppresses a child completely, SLiM’s mating algorithm rechooses the source subpopulation as well, based upon the migration rates set for the target
subpopulation (see section 19.2). This difference reflects the pre-mating versus post-mating
semantics of the two callbacks, and makes good sense here; if a particular source subpopulation’s
pollen grains have a high probability of being incompatible with the female flowers of the target
subpopulation, that source subpopulation should be underrepresented in gene flow into the target
subpopulation, compared to the expected gene flow based upon pollen flow alone.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

165

A final consideration, if none of the previous issues decides the question, might be speed.
Unfortunately, mateChoice() callbacks tend to be quite slow. In computer science parlance, they
are generally O(N), meaning that the computation time taken by a mateChoice() callback is
proportional to the number of individuals in the subpopulation; this is because each individual
must be evaluated to produce a mating weight. On the other hand, modifyChild() callbacks are
generally O(1), meaning they take a constant amount of time regardless of population size; this is
because only the product of one mating event is considered by the callback. In practice, this
algorithmic speed difference can result in a very large difference to the running speed of a model.
Usually, however, the previous considerations will dictate the choice of implementation.
Let’s build the model in two steps, beginning without the modifyChild() callback:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", 0.0); // S-locus mutations
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElementType("g2", m2, 1.0);
initializeGenomicElement(g1, 0, 20000);
initializeGenomicElement(g2, 20001, 21000);
initializeGenomicElement(g1, 21001, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
10000 late() {
cat("m1 mutation count: " + sim.countOfMutationsOfType(m1) + "\n");
cat("m2 mutation count: " + sim.countOfMutationsOfType(m2) + "\n");
}

We have two mutation types, both neutral, and a chromosome that uses mutation type m1 over
most of its length (with g1) but mutation type m2 within a small (1000-bp) locus (with g2). That
small locus is of course the S-locus, but without the mateChoice() callback it acts identically to the
rest of the chromosome. Since there is not yet any code to enforce a difference between the Slocus and the rest of the chromosome, this recipe is, at this point, simply a model of neutral drift.
In the generation 10000 event, it outputs two metrics prior to termination: the number of
mutations of type m1 and of type m2. These serve as a quick-and-dirty measure of genetic diversity,
both across the bulk of the chromosome (the m1 count) and within the S-locus (the m2 count). A
typical test run of this recipe produces something like:
m1 mutation count: 124
m2 mutation count: 1

You can run the model a bunch of times and confirm that there is not a whole lot of variation
around these numbers; the m1 count is typically 100–200, the m2 count typically 0–5.
It is also interesting to look at the graphical representation of the chromosome in SLiMgui.
Selecting a subrange of the chromosome, so as to zoom in on just one section, a typical run of the
recipe so far looks like:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

166

This view has display of genomic elements turned on, with the G button, so that the location of
the S-locus is clear. The two things to note here are that the distribution of neutral mutations is
fairly sparse, and that the distribution of neutral mutations inside versus outside the S-locus is
comparable.
Now let’s add in the modifyChild() callback that implements the gametophytic selfincompatibility system:
modifyChild(p1) {
pollenSMuts = childGenome2.mutationsOfType(m2);
styleSMuts1 = parent1Genome1.mutationsOfType(m2);
styleSMuts2 = parent1Genome2.mutationsOfType(m2);
if (identical(pollenSMuts, styleSMuts1))
if (runif(1) < 0.99)
return F;
if (identical(pollenSMuts, styleSMuts2))
if (runif(1) < 0.99)
return F;
return T;
}

First, the callback gets a vector of all of the S-locus mutations present in the pollen grain. The
variable is defined by SLiM within modifyChild() callbacks, and represents the
gamete produced by the second parent in the mating; see section 22.4 for details. The result,
pollenSMuts, represents the S-locus allele expressed by the pollen grain. All mutations at the Slocus are considered, in this model, to change the expressed S-allele; it would be trivial to add in
the possibility of neutral mutations at the S-locus as well, by adding in a probability of type m1
mutations in genomic element type g2.
Next, the callback gets the two S-locus alleles of the female flower by fetching the vector of
mutations of type m2 from parent1Genome1 and parent1Genome2, the two homologous genomes of
the first parent (again, defined by SLiM in these callbacks, and documented in section 22.4).
Next comes the crucial step at which the growth of the pollen tube is stopped if there is an
incompatibility between the S-allele of the pollen and either of the S-alleles of the female flower.
This is checked using the Eidos function identical(), which checks whether its two arguments are
exactly identical. We don’t need to worry about non-identicality due to mutations being listed in a
different order in the vectors, because Genome keeps its list of mutations in sorted order, and
mutationsOfType() returns a sorted subset of that list.
If the pollen S-allele is identical to either of the female flower’s S-alleles, there is a 99%
probability (in this model) of the pollen tube being blocked, as implemented by the test
(runif(1) < 0.99). Returning F in these cases tells SLiM to suppress generation of the proposed
child altogether; this is, conceptually, the stoppage of the pollen tube. It is important to guarantee
that a modifyChild() callback will never suppress 100% of all proposed children, for any state that
your model might reach; if that ever happens, SLiM will hang, stuck in an infinite loop generating
an infinite succession of proposed children that all get suppressed. In this case, the model would
not quite hang without the runif() test, since eventually SLiM would manage to generate pollen
grains that all happened to have a mutation within the S-locus that made them compatible with the
available female flowers; but it would take an awfully long time. Indeed, even at 99% the first
generation can take quite a while to finish, and many of the individuals in the second generation
will have an S-locus mutation because of the effective imposition of gametophytic selfincompatibility in a single generation. A more realistic model might perhaps “phase in” the
gametophytic self-incompatibility slowly over some thousands of generations, reflecting the
gradual evolution of an increasingly strong mechanism, by making the threshold against which
childGenome2

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

167

is compared depend upon sim.generation. This is left as an exercise for the reader; it
should not change the final state of the model at equilibrium (although equilibrium might take a
lot more than the 10000 generations we use here).
The final line simply returns T, indicating that since the pollen grain was compatible with the
female flower, the proposed child can be generated.
The full recipe, for the record:
runif()

initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", 0.0); // S-locus mutations
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElementType("g2", m2, 1.0);
initializeGenomicElement(g1, 0, 20000);
initializeGenomicElement(g2, 20001, 21000);
initializeGenomicElement(g1, 21001, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
10000 late() {
cat("m1 mutation count: " + sim.countOfMutationsOfType(m1) + "\n");
cat("m2 mutation count: " + sim.countOfMutationsOfType(m2) + "\n");
}
modifyChild(p1) {
pollenSMuts = childGenome2.mutationsOfType(m2);
styleSMuts1 = parent1Genome1.mutationsOfType(m2);
styleSMuts2 = parent1Genome2.mutationsOfType(m2);
if (identical(pollenSMuts, styleSMuts1))
if (runif(1) < 0.99)
return F;
if (identical(pollenSMuts, styleSMuts2))
if (runif(1) < 0.99)
return F;
return T;
}

If you run the full recipe, you should get output something like:
m1 mutation count: 582
m2 mutation count: 55

There is vastly more genetic diversity now, both within the S-locus (mutation type m2) and across
the whole chromosome (type m1). Gametophytic self-incompatibility basically imposes a regime of
balancing selection (i.e., negative frequency-dependent selection) at the S-locus, and that regime
tends to preserve allelic diversity at the locus (see section 9.4.1 for a different model of negative
frequency-dependent selection). It also increases the outcrossing rate in the population,
particularly when there are relatively few S-alleles (when there are many S-alleles, it would mostly
tend to diminish selfing, since most non-selfing mating pairs would possess different S-alleles
anyway; it would be trivial, of course, to add selfing to this model to experiment with that
additional nuance, as seen in section 6.3.1).
Examining the same subsection of the chromosome as before, in SLiMgui, it now looks like this:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

168

Compare this to the previous snapshot, and the increase in genetic diversity across the
chromosome is immediately apparent. It is also striking how much more diversity is contained
within the S-locus itself (due to the balancing selection there) compared to the rest of the
chromosome. Note that in the implementation of this model, the number of S-alleles is not just
the number of different mutations within the S-locus that are circulating in the population.
Instead, every unique haplotype within the S-locus as a whole represents a different S-allele, and
new S-alleles can be generated in this model by recombination as well as by mutation. Other
schemes for evaluating what an S-allele “really” is, based upon the mutations present at the Slocus, could of course be implemented using a different test than identical() in the
modifyChild() callback.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

169

12. Direct child modifications
Thus far we have seen two ways of modifying SLiM’s basic Wright-Fisher model: fitness()
callbacks that modify the default fitness of mutations (chapter 9), and mateChoice() callbacks that
alter the default behavior of random mating (chapter 11). In this chapter we will look at a third
type of modification, modifyChild() callbacks. These allow you to modify children generated by
SLiM (by adding or removing mutations, for example), or to suppress particular children entirely
(see section 1.5 for a full technical discussion). Here, we will look at how to use a modifyChild()
callback to solve three specific problems: social learning of cultural traits that modify fitness, lethal
epistasis, and simulating a “gene drive” based upon CRISPR/Cas9. Note that section 11.3 also
used a modifyChild() callback, to implement a gametophytic self-incompatibility system.
12.1 Social learning of cultural traits
In section 9.4.3 we explored a simple model of a cultural trait; the tag value of Individual was
used to track whether each individual was a milk-drinker or not, and a fitness() callback was
used to make mutations promoting the production of lactase into adulthood beneficial for milkdrinkers but neutral for non-milk-drinkers. In that model, whether a given individual was a milkdrinker or not was determined at random, in a late() callback that tagged all offspring with their
assigned cultural group. We will now take up that model again and extend it to be a model of
social learning: individuals will tend to inherit milk-drinking from their parents, although a
substantial random factor in the cultural assignment will be retained. To achieve this, instead of
assigning the cultural group in a late() event, we will do it in a modifyChild() callback, so that
we have the information we need regarding the culture of the parents. The recipe:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
// neutral
initializeMutationType("m2", 0.5, "f", 0.1);
// lactase-promoting
m2.convertToSubstitution = F;
initializeGenomicElementType("g1", c(m1,m2), c(0.99,0.01));
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 1000);
p1.individuals.tag = rbinom(1000, 1, 0.5);
}
modifyChild() {
parentCulture = (parent1.tag + parent2.tag) / 2;
childCulture = rbinom(1, 1, 0.1 + 0.8 * parentCulture);
child.tag = childCulture;
return T;
}
fitness(m2) {
if (individual.tag == 0)
return 1.0;
// neutral for non-milk-drinkers
else
return relFitness; // beneficial for milk-drinkers
}
10000 { sim.simulationFinished(); }

The setup of the model is similar to its predecessor in section 9.4.3: mutation type m2 is used to
represent alleles that promote retention of lactase production (and removal of fixed m2 mutations is
prevented with convertToSubstitution), tag values of 1 are used to indicate milk-drinkers, and
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

170

the same fitness() callback as in section 9.4.3 produces a differential fitness effect of m2
mutations depending upon the culture of the individual. What has changed is the way that the tag
values are maintained. The late() event has been removed. We now assign initial tag values in
generation 1, immediately after creating subpopulation p1. Thenceforth, the tag values of new
individuals are assigned in the modifyChild() callback:
modifyChild() {
parentCulture = (parent1.tag + parent2.tag) / 2;
childCulture = rbinom(1, 1, 0.1 + 0.8 * parentCulture);
child.tag = childCulture;
return T;
}

This callback is called once for every new offspring individual created, throughout the run of
the model. Its first line gets the tag values of the two parents (using parent1 and parent2 variables
that are set up for all modifyChild() callbacks; see section 22.4 for a complete list of these
variables), and averages them to get a parentCulture value that will be 0, 0.5, or 1. The next line
determines what the child’s culture will be (0 or 1) using rbinom() to draw from a binomial
distribution; the trial success probability used for the draw depends on parentCulture in such a
way as to make children tend to follow the culture of their parents, but with a 10% chance of
deviating even if the parents share the same culture. To make this cultural determination take
effect, childCulture is assigned into the tag property of the child (using the child variable defined
for the callback by SLiM). Finally, a value of T is returned to indicate that the child should be
generated; it is possible for a modifyChild() callback to suppress the generation of some children
by returning F.
In the original model of section 9.4.3, the fraction of milk-drinkers remained about 0.5,
because assignment into a cultural group was random. In this model, in contrast, the fraction of
milk-drinkers can increase over time as individuals learn milk-drinking from their parents; the
fitness benefit of milk-drinking increases as more m2 mutations sweep. It is easy to observe this in
SLiMgui by adding a custom output event:
{
if (sim.generation % 100 == 0)
cat(sim.generation + ": " + mean(p1.individuals.tag) + "\n");
}

This prints the fraction of milk-drinkers in the population, every 100 generations. Running this
model produces output that shows the progressive increase in milk-drinking:
100:
200:
300:
400:
500:
...

0.465
0.567
0.552
0.681
0.735

However, there will always be a minimum of about 10% non-milk-drinkers in the population,
because of the element of chance provided by the binomial draw in the modifyChild() callback.
It would be possible to modify that to allow a specific culture to fix in the population. On the
other hand, one could also model the fact that milk-drinkers who do not retain lactase production
into adulthood will typically suffer from symptoms of lactose intolerance, making milk-drinking
disadvantageous in some circumstances. One could also model a slight deleterious effect of
lactase retention genes among non-milk-drinkers, since they are devoting energy to producing an
enzyme that they do not use. There is always more complexity to be added; but as it stands, this
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

171

model shows the coevolution of both genetic and cultural factors related to milk-drinking, using
tag values in combination with a modifyChild() callback to model social learning in SLiM.
12.2 Lethal epistasis
In section 9.3.1 we saw a model of epistasis that influenced the fitness of the epistatic alleles
using a fitness() callback. Sometimes the situation is simpler than the scenario presented in that
model: sometimes the fitness of individuals carrying both epistatic alleles is zero, making the
epistatic interaction lethal. In this case, a modifyChild() callback is well-suited to the task, since
it can suppress the generation of particular offspring depending upon their genetics (or other
factors). In this case, we will model two introduced mutations, A and B, which are normally
beneficial, but are lethal when they occur in the same offspring:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", 0.5); // mutation A
m2.convertToSubstitution = F;
initializeMutationType("m3", 0.5, "f", 0.5); // mutation B
m3.convertToSubstitution = F;
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 500);
}
1 late() {
sample(p1.genomes, 20).addNewDrawnMutation(m2, 10000); // add A
sample(p1.genomes, 20).addNewDrawnMutation(m3, 20000); // add B
}
modifyChild() {
childGenomes = c(childGenome1, childGenome2);
hasMutA = any(childGenomes.countOfMutationsOfType(m2) > 0);
hasMutB = any(childGenomes.countOfMutationsOfType(m3) > 0);
if (hasMutA & hasMutB)
return F;
return T;
}
10000 { sim.simulationFinished(); }

The mechanics here for the introduction of mutations A and B follow the pattern we have seen
in other recipes: a target vector of genomes is chosen with sample(), and then a mutation is added
to the target genomes with addNewDrawnMutation(). Unlike most previous recipes, however, here
we sample 20 genomes, not just one, in order to add the new mutation to multiple individuals.
Because addNewDrawnMutation() is a class method of Genome, not an instance method, a single
new mutation is added to all of the target genomes, rather than a different new mutation being
added to each target genome as a result of multicasting; see section 9.4.4 for discussion of the
mechanics of this. No effort is made here to avoid adding A and B to the same individuals, but that
would be trivial to add by refining the choice of targets for B. The mutation types for A and B, m2
and m3, are set not to convert fixed mutations to substitutions, so that their epistatic interaction
persists even after fixation; see section 9.3.1 for extensive discussion of this.
The new and interesting behavior is in the modifyChild() callback. First it determines whether
the proposed child possesses mutation A and mutation B, setting up logical flags hasMutA and
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

172

by checking whether the count of mutations of the relevant type is greater than zero in
either of the child’s genomes. Then, if the child has both A and B, it returns F, indicating that this
child should be suppressed. New parents will be chosen and a new child will be generated
instead (also subject to the approval of the modifyChild() callback). Otherwise it returns T,
indicating the generation of the child should proceed.
When this model is run, either A or B quickly “wins”, fixing while its epistatic competitor is lost.
If the modifyChild() callback is removed, on the other hand, both A and B will typically fix. We
can try an interesting variant by making the epistatic interaction lethal only if A and B are both
homozygous, by replacing the previous modifyChild() callback with a new version:
hasMutB

modifyChild() {
childGenomes = c(childGenome1, childGenome2);
mutACount = sum(childGenomes.countOfMutationsOfType(m2));
mutBCount = sum(childGenomes.countOfMutationsOfType(m3));
if ((mutACount == 2) & (mutBCount == 2))
return F;
return T;
}

This results in a form of balancing selection between A and B, since now homozygosity is
disadvantageous but heterozygosity is advantageous. Both will tend to fluctuate around a
frequency of 0.5, although one may stochastically manage to fix eventually (in which case the
other will immediately be lost).
This type of model could also be used to represent an epistatic interaction during development
that is lethal in some fraction of cases, but is otherwise harmless – either the epistatic interaction
causes a lethal anomaly during development, or it doesn’t, in which case the offspring is normal.
This could be modeled simply by adding a random factor to the if statement in the modifyChild()
callback.
12.3 Simulating gene drive
There has recently been a lot of buzz about a new genetic-engineering technology called
CRISPER/Cas9 that allows genetic modifications to be performed much more quickly and easily
than previous methods (Doudna & Charpentier 2014). One potential application of the CRISPER/
Cas9 machinery is to use it for the construction of a so-called gene drive, which can quickly drive
genetically modified alleles to high frequency in a population even if they carry a fitness cost. We
will here refer to a CRISPR/Cas9-based gene drive with the term mutagenic chain reaction, or
MCR.
The basic idea of MCR is that the CRISPR/Cas9 machinery embedded into the organism’s
genome could cause the machinery itself to be spliced into any homologous chromosome that
does not already contain it. If a fertilized egg ends up with one copy of the MCR machinery,
inherited from one parent, the machinery could then splice itself into the homologous
chromosome in the egg, changing the egg from being heterozygous to homozygous for the MCR
locus. This is in some ways similar to the idea of meiotic drive and related concepts in
evolutionary biology, and it clearly underlines the fundamental truth of the “selfish gene”
perspective: an MCR gene of this type could, as we shall see, spread to fixation in a population
even if it has a deleterious fitness effect at the level of the individual organism.
Just for fun, we’re going to make this model relatively complex in terms of population structure,
and we’re going to introduce spatial variation in selection using a fitness() callback as well.
We’ll build this model one step at a time, starting with the demographic structure:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

173

initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", -0.1);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
for (i in 0:5)
sim.addSubpop(i, 500);
for (i in 1:5)
sim.subpopulations[i].setMigrationRates(i-1,
for (i in 0:4)
sim.subpopulations[i].setMigrationRates(i+1,
}
10000 { sim.simulationFinished(); }

// neutral
// MCR complex

0.001);
0.1);

This sets up a six-subpopulation stepping-stone model, similar to that previously discussed in
section 5.3.1. Note that migration in this model is primarily from p5 down to p0; the migration
rate in that direction is a hundred times higher than in the direction from p0 up to p5:
p0

p5

p1

p4

p2

p3

The code above sets up a mutation type m2 for the MCR complex, but doesn’t use it, so at
present this is just a simulation of neutral drift in a stepping-stone model. Let’s start using mutation
type m2, although not yet with its planned MCR capabilities, by adding a little code to introduce an
m2 mutation and track its fate (the tracking block can replace the final script block in the previous
model):
100 late() {
p0.genomes[0:49].addNewDrawnMutation(m2, 10000);
}
100:10000 late() {
if (sim.countOfMutationsOfType(m2) == 0)
{
fixed = any(sim.substitutions.mutationType == m2);
cat(ifelse(fixed, "FIXED\n", "LOST\n"));
sim.simulationFinished();
}
}

The introduction code here simply introduces the mutation into the first 50 genomes of
subpopulation p0, without bothering to randomly select target individuals using sample().
Introducing the mutation into many genomes at once is a quick-and-dirty trick to help an
introduced mutation avoid being lost due to genetic drift at the earliest stages of establishment,
without putting in the machinery to make the simulation truly conditional on establishment as we
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

174

did in section 10.3. Biologically, it could be viewed as a large-scale immigration event, or as a
planned introduction of mutant individuals orchestrated by humans. See section 9.4.4 for
discussion of the Eidos mechanics underlying this mutation introduction code.
This model will generally terminate, perhaps around generation 140, with the message "LOST".
This is because mutation type m2 is strongly deleterious, as presently defined, and so gets
eliminated by selection fairly quickly even though it is initially introduced in fifty copies. Since
this is not very interesting, let’s move on to the next step: making mutation type m2 strongly
beneficial in subpopulation p0 and strongly deleterious in subpopulation p5, with gradations in
between, using a fitness() callback:
fitness(m2) {
return 1.5 - subpop.id * 0.15;
}

This fitness() callback uses subpop.id, the identifier of the subpopulation in which the
mutation is presently being evaluated, to generate a fitness of 1.5 for p0, but a fitness of 0.75 for
p5. If we run the model now, we see that it soon reaches an equilibrium state of migrationselection balance:
p0

p5

p1

p4

p2

p3

It is at high frequency in p0 – nearly fixed, but prevented from fixing completely by gene flow
from p1. It is essentially absent from p5, on the other hand, since it is weeded out by selection,
and since gene flow in the p0 to p5 direction is so weak. The model will tend to maintain this
dynamic equilibrium.
Let’s add the modifyChild() callback that implements the MCR behavior of m2:
100:10000 modifyChild() {
mut = sim.mutationsOfType(m2);
if (size(mut) == 1)
{
hasMutOnChromosome1 = childGenome1.containsMutations(mut);
hasMutOnChromosome2 = childGenome2.containsMutations(mut);
if (hasMutOnChromosome1 & !hasMutOnChromosome2)
childGenome2.addMutations(mut);
else if (hasMutOnChromosome2 & !hasMutOnChromosome1)
childGenome1.addMutations(mut);
}
return T;
}

The first line just finds the introduced mutation by searching for it in the list of mutations kept by
the simulation; this is necessary because SLiM and Eidos don’t allow you to keep permanent
references to objects. The if statement tests whether we found the MCR mutation; in the final
generation of the simulation, when the mutation has either fixed or been lost, we won’t find it.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

175

Assuming we do find the MCR mutation, we then check whether the particular child that we are
working with – the target of this modifyChild() operation – has the MCR mutation in either of its
genomes, using the containsMutations() method of Genome. With that information, the rest is
simple: if it is in one genome but not the other, we add the MCR mutation to the genome that
doesn’t already contain it, just as the CRISPR/Cas9 gene drive machinery for MCR would do in an
actual organism. If neither genome of the target contains the MCR gene, or if both genomes do,
we leave the child as it is. Finally, we return T, indicating that the child in question should in fact
be generated (a return of F would suppress generation; we saw an example of this in section 12.2).
If we recycle and run this model, it shows a consistent behavior of rapid fixation, even in the
subpopulations where it is deleterious, and even despite having to “swim upstream” against the
predominant direction of gene flow. At the moment just prior to complete fixation, the model
looks something like this (note that once the mutation actually fixes, it is removed from the
simulation and replaced by a substitution object, so the populations revert to yellow):
p0

p5

p1

p4

p2

p3

Incidentally, in snapshots like the one above, we’ve been using the Population Visualization
graph to see the state of the model, because it is a bit more informative than the default population
view, which doesn’t show population connectivity:

This display shows every individual in the model, colored according to its fitness. There are two
other display modes for the population view that can be useful, both of which summarize that
fitness distribution by binning the individual fitness values. These alternative display modes can be
selected with a right-click or control-click on the population view.
One alternative is to see line plots of binned fitness for each subpopulation, superimposed:

The x-axis represents fitness (from 0.0 to 2.0 in this case, but this is configurable in the options
panel that appears when this display mode is chosen). Fitness values are tallied into 50 bins (also
configurable), and the y-axis represents the frequency within each bin, from 0.0 to 1.0 within each
subpopulation. For this model, the resulting plot looks a bit like modern art, but the six peaks
correspond to the six subpopulations; the different fitness effects in the different subpopulations is
clearly visible, as is the fact that the mutation is about to fix.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

176

Another possibility is to see a histogram of binned fitness values across all of the
subpopulations. This is similar to the line plot of fitness frequencies above, except that all of the
subpopulations are tallied together, not individually:

Again the x-axis represents fitness and the y-axis represents frequency within a bin (now from
0.0 to 1.0 population-wide). The six subpopulations can again be seen here.
These alternative population visualizations can be a useful tool in understanding model
dynamics, especially for models with many subpopulations and/or many individuals.
For reference, here is the full gene drive model with all of the components introduced above:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
// neutral
initializeMutationType("m2", 0.5, "f", -0.1); // MCR complex
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
for (i in 0:5)
sim.addSubpop(i, 500);
for (i in 1:5)
sim.subpopulations[i].setMigrationRates(i-1, 0.001);
for (i in 0:4)
sim.subpopulations[i].setMigrationRates(i+1, 0.1);
}
100 late() {
p0.genomes[0:49].addNewDrawnMutation(m2, 10000);
}
100:10000 late() {
if (sim.countOfMutationsOfType(m2) == 0) {
fixed = any(sim.substitutions.mutationType == m2);
cat(ifelse(fixed, "FIXED\n", "LOST\n"));
sim.simulationFinished();
}
}
fitness(m2) {
return 1.5 - subpop.id * 0.15;
}
100:10000 modifyChild() {
mut = sim.mutationsOfType(m2);
if (size(mut) == 1) {
hasMutOnChromosome1 = childGenome1.containsMutations(mut);
hasMutOnChromosome2 = childGenome2.containsMutations(mut);
if (hasMutOnChromosome1 & !hasMutOnChromosome2)
childGenome2.addMutations(mut);
else if (hasMutOnChromosome2 & !hasMutOnChromosome1)
childGenome1.addMutations(mut);
}
return T;
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

177

It has been proposed that MCR could have many compelling applications, such as driving
particular mosquito species – those that carry important human diseases such as malaria – toward
extinction, or at least driving a potentially deleterious gene for resistance to diseases like malaria
toward fixation in those mosquito species. To know whether such proposals are realistic, however,
we need to model them. In this section we developed a simple toy model of MCR; a model useful
for real-world prediction would need many additional ramifications, of course, but this is a starting
point for further exploration.
12.4 Suppressing hermaphroditic selfing
In section 6.3.1, it was noted that a low rate of selfing will normally be observed in SLiM in
hermaphroditic models, even when the selfing rate is explicitly set to zero. This occurs because
SLiM chooses each of the parents in a biparental mating randomly (weighted according to fitness),
and does not explicitly prevent the same individual from being chosen as both parents. Normally
this does not present a problem; it is typically a very small effect, and indeed it is sometimes
desirable since the model will then better match the predictions from some simple analytical
models. Sometimes, however, this selfing does prove to be an issue (particularly with small
effective population sizes or high variance in fitness). In such cases, it can be prevented with a
call at the beginning of the initialize() callback of your script:
initializeSLiMOptions(preventIncidentalSelfing=T);

Before this option was added to SLiM, preventing incidental selfing required a simple
callback, illustrated by this section’s recipe. This recipe is now obsolete, since it
has been superseded by the configuration flag shown above; but it has been retained in the
cookbook as an illustration of how modifyChild() callbacks can be used to perform simple
changes to mating behavior of this sort. The original, now-obsolete recipe used this callback:
modifyChild()

modifyChild()
{
// prevent hermaphroditic selfing
if (parent1 == parent2)
return F;
return T;
}

This can simply be dropped in to more or less any hermaphroditic model, such as this:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
2000 late() { sim.outputFixedMutations(); }

It will then suppress the selfing events, by returning F (and thus suppressing the proposed child)
whenever the two parents of the proposed child are the same. This could be implemented as a
mateChoice() callback instead, by changing the weight for the already-chosen first parent to 0, but
that would be much slower since a modified mating-weights vector would have to be built for
each mating event. Often, as here, suppressing specific mating combinations is most easily and
efficiently done with a modifyChild() callback instead.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

178

There are two caveats. The first is that if you turn selfing on in your model this modifyChild()
callback will then cause your model to hang, because SLiM will keep trying to satisfy your
requested selfing rate, whereas the callback will keep preventing it from doing so. If you need to
attain exactly the requested selfing rate, however, while suppressing the background selfing events
generated randomly by SLiM, you can just add a check of the isSelfing flag provided to the
modifyChild() callback (see section 22.4). For selfing events that are intended by SLiM to satisfy
the requested selfing rate, this flag will be T; for any additional background selfing events caused
by random mate choice, this flag will be F. Suppressing only the selfing events in which
isSelfing is F ought to produce the desired effect. (You could also do a bit of math and set an
adjusted selfing rate that accounts for the background selfing rate, but that is perhaps more errorprone.)
The other caveat is that if you define other modifyChild() callbacks as well, you might want to
think about how the multiple callbacks will stack together. Typically you would want this selfingsuppression callback to occur first in the chain, and thus you would want to define it earliest in
your script. See section 22.8 for discussion of this issue.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

179

13. Advanced models
This chapter will present some advanced models that draw upon many of the concepts covered
in previous chapters, while also using relatively advanced features of Eidos and SLiM that may not
have been covered in previous recipes. A knowledge of the topics covered in previous chapters
will be assumed, and simple details will not be explained in depth, to keep things short.
13.1 Quantitative genetics and phenotypically-based fitness
In this recipe we will explore a quantitative genetics model in which a phenotypic trait is based
upon multiple loci with additive effects. In some respects this is similar to the model of polygenic
selection developed in section 9.3.2; however, this model goes much further, explicitly modeling
the phenotypic effect of the quantitative trait and producing a fitness effect based upon that
phenotype. This recipe will model a trait based on 10 quantitative trait loci (QTLs), each of which
can have a value of either −1 or +1; this is a common design for models of quantitative traits, but
the model does not strongly depend upon this choice. (For a quantitative genetics model that uses
QTLs with continuous effects and incorporates heritability, see section 13.10.)
This model also incorporates two-subpopulation structure, with limited gene flow between the
subpopulations. The two subpopulations experience different environments that are selecting for
different optima; unlike the model in section 9.2, however, the two environments are selecting for
different phenotypes, as determined by all of the loci underlying the trait, rather than just exerting
differential selection on each individual locus.
This model also includes assortative mating. Unlike the model of section 11.1, where mating
was assortative based upon possession of a single introduced mutation, here mating is assortative
based upon the phenotype produced by all the underlying loci. This means that mating can
actually be disassortative at the genetic level, because individuals with the same phenotype might
achieve that phenotype through entirely different QTLs alleles.
The genetic structure used here simulates ten separate chromosomes, with one QTL on each
chromosome. SLiM simulates only one Chromosome object, but since the recombination map can
be specified arbitrarily, a recombination rate of 0.5 between specific pairs of bases can be used to
effectively subdivide that Chromosome object into separate chromosomes that have no linkage
between them. This model places each QTL at the center of its chromosome, with 1000 bases on
each side experiencing neutral mutations, but those choices are easily changed.
Finally, relatively complex custom output code in this model assesses and prints information
about the genetic structure and fitness of each subpopulation at the beginning and end of the
simulation, illustrating how SLiM’s output can be customized.
We will start with the initialization portion of this model’s script:
initialize() {
initializeMutationRate(1e-5);
// neutral mutations in non-coding regions
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
// mutations representing alleles in QTLs
scriptForQTLs = "if (runif(1) < 0.5) -1; else 1;";
initializeMutationType("m2", 0.5, "s", scriptForQTLs);
initializeGenomicElementType("g2", m2, 1.0);
m2.convertToSubstitution = F;
m2.mutationStackPolicy = "l";

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

180

// set up our chromosome: 10 QTLs, surrounded by neutral regions
defineConstant("C", 10);
// number of QTLs
defineConstant("W", 1000); // size of neutral buffer on each side
pos = 0;
q = NULL;
for (i in 1:C)
{
initializeGenomicElement(g1, pos, pos + W-1);
pos = pos + W;
initializeGenomicElement(g2, pos, pos);
q = c(q, pos);
pos = pos + 1;
initializeGenomicElement(g1, pos, pos + W-1);
pos = pos + W;
}
defineConstant("Q", q);

// remember our QTL positions

// we want the QTLs to be unlinked; build a recombination map for that
rates = c(rep(c(1e-8, 0.5), C-1), 1e-8);
ends = (repEach(Q + W, 2) + rep(c(0,1), C))[0:(C*2 - 2)];
initializeRecombinationRate(rates, ends);
}

This sets up two mutation types: m1, representing neutral mutations, and m2, representing
mutations at QTLs. The m2 mutation type draws its mutational effects from a user-specified
distribution, rather than from one of SLiM’s built-in mutational distributions. It does this by
specifying the mutation type as "s", for “script”, and then supplying a short Eidos snippet as a
string. That snippet is run as a lambda (see section 17.5) by SLiM whenever it needs to draw a
new selection coefficient. The m2 mutation type is also set not to be removed when it fixes
(because QTLs will continue to influence phenotype, and thus relative fitness, even after fixation);
and it is set to use a “last mutation” stacking policy, so that when a new mutation occurs at a given
QTL it replaces the allele that was previously at that site, rather than “stacking” with it as is SLiM’s
default behavior.
Two genomic element types are also set up by this code: g1, representing neutral buffer zones
around the QTLs, and g2, representing QTLs themselves. The rest of this initialize() callback
sets up the chromosome by tiling these genomic element types, and setting up a recombination
map that effectively places each QTL on an independent chromosome as explained above.
This model makes somewhat liberal use of the defineConstant() call of Eidos to set up
constants related to the genomic structure of the simulation. Defined constants are much like
variables, except that they cannot be redefined (except by removing them with rm() first), and they
persist for the lifetime of the simulation rather than disappearing at the end of the callback in
which they are defined. They are thus very useful for symbolically representing model parameters;
this model also defines a constant Q to remember the positions of all of the QTLs being simulated.
After this initialization() callback has run, the genetic structure of the simulation looks like
this in SLiMgui (with display of genomic elements and recombination rates enabled):

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

181

This shows the pattern of non-coding regions and QTLs (below), and the recombination
breakpoints that define the breaks between effective chromosomes in the model (above).
Next we will set up (most of) the initial population state of the simulation:
1 early() {
sim.addSubpop("p1", 500);
sim.addSubpop("p2", 500);
// set up migration; comment these out for zero gene flow
p1.setMigrationRates(p2, 0.01);
p2.setMigrationRates(p1, 0.01);
sim.registerEarlyEvent("s2", s1.source, 2, 2);
}
1 late() {
// optional: give m2 mutations to everyone, as standing variation
// if this is commented out, QTLs effectively start as 0
g = sim.subpopulations.genomes;
n = size(g);
for (q in Q)
{
isPlus = asLogical(rbinom(n, 1, 0.5));
g[isPlus].addNewMutation(m2, 1.0, q);
g[!isPlus].addNewMutation(m2, -1.0, q);
}
}

The early() callback sets up two subpopulations with migration between them. It also
contains a call to registerEarlyEvent() that will not run right now; if you want to run the model
at this stage you will need to comment that out. It will be explained below.
The late() callback is optional, and sets up the initial state of the QTLs in the model by placing
initial mutations in them. It does this by looping over all of the QTLs, and for each QTL, deciding
whether each genome in the simulation will start with a + or a − allele by drawing from a binomial
distribution. It then distributes the + and − alleles to all of the genomes by calling
addNewMutation() to add the chosen allele to each genome, in a twist on the pattern seen
previously in section 9.4.4 and elsewhere. This code results in just two mutation objects being
created to represent the + and − alleles of each QTL, rather than a different mutation object being
created for each genome, for reasons explained in detail in section 9.4.4; here this is a nonessential detail, but it improves the efficiency of the model since fewer mutation objects need to be
tracked. This whole callback can be removed, in which case the model starts with no QTL
mutations, producing an effective value of zero for each QTL until new mutations arise. That
would be a model of emerging genetic diversity from a clonal population, then, rather than this
model of selection beginning with a high degree of random standing genetic variation.
Next comes a bunch of callbacks that implement the quantitative trait machinery:
1: late() {
// construct phenotypes for the additive effects of QTLs
inds = sim.subpopulations.individuals;
inds.tag = asInteger(inds.sumOfMutationsOfType(m2));
}
fitness(m2) {
// the QTLs themselves are neutral; their effect is handled below
return 1.0;
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

182

fitness(NULL, p1) {
// optimum of +10
return 1.0 + dnorm(10.0 - individual.tag, 0.0, 5.0);
}
fitness(NULL, p2) {
// optimum of -10
return 1.0 + dnorm(-10.0 - individual.tag, 0.0, 5.0);
}

The late() callback is called in every generation to calculate the phenotypes of individuals by
summing up the additive effects of all QTLs possessed by each individual. We get a vector
containing all individuals in the model, and for each individual we sum up the selection
coefficients of all of the m2 mutations possessed by that individual using sumOfMutationsOfType()
method, and place the sums into the individuals’ tag properties with a vectorized assignment.
Note that this code would work with QTLs of any effect size, not just the +/− scheme used here,
except that the individual’s tag property is of type integer, which would lead to roundoff
problems with QTLs of fractional value. That could be avoided by using the tagF property, which
stores float values instead of integer values. Once that change was made, across the whole
recipe, the phenotype values would then be saved and retrieved as type float, so the distribution
of fitness effects could then be changed to any DFE desired (see section 13.10). This recipe uses a
+/− DFE and integer phenotypes, partly because that is common in theoretical multilocus models,
and partly for historical reasons. One could also use the getValue() and setValue() methods of
Individual to keep the phenotype state, rather than using tag or tagF (see section 21.6.2). In
other words, where this recipe saves away the calculated phenotypic value of an individual with
individual.tag = ..., one would use individual.setValue("phenotype", ...) instead, and
everywhere that this recipe gets a saved phenotypic value using individual.tag, one would use
individual.getValue("phenotype") instead. (The string identifier "phenotype" is not special; it
could just as well be "foo".) This would, however, be substantially slower than using tagF.
The first fitness() callback simply makes all m2 mutations neutral, regardless of their stated
selection coefficient. This is because we are using the selection coefficients of m2 mutations here
to represent their phenotypic effect size, in the sense of additive quantitative genetics, rather than
their actual selection coefficients; we therefore wish to disable their direct effect on fitness.
Finally, we have a pair of fitness() callbacks declared with a mutation type identifier of NULL:
one for subpopulation p1, one for p2. We haven’t seen this use of fitness() callbacks before; the
NULL identifier indicates that the callback is not intended to modify the fitness effects of mutations
of a particular mutation type, but rather, provides a fitness effect for the individual as a whole. For
this reason, fitness(NULL) callbacks are referred to as global fitness callbacks (see section 22.2).
They are called once for each individual in the specified subpopulation, and the fitness effect they
return is multiplied into all of the other fitness effects for the individual. The fitness effect of a
global fitness() callback might depend upon any state of the individual; here it depends on the
phenotype of the individual compared to the phenotypic optimum. Since m1 and m2 mutations are
all neutral in this model, these fitness(NULL) callbacks are the sole determinants of individual
fitness in this model. They implement fitness based on a Gaussian fitness function, as is typical in
models of this sort, with fitness being highest at some phenotypic optimum, and falling away for
phenotypic values both lower and higher than that optimum. The phenotypic values of individuals
are fetched out of their tag properties, which were set up by the late() event that ran previously.
Note that a baseline of 1.0 is added to the Gaussian value since 1.0 indicates neutrality on the
relative fitness scale; see section 13.10 for further discussion of this choice.
This implements the bulk of the model. We will now add assortative mate choice based on
phenotype, which is very simple since the underlying machinery has already been constructed:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

183

mateChoice() {
phenotype = asFloat(individual.tag);
others = asFloat(sourceSubpop.individuals.tag);
return weights * dnorm(others, phenotype, 5.0);
}

This multiplicatively combines the existing fitness-based weights for all potential mates with
mating weights based on a Gaussian function that rewards phenotypic similarity.
Incidentally, it might be of interest to note that now the quantitative trait is something
resembling a magic trait, since it is under divergent ecological selection and influences mate
choice. However, it is not in fact a magic trait, since it is based upon multiple loci. The individual
QTLs might be considered “magic genes” since they perhaps satisfy the criteria of the formal
definition (Servedio et al. 2011), but since there is epistasis between them (because of the way they
combine additively to determine the quantitative phenotype that is actually under selection), it is
not entirely clear how they relate to the usual intuitive meaning of “magic trait”.
Finally, we will add some custom output by tallying up fitnesses, phenotypes, and frequencies:
s1 2001 early() {
cat("-------------------------------\n");
cat("Output for end of generation " + (sim.generation - 1) + ":\n\n");
// Output population fitness values
cat("p1 mean fitness = " + mean(p1.cachedFitness(NULL)) + "\n");
cat("p2 mean fitness = " + mean(p2.cachedFitness(NULL)) + "\n");
// Output population additive QTL-based phenotypes
cat("p1 mean phenotype = " + mean(p1.individuals.tag) + "\n");
cat("p2 mean phenotype = " + mean(p2.individuals.tag) + "\n");
// Output frequencies of +1/-1 alleles at the QTLs
muts = sim.mutationsOfType(m2);
plus = muts[muts.selectionCoeff == 1.0];
minus = muts[muts.selectionCoeff == -1.0];
cat("\nOverall frequencies:\n\n");
for (q in Q)
{
qPlus = plus[plus.position == q];
qMinus = minus[minus.position == q];
pf = sum(sim.mutationFrequencies(NULL, qPlus));
mf = sum(sim.mutationFrequencies(NULL, qMinus));
pf1 = sum(sim.mutationFrequencies(p1, qPlus));
mf1 = sum(sim.mutationFrequencies(p1, qMinus));
pf2 = sum(sim.mutationFrequencies(p2, qPlus));
mf2 = sum(sim.mutationFrequencies(p2, qMinus));
cat("
cat("
cat("

QTL " + q + ": f(+) == " + pf + ", f(-) == " + mf + "\n");
in p1: f(+) == " + pf1 + ", f(-) == " + mf1 + "\n");
in p2: f(+) == " + pf2 + ", f(-) == " + mf2 + "\n\n");

}
}

This runs at the beginning of generation 2001, so as to produce output regarding the very end of
generation 2000, after fitness values have been calculated; calling cachedFitness() in a late()
event in generation 2000 would raise an error, since fitness values are not yet available at that point
in the generation cycle. Note that the purpose is now clear of the line we saw before:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

184

sim.registerEarlyEvent("s2", s1.source, 2, 2);

This output callback is script block s1, which that line of code replicates as an early() event in
generation 2, so that it produces output at both the beginning and the end of the model run. If
SLiM’s syntax for specifying the generations in which a callback runs were more general, this trick
would not be needed; but there is no way to say “run this callback only in generations 2 and
2001”, at present, so this is a simple way to achieve that.
Running this model produces output something like this:
Output for generation 1:
p1
p2
p1
p2

mean
mean
mean
mean

fitness =
fitness =
phenotype
phenotype

1.02108
1.0219
(optimum +10) = 0.176
(optimum -10) = -0.34

Overall frequencies:
QTL at 1000: f(+) == 0.5125, f(-) == 0.4875
in p1: f(+) == 0.527, f(-) == 0.473
in p2: f(+) == 0.498, f(-) == 0.502
...
------------------------------Output for generation 2000:
p1
p2
p1
p2

mean
mean
mean
mean

fitness =
fitness =
phenotype
phenotype

1.06406
1.06221
(optimum +10) = 8.296
(optimum -10) = -8.14

Overall frequencies:
QTL at 1000: f(+) == 0.504, f(-) == 0.496
in p1: f(+) == 0.848, f(-) == 0.152
in p2: f(+) == 0.16, f(-) == 0.84
...

It can be seen that in the initial state of the model both subpopulations have phenotypes near
fitnesses of essentially 1.0, and a random distribution of + and − QTL alleles. In the final
state, on the other hand, phenotypes have diverged most of the way to their local environmental
optima of +10 and −10, mean relative fitness is up to about 1.06 in both subpopulations as a result,
and substantial divergence at the level of individual QTL alleles can be observed. This divergence
appears to be substantially due to the assortative mating in the model, since a sample run with the
mateChoice() callback commented out results in considerably less divergence:
0.0,

p1
p2
p1
p2

mean
mean
mean
mean

fitness =
fitness =
phenotype
phenotype

1.04107
1.04752
(optimum +10) = 3.876
(optimum -10) = -5.584

Of course lots of replicate runs with different parameter values, etc., would be needed to
substantiate this; but it seems to agree with other models of assortative mating and divergence.
This model includes neutral diversity at loci surrounding each QTL; no attempt has been made
to analyze that here, but presumably there might be interesting patterns related to patterns of gene
flow, hitchhiking, and so forth. Increasing the length of the neutral buffers around the QTL
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

185

(defined constant W), and/or increasing the recombination rate, would probably be helpful for
making these patterns more clear.
Section 13.10 presents a fairly different quantitative genetics model, using QTLs with effect
sizes drawn from a continuous distribution, and incorporating heritability, but with only a single
subpopulation, with no predetermined genetic structure, and without assortative mating.
13.2 Relatedness, inbreeding, and heterozygosity
Inbreeding is an important concept in evolutionary biology, but it is not very precisely defined.
It can result from a variety of processes, from small population size to assortative mating, and it
can manifest in a variety of genetic patterns, such as decreased heterozygosity, loss of genetic
diversity, and a decreased time to the most recent common ancestor of pairs of individuals. The
particular effects of inbreeding observed in a system may depend on the process generating the
inbreeding. So before embarking on a study of inbreeding, one should define one’s terms clearly.
Here, we will construct a model of inbreeding that results from a tendency of individuals to
mate with their close kin, as defined by their pedigree-based relatedness. SLiM (beginning in
version 2.1) has a built-in facility for tracking this type of relatedness, which can be turned on with
the call initializeSLiMOptions(keepPedigrees=T) in the initialize() callback of a script (see
section 21.1). When this call is made, individuals in a simulation will keep track of the identities
of their parents and grandparents, and the relatedness between individuals can then be assessed
using the relatedness() method of Individual (see section 21.6.2). The pedigree information is
also available through properties on the Individual class (section 21.6.1), for purposes such as
locating “trios” (two parents and an offspring that they generated) for analysis.
It should be emphasized that the relatedness metric available through this mechanism is purely
pedigree-based. This can be quite different from genetic relatedness, which depends not only on
pedigree but also on factors such as assortment and recombination (see section 13.9). The
inbreeding generated by this model will be based upon this relatedness metric.
With that as preamble, here is the first stage of the model:
initialize() {
initializeSLiMOptions(keepPedigrees = T);
initializeMutationRate(1e-5);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-7);
}
1 {
sim.addSubpop("p1", 100);
}
1000 late() {
// Calculate mean nucleotide heterozygosity across the population
total = 0.0;
for (ind in p1.individuals)
{
// Calculate the nucleotide heterozygosity of this individual
muts0 = ind.genomes[0].mutations;
muts1 = ind.genomes[1].mutations;
// Count the shared mutations
shared_count = sum(match(muts0, muts1) >= 0);

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

186

// All remaining mutations are unshared (i.e. heterozygous)
unshared_count = muts0.size() + muts1.size() - 2 * shared_count;
// pi is the mean heterozygosity across the chromosome
pi_ind = unshared_count / (sim.chromosome.lastPosition + 1);
total = total + pi_ind;
}
pi = total / p1.individuals.size();
cat("Mean nucleotide heterozygosity = " + pi + "\n");
}

This is a simple neutral model, similar to those we have seen before. It turns on pedigree
tracking in its initialize() callback, and then runs for 1000 generations. At the end of the run, a
late() callback computes the mean nucleotide heterozygosity (π) across the population. For one
individual, the nucleotide heterozygosity is the fraction of base positions that are heterozygous.
This is calculated by getting the mutations from the two genomes of the individual and finding the
number of mutations that are shared between them (adding the number of matches from match()
with sum()). The two genomes must together contain 2 * shared_count of these shared mutations;
all the remaining mutations in the genomes are unshared, and thus heterozygous. Dividing the
number of unshared mutations by the length of the chromosome (+1 because lastPosition is a
zero-based value) yields the individual’s nucleotide heterozygosity. (Note that reuseable functions
to calculate heterozygosity are now in the SLiM-Extras repository online, too, with a different
implementation using setSymmetricDifference(); for now, this recipe doesn't use that code.)
So now we have a baseline model that has only whatever inbreeding results from its small
population size of 100 individuals. Now let’s add a mateChoice() callback that uses the pedigree
tracking information to create a mating preference for kin:
mateChoice() {
// Prefer relatives as mates
return weights * (individual.relatedness(sourceSubpop.individuals) +
0.01);
}

This uses the relatedness() method of Individual to calculate the relatedness between the
focal individual that is choosing a mate (individual) and all of its potential mates
(sourceSubpop.individuals). A constant of 0.01 is added to those values to guard against the
possibility that all of the values would be exactly zero; that would result in a weights vector of all
zeros being returned to SLiM, which is illegal (see the discussion in section 22.3). In fact, this
particular model is safe from that problem, because the relatedness of an individual to itself is 1.0,
and so individuals will always be able to mate with themselves; but the safeguard is shown here to
raise awareness of the issue, since in many other types of models this issue will need to be
considered – sexual models, models with multiple subpopulations and migration, etc. Finally, the
relatedness values are multiplied by weights, the vector of default mating weights supplied to the
callback by SLiM. Again, this is not important for this model (as a neutral, non-sexual model,
weights will be a vector of all 1), but it is important for models more generally, so that the fitnessbased mating weights are taken into account even when a mateChoice() callback is implemented.
Without this, the fitness-based mating weights computed by SLiM would be completely replaced
by the mateChoice() callback, making all selection coefficients and fitness() callbacks irrelevant
– rarely what is wanted.
This particular callback makes the mating weights be (almost) proportional to relatedness.
Individuals that share no grandparents will have a mating weight of only 0.01, whereas full siblings
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

187

will have a weight of 0.51 and an individual will have a weight of 1.01 for hermaphroditic selfing.
This should generate fairly strong inbreeding. Of course a mateChoice() callback could rescale or
otherwise modify these values to produce whatever mating preferences are desired.
That’s all of the code needed for the model; the bulk of the code is actually the heterozygosity
calculation, and the relatedness-based mating preference requires just a one-line mateChoice()
callback. If we run this model five times with the mateChoice() callback, the average of the
reported heterozygosity values across those runs is 0.00163, whereas the average across five runs
without the mate choice callback is 0.00426. Clearly the mateChoice() callback is having a
pronounced effect on the nucleotide heterozygosity observed in the model, as intended (with a
two-sample independent t-test p-value of 0.0011, if you’re skeptical).
Incidentally, it would also be possible to implement the tendency towards mating with kin using
a modifyChild() callback instead. That callback would evaluate the relatedness of the two parents
of the proposed child, and would tell SLiM to short-circuit generation of the child, with some
probability, if the relatedness of the parents was not sufficiently high. If only a weak inbreeding
effect is desired, this might be much faster than the mateChoice() scheme shown above, since for
the generation of a typical child only one or a few relatedness values would need to be calculated,
and the relatively large overhead of running the mateChoice() callback would be avoided. This
should be very simple, so it is left as an exercise for the reader.
13.3 Mortality-based fitness
Normally, SLiM uses fitness values as mating weights: high-fitness individuals are more likely to
be chosen as mates than low-fitness individuals (see section 19.2.2). By default, SLiM does not
model mortality; there is no concept of individuals dying before reaching reproductive age (except
for the suppression of child generation with a modifyChild() callback, which can be viewed as a
type of mortality-based fitness; see chapter 12). However, it is straightforward to add mortality to a
model: if model mechanics dictate that an individual dies, then it can be given a fitness of zero,
which means that it cannot possibly be chosen as a mate; effectively, it has been removed from the
population (although it will remain as an individual in the population until the next generation
starts; there is no way to actually remove dead individuals from the population).
(Note that this section is aimed primarily towards WF models; in nonWF models, as discussed
in section 1.6, fitness translates into mortality anyway, rather than influencing mating, so every
nonWF model is a model of mortality-based fitness. However, in nonWF models it can still be
useful to explicitly kill off a specific individual by making its fitness zero; see, for example, the
recipe in section 15.5.)
Here we will look at three different models of how mortality might be implemented. One
model converts the fitness effects of mutations directly into mortality using a fitness() callback.
This might be a good way to model death due to deleterious genetic factors. The second model is
a tag-based model (see section 1.3 and sections cited therein); if individuals die, their tag value is
set to zero, and then a special fitness() callback is used to reduce the fitness of tagged
individuals to zero. This might be a good way to model death due to non-genetic factors, such as
predation. The third model uses the fitnessScaling property of Individual, a new feature
introduced in SLiM 3.0.
One question regarding mortality-based fitness might be: what difference does this make? Why
would one wish to model mortality-based fitness rather than (or in addition to) mating-based
fitness? The answer is that the two types of fitness have different effects on the distribution of the
number of offspring generated by individuals. With mating-based fitness, all individuals of a given
low fitness value are equal in the offspring-production game; the expected number of offspring is
the same for all, and is lower than the expected number of offspring for a high-fitness individual.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

188

With mortality-based fitness, however, all individuals of a given low fitness value are not equal, by
the time mating season arrives: some are alive, and some are dead. Those that are alive have an
expected number of offspring that is just as high as a high-fitness individual; they survived to
mating season, and now the playing field is even. Those that are dead, on the other hand, have an
expected number of offspring of zero. What the evolutionary consequences of this difference
might be is not the focus here; but there clearly is a difference, and so we want to be able to
model mortality-based fitness.
So, with no further ado, let’s look at our first model:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", -0.005);
m2.convertToSubstitution = F;

// deleterious

initializeGenomicElementType("g1", c(m1,m2), c(1.0,0.1));
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 500);
}
fitness(m2)
{
// convert fecundity-based selection to survival-based selection
if (runif(1) < relFitness)
return 1.0;
else
return 0.0;
}
10000 late() {
sim.outputMutations(sim.mutationsOfType(m2));
}

Mutations are mostly neutral (m1) but occasionally slightly deleterious (m2), and the deleterious
mutations are not converted to substitutions when they fix (using m2.convertToSubstitution = F),
since they should continue to cause mortality. At the end of a run, the model uses
outputMutations() to print information about all of the m2 mutations existing in the population
(include those that have fixed, since they do not get substituted).
The interesting part of the model is the fitness() callback. It converts a fitness effect for an m2
mutation (which will be 0.995 if homozygous and 0.9975 if heterozygous, given the dominance
coefficient of 0.5 used for m2) into either 0 or 1, with the probability of a fitness of 1 being equal to
the fitness effect of the focal mutation in the individual. If the new fitness value is 0, the focal m2
mutation has resulted in mortality; the individual will not mate, and is effectively dead. If the new
fitness is 1, on the other hand, the individual has survived the deleterious effects of the focal m2
mutation, and that mutation will have no further deleterious effect; it will not influence the
individual’s expected number of offspring, since its fitness effect has been converted to 1.
There are a couple of things to note here. First of all, the mortality effects of multiple m2
mutations in a single individual will combine multiplicatively, just as their non-mortality-based
fitness effects would combine multiplicatively in SLiM without the fitness() callback. This is
because the fitness() callback will be called for each m2 mutation, and the probabilities of

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

189

mortality that these callbacks cause will combine multiplicatively (as independent probabilities
generally do).
Second, the way that this fitness() callback works depends on the m2 mutations being
deleterious, not beneficial. This is because for a beneficial mutation, relFitness would be greater
than 1, and so the comparison with runif(1) would always be T. On a more conceptual level,
you can’t be more likely to survive than a survival probability of 1. If you wish to model beneficial
mutations with a mortality-based effect, you would need to provide some baseline probability of
mortality in the model, such that an individual with no mutations would still have, say, a
probability of 0.5 of dying before reaching reproductive age. The presence of beneficial mutations
could then reduce this probability of dying, down to the limit of a probability of 0.0.
Third, there is no obstacle to combining this sort of mortality-based fitness effect with the usual
mating-based fitness effects of SLiM, using other mutation types. There is nothing magical about
this model; the fitness() callback here is just another fitness() callback, albeit one that
(somewhat unusually) models a stochastic fitness effect that varies from individual to individual.
Now let’s look at the second recipe for mortality-based fitness, this one driven by tags:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 500);
}
late() {
// initially, everybody lives
sim.subpopulations.individuals.tag = 1;
// here be dragons
sample(sim.subpopulations.individuals, 100).tag = 0;
}
fitness(NULL) {
// individuals tagged for death die here
if (individual.tag == 1)
return 1.0;
else
return 0.0;
}
10000 late() { sim.outputFull(); }

This model involves only neutral mutations of type m1. It has a fitness() callback that
evaluates phenotypic fitness (in this case, mortality). Because the mutation type identifier for the
fitness() callback is NULL, this is a global fitness() callback (see section 22.2) that is called
once per individual per generation, without reference to any focal mutation, allowing a fitness
effect to be generated that depends upon the overall state of each individual. This technique was
previously used in section 13.1, where it was used to define the fitness effect of individual
phenotypes determined by additive QTLs.
In this case, the “overall state” that the fitness(NULL) callback models is mortality due to
having previously been tagged as dead. The fitness(NULL) callback simply returns 1.0 (lived) for
individuals with a tag of 1, and 0.0 (dead) for all others. The tag values are assigned in a late()
event that runs near the end of every generation. In this model, 100 individuals out of the 500 in
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

190

the population are chosen for death at random using the sample() function, but that is just an
arbitrary placeholder for whatever sort of mortality-generating logic you might wish to implement
in your model – predation, social interactions, developmental disorders, or anything else. Indeed,
the logic could depend on the genetics of individuals in some way, combining the approach of this
recipe with that of the previous recipe.
One note here is that the tagging logic occurs in a late() event. That is because tag values for
newly generated offspring need to be assigned before fitness values for the new offspring
generation are calculated, and a late() event is the right time to do that (see the generation cycle
diagram in chapter 19). This has the side effect that mortality does not occur in the first generation
at all; tag values have not been assigned, and so the whole set of machinery that generates
mortality is not yet active. This might sound unfortunate, but in fact it is the way the previous
recipe worked as well, and indeed it is the way that models in SLiM generally work: since the first
parental generation starts out with empty chromosomes, and since new mutations are added to
genomes during offspring generation, the first generation generally is not subject to any selection
unless a model explicitly adds mutations to the first generation and then forces a re-evaluation of
the fitness of that generation using SLiMSim’s recalculateFitness() method. That would be an
option here as well, if deemed necessary; code could be added to set up tag values in an early()
event in generation 1, and then a call to recalculateFitness() in an early() event in generation
1 could cause the tag values to produce mortality in the parentals. However, having a first
generation of neutral dynamics is usually not a problem, since models usually want to start from a
neutral equilibrium state anyway; in practice, a model would usually run neutrally for many
generations as “burn-in”, and then some mortality-generating mechanism of interest would “kick
in”. In that sort of scenario, this issue of whether or not mortality occurs in the first parental
generation is irrelevant. Alternatively, another type of model would start off immediately with the
mortality effect being active, but would run until equilibrium was reached; in that case, again,
whether mortality occurs in the first parental generation is unlikely to be important, since the final
equilibrium state will usually not depend upon that detail. Nevertheless, it can be forced to occur,
as described above, if deemed necessary.
OK, with that discussion out of the way it is time for the third and final recipe for mortalitybased fitness:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 500);
}
late() {
// here be dragons
sample(sim.subpopulations.individuals, 100).fitnessScaling = 0;
}
10000 late() { sim.outputFull(); }

This is equivalent to the second recipe, but is much simpler and runs much faster. In this
recipe, we use the fitnessScaling property of Individual, which was added in SLiM 3.0, to
directly kill off individuals in the late() event, rather than using tag values and a fitness(NULL)
callback that translates those tag values into fitness effects. The fitnessScaling values get
multiplied into each individual’s calculated fitness value, just as the value returned by the
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

191

fitness(NULL) callback did in the second recipe. Note that it is not necessary to initialize
fitnessScaling values to 1.0 in each new generation; SLiM does that automatically.

the

The second recipe, then, is generally inferior, but is included for a couple of reasons. One is
simply the historical reason that, prior to SLiM 3.0, it was the right way to implement mortalitybased fitness, and might still be of interest for that reason alone. It also illustrates the use of tag
values in a very simple model, which is perhaps useful. Finally, since it is strictly equivalent to the
third recipe, it might expose the underlying logic of the third recipe more clearly; using the
fitnessScaling property is a bit “magical”, with a lot of what is happening hidden behind the
scenes, whereas the second recipe shows the mechanism more explicitly.
Note that something along the lines of the first recipe could also be implemented using
fitnessScaling, in fact, with a little bit of scripting. To achieve this, you would first make the
selection coefficient of the m2 mutations be 0.0, rather than -0.005, so that they are neutral as far
as SLiM’s fitness-calculation machinery is concerned. Then, you would loop through the
individuals in a late() event, and for each individual you would count the number of m2
mutations, determine the survival probability based upon that count (0.995^(count/2), perhaps),
draw a random number to determine survival, and set fitnessScaling to 0 for the individuals that
did not survive. This would not be quite the same as the first recipe, though, since it would not
account for homozygosity versus heterozygosity (i.e., dominance effects) in the same way. Which
strategy is preferable would depend upon the biology you were trying to model.
13.4 Reading initial simulation state from an MS file
At the beginning of execution of a SLiM model, the genomes of all individuals are empty; they
contain no mutations. Mutations can be introduced explicitly in script (see section 10.1), or the
saved state of a SLiM simulation can be read in to provide a non-empty initial state (see section
21.12.2, and section 10.2 for an example). Sometimes, however, information about the desired
initial state of the model will be in a file that is in a non-SLiM format, and you will want to read
that file in and create that initial state in the model. In this section, we will examine a recipe for
reading in a file that is in the popular “MS” format. This can be useful, since MS uses a very fast
coalescent method to generate genetic diversity in a neutral model; it can be used to generate an
initial “burn-in” state that a SLiM model can then use as a starting state simulations.
As a first step, we need a file in MS format to read in. We could use MS to generate that, of
course, but instead, for illustration purposes here, we will use a simple neutral SLiM model:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 1000);
}
20000 late() {
p1.outputMSSample(2000, replace=F, filePath="~/Desktop/ms.txt");
}

Notice that the outputMSSample() call samples without replacement, using replace=F, and it
requests 2000 samples – twice the size of the population, because there are two genomes per
diploid individual. This means that the call will output not just a sample, but the full state of the
entire population. The result of this model is a standard MS-format file, saved out to ms.txt:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

192

//
segsites: 345
positions: 0.0047300 0.0048000 0.0053201 0.0063001 0.0068701 0.0072201
0.0078301 0.0098501 0.0140401 0.0144601 0.0162002 0.0196102 0.0236902...
00100000000010000000000000001000100100101100000000100001010000000000000000
00000000000000000001000000010011000100000000111000100000000000000000001100
10000001111000100000000000000001001000001001000000101100100000001000010001
00100000000000000000000000001000000000010000000000000000000000010000000000
0000000000000000100000000000000000000000000000100
01100000000000000000000000000000100001000001100001000010101000001000000010
00000000000010000000000000000000000000001010000000000000100010000000000001
00000100000000000000000000000000000010000000010000000100100000000000010000
00000000001100000000000000001000000000010000000000000000001000010000000000
0000000000000000100000000000000000000000000000000
...

Ellipses have been used here to abbreviate the lengthy content of the file. It begins with a //
line that marks the beginning of a sample block. A segsites: line then gives the number of
segregating sites per sample, and a positions: line gives the positions, in the interval [0,1], of
each segregating site. The remainder is a series of samples, one per line, with 1 and 0 values
indicating whether each corresponding mutation is (1) or is not (0) present in that genomes.
Now let’s look at a recipe for reading this file back in and instantiating it into neutral mutations.
Here we use the same chromosome length, population size, and other parameters, so that the
loaded state corresponds to the model’s defined dynamics. This is not required, but be careful that
what you are doing makes sense. Here is the recipe:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 late() {
sim.addSubpop("p1", 1000);
// READ MS FORMAT INITIAL STATE
lines = readFile("~/Desktop/ms.txt");
index = 0;
// skip lines until reaching the // line, then skip that line
while (lines[index] != "//")
index = index + 1;
index = index + 1;
if (index + 2 + p1.individualCount * 2 > size(lines))
stop("File is too short; terminating.");
// next line should be segsites:
segsitesLine = lines[index];
index = index + 1;
parts = strsplit(segsitesLine);
if (size(parts) != 2) stop("Malformed segsites.");
if (parts[0] != "segsites:") stop("Missing segsites.");
segsites = asInteger(parts[1]);

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

193

// and next is positions:
positionsLine = lines[index];
index = index + 1;
parts = strsplit(positionsLine);
if (size(parts) != segsites + 1) stop("Malformed positions.");
if (parts[0] != "positions:") stop("Missing positions.");
positions = asFloat(parts[1:(size(parts)-1)]);
// create all mutations in a genome in a dummy subpopulation
sim.addSubpop("p2", 1);
g = p2.genomes[0];
L = sim.chromosome.lastPosition;
intPositions = asInteger(round(positions * L));
muts = g.addNewMutation(m1, 0.0, intPositions);
// add the appropriate mutations to each genome
for (g in p1.genomes)
{
f = asLogical(asInteger(strsplit(lines[index], "")));
index = index + 1;
g.addMutations(muts[f]);
}
// remove the dummy subpopulation
p2.setSubpopulationSize(0);
// (optional) set the generation to match the save point
sim.generation = 20000;
}
30000 late() { sim.outputFull(); }

The action is in the generation 1 late() event, which first creates the p1 subpopulation and then
reads in the MS file and adds mutations to the individuals in p1 as needed. It would be too tedious
to explain this code line by line, so we will focus on just a few salient points.
First of all, the readFile() function is used to read in the MS data. This function returns a
string vector, with one string element per line in the file. The rest of the code then processes
these lines. First, a scan through the lines is conducted to find the // line that indicates the start of
the actual sample data; MS files can have various information above that point that this code does
not attempt to parse. Below that line should be a segsites: line and then a positions: line, as
shown in the snippet above; the code scans for those lines and does some minimal error-checking.
Next, the recipe creates all of the mutations referenced by the MS data. This recipe assumes
that these mutations are all neutral, since that is how MS would typically be used as input to a
SLiM script. One twist here is that mutations cannot be created in isolation; according to SLiM’s
design, mutations must always reside in a Genome object. This recipe therefore creates a dummy
subpopulation with a single individual, and throws all of the MS mutations into the first genome of
that individual. This design may seem a bit odd, but it is harmless (nevertheless, it could be
avoided if necessary, by adding the mutations in p1 instead and then removing unreferenced
mutations rather than adding referenced mutations).
Having created all of the mutations, the recipe then returns to processing input lines, which
now are each a representation of one genome, as explained above. The strsplit() function is
used to separate the 0 and 1 values into separate string elements, which are then converted to
integer and then to logical. This results in a logical vector that indicates whether each
corresponding mutation is or is not referenced by the focal genome. A call to addMutations()
adds the selected mutations to the focal genome, and then the code moves to the next input line.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

194

At the end, the dummy p2 subpopulation is removed. Finally, the generation is set to 20000, to
dovetail with the fact that the simulation that generated the MS data ended at generation 20000.
This is optional, but can be helpful for keeping track of a multi-stage simulation of this sort.
This recipe is tailored to a somewhat specific situation – a neutral MS simulation to conduct a
burn-in – but it should be adaptable to a wide variety of other situations. Indeed, a strategy very
much like this could be used to read in empirical data regarding the genotypes of individuals in a
natural population at various ecologically-relevant SNPs.
13.5 Modeling chromosomal inversions with a recombination() callback
In previous chapters we have seen four types of callbacks: initialize(), fitness(),
mateChoice(), and modifyChild(). There is actually a fifth type of callback that is less commonly
used: the recombination() callback (see section 22.5). This type of callback allows the script to
modify the recombination breakpoints used by SLiM when generating a gamete to produce a new
offspring individual. In most models, the standard user-defined recombination map set by
initializeRecombinationRate() and perhaps setRecombinationRate() suffices, since usually all
individuals in a simulation use the same recombination map (or perhaps different maps for males
and females, which is supported by those calls as well). In some cases, however, recombination
behavior needs to vary at the individual level. That would be true in a model of the evolution of
recombination itself, for example; one would want individual-level variation in recombination
behavior, presumably controlled by the genetics of individuals, to evolve in response to natural
selection. It is also true in the model we will explore here: a model of the evolutionary effects of
chromosomal inversions. We’ll build this model step by step. Here’s our starting point:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", -0.05); // inversion marker
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-6);
}
1 {
sim.addSubpop("p1", 500);
}
1 late() {
// give half the population the inversion
inverted = sample(p1.individuals, integerDiv(p1.individualCount, 2));
inverted.genomes.addNewDrawnMutation(m2, 25000);
}
1:9999 late() {
// assess the prevalence of the inversion
pScr = "sum(applyValue.genomes.containsMarkerMutation(m2, 25000));";
p = sapply(p1.individuals, pScr);
p__ = sum(p == 0);
pI_ = sum(p == 1);
pII = sum(p == 2);
cat("Generation " + format("%4d", sim.generation) + ": ");
cat(format("%3d", p__) + " -");
cat(format("%3d", pI_) + " I");
cat(format("%3d", pII) + " II\n");
if (p__ == 0) stop("Inversion fixed!");
if (pII == 0) stop("Inversion lost!");
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

195

This initial model is just a model of neutral drift with an introduced deleterious mutation. The
generation 1 late() event makes half of the population homozygous for a “marker mutation”, of
type m2, that indicates the presence of an inversion; however, the machinery to implement the
inversion behavior is not yet present. Indeed, in this version of the model the marker mutation is
typically rapidly lost, since it is deleterious with a selection coefficient of -0.05; this selection
coefficient makes the marker mutation easy to see in SLiMgui, because it gets colored red. Below,
we will fix the marker mutation to not be deleterious while retaining this helpful coloration in
SLiMgui.
The other late() event in the above model produces output. Every generation, it prints a
summary of how many individuals do not have the inversion, how many are heterozygous for it,
and how many are homozygous for it (assessed using the very useful sapply() function with a
script, pScr, that checks for the marker mutation using the fast special-purpose Genome method
containsMarkerMutation()). It also checks for the inversion having been fixed or lost, and prints a
message and stops if either of those outcomes has occurred. If we run the model as it now stands,
we will get output something like this:
Generation
1:
Generation
2:
Generation
3:
...
Generation
82:
Inversion lost!

250 -134 -142 --

0 I256 I223 I-

250 II
110 II
135 II

467 --

32 I-

1 II

Producing this output in every generation makes for a whole lot of output, and it slows down
the simulation, too. Let’s add a couple of lines to the top of the late() event to make it run only
every 50th generation instead:
if (sim.generation % 50 != 0)
return;

As explained above, we don’t want the marker mutation to actually be deleterious; the purpose
of that selection coefficient is just so that SLiMgui colors the marker mutations red. Instead, in this
model we want the inversion marker to be subject to balancing selection strong enough to keep it
near intermediate frequency. This is a proxy for the inversion itself (or more realistically, an
unmodeled mutation within the inversion) having some sort of phenotypic effect that is under
balancing selection in the environment. So now let’s add a fitness() callback to create that
balancing selection (see section 9.4.1 for more discussion of modeling frequency-dependent
selection):
fitness(m2) {
// fitness of the inversion is frequency-dependent
f = sim.mutationFrequencies(NULL, mut);
return 1.0 - (f - 0.5) * 0.2;
}

Notice that now, since the fitness() callback redefines the fitness effect of the marker
mutations in all cases, the default fitness value of -0.05 is no longer actually used by SLiM – but
SLiMgui still uses that value to color the mutations red in its chromosome view. This is a nice trick
for giving special mutation types, such as marker mutations, particular colors in SLiMgui.
Running the model now produces output that indicates that the marker mutation is indeed
under balancing selection and is not lost:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

196

Generation
Generation
Generation
Generation
Generation
Generation
...

50:
100:
150:
200:
250:
300:

140
123
90
176
148
205

-------

266
255
235
237
250
229

IIIIII-

94
122
175
87
102
66

II
II
II
II
II
II

So far so good, but our marker mutation is still just an ordinary mutation under balancing
selection; we still have no machinery to make it model a chromosomal inversion in particular.
Now we add the core of this recipe – a recombination() callback that prevents recombination
within the inversion between homologous chromosomes that are heterozygous for the inversion:
recombination() {
if (genome1.containsMarkerMutation(m2, 25000) ==
genome2.containsMarkerMutation(m2, 25000))
return F;
inInv = (breakpoints > 25000) & (breakpoints < 75000);
if (sum(inInv) == 0)
return F;
breakpoints = breakpoints[!inInv];
return T;
}

Let’s walk through this in some detail. First, the callback determines whether the parent
individual that is generating the focal gamete is heterozygous for the inversion. The inversion only
affects recombination if the parent is heterozygous, so in other cases the callback returns F
immediately, a flag value indicating that no change to the proposed breakpoints is needed.
Next, a logical vector, inInv, is constructed that has T for proposed breakpoints that are within
the inversion region, F otherwise. The positions of proposed breakpoints are supplied to the
callback by SLiM in the breakpoints variable; that variable can also be set by the callback to
change the proposed breakpoints as we will see momentarily. The inversion region is defined by
the code here to stretch from base position 25000 to 74999 inclusive. Recombination positions fall
immediately to the left of the given base position; in other words, crossover occurs between the
specified base and the preceding base. For this reason, the logic here considers a breakpoint
exactly at position 25000 to be outside the inversion. If there are no proposed breakpoints inside
the inversion region, the callback returns F immediately. Note that our marker mutations were
created originally, with the addNewDrawnMutation() call, at position 25000, the first position within
the inversion region that we have just defined. Any position within the inversion region would
work equally well, but positions outside of the inversion region would not work, since
recombination could then separate the marker mutation from the contents of the inversion.
Finally, the callback removes all proposed breakpoints inside the inversion by subsetting
breakpoints with the negation of inInv, and then it returns T to indicate that the proposed
breakpoints were changed. This achieves the desired goal of suppressing all recombination within
the inversion region for gametes generated by heterozygote parents.
Note this callback is written to be very efficient, since it is called for every gamete produced by
every individual in every generation. Whenever possible, it returns F to indicate that no change to
the proposed breakpoints is needed; this allows SLiM to skip a bunch of extra work. It could be
made even faster by caching each individual’s inversion count in the individual’s tag value in an
early() event. We will not explore that here, but it is worth keeping in mind that pre-cacheing
the results of static computations outside of frequently-called callbacks can improve performance.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

197

Note also that this callback handles only suppression of ordinary recombination breakpoints. If
you enabled gene conversion in your model (see section 6.1.3), you may also wish to make
inversions modify how gene conversion occurs. That is easy to do, but is not shown in this recipe;
see section 22.5 for further information.
So now we have a working model of chromosomal inversions, but it would be nice to have
some quantitative evidence that it is working properly. For that, let’s add a final output event:
9999 late() {
sim.outputFixedMutations();
// Assess fixation inside vs. outside the inversion
pos = sim.substitutions.position;
cat(sum((pos >= 25000) & (pos < 75000)) + " inside inversion.\n");
cat(sum((pos < 25000) | (pos >= 75000)) + " outside inversion.\n");
}

This event prints a list of the fixed mutations using outputFixedMutations(), as we’ve seen in
other recipes. Then it looks at all of the mutations that have fixed during the simulation (kept by
the sim object in its substitutions property), and extracts their positions. Finally, it tallies and
prints the number of fixed mutations that occurred within the inversion region, versus the number
that occurred outside. Without the inversion, these numbers would be expected to the roughly
equal, since there are 50000 bases inside the inversion and 50000 outside, and indeed, if you
comment out the recombination() callback and run the model, you will see something like this:
37 inside inversion.
40 outside inversion.

But with the recombination callback active, the results are very different:
0 inside inversion.
43 outside inversion.

This can be seen graphically in SLiMgui. If display of fixed mutations is turned on with the F
button to the right of the chromosome view, this is what things look like at the end of the run:

Many mutations outside the inversion have drifted to fixation. (To facilitate that happening
quickly, this model uses a recombination rate of 1e-6, so that linkage disequilibrium with the
inversion gets broken down quickly, but that is not an essential component of the model, just a
way to get it to show interesting results more quickly.) Inside the inversion, however, there are two
major haplotypes, and mutations within the inversion can’t fix because they can’t cross from one
haplotype to the other. This is the result of suppression of recombination within the inversion;
chromosomes containing the inversion accumulate one set of fixed mutations through drift, while
chromosomes not containing the inversion accumulate a different set of fixed mutations.
While we’re on the subject of haplotypes, this model is a particularly good testbed for looking
at some advanced features of SLiMgui that facilitate the examination of haplotypes and linkage
disequilibrium in SLIM. Let’s run the model out to the end again, to reach a similar state to that
shown above, and then control-click on the chromosome view to get a context menu that allows
us to configure its display:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

198

Select “Display Haplotypes” from the context menu, and the chromosome view changes to
show the haplotypes that are in the population, clustered according to their genetic similarity:

Here the effect of the inversion is even clearer; outside of the inversion there are few mutations
at high frequency and no apparent population structure, but inside the inversion the population
has clearly differentiated into two completely different haplotypes with no admixture. The marker
mutation indicating the inversion can be seen, associated with one of the two haplotypes.
This alternate display mode for the chromosome view is based upon just a small sample of the
genomes from the selected subpopulation(s), allowing genetic clustering and display to be done in
real time. It can also be useful to get a more comprehensive haplotype plot, based upon a larger
sample or upon the entire population. To obtain that, choose Create Haplotype Plot from the
Show Graph button’s pop-up menu (or from the Simulation menu). This shows a panel that allows
us to choose plot options; we can use the default options here, so click OK. After a progress panel
(since the analysis can be quite lengthy with a large sample), a new plot window opens:

This shows essentially the same information as the chromosome view did above, but it is based
upon all 1000 genomes in the population, and thus provides some additional detail. A context
menu on the plot window, obtained with control-click or right-click, allows the appearance of the
plot to be adjusted, and also allows the plot’s image to be copied or saved as a file.
That completes this model of chromosomal inversions using a recombination() callback, but as
usual there is much more that could be done. Rather than using balancing selection, for example,
spatially varying selection among subpopulations could allow adaptive ecological divergence
between subpopulations to arise as a result of the protection from recombination afforded by the
inversion. It would also be interesting to model the rise of an inversion to high local frequency, in
a model like that, to explore how inversions can facilitate divergence and speciation.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

199

13.6 Modeling both X and Y Chromosomes with a Pseudo-Autosomal Region (PAR)
SLiM has built-in support for modeling either the X or Y chromosome when sex is enabled (see
section 6.2.3). However, some models need to go beyond this built-in support. You might wish to
model both the X and Y chromosomes, and you might even wish to model a pseudo-autosomal
region (PAR) – a region in which recombination between the X and Y occurs freely, giving the
region evolutionary dynamics similar to those of an autosome. Both males and females are diploid
for genes in the PAR; females have two copies of the PAR on their two X chromosomes, whereas
males have the same PAR on their X, and a homologous PAR on their Y. Because crossing over
occurs freely between the X and Y within the PAR, genes in the PAR exhibit an autosomal pattern
of inheritance rather than sex-linked inheritance.
Modeling this in SLiM is possible by implementing your own sex-chromosome mechanics,
which is fairly straightforward. In this section we’ll explore a simple model of neutral X and Y
chromosome evolution with a single PAR connecting them. This model was provided by Melissa
Jane Hubisz, and has been adapted for publication as a recipe here.
The basic model is quite simple:
initialize()
{
initializeMutationRate(1.5e-8);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", 0.0);
initializeMutationType("m3", 1.0, "f", 0.0);

// PAR
// non-PAR
// Y marker

// 6 Mb chromosome; the PAR is 2.7 Mb at the start
initializeGenomicElementType("g1", m1, 1.0);
//
initializeGenomicElementType("g2", m2, 1.0);
//
initializeGenomicElement(g1, 0, 2699999);
//
initializeGenomicElement(g2, 2700000, 5999999); //

PAR: m1 only
non-PAR: m2 only
PAR
non-PAR

// turn on sex and model as an autosome
initializeSex("A");
// no recombination in males outside PAR
initializeRecombinationRate(c(1e-8, 0), c(2699999, 5999999), sex="M");
initializeRecombinationRate(1e-8, sex="F");
}
// initialize the pop, with a Y marker for each male
1 late() {
sim.addSubpop("p1", 1000);
i = p1.individuals;
males = (i.sex == "M");
maleGenomes = i[males].genomes;
yChromosomes = maleGenomes[rep(c(F,T), sum(males))];
yChromosomes.addNewMutation(m3, 0.0, 5999999);
}
modifyChild() {
numY = sum(child.genomes.containsMarkerMutation(m3, 5999999));
// no individual should have more than one Y
if (numY > 1)
stop("### ERROR: got too many Ys");

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

200

// females should have 0 Y's
if (child.sex == "F" & numY > 0)
return F;
// males should have 1 Y
if (child.sex == "M" & numY == 0)
return F;
return T;
}
10000 late() {
p1.outputMSSample(10, replace=F, requestedSex="F");
}

The initialize() callback sets up the genetic structure. This model turns on sex, because we
want SLiM to track males and females for us, but it requests modeling of an autosome with
initializeSex("A"); we will handle the tracking of the X versus Y chromosomes ourselves. To do
that, we set up a special “marker” mutation type, m3, that will be used to tag Y chromosomes, as
we will see shortly. We set up a 6 Mb chromosome with a 2.7 Mb PAR at the beginning; the PAR
uses mutation type m1 (through genomic element type g1), while the rest of the chromosome uses
mutation type m2 (through genomic element type g2), so that we can easily tell sex-linked
mutations from pseudo-autosomal mutations later on. The only wrinkle during initialization is that
we set up a recombination map in males that prevents recombination outside the PAR; females,
which have two X chromosomes, are allowed to recombine freely.
Next we have a generation 1 late() event that sets up the initial population. After making a
new subpopulation in the usual way with addSubpop(), it gets the male individuals, selects only
their second Genome objects, and adds an m3 mutation to them to mark them as Y chromosomes.
These marker mutations will be handled in the usual way by SLiM, so they will be inherited and
will continue to mark Y chromosomes in future generations. Because recombination is prevented
in males outside the PAR, they will stay associated with the non-PAR Y chromosome genetic
information.
There is one problem with this scheme, however. When SLiM generates offspring, it has its own
ideas about whether a given child ought to be male or female. We need to follow SLiM’s guidance
on this, otherwise our model will end up with individuals that SLiM considers female but that
possess a Y chromosome, and individuals SLiM thinks are male but that have no Y. This is the
purpose of the modifyChild() callback. It simply compares what SLiM expects for the sex of the
child (child.sex) with the genetics that SLiM is proposing that the child will inherit
(child.genomes). If they don’t match, then it returns F to indicate that SLiM needs to choose new
parents and try again. Note the containsMarkerMutation() method; this just checks for a
mutation of a given type at a given position, which can be done quite quickly by SLiM. It is thus
optimal when the position of a mutation (if it exists) is already known, as it is here.
Because modifyChild() callbacks get called quite frequently (once for each new offspring
generated by SLiM, or even more in this case since the callback rejects some proposed children),
the speed of callback code is essential. The callback above has very clear logic, but it is slow in
several ways. It does a safety check that is never, in fact, hit (since the model works properly); it
assigns a value into a variable, numY, which is relatively slow because it requires Eidos to set up a
symbol table entry; it performs some unnecessary logical operations and tests (if child.sex is not
"F" then it must be "M", for example), and worst of all, it always calls containsMarkerMutation()
for both child genomes even though in many cases the answer returned by one genome will

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

201

already allow the callback to choose its action. With an optimized version of the callback, the
model runs about 25% faster, a not-insignificant difference. The optimized callback:
modifyChild() {
// females should not have a Y, males should have a Y
if (child.sex == "F")
{
if (childGenome1.containsMarkerMutation(m3, 5999999))
return F;
if (childGenome2.containsMarkerMutation(m3, 5999999))
return F;
return T;
}
else
{
if (childGenome1.containsMarkerMutation(m3, 5999999))
return T;
if (childGenome2.containsMarkerMutation(m3, 5999999))
return T;
return F;
}
}

If you examine this carefully, you should find that it will produce the same result in all cases
(unless the model has already broken, such as by an individual having two Y chromosomes).
Finally, we have a late() event that outputs an MS-format sample from the females in the
population; among other things, this establishes the end of the model as generation 10000.
There is one remaining issue. Addressing it is optional, in a sense, but if we don’t address it the
model will run slower and slower until it grinds nearly to a halt. The problem is that while PAR
mutations will fix and be converted into Substitution objects automatically by SLiM as usual,
non-PAR mutations will not. This is because their threshold for fixation is lower than SLiM
realizes; only a quarter of Genome objects are Y chromosomes, and only three-quarters are X
chromosomes, so sex-linked mutations need to fix when they reach those frequencies, not a
frequency of 1.0 as SLiM expects. Fixing this is not difficult, but it does require a bit of code:
1:10000 late() {
// periodically remove m2 (non-PAR) mutations that are fixed in X or Y
// m1 (PAR) mutations will be automatically removed when fixed
if (sim.generation % 1000 == 0) {
numY = sum(p1.individuals.sex == "M");
numX = 2 * size(p1.individuals) - numY;
// look at the mutations in a single Y chromosome
// to find mutations that are fixed in all Y's
firstMale = p1.individuals[p1.individuals.sex == "M"][0];
fMG = firstMale.genomes;
if (fMG[0].containsMarkerMutation(m3, 5999999)) {
firstY = fMG[0];
firstX = fMG[1];
} else if (fMG[1].containsMarkerMutation(m3, 5999999)) {
firstY = fMG[1];
firstX = fMG[0];
} else
stop("### ERROR: no Ys in first male");

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

202

ymuts = firstY.mutationsOfType(m2);
ycounts = sim.mutationCounts(NULL, ymuts);
removeY = ymuts[ycounts == numY];
// now do the same for the X
xmuts = firstX.mutationsOfType(m2);
xcounts = sim.mutationCounts(NULL, xmuts);
removeX = xmuts[xcounts == numX];
cat("Gen. " + sim.generation + ": Removing ");
cat(removeX.size() + "/" + removeY.size() + " on X/Y\n");
removes = c(removeY, removeX);
sim.subpopulations.genomes.removeMutations(removes, T);
}
}

This event should ideally be inserted before the generation 10000 late() event that produces
final output. It runs every 1000 generations, since the operation it performs is somewhat timeconsuming; doing it every generation would be wasteful. It finds a male individual, and uses that
individual as a template for finding and removing fixed sex-linked mutations. Any mutation that
has fixed must be possessed by the template male (by the definition of “fixation”), so the code just
gets all the mutations of type m2 (non-PAR mutations) from the male’s X and Y and checks them all
for fixation. The fixation check itself is done using mutationCounts(), which returns the number of
occurrences of the mutations – population-wide, in this case, because of the NULL passed for the
first parameter. The event prints the number of X-linked and Y-linked mutations that it intends to
remove, and then it removes them with removeMutations(). The optional T value passed for the
second argument to removeMutations() (which is named substitute) indicates that SLiM should
create Substitution objects for the removed mutations; we are notifying SLiM that those
mutations are in fact fixed, even though SLiM doesn’t realize it. If you don’t care about that
record-keeping, you can omit that optional T value, and the mutations will simply be removed.
And that’s it. We’ve implemented tracking of Y chromosomes with marker mutations, we’ve
guaranteed that those markers stay correctly synchronized with offspring sex, and we’ve added
machinery to detect fixed sex-linked mutations and turn them into Substitutions just as SLiM does
for autosomal mutations. If we run this model in SLiMgui with display of fixed mutations turned
on (the F button to the right of the chromosome view), the behavior of the PAR versus the non-PAR
regions is easy to see as soon as the first pass of the fixation check has run in generation 1000:

The PAR, on the left, behaves as an autosome, so it takes a while for mutations to fix; one is
close to fixation, but none has made it there yet. The non-PAR region, on the right, behaves as
separate sex chromosomes that cannot recombine, and each sex chromosome is present in fewer
copies than the PAR; mutations in that region thus have a smaller effective population size, and fix
more rapidly. The Y, present in only a quarter as many copies as the PAR, fixes particularly quickly;
all 20 of the fixations here are in fact on the Y. After generation 3000, the situation is similar:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

203

There have now been 19 fixations on the X, 120 on the Y, and only 11 in the PAR. The trend is
clear, and it appears that our model of a pseudo-autosomal region is functioning as expected.
13.7 Forcing a specific pedigree through arranged matings
As we’ve seen in previous recipes, SLiM allows mating probabilities to be adjusted with
mateChoice() callbacks, and sometimes modifyChild() callbacks can also be useful for
implementing specific mating patterns since they can reject proposed offspring on the basis of
information about the parents (as in, for example, the gametophytic self-incompatibility system
implemented in section 11.3). Sometimes, however, it is desirable to go a step further, and simply
dictate exactly what matings will occur in a given generation, in order to obtain a specific pedigree
structure. This can also be implemented in SLiM, through a combination of tagging individuals
and using a mateChoice() callback to allow only matings between individuals with specific tag
values. In this section, we will look at a recipe that implements a very simple example of this: a
founding event that begins with two mating events, involving disjoint sets of parents, to produce a
founding population of two offspring (one from each mating). Note that this recipe is quite clunky
and scales poorly, because the WF modeling framework is ill-suited to this task; see section 15.12
for a much more graceful and scaleable nonWF recipe for forcing a specific pedigree.
To begin with, here’s a model that does what we want except for the forced pedigree:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
1000 late() { p1.setSubpopulationSize(2); }
1002: early()
{
newSize = (sim.generation - 1001) * 10;
p1.setSubpopulationSize(newSize);
}
1010 late() { p1.outputSample(10); }

This sets up an initial population of size 500, lets it evolve to generation 1000, and then triggers
a bottleneck down to a population size of 2 (set in a late() event in generation 1000, but actually
taking effect for the offspring generation that is generated in 1001). In effect, then, the original
subpopulation is discarded, and we model the new founder subpopulation thereafter; modeling
both could be done by adding the founder population as a new subpopulation, of course (see
section 5.2.1). In generation 1002 and onward, the population grows linearly (for exponential
growth, see section 5.1.2). The model terminates at the end of generation 1010 with the output of
a subsample of the population. So far, so good; we’ve seen this sort of model many times.
The problem here is that with a severe bottleneck such as this, two undesirable things could
happen. One, the same parent(s) could be chosen for both of the two founding offspring
individuals, making the bottleneck more severe than we want. And two, because this is a
hermaphroditic model (in which SLiM allows selfing by chance, if the same parent happens to be
chosen twice), one or both of the founding offspring could be generated through selfing rather than
biparental mating, again making the bottleneck more severe than intended. There are various ways
to handle these issues (selfing in hermaphroditic models can be blocked with a simple
mateChoice() callback, for example; see section 12.4).
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

204

For our purposes here, however, let’s handle them by forcing SLiM to follow a specific pedigree
for the founding event. We will choose four parents, A, B, C, and D, from the population, and we
will force SLiM to mate A with B and C with D, each mating producing exactly one offspring.
We’ll do that by first modifying the 1000 late() event that triggers the bottleneck:
1000 late() {
p1.setSubpopulationSize(2);
p1.individuals.tag = 0;
parents = sample(p1.individuals, 4);
parents[0].tag = 1;
parents[1].tag = 2;
parents[2].tag = 3;
parents[3].tag = 4;
}

This event now not only triggers the founding bottleneck, but also chooses four parents from the
original subpopulation using sample() – which by defaults samples without replacement, as
desired – and marking them with unique tag values. The tag values of all other individuals are set
to 0. We can now identify A, B, C, and D using the tag values 1:4. Next we need to force matings
only between A and B, and C and D. We’ll do that with a modifyChild() callback:
1001 modifyChild()
{
t1 = parent1.tag;
t2 = parent2.tag;
if (((t1 == 1) & (t2 == 2)) | ((t1 == 2) & (t2 == 1)) |
((t1 == 3) & (t2 == 4)) | ((t1 == 4) & (t2 == 3)))
{
cat("Accepting tags " + t1 + " & " + t2 + "\n");
parent1.tag = 0;
parent2.tag = 0;
return T;
}
cat("Rejecting tags " + t1 + " & " + t2 + "\n");
return F;
}

This callback gets the tag values of the two parents of each proposed child. If the parents are A
and B (which could be tags 1 and 2, or tags 2 and 1), the child is accepted; similarly if the parents
are C and D. In that case, the tag values of the parents are set to 0 to ensure that those parents will
not be allowed to mate a second time. If the parents don’t pass the test, the callback simply rejects
the proposed child by returning F, causing SLiM to try a new pair of parents.
This works well, except that there will be a noticeable pause at generation 1001. Looking at the
diagnostic output produced by the model shows why; there are a great many lines stating
“Rejecting tags 0 & 0”, quite a few lines stating things like “Rejecting tags 2 & 0”, and
exactly two lines stating something like “Accepting tags 4 & 3” and “Accepting tags 1 & 3”. In
other words, the model is having to reject a very large number of proposed children in order to
achieve the desired pedigree. If the initial subpopulation were, say, 50000 individuals instead of
500, this small problem would become a huge problem, as the model ground to a halt for perhaps
days or weeks searching for acceptable mating pairs. Happily, there is a way to optimize the
model to eliminate the problem, by adding a fitness() callback:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

205

1000 fitness(m1)
{
if (individual.tag == 0)
return 0.0;
else
return relFitness;
}

This callback simply removes all individuals with a tag value of 0 from the mating pool, by
declaring their fitness to be 0.0. Although it removes all unchosen parents from the mating pool, it
does not suffice in itself; without the modifyChild() callback as well, this fitness() callback
would still allow A to mate with C, D to self, etc. But in conjunction with the modifyChild()
callback, it reduces the mating pool to just the focal individuals we want, and thus greatly speeds
up the process of filling the desired pedigree.
Note that this callback is defined for generation 1000; we want to manipulate the fitness values
of the individuals that will be parents in generation 1001 (the founding event generation), and
those individuals are the offspring generated in generation 1000, and thus have their fitness values
calculated at the end of generation 1000. When you’re juggling individuals with different callbacks
like this, you need to consult the generation life cycle (see chapter 19) very carefully!
With this callback in place, it takes only a few tries to get the matings we want, as for example:
Rejecting
Rejecting
Accepting
Rejecting
Rejecting
Accepting

tags
tags
tags
tags
tags
tags

2
1
1
0
0
4

&
&
&
&
&
&

4
1
2
4
3
3

The appearance of tag value 0 here is because parents 1 and 2 were accepted and produced an
offspring, and thus their tag values were set to 0. This did not remove them from SLiM’s mating
pool, since their fitness values were not recalculated; it merely marks them so that the
modifyChild() callback can avoid using them again. Of course these diagnostic messages are just
for illustration purposes anyway, and would be removed from a production model.
Note that the scheme here using sample() chooses parents for the founding population with
equal weights. This is a neutral model, so that is fine, but even in a non-neutral model this might
be desirable, depending upon the nature of the founding event; a storm that blows a handful of
birds off-course to an isolated island, for example, might select the founding individuals
completely at random, without regard for the fitness they would have had in their original habitat,
and so the simple call to sample() shown here would be appropriate. But of course any other
method of choosing the founders may be used. To sample parents weighted by their fitness values,
as SLiM normally does when choosing parents, one could just obtain the fitness values for all
parents, with a call to cachedFitness(NULL) on the parental subpopulation, and pass that vector
directly to sample() as the weights to be used in sampling.
This is almost the simplest possible pedigree we could force, but the recipe can be generalized
easily. For example, we could force more than one offspring to be generated by a given pair of
parents if we wished, either by using more tag values to represent different states (when A and B
are about to generate their first child, change their tag values to 5 and 6, and allow parental pair
5/6 to also generate an offspring, for example), or by adding a counter to the parental individuals
using the setValue() / getValue() facility of the Individual class (see section 21.6.2) to track how
many children they have generated already; their tag values would not be reset to 0, removing
them from the parent pool, until that counter reached the desired value.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

206

Similarly, this recipe could be extended to a larger pedigree, involving more parents and more
offspring, simply by using more tag values. However, if you wanted to implement a large,
complex pedigree involving many matings the implementation might get rather lengthy and
cumbersome, with manual setting and checking of a large number of different tag values.
This recipe could also be extended to work over multiple generations, just by doing the same
thing over again: tag the chosen parents, reject offspring that aren’t from parents with the desired
tag values. It would be straightforward, if admittedly a bit clunky, to reproduce an entire
multigenerational pedigree of, say, a given royal family with a high rate of inbreeding and a
deleterious trait such as hemophilia. To do this, first assign a unique number to each individual on
the pedigree chart you want to model; then use those numbers as the tag values for the individuals
in the model. If C is the child of A and B, the modifyChild() call back would, when deciding to
allow A and B to mate to produce C, also set the tag value for the proposed offspring to mark that
individual as C. Implemented across the board, this would allow the identity of every specific
individual throughout the multigenerational pedigree to be tracked with precision.
As mentioned above, section 15.12 has a much more scalable recipe for forcing a specified
pedigree using a nonWF model; that recipe is recommended for most users over this recipe.
13.8 Estimating model parameters with ABC
One major use of simulation models is to try to better understand an empirical system by
comparing simulated results with empirical data. For example, one might have an observation
from an empirical system, and have a guess as to a model that approximates that empirical system
well; one might then want to know the value of a particular parameter of that model that produces
the best possible fit of the model to the data. In this recipe we’ll explore doing this using a
technique called Approximate Bayesian Computation (ABC). This is quite a complex topic, and we
will not attempt to provide a thorough introduction to it here; please use the internet to inform
yourself further regarding the assumptions, limitations, and caveats involved in this method, as
well as about the underlying theoretical framework upon which it rests.
In this recipe we’re going to branch out a bit and present R code as well as Eidos code. The R
code will run the ABC process, while the Eidos code will run the SLiM simulation that provides the
ABC process with the information it needs. Let’s start with the Eidos code:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 999999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 100); }
1000 late() { cat(sim.mutations.size() + "\n"); }

This is a trivial model, obviously. We start with a population of size 100, and model neutral
mutations in that population. The model is allowed to equilibrate until generation 1000, at which
point the number of segregating mutations detected in the population is printed. Simple and fast,
which is important since ABC is going to run it a whole bunch of times.
To tie this back to an empirical question, this model would be a reasonable starting place if you
said: “I have a population of size 100 that I have sampled comprehensively, and the analysis I’ve
done tells me there are 262 segregating mutations in the population. I think the population has
been about size 100 for a long time, so it’s at equilibrium (to the extent that a small population
evolving under drift is ever at equilibrium). I know the recombination rate and the chromosome
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

207

length, but not the mutation rate. What’s my best guess, and what’s the posterior distribution
around that guess?” So now we’ve got a SLiM model of that scenario, with a guess of 1e-7 hardcoded into it. Let’s tweak the model by replacing the mutation rate with a symbolic constant:
initializeMutationRate(mu);

We could define the constant mu at initialization time with a call to the Eidos function
like:

defineConstant(),

defineConstant("mu", 1e-7);

But let’s not do that; instead, we will pass a value for it in to the simulation on the command
line. To do this, let’s first save the model to a file called “abc.slim” in our home directory (we will
assume that the file is then at the path ~/abc.slim, as it is on most Unix systems including Mac
OS X). If we run this model at the command line, we get an error about the undefined constant:
darwin:~ bhaller $ slim ~/abc.slim
...
ERROR (EidosSymbolTable::_GetValue): undefined identifier mu.
Error on script line 2, character 24:
initializeMutationRate(mu);
^^

But we can pass a value for the constant in, using the -d (-define) command-line option (see
section 17.2):
darwin:~ bhaller $ slim -d mu=1e-7 ~/abc.slim
...
262

This defines mu to be 1e-7 at the very beginning of the model, before the first initialize()
callback is called. This run of SLiM is actually where the value of 262 came from, above; it’s the
value from our “empirical system” (i.e., this first run of our model), for which we’re going to try to
recover an estimate of mu using ABC. In practice, such observed values would probably come
from actual empirical data.
So now we have a working model at ~/abc.slim that expects a value for a constant mu to be
supplied to it on the command line, and the last line of the output it generates is the outcome of
the model – the number of segregating mutations observed in the population at equilibrium. The
next step is to write some R code to run our model, given a random number seed and mu:
runSLiM <- function(x)
{
seed <- x[1]
mu <- x[2]
#cat("Running SLiM with seed ", seed, ", mu = ", mu, "\n");
output <- system2("/usr/local/bin/slim", c("-d", paste0("mu=", mu),
"-s", seed, " ~/abc.slim"), stdout=T)
as.numeric(output[length(output)])
}

The system2() function of R is used to run SLiM with the desired command-line arguments, and
the output is collected as a character vector of output lines. Note that this function takes the seed
and mu values as a single vector, x; this is because of the design of the R package we will use to run
the ABC in a moment. First, let’s test this function:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

208

> runSLiM(c(100030, 1e-7))
[1] 281

We get a different result than before, since a different random number seed was used, but it
seems to work fine. So now we have R running our SLiM model for us; the next step is to actually
run an ABC analysis. For this, we will use the package EasyABC, available through CRAN:
library(EasyABC)
# Set up and run our ABC
prior <- list(c("unif", 1e-9, 1e-6))
observed <- 262
ABC_SLiM <- ABC_sequential(method="Lenormand", use_seed=TRUE,
model=runSLiM, prior=prior, summary_stat_target=observed,
nb_simul=1000)

The EasyABC package has many different options for running ABC analyses, including several
“sequential” ABC methods that try to arrive at an answer more quickly through successive rounds
of ABC, and several ways of running the ABC inside an MCMC (Markov Chain Monte Carlo)
process, a particularly powerful way of converging rapidly on the posterior distribution of the ABC.
We have chosen the “Lenormand” method, since it is simple to set up, is happy to run with only
one unknown parameter (which some of the other methods don’t seem to be), and converges fairly
quickly. Since ABC is a Bayesian method, we also need a prior for mu; we have chosen a uniform
prior from 1e-9 to 1e-6, since that encompasses the range of mutation rates we think would be
likely in our empirical system, and since we have no information about which values within this
range are more or less likely. We also need to tell EasyABC how many runs we want to perform;
the nb_simul parameter does that (sort of – see the documentation), and the value of 1000 here
should give us quite a good posterior distribution at the expense of longer runtime (a value of 100
actually works pretty well already).
The final line, which actually runs the ABC analysis, will probably take several minutes,
depending upon the speed of your machine. When it finishes, ABC_SLiM contains information
about the ABC run. Details about that information can be found in EasyABC’s documentation; we
will not explain it here, but we will use it to extract the results we want. First of all, to get the best
fit estimate from a one-parameter ABC like this, one typically wants the sum of the weighted
average of the values chosen by the ABC:
> sum(ABC_SLiM$param * ABC_SLiM$weights)
[1] 0.0000001134854

The ABC did quite a good job of recovering the actual value of mu, 1e-7, that was used to
generate the target value of 262. We can also plot the posterior distribution for mu that was arrived
at by the ABC analysis, using this R code:
log_param <- log(ABC_SLiM$param, 10)
breaks <- seq(from=min(log_param), to=max(log_param), length.out=8)
quartz(width=4, height=4)
hist(log_param, xlim=c(-9, -6), breaks=breaks, col="gray",
main="Posterior distribution of mu", xlab="Estimate of mu", xaxt="n")
axis(side=1, at=-6:-9, labels=c("1e-6", "1e-7", "1e-8", "1e-9"))

This code converts the posterior distribution data to a log scale manually and plots it on that
scale, for clarity; there are probably better ways to do that. The quartz() call just opens a graphics
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

209

device on Mac OS X; if you’re on a different platform you will probably have to change that to
your platform-specific graphics device call. Apart from those details, the code is pretty standard.
The resulting plot looks like this:

100
50
0

Frequency

150

Posterior distribution of mu

1e-9

1e-8

1e-7

1e-6

Estimate of mu

We started with a uniform prior from 1e-9 through to 1e-6, providing the ABC analysis with no
information as to which values within that range were more likely. By doing quite a few runs of
SLiM (12500, as it happens), the ABC narrowed that range down considerably, effectively ruling
out a large part of it completely, and it gave us a posterior distribution showing which values of mu
would be most likely to produce the observed number of segregating mutations, 262, that was
observed empirically. All in just a few lines of Eidos and a few lines of R; not bad!
This completes our foray into Approximate Bayesian Computation in SLiM and R. This topic can
be pursued much further: estimation of the joint distribution of multiple parameters, usage of a
Markov Chain Monte Carlo (MCMC) method for running the ABC (which is also supported by the
EasyABC package), delving into the often difficult problem of priors, and so forth. But this recipe
should provide a starting point for such explorations.
13.9 Tracking true local ancestry along the chromosome
Ancestry, relatedness, pedigree, and similar concepts are often important in forward genetic
simulations such as those run by SLiM. Depending upon exactly what you need, there are several
different approaches to these sorts of questions in SLiM:
• The subpopID property of mutations (section 21.8.1) keeps track of the subpopulation in
which a given mutation originated. In an admixture model in which you want to
determine whether a given mutation originally arose in one subpopulation or another,
this is often all you need. Note also that if you are adding mutations yourself using
addNewMutation() or addNewDrawnMutation(), you can set the subpopID in those calls to
whatever value you wish. The subpopID property is writeable, so you can also change it
later to any integer value (including some type of ancestry information).
• Mutation types can also be useful for tracking ancestry information. If different mutation
types are used for mutations with different origins in your model, they can then be used
to determine the origin of each mutation later, and SLiM’s methods for retrieving and
counting the mutations of a given mutation type can be used to separate out mutations
of different origins. This approach will work even in non-admixture models in which all
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

210

mutations originate in a single subpopulation, as long as you can classify mutations
according to their ancestry at the point when they originate (in a modifyChild()
callback that looks at the parents to determine an ancestry group, for example). The
mutation type of mutations can be changed with the setMutationType() method.
• By enabling an optional feature, SLiM can do pedigree-based tracking of the ancestry of
individuals for the purposes of calculating relatedness, finding “trios” of parents and an
associated offspring, and so forth (see section 13.2). This provides information about
relatedness and ancestry at the level of individuals, rather than the level of mutations.
Pedigree-based arranged matings can also be implemented (see section 13.7).
• If tree-sequence recording is turned on (see section 1.7 and chapter 16), it provides a
type of true local ancestry tracking for every position along the chromosome. This is
similar in some ways to this recipe, but much more efficient, and was added in SLiM 3.
Sometimes it is desirable to track relatedness not at the individual level with a pedigree, or at a
mutational level with subpopID or mutation types, but instead at the level of chromosomal regions
– perhaps down to the ancestry of each individual base position along the chromosome. In this
way, the ways that assortment, recombination, selection, and drift determine the ancestry at each
position can be analyzed. This type of relatedness tracking is also possible in SLiM, even when not
using tree-sequence recording; we will explore an example in this section.
Models can implement “true local ancestry” tracking themselves using marker mutations –
mutations which have no selective effect, and are not meant to represent actual mutational
changes to genomes, but instead simply mark particular positions on particular genomes for future
reference. The recipe in section 13.5 used a marker mutation to track individuals possessing a
chromosomal inversion, for example, and in 13.6 a marker mutation was used to mark Y
chromosomes in a model of a pseudo-autosomal region shared between sex chromosomes. Here
marker mutations will be used at every position, to indicate ancestry. At the end, the model will
assess the average ancestry, across the subpopulation, at every chromosome position.
Here is the complete recipe:
initialize() {
defineConstant("L", 1e4);
initializeMutationRate(0);
initializeMutationType("m1", 0.5, "f", 0.1);
initializeMutationType("m2", 0.5, "f", 0.0);
m2.convertToSubstitution = F;
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, L-1);
initializeRecombinationRate(1e-7);

// chromosome length

// beneficial
// p1 marker

}
1 {
sim.addSubpop("p1", 500);
sim.addSubpop("p2", 500);
}
1 late() {
// p1 and p2 are each fixed for one beneficial mutation
p1.genomes.addNewDrawnMutation(m1, asInteger(L * 0.2));
p2.genomes.addNewDrawnMutation(m1, asInteger(L * 0.8));
// p1 has marker mutations at every position, to track ancestry
p1.genomes.addNewMutation(m2, 0.0, 0:(L-1));

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

211

// make p3 be an admixture of p1 and p2 in the next generation
sim.addSubpop("p3", 1000);
p3.setMigrationRates(c(p1, p2), c(0.5, 0.5));
}
2 late() {
// get rid of p1 and p2
p3.setMigrationRates(c(p1, p2), c(0.0, 0.0));
p1.setSubpopulationSize(0);
p2.setSubpopulationSize(0);
}
2: late() {
if (sim.mutationsOfType(m1).size() == 0)
{
p3g = p3.genomes;
p1Total = sum(p3g.countOfMutationsOfType(m2));
maxTotal = p3g.size() * (L-1);
p1TotalFraction = p1Total / maxTotal;
catn("Fraction with p1 ancestry: " + p1TotalFraction);
p1Counts = integer(L);
for (g in p3g)
p1Counts = p1Counts +
integer(L, 0, 1, g.positionsOfMutationsOfType(m2));
maxCount = p3g.size();
p1Fractions = p1Counts / maxCount;
catn("Fraction with p1 ancestry, by position:");
catn(format("%.3f", p1Fractions));
sim.simulationFinished();
}
}
100000 late() {
stop("Did not reach fixation of beneficial alleles.");
}

The broad outline here is straightforward. We model two subpopulations, p1 and p2, which are
created in the generation 1 event and configured in the generation 1 late() event. That late()
event also sets up a new subpopulation, p3, that will be produced by admixing migrants from p1
and p2 in generation 2. At the end of generation 2, p1 and p2 are removed, and p3 is allowed to
evolve thenceforth until the termination of the model. This model is therefore a very simple model
of the admixture of two subpopulations; more complex population dynamics could easily be used
with the same ancestry-tracking mechanism shown here.
When p1 and p2 were configured, they were each set up to contain a beneficial mutation of
type m1, fixed within the respective subpopulation. The new p3 subpopulation therefore contains
some genomes that originated in p1 and contain that subpopulation’s beneficial mutation (located
at L * 0.2 on the chromosome), and some genomes that originated in p2 and contain that
subpopulation’s beneficial mutation (located at L * 0.8 on the chromosome). The expectation is
that recombination between those points will eventually produce at least one new genome that
contains both beneficial mutations, and that subsequently both beneficial mutations will sweep to
fixation in p3. The question we wish to investigate is the precise pattern of true local ancestry
along the chromosome at the point when fixation of both beneficial mutations has just occurred.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

212

To do this, the model adds a marker mutation of type m2 at every position along every genome
in p1, in the 1 late() event. This is done using a single vectorized call to addNewMutation(), for
greater efficiency; adding the mutations one by one would take a very long time, especially with a
longer chromosome. These added markers indicate that a given chromosome position traces its
ancestry to p1. In models with more complex demography involving more than two “parental”
subpopulations, additional mutation types could be employed to indicate derivation from each of
the possible ancestral subpopulations. An important point to understand is that because mutations
in SLiM “stack” by default, these marker mutations have no effect whatsoever on the other
mutations that might occur during a simulation; they are simply carried along, almost as if they
were epigenetic marks of some sort on the genomes being simulated. For simplicity, this model
does not include neutral mutations along the chromosome, and indeed it uses a mutation rate of 0;
but the ancestry-tracking mechanism shown here is compatible with any other model dynamics.
Thus configured, the model runs until both beneficial mutations either fix or are lost. This
termination condition is checked in the 2: late() event. When there are no longer any circulating
m1 mutations, the model outputs some final analysis and terminates. The final analysis here is in
two parts. The first part simply calculates the fraction of all chromosome positions in all genomes
that derives from p1. In one run of this model, the output happens to be:
Fraction with p1 ancestry: 0.398137

The second part calculates the fraction of all genomes derived from p1 at each individual
position along the chromosome, and outputs those fractions as a big vector:
Fraction with p1 ancestry at each chromosome position:
0.946 0.946 0.946 0.946 0.946 0.946 0.946 0.946 0.946 0.946 0.946 0.946...

It does that by looping through the genomes of p3, and for each genome getting the positions at
which m2 mutations exist using positionsOfMutationsOfType(). Those vectors of positions can be
converted into integer vectors that have a 1 at the positions where m2 mutations occurred (and a 0
everywhere else), using the integer() function of Eidos. Summing those vectors across all of the
p3 genomes gives a count of the number of m2 mutations at each position; dividing that by the
maximum possible count gives a frequency, which indicates the pattern of ancestry.
This pattern can also be seen in SLiMgui, with the neutral marker mutations shown in yellow:

This indicates that the left-hand half of the chromosome mostly derives from p1 – albeit with
some variation due to multiple recombination events that created some haplotype variation –
whereas the right-hand half of the chromosome derives from p2 in all individuals.
The recipe here tracks true local ancestry down to the level of individual base positions. This
does involve some overhead, in terms of both memory usage and processor power, so if this level
of granularity is not needed, it would be much more efficient to place marker mutations every 10,
100, or even 1000 positions along the chromosome. This could be done with a few trivial
modifications to the recipe presented here. However, this recipe can be scaled up to L = 1 Mbp
and still run in only a couple of minutes, so the overhead is really not too bad. For even larger
models, however, both the runtime and the memory usage of this recipe can become prohibitive;
at L = 108, the memory usage of this recipe is estimated to be over 8 TB (that is not a typo). Treesequence recording, added in SLiM 3, can provide a far more efficient alternative (see section 1.7
and chapter 16).
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

213

0.6
0.4
0.0

0.2

fractional ancestry

0.8

1.0

A final note: having done one run of the recipe and seen the pattern of ancestry in SLiMgui, it
would be natural to wonder what the average pattern of ancestry would be across many runs. That
is straightforward to do, by simply running the model many times, collecting the data from each
run, and averaging the ancestry pattern across the runs. A simple R script to do this is provided in
the online SLiM-Extras repository, in the “sublaunching” folder since it illustrates how to sublaunch
runs of SLiM from a script (in this case, an R script). The script is named aggregateLocalAncestry.R
(https://github.com/MesserLab/SLiM-Extras/blob/master/sublaunching/aggregateLocalAncestry.R). If
you have R installed you should be able to run it easily, after modifying the file paths in the script
to point to your installed slim and your copy of this recipe. If you do that, you will get a plot:

0

9999
chromosome position

The locations where the beneficial mutations were placed in p1 and p2 are shown with red
lines. The mean true local ancestry (the fraction of the genomes in p3 that derive from p1 at a
given chromosome position) is exactly 1.0 at the first beneficial mutation’s location, and exactly
0.0 at the second’s; this makes sense, since the model did not terminate until both beneficial
mutations had fixed, which would imply pure p1 or p2 ancestry at those positions. In between, the
local ancestry changes linearly from p1 to p2; it might seem plausible that the pattern would be
sigmoid instead, but apparently it is not. Toward the ends of the chromosomes, outside the
beneficial mutations, the ancestry falls toward 0.5; it’s hard to see here, but that fall-off is not
linear, in fact. This pattern presumably recapitulates known results from population genetics, but a
modified version of this model could explore much more complex scenarios.
13.10 A quantitative genetics model with heritability
Section 13.1 presented a recipe for a quantitative genetics model in which phenotype was
calculated as the additive effect of a set of QTLs, and fitness was determined by individual
phenotype compared to a phenotypic optimum. In this section we will extend this type of
quantitative genetics model to incorporate heritability, generated by adding random deviations
(representing environmental variance) to the additive genetic variation of the QTLs. Rather than
extending the model of section 13.1 directly, however, we will develop a somewhat different
model here. This recipe utilizes QTLs with effect sizes drawn from a continuous Gaussian
distribution, rather than the −1/+1 effect size distribution of section 13.1. The genetic structure
here, unlike the earlier recipe, is not predetermined; new QTLs can arise spontaneously at any
location. This recipe has only a single subpopulation, and does not contain assortative mating.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

214

The mechanics of this model are quite simple. Here is the entire model except the final output
event and the implementation of heritability:
initialize() {
initializeMutationRate(1e-6);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "n", 0.0, 1.0);
m2.convertToSubstitution = F;

// neutral
// QTL

initializeGenomicElementType("g1", c(m1, m2), c(1, 0.01));
initializeGenomicElement(g1, 0, 1e5 - 1);
initializeRecombinationRate(1e-8);
}
1 late() {
sim.addSubpop("p1", 1000);
}
1: late() {
// construct phenotypes from the additive effects of QTLs
inds = sim.subpopulations.individuals;
inds.tagF = inds.sumOfMutationsOfType(m2);
}
fitness(m2) {
return 1.0;
// QTLs are neutral; fitness effects are handled below
}
fitness(NULL) {
return 1.0 + dnorm(10.0 - individual.tagF, 0.0, 5.0); // optimum +10
}

Section 13.1 explained the mode of operation of this type of model in some detail. In brief,
QTLs are here represented by mutation type m2, and the selection coefficient of m2 mutations is
used as the additive genetic value of the QTL. A fitness(m2) callback makes all m2 mutations be
neutral regardless of these selection coefficients. A fitness(NULL) callback (a so-called “global
fitness() callback” because it evaluates overall individual fitness, rather than the fitness of a focal
mutation) returns a fitness value that depends upon the phenotype as compared to a phenotypic
optimum of 10.0, using a Gaussian function as in many such models.
Note that, as in section 13.1, a baseline value of 1.0 is added to the Gaussian function value
returned by dnorm(). Section 13.1 didn’t really discuss that choice in any detail, however; let’s
embark on that digression now. Fundamentally, the problem is that of trying to choose a function
that provides a reasonable phenotype-to-fitness map. There is not a lot of empirical data to guide
this choice in most systems, so it is a difficult and somewhat arbitrary decision. In a hard selection
model, where low mean population fitness results in a decrease in population size, an appropriate
fitness function might have a minimum value of zero, allowing the modeling of phenomena such
as lethal mutations and population extinction. In such a model, if most individuals have a fitness
of 0.001 and one has a fitness of 0.1 (having just received a beneficial mutation, for example),
most of the individuals will produce no offspring at all, while the individual with fitness of 0.1
might produce just one or two, leading to a very small population in the next generation. This
seems quite reasonable. The same situation in a soft selection model, where population size does
not depend upon mean population fitness, makes considerably less sense, however. In this
scenario, the child generation must be filled with the requisite number of individuals, so the
individual with fitness 0.1 will end up generating most of the offspring – perhaps hundreds or even
thousands of offspring – while all the rest of the individuals survive but generate few or no
offspring. For some organisms this might be realistic, but for many it would be nonsensical – no
matter how much more fit one elephant is than all of the other elephants in Africa, it is not going
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

215

to generate thousands of offspring all across Africa all by itself! In short, fitness in soft selection
models works differently than in hard selection models, and is often modeled more appropriately
with a fitness function that does not allow such large differences in relative fitness to exist, so that
one individual does not completely dominate the reproductive output of the population. Adding a
constant value to a Gaussian function is a simple way to achieve this; the larger the constant
added, the more the relative fitness values in the population are effectively homogenized. Using a
constant of 1.0 makes some superficial sense, since 1.0 represents neutrality and the Gaussian
function can then be thought of as being some marginal increase in fitness beyond neutrality.
However, this constant is actually arbitrary; after the rescaling that is implicit in the concept of
relative fitness, 1.0 will not be neutral anyway. The constant added should probably therefore be
considered to be a free parameter of the model, if a fitness function of this form is used.
To continue this digression a bit longer, it is also worth noting that if large differences in relative
fitness are allowed in a model, as with the situation above where most individuals have fitness
0.001 and one has fitness 0.1, this can lead to undesirable mate-choice dynamics, too. By default,
SLiM models are hermaphroditic, and while they employ biparental mating by default, they do not
prevent the same parent from being chosen as both parents in a biparental mating event, resulting
in hermaphroditic selfing. Normally this is not a problem, since for typical population sizes it is
unlikely to occur often enough to make a significant difference to the model’s results. In some
sense it is even desirable, since this behavior means that SLiM’s default configuration mirrors
simple analytical population genetics models more closely; this is the reason why SLiM does not
prevent it by default. However, when there is large variance in fitness or when the effective
population size is very small, the chance of hermaphroditic selfing can become quite high; the
fitness 0.1 individual may be quite likely to be chosen as both parents of a given offspring, and it
may therefore generate most of its offspring through selfing. In that case it may be desirable to
suppress it. See section 12.4 for further discussion of this issue and how to deal with it in SLiM.
Returning from that digression: In this model, phenotypes are calculated as the sum of the
effects of all of the QTLs possessed by each individual, and are stored in the tagF property of
individuals for use within the global fitness() callback. The core calculation is done here by the
sumOfMutationsOfType() method, which loops over all of the mutations in each individual and
sums up the selection coefficients of all of the mutations of type m2, which represent the QTLs
possessed by the individual. The sum of the selection coefficients of those mutations represents the
total of the additive effects of the QTLs. These result values – calculated phenotypes – are assigned
into the tagF property of the individuals. Note that this implementation produces a codominant
model; each QTL allele possessed has the same additive effect. It would be possible to implement
a dominant QTL model; it would require replacing the sumOfMutationsOfType() call with method
calls to get the mutations possessed by each individual with mutationsOfType(), unique them with
unique(), get their selection coefficients, and total them up with sum(). This can be done in one
line for the whole vector of individuals, using sapply(), but is not shown here.
QTLs are about 1% of all new mutations, and thus new QTLs arise spontaneously throughout
the run. If they tend to improve the phenotypes of individuals in the population, then they tend to
be retained; if not, they tend to be lost. The population as a whole begins with a mean phenotype
of 0.0, since no QTLs exist. Over the model run, the population will execute an adaptive walk
towards the fitness peak at +10.0. The only other component we need for the base model is an
event that checks for the termination condition (arrival at the fitness peak) and prints output:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

216

1:100000 late() {
if (sim.generation == 1)
cat("Mean phenotype:\n");
meanPhenotype = mean(p1.individuals.tagF);
cat(format("%.2f", meanPhenotype));
// Run until we reach the fitness peak
if (abs(meanPhenotype - 10.0) > 0.1)
{
cat(", ");
return;
}
cat("\n\n-------------------------------\n");
cat("QTLs at generation " + sim.generation + ":\n\n");
qtls = sim.mutationsOfType(m2);
f = sim.mutationFrequencies(NULL, qtls);
s = qtls.selectionCoeff;
p = qtls.position;
o = qtls.originGeneration;
indices = order(f, F);
for (i in indices)
cat("
" + p[i] + ": s = " + s[i] + ", f == " + f[i] +
", o == " + o[i] + "\n");
sim.simulationFinished();
}

This event runs in every generation. It prints the mean phenotype in each generation, forming a
comma-separated list that can be easily copied into an environment such as R for plotting. When
the fitness peak is reached, within a small tolerance, it prints a list of all of the QTLs in the
population and terminates. The QTL list contains the position, effect size, frequency, and origin
generation of each QTL, and is sorted by frequency using the order() function.
A run of the model produces output like this:
Mean phenotype:
0.00, -0.00, -0.00, -0.01, -0.00, -0.00, -0.01, -0.01, -0.01, -0.01, ...,
9.77, 9.77, 9.82, 9.81, 9.83, 9.83, 9.84, 9.85, 9.85, 9.82, 9.89, 9.91
------------------------------QTLs at generation 1793:
45907:
98721:
53961:
21414:
93840:
74095:
...

s
s
s
s
s
s

=
=
=
=
=
=

1.28919, f == 1, o == 489
2.78947, f == 1, o == 257
1.59969, f == 0.592, o == 1274
-1.09245, f == 0.0515, o == 1639
-0.338536, f == 0.0345, o == 1657
1.35625, f == 0.026, o == 1762

Plotting the mean phenotype time series in R produces a visualization of the adaptive walk:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

217

10
8
6
4
2
0

Mean phenotype

0

500

1000

1500

2000

Generation
The black curve shows the mean phenotype over time; the three red lines show the generations
of origin of the three high-frequency QTLs listed in the output above. This visualization shows that
the adaptive walk involved a joint sweep by the first two QTLs, followed by a second sweep by the
third QTL, which arose later. At the end of the run, the population has reached the adaptive peak
only on average, since the third QTL is present at a frequency of only 0.592. Individuals that do
not possess this QTL at all have a phenotype of ~8.16; those with only one copy have a phenotype
of ~9.76; and those with two copies have a phenotype of ~11.36. In the final state of the model,
heterozygote advantage thus produces balancing selection upon the third QTL.
So far so good; but earlier it was promised that this model would incorporate heritability as
well. Let’s add that now, beginning with the addition of a line at the top of the initialize()
callback that sets up a constant representing the desired heritability, h2:
defineConstant("h2", 0.1);

And then, here is a complete replacement for the 1: late() event that calculates phenotypes:
1: late() {
// construct phenotypes from the additive effects of QTLs
inds = sim.subpopulations.individuals;
tags = inds.sumOfMutationsOfType(m2);
// add in environmental variance, according to the target heritability
V_A = sd(tags)^2;
V_E = (V_A - h2 * V_A) / h2;
// from h2 == V_A / (V_A + V_E)
env = rnorm(size(inds), 0.0, sqrt(V_E));
// set phenotypes
inds.tagF = tags + env;
}

This replacement event calculates phenotypes in much the same way as the original, but with
the addition of random noise representing environmental variance. The desired environmental
variance is calculated based upon the additive genetic variance and the heritability, following
standard quantitative genetics, and rnorm() is used to generate random values with
(approximately) the desired variance.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

218

The model above produces somewhat different results than the original model without
heritability. The most obvious difference is that the plot of mean phenotype over time is noisier,
because the mean phenotype itself is noisier. Other effects of the heritability on the evolutionary
trajectory are more difficult to see in a single run; statistical analysis of a large number of runs
would be needed to draw firm conclusions.
We have often used the term “phenotype” here, rather than just referring to, e.g., the breeding
value of the quantitative trait. This is deliberate, because this model really is a model of selection
on the organism phenotype. Here the phenotype is determined by just a single quantitative trait,
but it would be quite simple to extend the model to encompass multiple traits that all influenced
the organism’s phenotype in different ways. The 1: late() event here encapsulates the genotypeto-phenotype map, and can be broadened to any such map desired, for any number of traits. In
this model, the global fitness() callback defines the phenotype-to-fitness map according to the
Gaussian fitness function used there. In a model in which phenotype encompassed multiple traits,
the definition of the phenotype-to-fitness map would probably move to the 1: late() event as
well, and the final fitness effect of the whole phenotype would be stored in tagF; the global
fitness() function would then simply return the individual.tagF fitness value previously stored.
It should be noted that this model demonstrates just one possible way of adding environmental
variance to influence heritability. The method used here does not change the magnitude of the
additive genetic variance, and so the overall phenotypic variance will be larger with lower
heritability. Alternatively, one could scale the additive genetic variance down so that the
phenotypic variance remains constant regardless of the heritability value chosen. Both methods
would achieve the same target heritability value, but with different overall phenotypic variance.
In fact, trying to attain a target heritability value, as done here, is rather artificial to begin with; it
is common in analytical quantitative genetics models, but from an individual-based perspective
such as that of SLiM, the heritability is properly regarded as an emergent property of the model,
not a value the model should impose. From that perspective, the better approach would be to
simply add in environmental variation of a particular magnitude (determined from empirical data,
perhaps). The magnitude of the environmental variance, rather than the heritability, would be the
model parameter, and the heritability would be a consequence of the model’s dynamics. The code
above can of course be easily modified to implement such a scheme instead.
It should also be noted that in this model the heritability being modeled is both the narrowsense heritability h2 and the broad-sense heritability H2; they are the same here, because VA is
equal to VG, which is true because the only source of genetic variance is the additive effects of the
QTLs. If that were no longer the case (due to epistasis, maternal effects, dominance, etc.), then the
model might need to be modified to correctly implement the particular type of heritability desired;
that gets into quantitative genetics theory that we will not explore further here.
The heritability algorithm used here is adapted from code kindly provided by Mikhail Matz.
13.11 Live plotting with R using system()
The previous section presented a final analysis of the results from its recipe using a plot
generated in R after the model had finished. SLiMgui provides some built-in plots, as we saw in
chapter 8, but it would be awfully nice to be able to make custom plots through R’s extensive
plotting facilities within a running SLiM model – and even better, to have those plots update live as
the model runs in SLiMgui. That is the goal of this section’s recipe.
To achieve this goal, we will use several tools we haven’t encountered before now. Most
important is the Eidos system() function, which allows any Un*x command to be executed; we
will use it to sublaunch R processes that will generate plots for us. We will also use the Eidos
writeTempFile() function, which facilitates the generation of temporary files with unique, nonTOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

219

colliding filenames; we will use it to create both the R script we will execute and the PDF plot file
we will display. Let’s launch into it:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}

Just a vanilla neutral model setup, nothing surprising. Next is the generation 1 setup:
1 {
sim.addSubpop("p1", 5000);
sim.setValue("fixed", NULL);
defineConstant("pdfPath", writeTempFile("plot_", ".pdf", ""));
// If we're running in SLiMgui, open a plot window
if (exists("slimgui"))
slimgui.openDocument(pdfPath);
}
50000 late() { sim.outputFixedMutations(); }

In generation 1, we create a new subpopulation as usual. We also start a new value kept by the
simulation, named fixed; this will keep a record of the number of fixed mutations over time as the
simulation runs, and that is what we will plot.
Next we call writeTempFile() to make a temporary file with a filename that begins with
"plot_" and ends with ".pdf"; several characters in between will be randomly generated by
writeTempFile() to create a unique filename that is not presently used in the /tmp/ directory
where the file will be placed. This call returns the filesystem path to the file it creates, and we
define that as a constant named pdfPath for future use.
The last couple of lines execute only if the model is running in SLiMgui; when running at the
command line, the slimgui object does not exist. When running in SLiMgui, on the other hand,
slimgui is a global singleton object (of Eidos class SLiMgui) that represents the SLiMgui application
itself, and can be used to control SLiMgui from a model’s script (see section 21.11). If the slimgui
object exists, as tested by the Eidos function exists(), then we are indeed running in SLiMgui, in
which case we call the openDocument() method of slimgui with the path to the PDF file we have
just created. A new window should open in SLiMgui as a result; initially it will be blank since the
PDF file is empty, but it will update automatically whenever the PDF file changes.
Since we are running in SLiMgui, we can rely upon it for the automatically updating PDF
display facility used here. Note that users running SLiM at the command line in Linux could
presumably do something very similar to open the plot file in the PDF preview app of their choice,
and get live plotting even without running SLiMgui. This could probably be done, for example,
with a call to the Eidos function system() to request the operating system to open the PDF file in
the user’s preferred application; on macOS a command named open exists to perform such tasks,
so presumably something similar exists on Linux. (If a Linux user reads this and figures out how to
do this in a general way, please let us know and we’ll document it here). Note, however, that
many PDF display programs do not notice when the file changes, and will not automatically
redisplay it.
Finally, we have a generation 50000 event that ends the simulation. All we need in addition to
that code is an event to tabulate results and generate plots:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

220

1: {
if (sim.generation % 10 == 0)
{
count = sim.substitutions.size();
sim.setValue("fixed", c(sim.getValue("fixed"), count));
}
if (sim.generation % 1000 != 0)
return;
y = sim.getValue("fixed");
rstr = paste(c('{',
'x <- (1:' + size(y) + ') * 10',
'y <- c(' + paste(y, sep=", ") + ')',
'quartz(width=4, height=4, type="pdf", file="' + pdfPath + '")',
'par(mar=c(4.0, 4.0, 1.5, 1.5))',
'plot(x=x, y=y, xlim=c(0, 50000), ylim=c(0, 500), type="l",',
'xlab="Generation", ylab="Fixed mutations", cex.axis=0.95,',
'cex.lab=1.2, mgp=c(2.5, 0.7, 0), col="red", lwd=2,',
'xaxp=c(0, 50000, 2))',
'box()',
'dev.off()',
'}'), sep="\n");
scriptPath = writeTempFile("plot_", ".R", rstr);
system("/usr/local/bin/Rscript", args=scriptPath);
}

First of all, every tenth generation we add the current count of fixed mutations to the value
we’re using to tabulate those results, using getValue() and setValue() to update the saved value.
Second, every 1000th generation we generate a new plot. First, we get the current tabulation of
counts using getValue(). Then we assemble an R script into rstr using a rather complicated
command that spans twelve lines. A call to c() is used to piece together a vector of script lines,
some of which are simple strings, others of which are themselves assembled using the + operator to
concatenate strings and so forth. The vector of script lines is then pasted together using a newline
character as the separator, to make a more or less readable R script (you could add a print() call to
see the final result if you wish). We write that script to a temporary file using writeTempFile(), and
then we call system() to request that Rscript should run our script. (Rscript is a command-line
utility for running R scripts, typically installed as part of a standard R installation; see the R
documentation for more information on it.)
That is all that is needed. SLiMgui will automatically notice that the PDF file has been
overwritten with new data, and will redisplay it more or less continuously (there is a short lag
because of delays involved in filesystem notifications and other factors, but typically less than a
second). At the end of a typical run, the plot window in SLiMgui shows something like this:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

221

500
400
300
200
100
0

Fixed mutations

0

25000

50000

Generation

The plotting code that we sent to R included niceties like axis labels, font sizes, and line colors,
so the final plot is reasonably nice-looking. Beyond aesthetics, we can see a few interesting things
in this plot. First of all, there is a long delay – perhaps 15000 generations – before any mutations
fix at all. This is because the chromosome starts empty, in this model, and it takes some time to
accumulate neutral diversity and have it drift to fixation. This is why it is usually important to run
models for a burn-in period before collecting data, of course; the early state of a model is usually
far from equilibrium. Second, even after mutations start to fix the line is quite jagged; there are
long periods without any fixation, and then sudden jumps when many mutations fix
simultaneously. This is because whole haplotypes representing many individual mutations often fix
as a single unit, followed by periods in which competing haplotypes are drifting up and down in
frequency without fixation; these dynamics are also very visible in SLiMgui’s chromosome view.
Apart from a little bit of hassle involved in assembling an R script as an Eidos string, this recipe
is quite short and simple considering that it is continuously writing new script files and then
sublaunching R processes to generate PDF plots! This example is quite simple, but there is no limit
to the complexity possible here; arbitrary plots could be made, multiple plots could be generated
simultaneously, and other processing, such as statistical analysis, could be done in R.
13.12 Modeling nucleotides at a locus
SLiM provides no intrinsic support for explicitly modeling the nucleotide sequence along the
chromosome. In general, such models are conceptually quite different from SLiM; they tend to be
concerned with the probability that each ATGC base will mutate into each other type of base, and
use Markov models and similar constructs to model changes to the nucleotide sequence as
realistically as possible, for the purpose of, for example, accurately calculating divergence time
using molecular data, or calculating a maximum likelihood phylogeny given sequence data. They
also often concern themselves with things like codons and amino acids, codon degeneracy,
synonymous and nonsynonymous mutations, stop codons, nonsense mutations, and so forth, for
example to model the evolution of protein sequences. None of those concepts is really within
SLiM’s domain, and although in principle they could all be modeled in script, at some point the
question arises as to whether, for driving in a screw, a better tool than a hammer might exist.
Nevertheless, in some cases it is desirable to model in SLiM the possibility that a given base
position can take on exactly one of four distinct values, representing nucleotides, while ignoring
the aforementioned issues involved in more realistic modeling of nucleotide sequences. This
could be useful if, for example, it is important that a given base position can possess only four
possible discrete fitness effects – that there be just four distinct alleles at that location. It could also
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

222

be important for models in which back-mutation is important, such that a new mutation that arises
at a given location leads to the previous allelic state with a one-in-three chance.
This is possible to model in SLiM, using a little bit of scripting to modify SLiM’s default behavior.
Here we will look at one strategy for doing so.
To begin with, here is the initialize() callback we will use to set up the simulation:
initialize() {
defineConstant("C", 10);

// number of loci

// Create our loci
for (locus in 0:(C-1))
{
// Effects for the nucleotides ATGC are drawn from a normal DFE
effects = rnorm(4, mean=0, sd=0.05);
// Each locus is set up with its own mutType and geType
mtA = initializeMutationType(locus*4 + 0, 0.5, "f", effects[0]);
mtT = initializeMutationType(locus*4 + 1, 0.5, "f", effects[1]);
mtG = initializeMutationType(locus*4 + 2, 0.5, "f", effects[2]);
mtC = initializeMutationType(locus*4 + 3, 0.5, "f", effects[3]);
mt = c(mtA, mtT, mtG, mtC);
geType = initializeGenomicElementType(locus, mt, c(1,1,1,1));
initializeGenomicElement(geType, locus, locus);
// We do not want mutations to stack or fix
mt.mutationStackPolicy = "l";
mt.mutationStackGroup = -1;
mt.convertToSubstitution = F;
// Each mutation type knows the nucleotide it represents
mtA.setValue("nucleotide", "A");
mtT.setValue("nucleotide", "T");
mtG.setValue("nucleotide", "G");
mtC.setValue("nucleotide", "C");
}
initializeMutationRate(1e-6);
// includes 25% identity mutations
initializeRecombinationRate(1e-8);
}

First of all, note that the length of the chromosome is a defined constant, C, but here we will
work with just 10 bases just to keep the output simple. This recipe will work for longer
chromosomes too, although there is a price paid in speed and memory usage.
The initialization code loops over the loci, from 0 to C-1, and sets up each locus with its own
mutation types, genomic element type, and genomic element. Each locus will have four
nucleotides, ATGC, each of which has a random fitness effect; those effects are drawn from a
normal distribution using rnorm(). The code creates a mutation type for each nucleotide type,
each using one of the fixed selection coefficients drawn. In other words, each nucleotide type gets
its own mutation type at each locus; since this recipe contains 10 loci, there will be 40 mutation
types in all. A genomic element type is created using those four mutation types in equal
proportions (different proportions could be used if mutational bias is desired), and a genomic
element at the locus in question is created using that genomic element type.
We want new mutations to replace the old mutation at a locus, since that is how nucleotides
work (a base position is never two nucleotides at the same time), so we set mutationStackPolicy
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

223

and mutationStackGroup accordingly. By setting all of the mutation types to the same
mutationStackGroup, nucleotides will all replace each other; there will be no mutation stacking in
this model. We never want these mutation types to fix – this model will never allow a locus to be
empty (as normally occurs in SLiM), since a base position is always either A,T, G, or C – so we set
convertToSubstitution to F. Finally, we use setValue() to save each mutation type’s nucleotide
under the name "nucleotide" for later reference; this is only for purposes of output (note that it
would be more memory-efficient to use integer values of the tag property of the mutation types to
represent the nucleotides instead, but would be less clear for presentation here).
Next, the initialize() callback sets the mutation rate. Note that in this model a nucleotide
can mutate into itself; an A can mutate to become an A, a T to become a T, etc. This is not
prohibited because the mutation-generation machinery in SLiM generates new mutations without
reference to what mutation might already exist at that locus. The mutation rate specified for this
model is therefore higher than the “real” mutation rate will be; approximately 25% of mutations
will be no-ops that do not change the existing nucleotide.
Finally, the recombination rate is set. This recipe models a linked string of ten nucleotides; to
model ten unlinked loci instead, a recombination rate of 0.5 could be used.
Next, we need to create an initial population:
1 late() {
sim.addSubpop("p1", 2000);
// The initial population is fixed for a single wild-type
// nucleotide fixed at each locus in the chromosome
geTypes = sim.chromosome.genomicElements.genomicElementType;
mutTypes = sapply(geTypes, "sample(applyValue.mutationTypes, 1,
weights=applyValue.mutationFractions);");
p1.genomes.addNewDrawnMutation(mutTypes, 0:(C-1));
cat("Initial nucleotide sequence:");
cat(" " + paste(mutTypes.getValue("nucleotide")) + "\n\n");
}

The first step is, as usual, a call to addSubpop(). Then we need to set up each locus with an
initial nucleotide. To do that, we first get a vector of the genomic element types for each genomic
element (and thus for each base position, since we have one genomic element per base position).
These genomic element types determine the candidate mutation types for each position, so we can
use sapply() to draw one of the mutation types from each genomic element type, according to the
probabilities specified by the genomic element type; this gives us a vector of mutation types to use
for each base position. Finally, we add all of the new mutations with a single vectorized call to
addNewDrawnMutation(). This design is far faster than adding the new mutations one by one in a
loop. In short, this code chooses a random nucleotide sequence; it then prints out the chosen
sequence using cat(). For example, this code might output:
Initial nucleotide sequence: C G T T A A G G A G

Next, we need to deal with a small issue in our scheme. Because a mutation to the same
nucleotide at the same position can happen multiple times independently, even within one
generation, multiple mutations representing the same nucleotide can be circulating in the
population; a locus might be fixed for G, for example, but that might be represented in SLiM by
82% of genomes having one G mutation, 17% having another, and 1% having a third G mutation,
all with the same selection coefficient and mutation type. This gives SLiM itself no difficulties, but
depending upon the assumptions made in the rest of the model’s script, it could be undesirable, at
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

224

least for purposes of output. This next chunk of code is optional, but many models will want to
include it for this reasons. It looks for such duplicates and fixes them, in each generation:
2: late() {
// optionally, we can unique new mutations onto existing mutations
// this runs only in 2: - it is assumed the gen. 1 setup is uniqued
allMuts = sim.mutations;
newMuts = allMuts[allMuts.originGeneration == sim.generation];
if (size(newMuts))
{
genomes = sim.subpopulations.genomes;
oldMuts = allMuts[allMuts.originGeneration != sim.generation];
oldMutsPositions = oldMuts.position;
newMutsPositions = newMuts.position;
uniquePositions = unique(newMutsPositions, preserveOrder=F);
overlappingMuts = (size(newMutsPositions) != size(uniquePositions));
for (newMut in newMuts)
{
newMutLocus = newMut.position;
newMutType = newMut.mutationType;
oldLocus = oldMuts[oldMutsPositions == newMutLocus];
oldMatched = oldLocus[oldLocus.mutationType == newMutType];
if (size(oldMatched) == 1)
{
// We found a match; this nucleotide already exists, substitute
containing = genomes[genomes.containsMutations(newMut)];
containing.removeMutations(newMut);
containing.addMutations(oldMatched);
}
else if (overlappingMuts)
{
// First instance; it is now the standard reference mutation
oldMuts = c(oldMuts, newMut);
oldMutsPositions = c(oldMutsPositions, newMutLocus);
}
}
}
}

This gets all of the new mutations in the simulation, using originGeneration. It then gathers
information about the pre-existing mutations, the new mutations, and their positions. Then, for
each new mutation, it determines whether there is a pre-existing mutation with the same position
and type; if so, it substitutes the pre-existing mutation for the new mutation in every genome.
The only twist is that when there is no pre-existing mutation, the new mutation needs to
become the new “canonical” mutation for its position and type, and so it needs to be added to the
vector of pre-existing mutations. This only really needs to be done if more than one new mutation
exists at the same position, however, as it will only matter if a second new mutation comes along,
in the same generation, that matches the first; in that case, the first needs to be used as the preexisting mutation to replace the second. Detecting the necessity of this case is the purpose of the
overlappingMuts flag; it allows us to skip updating the list of pre-existing mutations in most cases.
(If the length of the chromosome, C, is increased to a large number like 10000 or more, this
optimization actually makes a substantial difference to the runtime of the recipe.)
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

225

There is only one more step, which is to output something about the final state of the model:
10000 late() {
muts = p1.genomes.mutations;

// all mutations, no uniquing

for (locus in 0:(C-1))
{
locusMuts = muts[muts.position == locus];
totalMuts = size(locusMuts);
uniqueMuts = unique(locusMuts);
catn("Base position " + locus + ":");
for (mut in uniqueMuts)
{
// figure out which nucleotide mut represents
mutType = mut.mutationType;
nucleotide = mutType.getValue("nucleotide");
cat("
" + nucleotide + ": ");
nucCount = sum(locusMuts == mut);
nucPercent = format("%0.1f%%", (nucCount / totalMuts) * 100);
cat(nucCount + " / " + totalMuts + " (" + nucPercent + ")");
cat(", s == " + mut.selectionCoeff + "\n");
}
}
}

This code loops through the loci and prints out the fraction (if any) of each nucleotide at that
locus. For example, it might print:
Base position 0:
C: 4000 / 4000 (100.0%), s == -0.0918891
Base position 1:
T: 4000 / 4000 (100.0%), s == 0.0157156
Base position 2:
T: 4000 / 4000 (100.0%), s == 0.0301955
Base position 3:
T: 4000 / 4000 (100.0%), s == 0.0325203
Base position 4:
C: 4000 / 4000 (100.0%), s == -0.0183944
Base position 5:
A: 4000 / 4000 (100.0%), s == 0.0376738
Base position 6:
G: 4000 / 4000 (100.0%), s == -0.0226032
Base position 7:
G: 3103 / 4000 (77.6%), s == 0.0413104
C: 897 / 4000 (22.4%), s == 0.0477822
Base position 8:
A: 4000 / 4000 (100.0%), s == -0.0814816
Base position 9:
C: 4000 / 4000 (100.0%), s == 0.0666566

This shows us that position 7 appears to be caught in mid-sweep, with a superior C nucleotide
replacing the existing G nucleotide. All of the other loci are fixed for one nucleotide. We can
compare this output to the initial output:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

226

Initial nucleotide sequence: C G T T A A G G A G

The comparison indicates that the model has undergone complete substitution at positions 1, 4,
and 9, in addition to the ongoing sweep at position 7 which may or may not complete.
This design allows each nucleotide to have its own dominance coefficient, providing for a fair
amount of flexibility in the evaluation of fitness effects. If more complex dominance interactions
are needed, in which each possible pair of homologous nucleotides could have a different
arbitrary fitness value, that could be implemented by first making the mutation types intrinsically
neutral, and then writing a global fitness(NULL) callback that examines the nucleotides at the
locus in question and returns the overall fitness effect.
On the flip side, a purely neutral model of nucleotide evolution would not need separate
mutation types for each locus at all; just four mutation types, representing A, T, G, and C, could be
used for the whole model. This is a very easy modification of the recipe, and should be
considerably more memory-efficient. A hybrid model in which neutral nucleotides are represented
by four standard ATGC mutation types, while non-neutral nucleotides get their own mutation
types, should also be possible; or setSelectionCoefficient() could probably be used.
As mentioned above, modeling explicit nucleotides could probably be done in a variety of
ways. This recipe should be fairly efficient, provides quite a bit of flexibility in fitness assessment,
and makes no assumptions regarding the fitness effects of nucleotides; it works just as well
modeling neutral drift of nucleotides, in fact. In models of explicit nucleotides such as this,
fundamentally there has to be some way of telling what nucleotide a given mutation represents.
This recipe uses the mutation type to make that distinction, which is ultimately why there are four
mutation types per locus. It is tempting to try to differentiate nucleotides using the tag property of
mutations instead, setting the initial tag value of new mutations in a late() callback similar to the
2: callback shown here. However, the present recipe should be suitable for most purposes.
13.13 Modeling haploid organisms
SLiM models diploid individuals that contain two haploid genomes; this is, at present, a design
constraint in SLiM that cannot be modified. However, it is still possible to model haploids in SLiM
quite easily with scripting, by effectively suppressing one of the two haploid genomes of each
diploid individual. Section 13.6’s recipe, which shows how to model X and Y chromosomes with
a pseudo-autosomal region, shares some aspects of its design with the recipe shown here, but
modeling haploids is actually much simpler. Similar techniques could be used to model
mitochondrial DNA; to model systems such as haplodiploidy; and to model “alternation of
generations”, in which alternate generations in the model are diploid sporophytes and haploid
gametophytes (although in many cases that might be more easily modeled using a modifyChild()
callback to simulate selection during the haploid life stage). Here, however, we will stick with a
simple model of haploids that reproduce clonally. This model is simple enough that we will just
present it in its entirety, rather than building it incrementally:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 1.0, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(0);
}
1 {
sim.addSubpop("p1", 500);
p1.setCloningRate(1.0);
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

227

late() {
// remove any new mutations added to the disabled diploid genomes
sim.subpopulations.individuals.genome2.removeMutations();
// remove mutations in the haploid genomes that have fixed
muts = sim.mutationsOfType(m1);
freqs = sim.mutationFrequencies(NULL, muts);
sim.subpopulations.genomes.removeMutations(muts[freqs == 0.5], T);
}
200000 late() {
sim.outputFixedMutations();
}

We begin with a typical initialize() callback, except that the recombination rate is set to
zero since there is no recombination in this clonal model. Note also that although this is a neutral
model for simplicity, the dominance coefficient is set to 1.0 on mutation type m1 as a reminder;
since mutations will never be homozygous in this model, dominance coefficients should always be
1.0 for conceptual clarity. Next we add a new subpopulation in generation 1 as usual; but in
addition, we set the subpopulation to reproduce clonally, as befits haploids (but see section 15.14).
At the end of the model we output fixed mutations, as a placeholder for whatever output would be
desired. The interesting model mechanics occur in between, with the late() callback.
The first part of the late() callback removes mutations in the second haploid genome of each
new child generated. SLiM doesn’t know that we’re modeling haploids, and thus not using the
second genomes, so it adds new mutations to them as usual during offspring generation. This call
removes them again in order to keep all of the second genomes of individuals empty.
The second part of the late() callback solves another problem: removing fixed mutations.
SLiM automatically converts fixed mutations into Substitution objects and removes them from
the simulation; however, it defines fixation as occurring when the frequency of a mutation reaches
1.0. In this haploid model, mutations fix when they reach a frequency of 0.5, because of the
empty second genomes. We therefore need to remove mutations manually, rather than relying on
SLiM’s built-in machinery. This callback achieves that, passing T for the substitute parameter of
removeMutations() so that Substitution objects are created for the fixed mutations as usual.
Those are the only overrides needed in script to produce a model of haploids. The mechanisms
used here to enforce haploidy, such as removal of mutations on the unused chromosome and
substitution of fixed mutations at a frequency of 0.5, could also be used to make just one segment
of the simulated chromosome haploid. This would allow mitochondrial DNA and autosomal DNA
to be jointly simulated in SLiM, for example. Some modifications to the recipe above would be
needed to achieve this; you would probably want to use a sexual autosomal model with biparental
mating, add code to the modifyChild() callback to enforce maternal inheritance of the
mitochondrial DNA (i.e., to block any proposed child that received its mitochondrial DNA from
the father, or both parents, or neither), and use a recombination map to prevent recombination in
the mitochondrial portion of the genome, for example. Section 13.6 provides an example of a
scenario that is in many ways parallel to that. Other ploidy schemes, such as haplodiploidy and
alternation of generations, should also be implementable with modifications of these basic ideas.
For an alternative model of haploid organisms that is more compatible with tree-sequence
recording (chapter 16), but which requires the nonWF model type, see sections 15.13 and 15.14.
13.14 Using mutation rate variation to model varying functional density
Beginning with SLiM 2.5, it is possible to set up a mutation-rate map that varies the mutation
rate along the length of the chromosome, similarly to setting up a recombination-rate map as
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

228

already supported by SLiM; see sections 6.1.1 and 6.1.2, for example. This feature can, of course,
be used for the obvious purpose of configuring mutational “hot” and “cold” spots along the
chromosome, and the code for doing so would look much like the code for those recipes. Here,
we will instead explore a less obvious use: modeling positional variation in functional density.
We know that different regions of chromosomes often have higher or lower functional density,
as a consequence of variation in the density of genes and the importance of those genes.
Regardless of the literal appropriateness of the term “junk DNA”, it is clear that large chromosomal
regions exist that are non-coding, and mutations in these regions generally appear to have
relatively little effect on fitness compared to mutations in coding regions. This is quite orthogonal
to mutational “hot” and “cold” spots; the variation we are concerned with here is not in the
mutation rate per se, but in the rate at which mutations actually influence fitness.
Prior to SLiM 2.5, modeling this would have required that one define a complex map of
genomic elements along the chromosome, all experiencing the same mutation rate (since that
could not be varied), but using a different fraction of neutral mutations to deleterious mutations
(and/or beneficial mutations, but for simplicity we will here focus on deleterious mutations). This
design would have been necessary in order to achieve the desired variation in the rate of
deleterious mutations, even if one was not actually interested in the neutral mutations at all. Such
a model would run much more slowly than necessary because of the neutral mutation overhead.
By defining a mutation-rate map, however, it is now straightforward to build a model in which
functional density varies along the chromosome, without having to model the neutral mutations. A
proof-of-concept model for this is so simple as to be trivial:
initialize() {
initializeMutationType("m1", 0.5, "f", -0.01);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);

// deleterious

// Use the mutation rate map to vary functional density
ends = c(20000, 30000, 70000, 90000, 99999);
densities = c(1e-9, 2e-8, 1e-9, 5e-8, 1e-9);
initializeMutationRate(densities, ends);
}
1 {
sim.addSubpop("p1", 500);
}
200000 late() { sim.outputFixedMutations(); }

The variables ends and densities are set up to encode the desired mutation-rate map, with the
end position for each chromosomal range and the effective functional density – the rate of
mutations having a deleterious effect, in this model – of that chromosomal range. As the recipes of
sections 6.1.1 and 6.1.2 illustrate, such maps can easily be generated randomly based upon
empirical metrics, or can even be read in from a file with empirical map data; in this model, to
keep things simple, we instead just specify a very simple map with five regions of different
functional density. After the model has been initialized, clicking the R button in SLiMgui will
show the mutation-rate map (we have seen that button show the recombination-rate map before; it
shows whichever map has been defined, or both if both have been defined):

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

229

The highest rate is in the region from base position 70000 to 89999, and indeed quite a few
deleterious mutations can be seen in that region (but none at very high frequency; the parameters
used in this model mean that the deleterious mutations are eliminated fairly efficiently). There is a
somewhat less active region from 20000 to 30000, with a handful of mutations; the rest of the
chromosome has a lower functional density, and receives deleterious mutations relatively rarely.
In this recipe we work with only one mutation type, modeling deleterious mutations. It would
be possible to extend it to include several types of functional mutations, each with a different rate
of occurrence along the chromosome. In that case, the mutation-rate map would be set to encode
the sum of the rates for each of the mutation types in each chromosomal region, and genomic
elements would be used to partition mutations in each region into the correct fraction for each of
the functional mutation types. This would be somewhat more complex than this recipe, but
manageably so; and it would again be much more efficient than using a constant mutation rate
along the chromosome together with a varying fraction of neutral mutations, since modeling all of
the neutral mutations could again be avoided.
13.15 Modeling microsatellites
A microsatellite (also known as a short tandem repeat or simple sequence repeat) is a
chromosomal region in which a specific nucleotide sequence repeats, often many times.
Microsatellites are important in many areas of applied genetics, from kinship analysis to forensics,
and are also often used to assess the similarity among individuals or subpopulations in ecology
and evolutionary biology. It is thus useful to be able to include them in evolutionary models.
Since SLiM does not model actual nucleotides, explicitly modeling the repeated nucleotide
sequence of a microsatellite doesn’t fit well into its conceptual model; explicitly modeling
mutations that change the number of repeats in a microsatellite would also change the length of
the chromosome, which is not allowed in SLiM. Nevertheless, it is possible to model microsats
more abstractly in SLiM, using mutations that conceptually represent a microsat with a particular
number of repeats at a given locus. This recipe will illustrate one simple approach to this.
We will build this model step by step. First, the model initialization:
initialize() {
defineConstant("L", 1e6);
defineConstant("msatCount", 10);
defineConstant("msatMu", 0.0001);
defineConstant("msatUnique", F);

//
//
//
//

chromosome length
number of microsats
mutation rate per microsat
T = unique msats, F = lineages

initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, L-1);
initializeRecombinationRate(1e-8);

// neutral

// microsatellite mutation type; also neutral, but magenta
initializeMutationType("m2", 0.5, "f", 0.0);
m2.convertToSubstitution = F;
m2.color = "#900090";
}

We define the length of the chromosome (L), and several constants related to our microsats: the
number of microsats we will model, the mutation rate per microsat (per genome per generation),
and a flag that indicates whether to unique the microsats as the model runs (more on this below).
The rest sets up a simple neutral model structure, plus a mutation type, m2, that will represent our
microsatellites. We prevent microsatellites from fixing, with convertToSubstitution, since they
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

230

are a permanent feature of the chromosomal structure. We also set them to display in SLiMgui
using a shade of magenta, to set them apart from the other neutral mutations in the model.
Next we create a subpopulation with microsatellites:
1 late() {
sim.addSubpop("p1", 500);
// create
genomes =
positions
repeats =

some microsatellites at random positions
sim.subpopulations.genomes;
= rdunif(msatCount, 0, L-1);
rpois(msatCount, 20) + 5;

for (msatIndex in 0:(msatCount-1))
{
pos = positions[msatIndex];
mut = genomes.addNewDrawnMutation(m2, pos);
mut.tag = repeats[msatIndex];
}
// remember the microsat positions for later
defineConstant("msatPositions", positions);
}

We call addSubpop() to create the subpopulation, and then we create our microsatellites. Each
microsat is simply an m2 mutation at a given position; the number of repeats in a given microsat is
represented using the tag property of the mutation. This code draws the positions and repeats for
all of the microsats first, and then loops to create each microsat using that information. Finally, we
remember the positions of the microsats in a defined constant for later use.
This code makes the initial population homogenous: the number of repeats of a given microsat
is the same across all genomes. It would of course be possible to start with a non-homogenous
state instead. To avoid creating a new Mutation for every microsat in every genome, however, it
would be best to determine all of the genomes containing a given repeat count at a given position,
and then call addNewDrawnMutation() just once on that entire genome vector so as to create a
single Mutation object that is shared among all of those genomes. Alternatively, a uniquing
strategy could be employed, similar to what we will see below, such that the first microsatellite
created at a given position with a given number of repeats is instantiated with a new Mutation, and
then all subsequent microsats with the same number of repeats at the same position look up and
use that existing Mutation object. Such extensions are left as an exercise for the reader.
With the initial subpopulation state set up, let’s define an endpoint for the model:
10000 late() {
// print frequency information for each microsatellite site
all_msats = sim.mutationsOfType(m2);
for (pos in sort(msatPositions))
{
catn("Microsatellite at " + pos + ":");
msatsAtPos = all_msats[all_msats.position == pos];
for (msat in sortBy(msatsAtPos, "tag"))
catn("
variant with " + msat.tag + " repeats: " +
sim.mutationFrequencies(NULL, msat));
}
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

231

This output callback loops over the (sorted) positions of the microsats. For each microsat
position, it looks up all of the microsats that exist at that position, and outputs frequency counts for
each variant (sorted by repeat count).
We’ll see some example output below, but the model isn’t finished yet; let’s finish it first. The
remaining piece in the puzzle is for our microsatellites to mutate. Microsats mutate in an
interesting way: they add or remove repeats. Furthermore, the probability of this happening is
much higher than the usual mutation rate in most organisms – often as high as one in ten
thousand. We model all this with a special modifyChild() callback:
modifyChild() {
// mutate microsatellites with rate msatMu
for (genome in child.genomes)
{
mutCount = rpois(1, msatMu * msatCount);
if (mutCount)
{
mutSites = sample(msatPositions, mutCount);
msats = genome.mutationsOfType(m2);
for (mutSite in mutSites)
{
msat = msats[msats.position == mutSite];
repeats = msat.tag;
// modify the number of repeats by adding -1 or +1
repeats = repeats + (rdunif(1, 0, 1) * 2 - 1);
if (repeats < 5)
next;
// if we're uniquing microsats, do so now
if (msatUnique)
{
all_msats = sim.mutationsOfType(m2);
msatsAtSite = all_msats[all_msats.position == mutSite];
matchingMut = msatsAtSite[msatsAtSite.tag == repeats];
if (matchingMut.size() == 1)
{
genome.removeMutations(msat);
genome.addMutations(matchingMut);
next;
}
}
// make a new mutation with the new repeat count
genome.removeMutations(msat);
msat = genome.addNewDrawnMutation(m2, mutSite);
msat.tag = repeats;
}
}
}
return T;
}

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

232

There’s a lot to parse there; let’s take it step by step. First of all, the microsats in each genome in
the proposed child mutate independently, so we loop over them and mutate them separately. For
each genome, we draw the number of mutations that occur from a Poisson distribution; this is
faster than doing a separate random draw to determine the fate of each individual microsat, but is
otherwise equivalent. If the number of microsat mutations is zero, we’re done; we loop back to
handle the other genome, and then we’re done and return T. When the number of microsat
mutations is greater than zero, though, we have work to do.
In that case, the next thing we do is to draw the positions of the microsats that mutated, using
sample() to draw from the vector of positions we defined when we set up the simulation. Each
microsat thus has an equal probability of mutating (see below for discussion of this). We loop over
those positions and mutate each chosen microsat in turn. To mutate a microsat, we first look up
the existing mutation by position and find out how many repeats it has. We then decide what the
new, mutated repeat count will be; here we just add or subtract 1, and limit the repeat count to a
minimum of 5, but more sophisticated and realistic dynamics could be introduced. Finally,
ignoring the “if (msatUnique)” section for a moment (since msatUnique is presently defined as F
anyway), we effect the mutation event by removing the old microsat mutation from the genome
and adding a newly created microsat mutation at the same position, with the new repeat count for
its tag. (We can’t simply set the tag of the existing microsat mutation, because it is potentially
shared among many genomes; changing its tag would change the repeat count in all of those
genomes!)
Running this model produces output like:
Microsatellite at 60933:
variant with 21 repeats:
variant with 21 repeats:
variant with 21 repeats:
variant with 22 repeats:
variant with 23 repeats:
variant with 23 repeats:
Microsatellite at 98509:
variant with 34 repeats:
Microsatellite at 123123:
variant with 20 repeats:
variant with 21 repeats:
variant with 21 repeats:
Microsatellite at 142781:
variant with 23 repeats:
variant with 24 repeats:
...

0.001
0.026
0.001
0.741
0.105
0.126
1
0.964
0.001
0.035
0.795
0.205

This looks reasonable, except that more than one variant with the same repeat count sometimes
exists for a given microsatellite. Each such variant represents an independent mutational lineage;
each time a microsat mutation occurs, it is represented by a new Mutation object that SLiM tracks
forevermore. In some cases, keeping track of each mutational lineage may be desirable; it could
provide a sort of ancestry tracking, for example, that goes beyond what the repeat counts alone
would provide. Often, however, one would like such replicate mutational lineages to be merged,
such that just a single microsat mutation exists to represent a microsat with a given number of
repeats at a given position. This is what that “if (msatUnique)” code, which we skipped over
above, is for: it “uniques” the microsatellite lineages. To see the effect of this, let’s change that flag
by modifying its definition in the initialize() callback:
defineConstant("msatUnique", T);

// T = unique msats, F = lineages

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

233

If we run the model again (using the same random number seed, so as to reproduce the same
evolutionary dynamics), the output now looks like this:
Microsatellite at 60933:
variant with 21 repeats:
variant with 22 repeats:
variant with 23 repeats:
Microsatellite at 98509:
variant with 34 repeats:
Microsatellite at 123123:
variant with 20 repeats:
variant with 21 repeats:
Microsatellite at 142781:
variant with 23 repeats:
variant with 24 repeats:
...

0.028
0.741
0.231
1
0.964
0.036
0.795
0.205

The microsats have been uniqued, as desired. The “if (msatUnique)” code in the callback,
which implements this feature, is actually quite straightforward. It gets a vector of all existing
microsat mutations from the simulation, and then narrows that down to those at the desired
position, and then finally down to those with the desired number of repeats. If it found an exact
match, then it removes the existing microsat mutation and adds the match; the next statement then
loops forward to the next mutation site, if any. If it did not find a match, the code drops through to
add a newly created mutation instead; that is the case we saw before.
That’s all that’s needed: a little initialization code, a little output code, and a modifyChild()
callback to handle mutating the microsats. Although that modifyChild() callback is not
particularly short, conceptually it is really quite simple: decide on how many microsat mutation
events there will be in each genome and where they will be, and then change the mutations in the
genome to reflect those mutation events with new (or uniqued) mutations that have the appropriate
repeat count in their tag values.
There are a few ways in which greater realism could be added to this model. We already
discussed, above, the possibility of starting with a heterogenous subpopulation state; any initial
state could be set up, although care should be taken to unique the microsat mutations, both to
keep memory usage down and to provide a single mutational lineage for each unique
microsatellite state.
Another way in which this model oversimplifies the biology is in its mutational model. In
reality, microsats can sometimes mutate by adding or subtracting more than a single repeat, and
their probability of mutating at all can depend upon the number of repeats present, as well as
upon things like the sex of the parent. The mutation rate for microsats can even depend upon the
similarity or dissimilarity in repeat counts between the two copies of the microsat in the parent that
generated the gamete, due to effects during meiosis. All of these effects could be modeled with
extensions to the modifyChild() callback presented here. To implement this, it would probably be
necessary (or at least simpler) to get rid of the rpois() call that draws the number of microsat
mutations for each genome, and instead simply loop over all of the existing microsat positions and
determine whether a mutation occurred at each position with a runif() draw that is compared to
the calculated probability of mutation for that microsat in that genome.
A third oversimplification is that this recipe does not explicitly track variation in microsats due
to point mutations rather than repeat-count mutations. Perhaps a good approach for this would
involve modeling repeat-count changes just as in this model, while adding in similar code that
would model nucleotide changes as well (using an appropriate mutation probability based on the
length of the microsat). If such a model was run without uniquing, each mutational lineage would
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

234

then be tracked separately, and so point mutations would create new mutational lineages that
would be considered distinct from other mutation lineages with the same repeat count. However,
this design would overcount the number of genetically distinct lineages, since repeat-count
mutations as well as point mutations would generate distinct mutational lineages. To account for
things better would require additional state information to be attached to each Mutation object,
using the setValue()/getValue() mechanism; in this way, two mutational lineages with the same
repeat count and no nucleotide differences could be merged together by the uniquing code, while
two mutational lineages with the same repeat count but with nucleotide differences could be kept
separate. In the extreme of maximal realism, one could actually store generated string
representations of nucleotide sequences inside the Mutation objects, but that would be overkill for
most purposes. Usually, a single token attached to each mutation, such as a random number
generated with runif(), could probably provide sufficient realism. Whenever a point mutation
occurred, a new mutation object would be created (with no uniquing necessary) and would be
given a new random token representing its unique new nucleotide sequence. Whenever a repeatcount mutation occurred, on the other hand, the token value would be inherited from the old
microsat mutation at that position, indicating that the nucleotide difference, whatever it might be,
was (assumed to be) preserved across the repeat count change. If the uniquing code then re-used
an existing mutation only when the token values of the desired mutation and the existing mutation
were the same, a closer approximation of the correct pattern of microsatellite diversity might
emerge. Implementing this goes beyond the scope of this recipe, and so is left as an exercise for
the reader, but in essence it is actually quite simple; the design simply adds another piece of state,
tracked with setValue()/getValue(), that is handled very similarly to the tag value in the existing
recipe. That token value is inherited (just as tag is), changes its value upon mutation (just like tag,
but for point mutation events instead of repeat-count mutation events), and must be equal in order
for uniquing to find a match (just as tag is used by the existing uniquing code). Token values
would be set up when the subpopulation is initialized, just as tag values are, either with zero
values (representing a homogenous initial state) or with some sort of population structure indicated
by different random token values on different genomes to represent standing variation in
microsatellite diversity.
Depending upon the level of realism desired, therefore, this recipe might provide a complete
solution, or it might merely point the way. In either case, the basic strategy outlined here of using
tag values to indicate repeat counts on mutations that represent a microsatellite at a given locus is
the recommended approach in SLiM. In general, using mutations to represent some conceptual
difference between genomes, rather than necessarily representing a literal nucleotide difference,
can be a useful strategy for advanced models in SLiM.
13.16 Modeling transposable elements
A transposable element, or transposon, is a chromosomal region which is capable of replication
or positional change within the genome, either on its own or with the assistance of an enzyme
such as reverse transcriptase or transposase. Transposons constitute a substantial fraction of the
genome of many species, and can have evolutionary effects through side effects such as disabling
genes into which they “jump”, altering the regulation of nearby genes, and copying genetic
material within the genome. Given their evolutionary importance, incorporating transposons into
evolutionary models may be useful; in this recipe we will therefore explore a simple model of
transposons in SLiM.
There are various subtypes and classifications of transposons; here we will explore autonomous
Class I transposons, which “copy and paste” themselves to new locations in the genome under
their own power. Given the diversity of evolutionary effects transposons can have, and the
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

235

diversity of ways in which they can function, this recipe cannot provide a general formula for
modeling transposons and all of their effects; all this recipe will do is point the way. At the end of
this section, we will discuss ways in which this recipe could be extended to model further aspects
of the behavior of transposons.
Since SLiM does not model nucleotides explicitly, we will not model the actual nucleotide
sequence of transposons here. Furthermore, since copying a transposon to a new location changes
the length of the chromosome (which is not possible in SLiM), we will model transposons
conceptually rather than literally, as loci in the genome that are capable of copying themselves.
This approach is similar to the approach taken in section 13.15 for modeling microsatellites.
We will also model the vulnerability of transposons to mutations that disable them by
deactivating their ability to jump; we will track disabled transposons, and assess the fraction of
each transposon that has been disabled. This is important, since even disabled transposons may
have evolutionary effects such as changes in gene regulation, and since most of the transposons in
typical organisms appear to be disabled.
We will start with the model’s initialization:
initialize() {
defineConstant("L", 1e6);
defineConstant("teInitialCount", 100);
defineConstant("teJumpP", 0.0001);
defineConstant("teDisableP", 0.00005);

//
//
//
//

chromosome length
initial number of TEs
TE jump probability
disabling mut probability

initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, L-1);
initializeRecombinationRate(1e-8);

// neutral

// transposon mutation type; also neutral, but red
initializeMutationType("m2", 0.5, "f", 0.0);
m2.convertToSubstitution = F;
m2.color = "#FF0000";
// disabled transposon mutation type; dark red
initializeMutationType("m3", 0.5, "f", 0.0);
m3.convertToSubstitution = F;
m3.color = "#700000";
}

This defines the chromosome length and a few constants governing the behavior of TEs in the
model. It then sets up a simple neutral simulation, and defines two additional mutation types: one
for active TEs, m2, which displays as red in SLiMgui, and one for TEs that have been disabled by
mutation, m3, which display as a darker red.
Next we set up the initial state of our subpopulation. In this recipe, the tag values on the m2
and m3 mutations are used as identifiers; each TE gets its own unique tag value, which is used for
both its active (m2) and disabled (m3) forms, allowing the mutations representing the two forms to
be matched up. The tag value on the simulation itself, sim.tag, is used to keep track of the next
unused tag value; it starts at 0 and counts up. The setup thus looks like this:
1 late() {
sim.addSubpop("p1", 500);
sim.tag = 0; // the next unique tag value to use for TEs

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

236

// create some transposons at random positions
genomes = sim.subpopulations.genomes;
positions = rdunif(teInitialCount, 0, L-1);
for (teIndex in 0:(teInitialCount-1))
{
pos = positions[teIndex];
mut = genomes.addNewDrawnMutation(m2, pos);
mut.tag = sim.tag;
sim.tag = sim.tag + 1;
}
}

We make a new subpop, start sim.tag at 0, and then create new TEs that are fixed across the
whole population. (As in the recipe of section 13.15, creating an initial state that involves
heterogeneity in the TEs possessed by individuals would also be possible but is more complex; see
that recipe for discussion.) The positions for the TEs are drawn randomly across the chromosome,
and each TE is tagged with a sequential value from sim.tag.
Before implementing any more of the TE dynamics, let’s implement the final output event:
5000 late() {
// print information on each TE, including the fraction of it disabled
all_tes = sortBy(sim.mutationsOfType(m2), "position");
all_disabledTEs = sortBy(sim.mutationsOfType(m3), "position");
genomeCount = size(sim.subpopulations.genomes);
catn("Active TEs:");
for (te in all_tes)
{
cat("
TE at " + te.position + ": ");
active = sim.mutationCounts(NULL, te);
disabledTE = all_disabledTEs[all_disabledTEs.tag == te.tag];
if (size(disabledTE) == 0)
{
disabled = 0;
}
else
{
disabled = sim.mutationCounts(NULL, disabledTE);
all_disabledTEs = all_disabledTEs[all_disabledTEs != disabledTE];
}
total = active + disabled;
cat("frequency " + format("%0.3f", total / genomeCount) + ", ");
catn(round(active / total * 100) + "% active");
}
catn("\nCompletely disabled TEs: ");
for (te in all_disabledTEs)
{
freq = sim.mutationFrequencies(NULL, te);
cat("
TE at " + te.position + ": ");
catn("frequency " + format("%0.3f", freq));
}
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

237

This prints all of the active TEs in the model (sorted by position), with information on their
overall frequency in the population and on the fraction of their occurrences that have been
disabled by mutations. After that, it prints a list of the completely disabled TEs (sorted as well);
frequencies are also given for those, since TEs could conceivably be disabled before fixing. The
logic of this code is quite straightforward, so there is no need to belabor it here.
We want our TEs to copy themselves; this occurs during the life of the organism, not during
meiosis, so we model it with a late() event:
late() {
// make active transposons copy themselves with rate teJumpP
for (individual in sim.subpopulations.individuals)
{
for (genome in individual.genomes)
{
tes = genome.mutationsOfType(m2);
teCount = tes.size();
jumpCount = teCount ? rpois(1, teCount * teJumpP) else 0;
if (jumpCount)
{
jumpTEs = sample(tes, jumpCount);
for (te in jumpTEs)
{
// make a new TE mutation
pos = rdunif(1, 0, L-1);
jumpTE = genome.addNewDrawnMutation(m2, pos);
jumpTE.tag = sim.tag;
sim.tag = sim.tag + 1;
}
}
}
}
}

This loops through the genomes of all individuals in the simulation. For each genome, it gets all
of the TEs present, and decides how many (if any) will “jump” according to the probability teJumpP
for each TE, using a draw from a Poisson distribution. (This is faster than doing a separate random
draw for each TE, but is otherwise equivalent.) If any TEs did jump, the ones that jumped are
selected at random. Each jump is simulated by creating a new TE at a new, randomly chosen
location. The new TEs get new tag values assigned sequentially from sim.tag, so that their
corresponding disabled versions can be looked up. And that is it for jumping; its logic is fairly
straightforward since disabled TEs are not involved at all.
Finally, we need to implement the disabling of TEs by random point mutations. Since TEs are
simulated here as point mutations themselves, we need to simulate this disabling process
ourselves; SLiM’s automatic mutation generation will never modify our TEs for us. We model
disabling mutations in a modifyChild() callback. Its logic is similar to that of the jumping code
above, except that when a TE is disabled, a mutation of type m3 needs to be substituted in place of
the existing m2 mutation that represents the TE. The m3 mutation corresponding to any given m2 TE
is created just once, and then the same m3 mutation is looked up and reused every time that same
m2 TE is disabled in another genome. This “uniquing” of the m3 mutations makes the model much
more memory-efficient, and makes the output code much simpler. The main purpose of the tag
values we have been managing is, in fact, to facilitate this uniquing process. The disabling
callback looks like this:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

238

modifyChild() {
// disable transposons with rate teDisableP
for (genome in child.genomes)
{
tes = genome.mutationsOfType(m2);
teCount = tes.size();
mutatedCount = teCount ? rpois(1, teCount * teDisableP) else 0;
if (mutatedCount)
{
mutatedTEs = sample(tes, mutatedCount);
for (te in mutatedTEs)
{
all_disabledTEs = sim.mutationsOfType(m3);
disabledTE = all_disabledTEs[all_disabledTEs.tag == te.tag];
if (size(disabledTE))
{
// use the existing disabled TE mutation
genome.removeMutations(te);
genome.addMutations(disabledTE);
next;
}
// make a new disabled TE mutation with the right tag
genome.removeMutations(te);
disabledTE = genome.addNewDrawnMutation(m3, te.position);
disabledTE.tag = te.tag;
}
}
}
return T;
}

For each genome in the proposed child, we get the TEs contained, and calculate the number
that will be disabled using a Poisson draw as before. We select the TEs that actually mutated, and
for each, attempt to look up a corresponding m3 mutation using the tag value of the m2 mutation. If
we find one, we substitute that. If not, we create a new m3 mutation to represent the disabled
state, marking it with the TE’s tag value so we can look it up next time. That’s it for the TE
disabling code; a bit lengthy, but the logic is simple.
If this model is run, typical output might look like this:
Active TEs:
TE at 1793: frequency 0.011,
TE at 2435: frequency 0.208,
TE at 3629: frequency 0.002,
TE at 6339: frequency 0.208,
TE at 7081: frequency 0.020,
TE at 7728: frequency 1.000,
...
Completely disabled TEs:
TE at 24821: frequency 1.000
TE at 83627: frequency 1.000
TE at 98286: frequency 1.000
...

100% active
100% active
100% active
94% active
100% active
79% active

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

239

That output shows a mix of TEs at high frequency (perhaps present already in the initial
population setup, since not that many generations have elapsed) and TEs at low frequency (which
must be copies due to jumping). Some TEs are completely active, whereas others have been
partially or fully disabled by mutations. (Once even a single copy of a TE is disabled by mutation,
that copy might drift to fixation; the fact that TEs have been completely disabled does not mean
that TE-disabling mutations are particularly common.)
Note that this model is never at equilibrium; the number of TEs is likely to grow without bound,
since the probability of jumping is greater than the probability of TEs being disabled by mutations.
That means that the model runs ever more slowly, since the TE mutations are not allowed to fix. If
a model didn’t care about disabled TEs at all, it would be much simpler and faster to simply delete
TEs from a genome when they are disabled.
Note also that the probabilities of jumping and of being disabled in this recipe are arbitrary and
have almost no empirical basis; they just worked well for testing the code. Please do not use these
values in any production code of your own.
With that, we have wrapped up this recipe, but obviously there are a great many aspects of the
biology of transposons that we haven’t covered here:
• Effects of transposons on fitness could be modeled through fitness() callbacks, either
modifying the fitness effects of other mutations due to the proximity of transposons, or
modeling a direct fitness effect for the transposons themselves in their own fitness()
callback. Alternatively, at the moment of a transposon’s jump a genetic effect could be
modeled by examining and altering other mutations in the vicinity of the jump
destination, to simulate up/down-regulation of those mutations (taking care to make a
private copy of any modified mutations first, so that other genomes sharing the same
mutations do not receive the same modifications as an unintended side effect).
• Transposons that relocate themselves, rather than copying themselves, could be
implemented by adding a call to genome.removeMutations(te) in the jump code.
• Attempts by the organism to suppress TEs en masse, for example through epigenetic
controls such as methylation, could be simulated by converting TEs in a given individual
into a suppressed version (perhaps using a new mutation type to represent suppression);
such suppression could be temporary, since one could write code to convert suppressed
TEs back into active TEs again, too.
• Non-autonomous TEs that can jump only in the presence of a separate enabling gene
could easily be modeled simply by checking for the presence of the enabling gene in an
individual prior to allowing any TEs in that individual to jump.
• It is not clear exactly what, if anything, restricts the proliferation of TEs in real organisms,
but one could certainly model some sort of balancing factor that would limit the
proliferation of TEs in this model. The probability of jumping could decrease as the
number of active TEs increases (for some unclear reason), or the fitness of organisms could
start to decrease sharply as their TE load increases above some bound (with perhaps a bit
more biological justification), or whatever other mechanism one wished.
• One of the most interesting aspects of transposons is that they sometimes jump more
frequently when an individual is subject to environmental stress; that could be modeled
by calculating a jump probability for the TEs in an individual that depends upon that
individual’s fitness, or upon the mismatch between its phenotype and its environment in
some specific trait, if desired.
The point is that although this recipe is fairly rudimentary, many aspects of the evolutionary
dynamics of transposons could be easily modeled in SLiM by implementing whatever extensions to
this recipe are desired, as outlined above.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

240

13.17 A QTL-based model with two quantitative phenotypic traits and pleiotropy
Sections 13.1 and 13.10 introduced some strategies for modeling quantitative traits in SLiM,
where a phenotypic trait’s quantitative value is based upon the additive effects of multiple QTLs
(quantitative trait loci). Chapter 14 will show more QTL-based models that incorporate continuous
space and spatial interactions. All of these quantitative-trait models share the same basic
approach: QTL mutations are intrinsically neutral (enforced with a fitness() callback), but have
an additive effect upon a phenotypic trait value possessed by each individual. Individual fitness is
then modeled using some form of fitness function (often Gaussian), based upon the deviation of
the individual’s phenotype from the optimum phenotype in its environment (determined by a
fitness(NULL) callback that evaluates each individual). All of these recipes model a single
phenotypic trait, however, with the QTL mutations influencing only that one trait. In this section
we will look at an extension of the same basic approach, modeling two phenotypic traits. QTL
mutations in this model will influence both phenotypic traits, pleiotropically, with effect sizes
drawn from a multivariate normal distribution (thus allowing the effects on the phenotypic traits to
be either independent or correlated). This recipe can be trivially extended to encompass any
number of QTL-based phenotypic traits, with any type of pleiotropy. Finally, at the end, we will
add in live R-based plotting, as introduced in section 13.11, to plot the adaptive trajectory of the
population. You may wish to review sections 13.1, 13.10, and 13.11 before proceeding.
Let’s begin with the initialize() callback as usual:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", 0.0);
m2.convertToSubstitution = F;
m2.color = "red";

// neutral
// QTLs

// g1 is a neutral region, g2 is a QTL
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElementType("g2", c(m1,m2), c(1.0, 0.1));
// chromosome of length 100 kb with two QTL regions
initializeGenomicElement(g1, 0, 39999);
initializeGenomicElement(g2, 40000, 49999);
initializeGenomicElement(g1, 50000, 79999);
initializeGenomicElement(g2, 80000, 89999);
initializeGenomicElement(g1, 90000, 99999);
initializeRecombinationRate(1e-8);
// QTL-related constants used below
defineConstant("QTL_mu", c(0, 0));
defineConstant("QTL_cov", 0.25);
defineConstant("QTL_sigma", matrix(c(1,QTL_cov,QTL_cov,1), nrow=2));
defineConstant("QTL_optima", c(20, -20));
catn("\nQTL DFE means: ");
print(QTL_mu);
catn("\nQTL DFE variance-covariance matrix: ");
print(QTL_sigma);
}

QTLs will be represented by m2; it is declared to be neutral here, so a fitness(m2) callback
making it neutral will not be needed. In other QTL recipes the selection coefficient of the
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

241

mutations was used to store the additive effect of each QTL mutation, but we will take a different
approach here. The m2 mutations are colored red in SLiMgui, and are not converted to
substitutions when they fix, since their phenotypic effect will continue to be important. The
chromosome is a mixture of g1 neutral regions and g2 regions that can contain QTL mutations.
Finally, we define some QTL-related constants: QTL_mu gives the mean effect of new QTL
mutations on the two phenotypes, QTL_cov gives the covariance between the two effects, and
QTL_sigma is a variance-covariance matrix derived from QTL_cov that will govern new mutational
effects (this matrix is sometimes called an M-matrix). QTL_optima gives the optimal phenotypic
values for the two quantitative traits; note that the optima have different signs, but the M-matrix
encodes a positive mutational correlation, so adaptation in this model will be contrary to the
pleiotropically preferred direction of evolution. Finally, the means and M-matrix are printed; the
M-matrix looks like this:
QTL DFE variance-covariance matrix:
[,0] [,1]
[0,]
1 0.25
[1,] 0.25
1

Support for matrices was added to Eidos in SLiM 2.6, so the matrix() function and other
aspects of working with matrices in Eidos may be new; see the Eidos language manual.
Next we start a new subpopulation:
1 late() {
sim.addSubpop("p1", 500);
}

Then we need to draw the effects of new mutations in each generation, which is a bit complex
since SLiM doesn’t do it for us in this recipe the way it usually does:
late() {
// add effect sizes into new mutation objects
all_m2 = sim.mutationsOfType(m2);
new_m2 = all_m2[all_m2.originGeneration == sim.generation];
if (size(new_m2))
{
// draw mutational effects for all new mutations at once
effects = rmvnorm(size(new_m2), QTL_mu, QTL_sigma);
// remember all drawn effects, for our final output
old_effects = sim.getValue("all_effects");
sim.setValue("all_effects", rbind(old_effects, effects));
for (i in seqAlong(new_m2))
{
e = effects[i,];
// each draw is one row in effects
mut = new_m2[i];
mut.setValue("e0", e[0]);
mut.setValue("e1", e[1]);
}
}
}

This late() event runs in every generation. It finds new m2 mutations just introduced by SLiM
during offspring generation, and patches in QTL effects sizes for them. First, new_m2 is set to the
mutations whose originGeneration property is the current generation; these are are new
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

242

mutations. If there are any, it calls rmvnorm() to draw their effect sizes. This function draws from
the multivariate normal distribution defined by QTL_mu and QTL_sigma; we ask it to draw a pair of
effects for each new mutation, and it returns a matrix with one row for each mutation and two
columns, one for each effect size. We keep a record of all drawn effect sizes in the "all_effects"
key of the simulation object, for purposes of output. After that, all that remains is to loop through
the new mutations and put their drawn effect sizes into their "e0" and "e1" keys.
Now that mutations have effect sizes assigned to them, we can add code to calculate the
phenotypes of individuals as the sum of those additive effects:
late() {
// construct phenotypes from additive effects of QTL mutations
inds = sim.subpopulations.individuals;
for (ind in inds)
{
muts = ind.genomes.mutationsOfType(m2);
// we have to special-case when muts is empty
if (size(muts)) {
ind.setValue("phenotype0", sum(muts.getValue("e0")));
ind.setValue("phenotype1", sum(muts.getValue("e1")));
} else {
ind.setValue("phenotype0", 0.0);
ind.setValue("phenotype1", 0.0);
}
}
}

This event must be after the previous late() event in the script, so that it runs after mutation
effects have been assigned. It loops through the individuals in the subpopulation, and for each
individual it gets all of the m2 mutations possessed in both genomes, adds up their effects, and
stores the result as a phenotype. There are only two important differences between this and
previous QTL recipes. First, it gets the effects from the "e0" and "e1" keys of the mutations, rather
than from tag or tagF properties; and second, it stores the phenotypes in "phenotype0" and
"phenotype1" keys on the individuals, rather than in tag or tag properties. Using tags works well
when you have only a single value to store; they are simple to use, and are fast to get and set.
Using key-value pairs as in this recipe is a bit more complex, and a bit slower, but more general
and extensible; any number of keys can be defined on an object, and so this recipe can be easily
extended to any number of phenotypic traits, each influenced by separate mutational effects.
Next, we need those phenotypes to influence the fitness of individuals:
fitness(NULL) {
// phenotype 0 fitness effect, with optimum of QTL_optima[0]
phenotype = individual.getValue("phenotype0");
return 1.0 + dnorm(QTL_optima[0] - phenotype, 0.0, 20.0) * 10.0;
}
fitness(NULL) {
// phenotype 1 fitness effect, with optimum of QTL_optima[1]
phenotype = individual.getValue("phenotype1");
return 1.0 + dnorm(QTL_optima[1] - phenotype, 0.0, 20.0) * 10.0;
}

This should be familiar from previous QTL recipes; fitness is drawn from a Gaussian function
based on the difference between the individual’s phenotype and the optimum. Here we have two
separate callbacks, one for each phenotype, for conceptual clarify; these could be combined into a
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

243

single fitness(NULL) callback, of course, combining the fitness effects of the two traits
multiplicatively. Furthermore, if one wished the two traits to be subject to a single multivariate
Gaussian fitness function, instead of having independent effects on fitness, a dmvnorm() function is
available in Eidos to facilitate such scenarios, but we shall not do so here. A width of 20.0 is hardcoded here for the fitness functions, but the optima are taken from the constant we defined earlier,
and the individual phenotypic values are fetched by key. The 10.0 multiplier makes it so an
individual precisely at the phenotypic optimum has a fitness of 10.0 relative to an individual with
a phenotype infinitely far from the optimum; strong selection, but not unrealistically so.
All that remains is output and a termination condition. For this model, this is quite complex
since we want to look at a bunch of things:
1:1000000 late() {
// output, run every 1000 generations
if (sim.generation % 1000 != 0)
return;
// print final phenotypes versus their optima
inds = sim.subpopulations.individuals;
p0_mean = mean(inds.getValue("phenotype0"));
p1_mean = mean(inds.getValue("phenotype1"));
catn();
catn("Generation: " + sim.generation);
catn("Mean phenotype 0: " + p0_mean + " (" + QTL_optima[0] + ")");
catn("Mean phenotype 1: " + p1_mean + " (" + QTL_optima[1] + ")");
// keep running until we get within 10% of both optima
if ((abs(p0_mean - QTL_optima[0]) > abs(0.1 * QTL_optima[0])) |
(abs(p1_mean - QTL_optima[1]) > abs(0.1 * QTL_optima[1])))
return;
// we are done with the main adaptive walk; print final output
// get the QTL mutations and their frequencies
m2muts = sim.mutationsOfType(m2);
m2freqs = sim.mutationFrequencies(NULL, m2muts);
// sort those vectors by frequency
o = order(m2freqs, ascending=F);
m2muts = m2muts[o];
m2freqs = m2freqs[o];
// get the effect sizes
m2e0 = m2muts.getValue("e0");
m2e1 = m2muts.getValue("e1");
// now output a list of the QTL mutations and their effect sizes
catn("\nQTL mutations (f: e0, e1):");
for (i in seqAlong(m2muts))
{
mut = m2muts[i];
f = m2freqs[i];
e0 = m2e0[i];
e1 = m2e1[i];
catn(f + ": " + e0 + ", " + e1);
}

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

244

// output the covariance between e0 and e1 among the QTLs that fixed
fixed_m2 = m2muts[m2freqs == 1.0];
e0 = fixed_m2.getValue("e0");
e1 = fixed_m2.getValue("e1");
e0_mean = mean(e0);
e1_mean = mean(e1);
cov_e0e1 = sum((e0 - e0_mean) * (e1 - e1_mean)) / (size(e0) - 1);
catn("\nCovariance of effects among fixed QTLs: " + cov_e0e1);
catn("\nCovariance of effects specified by the QTL DFE: " + QTL_cov);
// output the covariance between e0 and e1 across all draws
effects = sim.getValue("all_effects");
e0 = effects[,0];
e1 = effects[,1];
e0_mean = mean(e0);
e1_mean = mean(e1);
cov_e0e1 = sum((e0 - e0_mean) * (e1 - e1_mean)) / (size(e0) - 1);
catn("\nCovariance of effects across all QTL draws: " + cov_e0e1);
sim.simulationFinished();
}

This should be fairly self-explanatory, so we won’t go through it in any great detail. Every
thousand generations it prints a summary of the adaptation so far, like this:
Generation: 1000
Mean phenotype 0: 0 (20)
Mean phenotype 1: 0 (-20)
Generation: 2000
Mean phenotype 0: 2.89204 (20)
Mean phenotype 1: -3.42491 (-20)
Generation: 3000
Mean phenotype 0: 2.95127 (20)
Mean phenotype 1: -5.91266 (-20)
...

The mean trait value is printed for each phenotypic trait, with the optimum in parentheses; the
population didn’t manage to adapt at all by generation 1000 (no QTL mutations had arisen without
being lost), but by generation 3000 things are moving along better, especially for the second trait.
The model continues running until both phenotypic means get within 10% of their optima. That
can take a while, since evolution has to run counter to the M-matrix, and since selection gets
weaker as the optima are approached. When it gets there, the model produces some final output:
...
Generation: 31000
Mean phenotype 0: 21.2388 (20)
Mean phenotype 1: -17.9747 (-20)
Generation: 32000
Mean phenotype 0: 20.7437 (20)
Mean phenotype 1: -18.6298 (-20)

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

245

QTL mutations (f: e0, e1):
1: 2.70148, -0.0621418
1: -0.291556, -1.16253
1: 0.923384, -1.24329
1: -0.431645, 0.170891
1: 1.82426, 0.911538
1: 0.0652766, -1.76526
1: 0.444558, -1.22771
1: 1.62655, -1.33047
1: 1.47864, -0.957393
1: 1.44353, -1.7145
1: 0.496692, -0.441125
1: 1.25745, 1.08043
1: -0.707134, -1.06546
0.801: -0.566786, -0.611955
0.017: -0.308758, -1.02767
0.001: -0.408555, -0.222152
Covariance of effects among fixed QTLs: 0.26061
Covariance of effects specified by the QTL DFE: 0.25
Covariance of effects across all QTL draws: 0.236848

After the output of the mean phenotypes, we get a dump of all of the segregating QTL mutations
in the population, sorted by frequency. Most of the QTLs have fixed, one is approaching fixation,
and two are at very low frequency. Their effect sizes on the two phenotypic traits are shown in the
next two columns. Finally, we get a summary of the observed covariance in effects among the
QTL mutations that fixed, the requested covariance, and the observed covariance across all effects
drawn during the run (including many QTL mutations that were lost). You might expect that the
covariance among the fixed QTLs would have to be negative, in order for adaptation to reach the
two optima with different signs, but that is not the case; often it is true, but sometimes, as in this
run, the optima can be reached even with a positive covariance among the mutational effects.
So we have a two-trait QTL-based adaptive walk model with pleiotropy and correlated
mutational effects! As an aside, for advanced users planning to implement QTL models of their
own: it is not, in fact, necessary to prevent the QTL mutations from fixing by setting
convertToSubstitution=F. Instead, you can allow them to fix, and then add in the effects of the
Substitution objects on the calculated phenotypes. This will be faster, if implemented carefully
(tip: add newly fixed substitutions to an accumulator total), since the SLiM core won’t get bogged
down managing the bookkeeping on an ever-growing set of QTL mutations.
Now let’s add a little extra code to plot the trajectory of the adaptive walk in SLiMgui, using the
live R plotting technique of section 13.11. First let’s open a PDF-based plot window in SLiMgui:
1 late() {
sim.setValue("history", matrix(c(0.0, 0.0), nrow=1));
defineConstant("pdfPath", writeTempFile("plot_", ".pdf", ""));
// If we're running in SLiMgui, open a plot window
if (exists("slimgui"))
slimgui.openDocument(pdfPath);
}

This starts a history of the population’s adaptive walk in a new key named "history" on the
simulation object. It also makes a temporary PDF file with writeTempFile(), and tells SLiMgui to
open that file; see section 13.11 for details.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

246

Now we just need to update that plot with periodic callouts to R. To do so, insert the following
code inside the 1:1000000 final output event, immediately above the comment “keep running until
we get within 10% of both optima”:
// update our plot
history = sim.getValue("history");
history = rbind(history, c(p0_mean, p1_mean));
sim.setValue("history", history);
rstr = paste(c('{',
'x <- c(' + paste(history[,0], sep=", ") + ')',
'y <- c(' + paste(history[,1], sep=", ") + ')',
'quartz(width=4, height=4, type="pdf", file="' + pdfPath + '")',
'par(mar=c(4.0, 4.0, 1.5, 1.5))',
'plot(x=c(-10,30), y=c(-30,10), type="n", xlab="x", ylab="y")',
'points(x=0,y=0,col="red", pch=19, cex=2)',
'points(x=20,y=-20,col="green", pch=19, cex=2)',
'points(x=x, y=y, col="black", pch=19, cex=0.5)',
'lines(x=x, y=y)',
'dev.off()',
'}'), sep="\n");
scriptPath = writeTempFile("plot_", ".R", rstr);
system("/usr/local/bin/Rscript", args=scriptPath);

-10
-30

-20

y

0

10

This updates the "history" key with the latest data, and then generates a string contain R
plotting code and sends that code to R to run. The R code is based on Mac OS X, using the
quartz() function of R to open a new plotting device. Again, see section 13.11 for details on this
technique).
When run in SLiMgui, a plot window will open and update every 1000 generations to show the
adaptive trajectory thus far. For the same run of the model as shown in the output above, the
resulting plot looks like this (the walk begins at the red disc, and the green disc is the optimum):

-10

0

10

20

30

x

This sort of live visualization of model dynamics can be extremely help for both model design
and graphical debugging, and is quite simple to set up, as we have seen here.
In closing, it is perhaps emphasizing once more that although this recipe models two
phenotypic traits, it is designed to be extensible. It uses key-value pairs set on the mutations and
individuals to track the QTL mutational effects and phenotypic trait values; any number of such
key-value pairs can be maintained. The rmvnorm() function can also draw from a multivariate
normal distribution of any dimensionality, with any (positive-definite) M-matrix. This design is thus
quite open-ended.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

247

13.18 Modeling opposite ends of a chromosome
This recipe is not about how to model something complicated; instead, it is intended to
illuminate something conceptual about using SLiM in certain situations.
Consider the following recipe:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeGenomicElement(g1, 9900000, 9999999);
initializeRecombinationRate(1.5e-7);
}
1 { sim.addSubpop("p1", 500); }
200000 late() { sim.outputFixedMutations(); }

This is a simple neutral model of a chromosome of length L==1e7 bases. The only unusual thing
about it is that we are actually only modeling the two ends of the chromosome; we have defined a
genomic element spanning the first 1e5 bases, and another spanning the last 1e5 bases, and the
intervening region, which is 98% of the length of the chromosome, is not modeled. In SLiMgui,
the model looks like this after a little while:

This is perfectly fine; all that it means is that SLiM will not automatically generate mutations
within the region that is not covered by any genomic element, and so that central region remains
empty. The region does not have to remain empty; it would be legal to call addNewMutation() to
create a new mutation within that region, and SLiM would then track that added mutation
normally. The only effect that genomic elements have, in SLiM, is in causing the automatic
generation of new mutations by the SLiM core. But let’s keep the model as it is, and ask: what is
the behavior of the two end regions, with respect to recombination between them? Do they assort
independently, as if they were separate chromosomes, since they are separated by a rather long
stretch of chromosome? Or if not, then what is the effective recombination rate between them?
First of all, let’s make sure we understand exactly what “recombination rate” really means in
SLiM. As section 21.1 states, the recombination rate is the probability that a crossover will occur
between two adjacent bases. This is binomial; conceptually, a coin is flipped, and the coin lands
“heads” (crossover) with probability p, and “tails” (no crossover) with probability 1-p, and that is
done at every position between adjacent bases along the whole chromosome. There are L-1
positions between bases in a chromosome of length L. In this model, 2e5-2 of them occur
between the bases spanned by the two genomic elements, and the remainder, 9800002, occur
between those two regions. Those are the positions we’re interested in, and at each position a
crossover occurs with probability r=1.5e-7.
To get the probability that the two genomic elements will assort apart, rather than together, we
cannot simply multiply the number of positions (9800002) by the probability per position (1.5e-7),
of course; that would give us 1.47, a nonsensical result that isn’t even a valid probability. Instead,
we need to observe that the key question is actually whether the number of crossovers that occur
between the two regions is even or odd. An even number of crossovers will cancel out; we will
cross over and then cross back, perhaps repeatedly, with no net effect. The two genomic elements
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

248

will then assort together. An odd number of crossovers, on the other hand, will result in all but
one crossover canceling out, and one crossover will remain; the two genomic elements will then
assort apart.
So the question then becomes: for a binomial draw with 9800002 trials and a probability per
trial of 1.5e-7, what is the probability that the result of the draw will be odd? There is, of course,
established mathematical theory on this point, but let’s answer the question through simulation. In
an Eidos console, such as the one we can open inside SLiMgui, we can do a large number of
draws from that binomial distribution:
> draws = rbinom(1e8, 9800002, 1.5e-7)

That will take a moment, and then we have 1e8 draws from the requisite binomial distribution.
Next let’s ask what fraction of them is odd:
> sum(draws % 2 == 1) / 1e8
0.473642

So, the two genomic elements will assort apart about 47.4% of the time, and the remaining time
will assort together.
So now suppose we want to construct a SLiM model that just elides that central region (since
we weren’t using it anyway) and places the two chromosome ends immediately next to each other.
There is no particular obstacle to doing this, except that we need to know what recombination rate
to use for the join point between the two halves – which we have just figured out. So we could
now rewrite our model as:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeGenomicElement(g1, 100000, 199999);
rates = c(1.5e-7, 0.473642, 1.5e-7);
ends = c(99999, 100000, 199999);
initializeRecombinationRate(rates, ends);
}
1 { sim.addSubpop("p1", 500); }
200000 late() { sim.outputFixedMutations(); }

This model ought to behave identically to the previous model, and ought to be somewhat more
efficient, too (although the difference may not be large enough to be noticeable). This is because
when running the first model SLiM would potentially be generating two or more breakpoints
between the two ends, and then resolving whether there were an even or odd number as it ran,
whereas when running the second model SLiM either generates zero breakpoints or one
breakpoint, with the correct calculated probability.
As turns out, we could get the same probability by using the rate-rescaling formula of section
5.5, since what we are effectively doing is rescaling a region of length 9800002 down to a length of
1 (length in terms of potential recombination positions, that is). That formula, executed in the
Eidos console, gives us:
> 0.5*(1-(1-2*1.5e-7)^9800002)
0.473567

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

249

Which is very close to the number we got before (which, remember, was inexact since it came
from simulation). So that’s pleasing.
The point of this recipe, then, is twofold. One point is to provide another angle of view on the
rate-rescaling formula of section 5.5, to illustrate that it applies in situations other than the
rescaling of an entire model. The second, more important, point is to illustrate once again that the
recombination rate as specified by SLiM is the probability of a crossover occurring between two
adjacent bases, and the consequences that that has for reasoning about how recombination works
on larger scales.
This is as good a place as any for a digression that I think this manual ought to contain
somewhere: how SLiM actually generates recombination breakpoint positions under the hood.
This is worth demystifying because for models that use very high recombination rates the subtleties
of this process can be important. Mostly, though, this digression is just for those who are curious,
and beginning users of SLiM should feel free to skip it.
Overall, SLiM does this in several steps: (1) it decides upon the number of breakpoints it will
generate, (2) it chooses the location for each breakpoint based upon a weighted uniform draw
along the chromosome (where the weights are equal to the recombination rates for each individual
base, as defined by the recombination map), and then (3) it sorts and uniques the list of
breakpoints to provide a final list. The uniquing of the list means that there is, at most, one
crossover (i.e., breakpoint) between any two base positions; this is biologically realistic, and also
computationally simpler.
So the big question is: how to decide upon the number of breakpoints to generate? One way
would be to start from the fact that between each pair of bases the question is one of a binomial
draw, with a single trial of probability equal to the recombination rate (by SLiM’s definition). One
could do one such draw per pair of bases, add up the results, and that’s the number of crossovers
that occurred. The immediate problem with that approach is simply that it is immensely
computationally expensive, and if the user has supplied a complex recombination map with
different rates at every position it can’t be reduced to a single binomial draw with a larger number
of trials.
So perhaps we might wish to model recombination as a Poisson process instead, leveraging the
fact that Poisson draws can be added together: Poisson(λ1) + Poisson(λ2) = Poisson(λ1 + λ2). If two
adjacent base positions have a probability of a crossover between them of λ, then the mean of the
binomial draw can be used as the mean of a Poisson draw instead (the Poisson distribution being
well-known as a viable approximation for the binomial distribution with small λ and largish n).
Then we can add up the λ values across the chromosome and get the number of recombination
events with a single Poisson draw that uses that total λ value. And indeed, this is precisely what
SLiM 2.x and earlier did.
Unfortunately, as SLiM users started moving towards new areas of modeling in SLiM such as
multiple chromosomes, it became clear that there was a problem with this approach. When
modeling multiple chromosomes, you want, intuitively, to connect them together in SLiM with a
recombination rate of 0.5 to represent perfect independence of assortment. Unfortunately, this
value is large enough that the Poisson approximation of the binomial distribution breaks down.
The Poisson draw can not only be 0 or 1 (as with the binomial), but also larger values, and with a
rate of 0.5 this happens often enough to matter. In SLiM’s case, if the Poisson draw indicated that
(for example) two crossovers occurred, their positions would be sorted and uniqued (to prevent
multiple crossovers at the same exact location, as described above), resulting in just one crossover
where the Poisson draw had specified two. The net effect of this would be that the probability of a
crossover at the given location would be somewhat lower than desired – about 0.39, in fact, rather
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

250

than 0.5. (It is lower than 0.5 because more than half of the Poisson draws for λ=0.5 will be zero,
compensating for the fact that some of the draws are greater than one and making the mean of the
Poisson distribution come out to 0.5).
So at first blush it looks like we can’t use a binomial draw (too complicated in the case of
complex recombination maps) and we can’t use a Poisson draw (inaccurate with large
recombination rates such as 0.5). What to do?
The solution we arrived at, thanks to Peter Ralph, is to reparameterize the Poisson draws that
we’re doing. If the user has requested a recombination rate of p (probability of crossover between
two adjacent bases), but we want to use a Poisson draw and have it generate a crossover, after
sorting and uniquing, with probability p, then we can transform the requested p into a
reparameterized value λ such that Prob[(Poisson(λ) > 0) = p]. The formula for this
reparameterization is:

λ = − log(1 − p)
With this reparameterization, we end up with the correct probability of crossover after using a
Poisson(λ) draw, accounting for SLiM’s sorting and uniquing of the breakpoint vector. This means
that we can add up the λ values for each region along a whole chromosome, even with a complex
recombination map, and draw a preliminary number of crossovers from a Poisson distribution with
the total λ, and then after we select locations with the usual weighted uniform draws, and sort and
unique them, the probability of crossover at every specific site will be as the user requested. It will
work even when some positions have a rate of 0.5; and even though the Poisson distribution is
only an approximate estimation for the binomial distribution, this solution is in fact exact, since it
is based upon the probability that the Poisson draw for a specific position is greater than zero, not
upon the mean of the Poisson distribution.
If that was all gibberish, never mind. The point is that, as of SLiM 3, recombination rates should
be accurate (within numerical precision limits) even for large recombination rates like 0.5, with no
sacrifice in speed.
13.19 Biased gene conversion
SLiM intrinsically supports gene conversion (see section 6.1.3), but it does not intrinsically
support biased gene conversion. This is unsurprising, since SLiM has no intrinsic support for
modeling nucleotides – biased gene conversion has to do with a bias in the gene conversion
process based upon the specific nucleotides in or near the gene conversion tract (Galtier et al.
2001; Ratnakumar et al. 2010). Nevertheless, it is possible to wedge a concept of biased gene
conversion into SLiM, and in this section we will do so in two very different recipes. The first
recipe will be a nucleotide-based model, using the techniques introduced in section 13.12 to
model nucleotides in SLiM. The second recipe will not be nucleotide-based, even those biased
gene conversion is a nucleotide-based phenomenon; it will model it as a more abstract process.
Which approach is better will depend upon your purposes. Indeed, we do not necessarily
recommend either approach; a different software package that is more specifically designed to
model nucleotide evolution might be more appropriate for this task. But for those who wish to do
it in SLiM, we will look at how it might be done.
Our first model, then, is nucleotide-based. See section 13.12 for an introduction to how
nucleotide-based models can be simulated in SLiM; we will follow a somewhat different approach
here but the underlying idea is the same. We will discuss the model in parts, but will not build it
step by step. We will begin with the initialize() callback:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

251

initialize() {
defineConstant("L", 1e4);

// number of loci

// mutation type for each nucleotide, all
mtA = initializeMutationType(0, 0.5, "f",
mtT = initializeMutationType(1, 0.5, "f",
mtG = initializeMutationType(2, 0.5, "f",
mtC = initializeMutationType(3, 0.5, "f",
mt = c(mtA, mtT, mtG, mtC);
mtA.setValue("nucleotide",
mtT.setValue("nucleotide",
mtG.setValue("nucleotide",
mtC.setValue("nucleotide",
c(mtA,mtT).color = "blue";
c(mtG,mtC).color = "red";

neutral
0.0);
0.0);
0.0);
0.0);

"A");
"T");
"G");
"C");

// We do not want mutations to stack or fix
mt.mutationStackPolicy = "l";
mt.mutationStackGroup = -1;
mt.convertToSubstitution = F;
// chromosome of nucleotides, with gene conversion
initializeGenomicElementType("g1", mt, c(1,1,1,1));
initializeGenomicElement(g1, 0, L-1);
initializeMutationRate(1e-6);
// includes 25% identity mutations
initializeRecombinationRate(1e-6);
initializeGeneConversion(0.5, 100);
}

This is a simpler setup than in section 13.12; we have one mutation type for each nucleotide
type, rather than one mutation type per nucleotide type per base position. This simplicity is
possible here because this will be a pure neutral model; the different mutation types were used in
section 13.12 to allow each nucleotide at each position to have an independent fitness effect, but
since this model will be neutral that complexity is not necessary. It ought to be possible to
generalize this recipe to the non-neutral model of section 13.12, but we will not explore that here.
As in section 13.12, we use stacking policy to prevent mutations from stacking or fixing. We
model a chromosome 1e4 bases long, and we turn on gene conversion for 50% of all
recombination events, with a mean conversion tract length of 100.
Next we define a utility function:
function (f)gcContent(void)
{
nucs = sim.mutations.mutationType.getValue("nucleotide");
counts = sim.mutationCounts(NULL);
totalA = sum(counts[nucs == "A"]);
totalT = sum(counts[nucs == "T"]);
totalG = sum(counts[nucs == "G"]);
totalC = sum(counts[nucs == "C"]);
total = totalA + totalT + totalG + totalC;
return (totalC + totalG)*100/total;
}

This gets the nucleotides used by all mutations, and count information for all mutations, and
uses that information to calculate the percent GC content across the genome. We will use this
later to print status updates as the model runs.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

252

Next we set up the initial population:
1 late() {
sim.addSubpop("p1", 1000);
// The initial population is fixed for a random wild-type
// nucleotide at each locus in the chromosome
mutTypes = sample(g1.mutationTypes, L, replace=T);
p1.genomes.addNewDrawnMutation(mutTypes, 0:(L-1));
catn("Initial GC content: " + gcContent());
}

We make 1000 individuals, set up random wild-type nucleotides in their genomes (the same
nucleotide sequence in every genome, initially), and call our gcContent() utility function to print
the initial GC content.
Next comes the heart of the model, a recombination() callback that implements biased gene
conversion based upon the nucleotide sequence:
recombination() {
if (size(gcStarts) != 1)
return F; // no change unless a gene conversion
if (size(breakpoints) > 0)
return F; // no change if any recombination
// We have a gene conversion event; we will accept it if it
// increases the GC content of the tract in question
gcMuts1 = genome1.mutations;
gcMuts1 = gcMuts1[gcMuts1.position >= gcStarts];
gcMuts1 = gcMuts1[gcMuts1.position < gcEnds];
gcNucs1 = gcMuts1.mutationType.getValue("nucleotide");
gcGC1 = sum((gcNucs1 == "G") | (gcNucs1 == "C")) / size(gcNucs1);
gcMuts2
gcMuts2
gcMuts2
gcNucs2
gcGC2 =

= genome2.mutations;
= gcMuts2[gcMuts2.position >= gcStarts];
= gcMuts2[gcMuts2.position < gcEnds];
= gcMuts2.mutationType.getValue("nucleotide");
sum((gcNucs2 == "G") | (gcNucs2 == "C")) / size(gcNucs2);

if (gcGC2 > gcGC1)
return F; // no change if we like the new tract
// reject the new tract
gcStarts = integer(0);
gcEnds = integer(0);
return T;
}

First we check that we have been called with recombination breakpoints representing a single
gene conversion event with no other recombination; the code could be made more general, but
we will leave that as an exercise for the reader. Given that situation, we get the nucleotides
present in the proposed gene conversion tract, from both genomes, and evaluate their GC content.
If the GC content of the tract that would be copied by gene conversion is greater than the GC
content of the tract that would be overwritten, we allow the proposed gene conversion event to
proceed, by returning F to SLiM. Otherwise, we reject the proposed tract, causing that gene
conversion event not to occur. (See section 22.5 for background on how recombination()
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

253

callbacks work and what their return values mean.) Note that this is probably a ridiculously
simplified model of how gene conversion actually works; it doesn’t even have any sort of
probabilistic component, but rather always accepts conversions that will increase GC content and
always rejects others. Again, a more sophisticated model is left as an exercise for the reader; but
the code here should illustrate the skeleton of how such a sophisticated model would be
approached.
Finally, we output the GC content of the population every thousand generations as the model
runs:
1:1000000 late() {
if (sim.generation % 1000 == 0)
catn(sim.generation + " GC content: " + gcContent());
}

This model gives us output like this:
Initial
1000 GC
2000 GC
3000 GC
...

GC content: 48.8
content: 48.8026
content: 48.8009
content: 48.803

0.52
0.50
0.48

GC fraction

The accumulation of increased GC content in this model is a slow affair, but if we run to
generation 1000000 and plot the result, we can see that progress is steady:

0
Generation
Of course the rate will depend upon the mutation rate, gene conversion rate, tract length, and
other variables; the slow pace here is not inherent to the model, just a consequence of the chosen
parameters.
That recipe simulated actual nucleotide sequences, with mutations explicitly changing the
sequence in a given genome; at the end of a model run we could print out the nucleotide
sequences of all individuals, in FASTA format, perhaps. The next recipe is very different; we will
not model nucleotides at all. Instead, we will model a chromosome that is assumed to be an even
mix of AT and GC initially (50% GC content, in other words), but we won’t try to keep track of
which initial locations are AT and which are GC; we have no sequence information at all. We
then model two types of mutations on that background: GC to AT, and AT to GC. By counting
these mutations, we can know whether a given tract is GC-rich or GC-poor, and we can assess the
overall GC content of a genome. Avoiding explicit modeling of nucleotides will make this recipe
much simpler and faster than the previous recipe. Again, let’s look at this model one block at a
time, starting with initialize():

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

254

initialize() {
defineConstant("L", 1e5);

// number of loci

mtAT = initializeMutationType(0, 0.5, "f", 0.0);
mtAT.color = "blue";
mtAT.tag = -1;
mtGC = initializeMutationType(1, 0.5, "f", 0.0);
mtGC.color = "red";
mtGC.tag = 1;
// chromosome of length L, with gene conversion
initializeGenomicElementType("g1", c(mtAT, mtGC), c(1,1));
initializeGenomicElement(g1, 0, L-1);
initializeMutationRate(1e-6);
initializeRecombinationRate(1e-6);
initializeGeneConversion(0.5, 100);
}

We’ll model a chromosome of length 1e5 this time, since this model is faster. We set up the
GC-to-AT and AT-to-GC mutation types, give them different colors in SLiMgui, and give them tag
values of -1 and +1 respectively; we will use those tag values to easily total up the effect of a
vector of mutations on GC content. We use the same mutation rate, recombination rate, and gene
conversion properties as in the previous model. Then we set up an initial subpopulation:
1 late() {
sim.addSubpop("p1", 1000);
}

And then we introduce a recombination() callback that implements the biased gene
conversion:
recombination() {
if (size(gcStarts) != 1)
return F; // no change unless a gene conversion
if (size(breakpoints) > 0)
return F; // no change if any recombination
// We have a single gene conversion event
gcMuts = genome2.mutations;
gcMuts = gcMuts[gcMuts.position >= gcStarts];
gcMuts = gcMuts[gcMuts.position < gcEnds];
gcGC = sum(gcMuts.mutationType.tag);
take = (gcGC > 0);
if (take)
return F; // no change if we like the new tract
// reject
gcStarts = integer(0);
gcEnds = integer(0);
return T;
}

As in the first recipe, this callback runs only is a single gene conversion event with no
recombination is being proposed, for simplicity. In that case, we total up the GC-influencing
mutations in the proposed gene conversion tract, and if their net effect is to increase GC content
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

255

we accept the proposed gene conversion event, otherwise we reject it. (Note that this code does
not compare the tracts from the two genomes, so its criterion is a bit different from that of the first
recipe, but it would be easy to make it match; we’re just exploring possibilities here.)
Finally, we have a periodic output event:
1:10000000 late() {
if (sim.generation % 1000 == 0)
catn(sim.generation + " GC sum: " +
sum(sim.substitutions.mutationType.tag));
}

Since we can tot up the effect of all mutations on GC content very easily, we don’t use a utility
function here. Note that this code just totals up the substitutions; we don’t bother assessing the
mean GC content across all genomes. The first recipe didn’t use substitutions, since nucleotidebased models generally don’t convert fixed mutations to substitutions (because a new mutation
that introduces a different nucleotide could always occur).
When running in SLiMgui, we can see all of the AT-to-GC (red) and GC-to-AT (blue) mutations
circulating in the population:

Since this model, like the previous recipe, is neutral, we can have large numbers of these
mutations segregating in the population without it getting too slow. As it runs, this model will
produce output like:
1000
2000
3000
4000
...

GC
GC
GC
GC

sum:
sum:
sum:
sum:

0
-1
0
8

0.02
0.00

GC bias

0.04

If plotted, this shows an even steadier increase:

0

1e7
Generation

The smoother trajectory here, compared to the previous recipe, may in part be a result of the
larger population size. This plot also spans a generation range ten times that of the previous
recipe, which smooths the plot out. Finally, there may be some way in which the details of the
model, particularly the different criterion used in the recombination() callback, produces a
smoother increase.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

256

Again, which of these models is best will depend upon the application, and it may be that
neither model is particularly ideal, since SLiM is not really designed to simulate nucleotides.
However, the output from both models shows clear evidence of biased gene conversion; if a
simple model of that process is all that it needed, to simulate its effects upon some other
evolutionary process, this level of sophistication may suffice. The second model, in particular,
provides relatively low overhead in a SLiM model of biased gene conversion by “thinking outside
the box” and modeling the effects of the process – the tendency to accept some gene conversion
events and reject others, based upon the mutations present in the gene conversion tract – without
explicitly modeling the nucleotides involved. If it is the effects of the process that are important,
rather than the specific nucleotide sequences generated, that level of realism may suffice, with
considerable savings in computation time.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

257

14. Continuous-space models and interactions
This chapter introduces two new features added in SLiM 2.3: continuous space, and the

InteractionType class. These two features will be treated together in this chapter, since many of
the uses of InteractionType involve modeling spatial interactions, but in fact each feature may be

used independently of the other. Note that these are advanced, optional features.
Continuous-space support in SLiM is enabled by supplying a dimensionality parameter to the
initializeSLiMOptions() call during model initialization. This parameter may be "x", "xy", or
"xyz"; these set up SLiM to support 1D, 2D, or 3D continuous space, respectively, within each
subpopulation using the coordinate axes specified. (Setting up spaces using different coordinate
axes than these, such as 2D space using y and z, is not presently supported.) The corresponding x,
y, and z properties of Individual will then be interpreted by SLiM as spatial coordinates, rather
than simply as type float tag values. Model code can then set the positions of individuals as
desired; this is typically done in a modifyChild() callback, so that the positions of new offspring
are immediately set, but it can also be done at any other point in the generation cycle. Individual
positions can also be used by model code in any way desired; for example, spatially continuous
variation in selection could be implemented in a fitness() callback by computing an
environmental value at the spatial location of the focal individual, and then comparing the
individual’s phenotype to that environmental value (see section 14.9). When continuous space is
enabled, SLiMgui will display subpopulations graphically using the spatial positions of individuals.
Interactions between individuals can be implemented in pure Eidos, as seen previously in the
frequency-dependent model of section 9.4.1 and the green-beard model of section 9.4.4. This can
be cumbersome, however, since all of the interaction mechanics must be written by the user in
Eidos, which – as an interpreted language – is not nearly as fast as the C++ in which the SLiM
engine is written. The InteractionType class provides a solution to this problem, by providing
built-in support for fast interaction evaluation while still allowing scripted customization of the
interactions through a new type of callback, the interaction() callback. InteractionType can be
used to model non-spatial interactions, as we will see in section 14.6; but it is particularly wellsuited to modeling interactions in continuous space, since it is able to translate distances into
interaction strengths automatically using any of several different “spatial kernels” (Gaussian,
negative exponential, etc.). In addition, InteractionType is optimized for spatial searches (such as
nearest-neighbor searches) through the use of a data structure called a “k-d tree”. Together, these
features allow complex spatial interactions to be implemented in just a few lines of Eidos code,
with performance that can be orders of magnitude better than would otherwise be possible.
When continuous space is enabled in a model, the Subpopulation class now contains several
features to support that functionality. First, Subpopulation is now aware of its spatial boundaries
(the spatial coordinates of the edges of the subpopulation). Second, it can help to enforce those
boundaries through various boundary conditions. Third, it now supports “landscape maps”, gridbased maps that define the values of particular environmental variables across the spatial extent of
the subpopulation. Landscape maps allow simulations to incorporate spatial variation in
properties such as elevation or temperature – any variable relevant to the model’s dynamics.
With that introduction, let’s delve into some recipes that exemplify these new features.
14.1 A simple 2D continuous-space model
In this recipe we will explore a very simple model utilizing continuous space. In this model,
individuals will live on a two-dimensional landscape. Individuals will not interact spatially in this
model; that will be introduced in section 14.2. The model:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

258

initialize() {
initializeSLiMOptions(dimensionality="xy");
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 late() {
sim.addSubpop("p1", 500);
// initial positions are random in ([0,1], [0,1])
p1.individuals.x = runif(p1.individualCount);
p1.individuals.y = runif(p1.individualCount);
}
modifyChild() {
// draw a child position near the first parent, within bounds
do child.x = parent1.x + rnorm(1, 0, 0.02);
while ((child.x < 0.0) | (child.x > 1.0));
do child.y = parent1.y + rnorm(1, 0, 0.02);
while ((child.y < 0.0) | (child.y > 1.0));
return T;
}
2000 late() { sim.outputFixedMutations(); }

There are only a few new things about this model. First of all, continuous 2-D space is enabled
in the model with the call to initializeSLiMOptions(). Second, when the subpopulation is
created in the 1 late() event initial positions for all individuals are generated using runif().
Note that by default the spatial extent of the subpopulation spans the interval [0,1] in x and y, so
the default min and max values for runif() suffice. Third, a modifyChild() callback generates a
spatial position for the offspring individual. In this recipe, the offspring position is based upon the
position of the first (i.e. maternal) parent, as given by parent1.x and parent1.y, with random
deviations drawn from a normal distribution using rnorm(). The new positions could fall outside
of the subpopulation’s boundaries, so they are redrawn until they fall inside using while loops.
When run in SLiMgui, a typical snapshot of this model looks like this:

Here, SLiMgui is displaying each individual at its corresponding spatial position, as a small
square (colored according to fitness, as usual). Notably, spatial structure has emerged already in
this simple model, because of the way that child positions are based upon maternal positions; this
tends to encourage spatial clustering. Since spatial position is of no consequence in the model,
however, this makes no difference to the evolutionary dynamics observed.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

259

14.2 Spatial competition
The recipe in the previous section set up spatiality but did not use it. In this section we will
extend that recipe to include spatial competition between individuals. The strength of competition
between two individuals will depend upon the spatial distance between them, falling off with
increasing distance according to a Gaussian kernel with a characteristic width. This could
represent, for example, competitive interference due to overlap in foraging areas. The recipe:
initialize() {
initializeSLiMOptions(dimensionality="xy");
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
// Set up an interaction for spatial competition
initializeInteractionType(1, "xy", reciprocal=T, maxDistance=0.3);
i1.setInteractionFunction("n", 3.0, 0.1);
}
1 late() {
sim.addSubpop("p1", 500);
// initial positions are random in ([0,1], [0,1])
p1.individuals.x = runif(p1.individualCount);
p1.individuals.y = runif(p1.individualCount);
}
1: late() {
// evaluate interactions before fitness calculations
i1.evaluate();
}
fitness(NULL) {
// spatial competition
totalStrength = i1.totalOfNeighborStrengths(individual);
return 1.1 - totalStrength / p1.individualCount;
}
modifyChild() {
// draw a child position near the first parent, within bounds
do child.x = parent1.x + rnorm(1, 0, 0.02);
while ((child.x < 0.0) | (child.x > 1.0));
do child.y = parent1.y + rnorm(1, 0, 0.02);
while ((child.y < 0.0) | (child.y > 1.0));
return T;
}
2000 late() { sim.outputFixedMutations(); }

Here we have added a few new elements. First of all, a call to initializeInteractionType()
creates a new interaction type with identifier 1, which is therefore referred to as i1, in much the
same way that mutation types, genomic element types, and subpopulations are numbered and
named in SLiM. This call also defines the interaction type as using both the x and y coordinates of
the model, and sets the maximum range for the interaction as 0.3; beyond that limit interaction
strengths are always assumed to be zero, allowing better performance. The interaction is also
configured with reciprocal=T; this guarantees that the interaction strength exerted upon A by B is
equal to the interaction strength exerted upon B by A, allowing computational optimizations.
Interactions are non-reciprocal by default, so that bugs are not inadvertently introduced by an
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

260

implicit assumption of reciprocality; but in practice most interactions are reciprocal and should be
specified as such for maximal performance.
Second, a call to setInteractionFunction() tells i1 that it should convert spatial distances into
interaction strengths using a Gaussian function (represented by "n" for “normal”, to avoid
confusion with the "g" for “gamma” used elsewhere in SLiM). This Gaussian function is scaled to
have a maximum value of 3.0 and a standard deviation of 0.1, representing fairly local
interactions. It can now be seen how the maximum distance set for i1 was chosen; it is three
times the standard deviation of the interaction kernel. Interaction strengths beyond that distance
would be extremely small anyway, and are thus neglected by this model for efficiency.
Next, interaction type i1 needs to actually be used. A precondition for this is that i1 be
“evaluated” in each generation. This is done in the 1: late() event, with a call to its evaluate()
method. This takes a snapshot of the model’s spatial state at that moment in time, and i1 will then
calculate all interactions based upon that snapshot. This is necessary because under the hood,
InteractionType performs a lot of time-consuming analysis to set up spatial data structures
representing the state of the model, allowing it to respond quickly to spatial queries. It does that
analysis just once, at the point of evaluation, and the results are cached and used for all queries
until the interaction is evaluated again. (In fact, this is not quite accurate; interaction() callbacks
may be called in a deferred fashion, as discussed when we introduce interaction() callbacks.)
Having evaluated i1 in the late() event, just before fitness calculation in the generation cycle,
the model then uses i1 in a global fitness() callback to calculate fitness values that represent the
effects of spatial competition. The call to totalOfNeighborStrengths() totals up the interaction
strengths between individual (the focal individual) and all other interacting individuals in its
subpopulation. These interaction strengths, as we saw above, were configured to be calculated
using a particular Gaussian kernel, and a maximum distance of 0.3 was chosen. Those settings are
now used to find the interaction strength between individual and every other individual, and the
sum of those strengths is returned. The fitness() callback divides that total by the number of
individuals in the subpopulation to get a mean interaction strength, and subtracts that mean value
from 1.1 to get a fitness value. Those details are fairly arbitrary; the overall intent, however, is that
the stronger the competition felt by an individual, the lower the individual’s fitness.
If we run this model in SLiMgui, it looks markedly different from the previous model:

Perhaps the most obvious difference is that individuals are no longer all yellow (which indicated
neutrality). Now, individuals in relatively tight spatial clusters are orange or even red, indicating
they have fitness values below 1.0 due to the effects of competition. Those individuals will be less
likely to reproduce, whereas the individuals enjoying a lack of competition in isolated areas will
be more likely to reproduce. That leads to the other obvious difference: the space is more
completely and uniformly occupied. This model uses the same offspring positioning algorithm as
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

261

the previous recipe, which promotes spatial clustering; however, excessive clustering is now
opposed by the effects of competition, producing that more uniform distribution.
14.3 Boundaries and boundary conditions
Here we will pause to examine the question of boundaries and boundary conditions in spatial
models. When a new offspring individual is given a position that is based upon the parental
position(s) plus some random deviation, as is commonly done, the question arises of how to
constrain those new positions; we have already confronted that question, in fact, in the
modifyChild() callbacks we have used to set offspring positions.
One option is simply to impose no constraint; space is then considered to be infinite in extent
in all directions. This is the default in SLiM, simply because SLiM imposes no default constraint;
however, it is rarely desirable in individual-based models since a finite population on an infinite
landscape has zero density – rarely an interesting or biologically relevant case. Instead, models
generally use one of several standard boundary conditions. In this section we will break from our
usual conventions and will simply assume all of the code from the previous recipe (section 14.2);
here we will present only replacement modifyChild() callbacks to implement each of four
standard boundary conditions. The literature contains a good deal of discussion of these different
boundary conditions and the effects they may have on evolutionary dynamics; we will not review
that literature here. We will focus on implementation.
First of all, with “stopping” boundaries, new positions are simply clamped to the spatial
boundary; points outside the boundary are forced to the nearest point inside bounds. In SLiM, this
can be implemented as:
modifyChild() {
// Stopping boundary conditions
pos = parent1.spatialPosition + rnorm(2, 0, 0.02);
child.setSpatialPosition(p1.pointStopped(pos));
return T;
}

This introduces several new concepts. The spatialPosition property of Individual provides a
float vector of the spatial coordinates of the individual; since this model has "xy" dimensionality,
this vector has two elements, with the same values as the x and y properties that we have been
using. The setSpatialPosition() is just the reciprocal of this, taking a float vector of
coordinates and setting them into the x and y properties of the Individual in one operation, as a
convenient shorthand. Finally, the pointStopped() method of Subpopulation implements the
spatial boundary condition by clamping the coordinates in the float vector it is passed to the
boundaries of the subpopulation and returning the clamped vector. So all in all, this callback gets
the coordinates of the parent, adds a normal draw to each (thus deviating the offspring’s x and y
coordinates from those of the parent), asks the subpopulation to clamp the new position into
bounds, and sets the final position into the child. All of this is just convenient shorthand; it could
easily be implemented by using the x and y properties of the Individual directly, as well as the
spatial boundaries of the Subpopulation (available through its spatialBounds property) – it would
just be longer, less readable, and more error-prone.
Second, with “reflecting” boundaries, new positions outside the spatial boundary are reflected
to fall inside it; the extent to which a point lies outside an edge is translated into the extent to
which the point lies inside that edge instead. This requires just a trivial modification of the
previous recipe, substituting the pointReflected() method in place of pointStopped():

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

262

modifyChild() {
// Reflecting boundary conditions
pos = parent1.spatialPosition + rnorm(2, 0, 0.02);
child.setSpatialPosition(p1.pointReflected(pos));
return T;
}

Third, with “absorbing” boundaries, new offspring whose proposed positions lie outside bounds
are absorbed – their generation is terminated. This is accomplished in SLiM by returning F from
the modifyChild() callback, which tells SLiM to start over from scratch by choosing new parents
and generating a completely new offspring:
modifyChild() {
// Absorbing boundary conditions
pos = parent1.spatialPosition + rnorm(2, 0, 0.02);
if (!p1.pointInBounds(pos))
return F;
child.setSpatialPosition(pos);
return T;
}

This uses the pointInBounds() method of Subpopulation to test whether the new position lies
inside the spatial boundaries. If not, F is returned, terminating generation of the proposed child.
Fourth, with “reprising” boundaries, if a proposed position falls outside bounds a new position
is generated until a position within bounds is obtained. This is the boundary condition we
implemented the hard way in the previous recipes, but it can be done a little more easily (and
more generally, since the spatial boundaries are now not hard-coded):
modifyChild() {
// Reprising boundary conditions
do pos = parent1.spatialPosition + rnorm(2, 0, 0.02);
while (!p1.pointInBounds(pos));
child.setSpatialPosition(pos);
return T;
}

This again uses pointInBounds(), but this time if the point is not inside bounds the code loops
back to generate a new point, until a point within bounds is obtained.
Finally, with “periodic” boundaries space is constructed in such a manner that boundaries do
not exist but the spatial extent is nevertheless finite; the spatial topology is instead cylindrical or
toroidal, wrapping around at some or all edges. This boundary condition is a more complex topic
than the other boundary conditions, since it actually changes the way in which distances are
calculated; we will therefore defer a presentation of it until section 14.12. It’s an important topic,
though, since periodic boundaries can avoid problematic “edge effects” as discussed there.
We have been using Subpopulation methods to check/enforce the boundary conditions for us.
Subpopulation is the class responsible for landscape-level properties such as the coordinates of the
spatial boundaries, as well as other state we will see later. By default the spatial boundaries used
by Subpopulation span the interval [0,1] in each dimension, and we have allowed that default to
stand in the models shown here. This can be changed, using the setSpatialBounds() method of
Subpopulation, but we will not pursue that here (see section 14.7 for an example). In any case,
the Subpopulation methods we have seen will check and enforce whatever boundaries are set.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

263

One final point worth mentioning is that Subpopulation provides a more general way to set up
initial random positions. These recipes will use this code to do so:
1 late() {
sim.addSubpop("p1", 500);
// Initial positions are random within spatialBounds
for (ind in p1.individuals)
ind.setSpatialPosition(p1.pointUniform());
}

The pointUniform() method generates a point drawn at random from within the spatial bounds
of the subpopulation (drawing each coordinate from a uniform distribution spanning the range
within bounds). This is equivalent to the code presented in earlier recipes, but it will also work
properly if the spatial bounds of the subpopulation have been changed from the default, whereas
the earlier code would not (since the [0,1] interval is hard-coded there).
14.4 Mate choice with a spatial kernel
Now we will work with the “reprising” boundary condition model of section 14.3, and we will
add the element of spatial mate choice. The likelihood that one individual will choose another
individual as a mate will depend upon the spatial distance between them, falling off with
increasing distance according to a Gaussian kernel (but a different one from the competition
kernel). This could represent, for example, mate-finding based upon auditory cues such as
birdsong which become progressively less perceptible as distance increases.
Doing this is remarkably easy. We just need to add a second interaction type, and utilize it in a
mateChoice() callback:
initialize() {
initializeSLiMOptions(dimensionality="xy");
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
// spatial competition
initializeInteractionType(1, "xy", reciprocal=T, maxDistance=0.3);
i1.setInteractionFunction("n", 3.0, 0.1);
// spatial mate choice
initializeInteractionType(2, "xy", reciprocal=T, maxDistance=0.1);
i2.setInteractionFunction("n", 1.0, 0.02);
}
1 late() {
sim.addSubpop("p1", 500);
for (ind in p1.individuals)
ind.setSpatialPosition(p1.pointUniform());
}
1: late() {
i1.evaluate();
i2.evaluate();
}

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

264

fitness(NULL) {
totalStrength = i1.totalOfNeighborStrengths(individual);
return 1.1 - totalStrength / p1.individualCount;
}
1: mateChoice() {
// spatial mate choice
return i2.strength(individual);
}
modifyChild() {
do pos = parent1.spatialPosition + rnorm(2, 0, 0.02);
while (!p1.pointInBounds(pos));
child.setSpatialPosition(pos);
return T;
}
2000 late() { sim.outputFixedMutations(); }

Interaction type i1 is used for competitive interactions, as before. We now declare i2 as well,
which we will use for mate choice. It also uses a Gaussian kernel, this time with a maximum
value of 1.0, a standard deviation of 0.02, and a maximum distance of 0.1. This is quite a narrow,
short-range mate-choice function that will promote local mating quite strongly.
Having declared i2, we then need to evaluate it in the 1: late() event, just as we did with i1.
We are then prepared to use it in the 1: mateChoice() callback, which simply returns the result of
the call i2.strength(individual). This call asks i2 to evaluate the strength of interaction
between the first parent (individual) and all other individuals in its subpopulation. The result is
returned as a vector. Happily, that is precisely what mateChoice() callbacks are expected to return
– a vector of mating weights between the first parent and all other individuals. The result can
therefore simply be passed on to SLiM without further processing.
That is all that is needed; we now have a model that includes both spatial competition and
spatial mate choice, using different kernels. If we run this model in SLiMgui, it looks essentially
identical to the model of sections 14.2, but the evolutionary dynamics under the hood are quite
different. We can explore that with a quick modification of these recipes, by dropping in the
heterozygosity calculator used in section 13.2. The code from the final output event of that recipe
can be used without modification. In a quick experiment, we can do 25 runs of the models, run to
generation 10000 to allow equilibration, and output the mean nuclear heterozygosity at the end of
each run. Let’s also do runs of the equivalent non-spatial model (constructed by simply removing
all of the interaction code, the fitness() callback, the mateChoice() callback, and all references
to spatial positions). Simple summary statistics across the mean nuclear heterozygosity values
obtained from the 25 runs of each of the three models look like this:
Mean

Std. Deviation

Non-spatial

0.0002148

0.000092

Competition only (14.2)

0.0001934

0.000071

Competition and mate choice (14.4)

0.0001005

0.000067

Model

The first thing to note is that the outcomes of the non-spatial and competition-only models are
quite similar. Indeed, even with 25 samples each, a t-test finds that they do not differ significantly
(p = 0.3651). Apparently local offspring generation and spatial competition do not suffice to drive
much, if any, genetic divergence among spatial clusters. This is not too surprising, since with nonspatial mating in each generation the gene flow blending clusters together is very high.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

265

The second thing to note, however, is that the model with spatial mate choice – this section’s
recipe – is quite different from the other two. This is confirmed by t-test; it is highly significantly
different from both the non-spatial model (p = 0.0000104) and the competition-only model
(p = 0.0000204). The addition of spatial mate choice (especially with a narrow kernel) has driven
substantial genetic divergence among spatial clusters. We have ourselves a real spatial model.
14.5 Mate choice with a nearest-neighbor search
In this section we will modify the previous recipe to implement spatial mate choice using a
nearest-neighbor search instead of a spatial kernel. Each individual choosing a mate will make a
choice from among its three nearest neighbors, without further consideration of distance. In this
recipe the selection will be random, but it would be simple to make the mate choice depend upon
some genetic or non-genetic trait of the prospective mates, reflecting choosy mate selection from
within a small pool of nearby candidates.
We will begin with the recipe of section 14.4. To modify it for the present recipe, we need only
to replace the mateChoice() callback with the following:
1: mateChoice() {
// nearest-neighbor mate choice
neighbors = i2.nearestNeighbors(individual, 3);
mates = rep(0.0, p1.individualCount);
mates[neighbors.index] = 1.0;
return mates;
}

The nearestNeighbors() method returns the three (3) nearest neighbors of the first parent
(individual). In SLiM, a “neighbor” is an individual within the maximum interaction distance of
the focal individual (other than the focal individual itself); individuals beyond that maximum
cannot be “neighbors”, even if they are the nearest individual to the focal individual. It is therefore
possible for this method to return fewer than three individuals; indeed, if individual is spatially
isolated this method might return an empty vector. The rest of the callback is written with that
possibility in mind, however. First, a mating vector is constructed with 0 for all potential mates.
Then the values for the neighbors found (if any) are changed to 1, using the subpopulation indices
of the neighbors. Finally, that modified vector is returned. If individual had no neighbors, no
indices will be changed, and the returned vector will contain nothing but zeros; this will cause
SLiM to draw a new first parent, just as a return value of float(0) would.
Note that the nearestNeighbors() call is purely a spatial query; it does not involve the
calculation of interaction strengths at all. In this recipe, the parameters that configure i2 – the
Gaussian function’s maximum value and standard deviation, and indeed the fact that a Gaussian
function is to be used at all – are therefore irrelevant and can be removed. The complete recipe:
initialize() {
initializeSLiMOptions(dimensionality="xy");
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
// spatial competition
initializeInteractionType(1, "xy", reciprocal=T, maxDistance=0.3);
i1.setInteractionFunction("n", 3.0, 0.1);
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

266

// spatial mate choice
initializeInteractionType(2, "xy", reciprocal=T, maxDistance=0.1);
}
1 late() {
sim.addSubpop("p1", 500);
for (ind in p1.individuals)
ind.setSpatialPosition(p1.pointUniform());
}
1: late() {
i1.evaluate();
i2.evaluate();
}
fitness(NULL) {
totalStrength = i1.totalOfNeighborStrengths(individual);
return 1.1 - totalStrength / p1.individualCount;
}
1: mateChoice() {
// nearest-neighbor mate choice
neighbors = i2.nearestNeighbors(individual, 3);
mates = rep(0.0, p1.individualCount);
mates[neighbors.index] = 1.0;
return mates;
}
modifyChild() {
do pos = parent1.spatialPosition + rnorm(2, 0, 0.02);
while (!p1.pointInBounds(pos));
child.setSpatialPosition(pos);
return T;
}
2000 late() { sim.outputFixedMutations(); }

We have now seen three different types of spatial query using InteractionType. The first is the
use of totalOfNeighborStrengths() to add up all of the interactions felt by a focal individual from
every other interacting individual; we used this to build spatial competition. The second is
strength(), which we used to obtain a vector of interaction strengths between the focal individual
and all other individuals (it can also calculate the interaction strengths with specific other
individuals). And the third is the nearestNeighbors() method used here, which returns a vector
with (up to) a specified number of the nearest neighbors of a focal individual. IndividualType
supports several other queries, such as distance() to get distances between the focal individual
and others, distanceToPoint() and nearestNeighborsOfPoint() to do those queries using an
arbitrary spatial point rather than a focal individual, and drawByStrength() to draw neighbors of a
focal individual weighted by interaction strength (a fast combination of nearestNeighbors(),
strength(), and sample(), conceptually). Instead of providing examples of the use of all these
methods – which are all documented in section 21.7.2 – we will now change direction to start
building – gradually – towards models of landscape heterogeneity using landscape maps.
14.6 Divergence due to phenotypic competition with an interaction() callback
Here we will depart from the previous recipes, which have all used neutral spatial models, to
explore a different topic: a non-neutral, non-spatial model. This recipe will involve a quantitative
trait, using a strategy similar to that in the recipe of section 13.10. It will use an InteractionType
to model competition among individuals based upon their phenotype: individuals with a more
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

267

similar phenotype will compete more strongly (since they utilize similar resources). The goal here
is to demonstrate the use of an interaction() callback to influence the interaction strengths
calculated by SLiM. It is most straightforward to introduce this first in a non-spatial model; in a
later section we will see the use of an interaction() callback in a spatial model.
This recipe is a first step toward something like the model of Dieckmann & Doebeli (1999): a
model demonstrating that phenotypic competition and assortative mating can produce speciation
in a non-spatial (i.e. sympatric) model. More specifically, we will be working toward the sexual,
“ecological trait” model described in that paper, although our model will be different in many
ways – it will be a generational model rather than a continuous-time model, and we will not
include the “mating trait” used in that paper, for example. We will continue the development of
this model in the next several sections. Here’s how we start:
initialize() {
defineConstant("optimum", 5.0);
defineConstant("sigma_C", 0.4);
defineConstant("sigma_K", 1.0);
initializeMutationRate(1e-6);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "n", 0.0, 1.0);
m2.convertToSubstitution = F;

// neutral
// QTL

initializeGenomicElementType("g1", c(m1, m2), c(1, 0.01));
initializeGenomicElement(g1, 0, 1e5 - 1);
initializeRecombinationRate(1e-8);
}
1 late() {
sim.addSubpop("p1", 500);
}
1: late() {
// construct phenotypes from the additive effects of QTLs
inds = sim.subpopulations.individuals;
inds.tagF = inds.sumOfMutationsOfType(m2);
}
fitness(m2) {
// make QTLs intrinsically neutral
return 1.0;
}
fitness(NULL) {
// reward proximity to the optimum
return 1.0 + dnorm(optimum - individual.tagF, mean=0.0, sd=sigma_K);
}
1:2001 late() {
if (sim.generation == 1)
cat(" gen
mean
sd\n");
if (sim.generation % 100 == 1)
{
tags = p1.individuals.tagF;
cat(format("%5d ", sim.generation));
cat(format("%6.2f ", mean(tags)));
cat(format("%6.2f\n", sd(tags)));
}
}

This is a simple model of randomly arising QTLs with additive effects that produce a phenotype;
we have seen this sort of model before in sections 13.1 and 13.10 (see those recipes for further
discussion). A phenotypic optimum is defined at 5.0, and a global fitness() callback produces
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

268

stabilizing selection around that optimum by calculating a relative fitness based on divergence
from the optimum, using a Gaussian function with a width governed by the defined constant
sigma_K. The individual QTLs are made effectively neutral with the fitness(m2) callback, so that
fitness is determined only by the overall phenotype they produce additively. That phenotype is
calculated in the 1: late() event and assigned into the tagF properties of the individuals.
Now let’s add something new: an interaction to govern phenotypic competition. First, we
create the InteractionType by adding these lines to the end of the initialize() callback:
initializeInteractionType(1, "", reciprocal=T);
i1.setInteractionFunction("f", 1.0);

// competition

Now i1 will evaluate our phenotypic competition interaction, which is a reciprocal interaction
as in previous recipes. It has a spatiality of "" because this is a non-spatial model. This means that
it has no concept of distance, so the only interaction formula we are allowed to use is a fixed
value, type "f". So in its present form this interaction is not terribly useful; every individual
interacts with every other individual with a strength of 1.0. We will fix that momentarily. First,
however, let’s add code to evaluate the interaction, at the end of the 1: late() event:
// evaluate interactions
i1.evaluate();

At this point, let’s pause for a moment. We have stabilizing selection toward an optimum at
and we start with a mean phenotype of 0.0 since the model starts with no QTLs. At the end
of every tenth generation, the output event prints out the generation with the mean and standard
deviation of the population’s phenotypic values. A typical run looks something like this:
5.0,

gen
1
101
201
301
401
501
601
701
801
901
1001
1101
1201
1301
1401
1501
1601
1701
1801
1901
2001

mean
0.00
-0.00
0.00
0.02
-0.06
-0.21
-0.27
0.12
0.21
0.17
1.26
4.52
4.86
4.87
4.93
5.12
4.96
4.97
5.01
4.84
4.92

sd
0.00
0.36
0.12
0.26
0.38
1.16
0.91
0.46
0.38
0.30
1.40
0.46
0.50
0.43
0.48
0.39
0.41
0.39
0.35
0.30
0.27

It took a little while for the population to develop enough useful variance to be able to reach
the fitness peak, but it got there in the end. The population often contains an appreciable amount
of variance in the middle of the evolutionary trajectory, but as it settles onto the optimum the
variance decreases and tends toward zero, since any deviation from the phenotypic optimum is
punished by the global fitness function. In practice, as seen above, the standard deviation
fluctuates around 0.3 or 0.4 for quite a while after the population reaches the optimum.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

269

Now let’s finish the model by implementing phenotypic competition. First of all, we will make
the interaction i1 more useful by adding an interaction() callback:
interaction(i1) {
return dnorm(exerter.tagF, mean=receiver.tagF, sd=sigma_C) /
dnorm(0.0, mean=0, sd=sigma_C);
}

Since this is the first time we’ve seen one of these, let’s examine it closely. It defines the
interaction strength of interaction type i1. It uses two variables, exerter and receiver, that are
defined by SLiM; they are pseudo-parameters of the interaction() callback, similar to those we
have seen with other types of callbacks. The exerter is the individual exerting the interaction, and
the receiver is the individual receiving the interaction; since this interaction is reciprocal, the
distinction between the two is arbitrary. The callback looks up the phenotypic values of the two
individuals (from their tagF fields) and uses dnorm() to calculate the degree of competition
between the two, based upon a Gaussian kernel with width sigma_C. Finally, the interaction
strength is normalized to have a maximum value of 1.0 here by dividing by dnorm(0.0, mean=0,
sd=sigma_C), so that the maximum strength of competition is independent of the width of the
competition kernel.
SLiM will now use this interaction() callback to determine the strength of i1 between every
pair of individuals. Note that this callback calculates the strength from scratch; it is also possible
for an interaction() callback to modify the default strength calculated by the interaction
function, which is supplied to the callback in the pseudo-parameter strength (see section 22.6 on
other pseudo-parameters available to interaction() callbacks).
Now, finally, we need to actually use i1 to influence the fitnesses of individuals. We will do
that in a global fitness() callback:
fitness(NULL) {
// phenotypic competition
totalStrength = sum(i1.strength(individual));
return 1.0 - totalStrength / p1.individualCount;
}

For each individual, this callback adds up the interaction strengths it feels from every other
individual (with sum()), divides by p1.individualCount to get the mean interaction strength, and
subtracts that from 1.0 to get a relative fitness value, which is returned. In short, the more
competition an individual feels from other individuals of similar phenotype, the lower the
individual’s fitness will be. If we run this full model now, we see something like:
gen
1
101
201
301
401
501
...
1301
1401
1501
1601
1701
1801
1901
2001

mean
0.00
1.62
4.57
4.76
4.87
4.75
...
4.98
5.04
5.43
5.01
5.20
5.50
5.22
5.54

sd
0.00
2.71
1.38
1.41
1.22
1.44
...
1.38
1.60
1.61
1.32
1.30
1.78
1.39
1.75

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

270

Note that the phenotypic variance rises to higher levels now, and stays high for the remainder of
the run. The theoretical expectation is that the variance will remain high indefinitely because the
presence of phenotypic competition rewards divergence from the mean phenotype.
A key point to understand regarding this model is that the interaction() callbacks will be
called while fitness values are being calculated, as a side effect of the i1.strength() call in the
global fitness() callback. When evaluate() is called on an InteractionType, the positions of all
individuals are saved in a snapshot, and distances and interaction strengths are calculated based
upon that snapshot. When interaction() callbacks are involved, however, they are called in a
deferred fashion because they are slow; calling them may not be necessary at all, if particular
interaction strengths are never queried, so deferring the calls can provide a very large performance
gain. It is admittedly strange, however, that interaction() callbacks are called at a later time than
the “official” evaluation of the interaction; indeed, they can be called up until mating begins in the
following generation, depending upon when the model queries the interaction. If this fact proves
difficult to manage in a model, it is possible to supply an immediate=T option to evaluate() that
forces all interactions to be fully and synchronously evaluated at that point, including callouts to
all interaction() callbacks. This is not necessary in this model, since having interaction()
callbacks get called during fitness evaluation poses no logistical difficulties; but this is a point
worth keeping in mind when developing your own interaction() callbacks.
The full model looks like this:
initialize() {
defineConstant("optimum", 5.0);
defineConstant("sigma_C", 0.4);
defineConstant("sigma_K", 1.0);
initializeMutationRate(1e-6);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "n", 0.0, 1.0);
m2.convertToSubstitution = F;

// neutral
// QTL

initializeGenomicElementType("g1", c(m1, m2), c(1, 0.01));
initializeGenomicElement(g1, 0, 1e5 - 1);
initializeRecombinationRate(1e-8);
initializeInteractionType(1, "", reciprocal=T);
i1.setInteractionFunction("f", 1.0);

// competition

}
1 late() {
sim.addSubpop("p1", 500);
}
1: late() {
// construct phenotypes from the additive effects of QTLs
inds = sim.subpopulations.individuals;
inds.tagF = inds.sumOfMutationsOfType(m2);
// evaluate interactions
i1.evaluate();
}
fitness(m2) {
// make QTLs intrinsically neutral
return 1.0;
}
fitness(NULL) {
// reward proximity to the optimum
return 1.0 + dnorm(optimum - individual.tagF, mean=0.0, sd=sigma_K);
}

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

271

interaction(i1) {
return dnorm(exerter.tagF, mean=receiver.tagF, sd=sigma_C) /
dnorm(0.0, mean=0, sd=sigma_C);
}
fitness(NULL) {
// phenotypic competition
totalStrength = sum(i1.strength(individual));
return 1.0 - totalStrength / p1.individualCount;
}
1:2001 late() {
if (sim.generation == 1)
cat(" gen
mean
sd\n");
if (sim.generation % 100 == 1)
{
tags = p1.individuals.tagF;
cat(format("%5d ", sim.generation));
cat(format("%6.2f ", mean(tags)));
cat(format("%6.2f\n", sd(tags)));
}
}

Since this model does not include assortative mating, it is not expected to produce speciation
(and we will see in the next section that it does not). At present, it is merely a model of what
Haller & Hendry (2013) called “squashed stabilizing selection”: selection based upon a fitness
function that is stabilizing around an optimum, but has been squashed downward at its center due
to negative frequency-dependent selection, producing disruptive selection that nevertheless keeps
the population in the vicinity of the fitness peak. We will add assortative mating to this model in
the recipe of section 14.8, but first, let’s examine a way to improve upon the model as it now
stands.
14.7 Modeling phenotype as a spatial dimension
In the previous section, we built a model of a quantitative trait constructed from randomly
arising QTLs that combined additively to determine the phenotype. We then built an interaction to
model competition based upon similarity in phenotype. In this section we will modify that recipe
slightly, to better utilize the power of InteractionType. Although this model will remain a nonspatial model for the time being, we will now model the phenotype in SLiM as if it were a spatial
dimension. This will bring us two benefits: speed, and additional visualization power.
This change is quite straightforward. Beginning with the model of section 14.6, we first need to
enable spatiality at the beginning of the initialize() callback:
initializeSLiMOptions(dimensionality="x");

We will use the x property of individuals to store their phenotypes now, instead of tagF:
inds.x = inds.sumOfMutationsOfType(m2);

The fitness() callback that enforces the phenotypic optimum should change to use x:
fitness(NULL) {
// reward proximity to the optimum
return 1.0 + dnorm(optimum - individual.x, mean=0.0, sd=sigma_K);
}

The interaction() callback also needs the same change:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

272

interaction(i1) {
return dnorm(exerter.x, mean=receiver.x, sd=sigma_C) /
dnorm(0.0, mean=0, sd=sigma_C);
}

Finally, the output code should also use x instead of tagF (see the full model below).
Let’s make one other change. Previously, we’ve seen two-dimensional spatial models that used
the default boundaries of [0, 1] in x and y. In section 14.3 we mentioned that this default could be
changed; let’s change it now, because in this model phenotypic values often go outside that range.
We can add a line immediately after p1 is defined, telling p1 its spatial boundaries:
p1.setSpatialBounds(c(0.0, 10.0));

That’s it; we now have a model in which phenotype is considered to be a pseudo-spatial
dimension. Why is this model better? The first reason is that it provides us with better
visualization capabilities in SLiMgui. If we run the model now, we get a display that shows
individuals in a one-dimensional phenotypic space (with random y coordinates chosen by SLiM to
spread the individuals out for greater visibility):

Since we told p1 that its spatial boundaries were [0.0, 10.0], that is the spatial extent displayed
by SLiMgui here, so the phenotypic optimum is at the horizontal center of the display here. This
snapshot, taken at about generation 300, shows that the population has already reached the
optimum; the mean phenotype here is 4.91, in fact, with a standard deviation of 1.40 – typical of
the model’s equilibrium state. Colors correspond to fitness as usual; the band of individuals on the
left is particularly low-fitness because it is far off of the optimum, and yet is also quite crowded –
there are a lot of individuals with that phenotype at this instant in time. All of the discrete banding
here is the result of variation in QTLs of relatively large effect. Note that individuals quite far off
the fitness peak can still be fairly high-fitness, as long as their phenotype is rare.
So this is one benefit of treating phenotype as a spatial dimension: the dynamic evolution of the
phenotypic distribution can be watched in real time, color-coded by fitness as the fitness function
resulting from the squashed stabilizing selection fluctuates from generation to generation.
The other benefit is that the model is much faster, because InteractionType’s optimizations for
spatial interactions are brought to bear. First, if we realize that the Gaussian phenotypic
competition function we’re presently simulating with an interaction() callback is equivalent to
one of SLiM’s built-in interaction functions, then we can completely remove the interaction()
callback. We then need to change the definition of the i1 interaction like so:
initializeInteractionType(1, "x", reciprocal=T, maxDistance=sigma_C * 3);
i1.setInteractionFunction("n", 1.0, sigma_C);

This defines i1 as a spatial interaction using x, which is our phenotypic “dimension”. It then
tells i1 to use a Gaussian function with a maximum value of 1.0 and a width of sigma_C –
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

273

precisely what the interaction() callback used to do. Indeed, this model is exactly equivalent;
running it with the same random number seed produces the same result as shown in the snapshot
above. The difference is that it runs much faster – it completes about 4750 generations in 30
seconds (on my machine), whereas the version using an interaction() callback completes about
420 generations in 30 seconds. Not a bad speed improvement! The key is that we have avoided
having to make a call to an Eidos interaction() callback for every interaction, which is slow.
Indeed, even a little more speed can be squeezed out, at the price of accuracy. If a maximum
distance of 0.8 is set for i1, and totalOfNeighborStrengths() is used instead of strength() in the
global fitness() callback that calculates phenotypic competition, about 5030 generations can be
completed in 30 seconds. Since a maximum distance of 0.8 is two standard deviations of the
competition kernel, the effects of this change should be relatively small in terms of the overall
behavior of the model, but it is nevertheless a sacrifice in accuracy, and the performance gain in
this case is not large, so we will not consider this change to be a part of the official recipe for this
section. In other cases, however, the performance gain can be very large; a maximum distance
should always be considered for spatial interactions, particularly if it can safely be set to a short
enough distance to exclude most other individuals from interacting with the focal individual at all.
The full recipe for this section, for posterity, is:
initialize() {
defineConstant("optimum", 5.0);
defineConstant("sigma_C", 0.4);
defineConstant("sigma_K", 1.0);
initializeSLiMOptions(dimensionality="x");
initializeMutationRate(1e-6);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "n", 0.0, 1.0);
m2.convertToSubstitution = F;

// neutral
// QTL

initializeGenomicElementType("g1", c(m1, m2), c(1, 0.01));
initializeGenomicElement(g1, 0, 1e5 - 1);
initializeRecombinationRate(1e-8);
initializeInteractionType(1, "x", reciprocal=T,
maxDistance=sigma_C * 3);
// competition
i1.setInteractionFunction("n", 1.0, sigma_C);
}
1 late() {
sim.addSubpop("p1", 500);
p1.setSpatialBounds(c(0.0, 10.0));
}
1: late() {
// construct phenotypes from the additive effects of QTLs
inds = sim.subpopulations.individuals;
inds.x = inds.sumOfMutationsOfType(m2);
// evaluate interactions
i1.evaluate();
}
fitness(m2) {
// make QTLs intrinsically neutral
return 1.0;
}
fitness(NULL) {
// reward proximity to the optimum
return 1.0 + dnorm(optimum - individual.x, mean=0.0, sd=sigma_K);
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

274

fitness(NULL) {
// phenotypic competition
totalStrength = sum(i1.strength(individual));
return 1.0 - totalStrength / p1.individualCount;
}
1:2001 late() {
if (sim.generation == 1)
cat(" gen
mean
sd\n");
if (sim.generation % 100 == 1)
{
tags = p1.individuals.x;
cat(format("%5d ", sim.generation));
cat(format("%6.2f ", mean(tags)));
cat(format("%6.2f\n", sd(tags)));
}
}

14.8 Sympatric speciation facilitated by assortative mating
In the previous section we refined our non-spatial model of phenotypic competition, by treating
phenotype as though it were a spatial dimension so that SLiM could run the model more quickly
and with better visualization. It is now time to take another step toward the model of Dieckmann
& Doebeli (1999) by changing mating in the model to be assortative by phenotype. This will in
some ways be similar to what we did in section 14.4; but there mating was spatially assortative,
whereas here mating will be phenotypically assortative.
Beginning with the model of section 14.7, the changes needed are quite small. First, we need
to add code in our initialize() callback to define a new interaction type, i2, that we will use to
evaluate mates:
initializeInteractionType(2, "x", reciprocal=T, maxDistance=sigma_M * 3);
i2.setInteractionFunction("n", 1.0, sigma_M);

We have to add a definition of the sigma_M kernel width as well, at the beginning of the
initialize() callback:
defineConstant("sigma_M", 0.5);

Since this interaction is based on "x", which is our pseudo-spatial phenotype dimension, it
represents phenotypic proximity just as i1, our phenotypic competition interaction, does. Indeed,
the only reason not to use i1 to govern mate choice as well is that we want to be able to make it
use a different interaction function than competition, so that we can play with the width of the
mate-choice kernel independently of the width of the competition kernel.
Next, we need to evaluate that interaction in our 1: late() event every generation:
i2.evaluate();

And finally, we need a mate choice callback that uses that interaction:
mateChoice() {
// spatial mate choice
return i2.strength(individual);
}

And that’s it. We now have a model in which phenotype, as determined by randomly arising
additive QTLs, is something like a “magic trait” (Servedio et al. 2011) – a trait that simultaneously
determines both fitness (because of the phenotypic optimum) and assortative mating (because of
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

275

the mateChoice() callback) (although it is perhaps not strictly a magic trait since it is governed by
underlying QTLs; see sections 11.1 and 13.1 for more discussion of this topic). When exposed to
the disruptive selection caused by phenotypic competition in the model, the magic-ish trait here
readily produces speciation. (Note that this is true speciation, not evolutionary branching, because
this is a sexual model in the sense that matters – biparental mating with assortment and
recombination of gametes – even though it models hermaphrodites, not separate sexes).
We can see evidence for speciation in two different ways in SLiMgui. First of all, instead of the
cloud of different phenotypes around the optimum that the previous model generated, we now see
a set of discrete phenotypic clusters:

Second, we now see strong genetic separation between these clusters in the pattern of neutral
variation exhibited by the population:

Note that there are no neutral mutations at high frequency at all; instead, there is a large
amount of neutral diversity that is pinned at a couple of intermediate frequencies. These are blocs
of neutral mutations that have fixed within one of the species in the model, but are unable to
spread more widely because of the reproductive isolation between the species. Hybridization is
not impossible; it does occur occasionally, as can be seen in the snapshot above. But it is rare
enough that gene flow between the species is insufficient to allow that neutral diversity to spread.
It would be fair to argue that although this is a model of speciation given a pre-existing magic
trait, it is not a model of the emergence of the magic trait itself, because it lacks the “mating trait”
of the Dieckmann & Doebeli (1999) model. Adding in such a trait would be a simple extension of
the present model, and would presumably support the result found by Dieckmann & Doebeli
(1999) – that in an ecological scenario such as this, assortative mating will readily emerge,
transforming an ordinary trait into a magic trait that then facilitates speciation. We will leave that
extension of the model as an exercise for the reader, since pursuing it would not serve the aims of
this chapter. Instead, in the next section we will turn back toward spatial models, while
incorporating what we have done here with QTLs and phenotype-based competition and mating.
Here’s the full model, for the record:
initialize() {
defineConstant("optimum",
defineConstant("sigma_C",
defineConstant("sigma_K",
defineConstant("sigma_M",

5.0);
0.4);
1.0);
0.5);

initializeSLiMOptions(dimensionality="x");
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

276

initializeMutationRate(1e-6);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "n", 0.0, 1.0);
m2.convertToSubstitution = F;

// neutral
// QTL

initializeGenomicElementType("g1", c(m1, m2), c(1, 0.01));
initializeGenomicElement(g1, 0, 1e5 - 1);
initializeRecombinationRate(1e-8);
initializeInteractionType(1, "x", reciprocal=T,
maxDistance=sigma_C * 3);
// competition
i1.setInteractionFunction("n", 1.0, sigma_C);
initializeInteractionType(2, "x", reciprocal=T,
maxDistance=sigma_M * 3);
// mate choice
i2.setInteractionFunction("n", 1.0, sigma_M);
}
1 late() {
sim.addSubpop("p1", 500);
p1.setSpatialBounds(c(0.0, 10.0));
}
1: late() {
// construct phenotypes from the additive effects of QTLs
inds = sim.subpopulations.individuals;
inds.x = inds.sumOfMutationsOfType(m2);
// evaluate interactions
i1.evaluate();
i2.evaluate();
}
fitness(m2) {
// make QTLs intrinsically neutral
return 1.0;
}
fitness(NULL) {
// reward proximity to the optimum
return 1.0 + dnorm(optimum - individual.x, mean=0.0, sd=sigma_K);
}
fitness(NULL) {
// phenotypic competition
totalStrength = sum(i1.strength(individual));
return 1.0 - totalStrength / p1.individualCount;
}
mateChoice() {
// spatial mate choice
return i2.strength(individual);
}
1:5001 late() {
if (sim.generation == 1)
cat(" gen
mean
sd\n");
if (sim.generation % 100 == 1)
{
tags = p1.individuals.x;
cat(format("%5d ", sim.generation));
cat(format("%6.2f ", mean(tags)));
cat(format("%6.2f\n", sd(tags)));
}
}

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

277

14.9 Speciation due to spatial variation in selection
In the previous section, we observed that speciation can occur as a result of phenotypic
competition and assortative mate choice in a non-spatial model. In this section we will return to
spatial modeling, while preserving many aspects of the previous recipe. We will now introduce
spatial variation in selection, and will observe adaptive speciation among spatial groups as a result
of local selection pressures in combination with phenotypic competition. The optimum trait value
for the quantitative trait – the phenotypic optimum, in other words – will vary according to the x
position in space, producing a linear environmental gradient. This model is broadly inspired by
the model of Doebeli & Dieckmann (2003), although again there are many differences.
This recipe has enough differences from the previous recipe that we will build it here from
scratch, rather than just giving changes relative to the previous recipe. Let’s start with the setup:
initialize() {
defineConstant("sigma_C", 0.1);
defineConstant("sigma_K", 0.5);
defineConstant("sigma_M", 0.1);
defineConstant("slope", 1.0);
defineConstant("N", 500);
initializeSLiMOptions(dimensionality="xyz");
initializeMutationRate(1e-6);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "n", 0.0, 1.0);
m2.convertToSubstitution = F;

// neutral
// QTL

initializeGenomicElementType("g1", c(m1, m2), c(1, 0.1));
initializeGenomicElement(g1, 0, 1e5 - 1);
initializeRecombinationRate(1e-8);
initializeInteractionType(1, "xyz", reciprocal=T,
maxDistance=sigma_C * 3);
// competition
i1.setInteractionFunction("n", 1.0, sigma_C);
initializeInteractionType(2, "xyz", reciprocal=T,
maxDistance=sigma_M * 3);
// mate choice
i2.setInteractionFunction("n", 1.0, sigma_M);
}
1 late() {
sim.addSubpop("p1", N);
p1.setSpatialBounds(c(0.0, 0.0, -slope, 1.0, 1.0, slope));
for (ind in p1.individuals)
ind.setSpatialPosition(p1.pointUniform());
p1.individuals.z = 0.0;
}

This model is our first to use dimensionality "xyz". The x and y dimensions are true spatial
dimensions; this is a 2-D model. The z dimension will be used here for phenotype, just as we used
x as a pseudo-spatial phenotypic dimension in previous recipes. We set up QTL machinery for the
phenotype much as we did before; the fraction of mutations that are QTLs is higher in this model
just so that we don’t have to wait as long to get interesting behavior. We create two interaction
types, i1 for competition and i2 for mating, as before. Note that both of these interaction types
have spatiality "xyz"; i1 therefore encompasses both spatial and phenotypic competition in a
single interaction, and i2 encompasses both spatially and phenotypically assortative mate choice.
For our purposes here this is sufficient, but more interaction types could be defined as desired.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

278

Then we set up the population, in the 1 late() event. We set spatial boundaries of [0.0, 1.0]
for x and y, and of [-slope, slope] for z. The optimum phenotype will actually vary from
−slope/2 to slope/2, from the left edge to the right edge of the environment; the wider interval
here is used just to allow for some phenotypic variation beyond that interval. (In fact, the spatial
boundary for z is not used in this model, nor by SLiMgui, so it is irrelevant anyway). Finally, we set
random spatial positions for all of the initial individuals, and zero out their phenotypes.
Next, let’s implement the machinery to manage spatial positions and phenotypes:
modifyChild() {
// set offspring position based on parental position
do
pos = c(parent1.spatialPosition[0:1] + rnorm(2, 0, 0.005), 0.0);
while (!p1.pointInBounds(pos));
child.setSpatialPosition(pos);
return T;
}
1: late() {
// construct phenotypes from the additive effects of QTLs
inds = sim.subpopulations.individuals;
phenotype = inds.sumOfMutationsOfType(m2);
inds.z = phenotype;
// color individuals according to phenotype
for (ind in inds)
{
hue = ((ind.z + slope) / (slope * 2)) * 0.66;
ind.color = rgb2color(hsv2rgb(c(hue, 1.0, 1.0)));
}
// evaluate interactions
i1.evaluate();
i2.evaluate();
}

As in previous recipes, we use a modifyChild() callback to place offspring near their first
parent, with a small random deviation. We ensure that the offspring phenotype is 0.0 here, so that
it doesn’t cause the pointInBounds() call to return F; we don’t want SLiM to bounds-check
offspring phenotypes for us. The phenotype of 0.0 set here is overwritten a moment later, in the
1: late() event above, when phenotypes are calculated for all individuals and placed into their z
coordinate. A bit of extra code here calculates color values for individuals; in this model
individuals are colored according to their phenotype, rather than their fitness, so that spatial
adaptive divergence is directly visible in SLiMgui, as we’ll see below. The code here calculates a
hue in the HSV (hue/saturation/value) color system, then converts that to the RGB (red/green/blue)
color system and then to a hexadecimal color string which it sets as the color of the individual (see
the Eidos manual for details on these functions). Finally, the two interaction types are evaluated.
Next, let’s add all of our fitness callbacks and other interaction-related machinery:
fitness(m2) {
// make QTLs intrinsically neutral
return 1.0;
}
fitness(NULL) {
// reward proximity to the optimum
optimum = (individual.x - 0.5) * slope;
return 1.0 + dnorm(optimum - individual.z, mean=0.0, sd=sigma_K);
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

279

fitness(NULL) {
// phenotypic competition
totalStrength = sum(i1.strength(individual));
return 1.0 - totalStrength / p1.individualCount;
}
mateChoice() {
// spatial mate choice
return i2.strength(individual);
}

The m2 fitness() callback zeroes out the individual fitness effects of our QTLs, as usual. The
first global fitness() callback rewards individuals for proximity to the fitness optimum; in
previous recipes the optimum was a constant value, but now the optimum depends upon the
individual’s position in space, since we are now modeling a linear spatial environmental gradient.
Apart from the fact that the optimum now depends upon individual.x and slope, this is much the
same as before. The second global fitness() callback implements competition, both spatial and
phenotypic; it is actually unchanged from the previous recipe. Similarly, the mateChoice()
callback is unchanged, although it now scores mates based upon both spatial and phenotypic
similarity.
Finally, let’s use the same output event that we used before:
1:5001 late() {
if (sim.generation == 1)
cat(" gen
mean

sd\n");

if (sim.generation % 100 == 1)
{
tags = p1.individuals.z;
cat(format("%5d ", sim.generation));
cat(format("%6.2f ", mean(tags)));
cat(format("%6.2f\n", sd(tags)));
}
}

This prints out a history of the phenotypic mean and standard deviation every 100 generations.
Running this model does indeed produce speciation. One sign of that is that we can see
discrete clusters of individuals of different colors, representing the fact that they have adapted to
different parts of the environmental gradient:

There are three species present here, colored yellow, green, and cyan, adapted to the conditions
at the left, center, and right of the environmental gradient, respectively. (There are also a few
mutants of other colors sprinkled in.)
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

280

Another piece of evidence for speciation is the presence of large amounts of neutral diversity
that is not mixing between the different phenotypic clusters, as we saw before in section 14.8:

Many mutations are at a frequency of approximately 2/3 because they are shared between two
species; the yellow species split from the green species only about 1000 generations before these
snapshots were taken, and so the two species share a great deal of their genetic background.
Many other mutations are at a frequency of approximately 1/3 because they are shared only within
the cyan species, which split from the green species more than 8000 generations earlier.
Remarkably, not a single mutation has fixed across the whole population after 9000 generations of
runtime; the onset of speciation is quite early (in this run of the model, at least), and the
reproductive barrier is quite strong with the parameter values used here.
One could bring in other machinery to further assess the extent of adaptive divergence and
reproductive isolation, such as the FST calculation code we’ve used in a couple of recipes before.
Indeed, one could dump out the neutral diversity to a file and run a STRUCTURE analysis on it to
see whether the groups that appear to be species fall out as genetic clusters naturally. We won’t
pursue that here since the divergence is so readily apparent.
The full model, for reference:
initialize() {
defineConstant("sigma_C", 0.1);
defineConstant("sigma_K", 0.5);
defineConstant("sigma_M", 0.1);
defineConstant("slope", 1.0);
defineConstant("N", 500);
initializeSLiMOptions(dimensionality="xyz");
initializeMutationRate(1e-6);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "n", 0.0, 1.0);
m2.convertToSubstitution = F;

// neutral
// QTL

initializeGenomicElementType("g1", c(m1, m2), c(1, 0.1));
initializeGenomicElement(g1, 0, 1e5 - 1);
initializeRecombinationRate(1e-8);
initializeInteractionType(1, "xyz", reciprocal=T,
maxDistance=sigma_C * 3);
// competition
i1.setInteractionFunction("n", 1.0, sigma_C);
initializeInteractionType(2, "xyz", reciprocal=T,
maxDistance=sigma_M * 3);
// mate choice
i2.setInteractionFunction("n", 1.0, sigma_M);
}
1 late() {
sim.addSubpop("p1", N);
p1.setSpatialBounds(c(0.0, 0.0, -slope, 1.0, 1.0, slope));
for (ind in p1.individuals)
ind.setSpatialPosition(p1.pointUniform());
p1.individuals.z = 0.0;
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

281

modifyChild() {
// set offspring position based on parental position
do
pos = c(parent1.spatialPosition[0:1] + rnorm(2, 0, 0.005), 0.0);
while (!p1.pointInBounds(pos));
child.setSpatialPosition(pos);
return T;
}
1: late() {
// construct phenotypes from the additive effects of QTLs
inds = sim.subpopulations.individuals;
phenotype = inds.sumOfMutationsOfType(m2);
inds.z = phenotype;
// color individuals according to phenotype
for (ind in inds)
{
hue = ((ind.z + slope) / (slope * 2)) * 0.66;
ind.color = rgb2color(hsv2rgb(c(hue, 1.0, 1.0)));
}
// evaluate interactions
i1.evaluate();
i2.evaluate();
}
fitness(m2) {
// make QTLs intrinsically neutral
return 1.0;
}
fitness(NULL) {
// reward proximity to the optimum
optimum = (individual.x - 0.5) * slope;
return 1.0 + dnorm(optimum - individual.z, mean=0.0, sd=sigma_K);
}
fitness(NULL) {
// phenotypic competition
totalStrength = sum(i1.strength(individual));
return 1.0 - totalStrength / p1.individualCount;
}
mateChoice() {
// spatial mate choice
return i2.strength(individual);
}
1:5001 late() {
if (sim.generation == 1)
cat(" gen
mean
sd\n");
if (sim.generation % 100 == 1)
{
tags = p1.individuals.z;
cat(format("%5d ", sim.generation));
cat(format("%6.2f ", mean(tags)));
cat(format("%6.2f\n", sd(tags)));
}
}

14.10 A simple biogeographic landscape model
The time has come to introduce a new topic: landscape maps. For this, we will temporarily
abandon the spatial quantitative-trait local adaptation model we have been building, and switch to
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

282

a much simpler model. Our goal in this section is to build a simple biogeographic model that uses
a landscape map. Just for fun, the map we will use is a map of the world. There are no end of
issues with this – the map projection distorts the geography, Russia and Alaska are not connected
across the Bering Strait because the spatial boundary intervenes, and so forth. But as a simple toy
model to illustrate the concept and the potential of landscape maps, it should serve our purposes.
Let’s start with the model, and then delve into the concepts:
initialize() {
initializeSLiMOptions(dimensionality="xy");
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
// spatial competition
initializeInteractionType(1, "xy", reciprocal=T, maxDistance=30.0);
i1.setInteractionFunction("n", 5.0, 10.0);
// spatial mate choice
initializeInteractionType(2, "xy", reciprocal=T, maxDistance=30.0);
i2.setInteractionFunction("n", 1.0, 10.0);
}
1 late() {
sim.addSubpop("p1", 1000);
p1.setSpatialBounds(c(0.0, 0.0, 540.0, 217.0));
mapLines = rev(readFile("~/Desktop/world_map_540x217.txt"));
mapLines = sapply(mapLines, "strsplit(applyValue, '') == '#';");
mapValues = asFloat(mapLines);
p1.defineSpatialMap("world", "xy", c(540, 217), mapValues,
valueRange=c(0.0, 1.0), colors=c("#0000CC", "#55FF22"));
// start near a specific map location
for (ind in p1.individuals) {
ind.x = rnorm(1, 300.0, 1.0);
ind.y = rnorm(1, 100.0, 1.0);
}
}
1: late() {
i1.evaluate();
i2.evaluate();
}
fitness(NULL) {
comp = i1.totalOfNeighborStrengths(individual) / p1.individualCount;
comp = min(c(comp, 0.99));
return 1.0 - comp;
}
1: mateChoice() {
return i2.strength(individual);
}

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

283

modifyChild() {
do pos = parent1.spatialPosition + rnorm(2, 0, 2.0);
while (!p1.pointInBounds(pos));
// prevent dispersal into water
if (p1.spatialMapValue("world", pos) == 0.0)
return F;
child.setSpatialPosition(pos);
return T;
}
20000 late() { sim.outputFixedMutations(); }

Here’s what the model looks like in SLiMgui at the end of generation 1, right after the
population has been set up:

The black dot in eastern Africa is the initial population; this model seeds the population at a
specific location, perhaps simulating – not realistically, obviously! – the origin of Homo sapiens in
Africa. This model has spatial competition and spatial mate choice, as we have seen in previous
recipes, so the population rapidly spreads to occupy new territory in order to escape the local
competition from other individuals. Here is the population at the end of generation 63:

Notice that the individuals are constrained to occupy only land locations. It is also interesting
that they have managed to reach Madagascar; the dispersal kernel used can jump small distances,
so the population can sometimes bridge the Mozambique Channel. Some water gaps are too large
to bridge, however. Here is the population at the end of generation 1000:

The population has spread across all of mainland Asia, but has not been able to enter Indonesia,
and has thus not reached Australia; the land area provided by the Indonesian islands is probably
too small to support a subpopulation (and similarly, the subpopulation that colonized Madagascar
has died out). Reaching the New World would be even more difficult, given the fact that the
Bering Strait is unavailable (a periodic boundary condition in the x direction would be needed to
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

284

make that work); individuals would have to make like the Vikings and jump from mainland Europe
to Iceland, Greenland, and thence to North America.
The fact that this particular biogeographic pattern is observed is not necessarily a problem, of
course; a great many organisms are unable to disperse from Eurasia to Australia or the Americas.
But it is a consequence of the details of this model – the particular dispersal kernel chosen, the
particular way in which dispersal is constrained to land, the details of the map used (the fact that it
does not include elevation, rivers, etc.), the way in which competition and mating are
implemented, and so forth. There is no obstacle to changing these details to match the biological
details of a particular species, to produce a more empirically based biogeographic model.
So, how was this recipe constructed? Let’s now delve into a few of the details. Most of the
model is familiar boilerplate: the setup in initialize(), the 1: late() event that evaluates the
spatial interactions, and the fitness() and mateChoice() callbacks, for example. Let’s start, then,
by looking at the 1 late() event that initializes the population:
1 late() {
sim.addSubpop("p1", 1000);
p1.setSpatialBounds(c(0.0, 0.0, 540.0, 217.0));
mapLines = rev(readFile("~/Desktop/world_map_540x217.txt"));
mapLines = sapply(mapLines, "strsplit(applyValue, '') == '#';");
mapValues = asFloat(mapLines);
p1.defineSpatialMap("world", "xy", c(540, 217), mapValues,
valueRange=c(0.0, 1.0), colors=c("#0000CC", "#55FF22"));
// start near a specific map location
for (ind in p1.individuals) {
ind.x = rnorm(1, 300.0, 1.0);
ind.y = rnorm(1, 100.0, 1.0);
}
}

It begins by creating a subpopulation, p1, as usual. It then sets up spatial boundaries for p1,
from (0, 0) to (540, 217); this is dictated by the size of the map that we are about to load, which is
540x217 pixels in size. The spatial bounds do not need to match that, but it is usually desirable for
them to at least match the aspect ratio of the map, so that the map is not stretched.
The map is then read in from a file; note that the path to this file will probably be different on
your system (it can be found inside the Recipes folder that can be downloaded from SLiM’s home
page). This is just a simple text file; each line is comprised of a series of spaces and hash marks
(“#”), where the hash marks indicate land. If you open this file in a text editor and view it in a
monospace font at a very small point size (like 3 pt), you will see the world map. It is read with
readFile(), which returns a vector of string objects representing the lines of the file.
SLiM uses Cartesian coordinates for spatial models, which means that x increases to the right
and y increases to the top. Our world map, on the other hand, is rendered from top to bottom, as
is common for pixel-based image formats. We use the rev() function to reverse the order of the
lines in the map so that it matches SLiM’s coordinate system. We then use sapply() and
strsplit() to split each line into individual characters, and then convert those into logical values
– T for land, F for water. Finally, these logical values are converted to float, according to the
Eidos rule that T is 1 and F is 0. The map, now in mapValues, is now ready for use.
Now we call the defineSpatialMap() method of Subpopulation to give the map to SLiM. The
first parameter, "world", is a name for the map; you can define as many spatial maps as you wish,
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

285

and you refer to them by name. The "xy" parameter indicates the spatiality of the map, which
must be a subset of the dimensionality of the model as a whole; here our map covers dimensions x
and y. Next we give the dimensions of the map, in pixels; it is 540x217, because those are the
dimensions of the data we read from the file. Next we provide the raw pixel data of the map as a
single vector of float values, scanning the map horizontally from left to right and vertically from
bottom to top (Cartesian coordinates); this is the mapValues variable that we prepared above.
The last two parameters here are optional, and are for SLiMgui’s benefit. The first parameter,
valueRange, gives the permissible range of values for the map data; this need not match the actual
range of the data. The purpose of this is to tell SLiMgui what value range should be displayed
using distinct colors; values outside this range will be clamped to be within the range for display.
The second parameter, colors, gives a vector of color strings that specify how particular values
should be displayed (see the Eidos manual for details on color strings). The lowest value in
valueRange will be displayed using the first color in colors; the highest value in valueRange will
be displayed using the last color in colors. Color values in between will be evenly distributed
across valueRange, and intermediate values – those not corresponding exactly to a given color –
will be displayed using an interpolation between the two nearest color values. Here, the values
supplied for these parameters indicate that a value of 0.0 should be displayed with the color
"#0000CC" (a dark blue), whereas a value of 1.0 should use color "#55FF22" (a medium green).
Values between 0.0 and 1.0 would use an interpolation between those shades, but in fact our map
data is binary, so that does not arise.
With that, we have defined a spatial map and told SLiMgui how to display it. We now just need
to use it to govern the model dynamics. At the end of the 1 late() event the positions of all
individuals are initialized to be near a particular spot in East Africa, using rnorm() to draw the
coordinates instead of using the pointUniform() method we have used in previous recipes. Then
in our modifyChild() callback we enforce that individuals can only disperse to land:
modifyChild() {
do pos = parent1.spatialPosition + rnorm(2, 0, 2.0);
while (!p1.pointInBounds(pos));
// prevent dispersal into water
if (p1.spatialMapValue("world", pos) == 0.0)
return F;
child.setSpatialPosition(pos);
return T;
}

The first couple of lines generate a random offspring position based upon the position of the first
parent, as we have seen in previous recipes. That code loops until a point inside the spatial
bounds of the subpopulation in generated. Next, the callback consults the spatial map to find out
the value at the candidate point using the spatialMapValue() method, giving it the name of the
map and the candidate point. This method simply looks up the value nearest to the given point in
the map data. If it is 0.0 – the value we used for water – then the callback returns F, rejecting the
proposed child. Otherwise, the location is on land, so it is set as the new child’s position and T is
returned to accept the proposed child.
Notice, then, that a reprising boundary condition is used for the edges of the map, but an
absorbing boundary condition is used for the coastlines. This means that individuals close to a
coastline suffer a fitness penalty; their children are relatively likely to land in water, and when they
do, they don’t get another chance to generate a different child location. This is part of the reason
that colonizing areas like Indonesia is difficult in this model (that fact that locations in places like
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

286

Indonesia are relatively unlikely to be chosen as child locations in the first place is also important).
We can change the modifyChild() callback’s code a bit to change that:
modifyChild() {
do {
do pos = parent1.spatialPosition + rnorm(2, 0, 2.0);
while (!p1.pointInBounds(pos));
} while (p1.spatialMapValue("world", pos) == 0.0);
child.setSpatialPosition(pos);
return T;
}

Now the boundary condition on coastlines is reprising also, and a quick run of the model
shows that that allows much more dispersal into small land areas:

Indonesia and Australia are soon reached, and the UK and Madagascar often also support
populations. Reaching North America is still quite difficult, however.
This is obviously just a fun toy model, but it would be easy to extend. A map using more
values, to indicate things like mountain ranges, could be used, and the map could be much higher
resolution (SLiM places no limit on the size of spatial maps, as long as your computer has enough
memory). Dispersal on a high-resolution map could be much more short-range, making it so that
even small barriers like rivers would present obstacles to dispersal. A much larger population size
would probably be appropriate, for most organisms. Moreover, instead of using a constant
population size as this model does, which is clearly unrealistic, the population size could be
related to fitness in such a way that when the population discovers a new area and expands into it,
thereby increasing in fitness since the effects of competition are diminished, the population size
would increase to reflect the new higher carrying capacity; this might fall out naturally in a nonWF
model (see section 15.10), but would have to be implemented in a WF model such as the one
shown here. Mate choice could be based not only on spatial distance, but also upon, for example,
the map value at a point intermediate between the two proposed parents, so that a mountain range
would produce vicariance between even closely adjacent subpopulations. All of these sorts of
features could be added with just a few more lines of code. Note also that although this model
reads its spatial map in from a text file, it is also straightforward to construct a map
programmatically, at runtime; see twoHabitatMap.txt in the online SLiM-Extras repository.
One particularly interesting feature for a biogeographic model like this would be to include
spatial heterogeneity that affects the selection on individuals, such that there is selection pressure
towards local adaptation. For instance, with this world map the more polar regions might exert
selection toward more cold-adapted phenotypes whereas the more tropical regions might select for
more warm-adapted phenotypes. In section 14.9 we saw a very primitive stab at this sort of
model, with the introduction of a simple linear environmental gradient. Real landscapes are more
complicated than that, though – mountaintops are similar to polar environments, whereas
coastlines tend to have more moderate climates, for example. In the next section, we’ll look at a
model of adaptation to spatial heterogeneity that goes beyond a simple linear environmental
gradient by using a spatial map.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

287

14.11 Local adaptation on a heterogeneous landscape map
The previous section introduced the ability to define an arbitrary landscape map, whereas in
section 14.9 we explored a model of adaptation to a linear environmental gradient. Let’s try
combining those two approaches, to see how complex environmental heterogeneity influences
evolutionary outcomes like divergence and speciation. This recipe will demonstrate such an
approach, inspired by the model of Haller, Mazzucco & Dieckmann (2013), although the
landscapes generated here will be fairly different from those used in that paper. This recipe will
generate a random “landscape map” representing the local phenotypic optimum across the
landscape, and will then simulate the local adaptation of spatial groups to the conditions of that
landscape. Here the landscape is generated algorithmically, but it would be trivial to instead read
a landscape map from a file as was done in the previous recipe.
The model of section 14.9 provides us with a quantitative trait based upon underlying randomly
arising QTLs, with phenotypic competition and assortative mating based upon that quantitative
trait. Let’s assume that machinery for now (the full recipe will be presented at the end), and look at
the parts that are new. Most importantly, we have gotten rid of the slope constant of section 14.9,
and instead here generate a heterogeneous landscape:
1 late() {
sim.addSubpop("p1", N);
p1.setSpatialBounds(c(0.0, 0.0, 0.0, 1.0, 1.0, 1.0));
defineConstant("mapValues", runif(25, 0, 1));
p1.defineSpatialMap("map1", "xy", c(5, 5), mapValues, interpolate=T,
valueRange=c(0.0, 1.0), colors=c("red", "yellow"));
for (ind in p1.individuals)
ind.setSpatialPosition(p1.pointUniform());
p1.individuals.z = 0.0;
}

We define spatial bounds of [0, 1] for each dimension, including the z dimension that is used
for phenotype. A map is then generated simply as a set of 25 draws from a uniform distribution
between 0 and 1. This map, saved off in the defined constant mapValues (we will use it later), is
set as a spatial map on subpopulation p1 with a call to defineSpatialMap(). We specify that it
corresponds to dimensions "xy", that it is based on a 5x5 pixel grid (thus the 25 values), that values
are expected to range between 0.0 and 1.0, and that colors for that range should range from red to
yellow.
We’ll look at what this ends up looking like in a moment, but first let’s finish building the
model. In the 1: late() event, we will replace the code from section 14.9 that colors individuals
according to phenotype with new code that uses the same color scheme as the landscape
coloring:
// color individuals according to phenotype
inds.color = p1.spatialMapColor("map1", phenotype);

This uses the spatialMapColor() method of Subpopulation, which translates a value into a
color using the mapping established by a particular spatial map. In this way, the colors of
individuals adapted to a given phenotypic optimum will exactly match the colors of map areas
where that phenotypic optimum exists. Individuals that are perfectly adapted to their environment
will therefore be displayed in SLiMgui in a color that matches their environment, whereas a
mismatch in color between an individual and its environment will indicate maladaptation.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

288

Finally, the global fitness() callback needs to change to use spatialMapValue() to find the
phenotypic optimum for the point in space occupied by the focal individual. The map is looked
up by name, as we saw in the previous section, and the location of the focal individual is obtained
using its spatialPosition() method. We select just the x and y coordinates of the location, since
those are the dimensions used by the spatial map. The final callback looks like this:
fitness(NULL) {
// reward proximity to the optimum
location = individual.spatialPosition[0:1];
optimum = subpop.spatialMapValue("map1", location);
return 1.0 + dnorm(optimum - individual.z, mean=0.0, sd=sigma_K);
}

The rest of the recipe is as it was before. When we run the model, here’s an example of what
we see:

We have a nice randomly heterogeneous landscape, and the population has clearly adapted to
it; we have reddish individuals in two of the more reddish areas, and orange individuals in three
more orange areas. As before, the structure of neutral diversity visible in the chromosome view
provides clear evidence of reproductive isolation and speciation.
There is one thing that may be surprising, however. We supplied a 5x5 map to
defineSpatialMap(), but the snapshot above shows what appears to be a continuous landscape.
This is a result of the interpolation=T parameter we supplied to defineSpatialMap(), which
instructed it to use interpolation (bilinear interpolation, in this case, to be precise) to make the
landscape continuous. To understand better how that works, let’s look in a little more detail at
how SLiM builds landscape maps. First of all, here is the 5x5 grid of values that we supplied to
defineSpatialMap(), colored according to the given color map:

Given this particular pixel grid, if we were to supply interpolate=F to defineSpatialMap()
instead, the landscape displayed by SLiMgui would look like this:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

289

Note that because the pixel grid is aligned with the corners of the spatial bounds of the
subpopulation, only ½ or ¼ of the area of the outer pixels is contained within bounds. The reason
for this is clear if we look at the pixel grid superimposed on that landscape:

This is by design; it is the most natural way to handle such spatial maps when interpolation is
involved (try thinking through the alternative to see why), and for consistency it is also how SLiM
handles spatial maps when interpolation is not used.
So now when interpolation is turned on, the landscape looks like this:

This is the result of bilinear interpolation, which shades continuously between the defined
points to produce a continuous map. The values defined by the pixel grid remain fixed, however.
To illustrate that, here is the pixel grid superimposed on the interpolated landscape:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

290

SLiM does not interpolate statically; it does not generate and store an interpolated map of some
large but fixed size. Instead, when interpolation is enabled for a given spatial map, it calculates
the exact interpolated value for a given point upon request. Interpolated maps are therefore
completely continuous and effectively infinite-resolution (within the precision limits of floatingpoint numbers). With only 25 defined values, we therefore have an infinitely detailed landscape.
In the previous recipe, using the world map, we did not enable interpolation, because we actually
wanted the pixel grid; we wanted a world of binary pixels, either land or water, without allowing
any shading between the two.
A logical step following the work of Haller, Mazzucco & Dieckmann (2013) would be to
introduce temporal change in the landscape as well. We will not delve into that idea in any detail;
but it is worth noting that that, too, is quite easy in SLiM. This is not part of the official recipe for
this section – but try adding this code:
1: late()
{
weight = (cos((sim.generation - 1) / 1000.0) + 1.0) / 2.0;
newMap = weight * mapValues + (1 - weight) * 0.5;
p1.defineSpatialMap("map1", "xy", c(5, 5), newMap, interpolate=T,
valueRange=c(0.0, 1.0), colors=c("red", "yellow"));
}

This will cause the landscape to slowly cycle between heterogeneity and homogeneity. During
the heterogeneous phases, speciation generally occurs as seen above. As the landscape fades into
homogeneity, these species can persist, preserved by assortative mating and by the negative
frequency-dependence of the competition function, such that speciation continues even when the
landscape heterogeneity is completely gone:

Although the two species appear to have become sympatric in some areas, considerable
reproductive isolation remains:

And when the heterogeneity returns to the landscape, the species can find their way back to
areas where they are well-adapted – albeit with some evidence of genetic intermingling with the
other species, such as some partially introgressed haplotypes. There is probably a lot of interesting
work to be done on these sorts of temporal dynamics.
In any case, here is the full model (without the temporal-change code discussed just above), for
posterity:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

291

initialize() {
defineConstant("sigma_C", 0.1);
defineConstant("sigma_K", 0.5);
defineConstant("sigma_M", 0.1);
defineConstant("N", 500);
initializeSLiMOptions(dimensionality="xyz");
initializeMutationRate(1e-6);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "n", 0.0, 1.0);
m2.convertToSubstitution = F;

// neutral
// QTL

initializeGenomicElementType("g1", c(m1, m2), c(1, 0.1));
initializeGenomicElement(g1, 0, 1e5 - 1);
initializeRecombinationRate(1e-8);
initializeInteractionType(1, "xyz", reciprocal=T,
maxDistance=sigma_C * 3);
// competition
i1.setInteractionFunction("n", 1.0, sigma_C);
initializeInteractionType(2, "xyz", reciprocal=T,
maxDistance=sigma_M * 3);
// mate choice
i2.setInteractionFunction("n", 1.0, sigma_M);
}
1 late() {
sim.addSubpop("p1", N);
p1.setSpatialBounds(c(0.0, 0.0, 0.0, 1.0, 1.0, 1.0));
defineConstant("mapValues", runif(25, 0, 1));
p1.defineSpatialMap("map1", "xy", c(5, 5), mapValues, interpolate=T,
valueRange=c(0.0, 1.0), colors=c("red", "yellow"));
for (ind in p1.individuals)
ind.setSpatialPosition(p1.pointUniform());
p1.individuals.z = 0.0;
}
modifyChild() {
// set offspring position based on parental position
do
pos = c(parent1.spatialPosition[0:1] + rnorm(2, 0, 0.005), 0.0);
while (!p1.pointInBounds(pos));
child.setSpatialPosition(pos);
return T;
}
1: late() {
// construct phenotypes from the additive effects of QTLs
inds = sim.subpopulations.individuals;
phenotype = inds.sumOfMutationsOfType(m2);
inds.z = phenotype;
// color individuals according to phenotype
inds.color = p1.spatialMapColor("map1", phenotype);
// evaluate interactions
i1.evaluate();
i2.evaluate();
}

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

292

fitness(m2) {
// make QTLs intrinsically neutral
return 1.0;
}
fitness(NULL) {
// reward proximity to the optimum
location = individual.spatialPosition[0:1];
optimum = subpop.spatialMapValue("map1", location);
return 1.0 + dnorm(optimum - individual.z, mean=0.0, sd=sigma_K);
}
fitness(NULL) {
// phenotypic competition
totalStrength = sum(i1.strength(individual));
return 1.0 - totalStrength / p1.individualCount;
}
mateChoice() {
// spatial mate choice
return i2.strength(individual);
}
10000 late() {
sim.simulationFinished();
}

This is the most complex recipe we will build involving spatiality, interactions, and landscape
maps, but there is so much more that could be done. Interactions could be based upon genetics in
ways beyond the QTL-based models we have explored; a spatial green-beard model would be
interesting, for example. SLiM allows multiple landscape maps to be defined; it would be
interesting to bring in empirical data on elevation, rainfall, mean temperature, and a host of other
variables, and allow a population to evolve in a complex, multidimensional landscape to see
whether realistic patterns of spatial biodiversity might be realized. One could even create a model
in which the behavior of the organisms on the landscape modify the landscape itself – a model of
desertification driven by overgrazing, for example.
Before we wrap up our discussion of spatial models, however, there is one remaining topic that
we deferred.
14.12 Periodic spatial boundaries
In section 14.3, various options for spatial boundary conditions were introduced: stopping,
absorbing, reflecting, and reprising boundaries. The possibility of periodic spatial boundaries was
also mentioned, but was deferred as an advanced topic. The time has come to explore this
concept.
A periodic spatial boundary is one which wraps around: one edge of a spatial dimension is
connected seamlessly to the opposite edge. A non-periodic one-dimensional bounded space is a
line segment: points in this space may fall anywhere between x0 and x1. An individual might travel
from x0 to x1, at which point the end of the line segment is reached and motion must stop or
reverse. The periodic version of this is a circle: x0 and x1 have been joined together to form a
closed curve:

x0

x1

x0
x1

Here, an individual might travel from x0 to x1 and continue onwards; at the moment it reaches
x1 it has also, simultaneously, returned to x0, and may continue towards x1 again (and again, and

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

293

again). Note that there is no “seam” and no “privileged point” in this space; the spatial topology at
every point is identical.
A non-periodic two-dimensional bounded space is a rectangle with extents [x0, x1] and [y0, y1].
There are three periodic versions of this: x0 and x1 may be joined, y0 and y1 may be joined, or both
pairs may be joined. The first two options produce a topology like the surface of a cylinder
without end caps; the third option produces not a sphere (as one might guess), but a torus, like the
surface of a doughnut:

y1

y1

x0

x1

y0

x0

x1

y0

y0 y1

y0 y1
x0

x1

x0
x1

Again, movement along the periodic dimension(s) may continue indefinitely, wrapping around
at the boundary; and again, there is no “seam” or “privileged point” in these spaces (along the
periodic axis or axes).
Finally, a non-periodic three-dimensional bounded space is a cube; it may be made periodic in
x, y, z, x and y, x and z, y and z, or x and y and z, yielding seven different periodic versions of
three-dimensional space. The topology of these is harder to visualize, but the principle is the
same: movement along the periodic axis or axes may continue indefinitely because it wraps
around, and along the periodic axis or axes there is no “seam”.
This is all rather abstract, but periodic boundary conditions are very useful in modeling,
especially when doing theoretical work rather than trying to simulate a real landscape. The reason
is simple: periodic spatial boundaries eliminate edge effects, which are otherwise a source of bias
in spatial models. In effect, periodic boundaries allow you to model an infinite space with no
edges at all. Of course the space is not really infinite, but instead repeats periodically; but if the
spatial scale of important interactions and dynamics in the model is small compared to the size of
the periodic space, this is often an acceptable approximation.
Setting up periodic spatial boundaries in SLiM is a little bit more complex than implementing
other boundary conditions. The reason is that periodic spatial boundaries fundamentally change
many aspects of SLiM’s spatial engine; enforcing the boundary condition is no longer just a
question of modifying generated offspring positions in a particular way (although that is one part of
it). For instance, consider two individuals that occupy positions (0.0, 0.1) and (0.0, 0.9) in a twodimensional space with extent [0.0, 1.0] in both x and y. With any other boundary condition, the
distance between these two individuals is 0.8 (0.9 − 0.1). With periodic boundaries, however, the
distance between them is 0.2, because a line of length 0.2 may be drawn between them that wraps
around the boundary in the y dimension. By definition, the distance between two points in a
periodic space is always the shortest distance possible out of the infinitely many different distances
that could be calculated. This means that distances, interaction strengths, and indeed all of the
underlying mechanics of InteractionType’s spatial queries must take the periodicity of the space
into account.
For this reason, periodic spatial boundaries must be declared up front, and cannot be changed
subsequently. This declaration is done with a parameter to initializeSLiMOptions() named
periodicity, which specifies the periodic spatial dimensions as a string. In this recipe, we will
work with a two-dimensional model that is periodic in both dimensions – a toroidal model, as
pictured above. This can be set up as follows:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

294

initialize() {
initializeSLiMOptions(dimensionality="xy", periodicity="xy");
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
initializeInteractionType("i1", "xy", reciprocal=T, maxDistance=0.2);
i1.setInteractionFunction("n", 1.0, 0.1);
}

The initializeSLiMOptions() call establishes a 2-D space (with dimensionality="xy") and
then declares both of those dimensions to be periodic (with periodicity="xy"). We also set up a
spatial interaction here, involving both spatial dimensions; it uses a Gaussian interaction function
with a standard deviation of 0.1, so it falls off at well under the spatial scale of the model as a
whole. We declare it to have a maximum distance of 0.2, for efficiency; the interaction strength
will be very low further than two standard deviations out anyway.
Next, let’s set up a subpopulation with random initial positions:
1 late() {
sim.addSubpop("p1", 2000);
p1.individuals.x = runif(p1.individualCount);
p1.individuals.y = runif(p1.individualCount);
}

Nothing surprising here. Let’s implement a modifyChild() callback to set up offspring
positions, very much as we did in the recipes in section 14.3:
modifyChild() {
pos = parent1.spatialPosition + rnorm(2, 0, 0.02);
child.setSpatialPosition(p1.pointPeriodic(pos));
return T;
}

This uses a function named pointPeriodic() that is similar to the pointStopped() and
functions we saw before; it translates the point it is passed so that it falls within
the spatial boundaries, while implementing the periodic boundary conditions requested. In this
case, pointPeriodic() wraps a point that lies beyond the periodic spatial boundaries, just as if the
offspring had walked off of one edge of the space and re-appeared at the opposite edge.
We need an event to define the end of the simulation, as usual:
pointReflected()

1000 late() { sim.outputFixedMutations(); }

And now let’s do something a bit more interesting: let’s use the interaction type that we defined
above to show, visually in SLiMgui, how interactions work in periodic space:
late()
{
i1.evaluate();
focus = sample(p1.individuals, 1);
s = i1.strength(focus);
inds = p1.individuals;
for (i in seqAlong(s))
inds[i].color = rgb2color(c(1.0 - s[i], 1.0 - s[i], s[i]));
focus.color = "red";
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

295

This late() event runs in every generation. It evaluates the interaction, then chooses a focal
individual randomly from the population. It asks the interaction type to calculate the interaction
strength between the focal individual and all other individuals; then it loops over the individuals
(by index) and sets each one’s color in SLiMgui using a particular formula (see the Eidos manual
for discussion of the rgb2color() function). Finally, it sets the color of the focal individual itself to
red. When this model is run, a typical generation looks like this:

The focal individual can be seen in red. The individuals closest to it are blue, while those
farthest away are yellow. Because of the particular RGB (red, green, blue) color values passed to
rgb2color(), the color of individuals at intermediate distances fades smoothly from blue to yellow.
This is a nice illustration of the shape of the interaction kernel, and in fact this sort of coloration
strategy can be quite useful in testing and visualizing spatial models.
The snapshot above, however, involved a focal individual that was far from any of the spatial
boundaries. When a different focal individual is chosen, we can see that the spatial interaction
wraps around the edges of the periodic space:

The third snapshot illustrates that wrapping occurs in both spatial dimensions, not just one at a
time. Due to the toroidal geometry of the space, the four corners of the space as displayed in
SLiMgui are actually all the same point! In the diagram above of the toroidal geometry of this
space, the four corners as displayed in SLiMgui correspond to the intersection of the blue and red
lines drawn on the torus. Note that topologically, there is nothing special about the blue and red
lines in particular; every point on the torus lies at the intersection between a curve that circles
around the major circumference of the torus (like the blue line) and a curve that circles around the
minor circumference of the torus (like the red line). An individual living in this space cannot tell
whether it is at the periodic boundary or not; as was emphasized earlier, there is no seam.
If you wish, you can play with this model by making only the x dimension periodic, or only the
y dimension, to see how that affects the spatial interaction and what the resulting cylindrical
topology feels like in practice. If you do so, note that pointPeriodic() will enforce only the
periodic boundary condition; it will leave coordinates that are not periodic unmodified. To keep
the modeled individuals inside bounds, some boundary condition – whether stopping, reflecting,
absorbing, or reprising – must be enforced for the non-periodic coordinates. This can be done just
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

296

as it was in section 14.3; in particular, it is useful to note that a call to pointPeriodic() can be
wrapped inside a call to pointStopped() or pointReflected() to achieve the desired effect. This
works because pointPeriodic() has already brought the periodic coordinates into bounds, so they
will not be modified by pointStopped() or pointReflected(). So a model that used periodic
boundaries for only one of the two axes, and that enforced reflecting boundaries on the nonperiodic axis, could use a modifyChild() callback like this:
modifyChild() {
pos = parent1.spatialPosition + rnorm(2, 0, 0.02);
child.setSpatialPosition(p1.pointReflected(p1.pointPeriodic(pos)));
return T;
}

All of the other elements of spatial modeling that were introduced in earlier recipes in this
chapter – spatial competition, spatial mate choice, landscape maps – will work with periodic
spatial boundaries. You can still model phenotype as a spatial dimension, too, as in section 14.7
for example; you will just (probably) not want that dimension to be periodic, since phenotypic
traits are usually linear – being very short is not the same thing as being very tall!
The concept of periodic space can take some getting used to; cylindrical or toroidal space may
seem rather artificial at first. But in fact, because of the absence of edge effects, periodic space is
actually less artificial than other arbitrarily-chosen boundary conditions, in many ways; it will not
exhibit biases in the spatial density of individuals in one part of the space versus another,
individuals everywhere will feel the same average interaction strength if they are randomly
distributed, and motion in particular directions from particular locations will not be artificially
impeded.
Because of the greater underlying complexity, models using periodic space will be a little
slower in SLiM than non-periodic models, but unless population size is large the difference in
performance should be slight. Unless you actually want a particular edge effect in your model,
perhaps reflecting some real-world landscape’s dynamics, periodic boundaries may be the best
option.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

297

15. Going beyond Wright-Fisher models: nonWF model recipes
Beginning with SLiM 3.0, two fairly different types of models are supported in SLiM: WrightFisher or WF models, and non-Wright-Fisher or nonWF models. See section 1.6 for a discussion of
the differences between these two types of models. All of the recipes presented so far have been
WF models, since that is the default model type in SLiM. In this chapter, we will go beyond the
assumptions and constraints of Wright-Fisher models, and will examine recipes for a variety of
nonWF models.
A great deal of the overall design of SLiM is shared between WF and nonWF models. All of the
Eidos classes that embody SLiM are the same (SLiMSim, Subpopulation, Individual, etc.), and the
way that SLiM models the chromosome, genomes, mutations, and so forth is unchanged. Since
that foundation is all shared, a good understanding of the concepts in the preceding chapters will
be assumed. Many of the techniques presented in the preceding recipes will also work in nonWF
models. We will not re-cast all of those techniques in a nonWF context, since that would mostly
just be repetitive and uninteresting; instead, we will focus on the important ways in which nonWF
models are different from WF models.
As discussed in more detail in section 1.6, the main differences between WF and nonWF
models fall into a few major categories. First of all, in nonWF models generations may be
overlapping and individuals can live for more than one generation; for this reason, in nonWF
models the model’s script is responsible for creating new offspring as needed, rather than that
happening automatically every generation as in WF models. Second, for the same reason, in
nonWF models the parental generation does not die off automatically after offspring are generated;
instead, fitness governs mortality (rather than governing mating success as in WF models). Third –
as a consequence of the previous two differences, really – in nonWF models population regulation
is a consequence of the balance between individual reproduction and individual mortality, just as
it is in natural populations, rather than being enforced through a set population size as in WF
models. Fourth, migration in nonWF models is similarly managed on an individual basis in the
model’s script, rather than being done automatically by the SLiM engine based upon set migration
rates.
All of this points to two basic observations. One observation is that the generation cycle is
quite different between WF and nonWF models. Chapter 19 discusses the generation cycle for WF
models, whereas chapter 20 discusses the generation cycle for nonWF models and the important
conceptual ways in which it differs from the WF generation cycle. An understanding of those
differences, such as the way in which the semantics of early() and late() events have changed
and the way that the meaning of fitness has shifted, will be important to understand the recipes
that follow, so chapter 20 should be consulted for further information as needed. The other
observation is that nonWF models are generally more individual-based and more complex than
WF models, because more responsibilities like offspring generation and migration have been
pushed from SLiM onto the model’s script. With this additional complexity comes considerable
additional power and flexibility, however, as we will see.
15.1 A minimal nonWF model
Let’s begin with a minimal nonWF model, similar to the minimal WF model presented in
section 4.1. This will illustrate several of the fundamentals of nonWF models: how to switch SLiM
into nonWF “mode”, how to implement individual-based reproduction and density-dependent
population regulation, and how to work with non-overlapping generations.
With no further ado, here is the recipe:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

298

initialize() {
initializeSLiMModelType("nonWF");
defineConstant("K", 500);
initializeMutationType("m1", 0.5, "f", 0.0);
m1.convertToSubstitution = T;
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeMutationRate(1e-7);
initializeRecombinationRate(1e-8);
}
reproduction() {
subpop.addCrossed(individual, subpop.sampleIndividuals(1));
}
1 early() {
sim.addSubpop("p1", 10);
}
early() {
p1.fitnessScaling = K / p1.individualCount;
}
late() {
inds = p1.individuals;
catn(sim.generation + ": " + size(inds) + " (" + max(inds.age) + ")");
}
2000 late() {
sim.outputFull(ages=T);
}

The first line of the initialize() callback is a call to initializeSLiMModelType("nonWF"); this
tells SLiM that we are building a nonWF model. That has various consequences; it activates the
nonWF generation cycle shown in chapter 20, for example, and it enables some properties and
methods on SLiM’s objects while disabling others. It is possible, when writing a WF model, to
include a call to initializeSLiMModelType("WF"), but unnecessary, since that is the default.
The next line sets up a defined constant, K. This will be the carrying capacity for the model’s
population regulation; we’ll cover that below.
The rest of the initialize() callback is much as we have seen before, but you might notice
one surprising thing: we set the convertToSubstitution property of mutation type m1 to T. In WF
models, mutations convert to substitutions automatically; the default value of that property is T, so
it would be redundant to set it to T again. In nonWF models, however, mutations do not convert
to substitutions automatically; the convertToSubstitution property is F by default and must be set
to T when desired. The reason is that in nonWF models fitness is absolute, not relative, and so
only completely neutral mutations with no side effects are safe to convert to Substitution objects
(see section 20.4 for further discussion). Since m1 mutations are completely neutral in this model,
we tell SLiM to allow them to fix; this will make the model run much faster.
The next script block is a new type of callback, a reproduction() callback, that may be used
only in nonWF models. Such reproduction() callbacks are called once per individual, at the
beginning of each generation; this provides an opportunity for that focal individual to generate
offspring (which it might or might not do); see section 20.1 for further discussion. This callback
calls p1.sampleIndividuals(1) to draw one random individual from p1 as a mate, and then calls
subpop.addCrossed() to add a new offspring individual that is the result of crossing – biparental
sexual reproduction – between the focal individual for the callback (individual) and the chosen
mate. The new individual is created, and is in fact returned to the caller, but is not actually added
to the subpopulation until offspring generation is finished. Note that although each individual will
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

299

generate exactly one offspring of its own here, as the focal individual or “first parent”, it might also
be chosen as a mate by another individual (perhaps more than once, or perhaps not at all).
The next script block returns us to more familiar territory, since it simply calls addSubpop() to
create subpopulation p1 with ten initial individuals. The only point to be made here is that while
in a WF model this would set the subpopulation’s size forever (until setSubpopulationSize() was
called, at least), in a nonWF model this only sets the initial subpopulation size; the size of the
subpopulation will henceforth be governed by birth and death events, not by the initial size set for
it.
Speaking of death, the next early() event governs that. It calculates an absolute fitness value,
K / p1.individualCount, and sets that into the fitnessScaling property of p1. We haven’t seen
this property before; in effect, it scales the absolute fitness of the subpopulation, because the
calculated fitness for every individual in p1 is multiplied by this value. It is essentially the same as
writing a global fitness(NULL) callback that returns the same constant value for every individual,
but it is much faster than that model design would be, since the fitness(NULL) callback would
have to actually be called for every individual in every generation.
This fitnessScaling property may be used in WF models too, in fact, but is not generally useful
in that context; since WF models use relative fitness, scaling the fitness of all individuals by the
same constant has no effect. In nonWF models, however, it has a very important effect! This
fitnessScaling factor is what regulates the population size in the model; if this line were
commented out, the population would grow exponentially forever (until SLiM crashed, or got so
slow as to be effectively halted). Instead, when the subpopulation size is less than K,
fitnessScaling will be greater than 1.0 (and so no mortality will occur and the subpopulation
size will grow); but if it is larger than K, fitnessScaling will be less than 1.0 and mortality will
bring it back toward K. Since new individuals get generated at the beginning of each generation by
our reproduction() callback, once the model reaches equilibrium the population size will double
during offspring generation to around 1000, then a fitnessScaling value of approximately 0.5 will
be set, and then approximately half of the individuals will die during the survival life cycle stage,
bringing the population size down to approximately 500 (i.e., K).
Note that there is nothing magical about this particular formula; any formula or model design
that influences individual fitness in such a manner as to regulate the population size will work.
This particular formula produces logistic growth until K is reached, and then stochastic population
size fluctuation around K thenceforth. The population size will be stochastic around K because in
nonWF models fitness is the probability of death, but of course sometimes more individuals will
die than expected, sometimes less; the fate of each individual is in the hands of SLiM’s random
number generator.
Next we have a late() event that outputs two pieces of information in each generation: the
population size, and the age of the oldest individual. Here’s the initial output from one run:
1: 10 (0)
2: 20 (1)
3: 40 (2)
4: 80 (3)
5: 160 (4)
6: 320 (5)
7: 496 (6)
8: 485 (7)
9: 500 (6)
10: 522 (7)
...

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

300

You can see the exponential growth at the start settling in around 500 once the model reaches
carrying capacity. You can also see that we already have overlapping generations in this model;
the individuals at the end of generation 1 are of age 0 (newly generated juveniles), and that initial
cohort ages without mortality during exponential growth (since we have not implemented a
maximum age or any sort of age-dependent reduction in fitness). By generation 7, though, the
carrying capacity has filled up and individuals start to die. Since every individual in this model
has the same fitness, mortality is purely random, and sometimes you will get a Methuselah that
lives to be 20 or even older; but most individuals will die through sheer bad luck before then, and
the maximum age in this model will tend to fluctuate around 10 or 15. Later recipes will explore
how to control the population age in biologically realistic ways, but for now let’s just bask in the
glory of the fact that we have already modeled something that can’t be modeled in WF SLiM:
overlapping generations.
The final event produces output from the model with outputFull(), as we have seen many
times before. The only new element is the ages=T parameter, which requests that age information
be added to the output (the details of the output format are given in section 23.1.1).
That’s it; that completes our first nonWF recipe! In subsequent sections we will explore the
greater power the nonWF paradigm affords us, because we can now control the mating, fecundity,
migration, fitness, and survival of each individual.
15.2 Age structure (a life table model)
In the previous recipe, the probability of survival was the same regardless of age, and that
produced a particular emergent age structure in the population. In most biological systems,
however, the probability of survival is age-dependent. Commonly, this is modeled with a life table
that gives the probability that an individual of a given age will die within the next time period (the
next year, often).
To model non-overlapping generations in a nonWF model (as is always the case in WF models),
one might use a life table that gives a probability of survival of 0.0 for all individuals of age 1 or
older (newly generated juveniles having an age of 0); with such a life table, offspring would be
generated and then the parental individuals (all having an age of 1) would all immediately die. In
this model we will implement a slightly more complex life table, for an imaginary species that has
high juvenile mortality, low adult mortality, and a maximum age of seven generations. We will
also implement age-dependent fertility and density-dependent population regulation.
Let’s look at this model one piece at a time, beginning with initialization:
initialize() {
initializeSLiMModelType("nonWF");
defineConstant("K", 30);
defineConstant("L", c(0.7, 0.0, 0.0, 0.0, 0.25, 0.5, 0.75, 1.0));
initializeMutationType("m1", 0.5, "f", 0.0);
m1.convertToSubstitution = T;
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeMutationRate(1e-7);
initializeRecombinationRate(1e-8);
}

This is identical to the previous recipe except for the addition of the defined constant L, which
is our life table. It gives the probability of mortality for each age; newly generated juveniles have a
mortality of 0.7 (i.e., 70%), then the mortality drops to zero for several years, and then it ramps
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

301

gradually upward with increasing age until it reaches 1.0 for age 7; all individuals of age 7 will die.
Note that this is only the age-related mortality; density-dependence will also cause mortality, as we
will see below, but that will be additional to this age-related mortality, which would occur even in
a population that was not limited by its density.
Next, let’s implement reproduction:
reproduction() {
if (individual.age > 2)
subpop.addCrossed(individual, subpop.sampleIndividuals(1));
}

This is the same as the reproduction() callback in the previous recipe, except that here we
have prevented reproduction by individuals of age 2 or below. One could similarly limit or
prevent reproduction in a nonWF model based upon genetics, individual state, resource
acquisition, mate compatibility, or any other factor.
Population initialization looks like this:
1 early() {
sim.addSubpop("p1", 10);
p1.individuals.age = rdunif(10, min=0, max=7);
}

We again start at a population size of 10 and allow the population to grow upward to the
carrying capacity (only 30 in this model, to make reading the output of the model easier). Since
addSubpop() sets the age of all new individuals to 0, that would provide our model with a bit of an
artificial start, and it might also present difficulties since juvenile mortality in this model is so high
– sometimes the population might go extinct before it reaches reproductive age. We therefore
draw random ages from a discrete uniform distribution from 0 to 7. If one had empirical data
about the age distribution in one’s system, that might of course be an even better starting point.
And here we manage both age-related and density-dependent mortality:
early() {
// life table based individual mortality
inds = p1.individuals;
ages = inds.age;
mortality = L[ages];
survival = 1 - mortality;
inds.fitnessScaling = survival;
// density-dependence, factoring in individual mortality
p1.fitnessScaling = K / (p1.individualCount * mean(survival));
}

We calculate the age-related mortality by getting all of the individuals in p1, getting their ages,
and then looking up those ages in L to get a vector of the mortality rates for the individuals.
Survival rates are the opposite of mortality rates, so we subtract from 1; if an individual has a
mortality rate of 1 it has a survival rate of 0, and vice versa. Finally, we set those survival rates into
the fitnessScaling properties of the individuals. We saw the fitnessScaling property of
Subpopulation in the previous recipe, where it scaled the fitness of all individuals in the
subpopulation by the same constant factor. The fitnessScaling property of Individual has much
the same effect, but on an individual basis; each individual can have a different fitnessScaling
value, which is multiplied into that individual’s calculated fitness. This is equivalent to
implementing a fitness(NULL) callback that returns a survival-based fitness effect for the focal
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

302

individual based upon that individual’s age; but this way is much faster since it is done in a
vectorized fashion without fitness() callbacks.
Next, the callback above calculates density-dependent mortality based upon K and the current
population size, as before, but also factors in the mean survival rate in the population due to agerelated mortality. Without that correction, the population would equilibrate around a lower size
than K, because age-related mortality would occur in addition to the density-dependent mortality
necessary to bring the population down to K. With the correction, the population size should
fluctuate stochastically around K, as desired. One could of course get fancier, and come up with
equations that made the probability of density-dependent mortality depend upon age (or any other
individual state) in some manner; perhaps older individuals would be weaker and more vulnerable
to diseases and parasites that are common when population density is high, for example.
late() {
// print our age distribution after mortality
catn(sim.generation + ": " + paste(sort(p1.individuals.age)));
}
2000 late() { sim.outputFixedMutations(); }

These are the last components of our model: output and termination. The late() output event
prints the population’s age distribution in each generation; this is post-mortality, since late()
events run after the survivial/viability generation cycle stage in nonWF models. A typical run:
1: 0 0 1 1 3 3 3 6 6 6
2: 0 0 0 0 0 0 1 1 2 2 4 4 4
3: 0 0 0 0 0 1 1 1 1 1 1 2 2 3 3 5 5 5
4: 0 0 0 0 1 1 1 1 1 2 2 2 2 2 2 3 3 4 4 6 6 6
5: 0 0 0 0 1 1 1 1 2 2 2 2 2 3 3 3 3 3 3 4 4 5
...
1996: 0 0 0 0 0 0 1 1 1 1 1 1 1 1 2 2 2 2 3 3 3
1997: 0 0 0 0 0 0 0 1 1 1 1 1 1 2 2 2 2 2 2 2 2
1998: 0 0 0 0 1 1 1 1 1 1 2 2 2 2 2 3 3 3 3 3 3
1999: 0 0 0 0 1 1 1 1 2 2 2 2 2 2 3 3 3 3 3 4 4
2000: 0 0 0 0 0 0 0 1 1 1 1 2 2 2 2 3 3 3 3 3 3

4
3
3
4
4

4
3
3
4
4

4
3
4
5
5

4
3
4
5
6

4
4
4
5

5 6
4 4 5 5 5
4 5 5 6 6
5

The beginning of the output shows growth toward the carrying capacity; even before the
carrying capacity is reached some age-related mortality occurs. The end of the output shows the
(somewhat stochastic) equilibrium population size and age structure. With this life table, the
population is dominated by middle-aged individuals; most juveniles die, and few individuals make
it to age 7 since the mortality rate ramps upward beginning at age 4. Note that no individuals of
age 7 are visible here because this output is post-mortality; no individual of age 7 should ever exist
at this point in the generation cycle. If this output event were an early() event instead, however,
we would expect to see the occasional age 7 individual.
This model uses a life table, but the larger point is that nonWF models allow one to model any
individual-based mortality effects, whether due to genetics (which would typically occur through
SLiM’s built-in fitness calculations), age (as with a life table or similar scheme), density (as also
modeled here), individual state, environmental effects, or anything else. Fitness effects influencing
survival can be expressed through fitness() callbacks, or using the fitnessScaling properties of
Subpopulation and Individual that we used here.
15.3 Monogamous mating and variation in litter size
In the previous recipe we explored how to manipulate individual fitness values in order to
implement age-dependent mortality using a life table. In this recipe we will look at the other end
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

303

of the circle of life: mate choice and fertility. The previous nonWF models we have built have used
an extremely simplistic model of reproduction in which each individual of reproductive age
produces exactly one offspring per generation itself with a randomly selected mate (and may be
chosen as a mate by another individual, too). Here we will implement a very different model of
reproduction: monogamy (within a single breeding season), and generation of a litter of offspring
of non-deterministic size.
This model will look very similar to section 15.1’s recipe, so let’s just see the whole model all at
once:
initialize() {
initializeSLiMModelType("nonWF");
defineConstant("K", 500);
initializeMutationType("m1", 0.5, "f", 0.0);
m1.convertToSubstitution = T;
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeMutationRate(1e-7);
initializeRecombinationRate(1e-8);
}
reproduction() {
// randomize the order of p1.individuals
parents = sample(p1.individuals, p1.individualCount);
// draw monogamous pairs and generate litters
for (i in seq(0, p1.individualCount - 2, by=2))
{
parent1 = parents[i];
parent2 = parents[i + 1];
litterSize = rpois(1, 2.3);
for (j in seqLen(litterSize))
p1.addCrossed(parent1, parent2);
}
// disable this callback for this generation
self.active = 0;
}
1 early() {
sim.addSubpop("p1", 10);
}
early() {
p1.fitnessScaling = K / p1.individualCount;
}
late() {
inds = p1.individuals;
catn(sim.generation + ": " + size(inds) + " (" + max(inds.age) + ")");
}
2000 late() {
sim.outputFull(ages=T);
}

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

304

This is, in fact, identical to section 15.1 except for the reproduction() callback, so let’s look at
that in detail:
reproduction() {
// randomize the order of p1.individuals
parents = sample(p1.individuals, p1.individualCount);
// draw monogamous pairs and generate litters
for (i in seq(0, p1.individualCount - 2, by=2))
{
parent1 = parents[i];
parent2 = parents[i + 1];
litterSize = rpois(1, 2.7);
for (j in 1:litterSize)
p1.addCrossed(parent1, parent2);
}
// disable this callback for this generation
self.active = 0;
}

We do something quite different here: this reproduction() callback runs only once per
generation, not once per individual! It disables itself at the end of its own execution by setting its
active flag to 0 (see section 22.8 for discussion of this feature, which we haven’t seen much before
now). The reproduction() callback is called by SLiM for some particular focal individual, but it
ignores that individual and instead generates offspring for all of the individuals in the whole
population all at once. This is perfectly fine, and can be a useful strategy when the reproduction
behavior of individuals is non-independent; with monogamy, for example, once two individuals
have formed a mating pair those individuals are not available to be chosen as mates by any other
individuals, so mating behavior in this model is non-independent.
The first thing we want to do is choose all of the monogamous mating pairs. The order of
individuals in SLiM is not guaranteed to be random in SLiM, so it would be unwise to simply pair
individuals 0 and 1, 2 and 3, etc.; that could lead to biases in mate choice that could manifest in
strangely skewed genetics over time. Instead, we use the sample() function to draw a complete
sample from the population, without replacement, effectively randomizing its order. Then we pair
individuals 0 and 1, 2 and 3, etc., from that, which is safe. We do that pairing with a for loop over
the even values up to p1.individualCount - 2 (guaranteeing that a pair of individuals remains the
last time through the loop, not just an odd one out at the end). Individuals i and i+1 are then
taken to be a monogamous mating pair. Of course one could implement any mating scheme at
all; females could choose males assortatively, or based upon their physical condition, or in any
other manner, and monogamy does not need to be enforced (as we saw in the previous two
recipes).
Next, we want to generate a litter for that mating pair. We draw the size of the litter from a
Poisson distribution with a mean of 2.7, arbitrarily. At the risk of sounding like a broken record, of
course litter size could depend upon anything at all – the genetics of the two parents, their genetic
compatibility, their respective conditions, their ages, their fecundity the previous year, their
phenotypic match with their environment, etc. Here, we happen to use a Poisson distribution with
a mean of 2.7. That gives us a litter size; we then generate the litter by calling addCrossed() that
many times with the same parents.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

305

The only thing left is for the reproduction() callback to deactivate itself, as explained above.
This model has the same output code as section 15.1’s recipe; a typical run produces:
...
1995:
1996:
1997:
1998:
1999:
2000:

497
531
524
497
529
511

(6)
(7)
(6)
(7)
(8)
(6)

Note that the equilibrium age here is around 6 to 7, where in section 15.1 it was more around
10 to 15. That is because section 15.1’s recipe produced one offspring per individual per
generation (not counting being chosen as a mate by another reproducing individual), whereas this
recipes produces about 1.35 (half of 2.7). That floods this model with young individuals, relative
to the earlier model. Density-dependent mortality and carrying capacity remain the same,
however, so skewing the age distribution towards juveniles in that manner inevitably means fewer
old individuals and a shorter expected lifespan. That happens because with more offspring
generated, the pre-mortality population size is larger, and so (given that the carrying capacity is the
same) density-dependent selection is stronger, individual fitness is lower, and the probability of
mortality per individual is higher, reducing the expected lifespan.
15.4 Beneficial mutations and absolute fitness
Thus far, we have only looked at neutral nonWF models. Fitness in nonWF models is absolute
and affects survival, where in WF models it is relative and affects mating success; this makes fitness
dynamics a bit different between nonWF and WF models. In particular, since one cannot survive
with more than a 100% probability, fitness values above 1.0 in nonWF models do not benefit the
individual at all; fitness values above 1.0 are interpreted as being 1.0. However, fitness is also
multiplicative (in both nonWF and WF models), and the important thing is the final fitness value
for an individual. The way this works out in practice can be a bit counterintuitive, so in this
section we will explore a simple model of an introduced beneficial mutation that sweeps to
fixation in a population, and we will look at the population dynamics as it does so.
The recipe is again based on that of section 15.1:
initialize() {
initializeSLiMModelType("nonWF");
defineConstant("K", 500);
initializeMutationType("m1", 0.5, "f", 0.0);
m1.convertToSubstitution = T;
initializeMutationType("m2", 1.0, "f", 0.5);

// dominant beneficial

initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeMutationRate(1e-7);
initializeRecombinationRate(1e-8);
}
reproduction() {
for (i in 1:5)
subpop.addCrossed(individual, subpop.sampleIndividuals(1));
}
1 early() {
sim.addSubpop("p1", 10);
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

306

100 early() {
mutant = sample(p1.individuals.genomes, 10);
mutant.addNewDrawnMutation(m2, 10000);
}
early() {
p1.fitnessScaling = K / p1.individualCount;
}
late() {
inds = p1.individuals;
catn(sim.generation + ": " + size(inds) + " (" + max(inds.age) + ")");
}
2000 late() {
sim.outputFull(ages=T);
}

The changes from section 15.1 are minor. In the initialize() callback we create a mutation
type m2, for beneficial mutations; this has quite a strong selection coefficient and is dominant, to
make it less likely to be lost to drift at the very beginning, but that is unimportant to the point of
this recipe; we just don’t want to have to get into making the recipe conditional on fixation since
that is an extra complication (see section 10.2). In the reproduction() callback each individual
now reproduces five times per generation, creating very strong density-dependent selection; this is
a highly fecund species. That is not essential to the point of this recipe, but it will make the effect
more obvious. In a new 100 early() event we now select one genome from the population at
random, and add an m2 mutation to it. The rest of the model is the same. The m2 mutation will
(usually) sweep to fixation, and we will look at the resulting population size and age structure.
At the beginning, the model rapidly grows to the carrying capacity and stays there (each output
line, remember is a generation followed by the population size and the age of the oldest
individual, post-mortality):
1:
2:
3:
4:
5:
6:
7:
8:
9:

10 (0)
60 (1)
360 (2)
513 (3)
473 (4)
460 (4)
480 (3)
504 (3)
505 (4)

The population size is around 500, the oldest individual usually 3 or 4. So far so good. Now
let’s look at the output at the end of the run, in a run in which the beneficial mutation fixes:
1996:
1997:
1998:
1999:
2000:

768
771
781
763
802

(4)
(3)
(4)
(3)
(4)

The population size is now fluctuating around 760 to 800, although the typical age of the oldest
individual is still 3 to 4. What happened to our carrying capacity of 500?
The answer lies in the fitness values calculated by SLiM. Two things affect fitness in this model.
One is density-dependent mortality, as embodied in the fitnessScaling value set by the early()
callback. When the population is above carrying capacity (as it is in every generation from 4
onwards, in this model, due to the many juveniles), this scaling value will be less than 1.0; given
how many juveniles this model makes, it is probably usually 0.2 or lower, in fact. The other thing
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

307

affecting fitness is the beneficial mutation; since it is dominant, it gives any individual possessing it
a fitness effect of 1.5 since its selection coefficient is 0.5. These fitness effects are multiplicative,
so once the beneficial mutation has fixed, every individual will have a fitness value of
approximately 0.2 * 1.5, or 0.3. This means, in effect, that fewer individuals will die, and the
carrying capacity will increase. Which is precisely what we see.
This may be unexpected for those who are used to the world of Wright-Fisher models, but it
makes good biological sense. If every individual possesses a mutation that makes them more fit –
more likely to survive – then the population size ought to increase. If there is some reason why
that shouldn’t happen, such as a hard limit on the amount of available food, then you ought to add
that biological detail to your model explicitly. This sort of thing is precisely why the individualbased nature of nonWF models, with emergent dynamics for things like population size and age
structure, has the potential to be more biologically realistic.
This point becomes even more pointed if you write a model in which new beneficial mutations
can arise spontaneously. In such a nonWF model, absolute fitness would increase a little bit more
with every new beneficial mutation, and the population would evolve toward a “Darwinian
demon” with infinite absolute fitness and infinite population size. Fundamentally, that is just a
more extreme case of the same situation as in this recipe: if your model tells SLiM that absolute
fitness has increased, population size will increase concomitantly. If you want some mechanism to
hold that tendency in check, you need to add that mechanism to your model yourself.
In this recipe, for example, it would certainly be possible to force the model to maintain the
same carrying capacity throughout, but trying to do so might just expose how biologically
unrealistic that constraint really is. If the carrying capacity is the same before and after the
beneficial mutation fixes, that means that the survival probability is the same (assuming we don’t
alter reproductive output). So once the beneficial mutation has fixed, it apparently no longer
confers any benefit to the carrier – even though it did confer a benefit (relative to the non-carrier
individuals) earlier on, before the beneficial mutation had fixed. What has changed? Nothing
about the environment, and the population size has not changed (since we are making the carrying
capacity fixed) – and yet somehow, almost magically, the fitness benefit that used to exist has
evaporated. The only thing that has changed, in fact, is that now other individuals also possess the
mutation, whereas early on in the sweep few or none did. And that suggests one way that we
could make the carrying capacity fixed in this model: add in negative frequency-dependent
selection (see section 9.4.1). Maybe there’s a good biological reason for that: limited resources, for
example, such that a mutation that makes an individual better at obtaining those resources confers
less and less benefit as the mutation gets more and more common. If that’s the biology, then great,
model that. But if one is tempted to hold the population size fixed simply because one is used to
Wright-Fisher models that hold the population size fixed – not for any biological reason – then it
would probably be best to think twice. The lesson of this recipe, in other words, might be: Model
the biology, not your own modeling assumptions. Let emergent dynamics emerge.
Or if you really want to stay in the world of Wright-Fisher assumptions, then write a WF model;
there’s nothing wrong with that, as long as you understand the assumptions you’re making.
15.5 A metapopulation extinction-colonization model
Our models so far have been of a single subpopulation, but nonWF models have many
advantages for modeling migration that we will explore in this recipe and the next. Here, we will
model a metapopulation undergoing local extinctions and then re-colonizations from other
subpopulations. This would be difficult to implement in a WF model, because population size is
not allowed to go to zero (that signifies removal of the subpopulation from the model), and
because re-colonization by migrants does not happen naturally (it would have to be done as reTOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

308

creation of the extinct subpopulation using addSubpopSplit(), and the founders could come only
from a single source subpopulation). In a nonWF model, however, this is quite straightforward.
For simplicity, we will model a non-spatial metapopulation in which every subpopulation is
connected to every other by migration of equal strength:
initialize() {
initializeSLiMModelType("nonWF");
defineConstant("K", 50);
// carrying capacity per subpop
defineConstant("N", 10);
// number of subpopulations
defineConstant("m", 0.01);
// migration rate
defineConstant("e", 0.1);
// subpopulation extinction rate
initializeMutationType("m1", 0.5, "f", 0.0);
m1.convertToSubstitution = T;
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeMutationRate(1e-7);
initializeRecombinationRate(1e-8);
}
reproduction() {
subpop.addCrossed(individual, subpop.sampleIndividuals(1));
}
1 early() {
for (i in 1:N)
sim.addSubpop(i, (i == 1) ? 10 else 0);
}
early() {
// random migration
nIndividuals = sum(sim.subpopulations.individualCount);
nMigrants = rpois(1, nIndividuals * m);
migrants = sample(sim.subpopulations.individuals, nMigrants);
for (migrant in migrants)
{
do dest = sample(sim.subpopulations, 1);
while (dest == migrant.subpopulation);
dest.takeMigrants(migrant);
}
// density-dependence and random extinctions
for (subpop in sim.subpopulations)
{
if (runif(1) < e)
subpop.fitnessScaling = 0.0;
else
subpop.fitnessScaling = K / subpop.individualCount;
}
}
late() {
if (sum(sim.subpopulations.individualCount) == 0)
stop("Global extinction in generation " + sim.generation + ".");
}
2000 late() {
sim.outputFixedMutations();
}

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

309

We start off by defining constants for the carrying capacity K, the subpopulation count N, the
per-individual migration rate m, and the per-generation probability e of a local extinction event in a
given subpopulation, such as might occur due to a forest fire or flood. We then set up a neutral
nonWF model with simple offspring generation (one offspring per individual per generation), and
create N subpopulations. The first subpopulation begins with 10 individuals; the rest begin empty,
awaiting migrants (which is legal in nonWF models; subpopulations can be empty). At the end of
the model, we have a late() event that halts the model with a message if all subpopulations have
gone extinct; if we make it to generation 2000 we output fixed mutations and stop.
The interesting code is in the middle, in the large early() event. This first implements random
migration by drawing the number of migrants from a Poisson distribution and then sampling
migrants at random from the full population; this gives each individual the same probability m of
migrating, and is more efficient than doing a random draw for each individual. It then loops
through the chosen migrants, finds their destination subpopulation (ensuring that it is not the
subpopulation the migrant already occupies), and finally calls takeMigrants() to move the
individual to its new home. The takeMigrants() call removes the individual from its old
subpopulation and adds it immediately to the target subpopulation; it is the way that migration is
implemented in nonWF models. Note that the design of this code avoids choosing and moving a
migrant, and then accidentally choosing that same individual as a migrant again and moving it
again; all migrants are selected, and then all migrants are moved. This design is generally a good
idea, to avoid accidentally skewing the migration rates for subpopulations away from their
intended rates. The migrant property of Individual could also be used to prevent this, together
with the ability of sampleIndividuals() to select individuals that have not already migrated.
The second half of the early() event implements both density-dependence and random local
extinction events. It draws from a random uniform distribution, and if the draw is less than the
probability of local extinction e, it sets the subpopulation’s fitnessScaling property to zero,
effectively reducing the fitness of all individuals in the subpopulation to zero and thus killing them
all. The rest of the time, fitnessScaling is set based on subpopulation density as usual, producing
growth up to the carrying capacity K for each subpopulation.
When run, this model produces fairly realistic extinction-colonization dynamics; after a
subpopulation is hit by an extinction event, it will eventually be recolonized and then undergo
rapid growth until reaching carrying capacity again. Here’s a snapshot from SLiMgui mid-run:

Three subpopulations are presently empty, two have just been recolonized by a single migrant,
two others are at intermediate stages of growth, and three are roughly at carrying capacity (which
is, as usual in nonWF models, not a hard limit). Note that all individuals are displayed here in
yellow, because SLiMgui normalizes away subpopulation-level fitnessScaling constants in order
to display individual fitness prior to density-dependent scaling, since that is generally what one is
interested in seeing; it does the same thing, in a sense, in WF models too. In fact, though, the
populations below carrying capacity have a fitness greater than 1.0 that is driving their growth,
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

310

and the populations at carrying capacity have a fitness less than 1.0 that compensates for their
births to keep them at equilibrium.
If the migration rate is too low, or the extinction rate is too high, the whole population will often
go extinct; but with less apocalyptic parameter values recolonization will keep up with extinction
and the population will persist (for a while, anyway; the design of the extinction events in this
model means that sudden global extinction of all subpopulations will happen with probability e^N,
so eventually the model will always go extinct as long as e > 0).
A nice feature in SLiMgui that is worth pointing out is that it keeps metrics regarding the
behavior of your model, and can display those metrics for you. You might recall the Population
Visualization graph that SLiMgui can display to show population sizes, fitnesses, and migration
patterns (see, e.g., sections 5.1.3 and 5.2.1). In nonWF models the migration rates between
subpopulations are not set ahead of time, but are instead an emergent property of the model, as
we have seen in this recipe. SLiMgui will monitor the actual migration generated in the running
nonWF model and display it in the Population Visualization graph. For example, here’s the pattern
of migration in generation 3 of a run of this recipe:
p1
p10

p2

p9

p3

p8

p4

p7

p5
p6

This shows that p5 has just received a migrant from p1; this is the initial colonization of p5, in
fact. The other subpopulations are black because they are empty at this point. Later in the run, it
might look like this instead:
p1
p10

p2

p9

p3

p8

p4

p7

p5
p6

Lots is going on here; p8 sent a relatively large proportion of its population to p4, by chance
(thus the thicker arrow) – maybe two or three or four migrants. Migrants were sent from p2 to both
p6 and p9, and then, immediately after, p2 got hit by an extinction event. And so forth. This
facility can be quite useful for debugging migration code, or for seeing how the pattern of
migration changes over time when it depends upon other model state (as in the next recipe).
In fact, SLiMgui monitors and displays other emergent metrics in nonWF models, too. In the
subpopulation table, for example, where things like the cloning rate, selfing rate, and sex ratio are
displayed for WF models, the actual, observed metrics for those properties will be displayed by
SLiMgui for nonWF models. This particular recipe is not a good showcase for that feature,
however, since it is an asexual model involving only biparental mating.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

311

15.6 Habitat choice
In the previous section migration in nonWF models using the takeMigrants() method was
introduced. Here we will further explore migration in nonWF models. In WF models, as you may
recall, migration occurs during offspring generation: parents from one subpopulation mate, but
their offspring gets added to a different subpopulation if it migrates. This type of juvenile migration
is the only possibility in WF models; but nonWF models are not restricted in that way, and here we
will construct a nonWF model of migration that can occur at any age. Each generation,
individuals will choose which environment they will live in, a phenomenon commonly called
“habitat choice”. All else equal, they will prefer the environment they are already in; but if the
other environment is better for them, then with a non-zero probability they will decide to move.
This model will also include variation in the individual propensity to migrate, and emergent
variation in the total number of migrants in each generation – both difficult to capture in WF
models.
The basic design of this model is patterned after recipe in section 9.2: we have two
subpopulations, p1 and p2, and a mutation type, m2, that represents relatively rare mutations that
are beneficial in p1 but deleterious in p2. For balance, let’s also have a mutation type m3 for
mutations that are deleterious in p1 but beneficial in p2. Finally, let’s throw a spanner into the
works by making offspring initially go to a random subpopulation, not always to the subpopulation
of their parents (perhaps representing some sort of shared spawning environment from which
juveniles initially disperse randomly).
Here is the model except for the habitat choice code:
initialize() {
initializeSLiMModelType("nonWF");
defineConstant("K", 500);
initializeMutationType("m1", 0.5, "f", 0.0);
m1.convertToSubstitution = T;
initializeMutationType("m2", 0.5, "e", 0.1);
m2.color = "red";
initializeMutationType("m3", 0.5, "e", 0.1);
m3.color = "green";

// deleterious in p2
// deleterious in p1

initializeGenomicElementType("g1", c(m1,m2,m3), c(0.98,0.01,0.01));
initializeGenomicElement(g1, 0, 99999);
initializeMutationRate(1e-7);
initializeRecombinationRate(1e-8);
}
reproduction() {
dest = sample(sim.subpopulations, 1);
dest.addCrossed(individual, subpop.sampleIndividuals(1));
}
1 early() {
sim.addSubpop("p1", 10);
sim.addSubpop("p2", 10);
}
early() {
p1.fitnessScaling = K / p1.individualCount;
p2.fitnessScaling = K / p2.individualCount;
}
fitness(m2, p2) { return 1/relFitness; }
fitness(m3, p1) { return 1/relFitness; }

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

312

1000 late() {
for (id in 1:2)
{
subpop = sim.subpopulations[sim.subpopulations.id == id];
s = subpop.individualCount;
inds = subpop.individuals;
c2 = sum(inds.countOfMutationsOfType(m2));
c3 = sum(inds.countOfMutationsOfType(m3));
catn("subpop " + id + " (" + s + "): " + c2 + " m2, " + c3 + " m3");
}
}

The structure of this model is quite predictable. The initialize() callback sets up the localadaptation m2 and m3 mutation types, and the two fitness() callbacks render mutations of those
types deleterious in one or the other subpopulation. The reproduction() callback generates each
new offspring in a randomly chosen subpopulation, representing random juvenile dispersal from a
spawning environment as mentioned above. We also have the usual nonWF density-dependent
fitness scaling, and we have an output event that prints information about the two subpopulations
(as we will discuss below).
The interesting part, then, is the large early() event for habitat choice, which should be
inserted just above the density-dependence early() event (so that density-dependence is based
upon post-migration subpopulation sizes, not the pre-migration sizes):
early() {
// habitat choice
inds = sim.subpopulations.individuals;
inds_m2 = inds.countOfMutationsOfType(m2);
inds_m3 = inds.countOfMutationsOfType(m3);
pref_p1 = 0.5 + (inds_m2 - inds_m3) * 0.1;
pref_p1 = pmax(pmin(pref_p1, 1.0), 0.0);
inertia = ifelse(inds.subpopulation.id == 1, 1.0, 0.0);
pref_p1 = pref_p1 * 0.75 + inertia * 0.25;
choice = ifelse(runif(inds.size()) < pref_p1, 1, 2);
moving = inds[choice != inds.subpopulation.id];
from_p1 = moving[moving.subpopulation == p1];
from_p2 = moving[moving.subpopulation == p2];
p2.takeMigrants(from_p1);
p1.takeMigrants(from_p2);
}

The logic of this event goes through several steps, but is not complicated. First it gets a vector of
all individuals, and derives vectors of the number of m2 and m3 mutations possessed by each
individual; each of these is a vector of counts corresponding to the original vector of individuals.
It then computes a vector of habitat preferences for each individual: starting from a neutral
preference of 0.5, the more m2 mutations an individual has, and the fewer m3 mutations it has, the
more that individual prefers p1 over p2. Each m2 or m3 mutation shifts its preference by only 10%,
however, so these preferences are not initially very strong; and this preference is clamped to the
range [0.0, 1.0] so that even once many m2 and m3 mutations exist this preference is still
bounded. Next the script computes an “inertial” habitat preference for p1 over p2, expressing a
simple desire not to move; for individuals presently in p1 this is 1.0, whereas for those in p2 it is
0.0. The final habitat preference is computed as a weighted average of these two considerations;
in this recipe the genetically-based preference is given a weight of 0.75 and the inertial preference
is given a weight of 0.25, making individuals fairly strongly inclined to move, but this balance is a
free parameter in the model. Next, we actually decide which subpopulation each individual
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

313

chooses, by comparing a random uniform draw to the individual’s weighted preference. (Note that
these calculations continue to be vectorized; this event handles all migration with a single
sequence of calculations.) The moving vector is the subset of individuals that chose a different
subpopulation than they currently occupy; these individuals will migrate, while the rest stay put.
The script then finds which migrants are currently in p1 (moving to p2) and which are in p2
(moving to p1), and then, finally, it makes takeMigrants() calls that actually move those
individuals. This method simply removes the given individuals from their current subpopulation
and inserts them into the target subpopulation. One could implement habitat choice in many
different ways, embodying different decision-making processes for the individuals involved,
different degrees of knowledge about the available habitat options, different approaches to the
stochasticity of the choice, and so forth; this is just one simple algorithm for demonstration
purposes.
Without any migration (if the habitat-choice early() event is commented out, in other words),
this model will “flip” to favor one subpopulation based upon the m2 and m3 mutations that happen
to do well early on; the successful subpopulation will grow very large (as its carrying capacity
increases, because so many individuals carry mutations that are beneficial in that environment; see
section 15.4), whereas the unsuccessful subpopulation will shrink toward zero (swamped by vast
numbers of offspring from the other subpopulation that are massively maladapted and immediately
die). Output from that version of the model typically looks like this:
subpop 1 (191): 0 m2, 1180 m3
subpop 2 (1314): 0 m2, 8072 m3

The model has “flipped” toward subpop p2, which is now 1314 individuals to p1’s 191
individuals. There are quite a large number of copies of m3-type alleles in play, whereas there are
no m2-type alleles. Subpop p1 will never be able to dig itself out of this hole, since it gets
swamped with new offspring from p2 every generation that carry m3 alleles and not m2 alleles, and
any m2 alleles it manages to scrape together get diluted into p2 and selected out. If the model runs
longer, p1 will effectively go extinct except for whatever cohort of confused migrants arrives to
repopulate it in each generation.
But with the early() event that implements habitat choice, the story is very different. Now, if
the probability of correct habitat choice is sufficiently high, the subpopulations can diverge; even
though they still sabotage each other with maladapted offspring, those offspring tend to migrate
over to the subpopulation where they are more fit. One subpop often grows significantly larger
than the other, but the model no longer “flips”; the smaller subpop is much more able to persist.
Output from the model with habitat choice:
subpop 1 (620): 2838 m2, 173 m3
subpop 2 (1065): 149 m2, 7177 m3

The subpopulations are now much closer in size, and show clear divergence in their m2 and m3
profiles. In SLiMgui the different haplotypes of the two subpopulations are immediately visible,
since the m2 and m3 mutations have been given different colors. Subpopulation p2 is larger here,
since there are more m3 alleles segregating, but it is no longer swamping p1 or diluting away the m2
alleles to such an extent that p1 can’t also thrive, and if new m2 mutations arise they will often be
able to establish themselves in p1 so the current imbalance may not even persist. Divergence here
is much more successful, and in fact it is effective enough that neutral sites will diverge as well,
indicating that this mechanism is sufficient to generate substantial reproductive isolation. The
degree of isolation will depend upon parameters such as the strength of habitat choice versus the
“inertial” preference for the current habitat, and the strength of the divergent selection on the m2
and m3 alleles, of course.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

314

This recipe goes beyond the capabilities of WF models in several important ways. One way is
that individuals of all ages migrate in this model, not just newly created juveniles as in WF models.
Indeed, in this model the same individual may just back and forth between p1 and p2 several
times, if it has no strong preference. A second way is that the migratory behavior here is based
upon individual genetic state. This is possible to implement in WF models, using a modifyChild()
callback that accepts or rejects proposed migrant offspring based upon their genetics, but in
practice it would be difficult and problematic. A third way is that the amount of migration in this
model is itself condition-dependent; if there are many maladapted individuals, there will be many
migrants, if fewer, fewer migrants. This would be difficult to do in a WF model since SLiM then
fulfills a pre-set migration rate; that pre-set rate would have to be carefully predicted and altered in
every generation in order to try to foretell the desired result from offspring generation, which
would be very clumsy if it worked at all. Migration is an individual choice, so it is much simpler
and more natural to model it as such, as nonWF models do.
Many interesting extensions to this model could be made. For example, one could impose a
fitness cost upon migration, and then allow the propensity for migration to itself evolve by making
the weighting between habitat choice and “inertia” depend upon a quantitative trait. Individuals
that chose to migrate too often would suffer a high cost, but those that did not migrate at all, even
when strongly maladapted, would also be penalized, and so some intermediate, optimal migration
tendency would perhaps tend to evolve. Such a model would rely heavily upon the advantages of
the individual-based migration of nonWF models; it would be very difficult to fit it into the WF
paradigm.
15.7 Evolutionary rescue after environmental change
The evolutionary response of a population to environmental change, and the probability of
extinction if that response is insufficient, is the subject of an important class of models called
“evolutionary rescue” models – particularly relevant in this era of anthropogenic climate change.
Evolutionary rescue can be based upon standing genetic variation, new mutations that provide
new adaptive potential, or genetic variation brought in by migrants. Here we will look at a QTLbased model of evolutionary rescue that may (or may not) occur as a result of both standing
genetic variation and new mutations. This model is based heavily upon other QTL models in this
manual (see sections 13.1, 13.10, and 13.17), adapted here to illustrate that QTL-based
approaches are entirely compatible with nonWF models.
Let’s look at this model piece by piece, beginning with initialize():
initialize() {
initializeSLiMModelType("nonWF");
defineConstant("K", 500);
defineConstant("opt1", 0.0);
defineConstant("opt2", 10.0);
defineConstant("Tdelta", 10000);
initializeMutationType("m1", 0.5, "n", 0.0, 1.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeMutationRate(1e-7);
initializeRecombinationRate(1e-8);

// QTL

}

For simplicity and speed, we define only a QTL mutation type, m1; this model has no neutral
mutations in it (but of course they would be trivial to add). Each QTL is drawn from a normal
distribution centered on 0.0 with a standard deviation of 1.0 (which are important parameters for
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

315

this model, since the exact nature of the standing genetic variation and mutational variance will be
important). Besides this, the initialization is quite standard. We set up defined constants for the
carrying capacity (K), for the phenotypic optimum before and after environmental change (opt1
and opt2), and for the time when the environmental change will occur (Tdelta).
Next we set up our reproduction and our initial population:
reproduction() {
subpop.addCrossed(individual, subpop.sampleIndividuals(1));
}
1 early() {
sim.addSubpop("p1", 500);
}

This is boilerplate, except that here we start at the carrying capacity so that if we configure the
environmental change to occur immediately at the beginning of the model (as we will try below),
the population is not at a disadvantage due to not yet having grown to capacity.
We need our QTL machinery, which is quite simple in this model:
early() {
// QTL-based fitness
inds = sim.subpopulations.individuals;
phenotypes = inds.sumOfMutationsOfType(m1);
optimum = (sim.generation < Tdelta) ? opt1 else opt2;
deviations = optimum - phenotypes;
fitnessFunctionMax = dnorm(0.0, 0.0, 5.0);
adaptation = dnorm(deviations, 0.0, 5.0) / fitnessFunctionMax;
inds.fitnessScaling = 0.1 + adaptation * 0.9;
inds.tagF = phenotypes;
// just for output below
// density-dependence with a maximum benefit at low density
p1.fitnessScaling = min(K / p1.individualCount, 1.5);
}
fitness(m1) { return 1.0; }

The fitness(m1) callback makes the direct fitness effect of m1 mutations neutral, as usual in
such QTL models; the only effect of QTL mutations on fitness is indirect, through their effect on
individual phenotypic values.
The early() event calculates individual phenotypes as the sum of the effects of all QTLs
possessed by the individual. It then decides which phenotypic optimum is in effect, calculates the
deviation of each individual from that optimum, and calculates the degree of adaptation of each
individual from that using dnorm() (normalizing the adaptation values to the range (0,1] with
fitnessFunctionMax). Finally, fitnessScaling values for individuals are set based upon their
adaptation; a perfectly adapted individual will have an adaptation value of 1.0 and thus a
fitnessScaling value of 1.0, whereas an infinitely maladapted individual will have an adaptation
value of 0.0 and thus a fitnessScaling value of 0.1, the “floor” in this model (a crucial parameter
that will influence the probability of evolutionary rescue).
The early() event also, at the end, implements density-dependent population regulation. This
is done in the usual way, except that a maximum of 1.5 is imposed with min(). In principle, this
sort of correction ought to have been imposed on all our other nonWF models, but in models that
do not include deleterious mutations, and which are not expected to spend time at low density, it
is unimportant. The rationale for the correction is that being at low population density does
convey some benefit (assuming the absence of Allee effects, as we have been doing), allowing
individuals to survive and reproduce at their full capacity even when they carry some minor
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

316

deleterious mutations that would negatively impact them if they were in a population at full
carrying capacity – but this benefit is surely limited! Typically, a population will not thrive if it
carries many large-effect deleterious mutations, even if it is released from all density-dependent
pressures. The maximum of 1.5 here embodies that intuitive fact, by stating that the fitness benefit
of low density can never be more than a multiplicative fitness effect of 1.5. (This is another
important model parameter, of course.)
Finally, we have our output and termination events:
late() {
if (p1.individualCount == 0)
{
// stop at extinction
catn("Extinction in generation " + sim.generation + ".");
sim.simulationFinished();
}
else
{
// output the phenotypic mean and pop size
phenotypes = p1.individuals.tagF;
cat(sim.generation + ": " + p1.individualCount + " individuals");
cat(", phenotype mean " + mean(phenotypes));
if (size(phenotypes) > 1)
cat(" (sd " + sd(phenotypes) + ")");
catn();
}
}
20000 late() { sim.simulationFinished(); }

The simulation checks for extinction in every generation, and stops with a termination message.
Otherwise, it prints a summary with the current population size, and the mean and standard
deviation of the distribution of phenotypes. If extinction is avoided, the model stops after
generation 20000.
Running this recipe as configured, we begin with no genetic diversity at all:
1: 500
2: 529
3: 499
4: 470
5: 497
...

individuals,
individuals,
individuals,
individuals,
individuals,

phenotype
phenotype
phenotype
phenotype
phenotype

mean
mean
mean
mean
mean

0 (sd 0)
0.000628046 (sd 0.107673)
-0.00650893 (sd 0.152125)
-0.00802766 (sd 0.188649)
0.00997383 (sd 0.225336)

By generation 10000, when the environment changes, we have built up some standing genetic
variation – although really not much more than we had just a few generations in:
...
9995:
9996:
9997:
9998:
9999:

493
498
518
493
500

individuals,
individuals,
individuals,
individuals,
individuals,

phenotype
phenotype
phenotype
phenotype
phenotype

mean
mean
mean
mean
mean

-0.101779
-0.139806
-0.146021
-0.127487
-0.139612

(sd
(sd
(sd
(sd
(sd

0.616717)
0.592579)
0.55228)
0.537018)
0.531529)

Then we hit generation 10000, the optimum suddenly changes to 10.0, and things get very ugly
very fast:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

317

10000: 123 individuals, phenotype mean -0.095242 (sd 0.560048)
10001: 77 individuals, phenotype mean 0.0100077 (sd 0.628869)
10002: 54 individuals, phenotype mean -0.191957 (sd 0.624277)
10003: 35 individuals, phenotype mean -0.0215701 (sd 0.55838)
10004: 18 individuals, phenotype mean 0.211882 (sd 0.862124)
10005: 7 individuals, phenotype mean 0.196655 (sd 1.07714)
10006: 5 individuals, phenotype mean 0.200521 (sd 0.144637)
10007: 3 individuals, phenotype mean 0.123028 (sd 0.0531924)
10008: 2 individuals, phenotype mean 0.153739 (sd 0)
10009: 1 individuals, phenotype mean 0.153739
Extinction in generation 10010.

The population evolved toward the new optimum, but not quickly enough to overcome its crash
in population size, and it went extinct in only eleven generations. However, this is actually not the
typical outcome for the model. Here’s another run just before the environmental change:
...
9995:
9996:
9997:
9998:
9999:

494
511
503
489
508

individuals,
individuals,
individuals,
individuals,
individuals,

phenotype
phenotype
phenotype
phenotype
phenotype

mean
mean
mean
mean
mean

0.152236 (sd 0.576086)
0.144287 (sd 0.590623)
0.151616 (sd 0.616162)
0.115655 (sd 0.617752)
0.11127 (sd 0.611983)

And here’s what happened after:
10000:
10001:
10002:
10003:
10004:
10005:
10006:
10007:
10008:
10009:
10010:
10011:
10012:
10013:
10014:
10015:
10016:
10017:
10018:
10019:
10020:
10021:
...

112 individuals, phenotype mean 0.373886 (sd 0.714775)
84 individuals, phenotype mean 0.394201 (sd 0.768464)
58 individuals, phenotype mean 0.447692 (sd 0.855927)
48 individuals, phenotype mean 0.689259 (sd 0.925998)
31 individuals, phenotype mean 0.877411 (sd 1.12496)
29 individuals, phenotype mean 1.12907 (sd 1.10892)
25 individuals, phenotype mean 1.50111 (sd 1.29306)
20 individuals, phenotype mean 1.45553 (sd 1.30944)
17 individuals, phenotype mean 1.88997 (sd 1.42636)
15 individuals, phenotype mean 2.67739 (sd 1.60781)
16 individuals, phenotype mean 2.92828 (sd 1.58097)
26 individuals, phenotype mean 3.32725 (sd 1.53169)
35 individuals, phenotype mean 3.87526 (sd 1.24333)
56 individuals, phenotype mean 4.13832 (sd 1.10771)
98 individuals, phenotype mean 4.27727 (sd 1.12148)
158 individuals, phenotype mean 4.54129 (sd 0.940975)
285 individuals, phenotype mean 4.66029 (sd 0.881386)
321 individuals, phenotype mean 4.84315 (sd 0.807692)
314 individuals, phenotype mean 4.98733 (sd 0.726859)
355 individuals, phenotype mean 5.08917 (sd 0.712417)
337 individuals, phenotype mean 5.18797 (sd 0.735515)
328 individuals, phenotype mean 5.28416 (sd 0.70728)

The population size dips down as low as 15 individuals, and the phenotypic optimum still
seems quite distant, but it manages to stage a full recovery; it’s back up to carrying capacity within
about a hundred generations.
In twenty runs of the model, extinction occurred nine times and evolutionary rescue occurred
eleven times. We can test the importance of standing genetic variation for rescue by simply setting
Tdelta to 0, making the optimum be 10.0 from the start of the model with no chance for standing
genetic variation to build up; in this variant of the model, extinction occurred sixteen out of twenty
times, rescue only four times. So: probably important. Which might seem a bit surprising, since
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

318

the variance in phenotype is really not large in generation 9999; but there are, nevertheless, a lot of
useful QTLs that can be brought together by recombination and applied to the problem of rapid
evolution.
We can also test the importance of new mutations to evolutionary rescue, by setting the
mutation rate to 0.0 when the environment changes; the population will then have nothing but
standing variation at its disposal. A proper test of this would require many runs of the model, but I
can state that evolutionary rescue does sometimes occur from the standing variation alone. Here’s
the population just before the change, in one run of that variant of the model:
...
9995:
9996:
9997:
9998:
9999:

462
489
488
496
488

individuals,
individuals,
individuals,
individuals,
individuals,

phenotype
phenotype
phenotype
phenotype
phenotype

mean
mean
mean
mean
mean

-0.020969 (sd 0.842111)
-0.0413428 (sd 0.843401)
0.0205512 (sd 0.805816)
-0.0103366 (sd 0.744229)
-0.0287371 (sd 0.829185)

And here’s the recovery, in all its glory:
10000:
10001:
10002:
10003:
10004:
10005:
10006:
10007:
10008:
10009:
10010:
10011:
10012:
10013:
10014:
10015:
10016:
10017:
10018:
10019:
10020:
10021:
10022:
10023:
10024:
10025:
10026:
10027:
10028:
10029:
10030:
10031:
10032:
10033:
10034:
10035:
10036:
10037:
10038:

110 individuals, phenotype mean 0.0798129 (sd 0.941083)
86 individuals, phenotype mean 0.199899 (sd 0.936995)
70 individuals, phenotype mean 0.369471 (sd 0.986702)
52 individuals, phenotype mean 0.643418 (sd 1.1836)
46 individuals, phenotype mean 0.780035 (sd 1.25709)
42 individuals, phenotype mean 1.23978 (sd 1.55453)
43 individuals, phenotype mean 1.86605 (sd 1.7264)
39 individuals, phenotype mean 2.62644 (sd 1.805)
48 individuals, phenotype mean 3.24882 (sd 1.82018)
67 individuals, phenotype mean 3.83253 (sd 1.69954)
101 individuals, phenotype mean 4.43066 (sd 1.53875)
165 individuals, phenotype mean 4.76209 (sd 1.42616)
289 individuals, phenotype mean 5.14574 (sd 1.27371)
325 individuals, phenotype mean 5.48173 (sd 1.19875)
360 individuals, phenotype mean 5.79835 (sd 1.02552)
369 individuals, phenotype mean 5.93227 (sd 0.973104)
386 individuals, phenotype mean 5.96707 (sd 0.959739)
372 individuals, phenotype mean 6.00515 (sd 0.953603)
371 individuals, phenotype mean 6.2132 (sd 0.813674)
364 individuals, phenotype mean 6.31304 (sd 0.698918)
380 individuals, phenotype mean 6.37598 (sd 0.604686)
427 individuals, phenotype mean 6.45539 (sd 0.471641)
370 individuals, phenotype mean 6.51224 (sd 0.363968)
412 individuals, phenotype mean 6.53184 (sd 0.305041)
415 individuals, phenotype mean 6.54332 (sd 0.277311)
404 individuals, phenotype mean 6.53433 (sd 0.299239)
380 individuals, phenotype mean 6.55845 (sd 0.234658)
386 individuals, phenotype mean 6.55116 (sd 0.256157)
421 individuals, phenotype mean 6.53684 (sd 0.293376)
390 individuals, phenotype mean 6.53791 (sd 0.29318)
401 individuals, phenotype mean 6.53193 (sd 0.307162)
403 individuals, phenotype mean 6.55492 (sd 0.248401)
418 individuals, phenotype mean 6.57826 (sd 0.165723)
428 individuals, phenotype mean 6.57869 (sd 0.163795)
399 individuals, phenotype mean 6.58121 (sd 0.151873)
420 individuals, phenotype mean 6.5856 (sd 0.128374)
402 individuals, phenotype mean 6.58132 (sd 0.15131)
422 individuals, phenotype mean 6.58565 (sd 0.128071)
419 individuals, phenotype mean 6.59647 (sd 0)

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

319

This is somewhat remarkable, since the new optimum is more than twelve standard deviations
away from the population’s phenotypic mean at the moment of the environmental change. The
population fixes for a single QTL haplotype by the end (thus, a standard deviation of 0), and that
haplotype provides a phenotype of 6.59647, which is almost exactly eight standard deviations
away from where it started – quite impressive. So rescue appears to be possible from standing
variation alone (sometimes), and from new mutations alone (sometimes), and most often from both
together (but still, only sometimes).
These outcomes will depend – perhaps quite sensitively – on the various parameters of the
model, such as the carrying capacity, the distance from the old optimum to the new one, the
mutational distribution and rate, the “floor” of the fitness function, and the maximum fitness
benefit from low population density. There are other implicit parameters here too, such as the
level of individual fecundity and the variance in that fecundity (here, zero). This is also a
hermaphroditic model, and hermaphroditic selfing is not prevented; switching to a sexual model
would make evolutionary rescue that much more difficult, since a non-zero number of both males
and females would need to be present in every generation. In short, gaining a proper
understanding of the dynamics of even this rather simple model would require some real work.
Nevertheless, the population dynamics of the model seem fairly realistic, and adding in even
more realism – sex, Allee effects, gradual environmental change instead of a sudden shift, etc. –
would not be difficult. Simulating this in a WF model would be more difficult to do with this level
of realism, since the population size would have to be set explicitly in every generation (rather
than being emergent from the birth/death dynamics), and would be fulfilled deterministically by
SLiM rather than exhibiting the natural stochastic variation around the carrying capacity that this
nonWF model exhibits.
Another interesting direction to take this model would be to use it to investigate the advantages
of sexual versus clonal reproduction. It has long been theorized that one of the disadvantages of
clonal reproduction is the difficulty of responding to environmental changes without the ability to
recombine parental genomes to bring adaptive alleles together onto the same chromosome. One
could experiment, in this model, with the effect of sexual versus clonal reproduction on
evolutionary rescue – and even the evolution of the reproductive mode in response to
environmental change. One could add a second QTL-based trait (see section 13.17) that governed
the probability that an individual would clone or reproduce sexually, and see whether
environmental change – perhaps cyclical or unpredictable – would provide enough of an
advantage to sexual reproduction to prevent clonal reproduction from taking over the population.
This would be straightforward to simulate in a nonWF model, since each individual generates its
own offspring and can choose its own reproductive mode, based upon genetics or anything else. It
would be considerably harder to implement in a WF model since the reproductive mode is
controlled only by the subpopulation-wide cloning rate and cannot easily be influenced by
individual genetics or other state.
15.8 Pollen flow
Plants reproduce sexually when a pollen grain from a flower reaches another (or perhaps the
same) flower and fertilizes an ovule. The pollen might be transmitted by a pollinator, or by wind or
water or other vectors. An important aspect of plant reproductive biology, then, is that pollen from
a flower in one subpopulation might end up fertilizing a flower in a different subpopulation. In
animals, gene flow between subpopulations generally results from the migration of individuals,
such as we modeled in sections 15.5 and 15.6 (and in many WF recipes as well). In plants, in
contrast, gene flow between subpopulations usually results from the migration of gametes (or, in
some species, gametophytes).
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

320

Pollen flow between subpopulations is a very important aspect of plant reproduction, then, but
it is quite difficult to model with a WF model in SLiM since offspring in WF models always come
from the mating of two individuals in the same subpopulation. Another advantage of nonWF
models, then, is that they can easily simulate pollen flow, because sexual reproduction can involve
any two individuals in the model.
This will be a very quick model since the concept is very simple and we have no complicated
analysis to do with the results:
initialize() {
initializeSLiMModelType("nonWF");
defineConstant("K", 200);
initializeMutationType("m1", 0.5, "f", 0.0);
m1.convertToSubstitution = T;
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeMutationRate(1e-7);
initializeRecombinationRate(1e-8);
}
reproduction() {
// determine how many ovules were fertilized, out of the total
fertilizedOvules = rbinom(1, 30, 0.5);
// determine the pollen source for each fertilized ovule
other = (subpop == p1) ? p2 else p1;
pollenSources = ifelse(runif(fertilizedOvules) < 0.99, subpop, other);
// generate seeds from each fertilized ovule
// the ovule belongs to individual, the pollen comes from source
for (source in pollenSources)
subpop.addCrossed(individual, source.sampleIndividuals(1));
}
1 early() {
sim.addSubpop("p1", 10);
sim.addSubpop("p2", 10);
}
early() {
for (subpop in sim.subpopulations)
subpop.fitnessScaling = K / subpop.individualCount;
}
10000 late() {
sim.outputFixedMutations();
}

Most of this model is boilerplate that should be familiar by now. The interesting part is the
callback. Here we model hermaphroditic (or perhaps monoecious) flowering
plants, so we do not model separate sexes, but we assume that selfing is no more common than
would be expected by chance (when an individual happens to choose itself as a pollen source in
this reproduction() code, which we do not prevent here). Each flower has 30 ovules, each with a
probability of 0.5 of being fertilized, so the total number of fertilized ovules is drawn from a
binomial distribution. We then determine the subpopulation that supplied the pollen for each of
those ovules (assuming independence), with a 99% chance that the pollen came from the local
subpopulation and a 1% chance that it was carried from the other subpopulation. Finally, we loop
to generate seeds for the fertilized ovules, using the proper pollen sources.
reproduction()

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

321

This model doesn’t show any particularly exciting behavior; it’s just a two-subpopulation neutral
model with a little gene flow. But it models pollen flow correctly, and thus provides a good
foundation for building a more complex model of plant evolution.
Those interesting in modeling plants might also wish to look at the recipe in section 11.3, which
shows how to model gametophytic self-incompatibility. That recipe enforces self-incompatibility
with a modifyChild() callback, which ought to be compatible with this nonWF model. Note,
however, that in nonWF models suppression of a proposed child by a modifyChild() callback is
the end of that proposed child; in WF models, SLiM loops until the set subpopulation size is filled,
but in nonWF models SLiM simply attempts to generate each requested offspring, and those that
are rejected by a modifyChild() callback are abandoned. That behavior might be realistic and
desirable (if pollen limitation is severe, or if stigmas are getting clogged by a large amount of
incompatible pollen that prevents fertilization); if not, if is easy enough to make the
reproduction() callback loop until offspring generation succeeds. (The addCrossed() method will
return NULL if the requested offspring is not generated.) Alternatively, one could implement the
self-incompatibility system directly in the reproduction() callback, rather than in a modifyChild()
callback, ensuring that the flower chosen as a pollen source for each fertilized ovule is compatible.
The latter approach is perhaps simpler, and should be faster.
15.9 Litter size and parental investment
Litter size (clutch size, brood size) is often involved in evolutionary trade-offs. All else being
equal, a larger litter is obviously better; the more offspring, the higher the fraction of one’s own
genes will be in the next generation. But all else is never equal, because each offspring requires
an investment of some sort – at least the energy required to make the egg and sperm, and often
quite a bit more beyond that. Offspring that receive insufficient parental investment will suffer
lower fitness, and at some point the disadvantages of that, in higher offspring mortality and/or
lower offspring mating success, will outweigh the advantages of having the extra offspring. In
some species, particularly those in harsh and extreme environments, scraping together enough
resources to produce even a single offspring is difficult, and the optimum may lie around one
offspring per breeding season or even less; other species pursue a strategy of extremely low
parental investment and produce as many offspring as they can. These different life history
strategies can sometimes be simplified (or oversimplified) into “K strategists” and “r strategists”;
more broadly, life-history tradeoffs are clearly of central importance in evolutionary biology.
In this recipe we will simulate a species with a quantitative trait that governs its litter size.
Section 15.3’s recipe included litter size variation, but here it will be governed by genetics, not just
chance. We will also account for parental investment and the resulting impact of larger litter size
on offspring fitness due to limited parental resources. The initialize() callback:
initialize() {
initializeSLiMModelType("nonWF");
initializeSex("A");
defineConstant("K", 500);
initializeMutationType("m1", 0.5, "f", 0.0);
m1.convertToSubstitution = T;
initializeMutationType("m2", 0.5, "n", 0.0, 0.3);

// QTL

initializeGenomicElementType("g1", c(m1,m2), c(1.0,0.1));
initializeGenomicElement(g1, 0, 99999);
initializeMutationRate(1e-7);
initializeRecombinationRate(1e-8);
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

322

This is a sexual nonWF model, with both neutral mutations (m1) and QTL mutations (m2). This
initialization is probably rote by now.
Moving on to the reproduction() callback:
reproduction(NULL, "F") {
mate = subpop.sampleIndividuals(1, sex="M");
if (mate.size())
{
qtlValue = individual.tagF;
expectedLitterSize = max(0.0, qtlValue + 3);
litterSize = rpois(1, expectedLitterSize);
penalty = 3.0 / litterSize;
for (i in seqLen(litterSize))
{
offspring = subpop.addCrossed(individual, mate);
offspring.setValue("penalty", rgamma(1, penalty, 20));
}
}
}

This callback applies only to females (the "F" in its declaration). The focal female chooses a
male mate using sampleIndividuals(), and assuming that succeeds (i.e., there is at least one male
in the population), the female will generate a litter with that mate. The script gets the female’s
litter-size phenotype from the tagF property where it will be stored (as we will see below) and
derives an expected litter size from its value such that the new individuals at the beginning of the
simulation, with no QTL mutations, have an expected litter size of 3. We then calculate the actual
litter size by doing a Poisson draw with the expectation as mean, and calculate a fitness penalty for
the offspring based upon the litter size they come from. The initial litter size of 3 entails no fitness
penalty, but larger litter sizes will have correspondingly larger penalties because of the decreased
amount of parental investment per offspring.
Having determined the litter size and the fitness penalty, we then make the litter’s offspring in a
loop. (Note the use of seqLen(litterSize), which will produce the correct number of loops even
if the litter size is zero; using the sequence operator with 1:litterSize would be incorrect, since a
litter size of 0 would generate the sequence 1 0 and the loop would then run twice instead of zero
times. Be careful using the sequence operator in such cases!) Each offspring is generated with a
call to addCrossed(), and then the fitness penalty due to parental underinvestment is set into a key
named "penalty" on the offspring for later use. Note that the actual penalty for each individual is
drawn from a gamma distribution with a mean of the expected penalty; this is a bit gratuitous,
probably, but makes the actual penalty somewhat non-deterministic.
Next we create our initial population:
1 early() {
sim.addSubpop("p1", 500);
p1.individuals.setValue("penalty", 1.0);
}

We set the initial fitness penalty on the new subpopulation’s individuals to 1.0 (i.e., no penalty,
since this is a multiplicative fitness effect).
Now comes an early() event that, together with the reproduction() callback, really does the
bulk of the work in this model:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

323

early() {
// QTL calculations
inds = sim.subpopulations.individuals;
inds.tagF = inds.sumOfMutationsOfType(m2);
// parental investment fitness penalties
inds.fitnessScaling = inds.getValue("penalty");
// non-overlapping generations
inds[inds.age >= 1].fitnessScaling = 0.0;
// density-dependence, assuming 50% die of old age
p1.fitnessScaling = K / (p1.individualCount / 2);
}

This callback has four parts, as commented. The first part totals the effects of all QTL (m2)
mutations for all individuals and puts those totals into the tagF property of the individuals; the
reproduction() callback expects to find the total there, as we already saw, and the output code
below uses it as well. The second part fetches individual fitness penalty values using the
"penalty" key and sets them into the fitnessScaling property of the individuals to create the
desired fitness effect. The third part makes generations non-overlapping in this model, by setting
the fitness of all individuals to 0.0 unless they are new juveniles; of course the same life-history
tradeoffs apply with overlapping generations too, but having discrete generations makes the
model’s operation easier to observe. Finally, the fourth part implements density-dependence with
the usual use of the fitnessScaling property of the subpopulation; here, however, we account for
the fact that we have already effectively killed half of the population from old age, to make the
final population size be closer to the desired carrying capacity.
OK, since this is a QTL model we need to zero out the fitness effect of the QTL mutations as
usual, so that their only fitness effects are indirect:
fitness(m2) { return 1.0; }

Then we do a little output in each generation, and provide a termination event:
late() {
// output the phenotypic mean and pop size
qtlValues = p1.individuals.tagF;
expectedSizes = pmax(0.0, qtlValues + 3);
cat(sim.generation + ": " + p1.individualCount + " individuals");
cat(", mean litter size " + mean(expectedSizes));
catn();
}
20000 late() { sim.simulationFinished(); }

The model starts off like this:
1: 500 individuals, mean litter size 3
2: 492 individuals, mean litter size 2.99977
3: 485 individuals, mean litter size 2.99929
...

The mean litter sizes printed in the output are calculated in the same way as in the
callback. The model starts with the default litter size of 3, and then QTLs start to
arise that modify that value. By the end of the model, we’re in a fairly different place:
reproduction()

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

324

19998: 335 individuals, mean litter size 6.95885
19999: 326 individuals, mean litter size 6.95713
20000: 336 individuals, mean litter size 6.95985

With these parameter values and configuration, the model tends to equilibrate at a litter size
around 6.5 to 7.5, but the range of outcomes is fairly broad and values as high as 9 or 10 are often
seen; apparently the selection on litter size imposed by this model is not terribly strong, so the
fitness peak is pretty broad. In any case, somewhere in this vicinity there seems to be a crossover
point where the benefit of more offspring is counterbalanced by the decrease in parental
investment. In this simple model, that result could probably be calculated analytically, but of
course that would be impossible in a more complex model involving other biological realism as
well.
One interesting thing to note about the output above is that the population size at the end is
much smaller than it was at the beginning. This is because every individual at the end is suffering
from a lack of parental investment, and that depresses the population size below the carrying
capacity. We don’t even need to think about that; it just happens automatically, as an emergent
property of the model.
In this recipe, the penalty for each offspring depends upon the size of the litter to which the
offspring belonged. This makes sense when parental investment is, in fact, per offspring, such as
feeding newly hatched juveniles in a nest. In other cases, investment might depend upon the
expected litter size, not the actual litter size; if a bird adds body fat before the breeding season and
uses that energy to generate a predetermined number of eggs, but only a subset of those eggs
hatch, the investment is per egg, not per hatched chick. We can modify the recipe for that case by
changing the penalty calculation to be:
penalty = 3.0 / expectedLitterSize;

Interestingly, even though litterSize is drawn from a distribution with a mean of
this produces fairly different outcomes. The equilibrium litter size reached is
now typically around 3.5 to 4.0 – much smaller.
This model has a lesson that goes far beyond litter size and parental investment: nonWF models
can include genetic variation, and evolution, of traits that it is difficult or impossible to model in
such a way in WF models. One could easily write a nonWF model of the evolution of the sex
ratio, or of the selfing or cloning rate, or of migration or dispersal behavior, or of – as here – litter
size or parental investment. WF models can include all of those phenomena, but since they are
handled by SLiM’s core engine in WF models, it is quite difficult to include individual, geneticallybased variation in them.
expectedLitterSize,

15.10 Spatial competition and spatial mate choice in a nonWF model
In chapter 14 a variety of spatial models were explored, all of which were WF models. Those
spatial modeling techniques work just as well in nonWF models – indeed, better in some ways, as
we will see. The model here is derived from the recipe of section 14.5, and includes both spatial
competition and spatial mate choice.
Let’s begin with the initialize() callback as usual:
initialize() {
initializeSLiMModelType("nonWF");
initializeSLiMOptions(dimensionality="xy", periodicity="xy");
defineConstant("K", 300);
// carrying capacity
defineConstant("S", 0.1);
// spatial competition distance

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

325

initializeMutationType("m1", 0.5, "f", 0.0);
m1.convertToSubstitution = T;
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeMutationRate(1e-7);
initializeRecombinationRate(1e-8);
// spatial competition
initializeInteractionType(1, "xy", reciprocal=T, maxDistance=S);
// spatial mate choice
initializeInteractionType(2, "xy", reciprocal=T, maxDistance=0.1);
}

We set up "xy" dimensionality with periodic boundary conditions for both dimensions, in order
to get a toroidal space (see section 14.12). We will still need density-dependent population
regulation of some kind, but since this is a continuous-space model our concept of density ought
to depend upon local density, not overall population size; K is the carrying capacity that we will
aim for by calibrating our local density function. We also define the maximum spatial competition
distance with the symbol S. We set up a neutral model with the usual boilerplate code, and
initialize spatial interactions for competition and mate choice.
Next let’s look at reproduction:
reproduction() {
// choose our nearest neighbor as a mate, within the max distance
mate = i2.nearestNeighbors(individual, 1);
for (i in seqLen(rpois(1, 0.1)))
{
if (mate.size())
offspring = subpop.addCrossed(individual, mate);
else
offspring = subpop.addSelfed(individual);
// set offspring position
pos = individual.spatialPosition + rnorm(2, 0, 0.02);
offspring.setSpatialPosition(p1.pointPeriodic(pos));
}
}

We use interaction i2 to find a mate within the maximum mating distance. If one is found,
is used for biparental mating, otherwise addSelfed() is used for selfing (note this
variation in individual mating behavior based on spatial dynamics, which would be quite difficult
to achieve in a WF model). In either case, we draw a litter size using rpois() with an expected
mean size of 0.1, so individuals will reproduce relatively infrequently and generations will be
highly overlapping; if we’re at equilibrium and a new individual is born from a given parent 10%
of the time, then individuals ought to have an average lifespan of 10 generations. We loop over
our litter size (note the use of seqLen(), so that if the litter size is zero we get the correct behavior;
using 1:rpois() instead would yield the sequence 1 0 when the litter size is zero, erroneously
producing two offspring). Each offspring is positioned a small distance from the first parent, with
accounting for the periodic boundaries.
addCrossed()

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

326

Population initialization is very similar to the WF model:
1 early() {
sim.addSubpop("p1", 1);
// random initial positions
for (ind in p1.individuals)
ind.setSpatialPosition(p1.pointUniform());
}

Note that because individuals that can’t find mates self, we can start the model with just a single
individual and let it grow to capacity. Next we have spatial competition:
early() {
i1.evaluate();
// spatial competition provides density-dependent selection
inds = p1.individuals;
competition = i1.totalOfNeighborStrengths(inds);
competition = (competition + 1) / (PI * S^2);
inds.fitnessScaling = K / competition;
}

We evaluate i1, and then we evaluate the effect of competition for all individuals in a
vectorized fashion. The call to totalOfNeighborStrengths() returns a vector of the interaction
strength felt by each individual, and we rescale these values to normalize them (discussed below).
The fitness effect on each individual is then calculated as the carrying capacity K divided by the
rescaled strength of competition felt by the individual. If that strength is equal to K, we are at
equilibrium; the individual will feel no fitness effect, positive or negative, from spatial competition.
If the local population density around the individual is higher than that equilibrium density, its
fitness will be lower, and vice versa; local density is what is actually enforced by this formula, not
total population size. If the population happens to be uniformly distributed in space, then its
equilibrium size will be equal to K; but if there are areas of space that are uninhabitable, or if
individuals tend to cluster for whatever reason, then the equilibrium population size may be
different from K.
To understand how the rescaling works, it is useful to imagine that the maximum competition
distance, S, is set such that the circle covered by the interaction radius around a focal individual
has an area of exactly 1.0. The term (PI * S^2) is the area of this interaction circle, so in this case
it would be 1.0. The focal individual’s interaction circle would then include every individual in
the model (since the area of the space defined by p1 is also 1.0, since we didn’t change its
dimensions from a unit square; if we had, this formula would need to be tweaked accordingly).
The value of competition will thus be the population size, minus one because the focal individual
is not itself included in the total interaction strength; we compensate by using (competition + 1).
For our hypothetical situation in which the interaction circle has area 1.0, it can thus be seen that
the fitness scaling value for every individual will be exactly 1.0 when the population is at carrying
capacity. The same logic applies for other values of S – except for the caveat above that a nonuniform spatial distribution might lead to a different equilibrium population size.
Finally, note that periodic boundary conditions are important when modeling competition in
this sort of way, because they provide a uniform strength of competition across space, without
edge effects. With any non-periodic boundary condition, the strength of spatial competition felt by
individuals at the edge of the space will be lower, and that will encourage the population density
to be higher at the edges than in the center (because, in effect, the local population density being
regulated by the competition function includes the empty areas beyond the edges of the space).
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

327

So far so good. Since we have overlapping generations, it would be nice to model some
movement by the individuals in the model, rather than just representing them as motionless points.
To do that, we have a late() event that moves everybody around:
late()
{
// move around a bit
for (ind in p1.individuals)
{
newPos = ind.spatialPosition + runif(2, -0.01, 0.01);
ind.setSpatialPosition(p1.pointPeriodic(newPos));
}
// then look for mates
i2.evaluate();
}

This just deviates individual positions by a small random factor, with accounting for periodic
boundaries. After doing that, it evaluates i2 in preparation for the mate-searching that will occur
in reproduction() callbacks at the start of the next generation; this just happens to be a
convenient moment for that evaluation, right after final positions for the generation have been
established.
Finally, we have a termination event:
10000 late() {
sim.outputFixedMutations();
}

Something worth noting about this model is that it contains no modifyChild() callback.
Because nonWF models create their own offspring, we can fix the spatial positions of offspring
directly in the reproduction() callback. It would also work to do it in a separate modifyChild()
callback, as in section 14.5, but that would be a bit more complex and a bit slower. Similarly,
notice that this model contains no fitness(NULL) callback, even though spatial competition
modifies individual fitness values in the model, because we can use the fitnessScaling property
of the individuals to implement the fitness effect. Often, nonWF models can get away with fewer
callbacks, or even no callbacks except initialize() and reproduction(), as shown here.
When this recipe is run, it looks a lot like the recipe from section 14.5:

There are differences from section 14.5’s recipe, though: individuals are moving around in
space during their lifetimes, and generations are clearly overlapping; there is much more
continuity from generation to generation. The population size is no longer fixed at K, either; it
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

328

starts at 1 and grows organically, with a natural pattern of population growth that spreads across
space. Even once it reaches carrying capacity it is not fixed at exactly K, as the WF model was, but
fluctuates around the carrying capacity stochastically, depending in part upon how non-uniform
the spatial clustering at any given moment happens to be. This added realism could be very
important in some models. For example, if something were to suddenly perturb the population – if
a disease outbreak in one corner of the space suddenly killed off all the individuals within a small
radius, say – the WF model would nevertheless force the total population size to be K in every
generation, and so the population density would increase in all the other areas of the space, which
clearly makes no sense. This model’s more robust implementation of population regulation based
upon local population density prevents that aberration; the population size would initially be
depressed by the perturbation, and the hole created by the disease outbreak would fill back in
from its edges naturally, over time.
It has been mentioned several times now that spatial clustering might cause this model to reach
an equilibrium population size different from K. In the snapshot above, it is not immediately
obvious that this is happening, but in fact it is, and the population size of this recipe fluctuates
around roughly 310, slightly above the intended carrying capacity of 300. This can be made much
more obvious by increasing S to 0.2; the model at equilibrium then looks like this:

Rather remarkably, the population has naturally clustered itself into fifteen clumps, equidistantly
arranged across the periodic space in a hexagonal pattern; this seems to be the optimal
arrangement for this value of S, and the model finds it fairly quickly every time it is run. The
equilibrium population size is now about 500 individuals. In the next section, we will discuss this
phenomenon further, and look at a way of preventing it if it is undesirable.
For now, let’s just note that this is not a bug; this is the emergent behavior of the model, and it is
not necessarily unbiological. For one thing, some species are territorial, and what this model is
doing could be seen as a sort of territoriality; if the carrying capacity were adjusted appropriately,
each cluster could contain the appropriate number of individuals for family groups of the modeled
species. For another, complex spatial clustering has been observed in various natural systems;
Vincenot et al. (2016) provide an overview of some examples in plants, for instance, and discuss
the modeling of such clustering. Of course these clusters might not behave precisely as one might
wish; they are probably almost completely genetically isolated from each other, for one thing, so
one might wish to model dispersal that would provide gene flow (such as the way that young male
lions leave their natal pride and head out to form their own pride, if they can).
There is one more interesting aside to be explored here. We can modify this recipe a bit more,
by removing the periodic boundaries and instead implemented reprising boundaries following the
appropriate recipe from section 14.3. With S still set to 0.2, if we run this modified model we will
see a fairly different pattern of spatial clustering:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

329

Twelve clusters are now arranged around the edges, and another eight clusters are packed into
the interior in an asymmetrical pattern. This is one way that the model can satisfy the constraints it
has been given, but it is not entirely stable, and there are several other configurations with either
nineteen or twenty clusters that are commonly observed. Here is another, with nineteen:

The equilibrium population size here is about 680 in the top configuration, and about 660 in the
bottom configuration. This is higher than it was with periodic boundaries, and of course the
nineteen or twenty clusters observed now is more than the fifteen observed with periodic
boundaries too. This difference is because reprising boundaries provide the model with more
elbow room; the clusters can now take advantage of the fact that they feel no competition from the
empty space beyond the edges, whereas periodic boundaries did not afford this luxury.
The overall point, then, is that spatial models are considerably more subtle than one might
initially appreciate, and that choices such as boundary conditions can have large consequences for
dynamics that might not be immediately obvious. In the next section we shall explore this further.
15.11 A spatial model with carrying-capacity density
The previous section provided a recipe for a nonWF model with both spatial competition and
spatial mate choice. Spatial competition provided population regulation, since more individuals
would produce more competition, reducing absolute fitness. However, as we saw at the end of
the section, that model tends to produce spatial clustering that might be undesirable. This occurs
because of the shape of the competition function. The competitive interaction strength felt
between two individuals in that recipe was constant out to the maximum interaction distance, and
then fell off abruptly to zero. This produces an incentive for clustering, for two reasons: (1) being
tightly clustered implies no more fitness penalty than being loosely clustered, since the interaction
strength is constant, and (2) clustering allows the empty spaces between clusters to be shared; if
the clusters arrange themselves into a regularly spaced configuration, they can efficiently share the
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

330

empty spaces between them. That effect allows the mean interaction strength felt by individuals to
decrease, in turn allowing the model to somewhat exceed its carrying capacity.
For this reason (and others), a Gaussian competition kernel is often preferred in this sort of
spatial model; it seems to produce fewer artifacts of this type. In chapter 14 we constructed a
series of models inspired by classic papers such as Dieckmann & Doebeli (1999) and Doebeli &
Dieckmann (2003), and here we return to that thread. Doebeli & Dieckmann (2003) introduced a
concept that they called “carrying-capacity density”, based upon a Gaussian competition kernel;
with the proper scaling, as set out in their paper, the population will equilibrate at the intended
carrying capacity if the environment is homogeneous, just as it did in the previous recipe, but
clustering artifacts driven by the shape of the competition kernel will be minimized. Here, we will
adjust the model of section 15.10 to implement this concept of carrying-capacity density, which
requires only a few minor modifications. Since the modifications are small, we will here review
only the sections of the model that are changed (as usual, the full recipe is available in SLiMgui
and online).
One part that requires changes is the initialize() callback:
initialize() {
initializeSLiMModelType("nonWF");
initializeSLiMOptions(dimensionality="xy", periodicity="xy");
defineConstant("K", 300);
// carrying-capacity density
defineConstant("S", 0.1);
// sigma_S, the spatial competition width
initializeMutationType("m1", 0.5, "f", 0.0);
m1.convertToSubstitution = T;
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeMutationRate(1e-7);
initializeRecombinationRate(1e-8);
// spatial competition
initializeInteractionType(1, "xy", reciprocal=T, maxDistance=S * 3);
i1.setInteractionFunction("n", 1.0, S);
// spatial mate choice
initializeInteractionType(2, "xy", reciprocal=T, maxDistance=0.1);
}

We now call K the carrying-capacity density, and S the spatial competition width, which is
referred to with the symbol σs in Doebeli & Dieckmann (2003). The spatial competition function
now uses a Gaussian kernel (type "n"), with a maximum distance of three standard deviations (so
by the time it cuts off it should be very close to zero anyway). Otherwise this initialize() callback is
the same as in section 15.10.
The other element that changes is the population regulation event:
early() {
i1.evaluate();
// spatial competition provides density-dependent selection
inds = p1.individuals;
competition = i1.totalOfNeighborStrengths(inds);
competition = competition / (2 * PI * S^2);
inds.fitnessScaling = K / competition;
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

331

This is quite similar to section 15.10’s code, but the rescaling of the competition strength has
changed in accord with the new Gaussian competition kernel shape. The new rescaling is parallel
to the rescaling that is applied in the standard formula for the normal distribution:
2

f (x | μ, σ ) =

1
2π σ 2

e

−

2

(x − μ)
2σ 2

Without the rescaling constant, that formula would produce a maximum value of 1.0; rescaled,
it produces a lower maximum density, such that the total probability density – the integral under
the curve – is exactly 1.0 (as required by the definition of a probability density function). Now,
note that the Gaussian interaction function we’re using here utilizes the same formula, but without
that rescaling factor; we just want to introduce that rescaling factor to normalize the Gaussian
interaction function. This is parallel to the reasoning followed in section 15.10 that led us to
rescale using the formula for the area of a circle of radius S. The rescaling factor is squared in this
recipe (no square root) because we have two spatial dimensions; the spatial competition function
is really the product of two Gaussian functions, one for x and one for y. See Doebeli &
Dieckmann (2003) and related papers, from which this mathematical framework is derived, for
further discussion, since a full derivation is beyond the scope of this manual. Note that, in terms of
implementation, we could just rescale the maximum strength for i1 instead, supplying the same
rescaling constant directly to setInteractionFunction(), and that would run faster, too; the
design shown here is more explicit for pedagogical purposes.
When this recipe is run, it looks a lot like the recipe from section 14.10:

Close examination, however, shows that this spatial distribution is less regularly clustered than
that previous recipe’s distribution. Changing S to 0.2 shows that the clustering previously
observed at that interaction scale is all but gone too:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

332

A little bit of clustering still occurs, partly because offspring land near their first parent, and
partly just as a result of stochasticity (one would not expect a stochastic process to produce a
perfectly uniform spatial distribution, after all). But the competition function is no longer
contributing heavily to that clustering as it was before; indeed, it may be working to smooth it
away by favoring new offspring that land in relatively unoccupied areas.
Incidentally, these effects of interaction kernel shape have been explored in various papers; see
Payne et al. (2011) for a model of this based upon the same Dieckmann and Doebeli models that
we have been exploring, although that paper concerns itself more with clustering in phenotypic
space than in the regular spatial dimensions (as we saw in chapter 14, phenotype can in some
ways be treated similarly to a spatial dimension). Payne et al. (2011) found that a box-shaped (i.e.,
platykurtic) kernel encouraged clustering, just as did the cut-off kernel we used in section 15.10,
which was of course also platykurtic. Leimar et al. (2008) explored competition kernel shape in
more detail, and found that a characteristic of the Fourier transform of the competition kernel was
predictive of its effects upon clustering. A Gaussian competition kernel does not promote
clustering, according to their findings, and this is one reason why this kernel shape is popular
(although mathematical convenience also plays a role, as they explain).
15.12 Forcing a specific pedigree in a nonWF model
In section 13.7, we saw a recipe for forcing a SLiM model to follow a specific pedigree through
arranged matings between individuals. That recipe worked within the WF model framework,
which made its task rather difficult, since matings in WF models are arranged by SLiM’s core
engine. To make it work, that recipe had to reject undesired proposed children with a
modifyChild() callback, a solution that is slow and scales poorly to larger pedigrees. It was, in
short, an exercise in trying to pound a square peg into a round hole.
With a nonWF model we can do much better, since in nonWF models matings are arranged by
the model’s script, not by SLiM’s core engine; in a nonWF model we can simply request the
matings we want to get. That is what we will do in this section’s recipe. In other respects the
approach here is similar to that of section 13.7’s recipe; we use the tag values of individuals to
identify each individual, with a unique value for every individual in the entire pedigree.
This section actually has two recipes in it. The first recipe is for a nonWF model that allows
random matings with overlapping generations; it tracks each individual’s ancestry and outputs two
files that record the population’s history. The first output file records the generation in which each
individual died (since we are modelling overlapping generations), and the second records every
mating event and the identities of the individuals involved.
The second recipe reads those files back in, and reproduces the exact pedigree and population
history that occurred during the run of the first recipe. It reproduces death events by setting a
fitness of zero for individuals slated to die in a given generation, and it reproduces mating events
by calling addCrossed() for each pair of individuals slated to mate in a given generation, in the
model’s reproduction() callback.
If you want to reproduce the pedigree followed in a “baseline” model run, these two recipes
should allow you to do exactly that, with little modification. If, on the other hand, you have
pedigree information from source other source (such as empirical data from a real population), you
can use the second recipe to force SLiM to follow that pedigree; you would just need to either
encode your pedigree in the file format expected by the recipe, or modify the recipe to read in
whatever file format you already have.
So, let’s begin with the first recipe, which tracks and outputs the pedigree realized by a model
run:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

333

initialize() {
initializeSLiMModelType("nonWF");
defineConstant("K", 10);
initializeMutationType("m1", 0.5, "f", 0.0);
m1.convertToSubstitution = T;
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeMutationRate(1e-7);
initializeRecombinationRate(1e-8);
// delete any existing pedigree log files
deleteFile("~/Desktop/mating.txt");
deleteFile("~/Desktop/death.txt");
}
reproduction() {
// choose a mate and generate an offspring
mate = subpop.sampleIndividuals(1);
child = subpop.addCrossed(individual, mate);
child.tag = sim.tag;
sim.tag = sim.tag + 1;
// log the mating
line = paste(c(sim.generation, individual.tag, mate.tag, child.tag));
writeFile("~/Desktop/mating.txt", line, append=T);
}
1 early() {
sim.addSubpop("p1", 10);
// provide initial tags and remember the next tag value
p1.individuals.tag = 1:10;
sim.tag = 11;
}
early() {
// density-dependence
p1.fitnessScaling = K / p1.individualCount;
// remember the extant individual tags
sim.setValue("extant", sim.subpopulations.individuals.tag);
}
late() {
// log out the individuals that died
oldExtant = sim.getValue("extant");
newExtant = sim.subpopulations.individuals.tag;
survived = (match(oldExtant, newExtant) >= 0);
died = oldExtant[!survived];
for (indTag in died)
{
line = sim.generation + " " + indTag;
writeFile("~/Desktop/death.txt", line, append=T);
}
}
100 late() {
sim.simulationFinished();
}

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

334

This recipe is quite straightforward; in most respects it follows the typical skeleton of a nonWF
model, but with some logging code added in. In the initialize() callback we delete any preexisting log files; the paths for the log files are of course something you are likely to want to
customize. The reproduction() callback chooses a random mate and calls addCrossed() to
generate a child, as usual; but it also sets that child up with a unique identifying tag value, and
appends a line summarizing the mating event to the mating.txt log file. The 1 early() event sets
the subpopulation up as usual, and also sets up initial tag values for the new individuals; the value
of sim.tag is used to track the next unused tag value in the run. The early() event provides
density-dependent population regulation, as we have seen in previous nonWF models; it also
remembers the tag values of all of the extant individuals prior to the survival generation cycle
stage, storing that list away in the simulation object using setValue(). Finally, the late() event
compares the post-mortality list of individuals to the pre-mortality list, determines which
individuals died, and appends lines representing those deaths to the death.txt log file. (This
recipe uses match(), but one of the Eidos set-theoretic methods like setDifference() could
probably be used just as easily.) The model ends at the end of generation 100.
A run of this model produces death.txt and mating.txt files on the user’s desktop (on Mac
OS X, at least; the output paths in the recipe may need to be modified on other platforms). The
death.txt file is simply a series of lines, each of which has a generation and then the tag value of
an individual that died in that generation, like this:
2 2
2 4
2 6
...
3 7
3 19
...
4 20
4 21
...

This file records that in generation 2 the individuals with tag values 2, 4, 6, etc., died; in
generation 3, individuals with tag values 7, 19, etc.; in generation 4, individuals with tag values
20, 21, etc.; and so forth, through to the end of the run.
The mating.txt file is almost as simple; here, each line records a generation, the tag values of
the first and second parent, and then the tag value of the generated offspring:
2 1
2 2
2 3
...
3 1
3 3
...
4 1
4 3
...

3 11
10 12
2 13
13 21
20 22
13 30
22 31

The first line in this example, for example, specifies that in generation 2 the individuals with tag
values 1 and 3 mated to produce an offspring individual with tag value 11.
These files, then provide all the information we need to reproduce the full pedigree and
population history of the model run. So much for the first recipe; now let’s move on to the second
recipe and see how it uses these files to force a replication of the logged pedigree:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

335

function (i)readIntTable(s$ path)
{
l = readFile(path);
t(sapply(l, "asInteger(strsplit(applyValue));", simplify="matrix"));
}
initialize() {
initializeSLiMModelType("nonWF");
defineConstant("K", 10);
initializeMutationType("m1", 0.5, "f", 0.0);
m1.convertToSubstitution = T;
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeMutationRate(1e-7);
initializeRecombinationRate(1e-8);
// read in the pedigree log files
defineConstant("M", readIntTable("~/Desktop/mating.txt"));
defineConstant("D", readIntTable("~/Desktop/death.txt"));
// extract the generations for quick lookup
defineConstant("Mg", drop(M[,0]));
defineConstant("Dg", drop(D[,0]));
}
reproduction() {
// generate all offspring for the generation
m = M[Mg == sim.generation,];
for (index in seqLen(nrow(m))) {
row = m[index,];
ind = subpop.subsetIndividuals(tag=row[,1]);
mate = subpop.subsetIndividuals(tag=row[,2]);
child = subpop.addCrossed(ind, mate);
child.tag = row[,3];
}
self.active = 0;
}
1 early() {
sim.addSubpop("p1", 10);
// provide initial tags matching the original model
p1.individuals.tag = 1:10;
}
early() {
// execute the predetermined mortality
inds = p1.individuals;
inds.fitnessScaling = 1.0;
d = drop(D[Dg == sim.generation, 1]);
indices = match(d, inds.tag);
inds[indices].fitnessScaling = 0.0;
}
100 late() {
sim.simulationFinished();
}

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

336

Let’s walk through how it works. First of all, we define a new function named readIntTable()
that reads in a text file from a given filesystem path, assumed to be composed of lines of integer
values, and returns a matrix representing the contents of the file. (Note that this function is
supplied separately in the online SLiM-Extras repository, too.) We haven’t used matrices much in
previous recipes; they are a relatively new feature of Eidos, and work in much the same way as
matrices in R do (see the Eidos manual for further discussion). Indeed, we haven’t seen many
examples of user-defined functions either (again, see the Eidos manual). Without worrying too
much about such details, however, if we treat this function as a black box and call it with the path
of our death.txt file, we get this matrix:
[0,]
[1,]
[2,]
...

[,0] [,1]
2
2
2
4
2
6

And for our mating.txt file we get this:
[0,]
[1,]
[2,]
...

[,0] [,1] [,2] [,3]
2
1
3
11
2
2
10
12
2
3
2
13

The initialize() callback of this recipe sets up the model in the usual way (mirroring the
setup of the first recipe, which is what one would probably want in most cases). It then calls
readIntTable() to read in the log files, and retains the resulting matrices as defined constants, M
and D. Finally, it extracts the first column from M and D (which contains the generations for each
logged event) and retains those as constants Mg and Dg, for simplicity and speed later. So far so
good; this is the information the recipe will use to reproduce the logged pedigree.
Next, the reproduction() callback executes the pedigree’s mating events. It does this by
fetching the rows of M that refer to the current generation, and looping through those rows to
generate each mating event (again, you may wish to refer to the Eidos manual for information on
the syntax involved in working with matrices). The tag value of the offspring is set according to
the logged pedigree as well, so that the new individual will match up with the tag values used in
the log files. The callback triggers all of the mating events for the generation in a single call, rather
than working with the individual supplied to the reproduction() callback, so it then disables itself,
by setting self.active to 0, ensuring that it is not called again in the current generation (as we
saw before in section 15.3).
The 1 early() event creates the initial subpopulation, as in the first recipe, and sets the tag
values of individuals in the same way so that both models get the same setup.
Finally, we have the early() event. This does not cause density-dependent population
regulation in the usual way of nonWF models; instead, it uses the fitnessScaling property values
of individuals to weed out specifically the individuals that died in each generation in the original
model run. It begins by setting fitnessScaling to 1.0 on all individuals. Then it looks up the
mortality events for the current generation, similarly to how the reproduction() callback looked
up mating events. It then uses match() to find the indices of the individuals in question, and sets
fitnessScaling to 0.0 for those individuals. Those individuals will be killed by SLiM during the
survival generation cycle stage that follows the early() event. The model terminates at the end of
generation 100, matching the behavior of the original model.
In this way, this recipe reproduces the saved pedigree, even when a different random number
seed is used. If you run the second recipe repeatedly, you will therefore get replicates of the saved
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

337

pedigree, but with different mutational and recombinational histories and thus different segregating
mutations at the end of the runs. Of course you may not wish to take on faith that the pedigree is
replicated exactly, so if you wish you can add calls to catn() in the appropriate spots to log out
each mating and mortality event, to confirm that they follow the pedigree as intended.
This basic scheme could be extended in various ways, as needed. For example, the log files
presently support only biparental matings; if you wanted to force a pedigree that involved cloning
in some cases, you could extend the mating.txt file format to have an extra column with a value
indicating whether the offspring was generated by cloning or not, or you could add a third log file,
named something like cloning.txt, to record those events; either strategy would work. You could
also log out, and then reproduce, other model events if you wished, such as migration events;
since you are in complete control of such events in nonWF models, doing this would be quite a
straightforward extension of these recipes. You could probably even record, and then reproduce,
the recombination breakpoints used in the generation of each offspring individual, using a
recombination() callback, so as to make the replicate run duplicate exactly the same pattern of
local ancestry along the chromosome as the original model run, if you wanted to. This strategy, of
driving model dynamics from file-based data, is quite general and can be applied to many tasks. It
is much cleaner to implement with nonWF models, however, since nonWF models are so much
more strongly script-driven than WF models; it would presumably be possible to make section
13.7’s recipe file-driven in this manner, but it would scale very poorly to large pedigrees.
15.13 Modeling clonal haploids in a nonWF model with addRecombinant()
In section 13.13 we saw a recipe for modeling clonal haploids in SLiM by keeping the second
genome of each individual empty. New mutations arising in those second genomes were removed
in each generation, and a recombination rate of zero was used to prevent any recombination with
those empty genomes. That recipe is general and flexible; as noted there, similar strategies could
be employed to model haploid mitochondrial DNA alongside diploid autosomal chromosomes, or
to model haplodiploidy and other such systems.
However, that strategy does have some drawbacks, particularly when tree-sequence recording is
being used (discussed in detail in chapter 16, but see section 1.7 for a quick introduction). With
tree-sequence recording, the recipe of section 13.13 records rather odd dynamics, with mutations
appearing and then disappearing in each generation, and the empty second genomes of
individuals inheriting from each other. These bizarre inheritance records are probably harmless,
but may complicate post-simulation analysis, and in any case constitute something of an offense
against clean design. Even more problematic for section 13.13’s strategy, when doing treesequence recording, is the question of how to model horizontal gene transfer, as we will discuss in
section 15.14.
Here, then, we will look at an alternative strategy for modeling clonal haploids. This strategy
can be employed only in nonWF models, but integrates much more smoothly with tree-sequence
recording and is conceptually more straightforward as well.
The recipe, in full:
initialize() {
initializeSLiMModelType("nonWF");
defineConstant("K", 500);
// carrying capacity
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeMutationRate(1e-7);
initializeRecombinationRate(1e-8);
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

338

reproduction() {
subpop.addRecombinant(genome1, NULL, NULL, NULL, NULL, NULL);
}
1 early() {
sim.addSubpop("p1", 500);
}
early() {
p1.fitnessScaling = K / p1.individualCount;
}
late() {
// remove neutral mutations in the haploid genomes that have fixed
muts = sim.mutationsOfType(m1);
freqs = sim.mutationFrequencies(NULL, muts);
if (sum(freqs == 0.5))
sim.subpopulations.genomes.removeMutations(muts[freqs == 0.5], T);
}
50000 late() { sim.outputFixedMutations(); }

This has the typical nonWF setup and population regulation, as we have seen before. The
event removes neutral mutations (type m1) when they reach fixation at a frequency of 0.5,
since SLiM doesn’t know enough to do so, following a similar strategy to section 13.13; see that
section for discussion.
The interesting and new part of this model is the reproduction() callback, which uses a
method we have not seen before, addRecombinant(). Like addCrossed(), addCloned(),
addSelfed(), and addEmpty(), this adds a new offspring individual to the target subpopulation, but
it provides greater control over precisely how that new individual is generated. Section 21.13.2
provides full documentation on this complex method, but in short, it allows you to supply two
parent genomes for each genome in the new offspring, with a list of recombination breakpoints to
be used in stitching together those parent genomes through crossover. The first three parameters to
its call here specify that genome1 from the focal parent should be used to generate the offspring’s
first genome, with no second recombinant strand (NULL), and no recombination breakpoints (NULL);
this provides clonal reproduction of the first genome. New mutations will be added by SLiM as
usual. The second group of three parameters to addRecombinant() are all NULL here; that specifies
that the second genome of the offspring should be generated with no parent genomes at all,
leaving it empty, just as addEmpty() would have done. In this case, addRecombinant() does not
add new mutations to the genome, since conceptually there is no parental genome to mutate.
This approach, using addRecombinant(), allows the model to tell SLiM precisely how to
generate offspring. This has several benefits. One is that, unlike the recipe of section 13.13, we
don’t have to go back and remove mutations added by SLiM to the second genomes of individuals;
SLiM knows not to add mutations to those genomes in the first place. Another is that treesequence recording, if enabled, is able to better understand and record what is going on; it does
not record the addition and removal of spurious mutations, and all of the second genomes of
individuals in this model will be recorded as having no parents and no descendants, producing a
cleaner recorded tree sequence that better reflects the fact that those second genomes do not
actually exist at all, conceptually. The third benefit of using addRecombinant() is that is allows
more complex modes of offspring generation to be expressed as well; as an example, in the next
section we will see how to model horizontal gene transfer in bacteria using addRecombinant() to
express the horizontal gene transfer to SLiM.
late()

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

339

15.14 Modeling clonal haploid bacteria with horizontal gene transfer
In section 15.13 we looked at a model of clonal haploids using addRecombinant(), as an
alternative to the original haploid clonal model presented in section 13.13. The use of
addRecombinant() allowed the details of child generation to be expressed precisely to SLiM,
facilitating a simpler model design and more accurate recording of ancestry in the tree sequence (if
tree-sequence recording were enabled; see section 1.7). In this section we’ll explore those
benefits in more detail in a model of horizontal gene transfer in bacteria.
This is a more complex model than that of section 15.13, so let’s take things in two steps. First,
here is all of the code except the reproduction() callback:
initialize() {
initializeSLiMModelType("nonWF");
defineConstant("K", 1e5);
// carrying capacity
defineConstant("L", 1e5);
// chromosome length
defineConstant("H", 0.001);
// HGT probability
initializeMutationType("m1", 1.0, "f", 0.0);
// neutral (unused)
initializeMutationType("m2", 1.0, "f", 0.1);
// beneficial
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, L-1);
initializeMutationRate(0);
// no mutations
initializeRecombinationRate(0);
// no recombination
}
1 early() {
// start from two bacteria with different beneficial mutations
sim.addSubpop("p1", 2);
// add beneficial mutations to each bacterium, but at different loci
g = p1.individuals.genome1;
g[0].addNewDrawnMutation(m2, asInteger(L * 0.25));
g[1].addNewDrawnMutation(m2, asInteger(L * 0.75));
}
early() {
// density-dependent population regulation
p1.fitnessScaling = K / p1.individualCount;
}
late() {
// detect fixation/loss of the beneficial mutations
muts = sim.mutations;
freqs = sim.mutationFrequencies(NULL, muts);
if (all(freqs == 0.5))
{
catn(sim.generation + ": " + sum(freqs == 0.5) + " fixed.");
sim.simulationFinished();
}
}
1e6 late() { catn(sim.generation + ": no result."); }

The initialize() code defines a few constants: the carrying capacity, the chromosome length,
and the probability that horizontal gene transfer will occur during a given mitosis event. Note that
we will model horizontal gene transfer as occurring during reproduction, rather than as a discrete
event occurring later in a bacterium’s lifetime. This design is much simpler, particularly if treesequence recording is enabled; horizontal gene transfer changes the genealogical relationships
among individuals, and tree-sequence recording is not designed to accommodate such changes in
the middle of an individual’s lifespan. The approximation seems unlikely to matter.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

340

Note also that although we define a neutral mutation type here, m1, we do not model neutral
mutations, and indeed, we use a mutation rate of 0.0. This is because a model of this sort is likely
to use tree-sequence recording to overlay neutral mutations after the fact for much greater speed;
since we haven’t gotten into tree-sequence recording yet, however, we will defer that topic until
sections 16.1 and 16.2. For now, it suffices to say that we do not model neutral mutations here.
The initial population here consists of just two bacteria, which are set up to carry different
beneficial mutations at different locations in the genome. The population will expand
exponentially until reaching the carrying capacity of 1e5. We have used a fairly large population
size since we are modeling bacteria, but a carrying capacity of 1e6 or even higher might be
desirable for some purposes. Apart from taking more time and memory, this model should scale
up without difficulties; the fact that neutral mutations are not included makes it scale much better.
In the late() event we detect the fixation or loss of the beneficial mutations; if both mutations
have fixed or been lost, the model prints a message indicating how many mutations fixed, and
then stops. If the model runs for 1e6 generations without fixation or loss, it stops with a diagnostic
message.
All of that is fairly routine. Now here’s the reproduction() callback, where the interesting
action happens:
reproduction() {
if (runif(1) < H)
{
// horizontal gene transfer from a randomly chosen individual
HGTsource = p1.sampleIndividuals(1, exclude=individual).genome1;
// draw two distinct locations; redraw if we get a duplicate
do breaks = rdunif(2, max=L-1);
while (breaks[0] == breaks[1]);
// HGT from breaks[0] forward to breaks[1] on a circular chromosome
if (breaks[0] > breaks[1])
breaks = c(0, breaks[1], breaks[0]);
subpop.addRecombinant(genome1, HGTsource, breaks, NULL, NULL, NULL);
}
else
{
// no horizontal gene transfer; clonal replication
subpop.addRecombinant(genome1, NULL, NULL, NULL, NULL, NULL);
}
}

Each bacterium reproduces exactly once each generation, producing two bacteria from one,
which makes sense from the perspective of reproduction by mitosis. This reproduction can happen
in two different ways, depending upon a random draw from runif(). If the draw is greater than or
equal to H, reproduction is purely clonal as in the model of section 15.13; that is the else clause
here. If the draw is less than H, horizontal gene transfer occurs, which needs some explanation.
In that case, we first draw a random individual (other than the focal individual) to act as the
source for the transfer, and get its first genome. Next we use a do–while loop to draw two distinct
locations along the genome; these will be the endpoints of the transfer. Specifically, the transfer
will start at breaks[0] and go forward to breaks[1]. For a bit of extra biological realism, we will
model a circular chromosome here, so if breaks[0] is greater than breaks[1] the transfer will wrap
around from the end of the genome to the start, as modelled in SLiM; we check for that case and
patch up the breaks vector to reflect what we want to happen in that situation. Finally, we call
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

341

to generate the offspring bacterium including the horizontal gene transfer. We
pass it to the parent genomes – that of the reproducing bacterium, and that of the horizontal gene
transfer source – with the breakpoint vector that describes when SLiM should switch between
those strands as it produces the offspring genome by recombination. As before, we pass NULL for
the next three parameters to indicate that the second offspring genome should be empty. (A
diploid model that wanted to generate its own recombination breakpoints might use those
parameters, for example.)
Without horizontal gene transfer, this would be a model of clonal competition: one lineage
would end up “winning” and the other would go extinct, although it might take a long time for
that outcome to be reached since it would depend on drift. This behavior can be seen by setting
the defined constant H to 0.0. With horizontal gene transfer, however, the bacteria will often
stumble upon a lineage (or perhaps more than one lineage) that combines both mutations in the
same genome, providing them with an advantage similar to that provided by recombination in
sexual reproduction. Once such a lineage arises it will almost always win, and we will get output
like this from the model:
addRecombinant()

197: 2 fixed.

Both beneficial mutations fixed in generation 197, thanks to horizontal gene transfer.
The details of the breakpoint generation here might need to be modified in a more realistic
model. Here we draw the start and end positions of the transfer region independently, but perhaps
it would be better to draw the start location randomly and then draw a transfer length from a
geometric distribution or some other distribution. This would constrain the horizontal gene
transfer to generally be a small minority of the genome, as is typical in the transfer of a plasmid or
a transposon. The location and length of the transfer could also be constrained by some sort of
genetic structure to explicitly model the transfer of a plasmid that spans a given range of the
genome, of course. The reproduction() callback could also base the choice of whether or not
horizontal gene transfer occurs upon the contents of the two genomes in question, not just upon a
random probability; one could model a selfish gene in the transfer donor that makes horizontal
gene transfer more likely to occur, for example. Since all of the logic governing the horizontal
gene transfer is in the model’s script, it can include whatever biological realism is of interest.
Note that prior to the addition of addRecombinant() in SLiM, it would have been possible to
model horizontal gene transfer by actually getting all of the mutations from the transfer region out
of the source’s genome, and then adding them into the target’s genome with addMutations()
(removing any existing mutations from the target region first). This would work fine except that it
obscures what is actually going on in terms of genealogy and inheritance. If tree-sequence
recording were used with such a model, the transferred region would not be recorded as
originating in the source genome; instead, the mutations would just magically appear in the target
genome, with no genealogical relationship between source and target recorded in the tree
sequence. The method presented here, using addRecombinant(), is therefore preferable. (If one
needed to model even more complex patterns of inheritance – offspring genomes that consist of a
mosaic of genetic material from more than two parental genomes, for example – using the
addMutations() technique might still be necessary, however, since addRecombinant() is designed
to record at most two parental genomes for each offspring genome. Tree-sequence recording
would not work well in such a model, however.)
To make a relatively realistic models of bacterial evolution with SLiM, this sort of realistic
inheritance and horizontal gene transfer is one important ingredient. The other important
ingredient is the ability to make models with a sufficiently large population size, which is made
possible by the enormous performance benefits provided by tree-sequence recording as we will
see in chapter 16.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

342

15.15 Implementing a Wright–Fisher model with a nonWF model
In this chapter, we have focused on the aspects of nonWF models that go beyond the Wright–
Fisher model, such as overlapping generations, age structure, and individual-level control over
events such as reproduction and migration. Sometimes, however, it can be useful to implement a
Wright–Fisher model as a nonWF model in SLiM – or at least some aspects of a Wright–Fisher
model. You might want to have discrete, non-overlapping generations, for example; or you might
want panmictic offspring generation as in the Wright–Fisher model, with each offspring being
generated from an independent, randomly drawn pair of parents. Implementing such a model
using the nonWF model type might still be desirable, because you might also want some nonWright–Fisher dynamics in your model that would be difficult to implement in a WF model in
SLiM, or you might want to take advantage of certain features of SLiM that are only available in
nonWF models (such as the addRecombinant() method used in the previous two sections). In this
section, then, we will build a nonWF model that incorporates most aspects of the Wright–Fisher
model.
The recipe here is quite simple, so we will look at it in full:
initialize() {
initializeSLiMModelType("nonWF");
initializeMutationType("m1", 0.5, "f", 0.0);
m1.convertToSubstitution = T;
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeMutationRate(1e-7);
initializeRecombinationRate(1e-8);
}
reproduction() {
K = sim.getValue("K");
for (i in seqLen(K))
{
firstParent = p1.sampleIndividuals(1);
secondParent = p1.sampleIndividuals(1);
p1.addCrossed(firstParent, secondParent);
}
self.active = 0;
}
1 early() {
sim.setValue("K", 500);
sim.addSubpop("p1", sim.getValue("K"));
}
early()
{
inds = sim.subpopulations.individuals;
inds[inds.age > 0].fitnessScaling = 0.0;
}
10000 late() { sim.outputFixedMutations(); }

In the initialize() callback, we set up this simple nonWF model with neutral mutations and a
uniform chromosome as usual. Note that we do not set up a constant K for the population carrying
capacity there; instead, in the 1 early() event, we set K up as a value kept by the simulation using
setValue(), making it easier to vary K over time (although we don’t do so in this simple recipe).
The p1 subpopulation begins with a size equal to the value of K just defined.
The reproduction() callback implements a Wright–Fisher style of reproduction. It gets the
current subpopulation size using getValue(), and then generates that many offspring into p1 with
randomly drawn parents for each offspring. It then deactivates itself by setting self.active to 0, to
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

343

prevent SLiM from calling it again, since it has reproduced the entire subpopulation; we first saw
this strategy in section 15.3. Note that this code does not prevent firstParent and secondParent
from being the same individual, so a certain amount of “incidental selfing” will occur (as is typical
in the Wright–Fisher model); it would be easy to fix that with this revised line:
secondParent = p1.sampleIndividuals(1, exclude=firstParent);

This draws the second parent from p1 while explicitly excluding firstParent as a choice; the
sampleIndividuals() method has many useful options of this sort. Changing this model to be
sexual instead of hermaphroditic would also be easy, since sampleIndividuals() can similarly be
told to draw only females (for the first parent) or only males (for the second). One could also
easily implement Wright–Fisher style migration between subpopulations by generating some
fraction m of the new offspring in p1 from parents drawn from a different subpopulation instead.
The early() event implements non-overlapping generations by simply killing off all nonjuvenile individuals, by setting their fitnessScaling to 0.0. This is, in effect, an extremely simple
version of the sort of life table we saw in section 15.2. Note that unlike most nonWF models, this
model has no other population regulation; the usual density-dependence code is absent. Since the
reproduction() callback always generates exactly K offspring, and all non-juveniles are killed off
each generation, the size of the population is deterministic and does not require further regulation.
This model remains non-Wright–Fisher in one key way: fitness is still expressed through premating mortality, not through an individual’s probability of mating. This has two important
consequences. First of all, if the model were changed to be non-neutral – with deleterious
mutations, in particular – then the population size would be K after offspring generation, but would
drop below K during the viability/survival generation cycle stage due to fitness-based mortality.
The model would thus be a “hard selection” model, not a “soft selection” model as is typical for
Wright–Fisher models. Second, the distribution of fecundity among individuals here is different
from that of a Wright–Fisher model; in this model, as in most nonWF models, either an individual
survives (in which case it has the same expected fecundity as any other surviving individual) or it
dies (in which case it has an expected fecundity of zero, being dead). In the Wright–Fisher model,
on the other hand, fitness modifies the probability that an individual will be chosen as a mate, and
so the distribution of fecundity is continuous rather than bimodal. The fact that fitness influences
mortality in nonWF models is an assumption built into SLiM’s core code, and would not be easy to
change in script; one would have to do a complete end-run around SLiM’s built-in fitness
calculations. If this difference from the Wright–Fisher model presents a problem, the WF model
type probably ought to be used. In many cases, however, the difference may be unimportant.
The other big differences, of course, are that the script for this model is much more complex
than the script for the equivalent WF model, and it runs several times slower – about 4×, in an
informal test. The speed penalty for switching to a nonWF model is large here because all this
model really does is reproduce, and the reproduction for the nonWF model is handled in script
rather than in SLiM’s core as it is for the WF model. The penalty associated with switching to
nonWF is therefore maximized here; if the model had other things to do besides reproduce, the
penalty would be smaller. For example, if the chromosome length in this model were 107 instead
of 105, SLiM would spend much more time handling the genetics of the model – doing mutation
and recombination, as well as checking for fixation or loss of mutations – and in that case, this
recipe would take only 1.35× as long as the equivalent WF model. With a chromosome length of
108, the slowdown is only a factor of 1.07×. This underlines that when making the choice
between a WF and a nonWF model, that choice should almost always be made based upon which
model type is a better fit for the scenario being simulated, rather than upon performance. If a
model is big and complex enough that its runtime is problematic – i.e., is measured in hours to
days – the overhead due to choosing the nonWF model type will probably be small.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

344

15.16 Alternation of generations
SLiM is generally a framework for modeling diploid organisms, but with some creative scripting
that assumption can be modified. We have seen some recipes for modeling haploids, as in
sections 13.13, 15.13, and 15.14. In nonWF models a similar strategy can be used to fully model
the phenomenon of alternation of generations, the way that diploid and haploid life cycle stages
generally alternate in organisms that are often thought of simply as “diploids”. Many sexual
animals, for example, have a multicellular diploid phase that produces a unicellular haploid phase
– sperm and eggs – that then fuse, in fertilization, to produce the next diploid generation. In plants
this situation is generally even more pronounced, often with a multicellular haploid phase, the
gametophyte, that can be free-living and large – often larger and more obvious than the diploid
sporophyte, which is often reduced. For many organisms, then, it may be important to model both
the haploid and diploid phases explicitly; mutations may be expressed differently between them,
selection may act differently upon them, they may migrate or disperse differently, and so forth.
SLiM does not have intrinsic support for modeling this alternation of generations, but it is
straightforward to implement in script in a nonWF model, as we will see in this section.
This model will be somewhat complicated, so let’s start with the setup:
initialize()
{
defineConstant("K", 500);
defineConstant("MU", 1e-7);
defineConstant("R", 1e-7);
defineConstant("L1", 1e5-1);

//
//
//
//

carrying capacity (diploid)
mutation rate
recombination rate
chromosome end (length - 1)

initializeSLiMModelType("nonWF");
initializeSex("A");
initializeMutationRate(MU);
initializeMutationType("m1", 0.5, "f", 0.0);
m1.convertToSubstitution = T;
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, L1);
initializeRecombinationRate(R);
}
1 early()
{
sim.addSubpop("p1", K);
sim.addSubpop("p2", 0);
}

We use defined constants for several of the model parameters in this recipe. The recipe here
involves only neutral mutations, but extending it to other types of mutations should present no
difficulties.
This is a sexual model, so we set up separate sexes with initializeSex(). We are not
modeling sex chromosomes, but we will track the sex of individuals in both the diploid and
haploid phase; sperm will be considered “male”, and eggs “female”, in this model.
A key point in the design of this model is that although we are modeling only a single
subpopulation, we use two subpopulations in the model, p1 and p2. The first, p1, is used to hold
diploids; the second, p2, is used to hold the haploid sperm and eggs. This separation is not strictly
necessary, but it makes the design of the model simpler, because this way we can define a
reproduction() callback for p1 that reproduces the diploids (producing sperm and eggs), and a
separate reproduction() callback for p2 that reproduces the haploids (producing fertilized eggs
that develop into diploids). For other processing of the individuals in the model, such as
fitness() callbacks, this partitioning will also prove useful, as we will see below.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

345

The next step is to define the reproduction() callbacks. Let’s start with the one for p1:
reproduction(p1)
{
g_1 = genome1;
g_2 = genome2;
for (meiosisCount in 1:5)
{
if (individual.sex == "M")
{
breaks = sim.chromosome.drawBreakpoints(individual);
s_1 = p2.addRecombinant(g_1, g_2, breaks, NULL, NULL, NULL, "M");
s_2 = p2.addRecombinant(g_2, g_1, breaks, NULL, NULL, NULL, "M");
breaks = sim.chromosome.drawBreakpoints(individual);
s_3 = p2.addRecombinant(g_1, g_2, breaks, NULL, NULL, NULL, "M");
s_4 = p2.addRecombinant(g_2, g_1, breaks, NULL, NULL, NULL, "M");
}
else if (individual.sex == "F")
{
breaks = sim.chromosome.drawBreakpoints(individual);
if (runif(1) <= 0.5)
e = p2.addRecombinant(g_1, g_2, breaks, NULL, NULL, NULL, "F");
else
e = p2.addRecombinant(g_2, g_1, breaks, NULL, NULL, NULL, "F");
}
}
}

The definitions of g_1 and g_2 at the beginning are just shorthand, to keep the lines later in the
callback from being so long that they wrap when shown here. As usual in nonWF models, this
callback is called by SLiM once per individual in p1, giving the individual an opportunity to
reproduce – in this model, an opportunity to produce gametes. The top-level loop causes the focal
diploid individual to undergo meiosis exactly five times; this is an oversimplification, obviously,
but there is no need, in most models, to generate millions of sperm. Within the loop, male
individuals undergo meiosis by producing four sperm, whereas females produce just a single egg
(plus three “polar bodies” that are discarded by meiosis in most sexual species, due to anisogamy;
the polar bodies are not modeled here).
Gametes are produced by the addRecombinant() method, adding the resulting haploid
individuals to p2. The calls to addRecombinant() here pass NULL for the genomes and breakpoints
that generate the second genome of the offspring; this results in an empty second genome in the
offspring, as is typical when modeling haploids in SLiM. The genomes used to generate the first
genome of the offspring can be supplied as either (g_1, g_2) or (g_2, g_1); the first of the two
genomes supplied is the copy strand at the beginning of recombination. Since the sperm
generated use all of the genetic material from meiosis, both of those options are used (twice,
because of the homologous chromosomes involved in meiosis); since egg generation produces
only a single gamete, the choice of initial copy strand is randomized with a call to runif().
Particularly for the sperm, since we want to generate the gametes in a realistic fashion following
the rules of meiosis, we generate the recombination breakpoints ourselves and use them to
generate complementary gametes. To generate breakpoints in the standard SLiM fashion, we call
the drawBreakpoints() method of Chromosome; by default this produces a set of breakpoints
identical to what SLiM would generate for its own internal use in reproduction. The number of
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

346

recombination breakpoints generated is chosen by drawBreakpoints(), by default, using the
overall recombination rate defined by the model.
Now let’s look at the reproduction() callback for p2, which contains haploid gametes:
reproduction(p2, "F")
{
mate = p2.sampleIndividuals(1, sex="M", tag=0);
mate.tag = 1;
child = p1.addRecombinant(individual.genome1, NULL, NULL,
mate.genome1, NULL, NULL);
}

This callback is defined only for females – i.e., eggs. Each eggs gets to “reproduce” – be
fertilized – to produce a new diploid organism in p1. In this model a random sperm is chosen to
fertilize each egg, but one could easily implement phenomena such as sperm competition here to
make the choice non-random. We mark sperm that have been used to fertilize an egg with a tag
value of 1, so that they will not be used again; when we draw a random sperm, we specify in the
call to sampleIndividuals() that the sperm chosen must have a tag value of 0, indicating that it
has not already been used. Once the fertilizing sperm has been selected, it is tagged with a value
of 1, and the diploid zygote is generated with a call to addRecombinant(). The call to
addRecombinant() here supplies only a single genome for each of the offspring genomes, with
NULL for the breakpoint vectors; this makes the offspring’s genomes a clonal copy of the
corresponding genomes from the gametes. Normally, new mutations would be generated and
added by SLiM during this clonal replication; we will fix that momentarily.
These callbacks implement the generation of gametes and then the fusion of gametes to
produce diploid zygotes; but there is a little bit of additional machinery needed, which we
implement in an early() callback that cleans up after reproduction and sets up for the next
reproduction event:
early()
{
if (sim.generation % 2 == 0)
{
p1.fitnessScaling = 0.0;
p2.individuals.tag = 0;
sim.chromosome.setMutationRate(0.0);
}
else
{
p2.fitnessScaling = 0.0;
p1.fitnessScaling = K / p1.individualCount;
sim.chromosome.setMutationRate(MU);
}
}

In even-numbered generations the top half of this event will execute; in odd-numbered
generations the bottom half will execute. In an even-numbered generation, at the point that early()
events are called, p1 will have just generated gametes. This recipe assumes non-overlapping
generations, so here we kill off the diploids by setting their fitnessScaling to 0.0; p1 will be
emptied out completely. Next, we set the tag values of all of the gametes in p2 to 0; this marks all
of the sperm as unused, in preparation for the way the reproduction(p2) callback uses the tag
field. Finally, we set the mutation rate to 0.0; we do not want new mutations to be generated by
SLiM during fertilization, so we need to disable mutation temporarily.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

347

In odd-numbered generations, gametes have just undergone fertilization, filling p1 up with new
diploid offspring. We therefore kill off the haploid gametes by setting their fitnessScaling to 0.0;
p2 will be emptied out completely. Since each egg produces a zygote, we will have way too many
diploids; we will be far above carrying capacity. The next line thus implements density-dependent
selection on p1, as usual in nonWF models; note that no such density-dependence was imposed
upon the gametes in this model. Finally, we set the mutation rate back up to the defined constant
MU, since we want new mutations to arise during gamete production.
With this design, the population will flip back and forth between p1 and p2 and it flips between
diploidy and haploidy, and the mutation rate will flip on and off as well. It would be
straightforward to implement overlapping generations of diploids in this model; one could even
delve into more esoteric ideas such as sperm storage. Fitness effects could differ in the haploid
and diploid phases easily, by implementing fitness() callbacks that apply only to p1 or p2 as
needed.
All that is left to finish off the model is a termination event, which here is trivial:
1000 late()
{
sim.simulationFinished();
}

This approach to modeling the alternation of generations may be overkill in many practical
situations. This model runs much more slowly than the equivalent model of only the diploid
phase; for one thing, it is generating a population of gametes that is more than ten times larger
than the population of diploids, every generation. Other strategies for modeling life cycle
complexity may be usable instead; section 15.8, for example, presents a model of pollen flow
between subpopulations of plants, which is simple to model without getting down to the details of
modeling individual pollen grains and the sperm cells they produce as separate entities.
Additional biological realism should generally be incorporated into a model only when there is
reason to believe that it matters – that it would affect the results of the model. In some cases,
however – such as when one wishes to have selection operate in the haploid phase – the
additional biological realism of modeling alternation of generations may be useful.
Even more esoterically, one could use the same basic concepts to develop models of mating
systems such as haplodiploidy; all that is really needed is to set up rules of reproduction that move
the genomes around from individual to individual in the correct way using addRecombinant(),
which is designed to be as flexible as possible in order to accommodate these sorts of purposes.
Partitioning the population according to genetics – here, diploids versus haploids – is a trick that
would probably also be useful in a model of haplodiploidy or other mating systems; indeed, such
artificial partitioning can be very useful in other contexts too, such as storing non-reproducing
juveniles separately from reproductive adults.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

348

16. Tree-sequence recording: tracking population history and true local ancestry
This manual has already discussed a variety of ways of tracking ancestry in SLiM models, such
as pedigree recording (section 13.2; see also section 15.12 for a related model) and using
mutations to mark the population of origin of chromosome positions (section 13.9). In this chapter
we will look at a very powerful way of tracking ancestry called tree-sequence recording. Treesequence recording is a new feature added in SLiM 3, and is introduced in some detail in section
1.7; you should read that section now if you haven’t already. This is an advanced feature, and so
this chapter will assume a familiarity with general SLiM and Eidos topics and techniques.
The use of tree-sequence recording leans heavily upon Python, where the tree sequence saved
in a .trees file can be read, analyzed, and modified. To run most of the recipes in this chapter,
you will need to have Python installed (Python 3.4.8 was used to construct and test these recipes).
Some familiarity with Python will come in useful, but the Python scripts here will not be extensive.
You will also need to have two Python packages installed: msprime and pyslim. The recipes in this
chapter are based upon a minimum version of SLiM 3.1, msprime 0.6.1, and pyslim 0.1.
The msprime package is used to analyze tree sequences, overlay mutations onto a tree
sequence, and perform coalescent simulations, among other tasks. The msprime project lives at
https://github.com/tskit-dev/msprime, and installation instructions and documentation are there.
The pyslim package essentially provides a bridge between SLiM and msprime; it should be used
to load SLiM .trees files into Python, to parse the SLiM-specific annotations in such files, to add
such annotations to an existing tree sequence, and to write out SLiM-compliant .trees files. The
pyslim project lives at https://github.com/tskit-dev/pyslim, and again, installation instructions and a
link to its documentation are provided there.
In general, tree-sequence recording is compatible with all other SLiM features. It can be used
with both WF and nonWF models, with all types of callbacks, with any population structure, and
so forth. In this chapter we will not attempt to show a tree-seq version of every recipe we’ve
already seen; instead we will focus upon the usage of the tree-sequence recording feature itself, to
show what it is capable of and how it should be used. The early recipes will use pyslim only
incidentally, to load .trees for use with msprime; later recipes will use pyslim more extensively.
16.1 A minimal tree-seq model
To begin, we will look at a minimal tree-sequence recording model based upon a simple
neutral model. This recipe enables tree-seq recording and then runs much as usual:
initialize() {
initializeTreeSeq();
initializeMutationRate(0);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 1e8-1);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 500);
}
5000 late() {
sim.treeSeqOutput("./recipe_16.1.trees");
}

There are three notable differences here from the vanilla neutral model we have seen before.
First, we now call initializeTreeSeq() to turn tree-seq recording on. This is all that is needed;
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

349

SLiM will now record all tree-sequence information throughout the run, and will automatically
conduct simplification periodically using the default simplification interval (which can be
customized with the simplificationRatio parameter to initializeTreeSeq(); see section 21.1).
Second, we now set the mutation rate to zero, because we don’t want to model neutral
mutations. Typically, when using tree-seq recording, neutral mutations are overlaid onto the
ancestry tree after forward simulation has completed; we will show that technique in the next
subsection, so in anticipation of that we have turned off neutral mutations here. Note that it is not
necessary to turn off neutral mutations; there is no problem with including them in a tree-seq
model, apart from the additional performance penalty of simulating them. It is just usually not
desirable, because avoiding the overhead of simulating neutral mutations is one of the primary
motivations for using tree-seq recording in the first place. Also, note that if the model had a mix of
neutral and non-neutral mutations we would probably want to remove only the neutral mutations
from the model, with a concomitant adjustment to the mutation rate; we will see an example of
this in 16.3. (We would not want to remove the non-neutral mutations, since they affect the
model’s dynamics and thus cannot be overlaid after simulation.)
Third, we call treeSeqOutput() to write out a .trees file containing the full tree sequence that
was recorded. This file is in a binary format defined by the msprime package (see section 23.4 for
some details). However, it can be loaded back into SLiM with readFromPopulationFile(), just
like a regular SLiM output file, and it can also be read in Python using msprime; we will see
examples of both techniques in later recipes.
That’s all there is to it; we will use the .trees file generated here in the next recipe, but for now
we’re done. It's worth noting, though, that an informal test of this model indicates that it runs in
about 5 seconds, whereas the same model with tree-sequence recording turned off and a mutation
rate of 1e-7 took 294 seconds to run. This performance improvement is not atypical; this is one
major reason why tree-sequence recording is useful.
16.2 Overlaying neutral mutations
This recipe is actually a Python recipe, not a SLiM recipe. It depends upon having run the
previous recipe, in section 16.1, to generate a .trees file representing the execution of a simple
neutral model. Since mutation was turned off in that model, the .trees file contains no mutations;
but it contains all of the ancestry information needed to overlay them. That is trivial in Python:
import msprime, pyslim
ts = pyslim.load("./recipe_16.1.trees").simplify()
mutated = msprime.mutate(ts, rate=1e-7, random_seed=1, keep=True)
mutated.dump("./recipe_16.1_overlaid.trees")

First we import the msprime and pyslim packages (which need to be installed, as discussed at
the beginning of the chapter). Then we load the saved .trees file into a tree sequence object with
pyslim.load(); this method, rather than msprime.load(), should generally be used to load
SLiM .trees files since it knows how to handle SLiM metadata and SLiM provenance information.
We then call simplify() to simplify the loaded tree sequence. SLiM 3.1 produces .trees files
that contain the first ancestral individuals in each new subpopulation created by addSubpop();
these individuals are useful for various purposes, such as recapitation (see section 16.10) and
tracing ancestry, so they are provided by SLiM for convenience. However, they are not marked as
“remembered”, so they disappear when the tree sequence is simplified; this makes it easy to get rid
of them when they are not wanted. Here we do not need them (although they would do no harm),
so we simplify them away to demonstrate this typical usage pattern. Note that this means that
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

350

mutations will be overlaid only back to the point of coalescence; if we want fixed mutations
overlaid past coalescence back to the start of forward simulation, we should not simplify().
Next, we overlay mutations with msprime.mutate() to generate a new tree sequence (with a
given mutation rate and random number generator seed), and finally we write the mutated tree
sequence out to a new .trees file. The keep=True parameter to msprime.mutate() indicates that
any existing mutations should be kept, not overwritten; in this example there are no mutations
already present in the tree, but if there were, we would typically want to keep them.
This script runs in about 0.2 seconds. Why is it so much faster than modeling the neutral
mutations in SLiM? The key is that when mutations are overlaid after the forward simulation has
completed, they only need to be generated along those branches of the ancestry tree that led to
extant individuals at the end of the run. All of the branches of the evolutionary tree that went
extinct – the vast majority of branches, in most models – need not have mutations overlaid.
The new .trees file written out at the end could be read by a different Python script to perform
further analysis or modification, again using msprime and pyslim. We could also, instead, simply
work with the new tree sequence object in further analysis code in this script; or we could write
out the mutation information to a different file format, such as MS format, if the ancestry
information encapsulated by the .trees file is no longer needed.
It should also be possible to read a .trees file with overlaid mutations back into SLiM with the
readFromPopulationFile() method, to use its state as the starting point for further forward
simulation. However, to do that the mutation information provided by msprime would need to be
annotated with additional data required by SLiM about the overlaid mutations, such as their
selection coefficients and mutation types. Furthermore, mutation positions would have to be
rounded, or somehow guaranteed to be integers already, since SLiM expects integer mutation
positions. We will add a recipe showing these techniques when it becomes possible.
16.3 Simulation conditional upon fixation of a sweep, preserving ancestry
In the recipe of section 16.1 we saw a tree-seq model for a trivial neutral model, but this
method is not limited to neutral models. Here we will look at a model involving both neutral and
deleterious mutations, into which a beneficial mutation is introduced. We want the beneficial
mutation to sweep, and we want this model to run conditional upon a successful sweep. This will
be quite similar to the recipe of section 10.2, then; however, here we have a background of
deleterious as well as neutral mutations. When we convert the model to use tree-sequence
recording, we will remove the neutral mutations from the model (since they can be overlaid later,
as in section 16.2). The ancestry information for all of the individuals in the model will be
preserved, even though the model will restart itself repeatedly until fixation is achieved.
First, here is the model without tree-sequence recording:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "g", -0.01, 1.0); // deleterious
initializeMutationType("m3", 1.0, "f", 0.05);
// introduced
initializeGenomicElementType("g1", c(m1, m2), c(0.9, 0.1));
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
defineConstant("simID", getSeed());
sim.addSubpop("p1", 500);
}
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

351

1000 late() {
target = sample(p1.genomes, 1);
target.addNewDrawnMutation(m3, 10000);
sim.outputFull("/tmp/slim_" + simID + ".txt");
}
1000:100000 late() {
if (sim.countOfMutationsOfType(m3) == 0) {
if (sum(sim.substitutions.mutationType == m3) == 1) {
cat(simID + ": FIXED\n");
sim.simulationFinished();
} else {
cat(simID + ": LOST - RESTARTING\n");
sim.readFromPopulationFile("/tmp/slim_" + simID + ".txt");
setSeed(rdunif(1, 0, asInteger(2^32) - 1));
}
}
}

Since this is very similar to the recipe of section 10.2, we won’t discuss it here; see that section
for discussion. Here, our goal is to convert it into a tree-seq model:
initialize() {
initializeTreeSeq();
initializeMutationRate(1e-8);
initializeMutationType("m2", 0.5, "g", -0.01, 1.0); // deleterious
initializeMutationType("m3", 1.0, "f", 0.05);
// introduced
initializeGenomicElementType("g1", m2, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 {
defineConstant("simID", getSeed());
sim.addSubpop("p1", 500);
}
1000 late() {
target = sample(p1.genomes, 1);
target.addNewDrawnMutation(m3, 10000);
sim.treeSeqOutput("/tmp/slim_" + simID + ".trees");
}
1000:100000 late() {
if (sim.countOfMutationsOfType(m3) == 0) {
if (sum(sim.substitutions.mutationType == m3) == 1) {
cat(simID + ": FIXED\n");
sim.treeSeqOutput("slim_" + simID + "_FIXED.trees");
sim.simulationFinished();
} else {
cat(simID + ": LOST - RESTARTING\n");
sim.readFromPopulationFile("/tmp/slim_" + simID + ".trees");
setSeed(rdunif(1, 0, asInteger(2^32) - 1));
}
}
}

We have added a call to initializeTreeSeq(), and changed the file save and load code to use
a .trees file instead of a standard SLiM text output file. We also removed the neutral mutations
from the model; note that this required changing the mutation rate. Since 90% of the mutations in
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

352

the original model were neutral, the new model has a mutation rate one-tenth as high, so that the
rate of deleterious mutations remains unchanged. In this simple model, with only one genomic
element type, the adjustment to the mutation rate is very straightforward; in a model with a more
complex genetic architecture in which the fraction of mutations that are neutral varies from region
to region along the chromosome, the necessary adjustment to the mutation rate after removal of
neutral mutations might require switching to a mutation-rate map, rather than using a single fixed
mutation rate. A mutation-rate map may be supplied to initializeMutationRate(), rather than a
single fixed rate; see section 21.1.
Otherwise, the model is unchanged. This model will run in much the same manner as the first
version, restarting itself whenever the m3 mutation is lost until it achieves fixation. The deleterious
mutations present when the beneficial mutation is introduced will be saved to the .trees file and
restored each time the model restarts.
When this model achieves fixation, it writes out the final state of the model to a new .trees file
(not in the /tmp directory, this time). This file can be loaded into Python with pyslim, as in the
recipe of section 16.2, to overlay neutral mutations, or to perform any other analysis desired. Note
that since this model contains non-neutral mutations as well as neutral mutations, the mutation
rate used in neutral mutation overlay would not be the original rate of 1e-7, but instead the rate
specifically of neutral mutations along the chromosome, and that with a complex genetic
architecture this might necessitate the specification of a mutation-rate map rather than the use of a
single fixed rate, similarly to the issue with the mutation rate in SLiM discussed above.
16.4 Detecting the “dip in diversity”: analyzing tree heights in Python
It has been mentioned several times that one can perform “other analysis” in Python using the
information saved in a .trees file. In this recipe and the next, we will look at two examples.
Because .trees files contain information about the true local ancestry of every location along the
genome of every extant individual, including the times when recombination and mutation events
occurred in the past, there are many interesting analyses that can be conducted.
The recipe in this section will model “background selection”, which is the effect of selection
against deleterious mutations upon nearby neutral sites. Background selection is, in a sense,
similar to “genetic hitchhiking”, which is the effect of selection for beneficial mutations upon
nearby neutral sites. Although they are different, both of these types of so-called “linked selection”
have been found to reduce genetic diversity in non-coding regions near genes; because mutations
within genes often have functional effects (whether positive or negative), they often exert linked
selection upon nearby neutral regions that reduces diversity. The reduction in diversity falls off as
distance to the nearest gene increases, producing a characteristic “dip in diversity” near genes
(Charlesworth et al. 1993; Hudson 1994; Sattath et al. 2011; Elyashiv et al. 2016).
The SLiM model for this is complicated only because, for higher statistical power, we want to
model many genes interspersed with many non-coding regions, so that a single run of the model
generates enough data for us to see the effect we’re interested in:
initialize() {
defineConstant("N", 10000); // pop size
defineConstant("L", 1e8);
// total chromosome length
defineConstant("L0", 200e3); // between genes
defineConstant("L1", 1e3);
// gene length
initializeTreeSeq();
initializeMutationRate(1e-7);
initializeRecombinationRate(1e-8, L-1);
initializeMutationType("m2", 0.5, "g", -(5/N), 1.0);
initializeGenomicElementType("g2", m2, 1.0);
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

353

for (start in seq(from=L0, to=L-(L0+L1), by=(L0+L1)))
initializeGenomicElement(g2, start, (start+L1)-1);
}
1 {
sim.addSubpop("p1", N);
sim.rescheduleScriptBlock(s1, 10*N, 10*N);
}
s1 10 late() {
sim.treeSeqOutput("./recipe_16.4.trees");
}

The chromosome is defined as a set of genes of length L1, separated by non-coding regions of
length L0. The model doesn’t define the non-coding regions with genomic elements; we don’t
want mutations to occur in the non-coding regions, since we will not need any neutral mutations
to conduct our analysis (and even if we did, it would be better to overlay them afterwards in
Python). A subpopulation of size N is defined, and then the final output event is rescheduled to
generation 10N (a rough guess at a coalescence time for the model; the results will not be terribly
sensitive to the accuracy of this guess since diversity near the genes will be suppressed
continuously throughout the run of the model).
To run this model and then conduct the analysis, we can run the following Python script
(assuming that we’re in the directory containing the SLiM model file):
import subprocess, msprime, pyslim
import matplotlib.pyplot as plt
import numpy as np
# Run the SLiM model and load the resulting .trees
subprocess.check_output(["slim", "-m", "-s", "0", "./recipe_16.4.slim"])
ts = pyslim.load("./recipe_16.4.trees").simplify()
# Measure the tree height at each base position
height_for_pos = np.zeros(int(ts.sequence_length))
for tree in ts.trees():
mean_height = np.mean([tree.time(root) for root in tree.roots])
left, right = map(int, tree.interval)
height_for_pos[left: right] = mean_height
# Convert heights along chromosome into heights at distances from gene
height_for_pos = height_for_pos - np.min(height_for_pos)
L, L0, L1 = int(1e8), int(200e3), int(1e3)
gene_starts = np.arange(L0, L - (L0 + L1) + 1, L0 + L1)
gene_ends = gene_starts + L1 - 1
max_d = L0 // 4
height_for_left = np.zeros(max_d)
height_for_right = np.zeros(max_d)
for d in range(max_d):
height_for_left[d] = np.mean(height_for_pos[gene_starts - d - 1])
height_for_right[d] = np.mean(height_for_pos[gene_ends + d + 1])
height_for_distance = np.hstack([height_for_left[::-1], height_for_right])
distances = np.hstack([np.arange(-max_d, 0), np.arange(1, max_d + 1)])
# Make a simple plot
plt.plot(distances, height_for_distance)
plt.show()

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

354

35000
25000

mean tree height (generations)

After importing packages, we run the SLiM model with subprocess.check_output(), which
generates the recipe_16.4.trees file. We read that file in using pyslim.load() and gather tree
heights, which are a proxy for diversity, from it; but let’s examine that process step by step.
First of all, pyslim.load() reads in the .trees file as tree sequence, which is a collection of
ancestry trees (section 1.7 introduces these concepts, but we will quickly review them here). The
tree sequence thinks of the genome as being divided into successive intervals, each of which has a
particular pattern of ancestry. Because the ancestry at adjacent positions along the genome tends
to be highly correlated, this representation makes the tree sequence very compact and efficient.
When we loop over the trees in ts.trees, we are actually looping over these chromosome
intervals, getting the ancestry tree for each successive interval.
Second, the ancestry tree for a given interval is not quite a tree in the usual sense, because it
can have multiple roots. This is because a given position will not necessarily share a common
ancestor, in forward simulation; at the beginning of simulation every individual is an island unto
itself, with no known relationship to any other individual, and ancestral relationships will be
constructed by coalescence as the model runs forward. The tree for a given interval, in a
simulation of N individuals, might therefore have anywhere from one root (if coalescence has
produced a single common ancestor for the whole population) to N roots (if no coalescence has
occurred yet). When we loop over roots in tree.roots, we are looping over common ancestors
shared by subsets of the extant population.
The code gathers tree heights across all roots for the current interval, using tree.time(root),
and uses np.mean() to get the mean height. This is our metric of diversity for the interval. Since an
interval can span more than one base position, we replicate that mean height across the tree’s
interval in height_for_pos, which we want to have a height value for each base position.
Note that in this recipe the simplify() of the loaded tree sequence is essential (whereas in
section 16.2 it was not); without it, every tree would have a root in the first generation, in one or
another original ancestor, and all the tree heights would be the same. The simplify() strips away
the original ancestors, giving us trees with roots representing the most recent common ancestors
for each tree.
With that done, we now need to convert that vector of tree heights along the chromosome into
heights at a given distance from the nearest gene. This involves a sort of descrambling of the data
based upon the genetic structure of the model, which (as described above) interleaved non-coding
regions of length L0 with genes of length L1. The details of this descrambling aren’t worth getting
into in detail; this is just data analysis, and is fairly routine Python code with no dependency upon
msprime or SLiM. The end result of this analysis is a vector named height_for_distance that has
mean heights for the corresponding distances in the vector distances. This is what we wanted, so
we can plot it (this plot is prettified using R code not shown here, but the Python plot looks
essentially the same):

-50000

0

50000

distance from gene

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

355

This plot is noisy; that could presumably be smoothed out with more genes on a longer
chromosome, a larger population size, and averaging across multiple runs of the model.
Nevertheless, the “dip in diversity” is very clear: the mean tree height drops to a clear minimum
value precisely at zero on the plot.
Note that this analysis is not based upon the patterns of neutral mutation diversity around the
simulated genes. Instead, the pattern of inheritance itself – the mean time to the most recent
common ancestor at each base position – is used to generate the plot. This is far more powerful
than using the pattern of neutral mutation diversity, because neutral mutations are sparse and
stochastic. You certainly could overlay neutral mutations onto the tree sequence, as we did in
section 16.2, and then feed that into the analysis methods used by empirical studies (and perhaps
that would be a useful thing to do, to test the power or accuracy of those empirical methods). But
the analysis here is, in a sense, the average across many such analyses – across an infinite number
of such mutation overlays, in fact – and so it is far more powerful.
This recipe shows an analysis using just one metric provided by msprime, the height of a given
root using tree.time(root). The ancestry information provided by the tree sequence could be
mined in countless other ways. In the next recipe, we will look at another example of postsimulation analysis using msprime.
16.5 Mapping admixture: analyzing ancestry in Python
It is often useful to be able to trace the true local ancestry at each location along the
chromosome. In section 13.9, a recipe was presented that provided this ability by introducing a
marker mutation at every base position along the chromosome in all of the individuals of one
subpopulation, while leaving the other subpopulation unmarked. After admixture of the two
subpopulations, the ancestry of each individual at each position could be determined by the
presence or absence of the marker mutation. That method works, but it comes at a cost of
considerable overhead in both runtime and memory usage. A chromosome of length 1e6 is close
to the practical limit for that recipe; a chromosome of length 1e8 is estimated by extrapolation to
take 7.2 days to run and – even worse – to require 8.1 TB of memory. Since the tree sequence is a
sparse data structure that records information about the ancestry at each chromosome position in a
highly compact form, one might suppose that it could provide information about true local
ancestry more efficiently, and indeed it can.
In this section we will look at a recipe very similar to that of section 13.9; refer back to that
section for further discussion of the design and motivation of that model. Here we will not use
marker mutations; instead we will use tree-sequence recording:
initialize() {
defineConstant("L", 1e8);
initializeTreeSeq();
initializeMutationRate(0);
initializeMutationType("m1", 0.5, "f", 0.1);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, L-1);
initializeRecombinationRate(1e-8);
}
1 late() {
sim.addSubpop("p1", 500);
sim.addSubpop("p2", 500);
sim.treeSeqRememberIndividuals(sim.subpopulations.individuals);
p1.genomes.addNewDrawnMutation(m1, asInteger(L * 0.2));
p2.genomes.addNewDrawnMutation(m1, asInteger(L * 0.8));
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

356

sim.addSubpop("p3", 1000);
p3.setMigrationRates(c(p1, p2), c(0.5, 0.5));
}
2 late() {
p3.setMigrationRates(c(p1, p2), c(0.0, 0.0));
p1.setSubpopulationSize(0);
p2.setSubpopulationSize(0);
}
2: late() {
if (sim.mutationsOfType(m1).size() == 0)
{
sim.treeSeqOutput("./recipe_16.5.trees");
sim.simulationFinished();
}
}
10000 late() {
stop("Did not reach fixation of beneficial alleles.");
}

Only two mutations ever exist in this model: the beneficial mutations introduced at 0.2L in p1
and at 0.8L in p2. After admixture to form p3, when both mutations have fixed, a .trees file is
written out and the simulation ends. The .trees file will then be read and analyzed in Python, as
we will see below.
The only twist here is the call to sim.treeSeqRememberIndividuals(). Our goal is to trace the
ancestry at each position in each individual to either p1 or p2. In point of fact, in SLiM 3.1 and
later this call is not strictly necessary, because the original ancestors of each subpopulation created
by addSubpop() are kept by SLiM automatically. If we did not simplify() after loading the tree
sequence with pyslim, those ancestors would be available for us to trace ancestry back to,
allowing us to determine whether a particular genomic region originated in p1 or p2. We have
shown the call here, however, for a couple of reasons: (1) it makes the way this recipe works more
explicit and less magic, (2) you may wish to remember other individuals in a simulation to trace
ancestry back to them, so showing the treeSeqRememberIndividuals() call here makes it clear
how to do that, and (3) by explicitly remembering the ancestors we can simplify() on load,
which may be desirable – we may want to have a simplified tree sequence for other purposes.
The run of the SLiM model, along with post-run analysis and plotting, is all done in a single
Python script, as before:
import subprocess, msprime, pyslim
import matplotlib.pyplot as plt
import numpy as np
# Run the SLiM model and load the resulting .trees file
subprocess.check_output(["slim", "-m", "-s", "0", "./recipe_16.5.slim"])
ts = pyslim.load("./recipe_16.5.trees").simplify()
# Load the .trees file and assess true local ancestry
breaks = np.zeros(ts.num_trees + 1)
ancestry = np.zeros(ts.num_trees + 1)
for tree in ts.trees(sample_counts=True):
subpop_sum, subpop_weights = 0, 0
for root in tree.roots:
leaves_count = tree.num_samples(root) - 1 // the root is a sample
subpop_sum += tree.population(root) * leaves_count
subpop_weights += leaves_count
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

357

breaks[tree.index] = tree.interval[0]
ancestry[tree.index] = subpop_sum / subpop_weights
breaks[-1] = ts.sequence_length
ancestry[-1] = ancestry[-2]
# Make a simple plot
plt.plot(breaks, ancestry)
plt.show()

p1

ancestry proportion

p2

After imports, the SLiM model (named recipe_16.5.slim) is run with subprocess, and
the .trees file saved by the model is then read in using pyslim.load() as usual, with a
simplify() call since we have explicitly remembered the ancestors we are interested in (see
discussion above). We then loop over the trees in the tree sequence, and over the roots for each
tree (see section 16.4 for discussion of these concepts). For each root, we want to assess ancestry
as tracing back to either p1 or p2; this can be done simply by calling tree.population(root),
since every node in the tree sequence is marked with its subpopulation of origin. We average the
ancestry of all roots for an interval to get the mean ancestry, but this is a weighted average; each
root is weighted according to the number of descendants it has, effectively computing the average
ancestry across all extant individuals, rather than across roots. The start position on the
chromosome is recorded in parallel with these calculated mean ancestry values, and a terminating
entry in both vectors brings us to the end of the chromosome.
That’s it for the analysis; this model is much simpler structurally than the previous recipe. All
that is left is to plot the data. The values in starts and ends are used as the endpoints of line
segments, interleaved with zip(); the mean ancestries in subpops are duplicated to match. (For
those not familiar with Python, these lines may seem a bit magical, sorry.) The final plot (produced
by an R script for polish, rather than by the Python code here) looks like:

0e+00

1e+08
chromosome position

This is much the same as what the recipe of section 13.9 provided. Since the two beneficial
mutations both fixed, the ancestry of the final population is pure, p1 or p2, at the points where they
were introduced. The ancestry then shades continuously (but stochastically) between those two
points due to recombination (see section 13.9 for further discussion). The SLiM model here took
only 0.415 seconds to run, however – somewhat of an improvement over 7.2 days. The post-run
analysis took about 62 seconds; looping over all of the roots of all of the trees in the tree sequence
is not entirely trivial, but is still far simpler than simulating 1e8 marker mutations in SLiM.
This recipe will generalize easily to any number of ancestral subpopulations, as long as the
original founders of each subpopulation are remembered with treeSeqRememberIndividuals() so
that ancestry can be traced back to them (of course taking the mean of the subpopulations of the
tree roots wouldn’t be the right analysis then). Since every node knows the subpopulation to
which it belonged, the ancestral history at each position can be assessed by walking up the
ancestry chain from the extant nodes to the roots, as long as one ensures that the appropriate
ancestors are remembered forever; one might remember every individual that migrates to a new
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

358

subpopulation, for example, using the migrant property of Individual, so that the migratory
history through the ancestry tree can be traced. All of this goes far beyond what would be
reasonable to implement with marker mutations using the methods of section 13.9.
16.6 Measuring the coalescence time of a model
A common problem in forward simulation is deciding how to conduct model burn-in. A burnin period is a period of simulation executed in order to arrive at an appropriate starting state for the
model one wishes to execute; often this starting state is for a neutral model at equilibrium, but not
necessarily so. But how long should the burn-in period be, to provide such an equilibrium state?
Running until coalescence ensures that no genetic diversity in the population could date back to
the start of the simulation, so all polymorphic loci derive from after the start of the simulation.
However, to attain equilibrium takes longer: in a truly neutral population of constant size, a total of
two or three times the time required to reach coalescence should suffice. But that just kicks the
can down the road; how long does it take to achieve coalescence? A heuristic of 10N is often
used: run for ten times the population size (N), in generations. But this heuristic is less than
satisfactory; the model may not have coalesced even at 10N generations, or it may have coalesced
long before (meaning wasted simulation time). And with any additional complexity, such as
multiple subpopulations connected by migration, the 10N rule is potentially even more
problematic. What to do?
Tree-sequence recording provides a very easy solution, because the ancestry information that it
records is well-suited to determining whether the model has achieved coalescence. The only
complication is that coalescence can only be evaluated immediately after simplification has been
performed; the coalescence test requires that the tree sequence be in a simplified state. SLiM
largely hides this detail from the user; you can query the coalescence state at any time, but the
answer you get will actually be the coalescence state at the time that the last simplification was
performed, not the state at the present time. Sometimes, then, one will want to exert some control
over the simplification process, so that one knows at what granularity coalescence is actually
being assessed, as described below.
Here is a recipe demonstrating the coalescence-detection technique:
initialize() {
initializeTreeSeq(checkCoalescence=T);
initializeMutationRate(0);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 1e8-1);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 500);
}
1: late() {
if (sim.treeSeqCoalesced())
{
catn(sim.generation + ": COALESCED");
sim.simulationFinished();
}
}
100000 late() {
catn("NO COALESCENCE BY GENERATION 100000");
}

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

359

This model turns on coalescence checking with checkCoalescence=T (see section 21.1;
coalescence checking needs to be turned on explicitly because there is a small performance
penalty associated with it). Then, every generation, we check for coalescence with
sim.treeSeqCoalesced(). As mentioned above, this only tells us whether coalescence was
observed at the last simplification; when auto-simplification finds that coalescence has occurred,
treeSeqCoalesced() will then return T when it is next called. Usually it is not necessary to detect
coalescence in the very generation in which it occurs; it is typically enough to simply know that it
has occurred at some recent time, as this model does.
When this model is run, typical output looks like this:
9984: COALESCED

Coalescence was probably achieved some time before generation 9984; 9984 is just the
generation in which auto-simplification noticed that coalescence. Note that this is higher than the
10N value of 5000 (and auto-simplification occurs every couple of hundred generations in this
model); so at generation 5000 the model would not yet have coalesced. The time to coalescence is
variable, depending as it does upon stochastic events, but repeated runs of this model will show
that in fact it typically coalesces around 10000 generations, or approximately 20N.
To control the granularity of coalescence checking more precisely, one may turn off autosimplification by passing simplificationRatio=INF to initializeTreeSeq(), and then call
treeSeqSimplify() to simplify on demand, after which treeSeqCoalesced() will return an up-todate assessment. Simplifying every generation is very slow, however, so simplifying and checking
at a regular interval, such as every hundredth generation, is generally preferable.
Note that certain actions in script, such as adding a new subpopulation, can break the
coalescence of a model; all of the individuals in the population no longer share a common
ancestor, even though they previously did. The value returned by treeSeqCoalesced() will not
immediately reflect this; instead, as usual, that value will reflect the coalescence state after the last
simplification that was performed. If this poses a problem, one can always explicitly simplify
immediately after such model events, forcing the coalescence state to be re-checked.
Coalescence checking will work in all types of models (with the above caveats about timing and
granularity), regardless of selection, population structure, etc. However, coalescence is only a
useful indicator of the timescale needed for equilibration in fairly simple neutral models. If model
dynamics during burn-in change over time, or are substantially non-neutral, coalescence may be a
poor indicator of equilibration; indeed, such models may not even have an equilibrium state.
What constitutes a proper burn-in for such models is a difficult question.
It is worth noting that the point of coalescence, in a forward simulation, is not itself an unbiased
or equilibrium state. In particular, as a forward simulation runs the mean tree height will grow
over time until coalescence is reached, at which point it will drop suddenly (because a more
recent common ancestor has just emerged); then it will rise steadily again until the next
coalescence, at which point it will again drop, and so forth. A plot of mean tree height over time
therefore exhibits a “saw-tooth” pattern, and the moment of any given coalescence event is the
bottom point of one saw-tooth – a special point in time, not a typical or average one. It is
therefore advisable to continue neutral burn-in well beyond the point of coalescence; this is why,
at the beginning of this subsection, we recommended running for two or three times the time
required to reach coalescence. For an even better solution to this problem of neutral burn-in, see
the “recapitation” technique described in section 16.10.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

360

16.7 Analyzing selection coefficients in Python with pyslim
So far we have used the pyslim package only minimally, to load in .trees files generated by
SLiM. The next several recipes will delve into its capabilities more. In this section, we will see
how to use pyslim to obtain SLiM-specific information about a simulation from the metadata
stored in the .trees file generated by SLiM.
Let’s begin with a simple SLiM model of beneficial and neutral mutations on a uniform
chromosome:
initialize() {
initializeTreeSeq();
initializeMutationRate(1e-10);
initializeMutationType("m1", 0.5, "g", 0.1, 0.1);
initializeMutationType("m2", 0.5, "g", -0.1, 0.1);
initializeGenomicElementType("g1", c(m1, m2), c(1.0, 1.0));
initializeGenomicElement(g1, 0, 1e8-1);
initializeRecombinationRate(1e-8);
}
1 {
sim.addSubpop("p1", 500);
}
20000 late() { sim.treeSeqOutput("./recipe_16.7.trees"); }

Beneficial mutations are drawn from a gamma DFE with a mean of 0.1, deleterious mutations
from a gamma DFE with a mean of -0.1. After the model has run for a while, the expectation
would be that the mutations generated as the model runs would indeed have those means, but that
the mutations fixed would be biased towards the higher end of the distribution for both mutation
types. This would be simple enough to evaluate in Eidos at the end of the model run, but let’s do it
in Python instead, using pyslim, as a proof on concept. Here is a Python script that assumes that a
.trees file has been saved to the path used by the model above (we’ve shown already how to run
a SLiM model from inside Python using subprocess.check_output(), so there’s no need to keep
reiterating that point):
import msprime, pyslim
ts = pyslim.load("recipe_16.7.trees").simplify()
coeffs = []
for mut in ts.mutations():
md = pyslim.decode_mutation(mut.metadata)
sel = [x.selection_coeff for x in md]
if any([s != 0 for s in sel]):
coeffs += sel
b = [x for x in coeffs if x > 0]
d = [x for x in coeffs if x < 0]
print("Beneficial: " + str(len(b)) + ", mean " + str(sum(b) / len(b)))
print("Deleterious: " + str(len(d)) + ", mean " + str(sum(d) / len(d)))

We start by loading the .trees file with pyslim.load() as usual. We then loop through all of
the mutations in the tree sequence, provided by ts.mutations(). We are interested in the
selection coefficients of the mutations, which are not part of the base information provided by
the .trees format; instead, they are stored in SLiM-specific metadata attached to each mutation.
To access them, then, we ask pyslim to decode the metadata for us, from the binary format it is in,
and return it to us, using the pyslim.decode_mutation() method. We can then fetch selection
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

361

coefficients (more on this in a moment) and append them to the coeffs list. Once we have that
list, it is a trivial matter to select the beneficial and deleterious mutations from it and print a
summary of each. A test run produces this output:
Beneficial: 3580, mean 1.0382685732466397
Deleterious: 103, mean -0.04902286211917154

A great many more beneficial mutations than deleterious mutations are present, even though
they arise at equal rates in the SLiM model, which is unsurprising. Furthermore, the mean of the
mutations present is considerably biased relative to the mean of the DFEs for these mutation types
(which were 0.1 and -0.1).
There are a few points to note here, regarding the way in which the selection coefficients are
gathered. First of all, the tree sequence retains fixed mutations, unlike SLiM. As the SLiM model
ran, mutations that fixed were converted to substitutions as usual in SLiM; but in the tree-sequence
no such conversion occurs. The list of mutations provided by the tree sequence therefore includes
both fixed and segregating mutations; no distinction is made.
Second, note that the return value from pyslim.decode_mutation() is a list, not a single object.
We end up looping through the elements in this list and getting a selection coefficient from each
element. The reason for this has to do with mutation stacking in SLiM (see section 1.5.3 for a
review of this concept). A “mutation”, as far as the tree sequence is concerned, is a unique
mutational state at a given position, which encompasses all of the mutations that have stacked at
that position. When we call pyslim.decode_mutation(), it helpfully deconstructs this stacked
state into the component mutations within it. If a particular mutation occurs in more than one
configuration in the tree sequence – by itself at a position and also stacked with another mutation
at that position, in different genomes, say – we will actually encounter that mutation more than
once during this process, and we will therefore overcount it. Since stacking will be extremely
uncommon in this model, we don’t worry about it much; it will not skew our results noticeably. If
we wanted to be more rigorous, however, we could use the mutation IDs from SLiM (also available
through pyslim) to construct a uniqued list of mutations, with each mutation counted only once
even if it is stacked with other mutations in various ways, and could then do the rest of the analysis
using that uniqued list. Since that is just elementary Python wrangling, we will not go into that
level of detail here.
16.8 Starting a hermaphroditic WF model with a coalescent history
In section 16.6, we saw how to assess whether a model has coalesced or not using
treeSeqCoalesced(), allowing us to run a model for an appropriate burn-in period rather than
relying upon the (problematic) 10N heuristic. If the burn-in period for a model is neutral and can
be run with msprime’s coalescent simulation, we can avoid running our burn-in with forward
simulation at all. Instead, we can run the burn-in period using the coalescent, save the result as
a .trees file (annotated with pyslim as needed), and load the result into SLiM where we continue
the simulation from the endpoint of the coalescent. This may sound complicated, but in fact it is
remarkably simple.
However, note that starting a simulation with the coalescent in this way is often not the best
technique. In many cases, the “recapitation” technique presented in section 16.10 is superior, for
reasons that will be explained there. This recipe is for primarily for illustration.
We begin with a Python script:
import msprime, pyslim
ts = msprime.simulate(sample_size=10000, Ne=5000, length=1e8,
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

362

mutation_rate=0.0, recombination_rate=1e-8)
slim_ts = pyslim.annotate_defaults(ts, model_type="WF", slim_generation=1)
slim_ts.dump("recipe_16.8.trees")

The msprime.simulate() method is used to run a coalescent simulation with n=2Ne=10000 (n
being a haploid sample size, Ne being in terms of diploids), a chromosome of length 1e8, and a
recombination rate of 1e-8. A mutation rate of 0.0 is used, since we only want the coalescent
history, not mutations within it (those can be overlaid later if needed). This returns a tree sequence
object, but it is in msprime’s format; it does not have any of the metadata annotations expected by
SLiM. The next step is to use pyslim.annotate_defaults() to add those annotations. We tell it to
annotate the data for a WF SLiM model, and to align the times in the tree sequence so that the
current generation has index 1; when we load the model into SLiM, it will start at generation 1
even though there are many generations of history stretching back in time. After annotating,
the .trees file is saved.
Now we can run a SLiM model that loads it and continues the run with some non-neutral
dynamics:
initialize() {
initializeTreeSeq();
initializeMutationRate(0);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", 0.1);
initializeGenomicElementType("g1", m2, 1.0);
initializeGenomicElement(g1, 0, 1e8-1);
initializeRecombinationRate(1e-8);
}
1 late() {
sim.readFromPopulationFile("recipe_16.8.trees");
target = sample(sim.subpopulations.genomes, 1);
target.addNewDrawnMutation(m2, 10000);
}
1: {
if (sim.mutationsOfType(m2).size() == 0) {
print(sim.substitutions.size() ? "FIXED" else "LOST");
sim.treeSeqOutput("recipe_16.8_II.trees");
sim.simulationFinished();
}
}
2000 { sim.simulationFinished(); }

At the end of this model (note that an initial seed of 2178208680098 produces fixation in SLiM
3.1), a new .trees file is saved out. This contains the full ancestry information for the simulation –
not just for the period simulated in SLiM, but also for the coalescent. Just to verify here that the
full history is present, let’s check the final tree sequence:
import msprime, pyslim
ts = pyslim.load("recipe_16.8_II.trees").simplify()
for tree in ts.trees():
for root in tree.roots:
print(tree.time(root))

This just prints the heights of the trees in the final tree sequence; in a test run, it prints a wide
range of values, ranging from less than 10,000 to more than 60,000. The simulation is fully
coalesced (that could be confirmed by checking that the number of roots per tree is always exactly
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

363

1),

but it reached coalescence at different times at different positions along the chromosome, so the
tree heights are not all identical. However, all of the heights are considerably larger than the 324
generations that the SLiM model ran for; the rest is the height of the coalescent history, which has
indeed been preserved. We could now overlay neutral mutations upon the final .trees file as we
did in section 16.2, or perform ancestry analyses with it such as those we did in section 16.4 and
16.5, or whatever else we might wish to do.
16.9 Starting a sexual nonWF model with a coalescent history
In the previous recipe, we saw how to run an msprime coalescent simulation as a burn-in period
and save it as a SLiM-compliant .trees file which could then be used to run non-neutral dynamics
on top of the coalescent burn-in history. The scenario presented there was very simple, however: a
hermaphroditic WF model. What if we want to use msprime to construct a coalescent burn-in for a
model complex model? We will need to annotate the tree sequence appropriately for the type of
model that we intend to load it into in SLiM. Here we will see how to do this for a sexual nonWF
model. As mentioned in the previous recipe, however, starting a simulation with the coalescent is
often not the best technique; the “recapitation” technique presented in section 16.10 is usually
preferable, for reasons discussed there. This recipe is offered for illustration of the relevant
techniques.
To get SLiM to accept the .trees file from the burn-in as legitimate input for a sexual nonWF
model, we will need to assign sexes to each individual. We will assign ages too; if we didn’t do
so, all of the individuals would be given a default age of 0, but here we would like the initial
population to have some age structure. Finally, we will tell pyslim to mark the tree sequence as
being from a nonWF simulation, so that it matches SLiM’s expectations.
The Python script for this recipe is a bit longer, naturally, since the annotation adds a bit of
complication:
import msprime, pyslim, random
ts = msprime.simulate(sample_size=10000, Ne=5000, length=1e8,
mutation_rate=0.0, recombination_rate=1e-8)
tables = ts.dump_tables()
pyslim.annotate_defaults_tables(tables, model_type="nonWF",
slim_generation=1)
individual_metadata = list(pyslim.extract_individual_metadata(tables))
for j in range(len(individual_metadata)):
individual_metadata[j].sex = random.choice(
[pyslim.INDIVIDUAL_TYPE_FEMALE, pyslim.INDIVIDUAL_TYPE_MALE])
individual_metadata[j].age = random.choice([0, 1, 2, 3, 4])
pyslim.annotate_individual_metadata(tables, individual_metadata)
slim_ts = pyslim.load_tables(tables)
slim_ts.dump("recipe_16.9.trees")

As in section 16.8, we start by running a coalescent simulation with msprime, generating a tree
sequence. We want to modify the individual metadata here, so at this point we need to switch to
msprime’s table representation with ts.dump_tables(), which unpacks the tree sequence into
modifiable tables (tree sequences are immutable, for efficiency reasons, so modifications must be
made to tables). We can then perform the default nonWF annotation on those tables using
pyslim.annotate_defaults_tables(); this is parallel to our use of pyslim.annotate_defaults()
in section 16.8. We then extract a list of individual metadata records from the tables, and loop
through it setting random sexes and ages. (The age structure used here is just a placeholder for a
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

364

more realistic distribution, of course.) Once the metadata records have been modified, they get
loaded back into the tables, and then the tables get used to create a new SLiM-annotated tree
sequence. Finally, the tree sequence gets written out to a .trees file. This is the typical workflow
when modifying metadata: convert to tables, fetch the metadata from the tables, modify it, load it
back into the tables, and finally recreate a tree sequence from the modified tables. This is
necessary because the tree sequence is an immutable object, not designed to allow modification
of this sort.
Having produced a .trees file from the script above, we can load it into a matching SLiM
model and run, as we did before. For clarity, we’ll leave every else the same about this model, so
that the essential differences can be seen more easily:
initialize() {
initializeSLiMModelType("nonWF");
initializeTreeSeq();
initializeSex("A");
initializeMutationRate(0);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeMutationType("m2", 0.5, "f", 0.1);
m2.convertToSubstitution=T;
initializeGenomicElementType("g1", m2, 1.0);
initializeGenomicElement(g1, 0, 1e8-1);
initializeRecombinationRate(1e-8);
}
reproduction(NULL, "F") {
subpop.addCrossed(individual, subpop.sampleIndividuals(1, sex="M"));
}
1 early() {
sim.readFromPopulationFile("recipe_16.9.trees");
target = sample(sim.subpopulations.genomes, 1);
target.addNewDrawnMutation(m2, 10000);
}
early() {
p0.fitnessScaling = 5000 / p0.individualCount;
}
1: late() {
if (sim.mutationsOfType(m2).size() == 0) {
print(sim.substitutions.size() ? "FIXED" else "LOST");
sim.treeSeqOutput("recipe_16.9_II.trees");
sim.simulationFinished();
}
}
2000 { sim.simulationFinished(); }

Since this is a nonWF model, we have added a minimal reproduction() callback and an
early() event that provides population regulation through density-dependence, as well as a call to
initializeSLiMModelType() (see chapter 15 for discussion). Since it is a sexual model, we also
added a call to initializeSex(). We want the model to terminate upon fixation or loss, as before,
but in nonWF models mutations are not converted to substitutions automatically, so we add
m2.convertToSubstitution=T to ensure that; of course we could just adapt the termination code to
check the frequency of the introduced mutation instead. The rest of the model is unchanged, apart
from filenames.
Note that the subpopulation loaded in from the .trees file is p0, not p1; msprime starts counting
subpopulation IDs from zero. It is conventional in SLiM, in general, to number subpopulations

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

365

from p1, but of course it doesn’t really matter; it would be possible to reassign the subpopulation
index before writing out the .trees file, but we do not do so here.
The only important caveat here is that the coalescent was not run with separate sexes, nor with
overlapping generations, so it does not precisely reflect the dynamics involved in the non-neutral
portion of the simulation. The coalescent is an approximation anyway; in fact, this would have
been a concern in the previous recipe too, if exact Wright-Fisher dynamics were needed. Whether
these sorts of deviations from exact SLiM dynamics are a concern or not will depend upon the
nature of the analysis to be conducted downstream. If it is an issue, a further burn-in period could
be run in SLiM with the correct dynamics, prior to introducing the sweep mutation; that burn-in
would perhaps not need to be very long to wipe away any important traces of the incorrect burn-in
dynamics (but it would be a good idea to confirm that with appropriate tests, of course). If
absolutely exact dynamics are needed then the burn-in will have to be simulated in SLiM; that can
still be done with tree-sequence recording with neutral mutations turned off, though, as in section
16.1, so it will still be much faster than a regular SLiM burn-in would have been.
When this model is run (an initial seed of 1661016094949 produces fixation in SLiM 3.1), we
see the sweep occur and then a new .trees file is written out. We could conduct the same postrun analysis as before, printing out tree heights to verify that this burn-in procedure worked, but to
avoid repetitiveness we will just end here. It is worth noting, in closing, that pyslim allows all sorts
of annotation; it would be possible, with a similar strategy, to mark genomes as being X or Y
chromosomes and then load them into a sex-chromosome simulation in SLiM, or to set the spatial
positions of individuals before loading them into a spatial SLiM model, or whatever is needed. Of
course, getting coalescent simulation results for such complex scenarios, even without any
selected loci, may be a challenge.
16.10 Adding a neutral burn-in after simulation with recapitation
Very often, in forward genetic simulation, a “burn-in” period of neutral dynamics is desirable to
allow the model to reach an equilibrium state of mutation–drift balance before non-neutral
dynamics begin. Without a burn-in period, the pattern of neutral mutations observed at the end of
the simulation may depend as much upon the initial state of the model as on the model’s
dynamics, which can make it difficult to interpret results. The modeling of this burn-in period is a
persistent problem, however, because it is generally so time-consuming. An often-quoted rule of
thumb is that the burn-in period should run for 10N generations – ten times the initial population
size. For a model with 100,000 individuals, then, a million generations of burn-in is recommend,
which – with such a large population size – will take an exceedingly long time. Worse, the 10N
rule is often an underestimate of the time needed to equilibrate, particularly when simulating a
very long chromosome; and there is no number of generations at which coalescence is guaranteed.
To cope with this issue, a wide variety of techniques are attempted. One might try to rescale
the model during the burn-in period (see section 5.5), although this can introduce significant
artifacts; or one might initialize the model with the output from a coalescent simulation (see
sections 16.8 and 16.9); or one might run the burn-in period without neutral mutations, using treesequence recording to preserve the ancestry information that will allow them to be overlaid late
(see sections 16.1 and 16.2). With SLiM 3.1, there is an option that is often better than any of
these, which we will examine in this section: recapitation.
Recapitation is the addition of a neutral coalescent history to a simulation after the fact. The
non-neutral portion of the simulation is run first, in SLiM, with tree-sequence recording enabled so
that the genealogical history of all extant individuals is preserved. The result is saved to a .trees
file, as in previous recipes. That .trees file is then loaded in Python, and msprime and pyslim are

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

366

used to construct a coalescent history stretching back in time from the original ancestors of the
simulation.
Enough discussion; let’s see it in action. We will begin with a SLiM model of non-neutral
dynamics, specifically a selective sweep:
initialize() {
initializeTreeSeq(simplificationRatio=INF);
initializeMutationRate(0);
initializeMutationType("m2", 0.5, "f", 1.0);
m2.convertToSubstitution = F;
initializeGenomicElementType("g1", m2, 1);
initializeGenomicElement(g1, 0, 1e6 - 1);
initializeRecombinationRate(3e-10);
}
1 late() {
sim.addSubpop("p1", 1e5);
}
100 late() {
sample(p1.genomes, 1).addNewDrawnMutation(m2, 5e5);
}
100:10000 late() {
mut = sim.mutationsOfType(m2);
if (mut.size() != 1)
stop(sim.generation + ": LOST");
else if (sum(sim.mutationFrequencies(NULL, mut)) == 1.0)
{
sim.treeSeqOutput("recipe_16.10_decap.trees");
sim.simulationFinished();
}
}

This is a pretty typical tree-sequence-based model, with a mutation rate of zero. It involves a
fairly large population size (1e5), and a reasonably long chromosome (1e6), so running the neutral
burn-in for this model in SLiM would take quite a long time. Incidentally, the size of this
simulation also means that simplification can take quite a long time, and so as well as recapitating,
we have also chosen to speed up the simulation by setting the simplificationRatio parameter of
initializeTreeSeq() to INF. We have not discussed the simplificationRatio option much
before: in general, it controls how often simplification occurs, and a value specifically of INF tells
SLiM not to simplify the tree sequence at all until a .trees file is generated (see section 21.1).
Although this speeds up the model’s execution considerably, it can greatly increase memory usage,
so it should be done with care; it is easy to overflow the memory available to the process. As this
particular example has a short runtime, we’re probably safe to trade off memory to achieve
maximum simulation speed, but for longer-running simulations you will probably want to tune the
simplification interval as appropriate before recapitating.
This model involves a selective sweep, introduced in generation 100. We do not attempt to run
this simulation conditional on fixation (see chapter 10); for simplicity, we just detect fixation or
loss in the 100:10000 late() event. On fixation, the model dumps a .trees file and stops.
So now, if we run the model until we get a run in which the sweep mutation fixes, we have a
file named recipe_16.10_decap.trees that contains the genealogical history of that run. Now we
run a Python script that performs the recapitation:
import msprime, pyslim
import numpy as np
import matplotlib.pyplot as plt
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

367

# Load the .trees file
ts = pyslim.load("recipe_16.10_decap.trees")

# no simplify!

# Calculate tree heights, giving uncoalesced sites the maximum time
def tree_heights(ts):
heights = np.zeros(ts.num_trees + 1)
for tree in ts.trees():
if tree.num_roots > 1: # not fully coalesced
heights[tree.index] = ts.slim_generation
else:
children = tree.children(tree.root)
real_root = tree.root if len(children) > 1 else children[0]
heights[tree.index] = tree.time(real_root)
heights[-1] = heights[-2] # repeat the last entry for plotting
return heights
# Plot tree heights before recapitation
breakpoints = list(ts.breakpoints())
heights = tree_heights(ts)
plt.step(breakpoints, heights, where='post')
plt.show()
# Recapitate!
recap = ts.recapitate(recombination_rate=3e-10, Ne=1e5, random_seed=1)
recap.dump("recipe_16.10_recap.trees")
# Plot the tree heights after recapitation
breakpoints = list(recap.breakpoints())
heights = tree_heights(recap)
plt.step(breakpoints, heights, where='post')
plt.show()

The first step is to load the .trees file. Note that we specifically do not call simplify() here,
because we need the first generation individuals to recapitate from; this is, in fact, precisely why
SLiM 3.1 preserves those individuals for us.
Then we define a function that calculates tree heights along the chromosome. This is similar to
what we did in section 16.4, but here we have to be a bit smarter because of those original
ancestors preserved in the tree sequence. Every tree will have a root in one of those original
ancestors, but we are interested in whether the tree has coalesced below that original ancestor
(and if so, at what height), or if the tree still has multiple roots, indicating that it has not yet
coalesced.
We use that function to plot tree heights along the chromosome before recapitation. Then we
recapitate, which is very easy – just a single method call on our tree sequence, returning a new
tree sequence that has had a coalescent history traced backward from its original ancestors. The
recapitation process just needs to know the recombination rate to use, and the population size (Ne)
to use for the coalescent. Finally, we plot again, after recapitation, to see what it has done. A
combined plot of these results, made in R, looks like this:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

368

1e6
1e5
1e4
1

mean tree height (generations)

0e+00

1e+06
chromosome position

Note that the y-axis is rescaled according to the cube root of the number of generations, to
bring out detail at both the low and high ends of the scale. The red line here shows the tree
heights along the chromosome prior to recapitation. The area surrounding the sweep has
coalesced at the generation in which the sweep mutation was introduced, due to strong
hitchhiking in the vicinity of the mutation. The area further out has not coalesced, and therefore
has a tree height that dates back to the beginning of forward simulation, 100 generations before
the sweep began. The black line shows tree heights after recapitation; here the uncoalesced
regions farther from the sweep have been coalesced backward in time as far as a million
generations (reflecting the fact that we would have needed to run the burn-in in SLiM for at least a
million generations to have any likelihood of coalescence). Regions closer to the sweep tend to
coalesce more recently, reflecting the effect of the sweep on the diversity present at different
regions along the chromosome at the end of simulation.
In short, we can see that recapitation has provided us with a full neutral burn-in history for the
simulation, after the fact. Notably, this is very fast; this example run executed in 0.41 seconds. If
we wanted neutral mutations overlaid over the whole population history, including the recapitated
burn-in, that would also be extremely fast; using the technique shown in section 16.2, that
operation would take another 0.58 seconds. These times can be compared to an estimate,
obtained by extrapolation, of how long the burn-in would take in SLiM: more than 114 hours.
Recapitation, then, is optimal for computing burn-in for several reasons. One reason is that it is
extremely fast, as we have seen; the only branches that need to be coalesced are those leading to
the final individuals present at the end of forward simulation, which is generally a small minority
of the branches that were present at the beginning of forward simulation. Recapitation is therefore
much, much faster than simulating the burn-in period in SLiM, and should even be faster than
constructing an initial population state with the coalescent; it is based upon the coalescent, but
needs to do much less work since far fewer branches typically need to be coalesced. Then too,
recapitation is very convenient, since it can be done after forward simulation, even as an
afterthought on existing model output. In this way, it allows a focus on the non-neutral dynamics
of interest, with burn-in handled later. It also allows one to ignore the question of how long to run
burn-in for – the whole “10N” question – since recapitation will coalesce back in time as far as
necessary to achieve full coalescence.
Of course, recapitation can only be used in scenarios where a neutral coalescent process is
appropriate for burn-in. In some cases the burn-in period needs to itself be non-neutral; in this
case using tree-sequence recording to run without neutral mutations may be the best one can do
(see sections 16.1 and 16.2). But for many models, recapitation will enable modeling at a larger
scale than previously possible, by greatly reducing the overhead of the burn-in period.
One point was glossed over in the discussion above: why was the sweep mutation introduced in
generation 100, and not, say, in generation 2 immediately after the simulation has started? The
easy answer is: to make the plot look nice. In the mean tree height plot above, there is a nice stairTOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

369

step in the red line (showing mean tree height prior to recapitation), and that bump is there
because of the delay between starting forward simulation and introducing the sweep mutation.
But there is a deeper benefit as well: running a little bit of forward simulation after recapitation can
smooth out differences introduced by the coalescent burn-in. The standard Kingman coalescent,
as simulated here via the recapitate() command, produces evolutionary dynamics that are
extremely similar to a neutral forward Wright–Fisher simulation such as the model shown above;
but the two are not, in fact, exactly identical, and small differences introduced by the use of the
coalescent could conceivably produce detectable bias in simulation results if non-neutral
dynamics commenced immediately after the end of the coalescent. Doing a little extra neutral
burn-in after the recapitated period shuffles things around so that any such bias is minimized.
Indeed, this technique can sometimes be used even to approximate a non-neutral burn-in
period using the coalescent; one can forward-simulate the non-neutral burn-in dynamics in SLiM
for some relatively short period of time (like 100 or 1000 generations), and then add a coalescent
history to that using recapitation as a sort of convenient approximation to the truth. This is
obviously not ideal; but if forward-simulating the full non-neutral burn-in period would take a
prohibitively long time, it may be the only option available, and it may, for some models, prove to
produce acceptable results. Of course any approximation like this should be carefully tested to
ensure that any bias or artifacts introduced by it are within acceptable bounds.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

370

17. Runtime control
In this chapter we’ll look at topics related to what might be called “runtime control”:
manipulating the random number generator, saving simulation snapshots and data to files,
executing “lambdas” as a way of making code dynamic and/or reusable, and debugging your
models in SLiMgui. We will no longer be looking at complete recipes for models as we did in the
previous chapters; instead, we will use shorter snippets of code to briefly explore a few topics
likely to come up when you are developing models of your own.
17.1 The random number generator
We have encountered the (pseudo-)random number generator in various recipes, but it is worth
focusing our attention on it briefly since random numbers are generally quite important for SLiM
simulations.
SLiM and Eidos use a random number generator that is part of the GNU Scientific Library (GSL).
More specifically, at present the GSL’s taus2 random number generator is used. It is possible to
change this by modifying Eidos’s internal code, but it is not recommended since Eidos and SLiM
rely heavily on various properties of this generator, such as the number of random bits it outputs
per draw, and the independence of the bits within each draw. It is a good generator with a long
period, and should be suitable for most purposes. Starting in SLiM 3.0, a 64-bit Mersenne Twister
generator is also used, for purposes where 64-bit random numbers are needed (since the taus2
generator is only a 32-bit generator, but is significantly faster than the Mersenne Twister generator);
the seeds of the two generators are synchronized, and the fact that there are two independent
generators used under the hood should be essentially invisible to users of SLiM and Eidos; it can,
for practical purposes, be considered a single generator.
The random number generator is seeded at the very beginning of each simulation run (each
press of the Recycle button, in SLiMgui) using a combination of the current Un*x process ID and
the clock time. This is not a particularly robust way to seed the generator; if you are doing
production runs of a model, you will generally want to ensure that each run uses a different seed
value. Perhaps the simplest way to ensure this is to pass a seed value to SLiM on the command
line via the -seed or -s option; for example:
slim -seed 12345 ./script.txt

If you write a Un*x script (or an R script, or whatever) to launch all of your SLiM model runs,
that script can use this command-line option to set up each model run with a distinct seed value.
Within the code for a model, it is also possible to manipulate the random number generator’s
seed value; the recipe in section 10.2 did this in order to re-run the model from the same starting
point with different seeds, for example. The getSeed() function returns the last seed value set, and
the setSeed() function sets a new seed value. Note that calling setSeed(getSeed()) generally
changes the state of the random number generator; the seed returned by getSeed() is the last seed
value set to initiate a random sequence, but since random numbers may have been generated
since that seed was set, the seed does not necessarily reflect the current state of the generator.
The recipe of section 10.2, and some other recipes in the cookbook, change the random
number seed with a particular formula:
setSeed(rdunif(1, 0, asInteger(2^32) - 1));

This is a useful way to change to a new seed (allowing the subsequent run to be reproduced
directly by starting at the saved simulation point with the same new seed value) while still
preserving a dependency upon the original seed value. These recipes used to do
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

371

instead, but that could be problematic if a set of replicate runs are done
using sequential initial seed values; the replicate run starting with seed 1 will graduate to
successive seeds of 2, 3, etc., and will thus end up using exactly the same pseudorandom
sequence as the replicate runs that started with (or graduated to) those higher seed values. Since
the identical pseudorandom sequences would be used in conjunction with a different simulation
state, it might not end up mattering; but there is no point in taking the risk of accidental correlation
between runs, so it is best to follow this new strategy, which should avoid this issue. (Thanks to
Matthew Hartfield for pointing this out.) The rdunif() call generates a seed within the unsigned
32-bit range of the taus2 generator; values outside this range would still work (they are taken
modulo 232 anyway), but this range makes the intent of the code clear.
When running a SLiM model, the seed value set when the simulation is initialized is
automatically printed to the output. For example, you might see:
setSeed(getSeed() + 1))

// Initial random seed:
1455193095666

If a particular model run catches your interest and you wish to reproduce it, you can copy that
initial seed value and add a statement to the beginning of your model’s initialize() callback
like:
setSeed(1455193095666);

If you recycle the model after adding such a line, you will see a different initial seed value
printed; but after you step over the initialize() phase, you should see the model repeat its
previous sequence, because the different initial seed value was replaced by the setSeed() call.
17.2 Defining constants on the command-line
It is common to want to run a model many times, perhaps varying some of the basic parameters
that define the model, or perhaps just varying the random number seed in a predictable way in
order to produce independent replicates of the simulation. To make this simpler, SLiM provides a
command-line option, -define (or just -d) that allows you to specify definitions for Eidos constants
that your script can then use. For example, consider this simple script:
initialize() {
setSeed(seed);
initializeMutationRate(mu);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(r);
}
1 {
sim.addSubpop("p1", N);
}
2000 late() { sim.outputFixedMutations(); }

This is just a simple neutral drift model, as we have used for the basis of many recipes in this
cookbook. However, it references four undefined variables: the random number seed seed, the
mutation rate mu, the recombination rate r, and the population size N. Run “as is”, the script
would therefore fail with an “undefined identifier” error.
But when invoking this model at the command line, these values can be defined as Eidos
constants using the -d command-line option. Assuming that slim and the script test_defines.txt
are at findable paths, this would be an example invocation of the model:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

372

slim -d seed=7 -d mu=1e-7 -d r=1e-8 -d N=500 test_defines.txt

The values would then be defined and available to the script. (Note that setting the random
number seed at the command line can also be achieved with the -seed command-line option; see
section 17.1). With this mechanism, it is easy to set up a script that runs a large number of slim
jobs on a computing cluster, for example, to produce replicated runs across a whole parameter
space. A call to cat() at the beginning of your script could output the values for all of the
constants defined externally, so that the output from each run of SLiM is marked with the
parameter values that generated it.
Incidentally, one might wish to write the above model such that the final generation depends
upon the defined value of N; it is common, for example, to run a neutral burn-in period for 10N
generations (although this practice is not without problems). It would be nice to be able to write:
10*N late() { sim.outputFixedMutations(); }

Unfortunately, this does not work; the generation range for script blocks must use simple
integer constants. (This is difficult to fix because “constants” like N are not, in fact, necessarily
constant – they can be removed with rm() and redefined – and because the values of constants
may not be known at the time the script is parsed, such as when they are defined with
defineConstant().) Instead, the standard “trick” is to define the script block with a symbol, like
this:
s1 2000 late() { sim.outputFixedMutations(); }

The symbol s1 now refers to this script block. Then the script block can easily be rescheduled
to the desired generation (or generation range) using the rescheduleScriptBlock() method, as in
this rewritten generation 1 early() event:
1 {
sim.addSubpop("p1", N);
sim.rescheduleScriptBlock(s1, start=10*N, end=10*N);
}

The s1 event will now run in generation 10*N as desired. Note that the generation declared for
s1 – 2000, here – is no longer important, since s1 will be rescheduled anyway. The only caveat is
that the declared generation should be after the generation in which the rescheduling occurs, to
ensure that s1 does not get executed before the rescheduling takes place. See section 22.1 for
further discussion on the declaration of event script blocks.
One might wish to supply constants definitions in this manner when running at the command
line, but have the model still run properly under SLiMgui, with default values for constants, rather
than producing an “undefined identifier” error. Here is a version of the initialize() callback for
the above model that achieves this:
initialize() {
if (exists("slimgui"))
{
defineConstant("seed", 1);
defineConstant("mu", 1e-7);
defineConstant("r", 1e-8);
defineConstant("N", 1000);
}
setSeed(seed);
initializeMutationRate(mu);
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

373

initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(r);
}

The slimgui object, which provides an Eidos interface to SLiMgui (see section 21.11), is defined
only when running under SLiMgui, so the constants will be defined only in that case.
One could similarly use exists() to test for the existence of each constant, and define their
values only if they are undefined; this would allow the model to run at the command line with no
-d definitions supplied, using default values, but would allow those default values to be overridden
with a -d command-line flag when desired. For example, this defines seed only if it has not
already been supplied:
if (!exists("seed"))
defineConstant("seed", 1);

Defined constants can be of type logical, integer, float, or string; defining string constants
probably requires playing quoting games with your Un*x shell, such as:
slim -d "foo='bar'" test.txt

The fact that Eidos strings such as 'bar' can be enclosed in either single or double quotes
comes in useful here. If the -l[ong] command-line option is supplied (turning on more verbose
output from SLiM), the argument for each -d[efine] command-line option will be printed as it
was received by SLiM after processing by the shell, making it somewhat simpler to diagnose
quoting issues.
In fact, the values for defined constants can be any Eidos expression, and can even reference
previously defined constants. For example, one may accomplish scaling of model parameters by
the population size with an invocation such as:
slim -d N=1000 -d THETA=5 -d "mu=THETA/(4*N)" model.txt

Here the mutation rate mu is calculated from the population size N and the scaled model
parameter THETA. Note that with such expressions, as with string literals, quoting is probably
necessary to avoid issues with your Un*x command shell, as shown above. Expressions of any
kind are allowed, including calls to Eidos functions, and the result of the expression need not be a
singleton. The random number generator will be initialized (with the supplied -seed value, if any;
see section 17.1) before the command-line definition expressions are evaluated, so functions that
rely upon random numbers will use values based upon the model’s initial seed.
It is a good idea to use unique names for defined constants that do not collide with any symbols
defined by Eidos or SLiM. Some such names will be flagged as errors; others, such as collisions
with the names of pseudo-parameters that SLiM provides to callbacks, may just cause confusion.
You might wish to employ a naming convention for your constants to avoid all possibility of
collisions, such as using names that begin with d_.
See section 13.8 for an example of using this feature with ABC-MCMC parameter estimation.
17.3 Other command-line options
Previous sections have touched upon a few of the command-line options for SLiM. In
particular, section 17.1 showed how to pass a seed for the random number generator using the -s
or -seed option, and section 17.2 showed how to define Eidos constants on the command line
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

374

with the -d or -define option. There are a few other commend-line options supported by SLiM,
which we will briefly summarize here.
First of all, there are options that are not related to running simulations. The -v or -version
option prints out the version number of the slim executable you are using:
$ slim -v
SLiM version 3.2, built Nov

6 2018 09:41:03

The -u or -usage option causes SLiM to print out a summary of how it can be invoked on the
command line (i.e., more or less the same information we’re summarizing here), in case you forget
an option and want a quick reminder.
The -testEidos or -te option makes SLiM run a self-test of the Eidos language interpreter.
Similarly, the -testSLiM or -ts option runs a self-test of the SLiM core. Typically you would use
these after building SLiM to verify that the built executable is functioning properly (see section
2.4):
$ slim -te
SUCCESS count: 5439
$ slim -ts
SUCCESS count: 68424

The success counts are not important, and will depend upon the version of SLiM being run; the
important thing is that no failed tests are reported.
Then there are command-line options that influence a simulation run in some way. Setting the
random number seed with the -s / -seed option, and defining Eidos constants at the command
line with the -d / -define option, have already been discussed; several other options also exist.
The -l or -long option enables “long” output, which provides additional information about the
run. At the moment this is primarily useful for getting information about mutation run usage (see
section 18.4); other long output may be added in future.
The -t or -time option enables output of the total runtime of the simulation (see section 18.2).
When this option is enabled, a final line will be printed showing the CPU usage of the process (in
seconds), like:
// ********** CPU time used: 0.11412

The -m or -mem option similarly enables output about the memory usage of the simulation (see
section 18.3). When this option is enabled, final lines will be printed showing the initial and peak
memory usage of the simulation (in bytes, K, and MB), like:
// ********** Initial memory usage: 1069056 bytes (1044K, 1.01953MB)
// ********** Peak memory usage: 4325376 bytes (4224K, 4.125MB)

The -M or -Memhist option provides more extensive output regarding the memory usage of the
process as it runs, in the form of a final dump of usage statistics encapsulated within R code that
can be copied and pasted into an R interpreter to produce a plot of memory usage over time (see
section 18.3). This is not likely to be useful to end users; it is mostly a tool for debugging SLiM
itself.
The -x option disables some of SLiM’s runtime checks for consistency and safety. It is not
recommended for general use, since it may mean that error conditions are not caught and
reported. However, it may occasionally be useful if one of SLiM’s runtime checks is actually faulty
and needs to be disabled (see section 18.3).

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

375

When running a simulation, the path to the SLiM script file is typically supplied at the end of
the command line. After SLiM 3.2, this may be omitted if a script file is instead piped in to SLiM’s
standard input (“stdin”, in Un*x parlance). In other words, this invocation of SLiM:
$ slim ~/Desktop/foo.slim

may instead be written as:
$ cat ~/Desktop/foo.slim | slim

This allows the input script to be assembled dynamically (in Python, say) and passed to SLiM via
stdin, without having to create a temporary script file on disk to pass to SLiM.
17.4 File input and output
Output generated by a simulation with print(), cat(), and similar output functions goes into
SLiM’s output stream, which (assuming you are running at the command line) can be redirected to
a file to save the results of the model run. Often, however, you will want a model to explicitly
write output to a file. Eidos provides a few simple functions for file input and output, as well as
operators and functions for string manipulation, and SLiM adds to those capabilities with some
model-specific output functions. It should be possible for you to use these facilities to achieve
whatever sort of customized output you wish; it is trivial, for example, to output information about
the current state of a model as a comma-separated value (CSV) file that can be read by R and other
such software in order to perform further analysis.
The simplest way to output simulation state to a file is to use the SLiMSim method outputFull(),
such as by calling:
sim.outputFull("~/model_output.txt");

This produces a population dump in a standard format that is compatible with the SLiMSim
method readFromPopulationFile(); these two methods can thus be used to save and restore the
population state at any point in time, as discussed further in section 10.2.
But suppose this standard output format is not suitable; instead, you want to save a CSV file
with a list of all of the mutations currently active in the simulation, listing their positions and
selection coefficients. A recipe to achieve that might look like:
lines = NULL;
for (mut in sim.mutations)
{
mutLine = paste(c(mut.position, ", ", mut.selectionCoeff, "\n"), "");
lines = c(lines, mutLine);
}
file = paste(lines, "");
file = "position, selcoeff\n" + file;
if (!writeFile("~/out.txt", file))
stop("Error writing file.");

This recipe assembles a vector of output lines, naturally called lines, starting with NULL and
adding new lines with c(lines, mutLine). A for loop is used to loop over all the mutations in the
simulation, and for each mutation an output line is assembled using paste(). Each line ends with
a newline pasted on to the end with "\n".
After the for loop, the next line uses paste() to join all of the lines produced by the loop into a
single string, and the following line prepends a header line with column names for the output. At
this point, file contains the string to be written to a file in the filesystem. Writing it out is
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

376

achieved simply by calling writeFile(), passing the filesystem path for the first parameter and the
string to write as the second. The writeFile() function returns a logical value: T if the file was
written successfully, or F if a filesystem error occurred. It is a good idea to check the return value,
as shown here, so that you are aware if your model’s output is not working as expected.
The recipe above will work fine, but the use of the for loop is quite inefficient and is not true
Eidos style; Eidos is a vectorized language, and it is generally better to use vectorized solutions
when possible. Similarly, assembling a long vector with a series of c() calls to add one element at
a time is highly inefficient in Eidos, requiring memory allocation and bulk copying of data with
each successive addition. A better recipe to achieve the same result, then, would be:
lines = sapply(sim.mutations, "paste(c(applyValue.position, ', ',
applyValue.selectionCoeff, '\\n'), '');");
file = paste(lines, "");
file = "position, selcoeff\n" + file;
if (!writeFile("~/out.txt", file))
stop("Error writing file.");

The first statement (wrapped onto two lines in the code shown above, but entered as a single
line of code without the newline in the middle of the string literal) uses the sapply() function to
assemble a vector containing string output lines for each mutation in the simulation.
Conceptually, this is similar to the for loop across sim.mutations in the previous example, with
applyValue as the index variable for the loop, and executing the Eidos code within sapply()’s
second parameter as the for loop’s body. However, sapply() also collects the result of each
iteration of the loop and assembles all of those results in a single vector, here assigned into the
variable lines; it thus does the work that lines = NULL and c(lines, mutLine) did above, but
much more efficiently. The sapply() function is immensely useful for bulk processing of data, and
should become a part of every Eidos programmer’s toolbox; it is documented in detail in the Eidos
manual.
Note that because we want the code executed by sapply() to paste a newline at the end of
each line using the escape sequence \n, we have to escape the backslash; \\n in the original code
becomes \n in the string literal representing the code run by sapply(), which becomes a newline
in the string literal passed to paste(). Similarly, the double-quoted parameters to paste() in the
first version of the recipe are now single-quoted, allowing the quotes to nest without escaping
issues. Confusing, yes; section 17.5 below will show a different approach that may be clearer and
easier.
17.5 Lambda execution
A few recipes have touched on the ability of Eidos to execute code that was dynamically
assembled as a string. For example, section 10.5.3 added new script blocks to the running
simulation to schedule future mutation events, and section 17.4 passed a string representing an
executable code block to the sapply() function. Such dynamic execution of code is a relatively
advanced technique, but can be very powerful.
The first step in this technique is simply to assemble a string containing the Eidos code you wish
to execute. This is generally done with the + operator, which performs string concatenation when
given at least one string operand, and the paste() function, which pastes together a vector of
strings joined by a fixed separator string. Sometimes, as in section 17.4, the string can even just be
a literal; this technique does not necessarily involve dynamically generated code. Usually the
main difficulties in this step involve correct escaping of special characters, and the related issue of
quoting and nested quotes. The fact that Eidos allows strings to be quoted with either single or
double quotes can be helpful; see, for example, the way that section 17.4 avoids having to escape
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

377

quote characters by using single quotes inside the double-quoted code string. Another useful
technique for handling quoting and escaping difficulties is the so-called “here document” style of
string literal in Eidos (a terrible term, which I did not invent), documented in the Eidos manual. For
example, section 17.4 used a small snippet of code:
lines = sapply(sim.mutations, "paste(c(applyValue.position, ', ',
applyValue.selectionCoeff, '\\n'), '');");

This could be rewritten with the “here document” style as:
lines = sapply(sim.mutations, <<--paste(c(applyValue.position, ", ", applyValue.selectionCoeff, "\n"), "");
>>---);

Note that double quotes can now be used without any escaping issues, and that the newline
escape sequence desired in the string literal can be given directly as \n, without the necessity of
quoting it as \\n to prevent an actual newline character from being inserted in the string. It is
worth getting used to the “here document” string literal style; it will simplify your life
tremendously if you work with Eidos code as string literals.
In any case, once the code string in this example is assembled, it is passed to sapply(), and
sapply() treats it as Eidos code – the string gets tokenized, parsed, and interpreted just as the
literal code in your model’s script is tokenized, parsed, and interpreted. The sapply() function is
one of several functions in Eidos that executes a string as code in this manner; another is the
executeLambda() function, which simply executes the string it is passed as code. This can be
used, to some extent, in a similar way to how functions are used in many other programming
languages, but with a more dynamic twist. For example, suppose we wanted to write code that
performed an operation on pairs of operands – but we don’t know what operation we want to
perform ahead of time. It might be addition, subtraction, multiplication, division, or
exponentiation; all we have is a string indicating the desired operation. Without
executeLambda(), we would have to write:
[... code that sets up operand1, operand2, and operator somehow ...]
if (operator == "+")
result = operand1 + operand2;
else if (operator == "-")
result = operand1 - operand2;
else if (operator == "*")
result = operand1 * operand2;
else if (operator == "/")
result = operand1 / operand2;
else if (operator == "^")
result = operand1 ^ operand2;

With executeLambda(), this becomes trivial:
[... code that sets up operand1, operand2, and operator somehow ...]
result = executeLambda(paste(c("operand1", operator, "operand2;")));

This is a rather contrived example, admittedly, but such situations do arise from time to time; it
is worth being aware of this facility.
SLiM leverages this facility in Eidos to allow you to add new script blocks to a simulation
dynamically, with code that can be assembled dynamically as a string and passed in to the SLiM
engine. Both Eidos events and callbacks can be added, using the SLiMSim methods
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

378

registerEarlyEvent(), registerLateEvent(), registerFitnessCallback(),
registerMateChoiceCallback(), registerModifyChildCallback(), and
registerRecombinationCallback().

17.6 Debugging
One of the weaker points of the Eidos language at present is its facilities for debugging. There is
no runtime debugger for Eidos, no ability to pause executing code at an arbitrary point in order to
examine variables, no breakpoints or watchpoints. However, Eidos does nevertheless have some
facilities for debugging that you should be aware of.
The first is simply the ability to print to the output console at any point in your code. If you are
wondering whether a calculation you have written produces the correct value, simply add a call to
print() or cat() to see the value in the output. This is primitive, admittedly, but still powerful,
and in fact is often quicker and simpler to produce an answer than a debugger would be.
The variable browser, opened with the Show Eidos Variable Browser button
in SLiMgui’s
main window, is a useful debugging facility. This is described in section 3.4, and is quite useful
since it can browse into SLiM’s state as well as your own variables. Unfortunately it can only show
you the state of things in between generations; there is no way to pause while inside the execution
of a fitness() callback, say, and browse the variables at that moment in time. Nevertheless, it is a
powerful tool.
The console window, opened with the Show Eidos Console button
in SLiMgui’s main
window, provides a third debugging facility. This is described in section 3.3. In the console
window, you can work with Eidos interactively, defining your own variables and executing your
own code. All of SLiM’s top-level variables are available to you in the console, so you can test out
code that manipulates genomes and mutations and so forth. If you set up some dummy variables
with the same names that SLiM uses in callbacks (such as mut, homozygous, relFitness, etc., for a
fitness() callback), you can test out your callback code interactively in the console. It’s not quite
the same as working in a debugger, but it’s not bad. Note that all of the variables you define in the
console are visible in the variable browser, too. You can even use the console window to change
the state of the running SLiM simulation; code executed in the console is executed in the same
Eidos context as the simulation to which the console is attached.
Often the help documentation, available through the Script Help button ? , is an overlooked
tool for debugging. If your code isn’t doing what you think it should do, often the best approach is
to examine your underlying assumptions: is your mental model correct regarding all of the objects,
methods, properties, functions, etc., used by your code? There is a reason why RTFM is often the
first debugging advice given by experienced programmers. The help window is documented
further in section 3.2.
A key step in debugging is often to find a reproducible case. A bug that crops up rarely and
unpredictably can be terribly hard to track down; a bug that you can make happen over and over,
while examining it from different angles, is usually much more tractable. In SLiM, since so much
of what happens is usually guided by the random number generator, using setSeed() to make a
simulation follow the same path in each run is often an important step in debugging. See section
17.1 for more information on working with the random number generator.
Debugging is never easy; it is detective work, often reminiscent of the thought process
advocated by Sherlock Holmes: eliminate hypotheses one by one until only one possibility
remains, and that must be the answer. Good luck!

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

379

18. Implementation and performance
Following the lead of the previous chapter, this chapter will also present shorter snippets of
Eidos code, rather than complete models, to illustrate selected topics in model development. In
this chapter, we will focus on topics related to technical considerations in model implementation
and performance: speed, memory usage, and evaluation of performance.
18.1 Writing fast SLiM simulations
Evolutionary simulations are often limited not by ideas (there are always lots of interesting
questions to explore), and not by the difficulty of writing the models to test those ideas (especially
when using tools such as SLiM and Eidos that facilitate that process), but rather by time and
processing power. Individual-based modeling is computationally intensive, and since it is
generally not practical to spend years of computer time on a single problem, it often becomes
necessary to limit the scope of one’s investigation. Naturally this is undesirable, and so squeezing
every last bit of speed out of one’s simulation is beneficial; it may allow you to ask a broader
question, or explore a larger parameter space, or use a larger population size.
When using a high-level modeling framework like SLiM, the most important thing in gaining
high performance is to understand the design of the framework; often there are different ways to
solve the same problem that provide vastly different performance. For example, SLiM allows you
to define a fitness() callback with an optional subpopulation constraint. Without using that
optional constraint, you might write:
fitness(m2) {
if (subpop == p2)
return 0.5;
else
return relFitness;
}

Using the optional constraint feature, you would instead write:
fitness(m2, p2) {
return 0.5;
}

These are identical in their behavior, but the second version will perform orders of magnitude
(literally) better than the first version. There are a bunch of reasons for this. The first version
requires that the fitness() callback be called for every mutation in every subpopulation, rather
than just for the mutations in subpopulation p2, and the setup and teardown costs for running a
callback are non-trivial, so that in itself has a large impact. Looking up the values of variables such
as relFitness and subpop is also time-consuming. Doing the (subpop == p2) test in interpreted
Eidos code is vastly slower than doing the same test internally in SLiM, since SLiM’s core code is
compiled and optimized C++. Even beyond all of those considerations, however, there is also the
fact that the second callback above is specifically optimized by SLiM: a callback that does nothing
except return a constant value gets short-circuited by SLiM’s internal machinery completely. In
that case, there is no setup and teardown for an Eidos interpreter to run the callback, because there
is no interpreted execution of the callback’s code at all; SLiM is smart enough to know to simply
use the value 0.5 for the relative fitness in the cases where that callback would execute. The same
optimization cannot be done for the first version.
To some extent, these sorts of implementation details of the SLiM engine are private and should
not be relied upon; they might change from version to version of SLiM. But an overall take-home
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

380

point is clear: whenever possible, write as little Eidos code as possible, and let the internals of
SLiM and Eidos do as much work for you as possible.
There are lots of other examples of this principle. For example, you should use the vectorized
facilities of Eidos whenever possible. To add a bunch of numbers, use one call to sum(), not a for
loop performing sequential addition with the + operator. Similarly, to perform some repetitive
operation on every element of a vector, consider using sapply() instead of a for loop. If the
repetitive operation involves a conditional – if you’re considering writing a for loop with an if
statement inside it – consider using ifelse() instead (or at least, again, using sapply()). In
general, it is safe to say that whenever you start writing a for loop in Eidos you should stop and
ask yourself, “Can this be vectorized instead?”
Another example is the problem of looking up mutations of a given type in SLiM. Often you
want to get a list of all mutations in the simulation that are of type m2 (for example). To do this, the
natural Eidos idiom, as in R, would be to subset with a logical vector, like:
muts = sim.mutations[sim.mutations.mutationType == m2];

To execute this statement for a simulation with N active mutations, Eidos must (1) fetch the
property from sim, assembling a vector of size N, (2) fetch the mutationType property of
that vector, constructing a new vector with size N, (3) do an == comparison with m2, constructing a
logical vector of size N, (4) fetch the mutations property from sim a second time, again
assembling a vector of size N, and (6) do a size-N subset operation with the [] operator, using the
two vectors constructed in the previous steps, to produce the final result. This is a huge amount of
work, often to construct a result vector that might have just one element in it (since often the
mutation type of interest involves a single introduced mutation). Precisely because this is such a
common task, SLiM provides a shortcut:
mutations

muts = sim.mutationsOfType(m2);

Conceptually, this does the same thing as the previous version, but it does it in C++, inside
SLiM’s internal code, and that allows it to be orders of magnitude faster. When SLiM offers you
facilities like this, use them! On SLiMSim, another such method is countOfMutationsOfType(),
which is even faster than taking the size() of the result of mutationsOfType() if you just need to
know how many there are without getting the objects themselves. Similar methods exist on some
other SLiM classes, such as Genome.
Eidos itself also has quite a few such time-saving functions, from standard fare like min() and
max() to more esoteric functions like cumProduct(), unique(), and sapply(). Indeed, most of the
functions defined by Eidos are technically unnecessary; the tasks they perform could be written in
pure Eidos code without the use of any functions. Eidos nevertheless provides you with a large
library of predefined functions, for convenience, clarity, reuse, and speed; learn and use this
toolkit. If you find yourself writing out some operation in lengthy and inefficient Eidos code, it
might be time to take a step back and ask whether some or all of that operation could be recast in
terms of built-in Eidos functions. Some Eidos functions, such as which() and sapply(), take some
effort to learn to use effectively, but the payoff is large.
Another performance consideration is that defining and looking up variables in Eidos code is
relatively slow. This is not an issue most of the time, but in code that gets executed a lot (a
fitness() callback on a common mutation type, for example) it can make a large difference.
Using variables to represent temporary, intermediate computations can make code much easier to
understand, but unfortunately it can also run much more slowly. For example, consider this code
to compute the Euclidean distance between two points, (x1,y1) and (x2,y2), and execute some
code only if that distance is less than a constant threshold distance:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

381

dx = x1 - x2;
dx_sq = dx * dx;
dy = y1 - y2;
dy_sq = dy * dy;
dist = sqrt(dx_sq + dy_sq);
if (dist < 4.0)
...

Don’t do that unless you don’t care at all how slowly it runs (which often is true – optimize only
when you need to optimize). Instead, if you care about speed, do this:
if (sqrt((x1 - x2)^2 + (y1 - y2)^2) < 4.0)
...

Sometimes, of course, defining a temporary variable can improve performance, if the temporary
value is used more than once. If the value of that Euclidean distance computation will be used
more than once, then by all means assign it into a variable; looking up variable values may be
slow, but it’s not nearly as slow as performing a complex, multistep calculation. But at least avoid
defining the temporary variables dx, dx_sq, dy, and dy_sq if those values are used only once.
Of course it also pays to actually think about the necessity of the calculations you’re
performing. In the above example, there is actually no point to calculating the square root;
instead, you can just compare the square of the distance to the square of the threshold:
if ((x1 - x2)^2 + (y1 - y2)^2 < 16.0)
...

Learning to see such opportunities for optimization is largely a matter of patiently and
methodically examining each line of your code to think about how steps might be folded together
or eliminated entirely. A few hours of such effort might save you weeks or months of runtime.
These tips provide some general approaches and rules of thumb for improving simulation
performance. The following sections will discuss more concrete tools that can be brought to bear.
18.2 Performance evaluation
Sometimes you want to know exactly how long a piece of code or a whole simulation takes to
run; this can be useful when trying to optimize the performance of a model, for example, for
comparing alternative formulations of the model quantitatively. SLiM and Eidos provide several
tools for this purpose. See section 18.5, on profiling simulations in SLiMgui, for another extremely
useful tool for performance evaluation.
First of all, to measure the total execution time of an entire simulation run on the command
line, you can pass the command-line option -time or -t to SLiM and it will print a total time
measurement to the output at the end of the run. The same facility is not presently offered in
SLiMgui, since production runs are generally done at the command line, and the time taken in
SLiM may not accurately reflect the time that will be taken in a command-line run, given the
overhead of user-interface updating. When using the -time option, keep in mind that a given
model may exhibit a wide variance in execution time depending upon the random number
sequence used by a run; comparing runs using a fixed random number seed is therefore wise. See
section 17.1 for details on using -seed or setSeed() to set up a reproducible model run.
If you want to measure the performance of Eidos code or even a whole SLiM model, the
clock() function can also be a good way to go. This function returns the amount of CPU time
used by the running process; the difference between the result of clock() at one point versus

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

382

another point is thus the amount of CPU time that was used between the two points. Measuring a
single chunk of code is therefore trivial:
start = clock();
for (i in 1:100000)
q = i;
cat("Elapsed: " + (clock() - start));

Measuring the performance of code that spans multiple code blocks (such as the full execution
time of a SLiM model) is just a little more complicated, because the start variable would not be
persistent for long enough. Using the defineConstant() function of Eidos to store the start time
fixes that problem. So in the initialize() callback of your model, you could put this:
defineConstant("start", clock());

And then in a late() event in the final generation, you could write:
cat("Elapsed: " + (clock() - start));

The elapsed time printed is in seconds, but not of user (i.e. wall-clock) time; rather, it is in CPU
time (the amount of time your computer’s processor actually dedicated to running the model,
which might be much less than the wall-clock time, particularly if your computer is busy doing
other things as well).
For measuring the performance of just a small block of code, the executeLambda() function has
a timing option that can also be useful. Just pass T as the second, optional parameter to
executeLambda() and it will print a time measurement for the lambda. For example, suppose we
wanted to measure the time it takes to execute this code:
mean(runif(10000000) * 10);

We could simply execute it inside an executeLambda() call, like so:
executeLambda("mean(runif(10000000) * 10);", T);

On my machine, this produces this output:
// ********** executeLambda() elapsed time: 1.07904
5.00037

The return value of the call is 5.00037, and running the lambda took 1.07904 seconds.
Supposing that to be unacceptable, we could try a simple optimization, changing the range of the
runif() call, which draws random deviates from a uniform distribution, rather than reshaping the
drawn values with multiplication afterwards. Since we’re drawing ten million values, it is
reasonable to wonder whether this might make a difference, so let’s try it:
executeLambda("mean(runif(10000000, 0, 10));", T);

This produces this output:
// ********** executeLambda() elapsed time: 0.644127
4.99945

So incorporating the scaling into the runif() call sped up the code by about a third; not bad.
The final result in the two cases is different, of course, because the random number generator is
not in the same state; you could use setSeed() to do tests with identical random number
sequences if you wished, but it should be irrelevant in this case.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

383

Section 17.5 has further advice about how to use the executeLambda() function, including a
discussion of the “here document” string literal style, which makes it much less painful to convert
Eidos code into the form of a string literal that can be passed to executeLambda().
Section 18.5 provides an overview of profiling simulations in SLiMgui, which – for users on
Mac OS X – provides a particularly useful way of evaluating simulation performance.
18.3 Memory usage considerations
Sometimes memory usage is a major concern when running individual-based simulations.
SLiM has been engineered to keep its memory usage relatively low, but simulations involving large
population sizes and large numbers of mutations can burn through megabytes and even gigabytes
of memory. Reducing memory usage can be difficult, unless you are willing to change the
parameters of your simulation, but it is at least possible to assess SLiM’s memory usage.
Several tools are provided to assess SLiM’s total memory usage, including SLiM’s code,
operating system overhead, and so forth. The first tool is the -mem or -m command-line option,
which causes SLiM to keep track of the high-water mark of its memory usage and print out that
high-water mark when the model finishes. The second tool is the -Memhist or -M command-line
option, which causes SLiM to track and print the memory usage of the simulation over time, rather
than just the high-water mark. A third tool is the Eidos function usage(), which returns the current
or peak memory usage of the running process, in megabytes (MB).
To get more detail about SLiM’s internal memory usage, the outputUsage() method of SLiMSim
can be called at any time to get a breakdown of the current memory usage by the simulation (see
section 21.12.2), and in SLiMgui the same information is available in profile reports (see section
18.6). These facilities do not assess SLiM’s total memory usage; instead, they provide a detailed
picture of the memory that SLiM itself allocates and has direct control over. For large simulations
this should be the large majority of total memory usage, however, so this should not prove limiting
in practice. These features are covered in more detail in section 18.6.
It is usually the case that the large majority of the memory used by SLiM is for the references
kept by Genome objects to the Mutation objects that the genomes contain (by MutationRun objects,
internally, beginning in SLiM 2.4, but those objects are not visible to the user of SLiM; see section
18.4). Genomes refer to mutations using 32-bit indexes (4 bytes), for memory efficiency (pointers,
by comparison, are typically 64-bits, or 8 bytes, on modern systems). A single genome containing
1000 mutations thus takes about 4K of memory, a diploid individual with that mutational density
would take ~8K, and a population of 1000 such individuals would take ~8MB (although if shared
haplotypes were common that overhead might be reduced by MutationRun’s ability to share
mutation references between genomes). It is easy to see that with very large population sizes or
very large numbers of active mutations the memory overhead becomes quite large (particularly if
genetic diversity is high so that haplotype sharing is minimal). This overhead is more or less
unavoidable, since a genome simply must keep a list of the mutations it contains; there would be
possible ways to compress the memory footprint of that list, but such schemes would inevitably
make SLiM much slower. If your simulation is taking too much memory, you typically really have
only a few options: (1) make your population size and/or chromosome size smaller, (2) change
your model to have a lower mean number of active mutations per genome (modeling fewer
background neutral mutations, for example – or none at all, as tree-sequence recording can allow;
see section 1.7), (3) change your model to have more haplotype sharing (with a lower
recombination rate, for example), or (4) buy more memory.
Note that, interestingly, it is the references to the mutations that burn the memory, in most
typical simulations, not the Mutation objects themselves. This is because in a typical simulation
one mutation is often contained by a large number of individual genomes. A single Mutation
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

384

object presently takes up 80 bytes on OS X (implementation details like this are of course subject
to change). If a single reference to that Mutation object takes 4 bytes, as we saw above, then once
20 genomes contain that mutation, the memory footprint of the references is already as large as the
footprint of the Mutation object itself. A mutation near fixation in a population of only 1000
diploid individuals would use the same 80 bytes for the Mutation object, but 4*1000*2 = 8K bytes
for the references to the object (again, potentially reduced by haplotype sharing through
MutationRun). That footprint is so large that all other memory usage is typically irrelevant.
Beginning in SLiM 2.1, runtime checks of SLiM’s memory usage are performed periodically, if
and only if SLiM is running under a process memory limit (as indicated by the results of the Un*x
function getrlimit()). Such memory limits are often enforced for jobs running on computing
clusters, and exceeding the limits causes immediate termination of the process by the operating
system. To make such memory overflow terminations easier to debug, SLiM will print a diagnostic
message if its memory usage approaches within 10 MB of the limit, stating what SLiM was doing
when the overflow occurred; if the operating system kills the process soon after that point, this
diagnostic message may prove useful for determining the problem. This runtime checking should
not add significant performance overhead to SLiM, but it can be disabled with the -x commandline flag if so desired. Note that some systems will report that there is no memory limit, even when
a limit is actually in force, so this feature may or may not work on a given system.
18.4 Mutation runs and runtime optimization
Beginning in SLiM version 2.4, a change was made to SLiM’s core engine. With this change,
each genome in the simulation can be broken into multiple “mutation runs”, each of which
contains the mutations that occur within a given subsection of the genome. A simulation might
use four mutation runs per genome, for example, in which case every genome is divided into four
roughly equal-sized chunks – the mutation runs – that are stored separately. This is an internal
implementation detail that is not visible to SLiM models in Eidos, and normally it is of no concern.
However, in some circumstances an awareness of the existence of mutation runs can allow a
model to be optimized to run faster than it otherwise would.
The purpose of mutation runs is to allow some operations performed by SLiM to be performed
faster than they otherwise could be. For example, suppose a gamete needs to be produced by
recombination of the two parental genomes, and a single recombination breakpoint has been
drawn. Without mutation runs, generating the gamete would require copying all of the mutation
pointers from the first parental genome up to the breakpoint, and then copying all of the mutation
pointers from the second parental genome from the breakpoint onward; if a typical genome
contains 1000 mutations, generating a gamete will require copying 1000 pointers. Now suppose
the genomes are divided into 10 mutation runs each. The breakpoint occurs inside one of those 10
runs, and mutation pointers will need to be copied from the parental mutation runs for those; that
will be about 100 pointer copies. But for the other nine runs – this is the crucial bit – only the
pointer to the mutation run itself needs to be copied to the gamete, because the gamete can share
the mutation runs with other genomes. That makes 109 pointer copies total – a big improvement
on 1000. Similar gains can be realized in other parts of the SLiM core engine as well. Using
mutation runs adds in some bookkeeping overhead, but it usually at least breaks even in terms of
performance, and often it results in a substantial speedup. It also improves memory usage, since
long runs of pointers to mutation objects can be shared among many genomes.
The tricky question is: how many mutation runs should be used? If too few are used, then most
of the benefits evaporate. If too many are used, SLiM spends most of its time just handling the
bookkeeping involved; if every genome contains 1000 mutation runs, there are now 1000 times
more objects involved in the simulation than there were, with immense performance implications.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

385

Choosing the right number of mutation runs to use can thus be very important; the performance of
the simulation can change dramatically depending upon this value. Unfortunately, there is no
general way to make this choice; factors such as the chromosome length, mutation rate, and
population size are important, but so are things like population dynamics, the type of selection
being experienced, and the effects of custom Eidos events and callbacks in the SLiM script.
In order to manage this problem, SLiM 2.4 and later actually conducts little experiments
continuously: it times every generation, while occasionally varying the number of mutation runs
being used. As it collects datasets, it compares those datasets against each other (using t-tests, in
fact!) to determine how many mutation runs is optimal. These experiments take very little time,
and run continuously in the background. This means that the end user of SLiM usually doesn’t
have to think about all of this; the optimal number of mutation runs is used most of the time, and
simulations run a little bit (or a lot) faster with no effort on the part of the user. It also means that if
simulation dynamics change partway through – perhaps a neutral burn-in ends and a regime of
strong selection begins – SLiM will notice the changed dynamics and automatically adjust the
number of mutation runs to be optimal in each part of the simulation.
So far, so good. However, all of these experiments do have a small effect on performance. This
can be particularly true for models for which the optimal number of mutation runs is not clear or
changes frequently, as well as models with a very short generation time. Sometimes, telling SLiM
to use a fixed number of mutation runs can be beneficial – but you need to know how many.
As an example, let’s look at a very simple neutral model:
initialize() {
initializeMutationRate(1e-5);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 1e5-1);
initializeRecombinationRate(1e-5);
}
1 { sim.addSubpop("p1", 500); }
10000 late() { sim.outputFixedMutations(); }

In SLiM 2.3, this model takes approximately 205 seconds to run (on my machine). In SLiM 2.4,
if a single mutation run is used, this model runs in approximately 77 seconds, thanks to other
performance optimizations added to version 2.4. If 32 mutations are used in SLiM 2.4, on the
other hand, it runs in approximately 24 seconds – more than three times faster! As it happens, 32
is the optimum for this model, for most of its runtime (a small number of runs is better early on,
when neutral diversity is still building). Actually, the optimum might be different in a different
hardware/software environment (things like how much processor cache memory is available can
be very important), so to be precise, 32 is the optimum on my machine at the present moment.
Running the model in SLiM 2.4 with no mutation run count specified, and therefore allowing
SLiM to perform its “experiments” continuously in the background, results in a runtime of 30
seconds, with 78.5% of the generations in the simulation using 32 mutation runs (in one trial run;
this will vary from run to run, even with the same random number seed, since it depends upon
timing information). SLiM therefore does a reasonably good job of finding the optimum, but
telling it explicitly to use 32 mutation runs rather than conducting experiments results in even
better performance – about a 20% speedup.
So in practice, how could we arrive at this result? The first step is to run the model at the
command line with a few extra command-line options:
$ ./slim -m -t -l -s 0 ~/neutral.slim

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

386

This command runs, at the $ prompt in my Un*x terminal, the slim executable in the current
directory (.), executing the model named neutral.slim in my home directory (~). The “-s 0”
option tells SLiM to use a random number seed of 0 (see section 17.1); I chose this so that all of my
test runs use exactly the same modeled sequence of events, which you may or may not want to do
as well. The “-m” flag turns on memory monitoring (see section 18.3), and the “-t” flag turns on
timing of the overall runtime (see section 18.2); that is how I collected the timing information
given above. Finally, the “-l” flag (short for “-long”, which may be used instead) tells SLiM to
output long (i.e., verbose) output. Exactly what gets added to verbose output is version-specific (so
try not to make assumptions about it, and probably don’t use it in your production model runs),
but one of the things added is information about the mutation run experiments conducted by
SLiM. When the model finishes executing, among the verbose output is a snippet like this:
//
//
//
//
//
//
//
//

Mutation run modal count: 32 (78.5% of generations)
It might (or might not) speed up your model to add a call to:
initializeSLiMOptions(mutationRuns=32);
to your initialize() callback. The optimal value will change
if your model changes. See the SLiM manual for more details.

This tells us what we want to know: 32 is the optimal mutation run count for this model. Again,
that is on this hardware in the present software environment; if you plan to do your production
runs on a computing cluster, for example, you should ascertain the optimal mutation run count on
the cluster – ideally when it is busy running other tasks on its other cores – since the optimum
there may be different than on your local machine. (Note that similar, and even more detailed,
information on mutation run usage can be obtained in SLiMgui using its profiling feature; see
section 18.5. However, this might or might not indicate the optimum number of mutation runs for
command-line runs, since the runtime environment for SLiM is somewhat different when it is
running inside SLiMgui. The above method using the -l command-line flag is therefore
recommended.)
Knowing that 32 is the optimal mutation run count, we can now modify our model as the
output above suggests by adding an initializeSLiMOptions() configuration call at the beginning
of the initialize() callback that tells SLiM how many mutation runs to use (see section 21.1):
initializeSLiMOptions(mutationRuns=32);

This will yield the desired 20% speedup, because SLiM will no longer conduct mutation run
experiments, and will instead simply use 32 mutation runs throughout the model’s execution.
Note that in some cases this could actually make a model perform worse! Earlier, for example,
we mentioned the possibility of a model involving a neutral burn-in period followed by a period of
strong selection. The optimal number of mutation runs might be very different in those two phases
of the model – 64 in the first half and 1 in the second half, say – but if SLiM is conducting its own
mutation run experiments, it should be able to adjust to that fact. If you specify a fixed number of
runs, however, then either the first half or the second half of the model might perform quite poorly.
It is not possible to change the number of mutation runs dynamically in Eidos, at present, so in
such cases the best option is to allow SLiM to run its experiments and do its runtime optimization.
Such cases can be detected simply by looking at the total runtime of the model with and without a
specified number of mutation runs; if specifying the number of runs makes the model run more
slowly, then you probably shouldn’t do it.
In summary, the ability to specify the mutation run count is an advanced feature, the correct use
of which requires some experimentation and careful timing using the appropriate hardware and
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

387

software. If done properly, however, it can produce performance gains of as much as 20%, or
probably even more for some models. For very large models with long runtimes, it is therefore an
important tool.
18.5 Profiling simulations in SLiMgui
The previous sections have provided some pointers and tips regarding measuring and
optimizing the memory usage and performance of a SLiM model. For users on Mac OS X, SLiMgui
provides a particularly useful performance tool, called profiling, beginning in SLiM version 2.4.
For the purposes of this section, we will reuse the recipe from section 12.4, which shows how
to use a modifyChild() callback to disable incidental selfing in hermaphroditic models (defined as
occurring when the same individual happens to be chosen for both the first and second parent in a
biparental mating event). Note that this recipe has been deprecated, since there is now a
configuration flag for SLiM that disables incidental selfing more easily and efficiently (see section
12.4). Nevertheless, this recipe still works, and is a good one for our purposes here:
initialize() {
initializeMutationRate(1e-7);
initializeMutationType("m1", 0.5, "f", 0.0);
initializeGenomicElementType("g1", m1, 1.0);
initializeGenomicElement(g1, 0, 99999);
initializeRecombinationRate(1e-8);
}
1 { sim.addSubpop("p1", 500); }
modifyChild()
{
// prevent hermaphroditic selfing
if (parent1 == parent2)
return F;
return T;
}
10000 late() { sim.outputFixedMutations(); }

This is a very simple neutral simulation; the only twist is the modifyChild() callback that checks
whether the two parents of the focal offspring are the same, and if so, rejects the offspring by
returning F. The recipe has been changed here to run for 10000 generations, to provide more
accurate measurements that are more focused on the steady state of the model rather than its
initialization.
To profile this model in SLiMgui, simply click the profiling button that is overlaid with the play
button in the SLiMgui main window:

This will play the simulation forward from the current generation, just as the Play button would;
the only difference is that SLiMgui will tabulate performance information about the simulation
while it is running. You can start profiling at any point in a simulation run; for example, you can
play up to the generation in which a critical section of your model begins, and then profile from
there onward. Similarly, you can stop profiling at any time by clicking the profiling button again;
you do not need to wait for the model run to complete. When the current profiling operation
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

388

completes, a new window opens that shows a profile report. This report contains several sections;
we will discuss them one by one.
The first section is the report header:

The header contains general information: the title of the model, when the profiling run began
and ended, and some overall timing and memory usage information. (i) The first “wall clock time”
listed is the actual time spent running the model in SLiMgui; in this case, 3.37 seconds. (ii) The
second wall clock time is time spent inside SLiM’s core code; this excludes time spent in SLiMgui,
doing things like updating the user interface. It is also stated to be “corrected”; what this means is
that an attempt has been made to subtract out time spent inside the profiling code itself, reading
the system clock and tabulating profiling results. This number is thus SLiMgui’s best guess as to
how long the model would take if it were run at the command line with the slim command
instead. Since this is the runtime that is generally of interest, it is used as a baseline in the profile
report; later in the report, when timing percentages are reported, they are always percentages out
of this correct wall clock time. (iii) Third comes the elapsed CPU time inside the SLiM core; CPU
time is time spent keeping the processor of the computer busy. This time can be quite different
from the other times reported, as it is here. On the one hand, it excludes time spent in SLiMgui, so
it is generally lower than the total elapsed wall clock time. On the other hand, it is uncorrected –
it does not exclude time spent inside the profiling code itself – so it is typically longer than the
corrected wall clock time. And then, too, CPU time is often different from wall clock times
anyway, particularly if your machine is busy with other tasks that are also occupying the processor.
(Incidentally, to obtain optimal profiling results it is a good idea to quit all other applications and
run the profile when your machine is otherwise completely idle.) (iv) Next is shown the number of
generations over which the profile ran, including the initialization phase of the model. (v) Finally,
two lines profile information about the measured overhead and lag of the profiling code; these
numbers are used to produce the corrected wall clock time, and are not generally of interest to end
users of SLiMgui. (Note that in SLiM 3.2 another two lines were added at the end, providing
information about SLiM’s overall memory usage; these will be discussed in the next section.)
The next section is the “Generation stage breakdown”:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

389

This is a tabulation of where SLiM is spending its time, broken down according to generation
stage. The fact that 84.86% of SLiM’s time is being spent in offspring generation is a red flag; that
generation stage does often take quite a bit of time, but not typically this much. So this is
something to note – depending upon the model it is not necessarily an indication of a problem,
but it is certainly an indication of where you ought to focus your optimization efforts.
Next is the “Callback type breakdown” section:

This section shows a tabulation of where SLiM is spending its time, broken down according to
callback types. This does not add up to 100% because SLiM is spending almost 40% of its time
outside of callbacks, in the SLiM core engine. But it is spending 61.16% of its time inside
modifyChild() callbacks – again, an indication of where to focus optimization efforts. You might
also wonder why 61.16% is different from 84.86%. This is because the additional time, above
61.16%, is being spent by SLiM in offspring generation, but not inside modifyChild() callbacks.
This time would be spent doing mutation generation and recombination, for example.
The next section is titled “Script block profiles (as a fraction of corrected wall clock time)”:

This shows the model’s code, with color highlighting indicating where time is being spent.
Colors range from white (essentially no time spent here) through yellow and up to orange and
finally red (most of the simulation’s time spent here). This section colors the code according to the
time spent as a fraction of the total corrected wall clock time. We can see that the modifyChild()
callback is taking 19.09% of SLiM’s time, and two parts of it are responsible: the test for the two
parents being identical, and the returning of T to indicate that the child should be generated. The
callback does occasionally return F, but that is so rare that that line is more or less white. Again
you might wonder: why is 19.09% different from 61.16%? The reason is that 19.09% is spent
specifically on interpreting the lines of code inside the callback – inside the Eidos interpreter –
whereas 61.16% is spent overall on calling the callback. Calling out to callbacks involves a
certain amount of overhead in SLiM; the Eidos interpreter environment for the callback must be set
up, variable values for it must be initialized, and so forth. The difference between 19.09% and
61.16% reflects all of this overhead. The overhead is very high in this case because the callback is
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

390

so simple; the more complex a callback’s code is, the less this constant overhead will matter. But
in this case, the profile report is telling us that it is not so much the callback’s code that is hurting
us – it is the fact that we have a callback at all.
Note, by the way, that only the modifyChild() script block is shown here. As the italic
comment at the bottom of the section notes, only script blocks that take more than a certain
amount of time are shown, since script blocks that take a tiny fraction of the total time are
generally not of interest from an optimization perspective.
Following that section is “Script block profiles (as a fraction of within-block wall clock time)”:

This section shows a similar view of the model’s code, but here the colors are scaled to the time
spent within each block. Even a block that takes up almost none of the total time, therefore, will
show “hotspots” that may be close to red. Here the two hotspot lines we noticed before are
colored orange, but the first line is a darker shade, closer to red, indicating that it takes more time
than the second line. This can also be seen in the shades of yellow used in the previous section of
the report, but it is much more subtle; the point of this section, with colors according to withinblock time, is precisely this, to increase the visibility of the hotspot lines. However, this section is
usually of less interest than the previous section, since the proportion of the total corrected wall
clock time is what ultimately matters.
Finally, the last section is “MutationRun usage”:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

391

This section contains fairly technical information about some of SLiM’s internal bookkeeping
mechanisms, such as mutation runs and their caching of fitness-related information. Section 18.4
of this manual describes the basic idea behind mutation runs, and so may shed some light on the
meaning of this section. Mostly, however, it is intended for internal use by the developers of SLiM
(the royal we, in other words). If you send us a profile report from your simulation, this section
will help us diagnose the situation. We will not discuss this section further here.
Overall, this report focuses our attention very clearly on the source of the performance problem
with this model: the modifyChild() callback. Happily, as section 12.4 discusses, we can eliminate
that callback now with a call to initializeSLiMOptions(preventIncidentalSelfing=T), which
sets a configuration option that causes SLiM to block incidental selfing internally instead. If we
add that call to the model, remove the modifyChild() callback, and profile the model again, the
header of the report now looks like this:

The total runtime is much shorter now – 0.55 seconds of corrected wall clock time inside the
SLiM core, instead of 1.99 seconds! If we look at the generation stage breakdown, it now shows a
healthier mix of time spent in various generation stages; the runtime is no longer being completely
dominated by one stage:

Offspring generation is now 62.24% instead of 84.86%, so SLiM is busy doing other things too;
and even more encouraging, the total time spent in offspring generation is now 0.34 seconds
instead of 1.69 seconds, so we have shaved off quite a bit of the actual time spent.
The callback type breakdown shows that we are now spending a negligible amount of time in
callbacks:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

392

None is over 0.03% of the total time, so they are all basically irrelevant. What this says is that
SLiM is no longer spending a significant amount of time inside the model’s Eidos callbacks;
virtually all of the time is being spent inside the SLiM core. If you still wanted to make this model
faster, you would therefore have to decrease SLiM’s core workload, by doing things like reducing
the population size, reducing the recombination rate or mutation rate, etc. If such revisions to the
model are not possible, then the model might be running about as fast as it possibly can (assuming
there are no design flaws in the model causing it to waste time).
The first script block profile section confirms this; there are no hotspots in the code at all:

Two script blocks are now shown that weren’t shown before, since they are now over the
threshold to be listed; but they take 0.03% or less of the total time.
The second script block profile section shows some red hotspots:

Remember, however, that in this section the coloring indicates the time spent as a fraction of the
within-block time. These lines might be taking 100% of the within-block time; indeed, they could
hardly fail to do so, since each block is only one line long! But the fact remains that their fraction
of the total runtime is inconsequential, so they should be ignored.
Particularly perspicacious readers may have noticed that if the original model took 1.99
seconds, 61.16% of which was dealing with modifyChild() callbacks, one might expect the
modified model to take 0.77 seconds, but in fact it took only 0.58 seconds, so we got an even
larger speedup than expected. Why is this? The main reason is probably that when callbacks are
present, SLiM is often forced to take a slower code path, which adds runtime even beyond the
overhead of handling the callbacks themselves. When modifyChild() callbacks are in force, for
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

393

example, SLiM has to generate children in a random order, to eliminate possible order-dependency
bugs that could otherwise crop up. When no such callbacks are active, SLiM can generate
offspring in a fixed order – females before males, migrants before residents, etc. – which is
substantially faster since it doesn’t require randomization.
All of this might prompt the question: “OK, I’ve got all this profile information, so what?” Well,
in some cases it might allow you to see a way to modify your model to avoid the hotspot in
question; if your profile showed that you were spending an inordinate amount of time doing
addition in a for loop, for example, you might question why you are doing so much addition, and
whether you could use sum() instead – or whether perhaps you don’t need to do all that addition
at all. Eidos also provides quite a few functions that can perform various tasks very quickly; using
functions like unique(), setSymmetricDifference(), which(), and so forth can lead to much faster
code than attempting to code such logic in Eidos by hand.
In other cases, it will allow you to start a conversation with us about the performance problem
you’re seeing; if you say “I’ve got this model and it’s spending 80% of its time inside the
mateChoice() callback running the second for loop, what can I do about that?” that’s a much
better starting point than “My model is slow, help”. If you are having to perform a multistep
algorithm in Eidos for a task that is common, we might be able to add a new utility method to
SLiM to perform this task for you with a single call. This can often result in a big speedup for the
code in question, and it benefits all of SLiM’s users for such utility methods to exist. The
sumOfMutationsOfType() method, for example, was added in response to a performance-related
question from a SLiM user, and is now available to speed up a wide variety of QTL-based models
that need to calculate the additive effects of all mutations of a given type.
18.6 Profiling memory usage in SLiMgui, or with outputUsage()
The previous section introduced SLiMgui’s profiling feature, with an extended example showing
how it can be useful for improving the runtime of a simulation. Beginning in SLiM 3.2, the
profiling feature has been extended to include information about memory usage as well; we will
discuss that extension in this section. Note that sections 18.3 and 18.4 cover some important
ideas about SLiM’s memory usage; you may wish to read those sections first.
First of all, when running SLiM 3.2 or later the header section of a profile report will be a bit
longer due to a subsection added at the end. For a somewhat large spatial simulation with treesequence recording enabled, the report might show something like this:

These lines report statistics about SLiM’s overall memory usage. The first line gives the average
usage, across a samples taken once per generation during the profiling period. The second line
gives the sampled usage in the final generation profiled. Note that these memory usage samples
are always taken at the end of each generation; if memory usage spikes within a generation but
decreases again before the generation’s end that will not be reflected in any of the profiling
statistics discussed in this section (but would be captured by the overall usage statistics discussed
in section 18.3). Also, importantly, these samples – which provide all of the information to be
discussed in this section – reflect only memory allocated and directly controlled by SLiM.
Memory can be used by other factors too: SLiM’s own executable code takes up memory, the
operating system has memory overhead of various types, the C++ standard library makes
allocations that SLiM has no way to measure, etc. However, in large simulations where memory
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

394

usage is a concern, the memory allocated and directly controlled by SLiM is typically the large
majority of the total memory usage, so this caveat should not prove limiting in practice.
So here we discover that in an average generation SLiM was using 0.56 GB of memory, but that
at the end of the simulation the usage was 1.17 GB in the final generation. That suggests that the
memory usage was increasing over the course of the simulation, which is typical since genetic
diversity often builds up over time. Since the peak memory usage is typically the main concern,
the final generation’s statistics may usually be of the most interest. One might also wish to profile
only the last 100 generations of a model, for example, in order to get averaged statistics that are
not biased downward by the low memory usage toward the beginning of a run.
So far, so good; but these are only summary statistics. How does the memory usage by SLiM
break down? What is all that memory actually being used for? A new section at the end of the
profile report gives much more information:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

395

Each line in this section gives average and final-generation usage statistics, separated by a slash,
as suggested by the header. Usage is in bytes, KB, MB, GB, or potentially TB; since the units are
mixed they must be noted carefully when comparing numbers. To make it easier to pick out the
areas of highest impact, the usage statistics are color-coded in a similar way to the coloring of
timing statistics earlier in the profile report; white is the lowest proportional usage (inconsequential
in the big picture), and then shades of yellow, orange, and finally red reflect increasing proportions
of total memory usage.
The report is broken down by the object responsible for the usage, sorted alphabetically;
Chromosome comes first, Substitution last, with a small section on memory usage by Eidos at the
end. The first line of each of these subsections gives the memory usage for the objects themselves;
for the Chromosome section it gives the memory used by the one Chromosome object present in the
simulation, for example. Subsequent lines give additional memory usage, by those objects, for
particular purposes. For Chromosome, for example, the memory used by mutation rate maps and
recombination rate maps is listed. This memory is not part of the Chromosome object itself; it is
allocated by the Chromosome object. Each line thus stands on its own; the usage reported by lines
under a given object type is not included in the first line. In other words, the hierarchy here is a
conceptual hierarchy, not a breakdown of totals into sub-totals and sub-sub-totals. For most object
types the number of objects allocated is also reported in parentheses; for example, there were
404963.41 mutation objects allocated in an average generation, and 497883 allocated in the final
generation.
This breakdown shows that the majority of memory is taken up by MutationIndex buffers
allocated by MutationRun; some are being used by currently allocated MutationRun objects, while
others are attached to currently unused MutationRun objects in a pool of reusable mutation runs
kept by SLiM for speed. These MutationIndex buffers are the way that SLiM keeps track of which
mutations are present in each genome; they are like pointers to Mutation objects, but more
compact since they are 32-bit indexes instead of 64-bit pointers. The Mutation objects themselves
take up far less space, and are colored just a light yellow here; this is unsurprising and typical, for
reasons discussed further in section 18.3.
Quite a substantial amount of memory is also taken up by the tree-sequence recording tables
kept by SLiM, since tree-sequence recording is enabled in this model (see section 1.7). Treesequence recording can take up quite a bit of memory, but that memory usage can often be
controlled (at the price of longer runtimes) by controlling the frequency of simplification of the
tree-sequence tables; see the simplificationRatio parameter to initializeTreeSeq() in section
21.1). The model profiled here includes neutral mutations even though tree-sequence recording is
enabled, which is usually not desirable or necessary, so it is paying a hefty price in memory usage
(for illustration purposes).
Finally, a faint yellow tinge tells us that the sparse arrays kept by InteractionType are taking up
a small but noticeable amount of memory. These keep track of the distances and interaction
strengths between individuals; their size will scale with the number of individuals in a model, but
will also be strongly affected by the maximum distance set for the spatial interactions in the model.
The model profiled here uses very short maximum interaction distances, simulating a large
landscape with very local interaction dynamics, which means that these data structures are quite
small. In spatial models with broader spatial interaction kernels, these sparse arrays can take up
much more space; indeed, in the worst case they can scale with the square of the number of
individuals, and can account for the large majority of SLiM’s memory usage. Keeping maximum
interaction distances as small as possible is extremely important for SLiM’s memory usage, and for
its runtime too.
These memory usage statistics can, in some cases, provide a clear picture of how SLiM’s
memory usage could be reduced. This report, for example, could serve as a reminder that we
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

396

probably want to turn off neutral mutations in the model, since tree-sequence recording would
allow us to overlay them after simulation has completed (see section 16.2). If most of the memory
usage of a simulation is in the MutationIndex buffers kept by MutationRun, that would similarly
suggest that the simulation might benefit from the use of tree-sequence-recording. When that is
not possible – when the mutations being simulated are non-neutral, for example – it may be
necessary to reduce the scale of the model, in terms of population size, chromosome length,
mutation rate, recombination rate, etc. Sections 5.5 and 18.3 have some further discussion of this.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

397

PART II: THE SLIM REFERENCE

19. SLiM architecture (WF models)
By default, SLiM uses a Wright-Fisher-type model of evolution,
known in SLiM as a WF model (see section 1.6). This chapter
will discuss the details of the generation cycle in WF models;
chapter 20 provides the same type of discussion for nonWF
models, a more advanced type of SLiM model.
Section 1.3 presented a summary of the life cycle followed by
SLiM within each generation in WF models. The figure shown in
that section is reproduced at right. In this chapter, we will
examine each of these life cycle stages in more detail, in order to
provide a more complete specification of the internal mechanics
of SLiM.
19.1 Step 1: Execution of early() Eidos events
The very first thing that happens in each generation is that the
active property of all script blocks is reset to -1 at the beginning
of the generation, activating all script blocks for the remainder of
the generation unless they are explicitly deactivated again. This
is followed by the execution of early() Eidos events defined by
the user, if any. Details on how to specify an Eidos event are
given in section 22.1, with some further details in section 22.8
regarding their scheduling and the active property.
The salient point here is primarily that these events occur first
in each generation, prior to the generation of offspring. If you
wish to execute an Eidos event after offspring generation has
completed, you should use a late() event; see section 19.5.
Since the details of this step depend entirely on the script you
write, there is little more to say about this step.

The sequence of events within one
generation in WF models.
1. Execution of early() events
2. Generation of offspring;
for each offspring generated:
2.1. Choose source subpop
for parental individuals,
based on migration rates
2.2. Choose parent 1, based
on cached fitness values
2.3. Choose parent 2, based
on fitness and any defined
mateChoice() callbacks
2.4. Generate the candidate
offspring, with mutation
and recombination (incl.
recombination() callbacks)
2.5. Suppress/modify the
candidate, using defined
modifyChild() callbacks
3. Removal of fixed mutations
unless convertToSubstitution==F

4. Offspring become parents

19.2 Step 2: Generation of offspring
This is the most complex step in SLiM’s architecture, and it is
broken down into five sub-steps that are executed for each
offspring generated, as shown in the figure at right. Processes
involved in the generation of offspring include migration, mate
choice, mutation, recombination, and the actual production of
offspring individuals.

5. Execution of late() events
6. Fitness value recalculation
using fitness() callbacks
7. Generation count increment

19.2.1 The order of offspring generation
For each offspring generated, there are generally several decisions to be made: (1) is the
offspring local, or if not, from which other subpopulation are its parents drawn, (2) is it male or
female (in sexual simulations), and (3) is it produced by cloning, selfing, or biparental mating? In
SLiM, the sex ratio specified for a subpopulation is deterministic; SLiM will produce that exact sex
ratio in each generation (so as to avoid the possibility of extinction due to the chance production
of a single-sex child generation). The other decisions are made stochastically; migration rates,
selfing rates, and cloning rates are all probabilities, not deterministic ratios, and you can think of
SLiM as rolling the dice to make these decisions for each offspring individual. This means that the
sex ratio of a subpopulation does not fluctuate over time, but the fraction of offspring that are
migrants, or clones, or selfed, will vary stochastically around the specified rates.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

399

The order in which offspring are generated, with respect to these decisions, depends upon the
details of your simulation. In the base case, SLiM produces offspring in deterministic tranches; for
example, migrants before locals, and within that, males before females, and within that, cloned
before selfed before biparental. The specifics of this ordering are not guaranteed; the main point is
that you should not rely on the order of offspring being random. In particular, you should
randomly select genomes when doing things like inserting new mutations, to overcome the
possibility of order-dependency and bias (as shown in the recipe in section 10.1). If mateChoice(),
modifyChild(), or recombination() callbacks are defined, SLiM switches its behavior to generate
offspring in a randomized order, rather than in these deterministic tranches. This presents mating
and offspring decisions to those callbacks in a random order, so that bias is not inadvertently
introduced by callbacks.
SLiM is fundamentally a model of juvenile migration, not migration at the adult stage. This has
several consequences for these decisions that are made for each offspring. First of all, if the
offspring is a migrant produced biparentally, it should be noted that both parents will be drawn
from the same source subpopulation; matings between parents in different subpopulations never
occur in SLiM, since adults do not migrate. Second, it should be noted that the parental source
subpopulation is “in charge” of most of the decisions made regarding the offspring, since the
offspring is produced within that source subpopulation by the mating of parentals in that
subpopulation. In particular, the source subpopulation determines the cloning rate and selfing
rate, as well as the mateChoice(), modifyChild(), and recombination() callbacks used. The only
exception is sex ratio; the sex ratio of the destination subpopulation governs the ratio of males to
females produced in that subpopulation, regardless of the sex ratios specified by the various source
subpopulations contributing migrants.
19.2.2 Mate choice
Once the decisions outlined in the previous section have been made for a given offspring
(parental subpopulation, sex, clonal/selfed/biparental), the next step in offspring generation is
choosing the parent(s) for the offspring. The precise way in which this is done depends upon the
type of offspring being produced:
(1) If the offspring is to be clonal, a single parent is drawn randomly, according to probabilities
proportional to fitness, from the source subpopulation. Any mateChoice() callbacks defined are
not called; there is presently no way to influence the choice of clonal parents except by modifying
fitness values with a fitness() callback. Offspring generation proceeds along a different path in
this case, introducing mutations as usual (see below) but without recombination.
(2) If the offspring is to be selfed, a single parent is drawn according to fitness, as with cloning.
That parent is then considered to be a forced choice for the second parent; mateChoice() callbacks
are not used. Offspring generation proceeds thereafter as with biparental mating.
(3) If the offspring is to be the result of biparental mating, a first parent is drawn according to
fitness; in sexual simulations the first parent will always be female, since SLiM models female
choice. If no mateChoice() callbacks are defined, a second parent is then drawn according to
fitness – in sexual simulations, always a male. If mateChoice() callbacks are defined, on the other
hand, those callbacks will be called to determine mating weights for all eligible parents given the
chosen first parent, as detailed in section 22.3 and 17.6, and a second parent will then be drawn
according to those mating weights. If the mateChoice() callbacks completely reject the first
parent, offspring generation will go back to almost the beginning of the process, but the source
subpopulation will remain unchanged, as will the sex of the offspring to be produced (but the
cloned/selfed/biparental decision, and the choice of the first parent, will be made over from
scratch).

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

400

19.2.3 Mutation and recombination
Once a parent or parents have been successfully selected, as detailed in the previous
subsection, a candidate offspring is generated from the chosen parent(s). As the first step in this
process, the mutations to be introduced into each of the two offspring genomes are generated. The
number of mutations is determined by the current mutation rate, using a draw from the appropriate
Poisson distribution. The particular mutations are then generated one by one, each with a position
drawn at random from within all of the defined genomic elements in the chromosome. Given that
position, the identity of the genomic element and thus the controlling genomic element type is
determined. Using the list of mutation types and associated probabilities for the genomic element
type, a particular mutation type is chosen probabilistically. Finally, a selection coefficient is drawn
from the chosen mutation type (see section 21.8), and a new Mutation object is constructed with
the selection coefficient. In this manner, all of the new mutations to be introduced into each of the
two offspring genomes are generated.
If the offspring is to be produced clonally, what follows is then relatively simple: conceptually,
the parental genomes are replicated exactly in the offspring, and then the mutations are interleaved
into the offspring genomes at their particular positions.
If the offspring is to be produced by selfing or biparentally (which are identical at this stage of
the process), recombination is also involved, making the process somewhat more complex. The
first offspring genome is produced via recombination between the two genomes of the first parent,
and the second offspring genome is produced via recombination between the two genomes of the
second parent (mimicking the process of meiosis to produce haploid gametes that merge to form a
fertilized egg). For each offspring genome, the number of recombination breakpoints is drawn
from a Poisson distribution based upon the overall recombination rate (computed internally by
SLiM). The position of each breakpoint is then drawn, based upon the recombination ranges and
rates set on the chromosome. If a non-zero probability of gene conversion is set, and a random
draw indicates that gene conversion actually occurs for a given breakpoint, an additional
breakpoint (above and beyond the drawn number of breakpoints) will be added following the
converted breakpoint, with a positional offset drawn from a geometric distribution satisfying the
requested average gene conversion length. Note that occasionally this additional breakpoint will
fall after another drawn breakpoint position; this will result in a gene conversion stretch that is
shorter than the drawn gene conversion length, since the first breakpoint position will end gene
conversion. If recombination() callbacks are defined, they are called at this point to allow them
to modify the crossover points and the gene conversion stand and end points. Finally, given the
two genomes from the parent, the list of recombination breakpoints, and the set of mutations to be
introduced, SLiM then weaves together the final genome of the offspring, alternating between the
two parental genomes as dictated by the recombination breakpoints, and introducing mutations at
the chosen positions.
By default, SLiM allows multiple mutations to exist at the same site in a single individual –
“stacked” mutations, as we call them. This behavior is often desirable, but sometimes it is useful to
prevent stacked mutations of a given mutation type. The stacking policy of a given mutation type
can be changed using MutationType’s mutationStackPolicy and mutationStackGroup properties,
as documented in section 21.9.1.
19.2.4 Child modification
Once the two genomes of the candidate offspring have been generated, as described in the
previous section, child modification by modifyChild() callbacks (described in sections 21.4 and
21.8) occurs, if any callbacks are active. These callbacks are called regardless of whether the
offspring was the product of cloning, selfing, or biparental mating; all types of children may be
modified.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

401

These callbacks may modify the candidate offspring in any way desired. For our purposes here,
the only question arises with the possibility that a modifyChild() callback will suppress the
candidate offspring altogether, rather than just modifying it. This can be thought of as representing
juvenile mortality, if you wish; it could also represent postmating reproductive isolation, such as
infertility or developmental inviability. In some cases, suppression of a candidate offspring can
also be thought of as representing a type of mate choice.
When a candidate offspring is suppressed, generation of the offspring goes back almost to the
beginning of the process, as with rejection of the first parent by a mateChoice() callback. In fact,
in this case the process goes back even further; a new source subpopulation for the offspring is
chosen, in addition to re-making the cloned/selfed/biparental decision and re-choosing the
parents. The only aspect of the offspring generation that is not re-decided in this case is the sex of
the planned offspring individual; that remains fixed since the sex ratio is deterministic, not
stochastic. Note that this means that if a modifyChild() callback suppresses a candidate offspring,
the next time it is called the new candidate will be the same sex as the previously suppressed
candidate. Because of this, it is not possible for a modifyChild() callback to influence the sex
ratio of a subpopulation; attempting to do so will produce an infinite loop.
19.2.5 Child generation
Once a candidate offspring has been generated and modified, and was not suppressed by a
modifyChild() callback, it is added to the target subpopulation. Note that the newly added child
will not be visible as a member of the subpopulation until the point in the lifecycle when the child
generation becomes the parental generation (see section 19.4). This prevents order-dependencies
in which the first children generated might otherwise influence the remainder of the child
generation process.
19.3 Step 3: Removal of fixed mutations
After all offspring have been generated for all subpopulations, SLiM performs bookkeeping
regarding the mutations in the simulation. In particular, it scans through every genome in every
individual in every subpopulation, and tallies up how many times each of the mutations in the
simulation is actually present in the child generation.
The results from this scan are used to clean house. If a mutation is no longer referenced by any
genome in the simulation, that mutation has been lost (whether due to selection or drift), and SLiM
forgets about it. If, on the other hand, a mutation is now contained by every genome in the
simulation, that mutation has fixed. In this case, SLiM normally creates a new Substitution
object as a placeholder, to record the details of the fixed mutation, and then removes the mutation
from the simulation. This is essentially an optimization for efficiency; if fixed mutations were not
removed, a long-running simulation would accumulate ever more mutations needing to be
tracked. In general the substitution is harmless; a fixed mutation cannot generally decrease in
frequency, and generally no longer influences fitness (since it is possessed by all individuals, and
thus has an identical effect on the fitness of all individuals).
However, there are specific circumstances in which the removal of fixed mutations is not
desirable. In particular, if the fixed mutation would continue exerting a varying effect on fitness
among individuals (because of epistasis, for example), the substitution of the mutation would result
in incorrect fitness values. Also, if the script for a simulation intends to remove the mutation from
some genomes later in the simulation run (perhaps simulating a back-mutation) then it would be
desirable to prevent the substitution of the mutation. To accommodate these possibilities, the
convertToSubstitution property of a mutation type can be set to F to suppress substitution of
mutations of that type; see section 21.9.1.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

402

19.4 Step 4: Offspring become parents
As mentioned in section 19.2.5, newly generated offspring are kept by SLiM as members of a
child generation that is not visible to the model mechanics (or to your Eidos scripts), to avoid
order-dependencies and other confusion. After fixed mutations have been removed, as described
in the previous section, the child generation becomes the new parental generation, and the old
parental generation is discarded.
19.5 Step 5: Execution of late() Eidos events
After the child generation is promoted to the parental generation, the next step is to execute
late() Eidos events defined by the user, if any. Details on how to specify an Eidos event are given
in section 22.1, with some further details in section 22.8 regarding their scheduling and the active
property. Eidos late() events are most often used when output is being generated (since you
typically want to output the state of the simulation at the end of a generation, not at the
beginning), and when adding or removing mutations (since you want those changes to be reflected
in the fitness values calculated for individuals, prior to the next offspring generation step).
Changing of selection or dominance coefficients should also typically be done in a late() event,
for the same reason. See sections 4.2.1, 10.1, and 10.6.1 for discussion and examples.
19.6 Step 6: Fitness value recalculation
After the child generation is promoted to the parental generation, the next step is to compute
fitness values for the new adult individuals, including the effects of any fitness() callbacks.
Fitness values computed at the end of one generation are actually used during mating in the
following generation; this is something to keep in mind if you are designing fitness() callbacks
that you want to be active only across a specific generation range. The reason SLiM does this has
to do mostly with running in SLiMgui; it is desirable that SLiMgui should show newly-calculated
fitness values for the new parental generation when single-stepping through generations. If fitness
values were not calculated until the beginning of the next generation, they would not yet be
available for display.
If fitness() callbacks are not active for a given subpopulation, calculating the fitness of an
individual is relatively straightforward. SLiM uses a model of multiplicative fitness between sites.
An initial relative fitness w of 1.0 is assumed for the individual (unless fitnessScaling values have
been set; in fact, the initial fitness w is the product of the individual’s fitnessScaling property and
its subpopulation’s fitnessScaling property, but these properties are 1.0 by default). Then, each
mutation possessed by the individual is evaluated as to whether it is present in just one of the
individual’s genomes (i.e. is heterozygous) or is present in both genomes (i.e. is homozygous). If a
mutation is homozygous, the individual’s relative fitness is updated as:
w = w * (1.0 + selectionCoefficient),
whereas if the mutation is heterozygous, the relative fitness is updated as:
w = w * (1.0 + dominanceCoeff * selectionCoeff).
where the dominance coefficient of the mutation is defined by the mutation’s mutation type, in the
default case of simulating autosomes. For simulations of the X chromosome, the mutation type’s
dominance coefficient is used for heterozygous XX females, whereas XY males that are
“heterozygous” because they possess the mutation on their lone X chromosome use a global
dominance coefficient (see initializeSex(), section 21.1, and the dominanceCoeffX property of
SLiMSim, section 21.12.1). Simulations of the Y chromosome do not use a dominance coefficient
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

403

at all; the first of the two formulas above is used. If the new relative fitness is zero or less, the
mutation just evaluated was lethal, and so the final relative fitness of the individual is 0.0.
Otherwise, SLiM proceeds with evaluating the next mutation, until all mutations have been
evaluated to produce a final relative fitness value.
If fitness() callbacks are defined (as described in section 22.2), this procedure is modified
slightly. In this case, the multiplicative effect on relative fitness that would be produced by a given
mutation is calculated, exactly as above, but instead of simply multiplying w by that fitness effect,
SLiM calls out to fitness() callbacks to allow them to modify the relative fitness value. After
callbacks, w is multiplied by the final fitness effect for the mutation. The fitness() callback
calculates the relative fitness of the mutation in the individual, whether the mutation is
heterozygous or homozygous; if you wish the calculated fitness value to be different in those two
cases, then the fitness() callback needs to explicitly take the heterozygosity of the mutation into
account (which is part of the information provided to the callback by SLiM, so doing so is not
difficult; see section 22.2).
In SLiM version 2.3 and later, it is possible to define global fitness() callbacks, which are
applied exactly once to every individual without reference to a focal mutation or a particular
mutation type (see section 22.2). The fitness values returned by global fitness() callbacks are
multiplied in to the fitness value previously computed for the individual, as:
w = w * relativeFitness.
The fitness effects of global fitness() callbacks thus combine multiplicatively with all of the
fitness effects of mutations, and multiple global fitness() callbacks may be defined. Unlike other
types of callbacks, the order in which global fitness() callbacks are called is formally undefined,
both relative to other global fitness() callbacks and relative to ordinary (i.e., non-global)
fitness() callbacks. Also note that global fitness callbacks might not be called at all for a given
individual if that individual’s fitness has already been determined, by previous callbacks or fitness
effects, to be equal to zero. Models should therefore be extremely cautious in making any
assumptions whatsoever regarding the timing or order in which global fitness() callbacks will be
executed, or whether they will be executed at all; global fitness() callbacks that have external
side effects, such as changing the active property of script blocks or defining Eidos constants, are
not recommended.
In SLiM 3.0 and later, the fitnessScaling property may be set on the subpopulation or the
individual (or both) to multiplicatively influence individual fitness values, as mentioned above.
This is often a more efficient and simpler alternative to defining a global fitness() callback.
One caveat is that fitness calculations are done sequentially for all of the individuals in each
subpopulation, rather than being done in a random order. This means that fitness() callbacks
should be written in such a way as to make each fitness computation independent of all others,
and independent of the order in which they are done. A fitness() callback that produces a
different result the first time it is run in a generation compared to subsequent times, for example,
would introduce bias and order-dependency into a model, particularly since the order of genomes
in each subpopulation is not necessarily random (see section 19.2.1).
19.7 Step 7: Generation count increment
The final step in each generation is that the generation count is incremented. SLiM then checks
whether the simulation is over; if there are no events or callbacks scheduled to execute in the new
generation or any subsequent generation (not counting events and callbacks with no specified end
generation), the simulation is deemed to be over, and execution halts.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

404

20. SLiM architecture (nonWF models)
By default, SLiM uses a Wright-Fisher-type model of evolution,
known in SLiM as a WF model – but an alternative non-WrightFisher, or nonWF, model type may be chosen instead (see section
1.6 and chapter 15). This chapter will discuss the details of the
generation cycle in such models. Chapter 19 provides such
discussion for WF models; to avoid a lot of duplicated verbiage,
this chapter will assume familiarity with the WF generation cycle
as described in chapter 19, and will make reference to that
chapter rather than spelling out every detail a second time.
The figure shown at right is a summary of the life cycle
followed by SLiM within each generation in nonWF models. In
this chapter, we will examine each of these life cycle stages in
more detail, in order to provide a more complete specification of
the internal mechanics of SLiM.
20.1 Step 1: Generation of offspring
In nonWF models, the first thing to happen each generation is
the generation of new offspring. (OK – technically, as in WF
models, this is preceded by resetting the active properties of all
script blocks; see section 19.1). This is the most complex step in
SLiM’s architecture, and in nonWF models it is broken down into
four sub-steps that are executed for each offspring generated, as
shown at right. Processes involved in the generation of offspring
include mate choice, mutation, recombination, and the actual
production of offspring individuals. (In WF models, migration is
also effected during offspring generation, but this is not the case
in nonWF models.)

The sequence of events within one
generation in nonWF models.
1. Generation of offspring;
for each extant individual:
1.1. Call reproduction()
callbacks defined for each
reproducing individual
1.2. The callback(s) make
Subpopulation calls
requesting new offspring
1.3. Generate the candidate
offspring, with mutation
and recombination (incl.
recombination() callbacks)
1.4. Suppress/modify the
candidate, using defined
modifyChild() callbacks
2. Execution of early() events
3. Fitness value recalculation
using fitness() callbacks
4. Viability/survival selection,
based on fitness values and
(optionally) carrying capacity

20.1.1 The order of offspring generation
5. Removal of fixed mutations
For each offspring generated, there are generally several
unless convertToSubstitution==F
decisions to be made: (1) what are its parents, (2) is it male or
female (in sexual simulations), and (3) is it produced by cloning,
6. Execution of late() events
selfing, or biparental mating? In WF models, these decisions are
made by SLiM’s core engine, as described in section 19.2.1,
7. Generation count increment,
based upon parameters such as the sex ratio, cloning rate, and
individual age increments
selfing rate, as well as upon individual fitness values and the
results from mateChoice() callbacks. In nonWF models, in
contrast, all of these decisions are made by the model’s script: reproduction() callbacks are called
once for each individual in the model, to request that each individual generates its own offspring
to be added to the population.
The order in which individuals are asked to generate their offspring is always random in nonWF
models, across the entire population, to try to prevent order-dependencies from biasing offspring
generation. In addition to this, however, it is important to design your reproduction() callbacks to
be independent; in general, the reproductive behavior of each individual should probably be
independent of the reproductive behavior of every other individual, so whether A is asked to
reproduce before B, or B before A, does not bias the outcome of the model. There are certainly
cases where you might violate this principle; in a model of monogamous mating, for example, the
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

405

first female asked to reproduce might be able to claim her choice of males, whereas the last female
asked to reproduce might have little or no choice since most males would already be claimed.
Sometimes such asymmetries are acceptable or even desirable; sometimes, however, they would
constitute a bug. So the message of this subsection is primarily: think carefully about the order of
offspring generation, and about the independence of each reproductive event (or the lack thereof),
and make sure your models does what you really want it to.
20.1.2 Individual-based reproduction with reproduction() callbacks
As mentioned above, reproduction() callbacks are called for each individual in the model.
The task of each callback is to generate the offspring from one focal individual. In sexual models,
perhaps females would generate offspring and males would not (or perhaps the opposite, if males
are the choosy sex in the biological system being modeled). In monogamous-mating models, a
single mate might be chosen and then a litter of offspring generated through crosses with that
mate; in a non-monogamous model each offspring generated might be with a different mate.
Some offspring might be generated via cloning or selfing, rather than biparental mating. The sex of
offspring in sexual models might be equally likely to be male or female, or there might be some
sex-ratio bias. Most importantly, in nonWF models all of these decisions can depend upon the
genetics and other state of each individual, rather than being dictated by overall parameters as in
WF models.
Regardless of how a given model makes these decisions, it always expresses them to SLiM in
one way: by making method calls on Subpopulation objects, requesting the addition of new
offspring. There are four methods that can be called: addCrossed() to add an offspring resulting
from a biparental cross, addSelfed() to add an offspring resulting from selfing (i.e., sexual selffertilization in a hermaphroditic individual), addCloned() to add an offspring resulting from clonal
reproduction, or addEmpty() to an offspring with no parents and no empty genomes (presumably
to be filled in some special way by script, subsequently). Each such method call made sets off the
chain of events described in the following subsections: mutation and recombination, child
modification, and child generation.
20.1.3 Mutation and recombination
Mutation and recombination occur in nonWF models exactly as they do in WF models; see
section 19.2.3 for details.
20.1.4 Child modification
Child modification, via modifyChild() callbacks, occurs in nonWF models largely as it does in
WF models; see section 19.2.4 for details. The important difference is that whereas WF models
have a target subpopulation size to reach, and thus must keep generating offspring until that target
is reached, the same is not true of nonWF models. In nonWF models, therefore, if a
modifyChild() callback returns F to indicate that a given offspring should not be generated (due to
postmating reproductive isolation or genetic incompatibility, for example), that is the end of it.
That offspring will simply not be generated; there will be one fewer offspring individual, in the
end, than there would have been if the modifyChild() callback had returned T. In this case, the
Subpopulation method call that initiated offspring generation, such addCrossed(), will return NULL
to its caller. This is generally desirable; see section 22.7 for further discussion of how this behavior
interacts with reproduction() callbacks.
20.1.5 Child generation
Once a candidate offspring has been generated and modified, and was not suppressed by a
modifyChild() callback, it is queued for addition to the target subpopulation. Note that the newly
added child will not be visible as a member of the subpopulation until the end of offspring
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

406

generation. This prevents newly generated offspring from being chosen as mates, or otherwise
influencing the remainder of the child generation process.
20.2 Step 2: Execution of early() Eidos events
Offspring is followed by the execution of early() Eidos events defined by the user, if any. The
mechanics of this are the same as in WF models, but the semantics of early() versus late()
events are actually reversed in nonWF models, in some ways.
In WF models, early() events occur just before offspring generation and late() events happen
just after; in nonWF models this positioning is reversed, because offspring generation has been
moved earlier in the generation cycle. Similarly, in WF models late() events are usually the best
place to add new mutations, because the next stage is fitness evaluation, which immediately
incorporates the fitness effects of the new mutations; but in nonWF models early() events are
usually the best place to add new mutations, for the same reason.
In WF models, late() events are generally the best place to put output events so that they
reflect the final state at the end of a generation; in nonWF models, however, output can make
sense in either an early() or a late() event, depending upon whether you want to see the state of
the population before or after viability selection. However, reading in a previously written output
file, or otherwise setting up new population state, is generally best done in an early() event in
nonWF models so that fitness values are recalculated immediately after the change (just as with
adding new mutations). So if you call outputFull() with the intention of reading the output back
in with readFromPopulationFile() later, you will probably want to do the output in an early()
event so that you can, correspondingly, do the read in an early() event as well without skipping
or doubling any generation cycle stages.
For those who are curious: the reason for this reordering of the generation cycle in nonWF
models is the addition of the viability/survival generation cycle stage, which does not exist in WF
models. It is desirable to have an opportunity for scripted events between offspring generation and
selection, and then between selection and the next offspring generation stage. It is also desirable
for selection to happen after offspring generation in the cycle, so that the population state at the
end of each generation (as displayed in SLiMgui, for example) is after selection has occurred.
These constraints, taken together, dictate the order of the generation cycle in nonWF models.
20.3 Step 3: Fitness value recalculation
After the execution of early() events, the next stage in nonWF models is to compute fitness
values for all individuals, including the effects of any fitness() callbacks. The mechanics of this
are exactly the same in nonWF models as in WF models; see the extensive discussion in section
19.6.
However, because of the reordering of the generation cycle, the semantics of fitness evaluation
in nonWF models are a bit different. In WF models, as section 19.6 explains, the fitness values
calculated in generation T are actually used, to influence mating, in generation T+1. In nonWF
models, however, fitness values calculated in generation T are then used immediately, in
generation T, to influence survival (see the next subsection). This oddity of WF models is therefore
not present in nonWF models, making them a bit conceptually simpler.
The actual meaning of individual fitness values is also different between WF and nonWF
models; see the next section for discussion.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

407

20.4 Step 4: Viability/survival selection
After fitness values have been recalculated for all individuals, viability selection then occurs
immediately. In WF models, viability selection does not exist (or you could consider new offspring
to have a survival rate of 100%, and the parental generation to have a survival rate of 0%, if you
like). In nonWF models, in contrast, viability selection is the primary way in which differential
fitness is expressed; individual fitness values influence survival, not mating success.
Viability selection in nonWF models is mechanistically simple. For a given individual, a fitness
of 0.0 or less results in certain death; that individual is immediately removed from its
subpopulation. A fitness of 1.0 or greater results in certain survival; that individual remains in its
subpopulation, and will live into the next generation cycle. A fitness greater than 0.0 and less than
1.0 is interpreted as a survival probability; SLiM will do a random draw to determine whether the
individual survives or not.
This change in the way individual fitness is used has large consequences. First of all, it means
that in nonWF models fitness is absolute fitness, whereas in WF models fitness is relative fitness.
Second, it means that in nonWF models selection is generally hard selection, reducing the size of
the population proportionate to mean population fitness, whereas in WF models it is generally soft
selection, changing the relative success of particular genes but not changing the size of the
population. Third, it means that in nonWF models the population is not automatically regulated –
both extinction and unbounded exponential growth are very real possibilities – whereas in WF
models SLiM automatically regulates the population size. These three observations are all really
different ways of looking at the same basic fact.
Because of this shift, nonWF models need to treat individual fitness differently than WF models.
For one thing, there is generally a need to introduce some sort of density-dependent fitness, in a
global fitness() callback, that prevents exponential growth by decreasing individual fitness as the
population size gets larger. Second, there may be a need to rethink how beneficial mutations
work, since increasing the fitness of an individual above 1.0 has no effect in nonWF models (since
guaranteed survival is as good as it gets); it may be desirable to have a baseline fitness, for
individuals possessing empty genomes, of less than 1.0 so that beneficial mutations can increase
the probability of survival above that baseline. This would also be achieved with a global
fitness() callback (perhaps in conjunction with density-dependence). All of this is entirely up to
the model’s script.
Finally, it is worth noting that it is certainly to have genetics and other individual state influence
mating success and/or fecundity in nonWF models, as in WF models. In nonWF models that is
done by influencing the dynamics in the offspring generation stage in script, however; it is not an
automatic consequence of the fitness values calculated by SLiM.
20.5 Step 5: Removal of fixed mutations
After viability selection, SLiM tallies mutation frequencies and removes fixed and lost
mutations. The mechanics of this are essentially the same as in WF models; see section 19.3.
However, there is one important difference here between WF and nonWF models. In WF
models, fixed mutations can generally be removed because they no longer influence evolutionary
dynamics, as a consequence of fitness values being relative fitness. If every individual in the
population is fixed for a given mutation, then that mutation has no effect on relative fitness,
regardless of what its selection coefficient might be; the only exceptions are cases where the fitness
effect of the mutation varies from individual to individual, such as when epistatic interactions with
other segregating mutations are present. In WF models, the convertToSubstitution property of
mutation types therefore defaults to T, allowing SLiM to remove fixed mutations by default; when
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

408

that is not desirable, models need to set convertToSubstitution to F to prevent that automatic
removal.
In nonWF models, in contrast, the convertToSubstitution property defaults to F, because in
general it is not safe for SLiM to remove fixed mutations; models need to set it to T to allow
automatic removal. Generally, in nonWF models automatic removal of fixed mutations only
makes sense if several conditions are met: (1) the mutation is neutral, (2) the mutation has no direct
non-neutral effects due to any fitness() callback, and (3) the mutation has no indirect nonneutral effects on the model through epistasis, mate choice, fecundity, or any other such influences
on the model. If these conditions are met – as is commonly the case for simple neutral
background mutations – it is very important to set convertToSubstitution to T; the effect on
SLiM’s performance can be very large!
20.6 Step 6: Execution of late() Eidos events
Once fixed mutations have been removed, the next step is to execute any defined and active
late() events. This works identically to WF models (see section 19.5). The only important
difference is the way in which the semantics and common uses of early() and late() events differ
between WF and nonWF models, as discussed in section 20.2.
20.7 Step 7: Generation count increment
As in WF models, the last stage of the generation cycle is the incrementing of the generation
count and the check for the simulation being finished (see section 19.7). In nonWF models, the
age of all surviving individuals is also incremented by one during this stage.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

409

21. SLiM classes
This chapter presents reference documentation for all of the Eidos classes built into SLiM. It
assumes an understanding of the syntax of type-specifiers and method signatures, including the use
of optional arguments and default values. It also assumes an understanding of how classes are
used in Eidos, including how properties are accessed and how methods are called. Part I of this
manual attempts to introduce those topics to some degree, but see the Eidos manual for more
extensive discussion and documentation of these fundamental language features.
A caveat to keep in mind throughout this section is that SLiM 2 and later (unlike SLiM 1.8 and
earlier) generally uses zero-based values; a chromosome might start at index 0 and go to index
99999, for example. Eidos and SLiMgui present this zero-based worldview as well.
21.1 Simulation initialization: initialize() callbacks
Before a SLiM simulation can be run, the various classes underlying the simulation need to be
set up with an initial configuration. In SLiM 1.8 and earlier, this was done by means of # directives
in the simulation’s input file. In SLiM 2 and later, simulation parameters are instead configured
using Eidos.
Configuration in Eidos is done in initialize() callbacks that run prior to the beginning of
simulation execution. Eidos callbacks are discussed more broadly in chapter 22, but for our
present purposes, the idea is very simple. In your input file, you can simply write something like
this:
initialize() { ... }

The initialize() specifies that the script block is to be executed as an initialize() callback
before the simulation starts. The script between the braces {} would set up various aspects of the
simulation by calling initialization functions. These are SLiM functions that may be called only in
an initialize() callback, and their names begin with initialize to mark them clearly as such.
You may also use other Eidos functionality, of course; for example, you might automate generating
a large number of subpopulations with complex migration patterns by using a for loop.
One thing worth mentioning is that in the context of an initialize() callback, the sim global
for the simulation itself is not defined. This is because the state of the simulation is not yet
constructed fully, and accessing partially constructed state would not be safe.
Without further ado, then, here are the initialization functions provided by SLiM:
(void)initializeGeneConversion(numeric$ conversionFraction, numeric$ meanLength)

Configure the likelihood and behavior of gene conversion events. The probability of gene conversion
occurring for any one recombination event is given by conversionFraction, and the mean of the
geometric distribution from which the length of the gene conversion stretch will be drawn is specified
by meanLength (specified in base positions). Note that chromosome positions with a recombination
rate of exactly 0.5 will not be candidates for gene conversion, as it is assumed that they represent
junction points between discrete chromosomes.
(void)initializeGenomicElement(io$ genomicElementType,
integer$ start, integer$ end)

Add a genomic element to the chromosome at initialization time. The start and end parameters give
the first and last base positions to be spanned by the new genomic element. The new element based
upon the genomic element type identified by genomicElementType, which can be either an
integer, representing the ID of the desired element type, or an object of type
GenomicElementType, specified directly.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

410

(object$)initializeGenomicElementType(is$ id,
io mutationTypes, numeric proportions)

Add a genomic element type at initialization time. The id must not already be used for any genomic
element type in the simulation. The mutationTypes vector identifies the mutation types used by the
genomic element, and the proportions vector should be of equal length, specifying the relative
proportion of mutations that will be draw from the corresponding mutation type (proportions do not
need to add up to one; they are interpreted relatively). The id parameter may be either an integer
giving the ID of the new genomic element type, or a string giving the name of the new genomic
element type (such as "g5" to specify an ID of 5). The mutationTypes parameter may be either an
integer vector representing the IDs of the desired mutation types, or an object vector of
MutationType elements specified directly. The global symbol for the new genomic element type is
immediately available; the return value also provides the new object.
(object$)initializeInteractionType(is$ id, string$ spatiality,
[logical$ reciprocal = F], [numeric$ maxDistance = INF],
[string$ sexSegregation = "**"])

Add an interaction type at initialization time. The id must not already be used for any interaction type
in the simulation. The id parameter may be either an integer giving the ID of the new interaction
type, or a string giving the name of the new interaction type (such as "i5" to specify an ID of 5).
The spatiality may be "", for non-spatial interactions (i.e., interactions that do not depend upon
the distance between individuals); "x", "y", or "z" for one-dimensional interactions; "xy", "xz", or
"yz" for two-dimensional interactions; or "xyz" for three-dimensional interactions. The dimensions
referenced by spatiality must have been previously defined as spatial dimensions with
initializeSLiMOptions(); if the simulation has dimensionality "xy", for example, then
interactions in the simulation may have spatiality "", "x", "y", or "xy", but may not reference spatial
dimension z and thus may not have spatiality "xz", "yz", or "xyz". If no spatial dimensions have
been configured, only non-spatial interactions may be defined.
The reciprocal flag may be T, in which case the interaction is guaranteed by the user to be
reciprocal: whatever the interaction strength is for individual B upon individual A, it will be equal (in
magnitude and sign) for A upon B. This allows the InteractionType to reduce the amount of
computation necessary by up to a factor of two. If reciprocal is F, the interaction is not guaranteed
to be reciprocal and each interaction will be computed independently. The built-in interaction
formulas are all reciprocal, but if you implement an interaction() callback (see section 22.6), you
must consider whether the callback you have implemented preserves reciprocality or not. For this
reason, the default is reciprocal=F, so that bugs are not inadvertently introduced by an invalid
assumption of reciprocality. See below for a note regarding reciprocality in sexual simulations when
using the sexSegregation flag.
Note that even if an interaction is reciprocal, it may occasionally be slightly faster for reciprocal to
be set to F. This is most likely when the amount of computation per interaction is very small
(particularly if no interaction() callbacks are involved), and when it is unlikely that the reciprocal
of a queried interaction will also be queried. Even in such cases, however, the slowdown for
reciprocal=T should be fairly small. In most usage cases, setting reciprocal to T (when the
interaction is in fact reciprocal) will result in at least equal performance, if not better; with a very slow
interaction() callback, the performance can be as much as double, making it generally worthwhile
to use reciprocal=T when possible. However, for maximal performance one might wish to time and
compare runs with reciprocality enabled and disabled (using the same random number seed).
The maxDistance parameter supplies the maximum distance over which interactions of this type will
be evaluated; at greater distances, the interaction strength is considered to be zero (for efficiency). The
default value of maxDistance, INF (positive infinity), indicates that there is no maximum interaction
distance; note that this can make some interaction queries much less efficient, and is therefore not
recommended. In SLiM 3.1 and later, a warning will be issued if a spatial interaction type is defined
with no maximum distance to encourage a maximum distance to be defined.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

411

The sexSegregation parameter governs the applicability of the interaction to each sex, in sexual
simulations. It does not affect distance calculations in any way; it only modifies the way in which
interaction strengths are calculated. The default, "**", implies that the interaction is felt by both sexes
(the first character of the string value) and is exerted by both sexes (the second character of the
string value). Either or both characters may be M or F instead; for example, "MM" would indicate a
male-male interaction, such as male-male competition, whereas "FM" would indicate an interaction
influencing only females that is influenced only by males, such as male mating displays that influence
female attraction. This parameter may be set only to "**" unless sex has been enabled with
initializeSex(). Note that a value of sexSegregation other than "**" may imply some degree of
non-reciprocality, but it is not necessary to specify reciprocal to be F for this reason; SLiM will take
the sex-segregation of the interaction into account for you. The value of reciprocal may therefore be
interpreted as meaning: in those cases, if any, in which A interacts with B and B interacts with A, is the
interaction strength guaranteed to be the same in both directions?
By default, the interaction strength is 1.0 for all interactions within maxDistance. Often it is
desirable to change the interaction function using setInteractionFunction(); modifying
interaction strengths can also be achieved with interaction() callbacks if necessary (see section
22.6). In any case, interactions beyond maxDistance always have a strength of 0.0, and the
interaction strength of an individual with itself is always 0.0, regardless of the interaction function or
callbacks.
The global symbol for the new interaction type is immediately available; the return value also provides
the new object.
(void)initializeMutationRate(numeric rates, [Ni ends = NULL], [string$ sex = "*"])

Set the mutation rate per base position per generation along the chromosome. To be precise, this
mutation rate is the expected mean number of mutations that will occur per base position per
generation (per new offspring genome being generated); note that this is different from how the
recombination rate is defined (see initializeRecombinationRate()). The number of mutations
that actually occurs at a given base position when generating an offspring genome is, in effect, drawn
from a Poisson distribution with that expected mean (but under the hood SLiM uses a mathematically
equivalent but much more efficient strategy). It is possible for this Poisson draw to indicate that two or
more new mutations have arisen at the same base position, particularly when the mutation rate is very
high; in this case, the new mutations will be added to the site one at a time, and as always the
mutation stacking policy (see section 1.5.3) will be followed.
There are two ways to call this function. If the optional ends parameter is NULL (the default), then
rates must be a singleton value that specifies a single mutation rate to be used along the entire
chromosome. If, on the other hand, ends is supplied, then rates and ends must be the same length,
and the values in ends must be specified in ascending order. In that case, rates and ends taken
together specify the mutation rates to be used along successive contiguous stretches of the
chromosome, from beginning to end; the last position specified in ends should extend to the end of
the chromosome (i.e. at least to the end of the last genomic element, if not further).
For example, if the following call is made:
initializeMutationRate(c(1e-7, 2.5e-8), c(5000, 9999));

then the result is that the mutation rate for bases 0...5000 (inclusive) will be 1e-7, and the rate for
bases 5001...9999 (inclusive) will be 2.5e-8.
Note that mutations are generated by SLiM only within genomic elements, regardless of the mutation
rate map. In effect, the mutation rate map given is intersected with the coverage area of the genomic
elements defined; areas outside of any genomic element are given a mutation rate of zero. There is no
harm in supplying a mutation rate map that specifies rates for areas outside of the genomic elements
defined; that rate information is simply not used. The overallMutationRate family of properties on
Chromosome provide the overall mutation rate after genomic element coverage has been taken into
account, so it will reflect the rate at which new mutations will actually be generated in the simulation
as configured.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

412

If the optional sex parameter is "*" (the default), then the supplied mutation rate map will be used for
both sexes (which is the only option for hermaphroditic simulations). In sexual simulations sex may
be "M" or "F" instead, in which case the supplied mutation rate map is used only for that sex (i.e.,
when generating a gamete from a parent of that sex). In this case, two calls must be made to
initializeMutationRate(), one for each sex, even if a rate of zero is desired for the other sex; no
default mutation rate map is supplied.
(object$)initializeMutationType(is$ id, numeric$ dominanceCoeff,
string$ distributionType, ...)

Add a mutation type at initialization time. The id must not already be used for any mutation type in
the simulation. The id parameter may be either an integer giving the ID of the new mutation type,
or a string giving the name of the new mutation type (such as "m5" to specify an ID of 5). The
dominanceCoeff parameter supplies the dominance coefficient for the mutation type; 0.0 produces
no dominance, 1.0 complete dominance, and values greater than 1.0, overdominance. The
distributionType may be "f", in which case the ellipsis ... should supply a numeric$ fixed
selection coefficient; "e", in which case the ellipsis should supply a numeric$ mean selection
coefficient for an exponential distribution; "g", in which case the ellipsis should supply a numeric$
mean selection coefficient and a numeric$ alpha shape parameter for a gamma distribution; "n", in
which case the ellipsis should supply a numeric$ mean selection coefficient and a numeric$ sigma
(standard deviation) parameter for a normal distribution; "w", in which case the ellipsis should supply
a numeric$ λ scale parameter and a numeric$ k shape parameter for a Weibull distribution; or "s",
in which case the ellipsis should supply a string$ Eidos script parameter. See section 21.9 for
discussion of the various DFEs and their uses. The global symbol for the new mutation type is
immediately available; the return value also provides the new object.
Note that by default in WF models, all mutations of a given mutation type will be converted into
Substitution objects when they reach fixation, for efficiency reasons. If you need to disable this
conversion, to keep mutations of a given type active in the simulation even after they have fixed, you
can do so by setting the convertToSubstitution property of MutationType to T.
In contrast, by default in nonWF models mutations will not be converted into Substitution objects
when they reach fixation; convertToSubstitution is F by default in nonWF models. To enable
conversion in nonWF models for neutral mutation types with no indirect fitness effects, you should
therefore set convertToSubstitution to T.
See sections 18.3, 19.5, and 20.9.1 for further discussion regarding the convertToSubstitution
property.
(void)initializeRecombinationRate(numeric rates, [Ni ends = NULL],
[string$ sex = "*"])

Set the recombination rate per base position per generation along the chromosome. To be precise,
this recombination rate is the probability that a breakpoint will occur between one base and the next
base; note that this is different from how the mutation rate is defined (see
initializeMutationRate()). All rates must be in the interval [0.0, 0.5]. A rate of 0.5 implies
complete independence between the adjacent bases, which might be used to implement independent
assortment of loci located on different chromosomes (see the example below). Whether a breakpoint
occurs between two bases is then, in effect, determined by a binomial draw with a single trial and the
given rate as probability (but under the hood SLiM uses a mathematically equivalent but much more
efficient strategy). Unlike the mutational process in SLiM, then, which can generate more than one
mutation at a given site (in one generation/genome), the recombinational process in SLiM will never
generate more then one crossover between one base and the next (in one generation/genome), and a
supplied rate of 0.5 will therefore result in an actual probability of 0.5 for a crossover at the relevant
position. (Note that this was not true in SLiM 2.x and earlier, however; their implementation of
recombination resulted in a crossover probability of about 39.3% for a rate of 0.5, due to the use of
an inaccurate approximation method. Recombination rates lower than about 0.01 would have been
essentially exact, since the approximation error became large only as the rate approached 0.5.)
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

413

There are two ways to call this function. If the optional ends parameter is NULL (the default), then
rates must be a singleton value that specifies a single recombination rate to be used along the entire
chromosome. If, on the other hand, ends is supplied, then rates and ends must be the same length,
and the values in ends must be specified in ascending order. In that case, rates and ends taken
together specify the recombination rates to be used along successive contiguous stretches of the
chromosome, from beginning to end; the last position specified in ends should extend to the end of
the chromosome (i.e. at least to the end of the last genomic element, if not further). Note that a
recombination rate of 1 centimorgan/Mbp corresponds to a recombination rate of 1e-8 in the units
used by SLiM.
For example, if the following call is made:
initializeRecombinationRate(c(0, 0.5, 0), c(5000, 5001, 9999));

then the result is that the recombination rates between bases 0 / 1, 1 / 2, ..., 4999 / 5000 will be 0, the
rate between bases 5000 / 5001 will be 0.5, and the rate between bases 5001 / 5002 onward (up to
9998 / 9999) will again be 0. Setting the recombination rate between one specific pair of bases to 0.5
forces recombination to occur with a probability of 0.5 between those bases, which effectively breaks
the simulated locus into separate chromosomes at that point; this example effectively has one
simulated chromosome from base position 0 to 5000, and another from 5001 to 9999.
If the optional sex parameter is "*" (the default), then the supplied recombination rate map will be
used for both sexes (which is the only option for hermaphroditic simulations). In sexual simulations
sex may be "M" or "F" instead, in which case the supplied recombination map is used only for that
sex. In this case, two calls must be made to initializeRecombinationRate(), one for each sex,
even if a rate of zero is desired for the other sex; no default recombination map is supplied.
(void)initializeSex(string$ chromosomeType, [numeric$ xDominanceCoeff = 1])

Enable and configure sex in the simulation. The argument chromosomeType gives the type of
chromosome to be simulated; this should be "A", "X", or "Y". If the chromosomeType is "X", the
optional xDominanceCoeff parameter can supply the dominance coefficient used when a mutation is
present in an XY male, and is thus “heterozygous” (but in a different sense than the heterozygosity of
an XX female with one copy of the mutation). Calling this function has the side effect of enabling sex
in the simulation; individuals will be male and female (rather than hermaphroditic) regardless of the
chromosomeType chosen for simulation. There is no way to disable sex once it has been enabled; if
you don’t want to have sex, don’t call this function.
(void)initializeSLiMModelType(string$ modelType)

Configure the type of SLiM model used for the simulation. At present, one of two model types may be
selected. If modelType is "WF", SLiM will use a Wright-Fisher (WF) model; this is the model type that
has always been supported by SLiM, and is the model type used if initializeSLiMModelType() is
not called. If modelType is "nonWF", SLiM will use a non-Wright-Fisher (nonWF) model instead; this
is a new model type supported by SLiM 3.0 and above (see section 1.6).
If initializeSLiMModelType() is called at all then it must be called before any other initialization
function, so that SLiM knows from the outset which features are enabled and which are not.
(void)initializeSLiMOptions([logical$ keepPedigrees = F],
[string$ dimensionality = ""], [string$ periodicity = ""],
[integer$ mutationRuns = 0], [logical$ preventIncidentalSelfing = F])

Configure options for the simulation. If initializeSLiMOptions() is called at all then it must be
called before any other initialization function (except initializeSLiMModelType()), so that SLiM
knows from the outset which optional features are enabled and which are not.
If keepPedigrees is T, SLiM will keep pedigree information for every individual in the simulation,
tracking the identity of its parents and grandparents. This allows individuals to assess their degree of
pedigree-based relatedness to other individuals (see Individual’s relatedness() method, section
21.6.2), as well as allowing a model to find “trios” (two parents and an offspring they generated) using
the pedigree properties of Individual (section 21.6.1). As a side effect of keepPedigrees being T,
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

414

the pedigreeID, pedigreeParentIDs, and pedigreeGrandparentIDs properties of Individual
will have defined values (see section 21.6.1), as will the genomePedigreeID property of Genome (see
section 21.3.1). Note that pedigree-based relatedness doesn’t necessarily correspond to genetic
relatedness, due to effects such as assortment and recombination. For an overview of other ways of
tracking genetic ancestry, including true local ancestry at each position on the chromosome, see
section 13.9.
If dimensionality is not "", SLiM will enable its optional “continuous space” facility. Three values
for dimensionality are presently supported: "x", "xy", and "xyz", specifying that continuous space
should be enabled for one, two, or three dimensions, respectively, using (x), (x, y), and (x, y, z)
coordinates respectively. This has a number of side effects. First of all, it means that the specified
properties of Individual (x, y, and/or z) will be interpreted by SLiM as spatial positions; in particular,
SLiMgui will use those properties to display subpopulations spatially. Second, it allows spatial
interactions to be defined, evaluated, and queried using initializeInteractionType() and
interaction() callbacks. And third, it enables the use of any other properties and methods related
to continuous space, such as setting the spatial boundaries of subpopulations, which would otherwise
raise an error.
If periodicity is not "", SLiM will designate the specified spatial dimensions as being periodic –
wrapping around at the edges of the spatial boundaries of that dimension. This option may only be
used if the dimensionality parameter to initializeSLiMOptions() has been used to enable
spatiality in the model, and only spatial dimensions that were specified in the dimensionality of the
model may be declared to be periodic (but if desired, it is permissible to make just a subset of those
dimensions periodic; it is not an all-or-none proposition). For example, if the specified dimensionality
is "xy", the model’s periodicity may be "x", "y", or "xy" (or "", the default, to specify that there are
no periodic dimensions). A one-dimensional periodic model would model a space like the perimeter
of a circle. A two-dimensional model periodic in one of those dimensions would model a space like a
cylinder without its end caps; if periodic in both dimensions, the modeled space is a torus. The
shapes of three-dimensional periodic models are harder to visualize, but are essentially higherdimensional analogues of these concepts. Periodic boundary conditions are commonly used to model
spatial scenarios without “edge effects”, since there are no edges in the periodic spatial dimensions.
The pointPeriodic() method of Subpopulation is typically used in conjunction with this option,
to actually implement the periodic boundary condition for the specified dimensions.
If mutationRuns is not 0, SLiM will use the value given as the number of mutation runs inside Genome
objects; if it is 0 (the default), SLiM will calculate a number of mutation runs that it estimates will work
well. Internally, SLiM divides genomes into a sequence of consecutive mutation runs, allowing more
efficient internal computations. The optimal mutation run length is short enough that each mutation
run is relatively unlikely to be modified by mutation/recombination events when inherited, but long
enough that each mutation run is likely to contain a relatively large number of mutations; these
priorities are in tension, so an intermediate balance between them is generally desirable. The optimal
number of mutation runs will depend upon the machine and even the compiler used to build SLiM, so
SLiM’s default value may not be optimal; for maximal performance it can thus be beneficial to
experiment with different values and find the optimal value for the simulation – a process which SLiM
can assist with (see section 18.4). Specifying the number of mutation runs is an advanced technique,
but in certain cases it can improve performance significantly.
If preventIncidentalSelfing is T, incidental selfing in hermaphroditic models will be prevented by
SLiM. By default (i.e., if preventIncidentalSelfing is F), SLiM chooses the first and second
parents in a biparental mating event independently. It is therefore possible for the same individual to
be chosen as both the first and second parent, resulting in selfing events even when the selfing rate is
zero. In many models this is unimportant, since it happens fairly infrequently and does not have large
consequences. This behavior is SLiM’s default because it is the simplest option, and produces results
that most closely align with simple analytical population genetics models. However, in some models
this selfing can be undesirable and problematic. In particular, models that involve very high variance
in fitness or very small effective population sizes may see elevated rates of selfing that substantially
influence model results. If preventIncidentalSelfing is set to T, all such incidental selfing will be
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

415

prevented (by choosing a new second parent if the first parent was chosen again). Non-incidental
selfing, as requested by the selfing rate, will still be permitted. Note that if incidental selfing is
prevented, SLiM will hang if it is unable to find a different second parent; there must always be at least
two individuals in the population with non-zero fitness, and mateChoice() and modifyChild()
callbacks must not absolutely prevent those two individuals from producing viable offspring.
Enforcement of the prohibition on incidental selfing will occur after mateChoice() callbacks have
been called (and thus the default mating weights provided to mateChoice() callbacks will not
exclude the first parent!), but will occur before modifyChild() callbacks are called (so those
callbacks may assume that the first and second parents are distinct).
This function will likely be extended with further options in the future, added on to the end of the
argument list. Using named arguments with this call is recommended for readability. Note that
turning on optional features may increase the runtime and memory footprint of SLiM.
(void)initializeTreeSeq([logical$ recordMutations = T],
[float$ simplificationRatio = 10], [logical$ checkCoalescence = F],
[logical$ runCrosschecks = F])

Configure options for tree sequence recording. Calling this function turns on tree sequence recording,
as a side effect, for later reconstruction of the simulation’s evolutionary dynamics; if you do not want
tree sequence recording to be enabled, do not call this function.
The recordMutations flag controls whether information about individual mutations is recorded or
not. Such recording takes time and memory, and so can be turned off if only the tree sequence itself is
needed, but it is turned on by default since mutation recording is generally useful.
The simplificationRatio parameter controls how often automatic simplification of the recorded
tree sequence occurs. This is a speed–memory tradeoff: more frequent simplification (lower
simplificationRatio) means the stored tree sequences will use less memory, but at a cost of
somewhat longer run times. Conversely, a larger simplificationRatio means that SLiM will wait
longer between simplifications. SLiM will try to find an optimal generation interval for simplification
such that the ratio of the memory used by the tree sequence tables, (before:after) simplification, is
close to the requested ratio. The default of 10 thus requests that SLiM try to find a generation interval
such that the maximum size of the stored tree sequences is ten times the size after simplification. INF
may be supplied as a special value indicating that automatic simplification should never occur; 0 may
be supplied to indicate that automatic simplification should be performed at the end of every
generation.
The checkCoalescence parameter controls whether a check for full coalescence is conducted after
each simplification. If a model will call treeSeqCoalesced() to check for coalescence during its
execution, checkCoalescence should be set to T. Since the coalescence checks entail a performance
penalty, the default of F is preferable otherwise. See the documentation for treeSeqCoalesced() for
further discussion.
The runCrosschecks parameter controls whether cross-checks between SLiM’s internal data
structures and the tree-sequence recording data structures will be conducted. These two sets of data
structures record much the same thing (mutations in genomes), but using completely different
representations, so such cross-checks can be useful to confirm that the two data structures do indeed
represent the same conceptual state. This slows down the model considerably, however, and would
normally be turned on only for debugging purposes, so it is turned off by default.

Once all initialize() callbacks have executed, in the order in which they are specified in the
SLiM input file, the simulation will begin. The generation number at which it starts is determined
by the Eidos events you have defined (see section 22.1); the first generation in which an Eidos
event is scheduled to execute is the generation at which the simulation starts. Similarly, the
simulation will terminate after the last generation for which a script block (either an event or a
callback) is registered to execute, unless the stop() function is called to end the simulation earlier.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

416

21.2 Class Chromosome
This class represents the layout and properties of the chromosome being simulated. The
chromosome currently being simulated is available through the sim.chromosome global. Section
1.5.4 presents an overview of the conceptual role of this class.
21.2.1 Chromosome properties
colorSubstitution <–> (string$)

The color used to display substitutions in SLiMgui when both mutations and substitutions are being
displayed in the chromosome view. Outside of SLiMgui, this property still exists, but is not used by
SLiM. Colors may be specified by name, or with hexadecimal RGB values of the form "#RRGGBB" (see
the Eidos manual). If colorSubstitution is the empty string, "", SLiMgui will defer to the color
scheme of each MutationType, just as it does when only substitutions are being displayed. The
default, "3333FF", causes all substitutions to be shown as dark blue when displayed in conjunction
with mutations, to prevent the view from becoming too noisy. Note that when substitutions are
displayed without mutations also being displayed, this value is ignored by SLiMgui and the
substitutions use the color scheme of each MutationType.
geneConversionFraction <–> (float$)

The fraction of crossover events that result in gene conversion; see SLiM’s manual for details.
geneConversionMeanLength <–> (float$)

The mean length of a gene conversion event (in base positions).
genomicElements => (object)

All of the GenomicElement objects that comprise the chromosome.
lastPosition => (integer$)

The last valid position in the chromosome; its length, essentially.
mutationEndPositions => (integer)

The end positions for mutation rate regions along the chromosome. Each mutation rate region is
assumed to start at the position following the end of the previous mutation rate region; in other words,
the regions are assumed to be contiguous. When using sex-specific mutation rate maps, this property
will unavailable; see mutationEndPositionsF and mutationEndPositionsM.
mutationEndPositionsF => (integer)

The end positions for mutation rate regions for females, when using sex-specific mutation rate maps;
unavailable otherwise. See mutationEndPositions for further explanation.
mutationEndPositionsM => (integer)

The end positions for mutation rate regions for males, when using sex-specific mutation rate maps;
unavailable otherwise. See mutationEndPositions for further explanation.
mutationRates => (float)

The mutation rate for each of the mutation rate regions specified by mutationEndPositions. When
using sex-specific mutation rate maps, this property will be unavailable; see mutationRatesF and
mutationRatesM.
mutationRatesF => (float)

The mutation rate for each of the mutation rate regions specified by mutationEndPositionsF, when
using sex-specific mutation rate maps; unavailable otherwise.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

417

mutationRatesM => (float)

The mutation rate for each of the mutation rate regions specified by mutationEndPositionsM, when
using sex-specific mutation rate maps; unavailable otherwise.
overallMutationRate => (float$)

The overall mutation rate across the whole chromosome determining the overall number of mutation
events that will occur anywhere in the chromosome, as calculated from the individual mutation
ranges and rates as well as the coverage of the chromosome by genomic elements (since mutations are
only generated within genomic elements, regardless of the mutation rate map). When using sexspecific mutation rate maps, this property will unavailable; see overallMutationRateF and
overallMutationRateM.
overallMutationRateF => (float$)

The overall mutation rate for females, when using sex-specific mutation rate maps; unavailable
otherwise. See overallMutationRate for further explanation.
overallMutationRateM => (float$)

The overall mutation rate for males, when using sex-specific mutation rate maps; unavailable
otherwise. See overallMutationRate for further explanation.
overallRecombinationRate => (float$)

The overall recombination rate across the whole chromosome determining the overall number of
recombination events that will occur anywhere in the chromosome, as calculated from the individual
recombination ranges and rates. When using sex-specific recombination maps, this property will
unavailable; see overallRecombinationRateF and overallRecombinationRateM.
overallRecombinationRateF => (float$)

The overall recombination rate for females, when using sex-specific recombination maps; unavailable
otherwise. See overallRecombinationRate for further explanation.
overallRecombinationRateM => (float$)

The overall recombination rate for males, when using sex-specific recombination maps; unavailable
otherwise. See overallRecombinationRate for further explanation.
recombinationEndPositions => (integer)

The end positions for recombination regions along the chromosome. Each recombination region is
assumed to start at the position following the end of the previous recombination region; in other
words, the regions are assumed to be contiguous. When using sex-specific recombination maps, this
property will unavailable; see recombinationEndPositionsF and recombinationEndPositionsM.
recombinationEndPositionsF => (integer)

The end positions for recombination regions for females, when using sex-specific recombination
maps; unavailable otherwise. See recombinationEndPositions for further explanation.
recombinationEndPositionsM => (integer)

The end positions for recombination regions for males, when using sex-specific recombination maps;
unavailable otherwise. See recombinationEndPositions for further explanation.
recombinationRates => (float)

The recombination rate for each of the recombination regions specified by
recombinationEndPositions. When using sex-specific recombination maps, this property will
unavailable; see recombinationRatesF and recombinationRatesM.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

418

recombinationRatesF => (float)

The recombination rate for each of the recombination regions specified by
recombinationEndPositionsF, when using sex-specific recombination maps; unavailable
otherwise.
recombinationRatesM => (float)

The recombination rate for each of the recombination regions specified by
recombinationEndPositionsM, when using sex-specific recombination maps; unavailable
otherwise.
tag <–> (integer$)

A user-defined integer value. The value of tag is initially undefined (i.e., has an effectively random
value that could be different every time you run your model); if you wish it to have a defined value,
you must arrange that yourself by explicitly setting its value prior to using it elsewhere in your code.
The value of tag is not used by SLiM; it is free for you to use.

21.2.2 Chromosome methods
– (integer)drawBreakpoints([No$ parent = NULL], [Ni$ n = NULL])

Draw recombination breakpoints, using the chromosome’s recombination rate map, the current gene
conversion parameters, and (in some cases – see below) any active and applicable recombination()
callbacks. The number of breakpoints to generate, n, may be supplied; if it is NULL (the default), the
number of breakpoints will be drawn based upon the overall recombination rate and the chromosome
length (following the standard procedure in SLiM). Note that if gene conversion is enabled, the
number of breakpoints generated may not be equal to the number requested, because any given
breakpoint might become a gene conversion event, which entails an additional breakpoint (to
terminate the gene conversion tract).
It is generally recommended that the parent individual be supplied to this method, but parent is NULL
by default. The individual supplied in parent is used for two purposes. First, in sexual models that
define separate recombination rate maps for males versus females, the sex of parent will be used to
determine which map is used; in this case, a non-NULL value must be supplied for parent, since the
choice of recombination rate map must be determined. Second, in models that define
recombination() callbacks, parent is used to determine the various pseudo-parameters that are
passed to recombination() callbacks (individual, genome1, genome2, subpop), and the
subpopulation to which parent belongs is used to select which recombination() callbacks are
applicable; given the necessity of this information, recombination() callbacks will not be called as a
side effect of this method if parent is NULL. Apart from these two uses, parent is not used, and the
caller does not guarantee that the generated breakpoints will actually be used to recombine the
genomes of parent in particular.
– (void)setMutationRate(numeric rates, [Ni ends = NULL], [string$ sex = "*"])

Set the mutation rate per base position per generation along the chromosome. There are two ways to
call this method. If the optional ends parameter is NULL (the default), then rates must be a singleton
value that specifies a single mutation rate to be used along the entire chromosome. If, on the other
hand, ends is supplied, then rates and ends must be the same length, and the values in ends must
be specified in ascending order. In that case, rates and ends taken together specify the mutation
rates to be used along successive contiguous stretches of the chromosome, from beginning to end; the
last position specified in ends should extend to the end of the chromosome (as previously determined,
during simulation initialization). See the initializeMutationRate() function for further discussion
of precisely how these rates and positions are interpreted.
If the optional sex parameter is "*" (the default), then the supplied mutation rate map will be used for
both sexes (which is the only option for hermaphroditic simulations). In sexual simulations sex may
be "M" or "F" instead, in which case the supplied mutation rate map is used only for that sex. Note
that whether sex-specific mutation rate maps will be used is set by the way that the simulation is
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

419

initially configured with initializeMutationRate(), and cannot be changed with this method; so if
the simulation was set up to use sex-specific mutation rate maps then sex must be "M" or "F" here,
whereas if it was set up not to, then sex must be "*" or unsupplied here. If a simulation needs sexspecific mutation rate maps only some of the time, the male and female maps can simply be set to be
identical the rest of the time.
The mutation rate intervals are normally a constant in simulations, so be sure you know what you are
doing.
– (void)setRecombinationRate(numeric rates, [Ni ends = NULL], [string$ sex = "*"])

Set the recombination rate per base position per generation along the chromosome. All rates must be
in the interval [0.0, 0.5]. There are two ways to call this method. If the optional ends parameter is
NULL (the default), then rates must be a singleton value that specifies a single recombination rate to
be used along the entire chromosome. If, on the other hand, ends is supplied, then rates and ends
must be the same length, and the values in ends must be specified in ascending order. In that case,
rates and ends taken together specify the recombination rates to be used along successive
contiguous stretches of the chromosome, from beginning to end; the last position specified in ends
should extend to the end of the chromosome (as previously determined, during simulation
initialization). See the initializeRecombinationRate() function for further discussion of precisely
how these rates and positions are interpreted.
If the optional sex parameter is "*" (the default), then the supplied recombination rate map will be
used for both sexes (which is the only option for hermaphroditic simulations). In sexual simulations
sex may be "M" or "F" instead, in which case the supplied recombination map is used only for that
sex. Note that whether sex-specific recombination maps will be used is set by the way that the
simulation is initially configured with initializeRecombinationRate(), and cannot be changed
with this method; so if the simulation was set up to use sex-specific recombination maps then sex
must be "M" or "F" here, whereas if it was set up not to, then sex must be "*" or unsupplied here. If
a simulation needs sex-specific recombination maps only some of the time, the male and female maps
can simply be set to be identical the rest of the time.
The recombination intervals are normally a constant in simulations, so be sure you know what you are
doing.

21.3 Class Genome
This class represents one full genome of an individual (one of the two genomes contained by a
diploid individual, that is, in the way that SLiM uses the term), composed of the mutations carried
by that individual. Section 1.5.1 presents an overview of the conceptual role of this class.
21.3.1 Genome properties
genomePedigreeID => (integer$)

If pedigree tracking is turned on with initializeSLiMOptions(keepPedigrees=T),
genomePedigreeID is a unique non-negative identifier for each genome in a simulation, never reused throughout the duration of the simulation run. Furthermore, the genomePedigreeID of a given
genome will be equal to either (2*pedigreeID) or (2*pedigreeID + 1) of the individual that the
genome belongs to (the former for the first genome of the individual, the latter for the second genome
of the individual); this invariant relationship is guaranteed. If pedigree tracking is not on, the value of
genomePedigreeID will be a singleton -1.
genomeType => (string$)

The type of chromosome represented by this genome; one of "A", "X", or "Y".
isNullGenome => (logical$)
T if the genome is a “null” genome, F if it is an ordinary genome object. When a sex chromosome (X

or Y) is simulated, the other sex chromosome also exists in the simulation, but it is a “null” genome
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

420

that does not carry any mutations. Instead, it is a placeholder, present to allow SLiM’s code to operate
in much the same way as it does when an autosome is simulated. Null genomes should not be
accessed or manipulated.
mutations => (object)

All of the Mutation objects present in this genome.
tag <–> (integer$)

A user-defined integer value. The value of tag is initially undefined (i.e., has an effectively random
value that could be different every time you run your model); if you wish it to have a defined value,
you must arrange that yourself by explicitly setting its value prior to using it elsewhere in your code.
The value of tag is not used by SLiM; it is free for you to use.
Note that the Genome objects used by SLiM are new with every generation, so the tag value of each
new offspring generated in each generation will be initially undefined. If you set a tag value for an
offspring genome inside a modifyChild() callback, that tag value will be preserved as the offspring
individual becomes a parent (across the generation boundary, in other words). If you take advantage
of this, however, you should be careful to set up initial values for the tag values of all offspring,
otherwise undefined initial values might happen to match the values that you are trying to use to tag
particular individuals. A rule of thumb in programming: undefined values should always be assumed
to take on the most inconvenient value possible.

21.3.2 Genome methods
+ (void)addMutations(object mutations)

Add the existing mutations in mutations to the genome, if they are not already present (if they are
already present, they will be ignored), and if the addition is not prevented by the mutation stacking
policy (see the mutationStackPolicy property of MutationType, section 21.9.1).
Calling this will normally affect the fitness values calculated at the end of the current generation; if
you want current fitness values to be affected, you can call SLiMSim’s method
recalculateFitness() – but see the documentation of that method for caveats.
+ (object)addNewDrawnMutation(io mutationType,
integer position, [Ni originGeneration = NULL],
[Nio originSubpop = NULL])

Add new mutations to the target genome(s) with the specified mutationType (specified by the
MutationType object or by integer identifier), position, originGeneration (which may
be NULL, the default, to specify the current generation), and originSubpop (specified by the
Subpopulation object or by integer identifier, or by NULL, the default, to specify the
subpopulation to which the first target genome belongs). If originSubpop is supplied as an
integer, it is intentionally not checked for validity; you may use arbitrary values of
originSubpop to “tag” the mutations that you create (see section 21.8.1). The selection
coefficients of the mutations are drawn from their mutation types; addNewMutation() may be
used instead if you wish to specify selection coefficients.
Beginning in SLiM 2.5 this method is vectorized, so all of these parameters may be singletons
(in which case that single value is used for all mutations created by the call) or non-singleton
vectors (in which case one element is used for each corresponding mutation created). Nonsingleton parameters must match in length, since their elements need to be matched up oneto-one.
The new mutations created by this method are returned, even if their actual addition is
prevented by the mutation stacking policy (see the mutationStackPolicy property of
MutationType, section 21.9.1). However, the order of the mutations in the returned vector is
not guaranteed to be the same as the order in which the values are specified in parameter
vectors, unless the position parameter is specified in ascending order. In other words, presorting the parameters to this method into ascending order by position, using order() and
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

421

subsetting, will guarantee that the order of the returned vector of mutations corresponds to the
order of elements in the parameters to this method; otherwise, no such guarantee exists.
Beginning in SLiM 2.1, this is a class method, not an instance method. This means that it does
not get multiplexed out to all of the elements of the receiver (which would add a different
new mutation to each element); instead, it is performed as a single operation, adding the
same new mutation objects to all of the elements of the receiver. Before SLiM 2.1, to add the
same mutations to multiple genomes, it was necessary to call addNewDrawnMutation() on
one of the genomes, and then add the returned Mutation object to all of the other genomes
using addMutations(). That is not necessary in SLiM 2.1 and later, because of this change
(although doing it the old way does no harm and produces identical behavior). Pre-2.1 code
that actually relied upon the old multiplexing behavior will no longer work correctly (but this
is expected to be an extremely rare pattern of usage).
Calling this will normally affect the fitness values calculated at the end of the current
generation (but not sooner); if you want current fitness values to be affected, you can call
SLiMSim’s method recalculateFitness() – but see the documentation of that method for
caveats.
+ (object)addNewMutation(io mutationType,
numeric selectionCoeff, integer position, [Ni originGeneration = NULL],
[Nio originSubpop = NULL])

Add new mutations to the target genome(s) with the specified mutationType (specified by the
MutationType object or by integer identifier), selectionCoeff, position,
originGeneration (which may be NULL, the default, to specify the current generation), and
originSubpop (specified by the Subpopulation object or by integer identifier, or by NULL,
the default, to specify the subpopulation to which the first target genome belongs). If
originSubpop is supplied as an integer, it is intentionally not checked for validity; you may
use arbitrary values of originSubpop to “tag” the mutations that you create (see section
21.8.1). The addNewDrawnMutation() method may be used instead if you wish selection
coefficients to be drawn from the mutation types of the mutations.
The new mutations created by this method are returned, even if their actual addition is
prevented by the mutation stacking policy (see the mutationStackPolicy property of
MutationType, section 21.9.1). However, the order of the mutations in the returned vector is
not guaranteed to be the same as the order in which the values are specified in parameter
vectors, unless the position parameter is specified in ascending order. In other words, presorting the parameters to this method into ascending order by position, using order() and
subsetting, will guarantee that the order of the returned vector of mutations corresponds to the
order of elements in the parameters to this method; otherwise, no such guarantee exists.
Beginning in SLiM 2.1, this is a class method, not an instance method. This means that it does
not get multiplexed out to all of the elements of the receiver (which would add a different
new mutation to each element); instead, it is performed as a single operation, adding the
same new mutation object to all of the elements of the receiver. Before SLiM 2.1, to add the
same mutation to multiple genomes, it was necessary to call addNewMutation() on one of
the genomes, and then add the returned Mutation object to all of the other genomes using
addMutations(). That is not necessary in SLiM 2.1 and later, because of this change
(although doing it the old way does no harm and produces identical behavior). Pre-2.1 code
that actually relied upon the old multiplexing behavior will no longer work correctly (but this
is expected to be an extremely rare pattern of usage).
Calling this will normally affect the fitness values calculated at the end of the current
generation (but not sooner); if you want current fitness values to be affected, you can call
SLiMSim’s method recalculateFitness() – but see the documentation of that method for
caveats.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

422

– (Nlo$)containsMarkerMutation(io$ mutType,
integer$ position, [logical$ returnMutation = F])

Returns T if the genome contains a mutation of type mutType at position, F otherwise (if
returnMutation has its default value of F; see below). This method is, as its name suggests, intended
for checking for “marker mutations”: mutations of a special mutation type that are not literally
mutations in the usual sense, but instead are added in to particular genomes to mark them as
possessing some property. Marker mutations are not typically added by SLiM’s mutation-generating
machinery; instead they are added explicitly with addNewMutation() or addNewDrawnMutation() at
a known, constant position in the genome. This method provides a check for whether a marker
mutation of a given type exists in a particular genome; because the position to check is known in
advance, that check can be done much faster than the equivalent check with containsMutations()
or countOfMutationsOfType(), using a binary search of the genome. See section 13.5 for one
example of a model that uses marker mutations – in that case, to mark chromosomes that possess an
inversion.
If returnMutation is T (an option added in SLiM 3), this method returns the actual mutation found,
rather than just T or F. More specifically, the first mutation found of mutType at position will be
returned; if more than one such mutation exists in the target genome, which one is returned is not
defined. If returnMutation is T and no mutation of mutType is found at position, NULL will be
returned.
– (logical)containsMutations(object mutations)

Returns a logical vector indicating whether each of the mutations in mutations is present in the
genome; each element in the returned vector indicates whether the corresponding mutation is present
(T) or absent (F). This method is provided for speed; it is much faster than the corresponding Eidos
code.
– (integer$)countOfMutationsOfType(io$ mutType)

Returns the number of mutations that are of the type specified by mutType, out of all of the mutations
in the genome. If you need a vector of the matching Mutation objects, rather than just a count, use mutationsOfType(). This method is provided for speed; it is much faster than the corresponding
Eidos code.
– (object)mutationsOfType(io$ mutType)

Returns an object vector of all the mutations that are of the type specified by mutType, out of all of
the mutations in the genome. If you just need a count of the matching Mutation objects, rather than
a vector of the matches, use -countOfMutationsOfType(); if you need just the positions of matching
Mutation objects, use -positionsOfMutationsOfType(); and if you are aiming for a sum of the
selection coefficients of matching Mutation objects, use -sumOfMutationsOfType(). This method is
provided for speed; it is much faster than the corresponding Eidos code.
+ (void)output([Ns$ filePath = NULL], [logical$ append = F])

Output the target genomes in SLiM’s native format (see section 23.3.1 for output format details). This
low-level output method may be used to output any sample of Genome objects (the Eidos function
sample() may be useful for constructing custom samples, as may the SLiM class Individual). For
output of a sample from a single Subpopulation, the outputSample() of Subpopulation may be
more straightforward to use. If the optional parameter filePath is NULL (the default), output is
directed to SLiM’s standard output. Otherwise, the output is sent to the file specified by filePath,
overwriting that file if append if F, or appending to the end of it if append is T.
See outputMS() and outputVCF() for other output formats. Output is generally done in a late()
event, so that the output reflects the state of the simulation at the end of a generation.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

423

+ (void)outputMS([Ns$ filePath = NULL], [logical$ append = F],
[logical$ filterMonomorphic = F])

Output the target genomes in MS format (see section 23.3.2 for output format details). This low-level
output method may be used to output any sample of Genome objects (the Eidos function sample()
may be useful for constructing custom samples, as may the SLiM class Individual). For output of a
sample from a single Subpopulation, the outputMSSample() of Subpopulation may be more
straightforward to use. If the optional parameter filePath is NULL (the default), output is directed to
SLiM’s standard output. Otherwise, the output is sent to the file specified by filePath, overwriting
that file if append if F, or appending to the end of it if append is T. Positions in the output will span
the interval [0,1].
If filterMonomorphic is F (the default), all mutations that are present in the sample will be included
in the output. This means that some mutations may be included that are actually monomorphic within
the sample (i.e., that exist in every sampled genome, and are thus apparently fixed). These may be
filtered out with filterMonomorphic = T if desired; note that this option means that some mutations
that do exist in the sampled genomes might not be included in the output, simply because they exist
in every sampled genome.
See output() and outputVCF() for other output formats. Output is generally done in a late()
event, so that the output reflects the state of the simulation at the end of a generation.
+ (void)outputVCF([Ns$ filePath = NULL], [logical$ outputMultiallelics = T],
[logical$ append = F])

Output the target genomes in VCF format (see section 23.3.3 for output format details). The target
genomes are treated as pairs comprising individuals for purposes of structuring the VCF output, so an
even number of genomes is required. This low-level output method may be used to output any
sample of Genome objects (the Eidos function sample() may be useful for constructing custom
samples, as may the SLiM class Individual). For output of a sample from a single Subpopulation,
the outputVCFSample() of Subpopulation may be more straightforward to use. If the optional
parameter filePath is NULL (the default), output is directed to SLiM’s standard output. Otherwise,
the output is sent to the file specified by filePath, overwriting that file if append if F, or appending to
the end of it if append is T.
In SLiM, it is often possible for a single individual to have multiple mutations at a given base position.
Because the VCF format is an explicit-nucleotide format, this property of SLiM does not fit well into
VCF. Since there are only four possible nucleotides at a given base position in VCF, at most one
“reference” state and three “alternate” states could be represented at that base position. SLiM, on the
other hand, can represent any number of alternative possibilities at a given base; in general, if N
different mutations are segregating at a given position, there are 2N different allelic states at that
position in SLiM. For this reason, SLiM does not attempt to represent multiple mutations at a single
site as being alternative alleles in a single output line, as is typical in VCF format. Instead, SLiM
produces a separate line of VCF output for each segregating mutation at a given position. SLiM always
declares base positions as having a “reference base” of A (representing the state in individuals that do
not carry a given mutation) and an “alternate base” of T (representing the state in individuals that do
carry the given mutation). Multiallelic positions will thus produce VCF output showing multiple A-to-T
changes at the same position, possessed by different but possibly overlapping sets of individuals.
Many programs that process VCF output may not behave correctly with this style of output. SLiM
therefore provides a choice, using the outputMultiallelics flag; if that flag is T (the default), SLiM
will produce multiple lines of output for multiallelic base positions, but will mark those lines with a
MULTIALLELIC flag in the INFO field of the VCF output so that those lines can be filtered or processed
in a special manner. If outputMultiallelics is F, on the other hand, SLiM will completely suppress
output of all mutations at multiallelic sites – often the simplest option, if doing so does not lead to bias
in the subsequent analysis. This flag has no effect upon the output of sites with only a single mutation
present. Assessment of whether a site is multiallelic is done only within the sample; segregating
mutations that are not part of the sample are ignored.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

424

See outputMS() and output() for other output formats. Output is generally done in a late() event,
so that the output reflects the state of the simulation at the end of a generation.
– (integer)positionsOfMutationsOfType(io$ mutType)

Returns the positions of mutations that are of the type specified by mutType, out of all of the mutations
in the genome. If you need a vector of the matching Mutation objects, rather than just positions, use
-mutationsOfType(). This method is provided for speed; it is much faster than the corresponding
Eidos code.
+ (void)removeMutations([No mutations = NULL], [logical$ substitute = F])

Remove the mutations in mutations from the target genome(s), if they are present (if they are not
present, they will be ignored). If NULL is passed for mutations (which is the default), then all
mutations will be removed from the target genomes; in this case, substitute must be F (a specific
vector of mutations to be substituted is required). Note that the Mutation objects removed remain
valid, and will still be in the simulation’s mutation registry (i.e. will be returned by SLiMSim’s
mutations property), until the next generation.
Changing this will normally affect the fitness values calculated at the end of the current generation; if
you want current fitness values to be affected, you can call SLiMSim’s method
recalculateFitness() – but see the documentation of that method for caveats.
The optional parameter substitute was added in SLiM 2.2, with a default of F for backward
compatibility. If substitute is T, Substitution objects will be created for all of the removed
mutations so that they are recorded in the simulation as having fixed, just as if they had reached
fixation and been removed by SLiM’s own internal machinery. This will occur regardless of whether
the mutations have in fact fixed, regardless of the convertToSubstitution property of the relevant
mutation types, and regardless of whether all copies of the mutations have even been removed from
the simulation (making it possible to create Substitution objects for mutations that are still
segregating). It is up to the caller to perform whatever checks are necessary to preserve the integrity of
the simulation’s records. Typically substitute will only be set to T in the context of calls like
sim.subpopulations.genomes.removeMutations(muts, T), such that the substituted mutations
are guaranteed to be entirely removed from circulation. As mentioned above, substitute may not
be T if mutations is NULL.
– (float$)sumOfMutationsOfType(io$ mutType)

Returns the sum of the selection coefficients of all mutations that are of the type specified by mutType,
out of all of the mutations in the genome. This is often useful in models that use a particular mutation
type to represent QTLs with additive effects; in that context, sumOfMutationsOfType() will provide
the sum of the additive effects of the QTLs for the given mutation type. This method is provided for
speed; it is much faster than the corresponding Eidos code. Note that this method also exists on
Individual, for cases in which the sum across both genomes of an individual is desired.

21.4 Class GenomicElement
This class represents a genomic element of a particular genomic element type, with a start and
end; the chromosome is composed of a series of such genomic elements. Section 1.5.4 presents
an overview of the conceptual role of this class.
21.4.1 GenomicElement properties
endPosition => (integer$)

The last position in the chromosome contained by this genomic element.
genomicElementType => (object$)

The GenomicElementType object that defines the behavior of this genomic element.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

425

startPosition => (integer$)

The first position in the chromosome contained by this genomic element.
tag <–> (integer$)

A user-defined integer value. The value of tag is initially undefined (i.e., has an effectively random
value that could be different every time you run your model); if you wish it to have a defined value,
you must arrange that yourself by explicitly setting its value prior to using it elsewhere in your code.
The value of tag is not used by SLiM; it is free for you to use.

21.4.2 GenomicElement methods
– (void)setGenomicElementType(io$ genomicElementType)

Set the genomic element type used for a genomic element (see sections 1.3 and 4.1.5). The
genomicElementType parameter should supply the new genomic element type for the element, either
as a GenomicElementType object or as an integer identifier. The genomic element type for a
genomic element is normally a constant in simulations, so be sure you know what you are doing.

21.5 Class GenomicElementType
This class represents a type of genomic element, with particular mutation types. The genomic
element types currently defined in the simulation are defined as global constants with the same
names used in the SLiM input file – g1, g2, and so forth. Section 1.5.4 presents an overview of the
conceptual role of this class.
21.5.1 GenomicElementType properties
color <–> (string$)

The color used to display genomic elements of this type in SLiMgui. Outside of SLiMgui, this property
still exists, but is not used by SLiM. Colors may be specified by name, or with hexadecimal RGB
values of the form "#RRGGBB" (see the Eidos manual). If color is the empty string, "", SLiMgui’s
default color scheme is used; this is the default for new GenomicElementType objects.
id => (integer$)

The identifier for this genomic element type; for genomic element type g3, for example, this is 3.
mutationFractions => (float)

For each MutationType represented in this genomic element type, this property has the
corresponding fraction of all mutations that will be drawn from that MutationType.
mutationTypes => (object)

The MutationType instances used by this genomic element type.
tag <–> (integer$)

A user-defined integer value. The value of tag is initially undefined (i.e., has an effectively random
value that could be different every time you run your model); if you wish it to have a defined value,
you must arrange that yourself by explicitly setting its value prior to using it elsewhere in your code.
The value of tag is not used by SLiM; it is free for you to use. See also the getValue() and
setValue() methods, for another way of attaching state to genomic element types.

21.5.2 GenomicElementType methods
– (+)getValue(string$ key)

Returns the value previously set for the dictionary entry identifier key using setValue(), or NULL if
no value has been set. This dictionary-style functionality is actually provided by the superclass of
GenomicElementType, SLiMEidosDictionary, although that fact is not presently visible in Eidos
since superclasses are not introspectable.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

426

– (void)setMutationFractions(io mutationTypes, numeric proportions)

Set the mutation type fractions contributing to a genomic element type. The mutationTypes vector
should supply the mutation types used by the genomic element (either as MutationType objects or as
integer identifiers), and the proportions vector should be of equal length, specifying the relative
proportion of mutations that will be draw from each corresponding type (see sections 1.3 and 4.1.3).
This is normally a constant in simulations, so be sure you know what you are doing.
– (void)setValue(string$ key, + value)

Sets a value for the dictionary entry identifier key. The value, which may be of any type other than
object, can be fetched later using getValue(). This dictionary-style functionality is actually
provided by the superclass of GenomicElementType, SLiMEidosDictionary, although that fact is
not presently visible in Eidos since superclasses are not introspectable.

21.6 Class Individual
This class represents a single simulated individual. Individuals in SLiM are diploid, and thus
contain two Genome objects. Most functionality in SLiM is contained in the Genome class; the
Individual class is mostly a convenient way to treat the pairs of genomes associated with an
individual as a single object, and to associate a tag value with individuals. Section 1.5.1 presents
an overview of the conceptual role of this class.
21.6.1 Individual properties
age <–> (integer$)

The age of the individual, measured in generation “ticks”. A newly generated offspring individual will
have an age of 0 in the same generation in which is was created. The age of every individual is
incremented by one at the same point that the generation counter is incremented. The age of
individuals may be changed; usually this only makes sense when setting up the initial state of a model,
however.
color <–> (string$)

The color used to display the individual in SLiMgui. Outside of SLiMgui, this property still exists, but
is not used by SLiM. Colors may be specified by name, or with hexadecimal RGB values of the form
"#RRGGBB" (see the Eidos manual). If color is the empty string, "", SLiMgui’s default (fitness-based)
color scheme is used; this is the default for new Individual objects.
fitnessScaling <–> (float$)

A float scaling factor applied to the individual’s fitness (i.e., the fitness value computed for the
individual will be multiplied by this value). This provides a simple, fast way to modify the fitness of an
individual; conceptually it is similar to returning a fitness effect for the individual from a
fitness(NULL) callback, but without the complexity and performance overhead of implementing
such a callback. To scale the fitness of all individuals in a subpopulation by the same factor, see the
fitnessScaling property of Subpopulation.
The value of fitnessScaling is reset to 1.0 every generation, so that any scaling factor set lasts for
only a single generation. This reset occurs immediately after fitness values are calculated, in both WF
and nonWF models.
genomes => (object)

The pair of Genome objects associated with this individual. If only one of the two genomes is desired,
the genome1 or genome2 property may be used.
genome1 => (object$)

The first Genome object associated with this individual. This property is particularly useful when you
want the first genome from each of a vector of individuals, as often arises in haploid models.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

427

genome2 => (object$)

The second Genome object associated with this individual. This property is particularly useful when
you want the second genome from each of a vector of individuals, as often arises in haploid models.
index => (integer$)

The index of the individual in the individuals vector of its Subpopulation.
migrant => (logical$)

Set to T if the individual migrated during the current generation, F otherwise.
In WF models, this flag is set at the point when a new child is generated if it is a migrant (i.e., if its
source subpopulation is not the same as its subpopulation), and remains valid, with the same value,
for the rest of the individual’s lifetime.
In nonWF models, this flag is F for all new individuals, is set to F in all individuals at the end of the
reproduction generation cycle stage, and is set to T on all individuals moved to a new subpopulation
by takeMigrants(); the T value set by takeMigrants() will remain until it is reset at the end of the
next reproduction generation cycle stage.
pedigreeID => (integer$)

If pedigree tracking is turned on with initializeSLiMOptions(keepPedigrees=T), pedigreeID is
a unique non-negative identifier for each individual in a simulation, never re-used throughout the
duration of the simulation run. If pedigree tracking is not on, the value of pedigreeID will be a
singleton -1.
pedigreeParentIDs => (integer)

If pedigree tracking is turned on with initializeSLiMOptions(keepPedigrees=T),
pedigreeParentIDs contains the values of pedigreeID that were possessed by the parents of an
individual; it is thus a vector of two values. If pedigree tracking is not on, pedigreeParentIDs will
contain two -1 values. Parental values may also be -1 if insufficient generations have elapsed for that
information to be available (because the simulation just started, or because a subpopulation is new).
pedigreeGrandparentIDs => (integer)

If pedigree tracking is turned on with initializeSLiMOptions(keepPedigrees=T),
pedigreeGrandparentIDs contains the values of pedigreeID that were possessed by the
grandparents of an individual; it is thus a vector of four values. If pedigree tracking is not on,
pedigreeGrandparentIDs will contain four -1 values. Grandparental values may also be -1 if
insufficient generations have elapsed for that information to be available (because the simulation just
started, or because a subpopulation is new).
sex => (string$)

The sex of the individual. This will be "H" if sex is not enabled in the simulation (i.e., for
hermaphrodites), otherwise "F" or "M" as appropriate.
spatialPosition => (float)

The spatial position of the individual. The length of the spatialPosition property (the number of
coordinates in the spatial position of an individual) depends upon the spatial dimensionality declared
with initializeSLiMOptions(). If the spatial dimensionality is zero (as it is by default), it is an
error to access this property. The elements of this property are identical to the values of the x, y, and z
properties (if those properties are encompassed by the spatial dimensionality of the simulation). In
other words, if the declared dimensionality is "xy", the individual.spatialPosition property is
equivalent to c(individual.x, individual.y); individual.z is not used since it is not
encompassed by the simulation’s dimensionality. This property cannot be set, but the
setSpatialPosition() method may be used to achieve the same thing.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

428

subpopulation => (object$)

The Subpopulation object to which the individual belongs.
tag <–> (integer$)

A user-defined integer value (as opposed to tagF, which is of type float). The value of tag is
initially undefined (i.e., has an effectively random value that could be different every time you run
your model); if you wish it to have a defined value, you must arrange that yourself by explicitly setting
its value prior to using it elsewhere in your code. The value of tag is not used by SLiM; it is free for
you to use. See also the getValue() and setValue() methods, for another way of attaching state to
individuals.
Note that the Individual objects used by SLiM are (conceptually) new with every generation, so the
tag value of each new offspring generated in each generation will be initially undefined. If you set a
tag value for an offspring individual inside a modifyChild() callback, that tag value will be
preserved as the offspring individual becomes a parent (across the generation boundary, in other
words). If you take advantage of this, however, you should be careful to set up initial values for the tag
values of all offspring, otherwise undefined initial values might happen to match the values that you
are trying to use to tag particular individuals. A rule of thumb in programming: undefined values
should always be assumed to take on the most inconvenient value possible.
tagF <–> (float$)

A user-defined float value (as opposed to tag, which is of type integer). The value of tagF is
initially undefined (i.e., has an effectively random value that could be different every time you run
your model); if you wish it to have a defined value, you must arrange that yourself by explicitly setting
its value prior to using it elsewhere in your code. The value of tagF is not used by SLiM; it is free for
you to use. See also the getValue() and setValue() methods, for another way of attaching state to
individuals.
Note that at present, although many classes in SLiM have an integer-type tag property, only
Individual has a float-type tagF property, because attaching model state to individuals seems to
be particularly common and useful. If a tagF property would be helpful on another class, it would be
easy to add.
See the description of the tag property above for additional comments.
uniqueMutations => (object)

All of the Mutation objects present in this individual. Mutations present in both genomes will occur
only once in this property, and the mutations will be given in sorted order by position, so this
property is similar to sortBy(unique(individual.genomes.mutations), "position"). It is not
identical to that call, only because if multiple mutations exist at the exact same position, they may be
sorted differently by this method than they would be by sortBy(). This method is provided primarily
for speed; it executes much faster than the Eidos equivalent above. Indeed, it is faster than just
individual.genomes.mutations, and gives uniquing and sorting on top of that, so it is
advantageous unless duplicate entries for homozygous mutations are actually needed.
x <–> (float$)

A user-defined float value. The value of x is initially undefined (i.e., has an effectively random value
that could be different every time you run your model); if you wish it to have a defined value, you
must arrange that yourself by explicitly setting its value prior to using it elsewhere in your code,
typically in a modifyChild() callback. The value of x is not used by SLiM unless the optional
“continuous space” facility is enabled with the dimensionality parameter to
initializeSLiMOptions(), in which case x will be understood to represent the x coordinate of the
individual in space. If continuous space is not enabled, you may use x as an additional tag value of
type float.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

429

y <–> (float$)

A user-defined float value. The value of y is initially undefined (i.e., has an effectively random value
that could be different every time you run your model); if you wish it to have a defined value, you
must arrange that yourself by explicitly setting its value prior to using it elsewhere in your code,
typically in a modifyChild() callback. The value of y is not used by SLiM unless the optional
“continuous space” facility is enabled with the dimensionality parameter to
initializeSLiMOptions(), in which case y will be understood to represent the y coordinate of the
individual in space (if the dimensionality is "xy" or "xyz"). If continuous space is not enabled, or the
dimensionality is not "xy" or "xyz", you may use y as an additional tag value of type float.
z <–> (float$)

A user-defined float value. The value of z is initially undefined (i.e., has an effectively random value
that could be different every time you run your model); if you wish it to have a defined value, you
must arrange that yourself by explicitly setting its value prior to using it elsewhere in your code,
typically in a modifyChild() callback. The value of z is not used by SLiM unless the optional
“continuous space” facility is enabled with the dimensionality parameter to
initializeSLiMOptions(), in which case z will be understood to represent the z coordinate of the
individual in space (if the dimensionality is "xyz"). If continuous space is not enabled, or the
dimensionality is not "xyz", you may use z as an additional tag value of type float.

21.6.2 Individual methods
– (logical)containsMutations(object mutations)

Returns a logical vector indicating whether each of the mutations in mutations is present in the
individual (in either of its genomes); each element in the returned vector indicates whether the
corresponding mutation is present (T) or absent (F). This method is provided for speed; it is much
faster than the corresponding Eidos code.
– (integer$)countOfMutationsOfType(io$ mutType)

Returns the number of mutations that are of the type specified by mutType, out of all of the mutations
in the individual (in both of its genomes; a mutation that is present in both genomes counts twice). If
you need a vector of the matching Mutation objects, rather than just a count, you should probably
use uniqueMutationsOfType(). This method is provided for speed; it is much faster than the
corresponding Eidos code.
– (+)getValue(string$ key)

Returns the value previously set for the dictionary entry identifier key using setValue(), or NULL if
no value has been set. This dictionary-style functionality is actually provided by the superclass of
Individual, SLiMEidosDictionary, although that fact is not presently visible in Eidos since
superclasses are not introspectable.
– (float)relatedness(object individuals)

Returns a vector containing the degrees of relatedness between the receiver and each of the
individuals in individuals. The relatedness between A and B is always 1.0 if A and B are actually
the same individual; this facility works even if SLiM’s optional pedigree tracking is turned off (in which
case all other relatedness values will be 0.0). Otherwise, if pedigree tracking is turned on with
initializeSLiMOptions(keepPedigrees=T), this method will use the pedigree information
described in section 21.6.1 to construct a relatedness estimate. More specifically, if information about
the grandparental generation is available, then each grandparent shared by A and B contributes 0.125
towards the total relatedness, for a maximum value of 0.5 with four shared grandparents. If
grandparental information in unavailable, then if parental information is available it is used, with each
parent shared by A and B contributing 0.25, again for a maximum of 0.5. If even parental
information is unavailable, then the relatedness is assumed to be 0.0. Again, however, if A and B are
the same individual, the relatedness will be 1.0 in all cases.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

430

Note that this relatedness is simply pedigree-based relatedness. This does not necessarily correspond
to genetic relatedness, because of the effects of factors like assortment and recombination.
+ (void)setSpatialPosition(float position)

Sets the spatial position of the individual (as accessed through the spatialPosition property). The
length of position (the number of coordinates in the spatial position of an individual) depends upon
the spatial dimensionality declared with initializeSLiMOptions(). If the spatial dimensionality is
zero (as it is by default), it is an error to call this method. The elements of position are set into the
values of the x, y, and z properties (if those properties are encompassed by the spatial dimensionality
of the simulation). In other words, if the declared dimensionality is "xy", calling
individual.setSpatialPosition(c(1.0, 0.5)) property is equivalent to individual.x = 1.0;
individual.y = 0.5; individual.z is not set (even if a third value is supplied in position) since
it is not encompassed by the simulation’s dimensionality in this example.
Note that this is an Eidos class method, somewhat unusually, which allows it to work in a special way
when called on a vector of individuals. When the target vector of individuals is non-singleton, this
method can do one of two things. If position contains just a single point (i.e., is equal in length to
the spatial dimensionality of the model), the spatial position of all of the target individuals will be set
to the given point. Alternatively, if position contains one point per target individual (i.e., is equal in
length to the number of individuals multiplied by the spatial dimensionality of the model), the spatial
position of each target individual will be set to the corresponding point from position (where the
point data is concatenated, not interleaved, just as it would be returned by accessing the
spatialPosition property on the vector of target individuals). Calling this method with a position
vector of any other length is an error.
– (void)setValue(string$ key, + value)

Sets a value for the dictionary entry identifier key. The value, which may be of any type other than
object, can be fetched later using getValue(). This dictionary-style functionality is actually
provided by the superclass of Individual, SLiMEidosDictionary, although that fact is not presently
visible in Eidos since superclasses are not introspectable.
– (float$)sumOfMutationsOfType(io$ mutType)

Returns the sum of the selection coefficients of all mutations that are of the type specified by mutType,
out of all of the mutations in the genomes of the individual. This is often useful in models that use a
particular mutation type to represent QTLs with additive effects; in that context,
sumOfMutationsOfType() will provide the sum of the additive effects of the QTLs for the given
mutation type. This method is provided for speed; it is much faster than the corresponding Eidos code.
Note that this method also exists on Genome, for cases in which the sum for just one genome is
desired.
– (object)uniqueMutationsOfType(io$ mutType)

Returns an object vector of all the mutations that are of the type specified by mutType, out of all of
the mutations in the individual. Mutations present in both genomes will occur only once in the result
of this method, and the mutations will be given in sorted order by position, so this method is similar
to sortBy(unique(individual.genomes.mutationsOfType(mutType)), "position"). It is not
identical to that call, only because if multiple mutations exist at the exact same position, they may be
sorted differently by this method than they would be by sortBy(). If you just need a count of the
matching Mutation objects, rather than a vector of the matches, use -countOfMutationsOfType().
This method is provided for speed; it is much faster than the corresponding Eidos code. Indeed, it is
faster than just individual.genomes.mutationsOfType(mutType), and gives uniquing and sorting
on top of that, so it is advantageous unless duplicate entries for homozygous mutations are actually
needed.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

431

21.7 Class InteractionType
This class represents a type of interaction between individuals. This is an advanced feature, the
use of which is optional. Once an interaction type is set up with initializeInteractionType()
(see section 21.1), it can be evaluated and then queried to give information such as the nearest
interacting neighbor of an individual, or the total strength of interactions felt by an individual,
relatively efficiently. Interactions are often spatial, depending upon the spatial dimensionality
established with initializeSLiMOptions() (section 21.1), but do not need to be spatial. Spatial
interactions can have – and almost always should have – a maximum distance, which allows them
to be evaluated more efficiently (since all interactions beyond the maximum distance can be
assumed to have a strength of zero).
Note that if there are N individuals in a given subpopulation, each of which interacts with M
other individuals, then InteractionType’s internal data structures will occupy an amount of memory
roughly proportional to N×M, for each evaluated subpopulation. Depending upon the queries
executed, interactions may also take computational time proportional to N×M, or even
proportional to N2, in each evaluated subpopulation. Modeling interactions with large population
sizes can therefore be expensive, although InteractionType goes to considerable lengths to
minimize the overhead.
The first step in InteractionType’s evaluation of an interaction is to determine the distance from
the individual receiving the interaction to the individual exerting the interaction. This is computed
as the Euclidean distance between the spatial positions of the individuals, based upon the
spatiality of the interaction (i.e., the spatial dimensions used by the interaction, which may be less
than the dimensionality of the simulation as a whole). Second, this distance is compared to the
maximum distance for the interaction type; if it is beyond that limit, the interaction strength is
always zero (and it is also always zero for the interaction of an individual with itself). Third (when
the distance is less than the maximum), the distance is converted into an interaction strength by an
interaction function (IF), which is a characteristic of the InteractionType. Finally, this interaction
strength may be modified by the interaction() callbacks currently active in the simulation, if any
(see section 22.6).
InteractionType is actually a wrapper for three different spatial query engines that share some
of their data but work very differently. The first engine is a brute-force engine that simply
computes distances and interaction strengths in response to queries. This engine is usually used in
response to queries for simple information, such as the distance(), distanceToPoint(), and
strength() methods.
The second engine is based upon a data structure called a “k-d tree” that is designed to
optimize searches for spatially proximate points. This engine is usually used in response to queries
involving “neighbors”, such as nearestNeighbors() and nearestNeighborsOfPoint(). In SLiM,
the term “neighbor” means an individual that is within the maximum interaction distance of a
focal individual or point (excluding the focal individual itself); the neighbors of the focal individual
or point are therefore those that fall within the fixed radius defined by the maximum interaction
distance. Calls with “neighbor” in their name explicitly use the k-d tree engine, and may therefore
be called only for spatial interactions; in non-spatial interactions there is no concept of a
“neighbor”. In terms of computational complexity, finding the nearest neighbor of a given
individual using the brute-force engine is an O(N) computation, whereas with the k-d tree engine it
is typically an O(log N) computation – a very important difference, especially for large N. In
general, to get the best performance from a spatial model, you should (1) set a maximum distance
for the model interactions that is as small as possible without introducing unwanted artifacts, and
(2) use neighbor-based calls to make minimal queries when possible – if all you really care about
is the distance to the nearest neighbor, use nearestNeighbors() to find the neighbor and then call
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

432

distance() to get the distance to that neighbor, rather than getting the distances
with distance() and then using min() to select the smallest, for example.

to all individuals

The third engine, introduced in SLiM 3.1, is based upon a data structure called a “sparse array”
that is designed to track sparse non-zero values within a dataset that contains mostly zeros. It
applies to spatial interactions because most pairs of interactions probably interact with a strength
of zero (because typically N >> M, because few individuals fall within the maximum interaction
radius from a given individual). The sparse array is used to cache all calculated distance/strength
pairs for interactions within a given subpopulation. It is built using the k-d tree to find the
interacting neighbors of each individual, and once built it can respond extremely quickly to
queries from methods such as totalOfNeighborStrengths(); the interacting neighbors of a given
individual are already known, allowing response in O(M) time. The sparse array is built on
demand, when queries that would benefit from it are made. For it to be effective, it is particularly
important that a maximum interaction distance be used that is as small as possible, so beginning
with SLiM 3.1 a warning is issued when no maximum distance is defined for spatial interactions.
There are currently four options for interaction functions (IFs) in SLiM, represented by singlecharacter codes:
"f" – a fixed interaction strength. This IF type has a single parameter, the interaction strength to
be used for all interactions of this type. By default, interaction types use a type "f" IF with a value
of 1.0, so interactions are binary: on within the maximum distance, off outside.
"l" – a linear interaction strength. This IF type has a single parameter, the maximum interaction
strength to be used at distance 0.0. The interaction strength falls off linearly, reaching exactly zero
at the maximum distance. In other words, for distance d, maximum interaction distance dmax, and
maximum interaction strength fmax, the formula for this IF is f(d) = fmax(1 − d / dmax).
"e" – A negative exponential interaction strength. This IF type is specified by two parameters, a
maximum interaction strength and a shape parameter. The interaction strength falls off nonlinearly from the maximum, and cuts off discontinuously at the maximum distance; typically a
maximum distance is chosen such that the interaction strength at that distance is very small
anyway. The IF for this type is f(d) = fmaxexp(−λd), where λ is the specified shape parameter. Note
that this parameterization is not the same as for the Eidos function rexp().
"n" – A normal interaction strength (i.e., Gaussian, but "g" is avoided to prevent confusion with
the gamma-function option provided for, e.g., MutationType). The interaction strength falls off
non-linearly from the maximum, and cuts off discontinuously at the maximum distance; typically a
maximum distance is chosen such that the interaction strength at that distance is very small
anyway. This IF type is specified by two parameters, a maximum interaction strength and a
standard deviation. The Gaussian IF for this type is f(d) = fmaxexp(−d2/2σ2), where σ is the standard
deviation parameter. Note that this parameterization is not the same as for the Eidos function
rnorm(). A Gaussian function is often used to model spatial interactions, but is relatively
computation-intensive.
"c" – A Cauchy-distributed interaction strength. The interaction strength falls off non-linearly
from the maximum, and cuts off discontinuously at the maximum distance; typically a maximum
distance is chosen such that the interaction strength at that distance is very small anyway. This IF
type is specified by two parameters, a maximum interaction strength and a scale parameter. The IF
for this type is f(d) = fmax/(1+(d/λ)2), where λ is the scale parameter. Note that this parameterization
is not the same as for the Eidos function rcauchy(). A Cauchy distribution can be used to model
interactions with relatively fat tails.
An InteractionType may be allocated using the initializeInteractionType() function (see
section 21.1). It must then be evaluated, with the evaluate() method, for any given
subpopulation before it will respond to queries regarding that subpopulation. This causes the
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

433

positions of all individuals to be cached, thus defining a snapshot in time that the InteractionType
will then use to respond to queries (necessary since the positions of individuals may change at any
time). This evaluated state will last until the current parental generation expires, at the end of the
next offspring-generation phase. Before the InteractionType may be used with the new parental
generation (the offspring of the old parental generation), the interaction must be evaluated again.
InteractionType will automatically account for any periodic spatial boundaries established
with the periodicity parameter of initializeSLiMOptions(); interactions will wrap around the
periodic boundaries without any additional configuration of the interaction. Interactions involving
periodic spatial boundaries entail some additional overhead in both memory usage and processor
time; in particular, setting up the k-d tree after the interaction is evaluated may take many times
longer than in the non-periodic case. Once the k-d tree has been set up, however, responses to
spatial queries involving it should then be nearly as fast as in the non-periodic case. Spatial
queries that do not involve the k-d tree will generally be marginally slower than in the nonperiodic case, but the difference should not be large.
21.7.1 InteractionType properties
id => (integer$)

The identifier for this interaction type; for interaction type i3, for example, this is 3.
maxDistance <–> (float$)

The maximum distance over which this interaction will be evaluated. For inter-individual distances
greater than maxDistance, the interaction strength will be zero.
reciprocal => (logical$)

The reciprocality of the interaction, as specified in initializeInteractionType(). This will be T
for reciprocal interactions (those for which the interaction strength of B upon A is equal to the
interaction strength of A upon B), and F otherwise.
sexSegregation => (string$)

The sex-segregation of the interaction, as specified in initializeInteractionType(). For nonsexual simulations, this will be "**". For sexual simulations, this string value indicates the sex of
individuals feeling the interaction, and the sex of individuals exerting the interaction; see
initializeInteractionType() for details.
spatiality => (string$)

The spatial dimensions used by the interaction, as specified in initializeInteractionType(). This
will be "" (the empty string) for non-spatial interactions, or "x", "y", "z", "xy", "xz", "yz", or
"xyz", for interactions using those spatial dimensions respectively. The specified dimensions are used
to calculate the distances between individuals for this interaction. The value of this property is always
the same as the value given to initializeInteractionType().
tag <–> (integer$)

A user-defined integer value. The value of tag is initially undefined (i.e., has an effectively random
value that could be different every time you run your model); if you wish it to have a defined value,
you must arrange that yourself by explicitly setting its value prior to using it elsewhere in your code.
The value of tag is not used by SLiM; it is free for you to use. See also the getValue() and
setValue() methods, for another way of attaching state to interaction types.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

434

21.7.2 InteractionType methods
– (float)distance(object individuals1,
[No individuals2 = NULL])

Returns a vector containing distances between individuals in individuals1 and individuals2. At
least one of individuals1 or individuals2 must be singleton, so that the distances evaluated are
either from one individual to many, or from many to one (which are equivalent, in fact); evaluating
distances for many to many individuals cannot be done in a single call. (There is one exception: if
both individuals1 and individuals2 are zero-length or NULL, a zero-length float vector will be
returned.) If individuals2 is NULL (the default), then individuals1 must be singleton, and a vector
of the distances from that individual to all individuals in its subpopulation (including itself) is returned;
this case may be handled differently internally, for greater speed, so supplying NULL is preferable to
supplying the vector of all individuals in the subpopulation explicitly even though that should produce
identical results. If the InteractionType is non-spatial, this method may not be called.
Importantly, distances are calculated according to the spatiality of the InteractionType (as declared
in initializeInteractionType()), not the dimensionality of the model as a whole (as declared in
initializeSLiMOptions()). The distances returned are therefore the distances that would be used
to calculate interaction strengths. However, distance() will return finite distances for all pairs of
individuals, even if the individuals are non-interacting; the distance() between an individual and
itself will thus be 0. See interactionDistance() for an alternative distance definition.
– (float)distanceToPoint(object individuals1, float point)

Returns a vector containing distances between individuals in individuals1 and the point given by
the spatial coordinates in point. The point vector is interpreted as providing coordinates precisely as
specified by the spatiality of the interaction type; if the interaction type’s spatiality is "xz", for
example, then point[0] is assumed to be an x value, and point[1] is assumed to be a z value. Be
careful; this means that in general it is not safe to pass an individual’s spatialPosition property for
point, for example (although it is safe if the spatiality of the interaction matches the dimensionality of
the simulation). A coordinate for a periodic spatial dimension must be within the spatial bounds for
that dimension, since coordinates outside of periodic bounds are meaningless (pointPeriodic()
may be used to ensure this); coordinates for non-periodic spatial dimensions are not restricted.
Importantly, distances are calculated according to the spatiality of the InteractionType (as declared
in initializeInteractionType()) not the dimensionality of the model as a whole (as declared in
initializeSLiMOptions()). The distances are therefore interaction distances: the distances that are
used to calculate interaction strengths. If the InteractionType is non-spatial, this method may not
be called. The vector point must be exactly as long as the spatiality of the InteractionType.
– (object)drawByStrength(object$ individual,
[integer$ count = 1])

Returns up to count individuals drawn from the subpopulation of individual. The probability of
drawing particular individuals is proportional to the strength of interaction they exert upon
individual. This method may be used with either spatial or non-spatial interactions, but will be
more efficient with spatial interactions that set a short maximum interaction distance. Draws are done
with replacement, so the same individual may be drawn more than once; sometimes using unique()
on the result of this call is therefore desirable. If more than one draw will be needed, it is much more
efficient to use a single call to drawByStrength(), rather than drawing individuals one at a time.
Note that if no individuals exert a non-zero interaction upon individual, the vector returned will be
zero-length; it is important to consider this possibility.
If the needed interaction strengths have already been calculated, those cached values are simply used.
Otherwise, calling this method triggers evaluation of the needed interactions, including calls to any
applicable interaction() callbacks.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

435

– (void)evaluate([No subpops = NULL], [logical$ immediate = F])

Triggers evaluation of the interaction for the subpopulations specified by subpops (or for all
subpopulations, if subpops is NULL). By default, the effects of this may be limited, however, since the
underlying implementation may choose to postpone some computations lazily. At a minimum, is it
guaranteed that this method will discard all previously cached data for the subpopulation(s), and will
cache the current positions of all individuals (so that individuals may then move without disturbing the
state of the interaction at the moment of evaluation). Notably, interaction() callbacks may not be
called in response to this method; instead, their evaluation may be deferred until required to satisfy
queries (at which point the generation counter may have advanced by one, so be careful with the
generation ranges used in defining such callbacks).
If T is passed for immediate, the interaction will immediately and synchronously evaluate all
interactions between all individuals in the subpopulation(s), calling any applicable interaction()
callbacks as necessary. However, depending upon what queries are later executed, this may represent
considerable wasted computation, since it is an O(N2) operation. Immediate evaluation usually
generates only a slight performance improvement even if the interactions between all pairs of
individuals are eventually accessed; the main reason to choose immediate evaluation, then, is that
deferred calculation of interactions would lead to incorrect results due to changes in model state.
You must explicitly call evaluate() at an appropriate time in the life cycle before the interaction is
used, but after any relevant changes have been made to the population. SLiM will invalidate any
existing interactions after any portion of the generation cycle in which new individuals have been
born or existing individuals have died. In a WF model, these events occur just before late() events
execute (see the WF generation cycle diagram in chapter 19), so late() events are often the
appropriate place to put evaluate() calls, but early() events can work too if the interaction is not
needed until that point in the generation cycle anyway. In nonWF models, on the other hand, new
offspring are produced just before early() events and then individuals die just before late() events
(see the nonWF generation cycle diagram in chapter 20), so interactions will be invalidated twice
during each generation cycle. This means that in a nonWF model, an interaction that influences
reproduction should usually be evaluated in a late() event, while an interaction that influences
fitness or mortality should usually be evaluated in an early() event (and an interaction that affects
both may need to be evaluated at both times).
If an interaction is never evaluated for a given subpopulation, it is guaranteed that there will be
essentially no memory or computational overhead associated with the interaction for that
subpopulation. Furthermore, attempting to query an interaction for an individual in a subpopulation
that has not been evaluated is guaranteed to raise an error.
– (+)getValue(string$ key)

Returns the value previously set for the dictionary entry identifier key using setValue(), or NULL if
no value has been set. This dictionary-style functionality is actually provided by the superclass of
InteractionType, SLiMEidosDictionary, although that fact is not presently visible in Eidos since
superclasses are not introspectable.
– (integer)interactingNeighborCount(object individuals)

Returns the number of interacting individuals for each individual in individuals, within the
maximum interaction distance according to the distance metric of the InteractionType. More
specifically, this method counts the number of individuals which can exert an interaction upon each
focal individual; it does not count individuals which only feel an interaction from a focal individual.
This method is similar to nearestInteractingNeighbors() (when passed a large count so as to
guarantee that all interacting individuals are returned), but this method returns only a count of the
interacting individuals, not a vector containing the individuals. This method may also be called in a
vectorized fashion, with a non-singleton vector of individuals, unlike
nearestInteractingNeighbors().

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

436

Note that this method uses interaction eligibility as a criterion; it will not count neighbors that cannot
exert an interaction upon a focal individual (due to sex-segregation, e.g.). (It also does not count a
focal individual as a neighbor of itself.)
– (float)interactionDistance(object$ receiver,
[No exerters = NULL])

Returns a vector containing interaction-dependent distances between receiver and individuals in
exerters that exert an interaction strength upon receiver. If exerters is NULL (the default), then a
vector of the interaction-dependent distances from receiver to all individuals in its subpopulation
(including receiver itself) is returned; this case may be handled much more efficiently than if a
vector of all individuals in the subpopulation is explicitly provided. If the InteractionType is nonspatial, this method may not be called.
Importantly, distances are calculated according to the spatiality of the InteractionType (as declared
in initializeInteractionType()), not the dimensionality of the model as a whole (as declared in
initializeSLiMOptions()). The distances returned are therefore the distances that would be used
to calculate interaction strengths. In addition, interactionDistance() will return INF as the
distance between receiver and any individual which does not exert an interaction upon receiver;
the interactionDistance() between an individual and itself will thus be INF, and likewise for pairs
excluded from interacting by the sex segregation or max distance of the interaction type. See
distance() for an alternative distance definition.
– (object)nearestInteractingNeighbors(object$ individual,
[integer$ count = 1])

Returns up to count interacting individuals that are spatially closest to individual, according to the
distance metric of the InteractionType. More specifically, this method returns only individuals
which can exert an interaction upon the focal individual; it does not include individuals that only feel
an interaction from the focal individual. To obtain all of the interacting individuals within the
maximum interaction distance of individual, simply pass a value for count that is greater than or
equal to the size of individual’s subpopulation. Note that if fewer than count interacting
individuals are within the maximum interaction distance, the vector returned may be shorter than
count, or even zero-length; it is important to check for this possibility even when requesting a single
neighbor. If only the number of interacting individuals is needed, use
interactingNeighborCount() instead.
Note that this method uses interaction eligibility as a criterion; it will not return neighbors that cannot
exert an interaction upon the focal individual (due to sex-segregation, e.g.). (It will also never return
the focal individual as a neighbor of itself.) To find all neighbors of the focal individual, whether they
can interact with it or not, use nearestNeighbors().
– (object)nearestNeighbors(object$ individual,
[integer$ count = 1])

Returns up to count individuals that are spatially closest to individual, according to the distance
metric of the InteractionType. To obtain all of the individuals within the maximum interaction
distance of individual, simply pass a value for count that is greater than or equal to the size of
individual’s subpopulation. Note that if fewer than count individuals are within the maximum
interaction distance, the vector returned may be shorter than count, or even zero-length; it is
important to check for this possibility even when requesting a single neighbor.
Note that this method does not use interaction eligibility as a criterion; it will return neighbors that
could not interact with the focal individual due to sex-segregation. (It will never return the focal
individual as a neighbor of itself, however.) To find only neighbors that are eligible to exert an
interaction upon the focal individual, use nearestInteractingNeighbors().

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

437

– (object)nearestNeighborsOfPoint(object$ subpop,
float point, [integer$ count = 1])

Returns up to count individuals in subpop that are spatially closest to point, according to the
distance metric of the InteractionType. To obtain all of the individuals within the maximum
interaction distance of point, simply pass a value for count that is greater than or equal to the size of
subpop. Note that if fewer than count individuals are within the maximum interaction distance, the
vector returned may be shorter than count, or even zero-length; it is important to check for this
possibility even when requesting a single neighbor.
– (void)setInteractionFunction(string$ functionType, ...)

Set the function used to translate spatial distances into interaction strengths for an interaction type.
The functionType may be "f", in which case the ellipsis ... should supply a numeric$ fixed
interaction strength; "l", in which case the ellipsis should supply a numeric$ maximum strength for a
linear function; "e", in which case the ellipsis should supply a numeric$ maximum strength and a
numeric$ lambda (shape) parameter for a negative exponential function; "n", in which case the
ellipsis should supply a numeric$ maximum strength and a numeric$ sigma (standard deviation)
parameter for a Gaussian function; or "c", in which case the ellipsis should supply a numeric$
maximum strength and a numeric$ scale parameter for a Cauchy distribution function. See section
21.7 above for discussions of these interaction functions. Non-spatial interactions must use function
type "f", since no distance values are available in that case.
The interaction function for an interaction type is normally a constant in simulations; in any case, it
cannot be changed when an interaction has already been evaluated for a given generation of
individuals.
– (void)setValue(string$ key, + value)

Sets a value for the dictionary entry identifier key. The value, which may be of any type other than
object, can be fetched later using getValue(). This dictionary-style functionality is actually
provided by the superclass of InteractionType, SLiMEidosDictionary, although that fact is not
presently visible in Eidos since superclasses are not introspectable.
– (float)strength(object$ receiver, [No exerters = NULL])

Returns a vector containing the interaction strengths exerted upon receiver by the individuals in
exerters. If exerters is NULL (the default), then a vector of the interaction strengths exerted by all
individuals in the subpopulation of receiver (including receiver itself) is returned; this case may be
handled much more efficiently than if a vector of all individuals in the subpopulation is explicitly
provided.
If the strengths of interactions exerted by a single individual upon multiple individuals is needed
instead (the inverse of what this method provides), multiple calls to this method will be necessary, one
per pairwise interaction queried; the interaction engine is not optimized for the inverse case, and so it
will likely be quite slow to compute. If the interaction is reciprocal and sex-symmetric, the opposite
query should provide identical results in a single efficient call (because then the interactions exerted
are equal to the interactions received); otherwise, the best approach might be to define a second
interaction type representing the inverse interaction that you wish to be able to query efficiently.
If the needed interaction strengths have already been calculated, those cached values are simply
returned. Otherwise, calling this method triggers evaluation of the needed interactions, including
calls to any applicable interaction() callbacks.
– (float)totalOfNeighborStrengths(object individuals)

Returns a vector of the total interaction strength felt by each individual in individuals, which does
not need to be a singleton; indeed, it can be a vector of all of the individuals in a given subpopulation.
However, all of the individuals in individuals must be in the same subpopulation.
For one individual, this is essentially the same as calling nearestNeighbors() with a large count so
as to obtain the complete vector of all neighbors, calling strength() for each of those interactions to
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

438

get each interaction strength, and adding those interaction strengths together with sum(). This method
is much faster than that implementation, however, since all of that work is done as a single operation.
Also, totalOfNeighborStrengths() can total up interactions for more than one focal individual in
a single call.
Similarly, for one individual this is essentially the same as calling strength() to get the interaction
strengths between the focal individual and all other individuals, and then calling sum(). Again, this
method should be much faster, since this algorithm looks only at neighbors, whereas calling
strength() directly assesses interaction strengths with all other individuals. This will make a
particularly large difference when the subpopulation size is large and the maximum distance of the
InteractionType is small.
If the needed interaction strengths have already been calculated, those cached values are simply used.
Otherwise, calling this method triggers evaluation of the needed interactions, including calls to any
applicable interaction() callbacks.
– (void)unevaluate(void)

Discards all evaluation of this interaction, for all subpopulations. The state of the InteractionType
is reset to a state prior to evaluation. This can be useful if the model state has changed in such a way
that the evaluation already conducted is no longer valid. For example, if the maximum distance or the
interaction function of the InteractionType need to be changed with immediate effect, or if the data
used by an interaction() callback has changed in such a way that previously calculated interaction
strengths are no longer correct, unevaluate() allows the interaction to begin again from scratch.
Note that all interactions are automatically reset to an unevaluated state at the moment when the new
offspring generation becomes the parental generation (at step 4 in the generation cycle; see section
19.4). Most simulations therefore never have any reason to call unevaluate().

21.8 Class Mutation
This class represents a single point mutation. Mutations can be shared by the genomes of many
individuals; if they reach fixation, they are converted to Substitution objects.
Although Mutation has a tag property, like most SLiM classes, the subpopID can also store
custom values if you don’t need to track the origin subpopulation of mutations (see below).
Section 1.5.2 presents an overview of the conceptual role of this class.
21.8.1 Mutation properties
id => (integer$)

The identifier for this mutation. Each mutation created during a run receives an immutable identifier
that will be unique across the duration of the run. These identifiers are not re-used during a run,
except that if a population file is loaded from disk, the loaded mutations will receive their original
identifier values as saved in the population file.
mutationType => (object$)

The MutationType from which this mutation was drawn.
originGeneration => (integer$)

The generation in which this mutation arose.
position => (integer$)

The position in the chromosome of this mutation.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

439

selectionCoeff => (float$)

The selection coefficient of the mutation, drawn from the distribution of fitness effects of its
MutationType. If a mutation has a selectionCoeff of s, the multiplicative fitness effect of the
mutation in a homozygote is 1+s; in a heterozygote it is 1+hs, where h is the dominance coefficient
kept by the mutation type (see section 21.9.1).
Note that this property has a quirk: it is stored internally in SLiM using a single-precision float, not the
double-precision float type normally used by Eidos. This means that if you set a mutation mut’s
selection coefficient to some number x, mut.selectionCoeff==x may be F due to floating-point
rounding error. Comparisons of floating-point numbers for exact equality is often a bad idea, but this
is one case where it may fail unexpectedly. Instead, it is recommended to use the id or tag properties
to identify particular mutations.
subpopID <–> (integer$)

The identifier of the subpopulation in which this mutation arose. This property can be used to track
the ancestry of mutations through their subpopulation of origin. For an overview of other ways of
tracking genetic ancestry, including true local ancestry at each position on the chromosome, see
section 13.9.
If you don’t care which subpopulation a mutation originated in, the subpopID may be used as an
arbitrary integer “tag” value for any purpose you wish; SLiM does not do anything with the value of
subpopID except propagate it to Substitution objects and report it in output. (It must still be >= 0,
however, since SLiM object identifiers are limited to nonnegative integers).
tag <–> (integer$)

A user-defined integer value. The value of tag is initially undefined (i.e., has an effectively random
value that could be different every time you run your model); if you wish it to have a defined value,
you must arrange that yourself by explicitly setting its value prior to using it elsewhere in your code.
The value of tag is not used by SLiM; it is free for you to use.

21.8.2 Mutation methods
– (+)getValue(string$ key)

Returns the value previously set for the dictionary entry identifier key using setValue(), or NULL if
no value has been set. This dictionary-style functionality is actually provided by the superclass of
Mutation, SLiMEidosDictionary, although that fact is not presently visible in Eidos since
superclasses are not introspectable.
– (void)setMutationType(io$ mutType)

Set the mutation type of the mutation to mutType (which may be specified as either an integer
identifier or a MutationType object). This implicitly changes the dominance coefficient of the
mutation to that of the new mutation type, since the dominance coefficient is a property of the
mutation type. On the other hand, the selection coefficient of the mutation is not changed, since it is
a property of the mutation object itself; it can be changed explicitly using the setSelectionCoeff()
method if so desired.
The mutation type of a mutation is normally a constant in simulations, so be sure you know what you
are doing. Changing this will normally affect the fitness values calculated at the end of the current
generation; if you want current fitness values to be affected, you can call SLiMSim’s method
recalculateFitness() – but see the documentation of that method for caveats.
– (void)setSelectionCoeff(float$ selectionCoeff)

Set the selection coefficient of the mutation to selectionCoeff. The selection coefficient will be
changed for all individuals that possess the mutation, since they all share a single Mutation object
(note that the dominance coefficient will remain unchanged, as it is determined by the mutation type).
This is normally a constant in simulations, so be sure you know what you are doing; often setting up a
fitness() callback (see section 22.2) is preferable, in order to modify the selection coefficient in a
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

440

more limited and controlled fashion (see section 9.5 for further discussion of this point). Changing
this will normally affect the fitness values calculated at the end of the current generation; if you want
current fitness values to be affected, you can call SLiMSim’s method recalculateFitness() – but
see the documentation of that method for caveats.
– (void)setValue(string$ key, + value)

Sets a value for the dictionary entry identifier key. The value, which may be of any type other than
object, can be fetched later using getValue(). This dictionary-style functionality is actually
provided by the superclass of Mutation, SLiMEidosDictionary, although that fact is not presently
visible in Eidos since superclasses are not introspectable.

21.9 Class MutationType
This class represents a type of mutation with a particular distribution of fitness effects, such as
neutral mutations or weakly beneficial mutations. Sections 1.5.3 and 1.5.4 present an overview of
the conceptual role of this class. The mutation types currently defined in the simulation are
defined as global constants with the same names used in the SLiM input file – m1, m2, and so forth.
There are currently six options for the distribution of fitness effects in SLiM, represented by
single-character codes:
"f" – A fixed fitness effect. This DFE type has a single parameter, the selection coefficient s to
be used by all mutations of the mutation type.
"g" – A gamma-distributed fitness effect. This DFE type is specified by two parameters, a shape
parameter and a mean value. The gamma distribution from which mutations are drawn is given by
the probability density function P(s | α,β) = [Γ(α)βα]−1sα−1exp(−s/β), where α is the shape parameter,
and the specified mean for the distribution is equal to αβ. Note that this parameterization is the
same as for the Eidos function rgamma(). A gamma distribution is often used to model deleterious
mutations at functional sites.
"e" – An exponentially-distributed fitness effect. This DFE type is specified by a single
parameter, the mean of the distribution. The exponential distribution from which mutations are
drawn is given by the probability density function P(s | β) = β−1exp(−s/β), where β is the specified
mean for the distribution. This parameterization is the same as for the Eidos function rexp(). An
exponential distribution is often used to model beneficial mutations.
"n" – A normally-distributed fitness effect. This DFE type is specified by two parameters, a
mean and a standard deviation. The normal distribution from which mutations are drawn is given
by the probability density function P(s | µ,σ) = (2πσ2)−1/2exp(−(s−µ)2/2σ2), where µ is the mean and σ
is the standard deviation. This parameterization is the same as for the Eidos function rnorm(). A
normal distribution is often used to model mutations that can be either beneficial or deleterious,
since both tails of the distribution are unbounded.
"w" – A Weibull-distributed fitness effect. This DFE type is specified by a scale parameter and a
shape parameter. The Weibull distribution from which mutations are drawn is given by the
probability density function P(s | λ,k) = (k/λk)sk−1exp(−(s/λ)k), where λ is the scale parameter and k is
the shape parameter. This parameterization is the same as for the Eidos function rweibull(). A
Weibull distribution is often used to model mutations following extreme-value theory.
"s" – A script-based fitness effect. This DFE type is specified by a script parameter of type
string, specifying an Eidos script to be executed to produce each new selection coefficient. For
example, the script "return rbinom(1);" could be used to generate selection coefficients drawn
from a binomial distribution, using the Eidos function rbinom(), even though that mutational
distribution is not supported by SLiM directly. The script must return a singleton float or integer.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

441

Note that these distributions can in principle produce selection coefficients smaller than -1.0.
In that case, the mutations will be evaluated as “lethal” by SLiM, and the relative fitness of the
individual will be set to 0.0.
21.9.1 MutationType properties
color <–> (string$)

The color used to display mutations of this type in SLiMgui. Outside of SLiMgui, this property still
exists, but is not used by SLiM. Colors may be specified by name, or with hexadecimal RGB values of
the form "#RRGGBB" (see the Eidos manual). If color is the empty string, "", SLiMgui’s default
(selection-coefficient–based) color scheme is used; this is the default for new MutationType objects.
colorSubstitution <–> (string$)

The color used to display substitutions of this type in SLiMgui (see the discussion for the
colorSubstitution property of the Chromosome class for details). Outside of SLiMgui, this property
still exists, but is not used by SLiM. Colors may be specified by name, or with hexadecimal RGB
values of the form "#RRGGBB" (see the Eidos manual). If colorSubstitution is the empty string, "",
SLiMgui’s default (selection-coefficient–based) color scheme is used; this is the default for new
MutationType objects.
convertToSubstitution <–> (logical$)

This property governs whether mutations of this mutation type will be converted to Substitution
objects when they reach fixation.
In WF models this property is T by default, since conversion to Substitution objects provides large
speed benefits; it should be set to F only if necessary, and only on the mutation types for which it is
necessary. This might be needed, for example, if you are using a fitness() callback to implement an
epistatic relationship between mutations; a mutation epistatically influencing the fitness of other
mutations through a fitness() callback would need to continue having that influence even after
reaching fixation, but if the simulation were to replace the fixed mutation with a Substitution
object the mutation would no longer be considered in fitness calculations (unless the callback
explicitly consulted the list of Substitution objects kept by the simulation). Other script-defined
behaviors in fitness(), interaction(), mateChoice(), modifyChild(), and recombination()
callbacks might also necessitate the disabling of substitution for a given mutation type; this is an
important consideration to keep in mind. See section 19.3 for further discussion of
convertToSubstitution in WF models.
In contrast, for nonWF models this property is F by default, because even mutations with no epistatis
or other indirect fitness effects will continue to influence the survival probabilities of individuals. For
nonWF models, only neutral mutation types with no epistasis or other side effects can safely be
converted to substitutions upon fixation. When such a pure-neutral mutation type is defined in a
nonWF model, this property should be set to T to tell SLiM that substitution is allowed; this may have
very large positive effects on performance, so it is important to remember when modeling background
neutral mutations. See section 20.5 for further discussion of convertToSubstitution in nonWF
models.
SLiM consults this flag at the end of each generation when deciding whether to substitute each fixed
mutation. If this flag is T, all eligible fixed mutations will be converted at the end of the current
generation, even if they were previously left unconverted because of the previous value of the flag.
Setting this flag to F will prevent future substitutions, but will not cause any existing Substitution
objects to be converted back into Mutation objects.
distributionParams => (float)

The parameters that configure the chosen distribution of fitness effects. This will be of type string for
DFE type "s", and type float for all other DFE types.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

442

distributionType => (string$)

The type of distribution of fitness effects; one of "f", "g", "e", "n", "w", or "s" (see section 21.9,
above).
dominanceCoeff <–> (float$)

The dominance coefficient used for mutations of this type when heterozygous. Changing this will
normally affect the fitness values calculated at the end of the current generation; if you want current
fitness values to be affected, you can call SLiMSim’s method recalculateFitness() – but see the
documentation of that method for caveats.
Note that the dominance coefficient is not bounded. A dominance coefficient greater than 1.0 may
be used to achieve an overdominance effect. By making the selection coefficient very small and the
dominance coefficient very large, an overdominance scenario in which both homozygotes have the
same fitness may be approximated, to a nearly arbitrary degree of precision.
Note that this property has a quirk: it is stored internally in SLiM using a single-precision float, not the
double-precision float type normally used by Eidos. This means that if you set a mutation type
muttype’s dominance coefficient to some number x, muttype.dominanceCoeff==x may be F due to
floating-point rounding error. Comparisons of floating-point numbers for exact equality is often a bad
idea, but this is one case where it may fail unexpectedly. Instead, it is recommended to use the id or
tag properties to identify particular mutation types.
id => (integer$)

The identifier for this mutation type; for mutation type m3, for example, this is 3.
mutationStackGroup <–> (integer$)

The group into which this mutation type belongs for purposes of mutation stacking policy. This is
equal to the mutation type’s id by default. See mutationStackPolicy, below, for discussion.
mutationStackPolicy <–> (string$)

This property and the mutationStackGroup property together govern whether mutations of this
mutation type’s stacking group can “stack” – can occupy the same position in a single individual. A
set of mutation types with the same value for mutationStackGroup is called a “stacking group”, and
all mutation types in a given stacking group must have the same mutationStackPolicy value, which
defines the stacking behavior of all mutations of the mutation types in the stacking group. In other
words, one stacking group might allow its mutations to stack, while another stacking group might not,
but the policy within each stacking group must be unambiguous.
This property is "s" by default, indicating that mutations in this stacking group should be allowed to
stack without restriction. If the policy is set to "f", the first mutation of stacking group at a given site
is retained; further mutations of this stacking group at the same site are discarded with no effect. This
can be useful for modeling one-way changes to single nucleotides, for example; once a T changes to
an A, further changes of the A to an A are not changes at all. If the policy is set to "l", the last
mutation of this stacking group at a given site is retained; earlier mutation of this stacking group at the
same site are discarded. This can be useful for modeling an “infinite-alleles” scenario in which every
new mutation at a site generates a completely new allele, rather than retaining the previous mutations
at the site.
The mutation stacking policy applies only within the given mutation type’s stacking group; mutations
of different stacking groups are always allowed to stack in SLiM. The policy applies to all mutations
added to the model after the policy is set, whether those mutations are introduced by calls such as
addMutation(), addNewMutation(), or addNewDrawnMutation(), or are added by SLiM’s own
mutation-generation machinery. However, no attempt is made to enforce the policy for mutations
already existing at the time the policy is set; typically, therefore, the policy is set in an initialize()
callback so that it applies throughout the simulation. The policy is also not enforced upon the
mutations loaded from a file with readFromPopulationFile(); such mutations were governed by
whatever stacking policy was in effect when the population file was generated.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

443

tag <–> (integer$)

A user-defined integer value. The value of tag is initially undefined (i.e., has an effectively random
value that could be different every time you run your model); if you wish it to have a defined value,
you must arrange that yourself by explicitly setting its value prior to using it elsewhere in your code.
The value of tag is not used by SLiM; it is free for you to use. See also the getValue() and
setValue() methods, for another way of attaching state to mutation types.

21.9.2 MutationType methods
– (+)getValue(string$ key)

Returns the value previously set for the dictionary entry identifier key using setValue(), or NULL if
no value has been set. This dictionary-style functionality is actually provided by the superclass of
MutationType, SLiMEidosDictionary, although that fact is not presently visible in Eidos since
superclasses are not introspectable.
– (void)setDistribution(string$ distributionType, ...)

Set the distribution of fitness effects for a mutation type. The distributionType may be "f", in
which case the ellipsis ... should supply a numeric$ fixed selection coefficient; "e", in which case
the ellipsis should supply a numeric$ mean selection coefficient for the exponential distribution; "g",
in which case the ellipsis should supply a numeric$ mean selection coefficient and a numeric$ alpha
shape parameter for a gamma distribution; "n", in which case the ellipsis should supply a numeric$
mean selection coefficient and a numeric$ sigma (standard deviation) parameter for a normal
distribution; "w", in which case the ellipsis should supply a numeric$ λ scale parameter and a
numeric$ k shape parameter for a Weibull distribution; or "s", in which case the ellipsis should
supply a string$ Eidos script parameter. See section 21.9 above for discussions of these distributions
and their uses. The DFE for a mutation type is normally a constant in simulations, so be sure you
know what you are doing.
– (void)setValue(string$ key, + value)

Sets a value for the dictionary entry identifier key. The value, which may be of any type other than
object, can be fetched later using getValue(). This dictionary-style functionality is actually
provided by the superclass of MutationType, SLiMEidosDictionary, although that fact is not
presently visible in Eidos since superclasses are not introspectable.

21.10 Class SLiMEidosBlock
This class represents a block of Eidos code registered in a SLiM simulation. All Eidos events and
Eidos callbacks defined in the SLiM input file of the current simulation are instantiated as
SLiMEidosBlock objects and are available through the read-only scriptBlocks property of
SLiMSim; see section 21.12.1. In addition, new script blocks can be created programmatically and
registered with the simulation, and registered script blocks can be deregistered; see the
‑register...() and ‑deregisterScriptBlock() methods of SLiMSim in section 21.12.2. The
currently executing script block is available through the self global; see section 22.8.
21.10.1 SLiMEidosBlock properties
active <–> (integer$)

If this evaluates to logical F (i.e., is equal to 0), the script block is inactive and will not be called.
The value of active for all registered script blocks is reset to -1 at the beginning of each generation,
prior to script events being called, thus activating all blocks. Any integer value other than -1 may be
used instead of -1 to represent that a block is active; for example, active may be used as a counter to
make a block execute a fixed number of times in each generation. This value is not cached by SLiM; if
it is changed, the new value takes effect immediately. For example, a callback might be activated and
inactivated repeatedly during a single generation.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

444

end => (integer$)

The last generation in which the script block is active.
id => (integer$)

The identifier for this script block; for script s3, for example, this is 3. A script block for which no id
was given will have an id of -1.
source => (string$)

The source code string of the script block.
start => (integer$)

The first generation in which the script block is active.
tag <–> (integer$)

A user-defined integer value. The value of tag is initially undefined (i.e., has an effectively random
value that could be different every time you run your model); if you wish it to have a defined value,
you must arrange that yourself by explicitly setting its value prior to using it elsewhere in your code.
The value of tag is not used by SLiM; it is free for you to use.
type => (string$)

The type of the script block; this will be "early" or "late" for the two types of Eidos events, or
"initialize", "fitness", "mateChoice", "modifyChild", or "recombination" for the
respective types of Eidos callbacks (see section 21.1 and chapter 22).

21.10.2 SLiMEidosBlock methods
SLiMEidosBlock provides no methods to modify a script block. You may, however, reschedule a
block to run in a different set of generations using the rescheduleScriptBlock() method of
SLiMSim, or register a new script block using the source and type of an existing block using the
register...() methods of SLiMSim (see section 21.12.2).
21.11 Class SLiMgui
This class represents the SLiMgui application. When running under SLiMgui, a global object
singleton constant of class SLiMgui will be defined, named slimgui. This object can be used to
query and control the SLiMgui application. When running at the command line, the slimgui
object will not exist; to determine whether the simulation is running under SLiMgui, one may
therefore test exists("slimgui"). If a model needs to run both at the command line and under
SLiMgui, all uses of the slimgui object should be protected by if (exists("slimgui")) to avoid
errors.
21.11.1 SLiMgui properties
pid => (integer$)

The Un*x process identifier (commonly called the “pid”) of the running SLiMgui application. This can
be useful for scripts that wish to use system calls to influence the SLiMgui application.

21.11.2 SLiMgui methods
– (void)openDocument(string$ filePath)

Open the document at filePath in SLiMgui, if possible. Supported document types include SLiM
model files (typically with a .slim path extension), text files (typically with a .txt path extension,
and opened as untitled model files), and PDF files (typically with a .pdf path extension). This can be
particularly useful for opening PDF documents created by the simulation itself, often by sublaunching
a plotting process in R or another environment; see section 13.11 for an example.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

445

– (void)pauseExecution(string$ filePath)

Pauses a model that is playing in SLiMgui. This is essentially equivalent to clicking the “Play” button
to stop the execution of the model. Execution can be resumed by the user, by clicking the “Play”
button again; unlike calling stop() or simulationFinished(), the simulation is not terminated.
This method can be useful for debugging or exploratory purposes, to pause the model at a point of
interest. Execution is paused at the end of the currently executing generation, not mid-generation.
If the model is being profiled, or is executing forward to a generation number entered in the
generation field, pauseExecution() will do nothing; by design, pauseExecution() only pauses
execution when SLiMgui is doing a simple “Play” of the model.

21.12 Class SLiMSim
This class represents a SLiM simulation. The current SLiMSim instance is defined as a global
constant named sim.
21.12.1 SLiMSim properties
chromosome => (object$)

The Chromosome object used by the simulation.
chromosomeType => (string$)

The type of chromosome being simulated; this will be one of "A", "X", or "Y".
dimensionality => (string$)

The spatial dimensionality of the simulation, as specified in initializeSLiMOptions(). This will be
"" (the empty string) for non-spatial simulations (the default), or "x", "xy", or "xyz", for simulations
using those spatial dimensions respectively.
dominanceCoeffX <–> (float$)

The dominance coefficient value used to modify the selection coefficients of mutations present on the
single X chromosome of an XY male (see the SLiM documentation for details). Used only when
simulating an X chromosome; setting a value for this property in other circumstances is an error.
Changing this will normally affect the fitness values calculated at the end of the current generation; if
you want current fitness values to be affected, you can call SLiMSim’s method
recalculateFitness() – but see the documentation of that method for caveats.
generation <–> (integer$)

The current generation number.
genomicElementTypes => (object)

The GenomicElementType objects being used in the simulation.
inSLiMgui => (logical$)

This property has been deprecated, and may be removed in a future release of SLiM. In SLiM 3.2.1
and later, use exists("slimgui") instead.
If T, the simulation is presently running inside SLiMgui; if F, it is running at the command line. In
general simulations should not care where they are running, but in special circumstances such as
opening plot windows it may be necessary to know the runtime environment.
interactionTypes => (object)

The InteractionType objects being used in the simulation.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

446

modelType => (string$)

The type of model being simulated, as specified in initializeSLiMModelType(). This will be "WF"
for WF models (Wright-Fisher models, the default), or "nonWF" for nonWF models (non-Wright-Fisher
models; see section 1.6 for discussion).
mutationTypes => (object)

The MutationType objects being used in the simulation.
mutations => (object)

The Mutation objects that are currently active in the simulation.
periodicity => (string$)

The spatial periodicity of the simulation, as specified in initializeSLiMOptions(). This will be
"" (the empty string) for non-spatial simulations and simulations with no periodic spatial dimensions
(the default). Otherwise, it will be a string representing the subset of spatial dimensions that have
been declared to be periodic, as specified to initializeSLiMOptions().
scriptBlocks => (object)

All registered SLiMEidosBlock objects in the simulation.
sexEnabled => (logical$)

If T, sex is enabled in the simulation; if F, individuals are hermaphroditic.
subpopulations => (object)

The Subpopulation instances currently defined in the simulation.
substitutions => (object)

A vector of Substitution objects, representing all mutations that have been fixed.
tag <–> (integer$)

A user-defined integer value. The value of tag is initially undefined (i.e., has an effectively random
value that could be different every time you run your model); if you wish it to have a defined value,
you must arrange that yourself by explicitly setting its value prior to using it elsewhere in your code.
The value of tag is not used by SLiM; it is free for you to use. See also the getValue() and
setValue() methods, for another way of attaching state to the simulation.

21.12.2 SLiMSim methods
– (object$)addSubpop(is$ subpopID, integer$ size,
[float$ sexRatio = 0.5])

Add a new subpopulation with id subpopID and size individuals. The subpopID parameter may be
either an integer giving the ID of the new subpopulation, or a string giving the name of the new
subpopulation (such as "p5" to specify an ID of 5). Only if sex is enabled in the simulation, the initial
sex ratio may optionally be specified as sexRatio (as the male fraction, M:M+F); if it is not specified,
a default of 0.5 is used. The new subpopulation will be defined as a global variable immediately by
this method (see section 21.13), and will also be returned by this method. Subpopulations added by
this method will initially consist of individuals with empty genomes. In order to model subpopulations
that split from an already existing subpopulation, use addSubpopSplit().
– (object$)addSubpopSplit(is$ subpopID, integer$ size,
io$ sourceSubpop, [float$ sexRatio = 0.5])

Split off a new subpopulation with id subpopID and size individuals derived from subpopulation
sourceSubpop. The subpopID parameter may be either an integer giving the ID of the new
subpopulation, or a string giving the name of the new subpopulation (such as "p5" to specify an ID
of 5). The sourceSubpop parameter may specify the source subpopulation either as a
Subpopulation object or by integer identifier. Only if sex is enabled in the simulation, the initial
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

447

sex ratio may optionally be specified as sexRatio (as the male fraction, M:M+F); if it is not specified,
a default of 0.5 is used. The new subpopulation will be defined as a global variable immediately by
this method (see section 21.13), and will also be returned by this method.
Subpopulations added by this method will consist of individuals that are clonal copies of individuals
from the source subpopulation, randomly chosen with probabilities proportional to fitness. The fitness
of all of these initial individuals is considered to be 1.0, to avoid a doubled round of selection in the
initial generation, given that fitness values were already used to choose the individuals to clone. Once
this initial set of individuals has mated to produce offspring, the model is effectively of parental
individuals in the source subpopulation mating randomly according to fitness, as usual in SLiM, with
juveniles migrating to the newly added subpopulation. Effectively, then, then new subpopulation is
created empty, and is filled by migrating juveniles from the source subpopulation, in accordance with
SLiM’s usual model of juvenile migration.
– (integer$)countOfMutationsOfType(io$ mutType)

Returns the number of mutations that are of the type specified by mutType, out of all of the mutations
that are currently active in the simulation. If you need a vector of the matching Mutation objects,
rather than just a count, use -mutationsOfType(). This method is often used to determine whether
an introduced mutation is still active (as opposed to being either lost or fixed). This method is
provided for speed; it is much faster than the corresponding Eidos code.
– (void)deregisterScriptBlock(io scriptBlocks)

All SLiMEidosBlock objects specified by scriptBlocks (either with SLiMEidosBlock objects or
with integer identifiers) will be scheduled for deregistration. The deregistered blocks remain valid,
and may even still be executed in the current stage of the current generation (see section 22.8); the
blocks are not actually deregistered and deallocated until sometime after the currently executing script
block has completed. To immediately prevent a script block from executing, even when it is
scheduled to execute in the current stage of the current generation, use the active property of the
script block (see sections 20.10.1 and 21.8).
– (+)getValue(string$ key)

Returns the value previously set for the dictionary entry identifier key using setValue(), or NULL if
no value has been set. This dictionary-style functionality is actually provided by the superclass of
SLiMSim, SLiMEidosDictionary, although that fact is not presently visible in Eidos since
superclasses are not introspectable.
– (integer)mutationCounts(No subpops,
[No mutations = NULL])

Return an integer vector with the frequency counts of all of the Mutation objects passed in
mutations, within the Subpopulation objects in subpops. The subpops argument is required, but
you may pass NULL to get population-wide frequency counts. If the optional mutations argument is
NULL (the default), frequency counts will be returned for all of the active Mutation objects in the
simulation – the same Mutation objects, and in the same order, as would be returned by the
mutations property of sim, in other words.
See the -mutationFrequencies() method to obtain float frequencies instead of integer counts.
– (float)mutationFrequencies(No subpops,
[No mutations = NULL])

Return a float vector with the frequencies of all of the Mutation objects passed in mutations,
within the Subpopulation objects in subpops. The subpops argument is required, but you may pass
NULL to get population-wide frequencies. If the optional mutations argument is NULL (the default),
frequencies will be returned for all of the active Mutation objects in the simulation – the same
Mutation objects, and in the same order, as would be returned by the mutations property of sim, in
other words.
See the -mutationCounts() method to obtain integer counts instead of float frequencies.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

448

– (object)mutationsOfType(io$ mutType)

Returns an object vector of all the mutations that are of the type specified by mutType, out of all of
the mutations that are currently active in the simulation. If you just need a count of the matching
Mutation objects, rather than a vector of the matches, use -countOfMutationsOfType(). This
method is often used to look up an introduced mutation at a later point in the simulation, since there
is no way to keep persistent references to objects in SLiM. This method is provided for speed; it is
much faster than the corresponding Eidos code.
– (void)outputFixedMutations([Ns$ filePath = NULL], [logical$ append = F])

Output all fixed mutations – all Substitution objects, in other words (see section 4.2.4) – in a SLiM
native format (see section 23.1.2 for output format details). If the optional parameter filePath is
NULL (the default), output will be sent to Eidos’s output stream (see section 4.2.1). Otherwise, output
will be sent to the filesystem path specified by filePath, overwriting that file if append if F, or
appending to the end of it if append is T. Mutations which have fixed but have not been turned into
Substitution objects – typically because convertToSubstitution has been set to F for their
mutation type (see section 21.9.1) – are not output; they are still considered to be segregating
mutations by SLiM.
Output is generally done in a late() event, so that the output reflects the state of the simulation at
the end of a generation.
– (void)outputFull([Ns$ filePath = NULL], [logical$ binary = F],
[logical$ append = F], [logical$ spatialPositions = T], [logical$ ages = T])

Output the state of the entire population (see section 23.1.1 for output format details). If the optional
parameter filePath is NULL (the default), output will be sent to Eidos’s output stream (see section
4.2.1). Otherwise, output will be sent to the filesystem path specified by filePath, overwriting that
file if append if F, or appending to the end of it if append is T. When writing to a file, a logical flag,
binary, may be supplied as well. If binary is T, the population state will be written as a binary file
instead of a text file (binary data cannot be written to the standard output stream). The binary file is
usually smaller, and in any case will be read much faster than the corresponding text file would be
read. Binary files are not guaranteed to be portable between platforms; in other words, a binary file
written on one machine may not be readable on a different machine (but in practice it usually will be,
unless the platforms being used are fairly unusual). If binary is F (the default), a text file will be
written.
Beginning with SLiM 2.3, the spatialPositions parameter may be used to control the output of the
spatial positions of individuals in simulations for which continuous space has been enabled using the
dimensionality option of initializeSLiMOptions(). If spatialPositions is F, the output will
not contain spatial positions, and will be identical to the output generated by SLiM 2.1 and later. If
spatialPositions is T, spatial position information will be output if it is available (see section
23.1.1 for format details). If the simulation does not have continuous space enabled, the
spatialPositions parameter will be ignored. Positional information may be output for all output
destinations – the Eidos output stream, a text file, or a binary file.
Beginning with SLiM 3.0, the ages parameter may be used to control the output of the ages of
individuals in nonWF simulations. If ages is F, the output will not contain ages, preserving backward
compatibility with the output format of SLiM 2.1 and later. If ages is T, ages will be output for nonWF
models (see section 23.1.1 for format details). In WF simulations, the ages parameter will be ignored.
Output is generally done in a late() event, so that the output reflects the state of the simulation at
the end of a generation.
– (void)outputMutations(object mutations, [Ns$ filePath = NULL],
[logical$ append = F])

Output all of the given mutations (see section 23.1.3 for output format details). This can be used to
output all mutations of a given mutation type, for example. If the optional parameter filePath is
NULL (the default), output will be sent to Eidos’s output stream (see section 4.2.1). Otherwise, output
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

449

will be sent to the filesystem path specified by filePath, overwriting that file if append if F, or
appending to the end of it if append is T.
Output is generally done in a late() event, so that the output reflects the state of the simulation at
the end of a generation.
– (void)outputUsage(void)

Output the current memory usage of the simulation to Eidos’s output stream. The specifics of what is
printed, and in what format, should not be relied upon as they may change from version to version of
SLiM. This method is primarily useful for understanding where the memory usage of a simulation
predominantly resides, for debugging or optimization. Note that it does not capture all memory usage
by the process; rather, it summarizes the memory usage by SLiM and Eidos in directly allocated
objects and buffers. To get the total memory usage of the running process (either current or peak), use
the Eidos function usage().
– (integer$)readFromPopulationFile(string$ filePath)

Read from a population initialization file, whether in text or binary format as previously specified to
outputFull(), and return the generation counter value represented by the file’s contents (i.e., the
generation at which the file was generated). Although this is most commonly used to set up initial
populations (often in an Eidos event set to run in generation 1, immediately after simulation
initialization), it may be called in any Eidos event; the current state of all populations will be wiped
and replaced by the state in the file at filePath. All Eidos variables that are of type object and have
element type Subpopulation, Genome, Mutation, Individual, or Substitution will be removed
as a side effect of this method, since all such variables would refer to objects that no longer exist in
the SLiM simulation; if you want to preserve any of that state, you should output it or save it to a file
prior to this call. New symbols will be defined to refer to the new Subpopulation objects loaded
from the file.
If the file being read was written by a version of SLiM prior to 2.3, then for backward compatibility
fitness values will be calculated immediately for any new subpopulations created by this call, which
will trigger the calling of any activated and applicable fitness() callbacks. When reading files
written by SLiM 2.3 or later, fitness values are not calculated as a side effect of this call (because the
simulation will often need to evaluate interactions or modify other state prior to doing so).
In SLiM 2.3 and later when using the WF model, calling readFromPopulationFile() from any
context other than a late() event causes a warning; calling from a late() event is almost always
correct in WF models, so that fitness values can be automatically recalculated by SLiM at the usual
time in the generation cycle without the need to force their recalculation (see chapter 19, and
comments on recalculateFitness() below).
In SLiM 3.0 when using the nonWF model, calling readFromPopulationFile() from any context
other than an early() event causes a warning; calling from an early() event is almost always
correct in nonWF models, so that fitness values can be automatically recalculated by SLiM at the
usual time in the generation cycle without the need to force their recalculation (see chapter 20, and
comments on recalculateFitness() below).
As of SLiM 2.1, this method changes the generation counter to the generation read from the file. If
you do not want the generation counter to be changed, you can change it back after reading, by
setting sim.generation to whatever value you wish. Note that restoring a saved past state and
running forward again will not yield the same simulation results, because the random number
generator’s state will not be the same; to ensure reproducibility from a given time point, setSeed()
can be used to establish a new seed value. Any changes made to the simulation’s structure (mutation
types, genomic element types, etc.) will not be wiped and re-established by
readFromPopulationFile(); this method loads only the population’s state, not the simulation
configuration, so care should be taken to ensure that the simulation structure meshes coherently with
the loaded data. Indeed, state such as the selfing and cloning rates of subpopulations, values set into
tag properties, and values set onto objects with setValue() will also be lost, since it is not saved out
by outputFull(). Only information saved by outputFull() will be restored; all other state associated
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

450

with the simulation’s subpopulations, individuals, genomes, mutations, and substitutions will be lost,
and should be re-established by the model if it is still needed.
As of SLiM 2.3, this method will read and restore the spatial positions of individuals if that information
is present in the output file and the simulation has enabled continuous space (see outputFull() for
details). If spatial positions are present in the output file but the simulation has not enabled
continuous space (or the number of spatial dimensions does not match), an error will result. If the
simulation has enabled continuous space but spatial positions are not present in the output file, the
spatial positions of the individuals read will be undefined, but an error is not raised.
As of SLiM 3.0, this method will read and restore the ages of individuals if that information is present
in the output file and the simulation is based upon the nonWF model. If ages are present but the
simulation uses a WF model, an error will result; the WF model does not use age information. If ages
are not present but the simulation uses a nonWF model, an error will also result; the nonWF model
requires age information.
– (void)recalculateFitness([Ni$ generation = NULL])

Force an immediate recalculation of fitness values for all individuals in all subpopulations. Normally
fitness values are calculated at a fixed point in each generation, and those values are cached and used
throughout the following generation. If simulation parameters are changed in script in a way that
affects fitness calculations, and if you wish those changes to take effect immediately rather than taking
effect at the end of the current generation, you may call recalculateFitness() to force an
immediate recalculation and recache.
The optional parameter generation provides the generation for which fitness() callbacks should
be selected; if it is NULL (the default), the simulation’s current generation value, sim.generation, is
used. If you call recalculateFitness() in an early() event in a WF model, you may wish this to
be sim.generation - 1 in order to utilize the fitness() callbacks for the previous generation, as if
the changes that you have made to fitness-influencing parameters were already in effect at the end of
the previous generation when the new generation was first created and evaluated (usually it is simpler
to just make such changes in a late() event instead, however, in which case calling
recalculateFitness() is probably not necessary at all since fitness values will be recalculated
immediately afterwards). Regardless of the value supplied for generation here, sim.generation
inside fitness() callbacks will report the true generation number, so if your callbacks consult that
parameter in order to create generation-specific fitness effects you will need to handle the discrepancy
somehow. (Similar considerations apply for nonWF models that call recalculateFitness() in a
late() event, which is also not advisable in general.)
After this call, the fitness values used for all purposes in SLiM will be the newly calculated values.
Calling this method will trigger the calling of any enabled and applicable fitness() callbacks, so
this is quite a heavyweight operation; you should think carefully about what side effects might result
(which is why fitness recalculation does not just occur automatically after changes that might affect
fitness values).
– (object$)registerEarlyEvent(Nis$ id, string$ source,
[Ni$ start = NULL], [Ni$ end = NULL])

Register a block of Eidos source code, represented as the string singleton source, as an Eidos
early() event in the current simulation, with optional start and end generations limiting its
applicability. The script block will be given identifier id (specified as an integer, or as a string
symbolic name such as "s5"); this may be NULL if there is no need to be able to refer to the block
later. The registered event is added to the end of the list of registered SLiMEidosBlock objects, and is
active immediately; it may be eligible to execute in the current generation (see section 22.8 for
details). The new SLiMEidosBlock will be defined as a global variable immediately by this method
(see section 21.10), and will also be returned by this method.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

451

– (object$)registerFitnessCallback(Nis$ id, string$ source,
Nio$ mutType, [Nio$ subpop = NULL],
[Ni$ start = NULL], [Ni$ end = NULL])

Register a block of Eidos source code, represented as the string singleton source, as an Eidos
fitness() callback in the current simulation, with a required mutation type mutType (which may be
an integer mutation type identifier, or NULL to indicate a global fitness() callback – see section
22.2), optional subpopulation subpop (which may also be an integer identifier, or NULL, the default,
to indicate all subpopulations), and optional start and end generations all limiting its applicability.
The script block will be given identifier id (specified as an integer, or as a string symbolic name
such as "s5"); this may be NULL if there is no need to be able to refer to the block later. The registered
callback is added to the end of the list of registered SLiMEidosBlock objects, and is active
immediately; it may be eligible to execute in the current generation (see section 22.8 for details). The
new SLiMEidosBlock will be defined as a global variable immediately by this method (see section
21.10), and will also be returned by this method.
– (object$)registerInteractionCallback(Nis$ id, string$ source,
io$ intType, [Nio$ subpop = NULL],
[Ni$ start = NULL], [Ni$ end = NULL])

Register a block of Eidos source code, represented as the string singleton source, as an Eidos
interaction() callback in the current simulation, with a required interaction type intType (which
may be an integer identifier), optional subpopulation subpop (which may also be an integer
identifier, or NULL, the default, to indicate all subpopulations), and optional start and end
generations all limiting its applicability. The script block will be given identifier id (specified as an
integer, or as a string symbolic name such as "s5"); this may be NULL if there is no need to be
able to refer to the block later. The registered callback is added to the end of the list of registered
SLiMEidosBlock objects, and is active immediately; it will be eligible to execute the next time an
InteractionType is evaluated. The new SLiMEidosBlock will be defined as a global variable
immediately by this method (see section 21.10), and will also be returned by this method.
– (object$)registerLateEvent(Nis$ id, string$ source,
[Ni$ start = NULL], [Ni$ end = NULL])

Register a block of Eidos source code, represented as the string singleton source, as an Eidos
late() event in the current simulation, with optional start and end generations limiting its
applicability. The script block will be given identifier id (specified as an integer, or as a string
symbolic name such as "s5"); this may be NULL if there is no need to be able to refer to the block
later. The registered event is added to the end of the list of registered SLiMEidosBlock objects, and is
active immediately; it may be eligible to execute in the current generation (see section 22.8 for
details). The new SLiMEidosBlock will be defined as a global variable immediately by this method
(see section 21.10), and will also be returned by this method.
– (object$)registerMateChoiceCallback(Nis$ id, string$ source,
[Nio$ subpop = NULL], [Ni$ start = NULL], [Ni$ end = NULL])

Register a block of Eidos source code, represented as the string singleton source, as an Eidos
mateChoice() callback in the current simulation, with optional subpopulation subpop (which may
be an integer identifier, or NULL, the default, to indicate all subpopulations) and optional start and
end generations all limiting its applicability. The script block will be given identifier id (specified as
an integer, or as a string symbolic name such as "s5"); this may be NULL if there is no need to be
able to refer to the block later. The registered callback is added to the end of the list of registered
SLiMEidosBlock objects, and is active immediately; it may be eligible to execute in the current
generation (see section 22.8 for details). The new SLiMEidosBlock will be defined as a global
variable immediately by this method (see section 21.10), and will also be returned by this method.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

452

– (object$)registerModifyChildCallback(Nis$ id, string$ source,
[Nio$ subpop = NULL], [Ni$ start = NULL], [Ni$ end = NULL])

Register a block of Eidos source code, represented as the string singleton source, as an Eidos
modifyChild() callback in the current simulation, with optional subpopulation subpop (which may
be an integer identifier, or NULL, the default, to indicate all subpopulations) and optional start and
end generations all limiting its applicability. The script block will be given identifier id (specified as
an integer, or as a string symbolic name such as "s5"); this may be NULL if there is no need to be
able to refer to the block later. The registered callback is added to the end of the list of registered
SLiMEidosBlock objects, and is active immediately; it may be eligible to execute in the current
generation (see section 22.8 for details). The new SLiMEidosBlock will be defined as a global
variable immediately by this method (see section 21.10), and will also be returned by this method.
– (object$)registerRecombinationCallback(Nis$ id, string$ source,
[Nio$ subpop = NULL], [Ni$ start = NULL], [Ni$ end = NULL])

Register a block of Eidos source code, represented as the string singleton source, as an Eidos
recombination() callback in the current simulation, with optional subpopulation subpop (which
may be an integer identifier, or NULL, the default, to indicate all subpopulations) and optional start
and end generations all limiting its applicability. The script block will be given identifier id (specified
as an integer, or as a string symbolic name such as "s5"); this may be NULL if there is no need to
be able to refer to the block later. The registered callback is added to the end of the list of registered
SLiMEidosBlock objects, and is active immediately; it may be eligible to execute in the current
generation (see section 22.8 for details). The new SLiMEidosBlock will be defined as a global
variable immediately by this method (see section 21.10), and will also be returned by this method.
– (object$)registerReproductionCallback(Nis$ id, string$ source,
[Nio$ subpop = NULL], [Ns$ sex = NULL], [Ni$ start = NULL],
[Ni$ end = NULL])

Register a block of Eidos source code, represented as the string singleton source, as an Eidos
reproduction() callback in the current simulation, with optional subpopulation subpop (which may
be an integer identifier, or NULL, the default, to indicate all subpopulations), optional sex-specificity
sex (which may be "M" or "F" in sexual simulations to make the callback specific to males or females
respectively, or NULL for no sex-specificity), and optional start and end generations all limiting its
applicability. The script block will be given identifier id (specified as an integer, or as a string
symbolic name such as "s5"); this may be NULL if there is no need to be able to refer to the block
later. The registered callback is added to the end of the list of registered SLiMEidosBlock objects,
and is active immediately; it may be eligible to execute in the current generation (see section 22.8 for
details). The new SLiMEidosBlock will be defined as a global variable immediately by this method
(see section 21.10), and will also be returned by this method.
– (object)rescheduleScriptBlock(object$ block,
integer$ start, [Ni$ end = NULL], [Ni generations = NULL])

Reschedule the target script block given by block to execute in a specified set of generations.
The first way to specify the generation set is with start and end parameter values; block will then
execute from start to end, inclusive. In this case, block is returned.
The second way to specify the generation set is using the generations parameter; this is more
flexible but more complicated. Since script blocks execute across a contiguous span of generations
defined by their start and end properties, this may result in the duplication of block; one script
block will be used for each contiguous span of generations in generations. The block object itself
will be rescheduled to cover the first such span, whereas duplicates of block will be created to cover
subsequent contiguous spans. A vector containing all of the script blocks scheduled by this method,
including block, will be returned; this vector is guaranteed to be sorted by the (ascending) scheduled
execution order of the blocks. Any duplicates of block created will be given values for the active,
source, tag, and type properties equal to the current values for block, but will be given an id of -1
since script block identifiers must be unique; if it is necessary to find the duplicated blocks again later,
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

453

their tag property should be used. The vector supplied for generations does not need to be in
sorted order, but it must not contain any duplicates.
Because this method can create a large number of duplicate script blocks, it can sometimes be better
to handle script block scheduling in other ways. If an early() event needs to execute every tenth
generation over the whole duration of a long model run, for example, it would not be advisable to use
a call like sim.rescheduleScriptBlock(s1, generations=seq(10, 100000, 10)) for that
purpose, since that would result in thousands of duplicate script blocks. Instead, it would be
preferable to add a test such as if (sim.generation % 10 != 0) return; at the beginning of the
event. It is legal to reschedule a script block while the block is executing; a call like
sim.rescheduleScriptBlock(self, sim.generation + 10, sim.generation + 10); made
inside a given block would therefore also cause the block to execute every tenth generation, although
this sort of self-rescheduling code is probably harder to read, maintain, and debug.
Whichever way of specifying the generation set is used, the discussion in section 22.8 applies: block
may continue to be executed during the current life cycle stage even after it has been rescheduled,
unless it is made inactive using its active property, and similarly, the block may not execute during
the current life cycle stage if it was not already scheduled to do so. Rescheduling script blocks during
the generation and life cycle stage in which they are executing, or in which they are intended to
execute, should be avoided.
Note that new script blocks can also be created and scheduled using the register...() methods of
SLiMSim; by using the same source as a template script block, the template can be duplicated and
scheduled for different generations. In fact, rescheduleScriptBlock() does essentially that
internally.
– (void)setValue(string$ key, + value)

Sets a value for the dictionary entry identifier key. The value, which may be of any type other than
object, can be fetched later using getValue(). This dictionary-style functionality is actually
provided by the superclass of SLiMSim, SLiMEidosDictionary, although that fact is not presently
visible in Eidos since superclasses are not introspectable.
– (void)simulationFinished(void)

Declare the current simulation finished. Normally SLiM ends a simulation when, at the end of a
generation, there are no script events or callbacks registered for any future generation (excluding
scripts with no declared end generation). If you wish to end a simulation before this condition is met,
a call to simulationFinished() will cause the current simulation to end at the end of the current
generation. For example, a simulation might self-terminate if a test for a dynamic equilibrium
condition is satisfied. Note that the current generation will finish executing; if you want the
simulation to stop immediately, you can use the Eidos method stop(), which raises an error
condition.
– (logical$)treeSeqCoalesced(void)

Returns the coalescence state for the recorded tree sequence at the last simplification. The returned
value is a logical singleton flag, T to indicate that full coalescence was observed at the last treesequence simplification (meaning that there is a single ancestral individual that roots all ancestry trees
at all sites along the chromosome – although not necessarily the same ancestor at all sites), or F if full
coalescence was not observed. For simple models, reaching coalescence may indicate that the model
has reached an equilibrium state, but this may not be true in models that modify the dynamics of the
model during execution by changing migration rates, introducing new mutations programmatically,
dictating non-random mating, etc., so be careful not to attach more meaning to coalescence than it is
due; some models may require burn-in beyond coalescence to reach equilibrium, or may not have an
equilibrium state at all. Also note that some actions by a model, such as adding a new subpopulation,
may cause the coalescence state to revert from T back to F (at the next simplification), so a return
value of T may not necessarily mean that the model is coalesced at the present moment – only that it
was coalesced at the last simplification.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

454

This method may only be called if tree sequence recording has been turned on with
initializeTreeSeq(); in addition, checkCoalescence=T must have been supplied to
initializeTreeSeq(), so that the necessary work is done during each tree-sequence simplification.
Since this method does not perform coalescence checking itself, but instead simply returns the
coalescence state observed at the last simplification, it may be desirable to call treeSeqSimplify()
immediately before treeSeqCoalesced() to obtain up-to-date information. However, the speed
penalty of doing this in every generation would be large, and most models do not need this level of
precision; usually it is sufficient to know that the model has coalesced, without knowing whether that
happened in the current generation or in a recent preceding generation.
– (void)treeSeqOutput(string$ path, [logical$ simplify = T])

Outputs the current tree sequence recording tables to the path specified by path. This method may
only be called if tree sequence recording has been turned on with initializeTreeSeq(). If
simplify is T (the default), simplification will be done immediately prior to output; this is almost
always desirable, unless a model wishes to avoid simplification entirely. A binary tree sequence file
will be written to the specified path; a filename extension of .trees is suggested for this type of file.
– (void)treeSeqRememberIndividuals(object individuals)

Permanently adds the individuals specified by individuals to the sample retained across tree
sequence table simplification. This method may only be called if tree sequence recording has been
turned on with initializeTreeSeq(). All currently living individuals are always retained across
simplification; this method does not need to be called, and indeed should not be called, for that
purpose. Instead, treeSeqRememberIndividuals() is for permanently adding particular individuals
to the retained sample. Typically this would be used, for example, to retain particular individuals that
you wanted to be able to trace ancestry back to in later analysis. However, this is not the typical
usage pattern for tree sequence recording; most models will not need to call this method.
The metadata (age, location, etc) that are stored in the resulting tree sequence are those values present
at either (a) the final generation, if the individual is alive at the end of the simulation, or (b) the last
time that the individual was remembered, if not. Calling treeSeqRememberIndividuals() on an
individual that is already remembered will cause the archived information about the remembered
individual to be updated to reflect the individual’s current state. A case where this is particularly
important is for the spatial location of individuals in continuous-space models. SLiM automatically
remembers the individuals that comprise the first generation of any new subpopulation created with
addSubpop(), for easy recapitation and other analysis (see section 16.10). However, since these firstgeneration individuals are remembered at the moment they are created, their spatial locations have
not yet been set up, and will contain garbage – and those garbage values will be archived in their
remembered state. If you need correct spatial locations of first-generation individuals for your postsimulation analysis, you should call treeSeqRememberIndividuals() explicitly on the first
generation, after setting spatial locations, to update the archived information with the correct spatial
positions.
– (void)treeSeqSimplify(void)

Triggers an immediate simplification of the tree sequence recording tables. This method may only be
called if tree sequence recording has been turned on with initializeTreeSeq(). A call to this
method will free up memory being used by entries that are no longer in the ancestral path of any
individual within the current sample (currently living individuals, in other words, plus those explicitly
added to the sample with treeSeqRememberIndividuals()), but it can also take a significant
amount of time. Typically calling this method is not necessary; the automatic simplification performed
occasionally by SLiM should be sufficient for most models.

21.13 Class Subpopulation
This class represents one subpopulation in the simulated population. Section 1.5.5 presents an
overview of the conceptual role of this class. The subpopulations currently defined in the
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

455

simulation are defined as global constants with the same names used in the SLiM input file – p1,
p2, and so forth.
21.13.1 Subpopulation properties
cloningRate => (float)

The fraction of children in the next generation that will be produced by cloning (as opposed to
biparental mating). In non-sexual (i.e. hermaphroditic) simulations, this property is a singleton float
representing the overall subpopulation cloning rate. In sexual simulations, this property is a float
vector with two values: the cloning rate for females (at index 0) and for males (at index 1).
firstMaleIndex => (integer$)

The index of the first male individual in the subpopulation. The genomes vector is sorted into females
first and males second; firstMaleIndex gives the position of the boundary between those sections.
Note, however, that there are two genomes per diploid individual, and the firstMaleIndex is not
premultiplied by 2; you must multiply it by 2 before using it to decide whether a given index into
genomes is a genome for a male or a female. The firstMaleIndex property is also the number of
females in the subpopulation, given this design. For non-sexual (i.e. hermaphroditic) simulations, this
property has an undefined value and should not be used.
fitnessScaling <–> (float$)

A float scaling factor applied to the fitness of all individuals in this subpopulation (i.e., the fitness
value computed for each individual will be multiplied by this value). This is primarily of use in
nonWF models, where fitness is absolute, rather than in WF models, where fitness is relative (and thus
a constant factor multiplied into the fitness of every individual will make no difference); however, it
may be used in either type of model. This provides a simple, fast way to modify the fitness of all
individuals in a subpopulation; conceptually it is similar to returning the same fitness effect for all
individuals in the subpopulation from a fitness(NULL) callback, but without the complexity and
performance overhead of implementing such a callback. To scale the fitness of individuals by different
(individual-specific) factors, see the fitnessScaling property of Individual.
The value of fitnessScaling is reset to 1.0 every generation, so that any scaling factor set lasts for
only a single generation. This reset occurs immediately after fitness values are calculated, in both WF
and nonWF models.
genomes => (object)

All of the genomes contained by the subpopulation; there are two genomes per diploid individual.
id => (integer$)

The identifier for this subpopulation; for subpopulation p3, for example, this is 3.
immigrantSubpopFractions => (float)

The expected value of the fraction of children in the next generation that are immigrants arriving from
particular subpopulations.
immigrantSubpopIDs => (integer)

The identifiers of the particular subpopulations from which immigrants will arrive in the next
generation.
individualCount => (integer$)

The number of individuals in the subpopulation; one-half of the number of genomes.
individuals => (object)

All of the individuals contained by the subpopulation. Each individual is diploid and thus contains
two Genome objects. See the sampleIndividuals() and subsetIndividuals() for fast ways to get
a subset of the individuals in a subpopulation.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

456

selfingRate => (float$)

The expected value of the fraction of children in the next generation that will be produced by selfing
(as opposed to biparental mating). Selfing is only possible in non-sexual (i.e. hermaphroditic)
simulations; for sexual simulations this property always has a value of 0.0.
sexRatio => (float$)

For sexual simulations, the sex ratio for the subpopulation. This is defined, in SLiM, as the fraction of
the subpopulation that is male; in other words, it is actually the M:(M+F) ratio. For non-sexual (i.e.
hermaphroditic) simulations, this property has an undefined value and should not be used.
spatialBounds => (float)

The spatial boundaries of the subpopulation. The length of the spatialBounds property depends
upon the spatial dimensionality declared with initializeSLiMOptions(). If the spatial
dimensionality is zero (as it is by default), the value of this property is float(0) (a zero-length float
vector). Otherwise, minimums are supplied for each coordinate used by the dimensionality of the
simulation, followed by maximums for each. In other words, if the declared dimensionality is "xy",
the spatialBounds property will contain values (x0, y0, x1, y1); bounds for the z coordinate
will not be included in that case, since that coordinate is not used in the simulation’s dimensionality.
This property cannot be set, but the setSpatialBounds() method may be used to achieve the same
thing.
tag <–> (integer$)

A user-defined integer value. The value of tag is initially undefined (i.e., has an effectively random
value that could be different every time you run your model); if you wish it to have a defined value,
you must arrange that yourself by explicitly setting its value prior to using it elsewhere in your code.
The value of tag is not used by SLiM; it is free for you to use. See also the getValue() and
setValue() methods, for another way of attaching state to subpopulations.

21.13.2 Subpopulation methods
– (No$)addCloned(object$ parent)

Generates a new offspring individual from the given parent by clonal reproduction, queues it for
addition to the target subpopulation, and returns it. The new offspring will not be visible as a member
of the target subpopulation until the end of the offspring generation life cycle stage. The
subpopulation of parent will be used to locate applicable modifyChild() callbacks governing the
generation of the offspring individual.
Note that this method is only for use in nonWF models. See addCrossed() for further general notes
on the addition of new offspring individuals.
– (No$)addCrossed(object$ parent1,
object$ parent2, [Nfs$ sex = NULL])

Generates a new offspring individual from the given parents by biparental sexual reproduction, queues
it for addition to the target subpopulation, and returns it. The new offspring will not be visible as a
member of the target subpopulation until the end of the offspring generation life cycle stage.
Attempting to use a newly generated offspring individual as a mate, or to reference it as a member of
the target subpopulation in any other way, will result in an error. In most models the returned
individual is not used, but it is provided for maximal generality and flexibility.
The new offspring individual is generated from parent1 and parent2 by crossing them. In sexual
models parent1 must be female and parent2 must be male; in hermaphroditic models, parent1 and
parent2 are unrestricted. If parent1 and parent2 are the same individual in a hermaphroditic
model, that parent self-fertilizes, or “selfs”, to generate the offspring sexually (note this is not the same
as clonal reproduction). Such selfing is considered “incidental” by addCrossed(), however; if the
preventIncidentalSelfing flag of initializeSLiMOptions() is T, supplying the same individual

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

457

for parent1 and parent2 is an error (you must check for and prevent incidental selfing if you set that
flag in a nonWF model). If non-incidental selfing is desired, addSelfed() should be used instead.
The sex parameter specifies the sex of the offspring. A value of NULL means “make the default
choice”; in non-sexual models it is the only legal value for sex, and does nothing, whereas in sexual
models it causes male or female to be chosen with equal probability. A value of "M" or "F" for sex
specifies that the offspring should be male or female, respectively. Finally, a float value from 0.0 to
1.0 for sex provides the probability that the offspring will be male; a value of 0.0 will produce a
female, a value of 1.0 will produce a male, and for intermediate values SLiM will draw the sex of the
offspring randomly according to the specified probability. Unless you wish the bias the sex ratio of
offspring, the default value of NULL should generally be used.
Note that any defined, active, and applicable recombination() and modifyChild() callbacks will
be called as a side effect of calling this method, before this method even returns. For
recombination() callbacks, the subpopulation of the parent that is generating a given gamete is
used; for modifyChild() callbacks the situation is more complex. In most biparental mating events,
parent1 and parent2 will belong to the same subpopulation, and modifyChild() callbacks for that
subpopulation will be used, just as in WF models. In certain models (such as models of pollen flow
and broadcast spawning), however, biparental mating may occur between parents that are not from
the same subpopulation; that is legal in nonWF models, and in that case, modifyChild() callbacks
for the subpopulation of parent1 are used (since that is the maternal parent).
If the modifyChild() callback process results in rejection of the proposed child (see section 22.4), a
new offspring individual will not be generated, and this method will return NULL. To force the
generation of an offspring individual from a given pair of parents, you could loop until addCrossed()
succeeds, but note that if your modifyChild() callback rejects all proposed children from those
particular parents, your model will then hang, so care must be taken with this approach. Usually,
nonWF models do not force generation of offspring in this manner; rejection of a proposed offspring
by a modifyChild() callback typically represents a phenomenon such as post-mating reproductive
isolation or lethal genetic incompatibilities that would reduce the expected litter size, so the default
behavior is typically desirable.
Note that this method is only for use in nonWF models, in which offspring generation is managed
manually by the model script; in such models, addCrossed() must be called only from
reproduction() callbacks, and may not be called at any other time. In WF models, offspring
generation is managed automatically by the SLiM core.
– (No$)addEmpty([Nfs$ sex = NULL])

Generates a new offspring individual with empty genomes (i.e., containing no mutations), queues it
for addition to the target subpopulation, and returns it. The new offspring will not be visible as a
member of the target subpopulation until the end of the offspring generation life cycle stage. No
recombination() callbacks will be called. The target subpopulation will be used to locate
applicable modifyChild() callbacks governing the generation of the offspring individual (unlike the
other addX() methods, because there is no parental individual to reference). The offspring is
considered to have no parents for the purposes of pedigree tracking. The sex parameter is treated as
in addCrossed().
Note that this method is only for use in nonWF models. See addCrossed() for further general notes
on the addition of new offspring individuals.
– (No$)addRecombinant(No$ strand1, No$ strand2,
Ni breaks1, No$ strand3, No$ strand4, Ni breaks2,
[Nfs$ sex = NULL])

Generates a new offspring individual from the given parental genomes with the specified
recombination breakpoints, queues it for addition to the target subpopulation, and returns it. The new
offspring will not be visible as a member of the target subpopulation until the end of the offspring
generation life cycle stage. The target subpopulation will be used to locate applicable
modifyChild() callbacks governing the generation of the offspring individual (unlike the other
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

458

addX() methods, because there is no parental individual to reference); recombination() callbacks

will not be called by this method. This method is an advanced feature; most models will use
addCrossed(), addSelfed(), or addCloned() instead.
This method supports several possible configurations for strand1, strand2, and breaks1 (and the
same applies for strand3, strand4, and breaks2). If strand1 and strand2 are both NULL, the
corresponding genome in the generated offspring will be empty, as from addEmpty(), with no
parental genomes and no added mutations; in this case, breaks1 must be NULL or zero-length. If
strand1 is non-NULL but strand2 is NULL, the corresponding genome in the generated offspring will
be a clonal copy of strand1 with mutations added, as from addCloned(); in this case, breaks1 must
similarly be NULL or zero-length. If strand1 and strand2 are both non-NULL, the corresponding
genome in the generated offspring will result from recombination between strand1 and strand2
with mutations added, as from addCrossed(), with strand1 being the initial copy strand; copying
will switch between strands at each breakpoint in breaks1, which must be non-NULL but need not be
sorted or uniqued (SLiM will sort and unique the supplied breakpoints internally). (It is not currently
legal for strand1 to be NULL and strand2 non-NULL; that variant may be assigned some meaning in
future.) Again, this discussion applies equally to strand3, strand4, and breaks2, mutatis mutandis.
Note that when new mutations are generated by addRecombinant(), their subpopID property will be
the id of the offspring’s subpopulation, since the parental subpopulation is ambiguous in the general
case; this behavior differs from the other add...() methods.
The sex parameter is interpreted exactly as in addCrossed(); see that method for discussion. If the
offspring sex is specified in any way (i.e., if sex is non-NULL), the strands provided must be compatible
with the sex chosen. If the offspring sex is not specified (i.e., if sex is NULL), the sex will be inferred
from the strands provided where possible (when modeling an X or Y chromosome), or will be chosen
randomly otherwise (when modeling autosomes); it will not be inferred from the sex of the individuals
possessing the parental strands, even when the reproductive mode is essentially clonal from a single
parent, since such inference would be ambiguous in the general case. Similarly, the offspring is
considered to have no parents for the purposes of pedigree tracking, since there may be more than
two “parents” in the general case. When modeling the X or Y, strand1 and strand2 must be X
genomes (or NULL), and strand3 and strand4 must both be X genomes or both be Y genomes (or
NULL).
These semantics allow several uses for addRecombinant(). When all strands are non-NULL, it is
similar to addCrossed() except that the recombination breakpoints are specified explicitly, allowing
very precise offspring generation without having to override SLiM’s breakpoint generation with a
recombination() callback. When only strand1 and strand3 are supplied, it is very similar to
addCloned(), creating a clonal offspring, except that the two parental genomes need not belong to
the same individual (whatever that might mean biologically). Supplying only strand1 is useful for
modeling clonally reproducing haploids; the second genome of every offspring will be kept empty and
will not receive new mutations. For a model of clonally reproducing haploids that undergo horizontal
gene transfer (HGT), supplying only strand1 and strand2 will allow HGT from strand2 to replace
segments of an otherwise clonal copy of strand1, while the second genome of the generated
offspring will again be kept empty; this could be useful for modeling bacterial conjugation, for
example. Other variations are also possible.
Note that this method is only for use in nonWF models. See addCrossed() for further general notes
on the addition of new offspring individuals.
– (No$)addSelfed(object$ parent)

Generates a new offspring individual from the given parent by selfing, queues it for addition to the
target subpopulation, and returns it. The new offspring will not be visible as a member of the target
subpopulation until the end of the offspring generation life cycle stage. The subpopulation of parent
will be used to locate applicable recombination() and modifyChild() callbacks governing the
generation of the offspring individual.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

459

Since selfing requires that parent act as a source of both a male and a female gamete, this method
may be called only in hermaphroditic models; calling it in sexual models will result in an error. This
method represents a non-incidental selfing event, so the preventIncidentalSelfing flag of
initializeSLiMOptions() has no effect on this method (in contrast to the behavior of
addCrossed(), where selfing is assumed to be incidental).
Note that this method is only for use in nonWF models. See addCrossed() for further general notes
on the addition of new offspring individuals.
– (float)cachedFitness(Ni indices)

The fitness values calculated for the individuals at the indices given are returned. If NULL is passed,
fitness values for all individuals in the subpopulation are returned. The fitness values returned are
cached values; fitness() callbacks are therefore not called as a side effect of this method. It is
always an error to call cachedFitness() from inside a fitness() callback, since fitness values are
in the middle of being set up. In WF models, it is also an error to call cachedFitness() from a
late() event, because fitness values for the new offspring generation have not yet been calculated
and are undefined. In nonWF models, the population may be a mixture of new and old individuals,
so instead, NAN will be returned as the fitness of any new individuals whose fitness has not yet been
calculated. When new subpopulations are first created with addSubpop() or addSubpopSplit(),
the fitness of all of the newly created individuals is considered to be 1.0 until fitness values are
recalculated.
– (void)configureDisplay([Nf center = NULL], [Nf$ scale = NULL],
[Ns$ color = NULL])

This method customizes the display of the subpopulation in SLiMgui’s Population Visualization graph.
When this method is called by a model running outside SLiMgui, it will do nothing except typechecking and bounds-checking its arguments. When called by a model running in SLiMgui, the
position, size, and color of the subpopulation’s displayed circle can be controlled as specified below.
The center parameter sets the coordinates of the center of the subpopulation’s displayed circle; it
must be a float vector of length two, such that center[0] provides the x-coordinate and center[1]
provides the y-coordinate. The square central area of the Population Visualization occupies scaled
coordinates in [0,1] for both x and y, so the values in center must be within those bounds. If a value
of NULL is provided, SLiMgui’s default center will be used (which currently arranges subpopulations in
a circle).
The scale parameter sets a scaling factor to be applied to the radius of the subpopulation’s displayed
circle. The default radius used by SLiMgui is a function of the subpopulation’s number of individuals;
this default radius is then multiplied by scale. If a value of NULL is provided, the default radius will
be used; this is equivalent to supplying a scale of 1.0. Typically the same scale value should be
used by all subpopulations, to scale all of their circles up or down uniformly, but that is not required.
The color parameter sets the color to be used for the displayed subpopulation’s circle. Colors may be
specified by name, or with hexadecimal RGB values of the form "#RRGGBB" (see the Eidos manual). If
color is NULL or the empty string, "", SLiMgui’s default (fitness-based) color will be used.
– (void)defineSpatialMap(string$ name, string$ spatiality, Ni gridSize,
float values, [logical$ interpolate = F], [Nf valueRange = NULL],
[Ns colors = NULL])

Defines a spatial map for the subpopulation. The map will henceforth be identified by name. The map
uses the spatial dimensions referenced by spatiality, which must be a subset of the dimensions defined
for the simulation in initializeSLiMOptions(). Spatiality "x" is permitted for dimensionality "x";
spatiality "x", "y", or "xy" for dimensionality "xy"; and spatiality "x", "y", "z", "xy", "yz", "xz",
or "xyz" for dimensionality "xyz". The spatial map is defined by a grid of values of a size specified
by gridSize, which must have one value per spatial dimension (or gridSize may be NULL; see
below); for a spatiality of "xz", for example, gridSize must be of length 2, specifying the size of the
values grid in the x and z dimensions. The parameter values then gives the values of the grid; it must
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

460

be of length equal to the product of the gridSize elements, and specifies values varying first (i.e.,
fastest) in the x dimension, then in y, then in z.
Beginning in SLiM 2.6, the values parameter may be a matrix/array with the number of dimensions
appropriate for the declared spatiality of the map; for example, a map with spatiality "xy" would
require a (two-dimensional) matrix, whereas a map with spatiality of "xyz" would require a threedimensional array. (See the Eidos manual for discussion of matrices and arrays.) If a matrix/array
argument is supplied for values, gridSize must either be NULL, or (for backward compatibility) may
match the dimensions of values as they would be given by dim(values). The data in values is
interpreted just as is described above for the vector case: varying first in x, then in y, then in z.
BEWARE: since the values in Eidos matrices and arrays are stored in column-first order (following the
convention established by R), this means that for a map with spatiality "xy" each column of the
values matrix will provide map data as x varies and y remains constant. This will be confusing if you
think of matrix columns as being “x” and matrix rows as being “y”, so try not to think that way; the
opposite is true. This behavior is actually simple, self-consistent, and backward-compatible; if you
before created a spatial map with a vector values before and a gridSize of c(x, y) specifying the
dimensions of that vector, you can now supply matrix(values, nrow=x) for values to get exactly
the same spatial map, and you can still supply the same value of c(x, y) for gridSize if you wish
(or you may supply NULL). If, however, you are looking at a matrix as printed in the Eidos console,
and want that matrix to be used as a spatial map in SLiM in the same orientation, you should use the
transpose of the matrix, as supplied by the t() function. Actually, since matrices are printed in the
console with each successive row having a larger index, whereas in Cartesian (x, y) coordinates yvalues increase as you go upward, you may also wish to reverse the order of rows in your matrix prior
to transposing (or the order of columns after transposing), with an expression such as
t(map[(nrow(map)-1):0,]), in order to make the spatial map display in SLiMgui as you expect
(since SLiMgui displays everything in Cartesian coordinates). Apologies if this is confusing; it would
be nice if matrix notation, programming languages, and Descartes all agreed on such things, but they
do not, so be very careful that your spatial maps are oriented as you wish them to be!
Moving on to the other parameters of defineSpatialMap(): if interpolate is F, values across the
spatial map are not interpolated; the value at a given point is equal to the nearest value defined by the
grid of values specified. If interpolate is T, values across the spatial map will be interpolated (using
linear, bilinear, or trilinear interpolation as appropriate) to produce spatially continuous variation in
values. In either case, the corners of the value grid are exactly aligned with the corners of the spatial
boundaries of the subpopulation as specified by setSpatialBoundary(), and the value grid is then
stretched across the spatial extent of the subpopulation in such a manner as to produce equal spacing
between the values along each dimension. The setting of interpolation only affects how values
between these grid points are calculated: by nearest-neighbor, or by linear interpolation. Interpolation
of spatial maps with periodic boundaries is not handled specially; to ensure that the edges of a
periodic spatial map join smoothly, simply ensure that the grid values at the edges of the map are
identical, since they will be coincident after periodic wrapping.
The valueRange and colors parameters travel together; either both are unspecified, or both are
specified. They control how map values will be transformed into colors, by SLiMgui and by the
spatialMapColor() method. The valueRange parameter establishes the color-mapped range of
spatial map values, as a vector of length two specifying a minimum and maximum; this does not need
to match the actual range of values in the map. The colors parameter then establishes the
corresponding colors for values within the interval defined by valueRange: values less than or equal
to valueRange[0] will map to colors[0], values greater than or equal to valueRange[1] will map
to the last colors value, and intermediate values will shade continuously through the specified vector
of colors, with interpolation between adjacent colors to produce a continuous spectrum. This is much
simpler than it sounds in this description; see the recipes in chapter 14 for an illustration of its use.
Note that at present, SLiMgui will only display spatial maps of spatiality "x", "y", or "xy"; the colormapping parameters will simply be ignored by SLiMgui for other spatiality values (even if the spatiality
is a superset of these values; SLiMgui will not attempt to display an "xyz" spatial map, for example,
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

461

since it has no way to choose which 2D slice through the xyz space it ought to display). The
spatialMapColor() method will return translated color strings for any spatial map, however, even if
SLiMgui is unable to display the spatial map. If there are multiple spatial maps with color-mapping
parameters defined, SLiMgui will choose just one for display; it will prefer an "xy" map if one is
available, but beyond that heuristic its choice will be arbitrary.
– (+)getValue(string$ key)

Returns the value previously set for the dictionary entry identifier key using setValue(), or NULL if
no value has been set. This dictionary-style functionality is actually provided by the superclass of
Subpopulation, SLiMEidosDictionary, although that fact is not presently visible in Eidos since
superclasses are not introspectable.
– (void)outputMSSample(integer$ sampleSize, [logical$ replace = T],
[string$ requestedSex = "*"], [Ns$ filePath = NULL], [logical$ append = F],
[logical$ filterMonomorphic = F])

Output a random sample from the subpopulation in MS format (see section 23.2.2 for output format
details). Positions in the output will span the interval [0,1]. A sample of genomes (not entire
individuals, note) of size sampleSize from the subpopulation will be output. The sample may be
done either with or without replacement, as specified by replace; the default is to sample with
replacement. A particular sex of individuals may be requested for the sample, for simulations in
which sex is enabled, by passing "M" or "F" for requestedSex; passing "*", the default, indicates
that genomes from individuals should be selected randomly, without respect to sex. If the sampling
options provided by this method are not adequate, see the outputMS() method of Genome for a more
flexible low-level option.
If the optional parameter filePath is NULL (the default), output will be sent to Eidos’s output stream
(see section 4.2.1). Otherwise, output will be sent to the filesystem path specified by filePath,
overwriting that file if append if F, or appending to the end of it if append is T.
If filterMonomorphic is F (the default), all mutations that are present in the sample will be included
in the output. This means that some mutations may be included that are actually monomorphic within
the sample (i.e., that exist in every sampled genome, and are thus apparently fixed). These may be
filtered out with filterMonomorphic = T if desired; note that this option means that some mutations
that do exist in the sampled genomes might not be included in the output, simply because they exist
in every sampled genome.
See outputSample() and outputVCFSample() for other output formats. Output is generally done in
a late() event, so that the output reflects the state of the simulation at the end of a generation.
– (void)outputSample(integer$ sampleSize, [logical$ replace = T],
[string$ requestedSex = "*"], [Ns$ filePath = NULL], [logical$ append = F])

Output a random sample from the subpopulation in SLiM’s native format (see section 23.2.1 for output
format details). A sample of genomes (not entire individuals, note) of size sampleSize from the
subpopulation will be output. The sample may be done either with or without replacement, as
specified by replace; the default is to sample with replacement. A particular sex of individuals may
be requested for the sample, for simulations in which sex is enabled, by passing "M" or "F" for
requestedSex; passing "*", the default, indicates that genomes from individuals should be selected
randomly, without respect to sex. If the sampling options provided by this method are not adequate,
see the output() method of Genome for a more flexible low-level option.
If the optional parameter filePath is NULL (the default), output will be sent to Eidos’s output stream
(see section 4.2.1). Otherwise, output will be sent to the filesystem path specified by filePath,
overwriting that file if append if F, or appending to the end of it if append is T.
See outputMSSample() and outputVCFSample() for other output formats. Output is generally done
in a late() event, so that the output reflects the state of the simulation at the end of a generation.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

462

– (void)outputVCFSample(integer$ sampleSize, [logical$ replace = T],
[string$ requestedSex = "*"], [logical$ outputMultiallelics = T],
[Ns$ filePath = NULL], [logical$ append = F])

Output a random sample from the subpopulation in VCF format (see section 23.2.3 for output format
details). A sample of individuals (not genomes, note – unlike the outputSample() and
outputMSSample() methods) of size sampleSize from the subpopulation will be output. The sample
may be done either with or without replacement, as specified by replace; the default is to sample
with replacement. A particular sex of individuals may be requested for the sample, for simulations in
which sex is enabled, by passing "M" or "F" for requestedSex; passing "*", the default, indicates
that genomes from individuals should be selected randomly, without respect to sex. If the sampling
options provided by this method are not adequate, see the outputVCF() method of Genome for a
more flexible low-level option.
In SLiM, it is often possible for a single individual to have multiple mutations at a given base position.
Because the VCF format is an explicit-nucleotide format, this property of SLiM does not fit well into
VCF. Since there are only four possible nucleotides at a given base position in VCF, at most one
“reference” state and three “alternate” states could be represented at that base position. SLiM, on the
other hand, can represent any number of alternative possibilities at a given base; in general, if N
different mutations are segregating at a given position, there are 2N different allelic states at that
position in SLiM. For this reason, SLiM does not attempt to represent multiple mutations at a single
site as being alternative alleles in a single output line, as is typical in VCF format. Instead, SLiM
produces a separate line of VCF output for each segregating mutation at a given position. SLiM always
declares base positions as having a “reference base” of A (representing the state in individuals that do
not carry a given mutation) and an “alternate base” of T (representing the state in individuals that do
carry the given mutation). Multiallelic positions will thus produce VCF output showing multiple A-to-T
changes at the same position, possessed by different but possibly overlapping sets of individuals.
Many programs that process VCF output may not behave correctly with this style of output. SLiM
therefore provides a choice, using the outputMultiallelics flag; if that flag is T (the default), SLiM
will produce multiple lines of output for multiallelic base positions, but will mark those lines with a
MULTIALLELIC flag in the INFO field of the VCF output so that those lines can be filtered or processed
in a special manner. If outputMultiallelics is F, on the other hand, SLiM will completely suppress
output of all mutations at multiallelic sites – often the simplest option, if doing so does not lead to bias
in the subsequent analysis. This flag has no effect upon the output of sites with only a single mutation
present. Assessment of whether a site is multiallelic is done only within the sample; segregating
mutations that are not part of the sample are ignored.
If the optional parameter filePath is NULL (the default), output will be sent to Eidos’s output stream
(see section 4.2.1). Otherwise, output will be sent to the filesystem path specified by filePath,
overwriting that file if append if F, or appending to the end of it if append is T.
See outputMSSample() and outputSample() for other output formats. Output is generally done in a
late() event, so that the output reflects the state of the simulation at the end of a generation.
– (logical)pointInBounds(float point)

Returns T if point is inside the spatial boundaries of the subpopulation, F otherwise. For example, for
a simulation with "xy" dimensionality, if point contains exactly two values constituting an (x,y) point,
the result will be T if and only if ((point[0]>=x0) & (point[0]<=x1) & (point[1]>=y0) &
(point[1]<=y1)) given spatial bounds (x0, y0, x1, y1). This method is useful for implementing
absorbing or reprising boundary conditions. This may only be called in simulations for which
continuous space has been enabled with initializeSLiMOptions().
The length of point must be an exact multiple of the dimensionality of the simulation; in other words,
point may contain values comprising more than one point. In this case, a logical vector will be
returned in which each element is T if the corresponding point in point is inside the spatial
boundaries of the subpopulation, F otherwise.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

463

– (float)pointPeriodic(float point)

Returns a revised version of point that has been brought inside the periodic spatial boundaries of the
subpopulation (as specified by the periodicity parameter of initializeSLiMOptions()) by
wrapping around periodic spatial boundaries. In brief, if a coordinate of point lies beyond a periodic
spatial boundary, that coordinate is wrapped around the boundary, so that it lies inside the spatial
extent by the same magnitude that it previously lay outside, but on the opposite side of the space; in
effect, the two edges of the periodic spatial boundary are seamlessly joined. This is done iteratively
until all coordinates lie inside the subpopulation’s periodic boundaries. Note that non-periodic spatial
boundaries are not enforced by this method; they should be enforced using pointReflected(),
pointStopped(), or some other means of enforcing boundary constraints (which can be used after
pointPeriodic() to bring the remaining coordinates into bounds; coordinates already brought into
bounds by pointPeriodic() will be unaffected by those calls). This method is useful for
implementing periodic boundary conditions. This may only be called in simulations for which
continuous space and at least one periodic spatial dimension have been enabled with
initializeSLiMOptions().
The length of point must be an exact multiple of the dimensionality of the simulation; in other words,
point may contain values comprising more than one point. In this case, each point will be processed
as described above and a new vector containing all of the processed points will be returned.
– (float)pointReflected(float point)

Returns a revised version of point that has been brought inside the spatial boundaries of the
subpopulation by reflection. In brief, if a coordinate of point lies beyond a spatial boundary, that
coordinate is reflected across the boundary, so that it lies inside the boundary by the same magnitude
that it previously lay outside the boundary. This is done iteratively until all coordinates lie inside the
subpopulation’s boundaries. This method is useful for implementing reflecting boundary conditions.
This may only be called in simulations for which continuous space has been enabled with
initializeSLiMOptions().
The length of point must be an exact multiple of the dimensionality of the simulation; in other words,
point may contain values comprising more than one point. In this case, each point will be processed
as described above and a new vector containing all of the processed points will be returned.
– (float)pointStopped(float point)

Returns a revised version of point that has been brought inside the spatial boundaries of the
subpopulation by clamping. In brief, if a coordinate of point lies beyond a spatial boundary, that
coordinate is set to exactly the position of the boundary, so that it lies on the edge of the spatial
boundary. This method is useful for implementing stopping boundary conditions. This may only be
called in simulations for which continuous space has been enabled with initializeSLiMOptions().
The length of point must be an exact multiple of the dimensionality of the simulation; in other words,
point may contain values comprising more than one point. In this case, each point will be processed
as described above and a new vector containing all of the processed points will be returned.
– (float)pointUniform([integer$ n = 1])

Returns a new point (or points, for n > 1) generated from uniform draws for each coordinate, within
the spatial boundaries of the subpopulation. The returned vector will contain n points, each
comprised of a number of coordinates equal to the dimensionality of the simulation, so it will be of
total length n*dimensionality. This may only be called in simulations for which continuous space has
been enabled with initializeSLiMOptions().
– (void)removeSubpopulation(void)

Removes this subpopulation from the model. The subpopulation is immediately removed from the list
of active subpopulations, and the symbol representing the subpopulation is undefined. The
subpopulation object itself remains unchanged until children are next generated (at which point it is
deallocated), but it is no longer part of the simulation and should not be used.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

464

Note that this method is only for use in nonWF models, in which there is a distinction between a
subpopulation being empty and a subpopulation being removed from the simulation; an empty
subpopulation may be re-colonized by migrants, whereas as a removed subpopulation no longer
exists at all. WF models do not make this distinction; when a subpopulation is empty it is
automatically removed. WF models should therefore call setSubpopulationSize(0) instead of this
method; setSubpopulationSize() is the standard way for WF models to change the subpopulation
size, including to a size of 0.
– (object)sampleIndividuals(integer$ size, [logical$ replace = F],
[No$ exclude = NULL], [Ns$ sex = NULL],[Ni$ tag = NULL],
[Ni$ minAge = NULL], [Ni$ maxAge = NULL], [Nl$ migrant = NULL])

Returns a vector of individuals, of size less than or equal to parameter size, sampled from the
individuals in the target subpopulation. Sampling is done without replacement if replace is F (the
default), or with replacement if replace is T. The remaining parameters specify constraints upon the
pool of individuals that will be considered candidates for the sampling. Parameter exclude, if nonNULL, may specify a specific individual that should not be considered a candidate (typically the focal
individual in some operation). Parameter sex, if non-NULL, may specify a sex ("M" or "F") for the
individuals to be drawn, in sexual models. Parameter tag, if non-NULL, may specify a tag value for the
individuals to be drawn; only individuals whose tag property matches this value will be candidates.
Parameters minAge and maxAge, if non-NULL, may specify a minimum or maximum age for the
individuals to be drawn, in nonWF models. Parameter migrant, if non-NULL, may specify a required
value for the migrant property of the individuals to be drawn (so T will require that individuals be
migrants, F will require that they not be). If the candidate pool is smaller than the requested sample
size, all eligible candidates will be returned (in randomized order); the result will be a zero-length
vector if no eligible candidates exist (unlike sample()).
This method is similar to getting the individuals property of the subpopulation, using operator [] to
select only individuals with the desired properties, and then using sample() to sample from that
candidate pool. However, besides being much simpler than the equivalent Eidos code, it is also much
faster, and it does not fail if less than the full sample size is available. See subsetIndividuals() for
a similar method that returns a full subset, rather than a sample.
– (void)setCloningRate(numeric rate)

Set the cloning rate of this subpopulation. The rate is changed to rate, which should be between 0.0
and 1.0, inclusive. Clonal reproduction can be enabled in both non-sexual (i.e. hermaphroditic) and
sexual simulations. In non-sexual simulations, rate must be a singleton value representing the overall
clonal reproduction rate for the subpopulation. In sexual simulations, rate may be either a singleton
(specifying the clonal reproduction rate for both sexes) or a vector containing two numeric values (the
female and male cloning rates specified separately, at indices 0 and 1 respectively). During mating
and offspring generation, the probability that any given offspring individual will be generated by
cloning – by asexual reproduction without gametes or meiosis – will be equal to the cloning rate (for
its sex, in sexual simulations) set in the parental (not the offspring!) subpopulation.
– (void)setMigrationRates(io sourceSubpops, numeric rates)

Set the migration rates to this subpopulation from the subpopulations in sourceSubpops to the
corresponding rates specified in rates; in other words, rates gives the expected fractions of the
children in this subpopulation that will subsequently be generated from parents in the subpopulations
sourceSubpops (see section 19.2.1). This method will only set the migration fractions from the
subpopulations given; migration rates from other subpopulations will be left unchanged (explicitly set
a zero rate to turn off migration from a given subpopulation). The type of sourceSubpops may be
either integer, specifying subpopulations by identifier, or object, specifying subpopulations directly.
– (void)setSelfingRate(numeric$ rate)

Set the selfing rate of this subpopulation. The rate is changed to rate, which should be between 0.0
and 1.0, inclusive. Selfing can only be enabled in non-sexual (i.e. hermaphroditic) simulations.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

465

During mating and offspring generation, the probability that any given offspring individual will be
generated by selfing – by self-fertilization via gametes produced by meiosis by a single parent – will be
equal to the selfing rate set in the parental (not the offspring!) subpopulation.
– (void)setSexRatio(float$ sexRatio)

Set the sex ratio of this subpopulation to sexRatio. As defined in SLiM, this is actually the fraction of
the subpopulation that is male; in other words, the M:(M+F) ratio. This will take effect when children
are next generated; it does not change the current subpopulation state. Unlike the selfing rate, the
cloning rate, and migration rates, the sex ratio is deterministic: SLiM will generate offspring that
exactly satisfy the requested sex ratio (within integer roundoff limits). See section 19.2.1 for further
details.
– (void)setSpatialBounds(float bounds)

Set the spatial boundaries of the subpopulation to bounds. This method may be called only for
simulations in which continuous space has been enabled with initializeSLiMOptions(). The
length of bounds must be double the spatial dimensionality, so that it supplies both minimum and
maximum values for each coordinate. More specifically, for a dimensionality of "x", bounds should
supply (x0, x1) values; for dimensionality "xy" it should supply (x0, y0, x1, y1) values; and for
dimensionality "xyz" it should supply (x0, y0, z0, x1, y1, z1) (in that order). These
boundaries will be used by SLiMgui to calibrate the display of the subpopulation, and will be used by
methods such as pointInBounds(), pointReflected(), pointStopped(), and pointUniform().
The default spatial boundaries for all subpopulations span the interval [0,1] in each dimension.
Spatial dimensions that are periodic (as established with the periodicity parameter to
initializeSLiMOptions()) must have a minimum coordinate value of 0.0 (a restriction that allows
the handling of periodicity to be somewhat more efficient). The current spatial bounds for the
subpopulation may be obtained through the spatialBounds property.
– (void)setSubpopulationSize(integer$ size)

Set the size of this subpopulation to size individuals. This will take effect when children are next
generated; it does not change the current subpopulation state. Setting a subpopulation to a size of 0
does have some immediate effects that serve to disconnect it from the simulation: the subpopulation is
removed from the list of active subpopulations, the subpopulation is removed as a source of migration
for all other subpopulations, and the symbol representing the subpopulation is undefined. In this
case, the subpopulation itself remains unchanged until children are next generated (at which point it is
deallocated), but it is no longer part of the simulation and should not be used.
– (void)setValue(string$ key, + value)

Sets a value for the dictionary entry identifier key. The value, which may be of any type other than
object, can be fetched later using getValue(). This dictionary-style functionality is actually
provided by the superclass of Subpopulation, SLiMEidosDictionary, although that fact is not
presently visible in Eidos since superclasses are not introspectable.
– (string)spatialMapColor(string$ name, float value)

Looks up the spatial map indicated by name, and uses its color-translation machinery (as defined by
the valueRange and colors parameters to defineSpatialMap()) to translate each element of value
into a corresponding color string. If the spatial map does not have color-translation capabilities, an
error will result. See the documentation for defineSpatialMap() for information regarding the
details of color translation. See the Eidos manual for further information on color strings.
– (float$)spatialMapValue(string$ name, float point)

Looks up the spatial map indicated by name, and uses its mapping machinery (as defined by the
gridSize, values, and interpolate parameters to defineSpatialMap()) to translate the
coordinates of point into a corresponding map value. The length of point must be equal to the
spatiality of the spatial map; in other words, for a spatial map with spatiality "xz", point must be of
length 2, specifying the x and z coordinates of the point to be evaluated. Interpolation will
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

466

automatically be used if it was enabled for the spatial map. Point coordinates are clamped into the
range defined by the spatial boundaries, even if the spatial boundaries are periodic; use
pointPeriodic() to wrap the point coordinates first if desired. See the documentation for
defineSpatialMap() for information regarding the details of value mapping.
– (object)subsetIndividuals([No$ exclude = NULL],
[Ns$ sex = NULL],[Ni$ tag = NULL], [Ni$ minAge = NULL], [Ni$ maxAge = NULL],
[Nl$ migrant = NULL])

Returns a vector of individuals subset from the individuals in the target subpopulation. The parameters
specify constraints upon the subset of individuals that will be returned. Parameter exclude, if nonNULL, may specify a specific individual that should not be included (typically the focal individual in
some operation). Parameter sex, if non-NULL, may specify a sex ("M" or "F") for the individuals to be
returned, in sexual models. Parameter tag, if non-NULL, may specify a tag value for the individuals to
be returned; only individuals whose tag property matches this value will be returned. Parameters
minAge and maxAge, if non-NULL, may specify a minimum or maximum age for the individuals to be
returned, in nonWF models. Parameter migrant, if non-NULL, may specify a required value for the
migrant property of the individuals to be returned (so T will require that individuals be migrants, F
will require that they not be).
This method is shorthand for getting the individuals property of the subpopulation, and then using
operator [] to select only individuals with the desired properties; besides being much simpler than the
equivalent Eidos code, it is also much faster. See sampleIndividuals() for a similar method that
returns a sample taken from a chosen subset of individuals.
– (void)takeMigrants(object migrants)

Immediately moves the individuals in migrants to the target subpopulation (removing them from
their previous subpopulation). Individuals in migrants that are already in the target subpopulation
are unaffected. Note that the indices and order of individuals and genomes in both the target and
source subpopulations will change unpredictably as a side effect of this method.
Note that this method is only for use in nonWF models, in which migration is managed manually by
the model script. In WF models, migration is managed automatically by the SLiM core based upon
the migration rates set for each subpopulation with setMigrationRates().

21.14 Class Substitution
This class represents a mutation that has been fixed; Mutation objects are converted to
Substitution objects upon fixation. Its properties are thus very similar to those of Mutation.
Section 1.5.2 presents an overview of the conceptual role of this class.
Although Substitution has a tag property, like most SLiM classes, an associated value for
Substitution objects may also be kept in the subpopID property (see section 21.8).
21.14.1 Substitution properties
id => (integer$)

The identifier for this mutation. Each mutation created during a run receives an immutable identifier
that will be unique across the duration of the run, and that identifier is carried over to the
Substitution object when the mutation fixes.
fixationGeneration => (integer$)

The generation in which this mutation fixed.
mutationType => (object$)

The MutationType from which this mutation was drawn.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

467

originGeneration => (integer$)

The generation in which this mutation arose.
position => (integer$)

The position in the chromosome of this mutation.
selectionCoeff => (float$)

The selection coefficient of the mutation, drawn from the distribution of fitness effects of its
MutationType.
subpopID <–> (integer$)

The identifier of the subpopulation in which this mutation arose. This value is carried over from the
Mutation object directly; if a “tag” value was used in the Mutation object (see section 21.8.1), that
value will carry over to the corresponding Substitution object. The subpopID in Substitution is
a read-write property to allow it to be used as a “tag” in the same way, if the origin subpopulation
identifier is not needed.
tag <–> (integer$)

A user-defined integer value. The value of tag is carried over automatically from the original
Mutation object. Apart from that, the value of tag is not used by SLiM; it is free for you to use.

21.14.2 Substitution methods
Since Substitution objects represent fixation events that occurred in the past, they are
relatively immutable. However, since it may be useful to attach (possibly dynamic) state to
substitutions, their tag and subpopID properties are mutable, and they also provide the same
getValue() / setValue() functionality as Mutation. Values set on a Mutation object will carry
over to the corresponding Substitution object automatically upon fixation.
– (+)getValue(string$ key)

Returns the value previously set for the dictionary entry identifier key using setValue(), or NULL if
no value has been set. This dictionary-style functionality is actually provided by the superclass of
Substitution, SLiMEidosDictionary, although that fact is not presently visible in Eidos since
superclasses are not introspectable.
– (void)setValue(string$ key, + value)

Sets a value for the dictionary entry identifier key. The value, which may be of any type other than
object, can be fetched later using getValue(). This dictionary-style functionality is actually
provided by the superclass of Substitution, SLiMEidosDictionary, although that fact is not
presently visible in Eidos since superclasses are not introspectable.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

468

22. Writing Eidos events and callbacks
In the preceding recipes, we have seen many examples of Eidos events and callbacks, but we
have not systematically described their syntax and semantics. Eidos events and callbacks are
typically used in SLiM simulations by defining them in the SLiM input file; they can also be
registered with the simulation dynamically at runtime (see sections 10.5.3 and 16.4, for example).
There are two main ways to use Eidos in the input file. One way is by defining an Eidos event, a
block of Eidos code that is executed during each generation. The other way is by defining an Eidos
callback, a block of code that is called by SLiM in specific circumstances to extend the
functionality of SLiM in particular areas. One type of Eidos callback, the initialize() callback,
was described in section 21.1. The sections below will detail the remaining possibilities.
22.1 Defining Eidos events
An Eidos event is a block of Eidos code that is executed every generation, within a generation
range, to perform a desired task. The syntax of an Eidos event declaration looks like one of these:
[id] [gen1 [: gen2]] { ... }
[id] [gen1 [: gen2]] early() { ... }
[id] [gen1 [: gen2]] late() { ... }

The first two declarations are exactly equivalent, and declare an early() event that executes at
the beginning of the generation cycle; the early() designation is therefore optional. The third
declaration declares a late() event that executes near the end of the generation cycle (see chapter
19 for a discussion of the stages of the generation cycle and the differences between these two
types of events).
The id is an optional identifier like s1 (or more generally, sX, where X is an integer greater than
or equal to 0) that defines an identifier that can be used to refer to the script block. In most
situations it can be omitted, in which case the id is implicitly defined as -1, a placeholder value
that essentially represents the lack of an identifier value. Supplying an id is only useful if you wish
to manipulate your script blocks programmatically (see section 22.8).
Then comes a generation or a range of generations, and then a block of Eidos code enclosed in
braces to form a compound statement. A trivial example might look like this:
1000:5000 {
p1.size = 1000 * sin(sim.generation / 100.0);
}

This would set the size of subpopulation p1 to the result of an expression based on the sin()
function, resulting in a fluctuating subpopulation size. This idea is further developed in the recipe
in section 5.1.4; here, the point is that the Eidos code in the braces {} is executed near the end of
every generation in the specified range of generations. In this case, the generation range is 1000 to
5000, and so the Eidos event will be executed 4001 times. A range of generations can be given, as
in the example above, or a single generation can be given with a single integer:
100 late() {
print("Finished generation 100!");
}

In fact, you can omit specifying a generation altogether, in which case the Eidos event runs
every generation. However, since it takes a little time to set up the Eidos interpreter and interpret a
script, it is advisable to use the narrowest range of generations possible.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

469

The generations specified for a Eidos event block can be any positive integer. All scripts that
apply to a given time point will be run in the order in which they are given; scripts specified higher
in the input file will run before those specified lower. Sometimes it is desirable to have a script
block execute in a generation which is not fixed, but instead depends upon some parameter,
defined constant, or calculation; this may be achieved by rescheduling the script block with the
SLiMSim method rescheduleScriptBlock() (see section 17.2 for an example).
When Eidos events are executed, several global variables are defined by SLiM for use by the
Eidos code. These have been mentioned in previous sections, but here is a summary:
sim
g1, ...
i1, ...
m1, ...
p1, ...
s1, ...
self

A SLiMSim object representing the current SLiM simulation
GenomicElementType objects representing the genomic element types defined
InteractionType objects representing the interaction types defined
MutationType objects representing the mutation types defined
Subpopulation objects representing the subpopulations that exist
SLiMEidosBlock objects representing the named events and callbacks defined
A SLiMEidosBlock object representing the script block currently executing

Note that the sim global is not available in initialize() callbacks, since the simulation has not
yet been initialized (see section 21.1). Similarly, the globals for subpopulations, mutation types,
and genomic element types are only available after the point at which those objects have been
defined by an initialize() callback.
22.2 Defining mutation fitness with a fitness() callback
An Eidos callback is a block of Eidos code that is called by SLiM in specific circumstances, to
allow the customization of particular actions taken by SLiM in running a simulation. Five types of
callbacks are presently supported (in addition to the initialize() callbacks described in section
21.1): fitness() callbacks, discussed here, and mateChoice(), modifyChild(), recombination(),
and interaction() callbacks, discussed in the following sections.
A fitness() callback is called by SLiM when it is determining the fitness effect of a mutation
carried by an individual. Normally, the fitness effect of a mutation is determined by the selection
coefficient of the mutation and the dominance coefficient of the mutation (the latter used only if
the individual is heterozygous for the mutation). More specifically, the standard calculation for the
fitness effect of a mutation takes one of two forms. If the individual is homozygous, then
w = w * (1.0 + selectionCoefficient),
where w is the relative fitness of the individual carrying the mutation. This equation is also used if
the chromosome being simulated has no homologue – when the Y sex chromosome is being
simulated. If the individual is heterozygous, then the dominance coefficient enters the picture as
w = w * (1.0 + dominanceCoeff * selectionCoeff).
For simulations of autosomes, the dominance coefficient is defined by the mutation type; for
simulations of X sex chromosomes, the mutation type’s dominance coefficient is used for XX
females that are heterozygous, whereas XY males that are “heterozygous” for the mutation because
they possess only one X chromosome use a global dominance coefficient (see initializeSex(),
section 21.1, and the dominanceCoeffX property of SLiMSim, section 21.12.1).
That is the standard behavior of SLiM, reviewed here to provide a conceptual baseline.
Supplying a fitness() callback allows you to substitute any calculation you wish for the relative
fitness effect of a mutation; the new relative fitness effect computation becomes
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

470

w = w * fitness()
where fitness() is the value returned by your callback. This value is a relative fitness value, so
1.0 is neutral, unlike the selection coefficient scale, where 0.0 is neutral; be careful with this
distinction! Like Eidos events, fitness() callbacks are defined as script blocks in the input file,
but they use a variation of the syntax for defining a Eidos event:
[id] [gen1 [: gen2]] fitness( [, ]) { ... }

For example, if the callback were defined as:
1000:2000 fitness(m2, p3) { 1.0; }

then a relative fitness of 1.0 (i.e. neutral) would be used for all mutations of mutation type m2 in
subpopulation p3 from generation 1000 to generation 2000. The very same mutations, if also
present in individuals in other subpopulations, would preserve their normal selection coefficient
and dominance coefficient in those other subpopulations; this callback would therefore establish
spatial heterogeneity in selection, in which mutation type m2 was neutral in subpopulation p3 but
under selection in other subpopulations, for the range of generations given (see the recipe in
section 9.2 for a fuller explication of this idea).
In addition to the SLiM globals listed in section 22.1, a fitness() callback is supplied with
some additional information passed through global variables:
mut
homozygous
relFitness
individual
genome1
genome2
subpop

A Mutation object, the mutation whose relative fitness is being evaluated
A value of T (the mutation is homozygous), F (heterozygous), or NULL (it is
paired with a null chromosome, which can occur with sex chromosomes)
The default relative fitness value calculated by SLiM
The individual carrying this mutation (an object of class Individual)
One genome of the individual carrying this mutation
The other genome of that individual
The subpopulation in which that individual lives

These globals may be used in the fitness() callback to compute a fitness value. To implement
the standard fitness functions used by SLiM for an autosomal simulation, for example, you could
do something like this:
fitness(m1) {
if (homozygous)
return 1.0 + mut.selectionCoeff;
else
return 1.0 + mut.mutationType.dominanceCoeff * mut.selectionCoeff;
}

As mentioned above, a relative fitness of 1.0 is neutral (whereas a selection coefficient of 0.0 is
neutral); the 1.0 + in these calculations converts between the selection coefficient scale and the
relative fitness scale, and is therefore essential. However, the relFitness global variable
mentioned above would already contain this value, precomputed by SLiM, so you could simply
return relFitness to get that behavior when you want it:
fitness(m1) {
if ()
;
else

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

471

return relFitness;
}

This would return a modified fitness value in certain conditions, but would return the standard
fitness value otherwise.
More than one fitness() callback may be defined to operate in the same generation. As with
Eidos events, multiple callbacks will be called in the order in which they were defined in the input
file. Furthermore, each callback will be given the relFitness value returned by the previous
callback – so the value of relFitness is not necessarily the default value, in fact, but is the result
of all previous fitness() callbacks for that individual in that generation. In this way, the effects of
multiple callbacks can “stack”.
In SLiM version 2.3 and later, it is possible to define global fitness() callbacks, which are
applied exactly once to every individual (within a given subpopulation, if the fitness() callback
is declared to be limited to one subpopulation, as usual). Global fitness() callbacks do not
reference a particular mutation type, and are not called in reference to any specific mutation in the
individual; instead, they provide an opportunity for the model script to define fitness effects that
are independent of specific mutations (although their fitness effects may still depend upon some
aggregate genetic state). For example, they are useful for defining the fitness effect of an
individual’s overall phenotype (perhaps determined by multiple loci, and perhaps by
developmental noise, phenotypic plasticity, etc.), or for defining the fitness effects of behavioral
interactions between individuals such as competition or altruism. A global fitness() callback is
defined by giving NULL as the mutation type identifier in the callback’s declaration. These callbacks
will generally be called once per individual in each generation, in an order that is formally
undefined, as described in detail in section 19.6. When a global fitness() callback is running,
the mut and homozygous variables are defined to be NULL (since there is no focal mutation), and
relFitness is defined to be 1.0. The fitness effect for the callback is simply returned as a singleton
float value, as usual. Examples of global fitness() callbacks can be found in the recipes of
sections 13.1, 13.3, 13.10, 14.2, 14.4, and 14.5 (and perhaps others).
Beginning in SLiM 3.0, it is also possible to set the fitnessScaling property on a subpopulation
to scale the fitness values of every individual in the subpopulation by the same constant amount,
or to set the fitnessScaling property on an individual to scale the fitness value of that specific
individual. These scaling factors are multiplied together with all other fitness effects for an
individual to produce the individual’s final fitness value. The fitnessScaling properties of
Subpopulation and Individual can often provide similar functionality to fitness(NULL) callbacks
with greater efficiency and simplicity. They are reset to 1.0 in every generation, immediately after
fitness values are calculated, so they only need to be set when a value other than 1.0 is desired.
One caveat to be aware of is that fitness() callbacks are called at the end of each generation,
just before the next generation begins. If you have a fitness() callback defined for generation 10,
for example, it will actually be called at the very end of generation 10, after child generation has
finished, after the new children have been promoted to be the next parental generation, and after
late() events have been executed. The fitness values calculated will thus be used during
generation 11; the fitness values used in generation 10 were calculated at the end of generation 9.
(This is primarily so that SLiMgui, which refreshes its display in between generations, has
computed fitness values at hand that it can use to display the new parental individuals in the
proper colors.)
Many other possibilities can be implemented with a fitness() callback. For example, one
could implement epistatic interactions by checking the genomes provided to see whether they
contain the other mutations involved in the epistasis (section 9.3.1); one could implement negative
frequency-dependent selection (balancing selection) by checking the frequency of the mutation in
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

472

the subpopulation (section 9.4.1); one could implement a polygenic fitness calculation by
counting how many mutations of a given mutation type were present in the genome of the
individual (section 9.3.2); or one could implement spatial variation in the fitness of heterozygotes
by varying the dominance coefficient depending upon the subpopulation (similar to section 9.2).
The fitness() callback mechanism is thus extremely powerful and flexible. However, since
fitness() callbacks involve Eidos code being executed for the evaluation of fitness of every
mutation of every individual (within the generation range, mutation type, and subpopulation
specified), they can slow down a simulation considerably, so use them as sparingly as possible.
22.3 Defining mate choice with a mateChoice() callback
Normally, a SLiM simulation defines mate choice according to fitness; individuals of higher
fitness are more likely to be chosen as mates. However, one might wish to simulate more complex
mate-choice dynamics such as assortative or disassortative mating, mate search algorithms, and so
forth. Such dynamics can be handled in SLiM with the mateChoice() callback mechanism.
A mateChoice() callback is established in the input file with a syntax very similar to that of
fitness() callbacks (section 22.2):
[id] [gen1 [: gen2]] mateChoice([]) { ... }

The only difference between the two is that the mateChoice() callback does not allow you to
specify a mutation type to which the callback applies, since that makes no sense.
Note that if a subpopulation is given to which the mateChoice() callback is to apply, the
callback is used for all matings that will generate a child in the stated subpopulation (as opposed to
all matings of parents in the stated subpopulation); this distinction is important when migration
causes children in one subpopulation to be generated by matings of parents in a different
subpopulation.
When a mateChoice() callback is defined, the first parent in a mating is still chosen
proportionally according to fitness (if you wish to influence that choice, you can use a fitness()
callback; see section 22.2). In a sexual (rather than hermaphroditic) simulation, this will be the
female parent; SLiM does not currently support males as the choosy sex. The second parent – the
male parent, in a sexual simulation – will then be chosen based upon the results of the
mateChoice() callback.
More specifically, the callback must return a vector of weights, one for each individual in the
subpopulation; SLiM will then choose a parent with probability proportional to weight. The
mateChoice() callback could therefore modify or replace the standard fitness-based weights
depending upon some other criterion such as assortativeness. A singleton vector of type
Individual may be returned instead of a weights vector to indicate that that specific individual has
been chosen as the mate (beginning in SLiM 2.3); this could also be achieved by returned a vector
of weights in which the chosen mate has a non-zero weight and all other weights are zero, but
returning the chosen individual instead is much more efficient. A zero-length return vector – as
generated by float(0), for example – indicates that a suitable mate was not found; in that event, a
new first parent will be drawn from the subpopulation. Finally, if the callback returns NULL, that
signifies that SLiM should use the standard fitness-based weights to choose a mate; the
mateChoice() callback did not wish to alter the standard behavior for the current mating (this is
equivalent to returning the unmodified vector of weights, but returning NULL is much faster since it
allows SLiM to drop into an optimized case). Apart from the special cases described above – a
singleton Individual, float(0), and NULL – the returned vector of weights must contain the same
number of values as the size of the subpopulation, and all weights must be non-negative. Note
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

473

that the vector of weights is not required to sum to 1, however; SLiM will convert relative weights
on any scale to probabilities for you.
If the sum of the returned weights vector is zero, SLiM treats it as meaning the same thing as a
return of float(0) – a suitable mate could not be found, and a new first parent will thus be drawn.
(This is a change in policy beginning in SLiM 2.3; prior to that, returning a vector of sum zero was
considered a runtime error.) There is a subtle difference in semantics between this and a return of
float(0): returning float(0) immediately short-circuits mate choice for the current first parent,
whereas returning a vector of zeros allows further applicable mateChoice() callbacks to be called,
one of which might “rescue” the first parent by returning a non-zero weights vector or an
individual. In most models this distinction is irrelevant, since chaining mateChoice() callbacks is
uncommon (see section 22.8). When the choice is otherwise unimportant, returning float(0) will
be handled more quickly by SLiM; but if a model is constructing a vector of weights anyway,
checking for sum(...) == 0 in order to return float(0) if the weights all happen to be zero is
complicated and slow – which is why this policy was changed.
In addition to the SLiM globals listed in section 22.1, a mateChoice() callback is supplied with
some additional information passed through global variables:
individual
genome1
genome2
subpop
sourceSubpop
weights

The parent already chosen (the female, in sexual simulations)
One genome of the parent already chosen
The other genome of the parent already chosen
The subpopulation into which the offspring will be placed
The subpopulation from which the parents are being chosen
The standard fitness-based weights for all individuals

If sex is enabled, the mateChoice() callback must ensure that the appropriate weights are zero
and nonzero to guarantee that all eligible mates are male (since the first parent chosen is always
female, as explained above). In other words, weights for females must be 0. The weights vector
given to the callback is guaranteed to satisfy this constraint. If sex is not enabled – in a
hermaphroditic simulation, in other words – this constraint does not apply.
For example, a simple mateChoice() callback might look like this:
1000:2000 mateChoice(p2) {
return weights ^ 2;
}

This defines a mateChoice() callback for generations 1000 to 2000 for subpopulation p2. The
callback simply transforms the standard fitness-based probabilities by squaring them. Code like
this could represent a situation in which fitness and mate choice proceed normally in one
subpopulation (p1, here, presumably), but are altered by the effects of a social dominance
hierarchy or male-male competition in another subpopulation (p2, here), such that the highestfitness individuals tend to be chosen as mates more often than their (perhaps survival-based) fitness
values would otherwise suggest. Note that by basing the returned weights on the weights vector
supplied by SLiM, the requirement that females be given weights of 0 is finessed; in other
situations, care would need to be taken to ensure that.
More than one mateChoice() callback may be defined to operate in the same generation. As
with Eidos events, multiple callbacks will be called in the order in which they were defined.
Furthermore, each callback will be given the weights vector returned by the previous callback – so
the value of weights is not necessarily the default fitness-based weights, in fact, but is the result of
all previous weights() callbacks for the current mate-choice event. In this way, the effects of
multiple callbacks can “stack”. If any mateChoice() callback returns float(0), however –
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

474

indicating that no eligible mates exist, as described above – then the remainder of the callback
chain will be short-circuited and a new first parent will immediately be chosen.
Note that matings in SLiM do not proceed in random order. Offspring are generated for each
subpopulation in turn, and within each subpopulation the order of offspring generation is also
non-random with respect to both the source subpopulation and the sex of the offspring. It is
important, therefore, that mateChoice() callbacks are not in any way biased by the offspring
generation order; they should not treat matings early in the process any differently than matings
late in the process. Any failure to guarantee such invariance could lead to large biases in the
simulation outcome. In particular, it is usually dangerous to activate or deactivate mateChoice()
callbacks while offspring generation is in progress.
A wide variety of mate choice algorithms can easily be implemented with mateChoice()
callbacks. For example, mating could be assortative, based upon some type of genetic similarity
(section 11.1), or a sequential mate search could be conducted with some probability of failing to
find a mate at all if the female is too choosy (section 11.2).
22.4 Defining child generation with a modifyChild() callback
Normally, a SLiM simulation defines child generation with its rules regarding selfing versus
crossing, recombination, mutation, and so forth. However, one might wish to modify these rules
in particular circumstances – by preventing particular children from being generated, by modifying
the generated children in particular ways, or by generating children oneself. All of these dynamics
can be handled in SLiM with the modifyChild() callback mechanism.
A modifyChild() callback is established in the input file with a syntax very similar to that of
other callbacks:
[id] [gen1 [: gen2]] modifyChild([]) { ... }

The modifyChild() callback may optionally be restricted to the children generated to occupy a
specified subpopulation.
When a modifyChild() callback is called, a parent or parents have already been chosen, and a
candidate child has already been generated. The genomes of the parent or parents are provided to
the callback, as is the genome of the generated child. The callback may accept the generated
child, modify it, substitute completely different genomic information for it, or reject it (causing a
new parent or parents to be selected and a new child to be generated, which will again be passed
to the callback).
In addition to the SLiM globals listed in section 22.1, a modifyChild() callback is supplied with
additional information passed through global variables:
child
childGenome1
childGenome2
childIsFemale
parent1
parent1Genome1
parent1Genome2
isCloning
isSelfing
parent2
parent2Genome1
parent2Genome2

The generated child (an object of class Individual)
One genome of the generated child
The other genome of the generated child
T if the child will be female, F if male (defined only if sex is enabled)
The first parent (an object of class Individual)
One genome of the first parent
The other genome of the first parent
T if the child is the result of cloning
T if the child is the result of selfing (but see note below)
The second parent (an object of class Individual)
One genome of the second parent
The other genome of the second parent

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

475

subpop
sourceSubpop

The subpopulation in which the child will live
The subpopulation of the parents (==subpop if not a migration mating)

These globals may be used in the modifyChild() callback to decide upon a course of action.
The childGenome1 and childGenome2 variables may be modified by the callback; whatever
mutations they contain on exit will be used for the new child. Alternatively, they may be left
unmodified (to accept the generated child as is). These variables may be thought of as the two
gametes that will fuse to produce the fertilized egg that results in a new offspring; childGenome1 is
the gamete contributed by the first parent (the female, if sex is turned on), and childGenome2 is the
gamete contributed by the second parent (the male, if sex is turned on).
Importantly, a logical singleton return value is required from modifyChild() callbacks.
Normally this should be T, indicating that generation of the child may proceed (with whatever
modifications might have been made to the child’s genomes). A return value of F indicates that
generation of this child should not continue; this will cause new parent(s) to be drawn, a new child
to be generated, and a new call to the modifyChild() callback. A modifyChild() callback that
always returns F can cause SLiM to hang, so be careful that it is guaranteed that your callback has
a nonzero probability of returning T for every state your simulation can reach.
Note that isSelfing is T only when a mating was explicitly set up to be a selfing event by SLiM;
an individual may also mate with itself by chance (by drawing itself as a mate) even when SLiM
did not explicitly set up a selfing event, which one might term de facto selfing. If you need to
know whether a mating event was a de facto selfing event, you can compare the parents; selffertilization will always entail parent1==parent2, even when isSelfing is F. See the recipe in
section 12.4 for an example of how to use this to suppress de facto selfing. Since selfing is
enabled only in non-sexual simulations, isSelfing will always be F in sexual simulations (and de
facto selfing is also impossible in sexual simulations).
Note that matings in SLiM do not proceed in random order. Offspring are generated for each
subpopulation in turn, and within each subpopulation the order of offspring generation is also
non-random with respect to the source subpopulation, the sex of the offspring, and the
reproductive mode (selfing, cloning, or autogamy). It is important, therefore, that modifyChild()
callbacks are not in any way biased by the offspring generation order; they should not treat
offspring generated early in the process any differently than offspring generated late in the process.
Similar to mateChoice() callbacks, any failure to guarantee such invariance could lead to large
biases in the simulation outcome. In particular, it is usually dangerous to activate or deactivate
modifyChild() callbacks while offspring generation is in progress. When SLiM sees that
mateChoice() or modifyChild() callbacks are defined, it randomizes the order of child generation
within each subpopulation, so this issue is mitigated somewhat. However, offspring are still
generated for each subpopulation in turn. Furthermore, in generations without active callbacks
offspring generation order will not be randomized (making the order of parents nonrandom in the
next generation), with possible side effects. In short, order-dependency issues are still possible and
must be handled very carefully.
As with the other callback types, multiple modifyChild() callbacks may be registered and
active. In this case, all registered and active callbacks will be called for each child generated, in
the order that the callbacks were registered. If a modifyChild() callback returns F, however,
indicating that the child should be generated, the remaining callbacks in the chain will not be
called.
There are many different ways in which a modifyChild() callback could be used in a
simulation; see the recipes in chapter 12 for illustrations of the power of this technique.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

476

22.5 Defining recombination behavior with a recombination() callback
Typically, a simulation sets up a recombination map at the beginning of the run with
initializeRecombinationRate(), and that map is used for the duration of the run. Less
commonly, the recombination map is changed dynamically from generation to generation, with
Chromosome’s method setRecombinationRate(); but still, a single recombination map applies for
all individuals in a given generation. However, in unusual circumstances a simulation may need
to modify the way that recombination works on an individual basis; for this, the recombination()
callback mechanism is provided. This can be useful for models involving chromosomal inversions
that prevent recombination within a region for some individuals (see section 13.5), for example, or
for models of the evolution of recombination.
A recombination() callback is defined with a syntax much like that of other callbacks:
[id] [gen1 [: gen2]] recombination([]) { ... }

The recombination() callback will be called during the generation of every gamete during the
generation(s) in which it is active. It may optionally be restricted to apply only to gametes
generated by parents in a specified subpopulation, using the  specifier.
When a recombination() callback is called, a parent has already been chosen to generate a
gamete, and candidate recombination breakpoints for use in recombining the parental genomes
have been drawn. The genomes of the focal parent are provided to the callback, as is the focal
parent itself (as an Individual object) and the subpopulation in which it resides. Furthermore, the
proposed breakpoints are provided to the callback, divided into three categories: ordinary
recombination breakpoints, and start/end positions for gene conversion events (if gene conversion
is enabled). The callback may modify these variables in order to change the breakpoints used, in
which case it must return T to indicate that changes were made, or it may leave the proposed
breakpoints unmodified, in which case it must return F. (The behavior of SLiM is undefined if the
callback returns the wrong logical value.)
In addition to the SLiM globals listed in section 22.1, then, a recombination() callback is
supplied with additional information passed through global variables:
individual
genome1
genome2
subpop
breakpoints
gcStarts
gcEnds

The focal parent that is generating a gamete
One genome of the focal parent; this is the initial copy strand
The other genome of the focal parent
The subpopulation to which the focal parent belongs
An integer vector of ordinary recombination breakpoints
An integer vector of the start positions of gene conversion spans
An integer vector of the end positions of gene conversion spans

These globals may be used in the recombination() callback to determine the final
recombination breakpoints used by SLiM. The positions in gcStarts and gcEnds constitute
matched pairs (a corresponding end for each start), so the lengths of those two vectors are
guaranteed to be equal. If values are set into breakpoints, gcStarts, and/or gcEnds, the new
values must be of type integer, and gcStarts and gcEnds must be set to vectors of the same length
(again, constituting matched pairs). If any of breakpoints, gcStarts, or gcEnds are modified by
the callback, T should be returned, otherwise F should be returned (this is a speed optimization, so
that SLiM does not have to spend time checking for changes when no changes have been made).
The positions specified in breakpoints, gcStarts, and gcEnds mean that a crossover will occur
immediately before the specified base position (between the preceding base and the specified
base, in other words). The genome specified by genome1 will be used as the initial copy strand
when SLiM executes the recombination; this cannot presently be changed by the callback.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

477

In this design, the recombination callback does not specify a custom recombination map
(although that is a possible extension to this design that could be implemented if it would be
useful). Instead, the callback can add or remove breakpoints at specific locations. To implement a
chromosomal inversion, as is done in the recipe in section 13.5, for example, if the parent is
heterozygous for the inversion mutation then crossovers within the inversion region are removed
by the callback. As another example, to implement a model of the evolution of the overall
recombination rate, a model could (1) set the global recombination rate to the highest rate
attainable in the simulation, (2) for each individual, within the recombination() callback,
calculate the fraction of that maximum rate that the focal individual would experience based upon
its genetics, and (3) probabilistically remove proposed crossover points based upon random
uniform draws compared to that threshold fraction, thus achieving the individual effective
recombination rate desired. Other similar treatments could actually vary the effective
recombination map, not just the overall rate, by removing proposed crossovers with probabilities
that depend upon their position, allowing for the evolution of localized recombination hot-spots
and cold-spots. Crossovers and gene conversion events may also be added, not just removed, by
recombination() callbacks.
Note that the positions in breakpoints, gcStarts, and gcEnds are not, in the general case,
guaranteed to be sorted or uniqued; in other words, positions may appear out of order, and the
same position may appear more than once. After all recombination() callbacks have completed,
the positions from breakpoints, gcStarts, and gcEnds will be merged together into a single vector,
sorted, uniqued, and used as the crossover points in generating the prospective gamete genome.
The essential point here is that if the same position occurs more than once, across breakpoints,
gcStarts, and gcEnds, the multiple occurrences of the position do not cancel; SLiM does not cross
over and then “cross back over” given a pair of identical positions. Instead, the multiple
occurrences of the position will simply be uniqued down to a single occurrence.
As with the other callback types, multiple recombination() callbacks may be registered and
active. In this case, all registered and active callbacks will be called for each gamete generated, in
the order that the callbacks were registered.
22.6 Defining interaction behavior with an interaction() callback
The InteractionType class (section 21.7) provides various built-in interaction functions that
translate from distances to interaction strengths. However, it may sometimes be useful to define a
custom function for that purpose; for that reason, SLiM allows interaction() callbacks to be
defined that modify the standard interaction strength calculated by InteractionType. In particular,
this mechanism allows the strength of interactions to depend upon not only the distance between
individuals, but also the genetics and other state of the individuals, the spatial position of the
individuals, and other environmental variables.
An interaction() callback is called by SLiM when it is determining the strength of the
interaction between one individual (the receiver of the interaction) and another individual (the
exerter of the interaction). This may occur when the evaluate() method of InteractionType is
called, if immediate evaluation is requested (see section 21.7.2); or it may occur at some point
after evaluation of the InteractionType, when the interaction strength is needed, if immediate
evaluation was not requested. This means that interaction() callbacks() may be called at a
variety of points in the generation cycle, unlike the other callback types in SLiM, which are each
called at a specific point. If you write an interaction() callback, you need to take this into
account; assuming that the generation cycle is at a particular stage, or even that the generation
count is the same as it was when evaluate() was called, may be dangerous.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

478

When an interaction strength is needed, the first thing SLiM does is calculate the default
interaction strength using the interaction function that has been defined for the InteractionType
(see section 21.7). If the receiver is the same as the exerter, the interaction strength is always zero;
and in spatial simulations if the distance between the receiver and the exerter is greater than the
maximum distance set for the InteractionType, the interaction strength is also always zero. In these
cases, interaction() callbacks will not be called, and there is no way to redefine these interaction
strengths.
Otherwise, SLiM will then call interaction() callbacks that apply to the interaction type and
subpopulation for the interaction being evaluated. An interaction() callback is defined with a
variation of the syntax used for other callbacks:
[id] [gen1 [: gen2]] interaction( [, ]) { ... }

For example, if the callback were defined as:
1000:2000 interaction(i2, p3) { 1.0; }

then an interaction strength of 1.0 would be used for all interactions of interaction type i2 in
subpopulation p3 from generation 1000 to generation 2000.
In addition to the SLiM globals listed in section 22.1, an interaction() callback is supplied
with some additional information passed through global variables:
distance
strength
receiver
exerter
subpop

The distance from receiver to exerter, in spatial simulations; NAN otherwise
The default interaction strength calculated by the interaction function
The individual receiving the interaction (an object of class Individual)
The individual exerting the interaction (an object of class Individual)
The subpopulation in which the receiver and exerter live

These globals may be used in the interaction() callback to compute an interaction strength.
To simply use the default interaction strength that SLiM would use if a callback had not been
defined for interaction type i1, for example, you could do this:
interaction(i1) {
return strength;
}

Usually an interaction() callback will modify that default strength based upon factors such as
the genetics of the receiver and/or the exerter, the spatial positions of the two individuals, or some
other simulation state. Any finite float value greater than or equal to 0.0 may be returned. The
value returned will be cached by SLiM; if the interaction strength between the same two
individuals is needed again later, the interaction() callback will not be called again (something
to keep in mind if the interaction strength includes a stochastic component).
More than one interaction() callback may be defined to operate in the same generation. As
with other callbacks, multiple callbacks will be called in the order in which they were defined in
the input file. Furthermore, each callback will be given the strength value returned by the
previous callback – so the value of strength is not necessarily the default value, in fact, but is the
result of all previous interaction() callbacks for the interaction in question. In this way, the
effects of multiple callbacks can “stack”.
The interaction() callback mechanism is extremely powerful and flexible, allowing any sort of
user-defined interactions whatsoever to be queried dynamically using the methods of
InteractionType. However, in the general case a simulation may call for the evaluation of the
interaction strength between each individual and every other individual, making the computation
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

479

of the full interaction network an O(N2) problem. Since interaction() callbacks may be called
for each of those N2 interaction evaluations, they can slow down a simulation considerably, so it is
recommended that they be used sparingly. This is the reason that the various interaction functions
of InteractionType were provided; when an interaction does not depend upon individual state,
the intention is to avoid the necessity of an interaction() callback altogether. Furthermore,
constraining the number of cases in which interaction strengths need to be calculated – using a
short maximum interaction distance, querying the nearest neighbors of the focal individual rather
than querying all possible interactions with that individual, and specifying the reciprocality and
sex segregation of the InteractionType, for example – may greatly decrease the computational
overhead of interaction evaluation.
22.7 Defining reproduction behavior with a reproduction() callback
In WF models (the default model type in SLiM), the SLiM core manages the reproduction of
individuals in each generation. In nonWF models, however, reproduction is managed by the
model script, in reproduction() callbacks. These callbacks may only be defined in nonWF
models.
A reproduction() callback is defined with a syntax much like that of other callbacks:
[id] [gen1 [: gen2]] reproduction([ [, ]]) { ... }

The reproduction() callback will be called once for each individual during the generation(s) in
which it is active. It may optionally be restricted to apply only to individuals in a specified
subpopulation, using the  specifier; this may be a subpopulation specifier such as p1,
or NULL indicating no restriction. It may also optionally be restricted to apply only to individuals of
a specified sex (in sexual models), using the  specifier; this may be "M" or "F", or NULL
indicating no restriction.
When a reproduction() callback is called, the expectation is that the callback will trigger the
reproduction of a focal individual by making method calls to add new offspring individuals.
Typically the offspring added are the offspring of the focal individual, and typically they are added
to the subpopulation to which the focal individual belongs, but neither of these is required; a
reproduction() callback may add offspring generated by any parent(s), to any subpopulation. The
focal individual is provided to the callback (as an Individual object), as are its genomes and the
subpopulation in which it resides.
In addition to the SLiM globals listed in section 22.1, then, a reproduction() callback is
supplied with additional information passed through global variables:
individual
genome1
genome2
subpop

The focal individual that is expected to reproduce
One genome of the focal individual
The other genome of the focal individual
The subpopulation to which the focal individual belongs

At present, the return value from reproduction() callbacks is not used, and must be void (i.e., a
value may not be returned). It is possible that other return values will be defined in future.
It is possible, of course, to do actions unrelated to reproduction inside reproduction()
callbacks, but it is not recommended. The late() event phase of the previous generation provides
an opportunity for actions immediately before reproduction, and the early() event phase of the
current generation provides an opportunity for actions immediately after reproduction, so only
actions that are intertwined with reproduction itself should occur in reproduction() callbacks.
Besides providing conceptual clarity, following this design principle will also decrease the
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

480

probability of bugs, since actions that are unrelated to reproduction should not influence or be
influenced by the dynamics of reproduction.
As with the other callback types, multiple reproduction() callbacks may be registered and
active. In this case, all registered and active callbacks will be called for each individual, in the
order that the callbacks were registered.
22.8 Further details on Eidos events and callbacks
Section 22.1 described Eidos events, and sections 22.2 – 22.6 described several different Eidos
callbacks that can be defined to modify the standard behavior of SLiM. This section describes a
few additional details that apply to events and callbacks. These details were mentioned previously,
but were not detailed, in the interests of simplicity; they are of interest mainly to the most
advanced users of SLiM.
Every Eidos block – an event or a callback – is defined in SLiM using a class called
SLiMEidosBlock. All of the registered instances of this class – all of the Eidos blocks scheduled to
run in the simulation – are available through the scriptBlocks property of SLiMSim (section
21.12.2). New script blocks may be added programmatically (rather than in the SLiM input file)
using SLiMSim’s ‑register...() methods; those methods take a string parameter, which is
interpreted as Eidos code. Existing script blocks may be deregistered, which removes them from
the current simulation permanently, using the ‑deregisterScriptBlock() method of SLiMSim. In
this way, the script blocks defined in the SLiM input file are only the beginning; by adding and
removing script blocks dynamically, SLiM simulations can modify their own code as they run.
Obviously this feature would, if used indiscriminately, result in incomprehensible and
unmaintainable code; but in some circumstances, it can be extremely useful and powerful.
Generally, code that manipulates SLiMEidosBlock objects finds the operand blocks using the id
property of SLiMEidosBlock. Alternatively, a script block that references itself (to deregister itself,
for example, or to set its own active property) can use a global constant called self that is defined
whenever an Eidos block is executing. The self constant refers to the executing SLiMEidosBlock
object. It may be passed to ‑deregisterScriptBlock() in order to deregister the current block;
this is safe to do, as the executing block will not actually be deregistered until it has finished
executing. It may also be used to change the properties of the currently executing script block.
In particular, SLiMEidosBlock defines an integer property, active. The active property is
normally -1; this means that script blocks are normally active. If set to 0, the script block will be
inactive for the remainder of the current generation; it will not be called or used in any way
(except that if it is currently executing when active is set to 0, that execution will complete). At
the beginning of each generation, prior to the execution of any script blocks, the active flag of all
registered script blocks will be set back to -1, activating them all again; if you want a script block
to be inactive permanently, you must deregister it rather than just marking it as inactive. Values
other than -1 may be used for active; any value other than 0 indicates that the block is active
(because active is evaluated as a logical value; only 0 is F). This facility is provided to allow
script blocks to run a limited number of times in each generation; the block can check whether
active is -1 (indicating that it is being called for the first time in a generation), and can set active
to a counter value. In each call to the script block, the script can decrement the active counter by
1; when it reaches 0, the block will not be called again in that generation. The active property
could even be used to implement a more complex state machine.
The precise way in which SLiM handles the scheduling of SLiMEidosBlock objects may be
important for some scripts. Because new script blocks can be added dynamically with
‑register...(), and existing blocks can be removed with ‑deregisterScriptBlock(), the right
way to schedule block is not entirely clear. If SLiM is partway through generating children, and
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

481

then a new modifyChild() callback is added, for example, should that callback be used for the
remaining children generated in the current generation? What if an existing modifyChild()
callback is removed, partway through the process of child generation – should that callback stop
being used immediately? For consistency, SLiM’s answer to both of these questions is “no”; a
consistent set of scripts are used across each stage of each generation. However, if a
modifyChild() callback is added before generating children begins, then that callback is used in
the same generation. In essence, the rule is this: whenever SLiM starts on a new stage of the
generational life cycle that involves calling a particular kind of Eidos block, SLiM gathers up a list
of all of the currently defined script blocks applicable to that stage, and it uses that list throughout
the duration of that stage, regardless of what changes are made to the registered script blocks
during the stage. The state of the active property of each script block is checked immediately
before each time that the script block is called, however; the active property is specifically
intended to change the active status of a script block within a single generation.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

482

23. SLiM output formats
In addition to allowing custom output in any format whatsoever, produced with Eidos code,
SLiM also has numerous ways to produce output in fixed formats using built-in methods:
– Methods on SLiMSim (section 21.12.2): outputFull(), outputFixedMutations(), and
outputMutations().
– Methods on Subpopulation (section 21.13.2): outputSample(), outputMSSample(), and
outputVCFSample().
– Methods on Genome (section 21.3.2): output(), outputMS(), and outputVCF().
The documentation cited for these classes above summarizes the method calls themselves, but
does not document the precise format they produce; that will be covered in this chapter.
Note that these methods, and the format of the output produced by SLiM, changed in various
ways in SLiM version 2.1. This documentation will discuss only the format of output from SLiM
2.1 and later, for simplicity.
As will be shown below, all of these output methods can generate a header line beginning with
the tag #OUT: followed by (1) the generation in which the output was generated, (2) a one- or twoletter output type code, (3) additional values depending upon the output type, and (4) if output was
directed to a file, the filename to which output was directed (except in the case of the SLiMSim
method outputMutations()). The output code in the header line may be used to detect which type
of output follows, which is useful for automated parsing of simulation output files. The codes are
as follows:
SLiMSim methods:
outputFull()

A

outputFixedMutations()

F

outputMutations()

T

Subpopulation

methods:

outputSample()

SS

outputMSSample()

SM

outputVCFSample()

SV

Genome

methods:

output()

GS

outputMS()

GM

outputVCF()

GV

All of these methods support output to either the SLiM output stream or to a designated file.
When output is sent to a file, all of these methods support either overwriting an existing file at the
specified path, or appending to any existing file.
These output methods are some of the more complex methods in SLiM, often with many
optional arguments that are typically specified by name. See the Eidos manual for discussion of
how to use optional arguments and named arguments, how to interpret complex type-specifiers
and method signatures, and so forth; that information is not repeated here.
23.1 SLiMSim output methods
The output methods of SLiMSim produce output regarding state that spans the whole population,
rather than just a single subpopulation or a selected set of genomes.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

483

23.1.1 outputFull()
The outputFull() method outputs complete information on all subpopulations, individuals,
and genomes, including all currently segregating mutations (but not including mutations that have
fixed and been converted into Substitution objects). Sample output for outputFull(),
abbreviated with ellipses:
#OUT: 10000 A
Version: 3
Populations:
p1 50 H
p2 50 H
...
Mutations:
10 387752 m1 1308 0 0.5 p2 9404 130
47 387966 m1 5994 0 0.5 p1 9415 130
...
Individuals:
p1:i0 H p1:0 p1:1
p1:i1 H p1:2 p1:3
...
Genomes:
p1:0 A 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19...
p1:1 A 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84...
...

The header line begins with #OUT: and then gives the generation in which the output was
produced, followed by an A (for “all”). If the output is sent to a file with outputFull()’s filePath
option, this is followed by the full path of the file to which the output was saved; for example:
#OUT: 10000 A /Users/bhaller/Desktop/full.txt

Beginning in SLiM 2.3, the header line followed by a single line that indicates the version of the
file format. SLiM 2.3’s format is indicated by a version number of 3 (for internal
reasons); if the version line is missing, it may be inferred that the file format is pre-2.3. The reason
for this addition is that it allows SLiM (and other software) to more accurately read in output files
generated by different versions of SLiM, since the version number indicates what the reader
software can expect to find in the file. A version of 4 is used by SLiM 3.0 and later to indicate that
age values are included in the Individuals section; see below.
After this comes the Populations section, which lists all currently existing subpopulations. First
is given the subpopulation’s identifier, such as p1. Next is listed its current size, in individuals (not
genomes). Finally, an H indicates that the population is composed of hermaphroditic individuals; if
sex has been enabled with initializeSex(), this will instead be an S followed by the current sex
ratio of the subpopulation (i.e., S 0.5).
Next is the Mutations section, which lists all of the currently segregating mutations in the
population. Each mutation is listed on a separate line. The first field is a within-file numeric
identifier for the mutation, beginning at 0 and counting up (although mutations are not listed in
sorted order according to this value); see below for a note on why this field exists. Second is the
mutation’s id property (see section 21.8.1), a within-run unique identifier for mutations that does
not change over time, and can thus be used to match up information on the same mutation within
multiple output dumps made at different times. Third is the identifier of the mutation’s mutation
type, such as m1. Fourth is the position of the mutation on the chromosome, as a zero-based base
position. Fifth is the selection coefficient of the mutation, and sixth is its dominance coefficient
(the latter being a property of the mutation type, in fact). Seventh is the identifier for the
outputFull()

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

484

subpopulation in which the mutation originated, and eighth is the generation in which it
originated. Finally, the ninth field gives the mutation’s prevalence: an integer count of the number
of times that it occurs in any genome in the population.
Following this section is the Individuals section. This describes each individual in each
subpopulation and specifies which genomes belong to it. The first field is an identifier for the
individual, such as p1:i0 (which indicates the 0th individual in p1). Next is the sex of the
individual: H for a hermaphrodite, or if sex has been enabled, M for a male or F for a female.
Following that come two genome specifiers, of the form p1:0 (indicating the 0th genome in p1).
Beginning in SLiM 2.3, the genome specifiers may be followed by spatial positioning
information for each individual. This will be the case if continuous space has been enabled with
the dimensionality parameter to initializeSLiMOptions() and the spatialPositions parameter
to outputFull() is T (which is the default). If both of those preconditions are satisfied, then the
properties of Individual that represent spatial positions (x, y, and/or z) will be output as floatingpoint values. Only properties that are included in the dimensionality of the simulation will be
output. For example, if the simulation has been configured to have dimensionality "xy" with
initializeSLiMOptions(), then the Individuals section might look like this:
Individuals:
p1:i0 H p1:0 p1:1 0.397687 0.522408
p1:i1 H p1:2 p1:3 0.066159 0.74749
...

The first floating-point value on each line is the value of x for that individual; the second is the
value of y. The value of the z property is not output because the z-coordinate is not included in
the dimensionality of the simulation. If spatialPositions=F were specified in the call to
outputFull(), this positional information would not be output and the file format would be
identical to that produced by version 2.1 (apart from the addition of the Version line in SLiM 2.3,
described above).
Beginning in SLiM 3.0, the genome specifiers may be followed by the age of each individual.
This will be the case if the simulation is using the nonWF model type (which is not the default) and
the ages parameter to outputFull() is T (which is the default). If both of those preconditions are
satisfied, then the age property of each individual will be output as an integer at the end of each
line in the Individuals section, following the genome specifiers and (if present) the optional spatial
positioning information controlled by spatialPositions. If age information is not output, the
output format is just as it was before SLiM 3.0, for backward compatibility. Note that if age
information is included in the output, the version number specified in the Version line will be 4; if
not, the version will remain 3 (as it was before SLiM 3.0).
Last comes the Genomes section, which specifies all of the mutations carried by each genome
in the population. The first field is a genome specifier, such as p1:0, as described above. Second
is the type of genome: an A for an autosome, or an X or Y for those types if modeling of sex
chromosomes has been enabled. This is followed by a list of within-file mutation identifiers, as
given in the Mutations section described above, that identify all the mutations carried on the
genome. Alternatively, if the genome is a “null genome” that is not allowed to carry any mutations
in SLiM (such as a Y chromosome if SLiM is modeling the X), the tag  will appear instead of
any mutation identifiers.
The reader might wonder why the within-file index for mutations (the first field in each
mutation’s output line) even exists. Couldn’t the mutation’s id (the second field) be used for that
purpose, since it also uniquely identifies mutations? The answer is: yes, in principle it could. In
practice, however, id values for mutations are often very large numbers – six, seven, or even more
digits long. Because the bulk of a SLiM output file consists of the sequences of mutation identifiers
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

485

listed in the Genome section, using small zero-based numbers for these identifiers actually makes
SLiM’s output files markedly smaller than they would be if mutation id values were used instead.
Keeping output files small is an end in itself, since disk space is limited, but also has the benefit of
making file writes and reads faster.
On the topic of file size and read/write speed, note that the outputFull() method also provides
the option of writing out a binary file. The format of that file is not documented, and is subject to
change at any time (although we will try to preserve backward compatibility when possible).
Binary files can be smaller, and their read and write times are much faster.
Note that the output from outputFull() is the only output format that SLiM can read as well as
write. This is done with the readFromPopulationFile() method of SLiMSim (see section 21.12.2).
23.1.2 outputFixedMutations()
The outputFixedMutations() method outputs information on all mutations that have fixed and
been turned into Substitution objects. It therefore complements the information produced by
outputFull(). Sample output for outputFixedMutations(), abbreviated with ellipses:
#OUT: 10000 F
Mutations:
0 390 m1 701 0 0.5 p2 20 650
1 1114 m1 957 0 0.5 p1 55 650
...

The header line has the standard SLiM output tag #OUT: followed by the generation and then an
(for “fixed”). If the output is sent to a file with outputFixedMutations()’s filePath option, this is
followed by the full path of the file to which the output was saved.
Following this is one section of output, Mutations. This lists every mutation that has fixed and
been turned into a Substitution object. The first eight fields used are identical to those used in the
Mutations section of outputFull() as described above: (1) a within-file identifier counting upward
from 0, (2) the mutation’s id property that uniquely identifies in within a run, (3) the identifier for
the mutation type, (4) the position on the chromosome, (5) the selection coefficient, (6) the
dominance coefficient, (7) the originating subpopulation, and (8) the origination generation. The
last field is different, however; instead of being a prevalence (which would be useless since these
mutations are, by definition, fixed), this field indicates the generation in which the mutation was
converted to a Substitution object (which is the same as the generation in which it fixed, unless
you are dynamically changing the convertToSubstitution flag).
F

23.1.3 outputMutations()
The outputMutations() method is intended to be used to output information about particular
mutations of interest that are being “tracked” – mutations of a particular mutation type, for
example, or perhaps a specific introduced mutation. Sample output for outputMutations():
#OUT: 10000 T p1 388376 m1 673 0 0.5 p2 9434 43
#OUT: 10000 T p2 388376 m1 673 0 0.5 p2 9434 27

These two lines of output are the result of an outputMutations() call requesting output for just
a single mutation. The first line gives information about the prevalence of the mutation in
subpopulation p1, whereas the second line gives the same for p2. If you requested output for more
than one mutations, you get a line for each mutation requested:
#OUT: 10000 T p1 388376 m1 673 0 0.5 p2 9434 43
#OUT: 10000 T p1 388788 m1 9394 0 0.5 p2 9455 57
#OUT: 10000 T p1 390206 m1 6232 0 0.5 p2 9523 57
...
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

486

#OUT: 10000 T p2 388376 m1 673 0 0.5 p2 9434 27
#OUT: 10000 T p2 388788 m1 9394 0 0.5 p2 9455 73
#OUT: 10000 T p2 390206 m1 6232 0 0.5 p2 9523 73
...

Note that the output is sorted by subpopulation, not by mutation, so the lines for a particular
mutation do not necessarily end up adjacent. If a mutation is not present in a given subpopulation
at all, no output line is produced for that mutation in that subpopulation.
The format of each output line follows a similar pattern to other output methods. First comes
the #OUT: tag, followed by the generation and then a T (for “tracked”, for historical reasons). Next
comes the subpopulation identifier, such as p1, for which the line is being produced. The
remaining fields are the same mutation information as produced by outputFull(): (1) the
mutation’s id property, (2) the identifier of its mutation type, (3) its position, (4) its selection
coefficient, (5) its dominance coefficient, (6) origin subpopulation identifier, (7) origin generation,
and (8) prevalence. Note that even if outputMutations()’s filePath parameter is used to send the
output to a file, the filename is not added at the end of the header line as it is with SLiM’s other
output commands, to keep the output from this command concise (since it really consists of
nothing but header lines).
23.2 Subpopulation output methods
The output methods of Subpopulation produce output about the mutations carried by a
sampled subset of the Subpopulation. Three different formats of output are presently available:
SLiM’s native format, MS, and VCF.
23.2.1 outputSample()
The outputSample() method takes a random sample of genomes from the subpopulation as
requested (with options regarding sample size, replacement, and sex) and outputs information on
them in SLiM’s native format. If the sampling options provided by outputSample() are
insufficiently flexible, the output() method of Genome is a more general-purpose method (see
section 23.3.1). Sample output for outputSample(), abbreviated with ellipses:
#OUT: 10000 SS p1 10
Mutations:
65 587710 m1 1308 0 0.5 p2 9404 5
101 587924 m1 5994 0 0.5 p1 9415 5
...
Genomes:
p1:0 A 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23...
p1:1 A 0 1 2 3 4 5 6 8 9 10 49 50 11 12 13 14 15 16 17 18 51 19 22 23...
...

The header line starts with the usual tag, #OUT:, followed by the generation and then SS
(representing “sample, SLiM format”). It then gives the identifier for the subpopulation sampled,
such as p1, and finally the size of the sample (in genomes, not individuals). If the output is sent to
a file with outputSample()’s filePath option, this is followed by the full path of the file to which
the output was saved.
This is followed by a Mutations section and then a Genomes section. The formats of these is
identical to the same sections in the output of outputFull(), described in section 23.1.1, except
that the prevalence values given for mutations are their prevalence within the sample of genomes,
not in the population as a whole. Note that the Individuals section provided by outputFull() is
also not present in the output from outputSample(), because the sample is of genomes, not
complete individuals.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

487

23.2.2 outputMSSample()
The outputMSSample() method takes a random sample of genomes from the subpopulation as
requested (with options regarding sample size, replacement, and sex) and outputs information on
them in MS format. If the sampling options provided by outputMSSample() are insufficiently
flexible, the outputMS() method of Genome is a more general-purpose method (see section 23.3.2).
Sample output for outputMSSample(), abbreviated with ellipses:
#OUT: 10000 SM p1 10
//
segsites: 179
positions: 0.1308131 0.4991499 0.5994599 0.6999700 0.9599960...
00101001000001101011111000000011111000100100111100001000011000100000...
00101001000001101011111000000011111000100100111100001000011000100000...
...

The first line is a header in the same format as for outputSample(), as described in the previous
section. The output type code here, SM, represents “sample, MS format”. The outputMSSample()
method allows output to be sent to a file, with the optional filePath argument. In this case, the
#OUT: header line is not emitted, since it would not be conformant with the MS data format
specification.
This is followed by an empty comment line //, and then a line stating the total number of
segregating sites output. Note that, as with all other output methods in SLiM, these sites are
segregating in the population, but every genome in the sample may be identical at a given site.
Next comes a line giving the position on the chromosome of each of the segregating sites.
These positions have been converted by SLiM from base positions to floating-point positions in the
interval [0,1] as expected for the MS format. Note that SLiM allows multiple mutations at exactly
the same position, so even without roundoff (which may also be an issue for very long
chromosomes), two positions in this list may be specified with exactly the same number.
Finally, the output has one line for each genome in the sample. Each line is a simple sequences
of 0’s and 1’s, indicating whether the genome in question possesses (1) or does not possess (0) the
mutation at the corresponding position in the list of positions.
23.2.3 outputVCFSample()
The outputVCFSample() method takes a random sample of individuals (not genomes!) from the
subpopulation as requested (with options regarding sample size, replacement, and sex) and
outputs information on them. If the sampling options provided by outputVCFSample() are
insufficiently flexible, the outputVCF() method of Genome is a more general-purpose method (see
section 23.3.3). Sample output for outputVCFSample(), abbreviated with ellipses:
#OUT: 10000 SV p1 10
##fileformat=VCFv4.2
##fileDate=20160613
##source=SLiM
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##FORMAT=
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

488

#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT i0 i1 i2 i3 i4 i5...
1 1309 . A T 1000 PASS
MID=987550;S=0;DOM=0.5;PO=2;GO=9404;MT=1;AC=11;DP=1000 GT 0|1 0|0 0|1...
1 5995 . A T 1000 PASS
MID=987764;S=0;DOM=0.5;PO=1;GO=9415;MT=1;AC=11;DP=1000 GT 0|1 0|0 0|1...
...

The first line is a header, similar to that produced by outputSample() and outputMSSample().
The output code SV here represents “sample, VCF format”. The outputVCFSample() method allows
output to be sent to a file, with the optional filePath argument. In this case, the #OUT: header line
is not emitted, since it would not be conformant with the VCF data format specification.
Following that is the VCF header, which provides various information about the information in
the file; the VCF format is quite complex so we will not attempt to document it in detail here.
Note that the INFO fields provided by SLiM include fields for a lot of SLiM-specific information that
is not part of the VCF standard itself: the mutation’s id property, selection and dominance
coefficients, subpopulation of origin and generation of origin, and mutation type (the numeric part
of a mutation type identifier like m1). These will have no meaning to most VCF tools, but may be
useful for filtering or other analysis. The VCF header also describes two standard INFO tags: AC and
DP. AC gives the “allele count”, the number of occurrences of the given mutation within the
sample. DP gives the “total depth”, a property of empirical genomic samples that is meaningless
for SLiM output; it is supplied, and is always equal to 1000, in SLiM output to facilitate processing
with VCF tools that expect this tag to be present. The last INFO tag described in the header is
MULTIALLELIC; it is discussed below.
Following the VCF header are lines describing each mutation. Because of word-wrapping and
line-breaking issues, these lines look a little funny here, but this is actually a single line, with fields
separated by tab characters:
1 1309 . A T 1000 PASS
MID=987550;S=0;DOM=0.5;PO=2;GO=9404;MT=1;AC=11;DP=1000 GT 0|1 0|0 0|1...

These fields correspond to the column headings given in the last line of the VCF header:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT i0 i1 i2 i3 i4 i5...

The first field is the chromosome identifier; SLiM always emits 1 for this. Next is the position of
the mutation; note that VCF uses 1-based positions, so this is the position used internally by SLiM
plus 1. The next five fields have VCF-oriented meanings that are unimportant for SLiM; they will
always be . A T 1000 PASS. The next field is very long, and consists of a series of INFO tags
separated by semicolons; the meaning of those fields is as specified by the ##INFO lines in the VCF
header. Next comes a GT tag indicating the start of genotype data, and finally a sequence of “calls”
of the format 0|0, 0|1, 1|0, or 1|1, indicating whether a given individual in the sample possessed
(1) or did not possess (0) the mutation in each of its two genomes. This data has essentially the
same meaning as MS-format data, but in a notation that groups the 0’s and 1’s into homologous
chromosomes in diploid individuals.
There are several things to note about this. First of all, calls are normally diploid (0|0, 1|0, etc.),
but if SLiM is modeling sex chromosomes, calls may be haploid instead (just a 0 or a 1). For
example, if you are modeling the X chromosome, males will be emitted as haploid while females
will be emitted as diploid. There is no way to represent a 0-ploid individual in VCF format, so if
you are modeling the Y chromosome you may not include females in your sample; there would be
no genetic information to emit for them.
Second, it is important to emphasize that VCF output, unlike all other output from SLiM, is
based on individuals, not on genomes. The subpopulation sample taken by outputVCFSample() is
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

489

a sample of individuals, not genomes, and thus a sample of size 10 will include twice as many
genomes as a sample of size 10 would include for outputMSSample() or outputSample().
Third, you might have noticed one incongruous INFO field in the VCF header above, called
MULTIALLELIC. This is used by SLiM to designate mutations that occur at chromosome positions
that have other segregating mutations also at the same position; such mutations will be tagged
MULTIALLELIC in their INFO field information. In essence, the problem is this. SLiM is, in a sense,
an infinite-alleles-per-site model. Any number of mutations can occur at the same position, all
having different selection and dominance coefficients, etc., and these mutations can even co-occur
within a single genome. VCF format, on the other hand, is more tightly tied to the biological
reality of genetic information, that a given position has only four possible bases (A, T, G, C)
ignoring epigenetic information like methylation. There is no very graceful way to wedge SLiM’s
perspective into VCF format, so we simply emit each mutation as its own VCF line. However,
many VCF analysis tools may choke on this, or produce incorrect results, because VCF files
normally contain only a single line per base position. In order to help alleviate this issue, SLiM
tags all lines that possess this problem with the MULTIALLELIC tag. You can use this tag to filter out
those sites – either to treat them specially, or to simply exclude them from your analysis. If you
want to exclude them completely, you can also request that with a flag value passed to
outputVCFSample(), in which case all lines that would have been tagged MULTIALLELIC will be
suppressed.
Finally, note that SLiM designates all mutations as being a change from an A to a T. Since SLiM
has no concept of nucleotide sequence, this is simply an arbitrary choice. If you wished to
construct a FASTA file for the ancestral sequence, for example, it would simply be the length of the
chromosome, filled with A’s.
23.3 Genome output methods
The output methods of Genome produce output about the specific vector of Genome objects for
which the method is called. Whereas the sampling output methods of Subpopulation limit you to
a sample drawn from a single subpopulation, and provide only a few options regarding how that
sample is conducted, with the Genome output methods you can construct your own vector of
genomes in whatever way you wish, and produce standardized output from that sample. The
sample() function of Eidos may prove useful for this, providing options such as weighted sampling
that Subpopulation’s methods don’t support. The Individual class of SLiM may also be useful; you
can get a vector of individuals from the subpopulations you are interested in, use sample() to get a
sample of individuals from that vector, get the genomes from those individuals using the genomes
property of Individual, and then produce output from the resulting vector of genomes using these
methods.
Incidentally, you might wonder why these Genome output methods behave differently from
most Eidos methods – they do not multicast out to all of the objects in the target vector, producing
a separate output block for each, but instead produce a single output block for the whole target
vector. This is because these methods are designated as class methods, which do not multicast.
This is parallel to defining a static member function in a class in C++, taking a parameter that is a
std::vector containing elements of that same class; that would be the natural way to represent
this concept in C++, whereas in Eidos such methods are class methods that are called on the target
vector but do not multicast. If this is gibberish to you, you can ignore it; the upshot is that it just
works.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

490

23.3.1 output()
The output() method is parallel to the outputSample() method of Subpopulation (see section
23.2.1), but allows output based upon any vector of genomes. Sample output for output(),
abbreviated with ellipses:
#OUT: 10000 GS 10
Mutations:
9 187870 m1 1308 0 0.5 p2 9404 12
47 188084 m1 5994 0 0.5 p1 9415 12
...
Genomes:
p*:0 A 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23...
p*:1 A 0 1 2 3 4 5 6 68 7 8 9 69 10 12 13 14 15 16 17 18 19 20 21 22...
...

This is almost identical to the output from outputSample(), so see section 23.2.1 for further
discussion. The main differences are in the header line. The output type here is GS (for “genomes,
SLiM format”), and no subpopulation identifier is given since the genome vector being output may
originate from more than one subpopulation. If the output is sent to a file with output()’s
filePath option, the last field of the header provides the full path of the file to which the output
was saved.
The other difference is in the genome identifiers in the Genomes section. Here, since the
subpopulation of origin of the genomes is not known to output() – since it was just handed a
vector of genomes that could have come from anywhere – the genome identifiers are of the form
p*:0, p*:1, etc., with the * symbolizing an unknown source subpopulation. This syntax is
intended to parallel the syntax used by SLiM’s other output functions, to make it easier to share
parsing code.
23.3.2 outputMS()
The outputMS() method is parallel to the outputMSSample() method of Subpopulation (see
section 23.2.2), but allows output based upon any vector of genomes. Sample output for
outputMS(), abbreviated with ellipses:
#OUT: 10000 GM 10
//
segsites: 165
positions: 0.1308131 0.4991499 0.5994599 0.6999700 0.9599960...
11010110111110010100000111111100000111011011000011110111100111000010...
11010110111110010100000111111100000111011011000011110111100111000010...

This is almost identical to the output from outputMSSample(), so see section 23.2.2 for further
discussion. The differences are in the header line; output type GM is used here (representing
“genomes, MS format”), and no subpopulation identifier is given since the genome vector being
output may originate from more than one subpopulation.
The outputMS() method allows output to be sent to a file, with the optional filePath argument.
In this case, the #OUT: header line is not emitted, since it would not be conformant with the MS
data format specification.
23.3.3 outputVCF()
The outputVCF() method is parallel to the outputVCFSample() method of Subpopulation (see
section 23.2.3), but allows output based upon any vector of genomes. Sample output for
outputVCF(), abbreviated with ellipses:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

491

#OUT: 10000 GV 20
##fileformat=VCFv4.2
##fileDate=20160613
##source=SLiM
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##INFO=
##FORMAT=
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT i0 i1 i2 i3 i4 i5...
1 1309 . A T 1000 PASS
MID=987550;S=0;DOM=0.5;PO=2;GO=9404;MT=1;AC=11;DP=1000 GT 0|1 0|0 0|1...
1 5995 . A T 1000 PASS
MID=987764;S=0;DOM=0.5;PO=1;GO=9415;MT=1;AC=11;DP=1000 GT 0|1 0|0 0|1...
...

This is almost identical to the output from outputVCFSample(), so see section 23.2.3 for further
discussion. The differences are in the header line; type GV is used here (representing “genomes,
VCF format”), and no subpopulation identifier is given since the genome vector being output may
originate from more than one subpopulation. Also note that the sample size here is reported in
genomes (i.e., 20) whereas with outputVCFSample() it is reported in individuals (i.e., 10). This is
because this method operates on a vector of genomes, and thus that is the natural unit for the
sample size. Nevertheless, since VCF output naturally groups genomes into diploid individuals,
the target genome vector must have an even number of elements, and each pair of elements will
be assumed to represent one individual.
The outputVCF() method allows output to be sent to a file, with the optional filePath
argument. In this case, the #OUT: header line is not emitted, since it would not be conformant with
the VCF data format specification.
23.4 SLiM additions to the .trees file format
The .trees files produced by the treeSeqOutput() method (section 21.12.2) are in a complex
binary format, defined at the top level by the kastore library and at the next level by msprime; it is
not documented here. Reading or writing compliant .trees files is a topic well beyond the scope
of this manual. However, if the pyslim.load() method is used to load a .trees file into Python,
the entities defined by the file, such as nodes, edges, and mutations, can then be accessed through
the pyslim and msprime APIs in Python. Those entities often have a column for metadata, and this
is where SLiM attaches its additional state information. The contents of those metadata fields is
documented here. Directly using or generating this metadata information in Python is, again,
beyond the scope of this manual, but the information provided here at least documents what you
would need to know in order to do so. The pyslim package provides the more usual route to
accessing this metadata information, and should suffice for the needs of almost all users.
This metadata is generally in a binary format. The descriptions below will give the number of
bytes for each field, their C / C++ type, and a brief description.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

492

23.4.1 Metadata for mutations
The derived state field for each mutation entry is actually a comma-separated list of mutation
IDs, in ASCII, representing all of the stacked mutations present at the position in question after the
addition (or removal) of a mutation; this is rather different from the way the derived state field is
used in most .trees files.
Each mutation entry’s metadata will then consist of a series of 16 byte metadata records,
corresponding to the mutation IDs listed for the derived state:
4 bytes (int32_t): the id of the mutation type the mutation belongs to (i.e., 5 for m5)
4 bytes (float): the selection coefficient of the mutation
4 bytes (int32_t): the id of the subpopulation in which the mutation arose (i.e., 3 for p3)
4 bytes (int32_t): the simulation generation in which the mutation arose
Note that the same mutation ID may be described by multiple mutation entries, if differences in
stacking or other factors lead to it being recorded multiple times. Furthermore, the metadata for
these multiple descriptions may not match, since it is recorded at the time the mutation is added,
and metadata such as the selection coefficient of a mutation can change as a mutation runs.
When reading, SLiM will use the metadata associated with the last recorded version of each
mutation.
23.4.2 Metadata for nodes
Each node will have 10 bytes of metadata attached:
8 bytes (int64_t): the SLiM genome ID for this node, as from genomePedigreeID
1 byte (uint8_t): true (1) if this node represents a null genome, as from isNullGenome
1 byte (uint8_t): the type of the genome (0 for autosome, 1 for X, 2 for Y)
23.4.3 Metadata for individuals
Each individual will have 24 bytes of metadata attached:
8 bytes (int64_t): the SLiM pedigree ID for this individual, as from pedigreeID
4 bytes (int32_t): the age of this individual, as from age
4 bytes (int32_t): the subpopulation the individual belongs to (i.e., 3 for p3)
4 bytes (int32_t): the sex of the individual (0 for female, 1 for male, -1 for hermaphrodite)
4 bytes (uint32_t): flags; see below.
At present, only the low-order bit of the flags metadata, 0x01, is used; if set, it indicates that this
individual migrated between subpopulations during the current generation (following the migrant
property of Individual described in section 21.6.1). Other flag bits are reserved and should be set
to 0 until such time as they are defined.
The individual table also has a flags field, outside of the metadata record, and SLiM uses some
bits in that field; that will be explained below since it is not part of the metadata record itself.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

493

23.4.4 Metadata for populations
Each population will have 88 bytes of metadata attached at a minimum, followed by a variable
number of 12-byte sections. The initial 88-byte section is structured as:
4 bytes (int32_t): the ID of this subpopulation, as from id
8 bytes (double): the selfing fraction (WF) or unused (nonWF)
8 bytes (double): the cloning fraction for females or hermaphrodites (WF) or unused (nonWF)
8 bytes (double): the cloning fraction for males or hermaphrodites (WF) or unused (nonWF)
8 bytes (double): the sex ratio as M:M+F (WF) or unused (nonWF)
8 bytes (double): spatial bounds x0 value, unused in non-spatial models
8 bytes (double): spatial bounds x1 value, unused in non-spatial models
8 bytes (double): spatial bounds y0 value, unused in non-spatial models
8 bytes (double): spatial bounds y1 value, unused in non-spatial models
8 bytes (double): spatial bounds z0 value, unused in non-spatial models
8 bytes (double): spatial bounds z1 value, unused in non-spatial models
4 bytes (int32_t): the number of migration records, as from immigrantSubpopFractions
The value of the last field above dictates the number of 12-byte metadata sections that follow,
each of this format:
4 bytes (int32_t): the ID of the source subpopulation (i.e., 3 for p3)
8 bytes (double): the migration rate from the source subpopulation
23.4.5 The SLiM provenance table entry format
The provenance table is designed to hold an entry for each software program that has been
involved in the creation of the file, providing a sort of “chain of custody” for the data in the file.
For a .trees file to be openable in SLiM, there must be a provenance entry that indicates that the
file was created by SLiM; it does not need to be the last entry, but the assumption is that any later
entries represent software that understands how to preserve SLiM metadata conformance as
described above. As long as the expected SLiM metadata format described above is strictly
followed, the SLiM provenance may be spoofed to make a file openable in SLiM, too; this is what
the pyslim package does, after attaching SLiM metadata. A SLiM provenance entry is an ASCII
string, with no terminating NULL (the end of the string is dictated by the length of the entry itself, as
tracked by kastore). A typical provenance table entry from SLiM 3.0 (file_version of "0.1") looks
like:
{"program":"SLiM", "version":"3.0", "file_version":"0.1",
"model_type":"WF", "generation":1000, "remembered_node_count":0}

In SLiM 3.1 this has been extended to include more information (file_version of "0.2"). The
new provenance table entry format provides a superset of the information in file_version "0.1"
format, but some keys were renamed or moved within the JSON hierarchy. Nevertheless,
backward compatibility should be possible if handled carefully. The new format looks like this:
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

494

{
"environment": {
"os": {
"machine": "x86_64",
"node": "anonymized.uoregon.edu",
"release": "17.6.0",
"system": "Darwin",
"version": "Darwin Kernel Version 17.6.0: Tue May 8
15:22:16 PDT 2018; root:xnu-4570.61.1~1/RELEASE_X86_64"
}
},
"metadata": {
"individuals": {
"flags": {
"16": {
"description": "the individual was alive at the time the
file was written",
"name": "SLIM_TSK_INDIVIDUAL_ALIVE"
},
"17": {
"description": "the individual was requested by the user
to be remembered",
"name": "SLIM_TSK_INDIVIDUAL_REMEMBERED"
},
"18": {
"description": "the individual was in the first generation
of a new population",
"name": "SLIM_TSK_INDIVIDUAL_FIRST_GEN"
}
}
}
},
"parameters": {
"command": ["/usr/local/bin/slim", "-seed", "1", "~/test.slim"],
"model": "initialize() {\n\tinitializeTreeSeq();
\n\tinitializeMutationRate(1e-7);\n\tinitializeMutationType(\"m1\", 0.5,
\"f\", 0.0);\n\tinitializeGenomicElementType(\"g1\", m1, 1.0);
\n\tinitializeGenomicElement(g1, 0, 99999);
\n\tinitializeRecombinationRate(1e-8);\n}\n1 {\n\tsim.addSubpop(\"p1\",
500);\n}\n2000 late() { sim.treeSeqOutput(\"~/Desktop/junk.trees\"); }\n",
"model_type": "WF",
"seed": 1
},
"schema_version": "1.0.0",
"slim": {
"file_version": "0.2",
"generation": 2000,
},
"software": {
"name": "SLiM",
"version": "3.1"
}
}

These provenance strings are a JSON strings (https://www.json.org). For SLiM 3.0, the keys must
be provided in exactly the order given in the file_version "0.1" example above, including the
exact positions of spaces and punctuation. For SLiM 3.1 and later, any JSON-compliant string
supplying the expected keys will work, as a proper JSON reader has been incorporated into SLiM.
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

495

The top-level keys have the following meanings:
environment: This

has information about the environment in which the simulation was
executed. Right now it has an os key under it, with machine, node, release,
system, and version keys under that, providing information vended by the POSIX
function uname(). These keys are for diagnostic purposes, and are not particularly
standardized, and should not be relied upon by reading software.

metadata: This

has information about the metadata annotations used in the file. Right now it
describes only the three bits that are used by SLiM in the flags column of the
individuals table (not the flags field inside the metadata record for each individual).
Further entries may be added describing all the metadata documented above.

parameters: This provides
The command key

information that would be necessary to recreate the model run.
provides the command-line parameters (beginning with slim itself)
that were used to run the model; for models run in SLiMgui this will be an empty
array. The model key provides the script that was run by SLiM to generate the saved
file; this is archived in the .trees file for easier identification and reproducibility of
runs. Note that other files that may be sourced or read by the script are not
archived; for full reproducibility it would be necessary to archive such auxiliary files
alongside the .trees file. The script is a JSON-quoted string, so it should be unJSON-quoted to reproduce the original script. The file_version key gives the
version of the SLiM annotations in the file (provenance and metadata); at present
only "0.1" and "0.2" are supported. The metadata format is the same for "0.1"
and "0.2"; only the provenance format changed, as described here. The
model_type key should be either "WF" or "nonWF", depending upon the type of
model that generated the file. This has some implications for the other metadata; in
particular, some of the population metadata is required for WF models but unused
in nonWF models, and individual ages in WF model data are expected to be -1.
Finally, the seed key provides the original random number generator seed. If a seed
is supplied at the command line with the -s option, that will be given here.
Otherwise, the seed will be a randomly generated seed provided by SLiM. Note
that this is only the original seed; if the seed is set in script using setSeed(), that
will not be reflected here (but might be reflected in the script supplied by the model
key).

schema_version: The

version of the JSON schema used. The schema for provenance entries
is documented at https://msprime.readthedocs.io/en/stable/provenance.html. Note
that the schema is fairly minimal; most of the information emitted by SLiM is not
described by it.

slim: This

provides SLiM-specific information needed to correctly read .trees files into
SLiM. The file_version key describes the overall version of the SLiM-specific
information in the file, such as metadata and the provenance entry information
itself, as outlined above. The generation key provides the generation at which the
file was written, so that that generation can be restored when the file is read. The
remembered_node_count key, which exists in file_version "0.1" files but not
file_version "0.2" and later, specifies how many rows at the top of the nodes
table are “remembered” by the user with the treeSeqRememberIndividuals()
method; in file_version "0.2" and later, remembered individuals and genomes
are controlled by the flags set on the rows of the individual table.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

496

software: This

provides information about the software program that produced this
provenance entry. The name key gives the name of the software, and the version
key gives the version number of the software. Note that in file_version "0.1"
these were top-level keys instead, and the name key was called program, as
illustrated by the file_version "0.1" example earlier.

With file_version 0.2, the hope is that this existing provenance information will now be fairly
stable, although further fields may be added.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

497

24. SLiM extensions to the Eidos language
24.1 Extensions to the Eidos grammar
The Eidos language itself is defined by the grammar given in the Eidos manual. However, SLiM
defines the grammar of a SLiM input file, which defines a small extension to the Eidos language –
in particular, by using a different start rule than Eidos’s interpreter normally uses:

This start rule defines a SLiM input file as a series of zero or more SLiM Eidos blocks, each of
which is a compound statement preceded by an informational section and a callback declaration.
Definitions of user-defined Eidos functions may also be sprinkled in among those blocks. Note
that ordinary Eidos statements are not allowed at the top level of the SLiM input file; they must be
within the body of a SLiM Eidos block or a user-defined function. The SLiM input file therefore
structures Eidos statements into encapsulated blocks.
In the definition of a SLiM Eidos block, the informational section is optional, since each of its
components is optional; it looks like this:

The informational section begins with an optional identifier that can be used to later identify the
script block programmatically. If supplied, it should be an identifier like "s1", or more generally,
"sX" where X is an integer greater than or equal to 0. The rest of the informational section
comprises an optional generation or range of generations in which the script block will be used by
SLiM. The generation numbers are defined syntactically by the grammar as numeric literals, but
semantically, there are further restrictions (see section 22.1).
The callback declaration section is also an addition by SLiM to the base Eidos grammar. It is
also optional, since it can be empty:

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

498

This rule defines a SLiM script block as being one of the supported types of Eidos event or Eidos
callback (early(), late(), initialize(), fitness(), interaction(), mateChoice(),
modifyChild(), recombination(), or reproduction()). If no type specifier is provided, the script
block is an early() event by default. The identifier tokens in this rule specify restrictions on the
circumstances in which the callback will be used by SLiM. See chapter 22 for further details.
In all other respects the grammar of Eidos is unmodified in SLiM.
24.2 SLiM scoping rules
Eidos is largely a scopeless language; variables do not exist until they are defined, and once
defined they continue to exist forever, unless/until they are subsequently undefined. The only
exception to this, in Eidos, is that user-defined functions run within their own private scope. (See
the Eidos manual for further discussion of scope in Eidos.)
SLiM, however, imposes some scoping rules upon the Eidos language. This occurs almost as a
side effect of the way SLiM uses Eidos, albeit an intentional side effect. In particular, every call to
a SLiM event or callback is run within a new Eidos interpreter, created solely for the purpose of
running that particular invocation of that particular event or callback. These Eidos interpreters are
torn down and thrown away as soon as the callout ends. Because of this, variables defined inside
an event or callback have a scope that is limited to that callout; they will cease to exist as soon as
the callout finishes.
This is actually desirable, in general, because it prevents the global namespace from becoming
cluttered up with all of the local, temporary variables defined by particular events and callbacks. It
also provides a clean way for SLiM to, in effect, pass values in to callbacks, as described in chapter
22. It mimics the scoping of languages like C and C++, albeit only at the level of whole functions/
methods; even in SLiM there is no such thing as “block scope”. In fact, this scoping behavior is
identical to the way that user-defined functions in Eidos have their own private scope, and SLiM
events and callbacks can actually be thought of as a special kind of user-defined function that is
called automatically by SLiM, with a non-standard syntax for their declaration.
This design is not without drawbacks, however. The most obvious drawback is that in SLiM
there isn’t really such a thing as a global scope; there is only the local scope defined within events
and callbacks. Only the special variables defined by SLiM, such as sim, possess the privileged
status of being available everywhere (because SLiM sets them up before every callout).
SLiM provides some facilities to get around this problem. One such facility is the presence of
tag properties on many of SLiM’s classes, which can hold singleton integer values persistently
(and the tagF property on some SLiM classes can hold singleton float values, as well). Another is
the setValue()/getValue() facility provided by many of SLiM’s classes, which allows persistent
storage of arbitrarily named values that do not have to be singletons; this is much more flexible
and open-ended than the tag facility, but is also not as fast or as easy to use.
The other way to keep a value persistently is to define it as a constant rather than a variable,
using the Eidos function defineConstant(). In SLiM, the constants table is shared by all Eidos
interpreters, and therefore constants defined in one event or callback are available in all
subsequently executed code; SLiM’s scoping rules do not apply to defined constants. For values
that are in fact constant, this is straightforward and useful; SLiM models will often use
defineConstant() to define constants related to population sizes, locus lengths, etc., in the
initialize() callback of the model, and will then use those defined constants everywhere.
Because even constants can be undefined with the rm() function, however, it is even possible to
use this method to place non-constant values into the global namespace; when the value needs to
change, simply remove the previous constant definition with rm() (with the optional parameter
removeConstants=T); and then add a new constant definition with defineConstant(). This feels
TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

499

like a hack, since it violates the intended semantics of defineConstant(), but it is perfectly legal,
and in some cases is a good solution to the global scope problem.
One thing that it is impossible to do in SLiM is to persistently keep values of object type.
Values of object type cannot be placed into tag properties; they cannot be set as named values
with setValue(); and they cannot be defined as constants with defineConstant(). The only
object values that are persistent, in SLiM, are those set up by SLiM itself, such as sim. This is
deliberate and necessary, because otherwise stale references to objects that no longer exist in the
SLiM simulation could persist in variables. Since no persistent reference to an object can be
created, stale references can never exist in a SLiM model, which greatly simplifies object lifetime
management and error-checking issues. SLiM objects are only disposed of in between calls out to
Eidos, at points in time when no Eidos reference to them can possibly exist because of the design
of SLiM’s scoping and persistence rules. See section 2.8.2 of the Eidos manual for further
discussion.

TOC I TOC II WF nonWF initialize() Genome Individual Mutation SLiMSim Subpopulation
Eidos events fitness() mateChoice() modifyChild() recombination() interaction() reproduction()

500

25. SLiM reference sheet
This reference sheet may be downloaded as a separate PDF from http://messerlab.org/slim/.
Invoking SLiM at the command line:
slim -version | -usage | -testEidos | -testSLiM |
[-seed ] [-time] [-mem] [-Memhist] [-long] [-x] [-define ]
SLi M Manual

Navigation menu

Versions of this User Manual:

Views

Navigation