AnalysisTools user manual
Karel Šindelka
k.sindelka@gmail.com
March 21, 2019
Contents
1 Introduction
2 Installation
3 Format of input/output files
  3.1 vsf structure file
  3.2 vcf coordinate file
    3.2.1 Ordered coordinate file
    3.2.2 Indexed coordinate file
  3.3 Aggregate file (agg)
4 Utilities
  4.1 AddToSystem
  4.2 AngleMolecules
  4.3 Aggregates and Aggregates-NotSameBeads
  4.4 Average
  4.5 BondLength
  4.6 Config
  4.7 DensityAggregates
  4.8 DensityBox
  4.9 DensityMolecules
  4.10 DihedralMolecules
  4.11 DistrAgg
  4.12 GenSystem
  4.13 GyrationAggregates
  4.14 GyrationMolecules
  4.15 JoinAggregates
  4.16 JoinRuns
  4.17 lmp_data
  4.18 PairCorrel
  4.19 PotentialAggregates
  4.20 SelectedVcf
  4.21 traject
  4.22 TransformVsf
5 Computational details
  5.1 Read system data
1. Introduction
This software package is a set of utilities for analysing trajectories of coarse-grained simulations. It was initially developed to work with the DL_MESO simulation
package. However, it works with vtf trajectory files, so it can be used to analyse
any vtf trajectories.
The emphasis is placed on assembly of molecules, i.e., self-assembly of polymers.
It therefore includes utilities to determine whether molecules are in aggregates and to calculate various properties of aggregates. The utilities only calculate the desired quantities
and write them to output text files; there is no plotting or visualisation.
Examples of the resulting data can be seen in the author’s thesis (Šindelka,
Karel: The study of the association behavior of the amphiphilic copolymers in solutions containing low molar compounds by means of computer simulations, dissertation thesis, Charles University, 2018) as well as in papers in journals with an impact factor
(e.g., doi:10.1039/C8CP05907A, 10.1134/S1811238217010052, or 10.1007/s00396-017-4090-0).
2. Installation
All programs can be compiled using cmake, which generates a Makefile, and by subsequently running make. Compilation requires C and FORTRAN compilers and should
be done in a separate directory, such as build.
To create the Makefile, run the following command (assuming build is a subdirectory of the AnalysisTools root directory), and then run make to compile all utilities:
cmake -G "Unix Makefiles" ../
The binaries will be in the bin subdirectory of build.
The cmake variable -DCMAKE_BUILD_TYPE=Debug can be used to compile a version
of the utilities for debugging.
To compile an individual C program using gcc, run (assuming you are in a subdirectory of the AnalysisTools root directory):
gcc -O3 -lm ../AnalysisTools.c ../Error.c ../Options.c ../Utility/program.c
3. Format of input/output files
All utilities read information about the system from vsf/vcf files (formatted as described below) and from the FIELD file (input file for the DL_MESO simulation package). All
system information is read from the vsf structure file (Section 3.1) and from the
vcf coordinate file (Section 3.2). A single vtf file containing both structure and coordinate
sections can also be used. The FIELD file is only used when bead charge and/or
mass is missing from the vsf structure file. The utilities consider only bead types
that are present in the vcf coordinate file (i.e., bead types present in the vsf file
but not in the vcf file are not seen by the utilities). How the system
data are read is described in Chapter 5.
Both vsf and vcf files can be generated using the traject utility provided by
the DL_MESO simulation package (traject-v2_5 and traject-v2_6 provided here
are modified versions from earlier DL_MESO simulation package versions).
All utilities assume a cuboid simulation box with dimensions from 0 to N, where
N is the side length of the box (which can be different in all three dimensions).
3.1 vsf structure file
The structure file contains all information about all beads and bonds except for their
Cartesian coordinates. The utilities are written for a vsf file created by the traject
utility (for DL_MESO versions from 2.5 to 2.7), but other vsf files should work as
long as they adhere to the following format.
A vsf file is divided into two parts. The first part contains bead definitions.
Each line contains the description of a single bead and follows these rules:
• the line starts with atom (or just a)
• the second string is a bead index number that starts from 0 and increases with
every subsequent line (the index in the last bead definition line is therefore the
total number of beads in the simulation minus one)
• the line contains the bead name, given after the name keyword
• the first bead definition line may contain default instead of the index number;
every bead that is not explicitly written in the bead definition lines has the
default name
• if the bead is in a molecule, the line contains the molecule name (after the resname keyword)
and the molecule id (after the resid keyword), which starts from 1
• mass and charge keywords are read if present (otherwise the mass and charge
of beads are read from FIELD)
• other keywords are ignored
The following is an example of bead definition lines containing all required data:
atom default name Bead_A
atom 0 name Bead_B
atom 3 name Bead_C resname Mol_A resid 1
atom 4 name Bead_C resname Mol_A resid 1
atom 6 name Bead_C resname Mol_A resid 2
atom 7 name Bead_C resname Mol_A resid 2
atom 8 name Bead_D resname Mol_B resid 3
atom 9 name Bead_D resname Mol_B resid 3
In this example, there are four bead types (named Bead_A, Bead_B, Bead_C,
and Bead_D) and 10 beads in all (with indices from 0 to 9). Beads with indices
1, 2, and 5 are of the default type (Bead_A). There are two molecule types named
Mol_A and Mol_B with molecule indices 1 to 3. All molecules with the same name
must have the same structure, i.e., the same number of beads and the same bond
connectivity.
The second part of a vsf file contains bond definitions and must be preceded
by a blank line. Each bond definition line follows these rules:
• the line starts with bond (or just b)
• a bond between two beads is specified by their indices separated by a colon
(there cannot be a space between the first number and the colon)
The following is an example of bond definition lines that complement the bead definition lines shown above:
bond 3: 4
# possible comment
bond 6: 7
bond 8: 9
In this example, there are three bonds: between beads with indices 3 and 4, 6 and 7,
and 8 and 9.
Blank lines and comments (lines beginning with #) are allowed in both parts of
the vsf file.
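The rules above can be illustrated with a short Python sketch. It is not part of AnalysisTools; the function name parse_vsf is hypothetical, and the sketch assumes that every keyword on a bead line is followed by exactly one value and that the molecule id follows a resid keyword.

```python
def parse_vsf(text):
    """Parse bead and bond definition lines of a vsf structure file (sketch)."""
    default_name = None
    beads = {}   # bead index -> dict with name and optional resname/resid/mass/charge
    bonds = []   # list of (index, index) pairs
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments are allowed in both parts
        tokens = line.split()
        if tokens[0] in ("atom", "a"):
            # assumption: keyword/value pairs follow the index (or 'default')
            fields = dict(zip(tokens[2::2], tokens[3::2]))
            bead = {"name": fields["name"]}
            if "resname" in fields:
                bead["resname"] = fields["resname"]
                bead["resid"] = int(fields["resid"])
            for key in ("mass", "charge"):
                if key in fields:
                    bead[key] = float(fields[key])
            if tokens[1] == "default":
                default_name = bead["name"]
            else:
                beads[int(tokens[1])] = bead
        elif tokens[0] in ("bond", "b"):
            # 'bond 3: 4' -> join the remaining tokens and split on the colon
            first, second = "".join(tokens[1:]).split(":")
            bonds.append((int(first), int(second)))
    # beads that are not listed explicitly get the default name
    for index in range(max(beads) + 1):
        beads.setdefault(index, {"name": default_name})
    return beads, bonds


EXAMPLE_VSF = """\
atom default name Bead_A
atom 0 name Bead_B
atom 3 name Bead_C resname Mol_A resid 1
atom 4 name Bead_C resname Mol_A resid 1
atom 6 name Bead_C resname Mol_A resid 2
atom 7 name Bead_C resname Mol_A resid 2
atom 8 name Bead_D resname Mol_B resid 3
atom 9 name Bead_D resname Mol_B resid 3

bond 3: 4
# possible comment
bond 6: 7
bond 8: 9
"""

beads, bonds = parse_vsf(EXAMPLE_VSF)
default_beads = sorted(i for i, b in beads.items() if b["name"] == "Bead_A")
```

For the example data, default_beads lists the indices of the beads that were not written explicitly, and bonds holds the three index pairs.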
3.2 vcf coordinate file
The coordinate file contains Cartesian coordinates of the beads and the size of
the cuboid simulation box. Coordinates are read from a vcf file containing either
ordered timesteps (Section 3.2.1) or indexed timesteps (Section 3.2.2).
An ordered vcf file must contain all beads defined in the vsf file, while an
indexed vcf file can contain only a subset of the defined beads. Both indexed and
ordered vcf files contain a line before every timestep specifying the file type:
timestep ordered or timestep indexed (the keyword timestep can be omitted).
In both ordered and indexed vcf files, the size of the simulation box is given by
a line pbc <x> <y> <z>, which is located before the first coordinate
block. Only timestep and pbc lines are read before the first coordinates (everything
else is ignored), so a vtf file can be used instead of a vcf file.
The vcf file may contain comment lines (beginning with #) and blank lines
between timesteps, but the coordinate block must be continuous.
3.2.1 Ordered coordinate file
Coordinate lines in an ordered vcf file contain only the Cartesian coordinates of the
beads in the form <x> <y> <z>. The beads are written in ascending
order of their indices as defined in the vsf file. The following is an example of an
ordered vcf file:
# any number of comments or blank lines
timestep ordered
pbc 10 10 10
0.0 0.0 0.0
0.5 0.5 0.5
...
# comments between timesteps
timestep ordered
# another comment
1.0 1.0 1.0
1.5 1.5 1.5
...
In this example, the simulation box is cubic with side length of 10. Beginnings
of two timesteps are represented by coordinates of the first two beads with indices
0 and 1 (as defined in the vsf file).
3.2.2 Indexed coordinate file
Indexed coordinate files contain not only Cartesian coordinates, but also bead
indices (preceding the coordinates). Therefore, an indexed timestep does not have
to contain all beads in the vsf structure file. Moreover, the beads do not have to
be ordered according to their ascending indices. The following is an example of an
indexed vcf file:
# any number of comments or blank lines
timestep indexed
pbc 10 10 10
2 0.5 0.5 0.5
21 0.0 0.0 0.0
...
# comments between timesteps
timestep indexed
# another comment
21 1.0 1.0 1.0
2 1.5 1.5 1.5
...
This example is similar to that for the ordered vcf file, but the two beads have
indices 2 and 21 instead of 0 and 1.
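As an illustration of the two timestep types, the following Python sketch reads both ordered and indexed coordinate blocks. It is not the reader used by AnalysisTools; the function name parse_vcf is hypothetical, and the sketch assumes well-formed input without the ... placeholders used in the examples.

```python
def parse_vcf(text):
    """Return (box, timesteps); each timestep maps bead index -> (x, y, z)."""
    box = None
    timesteps = []
    mode = None  # 'ordered' or 'indexed' once the first timestep line is seen
    for raw in text.splitlines():
        tokens = raw.split()
        if not tokens or tokens[0].startswith("#"):
            continue  # blank lines and comments
        if tokens[0] == "timestep":
            tokens = tokens[1:]  # the 'timestep' keyword itself is optional
        if not tokens:
            continue
        if tokens[0] in ("ordered", "indexed"):
            mode = tokens[0]
            timesteps.append({})
        elif tokens[0] == "pbc":
            box = tuple(float(value) for value in tokens[1:4])
        elif mode == "ordered":
            # bead index is implied by the position within the timestep
            step = timesteps[-1]
            step[len(step)] = tuple(float(value) for value in tokens[:3])
        elif mode == "indexed":
            # bead index precedes the coordinates
            step = timesteps[-1]
            step[int(tokens[0])] = tuple(float(value) for value in tokens[1:4])
        # anything else before the first timestep line is ignored
    return box, timesteps


ORDERED = """\
# any number of comments or blank lines
timestep ordered
pbc 10 10 10
0.0 0.0 0.0
0.5 0.5 0.5
# comments between timesteps
timestep ordered
1.0 1.0 1.0
1.5 1.5 1.5
"""

INDEXED = """\
timestep indexed
pbc 10 10 10
2 0.5 0.5 0.5
21 0.0 0.0 0.0
indexed
21 1.0 1.0 1.0
2 1.5 1.5 1.5
"""

ordered_box, ordered_steps = parse_vcf(ORDERED)
indexed_box, indexed_steps = parse_vcf(INDEXED)
```

In the ordered case the dictionary keys are 0 and 1; in the indexed case they are the indices written in the file (2 and 21), regardless of their order.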
3.3 Aggregate file (agg)
The aggregate file with the agg extension is generated using the Aggregates utility. The
file contains information about the number of aggregates in each timestep and which
molecules and monomeric beads belong to which aggregate. It serves as an additional
input file for utilities that calculate properties of whole aggregates; an agg file is
therefore linked to the vcf file that was used to generate it.
The agg file is a simple text file. The first line contains the command used to
generate it; parts of this command can be necessary for subsequent analysis of
aggregates. The second line is blank, and the data for individual
timesteps start on the third line. Each timestep follows these rules:
• each timestep starts with Step: followed by the step number (only the Step keyword is read by the
utilities)
• the second line contains the number of aggregates in the given timestep and
is followed by a blank line
• there are two lines for each aggregate:
(1) number of molecules in the aggregate followed by their indices taken from
the vsf file
(2) number of monomeric beads in the aggregate followed by their indices
taken from the vsf file
• no blank or comment lines are allowed inside the aggregate block
• all molecules present in the vcf file used to generate this file must be present
in every timestep; here, aggregate can also refer to dissolved molecules
The following is an example of an agg file:
Aggregates in.vcf 1 1 out.agg A

Step: 1
2

2 : 1 3
3 : 10 100 1000
1 : 2
1 : 20
Step: 2
1

3 : 1 2 3
4 : 10 20 100 2000
Last Step: 2
In this example, command Aggregates in.vcf 1 1 out.agg A was used to
generate the file (see Section 4.3 for details about this utility). There are two
timesteps here – the first contains two aggregates (although one of them is a free,
dissolved molecule) and the second a single aggregate. As an example, the aggregate
in the second step contains three molecules with indices 1, 2, and 3 (taken from the
vsf file) and four monomeric beads (i.e., solvent or counterions) with indices 10, 20,
100, and 2000 (again, taken from the vsf file).
Besides serving as input for further analysis by other utilities, the indices can be
used in VMD to visualize, e.g., only a specific aggregate.
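A reader for this layout can be sketched in a few lines of Python. This is only an illustration of the rules listed above, not the code used by the utilities; the function name parse_agg is hypothetical, and the blank-line placement in the sample data follows the rules as stated (a blank second line and a blank line after the aggregate count).

```python
def parse_agg(text):
    """Return (command, steps); steps maps step number -> list of aggregates.

    Each aggregate is a pair (molecule indices, monomeric bead indices).
    """
    lines = text.splitlines()
    command = lines[0]  # the command used to generate the file
    steps = {}
    i = 1
    while i < len(lines):
        line = lines[i].strip()
        if line.startswith("Step:"):
            step = int(line.split()[1])
            count = int(lines[i + 1])
            i += 3  # skip the 'Step:' line, the count, and the blank line
            aggregates = []
            for _ in range(count):
                # two lines per aggregate: '<number> : <indices>'
                molecules = [int(t) for t in lines[i].split(":")[1].split()]
                monomers = [int(t) for t in lines[i + 1].split(":")[1].split()]
                aggregates.append((molecules, monomers))
                i += 2
            steps[step] = aggregates
        else:
            i += 1  # blank lines and the closing 'Last Step:' line
    return command, steps


EXAMPLE_AGG = """\
Aggregates in.vcf 1 1 out.agg A

Step: 1
2

2 : 1 3
3 : 10 100 1000
1 : 2
1 : 20
Step: 2
1

3 : 1 2 3
4 : 10 20 100 2000
Last Step: 2
"""

command, steps = parse_agg(EXAMPLE_AGG)
```

The index lists obtained this way are the ones that could be passed on to a visualization tool to select a single aggregate.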
4. Utilities
All utilities have command line help with short description when -h argument is
used. Besides -h, most of the utilities have several standard command line options
that are the same. The standard options can be used with any utility unless stated
otherwise.
Standard options
-i        use custom vsf file instead of traject.vsf
-v        verbose output that provides information about all bead and
          molecule types
-V        detailed verbose output that provides information about all individual
          molecules as well as about bead and molecule types
-s        run silently, i.e., without any output at all (overrides -v and -V
          options)
--script  do not rewrite terminal line (useful if output is routed to a file)
-h        print help and exit
4.1 AddToSystem
This utility takes an existing system specified by vcf coordinate and vsf structure
files and adds new beads into it. The new beads replace neutral unbonded ones with
the lowest indices (as ordered in the vsf file) from the original system. If molecules
are added, AddToSystem places them at the end (for the sake of DL_MESO, which
requires molecules to be after unbonded beads). The utility generates vcf and vsf
files for the new system.
AddToSystem does not check whether there are enough unbonded neutral beads
to be replaced by the new beads (if not, the utility will either crash or run forever).
The coordinates of the new unbonded beads are governed by the -ld, -hd, and -bt
options. For molecules, only the first bead obeys these options; the coordinates of
the remaining beads in a molecule are governed by the provided coordinates. The
molecules are added with a random orientation.
If the -ld and/or -hd options are used, they must be accompanied by the -bt option.
The structure and number of added molecules and monomeric beads are read
from a FIELD-like file. This file must contain a species section followed by a molecule
section, as described in the DL_MESO simulation package.
The species section contains the number of bead types and their properties:
species <number of bead types>
<name> <mass> <charge> <number of unbonded beads>
The first line must start with the species keyword followed by the number of bead
types. For each bead type, a single line must contain the name of the beads, their
mass and charge, and the number of these beads that are not in a molecule (i.e.,
unbonded or monomeric beads).
The molecule section, which must follow the species section, contains information about the structure and numbers of molecules to be added:
molecule <number>           number of types of molecules
<name>                      name of the first molecule type
nummols <number>            number of these molecules
beads <number>              number of beads in these molecules
<bead name> <x> <y> <z>     a line for each of the beads
...                         specifying bead name and Cartesian coordinates
bonds <number>              number of bonds in these molecules
<string> <index> <index>    a line for each of the bonds
...                         containing arbitrary string and indices of connected beads
...                         anything beyond here is ignored
finish                      description of a molecule is finished
The molecule keyword specifies the number of molecule types, that is, the
number of finish keywords that must be present. Every bead name used in the beads lines must be
present in the species section. The arbitrary string in the bond lines is ignored by
AddToSystem (it is a relic from the DL_MESO simulation package, where it
specifies the type of bond). The indices in bond lines run from 1 to the number of
beads in the molecule and are ordered according to the bead lines of the section.
Because the molecule section in the FIELD file from DL_MESO can also include bond
angles and dihedral angles, anything beyond the last bond line is ignored (until the
finish keyword is read).
If no molecules are to be added, the line molecule 0 must still be present in the
file.
The following is an example of the FIELD-like file:
species 3
A 1.0 1.0 0
B 1.0 0.0 0
CI 1.0 -1.0 30
molecule 2
Dimer
nummols 10
beads 2
A 0.0 0.0 0.0
A 0.5 0.5 0.5
bonds 1
harm 1 2
finish
surfact
nummols 10
beads 3
A 0.0 0.0 0.0
B 0.0 0.0 0.0
B 0.5 0.5 0.5
bonds 2
harm 1 2
harm 2 3
angles 1
harm 1 2 3
finish
In this example, 30 unbonded (or monomeric) negatively charged beads called CI
are added as well as 20 molecules – 10 molecules called Dimer and 10 molecules
called surfact. Dimer molecules contain two A beads and one bond each; surfact
molecules contain three beads and two bonds each. The part starting with angles
and ending with finish is ignored. All in all, 80 beads are added – 30 CI, 30 A, and
20 B beads.
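The bead count quoted above can be checked with a small Python sketch that walks through the species and molecule sections of such a FIELD-like file. It is an illustration only, not the reader used by AddToSystem; the function name count_added_beads is hypothetical, and the sketch assumes a well-formed file with one value per keyword.

```python
from collections import Counter


def count_added_beads(text):
    """Count beads per bead type described by a FIELD-like file (sketch)."""
    lines = [line.split() for line in text.splitlines() if line.strip()]
    counts = Counter()
    n_types = int(lines[0][1])  # 'species' line: number of bead types
    for name, _mass, _charge, unbonded in lines[1:1 + n_types]:
        counts[name] += int(unbonded)  # beads that are not in a molecule
    pos = 1 + n_types
    n_mol_types = int(lines[pos][1])  # 'molecule' line: number of molecule types
    pos += 1
    for _ in range(n_mol_types):
        pos += 1  # skip the molecule name line
        nummols = int(lines[pos][1])
        n_beads = int(lines[pos + 1][1])
        for tokens in lines[pos + 2:pos + 2 + n_beads]:
            counts[tokens[0]] += nummols  # one such bead in every molecule
        pos += 2 + n_beads
        while lines[pos][0] != "finish":  # bonds, angles, etc. are skipped here
            pos += 1
        pos += 1
    return counts


EXAMPLE_FIELD = """\
species 3
A 1.0 1.0 0
B 1.0 0.0 0
CI 1.0 -1.0 30
molecule 2
Dimer
nummols 10
beads 2
A 0.0 0.0 0.0
A 0.5 0.5 0.5
bonds 1
harm 1 2
finish
surfact
nummols 10
beads 3
A 0.0 0.0 0.0
B 0.0 0.0 0.0
B 0.5 0.5 0.5
bonds 2
harm 1 2
harm 2 3
angles 1
harm 1 2 3
finish
"""

counts = count_added_beads(EXAMPLE_FIELD)
total = sum(counts.values())
```

For the example file, the A beads come from 10 Dimer molecules (two each) and 10 surfact molecules (one each), the B beads from the surfact molecules (two each), and the CI beads from the species section.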
The utility creates the vcf and vsf files with the new system and can also write
the coordinates into an xyz file.
Usage:
AddToSystem
Mandatory arguments
input coordinate file (either vcf or vtf format)
FIELD-like file specifying additions to the system
output vcf coordinate file for the new system
output vsf structure file for the new system
Non-standard options
-st   timestep to add new beads to (default: 1)
-xyz  save coordinates to xyz file
-ld   lowest distance from beads specified by -bt option
-hd   highest distance from beads specified by -bt option
-bt   bead types to use in conjunction with -ld and/or -hd options
4.2 AngleMolecules
This utility calculates angles between beads in each molecule of specified molecule
type(s). The beads do not have to be connected, so the angle does not have to be
between two bonds.
The angle is specified by three bead indices derived from the vsf file (-n option).
These indices run from 1 to N, where N is the number of beads in the molecule
type. Generally, the numbering of beads inside a molecule follows the
first molecule of the given type in the vsf file. For example, assume that beads of the
first molecule called mol in the vsf file are ordered A (vsf index 123), B (vsf index
124), C (vsf index 200). Then, bead A is 1, bead B is 2, and C is 3.
More than one angle can be specified (i.e., a multiple of three numbers has to
be supplied to the -n option). For example, if indices 1 2 3 are specified
(the default if the -n option is not used), the angle is between the lines defined by beads
with indices 1 2 and 2 3. The angle is calculated in degrees and lies between 0 and
180°.
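The index convention and the angle itself can be sketched in Python as follows. This is a minimal illustration, not the code of AngleMolecules: the function name angle_degrees, the coordinates, and the variable names are hypothetical, the angle is assumed to be measured at the middle bead of the trio, and periodic boundary conditions are ignored.

```python
import math


def angle_degrees(a, b, c):
    """Angle at bead b between the lines b-a and b-c, in degrees (0 to 180)."""
    u = [a[k] - b[k] for k in range(3)]
    v = [c[k] - b[k] for k in range(3)]
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(x * x for x in v))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding errors
    return math.degrees(math.acos(cosine))


# vsf indices of the beads in one molecule of type mol, in vsf order
molecule_vsf_indices = [123, 124, 200]
# hypothetical coordinates keyed by vsf index
coordinates = {
    123: (1.0, 0.0, 0.0),
    124: (0.0, 0.0, 0.0),
    200: (0.0, 1.0, 0.0),
}
# the trio uses in-molecule indices that run from 1 to N
trio = (1, 2, 3)
a, b, c = (coordinates[molecule_vsf_indices[i - 1]] for i in trio)
angle = angle_degrees(a, b, c)
straight = angle_degrees((-1.0, 0.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0))
```

The in-molecule indices 1, 2, and 3 are translated to vsf indices 123, 124, and 200 before the coordinates are looked up, mirroring the example in the text.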
The utility calculates distribution of angles for each specified trio of bead indices
for each molecule type and prints overall averages at the end of