THUNDER V1 3 5 User Guide

User Manual:

Open the PDF directly: View PDF PDF.
Page Count: 9

DownloadTHUNDER V1 3 5 User Guide
Open PDF In BrowserView PDF
THUNDER v1.3.2 User Guide
February 5, 2018

1

Installation

1.1

Requirement of Installation

• C/C++ compiler supporting C++98 standard along with MPI wrapper
• cmake
We recommend gcc and Intel C/C++ compiler as C/C++ compiler. Moreover, gcc42 has been tested as the oldest supporting version of gcc. OpenMPI
and MPICH both can be used as MPI standard. In Tsinghua, we use openmpigcc43 as the C/C++ compiler for compiling THUNDER.
cmake is a tool for configuring source code for installation.
openmpi-gcc43 is open-source software, which can easily installed using yum
on CentOS and apt-get on Ubuntu. cmake has been already installed in most
Linux operating systems. If not, it can also be conveniently installed by yum on
CentOS and apt-get on Ubuntu.

1.2

Installing from Source Code

1.2.1

Preparation Before Configuring Source Code

Make sure cmake have been installed and correctly placed in environment. Thus,
cmake can correctly set up the environment for compiling THUNDER.
1.2.2

Configure Using cmake

In THUNDER source code directory, please type in the following commands for
configuring source code. install dir stands for where you want THUNDER to be
installed.
1
2
3

mkdir b u i l d
cd b u i l d
cmake −DCMAKE INSTALL PREFIX=” i n s t a l l d i r ” . .

1

1.2.3

Configuration Variables

You may configure the compilation of THUNDER with several variables.
THUNDER can be compiled into single-float precision version or double-float
precision version, by SINGLE PRECISION variable. The default version is singlefloat precision. However, you may force it compiling into double-float precision
version, by adding parameter -DSINGLE PRECISION=’off’ during configuring
using cmake.
THUNDER uses SIMD instructions for accelerating. When you compile
THUNDER, SIMD acceleration can be turned on or off by ENABLE SIMD variable. The default version is using SIMD instructions. However, you may force it
compiling into a non-SIMD version, by adding parameter -DENABLE SIMD=’off’
during configuring using cmake.
AVX256 and AVX512 SIMD instructions are currently supported by THUNDER. By default, AVX256 is enabled and AVX512 is disabled. You can manually enable or disable them by the variable ENABLE AVX256 and ENABLE AVX512,
respectively, by the same method as described above.
It is worth mentioned that you may check whether the CPUs and C/C++
compiler support AVX512 or not, before compiling THUNDER using AVX512.
For example, CPUs should be KNL or Xeon newer than Skylake. Meanwhile, if
you compile using GCC, please make sure it is newer than version 4.9.3. If you
compile with icc, please check up its support on AVX512.
1.2.4

Compile and Stage Binaries into Environment

Please type in the following command for compiling source code using 20 threads.
You may change the number after -j to be number of threads you desire for compiling.
1
2

make −j 2 0
make i n s t a l l

After compiling and installation, several folders will appear under the directory install dir: include containing header files, bin containing executable binaries, lib containing several libraries, script containing scripts needed and manual
containing this user guide. The compiled binaries are listed as
• thunder
• thunder average
• thunder genmask
• thunded lowpass
• thunder mask
• thunder postprocess

2

• thunder resize
.
For the purpose of convenience, you may stage binaries into environment.
For example, you may add the following command into shell configuration file
1

s e t e n v PATH= i n s t a l l d i r / b i n :$PATH

when csh or tcsh is used as shell. Meanwhile, you may add the following
command into shell configuration file when bash, zsh or ksh is used as shell.
1

e x p o r t PATH= i n s t a l l d i r / b i n :$PATH

After staging binaries into environment, you may directly access these binaries by typing their filenames in shell.

2

Submit Your Job

thunder is the core program of THUNDER. It executes 3D classification and
refinement. It reads in a JSON parameter file. After parsing the JSON parameter, it reads in initial model, a .thu file and particle images. It also reads in
mask if necessary.

2.1

Set Up .thu File

THUNDER uses .thu file for storing information of each particle image. .thu file
is a space-separate tabular file as each column stands for a specific variable, as
listed below.
1. Voltage (Volt)
2. DefocusU (Angstrom)
3. DefocusV (Angstrom)
4. DefocusTheta (Radian)
5. Cs (Angstrom)
6. Amplitude Constrast
7. Phase Shift (Radian)
8. Path of Particle
9. Path of Micrograph
10. Coordinate X in Micrograph (Pixel)
11. Coordinate Y in Micrograph (Pixel)
3

12. Group ID
13. Class ID
14. 1st Element of the Unit Quaternion
15. 2nd Element of the Unit Quaternion
16. 3rd Element of the Unit Quaternion
17. 4th Element of the Unit Quaternion
18. 1st Standard Deviation of Rotation
19. 2nd Standard Deviation of Rotation
20. 3rd Standard Deviation of Rotation
21. Translation X (Pixel)
22. Translation Y (Pixel)
23. Standard Deviation of Translation X (Pixel)
24. Standard Deviation of Translation Y (Pixel)
25. Defocus Factor
26. Standard Deviation of Defocus Factor
27. Score
.thu file is generated by thunder at the end of each iteration to save the
information of each particle image.
2.1.1

Generate .thu from Relion

.thu file can be converted from STAR file of Relion by script STAR 2 THU.py
and STAR 2 THU NO GROUP.py by the following commands.
1

1

python STAR 2 THU . py f i l e n a m e . s t a r > f i l e n a m e . thu
python STAR 2 THU NO GROUP . py f i l e n a m e . s t a r >
f i l e n a m e . thu

STAR 2 THU.py is used for converting STAR files containing group information and STAR 2 THU NO GROUP.py is used for converting those do not. You
can find these two scripts under directory install dir/script.
It is worth noticed that both of two scripting only convert CTF information
but not rotation and translation information. Thus, .thu files converted from
STAR files can be only used for global search stage of thunder. Meanwhile, .thu
files generated by thunder can be used for global search, local search and CTF
search. The precise meaning of global search, local search and CTF search will
be further discussed in detail in section 2.2.
4

2.1.2

Generate .thu from Frealign

The converting script will be provided soon.
2.1.3

Generate .thu from SPIDER

The converting script will be provided soon.

2.2

Configure with JSON Parameter File

thunder reads in a JSON file which is parsed into parameters of thunder. You
may change the values of the keys to fit your purpose. The definition of keys in
this JSON parameter file is listed in Table 1.
thunder divides 3D refinement into three stages: global search, local search
and CTF search. During global search, the rotation and translation result of
the previous iteration will not inherited into the next iteration. Meanwhile,
during local search, the rotation and translation of each particle image will be
adjust based on the result of the previous iteration. During CTF search, the
CTF parameters will be adjusted for achieving better resolution.
Meanwhile, 3D classification of thunder typically only involves global search.
You may find a demo version of this JSON parameter file named
demo.json under directory install dir/script.

2.3

Processes and Threads

thunder needs at least 3 processes. It has perfect linear speed-up when number of nodes increases. Thus, please use as many nodes as possible. We high
recommend assigning a node with only one process and using multiple cores
in each node by threads. For example, if you have 100 nodes and each node
has 20 cores, you may use 100 processes for running thunder, and each process
should generate 20 threads to achieve maximum usage of computing resource.
By changing the value of the key Number of Threads Per Process in the JSON
parameter file, you may set the number of threads of each process to which you
desire. In this example, this value should be set to 20.

2.4

Submit

Please examine whether you have generated the correct .thu file and configured
the JSON parameter file properly, and make sure that the initial model and mask
(if necessary) are placed in the right directory. Moreover, please check whether
the directory of the destination described in the JSON parameter
exists or not. Now, you can submit you job. You may leave it to the cluster
job managing software, or you may assign nodes manually by mpirun.

5

Key

Description

Number of Threads Per Process
2D or 3D Mode
Global Search
Local Search
CTF Search
Number of Classes
Size of Image
Pixel Size (Angstrom)
Radius of Mask on Images (Angstrom)
Estimated Translation (Pixel)
Initial Resolution (Angstrom)
Perform Global Search Under (Angstrom)
Symmetry
Initial Model
.thu File Storing Paths and CTFs of Images
Prefix of Particles
Prefix of Destination
Calculate FSC Using Core Region
Calculate FSC Using Masked Region
Particle Grading
Perform Reference Mask
Perform Reference Mask during Global Search
Provided Mask

the number of threads used
in each process
2D/3D classification or
refinement
whether to perform global
search or not
whether to perform local
search or not
whether to perform CTF
search or not
the number of density maps,
aka. more than 1 when
undergoing classification
the size of the images1
the pixel size of the images
the radius of mask you want
to be masked on the images
the standard deviation of
translation in pixel which may
occurred on the input images
the resolution the program
starts its iterations
the resolution limit for performing
global search
the symmetry of the macromolecular
to be processed
the initial model for
classification/refinement
the .thu file which stores the
information of where to read the
images and the CTF paramters of them
the prefix to be added in the path
of the particle image
the prefix (path) to save the
outcomes
whether to calculate FSC using core
region of the reference or not
whether to calculate FSC using masked
region of the reference or not
whether to turn on the particle grading
optimization or not
whether to mask on the density map
or not
whether to mask on the density map
during global search or not
the path of the mask if needed

6
Table 1: Key Words of JSON Parameter File

3

Get Your Result

A log file named thunder.log will appear in your submitting directory, recording
the state of your job.
In the destination directory, the density maps are outputted as Reference xxx A Round xxx.mrc
and Reference xxx B Round xxx.mrc, during 3D refinement or classification. For
example, the density map of the 5th reference of round 15 from hemisphere A
has the filename Reference 005 A Round 015.mrc. On contrast, the 5th reference
of round 15 from hemisphere B has the filename Reference 005 B Round 015.mrc
Meanwhile, during 2D classification, the density maps of each round are
stored in a MRC stack. For example, the density maps of round 15 has the
filename Reference Round 015.mrcs which contains N slices of images. N stands
for the number of classes.
FSC/FRCs are outputted as FSC Round xxx.txt. The first column of this file
is signal frequency in pixel. The seconds column is signal frequency in Angstrom.
From the third column to the rest of columns, the FSC of each reference is listed
in order.
During classification, the resolution and ratio of images of each class is listed
in a file named Class Info Round xxx.txt. Each row of this file stands for a class
in order. The first column is the index of each class, the second column is the
resolution in Angstrom of each class and the third column is the ratio of image
of each class.
The rotation and translation information of each particle at each iteration is
outputted as Meta Round xxx.thu, which follows the .thu file format. For example, rotation and translation of round 15 has the filename Meta Round 015.thu.

4

Typically Workflow

The typically workflow of cryo-EM single particle analysis includes 3 steps: 2D
classification, 3D classification and 3D refinement.

4.1

2D Classification

The first step of cryo-EM single particle analysis is 2D classification for removing
ice and ”noisy” particles.
You can find a demo version of this JSON parameter file for
2D classification named demo 2D.json under directory install dir/script.
There are some options worth noticed in this JSON parameter file. They are
listed below.
Local Search Performing local search or not will NOT affect the result of 2D
classification. However, it gives you a higher resolution density map for
examining the detail of the 2D density map. You may turn it off when
the computing resource is limited.

7

Number of Classes It stands for the number of classes you want the images
to be classified into.
Initial Resolution (Angstrom) It is recommended to start from lower resolution for achieving ideal result of classification.
Symmetry Symmetry has NO effect on 2D classification.
Initial Model It is recommended to use a blank initial model in 2D classification. Please leave it empty.
Calculate FSC Using Core Region It is not supported in 2D classification.
Please turn it off, otherwise a warning will be raised and thunder will turn
it off forcefully.
Calculate FSC Using Masked Region It is not supported in 2D classification. Please turn it off, otherwise a warning will be raised and thunder
will turn it off forcefully.
Particle Grading It is not recommended to use particle grading in 2D classification, because the importance of ”noisy” particles may be overlooked
when particle grading is turned on.
Performing Reference Mask It is NOT supported to use provided mask in
2D classification. If so, a fatal error will occur. Please turn it off.

4.2

3D Classification

The next step of cryo-EM single particle analysis is 3D classification for removing
particles belong to ”wrong” conformation.
You can find a demo version of this JSON parameter file for
3D classification named demo 3D.json under directory install dir/script.
There are some options worth noticed in this JSON parameter file. They are
listed below.
Local Search Performing local search or not will NOT affect the result of 3D
classification. However, it gives you a higher resolution density map for
examining the detail of the 2D density map. You may turn it off when
the computing resource is limited.
Number of Classes It stands for the number of classes you want the images
to be classified into.
Initial Resolution (Angstrom) It is recommended to start from lower resolution for achieving ideal result of classification.
Particle Grading It is NOT recommended to use particle grading in 3D classification, because the importance of ”noisy” particles may be overlooked
when particle grading is turned on.

8

4.3

3D Refinement

The final step of cryo-EM single particle analysis is 3D refinement for achieving
high resolution density map. You may turn on particle grading and CTF
search for obtaining more information in density map.
You can find a demo version of this JSON parameter file for 3D classification
named demo.json in this package. There are some options worth noticed in this
JSON parameter file. They are listed below.
CTF Search You can refine CTF parameters using CTF search. It may cost
some computing resource.
Particle Grading It is recommend to turn on particle grading in refinement.

9



Source Exif Data:
File Type                       : PDF
File Type Extension             : pdf
MIME Type                       : application/pdf
PDF Version                     : 1.5
Linearized                      : No
Page Count                      : 9
Producer                        : pdfTeX-1.40.15
Creator                         : TeX
Create Date                     : 2018:02:05 22:06:16+08:00
Modify Date                     : 2018:02:05 22:06:16+08:00
Trapped                         : False
PTEX Fullbanner                 : This is pdfTeX, Version 3.14159265-2.6-1.40.15 (TeX Live 2014) kpathsea version 6.2.0
EXIF Metadata provided by EXIF.tools

Navigation menu