User Guide For PyDPI 1.0

Dongsheng Cao
© 2012 China Computational Biology Drug Design Group
Table of Contents
1. What is this?
2. Install the PyDPI package
3. Working on drug molecules
  3.1. Read single molecules
  3.2. Download molecules from corresponding ID
  3.3. Calculating molecular descriptors
  3.4. Molecular fingerprints and chemoinformatics
    3.4.1. Daylight-type fingerprints
    3.4.2. MACCS keys and FP4 fingerprints
    3.4.3. E-state fingerprints
    3.4.4. Atom pairs and topological torsions
    3.4.5. Morgan fingerprints
    3.4.6. Using PyDrug object
    3.4.7. Fingerprint similarity
4. Working on protein sequences
  4.1. Download proteins from Uniprot
  4.2. Download the property from the AAindex database
  4.3. Calculating protein descriptors
5. Interaction representation
  5.1. Protein-protein interaction descriptors
  5.2. Protein-ligand interaction descriptors
Appendix
1. What is this?
This document is intended to provide an overview of how one can use the PyDPI functionality from Python. It's not comprehensive and it's not a manual.
If you find mistakes, or have suggestions for improvements, please either fix them yourself in the source document (the .py file) or send them to the mailing list: oriental-cds@hotmail.com
2. Install the PyDPI package
PyDPI has been successfully tested on Linux and Windows systems. Users can download the package from https://sourceforge.net/projects/pydpicao/ (.zip and .tar.gz). Installing PyDPI is straightforward:
On Windows:
(1): download the pydpi package (.zip)
(2): extract or uncompress the .zip file
(3): cd pydpi-1.0
(4): python setup.py install
On Linux:
(1): download the pydpi package (.tar.gz)
(2): tar -zxf pydpi-1.0.tar.gz
(3): cd pydpi-1.0
(4): python setup.py install or sudo python setup.py install
Once the PyDPI package is installed, you can test whether the installation succeeded.
If the test functions all run correctly, the PyDPI package is successfully installed.
Note that your computer must be connected to the Internet for the test.
3. Working on drug molecules
3.1. Read single molecules
The majority of the basic drug molecule functionality is found in the pydrug module.
Individual molecules can be constructed using a variety of approaches.
PyDPI allows users to provide molecules in different formats.
All of these functions return a Mol object on success.
3.2. Download molecules from corresponding ID
PyDPI allows the user to download molecules by providing their IDs from sources such as CAS, NCBI, KEGG, EBI and Drugbank.
By providing an ID for aspirin, we can conveniently download its SMILES string.
We can also download a molecule by constructing a PyDrug object, which contains the majority of the basic drug molecule functionality.
You can read a molecule by providing a Drugbank ID.
3.3. Calculating molecular descriptors
The PyDPI package can calculate a large number of molecular descriptors, including constitutional descriptors, topological descriptors, connectivity indices, E-state indices, autocorrelation descriptors, charge descriptors, molecular properties, kappa shape indices, MOE-type descriptors, and molecular fingerprints. These descriptors capture and magnify distinct aspects of chemical structures.
Once we have read a Mol object, we can easily calculate these molecular descriptors.
Example 1: Calculating molecular constitutional descriptors
We can calculate any single constitutional descriptor by calling the corresponding function.
We can also calculate all 30 descriptors by calling the GetConstitutional function. The result is returned as a dictionary.
Example 2: Calculating topology descriptors
The PyDPI package can calculate 25 topology descriptors. For detailed information on these descriptors, refer to Table S2 in the Appendix and their introductions in the Manual.
Example 3: Calculating molecular connectivity indices
Example 4: Calculating molecular properties
Example 5: Calculating Kappa shape descriptors
Example 6: Calculating charge descriptors
Example 7: Calculating descriptors using PyDrug object
An easier way to calculate molecular descriptors is to create a PyDrug object and then call its methods. The PyDrug class contains the majority of the drug molecule functionality.
3.4. Molecular fingerprints and chemoinformatics
In the PyDPI package, there are seven types of molecular fingerprints which are defined by abstracting
and magnifying different aspects of molecular topology.
3.4.1. Daylight-type fingerprints
We can calculate the similarity between two molecules by specifying a type of similarity measure.
Nine types of similarity measures are available for calculating the similarity between two molecules.
3.4.2. MACCS keys and FP4 fingerprints
Note that the input of MACCS and FP4 is different.
3.4.3. E-state fingerprints
3.4.4. Atom pairs and topological torsions
3.4.5. Morgan fingerprints
3.4.6. Using PyDrug object
A convenient way to calculate the fingerprints is to create a PyDrug object and call its GetFingerprint method.
3.4.7. Fingerprint similarity
We can calculate the similarity between any two fingerprints using the nine available similarity measures.
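As a rough illustration of what such a similarity measure computes, the following plain-Python sketch implements two widely used coefficients, Tanimoto and Dice. It does not use PyDPI or RDKit. It assumes each fingerprint is represented as a set of on-bit positions; the function names and the example bit sets are hypothetical and chosen only for this illustration.

```python
def tanimoto(bits_a, bits_b):
    """Tanimoto coefficient: shared on-bits divided by on-bits set in either fingerprint."""
    a, b = set(bits_a), set(bits_b)
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def dice(bits_a, bits_b):
    """Dice coefficient: twice the shared on-bits divided by the total on-bit count."""
    a, b = set(bits_a), set(bits_b)
    total = len(a) + len(b)
    return 2 * len(a & b) / total if total else 0.0


# Hypothetical fingerprints given as sets of on-bit positions (not real molecules).
fp1 = {1, 5, 9, 12}
fp2 = {1, 5, 7, 12, 20}

print(tanimoto(fp1, fp2))  # 3 shared bits out of 6 distinct bits
print(dice(fp1, fp2))      # 2 * 3 shared bits out of 4 + 5 on-bits
```

Both coefficients range from 0 (no shared bits) to 1 (identical fingerprints); Dice weights the shared bits more heavily than Tanimoto for the same pair.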
4. Working on protein sequences
4.1. Download proteins from Uniprot
You can get a protein sequence from the Uniprot website by providing a Uniprot ID.
You can get the sub-sequences of length window × 2 + 1 whose central residue is the given amino acid ToAA.
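The windowing idea can be sketched in plain Python as follows. This is not PyDPI's implementation: the function name `get_subsequences`, its parameter names, and the example sequence are hypothetical, and the sketch assumes that occurrences too close to either end of the sequence (without a full window on both sides) are skipped.

```python
def get_subsequences(sequence, to_aa="S", window=3):
    """Return every stretch of length 2 * window + 1 centred on residue `to_aa`."""
    result = []
    for i, residue in enumerate(sequence):
        # Keep only centres that have a full window on both sides.
        if residue == to_aa and i - window >= 0 and i + window < len(sequence):
            result.append(sequence[i - window:i + window + 1])
    return result


# Made-up example sequence; only the middle "S" has three residues on each side.
print(get_subsequences("MKSAGTSLLPQSW", to_aa="S", window=3))  # ['AGTSLLP']
```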
You can also get several protein sequences by providing a file containing the Uniprot IDs of these proteins.
The downloaded protein sequences have been saved in "/home/orient/res.txt".
The user can also download a PDB file by providing the corresponding PDB ID, and then extract its amino acid sequence.
The downloaded protein has been saved in "/home/orient/1atp.pdb".
You can check whether the input sequence is a valid protein sequence.
The output is the number of amino acids in the protein sequence if it is valid; otherwise 0.
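A validity check of this kind might look like the sketch below. It is written in plain Python and is not PyDPI's own code: the function name `check_protein` is hypothetical, and the sketch assumes that "valid" means every residue is one of the 20 standard amino acid letters and that the returned number is the sequence length.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def check_protein(sequence):
    """Return the sequence length if all residues are standard amino acids, else 0."""
    sequence = sequence.strip().upper()
    if sequence and all(residue in AMINO_ACIDS for residue in sequence):
        return len(sequence)
    return 0


print(check_protein("ACDK"))  # 4
print(check_protein("ACDX"))  # 0, because X is not a standard amino acid
```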
4.2. Download the property from the AAindex database
You can get the properties of amino acids from the AAindex database by providing a property name (e.g., KRIW790103). The output is returned as a dictionary.
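To show what such a dictionary might contain, here is a minimal plain-Python sketch that reads one entry in the aaindex1 flat-file layout and maps each amino acid to its value. It is not PyDPI's parser. The function name `parse_aaindex1_entry`, the accession `EXAMPLE001`, and all numeric values are placeholders invented for this illustration and are not taken from the AAindex database.

```python
# Placeholder entry in the aaindex1 layout: an "H" accession line, an "I" header
# line naming two residues per column, then two rows of ten values each.
ENTRY = """\
H EXAMPLE001
D Placeholder property (made-up values, not from AAindex)
I    A/L     R/K     N/M     D/F     C/P     Q/S     E/T     G/W     H/Y     I/V
     1.0     2.0     3.0     4.0     5.0     6.0     7.0     8.0     9.0    10.0
    11.0    12.0    13.0    14.0    15.0    16.0    17.0    18.0    19.0    20.0
//
"""


def parse_aaindex1_entry(text):
    """Return (accession, {amino acid: value}) for a single aaindex1-style entry."""
    lines = text.strip().splitlines()
    accession, values = None, {}
    for i, line in enumerate(lines):
        if line.startswith("H "):
            accession = line[2:].strip()
        elif line.startswith("I "):
            pairs = line.split()[1:]                       # e.g. ["A/L", "R/K", ...]
            first_row = [p.split("/")[0] for p in pairs]   # residues of the first value row
            second_row = [p.split("/")[1] for p in pairs]  # residues of the second value row
            numbers = lines[i + 1].split() + lines[i + 2].split()
            for aa, number in zip(first_row + second_row, numbers):
                values[aa] = None if number == "NA" else float(number)
    return accession, values


accession, props = parse_aaindex1_entry(ENTRY)
print(accession, props["A"], props["L"])
```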
If the user provides the directory containing the AAindex database, the program will read the given database to get the property. The AAindex database can be downloaded from ftp://ftp.genome.jp/pub/db/community/aaindex/ and consists of three files: aaindex1, aaindex2 and aaindex3.
Note that the PyDPI package already includes the AAindex database. The GetAAIndex1 method in AAIndex gets the property from the aaindex1 database.
If the user does not provide the directory containing the AAindex database, the program will download the three databases (i.e., aaindex1, aaindex2 and aaindex3) to obtain the property. Note that the downloaded AAindex files will be saved in the current directory.
You can also specify the directory according to your needs.
The downloaded databases are saved on the F drive. The GetAAIndex23 method in AAIndex gets the property from the aaindex2 and aaindex3 databases.
4.3. Calculating protein descriptors
There are two ways to calculate protein descriptors in the PyDPI package. One is to call the corresponding functions directly; the other is to first construct a PyPro object and then call its methods to obtain the protein descriptors. Note that the output is a dictionary whose keys and values are the descriptor names and the descriptor values, respectively, so the meaning of each descriptor is clear to the user.
Use functions:
We can also compute various types of descriptors based on the PDB format.
Use GetProDes class:
Example 1: Calculating amino acid composition descriptors
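The idea behind amino acid composition can be sketched in plain Python without PyDPI. The function name `amino_acid_composition` and the example sequence are hypothetical; the sketch assumes the composition is reported as a percentage rounded to three decimals, which is an assumption and not necessarily what PyDPI returns.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # one-letter codes of the 20 standard amino acids


def amino_acid_composition(sequence):
    """Percentage of each of the 20 standard amino acids in `sequence`."""
    n = len(sequence)
    return {aa: round(100.0 * sequence.count(aa) / n, 3) for aa in AMINO_ACIDS}


aac = amino_acid_composition("AACD")  # made-up four-residue sequence
print(aac["A"], aac["C"], aac["D"], aac["E"])  # 50.0 25.0 25.0 0.0
```

The result is a dictionary with one key per amino acid, mirroring the name-to-value dictionary form described above.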
Example 2: Calculating Moran autocorrelation descriptors
Example 3: Calculating pseudo amino acid composition descriptors
Changing the values of lamda and weight yields different PAAC values. Note that the number of PAAC descriptors depends on the choice of lamda. If lamda = 10, we obtain 20 + lamda = 30 PAAC descriptors.
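The dependence on lamda and weight can be seen in the following simplified sketch of Chou's type 1 pseudo amino acid composition. It is plain Python and is not PyDPI's implementation: it uses a single property instead of several, the function name `pseudo_aac` is hypothetical, the default weight of 0.05 is an assumption, and the property values are placeholders rather than real physicochemical data.

```python
from math import sqrt

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def pseudo_aac(sequence, prop, lamda=10, weight=0.05):
    """Simplified type 1 PAAC using one property; returns 20 + lamda values."""
    n = len(sequence)
    if n <= lamda:
        raise ValueError("the sequence must be longer than lamda")
    # Standardise the property over the 20 amino acids (zero mean, unit variance).
    mean = sum(prop[aa] for aa in AMINO_ACIDS) / 20.0
    std = sqrt(sum((prop[aa] - mean) ** 2 for aa in AMINO_ACIDS) / 20.0)
    norm = {aa: (prop[aa] - mean) / std for aa in AMINO_ACIDS}
    # Sequence-order correlation factors theta_1 .. theta_lamda.
    thetas = []
    for k in range(1, lamda + 1):
        diffs = [(norm[sequence[i]] - norm[sequence[i + k]]) ** 2 for i in range(n - k)]
        thetas.append(sum(diffs) / (n - k))
    freqs = [sequence.count(aa) / n for aa in AMINO_ACIDS]
    denom = sum(freqs) + weight * sum(thetas)
    # First 20 values: weighted composition; last lamda values: weighted thetas.
    return [f / denom for f in freqs] + [weight * t / denom for t in thetas]


# Placeholder property values (0.0 to 19.0), not real physicochemical data.
placeholder = {aa: float(i) for i, aa in enumerate(AMINO_ACIDS)}
paac = pseudo_aac("ACDEFGHIKLMNPQRSTVWYACDEF", placeholder, lamda=10)
print(len(paac))  # 30 = 20 + lamda
```

Because every value is divided by the same denominator, the 20 + lamda values sum to 1, and a larger weight shifts more of that total onto the sequence-order terms.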
Example 4: Calculating all protein descriptors
The PyPro class includes a built-in method which can calculate all protein descriptors.
Example 5: Calculating protein descriptors based on the user-defined property
The user can provide a property as a Python dictionary. PyDPI can then calculate the descriptors based on this user-defined property.
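As an illustration of how a property dictionary feeds into a descriptor, the sketch below computes a Moran autocorrelation value for one lag from a user-supplied dictionary. It follows the standard Moran formula in plain Python and is not PyDPI's code; the function name `moran_autocorrelation`, the two-letter property dictionary, and the example sequence are hypothetical.

```python
def moran_autocorrelation(sequence, prop, lag):
    """Moran autocorrelation of the property values along the sequence at one lag."""
    values = [prop[aa] for aa in sequence]
    n = len(values)
    if not 0 < lag < n:
        raise ValueError("lag must be between 1 and len(sequence) - 1")
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / n
    if variance == 0:
        raise ValueError("the property is constant along the sequence")
    # Average product of deviations for residues that are `lag` positions apart.
    covariance = sum(
        (values[i] - mean) * (values[i + lag] - mean) for i in range(n - lag)
    ) / (n - lag)
    return covariance / variance


# Hypothetical user-defined property covering only the residues used below.
user_property = {"A": 1.0, "C": -1.0}
print(moran_autocorrelation("ACACAC", user_property, lag=1))  # -1.0
print(moran_autocorrelation("ACACAC", user_property, lag=2))  # 1.0
```

In the alternating example, neighbours always differ (lag 1 gives -1.0) while residues two positions apart always match (lag 2 gives 1.0).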
Example 6: Calculating protein descriptors based on the property from AAindex
A powerful feature of PyDPI is that it can easily calculate thousands of protein features by automatically obtaining the needed properties from AAindex.
5. Interaction representation
5.1. Protein-protein interaction descriptors
5.2. Protein-ligand interaction descriptors
Appendix:
Table S1 List of propy computed features for protein sequences
Feature group                  Features                                    Number of descriptors
Amino acid composition         Amino acid composition                      20
                               Dipeptide composition                       400
                               Tripeptide composition                      8000
Autocorrelation                Normalized Moreau-Broto autocorrelation     240 (a)
                               Moran autocorrelation                       240 (a)
                               Geary autocorrelation                       240 (a)
CTD                            Composition                                 21
                               Transition                                  21
                               Distribution                                105
Conjoint triad                 Conjoint triad features                     343
Quasi-sequence order           Sequence order coupling number              60
                               Quasi-sequence order descriptors            100
Pseudo amino acid composition  Pseudo amino acid composition               50 (b)
                               Amphiphilic pseudo amino acid composition   50 (c)
(a) The number depends on the number of amino acid properties chosen and on the maximum value of the lag. The default is to use eight types of properties and lag = 30.
(b) The number depends on the number of sets of amino acid properties chosen and on the lamda value. The default is to use the three types of properties proposed by Chou et al. and lamda = 30.
(c) The number depends on the choice of the lamda value. The default is lamda = 30.
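Several of the counts in Table S1 follow from simple combinatorics, as the plain-Python sketch below shows. The variable names are hypothetical. The grouping of amino acids into 7 classes for the conjoint triad is background knowledge about that method and is not stated in this guide; the autocorrelation and PAAC counts use the defaults given in footnotes (a) and (b).

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Every ordered pair or triple of the 20 standard amino acids.
dipeptides = ["".join(p) for p in product(AMINO_ACIDS, repeat=2)]
tripeptides = ["".join(p) for p in product(AMINO_ACIDS, repeat=3)]

# Conjoint triad: ordered triples of amino acid classes (7 classes assumed).
conjoint_triads = list(product(range(1, 8), repeat=3))

# Autocorrelation: one value per property and lag (defaults: 8 properties, lag = 30).
n_properties, max_lag = 8, 30
n_autocorrelation = n_properties * max_lag

# Pseudo amino acid composition: 20 + lamda (default lamda = 30).
lamda = 30
n_paac = 20 + lamda

print(len(dipeptides), len(tripeptides), len(conjoint_triads))  # 400 8000 343
print(n_autocorrelation, n_paac)                                # 240 50
```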
Table S2 List of PyDPI computed descriptors for small molecules
Molecular descriptors
Constitutional descriptors
1 (a) Weight Molecular weight
2 nhyd Count of hydrogen atoms
3 nhal Count of halogen atoms
4 (a) nhet Count of hetero atoms
5 (a) nhev Count of heavy atoms
6 ncof Count of F atoms
7 ncocl Count of Cl atoms
8 ncobr Count of Br atoms
9 ncoi Count of I atoms
10 ncarb Count of C atoms
11 nphos Count of P atoms
12 nsulph Count of S atoms
13 noxy Count of O atoms
14 nnitro Count of N atoms
15 (a) nring Number of rings
16 (a) nrot Number of rotatable bonds
17 (a) ndonr Number of H-bond donors
18 (a) naccr Number of H-bond acceptors
19 nsb Number of single bonds
20 ndb Number of double bonds
21 ntb Number of triple bonds
22 naro Number of aromatic bonds
23 nta Number of all atoms
24 AWeight Average molecular weight
25-30 PC1, PC2, PC3, PC4, PC5, PC6 Molecular path counts of length 1-6
Topological descriptors
1 W Wiener index
2 AW Average Wiener index
3 (a) J Balaban's J index
4 Thara Harary number
5 Tsch Schultz index
6 Tigdi Graph distance index
7 Platt Platt number
8 Xu Xu index
9 Pol Polarity number
10 Dz Pogliani index
11 (a) Ipc Ipc index
12 (a) BertzCT BertzCT
13 GMTI Gutman molecular topological index based on simple vertex degree
14-15 ZM1, ZM2 Zagreb index with order 1-2
16-17 MZM1, MZM2 Modified Zagreb index with order 1-2
18 Qindex Quadratic index
19 diametert Largest value in the distance matrix
20 radiust radius based on topology
21 petitjeant Petitjean based on topology
22 Sito the logarithm of the simple topological index by Narumi
23 Hato harmonic topological index proposed by Narumi
24 Geto Geometric topological index by Narumi
25 Arto Arithmetic topological index by Narumi
Connectivity descriptors
1-11 (a) $^{0}\chi^{v}$, $^{1}\chi^{v}$, $^{2}\chi^{v}$, $^{3}\chi^{v}_{p}$, $^{4}\chi^{v}_{p}$, $^{5}\chi^{v}_{p}$, $^{6}\chi^{v}_{p}$, $^{7}\chi^{v}_{p}$, $^{8}\chi^{v}_{p}$, $^{9}\chi^{v}_{p}$, $^{10}\chi^{v}_{p}$ Valence molecular connectivity Chi index for path order 0-10
12 $^{3}\chi^{v}_{c}$ Valence molecular connectivity Chi index for three cluster
13 $^{4}\chi^{v}_{c}$ Valence molecular connectivity Chi index for four cluster
14 $^{4}\chi^{v}_{pc}$ Valence molecular connectivity Chi index for path/cluster
15-18 $^{3}\chi^{v}_{CH}$, $^{4}\chi^{v}_{CH}$, $^{5}\chi^{v}_{CH}$, $^{6}\chi^{v}_{CH}$ Valence molecular connectivity Chi index for cycles of 3-6
19-29 (a) $^{0}\chi$, $^{1}\chi$, $^{2}\chi$, $^{3}\chi_{p}$, $^{4}\chi_{p}$, $^{5}\chi_{p}$, $^{6}\chi_{p}$, $^{7}\chi_{p}$, $^{8}\chi_{p}$, $^{9}\chi_{p}$, $^{10}\chi_{p}$ Simple molecular connectivity Chi indices for path order 0-10
30 $^{3}\chi_{c}$ Simple molecular connectivity Chi indices for three cluster
31 $^{4}\chi_{c}$ Simple molecular connectivity Chi indices for four cluster
32 $^{4}\chi_{pc}$ Simple molecular connectivity Chi indices for path/cluster
33-36 $^{3}\chi_{CH}$, $^{4}\chi_{CH}$, $^{5}\chi_{CH}$, $^{6}\chi_{CH}$ Simple molecular connectivity Chi indices for cycles of 3-6
37 mChi1 mean chi1 (Randic) connectivity index
38 knotp the difference between chi3c and chi4pc
39 dchi0 the difference between chi0v and chi0
40 dchi1 the difference between chi1v and chi1
41 dchi2 the difference between chi2v and chi2
42 dchi3 the difference between chi3v and chi3
43 dchi4 the difference between chi4v and chi4
44 knotpv the difference between chiv3c and chiv4pc
Kappa descriptors
1 $^{1}\kappa_{\alpha}$ Kappa alpha index for 1 bonded fragment
2 $^{2}\kappa_{\alpha}$ Kappa alpha index for 2 bonded fragment
3 $^{3}\kappa_{\alpha}$ Kappa alpha index for 3 bonded fragment
4 phi Kier molecular flexibility index
5 (a) $^{1}\kappa$ Molecular shape Kappa index for 1 bonded fragment
6 (a) $^{2}\kappa$ Molecular shape Kappa index for 2 bonded fragment
7 (a) $^{3}\kappa$ Molecular shape Kappa index for 3 bonded fragment
Burden descriptors
1-16 bcutm1-16 Burden descriptors based on atomic mass
17-32 bcutv1-16 Burden descriptors based on atomic volumes
33-48 bcute1-16 Burden descriptors based on atomic electronegativity
49-64 bcutp1-16 Burden descriptors based on polarizability
Basak information descriptors
1 IC0 Information content with order 0 proposed by Basak
2 IC1 Information content with order 1 proposed by Basak
3 IC2 Information content with order 2 proposed by Basak
4 IC3 Information content with order 3 proposed by Basak
5 IC4 Information content with order 4 proposed by Basak
6 IC5 Information content with order 5 proposed by Basak
7 IC6 Information content with order 6 proposed by Basak
8 SIC0 Structural information content with order 0 proposed by Basak
9 SIC1 Structural information content with order 1 proposed by Basak
10 SIC2 Structural information content with order 2 proposed by Basak
11 SIC3 Structural information content with order 3 proposed by Basak
12 SIC4 Structural information content with order 4 proposed by Basak
13 SIC5 Structural information content with order 5 proposed by Basak
14 SIC6 Structural information content with order 6 proposed by Basak
15 CIC0 Complementary information content with order 0 proposed by Basak
16 CIC1 Complementary information content with order 1 proposed by Basak
17 CIC2 Complementary information content with order 2 proposed by Basak
18 CIC3 Complementary information content with order 3 proposed by Basak
19 CIC4 Complementary information content with order 4 proposed by Basak
20 CIC5 Complementary information content with order 5 proposed by Basak
21 CIC6 Complementary information content with order 6 proposed by Basak
E-state descriptors
1 S(1) Sum of E-State of atom type: sLi
2 S(2) Sum of E-State of atom type: ssBe
3 S(3) Sum of E-State of atom type: ssssBe
4 S(4) Sum of E-State of atom type: ssBH
5 S(5) Sum of E-State of atom type: sssB
6 S(6) Sum of E-State of atom type: ssssB
7 S(7) Sum of E-State of atom type: sCH3
8 S(8) Sum of E-State of atom type: dCH2
9 S(9) Sum of E-State of atom type: ssCH2
10 S(10) Sum of E-State of atom type: tCH
11 S(11) Sum of E-State of atom type: dsCH
12 S(12) Sum of E-State of atom type: aaCH
13 S(13) Sum of E-State of atom type: sssCH
14 S(14) Sum of E-State of atom type: ddC
15 S(15) Sum of E-State of atom type: tsC
16 S(16) Sum of E-State of atom type: dssC
17 S(17) Sum of E-State of atom type: aasC
18 S(18) Sum of E-State of atom type: aaaC
19 S(19) Sum of E-State of atom type: ssssC
20 S(20) Sum of E-State of atom type: sNH3
21 S(21) Sum of E-State of atom type: sNH2
22 S(22) Sum of E-State of atom type: ssNH2
23 S(23) Sum of E-State of atom type: dNH
24 S(24) Sum of E-State of atom type: ssNH
25 S(25) Sum of E-State of atom type: aaNH
26 S(26) Sum of E-State of atom type: tN
27 S(27) Sum of E-State of atom type: sssNH
28 S(28) Sum of E-State of atom type: dsN
29 S(29) Sum of E-State of atom type: aaN
30 S(30) Sum of E-State of atom type: sssN
31 S(31) Sum of E-State of atom type: ddsN
32 S(32) Sum of E-State of atom type: aasN
33 S(33) Sum of E-State of atom type: ssssN
34 S(34) Sum of E-State of atom type: sOH
35 S(35) Sum of E-State of atom type: dO
36 S(36) Sum of E-State of atom type: ssO
37 S(37) Sum of E-State of atom type: aaO
38 S(38) Sum of E-State of atom type: sF
39 S(39) Sum of E-State of atom type: sSiH3
40 S(40) Sum of E-State of atom type: ssSiH2
41 S(41) Sum of E-State of atom type: sssSiH
42 S(42) Sum of E-State of atom type: ssssSi
43 S(43) Sum of E-State of atom type: sPH2
44 S(44) Sum of E-State of atom type: ssPH
45 S(45) Sum of E-State of atom type: sssP
46 S(46) Sum of E-State of atom type: dsssP
47 S(47) Sum of E-State of atom type: sssssP
48 S(48) Sum of E-State of atom type: sSH
49 S(49) Sum of E-State of atom type: dS
50 S(50) Sum of E-State of atom type: ssS
51 S(51) Sum of E-State of atom type: aaS
52 S(52) Sum of E-State of atom type: dssS
53 S(53) Sum of E-State of atom type: ddssS
54 S(54) Sum of E-State of atom type: sCl
55 S(55) Sum of E-State of atom type: sGeH3
56 S(56) Sum of E-State of atom type: ssGeH2
57 S(57) Sum of E-State of atom type: sssGeH
58 S(58) Sum of E-State of atom type: ssssGe
59 S(59) Sum of E-State of atom type: sAsH2
60 S(60) Sum of E-State of atom type: ssAsH
61 S(61) Sum of E-State of atom type: sssAs
62 S(62) Sum of E-State of atom type: sssdAs
63 S(63) Sum of E-State of atom type: sssssAs
64 S(64) Sum of E-State of atom type: sSeH
65 S(65) Sum of E-State of atom type: dSe
66 S(66) Sum of E-State of atom type: ssSe
67 S(67) Sum of E-State of atom type: aaSe
68 S(68) Sum of E-State of atom type: dssSe
69 S(69) Sum of E-State of atom type: ddssSe
70 S(70) Sum of E-State of atom type: sBr
71 S(71) Sum of E-State of atom type: sSnH3
72 S(72) Sum of E-State of atom type: ssSnH2
73 S(73) Sum of E-State of atom type: sssSnH
74 S(74) Sum of E-State of atom type: ssssSn
75 S(75) Sum of E-State of atom type: sI
76 S(76) Sum of E-State of atom type: sPbH3
77 S(77) Sum of E-State of atom type: ssPbH2
78 S(78) Sum of E-State of atom type: sssPbH
79 S(79) Sum of E-State of atom type: ssssPb
80-158 Smax1-Smax79 maximum of E-State value of specified atom type
159-237 Smin1-Smin79 minimum of E-State value of specified atom type
Autocorrelation descriptors
1-8 ATSm1-ATSm8 Moreau-Broto autocorrelation descriptors based on atom mass
9-16 ATSv1-ATSv8 Moreau-Broto autocorrelation descriptors based on atomic van der Waals volume
17-24 ATSe1-ATSe8 Moreau-Broto autocorrelation descriptors based on atomic Sanderson electronegativity
25-32 ATSp1-ATSp8 Moreau-Broto autocorrelation descriptors based on atomic polarizability
33-40 MATSm1-MATSm8 Moran autocorrelation descriptors based on atom mass
41-48 MATSv1-MATSv8 Moran autocorrelation descriptors based on atomic van der Waals volume
49-56 MATSe1-MATSe8 Moran autocorrelation descriptors based on atomic Sanderson electronegativity
57-64 MATSp1-MATSp8 Moran autocorrelation descriptors based on atomic polarizability
65-72 GATSm1-GATSm8 Geary autocorrelation descriptors based on atom mass
73-80 GATSv1-GATSv8 Geary autocorrelation descriptors based on atomic van der Waals volume
81-88 GATSe1-GATSe8 Geary autocorrelation descriptors based on atomic Sanderson electronegativity
89-96 GATSp1-GATSp8 Geary autocorrelation descriptors based on atomic polarizability
Charge descriptors
1-4 QHmax, QCmax, QNmax, QOmax Most positive charge on H, C, N, O atoms
5-8 QHmin, QCmin, QNmin, QOmin Most negative charge on H, C, N, O atoms
9-10 Qmax, Qmin Most positive and negative charge in a molecule
11-15 QHSS, QCSS, QNSS, QOSS, Qass Sum of squares of charges on H, C, N, O and all atoms
16-17 Mpc, Tpc Mean and total of positive charges
18-19 Mnc, Tnc Mean and total of negative charges
20-21 Mac, Tac Mean and total of absolute charges
22 Rpc Relative positive charge
23 Rnc Relative negative charge
24 SPP Submolecular polarity parameter
25 LDI Local dipole index
Molecular property descriptors
1 (a) MREF Molar refractivity
2 (a) logP LogP value based on the Crippen method
3 logP2 Square of LogP value based on the Crippen method
4 (a) TPSA Topological polar surface area
5 UI Unsaturation index
6 Hy Hydrophilic index
MOE-type descriptors
1 (a) MTPSA Topological polar surface area based on fragments
2 (a) LabuteASA Labute's Approximate Surface Area
3-14 (a) SLOGPVSA MOE-type descriptors using SLogP contributions and surface area contributions
15-24 (a) SMRVSA MOE-type descriptors using MR contributions and surface area contributions
25-38 (a) PEOEVSA MOE-type descriptors using partial charges and surface area contributions
39-49 (a) EstateVSA MOE-type descriptors using Estate indices and surface area contributions
50-60 (a) VSAEstate MOE-type descriptors using surface area contributions and Estate indices
Fragment/Fingerprint-based descriptors
1 (a) FP2 (Topological fingerprint) A Daylight-like fingerprint based on hashing molecular subgraphs
2 (a) MACCS (MACCS keys) Using the 166 public keys implemented as SMARTS
3 E-state 79 E-state fingerprints or fragments
4 FP4 307 FP4 fingerprints
5 (a) Atom Pairs Atom pairs fingerprints
6 (a) Torsions Topological torsion fingerprints
7 (a) Morgan/Circular Fingerprints based on the Morgan algorithm
Note: (a) indicates that these descriptors are from RDKit. In PyDPI, we wrapped most of the molecular descriptors from RDKit. The other descriptors were coded independently by us.
