There are two algorithms implemented to perceive atom hybridizations, bond orders, and formal charges.
The first is the Meng algorithm, published in 1991 (J. Comp. Chem. 1991, 12, 891-898).
The second, which is the default, is in close agreement with the algorithm published by Labute in 2005 (J. Chem. Inf. Model. 2005, 45, 215-221).
Let denote the 3D coordinates of n atoms with atomic number , number of bonded atoms, and .
Bonds are perceived by first producing a candidate list and then refining it using geometry. Covalent radii from Meng are used in the following:
For each atom, i, a "dimension", d_i, is assigned based on a principal component analysis of the Gram Matrix (the current atom and its bonded atoms).
d_i is set to k if k < 2; otherwise, d_i is the number of positive eigenvalues of the Gram matrix whose square root is greater than 0.2. d_i will be 0 for isolated atoms, 1 for terminal atoms and for linear atoms with at least 2 bonds, 2 for planar atoms (e.g., sp2 or square planar), and 3 otherwise (e.g., tetrahedral or sp3d).
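The dimension assignment can be illustrated with a minimal sketch. It rests on several assumptions that the text does not confirm: k is taken to be the number of bonded atoms, the Gram matrix is built from displacement vectors pointing from the current atom to its neighbors, and no normalization is applied. The sketch works with the 3x3 scatter matrix of those displacements, which has the same non-zero eigenvalues as the k-by-k Gram matrix. The names `sym3_eigenvalues` and `atom_dimension` are hypothetical.

```python
import math

def sym3_eigenvalues(a):
    """Eigenvalues of a symmetric 3x3 matrix, largest first (closed-form trigonometric method)."""
    p1 = a[0][1] ** 2 + a[0][2] ** 2 + a[1][2] ** 2
    if p1 == 0.0:  # matrix is already diagonal
        return sorted((a[0][0], a[1][1], a[2][2]), reverse=True)
    q = (a[0][0] + a[1][1] + a[2][2]) / 3.0
    p2 = sum((a[i][i] - q) ** 2 for i in range(3)) + 2.0 * p1
    p = math.sqrt(p2 / 6.0)
    b = [[(a[i][j] - (q if i == j else 0.0)) / p for j in range(3)] for i in range(3)]
    det_b = (b[0][0] * (b[1][1] * b[2][2] - b[1][2] * b[2][1])
             - b[0][1] * (b[1][0] * b[2][2] - b[1][2] * b[2][0])
             + b[0][2] * (b[1][0] * b[2][1] - b[1][1] * b[2][0]))
    r = max(-1.0, min(1.0, det_b / 2.0))  # clamp against round-off
    phi = math.acos(r) / 3.0
    e1 = q + 2.0 * p * math.cos(phi)
    e3 = q + 2.0 * p * math.cos(phi + 2.0 * math.pi / 3.0)
    return [e1, 3.0 * q - e1 - e3, e3]

def atom_dimension(center, neighbors, threshold=0.2):
    """Assumed reading of d_i: k if k < 2, else the count of eigenvalues with sqrt > threshold."""
    k = len(neighbors)  # assumption: k is the number of bonded atoms
    if k < 2:
        return k
    # Displacements from the central atom to each bonded atom (assumed construction).
    disp = [[n[c] - center[c] for c in range(3)] for n in neighbors]
    # 3x3 scatter matrix; shares its non-zero eigenvalues with the k-by-k Gram matrix.
    scatter = [[sum(d[i] * d[j] for d in disp) for j in range(3)] for i in range(3)]
    return sum(1 for ev in sym3_eigenvalues(scatter)
               if ev > 0.0 and math.sqrt(ev) > threshold)
```

With this reading, a central atom with two collinear neighbors gets dimension 1, a trigonal planar center gets 2, and a tetrahedral center gets 3, matching the cases listed above.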
An upper bound for the number of bonds allowed by an atom is determined using and as follows:
Only the shortest bonds, up to this upper bound, are retained. At this point all atom hybridizations and bond orders are set to zero (undefined).
This step assigns obvious hybridizations based on . Each substep is applied to atoms whose hybridization is still zero (unassigned):
Only atoms with unassigned hybridizations have and at least one bonded neighbor with an unassigned hybridization. At this stage, the orders of all bonds in which atom i or atom j has non-zero hybridization are set to 1.
A dihedral test is used to identify bonds of order 1. The smallest out-of-plane dihedral is computed using: If this dihedral is greater than 15 degrees, then the bond order is set to 1.
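The dihedral test can be sketched as follows. The exact expression for the smallest out-of-plane dihedral is not reproduced in this text, so the sketch assumes one plausible definition: for each pair of substituents (one on each end of the bond), take the deviation of the dihedral from planarity (0 or 180 degrees), then use the smallest deviation. The function names and that definition are illustrative assumptions only; the 15 degree cutoff is the one stated above.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral_deg(p0, p1, p2, p3):
    """Signed dihedral angle p0-p1-p2-p3 in degrees."""
    b1, b2, b3 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def smallest_out_of_plane_dihedral(i_subs, xi, xj, j_subs):
    """Smallest deviation from planarity over all substituent pairs (assumed definition)."""
    deviations = []
    for a in i_subs:
        for b in j_subs:
            phi = abs(dihedral_deg(a, xi, xj, b))
            deviations.append(min(phi, 180.0 - phi))
    return min(deviations) if deviations else None

def is_single_by_dihedral(i_subs, xi, xj, j_subs, cutoff_deg=15.0):
    """True if the bond i-j is twisted by more than the cutoff, i.e. treated as order 1."""
    d = smallest_out_of_plane_dihedral(i_subs, xi, xj, j_subs)
    return d is not None and d > cutoff_deg
```

A planar, ethylene-like arrangement yields a deviation near 0 degrees and is left undecided, while a bond whose substituents are twisted by 90 degrees is marked as single.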
The following table of lower bound single bond lengths:
and , where is the reference bond length, is used to identify single bonds.
After steps 5 and 6 the hybridizations of all uncharacterized atoms not involved in a bond of unknown order are set to sp3.
A molecular graph is formed including only atoms (vertices) that have undefined hybridization and bonds (edges) that have unknown order. This graph is then divided into components or subgraphs. Each subgraph is analyzed independently and bond orders are assigned.
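As an illustration of the graph partitioning step, the following sketch splits the unresolved part of a molecule into connected components with a breadth-first search. The input representation (a set of atom indices with undefined hybridization and a list of index pairs for bonds of unknown order) and the name `unknown_subgraphs` are assumptions made for this example.

```python
from collections import defaultdict, deque

def unknown_subgraphs(unassigned_atoms, unknown_bonds):
    """Return connected components (sorted atom index lists) of the unresolved graph."""
    adjacency = defaultdict(set)
    for i, j in unknown_bonds:
        # Keep only edges whose two end atoms both have undefined hybridization.
        if i in unassigned_atoms and j in unassigned_atoms:
            adjacency[i].add(j)
            adjacency[j].add(i)

    seen = set()
    components = []
    for start in sorted(adjacency):
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        component = []
        while queue:
            atom = queue.popleft()
            component.append(atom)
            for neighbor in sorted(adjacency[atom]):
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        components.append(sorted(component))
    return components
```

Each returned component can then be handed to the bond order assignment independently of the others.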
Assign edge weights using and the following atom parameters (3rd and 4th row elements are mapped to the corresponding 2nd row element with 0.1 subtracted; -20.0 is used for all other atoms):
A Maximum Weighted Matching Algorithm is employed to find the best arrangement of double/triple bonds in each subgraph.
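To make the matching step concrete, here is a small brute-force sketch of maximum weight matching: it selects a set of edges with the largest total weight such that no atom appears in more than one selected edge. This exhaustive search is only a stand-in for illustration; it is exponential in the number of edges and is not claimed to be the algorithm actually used. The edge weights in the usage below are made-up numbers, not the atom parameters referred to above.

```python
def max_weight_matching(edges):
    """edges: dict mapping (i, j) -> weight. Returns (total_weight, matched_edges)."""
    items = sorted(edges.items())

    def search(idx, used):
        if idx == len(items):
            return 0.0, []
        (i, j), weight = items[idx]
        # Option 1: leave this edge unmatched.
        best_weight, best_edges = search(idx + 1, used)
        # Option 2: match this edge if both atoms are still free.
        if i not in used and j not in used:
            rest_weight, rest_edges = search(idx + 1, used | {i, j})
            if weight + rest_weight > best_weight:
                best_weight = weight + rest_weight
                best_edges = [(i, j)] + rest_edges
        return best_weight, best_edges

    return search(0, frozenset())

# Usage: a four-atom chain 0-1-2-3 with illustrative weights.
chain = {(0, 1): 2.0, (1, 2): 1.0, (2, 3): 2.0}
total, matched = max_weight_matching(chain)
```

For the chain above, the two terminal edges are matched and the central edge is left out, which corresponds to a butadiene-like alternation of higher-order and single bonds.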
Ionization states and formal charges are perceived from the connectivity and bond order.
The formal charge of atom i, f_i, is calculated as follows:
f_i = c_i - o_i + b_i
where:
c_i = atom group in periodic table
o_i = nominal octet (2 for hydrogen, 6 for boron, 8 for carbon and all other sp3 atoms in groups 5,6,7,8)
b_i = sum of the atom bond orders
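The formal charge expression can be checked with a short sketch. It assumes that "group" means the main-group number (4 for carbon, 5 for nitrogen, 6 for oxygen), which is what makes neutral methane come out as zero. The helper names are hypothetical, and cases not covered by the octet rule quoted above raise an error rather than guess.

```python
def nominal_octet(symbol, group, is_sp3=True):
    """Nominal octet o_i for the cases stated in the text; other cases are not covered."""
    if symbol == "H":
        return 2
    if symbol == "B":
        return 6
    if symbol == "C":
        return 8
    if is_sp3 and group in (5, 6, 7, 8):
        return 8
    raise ValueError("nominal octet not covered by the rule for %s" % symbol)

def formal_charge(group, octet, bond_order_sum):
    """f_i = c_i - o_i + b_i."""
    return group - octet + bond_order_sum

# Worked examples (bond order sums chosen for illustration):
ammonium_n = formal_charge(5, nominal_octet("N", 5), 4)   # N with four single bonds
hydroxide_o = formal_charge(6, nominal_octet("O", 6), 1)  # O with one single bond
methane_c = formal_charge(4, nominal_octet("C", 4), 4)    # C with four single bonds
```

The ammonium nitrogen evaluates to +1 (5 - 8 + 4), the hydroxide oxygen to -1 (6 - 8 + 1), and the methane carbon to 0 (4 - 8 + 4).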
If the system doesn't contain hydrogen atoms, then the following steps are applied to set the ionization state: