MTK++ Latest version: 0.2.0

Functional Group Recognition

The algorithm used is in close agreement with that published by Ullman, (*J. ACM* **1976**, 23, 31-42) and Willett, Wilson, and Reddaway, (*J. Chem. Inf. Comput. Sci.* **1991**, 31, 225-233).

To functionalize a molecule involves searching it for chemical substructures.

Substructures searching is known as the subgraph isomorphism problem of graph theory and belongs to the class of NP-complete computational problems.

Due to the NP-complete nature of substructure searching usually a screen is carried out to eliminate subgraphs that cannot be contained in the molecule.

- To carry out functional group alignment of drug-like molecules.
- To develope a tool for de novo design.
- To optimize fragment positions in drug-protein complexes.
- An analysis tool for a variety of project, for example, NMR.

Generating the adjacency matrices A and B for the fragment and the molecule containing and atoms respectively.

An exhautive search involves generating combinations of .

Ullmann first noticed that using a depth-first backtracking search dramatically increases efficiency.

Using a labeled graph and a non-binary connection table increases algorithm speed.

The following example was taken from Molecular Modelling, Principles and Applications 2nd Edition by Andrew R. Leach.

- Take for example the following fragment:

- Determine the fragments' adjacency matrix

- Take for example the following molecule:

- Determine the molecules' adjacency matrix

- The Ullman algorithm tries to find the matrix A which satisfy

This depth-first backtracking algorithm uses a General match matrix, M that contains all the possible equivalences between atoms from A and B. The elements of this matrix, are such that:

"The Ullmann heuristic states that if a fragment atom has a neighbor , and a molecule atom can be mapped to , then there must exist a neighbor of , , that can be mapped to ":

if at any state during the search an atom in such that for all atoms in then a mismatch is identified.

int Pa; // Number of atoms in the query pattern int Pb; // Number of atoms in the database structure array A[Pa,Pa]; // adjacency matrix of A array B[Pb,Pb]; // adjacency matrix of B array M[Pa, Pb]; // general match matrix bool isomorphism = false; void ullman(int d, array M) { array M1 = M; bool mismatch; for (all unique mappings of atom d) { choose new unique mapping for query atom d update M accordingly refine(M, mismatch); if (!mismatch) { if (d == Pa) { isomorphism = true; store M; } else { ullman(d+1, M); } } else { M = M1; } } } void refine(array M, bool mismatch) { mismatch = false; bool change = false; while (!change || mismatch) { for (int i = 0; i < Pa; i++) { for (int j = 0; j < Pb; j++) { if (M[i][j]) { assign mij; change = change or (!mij); } } } assign mismatch; } } int main() { ullman(1, M); }

- Read all fragments
- Generate simple fingerprints

- Read in molecule to functionalize
- Determine rings, add hydrogens, etc.
- Generate simple fingerprint
- Loop over all fragments in library
- Compare molecule to simple fingerprint (Screening)
- Match fragment to molecule using the Ullman and Willett algorithm of subgraph isomorphism
- if match assign fragment code to molecule

Generated on Fri Dec 23 2011 09:28:54 for MTK++ by Doxygen 1.7.5.1