................ SHORT DOC .............................................
CSR: The Combined SDM/RMS Algorithm for spatial alignment of two molecules.
Reference:
M. Petitjean, Interactive Maximal Common 3D Substructure Searching
with the Combined SDM/RMS Algorithm, Comput. Chem. 1998, 22(6), 463-465.
Author email: petitjean@itodys.jussieu.fr
CSR reads the Cartesian coordinates of two molecules, then optimally rotates and
translates molecule 2 onto molecule 1 to find the maximal common 3D motif.
The two input molecules should be concatenated into a single file prior to execution.
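The concatenation step can be sketched as follows. This is only a minimal illustration: it assumes that each molecule sits in its own text file, that both files use the same input format, and that the format tolerates molecules being appended end to end. The helper name `concatenate_molecules` and the file names are hypothetical and are not part of CSR.

```python
from pathlib import Path


def concatenate_molecules(path1, path2, out_path):
    """Write the contents of path1 followed by path2 into out_path.

    Assumption: the input format allows molecules to be appended end to end.
    """
    parts = []
    for path in (path1, path2):
        text = Path(path).read_text()
        # Make sure each molecule block ends with a newline before the next one starts.
        if text and not text.endswith("\n"):
            text += "\n"
        parts.append(text)
    Path(out_path).write_text("".join(parts))
    return out_path
```

With this ordering, the first file would be at sequential position 1 and the second at position 2 in the combined file, e.g. `concatenate_molecules("mol1.pdb", "mol2.pdb", "both.pdb")`.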
Input data and parameters:
-------------------------
INPUT FORMAT:
CAS : Reserved for internal purposes
HIN : Hyperchem-type files
MDL : Cambridge Crystallographic Model files
ML2 : SYBYL Mol2 files
PDB : Protein Data Bank or Nucleic Acid Data Bank files
(only HEADER, ATOM, ENDMDL and END records are recognized)
BIO : Biosym (MSI) files
ISU : Reserved for internal purposes
INPUT MOLEC FILE NAME: name of the input file containing both molecules
OUTPUT MOLEC FILE NAME: name of the output file containing the optimally
rotated and translated molecule 2
IMOL1: sequential position number of molecule 1 in the input molecules file
IMOL2: sequential position number of molecule 2 in the input molecules file
ITERMX: maximum number of iterations; recommended value: about 200 for
small molecules (fewer than 100 atoms), about 2000 for 100 to 1000 atoms,
and about 20000 for larger molecules
CUT-OFF DIST:
This parameter does NOT affect the results. It saves space and time.
As a rule of thumb, this value should be close to a bond length
(e.g. about 1.5 to 2 for small inorganic molecules, 0.9 to 1.2 for full
proteins, 4 to 5 for C-alpha protein backbones).
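The ITERMX rule of thumb given above can be written as a small helper. This is a sketch only: the function name `suggest_itermx` is hypothetical, and taking the larger of the two molecules to decide the size class is an assumption, since the documentation does not say which molecule the atom count refers to.

```python
def suggest_itermx(n_atoms_1, n_atoms_2):
    """Return a suggested ITERMX value from the documented rule of thumb.

    Assumption: the larger of the two molecules decides the size class.
    """
    n = max(n_atoms_1, n_atoms_2)
    if n < 100:
        return 200      # small molecules
    if n <= 1000:
        return 2000     # a hundred to a thousand atoms
    return 20000        # larger molecules
```

The returned values are starting points rather than guarantees of convergence.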
Output results:
--------------
The size N of the common 3D motif, and the r.m.s. between the N pairs
of atoms, followed by the one-to-one correspondence between the N atoms
of molecule 1 and the N atoms of molecule 2.
The new coordinates of the optimally rotated and translated molecule 2.
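To check the reported r.m.s. against the output coordinates, the value can be recomputed from the listed correspondence. The sketch below assumes the usual definition, the square root of the mean squared distance over the N matched pairs; the exact definition used by CSR is the one given in the cited reference. The function name `rms_distance` and the 0-based index pairs are illustrative choices, not CSR conventions.

```python
import math


def rms_distance(coords1, coords2, pairs):
    """Root-mean-square distance over matched atom pairs.

    coords1: list of (x, y, z) tuples for molecule 1.
    coords2: list of (x, y, z) tuples for the moved molecule 2.
    pairs:   list of (i, j) 0-based indices, atom i of molecule 1
             matched with atom j of molecule 2.
    """
    if not pairs:
        raise ValueError("at least one atom pair is required")
    total = 0.0
    for i, j in pairs:
        x1, y1, z1 = coords1[i]
        x2, y2, z2 = coords2[j]
        total += (x1 - x2) ** 2 + (y1 - y2) ** 2 + (z1 - z2) ** 2
    return math.sqrt(total / len(pairs))
```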
Remarks:
-------
The number of atoms is currently limited to 15000 for each molecule.
The source has to be recompiled to read larger molecules.
To operate on C-alpha protein backbones, the other atoms should be
removed prior to execution.
The computing time is roughly proportional to the product n1*n2 of
the number of atoms of the two molecules, and proportional to the
number of iterations (reading and writing files not included).
The output file that should contain the moved molecule 2 is left empty for
CAS, MDL and BIO formats, and the message "EERCO2 = 1" is displayed.
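For the remark above on C-alpha protein backbones, the other atoms can be removed from a PDB file with a short filter. This is a minimal sketch that assumes the standard fixed-column PDB layout (record name in columns 1-6, atom name in columns 13-16). It keeps only the record types the documentation lists as recognized; the function name `keep_calpha_records` is hypothetical.

```python
def keep_calpha_records(pdb_lines):
    """Keep C-alpha ATOM records plus HEADER, ENDMDL and END records.

    Assumption: standard fixed-column PDB layout, atom name in columns 13-16.
    """
    kept = []
    for line in pdb_lines:
        record = line[:6].strip()
        if record == "ATOM":
            # Columns 13-16 (0-based slice 12:16) hold the atom name.
            if line[12:16].strip() == "CA":
                kept.append(line)
        elif record in ("HEADER", "ENDMDL", "END"):
            kept.append(line)
    return kept
```

HETATM records are dropped, so a calcium ion named CA is not mistaken for a C-alpha atom. The filter has to be applied to both molecules before they are concatenated or, equivalently, to the combined file.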
................ END SHORT DOC .........................................