Dr. Thomas A. Halgren
Chief Technical Officer
1 Exchange Place, Suite 604
Jersey City, NJ 07302
Phone: 201-433-2014 ext. 109
This revision updates the address for Tom Halgren, who has moved to Schrodinger, Inc. effective June 1, 1999.
This evaluation suite is geared to assessing the performance of molecular force fields for (1) conformational energies and (2) intermolecular interactions, but the molecular structure data it includes could also be used to test the accuracy of force-field-optimized molecular geometries. The suite provides input data and a summary of principal results for the following manuscript:
T. A. Halgren, "MMFF VII. Characterization of MMFF94, MMFF94s, and Other Widely Available Force Fields for Conformational Energies and for Intermolecular-Interaction Energies and Geometries," J. Comput. Chem., 20, 730-748 (1999).
In addition to MMFF94 and MMFF94s, the paper characterizes the CFF95, CVFF, MSI CHARMM, CHARMM 22 (in part), AMBER*, OPLS*, MM2*, and MM3* force fields. Force fields excluded because they were unavailable at Merck include AMBER 4, OPLS-AA, GROMOS, MM2, MM3, and MM4. This evaluation suite has been posted so that the community can use it to characterize these and other force fields. The data may also be useful for developing or validating new force fields.
The manuscript makes three sets of conformational energy comparisons. The first uses the 37 comparisons to experiment employed in the original MMFF94 paper on this subject. It also compares the ability of theoretical methods ranging from HF/6-31G* to GVB-LMP2/cc-pVTZ(-f) to reproduce the same experimental data. The second set consists of 19 comparisons taken from Gundertofte et al., for which neither ab initio nor experimental data were used in the development of MMFF94. The third set consists of 147 comparisons to ab initio values obtained at the composite "MP4SDQ/TZP" level.
The comparisons for intermolecular-interaction energies and geometries employ scaled HF/6-31G* results for the 66 small-molecule dimers used in the nonbonded parameterization of MMFF94. The scaling protocol is defined in a file described below.
The following files supply input molecular structure data:
Two formats are provided: mol2, from Tripos, and mmd, the designation used at Merck for BatchMin dat files. We chose these file formats because they are in fairly widespread use and because they allow explicit single and multiple bonds to be designated. Unlike file formats more commonly used at Merck, these formats are limited in that they cannot specify formal-charge information. However, this information is provided in another file described below. The conf-e_37-147.mol2 and conf-e_37-147.mmd files provide input for the first (37-membered) and third (147-membered) conformation sets. The geometries are MP2(FULL)/6-31G* optimized. The conf-e_19.mol2 and conf-e_19.mmd files are used for the second (19-membered) conformation set. These files supply MMFF94-optimized geometries that should provide suitable starting points for geometry optimization with other force fields.
Note: the atom types in the mol2 files (which were generated by a file conversion procedure developed at Merck) may in some cases differ from authentic SYBYL atom types, and therefore should not be relied upon.
Reference energies are given in the following files:
The conf-e_37-147.energies file covers the first and third comparison sets. This file gives the 5-character "conformational indices" used to label the structure and geometry. It also specifies the total MP2/TZP energies and the 6-31G# small-basis-set MP3 plus MP4SDQ corrections; these energies are summed to obtain the composite "MP4SDQ/TZP" energies that were used to form best-available ab initio conformational energy differences in the original MMFF94 parameterization. The relationship between the 6-31G# and 6-31G* basis sets is noted in the file. This file also contains a title-card string for each structure that indicates its constitution and conformation.
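The summation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the energies are taken to be in hartrees, the conformational energy is taken relative to the lowest-energy conformer, and the conformational indices and numerical values below are invented. The actual layout of conf-e_37-147.energies is not reproduced here, so no file parsing is attempted.

```python
# Assumed conversion factor from hartree to kcal/mol (approximate).
HARTREE_TO_KCAL = 627.5095


def composite_energy(mp2_tzp, *corrections):
    """Add the small-basis-set correction(s) to the total MP2/TZP energy.

    Accepts either one combined correction or separate MP3 and MP4SDQ
    terms; all values are assumed to be in hartrees.
    """
    return mp2_tzp + sum(corrections)


def relative_energies_kcal(energies):
    """Return energies relative to the lowest one, converted to kcal/mol."""
    lowest = min(energies.values())
    return {key: (e - lowest) * HARTREE_TO_KCAL for key, e in energies.items()}


# Hypothetical conformer labels and made-up energies (hartree):
# (MP2/TZP total, MP3 correction, MP4SDQ correction)
records = {
    "XX01A": (-100.000000, -0.010000, -0.002000),
    "XX01B": (-99.998000, -0.010500, -0.002100),
}

composites = {key: composite_energy(*vals) for key, vals in records.items()}
rel = relative_energies_kcal(composites)

for key in sorted(rel):
    print(f"{key}  {composites[key]:.6f} hartree  {rel[key]:.3f} kcal/mol")
```

If a comparison is defined relative to a specific reference conformer rather than the lowest one, the subtraction in relative_energies_kcal would need to use that conformer's composite energy instead.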
The conf-e_37.expt, conf-e_37.mp4sdq_tzp, and conf-e_37.gvb-lmp2 files specify the experimental, "MP4SDQ/TZP", and GVB-LMP2/cc-pVTZ(-f) conformational energies for comparison set 1. The experimental conformational energies differ in some cases from those used in the earlier work on the derivation of MMFF94. An appendix to the paper, which because of space limitations has had to be moved to the Supplementary Material (available on line from the J. Comput. Chem. server), describes the basis for the choice of these particular experimental values and lists some of the others that are available. The force fields examined in the manuscript are compared to each set of reference energies. A summary table described later shows that a given force field fits each reference set about equally well (or poorly). This finding indicates that all three sets provide a valid basis for assessing the accuracy of molecular force fields.
Finally, the conf-e_19.expt and conf-e_147.mp4sdq_tzp files respectively specify the reference experimental and "MP4SDQ/TZP" conformational energies for comparison sets 2 and 3. The experimental values for comparison set 2 were taken from Gundertofte et al. without further examination.
As previously indicated, formal atomic charge information is not preserved in the "mol2" input files and is represented only implicitly in the "mmd" file through the assigned MacroModel atom types. To assist those who may wish to utilize file formats that require explicit formal charge specifications, this information is provided for comparison sets 1 and 3 in the conf-e_37-147.fc file. Conformation set 2, in contrast, has no instances of non-zero formal atomic charges. The conf_e_19.titles file lists "title card" descriptions of the structures and geometries for comparison set 2.
The mol2 and mmd input structure files provide HF/6-31G*-optimized monomer and dimer geometries. The hbond.interactions file identifies the monomers that form each dimer and specifies the dimer atoms that contribute to key hydrogen-bond interactions. These specifications allow X...Z heteroatom distances and X-H...Z hydrogen-bond angles to be computed from the input structure files and from optimized force-field structure files derived from them. The file also explains the procedure used to obtain the scaled QM interaction energies and nonbonded heteroatom distances from the raw HF/6-31G* data. The hbond.energies file lists the raw HF/6-31G* energies for the monomers and dimers.
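Once the X, H, and Z atoms of a hydrogen bond have been identified, the X...Z distance and the X-H...Z angle follow from the Cartesian coordinates alone. The Python sketch below shows one way to compute them. It assumes the coordinates have already been read from a mol2 or mmd file into (x, y, z) tuples in Angstroms; the function names and the coordinates in the example are hypothetical and are not taken from the suite.

```python
import math


def heavy_atom_distance(x_atom, z_atom):
    """Distance between the X and Z heteroatoms (same units as the input)."""
    return math.dist(x_atom, z_atom)


def hbond_angle_deg(x_atom, h_atom, z_atom):
    """X-H...Z angle in degrees, measured at the hydrogen atom."""
    hx = [a - b for a, b in zip(x_atom, h_atom)]  # vector H -> X
    hz = [a - b for a, b in zip(z_atom, h_atom)]  # vector H -> Z
    dot = sum(p * q for p, q in zip(hx, hz))
    cos_theta = dot / (math.hypot(*hx) * math.hypot(*hz))
    # Clamp to guard against round-off slightly outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, cos_theta))
    return math.degrees(math.acos(cos_theta))


# Made-up coordinates for illustration: a perfectly linear X-H...Z contact.
donor = (0.0, 0.0, 0.0)      # X
hydrogen = (1.0, 0.0, 0.0)   # H
acceptor = (3.0, 0.0, 0.0)   # Z

print(heavy_atom_distance(donor, acceptor))
print(hbond_angle_deg(donor, hydrogen, acceptor))
```

The same two functions can be applied to the reference geometries and to the force-field-optimized geometries, so that the differences in distance and angle can be tabulated dimer by dimer.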
The titles files help to clarify the connection between the 5-character conformational indices and the associated structures. As before, the hbond_monomers.fc and hbond_dimers.fc files specify the atoms that carry non-zero formal ionic charges.
These postscript files contain tables taken from the paper. Each summarizes the overall success of the fits to experimental or ab initio data for a range of theoretical methods.
In the conf-e_tables.ps file, the first table documents the differing abilities of ab initio methods ranging from HF/6-31G* to GVB-LMP2/cc-pVTZ(-f) to reproduce the experimental conformational energies of set 1. The second table shows the ability of the force field models to fit experimental, "MP4SDQ/TZP", and GVB-LMP2/cc-pVTZ(-f) conformational energies. The third table summarizes the fit of the force-field conformational energies to the experimental conformational energies of set 2, and the fourth summarizes the fit of the force-field models to the 147 "MP4SDQ/TZP" conformational energies of set 3. The manuscript itself also contains detailed tables that show the result given by each theoretical method for each conformational comparison; because of space limitations, the detailed results that the fourth table summarizes have been relegated to the on-line Supplementary Material.
The table contained in the hbond_table.ps file summarizes the ability of the various force fields to reproduce scaled QM interaction energies, scaled QM X...Z heteroatom distances, and unscaled QM X-H...Z hydrogen-bond angles.
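Anyone evaluating another force field against these reference data will need some summary measure of fit. This page does not state which statistics the tables report, so the Python sketch below simply assumes two common choices, the root-mean-square deviation and the mean signed error, applied to paired lists of calculated and reference values. The function names and the numbers in the example are invented for illustration.

```python
import math


def rms_deviation(calculated, reference):
    """Root-mean-square deviation between paired calculated and reference values."""
    if len(calculated) != len(reference) or not calculated:
        raise ValueError("need two non-empty lists of equal length")
    squared = [(c - r) ** 2 for c, r in zip(calculated, reference)]
    return math.sqrt(sum(squared) / len(squared))


def mean_signed_error(calculated, reference):
    """Average of (calculated - reference); reveals systematic over- or underestimation."""
    if len(calculated) != len(reference) or not calculated:
        raise ValueError("need two non-empty lists of equal length")
    return sum(c - r for c, r in zip(calculated, reference)) / len(calculated)


# Made-up conformational energies (kcal/mol) for four comparisons.
force_field = [0.0, 0.0, 0.0, 0.0]
reference = [1.0, -1.0, 1.0, -1.0]

print(rms_deviation(force_field, reference))
print(mean_signed_error(force_field, reference))
```

The example is chosen so that the errors cancel in the mean signed error but not in the RMS deviation, which is why it can be useful to report both.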
I have posted this information in the hope that it will help others to test, or develop, additional force fields. In return, I ask those who do so to let me know of results obtained from its use, to the extent this is feasible.
File name                    Size in Bytes
------------------------------------------
conf-e_147.mp4sdq_tzp                4,508
conf-e_19.expt                         791
conf-e_19.mmd                      114,540
conf-e_19.mol2                      80,207
conf-e_37-147.energies              27,376
conf-e_37-147.fc                       628
conf-e_37-147.mmd                  667,205
conf-e_37-147.mol2                 483,177
conf-e_37.expt                       1,198
conf-e_37.gvb-lmp2                   1,218
conf-e_37.mp4sdq_tzp                 1,248
conf-e_tables.ps                    22,094
conf_e_19.titles                     1,827
hbond.energies                       2,526
hbond.interactions                   3,825
hbond_dimers.fc                        444
hbond_dimers.mmd                   111,820
hbond_dimers.mol2                   81,111
hbond_dimers.titles                  3,222
hbond_monomers.fc                      401
hbond_monomers.mmd                  53,983
hbond_monomers.mol2                 41,007
hbond_monomers.titles                2,200
hbond_table.ps                       9,972