David Young

Cytoclonal Pharmaceutics Inc.

This guide is in no way intended to be a comprehensive or advanced guide to the Gaussian program. Also, this is not an explanation of the theory behind the types of calculations or their strengths, weaknesses, accuracy, etc.

The term "ab initio" in this guide refers to methods which calculate purely from the principles of quantum mechanics, with no experimental data involved. The term "semiempirical" refers to methods which follow the general process dictated by quantum mechanics but simplify it to gain speed, then correct for the simplification by the use of some experimental data.

The Gaussian programs are given version numbers according to the year in which they were released (e.g. Gaussian 90 is the 1990 version). Gaussian is a program for doing ab initio and semiempirical calculations on atoms and molecules. The program is operated by making an ASCII input file using any convenient text editor, then running the program. The results of the calculation are put in one or more output files. Gaussian itself currently has no provisions for graphical or interactive inputs or outputs. However, such things do exist for use with Gaussian and can be obtained from other sources.

This guide gives the input description by showing three sample input files and describing what they mean. A section is provided on the calculation of vibrational frequencies followed by a brief description of outputs and lists of the input options.

Input files can have any name but often use the extension ".inp" or ".input" or ".com" depending on the system. Here is a sample input file for a single atom calculation.

**$ RunGauss**

**#n test rohf/sto-3g pop=full GFINPUT**

**O sto-3g triplet**

**0 3**

**O**

Line 1: "$ RunGauss" This line is always the same. Although it is optional on some machines, it is good practice always to include it.

Line 2: (blank) Line 2 is blank for many calculations. It can be used to specify a checkpoint file name or memory allocation.

Line 3: "#n test rohf/sto-3g pop=full GFINPUT" Line 3 is called the route card. It specifies what type of calculation to do and what to calculate and output. The line always starts with "#". The "n" suppresses printing of debugging messages. The "test" suppresses keeping a summary of the job in a central data bank called an archive. "rohf" is the ab initio keyword. It stands for "restricted open-shell Hartree Fock". Ab initio keywords must always be followed by a "/" and a basis set designation such as "sto-3g". The "pop=full" specifies the printing of a full Mulliken population analysis. The "GFINPUT" puts a copy of the basis set in the output file.

Line 4 is blank.

Line 5: "O sto-3g triplet" Line 5 is a comment for the user's reference only.

Line 6 is blank.

Line 7: "0 3" Line 7 consists of two numbers. The first is the charge and the second is the spin multiplicity.

Line 8: "O" Line 8 specifies that oxygen is the atom to be calculated.

Line 9: Leave at least one extra blank line at the end of the input file.
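The line-by-line layout described above can be sketched in a few lines of Python. This is only an illustration: the function names `multiplicity` and `atom_input` are assumptions made for this sketch and are not part of Gaussian. The multiplicity helper uses the standard relation 2S + 1, which for a high-spin state equals the number of unpaired electrons plus one (two unpaired electrons give the triplet used here).

```python
def multiplicity(unpaired_electrons):
    """Spin multiplicity 2S + 1 for a high-spin state, with S = unpaired / 2."""
    return unpaired_electrons + 1


def atom_input(symbol, charge, unpaired_electrons, route, title):
    """Return the text of a single-atom input file (hypothetical helper)."""
    lines = [
        "$ RunGauss",                                     # line 1
        "",                                               # line 2: blank
        route,                                            # line 3: route card
        "",                                               # line 4: blank
        title,                                            # line 5: comment
        "",                                               # line 6: blank
        f"{charge} {multiplicity(unpaired_electrons)}",   # line 7: charge, multiplicity
        symbol,                                           # line 8: the atom
        "",                                               # line 9: trailing blank line
    ]
    return "\n".join(lines) + "\n"


text = atom_input("O", 0, 2, "#n test rohf/sto-3g pop=full GFINPUT", "O sto-3g triplet")
print(text)
```

The returned string can be written to a file with whatever extension the local system expects.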

Here is a sample input file for a diatomic molecule.

**$ RunGauss**

**# test rhf/STO-3G opt**

**CO sto-3g**

**0 1**

**C**

**O 1 R**

**R 0.955**

Only significant differences from the atom input file will be mentioned.

Line 3: "# test rhf/STO-3G opt" The "rhf" stands for "restricted Hartree Fock". The "opt" specifies that the program is to find the correct geometry for the molecule, as predicted by the specified ab initio method and basis set (in this case the bond distance).

Lines 8-11 are called the Z-matrix. These lines specify the geometry of the molecule and which parameters are to be optimized if an "opt" keyword is on the route card.

Line 8: "C" This specifies that the first atom is a carbon atom.

Line 9: "O 1 R" specifies that an oxygen atom is at a distance R from the first atom (the carbon). R is defined (in Angstroms) on line 11. If an optimization is being done, a new value for R representing the most stable geometry will be given in the output file.

Here is an input file for a formaldehyde molecule.

**$ RunGauss**

**# test MNDO pop=reg**

**Formaldehyde single point w/ populations**

**0 1**

**C**

**O 1 OC**

**H 1 HC 2 A**

**H 1 HC 2 A 3 180.0**

**OC 1.2**

**HC 1.08**

**A 120.0**

Line 3: "# test MNDO pop=reg" The ab initio method and basis have been replaced by a semiempirical method keyword "MNDO". The "pop=reg" specifies a Mulliken population analysis, but not as much information printed as with "pop=full".

Line 5: "Formaldehyde single point w/ populations" Note that a calculation that is not a geometry optimization is referred to as a single point calculation.

Line 8: "C" The first atom is a carbon.

Line 9: "O 1 OC" The second atom is an oxygen with a distance to the first atom of OC.

Line 10: "H 1 HC 2 A" The third atom is a hydrogen with a distance to the first atom of HC and an angle between the third, first and second atoms of A (in degrees).

Line 11: "H 1 HC 2 A 3 180.0" The fourth atom is a hydrogen with a distance to the first atom of HC and an angle between the fourth, first and second atoms of A. The dihedral angle defined by the fourth, first, second and third atoms is 180 degrees (a planar molecule).

If an optimization were being done, the parameters OC, HC and A would be optimized, but the molecule would be kept planar. Note that parameters can be used more than once.

Additional atoms are added by adding lines like line 11 consisting of distance, angle and dihedral angle specifications.

Gaussian does have provisions for entering geometries as x, y, z Cartesian coordinates.
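To make the meaning of the distance, angle and dihedral entries concrete, the formaldehyde Z-matrix above can be converted to Cartesian coordinates by hand. The following is a minimal sketch, not Gaussian's own routine: the function names, the choice of placing the first atom at the origin and the second on the z axis, and the sign convention for non-planar dihedral angles are all assumptions of this example.

```python
import math


def _sub(u, v):
    return [u[i] - v[i] for i in range(3)]


def _cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def _unit(u):
    norm = math.sqrt(sum(x * x for x in u))
    return [x / norm for x in u]


def _place(a, b, c, r, angle_deg, dihedral_deg):
    """New atom at distance r from a, angle (new, a, b), dihedral (new, a, b, c)."""
    theta, phi = math.radians(angle_deg), math.radians(dihedral_deg)
    bc = _unit(_sub(a, b))                # axis pointing from b to a
    n = _unit(_cross(_sub(b, c), bc))     # normal to the plane of a, b, c
    m = _cross(n, bc)                     # in-plane direction on the side of c
    local = (-r * math.cos(theta),
             r * math.sin(theta) * math.cos(phi),
             r * math.sin(theta) * math.sin(phi))
    return [a[i] + local[0] * bc[i] + local[1] * m[i] + local[2] * n[i]
            for i in range(3)]


def zmatrix_to_cartesian(lines, variables):
    """Convert Z-matrix lines (1-based atom references) to Cartesian coordinates."""
    def value(token):
        return variables[token] if token in variables else float(token)

    symbols, coords = [], []
    for k, line in enumerate(lines):
        f = line.split()
        if k == 0:
            xyz = [0.0, 0.0, 0.0]                    # first atom at the origin
        elif k == 1:
            xyz = [0.0, 0.0, value(f[2])]            # second atom on the z axis
        else:
            a, b = coords[int(f[1]) - 1], coords[int(f[3]) - 1]
            if k == 2:
                # no third reference yet: use a point off the axis to fix the xz plane
                c, dihedral = [b[0] + 1.0, b[1], b[2]], 0.0
            else:
                c, dihedral = coords[int(f[5]) - 1], value(f[6])
            xyz = _place(a, b, c, value(f[2]), value(f[4]), dihedral)
        symbols.append(f[0])
        coords.append(xyz)
    return symbols, coords


FORMALDEHYDE = ["C", "O 1 OC", "H 1 HC 2 A", "H 1 HC 2 A 3 180.0"]
VARIABLES = {"OC": 1.2, "HC": 1.08, "A": 120.0}

symbols, coords = zmatrix_to_cartesian(FORMALDEHYDE, VARIABLES)
for s, (x, y, z) in zip(symbols, coords):
    print(f"{s:2s} {x:10.4f} {y:10.4f} {z:10.4f}")
```

With the 180 degree dihedral, all four atoms end up in the same plane, and the two hydrogens are mirror images of each other across the C-O axis.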

Geometry specifications sometimes use points that are not on atomic centers, out of convenience or necessity. These are called dummy atoms. They will not be covered in this guide.

Gaussian can calculate vibrational modes along with their frequencies and force constants, using the "FREQ" keyword. These calculations are only meaningful if the molecule is at its equilibrium geometry for the given level of theory. Also, geometry optimizations and frequency calculations cannot be done in the same job. Therefore, to get a frequency calculation, a geometry optimization should be done first, and the optimized geometry must then be used to run the frequency calculation.
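The two-job procedure can be sketched as follows, reusing the CO example from earlier in this guide. The helper names `diatomic_input` and `frequency_job` are hypothetical, and the optimized bond length is deliberately left as a parameter: it has to be read from the output of the optimization job and is not known in advance.

```python
def diatomic_input(job_keyword, r_angstrom):
    """Input text for CO at rhf/STO-3G; job_keyword is "opt" or "freq"."""
    lines = [
        "$ RunGauss",
        "",
        f"# test rhf/STO-3G {job_keyword}",   # same method and basis in both jobs
        "",
        f"CO sto-3g {job_keyword}",           # comment line
        "",
        "0 1",
        "C",
        "O 1 R",
        "",
        f"R {r_angstrom}",
        "",
    ]
    return "\n".join(lines) + "\n"


# Job 1: optimize the bond length, starting from the guess used earlier.
optimization_job = diatomic_input("opt", 0.955)


def frequency_job(r_optimized):
    """Job 2: frequencies at the geometry taken from the output of job 1."""
    return diatomic_input("freq", r_optimized)


print(optimization_job)
```

Both jobs use the same method and basis set, since the frequencies are only meaningful at the equilibrium geometry for that level of theory.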

At least one output file is always produced. It can have any filename, but many systems are set up to use extensions ".lis" or ".out". This is an ASCII file which contains most of the results of the calculation, such as energies, geometries, frequencies and population analysis. Many things are put in this file that the user often ignores on any given calculation.

When a subsequent calculation is to use results from a previous calculation as its inputs, these results can be kept in a special file to avoid having to type them into the new input file. Many such results are put in a file called a checkpoint file, which is a binary file. The use of a checkpoint file can be specified using the correct options on line 2 and the route card.

Properties such as electron density or spin density can be calculated on a regular grid of points in space and saved as a cube file. Cube files can be written in either binary or ASCII format and are often used as input for graphical visualization programs.

Cube file generation is prompted by usage of the "CubeDensity" keyword and specification of a grid of points.
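As an illustration of what a regular grid of points means, the sketch below enumerates the points of a small cubic grid. It does not show how Gaussian itself expects the grid to be specified; the function name, the origin, the spacing and the point counts are arbitrary assumptions chosen for this example.

```python
from itertools import product


def regular_grid(origin, step, counts):
    """Yield the (x, y, z) points of a regular grid with equal spacing on each axis."""
    for i, j, k in product(range(counts[0]), range(counts[1]), range(counts[2])):
        yield (origin[0] + i * step,
               origin[1] + j * step,
               origin[2] + k * step)


# A 3 x 3 x 3 grid spanning -1 to +1 along each axis.
points = list(regular_grid((-1.0, -1.0, -1.0), 1.0, (3, 3, 3)))
print(len(points), points[0], points[-1])
```

A property such as the electron density would be evaluated at each of these points, so the cost and the file size grow with the product of the three point counts.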

Gaussian has many other optional input and output files. Often these are accessed as standard FORTRAN units according to the conventions of the specific operating system being used.

Note that an ab initio keyword must be accompanied by a basis set keyword in the format "ab_initio/basis". All of these can be prefaced by R for closed-shell restricted wave functions, U for unrestricted open-shell wavefunctions or RO for restricted open-shell wavefunctions. This list is provided for the sake of seeing what is available. Many of these have additional options describing how to control the calculation which are listed in the Gaussian User's Guide and Programmer's Reference.

**HF** - Hartree Fock (uses RHF for singlets and UHF for others)

**RHF** - restricted Hartree Fock

**UHF** - unrestricted Hartree Fock

**ROHF** - spin-restricted open-shell Hartree Fock

**OSS** - two open shell singlet wave function

**GVB** - generalized valence bond

**CASSCF** - complete active space MCSCF

**MP2** - Moller-Plesset second order correlation energy correction

**MP3** - Moller-Plesset third order correlation energy correction

**MP4** - same as MP4SDTQ

**MP4DQ** - Moller-Plesset fourth order correlation energy correction with double
and quadruple substitutions.

**MP4SDQ** - Moller-Plesset fourth order correlation energy correction with single,
double and quadruple substitutions.

**MP4SDTQ** - Moller-Plesset fourth order correlation energy correction with single,
double, triple and quadruple substitutions.

**CI** - same as CISD

**CIS** - configuration interaction with single excitations

**CID** - configuration interaction with double excitations

**CISD** - configuration interaction with single and double excitations

**QCISD** - quadratic configuration interaction with single and double excitations

**QCISD(T)** - quadratic configuration interaction with single and double excitations
and triples contribution to the energy

Note that a basis set must accompany an ab initio keyword. The "*" and "**" indicate polarization functions (e.g. 6-31G**). The "+" and "++" indicate diffuse functions. For other options and how to use these options, see the Gaussian User's Guide and Programmer's Reference.

basis | options | atoms |
---|---|---|
STO-3G | * | H - Xe |
3-21G | * ** | H - Cl |
4-21G | * ** | |
4-31G | * ** | H - Ne |
6-21G | * ** | |
6-31G | + ++ * ** | H - Cl |
LP-31G | * ** | |
LP-41G | * ** | |
6-311G | + ++ * ** | H - Ar |
MC-311G | none | H - Ar |
D95 | + ++ * ** | H - Cl |
D95V | + ++ * ** | H - Ne |
SEC | + ++ * ** | H - Cl (same as SHC) |
CEP-4G | + ++ * ** | H - Cl |
CEP-31G | + ++ * ** | H - Cl |
CEP-121G | + ++ * ** | H - Cl |
LANL1MB | none | H - Bi (except lanthanides) |
LANL1DZ | none | H - Bi (except lanthanides) |

The **GEN** keyword allows the basis set to be read from the input file.
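The option symbols can be separated from a basis set name mechanically, which makes the notation easier to read. The sketch below is an illustration only: the function names are invented for this example, the lookup dictionary copies just three rows of the table above, and the parser assumes that the polarization stars come at the end of the name while a plus sign may appear anywhere (as in 6-31+G**).

```python
# Allowed option symbols for a few basis sets, copied from the table above.
ALLOWED = {
    "STO-3G": {"*"},
    "6-31G": {"+", "++", "*", "**"},
    "MC-311G": set(),
}


def parse_basis(name):
    """Split a name such as "6-31+G**" into (base, diffuse, polarization)."""
    body = name.rstrip("*")
    polarization = name[len(body):]          # trailing "*" or "**"
    diffuse = "+" * body.count("+")          # "+" or "++"
    base = body.replace("+", "").upper()     # name with the options removed
    return base, diffuse, polarization


def is_listed(name):
    """True if the base name and its options appear in the ALLOWED excerpt."""
    base, diffuse, polarization = parse_basis(name)
    allowed = ALLOWED.get(base)
    if allowed is None:
        return False
    return all(opt in allowed for opt in (diffuse, polarization) if opt)


print(parse_basis("6-31+G**"))
print(is_listed("sto-3g*"), is_listed("MC-311G*"))
```

A name that fails this check is not necessarily invalid in Gaussian; it only means that the combination is not in the excerpt of the table used here.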

Note that semiempirical methods do not require a separate basis set. All of these can be prefaced by R for closed-shell restricted wavefunctions, U for unrestricted open-shell wavefunctions or RO for restricted open-shell wavefunctions. This list is provided for the sake of seeing what is available. Many of these have additional options describing how to control the calculation which are listed in the Gaussian User's Guide and Programmer's Reference.

**AM1** - Austin method one

**CNDO** - complete neglect of differential overlap

**INDO** - intermediate neglect of differential overlap

**MINDO3** - modified intermediate neglect of differential overlap third
modification.

**MNDO** - modified neglect of differential overlap
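The rule that ab initio keywords need a basis set while semiempirical keywords do not can be expressed as a small check when assembling a route card. This is a sketch under stated assumptions: `bare_method` and `route_card` are hypothetical helpers, the keyword sets are simply copied from the lists in this guide, and the placement of "test" directly after the "#" follows the sample inputs rather than any requirement.

```python
AB_INITIO = {"HF", "OSS", "GVB", "CASSCF", "MP2", "MP3", "MP4", "MP4DQ",
             "MP4SDQ", "MP4SDTQ", "CI", "CIS", "CID", "CISD", "QCISD", "QCISD(T)"}
SEMIEMPIRICAL = {"AM1", "CNDO", "INDO", "MINDO3", "MNDO"}


def bare_method(keyword):
    """Strip an R, U or RO prefix if what remains is a known method keyword."""
    name = keyword.upper()
    for prefix in ("RO", "R", "U"):
        rest = name[len(prefix):]
        if name.startswith(prefix) and rest in AB_INITIO | SEMIEMPIRICAL:
            return rest
    return name


def route_card(method, basis=None, options=()):
    """Assemble a route card, enforcing the basis set rule described above."""
    bare = bare_method(method)
    if bare in AB_INITIO:
        if basis is None:
            raise ValueError(f"{method} is ab initio and needs a basis set")
        model = f"{method}/{basis}"
    elif bare in SEMIEMPIRICAL:
        if basis is not None:
            raise ValueError(f"{method} is semiempirical and takes no basis set")
        model = method
    else:
        raise ValueError(f"{method} is not in the keyword lists of this guide")
    return " ".join(["#", "test", model, *options])


print(route_card("rhf", "STO-3G", ["opt"]))
print(route_card("MNDO", options=["pop=reg"]))
```

The two calls reproduce the route cards of the CO and formaldehyde samples shown earlier.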

The following keywords specify what to calculate, what to print and how to manage the calculation. This is not a comprehensive list; it is provided for the sake of seeing what is available. For other options and how to use these options, see the Gaussian User's Guide and Programmer's Reference.

**ANG** - distances in Angstroms

**AU** - distances in bohrs

**DEG** - angles in degrees

**RAD** - angles in radians

**CubeDensity** - generate a cube file

**Density** - for cube file generation

**direct** - compute integrals as needed (rather than storing them in a file on disk)

**InCore** - do integrals in core memory

**field** - add a finite field to the calculation

**freq** - frequency determination

**freq=noraman** - frequency determination without Raman intensities

**GFPRINT** - put basis in output file

**GFINPUT** - put basis in output file in format for generalized input

**IRC** - follow a reaction path

**LST** - linear synchronous transit

**NoFreeze** - optimize all variables

**opt** - geometry optimization

**Polar** - calculate polarizability and hyperpolarizability, if possible

**pop=none** - no population analysis

**pop=min** - minimal printing of Mulliken population analysis

**pop=reg** - some printing of Mulliken population analysis

**pop=full** - full printing of Mulliken population analysis

**pop=bonding** - bonding population analysis

**pop=no** - natural orbital analysis

**pop=noab** - natural orbital analysis for separate alpha and beta spins

**prop=grid** - computes electrostatic potential

**prop=field** - computes electrostatic potential and field

**prop=EFG** - computes electrostatic potential, field and field gradients

**punch** - puts various information in a separate output file

**ReadIsotopes** - read in masses for each atom

**Restart** - restart optimization from a checkpoint file

**TS** - optimization of a transition state

An expanded version of this article will be published in
*"Computational Chemistry: A Practical Guide for Applying Techniques
to Real World Problems" by David Young, which will be available from
John Wiley & Sons in the spring of 2001.*