From jkl { *at * } ccl.net Mon Mar  4 18:26:56 1991
Date: Mon, 04 Mar 91 17:52:37 EST
From: jkl' at \`ccl.net
Subject: Basis sets intro. Part 1/3
To: chemistry -AatT- ccl.net
Status: RO

=============================================================================
|                                                                           |
| SIMPLIFIED INTRODUCTION TO AB INITIO BASIS SETS. TERMS AND NOTATION.      |
|                                                                           |
| Jan K. Labanowski, Ohio Supercomputer Center, 1224 Kinnear Rd., Columbus, |
| OH 43212-1163, USA.  E-mail: jkl # - at - # ccl.net, JKL # - at - # OHSTPY.BITNET               |
|                                                                           |
| Permission is granted to do whatever you wish with this.                  |
|                                                                           |
=============================================================================

WHY ALL THIS?
==============
This is a general introduction to basis sets. It is not yet a sulphur basis
set summary. I will post the sulphur summary soon.
Many thanks to all who responded to my question about basis sets for
sulphur correlated calculations. Here is a list of people who
helped me (directly or indirectly) with sulphur sets:
Raymond Bair, John Bloor, Thom Dunning, Bob Eades, Dave Ewing, Dave Feller,
Mike Frisch, Mark Gordon, Scott Lamson, Russ Pitzer, Bob Zellmer.

Many people on this list know infinitely more about basis sets than I.
There are, however, a few reasons I decided to write this summary:
 1. From the perspective of "list owner," I see many people on the list
    for whom ab initio might be a useful tool. These people, however, might have
    problems following the original literature on the topic. I tried to compile
    the FAQ (better known in network lingo as "Frequently Asked Questions")
    asked by Ohio Supercomputer Center users who want to use quantum chemistry
    as a tool but wish they did not have to go through the difficult
    literature.
 2. I want to change the ratio of "scientific/other" topics on the list.
 3. Last, but not least, I would appreciate your comments and corrections,
    since I also teach this subject to graduate students at The Ohio State
    Univ. College of Pharmacy. Getting your criticism will help me prepare
    a better handout.

I hope that even those of you who know infinitely more about it than I will
benefit by using some of my examples for the courses which you obviously teach.

   Sometimes I will have to use subscripts and superscripts. I will use a
"pseudo" TeX convention, i.e. ^ for superscript and _ for subscript.
E.g. 2^3 = 8 and a_1 means a sub 1. To simplify the notation even further,
I will use exp to denote exponential function, i.e. e^x = exp[x]. 
I will not even touch the topic of Floating Orbital and Lobe Functions, since
they are not used outside the narrow group of quantum chemistry professionals,
though I think they are important.


REFERENCES (since this summary comes to you in several pieces, I've decided to
==========  put references on the top).


R. Ahlrichs, P.R.Taylor, 1981, "The choice of gaussian basis sets for
   molecular electronic structure calculations," J.Chim.Phys, 78, 315-324.

J. Almlof, P.R. Taylor, 1987, "General contraction of Gaussian basis sets.
   I. Atomic natural orbitals for first- and second-row atoms," J. Chem. Phys.
   86, 4070-4077.

J. Almlof, T. Helgaker, P.R. Taylor, 1988, "Gaussian basis sets for high-
   quality ab initio calculations," J. Phys. Chem., 92, 3029-3033.

J. Andzelm, M. Klobukowski, E. Radzio-Andzelm, Y. Sakai, H. Tatewaki, 1984,
   "Gaussian Basis Sets for Molecular Calculations," (S. Huzinaga, Editor),
   Elsevier, Amsterdam.

T. Clark, J. Chandrasekhar, G.W. Spitznagel, P.v.R. Schleyer, (1983), 
   "Efficient diffuse function-augmented basis sets for Anion Calculations.
   III. The 3-21+G set for first-row elements, Li-F," J. Comput. Chem., 4,
   294-301.

E.Clementi, S.J.Chakravorty, G.Corongiu, V.Sonnad, 1990, "Independent
   Electron Models: Hartree-Fock for Many-Electron Atoms," pp. 47-140, in:
   "MOTECC-90 Modern Techniques in Computational Chemistry" (E.Clementi, Ed.),
   Escom, Leiden 1990, ISBN-90-72199-07-3.

J.E. Del Bene, 1989, "An ab initio molecular orbital study of the structures
   and energies of neutral and charged bimolecular complexes of NH_3 with
   hydrides AH_n (A = N, O, F, P, S, and Cl)," J. Comput. Chem. 10, 603-615.

T.H. Dunning,  P.J. Hay, 1977,  "Gaussian basis sets for molecular
   calculations," in: "Modern Theoretical Chemistry," vol. 3. 
   Ed. H.F. Schaefer III, pp. 1-28, Plenum Press, New York.

T.H. Dunning, Jr., 1989, "Gaussian basis sets for use in correlated molecular
   calculations," J.Chem.Phys. 90, 1007-1023.

P. Durand, J.-C. Barthelat, 1975, "A theoretical method to determine atomic
   pseudopotentials for electronic structure calculations of molecules and
   solids," Theor. Chim. Acta(Berl.) 38, 283-302.

D. Feller, E.R. Davidson, 1986, "Basis set selection for molecular
   calculations," Chemical Reviews, 86, 681-696.

D. Feller, E.R. Davidson, 1990, "Basis Sets for Ab Initio Molecular Orbital
   Calculations and Intermolecular Interactions," pp. 1-43, in: "Reviews in
   Computational Chemistry" (Editors: Kenny B. Lipkowitz and Donald B. Boyd),
   VCH, New York.

M.M. Francl, J.S. Binkley, M.S. Gordon, D.J. DeFrees, J.A. Pople, 1982, "Self-
   consistent molecular orbital methods. XXIII. A polarization-type basis set
   for second-row elements," J.Chem.Phys. 77, 3654-3665.

M.J. Frisch, J.A. Pople, J.S. Binkley, 1984, "Self-consistent molecular orbital
   methods 25. Supplementary functions for Gaussian Basis Sets," J. Chem. Phys.
   80, 3265-3269.

M.S. Gordon, 1980, "The isomers of silacyclopropane," Chem. Phys. Lett.,
   76, 163-168.

M. Gutowski, F.B. Van Duijneveldt, G. Chalasinski, L. Piela, 1987, "Proper
   correction for the basis set superposition error in SCF calculations
   of intermolecular interactions," Mol. Phys. 61, 233-247.

P.J. Hay, W.R. Wadt, 1985a, "Ab Initio effective core potentials for molecular
   calculations. Potentials for transition metal atoms Sc to Hg,"
   J. Chem. Phys. 82, 271-283.

P.J. Hay, W.R. Wadt, 1985b, "Ab Initio effective core potentials for molecular
   calculations. Potentials for K to Au including outermost core orbitals,"
   J. Chem. Phys. 82, 299-310.

W.J. Hehre, L. Radom, P.v.R Schleyer, J.A. Pople, 1986, "Ab Initio Molecular
   Orbital Theory," Wiley & Sons, New York.

M.M. Hurley, L.F. Pacios, P.A. Christiansen, R.B. Ross, W.C. Ermler, 1986,
   "Ab initio relativistic effective potentials with spin-orbit operators.
   II. K through Kr," J. Chem. Phys. 84, 6840-6853.

S. Huzinaga, 1965, "Gaussian-type functions for polyatomic systems," 
   J. Chem. Phys., 42, 1293-1302.

S. Huzinaga, D. McWilliams, B. Domsky, 1971, "Approximate Atomic Function,"
   J. Chem. Phys. 54, 2283-2284.

K. Jankowski, R. Becherer, P. Scharf, H. Schiffer, R. Ahlrichs, 1985, "The
   impact of higher polarization basis functions on molecular ab initio
   results," J. Chem. Phys. 82, 1413-1419.

R. Krishnan, J.S. Binkley, R. Seeger, J.A. Pople, 1980, "Self-consistent
   orbital methods. XX. A basis set for correlated wave functions," 
   J. Chem. Phys. 72, 650-654.

A.D. McLean, G.S. Chandler, 1980, "Contracted gaussian basis sets for molecular
   calculations. I. Second row atoms, Z=11-18," J. Chem. Phys., 72, 5639-5648.

L.F. Pacios, P.A. Christiansen, 1985, "Ab initio relativistic effective
   potentials with spin-orbit operators. I. Li through Ar."

R. Poirier, R. Kari, I.G. Csizmadia, 1985, "Handbook of Gaussian Basis Sets,"
   Elsevier Science, New York.

R.C. Raffenetti, 1973, "General contraction of Gaussian atomic orbitals:
   Core, valence, polarization and diffuse basis sets; Molecular integral
   evaluation," J. Chem. Phys, 58, 4452-4458.

M. Sabio, S. Topiol, 1989, "3s- versus 1s-type gaussian primitives:
   Modification of the 3-21G(*) basis set for sulphur atom," J. Comput. Chem.,
   10, 660-672.

W.J. Stevens, H. Basch, M. Krauss, 1984, J. Chem. Phys. 81, 6026-6033.

A. Szabo, N.S. Ostlund, 1989,  "Modern Quantum Chemistry: Introduction
   to Advanced Electronic Structure Theory". MacMillan Publishing Co., 
   New York.

W.R. Wadt, P.J. Hay, 1985, "Ab Initio effective core potentials for molecular
   calculations. Potentials for main group elements Na to Bi," J. Chem. Phys.
   82, 284-298.

N.M. Wallace, J.P. Blaudeau, R.M. Pitzer, 1991, "Optimized Gaussian Basis Sets
   for use with Relativistic Effective (core) Potentials: Li-Ar,"
   Int. J. Quantum Chem.  (in press).


INTRODUCTION
============
Some straightforward reviews on basis sets are available: (Ahlrichs & Taylor,
1981), (Andzelm et al., 1984), (Dunning & Hay, 1977), (Feller & Davidson,
1986), (Feller & Davidson, 1990), (Poirier et al., 1985).

Historically, the quantum calculations for molecules were performed as LCAO MO,
i.e. Linear Combination of Atomic Orbitals - Molecular Orbitals. This means
that molecular orbitals are formed as a linear combination of atomic orbitals:

       psi_i = sum from u=1 to n { c_ui*phi_u }

  where psi_i is the i-th molecular orbital, c_ui are the coefficients of
  linear combination, phi_u is the u-th atomic orbital, and n is the number
  of atomic orbitals.
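
The expansion above can be sketched in a few lines of Python. This is only an
illustration: the two 1s-like functions, the geometry, and the coefficients
are made up, and nothing here is normalized.

```python
import math

def molecular_orbital(coeffs, basis_funcs, point):
    """Evaluate psi_i = sum_u c_ui * phi_u at a point (x, y, z)."""
    return sum(c * phi(point) for c, phi in zip(coeffs, basis_funcs))

def make_1s(center, zeta=1.0):
    """Hypothetical unnormalized 1s-like function exp(-zeta*|r - center|)."""
    def phi(point):
        return math.exp(-zeta * math.dist(point, center))
    return phi

# Two centers 1.4 bohr apart (an illustrative, roughly H2-like geometry)
phi_a = make_1s((0.0, 0.0, 0.0))
phi_b = make_1s((0.0, 0.0, 1.4))

# Same basis functions, different coefficients: two different MOs,
# both evaluated at the midpoint between the centers
midpoint = (0.0, 0.0, 0.7)
bonding = molecular_orbital([0.5, 0.5], [phi_a, phi_b], midpoint)
antibonding = molecular_orbital([0.5, -0.5], [phi_a, phi_b], midpoint)
```

The in-phase combination has amplitude at the midpoint, while the
out-of-phase one has a node there.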

Strictly speaking, Atomic Orbitals (AO) are solutions of the Hartree-Fock
equations for the atom, i.e., wave functions for a single electron in the
atom. Anything else is not really an atomic orbital. Some things are similar
though, and there is a lot of confusion in the terminology used. Later on,
the term "atomic orbital" was replaced by "basis function" or "contraction,"
when appropriate. Early on, Slater Type Orbitals (STOs) were used as basis
functions due to their similarity to the atomic orbitals of the hydrogen atom.
They are described by a function of spherical coordinates:

       phi_i(zeta,n,l,m;r,theta,phi) = N*r^(n-1)*exp[-zeta*r]*Y_lm(theta,phi)

   where N is a normalization constant and zeta is called the "exponent".
   The r, theta, and phi are spherical coordinates, and Y_lm is the angular
   momentum part (the function describing "shape"). The n, l, and m are quantum
   numbers: principal, angular momentum, and magnetic, respectively.
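
A minimal Python sketch of the radial part of this function follows. The
closed form used for N is the standard radial normalization, which is my
addition and not part of the formula above; the angular part Y_lm is left out.

```python
import math

def sto_radial(n, zeta, r):
    """Normalized radial part N * r^(n-1) * exp(-zeta*r) of a Slater orbital.

    N is chosen so that the integral of R(r)^2 * r^2 over r from 0 to
    infinity equals 1 (the angular part Y_lm is normalized separately).
    """
    norm = (2.0 * zeta) ** (n + 0.5) / math.sqrt(math.factorial(2 * n))
    return norm * r ** (n - 1) * math.exp(-zeta * r)

# Hydrogen-like 1s (n=1, zeta=1): R(r) = 2*exp(-r), so R(0) = 2
value_at_nucleus = sto_radial(1, 1.0, 0.0)
```

Note the nonzero slope (the "cusp") at r = 0 for n = 1, which a gaussian
cannot reproduce.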

Unfortunately, functions of this kind are not suitable for fast calculation
of the necessary two-electron integrals. That is why Gaussian Type Orbitals
(GTOs) were introduced. You can approximate the shape of an STO by
summing up a number of GTOs with different exponents and coefficients.
Even if you use 4 or 5 GTOs to represent each STO, you will still calculate
your integrals much faster than if the original STOs were used. The GTO (also
called a cartesian gaussian) is expressed as:

       g(alpha,l,m,n;x,y,z) = N*exp[-alpha*r^2]*x^l*y^m*z^n

   where N is a normalization constant, alpha is called "exponent".
   The x, y, and z are cartesian coordinates. The l, m, and n ARE NOT QUANTUM
   NUMBERS but simply integer exponents of the cartesian coordinates.
   r^2 = x^2 + y^2 + z^2.
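
For readers who prefer code, here is a Python sketch of a single normalized
cartesian gaussian centered at the origin. The explicit expression for N is
the standard closed form with double factorials; it is my addition and is not
spelled out in the formula above. The function names are arbitrary.

```python
import math

def double_factorial(k):
    """k!! for odd k >= -1, with (-1)!! = 1."""
    result = 1
    while k > 1:
        result *= k
        k -= 2
    return result

def cartesian_gaussian(alpha, l, m, n, x, y, z):
    """Normalized primitive N * exp(-alpha*r^2) * x^l * y^m * z^n."""
    L = l + m + n
    norm = (2.0 * alpha / math.pi) ** 0.75 * math.sqrt(
        (4.0 * alpha) ** L
        / (double_factorial(2 * l - 1)
           * double_factorial(2 * m - 1)
           * double_factorial(2 * n - 1))
    )
    r2 = x * x + y * y + z * z
    return norm * math.exp(-alpha * r2) * x ** l * y ** m * z ** n

# s-type function at the origin: just N = (2*alpha/pi)^0.75
s_at_origin = cartesian_gaussian(1.0, 0, 0, 0, 0.0, 0.0, 0.0)
# p_x-type function anywhere in the x = 0 plane: exactly zero
px_on_node = cartesian_gaussian(1.0, 1, 0, 0, 0.0, 0.5, 0.5)
```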

Calling gaussians GTOs is probably a misnomer, since they are not really
orbitals. They are simpler functions. In recent literature, they are frequently
called gaussian primitives. The main differences from STOs are that r^(n-1),
the preexponential factor, is dropped, the r in the exponential function is
squared, and the angular momentum part is a simple function of cartesian
coordinates. The absence of the r^(n-1) factor restricts gaussians to
approximating only 1s, 2p, 3d, 4f ... orbitals. This was done for practical
reasons, namely, for fast integral calculations. The following gaussian
functions are possible:

   1s      = N exp[-alpha*r^2]
   2p_x    = N exp[-alpha*r^2]*x
   2p_y    = N exp[-alpha*r^2]*y
   2p_z    = N exp[-alpha*r^2]*z
   3d_xx   = N exp[-alpha*r^2]*x^2
   3d_xy   = N exp[-alpha*r^2]*x*y
   3d_xz   = N exp[-alpha*r^2]*x*z
   3d_yy   = N exp[-alpha*r^2]*y^2
   3d_yz   = N exp[-alpha*r^2]*y*z
   3d_zz   = N exp[-alpha*r^2]*z^2
   4f_xxx  = N exp[-alpha*r^2]*x^3
   4f_xxy  = N exp[-alpha*r^2]*x^2*y
   4f_xyz  = N exp[-alpha*r^2]*x*y*z
   etc.

Sometimes, the so-called scale factor, f, is used to scale all exponents
in the related gaussians. In this case, the gaussian function is written as:

       g(alpha,l,m,n,f;x,y,z) = N exp[-alpha*f^2*r^2]*x^l*y^m*z^n

Be careful not to confuse it with "f" for the f-orbital. 
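
In code, applying a scale factor is nothing more than multiplying every
exponent by f^2. A tiny Python sketch, using the hydrogen s-type exponents
quoted later in this text and an arbitrary f chosen only for illustration:

```python
def scale_exponents(exponents, f):
    """Return exponents multiplied by f^2, as in exp[-alpha*f^2*r^2]."""
    return [alpha * f * f for alpha in exponents]

# f = 1.2 is an arbitrary illustrative scale factor, not a recommended value.
scaled = scale_exponents([0.123317, 0.453757, 2.01330, 13.3615], 1.2)
```

With f > 1 all functions become tighter (more compact); with f < 1 they
become more diffuse.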

The sum of exponents at cartesian coordinates, L = l+m+n, is used
analogously to the angular momentum quantum number for atoms, to mark 
functions as s-type (L=0), p-type (L=1), d-type (L=2), f-type (L=3), etc.
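
The bookkeeping is easy to automate. The following Python sketch (function
names are mine) lists all cartesian gaussians of a given L and labels them in
the same way as the list above, without the leading shell number:

```python
def cartesian_components(L):
    """All (l, m, n) of non-negative integers with l + m + n == L."""
    return [(l, m, L - l - m)
            for l in range(L, -1, -1)
            for m in range(L - l, -1, -1)]

def label(l, m, n):
    """Label such as 's', 'p_x', 'd_xy', 'f_xxy' for exponents (l, m, n)."""
    L = l + m + n
    shell = "spdfgh"[L]
    if L == 0:
        return shell
    return shell + "_" + "x" * l + "y" * m + "z" * n

d_labels = [label(*lmn) for lmn in cartesian_components(2)]
f_labels = [label(*lmn) for lmn in cartesian_components(3)]
```

Here d_labels holds the six d-type functions (d_xx, d_xy, d_xz, d_yy, d_yz,
d_zz) and f_labels holds ten f-type functions.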

There is a problem with d-type and higher functions. There are only 5 linearly
independent and orthogonal d orbitals, while there are 6 possible cartesian
gaussians. If we use all six, we are also introducing a 3s type function since:

3d_xx + 3d_yy + 3d_zz = N (x^2 + y^2 +z^2) exp[-alpha r^2] = 
                      = N r^2 exp[-alpha r^2] = 3s

More recently, this effect was studied for sulphur by Sabio & Topiol (1989).

Examination of f-type functions shows that there are 10 possible cartesian
gaussians, which introduce 4p_x, 4p_y and 4p_z type contamination. However,
there are only 7 linearly independent f-type functions. This is a major
headache since some programs remove these spurious functions and some do not.
Of course, the results obtained with all possible cartesian gaussians will be
different from those obtained with a reduced set.
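
The counting behind these numbers can be written down as a short Python
sketch. The two closed-form counts are standard results which I add here for
convenience: (L+1)(L+2)/2 cartesian gaussians versus 2L+1 linearly
independent functions of pure angular momentum L.

```python
def n_cartesian(L):
    """Number of cartesian gaussians with l + m + n == L."""
    return (L + 1) * (L + 2) // 2

def n_pure(L):
    """Number of linearly independent functions of angular momentum L."""
    return 2 * L + 1

for L, shell in enumerate("spdf"):
    extra = n_cartesian(L) - n_pure(L)
    print(f"{shell}: {n_cartesian(L)} cartesian, {n_pure(L)} pure, {extra} extra")
```

The extra functions are the 3s-type contaminant for d (one function) and the
4p-type contaminants for f (three functions). When comparing energies from
two programs, first check whether both use the same count.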


HOW ARE THESE GAUSSIAN PRIMITIVES DERIVED?
=========================================
Gaussian primitives are usually obtained from quantum calculations on atoms 
(i.e. Hartree-Fock or Hartree-Fock plus some correlated calculations, e.g. CI).
Typically, the exponents are varied until the lowest total energy of the atom
is achieved (Clementi et al., 1990). In some cases, the exponents are optimized
independently. In others, the exponents are related to each other by some
equation, and parameters in this equation are optimized (e.g. even-tempered
or "geometrical" and well-tempered basis sets). The primitives so derived
describe isolated atoms and cannot accurately describe deformations of atomic
orbitals brought about by the presence of other atoms in the molecule. Basis sets
for molecular calculations are therefore frequently augmented with other
functions which will be discussed later. 
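
To illustrate the "geometrical" idea: in an even-tempered set the exponents
form a geometric series, so only two parameters have to be optimized for each
function type instead of every exponent separately. A minimal Python sketch,
assuming the commonly used form alpha_k = alpha * beta^k (indexing
conventions differ between papers) and made-up parameter values:

```python
def even_tempered(alpha, beta, count):
    """Exponents alpha * beta**k for k = 0 .. count-1 (a geometric series)."""
    return [alpha * beta ** k for k in range(count)]

# alpha = 0.1 and beta = 3.0 are made-up values for illustration only.
exponents = even_tempered(0.1, 3.0, 4)
```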

For molecular calculations, these gaussian primitives have to be contracted,
i.e., certain linear combinations of them will be used as basis functions.
The term contraction means "a linear combination of gaussian primitives
to be used as a basis function." Such a basis function will have its
coefficients and exponents fixed. The contractions are sometimes called
Contracted Gaussian Type Orbitals (CGTO). To clear things up, here is a simple
example from Szabo and Ostlund (1989). The coefficients and exponents of the
gaussian expansion which minimizes the energy of the hydrogen atom were derived
by Huzinaga (1965). Four s-type gaussians were used to represent the 1s orbital
of hydrogen as:

psi_1s = 0.50907 N_1 exp[-0.123317 r^2] + 0.47449 N_2 exp[-0.453757 r^2] +
         0.13424 N_3 exp[-2.01330 r^2] + 0.01906 N_4 exp[-13.3615 r^2]

  N_i is a normalization constant for a given primitive. In the case of
  gaussians of type s it is equal to (2 alpha/pi)^0.75.
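
You can check this expansion yourself. The Python sketch below computes the
norm and the energy of psi_1s for the hydrogen atom, using the standard
closed-form one-center integrals for s-type gaussians in atomic units. Those
integral formulas are textbook results added by me, not something given
above.

```python
import math

# Coefficients and exponents of the four-gaussian hydrogen 1s expansion above
COEFFS = [0.50907, 0.47449, 0.13424, 0.01906]
EXPONENTS = [0.123317, 0.453757, 2.01330, 13.3615]

def s_norm(alpha):
    """Normalization constant of an s-type gaussian: (2*alpha/pi)^0.75."""
    return (2.0 * alpha / math.pi) ** 0.75

def overlap(a, b):
    """Overlap of unnormalized exp[-a r^2] and exp[-b r^2] on one center."""
    return (math.pi / (a + b)) ** 1.5

def kinetic(a, b):
    """Kinetic energy integral between the same two functions."""
    return 3.0 * a * b / (a + b) * overlap(a, b)

def nuclear(a, b):
    """Attraction to a unit positive charge placed at the common center."""
    return -2.0 * math.pi / (a + b)

def expectation(matrix_element):
    """Sum over i, j of d_i * d_j * <g_i|op|g_j>, with d_i = c_i * N_i."""
    d = [c * s_norm(a) for c, a in zip(COEFFS, EXPONENTS)]
    return sum(d[i] * d[j] * matrix_element(EXPONENTS[i], EXPONENTS[j])
               for i in range(4) for j in range(4))

norm = expectation(overlap)
energy = (expectation(kinetic) + expectation(nuclear)) / norm
print(f"<psi|psi> = {norm:.5f}, E = {energy:.5f} hartree (exact: -0.5)")
```

The norm comes out very close to 1, and the energy lies slightly above the
exact value of -0.5 hartree, as the variational principle requires.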

These primitives may be grouped in 2 contractions. The first contraction
contains only 1 primitive:

phi_1 = N_1 exp[-0.123317 r^2]  

Three primitives are present in the second contraction:

phi_2 = N {0.47449 N_2 exp[-0.453757 r^2] + 0.13424 N_3 exp[-2.01330 r^2] +
            0.01906 N_4 exp[-13.3615 r^2]} 

  N is a normalization constant for the whole contraction.

In this case, 4 primitives were contracted to 2 basis functions. This is
frequently denoted as a (4s) -> [2s] contraction (some use (4s)/[2s] notation).
The coefficients in function phi_2 are then fixed in subsequent molecular
calculations. 
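
The normalization constant N of phi_2 is not given above, but it follows from
the overlaps between the normalized primitives. A Python sketch under that
assumption (the overlap formula for two normalized s-type gaussians on the
same center is a standard result; the function names are mine):

```python
import math

def normalized_s_overlap(a, b):
    """Overlap of two normalized s-type gaussians sharing one center."""
    return (2.0 * math.sqrt(a * b) / (a + b)) ** 1.5

def contraction_norm(coeffs, exponents):
    """N such that N * sum_i c_i * (normalized primitive i) has unit norm."""
    total = sum(ci * cj * normalized_s_overlap(ai, aj)
                for ci, ai in zip(coeffs, exponents)
                for cj, aj in zip(coeffs, exponents))
    return 1.0 / math.sqrt(total)

# The three-primitive contraction phi_2 from the text
N = contraction_norm([0.47449, 0.13424, 0.01906], [0.453757, 2.01330, 13.3615])
```

N is larger than 1 here because the three coefficients were taken out of a
four-term expansion that was normalized as a whole.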

The way in which contractions are derived is not easy to summarize. Moreover,
it depends upon the intended use for the basis functions. It is a good idea to
always read the original paper which describes the way in which contractions
have been done. Some basis sets are good for geometry and energies, some are
aimed at properties (e.g. polarizability), some are optimized only with
Hartree-Fock in mind, and some are tailored for correlated calculations.
Finally, some are good for anions and others for cations and neutral molecules.
For some calculations, a good representation of the inner (core) orbitals is
necessary (e.g. for properties required to analyze the NMR spectrum), while
others require the best possible representation of valence electrons.

WHY ARE CONTRACTIONS DONE?
==========================
Obviously, the best results could be obtained if all coefficients in the
gaussian expansion were allowed to vary during molecular calculations.
However, this is rarely affordable. The computational effort (i.e., "CPU time")
for calculating integrals in the Hartree-Fock procedure depends upon the 4th
power of the number of gaussian primitives, but all subsequent steps depend
upon the number of basis functions (i.e., contractions). Also, the storage
required for integrals (when Direct SCF is not used) grows with the number of
basis functions (not primitives!). Frequently the disk storage and not the CPU
time is the limiting factor. The CPU time requirements are more acute when
post-Hartree-Fock (e.g., correlated) methods are used, since the dependence
upon the number of basis functions is then steeper than the 4th power.
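
To get a feeling for the numbers, here is a rough Python counting sketch. It
counts distinct two-electron integrals (ij|kl) over n real functions using
their 8-fold permutational symmetry. The molecule is hypothetical, and real
programs also skip negligible integrals, so treat the output as an order of
magnitude estimate only.

```python
def unique_two_electron_integrals(n):
    """Number of distinct (ij|kl) over n real functions, using the
    8-fold permutational symmetry: pairs = n(n+1)/2, then pairs of pairs."""
    pairs = n * (n + 1) // 2
    return pairs * (pairs + 1) // 2

# Hypothetical molecule: 10 atoms, each carrying the (4s) -> [2s] set above
primitives, contractions = 10 * 4, 10 * 2
over_primitives = unique_two_electron_integrals(primitives)
over_contractions = unique_two_electron_integrals(contractions)
print(over_primitives, over_contractions)
```

Halving the number of functions cuts the integral count by roughly a factor
of 16, which is the 4th power dependence at work.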

There are two basic forms of contractions, namely "segmented" and "general".
The segmented contractions are disjoint, i.e., a given primitive appears in
only one contraction. The example given above (4s) -> [2s] is a segmented
contraction. Occasionally, one or two primitives may appear in more than one
contraction, but this is an exception to the rule. The general contractions,
on the contrary, allow each of the primitives to appear in each basis function
(contraction). The segmented contractions are far more popular and will be
described first. The reason for their popularity is not that they are better,
but simply that the most popular ab initio packages do not implement efficient
integral calculations with general contractions. The computer code to perform
integral calculations with general contractions is much more complex than that
for the segmented case. 
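
One way to picture the difference is to write a contraction as a matrix of
coefficients with one row per primitive and one column per basis function.
The Python sketch below uses this picture; the layout and the function name
are my own choice, and the numbers in the "general" matrix are invented
purely to show its shape.

```python
def is_segmented(matrix, tol=0.0):
    """True if every primitive (row) contributes to at most one
    contraction (column)."""
    return all(sum(1 for c in row if abs(c) > tol) <= 1 for row in matrix)

# Rows: the four hydrogen primitives; columns: phi_1 and phi_2 from the
# (4s) -> [2s] example above.
segmented = [
    [1.0, 0.0],
    [0.0, 0.47449],
    [0.0, 0.13424],
    [0.0, 0.01906],
]

# A general contraction lets every primitive appear in every column.
# These coefficients are NOT from any published basis set.
general = [
    [0.51, -0.30],
    [0.47, 0.10],
    [0.13, 0.60],
    [0.02, 0.45],
]

print(is_segmented(segmented), is_segmented(general))
```

In the segmented case each primitive integral is needed for one basis
function only, while in the general case the same primitive integral
contributes to several basis functions at once.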


---