From chemistry-request&$at$&ccl.net Tue Mar  5 09:36:03 1991
Date: Mon, 04 Mar 91 17:55:02 EST
From: jkl "-at-" ccl.net
Subject: Basis sets intro. Part 3/3
To: chemistry[ AT ]ccl.net
Status: RO

 
Polarization and diffuse functions
----------------------------------
The original contractions derived from atomic Hartree-Fock calculations
are frequently augmented with other functions. The most popular are the
polarization and diffuse functions. Polarization functions are simply
functions with higher values of L than those present in the occupied
atomic orbitals of the corresponding atom. At least for me, there is some
ambiguity here, since for lithium the p-type functions are not considered
polarization functions, while for sulphur the d-functions are considered
polarization functions. In both cases these orbitals are not populated
in the ground electronic state of the atom. The reason for including p-type
functions for the Li and Be atoms, even in minimal basis sets, is
practical, however: without these functions, the results are extremely poor.
The reason for not including d-type functions for sulphur should be the same
as for other atoms, i.e., you can obtain reasonable results without them.
I wish I could believe that.
 
The exponents for polarization functions cannot be derived from Hartree-Fock
calculations for the atom, since the corresponding orbitals are not populated.
They can, however, be estimated from correlated calculations on atoms. In
practice, these exponents are estimated "using well established rules of
thumb or by explicit optimization" (Dunning, 1989).
 
The polarization functions are important for reproducing chemical bonding.
Their exponents were frequently derived by optimization for a set of
molecules. They should also be included in all correlated calculations. They
are usually added as uncontracted gaussians. It is important to remember that
adding them is costly. Augmenting a basis set with d-type polarization
functions adds 5 (or 6) basis functions on each atom, while adding f-type
functions adds 7 (or 10, if spurious combinations are not removed). This
brings us to the problem of specifying the number of d, f, g, etc.
polarization functions in some compact notation. Unfortunately, there is no
provision for this information in the notations described above. Pople's
6-31G* basis uses 6 d-type functions as polarization functions, while
6-311G* uses 5 of them. The () notation is no better. If the paper does not
say explicitly how many d or f functions are used, you are on your own. The
only way to find out is to repeat the calculations or contact the author.
Many papers do not specify this important information.
Pople's group introduced yet another, more general notation to encode the
type of polarization functions. The easiest way to explain it is by example.
The 6-31G** is synonymous with 6-31G(d,p); the 6-311G(3d2f,2p) represents
the 6-311G set augmented with 3 functions of type d and 2 functions of type f
on heavy atoms, and 2 functions of type p on hydrogens, or specifically
(6311,311,111,11)/(311,11), i.e. a (11s,5p,3d,2f/5s,2p) -> [4s,3p,3d,2f/3s,2p]
contraction. Each d-type polarization set adds 6 functions to the 6-31G set,
but only 5 to the 6-311G set. For both the 6-31G and 6-311G sets, f-type
functions are added in groups of 7. Polarization functions are, as a rule,
used uncontracted.
More information can be found in the following papers: (Dunning, 1989),
(Francl et al., 1982), (Gutowski et al., 1987), (Jankowski, 1985), (Krishnan
et al., 1980).
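
The counts quoted above (5 or 6 for d, 7 or 10 for f) follow from two simple
formulas: a pure (spherical) shell of angular momentum L has 2L+1 components,
while a Cartesian shell has (L+1)(L+2)/2. The short Python sketch below applies
them to a Pople-style polarization specification such as "3d2f". The function
names are made up for this illustration and do not come from any program.

```python
import re

# Angular momentum quantum number for each shell letter.
L_OF = {"s": 0, "p": 1, "d": 2, "f": 3, "g": 4}

def n_pure(l):
    """Number of pure (spherical) components: 2L+1."""
    return 2 * l + 1

def n_cartesian(l):
    """Number of Cartesian components: (L+1)(L+2)/2."""
    return (l + 1) * (l + 2) // 2

def parse_polarization(spec):
    """'3d2f' -> {'d': 3, 'f': 2}; a missing count means 1 ('df' -> d:1, f:1)."""
    return {t: int(n) if n else 1
            for n, t in re.findall(r"(\d*)([spdfg])", spec)}

def added_functions(spec, cartesian_d=False):
    """Basis functions added to one atom by the polarization sets in spec.

    cartesian_d=True counts 6 functions per d set (as in 6-31G*);
    otherwise 5 per d set (as in 6-311G*). f sets always count as 7 here.
    """
    total = 0
    for t, n in parse_polarization(spec).items():
        l = L_OF[t]
        per_set = n_cartesian(l) if (t == "d" and cartesian_d) else n_pure(l)
        total += n * per_set
    return total

# 6-311G(3d2f,2p): functions added per heavy atom and per hydrogen.
print(added_functions("3d2f"), added_functions("2p"))
```

For 6-311G(3d2f,2p) this gives 3*5 + 2*7 functions per heavy atom and 2*3 per
hydrogen, on top of the functions already present in 6-311G.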
 
The basis sets are also frequently augmented with the so-called diffuse
functions. The name says it all. These gaussians have very small exponents
and decay slowly with distance from the nucleus. Diffuse gaussians are
usually of s and p type; however, diffuse polarization functions are
sometimes used as well. Diffuse functions are necessary for a correct
description of anions and weak bonds (e.g. hydrogen bonds) and are frequently
used for calculations of properties (e.g. dipole moments, polarizabilities,
etc.). For Pople's basis sets the following notation is used: n-ij+G, or
n-ijk+G, when 1 diffuse s-type and 1 diffuse p-type gaussian with the same
exponent are added to a standard basis set on heavy atoms. The n-ij++G, or
n-ijk++G, sets are obtained by adding 1 diffuse s-type and 1 diffuse p-type
gaussian on heavy atoms and 1 diffuse s-type gaussian on hydrogens. For
example, the 6-31+G* represents (6311,311,1)/(31) or
(11s,5p,1d/4s) -> [4s,3p,1d/2s]. The 6-311+G(2d1f,2p1d) stands for the
(63111,3111,11,1)/(311,11,1) split, or
(12s,6p,2d,1f/5s,2p,1d) -> [5s,4p,2d,1f/3s,2p,1d].
For more information about diffuse functions see, for example (Clark et al.,
1983), (Del Bene, 1989), and (Frisch et al., 1984).
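
Converting between the "split" notation and the (primitives) -> [contractions]
notation is mechanical: within each angular type, the digits sum to the number
of primitives and the number of digits is the number of contractions. A small
Python sketch follows; the helper names describe() and summary() are
hypothetical, and the one-digit-per-contraction assumption means it cannot
handle a contraction of 10 or more primitives.

```python
def describe(splits):
    """Turn a split such as '6311,311,1' into primitive and contraction labels.

    The comma-separated patterns are taken in the order s, p, d, f, g.
    Each digit is the number of primitives in one contraction.
    """
    prims, contr = [], []
    for shell, pattern in zip("spdfg", splits.split(",")):
        prims.append(f"{sum(int(c) for c in pattern)}{shell}")
        contr.append(f"{len(pattern)}{shell}")
    return ",".join(prims), ",".join(contr)

def summary(heavy, hydrogen):
    """Format as (heavy/hydrogen primitives) -> [heavy/hydrogen contractions]."""
    heavy_prims, heavy_contr = describe(heavy)
    h_prims, h_contr = describe(hydrogen)
    return f"({heavy_prims}/{h_prims}) -> [{heavy_contr}/{h_contr}]"

# 6-31+G*: heavy atoms (6311,311,1), hydrogens (31)
print(summary("6311,311,1", "31"))   # (11s,5p,1d/4s) -> [4s,3p,1d/2s]
```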
 
To calculate the total number of primitives/basis functions in your molecule,
sum up the number of primitives/basis functions for each atom in it.
As an example, let us compute the number of functions for the H2SO3 molecule
with the 6-311++G(3df,2p) basis set, assuming the use of the Gaussian90
program. In this case the reduced set of d and f gaussians is used, i.e.,
5 d-type functions and 7 f-type functions. It corresponds to the following
contractions:
 
S: (6311111,421111,111,1)   (for sulphur, Gaussian90 defaults to McLean-
                             Chandler basis set (631111,42111) for sulphur
                             anion which is augmented with one diffuse s and
                             one diffuse p function, and three d and one
                             f polarization functions)
 
O: (63111,3111,111,1)       (this is 6-311G for oxygen augmented with one s-
                             and one p-type diffuse function, and three d and
                             one f polarization function)
 
H: (3111,11)                (this is 6-311G augmented with one diffuse s and
                             two p-functions for polarization)
 
Number of basis functions:
- - - - - - - - - - - - -
S: 7 s-type functions, 6*3 p-type functions, 3*5 d-type functions and
   1*7 f-type functions
 
O: 5 s-type functions, 4*3 p-type functions, 3*5 d-type functions and
   1*7 f-type functions
 
H: 4 functions of type s and 2*3 functions of type p (there are 3 p functions
   for each p type contraction, i.e. p_x, p_y, p_z)
 
H2SO3 = (4 + 2*3)*2 + (7 + 6*3 + 3*5 + 1*7) + (5 + 4*3 + 3*5 + 1*7)*3 = 184
 
Total number of gaussian primitives:
- - - - - - - - - - - - - - - - - -
S: 1*(6+3+1+1+1+1+1) + 3*(4+2+1+1+1+1) + 5*(1+1+1) + 7*1 = 66
 
O: 1*(6+3+1+1+1) + 3*(3+1+1+1) + 5*(1+1+1) + 7*1 = 52
 
H: 1*(3+1+1+1) + 3*(1+1) = 12
 
H2SO3 = 2*12 + 66 + 3*52 = 246 primitives.
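
The bookkeeping above is easy to automate. The Python sketch below repeats it
for the same contractions, assuming 5 d-type and 7 f-type components as in the
text. The names COMPONENTS, BASIS, count() and molecule_totals() are invented
for this illustration.

```python
# Components per shell: pure d (5) and pure f (7), as assumed in the text.
COMPONENTS = {"s": 1, "p": 3, "d": 5, "f": 7}

# Contraction patterns per atom for 6-311++G(3df,2p), copied from the text.
BASIS = {
    "S": "6311111,421111,111,1",
    "O": "63111,3111,111,1",
    "H": "3111,11",
}

def count(splits):
    """Return (basis functions, primitives) for one atom.

    Each digit of a pattern is the number of primitives in one contraction;
    patterns are given in the order s, p, d, f.
    """
    n_basis = n_prim = 0
    for shell, pattern in zip("spdf", splits.split(",")):
        n_basis += COMPONENTS[shell] * len(pattern)
        n_prim += COMPONENTS[shell] * sum(int(c) for c in pattern)
    return n_basis, n_prim

def molecule_totals(atoms):
    """atoms maps element symbol -> number of such atoms in the molecule."""
    n_basis = sum(n * count(BASIS[el])[0] for el, n in atoms.items())
    n_prim = sum(n * count(BASIS[el])[1] for el, n in atoms.items())
    return n_basis, n_prim

print(molecule_totals({"H": 2, "S": 1, "O": 3}))   # (184, 246)
```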
 
 
GENERAL CONTRACTIONS. TERMS AND NOTATION
========================================
 
Raffenetti (1973) introduced the term "general contraction" for basis sets in
which the same gaussian primitives can appear in several basis functions.
In the general contraction scheme, the basis functions are formed as different
linear combinations of the same primitives. This is clearly in contrast with
the segmented scheme described above. Please do not confuse general
contractions with the term "general basis set" used in some program manuals to
denote "user defined segmented basis sets".
 
General contractions have many advantages from the theoretical point
of view. The most important is that they can be chosen to approximate
true atomic orbitals, which makes interpretation of the coefficients in
molecular orbitals meaningful. Their performance in correlated calculations
is also praised (Almlof and Taylor, 1987; Almlof et al., 1988;
Dunning, 1989). Secondly, they can be chosen in a more standard way than
segmented contractions, either as true atomic orbitals obtained from Hartree-
Fock calculations for the atom with uncontracted primitives as basis functions,
or as Atomic Natural Orbitals (ANO). For a description of ANO's consult papers
by Almlof and coworkers or read the appropriate chapter in Szabo and Ostlund
(1989).
The only problem with general contractions is that only a few programs support
them. The code for the integral package is much more complicated in this case,
since it has to work on a block of integrals at a time in order to compute the
contribution from a given primitive set only once. Of course, you can always
enter general contractions as "user defined segmented basis sets" by repeating
the same primitives over and over again in different contractions. This will,
however, cost you immensely in computer time at the integral computation
stage. Remember, the time required for calculating integrals is proportional
to the 4th power of the number of gaussian primitives, and most programs
assume that primitives entering different contractions are different.
 
As an example, the general contraction of the (8s,4p) set of primitives for
oxygen by Huzinaga et al., 1971 (taken from: Raffenetti, 1973) is shown below.
 
Exponents      |------------ coefficients --------------------------|
 
s-exponents          1s             2s             s'             s"
5.18664(+3)     1.95900(-3)    4.49000(-4)    0.00000       0.00000
7.77805(+2)     1.50290(-2)    3.38100(-3)    0.00000       0.00000
1.76161(+2)     7.38340(-2)    1.76630(-2)    0.00000       0.00000
4.93608(+1)     2.47316(-1)    6.05540(-2)    0.00000       0.00000
1.58205(+1)     4.73314(-1)    1.59948(-1)    0.00000       0.00000
5.51493         3.27039(-1)    1.46197(-1)    0.00000       0.00000
1.03159         1.93420(-2)   -5.46581(-1)    0.00000       1.00000
3.06844(-1)    -3.57900(-3)   -5.84553(-1)    1.00000       0.00000
 
p-exponents          2p             p'             p"
1.78462(+1)     4.25100(-2)    0.00000        0.00000
3.88748         2.26972(-1)    0.00000        0.00000
1.05481         5.07788(-1)    0.00000        1.00000
2.77222(-1)     4.63550(-1)    1.00000        0.00000
 
In the table above, integer numbers in parentheses denote powers of 10
multiplying the number in front of them.
 
The set above can be described as a (8s,4p) -> [4s,3p] contraction. Clearly,
the notation giving the number of primitives in each contraction as (abcd...)
is not really useful here. This is especially true of newer sets implementing
general contractions, where each primitive has a nonzero coefficient
in practically every column.
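
To get a feeling for what entering this set as a "user defined segmented
basis set" would cost, one can count how many primitives a program would see
if every nonzero coefficient in the table were treated as a separate
primitive. The Python sketch below does this for the oxygen set above. The
4th-power cost ratio it prints is only the crude scaling estimate quoted
earlier, not a timing, and the variable and function names are invented here.

```python
# Contraction coefficients from the table above, one row per primitive.
# Columns: 1s, 2s, s', s"
S_COEFFS = [
    [1.95900e-3,  4.49000e-4, 0.0, 0.0],
    [1.50290e-2,  3.38100e-3, 0.0, 0.0],
    [7.38340e-2,  1.76630e-2, 0.0, 0.0],
    [2.47316e-1,  6.05540e-2, 0.0, 0.0],
    [4.73314e-1,  1.59948e-1, 0.0, 0.0],
    [3.27039e-1,  1.46197e-1, 0.0, 0.0],
    [1.93420e-2, -5.46581e-1, 0.0, 1.0],
    [-3.57900e-3, -5.84553e-1, 1.0, 0.0],
]
# Columns: 2p, p', p"
P_COEFFS = [
    [4.25100e-2, 0.0, 0.0],
    [2.26972e-1, 0.0, 0.0],
    [5.07788e-1, 0.0, 1.0],
    [4.63550e-1, 1.0, 0.0],
]

def primitive_counts(coeffs):
    """Return (unique primitives, primitives if each nonzero entry is repeated)."""
    unique = len(coeffs)
    repeated = sum(1 for row in coeffs for c in row if c != 0.0)
    return unique, repeated

s_unique, s_repeated = primitive_counts(S_COEFFS)
p_unique, p_repeated = primitive_counts(P_COEFFS)

# Each p primitive stands for three functions (p_x, p_y, p_z).
n_unique = s_unique + 3 * p_unique
n_repeated = s_repeated + 3 * p_repeated

# Rough cost ratio, assuming integral time scales as N**4 in primitives.
ratio = (n_repeated / n_unique) ** 4
print(n_unique, n_repeated, round(ratio, 1))
```

Even for this small set, in which only the 1s and 2s columns share primitives,
the repeated form nearly doubles the primitive count for a single oxygen atom.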
 
 
 
EFFECTIVE CORE POTENTIALS (EFFECTIVE POTENTIALS)
================================================
 
It has been known for a long time that core (inner) orbitals are in most cases
not affected significantly by changes in chemical bonding. This prompted
the development of Effective Core Potential (ECP) or Effective Potential (EP)
approaches, which allow treatment of inner shell electrons as if they were some
averaged potential rather than actual particles. ECP's are not orbitals but
modifications to the hamiltonian, and as such are very efficient
computationally. Also, it is very easy to incorporate relativistic effects
into an ECP, while all-electron relativistic computations are very expensive.
Relativistic effects are very important in describing heavier atoms, and
luckily ECP's simplify calculations and at the same time make them more
accurate with popular non-relativistic ab initio packages (provided that such
packages have support for ECP's). The core potentials can only be specified
for shells that are filled. For the rest of the electrons (i.e. the valence
electrons), you have to provide basis functions. These are special basis sets
optimized for use with specific ECP's, and they are usually listed in the
original papers together with the corresponding ECP's. Some examples of papers
describing
ECP's: (Durand and Barthelat, 1975), (Hay and Wadt, 1985ab), (Hurley et al.,
1986), (Pacios and Christiansen, 1985), (Stevens et al., 1984), (Wadt and Hay,
1985), (Wallace et al., 1991). The ECP's are tabulated in the literature as
parameters of the following expansion:
 
     ECP(r) = sum (i=1 to M) { d_i*r^(n_i)*exp[-zeta_i*r^2] }
 
  where M is the number of terms in the expansion, d_i is a coefficient for
  each term, r denotes distance from nucleus, n_i is a power of r for the i-th
  term, and zeta_i represents the exponent for the i-th term.
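
  Evaluating this expansion numerically is straightforward. The Python sketch
  below is a direct transcription of the formula. The parameter values in TERMS
  are made up purely for illustration and are not a published ECP; the
  function name ecp() is likewise hypothetical.

```python
import math

def ecp(r, terms):
    """Evaluate sum_i d_i * r**n_i * exp(-zeta_i * r**2).

    terms is an iterable of (d_i, n_i, zeta_i) tuples, one per term
    of the expansion; r is the distance from the nucleus.
    """
    return sum(d * r**n * math.exp(-zeta * r * r) for d, n, zeta in terms)

# Made-up parameters (d_i, n_i, zeta_i), for illustration only.
TERMS = [(1.0, 0, 0.5), (-2.0, 2, 1.0)]

for r in (0.5, 1.0, 2.0):
    print(r, ecp(r, TERMS))
```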
 
  To specify an ECP for a given atomic center, you typically need to include:
the number of core electrons that are replaced by the ECP, the largest
angular momentum quantum number included in the potential (e.g., 1 for s only,
2 for s and p, 3 for s, p, and d; etc.), and the number of terms in the
"polynomial gaussian expansion" shown above. For each term in this expansion
you need to specify: the coefficient (d_i), the power of r (n_i) and the
exponent in the gaussian function (zeta_i). You also need to enter the basis
set for valence electrons specific to this potential. As a result of applying
ECP's you drastically reduce the number of basis functions needed, since only
functions for valence electrons are required. In many cases, it would simply
be impossible to perform some calculations on systems involving heavier
elements without ECP's (try to calculate the number of functions in a TZ2P
basis set for e.g. U, and you will know why).

 
---