Basis sets intro. Part 3/3



 Polarization and diffuse functions
 ----------------------------------
 The original contractions derived from atomic Hartree-Fock calculations
 are frequently augmented with other functions. The most popular are the
 polarization and diffuse functions. Polarization functions are simply
 functions with higher values of L than those present in the occupied
 atomic orbitals of the corresponding atom. At least for me, there is some
 ambiguity here: for lithium, the p-type functions are not considered
 polarization functions, while for sulphur, the d-functions are. In both
 cases these orbitals are not populated in the ground electronic state of
 the atom. The reason for including p-type functions for the Li and Be
 atoms, even in minimal basis sets, is a practical one, however: without
 these functions, the results are extremely poor. The reason for not
 including d-type functions for sulphur should be the same as for other
 atoms, i.e., you can obtain reasonable results without them. I wish I
 could believe that.
 The exponents for polarization functions cannot be derived from Hartree-Fock
 calculations for the atom, since these functions are not populated there.
 However, they can be estimated from correlated calculations on atoms. In
 practice, these exponents are estimated "using well established rules of
 thumb or by explicit optimization" (Dunning, 1989).
 Polarization functions are important for reproducing chemical bonding, and
 their exponents were frequently derived by optimization for a set of
 molecules. They should also be included in all correlated calculations. They
 are usually added as uncontracted gaussians. It is important to remember that
 adding them is costly. Each set of d-type polarization functions adds 5
 (or 6) basis functions on each atom, while each set of f-type functions adds
 7 (or 10, if spurious combinations are not removed). This brings us to the
 problem of specifying the number of d, f, g, etc. polarization functions in
 some compact notation. Unfortunately, there is no provision for this
 information in the notations described above. Pople's 6-31G* basis uses 6
 d-type functions as polarization functions, while 6-311G* uses 5 of them.
 The () notation is no better. If the paper does not say explicitly how many
 d or f functions are used, you are on your own. The only way to find out is
 to repeat the calculations or contact the author. Many papers do not specify
 this important information.
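 The 5 (or 6) and 7 (or 10) counts quoted above follow from two standard
 formulas: a pure (spherical) shell of angular momentum L has 2L+1 components,
 while a Cartesian shell has (L+1)(L+2)/2. The short sketch below only
 illustrates that arithmetic; the function names are made up for this example.

```python
def n_pure(l: int) -> int:
    """Number of pure (spherical harmonic) components of a shell: 2l + 1."""
    return 2 * l + 1


def n_cartesian(l: int) -> int:
    """Number of Cartesian gaussians x^a y^b z^c with a + b + c = l."""
    return (l + 1) * (l + 2) // 2


if __name__ == "__main__":
    # Tabulate both counts for s, p, d, f and g shells.
    for label, l in zip("spdfg", range(5)):
        print(f"{label}: pure = {n_pure(l):2d}, cartesian = {n_cartesian(l):2d}")
```

 For d shells this gives 5 versus 6, and for f shells 7 versus 10, which is the
 difference a paper leaves open when it does not say which form was used.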
 Pople's group introduced yet another, more general notation to encode the
 type of polarization functions. The easiest way to explain it is by example.
 The 6-31G** is synonymous with 6-31G(d,p); the 6-311G(3d2f,2p) represents the
 6-311G set augmented with 3 functions of type d and 2 functions of type f
 on heavy atoms, and 2 functions of type p on hydrogens, or specifically
 (6311,311,111,11)/(311,11), i.e. a (11s,5p,3d,2f/5s,2p) -> [4s,3p,3d,2f/3s,2p]
 contraction. Each d-type polarization set added to 6-31G consists of 6
 functions, while each one added to 6-311G consists of only 5. For both the
 6-31G and 6-311G sets, f-type functions are added in groups of 7.
 Polarization functions are, as a rule, used uncontracted.
 More information can be found in the following papers: (Dunning, 1989),
 (Francl et al., 1982), (Gutowski et al., 1987), (Jankowski, 1985), (Krishnan
 et al., 1980).
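 The polarization part of this notation is regular enough to be decoded
 mechanically. The sketch below is only an illustration of the convention
 described above (heavy atoms before the comma, hydrogens after it, a missing
 count meaning one function); the function names are hypothetical and do not
 come from any program. It also assumes the usual reading of 6-31G* as 6-31G(d).

```python
import re


def parse_polarization(spec: str) -> dict:
    """Turn a string such as '3d2f' into {'d': 3, 'f': 2}.

    A shell letter without a leading count stands for one function,
    so 'df' gives {'d': 1, 'f': 1}.
    """
    counts = {}
    for number, shell in re.findall(r"(\d*)([a-z])", spec.strip().lower()):
        counts[shell] = counts.get(shell, 0) + (int(number) if number else 1)
    return counts


def parse_suffix(suffix: str) -> tuple:
    """Split '(3d2f,2p)' into (heavy-atom, hydrogen) polarization sets.

    The star forms are expanded first: '*' -> '(d)', '**' -> '(d,p)'.
    """
    stars = {"*": "(d)", "**": "(d,p)"}
    suffix = stars.get(suffix, suffix)
    heavy, _, hydrogen = suffix.strip("()").partition(",")
    return parse_polarization(heavy), parse_polarization(hydrogen)


if __name__ == "__main__":
    print(parse_suffix("(3d2f,2p)"))
    print(parse_suffix("**"))
```

 Note that the result says nothing about whether 5 or 6 d functions are meant;
 that information has to come from the parent basis set or from the paper.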
 Basis sets are also frequently augmented with the so-called diffuse
 functions. The name says it all: these gaussians have very small exponents
 and decay slowly with distance from the nucleus. Diffuse gaussians are
 usually of s and p type; however, diffuse polarization functions are
 sometimes used as well. Diffuse functions are necessary for a correct
 description of anions and weak bonds (e.g. hydrogen bonds) and are frequently
 used in calculations of properties (e.g. dipole moments, polarizabilities,
 etc.). For Pople's basis sets the following notation is used: n-ij+G, or
 n-ijk+G, when 1 diffuse s-type and 1 diffuse p-type gaussian with the same
 exponent are added to a standard basis set on heavy atoms. The n-ij++G, or
 n-ijk++G, sets are obtained by adding 1 diffuse s-type and p-type gaussian on
 heavy atoms and 1 diffuse s-type gaussian on hydrogens. For example, the
 6-31+G* represents (6311,311,1)/(31) or (11s,5p,1d/4s) -> [4s,3p,1d/2s]. The
 6-311+G(2d1f,2p1d) stands for the (63111,3111,11,1)/(311,11,1) split, or
 (12s,6p,2d,1f/5s,2p,1d) -> [5s,4p,2d,1f/3s,2p,1d].
 For more information about diffuse functions see, for example (Clark et al.,
 1983), (Del Bene, 1989), and (Frisch et al., 1984).
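 Converting a split such as (6311,311,1) into the (primitives) -> [contracted]
 form is simple bookkeeping: within each angular momentum group, the digits sum
 to the number of primitives, and the number of digits is the number of
 contracted functions. A minimal sketch follows; the helper name is made up for
 this example, and it assumes every contraction has fewer than 10 primitives.

```python
SHELLS = "spdfgh"


def summarize(split: str) -> tuple:
    """Convert a split such as '(6311,311,1)' into its two summary labels.

    Each comma-separated group belongs to one angular momentum (s, p, d, ...),
    and each digit in a group is the number of primitives in one contraction.
    Only single-digit contraction lengths are handled.
    """
    groups = split.strip("()").split(",")
    primitive = [f"{sum(int(ch) for ch in g)}{SHELLS[i]}" for i, g in enumerate(groups)]
    contracted = [f"{len(g)}{SHELLS[i]}" for i, g in enumerate(groups)]
    return "(" + ",".join(primitive) + ")", "[" + ",".join(contracted) + "]"


if __name__ == "__main__":
    # Heavy-atom and hydrogen parts of 6-31+G*.
    print(summarize("(6311,311,1)"), summarize("(31)"))
    # Heavy-atom and hydrogen parts of 6-311+G(2d1f,2p1d).
    print(summarize("(63111,3111,11,1)"), summarize("(311,11,1)"))
```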
 To calculate the total number of primitives/basis functions in your molecule,
 sum up the number of primitives/basis functions for each atom in it.
 As an example, let us compute the number of functions for the H2SO3 molecule
 with the 6-311++G(3df,2p) basis set, assuming the use of the Gaussian90
 program. In this case the reduced set of d and f gaussians is used, i.e.,
 5 d-type functions and 7 f-type functions. The basis set corresponds to the
 following contractions:
 S: (6311111,421111,111,1)   (for sulphur, Gaussian90 defaults to the McLean-
                              Chandler (631111,42111) basis set for the sulphur
                              anion, which is augmented with one diffuse s and
                              one diffuse p function, and three d and one
                              f polarization functions)
 O: (63111,3111,111,1)       (this is 6-311G for oxygen augmented with one s-
                              and one p-type diffuse function, and three d and
                              one f polarization functions)
 H: (3111,11)                (this is 6-311G augmented with one diffuse s and
                              two p-functions for polarization)
 Number of basis functions:
 - - - - - - - - - - - - -
 S: 7 s-type functions, 6*3 p-type functions, 3*5 d-type functions and
    1*7 f-type functions
 O: 5 s-type functions, 4*3 p-type functions, 3*5 d-type functions and
    1*7 f-type functions
 H: 4 functions of type s and 2*3 functions of type p (there are 3 p functions
    for each p type contraction, i.e. p_x, p_y, p_z)
 H2SO3 = (4 + 2*3)*2 + (7 + 6*3 + 3*5 + 1*7) + (5 + 4*3 + 3*5 + 1*7)*3 = 184
 Total number of gaussian primitives:
 - - - - - - - - - - - - - - - - - -
 S: 1*(6+3+1+1+1+1+1) + 3*(4+2+1+1+1+1) + 5*(1+1+1) + 7*1 = 66
 O: 1*(6+3+1+1+1) + 3*(3+1+1+1) + 5*(1+1+1) + 7*1 = 52
 H: 1*(3+1+1+1) + 3*(1+1) = 12
 H2SO3 = 2*12 + 66 + 3*52 = 246 primitives.
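 The same bookkeeping can be scripted, which is handy for larger molecules.
 The sketch below simply re-does the H2SO3 count above from the contraction
 patterns listed for S, O and H, assuming 5 d and 7 f functions per shell;
 the dictionary layout and function names are made up for this example.

```python
# Components per shell type: the reduced (pure) sets, 5 d and 7 f functions.
PURE = {"s": 1, "p": 3, "d": 5, "f": 7}

# Contraction patterns from the text: one string per shell type, one digit
# per contraction giving the number of primitives in it.
BASIS = {
    "S": {"s": "6311111", "p": "421111", "d": "111", "f": "1"},
    "O": {"s": "63111", "p": "3111", "d": "111", "f": "1"},
    "H": {"s": "3111", "p": "11"},
}

MOLECULE = {"H": 2, "S": 1, "O": 3}  # H2SO3


def basis_functions(atom: str) -> int:
    """Contracted basis functions on one atom."""
    return sum(PURE[shell] * len(pattern) for shell, pattern in BASIS[atom].items())


def primitives(atom: str) -> int:
    """Gaussian primitives on one atom."""
    return sum(
        PURE[shell] * sum(int(digit) for digit in pattern)
        for shell, pattern in BASIS[atom].items()
    )


def total(count) -> int:
    """Sum a per-atom count over all atoms of the molecule."""
    return sum(n * count(atom) for atom, n in MOLECULE.items())


if __name__ == "__main__":
    print(total(basis_functions))  # 184
    print(total(primitives))       # 246
```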

 GENERAL CONTRACTIONS. TERMS AND NOTATION
 ========================================
 Raffenetti (1973) introduced the term "general contraction" for basis sets
 in which the same gaussian primitives can appear in several basis functions.
 In the general contraction scheme, the basis functions are formed as
 different linear combinations of the same primitives. This is clearly in
 contrast with the segmented scheme described above. Please do not confuse
 general contractions with the term "general basis set" used in some program
 manuals to denote "user defined segmented basis sets".
 General contractions have many advantages from the theoretical point
 of view. The most important is that they can be chosen to approximate
 true atomic orbitals, which makes the interpretation of coefficients in
 molecular orbitals meaningful. Their performance in correlated calculations
 is also praised (Almlof and Taylor, 1987; Almlof et al., 1988;
 Dunning, 1989). Secondly, they can be chosen in a more standard way than
 segmented contractions, either as true atomic orbitals obtained from
 Hartree-Fock calculations for the atom with uncontracted primitives as basis
 functions, or as Atomic Natural Orbitals (ANO). For a description of ANO's
 consult the papers by Almlof and coworkers, or read the appropriate chapter
 in Szabo and Ostlund (1989).
 The only problem with general contractions is that only a few programs support
 them. The code for the integral package is much more complicated in this
 case, since it has to work on a block of integrals at a time in order to
 compute the contribution from a given set of primitives only once. Of course,
 you can always enter general contractions as "user defined segmented basis
 sets" by repeating the same primitives over and over again in different
 contractions. This will, however, cost you immensely in computer time at the
 integral computation stage. Remember, the time required for calculating
 integrals is proportional to the 4th power of the number of gaussian
 primitives, and most programs assume that primitives entering different
 contractions are different.
 As an example, here are the general contractions of the (8s4p) set of
 primitives for oxygen by Huzinaga et al., 1971 (taken from: Raffenetti, 1973).
 Exponents      |------------------- coefficients -------------------|
 s-exponents         1s             2s             s'             s"
 5.18664(+3)     1.95900(-3)    4.49000(-4)    0.00000        0.00000
 7.77805(+2)     1.50290(-2)    3.38100(-3)    0.00000        0.00000
 1.76161(+2)     7.38340(-2)    1.76630(-2)    0.00000        0.00000
 4.93608(+1)     2.47316(-1)    6.05540(-2)    0.00000        0.00000
 1.58205(+1)     4.73314(-1)    1.59948(-1)    0.00000        0.00000
 5.51493         3.27039(-1)    1.46197(-1)    0.00000        0.00000
 1.03159         1.93420(-2)   -5.46581(-1)    0.00000        1.00000
 3.06844(-1)    -3.57900(-3)   -5.84553(-1)    1.00000        0.00000
 p-exponents         2p             p'             p"
 1.78462(+1)     4.25100(-2)    0.00000        0.00000
 3.88748         2.26972(-1)    0.00000        0.00000
 1.05481         5.07788(-1)    0.00000        1.00000
 2.77222(-1)     4.63550(-1)    1.00000        0.00000
 In the table above, integers in parentheses denote the powers of 10
 multiplying the numbers in front of them.
 The set above can be described as an (8s,4p) -> [4s,3p] contraction. Clearly,
 the notation giving the number of primitives in each contraction as (abcd...)
 is not really useful here. This is especially true of newer sets implementing
 general contractions, where each primitive has a nonzero coefficient
 in practically every column.
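 To get a feeling for what repeating primitives costs, the sketch below
 stores the oxygen table above as coefficient matrices and counts the
 primitives a program needs when contractions share them, versus when every
 contraction carries its own copies (as in a "user defined segmented basis
 set"). The cost ratio takes the 4th-power scaling quoted earlier at face
 value, so treat it as a rough estimate only; all names are made up for this
 example.

```python
# Contraction coefficients copied from the table above; rows are primitives,
# columns are the contracted functions (1s, 2s, s', s") and (2p, p', p").
S_COEFF = [
    [1.95900e-3, 4.49000e-4, 0.0, 0.0],
    [1.50290e-2, 3.38100e-3, 0.0, 0.0],
    [7.38340e-2, 1.76630e-2, 0.0, 0.0],
    [2.47316e-1, 6.05540e-2, 0.0, 0.0],
    [4.73314e-1, 1.59948e-1, 0.0, 0.0],
    [3.27039e-1, 1.46197e-1, 0.0, 0.0],
    [1.93420e-2, -5.46581e-1, 0.0, 1.0],
    [-3.57900e-3, -5.84553e-1, 1.0, 0.0],
]
P_COEFF = [
    [4.25100e-2, 0.0, 0.0],
    [2.26972e-1, 0.0, 0.0],
    [5.07788e-1, 0.0, 1.0],
    [4.63550e-1, 1.0, 0.0],
]


def shared(coeff) -> int:
    """Primitive shells needed when contractions share primitives."""
    return len(coeff)


def duplicated(coeff) -> int:
    """Primitive shells needed when every contraction gets its own copies."""
    return sum(1 for row in coeff for c in row if c != 0.0)


def primitive_functions(n_s: int, n_p: int) -> int:
    """Count primitive functions: 1 per s shell, 3 per p shell."""
    return n_s + 3 * n_p


n_general = primitive_functions(shared(S_COEFF), shared(P_COEFF))
n_segmented = primitive_functions(duplicated(S_COEFF), duplicated(P_COEFF))
cost_ratio = (n_segmented / n_general) ** 4  # crude N^4 estimate

if __name__ == "__main__":
    print(n_general, n_segmented)  # 20 36
    print(f"estimated integral cost ratio: {cost_ratio:.1f}")
```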

 EFFECTIVE CORE POTENTIALS (EFFECTIVE POTENTIALS)
 ================================================
 It has been known for a long time that core (inner) orbitals are in most cases
 not affected significantly by changes in chemical bonding. This prompted
 the development of the Effective Core Potential (ECP) or Effective Potential
 (EP) approaches, which allow inner shell electrons to be treated as if they
 were some averaged potential rather than actual particles. ECP's are not
 orbitals but modifications to the hamiltonian, and as such are very efficient
 computationally. Also, it is very easy to incorporate relativistic effects
 into an ECP, while all-electron relativistic computations are very expensive.
 Relativistic effects are very important in describing heavier atoms, and
 luckily ECP's simplify calculations and at the same time make them more
 accurate with popular non-relativistic ab initio packages (provided that such
 packages support ECP's). Core potentials can only be specified for shells
 that are filled. For the rest of the electrons (i.e. valence electrons), you
 have to provide basis functions. These are special basis sets optimized for
 use with specific ECP's, and they are usually listed in the original
 papers together with the corresponding ECP's. Some examples of papers
 describing ECP's: (Durand and Bartelat, 1975), (Hay and Wadt, 1985ab),
 (Hurley et al., 1986), (Pacios and Christensen, 1985), (Stevens et al., 1984),
 (Wadt and Hay, 1985), (Walace et al., 1991). ECP's are tabulated in the
 literature as parameters of the following expansion:
      ECP(r) = sum (i=1 to M) { d_i*r^(n_i)*exp[-zeta_i*r^2] }
   where M is the number of terms in the expansion, d_i is a coefficient for
   each term, r denotes distance from nucleus, n_i is a power of r for the i-th
   term, and zeta_i represents the exponent for the i-th term.
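 As a minimal sketch, the expansion above can be evaluated directly from a
 list of (d_i, n_i, zeta_i) triples. The parameters below are invented purely
 for illustration and do not correspond to any published ECP; the function
 name is likewise made up. Terms with negative n_i cannot be evaluated at
 r = 0 with this simple version.

```python
import math


def ecp_value(r: float, terms) -> float:
    """Evaluate sum_i d_i * r**n_i * exp(-zeta_i * r**2) at distance r.

    terms is a list of (d_i, n_i, zeta_i) tuples, one per expansion term.
    """
    return sum(d * r**n * math.exp(-zeta * r * r) for d, n, zeta in terms)


# Made-up parameters, for illustration only (NOT a real potential).
EXAMPLE_TERMS = [
    (-10.0, 1, 5.0),
    (2.5, 2, 1.2),
]

if __name__ == "__main__":
    for r in (0.5, 1.0, 2.0):
        print(f"r = {r:.1f}  ECP(r) = {ecp_value(r, EXAMPLE_TERMS): .6f}")
```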
 To specify an ECP for a given atomic center, you typically need to include:
 the number of core electrons that are replaced by the ECP, the largest
 angular momentum quantum number included in the potential (e.g., 1 for s only,
 2 for s and p, 3 for s, p, and d; etc.), and the number of terms in the
 "polynomial gaussian expansion" shown above. For each term in this expansion
 you need to specify: the coefficient (d_i), the power of r (n_i) and the
 exponent in the gaussian function (zeta_i). You also need to enter the basis
 set for valence electrons specific to this potential. As a result of applying
 ECP's you drastically reduce the number of basis functions needed, since only
 functions for valence electrons are required. In many cases, it would simply
 be impossible to perform some calculations on systems involving heavier
 elements without ECP's (try to calculate the number of functions in a TZ2P
 basis set for, e.g., U, and you will know why).
 ---