From jkl at ccl.net Mon Mar 4 18:26:56 1991
Date: Mon, 04 Mar 91 17:52:37 EST
From: jkl at ccl.net
Subject: Basis sets intro. Part 1/3
To: chemistry at ccl.net
Status: RO

=============================================================================
|                                                                           |
| SIMPLIFIED INTRODUCTION TO AB INITIO BASIS SETS. TERMS AND NOTATION.      |
|                                                                           |
| Jan K. Labanowski, Ohio Supercomputer Center, 1224 Kinnear Rd., Columbus, |
| OH 43212-1163, USA. E-mail: jkl at ccl.net, JKL at OHSTPY.BITNET         |
|                                                                           |
| Permission is granted to do whatever you wish with this.                  |
|                                                                           |
=============================================================================

WHY ALL THIS?
=============

This is a general introduction to basis sets. It is not yet a sulphur basis
set summary; I will post the sulphur summary soon. Many thanks to all who
responded to my question about basis sets for sulphur correlated
calculations. Here is a list of people who helped me (directly or
indirectly) with sulphur sets: Raymond Bair, John Bloor, Thom Dunning, Bob
Eades, Dave Ewing, Dave Feller, Mike Frisch, Mark Gordon, Scott Lamson, Russ
Pitzer, Bob Zellmer.

Many people on this list know infinitely more about basis sets than I do.
There are, however, a few reasons I decided to write this summary:

1. From the perspective of "list owner," I see many people on the list for
   whom ab initio might be a useful tool. These people, however, might have
   problems following the original literature on the topic. I tried to
   compile a FAQ (network lingo for "Frequently Asked Questions") from the
   questions asked by Ohio Supercomputer Center users who want to use
   quantum chemistry as a tool but wish they did not have to go through the
   difficult literature.
2. I want to change the ratio of "scientific/other" topics on the list.
3. Last, but not least, I would appreciate your comments and corrections,
   since I also teach this subject to graduate students at The Ohio State
   Univ. College of Pharmacy.
   Your criticism will help me prepare a better handout. I hope that even
   those of you who know infinitely more about it than I do will benefit by
   using some of my examples in the courses which you obviously teach.

Sometimes I will have to use subscripts and superscripts. I will use a
"pseudo" TeX convention, i.e. ^ for superscript and _ for subscript. E.g.
2^3 = 8 and a_1 means a sub 1. To simplify the notation even further, I will
use exp to denote the exponential function, i.e. e^x = exp[x].

I will not even touch the topic of Floating Orbitals and Lobe Functions,
since they are not used outside the narrow group of quantum chemistry
professionals, though I think they are important.

REFERENCES (since this summary comes to you in several pieces, I've decided to
========== put references on the top).

R. Ahlrichs, P.R. Taylor, 1981, "The choice of gaussian basis sets for
molecular electronic structure calculations," J. Chim. Phys. 78, 315-324.

J. Almlof, P.R. Taylor, 1987, "General contraction of Gaussian basis sets.
I. Atomic natural orbitals for first- and second-row atoms," J. Chem. Phys.
86, 4070-4077.

J. Almlof, T. Helgaker, P.R. Taylor, 1988, "Gaussian basis sets for
high-quality ab initio calculations," J. Phys. Chem. 92, 3029-3033.

J. Andzelm, M. Klobukowski, E. Radzio-Andzelm, Y. Sakai, H. Tatewaki, 1984,
"Gaussian Basis Sets for Molecular Calculations," (S. Huzinaga, Editor),
Elsevier, Amsterdam.

T. Clark, J. Chandrasekhar, G.W. Spitznagel, P.v.R. Schleyer, 1983,
"Efficient diffuse function-augmented basis sets for Anion Calculations.
III. The 3-21+G set for first-row elements, Li-F," J. Comput. Chem. 4,
294-301.

E. Clementi, S.J. Chakravorty, G. Corongiu, V. Sonnad, 1990, "Independent
Electron Models: Hartree-Fock for Many-Electron Atoms," pp. 47-140, in:
"MOTECC-90 Modern Techniques in Computational Chemistry" (E. Clementi, Ed.),
Escom, Leiden 1990, ISBN-90-72199-07-3.

J.E.
Del Bene, 1989, "An ab initio molecular orbital study of the structures and
energies of neutral and charged bimolecular complexes of NH_3 with hydrides
AH_n (A = N, O, F, P, S, and Cl)," J. Comput. Chem. 10, 603-615.

T.H. Dunning, P.J. Hay, 1977, "Gaussian basis sets for molecular
calculations," in: "Modern Theoretical Chemistry," vol. 3. Ed. H.F. Schaefer
III, pp. 1-28, Plenum Press, New York.

T.H. Dunning, Jr., 1989, "Gaussian basis sets for use in correlated
molecular calculations," J. Chem. Phys. 90, 1007-1023.

P. Durand, J.-C. Barthelat, 1975, "A theoretical method to determine atomic
pseudopotentials for electronic structure calculations of molecules and
solids," Theor. Chim. Acta (Berl.) 38, 283-302.

D. Feller, E.R. Davidson, 1986, "Basis set selection for molecular
calculations," Chemical Reviews 86, 681-696.

D. Feller, E.R. Davidson, 1990, "Basis Sets for Ab Initio Molecular Orbital
Calculations and Intermolecular Interactions," pp. 1-43, in: "Reviews in
Computational Chemistry" (Editors: Kenny B. Lipkowitz and Donald B. Boyd),
VCH, New York.

M.M. Francl, J.S. Binkley, M.S. Gordon, D.J. DeFrees, J.A. Pople, 1982,
"Self-consistent molecular orbital methods. XXIII. A polarization-type basis
set for second-row element," J. Chem. Phys. 77, 3654-3665.

M.J. Frisch, J.A. Pople, J.S. Binkley, 1984, "Self-consistent molecular
orbital methods 25. Supplementary functions for Gaussian Basis Sets,"
J. Chem. Phys. 80, 3265-3269.

M.S. Gordon, 1980, "The isomers of silacyclopropane," Chem. Phys. Lett. 76,
163-168.

M. Gutowski, F.B. Van Duijneveldt, G. Chalasinski, L. Piela, 1987, "Proper
correction for the basis set superposition error in SCF calculations of
intermolecular calculations," Mol. Phys. 61, 233-247.

P.J. Hay, W.R. Wadt, 1985a, "Ab Initio effective core potentials for
molecular calculations. Potentials for transition metal atoms Sc to Hg,"
J. Chem. Phys. 82, 271-283.

P.J. Hay, W.R. Wadt, 1985b, "Ab Initio effective core potentials for
molecular calculations.
Potentials for K to Au including outermost core orbitals," J. Chem. Phys.
82, 299-310.

W.J. Hehre, L. Radom, P.v.R. Schleyer, J.A. Pople, 1986, "Ab Initio
Molecular Orbital Theory," Wiley & Sons, New York.

M.M. Hurley, L.F. Pacios, P.A. Christiansen, R.B. Ross, W.C. Ermler, 1986,
"Ab initio relativistic effective potentials with spin-orbit operators. II.
K through Kr," J. Chem. Phys. 84, 6840-6853.

S. Huzinaga, 1965, "Gaussian-type functions for polyatomic systems,"
J. Chem. Phys. 42, 1293-1302.

S. Huzinaga, D. McWilliams, B. Domsky, 1971, "Approximate Atomic Function,"
J. Chem. Phys. 54, 2283-2284.

K. Jankowski, R. Becherer, P. Scharf, H. Schiffer, R. Ahlrichs, 1985, "The
impact of higher polarization basis functions on molecular ab initio
results," J. Chem. Phys. 82, 1413-1419.

R. Krishnan, J.S. Binkley, R. Seeger, J.A. Pople, 1980, "Self-consistent
orbital methods. XX. A basis set for correlated wave functions,"
J. Chem. Phys. 72, 650-654.

A.D. McLean, G.S. Chandler, 1980, "Contracted gaussian basis sets for
molecular calculations. I. Second row atoms, Z=11-18," J. Chem. Phys. 72,
5639-5648.

L.F. Pacios, P.A. Christiansen, 1985, "Ab initio relativistic effective
potentials with spin-orbit operators. I. Li through Ar."

R. Poirier, R. Kari, I.G. Csizmadia, 1985, "Handbook of Gaussian Basis
Sets," Elsevier Science, New York.

R.C. Raffenetti, 1973, "General contraction of Gaussian atomic orbitals:
Core, valence, polarization and diffuse basis sets; Molecular integral
evaluation," J. Chem. Phys. 58, 4452-4458.

M. Sabio, S. Topiol, 1989, "3s- versus 1s-type gaussian primitives:
Modification of the 3-21G(*) basis set for sulphur atom," J. Comput. Chem.
10, 660-672.

W.J. Stevens, H. Basch, M. Krauss, 1984, J. Chem. Phys. 81, 6026-6033.

A. Szabo, N.S. Ostlund, 1989, "Modern Quantum Chemistry: Introduction to
Advanced Electronic Structure Theory," MacMillan Publishing Co., New York.

W.R. Wadt, P.J.
Hay, 1985, "Ab Initio effective core potentials for molecular calculations.
Potentials for main group elements Na to Bi," J. Chem. Phys. 82, 284-298.

N.M. Wallace, J.P. Blaudeau, R.M. Pitzer, 1991, "Optimized Gaussian Basis
Sets for use with Relativistic Effective (core) Potentials: Li-Ar,"
Int. J. Quantum Chem. (in press).

INTRODUCTION
============

Some straightforward reviews on basis sets are available: (Ahlrichs &
Taylor, 1981), (Andzelm et al., 1984), (Dunning & Hay, 1977), (Feller &
Davidson, 1986), (Feller & Davidson, 1990), (Poirier et al., 1985).

Historically, the quantum calculations for molecules were performed as LCAO
MO, i.e. Linear Combination of Atomic Orbitals - Molecular Orbitals. This
means that molecular orbitals are formed as a linear combination of atomic
orbitals:

     psi_i = sum from u=1 to n { c_ui*phi_u }

where psi_i is the i-th molecular orbital, c_ui are the coefficients of the
linear combination, phi_u is the u-th atomic orbital, and n is the number of
atomic orbitals.

Strictly speaking, Atomic Orbitals (AO) are solutions of the Hartree-Fock
equations for the atom, i.e. wave functions for a single electron in the
atom. Anything else is not really an atomic orbital. Some things are similar
though, and there is a lot of confusion in the terminology used. Later on,
the term "atomic orbital" was replaced by "basis function" or "contraction,"
when appropriate.

Early on, the Slater Type Orbitals (STOs) were used as basis functions due
to their similarity to the atomic orbitals of the hydrogen atom. They are
described by a function depending on spherical coordinates:

     phi_i(zeta,n,l,m;r,theta,phi) = N*r^(n-1)*exp[-zeta*r]*Y_lm(theta,phi)

where N is a normalization constant and zeta is called the "exponent". The
r, theta, and phi are spherical coordinates, and Y_lm is the angular
momentum part (the function describing "shape"). The n, l, and m are quantum
numbers: principal, angular momentum, and magnetic, respectively.
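
As a minimal sketch of the STO formula above, the Python fragment below
evaluates only the radial part, N*r^(n-1)*exp[-zeta*r], and checks its
normalization numerically. The function names, the closed form
N = (2*zeta)^(n+1/2)/sqrt((2n)!) for the radial normalization, and the
assumption that Y_lm is normalized separately are my own choices for
illustration and do not come from the text.

```python
import math


def sto_radial(r, zeta, n):
    """Radial part N * r^(n-1) * exp(-zeta*r) of a Slater-type orbital.

    Assumption: N normalizes the radial part alone, i.e. the integral of
    R(r)^2 * r^2 over r from 0 to infinity equals 1.
    """
    norm = (2.0 * zeta) ** (n + 0.5) / math.sqrt(math.factorial(2 * n))
    return norm * r ** (n - 1) * math.exp(-zeta * r)


def radial_norm(zeta, n, r_max=40.0, steps=40000):
    """Trapezoid estimate of the integral of R(r)^2 * r^2 on [0, r_max]."""
    h = r_max / steps
    total = 0.0
    for k in range(steps + 1):
        r = k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * sto_radial(r, zeta, n) ** 2 * r * r
    return total * h


if __name__ == "__main__":
    # Hypothetical (zeta, n) pairs chosen only to exercise the formula.
    for zeta, n in [(1.0, 1), (1.5, 2)]:
        print(zeta, n, radial_norm(zeta, n))
```

Both printed integrals should come out very close to 1 if the radial
normalization is right.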

Unfortunately, functions of this kind are not suitable for fast calculation
of the necessary two-electron integrals. That is why the Gaussian Type
Orbitals (GTOs) were introduced. You can approximate the shape of the STO
function by summing up a number of GTOs with different exponents and
coefficients. Even if you use 4 or 5 GTOs to represent an STO, you will
still calculate your integrals much faster than if the original STOs are
used. The GTO (also called a cartesian gaussian) is expressed as:

     g(alpha,l,m,n;x,y,z) = N*exp[-alpha*r^2]*x^l*y^m*z^n

where N is a normalization constant and alpha is called the "exponent". The
x, y, and z are cartesian coordinates. The l, m, and n ARE NOT QUANTUM
NUMBERS but simply integral exponents at cartesian coordinates.
r^2 = x^2 + y^2 + z^2.

Calling gaussians GTOs is probably a misnomer, since they are not really
orbitals. They are simpler functions. In recent literature, they are
frequently called gaussian primitives. The main differences from STOs are
that r^(n-1), the preexponential factor, is dropped, the r in the
exponential function is squared, and the angular momentum part is a simple
function of cartesian coordinates. The absence of the r^(n-1) factor
restricts gaussians to approximating only 1s, 2p, 3d, 4f ... orbitals. It
was done for practical reasons, namely, for fast integral calculations. The
following gaussian functions are possible:

     1s     = N exp[-alpha*r^2]
     2p_x   = N exp[-alpha*r^2]*x
     2p_y   = N exp[-alpha*r^2]*y
     2p_z   = N exp[-alpha*r^2]*z
     3d_xx  = N exp[-alpha*r^2]*x^2
     3d_xy  = N exp[-alpha*r^2]*x*y
     3d_xz  = N exp[-alpha*r^2]*x*z
     3d_yy  = N exp[-alpha*r^2]*y^2
     3d_yz  = N exp[-alpha*r^2]*y*z
     3d_zz  = N exp[-alpha*r^2]*z^2
     4f_xxx = N exp[-alpha*r^2]*x^3
     4f_xxy = N exp[-alpha*r^2]*x^2*y
     4f_xyz = N exp[-alpha*r^2]*x*y*z
     etc.

Sometimes, the so-called scale factor, f, is used to scale all exponents in
the related gaussians.
In this case, the gaussian function is written as:

     g(alpha,l,m,n,f;x,y,z) = N exp[-alpha*f^2*r^2]*x^l*y^m*z^n

Be careful not to confuse it with "f" for the f-orbital.

The sum of exponents at cartesian coordinates, L = l+m+n, is used
analogously to the angular momentum quantum number for atoms, to mark
functions as s-type (L=0), p-type (L=1), d-type (L=2), f-type (L=3), etc.

There is a problem with d-type and higher functions. There are only 5
linearly independent and orthogonal d orbitals, while there are 6 possible
cartesian gaussians. If we use all six, we are also introducing a 3s type
function since:

     3d_xx + 3d_yy + 3d_zz = N (x^2 + y^2 + z^2) exp[-alpha r^2]
                           = N r^2 exp[-alpha r^2] = 3s

More recently, this effect was studied for sulphur by Sabio & Topiol (1989).
Examination of f-type functions shows that there are 10 possible cartesian
gaussians, which introduce 4p_x, 4p_y and 4p_z type contamination. However,
there are only 7 linearly independent f-type functions. This is a major
headache since some programs remove these spurious functions and some do
not. Of course, the results obtained with all possible cartesian gaussians
will be different from those obtained with a reduced set.

HOW ARE THESE GAUSSIAN PRIMITIVES DERIVED?
==========================================

Gaussian primitives are usually obtained from quantum calculations on atoms
(i.e. Hartree-Fock or Hartree-Fock plus some correlated calculations, e.g.
CI). Typically, the exponents are varied until the lowest total energy of
the atom is achieved (Clementi et al., 1990). In some cases, the exponents
are optimized independently. In others, the exponents are related to each
other by some equation, and the parameters in this equation are optimized
(e.g. even-tempered or "geometrical" and well-tempered basis sets). The
primitives so derived describe isolated atoms and cannot accurately describe
deformations of atomic orbitals brought about by the presence of other atoms
in the molecule.
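
The even-tempered ("geometrical") case mentioned above can be sketched in a
few lines of Python. The sketch assumes the usual geometric-progression
form alpha_k = alpha_0 * beta^k, so that only the two parameters alpha_0 and
beta need to be optimized instead of every exponent. The function name and
the numbers 0.1 and 3.0 are hypothetical illustrations, not values from any
published basis set.

```python
def even_tempered(alpha0, beta, count):
    """Return `count` exponents alpha0 * beta**k for k = 0 .. count-1.

    Assumption: plain geometric progression; alpha0 and beta would be the
    quantities optimized in an atomic calculation.
    """
    if alpha0 <= 0.0 or beta <= 1.0 or count < 1:
        raise ValueError("need alpha0 > 0, beta > 1 and count >= 1")
    return [alpha0 * beta ** k for k in range(count)]


if __name__ == "__main__":
    # Illustrative parameters only.
    exponents = even_tempered(0.1, 3.0, 4)
    ratios = [b / a for a, b in zip(exponents, exponents[1:])]
    print(exponents)
    print(ratios)  # every ratio of neighbouring exponents equals beta
```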
Basis sets for molecular calculations are therefore frequently augmented
with other functions which will be discussed later.

For molecular calculations, these gaussian primitives have to be contracted,
i.e., certain linear combinations of them will be used as basis functions.
The term contraction means "a linear combination of gaussian primitives to
be used as a basis function." Such a basis function will have its
coefficients and exponents fixed. The contractions are sometimes called
Contracted Gaussian Type Orbitals (CGTO). To clear things up, here is a
simple example from Szabo and Ostlund (1989). The coefficients and exponents
of the gaussian expansion which minimizes the energy of the hydrogen atom
were derived by Huzinaga (1965). Four s-type gaussians were used to
represent the 1s orbital of hydrogen as:

     psi_1s = 0.50907 N_1 exp[-0.123317 r^2]
            + 0.47449 N_2 exp[-0.453757 r^2]
            + 0.13424 N_3 exp[-2.01330 r^2]
            + 0.01906 N_4 exp[-13.3615 r^2]

N_i is a normalization constant for a given primitive. In the case of
gaussians of type s it is equal to (2 alpha/pi)^0.75.

These primitives may be grouped in 2 contractions. The first contraction
contains only 1 primitive:

     phi_1 = N_1 exp[-0.123317 r^2]

3 primitives are present in the second contraction:

     phi_2 = N {0.47449 N_2 exp[-0.453757 r^2]
              + 0.13424 N_3 exp[-2.01330 r^2]
              + 0.01906 N_4 exp[-13.3615 r^2]}

N is a normalization constant for the whole contraction.

In this case, 4 primitives were contracted to 2 basis functions. This is
frequently denoted as a (4s) -> [2s] contraction (some use the (4s)/[2s]
notation). The coefficients in function phi_2 are then fixed in subsequent
molecular calculations.

The way in which contractions are derived is not easy to summarize.
Moreover, it depends upon the intended use for the basis functions. It is a
good idea to always read the original paper which describes the way in which
the contractions have been done. Some basis sets are good for geometry and
energies, some are aimed at properties (e.g.
polarizability), some are optimized only with Hartree-Fock in mind, and some
are tailored for correlated calculations. Finally, some are good for anions
and others for cations and neutral molecules. For some calculations, a good
representation of the inner (core) orbitals is necessary (e.g. for
properties required to analyze an NMR spectrum), while others require the
best possible representation of valence electrons.

WHY ARE CONTRACTIONS DONE?
==========================

Obviously, the best results could be obtained if all coefficients in the
gaussian expansion were allowed to vary during molecular calculations, but
this is rarely affordable. The computational effort (i.e. "CPU time") for
calculating integrals in the Hartree-Fock procedure depends upon the 4th
power of the number of gaussian primitives. However, all subsequent steps
depend upon the number of basis functions (i.e. contractions). Also, the
storage required for integrals (when Direct SCF is not used) depends on the
number of basis functions (not primitives!), growing roughly as its 4th
power. Frequently the disk storage and not the CPU time is the limiting
factor. The CPU time requirements are more acute when post-Hartree-Fock
methods (e.g. correlated methods) are used, since the dependence upon the
number of basis functions is then steeper than the 4th power.

There are two basic forms of contractions, namely "segmented" and "general".
The segmented contractions are disjoint, i.e., a given primitive appears in
only one contraction. The example given above, (4s) -> [2s], is a segmented
contraction. Occasionally, one or two primitives may appear in more than one
contraction, but this is an exception to the rule. The general contractions,
on the contrary, allow each of the primitives to appear in each basis
function (contraction). The segmented contractions are far more popular and
will be described first.
The reason for their popularity is not that they are better, but simply that
the most popular ab initio packages do not implement efficient integral
calculations with general contractions. The computer code to perform
integral calculations with general contractions is much more complex than
that for the segmented case.
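
The hydrogen (4s) -> [2s] example given earlier can be checked numerically
with a short Python sketch. It uses the exponents and coefficients quoted in
the text, plus the standard closed-form overlap, kinetic energy and nuclear
attraction integrals for normalized s-type gaussians sharing one center.
Atomic units are assumed throughout, and all function and variable names are
my own, introduced only for illustration.

```python
import math

# Exponents and coefficients of the 4-gaussian hydrogen 1s expansion
# quoted in the text (assumed to be in atomic units).
ALPHAS = [0.123317, 0.453757, 2.01330, 13.3615]
COEFFS = [0.50907, 0.47449, 0.13424, 0.01906]


def overlap(a, b):
    """Overlap of two normalized s-type gaussians on the same center."""
    return (2.0 * math.sqrt(a * b) / (a + b)) ** 1.5


def kinetic(a, b):
    """Kinetic energy integral between the same two gaussians."""
    return 3.0 * a * b / (a + b) * overlap(a, b)


def nuclear(a, b):
    """Attraction to a unit positive charge sitting at the common center."""
    return -2.0 * math.sqrt((a + b) / math.pi) * overlap(a, b)


def expectation(integral, alphas, coeffs):
    """Double sum c_i * c_j * integral(alpha_i, alpha_j)."""
    return sum(ci * cj * integral(ai, aj)
               for ci, ai in zip(coeffs, alphas)
               for cj, aj in zip(coeffs, alphas))


# Norm and energy of the full four-primitive expansion psi_1s.
norm = expectation(overlap, ALPHAS, COEFFS)
energy = (expectation(kinetic, ALPHAS, COEFFS)
          + expectation(nuclear, ALPHAS, COEFFS)) / norm

# Normalization constant N of the second contraction phi_2, which is
# built from primitives 2-4 only.
n_phi2 = 1.0 / math.sqrt(expectation(overlap, ALPHAS[1:], COEFFS[1:]))

if __name__ == "__main__":
    print("norm of psi_1s:", norm)
    print("energy of psi_1s (hartree):", energy)
    print("N for phi_2:", n_phi2)
```

The norm should come out close to 1, and the energy should lie slightly
above the exact hydrogen value of -0.5 hartree, as the variational principle
requires for any approximate expansion.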