Estimation of INT and RWF file sizes for G86 (Summary)



Hi all,
 A while ago, I posted the following question.  Now here's the summary
 of all the responses I got:
 >* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
 >
 >Question: Is there a way to calculate beforehand, the size of the
 >          INTEGRAL and RWF files, given the input to the G86 program?
 >
 >* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
 >From:     <SEMINARI "at@at" CPWPSCA>
 >The maximum size of the files can be estimated exactly, but it is a
 >little difficult, and experience plays an important role.
 >The integral file is the easier one to estimate, because the maximum
 >number of integrals is roughly (N**4)/8, where N is the number of basis
 >functions. Each integral needs 10 to 16 bytes, depending on the
 >hardware used. I know nothing about the SX. The program eliminates small
 >integrals, which reduces the total number of integrals by 5 to 30%.
 >Experience on your systems will therefore give you a good estimate, but
 >knowing the maximum is a good start. Later versions of Gaussian
 >(89, 90, 91) have a DIRECT option with which you do not need to store the
 >integrals, and which is even faster when the total number of basis
 >functions is larger than 100.
 >Normally the RWF does not require a lot of space, except in post-HF
 >calculations, where the above number has to be multiplied by the number
 >of atoms in some cases, or by the number of occupied orbitals, etc.; the
 >details should be in the manual. Again, for these calculations
 >this is not true in the latest versions of Gaussian. I used G90,
 >which manages disk space more efficiently. You should get G90, or
 >at least G88. G86 was not a good version (it has many bugs),
 >and not many people use it now. There are many details involved;
 >if you mail me the input file, I could be of more help.
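
A rough sketch of the estimate above, in Python. The function name and its
parameters are made up for illustration; the default of 16 bytes per integral
is simply the upper end of the 10 to 16 byte range quoted above, and the 5 to
30% screening range is taken from the same reply, not from anything G86
itself reports.

```python
def integral_file_bounds(n_basis, bytes_per_integral=16,
                         min_dropped=0.05, max_dropped=0.30):
    """Return (low, high, maximum) integral-file sizes in bytes.

    maximum assumes all N**4/8 integrals are stored; low and high apply
    the quoted 30% and 5% reductions from discarding small integrals.
    """
    max_integrals = n_basis ** 4 / 8
    maximum = max_integrals * bytes_per_integral
    low = maximum * (1.0 - max_dropped)
    high = maximum * (1.0 - min_dropped)
    return low, high, maximum


low, high, maximum = integral_file_bounds(94)
print(f"N=94: {low / 1e6:.0f} to {high / 1e6:.0f} MB "
      f"(maximum {maximum / 1e6:.0f} MB)")
```

The real bytes-per-integral value depends on the machine, so it is worth
running this with both 10 and 16 to bracket the answer.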
 >Best regards
 >Dr. Jorge M. Seminario
 >bitnet%"jsmcm "at@at" uno"
 >University of New Orleans
 >Dept of Chemistry
 >From:     <SEMINARI "at@at" CPWPSCA>
 >
 >The problem with the RWF file is that its contents have changed between
 >versions of Gaussian; you can probably find the exact contents
 >in the manual. I do not have one right now. A 94-basis-function MP2
 >calculation requires the following disk space:
 >---------------------------------from output file---------------------------
 >0Normal termination of Gaussian 90.
 > STATUS OF FILES
 > RWF: /tmp/users/4/seminari/sco1/97269/g90-a97282.rwf 453120 words deleted
 > INT: /tmp/users/4/seminari/sco1/97269/g90-a97282.int 19326464 words deleted
 > D2E: /tmp/users/4/seminari/sco1/97269/g90-a97282.d2e 0 words deleted
 > CHK: sco1mp2.chk 130560 words saved
 > SCR: /tmp/users/4/seminari/sco1/97269/g90-a97282.scr 24668160 words deleted
 > -----------------------------------------------------------------------end
 >This may give you an idea of the sizes involved (remember, 1 word = 8 bytes).
 >In G86 MP2 calculations the RWF probably contains the transformed
 >integrals and some other data for the frequencies, so the G86 RWF would
 >hold the contents of the G90 RWF and SCR files plus a small additional
 >amount for the numerical second derivatives. But a word of advice here:
 >MP2 FREQ in Gaussian is done numerically rather than analytically, and
 >an analytic calculation could take many times less CPU time for your
 >system! (12 atoms means 36 coordinates, so 72+1 energy calculations are
 >required numerically.) A program that can do this analytically is
 >called CADPAC (Cambridge Analytic Derivative PACkage). I never do MP2 FREQ
 >with the Gaussian programs.
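
To put the word counts in the listing above into more familiar units, here is
a small Python sketch. The word counts are copied from the STATUS OF FILES
output and the 8 bytes per word is as stated in the reply; the function names
are hypothetical.

```python
BYTES_PER_WORD = 8  # "1 word = 8 bytes", as stated above

# Word counts copied from the STATUS OF FILES listing above.
file_words = {
    "RWF": 453120,
    "INT": 19326464,
    "D2E": 0,
    "CHK": 130560,
    "SCR": 24668160,
}


def words_to_megabytes(words):
    """Convert a word count to megabytes (1 MB = 10**6 bytes)."""
    return words * BYTES_PER_WORD / 1e6


def numerical_energy_evaluations(n_atoms):
    """Two displaced points per Cartesian coordinate plus the reference."""
    return 2 * 3 * n_atoms + 1


for name, words in file_words.items():
    print(f"{name}: {words_to_megabytes(words):8.1f} MB")
print("energy evaluations for 12 atoms:", numerical_energy_evaluations(12))
```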
 >
 >* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
 >From: djh "at@at" ccl.net
 >
 >>The problem: G86 on the SX uses a tremendous amount of scratch filespace
 >
 >For SCF-only jobs (that is, no MP, CI, etc.) find out how many basis
 >functions there are.  The easiest way to do this is to start up the job
 >for a minute or so and look through the output.  If N is the number of
 >basis functions, estimate N^4/4 words of storage for the integral file.
 >I assume the SX is a 64-bit machine, so if there are 50 basis functions
 >you'd need 12.5MB.
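
The N^4/4-word rule can also be turned around to ask how large a basis fits
in a given amount of scratch space. A minimal Python sketch follows; the
function names are hypothetical, and it assumes 8-byte words and that the
whole disk budget goes to the integral file, which a real job would not
quite manage.

```python
def scf_integral_file_bytes(n_basis, bytes_per_word=8):
    """Integral-file size from the N^4/4-word rule of thumb."""
    words = n_basis ** 4 / 4
    return words * bytes_per_word


def largest_basis_for_disk(disk_bytes, bytes_per_word=8):
    """Invert the rule: largest N whose integral file fits in disk_bytes."""
    n = int((4 * disk_bytes / bytes_per_word) ** 0.25)
    # Guard against floating-point rounding on either side of the root.
    while scf_integral_file_bytes(n + 1, bytes_per_word) <= disk_bytes:
        n += 1
    while n > 0 and scf_integral_file_bytes(n, bytes_per_word) > disk_bytes:
        n -= 1
    return n


print(scf_integral_file_bytes(50) / 1e6, "MB for N=50")
print(largest_basis_for_disk(8e9), "basis functions fit in 8 GB")
```

Because the size grows as N^4, a very large increase in disk space buys only
a modest increase in N.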
 >
 >If there is symmetry in the molecule you won't need as much space.  If the
 >user is doing a correlated calculation, you may need several times this
 >space.
 >
 >Often, many similar jobs are run, or a series of runs with larger basis
 >sets is made, so an aware user can make a good estimate of the file space
 >needed after the first successful run.  Unfortunately, experience is the
 >only good teacher in this problem!
 >
 >>Another question is:  Is this sort of filesize for the integrals and
 >>readwrite file typical??
 >
 >We routinely run jobs requiring 8GBytes total disk space.  Quantum chemistry
 >does not scale well!  You might want to look into getting G90 though, as it
 >has the so-called "direct SCF", which uses only minimal disk space at
 >some increase in CPU time.
 >--
 >David J. Heisterberg (djh "at@at" ccl.net)	"Dost thou use to write thy name?
 >The Ohio Supercomputer Center		 or hast thou a mark to thyself,
 >Columbus, Ohio					 like an honest plain-dealing man?"
 >
 >* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
 >From:     <NEELY "at@at" AUDUCVAX>
 >
 >I'm a bit surprised that the rwf file takes more space than the
 >integral file:  whenever I've run G86 on a Cray, the reverse
 >was true.  A pretty decent estimate for the integral file is
 >10*N*N*N*N/4 bytes, where N is the number of basis functions.
 >This number can be obtained from the molecular formula, and
 >knowledge of the basis set.  Tim Clark, in his "Handbook of
 >Computational Chemistry" (Wiley, I think) has a nice chart of
 >the number of basis functions per atom for the common basis sets.
 >I hope this helps you.
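
As an illustration of counting N from the molecular formula, here is a short
Python sketch. The per-atom counts are the standard ones for these basis sets
as I know them (with 6-31G* counted using six Cartesian d functions); they
are not copied from Clark's chart. The function names are hypothetical, and
only hydrogen and first-row atoms are covered.

```python
# Basis functions per atom (hydrogen, first-row atom) for a few common
# basis sets; 6-31G* is counted with six Cartesian d functions.
FUNCTIONS_PER_ATOM = {
    "STO-3G": {"H": 1, "first_row": 5},
    "3-21G":  {"H": 2, "first_row": 9},
    "6-31G*": {"H": 2, "first_row": 15},
}
FIRST_ROW = {"Li", "Be", "B", "C", "N", "O", "F", "Ne"}


def count_basis_functions(formula, basis):
    """formula maps element symbols to atom counts, e.g. {"C": 6, "H": 6}."""
    table = FUNCTIONS_PER_ATOM[basis]
    total = 0
    for element, count in formula.items():
        if element == "H":
            total += count * table["H"]
        elif element in FIRST_ROW:
            total += count * table["first_row"]
        else:
            raise ValueError(f"no entry for element {element}")
    return total


def newhouse_integral_bytes(n_basis):
    """The 10*N*N*N*N/4 bytes estimate quoted above."""
    return 10 * n_basis ** 4 / 4


n = count_basis_functions({"C": 6, "H": 6}, "6-31G*")
print(n, "basis functions,", newhouse_integral_bytes(n) / 1e6, "MB")
```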
 >Irene Newhouse
 >
 >* * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * * *
 >From: m10!frisch "at@at" uunet.UU.NET (Michael Frisch)
 >
 >It is common for ab initio calculations to use large amounts of disk space.
 >The current version of Gaussian can trade CPU time for disk space for the
 >most common types of calculation (Hartree-Fock and MP2), and is much, much
 >faster, but these capabilities were not present in Gaussian 86.
 >
 >The integral file size depends on the number of basis functions.  It will be
 >roughly 1.5N^4 bytes or 2N^4 bytes long, depending on how the integral
 >labels are packed (whether they fill 32 or 64 bits).  For open-shell
 >Hartree-Fock calculations, the size will be 2.5N^4 or 3N^4.
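
The four coefficients above fit naturally in a small lookup table. This is
only a sketch: the names are hypothetical, and pairing the smaller
coefficient of each pair with 32-bit labels is my reading of the sentence
above, not something the reply states explicitly.

```python
# Coefficients of N^4 in bytes, as quoted above; pairing the smaller
# coefficient with 32-bit labels is an assumption of this sketch.
BYTES_PER_N4 = {
    ("closed", 32): 1.5,
    ("closed", 64): 2.0,
    ("open", 32): 2.5,
    ("open", 64): 3.0,
}


def g86_integral_file_bytes(n_basis, shell="closed", label_bits=64):
    """Rough G86 integral-file size in bytes for N basis functions."""
    return BYTES_PER_N4[(shell, label_bits)] * n_basis ** 4


for n in (50, 100, 150):
    closed = g86_integral_file_bytes(n) / 1e6
    opened = g86_integral_file_bytes(n, shell="open") / 1e6
    print(f"N={n:3d}: closed-shell {closed:7.1f} MB, "
          f"open-shell {opened:7.1f} MB")
```

For N=50 with 64-bit labels this gives 12.5 MB, the same figure as the
N^4/4-word rule quoted earlier.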
 >
 >The read-write file size depends on N^2 for Hartree-Fock calculations, and
 >on N^4 in a complicated way for post-SCF calculations.  The easiest way to
 >estimate the size is to run a smaller job and scale the size by the ratio
 >of OV^3 for the jobs (O=number of electrons, V=N-O=number of virtuals).
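
The scaling recipe above might look like this in Python. The function name
and the numbers in the example are hypothetical; only the O*V^3 ratio with
V = N - O comes from the reply.

```python
def scale_rwf_size(small_rwf_bytes, small_o, small_n, large_o, large_n):
    """Scale a measured RWF size by the ratio of O*V^3, with V = N - O."""
    small_ov3 = small_o * (small_n - small_o) ** 3
    large_ov3 = large_o * (large_n - large_o) ** 3
    return small_rwf_bytes * large_ov3 / small_ov3


# Hypothetical numbers: a 20 MB RWF measured with O=10, N=50,
# extrapolated to a job with O=20, N=100.
estimate = scale_rwf_size(20e6, small_o=10, small_n=50,
                          large_o=20, large_n=100)
print(f"{estimate / 1e6:.0f} MB")
```

Doubling both O and N multiplies O*V^3 by 16, so the extrapolated file in
this example is 16 times the measured one.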
 >
 >The limitation of not being able to extend these files is specific to the
 >NEC implementation of Gaussian and is not present on other versions.
 >Contact your NEC representative for an explanation.
 >
 >All this is for Gaussian 86, of course.  For SCF and MP2 calculations in
 >Gaussian 90 the integral file isn't needed.  For MP2, whatever disk is made
 >available for the read-write file is used, with quantities being recomputed
 >instead of trying to use more disk than there is.
 >
 >Michael Frisch
 >Gaussian, Inc.
 >-------
 Acknowledge-To: <CCECHK "at@at" NUSVM>