From chemistry-request@server.ccl.net  Fri Mar 31 03:00:30 2000
Received: from sunu450.rz.ruhr-uni-bochum.de (sunu450.rz.ruhr-uni-bochum.de [134.147.64.5])
	by server.ccl.net (8.8.7/8.8.7) with SMTP id DAA23568
	for <Chemistry@CCL.net>; Fri, 31 Mar 2000 03:00:29 -0500
Received: (qmail 7363 invoked from network); 31 Mar 2000 08:00:21 -0000
Received: from cvw.orch.ruhr-uni-bochum.de (HELO ?134.147.110.57?) (134.147.110.57)
  by mailhost.rz.ruhr-uni-bochum.de with SMTP; 31 Mar 2000 08:00:21 -0000
Mime-Version: 1.0
X-Sender: wuellcbd@mailhost.rz.ruhr-uni-bochum.de
Message-Id: <p04310100b50a0b022123@[134.147.110.57]>
In-Reply-To: <38E38A5C.411B65BF@ice.mpg.de>
References: <38E38A5C.411B65BF@ice.mpg.de>
Date: Fri, 31 Mar 2000 10:00:16 +0200
To: Christoph Steinbeck <steinbeck@ice.mpg.de>
From: Christoph van =?iso-8859-1?Q?W=FCllen?=  <Christoph.van.Wuellen@Ruhr-Uni-Bochum.De>
Subject: Re: CCL:Scaling of G98 on Clusters
Cc: Chemistry@CCL.net
Content-Type: text/plain; charset="us-ascii" ; format="flowed"

>Greetings everybody,
>
>this question goes to those who use Gaussian 98 (or maybe also other
>versions) on workstation clusters.
>
>
>Nodes	Effective speedup
>1	1
>4	3.93
>8	6.41
>10	8.55
>12	8.67
>16	7.9
>

Given a serial fraction of 1-2 per cent of the total CPU time, that is 
what you can expect according to Amdahl's law. The law does not even 
account for non-ideal communication, which is probably what makes the 
speedup drop between 12 and 16 processors.
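For reference, Amdahl's law puts an upper bound of S(N) = 1/(s + (1-s)/N) on the speedup for a serial fraction s on N processors. A minimal sketch of that calculation (the 1.5% serial fraction is just the midpoint of the 1-2 per cent range quoted above):

```python
# Amdahl's law: the best possible speedup on N processors when a
# fraction s of the total work is inherently serial.
def amdahl_speedup(n_procs, serial_fraction):
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / n_procs)

if __name__ == "__main__":
    s = 0.015  # midpoint of the 1-2 per cent "serial work" quoted above
    for n in (1, 4, 8, 10, 12, 16):
        print(f"{n:2d} processors: Amdahl upper bound {amdahl_speedup(n, s):5.2f}")
```

Even this ideal curve (about 7.2 at 8 processors, 13.1 at 16) sits above the measured speedups, which is consistent with the point above: communication overhead must account for the remaining gap.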

Unless the diagonalization, orthogonalization, etc. steps of SCF or 
DFT calculations are done in parallel, it makes little sense to 
go far beyond 8 processors.
-- 
---------------------------+------------------------------------------------
Christoph van Wullen       | Fon (University):  +49 234 32 26485
Theoretical Chemistry      | Fax (University):  +49 234 32 14109
Ruhr-Universitaet          | Fon/Fax (private): +49 234 33 22 75
D-44780 Bochum, Germany    | eMail: Christoph.van.Wuellen@Ruhr-Uni-Bochum.de
---------------------------+------------------------------------------------

From chemistry-request@server.ccl.net  Fri Mar 31 03:30:30 2000
Received: from fyserv1.fy.chalmers.se (fyserv1.fy.chalmers.se [129.16.110.66])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id DAA23713
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 03:30:30 -0500
Received: from patrikj (ftf-pc32.fy.chalmers.se [129.16.112.162])
	by fyserv1.fy.chalmers.se (8.8.8/8.8.8) with SMTP id KAA20162
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 10:30:21 +0200 (MET DST)
Message-Id: <4.1.20000331102449.0096cbc0@fy.chalmers.se>
X-Sender: patrikj@fy.chalmers.se
X-Mailer: QUALCOMM Windows Eudora Pro Version 4.1 
Date: Fri, 31 Mar 2000 10:33:16 +0200
To: chemistry@ccl.net
From: Patrik Johansson <patrikj@fy.chalmers.se>
Subject: Summary SCRF with zwitterions
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"


Dear all

I got a few responses; they were clear, but discouraging.
It seems the problem lies in SCI-PCM itself and has nothing to do
with my choice of system. Both Doug Fox from Gaussian and Andreas Klamt
recommend using one of the PCM methods instead (Andreas Klamt
recommends C-PCM).

As my interest is in calculating frequencies, this is apparently not an
option, so I'll use the Onsager model instead.

I'm most grateful for the rapid responses

/Patrik



Dr. Patrik Johansson
Experimental Physics
Chalmers University of Technology
412 96 Gothenburg
SWEDEN

From chemistry-request@server.ccl.net  Fri Mar 31 04:32:45 2000
Received: from post.webmailer.de (natmail2.webmailer.de [192.67.198.65])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id EAA24174
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 04:32:44 -0500
Received: from cosmologic.de (1-3.K.dial.o-tel-o.net [212.144.1.3])
	by post.webmailer.de (8.9.3/8.8.7) with ESMTP id LAA01786
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 11:32:34 +0200 (MET DST)
Message-ID: <38E4617E.4C903889@cosmologic.de>
Date: Fri, 31 Mar 2000 10:27:42 +0200
From: "Dr. Andreas Klamt" <andreas.klamt@cosmologic.de>
Organization: COSMOlogic GmbH&CoKG
X-Mailer: Mozilla 4.7 [de] (Win98; I)
X-Accept-Language: de
MIME-Version: 1.0
To: "chemistry@ccl.net" <chemistry@ccl.net>
Subject: SD-Files
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Does anybody know an efficient way to convert an SD file with lots of structures into a set of molecular data files in any Babel
format?
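One low-tech route (a hypothetical sketch; the output naming scheme is made up for illustration): SD files separate successive records with a line containing only `$$$$`, so the file can first be split into individual MOL files, each of which can then be fed to Babel for conversion:

```python
# Split a multi-structure SD file into individual MOL files.
# SD (SDF) records are delimited by a line consisting of "$$$$".
import os

def split_sdfile(sd_path, out_dir):
    os.makedirs(out_dir, exist_ok=True)
    record, count = [], 0
    with open(sd_path) as fh:
        for line in fh:
            if line.strip() == "$$$$":        # end of one record
                count += 1
                out = os.path.join(out_dir, f"mol{count:05d}.mol")
                with open(out, "w") as o:
                    o.writelines(record)      # MOL block, delimiter dropped
                record = []
            else:
                record.append(line)
    return count
```

Each resulting .mol file can then be converted by Babel in a simple shell loop.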

Thanks in advance 

Andreas
-- 
--------------------------------------------------------------------------------
Dr. Andreas Klamt
COSMOlogic GmbH&Co.KG
Burscheider Str. 515
51381 Leverkusen

Tel.: 49-2171-73168-1  Fax: ...-9
e-mail: andreas.klamt@cosmologic.de
web:    www.cosmologic.de
--------------------------------------------------------------------------------
COSMOlogic
        Your Competent Partner for
                    Computational Chemistry and Solvation
--------------------------------------------------------------------------------



From chemistry-request@server.ccl.net  Fri Mar 31 06:32:04 2000
Received: from fyserv1.fy.chalmers.se (fyserv1.fy.chalmers.se [129.16.110.66])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id GAA24731
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 06:32:02 -0500
Received: from patrikj (ftf-pc32.fy.chalmers.se [129.16.112.162])
	by fyserv1.fy.chalmers.se (8.8.8/8.8.8) with SMTP id NAA09155;
	Fri, 31 Mar 2000 13:31:54 +0200 (MET DST)
Message-Id: <4.1.20000331133101.0097d820@fy.chalmers.se>
X-Sender: patrikj@fy.chalmers.se
X-Mailer: QUALCOMM Windows Eudora Pro Version 4.1 
Date: Fri, 31 Mar 2000 13:34:48 +0200
To: h.g.schreckenbach@dl.ac.uk
From: Patrik Johansson <patrikj@fy.chalmers.se>
Subject: Re: CCL:Summary SCRF with zwitterions
Cc: chemistry@ccl.net
In-Reply-To: <38E47D40.DC75D112@dl.ac.uk>
References: <4.1.20000331102449.0096cbc0@fy.chalmers.se>
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"


Dear Georg (and all other CCLers who have helped),

Thanks a lot. Maybe I was a little quick to rule out the PCM, but I think
the G98 manual is a little fuzzy on this point. As you have now assured me
that it works (numerically), I'll try that.

Patrik


At 11:26 AM 31-03-00 +0100, you wrote:
>Dear Patrik,
>
>> I got a few responses, clear, but however discouraging.
>> It seems as the problem lies in the SCI-PCM itself and has nothing to do
>> with my choice of system. Both Doug Fox from Gaussian and Andreas Klamt
>> recommend to use any of the PCM instead (C-PCM is recommended by Andreas
>> Klamt).
>>
>> As my interest is to calculate frequencies this is however not an option it
>> seems - so I'll go to the Onsager model instead.
>
>As far as I know, the latest version of Gaussian98 should allow you to do
>-- numerical -- frequencies with, for instance, CPCM. In fact, I have done
>a few of them. More generally, the latest release has gradients (and hence
>numerical frequencies) for all PCM models; see the release notes on the
>GAUSSIAN website. Numerical frequencies are, of course, a lot more costly
>than analytical ones.
>   Besides, you might want to contact the author of the solvation stuff, V.
>Barone -- he helped me a lot with getting input settings that actually
>worked for my system. (Unfortunately, I don't have his e-mail handy at the
>moment, but I think he is in Napoli).
>
>Best regards, Georg
>
>
>


Dr. Patrik Johansson
Experimental Physics
Chalmers University of Technology
412 96 Gothenburg
SWEDEN

From chemistry-request@server.ccl.net  Fri Mar 31 08:23:31 2000
Received: from ucidoor.unitedcatalysts.com ([208.23.162.2])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id IAA26159
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 08:23:31 -0500
Received: by ucidoor.unitedcatalysts.com; (8.8.8/1.3/10May95) id IAA11199; Fri, 31 Mar 2000 08:27:10 -0500 (EST)
Received: from 10.1.0.50 by ucismtp02.unitedcatalysts.com (InterScan E-Mail VirusWall NT); Fri, 31 Mar 2000 08:23:42 -0500 (Eastern Standard Time)
Received: from lvlxch01.unitedcatalysts.com ([10.1.0.88])
 by lvlmail.unitedcatalysts.com (PMDF V5.2-32 #33820)
 with ESMTP id <0FSA00HC3FZBFQ@lvlmail.unitedcatalysts.com> for
 chemistry@ccl.net; Fri, 31 Mar 2000 08:25:59 -0500 (EST)
Received: by lvlxch01.unitedcatalysts.com with Internet Mail Service
 (5.5.2650.21)	id <H7PFQ34C>; Fri, 31 Mar 2000 08:21:59 -0500
Content-return: allowed
Date: Fri, 31 Mar 2000 08:21:57 -0500
From: "Shobe, Dave" <dshobe@sud-chemieinc.com>
Subject: lanl2dz polarization functions
To: "'CCL'" <chemistry@ccl.net>
Message-id: 
 <157A51F55AAAD3119CD70008C7B1629D03E47C@lvlxch01.unitedcatalysts.com>
MIME-version: 1.0
X-Mailer: Internet Mail Service (5.5.2650.21)
Content-type: text/plain;	charset="iso-8859-1"

Where can I get polarization and diffuse functions for LANL2DZ (for
Gaussian98W)?  I am interested in both main-group and transition elements.

--David Shobe


From chemistry-request@server.ccl.net  Fri Mar 31 08:28:29 2000
Received: from NPD1.NPD.UFPE.BR (npd1.npd.ufpe.br [150.161.6.2])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id IAA26230
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 08:28:22 -0500
Received: from rlongo1 ([150.161.5.18]) by npd.ufpe.br (PMDF V5.1-4 #20469)
 with ESMTP id <01JNKCJSK84G0001M8@npd.ufpe.br> for chemistry@ccl.net; Tue,
 28 Mar 2000 11:35:50 GMT-3
Date: Fri, 31 Mar 2000 10:27:36 -0300
From: Ricardo Longo <longo@npd.ufpe.br>
Subject: Theor. Chem. Acc. New Century Issue
To: chemistry@ccl.net
Message-id: <38E4A7C8.20BC8EC2@npd.ufpe.br>
MIME-version: 1.0
X-Mailer: Mozilla 4.01 [en] (Win95; I)
Content-type: text/plain; charset=us-ascii
Content-transfer-encoding: 7bit
X-Priority: 3 (Normal)
References: <200003310240.EAA15102@pinon.ccu.uniovi.es>

> > Date: Thu, 30 Mar 2000 13:26:49 -0600 (CST)
> > From: Christopher Cramer <cramer@pollux.chem.umn.edu>
> >
> > Last month, Theoretical Chemistry Accounts (TCA) published its New
> Century
> > Issue. Each member of the editorial board selected one or more
> papers from
> > the 20th century and wrote a 2-4 page perspective on the influence
> that
> > the work had on theoretical chemistry/computational
> chemistry/molecular
> > modeling. One of the motivations for this issue was to provide a
> historical
> > context that would be readily accessible to newcomers to the field.
>
> This is an important attempt and the TCA issue could become highly
> valued
> as an educational material. On the other hand, even though it is
> obvious
> that a lot of subjectivity is involved in choosing a list of relevant
> papers, it is a little perturbing to notice that workers such as
> Mulliken,
> Bader, Boyd, Clementi, Huzinaga or Shavitt, to name a very few, have
> not
> influenced the field in the opinion of the TCA board.
>
>                        Best regards,
>                                      Victor Lua~na
>                                      victor@carbono.quimica.uniovi.es
>

I also think that the editorial board of TCA should have been more
careful not to leave out contributions from workers such as J. C. Slater.

Ricardo Longo
longo@npd.ufpe.br



From chemistry-request@server.ccl.net  Thu Mar 30 23:40:36 2000
Received: from gw5smtp.um.edu.my ([202.185.112.1])
	by server.ccl.net (8.8.7/8.8.7) with SMTP id XAA22732
	for <CHEMISTRY@ccl.net>; Thu, 30 Mar 2000 23:40:30 -0500
Received: from Gwdom1-Message_Server by gw5smtp.um.edu.my
	with Novell_GroupWise; Fri, 31 Mar 2000 12:39:39 +0800
Message-Id: <s8e49c8b.042@gw5smtp.um.edu.my>
X-Mailer: Novell GroupWise 5.5
Date: Fri, 31 Mar 2000 12:39:04 +0800
From: "Liew Fui fah" <fuifah@um.edu.my>
To: <CHEMISTRY@ccl.net>
Subject: Summary: Molecular Modeling - expert opinion needed
Mime-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Disposition: inline
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by server.ccl.net id XAA22733

Greetings,

I received many mails requesting that I post a summary to CCL, and here it is. Thanks
to those who responded to my questions. However, the requests for a summary
outnumbered the responses. The original question was:

--------------------------------
Dear CCLers,

 Nowadays, the Internet has become a tool that lets scientists perform their work
efficiently. Almost every one of us relies on the Internet for discussion, looking
for information and communicating with others. Indeed, through the Internet we are
effectively engaged in research, publications and information sharing that require
collaboration with geographically dispersed colleagues.

 With the aim of fully utilizing the strength of the Internet, I foresee a
possibility to develop a web-based molecular modeling station/portal that would
allow chemists to carry out experimental work and share information and modeling
results with just a simple login to the server through an Internet browser. The
molecular modeling engine, computing facilities and the working space are provided
by the station.

 In order to carry out an analysis of this idea, I need expert opinions from fellow
CCLers to assess the need for such a web-based molecular modeling station and how it
should perform. Please kindly give your opinions from a user's or developer's
perspective with respect to the following aspects. (These are some basic guidelines
for the survey; you are most welcome to add additional comments.)

 A. Interface
 - What are the interface features that you would like to see and use when
   manipulating your structure? Besides the common editing and browsing tools, you
   are welcome to suggest new features that meet your requirements.
 - Would you like to have an interface setting that matches the way you want to work?
 - How would you like the information on transition states, molecular properties,
   orbitals, vibrations, electronic spectra, etc. to be displayed?
 - Which molecular modeling package(s) have the user interface that you like most?

 B. Modeling Methods
 - What are the general features / methods that are a *must* for a molecular modeling
   package?
 - What modeling methods are you using?
 - What are the advanced characteristics of a molecular modeling package that will
   meet your requirements?
 - Please suggest the molecular modeling package(s) that you like to use most in
   terms of accuracy and ease of use.

 C. Platform
 - Since it will be a web-based molecular modeling station, it is ideal to have
   access from multiple platforms. As a reference, please suggest the platforms
   that most chemists work on while carrying out their work.
 - Would you like to have direct access to the web-based station from your current
   working platform?
 - What platform do you think is most capable of operating as a server that can
   handle a large job load and provide enough storage?

 D. Technologies
 - Do you think that Java will meet most of the web-based application requirements
   while incorporating various plug-ins and formats (e.g. XML, SVG)? What is your
   comment on this particular technology?
 - Please suggest other technologies that you think are powerful for developing a
   web-based chemical application. (if any)

 E. Database
 - What are the existing databases that you would like to use in your searches?
   Please provide references.

 F. General
 - What do you think about a web-based molecular modeling station/portal?
 - Would it be convenient to do molecular modeling jobs on a remote machine through
   the Internet?

Your opinions and comments are very useful to my research; please kindly reply to
fuifah@um.edu.my

Thanks.

Regards,

Liew Fui Fah,
University Malaya, Malaysia.
fuifah@um.edu.my


Here come the responses:

1)
Date: Wed, 15 Mar 2000 14:39:43 +0800
From: mcbskt <mcbskt@leonis.nus.edu.sg>
Reply-To: mcbskt@leonis.nus.edu.sg
Organization: Institute of Molecular & Cell Biology
To: Liew Fui fah <fuifah@um.edu.my>
Subject: Re: CCL:Molecular Modeling - expert opinion needed

There are many ongoing international collaborations in the US and Europe that
address these questions.  Check the Pacific Symposium on Biocomputing.

========================

2)
Date: Wed, 15 Mar 2000 10:49:38 -0600
From: Kirby Vandivort <kvandivo@ks.uiuc.edu>
To: fuifah@um.edu.my
Subject: [Fwd: CCL:Molecular Modeling - expert opinion needed]

What you are proposing sounds very similar to what we are working on right
now for Structural Biology.  I invite you to take a look at what we are
working on and offer any comments or suggestions.  We are relatively early
in the funding (about 1 year into a 4 year project) but we have an
initial release that you can play with.

http://www.ks.uiuc.edu/Research/collaboratory

If you have any questions, let us know!

Kirby

Kirby Vandivort                      Theoretical Biophysics Group
Email: kvandivo@ks.uiuc.edu          3051 Beckman Institute
http://www.ks.uiuc.edu/~kvandivo/    University of Illinois
Phone: (217) 244-5711                405 N. Mathews Ave
Fax  : (217) 244-6078                Urbana, IL  61801, USA


3)
Date: Wed, 15 Mar 2000 15:03:37 -0600
Subject: FW: BioCoRE - a structural biology portal
From: Gila Budescu <gila@ks.uiuc.edu>
To: <fuifah@um.edu.my>


Please look at http://www.ks.uiuc.edu/Research/collaboratory/
for an instant fulfillment of your wishes. The project started in 2/99 and
the first official release was announced in 2/2000. We are looking forward
to supporting your work.
Gila Budescu

Barry Isralewitz     Theoretical Biophysics Group    Beckman 3121
Office Phone: (217) 244-1612     Home Phone: (217) 337-6364
email: barryi@ks.uiuc.edu     http://www.ks.uiuc.edu/~barryi

From chemistry-request@server.ccl.net  Thu Mar 30 23:53:34 2000
Received: from hanno.taiho.co.jp (hanno.taiho.co.jp [210.158.206.38])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id XAA22783
	for <chemistry@ccl.net>; Thu, 30 Mar 2000 23:53:32 -0500
Received: from ishida.taiho.co.jp (dhcp127.taiho.co.jp [210.158.206.123])
	by hanno.taiho.co.jp (8.9.3/3.7W) with SMTP id OAA31533
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 14:02:07 +0900
Message-ID: <012201bf9a6a$56a184e0$7bce9ed2@taiho.co.jp>
From: "Keisuke Ishida" <ishida@hanno.taiho.co.jp>
To: "CCL" <chemistry@ccl.net>
Subject: Summary of "How to build PC clusters for MD simulation"
Date: Fri, 31 Mar 2000 02:06:49 +0900


Thank you very much for the many replies to my question, which was:
"I want to build a PC cluster for MD simulation.
Please let me know how to build a PC cluster
and which MD programs I can get for parallel computers."

I know that many people are interested in this field, and
I have received some e-mails asking for a summary of the replies.
So I will summarize the replies as follows:

********************
>Li Guo Hui wrote:
As far as I know, molsim.chem.uva.nl is a very useful homepage for what you
want. When you begin to build the cluster, could you tell
me some details about it?

********************
>Vincent wrote:
You will probably use Linux as the operating system... (or else you should).
You then might want to take a look at the Beowulf project on cluster
computing with Linux.
Also, look at this page (from the UCLA computing center) for other details
on high-performance computing:
http://www.ats.ucla.edu/at/hpc/topics_in_hpc/default.htm
Also, here's some nice documentation about Linux clustering:
http://metalab.unc.edu/mdw/HOWTO/Beowulf-HOWTO.html
Hope this will help you.

********************
>Dr O. Parchment wrote:
A posting that you made to the CCL mailing list regarding PC clusters
for MD has been forwarded to the Beowulf mailing list. I have been involved
with the construction of two PC clusters, which are capable of running MD
codes. The first was a small 8-CPU cluster
http://www.soton.ac.uk/~chemphys/jessex/beowulf.html
and latterly a 19-node machine http://www.soton.ac.uk/~oz on which I have
run MD codes. Both of these pages have information on the hardware &
software used. If you need any further help I would be pleased to give what
help I can.

********************
>Huub van Dam wrote:
If you want to know how to build a PC cluster, I suggest you start by
looking at the Beowulf homepage:
    http://www.beowulf.org/
This leads to some information on the issues involved, and there are
pointers to other sites building the same class of systems.

********************
>Jose C. Corchado wrote:
We are planning to do the same thing: a PC cluster that will be mostly
dedicated to MD calculations.

As for the cluster, I have found some information on the Beowulf project
web page and the links included there. I have also found a few web pages
where custom-built clusters are sold (hardware and software). As for MD
programs, I know that the program "moldy", which is available at CPC, can
work on parallel computers.

Some web pages are:
http://www.beowulf.org
http://www.extremelinux.org
There are plenty of links to tutorials and other web pages where you can
get more info.

Some companies selling clusters are:
http://www.xtreme-machines.com/
http://www.microway.com
http://www.plogic.com/
http://www.kachinatech.com/

********************
>Antonio Luiz Oliveira de Noronha wrote:
Look for the Beowulf cluster project or the Extreme Linux project.
Beowulf is the NASA project on PC clusters.

On the Internet there are some HOWTOs about Beowulf clusters, especially
in the Linux Documentation Project.

You just need to download the HOWTOs (the general one), and when you unzip
it, you will find the HOWTO about Beowulf and intranets.

As for the program, a good one could be AMBER, but this depends on what
your project is.

********************
>NELSON HENRIQUE MORGON wrote
On this site you will find useful information:
http://molsim.chem.uva.nl/cluster/

********************
>Dr. David van der Spoel wrote
The GROMACS software will work fine on PC clusters
http://md.chem.rug.nl/~gmx
however, you have to invest in networking; do get Gigabit ethernet or
something like that.

Scaling information for a simple cluster can be found at
http://zorn.bmc.uu.se/~spoel/gmxscaling.html
The best I got was slightly more than 75% scaling for an 8-processor
cluster.

********************
>Armel Le Bail wrote
There is a brand new one here, with detailed
explanations and pictures:
http://weblotus.univ-lemans.fr/w3lotus/index.html

********************
>Alex Ninaber wrote
'Just' a Beowulf PC cluster is a pretty bad idea for running MD code. Unless
you are going to run lots of small MD jobs with slightly different
starting structures, a PC cluster connected with ethernet is going to be
pointless. MD code, especially if you want to use Ewald or PME, scales
notoriously badly with the number of processors. Dual Intels connected with
Myrinet or some other type of fast interconnect appear to be ok. Alphas
with the Quadrics interconnect are pretty good if you have some money to
burn. For Intels, expect the interconnect to cost close to the price of a
single Intel box.

As for good MD programs:

Amber
Gromacs
Charmm
Gromos

All run on clusters. We have had excellent results with the free
Gromacs: a factor of 2 compared to Amber running with PME on a 6-node SGI
Origin. We expect to see the same thing on Alpha computers.

If anyone is interested in more information on this topic, we
specialise in optimising the cluster computing environment for specific
software applications.

********************

Thanks,

Keisuke Ishida
Chemistry Laboratory, Taiho Pharmaceutical Co.,Ltd.
Hanno 357-8527 Japan
TEL. +81-429-72-8900
FAX. +81-429-72-8913
E-mail. ishida@taiho.co.jp




From chemistry-request@server.ccl.net  Fri Mar 31 00:44:10 2000
Received: from lrz.uni-muenchen.de (root@lsanca1-ar99-078-042.biz.dsl.gtei.net [4.3.78.42])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id AAA23055
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 00:44:10 -0500
Received: (from eugene.leitl@localhost)
	by lrz.uni-muenchen.de (8.8.8/8.8.8) id VAA17958
	for <chemistry@ccl.net>; Thu, 30 Mar 2000 21:44:26 -0800
Resent-From: Eugene Leitl <eugene.leitl@lrz.uni-muenchen.de>
Resent-Message-ID: <14564.15162.605209.904685@lrz.uni-muenchen.de>
Resent-Date: Thu, 30 Mar 2000 21:44:26 -0800 (PST)
Resent-To: <chemistry@ccl.net>
Reply-To: <glindahl@hpti.com>
Message-Id: <000201bf9ad1$54b5b4a0$a26ef7ce@hptilap.hpti.com>
In-Reply-To: <14564.3211.928734.38138@lrz.uni-muenchen.de>
Importance: Normal
From: "Greg Lindahl" <glindahl@hpti.com>
To: "Eugene Leitl" <Eugene.Leitl@lrz.uni-muenchen.de>,
        <beowulf@beowulf.gsfc.nasa.gov>
Subject: RE: Scaling of G98 on Clusters
Date: Fri, 31 Mar 2000 00:24:04 -0500

> Is this bad scaling to be expected?

Yes. Not only does g98 use Linda, which is hard to optimize, it's also doing
work decomposition, so it ought to scale at best like Charmm does. Charmm goes
out to about 32 Alpha nodes with Myrinet before it begins to turn over, so
it shouldn't surprise you that ethernet turns over faster. I should
hopefully have some comparative benchmarks soon: Alpha vs. Intel, GigE vs.
Myrinet vs. fast ethernet.

-- g


From chemistry-request@server.ccl.net  Fri Mar 31 00:47:19 2000
Received: from lrz.uni-muenchen.de (root@lsanca1-ar99-078-042.biz.dsl.gtei.net [4.3.78.42])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id AAA23093
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 00:47:18 -0500
Received: (from eugene.leitl@localhost)
	by lrz.uni-muenchen.de (8.8.8/8.8.8) id VAA18009
	for <chemistry@ccl.net>; Thu, 30 Mar 2000 21:47:35 -0800
Resent-From: Eugene Leitl <eugene.leitl@lrz.uni-muenchen.de>
Resent-Message-ID: <14564.15351.475946.360854@lrz.uni-muenchen.de>
Resent-Date: Thu, 30 Mar 2000 21:47:35 -0800 (PST)
Resent-To: <chemistry@ccl.net>
Message-Id: <001101bf9ad3$575ca6d0$0100a8c0@chem.wsu.edu>
References: <14564.3211.928734.38138@lrz.uni-muenchen.de>
From: "Phillip Matz" <matz@wsunix.wsu.edu>
To: "Eugene Leitl" <Eugene.Leitl@lrz.uni-muenchen.de>
Subject: Re: CCL:Scaling of G98 on Clusters
Date: Thu, 30 Mar 2000 21:38:28 -0800

Hello!

Indeed, I see very similar results when scaling G98 CIS(Direct) calculations.
My Beowulf is similarly configured (450 MHz nodes, not 550).  Using the simple
equations regarding Amdahl's law and IPC provided in Robert G. Brown's web
article "So, You Want to Build a Beowulf" and varying a few parameters
(10 Mbps vs. 100 Mbps, etc.), I was able to extract a few seemingly "important"
numbers concerning some of the G98 Linda executables.
I believe there is a good marketing reason why the white paper regarding the
scalability of the Gaussian code on Linda reports results for RHF(direct).  My
analysis of G98 RHF(direct) (the executable is 502.exel) indicates that the
serial code fraction (Ts in Amdahl's law) is around 1%, and for my
particular Beowulf the IPC was approximately 2 ms/node/second; with such a
low Ts one can't help but expect decent granularity and ok
scalability.  My field of research (I'm a grad student at Washington State
University) is spectroscopy, so I'm particularly interested in excited-state
calculations, which G98 facilitates via CIS calculations.  My analysis of
G98 CIS(direct) (the executable is 914.exel) indicates that the serial code
fraction is around 8%!  This of course leads to very poor G98 scaling when
I run a CIS calculation, with the peak speedup being around 6 when 12 nodes
are used, decreasing thereafter just as you have noticed.

I don't use G98 for DFT calculations, but based on your observations I would
suspect that the G98 Linda executables (.exel) being called from your route
file also have a high percentage of serial code in them.  I would also like
to point out that while the noted decrease in scalability can be alleviated
through channel bonding or faster networking, the maximum speedup is
severely restricted by the serial code fraction (Amdahl's law).

And one last comment: I was completely amazed at how easy and accurate it
was to model my G98 Beowulf's performance based on the simple equations
found in Dr. Brown's Beowulf profiling guide - so thank you, Dr. Brown!
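The kind of model described above can be sketched as follows. This is a toy version in the spirit of Amdahl's law plus a linear per-node communication penalty; the 8% serial fraction matches the figure quoted for CIS(direct), but the communication cost is an illustrative guess, not a number fitted to the reported data:

```python
# Toy speedup model: normalized run time T(N) = s + (1 - s)/N + c*(N - 1),
# i.e. Amdahl's law plus a communication (IPC) term growing with node count.
def model_speedup(n_nodes, serial_frac, comm_cost):
    return 1.0 / (serial_frac
                  + (1.0 - serial_frac) / n_nodes
                  + comm_cost * (n_nodes - 1))

if __name__ == "__main__":
    s, c = 0.08, 0.0064  # 8% serial code; hypothetical per-node IPC cost
    curve = [(n, model_speedup(n, s, c)) for n in range(1, 25)]
    peak_n, peak = max(curve, key=lambda t: t[1])
    print(f"peak speedup {peak:.2f} at {peak_n} nodes")  # peaks, then declines
```

Unlike pure Amdahl's law, which rises monotonically toward 1/s, this curve has a maximum (at roughly sqrt((1-s)/c) nodes) and then turns over, which is the qualitative behaviour reported in the thread.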

Respectfully,
Phillip Matz



----- Original Message -----
From: "Eugene Leitl" <eugene.leitl@lrz.uni-muenchen.de>
To: <beowulf@beowulf.gsfc.nasa.gov>
Sent: Thursday, March 30, 2000 6:25 PM
Subject: CCL:Scaling of G98 on Clusters


> From: Christoph Steinbeck <steinbeck@ice.mpg.de>
> Sender: "Computational Chemistry List" <chemistry-request@ccl.net>
> To: ccl <chemistry@ccl.net>
> Subject: CCL:Scaling of G98 on Clusters
> Date: Thu, 30 Mar 2000 19:09:48 +0200
>
> Greetings everybody,
>
> this question goes to those who use Gaussian 98 (or maybe also other
> versions) on workstation clusters.
>
> I'm currently in the process of evaluating g98 with Linda on my new 16-node
> linux cluster. Each node is a 550 Mhz Pentium III with 128 MB of RAM and
> a 6 GB hard disk. Communication is done via a 100 MBit ethernet and the
> 16 nodes are separated from the institutes network by a switch.
>
> The calculations I'm interested in are of the type indicated by the
> example listed below, which essentially are composite jobs with geometry
> optimizations on B3LYP/6-31G(d) level followed by a carbon shift
> calculation on the same level.
>
> Now, I ran the jobs with a-pinene and abietic acid (a mono- and a
> diterpene) with 1, 4, 8, 12 and 16 nodes to see how the setup scales.
> What I see is an almost linear scaling up to 4 nodes. With eight nodes
> the effective speedup is still nice with 6.41 where perfect behaviour
> obviously would be 8.00. But then it gets worse as you see and when
> going from 12 to 16 nodes the calculation even slows down. Here are the
> numbers.
>
> Nodes Effective speedup
> 1 1
> 4 3.93
> 8 6.41
> 10 8.55
> 12 8.67
> 16 7.9
>
> This is with a-pinene (C10H16).
> When more than doubling the size of the molecule with abietic acid
> (C20H30O2) the tendency is a little better (8 nodes take 41 h, 16 nodes
> take 30 h).
>
> Is this bad scaling to be expected? The graph on the gaussian web site
> shows the speedup of the calculations up to six nodes and I begin to
> understand why that is :-)
> In many publications on beowulf clusters you find the statement that in
> many cases one can expect near linear scaling of those cluster up to 16
> nodes. I understand that  this highly depends on the type of
> calculation, so am I stuck with my problem, or are there ways to
> optimize the behaviour by tuning the settings?
>
> BTW, I'm using the standard (install time) settings for g98 and linda.
>
> Any comments are very much appreciated.
>
> Cheers,
>
> Chris
>
> --
> Dr. Christoph Steinbeck (http://www.ice.mpg.de/departments/ChemInf)
> MPI of Chemical Ecology, Tatzendpromenade 1a, 07745 Jena, Germany
> Tel: +49(0)3641 643644 - Fax: +49(0)3641 643665
>
> What is man but that lofty spirit - that sense of enterprise.
> >... Kirk, "I, Mudd," stardate 4513.3..
>
> ----------------
> %Chk=abieticacid
> #T B3LYP/6-31G(d) Opt Test
>
>   Name: Z:\gaussian\abietinsaeure.mol
>   MOPAC file created on 22/12 17:49:56 1998 by HYPERCHEM
>
>    0   1
>  C,0,0.,0.,0.
>  C,1,1.5115
> [ ... stuff deleted here ...]
>  H,20,1.1137,5,111.4364,6,74.9555,0
>  H,22,0.9669,19,110.5133,21,-178.666,0
>
> --Link1--
> %Chk=abieticacid
> %NoSave
> #T B3LYP/6-31G(d) NMR Test Geom=AllCheck Guess=Read
> ----------------
>

From chemistry-request@server.ccl.net  Fri Mar 31 04:09:11 2000
Received: from lrz.uni-muenchen.de (root@lsanca1-ar99-078-042.biz.dsl.gtei.net [4.3.78.42])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id EAA24065
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 04:09:10 -0500
Received: (from eugene.leitl@localhost)
	by lrz.uni-muenchen.de (8.8.8/8.8.8) id BAA19503;
	Fri, 31 Mar 2000 01:09:24 -0800
From: Eugene Leitl <eugene.leitl@lrz.uni-muenchen.de>
Message-ID: <14564.27460.95207.549829@lrz.uni-muenchen.de>
Date: Fri, 31 Mar 2000 01:09:24 -0800 (PST)
To: <chemistry@ccl.net>
Subject: Workshop Biocomputing


> From: ajuffer@sun3.oulu.fi (Andre Juffer)
> Sender: owner-x-plor@hgmp.mrc.ac.uk
> To: x-plor@hgmp.mrc.ac.uk
> Subject: Workshop Biocomputing
> Date: 31 Mar 2000 09:58:03 +0100

Workshop/miniconference in Biocomputing:

"Proteins: From Structures to Properties with Biocomputing Approaches."

June 4-6, The University of Oulu, Oulu, Finland


Information, scientific program, registration, etc.:

WWW:
http://www.biochem.oulu.fi/tutkimus/Biocomputing/WorkshopBiocomputing/

-- 
Andre H. Juffer              | Phone: +358-8-553 1683
The Biocenter and            | Fax: +358-8-553-1141
    the Dep. of Biochemistry | Email: Andre.Juffer@oulu.fi
University of Oulu, Finland  | WWW:
http://www.biochem.oulu.fi/research.html
---


From chemistry-request@server.ccl.net  Fri Mar 31 07:39:21 2000
Received: from lrz.uni-muenchen.de (root@lsanca1-ar99-078-042.biz.dsl.gtei.net [4.3.78.42])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id HAA25941
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 07:39:20 -0500
Received: (from eugene.leitl@localhost)
	by lrz.uni-muenchen.de (8.8.8/8.8.8) id EAA20803
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 04:39:31 -0800
Resent-From: Eugene Leitl <eugene.leitl@lrz.uni-muenchen.de>
Resent-Message-ID: <14564.40067.691958.576885@lrz.uni-muenchen.de>
Resent-Date: Fri, 31 Mar 2000 04:39:31 -0800 (PST)
Resent-To: <chemistry@ccl.net>
Message-Id: <38E46BFB.17634923@aviion.univ-lemans.fr>
Reply-To: Florent.Calvayrac@univ-lemans.fr
Organization: =?iso-8859-1?Q?Universit=E9?= du Maine
X-Mailer: Mozilla 4.61 [en] (X11; I; Linux 2.2.12-20 i686)
X-Accept-Language: en
MIME-Version: 1.0
References: <14561.59719.826693.516484@lrz.uni-muenchen.de>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Precedence: bulk
From: Florent Calvayrac <fcalvay@aviion.univ-lemans.fr>
To: "beowulf@beowulf.gsfc.nasa.gov" <beowulf@beowulf.gsfc.nasa.gov>,
        claim@aviion.univ-lemans.fr
Subject: Re: CCL:Athlon vs Intel performance
Date: Fri, 31 Mar 2000 11:12:27 +0200

Eugene Leitl wrote:

> 
> The numbers I have show that at the same clock freq. the Athlon is
> roughly 0-10 % faster on my benchmarks compared to the PIII (ADF,
> Turbomole, Jaguar). Things are different for Dmol, which makes heavy use
> of BLAS-3 dgemm, which is substantially faster on the Athlon using the
> fabulous ATLAS (kudos to Clint Whaley!) library, e.g.
> 
> 860 MFlops at 800 MHz for the Athlon,
> 825 MFlops at 750 MHz for the Athlon, compared to
> 440 MFlops at 550 MHz on a PIII (still Katmai core)
> 
> for dgemm based matrix multiplications (independent of the matrix size)
> 
> Based on figures by Intel for their new math kernel library, performance
> of a 733 MHz Coppermine is below
> 
> 600 MFlops.

I must confess I am confused here. From the top500 site
at http://www.top500.org 

I had inferred from the rough numbers that, say, a
Pentium II 450 could do 450 peak MFlops.

The Anuwulf people at

http://tux.anu.edu.au/Projects/Beowulf/

claim to have 158.4 peak GFlops with 192 Pentium III 550s
(most likely Katmai, or are these Xeons? they would have said),
which comes to 825 MFlops per processor, according to my
1-Flop pocket calculator...
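A quick back-of-the-envelope check of that per-processor figure (added here as a sketch; the 1.5 flops/cycle reading of the peak number is my own interpretation, not from the posts above):

```python
# Sanity check of the quoted figure: 158.4 peak GFlops spread over
# 192 Pentium III 550 processors.
total_gflops = 158.4
n_procs = 192

mflops_per_proc = total_gflops * 1000 / n_procs  # GFlops -> MFlops
print(f"{mflops_per_proc:.0f} MFlops per processor")  # -> 825

# One possible reading (an assumption on my part): 825 MFlops at
# 550 MHz corresponds to counting 1.5 double-precision flops per cycle.
peak_at_550 = 550 * 1.5
print(f"{peak_at_550:.0f} MFlops peak at 550 MHz")
```

So the 825 MFlops is a theoretical peak derived from the clock rate, not a measured dgemm throughput, which would explain the apparent contradiction with the Athlon benchmark numbers.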




Any clarifications?


-- 
Florent Calvayrac                          | Tel : 02 43 83 26 26 (NOUVEAU!)
Laboratoire de Physique de l'Etat Condense | Fax : 02 43 83 35 18
UPRESA-CNRS 6087         | http://www.univ-lemans.fr/~fcalvay 
Universite du Maine-Faculte des Sciences   |
72085 Le Mans Cedex 9
-------------------------------------------------------------------
To unsubscribe send a message body containing "unsubscribe"
to beowulf-request@beowulf.org


From chemistry-request@server.ccl.net  Fri Mar 31 04:33:32 2000
Received: from mserv3.dl.ac.uk (root@mserv3.dl.ac.uk [148.79.80.28])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id EAA24190
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 04:33:31 -0500
Received: from mserv1.dl.ac.uk (root@mserv1.dl.ac.uk [148.79.160.65])
	by mserv3.dl.ac.uk (8.9.3/8.9.3/[ref postmaster@dl.ac.uk]) with ESMTP id KAA26094
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 10:33:11 +0100
Received: from dl.ac.uk (tca8.dl.ac.uk [193.62.112.106]) by mserv1.dl.ac.uk with ESMTP id KAA12580
	(8.8.8/5.4[ref postmaster@dl.ac.uk] for dl.ac.uk from h.j.j.vandam@dl.ac.uk); Fri, 31 Mar 2000 10:33:24 +0100 (BST)
Sender: H.J.J.VanDam@dl.ac.uk
Message-ID: <38E46DE0.5E426623@dl.ac.uk>
Date: Fri, 31 Mar 2000 10:20:32 +0100
From: Huub van Dam <h.j.j.vandam@dl.ac.uk>
Organization: CCLRC Daresbury Laboratory
To: chemistry@ccl.net
Subject: Re: CCL:Theor. Chem. Acc. New Century Issue
References: <200003310240.EAA15102@pinon.ccu.uniovi.es>


Victor Lua~na wrote:

> > Date: Thu, 30 Mar 2000 13:26:49 -0600 (CST)
> > From: Christopher Cramer <cramer@pollux.chem.umn.edu>
> >
> > Last month, Theoretical Chemistry Accounts (TCA) published its New Century
> > Issue. Each member of the editorial board selected one or more papers from
> > the 20th century and wrote a 2-4 page perspective on the influence that
> > the work had on theoretical chemistry/computational chemistry/molecular
> > modeling. One of the motivations for this issue was to provide a historical
> > context that would be readily accessible to newcomers to the field.
>
> This is an important attempt and the TCA issue could become highly valued
> as an educational material.

I agree.

> On the other hand, even though it is obvious
> that a lot of subjectivity is involved in choosing a list of relevant
> papers, it is a little perturbing to notice that workers such as Mulliken,
> Bader, Boyd, Clementi, Huzinaga or Shavitt, to name a very few, have not
> influenced the field in the opinion of the TCA board.

I don't agree. Read again: the issue was written by having each member of the
board select a paper and write something about the influence it had on the field.
Therefore, if an article is not discussed in the book it doesn't mean it didn't
have an influence on the field, it just means it isn't in the book. I do think the
authors have chosen articles that are in general close to their own area of
research, which I think makes sense because one needs to write something sensible
about it. This does not imply that people whos papers are not discussed didn't
influence the field. I am pretty sure that the TCA board highly regards the people
you mentioned. The bottom line is that I think this book should not been seen as a
scientifically robust historical account of the research during the last 100 years
in the field. It is more like a selection of research pearls for students (and
others) to look back at in awe and wonder. Seen in that light I think it is an
impressive publication.

Huub

--

========================================================================

Huub van Dam                               E-mail: h.j.j.vandam@dl.ac.uk
CCLRC Daresbury Laboratory                  phone: +44-1925-603362
Daresbury, Warrington                         fax: +44-1925-603634
Cheshire, UK
WA4 4AD

========================================================================



From chemistry-request@server.ccl.net  Fri Mar 31 10:12:30 2000
Received: from ccl.net (atlantis.ccl.net [192.148.249.4])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id KAA27202
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 10:12:30 -0500
Received: from arlen.ccl.net (arlen.ccl.net [192.148.249.10])
	by ccl.net (8.9.3/8.9.3/OSC 2.0) with ESMTP id KAA07661;
	Fri, 31 Mar 2000 10:12:14 -0500 (EST)
Date: Fri, 31 Mar 2000 10:12:15 -0500 (EST)
From: Jan Labanowski <jkl@ccl.net>
To: paddy.Kane@dcu.ie
cc: chemistry@ccl.net, Jan Labanowski <jkl@ccl.net>
Subject: Compartmental Modelling
Message-ID: <Pine.GSO.4.21.0003311002330.7142-100000@arlen.ccl.net>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

> From: PADDY KANE <paddy.Kane@dcu.ie>
> Subject: Compartmental Modelling
>
>       I would be very grateful if someone could tell me what Compartmental
>      Modelling is and where I might find more info on it.

It is a term from pharmacokinetics and relates to drug transport.

A Web search of www.hotbot.co for
"Compartmental Modeling"
returns:
http://laxmi.nuc.ucla.edu:8241/Pharm241_98/index.html
http://gaps.cpb.uokhsc.edu/gaps/pkbio/Ch19/Ch1901.html
http://retina.anatomy.upenn.edu/~lance/modelmath/model_compartment.html
http://falcon.ic.net/~biomware/biohp2jj.htm#identifiability
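As a minimal illustration (added as a sketch, not from the links above; parameter values are made up), the simplest instance of compartmental modelling is a one-compartment model with first-order elimination:

```python
import math

# One-compartment pharmacokinetic model: dC/dt = -k_el * C, C(0) = c0.
# c0 and k_el are purely illustrative values.

def concentration(t, c0=10.0, k_el=0.2):
    """Drug concentration at time t (first-order elimination)."""
    return c0 * math.exp(-k_el * t)

half_life = math.log(2) / 0.2           # t_1/2 = ln 2 / k_el
print(f"half-life: {half_life:.2f} h")  # ~3.47 h
print(f"C at t=5 h: {concentration(5.0):.2f}")
```

Multi-compartment models simply couple several such equations, one per tissue compartment, with rate constants for transfer between them.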


Jan K. Labanowski            |    phone: 614-292-9279,  FAX: 614-292-7168
Ohio Supercomputer Center    |    Internet: jkl@ccl.net 
1224 Kinnear Rd,             |    http://www.ccl.net/chemistry.html
Columbus, OH 43212-1163      |    http://www.ccl.net/


From chemistry-request@server.ccl.net  Fri Mar 31 13:14:29 2000
Received: from carbon.chem.ucla.edu (carbon.chem.ucla.edu [128.97.35.55])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id NAA28695
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 13:14:29 -0500
Received: from red5 (pc-ll.chem.ucla.edu [128.97.35.245])
	by carbon.chem.ucla.edu (8.9.1a/8.9.1) with ESMTP id KAA08464
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 10:14:14 -0800 (PST)
Message-Id: <4.2.0.58.20000331101820.00b08100@mbi.ucla.edu>
X-Sender: lavelle@mbi.ucla.edu
X-Mailer: QUALCOMM Windows Eudora Pro Version 4.2.0.58 
Date: Fri, 31 Mar 2000 10:24:39 -0800
To: CCL <chemistry@ccl.net>
From: Laurence Lavelle <lavelle@mbi.ucla.edu>
Subject: Athlon vs Intel and DDR SDRAM vs RAMBUS
Mime-Version: 1.0
Content-Type: text/plain; charset="us-ascii"; format=flowed

CPU decision is done.
Now I'm hearing DDR SDRAM is as fast as RAMBUS but will be much cheaper.
Comments welcome?

From chemistry-request@server.ccl.net  Fri Mar 31 11:37:25 2000
Received: from aorta.physiome.com (mail.physiome.com [204.142.61.124])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id LAA27841
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 11:37:24 -0500
Received: from COMPCHEM ([192.168.1.67]) by aorta.physiome.com with SMTP (Microsoft Exchange Internet Mail Service Version 5.5.2650.21)
	id H3VHSDQ0; Fri, 31 Mar 2000 11:31:58 -0500
Message-ID: <001101bf9b30$b9891ed0$4301a8c0@physiome.com>
From: "Roberta Susnow" <rsusnow@physiome.com>
To: <chemistry@ccl.net>
Subject: salary survey
Date: Fri, 31 Mar 2000 11:46:55 -0500

Hi,

   I would like to get some feedback on some typical starting salaries for
computational chemists who have completed one or two post-docs. I am
interested in pharmaceutical/biological types of positions. Thanks, I
appreciate the input.

Roberta Susnow


From chemistry-request@server.ccl.net  Fri Mar 31 14:10:29 2000
Received: from if1.if.ufrgs.br (if1.if.ufrgs.br [143.54.4.2])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id OAA29050
	for <chemistry@ccl.net>; Fri, 31 Mar 2000 14:09:30 -0500
Received: from pclap3.if.ufrgs.br by if1.if.ufrgs.br (PMDF V4.3-10 #7035)
 id <01JNOSX0OX6O0019OA@if1.if.ufrgs.br>; Fri, 31 Mar 2000 16:07:47 -0003
Date: Fri, 31 Mar 2000 13:44:45 -0300
From: Claudio Perottoni <perott@if.ufrgs.br>
Subject: Mass storage devices - summary
To: chemistry@ccl.net
Reply-to: perott@if.ufrgs.br
Message-id: <00033114023000.20321@pclap3.if.ufrgs.br>


	Hi people!

	I have received several messages addressing the relative performance of mass
storage devices.  Thanks a lot to all who responded to my question!

	Here is the original question:

**************************************************************************************
A feature common to most of the computational chemistry software currently
available is the huge amount of data that is frequently read/written from/to
scratch files. The performance of a computational chemistry application thus
depends largely on transfer rates to/from hard disks or other mass storage
devices. In this context, let me address a couple of questions to the list:
i) what is the best strategy (in terms of performance): writing a few huge
files to huge disk partitions or, alternatively, splitting big files into
smaller ones over many disk partitions?
ii) could someone point me to additional information on fast-access mass
storage devices?
 I'll summarize to the list!
                                                                Claudio.

**************************************************************************************

	And the replies:

From: Tony Ferriera <aferreir@memphis.edu>
To: PEROTT@if.ufrgs.br
Date: Tue, 07 Mar 2000 14:04:08 -0600


Claudio,

There are several ways to address this problem.  First is to employ "direct"
methods which calculate the one and two electron integrals as needed, thus avoiding
the disk storage problem entirely.

On Unix systems it is possible to create "striped" volumes which can span more
than one physical device.  This tends to be the fastest method (using a single
scratch directory) but some tuning may be needed (e.g. - setting the block size to
match the output scheme of the software).  Splitting files over several physical
volumes is a good idea IF you run them through separate controllers.  Accessing
several drives through the same controller will slow you down by increasing bus
traffic through the drive controller.  If you have multiple controllers in your
machine, the best option for splitting files across several drives is to have each
drive controlled independently.

I hope this helps.

Tony Ferreira



From: Soaring Bear <bear@dakotacom.net>
To: PEROTT@if.ufrgs.br
Date: Tue, 07 Mar 2000 13:39:42 -0700


This is a very interesting topic which has application outside
of chemistry, as well, and I would appreciate a copy of the
correspondence you receive.

A lot of separate files are likely to slow down disk
access compared with streaming a bunch of sequential
data in one big pull.  But if you only need a little
bit of info at a time, small files might be faster than
hunting through one huge file.
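The trade-off described above can be sketched in a few lines (file and chunk sizes are arbitrary choices for illustration; actual timings depend heavily on OS caching):

```python
import os, tempfile, time

# One big sequential file vs. the same data split over many small files.
CHUNK = 64 * 1024            # 64 KB per record (illustrative)
N = 64                       # 64 records, 4 MB total
data = os.urandom(CHUNK)

with tempfile.TemporaryDirectory() as d:
    # Write and read one big file sequentially.
    big = os.path.join(d, "big.dat")
    with open(big, "wb") as f:
        for _ in range(N):
            f.write(data)
    t0 = time.perf_counter()
    with open(big, "rb") as f:
        blob = f.read()
    t_big = time.perf_counter() - t0

    # The same data as N small files.
    for i in range(N):
        with open(os.path.join(d, f"part{i}.dat"), "wb") as f:
            f.write(data)
    t0 = time.perf_counter()
    parts = b"".join(
        open(os.path.join(d, f"part{i}.dat"), "rb").read() for i in range(N)
    )
    t_small = time.perf_counter() - t0

    assert blob == parts     # same bytes either way
    print(f"one file: {t_big*1e3:.2f} ms, {N} files: {t_small*1e3:.2f} ms")
```

The many-files path pays one open/close (and possibly one seek) per record, which is exactly the overhead the paragraph above is warning about.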

Soaring Bear Ph.D. Research Pharmacologist, Informatics,
Chemistry & Biochemistry, Herbalist & Healthy Lifestyles



From: Eugene Leitl <eugene.leitl@lrz.uni-muenchen.de>
To: PEROTT@if.ufrgs.br
Date: Tue, 07 Mar 2000 13:14:45 -0800 (PST)


It depends on the operating system. For Linux, get as much RAM as you
can (0.5-1 GByte), since unused memory is automatically used for
file caching, buy several large disks (e.g. 2-3 of the 40 GByte
Diamond Max), and operate each of them on its own EIDE host adapter,
using soft RAID.

Try http://linuxdoc.org/HOWTO/Software-RAID-HOWTO.html




From: phil stortz <pstortz@coffey.com>
To: PEROTT@if.ufrgs.br
Date: Tue, 07 Mar 2000 14:42:17 -0700


SCSI drives with large on-drive caches may be faster, as SCSI
drives can hold one command while running another, should there
be overlap between reading and writing.  Also, a RAID array with
data striping can nearly triple the data rate: the data is automagically
split between the three drives in small chunks, effectively using the drives
in parallel.  Of course, avoiding fragmentation and a fast seek time
on the drives involved will also help.
Additionally, some RAID controller cards also have a large/expandable cache.
I'd say partitions are probably a bad idea, as they force fragmentation and
extra seek distance/time.



From: "Dr. T. Daniel Crawford" <crawdad@ne095.cm.utexas.edu>
To: PEROTT@if.ufrgs.br
Date: Tue, 07 Mar 2000 20:01:01 -0600 (CST)


Claudio,

I/O efficiency is a complicated issue for which there is no single,
best solution.  Operating systems (particularly commercial ones designed
for high-performance workstations and supercomputers) vary widely in their
ability to handle transfers of large quantities of data to and from disk.
Although others on the CCL are probably more qualified to answer (and
I very much want to hear what they have to say!), there are perhaps a
few concepts to follow and a few to avoid.  For example, when splitting
read/write calls across filesystems, greater efficiency can be obtained
when each filesystem has its own hardware controller (in particular,
with its own memory buffer).  This allows the OS to send a buffer of data
to one controller while another read/write is pending on another disk.
On the other hand, there may be no reason to resort to "micromanaging"
the I/O on many modern operating systems, which often choose their own
read/write pattern regardless of what your program requests.  At one time,
for example, programmers worried about requesting particular buffer sizes of
data (e.g., 1024k/read or write) in order to ensure alignment with the disk
sector length and avoid delays as the disk spun to orient sector boundaries.
In my own (perhaps limited) I/O testing on IBM RS6ks and DEC Alphas, such
sector-based I/O is apparently unnecessary in general since the OS uses a
highly efficient I/O buffering scheme already.  
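The buffer-size experiment described above can be sketched as follows (using Python's `buffering` argument as a stand-in for the read/write buffer sizes a Fortran or C code would request; the sizes are illustrative only):

```python
import os, tempfile, time

# Many small writes through different user-space buffer sizes.
payload = os.urandom(1 << 20)   # 1 MB of scratch data (illustrative)

def write_with_buffer(path, buf_size):
    """Write `payload` in 4 KB pieces through a buffer of `buf_size` bytes."""
    t0 = time.perf_counter()
    with open(path, "wb", buffering=buf_size) as f:
        for i in range(0, len(payload), 4096):
            f.write(payload[i:i + 4096])
    return time.perf_counter() - t0

with tempfile.TemporaryDirectory() as d:
    for buf in (4096, 65536, 1 << 20):
        path = os.path.join(d, f"buf{buf}.dat")
        t = write_with_buffer(path, buf)
        assert os.path.getsize(path) == len(payload)  # same data either way
        print(f"buffer {buf:>8}: {t*1e3:.2f} ms")
```

On a modern OS the differences are often small, for exactly the reason given above: the kernel's own write-behind buffering absorbs the small requests regardless of what the application asks for.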

Again, I would very much like to hear about the experiences of other
programmers on the CCL in dealing with I/O efficiency issues.  Good question,
Claudio!

Best regards,

-Daniel

-- 
T. Daniel Crawford, Ph.D.                              Institute for Theoretical Chemistry
crawdad@jfs1.cm.utexas.edu                      Departments of Chemistry and Biochemistry
http://zopyros.ccqc.uga.edu/~crawdad/                The University of Texas




From: Laurence Cuffe <Laurence.Cuffe@ucd.ie>
To: PEROTT@if.ufrgs.br
Date: Wed, 08 Mar 2000 10:11:40 +0000


Hmm. The fastest way to get big data sets in and out of a machine
is to stripe them, i.e. write the data over an array of disk platters
and sets of disks so that all disks are being read at once. I know
Digital Unix can do this and I'm sure many other OSes can do it
too. Again, big disk caches help.  This kind of problem has arisen in
large database applications and in digital recording, where you
want sustained transfer rates: a lot of hard disks can look good on
peak data transfer rates but fall down for sustained transfers.
Hope this helps. Larry Cuffe



From:    IN%"garciae@boojum.Colorado.EDU"  "Edgardo Garcia"  8-MAR-2000 13:56:06.34
To:      IN%"PEROTT@if.ufrgs.br"

Hi Claudio,

[translated from Portuguese]

Regarding your questions: I recently had problems with the
Gaussian software precisely because it puts everything into
a single file. During an MP2 calculation that consumes a lot
of memory, the file grew so large that it exceeded the maximum
size allowed by the operating system, in this case 4 GB. We
have to try to reconfigure the system to see whether we can
raise that limit or make it unlimited.

My suggestion is that it would be better to put the data in
separate files, especially if they are blocks with different
information that may even be accessed in different ways and
only at certain moments. Smaller files are faster to read.

Best regards,

Edgardo

Edgardo Garcia
Universidade de Brasilia



From: Jochen Küpper <jochen@pc1.uni-duesseldorf.de>
To: PEROTT@if.ufrgs.br
Cc: chemistry@server.ccl.net
Date: Wed, 08 Mar 2000 17:05:51 +0100 (CET)


Use RAID. That defers the first problem back to the OS or hardware
designers. Striping should give you the best average performance.

Many OSes allow software RAID without any additional hardware, but
if you are willing to spend the cash, you could use hardware RAID
as well :-)
Fail-safety is something else to be considered, but probably not for
your "scratch" files.
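What striping buys you can be shown with a toy model (added as an illustration only, not a real RAID implementation): consecutive chunks of a file are dealt round-robin over the member disks, so all disks can work in parallel.

```python
# Toy RAID-0 striping: round-robin distribution of fixed-size chunks.
CHUNK = 4          # stripe unit in bytes (tiny, for illustration)
N_DISKS = 3

def stripe(data, n_disks=N_DISKS, chunk=CHUNK):
    """Distribute consecutive chunks of `data` round-robin over disks."""
    disks = [bytearray() for _ in range(n_disks)]
    for i in range(0, len(data), chunk):
        disks[(i // chunk) % n_disks] += data[i:i + chunk]
    return disks

def unstripe(disks, total_len, chunk=CHUNK):
    """Reassemble the original byte stream from the striped disks."""
    out = bytearray()
    offsets = [0] * len(disks)
    i = 0
    while len(out) < total_len:
        d = i % len(disks)
        out += disks[d][offsets[d]:offsets[d] + chunk]
        offsets[d] += chunk
        i += 1
    return bytes(out)

data = b"abcdefghijklmnopqrstuvwxyz"
disks = stripe(data)
assert unstripe(disks, len(data)) == data   # round trip preserves the file
print([bytes(d) for d in disks])
```

A real RAID-0 driver does the same bookkeeping at the block layer; the performance win comes from issuing the per-disk reads concurrently, which this sequential sketch deliberately leaves out.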

Jochen
-- 
Heinrich-Heine-Universität
Institut für Physikalische Chemie I
Jochen Küpper
Universitätsstr. 1, Geb. 26.43 Raum 02.29
40225 Düsseldorf, Germany
phone ++49-211-8113681, fax ++49-211-8115195
http://www.Jochen-Kuepper.de



From: Eugene Leitl <eugene.leitl@lrz.uni-muenchen.de>
To: Jochen Küpper <jochen@pc1.uni-duesseldorf.de>
Cc: PEROTT@if.ufrgs.br,  chemistry@server.ccl.net
Date: Wed, 08 Mar 2000 15:09:20 -0800 (PST)


Jochen Küpper writes:

 > Use RAID. That defers the first problem back to the OS or hardware
 > designers. Striping should give you the best average performance.

I would add to this very good advice the following:

1) use modern, large EIDE drives (currently, 40 GBytes for ~$250:
    http://www.pricewatch.com/1/26/2119-1.htm ). If money is not an
    issue at all, use 10 k rpm (UW etc.) SCSI drives:
    http://www.pricewatch.com/1/26/2150-1.htm
    If you have a money printing press, use hardware RAID with them:
    http://www.pricewatch.com/1/26/1537-1.htm
    http://www.linuxdoc.org/HOWTO/mini/DPT-Hardware-RAID.html

2) Use several large, modern EIDE drives as soft or hard RAID.
    http://www.linuxdoc.org/HOWTO/Software-RAID-HOWTO.html
    If you use EIDE drives, put each on its own EIDE host adapter
    interface. These are ~$30 apiece, for instance:
    http://www.buy.com/comp/product.asp?sku=10023443

3) if you're using Linux, use lots of RAM: 512 MByte-1 GByte or more
    (make sure it is recognized; you might have to supply
    options at boot, though probably not with newer kernels). Linux
    utilizes extra memory for file caching, which can speed things up
    dramatically.
 
4) use a recent (preferably cutting, not bleeding, edge) kernel,
    because sometimes improvements in caching algorithms/drivers
    translate into a very noticeable improvement in overall performance.

5) if you have large files (>2 GByte) you'll need 64-bit clean file
    systems (patches for vanilla Linux ext2 are available). Also, look
    around on http://www.beowulf-underground.org/ ; there are
    patches/software to be found there which might also improve
    performance even for non-parallel systems.

Regards,

Eugene Leitl


--
****************************************************
Claudio A. Perottoni
Universidade Federal do Rio Grande do Sul
Instituto de Fisica - Laboratorio de Altas Pressoes
Av. Bento Goncalves, 9500
CAIXA POSTAL 15051
91501-970  PORTO ALEGRE - RS
BRAZIL
PHONE:55-51-316-6500
FAX  :55-51-319-1762
http://www.if.ufrgs.br/~perott/index.html
****************************************************


From chemistry-request@server.ccl.net  Fri Mar 31 18:43:30 2000
Received: from helix.nih.gov (helix.nih.gov [128.231.2.3])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id SAA30314
	for <CHEMISTRY@ccl.net>; Fri, 31 Mar 2000 18:43:29 -0500
Received: (from mn1@localhost)
	by helix.nih.gov (8.9.3/8.9.3) id SAA2285841;
	Fri, 31 Mar 2000 18:43:28 -0500 (EST)
Date: Fri, 31 Mar 2000 18:43:28 -0500
From: "M. Nicklaus" <mn1@helix.nih.gov>
To: CHEMISTRY@ccl.net
cc: "M. Nicklaus" <mn1@helix.nih.gov>
Subject: Athlon vs Intel performance
Message-ID: <Pine.SGI.4.09.9L.10003311803550.334139-100000@helix.nih.gov>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

This is a follow-up to my previous posting of Wed, 29 Mar 2000:

> We've run one single direct speed comparison (I wouldn't call it a
> benchmark) so far, running the same Titan calculation on both a 600 MHz
> Pentium III (non-Coppermine) and an 800 MHz AMD K7 (Athlon) system, both
> running under Windows 98.  (The job was a single point energy calculation at
> the LMP2/6-31G** level for a nucleoside analog.)
>
> Time for completion (as per program output, but in agreement with wall time):
>
> 600 MHz Pentium III  -  7:05 h
> 800 MHz Athlon       -  2:25 h
>
> Correcting (linearly) for the difference in clock rate, the Athlon is still
> faster (for this job!) by a factor of 2.2.
>
> Specs:
>
> 600 MHz Pentium III  -  100 MHz FSB, 512 MB PC100 RAM, 27 GB ATA/66 7,200rpm HD
> 800 MHz Athlon       -  100 MHz FSB, 256 MB PC100 RAM, 27 GB ATA/66 7,200rpm HD
>
> However, we've had quite a few stability problems with the Athlon system so far
> (both in Titan and in general), but we think these are more likely related to
> hardware driver problems in Windows 98 than to true hardware instability. All in
> all, YMMV.


After cleaning up some of the Win.98 driver issues on the 800 MHz Athlon
machine, I re-ran the LMP2 single-point job.  With some substantial hard drive
slowdowns removed, the job ran quite a bit faster (see below).


Following the post of Matthias Mann <Matthias.Mann@chemie.tu-dresden.de>
on Wed, 29 Mar 2000:
> I have only comparisons of PII-450, PIII-500 and AMD Athlon 550.
> Look at:
>       http://www.chm.tu-dresden.de/edv/bench/bench9.html
> for GAMESS(US) benchmarks, and at:
>       http://www.chm.tu-dresden.de/edv/bench/bench10.html
> for GAUSSIAN98 A7 benchmarks.
>
> Normally the relation of relative performance looks as expected;
> however, in some cases the Pentium-III system is extremely slow:
> this is the case in MP2 and QCISD(T) calculations with Gaussian
> and in CASSCF calculations with Gamess.
> I'm not sure about the reason; maybe it is the cache, or the
> architecture-specific optimization (the code was compiled in both
> cases with pgf77 on a PII system, and in the Gaussian case
> maybe the optimized BLAS plays a role).

I ran an HF single-point job on the same molecule.  In keeping with the
GAMESS(US) and GAUSSIAN98 results, the extreme slowdown of the Pentium III as
observed for the LMP2 job did not occur for the HF calculation.

                        LMP2      HF

600 MHz Pentium III  -  7:05 h   24:11 min
800 MHz Athlon       -  1:44 h   16:51 min

ratio P III / K7        3.057    1.076
(clock rate-adjusted)

So this seems to be a case not of "your mileage may vary" but of "your mileage
DOES vary strongly"...
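The clock-rate-adjusted ratios in the table can be reproduced from the raw times (a sketch added here; the small discrepancy with the quoted 3.057 is presumably rounding in the original timings):

```python
# Clock-rate-adjusted PIII/Athlon time ratios from the table above.
def adjusted_ratio(t_piii, t_athlon, f_piii=600, f_athlon=800):
    """Time ratio, linearly corrected for the clock-frequency difference."""
    return (t_piii / t_athlon) * (f_piii / f_athlon)

lmp2 = adjusted_ratio(7 * 60 + 5, 1 * 60 + 44)     # 7:05 h vs 1:44 h, in min
hf = adjusted_ratio(24 + 11 / 60, 16 + 51 / 60)    # 24:11 vs 16:51, in min
print(f"LMP2: {lmp2:.3f}, HF: {hf:.3f}")           # ~3.065 and ~1.076
```

An adjusted ratio of 1.0 would mean the two CPUs do the same work per clock cycle; the LMP2 value of ~3 is what makes this job such an outlier.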

As far as I can determine (from WinTop and System Monitor), these jobs all
appear to be mostly CPU-bound and not I/O-bound, taking up 99+ % CPU cycles most
of the time with only intermittent disk writes/reads.  (The Athlon hard drive
had previously been too slow by a factor of 5.)  All other possible parameters
that people asked about, such as Titan version, Windows virtual memory setup,
hard drive access times etc. are as similar/identical as I can make/determine
them to be.  The instabilities that we have with the Athlon system are,
unfortunately, not yet completely removed.

Since I don't have access to the Titan source code, I can't say anything about
compilation etc.  Maybe Schroedinger can comment?

Marc

------------------------------------------------------------------------
 Marc C. Nicklaus                        National Institutes of Health
 E-mail: mn1@helix.nih.gov               Bldg 37, Rm 5B29
 Phone:  (301) 402-3111                  37 Convent Dr, MSC 4255       
 Fax:    (301) 402-2275                  BETHESDA, MD 20892-4255    USA 
      http://rex.nci.nih.gov/RESEARCH/basic/medchem/mcnbio.htm
    Laboratory of Medicinal Chemistry, National Cancer Institute
------------------------------------------------------------------------


