From tamasgunda@tigris.klte.hu  Tue Aug 29 04:26:03 1995
Received: from tigris.klte.hu  for tamasgunda@tigris.klte.hu
	by www.ccl.net (8.6.10/950822.1) id EAA22898; Tue, 29 Aug 1995 04:19:54 -0400
Message-Id: <199508290819.EAA22898@www.ccl.net>
Received: from anti02 (anti02.chem.klte.hu) by tigris.klte.hu (MX V4.1 VAX)
          with SMTP; Tue, 29 Aug 1995 10:19:48 EDT
Sender: <tamasgunda@tigris.klte.hu>
X-MX-Warning:   Warning -- Invalid "From" header.
From: "tamasgunda@tigris.klte.hu" <tamasgunda@www.ccl.net>
To: chemistry@www.ccl.net
Date: Tue, 29 Aug 1995 10:19:42 +1
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
Subject: Principal components summary
Priority: normal
X-mailer: Pegasus Mail/Windows (v1.22)


Some days ago I sent the following summary to CCL, but somehow
only a small part arrived. Here it is again.
---------------------------------------------------------------
Here is the summary of my question concerning the calculation of
principal components and plane fitting to points:

The original question was:

I am looking for algorithms/numerical methods for determining
the principal components of a data set. In geometric terms, I have
a number of points given by y1, y2 and y3, and I am looking for the
best-fitting plane through the points, i.e. the plane spanned by the
two largest principal components.

Please do not refer me to programs like Mathcad etc. I need it in
the form of a numerical method, as I'd like to use it as a procedure
in a program.


***************************************************************
**
That looks like a rather straightforward least-squares problem. You want
the parameters a, b, c in the equation for a plane y3 = a y1 + b y2 + c.
So you make up a matrix X:

           y1(1)   y2(1)   1  
           y1(2)   y2(2)   1
            ...
           y1(n)   y2(n)   1

and a vector Z = (y3(1), y3(2), ..., y3(n)), plus a 3-component vector
of unknowns A = (a,b,c). Then

      A = X^+ Z

gives you the best values for the parameters, where X^+ is the
generalized inverse of X.
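
The recipe above can be sketched in a few lines of Python. This is only an illustration, not code from the original reply: the function name fit_plane_z and the sample points are assumptions, and the normal equations (X^T X) A = X^T Z are solved instead of forming X^+ explicitly, which gives the same A as long as the columns of X are linearly independent.

```python
def fit_plane_z(points):
    """Fit y3 = a*y1 + b*y2 + c by ordinary least squares.

    Solves the normal equations (X^T X) A = X^T Z, which gives the same
    A as X^+ Z when the columns of X are linearly independent.
    """
    # Accumulate the 3x3 matrix X^T X and the 3-vector X^T Z.
    m = [[0.0] * 3 for _ in range(3)]
    rhs = [0.0] * 3
    for y1, y2, y3 in points:
        row = (y1, y2, 1.0)
        for r in range(3):
            rhs[r] += row[r] * y3
            for c in range(3):
                m[r][c] += row[r] * row[c]

    # Gaussian elimination with partial pivoting on the augmented matrix.
    aug = [m[r] + [rhs[r]] for r in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("points do not determine a unique plane")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            f = aug[r][col] / aug[col][col]
            for c in range(col, 4):
                aug[r][c] -= f * aug[col][c]

    # Back substitution.
    sol = [0.0] * 3
    for r in (2, 1, 0):
        s = aug[r][3] - sum(aug[r][c] * sol[c] for c in range(r + 1, 3))
        sol[r] = s / aug[r][r]
    return tuple(sol)  # (a, b, c)


# Hypothetical sample points lying exactly on y3 = 2*y1 - y2 + 3
pts = [(0, 0, 3), (1, 0, 5), (0, 1, 2), (1, 1, 4), (2, 1, 6)]
a, b, c = fit_plane_z(pts)
```

Note that this minimizes the residuals along y3 only, so it cannot describe a plane that is parallel to the y3 axis.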

Konrad Hinsen                     | E-Mail: hinsenk@ere.umontreal.ca
Departement de chimie             | Tel.: +1-514-343-6111 ext. 3953
Universite de Montreal            | Fax:  +1-514-343-7586
C.P. 6128, succ. A                | Deutsch/Esperanto/English/Nederlands/
Montreal (QC) H3C 3J7             | Francais (phase experimentale)

***************************************************************
**

From:             gordonh@chem.QueensU.CA (Heather Gordon)
To:               tamasgunda@tigris.klte.hu
Subject:          pca


There are listings for FORTRAN programs pertaining to all aspects of
factor analysis, including PCA, in

"Factor Analysis of Data Matrices", P. Horst, Holt, Rinehart and
Winston, New York, 1965.

I have also written my own code, using the routines TRED2 and TQLI
from "Numerical Recipes in Fortran" to diagonalize the covariance
matrix and obtain the eigenvalues and eigenvectors.

Heather Gordon
Department of Chemistry
Queen's University
Kingston, Ontario

***************************************************************
     What you say you want sounds a lot like the transformation of a
molecule to the coordinate system in which the moment-of-inertia tensor is
diagonal. If your (y1, y2, y3) components are *orthogonal*, you ought to be
able to use a similar approach. This kind of code is buried in all sorts of
programs. It should be a simple matter to adapt a suitable code fragment to
your use.

++++++++++++++++++++++++++++++++++++++++++++++++++
                           FREDERIC A. VAN-CATLEDGE

Scientific Computing Division         ||   Office: (302) 695-1187 or 529-2076
Central Research & Development Dept.  ||
The DuPont Company                    ||      FAX: (302) 695-9658
P. O. Box 80320                       ||
Wilmington DE 19880-0320              || Internet: fredvc@esvax.dnet.dupont.com

*************************************************************
**
Tamas,
     A friend sent me your e-mail on the subject of an algorithm for
calculating the first two PCs from a dataset. I have written PCA (NIPALS)
routines in BASIC and C and will be happy to send them to you, if
required. I also have a book (Numerical Recipes) which details PCA
(Singular Value Decomposition), which does a similar thing. If you still
need some help, send me an e-mail.

Steve Gurden,
Bristol Chemometrics Group,
School of Chemistry,
University of Bristol,
Cantock's Close,
BRISTOL BS8 1TS

Tel: +44 (0)117 9289000 extn. 4421
e-mail: S.P.Gurden@bris.ac.uk

***************************************************************
**

  Dear Dr. Gunda,

  One of the simplest and most elegant methods to perform Principal
Component Analysis is the NIPALS algorithm. It can be formulated
in a few lines of code and it works really well.

  You can find it in the following references:

* Analytica Chimica Acta, vol. 185, p. 1-17 (1986)
* Chemometrics and Intelligent Laboratory Systems, vol. 2, p. 37-52 (1987)

  I hope this helps you.

  Sincerely.


  Jose I. Garcia
--

Dr. Jose Ignacio Garcia-Laureiro              Phone : 34-(9)76-350475
Departamento de Quimica Organica              Fax   : 34-(9)76-567920
Instituto de Ciencia de Materiales de Aragon  e-mail: jig@qorg.unizar.es
C.S.I.C.-Universidad de Zaragoza                      jig@msf.unizar.es
E-50009 ZARAGOZA (SPAIN)

***************************************************************
**

I thought the principle (no pun intended) was quite simple; just calculate
the variance/covariance matrix of the variables (or better, the correlation
matrix) and diagonalize that. You can get the correlation matrix by
autoscaling the data and then calculating the covariance matrix.

Next, select the number of principal components that you want by
calculating the cumulative sum of the eigenvalues divided by the sum of all
the eigenvalues (the eigenvalues and eigenvectors need to be sorted by
size, either by the diagonalization procedure or afterwards).


I mean:

T = \sum_i \lambda_i

suppose, as an example:

\lambda_1 / T                     = 90 %
(\lambda_1 + \lambda_2) / T       = 98 %
 ...
(\lambda_1 + \lambda_2 + ...) / T = ..

If this is sufficiently accurate for you, use the subspace spanned by the
first two eigenvectors as your new space, and project all data points onto
it. Otherwise, use more eigenvectors until (\sum_j \lambda_j) / T is
sufficiently near 100 % (1.00), and project the data points onto that
subspace (unfortunately it is then no longer a 2-D plane, which would be
easiest for viewing).
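
The selection rule above might look like this in Python. It is a sketch only: the names covariance_matrix and components_needed, the threshold argument and the sample eigenvalues are assumptions made for illustration, and the eigenvalues themselves are expected to come from whatever diagonalization routine you use.

```python
def covariance_matrix(data):
    """Sample covariance matrix; each row of `data` is one observation."""
    n = len(data)
    nvar = len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(nvar)]
    cov = [[0.0] * nvar for _ in range(nvar)]
    for row in data:
        for j in range(nvar):
            for k in range(nvar):
                cov[j][k] += (row[j] - means[j]) * (row[k] - means[k])
    return [[v / (n - 1) for v in r] for r in cov]


def components_needed(eigenvalues, threshold=0.95):
    """Number of leading components whose eigenvalue sum reaches
    `threshold` (a fraction of T, the sum of all eigenvalues)."""
    ordered = sorted(eigenvalues, reverse=True)  # largest first
    total = sum(ordered)                         # this is T
    running = 0.0
    for count, lam in enumerate(ordered, start=1):
        running += lam
        if running / total >= threshold:
            return count
    return len(ordered)


# Hypothetical eigenvalues giving 90 %, 98 % and 100 % cumulatively
k = components_needed([9.0, 0.8, 0.2], threshold=0.95)
```

With these sample eigenvalues two components are enough for a 95 % threshold, so the projection would still be onto a plane.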

Hopefully you can use this information; I can't think of any references
off-hand.
There is a book by Box (and Hunter?) which describes these multivariate
techniques in great detail.
Sorry I can't help you better,

Frits


Frits Daalmans
OIO Conformational Analysis
Gorlaeus Laboratoria
Leiden, The Netherlands
E-mail: frits@chemde4.leidenuniv.nl
Tel: [+31] (0)71-274505

***************************************************************
**

Check the NIST CD for handwritten characters -- there is a principal
component algorithm coded there (it may be in C, but you could then recode
it in FORTRAN).  There are several sites that catalog algorithms, too --
here are a few: http://gams.nist.gov/,
http://netlib.att.com/netlib/att/cs/home/1127.ht,
http://www.netlib.org/.

Good luck.

Joe McDaniel
joe@psiint.com

***************************************************************

Dear Dr. Gunda,
There are many references to Principal Component Analysis. It is actually
a straightforward problem in matrix diagonalization. However, if your real
goal is simply to fit a plane to a set of points (minimizing the sum of
squares of the distances of the points to the plane) or a line to a set of
points (minimizing the sum of squares of the distances of the points to the
line), then this is also a simple matrix diagonalization problem whose
solution is given and discussed in the following references:

"To Fit a Plane or Line to a Set of Points by Least Squares",
V. Schomaker, J. Waser, R.E. Marsh, and G. Bergman, Acta Cryst.,
vol. 12, pp. 600-604 (1959).

and

"To fit a plane to a set of points by least squares",
D.M. Blow, Acta Cryst., vol. 13, p. 168 (1960).

Regards,

Marvin Waldman, Ph.D.
Director, Rational Drug Design
Biosym Technologies, Inc.
e-mail: marvin@biosym.com

***************************************************************
**

Dear Dr. Gunda,

    There are two ways to do what you want (as I understand it). One is to
use a statistical analysis package which supports principal components
analysis as statisticians define it.  The plane you are looking for is the
one defined by the first two principal components.

    The other is to perform the equivalent of a moment of inertia
calculation with all masses set to 1.  First find the mean value of each of
the three coordinates (call them x, y, z).  This is equivalent to your
center of mass. Build the symmetric 3x3 tensor with the following elements,
summed over all points (distances are relative to the centroid):

    y*y + z*z         -x*y           -x*z
      -x*y          x*x + z*z        -y*z
      -x*z            -y*z         x*x + y*y
    
    Diagonalize the tensor (the Jacobi method works well here - see any of
the "Numerical Recipes" series by Press et al.).  The transformation matrix
gives you your three principal coordinates.
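
A minimal Python sketch of this procedure follows. It is an illustration under stated assumptions, not the Numerical Recipes routine: the function names, the sweep limit and the tolerance are made up here, and the Jacobi loop is a plain cyclic variant. The axis with the largest moment of inertia is taken as the normal of the best-fit plane, because the points are spread out perpendicular to that axis.

```python
import math


def inertia_tensor(points):
    """Unit-mass moment-of-inertia tensor about the centroid."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    t = [[0.0] * 3 for _ in range(3)]
    for px, py, pz in points:
        x, y, z = px - cx, py - cy, pz - cz
        t[0][0] += y * y + z * z
        t[1][1] += x * x + z * z
        t[2][2] += x * x + y * y
        t[0][1] -= x * y
        t[0][2] -= x * z
        t[1][2] -= y * z
    t[1][0], t[2][0], t[2][1] = t[0][1], t[0][2], t[1][2]
    return t


def jacobi_eigen(a, sweeps=50, tol=1e-24):
    """Cyclic Jacobi rotations for a small real symmetric matrix.

    Returns (eigenvalues, v); column k of v is the k-th eigenvector.
    """
    n = len(a)
    a = [row[:] for row in a]
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[i][j] ** 2 for i in range(n) for j in range(i + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                diff = a[q][q] - a[p][p]
                if diff == 0.0:
                    theta = math.copysign(math.pi / 4.0, a[p][q])
                else:
                    theta = 0.5 * math.atan(2.0 * a[p][q] / diff)
                c, s = math.cos(theta), math.sin(theta)
                for k in range(n):  # rotate columns p and q: A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = c * akp - s * akq
                    a[k][q] = s * akp + c * akq
                for k in range(n):  # rotate rows p and q: A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = c * apk - s * aqk
                    a[q][k] = s * apk + c * aqk
                for k in range(n):  # accumulate eigenvectors: V <- V J
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p] = c * vkp - s * vkq
                    v[k][q] = s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v


def plane_normal(points):
    """Axis of largest moment of inertia, i.e. the best-fit plane normal."""
    values, vectors = jacobi_eigen(inertia_tensor(points))
    k = max(range(3), key=lambda i: values[i])
    return [vectors[r][k] for r in range(3)]


# Hypothetical sample points lying exactly on the plane z = x + y
pts = [(0, 0, 0), (1, 0, 1), (0, 1, 1), (1, 1, 2), (2, 1, 3)]
normal = plane_normal(pts)
```

The other two eigenvectors span the plane itself, so they play the role of the first two principal components.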

Hope this helps,
Paul Soper

-----------------------------------------------------------------
Paul Soper                        All the usual disclaimers apply
DuPont Central Research             soperpd@esvax.dnet.dupont.com  
P.O. Box 80328                                 Tel (302)-695-1757  
Wilmington, DE 19880-0328                      FAX (302)-695-8412  

***************************************************************
**

   According to the math of the normal least squares procedure, as far as
I know, it is NOT the distances between the points and the line that are
minimized (i.e. along a perpendicular from a point to the line), but the
differences between the measured and the calculated y values, each of which
is represented graphically by a line segment between the point and the
regression line, parallel to the y axis:

That depends on the error criterion that you use. The one sent yesterday
(and probably some others too) does indeed minimize the error along one
coordinate axis (and in fact works only if that axis does not lie in the
plane that you want to fit). If you want to minimize the distances of the
points from the plane, use the normal form for the plane:

    n x - d = 0,

where n is the normalized normal vector, x is the coordinate vector, and d
the distance of the origin from the plane.  The distance of any point y
from this plane is simply given by n y - d, so you must minimize

  __
  \    |           | 2
  /    | n y_i - d |
  --
  i

with the constraint that n is normalized, so you need a least-squares
minimizer that can handle constraints (e.g. via Lagrange multipliers). Or
use a "dirty hack": if you know (or assume) that d will never be zero (i.e.
the plane will not contain the coordinate origin), set d = 1 and use an
unnormalized normal vector, whose length then is the inverse of the
distance from the origin.
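
For this particular problem the Lagrange-multiplier route can be carried through by hand, which shows why the other replies in this summary speak of matrix diagonalization. The derivation below is a sketch; the symbols \mu (the multiplier), \bar{\mathbf{y}} (the centroid), N (the number of points) and S (the scatter matrix) are introduced here for illustration and do not appear in the original message.

```latex
\begin{align*}
F(\mathbf{n}, d, \mu) &= \sum_i (\mathbf{n}\cdot\mathbf{y}_i - d)^2
                         - \mu\,(\mathbf{n}\cdot\mathbf{n} - 1) \\
\frac{\partial F}{\partial d} = 0
  &\;\Rightarrow\; d = \mathbf{n}\cdot\bar{\mathbf{y}},
  \qquad \bar{\mathbf{y}} = \frac{1}{N}\sum_i \mathbf{y}_i \\
\sum_i (\mathbf{n}\cdot\mathbf{y}_i - d)^2
  &= \mathbf{n}^{T} S\,\mathbf{n},
  \qquad S = \sum_i (\mathbf{y}_i - \bar{\mathbf{y}})
                    (\mathbf{y}_i - \bar{\mathbf{y}})^{T} \\
\frac{\partial F}{\partial \mathbf{n}} = 0
  &\;\Rightarrow\; S\,\mathbf{n} = \mu\,\mathbf{n}
\end{align*}
```

So n is an eigenvector of the symmetric 3x3 matrix S, and because the sum of squared distances then equals n^T S n = \mu, the eigenvector belonging to the smallest eigenvalue is the one to take. The plane passes through the centroid, since d = n \bar{y}.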


------------------------------------------------------------------------------
-
Konrad Hinsen                     | E-Mail: hinsenk@ere.umontreal.ca
Departement de chimie             | Tel.: +1-514-343-6111 ext. 3953
Universite de Montreal            | Fax:  +1-514-343-7586
C.P. 6128, succ. A                | Deutsch/Esperanto/English/Nederlands/
Montreal (QC) H3C 3J7             | Francais (phase experimentale)

------------------------------------------------------------------------------

***************************************************************
**

Tamas,

        I think I have solved a problem fairly similar to the one you are
interested in.  My problem was less general and involved finding the
best-fit plane to a group of four points centred around a fifth point, and
arose because I am looking at compression strain in tetracoordinate carbon
systems.  For example the molecule [4.4.4.4]fenestrane C9H12
                               H
                    H2C__C__CH2
                      |  |  |
                     HC-----CH
                      |  |  |
                    H2C--C--CH2
                         H
contains a central "flattened" tetracoordinate carbon atom.  One way to
determine the flattening at any C(C)4 moiety is to find the best-fit plane
for the four alpha-C atoms passing through the central C atom.  This can be
done simply by finding the eigenvectors of what we have termed the
Geometry Tensor.
        The Geometry Tensor is constructed by multiplying the matrix D (the
vectors to your points -- in this case the 4 alpha-C atoms) by its
transpose to give a 3x3 Real Symmetric Matrix (routines to find the
eigenvectors/values of any RSM can be found in EISPACK and similar sets of
routines).  The eigenvectors of this matrix correspond to the minimum (the
best fit, which has the smallest eigenvalue), the maximum and one other
extremum of the sum of squared distances, i.e. they form a set of cartesian
axes -- x, y and z -- and the one with the smallest eigenvalue is the
normal to your best-fit plane.
     The mathematics behind this is actually quite simple and involves
setting up the equations to minimise d' = (sum of squared distances of your
points to an arbitrary plane), after which the need to form and find the
eigenvectors of the "Geometry Tensor" becomes obvious.
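
As a rough illustration of the Geometry Tensor idea (this is not the Fortran code mentioned in this message), the Python sketch below builds D^T D and extracts the eigenvector with the smallest eigenvalue. To stay short it swaps in a shifted power iteration instead of an EISPACK routine; the function names, the start vector, the iteration count and the sample vectors are all assumptions.

```python
import math


def geometry_tensor(vectors):
    """3x3 matrix D^T D, where the rows of D are the vectors to the points."""
    g = [[0.0] * 3 for _ in range(3)]
    for v in vectors:
        for r in range(3):
            for c in range(3):
                g[r][c] += v[r] * v[c]
    return g


def best_fit_normal(vectors, iterations=500):
    """Unit eigenvector of D^T D with the smallest eigenvalue.

    Power iteration on (trace(G) * I - G): its dominant eigenvector is
    the eigenvector of G that has the smallest eigenvalue.
    """
    g = geometry_tensor(vectors)
    trace = g[0][0] + g[1][1] + g[2][2]
    shifted = [[(trace if r == c else 0.0) - g[r][c] for c in range(3)]
               for r in range(3)]
    n = [1.0, 0.7, 0.4]  # arbitrary start vector (an assumption)
    for _ in range(iterations):
        n = [sum(shifted[r][c] * n[c] for c in range(3)) for r in range(3)]
        length = math.sqrt(sum(x * x for x in n))
        n = [x / length for x in n]
    return n


# Hypothetical vectors from a central atom to four neighbours that all
# lie in the plane x + y + z = 0
alpha = [(1, -1, 0), (-1, 1, 0), (1, 1, -2), (-1, -1, 2)]
normal = best_fit_normal(alpha)
```

Because the vectors are taken relative to the central atom, the fitted plane is forced through that atom; taking them relative to the centroid instead gives the unconstrained best-fit plane. The power iteration converges slowly when the two smallest eigenvalues are close, so a proper symmetric eigensolver is the safer choice in production code.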
     Naturally this can be readily extended to any number of atoms and any
origin.  I have used this method to re-orient molecules that were optimised
in C1 symmetry (in cartesians) which exhibit higher symmetry but where the
axes/planes of symmetry do not lie on the x, y or z axes of my final
cartesians.
        I might be able to send you my Fortran code (it is very simple) if
you think this would help, but I will need to check the origin of the
eigenvector-solving routines that I use (to make sure we do not break
anyone's intellectual property rights, as I did not write these routines
myself -- I think they came from EISPACK).

        Cheers,

                Danne

 Danne R Rasmussen, PhD Student                     phone:   +61 6 249-3771
 Research School of Chemistry                         fax:   +61 6 249-0750
 Australian National University
 Canberra ACT 0200                             e-mail: danne@rsc.anu.edu.au

*****************************************************************


Hi,

     Your assumption about the fitting of the least squares of perpendicular
distances is correct .. this is what is done by linear regression.  A side
note: linear regression is used in teaching software, but programs are
always built around the matrix formulation.

     The way that you can fit perpendicular distances, as well as do
non-linear fits (constants other than linear coefficients), is by using
optimization methods, such as steepest descent, Newton-Raphson, etc. This
is just like doing geometry optimization, only you are minimizing your
perpendicular distances instead of the energy.

     The other possibility is spline methods (cubic splines are most
commonly used).  A spline method is not a least squares fit; it is an exact
fit. However, if there is noise in the data you fit exactly to the noise.

     Hope this helps.

                    Dave Young
                    young@slater.cem.msu.edu

***************************************************************
**

Dear Tamas,
     I am also a chemist, a chemometrician in fact, and am interested in
chemistry, statistics and computing all mixed together. Whenever I have a
mathematical problem like yours, I often find an answer in the excellent
book "Numerical Recipes in C: The Art of Scientific Computing" by
W.H. Press, S.A. Teukolsky, W.T. Vetterling and B.P. Flannery (Cambridge
University Press). There are also versions for FORTRAN and, I think, BASIC.
PCA is also described in various multivariate statistics and chemometrics
books, such as "Multivariate Calibration" by H. Martens and T. Naes (Wiley)
and "Multivariate pattern recognition in chemometrics" edited by
R.G. Brereton (Elsevier).

     The source code below is written in Visual Basic, but conversion to
Basic, Fortran or C should be quite straightforward.

PCA: ijX = ikT * kjP
where X is an i by j data matrix, T is an i by k scores matrix and P is a
k by j loadings matrix. In your case, the number of components to extract
("kmax" in the source code) will be 2. For this listing, "evals" are the
eigenvalues, one for each PC, which I have defined as the sum of squares of
the scores. NIPALS calculates the PCs in order of importance (i.e. PCs with
the biggest eigenvalues come first). The loadings will be 2 vectors of unit
length, which represent the two new PC axes, and the scores matrix gives
the coordinates of the points on these new PC axes.

     Think about whether you wish to mean-centre the data prior to the PCA
or not - this will give different results!

     Maybe you already have the answer to your problem by the time I write
this, but best of luck anyway!


******************************************************
Sub NIPALS (X!(), kmax%, mc%, T!(), P!(), evals!())

    ' X     : i by j data matrix (overwritten with the residuals)
    ' kmax  : number of PCs to extract
    ' mc    : 1 = mean-centre the columns of X first
    ' T, P  : scores (i by k) and loadings (k by j), returned
    ' evals : sum of squares of the scores of each PC, returned
    Dim i%, imax%, j%, jmax%, k%, maxcol%
     imax = UBound(X, 1)
     jmax = UBound(X, 2)
    Dim sm#, ss#, cmaxss#, rmaxss#, smaxss#
    Dim cmax#(), rmax#(), smax#()

    ReDim cmax(1 To imax)
    ReDim rmax(1 To jmax)
    ReDim smax(1 To imax)

    ReDim T(1 To imax, 1 To kmax)
    ReDim P(1 To kmax, 1 To jmax)
    ReDim evals(1 To kmax)


    k = 1

    ' Mean-centre matrix if required: subtract from each column
    ' its own mean
    If mc = 1 Then
     For j = 1 To jmax
         sm = 0

         For i = 1 To imax
          sm = sm + X(i, j)
         Next i

         sm = sm / imax

         For i = 1 To imax
          X(i, j) = X(i, j) - sm
         Next i
     Next j
    End If


START1:

    ' Find column with greatest sum of squares
    cmaxss = 0#

    For j = 1 To jmax
     ss = 0#

     For i = 1 To imax
         ss = ss + X(i, j) ^ 2
     Next i
     
     If ss > cmaxss Then
         cmaxss = ss
         maxcol = j
     End If
    Next j

    For i = 1 To imax
     cmax(i) = X(i, maxcol)
    Next i


START2:

    ' Calculate row vector and its sum of squares
    rmaxss = 0#

    For j = 1 To jmax
     rmax(j) = 0#

     For i = 1 To imax
         rmax(j) = rmax(j) + (cmax(i) * X(i, j))
     Next i

     rmaxss = rmaxss + rmax(j) ^ 2
    Next j

    ' Scale this row vector to unit length, after checking that
    ' it is not a zero vector
    If rmaxss = 0# Then
     MsgBox ("Nipals( ): Exited before all PC's found")
     Exit Sub
    End If

    For j = 1 To jmax
     rmax(j) = rmax(j) / Sqr(rmaxss)
    Next j

    ' Calculate a new estimate of the PC scores
    smaxss = 0#

    For i = 1 To imax
     smax(i) = 0#

     For j = 1 To jmax
         smax(i) = smax(i) + (X(i, j) * rmax(j))
     Next j

     smaxss = smaxss + smax(i) ^ 2
    Next i

    ' Test to see if PC has converged. If so, store scores and
    ' loadings, adjust matrix, and look for next PC. If not, use
    ' last scores estimate to put back into the algorithm. The
    ' convergence criterion can be adjusted (e.g. 1e-6 for single
    ' precision).
    If Abs(smaxss - cmaxss) / cmaxss < .000000000001 Then
     evals(k) = smaxss

     For i = 1 To imax
         T(i, k) = smax(i)
     Next i

     For j = 1 To jmax
         P(k, j) = rmax(j)
     Next j


     For i = 1 To imax
         For j = 1 To jmax
          X(i, j) = X(i, j) - (smax(i) * rmax(j))
         Next j
     Next i

     If k = kmax Then
         Exit Sub
     End If

     k = k + 1

     GoTo START1
    Else
     For i = 1 To imax
         cmax(i) = smax(i)
     Next i

     cmaxss = smaxss

     GoTo START2
    End If

End Sub
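
For readers without Visual Basic, the same iteration can be sketched in Python. This is a loose port written for illustration, not the author's code: the names nipals, mean_centre, tol and max_iter are assumptions, the input is copied rather than overwritten, and the scores and loadings are returned as one list per component.

```python
import math


def nipals(x, kmax, mean_centre=True, tol=1e-12, max_iter=1000):
    """Return (scores, loadings, evals) for the first kmax PCs of x.

    x is a list of rows; a working copy is deflated after each component.
    """
    imax, jmax = len(x), len(x[0])
    x = [list(map(float, row)) for row in x]
    if mean_centre:
        for j in range(jmax):
            mean = sum(row[j] for row in x) / imax
            for row in x:
                row[j] -= mean

    scores, loadings, evals = [], [], []
    for _ in range(kmax):
        # Start from the column with the greatest sum of squares.
        col_ss = [sum(row[j] ** 2 for row in x) for j in range(jmax)]
        start = max(range(jmax), key=lambda j: col_ss[j])
        t = [row[start] for row in x]
        t_ss = col_ss[start]
        if t_ss == 0.0:
            break  # nothing left to explain
        for _ in range(max_iter):
            # Loadings p = X^T t, scaled to unit length.
            p = [sum(t[i] * x[i][j] for i in range(imax)) for j in range(jmax)]
            norm = math.sqrt(sum(v * v for v in p))
            p = [v / norm for v in p]
            # New scores t = X p.
            t = [sum(x[i][j] * p[j] for j in range(jmax)) for i in range(imax)]
            new_ss = sum(v * v for v in t)
            converged = abs(new_ss - t_ss) <= tol * t_ss
            t_ss = new_ss
            if converged:
                break
        scores.append(t)
        loadings.append(p)
        evals.append(t_ss)
        # Deflate: remove the component just found.
        for i in range(imax):
            for j in range(jmax):
                x[i][j] -= t[i] * p[j]
    return scores, loadings, evals


# Hypothetical sample points lying exactly on the plane z = x + y
pts = [(0, 0, 0), (1, 0, 1), (0, 1, 1), (1, 1, 2), (2, 1, 3)]
scores, loadings, evals = nipals(pts, kmax=2)
```

For points in three dimensions the two loading vectors span the best-fit plane through the centroid, and their cross product gives its normal.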


================

Steve Gurden,
Bristol Chemometrics Group,
School of Chemistry,
University of Bristol,
Cantock's Close,
BRISTOL BS8 1TS

Tel: +44 (0)117 9289000 extn. 4421
e-mail: S.P.Gurden@bris.ac.uk

***************************************************************
**
 
From: g80@chm.uri.edu
 Hi Dr. Tamas Gunda,

    I too have been interested in the question of fitting a plane to a set
of points (3D). As you correctly pointed out, this is a problem in total
least squares, not simple least squares. This problem has been approached
many times using a variety of algorithms (it is fundamental to
crystallography).

I have written a simple FORTRAN program based on the method of D. M. Blow,
Acta Cryst. (1960), 13, 168. One fundamental paper is V. Schomaker,
J. Waser, R. E. Marsh and G. Bergman, Acta Cryst. (1959), 12, 600. There
are many other papers on this subject. I hope this helps and good luck.


                                           Brian Schmitz

                                           Univ. of Rhode Island
***************************************************************
**
end summary

*****************************************************************************
   Tamas E. Gunda, Ph.D.               phone: (+36-52) 316666 ext 2479
   Research Group of Antibiotics       fax  : (+36-52) 310936
   L. Kossuth University               e-mail: tamasgunda@tigris.klte.hu
   POBox 36                                   
   H-4010 Debrecen
   Hungary
*****************************************************************************

From toukie@zui.unizh.ch  Tue Aug 29 05:56:04 1995
Received: from rzusuntk.unizh.ch  for toukie@zui.unizh.ch
	by www.ccl.net (8.6.10/950822.1) id FAA23681; Tue, 29 Aug 1995 05:48:43 -0400
From: <toukie@zui.unizh.ch>
Received: from rzurs10.unizh.ch by rzusuntk.unizh.ch (4.1/SMI-4.1.9)
	id AA24350; Tue, 29 Aug 95 11:48:39 +0200
Received: by rzurs10.unizh.ch (AIX 3.2/UCB 5.64/4.03)
          id AA38420; Tue, 29 Aug 1995 11:48:40 +0200
Message-Id: <9508290948.AA38420@rzurs10.unizh.ch>
Subject: Babel in DOS
To: chemistry@www.ccl.net
Date: Tue, 29 Aug 1995 11:48:39 +0200 (MEST)
X-Mailer: ELM [version 2.4 PL24 PGP2]
Content-Type: text
Content-Length: 413       


Dear Colleagues;

     Some time ago I asked a question pertaining to a difficulty I had in
making file format transformations using the DOS version of Babel.  I received
lots of useful advice.  Probably the _most_ useful suggestion was to type

                            babel -m

at the DOS prompt and then proceed from there.  (The rest of the steps are
extremely self-explanatory.)

Regards,

S. Shapiro
ZH

From acp37@rs1.rrz.Uni-Koeln.DE  Tue Aug 29 09:41:07 1995
Received: from rs1.rrz.Uni-Koeln.DE  for acp37@rs1.rrz.Uni-Koeln.DE
	by www.ccl.net (8.6.10/950822.1) id JAA26296; Tue, 29 Aug 1995 09:34:36 -0400
From: <acp37@rs1.rrz.Uni-Koeln.DE>
Received: by rs1.rrz.Uni-Koeln.DE id AA81642
  (5.67b/IDA-1.5 for CCL <chemistry@www.ccl.net>); Tue, 29 Aug 1995 15:33:48 +0200
Date: Tue, 29 Aug 1995 15:33:48 +0200 (MST)
To: CCL <chemistry@www.ccl.net>
Subject: Summary for "Calculation on biradical"
Message-Id: <Pine.A32.3.91.950829153238.48565A-100000@rs1.rrz.Uni-Koeln.DE>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII


Dear netters,

 some days ago I posted a question about calculations on a biradical. The 
original posting and the answers are attached. 

 Thanks to all who responded!
 
 Thorsten Koch


The original question:

Dear netters,

 At the moment I am doing ab initio calculations on
1,5-didehydronaphthalene, which is supposed to be a biradical. It is an
isomer of an ordinary closed-shell molecule which I am interested in. So,
to compare the energies of the molecules, I am interested in the
singlet ground state of the biradical.
 But this doesn't seem to be a trivial problem. It was no problem at all
to get the wavefunction for the triplet state. So I optimized the
geometry for the triplet state and tried to get some singlet
wavefunctions for this geometry.

 Here is what I got:

 1. UHF, 6-31G*, mult=3 : E=-382.059168, <S**2>=2.2470 (2.0346 after 
annihilation of first spin contaminant)

 2. UHF, 6-31G*, mult=1 : E=-381.906022, <S**2>=0.0000
 This result was the same as the result with RHF. In particular, the spin
density was zero everywhere. So I thought that this wouldn't be a proper
description of a biradical.

 3. UHF, 6-31G*, mult=1, GUESS=mix
 This calculation didn't converge at all.
 
 4. In fact I was able to get another singlet state wavefunction. The 
only way to get it was from a stability analysis and optimization from the 
wavefunction obtained with 2. The result was: E=-382.086893, 
<S**2>=2.2807 (8.1431 after annihilation of the first spin contaminant)
At least the spin density gave two unpaired electrons with different spin 
at the right carbons.

 Because DFT is known to give lower spin contamination, I tried it this way:

 5. UHF, 6-31G*/B3LYP, mult=3 : E=-384.510256, <S**2>=2.0136 (1.9996)
 
 6. UHF, 6-31G*/B3LYP, mult=1 : E=-384.484437, <S**2>=0.000
 Again the same result as with RHF, spin density zero everywhere.

 7. UHF, 6-31G*/B3LYP, mult=1, GUESS=mix : E=-384.518499, <S**2>=0.9392 
(0.1510 after annihilation) This calculation had no difficulties to converge.

 ROHF calculations for the singlet states always gave the same 
wavefunction as for RHF calculations.


 So there are a couple of questions:
 
 a) Why is it so difficult to get a UHF singlet wavefunction for a normal
biradical?

 b) Can I rely on the energies obtained when the spin contamination is
about 1.0 (Point 7.)? This is especially problematic because I want to
compare it to the energy of a closed-shell isomer. Since I got these
results with 6-31G*/B3LYP it seems to be the best I can get...
Is a geometry optimization sensible in this case?

 c) What can I say about the wavefunction of point 4.? The spin 
contamination is about 2.3 before annihilation (not a very small value 
for a singlet ;-) and about 8.1 AFTER annihilation of the first spin 
contaminant. I have no idea what this should mean!

 d) Am I right in thinking that the spin density of a system with two
unpaired electrons in the sigma frame, separated far enough from one
another, should exhibit these two electrons at the carbons they are
attached to?

e) Having all these problems in mind: Which way should I go to determine
the energy of the singlet ground state?


Sorry for this somewhat lengthy question, but I'll summarize the
responses, if there is interest :-)
-----------------------------------------------------------------------

The answers:


From: Steve Gwaltney <gwaltney@qtp.ufl.edu>

Your molecule sounds like an open-shell singlet.  If it is, you need
two determinants to describe the wavefunction.  Thus, any Hartree-Fock
calculation will fail.  What you need is a MCSCF calculation, or some
other way to deal with two determinantal wavefunction.

Steve


From: Christopher J Cramer <cramer@maroon.tc.umn.edu>

>  a) Why is it so difficult to get a UHF singlet wavefunktion for a normal 
> biradical? 
> 
   Because an open-shell singlet is a two-configuration wavefunction. It
requires an MCSCF approach to do it properly. The S^2=1 calculations are for
so-called 50:50 wavefunctions (half singlet, half triplet).

>  b) Can I rely on the energies obtained when the spin contamination is 
> about 1.0 (Point 7.)? This is especially problematic because I want to 
> compare it to the energiy of a closed-shell isomer. Since I got this 
> results with 6-31G*/B3LYP it seems to be the best I can get...
> Is a geometry optimization sensible in this case?
> 
   No, and no. You can adopt the so-called sum method of Ziegler and
double the energy difference between the triplet and the 50:50 wavefunction
to estimate the energy of the open-shell singlet. Although a terrible
approach at the HF level, comparing UDFT to RDFT seems to be OK for S-T gaps
(we have several papers on this). However, your B3LYP functional includes HF
exchange, which will degrade this comparison.
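
The arithmetic of that estimate can be sketched as follows, using the energies quoted in points 5 and 7 of the question. This is only an illustration of the bookkeeping: the function names and the hartree-to-kcal/mol factor are assumptions, and given the caveat about HF exchange in B3LYP the resulting number should not be read as a reliable singlet-triplet gap.

```python
HARTREE_TO_KCAL_PER_MOL = 627.5095  # conversion factor (assumed value)


def sum_method_singlet(e_triplet, e_mixed):
    """Estimate E(open-shell singlet) from the triplet and 50:50 energies.

    E_S - E_T = 2 * (E_mixed - E_T)  =>  E_S = 2 * E_mixed - E_T
    """
    return 2.0 * e_mixed - e_triplet


def s_squared(s):
    """<S**2> of a pure spin state with spin quantum number s."""
    return s * (s + 1.0)


# Energies quoted in points 5 and 7 of the question (hartree)
e_triplet = -384.510256
e_mixed = -384.518499

e_singlet = sum_method_singlet(e_triplet, e_mixed)
gap_kcal = (e_singlet - e_triplet) * HARTREE_TO_KCAL_PER_MOL

# A 50:50 singlet/triplet mixture has <S**2> = 0.5*0 + 0.5*2 = 1
mixed_s2 = 0.5 * s_squared(0.0) + 0.5 * s_squared(1.0)
```

With these inputs the estimated singlet lies about 10 kcal/mol below the triplet, and the ideal 50:50 value of <S**2> = 1 can be compared with the 0.9392 reported in point 7.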

>  c) What can I say about the wavefunction of point 4.? The spin 
> contamination is about 2.3 before annihilation (not a very small value 
> for a singlet ;-) and about 8.1 AFTER annihilation of the first spin 
> contaminant. I have no idea what this should mean!
> 
   It's garbage.

>  d) Am I right when I think that the spin density of a system with to 
> unpaired electrons in the sigma frame separated far enough from 
> another should exhibit these two electrons at the carbons they are 
> attached to?
> 
   For the closed-shell singlet, there is probably hybridization and orbital
splitting. For the open-shell and the triplet, your intuition is right. Much
of the literature on 1,4-didehydrobenzene discusses these topics.

> e) Having all these problems in mind: Which way should I go to determine 
> the energiy of the singlet ground state?
> 
   MCSCF is the best approach. Your DFT results are quite curious because
they indicate the open-shell singlet to be lower in energy than the triplet.
This is pretty unusual, and I'm not sure I believe it.

   If you would be interested in seeing some of our work on these issues, I'd
be happy to send you some papers.

Best regards,

Chris

-- 

From: Christian Koelle <ck@ws2.theochem.uni-hannover.de>

> 
> e) Having all these problems in mind: Which way should I go to determine 
> the energiy of the singlet ground state?
> 
Two ways: one good, the other better:

1) CI calculation. For a diradical, HOMO and LUMO should suffice as the
active space. In our calculations with a semiempirical method we use the
HOMO-LUMO single and double excitations, the double excitation being
decisive for the singlet.

2) MCSCF calculation. Here too, HOMO and LUMO as the active space.

Hope this has helped.

Regards,

Christian Koelle


From: Hans Ueli Suter <husuter@cscs.ch>

Dear Mr. Koch,

I would look at your problem from a slightly different point of view:
what you are effectively calculating are the excited states of ...
With a program like Gaussian, which has no direct symmetry constraint on
the orbitals, it is of course difficult to occupy them in the required
way. Be that as it may, the keyword would be Guess=Alter, and the numbers
of the HOMO and the LUMO would then have to be entered at the end of the
input; it would then singly occupy the HOMO and singly occupy the LUMO,
which would hopefully give you the right biradical.
I was never able to do this for density functionals as well, but UHF
should work. Alternatively, you can try to calculate the excited states
with CIS (the biradical will be among them). In terms of level, "CIS" is
roughly comparable to HF and not to CI. The results should therefore be
treated with caution. At this point the remark would naturally follow that
excited states are better treated with ..., but I will spare myself that;
it should be clear anyway. I hope this helps you.

                                   With kind regards,
                                        H.U. Suter


From: Frank Jensen <frj@dou.dk>

	Thorsten,
	UHF wave functions that are spin contaminated to the
degree you describe are essentially useless; even your triplet
UHF is pretty bad. The energetics will be all garbage. You need
a small MCSCF (or equivalently GVB) to get a reasonable zeroth-order
description. For your singlet biradical a [2,2]-CASSCF
may be OK (or an OSS GVB type wave function, which is the same).
But you won't get any dynamic correlation. To include that you
need an MRCI, or perhaps an MP2 on top of the MCSCF.
	Frank
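
The small-active-space idea in the replies above (HOMO and LUMO only, with
the double excitation mixing into the closed-shell singlet) can be sketched
as a 2x2 CI problem. This is only an illustration under stated assumptions:
just the two closed-shell configurations are kept (the open-shell singlet is
assumed to decouple by symmetry), the coupling element k is a plain number
supplied by the caller, and the energies are made-up inputs, not results for
any real molecule. The function name two_config_ci is hypothetical.

```python
import math

def two_config_ci(e1, e2, k):
    """Lowest root and configuration weights of the CI matrix [[e1, k], [k, e2]].

    e1, e2 : energies of the HOMO^2 and LUMO^2 configurations (assumed inputs)
    k      : coupling element between them (assumed input)
    """
    mean = 0.5 * (e1 + e2)
    half_gap = 0.5 * (e2 - e1)
    e_low = mean - math.hypot(half_gap, k)
    if k == 0.0:
        # No mixing: the lower configuration carries all the weight.
        return (e_low, 1.0, 0.0) if e1 <= e2 else (e_low, 0.0, 1.0)
    # Unnormalised eigenvector from (e1 - e_low) * c1 + k * c2 = 0.
    c1, c2 = k, e_low - e1
    norm2 = c1 * c1 + c2 * c2
    return e_low, c1 * c1 / norm2, c2 * c2 / norm2

if __name__ == "__main__":
    # Degenerate configurations: the two weights become equal, which is the
    # situation a single closed-shell determinant cannot describe.
    print(two_config_ci(0.0, 0.0, 0.1))
```

When e1 and e2 approach each other the two weights approach 0.5 each, which
is why a single-determinant description breaks down for a biradical.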


From: Matthew.Harbowy@tjlus.sprint.com

Singlet biradicals are no trivial problem. You see, your calculations
are on the *lowest* singlet, not the biradical. The calculations are 
converging to (what I believe is called) a Renner-Teller distorted 
configuration of the orbitals because it wants to do this

                                                       ------
        U         D
     -------    ------       distorts to
                                                         UD
                                                       ------

What you want to do is not UHF, where your spins are all crazy mixed up. What 
you want to do is CI, where you can specify exactly what sort of configuration 
you want, and send it on its merry way. I believe QCISD and CASSCF are the 
methods du jour for this sort of problem.

matt


>From jkl@ccl.net Sun Aug 27 10:42 EDT 1995
Received: from krakow.ccl.net  for jkl@ccl.net
	by bedrock.ccl.net (8.6.10/950822.1) id KAA29047; Sun, 27 Aug 1995 10:42:57 -0400
Received: for jkl@ccl.net
	by krakow.ccl.net (8.6.10/920428.1525) id KAA01963; Sun, 27 Aug 1995 10:42:54 -0400
From: Jan Labanowski <jkl@ccl.net>
Date: Sun, 27 Aug 1995 10:42:54 -0400
Message-Id: <199508271442.KAA01963@krakow.ccl.net>
To: chemistry@ccl.net
Subject: MOPAC
Cc: jkl@ccl.net
Content-Type: text
Content-Length: 296
Status: R


This is a test.  Mopac
 .
another dots
 ..
and another one
 .

-- 

Dr. Jan K. Labanowski, Senior Research/Supercomputer Scientist/Specialist, etc.
Ohio Supercomputer Center, 1224 Kinnear Rd, Columbus, OH 43212-1163
ph:(614)-292-9279,  FAX:(614)-292-7168,  E-mail: jkl@ccl.net  JKL@OHSTPY.BITNET


From brianh@scg.scg.fujitsu.com  Tue Aug 29 13:56:11 1995
Received: from mail.barrnet.net  for brianh@scg.scg.fujitsu.com
	by www.ccl.net (8.6.10/950822.1) id NAA02916; Tue, 29 Aug 1995 13:52:02 -0400
Received: from fujitsu1.fujitsu.com (fujitsu1.fujitsu.com [133.164.254.1]) by mail.barrnet.net (8.6.10/MAIL-RELAY-LEN) with SMTP id KAA20845 for <chemistry@www.ccl.net>; Tue, 29 Aug 1995 10:51:57 -0700
Received: from scg.scg.fai.com (scg.scg.fujitsu.com) by fujitsu1.fujitsu.com (4.1/SMI-4.1)
	id AA08691; Tue, 29 Aug 95 10:53:14 PDT
Received: by scg.scg.fai.com (4.1/SMI-4.1)
	id AA11489; Tue, 29 Aug 95 10:52:12 PDT
Date: Tue, 29 Aug 95 10:52:12 PDT
From: brianh@scg.scg.fujitsu.com (Brian Hammond)
Message-Id: <9508291752.AA11489@scg.scg.fai.com>
To: chemistry@www.ccl.net
Subject: Correction to Symposium Announcement


	CORRECTED SYMPOSIUM ANNOUNCEMENT AND CALL FOR PAPERS

	Monte Carlo Methods in Chemistry

	211th American Chemical Society National Meeting
	New Orleans
	March 24-29, 1996

	Sorry I messed up the dates on the previous announcement. The
	sessions are on the 24th and 25th.

Purpose:  This symposium will focus on the use of Monte Carlo in all
          of computational chemistry. There will be sessions on 
          Monte Carlo in classical dynamics, statistical mechanics,
          and quantum mechanics, both in theory and applications.
          
         
Sponsor:  The Computers in Chemistry Division of the American Chemical 
          Society.  

Program:  Sunday, March 24, 9 a.m. - 12 p.m.
	  Monte Carlo methods for Quantum Systems

	  Sunday, March 24, 2 p.m. - 5 p.m.
	  Monte Carlo methods in Classical Mechanics

	  Monday, March 25, 9 a.m. - 12 p.m.
	  Monte Carlo methods, general (optimization, etc.)

Format:	  There will be six 30 minute talks in each session for a 
	  total of 18 talks. All other contributions will be put 
	  into the poster sessions. Oral presentations are on a 
	  first-come-first-served basis.

Deadline: All abstracts must be submitted to me no later than October 1,
	  1995. Please send me e-mail a.s.a.p if you wish to give an
	  oral presentation. Those people who have already indicated
	  that they will definitely come just need to send me the ACS
	  abstract form. Those who have contacted me, but did not state
	  definitely that they wish to be on the program, please send 
	  e-mail to confirm whether or not you are coming.

Abstracts: All abstracts must be submitted on an OFFICIAL ACS abstract
	  form. If you don't have one, contact me, or you can have one sent to
	  you from the ACS web site, www.acs.org.


======================================================================== 
Brian L. Hammond                 _/_/_/_/    _/_/_/    _/_/_/_/_/   
Computational Research Div.     _/        _/      _/      _/
Fujitsu America, Inc.          _/_/_/    _/_/_/_/_/      _/
3055 Orchard Drive            _/        _/      _/      _/
San Jose, CA  95134          _/        _/      _/  _/_/_/_/_/
Tel: (408) 456-7322
Fax: (408) 456-7071
Email: brianh@fai.com


From smb@smb.chem.niu.edu  Tue Aug 29 14:41:11 1995
Received: from mp.cs.niu.edu  for smb@smb.chem.niu.edu
	by www.ccl.net (8.6.10/950822.1) id OAA04071; Tue, 29 Aug 1995 14:31:11 -0400
Received: from cz2.chem.niu.edu by mp.cs.niu.edu with SMTP id AA19296
  (5.67b/IDA-1.5 for <@mp.cs.niu.edu.chem.niu.edu:CHEMISTRY@www.ccl.net>); Tue, 29 Aug 1995 13:31:25 -0500
Received: from smb.chem.niu.edu by cz2.chem.niu.edu via SMTP (920330.SGI/890607.SGI)
	(for @mp.cs.niu.edu.chem.niu.edu:CHEMISTRY@www.ccl.net) id AA28877; Tue, 29 Aug 95 13:14:48 -0500
Received: by smb.chem.niu.edu (920330.SGI/890607.SGI)
	(for @cz2.chem.niu.edu:CHEMISTRY@www.ccl.net) id AA27111; Tue, 29 Aug 95 13:14:47 -0500
Date: Tue, 29 Aug 95 13:14:47 -0500
From: smb@smb.chem.niu.edu (Steven Bachrach)
Message-Id: <9508291814.AA27111@smb.chem.niu.edu>
To: CHEMISTRY@www.ccl.net
Subject: Proceeding of ECCC-1 Now available


The CDROM version of the Proceedings of the First Electronic Computational
Chemistry Conference will be published in September 1995. Advance ordering of
the CDROM is available at a significant discount. Information on the CDROM,
pricing and ordering forms is available on the web at URL

http://www.ari.net/chemnet/eccc-order.html

or by phone to ARInternet (310)459-7171

Steve

Steven Bachrach				
Department of Chemistry
Northern Illinois University
DeKalb, Il 60115			Phone: (815)753-6863
smb@smb.chem.niu.edu			Fax:   (815)753-4802


From jlye@tx.ncsu.edu  Tue Aug 29 14:42:39 1995
Received: from sparc4.tx.ncsu.edu  for jlye@tx.ncsu.edu
	by www.ccl.net (8.6.10/950822.1) id OAA04121; Tue, 29 Aug 1995 14:32:13 -0400
Received: from hamby.tx.ncsu.edu by sparc4.tx.ncsu.edu (8.6.9/ES17aug94)
	id OAA19454; Tue, 29 Aug 1995 14:32:18 -0400
Received: by hamby.tx.ncsu.edu (5.65b/SAM 12-13-90 16:56:22)
	id AA26970; Tue, 29 Aug 95 14:32:06 -0400
Date: Tue, 29 Aug 95 14:32:06 -0400
Posted-Date: Tue, 29 Aug 95 14:32:06 -0400
Message-Id: <9508291832.AA26970@hamby.tx.ncsu.edu>
To: chemistry@www.ccl.net
Cc: dhinks@tx.ncsu.edu, jlye@tx.ncsu.edu, Harold_Freeman@ncsu.edu
From: jlye@tx.ncsu.edu (Jason Lye)
Subject: MOPAC: Comments in Output File.
X-Mailer: Cem X11/Mailer Version 9.2mgm (Wed Jan 5 12:38:09 EST 1994)
Content-Type: text
Content-Length: 2626


Dear Net-surfers,

I posted a question to the list a few weeks ago concerning certain comments in 
MOPAC output files.  

Many thanks to:     	James Stewart
			Victor Rosas Garcia
			Andreas Goeller
			Richard Bone
			Dave Giesen

for their replies, summarised below:

} 1st Comment:
}
} `GRADIENT TEST NOT PASSED, BUT FURTHER WORK NOT JUSTIFIED'
} `SCF FIELD WAS ACHEIVED'
	
The default geometry criteria in MOPAC are set so that the geometry is
sufficiently well optimized for most work.
If you really want to force the gradient norm down nearer to zero,
you can do so by using other keywords (for example GNORM=1).
Some also suggested using PRECISE in addition to GNORM=0.1.
I was told that a gradient greater than 2 is bad, and that I
need to specify PRECISE to lower this figure in further calculations.
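
As a rough illustration of what "gradient norm" means in the advice above,
the sketch below computes the Euclidean norm of a list of gradient
components and compares it with the threshold of 2 mentioned in the replies.
The function names are hypothetical, the gradient values are made up, and
the units (assumed to be those printed in the MOPAC output) are not checked.
This is not MOPAC's own convergence test.

```python
import math

def gradient_norm(gradients):
    """Euclidean norm of the gradient components (made-up inputs here)."""
    return math.sqrt(sum(g * g for g in gradients))

def needs_tighter_optimization(gradients, threshold=2.0):
    """True if the norm exceeds the threshold quoted in the replies (2)."""
    return gradient_norm(gradients) > threshold

if __name__ == "__main__":
    final_gradients = [3.0, 4.0]           # illustrative values only
    print(gradient_norm(final_gradients))  # 5.0
    print(needs_tighter_optimization(final_gradients))
```

If the check returns True, the replies suggest rerunning with a tighter
criterion such as GNORM or PRECISE.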

} 2nd Comment:
} 
} `HERBERTS TEST WAS SATISFIED IN BFGS'

Herbert worked in the Dewar group, and devised a geometric test which Dr. 
Stewart retained in MOPAC.  From a comment in the AMPAC code: "The estimated 
distance from the current point to the minimum is less than TOLERA",
where TOLERA is a constant.
Some replies suggested that Herbert's test is not 100% reliable on its
own, and that the gradient norm should always be checked. 

BFGS stands for the Broyden-Fletcher-Goldfarb-Shanno method for function 
minimization.

Some suggest that the EF (Eigenvector Following) method be used for 
geometry optimizations instead of the BFGS method, as the EF method is a lot
more reliable.  EF is a gradient optimizer routine, and can get rid of the 
gradient problems if used with the keywords  EF LET DDMIN=0.0
If problems still persist, add XYZ to this list of keywords.

} Finally, can anyone recommend a good book which covers molecular modeling with 
} MOPAC, or other semi-empirical packages/methods?

Sadly, info on good books was thin on the ground:
	i) Read the MOPAC 93 manual.
 	ii) Practice, Practice, and more Practice with MOPAC (not a book)
	iii) "A Handbook of comp. Chem.", T. Clark, Wiley

Thanks everyone,  Hope that this is helpful, 

Jason.

_______________________________________________________________________________

Jason Lye,                       |    
Dye Synthesis Research Group,    |    
College Of Textiles,  Box 8301,  |     Type something funny here.
North Carolina State University, |     
Raleigh, N.C. 27695 - 8301       |     
                                 |      
      Ph:   (919) 515-6615       |                    
      jlye@tx.ncsu.edu           | 
_______________________________________________________________________________


From jorge.manrique@beilstein.com  Tue Aug 29 16:56:16 1995
Received: from uucp-1.csn.net  for jorge.manrique@beilstein.com
	by www.ccl.net (8.6.10/950822.1) id QAA07188; Tue, 29 Aug 1995 16:49:09 -0400
From: <jorge.manrique@beilstein.com>
Received: from beilstein.com (uucp@localhost) by uucp-1.csn.net (8.6.12/8.6.12) with UUCP id NAA28106 for chemistry@www.ccl.net; Tue, 29 Aug 1995 13:58:30 -0600
Received: by beilstein.com
     id 0JJB5008 Tue, 29 Aug 95 13:54:23 
Message-ID: <9508291354.0JJB500@beilstein.com>
Organization: Beilstein Information Systems
X-Mailer: TBBS/TIGER v1.0
Date: Tue, 29 Aug 95 13:54:23 
Subject: PRESS RELEASE
To: chemistry@www.ccl.net


PRESS RELEASE

August 28,  1995

For Immediate Release

Ref.: Jorge Manrique, Beilstein Information Systems                           
Tel: 415 358-9091    
Internet: jmanrique@beilstein.com

   
            CrossFireplusReactions(TM) and The Beilstein Commander(TM)
              Enthusiastically Received by Chemists at ACS-Chicago


Chicago, IL  --CrossFireplusReactions, a fully integrated reactions database,
and the Beilstein Commander, a graphical client that provides a unified
interface to the entire Beilstein suite of products, were unveiled last week at
the 210th meeting of the American Chemical Society in Chicago.

The comments of researchers who had an opportunity to visit the Beilstein booth
can be summed up as follows: CrossFireplusReactions running under the Commander
delivers the  performance of CrossFire, with the comprehensive coverage and
quality only Beilstein can provide.

The Beilstein Commander is a graphical interface designed to integrate the
complete Beilstein suite of products in one coherent unit. All interapplication
communication is handled transparently to the user.  The Beilstein Commander
runs on PC computers under Microsoft Windows 3.x, the newly released Windows
95, and on Macintosh computers under System 7.5.

CrossFireplusReactions complements the CrossFire system and delivers a complete
in-house information solution.   The realization of a vision of chemical
information, it breaks down artificially created barriers to information about
SUBSTANCES  (molecules and properties), REACTIONS  and CITATIONS.  For the
first time, scientists are able to search and display information in the way
which is most intuitive to them -- letting the format adapt to their
requirements.  Here, truly, is information at your fingertips.

The Beilstein paradigm of chemical information management involves three
mutually inclusive domains: Substances, Reactions and Citations.  When you
review the results of your search in the context of the substances domain, you
get the chemical structure and ALL its properties, detailed in up to 350 data
fields.  If you are interested in how to synthesize a molecule, or wish to see
how a structure (or substructure) participates in reactions, the reactions
domain presents to you the chemical reactions, reactions' conditions, and
literature references.  If, from there, you switch to the citations domain, you
see ALL the reactions written in the paper that reported the reaction in which
you were originally interested.

Hyperlinks present in substances, reactions and citations allow for even more
direct and effortless navigation between the various sources of information.

The structure search system developed for CrossFire has been extended to
include reaction sub-structure searching:  you can define the role of a
structure or sub-structure in the reaction, and search using reaction
attributes such as reaction centers, bond fate, and atom-atom mapping.  An
intuitive interface allows easy creation of reaction queries.  The unique
mapping tool, one of many innovations developed for CrossFireplusReactions,
provides the simplest and most elegant way to define chemical reactions.

Starting with about ten million reactions, with about 300,000  more to come
each year, CrossFireplusReactions is the largest and most comprehensive
database of chemical reactions available.  Reaction conditions such as solvent,
temperature, coreactants, catalysts, yield, etc. are displayed with the
results, and are also fully searchable as part of the query.  As could be
expected from Beilstein, all reactions have complete literature citations, and
these, also, are fully searchable.

Chemists involved in organic synthesis and reaction planning find in Beilstein
an indispensable tool for daily research.  Scientists seeking to optimize
reaction schemes have turned to the wealth of data available in
CrossFireplusReactions as the optimal solution.

CrossFireplusReactions employs RISC technology, a new structure indexing
system, and a revolutionary search engine to bring close to 10 million organic
reactions and their associated properties and literature references in-house. 
The client-server system links RISC data servers with clients running on
Windows based IBM PC compatible machines and Macintosh computers.

Clemens Jochum, Managing Director of Beilstein Information Systems, remarked:
"Our customers have been looking for a high quality solution to their organic
synthesis needs.  To date, there is no other information system that can
approximate the capabilities of CrossFireplusReactions in richness of the
knowledge base, or in the performance of the search engine."

Beilstein Informationssysteme GmbH, headquartered in Frankfurt, Germany, and
Beilstein Information Systems, Inc., with offices in Englewood, Colorado, also
market the Beilstein Handbook  of Organic Chemistry: the world's leading
authority on organic chemical information.


CrossFire, CrossFireplusReactions, and the Beilstein Commander are trademarks
of Beilstein Information Systems.  All other trademarks are the property of their holders.


From cliang@ginger.curagen.com  Tue Aug 29 17:26:12 1995
Received: from guarneri.curagen.com  for cliang@ginger.curagen.com
	by www.ccl.net (8.6.10/950822.1) id RAA07749; Tue, 29 Aug 1995 17:24:48 -0400
Received: from ginger.curagen.com by guarneri.curagen.com via SMTP (931110.SGI/930416.SGI.AUTO)
	for chemistry@www.ccl.net id AA17250; Tue, 29 Aug 95 17:23:38 -0400
Received: by ginger.curagen.com (940816.SGI.8.6.9/930416.SGI.AUTO)
	 id RAA01268; Tue, 29 Aug 1995 17:23:27 -0400
From: "Charlene Liang" <cliang@ginger.curagen.com>
Message-Id: <9508291723.ZM1266@ginger.curagen.com>
Date: Tue, 29 Aug 1995 17:23:26 -0400
X-Mailer: Z-Mail (3.2.0 26oct94 MediaMail)
To: chemistry@www.ccl.net
Subject: MM2 lone pair
Cc: cliang@ginger.curagen.com
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii


Dear all,

I would appreciate any comments from those of you who have used MM2(87) from
QCEP before.

The system involves a disulfide (S-S) bond and a peptide bond (-CA-N-C-CA-).
                                                                   | |
                                                                   H O
The question is: are lone pairs necessary for N, O, or S?

Personal e-mail response preferred.  Thank you in advance.

Charlene

-- 


Charlene Liang                |  (203) 481-1104 ext. 87
cliang@ginger.curagen.com     |  (FAX) (203) 481-1106


From eknight@wppost.depaul.edu  Tue Aug 29 19:41:13 1995
Received: from wppost.depaul.edu  for eknight@wppost.depaul.edu
	by www.ccl.net (8.6.10/950822.1) id TAA08940; Tue, 29 Aug 1995 19:27:39 -0400
From: <eknight@wppost.depaul.edu>
Received: from ac-lc-Message_Server by wppost.depaul.edu
	with WordPerfect_Office; Tue, 29 Aug 1995 18:30:18 -0600
Message-Id: <s0435cba.007@wppost.depaul.edu>
X-Mailer: WordPerfect Office 4.0
Date: Tue, 29 Aug 1995 17:51:07 -0600
To: chemistry@www.ccl.net
Subject:  Parity checking RAM in PCs


   Parity errors occur when one of the bits in memory randomly flips
(usually attributed to radioactivity from the solder).  PCs originally
supported parity checking on the IBM principle that bad data is worse
than no data (Macs have never supported parity checking).  A parity
error usually causes the machine to halt the current process, but that
depends on the operating system.
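
The parity check described above can be sketched as follows: one extra bit
is stored per byte so that a single flipped bit can be detected (though not
located or corrected). Even parity is assumed here purely for illustration,
and the function names are hypothetical; real memory controllers do this in
hardware.

```python
def parity_bit(byte):
    """Extra bit that makes the total number of 1 bits even (even parity)."""
    return bin(byte & 0xFF).count("1") % 2

def parity_ok(byte, stored_parity):
    """True if the byte still matches the parity bit stored with it."""
    return parity_bit(byte) == stored_parity

if __name__ == "__main__":
    original = 0b10110010
    stored = parity_bit(original)

    one_flip = original ^ 0b00000001    # a single bit flipped
    two_flips = original ^ 0b00000011   # two bits flipped

    print(parity_ok(original, stored))   # True: data unchanged
    print(parity_ok(one_flip, stored))   # False: error detected
    print(parity_ok(two_flips, stored))  # True: double flip goes unnoticed
```

A single parity bit only detects an odd number of flipped bits, which is why
it can halt the machine on an error but cannot repair the data.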

   The relatively new (less than 1 year old) Triton chip set from Intel that
supports the higher speed Pentium processors does not support parity
checking of RAM.  Most of the new high-speed PC clones thus are not
supporting parity checking.  The RAM chips themselves have been
designed analogously (the so-called EDO RAM does not have parity
checking), so they have only 8 chips on a SIMM as opposed to 9. The
new PCs don't support parity checking because (the argument goes) the
memory is so reliable that there is no reason to have parity checking.
The only quantification I have of this statement is that a parity error
should occur once every ten years in 16 megabytes of RAM.  (If you have
32 MB, it would be once every 5 years.)  I got that number from PC Magazine
and make no claims to its accuracy.
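
The scaling in the paragraph above (one error per ten years at 16 MB, hence
one per five years at 32 MB) can be written out as a small calculation. The
sketch assumes the error rate is proportional to the amount of RAM and that
errors arrive independently (a Poisson model); both are assumptions made
here, and the ten-year figure is the unverified magazine number quoted
above. The function names are hypothetical.

```python
import math

REFERENCE_MB = 16.0     # memory size quoted in the post
REFERENCE_YEARS = 10.0  # quoted mean time between parity errors at 16 MB

def years_between_errors(ram_mb):
    """Mean time between errors, assuming the rate scales with RAM size."""
    return REFERENCE_YEARS * REFERENCE_MB / ram_mb

def prob_at_least_one_error(ram_mb, job_days):
    """Chance of one or more flipped bits during a job (Poisson assumption)."""
    expected_errors = (job_days / 365.25) / years_between_errors(ram_mb)
    return -math.expm1(-expected_errors)

if __name__ == "__main__":
    print(years_between_errors(32))          # 5.0
    print(prob_at_least_one_error(32, 7.0))  # week-long job on 32 MB
```

Under these assumptions the chance of an error during any single job stays
small, but it grows with both memory size and run time.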

   The bottom line is this: if you run a calculation on one of the new PCs,
some number of bits _might_ have been randomly flipped.  My question is
this:  Is it legitimate to report calculations done on such a computer in a
scientific journal?

   Here are a few possible solutions to this dilemma:
   (1) Don't use the cheap computational power of the new PCs.  I don't
know what the situation is for other computers, but they are more
expensive; perhaps this is one reason why.
   (2) Use error correcting RAM.  With this, a parity error is corrected as it
happens.  Unattractive because such RAM should cost at least 50%
more than normal due to the larger number of chips and lower demand.  I
haven't priced it though or even seen it advertised; it could be much more
expensive than that.
   (3) Repeat all of your calculations twice.  Unattractive for obvious
reasons (you're better off buying a computer with half the speed and
parity checking - if you can find one).
   (4) Report the type of machine that performed your calculation so that
the risk is known to those reading your communication.

   None of these solutions is especially satisfying.  Given the chance of
programming errors, input errors and transcription errors, perhaps the
last solution is acceptable, as the hardware manufacturers would lead us
to believe.  What do you think?

eknight@wppost.depaul.edu


From jkl@ccl.net  Tue Aug 29 19:56:14 1995
Received: from bedrock.ccl.net  for jkl@ccl.net
	by www.ccl.net (8.6.10/950822.1) id TAA09196; Tue, 29 Aug 1995 19:55:20 -0400
Received: from pnl.gov  for y_zheng@ccmail.pnl.gov
	by bedrock.ccl.net (8.6.10/950822.1) id TAA07690; Tue, 29 Aug 1995 19:55:17 -0400
From: <y_zheng@ccmail.pnl.gov>
Received: from ccmail.pnl.gov by pnl.gov (PMDF V4.3-13 #6012)
 id <01HUNI5ZST0W90MT39@pnl.gov>; Tue, 29 Aug 1995 16:55:15 -0700 (PDT)
Date: Tue, 29 Aug 1995 16:55 -0700 (PDT)
Subject: Force field parameters for FAD, NAD, and related
To: chemistry@ccl.net
Message-id: <01HUNI609T8690MT39@pnl.gov>
MIME-version: 1.0
Content-transfer-encoding: 7BIT


Hi,

I am looking for force field parameters for flavin-adenine dinucleotide (FAD), 
nicotinamide adenine dinucleotide (NAD), and other related compounds. Any help 
will be appreciated. Please reply to me directly. Thanks.  Yajun

From jkl@ccl.net  Tue Aug 29 22:11:17 1995
Received: from bedrock.ccl.net  for jkl@ccl.net
	by www.ccl.net (8.6.10/950822.1) id WAA10218; Tue, 29 Aug 1995 22:00:51 -0400
Received: from cbdcom.apgea.army.mil  for grfamini@cbdcom.apgea.army.mil
	by bedrock.ccl.net (8.6.10/950822.1) id WAA08102; Tue, 29 Aug 1995 22:00:48 -0400
Date:     Tue, 29 Aug 95 21:56:40 EDT
From: George R Famini   <grfamini@cbdcom.apgea.army.mil>
To: chemistry@ccl.net
Subject:  COMP Program for New Orleans
Organization:  International Programs Office
Message-ID:  <9508292156.aa25932@cbdcom.apgea.army.mil>


Well Folks,

Here is the preliminary schedule for the Computers in Chemistry
Division's program for the New Orleans ACS meeting.  Please
remember that the deadline for getting abstracts in is the end 
of October (yes, it is earlier than several other divisions,
but that's life).  Although there is some leeway on the
deadline, I reserve the right to reject any paper submitted
after the deadline.  In any case, no paper will be accepted
after my final program goes to ACS (notice I do not tell you when that
date is...). 

The symposia organizers have made every attempt to accept
as many contributed papers as possible.  However, many of
the symposia are expected to be popular.  Therefore, I suggest you contact
the organizer now if you are interested in submitting a paper to see if
there is space available.  There will always be room in the poster session
(we had over 300 attendees to the poster session in Chicago, most
stayed for the entire evening), as well as the general oral (although I
recommend a poster, it seems to attract a larger audience).


 The preliminary agenda for New Orleans is:


Symposium                     Sunday  Monday  Tuesday  Wednesday  Thursday

Applic in Env Chem                               P         D
S.E. MO Theory                                             D         D
Drug Discovery                  D       D        P
Object Oriented                                                      D
Monte Carlo                     D       A
Experiment Design                       P        D
General Oral                                                         D
General Poster                                   E
Sci Mix                                 E
Phys Prop Est                   P       D
Frugal Chemist                                   D
Databasing                                                 D
Computers in Chemistry Award                     A


D= All Day   P= Afternoon  A= Morning   E= Evening

Session Organizers Needed

Despite the overall success of offering symposia, COMP still is in 
need of additional organizers for symposia at Las Vegas, Dallas and 
Boston (Fall 1997 and beyond).  Organizing a symposium can be 
challenging, but at the same time, rewarding and enlightening.  If 
you are interested in organizing a symposium, or have ideas for new 
symposia within COMP, do not hesitate to contact the Program 
Chair (that's me, folks).

The electronic poster session was a big hit, with over 1000
"visitors" to the infobahn homepage.  We intend to
hold another electronic poster session in New Orleans,
but with some modifications.  Hopefully, ACS will not
assign us a room for the electronic posters this time
(causing a bit of confusion).  Plus, it is our intent to
a) list the posters in the program for Sunday, and b) have
an Internet link outside of the COMP meeting rooms (maybe
even at the poster session...).  Anyway, Ton, Henry and Steve
deserve lots of thanks for making it work in Chicago.


				George Famini
				COMP Program Chair

The list of symposia (full title) and organizers for New
Orleans is:


Program Chair:  George R. Famini, U.S. Army Edgewood Research, 
Development and Engineering Center, SCBRD-ASI , Aberdeen Proving 
Ground, MD 21010; Voice:  (410)671-2552; Fax:  (410)671-5373; email:  
grfamini@apgea.army.mil.

Four (4) copies of 150-word abstract (Original on ACS Abstract Form) are 
due by October 20, 1995  to respective session or symposium chairpersons.

        Molecular Modeling Applications to Environmental Problems - 
 Dr. James Rabinowitz, USEPA, HERL, MD-68, Research Triangle 
Park, NC 27711; voice:  (919)541-5714; fax:  (919)541-0694; 
email:  sar@linus.herl.epa.gov.

        Semi-Empirical Molecular Orbital Methods:  Is There a 
Future? - Dr. Andrew J. Holder, Department of Chemistry, 
University of Missouri, Kansas City, MO 64110; voice:  (816)235-2293;
email:  aholder@vax1.umkc.edu.

       Computational Chemistry Assisted Drug Discovery - Dr. James 
Damewood, Zeneca Pharmaceuticals, Department of 
Medicinal/Structural Chemistry, 1800 Concord Pike, Wilmington, 
DE 19897; voice:  (302)886-5792; email:  damewoodjr@zen.com.

       Application of Object Oriented Programming Methodology to 
Computing in Chemistry - Dr. Dennis J. Gerson, IBM Consulting 
Group, 1507 LBJ Freeway MS/160601, Dallas, TX 75234; Voice: 
(214)280-1425; fax:  (214)280-1486; email:  
gerson@vnet.ibm.com.  Dr. Kevin Cross, Chemical Abstracts 
Service, 2540 Olentangy River Road, P.O. Box 3012, Columbus, 
OH 43210; voice:  (614)447-3813 ext. 3192; fax:  (614)447-3813; 
email:  kcross@acs.org.

         Monte Carlo Methods in Chemistry -  Dr. Brian L. Hammond, 
Computational Research Div., Fujitsu America, Inc., 3055 Orchard 
Drive, San Jose, CA  95134; voice:  (408) 456-7322;  email: 
brianh@fai.com.

 	
        Physical/Chemical Property Prediction - Dr. Lionel Carreira, 
Department of Chemistry, University of Georgia, Athens, GA 
30602; voice :   (706) 542-2050 or 2051; fax:   (706) 542-9454; 
email:  butch@sunlc2.chem.uga.edu.
 	
       Frugal Chemists Software - Dr. Charles James, Department of 
Chemistry, University of North Carolina at Asheville, Asheville, 
NC 28804; voice (704)251-6443; fax:  (704)251-6041; email:  
james@unca.edu.
 	
       Experimental Design for Chemical Models - Dr. Karen 
Rappaport, Hoechst-Celanese, 86 Morris Ave, Summit, NJ 07901; 
voice:  (908)522-7868; fax:  (908)522-3913;  email:  
kdr1@sumhcc1.hcc.com.
 	
        New Methods in Databasing - Dr. Scott Kahn, Molecular 
Simulations, Inc, 555 Oakmead Parkway, Sunnyvale, CA 94086;  
voice:  (408)522-0100; fax:  (408)522-0199; email:  
skahn@msi.com.
 
       Electronic Poster Session - Dr. Steven Bachrach, Department 
of Chemistry, Northern Illinois University, DeKalb, Il 60115
Phone: (815)753-6863, smb@smb.chem.niu.edu, Fax:   (815)753-4802   
	 

       General Computational Chemistry - Poster and/or Oral 
Sessions - Dr. George R. Famini, US Army Edgewood Research, 
Development and Engineering Center, SCBRD-ASI, APG, MD 
21010; voice:   (410)671-2552; Fax:  (410)671-5373; email:  
grfamini@apgea.army.mil.

 
From cletner@remcure.bmb.wright.edu  Tue Aug 29 22:26:17 1995
Received: from remcure.bmb.wright.edu  for cletner@remcure.bmb.wright.edu
	by www.ccl.net (8.6.10/950822.1) id WAA10470; Tue, 29 Aug 1995 22:20:07 -0400
Received: by remcure.bmb.wright.edu (931110.SGI/921111.SGI.AUTO)
	for chemistry@www.ccl.net id AA23409; Wed, 30 Aug 95 01:14:15 -0400
Date: Wed, 30 Aug 1995 00:42:21 -0400 (EDT)
From: Charles Letner <cletner@remcure.bmb.wright.edu>
Subject: Re: CCL:Parity checking RAM in PCs
To: eknight@wppost.depaul.edu
Cc: chemistry@www.ccl.net
In-Reply-To: <s0435cba.007@wppost.depaul.edu>
Message-Id: <Pine.3.07.9508300018.A23389-c100000@remcure.bmb.wright.edu>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII


On Tue, 29 Aug 1995 eknight@wppost.depaul.edu wrote:

>    The bottom line is this: if you run a calculation on one of the new PCs,
> some number of bits _might_ have been randomly flipped.  My question is
> this:  Is it legitimate to report calculations done on such a computer in a
> scientific journal?
	When the whole Pentium FPU bug came out we saw basically the same
question.  That is, how much faith do you put in your hardware?  I think
the answer is quite a bit.  The assumption I work under is that, of all
the potential errors that could occur, an error due to hardware failure
is the least likely.  In my mind, I as the user am much more
likely to make an error than the computer.  Who hasn't had to trash a
calculation at one time or another because of user error?  I know I have.
The reassuring point is that an error usually manifests itself in some way
that, with experience, the user can see.
	The other question I have is who decides what machines we can use.
If a Pentium isn't allowed, what about a low-end DEC Alpha?  Are the only
valid calculations those that are run on SGI Challenges or Crays?  I
don't really think so.  The results are really what matters, not the
machine used to run the simulation.  To carry it one step further, would
an author have to report the manufacturer of the RAM in the computer used
for a calculation?  Let's say that, to cut costs, an owner of an
SGI bought third-party memory.  Because it is less expensive, there is
always the POSSIBILITY (however small) that over a five-year period that
memory would have a higher failure rate.  Do we have to start reporting
mean time between failures for memory?  What about other hardware: disks,
cache, FPU, etc.?  I think my point is becoming clear.  Some things
have to be taken for granted.  That hardware failure is negligible
compared to other problems is one assumption that I think is valid.  To
those who would say that this detracts from the validity of the results:
judge for yourself the validity of the result.  But if you want to throw
out a simulation just because the machine didn't have parity checking,
better prepare to throw out a lot of science.  I mean, who's to say that
the HEPES used to make up the buffer in the last paper you read didn't have
some trace contaminant?   ;)
	Having said that, I do believe that reporting, in a general way, the
machine used and the CPU time is important.  But this is more so that a
reader who is considering a similar calculation can estimate the magnitude
of the job they are contemplating on the machine they plan on using.

There is my $0.02 worth......

Charles Letner
Wright State University
Department of Biochemistry
Dayton, OH 45435
e-mail: cletner@remcure.bmb.wright.edu


From jochen+@pitt.edu  Tue Aug 29 22:41:19 1995
Received: from post-ofc02.srv.cis.pitt.edu  for jochen+@pitt.edu
	by www.ccl.net (8.6.10/950822.1) id WAA10604; Tue, 29 Aug 1995 22:29:00 -0400
Received: from unixs2.cis.pitt.edu (jochen@unixs2.cis.pitt.edu [136.142.185.29])
          by post-ofc02.srv.cis.pitt.edu with SMTP (8.6.10/cispo-2.0)
          ID <WAA05247@post-ofc02.srv.cis.pitt.edu>;
          Tue, 29 Aug 1995 22:25:15 -0400
Date: Tue, 29 Aug 1995 22:25:16 -0400 (EDT)
From: Jochen Kuepper <jochen+@pitt.edu>
Subject: Re: CCL:Parity checking RAM in PCs
To: eknight@wppost.depaul.edu
cc: chemistry@www.ccl.net
In-Reply-To: <s0435cba.007@wppost.depaul.edu>
Message-ID: <Pine.3.89.9508292230.A16014-0100000@unixs2.cis.pitt.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII


Hi,
I still have parity checking in my PC, and it works - most of the time the 
computer crashes, but see IBM...

solution 5) buy Pentiums with parity-checking and they will get cheaper.

Jochen


--------------------------------------------------------------------
Jochen Kuepper                   
University of Pittsburgh                 (412) 624-8638
Department of Chemistry                  (412) 624-8665
603 Chevron Science Center               jochen+@pitt.edu
Pittsburgh, PA 15260
USA

Heinrich-Heine-Universitaet
Institut fuer Physikalische Chemie und Elektrochemie I
Universitaetsstrasse 26.43.02
40225 Duesseldorf                        kuepperj@uni-duesseldorf.de
Germany