From tamasgunda@tigris.klte.hu  Thu Aug 31 04:11:39 1995
Received: from tigris.klte.hu  for tamasgunda@tigris.klte.hu
	by www.ccl.net (8.6.10/950822.1) id DAA03242; Thu, 31 Aug 1995 03:57:29 -0400
Message-Id: <199508310757.DAA03242@www.ccl.net>
Received: from anti02 (anti02.chem.klte.hu) by tigris.klte.hu (MX V4.1 VAX)
          with SMTP; Thu, 31 Aug 1995 09:56:41 EDT
Sender: <tamasgunda@tigris.klte.hu>
X-MX-Warning:   Warning -- Invalid "From" header.
From: "tamasgunda@tigris.klte.hu" <tamasgunda@www.ccl.net>
To: chemistry@www.ccl.net
Date: Thu, 31 Aug 1995 09:56:40 +1
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7BIT
Subject: principal conponents summary
Priority: normal
X-mailer: Pegasus Mail/Windows (v1.22)


Some days ago I sent the following summary to CCL, but somehow
only a small part arrived. Here it is again, hopefully
without problems.
---------------------------------------------------------------

Here is the summary of my question concerning the calculation of
principal components and plane fitting to points:

The original question was:

I am looking for algorithms/numerical methods for determining
the principal components of a data set. In geometric terms,
I have a number of points given by y1, y2 and y3, and
I am looking for the best-fitting plane through the points, i.e. the
plane spanned by the two largest principal components.

Do not refer, please, to programs like Mathcad etc. I need it in
a numerical method form, as I'd like to use it as a procedure in a 
program.


*****************************************************************
That looks like a rather straightforward least-squares problem. You
want the parameters a,b,c in the equation for a plane
y3 = a y1 + b y2 + c. So you make up a matrix X:

           y1(1)   y2(1)   1  
           y1(2)   y2(2)   1
            ...
           y1(n)   y2(n)   1

and a vector Z = (y3(1), y3(2), ..., y3(n)), plus a 3-component
vector of unknowns A = (a,b,c). Then

      A = X^+ Z

gives you the best values for the parameters, where X^+ is the
generalized inverse of X.
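
A minimal sketch of this fit in Python, offered as an illustration and
not as the respondent's own code. It assumes X has full column rank, in
which case A = X^+ Z can be obtained by solving the normal equations
(X^T X) A = X^T Z. The function name fit_plane_z and the small Gaussian
elimination are choices made for this sketch.

```python
def fit_plane_z(points):
    """Fit y3 = a*y1 + b*y2 + c by ordinary least squares.

    `points` is a list of (y1, y2, y3) tuples. Solves (X^T X) A = X^T Z,
    which equals A = X^+ Z when X has full column rank.
    """
    # Accumulate X^T X (3x3) and X^T Z (3-vector); rows of X are (y1, y2, 1).
    xtx = [[0.0] * 3 for _ in range(3)]
    xtz = [0.0] * 3
    for y1, y2, y3 in points:
        row = (y1, y2, 1.0)
        for r in range(3):
            xtz[r] += row[r] * y3
            for c in range(3):
                xtx[r][c] += row[r] * row[c]

    # Gaussian elimination with partial pivoting on the augmented matrix.
    aug = [xtx[r] + [xtz[r]] for r in range(3)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("degenerate point set: X^T X is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, 3):
            f = aug[r][col] / aug[col][col]
            for c in range(col, 4):
                aug[r][c] -= f * aug[col][c]

    # Back substitution.
    sol = [0.0] * 3
    for r in (2, 1, 0):
        s = aug[r][3] - sum(aug[r][c] * sol[c] for c in range(r + 1, 3))
        sol[r] = s / aug[r][r]
    return tuple(sol)  # (a, b, c)
```

Note that this minimizes the error along the y3 axis only, so it cannot
describe a plane that contains the y3 direction.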

Konrad Hinsen                     | E-Mail: hinsenk@ere.umontreal.ca
Departement de chimie             | Tel.: +1-514-343-6111 ext. 3953
Universite de Montreal            | Fax:  +1-514-343-7586
C.P. 6128, succ. A                | Deutsch/Esperanto/English/Nederlands/
Montreal (QC) H3C 3J7             | Francais (phase experimentale)

*****************************************************************

From:             gordonh@chem.QueensU.CA (Heather Gordon)
To:               tamasgunda@tigris.klte.hu
Subject:          pca


There are listings for FORTRAN programs pertaining to all aspects of
factor analysis, including PCA, in

"Factor Analysis of Data Matrices", P. Horst, Holt, Rinehart and
Winston, New York, 1965.

I have also written my own code, using the routines TRED2 and TQLI
from "Numerical Recipes in Fortran" to diagonalize the covariance
matrix and find the eigenvalues and eigenvectors.

Heather Gordon
Department of Chemistry
Queen's University
Kingston, Ontario

*****************************************************************
        What you say you want sounds a lot like the transformation of
a molecule to the coordinate system in which the moment-of-inertia
tensor is diagonal. If your (y1, y2, y3) components are *orthogonal*,
you ought to be able to use a similar approach. This kind of code is
buried in all sorts of programs. It should be a simple matter to adapt
a suitable code fragment to your use.

++++++++++++++++++++++++++++++++++++++++++++++++++
                           FREDERIC A. VAN-CATLEDGE

Scientific Computing Division         ||   Office: (302) 695-1187 or 529-2076
Central Research & Development Dept.  ||
The DuPont Company                    ||      FAX: (302) 695-9658
P. O. Box 80320                       ||
Wilmington DE 19880-0320              || Internet: fredvc@esvax.dnet.dupont.com

*****************************************************************
Tamas,
     A friend sent me your e-mail on the subject of an algorithm for
calculating the first two PCs from a dataset. I have written PCA
(NIPALS) routines in BASIC and C and will be happy to send them to
you, if required. I also have a book (Numerical Recipes) which details
PCA (Singular Value Decomposition), which does a similar thing. If you
still need some help, send me an e-mail.

Steve Gurden,
Bristol Chemometrics Group,
School of Chemistry,
University of Bristol,
Cantock's Close,
BRISTOL BS8 1TS

Tel: +44 (0)117 9289000 extn. 4421
e-mail: S.P.Gurden@bris.ac.uk

*****************************************************************

  Dear Dr. Gunda,

  One of the simplest and most elegant methods to perform Principal
Component Analysis is the NIPALS algorithm. It can be formulated in a
few lines of code and it works really well.

  You can find it in the following references:

* Analytica Chimica Acta, vol. 185, p. 1-17 (1986)
* Chemometrics and Intelligent Laboratory Systems, vol. 2, p. 37-52 (1987)

  I hope this helps you.

  Sincerely.


  Jose I. Garcia
--

Dr. Jose Ignacio Garcia-Laureiro              Phone : 34-(9)76-350475
Departamento de Quimica Organica              Fax   : 34-(9)76-567920
Instituto de Ciencia de Materiales de Aragon  e-mail: jig@qorg.unizar.es
C.S.I.                                                jig@msf.unizar.es
E-50009 ZARAGOZA (SPAIN)

*****************************************************************

I thought the principle (no pun intended) was quite simple: just
calculate the variance/covariance matrix of the variables (or better,
the correlation matrix) and diagonalize that. You can get the
correlation matrix by autoscaling the data and then calculating the
covariance matrix.
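
As an illustration of the autoscaling step, here is a small Python
sketch. It assumes "autoscaling" means centring each column and dividing
by its sample standard deviation; the function name correlation_matrix
is made up for this example, and a column with zero variance would cause
a division by zero.

```python
import math

def correlation_matrix(data):
    """Autoscale the columns of `data` (list of rows), then take the covariance.

    The covariance matrix of autoscaled data is the correlation matrix
    of the raw data.
    """
    n = len(data)
    m = len(data[0])
    means = [sum(row[j] for row in data) / n for j in range(m)]
    # Sample standard deviation (n - 1 in the denominator).
    stds = [math.sqrt(sum((row[j] - means[j]) ** 2 for row in data) / (n - 1))
            for j in range(m)]
    scaled = [[(row[j] - means[j]) / stds[j] for j in range(m)] for row in data]
    return [[sum(r[j] * r[k] for r in scaled) / (n - 1) for k in range(m)]
            for j in range(m)]
```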

Next, select the number of principal components that you want by
calculating the cumulative sum of the eigenvalues divided by the sum
of all the eigenvalues (the eigenvalues and eigenvectors need to be
sorted by size, either by the diagonalization procedure or
afterwards).


I mean:

T = \sum_i \lambda_i

suppose, as an example:

\lambda_1 / T                     = 90 %
(\lambda_1 + \lambda_2) / T       = 98 %
 ...
(\lambda_1 + \lambda_2 + ...) / T = ..

If this is sufficiently accurate for you, use the subspace spanned by
the first two eigenvectors as your new space, and project all data
points onto it. Otherwise, use more eigenvectors until
(\sum_j \lambda_j) / T is sufficiently near 100 % (1.00), and project
the data points onto that subspace (unfortunately it is then no longer
a 2-D plane, which would be easiest for viewing).
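
The selection rule can be sketched in a few lines of Python. The
function name components_needed and the default threshold of 0.98 are
assumptions made for this illustration only.

```python
def components_needed(eigenvalues, threshold=0.98):
    """Return (k, fractions) for the cumulative eigenvalue criterion.

    fractions[m - 1] is (lambda_1 + ... + lambda_m) / T with the
    eigenvalues sorted in descending order; k is the smallest number of
    leading components whose fraction reaches `threshold`.
    """
    lam = sorted(eigenvalues, reverse=True)
    total = sum(lam)
    fractions = []
    running = 0.0
    for value in lam:
        running += value
        fractions.append(running / total)
    for k, frac in enumerate(fractions, start=1):
        if frac >= threshold:
            return k, fractions
    return len(lam), fractions
```

With eigenvalues in the ratio 90 : 8 : 2, as in the example above, two
components are needed for any threshold between 90 % and 98 %.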

Hopefully you can use this information; I can't think of any
references off-hand.
There is a book by Box (and Hunter?) which describes these
multivariate techniques in great detail.
Sorry I can't help you better,

Frits


Frits Daalmans
OIO Conformational Analysis
Gorlaeus Laboratoria
Leiden, The Netherlands
E-mail: frits@chemde4.leidenuniv.nl
Tel: [+31] (0)71-274505

*****************************************************************

Check the NIST CD for handwritten characters -- there is a principal
component algorithm coded there (it may be in C, but you could then
recode it in FORTRAN).  There are several sites that catalog
algorithms, too -- here are a few: http://gams.nist.gov/,
http://netlib.att.com/netlib/att/cs/home/1127.ht,
http://www.netlib.org/.

Good luck.

Joe McDaniel
joe@psiint.com

*****************************************************************

Dear Dr. Gunda,
There are many references to Principal Component Analysis. It is
actually a straightforward problem in matrix diagonalization. However,
if your real goal is simply to fit a plane to a set of points
(minimizing the sum of squares of the distances of the points to the
plane) or a line to a set of points (minimizing the sum of squares of
the distances of the points to the line), then this is also a simple
matrix diagonalization problem whose solution is given and discussed
in the following references:

"To Fit a Plane or Line to a Set of Points by Least Squares",
V. Schomaker, J. Waser, R.E. Marsh, and G. Bergman, Acta Cryst.,
vol. 12, pp. 600-604 (1959).

and

"To fit a plane to a set of points by least squares",
D.M. Blow, Acta Cryst., vol. 13, p. 168 (1960).

Regards,

Marvin Waldman, Ph.D.
Director, Rational Drug Design
Biosym Technologies, Inc.
e-mail: marvin@biosym.com

*****************************************************************

Dear Dr. Gunda,

    There are two ways to do what you want (as I understand it). One
is to use a statistical analysis package which supports principal
components analysis as statisticians define it.  The plane you are
looking for is the one defined by the first two principal components.

    The other is to perform the equivalent of a moment of inertia
calculation with all masses set to 1.  First find the mean value of
each of the three coordinates (call them x, y, z).  This is equivalent
to your center of mass.  Build the symmetric 3x3 tensor with the
following elements, summed over all points (distances are relative to
the centroid):

    y*y + z*z         -x*y           -x*z
      -x*y          x*x + z*z        -y*z
      -x*z            -y*z         x*x + y*y
    
    Diagonalize the tensor (the Jacobi method works well here - see
any of the "Numerical Recipes" series by Press et al.)  The
transformation matrix gives you your three principal coordinates.
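
A sketch of this recipe in Python, under stated assumptions: the Jacobi
routine below is a plain cyclic Jacobi written for this example (it is
not the Numerical Recipes code), and the function names are invented
here. For the unit-mass inertia tensor, the axis with the largest moment
is the normal of the best-fit plane, because the moment about an axis is
the total squared distance from the centroid minus the squared
distances measured along that axis.

```python
import math

def inertia_tensor(points):
    """Unit-mass inertia tensor about the centroid of a list of (x, y, z)."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    cz = sum(p[2] for p in points) / n
    t = [[0.0] * 3 for _ in range(3)]
    for px, py, pz in points:
        x, y, z = px - cx, py - cy, pz - cz
        t[0][0] += y * y + z * z
        t[1][1] += x * x + z * z
        t[2][2] += x * x + y * y
        t[0][1] -= x * y
        t[0][2] -= x * z
        t[1][2] -= y * z
    t[1][0], t[2][0], t[2][1] = t[0][1], t[0][2], t[1][2]
    return t

def jacobi_eigen(a, sweeps=50, tol=1e-20):
    """Cyclic Jacobi rotations for a small symmetric matrix.

    Returns (eigenvalues, v); column k of v is the k-th eigenvector.
    """
    n = len(a)
    a = [row[:] for row in a]
    v = [[float(i == j) for j in range(n)] for i in range(n)]
    for _ in range(sweeps):
        off = sum(a[p][q] ** 2 for p in range(n) for q in range(p + 1, n))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if a[p][q] == 0.0:
                    continue
                # Small-angle rotation that zeroes a[p][q].
                tau = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, tau) / (abs(tau) + math.sqrt(tau * tau + 1.0))
                c = 1.0 / math.sqrt(t * t + 1.0)
                s = t * c
                for k in range(n):  # A <- A J
                    akp, akq = a[k][p], a[k][q]
                    a[k][p], a[k][q] = c * akp - s * akq, s * akp + c * akq
                for k in range(n):  # A <- J^T A
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k], a[q][k] = c * apk - s * aqk, s * apk + c * aqk
                for k in range(n):  # V <- V J accumulates the eigenvectors
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p], v[k][q] = c * vkp - s * vkq, s * vkp + c * vkq
    return [a[i][i] for i in range(n)], v

def plane_normal(points):
    """Unit normal of the best-fit plane: axis of the largest moment."""
    values, vectors = jacobi_eigen(inertia_tensor(points))
    k = max(range(3), key=lambda i: values[i])
    return [vectors[r][k] for r in range(3)]
```

The plane itself passes through the centroid; the other two eigenvectors
span it.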

Hope this helps,
Paul Soper

-----------------------------------------------------------------
Paul Soper                        All the usual disclaimers apply
DuPont Central Research             soperpd@esvax.dnet.dupont.com  
P.O. Box 80328                                 Tel (302)-695-1757  
Wilmington, DE 19880-0328                      FAX (302)-695-8412  

*****************************************************************

   According to the math of the normal least squares procedure, as far
as I know, NOT the distances of the points from the line are minimized
(i.e. a perpendicular from a point to the line), but the difference
between the measured and the calculated y values, which is represented
graphically by a line between the point and the regression line,
parallel to the y axis:

That depends on the error criterion that you use. The one sent
yesterday (and probably some others too) does indeed minimize the
error along one coordinate axis (and in fact works only if that axis
does not lie in the plane that you want to fit). If you want to
minimize the distances of the points from the plane, use the normal
form for the plane:

    n x - d = 0,

where n is the normalized normal vector, x is the coordinate vector,
and d the distance of the origin from the plane.  The distance of any
point y from this plane is simply given by n y - d, so you must
minimize

  __
  \    |           | 2
  /    | n y_i - d |
  --
  i

with the constraint that n is normalized, so you need a least-squares
minimizer that can handle constraints (e.g. via Lagrange multipliers).
Or use a "dirty hack": if you know (or assume) that d will never be
zero (i.e. the plane will not contain the coordinate origin), set
d = 1 and use an unnormalized normal vector, whose length then is the
inverse of the distance from the origin.


------------------------------------------------------------------------------
Konrad Hinsen                     | E-Mail: hinsenk@ere.umontreal.ca
Departement de chimie             | Tel.: +1-514-343-6111 ext. 3953
Universite de Montreal            | Fax:  +1-514-343-7586
C.P. 6128, succ. A                | Deutsch/Esperanto/English/Nederlands/
Montreal (QC) H3C 3J7             | Francais (phase experimentale)

------------------------------------------------------------------------------

*****************************************************************

Tamas,

        I think I have solved a problem fairly similar to the one you
are interested in.  My problem was less general and involved finding
the best-fit plane to a group of four points centred around a fifth
point and arose because I am looking at compression strain in
tetracoordinate carbon systems.  For example the molecule
[4.4.4.4]fenestrane C9H12
                               H
                    H2C__C__CH2
                      |  |  |
                     HC-----CH
                      |  |  |
                    H2C--C--CH2
                         H
contains a central "flattened" tetracoordinate carbon atom.  One way
to determine the flattening at any C(C)4 moiety is to find the
best-fit plane for the four alpha-C atoms passing through the central
C atom.  This can be done simply by finding the eigenvectors of what
we have termed the Geometry Tensor.
        The Geometry Tensor is constructed by multiplying the matrix
D (the vectors to your points -- in this case the 4 alpha-C atoms) by
its transpose to give a 3x3 real symmetric matrix (routines to find
the eigenvectors/eigenvalues of any RSM can be found in EISPACK and
similar sets of routines).  The eigenvectors of this matrix will
correspond to the minimum (best fit -- this will have the smallest
eigenvalue), the maximum and one other extremum, i.e. they will form a
set of cartesian axes -- x, y and z -- one of which will be the normal
to your best-fit plane.
     The mathematics behind this is actually quite simple and involves
setting up the equations to minimise d' = (sum of squared distances
of your points to an arbitrary plane), after which the need to form
and find the eigenvectors of the "Geometry Tensor" becomes obvious.
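
A short Python sketch of this construction follows. It is not the
respondent's Fortran code. To stay self-contained it does not call an
EISPACK-style routine; instead it finds the smallest-eigenvalue
eigenvector by power iteration on trace(G)*I - G, which is a substitute
technique chosen for this example. The function name best_fit_normal is
likewise made up here.

```python
import math

def best_fit_normal(vectors, iterations=500):
    """Unit normal of the best-fit plane through the origin of `vectors`.

    `vectors` are (x, y, z) displacements from the chosen origin, e.g.
    from the central atom to each neighbour. Builds G = D^T D and returns
    the eigenvector of G with the smallest eigenvalue.
    """
    g = [[sum(v[r] * v[c] for v in vectors) for c in range(3)] for r in range(3)]
    trace = g[0][0] + g[1][1] + g[2][2]
    if trace == 0.0:
        raise ValueError("all vectors are zero")
    # trace*I - G has the same eigenvectors as G with the order of the
    # eigenvalues reversed, so plain power iteration finds the normal.
    b = [[(trace if r == c else 0.0) - g[r][c] for c in range(3)] for r in range(3)]

    def rayleigh(n):
        # n^T G n: sum of squared distances of the points from the plane.
        return sum(n[r] * g[r][c] * n[c] for r in range(3) for c in range(3))

    best = None
    # Try each coordinate axis as a start, since a single start vector
    # may happen to be orthogonal to the normal being sought.
    for axis in range(3):
        n = [1.0 if r == axis else 0.0 for r in range(3)]
        for _ in range(iterations):
            m = [sum(b[r][c] * n[c] for c in range(3)) for r in range(3)]
            norm = math.sqrt(sum(x * x for x in m))
            if norm == 0.0:
                break
            n = [x / norm for x in m]
        if best is None or rayleigh(n) < rayleigh(best):
            best = n
    return best
```

The sign of the returned normal is arbitrary, and convergence is slow
when the two smallest eigenvalues of G are nearly equal.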
     Naturally this can be readily extended to any number of atoms
and any origin.  I have used this method to re-orient molecules that
were optimised in C1 symmetry (in cartesians) which exhibit higher
symmetry but where the axes/planes of symmetry do not lie on the x, y
or z axes of my final cartesians.
        I might be able to send you my fortran code (it is very
simple) if you think this would help but I will need to check the
origin of the eigenvector solving routines that I use (to make sure we
do not break anyone's intellectual property rights as I did not write
these routines myself -- I think they came from EISPACK).

        Cheers,

                Danne

 Danne R Rasmussen, PhD Student                     phone:   +61 6 249-3771
 Research School of Chemistry                         fax:   +61 6 249-0750
 Australian National University
 Canberra ACT 0200                             e-mail: danne@rsc.anu.edu.au

*****************************************************************


Hi,

     Your assumption about the fitting of the least squares of
perpendicular distances is correct .. this is what is done by linear
regression.  A side note, linear regression is used in teaching
software, but programs are always built around the matrix
formulation.

     The way that you can fit perpendicular distances, as well as do
non-linear fits (constants other than linear coefficients), is by using
optimization methods, such as steepest descent, Newton-Raphson, etc.
This is just like doing geometry optimization, only you are minimizing
your perpendicular distances instead of the energy.

     The other possibility is spline methods (cubic splines are most
commonly used).  A spline method is not a least-squares fit, it is an
exact fit.  However, if there is noise in the data you fit exactly to
the noise.

     Hope this helps.

                    Dave Young
                    young@slater.cem.msu.edu

*****************************************************************

Dear Tamas,
     I also am a chemist, a chemometrician in fact, and am interested
in chemistry, statistics and computing all mixed together. Whenever I
have a mathematical problem like yours, I often find an answer in the
excellent book "Numerical Recipes in C: The Art of Scientific
Computing" by W.H. Press, S.A. Teukolsky, W.T. Vetterling and
B.P. Flannery (Cambridge University Press). There are also versions
for FORTRAN and, I think, BASIC. PCA is also described in various
multivariate statistics and chemometrics books, such as "Multivariate
Calibration" by H. Martens and T. Naes (Wiley) and "Multivariate
pattern recognition in chemometrics" edited by R.G. Brereton
(Elsevier).

     The source code is actually written in Visual Basic, but
conversion to Basic, Fortran or C should be quite straightforward.

PCA: ijX = ikT * kjP
where X is an i by j data matrix, T is an i by k scores matrix and P
is a k by j loadings matrix. In your case, the number of components to
extract ("kmax" in the source code) will be 2. For this listing,
"evals" are the eigenvalues, one for each PC, which I have defined as
the sum of squares of the scores. NIPALS calculates the PCs in order
of importance (i.e. PCs with the biggest eigenvalues come first). The
loadings will be 2 vectors of unit length, which represent the two new
PC axes, and the scores matrix gives the coordinates of the points on
these new PC axes.

     Think about whether you wish to mean-centre the data prior to
the PCA or not - this will give different results!

     Maybe you have the answer to your problem anyway, by the time I
write this, but best of luck anyway!


******************************************************
Sub NIPALS (X!(), kmax%, mc%, T!(), P!(), evals!())

    Dim i%, imax%, j%, jmax%, k%, maxcol%
     imax = UBound(X, 1)
     jmax = UBound(X, 2)
    Dim sm#, ss#, cmaxss#, rmaxss#, smaxss#
    Dim cmax#(), rmax#(), smax#()

    ReDim cmax(1 To imax)
    ReDim rmax(1 To jmax)
    ReDim smax(1 To imax)

    ReDim T(1 To imax, 1 To kmax)
    ReDim P(1 To kmax, 1 To jmax)
    ReDim evals(1 To kmax)


    k = 1

    ' Mean-centre matrix if required: subtract each column's own mean
    If mc = 1 Then
     For j = 1 To jmax
         sm = 0

         For i = 1 To imax
          sm = sm + X(i, j)
         Next i

         sm = sm / imax

         For i = 1 To imax
          X(i, j) = X(i, j) - sm
         Next i
     Next j
    End If


START1:

    ' Find column with greatest sum of squares
    cmaxss = 0#

    For j = 1 To jmax
     ss = 0#

     For i = 1 To imax
         ss = ss + X(i, j) ^ 2
     Next i
     
     If ss > cmaxss Then
         cmaxss = ss
         maxcol = j
     End If
    Next j

    For i = 1 To imax
     cmax(i) = X(i, maxcol)
    Next i


START2:

    ' Calculate row vector and its sum of squares
    rmaxss = 0#

    For j = 1 To jmax
     rmax(j) = 0#

     For i = 1 To imax
         rmax(j) = rmax(j) + (cmax(i) * X(i, j))
     Next i

     rmaxss = rmaxss + rmax(j) ^ 2
    Next j

    ' Scale this row vector to unit length, after checking that it is
    ' not a zero vector
    If rmaxss = 0# Then
     MsgBox ("Nipals( ): Exited before all PC's found")
     Exit Sub
    End If

    For j = 1 To jmax
     rmax(j) = rmax(j) / Sqr(rmaxss)
    Next j

    ' Calculate a new estimate of the PC scores
    smaxss = 0#

    For i = 1 To imax
     smax(i) = 0#

     For j = 1 To jmax
         smax(i) = smax(i) + (X(i, j) * rmax(j))
     Next j

     smaxss = smaxss + smax(i) ^ 2
    Next i

    ' Test to see if PC has converged. If so, store scores and loadings,
    ' adjust matrix, and look for next PC. If not, use last scores
    ' estimate to put back into the algorithm. The convergence criterion
    ' can be adjusted (i.e. 1e-6 used for single precision).
    If Abs(smaxss - cmaxss) / cmaxss < .000000000001 Then
     evals(k) = smaxss

     For i = 1 To imax
         T(i, k) = smax(i)
     Next i

     For j = 1 To jmax
         P(k, j) = rmax(j)
     Next j


     For i = 1 To imax
         For j = 1 To jmax
          X(i, j) = X(i, j) - (smax(i) * rmax(j))
         Next j
     Next i

     If k = kmax Then
         Exit Sub
     End If

     k = k + 1

     GoTo START1
    Else
     For i = 1 To imax
         cmax(i) = smax(i)
     Next i

     cmaxss = smaxss

     GoTo START2
    End If

End Sub
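
For readers without Visual Basic, the same iteration can be sketched in
Python. This is a loose transcription for illustration, not the
author's code: the max_iter cap, the exception raised when the residual
matrix is exhausted, and the decision to work on a copy of the data are
assumptions of this sketch. Each column is centred with its own mean.

```python
import math

def nipals(x, kmax, mean_centre=True, tol=1e-12, max_iter=1000):
    """Extract kmax principal components from x (a list of rows) by NIPALS.

    Returns (scores, loadings, evals), laid out like T, P and evals in
    the listing: scores[i][k], loadings[k][j], and evals[k] as the sum
    of squares of the scores of component k.
    """
    x = [list(map(float, row)) for row in x]  # work on a copy
    n_rows, n_cols = len(x), len(x[0])
    if mean_centre:
        for j in range(n_cols):
            mean = sum(row[j] for row in x) / n_rows
            for row in x:
                row[j] -= mean
    scores = [[0.0] * kmax for _ in range(n_rows)]
    loadings = []
    evals = []
    for k in range(kmax):
        # Start from the column with the greatest sum of squares.
        col_ss = [sum(row[j] ** 2 for row in x) for j in range(n_cols)]
        start = max(range(n_cols), key=col_ss.__getitem__)
        t = [row[start] for row in x]
        t_ss = col_ss[start]
        for _ in range(max_iter):
            # Loadings estimate: p = X^T t, scaled to unit length.
            p = [sum(t[i] * x[i][j] for i in range(n_rows)) for j in range(n_cols)]
            p_norm = math.sqrt(sum(v * v for v in p))
            if p_norm == 0.0:
                raise ValueError("residual matrix is zero; fewer PCs than kmax")
            p = [v / p_norm for v in p]
            # New scores estimate: t = X p.
            t_new = [sum(x[i][j] * p[j] for j in range(n_cols)) for i in range(n_rows)]
            new_ss = sum(v * v for v in t_new)
            converged = abs(new_ss - t_ss) <= tol * t_ss
            t, t_ss = t_new, new_ss
            if converged:
                break
        # Store the component, then deflate X by the rank-one product t p^T.
        for i in range(n_rows):
            scores[i][k] = t[i]
            for j in range(n_cols):
                x[i][j] -= t[i] * p[j]
        loadings.append(p)
        evals.append(t_ss)
    return scores, loadings, evals
```

For the plane-fitting question, the two loadings returned for kmax = 2
span the best-fit plane through the centroid, and their cross product
gives its normal.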


================

Steve Gurden,
Bristol Chemometrics Group,
School of Chemistry,
University of Bristol,
Cantock's Close,
BRISTOL BS8 1TS

Tel: +44 (0)117 9289000 extn. 4421
e-mail: S.P.Gurden@bris.ac.uk

*****************************************************************
 
From: g80@chm.uri.edu
 Hi Dr. Tamas Gunda,

    I too have been interested in the question of fitting a plane to
a set of points (3D). As you correctly pointed out, this is a problem
in total least squares, not simple least squares. This problem has
been approached many times using a variety of algorithms (it's
fundamental to crystallography).

I have written a simple FORTRAN program based on the method of D. M.
Blow, Acta Cryst. (1960), 13, 168. One fundamental paper is V.
Schomaker, J. Waser, R. E. Marsh and ?. Bergman, Acta Cryst. (1959),
12, 600. There are many other papers on this subject. I hope this
helps and good luck.


                                           Brian Schmitz

                                           Univ. of Rhode Island
*****************************************************************
end summary

*****************************************************************************
   Tamas E. Gunda, Ph.D.               phone: (+36-52) 316666 ext 2479
   Research Group of Antibiotics       fax  : (+36-52) 310936
   L. Kossuth University               e-mail: tamasgunda@tigris.klte.hu
   POBox 36                                   
   H-4010 Debrecen
   Hungary
*****************************************************************************

From jkl@ccl.net  Thu Aug 31 06:11:40 1995
Received: from bedrock.ccl.net  for jkl@ccl.net
	by www.ccl.net (8.6.10/950822.1) id GAA04657; Thu, 31 Aug 1995 06:00:30 -0400
Received: from uivt1.uivt.cas.cz  for HRUSAK@JH-INST.CAS.CZ
	by bedrock.ccl.net (8.6.10/950822.1) id GAA19304; Thu, 31 Aug 1995 06:00:02 -0400
Received: from bob.jh-inst.cas.cz (bob.jh-inst.cas.cz [147.231.28.65]) by uivt1.uivt.cas.cz (8.6.12/8.6.12) with ESMTP id LAA29648 for <chemistry@ccl.net>; Thu, 31 Aug 1995 11:53:33 +0200
Received: from BOB/MAILQUEUE by bob.jh-inst.cas.cz (Mercury 1.21);
    31 Aug 95 11:57:47 GMT+1
Received: from MAILQUEUE by BOB (Mercury 1.21); 31 Aug 95 11:57:34 GMT+1
From: "Dr. Jan Hrusak" <HRUSAK@JH-INST.CAS.CZ>
Organization:  Institute of Physical Chemistry
To: chemistry@ccl.net
Date:          Thu, 31 Aug 1995 11:57:25 +0100
Subject:       S-T gap in phenyl cation
Priority: normal
X-mailer: Pegasus Mail for Windows (v2.01)
Message-ID: <3D0A2B2451@bob.jh-inst.cas.cz>



Dear Netters,
does somebody know anything about the singlet/triplet gap in the
phenyl cation? Any information (regardless whether old or new, 
experiment or theory) would be appreciated.

Jan Hrusak


----------------------------------------------------------------------------
Dr. Jan Hrusak                               ###############################
J. Heyrovsky Institute of Physical Chemistry ## MEMOR ESTO CONGREGATIONIS ##
Academy of Sciences of the Czech Republic    ##   TVAE QVAM POSSEDISTI    ##
Dolejskova 3, CZ-182 23 Prague 8             ##         AB INITIO         ##
Czech Republic                               ###############################
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Phone: (0042 2) 66 05 3436                    FAX: (0042 2) 858 2307
                     E-Mail: hrusak@jh-inst.cas.cz
----------------------------------------------------------------------------


From chessyx@gsusgi1.Gsu.EDU  Thu Aug 31 10:13:00 1995
Received: from gsusgi1.Gsu.EDU  for chessyx@gsusgi1.Gsu.EDU
	by www.ccl.net (8.6.10/950822.1) id KAA08463; Thu, 31 Aug 1995 10:09:41 -0400
Received: (from chessyx@localhost) by gsusgi1.Gsu.EDU (8.6.10/8.6.10) id KAA21868 for chemistry@www.ccl.net; Thu, 31 Aug 1995 10:09:41 -0400
Date: Thu, 31 Aug 1995 10:09:41 -0400
From: Shijie Yao <chessyx@gsusgi1.Gsu.EDU>
Message-Id: <199508311409.KAA21868@gsusgi1.Gsu.EDU>
To: chemistry@www.ccl.net
Subject: Stability comparison of two products by QM



Dear netters,

We have a couple of reactions like

A + B = C
A + B'= C'

Can we calculate the relative stability of C' vs C
using quantum mechanical calculations?

Thank you very much for your help.

Shijie Yao


From Jeffrey.Gosper@brunel.ac.uk  Thu Aug 31 10:26:46 1995
Received: from monge.brunel.ac.uk  for Jeffrey.Gosper@brunel.ac.uk
	by www.ccl.net (8.6.10/950822.1) id KAA08338; Thu, 31 Aug 1995 10:06:06 -0400
Received: from chem-pc-18.brunel.ac.uk by monge.brunel.ac.uk with SMTP (PP) 
          id <26561-0@monge.brunel.ac.uk>; Thu, 31 Aug 1995 15:04:41 +0100
Date: Thu, 31 Aug 1995 15:04:34 BST
From: Jeffrey J Gosper <Jeffrey.Gosper@brunel.ac.uk>
Reply-To: Jeffrey.Gosper@brunel.ac.uk
Subject: SUMMARY: MOPAC reaction path publications
To: chemistry@www.ccl.net
Message-ID: <ECS9508311534A@brunel.ac.uk>
Priority: Normal
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; CHARSET=US-ASCII


A short while ago I posted the following question:

I am trying to locate computational studies that make use of the 
reaction path calculation offered by MOPAC (i.e. use of the -1 
optimization flag). I would therefore appreciate any pointers to 
published work on this matter. 

I'll summarize any responses as usual. 

***********************************

Thanks to those who responded; however, very few pointers were
provided (I'm sure there must be more out there).

***********************************


From: Stanislaw Oldziej <stan@sun2.chem.univ.gda.pl> 
Subject: Re: CCL:MOPAC reaction path calculations 
To: Jeffrey J Gosper <Jeffrey.Gosper@brunel.ac.uk> 

Hi Jeffrey, 

I have two works for you that use the -1 option in MOPAC to locate
transition states. Both works are from my lab:

1. M.Tarnowska, St.Oldziej, A.Liwo, P.Kania, F.Kasprzykowski, Z.Grzonka.
   MNDO study of the mechanism of the inhibition of cysteine proteinases
   by diazomethyl ketones. Eur. Biophys. J., 21, 217-222 (1992)

2. A.Tempczyk, M.Tarnowska, A.Liwo, E.Borowski. A theoretical study of
   glucosamine synthase. II. Combined quantum and molecular mechanics
   simulation of sulfhydryl attack on the carboxyamide group. Eur.
   Biophys. J., 21, 137-145 (1992)

You may also look at the work of Kollman's group:

3. E.H.Alison, P.A.Kollman. OH- versus SH- nucleophilic attack on amides:
   Dramatically different gas-phase and solvation energetics.
   J.Am.Chem.Soc., 110, 7195-7200 (1988)


Stanislaw Oldziej 
Faculty of Chemistry 
University of Gdansk 
Sobieskiego 18, 80-952 Gdansk 
POLAND 
***************************************************

I have published one such article, and another has been submitted
recently.
Here is the reference for the published paper:
Martin, N.H.;  Allen, D.B.;  Taylor, K.N. "Semi-Empirical Molecular
Orbital Calculations on the Reaction of Singlet Oxygen with Vinylamine;
Examination of a Charge-Transfer Mechanism," J. Elisha Mitchell Sci.
Soc. 1991, 107, 89-96.

        Good luck. 
                Ned H. Martin 
********************************************************************
Ned H. Martin, Chair 
Department of Chemistry 
University of North Carolina at Wilmington      Voice: 910-395-3453 
601 S. College Rd                               Fax:   910-395-3013 
Wilmington, NC 28403-3297                       
martinn@vxc.uncwil.edu 
********************************************************************

/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\
 Dr. Jeff Gosper                                         
 Dept. of Chemistry		                        
 BRUNEL University                                     
 Uxbridge Middx UB8 3PH, UK                            
 voice:  01895 274000 x2187                            
 facsim: 01895 256844                                  
 internet/email/work:   Jeffrey.Gosper@brunel.ac.uk     
 internet/WWW: http://http2.brunel.ac.uk:8080/~castjjg 
\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/\/



From oreola@chem.QueensU.CA  Thu Aug 31 10:56:48 1995
Received: from QUCDN.QueensU.CA  for oreola@chem.QueensU.CA
	by www.ccl.net (8.6.10/950822.1) id KAA09463; Thu, 31 Aug 1995 10:45:17 -0400
Received: from quchem.chem.QueensU.CA by QUCDN.QueensU.CA (IBM VM SMTP V2R2)
   with TCP; Thu, 31 Aug 95 10:45:22 EDT
Received: by quchem.chem.QueensU.CA (4.1/SMI-4.0)
	id AA04875; Thu, 31 Aug 95 10:52:29 EDT
Date: Thu, 31 Aug 95 10:52:29 EDT
From: oreola@chem.QueensU.CA (Oreola Donini)
Message-Id: <9508311452.AA04875@quchem.chem.QueensU.CA>
To: chemistry@www.ccl.net
Subject: DMol module on Insight II


Hi,
	I should be getting access to the DMol module in the Insight II
package and I was wondering if someone could list for me the density
functionals available within the package.  I will not get actual
documentation for a while, but I need to choose a functional for another
part of a project which should be consistent with DMol.
			Thank-you for your help,
				Oreola Donini

From toukie@zui.unizh.ch  Thu Aug 31 11:11:47 1995
Received: from rzusuntk.unizh.ch  for toukie@zui.unizh.ch
	by www.ccl.net (8.6.10/950822.1) id LAA09967; Thu, 31 Aug 1995 11:04:05 -0400
Received: by rzusuntk.unizh.ch (4.1/SMI-4.1.9)
	id AA10366; Thu, 31 Aug 95 17:03:51 +0200
X-Nupop-Charset: Swiss
Date: Thu, 31 Aug 1995 17:03:59 +0100 (MET)
From: "Hr Dr. S. Shapiro" <toukie@zui.unizh.ch>
Sender: toukie@zui.unizh.ch
Reply-To: toukie@zui.unizh.ch
Message-Id: <61439.toukie@zui.unizh.ch>
To: chemistry@www.ccl.net
Subject: Seeking molecular mechanics force fields


Dear Colleagues;

     I am seeking the source codes for some molecular mechanics force fields,
especially AMBER, MMP2, and MM2.  If anyone has knowledge of an FTP site
from which the source codes for these (or possibly other) MM FFs can be found,
please contact me.  Alternatively, if you would be willing to donate a
source code for any of the above MM FFs, I would also very much appreciate
hearing from you.


Sincerely,

(Dr.) S. Shapiro
Inst. f. orale Mikrobiol. u. allg. Immunol.
Zent. f. Zahn-, Mund- u. Kieferheilkd. der Univ. ZH
Plattenstr. 11
Postfach
CH-8028 Zuerich 7
Switzerland

Internet: toukie@zui.unizh.ch
FAX-nr: ( ... + 1) 261'56'83

From POLLARD@CCIT.ARIZONA.EDU  Thu Aug 31 12:41:58 1995
Received: from Madder.CCIT.Arizona.EDU  for POLLARD@CCIT.ARIZONA.EDU
	by www.ccl.net (8.6.10/950822.1) id MAA12036; Thu, 31 Aug 1995 12:40:21 -0400
From: <POLLARD@CCIT.ARIZONA.EDU>
Received: from CCIT.ARIZONA.EDU by CCIT.ARIZONA.EDU (PMDF V5.0-5 #2381)
 id <01HUPVGMOK408Y56M9@CCIT.ARIZONA.EDU> for chemistry@www.ccl.net; Thu,
 31 Aug 1995 09:40:10 -0700 (MST)
Date: Thu, 31 Aug 1995 09:40:10 -0700 (MST)
Subject: Matrix elements in G9X
To: chemistry@www.ccl.net
Message-id: <01HUPVGMU6NM8Y56M9@CCIT.ARIZONA.EDU>
X-Envelope-to: chemistry@www.ccl.net
X-VMS-To: IN%"chemistry@www.ccl.net"
MIME-version: 1.0
Content-transfer-encoding: 7BIT


I was wondering if anyone could tell me how to include the Fock and overlap
matrix elements in the output from G92.  I have e-mailed gaussian, but they
are not responding.  Thanks, John Pollard, University of Arizona.

From jstewart@fujitsuI.fujitsu.com  Thu Aug 31 13:26:46 1995
Received: from mail.barrnet.net  for jstewart@fujitsuI.fujitsu.com
	by www.ccl.net (8.6.10/950822.1) id NAA13373; Thu, 31 Aug 1995 13:14:40 -0400
Received: from fujitsu1.fujitsu.com (fujitsu1.fujitsu.com [133.164.254.1]) by mail.barrnet.net (8.6.10/MAIL-RELAY-LEN) with SMTP id KAA07867 for <CHEMISTRY@www.ccl.net>; Thu, 31 Aug 1995 10:14:34 -0700
Received: by fujitsu1.fujitsu.com (4.1/SMI-4.1)
	id AA29809; Thu, 31 Aug 95 10:15:53 PDT
Date: Thu, 31 Aug 95 10:15:53 PDT
From: jstewart@fujitsu.com (Dr. James Stewart)
Message-Id: <9508311715.AA29809@fujitsu1.fujitsu.com>
To: CHEMISTRY@www.ccl.net
Subject: normal vibrations


Ferenc Molnar writes:
> The eigenvalue number k correspond to (w_k)^2, where the w_ks
> are the vibrational frequencies, this means:
> 
>  (w_k)^2=K_k/M_k 
> 
> K_k: force constant of the kth normal mode
> M_k: reduced mass of the k_th normal mode
> 
> Now my question is, if (w_k)^2 determines only the ratio
> of K_k and M_k, then how are the reduced masses, reported
> in the "vibrational analysis" section of the MOPAC output 
> file, calculated? Is there a convention, which "fraction" of
> (w_k)^2 to use for K_k and which for M_k?

The reduced mass is like atomic charges in that it is not an observable.
However, for specific systems - mainly homonuclear diatomics - the reduced
mass does have meaning.  The reduced mass definition used in MOPAC can be
understood as follows:

Each vibration can be modelled by a mass, M_k, at the end of a spring of
force constant K_k, attached to an infinite mass.

  Inf   |       K_k        M_k
  Mass  |
        |^^^^^^^^^^^^^^^^^^O
        |
        |

The contribution to the mass is proportional to the amount each atom
contributes to the normal mode, and is proportional to the fraction of
the atomic mass contributed by each atom.

Put in more formal terms, the contribution of each atom to the effective mass
of a vibration is proportional to the product of the intensity on that
atom times the mass-weighted intensity.

 rho = sum_A <c_A|c_A><c_A|M_A|c_A> = sum_A (c_A_x**2+c_A_y**2+c_A_z**2)**2*M_A

where c_A are the coefficients of the normal modes.  



Consider H2: c_1 = 0.7071*H_1+0.7071*H_2

     rho_1 = 0.7071**4*1 + 0.7071**4*1 =0.5

Consider M-H, M being an atom of very large mass, say 1000:

             c_1 = 0.0316*M+0.9995*H

     rho_1 = 0.0316**4*1000 + 0.9995**4*1 = 0.9990

[0.0316 ~ sqrt(1/1000); 0.9995 ~sqrt(1-1/1000)]

Consider N2: c_1 = 0.7071*N_1+0.7071*N_2

     rho_1 = 0.7071**4*14+0.7071**4*14 = 7.0
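
The three examples above can be checked with a few lines of code.  What
follows is only a minimal sketch of the formula as written in this message,
not MOPAC's actual routine: the function name effective_mass is made up
for illustration, the motion is assumed to lie along x, and the masses are
the rounded values used in the examples.

```python
def effective_mass(coeffs, masses):
    """rho = sum_A (c_A_x**2 + c_A_y**2 + c_A_z**2)**2 * M_A.

    coeffs: one (cx, cy, cz) tuple per atom, taken from a normalized mode.
    masses: one atomic mass per atom, in the same order.
    (Hypothetical helper; the name and the layout are assumptions.)
    """
    total = 0.0
    for (cx, cy, cz), mass in zip(coeffs, masses):
        intensity = cx * cx + cy * cy + cz * cz  # <c_A|c_A>
        total += intensity ** 2 * mass           # intensity * (intensity * M_A)
    return total


# The sign of each coefficient does not matter, since only squares enter.
h2 = effective_mass([(0.7071, 0.0, 0.0), (-0.7071, 0.0, 0.0)], [1.0, 1.0])
mh = effective_mass([(0.0316, 0.0, 0.0), (-0.9995, 0.0, 0.0)], [1000.0, 1.0])
n2 = effective_mass([(0.7071, 0.0, 0.0), (-0.7071, 0.0, 0.0)], [14.0, 14.0])
print(h2, mh, n2)
```

To within the rounding of the coefficients, the printed values should agree
with the 0.5, 0.9990 and 7.0 quoted above.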


Jimmy Stewart

From mrigank@imtech.ernet.in  Thu Aug 31 13:41:47 1995
Received: from sangam.ncst.ernet.in  for mrigank@imtech.ernet.in
	by www.ccl.net (8.6.10/950822.1) id NAA13650; Thu, 31 Aug 1995 13:31:52 -0400
Received: (from uucp@localhost) by sangam.ncst.ernet.in (8.6.12/8.6.6) with UUCP id XAA04478 for chemistry@www.ccl.net; Thu, 31 Aug 1995 23:02:27 +0530
Received: from imtech.UUCP by doe.ernet.in (4.1/SMI-4.1-MHS-7.0)
	id AA22769; Thu, 31 Aug 95 22:52:31+050
Message-Id: <9508311752.AA22769@doe.ernet.in>
Received: by imtech.ernet.in (DECUS UUCP w/Smail);
          Thu, 31 Aug 95 14:01:50 +0530
Date: Thu, 31 Aug 95 14:01:50 +0530
From: Mrigank <mrigank@imtech.ernet.in>
To: chemistry@www.ccl.net
Subject: Workshop Announcement:
X-Vms-Mail-To: CHEM,NMR,WATOC,AMBER


   NATIONAL WORKSHOP ON COMPUTER AIDED PROTEIN DESIGN
   ==================================================
   
                (November 6-10, 1995)
   
   
   
   
              
                         Organized by
   
                      Bioinformatics Centre,
                  Institute of Microbial Technology
                      Chandigarh.  160 014
      
   
   THE THEME 
   --------- 
   
            Protein engineering, though originally conceived as a realm of
   molecular biology, is now a multidisciplinary approach to the study of
   biological systems; Structural Biology and Molecular Biophysics are among
   the disciplines involved. Though there has been a manyfold increase in the
   pace at which 3-D structures of proteins become available, protein
   sequences far outnumber them. Although the ultimate goal of all "protein
   engineers" is the prediction of 3-D structure from a given sequence,
   present-day aims are somewhat more modest, such as finding out what
   happens if a residue is replaced or deleted in a protein. This
   necessitates the use of the tools of Computational Chemistry, a field
   that has seen very rapid growth of late.
  
            The idea of this workshop is to familiarize the participants
   with the state of the art in this area and to help them appreciate what
   can be gleaned from the information available to molecular biologists
   or biophysicists, and with what degree of confidence and caution it
   should be used.

            What information does one get from just the protein sequence? How
   well can one predict the effect of mutations on proteins with known
   structure? What kinds of ideas can one get from a structure to
   introduce desired properties into a protein? These are some of
   the questions the workshop will attempt to address.
   
   
   Topics to be covered:
   --------------------
         
    o   Sequence Analysis
    o   Protein Secondary/Supersecondary Structure
    o   Homology Based Protein Modeling
    o   Molecular Dynamics/Simulated Annealing
    o   Genetic Algorithms
    o   Graphic Representation of Protein Structure
    o   Conformational Analysis
   
   Who Should Apply?
   ----------------

   Biotechnologists, Molecular  Biophysicists, Molecular Pharmacologists, 
   Structural Biologists actively involved in R & D in the area of protein 
   engineering/protein modeling. The workshop will be useful to everyone from
   research scholars to senior scientists working in academic institutions or
   industry.
   
   How To Apply?
   ------------

   Applicants should send their application along with a brief resume and a
   statement of purpose briefly describing how this workshop will be useful
   to their research work.
   
   


   REGISTRATION
   ------------
   
   Registration Fee:
   
   Academic:   Faculty  Rs. 500/-
               Students Rs. 300/-
   
   Industry:            Rs. 3000/- 
   
   
   Last Date: Applications must reach before September 20, 1995.

   Accommodation: Applicants needing accommodation must send a request along
   with the registration form.

   Travel Assistance: No travel assistance will be provided to the
   participants. TA/DA must be met from the participants' own sources.

   Number of Participants: There will be a maximum of twenty (20) participants.
   
   
    ORGANIZING COMMITTEE
    --------------------   
   
   Dr. C. M. Gupta		Chairman 
   Dr. Amit Ghosh		Co - chairman 
   Dr. Naresh Kumar		Member 
   Dr. G. Sahn			Member 
   Dr. R. M. Vohra		Member 
   Mr. C. R. Suri		Member 
   Mr. G. P. S. Raghava		Secretary 
   Mr. Bijay Singh		Member 
   Dr. Mrigank			Member 
   
   
   
                                             
   REGISTRATION FORM
   -----------------
   
   
                                   NATIONAL
                             WORKSHOP ON COMPUTER 
                             AIDED PROTEIN DESIGN

   1.  Name (Mr./Ms)....................................
   2.  Designation .........................................
   3. Department/Institute...........................                   
   ......................................................................
   4.  Address for communication...............
   ......................................................................
   ......................................................................
    5. Telephone No. & Fax No......................
   
   
   6. Whether accommodation required .....
   
   
   
   
   
   Date............... 
   
                                 Signature of the candidate
   
   
   
   
   
   N.B. Please include your resume and statement of purpose with this form.
   Please do not send the registration fee with this form. The registration
   fee will be collected from the selected applicants after selection.
   

   THE CENTRE
   ----------

   The Bioinformatics Centre (BIC) at IMTECH was established in 1987 by the
   Department of Biotechnology, Govt. of India. The objective of the Centre is
   to create the infrastructure in the field of protein engineering and protein
   modeling. The resources available at the Centre include:

   Hardware: DEC Alpha workstations (one DEC 3000/600 and seven DEC 3000/300LX),
   PCs (two 486, one 386, two 286) and one MicroVAX II system.
   
   Software: AMBER , RasMol, MidasPlus, X-plor, MicroGenie, GMAP, DNAOPT, Cang,
   Hpat, DNASIZE, CPSSD, CONFang, GAMESS, BOSS, Modeller,  etc.
   
   Databases: PDB, ATLAS (PIR,NRL_3D,ALN), CODONtab, RESTseq, Cang, Hpat, 
   PROSITE, Rotamer Library (Dunbrack and Karplus), DSSP etc.
   
   Communication and other facilities: All the systems in the Centre are
   networked using TCP/IP on an Ethernet. E-mail, Internet access,
   literature search etc. are available.
   
   THE CITY
   --------

   Chandigarh offers a rich fare of places of tourist interest such as the 
   Rock Garden, Pinjore Garden, Botanical Garden and Sukhna Lake.  The weather
   here in November is very pleasant (temperature 15-25 deg C). Chandigarh is
   well connected by bus, train and air from Delhi. Many trains pass via Ambala,
   which is about 45 km away from Chandigarh. There is a frequent bus service
   between Ambala and Chandigarh.
   

   CORRESPONDENCE
   --------------

   G. P. S. Raghava, Coordinator
   Bioinformatics Centre.
   Institute of Microbial Technology
   Sector 39 A, Chandigarh-160 014.
   
   Phones:  0172-690004,  0172-690908,
   Telex: 0395-7369-IMT-IN Gram : IMTECH
   Fax : 0172-690585, 0172-690632
______________________________________________________________________________

There will be an opportunity for vendors to advertise and demonstrate their
products. Interested parties can ask for details.

Mrigank   
----
/Mrigank                             \/ Phone  +91 172 690557               \
\Institute of Microbial Technology   /\ Email:  mrigank@imtech.ernet.in     /
/Sector 39A,                         \/ FAX: +91 172 690585                 \
\Chandigarh 160 014 India.           /\                                     /
 \//\//\//\//\//\//\//\//\//\//\//\//\//\//\//\//\//\//\//\//\//\//\//\//\//  
  Science does not have a country, but a scientist has one. -L. Pasteur
   
   

