From chemistry-request@server.ccl.net  Mon Sep  6 05:43:46 1999
Received: from bacchus.pc1.uni-duesseldorf.de (root@bacchus.pc1.uni-duesseldorf.de [134.99.152.11])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id FAA16518
	for <CHEMISTRY@server.ccl.net>; Mon, 6 Sep 1999 05:43:05 -0400
Received: from bacchus.pc1.uni-duesseldorf.de (jochen@localhost [127.0.0.1])
	by bacchus.pc1.uni-duesseldorf.de (8.8.7/8.8.7) with SMTP id LAA27298;
	Mon, 6 Sep 1999 11:38:16 +0200
From: Jochen <jochen@uni-duesseldorf.de>
Reply-To: jochen@bacchus.pc1.uni-duesseldorf.de
Organization: Heinrich-Heine-Universität
To: Daniel Mok <dkwmok@fg702-6.abct.polyu.edu.hk>,
        CCL <CHEMISTRY@server.ccl.net>
Subject: Re: CCL:Question about compiling GNQS
Date: Mon, 6 Sep 1999 11:34:50 +0200
X-Mailer: KMail [version 1.0.28]
Content-Type: text/plain
References: <37D35F42.C35737D1@fg702-6.abct.polyu.edu.hk>
In-Reply-To: <37D35F42.C35737D1@fg702-6.abct.polyu.edu.hk>
MIME-Version: 1.0
Message-Id: <99090611381602.25872@bacchus.pc1.uni-duesseldorf.de>
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by server.ccl.net id FAA16519

On Mon, 06 Sep 1999 Daniel Mok wrote:

>I am sorry to post a message which is slightly off the topic of CCL.
>I have tried to compile Generic-NQS on a PC running SuSE 6.1, but without
>success. The GNQS version is 3.50.4. Has anyone successfully installed
>NQS on a system similar to SuSE 6.1?

I installed it on a RedHat-5.2 Linux/AXP box a while ago, and it is
running fine.

It should work with any modern Linux distribution.

-- Jochen
 Heinrich-Heine-Universität Düsseldorf          jochen@uni-duesseldorf.de
 Institut für Physikalische Chemie I               phone ++49-211-8113681
 Universitätsstr. 26.43.02.29                      fax   ++49-211-8115195
 40225 Düsseldorf, Germany       www-public.rz.uni-duesseldorf.de/~jochen
From chemistry-request@server.ccl.net  Mon Sep  6 06:45:43 1999
Received: from bacchus.pc1.uni-duesseldorf.de (root@bacchus.pc1.uni-duesseldorf.de [134.99.152.11])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id GAA16942
	for <chemistry@ccl.net>; Mon, 6 Sep 1999 06:45:40 -0400
Received: from bacchus.pc1.uni-duesseldorf.de (jochen@localhost [127.0.0.1])
	by bacchus.pc1.uni-duesseldorf.de (8.8.7/8.8.7) with SMTP id MAA27348;
	Mon, 6 Sep 1999 12:40:27 +0200
From: Jochen <jochen@uni-duesseldorf.de>
Reply-To: jochen@bacchus.pc1.uni-duesseldorf.de
Organization: Heinrich-Heine-Universität
To: "Dr. B. Habibi-Nezhad" <B.Habibi@tbzmed.ac.ir>,
        chemistry <chemistry@ccl.net>
Subject: Re: CCL:Request SUN Share/Freeware Fortran Compiler
Date: Mon, 6 Sep 1999 12:39:08 +0200
X-Mailer: KMail [version 1.0.28]
Content-Type: text/plain
References: <000101bef839$02a5d7c0$1380a8c0@client2.res.tbzmed.ac.ir>
In-Reply-To: <000101bef839$02a5d7c0$1380a8c0@client2.res.tbzmed.ac.ir>
MIME-Version: 1.0
Message-Id: <99090612402703.25872@bacchus.pc1.uni-duesseldorf.de>
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by server.ccl.net id GAA16943

On Mon, 06 Sep 1999 Dr. B. Habibi-Nezhad wrote:

>How can I obtain a Shareware or Freeware Fortran Compiler for SUN
>workstations?

Look at any GNU ftp archive (e.g. ftp.gnu.org) and get gcc-2.95.1, it
contains a good FORTRAN 77 compiler.

-- Jochen
 Heinrich-Heine-Universität Düsseldorf          jochen@uni-duesseldorf.de
 Institut für Physikalische Chemie I               phone ++49-211-8113681
 Universitätsstr. 26.43.02.29                      fax   ++49-211-8115195
 40225 Düsseldorf, Germany       www-public.rz.uni-duesseldorf.de/~jochen
From chemistry-request@server.ccl.net  Mon Sep  6 11:58:25 1999
Received: from syntem.eerie.fr (syntem.site-eerie.ema.fr [146.19.248.2])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id LAA18940
	for <chemistry@ccl.net>; Mon, 6 Sep 1999 11:58:23 -0400
Received: from maltose.eerie.fr (maltose.eerie.fr [146.19.248.38]) by syntem.eerie.fr (980427.SGI.8.8.8/980728.SGI.AUTOCF) via SMTP id RAA98021 for <chemistry@ccl.net>; Mon, 6 Sep 1999 17:54:13 +0200 (MDT)
Message-ID: <018401bef88a$9423c4e0$26f81392@eerie.fr>
Reply-To: "Jerome Gomar" <jgomar@syntem.eerie.fr>
From: "Jerome Gomar" <jgomar@syntem.eerie.fr>
To: <chemistry@ccl.net>
Subject: INDIS software
Date: Mon, 6 Sep 1999 18:09:21 +0100
Organization: Synt:em

Dear CCLers,
I'm looking for the program INDIS, which has been developed by the
research group of Dr. J. Gálvez.
Thank you in advance for your responses.

Best Regards
****************************************************************
Dr Jerome GOMAR                                                Synt :em
Research Scientist                       Parc Scientifique G.Besse
Computational Drug Discovery                         30000 Nimes
email: jerome@syntem.eerie.fr                                  France
Tel: +33 (0)466 048 665                  Fax: +33 (0)466 048 667
****************************************************************
          Discover New Drugs, Discover    Synt:em
****************************************************************



From chemistry-request@server.ccl.net  Tue Sep  7 06:20:03 1999
Received: from sun.fqspl.com.pl. ([212.244.147.3])
	by server.ccl.net (8.8.7/8.8.7) with SMTP id GAA03614
	for <CHEMISTRY@ccl.net>; Tue, 7 Sep 1999 06:20:02 -0400
Received: from victor by sun.fqspl.com.pl. (SMI-8.6/SMI-SVR4)
	id MAA27738; Tue, 7 Sep 1999 12:09:49 +0200
Message-ID: <001701bef919$c2f61b80$1793f4d4@victor.fqspl.com.pl>
From: "Victor Anisimov" <victor@fqspl.com.pl>
To: <CHEMISTRY@ccl.net>
Subject: Re: CCL:Request SUN Share/Freeware Fortran Compiler
Date: Tue, 7 Sep 1999 12:14:25 +0200
MIME-Version: 1.0
Content-Type: text/plain;
	charset="koi8-r"
Content-Transfer-Encoding: 7bit
X-Priority: 3
X-MSMail-Priority: Normal
X-Mailer: Microsoft Outlook Express 4.72.3110.5
X-MimeOLE: Produced By Microsoft MimeOLE V4.72.3110.3

:>How can I obtain a Shareware or Freeware Fortran Compiler for SUN
:>workstations?
:
:Look at any GNU ftp archive (e.g. ftp.gnu.org) and get gcc-2.95.1, it
:contains a good FORTRAN 77 compiler.

Dear Netters, let me add a few words on this question. I appreciate the
GNU enthusiasm for open-source FORTRAN 77 development, but in my
experience it is still a very crude FORTRAN compiler, especially on
non-x86 platforms. On a non-x86 platform (which SUN is) one can compile
only small Fortran programs; compilation of large programs like
MOPAC2000 will fail: it creates a non-running executable on SUN Sparc.

For SUN I can recommend the Fujitsu FORTRAN compiler, http://tools.fujitsu.com/
It is not free, but you can download it for evaluation and use it for 30 days.
That is enough to compile any program.

Sincerely,
Victor.

=========================================================================
Victor Anisimov, PhD, Software Researcher - Computational Chemistry 
FQS Poland, Palac Pugetow, ul. Starowislna 13-15, 31-038 Krakow, Poland
Email: victor@fqspl.com.pl  Tel.(+48 12) 429 4345  Fax(+48 12) 429 6124
=========================================================================


From chemistry-request@server.ccl.net  Tue Sep  7 07:52:52 1999
Received: from ccl.net (atlantis.ccl.net [192.148.249.4])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id HAA04820
	for <chemistry@ccl.net>; Tue, 7 Sep 1999 07:52:52 -0400
Received: from stark.udg.es (stark.udg.es [130.206.128.61])
	by ccl.net (8.8.6/8.8.6/OSC 1.1) with SMTP id HAA01906
	for <chemistry@www.ccl.net>; Tue, 7 Sep 1999 07:48:04 -0400 (EDT)
Received: from stark.udg.es (bart.udg.es [130.206.128.119]) by stark.udg.es (950413.SGI.8.6.12/950213.SGI.AUTOCF) via ESMTP id JAA29606 for <chemistry@www.ccl.net>; Tue, 7 Sep 1999 09:47:14 -0200
Message-ID: <37D4FCC1.40AF6E5E@stark.udg.es>
Date: Tue, 07 Sep 1999 13:53:37 +0200
From: David Robert <david@stark.udg.es>
X-Mailer: Mozilla 4.03 [es] (WinNT; I)
MIME-Version: 1.0
To: "Computational Chemistry List (CCL)" <chemistry@www.ccl.net>
Subject: CCL: cross-validation and prediction with PLS
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Dear CCLers,

A question on cross-validation and prediction in QSAR models using PLS:

Standardization of the data (mean zero, standard deviation (sdv) one) is
a requisite. To check the statistical significance of a predictive model,
leave-one-out (LOO) cross-validation is usually performed. When a
molecule is removed, a new standardization needs to be computed for the
remaining n-1 compounds, and the new mean and sdv are then applied to the
extracted molecule. However, for the usual sample sizes (n=30 to 150)
the error committed in equating the mean/sdv of the n-1 molecules with
that of the n molecules is not negligible, and it clearly affects the
prediction for the removed molecule. The same can be said about
predicting the property for a test set: the mean/sdv of the training
set QSAR model is applied to the test set in order to have
standardized descriptors, but the mean/sdv obtained if we included all
the molecules (training+test) may be very different.
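
A minimal Python sketch of this effect, assuming a single made-up
descriptor column with one extreme compound (the values and the helper
name `zscore` are illustrative only and come from no QSAR package):

```python
import statistics

def zscore(value, sample):
    """Standardize `value` with the mean and sample SD of `sample`."""
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)  # sample SD, n-1 denominator
    return (value - mean) / sd

# Hypothetical descriptor values; the last compound is an extreme one.
descriptor = [1.0, 1.2, 0.9, 1.1, 1.3, 0.8, 1.0, 3.0]

left_out = descriptor[-1]
kept_in = descriptor[:-1]

z_all = zscore(left_out, descriptor)  # mean/sdv from all n molecules
z_loo = zscore(left_out, kept_in)     # mean/sdv from the n-1 kept-in molecules

print(f"z with mean/sdv of all n molecules: {z_all:.2f}")
print(f"z with mean/sdv of n-1 molecules:   {z_loo:.2f}")
```

The standardized value of the removed compound depends strongly on which
set supplied the mean and sdv, and the gap grows as n gets smaller or the
compound gets more extreme.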

I am afraid that LOO cross-validation is sometimes calculated simply by
recomputing the regression coefficients, keeping fixed the scores of the
n molecules model, which obviously overrates the results. Any opinions?


David Robert

--------------------------
E-Mail: david@stark.udg.es
Institute of Computational Chemistry
University of Girona
Campus Montilivi
17071 Girona, Catalonia, Spain

From chemistry-request@server.ccl.net  Tue Sep  7 09:33:16 1999
Received: from balihai.uchicago.edu (balihai.uchicago.edu [128.135.136.49])
	by server.ccl.net (8.8.7/8.8.7) with SMTP id JAA05216
	for <chemistry@ccl.net>; Tue, 7 Sep 1999 09:33:16 -0400
Received: from localhost by balihai.uchicago.edu via SMTP (950413.SGI.8.6.12/931108.SGI.ANONFTP)
	for <chemistry@ccl.net> id IAA21938; Tue, 7 Sep 1999 08:37:31 -0500
Date: Tue, 7 Sep 1999 08:37:31 -0500 (CDT)
From: "Fred P. Arnold" <fparnold@balihai.uchicago.edu>
To: chemistry@ccl.net
Subject: G77 for Fortran
Message-ID: <Pine.SGI.3.95.990907083524.21935B-100000@balihai.uchicago.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII

Hello,

For what it's worth, you may actually want to try one version back of the
EGCS g77, 1.1.2, rather than the current 2.95.1.  I was able to build a
smoothly running version of GAMESS-US with the previous one, but 2.95.1
builds executables that run only for small jobs.  I'm looking into that
one (I could have the libraries on my system munged, for instance), but
I'd be wary of the new release.

The problem with MOPAC2k may be Sparc related, as I've compiled large
programs using G77 on Intel and MIPS.  

						-fred



"No science has ever made                 Frederick P. Arnold, Jr.  
 more rapid progress in a                 A&HPRC, U. of Chicago     
 shorter time than Chemistry."            5640 S. Ellis Ave             
        -Martin Heinrich Klaproth, 1791   Chicago, IL 60637             

From chemistry-request@server.ccl.net  Tue Sep  7 09:42:46 1999
Received: from ccl.net (atlantis.ccl.net [192.148.249.4])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id JAA05276
	for <chemistry@ccl.net>; Tue, 7 Sep 1999 09:42:46 -0400
Received: from mailhub2.shef.ac.uk (mailhub2.shef.ac.uk [143.167.2.154])
	by ccl.net (8.8.6/8.8.6/OSC 1.1) with ESMTP id JAA04647
	for <chemistry@www.ccl.net>; Tue, 7 Sep 1999 09:38:53 -0400 (EDT)
From: d.turner@sheffield.ac.uk
Received: from pc100182.shef.ac.uk ([143.167.100.182] helo=davets_pc)
	by mailhub2.shef.ac.uk with smtp (Exim 3.02 #2)
	id 11OLSH-00035T-00; Tue, 07 Sep 1999 14:38:37 +0100
To: David Robert <david@stark.udg.es>
Date: Tue, 7 Sep 1999 14:39:55 +0100
MIME-Version: 1.0
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7BIT
Subject: Re: CCL:cross-validation and prediction with PLS
CC: chemistry@www.ccl.net
Priority: normal
In-reply-to: <37D4FCC1.40AF6E5E@stark.udg.es>
X-mailer: Pegasus Mail for Win32 (v3.01a)
Message-Id: <E11OLSH-00035T-00@mailhub2.shef.ac.uk>

David

Basically, you are asking the right questions about PLS/CV.
There are a number of publications that discuss these issues; e.g.,
Cramer RD in 3D-QSAR in Drug Design, 1993 and also in Perspect.
in Drug Disc. and Design, 1 (1993), 269-278.

First, it is not a pre-requisite that the data be standardised in the way
you describe (Z-score autoscaling). In fact the so-called zeroth PLS
LV corresponds to mean-centring the data. Adjusting the univariate
stdev to one is optional and makes sense with some data and
not others. Univariate autoscaling does not make sense, for
example, with CoMFA, where its application means that high
energy interactions have the same prior weighting as low energy
interactions -  the former _may_ have very high variance while
the latter cannot have high variance. PLS is sensitive to both
the univariate X-block variance and the strength of the univariate
correlations with Y. Autoscaling can make sense where, for
example, descriptors from different sources are used.
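
To make the distinction concrete, here is a small Python sketch, assuming
two invented descriptor columns (one with large spread, one with small);
the function names `center` and `autoscale` are illustrative only:

```python
import statistics

def center(column):
    """Mean-centre a column; its variance is left unchanged."""
    mean = statistics.mean(column)
    return [x - mean for x in column]

def autoscale(column):
    """Mean-centre and scale to unit sample SD (Z-score autoscaling)."""
    mean = statistics.mean(column)
    sd = statistics.stdev(column)
    return [(x - mean) / sd for x in column]

# Hypothetical columns: one high-variance field, one low-variance field.
high_var = [-30.0, 5.0, 40.0, -10.0, 20.0]
low_var = [-0.3, 0.1, 0.2, -0.1, 0.1]

for name, col in (("high_var", high_var), ("low_var", low_var)):
    v_centred = statistics.variance(center(col))
    v_scaled = statistics.variance(autoscale(col))
    print(f"{name}: variance centred = {v_centred:.3f}, autoscaled = {v_scaled:.3f}")
```

After centring alone the two columns keep their very different variances,
so the high-variance column dominates; after autoscaling both enter the
model with the same prior weight.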

With regard to data (re)scaling during crossvalidation it appears
that different groups favour different approaches. The Tripos (i.e.
CoMFA etc) and GOLPE methods both rescale the "kept-in" rows
(compounds) for each CV cycle. I personally favour this approach.
The left-out compounds are rescaled/centred according to the stats
for the kept-in rows. This means that the predictions are "true"
predictions; conversely, if the means/stdevs used are not recalculated for
each CV cycle, then information about the left-out compounds has
been included in the model development stage and the results are thus not
true predictions. The expectation is that the latter method will produce more
optimistic CV stats (i.e. q2) than the former approach. The latter method
is in fact used by the Umetri software (SIMCA, MMODE etc) although
I may be out-of-date here. It is also an inconsistent approach in the sense
that external test set predictions are made without including information
about these compounds in the model development stage.
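
The two CV schemes can be sketched in Python as follows. This is only an
illustration under stated assumptions: the data are invented, the model is
a one-latent-variable PLS1 written out by hand (not the Tripos, GOLPE or
Umetri code), and q2 is taken as 1 - PRESS/SS about the overall y mean.

```python
import statistics

def column_stats(rows):
    """Per-column mean and sample SD of a list of descriptor rows."""
    cols = list(zip(*rows))
    return [statistics.mean(c) for c in cols], [statistics.stdev(c) for c in cols]

def scale(row, means, sds):
    return [(x - m) / s for x, m, s in zip(row, means, sds)]

def fit_pls1_one_lv(X, y):
    """One-latent-variable PLS1 on already-scaled X; returns a predictor."""
    y_mean = statistics.mean(y)
    yc = [v - y_mean for v in y]
    # Weight vector w is proportional to X^T y, normalised to unit length.
    w = [sum(row[j] * yi for row, yi in zip(X, yc)) for j in range(len(X[0]))]
    norm = sum(v * v for v in w) ** 0.5
    w = [v / norm for v in w]
    t = [sum(a * b for a, b in zip(row, w)) for row in X]  # scores
    b = sum(ti * yi for ti, yi in zip(t, yc)) / sum(ti * ti for ti in t)

    def predict(row):
        return y_mean + b * sum(a * c for a, c in zip(row, w))
    return predict

def loo_press(X, y, rescale_each_fold):
    """PRESS from LOO CV, rescaling per fold or reusing the n-row stats."""
    global_means, global_sds = column_stats(X)
    press = 0.0
    for i in range(len(X)):
        X_in, y_in = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        if rescale_each_fold:
            means, sds = column_stats(X_in)        # kept-in rows only
        else:
            means, sds = global_means, global_sds  # left-out row leaks in
        predict = fit_pls1_one_lv([scale(r, means, sds) for r in X_in], y_in)
        press += (y[i] - predict(scale(X[i], means, sds))) ** 2
    return press

def q2(press, y):
    y_mean = statistics.mean(y)
    return 1.0 - press / sum((v - y_mean) ** 2 for v in y)

# Hypothetical data: 8 compounds, 2 descriptors.
X = [[1.0, 10.0], [2.0, 12.0], [3.0, 9.0], [4.0, 15.0],
     [5.0, 14.0], [6.0, 18.0], [7.0, 17.0], [8.0, 30.0]]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 7.2, 7.8]

press_refit = loo_press(X, y, rescale_each_fold=True)
press_fixed = loo_press(X, y, rescale_each_fold=False)
print(f"q2, rescaled each CV cycle: {q2(press_refit, y):.3f}")
print(f"q2, fixed n-row scaling:    {q2(press_fixed, y):.3f}")
```

The only difference between the two runs is where the means/stdevs come
from; everything else in the CV loop is identical, which makes it easy to
check how much a given dataset is affected.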

The only argument I know of in favour of the Umetri CV approach is that the
omission of a single row may dramatically change the univariate characteristics
of one or more X columns and thus lead to "strange results" (Thibaut et al,
QSAR Journal, 13, 1-3 (1994)). Presumably, these authors mean that an
inappropriate number of PLS LVs may be chosen as optimal and, therefore,
external test set predictions may be poor. It is probably appropriate to
analyse one's data for such columns and determine the influence such
columns have on the final model (i.e. omit that/those column/s and compare q2
and optimal LV nos. etc). I don't know of any literature on this, however.

The most significant influence on PLS CV results is generally the level of
row/compound redundancy. Sparse, designed training sets will tend to
produce poor q2 scores (but may have very good external predictivity) while
highly redundant datasets will tend to give high q2s, particularly (but not solely)
where LOO CV rather than LNO CV is used. So it can be useful to do some
experimental design and then add some redundancy (!). So, the "real" test
of one's model must always be performance with external test compounds.
Also, look at the effects of choosing one more or one less LV on your
predictions, together with a search for leverage (outliers) in both training
and test sets.

Hope this helps, and sorry if this is a little unstructured and rambling ...

Dave

> Dear CCLers,
> 
> A question on cross-validation and prediction in QSAR models using PLS:
> 
> A requisite for the data is to be standardized (mean zero, sdv one). To
> check the statistical significance of a predictive model the
> leave-one-out (LOO) cross-validation is usually performed. When a
> molecule is removed, a new standardization needs to be computed for the
> remaining n-1 compounds, and the new mean and sdv will be applied to the
> extracted molecule. However, for the usual sample sizes (n=30 to 150)
> the error committed in equating the mean/sdv for the n-1 and n
> molecules is not negligible, and it clearly affects the results of the
> prediction for the removed molecule. The same can be said about the
> prediction of the property for a test set: the mean/sdv of the training
> set QSAR model will be applied to the test set in order to have
> standardized descriptors, but the mean/sdv if we included all the
> molecules (training+test) may be very different.
> 
> I am afraid that LOO cross-validation is sometimes calculated simply by
> recomputing the regression coefficients, keeping fixed the scores of the
> n molecules model, which obviously overrates the results. Any opinions?
> 
 

Dr David Turner
Dept of Information Studies, Sheffield University
Sheffield, S10 2TN, UK        Tel. 0114 2 222 650
E-mail: D.Turner@sheffield.ac.uk
Fax: 0114 2 780 300
From chemistry-request@server.ccl.net  Tue Sep  7 09:58:47 1999
Received: from bacchus.pc1.uni-duesseldorf.de (root@bacchus.pc1.uni-duesseldorf.de [134.99.152.11])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id JAA05377
	for <CHEMISTRY@ccl.net>; Tue, 7 Sep 1999 09:58:40 -0400
Received: from bacchus.pc1.uni-duesseldorf.de (jochen@localhost [127.0.0.1])
	by bacchus.pc1.uni-duesseldorf.de (8.8.7/8.8.7) with SMTP id PAA06982;
	Tue, 7 Sep 1999 15:53:45 +0200
From: Jochen <jochen@uni-duesseldorf.de>
Reply-To: jochen@bacchus.pc1.uni-duesseldorf.de
Organization: Heinrich-Heine-Universität
To: Victor Anisimov <victor@fqspl.com.pl>, CHEMISTRY <CHEMISTRY@ccl.net>
Subject: Re: CCL:Request SUN Share/Freeware Fortran Compiler
Date: Tue, 7 Sep 1999 15:49:49 +0200
X-Mailer: KMail [version 1.0.28]
Content-Type: text/plain
References: <001701bef919$c2f61b80$1793f4d4@victor.fqspl.com.pl>
In-Reply-To: <001701bef919$c2f61b80$1793f4d4@victor.fqspl.com.pl>
MIME-Version: 1.0
Message-Id: <99090715534502.05823@bacchus.pc1.uni-duesseldorf.de>
Content-Transfer-Encoding: 8bit
X-MIME-Autoconverted: from quoted-printable to 8bit by server.ccl.net id JAA05378

On Tue, 07 Sep 1999 Victor Anisimov wrote:

>:>How can I obtain a Shareware or Freeware Fortran Compiler for SUN
>:>workstations?
>:
>:Look at any GNU ftp archive (e.g. ftp.gnu.org) and get gcc-2.95.1, it
>:contains a good FORTRAN 77 compiler.
>
>Dear Netters, let me add a few words on this question. I appreciate the
>GNU enthusiasm for open-source FORTRAN 77 development, but in my
>experience it is still a very crude FORTRAN compiler, especially on
>non-x86 platforms. On a non-x86 platform (which SUN is) one can compile
>only small Fortran programs; compilation of large programs like
>MOPAC2000 will fail: it creates a non-running executable on SUN Sparc.

I am using the new egcs/gcc-2.95 on my AlphaStation without any problems!
I am also using it on an old SPARCclassic running Linux, though for nothing
big.
The compiler is also in some regards more compatible and
standard-conforming than vendor compilers.

You should probably check your setup (e.g. binutils, libraries).

Or are there real problems with it on Sun platforms? If so, have you
reported them to the developers?

-- Jochen
From chemistry-request@server.ccl.net  Tue Sep  7 10:15:31 1999
Received: from sunu450.rz.ruhr-uni-bochum.de (sunu450.rz.ruhr-uni-bochum.de [134.147.222.33])
	by server.ccl.net (8.8.7/8.8.7) with SMTP id KAA05470
	for <chemistry@ccl.net>; Tue, 7 Sep 1999 10:15:30 -0400
From: Christoph.van.Wuellen@ruhr-uni-bochum.de
Received: (qmail 17341 invoked from network); 7 Sep 1999 14:11:36 -0000
Received: from sgi249.rz.ruhr-uni-bochum.de (134.147.64.2)
  by mailhost.rz.ruhr-uni-bochum.de with SMTP; 7 Sep 1999 14:11:36 -0000
Received: (qmail 12699 invoked by uid 10283); 7 Sep 1999 14:11:28 -0000
Message-ID: <19990907141128.12698.qmail@sgi249.rz.ruhr-uni-bochum.de>
Subject: Free FORTRAN Compilers
To: chemistry@ccl.net
Date: Tue, 7 Sep 1999 16:11:28 +0200 (MSZ)
X-Mailer: ELM [version 2.4 PL24]
MIME-Version: 1.0
Content-Type: text/plain; charset=US-ASCII
Content-Transfer-Encoding: 7bit

Victor Anisimov posted that GNU Fortran cannot compile large programs
on non-Intel platforms. However, under LinuxPPC on my Power Macintosh G3,
it successfully compiles quantum-chemistry programs like Gaussian98,
TURBOMOLE etc.
You may have to edit a Makefile, though.

---------------------------+------------------------------------------------
Christoph van Wullen       | Fon (University):  +49 234 700 6485
Theoretical Chemistry      | Fax (University):  +49 234 709 4109
Ruhr-Universitaet          | Fon/Fax (private): +49 234 33 22 75 
D-44780 Bochum, Germany    | eMail: Christoph.van.Wuellen@Ruhr-Uni-Bochum.de
---------------------------+------------------------------------------------
From chemistry-request@server.ccl.net  Tue Sep  7 11:35:51 1999
Received: from mail02-oak.pilot.net (mail-oak-2.pilot.net [198.232.147.17])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id LAA05866
	for <CHEMISTRY@ccl.net>; Tue, 7 Sep 1999 11:35:51 -0400
Received: from artemis.chiron.com (unknown-46-3.chiron.com [206.189.46.3] (may be forged)) by mail02-oak.pilot.net with SMTP id IAA05034 for <CHEMISTRY@ccl.net>; Tue, 7 Sep 1999 08:32:04 -0700 (PDT)
Received: from hermes.chiron.com by artemis.chiron.com (SMI-8.6/SMI-SVR4)
	id IAA24876; Tue, 7 Sep 1999 08:46:26 -0700
Received: from electra by hermes.chiron.com (SMI-8.6/SMI-SVR4)
	id IAA28573; Tue, 7 Sep 1999 08:33:10 -0700
Received: (from martine@localhost) by electra (980427.SGI.8.8.8/950213.SGI.AUTOCF) id IAA00546; Tue, 7 Sep 1999 08:31:18 -0700 (PDT)
From: Eric Martin <martine@electra.chiron.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Message-ID: <14293.12230.218814.292290@gargle.gargle.HOWL>
Date: Tue, 7 Sep 1999 08:31:18 -0700 (PDT)
To: CHEMISTRY@ccl.net
Subject: Postscript editor
X-Mailer: VM 6.72 under Emacs 19.34.1

Dear CCL,

I used to have a Mac, where I could edit postscript files either with
a program called "tailor", or a Freehand xtra called "PS Editlink".
Now I have an IBM running NT.  In the CCL archives, I found mention of
the pgdraw program, but I haven't found the program.  Is there a
postscript editor for NT?  Thanks.

Cheers, Eric
-- 
Eric Martin
martine@chiron.com
4560 Horton St., Emeryville, CA 94608
(510)923-3306,  FAX (510)923-3360
From chemistry-request@server.ccl.net  Tue Sep  7 11:32:29 1999
Received: from ccl.net (atlantis.ccl.net [192.148.249.4])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id LAA05846
	for <chemistry@ccl.net>; Tue, 7 Sep 1999 11:32:28 -0400
Received: from gatekeeper.tripos.com (firewall-user@gatekeeper.tripos.com [192.160.145.62])
	by ccl.net (8.8.6/8.8.6/OSC 1.1) with SMTP id LAA09373
	for <chemistry@www.ccl.net>; Tue, 7 Sep 1999 11:28:39 -0400 (EDT)
Received: (from uucp@localhost) by tripos.com (SMI-8.6) id KAA07842; Tue, 7 Sep 1999 10:24:31 -0500
Received: from elara(172.20.5.15) by gatekeeper.tripos.com via smap (V5.0)
	id xma007837; Tue, 7 Sep 99 10:24:16 -0500
Received: from tripos.com (thebe [172.20.6.145]) by tripos.com (980919.SGI.STAND) via ESMTP id KAA04535; Tue, 7 Sep 1999 10:26:25 -0500 (CDT)
Sender: dlarson@elara.tripos.com
Message-ID: <37D52EA1.9FFE69FA@tripos.com>
Date: Tue, 07 Sep 1999 10:26:25 -0500
From: David Larson <dlarson@tripos.com>
Organization: Tripos, Inc.  http://www.tripos.com
X-Mailer: Mozilla 4.6 [en] (X11; U; IRIX 6.3 IP32)
X-Accept-Language: en
MIME-Version: 1.0
To: David Robert <david@stark.udg.es>
CC: "Computational Chemistry List (CCL)" <chemistry@www.ccl.net>
Subject: Re: CCL:cross-validation and prediction with PLS
References: <37D4FCC1.40AF6E5E@stark.udg.es>
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

David Robert wrote:

> Dear CCLers,
> 
> A question on cross-validation and prediction in QSAR models using PLS:
> 
> A requisite for the data is to be standardized (mean zero, sdv one). To
> check the statistical significance of a predictive model the
> leave-one-out (LOO) cross-validation is usually performed. When a
> molecule is removed, a new standardization needs to be computed for the
> remaining n-1 compounds, and the new mean and sdv will be applied to the
> extracted molecule. However, for the usual sample sizes (n=30 to 150)
> the error committed in equating the mean/sdv for the n-1 and n
> molecules is not negligible, and it clearly affects the results of the
> prediction for the removed molecule. The same can be said about the
> prediction of the property for a test set: the mean/sdv of the training
> set QSAR model will be applied to the test set in order to have
> standardized descriptors, but the mean/sdv if we included all the
> molecules (training+test) may be very different.
> 
> I am afraid that LOO cross-validation is sometimes calculated simply by
> recomputing the regression coefficients, keeping fixed the scores of the
> n molecules model, which obviously overrates the results. Any opinions?

One does indeed need to be careful in using PLS packages to find out
whether *full* cross-validation is done, and it becomes a particularly
relevant point when trying to compare published statistics with what you
obtain yourself.  The PLS cross-validation procedure used for CoMFA in
SYBYL QSAR has been described in detail[1].  There, as should always be
the case, each cross-validation PLS is calculated "from scratch" in that
the reduced data matrix is re-centered and rescaled for each
cross-validation run (rescaling does not affect CoMFA much, however,
because each molecular field is scaled as a block).  This almost always
produces a more conservative (and less biased) *estimate* of q^2 [2]. 
Note that the actual predictivity of the model [3] is a *population*
parameter and is independent of how the sample statistic q^2 is
calculated.

My own belief is that the standard error of prediction is what should
always be the primary criterion used for comparing models, whether it is
from cross-validation or for external test sets.  Predictive correlation
coefficients (q^2's) are dangerous because they can be very dependent on
the distribution of the training set used.  Pre-selecting extrema as a
training set and using more centrally distributed compounds as the test
set, for example, will usually "improve" q^2 even if the error of
prediction is actually higher.  This is because the training set SD
increases even more than the SE of prediction.  This is similar to the
reason that the SD of the test set should not be used in calculating
predictivity. 
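
A small Python sketch of this point, assuming invented cross-validation
residuals for two hypothetical training sets (one centrally distributed,
one made of extrema) and the common conventions q^2 = 1 - PRESS/SS and
SE of prediction = sqrt(PRESS/n):

```python
import math
import statistics

def se_prediction(residuals):
    """Root-mean-square error of prediction, sqrt(PRESS / n)."""
    return math.sqrt(sum(r * r for r in residuals) / len(residuals))

def q2(residuals, y_train):
    """1 - PRESS / SS, with SS taken about the training-set mean."""
    y_mean = statistics.mean(y_train)
    ss = sum((v - y_mean) ** 2 for v in y_train)
    return 1.0 - sum(r * r for r in residuals) / ss

# Hypothetical activities: a centrally distributed set and a set of extrema.
y_central = [4.5, 4.8, 5.0, 5.1, 5.3, 5.6]
y_extreme = [2.0, 2.5, 3.0, 7.0, 7.5, 8.0]

# Hypothetical CV residuals; the extrema set is predicted 50% worse.
res_central = [0.5, -0.4, 0.6, -0.5, 0.4, -0.6]
res_extreme = [1.5 * r for r in res_central]

for name, res, yv in (("central", res_central, y_central),
                      ("extreme", res_extreme, y_extreme)):
    print(f"{name}: SE of prediction = {se_prediction(res):.2f}, "
          f"q2 = {q2(res, yv):.2f}")
```

The extrema set has the larger error of prediction and yet the higher q^2,
simply because its SS term is so much bigger.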

A related, more subtle concern arises when variable selection is
done.  There, observations should properly be left out from the very
start of the process and multiple variable selections run and compared. 
Unfortunately, cross-validation is sometimes only run after the variable
set has already been selected.
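
The difference between the two procedures can be sketched in Python. The
data are invented, and the "variable selection" is reduced to its simplest
form (pick the single descriptor best correlated with y, then fit a
straight line); it stands in for whatever selection scheme is actually
used and is not taken from any package.

```python
import statistics

def pearson(xs, ys):
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / (sxx * syy) ** 0.5

def best_column(X, y):
    """Index of the descriptor with the largest |r| against y."""
    cols = list(zip(*X))
    return max(range(len(cols)), key=lambda j: abs(pearson(cols[j], y)))

def fit_line(xs, ys):
    """Least-squares slope and intercept for one descriptor."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    slope = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
             / sum((a - mx) ** 2 for a in xs))
    return slope, my - slope * mx

def loo_press(X, y, select_inside_cv):
    """LOO PRESS with selection redone per fold, or done once on all rows."""
    fixed_j = best_column(X, y)  # uses every compound, left-out ones included
    press, chosen = 0.0, []
    for i in range(len(X)):
        X_in, y_in = X[:i] + X[i + 1:], y[:i] + y[i + 1:]
        j = best_column(X_in, y_in) if select_inside_cv else fixed_j
        chosen.append(j)
        slope, intercept = fit_line([r[j] for r in X_in], y_in)
        press += (y[i] - (slope * X[i][j] + intercept)) ** 2
    return press, chosen

# Hypothetical data: 8 compounds, 3 candidate descriptors.
X = [[0.2, 1.0, 5.1], [0.9, 1.5, 4.2], [1.1, 0.7, 6.3], [1.8, 2.2, 3.9],
     [2.2, 1.9, 5.5], [2.9, 2.8, 4.8], [3.1, 2.4, 6.0], [4.0, 3.5, 4.4]]
y = [0.5, 1.0, 1.2, 2.1, 1.9, 3.2, 2.8, 4.1]

press_in, chosen_in = loo_press(X, y, select_inside_cv=True)
press_out, chosen_out = loo_press(X, y, select_inside_cv=False)
print("selection inside CV:  PRESS =", round(press_in, 3), "columns", chosen_in)
print("selection before CV:  PRESS =", round(press_out, 3), "columns", chosen_out)
```

Comparing the columns chosen in each fold also shows how stable the
selection is, which is the "multiple variable selections run and compared"
part of the recommendation.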

Bob Clark  

*******************************************************************************   

[1] RD Cramer III, SA DePriest, DE Patterson & P Hecht: The Developing
Practice of Comparative Molecular Field Analysis. In "3D QSAR in Drug
Design: Theory, Methods and Applications," H. Kubinyi ed.; ESCOM,
Leiden, 1993; p 461.

[2] H Kubinyi & U Abraham: Practical Problems in PLS Analyses.  In "3D
QSAR in Drug Design: Theory, Methods and Applications," H. Kubinyi ed.;
ESCOM, Leiden, 1993; p 722.

[3] Would that be gamma^2, or perhaps xi^2? I have never seen a name for
this, in the way that rho^2 corresponds to r^2.


**************************************************************************
**************************************************************************
**                                                                      **
**    Dr. Robert D. Clark                     bclark@tripos.com         **
**    Senior Development Scientist            phone: 314-647-1099       **
**    Tripos Inc.                               fax: 314-647-9241       **
**    1699 S. Hanley Road             _____                             **
**    St. Louis MO 63144             /     \                            **
**                                  ||      |                           **
**                                 C   O \ O |                          **
**                                  ||  (_> |                           **
**                                  \ \/--\/)                           **
**                                   \_____/                            **
**                                                                      **
**************************************************************************
**************************************************************************
From chemistry-request@server.ccl.net  Tue Sep  7 12:13:45 1999
Received: from server.nybc.org (server.nybc.org [192.94.249.2])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id MAA06194
	for <chemistry@ccl.net>; Tue, 7 Sep 1999 12:13:44 -0400
Received: from iris.nybc.org ([192.94.249.42]) by server.nybc.org
          (Netscape Mail Server v2.0) with ESMTP id AAA10036;
          Tue, 7 Sep 1999 12:09:41 -0400
Received: from nybc.org by iris.nybc.org via ESMTP (980427.SGI.8.8.8/940406.SGI.AUTO)
	 id MAA29711; Tue, 7 Sep 1999 12:10:17 -0400 (EDT)
Sender: debnath@nybc.org
Message-ID: <37D538E8.1FD406AC@nybc.org>
Date: Tue, 07 Sep 1999 12:10:16 -0400
From: Asim Kumar Debnath <adebnath@nybc.org>
Organization: New York Blood Center
X-Mailer: Mozilla 4.51C-SGI [en] (X11; I; IRIX 6.5 IP22)
X-Accept-Language: en
MIME-Version: 1.0
To: Iraj Daizadeh <daizadeh@nucleus.harvard.edu>
CC: chemistry@ccl.net
Subject: Re: CCL:ReferenceKuntz.
References: <Pine.SUN.4.10.9909032236200.12359-100000@nucleus.harvard.edu>


Hi Iraj:
    Following is the title of the paper:

An approach to the tertiary structure of globular proteins.
J Am Chem Soc. 1975 Jul 23;97(15):4362-6.

Asim

Iraj Daizadeh wrote:

> Hello.
>
> Sorry to pose this question:
>
> Would anyone have the title to the following
> paper:
>
> Kuntz, I.D. (1975) JACS 97:4362-4366....
>
> Libs shut down for this weekend...not on Kuntz's
> webpage either...
>
> Thanks.... again my apologies
> Iraj


=======================================================================

             ***
             ***
            ****                Asim K. Debnath, Ph.D.
           ****                 Lindsley F. Kimball Research Institute
          ****  ***             The New York Blood Center
         ****    ****           310 E 67 Th Street
        ****  **  ****          New York, NY 10021
       ****  ****  ****         Tel. (212) 570-3373
      ****    **    ****        Fax. (212) 570-3299
       ****************         E-mail: adebnath@nybc.org
        **************

========================================================================

From chemistry-request@server.ccl.net  Tue Sep  7 12:42:20 1999
Received: from nmrc.ucc.ie (nmrc.ucc.ie [143.239.64.1])
	by server.ccl.net (8.8.7/8.8.7) with ESMTP id MAA06482
	for <chemistry@ccl.net>; Tue, 7 Sep 1999 12:42:19 -0400
Received: from mailhost.nmrc.ucc.ie (gateway.nmrc.ucc.ie [10.1.64.2])
	by nmrc.ucc.ie (Postfix) with ESMTP id 5CAB51677
	for <chemistry@ccl.net>; Tue,  7 Sep 1999 17:38:38 +0100 (BST)
Received: from rennes.nmrc.ucc.ie (rennes.nmrc.ucc.ie [10.1.65.35])
	by mailhost.nmrc.ucc.ie (Postfix) with ESMTP id 94C8A7C
	for <chemistry@ccl.net>; Tue,  7 Sep 1999 17:36:57 +0100 (BST)
Date: Tue, 7 Sep 1999 17:36:57 +0100
From: Michael Nolan <mnolan@nmrc.ucc.ie>
To: ccl <chemistry@ccl.net>
Subject: [Christoph.van.Wuellen@ruhr-uni-bochum.de: CCL:Free FORTRAN Compilers]
Message-ID: <19990907173657.A18589@rennes.nmrc.ucc.ie>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Mailer: Mutt 0.95.5i

I agree with Christoph,

Our version of G77 running on a variety of SUN workstations was well
able to compile the COLUMBUS program system and in general works very well...
Some modification of the compiler options was necessary, but it only needed to be
done once.

regards

michael

********************************************
Victor Anisimov posted that GNU Fortran cannot compile large programs
on non-Intel platforms. However, under LinuxPPC on my Power Macintosh G3,
it successfully compiles quantum-chemistry programs like Gaussian98,
TURBOMOLE etc.
You may have to edit a Makefile, though.
**********************************************************************
-- 
**************************************************************************
Mr. Michael Nolan BSc. MEngSc.	
Materials Modelling Section, Advanced Materials and Technologies Group
National Microelectronics Research Centre  	
Lee Maltings, Prospect Row				   	
Cork, IRELAND

mail: mnolan@nmrc.ucc.ie
Tel:   + 353 21 904113; Fax: +353 21 270271

http://nmrc.ucc.ie/groups/AMT/modellingRes.html
***************************************************************************
