From engel@shum.cc.huji.ac.il  Wed Nov 24 08:51:28 1993
Received: from shum.cc.huji.ac.il  for engel@shum.cc.huji.ac.il
	by www.ccl.net (8.6.4/930601.1506) id IAA20252; Wed, 24 Nov 1993 08:37:32 -0500
Received: by shum.cc.huji.ac.il id AA08833
  (5.65cHU/IDA-1.4.4 for chemistry@ccl.net); Wed, 24 Nov 1993 15:37:22 +0200
Date: Wed, 24 Nov 1993 15:30:31 +0200 (GMT+0200)
From: "Michael Engel (The Hebrew University)" <engel@shum.cc.huji.ac.il>
Subject: AVS interface to ab-initio packages?
To: chemistry@ccl.net
Message-Id: <Pine.3.87.01.9311241530.A8576-0100000@shum.cc.huji.ac.il>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII



Hello,
I would like to find out if there is a group of
modules workable in AVS that processes results from
ab-initio calculations (Gaussian-90, MOPAC, AMPAC,
Gamess). Is there such a package (free or commercial)?
Thanks in advance,
Michael (please answer to engel@shum.huji.ac.il)

24 November 1993.




From jaeger@Kodak.COM  Wed Nov 24 09:50:14 1993
Received: from Kodak.COM  for jaeger@Kodak.COM
	by www.ccl.net (8.6.4/930601.1506) id JAA20507; Wed, 24 Nov 1993 09:19:58 -0500
Received: from bcc0.kodak.com by Kodak.COM (5.61+/2.1-Eastman Kodak)
	id AA08973; Wed, 24 Nov 93 09:21:26 -0500
Reply-To: jaeger@Kodak.COM
Received: by bcc0.kodak.com (4.1/SMI-4.1)
	id AA01754; Wed, 24 Nov 93 09:17:13 EST
Date: Wed, 24 Nov 93 09:17:13 EST
From: jaeger@Kodak.COM (Ed Jaeger)
Message-Id: <9311241417.AA01754@bcc0.kodak.com>
To: PCJ@PSUVM.PSU.EDU
Cc: chemistry@ccl.net
In-Reply-To: <PCJ@PSUVM.PSU.EDU>'s message of Mon, 15 Nov 1993 14:38 -0500 (EST) <01H5CJRX9DSO8XDW13@phem3.acs.ohio-state.edu>
Subject: Question about neural networks


Peter,

I posted a note to a stat mailing list to see what answers they might be
able to provide.  The response was not overwhelming.  I provide my
mail and the two responses below.  I don't believe that they will be
much help.  It might be an area where bootstrapping/crossvalidation
would be applicable, assuming the cpu costs were not too overwhelming.



Ed Jaeger <jaeger@kodak.com> wrote:
   A colleague is using neural nets to build a nonlinear model of the
   relationship between chemical structure and toxicity.  He is
   looking for diagnostic methods to seek "outliers" in his training
   set.

   Can anyone suggest such diagnostic tools or point me to where I
   might find this information?

reinhard@iso.wwz.unibas.ch wrote:
   Hi Ed,

   I'm not too deep into diagnostics yet, but the two standard
   references,

   CHATTERJEE,S.,HADI,A.S.: Sensitivity Analysis in Linear Regression.
	Wiley 1988.
   BELSLEY,D.A.,KUH,E.,WELSCH,R.E.: Regression Diagnostics:
	Identifying Influential Data and Sources of Collinearity.
	Wiley 1980.

   define outliers as observations with a very high value of some kind
   of residual.  They recommend computing the externally studentized
   residual (residual divided by leave-one-out standard deviation
   times square root of 1 - diagonal element of hat-matrix), for its
   distribution is known in a linear model.
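
The externally studentized residual described above can be sketched for the simplest case, a straight-line fit with one predictor. This is only an illustration under stated assumptions: the function name and the toy data are made up here, the model is ordinary least squares with an intercept and a slope, and the leave-one-out variance is obtained from the standard deletion identity rather than by refitting.

```python
import math

def externally_studentized(x, y):
    """Externally studentized residuals for a straight-line least-squares fit."""
    n = len(x)
    p = 2  # fitted parameters: intercept and slope
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    slope = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    intercept = ybar - slope * xbar
    resid = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Diagonal elements of the hat matrix for simple linear regression.
    hat = [1.0 / n + (xi - xbar) ** 2 / sxx for xi in x]
    rss = sum(e * e for e in resid)
    result = []
    for e, h in zip(resid, hat):
        # Residual sum of squares of the fit with this observation left out.
        rss_deleted = rss - e * e / (1.0 - h)
        s_deleted = math.sqrt(rss_deleted / (n - p - 1))
        result.append(e / (s_deleted * math.sqrt(1.0 - h)))
    return result

# Toy data (hypothetical): a line y = 2x + 1 with small noise and one shifted point.
NOISE = [0.1, -0.2, 0.05, 0.0, -0.1, 0.0, 0.15, -0.05, 0.1, -0.1]
X = [float(i) for i in range(10)]
Y = [2.0 * xi + 1.0 + ni for xi, ni in zip(X, NOISE)]
Y[5] += 10.0
T = externally_studentized(X, Y)
```

The observation with the largest absolute value in T is the first outlier candidate. For a neural network there is no hat matrix in this sense, so the sketch only shows the linear-model diagnostic that the references describe.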

   Once You're in those leave-one-out diagnostics, look up DFFITS in
   ls.diag.

   Multivariate and nonlinear generalisations may move along the line
   of thought: weighting vector instead of deletion -> influence curve
   (basis of robust statistics; I read about it in HUBER: Robust
   Statistics. Wiley 198? and HAMPEL,et.al.: Robust Statistics: The
   Approach Based on Influence Functions. Wiley 1986.)
   -> maximal local change in influence -> likelihood displacement
   as in COOK,R.D.: Assessment of Local Influence. In: J.R.Statist.Soc.B 1986.

   In this way You get diagnostics not for outliers but for
   influential points.  I wrote about it because perhaps that is
   what You need. Perhaps You had better start as everyone did, by
   observing how some output of particular interest changes when one
   or several points are excluded. Afterwards You know whether it is
   rewarding to investigate which observation one has to corrupt
   in order to get results that differ from the real thing by a given
   margin.

   It would be nice to hear, what You implemented

   Reinhard Vonthein, WWZ/ISO, Uni-Basel, Petersgraben 51, CH - 4051
   Basel reinhard@iso.wwz.unibas.ch


wchung@gandalf.rutgers.edu (Woogon Chung) wrote:
   How about detecting outliers using domouSe from Statlib and train your
   neural net?

   Woogon.


ej
Ed Jaeger  Sterling Winthrop Inc., P.O. Box 5000, Collegeville PA 19426-0900 
               {jaeger@kodak.com || ph:215-983-5509 || fax:215-983-5559}


From gene@calv2.cray.com  Wed Nov 24 10:51:38 1993
Received: from cray.com  for gene@calv2.cray.com
	by www.ccl.net (8.6.4/930601.1506) id JAA21008; Wed, 24 Nov 1993 09:57:03 -0500
Received: from eastrg2.cray.com (eastrg2-gate.cray.com) by cray.com (Bob mailer 1.2)
	id AA20382; Wed, 24 Nov 93 08:56:28 CST
Received: by eastrg2.cray.com (4.1/CRI-5.13)
	id AA29417; Wed, 24 Nov 93 09:56:28 EST
Date: Wed, 24 Nov 93 09:56:28 EST
From: gene@calv2.cray.com (Eugene Fleischmann)
Message-Id: <9311241456.AA29417@eastrg2.cray.com>
To: engel@shum.cc.huji.ac.il
Subject: Re:  AVS interface to ab-initio packages?
Cc: chemistry@ccl.net


There is an excellent Graphical User Interface (non-AVS) for
Silicon Graphics workstations, and soon for generic X-Window
workstations, that not only graphically displays results
from ab initio calculations, but also can perform many other
kinds of functions.  Some of these include: building and
importing structures from other programs, setting up input 
decks for Gaussian 92, DGauss, MNDO93, and CADPAC, transparently
submitting, monitoring, and controlling the execution of 
chemistry jobs on the backend computer system.  The software
provides tools so that the user can also add his favorite 
chemistry package.

For more information please call Mark Cole at 612-683-3688.


++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
	Eugene D. Fleischmann, Ph.D.
	Computational Chemist
	Cray Research, Inc.		(609) 252-1250
	121 Commons Way			gene@calv2.cray.com
	Princeton, NJ  08540
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

From DSMITH@uoft02.utoledo.edu  Wed Nov 24 11:00:45 1993
Received: from uoft02.utoledo.edu  for DSMITH@uoft02.utoledo.edu
	by www.ccl.net (8.6.4/930601.1506) id KAA21262; Wed, 24 Nov 1993 10:27:10 -0500
Received: from UOFT02.UTOLEDO.EDU by UOFT02.UTOLEDO.EDU (PMDF V4.2-10 #3438) id
 <01H5OV27DR8W00183T@UOFT02.UTOLEDO.EDU>; Wed, 24 Nov 1993 10:26:37 EST
Date: Wed, 24 Nov 1993 10:26:37 -0500 (EST)
From: "DR. DOUGLAS A. SMITH, UNIVERSITY OF TOLEDO" <DSMITH@uoft02.utoledo.edu>
Subject: Re: AVS interface to ab-initio packages?
To: engel@shum.cc.huji.ac.il
Cc: chemistry@ccl.net
Message-id: <01H5OV27EK6A00183T@UOFT02.UTOLEDO.EDU>
X-Envelope-to: chemistry@ccl.net
X-VMS-To: IN%"engel@shum.cc.huji.ac.il"
X-VMS-Cc: CHEMISTRY
MIME-version: 1.0
Content-transfer-encoding: 7BIT


The AVS Chemistry Viewer, currently sold by Molecular Simulations, Inc.
(but this will change within a week or so -- keep your eyes open for an
announcement) is designed to connect quantum mechanical programs with
visualization.  Currently the Viewer will read files from G90, G92, MOPAC
(and, by default AMPAC), Molfiles, and BGF files (the biograf format).
It will also write input files for these, allow users to sketch molecules
or read them in (PDB format is also supported), manipulate them, etc. The
program also reads the output files from these programs and will allow
you to examine densities, orbitals, surfaces, volumes, and much more.

After the announcement (see above), the AVS Chemistry Viewer will be 
expanded.  In the near future, look for it to read output from CADPAC,
GAMESS, Spartan, G88, AMPAC 4.5 and more.

Doug

Douglas A. Smith
Assistant Professor
Department of Chemistry
 and member,
Center for Drug Design and Development
The University of Toledo
Toledo, OH  43606-3390

voice    419-537-2116
fax      419-537-4033
email    dsmith@uoft02.utoledo.edu


From frederik@pollux.acs.uci.edu  Wed Nov 24 11:51:40 1993
Received: from pollux.acs.uci.edu  for frederik@pollux.acs.uci.edu
	by www.ccl.net (8.6.4/930601.1506) id LAA22070; Wed, 24 Nov 1993 11:18:09 -0500
Received: from localhost by pollux.acs.uci.edu with SMTP id AA22390
  (5.65b/IDA-1.4.4 for chemistry@ccl.net); Wed, 24 Nov 93 08:17:41 -0800
To: "Michael Engel (The Hebrew University)" <engel@shum.cc.huji.ac.il>
Cc: chemistry@ccl.net
Subject: Re: AVS interface to ab-initio packages? 
In-Reply-To: Your message of Wed, 24 Nov 93 15:30:31 +0200.
Date: Wed, 24 Nov 93 08:17:40 -0800
Message-Id: <22387.754157860@pollux.acs.uci.edu>
From: "Donald M. Frederick" <frederik@uci.edu>


> Hello,
> I would like to find out if there is a group of
> modules workable in AVS that processes results from
> ab-initio calculations (Gaussian-90, MOPAC, AMPAC,
> Gamess). Is there such a package (free or commercial)?
> Thanks in advance,
> Michael (please answer to engel@shum.huji.ac.il)
> 
> 24 November 1993.

AVS has a special package that is available as an extra cost add-on to the
regular AVS software. The package provides a graphical front-end/back-end
for both Gaussian and MOPAC. We have some users in our Chemistry Dept. who
were quite impressed by it. Contact your local AVS rep for more info.


------------------------------------------------------------------------
	Donald Frederick       |      Office of Academic Computing
	frederik@uci.edu       |      University of California, Irvine
	     asc@uci.edu       |      Irvine, CA  92717
    	(714) 725-3200	       |      FAX (714) 725-2069
-----------------------------------------------------------------------

From shenkin@still3.chem.columbia.edu  Wed Nov 24 11:56:05 1993
Received: from mailhub.cc.columbia.edu  for shenkin@still3.chem.columbia.edu
	by www.ccl.net (8.6.4/930601.1506) id LAA22325; Wed, 24 Nov 1993 11:39:30 -0500
Received: from still3.chem.columbia.edu by mailhub.cc.columbia.edu with SMTP id AA10232
  (5.65c+CU/IDA-1.4.4/HLK for chemistry@ccl.net); Wed, 24 Nov 1993 11:27:00 -0500
Received: by still3.chem.columbia.edu (930416.SGI/930416.SGI.AUTO)
	for @cunixf.cc.columbia.edu:davide@stinch0.csmtbo.mi.cnr.it id AA19836; Wed, 24 Nov 93 11:25:28 -0500
Date: Wed, 24 Nov 93 11:25:28 -0500
From: shenkin@still3.chem.columbia.edu (Peter Shenkin)
Message-Id: <9311241625.AA19836@still3.chem.columbia.edu>
To: chemistry@ccl.net, davide@stinch0.csmtbo.mi.cnr.it
Subject: Re:  AIX sendmail is DANGEROUS !!!



Davide Proserpio wrote to me asking about the availability of the
fixed SGI sendmail that I referred to in an earlier posting in this
thread.  In anticipation of general interest, I thought I'd copy my
reply to the comp. chem. list:

> From: davide@stinch0.csmtbo.mi.cnr.it
> 
>   I have an SGI and I would like to know more about the new
> sendmail from SGI and its availability (via ftp?)

You can login to site "ftp.sgi.com" as user "anonymous".  If you then
"get" the README file, it will tell you where the sendmail of interest
to you is located.  The appropriate excerpts from this file follow:

	~ftp/sgi/IRIX5.0            (Fixes)
		sendmail/       IRIX sendmail w/security fixes.

	~ftp/sgi/IRIX4.0            (Fixes)
		sendmail/       IRIX sendmail w/security fixes.
 
Obviously (I hope!), you need to get the right version for your
IRIX version.  To do this, "cd" into the appropriate "sendmail"
directory, "mget *", and read the README that comes over.

To install, you'll need to kill any existing sendmail processes,
replace the old executable with the new one, and possibly also
replace your sendmail.cf with a new one from the bugfix distribution --
though that might not, in fact, be necessary;  I'm not sure.  Then
you'll have to start the new sendmail.

Hope this helps....

	-P.
************************f*u*cn*rd*ths*u*cn*gt*a*gd*jb************************
Peter S. Shenkin, Box 768 Havemeyer Hall, Dept. of Chemistry, Columbia Univ.,
New York, NY  10027;  shenkin@still3.chem.columbia.edu;  (212) 854-5143
********************** Atheist: an evangelical agnostic. ********************


From urquhart@mcmail.cis.mcmaster.ca  Wed Nov 24 12:51:32 1993
Received: from mcmail.cis.mcmaster.ca  for urquhart@mcmail.cis.mcmaster.ca
	by www.ccl.net (8.6.4/930601.1506) id MAA22833; Wed, 24 Nov 1993 12:25:07 -0500
Received: by mcmail.cis.mcmaster.ca id AA18423
  (5.65c/IDA-1.4.4 for chemistry@ccl.net); Wed, 24 Nov 1993 12:25:50 -0500
Date: Wed, 24 Nov 1993 12:23:54 -0500 (EST)
From: stephen urquhart <urquhart@mcmail.cis.mcmaster.ca>
Subject: Quantum Chemistry Literature Data Base - remote access?
To: chemistry@ccl.net
Message-Id: <Pine.3.07.9311241254.A17988-9100000@mcmail>
Mime-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII



	Is it possible to gain telnet access to the Quantum Chemistry
Literature Data Base? My meager needs do not justify setting it up on a
local computer. If it is possible, where, and how much does it cost?

Thanks,
S...

Stephen Urquhart, Dept. of Chemistry, McMaster University 
urquhart@mcmaster.ca  Phone:(905) 525-9140 x24864 Fax:(905) 521-2773



From mercie@med.cornell.edu  Wed Nov 24 13:50:28 1993
Received: from cumc.cornell.edu  for mercie@med.cornell.edu
	by www.ccl.net (8.6.4/930601.1506) id NAA23467; Wed, 24 Nov 1993 13:29:57 -0500
Received: from localhost (mercie@localhost) by cumc.cornell.edu (8.6.4/ECH1.13) id NAA22382; Wed, 24 Nov 1993 13:28:31 -0500
Date: Wed, 24 Nov 1993 13:15:40 -0500 (EST)
From: Gustavo Mercier <mercie@med.cornell.edu>
Subject: prolate / oblate spheroidal coordinates
To: chemistry@ccl.net
Message-ID: <Pine.3.87.9311241340.B21611-0100000@med.cornell.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII



Hi, Netters!

Although the question below may seem a simple one that should be answered 
with a trip to the library, it proved more difficult than expected!

I am writing a program to do some electrostatic computations and association
dynamics. To take advantage of symmetry I need to use prolate and oblate 
spheroidal coordinates. Unfortunately, I don't have the formulae for 
conversion between Cartesian and these coordinate systems. More importantly,
there are apparently several definitions of these coordinate systems.

I am following papers on the solution of Laplace's equation with ellipsoidal
symmetry, as described by Beveridge in the late '70s. They give a definition
of prolate spheroidal coordinates as follows:

	lambda = (|r-ra| + |r-rb|) / 2d
	mu = (|r-ra| - |r-rb|) / 2d
	phi = angle of rotation around the major axis.

r = point in space;
ra = focus 1
rb = focus 2
2d = distance between foci

These coordinates have the following boundaries:

	1 <= lambda < infinity
	-1 <= mu <= 1
	0 <= phi < 2 Pi

The conversion of these to oblate spheroidal is not clear to me.
Can anybody help?

Note that this definition of prolate spheroidal coordinates does not follow
other conventions. Specifically, the Package VectorAnalysis.m in Mathematica
follows a very different convention and formulae!
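
For the prolate definition quoted above, the conversion to and from Cartesian coordinates can be sketched as follows. This is a minimal sketch under assumptions that are not in the papers: the foci are placed on the z axis at (0, 0, -d) and (0, 0, +d), with ra the focus at -d, and the function names are made up for illustration. With that placement, |r-ra| = d(lambda + mu) and |r-rb| = d(lambda - mu), which gives z = d lambda mu and a distance from the axis of d sqrt((lambda^2 - 1)(1 - mu^2)). Since |r-ra| + |r-rb| can never be smaller than the focal distance 2d, lambda cannot be smaller than 1.

```python
import math

def prolate_to_cartesian(lam, mu, phi, d):
    """(lambda, mu, phi) -> (x, y, z), foci assumed at (0, 0, -d) and (0, 0, +d)."""
    # max() guards against a tiny negative value caused by rounding near lam = 1.
    rho = d * math.sqrt(max(0.0, (lam * lam - 1.0) * (1.0 - mu * mu)))
    return rho * math.cos(phi), rho * math.sin(phi), d * lam * mu

def cartesian_to_prolate(x, y, z, d):
    """(x, y, z) -> (lambda, mu, phi) using the two focal distances."""
    ra = math.sqrt(x * x + y * y + (z + d) ** 2)  # distance to focus 1 at z = -d
    rb = math.sqrt(x * x + y * y + (z - d) ** 2)  # distance to focus 2 at z = +d
    lam = (ra + rb) / (2.0 * d)
    mu = (ra - rb) / (2.0 * d)
    phi = math.atan2(y, x) % (2.0 * math.pi)
    return lam, mu, phi

# Round trip for an arbitrary point and half focal distance d = 1.5.
LAM, MU, PHI = cartesian_to_prolate(1.0, 2.0, 3.0, 1.5)
BACK = prolate_to_cartesian(LAM, MU, PHI, 1.5)
```

One widely used oblate convention keeps z = d lambda mu but replaces (lambda^2 - 1) by (lambda^2 + 1), with lambda >= 0. Whether that is the convention the papers intend would have to be checked against them.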

gus mercier
mercie@cumc.cornell.edu


From mercie@med.cornell.edu  Wed Nov 24 13:51:25 1993
Received: from cumc.cornell.edu  for mercie@med.cornell.edu
	by www.ccl.net (8.6.4/930601.1506) id NAA23378; Wed, 24 Nov 1993 13:14:13 -0500
Received: from localhost (mercie@localhost) by cumc.cornell.edu (8.6.4/ECH1.13) id NAA22111; Wed, 24 Nov 1993 13:12:47 -0500
Date: Wed, 24 Nov 1993 12:56:28 -0500 (EST)
From: Gustavo Mercier <mercie@med.cornell.edu>
Subject: hondo in sgi, repeat
To: chemistry@ccl.net
Message-ID: <Pine.3.87.9311241228.A21611-0100000@med.cornell.edu>
MIME-Version: 1.0
Content-Type: TEXT/PLAIN; charset=US-ASCII



Hi, Netters!

Sorry! Apparently my previous message, which contained an attachment describing
the steps to implement HONDO 8.4 on an INDIGO R4000 under IRIX 4.0.5F, got corrupted
on some systems.

Given the interest in this subject, the message is reproduced below.
I would encourage people who port software to different systems
to keep track of the changes. Simple files like the one below can help
developers make programs more compatible, and end-users can implement
programs without reinventing the wheel!


Michel Dupuis can be reached at: michel@kgnvmt.vnet.ibm.com

good luck
mercie@cumc.cornell.edu

****************************************************************************
           READ THE WHOLE MESSAGE FIRST!
modification 1:
ISINGL = 2 for 32 bit machine
VECTOR = .FALSE.
	files affected
ctl.f
comment: see documentation
####
modification 2:
excluded file from loading phase:
	files affected:
vec.f
comment: see documentation
####
modification 3:
applied "1,$s/@PROCESS/c @PROCESS/g" to comment out the lines using vi editor

	files affected with number of instances that the change applied:
ctl.f:6
ci1.f:3
ci2.f:3
der.f:2
hss.f:5
int.f:3
mp2.f:15
mp4.f:6
ntn.f:2
scf.f:8
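
For anyone who prefers a script to the editor, the same substitution can be sketched in Python. This is only an illustration: the function name and the sample lines are made up, and it mirrors the vi command literally, replacing every occurrence of @PROCESS. Running it twice on the same text would therefore add the prefix twice.

```python
import re

def comment_out_process(source):
    """Return (new_text, count) after the same replacement as the vi command."""
    return re.subn(r"@PROCESS", "c @PROCESS", source)

# Hypothetical Fortran fragment, not taken from the HONDO source.
SAMPLE = "@PROCESS OPT(3)\n      SUBROUTINE FOO\n@PROCESS VECTOR\n      END\n"
NEW_TEXT, COUNT = comment_out_process(SAMPLE)
```

The count returned for each file can be compared with the number of instances listed above.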
####
modification 3a:
renamed the call and entry point function SECNDS to SECND since
in SGI f77 SECNDS is an intrinsic function
	files affected
ctl.f
####
modification 4:
commented out subroutines SYSDAT, SYSTIM, SYSCLK
	files affected:
ctl.f
comment: notice that these remain defined in file aix.f
	the change was made to eliminate undefined subroutines:
	CPUTIME, CLOCKX, and DATIMX called by the above
####
modification 5:
change the name of subroutine aixclk to aixclk_
	files affected:
clkaix.c
comment: this is a requirement of SGI fortran/c interface
####
modification 6:
introduced EXTERNAL EXPNRG statement to correct warning message
	files affected:
dr2.f   subroutine search
####
modification 7:
sample 2 in the suite of samples fails because of an error in
format 9996 in SUBROUTINE TRFMCX (line # 77736). 

The line reads as follows:
	2	,'-- IN CORE ...
it should read
	2        '-- IN CORE ...

the comma has already been included in the previous continuation line!

	file affected
ntn.f
####
modification 8:
samples 14 and 15 fail with an error in SUBROUTINE DDSPDS when using
optimization level O2 (but not with O0 or O1), as described below.
The error yields a core dump due to a segmentation fault!

Hence, separate hss.f into hss1.f and hss2.f, with the latter containing
the above subroutine. Then compile the subroutine using optimization level
O1 as indicated below. Remember to comment out the subroutine from hss1.f or
delete it.

####
comments on compilation and loading:

TO COMPILE AND LOAD PROPERLY DO THE FOLLOWING:

Avoid using the make command! For reasons that are not clear to me the
default options cause errors in compiling certain files. I don't have
an answer to this, but the steps below work.

First generate the object codes:

cc -c clkaix.c
f77 -w -c -G 3 -O2 -Olimit 1500 -Nl300 -mips2 *.f

Make sure that you have split ci1.f into ci1a.f and ci1b.f and
prp.f into prp1.f and prp2.f. These files are too big as delivered.
Also, in file hss.f you must compile SUBROUTINE DDSPDS using
optimization level 1. We do this by splitting the file into hss1.f
and hss2.f. The latter contains the SUBROUTINE DDSPDS, and the former
contains the rest of hss.f.

f77 -w -c -G3 -O1 -Olimit 1500 -Nl300 -mips2 hss2.f

Remember to exclude vec.f; it is not needed on a scalar machine.

You can eliminate optimization by changing O2 to O0.
In this case you can omit the mips2 instruction set. Obviously,
the program will then run slower, by at least a factor of about 2.

Finally, generate the executable by linking the object codes.

f77 -o hondo -w -G3 -static -Nl300 -mips2 *.o

the executable will have the name hondo.

So, in summary:

If you have made the above modifications, the sequence of commands is:

cc -c clkaix.c
f77 -w -c -G3 -static -O2 -Olimit 1500 -Nl300 -mips2 *.f
f77 -w -c -G3 -static -O1 -Olimit 1500 -Nl300 -mips2 hss2.f
f77 -o hondo -w -G3 -static -Nl300 -mips2 *.o 

Note: Attempts at optimization using the default -O -Olimit 1500 failed!
Simple attempts at using the make command also failed! The reason
is not clear, but may have to do with the default settings.

COMMENT: Using the mips2 instruction set can be confusing!

The mips2 instruction set is designed to take advantage
of the 64-bit word size in INDIGO R4000 machines
and the newer Silicon Graphics machines. Unfortunately, the default sizes
for integer and real numbers are i*4 and r*4 (32 bits!). According to customer
support this default persists even when using the -mips2 option! Therefore,
your default integers and reals are half-words and you must keep ISINGL as
if you were running on a 32-bit machine, unless you change the default sizes!

This is a little bit crazy, but I have already encountered a similar problem
with another code (Amsterdam Density Functional Package v. 1.0.1).

gus mercier
mercie@cumc.cornell.edu
9/6/93




From abonews@Playfair.Stanford.EDU  Wed Nov 24 17:50:27 1993
Received: from playfair.Stanford.EDU  for abonews@Playfair.Stanford.EDU
	by www.ccl.net (8.6.4/930601.1506) id QAA25395; Wed, 24 Nov 1993 16:54:46 -0500
Received: from tukey by playfair.Stanford.EDU with ESMTP (8.6.4/25-eef) id NAA17387; Wed, 24 Nov 1993 13:54:45 -0800
Received: from localhost by tukey (8.6.4/)
	id NAA15876; Wed, 24 Nov 1993 13:54:45 -0800
Date: Wed, 24 Nov 1993 13:54:45 -0800
From: "Art Owen News" <abonews@Playfair.Stanford.EDU>
Message-Id: <199311242154.NAA15876@tukey>
To: CHEMISTRY@ccl.net
Subject: Call for integrands



  I sent the following to a physics newsgroup.  I
am also interested in getting high dimensional 
integrands from chemists.  -Art Owen



    This is a call for integrands.  I am exploring
high dimensional quadrature methods.  The literature
has some comparisons, but the test functions
are often low dimensional (5 to 10) and often
they are symmetric in their arguments.  That is, one
can permute the input variables without affecting
the value of the integrand.  I'm not sure that these
make the best test cases.  So I would like to get some
integrands from physicists that are high dimensional 
(say, 20 to hundreds of dimensions) and are 
scientifically realistic.


  For each integrand I would like:

      1) the function
      2) the physical context
      3) desired/required accuracy
      4) desired/maximum number of evaluations
      5) the integral value (if known)

  Of course, points 3 and 4 can conflict.

  For point 1) the function should be on [0,1)^d
for some large d.  This may involve a change of
variable.  If some dimensions are customarily 
integrated out analytically, then this should be
reflected in the integrand.  The most convenient
form for the integrand is a C function.  I can
also work with FORTRAN, or a mathematical description,
provided that I can locate appropriate special
functions.  Certain properties of the function
that are relevant to quadrature should also be
mentioned (e.g. periodicity, continuity, number of
derivatives).  The integrand can be either real
or vector valued.
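
  To make the requested form concrete, here is a sketch of what a submission could look like, written in Python rather than C for brevity. Everything in it is a made-up illustration and not a scientifically realistic integrand: the product form, the coefficients 1/(j+1), the function names, and the plain Monte Carlo estimator are all assumptions. The function lives on [0,1)^d, is not symmetric in its arguments because the coefficients differ, and has a known integral of exactly 1, since each factor integrates to 1.

```python
import random

def make_integrand(d):
    """Build a toy integrand on [0,1)^d whose exact integral is 1."""
    coeffs = [1.0 / (j + 1) for j in range(d)]  # unequal, so f is not symmetric
    def f(x):
        value = 1.0
        for a, xj in zip(coeffs, x):
            value *= 1.0 + a * (xj - 0.5)
        return value
    return f

def monte_carlo(f, d, n, seed=0):
    """Plain Monte Carlo estimate of the integral of f over [0,1)^d."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        total += f([rng.random() for _ in range(d)])
    return total / n

F20 = make_integrand(20)
ESTIMATE = monte_carlo(F20, 20, 20000)
```

  A real submission would replace make_integrand with the function of interest and state the accuracy and evaluation budget asked for in points 3 and 4.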

  For point 2) there should be enough detail that
others can form an opinion of whether the problem
is similar to their own.  (Don't expect that I
will understand the physics, beyond the most
superficial level.)  If the integrand is vector
valued, and the problem suggests a natural way 
to combine accuracy measurements across its components, 
please say what that method is.

  If I use an integrand in a publication, I will
acknowledge the source of the integrand.

  Art Owen
  Dept of Statistics
  Stanford University
  Stanford CA, 94305

(My interest is to compare some Monte Carlo
and equidistribution methods.  If there is
sufficient interest I might put together a
test bed of integrands.  Of course if any
body already has such a test bed, I would
probably prefer to use it.)


From abonews@Playfair.Stanford.EDU  Wed Nov 24 18:50:25 1993
Received: from playfair.Stanford.EDU  for abonews@Playfair.Stanford.EDU
	by www.ccl.net (8.6.4/930601.1506) id SAA25711; Wed, 24 Nov 1993 18:03:15 -0500
Received: from tukey by playfair.Stanford.EDU with ESMTP (8.6.4/25-eef) id PAA19155; Wed, 24 Nov 1993 15:03:15 -0800
Received: from localhost by tukey (8.6.4/)
	id PAA16017; Wed, 24 Nov 1993 15:03:14 -0800
Date: Wed, 24 Nov 1993 15:03:14 -0800
From: "Art Owen News" <abonews@Playfair.Stanford.EDU>
Message-Id: <199311242303.PAA16017@tukey>
To: CHEMISTRY@ccl.net
Subject: Outliers and Neural Networks



Ed Jaeger <jaeger@kodak.com> wrote:
   A colleague is using neural nets to build a nonlinear model of the
   relationship between chemical structure and toxicity.  He is
   looking for diagnostic methods to seek "outliers" in his training
   set.

   Can anyone suggest such diagnostic tools or point me to where I
   might find this information?



  Here is an approach that should be useful when you are predicting 
a continuous response, using a neural network.  (It wouldn't apply
if toxicity is a 0/1 variable, but could apply if toxicity is measured
continuously by e.g. an LD50 score.)   If the net is trained by        
minimizing square error, replace the squared error criterion by
a more robust one, such as Huber's criterion.  For this you will need
a rough estimate of the range of model error that could arise in
"non-outlying" data points.  Relative to least squares, Huber's method
downweights outlying residuals.  When the weight is very small, the
point is an outlier candidate.

  Huber's criterion rho(z) is of the form z^2 for |z| <= c and
A+B|z| for |z| > c, where A and B are chosen to make the criterion
continuously differentiable in z (this gives A = -c^2 and B = 2c)
and c is the prespecified value of |z| at which the crossover
between quadratic and linear takes place.  Here z is the error between
observation and network prediction.  The transition ordinarily
takes place at a value of |z| equal to some multiple of the
standard deviation of z.  For small multiples you get behavior
like L1 methods, for large multiples it is more like L2 (least
squares).  (Details are in Huber's book "Robust Statistics").

  Minimizing the sum of Huber's criterion of the errors isn't
so hard in a neural net: the chain rule still applies, only what
you pass down from the output layer changes.  Huber's criterion
is like a weighted sum of squared errors with weights equal
to  "  rho'(z)/(2z)  "  where rho' just means the
derivative of the criterion.  This weight is 1 for |z| in the
quadratic range, including z=0 by continuity, and it drops
off like 1/|z| outside that range.  Points where the weight
is small can be tentatively identified as outliers.
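
  A minimal sketch of the criterion, its derivative and the resulting weight, using the usual piecewise form (quadratic up to a threshold, linear beyond it). The threshold name c, the function names and the sample errors are assumptions made for illustration. In a real network the derivative would replace 2z as the quantity passed down from the output layer.

```python
def huber_loss(z, c):
    """z^2 inside the threshold, the tangent line 2c|z| - c^2 outside it."""
    az = abs(z)
    return z * z if az <= c else 2.0 * c * az - c * c

def huber_grad(z, c):
    """Derivative of huber_loss with respect to z."""
    if abs(z) <= c:
        return 2.0 * z
    return 2.0 * c if z > 0 else -2.0 * c

def huber_weight(z, c):
    """Weight rho'(z) / (2z): 1 in the quadratic range, c/|z| outside it."""
    return 1.0 if abs(z) <= c else c / abs(z)

# Hypothetical prediction errors; the threshold c = 1.5 is an arbitrary choice.
ERRORS = [0.2, -0.7, 1.1, 6.0, -0.3]
WEIGHTS = [huber_weight(z, 1.5) for z in ERRORS]
CANDIDATES = [i for i, w in enumerate(WEIGHTS) if w < 0.5]
```

  The cutoff of 0.5 on the weight is arbitrary; in practice one would inspect the smallest weights rather than apply a fixed rule.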

  One can, in principle, estimate the variance of z by yet
another network, though to do it well probably takes lots
of data.



-Art Owen, Dept of Statistics, Sequoia Hall, Stanford CA 94305

From ross@cgl.ucsf.EDU  Wed Nov 24 18:51:28 1993
Received: from socrates.ucsf.EDU  for ross@cgl.ucsf.EDU
	by www.ccl.net (8.6.4/930601.1506) id SAA25854; Wed, 24 Nov 1993 18:48:00 -0500
Received: from [0] by socrates.ucsf.EDU (8.6.4/GSC4.24)
	id PAA24623; Wed, 24 Nov 1993 15:47:59 -0800
Date: Wed, 24 Nov 1993 15:47:59 -0800
Message-Id: <199311242347.PAA24623@socrates.ucsf.EDU>
From: ross@cgl.ucsf.edu (Bill Ross )
To: chemistry@ccl.net
Subject: A General Compiling System


Since there has been some discussion of porting code to
different platforms, perhaps people would be interested
in the compilation system I devised for Amber, which I
have obtained permission to share. Source code developers 
can use it to hide machine dependencies from users compiling 
on different machines.  This describes the scheme in the 
yet-to-be-released 4.1 version of Amber.

Here is an example of compiling Amber:

% tar xof tarfile
% cd amber4/src
% cp Machine/Machine.iris4K MACHINE
% Makeall

As you can see, the machine-specific part is restricted to
a single step for the user. All machine-specific compilation
flags, directories and cpp definitions are contained in the 
Machine.xxx files. The Makefiles use a script called Compile
which recognizes different optimization levels and also can 
generate machine-targeted code for other Unix and non-Unix 
operating systems, using the machine dependency file. A script 
called sysdir is used by the Makefiles to extract system-specific
directories from the machine dependency file. A sys.a library
gets built in each system-specific directory, containing 
wrappers for things like the various timing calls one encounters.
A script called mksrc drives the Makefiles with alternate
machine dependency files.

The original set of system-specific files and machine compiler 
flags were written by George Seibel. I organized the 1-step
system using Makefiles and the shell scripts. There are also
tools for e.g. generating an ftp script that gives the equivalent
of "rcp -r" to a VMS system.

Obviously this is a system for developers to use in software
that is released in source version for users to compile (or
for in-house work, of course). Some details are outlined below.

It would be especially nice if it was adopted by a few other
widely distributed packages, so that the machine dependency 
handling might become a quasi standard.

Send mail to me if you want to get a copy or have ideas/suggestions
to share. At some point I will extract the necessary files into a
uuencoded compressed tar file and mail it out.

The only restriction on use will be that credits are maintained
in the scripts and documentation and that modifications are noted.

Bill Ross


The layout of the 'tools' and system libraries part is:

	amber4/src/Compile
	amber4/src/sysdir
	amber4/src/mksrc
	amber4/src/Machine/Machine.xxx		! MACHINE file
	amber4/src/Machine/xxx/Makefile
	amber4/src/Machine/xxx/sys.a		! wrappers
	amber4/src/lib/genlib.a			! wrappers

where 'xxx' roughly includes (not all are up-to-date):

iris	vm	Unicos	vms 	aix370	bsd	mflow
convex	mips	rs6000	ctss	sparc	fps500	hp
decstation      stlr           

(bsd is used for 'non-deviant' Unixes)

The routines in amber4/src/Machine/xxx/sys.a include generic
date(), flush(), hardware arithmetic info, and timing wrappers.
The routines in amber4/src/lib/genlib.a include generic open()
and exit() wrappers.

A Machine file:

#------------------------------------------------------------------
# Copy this file to amber4/src/MACHINE to install.
#
# This file is appropriate for HP f77
#
# These aliases let us use the same command files for compilation
# on Unix systems with different names and/or flags for the fortran
# compiler.   Environment variables set here specify the location
# of system-specific source and control the use of Integer*2.
# See install.doc: "Setting Up the Configuration File"
#

setenv MACHINE "HP UX"
# HP recognizes cray style C$DIR compiler directivesCH HP

setenv MACHINEFLAGS "-DISTAR2 "

# SYSDIR is the name of the system-specific source directory for makemake
setenv SYSDIR Machine/hp

# COMPILER ALIASES:

# LOADER/LINKER:
setenv LOAD "f77 +T "
setenv LOADLIB " -lvec"

# little or no optimization:
setenv L0 "f77 -c -g +T +E4"

# standard optimization :
setenv L1 "f77 -c -O +T +E4"

# about the same as L1
setenv L2 "f77 -c +O3 +T +E4"

# highest optimization
setenv L3 "f77 -c  +OP -WP,-ur=1,-directives=c  +T +E4"


# ranlib, if it exists
setenv RANLIB "echo ranlib not needed"

#-------------------
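
To illustrate how a driver script could use such a file, here is a rough sketch in Python. It is not the actual Compile script, which is not shown here: the function names are hypothetical, and the only thing assumed about the format is what the example above shows, namely comment lines starting with # and lines of the form setenv NAME "value".

```python
import shlex

def parse_machine_file(text):
    """Collect the setenv NAME "value" pairs from a Machine file."""
    env = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        parts = shlex.split(line)
        if len(parts) >= 2 and parts[0] == "setenv":
            env[parts[1]] = parts[2] if len(parts) > 2 else ""
    return env

def compile_command(env, level, source):
    """Build the argument list for one source file at level L0..L3."""
    return shlex.split(env[level]) + [source]

# A shortened copy of the HP example, used only to exercise the sketch.
SAMPLE = '''
# standard optimization :
setenv L1 "f77 -c -O +T +E4"
setenv SYSDIR Machine/hp
setenv RANLIB "echo ranlib not needed"
'''
ENV = parse_machine_file(SAMPLE)
COMMAND = compile_command(ENV, "L1", "giba.f")
```

Keeping the level names (L0 to L3) as the only thing the Makefiles know about is what lets the same Makefile work on every machine.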



The Makefiles look like this:

#------------------------------------------------------------------
SHELL=/bin/sh

OBJ=    giba.o gibb.o decnvh.o micst.o misc.o machinedep.o \
        nmr_strip.o slwadj.o veloc.o torcon.o rstin.o connrg.o \
        pratgr.o derivs.o derdlm.o fastwt.o tripl.o polars.o

SRC=	giba.f gibb.f decnvh.f micst.f misc.f machinedep.f \
        nmr_strip.f slwadj.f veloc.f torcon.f rstin.f connrg.f \
        pratgr.f derivs.f derdlm.f fastwt.f tripl.f polars.f

LIBSRC= ../lib/namlst90.f ../lib/mexit.f ../lib/amopen.f



gibbs:  	$(OBJ) genlib syslib
                SYSLIB=`csh ../sysdir lib` ; ../Compile LOAD -o gibbs \
                                $(OBJ) ../lib/genlib.a $$SYSLIB

bigsource:
                SYSDIR=`csh ../sysdir dir` ;  cd $$SYSDIR ; make sys.f
                SYSSRC=`csh ../sysdir src` ; \
                        ../Compile CPPONLY -P -o gibbs_all.for \
                                $(SRC) $(LIBSRC) $$SYSSRC

.f.o:           $<
                ../Compile L2 -P $<

giba.o:         giba.f
                ../Compile L1 -P $<

machinedep.o:   machinedep.f
                ../Compile L3 -P $<

#-----------LIBS

syslib::
                ( SYSDIR=`csh ../sysdir dir` ; echo sysdir is $$SYSDIR ; \
                cd $$SYSDIR ; make sys.a )

genlib::
                ( cd ../lib ; make genlib.a )

#-----------



#------------------------------------------------------------------

