Date: Sun, 7 Mar 2004 01:54:11 -0500 (GMT-05:00)
From: Zach <kaminari..at..mindspring.com>
Message-ID: <23613898.1078641828630.JavaMail.root..at..wamui10.slb.atl.earthlink.net>
Date: Sun, 7 Mar 2004 01:43:48 -0500 (GMT-05:00)
Reply-To: Zach <kaminari..at..mindspring.com>
To: chemistry..at..ccl.net
Subject: CCL: TINKER Question

Message:

Hello to all,

I am using TINKER version 4.0 to do some minimizations
and dynamics. On several occasions I have submitted
energy minimization jobs and obtained the following
output:


Variable-Mode Truncated-Newton Conjugate-Gradient Optimization :


 Algorithm : auto       Preconditioning : auto        RMS Grad : 0.25D+00


 TN Iter   F Value       G RMS     F Move    X Move   CG Iter   Solve   FG Call


     0 -0.1120D+07  0.4562D+06                                               1


 TNCG  --  Normal Termination due to SmallFct


 Final Function Value :     -1120344.2138
 Final RMS Gradient :         456235.9873
 Final Gradient Norm :      23218781.3035


That is, the minimization stops at iteration 0 with the message "Normal
Termination due to SmallFct". Given the size of the gradient, this structure
clearly has not been minimized. This happens often when I submit jobs
involving fairly large proteins (the above output came from a 10000+ system).


Why is the minimization stopping at this point? Please help. Thanks a lot.

Zach 

From chemistry-request@ccl.net Sun Mar  7 07:55:27 2004
Date: Sun, 7 Mar 2004 07:57:10 -0500
Subject: Re: CCL:software evaluation / validation
From: Richard Gillilan <reg8..at..cornell.edu>
To: chemistry..at..ccl.net
In-Reply-To: <404A1AB4.1EB5B3DB..at..ccdc.cam.ac.uk>
Message-Id: <F271C750-7036-11D8-9765-003065963E2C..at..cornell.edu>


On Saturday, March 6, 2004, at 01:38  PM, Jacco van de Streek wrote:

> Andras.Borosy..at..givaudan.com wrote:
>> I agree with the first part, which is indeed a _VERY_ important point:
>> "The main criterion of scientific work is reproducibility."
>> However, my question about this would be:
>> How is it possible to reproduce scientific results obtained by commercial
>> _OR_ academic software which relies on _RANDOM_ methods ?
>
> Hi,
>
> I happen to work on a program that uses random numbers (to solve crystal
> structures from X-ray powder data) and I think that reproducibility is not an
> issue, at least not in the sense you are referring to.
>
> Because it is a random process, users of such programs would not run them just
> once, as the single outcome of a random event is meaningless. Instead, our
> program by default runs the simulated annealing run ten times (with different
> seeds for the random number generators). It is now the reproducibility (the
> statistical distribution of solutions) itself that should be reproducible.
>

Glad somebody mentioned random number seeds. Even a random number stream
should be reproducible when you use the same seed and the same algorithm. I
suspect this should hold across platforms when you have a well-implemented
algorithm; has anyone ever tested this? I can't think of many instances when
it was actually necessary to do this, but I am always bothered by programs
that just draw a seed from the time clock or some other system variable. This
is a level of reproducibility that "could" be possible in computation but, in
practice, is probably not necessary, given that it is the solution of the
underlying model that is sought (within statistical error) rather than the
particular route of convergence.
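
The seed point can be sketched in a few lines. This is only an illustration,
assuming Python's standard random module; the function name draw_stream and
the seed values are hypothetical and do not come from any program mentioned
in this thread.

```python
import random


def draw_stream(seed, count=5):
    """Return `count` pseudo-random numbers from a generator with a fixed seed."""
    rng = random.Random(seed)  # private generator, independent of global state
    return [rng.random() for _ in range(count)]


# Same seed and same algorithm: the stream repeats exactly.
first = draw_stream(seed=42)
second = draw_stream(seed=42)

# A different seed gives a different stream, as in a set of independent runs.
other = draw_stream(seed=43)

# Ten runs with ten explicit, recorded seeds (hypothetical choice of 0..9).
runs = {seed: draw_stream(seed) for seed in range(10)}

print(first == second)  # identical streams
print(first == other)   # differing streams
```

Recording the seed with the results is what makes an individual run
repeatable; a seed drawn from the clock is lost unless the program prints it.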

I don't know how many folks have had the experience of actually trying to
numerically reproduce computational results from a paper alone, but I've done
it on several occasions over the years and have learned a lot about numerical
error in the process ... something I rarely ever see discussed. It is
important. Just compare energy values from several different implementations
of the same exact empirical forcefield (even "official" versions done by the
author of the FF) ... the differences are sometimes surprising. Still, it is
a great tribute to the scientific integrity of the originator of a method if
one can reproduce the results from the published work without the actual
code. As much as I like open source code, and as much as I hate property
claims being staked on anything and everything the human mind can conceive
of, it is good to force scientists to reproduce a few things once in a while,
as long as it does not slow progress due to needless redundancy.
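
One possible source of such differences, offered here only as an assumption
and not as a diagnosis of any particular forcefield code, is that
floating-point addition is not associative, so two implementations that sum
the same terms in a different order can disagree. A minimal sketch with
made-up numbers:

```python
import math


def naive_sum(values):
    """Add values left to right in double precision, with no compensation."""
    total = 0.0
    for value in values:
        total += value
    return total


# Hypothetical "energy terms": large contributions that cancel, plus small ones.
terms = [1.0e16, 1.0, -1.0e16, 1.0]

forward = naive_sum(terms)            # summed in the order given
reordered = naive_sum(sorted(terms))  # same terms, ascending order
reference = math.fsum(terms)          # correctly rounded sum of the terms

print(forward, reordered, reference)
```

The three results differ even though the inputs are identical, which is why
the order of a loop over energy terms can matter when large terms cancel.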


Richard Gillilan
MacCHESS, Cornell