From owner-chemistry@ccl.net Sat Sep 17 05:57:46 2005
From: "CCL"
To: CCL
Subject: CCL: W:Disclose your data, or not publish !
Message-Id: <-29218-050917055624-20262-g9sKEnK9uqFRADQRa4d2bg###server.ccl.net>
X-Original-From: "Dr. Csaba Hetenyi"
Content-Type: TEXT/PLAIN; charset=US-ASCII
Date: Sat, 17 Sep 2005 11:56:14 +0200 (CEST)
MIME-Version: 1.0

Sent to CCL by: "Dr. Csaba Hetenyi" [csaba###ovrisc.mdche.u-szeged.hu]
--Replace strange characters with the "at" sign to recover email address--.

Hi,

> You are absolutely right.

Thanks.

> Quote
>
> Scoring can be validated with experimental delta G values too.
> If this validation is not done, of course, the manuscript should not be
> accepted.

With this I meant scoring for delta G calculation. Yes. There are other kinds of scoring, which I do not consider now.

> I take this to mean that if a scoring function does not show good
> correlation with exptl. binding affinities then work done with that
> scoring function should not be published since the scoring function
> is not validated. Did I misinterpret?

In general, scoring functions (SFs) are validated against delta G values (or delta H and delta S values if you have data from ITC). If you present a better experimental measure of binding thermodynamics, that is OK too... However, it depends on what kind of compound you are using that SF for. An SF can be very good for smaller, drug-like compounds but fail for larger compounds with MW>400, because it was developed and validated for the small ones. That does not necessarily mean that the whole SF is garbage.

> If we are going to play "Who reads more literature" I would suggest
> you read Chapter 3 of Virtual Screening in Drug Discovery, Alvarez and
> Shoichet, where the Vertex team present a very insightful analysis of
> the way in which dataset selection affects the apparent quality of
> scoring functions.
> Their conclusion is that none of the scoring functions examined show
> any robust correlation with binding affinity

No, I do not want to play a game like this. I do not have that book; I just cited a recent article, probably accessible to you and to many CCL users, to show that there is some development in the field.

The question is whether it makes any sense to look for a general "robust" correlation. SFs use approximations. Generally, they do not work on a statistical ensemble (1), unlike FEP, LIE or other MD-based methods. They are developed for drug design and rapid calculation of binding delta G values. Mostly they are parameterized for smaller ligands (2), i.e. realistic drug compounds with a limited number of torsions. Other limitations could be listed as well...

After all, if you are testing an SF on compounds far from those involved in its development, let us say compounds with MW of ca. 1000 instead of 400, do you think that is correct? Of course it is. Tests should be done. But the conclusion that the SF is not useful is not correct, in my opinion. At this point I have to say that you are at the limit of the SF and should do research to extend the borders of that SF. (Well, this is science: Newtonian mechanics vs. quantum mechanics, etc. The most important thing is to know where the limits of your model are...)

As I see it now, the main problems with traditional SFs are not in the bimolecular (interacting) terms, but in the terms that use only information on the ligand structure. This is probably due to the difficulty of estimating entropic contributions for larger, more flexible ligands. Hopefully, a more detailed study on this topic will come out soon.

> Since dataset selection showed a profound effect on the apparent
> quality of the scoring function the paper further underscores the
> necessity of making such data publicly available.

This is fair, I agree; I have never written anything against it...
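The point above about testing an SF inside and outside the range it was developed for can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names, the record layout (MW, predicted delta G, experimental delta G) and the default cutoff of 400, which is taken only from the MW>400 example mentioned earlier. It simply reports the Pearson correlation separately for ligands within and beyond the cutoff.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant data")
    return sxy / sqrt(sxx * syy)


def correlation_by_mw(records, mw_cutoff=400.0):
    """Correlate predicted vs. experimental delta G within and beyond a MW cutoff.

    records: iterable of (mw, predicted_dg, experimental_dg) tuples
             (hypothetical layout, not taken from any real dataset).
    Returns a dict with keys 'within_domain' and 'outside_domain';
    a value is None when fewer than two ligands fall in that group.
    """
    groups = {"within_domain": [], "outside_domain": []}
    for mw, predicted, experimental in records:
        key = "within_domain" if mw <= mw_cutoff else "outside_domain"
        groups[key].append((predicted, experimental))

    result = {}
    for key, pairs in groups.items():
        if len(pairs) < 2:
            result[key] = None
        else:
            result[key] = pearson_r([p for p, _ in pairs],
                                    [e for _, e in pairs])
    return result
```

A poor value for the outside group by itself says only that the SF is being used beyond the range it was parameterized for, which is the distinction argued for above.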
> The point that I am trying to make is that without access to the
> datasets used in VS evaluations of (docking) tools the publications
> cannot be reproduced. As such the publications are of limited value,
> as they do not allow for independent replication.
> Independent replication is an important part of doing science.

Yes, I agree, of course. My reason for answering the anonymous guy originally was that he (let us assume the ladies are not guilty :) addressed the "docking and scoring community" in general. I admit, everybody makes mistakes and there are errors everywhere. However, there are many correct studies with RMSD, delta G, etc. comparisons and validations in the docking/scoring field. In many cases garbage experimental values (this is especially true for binding delta G values) have to be separated from good ones; a lot of time is spent planning and carrying out such studies correctly, selecting good, reliable data, correcting mistakes in PDB files, etc. Biomolecular modeling is not like eating a pancake... :))) After all, these kinds of generalizations should not be allowed.

Thanks for your patience.

Csaba

From owner-chemistry@ccl.net Sat Sep 17 15:26:29 2005
From: "CCL"
To: CCL
Subject: CCL: Anharmonic frequences in Gaussian
Message-Id: <-29219-050917152357-32283-ydYI9r1F1OvkpDM+uEj6Qg ~~ server.ccl.net>
X-Original-From: Kadir Diri
Content-transfer-encoding: 7bit
Content-type: text/plain; format=flowed; charset=us-ascii
Date: Sat, 17 Sep 2005 15:23:43 -0400
MIME-version: 1.0

Sent to CCL by: Kadir Diri [kadir ~~ visual1.chem.pitt.edu]
--Replace strange characters with the "at" sign to recover email address--.

Hi!

This is very easy to estimate. As far as I know, in the way vibrational perturbation theory is implemented in Gaussian, one needs 2n+1 Hessian evaluations, n being the number of normal modes. So by monitoring the time needed for the first few Hessians, you can estimate how much more time is needed.
For a rough estimate I look at the IStep variable in the output. There are two of those per normal mode.

Cheers,
kadir

CCL wrote:

>Sent to CCL by: Alejandro Pedro Ayala [ale.p.ayala : googlemail.com]
>
>Replace strange characters with an ~~ sign to recover email address.
>Hi,
> I am trying to perform some frequency calculations in Gaussian
>using the anharmonic keyword. Since they are highly time consuming, I
>would like to estimate how much time these calculations need.
>Looking at the output file I saw the cycles through the different
>modules Link 106 -> Link 301 ..... Link 716, etc. So, my question is:
>Is it possible to know a priori how many of these cycles the
>calculation will need?
> Regards,
> Alejandro Ayala
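
The estimate described in the reply above (2n+1 Hessian evaluations for n normal modes, timed from the first few) can be written down as a short sketch. The function names and the idea of feeding in per-Hessian wall times by hand are assumptions for illustration; nothing here parses Gaussian output, and the 2n+1 count is taken from the reply, which itself qualifies it with "as far as I know". The mode count uses the standard 3N-6 (3N-5 for linear molecules).

```python
def n_normal_modes(n_atoms, linear=False):
    """Number of vibrational normal modes: 3N-6, or 3N-5 for a linear molecule."""
    if n_atoms < 2:
        raise ValueError("need at least two atoms")
    return 3 * n_atoms - (5 if linear else 6)


def total_hessians(n_modes):
    """Hessian evaluations for the anharmonic job, per the 2n+1 rule quoted above."""
    return 2 * n_modes + 1


def estimate_remaining_hours(n_modes, completed_hours):
    """Extrapolate the remaining wall time from the Hessians finished so far.

    completed_hours: wall time (hours) of each Hessian already completed,
                     read off the output by hand (hypothetical input).
    Assumes every remaining Hessian costs the average of the completed ones.
    """
    total = total_hessians(n_modes)
    done = len(completed_hours)
    if done == 0:
        raise ValueError("need at least one completed Hessian to extrapolate")
    if done > total:
        raise ValueError("more completed Hessians than the 2n+1 expected")
    average = sum(completed_hours) / done
    return (total - done) * average
```

For example, a nonlinear molecule with 6 atoms has 12 normal modes and therefore 25 Hessians under this rule; if the first three took one hour each, `estimate_remaining_hours(12, [1.0, 1.0, 1.0])` extrapolates 22 more hours.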