From owner-chemistry@ccl.net Tue Dec 14 13:36:00 2010
From: "Dave.Winkler%a%csiro.au"
To: CCL
Subject: CCL: Quality Control
Message-Id: <-43382-101214133123-28191-fVkajqfTBbsybBC4IKOkeQ(_)server.ccl.net>
Date: Wed, 15 Dec 2010 05:31:01 +1100

Sent to CCL by: [Dave.Winkler[*]csiro.au]

That is an interesting question, Simon.

This issue may not be as important as you think unless you want the model to be interpretable in physicochemical terms. For example, the measured log octanol/water partition coefficient (logP) is commonly used as a descriptor. However, it is really just a surrogate for the lipophilic properties of molecules, presumably telling you about their ability to cross biological membranes and to bind to proteins where the interaction is largely lipophilic (e.g. nuclear receptors).

Measured logP can also be predicted by rule-based or descriptor-based QSAR models, so in essence you are substituting another set of descriptors for the measured logP values. These descriptors could in turn be estimated from other descriptors.

The bottom line is that some relatively obscure descriptors, such as autocorrelation functions, molecular fields and molecular eigenvalue descriptors, can be useful for generating models even when their connection to the physical interactions is too complex to pick apart. However, simpler, interpretable descriptors are always preferred provided they generate a strong model, and one must always be aware of the risk of generating chance correlations, overfitted models, and correlations without causation.

Dave

Prof. Dave Winkler
Senior Principal Research Scientist
Biomaterials & Regenerative Medicine
CSIRO Materials Science and Engineering
Clayton 3168, Australia
________________________________________
> From: owner-chemistry+dave.winkler==csiro.au_-_ccl.net [owner-chemistry+dave.winkler==csiro.au_-_ccl.net] On Behalf Of Simon Harris sihar3000[A]hotmail.co.uk [owner-chemistry_-_ccl.net]
> Sent: Wednesday, 15 December 2010 2:48 AM
> To: Winkler, Dave (CMSE, Clayton)
> Subject: CCL: Quality Control
>
> Sent to CCL by: "Simon Harris" [sihar3000]![hotmail.co.uk]
>
> Dear Subscribers,
>
> Please could you help me. I am working on QSAR and would like to know how a quality control study is done for descriptors. I don't mean the validation of the dataset (cross-validation or external validation) but rather the validation of the descriptor values obtained from the software used to calculate them.
>
> Is there a way to do this? Is there a way to justify the values from your chosen software, apart from recalculating the descriptors using another software package?
>
> Thank you for your help in advance.
>
> Simon Harris
> Sihar3000===hotmail.co.uk
> Brighton UK
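
The cross-check Simon mentions, recalculating the same descriptor with a second package and comparing the two sets of values, could be sketched roughly as below. This is only an illustration: the function names, the tolerance and the toy numbers are assumptions made up for the example, not output from any real descriptor software.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def compare_descriptor(values_a, values_b, tolerance):
    """Compare one descriptor as computed by two packages.

    values_a, values_b: dicts mapping a molecule id to the descriptor value.
    tolerance: hypothetical absolute difference above which a molecule
               is flagged for manual inspection.
    Returns (correlation, list of flagged molecule ids).
    """
    # Only molecules present in both result sets can be compared.
    shared = sorted(set(values_a) & set(values_b))
    xs = [values_a[m] for m in shared]
    ys = [values_b[m] for m in shared]
    flagged = [m for m in shared if abs(values_a[m] - values_b[m]) > tolerance]
    return pearson(xs, ys), flagged


# Toy values, invented purely for illustration.
package_a = {"mol1": 1.0, "mol2": 2.0, "mol3": 3.0, "mol4": 4.0}
package_b = {"mol1": 1.1, "mol2": 2.0, "mol3": 2.9, "mol4": 5.5}

r, flagged = compare_descriptor(package_a, package_b, tolerance=0.5)
print(f"correlation = {r:.3f}, flagged = {flagged}")
```

A high correlation with a few flagged molecules suggests the two packages broadly agree and points to the specific structures worth checking by hand (for example unusual tautomers or charge states). It says nothing about which package is right, and, as Dave notes, agreement between descriptors does not by itself make a model interpretable.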