CCL: Quality Control
- From: <Dave.Winkler[-]csiro.au>
- Subject: CCL: Quality Control
- Date: Wed, 15 Dec 2010 05:31:01 +1100
Sent to CCL by: [Dave.Winkler[*]csiro.au]
That is an interesting question, Simon. This issue may not be as important as
you think unless you want the model to be interpretable in physicochemical
terms. For example, the measured log octanol/water partition coefficient (logP)
is commonly used as a descriptor. However, it is really just a surrogate for the
lipophilic properties of molecules, presumably telling you about their ability
to cross biological membranes and to bind to proteins where the interaction is
largely lipophilic (e.g. nuclear receptors). Measured logP can also be predicted
by rule-based or descriptor-based QSAR models, so in essence you are
substituting another set of descriptors for the measured logP values. These
descriptors in turn could be estimated by other descriptors. The bottom line is
that some relatively obscure descriptors, such as autocorrelation functions,
molecular fields, and molecular eigenvalue descriptors, can be useful for
generating models even when their connections to the physical interactions are
too complex to pick apart. However, simpler, interpretable descriptors are
always preferred provided they generate a strong model, and one must always be
aware of the risk of generating chance correlations, overfitted models, and
correlations without causation.
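
One common way to guard against the chance correlations mentioned above is a
y-scrambling (response permutation) check. The following is only a minimal
sketch under stated assumptions, not a prescribed procedure: the function names
(r_squared, y_scramble_fraction), the single-descriptor setup, and the data
values are all hypothetical and chosen purely for illustration.

```python
import random


def r_squared(x, y):
    """Squared Pearson correlation between two equal-length sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a constant column carries no correlation information
    return (sxy * sxy) / (sxx * syy)


def y_scramble_fraction(x, y, n_trials=1000, seed=0):
    """Return (observed r^2, fraction of shuffled-response trials with r^2 >= observed)."""
    rng = random.Random(seed)
    observed = r_squared(x, y)
    shuffled = list(y)
    hits = 0
    for _ in range(n_trials):
        rng.shuffle(shuffled)  # break any real link between descriptor and response
        if r_squared(x, shuffled) >= observed:
            hits += 1
    return observed, hits / n_trials


# Hypothetical illustrative values, not real measurements.
logp = [0.5, 1.2, 1.9, 2.4, 3.1, 3.8, 4.2, 4.9]
activity = [4.1, 4.6, 5.0, 5.3, 5.9, 6.2, 6.8, 7.1]

observed, fraction = y_scramble_fraction(logp, activity)
print(f"observed r^2 = {observed:.3f}, scrambled trials >= observed: {fraction:.3f}")
```

If a large fraction of the scrambled trials match or beat the real correlation,
the apparent relationship is likely to be a chance one. A real study would
apply the same idea to the full multi-descriptor model rather than to a single
descriptor.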
Dave
Prof. Dave Winkler
Senior Principal Research Scientist
Biomaterials & Regenerative Medicine
CSIRO Materials Science and Engineering
Clayton 3168, Australia
________________________________________
> From: owner-chemistry+dave.winkler==csiro.au_-_ccl.net
[owner-chemistry+dave.winkler==csiro.au_-_ccl.net] On Behalf Of Simon Harris
sihar3000[A]hotmail.co.uk [owner-chemistry_-_ccl.net]
Sent: Wednesday, 15 December 2010 2:48 AM
To: Winkler, Dave (CMSE, Clayton)
Subject: CCL: Quality Control
Sent to CCL by: "Simon Harris" [sihar3000]![hotmail.co.uk]
Dear Subscribers,
Please could you help me.
I am working on QSAR and would like to know how a quality control study is
done for descriptors.
I don't mean the validation of the dataset (cross-validation or external
validation) but rather the validation of the values of the descriptors obtained
from the software used to calculate them.
Is there a way to do this? Is there a way to justify values from your chosen
software apart from recalculating the descriptors using another software package?
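
For reference, the baseline cross-check mentioned above (recalculating the same
descriptor with a second program and comparing) could be sketched as follows.
This is only an illustration: the function name compare_descriptors, the
tolerances, the molecule labels, and all numbers are hypothetical, and the
values would in practice be exported from the two programs.

```python
import math


def compare_descriptors(values_a, values_b, rel_tol=1e-3, abs_tol=1e-6):
    """Return {molecule: (a, b)} for shared molecules whose two values disagree."""
    mismatches = {}
    # Only molecules present in both result sets can be compared.
    for name in sorted(values_a.keys() & values_b.keys()):
        a, b = values_a[name], values_b[name]
        if not math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol):
            mismatches[name] = (a, b)
    return mismatches


# Hypothetical descriptor values from two programs, for illustration only.
program_a = {"mol1": 2.130, "mol2": 0.460, "mol3": 3.270}
program_b = {"mol1": 2.131, "mol2": 0.460, "mol3": 3.900}

flagged = compare_descriptors(program_a, program_b)
print(flagged)
```

Here only mol3 is flagged, since its two values differ by far more than the
chosen tolerance. A suitable tolerance depends on the descriptor, and a
disagreement does not by itself say which program is right.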
Thank you for your help in advance
Simon Harris
Sihar3000===hotmail.co.uk
Brighton UK