From owner-chemistry@ccl.net Thu Sep 15 08:54:03 2005
From: "CCL"
To: CCL
Subject: CCL: Computational drug design blues
Message-Id: <-29181-050915003839-31425-ETc9kVj+VqHwexQEb/3djg%server.ccl.net>
X-Original-From: "James T Metz"
Date: Wed, 14 Sep 2005 20:04:52 -0500
Sent to CCL by: "James T Metz" [james.metz%abbott.com]

CCL Colleagues,

        Regarding prediction of synthetic difficulty, one approach to this problem might involve
constructing a descriptor-based QSAR  model of compound (catalogue) cost ($Cost/mg) or perhaps log ($Cost/mg).

        You might end up with an equation something like:

        log($Cost/mg) = 0.5 * (# chiral centers) + 0.7 * (# large rings) + (# spiro-fused centers) + ...

        This idea, of course, rests on the assumption that, across thousands of compounds, catalogue
price roughly tracks how difficult the molecule is to make.
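
        As a rough sketch of how such an equation could be fitted, the Python below runs an ordinary
least-squares regression of log10(price per mg) on three descriptor counts. Everything in it is an
assumption made for illustration: the descriptor table is invented, the "prices" are generated from the
illustrative coefficients above (with the spiro term taken as 1.0, since it has no explicit coefficient,
and a made-up intercept of -1.0), and the function names are hypothetical. Real catalogue data would be
noisy and would need many more descriptors.

```python
import math

def fit_ols(X, y):
    """Ordinary least squares via the normal equations (X^T X) b = X^T y.

    X: descriptor rows without the intercept column; y: target values.
    Returns [intercept, coef_1, ..., coef_k].
    """
    rows = [[1.0] + [float(v) for v in row] for row in X]  # prepend intercept
    k = len(rows[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(A[idx][col]))
        if abs(A[pivot][col]) < 1e-12:
            raise ValueError("descriptors are collinear; cannot fit")
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                f = A[r][col] / A[col][col]
                A[r] = [a - f * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict_log_cost(coefs, descriptors):
    """Predicted log10($/mg) for one compound's descriptor counts."""
    return coefs[0] + sum(c * d for c, d in zip(coefs[1:], descriptors))

# Invented descriptor table: (# chiral centers, # large rings, # spiro centers).
descriptors = [
    (0, 0, 0), (1, 0, 0), (2, 1, 0), (0, 1, 0),
    (3, 0, 1), (1, 1, 1), (4, 2, 0), (0, 0, 1),
]

# Placeholder prices generated from the illustrative equation, NOT real data.
ILLUSTRATIVE = [-1.0, 0.5, 0.7, 1.0]  # intercept is an assumption
prices = [10 ** predict_log_cost(ILLUSTRATIVE, d) for d in descriptors]

# Fit on log10(price), as suggested in the text.
log_prices = [math.log10(p) for p in prices]
coefs = fit_ols(descriptors, log_prices)
print([round(c, 3) for c in coefs])
```

        Because the placeholder prices contain no noise, the fit simply recovers the coefficients that
generated them; with real prices the interesting part is how much scatter remains.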

        There are of course problems with this approach.  One obvious problematic group of compounds
would be natural products.  Some natural products have exquisitely complicated molecular skeletons, yet
are obtained relatively inexpensively by extraction methods.  So the cost for certain molecules will not
reflect synthetic difficulty.  You may (?) need to exclude those compounds from your training set.  But, it
might be worth trying to build the model several different ways.

        Sources of data are plentiful.  Pick up your Aldrich catalogue and you will find plenty of structure/cost
information.  Probably a good idea to use the costs from a single supplier so as not to add further noise (factors)
to the analysis.
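
        One way to set up the "several different ways" comparison, while keeping to a single supplier,
is sketched below. The record layout, the supplier labels, the compound names and the natural_product
flag are all hypothetical; how a compound gets flagged as a natural product is left open here.

```python
from dataclasses import dataclass

@dataclass
class CatalogueEntry:
    name: str
    supplier: str
    price_per_mg: float     # dollars per mg
    natural_product: bool   # assumed to be annotated by hand or elsewhere

def training_variants(entries, supplier):
    """Return candidate training sets drawn from one supplier only."""
    single = [e for e in entries if e.supplier == supplier]
    return {
        "all_compounds": single,
        "no_natural_products": [e for e in single if not e.natural_product],
    }

# Invented placeholder records, not real catalogue data.
entries = [
    CatalogueEntry("compound_1", "supplier_A", 0.02, False),
    CatalogueEntry("compound_2", "supplier_A", 0.15, True),
    CatalogueEntry("compound_3", "supplier_A", 1.40, False),
    CatalogueEntry("compound_4", "supplier_B", 0.90, False),
]

variants = training_variants(entries, "supplier_A")
for label, subset in variants.items():
    print(label, [e.name for e in subset])
```

        Each variant can then be fitted separately and the resulting models compared.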

        As a check of your final model, you might want to make cost predictions for 100 molecules or so.  Then have 10
synthetic chemist friends rate the difficulty of synthesis for each compound on a scale from 1 to 5.  Don't be surprised
if you get very different (inconsistent) answers.  You will need to do some statistical averaging here.
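
        A minimal sketch of that averaging step follows, assuming the ratings are collected per
molecule. It reports the mean rating and the spread between chemists, then compares the mean ratings
with the model's predictions using a Spearman rank correlation (my choice here, since the 1 to 5
scale is ordinal). The molecule labels, ratings and predicted values are invented, and only four
raters and five molecules are shown to keep it short.

```python
from statistics import mean, stdev

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Invented ratings (1 = easy, 5 = hard), one list entry per chemist.
ratings = {
    "mol_A": [1, 1, 2, 1],
    "mol_B": [2, 3, 2, 2],
    "mol_C": [3, 2, 4, 3],
    "mol_D": [4, 5, 3, 4],
    "mol_E": [5, 4, 5, 5],
}
# Invented model predictions of log10($/mg) for the same molecules.
predicted_log_cost = {"mol_A": -0.8, "mol_B": 0.1, "mol_C": -0.1,
                      "mol_D": 0.9, "mol_E": 1.6}

molecules = sorted(ratings)
mean_rating = {m: mean(ratings[m]) for m in molecules}
spread = {m: stdev(ratings[m]) for m in molecules}  # disagreement between chemists

rho = spearman([mean_rating[m] for m in molecules],
               [predicted_log_cost[m] for m in molecules])
print("mean ratings:", mean_rating)
print("rater spread:", spread)
print("Spearman rho:", rho)
```

        A large spread for a given molecule is a sign that the chemists themselves disagree, so a poor
match there says little about the model.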

        Not to get side-tracked, but regarding the (in)consistency of chemists in judging chemical structures, please
see the interesting publication from Lajiness et al. "Assessment of Consistency of Medicinal Chemists in Reviewing
Sets of Compounds" J. Med. Chem. 47 (2004) 4891.   An interesting statement from the abstract of this paper is,

        "It was found that medicinal chemists were not very consistent in the compounds they rejected
as being undesirable."


        Regards,
        Jim Metz
       

James T. Metz, Ph.D.
Abbott Laboratories


james.metz%abbott.com