CCL: Estimating applicability of fingerprint model



Hi all,

I have built a model that classifies compounds into two classes using the Pipeline Pilot Bayesian fingerprint classifier (ECFP_4 fingerprints). Does anyone have experience estimating how well a model like this will transfer to other libraries? I know that I should only apply the model to compounds drawn from a distribution similar to that of the training set, but I have no idea what steps I should take to ensure that this criterion is met.

For instance, I would like to score the ZINC database (all commercially available compounds) to find new interesting molecules, and I am wondering if anyone has any tips on problems I should look out for.
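
To make the question concrete, one check that could be run is to compute, for each library compound, its Tanimoto similarity to the nearest training compound, and then see what fraction of the library falls above some cutoff. The following is only a minimal sketch in plain Python, not Pipeline Pilot code. It assumes fingerprints are available as sets of integer feature IDs; the function names and the 0.3 threshold are hypothetical choices for illustration, not recommended values.

```python
from typing import FrozenSet, Iterable, Sequence

# Assumption: each fingerprint is a set of integer feature IDs
# (for example, the "on" features of an ECFP-style fingerprint).
Fingerprint = FrozenSet[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto coefficient: shared features / features in either set."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def nearest_training_similarity(query: Fingerprint,
                                training: Sequence[Fingerprint]) -> float:
    """Similarity of a query compound to its closest training compound."""
    return max((tanimoto(query, t) for t in training), default=0.0)


def fraction_in_domain(library: Iterable[Fingerprint],
                       training: Sequence[Fingerprint],
                       threshold: float = 0.3) -> float:
    """Fraction of library compounds whose nearest training neighbour
    is at least `threshold` similar. The default threshold is arbitrary."""
    compounds = list(library)
    if not compounds:
        return 0.0
    inside = sum(
        1 for fp in compounds
        if nearest_training_similarity(fp, training) >= threshold
    )
    return inside / len(compounds)


# Toy data, made up purely to show the calls.
training = [frozenset({1, 2, 3, 4}), frozenset({3, 4, 5, 6})]
library = [frozenset({1, 2, 3, 5}), frozenset({10, 11, 12})]
coverage = fraction_in_domain(library, training, threshold=0.3)
print(f"Fraction of library inside the assumed domain: {coverage:.2f}")
```

A low fraction would suggest that much of the library lies far from the training set, so scores for those compounds may not be reliable. The brute-force nearest-neighbour search here would be slow on a library the size of ZINC, so in practice it would need subsampling or a faster similarity search.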

Thanks a lot for any advice.

Iain