From chemistry-request@server.ccl.net  Wed Mar 29 01:49:44 2000
Received: from mail.virco.be ([195.207.78.10])
	by server.ccl.net (8.8.7/8.8.7) with SMTP id BAA07799
	for <chemistry@ccl.net>; Wed, 29 Mar 2000 01:49:17 -0500
Received: from [192.168.2.168] by mail.virco.be (CommuniGate SMTP 2.7.2) with ESMTP id S.001841210212 for <chemistry@ccl.net>; Wed, 29 Mar 2000 08:39:04 +0100
Sender: jan@ccl.net
Message-ID: <38E1A80A.BF7B0CD8@tibotec.be>
Date: Wed, 29 Mar 2000 08:51:54 +0200
From: jan de kerpel <jan.dekerpel@tibotec.be>
Organization: Tibotec
X-Mailer: Mozilla 4.07C-SGI [en] (X11; I; IRIX64 6.5 IP30)
MIME-Version: 1.0
To: chemistry@ccl.net
Subject: CCL:Clustering compounds from large databases
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit

Dear CCL members,

I would like to know what software (both freeware and commercial) is
available for clustering chemical compounds based on physical (e.g.
logP), structural (SMILES string/2D/SDF) and biological screening data.
The database typically contains 100,000 to 150,000 compounds or more.
My questions are:
1) What software is available for 2D clustering, and is there already
software capable of working in 3D on this many compounds (or on
smaller databases)?
2) Where can I find information about the algorithms, and are
representations other than SMILES strings used to generate structure keys?
3) Does anybody have an idea of the ratio of CPU usage between 2D and 3D
clustering on this many compounds, and of what the expected
bottlenecks are?

Thanks!

Jan

--

*************************************************************
TIBOTEC                         Dep. Medicinal Chemistry
Gen. De Wittelaan 11B 3         B-2800 Mechelen (BELGIUM)
tel : (32) 15 286.300           fax : (32) 15 286.349

email : jan.dekerpel@tibotec.be

*************************************************************