From owner-chemistry@ccl.net Thu Sep 29 08:51:00 2011
From: "Thomas Cheatham tec3*o*utah.edu"
To: CCL
Subject: CCL: How can I do cluster analysis or similarity analysis
Message-Id: <-45551-110929013316-25268-5s0h/w8IAthsSxOVBjp/GQ++server.ccl.net>
X-Original-From: Thomas Cheatham
Content-Type: TEXT/PLAIN; charset=US-ASCII
Date: Wed, 28 Sep 2011 23:33:09 -0600 (Mountain Daylight Time)
MIME-Version: 1.0

Sent to CCL by: Thomas Cheatham [tec3-#-utah.edu]

> I did a molecular dynamics simulation with AMBER, and I saved a
> thousand conformations during this run. I wish to do some cluster
> analysis, or similarity analysis on the conformations I saved. I think
> all the conformations can be grouped into two main groups, because I see
> the structure changes from one conformation to the other in the MD
> simulation.

AMBER has an active mailing list and an archive at http://ambermd.org,
which would be a good place to search or ask about MD simulations with
AMBER.

The freely available AmberTools suite of programs includes trajectory
analysis capabilities for clustering. I am most familiar with ptraj,
which can cluster based on RMSd, distance matrix, dihedrals, etc.
Routinely we cluster based on RMSd. There are many options, but a basic
ptraj script would be something like:

  trajin traj.strip
  cluster out clusters/c10 all none representative pdb average pdb \
          averagelinkage sieve 250 clusters 10 rms

In your case, you can cluster all frames (sieve 1). To decide on the
number of clusters, I would visualize a 2D RMSd plot, since 1D RMSd
plots can be deceptive with respect to cluster count. See the manuals at
http://ambermd.org for more information. A paper describing that
implementation is Shao et al. (Cheatham), JCTC ~2007.

Alternatives include MMTSB, tools distributed with GROMACS and GROMOS,
and likely things built in to NAMD and CHARMM.

If you get stuck, e-mail me off-list.

--tec3
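
To illustrate what a 2D RMSd matrix and average-linkage clustering
amount to, here is a minimal pure-Python sketch. It is not the ptraj
implementation. The function names (rmsd, rmsd_matrix, average_linkage)
and the toy coordinates are hypothetical, and the sketch assumes the
frames have already been superposed onto a common reference, so it does
no best-fit alignment of its own.

```python
import math


def rmsd(a, b):
    """RMSD between two frames, each a list of (x, y, z) tuples.

    Assumption: frames are pre-aligned; no superposition is done here.
    """
    if len(a) != len(b):
        raise ValueError("frames must have the same number of atoms")
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))


def rmsd_matrix(frames):
    """Symmetric frame-by-frame RMSD matrix (the data behind a 2D RMSd plot)."""
    n = len(frames)
    d = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            d[i][j] = d[j][i] = rmsd(frames[i], frames[j])
    return d


def average_linkage(d, n_clusters):
    """Naive agglomerative clustering on a distance matrix.

    Repeatedly merges the two clusters with the smallest mean pairwise
    distance until n_clusters remain. Fine for a toy example, slow for
    large trajectories.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(d))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(d[i][j] for i in clusters[a] for j in clusters[b])
                mean = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or mean < best[0]:
                    best = (mean, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]


# Toy data (made up): four two-atom frames forming two obvious groups.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.1), (1.0, 0.0, 0.1)],
    [(0.0, 0.0, 5.0), (1.0, 0.0, 5.0)],
    [(0.0, 0.0, 5.2), (1.0, 0.0, 5.2)],
]
matrix = rmsd_matrix(frames)
clusters = average_linkage(matrix, 2)
print(clusters)
```

For a real trajectory the matrix would normally be computed with
best-fit superposition for each pair of frames, and plotting it as a
heat map is one way to judge how many clusters are present.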