CCL: How can I do cluster analysis or similarity analysis
- From: Thomas Cheatham <tec3(~)utah.edu>
- Subject: CCL: How can I do cluster analysis or similarity analysis
- Date: Wed, 28 Sep 2011 23:33:09 -0600 (Mountain Daylight Time)
Sent to CCL by: Thomas Cheatham [tec3-#-utah.edu]
> I did a molecular dynamics simulation with AMBER, and I saved a
> thousand conformations during this run. I wish to do some cluster
> analysis, or similarity analysis on the conformations I saved. I think
> all the conformations can be grouped into two main groups, because I see
> the structure changes from one conformation to the other in the MD
> simulation.
AMBER has an active mailing list and an archive at http://ambermd.org
which would be a good place to search/ask about MD simulations with AMBER.
The freely available AmberTools suite of programs includes trajectory
analysis tools that can do clustering. I am most familiar with ptraj,
which can cluster based on RMSd, distance matrix, dihedrals, etc.
Routinely we cluster based on RMSd. There are many options, but a basic
ptraj script would be something like:
trajin traj.strip
cluster out clusters/c10 all none representative pdb average pdb \
averagelinkage sieve 250 clusters 10 rms
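In your case, you can cluster all frames (sieve 1). To decide on the
number of clusters, I would visualize a 2D RMSd plot (every frame against
every other frame), since 1D RMSd plots can be deceptive with respect to
cluster count: two quite different conformations can sit at the same RMSd
from a single reference structure.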
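To make the 2D RMSd idea concrete, here is a minimal Python sketch of the underlying pairwise matrix. It is only an illustration, not what ptraj does internally: it assumes the frames have already been superimposed on a common reference (no fitting is performed), and the function names and the toy coordinates are made up for the example.

```python
import math

def rmsd(a, b):
    """RMSD between two frames, each a list of (x, y, z) tuples.

    Assumption: the frames are already superimposed; no fitting is done.
    """
    if len(a) != len(b):
        raise ValueError("frames must have the same number of atoms")
    sq = sum((p - q) ** 2 for u, v in zip(a, b) for p, q in zip(u, v))
    return math.sqrt(sq / len(a))

def rmsd_matrix(frames):
    """Symmetric frame-by-frame RMSD matrix as a list of lists."""
    n = len(frames)
    m = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            m[i][j] = m[j][i] = rmsd(frames[i], frames[j])
    return m

# Toy data (hypothetical): 4 frames of 2 atoms each.
# Frames 0-1 are close to each other, as are frames 2-3.
frames = [
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.1, 0.0), (1.0, 0.1, 0.0)],
    [(0.0, 3.0, 0.0), (1.0, 3.0, 0.0)],
    [(0.0, 3.1, 0.0), (1.0, 3.1, 0.0)],
]
matrix = rmsd_matrix(frames)
for row in matrix:
    print(" ".join(f"{v:5.2f}" for v in row))
```

Plotted as a heat map, a matrix like this shows low-RMSd blocks along the diagonal, and counting those blocks gives a first guess at the number of clusters.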
See the manuals at http://ambermd.org for more information. A paper
describing that implementation is Shao et al. (Cheatham), JCTC ~2007.
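For intuition about what an average-linkage clustering does with such pairwise distances, here is a small pure-Python sketch. It is a textbook-style illustration under my own assumptions, not the ptraj code: the function names are hypothetical, the distance matrix is toy data, and picking the representative as the member with the smallest summed distance to the rest of its cluster is an assumption rather than ptraj's documented rule.

```python
def average_linkage(dist, n_clusters):
    """Merge the two closest clusters until n_clusters remain.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    The distance between two clusters is the mean of all distances
    between a member of one and a member of the other.
    """
    if n_clusters < 1:
        raise ValueError("n_clusters must be at least 1")
    clusters = [[i] for i in range(len(dist))]

    def linkage(a, b):
        return sum(dist[i][j] for i in a for j in b) / (len(a) * len(b))

    while len(clusters) > n_clusters:
        best = None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                d = linkage(clusters[x], clusters[y])
                if best is None or d < best[0]:
                    best = (d, x, y)
        _, x, y = best
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters

def representative(cluster, dist):
    """Member with the smallest summed distance to its cluster mates
    (an assumed definition, chosen for this illustration)."""
    return min(cluster, key=lambda i: sum(dist[i][j] for j in cluster))

# Toy pairwise RMSd matrix (hypothetical): frames 0-1 and 2-3 are similar.
dist = [
    [0.0, 0.1, 3.0, 3.1],
    [0.1, 0.0, 2.9, 3.0],
    [3.0, 2.9, 0.0, 0.1],
    [3.1, 3.0, 0.1, 0.0],
]
clusters = average_linkage(dist, 2)
for c in clusters:
    print(sorted(c), "representative frame:", representative(c, dist))
```

This brute-force version rescans every cluster pair at each merge, so it is only practical for small frame counts; that cost is one reason sieving is useful on long trajectories.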
Alternatives include the MMTSB tool set, tools distributed with GROMACS
and GROMOS, and likely things built into NAMD and CHARMM.
If you get stuck, e-mail me off-list.
--tec3