Parallelism and HyperChem
It appears that we have beaten the HyperChem reliability and performance
issues into the ground but the discussion raises an interesting issue
that I thought might be worth commenting on. It has to do with parallel
processing and might make an interesting anecdote for an audience like this.
I talked about it at the Grand Challenge Parallel Processing Conference
in Baton Rouge in February but did not have time to write a paper for
the proceedings.
The semi-empirical calculations of Release 4 of HyperChem are very much
faster than those of HyperChem Release 3, which are in turn much
faster than those of Release 2. A principal reason is that, in making
HyperChem go faster, we have been going through a process here that
we might call "Serialization" as opposed to "Parallelization".
Yufei Guo here at Hypercube is now
one of
the world's experts at getting rid of parallelism! <:) The semi-empirical
calculations of HyperChem started out life many years ago in purely parallel
form, running originally on Intel's Hypercube machines (note the name of
our company). The first PC version of HyperChem's semi-empirical
calculations was essentially these parallel codes, with the
disjoint-memory message passing replaced by a simple memory-copy
operation. Now, most people in parallel
processing like to show speed-up curves with performance relative to the
parallel program running on 1 processor. But, as we all know,
parallelism comes at a cost, and a parallel program carries enough
overhead that it may not perform well on one processor compared with
an optimized serial program.
Over the last couple of years, as we have optimized the performance
of HyperChem and "Serialized" it, we have gotten very significant
speed improvements. We may not have the best optimizers yet - these
still derive mainly from methods for which parallel-processing
algorithms were available. These issues indicate some of the problems
with parallel
processing that are partially responsible for slowing its acceptance
into the mainstream. They also show some of the problems for developers
like us who would like to have optimum performance all the way from
laptops to supercomputers.
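
To make the baseline point from above concrete (speed-up quoted
against the parallel program on one processor versus quoted against
an optimized serial program), here is a little Python sketch with
purely hypothetical timings, not HyperChem measurements:

# Hypothetical timings, for illustration only - not HyperChem data.
t_serial_optimized = 100.0  # optimized serial program, in seconds
t_parallel_1proc = 150.0    # parallel program on 1 processor (pays its overhead)
t_parallel_32proc = 6.0     # parallel program on 32 processors

# Speed-up measured against the parallel code's own 1-processor time
speedup_vs_parallel = t_parallel_1proc / t_parallel_32proc   # 25.0

# Speed-up measured against the optimized serial program
speedup_vs_serial = t_serial_optimized / t_parallel_32proc   # about 16.7

print("vs. parallel on 1 processor: %.1fx" % speedup_vs_parallel)
print("vs. optimized serial code:   %.1fx" % speedup_vs_serial)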
The parallelism in semi-empirical codes can give rise to many
performance issues, but one of the simplest is just the efficiency
of parallel matrix diagonalization. We got a significant improvement
in HyperChem by replacing our parallel diagonalizers, which we used
to think of as quite good, with the very best serial diagonalizers.
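
(For readers who have not run into this: the repeated diagonalization
of the Fock matrix in an SCF step is a dense symmetric eigenvalue
problem, exactly the kind of kernel where a well-tuned serial routine
is hard to beat. The fragment below only illustrates calling a
standard serial symmetric eigensolver - here numpy.linalg.eigh - on a
Fock-like matrix; it is not HyperChem's actual diagonalizer.)

import numpy as np

# A random symmetric "Fock-like" matrix, purely for demonstration.
n = 200
a = np.random.rand(n, n)
fock = (a + a.T) / 2.0

# One call to a serial symmetric eigensolver returns the eigenvalues
# (orbital-energy-like quantities) and the corresponding eigenvectors.
eigenvalues, eigenvectors = np.linalg.eigh(fock)
print(eigenvalues[:5])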
On the other hand, we feel that our parallelization in molecular
mechanics, using our tractor-tread algorithms, etc., is not costing
us very much. On the new Intel Paragon, HyperChem gave a speed-up
of 31 on 32 processors for a 4096-atom MD run, yet we consider it to
run fast in serial mode as well. If anyone has molecular mechanics
times for HyperChem compared to MSI, Biosym, Tripos, etc., we would
certainly find those quite interesting, as might others <:).
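
For those who prefer the efficiency language, the Paragon figure
above works out to a parallel efficiency of roughly 97%; the
arithmetic is trivial:

# Parallel efficiency for the Paragon MD run quoted above.
processors = 32
speedup = 31.0
efficiency = speedup / processors   # 0.96875, i.e. about 97%
print("parallel efficiency: %.1f%%" % (100.0 * efficiency))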
I apologize if this or my previous posting has any commercial
flavor to it. I can't always tell whether I am wearing my CEO
hat or my scientist hat. Also, if I sound negative about my
old friend, parallel processing, I'm not - it's just that, like any
deep topic, there are many different ways of looking at it.
Cheers,
------------
Neil Ostlund
President, Hypercube Inc.
419 Phillip St, Waterloo, Ont, Canada N2L 3X2
(519)725-4040
internet: ostlund@hyper.com