CCL: Cleaning up dusty deck fortran...
- From: jle <jle#%#theworld.com>
- Subject: CCL: Cleaning up dusty deck fortran...
- Date: Sun, 25 Sep 2005 12:39:50 -0400
Sent to CCL by: jle [jle#%#theworld.com]
--Replace strange characters with the "at" sign to recover email
address--.
"Perry E. Metzger" [perry]*[piermont.com] wrote two extensive
emails which I can't reply to in the usually fashion without
generating two replies which repeat themselves. Therefore,
I'll try to do it in a "collected idea" email...
I'll start with a premise - computer time's no longer a critical
resource, developer time is. Therefore, anything being created
today should make the latter efficient, not the former. One can
always add iron.
Perry mentions interpretive languages such as Perl and Python,
and I think he thinks well of them - for their purposes. If one's
code isn't highly CPU intensive, they're a great way to go. Code
can be prototyped and even deployed quickly, and there is a
wide range of existing libraries/modules/whatever available to
draw from. IP issues seem stable, so commercial and non-
commercial code can be mixed without concern. People like
to tinker/script, so we're at a good point - there's a growing number
of accessible toolkits and a growing number of people who look
to use them.
For this reason, I would be less quick to choose edit/compile
languages like C/C++/Java/... today for anything other than
"system requirements" or CPU-intensive routines. Chemists,
on the whole, don't pay attention to their tools and probably don't
wish to - they've got other things to do and merely want tools
that work.
> Computational chemistry is a two part discipline. You really can't
> neglect the computer science side of things any more than you can
> neglect the chemistry side of things. It makes all the difference.
Actually, there are more parts than this, if "molecular modeling" and
"computational chemistry" are considered. Medicinal chemistry
expertise is probably the most valuable for users - those actually
trying to DO something with the software. Aside from scripting,
I'd say that 60-80% of the "customer base" really doesn't want to
and shouldn't have to think about what's going on under the hood.
It's up to the developers to deliver on this hope. We're not dealing
with Vaxen anymore, and memory's not dear. Perry's right, though:
> Sure, rewriting your code won't get you publications, but when you're
> done you can do new things faster. The name of the game, after all, is
> economizing on manpower and getting the computer to subsume more and
> more of your task and to do its work as fast as possible. Failing to
> invest in your tools is pennywise but pound foolish.
Users won't pay for this rewrite, and the prime source of bulk labor
(2nd-year grad students) might not have the time, training or
inclination to think before coding. It has to be done, if we're going
to use the new machines, architectures and the like that Moore's Law
has presented us.
> Sure, you can write really crappy code in any language. However, some
> languages really make it much harder to write good code. Fortran and
> Cobol are at the top of my list for that.
Well said, although you can also write good code in any language. I
suspect if you're going to do linear algebra, it's going to look like
Fortran no matter what language you select... If you're going to
process paychecks, Cobol does the trick. From what I recall from the
70's, there's way more structure in Cobol than F77. Hasn't Cobol gone
through a "modernization" effort like Fortran?
Choose the tool for the task. Perry stresses tools and tool selection
extensively and well. If you're doing I/O, or doing graphics or doing
embedded systems, Fortran's not necessarily a good choice. If you're
crunching numbers, go for it. I'd argue, and have argued, that Perl or
Python (my preference is Python) is the language of choice, where one
can find or create the lower-level tools required by one's application.
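That "glue language" pattern can be sketched in a few lines of Python: the script drives a lower-level tool as a subprocess and parses its output. The command here (`sort`) is just a stand-in for a real compute engine, not anything from the original discussion.

```python
# Hypothetical sketch of Python as glue: run a compiled tool,
# feed it data on stdin, and parse what comes back.
import subprocess

data = "3\n1\n2\n"
result = subprocess.run(
    ["sort", "-n"],            # stand-in for a real number-crunching binary
    input=data,
    capture_output=True,
    text=True,
    check=True,                # raise if the tool fails
)
print(result.stdout.split())   # the sorted values, back in Python-land
```

Swapping the stand-in command for an in-house Fortran or C executable is the whole trick: the CPU-intensive part stays compiled, while the workflow around it stays scriptable.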
> Writing software in Fortran vs. writing software in a language that
> has things like structs, pointers and recursion is like cooking using
> only a small bunsen burner, a dull knife and a cheap aluminum pan,
True, but if I've got a nail, I look for a hammer, not a Leatherman.
> Incidently, proper editors, debuggers etc. are also important, as is
> knowing your way around the OS you use. Know your tools.
Also true. As stated above, developer time is precious. However, as
one essay in Joel Spolsky's collection put it, "if it's not tested,
it's broken". Without a firm idea of what's "correct", and a test
suite which supports that idea, you're wasting a lot of time and
effort screwin' around with code.
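A minimal sketch of that "firm idea of what's correct" point: write the reference behavior down as assertions before touching the code for speed. The function and its data are hypothetical, chosen only to be chemistry-flavored.

```python
# A tiny "test suite" pinning down correct behavior first.
import math

def bond_length(a, b):
    """Euclidean distance between two 3D points (e.g. atom coordinates)."""
    return math.sqrt(sum((ai - bi) ** 2 for ai, bi in zip(a, b)))

# What "correct" means, written down before any optimization:
assert abs(bond_length((0, 0, 0), (1, 0, 0)) - 1.0) < 1e-12   # unit axis
assert abs(bond_length((0, 0, 0), (3, 4, 0)) - 5.0) < 1e-12   # 3-4-5 triangle
a, b = (0.0, 1.5, -2.0), (3.0, -1.0, 0.5)
assert abs(bond_length(a, b) - bond_length(b, a)) < 1e-12     # symmetry
```

With assertions like these in place, any later rewrite (a vectorized version, a compiled kernel) can be checked against the same suite before it replaces the original.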
> Arrays are rarely the first tool I think of -- or even ever a tool I
> think of. They're fine tools for building other data structures --
> they're often hidden deep underneath the covers of things like hash
> tables and such -- but what you want to be thinking of is
> *abstractions*, not *implementations*. The non-professional thinks
> first of what to implement, the professional thinks first about what
> abstractions to build.
Well, I view myself as a professional developer, and yet my first
thought is "what am I trying to do?". I think Perry and I are probably
saying the same thing, since the next decision tends to be "how
important is it to do it well?". If the answer is "throwaway", use
whatever you're most adept with. If the answer is "it's important",
then looking at the tools, data abstractions and the like is critical.
If I've got to think "cache coherency", I'll be thinking things like
arrays. Otherwise, it's things like "molecules", or better, "things".
The more abstract the better (as there are fewer lines of code to
debug/test/maintain).
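The "think molecules, not arrays" idea can be sketched as a small class: the caller works with atoms and bonds while a flat coordinate array stays hidden inside the implementation. The names here are illustrative, not from any real toolkit.

```python
# Hypothetical sketch: a "molecule" abstraction hiding its array layout.
import math

class Molecule:
    def __init__(self):
        self._symbols = []   # element symbols, one per atom
        self._coords = []    # flat [x0, y0, z0, x1, y1, z1, ...]

    def add_atom(self, symbol, x, y, z):
        self._symbols.append(symbol)
        self._coords.extend((x, y, z))

    def distance(self, i, j):
        """Distance between atoms i and j; the flat layout never leaks out."""
        xi, yi, zi = self._coords[3 * i:3 * i + 3]
        xj, yj, zj = self._coords[3 * j:3 * j + 3]
        return math.sqrt((xi - xj) ** 2 + (yi - yj) ** 2 + (zi - zj) ** 2)

water = Molecule()
water.add_atom("O", 0.000, 0.000, 0.000)
water.add_atom("H", 0.757, 0.586, 0.000)
water.add_atom("H", -0.757, 0.586, 0.000)
print(round(water.distance(0, 1), 3))  # O-H separation, in the input units
```

If cache coherency ever does become the bottleneck, the flat-array storage can be reworked (or handed to compiled code) without changing a single caller, which is exactly the payoff of leading with the abstraction.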
> In any case, I can't imagine writing most software without real data
> structures. If you don't know why you want to be able to build clean
> hash tables, priority queues, search trees, etc., then you don't know
> why your programs are running orders of magnitude slower than they
> need to.
Most chemists haven't a clue what these things are and don't care.
They've not studied the literature, nor studied development as a
discipline. Bad move, really bad move, for those paid to develop
software, but we're a (small) subset of the whole. I think it was Rob
Pike who said "The first rule of optimization is 'don't'".
If it's not important to work quickly (and it usually isn't), merely
work correctly (you should get the latter and its tests done before
the former, anyway).
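Perry's "orders of magnitude" claim about data structures is easy to see concretely: membership in a Python list is a linear scan per query, while a set is a hash lookup. A toy comparison (sizes arbitrary, timings machine-dependent):

```python
# Why hash tables matter: O(n) list scans vs. O(1) set lookups.
import time

n = 5000
ids = list(range(n))
id_set = set(ids)

t0 = time.perf_counter()
hits_list = sum(1 for q in range(n) if q in ids)     # scans the list each time
t_list = time.perf_counter() - t0

t0 = time.perf_counter()
hits_set = sum(1 for q in range(n) if q in id_set)   # hashes each query once
t_set = time.perf_counter() - t0

assert hits_list == hits_set == n                    # same answer either way
print(f"list: {t_list:.4f}s  set: {t_set:.4f}s")
```

Same result, same one-character difference at the call site, yet the list version does roughly n²/2 comparisons where the set does n hashes, which is the "orders of magnitude" Perry is talking about.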
In case people think I'm being too hard on chemists, I'm sure there's
a fair number of CS people who aren't up on QM, the latest advances
in Brownian Dynamics, molecular similarity (is 2D or 3D better?), ...
There's way too much to try to keep track of; we merely have to
make a go of it.
It's sorta fun living in a time of bloody well infinite computer power.
We've got to figure out how to develop for these systems, and more
importantly rid ourselves of the biases we've grown up with. There's
way less which is "too slow" anymore, and the sooner we shed that
notion the better. We don't understand all the physics, and we need
newer code and implementations, but it's WAY better running 3-minute
than 3-day test jobs :-).
Sorry to run on... Thanks for reading if you've made it this far (and
thanks, Perry, for the contributions).
Joe Leonard
jle#%#theworld.com