CCL: Cleaning up dusty deck fortran...



 Sent to CCL by: jle [jle#%#theworld.com]
 --Replace strange characters with the "at" sign to recover email
 address--.
  "Perry E. Metzger" [perry]*[piermont.com] wrote two extensive
 emails which I can't reply to in the usually fashion without
 generating two replies which repeat themselves.  Therefore,
 I'll try to do it in a "collected idea" email...
 I'll start with a premise: computer time is no longer a critical
 resource, developer time is.  Therefore, anything being created
 today should economize the latter, not the former.  One can
 always add iron.
 Perry mentions interpreted languages such as Perl and Python,
 and I think he thinks well of them - for their purposes.  If one's
 code isn't highly CPU-intensive, they're a great way to go.  Code
 can be prototyped and even deployed quickly, and there is a
 wide range of existing libraries/modules/whatever available to
 draw from.  IP issues seem stable, so commercial and non-
 commercial code can be mixed without concern.  People like
 to tinker/script, so we're at a good point - there's a growing number
 of accessible toolkits and a growing number of people looking
 to use them.
 For this reason, I would be less quick to choose edit/compile
 languages like C/C++/Java/... today for anything other than
 "system requirements" or CPU-intensive routines.   Chemists,
 on the whole, don't pay attention to their tools and probably don't
 wish to - they've got other things to do and merely want tools
 that work.
 > Computational chemistry is a two part discipline. You really can't
 > neglect the computer science side of things any more than you can
 > neglect the chemistry side of things. It makes all the difference.
 Actually, there are more parts than this, if "molecular modeling" and
 "computational chemistry" are both considered.  Medicinal chemistry
 expertise is probably the most valuable for users - those actually
 trying to DO something with the software.  Aside from scripting,
 I'd say that 60-80% of the "customer base" really doesn't want to,
 and shouldn't have to, think about what's going on under the hood.
 It's up to the developers to deliver on that hope.  We're not dealing
 with Vaxen anymore, and memory's not dear.  Perry's right, though:
 > Sure, rewriting your code won't get you publications, but when you're
 > done you can do new things faster. The name of the game, after all, is
 > economizing on manpower and getting the computer to subsume more and
 > more of your task and to do its work as fast as possible. Failing to
 > invest in your tools is pennywise but pound foolish.
 Users won't pay for this rewrite, and the prime source of bulk labor
 (2nd-year grad students) might not have the time, training or
 inclination to think before coding.  Still, it has to be done if
 we're going to use the new machines, architectures and the like
 that Moore's Law has presented us.
 > Sure, you can write really crappy code in any language. However, some
 > languages really make it much harder to write good code. Fortran and
 > Cobol are at the top of my list for that.
 Well said, although you can also write good code in any language.  I
 suspect that if you're going to do linear algebra, it's going to look
 like Fortran no matter what language you select...  If you're going to
 process paychecks, Cobol does the trick.  From what I recall from the
 '70s, there's way more structure in Cobol than in F77.  Hasn't Cobol
 gone through a "modernization" effort like Fortran's?
 Choose the tool for the task.  Perry stresses tools and tool selection
 extensively and well.  If you're doing I/O, graphics or embedded
 systems, Fortran's not necessarily a good choice.  If you're
 crunching numbers, go for it.  I'd argue, and have argued, that Perl or
 Python (my preference is Python) is the language of choice, provided one
 can find or create the lower-level tools required by one's application.
 > Writing software in Fortran vs. writing software in a language that
 > has things like structs, pointers and recursion is like cooking using
 > only a small bunsen burner, a dull knife and a cheap aluminum pan,
 True, but if I've got a nail, I look for a hammer, not a Leatherman.
 > Incidently, proper editors, debuggers etc. are also important, as is
 > knowing your way around the OS you use. Know your tools.
 Also true.  As stated above, developer time is precious.  However,
 as one essay in Joel Spolsky's collection put it, "if it's not
 tested, it's broken".  Without a firm idea of what's "correct", and
 a test suite which supports that idea, you're wasting a lot of time
 and effort screwin' around with code.
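 To make that concrete, here's a minimal sketch in Python of pinning
 down "correct" before touching the code.  The function, the formula
 representation and the weights table are all hypothetical, not any
 real toolkit's API:

```python
# A minimal sketch of "define correct first": a tiny reference
# calculation plus the tests that pin its behavior down.
# The names and the weights table are illustrative only.

ATOMIC_WEIGHT = {"H": 1.008, "C": 12.011, "O": 15.999}

def molecular_weight(formula):
    """Sum atomic weights for a formula given as {symbol: count}."""
    return sum(ATOMIC_WEIGHT[sym] * n for sym, n in formula.items())

def test_molecular_weight():
    # Water: 2 H + 1 O -> 2*1.008 + 15.999 = 18.015
    assert abs(molecular_weight({"H": 2, "O": 1}) - 18.015) < 1e-3
    # Methane: 1 C + 4 H -> 12.011 + 4*1.008 = 16.043
    assert abs(molecular_weight({"C": 1, "H": 4}) - 16.043) < 1e-3

test_molecular_weight()
```

 With those two assertions in place, any rewrite of the routine can be
 checked in seconds rather than re-argued from scratch.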
 > Arrays are rarely the first tool I think of -- or even ever a tool I
 > think of. They're fine tools for building other data structures --
 > they're often hidden deep underneath the covers of things like hash
 > tables and such -- but what you want to be thinking of is
 > *abstractions*, not *implementations*. The non-professional thinks
 > first of what to implement, the professional thinks first about what
 > abstractions to build.
 Well, I view myself as a professional developer, and yet my first
 thought is "what am I trying to do?".  I think Perry and I are probably
 saying the same thing, since the next decision tends to be "how
 important is it to do it well?".  If the answer is "throwaway", use
 whatever you're most adept with.  If the answer is "it's important",
 then looking at the tools, data abstractions and the like is critical.
 If I've got to think "cache coherency", I'll be thinking of things like
 arrays.  Otherwise, it's things like "molecules", or better, "things".
 The more abstract the better, as there are fewer lines of code to
 debug/test/maintain.
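 A sketch of what that looks like in Python - the class and its methods
 are hypothetical, just to show the idea: the caller thinks "molecule",
 while the flat coordinate array stays under the hood:

```python
# A hypothetical sketch: callers work with a "molecule" abstraction;
# the flat coordinate array is an implementation detail hidden inside,
# free to change (to NumPy, to cache-friendly layouts) without
# touching any calling code.

class Molecule:
    def __init__(self):
        self._symbols = []   # e.g. ["O", "H", "H"]
        self._coords = []    # flat [x, y, z, x, y, z, ...] list

    def add_atom(self, symbol, x, y, z):
        self._symbols.append(symbol)
        self._coords.extend((x, y, z))

    def atom_count(self):
        return len(self._symbols)

    def atom(self, i):
        """Return (symbol, (x, y, z)) without exposing the flat list."""
        j = 3 * i
        return self._symbols[i], tuple(self._coords[j:j + 3])

water = Molecule()
water.add_atom("O", 0.000, 0.000, 0.117)
water.add_atom("H", 0.000, 0.757, -0.469)
water.add_atom("H", 0.000, -0.757, -0.469)
print(water.atom_count())  # prints 3
```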
 > In any case, I can't imagine writing most software without real data
 > structures. If you don't know why you want to be able to build clean
 > hash tables, priority queues, search trees, etc., then you don't know
 > why your programs are running orders of magnitude slower than they
 > need to.
 Most chemists haven't a clue what these things are, and don't care.
 They've not studied the literature, nor studied development as a
 discipline.  Bad move, really bad move, for those paid to develop
 software, but we're a (small) subset of the whole.  I think it was Rob
 Pike who said "The first rule of optimization is 'don't'".  If it's not
 important to work quickly (and it's usually not), merely work correctly
 (you should get the latter, and its tests, done before the former anyway).
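 For what it's worth, Perry's "orders of magnitude" claim is easy to
 demonstrate.  A quick sketch (sizes picked arbitrarily; actual timings
 will vary by machine) comparing the same membership test backed by a
 list (linear scan) versus a set (hash table):

```python
# The same membership test, backed by a list (linear scan)
# versus a set (hash table). Sizes are arbitrary; exact timings
# depend on the machine, but the gap is typically huge.
import timeit

items = list(range(100_000))
as_list = items
as_set = set(items)

target = 99_999  # worst case for the linear scan

t_list = timeit.timeit(lambda: target in as_list, number=100)
t_set = timeit.timeit(lambda: target in as_set, number=100)

print(f"list: {t_list:.4f}s  set: {t_set:.6f}s")
# The set lookup is typically orders of magnitude faster.
```

 Same result, same one-line test in the caller - the only difference is
 the data structure underneath.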
 In case people think I'm being too hard on chemists, I'm sure there's
 a fair number of CS people who aren't up on QM, the latest advances
 in Brownian Dynamics, molecular similarity (is 2D or 3D better?), ...
 There's way too much to try to keep track of; we merely have to
 make a go of it.
 It's sorta fun living in a time of bloody well infinite computer power.
 We've got to figure out how to develop for these systems, and more
 importantly rid ourselves of the biases we've grown up with.  There's
 way less which is "too slow" anymore, and the sooner we shed that
 notion the better.  We don't understand all the physics, and we need
 newer code and implementations, but it's WAY better running 3-minute
 than 3-day test jobs :-).
 Sorry to run on...  Thanks for reading if you've made it this far (and
 thanks, Perry, for the contributions).
 Joe Leonard
 jle#%#theworld.com