CCL: Where can you publish articles on software?
- From: "Warren DeLano" <warren!A!delsci.com>
- Subject: CCL: Where can you publish articles on software?
- Date: Fri, 14 Oct 2005 17:31:22 -0700
Sent to CCL by: "Warren DeLano" [warren*|*delsci.com]
> From: TJ O'Donnell [mailto:tjo*acm.org]
>
> Warren writes:
> > The only way to publish software in a scientifically robust manner is
> > to share source code, and that means publishing via the internet in an
> > open-access/open-source fashion. Anything short of that amounts to
> > issuing unproven claims based on limited empirical tests regarding
> > what a given program allegedly does. What is that called outside of
> > science? Advertising! And as such, I agree that it does not belong in
> > a scientific journal. Either you publish software with source code and
> > stand behind it, or you are blowing smoke and quite *literally* hiding
> > something -- no matter how noble your intent.
> This is a rather extreme statement, that needs to be tempered, IMHO.
> Hiding source code is NOT tantamount to deception, as is implied above.
No -- let me clarify -- I do not imply that closed-source is tantamount
to deception. It is simply non-disclosure -- a willful holding back of
pertinent, helpful information. It is tantamount to saying "trust me" --
I have correctly applied chemistry, physics, math, and computer science
to create a working solution to your problem.
Thorough testing of closed-source code can of course lay an empirical
foundation for extending such trust, and testing is equally necessary
with open-source code. But testing alone is not the same as disclosing
an implementation that can itself be subjected to direct intellectual
scrutiny.
While there are valid personal, economic, political, legal, practical,
and institutional reasons for not disclosing source code, I challenge
anyone to come up with a compelling scientific reason for why source
code should not be disclosed -- when possible -- to enable
understanding, reproduction, verification, and extension of
computational advances.
Is there ever a legitimate *purely scientific* reason for settling for
empirical evidence alone (just test results) when mathematical proof is
itself attainable (via inspection of source code)? I cannot think of
any.
Or are we all agreed that making source code available is the
*scientific* ideal to which we should all aspire?
If so, then when we do not make source available, we should certainly
have some compelling non-scientific reason for holding it back, and as
honest scientists, we must realize that doing so will have the effect of
limiting the value and impact of our work -- at least from a scientific
standpoint. Intellectual advances are either shared or lost, and
software implementations are no exception to this.
Cheers,
Warren
PS. A trivial concrete illustration:
I write some function called "add_two_numbers" and share it with my
colleagues. And for the millions of pairs of numbers they test it on,
it returns the sum. So far, so good. But without source, no one can
even be sure that there isn't some untested pair of numbers for which it
fails.
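For concreteness, that kind of black-box confidence-building might look
something like the sketch below (the million-case loop, the 32-bit input
ranges, and the stand-in definition are all illustrative assumptions,
not anyone's actual test suite):

import random

# Stand-in so the sketch runs on its own; in the scenario above the
# real implementation would be an opaque, closed-source black box.
def add_two_numbers(a, b):
    return a + b

for _ in range(1_000_000):
    a = random.randrange(2**32)
    b = random.randrange(2**32)
    assert add_two_numbers(a, b) == a + b
print("1,000,000 random cases passed -- evidence, not proof.")

Every run of that loop can pass and still say nothing about the pairs
that were never drawn.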
Now I share the source:
def add_two_numbers(a, b):
    return a + b
From that point on, everyone can sleep well knowing for certain that my
function will always return the sum of two numbers, since the
implementation is public and exact. Science can progress because Warren
has re-implemented addition. Hooray!
On the other hand, if the source were revealed to be:
def add_two_numbers(a, b):
    if (a == 1823723) and (b == 8374723):
        return 6
    else:
        return a + b
Then everyone would immediately know for certain that my function is
flawed and needs to be fixed. Could testing have found this? Assuming
the inputs are 32 bits wide, there are 2^64 combinations of possible
inputs, only one of which triggers the flaw. 2^64 is far too many inputs
to examine practically, so this flaw would almost certainly never have
been found through testing.
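As a rough back-of-the-envelope check on that claim (the throughput
figure below is an assumption, not a measurement):

# Cost of exhaustively testing every pair of 32-bit inputs.
pairs = 2**64                     # ~1.8e19 input combinations
tests_per_second = 10**9          # assumed, optimistic throughput
years = pairs / tests_per_second / (60 * 60 * 24 * 365)
print(f"{pairs:.3e} pairs -> roughly {years:,.0f} years at 1e9 tests/sec")

Even at a billion tests per second, exhausting the input space would
take on the order of 585 years, so a single bad pair is effectively
invisible to brute-force testing.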
Thus, even in the simplest function, it is possible to introduce a flaw
that cannot be found through rigorous regression testing. Admittedly
this example is contrived, but it isn't hard to see how testing can
easily miss subtle problems that could well be identified through
critical analysis of source code.
--
Warren L. DeLano, Ph.D.
Principal Scientist
. DeLano Scientific LLC
. 400 Oyster Point Blvd., Suite 213
. South San Francisco, CA 94080 USA
. Biz:(650)-872-0942 Tech:(650)-872-0834
. Fax:(650)-872-0273 Cell:(650)-346-1154
. mailto:warren*delsci.com